Development of an automated metabolomics workflow to achieve omics integration for molecular phenotyping
Global metabolomics studies the dynamics of gene and protein expression through metabolites, their final downstream products, and can provide unique biological insight and enable biomarker discovery. However, metabolite profiling presents distinct challenges in data processing, identification of unknown metabolites, and multilevel integration with other omics datasets. These challenges require specialized mathematical, statistical, and bioinformatics tools that are still under development. This project describes the development and validation of a metabolomics pipeline that combines processing, identification, and proteomics integration to determine the metabolic phenotype of a system in the context of proteomic and genomic activity. The pipeline performs data conversion with an open-source MS file converter, data processing with the XCMS R library, database search with an open-source metabolite search engine called LipidFinder, and omics integration linked to the KEGG database. By implementing these tools locally using R scripting, the pipeline completes simultaneous jobs at a rate 60 times faster than XCMS Online while producing identical data output. This novel, combined pipeline was validated using human plasma samples and successfully identified 50% of the published human metabolome in a single LC-MS/MS separation mode. To determine the metabolic composition of extracellular vesicles (EVs), the pipeline was applied to LC-MS/MS results from EVs comprising apoptotic bodies and microvesicles (ABMVs) and exosomes from a mouse glioma cell line (GL02612). It identified 6023 metabolites in ABMVs and 6382 metabolites in exosomes from glioma, providing evidence that EVs carry a rich metabolome. Metabolomics and proteomics datasets were then integrated to map metabolic pathways in EVs using the KEGG database.
This integrated high-throughput pipeline will allow us to measure changes in metabolites and their corresponding enzymes, advancing our understanding of pathophysiological mechanisms and aiding biomarker discovery.
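The integration step can be illustrated with a minimal sketch. It is not the pipeline's actual code, which is implemented in R and is not reproduced here; it is a Python illustration that assumes identified metabolites and proteins have already been annotated with pathway membership. The function name `integrate_by_pathway`, the two lookup tables, and every identifier in them are hypothetical placeholders; in practice the mappings would be derived from the KEGG database.

```python
from collections import defaultdict

# Hypothetical placeholder tables (not real KEGG identifiers). In practice
# these mappings would be derived from the KEGG database.
COMPOUND_TO_PATHWAYS = {
    "cpd_A": {"path_1"},
    "cpd_B": {"path_1", "path_2"},
    "cpd_C": {"path_3"},
}
ENZYME_TO_PATHWAYS = {
    "enz_X": {"path_1"},
    "enz_Y": {"path_2"},
}


def integrate_by_pathway(compounds, enzymes,
                         compound_map=COMPOUND_TO_PATHWAYS,
                         enzyme_map=ENZYME_TO_PATHWAYS):
    """Group identified compounds and enzymes by the pathways they share.

    Only pathways supported by at least one compound AND at least one
    enzyme are returned, i.e. pathways with evidence from both omics layers.
    """
    hits = defaultdict(lambda: {"compounds": set(), "enzymes": set()})
    for cid in compounds:
        # Identifiers missing from the lookup table are silently skipped.
        for pathway in compound_map.get(cid, ()):
            hits[pathway]["compounds"].add(cid)
    for eid in enzymes:
        for pathway in enzyme_map.get(eid, ()):
            hits[pathway]["enzymes"].add(eid)
    return {
        pathway: {
            "compounds": sorted(found["compounds"]),
            "enzymes": sorted(found["enzymes"]),
        }
        for pathway, found in hits.items()
        if found["compounds"] and found["enzymes"]
    }


if __name__ == "__main__":
    result = integrate_by_pathway(["cpd_A", "cpd_B", "cpd_C"], ["enz_X"])
    for pathway, members in sorted(result.items()):
        print(pathway, members)
```

Requiring evidence from both layers is one possible design choice; a looser variant could also report pathways supported by metabolites alone and flag them for follow-up.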