Data Preparation for Somatic eQTL Analysis in Patients with Multiple Myeloma
Analysis of somatic, non-coding mutations in cancer is an emerging area of research, with only a few studies that aim to understand how these mutations contribute to cancer development and progression. Because the coding region makes up less than 2% of the human genome, attention is shifting towards the far greater number of somatic variants in non-coding regions and how these variants affect cancer. One way to analyze the effects of genetic variation is expression quantitative trait locus (eQTL) analysis, which identifies genetic variants that explain a portion of the inter-individual differences in gene expression levels. While methods for germline eQTL analysis are well established, somatic eQTL analysis is a novel approach with no clearly defined methodology. Furthermore, somatic non-coding mutations are more challenging to detect because somatic mutations are heterogeneous, both among the cells of a given individual and between individuals. The goal of this study is to identify somatic eQTLs in a cohort of patients with multiple myeloma, using samples from the Multiple Myeloma Research Foundation CoMMpass study. CoMMpass is a longitudinal study that collects and analyzes bone marrow samples from roughly 1150 patients with newly diagnosed multiple myeloma, with the goal of mapping genomic profiles to understand each patient’s response to treatment. Our work focuses on the bone marrow samples taken at baseline, prior to any cancer treatment. Of the available patient samples, 519 had both whole genome long-insert and mRNA sequencing. These samples underwent whole genome long-insert sequencing at relatively low depth, and MuTect was used to identify somatic mutations. These data were then filtered to retain only the somatic mutations that occurred within non-coding regions.
Clusters of non-coding mutations within 50 base pairs of each other were then merged and binned together, under the assumption that mutations located this close to each other affect the same proximal gene. After binning, the next step is the somatic eQTL analysis, which will identify loci associated with the expression of genes within the tumors of the patients with multiple myeloma. Somatic eQTL analysis can provide insight into the effects of non-coding mutations in cancer and may aid in identifying the best therapies for patients based on the entire genetic makeup of their tumor.
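The binning step can be illustrated with a minimal sketch. It is not the study's actual pipeline: the function name `bin_mutations`, the `(chromosome, position)` input format, and the choice to chain mutations by nearest-neighbor distance (each mutation joins the current bin if it lies within 50 base pairs of the previous mutation in that bin) are all assumptions made for illustration. A fixed-window scheme would be an equally plausible reading of "within 50 base pairs of each other".

```python
from collections import defaultdict


def bin_mutations(mutations, max_gap=50):
    """Merge nearby mutations into bins (hypothetical helper, not from the study).

    mutations: iterable of (chromosome, position) tuples.
    max_gap:   largest distance in base pairs between neighboring mutations
               that still places them in the same bin (assumed to be 50).
    Returns a list of (chromosome, start, end, mutation_count) tuples.
    """
    # Group positions by chromosome so bins never span chromosomes.
    by_chrom = defaultdict(list)
    for chrom, pos in mutations:
        by_chrom[chrom].append(pos)

    bins = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        start = end = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - end <= max_gap:
                # Close enough to the previous mutation: extend the bin.
                end = pos
                count += 1
            else:
                # Gap too large: close the current bin and start a new one.
                bins.append((chrom, start, end, count))
                start = end = pos
                count = 1
        bins.append((chrom, start, end, count))
    return bins


# Made-up example coordinates, for illustration only.
example = [("chr1", 100), ("chr1", 140), ("chr1", 185), ("chr1", 300), ("chr2", 120)]
binned = bin_mutations(example)
```

In this made-up example, the first three chr1 mutations chain into a single bin because each lies within 50 base pairs of its neighbor, while the mutation at position 300 and the chr2 mutation each form their own bin. Note that chaining lets a bin grow wider than 50 base pairs overall, which may or may not match the intended definition.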