Assembly of V(D)J on the Immunoglobulin Heavy Chain
Multiple myeloma is a hematological cancer that affects plasma cells. Plasma cells typically reside in the bone marrow and produce immunoglobulin proteins, more commonly known as antibodies, to aid the immune response and fight infections. In multiple myeloma, however, the malignant plasma cells produce an abnormal antibody called M protein, which offers no benefit to the body. The focus of this study was to identify which immunoglobulin heavy chain (IgH) V(D)J rearrangement is present in each patient in the MMRF CoMMpass cohort through local assembly of the B cell receptor region using either DNA or RNA sequencing data. Each patient should have two isoforms of the V(D)J region: a more abundant, functional isoform and a less abundant, non-functional one. The goal was to identify the most abundant isoform, as this could be used to characterize the patient’s myeloma. Patients participating in the CoMMpass study, funded by the Multiple Myeloma Research Foundation, have bone marrow aspirate biopsies collected, which then undergo whole genome, whole exome, and RNA sequencing (RNAseq).
From here, two approaches, DNA-based and RNA-based, were considered for assembling the IgH V(D)J region, which is located on chromosome 14 (Chr14:105586437-106879844). Both approaches shared the same fundamental workflow: assemble the V(D)J region >> identify contigs of interest >> align RNA reads to the identified contigs and quantify them to determine which form is the most abundant. Human reference build GRCh38/hg38 was used for both approaches because it has better representation of the V(D)J region. Velvet, a de novo assembler developed at the European Bioinformatics Institute, was selected for the DNA-based approach using WGS data. V’DJer, an anchoring-based assembler for RNAseq data developed at Lineberger Comprehensive Cancer Center, was selected for the RNA-based approach.
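The final step of this workflow, picking the most abundant contig from the RNA alignments, can be illustrated with a minimal Python sketch. It assumes that alignment has already been done and that each aligned read has been reduced to the name of the contig it mapped to. The function name `rank_contigs`, the contig names, and the read counts are hypothetical and are not taken from the study or from either assembler.

```python
from collections import Counter


def rank_contigs(read_assignments):
    """Rank contigs by the number of RNA reads assigned to them.

    read_assignments: iterable of contig names, one entry per aligned read
    (hypothetical input; producing it requires a separate alignment step).
    Returns a list of (contig, read_count, fraction_of_reads) tuples,
    ordered from most to least abundant.
    """
    counts = Counter(read_assignments)
    total = sum(counts.values())
    # most_common() yields contigs in descending order of read count;
    # an empty input yields an empty list, so no division by zero occurs.
    return [(contig, n, n / total) for contig, n in counts.most_common()]


# Made-up example: 90 reads map to one contig and 10 to another.
assignments = ["contig_A"] * 90 + ["contig_B"] * 10
ranked = rank_contigs(assignments)
primary = ranked[0][0]  # candidate for the most abundant isoform
```

Under these assumptions the top-ranked contig would be treated as the candidate primary isoform and the next one as the candidate secondary isoform. A real analysis would likely also need to normalize for contig length and handle reads that map to more than one contig.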
Initial trials with Velvet produced contigs for which it was difficult to validate which V, D, and J genes were present. As a result, the RNA-based approach became the focus. V’DJer was first tested on sample MMRF_2313, selected for its high purity (98.8%) and known V gene expression. V’DJer successfully assembled across the V(D)J region of this sample, which instilled confidence in the tool’s ability to assemble over the region of interest. V’DJer was then used to assemble 9 other samples: 5 had a lower purity value (<80%), and the remaining 4 had a high purity value (>90%) and were expected to behave similarly to the initial test sample. These trials returned mixed results, with 3 samples assembling the primary isoform (both high and low purity samples) and 3 additional samples assembling the secondary isoforms. The remaining 4 samples did not assemble the expected isoforms. The overall process shows promise, and with additional tuning of technical parameters, assembly of the V(D)J region for both primary and secondary isoforms should be possible.