Germline copy number variation calling and analysis in an American Indian population
American Indians (AIs) face profound cancer health disparities: compared with other racial groups in the United States, they have the highest cancer rates, the worst patient survival, and a much younger age at diagnosis. Because genomic research initiatives have historically not focused on AI populations, AIs have been excluded from the genomic studies that inform the development of precision medicine therapeutics, leading to precision medicine inequities. For example, the current reference genome (GRCh38) is largely Caucasian; therefore, precision medicine for American Indian communities cannot improve unless they are included in genomic research. Almarri et al. completed a comprehensive analysis of structural variation in different ethnic groups from the Human Genome Diversity Panel (HGDP), including consented members of the Pima Indian population from Northern Mexico. Our goal was to extend this analysis and identify significant copy number variations (CNVs) unique to this Pima population. CNVs were called using Delly, and the calls were then refined using VCFtools’ quality and genotyping features. These variants were subsequently annotated with the Ensembl Variant Effect Predictor and mapped to genes using Ensembl BioMart, which identified 44 protein-coding genes. The disease biology of these genes was determined using NCBI and GeneCards. From this subset, the biological relevance of cancer-associated genes was assessed, identifying genes such as LGI1, STEAP2, and NRG1. These results indicate that this CNV analysis of Pima American Indian genomes has clinically relevant applications.
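
The refinement step described above, which keeps only confidently called CNVs from the caller's output, can be illustrated with a minimal sketch. This is not the study's pipeline, which used Delly and VCFtools. It assumes VCF records that carry `SVTYPE` and `END` in the INFO column and `PASS` in the FILTER column, as Delly-style output does. The names `CnvRecord`, `CNV_TYPES`, `parse_info`, and `passing_cnvs`, and the example records, are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple

# Assumption: these structural variant types are treated as copy number variants.
CNV_TYPES = {"DEL", "DUP", "CNV"}


class CnvRecord(NamedTuple):
    chrom: str
    start: int
    end: int
    svtype: str


def parse_info(info: str) -> Dict[str, str]:
    """Split a VCF INFO column into a key -> value mapping (flags map to '')."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


def passing_cnvs(vcf_lines: Iterable[str]) -> List[CnvRecord]:
    """Keep records whose FILTER is PASS and whose SVTYPE is a CNV type."""
    kept = []
    for line in vcf_lines:
        # Skip blank lines and VCF header lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        chrom, pos, flt = cols[0], int(cols[1]), cols[6]
        info = parse_info(cols[7])
        if flt != "PASS":
            continue
        if info.get("SVTYPE") not in CNV_TYPES or "END" not in info:
            continue
        kept.append(CnvRecord(chrom, pos, int(info["END"]), info["SVTYPE"]))
    return kept


# Hypothetical records for illustration only; not data from the study.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr8\t1000\tDEL0001\tN\t<DEL>\t60\tPASS\tPRECISE;SVTYPE=DEL;END=5000",
    "chr7\t2000\tDUP0001\tN\t<DUP>\t10\tLowQual\tIMPRECISE;SVTYPE=DUP;END=9000",
    "chr10\t3000\tINV0001\tN\t<INV>\t60\tPASS\tPRECISE;SVTYPE=INV;END=4000",
]

for rec in passing_cnvs(example):
    print(rec.chrom, rec.start, rec.end, rec.svtype)
```

In this sketch, the low-quality duplication and the inversion are dropped, leaving only the passing deletion. A real analysis would additionally apply genotype-level filters and compare calls across populations to find variants unique to one group.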