1. Tuncer SB, Celik B, Erciyas SK, Erdogan OS, Gültaslar BK, Odemis DA, Avsar M, Sen F, Saip PM, Yazici H. Germline mutational variants of Turkish ovarian cancer patients suspected of Hereditary Breast and Ovarian Cancer (HBOC) by next-generation sequencing. Pathol Res Pract 2024; 254:155075. [PMID: 38219492 DOI: 10.1016/j.prp.2023.155075] [Received: 08/25/2023] [Revised: 12/11/2023] [Accepted: 12/29/2023] [Indexed: 01/16/2024]
Abstract
Hereditary Breast and Ovarian Cancer (HBOC) syndrome is characterized by an increased risk of developing breast cancer (BC) and ovarian cancer (OC) due to inherited genetic mutations. Understanding the genetic variants associated with HBOC is crucial for identifying individuals at high risk and implementing appropriate preventive measures. The study included 630 Turkish OC patients who met the National Comprehensive Cancer Network (NCCN) criteria for HBOC. Genomic DNA was extracted from peripheral blood samples, and targeted next-generation sequencing (NGS) was performed. Bioinformatics analysis and variant interpretation were conducted to identify pathogenic variants (PVs). Our analysis revealed a spectrum of germline PVs associated with HBOC in Turkish OC patients, including several in BRCA1, BRCA2, and other DNA repair genes. Specifically, we observed germline PVs in 130 individuals, accounting for 20.63% of the total cohort. We detected 76 distinct PVs across eight genes: BRCA1 (40 PVs), BRCA2 (29 PVs), ATM (1 PV), CHEK2 (2 PVs), ERCC2 (1 PV), MUTYH (1 PV), RAD51C (1 PV), and TP53 (1 PV). In addition, two different PVs (c.135-2A>G p.? in BRCA1 and c.6466_6469delTCTC in BRCA2) were detected in a single 34-year-old OC patient. In conclusion, our study contributes to a better understanding of the genetic variants underlying HBOC in Turkish OC patients. These findings provide valuable insights into the genetic architecture of HBOC in the Turkish population and shed light on the potential contribution of specific germline PVs to the increased risk of OC.
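The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the variable names are illustrative, and the BRCA1/BRCA2 share is a derived figure that the abstract itself does not report.

```python
# Per-gene counts of distinct pathogenic variants (PVs), as listed in the abstract.
pv_counts = {
    "BRCA1": 40, "BRCA2": 29, "ATM": 1, "CHEK2": 2,
    "ERCC2": 1, "MUTYH": 1, "RAD51C": 1, "TP53": 1,
}

# The per-gene counts should add up to the reported total of distinct PVs.
total_distinct_pvs = sum(pv_counts.values())

# Carrier rate: individuals with a germline PV divided by cohort size.
carriers, cohort = 130, 630
carrier_rate_pct = 100 * carriers / cohort

# Derived figure (not stated in the abstract): share of distinct PVs in BRCA1/BRCA2.
brca_share_pct = 100 * (pv_counts["BRCA1"] + pv_counts["BRCA2"]) / total_distinct_pvs

print(total_distinct_pvs, round(carrier_rate_pct, 2), round(brca_share_pct, 1))
```

The per-gene counts sum to the 76 distinct PVs reported, and 130 of 630 patients corresponds to the stated 20.63%.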
Affiliation(s)
- Seref Bugra Tuncer
  - Department of Cancer Genetics, Istanbul Faculty of Medicine, Oncology Institute, Istanbul University, Istanbul, Türkiye
- Betul Celik
  - Erzincan Binali Yıldırım University, Department of Molecular Biology, Erzincan, Türkiye
- Seda Kilic Erciyas
  - Department of Cancer Genetics, Istanbul Faculty of Medicine, Oncology Institute, Istanbul University, Istanbul, Türkiye
- Ozge Sukruoglu Erdogan
  - Department of Cancer Genetics, Istanbul Faculty of Medicine, Oncology Institute, Istanbul University, Istanbul, Türkiye
- Busra Kurt Gültaslar
  - Department of Cancer Genetics, Istanbul Faculty of Medicine, Oncology Institute, Istanbul University, Istanbul, Türkiye
- Demet Akdeniz Odemis
  - Department of Cancer Genetics, Istanbul Faculty of Medicine, Oncology Institute, Istanbul University, Istanbul, Türkiye
- Mukaddes Avsar
  - Health Services Vocational of Higher Education, T.C. Istanbul Aydın University, Istanbul, Türkiye
- Fatma Sen
  - Clinic of Medical Oncology, Avrasya Hospital, Istanbul, Türkiye
- Pınar Mualla Saip
  - Department of Medical Oncology, Oncology Institute, Istanbul University, Istanbul, Türkiye
- Hulya Yazici
  - Istanbul Arel University, Arel Medical Faculty, Department of Medical Biology and Genetics, Istanbul, Türkiye
2. Jiagge E, Jin DX, Newberg JY, Perea-Chamblee T, Pekala KR, Fong C, Waters M, Ma D, Dei-Adomakoh Y, Erb G, Arora KS, Maund SL, Njiraini N, Ntekim A, Kim S, Bai X, Thomas M, van Eeden R, Hegde P, Jee J, Chakravarty D, Schultz N, Berger MF, Frampton GM, Sokol ES, Carrot-Zhang J. Tumor sequencing of African ancestry reveals differences in clinically relevant alterations across common cancers. Cancer Cell 2023; 41:1963-1971.e3. [PMID: 37890492 PMCID: PMC11097212 DOI: 10.1016/j.ccell.2023.10.003] [Received: 10/25/2022] [Revised: 08/02/2023] [Accepted: 10/04/2023] [Indexed: 10/29/2023]
Abstract
Cancer genomes from patients with African (AFR) ancestry have been poorly studied in clinical research. We leverage two large genomic cohorts to investigate the relationship between genomic alterations and AFR ancestry in six common cancers. We observe cross-cancer-type associations, such as an enrichment of MYC amplification with AFR ancestry in lung, breast, and prostate cancers and a depletion of BRAF alterations in colorectal and pancreatic cancers. There are differences in actionable alterations, such as depletion of KRAS G12C and EGFR L858R, and enrichment of ROS1 fusion with AFR ancestry in lung cancers. Interestingly, in lung cancer, KRAS mutations are less common in both smokers and non-smokers with AFR ancestry, whereas the association of TP53 mutations with AFR ancestry is only seen in smokers, suggesting an ancestry-environment interaction that modifies driver rates. Our study highlights the need to increase representation of patients with AFR ancestry in drug development and biomarker discovery.
Affiliation(s)
- Evelyn Jiagge
  - Hematology/Oncology Division, Department of Medicine, Henry Ford Health System, Detroit, MI, USA
- Dexter X. Jin
  - Cancer Genomics Research, Foundation Medicine, Inc., Cambridge, MA, USA
- Justin Y. Newberg
  - Cancer Genomics Research, Foundation Medicine, Inc., Cambridge, MA, USA
- Tomin Perea-Chamblee
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Kelly R. Pekala
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Department of Surgery, Urology Service, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Christopher Fong
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Michele Waters
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- David Ma
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Gilles Erb
  - Global Product Development Medical Affairs – Oncology, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- Kanika S. Arora
  - Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Sophia L. Maund
  - Computational Sciences, Genentech, Inc., South San Francisco, CA, USA
- Njoki Njiraini
  - Department of Oncology, Kenyatta University Teaching Research and Referral Hospital, Nairobi, Kenya
- Atara Ntekim
  - Department of Radiation Oncology, University of Ibadan, Ibadan, Nigeria
- Susie Kim
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Xuechun Bai
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Marlene Thomas
  - Global Product Development Medical Affairs – Oncology, F. Hoffmann-La Roche Ltd, Basel, Switzerland
- Ronwyn van Eeden
  - Department of Medical Oncology, Chris Hani Academic Baragwanath Hospital, Johannesburg, South Africa
- Priti Hegde
  - Cancer Genomics Research, Foundation Medicine, Inc., Cambridge, MA, USA
- Justin Jee
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Debyani Chakravarty
  - Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Nikolaus Schultz
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Michael F. Berger
  - Marie-Josée and Henry R. Kravis Center for Molecular Oncology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Ethan S. Sokol
  - Cancer Genomics Research, Foundation Medicine, Inc., Cambridge, MA, USA
- Jian Carrot-Zhang
  - Computational Oncology, Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Clinical Genetics, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
3. Ding Y, Liao Y, He J, Ma J, Wei X, Liu X, Zhang G, Wang J. Enhancing genomic mutation data storage optimization based on the compression of asymmetry of sparsity. Front Genet 2023; 14:1213907. [PMID: 37323665 PMCID: PMC10267386 DOI: 10.3389/fgene.2023.1213907] [Received: 04/28/2023] [Accepted: 05/24/2023] [Indexed: 06/17/2023] Open
Abstract
Background: With the rapid development of high-throughput sequencing technology and the explosive growth of genomic data, storing, transmitting and processing massive amounts of data has become a new challenge. Achieving fast lossless compression and decompression that exploits the characteristics of the data, and so speeds up data transmission and processing, requires research on suitable compression algorithms.
Methods: In this paper, a compression algorithm for sparse asymmetric gene mutations (CA_SAGM) based on the characteristics of sparse genomic mutation data was proposed. The data were first sorted on a row-first basis so that neighboring non-zero elements were as close as possible to each other. The data were then renumbered using the reverse Cuthill-McKee sorting technique. Finally, the data were compressed into compressed sparse row (CSR) format and stored. We analyzed and compared the results of the CA_SAGM, coordinate format (COO) and compressed sparse column format (CSC) algorithms for sparse asymmetric genomic data. Nine types of single-nucleotide variation (SNV) data and six types of copy number variation (CNV) data from the TCGA database were used as the subjects of this study. Compression and decompression time, compression and decompression rate, compression memory and compression ratio were used as evaluation metrics. The correlation between each metric and the basic characteristics of the original data was further investigated.
Results: The experimental results showed that the COO method had the shortest compression time, the fastest compression rate and the largest compression ratio, and therefore the best compression performance. CSC compression performance was the worst, and CA_SAGM compression performance was between the two. When decompressing the data, CA_SAGM performed the best, with the shortest decompression time and the fastest decompression rate. COO decompression performance was the worst. With increasing sparsity, the COO, CSC and CA_SAGM algorithms all exhibited longer compression and decompression times, lower compression and decompression rates, larger compression memory and lower compression ratios. When the sparsity was large, the compression memory and compression ratio of the three algorithms showed no differences, but the remaining metrics still differed.
Conclusion: CA_SAGM was an efficient compression algorithm that combines compression and decompression performance for sparse genomic mutation data.
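To make the storage formats named in this abstract concrete, the following is a minimal pure-Python sketch of coordinate (COO) and compressed sparse row (CSR) encoding with a lossless round trip. It is not the CA_SAGM implementation: the row-first sorting and reverse Cuthill-McKee renumbering steps are omitted, and the function names and the toy matrix are hypothetical.

```python
def to_coo(dense):
    """Coordinate format: one (row, col, value) triple per non-zero entry."""
    return [(i, j, v)
            for i, row in enumerate(dense)
            for j, v in enumerate(row) if v != 0]


def to_csr(dense):
    """Compressed sparse row: values, their column indices, and row pointers."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                data.append(v)
                indices.append(j)
        # indptr[r + 1] marks where row r ends inside data/indices.
        indptr.append(len(data))
    return data, indices, indptr


def from_csr(data, indices, indptr, n_cols):
    """Rebuild the dense matrix from its CSR arrays (lossless decompression)."""
    dense = []
    for r in range(len(indptr) - 1):
        row = [0] * n_cols
        for k in range(indptr[r], indptr[r + 1]):
            row[indices[k]] = data[k]
        dense.append(row)
    return dense


# Hypothetical toy matrix: rows are samples, columns are loci, 1 marks a mutation.
matrix = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1],
]

coo = to_coo(matrix)
data, indices, indptr = to_csr(matrix)
restored = from_csr(data, indices, indptr, n_cols=6)
print(coo)
print(data, indices, indptr)
```

CSR replaces the per-entry row index of COO with one pointer per row, so rebuilding a row only requires reading a contiguous slice of `data` and `indices`.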
Affiliation(s)
- Youde Ding
  - The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People’s Hospital, Qingyuan, China
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Yuan Liao
  - The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People’s Hospital, Qingyuan, China
- Ji He
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Jianfeng Ma
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Xu Wei
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Xuemei Liu
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Guiying Zhang
  - The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People’s Hospital, Qingyuan, China
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
- Jing Wang
  - The Sixth Affiliated Hospital of Guangzhou Medical University, Qingyuan People’s Hospital, Qingyuan, China
  - School of Biomedical Engineering, Guangzhou Medical University, Guangzhou, China
4. Pipek O, Alpár D, Rusz O, Bödör C, Udvarnoki Z, Medgyes-Horváth A, Csabai I, Szállási Z, Madaras L, Kahán Z, Cserni G, Kővári B, Kulka J, Tőkés AM. Genomic Landscape of Normal and Breast Cancer Tissues in a Hungarian Pilot Cohort. Int J Mol Sci 2023; 24:8553. [PMID: 37239898 DOI: 10.3390/ijms24108553] [Received: 03/20/2023] [Revised: 04/24/2023] [Accepted: 04/27/2023] [Indexed: 05/28/2023] Open
Abstract
A limited number of studies have focused on the mutational landscape of breast cancer in different ethnic populations within Europe and compared the data with other ethnic groups and databases. We performed whole-genome sequencing of 63 samples from 29 Hungarian breast cancer patients. We validated a subset of the identified variants at the DNA level using the Illumina TruSight Oncology (TSO) 500 assay. Canonical breast-cancer-associated genes with pathogenic germline mutations were CHEK2 and ATM. Nearly all the observed germline mutations were as frequent in the Hungarian breast cancer cohort as in independent European populations. The majority of the detected somatic short variants were single-nucleotide polymorphisms (SNPs), and only 8% and 6% of them were deletions or insertions, respectively. The genes most frequently affected by somatic mutations were KMT2C (31%), MUC4 (34%), PIK3CA (18%), and TP53 (34%). Copy number alterations were most common in the NBN, RAD51C, BRIP1, and CDH1 genes. For many samples, the somatic mutational landscape was dominated by mutational processes associated with homologous recombination deficiency (HRD). Our study, as the first breast tumor/normal sequencing study in Hungary, revealed several aspects of the significantly mutated genes and mutational signatures, and some of the copy number variations and somatic fusion events. Multiple signs of HRD were detected, highlighting the value of the comprehensive genomic characterization of breast cancer patient populations.
Affiliation(s)
- Orsolya Pipek
  - Department of Physics of Complex Systems, Institute of Physics, Eötvös Loránd University, 1117 Budapest, Hungary
- Donát Alpár
  - HCEMM-SE Molecular Oncohematology Research Group, Department of Pathology and Experimental Cancer Research, Semmelweis University, 1085 Budapest, Hungary
- Orsolya Rusz
  - Department of Pathology, Forensic and Insurance Medicine, SE NAP, Brain Metastasis Research Group, Semmelweis University, 1091 Budapest, Hungary
- Csaba Bödör
  - HCEMM-SE Molecular Oncohematology Research Group, Department of Pathology and Experimental Cancer Research, Semmelweis University, 1085 Budapest, Hungary
- Zoltán Udvarnoki
  - Department of Physics of Complex Systems, Institute of Physics, Eötvös Loránd University, 1117 Budapest, Hungary
- Anna Medgyes-Horváth
  - Department of Physics of Complex Systems, Institute of Physics, Eötvös Loránd University, 1117 Budapest, Hungary
- István Csabai
  - Department of Physics of Complex Systems, Institute of Physics, Eötvös Loránd University, 1117 Budapest, Hungary
- Zoltán Szállási
  - Department of Pathology, Forensic and Insurance Medicine, SE NAP, Brain Metastasis Research Group, Semmelweis University, 1091 Budapest, Hungary
  - Computational Health Informatics Program (CHIP), Boston Children's Hospital, Harvard Medical School, Boston, MA 02115, USA
  - Danish Cancer Society Research Center, 2100 Copenhagen, Denmark
- Lilla Madaras
  - Department of Pathology, Forensic and Insurance Medicine, Semmelweis University, 1091 Budapest, Hungary
- Zsuzsanna Kahán
  - Department of Oncotherapy, University of Szeged, 6720 Szeged, Hungary
- Gábor Cserni
  - Department of Pathology, Albert Szent-Györgyi Medical Centre, University of Szeged, 6720 Szeged, Hungary
  - Department of Pathology, Bács-Kiskun County Teaching Hospital, 6000 Kecskemét, Hungary
- Bence Kővári
  - Department of Pathology, Albert Szent-Györgyi Medical Centre, University of Szeged, 6720 Szeged, Hungary
  - Department of Pathology, Henry Lee Moffitt Cancer Center and Research Institute, Tampa, FL 33612, USA
- Janina Kulka
  - Department of Pathology, Forensic and Insurance Medicine, Semmelweis University, 1091 Budapest, Hungary
- Anna Mária Tőkés
  - Department of Pathology, Forensic and Insurance Medicine, Semmelweis University, 1091 Budapest, Hungary
5. Hamid I, Korunes KL, Schrider DR, Goldberg A. Localizing Post-Admixture Adaptive Variants with Object Detection on Ancestry-Painted Chromosomes. Mol Biol Evol 2023; 40:msad074. [PMID: 36947126 PMCID: PMC10116606 DOI: 10.1093/molbev/msad074] [Received: 09/04/2022] [Revised: 03/14/2023] [Accepted: 03/20/2023] [Indexed: 03/23/2023] Open
Abstract
Gene flow between previously differentiated populations during the founding of an admixed or hybrid population has the potential to introduce adaptive alleles into the new population. If the adaptive allele is common in one source population, but not the other, then as the adaptive allele rises in frequency in the admixed population, genetic ancestry from the source containing the adaptive allele will increase nearby as well. Patterns of genetic ancestry have therefore been used to identify post-admixture positive selection in humans and other animals, including examples in immunity, metabolism, and animal coloration. A common method identifies regions of the genome that have local ancestry "outliers" compared with the distribution across the rest of the genome, considering each locus independently. However, we lack theoretical models for expected distributions of ancestry under various demographic scenarios, resulting in potential false positives and false negatives. Further, ancestry patterns between distant sites are often not independent. As a result, current methods tend to infer wide genomic regions containing many genes as under selection, limiting biological interpretation. Instead, we develop a deep learning object detection method applied to images generated from local ancestry-painted genomes. This approach preserves information from the surrounding genomic context and avoids potential pitfalls of user-defined summary statistics. We find the method is robust to a variety of demographic misspecifications using simulated data. Applied to human genotype data from Cabo Verde, we localize a known adaptive locus to a single narrow region compared with multiple or long windows obtained using two other ancestry-based methods.
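The per-locus "outlier" scan that this abstract describes as the common approach (not the authors' object detection method) can be sketched as a simple z-score test. The threshold, function name, and ancestry values below are hypothetical, chosen only to illustrate how each locus is scored independently against the genome-wide distribution.

```python
from statistics import mean, pstdev


def ancestry_outliers(ancestry_by_locus, threshold=3.0):
    """Return indices of loci whose mean local ancestry deviates from the
    genome-wide mean by more than `threshold` standard deviations.

    Each locus is tested independently, which is the limitation the
    abstract points out: neighboring loci are in fact correlated.
    """
    mu = mean(ancestry_by_locus)
    sigma = pstdev(ancestry_by_locus)
    if sigma == 0:
        return []
    return [i for i, a in enumerate(ancestry_by_locus)
            if abs(a - mu) / sigma > threshold]


# Hypothetical per-locus ancestry proportions from one source population;
# locus 11 is deliberately elevated to mimic a post-admixture sweep.
ancestry = [0.50, 0.52, 0.49, 0.51, 0.50, 0.48, 0.51, 0.50,
            0.49, 0.52, 0.50, 0.85, 0.51, 0.49, 0.50, 0.50]

outliers = ancestry_outliers(ancestry)
print(outliers)
```

Because the cutoff is defined relative to the empirical genome-wide distribution rather than a demographic model, a scan like this has no principled false-positive rate, which is the gap the abstract highlights.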
Affiliation(s)
- Iman Hamid
  - Department of Evolutionary Anthropology, Duke University, Durham, NC
- Daniel R Schrider
  - Department of Genetics, University of North Carolina, Chapel Hill, NC
- Amy Goldberg
  - Department of Evolutionary Anthropology, Duke University, Durham, NC
6. Todisco S, Musio B, Pesce V, Cavalluzzi MM, Petrosillo G, La Piana G, Sgobba MN, Schlosserová N, Cafferati Beltrame L, Di Lorenzo R, Tragni V, Marzulli D, Guerra L, De Grassi A, Gallo V, Volpicella M, Palese LL, Lentini G, Pierri CL. Targeting mitochondrial impairment for the treatment of cardiovascular diseases: From hypertension to ischemia-reperfusion injury, searching for new pharmacological targets. Biochem Pharmacol 2023; 208:115405. [PMID: 36603686 DOI: 10.1016/j.bcp.2022.115405] [Received: 11/16/2022] [Revised: 12/26/2022] [Accepted: 12/28/2022] [Indexed: 01/03/2023]
Abstract
Mitochondria and mitochondrial proteins represent a group of promising pharmacological target candidates in the search for new molecular targets and drugs to counteract the onset of hypertension and, more generally, cardiovascular diseases (CVDs). Indeed, several mitochondrial pathways are impaired in CVDs, with ATP depletion and ROS production as common traits of cardiac tissue degeneration. Thus, targeting mitochondrial dysfunction in cardiomyocytes can represent a successful strategy to prevent heart failure. In this context, the identification of new pharmacological targets among mitochondrial proteins paves the way for the design of new selective drugs. Thanks to advances in omics approaches, the greater availability of crystallized mitochondrial protein structures, and the development of new computational approaches for protein 3D modelling and drug design, it is now possible to investigate impaired mitochondrial pathways in CVDs in detail. Furthermore, it is possible to design new powerful drugs able to hit the selected pharmacological targets in a highly selective way to rescue mitochondrial dysfunction and prevent cardiac tissue degeneration. The role of mitochondrial dysfunction in the onset of CVDs appears increasingly evident, as reflected by the impairment of proteins involved in lipid peroxidation, mitochondrial dynamics, respiratory chain complexes, and membrane polarization maintenance in CVD patients. Conversely, little is known about proteins responsible for the cross-talk between mitochondria and cytoplasm in cardiomyocytes. Mitochondrial transporters of the SLC25A family, in particular, are responsible for the translocation of nucleotides (e.g., ATP), amino acids (e.g., aspartate, glutamate, ornithine), organic acids (e.g., malate and 2-oxoglutarate), and other cofactors (e.g., inorganic phosphate, NAD+, FAD, carnitine, CoA derivatives) between the mitochondrial and cytosolic compartments. Thus, mitochondrial transporters play a key role in the mitochondria-cytosol cross-talk by driving metabolic pathways such as the malate/aspartate shuttle, the carnitine shuttle, ATP export from mitochondria, and the regulation of permeability transition pore opening. Since all these pathways are crucial for maintaining healthy cardiomyocytes, mitochondrial carriers emerge as an interesting class of new possible pharmacological targets for CVD treatments.
7. Frontanilla TS, Valle-Silva G, Ayala J, Mendes-Junior CT. Open-Access Worldwide Population STR Database Constructed Using High-Coverage Massively Parallel Sequencing Data Obtained from the 1000 Genomes Project. Genes (Basel) 2022; 13:2205. [PMID: 36553472 PMCID: PMC9778533 DOI: 10.3390/genes13122205] [Received: 10/15/2022] [Revised: 11/13/2022] [Accepted: 11/21/2022] [Indexed: 11/27/2022] Open
Abstract
Achieving accurate STR genotyping by using next-generation sequencing data has been challenging. To provide the forensic genetics community with a reliable open-access STR database, we conducted a comprehensive genotyping analysis of a set of STRs of broad forensic interest obtained from 1000 Genomes populations. We analyzed 22 STR markers using files of the high-coverage dataset of Phase 3 of the 1000 Genomes Project. We used HipSTR to call genotypes from 2504 samples obtained from 26 populations. We were not able to detect the D21S11 marker. The Hardy-Weinberg equilibrium analysis coupled with a comprehensive analysis of allele frequencies revealed that HipSTR was not able to identify longer alleles, which resulted in heterozygote deficiency. Nevertheless, AMOVA, a clustering analysis with STRUCTURE, and a Principal Coordinates Analysis showed a clear-cut separation between the four major ancestries sampled by the 1000 Genomes Consortium. Except for larger Penta D and Penta E alleles, and two very small Penta D alleles (2.2 and 3.2) usually observed in African populations, our analyses revealed that allele frequencies and genotypes offered as an open-access database are consistent and reliable.
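The heterozygote deficiency mentioned in this abstract can be illustrated by comparing observed heterozygosity with the value expected under Hardy-Weinberg equilibrium. The sketch below is not the authors' analysis: the genotypes are hypothetical, the expected value uses the simple uncorrected formula 1 − Σp², and the function name is illustrative.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one multi-allelic locus.

    observed: fraction of individuals carrying two different alleles.
    expected: 1 - sum(p_i^2) under Hardy-Weinberg equilibrium, with p_i
              estimated from the sample (no small-sample correction).
    """
    n = len(genotypes)
    allele_counts = Counter(a for g in genotypes for a in g)
    total_alleles = 2 * n
    expected = 1 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    observed = sum(1 for a, b in genotypes if a != b) / n
    return observed, expected


# Hypothetical STR genotypes (repeat counts). If a caller misses the longer
# allele of a heterozygote, that individual is reported as homozygous.
genotypes = [(12, 12), (12, 14), (14, 14), (12, 12),
             (13, 13), (12, 13), (14, 14), (12, 12)]

observed, expected = heterozygosity(genotypes)
# Positive F means fewer heterozygotes than Hardy-Weinberg predicts.
f_is = 1 - observed / expected
print(observed, expected, round(f_is, 3))
```

A systematic excess of homozygous calls, as would arise if longer alleles drop out, pushes the observed value below the expected one and yields a positive deficiency index.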
Affiliation(s)
- Tamara Soledad Frontanilla
  - Departamento de Genética, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto 14049-900, SP, Brazil
- Guilherme Valle-Silva
  - Departamento de Química, Laboratório de Pesquisas Forenses e Genômicas, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto 14040-901, SP, Brazil
- Jesus Ayala
  - Facultad de Ingeniería Informática, Universidad de la Integración de las Americas, Asunción 00120-6, Paraguay
- Celso Teixeira Mendes-Junior
  - Departamento de Química, Laboratório de Pesquisas Forenses e Genômicas, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo, Ribeirão Preto 14040-901, SP, Brazil
8. Robinson M, Joshi A, Vidyarthi A, Maccoun M, Rangavajjhala S, Glusman G. Quality control of large genome datasets. Human Genetics and Genomics Advances 2022; 3:100123. [PMID: 35789587 PMCID: PMC9250042 DOI: 10.1016/j.xhgg.2022.100123] [Received: 12/08/2021] [Accepted: 06/02/2022] [Indexed: 11/26/2022] Open
Abstract
The 1000 Genomes Project (TGP) is a foundational resource that serves the biomedical community as a standard reference cohort for human genetic variation. There are now seven public versions of these genomes. The TGP Consortium produced the first by mapping its final data release against human reference sequence GRCh37, then “lifted over” these genomes to the improved reference sequence (GRCh38) when it was released, and remapped the original data to GRCh38 with two similar pipelines. As best-practice quality validation, the pipelines that generated these versions were benchmarked against the Genome In A Bottle Consortium’s “platinum quality” genome (NA12878). The New York Genome Center recently released the results of independently resequencing the cohort at greater depth (30×), a phased version informed by the inclusion of related individuals, and independently remapped the original variant calls to GRCh38. We performed a cross-comparison evaluation of all seven versions using genome fingerprinting, which supports ultrafast genome comparison even across reference versions. We noted multiple issues, including discrepancies in cohort membership, disagreement on the overall level of variation, evidence of substandard pipeline performance on specific genomes and in specific regions of the genome, cryptic relationships between individuals, inconsistent phasing, and annotation distortions caused by the history of the reference genome itself. We therefore recommend global quality assessment by rapid genome comparisons, alongside benchmarking as part of best-practice quality assessment of large genome datasets. Our observations also help inform the decision of which version to use, to support analyses by individual researchers.
9. Avery CL, Howard AG, Ballou AF, Buchanan VL, Collins JM, Downie CG, Engel SM, Graff M, Highland HM, Lee MP, Lilly AG, Lu K, Rager JE, Staley BS, North KE, Gordon-Larsen P. Strengthening Causal Inference in Exposomics Research: Application of Genetic Data and Methods. Environ Health Perspect 2022; 130:55001. [PMID: 35533073 PMCID: PMC9084332 DOI: 10.1289/ehp9098] [Indexed: 05/11/2023]
Abstract
Advances in technologies to measure a broad set of exposures have led to a range of exposome research efforts. Yet, these efforts have insufficiently integrated methods that incorporate genetic data to strengthen causal inference, despite evidence that many exposome-associated phenotypes are heritable. Objective: We demonstrate how integration of methods and study designs that incorporate genetic data can strengthen causal inference in exposomics research by helping address six challenges: reverse causation and unmeasured confounding, comprehensive examination of phenotypic effects, low efficiency, replication, multilevel data integration, and characterization of tissue-specific effects. Examples are drawn from studies of biomarkers and health behaviors, exposure domains where the causal inference methods we describe are most often applied. Discussion: Technological, computational, and statistical advances in genotyping, imputation, and analysis, combined with broad data sharing and cross-study collaborations, offer multiple opportunities to strengthen causal inference in exposomics research. Full application of these opportunities will require an expanded understanding of genetic variants that predict exposome phenotypes as well as an appreciation that the utility of genetic variants for causal inference will vary by exposure and may depend on large sample sizes. However, several of these challenges can be addressed through international scientific collaborations that prioritize data sharing. Ultimately, we anticipate that efforts to better integrate methods that incorporate genetic data will extend the reach of exposomics research by helping address the challenges of comprehensively measuring the exposome and its health effects across studies, the life course, and in varied contexts and diverse populations. https://doi.org/10.1289/EHP9098.
Collapse
Affiliation(s)
- Christy L Avery
  - Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Annie Green Howard
  - Department of Biostatistics, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Anna F Ballou
  - Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Victoria L Buchanan
  - Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Jason M Collins
  - Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Carolina G Downie
  - Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Stephanie M Engel
  - Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Mariaelisa Graff
  - Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Heather M Highland
  - Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Moa P Lee
  - Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Adam G Lilly
  - Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Department of Sociology, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Kun Lu
  - Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Julia E Rager
  - Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Brooke S Staley
  - Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Kari E North
  - Department of Epidemiology, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Penny Gordon-Larsen
  - Department of Nutrition, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Carolina Population Center, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
10
Abstract
The sequencing of modern and ancient genomes from around the world has revolutionized our understanding of human history and evolution. However, the problem of how best to characterize ancestral relationships from the totality of human genomic variation remains unsolved. Here, we address this challenge with nonparametric methods that enable us to infer a unified genealogy of modern and ancient humans. This compact representation of multiple datasets explores the challenges of missing and erroneous data and uses ancient samples to constrain and date relationships. We demonstrate the power of the method to recover relationships between individuals and populations as well as to identify descendants of ancient samples. Finally, we introduce a simple nonparametric estimator of the geographical location of ancestors that recapitulates key events in human history.
Affiliation(s)
- Anthony Wilder Wohns
  - Broad Institute of MIT and Harvard; Cambridge, MA 02142, USA
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford; Oxford OX3 7LF, UK
- Yan Wong
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford; Oxford OX3 7LF, UK
- Ben Jeffery
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford; Oxford OX3 7LF, UK
- Ali Akbari
  - Broad Institute of MIT and Harvard; Cambridge, MA 02142, USA
  - Department of Human Evolutionary Biology, Harvard University; Cambridge, MA 02138, USA
  - Department of Genetics, Harvard Medical School; Boston, MA 02115, USA
- Swapan Mallick
  - Broad Institute of MIT and Harvard; Cambridge, MA 02142, USA
  - Howard Hughes Medical Institute; Boston, MA 02115, USA
- Ron Pinhasi
  - Department of Evolutionary Anthropology, University of Vienna; 1090 Vienna, Austria
- Nick Patterson
  - Broad Institute of MIT and Harvard; Cambridge, MA 02142, USA
  - Department of Human Evolutionary Biology, Harvard University; Cambridge, MA 02138, USA
  - Howard Hughes Medical Institute; Boston, MA 02115, USA
  - Department of Genetics, Harvard Medical School; Boston, MA 02115, USA
- David Reich
  - Broad Institute of MIT and Harvard; Cambridge, MA 02142, USA
  - Department of Human Evolutionary Biology, Harvard University; Cambridge, MA 02138, USA
  - Howard Hughes Medical Institute; Boston, MA 02115, USA
  - Department of Genetics, Harvard Medical School; Boston, MA 02115, USA
- Jerome Kelleher
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford; Oxford OX3 7LF, UK
- Gil McVean
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford; Oxford OX3 7LF, UK
  - Corresponding author.
11
Bray M, Chang Y, Baker TB, Jorenby D, Carney RM, Fox L, Pham G, Stoneking F, Smock N, Amos CI, Bierut L, Chen LS. The Promise of Polygenic Risk Prediction in Smoking Cessation: Evidence From Two Treatment Trials. Nicotine Tob Res 2022; 24:1573-1580. [PMID: 35170738 PMCID: PMC9575976 DOI: 10.1093/ntr/ntac043] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/20/2021] [Revised: 12/14/2021] [Accepted: 02/14/2022] [Indexed: 11/12/2022]
Abstract
INTRODUCTION: Tobacco use disorder is a complex behavior with a strong genetic component. Genome-wide association studies (GWAS) on smoking behaviors allow for the creation of polygenic risk scores (PRSs) to approximate genetic vulnerability. However, the utility of smoking-related PRSs in predicting smoking cessation in clinical trials remains unknown. AIMS AND METHODS: We evaluated the association between polygenic risk scores and bioverified smoking abstinence in a meta-analysis of two randomized, placebo-controlled smoking cessation trials. PRSs of smoking behaviors were created using the GWAS and Sequencing Consortium of Alcohol and Nicotine use (GSCAN) consortium summary statistics. We evaluated the utility of using individual PRS of specific smoking behavior versus a combined genetic risk that combines PRS of all four smoking behaviors. Study participants came from the Transdisciplinary Tobacco Use Research Centers (TTURCs) Study (1091 smokers of European descent), and the Genetically Informed Smoking Cessation Trial (GISC) Study (501 smokers of European descent). RESULTS: PRS of later age of smoking initiation (OR [95% CI]: 1.20, [1.04-1.37], p = .0097) was significantly associated with bioverified smoking abstinence at end of treatment. In addition, the combined PRS of smoking behaviors also significantly predicted bioverified smoking abstinence (OR [95% CI] 0.71 [0.51-0.99], p = .045). CONCLUSIONS: PRS of later age at smoking initiation may be useful in predicting smoking cessation at the end of treatment. A combined PRS may be a useful predictor for smoking abstinence by capturing the genetic propensity for multiple smoking behaviors. IMPLICATIONS: There is a potential for polygenic risk scores to inform future clinical medicine, and a great need for evidence on whether these scores predict clinically meaningful outcomes. Our meta-analysis provides early evidence for potential utility of using polygenic risk scores to predict smoking cessation amongst smokers undergoing quit attempts, informing further work to optimize the use of polygenic risk scores in clinical care.
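As background on the polygenic risk scores discussed in this abstract, the following is a minimal sketch and not the trials' actual pipeline: a PRS is commonly computed as the sum of a person's effect-allele dosages, each weighted by a per-variant effect size taken from GWAS summary statistics. The variant IDs, weights, and dosages below are hypothetical placeholders.

```python
from typing import Dict


def polygenic_risk_score(dosages: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of effect-allele dosages (0, 1, or 2) over variants present in both inputs."""
    shared = dosages.keys() & weights.keys()
    return sum(weights[v] * dosages[v] for v in shared)


# Hypothetical per-variant effect sizes (for example, log odds ratios from summary statistics).
weights = {"rsA": 0.10, "rsB": -0.05, "rsC": 0.20}
# Hypothetical effect-allele dosages for one participant.
dosages = {"rsA": 2, "rsB": 1, "rsC": 0}

score = polygenic_risk_score(dosages, weights)
print(round(score, 3))
```

In practice, scores are usually standardized within the study sample before being entered into a regression model, so that an odds ratio can be read per standard deviation of the score.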
Affiliation(s)
- Timothy B Baker
  - Department of Medicine, School of Medicine and Public Health, Center for Tobacco Research and Intervention, University of Wisconsin, Madison, WI, USA
- Douglas Jorenby
  - Department of Medicine, School of Medicine and Public Health, Center for Tobacco Research and Intervention, University of Wisconsin, Madison, WI, USA
- Robert M Carney
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Louis Fox
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Giang Pham
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Faith Stoneking
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Nina Smock
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
  - The Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, USA
- Christopher I Amos
  - Department of Medicine, Baylor College of Medicine, Institute for Clinical and Translational Research, Houston, TX, USA
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Hanover, NH, USA
- Laura Bierut
  - Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
  - The Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO, USA
- Li-Shiun Chen
  - Corresponding Author: Li-Shiun Chen, MD, MPH, ScD, Department of Psychiatry (Box 8134), Washington University School of Medicine, 660 S. Euclid Ave., St. Louis, MO 63110, USA. Telephone: 314-362-3932; Fax: 314-362-4247; E-mail:
12
Stover DA, Housman G, Stone AC, Rosenberg MS, Verrelli BC. Evolutionary Genetic Signatures of Selection on Bone-Related Variation within Human and Chimpanzee Populations. Genes (Basel) 2022; 13:genes13020183. [PMID: 35205228 PMCID: PMC8871609 DOI: 10.3390/genes13020183] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/21/2021] [Revised: 01/19/2022] [Accepted: 01/19/2022] [Indexed: 02/06/2023] Open
Abstract
Bone strength and the incidence and severity of skeletal disorders vary significantly among human populations, due in part to underlying genetic differentiation. While clinical models predict that this variation is largely deleterious, natural population variation unrelated to disease can go unnoticed, altering our perception of how natural selection has shaped bone morphologies over deep and recent time periods. Here, we conduct the first comparative population-based genetic analysis of the main bone structural protein gene, collagen type I α 1 (COL1A1), in clinical and 1000 Genomes Project datasets in humans, and in natural populations of chimpanzees. Contrary to predictions from clinical studies, we reveal abundant COL1A1 amino acid variation, predicted to have little association with disease in the natural population. We also find signatures of positive selection associated with intron haplotype structure, linkage disequilibrium, and population differentiation in regions of known gene expression regulation in humans and chimpanzees. These results recall how recent and deep evolutionary regimes can be linked, in that bone morphology differences that developed among vertebrates over 450 million years of evolution are the result of positive selection on subtle type I collagen functional variation segregating within populations over time.
Affiliation(s)
- Daryn A. Stover
  - School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
  - Arizona State University at Lake Havasu, Lake Havasu, AZ 86403, USA
- Genevieve Housman
  - Section of Genetic Medicine, University of Chicago, Chicago, IL 60637, USA
- Anne C. Stone
  - School of Human Evolution and Social Change, Arizona State University, Tempe, AZ 85287, USA
- Michael S. Rosenberg
  - Center for Biological Data Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Brian C. Verrelli
  - School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA
  - Center for Biological Data Science, Virginia Commonwealth University, Richmond, VA 23284, USA
  - Correspondence:
13
Karim MR, Cochez M, Zappa A, Sahay R, Rebholz-Schuhmann D, Beyan O, Decker S. Convolutional Embedded Networks for Population Scale Clustering and Bio-Ancestry Inferencing. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:369-382. [PMID: 32750845 DOI: 10.1109/tcbb.2020.2994649] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
The study of genetic variants (GVs) can help find correlated population groups, identify cohorts that are predisposed to common diseases, and explain differences in disease susceptibility and in how patients react to drugs. Machine learning techniques are increasingly being applied to identify interacting GVs to understand their complex phenotypic traits. Since the performance of a learning algorithm depends not only on the size and nature of the data but also on the quality of the underlying representation, deep neural networks (DNNs) can learn non-linear mappings that transform GV data into representations that are more clustering- and classification-friendly than those obtained by manual feature selection. In this paper, we propose convolutional embedded networks (CEN), in which we combine two DNN architectures, convolutional embedded clustering (CEC) and a convolutional autoencoder (CAE) classifier, for clustering individuals and predicting geographic ethnicity based on GVs, respectively. We applied CAE-based representation learning to 95 million GVs from the '1000 genomes' (covering 2,504 individuals from 26 ethnic origins) and 'Simons genome diversity' (covering 279 individuals from 130 ethnic origins) projects. Quantitative and qualitative analyses with a focus on accuracy and scalability show that our approach outperforms state-of-the-art approaches such as VariantSpark and ADMIXTURE. In particular, CEC can cluster targeted population groups in 22 hours with an adjusted Rand index (ARI) of 0.915, a normalized mutual information (NMI) of 0.92, and a clustering accuracy (ACC) of 89 percent. The CAE classifier, in turn, can predict the geographic ethnicity of unknown samples with an F1 score and Matthews correlation coefficient (MCC) of 0.9004 and 0.8245, respectively. Further, to provide interpretations of the predictions, we identify significant biomarkers using gradient boosted trees (GBT) and SHapley Additive exPlanations (SHAP). Overall, our approach is transparent and faster than the baseline methods, and scalable for 5 to 100 percent of the full human genome.
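To make the classification metrics quoted in this abstract concrete, here is a minimal sketch of how the F1 score and the Matthews correlation coefficient can be computed from confusion-matrix counts. It covers only the binary case for illustration; the paper's task is multiclass, and the counts below are hypothetical rather than taken from the study.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in terms of raw counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary MCC; returns 0.0 when any marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion-matrix counts for a two-class problem.
tp, tn, fp, fn = 90, 85, 15, 10

f1 = f1_score(tp, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"F1 = {f1:.4f}, MCC = {mcc:.4f}")
```

Unlike F1, the MCC also accounts for true negatives, which is one reason the two scores can differ noticeably on the same predictions.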
14
Maceda I, Lao O. Analysis of the Batch Effect Due to Sequencing Center in Population Statistics Quantifying Rare Events in the 1000 Genomes Project. Genes (Basel) 2021; 13:genes13010044. [PMID: 35052384 PMCID: PMC8775088 DOI: 10.3390/genes13010044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2021] [Revised: 12/19/2021] [Accepted: 12/21/2021] [Indexed: 12/01/2022] Open
Abstract
The 1000 Genomes Project (1000G) is one of the most popular whole genome sequencing datasets used in different genomics fields and has boosted our knowledge in medical and population genomics, among other fields. Recent studies have reported the presence of ghost mutation signals in the 1000G. Furthermore, studies have shown that these mutations can influence the outcomes of follow-up studies based on the genetic variation of 1000G, such as single nucleotide variant (SNV) imputation. While the overall effect of these ghost mutations can be considered negligible for common genetic variants in many populations, the potential bias remains unclear when studying low frequency genetic variants in the population. In this study, we analyze the effect of the sequencing center on predicted loss of function (LoF) alleles, the number of singletons, and the patterns of archaic introgression in the 1000G. Our results support previous studies showing that the sequencing center is associated with LoF and singletons independent of the population that is considered. Furthermore, we observed that patterns of archaic introgression were distorted for some populations depending on the sequencing center. When analyzing the frequency of SNPs showing extreme patterns of genotype differentiation among centers for CEU, YRI, CHB, and JPT, we observed that the magnitude of the sequencing batch effect was stronger at MAF < 0.2 and showed different profiles between CHB and the other populations. All these results suggest that data from 1000G must be interpreted with caution when considering statistics using variants at low frequency.
Affiliation(s)
- Iago Maceda
  - Population Genomics, CNAG-CRG, Centre for Genomic Regulation, 08028 Barcelona, Spain
  - Barcelona Institute of Science and Technology (BIST), 08036 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Oscar Lao
  - Population Genomics, CNAG-CRG, Centre for Genomic Regulation, 08028 Barcelona, Spain
  - Barcelona Institute of Science and Technology (BIST), 08036 Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
  - Correspondence:
15
Bashirova AA, Zheng W, Akdag M, Augusto DG, Vince N, Dong KL, O'hUigin C, Carrington M. Population-specific diversity of the immunoglobulin constant heavy G chain (IGHG) genes. Genes Immun 2021; 22:327-334. [PMID: 34864821 PMCID: PMC8674132 DOI: 10.1038/s41435-021-00156-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/01/2021] [Revised: 11/11/2021] [Accepted: 11/22/2021] [Indexed: 12/27/2022]
Abstract
Human immunoglobulin G (IgG) molecules, IgG1, IgG2 and IgG3, exhibit substantial inter-individual variation in their constant heavy chain regions, as discovered by serological methods. This polymorphism is encoded by the IGHG1, IGHG2, and IGHG3 genes and may influence antibody function. We sequenced the coding fragments of these genes in 95 European Americans, 94 African Americans, and 94 Black South Africans. Striking differences were observed between the population groups, including extremely low amino acid sequence variation in IGHG1 among South Africans, and higher IGHG2 and IGHG3 diversity in individuals of African descent compared to individuals of European descent. Molecular definition of the loci illustrates a greater level of allelic polymorphism than previously described, including the presence of common IGHG2 and IGHG3 variants that were indistinguishable serologically. Comparison of our data with the 1000 Genomes Project sequences indicates overall agreement between the datasets, although some inaccuracies in the 1000 Genomes Project are likely. These data represent the most comprehensive analysis of IGHG polymorphisms across major populations, which can now be applied to deciphering their functional impact.
Affiliation(s)
- Arman A Bashirova
  - Basic Science Program, Frederick National Laboratory for Cancer Research in the Laboratory of Integrative Cancer Immunology, National Cancer Institute, Bethesda, MD, USA
- Wanjing Zheng
  - The Laboratory of Integrative Cancer Immunology, National Cancer Institute, Bethesda, MD, USA
- Marjan Akdag
  - Basic Science Program, Frederick National Laboratory for Cancer Research in the Laboratory of Integrative Cancer Immunology, National Cancer Institute, Bethesda, MD, USA
- Danillo G Augusto
  - Programa de Pós-Graduação em Genética, Universidade Federal do Paraná, Curitiba, Brazil
  - Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Nicolas Vince
  - Université de Nantes, CHU Nantes, Inserm, Centre de Recherche en Transplantation et Immunologie, UMR 1064, ITUN, F-44000, Nantes, France
- Krista L Dong
  - Females Rising through Education, Support, and Health, Durban, KwaZulu-Natal, South Africa
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA, USA
- Colm O'hUigin
  - Basic Science Program, Frederick National Laboratory for Cancer Research in the Laboratory of Integrative Cancer Immunology, National Cancer Institute, Bethesda, MD, USA
- Mary Carrington
  - Basic Science Program, Frederick National Laboratory for Cancer Research in the Laboratory of Integrative Cancer Immunology, National Cancer Institute, Bethesda, MD, USA
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, MA, USA
16
Farrer RA. HaplotypeTools: a toolkit for accurately identifying recombination and recombinant genotypes. BMC Bioinformatics 2021; 22:560. [PMID: 34809571 PMCID: PMC8607637 DOI: 10.1186/s12859-021-04473-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2021] [Accepted: 11/10/2021] [Indexed: 11/17/2022] Open
Abstract
Background: Identifying haplotypes is central to sequence analysis in diploid or polyploid genomes. Despite this, there remains a lack of research and tools designed for physical phasing and its downstream analysis. Results: HaplotypeTools is a new toolset to phase variant sites using VCF and BAM files and to analyse phased VCFs. Phasing is achieved via the identification of reads overlapping ≥ 2 heterozygous positions and then extended by additional reads, a process that can be parallelized across a computer cluster. HaplotypeTools includes various utility scripts for downstream analysis including crossover detection and phylogenetic placement of haplotypes to other lineages or species. HaplotypeTools was assessed for accuracy against WhatsHap using simulated short and long reads, demonstrating higher accuracy, albeit with reduced haplotype length. HaplotypeTools was also tested on real Illumina data to determine the ancestry of hybrid fungal isolate Batrachochytrium dendrobatidis (Bd) SA-EC3, finding 80% of haplotypes across the genome phylogenetically cluster with parental lineages BdGPL (39%) and BdCAPE (41%), indicating those are the parental lineages. Finally, ~99% of phasing was conserved between overlapping phase groups between SA-EC3 and either parental lineage, indicating mitotic gene conversion/parasexuality as the mechanism of recombination for this hybrid isolate. HaplotypeTools is open source and freely available from https://github.com/rhysf/HaplotypeTools under the MIT License. Conclusions: HaplotypeTools is a powerful resource for analyzing hybrid or recombinant diploid or polyploid genomes and identifying parental ancestry for sub-genomic regions.
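The read-based phasing idea described in this abstract can be illustrated with a toy sketch. This is not HaplotypeTools' implementation and uses none of its code or file formats; it only shows, under simplifying assumptions, how reads that span two or more heterozygous positions link those positions into phase groups. Reads are represented as hypothetical dictionaries mapping a heterozygous position to the observed allele.

```python
from typing import Dict, List, Set


def phase_groups(reads: List[Dict[int, str]]) -> List[Set[int]]:
    """Group heterozygous positions that are connected by reads spanning >= 2 of them."""
    parent: Dict[int, int] = {}

    def find(x: int) -> int:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for read in reads:
        sites = sorted(read)
        if len(sites) < 2:
            continue  # a read over a single heterozygous site carries no phase information
        for s in sites:
            parent.setdefault(s, s)
        for a, b in zip(sites, sites[1:]):
            parent[find(b)] = find(a)  # link consecutive sites seen on the same read

    grouped: Dict[int, Set[int]] = {}
    for s in parent:
        grouped.setdefault(find(s), set()).add(s)
    return sorted(grouped.values(), key=min)


# Hypothetical reads: position -> allele observed at that heterozygous site.
reads = [
    {100: "A", 250: "T"},
    {250: "T", 400: "G"},   # extends the first group through the shared site 250
    {900: "C"},             # uninformative: covers only one heterozygous site
    {1200: "G", 1500: "A"},
]

groups = phase_groups(reads)
print(groups)
```

A real phaser would additionally track which allele sits on which haplotype and resolve conflicting reads; the sketch only recovers which positions can be phased together.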
Affiliation(s)
- Rhys A Farrer
  - Medical Research Council Centre for Medical Mycology at the University of Exeter, Exeter, UK
17
Guo S, Jin Y, Zhou J, Zhu Q, Jiang T, Bian Y, Zhang R, Chang C, Xu L, Shen J, Zheng X, Shen Y, Qin Y, Chen J, Tang X, Cheng P, Ding Q, Zhang Y, Liu J, Cheng Q, Guo M, Liu Z, Qiu W, Qian Y, Sun Y, Shen Y, Nie H, Schrodi SJ, He D. MicroRNA Variants and HLA-miRNA Interactions are Novel Rheumatoid Arthritis Susceptibility Factors. Front Genet 2021; 12:747274. [PMID: 34777472 PMCID: PMC8585984 DOI: 10.3389/fgene.2021.747274] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 07/26/2021] [Accepted: 10/11/2021] [Indexed: 12/25/2022] Open
Abstract
Genome-wide association studies have identified >100 genetic risk factors for rheumatoid arthritis. However, the reported genetic variants explain less than 40% of the heritability of rheumatoid arthritis. The majority of the heritability is still missing and needs to be identified through further studies using different approaches and populations. In order to identify novel functional SNPs that explain missing heritability and to reveal novel mechanisms in the pathogenesis of rheumatoid arthritis, 4 HLA SNPs (HLA-DRB1, HLA-DRB9, HLA-DQB1, and TNFAIP3) and 225 common SNPs located in miRNA, which might influence the miRNA target binding or pre-miRNA stability, were genotyped in 1,607 rheumatoid arthritis and 1,580 matched normal individuals. We identified 2 novel SNPs as significantly associated with rheumatoid arthritis, including rs1414273 (miR-548ac, OR = 0.84, p = 8.26 × 10⁻⁴) and rs2620381 (miR-627, OR = 0.77, p = 2.55 × 10⁻³). We also identified that rs5997893 (miR-3928) showed a significant epistasis effect with rs4947332 (HLA-DRB1, OR = 4.23, p = 0.04) and rs2967897 (miR-5695) with rs7752903 (TNFAIP3, OR = 4.43, p = 0.03). In addition, we found that individuals who carried 8 risk alleles showed 15.38 (95%CI: 4.69-50.49, p < 1.0 × 10⁻⁶) times more risk of being affected by RA. Finally, we demonstrated that the targets of the significant miRNAs showed enrichment in immune related genes (p = 2.0 × 10⁻⁵) and FDA approved drug target genes (p = 0.014). Overall, 6 novel miRNA SNPs including rs1414273 (miR-548ac, p = 8.26 × 10⁻⁴), rs2620381 (miR-627, p = 2.55 × 10⁻³), rs4285314 (miR-3135b, p = 1.10 × 10⁻¹³), rs28477407 (miR-4308, p = 3.44 × 10⁻⁵), rs5997893 (miR-3928, p = 5.9 × 10⁻³) and rs45596840 (miR-4482, p = 6.6 × 10⁻³) were confirmed to be significantly associated with RA in a Chinese population. Our study suggests that miRNAs might be interesting targets to accelerate understanding of the pathogenesis and drug development for rheumatoid arthritis.
Affiliation(s)
- Shicheng Guo
  - Department of Medical Genetics, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States
- Yehua Jin
  - Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Jieru Zhou
  - Department of Health Management, Shanghai East Hospital, Tongji University School of Medicine, Shanghai, China
- Qi Zhu
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Ting Jiang
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Yanqin Bian
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Runrun Zhang
  - Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Cen Chang
  - Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Lingxia Xu
  - Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Jie Shen
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Xinchun Zheng
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Yi Shen
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Yingying Qin
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Jihong Chen
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Xiaorong Tang
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Peng Cheng
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Qin Ding
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Yuanyuan Zhang
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Jia Liu
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Qingqing Cheng
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Mengru Guo
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Zhaoyi Liu
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Weifang Qiu
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Yi Qian
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Yang Sun
  - Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Yu Shen
  - Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
- Hong Nie
  - Shanghai Institute of Immunology, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Steven J Schrodi
  - Department of Medical Genetics, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI, United States
- Dongyi He
  - Department of Rheumatology, Guanghua Hospital Affiliated to Shanghai University of Traditional Chinese Medicine, Shanghai, China
  - Institute of Arthritis Research in Integrative Medicine, Shanghai Academy of Traditional Chinese Medicine, Shanghai, China
18
Singh R, Singh PK, Kumar R, Kabir MT, Kamal MA, Rauf A, Albadrani GM, Sayed AA, Mousa SA, Abdel-Daim MM, Uddin MS. Multi-Omics Approach in the Identification of Potential Therapeutic Biomolecule for COVID-19. Front Pharmacol 2021; 12:652335. [PMID: 34054532 PMCID: PMC8149611 DOI: 10.3389/fphar.2021.652335] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/12/2021] [Accepted: 04/21/2021] [Indexed: 02/05/2023] Open
Abstract
COVID-19 is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It has had a disastrous effect on mankind because of its contagiousness and rapid spread. Although vaccines for SARS-CoV-2 have been successfully developed, proven, effective, and specific therapeutic molecules for treatment are yet to be identified. The repurposing of existing drugs and the search for new medicines are continuously in progress. Efforts are being made to single out novel plant-based therapeutic compounds, and some of these biomolecules are in their testing phase. During these efforts, whole-genome sequencing of SARS-CoV-2 has given direction to the exploration of omics systems and approaches for overcoming this unprecedented global health challenge. Genome, proteome, and metagenome sequence analyses have helped characterize the virus, thereby assisting in understanding its molecular mechanisms, structure, and disease propagation. Multi-omics approaches offer various tools and strategies for identifying potential therapeutic biomolecules for COVID-19 and for exploring plants that produce biomolecules usable as biopharmaceutical products. This review explores the available multi-omics approaches and their scope for investigating the therapeutic promise of plant-based biomolecules in treating SARS-CoV-2 infection.
Affiliation(s)
- Rachana Singh
- Amity Institute of Biotechnology, Amity University Uttar Pradesh, Lucknow Campus, Lucknow, India
| | - Pradhyumna Kumar Singh
- Plant Molecular Biology and Biotechnology Division, Council of Scientific and Industrial Research- National Botanical Research Institute (CSIR-NBRI), Lucknow, India
| | - Rajnish Kumar
- Amity Institute of Biotechnology, Amity University Uttar Pradesh, Lucknow Campus, Lucknow, India
| | | | - Mohammad Amjad Kamal
- West China School of Nursing/Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Enzymoics, Novel Global Community Educational Foundation, Hebersham, NSW, Australia
| | - Abdur Rauf
- Department of Chemistry, University of Swabi, Khyber Pakhtunkhwa, Pakistan
| | - Ghadeer M. Albadrani
- Department of Biology, College of Science, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
| | - Amany A. Sayed
- Zoology Department, Faculty of Science, Cairo University, Giza, Egypt
| | - Shaker A. Mousa
- Pharmaceutical Research Institute, Albany College of Pharmacy and Health Sciences, Rensselaer, NY, United States
| | - Mohamed M. Abdel-Daim
- Pharmacology Department, Faculty of Veterinary Medicine, Suez Canal University, Ismailia, Egypt
| | - Md. Sahab Uddin
- Department of Pharmacy, Southeast University, Dhaka, Bangladesh
- Pharmakon Neuroscience Research Network, Dhaka, Bangladesh
| |
|
19
|
Monteiro B, Arenas M, Prata MJ, Amorim A. Evolutionary dynamics of the human pseudoautosomal regions. PLoS Genet 2021; 17:e1009532. [PMID: 33872316; PMCID: PMC8084340; DOI: 10.1371/journal.pgen.1009532] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/17/2020] [Revised: 04/29/2021] [Accepted: 04/06/2021] [Indexed: 01/19/2023]
Abstract
Recombination between the human X and Y sex chromosomes is limited to the two pseudoautosomal regions (PARs), which have quite distinct evolutionary origins. Despite their crucial importance for male meiosis, the genetic diversity patterns and evolutionary dynamics of these regions are poorly understood. In the present study, we analyzed and compared the genetic diversity of the PARs using publicly available genomic sequences encompassing both PAR1 and PAR2. Comparisons were performed through allele diversities, linkage disequilibrium status and recombination frequencies within and between X and Y chromosomes. In agreement with previous studies, we confirmed the role of PAR1 as a male-specific recombination hotspot, but also observed similar characteristic patterns of diversity in both regions, although male recombination occurs at PAR2 to a much lower extent (at least one recombination event at PAR1, versus recombination in ≈1% of normal male meioses at PAR2). Furthermore, we demonstrate that both PARs harbor significantly different allele frequencies between X and Y chromosomes, which could indicate that recombination is not sufficient to homogenize the pseudoautosomal gene pool or is counterbalanced by other evolutionary forces. Nevertheless, the observed patterns of diversity are not entirely explainable by sexually antagonistic selection. A better understanding of such processes requires new data from intergenerational transmission studies of PARs, which would be decisive for elucidating the evolution of PARs and their role in male-driven heterosomal aneuploidies.
Affiliation(s)
- Bruno Monteiro
- Institute of Investigation and Innovation in Health (i3S). University of Porto, Porto, Portugal
- Institute of Molecular Pathology and Immunology (IPATIMUP), University of Porto, Porto, Portugal
| | - Miguel Arenas
- Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain
- CINBIO (Biomedical Research Centre), University of Vigo, Vigo, Spain
| | - Maria João Prata
- Institute of Investigation and Innovation in Health (i3S). University of Porto, Porto, Portugal
- Institute of Molecular Pathology and Immunology (IPATIMUP), University of Porto, Porto, Portugal
- Faculty of Sciences, University of Porto, Porto, Portugal
| | - António Amorim
- Institute of Investigation and Innovation in Health (i3S). University of Porto, Porto, Portugal
- Institute of Molecular Pathology and Immunology (IPATIMUP), University of Porto, Porto, Portugal
- Faculty of Sciences, University of Porto, Porto, Portugal
| |
|
20
|
Wei CY, Yang JH, Yeh EC, Tsai MF, Kao HJ, Lo CZ, Chang LP, Lin WJ, Hsieh FJ, Belsare S, Bhaskar A, Su MW, Lee TC, Lin YL, Liu FT, Shen CY, Li LH, Chen CH, Wall JD, Wu JY, Kwok PY. Genetic profiles of 103,106 individuals in the Taiwan Biobank provide insights into the health and history of Han Chinese. NPJ Genom Med 2021; 6:10. [PMID: 33574314; PMCID: PMC7878858; DOI: 10.1038/s41525-021-00178-9] [Citation(s) in RCA: 92] [Impact Index Per Article: 30.7] [Received: 09/22/2020] [Accepted: 01/06/2021] [Indexed: 02/06/2023]
Abstract
Personalized medical care focuses on prediction of disease risk and response to medications. To build the risk models, access to both large-scale genomic resources and human genetic studies is required. The Taiwan Biobank (TWB) has generated high-coverage, whole-genome sequencing data from 1492 individuals and genome-wide SNP data from 103,106 individuals of Han Chinese ancestry using custom SNP arrays. Principal components analysis of the genotyping data showed that the full range of Han Chinese genetic variation was found in the cohort. The arrays also include thousands of known functional variants, allowing for simultaneous ascertainment of Mendelian disease-causing mutations and variants that affect drug metabolism. We found that 21.2% of the population are mutation carriers of autosomal recessive diseases, 3.1% have mutations in cancer-predisposing genes, and 87.3% carry variants that affect drug response. We highlight how TWB data provide insight into both population history and disease burden, while showing how widespread genetic testing can be used to improve clinical care.
Affiliation(s)
- Chun-Yu Wei
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Jenn-Hwai Yang
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Erh-Chan Yeh
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Ming-Fang Tsai
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Hsiao-Jung Kao
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Chen-Zen Lo
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Lung-Pao Chang
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Wan-Jia Lin
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Feng-Jen Hsieh
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Saurabh Belsare
- Institute for Human Genetics, University of California, San Francisco, CA, USA
| | - Anand Bhaskar
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Ming-Wei Su
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Te-Chang Lee
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Yi-Ling Lin
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Fu-Tong Liu
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Chen-Yang Shen
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Ling-Hui Li
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Chien-Hsiun Chen
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Jeffrey D Wall
- Institute for Human Genetics, University of California, San Francisco, CA, USA
| | - Jer-Yuarn Wu
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Pui-Yan Kwok
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan.
- Institute for Human Genetics, University of California, San Francisco, CA, USA.
| |
|
21
|
Abstract
The expression of ABO antigens in human saliva is regulated by the FUT2 gene, which encodes a secretor-type α(1,2)fucosyltransferase. Secretors express ABO substrates in saliva and non-secretors do not. Secretor status is of interest, especially for susceptibility to various infectious diseases. A multitude of single nucleotide polymorphisms (SNPs) and copy number variations (CNVs) have been reported, and they show unique distributions among different populations. In this study, we selected 18 uncharacterized FUT2 alleles listed in the Erythrogene database and obtained genomic DNA carrying these alleles. We experimentally confirmed the haplotypes, but 10 of the 18 alleles disagreed with those in the database, which may be attributed to their low frequency. We then examined the activity of the encoded α(1,2)fucosyltransferase for 13 alleles by flow cytometry of H antigen expression. The impact of each nonsynonymous SNP on the enzyme was also estimated by software. We finally identified two non-secretor alleles (se610 and se357,856,863) and one weak secretor allele (se262,357), while in silico analysis predicted that many alleles impair the function. The present results suggest that correct haplotyping and functional assays are desirable for analysis of the FUT2 gene.
Affiliation(s)
- Mikiko Soejima
- Department of Forensic Medicine, Kurume University School of Medicine, Kurume, 830-0011, Japan
| | - Yoshiro Koda
- Department of Forensic Medicine, Kurume University School of Medicine, Kurume, 830-0011, Japan.
| |
|
22
|
Gao Y, Yang Z, Yang W, Yang Y, Gong J, Yang QY, Niu X. Plant-ImputeDB: an integrated multiple plant reference panel database for genotype imputation. Nucleic Acids Res 2021; 49:D1480-D1488. [PMID: 33137192; PMCID: PMC7779032; DOI: 10.1093/nar/gkaa953] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/04/2020] [Revised: 09/23/2020] [Accepted: 10/08/2020] [Indexed: 12/21/2022]
Abstract
Genotype imputation is a process that estimates missing genotypes from the haplotypes and genotypes in a reference panel. It can effectively increase the density of single nucleotide polymorphisms (SNPs), boost the power to identify genetic associations and promote the combination of genetic studies. However, there has been a lack of high-quality reference panels for most plants, which greatly hinders the application of genotype imputation. Here, we developed Plant-ImputeDB (http://gong_lab.hzau.edu.cn/Plant_imputeDB/), a comprehensive database with reference panels of 12 plant species for online genotype imputation, SNP and block search and free download. By integrating genotype data and whole-genome resequencing data of plants from various studies and databases, the current Plant-ImputeDB provides high-quality reference panels of 12 plant species, including ∼69.9 million SNPs from 34 244 samples. It also provides an easy-to-use online tool with the option of two popular tools specifically designed for genotype imputation. In addition, Plant-ImputeDB accepts submissions of different types of genomic variations, and provides free and open access to all publicly available data in support of related research worldwide. In general, Plant-ImputeDB may serve as an important resource for plant genotype imputation and greatly facilitate plant genetic research.
Affiliation(s)
- Yingjie Gao
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
| | - Zhiquan Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
| | - Wenqian Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
| | - Yanbo Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
| | - Jing Gong
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China; College of Biomedicine and Health, Huazhong Agricultural University, Wuhan 430070, P.R. China
| | - Qing-Yong Yang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China; College of Agriculture, Shihezi University, Xinjiang 832003, P.R. China
| | - Xiaohui Niu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
| |
|
24
|
Al Bkhetan Z, Chana G, Ramamohanarao K, Verspoor K, Goudey B. Evaluation of consensus strategies for haplotype phasing. Brief Bioinform 2020; 22:5998997. [PMID: 33236761; DOI: 10.1093/bib/bbaa280] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/13/2020] [Revised: 09/22/2020] [Accepted: 09/22/2020] [Indexed: 01/05/2023]
Abstract
Haplotype phasing is a critical step for many genetic applications, but incorrect estimates of phase can negatively impact downstream analyses. One proposed strategy to improve phasing accuracy is to combine multiple independent phasing estimates to overcome the limitations of any individual estimate. However, such a strategy is yet to be thoroughly explored. This study provides a comprehensive evaluation of consensus strategies for haplotype phasing. We explore the performance of different consensus paradigms, and the effect of specific constituent tools, across several datasets with different characteristics and their impact on the downstream task of genotype imputation. Based on the outputs of existing phasing tools, we explore two different strategies to construct haplotype consensus estimators: voting across outputs from multiple phasing tools and voting across multiple outputs of a single non-deterministic tool. We find that the consensus approach from multiple tools reduces switch error (SE) by an average of 10% compared to any constituent tool when applied to European populations and has the highest accuracy regardless of population ethnicity, sample size, variant density or variant frequency. Furthermore, the consensus estimator improves the accuracy of the downstream task of genotype imputation carried out by the widely used Minimac3, pbwt and BEAGLE5 tools. Our results provide guidance on how to produce the most accurate phasing estimates and the trade-offs that a consensus approach may have. Our implementation of consensus haplotype phasing, consHap, is available freely at https://github.com/ziadbkh/consHap. Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
Affiliation(s)
- Ziad Al Bkhetan
- School of Computing and Information Systems at the University of Melbourne
| | | | | | - Karin Verspoor
- School of Computing and Information Systems at the University of Melbourne
| | - Benjamin Goudey
- IBM Research Australia and an Honorary Research Fellow at the School of Computing and Information Systems, University of Melbourne
| |
|
25
|
Donato L, Scimone C, Alibrandi S, Pitruzzella A, Scalia F, D'Angelo R, Sidoti A. Possible A2E Mutagenic Effects on RPE Mitochondrial DNA from Innovative RNA-Seq Bioinformatics Pipeline. Antioxidants (Basel) 2020; 9:E1158. [PMID: 33233726; DOI: 10.3390/antiox9111158] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 09/28/2020] [Revised: 11/12/2020] [Accepted: 11/18/2020] [Indexed: 01/10/2023]
Abstract
Mitochondria are subject to continuous oxidative stress stimuli that, over time, can impair their genome and lead to several pathologies, such as retinal degenerations. Our main purpose was to identify mtDNA variants that might be induced by intense oxidative stress caused by N-retinylidene-N-retinylethanolamine (A2E), together with the molecular pathways involving the genes carrying them, possibly linked to retinal degeneration. We performed a variant analysis comparison between transcriptome profiles of human retinal pigment epithelial (RPE) cells exposed to A2E and untreated ones, hypothesizing that A2E might act as a mutagenic compound towards mtDNA. To optimize the analysis, we proposed an integrated approach involving the complementary use of the most recent algorithms applied to mtDNA data, characterized by a mixed output coming from several tools and databases. An increased number of variants emerged following treatment. Variants mainly occurred within mtDNA coding sequences, corresponding to either the polypeptide-encoding genes or the RNA genes. Time-dependent impairments indicated the involvement of all oxidative phosphorylation complexes, suggesting serious damage to adenosine triphosphate (ATP) biosynthesis that can result in cell death. The obtained results could be incorporated into clinical diagnostic settings, as they are hypothesized to modulate the phenotypic expression of mtDNA pathogenic variants, drastically improving the field of precision molecular medicine.
|
26
|
Zeng Q, Leach NT, Zhou Z, Zhu H, Smith JA, Rosenblum LS, Kenyon A, Heim RA, Eisenberg M, Letovsky S, Okamoto PM. A customized scaffolds approach for the detection and phasing of complex variants by next-generation sequencing. Sci Rep 2020; 10:15060. [PMID: 32929119; DOI: 10.1038/s41598-020-71471-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/04/2020] [Accepted: 08/13/2020] [Indexed: 02/06/2023]
Abstract
Next-generation sequencing (NGS) is widely used in genetic testing for the highly sensitive detection of single nucleotide changes and small insertions or deletions. However, detection and phasing of structural variants, especially in repetitive or homologous regions, can be problematic due to uneven read coverage or genome reference bias, resulting in false calls. To circumvent this challenge, a computational approach utilizing customized scaffolds as supplementary reference sequences for read alignment was developed, and its effectiveness demonstrated with two CBS gene variants: NM_000071.2:c.833T>C and NM_000071.2:c.[833T>C; 844_845ins68]. Variant c.833T>C is a known causative mutation for homocystinuria, but is not pathogenic when in cis with the insertion, c.844_845ins68, because of alternative splicing. Using simulated reads, the custom scaffolds method resolved all possible combinations with 100% accuracy and, based on > 60,000 clinical specimens, exceeded the performance of current approaches that only align reads to GRCh37/hg19 for the detection of c.833T>C alone or in cis with c.844_845ins68. Furthermore, analysis of two 1000 Genomes Project trios revealed that the c.[833T>C; 844_845ins68] complex variant had previously been undetected in these datasets, likely due to the alignment method used. This approach can be configured for existing workflows to detect other challenging and potentially underrepresented variants, thereby augmenting accurate variant calling in clinical NGS testing.
|
27
|
Srivastava K, Khil PP, Sippert E, Volkova E, Dekker JP, Rios M, Flegel WA. ACKR1 Alleles at 5.6 kb in a Well-Characterized Renewable US Food and Drug Administration (FDA) Reference Panel for Standardization of Blood Group Genotyping. J Mol Diagn 2020; 22:1272-1279. [PMID: 32688055; DOI: 10.1016/j.jmoldx.2020.06.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/09/2020] [Revised: 06/17/2020] [Accepted: 06/26/2020] [Indexed: 12/18/2022]
Abstract
The glycoprotein encoded by the ACKR1 gene expresses the Duffy blood group antigens and is a receptor for malaria parasites. We recently described 18 long-range ACKR1 alleles in an autochthonous population of a malaria-endemic region. Extending this work, we sequenced the gene in a 53-sample repository established by the US Food and Drug Administration (FDA) as reference reagents for blood group genotyping. The FDA samples have been characterized for 19 genes; however, long-range haplotype information for these genes, including ACKR1, was lacking. We used a hybrid approach, novel for this type of gene, to characterize ACKR1 by combining two next-generation sequencing technologies: short-read massively parallel sequencing and long-read nanopore sequencing. The expedient integration of data from both next-generation sequencing systems was necessary and sufficient to allow accurate determination of all 25 long-range ACKR1 alleles found in the 53 samples. All 25 alleles identified in our current FDA cohort were novel and, unexpectedly, none had been observed among the 18 alleles in our previous study. The alleles will be useful for validation, calibration, and proficiency testing of red cell genotyping. The lack of any overlap between the ACKR1 alleles in the two studies documents differences in mutation rate and recombination frequency among populations. The exact haplotypes and their interethnic or interpopulation dissimilarities can influence disease susceptibility and therapy.
Affiliation(s)
- Kshitij Srivastava
- Department of Transfusion Medicine, NIH Clinical Center, NIH, Bethesda, Maryland
| | - Pavel P Khil
- Laboratory Medicine, NIH Clinical Center, NIH, Bethesda, Maryland
| | - Emilia Sippert
- Office of Blood Research and Review, Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland
| | - Evgeniya Volkova
- Office of Blood Research and Review, Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland
| | - John P Dekker
- Laboratory of Clinical Immunology and Microbiology, National Institute of Allergy and Infectious Diseases, Bethesda, Maryland
| | - Maria Rios
- Office of Blood Research and Review, Center for Biologics Evaluation and Research, US Food and Drug Administration, Silver Spring, Maryland
| | - Willy A Flegel
- Department of Transfusion Medicine, NIH Clinical Center, NIH, Bethesda, Maryland.
| |
|
28
|
Albers PK, McVean G. Dating genomic variants and shared ancestry in population-scale sequencing data. PLoS Biol 2020; 18:e3000586. [PMID: 31951611; PMCID: PMC6992231; DOI: 10.1371/journal.pbio.3000586] [Citation(s) in RCA: 78] [Impact Index Per Article: 19.5] [Received: 07/16/2019] [Revised: 01/30/2020] [Accepted: 01/02/2020] [Indexed: 12/31/2022]
Abstract
The origin and fate of new mutations within species is the fundamental process underlying evolution. However, while much attention has been focused on characterizing the presence, frequency, and phenotypic impact of genetic variation, the evolutionary histories of most variants are largely unexplored. We have developed a nonparametric approach for estimating the date of origin of genetic variants in large-scale sequencing data sets. The accuracy and robustness of the approach are demonstrated through simulation. Using data from two publicly available human genomic diversity resources, we estimated the age of more than 45 million single-nucleotide polymorphisms (SNPs) in the human genome and released the Atlas of Variant Age as a public online database. We characterize the relationship between variant age and frequency in different geographical regions and demonstrate the value of age information in interpreting variants of functional and selective importance. Finally, we use allele age estimates to power a rapid approach for inferring the ancestry shared between individual genomes and to quantify genealogical relationships at different points in the past, as well as to describe and explore the evolutionary history of modern human populations.
Affiliation(s)
- Patrick K. Albers
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
| | - Gil McVean
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, United Kingdom
| |
|