1
El Fahime E, Kartti S, Chemao-Elfihri MW, Festali R, Hakmi M, Ibrahimi A, Boutayeb S, Belyamani L. Moroccan genome project: genomic insight into a North African population. Commun Biol 2025; 8:584. [PMID: 40204857 PMCID: PMC11982406 DOI: 10.1038/s42003-025-08020-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Accepted: 03/31/2025] [Indexed: 04/11/2025]
Abstract
Africa's 1.5 billion people are underrepresented in genomic databases. The African Genome Variation Project focuses exclusively on Sub-Saharan populations, making Morocco, located in North Africa, a valuable site for studying genetic diversity. Understanding genetic variation and customizing therapy require population-specific reference genomes. This study presents Phase 1 results from the Moroccan Genome Project (MGP), which sequenced 109 Moroccan genomes. We report over 27 million variants, including 1.4 million novel ones, of which 15,378 are highly prevalent in the Moroccan population. Furthermore, we propose a Moroccan Major Allele Reference Genome (MMARG), generated using high-coverage consensus sequences from the 109 whole genomes. This MMARG represents Moroccan genetic variation more accurately than GRCh38. This baseline study also generates an informative genetic variation database that supports regional population-specific initiatives and precision medicine in Morocco and North Africa. The results stress the necessity of population-relevant data in human genetic research.
Affiliation(s)
- Elmostafa El Fahime
- Mohammed VI University of Sciences and Health (UM6SS), Casablanca, 20370, Morocco.
- Mohammed VI Center for Research and Innovation (CM6RI), Rabat, 10100, Morocco.
- Souad Kartti
- Mohammed VI University of Sciences and Health (UM6SS), Casablanca, 20370, Morocco
- Mohammed VI Center for Research and Innovation (CM6RI), Rabat, 10100, Morocco
- Mohammed Walid Chemao-Elfihri
- Mohammed VI University of Sciences and Health (UM6SS), Casablanca, 20370, Morocco
- Mohammed VI Center for Research and Innovation (CM6RI), Rabat, 10100, Morocco
- Rihab Festali
- Mohammed VI University of Sciences and Health (UM6SS), Casablanca, 20370, Morocco
- Mohammed VI Center for Research and Innovation (CM6RI), Rabat, 10100, Morocco
- Mohammed Hakmi
- Mohammed VI University of Sciences and Health (UM6SS), Casablanca, 20370, Morocco
- Mohammed VI Center for Research and Innovation (CM6RI), Rabat, 10100, Morocco
- Saber Boutayeb
- Mohammed VI University of Sciences and Health (UM6SS), Casablanca, 20370, Morocco
- Mohammed VI Center for Research and Innovation (CM6RI), Rabat, 10100, Morocco
- Lahcen Belyamani
- Mohammed VI University of Sciences and Health (UM6SS), Casablanca, 20370, Morocco
- Mohammed VI Center for Research and Innovation (CM6RI), Rabat, 10100, Morocco
- Mohammed V University (UM5), Rabat, B.P:8007.N.U, Morocco
2
Kulmanov M, Tawfiq R, Liu Y, Al Ali H, Abdelhakim M, Alarawi M, Aldakhil H, Alhattab D, Alsolme EA, Althagafi A, Angelov A, Bougouffa S, Driguez P, Park C, Putra A, Reyes-Ramos AM, Hauser CAE, Cheung MS, Abedalthagafi MS, Hoehndorf R. A reference quality, fully annotated diploid genome from a Saudi individual. Sci Data 2024; 11:1278. [PMID: 39580486 PMCID: PMC11585617 DOI: 10.1038/s41597-024-04121-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Accepted: 11/11/2024] [Indexed: 11/25/2024]
Abstract
We have used multiple sequencing approaches to sequence the genome of a volunteer from Saudi Arabia. We use the resulting data to generate a de novo assembly of the genome, and use different computational approaches to refine the assembly. As a consequence, we provide a contiguous assembly of the complete genome of an individual from Saudi Arabia for all chromosomes except chromosome Y, and label this assembly KSA001. We transferred genome annotations from reference genomes to fully annotate KSA001, and we make all primary sequencing data, the assembly, and the genome annotations freely available in public databases using the FAIR data principles. KSA001 is the first telomere-to-telomere-assembled genome from a Saudi individual that is freely available for any purpose.
Affiliation(s)
- Maxat Kulmanov
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia
- KAUST Center of Excellence for Generative AI, King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia
- Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Rund Tawfiq
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia
- KAUST Center of Excellence for Generative AI, King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia
- Biological and Environmental Sciences & Engineering (BESE) Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Yang Liu
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia
- KAUST Center of Excellence for Generative AI, King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia
- Biological and Environmental Sciences & Engineering (BESE) Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Hatoon Al Ali
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Biological and Environmental Sciences & Engineering (BESE) Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Marwa Abdelhakim
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia
- Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Mohammed Alarawi
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Biological and Environmental Sciences & Engineering (BESE) Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Hind Aldakhil
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Dana Alhattab
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia
- Biological and Environmental Sciences & Engineering (BESE) Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Laboratory for Nanomedicine, Biological and Environmental Science & Engineering (BESE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Ebtehal A Alsolme
- Genomic and Precision Medicine Department, King Fahad Medical City, Riyadh, Saudi Arabia
- Azza Althagafi
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Computer Science Department, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
- Angel Angelov
- Core Labs, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, 23955, Thuwal, Makkah, Saudi Arabia
- Salim Bougouffa
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Patrick Driguez
- Core Labs, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, 23955, Thuwal, Makkah, Saudi Arabia
- Changsook Park
- Core Labs, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, 23955, Thuwal, Makkah, Saudi Arabia
- Alexander Putra
- Core Labs, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, 23955, Thuwal, Makkah, Saudi Arabia
- Ana M Reyes-Ramos
- Core Labs, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, 23955, Thuwal, Makkah, Saudi Arabia
- Charlotte A E Hauser
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Biological and Environmental Sciences & Engineering (BESE) Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Laboratory for Nanomedicine, Biological and Environmental Science & Engineering (BESE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Max Planck Institute for Biology of Ageing, Cologne, Germany
- Institute of Health Care Engineering with European Testing Center of Medical Devices, Graz University of Technology, Stremayrgasse 16/II, 8010, Graz, Austria
- Ming Sin Cheung
- Core Labs, King Abdullah University of Science and Technology (KAUST), 4700 KAUST, 23955, Thuwal, Makkah, Saudi Arabia
- Malak S Abedalthagafi
- Department of Pathology and Laboratory Medicine, Emory School of Medicine, Atlanta, GA, USA.
- King Salman Center for Disability Research, Riyadh, Saudi Arabia.
- Robert Hoehndorf
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.
- KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia.
- KAUST Center of Excellence for Generative AI, King Abdullah University of Science and Technology, 4700 KAUST, 23955, Thuwal, Saudi Arabia.
- Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.
3
Marzouka NAD, Alnaqbi H, Al-Aamri A, Tay G, Alsafar H. Investigating the genetic makeup of the major histocompatibility complex (MHC) in the United Arab Emirates population through next-generation sequencing. Sci Rep 2024; 14:3392. [PMID: 38337023 PMCID: PMC10858242 DOI: 10.1038/s41598-024-53986-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2023] [Accepted: 02/07/2024] [Indexed: 02/12/2024]
Abstract
Human leukocyte antigen (HLA) molecules are central to the immune response and are associated with the phenotypes of various diseases and with drug-induced toxicity. Further, the role of HLA molecules in presenting antigens significantly affects transplantation outcomes. The objective of this study was to examine the extent of the diversity of HLA alleles in the population of the United Arab Emirates (UAE) using Next-Generation Sequencing methodologies and encompassing a larger cohort of individuals. A cohort of 570 unrelated healthy citizens of the UAE volunteered to provide samples for Whole Genome Sequencing and Whole Exome Sequencing. HLA alleles were defined using the bioinformatics tools HLA-LA and xHLA. Subsequently, the findings from this study were compared with other local and international datasets. A broad range of HLA alleles in the UAE population, of which some were previously unreported, was identified. A comparison with other populations confirmed the current population's unique intertwined genetic heritage while highlighting similarities with populations from the Middle East region. Some disease-associated HLA alleles were detected at a frequency of >5%, such as HLA-B*51:01, HLA-DRB1*03:01, HLA-DRB1*15:01, and HLA-DQB1*02:01. An increase in allele homozygosity, especially for HLA class I genes, was identified in samples with a higher level of genome-wide homozygosity. This highlights a possible effect of consanguinity on HLA homozygosity. The HLA allele distribution in the UAE population showcases a unique profile, underscoring the need for tailored databases for traditional activities such as unrelated transplant matching and for newer initiatives in precision medicine based on specific populations. This research is part of a concerted effort to improve the knowledge base, particularly in the fields of transplant medicine and investigating disease associations as well as in understanding human migration patterns within the Arabian Peninsula and surrounding regions.
Affiliation(s)
- Nour Al Dain Marzouka
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Halima Alnaqbi
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Amira Al-Aamri
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Guan Tay
- Division of Psychiatry, Faculty of Health and Medical Sciences, Medical School, The University of Western Australia, Crawley, WA, Australia
- School of Medical and Health Sciences, Edith Cowan University, Joondalup, WA, Australia
- Habiba Alsafar
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates.
- College of Medicine and Health Sciences, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates.
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates.
4
Alqasrawi MN, Al-Mahayri ZN, Alblooshi H, Alsafar H, Ali BR. Utilizing Pharmacogenomic Data for a Safer Use of Statins among the Emirati Population. Curr Vasc Pharmacol 2024; 22:218-229. [PMID: 38284696 DOI: 10.2174/0115701611283841231227064343] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Revised: 12/09/2023] [Accepted: 12/12/2023] [Indexed: 01/30/2024]
Abstract
BACKGROUND Statins are the most prescribed lipid-lowering drugs worldwide. Despite their perceived safety, associated adverse events, especially muscle symptoms, have been frequently reported. Three pharmacogenes, the solute carrier organic anion transporter family member 1B1 (SLCO1B1), ATP-binding cassette subfamily G member 2 (ABCG2), and cytochrome P450 2C9 (CYP2C9), are suggested as safety biomarkers for statins. The Clinical Pharmacogenetics Implementation Consortium (CPIC) issued clinical guidelines for statin use based on these three genes. OBJECTIVES The present study aimed to examine variants in these pharmacogenes to predict the safety of statin use among the Emirati population. METHODS Analysis of 242 whole exome sequencing datasets at the three genes enabled determination of the frequencies of single nucleotide polymorphisms (SNPs) and annotation of the haplotypes and the predicted functions of their proteins. RESULTS In our cohort, 29.8% and 5.4% had decreased and poor SLCO1B1 function, respectively. This high frequency warns of the possibility of significant side effects of some statins and underlines the importance of pharmacogenomic testing. We found a low frequency (6%) of the ABCG2:rs2231142 variant, which indicates that Emirati patients are less likely to be advised against higher rosuvastatin doses than populations with higher frequencies of this variant. In contrast, we found high frequencies of the functionally impaired CYP2C9 alleles, which makes fluvastatin a less favorable choice. CONCLUSION Among the sparse studies available, the present one demonstrates all SLCO1B1 and CYP2C9 function-impairing alleles among Emiratis. We highlighted how population-specific pharmacogenomic data can predict safer choices of statins, especially in understudied populations.
Affiliation(s)
- Mais N Alqasrawi
- Department of Genetics and Genomics, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
- Zeina N Al-Mahayri
- Department of Genetics and Genomics, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
- Hiba Alblooshi
- Department of Genetics and Genomics, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
- Habiba Alsafar
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Bassam R Ali
- Department of Genetics and Genomics, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
- ASPIRE Precision Medicine Research Institute Abu Dhabi, United Arab Emirates University, Al Ain, United Arab Emirates
5
Al-Aamri A, Kamarul Azman S, Daw Elbait G, Alsafar H, Henschel A. Critical assessment of on-premise approaches to scalable genome analysis. BMC Bioinformatics 2023; 24:354. [PMID: 37735350 PMCID: PMC10512525 DOI: 10.1186/s12859-023-05470-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Accepted: 09/08/2023] [Indexed: 09/23/2023]
Abstract
BACKGROUND Plummeting DNA sequencing cost in recent years has enabled genome sequencing projects to scale up by several orders of magnitude, which is transforming genomics into a highly data-intensive field of research. This development provides the much-needed statistical power required for genotype-phenotype predictions in complex diseases. METHODS To leverage this wealth of information efficiently, we assessed several genomic data science tools. We focus on on-premise installations to cope with situations where data confidentiality and compliance regulations rule out cloud-based solutions. We established a comprehensive qualitative and quantitative comparison between BCFtools, SnpSift, Hail, GEMINI, and OpenCGA. The tools were compared in terms of data storage technology, query speed, scalability, annotation, data manipulation, visualization, data output representation, and availability. RESULTS Tools that leverage sophisticated data structures are noted as the most suitable for large-scale projects, with varying degrees of scalability, in comparison to flat-file manipulation (e.g., BCFtools and SnpSift). Remarkably, for small to mid-size projects, even lightweight relational database. CONCLUSION The assessment criteria provide insights into the typical questions posed in scalable genomics and serve as guidance for the development of scalable computational infrastructure in genomics.
Affiliation(s)
- Amira Al-Aamri
- Department of Electrical Engineering and Computer Science, College of Engineering, Khalifa University, P.O. Box 127788, Abu Dhabi, United Arab Emirates
- Syafiq Kamarul Azman
- Department of Electrical Engineering and Computer Science, College of Engineering, Khalifa University, P.O. Box 127788, Abu Dhabi, United Arab Emirates
- Gihan Daw Elbait
- Department of Biology, College of Arts and Sciences, Khalifa University, P.O. Box 127788, Abu Dhabi, United Arab Emirates
- Center for Biotechnology (BTC), Khalifa University, P.O. Box 127788, Abu Dhabi, United Arab Emirates
- Habiba Alsafar
- Center for Biotechnology (BTC), Khalifa University, P.O. Box 127788, Abu Dhabi, United Arab Emirates
- Department of Biomedical Engineering, Khalifa University, P.O. Box 127788, Abu Dhabi, United Arab Emirates
- Andreas Henschel
- Department of Electrical Engineering and Computer Science, College of Engineering, Khalifa University, P.O. Box 127788, Abu Dhabi, United Arab Emirates.
- Center for Biotechnology (BTC), Khalifa University, P.O. Box 127788, Abu Dhabi, United Arab Emirates.
6
Osman W, Mousa M, Albreiki M, Baalfaqih Z, Daggag H, Hill C, McKnight AJ, Maxwell AP, Al Safar H. A genome-wide association study identifies a possible role for cannabinoid signalling in the pathogenesis of diabetic kidney disease. Sci Rep 2023; 13:4661. [PMID: 36949158 PMCID: PMC10033677 DOI: 10.1038/s41598-023-31701-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 03/16/2023] [Indexed: 03/24/2023]
Abstract
Diabetic kidney disease (DKD), also known as diabetic nephropathy, is the leading cause of renal impairment and end-stage renal disease. Patients with diabetes are at risk for DKD because of poor control of their blood glucose, as well as nonmodifiable risk factors including age, ethnicity, and genetics. This genome-wide association study (GWAS) was conducted for the first time in the Emirati population to investigate possible genetic factors associated with the development and progression of DKD. We included data on 7,921,925 single nucleotide polymorphisms (SNPs) in 258 cases of type 2 diabetes mellitus (T2DM) who developed DKD and 938 control subjects with T2DM who did not develop DKD. GWAS suggestive results (P < 1 × 10⁻⁵) were further replicated using summary statistics from three cohorts with T2DM-induced DKD (BioBank Japan data, UK Biobank, and FinnGen Project data) and T1DM-induced DKD (UK-ROI cohort data from Belfast, UK). In a multiple linear regression model for gene-set analyses, the CNR2 gene demonstrated genome-wide significance at 1.46 × 10⁻⁶. SNPs in the CNR2 gene, which encodes cannabinoid receptor 2 (CB2), were replicated in Japanese samples, with the leading SNP rs2501391 showing P_combined = 9.3 × 10⁻⁷ and an odds ratio of 0.67 for DKD associated with T2DM, but not with T1DM, and without any significant association with T2DM itself. The allele frequencies of our cohort and those of the replication cohorts were in most cases markedly different. In addition, we replicated the association between rs1564939 in the GLRA3 gene and DKD in T2DM (P = 0.016, odds ratio = 0.54 per allele C). Our findings suggest that cannabinoid signalling may be involved in the development of DKD through CB2, which is expressed in different kidney regions and known to be involved in insulin resistance, inflammation, and the development of kidney fibrosis.
Affiliation(s)
- Wael Osman
- Center for Biotechnology, Khalifa University, PO Box 127788, Abu Dhabi, United Arab Emirates
- Department of Biology, College of Arts and Sciences, Khalifa University, Abu Dhabi, United Arab Emirates
- Mira Mousa
- Center for Biotechnology, Khalifa University, PO Box 127788, Abu Dhabi, United Arab Emirates
- Mohammed Albreiki
- Center for Biotechnology, Khalifa University, PO Box 127788, Abu Dhabi, United Arab Emirates
- Zahrah Baalfaqih
- Center for Biotechnology, Khalifa University, PO Box 127788, Abu Dhabi, United Arab Emirates
- Hinda Daggag
- Imperial College of London Diabetes Centre, Abu Dhabi, United Arab Emirates
- Claire Hill
- Centre for Public Health, Queen's University of Belfast, Belfast, UK
- Habiba Al Safar
- Center for Biotechnology, Khalifa University, PO Box 127788, Abu Dhabi, United Arab Emirates.
- Department of Biomedical Engineering, College of Engineering, Khalifa University, Abu Dhabi, United Arab Emirates.
7
Mousa M, Albarguthi S, Albreiki M, Farooq Z, Sajid S, El Hajj Chehadeh S, ElBait GD, Tay G, Deeb AA, Alsafar H. Whole-Exome Sequencing in Family Trios Reveals De Novo Mutations Associated with Type 1 Diabetes Mellitus. Biology 2023; 12:413. [PMID: 36979105 PMCID: PMC10044903 DOI: 10.3390/biology12030413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2022] [Revised: 02/16/2023] [Accepted: 02/23/2023] [Indexed: 03/10/2023]
Abstract
Type 1 diabetes mellitus (T1DM) is a chronic autoimmune disease characterized by insulin deficiency and loss of pancreatic islet β-cells. The objective of this study is to identify de novo mutations in 13 trios from singleton families that contribute to the genetic basis of T1DM through the application of whole-exome sequencing (WES). Of the 13 families sampled for this project, 12 had de novo variants, with Family 7 having the highest number (nine) of variants linked to T1DM/autoimmune pathways, whilst Family 4 had no variants that passed the filtering steps. There were 10 variants of 7 genes reportedly associated with T1DM (MST1; TDG; TYRO3; IFIH1; GLIS3; VEGFA; TYK2). There were 20 variants of 13 genes that were linked to endocrine, metabolic, or autoimmune diseases. Our findings demonstrate that trio-based WES is a powerful approach for identifying new candidate genes for the pathogenesis of T1DM. Genotyping and functional annotation of the discovered de novo variants in a large cohort is recommended to ascertain their association with disease pathogenesis.
Affiliation(s)
- Mira Mousa
- Center of Biotechnology, Khalifa University of Science and Technology, Abu Dhabi 127788, United Arab Emirates
- Sara Albarguthi
- Center of Biotechnology, Khalifa University of Science and Technology, Abu Dhabi 127788, United Arab Emirates
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi 127788, United Arab Emirates
- Mohammed Albreiki
- Center of Biotechnology, Khalifa University of Science and Technology, Abu Dhabi 127788, United Arab Emirates
- Zenab Farooq
- College of Medicine and Health Sciences, Khalifa University, Abu Dhabi 127788, United Arab Emirates
- Sameeha Sajid
- College of Medicine and Health Sciences, Khalifa University, Abu Dhabi 127788, United Arab Emirates
- Sarah El Hajj Chehadeh
- Center of Biotechnology, Khalifa University of Science and Technology, Abu Dhabi 127788, United Arab Emirates
- Gihan Daw ElBait
- Center of Biotechnology, Khalifa University of Science and Technology, Abu Dhabi 127788, United Arab Emirates
- Guan Tay
- Center of Biotechnology, Khalifa University of Science and Technology, Abu Dhabi 127788, United Arab Emirates
- Asma Al Deeb
- College of Medicine and Health Sciences, Khalifa University, Abu Dhabi 127788, United Arab Emirates
- Department of Endocrinology, Mafraq Hospital, Abu Dhabi 127788, United Arab Emirates
- Habiba Alsafar
- Center of Biotechnology, Khalifa University of Science and Technology, Abu Dhabi 127788, United Arab Emirates
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi 127788, United Arab Emirates
8
Van Der Merwe N, Ramesar R, De Vries J. Whole Exome Sequencing in South Africa: Stakeholder Views on Return of Individual Research Results and Incidental Findings. Front Genet 2022; 13:864822. [PMID: 35754817 PMCID: PMC9216214 DOI: 10.3389/fgene.2022.864822] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/28/2022] [Accepted: 03/30/2022] [Indexed: 11/17/2022]
Abstract
The use of whole exome sequencing (WES) in medical research is increasing in South Africa (SA), raising important questions about whether and which individual genetic research results, particularly incidental findings, should be returned to patients. Whilst some commentaries and opinions related to the topic have been published in SA, there is no qualitative data on the views of professional stakeholders on this topic. Seventeen participants including clinicians, genomics researchers, and genetic counsellors (GCs) were recruited from the Western Cape in SA. Semi-structured interviews were conducted, and the transcripts analysed using the framework approach for data analysis. Current roadblocks for the clinical adoption of WES in SA include a lack of standardised guidelines; complexities relating to variant interpretation due to lack of functional studies and underrepresentation of people of African ancestry in the reference genome, population and variant databases; lack of resources and skilled personnel for variant confirmation and follow-up. Suggestions to overcome these barriers include obtaining funding and buy-in from the private and public sectors and medical insurance companies; the generation of a locally relevant reference genome; training of health professionals in the field of genomics and bioinformatics; and multidisciplinary collaboration. Participants emphasised the importance of upscaling the accessibility to and training of GCs, as well as upskilling of clinicians and genetic nurses for return of genetic data in collaboration with GCs and medical geneticists. Future research could focus on exploring the development of stakeholder partnerships for increased access to trained specialists as well as community engagement and education, alongside the development of guidelines for result disclosure.
Affiliation(s)
- Nicole Van Der Merwe
- UCT/MRC Genomic and Precision Medicine Research Unit, Division of Human Genetics, Institute for Infectious Diseases and Molecular Medicine, Department of Pathology, Faculty of Medicine and Health Sciences, University of Cape Town, Cape Town, South Africa
- Department of Pathology, Faculty of Medicine and Health Sciences, Stellenbosch University, Tygerberg, South Africa
- Raj Ramesar
- UCT/MRC Genomic and Precision Medicine Research Unit, Division of Human Genetics, Institute for Infectious Diseases and Molecular Medicine, Department of Pathology, Faculty of Medicine and Health Sciences, University of Cape Town, Cape Town, South Africa
- Jantina De Vries
- Department of Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
- Neuroscience Institute, Faculty of Health Sciences, University of Cape Town, Observatory, South Africa
9
Al Zahmi F, Habuza T, Awawdeh R, Elshekhali H, Lee M, Salamin N, Sajid R, Kiran D, Nihalani S, Smetanina D, Talako T, Neidl-Van Gorkom K, Zaki N, Loney T, Statsenko Y. Ethnicity-Specific Features of COVID-19 Among Arabs, Africans, South Asians, East Asians, and Caucasians in the United Arab Emirates. Front Cell Infect Microbiol 2022; 11:773141. [PMID: 35368452 PMCID: PMC8967254 DOI: 10.3389/fcimb.2021.773141] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/09/2021] [Accepted: 11/22/2021] [Indexed: 01/08/2023]
Abstract
Background: Dubai (United Arab Emirates; UAE) has a multinational population, which makes it an exceptionally interesting study sample because of its unique demographic factors. Objective: To stratify the risk factors for the multinational society of the UAE. Methods: A retrospective chart review of 560 patients sequentially admitted to inpatient care with laboratory-confirmed COVID-19 was conducted. We studied patients’ demographics, clinical features, laboratory results, disease severity, and outcomes. The parameters were compared across different ethnic groups using tree-based estimators to rank the ethnicity-specific disease features. We trained ML classification algorithms to build a model of ethnic specificity of COVID-19 based on clinical presentation and laboratory findings on admission. Results: Out of 560 patients, 43.6% were South Asian, 26.4% Middle Eastern, 16.8% East Asian, 10.7% Caucasian, and 2.5% were classified as other. UAE nationals represented half of the Middle Eastern patients, and 13% of the entire cohort. Hypertension was the most common comorbidity in COVID-19 patients. Subjective complaints of fever and cough were the chief presenting symptoms. Two-thirds of the patients had either mild disease or were asymptomatic. Only 20% of the entire cohort needed oxygen therapy, and 12% needed ICU admission. Forty patients (~7%) needed invasive ventilation and fifteen patients died (2.7%). We observed differences in disease severity among different ethnic groups. Caucasian or East Asian COVID-19 patients tended to have more severe disease despite a lower risk profile. In contrast, Middle Eastern COVID-19 patients had a higher risk factor profile, but they did not differ markedly in disease severity from the other ethnic groups. There was no noticeable difference between the Middle Eastern subethnicities (Arabs and Africans) in disease severity (p = 0.81). However, there were disparities in the SOFA score, D-dimer (p = 0.015), fibrinogen (p = 0.007), and background diseases (hypertension, p = 0.003; diabetes and smoking, p = 0.045) between the subethnicities. Conclusion: We observed variations in disease severity among different ethnic groups. The high accuracy (average AUC = 0.9586) of the ethnicity classification model based on the laboratory and clinical findings suggests the presence of ethnic-specific disease features. Larger studies are needed to explore the role of ethnicity in COVID-19 disease features.
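The abstract reports an "average AUC" for a multi-class ethnicity classifier without stating how the average was formed. The sketch below shows one common convention, a macro-averaged one-vs-rest AUC, purely as an illustration. The class names, probability rows, and function names are hypothetical toy values and are not taken from the study.

```python
from typing import List, Sequence


def binary_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive outranks a random
    negative; ties count as half a win (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(prob_rows: Sequence[Sequence[float]],
                  true_classes: Sequence[str],
                  classes: Sequence[str]) -> float:
    """Unweighted mean of the one-vs-rest AUC computed for each class."""
    aucs: List[float] = []
    for k, cls in enumerate(classes):
        scores = [row[k] for row in prob_rows]          # predicted P(class k)
        labels = [1 if t == cls else 0 for t in true_classes]
        aucs.append(binary_auc(scores, labels))
    return sum(aucs) / len(aucs)


# Hypothetical toy data: three classes, six patients (not study data).
classes = ["A", "B", "C"]
true_classes = ["A", "A", "B", "B", "C", "C"]
prob_rows = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.3, 0.2],
    [0.1, 0.2, 0.7],
    [0.4, 0.3, 0.3],
]

average_auc = macro_ovr_auc(prob_rows, true_classes, classes)
print(f"macro one-vs-rest AUC: {average_auc:.4f}")
```

Other averaging schemes (for example, weighting each class by its prevalence, or averaging over cross-validation folds) would give different numbers, so the reported value should be read against the method described in the full paper.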
Affiliation(s)
- Fatmah Al Zahmi
- Mediclinic Parkview Hospital, Dubai, United Arab Emirates
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, United Arab Emirates
- *Correspondence: Fatmah Al Zahmi; Yauhen Statsenko
| | - Tetiana Habuza
- College of Information Technology, United Arab Emirates University, Al Ain, United Arab Emirates
- Big Data Analytics Center, United Arab Emirates University, Al Ain, United Arab Emirates
| | - Rasha Awawdeh
- Mediclinic Parkview Hospital, Dubai, United Arab Emirates
| | | | - Martin Lee
- Mediclinic Parkview Hospital, Dubai, United Arab Emirates
| | - Nassim Salamin
- Mediclinic Parkview Hospital, Dubai, United Arab Emirates
| | - Ruhina Sajid
- Mediclinic Parkview Hospital, Dubai, United Arab Emirates
| | - Dhanya Kiran
- Mediclinic Parkview Hospital, Dubai, United Arab Emirates
| | | | - Darya Smetanina
- College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
| | - Tatsiana Talako
- Belarusian Medical Academy of Postgraduate Education, Minsk, Belarus
- Minsk Scientific and Practical Center for Surgery, Transplantology and Hematology, Minsk, Belarus
| | - Klaus Neidl-Van Gorkom
- College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
| | - Nazar Zaki
- College of Information Technology, United Arab Emirates University, Al Ain, United Arab Emirates
- Big Data Analytics Center, United Arab Emirates University, Al Ain, United Arab Emirates
| | - Tom Loney
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, United Arab Emirates
| | - Yauhen Statsenko
- Big Data Analytics Center, United Arab Emirates University, Al Ain, United Arab Emirates
- College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
- *Correspondence: Fatmah Al Zahmi; Yauhen Statsenko
| |
|
10
|
Borgio JF. Heterogeneity in biomarkers, mitogenome and genetic disorders of the Arab population with special emphasis on large-scale whole-exome sequencing. Arch Med Sci 2021; 19:765-783. [PMID: 37313193 PMCID: PMC10259412 DOI: 10.5114/aoms/145370] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/15/2021] [Accepted: 12/27/2021] [Indexed: 09/20/2024]
Abstract
More than 25 million DNA variations have been discovered as novel including major alleles from the Arab population. Exome studies on the Saudi genome discovered > 3000 novel nucleotide variants associated with > 1200 rare genetic disorders. Reclassification of many pathogenic variants in the Human Gene Mutation Database and ClinVar Database as benign through the Arab database facilitates building a detailed and comprehensive map of the human morbid genome. Intellectual disability comes first with the combined and observed carrier frequency of 0.06779 among Saudi Arabians; retinal dystrophy is the next highest. Genome studies have discovered interesting novel candidate disease marker variations in many genes from consanguineous families. More than 7 pathogenic variants in the C12orf57 gene are prominently associated with the etiology of developmental delay/intellectual impairment in Arab ancestries. Advances in large-scale genome studies open a new outlook on Mendelian genes and disorders. In the past half-dozen years, candidate genes of intellectual disability, neurogenetic disorders, blood and bleeding disorders and rare genetic diseases have been well documented through genomic medicine studies in combination with advanced computational biology applications. The Arab mitogenome exposed hundreds of variations in the mtDNA genome and ancestral sharing with Africa, the Near East and East Asia and its association with obesity. These recent discoveries in disease markers and molecular genetics of the Arab population will have a positive impact towards supporting genetic counsellors on reaching consanguineous families to manage stress linked to genetics and precision medicine. This narrative review summarizes the advances in molecular medical genetics and recent discoveries on pathogenic variants. 
Although these initiatives target the genetics and genomics of disorders prevalent in Arab populations, the lack of full cooperation across the projects needs to be revisited in order to uncover the Arab population's prominent disease markers. This shows that further genomic study is needed to fully comprehend the molecular abnormalities and associated pathogenesis that cause inherited disorders in Arab ancestries.
Affiliation(s)
- J Francis Borgio
- Department of Genetic Research, Institute for Research and Medical Consultations (IRMC), Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
| |
|
11
|
Trost B, Loureiro LO, Scherer SW. Discovery of genomic variation across a generation. Hum Mol Genet 2021; 30:R174-R186. [PMID: 34296264 PMCID: PMC8490016 DOI: 10.1093/hmg/ddab209] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/28/2021] [Revised: 07/09/2021] [Accepted: 07/19/2021] [Indexed: 11/12/2022]
Abstract
Over the past 30 years (the timespan of a generation), advances in genomics technologies have revealed tremendous and unexpected variation in the human genome and have provided increasingly accurate answers to long-standing questions of how much genetic variation exists in human populations and to what degree the DNA complement changes between parents and offspring. Tracking the characteristics of these inherited and spontaneous (or de novo) variations has been the basis of the study of human genetic disease. From genome-wide microarray and next-generation sequencing scans, we now know that each human genome contains over 3 million single nucleotide variants when compared with the ~ 3 billion base pairs in the human reference genome, along with roughly an order of magnitude more DNA—approximately 30 megabase pairs (Mb)—being ‘structurally variable’, mostly in the form of indels and copy number changes. Additional large-scale variations include balanced inversions (average of 18 Mb) and complex, difficult-to-resolve alterations. Collectively, ~1% of an individual’s genome will differ from the human reference sequence. When comparing across a generation, fewer than 100 new genetic variants are typically detected in the euchromatic portion of a child’s genome. Driven by increasingly higher-resolution and higher-throughput sequencing technologies, newer and more accurate databases of genetic variation (for instance, more comprehensive structural variation data and phasing of combinations of variants along chromosomes) of worldwide populations will emerge to underpin the next era of discovery in human molecular genetics.
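The proportions quoted in this abstract can be checked with simple arithmetic. The sketch below uses only the rounded figures stated above (about 3 billion reference base pairs, about 3 million single nucleotide variants, about 30 Mb of structurally variable sequence); the constant and function names are illustrative, and each SNV is assumed to affect exactly one base.

```python
# Rounded per-genome figures quoted in the abstract above.
REFERENCE_BP = 3_000_000_000   # ~3 billion base pairs in the reference genome
SNV_COUNT = 3_000_000          # "over 3 million" single nucleotide variants
STRUCTURAL_BP = 30_000_000     # ~30 Mb of structurally variable DNA


def fraction_of_reference(bases: int, reference: int = REFERENCE_BP) -> float:
    """Share of the reference genome covered by the given number of bases."""
    return bases / reference


snv_fraction = fraction_of_reference(SNV_COUNT)       # one base per SNV (assumption)
sv_fraction = fraction_of_reference(STRUCTURAL_BP)
sv_to_snv_ratio = STRUCTURAL_BP / SNV_COUNT

print(f"SNV share of reference:        {snv_fraction:.2%}")
print(f"Structural share of reference: {sv_fraction:.2%}")
print(f"Structural bases per SNV base: {sv_to_snv_ratio:.0f}x")
print(f"Combined share:                {snv_fraction + sv_fraction:.2%}")
```

With these rounded inputs, structural variation accounts for roughly ten times as many bases as SNVs, and the combined share is on the order of 1% of the reference, in line with the abstract's summary figure.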
Affiliation(s)
- Brett Trost
- The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Livia O Loureiro
- The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Stephen W Scherer
- The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada; McLaughlin Centre and Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
| |
|