1
|
Albrecht V, Pahlke M, Sauter J, Paech C, Putke K, Schmidt AH, Lange V, Klussmeier A. Extensive Analysis of Genetic Diversity in HLA-DMA, HLA-DMB, HLA-DOA and HLA-DOB: Characterisation of 236 Novel Alleles. HLA 2025; 105:e70231. [PMID: 40347049 PMCID: PMC12063561 DOI: 10.1111/tan.70231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2024] [Revised: 03/19/2025] [Accepted: 04/25/2025] [Indexed: 05/12/2025]
Abstract
HLA-DMA, -DMB, -DOA and -DOB are non-classical HLA Class II genes that play a crucial role in the selection of highly stable HLA Class II/peptide complexes on antigen-presenting cells. Although the genes were initially thought to have limited diversity, with fewer than 13 alleles per gene documented in the IPD-IMGT/HLA Database in 2022, recent studies suggest a potential impact of certain alleles on the outcome of hematopoietic cell transplantation. To gain a deeper understanding of allelic diversity, we sequenced HLA-DMA, -DMB, -DOA and -DOB of 1880 potential stem cell donors from Germany, Poland, Great Britain and Chile, achieving full-gene resolution. Remarkably, we identified 3968 previously undescribed sequences, including 28 distinct novel proteins. The observed allele frequencies were consistent across all studied populations, with one dominating protein for each gene: HLA-DMA*01:01 (> 77%), HLA-DMB*01:01 (> 63%), HLA-DOA*01:01 (> 97%) and HLA-DOB*01:01 (> 77%). Notably, a much higher diversity was observed at full-genomic resolution. Finally, we submitted 51 distinct novel sequences for HLA-DMA, 58 for HLA-DMB, 80 for HLA-DOA and 47 for HLA-DOB to the IPD-IMGT/HLA Database. This comprehensive reference database update will not only simplify future genotyping of HLA-DMA, -DMB, -DOA and -DOB but will hopefully also enhance our understanding of the complex process of peptide selection and loading onto HLA Class II proteins.
|
2
|
Siddiqui J, Sinha R, Grantham J, LaCombe R, Alonzo JR, Cowden S, Kleiboeker S. A computational HLA allele-typing protocol to de-noise and leverage nanopore amplicon data. BMC Genomics 2025; 26:356. [PMID: 40200142 PMCID: PMC11980251 DOI: 10.1186/s12864-025-11547-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2024] [Accepted: 03/28/2025] [Indexed: 04/10/2025] Open
Abstract
BACKGROUND Rapid turnaround time for third-field resolution deceased donor human leukocyte antigen (HLA) typing is critical to improving organ transplantation outcomes. Third-generation DNA sequencing platforms such as Oxford Nanopore (ONT) offer the opportunity to deliver rapid results at single-nucleotide resolution, particularly for sequencing data that can be denoised computationally. Here we present a computational pipeline for up to third-field HLA allele typing following ONT sequencing. RESULTS From an R10.3 flow cell batch of 31 samples of known HLA allele types, up to 10,000 ONT reads were aligned using the BWA aligner to reference allele sequences from the IPD-IMGT/HLA database. For each gene, the top two hits to reference alleles at the third field were selected. Using our pipeline, we obtained the following percent concordance at the 1st, 2nd and 3rd field: HLA-A (98.4%, 98.4%, 98.4%), HLA-B (100%, 96.8%, 96.8%), HLA-C (100%, 98.4%, 98.4%), HLA-DPA1 (100%, 96.8%, 96.8%), HLA-DPB1 (100%, 100%, 98.4%), HLA-DQA1 (100%, 98.4%, 98.4%), HLA-DQB1 (100%, 98.4%, 98.4%), HLA-DRB1 (83.9%, 64.5%, 64.5%), HLA-DRB3 (82.6%, 73.9%, 73.9%), HLA-DRB4 (100%, 100%, 100%) and HLA-DRB5 (100%, 100%, 100%). By running our pipeline on an additional R10.3 flow cell batch of 63 samples, the following percent concordances were obtained: HLA-A (100%, 96.8%, 88.1%), HLA-B (100%, 90.5.4%, 88.1%), HLA-C (100%, 99.2%, 99.2%), HLA-DPA1 (100%, 98.4%, 97.6%), HLA-DPB1 (98.4%, 97.6%, 92.9%), HLA-DQA1 (100%, 100%, 98.4%), HLA-DQB1 (100%, 97.6%, 96.0%), HLA-DRB1 (88.9%, 68.3%, 68.3%), HLA-DRB3 (81.0%, 61.9%, 61.9%), HLA-DRB4 (100%, 97.4%, 94.7%) and HLA-DRB5 (73.3%, 66.7%, 66.7%). In addition, our pipeline demonstrated significantly improved concordance compared to the publicly available pipeline HLA-LA and concordances close to those of Athlon2, which is in commercial development.
CONCLUSION Our algorithm had a > 96% concordance for non-HLA-DRB genes at 3rd field on the first batch and > 88% concordance for non-HLA-DRB genes at 3rd field and > 90% at 2nd field on the second batch tested. In addition, it outperforms HLA-LA and approaches the performance of Athlon2. This lays the groundwork for better utilizing Nanopore sequencing data for HLA typing, especially for improving organ transplant outcomes.
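The per-field concordance figures reported above can be illustrated with a small sketch. This is not the authors' pipeline: the function names `truncate`, `matched_alleles` and `concordance` and the sample genotypes are hypothetical, and counting matches per allele (two alleles per sample) is an assumption. That assumption is at least consistent with the first batch, where 98.4% corresponds to 61 of 62 alleles across 31 samples.

```python
from collections import Counter


def truncate(allele: str, fields: int) -> str:
    """Keep the gene prefix and the first `fields` colon-separated fields."""
    gene, _, rest = allele.partition("*")
    return gene + "*" + ":".join(rest.split(":")[:fields])


def matched_alleles(called, expected, fields):
    """Count called alleles that match the expected genotype at the
    given field depth, ignoring the order of the two alleles."""
    c = Counter(truncate(a, fields) for a in called)
    e = Counter(truncate(a, fields) for a in expected)
    return sum((c & e).values())  # multiset intersection


def concordance(samples, fields):
    """Percent of expected alleles recovered across all samples."""
    total = sum(len(expected) for _, expected in samples)
    hits = sum(matched_alleles(called, expected, fields)
               for called, expected in samples)
    return 100.0 * hits / total


# Hypothetical (called, expected) genotypes for one gene, two samples.
samples = [
    (["HLA-A*01:01:01", "HLA-A*02:01:01"],
     ["HLA-A*02:01:01", "HLA-A*01:01:01"]),  # same genotype, order swapped
    (["HLA-A*03:01:02", "HLA-A*24:02:01"],
     ["HLA-A*03:01:01", "HLA-A*24:02:01"]),  # one allele differs at the 3rd field
]

for depth in (1, 2, 3):
    print(depth, concordance(samples, depth))
```

Truncating both the call and the reference type to the same field depth before comparing is what lets a single set of calls yield different concordance values at the 1st, 2nd and 3rd field.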
Affiliation(s)
- Jalal Siddiqui
  - Eurofins Viracor Clinical Diagnostics, 18000 W 99th St, Lenexa, KS, 66219, United States of America
- Rohita Sinha
  - Eurofins Viracor Clinical Diagnostics, 18000 W 99th St, Lenexa, KS, 66219, United States of America
- James Grantham
  - Eurofins Viracor Clinical Diagnostics, 18000 W 99th St, Lenexa, KS, 66219, United States of America
- Ronnie LaCombe
  - Eurofins Viracor Clinical Diagnostics, 18000 W 99th St, Lenexa, KS, 66219, United States of America
- Judith R Alonzo
  - Eurofins Viracor Clinical Diagnostics, 18000 W 99th St, Lenexa, KS, 66219, United States of America
- Scott Cowden
  - Eurofins Viracor Clinical Diagnostics, 18000 W 99th St, Lenexa, KS, 66219, United States of America
- Steven Kleiboeker
  - Eurofins Viracor Clinical Diagnostics, 18000 W 99th St, Lenexa, KS, 66219, United States of America
|
3
|
Schöne B, Fuhrmann M, Surendranath V, Schmidt AH, Lange V, Schöfl G. Submitting Novel Full-Length HLA, MIC, and KIR Alleles with TypeLoader2. Methods Mol Biol 2024; 2809:157-169. [PMID: 38907897 DOI: 10.1007/978-1-0716-3874-3_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2024]
Abstract
The Immuno Polymorphism Database (IPD) plays a pivotal role in immunogenetics. Due to technical limitations, genotyping often focuses on specific key regions like the antigen recognition domain (ARD) for HLA genotyping, and the databases are populated accordingly. More recently, though, modern next generation sequencing (NGS) assays allow using larger gene segments or even complete genes for genotyping. It is therefore essential that the databases are updated with complete genetic reference sequences to fully serve current and future applications. However, the process of manually annotating and submitting full-length allele sequences to IPD is time-consuming and error-prone, which may discourage HLA-genotyping laboratories or researchers from submitting full-length sequences of novel alleles. Here, we detail the process of preparing and submitting novel HLA, MIC, and KIR alleles to ENA and IPD using TypeLoader2, a convenient software tool developed to streamline this process by automating the sequence annotation, the creation of all necessary files, as well as parts of the submission process itself. The software is freely available from GitHub (https://github.com/DKMS-LSL/typeloader).
|
4
|
Putke K, Albrecht V, Paech C, Pahlke M, Schöne B, Klasberg S, Schmidt AH, Lange V, Schöfl G, Klussmeier A. Full-Length Characterization of Novel HLA-DRB1 Alleles for Reference Database Submission. Methods Mol Biol 2024; 2809:145-156. [PMID: 38907896 DOI: 10.1007/978-1-0716-3874-3_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2024]
Abstract
The prerequisite for successful HLA genotyping is the integrity of the large allele reference database IPD-IMGT/HLA. Consequently, it is in the laboratories' best interest that the data quality of submitted novel sequences is high. However, because of its long and variable length, the gene HLA-DRB1 presents the biggest challenge, and as of today only 16% of the HLA-DRB1 alleles in the database are characterized in full length. To improve this situation, we developed a protocol for long-range PCR amplification of targeted HLA-DRB1 alleles. By subsequently combining both long-read and short-read sequencing technologies, our protocol ensures phased and error-corrected sequences of reference-grade quality. This dual redundant reference sequencing (DR2S) approach is of particular importance for correctly resolving the challenging repeat regions of DRB1 intron 1. To date, we have used this protocol to characterize and submit 384 full-length HLA-DRB1 sequences to IPD-IMGT/HLA.
|
5
|
Klussmeier A, Putke K, Klasberg S, Kohler M, Sauter J, Schefzyk D, Schöfl G, Massalski C, Schäfer G, Schmidt AH, Roers A, Lange V. High population frequencies of MICA copy number variations originate from independent recombination events. Front Immunol 2023; 14:1297589. [PMID: 38035108 PMCID: PMC10684724 DOI: 10.3389/fimmu.2023.1297589] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/20/2023] [Accepted: 10/24/2023] [Indexed: 12/02/2023] Open
Abstract
MICA is a stress-induced ligand of the NKG2D receptor that stimulates NK and T cell responses and was identified as a key determinant of anti-tumor immunity. The MICA gene is located inside the MHC complex and is in strong linkage disequilibrium with HLA-B. While an HLA-B*48-linked MICA deletion-haplotype was previously described in Asian populations, little is known about other MICA copy number variations. Here, we report the genotyping of more than two million individuals revealing high frequencies of MICA duplications (1%) and MICA deletions (0.4%). Their prevalence differs between ethnic groups and can rise to 2.8% (Croatia) and 9.2% (Mexico), respectively. Targeted sequencing of more than 70 samples indicates that these copy number variations originate from independent nonallelic homologous recombination events between segmental duplications upstream of MICA and MICB. Overall, our data warrant further investigation of disease associations and consideration of MICA copy number data in oncological study protocols.
Affiliation(s)
- Axel Roers
  - Institute for Immunology, Medical Faculty Carl Gustav Carus, University of Technology (TU) Dresden, Dresden, Germany
  - Institute for Immunology, University Hospital Heidelberg, Heidelberg, Germany
|
6
|
Yin X, Xiang Y, Huang F, Chen Y, Ding H, Du J, Chen X, Wang X, Wei X, Cai Y, Gao W, Guo D, Alolga RN, Kan X, Zhang B, Alejo‐Jacuinde G, Li P, Tran LP, Herrera‐Estrella L, Lu X, Qi L. Comparative genomics of the medicinal plants Lonicera macranthoides and L. japonica provides insight into genus genome evolution and hederagenin-based saponin biosynthesis. Plant Biotechnology Journal 2023; 21:2209-2223. [PMID: 37449344 PMCID: PMC10579715 DOI: 10.1111/pbi.14123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Revised: 05/29/2023] [Accepted: 06/29/2023] [Indexed: 07/18/2023]
Abstract
Lonicera macranthoides (LM) and L. japonica (LJ) are medicinal plants widely used in treating viral diseases, such as COVID-19. Although the two species are morphologically similar, their secondary metabolite profiles are significantly different. Here, metabolomics analysis showed that LM contained ~86.01 mg/g hederagenin-based saponins, 2000-fold higher than LJ. To gain molecular insights into its secondary metabolite production, a chromosome-level genome of LM was constructed, comprising 9 pseudo-chromosomes with 40 097 protein-encoding genes. Genome evolution analysis showed that LM and LJ diverged 1.30-2.27 million years ago (MYA). The two plant species experienced a common whole-genome duplication event that occurred ~53.9-55.2 MYA, before speciation. Genes involved in hederagenin-based saponin biosynthesis were arranged in clusters on the chromosomes of LM and were more highly expressed in LM than in LJ. Among them, the oleanolic acid synthase (OAS) and UDP-glycosyltransferase 73 (UGT73) families were much more highly expressed in LM than in LJ. Specifically, LmOAS1 was found to effectively catalyse the C-28 oxidation of β-amyrin to form oleanolic acid, the precursor of hederagenin-based saponin. LmUGT73P1 was found to catalyse the conversion of cauloside A to α-hederin. We further identified the key amino acid residues of LmOAS1 and LmUGT73P1 for their enzymatic activities. Additionally, compared with their collinear genes in LJ, LmOAS1 and LmUGT73P1 showed an interesting phenomenon of 'neighbourhood replication' in the LM genome. Collectively, the genomic resource and candidate genes reported here set the foundation to fully reveal the genome evolution of the Lonicera genus and the hederagenin-based saponin biosynthetic pathway.
Affiliation(s)
- Xiaojian Yin
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
  - Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, China
- Yaping Xiang
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Feng‐Qing Huang
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Yahui Chen
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Hengwu Ding
  - The Institute of Bioinformatics, College of Life Sciences, Anhui Normal University, Wuhu, China
- Jinfa Du
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Xiaojie Chen
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Xiaoxiao Wang
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Xinru Wei
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Yuan‐Yuan Cai
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Wen Gao
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Dongshu Guo
  - Provincial Key Laboratory of Agrobiology, Jiangsu Academy of Agricultural Science, Nanjing, China
- Raphael N. Alolga
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Xianzhao Kan
  - The Institute of Bioinformatics, College of Life Sciences, Anhui Normal University, Wuhu, China
- Baolong Zhang
  - Provincial Key Laboratory of Agrobiology, Jiangsu Academy of Agricultural Science, Nanjing, China
- Gerardo Alejo‐Jacuinde
  - Institute of Genomics for Crop Abiotic Stress Tolerance, Department of Plant and Soil Science, Texas Tech University, Lubbock, TX, USA
- Ping Li
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Lam‐Son Phan Tran
  - Institute of Genomics for Crop Abiotic Stress Tolerance, Department of Plant and Soil Science, Texas Tech University, Lubbock, TX, USA
- Luis Herrera‐Estrella
  - Institute of Genomics for Crop Abiotic Stress Tolerance, Department of Plant and Soil Science, Texas Tech University, Lubbock, TX, USA
  - Laboratorio Nacional de Genomica/Unidad de Genómica Avanzada del Centro de Investigación y de Estudios Avanzados del IPN, Irapuato, Mexico
- Xu Lu
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
- Lian‐Wen Qi
  - Clinical Metabolomics Center, School of Traditional Chinese Pharmacy, China Pharmaceutical University, Nanjing, China
|
7
|
Wang Y, Zhao Y, Bollas A, Wang Y, Au KF. Nanopore sequencing technology, bioinformatics and applications. Nat Biotechnol 2021; 39:1348-1365. [PMID: 34750572 PMCID: PMC8988251 DOI: 10.1038/s41587-021-01108-x] [Citation(s) in RCA: 798] [Impact Index Per Article: 199.5] [Received: 12/09/2019] [Accepted: 09/22/2021] [Indexed: 12/13/2022]
Abstract
Rapid advances in nanopore technologies for sequencing single long DNA and RNA molecules have led to substantial improvements in accuracy, read length and throughput. These breakthroughs have required extensive development of experimental and bioinformatics methods to fully exploit nanopore long reads for investigations of genomes, transcriptomes, epigenomes and epitranscriptomes. Nanopore sequencing is being applied in genome assembly, full-length transcript detection and base modification detection and in more specialized areas, such as rapid clinical diagnoses and outbreak surveillance. Many opportunities remain for improving data quality and analytical approaches through the development of new nanopores, base-calling methods and experimental protocols tailored to particular applications.
Affiliation(s)
- Yunhao Wang
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Yue Zhao
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
  - Biomedical Informatics Shared Resources, The Ohio State University, Columbus, OH, USA
- Audrey Bollas
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Yuru Wang
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Kin Fai Au
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
  - Biomedical Informatics Shared Resources, The Ohio State University, Columbus, OH, USA
|
8
|
Paech C, Albrecht V, Putke K, Schöfl G, Schöne B, Schmidt AH, Lange V, Klussmeier A. HLA-E diversity unfolded: Identification and characterization of 170 novel HLA-E alleles. HLA 2021; 97:389-398. [PMID: 33527770 PMCID: PMC8247977 DOI: 10.1111/tan.14195] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/12/2020] [Revised: 01/19/2021] [Accepted: 01/25/2021] [Indexed: 12/27/2022]
Abstract
HLA-E is a member of the nonclassical HLA class Ib genes. Even though it is structurally highly similar to the classical HLA class Ia genes, it is less diverse and only 45 alleles and 12 proteins were known in December 2019 (IPD-IMGT/HLA, release 3.38.0). Since 2017, we have genotyped over 3 million voluntary stem cell donors for HLA-E by sequencing the most relevant allele-determining bases of exons 2 and 3. As expected, most donors harbor the two predominant alleles HLA-E*01:01 and/or HLA-E*01:03. However, in 1666 (0.05%) of our samples we detected 345 distinct novel HLA-E sequences. The most frequent one was identified in 162 samples and has by now been named HLA-E*01:114. To characterize these novel alleles in full length, we used both short-read Illumina and long-read PacBio sequencing to obtain fully phased and highly accurate sequences. This resulted in 234 submissions to IPD-IMGT/HLA comprising 170 novel HLA-E alleles, which encode 93 novel HLA-E proteins, as well as 64 confirmations or sequence extensions. Consequently, the number of HLA-E alleles in the database (release 3.42.0) has now increased to 256 HLA-E alleles and 110 HLA-E proteins.
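The counts in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated above; the variable names are illustrative, and because the donor count is given only as "over 3 million", the computed share is an upper bound rather than the exact value behind the reported 0.05%.

```python
# Figures taken from the abstract above; variable names are illustrative.
novel_alleles = 170
confirmations_or_extensions = 64
submissions = novel_alleles + confirmations_or_extensions

samples_with_novel_sequence = 1666
donors_lower_bound = 3_000_000  # "over 3 million", so the true count is larger

# Dividing by the lower bound on donors gives an upper bound on the share.
upper_bound_pct = 100 * samples_with_novel_sequence / donors_lower_bound

print(submissions)                # 234
print(round(upper_bound_pct, 3))  # 0.056
```

The sum reproduces the 234 submissions, and the share of samples carrying a novel sequence stays below roughly 0.056%, in line with the order of magnitude reported.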
|