1
García-Olivares V, Muñoz-Barrera A, Rubio-Rodríguez LA, Jáspez D, Díaz-de Usera A, Iñigo-Campos A, Veeramah KR, Alonso S, Thomas MG, Lorenzo-Salazar JM, González-Montelongo R, Flores C. Benchmarking of human Y-chromosomal haplogroup classifiers with whole-genome and whole-exome sequence data. Comput Struct Biotechnol J 2023; 21:4613-4618. [PMID: 37817776 PMCID: PMC10560978 DOI: 10.1016/j.csbj.2023.09.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 09/12/2023] [Accepted: 09/12/2023] [Indexed: 10/12/2023] [Open access]
Abstract
In anthropological, medical, and forensic studies, the nonrecombinant region of the human Y chromosome (NRY) enables accurate reconstruction of pedigree relationships and retrieval of ancestral information. Using high-throughput sequencing (HTS) data, we present a benchmarking analysis of command-line tools for NRY haplogroup classification. The evaluation was performed using paired Illumina data from whole-genome sequencing (WGS) and whole-exome sequencing (WES) experiments from 50 unrelated donors. As a validation, we also used paired WGS/WES datasets of 54 individuals from the 1000 Genomes Project. Finally, we evaluated the tools on data from third-generation HTS obtained from a subset of donors and one reference sample. Our results show that WES, despite typically offering less genealogical resolution than WGS, is an effective method for determining the NRY haplogroup. Y-LineageTracker and Yleaf showed the highest accuracy for WGS data, classifying precisely 98% and 96% of the samples, respectively. Yleaf outperforms all benchmarked tools in the WES data, classifying approximately 90% of the samples. Yleaf, Y-LineageTracker, and pathPhynder can correctly classify most samples (88%) sequenced with third-generation HTS. As a result, Yleaf provides the best performance for applications that use WGS and WES. Overall, our study offers researchers a guide that allows them to select the most appropriate tool to analyze the NRY region using both second- and third-generation HTS data.
Affiliation(s)
- Víctor García-Olivares
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Plataforma Genómica de Alto Rendimiento para el Estudio de la Biodiversidad, Instituto de Productos Naturales y Agrobiología (IPNA), Consejo Superior de Investigaciones Científicas, San Cristóbal de La Laguna, Spain
- Adrián Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Luis A. Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- David Jáspez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Ana Díaz-de Usera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Antonio Iñigo-Campos
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Krishna R. Veeramah
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY 11794-5245, United States
- Santos Alonso
- Department of Genetics, Physical Anthropology and Animal Physiology, University of the Basque Country UPV/EHU, Leioa, Bizkaia, Spain
- María Goyri Building, Biotechnology Center, Human Molecular Evolution Lab 2.08 UPV/EHU Science Park, 48940 Leioa, Bizkaia, Spain
- Mark G. Thomas
- UCL Genetics Institute, University College London (UCL), Gower Street, London WC1E 6BT, United Kingdom
- Research Department of Genetics, Evolution & Environment, University College London (UCL), Darwin Building, Gower Street, London WC1E 6BT, United Kingdom
- José M. Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Rafaela González-Montelongo
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Plataforma Genómica de Alto Rendimiento para el Estudio de la Biodiversidad, Instituto de Productos Naturales y Agrobiología (IPNA), Consejo Superior de Investigaciones Científicas, San Cristóbal de La Laguna, Spain
- Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Plataforma Genómica de Alto Rendimiento para el Estudio de la Biodiversidad, Instituto de Productos Naturales y Agrobiología (IPNA), Consejo Superior de Investigaciones Científicas, San Cristóbal de La Laguna, Spain
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias (CIBERES), Instituto de Salud Carlos III, Madrid, Spain
- Facultad de Ciencias de la Salud, Universidad Fernando de Pessoa Canarias, Las Palmas de Gran Canaria, Spain
2
Muñoz-Barrera A, Ciuffreda L, Alcoba-Florez J, Rubio-Rodríguez LA, Rodríguez-Pérez H, Gil-Campesino H, García-Martínez de Artola D, Salas-Hernández J, Rodríguez-Núñez J, Íñigo-Campos A, García-Olivares V, Díez-Gil O, González-Montelongo R, Valenzuela-Fernández A, Lorenzo-Salazar JM, Flores C. Bioinformatic approaches to draft the viral genome sequence of Canary Islands cases related to the multicountry mpox virus 2022-outbreak. Comput Struct Biotechnol J 2023; 21:2197-2203. [PMID: 36968018 PMCID: PMC10015108 DOI: 10.1016/j.csbj.2023.03.020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Revised: 03/13/2023] [Accepted: 03/13/2023] [Indexed: 03/17/2023] [Open access]
Abstract
On July 23, 2022, monkeypox disease (mpox) was declared a Public Health Emergency of International Concern (PHEIC) by the World Health Organization (WHO) due to a multicountry outbreak. In Europe, several cases of mpox virus (MPXV) infection related to this outbreak were detected in the Canary Islands (Spain). Here we describe the combination of viral DNA sequencing and bioinformatic approaches, including methods for de novo genome assembly and short- and long-read technologies, used to reconstruct the first MPXV genome isolated in the Canary Islands on the 31st of May 2022 from a male adult patient with mild symptoms. The same sequencing and bioinformatic approaches were then validated with three other positive cases of MPXV infection from the same mpox outbreak. We obtained the best results using a reference-based approach with short reads, evidencing 46-79 nucleotide variants against viral sequences from the 2018-2019 mpox outbreak and placing the viral sequences in the new B.1 sublineage of clade IIb of the MPXV classification. This study of MPXV demonstrates the potential of metagenomics sequencing for rapid and precise pathogen identification.
Affiliation(s)
- Adrián Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables, 38600 Santa Cruz de Tenerife, Spain
- Laura Ciuffreda
- Research Unit, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- Julia Alcoba-Florez
- Servicio de Microbiología, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- Luis A. Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables, 38600 Santa Cruz de Tenerife, Spain
- Héctor Rodríguez-Pérez
- Research Unit, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- Helena Gil-Campesino
- Servicio de Microbiología, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- Josmar Salas-Hernández
- Research Unit, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- Julia Rodríguez-Núñez
- Research Unit, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- Antonio Íñigo-Campos
- Genomics Division, Instituto Tecnológico y de Energías Renovables, 38600 Santa Cruz de Tenerife, Spain
- Víctor García-Olivares
- Genomics Division, Instituto Tecnológico y de Energías Renovables, 38600 Santa Cruz de Tenerife, Spain
- Oscar Díez-Gil
- Servicio de Microbiología, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- Agustín Valenzuela-Fernández
- Laboratorio “Inmunología Celular y Viral”, Unidad de Farmacología, Sección de Medicina, Facultad de Ciencias de la Salud, Universidad de La Laguna, 38200 San Cristóbal de La Laguna, Spain
- José M. Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables, 38600 Santa Cruz de Tenerife, Spain
- Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables, 38600 Santa Cruz de Tenerife, Spain
- Research Unit, Hospital Universitario Ntra. Sra. de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias (CIBERES), Instituto de Salud Carlos III, 28029 Madrid, Spain
- Facultad de Ciencias de la Salud, Universidad Fernando Pessoa Canarias, 35450 Las Palmas de Gran Canaria, Spain
- Correspondence to: Unidad de Investigación, Hospital Universitario Nuestra Señora de Candelaria, Carretera del Rosario s/n, 38010 Santa Cruz de Tenerife, Spain.
3
García-Olivares V, Rubio-Rodríguez LA, Muñoz-Barrera A, Díaz-de Usera A, Jáspez D, Iñigo-Campos A, Rodríguez Pérez MDC, Cabrera de León A, Lorenzo-Salazar JM, González-Montelongo R, Cabrera VM, Flores C. Digging into the admixture strata of current-day Canary Islanders based on mitogenomes. iScience 2022; 26:105907. [PMID: 36647378 PMCID: PMC9840145 DOI: 10.1016/j.isci.2022.105907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 10/18/2022] [Accepted: 12/19/2022] [Indexed: 12/30/2022] [Open access]
Abstract
The conquest of the Canary Islands by Europeans began at the beginning of the 15th century and culminated in 1496 with the surrender of the aborigines. The collapse of the aboriginal population during the conquest and the arrival of settlers caused a drastic change in the demographic composition of the archipelago. To shed light on this historical process, we analyzed 896 mitogenomes of current inhabitants from the seven main islands. Our findings confirm the continuity of aboriginal maternal contributions and the persistence of their genetic footprints in the current population, even at higher levels (>60% on average) than previously evidenced. Moreover, the age estimates for most autochthonous founder lineages support a first aboriginal arrival to the islands at the beginning of the first millennium. We also revealed for the first time that the main recognizable genetic influences from Europe are from Portuguese and Galicians.
Affiliation(s)
- Víctor García-Olivares
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Plataforma Genómica de Alto Rendimiento para el Estudio de la Biodiversidad, Instituto de Productos Naturales y Agrobiología (IPNA), Consejo Superior de Investigaciones Científicas, San Cristóbal de La Laguna, Spain
- Luis A. Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Adrián Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Ana Díaz-de Usera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- David Jáspez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Antonio Iñigo-Campos
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Antonio Cabrera de León
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
- Área de Medicina Preventiva y Salud Pública, Universidad de La Laguna, Santa Cruz de Tenerife, Spain
- José M. Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Rafaela González-Montelongo
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Plataforma Genómica de Alto Rendimiento para el Estudio de la Biodiversidad, Instituto de Productos Naturales y Agrobiología (IPNA), Consejo Superior de Investigaciones Científicas, San Cristóbal de La Laguna, Spain
- Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Plataforma Genómica de Alto Rendimiento para el Estudio de la Biodiversidad, Instituto de Productos Naturales y Agrobiología (IPNA), Consejo Superior de Investigaciones Científicas, San Cristóbal de La Laguna, Spain
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Facultad de Ciencias de la Salud, Universidad Fernando de Pessoa Canarias, Las Palmas de Gran Canaria, Spain
- Corresponding author
4
Tosco-Herrera E, Muñoz-Barrera A, Jáspez D, Rubio-Rodríguez LA, Mendoza-Alvarez A, Rodriguez-Perez H, Jou J, Iñigo-Campos A, Corrales A, Ciuffreda L, Martinez-Bugallo F, Prieto-Morin C, García-Olivares V, González-Montelongo R, Lorenzo-Salazar JM, Marcelino-Rodriguez I, Flores C. Evaluation of a whole-exome sequencing pipeline and benchmarking of causal germline variant prioritizers. Hum Mutat 2022; 43:2010-2020. [PMID: 36054330 DOI: 10.1002/humu.24459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Revised: 08/20/2022] [Accepted: 08/30/2022] [Indexed: 01/25/2023]
Abstract
Most causal variants of Mendelian diseases are exonic. Whole-exome sequencing (WES) has become the diagnostic gold standard, but causative variant prioritization constitutes a bottleneck. Here we assessed an in-house sample-to-sequence pipeline and benchmarked free prioritization tools for germline causal variants from WES data. WES of 61 unselected patients with a known genetic disease cause was obtained. Variant prioritizations were performed by diverse tools and recorded to obtain a diagnostic yield when the causal variant was present in the first, fifth, and 10th top rankings. A fraction of causal variants was not captured by WES (8.2%) or did not pass the quality control criteria (13.1%). Most of the applications inspected were unavailable or had technical limitations, leaving nine tools for complete evaluation. Exomiser performed best in the top first rankings, while LIRICAL led in the top fifth rankings. Based on the more conservative top 10th rankings, Xrare had the highest diagnostic yield, followed by a three-way tie among Exomiser, LIRICAL, and PhenIX, then followed by AMELIE, TAPES, Phen-Gen, AIVar, and VarNote-PAT. Xrare, Exomiser, LIRICAL, and PhenIX are the most efficient options for variant prioritization in real patient WES data.
Affiliation(s)
- Eva Tosco-Herrera
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain
- Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain
- Adrián Muñoz-Barrera
- Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- David Jáspez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- Luis A Rubio-Rodríguez
- Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- Alejandro Mendoza-Alvarez
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain
- Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain
- Hector Rodriguez-Perez
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain
- Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain
- Jonathan Jou
- Department of Surgery, University of Illinois College of Medicine, Peoria, Illinois, USA
- Antonio Iñigo-Campos
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- Almudena Corrales
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Laura Ciuffreda
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain
- Francisco Martinez-Bugallo
- Clinical Analysis Service, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain
- Carol Prieto-Morin
- Clinical Analysis Service, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain
- Víctor García-Olivares
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- Jose Miguel Lorenzo-Salazar
- Escuela de Doctorado y Estudios de Posgrado de la Universidad de La Laguna (EDEPULL), Universidad de La Laguna (ULL), San Cristóbal de La Laguna, Spain
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- Carlos Flores
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria (HUNSC), Santa Cruz de Tenerife, Spain
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Granadilla de Abona, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Facultad de Ciencias de la Salud, Universidad Fernando Pessoa Canarias, Las Palmas de Gran Canaria, Spain
5
Muñoz-Barrera A, Rubio-Rodríguez LA, Díaz-de Usera A, Jáspez D, Lorenzo-Salazar JM, González-Montelongo R, García-Olivares V, Flores C. From Samples to Germline and Somatic Sequence Variation: A Focus on Next-Generation Sequencing in Melanoma Research. Life (Basel) 2022; 12:1939. [PMID: 36431075 PMCID: PMC9695713 DOI: 10.3390/life12111939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 11/12/2022] [Accepted: 11/16/2022] [Indexed: 11/24/2022] [Open access]
Abstract
Next-generation sequencing (NGS) applications have flourished in the last decade, permitting the identification of cancer driver genes and profoundly expanding the possibilities of genomic studies of cancer, including melanoma. Here we aimed to present a technical review across many of the methodological approaches brought by the use of NGS applications with a focus on assessing germline and somatic sequence variation. We provide cautionary notes and discuss key technical details involved in library preparation, the most common problems with the samples, and guidance to circumvent them. We also provide an overview of the sequence-based methods for cancer genomics, exposing the pros and cons of targeted sequencing vs. exome or whole-genome sequencing (WGS), the fundamentals of the most common commercial platforms, and a comparison of throughputs and key applications. Details of the steps and the main software involved in the bioinformatics processing of the sequencing results, from preprocessing to variant prioritization and filtering, are also provided in the context of the full spectrum of genetic variation (SNVs, indels, CNVs, structural variation, and gene fusions). Finally, we put the emphasis on selected bioinformatic pipelines behind (a) short-read WGS identification of small germline and somatic variants, (b) detection of gene fusions from transcriptomes, and (c) de novo assembly of genomes from long-read WGS data. Overall, we provide comprehensive guidance across the main methodological procedures involved in obtaining sequencing results for the most common short- and long-read NGS platforms, highlighting key applications in melanoma research.
Affiliation(s)
- Adrián Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Luis A. Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Ana Díaz-de Usera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- David Jáspez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- José M. Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Rafaela González-Montelongo
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Víctor García-Olivares
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, 28029 Madrid, Spain
- Facultad de Ciencias de la Salud, Universidad Fernando de Pessoa Canarias, 35450 Las Palmas de Gran Canaria, Spain
6
Hernandez-Beeftink T, Guillen-Guio B, Lorenzo-Salazar JM, Corrales A, Suarez-Pajes E, Feng R, Rubio-Rodríguez LA, Paynton ML, Cruz R, García-Laorden MI, Prieto-González M, Rodríguez-Pérez A, Carriedo D, Blanco J, Ambrós A, González-Higueras E, Espinosa E, Muriel A, Tamayo E, Martin MM, Lorente L, Domínguez D, de Lorenzo AG, Giannini HM, Reilly JP, Jones TK, Añón JM, Soro M, Carracedo Á, Wain LV, Meyer NJ, Villar J, Flores C. A genome-wide association study of survival in patients with sepsis. Crit Care 2022; 26:341. [PMID: 36335405 PMCID: PMC9637317 DOI: 10.1186/s13054-022-04208-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/20/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022] [Open access]
Abstract
BACKGROUND: Sepsis is a severe systemic inflammatory response to infections that is accompanied by organ dysfunction and has a high mortality rate in adult intensive care units. Most genetic studies have identified gene variants associated with development and outcomes of sepsis focusing on biological candidates. We conducted the first genome-wide association study (GWAS) of 28-day survival in adult patients with sepsis. METHODS: This study was conducted in two stages. The first stage was performed on 687 European sepsis patients from the GEN-SEP network and 7.5 million imputed variants. Association testing was conducted with Cox regression models, adjusting by sex, age, and the main principal components of genetic variation. A second stage focusing on the prioritized genetic variants was performed on 2,063 ICU sepsis patients (1,362 European Americans and 701 African-Americans) from the MESSI study. A meta-analysis of results from the two stages was conducted and significance was established at p < 5.0 × 10⁻⁸. Whole-blood transcriptomic, functional annotations, and sensitivity analyses were evaluated on the identified genes and variants. FINDINGS: We identified three independent low-frequency variants associated with reduced 28-day sepsis survival, including a missense variant in SAMD9 (hazard ratio [95% confidence interval] = 1.64 [1.37-6.78], p = 4.92 × 10⁻⁸). SAMD9 encodes a possible mediator of the inflammatory response to tissue injury. INTERPRETATION: We performed the first GWAS of 28-day sepsis survival and identified novel variants associated with reduced survival. Larger sample size studies are needed to better assess the genetic effects in sepsis survival and to validate the findings.
Affiliation(s)
- Tamara Hernandez-Beeftink
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Carretera del Rosario S/N, Santa Cruz de Tenerife, Spain
- Research Unit, Hospital Universitario de Gran Canaria Dr. Negrin, Las Palmas de Gran Canaria, Spain
- Beatriz Guillen-Guio
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Carretera del Rosario S/N, Santa Cruz de Tenerife, Spain
- Department of Health Sciences, University of Leicester, Leicester, UK
- Jose M Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Almudena Corrales
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Carretera del Rosario S/N, Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Eva Suarez-Pajes
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Carretera del Rosario S/N, Santa Cruz de Tenerife, Spain
- Rui Feng
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
- Luis A Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Megan L Paynton
- Department of Health Sciences, University of Leicester, Leicester, UK
- Raquel Cruz
- Genomic Medicine Group, Biomedical Research Center of Rare Diseases (CIBERER), University of Santiago de Compostela, Santiago de Compostela, Spain
- M Isabel García-Laorden
- Research Unit, Hospital Universitario de Gran Canaria Dr. Negrin, Las Palmas de Gran Canaria, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Aurelio Rodríguez-Pérez
- Department of Anesthesiology, Hospital Universitario de Gran Canaria Dr. Negrín, Las Palmas de Gran Canaria, Spain
- Department of Medical and Surgical Sciences, University of Las Palmas de Gran Canaria, Gran Canaria, Spain
- Demetrio Carriedo
- Intensive Care Unit, Complejo Hospitalario Universitario de León, León, Spain
- Jesús Blanco
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Intensive Care Unit, Hospital Universitario Rio Hortega, Valladolid, Spain
- Alfonso Ambrós
- Intensive Care Unit, Hospital General de Ciudad Real, Ciudad Real, Spain
- Elena Espinosa
- Department of Anesthesiology, Hospital Universitario N.S. de Candelaria, Santa Cruz de Tenerife, Spain
- Arturo Muriel
- Intensive Care Unit, Hospital Universitario Rio Hortega, Valladolid, Spain
- Eduardo Tamayo
- CIBER de Enfermedades Infecciosas, Department of Anesthesiology and Resuscitation, Hospital Clínico Universitario de Valladolid, Valladolid, Spain
- Departamento de Cirugía, Facultad de Medicina, Universidad de Valladolid, Valladolid, Spain
- María M Martin
- Intensive Care Unit, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
- Leonardo Lorente
- Intensive Care Unit, Hospital Universitario de Canarias, La Laguna, Tenerife, Spain
- David Domínguez
- Department of Anesthesiology, Hospital Universitario N.S. de Candelaria, Santa Cruz de Tenerife, Spain
- Heather M Giannini
- Division of Pulmonary, Allergy, and Critical Care Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
- John P Reilly
- Division of Pulmonary, Allergy, and Critical Care Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
- Tiffanie K Jones
- Division of Pulmonary, Allergy, and Critical Care Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
- José M Añón
- Intensive Care Unit, Hospital Universitario La Paz, IdiPAZ, Madrid, Spain
- Marina Soro
- Department of Anesthesiology, Hospital Clinico Universitario de Valencia, Valencia, Spain
- Ángel Carracedo
- Genomic Medicine Group, Biomedical Research Center of Rare Diseases (CIBERER), University of Santiago de Compostela, Santiago de Compostela, Spain
- Genomic Medicine Group, CIMUS, University of Santiago de Compostela, Santiago de Compostela, Spain
- Galician Foundation of Genomic Medicine, Foundation of Health Research Institute of Santiago de Compostela (FIDIS), SERGAS, Santiago de Compostela, Spain
- Louise V Wain
- Department of Health Sciences, University of Leicester, Leicester, UK
- Leicester Respiratory Biomedical Research Centre, National Institute for Health Research, Glenfield Hospital, Leicester, UK
- Nuala J Meyer
- Division of Pulmonary, Allergy, and Critical Care Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, USA
- Jesús Villar
- Research Unit, Hospital Universitario de Gran Canaria Dr. Negrin, Las Palmas de Gran Canaria, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Li Ka Shing Knowledge Institute, St. Michael's Hospital, Toronto, ON, Canada
- Carlos Flores
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Carretera del Rosario S/N, Santa Cruz de Tenerife, Spain.
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain.
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain.
- Facultad de Ciencias de la Salud, Universidad Fernando Pessoa Canarias, Las Palmas de Gran Canaria, Spain.
7
Mendoza-Alvarez A, Tosco-Herrera E, Muñoz-Barrera A, Rubio-Rodríguez LA, Alonso-Gonzalez A, Corrales A, Iñigo-Campos A, Almeida-Quintana L, Martin-Fernandez E, Martinez-Beltran D, Perez-Rodriguez E, Callero A, Garcia-Robaina JC, González-Montelongo R, Marcelino-Rodriguez I, Lorenzo-Salazar JM, Flores C. A catalog of the genetic causes of hereditary angioedema in the Canary Islands (Spain). Front Immunol 2022; 13:997148. [PMID: 36203598 PMCID: PMC9531158 DOI: 10.3389/fimmu.2022.997148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Accepted: 08/23/2022] [Indexed: 11/13/2022] [Open access]
Abstract
Hereditary angioedema (HAE) is a rare disease whose known causes involve C1 inhibitor dysfunction or dysregulation of the kinin cascade. The updated HAE management guidelines recommend performing genetic tests to reach a precise diagnosis. Unfortunately, genetic tests are still uncommon in routine diagnosis. Here, we characterized for the first time the genetic causes of HAE in affected families from the Canary Islands (Spain). Whole-exome sequencing data were obtained from 41 affected patients and unaffected relatives from 29 unrelated families identified in the archipelago. The Hereditary Angioedema Database Annotation (HADA) tool was used for pathogenicity classification and causal variant prioritization among the genes known to cause HAE. Manual reclassification of prioritized variants was used in those families lacking known causal variants. We detected a total of eight different variants causing HAE in this patient series, mainly affecting the SERPING1 and F12 genes, one of them being a novel SERPING1 variant (c.686-12A>G) with a predicted splicing effect, which was reclassified as likely pathogenic in one family. Altogether, the diagnostic yield obtained by assessing previously reported causal genes and considering variant reclassifications according to the American College of Medical Genetics guidelines reached 66.7% (95% Confidence Interval [CI]: 30.1-91.0) in families with more than one affected member and 10.0% (95% CI: 1.8-33.1) among cases without family information for the disease. Although the genetic causes in many patients remain to be identified, our results reinforce the need for genetic tests as a first-tier diagnostic tool in this disease, as recommended by the international WAO/EAACI guidelines for the management of HAE.
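The diagnostic yields above are proportions reported with 95% confidence intervals. The abstract does not state which interval method or which underlying counts were used, so the following is only a minimal sketch of one common choice, the Wilson score interval with continuity correction. The function name `wilson_cc_interval` and the counts in the usage lines are hypothetical and chosen for illustration; they are not taken from the article.

```python
import math
from statistics import NormalDist
from typing import Tuple


def wilson_cc_interval(successes: int, n: int, confidence: float = 0.95) -> Tuple[float, float]:
    """Wilson score interval with continuity correction for a binomial proportion.

    Assumption: this is one standard method for small samples; the article
    may have used a different one.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")

    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p = successes / n
    q = 1.0 - p
    denom = 2.0 * (n + z * z)

    lower_root = z * z - 2.0 - 1.0 / n + 4.0 * p * (n * q + 1.0)
    upper_root = z * z + 2.0 - 1.0 / n + 4.0 * p * (n * q - 1.0)

    lower = (2.0 * n * p + z * z - 1.0 - z * math.sqrt(lower_root)) / denom
    upper = (2.0 * n * p + z * z + 1.0 + z * math.sqrt(upper_root)) / denom

    # By convention the bounds are pinned at the extremes of the range.
    if successes == 0:
        lower = 0.0
    if successes == n:
        upper = 1.0
    return max(0.0, lower), min(1.0, upper)


# Hypothetical example: 2 solved cases out of 20 (illustrative counts only).
low, high = wilson_cc_interval(2, 20)
print(f"yield = {2 / 20:.1%}, 95% CI = {low:.1%} to {high:.1%}")
```

With small denominators, such intervals are wide, which is why a yield estimated from a handful of families should be read together with its interval rather than as a point value.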
Affiliation(s)
| | - Eva Tosco-Herrera
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
| | - Adrian Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables, Santa Cruz de Tenerife, Spain
| | - Luis A. Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables, Santa Cruz de Tenerife, Spain
| | - Aitana Alonso-Gonzalez
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
- Universidad de Santiago de Compostela, Santiago de Compostela, Spain
| | - Almudena Corrales
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
| | - Antonio Iñigo-Campos
- Genomics Division, Instituto Tecnológico y de Energías Renovables, Santa Cruz de Tenerife, Spain
| | - Lourdes Almeida-Quintana
- Allergy Service, Hospital Universitario de Gran Canaria Dr. Negrín, Las Palmas de Gran Canaria, Spain
| | - Elena Martin-Fernandez
- Allergy Service, Hospital Universitario Dr. Molina Orosa, Las Palmas de Gran Canaria, Spain
| | - Dara Martinez-Beltran
- Allergy Service, Hospital Universitario Insular-Materno Infantil, Las Palmas de Gran Canaria, Spain
| | - Eva Perez-Rodriguez
- Allergy Service, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
| | - Ariel Callero
- Allergy Service, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
| | - Jose C. Garcia-Robaina
- Allergy Service, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
| | | | - Itahisa Marcelino-Rodriguez
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
- Public Health and Preventive Medicine Area, Universidad de La Laguna, Santa Cruz de Tenerife, Spain
| | - Jose M. Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables, Santa Cruz de Tenerife, Spain
| | - Carlos Flores
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
- Genomics Division, Instituto Tecnológico y de Energías Renovables, Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Facultad de Ciencias de la Salud, Universidad Fernando Pessoa Canarias, Las Palmas de Gran Canaria, Spain
- Correspondence: Carlos Flores
|
8
|
Olson ND, Wagner J, McDaniel J, Stephens SH, Westreich ST, Prasanna AG, Johanson E, Boja E, Maier EJ, Serang O, Jáspez D, Lorenzo-Salazar JM, Muñoz-Barrera A, Rubio-Rodríguez LA, Flores C, Kyriakidis K, Malousi A, Shafin K, Pesout T, Jain M, Paten B, Chang PC, Kolesnikov A, Nattestad M, Baid G, Goel S, Yang H, Carroll A, Eveleigh R, Bourgey M, Bourque G, Li G, Ma C, Tang L, Du Y, Zhang S, Morata J, Tonda R, Parra G, Trotta JR, Brueffer C, Demirkaya-Budak S, Kabakci-Zorlu D, Turgut D, Kalay Ö, Budak G, Narcı K, Arslan E, Brown R, Johnson IJ, Dolgoborodov A, Semenyuk V, Jain A, Tetikol HS, Jain V, Ruehle M, Lajoie B, Roddey C, Catreux S, Mehio R, Ahsan MU, Liu Q, Wang K, Ebrahim Sahraeian SM, Fang LT, Mohiyuddin M, Hung C, Jain C, Feng H, Li Z, Chen L, Sedlazeck FJ, Zook JM. PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions. Cell Genom 2022; 2:S2666-979X(22)00058-1. [PMID: 35720974 PMCID: PMC9205427 DOI: 10.1016/j.xgen.2022.100129] [Citation(s) in RCA: 48] [Impact Index Per Article: 24.0] [Received: 12/07/2020] [Revised: 11/01/2021] [Accepted: 04/08/2022] [Indexed: 11/19/2022]
Abstract
The precisionFDA Truth Challenge V2 aimed to assess the state of the art of variant calling in challenging genomic regions. Starting with FASTQs, 20 challenge participants applied their variant-calling pipelines and submitted 64 variant call sets for one or more sequencing technologies (Illumina, PacBio HiFi, and Oxford Nanopore Technologies). Submissions were evaluated following best practices for benchmarking small variants with updated Genome in a Bottle benchmark sets and genome stratifications. Challenge submissions included numerous innovative methods, with graph-based and machine learning methods scoring best for short-read and long-read datasets, respectively. With machine learning approaches, combining multiple sequencing technologies performed particularly well. Recent developments in sequencing and variant calling have enabled benchmarking variants in challenging genomic regions, paving the way for the identification of previously unknown clinically relevant variants.
Affiliation(s)
- Nathan D. Olson
- Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA
| | - Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA
| | - Jennifer McDaniel
- Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA
| | | | | | | | - Elaine Johanson
- Office of Health Informatics, Office of the Chief Scientist, Office of the Commissioner, US Food and Drug Administration, Silver Spring, MD, USA
| | - Emily Boja
- Office of Health Informatics, Office of the Chief Scientist, Office of the Commissioner, US Food and Drug Administration, Silver Spring, MD, USA
| | - Ezekiel J. Maier
- Booz Allen Hamilton, 8283 Greensboro Drive, Mclean, VA 22102, USA
| | - Omar Serang
- DNAnexus, Inc., 1975 W El Camino Real #204, Mountain View, CA 94040, USA
| | - David Jáspez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - José M. Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - Adrián Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - Luis A. Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Research Unit, Hospital Universitario N.S. de Candelaria, Santa Cruz de Tenerife, Spain
- Instituto de Tecnologías Biomédicas (ITB), Universidad de La Laguna, 38200 San Cristóbal de La Laguna, Spain
| | - Konstantinos Kyriakidis
- School of Pharmacy, Aristotle University of Thessaloniki (AUTH), 541 24 Thessaloniki, Greece
- Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation, 570 01 Thessaloniki, Greece
| | - Andigoni Malousi
- Genomics and Epigenomics Translational Research (GENeTres), Center for Interdisciplinary Research and Innovation, 570 01 Thessaloniki, Greece
- Laboratory of Biological Chemistry, School of Medicine, Aristotle University of Thessaloniki (AUTH), 541 24 Thessaloniki, Greece
| | - Kishwar Shafin
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
| | - Trevor Pesout
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
| | - Miten Jain
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, 1156 High Street, Santa Cruz, CA, USA
| | - Pi-Chuan Chang
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
| | | | - Maria Nattestad
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
| | - Gunjan Baid
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
| | - Sidharth Goel
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
| | - Howard Yang
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
| | - Andrew Carroll
- Google Inc, 1600 Amphitheater Pkwy, Mountain View, CA 94040, USA
| | - Robert Eveleigh
- The Canadian Center for Computational Genomics (C3G), Montréal, QC, Canada
| | - Mathieu Bourgey
- The Canadian Center for Computational Genomics (C3G), Montréal, QC, Canada
| | - Guillaume Bourque
- The Canadian Center for Computational Genomics (C3G), Montréal, QC, Canada
| | - Gen Li
- HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
| | - ChouXian Ma
- HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
| | - LinQi Tang
- HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
| | - YuanPing Du
- HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
| | - ShaoWei Zhang
- HuXinDao, QingZhuHu TaiYangShan Road, KaiFu, ChangSha, HuNan, China
| | - Jordi Morata
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Raúl Tonda
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Genís Parra
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Jean-Rémi Trotta
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Christian Brueffer
- Division of Oncology, Department of Clinical Sciences, Lund University, Lund, Sweden
| | | | | | - Deniz Turgut
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | - Özem Kalay
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | - Gungor Budak
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | - Kübra Narcı
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | - Elif Arslan
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | | | | | | | | | - Amit Jain
- Seven Bridges Genomics, Inc, Charlestown, MA, USA
| | | | | | | | | | | | | | | | - Mian Umair Ahsan
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Qian Liu
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Kai Wang
- Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | | | - Li Tai Fang
- Roche Sequencing Solutions, Santa Clara, CA 95050, USA
| | | | | | - Chirag Jain
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | | | | | | | - Fritz J. Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
| | - Justin M. Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, MS8312, Gaithersburg, MD 20899, USA
|
9
|
Wagner J, Olson ND, Harris L, McDaniel J, Cheng H, Fungtammasan A, Hwang YC, Gupta R, Wenger AM, Rowell WJ, Khan ZM, Farek J, Zhu Y, Pisupati A, Mahmoud M, Xiao C, Yoo B, Sahraeian SME, Miller DE, Jáspez D, Lorenzo-Salazar JM, Muñoz-Barrera A, Rubio-Rodríguez LA, Flores C, Narzisi G, Evani US, Clarke WE, Lee J, Mason CE, Lincoln SE, Miga KH, Ebbert MTW, Shumate A, Li H, Chin CS, Zook JM, Sedlazeck FJ. Curated variation benchmarks for challenging medically relevant autosomal genes. Nat Biotechnol 2022; 40:672-680. [PMID: 35132260 PMCID: PMC9117392 DOI: 10.1038/s41587-021-01158-1] [Citation(s) in RCA: 71] [Impact Index Per Article: 35.5] [Received: 06/07/2021] [Accepted: 11/10/2021] [Indexed: 11/09/2022]
Abstract
The repetitive nature and complexity of some medically relevant genes pose a challenge for their accurate analysis in a clinical setting. The Genome in a Bottle Consortium has provided variant benchmark sets, but these exclude nearly 400 medically relevant genes due to their repetitiveness or polymorphic complexity. Here, we characterize 273 of these 395 challenging autosomal genes using a haplotype-resolved whole-genome assembly. This curated benchmark reports over 17,000 single-nucleotide variations, 3,600 insertions and deletions and 200 structural variations each for human genome reference GRCh37 and GRCh38 across HG002. We show that false duplications in either GRCh37 or GRCh38 result in reference-specific, missed variants for short- and long-read technologies in medically relevant genes, including CBS, CRYAA and KCNE1. When masking these false duplications, variant recall can improve from 8% to 100%. Forming benchmarks from a haplotype-resolved whole-genome assembly may become a prototype for future benchmarks covering the whole genome.
Affiliation(s)
- Justin Wagner
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Nathan D Olson
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Lindsay Harris
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Jennifer McDaniel
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Haoyu Cheng
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
| | | | | | | | | | | | - Ziad M Khan
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Jesse Farek
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Yiming Zhu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Aishwarya Pisupati
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Medhat Mahmoud
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Chunlin Xiao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Byunggil Yoo
- Genomic Medicine Center, Children's Mercy Kansas City, Kansas City, MO, USA
| | | | - Danny E Miller
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - David Jáspez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - José M Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - Adrián Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - Luis A Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain
- Research Unit, Hospital Universitario N.S. de Candelaria, Santa Cruz de Tenerife, Spain
| | | | | | | | - Joyce Lee
- Bionano Genomics, San Diego, CA, USA
| | - Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA
| | | | - Karen H Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Mark T W Ebbert
- Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY, USA
- Department of Internal Medicine, Division of Biomedical Informatics, University of Kentucky, Lexington, KY, USA
- Department of Neuroscience, University of Kentucky, Lexington, KY, USA
| | - Alaina Shumate
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Heng Li
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
| | | | - Justin M Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, USA.
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
|
10
|
Lorente-Arencibia P, García-Villarreal L, González-Montelongo R, Rubio-Rodríguez LA, Flores C, Garay-Sánchez P, delaCruz T, Santana-Verano M, Rodríguez-Esparragón F, Benitez-Reyes JN, Fernández-Fuertes F, Tugores A. Wilson Disease Prevalence: Discrepancy Between Clinical Records, Registries and Mutation Carrier Frequency. J Pediatr Gastroenterol Nutr 2022; 74:192-199. [PMID: 34620762 DOI: 10.1097/mpg.0000000000003322] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/10/2022]
Abstract
OBJECTIVES: Diagnosis of Wilson disease (WD) is difficult and, as early detection may prevent all symptoms, it is essential to know the exact prevalence to evaluate the cost-efficacy of a screening program. As the number of WD patients was high in our population, we wished to estimate prevalence by determining the carrier frequency for clinically relevant ATP7B mutations. METHODS: To estimate prevalence, screening for the most prevalent mutation was performed in 1661 individuals with ancestry in Gran Canaria, and the frequency of other mutations was estimated from patient records. Alternatively, ATP7B mutations were detected from exomes and genomes from 851 individuals with Canarian ancestry, 236 from Gran Canaria, and a public Spanish exome database. RESULTS: Estimated carrier frequencies in Gran Canaria ranged from 1 in 20 to 1 in 28, depending on the method used, resulting in prevalences of 1 case per 1547 to 3140 inhabitants. Alternatively, the estimated affected frequencies were 1 in 5985 to 7980 and 1 in 6278 to 16,510 in the archipelago and mainland Spain, respectively. CONCLUSIONS: The number of carriers predicts much higher prevalences than reported, suggesting that WD is underdiagnosed; specific mutations may remain unnoticed due to low penetrance or no signs of disease at all; regional prevalence rather than national prevalence should be considered in cost-efficacy models to approach preventive screening in the asymptomatic population; and genetic screening strategies will have to deal with the genetic heterogeneity of ATP7B in the general population and in patients.
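The step from carrier frequency to expected prevalence in an autosomal recessive disease such as WD is usually made under Hardy-Weinberg equilibrium. The abstract does not spell out the exact calculation, so the sketch below is an illustration under that assumption, pooling all pathogenic alleles into a single class and assuming full penetrance. The function names are hypothetical, and the outputs need not match the reported figures exactly, since the published carrier frequencies are rounded.

```python
import math


def allele_freq_from_carrier(carrier_freq: float) -> float:
    """Solve 2q(1 - q) = carrier_freq for the pathogenic allele frequency q.

    Assumes Hardy-Weinberg equilibrium; the smaller root is the
    biologically relevant one for a rare disease allele.
    """
    if not 0.0 <= carrier_freq <= 0.5:
        raise ValueError("carrier frequency must lie between 0 and 0.5")
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier_freq)) / 2.0


def affected_freq_from_carrier(carrier_freq: float) -> float:
    """Expected frequency of affected individuals, q squared."""
    q = allele_freq_from_carrier(carrier_freq)
    return q * q


# Carrier frequencies at the two ends of the range quoted for Gran Canaria.
for one_in_carriers in (20, 28):
    affected = affected_freq_from_carrier(1.0 / one_in_carriers)
    print(f"carriers 1 in {one_in_carriers} -> affected about 1 in {1.0 / affected:.0f}")
```

Under these assumptions, a carrier frequency of 1 in 20 corresponds to roughly 1 affected individual per 1,500, the same order of magnitude as the prevalence range reported in the abstract.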
Affiliation(s)
- Pascual Lorente-Arencibia
- Unidad de Investigación, Complejo Hospitalario Universitario Insular Materno-Infantil, Las Palmas de GC
| | - Luis García-Villarreal
- Unidad de Investigación, Complejo Hospitalario Universitario Insular Materno-Infantil, Las Palmas de GC
| | - Rafaela González-Montelongo
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER)
- Instituto de Tecnologías Biomédicas (ITB), Universidad de La Laguna
| | | | - Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER)
- Instituto de Tecnologías Biomédicas (ITB), Universidad de La Laguna
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid
| | - Paloma Garay-Sánchez
- Unidad de Investigación, Complejo Hospitalario Universitario Insular Materno-Infantil, Las Palmas de GC
| | - Tanausú delaCruz
- Unidad de Investigación, Complejo Hospitalario Universitario Insular Materno-Infantil, Las Palmas de GC
| | - Milagros Santana-Verano
- Unidad de Investigación, Complejo Hospitalario Universitario Insular Materno-Infantil, Las Palmas de GC
| | | | - Juana N Benitez-Reyes
- Department of Haematology, Complejo Hospitalario Universitario Insular Materno-Infantil, Spain
| | | | - Antonio Tugores
- Unidad de Investigación, Complejo Hospitalario Universitario Insular Materno-Infantil, Las Palmas de GC
|
11
|
García-Olivares V, Muñoz-Barrera A, Lorenzo-Salazar JM, Zaragoza-Trello C, Rubio-Rodríguez LA, Díaz-de Usera A, Jáspez D, Iñigo-Campos A, González-Montelongo R, Flores C. A benchmarking of human mitochondrial DNA haplogroup classifiers from whole-genome and whole-exome sequence data. Sci Rep 2021; 11:20510. [PMID: 34654896 PMCID: PMC8519921 DOI: 10.1038/s41598-021-99895-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/04/2021] [Accepted: 09/28/2021] [Indexed: 12/18/2022] Open
Abstract
The mitochondrial genome (mtDNA) is of interest for a range of fields including evolutionary, forensic, and medical genetics. Human mitogenomes can be classified into evolutionarily related haplogroups that provide ancestral information and pedigree relationships. Because of this and the advent of high-throughput sequencing (HTS) technology, there is a diversity of bioinformatic tools for haplogroup classification. We present a benchmarking of the 11 most salient tools for human mtDNA classification using empirical whole-genome (WGS) and whole-exome (WES) short-read sequencing data from 36 unrelated donors. We also assessed the best-performing tool in third-generation long noisy read WGS data obtained with nanopore technology for a subset of the donors. We found that, for short-read WGS, most of the tools exhibit high accuracy for haplogroup classification irrespective of the input file used for the analysis. However, for short-read WES, Haplocheck and MixEmt were the most accurate tools. Based on the performance shown for WGS and WES, and the accompanying qualitative assessment, Haplocheck stands out as the most complete tool. For third-generation HTS data, we also showed that Haplocheck was able to accurately retrieve mtDNA haplogroups for all samples assessed, although only after following assembly-based approaches (based on either a reference-based assembly or a hybrid de novo assembly). Taken together, our results provide guidance for researchers to select the most suitable tool to conduct mtDNA analyses from HTS data.
Affiliation(s)
- Víctor García-Olivares
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - Adrián Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - José M Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | | | - Luis A Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - Ana Díaz-de Usera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - David Jáspez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - Antonio Iñigo-Campos
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
| | - Rafaela González-Montelongo
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain
- Instituto de Tecnologías Biomédicas (ITB), Universidad de La Laguna, Santa Cruz de Tenerife, Spain
| | - Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), Santa Cruz de Tenerife, Spain.
- Instituto de Tecnologías Biomédicas (ITB), Universidad de La Laguna, Santa Cruz de Tenerife, Spain.
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, Universidad de La Laguna, Santa Cruz de Tenerife, Spain.
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, Madrid, Spain.
|
12
|
Díaz-de Usera A, Lorenzo-Salazar JM, Rubio-Rodríguez LA, Muñoz-Barrera A, Guillen-Guio B, Marcelino-Rodríguez I, García-Olivares V, Mendoza-Alvarez A, Corrales A, Íñigo-Campos A, González-Montelongo R, Flores C. Evaluation of Whole-Exome Enrichment Solutions: Lessons from the High-End of the Short-Read Sequencing Scale. J Clin Med 2020; 9:jcm9113656. [PMID: 33202991 PMCID: PMC7696786 DOI: 10.3390/jcm9113656] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/22/2020] [Revised: 11/10/2020] [Accepted: 11/10/2020] [Indexed: 12/13/2022] Open
Abstract
Whole-exome sequencing has become a popular technique in research and clinical settings, assisting in disease diagnosis and increasing the understanding of disease pathogenesis. In this study, we aimed to compare common enrichment capture solutions available in the market. Peripheral blood-purified DNA samples were enriched with SureSelectQXT V6 (Agilent) and various Illumina solutions: TruSeq DNA Nano, TruSeq DNA Exome, Nextera DNA Exome, and Illumina DNA Prep with Enrichment, and sequenced on a HiSeq 4000. We found that their percentage of duplicate reads was up to twice as high as the values previously reported for the previous HiSeq series. SureSelectQXT and Illumina DNA Prep with Enrichment showed the best average on-target coverage, which improved when off-target regions were included. At high coverage levels and in shared bases, these two solutions and TruSeq DNA Exome were the three best performers. With respect to the number of small variants detected, SureSelectQXT presented the lowest number of detected variants in target regions. When off-target regions were considered, its performance was comparable to that of the other solutions. Our results show SureSelectQXT and Illumina DNA Prep with Enrichment to be the best enrichment capture solutions.
Affiliation(s)
- Ana Díaz-de Usera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain; (A.D.-d.U.); (J.M.L.-S.); (L.A.R.-R.); (A.M.-B.); (V.G.-O.); (A.Í.-C.); (R.G.-M.)
| | - Jose M. Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain; (A.D.-d.U.); (J.M.L.-S.); (L.A.R.-R.); (A.M.-B.); (V.G.-O.); (A.Í.-C.); (R.G.-M.)
| | - Luis A. Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain; (A.D.-d.U.); (J.M.L.-S.); (L.A.R.-R.); (A.M.-B.); (V.G.-O.); (A.Í.-C.); (R.G.-M.)
| | - Adrián Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain; (A.D.-d.U.); (J.M.L.-S.); (L.A.R.-R.); (A.M.-B.); (V.G.-O.); (A.Í.-C.); (R.G.-M.)
| | - Beatriz Guillen-Guio
- Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, 38010 Santa Cruz de Tenerife, Spain; (B.G.-G.); (I.M.-R.); (A.M.-A.); (A.C.)
| | - Itahisa Marcelino-Rodríguez
- Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, 38010 Santa Cruz de Tenerife, Spain; (B.G.-G.); (I.M.-R.); (A.M.-A.); (A.C.)
- Instituto de Tecnologías Biomédicas (ITB), Universidad de La Laguna, 38200 San Cristóbal de La Laguna, Spain
| | - Víctor García-Olivares
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain; (A.D.-d.U.); (J.M.L.-S.); (L.A.R.-R.); (A.M.-B.); (V.G.-O.); (A.Í.-C.); (R.G.-M.)
| | - Alejandro Mendoza-Alvarez
- Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, 38010 Santa Cruz de Tenerife, Spain; (B.G.-G.); (I.M.-R.); (A.M.-A.); (A.C.)
| | - Almudena Corrales
- Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, 38010 Santa Cruz de Tenerife, Spain; (B.G.-G.); (I.M.-R.); (A.M.-A.); (A.C.)
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, 28029 Madrid, Spain
| | - Antonio Íñigo-Campos
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain; (A.D.-d.U.); (J.M.L.-S.); (L.A.R.-R.); (A.M.-B.); (V.G.-O.); (A.Í.-C.); (R.G.-M.)
| | - Rafaela González-Montelongo
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain; (A.D.-d.U.); (J.M.L.-S.); (L.A.R.-R.); (A.M.-B.); (V.G.-O.); (A.Í.-C.); (R.G.-M.)
| | - Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain; (A.D.-d.U.); (J.M.L.-S.); (L.A.R.-R.); (A.M.-B.); (V.G.-O.); (A.Í.-C.); (R.G.-M.)
- Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, 38010 Santa Cruz de Tenerife, Spain; (B.G.-G.); (I.M.-R.); (A.M.-A.); (A.C.)
- Instituto de Tecnologías Biomédicas (ITB), Universidad de La Laguna, 38200 San Cristóbal de La Laguna, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, 28029 Madrid, Spain
- Correspondence: ; Tel.: +34-922-602938
|