Valls-Margarit J, Galván-Femenía I, Matías-Sánchez D, Blay N, Puiggròs M, Carreras A, Salvoro C, Cortés B, Amela R, Farre X, Lerga-Jaso J, Puig M, Sánchez-Herrero J, Moreno V, Perucho M, Sumoy L, Armengol L, Delaneau O, Cáceres M, de Cid R, Torrents D. GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing. Nucleic Acids Res 2022; 50:2464-2479. [PMID: 35176773 PMCID: PMC8934637 DOI: 10.1093/nar/gkac076] [Received: 07/20/2021] [Revised: 12/24/2021] [Accepted: 02/09/2022] [Indexed: 11/17/2022]
Abstract
The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to explore the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we help fill this gap by generating a dense haplotype map focused on the identification, characterization, and phasing of structural variants (SVs). By integrating multiple variant identification methods and logistic regression models (LRMs), we present a catalogue of 35 431 441 variants, including 89 178 SVs (≥50 bp), 30 325 064 SNVs and 5 017 199 indels, across 785 Illumina high-coverage (30x) whole genomes from the Iberian GCAT Cohort, containing a median of 3.52M SNVs, 606 336 indels and 6393 SVs per individual. The haplotype panel can impute up to 14 360 728 SNVs/indels and 23 179 SVs, a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SV analysis is shown through an imputed rare Alu element located in a new locus associated with Mononeuritis of lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically include SVs in genome-wide genetic studies.
Affiliation(s)
- Natalia Blay
  - Genomes for Life-GCAT lab Group, Institute for Health Science Research Germans Trias i Pujol (IGTP), Badalona 08916, Spain
- Montserrat Puiggròs
  - Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
- Anna Carreras
  - Genomes for Life-GCAT lab Group, Institute for Health Science Research Germans Trias i Pujol (IGTP), Badalona 08916, Spain
- Cecilia Salvoro
  - Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
- Beatriz Cortés
  - Genomes for Life-GCAT lab Group, Institute for Health Science Research Germans Trias i Pujol (IGTP), Badalona 08916, Spain
- Ramon Amela
  - Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain
- Xavier Farre
  - Genomes for Life-GCAT lab Group, Institute for Health Science Research Germans Trias i Pujol (IGTP), Badalona 08916, Spain
- Jon Lerga-Jaso
  - Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Marta Puig
  - Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
- Jose Francisco Sánchez-Herrero
  - High Content Genomics and Bioinformatics Unit, Institute for Health Science Research Germans Trias i Pujol (IGTP), 08916 Badalona, Spain
- Victor Moreno
  - Catalan Institute of Oncology, Hospitalet del Llobregat, 08908, Spain
  - Bellvitge Biomedical Research Institute (IDIBELL), Hospitalet del Llobregat, 08908, Spain
  - CIBER Epidemiología y Salud Pública (CIBERESP), Madrid 28029, Spain
  - Universitat de Barcelona (UB), Barcelona 08007, Spain
- Manuel Perucho
  - Sanford Burnham Prebys Medical Discovery Institute (SBP), La Jolla, CA 92037, USA
  - Cancer Genetics and Epigenetics, Program of Predictive and Personalized Medicine of Cancer (PMPPC), Health Science Research Institute Germans Trias i Pujol (IGTP), Badalona 08916, Spain
- Lauro Sumoy
  - High Content Genomics and Bioinformatics Unit, Institute for Health Science Research Germans Trias i Pujol (IGTP), 08916 Badalona, Spain
- Lluís Armengol
  - Quantitative Genomic Medicine Laboratories (qGenomics), Esplugues del Llobregat, 08950, Spain
- Olivier Delaneau
  - Department of Computational Biology, University of Lausanne, Génopode, 1015 Lausanne, Switzerland
  - Swiss Institute of Bioinformatics (SIB), University of Lausanne, Quartier Sorge – Batiment Amphipole, 1015 Lausanne, Switzerland
- Mario Cáceres
  - Institut de Biotecnologia i de Biomedicina, Universitat Autònoma de Barcelona, Bellaterra, Barcelona 08193, Spain
  - ICREA, Barcelona 08010, Spain
- Rafael de Cid
  - Correspondence may also be addressed to Rafael de Cid. Tel: +34 930330542;
- David Torrents
  - To whom correspondence should be addressed. Tel: +34 934134074;