1
Poklukar K, Mestre C, Škrlep M, Čandek-Potokar M, Ovilo C, Fontanesi L, Riquet J, Bovo S, Schiavo G, Ribani A, Muñoz M, Gallo M, Bozzi R, Charneca R, Quintanilla R, Kušec G, Mercat MJ, Zimmer C, Razmaite V, Araujo JP, Radović Č, Savić R, Karolyi D, Servin B. A meta-analysis of genetic and phenotypic diversity of European local pig breeds reveals genomic regions associated with breed differentiation for production traits. Genet Sel Evol 2023; 55:88. [PMID: 38062367 PMCID: PMC10704730 DOI: 10.1186/s12711-023-00858-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 11/17/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND Intense selection of modern pig breeds has resulted in genetic improvement of production traits while the performance of local pig breeds has remained lower. As local pig breeds have been bred in extensive systems, they have adapted to specific environmental conditions, resulting in a rich genotypic and phenotypic diversity. This study is based on European local pig breeds that have been genetically characterized using DNA-pool sequencing data and phenotypically characterized using breed level phenotypes related to stature, fatness, growth, and reproductive performance traits. These data were analyzed using a dedicated approach to detect signatures of selection linked to phenotypic traits in order to uncover potential candidate genes that may underlie adaptation to specific environments.
RESULTS Analysis of the genetic data of European pig breeds revealed four main axes of genetic variation represented by the Iberian and three modern breeds (i.e. Large White, Landrace, and Duroc). In addition, breeds clustered according to their geographical origin, for example French Gascon and Basque breeds, Italian Apulo Calabrese and Casertana breeds, Spanish Iberian, and Portuguese Alentejano breeds. Principal component analysis of the phenotypic data distinguished the larger and leaner breeds with better growth potential and reproductive performance from the smaller and fatter breeds with low growth and reproductive efficiency. Linking the signatures of selection with phenotype identified 16 significant genomic regions associated with stature, 24 with fatness, 2 with growth, and 192 with reproduction. Among them, several regions contained candidate genes with possible biological effects on stature, fatness, growth, and reproductive performance traits. For example, strong associations were found for stature in two regions containing, respectively, the ANXA4 and ANTXR1 genes, for fatness in a region containing the DNMT3A and POMC genes and for reproductive performance in a region containing the HSD17B7 gene.
CONCLUSIONS In this study on European local pig breeds, we used a dedicated approach for detecting signatures of selection that were supported by phenotypic data at the breed level to identify potential candidate genes that may have adapted to different living environments and production systems.
Affiliation(s)
- Klavdija Poklukar
- Agricultural Institute of Slovenia, Hacquetova Ulica 17, 1000, Ljubljana, Slovenia
- Camille Mestre
- GenPhySE, Université de Toulouse, INRAE, INP, ENVT, 31320, Castanet-Tolosan, France
- Martin Škrlep
- Agricultural Institute of Slovenia, Hacquetova Ulica 17, 1000, Ljubljana, Slovenia
- Cristina Ovilo
- Departamento Mejora Genética Animal, INIA-CSIC, Crta. de la Coruña Km. 7,5, 28040, Madrid, Spain
- Luca Fontanesi
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
- Juliette Riquet
- GenPhySE, Université de Toulouse, INRAE, INP, ENVT, 31320, Castanet-Tolosan, France
- Samuele Bovo
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
- Giuseppina Schiavo
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
- Anisa Ribani
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
- Maria Muñoz
- Departamento Mejora Genética Animal, INIA-CSIC, Crta. de la Coruña Km. 7,5, 28040, Madrid, Spain
- Maurizio Gallo
- Associazione Nazionale Allevatori Suini (ANAS), Via Nizza 53, 00198, Rome, Italy
- Ricardo Bozzi
- DAGRI-Animal Science Section, Università Di Firenze, Via Delle Cascine 5, 50144, Florence, Italy
- Rui Charneca
- MED- Mediterranean Institute for Agriculture, Environment and Development, Universidade de Évora, Pólo da Mitra, Apartado 94, 7006-554, Évora, Portugal
- Raquel Quintanilla
- Programa de Genética y Mejora Animal, IRTA, Torre Marimon, Caldes de Montbui, 08140, Barcelona, Spain
- Goran Kušec
- Faculty of Agrobiotechnical Sciences, University of Osijek, Vladimira Preloga 1, 31000, Osijek, Croatia
- Marie-José Mercat
- IFIP Institut du Porc, La Motte au Vicomte, BP 35104, 35651, Le Rheu Cedex, France
- Christoph Zimmer
- Bauerliche Erzeugergemeinschaft Schwäbisch Hall, Haller Str. 20, 74549, Wolpertshausen, Germany
- Violeta Razmaite
- Animal Science Institute, Lithuanian University of Health Sciences, 82317, Baisogala, Lithuania
- Jose P Araujo
- Centro de Investigação de Montanha (CIMO), Instituto Politécnico de Viana do Castelo, Escola Superior Agrária, Refóios do Lima, 4990-706, Ponte de Lima, Portugal
- Čedomir Radović
- Department of Pig Breeding and Genetics, Institute for Animal Husbandry, 11080, Belgrade-Zemun, Serbia
- Radomir Savić
- Faculty of Agriculture, University of Belgrade, Nemanjina 6, 11080, Belgrade-Zemun, Serbia
- Danijel Karolyi
- Department of Animal Science, Faculty of Agriculture, University of Zagreb, Svetošimunska c. 25, 10000, Zagreb, Croatia
- Bertrand Servin
- GenPhySE, Université de Toulouse, INRAE, INP, ENVT, 31320, Castanet-Tolosan, France
2
Selvakumar R, Jat GS, Manjunathagowda DC. Allele mining through TILLING and EcoTILLING approaches in vegetable crops. PLANTA 2023; 258:15. [PMID: 37311932 DOI: 10.1007/s00425-023-04176-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 06/01/2023] [Indexed: 06/15/2023]
Abstract
MAIN CONCLUSION The present review provides a comprehensive overview of allele mining for genetic improvement in vegetable crops, and of allele exploration methods and their use in pre-breeding of economically important traits.
Vegetable crops have numerous wild relatives, ancestors and landraces that could be exploited to develop high-yielding and climate-resilient varieties resistant or tolerant to biotic and abiotic stresses. To further boost the genetic potential of economic traits, the available genomic tools must be directed toward exploiting novel alleles from genetic stocks, through the discovery of beneficial alleles in wild relatives and their introgression into cultivated types. This capability would give plant breeders direct access to critical alleles that confer higher production, improve bioactive compounds, and increase water and nutrient productivity as well as resilience to biotic and abiotic stress. Allele mining is a sophisticated technique for dissecting naturally occurring allelic variants in candidate genes that influence important traits, and it could be used for genetic improvement of vegetable crops. Targeting induced local lesions in genomes (TILLING) is a sensitive mutation detection approach in functional genomics, particularly where genome sequence information is limited or not available. TILLING relies on populations exposed to chemical mutagens, which induce mutations without selectivity, whereas EcoTILLING detects naturally occurring SNPs and InDels. It is anticipated that as TILLING is used for vegetable crop improvement in the near future, indirect benefits will become apparent. Therefore, in this review we have highlighted up-to-date information on allele mining for genetic enhancement in vegetable crops, on methods of allele exploration, and on their use in pre-breeding for improvement of economic traits.
Affiliation(s)
- Raman Selvakumar
- ICAR-Indian Agricultural Research Institute, Pusa Campus, New Delhi, 110 012, India
- Gograj Singh Jat
- ICAR-Indian Agricultural Research Institute, Pusa Campus, New Delhi, 110 012, India
3
Bertolini F, Ribani A, Capoccioni F, Buttazzoni L, Bovo S, Schiavo G, Caggiano M, Rothschild MF, Fontanesi L. Whole Genome Sequencing Provides Information on the Genomic Architecture and Diversity of Cultivated Gilthead Seabream (Sparus aurata) Broodstock Nuclei. Genes (Basel) 2023; 14:839. [PMID: 37107597 PMCID: PMC10137967 DOI: 10.3390/genes14040839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Revised: 03/21/2023] [Accepted: 03/27/2023] [Indexed: 04/03/2023] Open
Abstract
The gilthead seabream (Sparus aurata) is a species of relevance for the Mediterranean aquaculture industry. Despite the advancement of genetic tools for the species, breeding programs still do not often include genomics. In this study, we designed a genomic strategy to identify signatures of selection and genomic regions of high differentiation among populations of farmed fish stocks. A comparative DNA pooling sequencing approach was applied to identify signatures of selection in gilthead seabream from the same hatchery and from different nuclei that had not been subjected to genetic selection. Identified genomic regions were further investigated to detect SNPs with predicted high impact. The analyses underlined major genomic differences in the proportion of fixed alleles among the investigated nuclei. Some of these differences highlighted genomic regions, including genes involved in general metabolism and development already detected in QTL for growth, size, skeletal deformity, and adaptation to variation of oxygen levels in other teleosts. The obtained results pointed out the need to control the genetic effect of breeding programs in this species to avoid the reduction of genetic variability within populations and the increase in inbreeding level that, in turn, might lead to an increased frequency of alleles with deleterious effects.
Affiliation(s)
- Francesca Bertolini
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale G. Fanin 46, 40127 Bologna, Italy
- Anisa Ribani
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale G. Fanin 46, 40127 Bologna, Italy
- Fabrizio Capoccioni
- Centro di Ricerca “Zootecnia e Acquacoltura”, Consiglio per la Ricerca in Agricoltura e l’Analisi dell’Economia Agraria (CREA), 00198 Roma, Italy
- Luca Buttazzoni
- Centro di Ricerca “Zootecnia e Acquacoltura”, Consiglio per la Ricerca in Agricoltura e l’Analisi dell’Economia Agraria (CREA), 00198 Roma, Italy
- Samuele Bovo
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale G. Fanin 46, 40127 Bologna, Italy
- Giuseppina Schiavo
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale G. Fanin 46, 40127 Bologna, Italy
- Massimo Caggiano
- Panittica Italia Società Agricola Srl, Torre Canne di Fasano, 72016 Brindisi, Italy
- Max F. Rothschild
- Department of Animal Science, Iowa State University, Ames, IA 50011-3150, USA
- Luca Fontanesi
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale G. Fanin 46, 40127 Bologna, Italy
4
Ma W, Guan X, Miao Y, Zhang L. Whole Genome Resequencing Revealed the Effect of Helicase yqhH Gene on Regulating Bacillus thuringiensis LLP29 against Ultraviolet Radiation Stress. Int J Mol Sci 2023; 24:5810. [PMID: 36982883 PMCID: PMC10054049 DOI: 10.3390/ijms24065810] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/27/2023] [Revised: 02/23/2023] [Accepted: 03/01/2023] [Indexed: 03/30/2023] Open
Abstract
Bacillus thuringiensis (Bt) is a widely used microbial pesticide. However, its duration of effectiveness is greatly shortened by ultraviolet (UV) irradiation, which seriously hinders the application of Bt preparations. It is therefore important to study the mechanism of Bt resistance to UV at the molecular level in order to improve the UV resistance of Bt strains. To identify the functional genes involved in UV resistance, the genome of the UV-induced mutant Bt LLP29-M19 was re-sequenced and compared with that of the original strain Bt LLP29. After UV irradiation, the mutant strain differed from the original strain Bt LLP29 by 1318 SNPs, 31 InDels, and 206 SVs, which were then analyzed for gene annotation. A mutated gene named yqhH, a member of helicase superfamily II, was detected as an important candidate. yqhH was then expressed and purified successfully. In vitro enzymatic activity assays showed that yqhH has ATP hydrolase and helicase activities. To further verify its function, the yqhH gene was knocked out and complemented using homologous recombination. After UV treatment, the survival rate of the knockout mutant strain Bt LLP29-ΔyqhH was significantly lower than that of the original strain Bt LLP29 and the back-complemented strain Bt LLP29-ΔyqhH-R. Meanwhile, total helicase activity did not differ significantly whether or not Bt carried yqhH. Together, these findings extend the understanding of the molecular mechanisms by which Bt responds to UV stress.
Affiliation(s)
- Weibo Ma
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Key Laboratory of Biopesticide and Chemical Biology of Ministry of Education & Ministerial and Provincial Joint Innovation Centre for Safety Production of Cross-Strait Crops, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Xiong Guan
- College of Plant Protection, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Ying Miao
- College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Lingling Zhang
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Key Laboratory of Biopesticide and Chemical Biology of Ministry of Education & Ministerial and Provincial Joint Innovation Centre for Safety Production of Cross-Strait Crops, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China
5
Eynard SE, Vignal A, Basso B, Canale‐Tabet K, Le Conte Y, Decourtye A, Genestout L, Labarthe E, Mondet F, Servin B. Reconstructing queen genotypes by pool sequencing colonies in eusocial insects: Statistical Methods and their application to honeybee. Mol Ecol Resour 2022; 22:3035-3048. [PMID: 35816386 PMCID: PMC9796407 DOI: 10.1111/1755-0998.13685] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Revised: 06/29/2022] [Accepted: 07/04/2022] [Indexed: 01/01/2023]
Abstract
Eusocial insects, and particularly the honeybee (Apis mellifera), are crucial to many ecosystems. One approach to facilitate their study in molecular genetics is whole-colony genotyping, in which the DNA of multiple individuals is combined in a single pool sequencing experiment. Cheap and fast, this technique comes with the drawback of producing data that require dedicated methods to be fully exploited. Despite this limitation, pool sequencing data have been shown to be informative and cost-effective when working on random mating populations. Here, we present new statistical methods for exploiting pool sequencing of eusocial colonies in order to reconstruct the genotypes of the queen of such a colony. This opens the possibility to monitor genetic diversity, perform genomic-based studies or implement selective breeding. Using simulations and real honeybee data, we show that the new methods allow for a fast and accurate estimation of the queen's genetic ancestry, with correlations of about 0.9 to that obtained from individual genotyping. They also allow for an accurate reconstruction of the queen genotypes, with about 2% genotyping error. We further validate these inferences using experimental data on colonies with both pool sequencing and individual genotyping of drones. In brief, in this study we present statistical models to accurately estimate the genetic ancestry and reconstruct the genotypes of the queen from pool sequencing data from workers of a eusocial colony. Such information allows pool sequencing to be exploited for traditional population genetics analyses, association studies and selective breeding. While validated in Apis mellifera, these methods are applicable to other eusocial hymenopterans.
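The idea of recovering a queen genotype from pooled worker reads can be illustrated with a deliberately simplified sketch. It is not the statistical model of the cited paper: it assumes a single biallelic site, a known paternal (drone) allele frequency `drone_freq`, a flat prior over queen genotypes, and a fixed sequencing error rate. All function and parameter names are hypothetical. Under these assumptions each worker carries one allele from the queen and one from a drone, so the expected alternate-allele frequency in the pool is g/4 + p/2 for a queen carrying g alternate alleles.

```python
from math import comb


def queen_genotype_posterior(alt_reads, depth, drone_freq, error=0.01):
    """Posterior over queen genotypes (0, 1 or 2 alternate alleles) at one site.

    Simplifying assumptions (illustrative only): flat prior on the genotype,
    known paternal allele frequency, binomial sampling of reads.
    """
    likelihoods = []
    for g in (0, 1, 2):
        # Half of the worker alleles are maternal (alt with prob g/2),
        # half are paternal (alt with prob drone_freq).
        f = g / 4 + drone_freq / 2
        # Symmetric sequencing error keeps f strictly between 0 and 1.
        f = f * (1 - error) + (1 - f) * error
        likelihoods.append(
            comb(depth, alt_reads) * f**alt_reads * (1 - f) ** (depth - alt_reads)
        )
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]


# Example: 30 alternate reads out of 40, paternal allele frequency 0.5.
posterior = queen_genotype_posterior(alt_reads=30, depth=40, drone_freq=0.5)
print([round(p, 3) for p in posterior])
```

In practice the paternal allele frequency is itself unknown and must be estimated, which is one reason dedicated methods are needed.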
Affiliation(s)
- Sonia E. Eynard
- GenPhySE, INRAE, INP, ENVT, Université de Toulouse, Castanet‐Tolosan, France
- LABOGENA DNA, Jouy‐en‐Josas, France
- Alain Vignal
- GenPhySE, INRAE, INP, ENVT, Université de Toulouse, Castanet‐Tolosan, France
- Benjamin Basso
- Abeilles et Environnement, INRAE, Avignon, France
- ITSAP, Avignon, France
- Bertrand Servin
- GenPhySE, INRAE, INP, ENVT, Université de Toulouse, Castanet‐Tolosan, France
6
Comparative population genomics in Tabebuia alliance shows evidence of adaptation in Neotropical tree species. Heredity (Edinb) 2022; 128:141-153. [PMID: 35132209 PMCID: PMC8897506 DOI: 10.1038/s41437-021-00491-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2021] [Revised: 12/07/2021] [Accepted: 12/07/2021] [Indexed: 11/08/2022] Open
Abstract
The role of natural selection in shaping spatial patterns of genetic diversity in the Neotropics is still poorly understood. Here, we perform a genome scan with 24,751 probes targeting 11,026 loci in two Neotropical Bignoniaceae tree species, Handroanthus serratifolius from the seasonally dry tropical forest (SDTF) and Tabebuia aurea from savannas, and compare the results with the population genomics of H. impetiginosus from SDTF. OutFLANK detected 29 loci in 20 genes with a selection signal in H. serratifolius and no loci in T. aurea. Using BayPass, we found evidence of selection in 335 loci in 312 genes in H. serratifolius, 101 loci in 92 genes in T. aurea, and 448 loci in 416 genes in H. impetiginosus. All approaches identified several genes affecting plant response to environmental stress and primary metabolic processes. The three species shared no SNPs with a selection signal, but we found SNPs affecting the same gene in pairs of species. Handroanthus serratifolius showed differences in allele frequencies at SNPs with a selection signal among ecosystems, mainly between Caatinga/Cerrado and Atlantic Forest, while H. impetiginosus had one allele fixed across all populations, and T. aurea had similar allele frequency distributions among ecosystems and polymorphism across populations. Taken together, our results indicate that natural selection related to environmental stress shaped the spatial pattern of genetic diversity in the three species. However, the three species have different geographical distributions and niches, which may affect tolerances and adaptation, and natural selection may lead to different signatures due to the differences in adaptive landscapes in different niches.
7
Clarelli F, Barizzone N, Mangano E, Zuccalà M, Basagni C, Anand S, Sorosina M, Mascia E, Santoro S, Guerini FR, Virgilio E, Gallo A, Pizzino A, Comi C, Martinelli V, Comi G, De Bellis G, Leone M, Filippi M, Esposito F, Bordoni R, Martinelli Boneschi F, D'Alfonso S. Contribution of Rare and Low-Frequency Variants to Multiple Sclerosis Susceptibility in the Italian Continental Population. Front Genet 2022; 12:800262. [PMID: 35047017 PMCID: PMC8762330 DOI: 10.3389/fgene.2021.800262] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/22/2021] [Accepted: 11/17/2021] [Indexed: 12/15/2022] Open
Abstract
Genome-wide association studies focusing on common variants have identified over 200 risk loci for multiple sclerosis (MS), which account for about 50% of disease heritability. The goal of this study was to investigate whether low-frequency and rare functional variants located in established MS-associated loci may contribute to disease risk in a relatively homogeneous population, by testing their cumulative effect (burden) with gene-wise tests. We sequenced 98 genes in 588 Italian patients with MS and 408 matched healthy controls (HCs). Variants were selected using different filtering criteria based on allelic frequency and in silico functional impact. Genes showing a significant burden (n = 17) were sequenced in an independent cohort of 504 MS and 504 HC. The highest signal in both cohorts was observed for the disruptive variants (stop-gain, stop-loss, or splicing variants) located in EFCAB13, a gene coding for a protein of unknown function (p < 10⁻⁴). Among these variants, the minor allele of a stop-gain variant showed a significantly higher frequency in MS versus HC in both sequenced cohorts (p = 0.0093 and p = 0.025), confirmed by a meta-analysis including a third independent cohort of 1298 MS and 1430 HC (p = 0.001) assayed with an SNP array. Real-time PCR on 14 individuals heterozygous for this variant did not detect the stop-gain allele, suggesting transcript degradation by nonsense-mediated decay, supported by the evidence that carriers of the stop-gain variant had a lower expression of this gene (p = 0.0184). In conclusion, we identified a novel low-frequency functional variant associated with MS susceptibility, suggesting a possible role of rare/low-frequency variants in MS as reported for other complex diseases.
Affiliation(s)
- Ferdinando Clarelli
- Laboratory of Human Genetics of Neurological Disorders, Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Nadia Barizzone
- Department of Health Sciences, UPO, University of Eastern Piedmont, and CAAD (Center for Translational Research on Autoimmune and Allergic Disease), Novara, Italy
- Eleonora Mangano
- Institute for Biomedical Technologies, National Research Council of Italy, Segrate, Italy
- Miriam Zuccalà
- Department of Health Sciences, UPO, University of Eastern Piedmont, and CAAD (Center for Translational Research on Autoimmune and Allergic Disease), Novara, Italy
- Chiara Basagni
- Department of Health Sciences, UPO, University of Eastern Piedmont, and CAAD (Center for Translational Research on Autoimmune and Allergic Disease), Novara, Italy
- Santosh Anand
- Department of Informatics, Systems and Communications (DISCo), University of Milano-Bicocca, Milan, Italy
- Melissa Sorosina
- Laboratory of Human Genetics of Neurological Disorders, Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Elisabetta Mascia
- Laboratory of Human Genetics of Neurological Disorders, Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Silvia Santoro
- Laboratory of Human Genetics of Neurological Disorders, Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Eleonora Virgilio
- Department of Translational Medicine, Section of Neurology and IRCAD, UNIUPO, Novara, Italy
- Antonio Gallo
- MS Center, I Division of Neurology, Department of Advanced Medical and Surgical Sciences (DAMSS), University of Campania "Luigi Vanvitelli", Naples, Italy
- Alessandro Pizzino
- Department of Health Sciences, UPO, University of Eastern Piedmont, and CAAD (Center for Translational Research on Autoimmune and Allergic Disease), Novara, Italy
- Cristoforo Comi
- Department of Translational Medicine, Section of Neurology and IRCAD, UNIUPO, Novara, Italy
- Vittorio Martinelli
- Neurology Unit and Neurorehabilitation Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Gianluca De Bellis
- Institute for Biomedical Technologies, National Research Council of Italy, Segrate, Italy
- Maurizio Leone
- Neurology Unit, Fondazione IRCCS Casa Sollievo Della Sofferenza, San Giovanni Rotondo, Italy
- Massimo Filippi
- Neurology Unit and Neurorehabilitation Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy; Vita-Salute San Raffaele University, Milan, Italy; Neuroimaging Research Unit, Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy; Neurophysiology Service, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Federica Esposito
- Laboratory of Human Genetics of Neurological Disorders, Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy; Neurology Unit and Neurorehabilitation Unit, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Roberta Bordoni
- Institute for Biomedical Technologies, National Research Council of Italy, Segrate, Italy
- Filippo Martinelli Boneschi
- Department of Pathophysiology and Transplantation (DEPT), Dino Ferrari Centre, Neuroscience Section, University of Milan, Milan, Italy; Neurology Unit, MS Centre, Foundation IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milan, Italy
- Sandra D'Alfonso
- Department of Health Sciences, UPO, University of Eastern Piedmont, and CAAD (Center for Translational Research on Autoimmune and Allergic Disease), Novara, Italy
8
Tran Mau-Them F, Duffourd Y, Vitobello A, Bruel AL, Denommé-Pichon AS, Nambot S, Delanne J, Moutton S, Sorlin A, Couturier V, Bourgeois V, Chevarin M, Poe C, Mosca-Boidron AL, Callier P, Safraou H, Faivre L, Philippe C, Thauvin-Robinet C. Interest of exome sequencing trio-like strategy based on pooled parental DNA for diagnosis and translational research in rare diseases. Mol Genet Genomic Med 2021; 9:e1836. [PMID: 34716697 PMCID: PMC8683640 DOI: 10.1002/mgg3.1836] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/31/2021] [Revised: 09/22/2021] [Accepted: 10/01/2021] [Indexed: 11/10/2022] Open
Abstract
Background Exome sequencing (ES) has become the most powerful and cost‐effective molecular tool for deciphering rare diseases, with a diagnostic yield approaching 30%–40% in solo‐ES and 50% in trio‐ES. We applied an innovative parental DNA pooling method to reduce the parental sequencing cost while maintaining the diagnostic yield of trio‐ES.
Methods We pooled six (Agilent‐CRE‐v2, 100X) or five (TWIST‐HCE, 70X) parental DNA samples, aiming to detect an allelic balance of around 8–10% for heterozygous status. The strategies were applied as second‐tier (74 individuals after negative solo‐ES) and first‐tier approaches (324 individuals without previous ES).
Results The allelic balance of parental‐pool variants was around 8.97%. Sanger sequencing uncovered false positives in 1.5% of sporadic variants. In the second‐tier approach, we estimated that two thirds of the Sanger validations performed after solo‐ES (41/59, 69%) would have been saved if the parental‐pool segregations had been available from the start. The parental‐pool strategy identified a causative diagnosis in 18/74 individuals (24%) in the second‐tier and in 116/324 individuals (36%) in the first‐tier approaches, including 19 genes newly associated with human disorders.
Conclusions Parental‐pooling is an efficient alternative to trio‐ES. It provides rapid segregation and extension to translational research while reducing the cost of parental and Sanger sequencing.
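The 8–10% target follows from simple arithmetic: a variant carried in the heterozygous state by one parent contributes one of the 2N alleles in a pool of N diploid parents. The sketch below works this out and, as a rough guide only, estimates how often such a variant would yield a minimum number of supporting reads. It assumes equimolar pooling, treats "100X" and "70X" as the read depth at the site, and uses a plain binomial sampling model. The function names and the five-read threshold are hypothetical and are not taken from the study.

```python
from math import comb


def expected_allelic_balance(n_parents, het_carriers=1):
    """Expected alt-allele read fraction in an equimolar pool of diploid parents."""
    if n_parents < 1:
        raise ValueError("n_parents must be at least 1")
    return het_carriers / (2 * n_parents)


def prob_at_least(k, depth, fraction):
    """P(X >= k) for X ~ Binomial(depth, fraction)."""
    below = sum(
        comb(depth, i) * fraction**i * (1 - fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - below


# Pool sizes and depths named in the abstract; min_reads is an illustrative choice.
min_reads = 5
for n_parents, depth in ((6, 100), (5, 70)):
    balance = expected_allelic_balance(n_parents)
    detection = prob_at_least(min_reads, depth, balance)
    print(f"{n_parents} parents at {depth}X: balance {balance:.1%}, "
          f"P(>= {min_reads} alt reads) = {detection:.3f}")
```

Six pooled parents give an expected balance of 1/12 and five give 1/10, which brackets the 8–10% range stated in the Methods.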
Affiliation(s)
- Frederic Tran Mau-Them
- Unité Fonctionnelle 6254 d'Innovation en Diagnostique Génomique des Maladies Rares, Pôle de Biologie, CHU Dijon Bourgogne, Dijon, France.,Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France
| | - Yannis Duffourd
- Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France.,FHU-TRANSLAD, Dijon, France
| | - Antonio Vitobello
- Unité Fonctionnelle 6254 d'Innovation en Diagnostique Génomique des Maladies Rares, Pôle de Biologie, CHU Dijon Bourgogne, Dijon, France.,Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France
| | - Ange-Line Bruel
- Unité Fonctionnelle 6254 d'Innovation en Diagnostique Génomique des Maladies Rares, Pôle de Biologie, CHU Dijon Bourgogne, Dijon, France.,Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France
| | - Anne-Sophie Denommé-Pichon
- Unité Fonctionnelle 6254 d'Innovation en Diagnostique Génomique des Maladies Rares, Pôle de Biologie, CHU Dijon Bourgogne, Dijon, France.,Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France
| | - Sophie Nambot
- Centre de Référence Maladies Rares « Anomalies du Développement et Syndrome Malformatifs » de l'Est, Hôpital d'Enfants, CHU Dijon Bourgogne, Dijon, France
- Julian Delanne
- Centre de Référence Maladies Rares « Anomalies du Développement et Syndrome Malformatifs » de l'Est, Hôpital d'Enfants, CHU Dijon Bourgogne, Dijon, France
- Sebastien Moutton
- Centre de Référence Maladies Rares « Anomalies du Développement et Syndrome Malformatifs » de l'Est, Hôpital d'Enfants, CHU Dijon Bourgogne, Dijon, France
- Arthur Sorlin
- Centre de Référence Maladies Rares « Anomalies du Développement et Syndrome Malformatifs » de l'Est, Hôpital d'Enfants, CHU Dijon Bourgogne, Dijon, France
-
- FHU-TRANSLAD, Dijon, France
- Victor Couturier
- Unité Fonctionnelle 6254 d'Innovation en Diagnostique Génomique des Maladies Rares, Pôle de Biologie, CHU Dijon Bourgogne, Dijon, France; Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France
- Valentin Bourgeois
- Unité Fonctionnelle 6254 d'Innovation en Diagnostique Génomique des Maladies Rares, Pôle de Biologie, CHU Dijon Bourgogne, Dijon, France; Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France
- Martin Chevarin
- Unité Fonctionnelle 6254 d'Innovation en Diagnostique Génomique des Maladies Rares, Pôle de Biologie, CHU Dijon Bourgogne, Dijon, France; Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France
- Charlotte Poe
- Unité Fonctionnelle 6254 d'Innovation en Diagnostique Génomique des Maladies Rares, Pôle de Biologie, CHU Dijon Bourgogne, Dijon, France; Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France
- Patrick Callier
- Laboratoire de Génétique Chromosomique et Moléculaire, CHU de Dijon, France
- Hana Safraou
- Unité Fonctionnelle 6254 d'Innovation en Diagnostique Génomique des Maladies Rares, Pôle de Biologie, CHU Dijon Bourgogne, Dijon, France; Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France
- Laurence Faivre
- Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France; Centre de Référence Maladies Rares « Anomalies du Développement et Syndrome Malformatifs » de l'Est, Hôpital d'Enfants, CHU Dijon Bourgogne, Dijon, France
- Christophe Philippe
- Unité Fonctionnelle 6254 d'Innovation en Diagnostique Génomique des Maladies Rares, Pôle de Biologie, CHU Dijon Bourgogne, Dijon, France; Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France
- Christel Thauvin-Robinet
- Unité Fonctionnelle 6254 d'Innovation en Diagnostique Génomique des Maladies Rares, Pôle de Biologie, CHU Dijon Bourgogne, Dijon, France; Inserm - Université de Bourgogne UMR1231 GAD, FHU-TRANSLAD, Dijon, France; FHU-TRANSLAD, Dijon, France; Centre de Référence Maladies Rares «Déficiences Intellectuelles de Causes Rares», Hôpital d'Enfants, CHU Dijon Bourgogne, Dijon, France
9
SNP Development in Penaeus vannamei via Next-Generation Sequencing and DNA Pool Sequencing. Fishes 2021. [DOI: 10.3390/fishes6030036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/13/2022]
Abstract
Next-generation sequencing and pool sequencing have been widely used in SNP (single-nucleotide polymorphism) detection and population genetics research; however, there are few reports on SNPs related to the growth of Penaeus vannamei. The purpose of this study was to call SNPs from rapid-growing (RG) and slow-growing (SG) individuals’ transcriptomes and use DNA pool sequencing to assess the reliability of SNPs. Two parameters were applied to detect SNPs. One parameter was the p-values generated using Fisher’s exact test, which were used to calculate the significance of allele frequency differences between RG and SG. The other one was the AFI (minor allele frequency imbalance), which was defined to highlight the fold changes in MAF (minor allele frequency) values between RG and SG. There were 216,015 hypothetical SNPs, which were obtained based on the transcriptome data. Finally, 104 high-quality SNPs and 96,819 low-quality SNPs were predicted. Then, 18 high-quality SNPs and 17 low-quality SNPs were selected to assess the reliability of the detection process. Here, 72.22% (13/18) accuracy was achieved for high-quality SNPs, while only 52.94% (9/17) accuracy was achieved for low-quality SNPs. These SNPs enrich the data for population genetics studies of P. vannamei and may play a role in the development of SNP markers for future breeding studies.
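The two screening parameters described in this abstract can be illustrated with a minimal sketch. It assumes that each group is summarized by read counts for the minor and major allele at one site, and that AFI is the ratio of the larger to the smaller minor allele frequency; the abstract does not give the exact formula, so that definition, the function names, and the example counts are all hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def allele_frequency_imbalance(minor_rg, depth_rg, minor_sg, depth_sg):
    """Fold change between the two groups' minor allele frequencies (assumed definition)."""
    maf_rg = minor_rg / depth_rg
    maf_sg = minor_sg / depth_sg
    low, high = sorted((maf_rg, maf_sg))
    # An allele absent from one group gives an unbounded fold change.
    return float("inf") if low == 0 else high / low


# Hypothetical read counts at one site: minor and major allele reads per group.
p = fisher_exact_two_sided(30, 70, 10, 90)
afi = allele_frequency_imbalance(30, 100, 10, 100)
print(f"p = {p:.3g}, AFI = {afi:.2f}")
```

A site would then be retained only if it passes both a p-value cutoff and an AFI cutoff; the thresholds used in the study are not stated in the abstract.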
10
Genomic and functional evaluation of TNFSF14 in multiple sclerosis susceptibility. J Genet Genomics 2021; 48:497-507. [PMID: 34353742 DOI: 10.1016/j.jgg.2021.03.017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/27/2020] [Revised: 02/24/2021] [Accepted: 03/05/2021] [Indexed: 11/24/2022]
Abstract
Among multiple sclerosis (MS) susceptibility genes, the strongest non-human leukocyte antigen (HLA) signal in the Italian population maps to the TNFSF14 gene encoding LIGHT, a glycoprotein involved in dendritic cell (DC) maturation. Through fine-mapping in a large Italian dataset (4,198 patients with MS and 3,903 controls), we show that the TNFSF14 intronic SNP rs1077667 is the primary MS-associated variant in the region. Expression quantitative trait locus (eQTL) analysis indicates that the MS risk allele is significantly associated with reduced TNFSF14 messenger RNA levels in blood cells, which is consistent with the allelic imbalance in RNA-Seq reads (P < 0.0001). The MS risk allele is associated with reduced levels of TNFSF14 gene expression (P < 0.01) in blood cells from 84 Italian patients with MS and 80 healthy controls (HCs). Interestingly, patients with MS are lower expressors of TNFSF14 compared to HCs (P < 0.007). Individuals homozygous for the MS risk allele display an increased percentage of LIGHT-positive peripheral blood myeloid DCs (CD11c+, P = 0.035) in 37 HCs, as well as in in vitro monocyte-derived DCs from 22 HCs (P = 0.04). Our findings suggest that the intronic variant rs1077667 alters the expression of TNFSF14 in immune cells, which may play a role in MS pathogenesis.
11
Detecting and phasing minor single-nucleotide variants from long-read sequencing data. Nat Commun 2021; 12:3032. [PMID: 34031367 PMCID: PMC8144375 DOI: 10.1038/s41467-021-23289-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 08/24/2020] [Accepted: 04/15/2021] [Indexed: 02/04/2023] Open
Abstract
Cellular genetic heterogeneity is common in many biological conditions including cancer, microbiome, and co-infection of multiple pathogens. Detecting and phasing minor variants play an instrumental role in deciphering cellular genetic heterogeneity, but they are still difficult tasks because of technological limitations. Recently, long-read sequencing technologies, including those by Pacific Biosciences and Oxford Nanopore, provide an opportunity to tackle these challenges. However, high error rates make it difficult to take full advantage of these technologies. To fill this gap, we introduce iGDA, an open-source tool that can accurately detect and phase minor single-nucleotide variants (SNVs), whose frequencies are as low as 0.2%, from raw long-read sequencing data. We also demonstrate that iGDA can accurately reconstruct haplotypes in closely related strains of the same species (divergence ≥0.011%) from long-read metagenomic data.
12
Lakhssassi N, Lopes-Caitar VS, Knizia D, Cullen MA, Badad O, El Baze A, Zhou Z, Embaby MG, Meksem J, Lakhssassi A, Chen P, AbuGhazaleh A, Vuong TD, Nguyen HT, Hewezi T, Meksem K. TILLING-by-Sequencing+ Reveals the Role of Novel Fatty Acid Desaturases (GmFAD2-2s) in Increasing Soybean Seed Oleic Acid Content. Cells 2021; 10:1245. [PMID: 34069320 PMCID: PMC8158723 DOI: 10.3390/cells10051245] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/26/2021] [Revised: 05/15/2021] [Accepted: 05/16/2021] [Indexed: 11/17/2022] Open
Abstract
Soybean is the second largest source of oil worldwide. Developing soybean varieties with high levels of oleic acid is a primary goal of soybean breeders and the soybean industry. Edible oils with a high level of oleic acid and a low level of linoleic acid are considered to have higher oxidative stability and can be used as a natural antioxidant to improve food stability. All developed high oleic acid soybeans carry two alleles: GmFAD2-1A and GmFAD2-1B. However, a possible reduction in seed germination has been reported when soybeans whose high seed oleic acid content derives from GmFAD2-1 alleles are planted in cold soil. Besides the soybean fatty acid desaturase GmFAD2-1 subfamily, there is a GmFAD2-2 subfamily composed of five members: GmFAD2-2A, GmFAD2-2B, GmFAD2-2C, GmFAD2-2D, and GmFAD2-2E. Segmental duplications of GmFAD2-1A/GmFAD2-1B, GmFAD2-2A/GmFAD2-2C, GmFAD2-2A/GmFAD2-2D, and GmFAD2-2D/GmFAD2-2C occurred about 10.65, 27.04, 100.81, and 106.55 Mya, respectively. Using TILLING-by-Sequencing+ technology, we successfully identified 12, 8, 10, 9, and 19 EMS mutants at the GmFAD2-2A, GmFAD2-2B, GmFAD2-2C, GmFAD2-2D, and GmFAD2-2E genes, respectively. Functional analyses of the newly identified mutants revealed an unprecedented role of the five GmFAD2-2 members (GmFAD2-2A, GmFAD2-2B, GmFAD2-2C, GmFAD2-2D, and GmFAD2-2E) in controlling seed oleic acid content. Most importantly, unlike GmFAD2-1 members, members of the GmFAD2-2 subfamily showed a cytoplasmic subcellular localization, which may suggest the presence of an alternative fatty acid desaturase pathway in soybean for converting oleic acid without substantially altering the traditional plastidial/ER fatty acid production.
Affiliation(s)
- Naoufal Lakhssassi
- Department of Plant, Soil and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (D.K.); (M.A.C.); (O.B.); (A.E.B.); (Z.Z.)
- Dounya Knizia
- Department of Plant, Soil and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (D.K.); (M.A.C.); (O.B.); (A.E.B.); (Z.Z.)
- Mallory A. Cullen
- Department of Plant, Soil and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (D.K.); (M.A.C.); (O.B.); (A.E.B.); (Z.Z.)
- Oussama Badad
- Department of Plant, Soil and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (D.K.); (M.A.C.); (O.B.); (A.E.B.); (Z.Z.)
- Abdelhalim El Baze
- Department of Plant, Soil and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (D.K.); (M.A.C.); (O.B.); (A.E.B.); (Z.Z.)
- Zhou Zhou
- Department of Plant, Soil and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (D.K.); (M.A.C.); (O.B.); (A.E.B.); (Z.Z.)
- Mohamed G. Embaby
- Department of Animal Science, Food, and Nutrition, Southern Illinois University, Carbondale, IL 62901, USA; (M.G.E.); (A.A.)
- Jonas Meksem
- Trinity College of Arts and Sciences, Duke University, Durham, NC 27708, USA
- Aicha Lakhssassi
- Faculty of Sciences and Technologies, University of Lorraine, 54506 Nancy, France
- Pengyin Chen
- Division of Plant Sciences, University of Missouri, Columbia, MO 65211, USA; (P.C.); (T.D.V.); (H.T.N.)
- Amer AbuGhazaleh
- Department of Animal Science, Food, and Nutrition, Southern Illinois University, Carbondale, IL 62901, USA; (M.G.E.); (A.A.)
- Tri D. Vuong
- Division of Plant Sciences, University of Missouri, Columbia, MO 65211, USA; (P.C.); (T.D.V.); (H.T.N.)
- Henry T. Nguyen
- Division of Plant Sciences, University of Missouri, Columbia, MO 65211, USA; (P.C.); (T.D.V.); (H.T.N.)
- Tarek Hewezi
- Department of Plant Sciences, University of Tennessee, Knoxville, TN 37996, USA; (V.S.L.-C.); (T.H.)
- Khalid Meksem
- Department of Plant, Soil and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (D.K.); (M.A.C.); (O.B.); (A.E.B.); (Z.Z.)
13
Guirao‐Rico S, González J. Benchmarking the performance of Pool-seq SNP callers using simulated and real sequencing data. Mol Ecol Resour 2021; 21:1216-1229. [PMID: 33534960 PMCID: PMC8251607 DOI: 10.1111/1755-0998.13343] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/24/2020] [Revised: 12/21/2020] [Accepted: 01/27/2021] [Indexed: 12/13/2022]
Abstract
Population genomics is a fast-developing discipline with promising applications in a growing number of life sciences fields. Advances in sequencing technologies and bioinformatics tools allow population genomics to exploit genome-wide information to identify the molecular variants underlying traits of interest and the evolutionary forces that modulate these variants through space and time. However, the cost of genomic analyses of multiple populations is still too high to address them through individual genome sequencing. Pooling individuals for sequencing can be a more effective strategy in Single Nucleotide Polymorphism (SNP) detection and allele frequency estimation because of a higher total coverage. However, compared to individual sequencing, SNP calling from pools has the additional difficulty of distinguishing rare variants from sequencing errors, which is often avoided by establishing a minimum threshold allele frequency for the analysis. Finding an optimal balance between minimizing information loss and reducing sequencing costs is essential to ensure the success of population genomics studies. Here, we have benchmarked the performance of SNP callers for Pool-seq data, based on different approaches, under different conditions, and using computer simulations and real data. We found that SNP caller performance varied for allele frequencies up to 0.35. We also found that SNP callers based on Bayesian (SNAPE-pooled) or maximum likelihood (MAPGD) approaches outperform the two heuristic callers tested (VarScan and PoolSNP) in terms of the balance between sensitivity and FDR, in both simulated and real sequencing data. Our results will help inform the selection of the most appropriate SNP caller not only for large-scale population studies but also in cases where the Pool-seq strategy is the only option, such as in metagenomic or polyploid studies.
Affiliation(s)
- Sara Guirao‐Rico
- Institute of Evolutionary Biology, CSIC‐Universitat Pompeu Fabra, Barcelona, Spain
- Josefa González
- Institute of Evolutionary Biology, CSIC‐Universitat Pompeu Fabra, Barcelona, Spain
14
Lakhssassi N, Zhou Z, Cullen MA, Badad O, El Baze A, Chetto O, Embaby MG, Knizia D, Liu S, Neves LG, Meksem K. TILLING-by-Sequencing+ to Decipher Oil Biosynthesis Pathway in Soybeans: A New and Effective Platform for High-Throughput Gene Functional Analysis. Int J Mol Sci 2021; 22:4219. [PMID: 33921707 PMCID: PMC8073088 DOI: 10.3390/ijms22084219] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/21/2021] [Revised: 04/08/2021] [Accepted: 04/13/2021] [Indexed: 12/25/2022] Open
Abstract
Reverse genetic approaches have been widely applied to study gene function in crop species; however, these techniques, including gel-based TILLING, are inefficient at characterizing genes in soybeans because of genome complexity, gene duplication, and the presence of multiple gene family members that share high homology in their DNA sequence. Chemical mutagenesis emerges as a genetically modified-free strategy to produce large-scale soybean mutant populations for the improvement of economically important traits. The current study uses an optimized high-throughput TILLING by target capture sequencing technology, or TILLING-by-Sequencing+ (TbyS+), coupled with universal bioinformatic tools to identify population-wide mutations in soybeans. Four ethyl methanesulfonate mutagenized populations (4032 mutant families) have been screened for the presence of induced mutations in targeted genes. The mutation types and effects have been characterized for a total of 138 soybean genes involved in soybean seed composition, disease resistance, and many other quality traits. To test the efficiency of TbyS+ in complex genomes, we used soybeans as a model with a focus on three desaturase gene families, GmSACPD, GmFAD2, and GmFAD3, that are involved in the soybean fatty acid biosynthesis pathway. We successfully isolated mutants from all six gene family members. Unsurprisingly, most of the characterized mutants showed significant changes in their stearic, oleic, or linolenic acid content. By using TbyS+, we discovered novel sources of soybean oil traits, including high saturated and monounsaturated fatty acids in addition to low polyunsaturated fatty acid contents. This technology provides an unprecedented platform for highly effective screening of polyploid mutant populations and functional gene analysis. The soybean mutants obtained in this study can be used in subsequent soybean breeding programs for improved oil composition traits.
Affiliation(s)
- Naoufal Lakhssassi
- Department of Plant, Soil, and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (Z.Z.); (M.A.C.); (O.B.); (A.E.B.); (O.C.); (D.K.); (S.L.)
- Zhou Zhou
- Department of Plant, Soil, and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (Z.Z.); (M.A.C.); (O.B.); (A.E.B.); (O.C.); (D.K.); (S.L.)
- Mallory A. Cullen
- Department of Plant, Soil, and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (Z.Z.); (M.A.C.); (O.B.); (A.E.B.); (O.C.); (D.K.); (S.L.)
- Oussama Badad
- Department of Plant, Soil, and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (Z.Z.); (M.A.C.); (O.B.); (A.E.B.); (O.C.); (D.K.); (S.L.)
- Abdelhalim El Baze
- Department of Plant, Soil, and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (Z.Z.); (M.A.C.); (O.B.); (A.E.B.); (O.C.); (D.K.); (S.L.)
- Oumaima Chetto
- Department of Plant, Soil, and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (Z.Z.); (M.A.C.); (O.B.); (A.E.B.); (O.C.); (D.K.); (S.L.)
- Mohamed G. Embaby
- Department of Animal Science, Food, and Nutrition, Southern Illinois University, Carbondale, IL 62901, USA
- Dounya Knizia
- Department of Plant, Soil, and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (Z.Z.); (M.A.C.); (O.B.); (A.E.B.); (O.C.); (D.K.); (S.L.)
- Shiming Liu
- Department of Plant, Soil, and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (Z.Z.); (M.A.C.); (O.B.); (A.E.B.); (O.C.); (D.K.); (S.L.)
- Khalid Meksem
- Department of Plant, Soil, and Agricultural Systems, Southern Illinois University, Carbondale, IL 62901, USA; (N.L.); (Z.Z.); (M.A.C.); (O.B.); (A.E.B.); (O.C.); (D.K.); (S.L.)
15
Dudley JN, Hong CS, Hawari MA, Shwetar J, Sapp JC, Lack J, Shiferaw H, Johnston JJ, Biesecker LG. Low-level variant calling for non-matched samples using a position-based and nucleotide-specific approach. BMC Bioinformatics 2021; 22:181. [PMID: 33832433 PMCID: PMC8028235 DOI: 10.1186/s12859-021-04090-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2020] [Accepted: 03/18/2021] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND The widespread use of next-generation sequencing has identified an important role for somatic mosaicism in many diseases. However, detecting low-level mosaic variants from next-generation sequencing data remains challenging. RESULTS Here, we present a method for Position-Based Variant Identification (PBVI) that uses empirically derived distributions of alternate nucleotides from a control dataset. We modeled this approach on 11 segmental overgrowth genes. We show that this method improves detection of single nucleotide mosaic variants of 0.01–0.05 variant allele fraction compared to other low-level variant callers. At depths of 600× and 1200×, we observed >85% and >95% sensitivity, respectively. In a cohort of 26 individuals with somatic overgrowth disorders, PBVI showed improved signal to noise, identifying pathogenic variants in 17 individuals. CONCLUSION PBVI can facilitate identification of low-level mosaic variants, thus increasing the utility of next-generation sequencing data for research and diagnostic purposes.
Affiliation(s)
- Jeffrey N Dudley
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive Room 5140, Bethesda, MD, 20892, USA
- Celine S Hong
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive Room 5140, Bethesda, MD, 20892, USA
- Marwan A Hawari
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive Room 5140, Bethesda, MD, 20892, USA
- Jasmine Shwetar
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive Room 5140, Bethesda, MD, 20892, USA
- Julie C Sapp
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive Room 5140, Bethesda, MD, 20892, USA
- Justin Lack
- NIAID Collaborative Bioinformatics Resource, National Institutes of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA; Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Henoke Shiferaw
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive Room 5140, Bethesda, MD, 20892, USA
- Jennifer J Johnston
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive Room 5140, Bethesda, MD, 20892, USA
- Leslie G Biesecker
- National Human Genome Research Institute, National Institutes of Health, 50 South Drive Room 5140, Bethesda, MD, 20892, USA
16
Bertolini F, Ribani A, Capoccioni F, Buttazzoni L, Utzeri VJ, Bovo S, Schiavo G, Caggiano M, Rothschild MF, Fontanesi L. A comparative whole genome sequencing analysis identified a candidate locus for lack of operculum in cultivated gilthead seabream (Sparus aurata). Anim Genet 2021; 52:365-370. [PMID: 33609290 DOI: 10.1111/age.13049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 02/01/2021] [Indexed: 01/29/2023]
Abstract
The gilthead seabream (Sparus aurata, Sparidae family) is commonly used for aquaculture. Despite its great economic value, several problems in its cultivation remain. One of the major concerns is the high frequency of morphological abnormalities occurring during the early developmental stages. Partial and/or total lack of operculum is the most frequent anomaly affecting the fish cranial region. The existence of genetic factors that can at least partially determine this defect has been hypothesized. In this work, two DNA pools of highly related fry, one composed of normal-looking (control) fish and the other lacking an operculum (case), were constructed and whole-genome resequencing data produced from the two were compared. The analysis revealed a 1 Mb region on chromosome 2 with higher heterozygosity in the lack of operculum DNA pool than in the control DNA pool, consistent with the enrichment, in the first DNA pool, of one or more haplotypes causing or predisposing to the defect together with other normal haplotypes. A window-based FST analysis between the two DNA pools indicated that the same region had the highest divergence score. This region contained 2921 SNVs, 10 of which, with predicted high impacts (three splice donor and seven stop-gained variants), were detected in novel genes that are homologous to calcium-sensing receptor-like genes, probably involved in bone development. Other studies are needed to clarify the genetic mechanisms involved in predisposing fry to this deformity and then to identify associated markers that could be used in breeding programs to reduce the frequency of this defect in the broodstock.
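A window-based FST comparison between two DNA pools, as mentioned in this abstract, can be sketched as follows. The abstract does not say which FST estimator or window size was used, so this sketch assumes a simple heterozygosity-based estimator, (HT - HS) / HT, aggregated per window as a ratio of sums, with non-overlapping windows; the function name and the allele frequencies are hypothetical, and no correction for pool size or read depth is applied.

```python
def window_fst(positions, freqs_case, freqs_control, window_size):
    """Return {window_start: FST} for two pools from per-SNP allele frequencies."""
    numerator, denominator = {}, {}
    for pos, p1, p2 in zip(positions, freqs_case, freqs_control):
        start = (pos // window_size) * window_size
        # Mean expected heterozygosity within the two pools.
        hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
        # Expected heterozygosity of the combined pool.
        p_mean = (p1 + p2) / 2
        ht = 2 * p_mean * (1 - p_mean)
        numerator[start] = numerator.get(start, 0.0) + (ht - hs)
        denominator[start] = denominator.get(start, 0.0) + ht
    # Windows without any variation in the combined pool get FST = 0.
    return {
        w: (numerator[w] / denominator[w] if denominator[w] > 0 else 0.0)
        for w in denominator
    }


# Hypothetical allele frequencies at three SNPs on one chromosome.
fst = window_fst(
    positions=[100, 200, 1_500_000],
    freqs_case=[0.5, 0.5, 1.0],
    freqs_control=[0.5, 0.5, 0.0],
    window_size=1_000_000,
)
top_window = max(fst, key=fst.get)
print(top_window, fst[top_window])
```

The window with the highest value is the candidate region; in practice the per-SNP frequencies would come from read counts in each pool.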
Affiliation(s)
- F Bertolini
- National Institute of Aquatic Resources, Technical University of Denmark, Kongens Lyngby, 2800, Denmark
- A Ribani
- Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, Viale Giuseppe Fanin 46, Bologna, 40127, Italy
- F Capoccioni
- Centro di ricerca 'Zootecnia e Acquacoltura', Consiglio per la Ricerca in Agricoltura e L'Analisi dell'Economia Agraria, Via Salaria 31, Monterotondo, Roma, 00015, Italy
- L Buttazzoni
- Centro di ricerca 'Zootecnia e Acquacoltura', Consiglio per la Ricerca in Agricoltura e L'Analisi dell'Economia Agraria, Via Salaria 31, Monterotondo, Roma, 00015, Italy
- V J Utzeri
- Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, Viale Giuseppe Fanin 46, Bologna, 40127, Italy
- S Bovo
- Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, Viale Giuseppe Fanin 46, Bologna, 40127, Italy
- G Schiavo
- Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, Viale Giuseppe Fanin 46, Bologna, 40127, Italy
- M Caggiano
- Panittica Italia Società Agricola Srl, Torre Canne di Fasano, Brindisi, 72016, Italy
- M F Rothschild
- Department of Animal Science, Iowa State University, Ames, IA, 50011-3150, USA
- L Fontanesi
- Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, Viale Giuseppe Fanin 46, Bologna, 40127, Italy
17
Bovo S, Schiavo G, Ribani A, Utzeri VJ, Taurisano V, Ballan M, Muñoz M, Alves E, Araujo JP, Bozzi R, Charneca R, Di Palma F, Djurkin Kušec I, Etherington G, Fernandez AI, García F, García-Casco J, Karolyi D, Gallo M, Martins JM, Mercat MJ, Núñez Y, Quintanilla R, Radović Č, Razmaite V, Riquet J, Savić R, Škrlep M, Usai G, Zimmer C, Ovilo C, Fontanesi L. Describing variability in pig genes involved in coronavirus infections for a One Health perspective in conservation of animal genetic resources. Sci Rep 2021; 11:3359. [PMID: 33564056 PMCID: PMC7873263 DOI: 10.1038/s41598-021-82956-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/14/2020] [Accepted: 01/25/2021] [Indexed: 02/08/2023] Open
Abstract
Coronaviruses silently circulate in human and animal populations, causing mild to severe diseases. Therefore, livestock are important components of a “One Health” perspective aimed to control these viral infections. However, at present there is no example that considers pig genetic resources in this context. In this study, we investigated the variability of four genes (ACE2, ANPEP and DPP4 encoding for host receptors of the viral spike proteins and TMPRSS2 encoding for a host proteinase) in 23 European (19 autochthonous and three commercial breeds and one wild boar population) and two Asian Sus scrofa populations. A total of 2229 variants were identified in the four candidate genes: 26% of them were not previously described; 29 variants affected the protein sequence and might potentially interact with the infection mechanisms. The results coming from this work are a first step towards a “One Health” perspective that should consider conservation programs of pig genetic resources with twofold objectives: (i) genetic resources could be reservoirs of host gene variability useful to design selection programs to increase resistance to coronaviruses; (ii) the described variability in genes involved in coronavirus infections across many different pig populations might be part of a risk assessment including pig genetic resources.
Affiliation(s)
- Samuele Bovo
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
- Giuseppina Schiavo
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
- Anisa Ribani
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
- Valerio J Utzeri
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
- Valeria Taurisano
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
- Mohamad Ballan
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
- Maria Muñoz
- Departamento Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Crta. de la Coruña, km. 7, 5, 28040, Madrid, Spain
- Estefania Alves
- Departamento Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Crta. de la Coruña, km. 7, 5, 28040, Madrid, Spain
- Jose P Araujo
- Centro de Investigação de Montanha (CIMO), Instituto Politécnico de Viana do Castelo, Escola Superior Agrária, Refóios do Lima, 4990-706, Ponte de Lima, Portugal
- Riccardo Bozzi
- DAGRI - Animal Science Section, University of Florence, Via delle Cascine 5, 50144, Florence, Italy
- Rui Charneca
- MED - Mediterranean Institute for Agriculture, Environment and Development, Universidade de Évora, Pólo da Mitra, Apartado 94, 7006-554, Évora, Portugal
- Federica Di Palma
- Biodiversity School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, Norfolk, NR47UH, UK
- Ivona Djurkin Kušec
- Faculty of Agrobiotechnical Sciences Osijek, Josip Juraj Strossmayer University of Osijek, Vladimira Preloga 1, 31000, Osijek, Croatia
- Graham Etherington
- Earlham Institute, Norwich Research Park, Colney Lane, Norwich, Norfolk, NR47UZ, UK
- Ana I Fernandez
- Departamento Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Crta. de la Coruña, km. 7, 5, 28040, Madrid, Spain
- Fabián García
- Departamento Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Crta. de la Coruña, km. 7, 5, 28040, Madrid, Spain
- Juan García-Casco
- Departamento Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Crta. de la Coruña, km. 7, 5, 28040, Madrid, Spain
- Danijel Karolyi
- Department of Animal Science, Faculty of Agriculture, University of Zagreb, Svetošimunska c. 25, 10000, Zagreb, Croatia
- Maurizio Gallo
- Associazione Nazionale Allevatori Suini (ANAS), Via Nizza 53, 00198, Rome, Italy
- José Manuel Martins
- MED - Mediterranean Institute for Agriculture, Environment and Development, Universidade de Évora, Pólo da Mitra, Apartado 94, 7006-554, Évora, Portugal
- Marie-José Mercat
- IFIP Institut du porc, La Motte au Vicomte, BP 35104, 35651, Le Rheu Cedex, France
- Yolanda Núñez
- Departamento Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Crta. de la Coruña, km. 7, 5, 28040, Madrid, Spain
- Raquel Quintanilla
- Programa de Genética y Mejora Animal, Institute for Research and Technology in Food and Agriculture (IRTA), Torre Marimon, 08140, Caldes de Montbui, Barcelona, Spain
- Čedomir Radović
- Department of Pig Breeding and Genetics, Institute for Animal Husbandry, 11080, Belgrade-Zemun, Serbia
- Violeta Razmaite
- Animal Science Institute, Lithuanian University of Health Sciences, Baisogala, Lithuania
- Juliette Riquet
- Génétique Physiologie et Systèmes d'Elevage (GenPhySE), Université de Toulouse, INRA, Chemin de Borde-Rouge 24, Auzeville Tolosane, 31326, Castanet Tolosan, France
- Radomir Savić
- Faculty of Agriculture, University of Belgrade, Nemanjina 6, 11080, Belgrade-Zemun, Serbia
- Martin Škrlep
- Kmetijski Inštitut Slovenije, Hacquetova 17, 1000, Ljubljana, Slovenia
- Graziano Usai
- AGRIS SARDEGNA, Loc. Bonassai, 07100, Sassari, Italy
- Christoph Zimmer
- Bäuerliche Erzeugergemeinschaft Schwäbisch Hall, Schwäbisch Hall, Germany
- Cristina Ovilo
- Departamento Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Crta. de la Coruña, km. 7, 5, 28040, Madrid, Spain
- Luca Fontanesi
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
|
18
|
Gil J, Andrade-Martínez JS, Duitama J. Accurate, Efficient and User-Friendly Mutation Calling and Sample Identification for TILLING Experiments. Front Genet 2021; 12:624513. [PMID: 33613641 PMCID: PMC7886796 DOI: 10.3389/fgene.2021.624513] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/31/2020] [Accepted: 01/08/2021] [Indexed: 11/13/2022] Open
Abstract
TILLING (Targeting Induced Local Lesions IN Genomes) is a powerful reverse genetics method in plant functional genomics and breeding to identify mutagenized individuals with improved behavior for a trait of interest. Pooled high throughput sequencing (HTS) of the targeted genes allows efficient identification and sample assignment of variants within genes of interest in hundreds of individuals. Although TILLING has been used successfully in different crops and even applied to natural populations, one of the main issues for a successful TILLING experiment is that most currently available bioinformatics tools for variant detection are not designed to identify mutations with low frequencies in pooled samples or to perform sample identification from variants identified in overlapping pools. Our research group maintains the Next Generation Sequencing Experience Platform (NGSEP), an open source solution for analysis of HTS data. In this manuscript, we present three novel components within NGSEP to facilitate the design and analysis of TILLING experiments: a pooled variants detector, a sample identifier from variants detected in overlapping pools and a simulator of TILLING experiments. A new implementation of the NGSEP calling model for variant detection allows accurate detection of low frequency mutations within pools. The samples identifier implements the process to triangulate the mutations called within overlapping pools in order to assign mutations to single individuals whenever possible. Finally, we developed a complete simulator of TILLING experiments to enable benchmarking of different tools and to facilitate the design of experimental alternatives varying the number of pools and individuals per pool. Simulation experiments based on genes from the common bean genome indicate that NGSEP provides similar accuracy and better efficiency than other tools to perform pooled variants detection. 
To the best of our knowledge, NGSEP is currently the only tool that generates individual assignments of the mutations discovered from the pooled data. We expect that this development will be of great use for different groups implementing TILLING as an alternative for plant breeding and even to research groups performing pooled sequencing for other applications.
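The triangulation idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the NGSEP implementation: the function name `assign_mutations`, the 3 x 3 row/column pooling design, and the simplifying assumption that each mutation has a single carrier are all hypothetical and chosen only for illustration.

```python
def assign_mutations(pool_members, calls):
    """Assign each mutation to one sample when the pools allow it.

    pool_members: dict mapping pool id -> set of sample ids in that pool.
    calls: dict mapping mutation id -> set of pool ids where it was called.
    Returns a dict mutation id -> sample id, or None when no unique
    sample is consistent with the calls (assumes a single carrier).
    """
    all_samples = set().union(*pool_members.values())
    assignments = {}
    for mutation, positive_pools in calls.items():
        candidates = set(all_samples)
        for pool, members in pool_members.items():
            if pool in positive_pools:
                candidates &= members   # carrier must be in every positive pool
            else:
                candidates -= members   # and in none of the negative pools
        assignments[mutation] = next(iter(candidates)) if len(candidates) == 1 else None
    return assignments


# Hypothetical design: 9 samples on a 3 x 3 grid, one pool per row and column.
pools = {}
for r in range(3):
    pools[f"R{r}"] = {f"s{r * 3 + c}" for c in range(3)}
for c in range(3):
    pools[f"C{c}"] = {f"s{r * 3 + c}" for r in range(3)}

calls = {
    "m1": {"R1", "C2"},        # one row and one column: resolvable
    "m2": {"R0", "R1", "C0"},  # two rows: not resolvable under the assumption
}
result = assign_mutations(pools, calls)
```

Under this design a mutation called in exactly one row pool and one column pool maps to the sample at their intersection; any other pattern is left unassigned rather than guessed.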
Affiliation(s)
- Juanita Gil
- Systems and Computing Engineering Department, Universidad de Los Andes, Bogotá, Colombia
| | - Juan Sebastian Andrade-Martínez
- Research Group on Computational Biology and Microbial Ecology, Department of Biological Sciences, Universidad de Los Andes, Bogotá, Colombia; Max Planck Tandem Group in Computational Biology, Universidad de Los Andes, Bogotá, Colombia
| | - Jorge Duitama
- Systems and Computing Engineering Department, Universidad de Los Andes, Bogotá, Colombia
|
19
|
Hawliczek A, Bolibok L, Tofil K, Borzęcka E, Jankowicz-Cieślak J, Gawroński P, Kral A, Till BJ, Bolibok-Brągoszewska H. Deep sampling and pooled amplicon sequencing reveals hidden genic variation in heterogeneous rye accessions. BMC Genomics 2020; 21:845. [PMID: 33256606 PMCID: PMC7706248 DOI: 10.1186/s12864-020-07240-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/20/2020] [Accepted: 11/18/2020] [Indexed: 01/16/2023] Open
Abstract
BACKGROUND Loss of genetic variation negatively impacts breeding efforts and food security. Genebanks house over 7 million accessions representing vast allelic diversity that is a resource for sustainable breeding. Discovery of DNA variations is an important step in the efficient use of these resources. While technologies have improved and costs dropped, it remains impractical to consider resequencing millions of accessions. Candidate genes are known for most agronomic traits, providing a list of high priority targets. Heterogeneity in seed stocks means that multiple samples from an accession need to be evaluated to recover available alleles. To address this we developed a pooled amplicon sequencing approach and applied it to the out-crossing cereal rye (Secale cereale L.). RESULTS Using the amplicon sequencing approach 95 rye accessions of different improvement status and worldwide origin, each represented by a pooled sample comprising DNA of 96 individual plants, were evaluated for sequence variation in six candidate genes with significant functions on biotic and abiotic stress resistance, and seed quality. Seventy-four predicted deleterious variants were identified using multiple algorithms. Rare variants were recovered including those found only in a low percentage of seed. CONCLUSIONS We conclude that this approach provides a rapid and flexible method for evaluating stock heterogeneity, probing allele diversity, and recovering previously hidden variation. A large extent of within-population heterogeneity revealed in the study provides an important point for consideration during rye germplasm conservation and utilization efforts.
Affiliation(s)
- Anna Hawliczek
- Department of Plant Genetics, Breeding and Biotechnology, Institute of Biology, Warsaw University of Life Sciences - SGGW, Warsaw, Poland
| | - Leszek Bolibok
- Department of Silviculture, Institute of Forest Sciences, Warsaw University of Life Sciences - SGGW, Warsaw, Poland
| | - Katarzyna Tofil
- Department of Plant Genetics, Breeding and Biotechnology, Institute of Biology, Warsaw University of Life Sciences - SGGW, Warsaw, Poland
| | - Ewa Borzęcka
- Department of Plant Genetics, Breeding and Biotechnology, Institute of Biology, Warsaw University of Life Sciences - SGGW, Warsaw, Poland
| | - Joanna Jankowicz-Cieślak
- Plant Breeding and Genetics Laboratory, Joint FAO/IAEA Division of Nuclear Techniques in Food and Agriculture, IAEA Laboratories Seibersdorf, International Atomic Energy Agency, Vienna International Centre, Vienna, Austria
| | - Piotr Gawroński
- Department of Plant Genetics, Breeding and Biotechnology, Institute of Biology, Warsaw University of Life Sciences - SGGW, Warsaw, Poland
| | - Adam Kral
- Department of Plant Genetics, Breeding and Biotechnology, Institute of Biology, Warsaw University of Life Sciences - SGGW, Warsaw, Poland
| | - Bradley J Till
- Plant Breeding and Genetics Laboratory, Joint FAO/IAEA Division of Nuclear Techniques in Food and Agriculture, IAEA Laboratories Seibersdorf, International Atomic Energy Agency, Vienna International Centre, Vienna, Austria.
- Veterinary Genetics Laboratory, University of California, Davis, Davis, California, USA.
| | - Hanna Bolibok-Brągoszewska
- Department of Plant Genetics, Breeding and Biotechnology, Institute of Biology, Warsaw University of Life Sciences - SGGW, Warsaw, Poland.
|
20
|
Genetic Variation Bias toward Noncoding Regions and Secreted Proteins in the Rice Blast Fungus Magnaporthe oryzae. mSystems 2020; 5:e00346-20. [PMID: 32606028 PMCID: PMC7329325 DOI: 10.1128/msystems.00346-20] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022] Open
Abstract
The genomes of plant pathogens are highly variable and plastic. Pathogen gene repertoires change quickly with the plant environment, which results in a rapid loss of plant resistance shortly after the pathogen emerges in the field. Extensive studies have evaluated natural pathogen populations to understand their evolutionary effects; however, the number of studies that have examined the dynamic processes of the mutation and adaptation of plant pathogens to host plants remains limited. Here, we applied experimental evolution and high-throughput pool sequencing to Magnaporthe oryzae, a fungal pathogen that causes massive losses in rice production, to observe the evolution of genome variation. We found that mutations, including single-nucleotide variants (SNVs), insertions and deletions (indels), and transposable element (TE) insertions, accumulated very rapidly throughout the genome of M. oryzae during sequential plant inoculation and preferentially in noncoding regions, while such mutations were not frequently found in coding regions. However, we also observed that new TE insertions accumulated with time and preferentially accumulated at the proximal region of secreted protein (SP) coding genes in M. oryzae populations. Taken together, these results revealed a bias in genetic variation toward noncoding regions and SP genes in M. oryzae and may contribute to the rapid adaptive evolution of the blast fungal effectors under host selection. IMPORTANCE Plants "lose" resistance toward pathogens shortly after their widespread emergence in the field because plant pathogens mutate and adapt rapidly under resistance selection. Thus, the rapid evolution of pathogens is a serious threat to plant health. Extensive studies have evaluated natural pathogen populations to understand their evolutionary effects; however, the study of the dynamic processes of the mutation and adaptation of plant pathogens to host plants remains limited. 
Here, by performing an experimental evolution study, we found a bias in genetic variation toward noncoding regions and SPs in the rice blast fungus Magnaporthe oryzae, which explains the ability of the rice blast fungus to maintain high virulence variation to overcome rice resistance in the field.
|
21
|
Bovo S, Ribani A, Muñoz M, Alves E, Araujo JP, Bozzi R, Čandek-Potokar M, Charneca R, Di Palma F, Etherington G, Fernandez AI, García F, García-Casco J, Karolyi D, Gallo M, Margeta V, Martins JM, Mercat MJ, Moscatelli G, Núñez Y, Quintanilla R, Radović Č, Razmaite V, Riquet J, Savić R, Schiavo G, Usai G, Utzeri VJ, Zimmer C, Ovilo C, Fontanesi L. Whole-genome sequencing of European autochthonous and commercial pig breeds allows the detection of signatures of selection for adaptation of genetic resources to different breeding and production systems. Genet Sel Evol 2020; 52:33. [PMID: 32591011 PMCID: PMC7318759 DOI: 10.1186/s12711-020-00553-7] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 11/10/2019] [Accepted: 06/17/2020] [Indexed: 12/21/2022] Open
Abstract
Background Natural and artificial directional selection in cosmopolitan and autochthonous pig breeds and wild boars have shaped their genomes and resulted in a reservoir of animal genetic diversity. Signatures of selection are the result of these selection events that have contributed to the adaptation of breeds to different environments and production systems. In this study, we analysed the genome variability of 19 European autochthonous pig breeds (Alentejana, Bísara, Majorcan Black, Basque, Gascon, Apulo-Calabrese, Casertana, Cinta Senese, Mora Romagnola, Nero Siciliano, Sarda, Krškopolje pig, Black Slavonian, Turopolje, Moravka, Swallow-Bellied Mangalitsa, Schwäbisch-Hällisches Schwein, Lithuanian indigenous wattle and Lithuanian White old type) from nine countries, three European commercial breeds (Italian Large White, Italian Landrace and Italian Duroc), and European wild boars, by mining whole-genome sequencing data obtained by using a DNA-pool sequencing approach. Signatures of selection were identified by using a single-breed approach with two statistics [within-breed pooled heterozygosity (HP) and fixation index (FST)] and group-based FST approaches, which compare groups of breeds defined according to external traits and use/specialization/type. Results We detected more than 22 million single nucleotide polymorphisms (SNPs) across the 23 compared populations and identified 359 chromosome regions showing signatures of selection. These regions harbour genes that are already known or new genes that are under selection and relevant for the domestication process in this species, and that affect several morphological and physiological traits (e.g. coat colours and patterns, body size, number of vertebrae and teats, ear size and conformation, reproductive traits, growth and fat deposition traits). 
Wild boar related signatures of selection were detected across all the genome of several autochthonous breeds, which suggests that crossbreeding (accidental or deliberate) occurred with wild boars. Conclusions Our findings provide a catalogue of genetic variants of many European pig populations and identify genome regions that can explain, at least in part, the phenotypic diversity of these genetic resources.
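As a rough illustration of the within-breed pooled heterozygosity (HP) statistic named in this abstract, the sketch below uses the definition commonly applied to DNA-pool data: for a genomic window, HP = 2 * sum(n_major) * sum(n_minor) / (sum(n_major) + sum(n_minor))^2, where the sums run over the major- and minor-allele read counts of the SNPs in the window. The abstract does not give the formula, window size, or filters, so the function name and the input layout here are assumptions.

```python
def pooled_heterozygosity(snp_counts):
    """Pooled heterozygosity (HP) for one genomic window.

    snp_counts: iterable of (count_a, count_b) read counts, one pair
    per biallelic SNP in the window (hypothetical input layout).
    """
    pairs = list(snp_counts)
    sum_major = sum(max(a, b) for a, b in pairs)
    sum_minor = sum(min(a, b) for a, b in pairs)
    total = sum_major + sum_minor
    if total == 0:
        raise ValueError("window contains no reads")
    return 2 * sum_major * sum_minor / total ** 2


# A window with one balanced SNP and one fixed SNP.
hp_mixed = pooled_heterozygosity([(10, 10), (20, 0)])
# A window where the pool is fixed for one allele at every SNP.
hp_fixed = pooled_heterozygosity([(30, 0), (0, 25)])
```

Values near 0 indicate a window where the pool is close to fixation, which is the pattern scanned for as a candidate signature of selection; values approach 0.5 when alleles are balanced.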
Affiliation(s)
- Samuele Bovo
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
| | - Anisa Ribani
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
| | - Maria Muñoz
- Departamento Mejora Genética Animal, INIA, Crta. de la Coruña km. 7,5, 28040, Madrid, Spain
| | - Estefania Alves
- Departamento Mejora Genética Animal, INIA, Crta. de la Coruña km. 7,5, 28040, Madrid, Spain
| | - Jose P Araujo
- Centro de Investigação de Montanha (CIMO), Instituto Politécnico de Viana do Castelo, Escola Superior Agrária, Refóios do Lima, 4990-706, Ponte de Lima, Portugal
| | - Riccardo Bozzi
- DAGRI - Animal Science Section, Università di Firenze, Via delle Cascine 5, 50144, Florence, Italy
| | | | - Rui Charneca
- Instituto de Ciências Agrárias e Ambientais Mediterrânicas (ICAAM), Universidade de Évora, Polo da Mitra, Apartado 94, 7006-554, Évora, Portugal
| | - Federica Di Palma
- Earlham Institute, Norwich Research Park, Colney Lane, Norwich, NR4 7UZ, UK
| | - Graham Etherington
- Earlham Institute, Norwich Research Park, Colney Lane, Norwich, NR4 7UZ, UK
| | - Ana I Fernandez
- Departamento Mejora Genética Animal, INIA, Crta. de la Coruña km. 7,5, 28040, Madrid, Spain
| | - Fabián García
- Departamento Mejora Genética Animal, INIA, Crta. de la Coruña km. 7,5, 28040, Madrid, Spain
| | - Juan García-Casco
- Departamento Mejora Genética Animal, INIA, Crta. de la Coruña km. 7,5, 28040, Madrid, Spain
| | - Danijel Karolyi
- Department of Animal Science, Faculty of Agriculture, University of Zagreb, Svetošimunska c. 25, 10000, Zagreb, Croatia
| | - Maurizio Gallo
- Associazione Nazionale Allevatori Suini (ANAS), Via Nizza 53, 00198, Rome, Italy
| | - Vladimir Margeta
- Faculty of Agrobiotechnical Sciences, University of Osijek, Vladimira Preloga 1, 31000, Osijek, Croatia
| | - José Manuel Martins
- Instituto de Ciências Agrárias e Ambientais Mediterrânicas (ICAAM), Universidade de Évora, Polo da Mitra, Apartado 94, 7006-554, Évora, Portugal
| | - Marie J Mercat
- IFIP Institut du porc, La Motte au Vicomte, BP 35104, 35651, Le Rheu Cedex, France
| | - Giulia Moscatelli
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
| | - Yolanda Núñez
- Departamento Mejora Genética Animal, INIA, Crta. de la Coruña km. 7,5, 28040, Madrid, Spain
| | - Raquel Quintanilla
- Programa de Genética y Mejora Animal, IRTA, Torre Marimon, 08140, Caldes de Montbui, Barcelona, Spain
| | - Čedomir Radović
- Department of Pig Breeding and Genetics, Institute for Animal Husbandry, Belgrade-Zemun, 11080, Serbia
| | - Violeta Razmaite
- Animal Science Institute, Lithuanian University of Health Sciences, Baisogala, Lithuania
| | - Juliette Riquet
- GenPhySE, INRAE, Université de Toulouse, Chemin de Borde-Rouge 24, Auzeville Tolosane, 31326, Castanet Tolosan, France
| | - Radomir Savić
- Faculty of Agriculture, University of Belgrade, Nemanjina 6, Belgrade-Zemun, 11080, Serbia
| | - Giuseppina Schiavo
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
| | - Graziano Usai
- AGRIS SARDEGNA, Loc. Bonassai, 07100, Sassari, Italy
| | - Valerio J Utzeri
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy
| | - Christoph Zimmer
- Bäuerliche Erzeugergemeinschaft Schwäbisch Hall, Schwäbisch Hall, Germany
| | - Cristina Ovilo
- Departamento Mejora Genética Animal, INIA, Crta. de la Coruña km. 7,5, 28040, Madrid, Spain
| | - Luca Fontanesi
- Department of Agricultural and Food Sciences, Division of Animal Sciences, University of Bologna, Viale Fanin 46, 40127, Bologna, Italy.
|
22
|
Özdemir Özdoğan G, Kaya H. Next-Generation Sequencing Data Analysis on Pool-Seq and Low-Coverage Retinoblastoma Data. Interdiscip Sci 2020; 12:302-310. [PMID: 32519123 DOI: 10.1007/s12539-020-00374-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/07/2019] [Revised: 04/26/2020] [Accepted: 05/22/2020] [Indexed: 12/31/2022]
Abstract
Next-generation sequencing (NGS) refers to massively parallel or deep deoxyribonucleic acid (DNA) sequencing technology, which has revolutionized genomic research in recent years. Although the cost of generating NGS data has decreased since the technology emerged, it can still be a problem. Hence, new strategies such as pool-seq and low-coverage NGS have been developed to overcome the cost problem. Despite the lower cost, it is important to elucidate whether these strategies are efficient in NGS studies. We applied a bioinformatics pipeline to pool-seq and low-coverage retinoblastoma data retrieved from tumor-only data. Retinoblastoma is a childhood eye malignancy that is initiated by RB1 mutation or MYCN amplification and can lead to loss of vision in one or both eyes, and sometimes to loss of life. We applied our pipeline to the retinoblastoma data and to two other data sets to test its validity and to compare performance. High-confidence variant calls from the Genome in a Bottle Consortium were used for these purposes. We observed that our pipeline called a higher number of variants than a standard pipeline for all three data sets. In addition, the recall and F-score values were notably better with our pipeline. We further present our results on the disease data in terms of variants, variant types, and disease-related genes. This study provides a guideline for running an NGS data analysis pipeline on pool-seq and low-coverage sequencing data in conjunction. To obtain more conclusive outcomes for these two strategies, we recommend using cancer data with higher mutation rates and larger pools.
Affiliation(s)
| | - Hilal Kaya
- Department of Computer Engineering, Ankara Yildirim Beyazit University, 06010, Ankara, Turkey.
|
23
|
Wei YB, McCarthy M, Ren H, Carrillo-Roa T, Shekhtman T, DeModena A, Liu JJ, Leckband SG, Mors O, Rietschel M, Henigsberg N, Cattaneo A, Binder EB, Aitchison KJ, Kelsoe JR. A functional variant in the serotonin receptor 7 gene (HTR7), rs7905446, is associated with good response to SSRIs in bipolar and unipolar depression. Mol Psychiatry 2020; 25:1312-1322. [PMID: 30874608 PMCID: PMC6745302 DOI: 10.1038/s41380-019-0397-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 11/24/2018] [Revised: 02/18/2019] [Accepted: 02/21/2019] [Indexed: 02/06/2023]
Abstract
Predicting antidepressant response has been a clinical challenge for mood disorder. Although several genome-wide association studies have suggested a number of genetic variants to be associated with antidepressant response, the sample sizes are small and the results are difficult to replicate. Previous animal studies have shown that knockout of the serotonin receptor 7 gene (HTR7) resulted in an antidepressant-like phenotype, suggesting it was important to antidepressant action. In this report, in the first stage, we used a cost-effective pooled-sequencing strategy to sequence the entire HTR7 gene and its regulatory regions to investigate the association of common variants in HTR7 and clinical response to four selective serotonin reuptake inhibitors (SSRIs: citalopram, paroxetine, fluoxetine and sertraline) in a retrospective cohort mainly consisting of subjects with bipolar disorder (n = 359). We found 80 single-nucleotide polymorphisms (SNPs) with false discovery rate < 0.05 associated with response to paroxetine. Among the significant SNPs, rs7905446 (T/G), which is located at the promoter region, also showed nominal significance (P < 0.05) in fluoxetine group. GG/TG genotypes for rs7905446 and female gender were associated with better response to two SSRIs (paroxetine and fluoxetine). In the second stage, we replicated this association in two independent prospective samples of SSRI-treated patients with major depressive disorder: the MARS (n = 253, P = 0.0169) and GENDEP studies (n = 432, P = 0.008). The GG/TG genotypes were consistently associated with response in all three samples. Functional study of rs7905446 showed greater activity of the G allele in regulating expression of HTR7. The G allele displayed higher luciferase activity in two neuronal-related cell lines, and estrogen treatment decreased the activity of only the G allele. 
Electrophoretic mobility shift assay suggested that the G allele interacted with CCAAT/enhancer-binding protein beta transcription factor (TF), while the T allele did not show any interaction with any TFs. Our results provided novel pharmacogenomic evidence to support the role of HTR7 in association with antidepressant response.
Affiliation(s)
- Ya Bin Wei
- Center for Molecular Medicine, Karolinska University Hospital, Stockholm, 17176, Sweden; Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, 17176, Sweden; Department of Psychiatry, University of California San Diego, La Jolla, CA, 92093, USA
| | - Michael McCarthy
- Department of Psychiatry, University of California San Diego, La Jolla, CA, 92093, USA; Psychiatry Service, VA San Diego Healthcare System, San Diego, CA, 92161, USA
| | - Hongyan Ren
- Psychiatric Laboratory and Mental Health Center, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, Sichuan, P.R. China; Department of Psychiatry and Medical Genetics, University of Alberta, Edmonton, Alberta, Canada
| | - Tania Carrillo-Roa
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, 80804, Germany
| | - Tatyana Shekhtman
- Department of Psychiatry, University of California San Diego, La Jolla, CA, 92093, USA; Psychiatry Service, VA San Diego Healthcare System, San Diego, CA, 92161, USA
| | - Anna DeModena
- Department of Psychiatry, University of California San Diego, La Jolla, CA, 92093, USA; Psychiatry Service, VA San Diego Healthcare System, San Diego, CA, 92161, USA
| | - Jia Jia Liu
- National Institute on Drug Dependence, Peking University, Beijing 100191, China; Institute of Mental Health, National Clinical Research Center for Mental Disorders, Key Laboratory of Mental Health and Peking University Sixth Hospital, Peking University, Beijing 100191, China
| | - Susan G. Leckband
- Center for Molecular Medicine, Karolinska University Hospital, Stockholm, 17176, Sweden; Psychiatry Service, VA San Diego Healthcare System, San Diego, CA, 92161, USA
| | - Ole Mors
- Psychosis Research Unit, Aarhus University Hospital, Denmark
| | - Marcella Rietschel
- Department of Genetic Epidemiology in Psychiatry, Central Institute of Mental Health, Mannheim Heidelberg University, Mannheim, Germany
| | - Neven Henigsberg
- Croatian Institute for Brain Research, Center of Research Excellence for Basic, Clinical and Translational Neuroscience, University of Zagreb, School of Medicine, Zagreb, Croatia
| | | | - Elisabeth B. Binder
- Department of Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Munich, 80804, Germany; Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA 30322, USA
| | - Katherine J. Aitchison
- Department of Psychiatry and Medical Genetics, University of Alberta, Edmonton, Alberta, Canada
| | - John R. Kelsoe
- Department of Psychiatry, University of California San Diego, La Jolla, CA, 92093, USA; Psychiatry Service, VA San Diego Healthcare System, San Diego, CA, 92161, USA; Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, 92093, USA
|
24
|
Zhang J, Müller BSF, Tyre KN, Hersh HL, Bai F, Hu Y, Resende MFR, Rathinasabapathi B, Settles AM. Competitive Growth Assay of Mutagenized Chlamydomonas reinhardtii Compatible With the International Space Station Veggie Plant Growth Chamber. Front Plant Sci 2020; 11:631. [PMID: 32523594 PMCID: PMC7261848 DOI: 10.3389/fpls.2020.00631] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/09/2020] [Accepted: 04/24/2020] [Indexed: 06/11/2023]
Abstract
A biological life support system for spaceflight would capture carbon dioxide waste produced by living and working in space to generate useful organic compounds. Photosynthesis is the primary mechanism to fix carbon into organic molecules. Microalgae are highly efficient at converting light, water, and carbon dioxide into biomass, particularly under limiting, artificial light conditions that are a necessity in space photosynthetic production. Although there is great promise in developing algae for chemical or food production in space, most spaceflight algae growth studies have been conducted on solid agar-media to avoid handling liquids in microgravity. Here we report that breathable plastic tissue culture bags can support robust growth of Chlamydomonas reinhardtii in the Veggie plant growth chamber, which is used on the International Space Station (ISS) to grow terrestrial plants. Live cultures can be stored for at least 1 month in the bags at room temperature. The gene set required for growth in these photobioreactors was tested using a competitive growth assay with mutations induced by short-wave ultraviolet light (UVC) mutagenesis in either wild-type (CC-5082) or cw15 mutant (CC-1883) strains at the start of the assay. Genome sequencing identified UVC-induced mutations, which were enriched for transversions and non-synonymous mutations relative to natural variants among laboratory strains. Genes with mutations indicating positive selection were enriched for information processing genes related to DNA repair, RNA processing, translation, cytoskeletal motors, kinases, and ABC transporters. These data suggest that modification of DNA repair, signal transduction, and metabolite transport may be needed to improve growth rates in this spaceflight production system.
Affiliation(s)
- Junya Zhang
- Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
| | - Bárbara S. F. Müller
- Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
| | - Kevin N. Tyre
- Center for the Advancement of Science in Space, Melbourne, FL, United States
| | - Hope L. Hersh
- Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
| | - Fang Bai
- Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
| | - Ying Hu
- Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
| | - Marcio F. R. Resende
- Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
| | - Bala Rathinasabapathi
- Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
| | - A. Mark Settles
- Horticultural Sciences Department, University of Florida, Gainesville, FL, United States
|
25
|
Zhu S, He M, Liu Z, Qin Z, Wang Z, Duan L. Shared genetic susceptibilities for irritable bowel syndrome and depressive disorder in Chinese patients uncovered by pooled whole-exome sequencing. J Adv Res 2020; 23:113-121. [PMID: 32099673 PMCID: PMC7029050 DOI: 10.1016/j.jare.2020.01.016] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/02/2019] [Revised: 12/25/2019] [Accepted: 01/28/2020] [Indexed: 02/07/2023] Open
Abstract
Irritable bowel syndrome (IBS) is the most prevalent functional gastrointestinal disorder and presents high comorbidity with depressive disorder (DD). Many studies have confirmed that these two diseases share a similar pathophysiological process, but evidence of the genetic risks is limited. This study aimed to analyze the genetic susceptibilities for IBS and DD in Chinese patients. Pooled whole-exome sequencing (pooled-WES) was performed to identify candidate variants in groups of diarrhea-predominant IBS (IBS-D) patients, DD patients, and healthy controls (HC). Then, targeted sequencing was used to validate the candidate variants in three additional cohorts of IBS-D, DD, and HC. Four variants associated with both IBS-D and DD were identified through pooled-WES, and three of them were validated in targeted sequencing. The SYT8 rs3741231 G allele and the SSPO rs12536873 TT genotype were associated with both IBS-D and DD. The genes of these variants are important in neurogenesis and neurotransmission. In addition, we found COL6A1 rs13051496, a unique risk variant for IBS-D. It increased the IBS-D risk and correlated positively with the scores for abdominal bloating and dissatisfaction with bowel habits. The results of this study provide a genetic basis for the high comorbidity of IBS-D and DD.
Affiliation(s)
- Shiwei Zhu
- Department of Gastroenterology, Peking University Third Hospital, No. 49 North Garden Rd., Haidian District, Beijing 100191, China
| | - Meibo He
- Department of Gastroenterology, Peking University Third Hospital, No. 49 North Garden Rd., Haidian District, Beijing 100191, China
| | - Zuojing Liu
- Department of Gastroenterology, Peking University Third Hospital, No. 49 North Garden Rd., Haidian District, Beijing 100191, China
| | - Zelian Qin
- Department of Plastic Surgery, Peking University Third Hospital, No. 49 North Garden Rd., Haidian District, Beijing 100191, China
| | - Zhiren Wang
- Department of Science & Technology, Peking University HuiLongGuan Clinical Medical School, Beijing HuiLongGuan Hospital, Huilongguan Town, Changping District, Beijing 100096, China
| | - Liping Duan
- Department of Gastroenterology, Peking University Third Hospital, No. 49 North Garden Rd., Haidian District, Beijing 100191, China
|
26
|
Bertolini F, Ribani A, Capoccioni F, Buttazzoni L, Utzeri VJ, Bovo S, Schiavo G, Caggiano M, Fontanesi L, Rothschild MF. Identification of a major locus determining a pigmentation defect in cultivated gilthead seabream (Sparus aurata). Anim Genet 2020; 51:319-323. [PMID: 31900984 DOI: 10.1111/age.12890] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Accepted: 11/15/2019] [Indexed: 12/30/2022]
Abstract
The gilthead seabream (Sparus aurata) is an important cultivated species in the Mediterranean area. A major problem for the gilthead seabream aquaculture sector derives from the high frequency of phenotypic abnormalities, including discolorations. In this study, we applied a whole-genome resequencing approach to identify a genomic region affecting a pigmentation defect that occurred in a cultivated S. aurata population. Two equimolar DNA pools were constructed using DNA extracted from 30 normally coloured and 21 non-pigmented fish collected among the offspring of the same broodstock nucleus. Whole-genome resequencing reads from the two DNA pools were aligned to the S. aurata draft genome and variant calling was performed. A whole-genome heterozygosity scan from single pool sequencing data highlighted a peak of reduced heterozygosity of approximately 5 Mbp on chromosome 6 in the non-pigmented pool that was not present in the normally coloured pool. The comparison of the non-pigmented with the normally coloured fish using a whole-genome FST analysis detected three main regions within the coordinates previously detected with the heterozygosity analysis. The results support the presence of a major locus affecting this discoloration defect in this fish population. The results of this study have practical applications, including the possibility of eliminating this defect from the breeding stock, with direct economic advantages derived from the reduction of discarded fry. Other studies are needed to identify the candidate gene and the causative mutation, which could add information to understand the complex biology of fish pigmentation.
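The two scans described in this abstract (heterozygosity within one pool, FST between two pools) can be illustrated with a minimal sketch. The function names, the toy read counts, the window size, and the choice of the basic (H_T − H_S)/H_T estimator are assumptions made for illustration; the abstract does not state which estimator or window size the authors used.

```python
def allele_frequency(ref_reads, alt_reads):
    """Alternative-allele frequency in a pool, estimated from read counts."""
    depth = ref_reads + alt_reads
    if depth == 0:
        raise ValueError("site has no coverage")
    return alt_reads / depth


def heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic site."""
    return 2.0 * p * (1.0 - p)


def fst(p1, p2):
    """Basic FST = (H_T - H_S) / H_T for two equally weighted pools."""
    h_total = heterozygosity((p1 + p2) / 2.0)
    if h_total == 0.0:
        return 0.0  # both pools are fixed for the same allele
    h_within = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    return (h_total - h_within) / h_total


def window_means(values, size):
    """Mean of consecutive, non-overlapping windows of `size` sites."""
    windows = [values[i:i + size] for i in range(0, len(values), size)]
    return [sum(w) / len(w) for w in windows]


# Toy (ref, alt) read counts per site; hypothetical, NOT data from the study.
normal_pool = [(20, 20), (18, 22), (25, 15), (21, 19)]
defect_pool = [(19, 21), (40, 0), (39, 1), (40, 0)]

p_normal = [allele_frequency(r, a) for r, a in normal_pool]
p_defect = [allele_frequency(r, a) for r, a in defect_pool]

# Single-pool heterozygosity scan and two-pool FST scan, both windowed.
het_defect = window_means([heterozygosity(p) for p in p_defect], size=2)
fst_scan = window_means([fst(a, b) for a, b in zip(p_normal, p_defect)], size=2)
print(het_defect, fst_scan)
```

In a real Pool-seq analysis the estimators are usually corrected for pool size and sequencing depth; this sketch omits those corrections.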
Affiliation(s)
- F Bertolini
- National Institute of Aquatic Resources, Technical University of Denmark, Kongens Lyngby, 2800, Denmark.,Department of Animal Science, Iowa State University, Ames, IA, 50011-3150, USA
| | - A Ribani
- Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, Viale G. Fanin 46, Bologna, 40127, Italy
| | - F Capoccioni
- Centro di ricerca di Zootecnia e Acquacoltura, Consiglio per la ricerca in agricoltura e l'analisi dell'economia agraria (CREA), Rome, 00198, Italy
| | - L Buttazzoni
- Centro di ricerca di Zootecnia e Acquacoltura, Consiglio per la ricerca in agricoltura e l'analisi dell'economia agraria (CREA), Rome, 00198, Italy
| | - V J Utzeri
- Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, Viale G. Fanin 46, Bologna, 40127, Italy
| | - S Bovo
- Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, Viale G. Fanin 46, Bologna, 40127, Italy
| | - G Schiavo
- Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, Viale G. Fanin 46, Bologna, 40127, Italy
| | - M Caggiano
- Panittica Italia Società Agricola Srl, Torre Canne di Fasano, Brindisi, 72016, Italy
| | - L Fontanesi
- Division of Animal Sciences, Department of Agricultural and Food Sciences, University of Bologna, Viale G. Fanin 46, Bologna, 40127, Italy
| | - M F Rothschild
- Department of Animal Science, Iowa State University, Ames, IA, 50011-3150, USA
|
27
|
Craig DJ, Morrison T, Khuder SA, Crawford EL, Wu L, Xu J, Blomquist TM, Willey JC. Technical advance in targeted NGS analysis enables identification of lung cancer risk-associated low frequency TP53, PIK3CA, and BRAF mutations in airway epithelial cells. BMC Cancer 2019; 19:1081. [PMID: 31711466 PMCID: PMC6844032 DOI: 10.1186/s12885-019-6313-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 06/05/2019] [Accepted: 10/30/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Standardized Nucleic Acid Quantification for SEQuencing (SNAQ-SEQ) is a novel method that utilizes synthetic DNA internal standards spiked into each sample prior to next generation sequencing (NGS) library preparation. This method was applied to analysis of normal-appearing airway epithelial cells (AEC) obtained by bronchoscopy in an effort to define a somatic mutation field effect associated with lung cancer risk. There is a need for biomarkers that reliably detect those at highest lung cancer risk, thereby enabling more effective screening by annual low-dose CT. The purpose of this study was to test the hypothesis that lung cancer risk is characterized by increased prevalence of low variant allele frequency (VAF) somatic mutations in lung cancer driver genes in AEC. METHODS Synthetic DNA internal standards (IS) were prepared for 11 lung cancer driver genes and mixed with each AEC genomic (g) DNA specimen prior to competitive multiplex PCR amplicon NGS library preparation. A custom Perl script was developed to separate IS reads and respective specimen gDNA reads from each target into separate files for parallel variant frequency analysis. This approach identified nucleotide-specific sequencing error and enabled reliable detection of specimen mutations with VAF as low as 5 × 10⁻⁴ (0.05%). This method was applied in a retrospective case-control study of AEC specimens collected by bronchoscopic brush biopsy from the normal airways of 19 subjects, including eleven lung cancer cases and eight non-cancer controls, and the association of lung cancer risk with AEC driver gene mutations was tested. RESULTS TP53 mutations with 0.05-1.0% VAF were more prevalent (p < 0.05) and also enriched for tobacco smoke and age-associated mutation signatures in normal AEC from lung cancer cases compared to non-cancer controls matched for smoking and age. Further, PIK3CA and BRAF mutations in this VAF range were identified in AEC from cases but not controls.
CONCLUSIONS Application of SNAQ-SEQ to measure mutations in the 0.05-1.0% VAF range enabled identification of an AEC somatic mutation field of injury associated with lung cancer risk. A biomarker comprising TP53, PIK3CA, and BRAF somatic mutations may better stratify individuals for optimal lung cancer screening and prevention outcomes.
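The idea of using internal-standard (IS) reads to estimate position-specific sequencing error can be sketched as follows. This is not the authors' Perl pipeline: the Poisson tail test, the +1 pseudocount, the function names, and the example read counts are all assumptions chosen to show how a specimen VAF might be compared against an IS-derived error rate at the same position.

```python
import math


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1).

    Adequate for the small expected error counts used here; exp(-lam)
    underflows for very large lam.
    """
    term = math.exp(-lam)  # P(X = 0)
    cdf = 0.0
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1)
    return max(0.0, 1.0 - cdf)


def assess_site(spec_alt, spec_depth, is_alt, is_depth):
    """Compare a specimen variant count with the IS-derived error rate.

    The +1 pseudocount (an assumption) keeps the error rate above zero
    when the internal standard shows no variant reads at this position.
    """
    vaf = spec_alt / spec_depth
    error_rate = (is_alt + 1) / (is_depth + 1)
    expected_errors = error_rate * spec_depth
    p_value = poisson_upper_tail(spec_alt, expected_errors)
    return vaf, error_rate, p_value


# Hypothetical counts: 50 variant reads in 100,000 specimen reads (VAF 0.05%)
# against 5 variant reads in 100,000 internal-standard reads.
vaf, error_rate, p_value = assess_site(50, 100_000, 5, 100_000)
print(f"VAF={vaf:.5f} IS error={error_rate:.6f} p={p_value:.3g}")
```

A site would be reported only when the specimen count clearly exceeds what the IS error rate predicts; the significance threshold is a further design choice not specified in the abstract.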
Affiliation(s)
- Daniel J. Craig
- Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH 43614 USA
| | - Thomas Morrison
- Accugenomics, Inc, 1410 Commonwealth Dr #105, Wilmington, NC 28403 USA
| | - Sadik A. Khuder
- Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH 43614 USA
| | - Erin L. Crawford
- Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH 43614 USA
| | - Leihong Wu
- National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR USA
| | - Joshua Xu
- National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR USA
| | - Thomas M. Blomquist
- Department of Pathology, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH 43614 USA
| | - James C. Willey
- Department of Medicine, The University of Toledo College of Medicine, 3000 Arlington Avenue, Toledo, OH 43614 USA
|
28
|
Nagarajan N, Yapp EKY, Le NQK, Kamaraj B, Al-Subaie AM, Yeh HY. Application of Computational Biology and Artificial Intelligence Technologies in Cancer Precision Drug Discovery. BioMed Research International 2019; 2019:8427042. [PMID: 31886259 PMCID: PMC6925679 DOI: 10.1155/2019/8427042] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 08/30/2019] [Accepted: 10/14/2019] [Indexed: 02/08/2023]
Abstract
Artificial intelligence (AI) has enormous potential in many areas of healthcare, including research and chemical discovery. Using large amounts of aggregated data, AI can discover patterns and learn, transforming these data into "usable" knowledge. Well aware of this, the world's leading pharmaceutical companies have already begun to use AI to improve their research on new drugs. The goal is to exploit modern computational biology and machine learning systems to predict molecular behaviour and the likelihood of obtaining a useful drug, thus saving time and money on unnecessary tests. Clinical studies, electronic medical records, high-resolution medical images, and genomic profiles can be used as resources to aid drug development. Pharmaceutical and medical researchers have extensive data sets that can be analyzed by strong AI systems. This review focuses on how computational biology and AI technologies can be implemented by integrating knowledge of cancer drugs, drug resistance, next-generation sequencing, genetic variants, and structural biology in cancer precision drug discovery.
Affiliation(s)
| | - Edward K. Y. Yapp
- Singapore Institute of Manufacturing Technology, 2 Fusionopolis Way, Singapore 138634
| | - Nguyen Quoc Khanh Le
- School of Humanities, Nanyang Technological University, 14 Nanyang Dr, Singapore 637332
| | - Balu Kamaraj
- Department of Neuroscience Technology, College of Applied Medical Sciences, Imam Abdulrahman Bin Faisal University, Jubail 35816, Saudi Arabia
| | - Abeer Mohammed Al-Subaie
- Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Imam Abdulrahman Bin Faisal University, Dammam, Saudi Arabia
| | - Hui-Yuan Yeh
- School of Humanities, Nanyang Technological University, 14 Nanyang Dr, Singapore 637332
|
29
|
Balaparya A, De S. Revisiting signatures of neutral tumor evolution in the light of complexity of cancer genomic data. Nat Genet 2019; 50:1626-1628. [PMID: 30250123 DOI: 10.1038/s41588-018-0219-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 02/01/2023]
Affiliation(s)
- Abdul Balaparya
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA
| | - Subhajyoti De
- Rutgers Cancer Institute of New Jersey, New Brunswick, NJ, USA.
|
30
|
Topa H, Honkela A. GPrank: an R package for detecting dynamic elements from genome-wide time series. BMC Bioinformatics 2018; 19:367. [PMID: 30286713 PMCID: PMC6172792 DOI: 10.1186/s12859-018-2370-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 06/19/2018] [Accepted: 09/11/2018] [Indexed: 01/30/2023] Open
Abstract
Background Genome-wide high-throughput sequencing (HTS) time series experiments are a powerful tool for monitoring various genomic elements over time. They can be used to monitor, for example, gene or transcript expression with RNA sequencing (RNA-seq), DNA methylation levels with bisulfite sequencing (BS-seq), or abundances of genetic variants in populations with pooled sequencing (Pool-seq). However, because of high experimental costs, the time series data sets often consist of a very limited number of time points with very few or no biological replicates, posing challenges in the data analysis. Results Here we present the GPrank R package for modelling genome-wide time series by incorporating variance information obtained during pre-processing of the HTS data using probabilistic quantification methods or from a beta-binomial model using sequencing depth. GPrank is well-suited for analysing both short and irregularly sampled time series. It is based on modelling each time series by two Gaussian process (GP) models, namely, time-dependent and time-independent GP models, and comparing the evidence provided by data under two models by computing their Bayes factor (BF). Genomic elements are then ranked by their BFs, and temporally most dynamic elements can be identified. Conclusions Incorporating the variance information helps GPrank avoid false positives without compromising computational efficiency. Fitted models can be easily further explored in a browser. Detection and visualisation of temporally most active dynamic elements in the genome can provide a good starting point for further downstream analyses for increasing our understanding of the studied processes.
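The model comparison described in this abstract, a time-dependent Gaussian process against a time-independent one, ranked by Bayes factor, can be sketched in a few lines. This is not the GPrank implementation (which is an R package): the sketch assumes zero-mean (centered) data, an RBF kernel with fixed rather than optimized hyperparameters, a noise-only time-independent model, and known per-point noise variances supplied by the caller. All function names and values are hypothetical.

```python
import math


def rbf_kernel(times, signal_var, length_scale):
    """Squared-exponential covariance between all pairs of time points."""
    return [[signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)
             for b in times] for a in times]


def add_diagonal(matrix, diag):
    """Return matrix + diag(diag); used to add per-point noise variances."""
    return [[v + (diag[i] if i == j else 0.0) for j, v in enumerate(row)]
            for i, row in enumerate(matrix)]


def cholesky(a):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def log_marginal_likelihood(y, cov):
    """log N(y | 0, cov), evaluated through the Cholesky factor."""
    n = len(y)
    low = cholesky(cov)
    z = []  # forward substitution: solve low @ z = y
    for i in range(n):
        z.append((y[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)  # equals y^T cov^{-1} y
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * (quad + log_det + n * math.log(2.0 * math.pi))


def log_bayes_factor(times, y, noise_vars, signal_var=4.0, length_scale=1.0):
    """log BF of the time-dependent model over the noise-only model."""
    zero = [[0.0] * len(times) for _ in times]
    static_cov = add_diagonal(zero, noise_vars)
    dynamic_cov = add_diagonal(rbf_kernel(times, signal_var, length_scale),
                               noise_vars)
    return (log_marginal_likelihood(y, dynamic_cov)
            - log_marginal_likelihood(y, static_cov))


times = [0.0, 1.0, 2.0, 3.0, 4.0]
noise = [0.01] * 5                       # assumed known variances
trend = [-2.0, -1.0, 0.0, 1.0, 2.0]      # toy series with a clear trend
flat = [0.0] * 5                         # toy series with no change
print(log_bayes_factor(times, trend, noise), log_bayes_factor(times, flat, noise))
```

Ranking elements by this quantity puts series with a clear temporal trend ahead of flat ones, and larger noise variances pull the Bayes factor of a noisy series toward the time-independent model.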
Affiliation(s)
- Hande Topa
- Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, 00014, Finland. .,Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, 00076, Finland.
| | - Antti Honkela
- Helsinki Institute for Information Technology HIIT, Department of Mathematics and Statistics, University of Helsinki, Helsinki, 00014, Finland.,Department of Public Health, University of Helsinki, Helsinki, 00014, Finland
|
31
|
Shulskaya MV, Alieva AK, Vlasov IN, Zyrin VV, Fedotova EY, Abramycheva NY, Usenko TS, Yakimovsky AF, Emelyanov AK, Pchelina SN, Illarioshkin SN, Slominsky PA, Shadrina MI. Whole-Exome Sequencing in Searching for New Variants Associated With the Development of Parkinson's Disease. Front Aging Neurosci 2018; 10:136. [PMID: 29867446 PMCID: PMC5963122 DOI: 10.3389/fnagi.2018.00136] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 02/13/2018] [Accepted: 04/24/2018] [Indexed: 01/08/2023] Open
Abstract
Background: Parkinson’s disease (PD) is a complex disease with its monogenic forms accounting for less than 10% of all cases. Whole-exome sequencing (WES) technology has been used successfully to find mutations in large families. However, because of the late onset of the disease, only small families and unrelated patients are usually available. WES conducted in such cases yields a large number of candidate variants. There are currently a number of imperfect software tools that allow the pathogenicity of variants to be evaluated. Objectives: We analyzed 48 unrelated patients with an alleged autosomal dominant familial form of PD using WES and developed a strategy for selecting potential pathogenetically significant variants using almost all available bioinformatics resources for the analysis of exonic regions. Methods: DNA sequencing of 48 patients in whom frequent mutations had been excluded was performed using an Illumina HiSeq 2500 platform. The possible pathogenetic significance of identified variants and their involvement in the pathogenesis of PD were assessed using SNP and Variation Suite (SVS), Combined Annotation Dependent Depletion (CADD) and Rare Exome Variant Ensemble Learner (REVEL) software. Functional evaluation was performed using the Pathway Studio database. Results: A significant reduction in the search range from 7082 to 25 variants in 23 genes associated with PD or neuronal function was achieved. Eight genes (FXN, MFN2, MYOC, NPC1, PSEN1, RET, SCN3A and SPG7) were the most significant. Conclusions: The multistep approach developed made it possible to conduct an effective search for potential pathogenetically significant variants, presumably involved in the pathogenesis of PD. The data obtained need to be further verified experimentally.
Affiliation(s)
- Marina V Shulskaya
- Laboratory of Molecular Genetics of Hereditary Diseases, Institute of Molecular Genetics, Russian Academy of Sciences (RAS), Moscow, Russia
| | - Anelya Kh Alieva
- Laboratory of Molecular Genetics of Hereditary Diseases, Institute of Molecular Genetics, Russian Academy of Sciences (RAS), Moscow, Russia
| | - Ivan N Vlasov
- Laboratory of Molecular Genetics of Hereditary Diseases, Institute of Molecular Genetics, Russian Academy of Sciences (RAS), Moscow, Russia
| | - Vladimir V Zyrin
- Laboratory of Molecular Genetics of Hereditary Diseases, Institute of Molecular Genetics, Russian Academy of Sciences (RAS), Moscow, Russia
| | - Ekaterina Yu Fedotova
- Federal State Scientific Institution, Scientific Center of Neurology, Russian Academy of Sciences (RAS), Moscow, Russia
| | - Natalia Yu Abramycheva
- Federal State Scientific Institution, Scientific Center of Neurology, Russian Academy of Sciences (RAS), Moscow, Russia
| | - Tatiana S Usenko
- The Petersburg Nuclear Physics Institute of the National Research Center, Kurchatov Institute, Russian Academy of Sciences (RAS), Gatchina, Russia.,Federal State Budgetary Educational Institution of Higher Education, Pavlov First Saint Petersburg State Medical University, Saint Petersburg, Russia
| | - Andrei F Yakimovsky
- Federal State Budgetary Educational Institution of Higher Education, Pavlov First Saint Petersburg State Medical University, Saint Petersburg, Russia
| | - Anton K Emelyanov
- The Petersburg Nuclear Physics Institute of the National Research Center, Kurchatov Institute, Russian Academy of Sciences (RAS), Gatchina, Russia.,Federal State Budgetary Educational Institution of Higher Education, Pavlov First Saint Petersburg State Medical University, Saint Petersburg, Russia
| | - Sofya N Pchelina
- The Petersburg Nuclear Physics Institute of the National Research Center, Kurchatov Institute, Russian Academy of Sciences (RAS), Gatchina, Russia.,Federal State Budgetary Educational Institution of Higher Education, Pavlov First Saint Petersburg State Medical University, Saint Petersburg, Russia
| | - Sergei N Illarioshkin
- Federal State Scientific Institution, Scientific Center of Neurology, Russian Academy of Sciences (RAS), Moscow, Russia
| | - Petr A Slominsky
- Laboratory of Molecular Genetics of Hereditary Diseases, Institute of Molecular Genetics, Russian Academy of Sciences (RAS), Moscow, Russia
| | - Maria I Shadrina
- Laboratory of Molecular Genetics of Hereditary Diseases, Institute of Molecular Genetics, Russian Academy of Sciences (RAS), Moscow, Russia
|
32
|
Ryu S, Han J, Norden-Krichmar TM, Schork NJ, Suh Y. Effective discovery of rare variants by pooled target capture sequencing: A comparative analysis with individually indexed target capture sequencing. Mutat Res 2018; 809:24-31. [PMID: 29677560 PMCID: PMC5962423 DOI: 10.1016/j.mrfmmm.2018.03.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/27/2018] [Accepted: 03/28/2018] [Indexed: 01/11/2023]
Abstract
Identification of all genetic variants associated with complex traits is one of the most important goals in modern human genetics. Genome-wide association studies (GWAS) have been successfully applied to identify common variants, which thus far explain only a small portion of heritability. Interest in rare variants has been growing as a possible answer to this missing heritability. While next-generation sequencing allows detection of rare variants, its cost is still prohibitively high for sequencing the large number of human DNA samples required for rare variant association studies. In this study, we evaluated the sensitivity and specificity of sequencing pooled DNA samples from multiple individuals (Pool-seq) as a cost-effective and robust approach for rare variant discovery. We compared Pool-seq with individual-seq of indexed target capture of up to 960 genes in ∼1000 individuals, followed by independent genotyping validation studies. We found that Pool-seq was as effective and accurate as individual-seq in detecting rare variants and in estimating their minor allele frequencies (MAFs). Our results suggest that Pool-seq can be used as an efficient and cost-effective method for discovery of rare variants for population-based sequencing studies in individual laboratories.
Affiliation(s)
- Seungjin Ryu
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | - Jeehae Han
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA
| | | | - Nicholas J Schork
- The Scripps Research Institute, La Jolla, CA 92037, USA; J. Craig Venter Institute, La Jolla, CA, 92037, USA
| | - Yousin Suh
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, 10461, USA; Department of Medicine, Albert Einstein College of Medicine, Bronx, NY, 10461, USA; Department of Ophthalmology and Visual Sciences, Albert Einstein College of Medicine, Bronx, NY, 10461, USA.
|
33
|
Xu C. A review of somatic single nucleotide variant calling algorithms for next-generation sequencing data. Comput Struct Biotechnol J 2018; 16:15-24. [PMID: 29552334 PMCID: PMC5852328 DOI: 10.1016/j.csbj.2018.01.003] [Citation(s) in RCA: 153] [Impact Index Per Article: 21.9] [Received: 09/08/2017] [Revised: 01/20/2018] [Accepted: 01/28/2018] [Indexed: 02/06/2023] Open
Abstract
Detection of somatic mutations holds great potential in cancer treatment and has been a very active research field in the past few years, especially since the breakthrough of next-generation sequencing technology. A collection of variant calling pipelines has been developed with different underlying models, filters, input data requirements, and targeted applications. This review aims to enumerate these unique features of the state-of-the-art variant callers, in the hope of providing a practical guide for selecting the appropriate pipeline for specific applications. We focus on the detection of somatic single nucleotide variants, ranging from traditional variant callers based on whole genome or exome sequencing of paired tumor-normal samples to recent low-frequency variant callers designed for targeted sequencing protocols with unique molecular identifiers. The variant callers have been extensively benchmarked, with inconsistent performance across these studies. We review the reference materials, datasets, and performance metrics that have been used in the benchmarking studies. Finally, we discuss emerging trends and future directions of variant calling algorithms.
Affiliation(s)
- Chang Xu
- Life Science Research and Foundation, Qiagen Sciences, Inc., 6951 Executive Way, Frederick, Maryland 21703, USA
|
34
|
Bansal V, Gassenhuber J, Phillips T, Oliveira G, Harbaugh R, Villarasa N, Topol EJ, Seufferlein T, Boehm BO. Spectrum of mutations in monogenic diabetes genes identified from high-throughput DNA sequencing of 6888 individuals. BMC Med 2017; 15:213. [PMID: 29207974 PMCID: PMC5717832 DOI: 10.1186/s12916-017-0977-3] [Citation(s) in RCA: 68] [Impact Index Per Article: 8.5] [Received: 05/24/2017] [Accepted: 11/11/2017] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Diagnosis of monogenic as well as atypical forms of diabetes mellitus has important clinical implications for their specific diagnosis, prognosis, and targeted treatment. Single gene mutations that affect beta-cell function represent 1-2% of all cases of diabetes. However, phenotypic heterogeneity and lack of family history of diabetes can limit the diagnosis of monogenic forms of diabetes. Next-generation sequencing technologies provide an excellent opportunity to screen large numbers of individuals with a diagnosis of diabetes for mutations in disease-associated genes. METHODS We utilized a targeted sequencing approach using the Illumina HiSeq to perform a case-control sequencing study of 22 monogenic diabetes genes in 4016 individuals with type 2 diabetes (including 1346 individuals diagnosed before the age of 40 years) and 2872 controls. We analyzed protein-coding variants identified from the sequence data and compared the frequencies of pathogenic variants (protein-truncating variants and missense variants) between the cases and controls. RESULTS A total of 40 individuals with diabetes (1.8% of early onset sub-group and 0.6% of adult onset sub-group) were carriers of known pathogenic missense variants in the GCK, HNF1A, HNF4A, ABCC8, and INS genes. In addition, heterozygous protein truncating mutations were detected in the GCK, HNF1A, and HNF1B genes in seven individuals with diabetes. Rare missense mutations in the GCK gene were significantly over-represented in individuals with diabetes (0.5% carrier frequency) compared to controls (0.035%). One individual with early onset diabetes was homozygous for a rare pathogenic missense variant in the WFS1 gene but did not have the additional phenotypes associated with Wolfram syndrome. CONCLUSION Targeted sequencing of genes linked with monogenic diabetes can identify disease-relevant mutations in individuals diagnosed with type 2 diabetes not suspected of having monogenic forms of the disease. 
Our data suggest that GCK-MODY frequently masquerades as classical type 2 diabetes. The results confirm that MODY is under-diagnosed, particularly in individuals presenting with early onset diabetes and clinically labeled as type 2 diabetes; thus, sequencing of all monogenic diabetes genes should be routinely considered in such individuals. Genetic information can provide a specific diagnosis, inform disease prognosis and may help to better stratify treatment plans.
Affiliation(s)
- Vikas Bansal
- Department of Pediatrics, University of California San Diego, La Jolla, CA, USA.
| | | | - Tierney Phillips
- Scripps Translational Science Institute and Scripps Health, La Jolla, CA, USA
| | - Glenn Oliveira
- Scripps Translational Science Institute and Scripps Health, La Jolla, CA, USA
| | - Rebecca Harbaugh
- Scripps Translational Science Institute and Scripps Health, La Jolla, CA, USA
| | - Nikki Villarasa
- Scripps Translational Science Institute and Scripps Health, La Jolla, CA, USA
| | - Eric J Topol
- Scripps Translational Science Institute and Scripps Health, La Jolla, CA, USA
| | - Thomas Seufferlein
- Department of Internal Medicine I, Ulm University Medical Centre, Ulm, Germany
| | - Bernhard O Boehm
- Department of Internal Medicine I, Ulm University Medical Centre, Ulm, Germany. .,Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore. .,Imperial College London, London, UK.
|
35
|
Gupta P, Reddaiah B, Salava H, Upadhyaya P, Tyagi K, Sarma S, Datta S, Malhotra B, Thomas S, Sunkum A, Devulapalli S, Till BJ, Sreelakshmi Y, Sharma R. Next-generation sequencing (NGS)-based identification of induced mutations in a doubly mutagenized tomato (Solanum lycopersicum) population. The Plant Journal: for Cell and Molecular Biology 2017; 92:495-508. [PMID: 28779536 DOI: 10.1111/tpj.13654] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 05/23/2017] [Revised: 07/25/2017] [Accepted: 07/26/2017] [Indexed: 05/21/2023]
Abstract
The identification of mutations in targeted genes has been significantly simplified by the advent of TILLING (Targeting Induced Local Lesions In Genomes), speeding up the functional genomic analysis of animals and plants. Next-generation sequencing (NGS) is gradually replacing classical TILLING for mutation detection, as it allows the analysis of a large number of amplicons in a short time. The NGS approach was used to identify mutations in a population of Solanum lycopersicum (tomato) that was doubly mutagenized by ethylmethane sulphonate (EMS). Twenty-five genes belonging to carotenoid and folate metabolism were PCR-amplified and screened to identify potentially beneficial alleles. To augment efficiency, the 600-bp amplicons were directly sequenced in a non-overlapping manner on an Illumina MiSeq, obviating the need for a fragmentation step before library preparation. A comparison of the different pooling depths revealed that heterozygous mutations could be identified up to 128-fold pooling. An evaluation of six different software programs (CAMBa, CRISP, GATK UnifiedGenotyper, LoFreq, SNVer and ViPR) revealed that no single program was robust enough to predict mutations with high fidelity. Among these, CRISP and CAMBa predicted mutations with lower false discovery rates. The false positives were largely eliminated by considering only mutations commonly predicted by two different software programs. The screening of 23.47 Mb of tomato genome yielded 75 predicted mutations, 64 of which were confirmed by Sanger sequencing, giving an average mutation density of 1/367 Kb. Our results indicate that NGS combined with multiple variant detection tools can reduce false positives and significantly speed up the mutation discovery rate.
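The two-caller consensus filter and the reported mutation density can be illustrated with a short sketch. The variant tuples and gene names below are hypothetical placeholders, not calls from the study; only the screened length (23.47 Mb) and the number of confirmed mutations (64) are taken from the abstract.

```python
def consensus_calls(*call_sets):
    """Variants reported by every caller (set intersection)."""
    return set.intersection(*map(set, call_sets))


def mutation_density_kb(screened_bases, confirmed_mutations):
    """Average number of kilobases screened per confirmed mutation."""
    return screened_bases / confirmed_mutations / 1000.0


# Hypothetical calls keyed by (amplicon, position, ref, alt).
caller_a = {("geneA", 101, "G", "A"), ("geneA", 250, "C", "T"),
            ("geneB", 77, "G", "A")}
caller_b = {("geneA", 101, "G", "A"), ("geneB", 77, "G", "A"),
            ("geneB", 300, "C", "T")}

# Keep only variants that both callers agree on.
shared = consensus_calls(caller_a, caller_b)

# Figures from the abstract: 23.47 Mb screened, 64 Sanger-confirmed mutations.
density = mutation_density_kb(23.47e6, 64)
print(sorted(shared), round(density))
```

Requiring agreement between callers trades some sensitivity for a lower false discovery rate, which is the effect the abstract reports.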
Affiliation(s)
- Prateek Gupta
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
| | - Bodanapu Reddaiah
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
| | - Hymavathi Salava
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
| | - Pallawi Upadhyaya
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
| | - Kamal Tyagi
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
| | - Supriya Sarma
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
| | - Sneha Datta
- Plant Breeding and Genetics Laboratory, IAEA Seibersdorf Laboratories, Reaktorstrasse 1, Seibersdorf, Austria
| | - Bharti Malhotra
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
| | - Sherinmol Thomas
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
| | - Anusha Sunkum
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
| | - Sameera Devulapalli
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
| | - Bradley John Till
- Plant Breeding and Genetics Laboratory, IAEA Seibersdorf Laboratories, Reaktorstrasse 1, Seibersdorf, Austria
| | - Yellamaraju Sreelakshmi
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
| | - Rameshwar Sharma
- Repository of Tomato Genomics Resources, Department of Plant Sciences, University of Hyderabad, Hyderabad, India
|
36
|
Doyle SR, Bourguinat C, Nana-Djeunga HC, Kengne-Ouafo JA, Pion SDS, Bopda J, Kamgno J, Wanji S, Che H, Kuesel AC, Walker M, Basáñez MG, Boakye DA, Osei-Atweneboana MY, Boussinesq M, Prichard RK, Grant WN. Genome-wide analysis of ivermectin response by Onchocerca volvulus reveals that genetic drift and soft selective sweeps contribute to loss of drug sensitivity. PLoS Negl Trop Dis 2017; 11:e0005816. [PMID: 28746337 PMCID: PMC5546710 DOI: 10.1371/journal.pntd.0005816] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Received: 01/17/2017] [Revised: 08/07/2017] [Accepted: 07/19/2017] [Indexed: 12/30/2022] Open
Abstract
Background Treatment of onchocerciasis using mass ivermectin administration has reduced morbidity and transmission throughout Africa and Central/South America. Mass drug administration is likely to exert selection pressure on parasites, and phenotypic and genetic changes in several Onchocerca volvulus populations from Cameroon and Ghana—exposed to more than a decade of regular ivermectin treatment—have raised concern that sub-optimal responses to ivermectin's anti-fecundity effect are becoming more frequent and may spread. Methodology/Principal findings Pooled next generation sequencing (Pool-seq) was used to characterise genetic diversity within and between 108 adult female worms differing in ivermectin treatment history and response. Genome-wide analyses revealed genetic variation that significantly differentiated good responder (GR) and sub-optimal responder (SOR) parasites. These variants were not randomly distributed but clustered in ~31 quantitative trait loci (QTLs), with little overlap in putative QTL position and gene content between the two countries. Published candidate ivermectin SOR genes were largely absent in these regions; QTLs differentiating GR and SOR worms were enriched for genes in molecular pathways associated with neurotransmission, development, and stress responses. Finally, single worm genotyping demonstrated that geographic isolation and genetic change over time (in the presence of drug exposure) had a significantly greater role in shaping genetic diversity than the evolution of SOR. Conclusions/Significance This study is one of the first genome-wide association analyses in a parasitic nematode, and provides insight into the genomics of ivermectin response and population structure of O. volvulus. 
We argue that ivermectin response is a polygenically-determined quantitative trait (QT) whereby identical or related molecular pathways but not necessarily individual genes are likely to determine the extent of ivermectin response in different parasite populations. Furthermore, we propose that genetic drift rather than genetic selection of SOR is the underlying driver of population differentiation, which has significant implications for the emergence and potential spread of SOR within and between these parasite populations. Onchocerciasis is a human parasitic disease endemic across large areas of Sub-Saharan Africa, where more than 99% of the estimated 100 million people globally at-risk live. The microfilarial stage of Onchocerca volvulus causes pathologies ranging from mild itching to visual impairment and ultimately, irreversible blindness. Mass administration of ivermectin kills microfilariae and has an anti-fecundity effect on adult worms by temporarily inhibiting the development in utero and/or release into the skin of new microfilariae, thereby reducing morbidity and transmission. Phenotypic and genetic changes in some parasite populations that have undergone multiple ivermectin treatments in Cameroon and Ghana have raised concern that sub-optimal response to ivermectin's anti-fecundity effect may increase in frequency, reducing the impact of ivermectin-based control measures. We used next generation sequencing of small pools of parasites to define genome-wide genetic differences between phenotypically characterised good and sub-optimal responder parasites from Cameroon and Ghana, and identified multiple regions of the genome that differentiated the response types. These regions were largely different between parasites from these two countries but revealed common molecular pathways that might be involved in determining the extent of response to ivermectin's anti-fecundity effect. 
These data reveal a more complex than previously described pattern of genetic diversity among O. volvulus populations that differ in their geography and response to ivermectin treatment.
Affiliation(s)
- Stephen R. Doyle
- Department of Animal, Plant and Soil Sciences, La Trobe University, Bundoora, Australia
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, United Kingdom
- * E-mail: (SRD); (RKP); (WNG)
| | - Catherine Bourguinat
- Institute of Parasitology, McGill University, Sainte Anne-de-Bellevue, Québec, Canada
| | - Hugues C. Nana-Djeunga
- Parasitology and Ecology Laboratory, Department of Animal Biology and Physiology, Faculty of Science, University of Yaoundé 1, Yaoundé, Cameroon
- Centre for Research on Filariasis and other Tropical Diseases (CRFilMT), Yaoundé, Cameroon
| | - Jonas A. Kengne-Ouafo
- Research Foundation in Tropical Diseases and the Environment (REFOTDE), Buea, Cameroon
| | - Sébastien D. S. Pion
- Institut de Recherche pour le Développement (IRD), IRD UMI 233 TransVIHMI – Université Montpellier – INSERM U1175, Montpellier, France
| | - Jean Bopda
- Faculty of Medicine and Biomedical Sciences, University of Yaoundé 1, Yaoundé, Cameroon
| | - Joseph Kamgno
- Centre for Research on Filariasis and other Tropical Diseases (CRFilMT), Yaoundé, Cameroon
- Faculty of Medicine and Biomedical Sciences, University of Yaoundé 1, Yaoundé, Cameroon
| | - Samuel Wanji
- Research Foundation in Tropical Diseases and the Environment (REFOTDE), Buea, Cameroon
| | - Hua Che
- Institute of Parasitology, McGill University, Sainte Anne-de-Bellevue, Québec, Canada
| | - Annette C. Kuesel
- UNICEF/UNDP/World Bank/World Health Organization Special Programme for Research and Training in Tropical Diseases (WHO/TDR), World Health Organization, Geneva, Switzerland
| | - Martin Walker
- London Centre for Neglected Tropical Disease Research, Department of Infectious Disease Epidemiology, Faculty of Medicine, School of Public Health, Imperial College London, United Kingdom
| | - Maria-Gloria Basáñez
- London Centre for Neglected Tropical Disease Research, Department of Infectious Disease Epidemiology, Faculty of Medicine, School of Public Health, Imperial College London, United Kingdom
| | - Daniel A. Boakye
- Noguchi Memorial Institute for Medical Research, University of Ghana, Legon, Ghana
| | - Mike Y. Osei-Atweneboana
- Department of Environmental Biology and Health Water Research Institute, Council for Scientific and Industrial Research (CSIR), Accra, Ghana
| | - Michel Boussinesq
- Institut de Recherche pour le Développement (IRD), IRD UMI 233 TransVIHMI – Université Montpellier – INSERM U1175, Montpellier, France
| | - Roger K. Prichard
- Institute of Parasitology, McGill University, Sainte Anne-de-Bellevue, Québec, Canada
- * E-mail: (SRD); (RKP); (WNG)
| | - Warwick N. Grant
- Department of Animal, Plant and Soil Sciences, La Trobe University, Bundoora, Australia
- * E-mail: (SRD); (RKP); (WNG)
|
37
|
Zhou S, Luoma SE, St. Armour GE, Thakkar E, Mackay TFC, Anholt RRH. A Drosophila model for toxicogenomics: Genetic variation in susceptibility to heavy metal exposure. PLoS Genet 2017; 13:e1006907. [PMID: 28732062 PMCID: PMC5544243 DOI: 10.1371/journal.pgen.1006907] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 01/26/2017] [Revised: 08/04/2017] [Accepted: 07/06/2017] [Indexed: 12/20/2022] Open
Abstract
The genetic factors that give rise to variation in susceptibility to environmental toxins remain largely unexplored. Studies on genetic variation in susceptibility to environmental toxins are challenging in human populations, due to the variety of clinical symptoms and difficulty in determining which symptoms causally result from toxic exposure; uncontrolled environments, often with exposure to multiple toxicants; and difficulty in relating phenotypic effect size to toxic dose, especially when symptoms become manifest with a substantial time lag. Drosophila melanogaster is a powerful model that enables genome-wide studies for the identification of allelic variants that contribute to variation in susceptibility to environmental toxins, since the genetic background, environmental rearing conditions and toxic exposure can be precisely controlled. Here, we used extreme QTL mapping in an outbred population derived from the D. melanogaster Genetic Reference Panel to identify alleles associated with resistance to lead and/or cadmium, two ubiquitous environmental toxins that present serious health risks. We identified single nucleotide polymorphisms (SNPs) associated with variation in resistance to both heavy metals as well as SNPs associated with resistance specific to each of them. The effects of these SNPs were largely sex-specific. We applied mutational and RNAi analyses to 33 candidate genes and functionally validated 28 of them. We constructed networks of candidate genes as blueprints for orthologous networks of human genes. The latter not only provided functional contexts for known human targets of heavy metal toxicity, but also implicated novel candidate susceptibility genes. These studies validate Drosophila as a translational toxicogenomics gene discovery system. Although physiological effects of environmental toxins are well documented, we know little about the genetic factors that determine individual variation in susceptibility to toxins. 
Such information is difficult to obtain in human populations due to heterogeneity in genetic background and environmental exposure, and the diversity of symptoms and time lag with which they appear after toxic exposure. Here, we show that the fruit fly, Drosophila, can serve as a powerful genetic model system to elucidate the genetic underpinnings that contribute to individual variation in resistance to toxicity, using lead and cadmium exposure as an experimental paradigm. We identified genes that harbor genetic variants that contribute to individual variation in resistance to heavy metal exposure. Furthermore, we constructed genetic networks on which we could superimpose human counterparts of Drosophila genes. We were able to place human genes previously implicated in heavy metal toxicity in biological context and identify novel targets for heavy metal toxicity. Thus, we demonstrate that based on evolutionary conservation of fundamental biological processes, we can use Drosophila as a powerful translational model for toxicogenomics studies.
Affiliation(s)
- Shanshan Zhou
- Program in Genetics, W. M. Keck Center for Behavioral Biology, and Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, United States of America
| | - Sarah E. Luoma
- Program in Genetics, W. M. Keck Center for Behavioral Biology, and Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, United States of America
| | - Genevieve E. St. Armour
- Program in Genetics, W. M. Keck Center for Behavioral Biology, and Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, United States of America
| | - Esha Thakkar
- Enloe Magnet High School, Raleigh, North Carolina, United States of America
| | - Trudy F. C. Mackay
- Program in Genetics, W. M. Keck Center for Behavioral Biology, and Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, United States of America
| | - Robert R. H. Anholt
- Program in Genetics, W. M. Keck Center for Behavioral Biology, and Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, United States of America
|
38
|
Garlapow ME, Everett LJ, Zhou S, Gearhart AW, Fay KA, Huang W, Morozova TV, Arya GH, Turlapati L, St Armour G, Hussain YN, McAdams SE, Fochler S, Mackay TFC. Genetic and Genomic Response to Selection for Food Consumption in Drosophila melanogaster. Behav Genet 2017; 47:227-243. [PMID: 27704301 PMCID: PMC5305434 DOI: 10.1007/s10519-016-9819-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 11/16/2015] [Accepted: 09/16/2016] [Indexed: 12/21/2022]
Abstract
Food consumption is an essential component of animal fitness; however, excessive food intake in humans increases risk for many diseases. The roles of neuroendocrine feedback loops, food sensing modalities, and physiological state in regulating food intake are well understood, but not the genetic basis underlying variation in food consumption. Here, we applied ten generations of artificial selection for high and low food consumption in replicate populations of Drosophila melanogaster. The phenotypic response to selection was highly asymmetric, with significant responses only for increased food consumption and minimal correlated responses in body mass and composition. We assessed the molecular correlates of selection responses by DNA and RNA sequencing of the selection lines. The high and low selection lines had variants with significantly divergent allele frequencies within or near 2081 genes and 3526 differentially expressed genes in one or both sexes. A total of 519 genes were both genetically divergent and differentially expressed between the divergent selection lines. We performed functional analyses of the effects of RNAi suppression of gene expression and induced mutations for 27 of these candidate genes that have human orthologs and the strongest statistical support, and confirmed that 25 (93 %) affected the mean and/or variance of food consumption.
Affiliation(s)
- Megan E Garlapow
- Program in Genetics, North Carolina State University, Raleigh, NC, 27695-7614, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Logan J Everett
- Program in Genetics, North Carolina State University, Raleigh, NC, 27695-7614, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
- Initiative for Biological Complexity, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Shanshan Zhou
- Program in Genetics, North Carolina State University, Raleigh, NC, 27695-7614, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
- Initiative for Biological Complexity, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Alexander W Gearhart
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Kairsten A Fay
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Wen Huang
- Program in Genetics, North Carolina State University, Raleigh, NC, 27695-7614, USA
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
- Initiative for Biological Complexity, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Tatiana V Morozova
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Gunjan H Arya
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Lavanya Turlapati
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Genevieve St Armour
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Yasmeen N Hussain
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Sarah E McAdams
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
| | - Sophia Fochler
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA
- School of Biosciences and Medicine, Faculty of Health and Medical Sciences, University of Surrey, Guildford, UK
| | - Trudy F C Mackay
- Program in Genetics, North Carolina State University, Raleigh, NC, 27695-7614, USA.
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, 27695-7614, USA.
- W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, NC, 27695-7614, USA.
- Initiative for Biological Complexity, North Carolina State University, Raleigh, NC, 27695-7614, USA.
|
39
|
Variational inference for rare variant detection in deep, heterogeneous next-generation sequencing data. BMC Bioinformatics 2017; 18:45. [PMID: 28103803 PMCID: PMC5244592 DOI: 10.1186/s12859-016-1451-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/17/2016] [Accepted: 12/22/2016] [Indexed: 01/09/2023] Open
Abstract
Background The detection of rare single nucleotide variants (SNVs) is important for understanding genetic heterogeneity using next-generation sequencing (NGS) data. Various computational algorithms have been proposed to detect variants at the single nucleotide level in mixed samples. Yet, the noise inherent in the biological processes involved in NGS technology necessitates the development of statistically accurate methods to identify true rare variants. Results We propose a Bayesian statistical model and a variational expectation maximization (EM) algorithm to estimate non-reference allele frequency (NRAF) and identify SNVs in heterogeneous cell populations. We demonstrate that our variational EM algorithm has sensitivity and specificity comparable to those of a Markov Chain Monte Carlo (MCMC) sampling inference algorithm, and is more computationally efficient on tests of relatively low coverage (27× and 298×) data. Furthermore, we show that our model with a variational EM inference algorithm has higher specificity than many state-of-the-art algorithms. In an analysis of a directed evolution longitudinal yeast data set, we are able to identify a time-series trend in non-reference allele frequency and detect novel variants that have not yet been reported. Our model also detects the emergence of a beneficial variant earlier than was previously shown, and a pair of concomitant variants. Conclusions We developed a variational EM algorithm for a hierarchical Bayesian model to identify rare variants in heterogeneous next-generation sequencing data. Our algorithm is able to identify variants in a broad range of read depths and non-reference allele frequencies with high sensitivity and specificity.
|
40
|
Rudewicz J, Soueidan H, Uricaru R, Bonnefoi H, Iggo R, Bergh J, Nikolski M. MICADo - Looking for Mutations in Targeted PacBio Cancer Data: An Alignment-Free Method. Front Genet 2016; 7:214. [PMID: 28008336 PMCID: PMC5143680 DOI: 10.3389/fgene.2016.00214] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/07/2016] [Accepted: 11/23/2016] [Indexed: 12/11/2022] Open
Abstract
Targeted sequencing is commonly used in clinical applications of NGS technology since it enables generation of sufficient sequencing depth in the targeted genes of interest and thus ensures the best possible downstream analysis. This notwithstanding, the accurate discovery and annotation of disease-causing mutations remains a challenging problem even in such a favorable context. The difficulty is particularly salient in the case of third generation sequencing technology, such as PacBio. We present MICADo, a de Bruijn graph-based method, implemented in Python, that makes it possible to distinguish between patient-specific mutations and other alterations for targeted sequencing of a cohort of patients. MICADo analyses NGS reads for each sample within the context of the data of the whole cohort in order to capture the differences between specificities of the sample with respect to the cohort. MICADo is particularly suitable for sequencing data from highly heterogeneous samples, especially when it involves high rates of non-uniform sequencing errors. It was validated on PacBio sequencing datasets from several cohorts of patients. The comparison with two widely used available tools, namely VarScan and GATK, shows that MICADo is more accurate, especially when true mutations have frequencies close to background noise. The source code is available at http://github.com/cbib/MICADo.
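As a rough illustration of the de Bruijn graph idea mentioned in the abstract, the sketch below builds a graph of (k-1)-mers from a few toy reads and lists nodes with more than one successor, which is where a variant (or a sequencing error) makes the graph branch. This is not MICADo's implementation; the function names, the toy reads, and the choice of k are assumptions made for illustration only.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to its successor (k-1)-mers with edge counts."""
    graph = defaultdict(lambda: defaultdict(int))
    for read in reads:
        # Slide a window of length k over the read; each k-mer is one edge.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += 1
    return graph


def branching_nodes(graph):
    """Return nodes with more than one distinct successor (candidate variant sites)."""
    return sorted(node for node, succ in graph.items() if len(succ) > 1)


# Toy data (hypothetical): the third read carries a single substitution.
reads = ["ACGTAC", "ACGTAC", "ACGAAC"]
graph = build_de_bruijn(reads, k=4)
candidates = branching_nodes(graph)
```

The edge counts give a crude measure of support for each branch; a real tool would additionally need an error model to separate low-frequency mutations from noise.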
Affiliation(s)
- Justine Rudewicz
- Centre de BioInformatique de Bordeaux, University of Bordeaux, Bordeaux, France; Laboratoire Bordelais de Recherche en Informatique, Centre National de la Recherche Scientifique, University of Bordeaux, Bordeaux, France; Bergonié Cancer Institute, Institut National de la Santé et de la Recherche Médicale U1218, University of Bordeaux, Bordeaux, France
| | - Hayssam Soueidan
- Centre de BioInformatique de Bordeaux, University of Bordeaux, Bordeaux, France; Laboratoire Bordelais de Recherche en Informatique, Centre National de la Recherche Scientifique, University of Bordeaux, Bordeaux, France
| | - Raluca Uricaru
- Centre de BioInformatique de Bordeaux, University of Bordeaux, Bordeaux, France; Laboratoire Bordelais de Recherche en Informatique, Centre National de la Recherche Scientifique, University of Bordeaux, Bordeaux, France
| | - Hervé Bonnefoi
- Bergonié Cancer Institute, Institut National de la Santé et de la Recherche Médicale U1218, University of Bordeaux, Bordeaux, France
| | - Richard Iggo
- Bergonié Cancer Institute, Institut National de la Santé et de la Recherche Médicale U1218, University of Bordeaux, Bordeaux, France
| | - Jonas Bergh
- Karolinska Institute and University Hospital, Stockholm, Sweden
| | - Macha Nikolski
- Centre de BioInformatique de Bordeaux, University of Bordeaux, Bordeaux, France; Laboratoire Bordelais de Recherche en Informatique, Centre National de la Recherche Scientifique, University of Bordeaux, Bordeaux, France
|
41
|
Wang X, Sui W, Wu W, Hou X, Ou M, Xiang Y, Dai Y. Whole-genome resequencing of 100 healthy individuals using DNA pooling. Exp Ther Med 2016; 12:3143-3150. [PMID: 27882129 PMCID: PMC5103757 DOI: 10.3892/etm.2016.3797] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/24/2015] [Accepted: 08/11/2016] [Indexed: 12/27/2022] Open
Abstract
With the advent of next-generation sequencing technology, the cost of sequencing has significantly decreased. However, sequencing costs remain high for large-scale studies. In the present study, DNA pooling was applied as a cost-effective strategy for sequencing. The sequencing results for 100 healthy individuals obtained via whole-genome resequencing and using DNA pooling are presented in the present study. In order to minimise the likelihood of systematic bias in sampling, paired-end libraries with an insert size of 500 bp were prepared for all samples and then subjected to whole-genome sequencing using four lanes for each library, resulting in at least a 30-fold haploid coverage for each sample. The NCBI human genome build 37 (hg19) was used as a reference genome for the present study and the short reads were aligned to the reference genome, achieving 99.84% coverage. In addition, the average sequencing depth was 32.76. In total, ~3 million single-nucleotide polymorphisms were identified, of which 99.88% were in the NCBI dbSNP database. Furthermore, ~600,000 small insertion/deletions, 500,000 structural variants, 5,000 copy number variations and 13,000 single nucleotide variants were identified. According to the present study, the whole genome has been sequenced for a small sample of subjects from southern China for the first time. Furthermore, new variation sites were identified by comparison with the reference sequence, and new knowledge of human genome variation was added to the human genomic databases. Furthermore, the particular distribution regions of variation were illustrated by analyzing various sites of variation, such as single-nucleotide polymorphisms.
Affiliation(s)
- Xiaobin Wang
- Health Management Centre, The Affiliated Guilin Hospital, Southern Medical University, Guilin, Guangxi 541000, P.R. China; Guangxi Key Laboratory of Metabolic Diseases Research, Guilin, Guangxi 541000, P.R. China
| | - Weiguo Sui
- Guangxi Key Laboratory of Metabolic Diseases Research, Guilin, Guangxi 541000, P.R. China; Department of Nephrology, Guilin 181st Hospital, Guilin, Guangxi 541000, P.R. China
| | - Weiqing Wu
- Health Management Centre, The Second Clinical Medical College, Jinan University, Shenzhen, Guangdong 518001, P.R. China
| | - Xianliang Hou
- Guangxi Key Laboratory of Metabolic Diseases Research, Guilin, Guangxi 541000, P.R. China; Department of Nephrology, Guilin 181st Hospital, Guilin, Guangxi 541000, P.R. China; College of Life Science, Guangxi Normal University, Guilin, Guangxi 541001, P.R. China
| | - Minglin Ou
- Guangxi Key Laboratory of Metabolic Diseases Research, Guilin, Guangxi 541000, P.R. China; Department of Nephrology, Guilin 181st Hospital, Guilin, Guangxi 541000, P.R. China
| | - Yueying Xiang
- Health Management Centre, The Affiliated Guilin Hospital, Southern Medical University, Guilin, Guangxi 541000, P.R. China
| | - Yong Dai
- Guangxi Key Laboratory of Metabolic Diseases Research, Guilin, Guangxi 541000, P.R. China; Department of Nephrology, Guilin 181st Hospital, Guilin, Guangxi 541000, P.R. China; Clinical Medical Research Center, The Second Clinical Medical College, Jinan University, Shenzhen, Guangdong 518001, P.R. China
|
42
|
From next-generation resequencing reads to a high-quality variant data set. Heredity (Edinb) 2016; 118:111-124. [PMID: 27759079 DOI: 10.1038/hdy.2016.102] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 04/24/2016] [Revised: 09/03/2016] [Accepted: 09/06/2016] [Indexed: 12/11/2022] Open
Abstract
Sequencing has revolutionized biology by permitting the analysis of genomic variation at an unprecedented resolution. High-throughput sequencing is fast and inexpensive, making it accessible for a wide range of research topics. However, the produced data contain subtle but complex types of errors, biases and uncertainties that impose several statistical and computational challenges to the reliable detection of variants. To tap the full potential of high-throughput sequencing, a thorough understanding of the data produced as well as the available methodologies is required. Here, I review several commonly used methods for generating and processing next-generation resequencing data, discuss the influence of errors and biases together with their resulting implications for downstream analyses and provide general guidelines and recommendations for producing high-quality single-nucleotide polymorphism data sets from raw reads by highlighting several sophisticated reference-based methods representing the current state of the art.
|
43
|
Skums P, Artyomenko A, Glebova O, Ramachandran S, Campo DS, Dimitrova Z, Măndoiu II, Zelikovsky A, Khudyakov Y. Pooling Strategy for Massive Viral Sequencing. COMPUTATIONAL METHODS FOR NEXT GENERATION SEQUENCING DATA ANALYSIS 2016:57-83. [DOI: 10.1002/9781119272182.ch3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/02/2025]
|
44
|
Next Generation Sequencing of Pooled Samples: Guideline for Variants' Filtering. Sci Rep 2016; 6:33735. [PMID: 27670852 PMCID: PMC5037392 DOI: 10.1038/srep33735] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Received: 11/04/2015] [Accepted: 08/30/2016] [Indexed: 02/07/2023] Open
Abstract
Sequencing a large number of individuals, which is often needed for population genetics studies, is still economically challenging despite the falling costs of Next Generation Sequencing (NGS). Pool-seq is an alternative cost- and time-effective option in which DNA from several individuals is pooled for sequencing. However, pooling of DNA creates new problems and challenges for accurate variant calling and allele frequency (AF) estimation. In particular, sequencing errors are confounded with alleles present at low frequency in the pools, possibly giving rise to false-positive variants. We sequenced 996 individuals in 83 pools (12 individuals/pool) in a targeted re-sequencing experiment. We show that Pool-seq AFs are robust and reliable by comparing them with public variant databases and in-house SNP-genotyping data of the individual subjects in the pools. Furthermore, we propose a simple filtering guideline for the removal of spurious variants based on the Kolmogorov-Smirnov statistical test. We experimentally validated our filters by comparing Pool-seq to individual sequencing data, showing that the filters remove most of the false variants while retaining the majority of true variants. The proposed guideline is fairly generic in nature and could be easily applied in other Pool-seq experiments.
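The abstract does not say which distributions the Kolmogorov-Smirnov test is applied to, so the following is only a sketch of the statistic itself. It assumes, purely for illustration, that the base qualities of reads supporting the alternate allele are compared with those supporting the reference allele; the function name and the quality values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking those is enough.
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, v) / len(a)
        cdf_b = bisect_right(b, v) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical base qualities for one candidate variant in one pool.
ref_quals = [38, 37, 39, 40, 36]
alt_quals = [12, 15, 38, 11, 14]
d = ks_statistic(ref_quals, alt_quals)
```

A large statistic means the two sets of reads look systematically different, which under this assumed setup would flag the variant as a likely artifact. Turning the statistic into a p-value and choosing a cut-off are left out here.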
|
45
|
Jakaitiene A, Avino M, Guarracino MR. Beta-Binomial Model for the Detection of Rare Mutations in Pooled Next-Generation Sequencing Experiments. J Comput Biol 2016; 24:357-367. [PMID: 27632638 DOI: 10.1089/cmb.2016.0106] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 12/30/2022] Open
Abstract
Despite diminishing costs, next-generation sequencing (NGS) remains expensive for studies with a large number of individuals. To save costs, pools containing multiple samples can be sequenced instead. Many software tools are currently available for the detection of single-nucleotide polymorphisms (SNPs). Sensitivity and specificity depend on the model used and the data analyzed, indicating that all of these tools have room for improvement. We use a beta-binomial model to detect rare mutations in untagged pooled NGS experiments. We propose a multireference framework for pooled data with the ability to be specific for up to two patients affected by neuromuscular disorders (NMD). We assessed the results by comparison with The Genome Analysis Toolkit (GATK), CRISP, SNVer, and FreeBayes. Our results show that the multireference approach applying the beta-binomial model is accurate in predicting rare mutations at a fraction of 0.01. Finally, we explored the concordance of mutations between the model and the other software, checking their involvement in any NMD-related gene. We detected seven novel SNPs, for which the functional analysis produced enriched terms related to locomotion and musculature.
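To make the beta-binomial idea concrete, here is a minimal sketch of its probability mass function and an upper-tail probability, which could be used to ask how surprising an observed number of non-reference reads is under an error model. The parameters `alpha` and `beta` and the way they would be fitted are assumptions; the abstract does not describe the authors' actual parameterisation or multireference procedure.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def betabinom_pmf(k, n, alpha, beta):
    """P(X = k) for X ~ BetaBinomial(n, alpha, beta)."""
    return exp(log_choose(n, k) + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


def upper_tail(k, n, alpha, beta):
    """P(X >= k): chance of seeing at least k non-reference reads out of n."""
    return sum(betabinom_pmf(i, n, alpha, beta) for i in range(k, n + 1))
```

For example, with hypothetical error-model parameters `alpha=1, beta=199` (mean error rate 0.005), `upper_tail(10, 1000, 1, 199)` gives the probability of observing ten or more non-reference reads at a depth of 1000 from errors alone.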
Affiliation(s)
- Audrone Jakaitiene
- Bioinformatics and Biostatistics Center, Department of Human and Medical Genetics, Faculty of Medicine, Vilnius University, Vilnius, Lithuania
| | - Mariano Avino
- High Performance Computing and Networking Institute, National Research Council, Naples, Italy
| | - Mario Rosario Guarracino
- High Performance Computing and Networking Institute, National Research Council, Naples, Italy
|
46
|
Fu Y, Jovelet C, Filleron T, Pedrero M, Motté N, Boursin Y, Luo Y, Massard C, Campone M, Levy C, Diéras V, Bachelot T, Garrabey J, Soria JC, Lacroix L, André F, Lefebvre C. Improving the Performance of Somatic Mutation Identification by Recovering Circulating Tumor DNA Mutations. Cancer Res 2016; 76:5954-5961. [DOI: 10.1158/0008-5472.can-15-3457] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 12/24/2015] [Accepted: 07/28/2016] [Indexed: 11/16/2022]
|
47
|
Identifying Genetic Signatures of Natural Selection Using Pooled Population Sequencing in Picea abies. G3-GENES GENOMES GENETICS 2016; 6:1979-89. [PMID: 27172202 PMCID: PMC4938651 DOI: 10.1534/g3.116.028753] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 12/25/2022]
Abstract
The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence between the two domains based on synonymous changes was around 5 million yr, a time similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and considering both FST and the difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets of selection, such as PaPRR3 and PaGI.
|
48
|
Lan F, Haliburton JR, Yuan A, Abate AR. Droplet barcoding for massively parallel single-molecule deep sequencing. Nat Commun 2016; 7:11784. [PMID: 27353563 PMCID: PMC4931254 DOI: 10.1038/ncomms11784] [Citation(s) in RCA: 72] [Impact Index Per Article: 8.0] [Received: 10/29/2015] [Accepted: 04/28/2016] [Indexed: 02/08/2023] Open
Abstract
The ability to accurately sequence long DNA molecules is important across biology, but existing sequencers are limited in read length and accuracy. Here, we demonstrate a method to leverage short-read sequencing to obtain long and accurate reads. Using droplet microfluidics, we isolate, amplify, fragment and barcode single DNA molecules in aqueous picolitre droplets, allowing the full-length molecules to be sequenced with multi-fold coverage using short-read sequencing. We show that this approach can provide accurate sequences of up to 10 kb, allowing us to identify rare mutations below the detection limit of conventional sequencing and directly link them into haplotypes. This barcoding methodology can be a powerful tool in sequencing heterogeneous populations such as viruses. The ability to accurately sequence long DNA molecules is important across biology. Here, Lan et al. report a droplet-based method that barcodes single DNA molecules, allowing the full-length molecules to be sequenced with multi-fold coverage using short-read next-generation sequencing.
Affiliation(s)
- Freeman Lan
- Department of Bioengineering and Therapeutic Sciences, California Institute for Quantitative Biosciences (QB3), University of California, San Francisco, California 94158, USA; UC Berkeley - UCSF Bioengineering Graduate program, University of California, San Francisco, California 94158, USA
- John R Haliburton
- Department of Bioengineering and Therapeutic Sciences, California Institute for Quantitative Biosciences (QB3), University of California, San Francisco, California 94158, USA; Integrative Program in Quantitative Biology (iPQB) Biophysics Graduate program, University of California, San Francisco, California 94158, USA
- Aaron Yuan
- Department of Bioengineering and Therapeutic Sciences, California Institute for Quantitative Biosciences (QB3), University of California, San Francisco, California 94158, USA; Department of Electrical Engineering and Computer Sciences (EECS), Computer Science Division (CS), University of California, Berkeley, California 94720, USA
- Adam R Abate
- Department of Bioengineering and Therapeutic Sciences, California Institute for Quantitative Biosciences (QB3), University of California, San Francisco, California 94158, USA; UC Berkeley - UCSF Bioengineering Graduate program, University of California, San Francisco, California 94158, USA; Integrative Program in Quantitative Biology (iPQB) Biophysics Graduate program, University of California, San Francisco, California 94158, USA
|
49
|
Fountain ED, Pauli JN, Reid BN, Palsbøll PJ, Peery MZ. Finding the right coverage: the impact of coverage and sequence quality on single nucleotide polymorphism genotyping error rates. Mol Ecol Resour 2016; 16:966-78. [DOI: 10.1111/1755-0998.12519] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 03/23/2015] [Revised: 02/10/2016] [Accepted: 02/11/2016] [Indexed: 12/13/2022]
Affiliation(s)
- Emily D. Fountain
- Department of Forest and Wildlife Ecology, University of Wisconsin‐Madison, Madison, WI 53706, USA
- Jonathan N. Pauli
- Department of Forest and Wildlife Ecology, University of Wisconsin‐Madison, Madison, WI 53706, USA
- Brendan N. Reid
- Department of Forest and Wildlife Ecology, University of Wisconsin‐Madison, Madison, WI 53706, USA
- Per J. Palsbøll
- Marine Evolution and Conservation, Groningen Institute of Evolutionary Life Sciences, University of Groningen, Groningen 9747 AG, The Netherlands
- M. Zachariah Peery
- Department of Forest and Wildlife Ecology, University of Wisconsin‐Madison, Madison, WI 53706, USA
|
50
|
Machado HE, Bergland AO, O'Brien KR, Behrman EL, Schmidt PS, Petrov DA. Comparative population genomics of latitudinal variation in Drosophila simulans and Drosophila melanogaster. Mol Ecol 2016; 25:723-40. [PMID: 26523848 DOI: 10.1111/mec.13446] [Citation(s) in RCA: 111] [Impact Index Per Article: 12.3] [Received: 04/28/2015] [Revised: 10/26/2015] [Accepted: 10/28/2015] [Indexed: 12/15/2022]
Abstract
Examples of clinal variation in phenotypes and genotypes across latitudinal transects have served as important models for understanding how spatially varying selection and demographic forces shape variation within species. Here, we examine the selective and demographic contributions to latitudinal variation through the largest comparative genomic study to date of Drosophila simulans and Drosophila melanogaster, with genomic sequence data from 382 individual fruit flies, collected across a spatial transect of 19 degrees latitude and at multiple time points over 2 years. Consistent with phenotypic studies, we find less clinal variation in D. simulans than D. melanogaster, particularly for the autosomes. Moreover, we find that clinally varying loci in D. simulans are less stable over multiple years than comparable clines in D. melanogaster. D. simulans shows a significantly weaker pattern of isolation by distance than D. melanogaster and we find evidence for a stronger contribution of migration to D. simulans population genetic structure. While population bottlenecks and migration can plausibly explain the differences in stability of clinal variation between the two species, we also observe a significant enrichment of shared clinal genes, suggesting that the selective forces associated with climate are acting on the same genes and phenotypes in D. simulans and D. melanogaster.
Affiliation(s)
- Heather E Machado
- Department of Biology, Stanford University, 371 Serra Mall, Stanford, CA, 94305-5020, USA
- Alan O Bergland
- Department of Biology, Stanford University, 371 Serra Mall, Stanford, CA, 94305-5020, USA
- Katherine R O'Brien
- School of Biological Sciences, University of Nebraska-Lincoln, 348 Manter Hall, Lincoln, NE, 68588, USA; Department of Biology, University of Pennsylvania, 102 Leidy Laboratories, Philadelphia, PA, 19104-6313, USA
- Emily L Behrman
- Department of Biology, University of Pennsylvania, 102 Leidy Laboratories, Philadelphia, PA, 19104-6313, USA
- Paul S Schmidt
- Department of Biology, University of Pennsylvania, 102 Leidy Laboratories, Philadelphia, PA, 19104-6313, USA
- Dmitri A Petrov
- Department of Biology, Stanford University, 371 Serra Mall, Stanford, CA, 94305-5020, USA
|