1
|
Gurgul A, Szmatoła T, Ocłoń E, Jasielczuk I, Semik-Gurgul E, Finno CJ, Petersen JL, Bellone R, Hales EN, Ząbek T, Arent Z, Kotula-Balak M, Bugno-Poniewierska M. Another lesson from unmapped reads: in-depth analysis of RNA-Seq reads from various horse tissues. J Appl Genet 2022; 63:571-581. [PMID: 35670911 DOI: 10.1007/s13353-022-00705-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Revised: 04/27/2022] [Accepted: 05/31/2022] [Indexed: 11/25/2022]
Abstract
In recent years, a vast amount of sequencing data has been generated and large improvements have been made to reference genome sequences. Despite these advances, significant portions of reads still do not map to reference genomes, and these reads have been considered junk or artificial sequences. Recent studies have shown that these reads can be useful, e.g., for refining reference genomes or detecting contaminating microorganisms present in the analyzed biological samples. A special case is RNA sequencing (RNA-Seq) reads that come from tissue transcriptomes. Unmapped reads from RNA-Seq have received much less attention than those from whole-genome sequencing. In particular, an analysis of unmapped RNA reads has not yet been performed in the horse. Thus, in this study, we analyzed the unmapped reads originating from the RNA-Seq performed through the Functional Annotation of Animal Genomes (FAANG) project in the horse, using eight different tissues from two mares. We demonstrated that unmapped reads from RNA-Seq could be easily assembled into transcripts relating to many important genes present in the sequences of other mammals. Large portions of these transcripts did not have coding potential and, thus, can be considered non-coding RNA. Moreover, reads that were not mapped to the reference genome but aligned to entries in the NCBI database of horse proteins were enriched for biological processes that largely correspond to the functions of the organ from which the RNA was isolated and thus are presumably true transcripts of genes associated with cell metabolism in those tissues. In addition, a portion of reads aligned to common pathogenic or neutral microbiota, of which the most common was Brucella spp. These data suggest that unmapped reads can be an important target for in-depth analysis that may substantially enrich the results of initial RNA-Seq experiments for various tissues and organs.
Affiliation(s)
- Artur Gurgul
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
- Tomasz Szmatoła
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
- Ewa Ocłoń
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
- Igor Jasielczuk
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
- Ewelina Semik-Gurgul
  - Department of Animal Molecular Biology, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland
- Carrie J Finno
  - Department of Population Health and Reproduction, University of California Davis School of Veterinary Medicine, Davis, CA, USA
- Jessica L Petersen
  - Department of Animal Science, University of Nebraska Lincoln, Lincoln, NB, USA
- Rebecca Bellone
  - Department of Population Health and Reproduction, University of California Davis School of Veterinary Medicine, Davis, CA, USA
  - Veterinary Genetics Laboratory, University of California Davis School of Veterinary Medicine, Davis, CA, USA
- Erin N Hales
  - Department of Population Health and Reproduction, University of California Davis School of Veterinary Medicine, Davis, CA, USA
- Tomasz Ząbek
  - Department of Animal Molecular Biology, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland
- Zbigniew Arent
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
- Małgorzata Kotula-Balak
  - University Centre of Veterinary Medicine, University of Agriculture in Krakow, Mickiewicza 24/28, 30-059, Krakow, Poland
- Monika Bugno-Poniewierska
  - Department of Animal Reproduction, Anatomy and Genomics, University of Agriculture in Kraków, al. Mickiewicza 24/28, 30-059, Kraków, Poland
|
2
|
Huang Y, Yuan C, Zhao Y, Li C, Cao M, Li H, Zhao Z, Sun A, Basang W, Zhu Y, Chen L, He F, Huan C, Zhang B, Iqbal T, Wei Y, Fan W, Yi K, Zhou X. Identification and Regulatory Network Analysis of Genes Related to Reproductive Performance in the Hypothalamus and Pituitary of Angus Cattle. Genes (Basel) 2022; 13:genes13060965. [PMID: 35741727 PMCID: PMC9222274 DOI: 10.3390/genes13060965] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2022] [Revised: 05/16/2022] [Accepted: 05/25/2022] [Indexed: 11/30/2022] Open
Abstract
In this study, we explored the gene expression patterns of the pituitary gland and hypothalamus of Angus cows at different growth and developmental stages by deep sequencing, and we identified genes that affect bovine reproductive performance to provide new ideas for improving bovine fertility in production practice. We selected three 6-month-old (weaning period), three 18-month-old (first mating period), and three 30-month-old (early postpartum) Angus cattle. The physiological status of the cows in each group was the same, and their body conformations were similar. After quality control of the sequencing, the transcriptome analyses of 18 samples yielded 129.18 GB of clean data. We detected 13,280 and 13,318 expressed genes in the pituitary gland and hypothalamus, respectively, and screened 35 and 50 differentially expressed genes (DEGs) for each, respectively. The differentially expressed genes in both tissues were mainly engaged in metabolism, lipid synthesis, and immune-related pathways in the 18-month-old cows as compared with the 6-month-old cows. The 30-month-old cows presented more regulated reproductive behavior, and pituitary CAMK4 was the main factor regulating the reproductive behavior during this period via the pathways for calcium signaling, longevity, oxytocin, and aldosterone synthesis and secretion. A variant calling analysis was also performed. SNP transitions and transversions in each sample were counted according to the type of base substitution. In all samples, most base substitutions were between bases A and G, and the proportion of transitions exceeded 70%, far exceeding that of transversions. The proportion of heterozygous SNP sites exceeded 37.68%.
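As an illustration of the substitution counting mentioned in this abstract, the sketch below classifies single-base changes as transitions (purine to purine or pyrimidine to pyrimidine, such as A and G) or transversions and reports the transition fraction. It is a minimal sketch, not the authors' pipeline: the function names and the `example_snps` list are hypothetical and chosen only for illustration.

```python
# Purines and pyrimidines; a transition keeps the base within one class.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def transition_fraction(snps) -> float:
    """Fraction of (ref, alt) pairs that are transitions."""
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in snps:
        counts[classify_substitution(ref, alt)] += 1
    total = counts["transition"] + counts["transversion"]
    return counts["transition"] / total if total else 0.0


# Hypothetical toy input: three transitions and one transversion.
example_snps = [("A", "G"), ("G", "A"), ("C", "T"), ("A", "C")]
print(transition_fraction(example_snps))
```

In practice the (ref, alt) pairs would come from the REF and ALT columns of a variant call file restricted to biallelic SNPs; that input step is left out here.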
Affiliation(s)
- Yuwen Huang
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
- Chenfeng Yuan
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
- Yun Zhao
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
- Chunjin Li
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
- Maosheng Cao
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
- Haobang Li
  - Hunan Institute of Animal and Veterinary Science, 8 Changliang Road, Changsha 410131, China
- Zijiao Zhao
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
- Ao Sun
  - Hunan Institute of Animal and Veterinary Science, 8 Changliang Road, Changsha 410131, China
- Wangdui Basang
  - Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa 850002, China
- Yanbin Zhu
  - Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa 850002, China
- Lu Chen
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
- Fang He
  - Hunan Institute of Animal and Veterinary Science, 8 Changliang Road, Changsha 410131, China
- Cheng Huan
  - Hunan Institute of Animal and Veterinary Science, 8 Changliang Road, Changsha 410131, China
- Boqi Zhang
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
- Tariq Iqbal
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
- Yamen Wei
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
- Wenjing Fan
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
- Kangle Yi
  - Hunan Institute of Animal and Veterinary Science, 8 Changliang Road, Changsha 410131, China
  - Correspondence: (K.Y.); (X.Z.)
- Xu Zhou
  - Jilin Provincial Key Laboratory of Animal Embryo Engineering, College of Animal Science and Veterinary Medicine, Jilin University, 5333 Xi’an Avenue, Changchun 130062, China
  - Correspondence: (K.Y.); (X.Z.)
|
3
|
Bovo S, Schiavo G, Bolner M, Ballan M, Fontanesi L. Mining livestock genome datasets for an unconventional characterization of animal DNA viromes. Genomics 2022; 114:110312. [DOI: 10.1016/j.ygeno.2022.110312] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/26/2021] [Revised: 01/16/2022] [Accepted: 02/06/2022] [Indexed: 11/04/2022]
|
4
|
Chen S, Ren C, Zhai J, Yu J, Zhao X, Li Z, Zhang T, Ma W, Han Z, Ma C. CAFU: a Galaxy framework for exploring unmapped RNA-Seq data. Brief Bioinform 2021; 21:676-686. [PMID: 30815667 PMCID: PMC7299299 DOI: 10.1093/bib/bbz018] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/15/2018] [Revised: 01/23/2019] [Accepted: 01/27/2019] [Indexed: 12/13/2022] Open
Abstract
A widely used approach in transcriptome analysis is the alignment of short reads to a reference genome. However, owing to the deficiencies of specially designed analytical systems, short reads unmapped to the genome sequence are usually ignored, resulting in the loss of significant biological information and insights. To fill this gap, we present Comprehensive Assembly and Functional annotation of Unmapped RNA-Seq data (CAFU), a Galaxy-based framework that can facilitate the large-scale analysis of unmapped RNA sequencing (RNA-Seq) reads from single- and mixed-species samples. By taking advantage of machine learning techniques, CAFU addresses the issue of accurately identifying the species origin of transcripts assembled using unmapped reads from mixed-species samples. CAFU also represents an innovation in that it provides a comprehensive collection of functions required for transcript confidence evaluation, coding potential calculation, sequence and expression characterization and function annotation. These functions and their dependencies have been integrated into a Galaxy framework that provides access to CAFU via a user-friendly interface, dramatically simplifying complex exploration tasks involving unmapped RNA-Seq reads. CAFU has been validated with RNA-Seq data sets from wheat and Zea mays (maize) samples. CAFU is freely available via GitHub: https://github.com/cma2015/CAFU.
Affiliation(s)
- Siyuan Chen
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Chengzhi Ren
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Jingjing Zhai
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Jiantao Yu
  - College of Information Engineering, Northwest Agriculture and Forestry University
- Xuyang Zhao
  - College of Information Engineering, Northwest Agriculture and Forestry University
- Zelong Li
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Ting Zhang
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Wenlong Ma
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Zhaoxue Han
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Chuang Ma
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
|
5
|
Novel functional sequences uncovered through a bovine multiassembly graph. Proc Natl Acad Sci U S A 2021; 118:2101056118. [PMID: 33972446 DOI: 10.1073/pnas.2101056118] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 12/23/2022] Open
Abstract
Many genomic analyses start by aligning sequencing reads to a linear reference genome. However, linear reference genomes are imperfect, lacking millions of bases of unknown relevance and are unable to reflect the genetic diversity of populations. This makes reference-guided methods susceptible to reference-allele bias. To overcome such limitations, we build a pangenome from six reference-quality assemblies from taurine and indicine cattle as well as yak. The pangenome contains an additional 70,329,827 bases compared to the Bos taurus reference genome. Our multiassembly approach reveals 30 and 10.1 million bases private to yak and indicine cattle, respectively, and between 3.3 and 4.4 million bases unique to each taurine assembly. Utilizing transcriptomes from 56 cattle, we show that these nonreference sequences encode transcripts that hitherto remained undetected from the B. taurus reference genome. We uncover genes, primarily encoding proteins contributing to immune response and pathogen-mediated immunomodulation, differentially expressed between Mycobacterium bovis-infected and noninfected cattle that are also undetectable in the B. taurus reference genome. Using whole-genome sequencing data of cattle from five breeds, we show that reads which were previously misaligned against the Bos taurus reference genome now align accurately to the pangenome sequences. This enables us to discover 83,250 polymorphic sites that segregate within and between breeds of cattle and capture genetic differentiation across breeds. Our work makes a so-far unused source of variation amenable to genetic investigations and provides methods and a framework for establishing and exploiting a more diverse reference genome.
|
6
|
Scott MA, Woolums AR, Swiderski CE, Perkins AD, Nanduri B, Smith DR, Karisch BB, Epperson WB, Blanton JR. Comprehensive at-arrival transcriptomic analysis of post-weaned beef cattle uncovers type I interferon and antiviral mechanisms associated with bovine respiratory disease mortality. PLoS One 2021; 16:e0250758. [PMID: 33901263 PMCID: PMC8075194 DOI: 10.1371/journal.pone.0250758] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/31/2020] [Accepted: 04/13/2021] [Indexed: 12/02/2022] Open
Abstract
Background Despite decades of extensive research, bovine respiratory disease (BRD) remains the most devastating disease in beef cattle production. Establishing a clinical diagnosis often relies upon visual detection of non-specific signs, leading to low diagnostic accuracy. Thus, post-weaned beef cattle are often metaphylactically administered antimicrobials at facility arrival, which poses concerns regarding antimicrobial stewardship and resistance. Additionally, there is a lack of high-quality research that addresses the gene-by-environment interactions that underlie why some cattle that develop BRD die while others survive. Therefore, it is necessary to decipher the underlying host genomic factors associated with BRD mortality versus survival to help determine BRD risk and severity. Using transcriptomic analysis of at-arrival whole blood samples from cattle that died of BRD, as compared to those that developed signs of BRD but lived (n = 3 DEAD, n = 3 ALIVE), we identified differentially expressed genes (DEGs) and associated pathways in cattle that died of BRD. Additionally, we evaluated unmapped reads, which are often overlooked within transcriptomic experiments. Results 69 DEGs (FDR<0.10) were identified between ALIVE and DEAD cohorts. Several DEGs possess immunological and proinflammatory function and associations with TLR4 and IL6. Biological processes, pathways, and disease phenotype associations related to type-I interferon production and antiviral defense were enriched in DEAD cattle at arrival. Unmapped reads aligned primarily to various ungulate assemblies, but failed to align to viral assemblies. Conclusion This study further revealed increased proinflammatory immunological mechanisms in cattle that develop BRD. DEGs upregulated in DEAD cattle were predominantly involved in innate immune pathways typically associated with antiviral defense, although no viral genes were identified within unmapped reads. 
Our findings provide genomic targets for further analysis in cattle at highest risk of BRD, suggesting that mechanisms related to type I interferons and antiviral defense may be indicative of viral respiratory disease at arrival and contribute to eventual BRD mortality.
Affiliation(s)
- Matthew A Scott
  - Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- Amelia R Woolums
  - Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- Cyprianna E Swiderski
  - Department of Clinical Sciences, Mississippi State University, Mississippi State, MS, United States of America
- Andy D Perkins
  - Department of Computer Science and Engineering, Mississippi State University, Mississippi State, MS, United States of America
- Bindu Nanduri
  - Department of Basic Sciences, Mississippi State University College of Veterinary Medicine, Mississippi State University, Mississippi State, MS, United States of America
- David R Smith
  - Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- Brandi B Karisch
  - Department of Animal and Dairy Sciences, Mississippi State University, Mississippi State, MS, United States of America
- William B Epperson
  - Department of Pathobiology and Population Medicine, Mississippi State University, Mississippi State, MS, United States of America
- John R Blanton
  - Department of Animal and Dairy Sciences, Mississippi State University, Mississippi State, MS, United States of America
|
7
|
Noreikiene K, Ozerov M, Ahmad F, Kõiv T, Kahar S, Gross R, Sepp M, Pellizzone A, Vesterinen EJ, Kisand V, Vasemägi A. Humic-acid-driven escape from eye parasites revealed by RNA-seq and target-specific metabarcoding. Parasit Vectors 2020; 13:433. [PMID: 32859251 PMCID: PMC7456052 DOI: 10.1186/s13071-020-04306-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/26/2020] [Accepted: 08/16/2020] [Indexed: 01/09/2023] Open
Abstract
Background Next generation sequencing (NGS) technologies are extensively used to dissect the molecular mechanisms of host-parasite interactions in human pathogens. However, ecological studies have yet to fully exploit the power of NGS as a rich source for formulating and testing new hypotheses. Methods We studied Eurasian perch (Perca fluviatilis) and its eye parasite (Trematoda, Diplostomidae) communities in 14 lakes that differed in humic content in order to explore host-parasite-environment interactions. We hypothesised that high humic content along with low pH would decrease the abundance of the intermediate hosts (gastropods), thus limiting the occurrence of diplostomid parasites in humic lakes. This hypothesis was initially invoked by whole eye RNA-seq data analysis and subsequently tested using PCR-based detection and a novel targeted metabarcoding approach. Results Whole eye transcriptome results revealed overexpression of immune-related genes and the presence of eye parasite sequences in RNA-seq data obtained from perch living in clear-water lakes. Both PCR-based and targeted-metabarcoding approach showed that perch from humic lakes were completely free from diplostomid parasites, while the prevalence of eye flukes in clear-water lakes that contain low amounts of humic substances was close to 100%, with the majority of NGS reads assigned to Tylodelphys clavata. Conclusions High intraspecific diversity of T. clavata indicates that massively parallel sequencing of naturally pooled samples represents an efficient and powerful strategy for shedding light on cryptic diversity of eye parasites. Our results demonstrate that perch populations in clear-water lakes experience contrasting eye parasite pressure compared to those from humic lakes, which is reflected by prevalent differences in the expression of immune-related genes in the eye. 
This study highlights the utility of NGS to discover novel host-parasite-environment interactions and provide unprecedented power to characterize the molecular diversity of cryptic parasites.
Affiliation(s)
- Kristina Noreikiene
  - Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Kreutzwaldi 46, 51006, Tartu, Estonia
- Mikhail Ozerov
  - Department of Biology, University of Turku, 20014, Turku, Finland
  - Department of Aquatic Resources, Institute of Freshwater Research, Swedish University of Agricultural Sciences, 17893, Drottningholm, Sweden
  - Biodiversity Unit, University of Turku, 20014, Turku, Finland
- Freed Ahmad
  - Department of Biology, University of Turku, 20014, Turku, Finland
- Toomas Kõiv
  - Chair of Hydrobiology and Fishery, Institute of Agricultural and Environmental Sciences, Estonian University of Life Sciences, Kreutzwaldi 5, 51006, Tartu, Estonia
- Siim Kahar
  - Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Kreutzwaldi 46, 51006, Tartu, Estonia
- Riho Gross
  - Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Kreutzwaldi 46, 51006, Tartu, Estonia
- Margot Sepp
  - Chair of Hydrobiology and Fishery, Institute of Agricultural and Environmental Sciences, Estonian University of Life Sciences, Kreutzwaldi 5, 51006, Tartu, Estonia
- Antonia Pellizzone
  - Department of Biology, University of Turku, 20014, Turku, Finland
  - Department of Life Sciences and Biotechnology, University of Ferrara, 44121, Ferrara, Italy
- Eero J Vesterinen
  - Biodiversity Unit, University of Turku, 20014, Turku, Finland
  - Department of Ecology, Swedish University of Agricultural Sciences, 75651, Uppsala, Sweden
- Veljo Kisand
  - Institute of Technology, University of Tartu, Nooruse 1, 50411, Tartu, Estonia
- Anti Vasemägi
  - Chair of Aquaculture, Institute of Veterinary Medicine and Animal Sciences, Estonian University of Life Sciences, Kreutzwaldi 46, 51006, Tartu, Estonia
  - Department of Aquatic Resources, Institute of Freshwater Research, Swedish University of Agricultural Sciences, 17893, Drottningholm, Sweden
|
8
|
Nia AM, Khanipov K, Barnette BL, Ullrich RL, Golovko G, Emmett MR. Comparative RNA-Seq transcriptome analyses reveal dynamic time-dependent effects of 56Fe, 16O, and 28Si irradiation on the induction of murine hepatocellular carcinoma. BMC Genomics 2020; 21:453. [PMID: 32611366 PMCID: PMC7329445 DOI: 10.1186/s12864-020-06869-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/09/2020] [Accepted: 06/24/2020] [Indexed: 01/04/2023] Open
Abstract
Background One of the health risks posed to astronauts during deep space flights is exposure to high charge, high-energy (HZE) ions (Z > 13), which can lead to the induction of hepatocellular carcinoma (HCC). However, little is known on the molecular mechanisms of HZE irradiation-induced HCC. Results We performed comparative RNA-Seq transcriptomic analyses to assess the carcinogenic effects of 600 MeV/n 56Fe (0.2 Gy), 1 GeV/n 16O (0.2 Gy), and 350 MeV/n 28Si (0.2 Gy) ions in a mouse model for irradiation-induced HCC. C3H/HeNCrl mice were subjected to total body irradiation to simulate space environment HZE-irradiation, and liver tissues were extracted at five different time points post-irradiation to investigate the time-dependent carcinogenic response at the transcriptomic level. Our data demonstrated a clear difference in the biological effects of these HZE ions, particularly immunological, such as Acute Phase Response Signaling, B Cell Receptor Signaling, IL-8 Signaling, and ROS Production in Macrophages. Also seen in this study were novel unannotated transcripts that were significantly affected by HZE. To investigate the biological functions of these novel transcripts, we used a machine learning technique known as self-organizing maps (SOMs) to characterize the transcriptome expression profiles of 60 samples (45 HZE-irradiated, 15 non-irradiated control) from liver tissues. A handful of localized modules in the maps emerged as groups of co-regulated and co-expressed transcripts. The functional context of these modules was discovered using overrepresentation analysis. We found that these spots typically contained enriched populations of transcripts related to specific immunological molecular processes (e.g., Acute Phase Response Signaling, B Cell Receptor Signaling, IL-3 Signaling), and RNA Transcription/Expression. Conclusions A large number of transcripts were found differentially expressed post-HZE irradiation. 
These results provide valuable information for uncovering the differences in molecular mechanisms underlying HZE specific induced HCC carcinogenesis. Additionally, a handful of novel differentially expressed unannotated transcripts were discovered for each HZE ion. Taken together, these findings may provide a better understanding of biological mechanisms underlying risks for HCC after HZE irradiation and may also have important implications for the discovery of potential countermeasures against and identification of biomarkers for HZE-induced HCC.
Affiliation(s)
- Anna M Nia
  - Biochemistry and Molecular Biology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA
- Kamil Khanipov
  - Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA
- Brooke L Barnette
  - Biochemistry and Molecular Biology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA
- Robert L Ullrich
  - The Radiation Effects Research Foundation (RERF), Hiroshima, Japan
- George Golovko
  - Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA
- Mark R Emmett
  - Biochemistry and Molecular Biology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA
  - Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA
|
9
|
Laine VN, Gossmann TI, van Oers K, Visser ME, Groenen MAM. Exploring the unmapped DNA and RNA reads in a songbird genome. BMC Genomics 2019; 20:19. [PMID: 30621573 PMCID: PMC6323668 DOI: 10.1186/s12864-018-5378-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 07/19/2018] [Accepted: 12/16/2018] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND A widely used approach in next-generation sequencing projects is the alignment of reads to a reference genome. Despite methodological and hardware improvements which have enhanced the efficiency and accuracy of alignments, a significant percentage of reads frequently remain unmapped. Usually, unmapped reads are discarded from the analysis process, but significant biological information and insights can be uncovered from these data. We explored the unmapped DNA (normal and bisulfite treated) and RNA sequence reads of the great tit (Parus major) reference genome individual. From the unmapped reads we generated de novo assemblies, after which the generated sequence contigs were aligned to the NCBI non-redundant nucleotide database using BLAST, identifying the closest known matching sequence. RESULTS Many of the aligned contigs showed sequence similarity to different bird species and genes that were absent in the great tit reference assembly. Furthermore, there were also contigs that represented known P. major pathogenic species. Most interesting were several species of blood parasites such as Plasmodium and Trypanosoma. CONCLUSIONS Our analyses revealed that meaningful biological information can be found when further exploring unmapped reads. For instance, it is possible to discover sequences that are either absent or misassembled in the reference genome, and sequences that indicate infection or sample contamination. In this study we also propose strategies to aid the capture and interpretation of this information from unmapped reads.
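The first step in the workflow this abstract describes, separating unmapped reads from an alignment, can be sketched as follows. In the SAM format, bit 0x4 of the FLAG field marks a read as unmapped. This is a minimal plain-text sketch under stated assumptions: `unmapped_read_names` and the `toy_sam` records are hypothetical illustrations, not the authors' code, and real analyses typically operate on BAM files with dedicated tools.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_read_names(sam_lines):
    """Return names of reads whose FLAG has the unmapped bit set."""
    names = []
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # column 2 of a SAM record is FLAG
        if flag & UNMAPPED_FLAG:
            names.append(fields[0])  # column 1 is the read name (QNAME)
    return names


# Hypothetical toy records: read1 is mapped, read2 and read3 are unmapped
# (flag 77 = paired + read unmapped + mate unmapped + first in pair).
toy_sam = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGCA\tIIIII",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tGGGCC\tIIIII",
]
print(unmapped_read_names(toy_sam))
```

The selected reads would then be passed to a de novo assembler and the resulting contigs searched against a sequence database, as the abstract outlines; those steps are outside this sketch.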
Affiliation(s)
- Veronika N Laine
  - Department of Animal Ecology, NIOO-KNAW, Wageningen, The Netherlands
- Toni I Gossmann
  - Department of Animal and Plant Sciences, The University of Sheffield, Sheffield, UK
- Kees van Oers
  - Department of Animal Ecology, NIOO-KNAW, Wageningen, The Netherlands
- Marcel E Visser
  - Department of Animal Ecology, NIOO-KNAW, Wageningen, The Netherlands
  - Department of Animal Sciences, Wageningen University, Wageningen, The Netherlands
- Martien A M Groenen
  - Department of Animal Sciences, Wageningen University, Wageningen, The Netherlands
10
Braz CU, Taylor JF, Decker JE, Bresolin T, Espigolan R, Garcia DA, Gordo DGM, Magalhães AFB, de Albuquerque LG, de Oliveira HN. Polymorphism analysis in genes associated with meat tenderness in Nelore cattle. Meta Gene 2018. [DOI: 10.1016/j.mgene.2018.08.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/28/2022]
11
de Souza MM, Zerlotini A, Geistlinger L, Tizioto PC, Taylor JF, Rocha MIP, Diniz WJS, Coutinho LL, Regitano LCA. A comprehensive manually-curated compendium of bovine transcription factors. Sci Rep 2018; 8:13747. [PMID: 30213987 PMCID: PMC6137171 DOI: 10.1038/s41598-018-32146-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 04/17/2018] [Accepted: 08/29/2018] [Indexed: 01/28/2023]
Abstract
Transcription factors (TFs) are pivotal regulatory proteins that control gene expression in a context-dependent and tissue-specific manner. In contrast to human, where comprehensive curated TF collections exist, bovine TFs are only rudimentarily recorded and characterized. In this article, we present a manually-curated compendium of 865 sequence-specific DNA-binding bovine TFs, which we analyzed for domain family distribution, evolutionary conservation, and tissue-specific expression. In addition, we provide a list of putative transcription cofactors derived from known interactions with the identified TFs. Since there is a general lack of knowledge concerning the regulation of gene expression in cattle, the curated list of TFs should provide a basis for an improved comprehension of regulatory mechanisms that are specific to the species.
Affiliation(s)
- Marcela M de Souza
  - Post-graduation Program of Evolutionary Genetics and Molecular Biology, Federal University of São Carlos, São Carlos, São Paulo, 13560-970, Brazil
  - Animal Biotechnology, Embrapa Pecuária Sudeste, São Carlos, São Paulo, 13560-970, Brazil
- Adhemar Zerlotini
  - Bioinformatic Multi-user Laboratory, Embrapa Informática Agropecuária, Campinas, São Paulo, 70770-901, Brazil
- Ludwig Geistlinger
  - Animal Biotechnology, Embrapa Pecuária Sudeste, São Carlos, São Paulo, 13560-970, Brazil
- Jeremy F Taylor
  - Division of Animal Science, University of Missouri, Columbia, Missouri, 65211-5300, USA
- Marina I P Rocha
  - Post-graduation Program of Evolutionary Genetics and Molecular Biology, Federal University of São Carlos, São Carlos, São Paulo, 13560-970, Brazil
- Wellison J S Diniz
  - Post-graduation Program of Evolutionary Genetics and Molecular Biology, Federal University of São Carlos, São Carlos, São Paulo, 13560-970, Brazil
- Luiz L Coutinho
  - Functional Genomic Center, University of São Paulo, Piracicaba, São Paulo, 13418-900, Brazil
- Luciana C A Regitano
  - Animal Biotechnology, Embrapa Pecuária Sudeste, São Carlos, São Paulo, 13560-970, Brazil
12
Assembly and Analysis of Unmapped Genome Sequence Reads Reveal Novel Sequence and Variation in Dogs. Sci Rep 2018; 8:10862. [PMID: 30022108 PMCID: PMC6052005 DOI: 10.1038/s41598-018-29190-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/06/2018] [Accepted: 06/27/2018] [Indexed: 12/29/2022]
Abstract
Dogs are excellent animal models for human disease. They have extensive veterinary histories, pedigrees, and a unique genetic system due to breeding practices. Despite these advantages, one factor limiting their usefulness is the canine genome reference (CGR), which was assembled using a single purebred Boxer. Although a common practice, this results in many high-quality reads remaining unmapped. To address this, whole-genome sequence data from three breeds, Border Collie (n = 26), Bearded Collie (n = 7), and Entlebucher Sennenhund (n = 8), were analyzed to identify novel, non-CGR genomic contigs using the previously validated pseudo-de novo assembly pipeline. We identified 256,957 novel contigs, and paired-end relationships together with BLAT scores provided 126,555 (49%) high-quality contigs with genomic coordinates containing 4.6 Mb of novel sequence absent from the CGR. These contigs close 12,503 known gaps, including 2.4 Mb containing partially missing sequences for 11.5% of Ensembl, 16.4% of RefSeq and 12.2% of canFam3.1+ CGR annotated genes and 1,748 unmapped contigs containing 2,366 novel gene variants. Examples for six disease-associated genes (SCARF2, RD3, COL9A3, FAM161A, RASGRP1 and DLX6) containing gaps or alternate splice variants missing from the CGR are also presented. These findings from non-reference breeds support the need for improvement of the current Boxer-only CGR to avoid missing important biological information. The inclusion of the missing gene sequences into the CGR will facilitate identification of putative disease mutations across diverse breeds and phenotypes.
13
Schoonvaere K, Smagghe G, Francis F, de Graaf DC. Study of the Metatranscriptome of Eight Social and Solitary Wild Bee Species Reveals Novel Viruses and Bee Parasites. Front Microbiol 2018; 9:177. [PMID: 29491849 PMCID: PMC5817871 DOI: 10.3389/fmicb.2018.00177] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 10/15/2017] [Accepted: 01/25/2018] [Indexed: 01/05/2023]
Abstract
Bees are associated with a remarkable diversity of microorganisms, including unicellular parasites, bacteria, fungi, and viruses. The application of next-generation sequencing approaches enables the identification of this rich species composition as well as the discovery of previously unknown associations. Using high-throughput polyadenylated ribonucleic acid (RNA) sequencing, we investigated the metatranscriptome of eight wild bee species (Andrena cineraria, Andrena fulva, Andrena haemorrhoa, Bombus terrestris, Bombus cryptarum, Bombus pascuorum, Osmia bicornis, and Osmia cornuta) sampled from four different localities in Belgium. Across the RNA sequencing libraries, 88–99% of the taxonomically informative reads were of the host transcriptome. Four viruses with homology to insect pathogens were found including two RNA viruses (belonging to the families Iflaviridae and Tymoviridae that harbor already viruses of honey bees), a double stranded DNA virus (family Nudiviridae) and a single stranded DNA virus (family Parvoviridae). In addition, we found genomic sequences of 11 unclassified arthropod viruses (related to negeviruses, sobemoviruses, totiviruses, rhabdoviruses, and mononegaviruses), seven plant pathogenic viruses, and one fungal virus. Interestingly, nege-like viruses appear to be widespread, host-specific, and capable of attaining high copy numbers inside bees. Next to viruses, three novel parasite associations were discovered in wild bees, including Crithidia pragensis and a tubulinosematid and a neogregarine parasite. Yeasts of the genus Metschnikowia were identified in solitary bees. This study gives a glimpse of the microorganisms and viruses associated with social and solitary wild bees and demonstrates that their diversity exceeds by far the subset of species first discovered in honey bees.
Affiliation(s)
- Karel Schoonvaere
  - Laboratory of Molecular Entomology and Bee Pathology, Department of Biochemistry and Microbiology, Faculty of Sciences, Ghent University, Ghent, Belgium
  - Functional and Evolutionary Entomology, Gembloux Agro-Bio Tech, University of Liege, Gembloux, Belgium
- Guy Smagghe
  - Laboratory of Agrozoology, Department of Crop Protection, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
- Frédéric Francis
  - Functional and Evolutionary Entomology, Gembloux Agro-Bio Tech, University of Liege, Gembloux, Belgium
- Dirk C de Graaf
  - Laboratory of Molecular Entomology and Bee Pathology, Department of Biochemistry and Microbiology, Faculty of Sciences, Ghent University, Ghent, Belgium
14
Lopes RJ, Mérida AM, Carneiro M. Unleashing the Potential of Public Genomic Resources to Find Parasite Genetic Data. Trends Parasitol 2017; 33:750-753. [DOI: 10.1016/j.pt.2017.06.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/05/2017] [Revised: 06/15/2017] [Accepted: 06/16/2017] [Indexed: 11/24/2022]
15
Bovo S, Mazzoni G, Ribani A, Utzeri VJ, Bertolini F, Schiavo G, Fontanesi L. A viral metagenomic approach on a non-metagenomic experiment: Mining next generation sequencing datasets from pig DNA identified several porcine parvoviruses for a retrospective evaluation of viral infections. PLoS One 2017; 12:e0179462. [PMID: 28662150 PMCID: PMC5491021 DOI: 10.1371/journal.pone.0179462] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/20/2017] [Accepted: 05/29/2017] [Indexed: 12/14/2022]
Abstract
Shot-gun next generation sequencing (NGS) on whole DNA extracted from specimens collected from mammals often produces reads that are not mapped (i.e. unmapped reads) on the host reference genome and that are usually discarded as by-products of the experiments. In this study, we mined Ion Torrent reads obtained by sequencing DNA isolated from archived blood samples collected from 100 performance tested Italian Large White pigs. Two reduced representation libraries were prepared from two DNA pools constructed each from 50 equimolar DNA samples. Bioinformatic analyses were carried out to mine unmapped reads on the reference pig genome that were obtained from the two NGS datasets. In silico analyses included read mapping and sequence assembly approaches for a viral metagenomic analysis using the NCBI Viral Genome Resource. Our approach identified sequences matching several viruses of the Parvoviridae family: porcine parvovirus 2 (PPV2), PPV4, PPV5 and PPV6 and porcine bocavirus 1-H18 isolate (PBoV1-H18). The presence of these viruses was confirmed by PCR and Sanger sequencing of individual DNA samples. PPV2, PPV4, PPV5, PPV6 and PBoV1-H18 were all identified in samples collected in 1998-2007, 1998-2000, 1997-2000, 1998-2004 and 2003, respectively. For most of these viruses (PPV4, PPV5, PPV6 and PBoV1-H18) previous studies reported their first occurrence much later (from 5 to more than 10 years) than our identification period and in different geographic areas. Our study provided a retrospective evaluation of apparently asymptomatic parvovirus infected pigs providing information that could be important to define occurrence and prevalence of different parvoviruses in South Europe. This study demonstrated the potential of mining NGS datasets non-originally derived by metagenomics experiments for viral metagenomics analyses in a livestock species.
Affiliation(s)
- Samuele Bovo
  - Department of Agricultural and Food Sciences (DISTAL), Division of Animal Sciences, University of Bologna, Bologna, Italy
  - Department of Biological, Geological, and Environmental Sciences (BiGeA), Biocomputing Group, University of Bologna, Bologna, Italy
- Gianluca Mazzoni
  - Department of Agricultural and Food Sciences (DISTAL), Division of Animal Sciences, University of Bologna, Bologna, Italy
  - Department of Veterinary Clinical and Animal Sciences, University of Copenhagen, Copenhagen, Denmark
- Anisa Ribani
  - Department of Agricultural and Food Sciences (DISTAL), Division of Animal Sciences, University of Bologna, Bologna, Italy
- Valerio Joe Utzeri
  - Department of Agricultural and Food Sciences (DISTAL), Division of Animal Sciences, University of Bologna, Bologna, Italy
- Francesca Bertolini
  - Department of Agricultural and Food Sciences (DISTAL), Division of Animal Sciences, University of Bologna, Bologna, Italy
  - Department of Animal Science, Iowa State University, Iowa, United States of America
- Giuseppina Schiavo
  - Department of Agricultural and Food Sciences (DISTAL), Division of Animal Sciences, University of Bologna, Bologna, Italy
- Luca Fontanesi
  - Department of Agricultural and Food Sciences (DISTAL), Division of Animal Sciences, University of Bologna, Bologna, Italy
16
Suqueli García MF, Castellote MA, Feingold SE, Corva PM. Characterization of a deletion in the Hsp70 cluster in the bovine reference genome. Anim Genet 2017; 48:377-385. [PMID: 28568840 DOI: 10.1111/age.12561] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Accepted: 03/08/2017] [Indexed: 11/27/2022]
Abstract
The 70 kilodalton heat shock proteins (Hsp70) are highly conserved molecular chaperones which have a crucial role in the stress response of the cell. In mammals, the Hsp70 proteins are encoded by a cluster of three genes: HSPA1A, HSPA1B and HSPA1L. In bovines, this cluster is located on chromosome 23 downstream of the major histocompatibility complex (BoLA). We detected inconsistencies in the location of markers on the Hsp70 genes reported in the literature that pointed to a potential deletion in the bovine reference genome UMD 3.1.1. An in silico analysis of the bovine genomic region of the Hsp70 cluster, using available information from public databases, confirmed the existence of a deletion of 11.1-kb spanning the HSPA1B gene and the intergenic region between HSPA1B and HSPA1A. Although we originally considered this an assembly error, it is most likely a particular condition of L1 Dominette 01449, the cow sequenced in the Bovine Genome Project. Moreover, we suggest a new classification of bovine Hsp70 sequences reported in NCBI and a reassignment of the location of SNPs from dbSNP that map to the deletion on BTA23. We also compared the location of selected transcription factor binding sites on the promoters of HSPA1A and HSPA1B. The results generated in the present work could be helpful to refine the reference genome of an important livestock species and also to understand the role and the regulation of the bovine Hsp70 genes.
Affiliation(s)
- M F Suqueli García
  - Facultad de Ciencias Agrarias, Universidad Nacional de Mar del Plata, Unidad Integrada Balcarce, C.C. 276, 7620, Balcarce, Argentina
- M A Castellote
  - Laboratorio de Agrobiotecnología, EEA Balcarce, Instituto Nacional de Tecnología Agropecuaria, Unidad Integrada Balcarce, C.C. 276, 7620, Balcarce, Argentina
- S E Feingold
  - Laboratorio de Agrobiotecnología, EEA Balcarce, Instituto Nacional de Tecnología Agropecuaria, Unidad Integrada Balcarce, C.C. 276, 7620, Balcarce, Argentina
- P M Corva
  - Facultad de Ciencias Agrarias, Universidad Nacional de Mar del Plata, Unidad Integrada Balcarce, C.C. 276, 7620, Balcarce, Argentina
17
Usman T, Hadlich F, Demasius W, Weikard R, Kühn C. Unmapped reads from cattle RNAseq data: A source for missing and misassembled sequences in the reference assemblies and for detection of pathogens in the host. Genomics 2017; 109:36-42. [DOI: 10.1016/j.ygeno.2016.11.009] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 07/15/2016] [Revised: 11/21/2016] [Accepted: 11/28/2016] [Indexed: 11/15/2022]
18
Taylor JF, Whitacre LK, Hoff JL, Tizioto PC, Kim J, Decker JE, Schnabel RD. Lessons for livestock genomics from genome and transcriptome sequencing in cattle and other mammals. Genet Sel Evol 2016; 48:59. [PMID: 27534529 PMCID: PMC4989351 DOI: 10.1186/s12711-016-0237-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 12/16/2015] [Accepted: 08/02/2016] [Indexed: 12/31/2022]
Abstract
Background: Decreasing sequencing costs and development of new protocols for characterizing global methylation, gene expression patterns and regulatory regions have stimulated the generation of large livestock datasets. Here, we discuss experiences in the analysis of whole-genome and transcriptome sequence data.
Methods: We analyzed whole-genome sequence (WGS) data from 132 individuals from five canid species (Canis familiaris, C. latrans, C. dingo, C. aureus and C. lupus) and 61 breeds, three bison (Bison bison), 64 water buffalo (Bubalus bubalis) and 297 bovines from 17 breeds. By individual, data vary in extent of reference genome depth of coverage from 4.9X to 64.0X. We have also analyzed RNA-seq data for 580 samples representing 159 Bos taurus and Rattus norvegicus animals and 98 tissues. By aligning reads to a reference assembly and calling variants, we assessed effects of average depth of coverage on the actual coverage and on the number of called variants. We examined the identity of unmapped reads by assembling them and querying produced contigs against the non-redundant nucleic acids database. By imputing high-density single nucleotide polymorphism data on 4010 US registered Angus animals to WGS using Run4 of the 1000 Bull Genomes Project and assessing the accuracy of imputation, we identified misassembled reference sequence regions.
Results: We estimate that a 24X depth of coverage is required to achieve 99.5% coverage of the reference assembly and identify 95% of the variants within an individual’s genome. Genomes sequenced to low average coverage (e.g., <10X) may fail to cover 10% of the reference genome and identify <75% of variants. About 10% of genomic DNA or transcriptome sequence reads fail to align to the reference assembly. These reads include loci missing from the reference assembly and misassembled genes and interesting symbionts, commensal and pathogenic organisms.
Conclusions: Assembly errors and a lack of annotation of functional elements significantly limit the utility of the current draft livestock reference assemblies. The Functional Annotation of Animal Genomes initiative seeks to annotate functional elements, while a 70X Pac-Bio assembly for cow is underway and may result in a significantly improved reference assembly.
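The abstract distinguishes average depth of coverage from the fraction of the reference that is actually covered. The sketch below illustrates that distinction only. It assumes per-base depths are already available as a list of integers (for example, parsed from a depth report); the function name and the toy numbers are hypothetical and unrelated to the study's data.

```python
from typing import Sequence, Tuple


def depth_and_breadth(depths: Sequence[int], min_depth: int = 1) -> Tuple[float, float]:
    """Return (mean depth, fraction of positions with depth >= min_depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    covered = sum(1 for d in depths if d >= min_depth)
    return mean_depth, covered / len(depths)


# Toy per-base depths for a 10 bp reference (hypothetical values)
toy_depths = [0, 0, 3, 5, 8, 8, 6, 4, 0, 6]

mean_depth, breadth = depth_and_breadth(toy_depths)
print(f"mean depth = {mean_depth:.1f}X, breadth = {breadth:.0%}")
```

In the toy data the mean depth is 4.0X, yet only 70% of positions are covered at all, which is why a nominal average depth does not by itself guarantee that the whole reference is observed.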
Affiliation(s)
- Jeremy F Taylor
  - Division of Animal Sciences, University of Missouri, Columbia, MO, USA
- Lynsey K Whitacre
  - Division of Animal Sciences, University of Missouri, Columbia, MO, USA
  - Informatics Institute, University of Missouri, Columbia, MO, USA
- Jesse L Hoff
  - Division of Animal Sciences, University of Missouri, Columbia, MO, USA
- Polyana C Tizioto
  - Division of Animal Sciences, University of Missouri, Columbia, MO, USA
  - Embrapa Southeast Livestock, São Carlos, SP, Brazil
- JaeWoo Kim
  - Division of Animal Sciences, University of Missouri, Columbia, MO, USA
- Jared E Decker
  - Division of Animal Sciences, University of Missouri, Columbia, MO, USA
  - Informatics Institute, University of Missouri, Columbia, MO, USA
- Robert D Schnabel
  - Division of Animal Sciences, University of Missouri, Columbia, MO, USA
  - Informatics Institute, University of Missouri, Columbia, MO, USA
19
van der Weide RH, Simonis M, Hermsen R, Toonen P, Cuppen E, de Ligt J. The Genomic Scrapheap Challenge; Extracting Relevant Data from Unmapped Whole Genome Sequencing Reads, Including Strain Specific Genomic Segments, in Rats. PLoS One 2016; 11:e0160036. [PMID: 27501045 PMCID: PMC4976967 DOI: 10.1371/journal.pone.0160036] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 04/26/2016] [Accepted: 07/12/2016] [Indexed: 01/17/2023]
Abstract
Unmapped next-generation sequencing reads are typically ignored even though they contain biologically relevant information. We systematically analyzed unmapped reads from whole genome sequencing of 33 inbred rat strains. High quality reads were selected and enriched for biologically relevant sequences; similarity-based analysis revealed clustering similar to previously reported phylogenetic trees. Our results demonstrate that on average 20% of all unmapped reads harbor sequences that can be used to improve reference genomes and generate hypotheses on potential genotype-phenotype relationships. Analysis pipelines would benefit from incorporating the described methods, and reference genomes would benefit from inclusion of the genomic segments obtained through these efforts.
Affiliation(s)
- Robin H. van der Weide
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW), University Medical Centre Utrecht, Utrecht, The Netherlands
  - Division of Gene Regulation, The Netherlands Cancer Institute, Amsterdam, The Netherlands
- Marieke Simonis
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW), University Medical Centre Utrecht, Utrecht, The Netherlands
- Roel Hermsen
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW), University Medical Centre Utrecht, Utrecht, The Netherlands
- Pim Toonen
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW), University Medical Centre Utrecht, Utrecht, The Netherlands
- Edwin Cuppen
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW), University Medical Centre Utrecht, Utrecht, The Netherlands
- Joep de Ligt
  - Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences (KNAW), University Medical Centre Utrecht, Utrecht, The Netherlands
20
Geary TW, Burns GW, Moraes JGN, Moss JI, Denicol AC, Dobbs KB, Ortega MS, Hansen PJ, Wehrman ME, Neibergs H, O'Neil E, Behura S, Spencer TE. Identification of Beef Heifers with Superior Uterine Capacity for Pregnancy. Biol Reprod 2016; 95:47. [PMID: 27417907 PMCID: PMC5029478 DOI: 10.1095/biolreprod.116.141390] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 04/25/2016] [Accepted: 07/06/2016] [Indexed: 11/16/2022]
Abstract
Infertility and subfertility represent major problems in domestic animals and humans, and the majority of embryonic loss occurs during the first month of gestation that involves pregnancy recognition and conceptus implantation. The critical genes and physiological pathways in the endometrium that mediate pregnancy establishment and success are not well understood. In study one, predominantly Angus heifers were classified based on fertility using serial embryo transfer to select animals with intrinsic differences in pregnancy loss. In each of the four rounds, a single in vitro-produced, high-quality embryo was transferred into heifers on Day 7 postestrus and pregnancy was determined on Days 28 and 42 by ultrasound and then terminated. Heifers were classified based on pregnancy success as high fertile (HF), subfertile (SF), or infertile (IF). In study two, fertility-classified heifers were resynchronized and bred with semen from a single high-fertility bull. Blood samples were collected every other day from Days 0 to 36 postmating. Pregnancy rate was determined on Day 28 by ultrasound and was higher in HF (70.4%) than in heifers with low fertility (36.8%; SF and IF). Progesterone concentrations in serum during the first 20 days postestrus were not different in nonpregnant heifers and also not different in pregnant heifers among fertility groups. In study three, a single in vivo-produced embryo was transferred into fertility-classified heifers on Day 7 postestrus. The uteri were flushed on Day 14 to recover embryos, and endometrial biopsies were obtained from the ipsilateral uterine horn. Embryo recovery rate and conceptus length and area were not different among the heifer groups. RNA was sequenced from the Day 14 endometrial biopsies of pregnant HF, SF, and IF heifers (n = 5 per group) and analyzed by edgeR-robust analysis. 
There were 26 differentially expressed genes (DEGs) in the HF compared to SF endometrium, 12 DEGs for SF compared to IF endometrium, and three DEGs between the HF and IF endometrium. Several of the DEG-encoded proteins are involved in immune responses and are expressed in B cells. Results indicate that preimplantation conceptus survival and growth to Day 14 is not compromised in SF and IF heifers. Thus, the observed difference in capacity for pregnancy success in these fertility-classified heifers is manifest between Days 14 and 28 when pregnancy recognition signaling and conceptus elongation and implantation must occur for the establishment of pregnancy.
Affiliation(s)
- Thomas W Geary
  - USDA-ARS, Fort Keogh Livestock and Range Research Laboratory, Miles City, Montana
- Gregory W Burns
  - Division of Animal Sciences, University of Missouri, Columbia, Missouri
- Joao G N Moraes
  - Division of Animal Sciences, University of Missouri, Columbia, Missouri
- James I Moss
  - Department of Animal Sciences, University of Florida, Gainesville, Florida
- Anna C Denicol
  - Department of Animal Sciences, University of Florida, Gainesville, Florida
- Kyle B Dobbs
  - Department of Animal Sciences, University of Florida, Gainesville, Florida
- M Sofia Ortega
  - Department of Animal Sciences, University of Florida, Gainesville, Florida
- Peter J Hansen
  - Department of Animal Sciences, University of Florida, Gainesville, Florida
- Holly Neibergs
  - Department of Animal Sciences, Washington State University, Pullman, Washington
- Eleanore O'Neil
  - Division of Animal Sciences, University of Missouri, Columbia, Missouri
- Susanta Behura
  - Division of Animal Sciences, University of Missouri, Columbia, Missouri
- Thomas E Spencer
  - Division of Animal Sciences, University of Missouri, Columbia, Missouri
21
Chen Z, Hagen DE, Wang J, Elsik CG, Ji T, Siqueira LG, Hansen PJ, Rivera RM. Global assessment of imprinted gene expression in the bovine conceptus by next generation sequencing. Epigenetics 2016; 11:501-16. [PMID: 27245094 PMCID: PMC4939914 DOI: 10.1080/15592294.2016.1184805] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Indexed: 11/15/2022]
Abstract
Genomic imprinting is an epigenetic mechanism that leads to parental-allele-specific gene expression. Approximately 150 imprinted genes have been identified in humans and mice but less than 30 have been described as imprinted in cattle. For the purpose of de novo identification of imprinted genes in bovine, we determined global monoallelic gene expression in brain, skeletal muscle, liver, kidney and placenta of day ∼105 Bos taurus indicus × Bos taurus taurus F1 conceptuses using RNA sequencing. To accomplish this, we developed a bioinformatics pipeline to identify parent-specific single nucleotide polymorphism alleles after filtering adenosine to inosine (A-to-I) RNA editing sites. We identified 53 genes subject to monoallelic expression. Twenty three are genes known to be imprinted in the cow and an additional 7 have previously been characterized as imprinted in human and/or mouse that have not been reported as imprinted in cattle. Of the remaining 23 genes, we found that 10 are uncharacterized or unannotated transcripts located in known imprinted clusters, whereas the other 13 genes are distributed throughout the bovine genome and are not close to any known imprinted clusters. To exclude potential cis-eQTL effects on allele expression, we corroborated the parental specificity of monoallelic expression in day 86 Bos taurus taurus × Bos taurus taurus conceptuses and identified 8 novel bovine imprinted genes. Further, we identified 671 candidate A-to-I RNA editing sites and describe random X-inactivation in day 15 bovine extraembryonic membranes. Our results expand the imprinted gene list in bovine and demonstrate that monoallelic gene expression can be the result of cis-eQTL effects.
Affiliation(s)
- Zhiyuan Chen
  - Division of Animal Sciences, University of Missouri, Columbia, MO, USA
- Darren E Hagen
  - Division of Animal Sciences, University of Missouri, Columbia, MO, USA
- Juanbin Wang
  - Department of Statistics, University of Missouri, Columbia, MO, USA
- Christine G Elsik
  - Division of Animal Sciences, University of Missouri, Columbia, MO, USA
- Tieming Ji
  - Department of Statistics, University of Missouri, Columbia, MO, USA
- Luiz G Siqueira
  - Department of Animal Sciences, University of Florida, Gainesville, FL, USA
- Peter J Hansen
  - Department of Animal Sciences, University of Florida, Gainesville, FL, USA
- Rocío M Rivera
  - Division of Animal Sciences, University of Missouri, Columbia, MO, USA
22
Raszek MM, Guan LL, Plastow GS. Use of Genomic Tools to Improve Cattle Health in the Context of Infectious Diseases. Front Genet 2016; 7:30. [PMID: 27014337 PMCID: PMC4780072 DOI: 10.3389/fgene.2016.00030] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 10/09/2015] [Accepted: 02/18/2016] [Indexed: 12/15/2022]
Abstract
Although infectious diseases impose a heavy economic burden on the cattle industry, the etiology of many disorders that affect livestock is not fully elucidated, and effective countermeasures are often lacking. The main tools available until now have been vaccines, antibiotics and antiparasitic drugs. Although these have been very successful in some cases, the appearance of parasite and microbial resistance to these treatments is a cause of concern. Next-generation sequencing provides important opportunities to tackle problems associated with pathogenic illnesses. This review describes the rapid gains achieved to track disease progression, identify the pathogens involved, and map pathogen interactions with the host. Use of novel genomic tools subsequently aids in treatment development, as well as successful creation of breeding programs aimed toward less susceptible livestock. These may be important tools for mitigating the long term effects of combating infection and helping reduce the reliance on antibiotic treatment.
Collapse
Affiliation(s)
- Mikolaj M Raszek
- Livestock Gentec, Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Le L Guan
- Livestock Gentec, Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada
- Graham S Plastow
- Livestock Gentec, Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, Canada
|
23
|
Friis-Nielsen J, Kjartansdóttir KR, Mollerup S, Asplund M, Mourier T, Jensen RH, Hansen TA, Rey-Iglesia A, Richter SR, Nielsen IB, Alquezar-Planas DE, Olsen PVS, Vinner L, Fridholm H, Nielsen LP, Willerslev E, Sicheritz-Pontén T, Lund O, Hansen AJ, Izarzugaza JMG, Brunak S. Identification of Known and Novel Recurrent Viral Sequences in Data from Multiple Patients and Multiple Cancers. Viruses 2016; 8:E53. [PMID: 26907326 PMCID: PMC4776208 DOI: 10.3390/v8020053] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 10/30/2015] [Revised: 01/29/2016] [Accepted: 02/05/2016] [Indexed: 12/17/2022] Open
Abstract
Virus discovery from high throughput sequencing data often follows a bottom-up approach where taxonomic annotation takes place prior to association to disease. Albeit effective in some cases, the approach fails to detect novel pathogens and remote variants not present in reference databases. We have developed a species independent pipeline that utilises sequence clustering for the identification of nucleotide sequences that co-occur across multiple sequencing data instances. We applied the workflow to 686 sequencing libraries from 252 cancer samples of different cancer and tissue types, 32 non-template controls, and 24 test samples. Recurrent sequences were statistically associated to biological, methodological or technical features with the aim to identify novel pathogens or plausible contaminants that may associate to a particular kit or method. We provide examples of identified inhabitants of the healthy tissue flora as well as experimental contaminants. Unmapped sequences that co-occur with high statistical significance potentially represent the unknown sequence space where novel pathogens can be identified.
Affiliation(s)
- Jens Friis-Nielsen
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
- Kristín Rós Kjartansdóttir
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Sarah Mollerup
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Maria Asplund
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Tobias Mourier
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Randi Holm Jensen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Thomas Arn Hansen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Alba Rey-Iglesia
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Stine Raith Richter
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Ida Broman Nielsen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- David E Alquezar-Planas
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Pernille V S Olsen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Lasse Vinner
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Helena Fridholm
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Lars Peter Nielsen
- Department of Autoimmunology and Biomarkers, Statens Serum Institut, DK-2300 Copenhagen S, Denmark.
- Eske Willerslev
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Thomas Sicheritz-Pontén
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
- Ole Lund
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
- Anders Johannes Hansen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Jose M G Izarzugaza
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
- Søren Brunak
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
- NNF Center for Protein Research, University of Copenhagen, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark.
|