1
Mollerup S, Asplund M, Friis-Nielsen J, Kjartansdóttir KR, Fridholm H, Hansen TA, Herrera JAR, Barnes CJ, Jensen RH, Richter SR, Nielsen IB, Pietroni C, Alquezar-Planas DE, Rey-Iglesia A, Olsen PVS, Rajpert-De Meyts E, Groth-Pedersen L, von Buchwald C, Jensen DH, Gniadecki R, Høgdall E, Langhoff JL, Pete I, Vereczkey I, Baranyai Z, Dybkaer K, Johnsen HE, Steiniche T, Hokland P, Rosenberg J, Baandrup U, Sicheritz-Pontén T, Willerslev E, Brunak S, Lund O, Mourier T, Vinner L, Izarzugaza JMG, Nielsen LP, Hansen AJ. High-Throughput Sequencing-Based Investigation of Viruses in Human Cancers by Multienrichment Approach. J Infect Dis 2020; 220:1312-1324. [PMID: 31253993 PMCID: PMC6743825 DOI: 10.1093/infdis/jiz318] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 02/16/2019] [Accepted: 06/27/2019] [Indexed: 01/10/2023] [Open Access]
Abstract
Background: Viruses and other infectious agents cause more than 15% of human cancer cases. High-throughput sequencing-based studies of virus-cancer associations have mainly focused on cancer transcriptome data.
Methods: In this study, we applied a diverse selection of presequencing enrichment methods targeting all major viral groups to characterize the viruses present in 197 samples from 18 sample types of cancerous origin. Using high-throughput sequencing, we generated 710 datasets constituting 57 billion sequencing reads.
Results: Detailed in silico investigation of the viral content, including exclusion of viral artefacts, from de novo assembled contigs and individual sequencing reads yielded a map of the viruses detected. Our data reveal a virome dominated by papillomaviruses, anelloviruses, herpesviruses, and parvoviruses. More than half of the included samples contained 1 or more viruses; however, no link between specific viruses and cancer types was found.
Conclusions: Our study sheds light on viral presence in cancers and provides highly relevant virome data for future reference.
Affiliation(s)
- Sarah Mollerup
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Maria Asplund
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Jens Friis-Nielsen
- Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark
- Helena Fridholm
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Thomas Arn Hansen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- José Alejandro Romero Herrera
- Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark; Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
- Randi Holm Jensen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Stine Raith Richter
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Ida Broman Nielsen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Carlotta Pietroni
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- David E Alquezar-Planas
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Alba Rey-Iglesia
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Pernille V S Olsen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Ewa Rajpert-De Meyts
- Department of Growth and Reproduction, Copenhagen University Hospital (Rigshospitalet), Denmark
- Line Groth-Pedersen
- Department of Pediatrics and Adolescent Medicine, University Hospital Rigshospitalet, Denmark
- Christian von Buchwald
- Department of Otorhinolaryngology, Head and Neck Surgery and Audiology, Rigshospitalet, Copenhagen University Hospital
- David H Jensen
- Department of Otorhinolaryngology, Head and Neck Surgery and Audiology, Rigshospitalet, Copenhagen University Hospital
- Robert Gniadecki
- Department of Dermato-Venerology, Faculty of Health Sciences, Copenhagen University Hospital, Bispebjerg Hospital, Denmark
- Estrid Høgdall
- Department of Pathology, Herlev and Gentofte Hospital, University of Copenhagen, Denmark
- Jill Levin Langhoff
- Department of Pathology, Herlev and Gentofte Hospital, University of Copenhagen, Denmark
- Imre Pete
- National Institute of Oncology, Department of Gynecology, Budapest, Hungary
- Ildikó Vereczkey
- National Institute of Oncology, Department of Gynecology, Budapest, Hungary
- Zsolt Baranyai
- 1st Department of Surgery, Semmelweis University, Budapest, Hungary
- Karen Dybkaer
- Department of Clinical Medicine, Aalborg University, Denmark
- Peter Hokland
- Department of Clinical Medicine, Department of Haematology, Aarhus University Hospital, Denmark
- Jacob Rosenberg
- Department of Surgery, Herlev and Gentofte Hospital, University of Copenhagen, Denmark
- Ulrik Baandrup
- Center for Clinical Research, North Denmark Regional Hospital and Department of Clinical Medicine, Aalborg University, Hjørring, Denmark
- Thomas Sicheritz-Pontén
- Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark; Centre of Excellence for Omics-Driven Computational Biodiscovery, AIMST University, Kedah, Malaysia
- Eske Willerslev
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Søren Brunak
- Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark; Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
- Ole Lund
- Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark
- Tobias Mourier
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Lasse Vinner
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
- Jose M G Izarzugaza
- Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark
- Lars Peter Nielsen
- Department of Autoimmunology and Biomarkers, Statens Serum Institut, Copenhagen S, Denmark
- Anders Johannes Hansen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Denmark
2
Asplund M, Kjartansdóttir KR, Mollerup S, Vinner L, Fridholm H, Herrera JAR, Friis-Nielsen J, Hansen TA, Jensen RH, Nielsen IB, Richter SR, Rey-Iglesia A, Matey-Hernandez ML, Alquezar-Planas DE, Olsen PVS, Sicheritz-Pontén T, Willerslev E, Lund O, Brunak S, Mourier T, Nielsen LP, Izarzugaza JMG, Hansen AJ. Contaminating viral sequences in high-throughput sequencing viromics: a linkage study of 700 sequencing libraries. Clin Microbiol Infect 2019; 25:1277-1285. [PMID: 31059795 DOI: 10.1016/j.cmi.2019.04.028] [Citation(s) in RCA: 81] [Impact Index Per Article: 16.2] [Received: 01/03/2019] [Revised: 04/12/2019] [Accepted: 04/18/2019] [Indexed: 12/11/2022]
Abstract
OBJECTIVES: Sample preparation for high-throughput sequencing (HTS) includes treatment with various laboratory components, potentially carrying viral nucleic acids, the extent of which has not been thoroughly investigated. Our aim was to systematically examine a diverse repertoire of laboratory components used to prepare samples for HTS in order to identify contaminating viral sequences.
METHODS: A total of 322 samples of mainly human origin were analysed using eight protocols, applying a wide variety of laboratory components. Several samples (60% of human specimens) were processed using different protocols. In total, 712 sequencing libraries were investigated for viral sequence contamination.
RESULTS: Among sequences showing similarity to viruses, 493 were significantly associated with the use of laboratory components. Each of these viral sequences had sporadic appearance, only being identified in a subset of the samples treated with the linked laboratory component, and some were not identified in the non-template control samples. Remarkably, more than 65% of all viral sequences identified were within viral clusters linked to the use of laboratory components.
CONCLUSIONS: We show that high prevalence of contaminating viral sequences can be expected in HTS-based virome data and provide an extensive list of novel contaminating viral sequences that can be used for evaluation of viral findings in future virome and metagenome studies. Moreover, we show that detection can be problematic due to stochastic appearance and limited non-template controls. Although the exact origin of these viral sequences requires further research, our results support laboratory-component-linked viral sequence contamination of both biological and synthetic origin.
Affiliation(s)
- M Asplund
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- K R Kjartansdóttir
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- S Mollerup
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- L Vinner
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- H Fridholm
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark; Department of Autoimmunology and Biomarkers, Statens Serum Institut, Copenhagen, Denmark
- J A R Herrera
- Disease Systems Biology Programme, Panum Instituttet, Copenhagen, Denmark; Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark
- J Friis-Nielsen
- Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark
- T A Hansen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- R H Jensen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- I B Nielsen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- S R Richter
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- A Rey-Iglesia
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- M L Matey-Hernandez
- Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark
- D E Alquezar-Planas
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- P V S Olsen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- T Sicheritz-Pontén
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark; Centre of Excellence for Omics-Driven Computational Biodiscovery, AIMST University, Kedah, Malaysia
- E Willerslev
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- O Lund
- Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark
- S Brunak
- Disease Systems Biology Programme, Panum Instituttet, Copenhagen, Denmark; Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark
- T Mourier
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- L P Nielsen
- Department of Autoimmunology and Biomarkers, Statens Serum Institut, Copenhagen, Denmark
- J M G Izarzugaza
- Department of Bio and Health Informatics, Technical University of Denmark, Lyngby, Denmark
- A J Hansen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
3
Friis-Nielsen J, Kjartansdóttir KR, Mollerup S, Asplund M, Mourier T, Jensen RH, Hansen TA, Rey-Iglesia A, Richter SR, Nielsen IB, Alquezar-Planas DE, Olsen PVS, Vinner L, Fridholm H, Nielsen LP, Willerslev E, Sicheritz-Pontén T, Lund O, Hansen AJ, Izarzugaza JMG, Brunak S. Identification of Known and Novel Recurrent Viral Sequences in Data from Multiple Patients and Multiple Cancers. Viruses 2016; 8:E53. [PMID: 26907326 PMCID: PMC4776208 DOI: 10.3390/v8020053] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 10/30/2015] [Revised: 01/29/2016] [Accepted: 02/05/2016] [Indexed: 12/17/2022] [Open Access]
Abstract
Virus discovery from high-throughput sequencing data often follows a bottom-up approach in which taxonomic annotation takes place prior to association with disease. Although effective in some cases, this approach fails to detect novel pathogens and remote variants not present in reference databases. We have developed a species-independent pipeline that utilises sequence clustering to identify nucleotide sequences that co-occur across multiple sequencing data instances. We applied the workflow to 686 sequencing libraries from 252 cancer samples of different cancer and tissue types, 32 non-template controls, and 24 test samples. Recurrent sequences were statistically associated with biological, methodological, or technical features, with the aim of identifying novel pathogens or plausible contaminants that may be associated with a particular kit or method. We provide examples of identified inhabitants of the healthy tissue flora as well as experimental contaminants. Unmapped sequences that co-occur with high statistical significance potentially represent the unknown sequence space where novel pathogens can be identified.
Affiliation(s)
- Jens Friis-Nielsen
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
- Kristín Rós Kjartansdóttir
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Sarah Mollerup
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Maria Asplund
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Tobias Mourier
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Randi Holm Jensen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Thomas Arn Hansen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Alba Rey-Iglesia
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Stine Raith Richter
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Ida Broman Nielsen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- David E Alquezar-Planas
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Pernille V S Olsen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Lasse Vinner
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Helena Fridholm
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Lars Peter Nielsen
- Department of Autoimmunology and Biomarkers, Statens Serum Institut, DK-2300 Copenhagen S, Denmark.
- Eske Willerslev
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Thomas Sicheritz-Pontén
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
- Ole Lund
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
- Anders Johannes Hansen
- Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, DK-1350 Copenhagen, Denmark.
- Jose M G Izarzugaza
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
- Søren Brunak
- Center for Biological Sequence Analysis, Department of Systems Biology, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark.
- NNF Center for Protein Research, University of Copenhagen, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark.