Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Searched Name: René L. Warren
Total Articles: 147 (from Reference Citation Analysis)
Article PDFs: 40
Cited by > 0: 113

Citation Analysis

Warren RL, Abraham R, Calingo M, Garant JM, Jones SJM, Birol I. Establishing association between HLA-C*04:01 and severe COVID-19. HLA 2024;103:e15355. [PMID: 38273454 DOI: 10.1111/tan.15355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 01/05/2024] [Accepted: 01/09/2024] [Indexed: 01/27/2024]

Lo T, Coombe L, Gagalova KK, Marr A, Warren RL, Kirk H, Pandoh P, Zhao Y, Moore RA, Mungall AJ, Ritland C, Pavy N, Jones SJM, Bohlmann J, Bousquet J, Birol I, Thomson A. Assembly and annotation of the black spruce genome provide insights on spruce phylogeny and evolution of stress response. G3 (Bethesda) 2023;14:jkad247. [PMID: 37875130 PMCID: PMC10755193 DOI: 10.1093/g3journal/jkad247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 05/17/2023] [Accepted: 10/09/2023] [Indexed: 10/26/2023]

Affiliation(s)

Theodora Lo Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
Lauren Coombe Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
Kristina K Gagalova Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
Alex Marr Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
René L Warren Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
Heather Kirk Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
Pawan Pandoh Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
Yongjun Zhao Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
Richard A Moore Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
Andrew J Mungall Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
Carol Ritland Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
Nathalie Pavy Canada Research Chair in Forest Genomics, Laval University, Quebec City, QC G1V 0A6, Canada
Steven J M Jones Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
Joerg Bohlmann Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada Department of Botany, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
Jean Bousquet Canada Research Chair in Forest Genomics, Laval University, Quebec City, QC G1V 0A6, Canada
Inanç Birol Canada’s Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, BC V5Z 4S6, Canada
Ashley Thomson Faculty of Natural Resources Management, Lakehead University, Thunder Bay, ON P7B 5E1, Canada

Wong J, Kazemi P, Coombe L, Warren RL, Birol I. aaHash: recursive amino acid sequence hashing. Bioinform Adv 2023;3:vbad162. [PMID: 38023332 PMCID: PMC10660294 DOI: 10.1093/bioadv/vbad162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 10/13/2023] [Accepted: 11/08/2023] [Indexed: 12/01/2023]

Li JX, Fernandez KX, Ritland C, Jancsik S, Engelhardt DB, Coombe L, Warren RL, van Belkum MJ, Carroll AL, Vederas JC, Bohlmann J, Birol I. Genomic virulence features of Beauveria bassiana as a biocontrol agent for the mountain pine beetle population. BMC Genomics 2023;24:390. [PMID: 37430186 DOI: 10.1186/s12864-023-09473-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 06/21/2023] [Indexed: 07/12/2023] Open

Affiliation(s)

Janet X Li Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC, V6T 1Z4, Canada Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 W 7th Ave #100, Vancouver, BC, V5Z 4S6, Canada
Kleinberg X Fernandez Department of Chemistry, University of Alberta, 11227 Saskatchewan Drive NW, Edmonton, AB, T6G 2G2, Canada
Carol Ritland Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC, V6T 1Z4, Canada Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Sharon Jancsik Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC, V6T 1Z4, Canada
Daniel B Engelhardt Department of Chemistry, University of Alberta, 11227 Saskatchewan Drive NW, Edmonton, AB, T6G 2G2, Canada
Lauren Coombe Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 W 7th Ave #100, Vancouver, BC, V5Z 4S6, Canada
René L Warren Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 W 7th Ave #100, Vancouver, BC, V5Z 4S6, Canada
Marco J van Belkum Department of Chemistry, University of Alberta, 11227 Saskatchewan Drive NW, Edmonton, AB, T6G 2G2, Canada
Allan L Carroll Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
John C Vederas Department of Chemistry, University of Alberta, 11227 Saskatchewan Drive NW, Edmonton, AB, T6G 2G2, Canada
Joerg Bohlmann Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, BC, V6T 1Z4, Canada Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada Department of Botany, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Inanc Birol Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 W 7th Ave #100, Vancouver, BC, V5Z 4S6, Canada

Wong J, Coombe L, Nikolić V, Zhang E, Nip KM, Sidhu P, Warren RL, Birol I. Linear time complexity de novo long read genome assembly with GoldRush. Nat Commun 2023;14:2906. [PMID: 37217507 DOI: 10.1038/s41467-023-38716-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/10/2022] [Accepted: 05/11/2023] [Indexed: 05/24/2023] Open

Nip KM, Hafezqorani S, Gagalova KK, Chiu R, Yang C, Warren RL, Birol I. Reference-free assembly of long-read transcriptome sequencing data with RNA-Bloom2. Nat Commun 2023;14:2940. [PMID: 37217540 PMCID: PMC10202958 DOI: 10.1038/s41467-023-38553-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2022] [Accepted: 05/08/2023] [Indexed: 05/24/2023] Open

Wong J, Kazemi P, Coombe L, Warren RL, Birol I. aaHash: recursive amino acid sequence hashing. bioRxiv 2023:2023.05.08.539909. [PMID: 37214907 PMCID: PMC10197579 DOI: 10.1101/2023.05.08.539909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]

Yoo S, Garg E, Elliott LT, Hung RJ, Halevy AR, Brooks JD, Bull SB, Gagnon F, Greenwood C, Lawless JF, Paterson AD, Sun L, Zawati MH, Lerner-Ellis J, Abraham R, Birol I, Bourque G, Garant JM, Gosselin C, Li J, Whitney J, Thiruvahindrapuram B, Herbrick JA, Lorenti M, Reuter MS, Adeoye OO, Liu S, Allen U, Bernier FP, Biggs CM, Cheung AM, Cowan J, Herridge M, Maslove DM, Modi BP, Mooser V, Morris SK, Ostrowski M, Parekh RS, Pfeffer G, Suchowersky O, Taher J, Upton J, Warren RL, Yeung R, Aziz N, Turvey SE, Knoppers BM, Lathrop M, Jones S, Scherer SW, Strug LJ. HostSeq: a Canadian whole genome sequencing and clinical data resource. BMC Genom Data 2023;24:26. [PMID: 37131148 PMCID: PMC10152008 DOI: 10.1186/s12863-023-01128-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 02/22/2023] [Indexed: 05/04/2023] Open

Affiliation(s)

S Yoo The Hospital for Sick Children, Toronto, ON, Canada University of Ottawa, Ottawa, ON, Canada
E Garg Simon Fraser University, Burnaby, BC, Canada
L T Elliott Simon Fraser University, Burnaby, BC, Canada
R J Hung University of Toronto, Toronto, ON, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
A R Halevy The Hospital for Sick Children, Toronto, ON, Canada
J D Brooks University of Toronto, Toronto, ON, Canada
S B Bull University of Toronto, Toronto, ON, Canada Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
F Gagnon University of Toronto, Toronto, ON, Canada
C M T Greenwood McGill University, Montreal, QC, Canada Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC, Canada
J F Lawless University of Waterloo, Waterloo, ON, Canada
A D Paterson The Hospital for Sick Children, Toronto, ON, Canada University of Toronto, Toronto, ON, Canada
L Sun University of Toronto, Toronto, ON, Canada
M H Zawati McGill University, Montreal, QC, Canada
J Lerner-Ellis University of Toronto, Toronto, ON, Canada Sinai Health System, Toronto, ON, Canada
R J S Abraham Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
I Birol Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
G Bourque McGill University, Montreal, QC, Canada
J-M Garant Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
C Gosselin Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
J Li Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
J Whitney The Hospital for Sick Children, Toronto, ON, Canada
B Thiruvahindrapuram The Hospital for Sick Children, Toronto, ON, Canada
J-A Herbrick The Hospital for Sick Children, Toronto, ON, Canada
M Lorenti The Hospital for Sick Children, Toronto, ON, Canada
M S Reuter The Hospital for Sick Children, Toronto, ON, Canada
O O Adeoye The Hospital for Sick Children, Toronto, ON, Canada
S Liu The Hospital for Sick Children, Toronto, ON, Canada
U Allen The Hospital for Sick Children, Toronto, ON, Canada University of Toronto, Toronto, ON, Canada
F P Bernier University of Calgary, Calgary, AB, Canada Alberta Children's Hospital, Calgary, AB, Canada
C M Biggs University of British Columbia, Vancouver, BC, Canada BC Children's Hospital, Vancouver, BC, Canada St. Paul's Hospital, Vancouver, BC, Canada
A M Cheung University Health Network, Toronto, ON, Canada
J Cowan University of Ottawa, Ottawa, ON, Canada The Ottawa Hospital Research Institute, Ottawa, ON, Canada
M Herridge University Health Network, Toronto, ON, Canada
D M Maslove Queen's University, Kingston, ON, Canada
B P Modi BC Children's Hospital, Vancouver, BC, Canada
V Mooser McGill University, Montreal, QC, Canada
S K Morris The Hospital for Sick Children, Toronto, ON, Canada University of Toronto, Toronto, ON, Canada
M Ostrowski University of Toronto, Toronto, ON, Canada St. Michael's Hospital, Unity Health, Toronto, ON, Canada
R S Parekh The Hospital for Sick Children, Toronto, ON, Canada University of Toronto, Toronto, ON, Canada Women's College Hospital, Toronto, ON, Canada
G Pfeffer University of Calgary, Calgary, AB, Canada
O Suchowersky University of Alberta, Edmonton, AB, Canada
J Taher University of Toronto, Toronto, ON, Canada Sinai Health System, Toronto, ON, Canada
J Upton The Hospital for Sick Children, Toronto, ON, Canada University of Toronto, Toronto, ON, Canada
R L Warren Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
R S M Yeung The Hospital for Sick Children, Toronto, ON, Canada University of Toronto, Toronto, ON, Canada
N Aziz The Hospital for Sick Children, Toronto, ON, Canada
S E Turvey University of British Columbia, Vancouver, BC, Canada BC Children's Hospital, Vancouver, BC, Canada
B M Knoppers McGill University, Montreal, QC, Canada
M Lathrop McGill University, Montreal, QC, Canada
S J M Jones Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
S W Scherer The Hospital for Sick Children, Toronto, ON, Canada University of Toronto, Toronto, ON, Canada
L J Strug The Hospital for Sick Children, Toronto, ON, Canada University of Toronto, Toronto, ON, Canada

Coombe L, Warren RL, Wong J, Nikolic V, Birol I. ntLink: A Toolkit for De Novo Genome Assembly Scaffolding and Mapping Using Long Reads. Curr Protoc 2023;3:e733. [PMID: 37039735 PMCID: PMC10091225 DOI: 10.1002/cpz1.733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/12/2023]

Abstract

With the increasing affordability and accessibility of genome sequencing data, de novo genome assembly is an important first step to a wide variety of downstream studies and analyses. Therefore, bioinformatics tools that enable the generation of high-quality genome assemblies in a computationally efficient manner are essential. Recent developments in long-read sequencing technologies have greatly benefited genome assembly work, including scaffolding, by providing long-range evidence that can aid in resolving the challenging repetitive regions of complex genomes. ntLink is a flexible and resource-efficient genome scaffolding tool that utilizes long-read sequencing data to improve upon draft genome assemblies built from any sequencing technologies, including the same long reads. Instead of using read alignments to identify candidate joins, ntLink utilizes minimizer-based mappings to infer how input sequences should be ordered and oriented into scaffolds. Recent improvements to ntLink have added important features such as overlap detection, gap-filling, and in-code scaffolding iterations. Here, we present three basic protocols demonstrating how to use each of these new features to yield highly contiguous genome assemblies, while still maintaining ntLink's proven computational efficiency. Further, as we illustrate in the alternate protocols, the lightweight minimizer-based mappings that enable ntLink scaffolding can also be utilized for other downstream applications, such as misassembly detection. With its modularity and multiple modes of execution, ntLink has broad benefit to the genomics community, from genome scaffolding and beyond. ntLink is an open-source project and is freely available from https://github.com/bcgsc/ntLink. © 2023 The Authors. Current Protocols published by Wiley Periodicals LLC. 
Basic Protocol 1: ntLink scaffolding using overlap detection
Basic Protocol 2: ntLink scaffolding with gap-filling
Basic Protocol 3: Running in-code iterations of ntLink scaffolding
Alternate Protocol 1: Generating long-read to contig mappings with ntLink
Alternate Protocol 2: Using ntLink mappings for genome assembly correction with Tigmint-long
Support Protocol: Installing ntLink
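
The minimizer idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not ntLink's implementation: the function names `minimizers` and `shared_minimizers`, the parameters `k` and `w`, and the use of lexicographic ordering instead of a hash function are all assumptions made for illustration.

```python
def minimizers(seq, k, w):
    """Return sorted (position, k-mer) pairs, keeping the smallest k-mer
    in every window of w consecutive k-mers (hypothetical helper)."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(len(kmers) - w + 1):
        window = range(start, start + w)
        # Smallest k-mer wins; ties are broken by the leftmost position.
        best = min(window, key=lambda i: (kmers[i], i))
        picked.add((best, kmers[best]))
    return sorted(picked)


def shared_minimizers(a, b, k, w):
    """Minimizer k-mers common to two sequences: a cheap, alignment-free
    hint that the sequences overlap (illustrative only)."""
    ma = {kmer for _, kmer in minimizers(a, k, w)}
    mb = {kmer for _, kmer in minimizers(b, k, w)}
    return ma & mb


if __name__ == "__main__":
    print(minimizers("ACGTACGT", k=3, w=2))
    print(shared_minimizers("ACGTACGT", "TTACGTT", k=3, w=2))
```

Because only one k-mer per window is kept, two sequences can be compared through a small set of sampled k-mers rather than through full read alignments, which is the general reason minimizer sketches are lightweight.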

Yang C, Lo T, Nip KM, Hafezqorani S, Warren RL, Birol I. Characterization and simulation of metagenomic nanopore sequencing data with Meta-NanoSim. Gigascience 2023;12:giad013. [PMID: 36939007 PMCID: PMC10025935 DOI: 10.1093/gigascience/giad013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Revised: 01/19/2023] [Accepted: 02/17/2023] [Indexed: 03/21/2023] Open

Li C, Warren RL, Birol I. Models and data of AMPlify: a deep learning tool for antimicrobial peptide prediction. BMC Res Notes 2023;16:11. [PMID: 36732807 PMCID: PMC9896668 DOI: 10.1186/s13104-023-06279-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2022] [Accepted: 01/24/2023] [Indexed: 02/04/2023] Open

Shalev TJ, Gamal El-Dien O, Yuen MM, Shengqiang S, Jackman SD, Warren RL, Coombe L, van der Merwe L, Stewart A, Boston LB, Plott C, Jenkins J, He G, Yan J, Yan M, Guo J, Breinholt JW, Neves LG, Grimwood J, Rieseberg LH, Schmutz J, Birol I, Kirst M, Yanchuk AD, Ritland C, Russell JH, Bohlmann J. The western redcedar genome reveals low genetic diversity in a self-compatible conifer. Genome Res 2022;32:1952-1964. [DOI: 10.1101/gr.276358.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2021] [Accepted: 09/06/2022] [Indexed: 11/24/2022]

Gagalova KK, Warren RL, Coombe L, Wong J, Nip KM, Yuen MMS, Whitehill JGA, Celedon JM, Ritland C, Taylor GA, Cheng D, Plettner P, Hammond SA, Mohamadi H, Zhao Y, Moore RA, Mungall AJ, Boyle B, Laroche J, Cottrell J, Mackay JJ, Lamothe M, Gérardi S, Isabel N, Pavy N, Jones SJM, Bohlmann J, Bousquet J, Birol I. Spruce giga-genomes: structurally similar yet distinctive with differentially expanding gene families and rapidly evolving genes. Plant J 2022;111:1469-1485. [PMID: 35789009 DOI: 10.1111/tpj.15889] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/11/2021] [Revised: 06/22/2022] [Accepted: 06/27/2022] [Indexed: 06/15/2023]

Abstract

Spruces (Picea spp.) are coniferous trees widespread in boreal and mountainous forests of the northern hemisphere, with large economic significance and enormous contributions to global carbon sequestration. Spruces harbor very large genomes with high repetitiveness, hampering their comparative analysis. Here, we present and compare the genomes of four different North American spruces: the genome assemblies for Engelmann spruce (Picea engelmannii) and Sitka spruce (Picea sitchensis) together with improved and more contiguous genome assemblies for white spruce (Picea glauca) and for a naturally occurring introgress of these three species known as interior spruce (P. engelmannii × glauca × sitchensis). The genomes were structurally similar, and a large part of scaffolds could be anchored to a genetic map. The composition of the interior spruce genome indicated asymmetric contributions from the three ancestral genomes. Phylogenetic analysis of the nuclear and organelle genomes revealed a topology indicative of ancient reticulation. Different patterns of expansion of gene families among genomes were observed and related with presumed diversifying ecological adaptations. We identified rapidly evolving genes that harbored high rates of non-synonymous polymorphisms relative to synonymous ones, indicative of positive selection and its hitchhiking effects. These gene sets were mostly distinct between the genomes of ecologically contrasted species, and signatures of convergent balancing selection were detected. Stress and stimulus response was identified as the most frequent function assigned to expanding gene families and rapidly evolving genes. These two aspects of genomic evolution were complementary in their contribution to divergent evolution of presumed adaptive nature. 
These more contiguous spruce giga-genome sequences should strengthen our understanding of conifer genome structure and evolution, as their comparison offers clues into the genetic basis of adaptation and ecology of conifers at the genomic level. They will also provide tools to better monitor natural genetic diversity and improve the management of conifer forests. The genomes of four closely related North American spruces indicate that their high similarity at the morphological level is paralleled by the high conservation of their physical genome structure. Yet, the evidence of divergent evolution is apparent in their rapidly evolving genomes, supported by differential expansion of key gene families and large sets of genes under positive selection, largely in relation to stimulus and environmental stress response.

Affiliation(s)

Kristina K Gagalova Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
René L Warren Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
Lauren Coombe Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
Johnathan Wong Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
Ka Ming Nip Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
Macaire Man Saint Yuen Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Justin G A Whitehill Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Jose M Celedon Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Carol Ritland Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Greg A Taylor Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
Dean Cheng Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
Patrick Plettner Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
S Austin Hammond Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada Next-Generation Sequencing Facility, University of Saskatchewan, Saskatoon, SK, S7N 5E5, Canada
Hamid Mohamadi Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
Yongjun Zhao Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
Richard A Moore Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
Andrew J Mungall Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
Brian Boyle Institute for Systems and Integrative Biology, Université Laval, Québec, QC, G1V 0A6, Canada
Jérôme Laroche Institute for Systems and Integrative Biology, Université Laval, Québec, QC, G1V 0A6, Canada
Joan Cottrell Forest Research, U.K. Forestry Commission, Northern Research Station, Roslin, EH25 9SY, Midlothian, UK
John J Mackay Department of Plant Sciences, University of Oxford, Oxford, OX1 3RB, UK
Manuel Lamothe Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Centre, Québec, QC, G1V 4C7, Canada
Sébastien Gérardi Institute for Systems and Integrative Biology, Université Laval, Québec, QC, G1V 0A6, Canada Canada Research Chair in Forest Genomics, Forest Research Centre, Université Laval, Québec, QC, G1V 0A6, Canada
Nathalie Isabel Natural Resources Canada, Canadian Forest Service, Laurentian Forestry Centre, Québec, QC, G1V 4C7, Canada Canada Research Chair in Forest Genomics, Forest Research Centre, Université Laval, Québec, QC, G1V 0A6, Canada
Nathalie Pavy Institute for Systems and Integrative Biology, Université Laval, Québec, QC, G1V 0A6, Canada Canada Research Chair in Forest Genomics, Forest Research Centre, Université Laval, Québec, QC, G1V 0A6, Canada
Steven J M Jones Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada
Joerg Bohlmann Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Jean Bousquet Institute for Systems and Integrative Biology, Université Laval, Québec, QC, G1V 0A6, Canada Canada Research Chair in Forest Genomics, Forest Research Centre, Université Laval, Québec, QC, G1V 0A6, Canada
Inanc Birol Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, V5Z 4S6, Canada

Kazemi P, Wong J, Nikolić V, Mohamadi H, Warren RL, Birol I. ntHash2: recursive spaced seed hashing for nucleotide sequences. Bioinformatics 2022;38:4812-4813. [PMID: 36000872 PMCID: PMC9563681 DOI: 10.1093/bioinformatics/btac564] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/11/2022] [Revised: 07/21/2022] [Indexed: 11/29/2022] Open

Lin D, Sutherland D, Aninta SI, Louie N, Nip KM, Li C, Yanai A, Coombe L, Warren RL, Helbing CC, Hoang LMN, Birol I. Mining Amphibian and Insect Transcriptomes for Antimicrobial Peptide Sequences with rAMPage. Antibiotics (Basel) 2022;11:antibiotics11070952. [PMID: 35884206 PMCID: PMC9312091 DOI: 10.3390/antibiotics11070952] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/09/2022] [Revised: 07/12/2022] [Accepted: 07/13/2022] [Indexed: 02/01/2023] Open

Affiliation(s)

Diana Lin Canada’s Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC V5Z 4S6, Canada
Darcy Sutherland Canada’s Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC V5Z 4S6, Canada; British Columbia Centre for Disease Control, Public Health Laboratory, Vancouver, BC V5Z 4R4, Canada; Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
Sambina Islam Aninta Canada’s Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC V5Z 4S6, Canada
Nathan Louie Canada’s Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC V5Z 4S6, Canada
Ka Ming Nip Canada’s Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC V5Z 4S6, Canada; Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
Chenkai Li Canada’s Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC V5Z 4S6, Canada; Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
Anat Yanai Canada’s Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC V5Z 4S6, Canada
Lauren Coombe Canada’s Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC V5Z 4S6, Canada
René L. Warren Canada’s Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC V5Z 4S6, Canada
Caren C. Helbing Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC V8P 5C2, Canada
Linda M. N. Hoang British Columbia Centre for Disease Control, Public Health Laboratory, Vancouver, BC V5Z 4R4, Canada; Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
Inanc Birol Canada’s Michael Smith Genome Sciences Centre at BC Cancer, Vancouver, BC V5Z 4S6, Canada; British Columbia Centre for Disease Control, Public Health Laboratory, Vancouver, BC V5Z 4R4, Canada; Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC V6T 1Z4, Canada

Nikolić V, Afshinfard A, Chu J, Wong J, Coombe L, Nip KM, Warren RL, Birol I. RResolver: efficient short-read repeat resolution within ABySS. BMC Bioinformatics 2022;23:246. [PMID: 35729491 PMCID: PMC9215042 DOI: 10.1186/s12859-022-04790-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2022] [Accepted: 06/09/2022] [Indexed: 11/26/2022] Open

Abstract

BACKGROUND

De novo genome assembly is essential to modern genomics studies. As it is not biased by a reference, it is also a useful method for studying genomes with high variation, such as cancer genomes. De novo short-read assemblers commonly use de Bruijn graphs, where nodes are sequences of equal length k, also known as k-mers. Edges in this graph are established between nodes that overlap by k - 1 bases, and nodes along unambiguous walks in the graph are subsequently merged. The selection of k is influenced by multiple factors, and optimizing this value results in a trade-off between graph connectivity and sequence contiguity. Ideally, multiple k sizes should be used, so lower values can provide good connectivity in lesser covered regions and higher values can increase contiguity in well-covered regions. However, current approaches that use multiple k values do not address the scalability issues inherent to the assembly of large genomes.
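
The de Bruijn graph construction described above can be sketched in a few lines of Python. This is a toy illustration, not ABySS code: the function name `build_de_bruijn` and the example reads are assumptions, and real assemblers also handle reverse complements, sequencing errors, and coverage, which are omitted here.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a node-centric de Bruijn graph (hypothetical helper).

    Nodes are the k-mers seen in the reads; an edge u -> v exists when
    the last k-1 bases of u equal the first k-1 bases of v.
    """
    nodes = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            nodes.add(read[i:i + k])

    graph = defaultdict(set)
    for node in nodes:
        for base in "ACGT":
            nxt = node[1:] + base  # shift by one base: k-1 overlap
            if nxt in nodes:
                graph[node].add(nxt)
    return nodes, dict(graph)


if __name__ == "__main__":
    nodes, graph = build_de_bruijn(["ACGTT", "ACGTA"], k=3)
    for node in sorted(graph):
        print(node, "->", sorted(graph[node]))
```

In this example the node `CGT` has two successors, so it is a branching point; walks can only be merged unambiguously up to such a node, which is where repeat resolution becomes relevant.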

RESULTS

Here we present RResolver, a scalable algorithm that takes a short-read de Bruijn graph assembly with a starting k as input and uses a k value closer to that of the read length to resolve repeats. RResolver builds a Bloom filter of sequencing reads which is used to evaluate the assembly graph path support at branching points and removes paths with insufficient support. RResolver runs efficiently, taking only 26 min on average for an ABySS human assembly with 48 threads and 60 GiB memory. Across all experiments, compared to a baseline assembly, RResolver improves scaffold contiguity (NGA50) by up to 15% and reduces misassemblies by up to 12%.
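
The idea of checking path support against a Bloom filter of reads can be sketched as follows. This is a simplified illustration under stated assumptions, not RResolver's code: the `BloomFilter` class, the `path_support` function, the SHA-256-based hashing, and the filter size are all hypothetical choices made for readability.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, rare false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def path_support(path_seq, read_filter, k):
    """Fraction of the k-mers along a candidate path found in the filter."""
    kmers = [path_seq[i:i + k] for i in range(len(path_seq) - k + 1)]
    if not kmers:
        return 0.0
    return sum(kmer in read_filter for kmer in kmers) / len(kmers)


if __name__ == "__main__":
    k = 5  # a larger k than the one used to build the graph
    reads = ["ACGTACGGA"]
    read_filter = BloomFilter()
    for read in reads:
        for i in range(len(read) - k + 1):
            read_filter.add(read[i:i + k])
    print(path_support("ACGTACG", read_filter, k))  # path seen in reads
    print(path_support("TTTTTTT", read_filter, k))  # path absent from reads
```

A path whose larger k-mers are mostly absent from the filter would be a candidate for removal at a branching point; the threshold for "insufficient support" is left out here because the abstract does not state one.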

CONCLUSIONS

RResolver adds a missing component to scalable de Bruijn graph genome assembly. By improving the initial and fundamental graph traversal outcome, all downstream ABySS algorithms greatly benefit by working with a more accurate and less complex representation of the genome. The RResolver code is integrated into ABySS and is available at https://github.com/bcgsc/abyss/tree/master/RResolver .

Li JX, Coombe L, Wong J, Birol I, Warren RL. ntEdit+Sealer: Efficient Targeted Error Resolution and Automated Finishing of Long-Read Genome Assemblies. Curr Protoc 2022;2:e442. [PMID: 35567771 PMCID: PMC9196995 DOI: 10.1002/cpz1.442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]

Li C, Sutherland D, Hammond SA, Yang C, Taho F, Bergman L, Houston S, Warren RL, Wong T, Hoang LMN, Cameron CE, Helbing CC, Birol I. AMPlify: attentive deep learning model for discovery of novel antimicrobial peptides effective against WHO priority pathogens. BMC Genomics 2022;23:77. [PMID: 35078402 PMCID: PMC8788131 DOI: 10.1186/s12864-022-08310-4] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 08/16/2021] [Accepted: 01/12/2022] [Indexed: 01/25/2023]

Affiliation(s)

Chenkai Li Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, V5Z 4S6, Canada. Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.
Darcy Sutherland Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, V5Z 4S6, Canada. Public Health Laboratory, British Columbia Centre for Disease Control, Vancouver, BC, V5Z 4R4, Canada. Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.
S Austin Hammond Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, V5Z 4S6, Canada.
Chen Yang Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, V5Z 4S6, Canada. Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.
Figali Taho Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, V5Z 4S6, Canada. Bioinformatics Graduate Program, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.
Lauren Bergman Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC, V8P 5C3, Canada.
Simon Houston Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC, V8P 5C3, Canada.
René L Warren Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, V5Z 4S6, Canada.
Titus Wong Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada. Medical Microbiology Laboratory, Vancouver General Hospital, Vancouver, BC, V5Z 1M9, Canada.
Linda M N Hoang Public Health Laboratory, British Columbia Centre for Disease Control, Vancouver, BC, V5Z 4R4, Canada. Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.
Caroline E Cameron Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC, V8P 5C3, Canada. Division of Infectious Diseases, Department of Medicine, University of Washington, Seattle, WA, 98195, USA.
Caren C Helbing Department of Biochemistry and Microbiology, University of Victoria, Victoria, BC, V8P 5C3, Canada.
Inanc Birol Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, BC, V5Z 4S6, Canada. Public Health Laboratory, British Columbia Centre for Disease Control, Vancouver, BC, V5Z 4R4, Canada. Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada.

Stephenson M, Nip KM, HafezQorani S, Gagalova KK, Yang C, Warren RL, Birol I. RNA-Scoop: interactive visualization of transcripts in single-cell transcriptomes. NAR Genom Bioinform 2021;3:lqab105. [PMID: 34859209 PMCID: PMC8633890 DOI: 10.1093/nargab/lqab105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2021] [Revised: 08/21/2021] [Accepted: 11/26/2021] [Indexed: 11/12/2022]

Coombe L, Li JX, Lo T, Wong J, Nikolic V, Warren RL, Birol I. LongStitch: high-quality genome assembly correction and scaffolding using long reads. BMC Bioinformatics 2021;22:534. [PMID: 34717540 PMCID: PMC8557608 DOI: 10.1186/s12859-021-04451-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/17/2021] [Accepted: 10/19/2021] [Indexed: 12/12/2022]

Abstract

BACKGROUND

Generating high-quality de novo genome assemblies is foundational to the genomics study of model and non-model organisms. In recent years, long-read sequencing has greatly benefited genome assembly and scaffolding, a process by which assembled sequences are ordered and oriented through the use of long-range information. Long reads are better able to span repetitive genomic regions compared to short reads, and thus have tremendous utility for resolving problematic regions and helping generate more complete draft assemblies. Here, we present LongStitch, a scalable pipeline that corrects and scaffolds draft genome assemblies exclusively using long reads.

RESULTS

LongStitch incorporates multiple tools developed by our group and runs in up to three stages, which includes initial assembly correction (Tigmint-long), followed by two incremental scaffolding stages (ntLink and ARKS-long). Tigmint-long and ARKS-long are misassembly correction and scaffolding utilities, respectively, previously developed for linked reads, that we adapted for long reads. Here, we describe the LongStitch pipeline and introduce our new long-read scaffolder, ntLink, which utilizes lightweight minimizer mappings to join contigs. LongStitch was tested on short and long-read assemblies of Caenorhabditis elegans, Oryza sativa, and three different human individuals using corresponding nanopore long-read data, and improves the contiguity of each assembly from 1.2-fold up to 304.6-fold (as measured by NGA50 length). Furthermore, LongStitch generates more contiguous and correct assemblies compared to state-of-the-art long-read scaffolder LRScaf in most tests, and consistently improves upon human assemblies in under five hours using less than 23 GB of RAM.
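
To make "lightweight minimizer mappings" more concrete, here is a minimal Python sketch of the general technique, not ntLink's actual code. It picks one representative k-mer (a minimizer) per window and uses minimizers shared between a long read and the contigs to suggest a contig order. The function names, the lexicographic ordering (real tools typically order k-mers by hash), and the toy sequences are assumptions, and strand orientation is ignored.

```python
def minimizers(seq, k, w):
    """Return (position, k-mer) of the smallest k-mer in each window of w consecutive k-mers."""
    kms = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = []
    for start in range(len(kms) - w + 1):
        window = kms[start:start + w]
        offset = min(range(w), key=lambda j: window[j])  # leftmost smallest k-mer
        item = (start + offset, window[offset])
        if not picked or picked[-1] != item:
            picked.append(item)
    return picked


def contig_order(read, contigs, k, w):
    """Order contigs by where their unique minimizers occur along a long read."""
    owner = {}
    for name, seq in contigs.items():
        for _, kmer in minimizers(seq, k, w):
            owner.setdefault(kmer, set()).add(name)
    order = []
    for _, kmer in minimizers(read, k, w):
        names = owner.get(kmer)
        if names and len(names) == 1:  # skip minimizers shared by several contigs
            name = next(iter(names))
            if not order or order[-1] != name:
                order.append(name)
    return order


if __name__ == "__main__":
    contigs = {"A": "AAACCCGGGT", "B": "TTTGAGATCT"}
    long_read = contigs["A"] + contigs["B"]  # toy read spanning both contigs
    print(contig_order(long_read, contigs, k=4, w=3))
```

Because only the minimizers are indexed and compared, far fewer k-mers are stored than in a full alignment index, which is the sense in which such mappings are lightweight.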

CONCLUSIONS

Due to its effectiveness and efficiency in improving draft assemblies using long reads, we expect LongStitch to benefit a wide variety of de novo genome assembly projects. The LongStitch pipeline is freely available at https://github.com/bcgsc/longstitch .

Warren RL, Birol I. HLA alleles measured from COVID-19 patient transcriptomes reveal associations with disease prognosis in a New York cohort. PeerJ 2021;9:e12368. [PMID: 34722002 PMCID: PMC8522641 DOI: 10.7717/peerj.12368] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/03/2021] [Accepted: 10/01/2021] [Indexed: 12/19/2022]

Jackman SD, Coombe L, Warren RL, Kirk H, Trinh E, MacLeod T, Pleasance S, Pandoh P, Zhao Y, Coope RJ, Bousquet J, Bohlmann J, Jones SJM, Birol I. Complete Mitochondrial Genome of a Gymnosperm, Sitka Spruce (Picea sitchensis), Indicates a Complex Physical Structure. Genome Biol Evol 2021;12:1174-1179. [PMID: 32449750 PMCID: PMC7486957 DOI: 10.1093/gbe/evaa108] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Accepted: 05/20/2020] [Indexed: 12/12/2022]

Warren RL, Birol I. Interactive SARS-CoV-2 mutation timemaps. F1000Res 2021;10:68. [PMID: 34136131 PMCID: PMC8188262 DOI: 10.12688/f1000research.50857.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/29/2021] [Indexed: 04/03/2024]

Warren RL, Birol I. Interactive SARS-CoV-2 mutation timemaps. F1000Res 2021;10:68. [PMID: 34136131 PMCID: PMC8188262 DOI: 10.12688/f1000research.50857.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/24/2021] [Indexed: 11/23/2022]

Warren RL, Birol I. HLA predictions from the bronchoalveolar lavage fluid and blood samples of eight COVID-19 patients at the pandemic onset. Bioinformatics 2021;36:5271-5273. [PMID: 32853340 PMCID: PMC7540287 DOI: 10.1093/bioinformatics/btaa756] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 04/01/2020] [Revised: 08/18/2020] [Accepted: 08/20/2020] [Indexed: 12/16/2022]

Warren RL, Birol I. Interactive SARS-CoV-2 mutation timemaps. ArXiv 2020:2012.15697. [PMID: 33398246 PMCID: PMC7781321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]

Warren RL, Birol I. Retrospective in silico HLA predictions from COVID-19 patients reveal alleles associated with disease prognosis. medRxiv 2020:2020.10.27.20220863. [PMID: 33140057 PMCID: PMC7605564 DOI: 10.1101/2020.10.27.20220863] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 12/22/2022]

Nip KM, Chiu R, Yang C, Chu J, Mohamadi H, Warren RL, Birol I. RNA-Bloom enables reference-free and reference-guided sequence assembly for single-cell transcriptomes. Genome Res 2020;30:1191-1200. [PMID: 32817073 PMCID: PMC7462077 DOI: 10.1101/gr.260174.119] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 12/11/2019] [Accepted: 07/23/2020] [Indexed: 12/27/2022]

Warren RL, Coombe L, Mohamadi H, Zhang J, Jaquish B, Isabel N, Jones SJM, Bousquet J, Bohlmann J, Birol I. ntEdit: scalable genome sequence polishing. Bioinformatics 2020;35:4430-4432. [PMID: 31095290 PMCID: PMC6821332 DOI: 10.1093/bioinformatics/btz400] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 12/05/2018] [Revised: 03/04/2019] [Accepted: 05/07/2019] [Indexed: 02/05/2023]

Abstract

Motivation

In the modern genomics era, genome sequence assemblies are routine practice. However, depending on the methodology, resulting drafts may contain considerable base errors. Although utilities exist for genome base polishing, they work best with high read coverage and do not scale well. We developed ntEdit, a Bloom filter-based genome sequence editing utility that scales to large mammalian and conifer genomes.

Results

We first tested ntEdit and the state-of-the-art assembly improvement tools GATK, Pilon and Racon on controlled Escherichia coli and Caenorhabditis elegans sequence data. Generally, ntEdit performs well at low sequence depths (<20×), fixing the majority (>97%) of base substitutions and indels, and its performance is largely constant with increased coverage. In all experiments conducted using a single CPU, the ntEdit pipeline executed in <14 s and <3 m, on average, on E. coli and C. elegans, respectively. We performed similar benchmarks on a sub-20× coverage human genome sequence dataset, inspecting accuracy and resource usage in editing chromosomes 1 and 21, and whole genome. ntEdit scaled linearly, executing in 30–40 m on those sequences. We show how ntEdit ran in <2 h 20 m to improve upon long and linked read human genome assemblies of NA12878, using high-coverage (54×) Illumina sequence data from the same individual, fixing frame shifts in coding sequences. We also generated 17-fold coverage spruce sequence data from haploid sequence sources (seed megagametophyte), and used it to edit our pseudo haploid assemblies of the 20 Gb interior and white spruce genomes in <4 and <5 h, respectively, making roughly 50M edits at a (substitution+indel) rate of 0.0024.

Availability and implementation

https://github.com/bcgsc/ntedit

Supplementary information

Supplementary data are available at Bioinformatics online.

Hafezqorani S, Yang C, Lo T, Nip KM, Warren RL, Birol I. Trans-NanoSim characterizes and simulates nanopore RNA-sequencing data. Gigascience 2020;9:5855462. [PMID: 32520350 PMCID: PMC7285873 DOI: 10.1093/gigascience/giaa061] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 02/11/2020] [Revised: 04/14/2020] [Accepted: 05/12/2020] [Indexed: 01/08/2023]

Coombe L, Nikolić V, Chu J, Birol I, Warren RL. ntJoin: Fast and lightweight assembly-guided scaffolding using minimizer graphs. Bioinformatics 2020;36:3885-3887. [PMID: 32311025 PMCID: PMC7320612 DOI: 10.1093/bioinformatics/btaa253] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/13/2020] [Revised: 03/23/2020] [Accepted: 04/14/2020] [Indexed: 11/17/2022]

Law WD, Warren RL, McCallion AS. Establishment of an eHAP1 human haploid cell line hybrid reference genome assembled from short and long reads. Genomics 2020;112:2379-2384. [PMID: 31962144 DOI: 10.1016/j.ygeno.2020.01.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/19/2019] [Revised: 01/13/2020] [Accepted: 01/15/2020] [Indexed: 12/31/2022]

Warren RL, Birol I. HLA predictions from the bronchoalveolar lavage fluid samples of five patients at the early stage of the Wuhan seafood market COVID-19 outbreak. ArXiv 2020:arXiv:2004.07108v3. [PMID: 32550246 PMCID: PMC7280900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]

Helbing CC, Hammond SA, Jackman SH, Houston S, Warren RL, Cameron CE, Birol I. Antimicrobial peptides from Rana [Lithobates] catesbeiana: Gene structure and bioinformatic identification of novel forms from tadpoles. Sci Rep 2019;9:1529. [PMID: 30728430 PMCID: PMC6365531 DOI: 10.1038/s41598-018-38442-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/17/2018] [Accepted: 12/28/2018] [Indexed: 01/21/2023]

Yeo S, Coombe L, Warren RL, Chu J, Birol I. ARCS: scaffolding genome drafts with linked reads. Bioinformatics 2018;34:725-731. [PMID: 29069293 PMCID: PMC6030987 DOI: 10.1093/bioinformatics/btx675] [Citation(s) in RCA: 100] [Impact Index Per Article: 16.7] [Received: 03/22/2017] [Accepted: 10/20/2017] [Indexed: 01/12/2023]

Xue Z, Warren RL, Gibb EA, MacMillan D, Wong J, Chiu R, Hammond SA, Yang C, Nip KM, Ennis CA, Hahn A, Reynolds S, Birol I. Recurrent tumor-specific regulation of alternative polyadenylation of cancer-related genes. BMC Genomics 2018;19:536. [PMID: 30005633 PMCID: PMC6045855 DOI: 10.1186/s12864-018-4903-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 03/08/2018] [Accepted: 06/27/2018] [Indexed: 01/09/2023]

Abstract

Background

Alternative polyadenylation (APA) results in messenger RNA molecules with different 3′ untranslated regions (3′ UTRs), affecting the molecules’ stability, localization, and translation. APA is pervasive and implicated in cancer. Earlier reports on APA focused on 3′ UTR length modifications and commonly characterized APA events as 3′ UTR shortening or lengthening. However, such characterization oversimplifies the processing of 3′ ends of transcripts and fails to adequately describe the various scenarios we observe.

Results

We built a cloud-based targeted de novo transcript assembly and analysis pipeline that incorporates our previously developed cleavage site prediction tool, KLEAT. We applied this pipeline to elucidate the APA profiles of 114 genes in 9939 tumor and 729 tissue normal samples from The Cancer Genome Atlas (TCGA). The full set of 10,668 RNA-Seq samples from 33 cancer types has not been utilized by previous APA studies. By comparing the frequencies of predicted cleavage sites between normal and tumor sample groups, we identified 77 events (i.e. gene-cancer type pairs) of tumor-specific APA regulation in 13 cancer types; for 15 genes, such regulation is recurrent across multiple cancers. Our results also support a previous report showing the 3′ UTR shortening of FGF2 in multiple cancers. However, over half of the events we identified display complex changes to 3′ UTR length that resist simple classification like shortening or lengthening.

Conclusions

Recurrent tumor-specific regulation of APA is widespread in cancer. However, the regulation pattern that we observed in TCGA RNA-seq data cannot be described as straightforward 3′ UTR shortening or lengthening. Continued investigation into this complex, nuanced regulatory landscape will provide further insight into its role in tumor formation and development.

Electronic supplementary material

The online version of this article (10.1186/s12864-018-4903-7) contains supplementary material, which is available to authorized users.

Coombe L, Zhang J, Vandervalk BP, Chu J, Jackman SD, Birol I, Warren RL. ARKS: chromosome-scale scaffolding of human genome drafts with linked read kmers. BMC Bioinformatics 2018;19:234. [PMID: 29925315 PMCID: PMC6011487 DOI: 10.1186/s12859-018-2243-x] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 05/02/2018] [Accepted: 06/13/2018] [Indexed: 12/16/2022]

Abstract

BACKGROUND

The long-range sequencing information captured by linked reads, such as those available from 10× Genomics (10xG), helps resolve genome sequence repeats, and yields accurate and contiguous draft genome assemblies. We introduce ARKS, an alignment-free linked read genome scaffolding methodology that uses linked reads to organize genome assemblies further into contiguous drafts. Our approach departs from other read alignment-dependent linked read scaffolders, including our own (ARCS), and uses a kmer-based mapping approach. The kmer mapping strategy has several advantages over read alignment methods, including better usability and faster processing, as it precludes the need for input sequence formatting and draft sequence assembly indexing. The reliance on kmers instead of read alignments for pairing sequences relaxes the workflow requirements, and drastically reduces the run time.

RESULTS

Here, we show how linked reads, when used in conjunction with Hi-C data for scaffolding, improve a draft human genome assembly of PacBio long-read data five-fold (baseline vs. ARKS NG50 = 4.6 vs. 23.1 Mbp, respectively). We also demonstrate how the method provides further improvements of a megabase-scale Supernova human genome assembly (NG50 = 14.74 Mbp vs. 25.94 Mbp before and after ARKS), which itself exclusively uses linked read data for assembly, with an execution speed six to nine times faster than competitive linked read scaffolders (~ 10.5 h compared to 75.7 h, on average). Following ARKS scaffolding of a human genome 10xG Supernova assembly (of cell line NA12878), fewer than 9 scaffolds cover each chromosome, except the largest (chromosome 1, n = 13).
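
For readers unfamiliar with the NG50 figures quoted above, the metric can be computed as in the short Python sketch below. It uses the common definition (the length of the scaffold at which the cumulative length of scaffolds, sorted longest first, reaches half of the estimated genome size); the function name `ng50` and the toy numbers are illustrative only.

```python
def ng50(lengths, genome_size):
    """Length of the scaffold at which the running total reaches half the genome size."""
    total = 0
    for length in sorted(lengths, reverse=True):
        total += length
        if total >= genome_size / 2:
            return length
    return 0  # the assembly covers less than half of the genome


if __name__ == "__main__":
    genome_size = 120
    before = [50, 30, 20, 10]  # toy scaffold lengths
    after = [50, 50, 10]       # the 30 and 20 scaffolds joined by scaffolding
    print(ng50(before, genome_size), ng50(after, genome_size))
```

Joining scaffolds leaves the total assembled length unchanged but raises NG50, which is why the metric is used to report scaffolding gains such as those above.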

CONCLUSIONS

ARKS uses a kmer mapping strategy instead of linked read alignments to record and associate the barcode information needed to order and orient draft assembly sequences. The simplified workflow, when compared to that of our initial implementation, ARCS, markedly improves run time performances on experimental human genome datasets. Furthermore, the novel distance estimator in ARKS utilizes barcoding information from linked reads to estimate gap sizes. It accomplishes this by modeling the relationship between known distances of a region within contigs and calculating associated Jaccard indices. ARKS has the potential to provide correct, chromosome-scale genome assemblies, promptly. We expect ARKS to have broad utility in helping refine draft genomes.
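
The Jaccard index mentioned above measures how much two sets overlap. A minimal Python sketch follows, assuming each contig end is summarized by the set of linked-read barcodes mapped to it; the barcode names are made up, and how ARKS turns such indices into gap estimates is not specified in the abstract.

```python
def jaccard(a, b):
    """Jaccard index |A intersect B| / |A union B| of two sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Hypothetical barcode sets observed at the ends of two contigs.
    end_of_contig_1 = {"bc1", "bc2", "bc3"}
    start_of_contig_2 = {"bc2", "bc3", "bc4"}
    print(jaccard(end_of_contig_1, start_of_contig_2))
```

Intuitively, regions that lie close together on the same molecule share more barcodes and so score higher, which is what makes the index usable as a proxy for distance.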

Kucuk E, Chu J, Vandervalk BP, Hammond SA, Warren RL, Birol I. Kollector: transcript-informed, targeted de novo assembly of gene loci. Bioinformatics 2018;33:1782-1788. [PMID: 28186221 PMCID: PMC5572715 DOI: 10.1093/bioinformatics/btx078] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 09/16/2016] [Accepted: 02/07/2017] [Indexed: 11/13/2022]

Abstract

Motivation

Despite considerable advancements in sequencing and computing technologies, de novo assembly of whole eukaryotic genomes is still a time-consuming task that requires a significant amount of computational resources and expertise. A targeted assembly approach to perform local assembly of sequences of interest remains a valuable option for some applications. This is especially true for gene-centric assemblies, whose resulting sequence can be readily utilized for more focused biological research. Here we describe Kollector, an alignment-free targeted assembly pipeline that uses thousands of transcript sequences concurrently to inform the localized assembly of corresponding gene loci. Kollector robustly reconstructs introns and novel sequences within these loci, and scales well to large genomes, properties that make it especially useful for researchers working on non-model eukaryotic organisms.

Results

We demonstrate the performance of Kollector for assembling complete or near-complete Caenorhabditis elegans and Homo sapiens gene loci from their respective input transcripts. In a time- and memory-efficient manner, the Kollector pipeline successfully reconstructs 99% and 80% of C. elegans and H. sapiens transcript targets, respectively (compared to 86% and 73% with standard de novo assembly techniques), in their corresponding genomic space using whole genome shotgun sequencing reads. We also show that Kollector outperforms both established and recently released targeted assembly tools. Finally, we demonstrate three use cases for Kollector, including comparative and cancer genomics applications.

Availability and Implementation

Kollector is implemented as a bash script, and is available at https://github.com/bcgsc/kollector

Supplementary information

Supplementary data are available at Bioinformatics online.

Jones SJM, Taylor GA, Chan S, Warren RL, Hammond SA, Bilobram S, Mordecai G, Suttle CA, Miller KM, Schulze A, Chan AM, Jones SJ, Tse K, Li I, Cheung D, Mungall KL, Choo C, Ally A, Dhalla N, Tam AKY, Troussard A, Kirk H, Pandoh P, Paulino D, Coope RJN, Mungall AJ, Moore R, Zhao Y, Birol I, Ma Y, Marra M, Haulena M. The Genome of the Beluga Whale (Delphinapterus leucas). Genes (Basel) 2017;8:genes8120378. [PMID: 29232881 PMCID: PMC5748696 DOI: 10.3390/genes8120378] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 09/12/2017] [Revised: 11/28/2017] [Accepted: 12/01/2017] [Indexed: 12/17/2022]

Affiliation(s)

Steven J M Jones Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada. Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
Gregory A Taylor Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Simon Chan Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
René L Warren Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
S Austin Hammond Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Steven Bilobram Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Gideon Mordecai Department of Earth, Ocean & Atmospheric Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada. Institute for the Oceans & Fisheries, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.
Curtis A Suttle Department of Earth, Ocean & Atmospheric Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada. Institute for the Oceans & Fisheries, University of British Columbia, Vancouver, BC V6T 1Z4, Canada. Department of Microbiology & Immunology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada. Department of Botany, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.
Kristina M Miller Fisheries and Oceans Canada, Molecular Genetics Section, Pacific Biological Station, Nanaimo, BC V9R 5K6, Canada.
Angela Schulze Fisheries and Oceans Canada, Molecular Genetics Section, Pacific Biological Station, Nanaimo, BC V9R 5K6, Canada.
Amy M Chan Department of Earth, Ocean & Atmospheric Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada. Institute for the Oceans & Fisheries, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.
Samantha J Jones Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
Kane Tse Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Irene Li Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Dorothy Cheung Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Karen L Mungall Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Caleb Choo Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Adrian Ally Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Noreen Dhalla Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Angela K Y Tam Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Armelle Troussard Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Heather Kirk Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Pawan Pandoh Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Daniel Paulino Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Robin J N Coope Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Andrew J Mungall Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Richard Moore Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Yongjun Zhao Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Inanc Birol Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
Yussanne Ma Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Marco Marra Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
Martin Haulena Vancouver Aquarium, Vancouver, BC V6G 3E2, Canada.

Jones SJ, Haulena M, Taylor GA, Chan S, Bilobram S, Warren RL, Hammond SA, Mungall KL, Choo C, Kirk H, Pandoh P, Ally A, Dhalla N, Tam AKY, Troussard A, Paulino D, Coope RJN, Mungall AJ, Moore R, Zhao Y, Birol I, Ma Y, Marra M, Jones SJM. The Genome of the Northern Sea Otter (Enhydra lutris kenyoni). Genes (Basel) 2017;8:genes8120379. [PMID: 29232880 PMCID: PMC5748697 DOI: 10.3390/genes8120379] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 09/12/2017] [Revised: 11/28/2017] [Accepted: 12/01/2017] [Indexed: 11/21/2022]

Affiliation(s)

Samantha J Jones Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
Martin Haulena Vancouver Aquarium, Vancouver, BC V6G 3E2, Canada.
Gregory A Taylor Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Simon Chan Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Steven Bilobram Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
René L Warren Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
S Austin Hammond Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Karen L Mungall Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Caleb Choo Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Heather Kirk Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Pawan Pandoh Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Adrian Ally Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Noreen Dhalla Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Angela K Y Tam Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Armelle Troussard Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Daniel Paulino Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Robin J N Coope Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Andrew J Mungall Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Richard Moore Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Yongjun Zhao Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Inanc Birol Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
Yussanne Ma Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada.
Marco Marra Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T 1Z3, Canada.
Steven J M Jones Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC V5Z 4E6, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC V6T 1Z3, Canada. Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada.

Hammond SA, Warren RL, Vandervalk BP, Kucuk E, Khan H, Gibb EA, Pandoh P, Kirk H, Zhao Y, Jones M, Mungall AJ, Coope R, Pleasance S, Moore RA, Holt RA, Round JM, Ohora S, Walle BV, Veldhoen N, Helbing CC, Birol I. The North American bullfrog draft genome provides insight into hormonal regulation of long noncoding RNA. Nat Commun 2017;8:1433. [PMID: 29127278 PMCID: PMC5681567 DOI: 10.1038/s41467-017-01316-7] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Received: 02/22/2017] [Accepted: 09/07/2017] [Indexed: 12/16/2022] Open

Affiliation(s)

S Austin Hammond Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
René L Warren Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Benjamin P Vandervalk Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Erdi Kucuk Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Hamza Khan Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Ewan A Gibb Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Pawan Pandoh Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Heather Kirk Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Yongjun Zhao Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Martin Jones Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Andrew J Mungall Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Robin Coope Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Stephen Pleasance Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Richard A Moore Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Robert A Holt Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6
Jessica M Round Department of Biochemistry and Microbiology, University of Victoria, Petch Bldg Room 207, 3800 Finnerty Road, Victoria, BC, Canada, V8P 5C2
Sara Ohora Department of Biochemistry and Microbiology, University of Victoria, Petch Bldg Room 207, 3800 Finnerty Road, Victoria, BC, Canada, V8P 5C2
Branden V Walle Department of Biochemistry and Microbiology, University of Victoria, Petch Bldg Room 207, 3800 Finnerty Road, Victoria, BC, Canada, V8P 5C2
Nik Veldhoen Department of Biochemistry and Microbiology, University of Victoria, Petch Bldg Room 207, 3800 Finnerty Road, Victoria, BC, Canada, V8P 5C2
Caren C Helbing Department of Biochemistry and Microbiology, University of Victoria, Petch Bldg Room 207, 3800 Finnerty Road, Victoria, BC, Canada, V8P 5C2.
Inanc Birol Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, 570 West 7th Ave - Suite 100, Vancouver, BC, Canada, V5Z 4S6.

Kucuk E, Chu J, Vandervalk BP, Hammond SA, Warren RL. Kollector: transcript-informed, targeted de novo assembly of gene loci. Bioinformatics 2017;33:2789. [PMID: 28903539 PMCID: PMC5860073 DOI: 10.1093/bioinformatics/btx405] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/02/2022] Open

Chu J, Mohamadi H, Warren RL, Yang C, Birol I. Innovations and challenges in detecting long read overlaps: an evaluation of the state-of-the-art. Bioinformatics 2017;33:1261-1270. [PMID: 28003261 PMCID: PMC5408847 DOI: 10.1093/bioinformatics/btw811] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 09/03/2016] [Accepted: 12/16/2016] [Indexed: 01/23/2023] Open

Yang C, Chu J, Warren RL, Birol I. NanoSim: nanopore sequence read simulator based on statistical characterization. Gigascience 2017;6:1-6. [PMID: 28327957 PMCID: PMC5530317 DOI: 10.1093/gigascience/gix010] [Citation(s) in RCA: 106] [Impact Index Per Article: 15.1] [Received: 10/26/2016] [Revised: 01/12/2017] [Accepted: 02/21/2017] [Indexed: 01/19/2023] Open

Abstract

Background

The MinION sequencing instrument from Oxford Nanopore Technologies (ONT) produces long read lengths from single-molecule sequencing - valuable features for detailed genome characterization. To realize the potential of this platform, a number of groups are developing bioinformatics tools tuned for the unique characteristics of its data. We note that these development efforts would benefit from a simulator software, the output of which could be used to benchmark analysis tools.

Results

Here, we introduce NanoSim, a fast and scalable read simulator that captures the technology-specific features of ONT data and allows for adjustments upon improvement of nanopore sequencing technology. The first step of NanoSim is read characterization, which provides a comprehensive alignment-based analysis and generates a set of read profiles serving as the input to the next step, the simulation stage. The simulation stage uses the model built in the previous step to produce in silico reads for a given reference genome. NanoSim is written in Python and R. The source files and manual are available at the Genome Sciences Centre website: http://www.bcgsc.ca/platform/bioinfo/software/nanosim.

Conclusion

In this work, we model the base-calling errors of ONT reads to inform the simulation of sequences with similar characteristics. We showcase the performance of NanoSim on publicly available datasets generated using the R7 and R7.3 chemistries and different sequencing kits and compare the resulting synthetic reads to those of other long-sequence simulators and experimental ONT reads. We expect NanoSim to have an enabling role in the field and benefit the development of scalable next-generation sequencing technologies for the long nanopore reads, including genome assembly, mutation detection, and even metagenomic analysis software.

Coombe L, Warren RL, Jackman SD, Yang C, Vandervalk BP, Moore RA, Pleasance S, Coope RJ, Bohlmann J, Holt RA, Jones SJM, Birol I. Assembly of the Complete Sitka Spruce Chloroplast Genome Using 10X Genomics' GemCode Sequencing Data. PLoS One 2016;11:e0163059. [PMID: 27632164 PMCID: PMC5025161 DOI: 10.1371/journal.pone.0163059] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 03/15/2016] [Accepted: 09/01/2016] [Indexed: 11/19/2022] Open

Jackman SD, Warren RL, Gibb EA, Vandervalk BP, Mohamadi H, Chu J, Raymond A, Pleasance S, Coope R, Wildung MR, Ritland CE, Bousquet J, Jones SJM, Bohlmann J, Birol I. Organellar Genomes of White Spruce (Picea glauca): Assembly and Annotation. Genome Biol Evol 2015;8:29-41. [PMID: 26645680 PMCID: PMC4758241 DOI: 10.1093/gbe/evv244] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Indexed: 11/12/2022] Open

Affiliation(s)

Shaun D Jackman Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
René L Warren Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
Ewan A Gibb Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
Benjamin P Vandervalk Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
Hamid Mohamadi Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
Justin Chu Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
Anthony Raymond Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
Stephen Pleasance Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
Robin Coope Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada
Mark R Wildung School of Molecular Biosciences, Washington State University
Carol E Ritland Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, Canada
Jean Bousquet Department of Forest and Environmental Genomics, Université Laval, Québec, QC, Canada
Steven J M Jones Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada. School of Computing Science, Simon Fraser University, Burnaby, BC, Canada.
Joerg Bohlmann Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, Canada. Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada. Department of Botany, University of British Columbia, Vancouver, BC, Canada.
Inanç Birol Canada's Michael Smith Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada. School of Computing Science, Simon Fraser University, Burnaby, BC, Canada. Department of Computer Science, University of British Columbia, Vancouver, BC, Canada.

Vandervalk BP, Yang C, Xue Z, Raghavan K, Chu J, Mohamadi H, Jackman SD, Chiu R, Warren RL, Birol I. Konnector v2.0: pseudo-long reads from paired-end sequencing data. BMC Med Genomics 2015;8 Suppl 3:S1. [PMID: 26399504 PMCID: PMC4582294 DOI: 10.1186/1755-8794-8-s3-s1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 01/20/2023] Open

Warren RL, Yang C, Vandervalk BP, Behsaz B, Lagman A, Jones SJM, Birol I. LINKS: Scalable, alignment-free scaffolding of draft genomes with long reads. Gigascience 2015;4:35. [PMID: 26244089 PMCID: PMC4524009 DOI: 10.1186/s13742-015-0076-3] [Citation(s) in RCA: 121] [Impact Index Per Article: 13.4] [Received: 05/28/2015] [Accepted: 07/29/2015] [Indexed: 12/05/2022] Open

Abstract

Background

Owing to the complexity of the assembly problem, we do not yet have complete genome sequences. The difficulty in assembling reads into finished genomes is exacerbated by sequence repeats and the inability of short reads to capture sufficient genomic information to resolve those problematic regions. In this regard, established and emerging long read technologies show great promise, but their current associated higher error rates typically require computational base correction and/or additional bioinformatics pre-processing before they can be of value.

Results

We present LINKS, the Long Interval Nucleotide K-mer Scaffolder algorithm, a method that makes use of the sequence properties of nanopore sequence data and other error-containing sequence data, to scaffold high-quality genome assemblies, without the need for read alignment or base correction. Here, we show how the contiguity of an ABySS Escherichia coli K-12 genome assembly can be increased greater than five-fold by the use of beta-released Oxford Nanopore Technologies Ltd. long reads and how LINKS leverages long-range information in Saccharomyces cerevisiae W303 nanopore reads to yield assemblies whose resulting contiguity and correctness are on par with or better than that of competing applications. We also present the re-scaffolding of the colossal white spruce (Picea glauca) draft assembly (PG29, 20 Gbp) and demonstrate how LINKS scales to larger genomes.

Conclusions

This study highlights the present utility of nanopore reads for genome scaffolding in spite of their current limitations, which are expected to diminish as the nanopore sequencing technology advances. We expect LINKS to have broad utility in harnessing the potential of long reads in connecting high-quality sequences of small and large genome assembly drafts.

Electronic supplementary material

The online version of this article (doi:10.1186/s13742-015-0076-3) contains supplementary material, which is available to authorized users.

Paulino D, Warren RL, Vandervalk BP, Raymond A, Jackman SD, Birol I. Sealer: a scalable gap-closing application for finishing draft genomes. BMC Bioinformatics 2015. [PMID: 26209068 PMCID: PMC4515008 DOI: 10.1186/s12859-015-0663-4] [Citation(s) in RCA: 92] [Impact Index Per Article: 10.2] [Indexed: 11/10/2022] Open

Warren RL, Keeling CI, Yuen MMS, Raymond A, Taylor GA, Vandervalk BP, Mohamadi H, Paulino D, Chiu R, Jackman SD, Robertson G, Yang C, Boyle B, Hoffmann M, Weigel D, Nelson DR, Ritland C, Isabel N, Jaquish B, Yanchuk A, Bousquet J, Jones SJM, MacKay J, Birol I, Bohlmann J. Improved white spruce (Picea glauca) genome assemblies and annotation of large gene families of conifer terpenoid and phenolic defense metabolism. Plant J 2015;83:189-212. [PMID: 26017574 DOI: 10.1111/tpj.12886] [Citation(s) in RCA: 120] [Impact Index Per Article: 13.3] [Received: 02/24/2015] [Accepted: 05/15/2015] [Indexed: 05/21/2023]

Affiliation(s)

René L Warren Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Christopher I Keeling Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Macaire Man Saint Yuen Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Anthony Raymond Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Greg A Taylor Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Benjamin P Vandervalk Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Hamid Mohamadi Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Daniel Paulino Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Readman Chiu Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Shaun D Jackman Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Gordon Robertson Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Chen Yang Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada
Brian Boyle Department of Wood and Forest Sciences, Université Laval, Québec, QC, G1V 0A6, Canada
Margarete Hoffmann Max Planck Institute for Developmental Biology, Spemannstrasse 35, 72076, Tübingen, Germany
Detlef Weigel Max Planck Institute for Developmental Biology, Spemannstrasse 35, 72076, Tübingen, Germany
David R Nelson Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, TN, 38163, USA
Carol Ritland Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
Nathalie Isabel Natural Resources Canada, Laurentian Forestry Centre, Québec, QC, G1V 4C7, Canada
Barry Jaquish British Columbia Ministry of Forests, Lands, and Natural Resource Operations, Victoria, BC, V8W 9C2, Canada
Alvin Yanchuk British Columbia Ministry of Forests, Lands, and Natural Resource Operations, Victoria, BC, V8W 9C2, Canada
Jean Bousquet Department of Wood and Forest Sciences, Université Laval, Québec, QC, G1V 0A6, Canada
Steven J M Jones Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada. School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada.
John MacKay Department of Wood and Forest Sciences, Université Laval, Québec, QC, G1V 0A6, Canada. Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, OX1 3RB, UK.
Inanc Birol Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, BC, V5Z 4S6, Canada. Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada. School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada.
Joerg Bohlmann Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada. Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada. Department of Botany, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.
