1
Limberis JD, Metcalfe JZ. primerJinn: a tool for rationally designing multiplex PCR primer sets for amplicon sequencing and performing in silico PCR. BMC Bioinformatics 2023; 24:468. [PMID: 38082220 PMCID: PMC10714587 DOI: 10.1186/s12859-023-05609-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 12/08/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND: Multiplex PCR amplifies numerous targets in a single tube reaction and is essential in molecular biology and clinical diagnostics. One of its most important applications is in the targeted sequencing of pathogens. Despite this importance, few tools are available for designing multiplex primers. RESULTS: We developed primerJinn, a tool that designs a set of multiplex primers and allows for the in silico PCR evaluation of primer sets against numerous input genomes. We used primerJinn to create a multiplex PCR for the sequencing of drug resistance-conferring gene regions from Mycobacterium tuberculosis, which were then successfully sequenced. CONCLUSIONS: primerJinn provides a user-friendly, efficient, and accurate method for designing multiplex PCR primers for targeted sequencing and performing in silico PCR. It can be used for various applications in molecular biology and bioinformatics research, including the design of assays for amplifying and sequencing drug-resistance-conferring regions in important pathogens.
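To illustrate what an in silico PCR evaluation involves, the following is a minimal sketch, not primerJinn's implementation. It assumes exact primer matches on the forward strand only; the function names and the `max_len` cutoff are hypothetical choices made for this example.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def in_silico_pcr(template: str, forward: str, reverse: str, max_len: int = 2000) -> list:
    """List amplicons predicted by exact primer matching (assumption: no mismatches).

    The forward primer must match the template as given; the reverse primer
    must match the opposite strand, so its reverse complement is searched
    downstream of each forward-primer site.
    """
    template = template.upper()
    forward = forward.upper()
    reverse_site = reverse_complement(reverse)
    amplicons = []
    start = template.find(forward)
    while start != -1:
        end = template.find(reverse_site, start + len(forward))
        while end != -1:
            stop = end + len(reverse_site)
            if stop - start > max_len:
                break  # any later reverse site gives an even longer product
            amplicons.append(template[start:stop])
            end = template.find(reverse_site, end + 1)
        start = template.find(forward, start + 1)
    return amplicons


# Toy template: flank + forward site + insert + reverse site + flank
toy_template = "GGG" + "ATGCCGTA" + "TTTTTTTT" + "CGGATTCA" + "CCC"
toy_products = in_silico_pcr(toy_template, "ATGCCGTA", "TGAATCCG")
```

A real evaluation would also search the reverse strand, tolerate mismatches, and run over every input genome; this sketch only shows the core idea of pairing primer sites into predicted products.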
Affiliation(s)
- Jason D Limberis
- Division of Pulmonary and Critical Care Medicine, Trauma Centre, Zuckerberg San Francisco General Hospital, University of California, San Francisco, San Francisco, CA, USA.
- John Z Metcalfe
- Division of Pulmonary and Critical Care Medicine, Trauma Centre, Zuckerberg San Francisco General Hospital, University of California, San Francisco, San Francisco, CA, USA
2
Limberis JD, Metcalfe JZ. primerJinn - a tool for rationally designing multiplex PCR primer sets and in silico PCR. Research Square 2023:rs.3.rs-3025970. [PMID: 37461503 PMCID: PMC10350116 DOI: 10.21203/rs.3.rs-3025970/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/23/2023]
Abstract
Background: Multiplex PCR amplifies numerous targets in a single tube reaction and is essential in molecular biology and clinical diagnostics. One of its most important applications is in the targeted sequencing of pathogens. Despite this importance, few tools are available for designing multiplex primers. Results: We developed primerJinn, a tool that designs a set of multiplex primers and allows for the in silico PCR evaluation of primer sets against numerous input genomes. We used primerJinn to create a multiplex PCR for the sequencing of drug resistance-conferring gene regions from Mycobacterium tuberculosis, which were then successfully sequenced. Conclusions: primerJinn provides a user-friendly, efficient, and accurate method for designing multiplex PCR primers and performing in silico PCR. It can be used for various applications in molecular biology and bioinformatics research, including the design of assays for amplifying and sequencing drug-resistance-conferring regions in important pathogens.
3
Cai ZF, Hu JY, Yin TT, Wang D, Shen QK, Ma C, Ou DQ, Xu MM, Shi X, Li QL, Wu RN, Ajuma L, Adeola AC, Zhang YP, Peng MS. Long amplicon HiFi sequencing for mitochondrial DNA genomes. Mol Ecol Resour 2023. [PMID: 36756726 DOI: 10.1111/1755-0998.13765] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/13/2022] [Revised: 01/11/2023] [Accepted: 02/03/2023] [Indexed: 02/10/2023]
Abstract
Long-read sequencing technology is a powerful approach with application in various genetic and genomic research. Herein, we developed the pipeline for long amplicon high-fidelity (HiFi) sequencing and then applied it for sequencing mitochondrial DNA (mtDNA) genomes from pools of 79 Tibetan Mastiffs. We amplified the mtDNA genome with long-range PCR using two pairs of primers. Two rounds of circular consensus sequencing (CCS) were conducted and their accuracy was evaluated. The results indicate that the second round of CCS can improve the accuracy of HiFi reads. In addition, the analysis of 79 high-quality mtDNA genomes shows the Tibetan Mastiffs from outside of the Tibetan Plateau experienced hybridization with other dogs. The high quality reads generator (HQGR) software is provided to facilitate data analyses, which is publicly accessible on GitHub (https://github.com/Caizf-script/HQGR). Our long amplicon HiFi sequencing pipeline can also be applied in various target enrichment strategies for small genomes and candidate genes.
Affiliation(s)
- Zheng-Fei Cai
- State Key Laboratory for Conservation and Utilization of Bio-resources in Yunnan, Yunnan University, Kunming, China; State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Ji-Yuan Hu
- School of Software, Yunnan University, Kunming, China
- Ting-Ting Yin
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Da Wang
- School of Software, Yunnan University, Kunming, China
- Quan-Kuan Shen
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Cheng Ma
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; University of Chinese Academy of Sciences, Beijing, China
- Ding-Qin Ou
- Department of Anesthesiology, First Affiliated Hospital of Kunming Medical University, Kunming, China
- Ming-Min Xu
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; University of Chinese Academy of Sciences, Beijing, China
- Xian Shi
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; University of Chinese Academy of Sciences, Beijing, China
- Qing-Long Li
- State Key Laboratory for Conservation and Utilization of Bio-resources in Yunnan, Yunnan University, Kunming, China; State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Ru-Nian Wu
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Lameck Ajuma
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; University of Chinese Academy of Sciences, Beijing, China
- Adeniyi C Adeola
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Ya-Ping Zhang
- State Key Laboratory for Conservation and Utilization of Bio-resources in Yunnan, Yunnan University, Kunming, China; State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; University of Chinese Academy of Sciences, Beijing, China
- Min-Sheng Peng
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China; University of Chinese Academy of Sciences, Beijing, China
4
Yuan J, Yi J, Zhan M, Xie Q, Zhen TT, Zhou J, Li Z, Li Z. The web-based multiplex PCR primer design software Ultiplex and the associated experimental workflow: up to 100-plex multiplicity. BMC Genomics 2021; 22:835. [PMID: 34794394 PMCID: PMC8600765 DOI: 10.1186/s12864-021-08149-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/05/2021] [Accepted: 11/04/2021] [Indexed: 12/22/2022] Open
Abstract
Background: A large number of variants have been employed in various medical applications, such as providing medication instructions, disease susceptibility testing, paternity testing, and tumour diagnosis. A high multiplicity PCR will outperform other technologies because of its lower cost, reaction time and sample consumption. To conduct a multiplex PCR with higher than 100-plex multiplicity, primers need to be carefully designed to avoid the formation of secondary structures and nonspecific amplification between primers, templates and products. Thus, a user-friendly, highly automated and highly user-defined web-based multiplex PCR primer design software is needed to minimize the work of primer design and experimental verification. Results: Ultiplex was developed as a free online multiplex primer design tool with a user-friendly web-based interface (http://ultiplex.igenebook.cn). To evaluate the performance of Ultiplex, 294 out of 295 (99.7%) target primers were successfully designed. A total of 275 targets produced qualified primers after primer filtration, and 271 of those targets were successfully clustered into one compatible PCR group and could be covered by 108 primers. The designed primer group stably detected the rs28934573(C > T) mutation at lower than a 0.25% mutation rate in a series of samples with different ratios of HCT-15 and HaCaT cell line DNA. Conclusion: Ultiplex is a web-based multiplex PCR primer tool that has several functions, including batch design and compatibility checking for the exclusion of mutual secondary structures and mutual false alignments across the whole genome. It offers flexible arguments for users to define their own references, primer Tm values, product lengths, plex numbers and tag oligos. With its user-friendly reports and web-based interface, Ultiplex will provide assistance for biological applications and research involving genomic variants.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-08149-1.
Affiliation(s)
- Jie Yuan
- General Surgery Center, Zhujiang Hospital, Southern Medical University, 253 Industrial Boulevard, Guangzhou, 510280, Guangdong Province, China; Department of General Surgery, Foshan Fosun Chancheng Hospital, Foshan, 528010, China
- Ji Yi
- Medical Department, Wuhan Igenebook Biotechnology co., Ltd, Floor 3, building 1, Zone B, Gaonong Biological Park, 888 Gaoxin Avenue, Wuhan, 430014, Hubei Province, China
- Meixiao Zhan
- Zhuhai Interventional Medical Center, Zhuhai City People's Hospital, Zhuhai, 519000, China
- Qingqing Xie
- Medical Department, Wuhan Igenebook Biotechnology co., Ltd, Floor 3, building 1, Zone B, Gaonong Biological Park, 888 Gaoxin Avenue, Wuhan, 430014, Hubei Province, China
- Ting Ting Zhen
- Medical Department, Wuhan Igenebook Biotechnology co., Ltd, Floor 3, building 1, Zone B, Gaonong Biological Park, 888 Gaoxin Avenue, Wuhan, 430014, Hubei Province, China
- Jian Zhou
- Medical Department, Wuhan Igenebook Biotechnology co., Ltd, Floor 3, building 1, Zone B, Gaonong Biological Park, 888 Gaoxin Avenue, Wuhan, 430014, Hubei Province, China
- Zeqing Li
- Medical Department, Wuhan Igenebook Biotechnology co., Ltd, Floor 3, building 1, Zone B, Gaonong Biological Park, 888 Gaoxin Avenue, Wuhan, 430014, Hubei Province, China; College of Landscape Architecture, Central South University of Forestry and Technology, Changsha, 410004, China
- Zhou Li
- General Surgery Center, Zhujiang Hospital, Southern Medical University, 253 Industrial Boulevard, Guangzhou, 510280, Guangdong Province, China
5
Stelzer CP, Blommaert J, Waldvogel AM, Pichler M, Hecox-Lea B, Mark Welch DB. Comparative analysis reveals within-population genome size variation in a rotifer is driven by large genomic elements with highly abundant satellite DNA repeat elements. BMC Biol 2021; 19:206. [PMID: 34530817 PMCID: PMC8447722 DOI: 10.1186/s12915-021-01134-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/07/2021] [Accepted: 08/27/2021] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND: Eukaryotic genomes are known to display an enormous variation in size, but the evolutionary causes of this phenomenon are still poorly understood. To obtain mechanistic insights into such variation, previous studies have often employed comparative genomics approaches involving closely related species or geographically isolated populations within a species. Genome comparisons among individuals of the same population remained so far understudied, despite their great potential in providing a microevolutionary perspective to genome size evolution. The rotifer Brachionus asplanchnoidis represents one of the most extreme cases of within-population genome size variation among eukaryotes, displaying almost twofold variation within a geographic population. RESULTS: Here, we used a whole-genome sequencing approach to identify the underlying DNA sequence differences by assembling a high-quality reference genome draft for one individual of the population and aligning short reads of 15 individuals from the same geographic population including the reference individual. We identified several large, contiguous copy number variable regions (CNVs), up to megabases in size, which exhibited striking coverage differences among individuals, and whose coverage overall scaled with genome size. CNVs were of remarkably low complexity, being mainly composed of tandemly repeated satellite DNA with only a few interspersed genes or other sequences, and were characterized by a significantly elevated GC-content. CNV patterns in offspring of two parents with divergent genome size and CNV patterns in several individuals from an inbred line differing in genome size demonstrated inheritance and accumulation of CNVs across generations.
CONCLUSIONS: By identifying the exact genomic elements that cause within-population genome size variation, our study paves the way for studying genome size evolution in contemporary populations rather than inferring patterns and processes a posteriori from species comparisons.
Affiliation(s)
- C P Stelzer
- Research Department for Limnology, University of Innsbruck, Mondsee, Austria
- J Blommaert
- Research Department for Limnology, University of Innsbruck, Mondsee, Austria
- Department of Organismal Biology, Uppsala University, Uppsala, Sweden
- A M Waldvogel
- Institute of Zoology, University of Cologne, Cologne, Germany
- M Pichler
- Research Department for Limnology, University of Innsbruck, Mondsee, Austria
- B Hecox-Lea
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA, USA
- D B Mark Welch
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA, USA
6
Jayaraman P, Mosbruger T, Hu T, Tairis NG, Wu C, Clark PM, D’Arcy M, Ferriola D, Mackiewicz K, Gai X, Monos D, Sarmady M. AnthOligo: automating the design of oligonucleotides for capture/enrichment technologies. Bioinformatics 2020; 36:4353-4356. [PMID: 32484858 PMCID: PMC7520035 DOI: 10.1093/bioinformatics/btaa552] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 12/12/2019] [Revised: 05/22/2020] [Accepted: 05/28/2020] [Indexed: 11/13/2022] Open
Abstract
SUMMARY: A number of methods have been devised to address the need for targeted genomic resequencing. One of these methods, region-specific extraction (RSE), is characterized by the capture of long DNA fragments (15-20 kb) by magnetic beads, after enzymatic extension of oligonucleotides hybridized to selected genomic regions. Facilitating the selection of the most appropriate capture oligos for targeting a region of interest, satisfying the properties of temperature (Tm) and entropy (ΔG), while minimizing the formation of primer-dimers in a pooled experiment, is therefore necessary. Manual design and selection of oligos becomes very challenging, complicated by factors such as length of the target region and number of targeted regions. Here we describe AnthOligo, a web-based application developed to optimally automate the process of generation of oligo sequences used to target and capture the continuum of large and complex genomic regions. Apart from generating oligos for RSE, this program may have wider applications in the design of customizable internal oligos to be used as baits for gene panel analysis or even probes for large-scale comparative genomic hybridization array processes. AnthOligo was tested by capturing the Major Histocompatibility Complex (MHC) of a random sample. The application provides users with a simple interface to upload an input file in BED format and customize parameters for each task. The task of probe design in AnthOligo commences when a user uploads an input file and concludes with the generation of a result-set containing an optimal set of region-specific oligos. AnthOligo is currently available as a public web application with URL: http://antholigo.chop.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Taishan Hu
- Department of Pathology and Laboratory Medicine
- Chao Wu
- Department of Biomedical Health & Informatics, The Children's Hospitals of Philadelphia, Philadelphia, PA, USA
- Peter M Clark
- Department of Research & Development, The Janssen Pharmaceutical Companies of Johnson & Johnson, Raritan, NJ, USA
- Monica D’Arcy
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
- Katarzyna Mackiewicz
- Department of Biochemistry and Molecular Biology, Medical University of South Carolina, Charleston, SC, USA
- Xiaowu Gai
- Department of Pathology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Department of Pathology and Laboratory Medicine, Center for Personalized Medicine, Children’s Hospital of Los Angeles, Los Angeles, CA, USA
- Dimitrios Monos
- Department of Pathology and Laboratory Medicine
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Mahdi Sarmady
- Department of Pathology and Laboratory Medicine
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
7
Fichou Y, Berlivet I, Richard G, Tournamille C, Castilho L, Férec C. Defining Blood Group Gene Reference Alleles by Long-Read Sequencing: Proof of Concept in the ACKR1 Gene Encoding the Duffy Antigens. Transfus Med Hemother 2019; 47:23-32. [PMID: 32110191 DOI: 10.1159/000504584] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 07/31/2019] [Accepted: 11/01/2019] [Indexed: 01/31/2023] Open
Abstract
Background: In the novel era of blood group genomics, (re-)defining reference gene/allele sequences of blood group genes has become an important goal to achieve, both for diagnostic and research purposes. As novel potent sequencing technologies are available, we sought to investigate the variability encountered in the three most common alleles of ACKR1, the gene encoding the clinically relevant Duffy antigens, at the haplotype level by a long-read sequencing approach. Materials and Methods: After long-range PCR amplification spanning the whole ACKR1 gene locus (∼2.5 kilobases), amplicons generated from 81 samples with known genotypes were sequenced in a single read by using the Pacific Biosciences (PacBio) single molecule, real-time (SMRT) sequencing technology. Results: High-quality sequencing reads were obtained for the 162 alleles (accuracy >0.999). Twenty-two nucleotide variations reported in databases were identified, defining 19 haplotypes: four, eight, and seven haplotypes in 46 ACKR1*01, 63 ACKR1*02, and 53 ACKR1*02N.01 alleles, respectively. Discussion: Overall, we have defined a subset of reference alleles by third-generation (long-read) sequencing. This technology, which provides a "longitudinal" overview of the loci of interest (several thousand base pairs) and is complementary to the second-generation (short-read) next-generation sequencing technology, is of critical interest for resolving novel, rare, and null alleles.
Affiliation(s)
- Yann Fichou
- EFS, Inserm, Univ Brest, UMR 1078, GGB, Brest, France; Laboratoire d'Excellence GR-Ex, Paris, France
- Christophe Tournamille
- Laboratoire d'Excellence GR-Ex, Paris, France; IMRB-Inserm U955 Equipe 2 Transfusion et Maladies du Globule Rouge, EFS Ile-de-France, Créteil, France
- Claude Férec
- EFS, Inserm, Univ Brest, UMR 1078, GGB, Brest, France; Laboratoire de Génétique Moléculaire et d'Histocompatibilité, CHU Morvan, Brest, France
8
Sari E, Cabral AL, Polley B, Tan Y, Hsueh E, Konkin DJ, Knox RE, Ruan Y, Fobert PR. Weighted gene co-expression network analysis unveils gene networks associated with the Fusarium head blight resistance in tetraploid wheat. BMC Genomics 2019; 20:925. [PMID: 31795948 PMCID: PMC6891979 DOI: 10.1186/s12864-019-6161-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/20/2019] [Accepted: 10/09/2019] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND: Fusarium head blight (FHB) resistance in the durum wheat breeding gene pool is rarely reported. Triticum turgidum ssp. carthlicum line Blackbird is a tetraploid relative of durum wheat that offers partial FHB resistance. Resistance QTL were identified for the durum wheat cv. Strongfield × Blackbird population on chromosomes 1A, 2A, 2B, 3A, 6A, 6B and 7B in a previous study. The objective of this study was to identify the defense mechanisms underlying the resistance of Blackbird and report candidate regulator defense genes and single nucleotide polymorphism (SNP) markers within these genes for high-resolution mapping of resistance QTL reported for the durum wheat cv. Strongfield/Blackbird population. RESULTS: Gene network analysis identified five networks significantly (P < 0.05) associated with the resistance to FHB spread (Type II FHB resistance), one of which showed significant correlation with both plant height and relative maturity traits. Two gene networks showed subtle differences between Fusarium graminearum-inoculated and mock-inoculated plants, supporting their involvement in constitutive defense. The candidate regulator genes have been implicated in various layers of plant defense including pathogen recognition (mainly Nucleotide-binding Leucine-rich Repeat proteins), signaling pathways including the abscisic acid and mitogen activated protein (MAP) kinase, and downstream defense genes activation including transcription factors (mostly with dual roles in defense and development), and cell death regulator and cell wall reinforcement genes. The expression of five candidate genes measured by quantitative real-time PCR was correlated with that of RNA-seq, corroborating the technical and analytical accuracy of RNA-sequencing. CONCLUSIONS: Gene network analysis allowed identification of candidate regulator genes and genes associated with constitutive resistance, those that will not be detected using traditional differential expression analysis.
This study also shed light on the association of developmental traits with FHB resistance and partially explained the co-localization of FHB resistance with plant height and maturity QTL reported in several previous studies. It also allowed the identification of candidate hub genes within the interval of three previously reported FHB resistance QTL for the Strongfield/Blackbird population and associated SNPs for future high resolution mapping studies.
Affiliation(s)
- Ehsan Sari
- Aquatic and Crop Resource Development Centre, National Research Council Canada, Saskatoon, SK, Canada
- Adrian L Cabral
- Aquatic and Crop Resource Development Centre, National Research Council Canada, Saskatoon, SK, Canada
- Brittany Polley
- Aquatic and Crop Resource Development Centre, National Research Council Canada, Saskatoon, SK, Canada
- Yifang Tan
- Aquatic and Crop Resource Development Centre, National Research Council Canada, Saskatoon, SK, Canada
- Emma Hsueh
- Aquatic and Crop Resource Development Centre, National Research Council Canada, Saskatoon, SK, Canada
- David J Konkin
- Aquatic and Crop Resource Development Centre, National Research Council Canada, Saskatoon, SK, Canada
- Ron E Knox
- Swift Current Research and Development Centre, Agriculture and Agri-Food Canada, Swift Current, SK, Canada
- Yuefeng Ruan
- Swift Current Research and Development Centre, Agriculture and Agri-Food Canada, Swift Current, SK, Canada
- Pierre R Fobert
- Aquatic and Crop Resource Development Centre, National Research Council Canada, Saskatoon, SK, Canada
9
Hendling M, Barišić I. In-silico Design of DNA Oligonucleotides: Challenges and Approaches. Comput Struct Biotechnol J 2019; 17:1056-1065. [PMID: 31452858 PMCID: PMC6700205 DOI: 10.1016/j.csbj.2019.07.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 02/28/2019] [Revised: 07/18/2019] [Accepted: 07/23/2019] [Indexed: 11/13/2022] Open
Abstract
DNA oligonucleotides are essential components of a high number of technologies in molecular biology. The key event of each oligonucleotide-based assay is the specific binding between oligonucleotides and their target DNA. However, single-stranded DNA molecules also tend to bind to unintended targets or themselves. The probability of such unspecific binding increases with the complexity of an assay. Therefore, accurate data management and design workflows are necessary to optimize the in-silico design of primers and probes. Important considerations concerning computational infrastructure and run time need to be made for both data management and the design process. Data retrieval, data updates, storage, filtering and analysis are the main parts of a sequence data management system. Each part needs to be well-implemented as the resulting sequences form the basis for the oligonucleotide design. Important key features, such as the oligonucleotide length, melting temperature, secondary structures and primer dimer formation, as well as the specificity, should be considered for the in-silico selection of oligonucleotides. The development of an efficient oligonucleotide design workflow demands the right balance between the precision of the applied computer models, the general expenditure of time, and computational workload. This paper gives an overview of important parameters during the design process, starting from the data retrieval, up to the design parameters for optimized oligonucleotide design.
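Some of the key features named in this abstract (oligonucleotide length, melting temperature, primer dimer formation) can be sketched in a few lines. The example below is only an illustration and is not taken from the paper: it uses the Wallace rule as a rough Tm estimate for short oligos, and the function names and the 4-base 3'-end window are assumptions. Production tools typically use nearest-neighbor thermodynamics instead.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def gc_content(oligo: str) -> float:
    """Fraction of G and C bases in the oligo."""
    oligo = oligo.upper()
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def wallace_tm(oligo: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def three_prime_can_anneal(primer_a: str, primer_b: str, window: int = 4) -> bool:
    """Crude primer-dimer flag (assumption: perfect pairing over `window` bases).

    True when the last `window` bases of primer_a are perfectly complementary
    to some stretch of primer_b, which would let a polymerase extend primer_a.
    """
    tail = primer_a.upper()[-window:]
    return reverse_complement(tail) in primer_b.upper()


example_flag = three_prime_can_anneal("AAAAGGCC", "TTTTGGCC")
```

Checking every ordered pair of primers with a function like `three_prime_can_anneal` grows quadratically with the number of primers, which is one reason the abstract stresses balancing model precision against run time.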
Affiliation(s)
- Michaela Hendling
- Austrian Institute of Technology GmbH, Center for Health & Bioresources, Molecular Diagnostics, Giefinggasse 4, 1210 Vienna, Austria
10
Francis F, Dumas MD, Davis SB, Wisser RJ. Clustering of circular consensus sequences: accurate error correction and assembly of single molecule real-time reads from multiplexed amplicon libraries. BMC Bioinformatics 2018; 19:302. [PMID: 30126356 PMCID: PMC6102811 DOI: 10.1186/s12859-018-2293-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/19/2017] [Accepted: 07/20/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Targeted resequencing with high-throughput sequencing (HTS) platforms can be used to efficiently interrogate the genomes of large numbers of individuals. A critical issue for research and applications using HTS data, especially from long-read platforms, is error in base calling arising from technological limits and bioinformatic algorithms. We found that the community standard long amplicon analysis (LAA) module from Pacific Biosciences is prone to substantial bioinformatic errors that raise concerns about findings based on this pipeline, prompting the need for a new method. RESULTS: A single molecule real-time (SMRT) sequencing-error correction and assembly pipeline, C3S-LAA, was developed for libraries of pooled amplicons. By uniquely leveraging the structure of SMRT sequence data (comprised of multiple low quality subreads from which higher quality circular consensus sequences are formed) to cluster raw reads, C3S-LAA produced accurate consensus sequences and assemblies of overlapping amplicons from single sample and multiplexed libraries. In contrast, despite read depths in excess of 100X per amplicon, the standard long amplicon analysis module from Pacific Biosciences generated unexpected numbers of amplicon sequences with substantial inaccuracies in the consensus sequences. A bootstrap analysis showed that the C3S-LAA pipeline per se was effective at removing bioinformatic sources of error, but in rare cases a read depth of nearly 400X was not sufficient to overcome minor but systematic errors inherent to amplification or sequencing. CONCLUSIONS: C3S-LAA uses a divide and conquer processing algorithm for SMRT amplicon-sequence data that generates accurate consensus sequences and local sequence assemblies. Solving the confounding bioinformatic source of error in LAA allowed for the identification of limited instances of errors due to DNA amplification or sequencing of homopolymeric nucleotide tracts.
For research and development in genomics, C3S-LAA allows meaningful conclusions and biological inferences to be made from accurately polished sequence output.
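The idea that a higher-quality consensus can be formed from several lower-quality reads of the same molecule can be illustrated with a column-wise majority vote. This is a deliberately simplified sketch and not the C3S-LAA or PacBio algorithm: it assumes the reads are already aligned and of equal length, and the function name is hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads: list) -> str:
    """Call each position as the most common base across pre-aligned reads.

    Assumption: all reads have the same length and are already aligned, so
    position i in every read refers to the same template base.
    """
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_reads)
    )


# Four noisy reads of the same molecule; each error appears in only one read.
example_consensus = majority_consensus(["ACGT", "ACGA", "ACGT", "TCGT"])
```

Random errors are outvoted as depth increases, but an error that recurs at the same position in most reads is not, which is consistent with the abstract's observation that even very high read depth did not remove minor systematic errors.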
Affiliation(s)
- Felix Francis
- Department of Plant and Soil Sciences, University of Delaware, Newark, Delaware, 19716, USA; Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware, 19714, USA
- Michael D Dumas
- Department of Plant and Soil Sciences, University of Delaware, Newark, Delaware, 19716, USA
- Scott B Davis
- Department of Plant and Soil Sciences, University of Delaware, Newark, Delaware, 19716, USA
- Randall J Wisser
- Department of Plant and Soil Sciences, University of Delaware, Newark, Delaware, 19716, USA; Center for Bioinformatics and Computational Biology, University of Delaware, Newark, Delaware, 19714, USA