1
Węglewska M, Barylski J, Wojnarowski F, Nowicki G, Łukaszewicz M. Genome, biology and stability of the Thurquoise phage – A new virus from the Bastillevirinae subfamily. Front Microbiol 2023; 14:1120147. [PMID: 36998400 PMCID: PMC10043171 DOI: 10.3389/fmicb.2023.1120147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Accepted: 02/17/2023] [Indexed: 03/18/2023] Open
Abstract
Bacteriophages from the Bastillevirinae subfamily (Herelleviridae family) have proven to be effective against bacteria from the Bacillus genus, including organisms from the B. cereus group, which cause food poisoning and persistent contamination of industrial installations. However, successful application of these phages in biocontrol depends on an understanding of their biology and stability in different environments. In this study, we isolated a novel virus from garden soil in Wrocław (Poland) and named it ‘Thurquoise’. The genome of the phage was sequenced and assembled into a single continuous contig with 226 predicted protein-coding genes and 18 tRNAs. Cryo-electron microscopy revealed that Thurquoise has a complex virion structure typical of the Bastillevirinae subfamily. Confirmed hosts include selected bacteria from the Bacillus cereus group, specifically B. thuringiensis (isolation host) and B. mycoides, but susceptible strains display different efficiencies of plating (EOP). The eclipse and latent periods of Thurquoise in the isolation host last ~50 min and ~70 min, respectively. The phage remains viable for more than 8 weeks in variants of the SM buffer with magnesium, calcium, caesium, manganese or potassium and can withstand numerous freeze–thaw cycles if protected by the addition of 15% glycerol or, to a lesser extent, 2% gelatine. Thus, with proper buffer formulation, this virus can be safely stored in common freezers and refrigerators for a considerable time. The Thurquoise phage is the exemplar of a new candidate species within the Caeruleovirus genus in the Bastillevirinae subfamily of the Herelleviridae family, with a genome, morphology and biology typical of these taxa.
Affiliation(s)
- Martyna Węglewska
  - Department of Molecular Virology, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Jakub Barylski
  - Department of Molecular Virology, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
  - Correspondence: Jakub Barylski
- Filip Wojnarowski
  - Department of Molecular Virology, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Marcin Łukaszewicz
  - Department of Biotransformation, Faculty of Biotechnology, University of Wrocław, Wrocław, Poland
2
Yu J, Jiang W, Zhu SB, Liao Z, Dou X, Liu J, Guo FB, Dong C. Prediction of protein-coding small ORFs in multi-species using integrated sequence-derived features and the random forest model. Methods 2023; 210:10-19. [PMID: 36621557 DOI: 10.1016/j.ymeth.2022.12.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 12/27/2022] [Accepted: 12/30/2022] [Indexed: 01/07/2023] Open
Abstract
Proteins encoded by small open reading frames (sORFs) can serve as functional elements that play important roles in vivo. Such sORFs also constitute a potential pool for de novo gene birth, driving evolutionary innovation and species diversity. Therefore, their theoretical and experimental identification has become a critical issue. Herein, we propose a prediction method for protein-coding sORFs based solely on integrated sequence-derived features. Its prediction performance is better than or comparable to that of nine other prevalent methods, which shows that our method can provide a relatively reliable research tool for the prediction of protein-coding sORFs. Our method allows users to estimate the potential expression of a queried sORF, as demonstrated by a correlation analysis between our probability estimate and the codon adaptation index (CAI). Based on the features that we used, we demonstrated that the sequence features of protein-coding sORFs in the two domains differ significantly, implying that cross-domain prediction might be a relatively hard task. Hence, domain-specific models were developed, which allow users to predict protein-coding sORFs in both eukaryotes and prokaryotes. Finally, a web server was developed to facilitate studies in the related field; it is freely available at http://guolab.whu.edu.cn/codingCapacity/index.html.
Affiliation(s)
- Jiafeng Yu
  - Shandong Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Wenwen Jiang
  - Department of Bioinformatics, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
- Sen-Bin Zhu
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Zhen Liao
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Xianghua Dou
  - Shandong Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Jian Liu
  - Shandong Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Feng-Biao Guo
  - School of Pharmaceutical Sciences, Wuhan University, Wuhan 430071, China
- Chuan Dong
  - School of Pharmaceutical Sciences, Wuhan University, Wuhan 430071, China
3
Khan T, Raza S. Exploration of Computational Aids for Effective Drug Designing and Management of Viral Diseases: A Comprehensive Review. Curr Top Med Chem 2023; 23:1640-1663. [PMID: 36725827 DOI: 10.2174/1568026623666230201144522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Revised: 11/14/2022] [Accepted: 12/19/2022] [Indexed: 02/03/2023]
Abstract
BACKGROUND: Microbial diseases, specifically those originating from viruses, are the major cause of human mortality all over the world. The current COVID-19 pandemic is a case in point: the dynamics of the viral-human interactions are still not completely understood, making treatment a case of trial and error. Scientists have been struggling for over a year to devise a strategy to contain the pandemic, which brings to light the lack of understanding of how the virus grows and multiplies in the human body. METHODS: This paper presents the perspective of the authors on the applicability of computational tools for deep learning and understanding of host-microbe interaction, disease progression and management, drug resistance and immune modulation through in silico methodologies that can aid effective and selective drug development. The paper summarizes advances from the last five years. Studies published and indexed in leading databases were included in the review. RESULTS: Computational systems biology works at the interface of biology and mathematics and aims to unravel the complex mechanisms between biological systems and the inter- and intra-species dynamics, using computational tools and high-throughput technologies built on algorithms, networks and complex connections to simulate cellular biological processes. CONCLUSION: Computational strategies and modelling integrate and prioritize microbial-host interactions and may predict the conditions in which the fine-tuning attenuates. These microbial-host interactions and working mechanisms are important for effective drug design and for fine-tuning therapeutic interventions.
Affiliation(s)
- Tahmeena Khan
  - Department of Chemistry, Integral University, Lucknow, 226026, U.P., India
- Saman Raza
  - Department of Chemistry, Isabella Thoburn College, Lucknow, 226007, U.P., India
4
Abdelsattar AS, Dawoud A, Makky S, Nofal R, Aziz RK, El-Shibiny A. Bacteriophages: from isolation to application. Curr Pharm Biotechnol 2021; 23:337-360. [PMID: 33902418 DOI: 10.2174/1389201022666210426092002] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/24/2020] [Revised: 01/29/2021] [Accepted: 03/11/2021] [Indexed: 11/22/2022]
Abstract
Bacteriophages are considered as a potential alternative to fight pathogenic bacteria during the antibiotic resistance era. With their high specificity, they are being widely used in various applications: medicine, food industry, agriculture, animal farms, biotechnology, diagnosis, etc. Many techniques have been designed by different researchers for phage isolation, purification, and amplification, each of which has strengths and weaknesses. However, all aim at having a reasonably pure phage sample that can be further characterized. Phages can be characterized based on their physiological, morphological or inactivation tests. Microscopy, in particular, has opened a wide gate not only for visualizing phage morphological structure, but also for monitoring biochemistry and behavior. Meanwhile, computational analysis of phage genomes provides more details about phage history, lifestyle, and potential for toxigenic or lysogenic conversion, which translate to safety in biocontrol and phage therapy applications. This review summarizes phage application pipelines at different levels and addresses specific restrictions and knowledge gaps in the field. Recently developed computational approaches, which are used in phage genome analysis, are critically assessed. We hope that this assessment provides researchers with useful insights for selection of suitable approaches for Phage-related research aims and applications.
Affiliation(s)
- Abdallah S Abdelsattar
  - Center for Microbiology and Phage Therapy, Zewail City of Science and Technology, October Gardens, 6th of October City, Giza, 12578, Egypt
- Alyaa Dawoud
  - Center for Microbiology and Phage Therapy, Zewail City of Science and Technology, October Gardens, 6th of October City, Giza, 12578, Egypt
- Salsabil Makky
  - Center for Microbiology and Phage Therapy, Zewail City of Science and Technology, October Gardens, 6th of October City, Giza, 12578, Egypt
- Rana Nofal
  - Center for Microbiology and Phage Therapy, Zewail City of Science and Technology, October Gardens, 6th of October City, Giza, 12578, Egypt
- Ramy K Aziz
  - Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Qasr El-Ainy St, Cairo, Egypt
- Ayman El-Shibiny
  - Center for Microbiology and Phage Therapy, Zewail City of Science and Technology, October Gardens, 6th of October City, Giza, 12578, Egypt
5
Dong Y, Li C, Kim K, Cui L, Liu X. Genome annotation of disease-causing microorganisms. Brief Bioinform 2021; 22:845-854. [PMID: 33537706 PMCID: PMC7986607 DOI: 10.1093/bib/bbab004] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/21/2020] [Revised: 12/23/2020] [Accepted: 01/03/2021] [Indexed: 11/13/2022] Open
Abstract
Humans have coexisted with pathogenic microorganisms throughout their evolutionary history, and the exploration of these microorganisms has never halted. With the improvement of genome-sequencing technology and the continuous reduction of sequencing costs, an increasing number of complete genome sequences of pathogenic microorganisms have become available. Genome annotation of this massive body of sequence information has become a daunting task in biological research. This paper summarizes the approaches to the genome annotation of pathogenic microorganisms and the popular genome annotation tools available for prokaryotes, eukaryotes and viruses. Furthermore, real-world comparisons of different annotation tools were conducted using 12 genomes from prokaryotes, eukaryotes and viruses. Current challenges and problems are also discussed.
Affiliation(s)
- Yibo Dong
  - College of Public Health, University of South Florida, Tampa, FL, USA
- Chang Li
  - College of Public Health, University of South Florida, Tampa, FL, USA
- Kami Kim
  - Division of Infectious Disease and International Medicine, Department of Internal Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Liwang Cui
  - Division of Infectious Disease and International Medicine, Department of Internal Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Xiaoming Liu
  - College of Public Health, University of South Florida, Tampa, FL, USA
6
Mahmoudieh M, Noor MRM, Harikrishna JA, Othman RY. Identification and characterization of Ageratum yellow vein Malaysia virus (AYVMV) and an associated betasatellite among begomoviruses infecting Solanum lycopersicum in Malaysia. J Appl Genet 2020; 61:619-628. [PMID: 32808206 DOI: 10.1007/s13353-020-00574-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/01/2020] [Revised: 06/06/2020] [Accepted: 08/11/2020] [Indexed: 11/27/2022]
Abstract
The study describes the results of a survey of tomato fields in different regions of Peninsular Malaysia for the presence of begomoviruses. An ORF-based (C2 and C3) study was performed to determine the distribution of begomoviruses associated with a severe leaf curl disease in tomato-growing areas of Peninsular Malaysia. Viral DNA was isolated from symptomatic tomato plants, and begomovirus association was confirmed by PCR using DNA-A degenerate primers. The C2 and C3 sequences of the putative begomoviruses were similar to the two corresponding ORFs of geographically separated begomovirus strains: Pepper yellow leaf curl Indonesia virus and Tomato yellow leaf curl Kanchanaburi virus. The present study also identified a unique isolate, Ageratum yellow vein Malaysia virus (AYVMV), in the above-mentioned survey. It has a single-stranded DNA component and an associated betasatellite. The single-stranded DNA component consists of 2750 nt with six open reading frames and an organization resembling that of monopartite geminiviruses. The full-length genome of the single-stranded DNA component, obtained using next-generation sequencing (NGS), showed the highest sequence identity (99%) with Ageratum yellow vein virus (AYVV-BA). The betasatellite genome obtained by NGS has 1342 nt and showed the highest sequence identity (91%) with the Pepper yellow leaf curl betasatellite. Following ICTV guidelines, Ageratum yellow vein Malaysia virus was assigned the abbreviation AYVMV, with sequence and phylogenetic analysis indicating that it might have evolved by recombination of two or more viral ancestors.
Affiliation(s)
- Mohtaram Mahmoudieh
  - Centre for Research in Biotechnology for Agriculture and Institute of Biological Science, Faculty of Science, University of Malaya, 50603, Kuala Lumpur, Malaysia
- Mohamad Roff Mohd Noor
  - Horticulture Research Centre, MARDI Headquarters, P.O.Box 12301, GPO, 50774, Kuala Lumpur, Malaysia
- Jennifer Ann Harikrishna
  - Centre for Research in Biotechnology for Agriculture and Institute of Biological Science, Faculty of Science, University of Malaya, 50603, Kuala Lumpur, Malaysia
- Rofina Yasmin Othman
  - Centre for Research in Biotechnology for Agriculture and Institute of Biological Science, Faculty of Science, University of Malaya, 50603, Kuala Lumpur, Malaysia
7
Zhou L, Yu H, Wang K, Chen T, Ma Y, Huang Y, Li J, Liu L, Li Y, Kong Z, Zheng Q, Wang Y, Gu Y, Xia N, Li S. Genome re-sequencing and reannotation of the Escherichia coli ER2566 strain and transcriptome sequencing under overexpression conditions. BMC Genomics 2020; 21:407. [PMID: 32546194 PMCID: PMC7296898 DOI: 10.1186/s12864-020-06818-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/09/2020] [Accepted: 06/10/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: The Escherichia coli ER2566 strain (NC_CP014268.2) was developed as a BL21 (DE3) derivative strain and has been widely used in recombinant protein expression. However, like many other current RefSeq annotations, the annotation of the ER2566 strain was incomplete, with missing gene names and miscellaneous RNAs, as well as uncorrected annotations of some pseudogenes. Here, we performed a systematic reannotation of the ER2566 genome by combining multiple annotation tools with manual revision to provide a comprehensive understanding of the E. coli ER2566 strain, and used high-throughput sequencing to explore how the strain adapted under external pressure. RESULTS: The reannotation included noteworthy corrections to all protein-coding genes, led to the exclusion of 190 hypothetical genes or pseudogenes, and resulted in the addition of 237 coding sequences, 230 miscellaneous noncoding RNAs and 2 tRNAs. In addition, we manually examined all 194 pseudogenes in the RefSeq annotation and directly identified 123 (63%) as coding genes. We then used whole-genome sequencing and high-throughput RNA sequencing to assess mutational adaptations under consecutive subculture or overexpression burden. Whereas no mutations were detected in response to consecutive subculture, overexpression of the human papillomavirus 16 type capsid led to the identification of a mutation (position 1,094,824 within the 3' non-coding region) positioned 19 bp away from the lacI gene in the transcribed RNA, which was not detected at the genomic level by Sanger sequencing. CONCLUSION: The ER2566 strain is used by both the general scientific community and the biotechnology industry. Reannotation of the E. coli ER2566 strain not only improved the RefSeq data but also uncovered a key site that might be involved in the transcription and translation of genes encoding the lactose operon repressor. We propose that our pipeline might offer a universal method for the reannotation of other bacterial genomes with high speed and accuracy. This study might facilitate a better understanding of gene function in the ER2566 strain under external burden and provides more clues for engineering bacteria for biotechnological applications.
Affiliation(s)
- Lizhi Zhou
  - State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, School of Public Health, Xiamen University, Xiamen, 361102, Fujian, China
- Hai Yu
  - State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, School of Public Health, Xiamen University, Xiamen, 361102, Fujian, China
- Kaihang Wang
  - National Institute of Diagnostics and Vaccine Development in Infectious Disease, School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Tingting Chen
  - National Institute of Diagnostics and Vaccine Development in Infectious Disease, School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Yue Ma
  - National Institute of Diagnostics and Vaccine Development in Infectious Disease, School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Yang Huang
  - National Institute of Diagnostics and Vaccine Development in Infectious Disease, School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Jiajia Li
  - National Institute of Diagnostics and Vaccine Development in Infectious Disease, School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Liqin Liu
  - National Institute of Diagnostics and Vaccine Development in Infectious Disease, School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Yuqian Li
  - National Institute of Diagnostics and Vaccine Development in Infectious Disease, School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Zhibo Kong
  - National Institute of Diagnostics and Vaccine Development in Infectious Disease, School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Qingbing Zheng
  - State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, School of Public Health, Xiamen University, Xiamen, 361102, Fujian, China
- Yingbin Wang
  - State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, School of Public Health, Xiamen University, Xiamen, 361102, Fujian, China
- Ying Gu
  - State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, School of Public Health, Xiamen University, Xiamen, 361102, Fujian, China
  - National Institute of Diagnostics and Vaccine Development in Infectious Disease, School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Ningshao Xia
  - State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, School of Public Health, Xiamen University, Xiamen, 361102, Fujian, China
  - National Institute of Diagnostics and Vaccine Development in Infectious Disease, School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Shaowei Li
  - State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, School of Public Health, Xiamen University, Xiamen, 361102, Fujian, China
  - National Institute of Diagnostics and Vaccine Development in Infectious Disease, School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
8
Zhang KY, Gao YZ, Du MZ, Liu S, Dong C, Guo FB. Vgas: A Viral Genome Annotation System. Front Microbiol 2019; 10:184. [PMID: 30814982 PMCID: PMC6381048 DOI: 10.3389/fmicb.2019.00184] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 07/07/2018] [Accepted: 01/23/2019] [Indexed: 11/13/2022] Open
Abstract
The in-depth study of viral genomes is of great help in many respects, especially in the treatment of human diseases caused by viral infections. With the rapid accumulation of viral sequencing data, improved or alternative gene-finding systems have become necessary to process and mine these data. In this article, we present Vgas, a system combining an ab initio method and a similarity-based method to automatically find viral genes and perform gene function annotation. Vgas was compared with existing programs, such as Prodigal, GeneMarkS, and Glimmer. In tests on 5,705 virus genomes downloaded from RefSeq, Vgas achieved the highest average precision and recall (both indexes were at least 1% higher than those of the other programs); for small virus genomes (≤ 10 kb) in particular, it showed significantly improved performance (precision was 6% higher, and recall was 2% higher). Moreover, Vgas includes an annotation module that provides functional information for predicted genes based on BLASTp alignment. This characteristic may be especially useful in some cases. Combining Vgas with GeneMarkS and Prodigal gave better prediction results than any of the three individual programs, suggesting that collaborative prediction using several different software programs is an alternative for gene prediction. Vgas is freely available at http://cefg.uestc.cn/vgas/ or http://121.48.162.133/vgas/. We hope that Vgas can serve as an alternative virus gene finder to annotate new genomes or reannotate existing genomes.
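As an illustration of the two evaluation measures named in this abstract, the following minimal sketch computes precision and recall for a set of predicted genes against a reference annotation. It assumes a gene call counts as correct only when its start, end and strand all match the reference exactly; the abstract does not state the matching criterion the authors used, and the coordinates below are invented for the example rather than taken from the paper.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for two collections of gene calls.

    Assumption: a predicted gene is a true positive only if the identical
    (start, end, strand) tuple is present in the reference annotation.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical gene calls as (start, end, strand); not data from the paper.
reference = {(1, 300, "+"), (350, 800, "+"), (900, 1200, "-"), (1300, 1600, "+")}
predicted = {(1, 300, "+"), (350, 800, "+"), (950, 1200, "-")}

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case two of the three predictions match the reference, so precision is 2/3, while only two of the four reference genes are recovered, so recall is 1/2. A looser criterion (for example, matching the stop codon only) would give different numbers.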
Affiliation(s)
- Kai-Yue Zhang
  - Centre for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Yi-Zhou Gao
  - Centre for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Meng-Ze Du
  - Centre for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Shuo Liu
  - Centre for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Chuan Dong
  - Centre for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Feng-Biao Guo
  - Centre for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
9
Bioinformatics Applications in Advancing Animal Virus Research. RECENT ADVANCES IN ANIMAL VIROLOGY 2019. [PMCID: PMC7121192 DOI: 10.1007/978-981-13-9073-9_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Abstract
Viruses serve as infectious agents for all living entities. Various research groups have focused on understanding viruses in terms of their host-viral relationships, pathogenesis and immune evasion. With current scientific advances, however, the research field has widened to the ‘omics’ level, and the generation of viral sequence data has been increasing. Numerous bioinformatics tools are available that aid not only in analysing such sequence data but also in deducing useful information that can be exploited in developing preventive and therapeutic measures. This chapter elaborates on bioinformatics tools that are specifically designed for animal viruses as well as other generic tools that can be exploited to study animal viruses. The chapter further provides information on tools that can be used for viral epidemiology, phylogenetic analysis, structural modelling of proteins, epitope recognition and open reading frame (ORF) recognition, and on tools for analysing host-viral interactions, gene prediction in the viral genome, etc. Various databases that organize information on animal and human viruses are also described. The chapter gives an overview of the current advances, online and downloadable tools and databases in the field of bioinformatics that will enable researchers to study animal viruses at the gene level.
10
Harrison RL, Mowery JD, Bauchan GR, Theilmann DA, Erlandson MA. The complete genome sequence of a second alphabaculovirus from the true armyworm, Mythimna unipuncta: implications for baculovirus phylogeny and host specificity. Virus Genes 2018; 55:104-116. [PMID: 30430308 DOI: 10.1007/s11262-018-1615-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/30/2018] [Accepted: 11/09/2018] [Indexed: 12/19/2022]
Abstract
The Mythimna unipuncta nucleopolyhedrovirus isolate KY310 (MyunNPV-KY310) is an alphabaculovirus isolated from a true armyworm (Mythimna unipuncta) population in Kentucky, USA. Occlusion bodies of this virus were examined by electron microscopy and the genome sequence was determined by 454 pyrosequencing. MyunNPV-KY310 occlusion bodies consisted of irregular polyhedra measuring 0.8-1.8 µm in diameter and containing multiple virions, with one to six nucleocapsids per virion. The genome sequence was determined to be 156,647 bp with a nucleotide distribution of 43.9% G+C. 152 ORFs and six homologous repeat (hr) regions were annotated for the sequence, including the 38 core genes of family Baculoviridae and an additional group of 26 conserved alphabaculovirus genes. BLAST queries and phylogenetic inference confirmed that MyunNPV-KY310 is most closely related to the alphabaculovirus Leucania separata nucleopolyhedrovirus isolate AH1, which infects Mythimna separata. In contrast, MyunNPV-KY310 did not exhibit a close relationship with Mythimna unipuncta nucleopolyhedrovirus isolate #7, an alphabaculovirus from the same host species. MyunNPV-KY310 lacks the gp64 envelope glycoprotein, which is a characteristic of group II alphabaculoviruses. However, this virus and five other alphabaculoviruses lacking gp64 are placed outside the group I and group II clades in core gene phylogenies, further demonstrating that viruses of genus Alphabaculovirus do not occur in two monophyletic clades. Potential instances of MyunNPV-KY310 ORFs arising by horizontal transfer were detected. Although there are now genome sequences of four different baculoviruses from M. unipuncta, comparison of their genome sequences provides little insight into the genetic basis for their host specificity.
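The "nucleotide distribution of 43.9% G+C" reported in this abstract is the percentage of G and C bases among all called bases in the genome. A minimal sketch of that calculation follows, assuming a plain DNA string and ignoring ambiguous symbols such as N; the function name and the short sequence are illustrative only and are not taken from the MyunNPV-KY310 genome.

```python
def gc_content(sequence):
    """Return the G+C percentage of a DNA sequence.

    Assumption: only unambiguous bases (A, C, G, T) are counted, so symbols
    such as N do not affect the result.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    called = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / called if called else 0.0


# Toy sequence for illustration; a real genome would be read from a FASTA file.
toy_sequence = "ATGCGCATTA"
print(f"G+C = {gc_content(toy_sequence):.1f}%")
```

The toy sequence contains two G and two C bases out of ten, giving 40.0% G+C.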
Affiliation(s)
- Robert L Harrison
  - Invasive Insect Biocontrol and Behavior Laboratory, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, MD, 20705, USA
- Joseph D Mowery
  - Electron and Confocal Microscopy Unit, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, MD, 20705, USA
- Gary R Bauchan
  - Electron and Confocal Microscopy Unit, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, MD, 20705, USA
- David A Theilmann
  - Summerland Research and Development Centre, Agriculture and Agri-Food Canada, Summerland, BC, V0H 1Z0, Canada
- Martin A Erlandson
  - Saskatoon Research and Development Centre, Agriculture and Agri-Food Canada, Saskatoon, SK, S7N 0X2, Canada
11
Guo FB, Dong C, Hua HL, Liu S, Luo H, Zhang HW, Jin YT, Zhang KY. Accurate prediction of human essential genes using only nucleotide composition and association information. Bioinformatics 2018; 33:1758-1764. [PMID: 28158612 PMCID: PMC7110051 DOI: 10.1093/bioinformatics/btx055] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 11/22/2016] [Accepted: 01/25/2017] [Indexed: 12/20/2022] Open
Abstract
Motivation: Previously constructed classifiers for predicting eukaryotic essential genes integrated a variety of features, including experimental ones. Obtaining satisfactory predictions using only nucleotide (sequence) information would be more promising. Three groups recently identified essential genes in human cancer cell lines using wet-lab experiments, which provided an opportunity to test this idea. Here we extended the Z curve method into the λ-interval form to denote nucleotide composition and association information and used it to construct an SVM classification model. Results: Our model accurately predicted human gene essentiality with an AUC higher than 0.88 for both 5-fold cross-validation and jackknife tests. These results demonstrate that the essentiality of human genes can be reliably reflected by sequence information alone. We re-predicted the negative dataset with our Pheg server, and 118 genes were additionally predicted as essential. Among them, 20 were found to be homologues of mouse essential genes, indicating that some of the 118 genes are indeed essential but were overlooked by previous experiments. As the first available server, Pheg can predict essentiality for anonymous human gene sequences. We also hope that the λ-interval Z curve method can be effectively extended to classification problems for other DNA elements. Availability and Implementation: http://cefg.uestc.edu.cn/Pheg. Contact: fbguo@uestc.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Feng-Biao Guo: School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
- Chuan Dong: School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
- Hong-Li Hua: School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
- Shuo Liu: School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
- Hao Luo: Department of Physics, Tianjin University, Tianjin, China
- Hong-Wan Zhang: School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
- Yan-Ting Jin: School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
- Kai-Yue Zhang: School of Life Science and Technology, Center for Informational Biology and Key Laboratory for Neuro-information of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, China
|
12
|
Harrison RL, Mowery JD, Rowley DL, Bauchan GR, Theilmann DA, Rohrmann GF, Erlandson MA. The complete genome sequence of a third distinct baculovirus isolated from the true armyworm, Mythimna unipuncta, contains two copies of the lef-7 gene. Virus Genes 2017; 54:297-310. [PMID: 29204787 DOI: 10.1007/s11262-017-1525-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 09/07/2017] [Accepted: 11/21/2017] [Indexed: 10/18/2022]
Abstract
A baculovirus isolate from a USDA Forest Service collection was characterized by electron microscopy and analysis of its genome sequence. The isolate, formerly referred to as Pseudoletia (Mythimna) sp. nucleopolyhedrovirus #7 (MyspNPV#7), was determined by barcoding PCR to derive from the host species Mythimna unipuncta (true armyworm) and was renamed Mythimna unipuncta nucleopolyhedrovirus #7 (MyunNPV#7). The occlusion bodies (OBs) and virions exhibited a size and morphology typical for OBs produced by the species of genus Alphabaculovirus, with occlusion-derived virions consisting of 2-5 nucleocapsids within a single envelope. The MyunNPV#7 genome was determined to be 148,482 bp with a 48.58% G+C nucleotide distribution. A total of 159 ORFs of 150 bp or larger were annotated in the genome sequence, including the 38 core genes of family Baculoviridae. The genome contained six homologous repeat regions (hrs) consisting of multiple copies of a 34-bp imperfect palindrome. Phylogenetic inference from concatenated baculovirus core gene amino acid sequence alignments placed MyunNPV#7 with group II alphabaculoviruses isolated from other armyworm and cutworm host species of lepidopteran family Noctuidae. MyunNPV#7 could be distinguished from other viruses in this group on the basis of differences in gene content and order. Pairwise nucleotide distances suggested that MyunNPV#7 represents a distinct species in Alphabaculovirus. The MyunNPV#7 genome was found to contain two copies of the late expression factor-7 (lef-7) gene, a feature not reported for any other baculovirus genome to date. Both copies of lef-7 encoded an F-box domain, which is required for the function of LEF-7 in baculovirus DNA replication.
Affiliation(s)
- Robert L Harrison: Invasive Insect Biocontrol and Behavior Laboratory, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, MD, 20705, USA
- Joseph D Mowery: Electron and Confocal Microscopy Unit, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, MD, 20705, USA
- Daniel L Rowley: Invasive Insect Biocontrol and Behavior Laboratory, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, MD, 20705, USA
- Gary R Bauchan: Electron and Confocal Microscopy Unit, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, MD, 20705, USA
- David A Theilmann: Summerland Research and Development Centre, Agriculture and Agri-Food Canada, Summerland, BC, V0H 1Z0, Canada
- George F Rohrmann: Department of Microbiology, Oregon State University, Corvallis, OR, 97331-3804, USA
- Martin A Erlandson: Saskatoon Research and Development Centre, Agriculture and Agri-Food Canada, Saskatoon, SK, S7N 0X2, Canada
|
13
|
Harrison RL, Rowley DL, Mowery JD, Bauchan GR, Burand JP. The Operophtera brumata Nucleopolyhedrovirus (OpbuNPV) Represents an Early, Divergent Lineage within Genus Alphabaculovirus. Viruses 2017; 9:307. [PMID: 29065456 PMCID: PMC5691658 DOI: 10.3390/v9100307] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 09/18/2017] [Revised: 10/12/2017] [Accepted: 10/17/2017] [Indexed: 12/16/2022]
Abstract
Operophtera brumata nucleopolyhedrovirus (OpbuNPV) infects the larvae of the winter moth, Operophtera brumata. As part of an effort to explore the pesticidal potential of OpbuNPV, an isolate of this virus from Massachusetts (USA)-OpbuNPV-MA-was characterized by electron microscopy of OpbuNPV occlusion bodies (OBs) and by sequencing of the viral genome. The OBs of OpbuNPV-MA consisted of irregular polyhedra and contained virions consisting of a single rod-shaped nucleocapsid within each envelope. Presumptive cypovirus OBs were also detected in sections of the OB preparation. The OpbuNPV-MA genome assembly yielded a circular contig of 119,054 bp and was found to contain little genetic variation, with most polymorphisms occurring at a frequency of < 6%. A total of 130 open reading frames (ORFs) were annotated, including the 38 core genes of Baculoviridae, along with five homologous repeat (hr) regions. The results of BLASTp and phylogenetic analysis with selected ORFs indicated that OpbuNPV-MA is not closely related to other alphabaculoviruses. Phylogenies based on concatenated core gene amino acid sequence alignments placed OpbuNPV-MA on a basal branch lying outside other alphabaculovirus clades. These results indicate that OpbuNPV-MA represents a divergent baculovirus lineage that appeared early during the diversification of genus Alphabaculovirus.
Affiliation(s)
- Robert L Harrison: Invasive Insect Biocontrol and Behavior Laboratory, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, MD 20705, USA
- Daniel L Rowley: Invasive Insect Biocontrol and Behavior Laboratory, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, MD 20705, USA
- Joseph D Mowery: Electron and Confocal Microscopy Unit, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, MD 20705, USA
- Gary R Bauchan: Electron and Confocal Microscopy Unit, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, MD 20705, USA
- John P Burand: Department of Microbiology, University of Massachusetts-Amherst, Amherst, MA 01003, USA
|
14
|
Harrison RL, Rowley DL, Mowery J, Bauchan GR, Theilmann DA, Rohrmann GF, Erlandson MA. The Complete Genome Sequence of a Second Distinct Betabaculovirus from the True Armyworm, Mythimna unipuncta. PLoS One 2017; 12:e0170510. [PMID: 28103323 PMCID: PMC5245865 DOI: 10.1371/journal.pone.0170510] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 09/21/2016] [Accepted: 01/05/2017] [Indexed: 11/19/2022]
Abstract
The betabaculovirus originally called Pseudaletia (Mythimna) sp. granulovirus #8 (MyspGV#8) was examined by electron microscopy, host barcoding PCR, and determination of the nucleotide sequence of its genome. Scanning and transmission electron microscopy revealed that the occlusion bodies of MyspGV#8 possessed the characteristic size range and morphology of betabaculovirus granules. Barcoding PCR using cytochrome oxidase I primers with DNA from the MyspGV#8 collection sample confirmed that it had been isolated from the true armyworm, Mythimna unipuncta (Lepidoptera: Noctuidae) and therefore was renamed MyunGV#8. The MyunGV#8 genome was found to be 144,673 bp in size with a nucleotide distribution of 49.9% G+C, which was significantly smaller and more GC-rich than the genome of Pseudaletia unipuncta granulovirus H (PsunGV-H), another M. unipuncta betabaculovirus. A phylogeny based on concatenated baculovirus core gene amino acid sequence alignments placed MyunGV#8 in clade a of genus Betabaculovirus. Kimura-2-parameter nucleotide distances suggested that MyunGV#8 represents a virus species different and distinct from other species of Betabaculovirus. Among the 153 ORFs annotated in the MyunGV#8 genome, four ORFs appeared to have been obtained from or donated to the alphabaculovirus lineage represented by Leucania separata nucleopolyhedrovirus AH1 (LeseNPV-AH1) during co-infection of Mythimna sp. larvae. A set of 33 ORFs was identified that appears only in other clade a betabaculovirus isolates. This clade a-specific set includes an ORF that encodes a polypeptide sequence containing a CIDE_N domain, which is found in caspase-activated DNAse/DNA fragmentation factor (CAD/DFF) proteins. CAD/DFF proteins are involved in digesting DNA during apoptosis.
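The Kimura-2-parameter distance mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard formula d = -½ ln(1 - 2P - Q) - ¼ ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The function name `k2p_distance` and the choice to skip gapped or ambiguous sites are assumptions made for illustration; this is not the pipeline used by the authors.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    Hypothetical helper: sites where either sequence has a gap or an
    ambiguous base are skipped (an assumption, not the authors' rule).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

In practice such distances are computed over whole-gene or whole-genome alignments; the sketch only shows the per-pair arithmetic.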
Affiliation(s)
- Robert L. Harrison: Invasive Insect Biocontrol and Behavior Laboratory, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, Maryland, United States of America
- Daniel L. Rowley: Invasive Insect Biocontrol and Behavior Laboratory, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, Maryland, United States of America
- Joseph Mowery: Electron and Confocal Microscopy Unit, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, Maryland, United States of America
- Gary R. Bauchan: Electron and Confocal Microscopy Unit, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, Maryland, United States of America
- David A. Theilmann: Summerland Research and Development Centre, Agriculture and Agri-Food Canada, Summerland, British Columbia, Canada
- George F. Rohrmann: Department of Microbiology, Oregon State University, Corvallis, Oregon, United States of America
- Martin A. Erlandson: Saskatoon Research and Development Centre, Agriculture and Agri-Food Canada, Saskatoon, Saskatchewan, Canada
|
15
|
Liu X, Zheng H, Zhang W, Shen Z, Zhao M, Chen Y, Sun L, Shi J, Zhang J. Tracking Cefoperazone/Sulbactam Resistance Development In vivo in A. baumannii Isolated from a Patient with Hospital-Acquired Pneumonia by Whole-Genome Sequencing. Front Microbiol 2016; 7:1268. [PMID: 27594850 PMCID: PMC4990596 DOI: 10.3389/fmicb.2016.01268] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 05/17/2016] [Accepted: 08/02/2016] [Indexed: 01/10/2023]
Abstract
Cefoperazone/sulbactam has been shown to be efficacious for the treatment of infections caused by Acinetobacter baumannii; however, the mechanism underlying resistance to this synergistic combination is not well understood. In the present study, two A. baumannii isolates, AB1845 and AB2092, were isolated from a patient with hospital-acquired pneumonia before and after 20 days of cefoperazone/sulbactam therapy (2:1, 3 g every 8 h with a 1-h infusion). The minimum inhibitory concentration (MIC) of cefoperazone/sulbactam for AB1845 and AB2092 was 16/8 and 128/64 mg/L, respectively. Blood samples were collected on day 4 of the treatment to determine the concentration of cefoperazone and sulbactam. The pharmacokinetic/pharmacodynamic (PK/PD) indices (%T>MIC) were calculated to evaluate the dosage regimen and resistance development. The results showed that %T>MIC of cefoperazone and sulbactam was 100% and 34.5% for AB1845, and 0% and 0% for AB2092, respectively. Although there was no available PK/PD target for sulbactam, it was proposed that sulbactam should be administered at higher doses or for prolonged infusion times to achieve better efficacy. To investigate the mechanism of A. baumannii resistance to the cefoperazone/sulbactam combination in vivo, whole-genome sequencing of these two isolates was further performed. The sequencing results showed that 97.6% of the genome sequences were identical and 33 non-synonymous mutations were detected between AB1845 and AB2092. The only difference between these two isolates was shown in the sequencing coverage comparison: a 6-kb amplified DNA fragment had coverage three times higher in AB2092 than in AB1845. The amplified DNA fragment contained the blaOXA-23 gene on transposon Tn2009. Further quantitative real-time PCR results demonstrated that gene expression at the mRNA level of blaOXA-23 was >5 times higher in AB2092 than in AB1845. These results suggested that the blaOXA-23 gene had a higher expression level in AB2092 via gene amplification and subsequent transcription. Because gene amplification plays a critical role in antibiotic resistance in many bacteria, it is very likely that the blaOXA-23 amplification results in the development of cefoperazone/sulbactam resistance in vivo.
Affiliation(s)
- Xiaofen Liu: Institute of Antibiotics, Huashan Hospital, Fudan University, Shanghai, China; Roche Innovation Center Shanghai, Shanghai, China
- Huajun Zheng: Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai, China
- Weipeng Zhang: Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China
- Zhen Shen: Institute of Antibiotics, Huashan Hospital, Fudan University, Shanghai, China
- Miao Zhao: Institute of Antibiotics, Huashan Hospital, Fudan University, Shanghai, China
- Yuancheng Chen: Institute of Antibiotics, Huashan Hospital, Fudan University, Shanghai, China; Key Laboratory of Clinical Pharmacology of Antibiotics, National Population and Family Planning Commission, Shanghai, China
- Li Sun: Institute of Antibiotics, Huashan Hospital, Fudan University, Shanghai, China
- Jun Shi: Roche Innovation Center Shanghai, Shanghai, China
- Jing Zhang: Institute of Antibiotics, Huashan Hospital, Fudan University, Shanghai, China; Key Laboratory of Clinical Pharmacology of Antibiotics, National Population and Family Planning Commission, Shanghai, China
|
16
|
Harrison RL, Rowley DL, Funk CJ. The Complete Genome Sequence of Plodia Interpunctella Granulovirus: Evidence for Horizontal Gene Transfer and Discovery of an Unusual Inhibitor-of-Apoptosis Gene. PLoS One 2016; 11:e0160389. [PMID: 27472489 PMCID: PMC4966970 DOI: 10.1371/journal.pone.0160389] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 05/07/2016] [Accepted: 07/18/2016] [Indexed: 12/21/2022]
Abstract
The Indianmeal moth, Plodia interpunctella (Lepidoptera: Pyralidae), is a common pest of stored goods with a worldwide distribution. The complete genome sequence for a larval pathogen of this moth, the baculovirus Plodia interpunctella granulovirus (PiGV), was determined by next-generation sequencing. The PiGV genome was found to be 112,536 bp in length with a 44.2% G+C nucleotide distribution. A total of 123 open reading frames (ORFs) and seven homologous regions (hrs) were identified and annotated. Phylogenetic inference using concatenated alignments of 36 baculovirus core genes placed PiGV in the “b” clade of viruses from genus Betabaculovirus with a branch length suggesting that PiGV represents a distinct betabaculovirus species. In addition to the baculovirus core genes and orthologues of other genes found in other betabaculovirus genomes, the PiGV genome sequence contained orthologues of the bidensovirus NS3 gene, as well as ORFs that occur in alphabaculoviruses but not betabaculoviruses. While PiGV contained an orthologue of inhibitor of apoptosis-5 (iap-5), an orthologue of inhibitor of apoptosis-3 (iap-3) was not present. Instead, the PiGV sequence contained an ORF (PiGV ORF81) encoding an IAP homologue with sequence similarity to insect cellular IAPs, but not to viral IAPs. Phylogenetic analysis of baculovirus and insect IAP amino acid sequences suggested that the baculovirus IAP-3 genes and the PiGV ORF81 IAP homologue represent different lineages arising from more than one acquisition event. The presence of genes from other sources in the PiGV genome highlights the extent to which baculovirus gene content is shaped by horizontal gene transfer.
Affiliation(s)
- Robert L. Harrison: Invasive Insect Biocontrol and Behavior Laboratory, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, Maryland, United States of America
- Daniel L. Rowley: Invasive Insect Biocontrol and Behavior Laboratory, Beltsville Agricultural Research Center, USDA Agricultural Research Service, Beltsville, Maryland, United States of America
- C. Joel Funk: Department of Biology, John Brown University, Siloam Springs, Arkansas, United States of America
|
17
|
Geographic isolates of Lymantria dispar multiple nucleopolyhedrovirus: Genome sequence analysis and pathogenicity against European and Asian gypsy moth strains. J Invertebr Pathol 2016; 137:10-22. [PMID: 27090923 DOI: 10.1016/j.jip.2016.03.014] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 09/25/2015] [Revised: 03/07/2016] [Accepted: 03/29/2016] [Indexed: 02/04/2023]
Abstract
Isolates of the baculovirus species Lymantria dispar multiple nucleopolyhedrovirus have been formulated and applied to suppress outbreaks of the gypsy moth, L. dispar. To evaluate the genetic diversity in this species at the genomic level, the genomes of three isolates from Massachusetts, USA (LdMNPV-Ab-a624), Spain (LdMNPV-3054), and Japan (LdMNPV-3041) were sequenced and compared with four previously determined LdMNPV genome sequences. The LdMNPV genome sequences were collinear and contained the same homologous repeats (hrs) and clusters of baculovirus repeat orf (bro) gene family members in the same relative positions in their genomes, although sequence identities in these regions were low. Of 146 non-bro ORFs annotated in the genome of the representative isolate LdMNPV 5-6, 135 ORFs were found in every other LdMNPV genome, including the 37 core genes of Baculoviridae and other genes conserved in genus Alphabaculovirus. Phylogenetic inference with an alignment of the core gene nucleotide sequences grouped isolates 3041 (Japan) and 2161 (Korea) separately from a cluster containing isolates from Europe, North America, and Russia. To examine phenotypic diversity, bioassays were carried out with a selection of isolates against neonate larvae from three European gypsy moth (Lymantria dispar dispar) and three Asian gypsy moth (Lymantria dispar asiatica and Lymantria dispar japonica) colonies. LdMNPV isolates 2161 (Korea), 3029 (Russia), and 3041 (Japan) exhibited a greater degree of pathogenicity against all L. dispar strains than LdMNPV from a sample of Gypchek. This study provides additional information on the genetic diversity of LdMNPV isolates and their activity against the Asian gypsy moth, a potential invasive pest of North American trees and forests.
|
18
|
Complete Genome Sequence of a Western Siberian Lymantria dispar Multiple Nucleopolyhedrovirus Isolate. Genome Announc 2015; 3:e00335-15. [PMID: 25908142 PMCID: PMC4408343 DOI: 10.1128/genomea.00335-15] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/20/2022]
Abstract
A novel strain of Lymantria dispar multiple nucleopolyhedrovirus (LdMNPV-27) was isolated from dead larvae of a Western Siberian (WS) population of gypsy moths (Lymantria dispar L.). We report the complete genome sequence of this strain, which comprises 164,108 bp of double-stranded circular DNA encoding 162 predicted open reading frames.
|
19
|
Lactococcal 949 group phages recognize a carbohydrate receptor on the host cell surface. Appl Environ Microbiol 2015; 81:3299-305. [PMID: 25746988 DOI: 10.1128/aem.00143-15] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 01/17/2015] [Accepted: 02/25/2015] [Indexed: 12/27/2022]
Abstract
Lactococcal bacteriophages represent one of the leading causes of dairy fermentation failure and product inconsistencies. A new member of the lactococcal 949 phage group, named WRP3, was isolated from cheese whey from a Sicilian factory in 2011. The genome sequence of this phage was determined, and it constitutes the largest lactococcal phage genome currently known, at 130,008 bp. Detailed bioinformatic analysis of the genomic region encoding the presumed initiator complex and baseplate of WRP3 has aided in the functional assignment of several open reading frames (ORFs), particularly that for the receptor binding protein required for host recognition. Furthermore, we demonstrate that the 949 phages target cell wall phospho-polysaccharides as their receptors, accounting for the specificity of the interactions of these phages with their lactococcal hosts. Such information may ultimately aid in the identification of strains/strain blends that do not present the necessary saccharidic target for infection by these problematic phages.
|
20
|
Unraveling the web of viroinformatics: computational tools and databases in virus research. J Virol 2014; 89:1489-501. [PMID: 25428870 DOI: 10.1128/jvi.02027-14] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Indexed: 02/08/2023]
Abstract
The beginning of the second century of research in the field of virology (the first virus was discovered in 1898) was marked by its amalgamation with bioinformatics, resulting in the birth of a new domain: viroinformatics. The availability of more than 100 Web servers and databases embracing all or specific viruses (for example, dengue virus, influenza virus, hepatitis virus, human immunodeficiency virus [HIV], hemorrhagic fever virus [HFV], human papillomavirus [HPV], West Nile virus, etc.) as well as distinct applications (comparative/diversity analysis, viral recombination, small interfering RNA [siRNA]/short hairpin RNA [shRNA]/microRNA [miRNA] studies, RNA folding, protein-protein interaction, structural analysis, and phylotyping and genotyping) will definitely aid the development of effective drugs and vaccines. However, information about their access and utility is not available at any single source or on any single platform. Therefore, a compendium of various computational tools and resources dedicated specifically to virology is presented in this article.
|
21
|
Huang H, Dong Y, Yang ZL, Luo H, Zhang X, Gao F. Complete sequence of pABTJ2, a plasmid from Acinetobacter baumannii MDR-TJ, carrying many phage-like elements. Genomics Proteomics Bioinformatics 2014; 12:172-7. [PMID: 25046542 PMCID: PMC4411360 DOI: 10.1016/j.gpb.2014.05.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 03/21/2014] [Revised: 05/14/2014] [Accepted: 05/26/2014] [Indexed: 12/24/2022]
Abstract
Acinetobacter baumannii is an important opportunistic pathogen in hospital, and the multidrug-resistant isolates of A. baumannii have been increasingly reported in recent years. A number of different mechanisms of resistance have been reported, some of which are associated with plasmid-mediated acquisition of genes. Therefore, studies on plasmids in A. baumannii have been a hot issue lately. We have performed complete genome sequencing of A. baumannii MDR-TJ, which is a multidrug-resistant isolate. Finalizing the remaining large scaffold of the previous assembly, we found a new plasmid pABTJ2, which carries many phage-like elements. The plasmid pABTJ2 is a circular double-stranded DNA molecule, which is 110,967bp in length. We annotated 125 CDSs from pABTJ2 using IMG ER and ZCURVE_V, accounting for 88.28% of the whole plasmid sequence. Many phage-like elements and a tRNA-coding gene were detected in pABTJ2, which is rarely reported among A. baumannii. The tRNA gene is specific for asparagine codon GTT, which may be a small chromosomal sequence picked up through incorrect excision during plasmid formation. The phage-like elements may have been acquired during the integration process, as the GC content of the region carrying phage-like elements was higher than that of the adjacent regions. The finding of phage-like elements and tRNA-coding gene in pABTJ2 may provide a novel insight into the study of A. baumannii pan-plasmidome.
Affiliation(s)
- He Huang: MOE Key Laboratory of Systems Bioengineering, Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Yan Dong: MOE Key Laboratory of Systems Bioengineering, Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Zhi-Liang Yang: MOE Key Laboratory of Systems Bioengineering, Department of Biochemical Engineering, School of Chemical Engineering and Technology, Tianjin University, Tianjin 300072, China
- Hao Luo: Department of Physics, School of Science, Tianjin University, Tianjin 300072, China
- Xi Zhang: Department of Physics, School of Science, Tianjin University, Tianjin 300072, China
- Feng Gao: Department of Physics, School of Science, Tianjin University, Tianjin 300072, China; Collaborative Innovation Center of Chemical Science and Engineering, Tianjin 300072, China
|
22
|
Guo FB, Lin Y, Chen LL. Recognition of Protein-coding Genes Based on Z-curve Algorithms. Curr Genomics 2014; 15:95-103. [PMID: 24822027 PMCID: PMC4009845 DOI: 10.2174/1389202915999140328162724] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/15/2013] [Revised: 11/19/2013] [Accepted: 11/20/2013] [Indexed: 01/18/2023]
Abstract
Recognition of protein-coding genes, a classical bioinformatics issue, is an absolutely needed step for annotating newly sequenced genomes. The Z-curve algorithm, as one of the most effective methods on this issue, has been successfully applied in annotating or re-annotating many genomes, including those of bacteria, archaea and viruses. Two Z-curve based ab initio gene-finding programs have been developed: ZCURVE (for bacteria and archaea) and ZCURVE_V (for viruses and phages). ZCURVE_C (for 57 bacteria) and Zfisher (for any bacterium) are web servers for re-annotation of bacterial and archaeal genomes. The above four tools can be used for genome annotation or re-annotation, either independently or combined with the other gene-finding programs. In addition to recognizing protein-coding genes and exons, Z-curve algorithms are also effective in recognizing promoters and translation start sites. Here, we summarize the applications of Z-curve algorithms in gene finding and genome annotation.
Affiliation(s)
- Feng-Biao Guo: Center of Bioinformatics and Key Laboratory for NeuroInformation of the Ministry of Education, University of Electronic Science and Technology of China, Chengdu, 610054, China
- Yan Lin: Department of Physics, Tianjin University, Tianjin 300072, China
- Ling-Ling Chen: College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, China
|
23
|
Zhang R, Zhang CT. A Brief Review: The Z-curve Theory and its Application in Genome Analysis. Curr Genomics 2014; 15:78-94. [PMID: 24822026 PMCID: PMC4009844 DOI: 10.2174/1389202915999140328162433] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 09/08/2013] [Revised: 10/16/2013] [Accepted: 10/16/2013] [Indexed: 11/22/2022]
Abstract
In theoretical physics, there exist two basic mathematical approaches, algebraic and geometrical methods, which, in most cases, are complementary. In the area of genome sequence analysis, however, algebraic approaches have been widely used, while geometrical approaches have been less explored for a long time. The Z-curve theory is a geometrical approach to genome analysis. The Z-curve is a three-dimensional curve that represents a given DNA sequence in the sense that each can be uniquely reconstructed given the other. The Z-curve, therefore, contains all the information that the corresponding DNA sequence carries. The analysis of a DNA sequence can then be performed through studying the corresponding Z-curve. The Z-curve method has found applications in a wide range of areas in the past two decades, including the identification of protein-coding genes, replication origins, horizontally-transferred genomic islands, promoters, translation start sites and isochores, as well as studies on phylogenetics, genome visualization and comparative genomics. Here, we review the progress of Z-curve studies from aspects of both theory and applications in genome analysis.
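The one-to-one mapping between a DNA sequence and its Z-curve can be sketched in a few lines. The sketch below assumes the commonly stated definition of the three coordinates: x counts purines minus pyrimidines, y counts amino minus keto bases, and z counts weak minus strong hydrogen-bond bases, each accumulated along the sequence. The function names `z_curve` and `reconstruct` are hypothetical and not taken from any published Z-curve software.

```python
# Per-base step along the three Z-curve axes:
#   x: purine (A, G) vs pyrimidine (C, T)
#   y: amino (A, C) vs keto (G, T)
#   z: weak H-bond (A, T) vs strong H-bond (G, C)
STEPS = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}
BASES = {step: base for base, step in STEPS.items()}

def z_curve(seq):
    """Return the cumulative (x, y, z) point after each base of seq."""
    points = []
    x = y = z = 0
    for base in seq.upper():
        dx, dy, dz = STEPS[base]  # KeyError on non-ACGT characters
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points

def reconstruct(points):
    """Recover the DNA sequence from its Z-curve points."""
    previous = (0, 0, 0)
    bases = []
    for point in points:
        step = tuple(c - p for c, p in zip(point, previous))
        bases.append(BASES[step])
        previous = point
    return "".join(bases)
```

Because each base maps to a distinct step vector, differencing consecutive points recovers the sequence, which is the sense in which the curve and the sequence determine each other.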
Affiliation(s)
- Ren Zhang: Center for Molecular Medicine and Genetics, Wayne State University Medical School, Detroit, MI 48201, USA
- Chun-Ting Zhang: Department of Physics, Tianjin University, Tianjin 300072, China
|
24
|
Handtke S, Schroeter R, Jürgen B, Methling K, Schlüter R, Albrecht D, van Hijum SAFT, Bongaerts J, Maurer KH, Lalk M, Schweder T, Hecker M, Voigt B. Bacillus pumilus reveals a remarkably high resistance to hydrogen peroxide provoked oxidative stress. PLoS One 2014; 9:e85625. [PMID: 24465625 PMCID: PMC3896406 DOI: 10.1371/journal.pone.0085625] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 09/06/2013] [Accepted: 12/05/2013] [Indexed: 12/16/2022]
Abstract
Bacillus pumilus is characterized by a higher oxidative stress resistance than other comparable industrially relevant Bacilli such as B. subtilis or B. licheniformis. In this study, the response of B. pumilus to oxidative stress was investigated at the proteome, transcriptome and metabolome levels during treatment with high concentrations of hydrogen peroxide. Genes/proteins belonging to regulons that are known to have important functions in the oxidative stress response of other organisms, such as the Fur, Spx, SOS or CtsR regulons, were found to be upregulated. Strikingly, parts of the fundamental PerR regulon responding to peroxide stress in B. subtilis are not encoded in the B. pumilus genome. Thus, B. pumilus lacks the catalase KatA, the DNA-protection protein MrgA and the alkyl hydroperoxide reductase AhpCF. The data from this study suggest that the catalase KatX2 takes over the function of the missing KatA in the oxidative stress response of B. pumilus. The genome-wide expression analysis revealed an induction of genes relevant to bacillithiol (Cys-GlcN-malate, BSH). An analysis of the intracellular metabolites detected high intracellular levels of this protective metabolite, which indicates the importance of bacillithiol in the peroxide stress resistance of B. pumilus.
Affiliation(s)
- Stefan Handtke
  - Institute for Microbiology, University of Greifswald, Greifswald, Germany
- Rebecca Schroeter
  - Pharmaceutical Biotechnology, Institute of Pharmacy, University of Greifswald, Greifswald, Germany
- Britta Jürgen
  - Pharmaceutical Biotechnology, Institute of Pharmacy, University of Greifswald, Greifswald, Germany
- Karen Methling
  - Institute of Biochemistry, University of Greifswald, Greifswald, Germany
- Rabea Schlüter
  - Institute for Microbiology, University of Greifswald, Greifswald, Germany
- Dirk Albrecht
  - Institute for Microbiology, University of Greifswald, Greifswald, Germany
- Sacha A. F. T. van Hijum
  - Centre for Molecular and Biomolecular Informatics (CMBI), Nijmegen Centre for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, The Netherlands; and Division Processing and Safety, NIZO Food Research B.V., Ede, The Netherlands
- Johannes Bongaerts
  - Department of Chemistry and Biotechnology, Aachen University of Applied Sciences, Jülich, Germany
- Michael Lalk
  - Institute of Biochemistry, University of Greifswald, Greifswald, Germany
- Thomas Schweder
  - Pharmaceutical Biotechnology, Institute of Pharmacy, University of Greifswald, Greifswald, Germany
  - Institute of Marine Biotechnology, Greifswald, Germany
- Michael Hecker
  - Institute for Microbiology, University of Greifswald, Greifswald, Germany
  - Institute of Marine Biotechnology, Greifswald, Germany
- Birgit Voigt
  - Institute for Microbiology, University of Greifswald, Greifswald, Germany
  - Institute of Marine Biotechnology, Greifswald, Germany
|
25
|
Wu X, Liu H, Liu H, Su J, Lv J, Cui Y, Wang F, Zhang Y. Z curve theory-based analysis of the dynamic nature of nucleosome positioning in Saccharomyces cerevisiae. Gene 2013; 530:8-18. [DOI: 10.1016/j.gene.2013.08.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 03/28/2013] [Revised: 07/30/2013] [Accepted: 08/03/2013] [Indexed: 01/01/2023]
|
26
|
Complete Genome Sequence of the Novel Phage MG-B1 Infecting Bacillus weihenstephanensis. Genome Announcements 2013; 1(3):e00216-13. [PMID: 23766400 PMCID: PMC3707571 DOI: 10.1128/genomea.00216-13] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/20/2022]
Abstract
Here, we describe a novel virulent bacteriophage that infects Bacillus weihenstephanensis, isolated from soil in Austria. It is the first phage to be discovered that infects this species. Here, we present the complete genome sequence of this podovirus.
|
27
|
Investigation of the relationship between lactococcal host cell wall polysaccharide genotype and 936 phage receptor binding protein phylogeny. Appl Environ Microbiol 2013; 79:4385-92. [PMID: 23666332 DOI: 10.1128/aem.00653-13] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.1] [Indexed: 11/20/2022]
Abstract
Comparative genomics of 11 lactococcal 936-type phages combined with host range analysis allowed subgrouping of these phage genomes, particularly with respect to their encoded receptor binding proteins. The so-called pellicle or cell wall polysaccharide of Lactococcus lactis, which has been implicated as a host receptor of (certain) 936-type phages, is specified by a large gene cluster, which, among different lactococcal strains, contains highly conserved regions as well as regions of diversity. The regions of diversity within this cluster on the genomes of lactococcal strains MG1363, SK11, IL1403, KF147, CV56, and UC509.9 were used for the development of a multiplex PCR system to identify the pellicle genotype of lactococcal strains used in this study. The resulting comparative analysis revealed an apparent correlation between the pellicle genotype of a given host strain and the host range of tested 936-type phages. Such a correlation would allow prediction of the intrinsic 936-type phage sensitivity of a particular lactococcal strain and substantiates the notion that the lactococcal pellicle polysaccharide represents the receptor for (certain) 936-type phages while also partially explaining the molecular reasons behind the observed narrow host range of such phages.
|
28
|
Identification of a new P335 subgroup through molecular analysis of lactococcal phages Q33 and BM13. Appl Environ Microbiol 2013; 79:4401-9. [PMID: 23666331 DOI: 10.1128/aem.00832-13] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Indexed: 11/20/2022]
Abstract
Lactococcal dairy starter strains are under constant threat from phages in dairy fermentation facilities, especially by members of the so-called 936, P335, and c2 species. Among these three phage groups, members of the P335 species are the most genetically diverse. Here, we present the complete genome sequences of two P335-type phages, Q33 and BM13, isolated in North America and representing a novel lineage within this phage group. The Q33 and BM13 genomes exhibit homology, not only to P335-type, but also to elements of the 936-type phage sequences. The two phage genomes also have close relatedness to phages infecting Enterococcus and Clostridium, a heretofore unknown feature among lactococcal P335 phages. The Q33 and BM13 genomes are organized in functionally related clusters with genes encoding functions such as DNA replication and packaging, morphogenesis, and host cell lysis. Electron micrographic analysis of the two phages highlights the presence of a baseplate more reminiscent of the baseplate of 936 phages than that of the majority of members of the P335 group, with the exception of r1t and LC3.
|
29
|
The genome of VP3, a T7-like phage used for the typing of Vibrio cholerae. Arch Virol 2013; 158:1865-76. [PMID: 23543142 DOI: 10.1007/s00705-013-1676-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/12/2012] [Accepted: 02/13/2013] [Indexed: 12/15/2022]
Abstract
The bacteriophage VP3 is used in a phage-biotyping scheme as one of the typing phages of Vibrio cholerae O1 biotype El Tor strains. Here, we have sequenced and analyzed its genome. The genome consists of 39,481 bp with an overall G+C content of 42.6%. Fifty-two open reading frames (ORFs) were predicted. Within the genome, 17 highly conserved phage promoters and 6 rho-independent terminators were predicted. When assessed with Rluc as a reporter gene, 12 of 16 cloned VP3 promoters showed activity in the host strain V. cholerae biotype El Tor. Based on the temporal expression pattern detected using reverse transcription PCR (RT-PCR), VP3 ORFs can be classed into four groups, arranged according to their order in the VP3 genome. Terminators T1 and T6 are presumed to work efficiently. Sequencing of the typing phage VP3 of V. cholerae reveals its evolutionary divergence from the T7-like phages of Escherichia coli. Knowledge of VP3 expands the known host range of T7-like phages and will promote understanding of the different infection mechanisms used by members of this genus.
|
30
|
Genome sequence of Pseudomonas aeruginosa strain AH16, isolated from a patient with chronic pneumonia in China. J Bacteriol 2013; 194:5976-7. [PMID: 23045492 DOI: 10.1128/jb.01451-12] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/20/2022]
Abstract
Pseudomonas aeruginosa AH16 is a virulent strain isolated from a patient with chronic pneumonia in China. Here, we present a 6.8-Mb (G+C content, 66.13%) assembly of its genome with 6,332 putative coding sequences, which may provide insights into the genomic basis of activity of the clinical P. aeruginosa strain in China.
|
31
|
Wang S, Sundaram JP, Stockwell TB. VIGOR extended to annotate genomes for additional 12 different viruses. Nucleic Acids Res 2012; 40:W186-92. [PMID: 22669909 PMCID: PMC3394299 DOI: 10.1093/nar/gks528] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Indexed: 01/21/2023]
Abstract
A gene prediction program, VIGOR (Viral Genome ORF Reader), was developed at the J. Craig Venter Institute in 2010 and has been successfully performing gene calling in coronavirus, influenza, rhinovirus and rotavirus for projects at the Genome Sequencing Center for Infectious Diseases. VIGOR uses sequence similarity searches against custom protein databases to identify protein-coding regions, start and stop codons and other gene features. Ribonucleic acid editing and other features are accurately identified based on sequence similarity and signature residues. VIGOR produces four output files: a gene prediction file, a complementary DNA file, an alignment file, and a gene feature table file. The gene feature table can be used to create a GenBank submission. VIGOR takes a single input: viral genomic sequences in FASTA format. VIGOR has been extended to predict genes for 12 viruses: measles virus, mumps virus, rubella virus, respiratory syncytial virus, alphavirus and Venezuelan equine encephalitis virus, norovirus, metapneumovirus, yellow fever virus, Japanese encephalitis virus, parainfluenza virus and Sendai virus. VIGOR accurately detects complex gene features such as ribonucleic acid editing, stop codon leakage and ribosomal shunting. Precise identification of the mat_peptide cleavage for some viruses is a built-in feature of VIGOR. The gene predictions for these viruses have been evaluated by testing from 27 to 240 genomes from GenBank.
Affiliation(s)
- Shiliang Wang
  - J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850, USA
|
32
|
Cheng H, Chan WS, Li Z, Wang D, Liu S, Zhou Y. Small open reading frames: current prediction techniques and future prospect. Curr Protein Pept Sci 2012; 12:503-7. [PMID: 21787300 DOI: 10.2174/138920311796957667] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 04/01/2011] [Revised: 04/01/2011] [Accepted: 05/04/2011] [Indexed: 11/22/2022]
Abstract
Evidence is accumulating that small open reading frames (sORFs, <100 codons) play key roles in many important biological processes. Yet they are generally ignored in gene annotation, despite being far more abundant than genes with more than 100 codons. Here, we demonstrate that popular homology search and codon-index techniques perform poorly for small genes relative to larger genes, while a method dedicated to sORF discovery has a level of accuracy similar to that of homology search. This result is largely due to the small dataset of experimentally verified sORFs available for homology search and for training ab initio techniques. It highlights the urgent need for both experimental and computational studies to further improve the accuracy of sORF prediction.
Affiliation(s)
- Haoyu Cheng
  - Indiana University School of Informatics, Indiana University-Purdue University and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
|
33
|
Krüger D, Kapturska D, Fischer C, Daniel R, Wubet T. Diversity measures in environmental sequences are highly dependent on alignment quality--data from ITS and new LSU primers targeting basidiomycetes. PLoS One 2012; 7:e32139. [PMID: 22363808 PMCID: PMC3283731 DOI: 10.1371/journal.pone.0032139] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 06/30/2011] [Accepted: 01/23/2012] [Indexed: 02/04/2023]
Abstract
The ribosomal DNA comprising the ITS1-5.8S-ITS2 regions is widely used as a fungal marker in molecular ecology and systematics but cannot be aligned with confidence across genetically distant taxa. In order to study the diversity of Agaricomycotina in forest soils, we designed primers targeting the more alignable 28S (LSU) gene, which should be more useful for phylogenetic analyses of the detected taxa. This paper compares the performance of the established ITS1F/4B primer pair, which targets basidiomycetes, to that of two new pairs. Key factors in the comparison were the diversity covered, off-target amplification, rarefaction at different Operational Taxonomic Unit (OTU) cutoff levels, sensitivity of the method used to process the alignment to missing data and insecure positional homology, and the congruence of monophyletic clades with OTU assignments and BLAST-derived OTU names. The ITS primer pair yielded no off-target amplification but also exhibited the least fidelity to the expected phylogenetic groups. The LSU primers gave complementary pictures of diversity but were more sensitive to modifications of the alignment, such as the removal of difficult-to-align stretches. The LSU primers also yielded greater numbers of singletons and had a greater tendency to produce OTUs containing sequences from a wider variety of species, as judged by BLAST similarity. We introduced some new parameters to describe alignment heterogeneity based on Shannon entropy and the extent and contents of the OTUs in a phylogenetic tree space. Our results suggest that ITS should not be used when calculating phylogenetic trees from genetically distant sequences obtained from environmental DNA extractions and that it is inadvisable to define OTUs on the basis of very heterogeneous alignments.
Affiliation(s)
- Dirk Krüger
  - Department of Soil Ecology, UFZ-Helmholtz Centre for Environmental Research, Halle (Saale), Germany
|
34
|
Du MZ, Guo FB, Chen YY. Gene re-annotation in genome of the extremophile Pyrobaculum aerophilum by using bioinformatics methods. J Biomol Struct Dyn 2012; 29:391-401. [PMID: 21875157 DOI: 10.1080/07391102.2011.10507393] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/07/2023]
Abstract
In this paper, we re-annotated the genome of Pyrobaculum aerophilum str. IM2, particularly its hypothetical ORFs. The annotation process includes three parts. Firstly and most importantly, 23 new genes, which were missed in the original annotation, were found by combining similarity search and ab initio gene-finding approaches. Among these new genes, five have significant similarities to genes of known function and the rest have significant similarities to hypothetical ORFs contained in other genomes. Secondly, the coding potentials of the 1645 hypothetical ORFs were re-predicted by using 33 Z curve variables combined with the Fisher linear discrimination method. With an accuracy of 99.68%, 25 originally annotated hypothetical ORFs were recognized as non-coding by our method. Thirdly, 80 hypothetical ORFs were assigned potential functions by similarity search with the BLAST program. Re-annotation of the genome will benefit related research on this hyperthermophilic crenarchaeon. The re-annotation procedure could also serve as a reference for other archaeal genomes. Details of the revised annotation are freely available at http://cobi.uestc.edu.cn/resource/paero/
Affiliation(s)
- Meng-Ze Du
  - Key Laboratory for NeuroInformation of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
35
|
Liao WC, Ng WV, Lin IH, Syu WJ, Liu TT, Chang CH. T4-Like genome organization of the Escherichia coli O157:H7 lytic phage AR1. J Virol 2011; 85:6567-78. [PMID: 21507986 PMCID: PMC3126482 DOI: 10.1128/jvi.02378-10] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 11/15/2010] [Accepted: 04/04/2011] [Indexed: 11/20/2022]
Abstract
We report the genome organization and analysis of the first completely sequenced T4-like phage, AR1, of Escherichia coli O157:H7. Unlike most of the other sequenced phages of O157:H7, which belong to the temperate Podoviridae and Siphoviridae families, AR1 is a T4-like phage known to efficiently infect this pathogenic bacterial strain. The 167,435-bp AR1 genome is currently the largest among all the sequenced E. coli O157:H7 phages. It carries a total of 281 potential open reading frames (ORFs) and 10 putative tRNA genes. Of these, 126 predicted proteins could be classified into six viral orthologous group categories, with at least 18 proteins of the structural protein category having been detected by tandem mass spectrometry. Comparative genomic analysis of AR1 and four other completely sequenced T4-like genomes (RB32, RB69, T4, and JS98) indicated that they share a well-organized and highly conserved core genome, particularly in the regions encoding DNA replication and virion structural proteins. The major diverse features between these phages include the modules of distal tail fibers and the types and numbers of internal proteins, tRNA genes, and mobile elements. Codon usage analysis suggested that the presence of AR1-encoded tRNAs may be relevant to the codon usage of structural proteins. Furthermore, protein sequence analysis of AR1 gp37, a potential receptor binding protein, indicated that eight residues in the C terminus are unique to O157:H7 T4-like phages AR1 and PP01. These residues are known to be located in the T4 receptor recognition domain, and they may contribute to specificity for adsorption to the O157:H7 strain.
Affiliation(s)
- Wei-Chao Liao
  - Department of Biotechnology and Laboratory Science in Medicine
- Wan-Jr Syu
  - Institute of Microbiology and Immunology
- Tze-Tze Liu
  - Genome Research Center, National Yang-Ming University, Taipei, Taiwan
- Chuan-Hsiung Chang
  - Center for Systems and Synthetic Biology
  - Institute of Biomedical Informatics
|
36
|
Zhang R. A rebuttal to the comments on the genome order index and the Z-curve. Biol Direct 2011; 6:10. [PMID: 21324187 PMCID: PMC3046898 DOI: 10.1186/1745-6150-6-10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/31/2010] [Accepted: 02/16/2011] [Indexed: 11/15/2022]
Abstract
Background: Elhaik, Graur and Josic recently commented on the genome order index (S) and the Z-curve (Elhaik et al. Biol Direct 2010, 5: 10). S is a quantity defined as S = a² + c² + g² + t², where a, c, g and t denote the corresponding base frequencies. The Z-curve is a three-dimensional curve that represents a DNA sequence in such a manner that each can be uniquely reconstructed given the other. Elhaik et al. made 4 major claims. 1) In the previous mapping system with the regular tetrahedron, calculation of the radius of the inscribed sphere is "a mathematical error". 2) S follows an exponential distribution and is narrowly distributed with a range of (0.25 - 0.33). 3) Based on Chargaff's second parity rule (PR2), "S is equivalent to H [Shannon entropy]" and they are derivable from each other. 4) Z-curve "suffers from over dimensionality", because based on the analysis of 235 bacterial genomes, x and y components contributed less than 1% of the variance and therefore "would be of little use". Results: 1) Elhaik et al. mistakenly neglected the parameter 4/3 when calculating the radius of the inscribed sphere. 2) The exponential distribution of S is a restatement of our previous conclusion, and the range of (0.25 - 0.33) only paraphrases the previously suggested S range (0.25 - 1/3). 3) Elhaik et al. incorrectly disregard deviations from PR2 by treating the deviations as 0 altogether, reduce S and H, both having 4 variables, a, c, g and t, into functions of one single variable, a only, and apply this treatment to all DNA sequences as the basis of their "demonstration", which is therefore invalid. 4) Elhaik et al. confuse numerical smallness with biological insignificance, and disregard the distributions of purine/pyrimidine and amino/keto bases (x and y components), the variations of which, although they can be smaller than that of GC content, contain rich information that is important and useful, such as in locating replication origins of bacterial and archaeal genomes, and in studies of gene recognition in various species. Conclusion: Elhaik et al. confuse S (a single number) with Z-curve (a series of 3D coordinates), which are distinct. To use S as a case study of Z-curve, by itself, is invalid. S and H are neither equivalent nor derivable from each other. The criticisms of Elhaik, Graur and Josic are wrong. Reviewers: This article was reviewed by Erik van Nimwegen.
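To make the two quantities discussed in this abstract concrete, the following Python sketch computes the genome order index S (the sum of squared base frequencies, as defined above) and the Shannon entropy H of the base composition. The function names are hypothetical, the base-2 logarithm for H is an assumption (the abstract does not specify the base), and only A, C, G and T are counted.

```python
import math
from collections import Counter


def base_frequencies(seq):
    """Frequencies of A, C, G, T in seq (other symbols are ignored).

    Raises ZeroDivisionError if seq contains none of the four bases.
    """
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}


def genome_order_index(seq):
    """S = a^2 + c^2 + g^2 + t^2, with a, c, g, t the base frequencies."""
    return sum(f * f for f in base_frequencies(seq).values())


def shannon_entropy(seq):
    """H = -sum(f * log2(f)) over bases with non-zero frequency (in bits)."""
    return sum(-f * math.log2(f) for f in base_frequencies(seq).values() if f > 0)


if __name__ == "__main__":
    for dna in ("ACGT", "AAAA", "GATTACA"):
        print(dna, genome_order_index(dna), shannon_entropy(dna))
```

A perfectly uniform composition gives the minimum S of 0.25 and the maximum H of 2 bits, while a single-base sequence gives S = 1 and H = 0; both are single summary numbers of composition, unlike the Z-curve, which is a position-by-position series of coordinates.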
Affiliation(s)
- Ren Zhang
  - Department of Epidemiology and Biostatistics, Tianjin Cancer Institute and Hospital, Tianjin 300060, PR China
|
37
|
Hu S, Zheng H, Gu Y, Zhao J, Zhang W, Yang Y, Wang S, Zhao G, Yang S, Jiang W. Comparative genomic and transcriptomic analysis revealed genetic characteristics related to solvent formation and xylose utilization in Clostridium acetobutylicum EA 2018. BMC Genomics 2011; 12:93. [PMID: 21284892 PMCID: PMC3044671 DOI: 10.1186/1471-2164-12-93] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.3] [Received: 08/05/2010] [Accepted: 02/02/2011] [Indexed: 12/17/2022]
Abstract
Background: Clostridium acetobutylicum, a gram-positive and spore-forming anaerobe, is a major strain for the fermentative production of acetone, butanol and ethanol. However, a previously isolated hyper-butanol producing strain, C. acetobutylicum EA 2018, does not produce spores and has a greater capability for solvent production, especially of butanol, than the type strain C. acetobutylicum ATCC 824. Results: The complete genome of C. acetobutylicum EA 2018 was sequenced using Roche 454 pyrosequencing. Genomic comparison with ATCC 824 identified many variations that may contribute to the hyper-butanol producing characteristics of the EA 2018 strain, including a total of 46 deletion sites and 26 insertion sites. In addition, transcriptomic profiling of gene expression in EA 2018 relative to that of ATCC 824 revealed expression-level changes of several key genes related to solvent formation. For example, spo0A and adhEII have higher expression levels, and most of the genes related to acid formation have lower expression levels in EA 2018. Interestingly, the results also showed that the variation in CEA_G2622 (CAC2613 in ATCC 824), a putative transcriptional regulator involved in xylose utilization, might accelerate utilization of the substrate xylose. Conclusions: Comparative analysis of the C. acetobutylicum hyper-butanol producing strain EA 2018 and type strain ATCC 824 at both genomic and transcriptomic levels provides, for the first time, a molecular-level understanding of non-sporulation, higher solvent production and enhanced xylose utilization in the mutant EA 2018. The information could be valuable for further genetic modification of C. acetobutylicum for more effective butanol production.
Affiliation(s)
- Shiyuan Hu
  - Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China
|
38
|
Lu S, Gischkat S, Reiche M, Akob DM, Hallberg KB, Küsel K. Ecophysiology of Fe-cycling bacteria in acidic sediments. Appl Environ Microbiol 2010; 76:8174-83. [PMID: 20971876 PMCID: PMC3008266 DOI: 10.1128/aem.01931-10] [Citation(s) in RCA: 112] [Impact Index Per Article: 8.0] [Received: 08/13/2010] [Accepted: 10/13/2010] [Indexed: 02/01/2023]
Abstract
Using a combination of cultivation-dependent and -independent methods, this study aimed to elucidate the diversity of microorganisms involved in iron cycling and to resolve their in situ functional links in sediments of an acidic lignite mine lake. Using six different media with pH values ranging from 2.5 to 4.3, 117 isolates were obtained that grouped into 38 different strains, including 27 putative new species with respect to the closest characterized strains. Among the isolated strains, 22 strains were able to oxidize Fe(II), 34 were able to reduce Fe(III) in schwertmannite, the dominant iron oxide in this lake, and 21 could do both. All isolates falling into the Gammaproteobacteria (an unknown Dyella-like genus and Acidithiobacillus-related strains) were obtained from the top acidic sediment zones (pH 2.8). Firmicutes strains (related to Bacillus and Alicyclobacillus) were only isolated from deep, moderately acidic sediment zones (pH 4 to 5). Of the Alphaproteobacteria, Acidocella-related strains were only isolated from acidic zones, whereas Acidiphilium-related strains were isolated from all sediment depths. Bacterial clone libraries generally supported and complemented these patterns. Geobacter-related clone sequences were only obtained from deep sediment zones, and Geobacter-specific quantitative PCR yielded 8 × 10⁵ gene copy numbers. Isolates related to the Acidobacterium, Acidocella, and Alicyclobacillus genera and to the unknown Dyella-like genus showed a broad pH tolerance, ranging from 2.5 to 5.0, and preferred schwertmannite to goethite for Fe(III) reduction. This study highlighted the variety of acidophilic microorganisms that are responsible for iron cycling in acidic environments, extending the results of recent laboratory-based studies that showed this trait to be widespread among acidophiles.
Affiliation(s)
- Shipeng Lu
  - Institute of Ecology, Friedrich Schiller University Jena, Dornburger Strasse 159, D-07743 Jena, Germany; School of Biological Sciences, College of Natural Sciences, Bangor University, Bangor LL57 2UW, United Kingdom
- Stefan Gischkat
  - Institute of Ecology, Friedrich Schiller University Jena, Dornburger Strasse 159, D-07743 Jena, Germany; School of Biological Sciences, College of Natural Sciences, Bangor University, Bangor LL57 2UW, United Kingdom
- Marco Reiche
  - Institute of Ecology, Friedrich Schiller University Jena, Dornburger Strasse 159, D-07743 Jena, Germany; School of Biological Sciences, College of Natural Sciences, Bangor University, Bangor LL57 2UW, United Kingdom
- Denise M. Akob
  - Institute of Ecology, Friedrich Schiller University Jena, Dornburger Strasse 159, D-07743 Jena, Germany; School of Biological Sciences, College of Natural Sciences, Bangor University, Bangor LL57 2UW, United Kingdom
- Kevin B. Hallberg
  - Institute of Ecology, Friedrich Schiller University Jena, Dornburger Strasse 159, D-07743 Jena, Germany; School of Biological Sciences, College of Natural Sciences, Bangor University, Bangor LL57 2UW, United Kingdom
- Kirsten Küsel
  - Institute of Ecology, Friedrich Schiller University Jena, Dornburger Strasse 159, D-07743 Jena, Germany; School of Biological Sciences, College of Natural Sciences, Bangor University, Bangor LL57 2UW, United Kingdom
|
39
|
Towards Viral Genome Annotation Standards, Report from the 2010 NCBI Annotation Workshop. Viruses 2010; 2:2258-2268. [PMID: 21994619 PMCID: PMC3185566 DOI: 10.3390/v2102258] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 08/26/2010] [Revised: 09/18/2010] [Accepted: 09/20/2010] [Indexed: 11/29/2022]
Abstract
Improvements in DNA sequencing technologies portend a new era in virology and could possibly lead to a giant leap in our understanding of viral evolution and ecology. Yet, as viral genome sequences begin to fill the world’s biological databases, it is critically important to recognize that the scientific promise of this era is dependent on consistent and comprehensive genome annotation. With this in mind, the NCBI Genome Annotation Workshop recently hosted a study group tasked with developing sequence, function, and metadata annotation standards for viral genomes. This report describes the issues involved in viral genome annotation and reviews policy recommendations presented at the NCBI Annotation Workshop.
|
40
|
Wang S, Sundaram JP, Spiro D. VIGOR, an annotation program for small viral genomes. BMC Bioinformatics 2010; 11:451. [PMID: 20822531 PMCID: PMC2942859 DOI: 10.1186/1471-2105-11-451] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Received: 04/02/2010] [Accepted: 09/07/2010] [Indexed: 11/10/2022]
Abstract
Background: The decrease in sequencing cost and improvements in technology have made the re-sequencing of large genomes and the parallel sequencing of small genomes easier and more common. It is possible to completely sequence a small genome within days, and this increases the number of publicly available genomes. Among the genomes being rapidly sequenced are those of microbes and viruses responsible for infectious diseases. However, accurate gene prediction remains a challenge when decoding a newly sequenced genome. Therefore, accurate and efficient gene prediction programs are highly desired for rapid and cost-effective surveillance of RNA viruses through full genome sequencing. Results: We have developed VIGOR (Viral Genome ORF Reader), a web application tool for gene prediction in influenza virus, rotavirus, rhinovirus and coronavirus subtypes. VIGOR detects protein-coding regions based on sequence similarity searches, can accurately detect genome-specific features such as frame shifts, overlapping genes and embedded genes, and can predict mature peptides within the context of a single polypeptide open reading frame. Genotyping capability for influenza and rotavirus is built into the program. We compared VIGOR to previously described gene prediction programs, ZCURVE_V, GeneMarkS and FLAN. The specificity and sensitivity of VIGOR are greater than 99% for the RNA viral genomes tested. Conclusions: VIGOR is a user-friendly web-based genome annotation program for five different viral agents: influenza, rotavirus, rhinovirus, coronavirus and SARS coronavirus. This is the first publicly accessible gene prediction program for rotavirus and rhinovirus. VIGOR is able to accurately predict protein-coding genes for the above five viral types and has the capability to assign function to the predicted open reading frames and to genotype influenza virus. The prediction software was designed for performing high-throughput annotation and closure validation in a post-sequencing production pipeline.
Affiliation(s)
- Shiliang Wang
- J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850, USA.
|
41
|
Characterization of a new plasmid-like prophage in a pandemic Vibrio parahaemolyticus O3:K6 strain. Appl Environ Microbiol 2009; 75:2659-67. [PMID: 19286788 DOI: 10.1128/aem.02483-08] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.7] [Indexed: 11/20/2022] Open
Abstract
Vibrio parahaemolyticus is a common food-borne pathogen that is normally associated with seafood. In 1996, a pandemic O3:K6 strain abruptly appeared and caused the first pandemic of this pathogen to spread throughout many Asian countries, America, Europe, and Africa. The role of temperate bacteriophages in the evolution of this pathogen is of great interest. In this work, a new temperate phage, VP882, from a pandemic O3:K6 strain of V. parahaemolyticus was purified and characterized after mitomycin C induction. VP882 was a Myoviridae bacteriophage with a polyhedral head and a long rigid tail with a sheath-like structure. It infected and lysed high proportions of V. parahaemolyticus, Vibrio vulnificus, and Vibrio cholerae strains. The genome of phage VP882 was sequenced and was 38,197 bp long, and 71 putative open reading frames were identified, of which 27 were putative functional phage or bacterial genes. VP882 had a linear plasmid-like genome with a putative protelomerase gene and cohesive ends. The genome does not integrate into the host chromosome but was maintained as a plasmid in the lysogen. Analysis of the reaction sites of the protelomerases in different plasmid-like phages revealed that VP882 and PhiHAP-1 were highly similar, while N15, PhiKO2, and PY54 made up another closely related group. The presence of DNA adenine methylase and quorum-sensing transcriptional regulators in VP882 may play a specific role in this phage or regulate physiological or virulence-associated traits of the hosts. These genes may also be remnants from the bacterial chromosome following transduction.
|
42
|
Guo FB, Lin Y. Identify Protein-coding Genes in the Genomes of Aeropyrum pernix K1 and Chlorobium tepidum TLS. J Biomol Struct Dyn 2009; 26:413-20. [DOI: 10.1080/07391102.2009.10507256] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 10/28/2022]
|
43
|
Tan LV, Ha DQ, Hien VM, van der Hoek L, Farrar J, de Jong MD. Me Tri virus: a Semliki Forest virus strain from Vietnam? J Gen Virol 2008; 89:2132-2135. [PMID: 18753222 DOI: 10.1099/vir.0.2008/002121-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022] Open
Abstract
Me Tri virus (MTV) is a member of the Semliki Forest virus (SFV) complex in the genus Alphavirus, first isolated from Culex tritaeniorhynchus mosquitoes in Vietnam in 1971 and described as a newly recognized alphavirus, based on antigenic characterization. However, based on a partial nucleotide sequence of the E1 envelope glycoprotein gene, it has recently been argued that MTV may represent a variant of SFV rather than a separate species. To enable definitive classification, we determined the complete genome sequence of MTV from original virus stock. Nucleotide homology, as well as phylogenetic analyses based on whole and partial genome sequences confirmed that MTV is an isolate of SFV. Notable differences to other reported SFV sequences included a 122 nt insertion at the 5' non-translated region (NTR), likely resulting from homologous recombination of part of the nsP2 gene, and differences in the sequence length of the 3' NTR. To our knowledge, this is the first and only documentation of SFV isolation outside Africa. Further research is needed to clarify whether SFV continues to circulate in Vietnam.
Affiliation(s)
- Le Van Tan
- Oxford University Clinical Research Unit, 190 Ben Ham Tu, Ho Chi Minh City, Vietnam
- Do Quang Ha
- Oxford University Clinical Research Unit, 190 Ben Ham Tu, Ho Chi Minh City, Vietnam
- Vo Minh Hien
- Hospital for Tropical Diseases, 190 Ben Ham Tu, Ho Chi Minh City, Vietnam
- Lia van der Hoek
- Department of Medical Microbiology, Academic Medical Center, University of Amsterdam, Meibergdreef 15, 1105 AZ, Amsterdam, The Netherlands
- Jeremy Farrar
- Centre for Tropical Medicine, Oxford University, UK; Oxford University Clinical Research Unit, 190 Ben Ham Tu, Ho Chi Minh City, Vietnam
- Menno D de Jong
- Centre for Tropical Medicine, Oxford University, UK; Department of Medical Microbiology, Academic Medical Center, University of Amsterdam, Meibergdreef 15, 1105 AZ, Amsterdam, The Netherlands; Oxford University Clinical Research Unit, 190 Ben Ham Tu, Ho Chi Minh City, Vietnam
|
44
|
Yang JY, Zhou Y, Yu ZG, Anh V, Zhou LQ. Human Pol II promoter recognition based on primary sequences and free energy of dinucleotides. BMC Bioinformatics 2008; 9:113. [PMID: 18294399 PMCID: PMC2292139 DOI: 10.1186/1471-2105-9-113] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 08/13/2007] [Accepted: 02/24/2008] [Indexed: 01/29/2023] Open
Abstract
Background: Promoter region plays an important role in determining where the transcription of a particular gene should be initiated. Computational prediction of eukaryotic Pol II promoter sequences is one of the most significant problems in sequence analysis. Existing promoter prediction methods are still far from being satisfactory.
Results: We attempt to recognize the human Pol II promoter sequences from the non-promoter sequences which are made up of exon and intron sequences. Four methods are used: two kinds of multifractal analysis performed on the numeric sequences obtained from the dinucleotide free energy, Z curve analysis and global descriptor of the promoter/non-promoter primary sequences. A total of 141 parameters are extracted from these methods and categorized into seven groups (methods). They are used to generate certain spaces and then each promoter/non-promoter sequence is represented by a point in the corresponding space. All the 120 possible combinations of the seven methods are tested. Based on Fisher's linear discriminant algorithm, with a relatively smaller number of parameters (96 and 117), we get satisfactory discriminant accuracies. Particularly, in the case of 117 parameters, the accuracies for the training and test sets reach 90.43% and 89.79%, respectively. A comparison with five other existing methods indicates that our methods have a better performance. Using the global descriptor method (36 parameters), 17 of the 18 experimentally verified promoter sequences of human chromosome 22 are correctly identified.
Conclusion: The high accuracies achieved suggest that the methods of this paper are useful for understanding the difficult problem of promoter prediction.
Affiliation(s)
- Jian-Yi Yang
- School of Mathematics and Computational Science, Xiangtan University, Hunan 411105, China.
|
45
|
Guo FB, Yu XJ. Separate base usages of genes located on the leading and lagging strands in Chlamydia muridarum revealed by the Z curve method. BMC Genomics 2007; 8:366. [PMID: 17925038 PMCID: PMC2089121 DOI: 10.1186/1471-2164-8-366] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 04/11/2007] [Accepted: 10/10/2007] [Indexed: 11/10/2022] Open
Abstract
Background: The nucleotide compositional asymmetry between the leading and lagging strands in bacterial genomes has been the subject of intensive study in the past few years. It is interesting to mention that almost all bacterial genomes exhibit the same kind of base asymmetry. This work aims to investigate the strand biases in Chlamydia muridarum genome and show the potential of the Z curve method for quantitatively differentiating genes on the leading and lagging strands.
Results: The occurrence frequencies of bases of protein-coding genes in C. muridarum genome were analyzed by the Z curve method. It was found that genes located on the two strands of replication have distinct base usages in C. muridarum genome. According to their positions in the 9-D space spanned by the variables u1 – u9 of the Z curve method, K-means clustering algorithm can assign about 94% of genes to the correct strands, which is a few percent higher than those correctly classified by K-means based on the RSCU. The base usage and codon usage analyses show that genes on the leading strand have more G than C and more T than A, particularly at the third codon position. For genes on the lagging strand the bias is reversed. The y component of the Z curves for the complete chromosome sequences show that the excess of G over C and T over A are more remarkable in C. muridarum genome than in other bacterial genomes without separating base and/or codon usages. Furthermore, for the genomes of Borrelia burgdorferi, Treponema pallidum, Chlamydia muridarum and Chlamydia trachomatis, in which distinct base and/or codon usages have been observed, closer phylogenetic distance is found compared with other bacterial genomes.
Conclusion: The nature of the strand biases of base composition in C. muridarum is similar to that in most other bacterial genomes. However, the base composition asymmetry between the leading and lagging strands in C. muridarum is more significant than that in other bacteria. It is supposed that the remarkable strand biases of G/C and T/A are responsible for the appearance of separate base or codon usages in C. muridarum. On the other hand, the closer phylogenetic distance among the four bacterial genomes with separate base and/or codon usages is necessary rather than occasional. It is also shown that the Z curve method may be more sensitive than RSCU when being used to quantitatively analyze DNA sequences.
Affiliation(s)
- Feng-Biao Guo
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.
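The abstract of this entry relies on the Z curve representation of a DNA sequence. As a minimal sketch, assuming the standard three-component definition (x: purine vs. pyrimidine, y: amino vs. keto, z: weak vs. strong hydrogen bonding), the cumulative components can be computed as below. The function name `z_curve` is hypothetical, and the sketch does not reproduce the nine codon-position variables u1 – u9 or the K-means step used in the paper.

```python
def z_curve(seq):
    """Return the cumulative Z curve points (x, y, z) for a DNA sequence.

    x counts purines (A, G) minus pyrimidines (C, T),
    y counts amino bases (A, C) minus keto bases (G, T),
    z counts weak H-bond bases (A, T) minus strong ones (G, C).
    """
    # Per-base increments for the three components.
    steps = {
        "A": (1, 1, 1),
        "C": (-1, 1, -1),
        "G": (1, -1, -1),
        "T": (-1, -1, 1),
    }
    x = y = z = 0
    points = []
    for base in seq.upper():
        # Ambiguous symbols (e.g. N) leave the running totals unchanged.
        dx, dy, dz = steps.get(base, (0, 0, 0))
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


# A sequence richer in G and T than in C and A drives y downward.
final_point = z_curve("GGT")[-1]
```

Under this definition, a falling y component corresponds to an excess of G over C and of T over A, which is the kind of strand bias the abstract describes.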
|
46
|
Guo FB, Yu XJ. Re-prediction of protein-coding genes in the genome of Amsacta moorei entomopoxvirus. J Virol Methods 2007; 146:389-92. [PMID: 17716751 DOI: 10.1016/j.jviromet.2007.07.010] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 05/02/2007] [Revised: 07/12/2007] [Accepted: 07/18/2007] [Indexed: 11/23/2022]
Abstract
Using the Z curve method, the protein-coding genes in AmEPV genome are re-predicted. On the basis of the parameters trained on the experimentally validated genes, all of the 30 experimentally validated genes and 67 putative genes are predicted correctly as coding genes. The sensitivities of the present method for self-test and cross-validation are all 100% based on these test sets. Thirty-eight annotated conserved and hypothetical genes are predicted as non-coding ORFs. The number of re-predicted protein-coding genes in AmEPV is 256. It is significantly less than the number 294 reported in the original annotation. After extending the present method trained in AmEPV genome to the other entomopoxvirus genome, it is found that 116 of the 123 known and putative genes are predicted correctly as coding. Six of the seven falsely missed genes are less than 300 bp. The present method could be extended to other poxvirus genomes with or without adaptation of training sets.
Affiliation(s)
- Feng-Biao Guo
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.
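The counts quoted in the abstract of this entry can be checked with a few lines of arithmetic. The sketch below assumes the usual definition of sensitivity, true positives divided by the sum of true positives and false negatives; the function and variable names are illustrative and do not come from the paper.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real genes that the predictor reports as coding."""
    return true_positives / (true_positives + false_negatives)


# Cross-genome test from the abstract: 116 of 123 genes predicted as coding.
known_genes = 123
predicted_coding = 116
missed = known_genes - predicted_coding  # the "seven falsely missed genes"
cross_genome_sensitivity = sensitivity(predicted_coding, missed)

# Gene counts from the abstract: 294 originally annotated, 38 rejected.
originally_annotated = 294
rejected_as_noncoding = 38
repredicted = originally_annotated - rejected_as_noncoding

print(f"sensitivity: {cross_genome_sensitivity:.1%}")  # sensitivity: 94.3%
print(f"re-predicted genes: {repredicted}")            # re-predicted genes: 256
```

The difference 294 − 38 matches the 256 re-predicted genes reported in the abstract, although the abstract does not state that the total was derived in this way.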
|
47
|
Overbeek R, Bartels D, Vonstein V, Meyer F. Annotation of bacterial and archaeal genomes: improving accuracy and consistency. Chem Rev 2007; 107:3431-47. [PMID: 17658903 DOI: 10.1021/cr068308h] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Indexed: 11/28/2022]
Affiliation(s)
- Ross Overbeek
- Fellowship for Interpretation of Genomes, Burr Ridge, Illinois 60527, USA
|
48
|
de Jong A, van Hijum SAFT, Bijlsma JJE, Kok J, Kuipers OP. BAGEL: a web-based bacteriocin genome mining tool. Nucleic Acids Res 2006; 34:W273-9. [PMID: 16845009 PMCID: PMC1538908 DOI: 10.1093/nar/gkl237] [Citation(s) in RCA: 173] [Impact Index Per Article: 9.6] [Indexed: 11/12/2022] Open
Abstract
A common problem in the annotation of open reading frames (ORFs) is the identification of genes that are functionally similar but have limited or no sequence homology. This is particularly the case for bacteriocins, a very diverse group of antimicrobial peptides produced by bacteria and usually encoded by small, poorly conserved ORFs. ORFs surrounding bacteriocin genes are often biosynthetic genes. This information can be used to locate putative structural bacteriocin genes. Here, we describe BAGEL, a web server that identifies putative bacteriocin ORFs in a DNA sequence using novel, knowledge-based bacteriocin databases and motif databases. Many bacteriocins are encoded by small genes that are often omitted in the annotation process of bacterial genomes. Thus, we have implemented ORF detection using a number of published ORF prediction tools. In addition, BAGEL takes into account the genomic context, i.e. for each potential bacteriocin-encoding ORF, the sequence of the surrounding region on the genome is analyzed for genes that might encode proteins involved in biosynthesis, transport, regulation and/or immunity. These innovations make BAGEL unique in its ability to detect putative bacteriocin gene clusters in (new) bacterial genomes. BAGEL is freely accessible at: http://bioinformatics.biol.rug.nl/websoftware/bagel.
|