1
Zhu X, Zheng C, Dong X, Wang K, Zhang H, Yi W, Ye Z, Xue H, Bu W. Chromosome-level genome of the bean bug Megacopta cribraria in native range, provides insights into adaptation and pest management. Int J Biol Macromol 2023; 237:123989. [PMID: 36921825 DOI: 10.1016/j.ijbiomac.2023.123989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Revised: 02/17/2023] [Accepted: 02/26/2023] [Indexed: 03/16/2023]
Abstract
Megacopta cribraria, a bean pest causing tremendous economic losses in Asia, was discovered in North America in 2009. Although M. cribraria has become a focus of research on biological invasion and pest management, the lack of genomic resources limits in-depth studies. Here, we report the first chromosome-level genome of M. cribraria, assembled using Illumina, PacBio, and Hi-C data. The assembled genome size was 699.65 Mb, with a contig N50 of 1.43 Mb and a scaffold N50 of 109.27 Mb. More than 97.51% of bases were successfully anchored to six chromosomes. Genome annotation predicted a total of 13,308 coding genes, 96.3% of which were functionally annotated. Expanded gene families were involved in proteolysis, protein metabolism, and nitrogen metabolism, reflecting the underlying genomic basis for host adaptation during evolution. Transcriptome analysis revealed different gene expression patterns in the antenna, mouthpart, head, leg, wing, and carcass of adult M. cribraria. Moreover, the expression profiles of the odorant receptor genes indicated potential target genes for pest control. The high-quality chromosome-level genome will benefit further research on the adaptation, evolution, and population genetics of M. cribraria, which will assist in pest management and in tracking biological invasion routes.
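For readers unfamiliar with the N50 statistic quoted in the abstract above, the following is a minimal sketch of the commonly used definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions, not taken from the paper or from any assembler.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumes the common definition: walk the sequences from longest to
    shortest and report the length at which the running total first
    reaches half of the summed length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must contain at least one sequence")


# Hypothetical toy assembly (not data from the paper): total length is 20,
# and the two longest sequences (6 + 5 = 11) already cover half of it.
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_lengths)
```

Under this definition a few very long scaffolds dominate the value, which is consistent with the scaffold N50 in the abstract being far larger than the contig N50 after Hi-C anchoring.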
Affiliation(s)
- Xiuxiu Zhu
- College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin, 300071, China
- Chenguang Zheng
- College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin, 300071, China.
- Xue Dong
- College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin, 300071, China
- Kaibin Wang
- College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin, 300071, China
- Haiguang Zhang
- College of Life Sciences, Linyi University, Middle Part of Shuangling Road, Linyi 276000, China
- Wenbo Yi
- Department of Biology, Xinzhou Teachers University, Xinzhou 034000, China
- Zhen Ye
- College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin, 300071, China
- Huaijun Xue
- College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin, 300071, China
- Wenjun Bu
- College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin, 300071, China.
2
Sharma D, Sharma K, Mishra A, Siwach P, Mittal A, Jayaram B. Molecular dynamics simulation-based trinucleotide and tetranucleotide level structural and energy characterization of the functional units of genomic DNA. Phys Chem Chem Phys 2023; 25:7323-7337. [PMID: 36825435 DOI: 10.1039/d2cp04820e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/12/2023]
Abstract
Genomes of most organisms on Earth are written in a universal language of life, made up of four units, adenine (A), thymine (T), guanine (G), and cytosine (C), and understanding the way they are put together has been a great challenge to date. Multiple efforts have been made to annotate this wonderfully engineered string of DNA using different methods, but they lack a universal character. In this article, we have investigated the structural and energetic profiles of both prokaryotes and eukaryotes by considering two essential genomic sites, viz., the transcription start sites (TSS) and exon-intron boundaries. We have characterized these sites by mapping the structural and energy features of DNA obtained from molecular dynamics simulations, which consider all possible trinucleotide and tetranucleotide steps. For DNA, these physicochemical properties show distinct signatures at the TSS and exon-intron boundaries. Our results firmly convey the idea that DNA uses the same dialect for prokaryotes and eukaryotes and that it is worth going beyond sequence-level analyses to physicochemical space to determine the functional destiny of DNA sequences.
Affiliation(s)
- Dinesh Sharma
- Supercomputing Facility for Bioinformatics & Computational Biology, Kusuma School of Biological Sciences, Indian Institute of Technology, Delhi, India
- Kopal Sharma
- Supercomputing Facility for Bioinformatics & Computational Biology, Kusuma School of Biological Sciences, Indian Institute of Technology, Delhi, India
- Akhilesh Mishra
- Supercomputing Facility for Bioinformatics & Computational Biology, Kusuma School of Biological Sciences, Indian Institute of Technology, Delhi, India
- Priyanka Siwach
- Department of Biotechnology, Chaudhary Devi Lal University, Sirsa, Haryana, India
- Aditya Mittal
- Supercomputing Facility for Bioinformatics & Computational Biology, Kusuma School of Biological Sciences, Indian Institute of Technology, Delhi, India
- B Jayaram
- Supercomputing Facility for Bioinformatics & Computational Biology, Kusuma School of Biological Sciences, Indian Institute of Technology, Delhi, India
- Department of Chemistry, Indian Institute of Technology, Delhi, India.
3
Liu Z, Xing L, Huang W, Liu B, Wan F, Raffa KF, Hofstetter RW, Qian W, Sun J. Chromosome-level genome assembly and population genomic analyses provide insights into adaptive evolution of the red turpentine beetle, Dendroctonus valens. BMC Biol 2022; 20:190. [PMID: 36002826 PMCID: PMC9400205 DOI: 10.1186/s12915-022-01388-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2022] [Accepted: 08/10/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Biological invasions are responsible for substantial environmental and economic losses. The red turpentine beetle (RTB), Dendroctonus valens LeConte, is an important invasive bark beetle from North America that has caused substantial tree mortality in China. The lack of a high-quality reference genome seriously limits deciphering the extent to which genetic adaptations resulted in a secondary pest becoming so destructive in its invaded area. RESULTS: Here, we present a 322.41 Mb chromosome-scale reference genome of RTB, of which 98% of assembled sequences are anchored onto fourteen linkage groups including the X chromosome, with an N50 size of 24.36 Mb, which is significantly greater than that of other Coleoptera species. Repetitive sequences make up 45.22% of the genome, which is higher than in four other Coleoptera species, i.e., mountain pine beetle Dendroctonus ponderosae, red flour beetle Tribolium castaneum, blister beetle Hycleus cichorii, and Colorado potato beetle Leptinotarsa decemlineata. We identify rapidly expanded gene families and positively selected genes in RTB, which may be responsible for its rapid environmental adaptation. The population genetic structure of RTB was revealed by genome resequencing of geographic populations in native and invaded regions, suggesting substantial divergence of the North American population and illustrating the possible invasion and spread route in China. Selective sweep analysis highlighted the enhanced ability of Chinese populations in environmental adaptation. CONCLUSIONS: Overall, our high-quality reference genome represents an important resource for genomics study of invasive bark beetles, which will facilitate functional studies and help decipher the mechanisms underlying the invasion success of RTB by integrating the Pinus tabuliformis genome.
Affiliation(s)
- Zhudong Liu
- College of Life Science, Institute of Life Science and Green Development, Hebei University, Baoding, 071002, China
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 1000101, China
- Longsheng Xing
- College of Life Science, Institute of Life Science and Green Development, Hebei University, Baoding, 071002, China
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Bo Liu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Fanghao Wan
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China
- Kenneth F Raffa
- Department of Entomology, University of Wisconsin, Madison, WI, 53706, USA
- Wanqiang Qian
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China.
- Jianghua Sun
- College of Life Science, Institute of Life Science and Green Development, Hebei University, Baoding, 071002, China.
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 1000101, China.
4
Zhao P, Zheng X, Yu Y, Hou Z, Diao C, Wang H, Kang H, Ning C, Li J, Feng W, Wang W, Liu GE, Li B, Smith J, Chamba Y, Liu JF. Mining Unknown Porcine Protein Isoforms by Tissue-based Map of Proteome Enhances Pig Genome Annotation. GENOMICS, PROTEOMICS & BIOINFORMATICS 2021; 19:772-786. [PMID: 33631433 PMCID: PMC9170766 DOI: 10.1016/j.gpb.2021.02.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2018] [Revised: 09/05/2019] [Accepted: 11/29/2019] [Indexed: 11/29/2022]
Abstract
The lack of a complete pig proteome has left a gap in our knowledge of the pig genome and has restricted the feasibility of using pigs as a biomedical model. In this study, we developed a tissue-based proteome map using 34 major normal pig tissues. A total of 5841 unknown protein isoforms were identified and systematically characterized, including 2225 novel protein isoforms, 669 protein isoforms from 460 genes symbolized beginning with LOC, and 2947 protein isoforms without clear NCBI annotation in the current pig reference genome. These newly identified protein isoforms were functionally annotated through profiling the pig transcriptome with high-throughput RNA sequencing of the same pig tissues, further improving the genome annotation of the corresponding protein-coding genes. Combining the well-annotated genes that have parallel expression patterns and subcellular witness, we predicted the tissue-related subcellular locations and potential functions for these unknown proteins. Finally, we mined 3081 orthologous genes for 52.7% of unknown protein isoforms across multiple species, referring to 68 KEGG pathways as well as 23 disease signaling pathways. These findings provide valuable insights and a rich resource for enhancing studies of pig genomics and biology, as well as biomedical model application to human medicine.
Affiliation(s)
- Pengju Zhao
- National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Xianrui Zheng
- National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Ying Yu
- National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Zhuocheng Hou
- National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Chenguang Diao
- National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Haifei Wang
- National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Huimin Kang
- National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Chao Ning
- National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Junhui Li
- National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Wen Feng
- National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Wen Wang
- Center for Ecological and Environmental Sciences, Northwestern Polytechnical University, Xi'an 710072, China
- George E Liu
- Animal Genomics and Improvement Laboratory, Beltsville Agricultural Research Center, U.S. Department of Agriculture, Beltsville, MD 20705, USA
- Bugao Li
- Department of Animal Sciences and Veterinary Medicine, Shanxi Agricultural University, Taigu 030801, China
- Jacqueline Smith
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian EH25 9RG, UK
- Yangzom Chamba
- Tibet Agriculture and Animal Husbandry College, Linzhi 860000, China
- Jian-Feng Liu
- National Engineering Laboratory for Animal Breeding, Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China.
5
Abdelsattar AS, Dawoud A, Makky S, Nofal R, Aziz RK, El-Shibiny A. Bacteriophages: from isolation to application. Curr Pharm Biotechnol 2021; 23:337-360. [PMID: 33902418 DOI: 10.2174/1389201022666210426092002] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/24/2020] [Revised: 01/29/2021] [Accepted: 03/11/2021] [Indexed: 11/22/2022]
Abstract
Bacteriophages are considered a potential alternative for fighting pathogenic bacteria in the era of antibiotic resistance. With their high specificity, they are being widely used in various applications: medicine, food industry, agriculture, animal farms, biotechnology, diagnosis, etc. Many techniques have been designed by different researchers for phage isolation, purification, and amplification, each of which has strengths and weaknesses. However, all aim at obtaining a reasonably pure phage sample that can be further characterized. Phages can be characterized based on physiological, morphological, or inactivation tests. Microscopy, in particular, has opened a wide gate not only for visualizing phage morphological structure, but also for monitoring biochemistry and behavior. Meanwhile, computational analysis of phage genomes provides more details about phage history, lifestyle, and potential for toxigenic or lysogenic conversion, which translate to safety in biocontrol and phage therapy applications. This review summarizes phage application pipelines at different levels and addresses specific restrictions and knowledge gaps in the field. Recently developed computational approaches used in phage genome analysis are critically assessed. We hope that this assessment provides researchers with useful insights for selecting suitable approaches for phage-related research aims and applications.
Affiliation(s)
- Abdallah S Abdelsattar
- Center for Microbiology and Phage Therapy, Zewail City of Science and Technology, October Gardens, 6th of October City, Giza, 12578, Egypt
- Alyaa Dawoud
- Center for Microbiology and Phage Therapy, Zewail City of Science and Technology, October Gardens, 6th of October City, Giza, 12578, Egypt
- Salsabil Makky
- Center for Microbiology and Phage Therapy, Zewail City of Science and Technology, October Gardens, 6th of October City, Giza, 12578, Egypt
- Rana Nofal
- Center for Microbiology and Phage Therapy, Zewail City of Science and Technology, October Gardens, 6th of October City, Giza, 12578, Egypt
- Ramy K Aziz
- Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Qasr El-Ainy St, Cairo, Egypt
- Ayman El-Shibiny
- Center for Microbiology and Phage Therapy, Zewail City of Science and Technology, October Gardens, 6th of October City, Giza, 12578, Egypt
6
The prognosis predictive value of FMS-like tyrosine kinase 3-internal tandem duplications mutant allelic ratio (FLT3-ITD MR) in patients with acute myeloid leukemia detected by GeneScan. Gene 2020; 726:144195. [DOI: 10.1016/j.gene.2019.144195] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/28/2019] [Revised: 10/18/2019] [Accepted: 10/20/2019] [Indexed: 01/04/2023]
7
Liu J, Xiao H, Huang S, Li F. OMIGA: Optimized Maker-Based Insect Genome Annotation. Mol Genet Genomics 2014; 289:567-73. [DOI: 10.1007/s00438-014-0831-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 05/23/2013] [Accepted: 02/17/2014] [Indexed: 10/25/2022]
8
Paar V, Pavin N, Basar I, Rosandić M, Gluncić M, Paar N. Hierarchical structure of cascade of primary and secondary periodicities in Fourier power spectrum of alphoid higher order repeats. BMC Bioinformatics 2008; 9:466. [PMID: 18980673 PMCID: PMC2661002 DOI: 10.1186/1471-2105-9-466] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 03/12/2008] [Accepted: 11/03/2008] [Indexed: 11/28/2022] Open
Abstract
Background: Identification of approximate tandem repeats is an important task of broad significance and remains a challenging problem in computational genomics. Often there is no single best approach to periodicity detection, and a combination of different methods may improve prediction accuracy. The discrete Fourier transform (DFT) has been used extensively to study primary periodicities in DNA sequences. Here we investigate the application of the DFT method to identify and study alphoid higher order repeats. Results: We used a method based on the DFT, with mapping of the symbolic sequence into a numerical sequence, to identify and study alphoid higher order repeats (HORs). For HORs the power spectrum shows an equidistant frequency pattern, with a characteristic two-level hierarchical organization as the signature of a HOR. Our case study was the 16mer HOR tandem in AC017075.8 from human chromosome 7. A very long array of equidistant peaks at multiple frequencies (more than a thousand higher harmonics) is based on the fundamental frequency of the 16mer HOR. A pronounced subset of equidistant peaks is based on multiples of the fundamental HOR frequency (multiplication factor n for an nmer) and higher harmonics. In general, an nmer HOR pattern contains equidistant secondary periodicity peaks, with a pronounced subset of equidistant primary periodicity peaks. This hierarchical pattern as a signature for HOR detection is robust with respect to monomer insertions and deletions, random sequence insertions, etc. For a monomeric alphoid sequence only primary periodicity peaks are present. The 1/f^β noise and the periodicity-three pattern are missing from power spectra in alphoid regions, in accordance with expectations. Conclusion: DFT provides a robust detection method for higher order periodicity. An easily recognizable HOR power spectrum is characterized by a hierarchical two-level equidistant pattern: higher harmonics of the fundamental HOR frequency (secondary periodicity) and a subset of pronounced peaks corresponding to constituent monomers (primary periodicity). The number of lower frequency peaks (secondary periodicity) below the frequency of the first primary periodicity peak reveals the size of the nmer HOR, i.e., the number n of monomers contained in the consensus HOR.
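The equidistant-peak idea in the abstract above can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the binary indicator mapping (one 0/1 sequence per base), the function names, and the toy sequence are all assumptions chosen for brevity, and a naive O(N²) DFT from the standard library is used instead of an FFT. The toy input is an exact tandem repeat of a single 5-base unit, so it only shows primary-periodicity peaks, not the two-level HOR hierarchy.

```python
import cmath


def power_spectrum(seq):
    """Summed DFT power of the four base-indicator sequences of `seq`.

    Assumed mapping: for each base, a 0/1 sequence marking where it occurs.
    Returns power at frequency indices k = 0 .. len(seq) // 2.
    """
    n = len(seq)
    spectrum = []
    for k in range(n // 2 + 1):
        total = 0.0
        for base in "ACGT":
            # DFT coefficient of the indicator sequence at frequency k.
            coeff = sum(
                cmath.exp(-2j * cmath.pi * k * j / n)
                for j, ch in enumerate(seq)
                if ch == base
            )
            total += abs(coeff) ** 2
        spectrum.append(total)
    return spectrum


def peak_frequencies(seq, tol=1e-6):
    """Non-zero frequency indices whose power exceeds a small tolerance."""
    spectrum = power_spectrum(seq)
    return [k for k in range(1, len(spectrum)) if spectrum[k] > tol]


# Hypothetical toy repeat: a 5-base unit repeated 12 times (N = 60).
toy = "AACGT" * 12
peaks = peak_frequencies(toy)       # equidistant peaks: [12, 24]
period = len(toy) // peaks[0]       # spacing recovers the repeat unit: 5
```

For an exact repeat, power is confined to multiples of N divided by the period, which is why the peaks are equidistant. Real alphoid arrays are only approximately periodic, so a practical analysis would compare peak heights against the background rather than use a fixed tolerance.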
Affiliation(s)
- Vladimir Paar
- Faculty of Science, University of Zagreb, Bijenicka 32, Zagreb, Croatia.
9
Zagursky RJ, Olmsted SB, Russell DP, Wooters JL. Bioinformatics: how it is being used to identify bacterial vaccine candidates. Expert Rev Vaccines 2003; 2:417-36. [PMID: 12903807 DOI: 10.1586/14760584.2.3.417] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.1] [Indexed: 11/08/2022]
Abstract
Genomic sequencing has provided a tremendous amount of information that can be useful in vaccine target identification. The sheer volume of information available necessitates the use of new research disciplines and techniques. Using bioinformatics, researchers sift through available data to identify appropriate candidates for biological analysis. This review provides an overview of available bioinformatic techniques for vaccine candidate identification and a few examples of how these techniques are being applied to specific bacterial pathogens.
10
Aggarwal G, Ramaswamy R. Ab initio gene identification: prokaryote genome annotation with GeneScan and GLIMMER. J Biosci 2002; 27:7-14. [PMID: 11927773 DOI: 10.1007/bf02703679] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.3] [Indexed: 11/25/2022]
Abstract
We compare the annotation of three complete genomes using the ab initio methods of gene identification GeneScan and GLIMMER. The annotation given in GenBank, the standard against which these are compared, has been made using GeneMark. We find a number of novel genes which are predicted by both methods used here, as well as a number of genes that are predicted by GeneMark, but are not identified by either of the nonconsensus methods that we have used. The three organisms studied here are all prokaryotic species with fairly compact genomes. The Fourier measure forms the basis for an efficient non-consensus method for gene prediction, and the algorithm GeneScan exploits this measure. We have bench-marked this program as well as GLIMMER using 3 complete prokaryotic genomes. An effort has also been made to study the limitations of these techniques for complete genome analysis. GeneScan and GLIMMER are of comparable accuracy insofar as gene-identification is concerned, with sensitivities and specificities typically greater than 0.9. The number of false predictions (both positive and negative) is higher for GeneScan as compared to GLIMMER, but in a significant number of cases, similar results are provided by the two techniques. This suggests that there could be some as-yet unidentified additional genes in these three genomes, and also that some of the putative identifications made hitherto might require re-evaluation. All these cases are discussed in detail.
Affiliation(s)
- Gautam Aggarwal
- School of Physical Sciences, Jawaharlal Nehru University, New Delhi 110 067, India
11
Dandekar T, Du F, Schirmer RH, Schmidt S. Medical target prediction from genome sequence: combining different sequence analysis algorithms with expert knowledge and input from artificial intelligence approaches. COMPUTERS & CHEMISTRY 2001; 26:15-21. [PMID: 11765847 DOI: 10.1016/s0097-8485(01)00095-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/27/2022]
Abstract
By exploiting the rapid increase in available sequence data, the definition of medically relevant protein targets has been improved by a combination of: (i) differential genome analysis (target list); and (ii) analysis of individual proteins (target analysis). Fast sequence comparisons, data mining, and genetic algorithms further promote these procedures. Mycobacterium tuberculosis proteins were chosen as applied examples.
Affiliation(s)
- T Dandekar
- European Molecular Biology Laboratory, PO Box 102209, Meyerhofstrasse 1, D-69012 Heidelberg, Germany.
12
Carter RJ, Dubchak I, Holbrook SR. A computational approach to identify genes for functional RNAs in genomic sequences. Nucleic Acids Res 2001; 29:3928-38. [PMID: 11574674 PMCID: PMC60242 DOI: 10.1093/nar/29.19.3928] [Citation(s) in RCA: 148] [Impact Index Per Article: 6.4] [Indexed: 11/14/2022] Open
Abstract
Currently there is no successful computational approach for identification of genes encoding novel functional RNAs (fRNAs) in genomic sequences. We have developed a machine learning approach using neural networks and support vector machines to extract common features among known RNAs for prediction of new RNA genes in the unannotated regions of prokaryotic and archaeal genomes. The Escherichia coli genome was used for development, but we have applied this method to several other bacterial and archaeal genomes. Networks based on nucleotide composition were 80-90% accurate in jackknife testing experiments for bacteria and 90-99% for hyperthermophilic archaea. We also achieved a significant improvement in accuracy by combining these predictions with those obtained using a second set of parameters consisting of known RNA sequence motifs and the calculated free energy of folding. Several known fRNAs not included in the training datasets were identified as well as several hundred predicted novel RNAs. These studies indicate that there are many unidentified RNAs in simple genomes that can be predicted computationally as a precursor to experimental study. Public access to our RNA gene predictions and an interface for user predictions is available via the web.
Affiliation(s)
- R J Carter
- Computational and Theoretical Biology Department, Physical Biosciences Division, National Energy Research Scientific Computing Center, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
13
Lynn AM, Jain CK, Kosalai K, Barman P, Thakur N, Batra H, Bhattacharya A. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST. J Genet 2001; 80:9-16. [PMID: 11910119 DOI: 10.1007/bf02811413] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 10/22/2022]
Abstract
Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.
Affiliation(s)
- A M Lynn
- Bioinformatics Centre, Jawaharlal Nehru University, New Delhi 110 067, India
14
Abstract
The Genome Annotation Assessment Project tested current methods of gene identification, including a critical assessment of the accuracy of different methods. Two new databases have provided new resources for gene annotation: these are the InterPro database of protein domains and motifs, and the Gene Ontology database for terms that describe the molecular functions and biological roles of gene products. Efforts in genome annotation are most often based upon advances in computer systems that are specifically designed to deal with the tremendous amounts of data being generated by current sequencing projects. These efforts in analysis are being linked to new ways of visualizing computationally annotated genomes.
Affiliation(s)
- S Lewis
- Department of Molecular and Cell Biology, Berkeley Drosophila Genome Project, University of California, Berkeley, CA 94720-3200, USA.
15
Abstract
Complete genomic sequences of microbial pathogens and hosts offer sophisticated new strategies for studying host-pathogen interactions. DNA microarrays exploit primary sequence data to measure transcript levels and detect sequence polymorphisms, for every gene, simultaneously. The design and construction of a DNA microarray for any given microbial genome are straightforward. By monitoring microbial gene expression, one can predict the functions of uncharacterized genes, probe the physiologic adaptations made under various environmental conditions, identify virulence-associated genes, and test the effects of drugs. Similarly, by using host gene microarrays, one can explore host response at the level of gene expression and provide a molecular description of the events that follow infection. Host profiling might also identify gene expression signatures unique for each pathogen, thus providing a novel tool for diagnosis, prognosis, and clinical management of infectious disease.