1
He J, Huang Y, Li L, Lin S, Ma M, Wang Y, Lin S. Novel Plastid Genome Characteristics in Fugacium kawagutii and the Trend of Accelerated Evolution of Plastid Proteins in Dinoflagellates. Genome Biol Evol 2024; 16:evad237. [PMID: 38155596 PMCID: PMC10781511 DOI: 10.1093/gbe/evad237] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/31/2023] [Revised: 12/19/2023] [Accepted: 12/22/2023] [Indexed: 12/30/2023]
Abstract
Typical (peridinin-containing) dinoflagellates possess plastid genomes composed of small plasmids named "minicircles". Despite the ecological importance of dinoflagellate photosynthesis in corals and marine ecosystems, the structural characteristics, replication dynamics, and evolutionary forcing of dinoflagellate plastid genomes remain poorly understood. Here, we sequenced the plastid genome of the symbiodiniacean species Fugacium kawagutii and conducted comparative analyses. We identified psbT-coding minicircles, features not previously found in Symbiodiniaceae. The copy number of F. kawagutii minicircles showed strong diel dynamics, changing between 3.89 and 34.3 copies/cell and peaking in the mid-light period. We found that F. kawagutii minicircles are the shortest among all dinoflagellates examined to date. Moreover, the core regions of the minicircles are highly conserved within genera in Symbiodiniaceae. Furthermore, the codon usage bias of the plastid genomes in Heterocapsaceae, Amphidiniaceae, and Prorocentraceae species is greatly influenced by selection pressure, whereas in Pyrocystaceae, Symbiodiniaceae, Peridiniaceae, and Ceratiaceae species it is influenced by both natural selection pressure and mutation pressure, indicating a family-level distinction in codon usage evolution in dinoflagellates. Phylogenetic analysis using 12 plastid-encoded proteins and five nucleus-encoded plastid proteins revealed a trend of accelerated evolution in both plastid- and nucleus-encoded plastid proteins of peridinin- and fucoxanthin-dinoflagellate plastids compared to plastid proteins of nondinoflagellate algae. These findings shed new light on the structure and evolution of plastid genomes in dinoflagellates, which will facilitate further studies on the evolutionary forcing and function of the diverse dinoflagellate plastids. The accelerated evolution documented here suggests plastid-encoded sequences are potentially useful for resolving closely related dinoflagellates.
Affiliation(s)
- Jiamin He
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China
- Yulin Huang
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China
- Ling Li
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China
- Sitong Lin
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China
- Minglei Ma
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China
- Yujie Wang
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China
- Senjie Lin
- State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, China
- Department of Marine Sciences, University of Connecticut, Groton, CT 06340, USA
2
Špoljarić D, Ugrina I. Limiting distribution of the number of clumps of palindromes in DNA. COMMUN STAT-THEOR M 2017. [DOI: 10.1080/03610926.2016.1189573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Drago Špoljarić
- Faculty of Mining, Geology and Petroleum Engineering, University of Zagreb, Zagreb, Croatia
- Ivo Ugrina
- Faculty of Science, Department of Mathematics, University of Zagreb, Zagreb, Croatia
3
Rizvi AZ, Bhattacharya C. Detection of Replication Origin Sites in Herpesvirus Genomes by Clustering and Scoring of Palindromes with Quadratic Entropy Measures. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2014; 11:1108-1118. [PMID: 26357048 DOI: 10.1109/tcbb.2014.2330622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Replication in herpesvirus genomes is a major public health concern, as the viruses multiply rapidly during the lytic phase of infection, which causes maximum damage to the host cells. Earlier research has established that sites of replication origin are dominated by a high concentration of rare palindromic DNA sequences. Computational methods based on scoring have been devised to determine the concentration of palindromes. In this paper, we propose both extraction and localization of rare palindromes in an automated manner. The Discrete Cosine Transform (DCT-II), a widely recognized image compression algorithm, is used here to extract palindromic sequences based on their reverse-complementary symmetry. We formulate a novel approach to localize the rare palindrome clusters by devising a Minimum Quadratic Entropy (MQE) measure based on Rényi's Quadratic Entropy (RQE) function. Experimental results over a large number of herpesvirus genomes show that RQE-based scoring of rare palindromes has a higher order of sensitivity and fewer false alarms in detecting concentrations of rare palindromes, and thereby sites of replication origin.
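As a rough illustration of the entropy function named in this abstract, the sketch below computes Rényi's quadratic entropy, H2(p) = -log(sum of p_i squared), for a list of counts. The abstract does not say how the MQE score is built from this function, so the windowed palindrome counts used here and the function name `renyi_quadratic_entropy` are assumptions made only for the example.

```python
import math


def renyi_quadratic_entropy(counts):
    """Rényi quadratic entropy H2 = -log(sum(p_i ** 2)) of a count vector.

    `counts` is a hypothetical list of palindrome counts per genome window;
    the counts are normalised to probabilities before the entropy is taken.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    probabilities = [c / total for c in counts]
    return -math.log(sum(p * p for p in probabilities))


# Palindromes spread evenly over four windows give the maximum value, log(4).
even = renyi_quadratic_entropy([5, 5, 5, 5])
# Palindromes concentrated in one window give the minimum value, 0.
clustered = renyi_quadratic_entropy([20, 0, 0, 0])
```

Under this reading, a low entropy indicates that palindromes are concentrated in few windows, which is the kind of clustering the abstract associates with replication origins.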
4
5
Špoljarić D, Ugrina I. On Statistical Properties of Palindromes in DNA. COMMUN STAT-THEOR M 2013. [DOI: 10.1080/03610926.2012.739253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
6
Anjana R, Shankar M, Vaishnavi MK, Sekar K. A method to find palindromes in nucleic acid sequences. Bioinformation 2013; 9:255-8. [PMID: 23515654 PMCID: PMC3602881 DOI: 10.6026/97320630009255] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/01/2012] [Revised: 02/08/2013] [Accepted: 02/11/2013] [Indexed: 12/01/2022]
Abstract
Various types of sequences in the human genome are known to play important roles in different aspects of genomic functioning. Among these, palindromic nucleic acid sequences have been studied in detail and found to influence a wide variety of genomic characteristics. For a nucleotide sequence to be considered a palindrome, its complementary strand must read the same in the opposite direction. For example, both strands, i.e., the strand going from 5' to 3' and its complementary strand going from 3' to 5', must be complementary. A typical palindromic nucleotide sequence would be TATA (5' to 3'), whose complementary sequence from 3' to 5' would be ATAT. Thus, a new method has been developed using dynamic programming to fetch the palindromic nucleic acid sequences. The new method uses less memory and thereby increases the overall speed and efficiency. The proposed method has been tested using bacterial (3891 KB bases) and human chromosomal sequences (Chr-18: 74366 kb and Chr-Y: 25554 kb), and the computation time for finding the palindromic sequences is in milliseconds.
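The palindrome definition in this abstract (a sequence that equals its own reverse complement, as with TATA) can be checked in a few lines. The sketch below is not the dynamic-programming method the authors describe; it is a simple expand-around-centre search, and the names `is_dna_palindrome`, `find_palindromes`, and the `min_len` parameter are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PAIRS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def reverse_complement(seq):
    """Complement each base, then reverse the strand."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_dna_palindrome(seq):
    """True if the sequence reads the same as its reverse complement."""
    seq = seq.upper()
    return len(seq) > 0 and seq == reverse_complement(seq)


def find_palindromes(seq, min_len=4):
    """Return (start, palindrome) for each maximal spacer-free palindrome.

    Every gap between two adjacent bases is treated as a centre, and the
    window grows outwards while the flanking bases stay complementary.
    """
    seq = seq.upper()
    hits = []
    for centre in range(1, len(seq)):
        left, right = centre - 1, centre
        while left >= 0 and right < len(seq) and (seq[left], seq[right]) in PAIRS:
            left -= 1
            right += 1
        length = right - left - 1
        if length >= min_len:
            hits.append((left + 1, seq[left + 1:right]))
    return hits
```

For instance, `is_dna_palindrome("TATA")` is true, and `find_palindromes("AAGAATTCAA")` reports the GAATTC site starting at index 2. The search is quadratic in the worst case, so it is meant for illustration rather than chromosome-scale input.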
Affiliation(s)
- Marthandan Kirti Vaishnavi
- Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore 560012, India
- Equally contributed to this work
- Kanagaraj Sekar
- Kanagaraj Sekar: Tel: +91-080-22933059/22933060/22932469; Fax: +91-080-23600683/23600551
7
Strawbridge EM, Benson G, Gelfand Y, Benham CJ. The distribution of inverted repeat sequences in the Saccharomyces cerevisiae genome. Curr Genet 2010; 56:321-40. [PMID: 20446088 PMCID: PMC2908449 DOI: 10.1007/s00294-010-0302-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 02/09/2010] [Revised: 04/05/2010] [Accepted: 04/08/2010] [Indexed: 02/06/2023]
Abstract
Although a variety of possible functions have been proposed for inverted repeat sequences (IRs), it is not known which of them might occur in vivo. We investigate this question by assessing the distributions and properties of IRs in the Saccharomyces cerevisiae (SC) genome. Using the IRFinder algorithm we detect 100,514 IRs having copy length greater than 6 bp and spacer length less than 77 bp. To assess statistical significance we also determine the IR distributions in two types of randomization of the S. cerevisiae genome. We find that the S. cerevisiae genome is significantly enriched in IRs relative to random. The S. cerevisiae IRs are significantly longer and contain fewer imperfections than those from the randomized genomes, suggesting that processes to lengthen and/or correct errors in IRs may be operative in vivo. The S. cerevisiae IRs are highly clustered in intergenic regions, while their occurrence in coding sequences is consistent with random. Clustering is stronger in the 3' flanks of genes than in their 5' flanks. However, the S. cerevisiae genome is not enriched in those IRs that would extrude cruciforms, suggesting that this is not a common event. Various explanations for these results are considered.
Affiliation(s)
- Gary Benson
- Laboratory for Biocomputing and Informatics, Boston University, Boston, MA USA
- Yevgeniy Gelfand
- Laboratory for Biocomputing and Informatics, Boston University, Boston, MA USA
- Craig J. Benham
- Department of Mathematics, University of California, Davis, CA 95616 USA
8
Cruz-Cano R, Chew DSH, Kwok-Pui C, Ming-Ying L. Least-Squares Support Vector Machine Approach to Viral Replication Origin Prediction. INFORMS JOURNAL ON COMPUTING 2010; 22:457-470. [PMID: 20729987 PMCID: PMC2923853 DOI: 10.1287/ijoc.1090.0360] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/29/2023]
Abstract
Replication of their DNA genomes is a central step in the reproduction of many viruses. Procedures to find replication origins, which are initiation sites of the DNA replication process, are therefore of great importance for controlling the growth and spread of such viruses. Existing computational methods for viral replication origin prediction have mostly been tested within the family of herpesviruses. This paper proposes a new approach by least-squares support vector machines (LS-SVMs) and tests its performance not only on the herpes family but also on a collection of caudoviruses coming from three viral families under the order of caudovirales. The LS-SVM approach provides sensitivities and positive predictive values superior or comparable to those given by the previous methods. When suitably combined with previous methods, the LS-SVM approach further improves the prediction accuracy for the herpesvirus replication origins. Furthermore, by recursive feature elimination, the LS-SVM has also helped find the most significant features of the data sets. The results suggest that the LS-SVMs will be a highly useful addition to the set of computational tools for viral replication origin prediction and illustrate the value of optimization-based computing techniques in biomedical applications.
Affiliation(s)
- Raul Cruz-Cano
- Department of Computer and Information Sciences, Texas A&M University-Texarkana, Texarkana, TX, 75501, USA
9
Abstract
The genome of Sorangium cellulosum has recently been completely sequenced, and it is the largest bacterial genome sequenced so far. In their report, Schneiker et al. (in Complete genome sequence of the myxobacterium Sorangium cellulosum, Nat. Biotechnol., 2007, 25, 1281–1289) concluded that ‘In the absence of the GC-skew inversion typically seen at the replication origin of bacterial chromosomes, it was not possible to discern the location of oriC’. In addition, the complete genome of Microcystis aeruginosa NIES-843 has also been recently sequenced, and in this report, Kaneko et al. (in Complete genomic structure of the bloom-forming toxic cyanobacterium Microcystis aeruginosa NIES-843, DNA Res., 2007, 14, 247–256) concluded that ‘there was no characteristic pattern, according to GC skew analysis’. Therefore, oriC locations of the above genomes remain unsolved. Using Ori-Finder, a recently developed computer program, in both genomes, we have identified candidate oriC regions that have almost all sequence hallmarks of bacterial oriCs, such as asymmetrical nucleotide distributions, being adjacent to the dnaN gene, and containing DnaA boxes and repeat elements.
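For readers unfamiliar with the GC-skew analysis that both quoted genome reports found inconclusive, here is a minimal sketch of a cumulative GC-skew curve. It is not Ori-Finder, which according to the abstract also relies on dnaN adjacency, DnaA boxes, and repeat elements; the function names and the `window` parameter are assumptions for illustration.

```python
def cumulative_gc_skew(seq, window=1000):
    """Running sum of (G - C) / (G + C) over consecutive windows."""
    seq = seq.upper()
    total = 0.0
    curve = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows without any G or C contribute nothing to the sum.
        total += (g - c) / (g + c) if (g + c) else 0.0
        curve.append(total)
    return curve


def skew_minimum_position(seq, window=1000):
    """Sequence coordinate at which the cumulative skew is lowest.

    In many bacterial chromosomes this minimum lies near the replication
    origin, but the signal can be absent, as the abstract notes.
    """
    curve = cumulative_gc_skew(seq, window)
    if not curve:
        raise ValueError("sequence is empty")
    index = min(range(len(curve)), key=curve.__getitem__)
    return min((index + 1) * window, len(seq))
```

On a toy sequence such as `"C" * 50 + "G" * 50` with `window=10`, the curve falls for five windows and then rises, so the reported position is 50. A genome with no clear skew inversion would give a curve without a single pronounced minimum, which is the situation the abstract describes.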
Affiliation(s)
- Feng Gao
- Department of Physics, Tianjin University, Tianjin 300072, People's Republic of China
10
Lillo F, Spanò M. Inverted and mirror repeats in model nucleotide sequences. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2007; 76:041914. [PMID: 17995033 DOI: 10.1103/physreve.76.041914] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/16/2007] [Indexed: 05/25/2023]
Abstract
We analytically and numerically study the probabilistic properties of inverted and mirror repeats in model sequences of nucleic acids. We consider both perfect and nonperfect repeats, i.e., repeats with mismatches and gaps. The considered sequence models are independent identically distributed (i.i.d.) sequences, Markov processes and long-range sequences. We show that the number of repeats in correlated sequences is significantly larger than in i.i.d. sequences and that this discrepancy increases exponentially with the repeat length for long-range sequences.
Affiliation(s)
- Fabrizio Lillo
- Dipartimento di Fisica e Tecnologie Relative, Università di Palermo, Viale delle Scienze, I-90128, Palermo, Italy
11
12
Chew DSH, Leung MY, Choi KP. AT excursion: a new approach to predict replication origins in viral genomes by locating AT-rich regions. BMC Bioinformatics 2007; 8:163. [PMID: 17517140 PMCID: PMC1904460 DOI: 10.1186/1471-2105-8-163] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 12/08/2006] [Accepted: 05/21/2007] [Indexed: 11/12/2022]
Abstract
Background: Replication origins are considered important sites for understanding the molecular mechanisms involved in DNA replication. Many computational methods have been developed for predicting their locations in archaeal, bacterial and eukaryotic genomes. However, a prediction method designed for a particular kind of genome might not work well for another. In this paper, we propose the AT excursion method, which is a score-based approach, to quantify local AT abundance in genomic sequences and use the identified high scoring segments for predicting replication origins. This method has the advantages of requiring no preset window size and having rigorous criteria to evaluate statistical significance of high scoring segments. Results: We have evaluated the AT excursion method by checking its predictions against known replication origins in herpesviruses and comparing its performance with an existing base weighted score method (BWS1). Out of 43 known origins, 39 are predicted by either one or the other method and 26 origins are predicted by both. The excursion method identifies six origins not predicted by BWS1, showing that the AT excursion method is a valuable complement to BWS1. We have also applied the AT excursion method to two other families of double stranded DNA viruses, the poxviruses and iridoviruses, of which very few replication origins are documented in the public domain. The prediction results are made available as supplementary materials at [1]. Preliminary investigation shows that the proposed method works well on some larger genomes too. Conclusion: The AT excursion method will be a useful computational tool for identifying replication origins in a variety of genomic sequences.
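The score-based idea in this abstract can be sketched as a running excursion that rises on A/T, falls on G/C, and is reset to zero whenever it would go negative, so no window size has to be chosen. The score values (+1 and -2), the function names, and the way the top segment is reported are assumptions for illustration; the abstract does not give the paper's actual scores or its significance criteria.

```python
def at_excursion(seq, at_score=1, gc_score=-2):
    """Excursion values E_i = max(E_(i-1) + s_i, 0), starting from E_0 = 0.

    The scores are hypothetical: A/T add `at_score`, G/C add `gc_score`,
    and any other symbol (such as N) adds 0.
    """
    value = 0
    values = []
    for base in seq.upper():
        if base in "AT":
            step = at_score
        elif base in "GC":
            step = gc_score
        else:
            step = 0
        value = max(value + step, 0)
        values.append(value)
    return values


def top_scoring_segment(seq, at_score=1, gc_score=-2):
    """Return (start, end, peak) for the highest excursion, end exclusive."""
    values = at_excursion(seq, at_score, gc_score)
    if not values or max(values) == 0:
        return None
    peak = max(values)
    end = values.index(peak)
    # Walk back to the position just after the last reset to zero.
    start = end
    while start > 0 and values[start - 1] > 0:
        start -= 1
    return start, end + 1, peak
```

For the toy input `"GGGATATATGGG"`, `top_scoring_segment` returns `(3, 9, 6)`, which corresponds to the ATATAT run. Deciding whether such a peak is statistically significant is the part of the method this sketch leaves out.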
Affiliation(s)
- David SH Chew
- Department of Statistics and Applied Probability, National University of Singapore, Singapore 117546, Singapore
- Ming-Ying Leung
- Department of Mathematical Sciences and Bioinformatics Program, The University of Texas at El Paso, TX 79968, USA
- Kwok Pui Choi
- Department of Statistics and Applied Probability, National University of Singapore, Singapore 117546, Singapore
- Department of Mathematics, National University of Singapore, Singapore 117543, Singapore
13
Lu L, Jia H, Dröge P, Li J. The human genome-wide distribution of DNA palindromes. Funct Integr Genomics 2007; 7:221-7. [PMID: 17340149 DOI: 10.1007/s10142-007-0047-6] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Received: 11/20/2006] [Revised: 02/05/2007] [Accepted: 02/19/2007] [Indexed: 10/23/2022]
Abstract
In this work, we performed a systematic study of perfect and nonspacer palindromes present in human genomic DNA, and we investigated palindrome distribution over the entire human genome and over functional regions such as the exon, intron, intergenic, and upstream regions (2,000 bp upstream from the translational start site). We found that 24 palindrome-abundant intervals are mostly located on G-bands, which condense early, replicate late, and are relatively A+T rich. In general, palindromes are overrepresented in introns but underrepresented in exons. The upstream region is enriched in palindromes, which can serve there as transcription factor binding sites. We created a Human DNA Palindrome Database (HPALDB), which is accessible at http://vhp.ntu.edu.sg/hpaldb. It contains 12,556,994 entries covering all palindromes in the human genome longer than 6 bp. Queries can be performed in different ways. Each entry in the database is linked to its location on NCBI's human chromosome Map Viewer.
Affiliation(s)
- Le Lu
- Division of Structural and Computational Biology, School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
14
Muylkens B, Thiry J, Kirten P, Schynts F, Thiry E. Bovine herpesvirus 1 infection and infectious bovine rhinotracheitis. Vet Res 2007; 38:181-209. [PMID: 17257569 DOI: 10.1051/vetres:2006059] [Citation(s) in RCA: 274] [Impact Index Per Article: 15.2] [Received: 09/04/2006] [Accepted: 11/15/2006] [Indexed: 12/12/2022]
Abstract
Bovine herpesvirus 1 (BoHV-1), classified as an alphaherpesvirus, is a major pathogen of cattle. Primary infection is accompanied by various clinical manifestations such as infectious bovine rhinotracheitis, abortion, infectious pustular vulvovaginitis, and systemic infection in neonates. When animals survive, a life-long latent infection is established in nervous sensory ganglia. Several reactivation stimuli can lead to viral re-excretion, which is responsible for the maintenance of BoHV-1 within a cattle herd. This paper focuses on an updated pathogenesis based on a molecular characterization of BoHV-1 and the description of the virus cycle. Special emphasis is accorded to the impact of the latency and reactivation cycle on the epidemiology and the control of BoHV-1. Several European countries have initiated BoHV-1 eradication schemes because of the significant losses incurred by disease and trading restrictions. The vaccines used against BoHV-1 are described in this context where the differentiation of infected from vaccinated animals is of critical importance to achieve BoHV-1 eradication.
Affiliation(s)
- Benoît Muylkens
- Virology, Department of Infectious and Parasitic Diseases, Faculty of Veterinary Medicine, University of Liège, Boulevard de Colonster 20, B43b, 4000 Liège, Belgium