Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Pinello L, Lo Bosco G, Yuan GC. Applications of alignment-free methods in epigenomics. Brief Bioinform 2014;15:419-30. [PMID: 24197932 PMCID: PMC4017331 DOI: 10.1093/bib/bbt078] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Received: 07/14/2013] [Accepted: 10/28/2013] [Indexed: 12/16/2022]



Cited by Other Article(s)

Bohnsack KS, Kaden M, Abel J, Villmann T. Alignment-Free Sequence Comparison: A Systematic Survey From a Machine Learning Perspective. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023;20:119-135. [PMID: 34990369 DOI: 10.1109/tcbb.2022.3140873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]

Pal J, Ghosh S, Maji B, Bhattacharya DK. Mathematical Approach to Protein Sequence Comparison Based on Physiochemical Properties. ACS Omega 2022;7:39446-39455. [PMID: 36340165 PMCID: PMC9631895 DOI: 10.1021/acsomega.2c06103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Accepted: 09/27/2022] [Indexed: 06/16/2023]

Câmara GBM, Coutinho MGF, da Silva LMD, Gadelha WVDN, Torquato MF, Barbosa RDM, Fernandes MAC. Convolutional Neural Network Applied to SARS-CoV-2 Sequence Classification. Sensors (Basel, Switzerland) 2022;22:5730. [PMID: 35957287 PMCID: PMC9371030 DOI: 10.3390/s22155730] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Revised: 07/28/2022] [Accepted: 07/28/2022] [Indexed: 06/15/2023]

Abstract

COVID-19, the illness caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a virus of the Coronaviridae family with a single-stranded positive-sense RNA genome, has been spreading around the world and has been declared a pandemic by the World Health Organization. On 17 January 2022, there were more than 329 million cases, with more than 5.5 million deaths. Although COVID-19 has a low mortality rate, its high capacities for contamination, spread, and mutation worry the authorities, especially after the emergence of the Omicron variant, which has a high transmission capacity and can more easily infect even vaccinated people. Such outbreaks require elucidation of the taxonomic classification and origin of the virus (SARS-CoV-2) from the genomic sequence for strategic planning, containment, and treatment of the disease. Thus, this work proposes a high-accuracy technique to classify viruses and other organisms from a genome sequence using a deep learning convolutional neural network (CNN). Unlike other approaches in the literature, the proposed approach does not limit the length of the genome sequence. The results show that the novel proposal accurately distinguishes SARS-CoV-2 from the sequences of other viruses. The results were obtained from 1557 instances of SARS-CoV-2 from the National Center for Biotechnology Information (NCBI) and 14,684 different viruses from the Virus-Host DB. As a CNN has several changeable parameters, the tests were performed with forty-eight different architectures; the best of these had an accuracy of 91.94 ± 2.62% in classifying viruses into their realms correctly, in addition to 100% accuracy in classifying SARS-CoV-2 into its respective realm, Riboviria. For the subsequent classifications (family, genera, and subgenus), this accuracy increased, which shows that the proposed architecture may be viable in the classification of the virus that causes COVID-19.


Affiliation(s)

Gabriel B. M. Câmara: Bioinformatics Multidisciplinary Environment (BioME), Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil; Laboratory of Machine Learning and Intelligent Instrumentation, Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil
Maria G. F. Coutinho: Laboratory of Machine Learning and Intelligent Instrumentation, Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil
Lucileide M. D. da Silva: Laboratory of Machine Learning and Intelligent Instrumentation, Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil; Federal Institute of Education, Science and Technology of Rio Grande do Norte, Paraiso, Santa Cruz 59200-000, RN, Brazil
Walter V. do N. Gadelha: Laboratory of Machine Learning and Intelligent Instrumentation, Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil
Matheus F. Torquato: Laboratory of Machine Learning and Intelligent Instrumentation, Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil
Raquel de M. Barbosa: Laboratory of Machine Learning and Intelligent Instrumentation, Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil; Department of Pharmacy and Pharmaceutical Technology, University of Granada, 18071 Granada, Spain
Marcelo A. C. Fernandes: Bioinformatics Multidisciplinary Environment (BioME), Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil; Laboratory of Machine Learning and Intelligent Instrumentation, Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil; Department of Computer Engineering and Automation, Federal University of Rio Grande do Norte, Natal 59078-970, RN, Brazil


Swain MT, Vickers M. Interpreting alignment-free sequence comparison: what makes a score a good score? NAR Genom Bioinform 2022;4:lqac062. [PMID: 36071721 PMCID: PMC9442500 DOI: 10.1093/nargab/lqac062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2022] [Revised: 07/01/2022] [Accepted: 08/16/2022] [Indexed: 11/13/2022]

Wang L, Zhang W, Wu X, Liang X, Cao L, Zhai J, Yang Y, Chen Q, Liu H, Zhang J, Ding Y, Zhu F, Tang J. MIAOME: Human Microbiome Affect The Host Epigenome. Comput Struct Biotechnol J 2022;20:2455-2463. [PMID: 35664224 PMCID: PMC9136154 DOI: 10.1016/j.csbj.2022.05.024] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/24/2022] [Revised: 05/11/2022] [Accepted: 05/12/2022] [Indexed: 01/10/2023]

Affiliation(s)

Lidan Wang: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China
Wei Zhang: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
Xianglu Wu: Joint International Research Laboratory of Reproductive and Development, Department of Reproductive Biology, School of Public Health, Chongqing Medical University, Chongqing 400016, China
Xiao Liang: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China
Lijie Cao: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China
Jincheng Zhai: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China
Yiyang Yang: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China
Qiuxiao Chen: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China
Hongqing Liu: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China
Jun Zhang: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China
Yubin Ding: Joint International Research Laboratory of Reproductive and Development, Department of Reproductive Biology, School of Public Health, Chongqing Medical University, Chongqing 400016, China. Corresponding authors at: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China (J. Tang).
Feng Zhu: College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China. Corresponding authors at: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China (J. Tang).
Jing Tang: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China; Joint International Research Laboratory of Reproductive and Development, Department of Reproductive Biology, School of Public Health, Chongqing Medical University, Chongqing 400016, China. Corresponding authors at: School of Basic Medicine, Chongqing Medical University, Chongqing 400016, China (J. Tang).


Deif MA, Solyman AAA, Kamarposhti MA, Band SS, Hammam RE. A deep bidirectional recurrent neural network for identification of SARS-CoV-2 from viral genome sequences. Mathematical Biosciences and Engineering : MBE 2021;18:8933-8950. [PMID: 34814329 DOI: 10.3934/mbe.2021440] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]

Lopez-Rincon A, Tonda A, Mendoza-Maldonado L, Mulders DGJC, Molenkamp R, Perez-Romero CA, Claassen E, Garssen J, Kraneveld AD. Classification and specific primer design for accurate detection of SARS-CoV-2 using deep learning. Sci Rep 2021;11:947. [PMID: 33441822 PMCID: PMC7806918 DOI: 10.1038/s41598-020-80363-5] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 10/05/2020] [Accepted: 12/21/2020] [Indexed: 02/07/2023]

Amato D, Bosco GL, Rizzo R. CORENup: a combination of convolutional and recurrent deep neural networks for nucleosome positioning identification. BMC Bioinformatics 2020;21:326. [PMID: 32938377 PMCID: PMC7493859 DOI: 10.1186/s12859-020-03627-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/16/2020] [Accepted: 06/22/2020] [Indexed: 12/14/2022]

Data stream dataset of SARS-CoV-2 genome. Data Brief 2020;31:105829. [PMID: 32596428 PMCID: PMC7306612 DOI: 10.1016/j.dib.2020.105829] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 04/20/2020] [Revised: 06/02/2020] [Accepted: 06/04/2020] [Indexed: 11/22/2022]

Luczak BB, James BT, Girgis HZ. A survey and evaluations of histogram-based statistics in alignment-free sequence comparison. Brief Bioinform 2020;20:1222-1237. [PMID: 29220512 PMCID: PMC6781583 DOI: 10.1093/bib/bbx161] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 08/15/2017] [Revised: 10/13/2017] [Indexed: 11/29/2022]

Abstract

Motivation

Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences.

Results

We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover’s distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover’s distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours.

Availability

The source code of the benchmarking tool is available as Supplementary Materials.
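
The abstract above describes statistics computed on two k-mer histograms and multiplicative combinations involving sequence length difference. The following is a minimal sketch of that general idea only, not the authors' benchmarking tool: the function names, the choice of Manhattan distance as the single statistic, and the `1 + length difference` factor (used so that equal-length sequences do not collapse to zero) are all assumptions made for illustration.

```python
from collections import Counter


def kmer_histogram(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer (word of length k) in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def manhattan(h1: Counter, h2: Counter) -> int:
    """Sum of absolute count differences over all k-mers seen in either histogram."""
    # Counter returns 0 for a k-mer that is absent, so missing words need no special case.
    return sum(abs(h1[w] - h2[w]) for w in h1.keys() | h2.keys())


def paired_statistic(a: str, b: str, k: int = 3) -> int:
    """Hypothetical multiplicative combination: histogram distance times (1 + length difference)."""
    distance = manhattan(kmer_histogram(a, k), kmer_histogram(b, k))
    return distance * (1 + abs(len(a) - len(b)))


if __name__ == "__main__":
    query = "ACGTACGT"
    for candidate in ("ACGTACGT", "ACGT", "TTTTTTTT"):
        print(candidate, paired_statistic(query, candidate))
```

Lower values indicate more similar sequences under this sketch. A smaller `k` shrinks the histograms, which is the memory and speed trade-off the abstract mentions for shorter words.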


Di Gangi M, Lo Bosco G, Rizzo R. Deep learning architectures for prediction of nucleosome positioning from sequences data. BMC Bioinformatics 2018;19:418. [PMID: 30453896 PMCID: PMC6245688 DOI: 10.1186/s12859-018-2386-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 01/17/2023]

Pizzi C, Ornamenti M, Spangaro S, Rombo SE, Parida L. Efficient Algorithms for Sequence Analysis with Entropic Profiles. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018;15:117-128. [PMID: 28113780 DOI: 10.1109/tcbb.2016.2620143] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 06/06/2023]

Zielezinski A, Vinga S, Almeida J, Karlowski WM. Alignment-free sequence comparison: benefits, applications, and tools. Genome Biol 2017;18:186. [PMID: 28974235 PMCID: PMC5627421 DOI: 10.1186/s13059-017-1319-7] [Citation(s) in RCA: 285] [Impact Index Per Article: 35.6] [Indexed: 01/30/2023]

Glouzon JPS, Perreault JP, Wang S. The super-n-motifs model: a novel alignment-free approach for representing and comparing RNA secondary structures. Bioinformatics 2017;33:1169-1178. [PMID: 28088762 DOI: 10.1093/bioinformatics/btw773] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/11/2015] [Indexed: 12/13/2022]

Holder LB, Haque MM, Skinner MK. Machine learning for epigenetics and future medical applications. Epigenetics 2017;12:505-514. [PMID: 28524769 PMCID: PMC5687335 DOI: 10.1080/15592294.2017.1329068] [Citation(s) in RCA: 61] [Impact Index Per Article: 7.6] [Indexed: 12/22/2022]

Pal J, Ghosh S, Maji B, Bhattacharya DK. WITHDRAWN: A Novel Way of Comparing Protein Sequences Represented Under Physio-Chemical Properties of their Amino Acids. Comput Biol Chem 2017. [DOI: 10.1016/j.compbiolchem.2017.04.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]

Sedlar K, Skutkova H, Vitek M, Provaznik I. Set of rules for genomic signal downsampling. Comput Biol Med 2015;69:308-14. [PMID: 26078051 DOI: 10.1016/j.compbiomed.2015.05.022] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 11/10/2014] [Revised: 05/25/2015] [Accepted: 05/26/2015] [Indexed: 12/14/2022]

Giancarlo R, Rombo SE, Utro F. Epigenomic k-mer dictionaries: shedding light on how sequence composition influences in vivo nucleosome positioning. Bioinformatics 2015;31:2939-46. [DOI: 10.1093/bioinformatics/btv295] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 10/30/2014] [Accepted: 05/04/2015] [Indexed: 12/28/2022]

King BR, Aburdene M, Thompson A, Warres Z. Application of discrete Fourier inter-coefficient difference for assessing genetic sequence similarity. EURASIP Journal on Bioinformatics & Systems Biology 2014;2014:8. [PMID: 24991213 PMCID: PMC4077688 DOI: 10.1186/1687-4153-2014-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/16/2013] [Accepted: 05/01/2014] [Indexed: 11/27/2022]