Reference Citation Analysis

For: Urgese G, Parisi E, Scicolone O, Di Cataldo S, Ficarra E. BioSeqZip: a collapser of NGS redundant reads for the optimization of sequence analysis. Bioinformatics 2020;36:2705-2711. [PMID: 31999333 PMCID: PMC7203750 DOI: 10.1093/bioinformatics/btaa051] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/05/2019] [Revised: 12/20/2019] [Accepted: 01/22/2020] [Indexed: 01/08/2023]

Cited by Other Article(s)

Orlov YL, Orlova NG. Bioinformatics tools for the sequence complexity estimates. Biophys Rev 2023;15:1367-1378. [PMID: 37974990 PMCID: PMC10643780 DOI: 10.1007/s12551-023-01140-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/15/2023] [Accepted: 09/01/2023] [Indexed: 11/19/2023]

Abstract

We review current methods and bioinformatics tools for text complexity estimates (information and entropy measures). The search for DNA regions with extreme statistical characteristics, such as low complexity regions, is important for biophysical models of chromosome function and gene transcription regulation at genome scale. We discuss complexity profiling for segmentation and delineation of genome sequences, the search for genome repeats and transposable elements, and applications to next-generation sequencing reads. We review the complexity methods and new application fields: analysis of mutation hotspot loci, analysis of short sequencing reads with quality control, and alignment-free genome comparisons. Algorithms implementing various numerical measures of text complexity, including combinatorial and linguistic measures, were developed before the genome sequencing era. A series of tools for estimating sequence complexity use compression approaches, mainly modifications of Lempel-Ziv compression. Most of the tools are available online, providing large-scale service for whole genome analysis. Novel machine learning applications for classification of complete genome sequences also include sequence compression and complexity algorithms. We present a comparison of the complexity methods on different sequence sets and their applications to the analysis of gene transcription regulatory regions. Furthermore, we discuss approaches to and applications of sequence complexity for proteins. The complexity measures for amino acid sequences can be calculated by the same entropy and compression-based algorithms, but the functional and evolutionary roles of low complexity regions in proteins have specific features that differ from those in DNA. The tools for protein sequence complexity are aimed at protein structural constraints. It has been shown that low complexity regions in protein sequences are conserved in evolution and have important biological and structural functions. Finally, we summarize recent findings in large-scale genome complexity comparison and applications to coronavirus genome analysis.
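
As a rough illustration of the two families of measures named in this abstract (entropy and Lempel-Ziv-style compression), the following minimal Python sketch computes a per-window Shannon entropy and an LZ78-style phrase count. It is not taken from the reviewed tools; the function names, window size and step are assumptions chosen only for the example.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits per symbol) of the symbol frequencies in seq."""
    if not seq:
        return 0.0
    n = len(seq)
    counts = Counter(seq)
    # sum of p * log2(1/p) over observed symbols
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def lz78_phrase_count(seq: str) -> int:
    """Number of phrases in an LZ78-style parse of seq.

    Repetitive (low complexity) sequences parse into fewer phrases.
    """
    phrases = set()
    current = ""
    for symbol in seq:
        current += symbol
        if current not in phrases:
            phrases.add(current)
            current = ""
    # A leftover partial phrase at the end still counts as one phrase.
    return len(phrases) + (1 if current else 0)


def entropy_profile(seq: str, window: int = 20, step: int = 5) -> list:
    """Entropy of each sliding window, as (start, entropy) pairs."""
    last_start = max(len(seq) - window, 0)
    return [
        (start, shannon_entropy(seq[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    example = "A" * 20 + "ACGT" * 5  # hypothetical sequence: homopolymer run, then mixed bases
    for start, h in entropy_profile(example, window=20, step=10):
        print(start, round(h, 3))
```

Windows with entropy near zero, or sequences with an unusually small phrase count, would be candidates for low complexity regions under these simplified measures.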

Winkler J, Urgese G, Ficarra E, Reinert K. LaRA 2: parallel and vectorized program for sequence-structure alignment of RNA sequences. BMC Bioinformatics 2022;23:18. [PMID: 34991448 PMCID: PMC8734264 DOI: 10.1186/s12859-021-04532-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/08/2020] [Accepted: 12/13/2021] [Indexed: 11/10/2022]

Liu Y, Zhang X, Zou Q, Zeng X. Minirmd: accurate and fast duplicate removal tool for short reads via multiple minimizers. Bioinformatics 2021;37:1604-1606. [PMID: 33112385 DOI: 10.1093/bioinformatics/btaa915] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/24/2020] [Revised: 09/30/2020] [Accepted: 10/14/2020] [Indexed: 12/21/2022]

Jeong J, Park SJ, Kim JW, No JS, Jeon HH, Lee JW, No A, Kim S, Park H. Cooperative Sequence Clustering and Decoding for DNA Storage System with Fountain Codes. Bioinformatics 2021;37:3136-3143. [PMID: 33904574 DOI: 10.1093/bioinformatics/btab246] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 09/18/2020] [Revised: 03/03/2021] [Accepted: 04/13/2021] [Indexed: 11/12/2022]

Abstract

MOTIVATION

In DNA storage systems, there are tradeoffs between writing and reading costs. Increasing the code rate of error-correcting codes may save writing cost, but it will need more sequence reads for data retrieval. There is potentially a way to improve the sequencing and decoding processes so that the reading cost induced by this tradeoff is reduced without increasing the writing cost. In past research, clustering, alignment, and decoding were considered as separate stages, but we believe that using the information from all these processes together may improve decoding performance. Actual experiments of DNA synthesis and sequencing should be performed because simulations cannot be relied on to cover all error possibilities in practical circumstances.

RESULTS

For DNA storage systems using fountain code and Reed-Solomon (RS) code, we introduce several techniques to improve the decoding performance. We designed the decoding process focusing on the cooperation of key components: Hamming-distance based clustering, discarding of abnormal sequence reads, RS error correction as well as detection, and quality score-based ordering of sequences. We synthesized 513.6 KB of data into DNA oligo pools and sequenced this data successfully with an Illumina MiSeq instrument. Compared to Erlich's research, the proposed decoding method additionally incorporates sequence reads with minor errors, which had been discarded before, and thus was able to make use of 10.6-11.9% more sequence reads from the same sequencing environment, which resulted in a 6.5-8.9% reduction in the reading cost. Channel characteristics including sequence coverage and read-length distributions are provided as well.

AVAILABILITY

The raw data files and the source codes of our experiments are available at: https://github.com/jhjeong0702/dna-storage.
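
The Hamming-distance based clustering and the discarding of abnormal reads named in the RESULTS paragraph can be pictured with a minimal Python sketch. This is not the authors' implementation (their code is at the repository above); the greedy strategy, the use of read length as the "abnormal" criterion, and the names `cluster_reads`, `expected_len` and `max_dist` are all assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def cluster_reads(reads, expected_len, max_dist):
    """Greedy clustering of reads by Hamming distance.

    Each read joins the first cluster whose representative (its first
    member) is within max_dist mismatches; otherwise it starts a new cluster.
    Reads whose length differs from expected_len are discarded
    (assumed here to be the 'abnormal' reads).
    """
    clusters = []
    for read in reads:
        if len(read) != expected_len:
            continue  # discard reads of abnormal length
        for cluster in clusters:
            if hamming(read, cluster[0]) <= max_dist:
                cluster.append(read)
                break
        else:
            clusters.append([read])
    return clusters


if __name__ == "__main__":
    # Hypothetical toy reads; real oligo reads are far longer.
    reads = ["ACGTAC", "ACGTAA", "TTTTTT", "ACG", "TTTTTA"]
    for cluster in cluster_reads(reads, expected_len=6, max_dist=1):
        print(cluster)
```

In this simplified form, reads with a few substitution errors end up in the same cluster as their error-free copy instead of being thrown away, which is the general idea behind using more of the sequenced reads.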

Zhou Z, Gu G, Luo Y, Li W, Li B, Zhao Y, Liu J, Shuai X, Wu L, Chen J, Fan C, Huang Q, Han B, Wen J, Jiao H. Immunological pathways of macrophage response to Brucella ovis infection. Innate Immun 2020;26:635-648. [PMID: 32970502 PMCID: PMC7556187 DOI: 10.1177/1753425920958179] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/22/2023]

Affiliation(s)

Zhixiong Zhou: College of Veterinary Medicine, Southwest University, Chongqing, China
Guojing Gu: College of Veterinary Medicine, Southwest University, Chongqing, China
Yichen Luo: Immunology Research Center, Medical Research Institute, Southwest University, Chongqing, China; College of Veterinary Medicine, Southwest University, Chongqing, China; Veterinary Scientific Engineering Research Center, Chongqing, China
Wenjie Li: College of Veterinary Medicine, Southwest University, Chongqing, China
Bowen Li: College of Veterinary Medicine, Southwest University, Chongqing, China
Yu Zhao: College of Veterinary Medicine, Southwest University, Chongqing, China
Juan Liu: Immunology Research Center, Medical Research Institute, Southwest University, Chongqing, China; College of Veterinary Medicine, Southwest University, Chongqing, China; Veterinary Scientific Engineering Research Center, Chongqing, China
Xuehong Shuai: Immunology Research Center, Medical Research Institute, Southwest University, Chongqing, China; College of Veterinary Medicine, Southwest University, Chongqing, China; Veterinary Scientific Engineering Research Center, Chongqing, China
Li Wu: College of Veterinary Medicine, Southwest University, Chongqing, China; Veterinary Scientific Engineering Research Center, Chongqing, China
Jixuan Chen: College of Veterinary Medicine, Southwest University, Chongqing, China; Veterinary Scientific Engineering Research Center, Chongqing, China
Cailiang Fan: College of Veterinary Medicine, Southwest University, Chongqing, China; Animal Disease Prevention and Control Center of Rongchang, Chongqing, China
Qingzhou Huang: College of Veterinary Medicine, Southwest University, Chongqing, China; Veterinary Scientific Engineering Research Center, Chongqing, China
Baoru Han: College of Medical Informatics, Chongqing Medical University, Chongqing, China
Jianjun Wen: Department of Microbiology and Immunology, University of Texas Medical Branch at Galveston, Galveston, USA
Hanwei Jiao: Immunology Research Center, Medical Research Institute, Southwest University, Chongqing, China; College of Veterinary Medicine, Southwest University, Chongqing, China; Veterinary Scientific Engineering Research Center, Chongqing, China
