1
|
Nobile MS, Cazzaniga P, Tangherloni A, Besozzi D. Graphics processing units in bioinformatics, computational biology and systems biology. Brief Bioinform 2017; 18:870-885. [PMID: 27402792 PMCID: PMC5862309 DOI: 10.1093/bib/bbw058] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2016] [Indexed: 01/18/2023] Open
Abstract
Several studies in Bioinformatics, Computational Biology and Systems Biology rely on the definition of physico-chemical or mathematical models of biological systems at different scales and levels of complexity, ranging from the interaction of atoms in single molecules up to genome-wide interaction networks. Traditional computational methods and software tools developed in these research fields share a common trait: they can be computationally demanding on Central Processing Units (CPUs), therefore limiting their applicability in many circumstances. To overcome this issue, general-purpose Graphics Processing Units (GPUs) are gaining an increasing attention by the scientific community, as they can considerably reduce the running time required by standard CPU-based software, and allow more intensive investigations of biological systems. In this review, we present a collection of GPU tools recently developed to perform computational analyses in life science disciplines, emphasizing the advantages and the drawbacks in the use of these parallel architectures. The complete list of GPU-powered tools here reviewed is available at http://bit.ly/gputools.
Collapse
Affiliation(s)
- Marco S Nobile
- Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milano, Italy
- SYSBIO.IT Centre of Systems Biology, Milano, Italy
| | - Paolo Cazzaniga
- Department of Human and Social Sciences, University of Bergamo, Bergamo, Italy
- SYSBIO.IT Centre of Systems Biology, Milano, Italy
| | - Andrea Tangherloni
- Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milano, Italy
| | - Daniela Besozzi
- Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milano, Italy
- SYSBIO.IT Centre of Systems Biology, Milano, Italy
- Corresponding author. Daniela Besozzi, Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milano, Italy and SYSBIO.IT Centre of Systems Biology, Milano, Italy. Tel.: +39 02 6448 7874. E-mail:
| |
Collapse
|
2
|
Riemenschneider M, Herbst A, Rasch A, Gorlatch S, Heider D. eccCL: parallelized GPU implementation of Ensemble Classifier Chains. BMC Bioinformatics 2017; 18:371. [PMID: 28818036 PMCID: PMC5561639 DOI: 10.1186/s12859-017-1783-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2016] [Accepted: 08/08/2017] [Indexed: 11/30/2022] Open
Abstract
Background Multi-label classification has recently gained great attention in diverse fields of research, e.g., in biomedical application such as protein function prediction or drug resistance testing in HIV. In this context, the concept of Classifier Chains has been shown to improve prediction accuracy, especially when applied as Ensemble Classifier Chains. However, these techniques lack computational efficiency when applied on large amounts of data, e.g., derived from next-generation sequencing experiments. By adapting algorithms for the use of graphics processing units, computational efficiency can be greatly improved due to parallelization of computations. Results Here, we provide a parallelized and optimized graphics processing unit implementation (eccCL) of Classifier Chains and Ensemble Classifier Chains. Additionally to the OpenCL implementation, we provide an R-Package with an easy to use R-interface for parallelized graphics processing unit usage. Conclusion
eccCL is a handy implementation of Classifier Chains on GPUs, which is able to process up to over 25,000 instances per second, and thus can be used efficiently in high-throughput experiments. The software is available at http://www.heiderlab.de.
Collapse
Affiliation(s)
- Mona Riemenschneider
- Department of Bioinformatics, Straubing Center of Science, Petersgasse 18, Straubing, 94315, Germany
| | - Alexander Herbst
- Institute of Computer Science, University of Münster, Einsteinstr. 62, Münster, 48149, Germany
| | - Ari Rasch
- Institute of Computer Science, University of Münster, Einsteinstr. 62, Münster, 48149, Germany
| | - Sergei Gorlatch
- Institute of Computer Science, University of Münster, Einsteinstr. 62, Münster, 48149, Germany
| | - Dominik Heider
- Department of Bioinformatics, Straubing Center of Science, Petersgasse 18, Straubing, 94315, Germany. .,Wissenschaftszentrum Weihenstephan, Technische Universität München, Alte Akademie 8, Freising, 85354, Germany. .,Present Address: Department of Mathematics and Computer Science, University of Marburg, Hans-Meerwein-Str. 6, Marburg, 35032, Germany.
| |
Collapse
|
3
|
Manconi A, Manca E, Moscatelli M, Gnocchi M, Orro A, Armano G, Milanesi L. G-CNV: A GPU-Based Tool for Preparing Data to Detect CNVs with Read-Depth Methods. Front Bioeng Biotechnol 2015; 3:28. [PMID: 25806367 PMCID: PMC4354384 DOI: 10.3389/fbioe.2015.00028] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/10/2014] [Accepted: 02/19/2015] [Indexed: 11/23/2022] Open
Abstract
Copy number variations (CNVs) are the most prevalent types of structural variations (SVs) in the human genome and are involved in a wide range of common human diseases. Different computational methods have been devised to detect this type of SVs and to study how they are implicated in human diseases. Recently, computational methods based on high-throughput sequencing (HTS) are increasingly used. The majority of these methods focus on mapping short-read sequences generated from a donor against a reference genome to detect signatures distinctive of CNVs. In particular, read-depth based methods detect CNVs by analyzing genomic regions with significantly different read-depth from the other ones. The pipeline analysis of these methods consists of four main stages: (i) data preparation, (ii) data normalization, (iii) CNV regions identification, and (iv) copy number estimation. However, available tools do not support most of the operations required at the first two stages of this pipeline. Typically, they start the analysis by building the read-depth signal from pre-processed alignments. Therefore, third-party tools must be used to perform most of the preliminary operations required to build the read-depth signal. These data-intensive operations can be efficiently parallelized on graphics processing units (GPUs). In this article, we present G-CNV, a GPU-based tool devised to perform the common operations required at the first two stages of the analysis pipeline. G-CNV is able to filter low-quality read sequences, to mask low-quality nucleotides, to remove adapter sequences, to remove duplicated read sequences, to map the short-reads, to resolve multiple mapping ambiguities, to build the read-depth signal, and to normalize it. G-CNV can be efficiently used as a third-party tool able to prepare data for the subsequent read-depth signal generation and analysis. Moreover, it can also be integrated in CNV detection tools to generate read-depth signals.
Collapse
Affiliation(s)
- Andrea Manconi
- Institute for Biomedical Technologies, National Research Council , Milan , Italy
| | - Emanuele Manca
- Department of Electrical and Electronic Engineering, University of Cagliari , Cagliari , Italy
| | - Marco Moscatelli
- Institute for Biomedical Technologies, National Research Council , Milan , Italy
| | - Matteo Gnocchi
- Institute for Biomedical Technologies, National Research Council , Milan , Italy
| | - Alessandro Orro
- Institute for Biomedical Technologies, National Research Council , Milan , Italy
| | - Giuliano Armano
- Department of Electrical and Electronic Engineering, University of Cagliari , Cagliari , Italy
| | - Luciano Milanesi
- Institute for Biomedical Technologies, National Research Council , Milan , Italy
| |
Collapse
|
4
|
Romano P, Lisacek F, Masseroli M. NETTAB 2012 on "Integrated Bio-Search". BMC Bioinformatics 2014; 15 Suppl 1:S1. [PMID: 24564635 PMCID: PMC4015131 DOI: 10.1186/1471-2105-15-s1-s1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open
Abstract
The NETTAB 2012 workshop, held in Como on November 14-16, 2012, was devoted to "Integrated Bio-Search", that is to technologies, methods, architectures, systems and applications for searching, retrieving, integrating and analyzing data, information, and knowledge with the aim of answering complex bio-medical-molecular questions, i.e. some of the most challenging issues in bioinformatics today. It brought together about 80 researchers working in the field of Bioinformatics, Computational Biology, Biology, Computer Science and Engineering. More than 50 scientific contributions, including keynote and tutorial talks, oral communications, posters and software demonstrations, were presented at the workshop. This preface provides a brief overview of the workshop and shortly introduces the peer-reviewed manuscripts that were accepted for publication in this Supplement.
Collapse
Affiliation(s)
- Paolo Romano
- Bioinformatics, IRCCS AOU San Martino - IST National Cancer Research Institute, Genoa, I-16132, Italy
| | - Frédérique Lisacek
- Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, 1211 Geneva 4, Switzerland
- Section of Biology, University of Geneva, 1211 Geneva 4, Switzerland
| | - Marco Masseroli
- Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milano, I-20133, Italy
| |
Collapse
|