1
Masser DR, Hadad N, Porter H, Stout MB, Unnikrishnan A, Stanford DR, Freeman WM. Analysis of DNA modifications in aging research. GeroScience 2018; 40:11-29. PMID: 29327208; PMCID: PMC5832665; DOI: 10.1007/s11357-018-0005-3. Received 11/27/2017; accepted 01/05/2018. Open access.
Abstract
As geroscience research extends into the role of epigenetics in aging and age-related disease, researchers are being confronted with unfamiliar molecular techniques and data analysis methods that can be difficult to integrate into their work. In this review, we focus on the analysis of DNA modifications, namely cytosine methylation and hydroxymethylation, through next-generation sequencing methods. While older techniques for modification analysis performed relative quantitation across regions of the genome or examined genome-wide average levels, these analyses lacked the specificity, rigor, and genomic coverage needed to firmly establish the nature of genomic methylation patterns and their response to aging. With recent methodological advances, such as whole genome bisulfite sequencing (WGBS), bisulfite oligonucleotide capture sequencing (BOCS), and bisulfite amplicon sequencing (BSAS), cytosine modifications can now be readily analyzed with base-specific, absolute quantitation at both cytosine-guanine dinucleotide (CG) and non-CG sites throughout the genome or within specific regions of interest by next-generation sequencing. Additional advances, such as oxidative bisulfite conversion to differentiate methylation from hydroxymethylation and analysis of limited-input/single-cell samples, hold great promise for continuing to expand epigenomic capabilities. This review provides a background on DNA modifications, the current state-of-the-art for sequencing methods, bioinformatics tools for converting these large data sets into biological insights, and perspectives on future directions for the field.
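The "base-specific, absolute quantitation" the abstract describes rests on a simple calculation: bisulfite conversion leaves methylated cytosines as C but converts unmethylated cytosines so they sequence as T, and the methylation level at a site is the fraction of covering reads that still report C. A minimal sketch of that calculation (not from the review itself; the site names and pileup counts below are hypothetical):

```python
# Illustrative sketch: per-site absolute methylation quantitation from
# bisulfite-sequencing pileup counts. After conversion, methylated C reads
# as C and unmethylated C reads as T, so methylation = C / (C + T).

def methylation_level(c_count: int, t_count: int) -> float:
    """Absolute methylation fraction at one cytosine from pileup counts."""
    covered = c_count + t_count
    if covered == 0:
        raise ValueError("site has no bisulfite-read coverage")
    return c_count / covered

# Hypothetical pileup counts (C reads, T reads) at three CG sites:
sites = {"chr1:10468": (42, 8), "chr1:10471": (5, 45), "chr1:10484": (25, 25)}
levels = {pos: methylation_level(c, t) for pos, (c, t) in sites.items()}
# e.g. chr1:10468 -> 42 / 50 = 0.84 (84% methylated)
```

The same ratio applies genome-wide (WGBS), over captured regions (BOCS), or over amplicons (BSAS); the methods differ in which cytosines are covered and at what depth, not in how the level at each site is computed.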
Affiliation(s)
- Dustin R Masser
- Reynolds Oklahoma Center on Aging, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Department of Physiology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Oklahoma Nathan Shock Center for Aging, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Niran Hadad
- Reynolds Oklahoma Center on Aging, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Oklahoma Nathan Shock Center for Aging, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Oklahoma Center for Neuroscience, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Hunter Porter
- Reynolds Oklahoma Center on Aging, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Oklahoma Nathan Shock Center for Aging, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Oklahoma Center for Neuroscience, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Michael B Stout
- Reynolds Oklahoma Center on Aging, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Department of Nutritional Sciences, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Archana Unnikrishnan
- Reynolds Oklahoma Center on Aging, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Department of Geriatric Medicine, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- David R Stanford
- Reynolds Oklahoma Center on Aging, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Department of Physiology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Oklahoma Center for Neuroscience, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Willard M Freeman
- Reynolds Oklahoma Center on Aging, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Department of Physiology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Oklahoma Nathan Shock Center for Aging, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Oklahoma Center for Neuroscience, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
- Department of Nutritional Sciences, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA
2
Agarwal P, Owzar K. Next generation distributed computing for cancer research. Cancer Inform 2015; 13:97-109. PMID: 25983539; PMCID: PMC4412427; DOI: 10.4137/cin.s16344. Received 08/19/2014; revised 01/05/2015; accepted 01/06/2015. Open access.
Abstract
Advances in next generation sequencing (NGS) and mass spectrometry (MS) technologies have opened many new avenues for extending the scope of translational cancer research while creating tremendous challenges in data management and analysis. The resulting informatics challenges are generally not amenable to traditional computing models. Recent advances in scalable computing and associated infrastructure, particularly distributed computing for Big Data, can provide solutions to these challenges. In this review, the next generation of distributed computing technologies that can address these informatics problems is described from the perspective of three key components of a computational platform, namely computing, data storage and management, and networking. A broad overview of scalable computing is provided to set the context for a detailed description of Hadoop, a technology that is being rapidly adopted for large-scale distributed computing. A proof-of-concept Hadoop cluster, set up for performance benchmarking of NGS read alignment, is described as an example of how to work with Hadoop. Finally, Hadoop is compared with a number of other current technologies for distributed computing.
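Hadoop's programming model is MapReduce: a map phase emits key-value pairs, a shuffle groups them by key across the cluster, and a reduce phase aggregates each group. A framework-free sketch of that model, using alignment bookkeeping of the kind the review benchmarks as the example (the records and field layout are hypothetical; a real Hadoop job would express the same map and reduce functions through the Hadoop Streaming or Java APIs):

```python
# Minimal single-process simulation of the MapReduce model that Hadoop
# implements: map -> shuffle (group by key) -> reduce.

from collections import defaultdict

# Hypothetical alignment records: (read_id, chromosome the read mapped to)
alignments = [("r1", "chr1"), ("r2", "chr2"), ("r3", "chr1"), ("r4", "chr1")]

def map_phase(records):
    # Emit one (key, value) pair per record, as a Hadoop mapper would.
    for _, chrom in records:
        yield (chrom, 1)

def shuffle(pairs):
    # Group values by key; Hadoop performs this step between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Aggregate each group, as a Hadoop reducer would.
    return {key: sum(values) for key, values in groups.items()}

counts = reduce_phase(shuffle(map_phase(alignments)))
# counts == {"chr1": 3, "chr2": 1}
```

The appeal for NGS workloads is that map tasks are independent, so Hadoop can run them on whichever nodes already hold the data blocks, moving computation to the data rather than the reverse.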
Affiliation(s)
- Pankaj Agarwal
- Duke Cancer Institute, Duke University Medical Center, Durham, NC, USA
- Kouros Owzar
- Duke Cancer Institute, Duke University Medical Center, Durham, NC, USA
- Department of Biostatistics and Bioinformatics, Duke University Medical Center, Durham, NC, USA
3
Manconi A, Manca E, Moscatelli M, Gnocchi M, Orro A, Armano G, Milanesi L. G-CNV: A GPU-Based Tool for Preparing Data to Detect CNVs with Read-Depth Methods. Front Bioeng Biotechnol 2015; 3:28. PMID: 25806367; PMCID: PMC4354384; DOI: 10.3389/fbioe.2015.00028. Received 12/10/2014; accepted 02/19/2015. Open access.
Abstract
Copy number variations (CNVs) are the most prevalent type of structural variation (SV) in the human genome and are involved in a wide range of common human diseases. Different computational methods have been devised to detect this type of SV and to study how it is implicated in human diseases. Recently, computational methods based on high-throughput sequencing (HTS) have come into increasing use. The majority of these methods focus on mapping short-read sequences generated from a donor against a reference genome to detect signatures distinctive of CNVs. In particular, read-depth based methods detect CNVs by identifying genomic regions whose read-depth differs significantly from that of the surrounding regions. The analysis pipeline of these methods consists of four main stages: (i) data preparation, (ii) data normalization, (iii) CNV region identification, and (iv) copy number estimation. However, available tools do not support most of the operations required in the first two stages of this pipeline. Typically, they start the analysis by building the read-depth signal from pre-processed alignments, so third-party tools must be used to perform most of the preliminary operations required to build that signal. These data-intensive operations can be efficiently parallelized on graphics processing units (GPUs). In this article, we present G-CNV, a GPU-based tool devised to perform the common operations required in the first two stages of the analysis pipeline. G-CNV is able to filter low-quality read sequences, mask low-quality nucleotides, remove adapter sequences, remove duplicated read sequences, map the short reads, resolve multiple-mapping ambiguities, build the read-depth signal, and normalize it. G-CNV can be used as a third-party tool to prepare data for subsequent read-depth signal generation and analysis, and it can also be integrated into CNV detection tools to generate read-depth signals.
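The last two operations in that list are the conceptual core of stages (i)-(ii): bin mapped read positions into fixed-size windows to obtain the raw read-depth signal, then normalize it so copy-neutral regions sit near a common baseline. A small illustrative sketch (this is not G-CNV itself, which runs these steps on the GPU; the read positions, window size, and median normalization scheme below are simplified assumptions):

```python
# Illustrative sketch of read-depth signal construction and normalization
# for CNV detection: count reads per fixed-size window, then scale by the
# median so that copy-neutral windows have depth ~1.0.

from collections import Counter
from statistics import median

def read_depth_signal(read_starts, window_size, n_windows):
    """Raw read-depth signal: number of reads starting in each window."""
    counts = Counter(pos // window_size for pos in read_starts)
    return [counts.get(i, 0) for i in range(n_windows)]

def median_normalize(signal):
    """Scale so the median window has depth 1.0 (copy-neutral baseline)."""
    m = median(signal)
    return [depth / m for depth in signal]

# Hypothetical mapped read start positions on one chromosome:
starts = [5, 40, 90, 110, 130, 150, 170, 190, 210, 290]
raw = read_depth_signal(starts, window_size=100, n_windows=3)  # [3, 5, 2]
norm = median_normalize(raw)  # window 1 elevated, window 2 depleted
```

Downstream CNV callers (stage iii) then scan the normalized signal for runs of windows whose depth departs significantly from 1.0, flagging candidate duplications or deletions.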
Affiliation(s)
- Andrea Manconi
- Institute for Biomedical Technologies, National Research Council, Milan, Italy
- Emanuele Manca
- Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy
- Marco Moscatelli
- Institute for Biomedical Technologies, National Research Council, Milan, Italy
- Matteo Gnocchi
- Institute for Biomedical Technologies, National Research Council, Milan, Italy
- Alessandro Orro
- Institute for Biomedical Technologies, National Research Council, Milan, Italy
- Giuliano Armano
- Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy
- Luciano Milanesi
- Institute for Biomedical Technologies, National Research Council, Milan, Italy