Gao Q, Ostendorf E, Cruz JA, Jin R, Kramer DM, Chen J. Inter-functional analysis of high-throughput phenotype data by non-parametric clustering and its application to photosynthesis. Bioinformatics 2015; 32:67-76. [PMID: 26342101 DOI: 10.1093/bioinformatics/btv515]
[Received: 05/12/2015] [Accepted: 08/25/2015]
Abstract
MOTIVATION
Phenomics is the study of the properties and behaviors of organisms (i.e. their phenotypes) on a high-throughput scale. New computational tools are needed to analyze complex phenomics data, which consists of multiple traits/behaviors that interact with each other and are dependent on external factors, such as genotype and environmental conditions, in a way that has not been well studied.
RESULTS
We deployed an efficient framework for partitioning complex and high-dimensional phenotype data into distinct functional groups. To achieve this, we represented measured phenotype data from each genotype as a cloud-of-points, and developed a novel non-parametric clustering algorithm to cluster all the genotypes. When compared with conventional clustering approaches, the new method is advantageous in that it makes no assumption about the parametric form of the underlying data distribution and is thus particularly suitable for phenotype data analysis. We demonstrated the utility of the new clustering technique by distinguishing novel phenotypic patterns in both synthetic data and a high-throughput plant photosynthetic phenotype dataset. We biologically verified the clustering results using four Arabidopsis chloroplast mutant lines.
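The cloud-of-points idea can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes each genotype is a list of 2-D measurement points, swaps in the energy distance as a distribution-free dissimilarity between two clouds, and groups genotypes by plain average-linkage agglomerative clustering. All names (`energy_distance`, `cluster_clouds`, `make_cloud`, the genotype labels) and the synthetic data are hypothetical.

```python
import math
import random
from itertools import combinations


def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (p in a, q in b)."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))


def energy_distance(a, b):
    """Energy distance between two point clouds.

    Assumption: used here as a stand-in dissimilarity; it needs no
    parametric model of the underlying distributions.
    """
    return (2 * mean_pairwise_distance(a, b)
            - mean_pairwise_distance(a, a)
            - mean_pairwise_distance(b, b))


def cluster_clouds(clouds, k):
    """Average-linkage agglomerative clustering of named clouds into k groups."""
    names = list(clouds)
    # Precompute the dissimilarity for every pair of genotypes.
    dist = {frozenset((i, j)): energy_distance(clouds[i], clouds[j])
            for i, j in combinations(names, 2)}
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        total = sum(dist[frozenset((i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    # Repeatedly merge the two closest clusters until k remain.
    while len(clusters) > k:
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # safe because x < y
    return [sorted(c) for c in clusters]


def make_cloud(rng, center, n=30, spread=0.3):
    """Hypothetical synthetic genotype: n noisy 2-D measurements."""
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(n)]


rng = random.Random(0)
clouds = {
    "g1": make_cloud(rng, (0.0, 0.0)),
    "g2": make_cloud(rng, (0.1, 0.0)),
    "g3": make_cloud(rng, (5.0, 5.0)),
    "g4": make_cloud(rng, (5.0, 5.1)),
}
groups = cluster_clouds(clouds, k=2)
print(groups)
```

With these well-separated synthetic clouds, `g1` and `g2` should fall into one group and `g3` and `g4` into the other. Real phenotype data would need a dissimilarity and a stopping rule chosen for the measurements at hand.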
AVAILABILITY AND IMPLEMENTATION
Software is available at www.msu.edu/~jinchen/NPM.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
CONTACT
jinchen@msu.edu, kramerd8@cns.msu.edu or rongjin@cse.msu.edu.