1
|
Borah K, Das HS, Seth S, Mallick K, Rahaman Z, Mallik S. A review on advancements in feature selection and feature extraction for high-dimensional NGS data analysis. Funct Integr Genomics 2024; 24:139. [PMID: 39158621 DOI: 10.1007/s10142-024-01415-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2024] [Revised: 07/30/2024] [Accepted: 08/01/2024] [Indexed: 08/20/2024]
Abstract
Recent advancements in biomedical technologies and the proliferation of high-dimensional Next Generation Sequencing (NGS) datasets have led to significant growth in the bulk and density of data. The NGS high-dimensional data, characterized by a large number of genomics, transcriptomics, proteomics, and metagenomics features relative to the number of biological samples, presents significant challenges for reducing feature dimensionality. The high dimensionality of NGS data poses significant challenges for data analysis, including increased computational burden, potential overfitting, and difficulty in interpreting results. Feature selection and feature extraction are two pivotal techniques employed to address these challenges by reducing the dimensionality of the data, thereby enhancing model performance, interpretability, and computational efficiency. Feature selection and feature extraction can be categorized into statistical and machine learning methods. The present study conducts a comprehensive and comparative review of various statistical, machine learning, and deep learning-based feature selection and extraction techniques specifically tailored for the interpretation of human NGS and microarray data. A thorough literature search was performed to gather information on these techniques, focusing on array-based and NGS data analysis. Various techniques, including deep learning architectures, machine learning algorithms, and statistical methods, have been explored for the microarray, bulk RNA-Seq, and single-cell RNA-Seq (scRNA-Seq) datasets surveyed here. The study provides an overview of these techniques, highlighting their applications, advantages, and limitations in the context of high-dimensional NGS data.
This review provides better insights for readers to apply feature selection and feature extraction techniques to enhance the performance of predictive models, uncover underlying biological patterns, and gain deeper insights into massive and complex NGS and microarray data.
Affiliation(s)
- Kasmika Borah
  - Department of Computer Science and Information Technology, Cotton University, Panbazar, Guwahati, 781001, Assam, India
- Himanish Shekhar Das
  - Department of Computer Science and Information Technology, Cotton University, Panbazar, Guwahati, 781001, Assam, India
- Soumita Seth
  - Department of Computer Science and Engineering, Future Institute of Engineering and Management, Narendrapur, Kolkata, 700150, West Bengal, India
- Koushik Mallick
  - Department of Computer Science and Engineering, RCC Institute of Information Technology, Canal S Rd, Beleghata, Kolkata, 700015, West Bengal, India
- Saurav Mallik
  - Department of Environmental Health, Harvard T H Chan School of Public Health, Boston, MA, 02115, USA
  - Department of Pharmacology & Toxicology, University of Arizona, Tucson, AZ, 85721, USA
|
2
|
Zhang L, Zhang Y, Liu Y, Miao W, Ai J, Li J, Peng S, Li S, Ye L, Zeng R, Shi X, Ma J, Lin Y, Kuang W, Cui R. Multi-omics analysis revealed that the protein kinase MoKin1 affected the cellular response to endoplasmic reticulum stress in the rice blast fungus, Magnaporthe oryzae. BMC Genomics 2024; 25:449. [PMID: 38714914 PMCID: PMC11077741 DOI: 10.1186/s12864-024-10337-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 04/23/2024] [Indexed: 05/12/2024] Open
Abstract
BACKGROUND Previous studies have shown that the protein kinase MoKin1 plays an important role in the growth, conidiation, germination and pathogenicity of the rice blast fungus, Magnaporthe oryzae. The ΔMokin1 mutant showed significant phenotypic defects and significantly reduced pathogenicity. However, the mechanism by which MoKin1 affects physiological and biochemical development in M. oryzae remained unclear. RESULTS This study adopted a multi-omics approach to comprehensively analyze MoKin1 function, and the results showed that MoKin1 affected the cellular response to endoplasmic reticulum stress (ER stress). Proteomic analysis revealed that the downregulated proteins in the ΔMokin1 mutant were enriched mainly in the response to ER stress triggered by unfolded proteins. Loss of MoKin1 prevented the ER stress signal from reaching the nucleus. Therefore, the phosphorylation of various proteins regulating the transcription of ER stress-related genes and mRNA translation was significantly downregulated. The insensitivity to ER stress led to metabolic disorders, resulting in a significant shortage of carbohydrates and a low energy supply, which also resulted in severe phenotypic defects in the ΔMokin1 mutant. Analysis of MoKin1-interacting proteins indicated that MoKin1 does participate in the response to ER stress. CONCLUSION Our results showed the important role of the protein kinase MoKin1 in regulating the cellular response to ER stress, providing a new research direction for revealing the mechanism by which MoKin1 affects pathogenicity, and theoretical support for the search for new biological target sites and the development of bio-pesticides.
Affiliation(s)
- Lianhu Zhang
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Yifan Zhang
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Yankun Liu
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Wenjing Miao
  - College of Bioscience and Bioengineering, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Jingyu Ai
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Jingling Li
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Song Peng
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Songyan Li
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Lifang Ye
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Rong Zeng
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Xugen Shi
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Jian Ma
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Yachun Lin
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Weigang Kuang
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
- Ruqiang Cui
  - College of Agronomy, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
  - Key Laboratory of Crop Physiology, Ecology and Genetic Breeding, Ministry of Education, Jiangxi Agricultural University, Nanchang, 330045, Jiangxi, China
|
3
|
Kuo PH, Chang CW, Tseng YR, Yau HT. Efficient, automatic, and optimized portable Raman-spectrum-based pesticide detection system. SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2024; 308:123787. [PMID: 38128328 DOI: 10.1016/j.saa.2023.123787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2023] [Revised: 10/08/2023] [Accepted: 12/15/2023] [Indexed: 12/23/2023]
Abstract
Raman spectroscopy can be used for accurately detecting pesticides and determining the chemical composition of a pesticide. To facilitate field detection, the present study used a portable Raman spectrometer for analysis. However, this spectrometer was found to be susceptible to noise interference and signal offsets, which increased the difficulty of pesticide identification. The most commonly used algorithm for Raman spectrum identification is principal component analysis (PCA). However, accurate classification often cannot be achieved with PCA because of the offset and noise in the Raman spectrum data. Therefore, in this study, after the collected Raman spectrum data were processed using the small-step, center-weighted moving-average method, these data were employed to train a convolutional neural network (CNN) model for prediction. To optimize the CNN model, the hyperparameters of the CNN were adjusted using various optimization algorithms, and the optimal solution was obtained after multiple iterations. Data preprocessing and architecture training models were then constructed in a self-optimized manner to improve the ability of the algorithm model to handle diverse types of data. Finally, a CNN model optimized using the cat swarm optimization algorithm was developed. This model was trained on 3000 samples containing three pesticides, and its accuracy for pesticide composition identification was found to be 89.33%.
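The smoothing step named in this abstract can be illustrated with a minimal sketch. The abstract does not give the window size or the weights, so the triangular weights, the `half_width` parameter, and the edge handling (clipping the window and renormalizing) are assumptions made here for illustration, not the authors' settings.

```python
def center_weighted_moving_average(signal, half_width=2):
    """Smooth a 1-D sequence with triangular weights that peak at the center sample.

    Assumption: triangular weights and a clipped window at the edges;
    the cited paper does not specify its exact weighting scheme.
    """
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    n = len(signal)
    smoothed = []
    for i in range(n):
        total = 0.0
        weight_sum = 0.0
        # Visit neighbors inside the window, skipping positions past the ends.
        for offset in range(-half_width, half_width + 1):
            j = i + offset
            if 0 <= j < n:
                weight = half_width + 1 - abs(offset)  # center gets the largest weight
                total += weight * signal[j]
                weight_sum += weight
        smoothed.append(total / weight_sum)
    return smoothed


# Toy "spectrum": a flat baseline with a single spike (made-up values).
spectrum = [1.0, 1.0, 1.0, 10.0, 1.0, 1.0, 1.0]
smoothed = center_weighted_moving_average(spectrum, half_width=1)
print(smoothed)
```

A small `half_width` corresponds to a short window, which damps isolated noise while keeping narrow Raman peaks from being flattened too much.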
Affiliation(s)
- Ping-Huan Kuo
  - Department of Mechanical Engineering, National Chung Cheng University, Chiayi 62102, Taiwan
  - Advanced Institute of Manufacturing with High-Tech Innovations (AIM-HI), National Chung Cheng University, Chiayi 62102, Taiwan
- Chen-Wen Chang
  - Department of Mechanical Engineering, National Chung Cheng University, Chiayi 62102, Taiwan
- Yung-Ruen Tseng
  - Advanced Institute of Manufacturing with High-Tech Innovations (AIM-HI), National Chung Cheng University, Chiayi 62102, Taiwan
- Her-Terng Yau
  - Department of Mechanical Engineering, National Chung Cheng University, Chiayi 62102, Taiwan
  - Advanced Institute of Manufacturing with High-Tech Innovations (AIM-HI), National Chung Cheng University, Chiayi 62102, Taiwan
|
4
|
Watts J, Allen E, Mitoubsi A, Khojandi A, Eales J, Papamarkou T. Towards Faster Gene Expression Prediction via Dimensionality Reduction and Feature Selection. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2023; 2023:1-4. [PMID: 38083578 DOI: 10.1109/embc40787.2023.10340962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]
Abstract
The majority of genes have a genetic component to their expression. Elastic nets have been shown to be effective at predicting tissue-specific, individual-level gene expression from genotype data. We apply principal component analysis (PCA), linkage disequilibrium pruning, or the combination of the two to reduce the genetic variants used as inputs to the elastic net models for the prediction of gene expression, or to generate a lower-dimensional representation of them. Our results show that, in general, elastic nets attain their best performance when all genetic variants are included as inputs; however, a relatively low number of principal components can effectively summarize the majority of genetic variation while reducing the overall computation time. Specifically, 100 principal components reduce the computational time of the models by over 80% with only an 8% loss in R². Finally, linkage disequilibrium pruning does not effectively reduce the genetic variants for predicting gene expression. As predictive models are commonly made for over 27,000 genes for more than 50 tissues, PCA may provide an effective method for reducing the computational burden of gene expression analysis.
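The idea that a few principal components can summarize correlated genetic variants can be sketched in plain Python. This is only an illustration under stated assumptions: the genotype matrix is made up, the first component is found by power iteration rather than by whatever PCA routine the authors used, and the elastic net stage is not reproduced. The names `center_columns` and `first_principal_component` are hypothetical helpers.

```python
import math
import random


def center_columns(rows):
    """Subtract each column mean so PCA describes variation around the average."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in rows]


def first_principal_component(rows, iterations=200, seed=0):
    """Return (loading_vector, explained_variance_fraction) via power iteration."""
    x = center_columns(rows)
    n, p = len(x), len(x[0])
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(p)]  # positive random start vector
    for _ in range(iterations):
        # Apply X^T X to v without forming the p x p matrix explicitly.
        scores = [sum(xi[j] * v[j] for j in range(p)) for xi in x]
        w = [sum(x[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    scores = [sum(xi[j] * v[j] for j in range(p)) for xi in x]
    pc_variance = sum(s * s for s in scores)
    total_variance = sum(c * c for xi in x for c in xi)
    return v, pc_variance / total_variance


# Made-up genotype dosages (0/1/2): three variants in perfect linkage
# disequilibrium (identical columns) plus one independent variant.
genotypes = [
    [0, 0, 0, 1],
    [1, 1, 1, 0],
    [2, 2, 2, 1],
    [1, 1, 1, 2],
    [0, 0, 0, 1],
    [2, 2, 2, 1],
]
loading, explained = first_principal_component(genotypes)
print(round(explained, 3))
```

In this toy matrix a single component carries 12/14 of the total variance, because the three linked variants move together; the projected scores could then replace the raw variants as model inputs.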
|
5
|
Smith RN, Rosales IA, Tomaszewski KT, Mahowald GT, Araujo-Medina M, Acheampong E, Bruce A, Rios A, Otsuka T, Tsuji T, Hotta K, Colvin R. Utility of Banff Human Organ Transplant Gene Panel in Human Kidney Transplant Biopsies. Transplantation 2023; 107:1188-1199. [PMID: 36525551 PMCID: PMC10132999 DOI: 10.1097/tp.0000000000004389] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 12/23/2022]
Abstract
BACKGROUND Microarray transcript analysis of human renal transplantation biopsies has successfully identified the many patterns of graft rejection. To evaluate an alternative, this report tests whether gene expression from the Banff Human Organ Transplant (B-HOT) probe set panel, derived from validated microarrays, can identify the relevant allograft diagnoses directly from archival human renal transplant formalin-fixed paraffin-embedded biopsies. To test this hypothesis, principal components (PCs) of gene expressions were used to identify allograft diagnoses, to classify diagnoses, and to determine whether the PC data were rich enough to identify diagnostic subtypes by clustering, all of which are needed if the B-HOT panel is to substitute for microarrays. METHODS RNA was isolated from routine, archival formalin-fixed paraffin-embedded tissue renal biopsy cores with both rejection and nonrejection diagnoses. The B-HOT panel expression of 770 genes was analyzed by PCs, which were then tested to determine their ability to identify diagnoses. RESULTS PCs of microarray gene sets identified the Banff categories of renal allograft diagnoses and modeled the aggregate diagnoses well, showing a correspondence with the pathologic diagnoses similar to that of microarrays. Clustering of the PCs identified diagnostic subtypes including non-chronic antibody-mediated rejection with high endothelial expression. PCs of cell types and pathways identified new mechanistic patterns including differential expression of B and plasma cells. CONCLUSIONS Using PCs of gene expression from the B-HOT panel confirms the utility of the panel for identifying allograft diagnoses, with results similar to those of microarrays. The B-HOT panel will accelerate and expand transcript analysis and will be useful for longitudinal and outcome studies.
Affiliation(s)
- Rex N Smith
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
  - Center for Transplantation Sciences, Massachusetts General Hospital, Boston, MA
- Ivy A Rosales
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
  - Center for Transplantation Sciences, Massachusetts General Hospital, Boston, MA
- Kristen T Tomaszewski
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
  - Center for Transplantation Sciences, Massachusetts General Hospital, Boston, MA
- Grace T Mahowald
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Milagros Araujo-Medina
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Ellen Acheampong
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Amy Bruce
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Andrea Rios
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
- Takuya Otsuka
  - Department of Surgical Pathology, Hokkaido University Hospital, Sapporo, Japan
- Takahiro Tsuji
  - Department of Pathology, Sapporo City General Hospital, Sapporo, Japan
- Kiyohiko Hotta
  - Department of Urology, Hokkaido University Hospital, Sapporo, Japan
- Robert Colvin
  - Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA
  - Center for Transplantation Sciences, Massachusetts General Hospital, Boston, MA
|
6
|
Zeng Z, Guan L, Zhu W, Dong J, Li J. Face Recognition Based on SVM Optimized by the Improved Bacterial Foraging Optimization Algorithm. INT J PATTERN RECOGN 2019. [DOI: 10.1142/s021800141956007x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Support vector machine (SVM) is widely used for face recognition. However, kernel function selection (kernel selection and its parameter selection) is a key problem for SVMs, and a difficult one. This paper tries to make some contributions to this problem with a focus on optimizing the parameters in the selected kernel function. The bacterial foraging optimization algorithm, inspired by the social foraging behavior of Escherichia coli, has been widely accepted as a global optimization algorithm of current interest for distributed optimization and control. Therefore, we propose to optimize the parameters in SVM by an improved bacterial foraging optimization algorithm (IBFOA). In the improved version of the bacterial foraging optimization algorithm, a dynamical elimination-dispersal probability in the elimination-dispersal step and a dynamical step size in the chemotactic step are used to improve the performance of the algorithm. Then the optimized SVM is used for face recognition. In addition, an improved local binary pattern is proposed in this paper to extract features of face images and improve the accuracy rate of face recognition. Numerical results show the advantage of our algorithm over a range of existing algorithms.
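For readers unfamiliar with local binary patterns, the standard 3x3 operator can be sketched as follows. This is the basic textbook version, not the improved variant proposed in the paper (whose details the abstract does not give); the neighbor ordering, the `>=` threshold, and the names `lbp_code` and `lbp_histogram` are choices made here for illustration.

```python
def lbp_code(image, row, col):
    """Basic 8-neighbor LBP code for one interior pixel of a 2-D grayscale list."""
    center = image[row][col]
    # Neighbors in clockwise order starting at the top-left corner (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbor at least as bright as the center sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Made-up 3x3 grayscale patch; only the middle pixel is interior.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)
print(code)
```

A histogram like `hist`, computed per image region and concatenated, is the kind of feature vector that would then be passed to a classifier such as an SVM.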
Affiliation(s)
- Zhigao Zeng
  - College of Computer, Hunan University of Technology, Zhuzhou City, Hunan Province 412007, P. R. China
  - Intelligent Information Perception and Processing Technology, Hunan Province Key Laboratory, Zhuzhou City, Hunan Province 412007, P. R. China
- Lianghua Guan
  - College of Computer, Hunan University of Technology, Zhuzhou City, Hunan Province 412007, P. R. China
  - Intelligent Information Perception and Processing Technology, Hunan Province Key Laboratory, Zhuzhou City, Hunan Province 412007, P. R. China
- Wenqiu Zhu
  - College of Computer, Hunan University of Technology, Zhuzhou City, Hunan Province 412007, P. R. China
  - Intelligent Information Perception and Processing Technology, Hunan Province Key Laboratory, Zhuzhou City, Hunan Province 412007, P. R. China
- Jing Dong
  - Institute of Automation, Chinese Academy of Sciences, Beijing 100190, P. R. China
- Jun Li
  - Wuhan University of Science and Technology, Wuhan 430065, P. R. China
|
7
|
Singh V, Verma NK, Cui Y. Type-2 Fuzzy PCA Approach in Extracting Salient Features for Molecular Cancer Diagnostics and Prognostics. IEEE Trans Nanobioscience 2019; 18:482-489. [PMID: 31107656 DOI: 10.1109/tnb.2019.2917814] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 11/07/2022]
Abstract
Machine learning is becoming a powerful tool for cancer diagnosis and prognosis based on classification using high dimensional molecular data. However, extracting classification features from high-dimensional datasets remains a challenging problem. Principal component analysis (PCA) is a widely used method for dimensionality reduction. However, it is well-known that PCA and most PCA-based feature extraction methods are sensitive to noise, which may affect the accuracy of the subsequent classification. To address this problem, here we have proposed a robust fuzzy principal component analysis (PCA) with interval type-2 (IT-2) fuzzy membership functions for feature extraction. We have tested the performance of three widely used classifiers using the features extracted by proposed approaches and other feature extraction methods - PCA-based feature extraction methods (i.e. conventional PCA and fuzzy PCA), linear discriminant analysis (LDA), and support vector machine recursive feature elimination (SVM-RFE). The proposed feature extraction approaches showed better performance on cancer transcriptome and proteome datasets.
|
8
|
Genetic algorithm based cancerous gene identification from microarray data using ensemble of filter methods. Med Biol Eng Comput 2018; 57:159-176. [DOI: 10.1007/s11517-018-1874-4] [Citation(s) in RCA: 68] [Impact Index Per Article: 9.7] [Received: 02/09/2018] [Accepted: 07/12/2018] [Indexed: 12/25/2022]
|
9
|
Carcillo F, Le Borgne YA, Caelen O, Bontempi G. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization. INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS 2018. [DOI: 10.1007/s41060-018-0116-z] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Indexed: 11/25/2022]
|
10
|
Liu JX, Wang D, Gao YL, Zheng CH, Shang JL, Liu F, Xu Y. A joint-L2,1-norm-constraint-based semi-supervised feature extraction for RNA-Seq data analysis. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.09.083] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 10/20/2022]
|
11
|
Wang L, Wang Y, Chang Q. Feature selection methods for big data bioinformatics: A survey from the search perspective. Methods 2016; 111:21-31. [PMID: 27592382 DOI: 10.1016/j.ymeth.2016.08.014] [Citation(s) in RCA: 110] [Impact Index Per Article: 12.2] [Received: 05/15/2016] [Revised: 08/25/2016] [Accepted: 08/30/2016] [Indexed: 11/26/2022] Open
Abstract
This paper surveys main principles of feature selection and their recent applications in big data bioinformatics. Instead of the commonly used categorization into filter, wrapper, and embedded approaches to feature selection, we formulate feature selection as a combinatorial optimization or search problem and categorize feature selection methods into exhaustive search, heuristic search, and hybrid methods, where heuristic search methods may further be categorized into those with or without data-distilled feature ranking measures.
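The search view of feature selection described in this abstract can be made concrete with a toy comparison of exhaustive search and one heuristic (greedy forward selection). Everything below is hypothetical: the gene names, the relevance and redundancy values, and the scoring function are invented for illustration and are not taken from the survey.

```python
from itertools import combinations

# Hypothetical per-feature relevance and pairwise redundancy penalties.
RELEVANCE = {"g1": 1.0, "g2": 0.9, "g3": 0.9}
REDUNDANCY = {
    frozenset(("g1", "g2")): 0.8,  # g1 overlaps with g2
    frozenset(("g1", "g3")): 0.8,  # g1 overlaps with g3
}


def score(subset):
    """Toy subset score: summed relevance minus pairwise redundancy penalties."""
    total = sum(RELEVANCE[f] for f in subset)
    for pair in combinations(sorted(subset), 2):
        total -= REDUNDANCY.get(frozenset(pair), 0.0)
    return total


def exhaustive_search(features, k):
    """Evaluate every subset of size k: optimal, but the count grows combinatorially."""
    return max(combinations(sorted(features), k), key=score)


def greedy_forward_search(features, k):
    """Heuristic search: at each step add the feature that most improves the score."""
    selected = []
    remaining = sorted(features)
    while len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return tuple(sorted(selected))


best_exhaustive = exhaustive_search(RELEVANCE, 2)
best_greedy = greedy_forward_search(RELEVANCE, 2)
print(best_exhaustive, best_greedy)
```

Here the greedy search commits to `g1` first because it has the highest individual relevance, and so misses the pair `("g2", "g3")` that exhaustive search finds. That trade-off between cost and optimality is what motivates heuristic and hybrid methods when the number of features is large.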
Affiliation(s)
- Lipo Wang
  - School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
- Yaoli Wang
  - College of Information Engineering, Taiyuan University of Technology, Taiyuan, China
- Qing Chang
  - College of Information Engineering, Taiyuan University of Technology, Taiyuan, China
|
12
|
Liu JX, Xu Y, Gao YL, Zheng CH, Wang D, Zhu Q. A Class-Information-Based Sparse Component Analysis Method to Identify Differentially Expressed Genes on RNA-Seq Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016; 13:392-398. [PMID: 27045835 DOI: 10.1109/tcbb.2015.2440265] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 06/05/2023]
Abstract
With the development of deep sequencing technologies, large amounts of RNA-Seq data have been generated. Researchers have proposed many methods based on sparse theory to identify the differentially expressed genes from these data. In order to improve the performance of sparse principal component analysis, in this paper, we propose a novel class-information-based sparse component analysis (CISCA) method which introduces the class information via a total scatter matrix. First, CISCA normalizes the RNA-Seq data by using a Poisson model to obtain their differential sections. Second, the total scatter matrix is obtained by combining the between-class and within-class scatter matrices. Third, we decompose the total scatter matrix by using singular value decomposition and construct a new data matrix by using singular values and left singular vectors. Then, aiming at obtaining sparse components, CISCA decomposes the constructed data matrix by solving an optimization problem with sparse constraints on loading vectors. Finally, the differentially expressed genes are identified by using the sparse loading vectors. The results on simulation and real RNA-Seq data demonstrate that our method is effective and suitable for analyzing these data.
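The second and third steps can be written out under the standard textbook definitions of scatter matrices. The notation is an assumption made here, not taken from the paper: $x_i$ is the expression vector of sample $i$, $\mathcal{I}_c$ is the set of samples in class $c$ with size $n_c$, $\mu_c$ is the class mean, and $\mu$ is the overall mean. CISCA may weight or combine the terms differently, and its exact construction of the new data matrix from the singular values and left singular vectors is not reproduced.

```latex
\begin{aligned}
S_w &= \sum_{c=1}^{C} \sum_{i \in \mathcal{I}_c} (x_i - \mu_c)(x_i - \mu_c)^{\top}, \\
S_b &= \sum_{c=1}^{C} n_c\, (\mu_c - \mu)(\mu_c - \mu)^{\top}, \\
S_t &= \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^{\top} = S_b + S_w, \\
S_t &= U \Sigma V^{\top} \quad \text{(singular value decomposition)}.
\end{aligned}
```

The identity $S_t = S_b + S_w$ follows by writing $x_i - \mu = (x_i - \mu_c) + (\mu_c - \mu)$ and noting that the cross terms sum to zero within each class.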
|
13
|
Chinnaswamy A, Srinivasan R. Hybrid Feature Selection Using Correlation Coefficient and Particle Swarm Optimization on Microarray Gene Expression Data. ADVANCES IN INTELLIGENT SYSTEMS AND COMPUTING 2016. [DOI: 10.1007/978-3-319-28031-8_20] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 01/11/2023]
|
14
|
|
15
|
Guo R, Ahn M, Zhu H. Spatially Weighted Principal Component Analysis for Imaging Classification. J Comput Graph Stat 2015; 24:274-296. [PMID: 26089629 DOI: 10.1080/10618600.2014.912135] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 10/25/2022]
Abstract
The aim of this paper is to develop a supervised dimension reduction framework, called Spatially Weighted Principal Component Analysis (SWPCA), for high dimensional imaging classification. Two main challenges in imaging classification are the high dimensionality of the feature space and the complex spatial structure of imaging data. In SWPCA, we introduce two sets of novel weights including global and local spatial weights, which enable a selective treatment of individual features and incorporation of the spatial structure of imaging data and class label information. We develop an efficient two-stage iterative SWPCA algorithm and its penalized version along with the associated weight determination. We use both simulation studies and real data analysis to evaluate the finite-sample performance of our SWPCA. The results show that SWPCA outperforms several competing principal component analysis (PCA) methods, such as supervised PCA (SPCA), and other competing methods, such as sparse discriminant analysis (SDA).
Affiliation(s)
- Ruixin Guo
  - Department of Biostatistics and Informatics, University of Colorado School of Public Health, University of North Carolina at Chapel Hill
- Mihye Ahn
  - Department of Biostatistics and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill
- Hongtu Zhu
  - Department of Biostatistics and Biomedical Research Imaging Center, University of North Carolina at Chapel Hill
|
16
|
Shum HPH, Ho ESL, Jiang Y, Takagi S. Real-time posture reconstruction for Microsoft Kinect. IEEE TRANSACTIONS ON CYBERNETICS 2013; 43:1357-1369. [PMID: 23981562 DOI: 10.1109/tcyb.2013.2275945] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 06/02/2023]
Abstract
The recent advancement of motion recognition using Microsoft Kinect stimulates many new ideas in motion capture and virtual reality applications. Utilizing a pattern recognition algorithm, Kinect can determine the positions of different body parts from the user. However, due to the use of a single depth camera, recognition accuracy drops significantly when the parts are occluded. This hugely limits the usability of applications that involve interaction with external objects, such as sport training or exercising systems. The problem becomes more critical when Kinect incorrectly perceives body parts. This is because applications have limited information about the recognition correctness, and using those parts to synthesize body postures would result in serious visual artifacts. In this paper, we propose a new method to reconstruct valid movement from incomplete and noisy postures captured by Kinect. We first design a set of measurements that objectively evaluates the degree of reliability on each tracked body part. By incorporating the reliability estimation into a motion database query during run time, we obtain a set of similar postures that are kinematically valid. These postures are used to construct a latent space, which is known as the natural posture space in our system, with local principal component analysis. We finally apply frame-based optimization in the space to synthesize a new posture that closely resembles the true user posture while satisfying kinematic constraints. Experimental results show that our method can significantly improve the quality of the recognized posture under severely occluded environments, such as a person exercising with a basketball or moving in a small room.
|