201
|
Georges N, Rekik I. Data-Specific Feature Selection Method Identification for Most Reproducible Connectomic Feature Discovery Fingerprinting Brain States. CONNECTOMICS IN NEUROIMAGING 2018. [DOI: 10.1007/978-3-030-00755-3_11] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/16/2023]
|
202
|
|
203
|
Vyas R, Bapat S, Goel P, Karthikeyan M, Tambe SS, Kulkarni BD. Application of Genetic Programming (GP) Formalism for Building Disease Predictive Models from Protein-Protein Interactions (PPI) Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:27-37. [PMID: 28113781 DOI: 10.1109/tcbb.2016.2621042] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 06/06/2023]
Abstract
Protein-protein interactions (PPIs) play a vital role in the biological processes involved in cell functions and disease pathways. The experimental methods known to predict PPIs require tremendous effort, and the results are often hindered by the presence of a large number of false positives. Herein, we demonstrate the use of a new Genetic Programming (GP) based Symbolic Regression (SR) approach for predicting PPIs related to a disease. In a case study, a dataset consisting of one hundred and thirty-five PPI complexes related to cancer was used to construct a generic PPI predicting model with good PPI prediction accuracy and generalization ability. A high correlation coefficient (CC) of 0.893 and low root mean square error (RMSE) and mean absolute percentage error (MAPE) values of 478.221 and 0.239, respectively, were achieved for both the training and test set outputs. To validate the discriminatory nature of the model, it was applied to a dataset of diabetes complexes, where it yielded significantly low CC values. Thus, the GP model developed here serves a dual purpose: (a) a predictor of the binding energy of cancer-related PPI complexes, and (b) a classifier for discriminating PPI complexes related to cancer from those of other diseases.
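For readers less familiar with the three scores quoted in this abstract, the following is a minimal sketch of how CC, RMSE and MAPE are commonly computed. The function names and toy values are illustrative assumptions and are not taken from the paper. MAPE is returned here as a fraction rather than a percentage, which is an assumption about the convention behind the reported value of 0.239.

```python
import math

def correlation_coefficient(y_true, y_pred):
    """Pearson correlation coefficient (CC) between targets and predictions."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    # Undefined (division by zero) if either series is constant.
    return cov / math.sqrt(var_t * var_p)

def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the target."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; targets must be non-zero."""
    relative = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return relative / len(y_true)

# Hypothetical binding-energy-like values, for illustration only.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
scores = (
    correlation_coefficient(observed, predicted),
    rmse(observed, predicted),
    mape(observed, predicted),
)
```

Because RMSE carries the units of the target while CC and MAPE are unitless, a large RMSE next to a small MAPE, as in the figures quoted above, is consistent with targets of large magnitude.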
|
204
|
Differential evolution for filter feature selection based on information theory and feature ranking. Knowl Based Syst 2018. [DOI: 10.1016/j.knosys.2017.10.028] [Citation(s) in RCA: 186] [Impact Index Per Article: 26.6] [Indexed: 11/23/2022]
|
205
|
Normalised Mutual Information of High-Density Surface Electromyography during Muscle Fatigue. ENTROPY 2017. [DOI: 10.3390/e19120697] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]
|
206
|
Li X, Wong KC. Multiobjective Patient Stratification Using Evolutionary Multiobjective Optimization. IEEE J Biomed Health Inform 2017; 22:1619-1629. [PMID: 29990162 DOI: 10.1109/jbhi.2017.2769711] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
Abstract
One of the main challenges in modern medicine is to stratify patients for personalized care. Many different clustering methods have been proposed to solve the problem in both quantitative and biologically meaningful manners. However, existing clustering algorithms suffer from numerous restrictions such as experimental noise, high dimensionality, and poor interpretability. To overcome those limitations altogether, we propose and formulate a multiobjective framework based on evolutionary multiobjective optimization to balance the feature relevance and redundancy for patient stratification. To demonstrate the effectiveness of our proposed algorithms, we benchmark our algorithms across 55 synthetic datasets based on a real human transcription regulation network model, 35 real cancer gene expression datasets, and two case studies. Experimental results suggest that the proposed algorithms perform better than recent state-of-the-art methods. In addition, time complexity analysis, convergence analysis, and parameter analysis are conducted to demonstrate the robustness of the proposed methods from different perspectives. Finally, t-Distributed Stochastic Neighbor Embedding (t-SNE) is applied to project the selected feature subsets onto two or three dimensions to visualize the high-dimensional patient stratification data.
|
207
|
Amir Haeri M, Ebadzadeh MM, Folino G. Statistical genetic programming for symbolic regression. Appl Soft Comput 2017. [DOI: 10.1016/j.asoc.2017.06.050] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Indexed: 11/28/2022]
|
208
|
Qi C, Hu L, Yu X. A framework of multiple kernel ensemble learning for classification using two-stage feature selection method. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2017. [DOI: 10.3233/jifs-169323] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Affiliation(s)
- Chengming Qi
  - College of Urban Rail Transit and Logistics, Beijing Union University, Beijing, China
- Lishuan Hu
  - College of Urban Rail Transit and Logistics, Beijing Union University, Beijing, China
- Xin Yu
  - College of Urban Rail Transit and Logistics, Beijing Union University, Beijing, China
|
209
|
Optimized automatic sleep stage classification using the normalized mutual information feature selection (NMIFS) method. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2017; 2017:3094-3097. [PMID: 29060552 DOI: 10.1109/embc.2017.8037511] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
Abstract
Sleep is a very important physiological phenomenon for recovery from physical and mental fatigue. Recently, there has been a lot of interest in the quality of sleep, and research is actively under way. In particular, it is important to have a repetitive and regular sleep cycle for good sleep. However, it takes experts a lot of time to determine sleep stages using physiological signals. In this study, we constructed an optimized classifier based on normalized mutual information feature selection (NMIFS) and kernel based extreme learning machine (K-ELM), and a total of 4 sleep stages (awake, weak sleep (stage 1 + stage 2), deep sleep (stage 3 + stage 4) and rapid eye movement (REM)) were automatically classified. As a result, the average accuracy obtained by the proposed method (NMIFS+K-ELM) is 2-3% higher than that of the simple method (K-ELM).
|
210
|
A filter feature selection method based on the Maximal Information Coefficient and Gram-Schmidt Orthogonalization for biomedical data mining. Comput Biol Med 2017; 89:264-274. [PMID: 28850898 DOI: 10.1016/j.compbiomed.2017.08.021] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 03/29/2017] [Revised: 08/19/2017] [Accepted: 08/20/2017] [Indexed: 12/22/2022]
|
211
|
|
212
|
Wang Y, Feng L, Li Y. Two-step based feature selection method for filtering redundant information. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2017. [DOI: 10.3233/jifs-161541] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/15/2022]
Affiliation(s)
- Youwei Wang
  - School of Information, Central University of Finance and Economics, Beijing, China
- Lizhou Feng
  - School of Science and Engineering, Tianjin University of Finance and Economics, Tianjin, China
- Yang Li
  - School of Information, Central University of Finance and Economics, Beijing, China
|
213
|
Fusion of auditory inspired amplitude modulation spectrum and cepstral features for whispered and normal speech speaker verification. COMPUT SPEECH LANG 2017. [DOI: 10.1016/j.csl.2017.04.004] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 11/23/2022]
|
214
|
|
215
|
Roh SH, Hryc CF, Jeong HH, Fei X, Jakana J, Lorimer GH, Chiu W. Subunit conformational variation within individual GroEL oligomers resolved by Cryo-EM. Proc Natl Acad Sci U S A 2017; 114:8259-8264. [PMID: 28710336 PMCID: PMC5547627 DOI: 10.1073/pnas.1704725114] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Indexed: 11/18/2022] [Open Access]
Abstract
Single-particle electron cryo-microscopy (cryo-EM) is an emerging tool for resolving structures of conformationally heterogeneous particles; however, each structure is derived from an average of many particles with presumed identical conformations. We used a 3.5-Å cryo-EM reconstruction with imposed D7 symmetry to further analyze structural heterogeneity among chemically identical subunits in each GroEL oligomer. Focused classification of the 14 subunits in each oligomer revealed three dominant classes of subunit conformations. Each class resembled a distinct GroEL crystal structure in the Protein Data Bank. The conformational differences stem from the orientations of the apical domain. We mapped each conformation class to its subunit locations within each GroEL oligomer in our dataset. The spatial distributions of each conformation class differed among oligomers, and most oligomers contained 10-12 subunits of the three dominant conformation classes. Adjacent subunits were found to more likely assume the same conformation class, suggesting correlation among subunits in the oligomer. This study demonstrates the utility of cryo-EM in revealing structure dynamics within a single protein oligomer.
Affiliation(s)
- Soung-Hun Roh
  - National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030
- Corey F Hryc
  - National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030
  - Graduate Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, TX 77030
- Hyun-Hwan Jeong
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030
- Xue Fei
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742
- Joanita Jakana
  - National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030
- George H Lorimer
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742
- Wah Chiu
  - National Center for Macromolecular Imaging, Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030
  - Graduate Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, TX 77030
|
216
|
Sun K, Huang SH, Wong DSH, Jang SS. Design and Application of a Variable Selection Method for Multilayer Perceptron Neural Network With LASSO. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2017; 28:1386-1396. [PMID: 28113826 DOI: 10.1109/tnnls.2016.2542866] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 05/11/2023]
Abstract
In this paper, a novel variable selection method for neural networks that can be applied to describe nonlinear industrial processes is developed. The proposed method is an iterative two-step approach. First, a multilayer perceptron is constructed. Second, the least absolute shrinkage and selection operator is introduced to select the input variables that are truly essential to the model, with the shrinkage parameter determined using a cross-validation method. Then, variables whose input weights are zero are eliminated from the data set. The algorithm is repeated until there is no improvement in the model accuracy. Simulation examples as well as an industrial application in a crude distillation unit are used to validate the proposed algorithm. The results show that the proposed approach can be used to construct a more compressed model, which incorporates a higher level of prediction accuracy than other existing methods.
|
217
|
Imbiriba T, Bermudez JCM, Richard C. Band Selection for Nonlinear Unmixing of Hyperspectral Images as a Maximal Clique Problem. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2017; 26:2179-2191. [PMID: 28278463 DOI: 10.1109/tip.2017.2676344] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/06/2023]
Abstract
Kernel-based nonlinear mixing models have been applied to unmix spectral information of hyperspectral images when the type of mixing occurring in the scene is too complex or unknown. Such methods, however, usually require the inversion of matrices of sizes equal to the number of spectral bands. Reducing the computational load of these methods remains a challenge in large-scale applications. This paper proposes a centralized band selection (BS) method for supervised unmixing in the reproducing kernel Hilbert space. It is based upon the coherence criterion, which sets the largest value allowed for correlations between the basis kernel functions characterizing the selected bands in the unmixing model. We show that the proposed BS approach is equivalent to solving a maximum clique problem, i.e., searching for the biggest complete subgraph in a graph. Furthermore, we devise a strategy for selecting the coherence threshold and the Gaussian kernel bandwidth using coherence bounds for linearly independent bases. Simulation results illustrate the efficiency of the proposed method.
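The reduction to a maximum clique problem described in this abstract can be illustrated with a small sketch. It assumes each band is represented by a feature vector, uses a Gaussian kernel as the coherence measure, joins two bands by an edge when their kernel value does not exceed a threshold `mu0`, and finds a largest clique with a plain Bron-Kerbosch search. The toy data, the names and the exhaustive search are illustrative assumptions, not the authors' algorithm.

```python
import math

def gaussian_kernel(u, v, bandwidth):
    """Gaussian kernel between two band descriptors (hypothetical vectors)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def coherence_graph(bands, bandwidth, mu0):
    """Connect bands i and j when their kernel value is at most mu0."""
    n = len(bands)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if gaussian_kernel(bands[i], bands[j], bandwidth) <= mu0:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency

def maximum_clique(adjacency):
    """Largest complete subgraph via Bron-Kerbosch without pivoting.

    Exponential in the worst case, so only suitable for small graphs.
    """
    best = []

    def expand(clique, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            # The current clique is maximal; keep it if it is the largest so far.
            if len(clique) > len(best):
                best = list(clique)
            return
        for v in list(candidates):
            expand(clique + [v], candidates & adjacency[v], excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand([], set(adjacency), set())
    return sorted(best)

# Four toy one-dimensional band descriptors; bands 0 and 1 are nearly identical.
toy_bands = [[0.0], [0.1], [1.0], [2.0]]
graph = coherence_graph(toy_bands, bandwidth=0.5, mu0=0.5)
selected = maximum_clique(graph)
```

In this toy case bands 0 and 1 are too coherent to be selected together, so any largest clique keeps only one of them alongside bands 2 and 3.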
|
218
|
Garcia-Chimeno Y, Garcia-Zapirain B, Gomez-Beldarrain M, Fernandez-Ruanova B, Garcia-Monco JC. Automatic migraine classification via feature selection committee and machine learning techniques over imaging and questionnaire data. BMC Med Inform Decis Mak 2017; 17:38. [PMID: 28407777 PMCID: PMC5390380 DOI: 10.1186/s12911-017-0434-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 08/13/2016] [Accepted: 03/29/2017] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND Feature selection methods are commonly used to identify subsets of relevant features to facilitate the construction of models for classification, yet little is known about how feature selection methods perform in diffusion tensor images (DTIs). In this study, feature selection and machine learning classification methods were tested for the purpose of automating diagnosis of migraines using both DTIs and questionnaire answers related to emotion and cognition, factors that influence pain perception.
METHODS We selected 52 adult subjects for the study, divided into three groups: a control group (15), subjects with sporadic migraine (19) and subjects with chronic migraine and medication overuse (18). These subjects underwent magnetic resonance with diffusion tensor imaging to examine white matter pathway integrity of the regions of interest involved in pain and emotion. The tests also gathered data about pathology. The DTI images and test results were then introduced into feature selection algorithms (Gradient Tree Boosting, L1-based, Random Forest and Univariate) to reduce the features of the first dataset, and into classification algorithms (Support Vector Machine (SVM), Boosting (AdaBoost) and Naive Bayes) to classify the migraine group. Moreover, we implemented a committee method to improve the classification accuracy based on feature selection algorithms.
RESULTS When classifying the migraine group, the greatest improvements in accuracy were made using the proposed committee-based feature selection method. Using this approach, the accuracy of classification into three types improved from 67 to 93% when using the Naive Bayes classifier, from 90 to 95% with the support vector machine classifier, and from 93 to 94% with boosting. The features that were determined to be most useful for classification were related to pain, analgesics and the left uncinate brain (connected with pain and emotions).
CONCLUSIONS The proposed feature selection committee method improved the performance of migraine diagnosis classifiers compared to individual feature selection methods, producing a robust system that achieved over 90% accuracy in all classifiers. The results suggest that the proposed methods can be used to support specialists in the classification of migraines in patients undergoing magnetic resonance imaging.
Affiliation(s)
- Yolanda Garcia-Chimeno
  - DeustoTech - Fundacion Deusto, Avda. Universidades, 24, Bilbao, 48007 Spain
  - Facultad Ingenieria, Universidad de Deusto, Avda. Universidades, 24, Bilbao, 48007 Spain
- Begonya Garcia-Zapirain
  - DeustoTech - Fundacion Deusto, Avda. Universidades, 24, Bilbao, 48007 Spain
  - Facultad Ingenieria, Universidad de Deusto, Avda. Universidades, 24, Bilbao, 48007 Spain
- Marian Gomez-Beldarrain
  - Service of Neurology Hospital de Galdakao-Usansolo, Barrio Labeaga, S/N, Galdakao, 48960 Spain
- Juan Carlos Garcia-Monco
  - Research and Innovation Department, Magnetic Resonance Imaging Unit, OSATEK, Alameda Urquijo, 36, Bilbao, 48011 Spain
|
219
|
Wang S, Wei J, Yang Z. Discrimination Structure Complementarity-Based Feature Selection. Comput Intell 2017. [DOI: 10.1111/coin.12118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
Affiliation(s)
- Shuqin Wang
  - College of Computer and Information Engineering, Tianjin Normal University, Tianjin, China
- Jinmao Wei
  - College of Computer and Control Engineering, Nankai University, Tianjin, China
  - College of Software, Nankai University, Tianjin, China
- Zhenglu Yang
  - College of Computer and Control Engineering, Nankai University, Tianjin, China
  - College of Software, Nankai University, Tianjin, China
|
220
|
A new validity index of feature subset for evaluating the dimensionality reduction algorithms. Knowl Based Syst 2017. [DOI: 10.1016/j.knosys.2017.01.017] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/22/2022]
|
221
|
|
222
|
Pascoal C, Oliveira MR, Pacheco A, Valadas R. Theoretical evaluation of feature selection methods based on mutual information. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.11.047] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Indexed: 10/20/2022]
|
223
|
Wang Y, Veluvolu KC. Evolutionary Algorithm Based Feature Optimization for Multi-Channel EEG Classification. Front Neurosci 2017; 11:28. [PMID: 28203141 PMCID: PMC5285364 DOI: 10.3389/fnins.2017.00028] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 09/01/2016] [Accepted: 01/16/2017] [Indexed: 11/13/2022] [Open Access]
Abstract
Most BCI systems that rely on EEG signals employ Fourier based methods for time-frequency decomposition for feature extraction. The band-limited multiple Fourier linear combiner is well-suited for such band-limited signals due to its real-time applicability. Despite the improved performance of these techniques in two-channel settings, their application to multiple-channel EEG is challenging and not straightforward. As more channels are available, a spatial filter will be required to eliminate the noise and preserve the required useful information. Moreover, multiple-channel EEG also adds high dimensionality to the frequency feature space. Feature selection will be required to stabilize the performance of the classifier. In this paper, we develop a new method based on an Evolutionary Algorithm (EA) to solve these two problems simultaneously. The real-valued EA encodes both the spatial filter estimates and the feature selection into its solution and optimizes it with respect to the classification error. Three Fourier based designs are tested in this paper. Our results show that the combination of the Fourier based method with covariance matrix adaptation evolution strategy (CMA-ES) has the best overall performance.
Affiliation(s)
- Yubo Wang
  - School of Life Science and Technology, Xidian University, Xi'an, China
  - School of Electronics Engineering, College of IT Engineering, Kyungpook National University, Daegu, South Korea
- Kalyana C Veluvolu
  - School of Electronics Engineering, College of IT Engineering, Kyungpook National University, Daegu, South Korea
|
224
|
Affiliation(s)
- Hedieh Sajedi
  - School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
|
225
|
Yuan Y, Zheng X, Lu X. Discovering Diverse Subset for Unsupervised Hyperspectral Band Selection. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2017; 26:51-64. [PMID: 28113180 DOI: 10.1109/tip.2016.2617462] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Indexed: 06/06/2023]
Abstract
Band selection, as a special case of the feature selection problem, tries to remove redundant bands and select a few important bands to represent the whole image cube. This has attracted much attention, since the selected bands provide discriminative information for further applications and reduce the computational burden. Though hyperspectral band selection has gained rapid development in recent years, it is still a challenging task because of the following requirements: 1) an effective model can capture the underlying relations between different high-dimensional spectral bands; 2) a fast and robust measure function can adapt to general hyperspectral tasks; and 3) an efficient search strategy can find the desired selected bands in reasonable computational time. To satisfy these requirements, a multigraph determinantal point process (MDPP) model is proposed to capture the full structure between different bands and efficiently find the optimal band subset in extensive hyperspectral applications. There are three main contributions: 1) graphical model is naturally transferred to address band selection problem by the proposed MDPP; 2) multiple graphs are designed to capture the intrinsic relationships between hyperspectral bands; and 3) mixture DPP is proposed to model the multiple dependencies in the proposed multiple graphs, and offers an efficient search strategy to select the optimal bands. To verify the superiority of the proposed method, experiments have been conducted on three hyperspectral applications, such as hyperspectral classification, anomaly detection, and target detection. The reliability of the proposed method in generic hyperspectral tasks is experimentally proved on four real-world hyperspectral data sets.
|
226
|
Yuan H, Xu H, Qian Y, Li Y. Make your travel smarter: Summarizing urban tourism information from massive blog data. INTERNATIONAL JOURNAL OF INFORMATION MANAGEMENT 2016. [DOI: 10.1016/j.ijinfomgt.2016.02.009] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Indexed: 11/17/2022]
|
227
|
Vyas R, Bapat S, Jain E, Karthikeyan M, Tambe S, Kulkarni BD. Building and analysis of protein-protein interactions related to diabetes mellitus using support vector machine, biomedical text mining and network analysis. Comput Biol Chem 2016; 65:37-44. [DOI: 10.1016/j.compbiolchem.2016.09.011] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/16/2015] [Revised: 09/07/2016] [Accepted: 09/19/2016] [Indexed: 01/06/2023]
|
228
|
Li C, Liu Q, Dong W, Wei F, Zhang X, Yang L. Max-Margin-Based Discriminative Feature Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2016; 27:2768-2775. [PMID: 26863680 DOI: 10.1109/tnnls.2016.2520099] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
In this brief, we propose a new max-margin-based discriminative feature learning method. In particular, we aim at learning a low-dimensional feature representation, so as to maximize the global margin of the data and make the samples from the same class as close as possible. In order to enhance the robustness to noise, we leverage a regularization term to make the transformation matrix sparse in rows. In addition, we further learn and leverage the correlations among multiple categories for assisting in learning discriminative features. The experimental results demonstrate the power of the proposed method against the related state-of-the-art methods.
|
229
|
Nguyen HB, Xue B, Andreae P. Mutual information for feature selection: estimation or counting? EVOLUTIONARY INTELLIGENCE 2016. [DOI: 10.1007/s12065-016-0143-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 10/21/2022]
|
230
|
Affiliation(s)
- JinXing Che
  - School of Mathematics and Statistics, Xidian University, Xi'an, People's Republic of China
- YouLong Yang
  - School of Mathematics and Statistics, Xidian University, Xi'an, People's Republic of China
|
231
|
Thilaga M, Vijayalakshmi R, Nadarajan R, Nandagopal D. A novel pattern mining approach for identifying cognitive activity in EEG based functional brain networks. J Integr Neurosci 2016; 15:223-45. [PMID: 27401999 DOI: 10.1142/s0219635216500151] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022] [Open Access]
Abstract
The complex nature of neuronal interactions of the human brain has posed many challenges to the research community. To explore the underlying mechanisms of neuronal activity of cohesive brain regions during different cognitive activities, many innovative mathematical and computational models are required. This paper presents a novel Common Functional Pattern Mining approach to demonstrate the similar patterns of interactions due to common behavior of certain brain regions. The electrode sites of EEG-based functional brain network are modeled as a set of transactions and node-based complex network measures as itemsets. These itemsets are transformed into a graph data structure called Functional Pattern Graph. By mining this Functional Pattern Graph, the common functional patterns due to specific brain functioning can be identified. The empirical analyses show the efficiency of the proposed approach in identifying the extent to which the electrode sites (transactions) are similar during various cognitive load states.
Affiliation(s)
- M Thilaga
  - Department of Applied Mathematics and Computational Sciences, Computational Neuroscience Laboratory, PSG College of Technology, Coimbatore 641004, Tamil Nadu, India
- R Vijayalakshmi
  - Department of Applied Mathematics and Computational Sciences, Computational Neuroscience Laboratory, PSG College of Technology, Coimbatore 641004, Tamil Nadu, India
- R Nadarajan
  - Department of Applied Mathematics and Computational Sciences, Computational Neuroscience Laboratory, PSG College of Technology, Coimbatore 641004, Tamil Nadu, India
- D Nandagopal
  - Cognitive NeuroEngineering Laboratory, Division of Information Technology, Engineering and the Environment, University of South Australia, Adelaide, South Australia 5001, Australia
|
232
|
Li Z, Lu W, Sun Z, Xing W. A parallel feature selection method study for text classification. Neural Comput Appl 2016. [DOI: 10.1007/s00521-016-2351-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 11/30/2022]
|
233
|
Zhao G, Liu S. Estimation of Discriminative Feature Subset Using Community Modularity. Sci Rep 2016; 6:25040. [PMID: 27121171 PMCID: PMC4848544 DOI: 10.1038/srep25040] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/24/2015] [Accepted: 04/11/2016] [Indexed: 11/13/2022] [Open Access]
Abstract
Feature selection (FS) is an important preprocessing step in machine learning and data mining. In this paper, a new feature subset evaluation method is proposed by constructing a sample graph (SG) in different k-features and applying community modularity to select highly informative features as a group. However, these features may not be relevant as an individual. Furthermore, relevant in-dependency rather than irrelevant redundancy among the selected features is effectively measured with the community modularity Q value of the sample graph in the k-features. An efficient FS method called k-features sample graph feature selection is presented. A key property of this approach is that the discriminative cues of a feature subset with the maximum relevant in-dependency among features can be accurately determined. This community modularity-based method is then verified with the theory of k-means cluster. Compared with other state-of-the-art methods, the proposed approach is more effective, as verified by the results of several experiments.
Affiliation(s)
- Guodong Zhao
  - School of Mathematics and Physics, Shanghai Dian Ji University, Shanghai 201306, P. R. China
- Sanming Liu
  - School of Mathematics and Physics, Shanghai Dian Ji University, Shanghai 201306, P. R. China
|
234
|
Tutorial on practical tips of the most influential data preprocessing algorithms in data mining. Knowl Based Syst 2016. [DOI: 10.1016/j.knosys.2015.12.006] [Citation(s) in RCA: 146] [Impact Index Per Article: 16.2] [Indexed: 11/19/2022]
|
235
|
Wang Y, Ji J, Liang P. Feature selection of fMRI data based on normalized mutual information and fisher discriminant ratio. JOURNAL OF X-RAY SCIENCE AND TECHNOLOGY 2016; 24:467-475. [PMID: 27257882 DOI: 10.3233/xst-160565] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/05/2023]
Abstract
Pattern classification has been increasingly used in functional magnetic resonance imaging (fMRI) data analysis. However, classification performance is limited by the high dimensionality and noise of fMRI data. In this paper, a new feature selection method (named "NMI-F") was proposed by sequentially combining normalized mutual information (NMI) and the Fisher discriminant ratio. In NMI-F, normalized mutual information was first used to evaluate the relationships between features, and the Fisher discriminant ratio was then applied to calculate the importance of each feature involved. Two fMRI datasets (task-related and resting-state) were used to test the proposed method. Classification based on the NMI-F method could effectively differentiate brain cognitive and disease states, and NMI-F was superior to the other related methods. The results also have implications for future studies.
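The second stage described in the abstract above, scoring each feature with the Fisher discriminant ratio, can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the common two-class form (squared difference of class means divided by the sum of class variances), which the abstract does not spell out, and it omits the NMI stage entirely. The names `fisher_ratio` and `rank_features`, the 0/1 label encoding, and the row-per-sample layout are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Two-class Fisher discriminant ratio of a single feature.

    values: feature value per sample; labels: 0 or 1 per sample (assumed encoding).
    """
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    denom = pvariance(a) + pvariance(b)
    if denom == 0:
        # Both classes are constant: perfectly separated or identical.
        return float("inf") if mean(a) != mean(b) else 0.0
    return (mean(a) - mean(b)) ** 2 / denom


def rank_features(X, labels):
    """Return feature indices sorted from most to least discriminative."""
    n_features = len(X[0])
    scores = [fisher_ratio([row[j] for row in X], labels)
              for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)
```

A feature whose class means are far apart relative to its within-class spread receives a high score, so it would rank ahead of a feature whose class means coincide.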
Affiliation(s)
- Yanbin Wang
  - Beijing Municipal Key Laboratory of Multimedia and Intelligent Software Technology, College of Computer Science and Technology, Beijing University of Technology, Beijing, China
- Junzhong Ji
  - Beijing Municipal Key Laboratory of Multimedia and Intelligent Software Technology, College of Computer Science and Technology, Beijing University of Technology, Beijing, China
- Peipeng Liang
  - Department of Radiology, Xuanwu Hospital, Capital Medical University, Beijing, China
  - Beijing Key Laboratory of Magnetic Resonance Imaging and Brain Informatics, Beijing, China
|
236
|
Liébana-Cabanillas F, Herrera L, Guillén A. Variable selection for payment in social networks: Introducing the Hy-index. COMPUTERS IN HUMAN BEHAVIOR 2016. [DOI: 10.1016/j.chb.2015.10.022] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 11/30/2022]
|
237
|
|
238
|
Feature Selection for Heart Rate Variability Based Biometric Recognition Using Genetic Algorithm. ADVANCES IN INTELLIGENT SYSTEMS AND COMPUTING 2016. [DOI: 10.1007/978-3-319-23036-8_8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/28/2022]
|
239
|
Coelho F, Braga AP, Verleysen M. A Mutual Information estimator for continuous and discrete variables applied to Feature Selection and Classification problems. INT J COMPUT INT SYS 2016. [DOI: 10.1080/18756891.2016.1204120] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 10/21/2022] Open
|
240
|
Ginsburg SB, Lee G, Ali S, Madabhushi A. Feature Importance in Nonlinear Embeddings (FINE): Applications in Digital Pathology. IEEE TRANSACTIONS ON MEDICAL IMAGING 2016; 35:76-88. [PMID: 26186772 DOI: 10.1109/tmi.2015.2456188] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 06/04/2023]
Abstract
Quantitative histomorphometry (QH) refers to the process of computationally modeling disease appearance on digital pathology images by extracting hundreds of image features and using them to predict disease presence or outcome. Since constructing a robust and interpretable classifier is challenging in a high dimensional feature space, dimensionality reduction (DR) is often implemented prior to classifier construction. However, when DR is performed it can be challenging to quantify the contribution of each of the original features to the final classification result. We have previously presented a method for scoring features based on their importance for classification on an embedding derived via principal components analysis (PCA). However, nonlinear DR involves the eigen-decomposition of a kernel matrix rather than the data itself, compounding the issue of classifier interpretability. In this paper we present feature importance in nonlinear embeddings (FINE), an extension of our PCA-based feature scoring method to kernel PCA (KPCA), as well as several NLDR algorithms that can be cast as variants of KPCA. FINE is applied to four digital pathology datasets to identify key QH features for predicting the risk of breast and prostate cancer recurrence. Measures of nuclear and glandular architecture and clusteredness were found to play an important role in predicting the likelihood of recurrence of both breast and prostate cancers. Compared to the t-test, Fisher score, and Gini index, FINE was able to identify a stable set of features that provide good classification accuracy on four publicly available datasets from the NIPS 2003 Feature Selection Challenge.
|
241
|
Thilaga M, Vijayalakshmi R, Nadarajan R, Nandagopal D, Cocks B, Archana C, Dahal N. A heuristic branch-and-bound based thresholding algorithm for unveiling cognitive activity from EEG data. Neurocomputing 2015. [DOI: 10.1016/j.neucom.2015.03.095] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
|
242
|
Mansoori EG, Shafiee KS. On fuzzy feature selection in designing fuzzy classifiers for high-dimensional data. EVOLVING SYSTEMS 2015. [DOI: 10.1007/s12530-015-9142-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
|
243
|
Han M, Ren W. Global mutual information-based feature selection approach using single-objective and multi-objective optimization. Neurocomputing 2015. [DOI: 10.1016/j.neucom.2015.06.016] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 10/23/2022]
|
244
|
Wei M, Chow TW, Chan RH. Heterogeneous feature subset selection using mutual information-based feature transformation. Neurocomputing 2015. [DOI: 10.1016/j.neucom.2015.05.053] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 11/15/2022]
|
245
|
|
246
|
Do TT, Zhou Y, Zheng H, Cheung NM, Koh D. Early melanoma diagnosis with mobile imaging. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2015; 2014:6752-7. [PMID: 25571546 DOI: 10.1109/embc.2014.6945178] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Abstract
We investigate a mobile imaging system for early diagnosis of melanoma. Unlike previous work, we focus on smartphone-captured images and propose a detection system that runs entirely on the smartphone. Smartphone-captured images taken under loosely controlled conditions introduce new challenges for melanoma detection, while processing performed on the smartphone is subject to computation and memory constraints. To address these challenges, we propose to localize the skin lesion by combining fast skin detection with the fusion of two fast segmentation results. We propose new features that capture color variation and border irregularity, which are useful for smartphone-captured images. We also propose a new feature selection criterion to select a small set of good features for use in the final lightweight system. Our evaluation confirms the effectiveness of the proposed algorithms and features. In addition, we present our system prototype, which computes selected visual features from a user-captured skin lesion image and analyzes them to estimate the likelihood of malignancy, all on an off-the-shelf smartphone.
|
247
|
|
248
|
Guillén A, Herrera LJ, Pomares H, Rojas I, Liébana-Cabanillas F. Decision Support System to Determine Intention to Use Mobile Payment Systems on Social Networks: A Methodological Analysis. INT J INTELL SYST 2015. [DOI: 10.1002/int.21749] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/09/2022]
Affiliation(s)
- Alberto Guillén
  - CITIC-UGR, Research Centre on Information and Communication Technologies; University of Granada; Granada Spain
- Luis J. Herrera
  - CITIC-UGR, Research Centre on Information and Communication Technologies; University of Granada; Granada Spain
- Héctor Pomares
  - CITIC-UGR, Research Centre on Information and Communication Technologies; University of Granada; Granada Spain
- Ignacio Rojas
  - CITIC-UGR, Research Centre on Information and Communication Technologies; University of Granada; Granada Spain
- Francisco Liébana-Cabanillas
  - Marketing and Market Research Department, Faculty of Business and Economics, Campus Cartuja; University of Granada; Granada Spain
|
249
|
Ortuño FM, Valenzuela O, Prieto B, Saez-Lara MJ, Torres C, Pomares H, Rojas I. Comparing different machine learning and mathematical regression models to evaluate multiple sequence alignments. Neurocomputing 2015. [DOI: 10.1016/j.neucom.2015.01.080] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022]
|
250
|
Lafuente V, Herrera LJ, Pérez MDM, Val J, Negueruela I. Firmness prediction in Prunus persica 'Calrico' peaches by visible/short-wave near infrared spectroscopy and acoustic measurements using optimised linear and non-linear chemometric models. JOURNAL OF THE SCIENCE OF FOOD AND AGRICULTURE 2015; 95:2033-2040. [PMID: 25224468 DOI: 10.1002/jsfa.6916] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/23/2014] [Revised: 09/05/2014] [Accepted: 09/10/2014] [Indexed: 06/03/2023]
Abstract
BACKGROUND In this work, near infrared spectroscopy (NIR) and an acoustic measure (AWETA), two non-destructive methods, were applied to Prunus persica fruit 'Calrico' (n = 260) to predict Magness-Taylor (MT) firmness. METHODS Separate and combined use of these measures was evaluated and compared using partial least squares (PLS) and least squares support vector machine (LS-SVM) regression methods. In addition, a mutual-information-based variable selection method, which seeks the most significant variables for optimal accuracy of the regression models, was applied to a joint set of variables (NIR wavelengths and the AWETA measure). RESULTS The newly proposed combined NIR-AWETA model gave good values of the determination coefficient (R²) for the PLS and LS-SVM methods (0.77 and 0.78, respectively), improving the reliability of MT firmness prediction in comparison with separate NIR and AWETA predictions. The three variables selected by the variable selection method (the AWETA measure plus NIR wavelengths 675 and 697 nm) achieved R² values of 0.76 and 0.77 for PLS and LS-SVM, respectively. CONCLUSION These results indicate that the proposed mutual-information-based variable selection algorithm is a powerful tool for selecting the most relevant variables.
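For readers comparing the R² figures quoted in the abstract above, the sketch below shows one common way a determination coefficient is computed from observed and predicted values. It is an assumption that the study used this form; the abstract does not say, and some chemometrics work reports the squared correlation instead. The function name `r_squared` and its list inputs are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

Under this definition a value of 1 means the predictions match the observations exactly, and a value of 0 means the model does no better than predicting the mean firmness for every fruit.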
Affiliation(s)
- Victoria Lafuente
  - Consejo Superior de Investigaciones Científicas (CSIC), Nutrición Vegetal, Zaragoza, Spain
- Luis J Herrera
  - Departamento de Arquitectura y Tecnología de los computadores, Universidad de Granada, Granada, Spain
- Jesús Val
  - Estación Experimental de Aula Dei, CSIC, Plant Nutrition, Avda, Montañana 1005, Zaragoza, Spain
|