1
Setting up of a machine learning algorithm for the identification of severe liver fibrosis profile in the general US population cohort. Int J Med Inform 2023; 170:104932. [PMID: 36459836 DOI: 10.1016/j.ijmedinf.2022.104932] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/23/2022] [Revised: 11/19/2022] [Accepted: 11/21/2022] [Indexed: 11/27/2022]
Abstract
BACKGROUND The progress of digital transformation in clinical practice opens the door to shifting the current clinical pathway for liver disease diagnosis from a late-stage to an early-stage approach. Early diagnosis of liver fibrosis can prevent progression of the disease and decrease liver-related morbidity and mortality. Here we developed a machine learning (ML) algorithm based on standard parameters that can identify liver fibrosis in the general US population. MATERIALS AND METHODS Starting from a public database (National Health and Nutrition Examination Survey, NHANES) representative of the American population, with 7265 eligible subjects (control population n = 6828, with FibroScan values E < 9.7 kPa; target population n = 437, with FibroScan values E ≥ 9.7 kPa), we set up an SVM algorithm able to identify individuals with liver fibrosis in the general US population. The algorithm setup involved removal of missing data and a sampling optimization step to manage the class imbalance (only ∼5 % of the dataset belongs to the target population). RESULTS For feature selection, we performed an unbiased analysis, starting from 33 clinical, anthropometric, and biochemical parameters regardless of their previous application as biomarkers of liver disease. Through PCA, we identified the 26 most significant features and then used them to select a sampling method for the SVM algorithm. The best sampling technique for managing the data imbalance was oversampling with SMOTE-NC. For final model validation, we used a subset of 300 individuals (150 with liver fibrosis and 150 controls) withheld from the main dataset prior to sampling. Performances were evaluated over multiple independent runs. CONCLUSIONS We provide proof of concept of an ML clinical decision support tool for liver fibrosis diagnosis in the general US population. Although the presented ML model is at this stage only a prototype, in the future it might be further developed and potentially applied to broad screening programs for liver fibrosis.
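As a minimal illustration of the pipeline the abstract describes (SMOTE-NC oversampling of the minority class followed by an SVM classifier), the following Python sketch uses scikit-learn and imbalanced-learn on synthetic stand-in data; the feature layout, class prevalence, and hyperparameters are placeholder assumptions, not the study's actual configuration.

import numpy as np
from imblearn.over_sampling import SMOTENC
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)

# Toy stand-in for an NHANES-style table: 24 continuous and 2 categorical features,
# with roughly 5% positives to mimic the fibrosis class imbalance.
n = 4000
X_cont = rng.normal(size=(n, 24))
X_cat = rng.integers(0, 3, size=(n, 2))
X = np.hstack([X_cont, X_cat])
y = (X_cont[:, 0] + 0.5 * X_cont[:, 1] + rng.normal(scale=1.0, size=n) > 2.4).astype(int)

# Hold out a validation set before any resampling, as the abstract describes.
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)

# SMOTE-NC oversamples the minority class while treating columns 24-25 as categorical.
smote = SMOTENC(categorical_features=[24, 25], random_state=0)
X_res, y_res = smote.fit_resample(X_tr, y_tr)

scaler = StandardScaler().fit(X_res)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(scaler.transform(X_res), y_res)
print(classification_report(y_val, clf.predict(scaler.transform(X_val))))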
2
Garg N, Choudhry MS, Bodade RM. A review on Alzheimer's disease classification from normal controls and mild cognitive impairment using structural MR images. J Neurosci Methods 2023; 384:109745. [PMID: 36395961 DOI: 10.1016/j.jneumeth.2022.109745] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/04/2022] [Revised: 10/04/2022] [Accepted: 11/11/2022] [Indexed: 11/16/2022]
Abstract
Alzheimer's disease (AD) is an irreversible neurodegenerative brain disorder that degrades memory and cognitive ability in elderly people. The main cause of memory loss and reduced cognitive ability is the structural change in the brain that results from neuronal loss. These structural changes are most conspicuous in the hippocampus, cortex, and grey matter and can be assessed using neuroimaging techniques such as Positron Emission Tomography (PET), structural Magnetic Resonance Imaging (MRI), and functional MRI (fMRI). Among these, structural MRI has emerged as the preferred technique because it offers the best soft-tissue contrast and high spatial resolution, which are important for AD detection. Currently, researchers focus on predicting the conversion of Mild Cognitive Impairment (MCI) into AD. MCI represents the transition state between the cognitive changes expected with normal aging and Alzheimer's disease. Not every MCI patient progresses to Alzheimer's disease: MCI can remain stable (sMCI, patients called non-converters) or progress (pMCI, patients diagnosed as MCI converters). This paper discusses the prognosis of MCI-to-AD conversion and presents a review of structural MRI-based studies for AD detection. An AD detection framework includes feature extraction, feature selection, and classification. This paper reviews studies on AD detection based on different feature extraction methods and machine learning algorithms for classification. The performance of various feature extraction methods is compared, and wavelet transform-based feature extraction is observed to give promising results for AD classification. The present study indicates that researchers are successful in classifying AD versus Normal Controls (NrmC), but considerable work remains for MCI/NrmC and MCI/AD classification, which would help detect AD at an early stage.
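As a toy sketch of the wavelet-based pipeline the review favours (wavelet decomposition for feature extraction, followed by feature reduction and a classifier), the snippet below applies a 2-D discrete wavelet transform to synthetic image slices with PyWavelets and classifies with an SVM; the images, wavelet choice, and per-sub-band statistics are illustrative assumptions rather than any reviewed study's settings.

import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)

def wavelet_features(img, wavelet="db4", level=3):
    """Multi-level 2-D DWT; keep simple statistics of each sub-band as features."""
    coeffs = pywt.wavedec2(img, wavelet=wavelet, level=level)
    feats = []
    # coeffs[0] is the approximation band; the rest are (cH, cV, cD) detail tuples.
    for band in [coeffs[0]] + [b for detail in coeffs[1:] for b in detail]:
        feats += [band.mean(), band.std(), np.abs(band).sum()]
    return np.array(feats)

# 60 synthetic 64x64 "slices"; half get a slightly altered texture as a toy patient group.
imgs = rng.normal(size=(60, 64, 64))
imgs[30:] += 0.3 * rng.normal(size=(30, 64, 64)).cumsum(axis=1) / 8.0
y = np.r_[np.zeros(30), np.ones(30)]

X = np.stack([wavelet_features(im) for im in imgs])
clf = make_pipeline(StandardScaler(), PCA(n_components=10), SVC(kernel="rbf"))
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())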
Affiliation(s)
- Neha Garg
- Delhi Technological University, Department of Electronics and Communication, Delhi 110042, India.
- Mahipal Singh Choudhry
- Delhi Technological University, Department of Electronics and Communication, Delhi 110042, India.
- Rajesh M Bodade
- Military College of Telecommunication Engineering (MCTE), Mhow, Indore 453441, Madhya Pradesh, India.
3
4
WR-SVM Model Based on the Margin Radius Approach for Solving the Minimum Enclosing Ball Problem in Support Vector Machine Classification. Appl Sci (Basel) 2021. [DOI: 10.3390/app11104657] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 01/23/2023]
Abstract
The generalization error of a conventional support vector machine (SVM) depends on the ratio of two factors: the radius and the margin. The traditional SVM aims to maximize the margin but ignores minimization of the radius, which limits the overall performance of the classifier. Different approaches have been developed to achieve a trade-off between the margin and the radius, but their computational cost is high because they require matrix transformations. Furthermore, a conventional SVM seeks the best hyperplane between classes, and thanks to kernel tricks it is applied to many non-linear and complex problems. Setting the best hyperplane between classes is not sufficient on its own; each class should also be bounded within a limited region to enhance the performance of the SVM classifier. The region enclosing a class is called its Minimum Enclosing Ball (MEB), and estimating it is one of the emerging problems in SVM research. A robust solution is therefore needed to improve the performance of the conventional SVM and overcome these issues. In this study, a novel weighted-radius SVM (WR-SVM) is proposed to determine tighter bounds on the MEB. The proposed solution uses a weighted mean to find tighter bounds on the radius, which decreases the size of the MEB. Experiments were conducted on nine benchmark datasets and one synthetic dataset to demonstrate the effectiveness of the proposed model. The experimental results reveal that WR-SVM performs significantly better than the conventional SVM classifier. Furthermore, the results are compared with F-SVM and the traditional SVM in terms of classification accuracy to demonstrate the significance of the proposed WR-SVM.
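For intuition only, the snippet below computes the two quantities the abstract refers to for a linear SVM: the margin (2/||w||) and a centroid-based enclosing-ball radius. The weighted-mean refinement that defines WR-SVM itself is only hinted at through the optional weights argument and is not reproduced here; all data and settings are toy assumptions.

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=2, cluster_std=1.5, random_state=0)
clf = SVC(kernel="linear", C=1.0).fit(X, y)

w = clf.coef_.ravel()
margin = 2.0 / np.linalg.norm(w)              # geometric margin of the linear SVM

def enclosing_radius(points, weights=None):
    # Centroid-based radius; passing weights would approximate the "weighted mean" idea.
    center = np.average(points, axis=0, weights=weights)
    return np.linalg.norm(points - center, axis=1).max()

radius = max(enclosing_radius(X[y == c]) for c in np.unique(y))
print(f"margin={margin:.3f}  radius={radius:.3f}  radius/margin={radius / margin:.3f}")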
5
Omondiagbe DA, Veeramani S, Sidhu AS. Machine Learning Classification Techniques for Breast Cancer Diagnosis. IOP Conf Ser Mater Sci Eng 2019. [DOI: 10.1088/1757-899x/495/1/012033] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Indexed: 11/12/2022]
6
Ghosh L, Konar A, Rakshit P, Nagar AK. Hemodynamic Analysis for Cognitive Load Assessment and Classification in Motor Learning Tasks Using Type-2 Fuzzy Sets. IEEE Trans Emerg Top Comput Intell 2019. [DOI: 10.1109/tetci.2018.2868323] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 11/08/2022]
7
Arcadia CE, Tann H, Dombroski A, Ferguson K, Chen SL, Kim E, Rose C, Rubenstein BM, Reda S, Rosenstein JK. Parallelized Linear Classification with Volumetric Chemical Perceptrons. 2018 IEEE International Conference on Rebooting Computing (ICRC). [DOI: 10.1109/icrc.2018.8638627] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 09/01/2023]
8
Two-Step Urban Water Index (TSUWI): A New Technique for High-Resolution Mapping of Urban Surface Water. Remote Sens 2018. [DOI: 10.3390/rs10111704] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 11/17/2022]
Abstract
Urban surface water mapping is essential for studying the role of surface water in urban ecosystems and local microclimates. However, fast and accurate extraction of urban water remains a great challenge due to the limitations of conventional water indices and the presence of shadows. We therefore propose a new urban water mapping technique named the Two-Step Urban Water Index (TSUWI), which combines an Urban Water Index (UWI) and an Urban Shadow Index (USI). These two sub-indices were established by spectral analysis and linear Support Vector Machine (SVM) training on pure pixels from eight training sites across China. The performance of TSUWI was compared with that of the Normalized Difference Water Index (NDWI), the High Resolution Water Index (HRWI), and an SVM classifier at twelve test sites. The results show that the method consistently achieved good performance, with a mean Kappa Coefficient (KC) of 0.97 and a mean total error (TE) of 2.28%. Overall, the classification accuracy of TSUWI was significantly higher than that of the NDWI, HRWI, and SVM (p-value < 0.01). At most test sites, TSUWI decreased the TE by more than 45% compared to NDWI and HRWI, and by more than 15% compared to SVM. In addition, both UWI and USI were shown to have more stable optimal thresholds that are close to 0, and they maintain good performance near those thresholds. TSUWI can therefore be used as a simple yet robust method for urban water mapping with high accuracy.
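A hedged sketch of the index-construction step described above: a linear SVM trained on band values of pure water and non-water pixels yields a linear combination of bands whose sign can serve as a water index with a natural threshold near 0. The band count, reflectance values, and SVM settings below are invented placeholders; TSUWI's actual band combinations and coefficients come from the paper's training sites.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Toy 4-band reflectance samples for pure water (label 1) and non-water (label 0) pixels.
water = rng.normal(loc=[0.04, 0.06, 0.05, 0.02], scale=0.01, size=(500, 4))
land = rng.normal(loc=[0.10, 0.12, 0.20, 0.30], scale=0.04, size=(500, 4))
X = np.vstack([water, land])
y = np.r_[np.ones(500), np.zeros(500)]

svm = LinearSVC(C=1.0).fit(X, y)

def urban_water_index(pixels):
    # Linear combination of bands learned by the SVM; values > 0 are mapped to "water".
    return pixels @ svm.coef_.ravel() + svm.intercept_[0]

test_pixels = np.vstack([water[:3], land[:3]])
print(urban_water_index(test_pixels) > 0)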
9
Wu O, Mao X, Hu W. Iteratively Divide-and-Conquer Learning for Nonlinear Classification and Ranking. ACM Trans Intell Syst Technol 2018. [DOI: 10.1145/3122802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Abstract
Nonlinear classifiers (e.g., kernel support vector machines (SVMs)) are effective for nonlinear data classification, but they are usually prohibitively expensive on large nonlinear datasets. Ensembles of linear classifiers have been proposed to address this inefficiency; this is known as the ensemble-of-linear-classifiers problem for nonlinear data. In this article, a new iterative learning approach is introduced that involves two steps at each iteration: partitioning the data into clusters according to Gaussian mixture models with local consistency, and then training a basic classifier (a linear SVM) for each cluster. The two divide-and-conquer steps are combined into a graphical model. During training, each classifier is regarded as a task, and clustered multitask learning is employed to capture the relatedness among tasks and avoid overfitting in each task. In addition, two novel extensions of the proposed approach are introduced. First, the approach is extended to quality-aware web data classification, a problem in which web data vary in information quality; ignoring these variations leads to poor classification models, and the proposed approach can effectively integrate quality-aware factors into web data classification. Second, the approach is extended to listwise learning to rank, constructing an ensemble of linear ranking models, whereas most existing listwise ranking methods construct a single linear ranking model. Experimental results on benchmark datasets show that the approach outperforms state-of-the-art algorithms. For nonlinear classification, it also obtains classification performance comparable to kernel SVMs, with much higher prediction efficiency.
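A rough sketch of the divide-and-conquer step only (without the graphical-model coupling or the clustered multitask regularization described above): a Gaussian mixture partitions the data, one linear SVM is trained per component, and each test point is routed to the classifier of its most responsible component. The dataset and all settings are toy assumptions.

import numpy as np
from sklearn.datasets import make_moons
from sklearn.mixture import GaussianMixture
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=1000, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

K = 4
gmm = GaussianMixture(n_components=K, random_state=0).fit(X_tr)
assign = gmm.predict(X_tr)

experts = {}
for k in range(K):
    idx = assign == k
    if len(np.unique(y_tr[idx])) == 2:          # an SVM needs both classes present
        experts[k] = LinearSVC(C=1.0).fit(X_tr[idx], y_tr[idx])

def predict(X_new):
    ks = gmm.predict(X_new)
    out = np.empty(len(X_new), dtype=int)
    for i, (x, k) in enumerate(zip(X_new, ks)):
        # Fall back to the global majority class if a component had only one class.
        out[i] = experts[k].predict(x[None])[0] if k in experts else np.bincount(y_tr).argmax()
    return out

print("accuracy:", (predict(X_te) == y_te).mean())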
Affiliation(s)
- Ou Wu
- Center for Applied Mathematics, Tianjin University, China
- Xue Mao
- NLPR, Institute of Automation, Chinese Academy of Sciences
- Weiming Hu
- NLPR, Institute of Automation, Chinese Academy of Sciences
10
Lee D, Park SH, Lee SG. Improving the Accuracy and Training Speed of Motor Imagery Brain-Computer Interfaces Using Wavelet-Based Combined Feature Vectors and Gaussian Mixture Model-Supervectors. Sensors 2017; 17:2282. [PMID: 28991172 PMCID: PMC5677306 DOI: 10.3390/s17102282] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 08/31/2017] [Revised: 09/30/2017] [Accepted: 10/04/2017] [Indexed: 12/02/2022]
Abstract
In this paper, we propose a set of wavelet-based combined feature vectors and a Gaussian mixture model (GMM)-supervector to enhance training speed and classification accuracy in motor imagery brain–computer interfaces. The proposed method is configured as follows: first, wavelet transforms are applied to extract the feature vectors for identification of motor imagery electroencephalography (EEG) and principal component analyses are used to reduce the dimensionality of the feature vectors and linearly combine them. Subsequently, the GMM universal background model is trained by the expectation–maximization (EM) algorithm to purify the training data and reduce its size. Finally, a purified and reduced GMM-supervector is used to train the support vector machine classifier. The performance of the proposed method was evaluated for three different motor imagery datasets in terms of accuracy, kappa, mutual information, and computation time, and compared with the state-of-the-art algorithms. The results from the study indicate that the proposed method achieves high accuracy with a small amount of training data compared with the state-of-the-art algorithms in motor imagery EEG classification.
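A simplified sketch of the GMM-supervector construction only (omitting the wavelet feature extraction, PCA combination, and EM-based purification steps described above): a universal background GMM is fitted on pooled frame-level features, its component means are adapted per trial, and the stacked means form the supervector fed to an SVM. All data, dimensions, and the relevance factor are illustrative assumptions.

import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, frames_per_trial, dim, K = 80, 50, 8, 4
y = np.r_[np.zeros(40), np.ones(40)].astype(int)

# Toy per-trial frame features; class 1 gets a small mean shift.
trials = rng.normal(size=(n_trials, frames_per_trial, dim))
trials[y == 1] += 0.3

ubm = GaussianMixture(n_components=K, covariance_type="diag", random_state=0)
ubm.fit(trials.reshape(-1, dim))            # universal background model on pooled frames

def supervector(frames, relevance=16.0):
    resp = ubm.predict_proba(frames)        # (frames, K) responsibilities
    nk = resp.sum(axis=0)                   # soft counts per component
    fk = resp.T @ frames                    # first-order statistics
    alpha = (nk / (nk + relevance))[:, None]          # MAP-style adaptation weight
    means = alpha * (fk / np.maximum(nk[:, None], 1e-8)) + (1 - alpha) * ubm.means_
    return means.ravel()                    # (K * dim,) supervector

X = np.stack([supervector(t) for t in trials])
print("CV accuracy:", cross_val_score(SVC(kernel="linear"), X, y, cv=5).mean())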
Affiliation(s)
- David Lee
- Department of Media Engineering, Catholic University of Korea, 43-1, Yeoggok 2-dong, Wonmmi-gu, Bucheon-si, Gyeonggi-do 14662, Korea.
- Sang-Hoon Park
- Department of Media Engineering, Catholic University of Korea, 43-1, Yeoggok 2-dong, Wonmmi-gu, Bucheon-si, Gyeonggi-do 14662, Korea.
- Sang-Goog Lee
- Department of Media Engineering, Catholic University of Korea, 43-1, Yeoggok 2-dong, Wonmmi-gu, Bucheon-si, Gyeonggi-do 14662, Korea.
11
HYDRA: Revealing heterogeneity of imaging and genetic patterns through a multiple max-margin discriminative analysis framework. Neuroimage 2016; 145:346-364. [PMID: 26923371 DOI: 10.1016/j.neuroimage.2016.02.041] [Citation(s) in RCA: 110] [Impact Index Per Article: 12.2] [Received: 07/31/2015] [Revised: 02/11/2016] [Accepted: 02/12/2016] [Indexed: 11/23/2022]
Abstract
Multivariate pattern analysis techniques have been increasingly used over the past decade to derive highly sensitive and specific biomarkers of diseases on an individual basis. The driving assumption behind the vast majority of the existing methodologies is that a single imaging pattern can distinguish between healthy and diseased populations, or between two subgroups of patients (e.g., progressors vs. non-progressors). This assumption effectively ignores the ample evidence for the heterogeneous nature of brain diseases. Neurodegenerative, neuropsychiatric and neurodevelopmental disorders are largely characterized by high clinical heterogeneity, which likely stems in part from underlying neuroanatomical heterogeneity of various pathologies. Detecting and characterizing heterogeneity may deepen our understanding of disease mechanisms and lead to patient-specific treatments. However, few approaches tackle disease subtype discovery in a principled machine learning framework. To address this challenge, we present a novel non-linear learning algorithm for simultaneous binary classification and subtype identification, termed HYDRA (Heterogeneity through Discriminative Analysis). Neuroanatomical subtypes are effectively captured by multiple linear hyperplanes, which form a convex polytope that separates two groups (e.g., healthy controls from pathologic samples); each face of this polytope effectively defines a disease subtype. We validated HYDRA on simulated and clinical data. In the latter case, we applied the proposed method independently to the imaging and genetic datasets of the Alzheimer's Disease Neuroimaging Initiative (ADNI 1) study. The imaging dataset consisted of T1-weighted volumetric magnetic resonance images of 123 AD patients and 177 controls. The genetic dataset consisted of single nucleotide polymorphism information of 103 AD patients and 139 controls. We identified three reproducible subtypes of atrophy in AD relative to controls: (1) diffuse and extensive atrophy, (2) precuneus and extensive temporal lobe atrophy, as well as some prefrontal atrophy, and (3) an atrophy pattern largely confined to the hippocampus and the medial temporal lobe. The genetic dataset yielded two subtypes of AD characterized mainly by the presence/absence of the apolipoprotein E (APOE) ε4 genotype, but also involving differential presence of risk alleles of CD2AP, SPON1 and LOC39095 SNPs that were associated with differences in the respective patterns of brain atrophy, especially in the precuneus. The results demonstrate the potential of the proposed approach to map disease heterogeneity in neuroimaging and genetic studies.
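A heavily simplified sketch of the polytope idea (the published HYDRA method uses weighted sample assignments and a dedicated max-margin solver, neither of which is reproduced here): K linear SVMs each separate controls from a subset of patients, patients are re-assigned to the hyperplane that scores them highest, and the face a patient ends up on defines its subtype. The data, dimensions, and number of faces are synthetic assumptions.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
controls = rng.normal(size=(200, 10))
# Two synthetic patient "subtypes": shifts along different feature groups.
pat_a = rng.normal(size=(100, 10)); pat_a[:, :3] += 2.0
pat_b = rng.normal(size=(100, 10)); pat_b[:, 5:8] += 2.0
patients = np.vstack([pat_a, pat_b])

K = 2
subtype = rng.integers(0, K, size=len(patients))       # random initial subtype assignment
for _ in range(10):
    svms = []
    for k in range(K):
        members = patients[subtype == k]
        if len(members) == 0:                           # guard against an empty subtype
            members = patients[rng.integers(0, len(patients), size=5)]
        Xk = np.vstack([controls, members])
        yk = np.r_[np.zeros(len(controls)), np.ones(len(members))]
        svms.append(LinearSVC(C=1.0).fit(Xk, yk))
    scores = np.column_stack([s.decision_function(patients) for s in svms])
    subtype = scores.argmax(axis=1)                     # re-assign to the best-scoring face

print("patients per recovered subtype:", np.bincount(subtype, minlength=K))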
12
Powers S, Hastie T, Tibshirani R. Customized training with an application to mass spectrometric imaging of cancer tissue. Ann Appl Stat 2016; 9:1709-1725. [PMID: 30370000 DOI: 10.1214/15-aoas866] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022]
Abstract
We introduce a simple, interpretable strategy for making predictions on test data when the features of the test data are available at the time of model fitting. Our proposal, customized training, clusters the data to find training points close to each test point and then fits an ℓ1-regularized model (lasso) separately in each training cluster. This approach combines the local adaptivity of k-nearest neighbors with the interpretability of the lasso. Although we use the lasso for model fitting, any supervised learning method can be applied to the customized training sets. We apply the method to a mass-spectrometric imaging dataset from an ongoing collaboration in gastric cancer detection, which demonstrates the power and interpretability of the technique. Our idea is simple but potentially useful in situations where the data have some underlying structure.
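A sketch of the customized-training recipe under simplifying assumptions: the test set is clustered, the training points closest to each test cluster form its customized training set, and an ℓ1-regularized (lasso-type) classifier is fitted per cluster. An l1-penalized logistic regression stands in for the lasso here, and the cluster count and neighbourhood size are arbitrary choices, not the paper's.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=30, n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X_te)   # cluster the test data
y_pred = np.empty(len(X_te), dtype=int)

for c in range(km.n_clusters):
    test_idx = np.where(km.labels_ == c)[0]
    # Customized training set: the training points closest to this cluster's centroid.
    d = np.linalg.norm(X_tr - km.cluster_centers_[c], axis=1)
    near = np.argsort(d)[:300]
    model = LogisticRegression(penalty="l1", solver="liblinear", C=1.0)
    model.fit(X_tr[near], y_tr[near])
    y_pred[test_idx] = model.predict(X_te[test_idx])

print("accuracy:", (y_pred == y_te).mean())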
Affiliation(s)
- Scott Powers
- Department of Statistics, Stanford University, 390 Serra Mall, Stanford, California 94305-4065, USA
- Trevor Hastie
- Department of Statistics, Stanford University, 390 Serra Mall, Stanford, California 94305-4065, USA
- Robert Tibshirani
- Department of Statistics, Stanford University, 390 Serra Mall, Stanford, California 94305-4065, USA
13
Eavani H, Hsieh MK, An Y, Erus G, Beason-Held L, Resnick S, Davatzikos C. Capturing heterogeneous group differences using mixture-of-experts: Application to a study of aging. Neuroimage 2016; 125:498-514. [PMID: 26525656 PMCID: PMC5460911 DOI: 10.1016/j.neuroimage.2015.10.045] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 06/22/2015] [Revised: 10/12/2015] [Accepted: 10/16/2015] [Indexed: 11/22/2022]
Abstract
In MRI studies, linear multi-variate methods are often employed to identify regions or connections that are affected by disease or normal aging. Such linear models inherently assume that there is a single, homogeneous abnormality pattern that is present in all affected individuals. While kernel-based methods can implicitly model a non-linear effect, and therefore the heterogeneity in the affected group, extracting and interpreting information about affected regions is difficult. In this paper, we present a method that explicitly models and captures heterogeneous patterns of change in the affected group relative to a reference group of controls. For this purpose, we use the Mixture-of-Experts (MOE) framework, which combines unsupervised modeling of mixtures of distributions with supervised learning of classifiers. MOE approximates the non-linear boundary between the two groups with a piece-wise linear boundary, thus allowing discovery of multiple patterns of group differences. In the case of patient/control comparisons, each such pattern aims to capture a different dimension of a disease, and hence to identify patient subgroups. We validated our model using multiple simulation scenarios and performance measures. We applied this method to resting state functional MRI data from the Baltimore Longitudinal Study of Aging, to investigate heterogeneous effects of aging on brain function in cognitively normal older adults (> 85 years) relative to a reference group of normal young to middle-aged adults (< 60 years). We found strong evidence for the presence of two subgroups of older adults, with similar age distributions in each subgroup, but different connectivity patterns associated with aging. While both older subgroups showed reduced functional connectivity in the Default Mode Network (DMN), increases in functional connectivity within the pre-frontal cortex as well as the bilateral insula were observed only for one of the two subgroups. Interestingly, the subgroup showing this increased connectivity (unlike the other subgroup) was cognitively similar at baseline to the young and middle-aged subjects in two of seven cognitive domains, and had a faster rate of cognitive decline in one of seven domains. These results suggest that older individuals whose baseline cognitive performance is comparable to that of younger individuals recruit their "cognitive reserve" later in life, to compensate for reduced connectivity in other brain regions.
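A loose sketch of the Mixture-of-Experts decomposition described above: a Gaussian mixture acts as the gate over input space, one linear expert is trained per component using the gate's responsibilities as sample weights, and test predictions blend the experts by the same responsibilities. The paper's joint optimization and its group-difference interpretation are not reproduced; the data and the number of experts are arbitrary assumptions.

import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=1500, noise=0.25, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

K = 3
gate = GaussianMixture(n_components=K, random_state=0).fit(X_tr)
resp_tr = gate.predict_proba(X_tr)          # soft gating responsibilities per sample

# One linear expert per mixture component, weighted by its responsibilities.
experts = [LogisticRegression().fit(X_tr, y_tr, sample_weight=resp_tr[:, k]) for k in range(K)]

# Blend expert probabilities with the gate's responsibilities on the test set.
resp_te = gate.predict_proba(X_te)
proba = sum(resp_te[:, k:k + 1] * experts[k].predict_proba(X_te) for k in range(K))
print("accuracy:", (proba.argmax(axis=1) == y_te).mean())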
Affiliation(s)
- Harini Eavani
- Center for Biomedical Image Computing and Analytics, University of Pennsylvania, USA.
- Meng Kang Hsieh
- Center for Biomedical Image Computing and Analytics, University of Pennsylvania, USA
- Yang An
- National Institute on Aging, Baltimore, USA
- Guray Erus
- Center for Biomedical Image Computing and Analytics, University of Pennsylvania, USA
- Christos Davatzikos
- Center for Biomedical Image Computing and Analytics, University of Pennsylvania, USA
14
Yin J, Wang N. An online sequential extreme learning machine for tidal prediction based on improved Gath–Geva fuzzy segmentation. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2015.02.094] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
15
Abstract
There is ample evidence for the heterogeneous nature of diseases. Alzheimer's Disease, Schizophrenia and Autism Spectrum Disorder, for example, are characterized by high clinical heterogeneity, and likely by heterogeneity in the underlying brain phenotypes. Parsing this heterogeneity as captured by neuroimaging studies is important both for better understanding of disease mechanisms, and for building subtype-specific classifiers. However, few existing methodologies tackle this problem in a principled machine learning framework. In this work, we developed a novel non-linear learning algorithm for integrated binary classification and subpopulation clustering. Non-linearity is introduced through the use of multiple linear hyperplanes that form a convex polytope separating healthy controls from pathologic samples. Disease heterogeneity is disentangled by implicitly clustering pathologic samples through their association with individual linear sub-classifiers. We show results of the proposed approach from an imaging study of Alzheimer's Disease, which highlight its potential to map disease heterogeneity in neuroimaging studies.
16
Chatzis SP, Andreou AS. Maximum Entropy Discrimination Poisson Regression for Software Reliability Modeling. IEEE Trans Neural Netw Learn Syst 2015; 26:2689-2701. [PMID: 25643418 DOI: 10.1109/tnnls.2015.2391171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2023]
Abstract
Reliably predicting software defects is one of the most significant tasks in software engineering. Two major components of modern software reliability modeling approaches are: 1) extraction of salient features for software system representation, based on appropriately designed software metrics, and 2) development of regression models for count data that allow effective software reliability modeling and prediction. Surprisingly, research on the latter front, count-data regression modeling, has been rather limited. More specifically, a lack of simple and efficient algorithms for posterior computation has made Bayesian approaches appear unattractive, and thus underdeveloped, in the context of software reliability modeling. In this paper, we address these issues by introducing a novel Bayesian regression model for count data based on the concept of max-margin data modeling, effected in the context of a fully Bayesian model treatment with simple and efficient posterior distribution updates. Our approach yields a more discriminative learning technique, making more effective use of the training data during model inference. In addition, it allows better handling of uncertainty in the modeled data, which can be a significant problem when the training data are limited. We derive elegant inference algorithms for our model under the mean-field paradigm and demonstrate its effectiveness on publicly available benchmark datasets.
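For context only, the snippet below fits an ordinary Poisson GLM to simulated software-metric data, i.e., the count-regression baseline that the max-margin Bayesian model above is designed to improve on; it does not implement the paper's maximum-entropy-discrimination treatment, and the metrics and defect counts are synthetic placeholders.

import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Toy "software metrics" (e.g., size, complexity, churn) and defect counts drawn from
# a Poisson law whose rate depends log-linearly on those metrics.
X = rng.normal(size=(n, 3))
rate = np.exp(0.4 * X[:, 0] + 0.7 * X[:, 1] - 0.2 * X[:, 2])
y = rng.poisson(rate)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = PoissonRegressor(alpha=1e-3).fit(X_tr, y_tr)
print("test D^2 (fraction of deviance explained):", model.score(X_te, y_te))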
17
Fu Z, Lu G, Ting KM, Zhang D. Learning sparse kernel classifiers for multi-instance classification. IEEE Trans Neural Netw Learn Syst 2013; 24:1377-1389. [PMID: 24808575 DOI: 10.1109/tnnls.2013.2254721] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/03/2023]
Abstract
We propose a direct approach to learning sparse kernel classifiers for multi-instance (MI) classification to improve efficiency while maintaining predictive accuracy. The proposed method builds on a convex formulation for MI classification that considers the average score of individual instances for bag-level prediction. In contrast, existing formulations use the maximum score of individual instances in each bag, which leads to nonconvex optimization problems. Based on the convex MI framework, we formulate a sparse kernel learning algorithm by imposing additional constraints on the objective function to limit the number of expansions allowed in the prediction function. The resulting sparse learning problem for MI classification is convex with respect to the classifier weights. Therefore, we can employ an effective optimization strategy to solve the problem, which involves the joint learning of both the classifier and the expansion vectors. In addition, the proposed formulation can explicitly control the complexity of the prediction model while still maintaining competitive predictive performance. Experimental results on benchmark datasets demonstrate that our approach is effective in building very sparse kernel classifiers while achieving performance comparable to state-of-the-art MI classifiers.
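A sketch of the bag-level averaging idea behind the convex formulation: if a bag's score is the average of its instance scores, a kernel SVM can equivalently be trained on a "mean bag kernel", the average pairwise instance kernel between two bags. The sparsity (expansion) constraint that is the paper's main contribution is not included, and the bag generator below is synthetic.

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_bag(positive):
    n_inst = rng.integers(5, 15)
    inst = rng.normal(size=(n_inst, 5))
    if positive:                 # positive bags contain a few shifted "witness" instances
        inst[:2] += 2.0
    return inst

bags = [make_bag(i % 2 == 1) for i in range(120)]
y = np.array([i % 2 for i in range(120)])

def mean_bag_kernel(bags_a, bags_b, gamma=0.2):
    K = np.zeros((len(bags_a), len(bags_b)))
    for i, A in enumerate(bags_a):
        for j, B in enumerate(bags_b):
            K[i, j] = rbf_kernel(A, B, gamma=gamma).mean()   # average instance kernel
    return K

train, test = bags[:80], bags[80:]
K_train = mean_bag_kernel(train, train)
K_test = mean_bag_kernel(test, train)

clf = SVC(kernel="precomputed", C=1.0).fit(K_train, y[:80])
print("bag-level accuracy:", (clf.predict(K_test) == y[80:]).mean())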
18
Mayhua-López E, Gómez-Verdejo V, Figueiras-Vidal AR. Real AdaBoost with gate controlled fusion. IEEE Trans Neural Netw Learn Syst 2012; 23:2003-2009. [PMID: 24808153 DOI: 10.1109/tnnls.2012.2219318] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/03/2023]
Abstract
In this brief, we propose to increase the capabilities of standard real AdaBoost (RAB) architectures by replacing their linear combinations with a fusion controlled by a gate with fixed kernels. Experimental results on a series of well-known benchmark problems support the effectiveness of this approach in improving classification performance. Although the need for cross-validation obviously leads to higher training requirements and more computational effort, the operational load is never much higher and, in many cases, is even lower than that of competitive RAB schemes.