Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Fernández A, García S, Herrera F. Addressing the Classification with Imbalanced Data: Open Problems and New Challenges on Class Distribution. Lecture Notes in Computer Science 2011. [DOI: 10.1007/978-3-642-21219-2_1] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 12/31/2022]

Cited by Other Article(s)

Yao T, Lu Y, Long J, Jha A, Zhu Z, Asad Z, Yang H, Fogo AB, Huo Y. Glo-In-One: holistic glomerular detection, segmentation, and lesion characterization with large-scale web image mining. J Med Imaging (Bellingham) 2022;9:052408. [PMID: 35747553 PMCID: PMC9207519 DOI: 10.1117/1.jmi.9.5.052408] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/31/2021] [Accepted: 05/31/2022] [Indexed: 11/14/2022] Open

Abstract

Purpose: The quantitative detection, segmentation, and characterization of glomeruli from high-resolution whole slide imaging (WSI) play essential roles in the computer-assisted diagnosis and scientific research in digital renal pathology. Historically, such comprehensive quantification requires extensive programming skills to be able to handle heterogeneous and customized computational tools. To bridge the gap of performing glomerular quantification for non-technical users, we develop the Glo-In-One toolkit to achieve holistic glomerular detection, segmentation, and characterization via a single line of command. Additionally, we release a large-scale collection of 30,000 unlabeled glomerular images to further facilitate the algorithmic development of self-supervised deep learning.
Approach: The inputs of the Glo-In-One toolkit are WSIs, while the outputs are (1) WSI-level multi-class circle glomerular detection results (which can be directly manipulated with ImageScope), (2) glomerular image patches with segmentation masks, and (3) different lesion types. In the current version, the fine-grained global glomerulosclerosis (GGS) characterization is provided, including assessed-solidified-GSS (associated with hypertension-related injury), disappearing-GSS (a further end result of the SGGS becoming contiguous with fibrotic interstitium), and obsolescent-GSS (nonspecific GGS increasing with aging) glomeruli. To leverage the performance of the Glo-In-One toolkit, we introduce self-supervised deep learning to glomerular quantification via large-scale web image mining.
Results: The GGS fine-grained classification model achieved a decent performance compared with baseline supervised methods while only using 10% of the annotated data. The glomerular detection achieved an average precision of 0.627 with circle representations, while the glomerular segmentation achieved a 0.955 patch-wise Dice similarity coefficient.
Conclusion: We develop and release an open-source Glo-In-One toolkit, a software with holistic glomerular detection, segmentation, and lesion characterization. This toolkit is user-friendly to non-technical users via a single line of command. The toolbox and the 30,000 web mined glomerular images have been made publicly available at https://github.com/hrlblab/Glo-In-One.
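
The patch-wise Dice similarity coefficient quoted in the Results can be illustrated with a minimal Python sketch. This is not code from the Glo-In-One toolkit: the function names, the flattened 0/1 mask representation, the empty-mask convention, and the simple averaging over patches are all assumptions made for illustration.

```python
from typing import Iterable, Sequence, Tuple


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice similarity between two flattened binary masks (values 0 or 1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_patch_dice(pairs: Iterable[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Average the Dice score over (prediction, ground truth) patch pairs."""
    scores = [dice_coefficient(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


# Hypothetical toy patches, flattened to one dimension.
patches = [
    ([1, 1, 0, 0], [1, 0, 1, 0]),  # one shared pixel, two foreground pixels per mask
    ([0, 1, 1, 0], [0, 1, 1, 0]),  # identical masks
]
print(mean_patch_dice(patches))
```

A score of 1 means the predicted and reference masks coincide exactly, and 0 means they share no foreground pixels.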

Galli G, Sabadin F, Yassue RM, Galves C, Carvalho HF, Crossa J, Montesinos-López OA, Fritsche-Neto R. Automated Machine Learning: A Case Study of Genomic "Image-Based" Prediction in Maize Hybrids. Frontiers in Plant Science 2022;13:845524. [PMID: 35321444 PMCID: PMC8936805 DOI: 10.3389/fpls.2022.845524] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/29/2021] [Accepted: 02/03/2022] [Indexed: 06/14/2023]

Gao J, Zhang L, Yu G, Qu G, Li Y, Yang X. Model with the GBDT for Colorectal Adenoma Risk Diagnosis. Curr Bioinform 2021. [DOI: 10.2174/1574893614666191120142005] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/22/2022]

Takahashi CC, Braga AP. A Review of Off-Line Mode Dataset Shifts. IEEE Comput Intell M 2020. [DOI: 10.1109/mci.2020.2998231] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]

Lopez-Garcia P, Masegosa AD, Osaba E, Onieva E, Perallos A. Ensemble classification for imbalanced data based on feature space partitioning and hybrid metaheuristics. Appl Intell 2019. [DOI: 10.1007/s10489-019-01423-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 11/30/2022]

Park E, Chang HJ, Nam HS. A Bayesian Network Model for Predicting Post-stroke Outcomes With Available Risk Factors. Front Neurol 2018;9:699. [PMID: 30245663 PMCID: PMC6137617 DOI: 10.3389/fneur.2018.00699] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 05/30/2018] [Accepted: 08/02/2018] [Indexed: 11/13/2022] Open

Multi-class and feature selection extensions of Roughly Balanced Bagging for imbalanced data. J Intell Inf Syst 2017. [DOI: 10.1007/s10844-017-0446-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Indexed: 10/20/2022]

Valero-Mas JJ, Calvo-Zaragoza J, Rico-Juan JR, Iñesta JM. A Study of Prototype Selection Algorithms for Nearest Neighbour in Class-Imbalanced Problems. Pattern Recognition and Image Analysis 2017. [DOI: 10.1007/978-3-319-58838-4_37] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]

Komori O, Eguchi S, Ikeda S, Okamura H, Ichinokawa M, Nakayama S. An asymmetric logistic regression model for ecological data. Methods Ecol Evol 2015. [DOI: 10.1111/2041-210x.12473] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 11/29/2022]

Ornella L, Pérez P, Tapia E, González-Camacho JM, Burgueño J, Zhang X, Singh S, Vicente FS, Bonnett D, Dreisigacker S, Singh R, Long N, Crossa J. Genomic-enabled prediction with classification algorithms. Heredity (Edinb) 2014;112:616-26. [PMID: 24424163 PMCID: PMC4023444 DOI: 10.1038/hdy.2013.144] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 07/08/2013] [Revised: 12/05/2013] [Accepted: 12/09/2013] [Indexed: 11/09/2022] Open

Abstract

Pearson's correlation coefficient (ρ) is the most commonly reported metric of the success of prediction in genomic selection (GS). However, in real breeding ρ may not be very useful for assessing the quality of the regression in the tails of the distribution, where individuals are chosen for selection. This research used 14 maize and 16 wheat data sets with different trait–environment combinations. Six different models were evaluated by means of a cross-validation scheme (50 random partitions each, with 90% of the individuals in the training set and 10% in the testing set). The predictive accuracy of these algorithms for selecting individuals belonging to the best α=10, 15, 20, 25, 30, 35, 40% of the distribution was estimated using Cohen's kappa coefficient (κ) and an ad hoc measure, which we call relative efficiency (RE), which indicates the expected genetic gain due to selection when individuals are selected based on GS exclusively. We put special emphasis on the analysis for α=15%, because it is a percentile commonly used in plant breeding programmes (for example, at CIMMYT). We also used ρ as a criterion for overall success. The algorithms used were: Bayesian LASSO (BL), Ridge Regression (RR), Reproducing Kernel Hilbert Spaces (RHKS), Random Forest Regression (RFR), and Support Vector Regression (SVR) with linear (lin) and Gaussian kernels (rbf). The performance of regression methods for selecting the best individuals was compared with that of three supervised classification algorithms: Random Forest Classification (RFC) and Support Vector Classification (SVC) with linear (lin) and Gaussian (rbf) kernels. Classification methods were evaluated using the same cross-validation scheme but with the response vector of the original training sets dichotomised using a given threshold. 
For α=15%, SVC-lin presented the highest κ coefficients in 13 of the 14 maize data sets, with best values ranging from 0.131 to 0.722 (statistically significant in 9 data sets) and the best RE in the same 13 data sets, with values ranging from 0.393 to 0.948 (statistically significant in 12 data sets). RR produced the best mean for both κ and RE in one data set (0.148 and 0.381, respectively). Regarding the wheat data sets, SVC-lin presented the best κ in 12 of the 16 data sets, with outcomes ranging from 0.280 to 0.580 (statistically significant in 4 data sets) and the best RE in 9 data sets ranging from 0.484 to 0.821 (statistically significant in 5 data sets). SVC-rbf (0.235), RR (0.265) and RHKS (0.422) gave the best κ in one data set each, while RHKS and BL tied for the last one (0.234). Finally, BL presented the best RE in two data sets (0.738 and 0.750), RFR (0.636) and SVC-rbf (0.617) in one and RHKS in the remaining three (0.502, 0.458 and 0.586). The difference between the performance of SVC-lin and that of the rest of the models was not so pronounced at higher percentiles of the distribution. The behaviour of regression and classification algorithms varied markedly when selection was done at different thresholds, that is, κ and RE for each algorithm depended strongly on the selection percentile. Based on the results, we propose classification methods as a promising alternative for GS in plant breeding.
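
The κ-based evaluation described in this abstract (dichotomise individuals into "best α" and "rest", then measure agreement between observed and predicted selections) can be sketched in plain Python as follows. This is only an illustration, not the authors' code: the function names, the toy values, the rounding of the selected group size, and the assumption that higher trait values are better are all hypothetical.

```python
from typing import List, Sequence


def top_fraction_labels(values: Sequence[float], alpha: float) -> List[int]:
    """Label the best `alpha` fraction of individuals 1 and the rest 0.

    Assumes higher values are better; ties are broken by original order.
    """
    n_top = max(1, round(alpha * len(values)))
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    labels = [0] * len(values)
    for i in order[:n_top]:
        labels[i] = 1
    return labels


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two binary (0/1) label vectors of equal length."""
    n = len(a)
    p_observed = sum(1 for x, y in zip(a, b) if x == y) / n
    p_a1 = sum(a) / n
    p_b1 = sum(b) / n
    # Agreement expected by chance from the marginal label frequencies.
    p_expected = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
    if p_expected == 1.0:
        return 1.0  # degenerate case: both vectors constant and identical
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical observed phenotypes and model predictions for ten individuals.
observed = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
predicted = [9, 3, 10, 7, 6, 5, 4, 8, 2, 1]

alpha = 0.2  # select the best 20% of the distribution
kappa = cohen_kappa(
    top_fraction_labels(observed, alpha),
    top_fraction_labels(predicted, alpha),
)
print(kappa)
```

In this toy case the model recovers one of the two truly best individuals, so κ falls well below 1 even though most individuals are labelled identically; that chance correction is why κ is more informative than raw agreement when the selected class is small.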

Brain tumor detection and segmentation in a CRF (conditional random fields) framework with pixel-pairwise affinity and superpixel-level features. Int J Comput Assist Radiol Surg 2013;9:241-53. [DOI: 10.1007/s11548-013-0922-7] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 01/28/2013] [Accepted: 07/03/2013] [Indexed: 01/10/2023]

Adaptive Weight Optimization for Classification of Imbalanced Data. ACTA ACUST UNITED AC 2013. [DOI: 10.1007/978-3-642-42057-3_69] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]

A hierarchical genetic fuzzy system based on genetic programming for addressing classification with highly imbalanced and borderline data-sets. Knowl Based Syst 2013. [DOI: 10.1016/j.knosys.2012.08.025] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Indexed: 11/20/2022]

Millán-Giraldo M, García V, Sánchez JS. Instance Selection Methods and Resampling Techniques for Dissimilarity Representation with Imbalanced Data Sets. ACTA ACUST UNITED AC 2013. [DOI: 10.1007/978-3-642-36530-0_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/20/2023]

The quest for the optimal class distribution: an approach for enhancing the effectiveness of learning via resampling methods for imbalanced data sets. Progress in Artificial Intelligence 2012. [DOI: 10.1007/s13748-012-0034-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/26/2022]

Identification of Different Types of Minority Class Examples in Imbalanced Data. Lecture Notes in Computer Science 2012. [DOI: 10.1007/978-3-642-28931-6_14] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Indexed: 12/29/2022]