Reference Citation Analysis

For: Romano JD, Le TT, La Cava W, Gregg JT, Goldberg DJ, Chakraborty P, Ray NL, Himmelstein D, Fu W, Moore JH. PMLB v1.0: an open-source dataset collection for benchmarking machine learning methods. Bioinformatics 2022;38:878-880. [PMID: 34677586 PMCID: PMC8756190 DOI: 10.1093/bioinformatics/btab727] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/02/2021] [Revised: 08/17/2021] [Accepted: 10/18/2021] [Indexed: 02/04/2023]

Cited by Other Article(s)

Tan YS, Singh C, Nasseri K, Agarwal A, Duncan J, Ronen O, Epland M, Kornblith A, Yu B. Fast Interpretable Greedy-Tree Sums. Proc Natl Acad Sci U S A 2025;122:e2310151122. [PMID: 39951504 PMCID: PMC11848335 DOI: 10.1073/pnas.2310151122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 12/10/2024] [Indexed: 02/16/2025]

Chernigovskaya M, Pavlović M, Kanduri C, Gielis S, Robert P, Scheffer L, Slabodkin A, Haff IH, Meysman P, Yaari G, Sandve GK, Greiff V. Simulation of adaptive immune receptors and repertoires with complex immune information to guide the development and benchmarking of AIRR machine learning. Nucleic Acids Res 2025;53:gkaf025. [PMID: 39873270 PMCID: PMC11773363 DOI: 10.1093/nar/gkaf025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Accepted: 01/25/2025] [Indexed: 01/30/2025]

Peterson RA, McGrath M, Cavanaugh JE. Can a Transparent Machine Learning Algorithm Predict Better than Its Black Box Counterparts? A Benchmarking Study Using 110 Data Sets. Entropy (Basel) 2024;26:746. [PMID: 39330080 PMCID: PMC11431724 DOI: 10.3390/e26090746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2024] [Revised: 08/27/2024] [Accepted: 08/28/2024] [Indexed: 09/28/2024]

Abstract

We developed a novel machine learning (ML) algorithm with the goal of producing transparent models (i.e., understandable by humans) while also flexibly accounting for nonlinearity and interactions. Our method is based on ranked sparsity, and it allows for flexibility and user control in varying the shade of the opacity of black box machine learning methods. The main tenet of ranked sparsity is that an algorithm should be more skeptical of higher-order polynomials and interactions a priori compared to main effects, and hence, the inclusion of these more complex terms should require a higher level of evidence. In this work, we put our new ranked sparsity algorithm (as implemented in the open source R package, sparseR) to the test in a predictive model "bakeoff" (i.e., a benchmarking study of ML algorithms applied "out of the box", that is, with no special tuning). Algorithms were trained on a large set of simulated and real-world data sets from the Penn Machine Learning Benchmarks database, addressing both regression and binary classification problems. We evaluated the extent to which our human-centered algorithm can attain predictive accuracy that rivals popular black box approaches such as neural networks, random forests, and support vector machines, while also producing more interpretable models. Using out-of-bag error as a meta-outcome, we describe the properties of data sets in which human-centered approaches can perform as well as or better than black box approaches. We found that interpretable approaches predicted optimally or within 5% of the optimal method in most real-world data sets. We provide a more in-depth comparison of the performances of random forests to interpretable methods for several case studies, including exemplars in which algorithms performed similarly, and several cases when interpretable methods underperformed. 
This work provides a strong rationale for including human-centered transparent algorithms such as ours in predictive modeling applications.

Tjaden J, Tjaden B. MLpronto: A tool for democratizing machine learning. PLoS One 2023;18:e0294924. [PMID: 38032968 PMCID: PMC10688639 DOI: 10.1371/journal.pone.0294924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2023] [Accepted: 11/11/2023] [Indexed: 12/02/2023]

John M, Schuhmacher J, Barkoutsos P, Tavernelli I, Tacchino F. Optimizing Quantum Classification Algorithms on Classical Benchmark Datasets. Entropy (Basel) 2023;25:860. [PMID: 37372204 PMCID: PMC10297005 DOI: 10.3390/e25060860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 05/24/2023] [Accepted: 05/24/2023] [Indexed: 06/29/2023]

Abstract

The discovery of quantum algorithms offering provable advantages over the best known classical alternatives, together with the parallel ongoing revolution brought about by classical artificial intelligence, motivates a search for applications of quantum information processing methods to machine learning. Among several proposals in this domain, quantum kernel methods have emerged as particularly promising candidates. However, while some rigorous speedups on certain highly specific problems have been formally proven, only empirical proof-of-principle results have been reported so far for real-world datasets. Moreover, no systematic procedure is known, in general, to fine tune and optimize the performances of kernel-based quantum classification algorithms. At the same time, certain limitations such as kernel concentration effects (hindering the trainability of quantum classifiers) have also been recently pointed out. In this work, we propose several general-purpose optimization methods and best practices designed to enhance the practical usefulness of fidelity-based quantum classification algorithms. Specifically, we first describe a data pre-processing strategy that, by preserving the relevant relationships between data points when processed through quantum feature maps, substantially alleviates the effect of kernel concentration on structured datasets. We also introduce a classical post-processing method that, based on standard fidelity measures estimated on a quantum processor, yields non-linear decision boundaries in the feature Hilbert space, thus achieving the quantum counterpart of the radial basis functions technique that is widely employed in classical kernel methods. Finally, we apply the so-called quantum metric learning protocol to engineer and adjust trainable quantum embeddings, demonstrating substantial performance improvements on several paradigmatic real-world classification tasks.

Bertini C, Leporini R. Quantum-Inspired Applications for Classification Problems. Entropy (Basel) 2023;25:404. [PMID: 36981293 PMCID: PMC10047587 DOI: 10.3390/e25030404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Revised: 02/19/2023] [Accepted: 02/22/2023] [Indexed: 06/18/2023]

Sha Z, Chen Y, Hu T. NSPA: characterizing the disease association of multiple genetic interactions at single-subject resolution. Bioinform Adv 2023;3:vbad010. [PMID: 36818729 PMCID: PMC9927570 DOI: 10.1093/bioadv/vbad010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 01/02/2023] [Accepted: 02/02/2023] [Indexed: 02/10/2023]

Alòs J, Ansótegui C, Torres E. Interpretable decision trees through MaxSAT. Artif Intell Rev 2022;56:1-21. [PMID: 36590759 PMCID: PMC9794111 DOI: 10.1007/s10462-022-10377-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/12/2022] [Indexed: 12/29/2022]

Duong-Trung N, Born S, Kim JW, Schermeyer MT, Paulick K, Borisyak M, Cruz-Bournazou MN, Werner T, Scholz R, Schmidt-Thieme L, Neubauer P, Martinez E. When Bioprocess Engineering Meets Machine Learning: A Survey from the Perspective of Automated Bioprocess Development. Biochem Eng J 2022. [DOI: 10.1016/j.bej.2022.108764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022]

Leporini R, Pastorello D. An efficient geometric approach to quantum-inspired classifications. Sci Rep 2022;12:8781. [PMID: 35610272 PMCID: PMC9130267 DOI: 10.1038/s41598-022-12392-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Accepted: 05/05/2022] [Indexed: 11/09/2022]

La Cava W, Burlacu B, Virgolin M, Kommenda M, Orzechowski P, de França FO, Jin Y, Moore JH. Contemporary Symbolic Regression Methods and their Relative Performance. Adv Neural Inf Process Syst 2021;2021:1-16. [PMID: 38715933 PMCID: PMC11074949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2024]