1
Barak T, Loewenstein Y. Untrained neural networks can demonstrate memorization-independent abstract reasoning. Sci Rep 2024; 14:27249. PMID: 39516540; PMCID: PMC11549345; DOI: 10.1038/s41598-024-78530-z.
Abstract
The nature of abstract reasoning is a matter of debate. Modern artificial neural network (ANN) models, such as large language models, demonstrate impressive success when tested on abstract reasoning problems. However, it has been argued that their success reflects some form of memorization of similar problems (data contamination) rather than a general-purpose abstract reasoning capability. This concern is supported by evidence of brittleness and by the extensive training these models require. In our study, we explored whether abstract reasoning can be achieved using the toolbox of ANNs, without prior training. Specifically, we studied an ANN model in which the weights of a naive network are optimized during the solution of the problem, using the problem data itself rather than any prior knowledge. We tested this modeling approach on visual reasoning problems and found that it performs relatively well. Crucially, this success does not rely on memorization of similar problems. We further suggest an explanation of how the approach works. Finally, as problem solving is performed by changing the ANN weights, we explored the connection between problem solving and the accumulation of knowledge in ANNs.
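The core mechanism lends itself to a compact illustration. The following is a minimal sketch of the general idea, not the authors' code: a randomly initialized network is optimized on the puzzle's own panels so that its scalar output tracks the panel order, and the candidate answer whose score best continues the trend is selected. The architecture, loss, and tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

def solve_puzzle(context_panels, candidate_panels, steps=200):
    """context_panels: (n, 1, H, W) ordered puzzle panels (e.g. 32x32);
    candidate_panels: (m, 1, H, W) answer options."""
    net = nn.Sequential(                      # naive, randomly initialized scorer
        nn.Conv2d(1, 8, 5), nn.ReLU(),
        nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        nn.Linear(8 * 16, 1),
    )
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    # target: the scorer's output should increase along the panel order
    target = torch.arange(len(context_panels), dtype=torch.float32).unsqueeze(1)
    for _ in range(steps):                    # optimize on the problem data itself
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(context_panels), target)
        loss.backward()
        opt.step()
    with torch.no_grad():                     # pick the answer that continues the trend
        scores = net(candidate_panels).squeeze(1)
    return torch.argmax(scores).item()
```

No pretrained weights enter anywhere: all knowledge the scorer acquires comes from the handful of panels in the problem itself, which is what makes the approach memorization-independent.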
Affiliation(s)
- Tomer Barak
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University, Jerusalem, Israel.
- Yonatan Loewenstein
- The Edmond and Lily Safra Center for Brain Sciences, The Hebrew University, Jerusalem, Israel.
- Department of Cognitive Sciences, The Federmann Center for the Study of Rationality, The Alexander Silberman Institute of Life Sciences, The Hebrew University, Jerusalem, Israel.
2
Klein B, Kovacs K. The performance of ChatGPT and Bing on a computerized adaptive test of verbal intelligence. PLoS One 2024; 19:e0307097. PMID: 39052613; PMCID: PMC11271876; DOI: 10.1371/journal.pone.0307097.
Abstract
We administered a computerized adaptive test of vocabulary three times to assess the verbal intelligence of ChatGPT (GPT-3.5) and Bing (based on GPT-4). There was no difference between their performance; both performed at a high level, outperforming approximately 95% of humans and scoring above the level of native speakers with a doctoral degree. On 42% of the test items that were administered more than once, these large language models provided different answers to the same question in different sessions. They never engaged in guessing but did produce hallucinations: answers that were not among the options. Such hallucinations were not triggered by an inability to answer correctly, as the same questions evoked correct answers in other sessions. The results indicate that psychometric tools developed for humans have limitations when assessing AI, but they also imply that computerized adaptive testing of verbal ability is an appropriate tool for critically evaluating the performance of large language models.
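As a companion to the abstract, here is a toy sketch of the adaptive logic behind such a test, assuming a simple Rasch model. The item bank, update rule, and parameters are invented for illustration and are not the instrument used in the study.

```python
import math, random

def prob_correct(theta, b):
    # Rasch model: chance a test-taker of ability theta answers an
    # item of difficulty b correctly
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def run_cat(answer_fn, item_bank, n_items=20):
    theta = 0.0                                  # start at average ability
    for k in range(1, n_items + 1):
        # choose the unused item whose difficulty is closest to the
        # current ability estimate (the "adaptive" step)
        b = min(item_bank, key=lambda d: abs(d - theta))
        item_bank.remove(b)
        correct = answer_fn(b)                   # 1 if answered correctly
        # stochastic-approximation update of the ability estimate
        theta += (2.0 / k) * (correct - prob_correct(theta, b))
    return theta

# Toy respondent with true ability 1.5 on an invented item bank.
bank = [i / 4.0 - 3.0 for i in range(25)]
estimate = run_cat(lambda b: int(random.random() < prob_correct(1.5, b)), bank)
print(f"estimated ability: {estimate:.2f}")
```

Because each item is selected to match the running estimate, the test homes in on the respondent's level with far fewer items than a fixed-form test, which is what makes it attractive for probing models session by session.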
3
Depeweg S, Rothkopf CA, Jäkel F. Solving Bongard Problems With a Visual Language and Pragmatic Constraints. Cogn Sci 2024; 48:e13432. PMID: 38700123; DOI: 10.1111/cogs.13432.
Abstract
More than 50 years ago, Bongard introduced 100 visual concept learning problems as a challenge for artificial vision systems. These problems are now known as Bongard problems. Although they are well known in cognitive science and artificial intelligence, very little progress has been made toward building systems that can solve a substantial subset of them. In the system presented here, visual features are extracted through image processing and then translated into a symbolic visual vocabulary. We introduce a formal language that allows compositional visual concepts to be represented based on this vocabulary. Using this language and Bayesian inference, concepts can be induced from the examples that are provided in each problem. We find reasonable agreement between the concepts with high posterior probability and the solutions formulated by Bongard himself for a subset of 35 problems. While this approach is far from solving Bongard problems the way humans do, it does considerably better than previous approaches. We discuss the issues we encountered while developing this system and their continuing relevance for understanding visual cognition. For instance, contrary to other concept learning problems, the examples in Bongard problems are not random; instead, they are carefully chosen to ensure that the concept can be induced, and we found it helpful to take the resulting pragmatic constraints into account.
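To make the induction step concrete, here is a toy sketch in the spirit of the approach, not the authors' system: candidate concepts over symbolic features are scored by a posterior that requires them to hold for every left-side example and no right-side example. The concepts, prior, and features below are invented for illustration.

```python
# Score candidate concepts for a Bongard-style problem by Bayesian
# inference: posterior mass goes to concepts that separate the sides.
def posterior(concepts, prior, left, right):
    # concepts: dict name -> predicate over a symbolic example
    scores = {}
    for name, pred in concepts.items():
        # likelihood: concept holds for all left and no right examples
        fits = all(pred(x) for x in left) and not any(pred(x) for x in right)
        scores[name] = prior[name] if fits else 0.0
    z = sum(scores.values()) or 1.0            # normalize (guard empty case)
    return {name: s / z for name, s in scores.items()}

# Toy usage: examples are dicts of symbolic visual features.
concepts = {
    "large":    lambda x: x["size"] > 5,
    "triangle": lambda x: x["shape"] == "triangle",
}
prior = {"large": 0.5, "triangle": 0.5}
left = [{"size": 7, "shape": "circle"}, {"size": 9, "shape": "triangle"}]
right = [{"size": 2, "shape": "triangle"}, {"size": 3, "shape": "circle"}]
print(posterior(concepts, prior, left, right))  # -> "large" gets all the mass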
Affiliation(s)
- Constantin A. Rothkopf
- Centre for Cognitive Science & Institute of Psychology, Technische Universität Darmstadt, Darmstadt, Germany.
- Frankfurt Institute for Advanced Studies, Frankfurt am Main, Germany.
- Frank Jäkel
- Centre for Cognitive Science & Institute of Psychology, Technische Universität Darmstadt, Darmstadt, Germany.
4
Cheyette SJ, Piantadosi ST. Response to Difficulty Drives Variation in IQ Test Performance. Open Mind (Camb) 2024; 8:265-277. PMID: 38571527; PMCID: PMC10990577; DOI: 10.1162/opmi_a_00127.
Abstract
In a large (N = 300), pre-registered experiment with an accompanying data-analysis model, we find that individual variation in overall performance on Raven's Progressive Matrices is substantially driven by differential strategizing in the face of difficulty. Some participants choose to spend more time on hard problems while others choose to spend less, and these differences explain about 42% of the variance in overall performance. In a data analysis jointly predicting participants' reaction times and accuracy on each item, we find that the Raven's task captures between almost none (3%) and at most half (48%) of participants' variation in time-controlled ability, depending on which notion of ability is assumed. Our results highlight the role that confounding factors such as motivation play in explaining individuals' differential performance in IQ testing.
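The headline variance figure can be made concrete with a small simulation. The sketch below uses invented, simulated data (not the study's data), with an effect size chosen so that a one-predictor linear regression recovers roughly the reported 42% of variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300                                       # same sample size as the study
time_on_hard = rng.normal(size=n)             # standardized strategy measure
# effect size chosen so the predictor carries ~42% of score variance
score = 0.65 * time_on_hard + rng.normal(scale=0.76, size=n)

# with one predictor, regression R^2 equals the squared correlation
r = np.corrcoef(time_on_hard, score)[0, 1]
print(f"variance explained: {r ** 2:.0%}")    # ~42% on these simulated data
```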
5
Kühl N, Goutier M, Baier L, Wolff C, Martin D. Human vs. supervised machine learning: Who learns patterns faster? Cogn Syst Res 2022. DOI: 10.1016/j.cogsys.2022.09.002.
6
Das K, Cockerell CJ, Patil A, Pietkiewicz P, Giulini M, Grabbe S, Goldust M. Machine Learning and Its Application in Skin Cancer. Int J Environ Res Public Health 2021; 18:13409. PMID: 34949015; PMCID: PMC8705277; DOI: 10.3390/ijerph182413409.
Abstract
Artificial intelligence (AI) has wide applications in healthcare, including dermatology. Machine learning (ML) is a subfield of AI involving statistical models and algorithms that can progressively learn from data to predict the characteristics of new samples and perform a desired task. Although AI has a significant role in the detection of skin cancer, dermatology lags behind radiology in terms of AI acceptance. With its continuing spread, use, and emerging technologies, AI is becoming more widely available even to the general population. AI can be used for the early detection of skin cancer. For example, deep convolutional neural networks can help to develop a system that evaluates images of the skin to diagnose skin cancer. Early detection is key to effective treatment and better outcomes in skin cancer. Specialists can diagnose the cancer accurately; however, given their limited numbers, there is a need to develop automated systems that can diagnose the disease efficiently to save lives and reduce the health and financial burdens on patients. ML can be of significant use in this regard. In this article, we discuss the fundamentals of ML and its potential in assisting the diagnosis of skin cancer.
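To illustrate the kind of system the review refers to, here is a minimal transfer-learning sketch. It is an assumption for illustration, not a model from the article; a real diagnostic system would require curated dermoscopic data, clinical validation, and regulatory review.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_lesion_classifier(num_classes=2):
    # ImageNet-pretrained backbone, re-headed for benign vs. malignant
    net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    net.fc = nn.Linear(net.fc.in_features, num_classes)
    return net

net = build_lesion_classifier()
dummy = torch.randn(1, 3, 224, 224)           # one RGB skin image, 224x224
probs = torch.softmax(net(dummy), dim=1)      # class probabilities
```

Starting from a pretrained backbone matters here because labeled dermoscopic datasets are small relative to what training a deep CNN from scratch would demand.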
Affiliation(s)
- Kinnor Das
- Department of Dermatology, Venereology and Leprosy, Silchar Medical College, Silchar 788014, India.
- Clay J. Cockerell
- Departments of Dermatology and Pathology, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.
- Cockerell Dermatopathology, Dallas, TX 75235, USA.
- Anant Patil
- Department of Pharmacology, Dr. DY Patil Medical College, Navi Mumbai 400706, India.
- Paweł Pietkiewicz
- Surgical Oncology and General Surgery Clinic I, Greater Poland Cancer Center, 61-866 Poznan, Poland.
- Mario Giulini
- Department of Dermatology, University Medical Center Mainz, Langenbeckstraße 1, 55131 Mainz, Germany.
- Stephan Grabbe
- Department of Dermatology, University Medical Center Mainz, Langenbeckstraße 1, 55131 Mainz, Germany.
- Mohamad Goldust
- Department of Dermatology, University Medical Center Mainz, Langenbeckstraße 1, 55131 Mainz, Germany.
7
General intelligence disentangled via a generality metric for natural and artificial intelligence. Sci Rep 2021; 11:22822. PMID: 34819537; PMCID: PMC8613222; DOI: 10.1038/s41598-021-01997-7.
Abstract
Success in all sorts of situations is the most classical interpretation of general intelligence. Under limited resources, however, the capability of an agent must necessarily be limited too, and generality needs to be understood as comprehensive performance up to a given level of difficulty. The degree of generality then refers to the way an agent's capability is distributed as a function of task difficulty. This dissects the notion of general intelligence into two non-populational measures, generality and capability, which we apply to individuals and groups of humans, other animals, and AI systems on several cognitive and perceptual tests. Our results indicate that generality and capability can decouple at the individual level: very specialised agents can show high capability, and vice versa. The metrics also decouple at the population level, and we rarely see diminishing returns in generality for groups of high capability. We relate the individual measure of generality to traditional notions of general intelligence and cognitive efficiency in humans, collectives, non-human animals, and machines. The choice of the difficulty function plays a prominent role in this new conception of generality, which provides a quantitative tool for shedding light on long-standing questions about the evolution of general intelligence and the evaluation of progress in Artificial General Intelligence.
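One concrete reading of the two measures is sketched below. The formulas are our simplified stand-ins (capability as total success across difficulty levels, generality as the inverse spread of where that success mass sits), not necessarily the paper's exact definitions.

```python
import numpy as np

def capability(success_by_difficulty):
    # total success rate summed over difficulty levels 1..n
    return float(np.sum(success_by_difficulty))

def generality(success_by_difficulty):
    p = np.asarray(success_by_difficulty, dtype=float)
    if p.sum() == 0:
        return 0.0
    w = p / p.sum()                            # where the success mass lies
    d = np.arange(1, len(p) + 1)
    spread = np.sqrt(np.sum(w * (d - np.sum(w * d)) ** 2))
    return 1.0 / (1.0 + spread)                # tighter mass = more general

general_agent = [1.0, 1.0, 1.0, 0.0, 0.0]      # solves everything up to level 3
patchy_agent  = [1.0, 0.0, 1.0, 0.0, 1.0]      # same total success, scattered
print(capability(general_agent), generality(general_agent))
print(capability(patchy_agent), generality(patchy_agent))
```

The two toy agents have identical capability but different generality, which is exactly the kind of decoupling the abstract reports at the individual level.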
8
van der Maas HL, Snoek L, Stevenson CE. How much intelligence is there in artificial intelligence? A 2020 update. Intelligence 2021. DOI: 10.1016/j.intell.2021.101548.
10
AI, visual imagery, and a case study on the challenges posed by human intelligence tests. Proc Natl Acad Sci U S A 2020; 117:29390-29397. PMID: 33229557; DOI: 10.1073/pnas.1912335117.
Abstract
Observations abound about the power of visual imagery in human intelligence, from how Nobel prize-winning physicists make their discoveries to how children understand bedtime stories. These observations raise an important question for cognitive science: what computations are taking place in someone's mind when they use visual imagery? Answering this question is not easy and will require much continued research across the multiple disciplines of cognitive science. Here, we focus on a related and more circumscribed question from the perspective of artificial intelligence (AI): if an intelligent agent uses visual imagery-based knowledge representations and reasoning operations, then what kinds of problem solving might be possible, and how would such problem solving work? We highlight recent progress in AI toward answering these questions in the domain of visuospatial reasoning, looking at a case study of how imagery-based artificial agents can solve visuospatial intelligence tests. In particular, we first examine several variations of imagery-based knowledge representations and problem-solving strategies that are sufficient for solving problems from the Raven's Progressive Matrices intelligence test. We then look at how artificial agents, instead of being designed manually by AI researchers, might learn portions of their own knowledge and reasoning procedures from experience, including learning visuospatial domain knowledge, learning and generalizing problem-solving strategies, and learning the actual definition of the task in the first place.
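A minimal sketch of imagery-based reasoning of this kind appears below; it is our illustration, not the article's agents. The pixel-level transform along a row is inferred, applied to the last context panel to form a "mental image" of the answer, and the candidate most similar to that prediction is chosen.

```python
import numpy as np

def similarity(a, b):
    # higher is more similar; panels are binary numpy arrays
    return -np.abs(a.astype(float) - b.astype(float)).sum()

def solve_rpm_row(p1, p2, p3, candidates):
    diff = p2.astype(float) - p1.astype(float)   # inferred image transform
    predicted = np.clip(p3 + diff, 0, 1)         # mental image of the answer
    scores = [similarity(predicted, c) for c in candidates]
    return int(np.argmax(scores))
```

The point of the sketch is that every step operates directly on images rather than on symbolic descriptions, which is the defining commitment of the imagery-based agents the article surveys.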
11
Hernández-Orallo J. Twenty Years Beyond the Turing Test: Moving Beyond the Human Judges Too. Minds Mach (Dordr) 2020. DOI: 10.1007/s11023-020-09549-0.
12
Martínez-Plumed F, Ferri C, Hernández-Orallo J, Ramírez-Quintana MJ. A computational analysis of general intelligence tests for evaluating cognitive development. Cogn Syst Res 2017. DOI: 10.1016/j.cogsys.2017.01.006.
13
Beltran WC, Prade H, Richard G. Constructive Solving of Raven's IQ Tests with Analogical Proportions. Int J Intell Syst 2016. DOI: 10.1002/int.21817.
Affiliation(s)
- Henri Prade
- IRIT, University of Toulouse, Toulouse, France.
14
Hernández-Orallo J. Evaluation in artificial intelligence: from task-oriented to ability-oriented measurement. Artif Intell Rev 2016. DOI: 10.1007/s10462-016-9505-7.