1
Xie D, Wang Z, Chen C, Dong D. Depthwise Convolution for Multi-Agent Communication With Enhanced Mean-Field Approximation. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:8557-8569. [PMID: 37015645 DOI: 10.1109/tnnls.2022.3230701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Multi-agent settings remain a fundamental challenge in the reinforcement learning (RL) domain due to partial observability and the lack of accurate real-time interactions across agents. In this article, we propose a new method based on local communication learning to tackle the multi-agent RL (MARL) challenge with a large number of coexisting agents. First, we design a new communication protocol that exploits the ability of depthwise convolution to efficiently extract local relations and learn local communication between neighboring agents. To facilitate multi-agent coordination, we explicitly learn the effect of joint actions by taking the policies of neighboring agents as inputs. Second, we introduce the mean-field approximation into our method to reduce the scale of agent interactions. To coordinate the behaviors of neighboring agents more effectively, we enhance the mean-field approximation with a supervised policy rectification network (PRN) that rectifies real-time agent interactions and with a learnable compensation term that corrects the approximation bias. The proposed method enables efficient coordination and outperforms several baseline approaches on the adaptive traffic signal control (ATSC) task and the StarCraft II multi-agent challenge (SMAC).
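As a rough illustration of the mean-field idea mentioned in this abstract, the joint action of an agent's neighbors is commonly summarized by the average of their one-hot encoded actions. The sketch below assumes discrete actions indexed from 0; the function name `mean_action` is hypothetical, and this is not the authors' implementation (it omits the PRN and the compensation term entirely).

```python
def mean_action(neighbor_actions, n_actions):
    """Average the one-hot encodings of the neighbors' discrete actions.

    neighbor_actions: list of integer action indices in [0, n_actions).
    Returns a list of length n_actions whose entries sum to 1.
    """
    if not neighbor_actions:
        raise ValueError("at least one neighbor action is required")
    counts = [0] * n_actions
    for action in neighbor_actions:
        counts[action] += 1  # adding a one-hot vector increments one slot
    k = len(neighbor_actions)
    return [c / k for c in counts]


# Four neighbors choosing among three actions.
summary = mean_action([0, 2, 2, 1], n_actions=3)
print(summary)
```

The point of the summary is that its size depends on the number of actions, not on the number of neighbors, which is what reduces the scale of agent interactions.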
2
Schlafly M, Prabhakar A, Popovic K, Schlafly G, Kim C, Murphey TD. Collaborative robots can augment human cognition in regret-sensitive tasks. PNAS Nexus 2024; 3:pgae016. [PMID: 38725525 PMCID: PMC11079486 DOI: 10.1093/pnasnexus/pgae016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 01/02/2024] [Indexed: 05/12/2024]
Abstract
Despite the theoretical benefits of collaborative robots, disappointing outcomes are well documented by clinical studies spanning rehabilitation, prostheses, and surgery. Cognitive load theory provides a possible explanation for why humans in the real world are not realizing the benefits of collaborative robots: high cognitive loads may be impeding human performance. Measuring cognitive availability using an electrocardiogram, we ask 25 participants to complete a virtual-reality task alongside an invisible agent that determines optimal performance by iteratively updating the Bellman equation. Three robots assist by providing environmental information relevant to task performance. By enabling the robots to act more autonomously (managing more of their own behavior with fewer instructions from the human), we show that robots can augment participants' cognitive availability and decision-making. The way in which robots describe and achieve their objective can improve the human's cognitive ability to reason about the task and contribute to human-robot collaboration outcomes. Augmenting human cognition provides a path to improve the efficacy of collaborative robots. By demonstrating how robots can improve human cognition, this work paves the way for improving the cognitive capabilities of first responders, manufacturing workers, surgeons, and other future users of collaborative autonomy systems.
Affiliation(s)
- Millicent Schlafly
- Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
- Ahalya Prabhakar
- Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
- Katarina Popovic
- Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
- Geneva Schlafly
- Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
- Christopher Kim
- Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
- Todd D Murphey
- Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
3
Yaremkevich DD, Scherbakov AV, De Clerk L, Kukhtaruk SM, Nadzeyka A, Campion R, Rushforth AW, Savel'ev S, Balanov AG, Bayer M. On-chip phonon-magnon reservoir for neuromorphic computing. Nat Commun 2023; 14:8296. [PMID: 38097654 PMCID: PMC10721880 DOI: 10.1038/s41467-023-43891-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 11/22/2023] [Indexed: 12/17/2023]
Abstract
Reservoir computing is a concept that involves mapping signals onto the high-dimensional phase space of a dynamical system, called a "reservoir", for subsequent recognition by an artificial neural network. We implement this concept in a nanodevice consisting of a sandwich of a semiconductor phonon waveguide and a patterned ferromagnetic layer. A pulsed write-laser encodes input signals into propagating phonon wavepackets, which interact with ferromagnetic magnons. A second laser reads the output signal, which reflects a phase-sensitive mix of phonon and magnon modes whose content is highly sensitive to the write- and read-laser positions. The reservoir efficiently separates the visual shapes drawn by the write-laser beam on the nanodevice surface in an area with a size comparable to a single pixel of a modern digital camera. Our finding suggests the phonon-magnon interaction as a promising hardware basis for realizing on-chip reservoir computing in future neuromorphic architectures.
Affiliation(s)
- Dmytro D Yaremkevich
- Experimentelle Physik 2, Technische Universität Dortmund, D-44227, Dortmund, Germany
- Alexey V Scherbakov
- Experimentelle Physik 2, Technische Universität Dortmund, D-44227, Dortmund, Germany
- Luke De Clerk
- Department of Physics, Loughborough University, Loughborough, LE11 3TU, UK
- Machine Learning Development, SS&C Technologies, 128 Queen Victoria Street, London, EC4V 4BJ, UK
- Serhii M Kukhtaruk
- Department of Theoretical Physics, V. E. Lashkaryov Institute of Semiconductor Physics, 03028, Kyiv, Ukraine
- Richard Campion
- School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, UK
- Andrew W Rushforth
- School of Physics and Astronomy, University of Nottingham, Nottingham, NG7 2RD, UK
- Sergey Savel'ev
- Department of Physics, Loughborough University, Loughborough, LE11 3TU, UK
- Manfred Bayer
- Experimentelle Physik 2, Technische Universität Dortmund, D-44227, Dortmund, Germany
4
Wang Z, Chen C, Dong D. Instance Weighted Incremental Evolution Strategies for Reinforcement Learning in Dynamic Environments. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:9742-9756. [PMID: 35349452 DOI: 10.1109/tnnls.2022.3160173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Evolution strategies (ESs), a family of black-box optimization algorithms, have recently emerged as a scalable alternative to reinforcement learning (RL) approaches such as Q-learning or policy gradient, and are much faster when many central processing units (CPUs) are available because they parallelize better. In this article, we propose a systematic incremental learning method for ES in dynamic environments. The goal is to adjust the previously learned policy to a new one incrementally whenever the environment changes. We incorporate an instance weighting mechanism into ES to facilitate its learning adaptation while retaining the scalability of ES. During parameter updating, higher weights are assigned to instances that contain more new knowledge, thus encouraging the search distribution to move toward new promising areas of parameter space. We propose two easy-to-implement metrics to calculate the weights: instance novelty and instance quality. Instance novelty measures an instance's difference from the previous optimum in the original environment, while instance quality corresponds to how well an instance performs in the new environment. The resulting algorithm, instance weighted incremental evolution strategies (IW-IESs), is verified to achieve significantly improved performance on challenging RL tasks ranging from robot navigation to locomotion. This article thus introduces a family of scalable ES algorithms for RL domains that enables rapid learning adaptation to dynamic environments.
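The two weighting metrics described in this abstract can be sketched roughly as follows. The abstract does not give the formulas, so everything here is an assumption made for illustration: novelty is taken as Euclidean distance from the previous optimum, quality as a rank of the return in the new environment, and both are normalized to sum to 1. The function names are hypothetical and this is not the IW-IES algorithm itself.

```python
import math


def novelty_weights(instances, prev_optimum):
    """Weight each instance by its distance from the previous optimum (assumed metric)."""
    dists = [math.dist(x, prev_optimum) for x in instances]
    total = sum(dists)
    if total == 0:
        # All instances coincide with the previous optimum: fall back to uniform weights.
        return [1.0 / len(instances)] * len(instances)
    return [d / total for d in dists]


def quality_weights(returns):
    """Weight each instance by the rank of its return in the new environment (assumed metric)."""
    order = sorted(range(len(returns)), key=lambda i: returns[i])
    ranks = [0] * len(returns)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank  # the worst return gets rank 1, the best gets rank n
    total = sum(ranks)
    return [r / total for r in ranks]


print(novelty_weights([[0.0, 0.0], [3.0, 4.0]], [0.0, 0.0]))
print(quality_weights([1.0, 3.0, 2.0]))
```

In either case, instances that carry more new knowledge (farther from the old optimum, or better in the new environment) receive a larger share of the parameter update.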
5
Wang Z, Chen C, Dong D. A Dirichlet Process Mixture of Robust Task Models for Scalable Lifelong Reinforcement Learning. IEEE Transactions on Cybernetics 2023; 53:7509-7520. [PMID: 35580095 DOI: 10.1109/tcyb.2022.3170485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
While reinforcement learning (RL) algorithms are achieving state-of-the-art performance in various challenging tasks, they can easily encounter catastrophic forgetting or interference when faced with lifelong streaming information. In this article, we propose a scalable lifelong RL method that dynamically expands the network capacity to accommodate new knowledge while preventing past memories from being perturbed. We use a Dirichlet process mixture to model the nonstationary task distribution, which captures task relatedness by estimating the likelihood of task-to-cluster assignments and clusters the task models in a latent space. We formulate the prior distribution of the mixture as a Chinese restaurant process (CRP) that instantiates new mixture components as needed. The update and expansion of the mixture are governed by the Bayesian nonparametric framework with an expectation maximization (EM) procedure, which dynamically adapts the model complexity without explicit task boundaries or heuristics. Moreover, we use the domain randomization technique to train robust prior parameters for the initialization of each task model in the mixture; thus, the resulting model can better generalize and adapt to unseen tasks. With extensive experiments conducted on robot navigation and locomotion domains, we show that our method successfully facilitates scalable lifelong RL and outperforms relevant existing methods.
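For readers unfamiliar with the Chinese restaurant process named in this abstract, the following minimal sketch shows the standard CRP prior: a new item joins an existing cluster with probability proportional to that cluster's size, or starts a new cluster with probability proportional to a concentration parameter. The function name `crp_prior` and the parameter name `alpha` are illustrative choices, and the sketch covers only the prior, not the paper's EM update or task models.

```python
def crp_prior(cluster_counts, alpha):
    """Return CRP prior probabilities for the next assignment.

    cluster_counts: sizes of the existing clusters.
    alpha: concentration parameter (> 0); larger values favor new clusters.
    The last entry of the result is the probability of opening a new cluster.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = sum(cluster_counts) + alpha
    probs = [count / total for count in cluster_counts]
    probs.append(alpha / total)
    return probs


# Two existing clusters with 3 and 1 members, alpha = 1.
print(crp_prior([3, 1], alpha=1.0))
```

Because the last probability is never zero, the mixture can always instantiate a new component when an incoming task does not fit the existing ones.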
6
Oyama K, Majima K, Nagai Y, Hori Y, Hirabayashi T, Eldridge MAG, Mimura K, Miyakawa N, Fujimoto A, Hori Y, Iwaoki H, Inoue KI, Saunders RC, Takada M, Yahata N, Higuchi M, Richmond BJ, Minamimoto T. Distinct roles of monkey OFC-subcortical pathways in adaptive behavior. bioRxiv 2023:2023.11.17.567492. [PMID: 38076986 PMCID: PMC10705585 DOI: 10.1101/2023.11.17.567492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/22/2023]
Abstract
To be the most successful, primates must adapt to changing environments and optimize their behavior by making the most beneficial choices. At the core of adaptive behavior is the orbitofrontal cortex (OFC) of the brain, which updates choice value through direct experience or knowledge-based inference. Here, we identify distinct neural circuitry underlying these two separate abilities. We designed two behavioral tasks in which macaque monkeys updated the values of certain items, either by directly experiencing changes in stimulus-reward associations, or by inferring the value of unexperienced items based on the task's rules. Chemogenetic silencing of bilateral OFC combined with mathematical model-fitting analysis revealed that monkey OFC is involved in updating item value based on both experience and inference. In vivo imaging of chemogenetic receptors by positron emission tomography allowed us to map projections from the OFC to the rostromedial caudate nucleus (rmCD) and the medial part of the mediodorsal thalamus (MDm). Chemogenetic silencing of the OFC-rmCD pathway impaired experience-based value updating, while silencing the OFC-MDm pathway impaired inference-based value updating. Our results thus demonstrate a dissociable contribution of distinct OFC projections to different behavioral strategies, and provide new insights into the neural basis of value-based adaptive decision-making in primates.
Affiliation(s)
- Kei Oyama
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- PRESTO, Japan Science and Technology Agency, Kawaguchi, Japan
- Kei Majima
- Institute for Quantum Life Science, National Institutes for Quantum Science and Technology, Chiba, Japan
- PRESTO, Japan Science and Technology Agency, Kawaguchi, Japan
- Yuji Nagai
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- Yukiko Hori
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- Toshiyuki Hirabayashi
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- Mark A G Eldridge
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, USA
- Koki Mimura
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- Research Center for Medical and Health Data Science, The Institute of Statistical Mathematics, Tachikawa, Japan
- Naohisa Miyakawa
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- Atsushi Fujimoto
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- Yuki Hori
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- Haruhiko Iwaoki
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- Ken-Ichi Inoue
- Systems Neuroscience Section, Center for the Evolutionary Origins of Human Behavior, Kyoto University, Inuyama, Japan
- Richard C Saunders
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, USA
- Masahiko Takada
- Systems Neuroscience Section, Center for the Evolutionary Origins of Human Behavior, Kyoto University, Inuyama, Japan
- Noriaki Yahata
- Institute for Quantum Life Science, National Institutes for Quantum Science and Technology, Chiba, Japan
- Makoto Higuchi
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
- Barry J Richmond
- Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Bethesda, USA
- Takafumi Minamimoto
- Department of Functional Brain Imaging, National Institutes for Quantum Science and Technology, Chiba, Japan
7
Zhang P, Pan Y, Zha R, Song H, Yuan C, Zhao Q, Piao Y, Ren J, Chen Y, Liang P, Tao R, Wei Z, Zhang X. Impulsivity-related right superior frontal gyrus as a biomarker of internet gaming disorder. Gen Psychiatr 2023; 36:e100985. [PMID: 37583792 PMCID: PMC10423834 DOI: 10.1136/gpsych-2022-100985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2022] [Accepted: 07/12/2023] [Indexed: 08/17/2023]
Abstract
Background: Internet gaming disorder (IGD) is a mental health issue that affects individuals worldwide. However, the lack of knowledge about the biomarkers associated with the development of IGD has restricted the diagnosis and treatment of this disorder. Aims: We aimed to reveal the biomarkers associated with the development of IGD through resting-state brain network analysis and provide clues for the diagnosis and treatment of IGD. Methods: Twenty-six patients with IGD, 23 excessive internet game users (EIUs) who recurrently played internet games but were not diagnosed with IGD, and 29 healthy controls (HCs) performed the delay discounting task (DDT) and the Iowa gambling task (IGT). Resting-state functional magnetic resonance imaging (fMRI) data were also collected. Results: Patients with IGD exhibited significantly lower hubness in the right medial orbital part of the superior frontal gyrus (ORBsupmed) than both the EIU and the HC groups. Additionally, the hubness of the right ORBsupmed was positively correlated with the highest degree of excessive internet gaming during the past year in the EIU group but not in the IGD group; this might be the protective mechanism that prevents EIUs from becoming addicted to internet games. Moreover, the hubness of the right ORBsupmed was related to the treatment outcome of patients with IGD, with higher hubness of this region indicating better recovery when undergoing forced abstinence. Further modelling analysis of the DDT and IGT showed that patients with IGD displayed higher impulsivity during the decision-making process, and impulsivity-related parameters were negatively correlated with the hubness of the right ORBsupmed. Conclusions: Our findings revealed that the impulsivity-related right ORBsupmed hubness could serve as a potential biomarker of IGD and provide clues for the diagnosis and treatment of IGD.
Affiliation(s)
- Pengyu Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale and School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Yu Pan
- Key Laboratory of Brain-Machine Intelligence for Information Behavior (Ministry of Education and Shanghai), School of Business and Management, Shanghai International Studies University, Shanghai, China
- Rujing Zha
- Department of Psychology, School of Humanities and Social Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Hongwen Song
- Hefei National Laboratory for Physical Sciences at the Microscale and School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Cunfeng Yuan
- Drug Rehabilitation Administration, Ministry of Justice of the People's Republic of China, Beijing, China
- Qian Zhao
- Hefei National Laboratory for Physical Sciences at the Microscale and School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Yi Piao
- Application Technology Center of Physical Therapy to Brain Disorders, Institute of Advanced Technology, University of Science and Technology of China, Hefei, Anhui, China
- Jiecheng Ren
- Hefei National Laboratory for Physical Sciences at the Microscale and School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Yijun Chen
- Hefei National Laboratory for Physical Sciences at the Microscale and School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Peipeng Liang
- School of Psychology, Beijing Key Laboratory of Learning and Cognition, Capital Normal University, Beijing, China
- Ran Tao
- Department of Psychological Medicine, Chinese People's Liberation Army General Hospital, Beijing, China
- Zhengde Wei
- Department of Psychology, School of Humanities and Social Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Xiaochu Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale and School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Department of Psychology, School of Humanities and Social Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Application Technology Center of Physical Therapy to Brain Disorders, Institute of Advanced Technology, University of Science and Technology of China, Hefei, Anhui, China
- Institute of Health and Medicine, Hefei Comprehensive Science Center, Hefei, Anhui, China
8
Sannia A, Giordano A, Gullo NL, Mastroianni C, Plastina F. A hybrid classical-quantum approach to speed-up Q-learning. Sci Rep 2023; 13:3913. [PMID: 36890198 PMCID: PMC9995512 DOI: 10.1038/s41598-023-30990-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Accepted: 03/06/2023] [Indexed: 03/10/2023]
Abstract
We introduce a classical-quantum hybrid approach to computation, allowing for a quadratic performance improvement in the decision process of a learning agent. Using the paradigm of quantum accelerators, we introduce a routine that runs on a quantum computer, which allows for the encoding of probability distributions. This quantum routine is then employed, in a reinforcement learning set-up, to encode the distributions that drive action choices. Our routine is well suited to the case of a large, although finite, number of actions and can be employed in any scenario where a probability distribution with a large support is needed. We describe the routine and assess its performance in terms of computational complexity, required quantum resources, and accuracy. Finally, we design an algorithm showing how to exploit it in the context of Q-learning.
Affiliation(s)
- A Sannia
- Dipartimento di Fisica, Università della Calabria, 87036, Arcavacata di Rende, (CS), Italy
- Institute for Cross-Disciplinary Physics and Complex Systems (IFISC) UIB-CSIC, Campus Universitat Illes Balears, 07122, Palma de Mallorca, Spain
- N Lo Gullo
- Dipartimento di Fisica, Università della Calabria, 87036, Arcavacata di Rende, (CS), Italy
- INFN, gruppo collegato di Cosenza, Cosenza, Italy
- Quantum Algorithms and Software, VTT Technical Research Centre of Finland Ltd, Espoo, Finland
- F Plastina
- Dipartimento di Fisica, Università della Calabria, 87036, Arcavacata di Rende, (CS), Italy
- INFN, gruppo collegato di Cosenza, Cosenza, Italy
9
Yin L, Liu D. Quantum parallel model predictive control for grid-connected solid oxide fuel cells. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/04/2023]
10
Yago Malo J, Cicchini GM, Morrone MC, Chiofalo ML. Quantum spin models for numerosity perception. PLoS One 2023; 18:e0284610. [PMID: 37098002 PMCID: PMC10128973 DOI: 10.1371/journal.pone.0284610] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/23/2022] [Accepted: 04/04/2023] [Indexed: 04/26/2023]
Abstract
Humans share with animals, both vertebrates and invertebrates, the capacity to sense the number of items in their environment already at birth. The pervasiveness of this skill across the animal kingdom suggests that it should emerge in very simple populations of neurons. The current modelling literature, however, has struggled to provide a simple architecture that carries out this task: most proposals suggest that number sense emerges in complex multi-layered neural networks, typically requiring supervised learning, while simple accumulator models fail to predict Weber's law, a common trait of human and animal numerosity processing. We present a simple quantum spin model with all-to-all connectivity, where numerosity is encoded in the spectrum after stimulation with a number of transient signals occurring in a random or orderly temporal sequence. We use a paradigmatic simulational approach borrowed from the theory and methods of open quantum systems out of equilibrium, as a possible way to describe information processing in neural systems. Our method is able to capture many of the perceptual characteristics of numerosity in such systems. The frequency components of the magnetization spectra at harmonics of the system's tunneling frequency increase with the number of stimuli presented. The amplitude decoding of each spectrum, performed with an ideal-observer model, reveals that the system follows Weber's law. This contrasts with the well-known failure of linear-system or accumulator models to reproduce Weber's law.
Affiliation(s)
- Jorge Yago Malo
- Department of Physics "Enrico Fermi" and INFN, University of Pisa, Pisa, Italy
- Maria Concetta Morrone
- Department of Translational Research and of New Surgical and Medical Technologies, University of Pisa and PisaVisionLab, Pisa, Italy
11
Ho JKW, Hoorn JF. Quantum affective processes for multidimensional decision-making. Sci Rep 2022; 12:20468. [PMID: 36443304 PMCID: PMC9705568 DOI: 10.1038/s41598-022-22855-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 10/20/2022] [Indexed: 11/29/2022]
Abstract
In modeling the human affective system and applying lessons learned to human-robot interaction, the challenge is to handle ambiguous emotional states of an agency (whether human or artificial), probabilistic decisions, and freedom of choice in affective and behavioral patterns. Moreover, many cognitive processes seem to run in parallel, whereas seriality is the standard in conventional computation. Representations of contextual aspects of behavior and processes, and of self-directed neuroplasticity, are still lacking, so we attempt a quantum-computational construction of robot affect, which theoretically should be able to account for indefinite and ambiguous states as well as parallelism. Our Quantum Coppélia (Q-Coppélia) is a translation into quantum logics of the fuzzy-based Silicon Coppélia system, which simulates the progression of a robot's attitude towards its user. We show the entire circuitry of the Q-Coppélia framework, aiming at contemporary descriptions of (neuro)psychological processes. Arguably, our work provides a system for simulating and handling affective interactions among various agencies from an understanding of the relations between quantum algorithms and the fundamental nature of psychology.
Affiliation(s)
- Johnny K W Ho
- Laboratory for Artificial Intelligence in Design, Hong Kong Science Park, New Territories, Hong Kong
- Johan F Hoorn
- School of Design, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
- Laboratory for Artificial Intelligence in Design, Hong Kong Science Park, New Territories, Hong Kong
- Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
- Department of Communication Science, VU University Amsterdam, Amsterdam, Netherlands
12
Liu S, Wang B, Li H, Chen C, Wang Z. Continual portfolio selection in dynamic environments via incremental reinforcement learning. Int J Mach Learn Cyb 2022. [DOI: 10.1007/s13042-022-01639-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2022]
13
Wei Q, Ma H, Chen C, Dong D. Deep Reinforcement Learning With Quantum-Inspired Experience Replay. IEEE Transactions on Cybernetics 2022; 52:9326-9338. [PMID: 33600343 DOI: 10.1109/tcyb.2021.3053414] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/12/2023]
Abstract
In this article, a novel training paradigm inspired by quantum computation is proposed for deep reinforcement learning (DRL) with experience replay. In contrast to the traditional experience replay mechanism in DRL, the proposed DRL with quantum-inspired experience replay (DRL-QER) adaptively chooses experiences from the replay buffer according to the complexity and the replayed times of each experience (also called a transition), to achieve a balance between exploration and exploitation. In DRL-QER, transitions are first formulated in quantum representations, and then the preparation operation and depreciation operation are performed on the transitions. In this process, the preparation operation reflects the relationship between the temporal-difference errors (TD-errors) and the importance of the experiences, while the depreciation operation is taken into account to ensure the diversity of the transitions. The experimental results on Atari 2600 games show that DRL-QER outperforms state-of-the-art algorithms, such as DRL-PER and DCRL, on most of these games with improved training efficiency, and is also applicable to memory-based DRL approaches such as double networks and dueling networks.
14
Wang Z, Chen C, Dong D. Lifelong Incremental Reinforcement Learning With Online Bayesian Inference. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:4003-4016. [PMID: 33571098 DOI: 10.1109/tnnls.2021.3055499] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
A central capability of a long-lived reinforcement learning (RL) agent is to incrementally adapt its behavior as its environment changes and to incrementally build upon previous experiences to facilitate future learning in real-world scenarios. In this article, we propose lifelong incremental reinforcement learning (LLIRL), a new incremental algorithm for efficient lifelong adaptation to dynamic environments. We develop and maintain a library that contains an infinite mixture of parameterized environment models, which is equivalent to clustering environment parameters in a latent space. The prior distribution over the mixture is formulated as a Chinese restaurant process (CRP), which incrementally instantiates new environment models without any external information to signal environmental changes in advance. During lifelong learning, we employ the expectation-maximization (EM) algorithm with online Bayesian inference to update the mixture in a fully incremental manner. In EM, the E-step involves estimating the posterior expectation of environment-to-cluster assignments, whereas the M-step updates the environment parameters for future learning. This method allows for all environment models to be adapted as necessary, with new models instantiated for environmental changes and old models retrieved when previously seen environments are encountered again. Simulation experiments demonstrate that LLIRL outperforms relevant existing methods and enables effective incremental adaptation to various dynamic environments for lifelong learning.
15
Zhao Q, Zhang Y, Wang M, Ren J, Chen Y, Chen X, Wei Z, Sun J, Zhang X. Effects of retrieval-extinction training on internet gaming disorder. J Behav Addict 2022; 11:49-62. [PMID: 35316208 PMCID: PMC9109625 DOI: 10.1556/2006.2022.00006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2021] [Revised: 10/28/2021] [Accepted: 02/27/2022] [Indexed: 11/19/2022]
Abstract
BACKGROUND AND AIMS: Internet gaming disorder (IGD) leads to serious impairments in cognitive functions and lacks effective treatments. Cue-induced craving is a hallmark feature of this disease and is associated with addictive memory elements. Memory retrieval-extinction manipulations could interfere with addictive memories and attenuate addictive syndromes, which might be a promising intervention for IGD. The aim of this study was to explore the effect of a memory retrieval-extinction manipulation on gaming cue-induced craving and reward processing in individuals with IGD. METHODS: A total of 49 individuals (mean age: 20.52 ± 1.58) with IGD underwent memory retrieval-extinction training (RET) with a 10-min interval (R-10min-E, n = 24) or RET with a 6-h interval (R-6h-E, n = 25) for two consecutive days. We assessed cue-induced craving pre- and post-RET, and at the 1- and 3-month follow-ups. Neural activity during reward processing was also assessed pre- and post-RET. RESULTS: Compared with the R-6h-E group, gaming cravings in individuals with IGD were significantly reduced after R-10min-E training at the 3-month follow-up (P < 0.05). Moreover, neural activity in the individuals with IGD was also altered after R-10min-E training, which was corroborated by enhanced reward processing, such as faster responses (P < 0.05) and stronger frontoparietal functional connectivity to monetary reward cues, while the R-6h-E training had no effects. DISCUSSION AND CONCLUSIONS: The two-day R-10min-E training reduced addicts' craving for Internet games, restored monetary reward processing in IGD individuals, and maintained long-term efficacy.
Affiliation(s)
- Qian Zhao
- Department of Otolaryngology-Head and Neck Surgery, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230001, China; Key Laboratory of Brain Function and Disease, Chinese Academy of Sciences, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei, Anhui, 230027, China
- Yongjun Zhang
- School of Foreign Languages, Anhui Jianzhu University, Hefei, Anhui, 230022, China; Key Laboratory of Brain Function and Disease, Chinese Academy of Sciences, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei, Anhui, 230027, China; Department of Psychology, School of Humanities & Social Science, University of Science & Technology of China, Hefei, Anhui, 230027, China
- Min Wang
- Key Laboratory of Brain Function and Disease, Chinese Academy of Sciences, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei, Anhui, 230027, China; Department of Psychology, School of Humanities & Social Science, University of Science & Technology of China, Hefei, Anhui, 230027, China
- Jiecheng Ren
- Key Laboratory of Brain Function and Disease, Chinese Academy of Sciences, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei, Anhui, 230027, China
- Yijun Chen
- Key Laboratory of Brain Function and Disease, Chinese Academy of Sciences, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei, Anhui, 230027, China
- Xueli Chen
- Key Laboratory of Brain Function and Disease, Chinese Academy of Sciences, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei, Anhui, 230027, China; Department of Social and Behavioural Sciences, City University of Hong Kong, Hong Kong, People’s Republic of China
- Zhengde Wei
- Key Laboratory of Brain Function and Disease, Chinese Academy of Sciences, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei, Anhui, 230027, China. Corresponding author. Tel./fax: +86-551-37 63607295.
- Jingwu Sun
- Department of Otolaryngology-Head and Neck Surgery, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230001, China. Corresponding author. Tel./fax: +86-551-37 63607295.
- Xiaochu Zhang
- Key Laboratory of Brain Function and Disease, Chinese Academy of Sciences, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei, Anhui, 230027, China; Department of Psychology, School of Humanities & Social Science, University of Science & Technology of China, Hefei, Anhui, 230027, China; Institute of Advanced Technology, University of Science and Technology of China, Hefei, Anhui, 230001, China; Hefei Medical Research Center on Alcohol Addiction, Affiliated Psychological Hospital of Anhui Medical University, Hefei Fourth People’s Hospital, Anhui Mental Health Center, Hefei, Anhui, 230017, China. Corresponding author. Tel./fax: +86-551-37 63607295.
|
16
|
Lü W, Wu Q, Liu Y, Wang Y, Wei Z, Li Y, Fan C, Wang AL, Borland R, Zhang X. No smoking signs with strong smoking symbols induce weak cravings: an fMRI and EEG study. Neuroimage 2022; 252:119019. [PMID: 35202814 DOI: 10.1016/j.neuroimage.2022.119019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2021] [Revised: 12/12/2021] [Accepted: 02/17/2022] [Indexed: 11/28/2022] Open
Abstract
No smoking signs (NSSs) that combine smoking symbols (SSs) and prohibition symbols (PSs) represent common examples of reward and prohibition competition. To evaluate how SSs within NSSs influence their effectiveness in guiding reward vs. prohibition, we studied 93 male smokers. We collected self-reported craving ratings (N=30), cue reactivity under fMRI/EEG (N=33), and smoking-behavior anticipation for paired NSSs and SSs (N=30). We found that NSS-induced cravings were negatively correlated with SS-induced cravings and PS-induced inhibition. fMRI indicated that both correlations were mediated by activation of the inferior frontal gyrus and precuneus, suggesting that the effects of SSs and PSs interact with each other. EEG revealed that the prohibition response occurs after the cigarette response, indicating that the cigarette response might be precluded by the prohibition, supporting the effect of SSs in discouraging smoking. Moreover, stronger SSs induced stronger slow positive waves and late positive potentials, and the stronger the late positive potentials, the stronger the late positive potentials. Both the amplitudes of late positive potentials and slow positive waves were positively correlated with the amplitude of N2, which was positively correlated with the attention grabbed score by the NSS. In addition, the weaker the NSS-induced craving, the greater the smoking behavior anticipation reduction, indicating the capability of NSSs to decrease smoking behavior. Our study provides empirical evidence for selecting the most effective NSSs: those combining strong SS and PS, offering insights about competition between cigarette reward and prohibition and providing neural evidence on how cigarette reward and prohibition interact.
Affiliation(s)
- Wanwan Lü
- Department of Radiology, the First Affiliated Hospital of USTC, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei 230027, China
- Qichao Wu
- Department of Radiology, the First Affiliated Hospital of USTC, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei 230027, China
- Ying Liu
- Department of Radiology, the First Affiliated Hospital of USTC, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei 230027, China.
- Ying Wang
- Department of Radiology, the First Affiliated Hospital of USTC, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei 230027, China
- Zhengde Wei
- Department of Radiology, the First Affiliated Hospital of USTC, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei 230027, China
- Yu Li
- Department of Radiology, the First Affiliated Hospital of USTC, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei 230027, China
- Chuan Fan
- Department of Radiology, the First Affiliated Hospital of USTC, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei 230027, China; Department of Psychiatry, the First Affiliated Hospital of Anhui Medical University, Hefei 230022, China
- An-Li Wang
- Addiction Institute at Mount Sinai, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, United States
- Ron Borland
- School of Psychological Sciences, University of Melbourne and Cancer Council Victoria, Australia
- Xiaochu Zhang
- Department of Radiology, the First Affiliated Hospital of USTC, School of Life Science, Division of Life Science and Medicine, University of Science & Technology of China, Hefei 230027, China; Department of Psychology, School of Humanities and Social Science, University of Science and Technology of China, Hefei, Anhui 230026, China; Hefei Medical Research Center on Alcohol Addiction, Affiliated Psychological Hospital of Anhui Medical University, Hefei Fourth People's Hospital, Anhui Mental Health Center, Hefei 230017, China; Biomedical Sciences and Health Laboratory of Anhui Province, University of Science & Technology of China, Hefei 230027, China.
|
17
|
Niraula D, Jamaluddin J, Matuszak MM, Haken RKT, Naqa IE. Quantum deep reinforcement learning for clinical decision support in oncology: application to adaptive radiotherapy. Sci Rep 2021; 11:23545. [PMID: 34876609 PMCID: PMC8651664 DOI: 10.1038/s41598-021-02910-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/20/2021] [Accepted: 11/24/2021] [Indexed: 01/31/2023] Open
Abstract
Subtle differences in a patient's genetics and physiology may alter radiotherapy (RT) treatment responses, motivating the need for a more personalized treatment plan. Accordingly, we have developed a novel quantum deep reinforcement learning (qDRL) framework for clinical decision support that can estimate an individual patient's dose response mid-treatment and recommend an optimal dose adjustment. Our framework considers patients' specific information including biological, physical, genetic, clinical, and dosimetric factors. Recognizing that physicians must make decisions amidst uncertainty in RT treatment outcomes, we employed indeterministic quantum states to represent human decision making in a real-life scenario. We paired quantum decision states with a model-based deep Q-learning algorithm to optimize the clinical decision-making process in RT. We trained our proposed qDRL framework on an institutional dataset of 67 stage III non-small cell lung cancer (NSCLC) patients treated on prospective adaptive protocols and independently validated our framework in an external multi-institutional dataset of 174 NSCLC patients. For a comprehensive evaluation, we compared three frameworks: DRL, qDRL trained in a Qiskit quantum computing simulator, and qDRL trained in an IBM quantum computer. Two metrics were considered to evaluate our framework: (1) a similarity score, defined as the root mean square error between retrospective clinical decisions and the AI recommendations, and (2) a self-evaluation scheme that compares retrospective clinical decisions and AI recommendations based on the improvement in the observed clinical outcomes. Our analysis shows that our framework, which takes into consideration individual patient dose response in its decision-making, can potentially improve clinical RT decision-making by at least about 10% compared to unaided clinical practice. Further validation of our novel quantitative approach in a prospective study will provide a necessary framework for improving the standard of care in personalized RT.
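The first metric in this abstract, a similarity score defined as the root mean square error between retrospective clinical decisions and AI recommendations, can be sketched as follows. This is a minimal illustration of the RMSE definition only; the function name `similarity_score` and the example dose values are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def similarity_score(clinical: Sequence[float],
                     recommended: Sequence[float]) -> float:
    """Root mean square error between two equally long decision sequences.

    Lower values mean the AI recommendations are closer to the
    retrospective clinical decisions.
    """
    if len(clinical) != len(recommended):
        raise ValueError("sequences must have the same length")
    if len(clinical) == 0:
        raise ValueError("sequences must not be empty")
    squared_errors = [(c - r) ** 2 for c, r in zip(clinical, recommended)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical dose decisions, for illustration only.
    clinical_doses = [2.0, 2.5, 3.0]
    ai_doses = [2.0, 2.7, 2.8]
    print(similarity_score(clinical_doses, ai_doses))
```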
Affiliation(s)
- Dipesh Niraula
- Department of Machine Learning, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA.
- Jamalina Jamaluddin
- Department of Nuclear Engineering and Radiological Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
- Martha M Matuszak
- Department of Nuclear Engineering and Radiological Sciences, University of Michigan, Ann Arbor, MI, 48109, USA
- Department of Radiation Oncology, University of Michigan, Ann Arbor, MI, 48109, USA
- Randall K Ten Haken
- Department of Radiation Oncology, University of Michigan, Ann Arbor, MI, 48109, USA
- Issam El Naqa
- Department of Machine Learning, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
|
18
|
Loganathan K, Ho ETW. Value, drug addiction and the brain. Addict Behav 2021; 116:106816. [PMID: 33453587 DOI: 10.1016/j.addbeh.2021.106816] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/02/2020] [Revised: 11/17/2020] [Accepted: 01/02/2021] [Indexed: 12/15/2022]
Abstract
Over the years, various models have been proposed to explain the psychology and biology of drug addiction, built primarily around the habit and compulsion models. Recent research indicates drug addiction may be goal-directed, motivated by excessive valuation of drugs. Drug consumption may initially occur for the sake of pleasure but may transition to a means of escaping withdrawal, stress and negative emotions. In this hypothetical paper, we propose a value-based neurobiological model for drug addiction. We posit that during dependency, the value-based decision-making system in the brain is not inactive but has instead prioritized drugs as the reward of choice. In support of this model, we consider the role of valuation in choice, its influence on pleasure and punishment, and how valuation is contrasted in impulsive and compulsive behaviours. We then discuss the neurobiology of value, beginning with the dopaminergic system and its relationship with incentive salience before moving to brain-wide networks involved in valuation, control and prospection. These value-based neurobiological components are then integrated into the cycle of addiction as we consider the development of drug dependency from a valuation perspective. We conclude with a discussion of cognitive interventions utilizing value-based decision-making, highlighting not just advances in recalibrating the valuation system to focus on non-drug rewards, but also areas for improvement in refining this approach.
Affiliation(s)
- Kavinash Loganathan
- Centre for Intelligent Signal & Imaging, Universiti Teknologi PETRONAS, Perak, Malaysia.
- Eric Tatt Wei Ho
- Centre for Intelligent Signal & Imaging, Universiti Teknologi PETRONAS, Perak, Malaysia; Dept of Electrical & Electronics Engineering, Universiti Teknologi PETRONAS, Perak, Malaysia
|
19
|
Outbreak of COVID-19 altered the relationship between memory bias and depressive degree in nonclinical depression. iScience 2021; 24:102081. [PMID: 33495750 PMCID: PMC7816901 DOI: 10.1016/j.isci.2021.102081] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/08/2020] [Revised: 12/14/2020] [Accepted: 01/14/2021] [Indexed: 11/20/2022] Open
Abstract
The outbreak of the novel coronavirus disease 2019 (COVID-19) has increased concern about people's mental health, especially depressive symptoms, under such a seriously stressful situation. Cognitive biases have been related to depression degree in previous studies. Here, we used behavioral and brain imaging analysis to determine if and how the COVID-19 pandemic affects the relationship between current cognitive biases and future depression degree, and the underlying neural basis, in a nonclinical depressed population. An unexpected result showed that a more negative memory bias was associated with a greater decrease in future depressive indices in nonclinical depressed participants during the COVID-19 pandemic, which might be due to decreased social stress. These data enhance our understanding of how the depressive degree of nonclinical depressed populations will change during the COVID-19 pandemic and also provide support for social distancing policies from a psychological perspective.
Highlights:
- We collected depressive degree before and during the COVID-19 pandemic
- Depressive degree negatively correlated with memory bias during the pandemic
- Reduced social stress during the pandemic might lead to the altered relationship
- Results provide extra support for social distancing policies during the pandemic
|
20
|
Nowakowski K, Carvalho P, Six JB, Maillet Y, Nguyen AT, Seghiri I, M'Pemba L, Marcille T, Ngo ST, Dao TT. Human locomotion with reinforcement learning using bioinspired reward reshaping strategies. Med Biol Eng Comput 2021; 59:243-256. [PMID: 33417125 DOI: 10.1007/s11517-020-02309-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/06/2020] [Accepted: 12/30/2020] [Indexed: 10/22/2022]
Abstract
Recent learning strategies such as reinforcement learning (RL) have favored the transition from applied artificial intelligence to general artificial intelligence. One of the current challenges of RL in healthcare relates to the development of a controller to teach a musculoskeletal model to perform dynamic movements. Several solutions have been proposed. However, there is still a lack of investigations exploring the muscle control problem from a biomechanical point of view. Moreover, no studies using biological knowledge to develop plausible motor control models for pathophysiological conditions make use of reward reshaping. Consequently, the objective of the present work was to design and evaluate specific bioinspired reward function strategies for human locomotion learning within an RL framework. The deep deterministic policy gradient (DDPG) method for a single-agent RL problem was applied. A 3D musculoskeletal model (8 DoF and 22 muscles) of a healthy adult was used. A virtual interactive environment was developed and simulated using the opensim-rl library. Three reward functions were defined for walking, forward falls, and side falls. The training process was performed with Google Cloud Compute Engine. The obtained outcomes were compared to the NIPS 2017 challenge outcomes, experimental observations, and literature data. Regarding learning to walk, simulated musculoskeletal models were able to walk from 18 to 20.5 m for the best solutions. A compensation strategy of muscle activations was revealed. The soleus, tibialis anterior, and vasti muscles are the main actors of the simple forward fall. A higher intensity of muscle activations was also noted after the fall. All kinematics and muscle patterns were consistent with experimental observations and literature data. Regarding the side fall, an intensive level of muscle activation on the expected fall side to unbalance the body was noted. The obtained outcomes suggest that computational and human resources as well as biomechanical knowledge are needed together to develop and evaluate an efficient and robust RL solution. As perspectives, current solutions will be extended to a larger parameter space in 3D. Furthermore, a stochastic reinforcement learning model will be investigated in the future to account for the uncertainties of the musculoskeletal model and associated environment and to provide a general artificial intelligence solution for human locomotion learning.
Affiliation(s)
- Katharine Nowakowski
- Université de technologie de Compiègne, CNRS, Biomechanics and Bioengineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France
- Philippe Carvalho
- Université de technologie de Compiègne, CNRS, Biomechanics and Bioengineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France
- Jean-Baptiste Six
- Université de technologie de Compiègne, CNRS, Biomechanics and Bioengineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France
- Yann Maillet
- Université de technologie de Compiègne, CNRS, Biomechanics and Bioengineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France
- Anh Tu Nguyen
- Université de technologie de Compiègne, CNRS, Biomechanics and Bioengineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France
- Ismail Seghiri
- Université de technologie de Compiègne, CNRS, Biomechanics and Bioengineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France
- Loick M'Pemba
- Université de technologie de Compiègne, CNRS, Biomechanics and Bioengineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France
- Theo Marcille
- Université de technologie de Compiègne, CNRS, Biomechanics and Bioengineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France
- Sy Toan Ngo
- Université de technologie de Compiègne, CNRS, Biomechanics and Bioengineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France
- Tien-Tuan Dao
- Université de technologie de Compiègne, CNRS, Biomechanics and Bioengineering, Centre de recherche Royallieu, CS 60 319 - 60 203, Compiègne Cedex, France; Univ. Lille, CNRS, Centrale Lille, UMR 9013 LaMcube - Laboratoire de Mécanique, Multiphysique, Multiéchelle, F-59000, Lille, France
|
21
|
Elbadawi M, Gaisford S, Basit AW. Advanced machine-learning techniques in drug discovery. Drug Discov Today 2020; 26:769-777. [PMID: 33290820 DOI: 10.1016/j.drudis.2020.12.003] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 08/24/2020] [Revised: 11/16/2020] [Accepted: 12/02/2020] [Indexed: 01/20/2023]
Abstract
The popularity of machine learning (ML) across drug discovery continues to grow, yielding impressive results. As the use of ML techniques increases, their limitations become apparent. Such limitations include their need for big data, sparsity in data, and their lack of interpretability. It has also become apparent that the techniques are not truly autonomous, requiring retraining even post deployment. In this review, we detail the use of advanced techniques to circumvent these challenges, with examples drawn from drug discovery and allied disciplines. In addition, we present emerging techniques and their potential role in drug discovery. The techniques presented herein are anticipated to expand the applicability of ML in drug discovery.
Affiliation(s)
- Moe Elbadawi
- Department of Pharmaceutics, UCL School of Pharmacy, University College London, 29-39 Brunswick Square, London, WC1N 1AX, UK
- Simon Gaisford
- Department of Pharmaceutics, UCL School of Pharmacy, University College London, 29-39 Brunswick Square, London, WC1N 1AX, UK; FabRx Ltd, 3 Romney Road, Ashford, TN24 0RW, UK
- Abdul W Basit
- Department of Pharmaceutics, UCL School of Pharmacy, University College London, 29-39 Brunswick Square, London, WC1N 1AX, UK; FabRx Ltd, 3 Romney Road, Ashford, TN24 0RW, UK.
|
22
|
Mahmoudi M. A missing, but essential, platform for multidisciplinary scientific discussion: understanding the 'elephant'. Future Sci OA 2020; 7:FSO666. [PMID: 33552544 PMCID: PMC7849986 DOI: 10.2144/fsoa-2020-0188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
Affiliation(s)
- Morteza Mahmoudi
- Department of Radiology & Precision Health Program, Michigan State University, East Lansing, MI 48824, USA
|
23
|
Chen C, Zhang Y, Li Y, Wang Z, Liu Y, Cao W, Wu X. Iterative Learning Control for a Soft Exoskeleton with Hip and Knee Joint Assistance. Sensors (Basel, Switzerland) 2020; 20:E4333. [PMID: 32759646 PMCID: PMC7435451 DOI: 10.3390/s20154333] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 06/30/2020] [Revised: 07/28/2020] [Accepted: 08/02/2020] [Indexed: 11/16/2022]
Abstract
Walking on different terrains leads to different biomechanics, which motivates the development of exoskeletons that assist walking according to the type of terrain. This paper presents the design of a lightweight soft exoskeleton that simultaneously assists multiple joints in the lower limb. It assists both the hip and knee joints in a single system: the assistance force is applied directly to hip flexion and knee extension, and indirectly to hip extension. Based on the biological torque of human walking at three different slopes, a novel strategy is developed to improve the performance of assistance. A parameter optimal iterative learning control (POILC) method is introduced to reduce the error caused by differences in wearing position and in the biological features of different wearers. To obtain the metabolic rate, three subjects walked on a treadmill for 10 min on each terrain at a speed of 4 km/h, both with and without the soft exoskeleton. Results showed that the reduction in net metabolic rate increased with the slope of the terrain. The reductions in the net metabolic rate on the downhill, flat ground, and uphill terrains were, respectively, 9.86%, 12.48%, and 22.08% compared to the condition of not wearing the soft exoskeleton, corresponding to absolute reductions of 0.28 W/kg, 0.72 W/kg, and 1.60 W/kg.
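As a worked check on the figures in this abstract, the percentage and absolute reductions together imply a baseline net metabolic rate for each terrain. The sketch below only rearranges the reported numbers, assuming the percentage is the absolute reduction divided by the rate measured without the exoskeleton; the function name `implied_baseline` is hypothetical and the baselines themselves are not reported in the abstract.

```python
def implied_baseline(abs_reduction_w_per_kg: float,
                     percent_reduction: float) -> float:
    """Baseline net metabolic rate (W/kg) implied by the two reported values.

    Assumes percent_reduction = 100 * abs_reduction / baseline.
    """
    if percent_reduction <= 0:
        raise ValueError("percent_reduction must be positive")
    return abs_reduction_w_per_kg / (percent_reduction / 100.0)


if __name__ == "__main__":
    # (absolute reduction in W/kg, percentage reduction) from the abstract.
    reported = {
        "downhill": (0.28, 9.86),
        "flat ground": (0.72, 12.48),
        "uphill": (1.60, 22.08),
    }
    for terrain, (absolute, percent) in reported.items():
        baseline = implied_baseline(absolute, percent)
        print(f"{terrain}: implied baseline {baseline:.2f} W/kg")
```

Under that assumption the implied baselines come out at roughly 2.8, 5.8, and 7.2 W/kg, which is consistent with metabolic cost rising from downhill to uphill walking.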
Affiliation(s)
- Chunjie Chen
- CAS Key Laboratory of Human-Machine-Intelligence Synergic Systems, Shenzhen Institutes of Advanced Technology, Shenzhen 518055, China
- Guangdong Provincial Key Lab of Robotics and Intelligent System, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- ShenZhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Yu Zhang
- CAS Key Laboratory of Human-Machine-Intelligence Synergic Systems, Shenzhen Institutes of Advanced Technology, Shenzhen 518055, China
- Guangdong Provincial Key Lab of Robotics and Intelligent System, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Harbin Institute of Technology, School of Mechanical Engineering and Automation, Shenzhen 518055, China
- Yanjie Li
- Harbin Institute of Technology, School of Mechanical Engineering and Automation, Shenzhen 518055, China
- Zhuo Wang
- CAS Key Laboratory of Human-Machine-Intelligence Synergic Systems, Shenzhen Institutes of Advanced Technology, Shenzhen 518055, China
- Guangdong Provincial Key Lab of Robotics and Intelligent System, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Harbin Institute of Technology, School of Mechanical Engineering and Automation, Shenzhen 518055, China
- Yida Liu
- CAS Key Laboratory of Human-Machine-Intelligence Synergic Systems, Shenzhen Institutes of Advanced Technology, Shenzhen 518055, China
- Guangdong Provincial Key Lab of Robotics and Intelligent System, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- ShenZhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Wujing Cao
- CAS Key Laboratory of Human-Machine-Intelligence Synergic Systems, Shenzhen Institutes of Advanced Technology, Shenzhen 518055, China
- Guangdong Provincial Key Lab of Robotics and Intelligent System, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- ShenZhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
- Xinyu Wu
- CAS Key Laboratory of Human-Machine-Intelligence Synergic Systems, Shenzhen Institutes of Advanced Technology, Shenzhen 518055, China
- Guangdong Provincial Key Lab of Robotics and Intelligent System, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- ShenZhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China
|