1
|
Chen YM, Chang KY, Liu C, Hsiao TC, Hong ZW, Lee CY. Composing Synergistic Macro Actions for Reinforcement Learning Agents. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:7251-7258. [PMID: 36318571 DOI: 10.1109/tnnls.2022.3213606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Macro actions have been shown to benefit an agent's learning process, and this has encouraged the development of a variety of techniques for constructing more effective ones. However, previous techniques usually do not consider combining macro actions into a synergistic macro action ensemble, in which synergism arises when the constituent macro actions work well when used jointly by an agent during evaluation. Such a synergistic macro action ensemble may allow an agent to perform even better than it would with any of the individual macro actions within it. Motivated by recent advances in neural architecture search (NAS), in this brief we formulate the construction of a synergistic macro action ensemble as a Markov decision process (MDP) and evaluate the constructed macro action ensemble as a whole. This problem formulation enables the proposed evaluation procedure to take synergism into account. Our experimental results demonstrate that the proposed framework is able to discover synergistic macro action ensembles. Furthermore, we highlight the benefits of these macro action ensembles through a set of analytical cases.
|
2
|
Bäckström C, Jonsson P. A framework for analysing state-abstraction methods. ARTIF INTELL 2022. [DOI: 10.1016/j.artint.2021.103608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
3
|
Percassi F, Gerevini AE, Scala E, Serina I, Vallati M. Improving Domain-Independent Heuristic State-Space Planning via plan cost predictions. J EXP THEOR ARTIF IN 2021. [DOI: 10.1080/0952813x.2021.1970239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
Affiliation(s)
- Francesco Percassi
  - Dipartimento d’Ingegneria dell’Informazione, Università degli Studi di Brescia, Brescia, Italy
  - Department of Informatics, University of Huddersfield, Huddersfield, UK
- Alfonso E. Gerevini
  - Dipartimento d’Ingegneria dell’Informazione, Università degli Studi di Brescia, Brescia, Italy
- Enrico Scala
  - Dipartimento d’Ingegneria dell’Informazione, Università degli Studi di Brescia, Brescia, Italy
- Ivan Serina
  - Dipartimento d’Ingegneria dell’Informazione, Università degli Studi di Brescia, Brescia, Italy
- Mauro Vallati
  - Department of Informatics, University of Huddersfield, Huddersfield, UK
|
4
|
|
5
|
Khan S, Parkinson S. Discovering and utilising expert knowledge from security event logs. JOURNAL OF INFORMATION SECURITY AND APPLICATIONS 2019. [DOI: 10.1016/j.jisa.2019.102375] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/26/2022]
|
6
|
Affiliation(s)
- Mauro Vallati
  - Department of Informatics, University of Huddersfield, Huddersfield, UK
- Lukáš Chrpa
  - Department of Computer Science, Czech Technical University in Prague, Prague, Czech Republic
  - Department of Theoretical Computer Science and Mathematical Logic, Charles University in Prague, Prague, Czech Republic
- Ivan Serina
  - Dipartimento d’Ingegneria dell’Informazione, Università degli Studi di Brescia, Brescia, Italy
|
7
|
|
8
|
Chrpa L, Vallati M, McCluskey TL. Inner entanglements: Narrowing the search in classical planning by problem reformulation. Comput Intell 2019. [DOI: 10.1111/coin.12203] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
Affiliation(s)
- Lukáš Chrpa
  - Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
  - Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
- Mauro Vallati
  - School of Computing and Engineering, University of Huddersfield, Huddersfield, UK
|
9
|
Abstract
Generalized planning studies the representation, computation and evaluation of solutions that are valid for multiple planning instances. These topics have been studied since the early days of AI. In recent years, however, novel formalisms have appeared to compactly represent generalized planning tasks and the solutions to these tasks (called generalized plans), along with efficient algorithms to compute generalized plans. The paper reviews recent advances in generalized planning and relates them to existing planning formalisms, such as planning with domain control knowledge and approaches for planning under uncertainty, that also aim at generality.
|
10
|
Chrpa L, Vallati M, McCluskey TL. Outer entanglements: a general heuristic technique for improving the efficiency of planning algorithms. J EXP THEOR ARTIF IN 2018. [DOI: 10.1080/0952813x.2018.1509377] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/28/2022]
Affiliation(s)
- Lukáš Chrpa
  - Department of Computer Science, Czech Technical University in Prague, Prague, Czech Republic
  - Department of Theoretical Computer Science and Mathematical Logic, Charles University in Prague, Prague, Czech Republic
- Mauro Vallati
  - Department of Informatics, University of Huddersfield, Queensgate, UK
|
11
|
|
12
|
Fuentetaja R, de la Rosa T. Compiling irrelevant objects to counters. Special case of creation planning. AI COMMUN 2016. [DOI: 10.3233/aic-150692] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Affiliation(s)
- Raquel Fuentetaja
  - Departamento de Informática, Universidad Carlos III de Madrid, Av. Universidad 30, Leganés, Madrid, Spain
- Tomás de la Rosa
  - Departamento de Informática, Universidad Carlos III de Madrid, Av. Universidad 30, Leganés, Madrid, Spain
|
13
|
Affiliation(s)
- Erion Plaku
  - Department of Electrical Engineering and Computer Science, Catholic University of America, Washington, DC 22064, USA
- Duong Le
  - Department of Electrical Engineering and Computer Science, Catholic University of America, Washington, DC 22064, USA
|
14
|
Karapinar S, Sariel S. Cognitive robots learning failure contexts through real-world experimentation. Auton Robots 2015. [DOI: 10.1007/s10514-015-9471-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]
|
15
|
Chrpa L, McCluskey TL, Osborne H. On the completeness of replacing primitive actions with macro-actions and its generalization to planning operators and macro-operators. AI COMMUN 2015. [DOI: 10.3233/aic-150679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023]
Affiliation(s)
- Lukáš Chrpa
  - PARK Research Group, School of Computing and Engineering, University of Huddersfield, Huddersfield, UK
- Thomas Leo McCluskey
  - PARK Research Group, School of Computing and Engineering, University of Huddersfield, Huddersfield, UK
- Hugh Osborne
  - PARK Research Group, School of Computing and Engineering, University of Huddersfield, Huddersfield, UK
|
16
|
|
17
|
Hogg C, Muñoz‐Avila H, Kuter U. Learning Hierarchical Task Models from Input Traces. Comput Intell 2014. [DOI: 10.1111/coin.12044] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Affiliation(s)
- Chad Hogg
  - Department of Mathematics and Computer Science, King's College, Wilkes‐Barre, Pennsylvania
- Héctor Muñoz‐Avila
  - Department of Computer Science and Engineering, Lehigh University, Bethlehem, Pennsylvania
- Ugur Kuter
  - Smart Information Flow Technologies, Minneapolis, Minnesota
|
18
|
Zhuo HH, Muñoz-Avila H, Yang Q. Learning hierarchical task network domains from partially observed plan traces. ARTIF INTELL 2014. [DOI: 10.1016/j.artint.2014.04.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 11/25/2022]
|
19
|
Kishimoto A, Fukunaga A, Botea A. Evaluation of a simple, scalable, parallel best-first search strategy. ARTIF INTELL 2013. [DOI: 10.1016/j.artint.2012.10.007] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022]
|
20
|
Abstract
Recent discoveries in automated planning are broadening the scope of planners, from toy problems to real applications. However, applying automated planners to real-world problems is far from simple. On the one hand, the definition of accurate action models for planning is still a bottleneck. On the other hand, off-the-shelf planners fail to scale up and to provide good solutions in many domains. In these problematic domains, planners can exploit domain-specific control knowledge to improve their performance in terms of both speed and solution quality. However, defining control knowledge manually is quite difficult. This paper reviews recent machine learning techniques for the automatic definition of planning knowledge. It is organized according to the target of the learning process: automatic definition of planning action models and automatic definition of planning control knowledge. In addition, the paper reviews advances in the related field of reinforcement learning.
|
21
|
Levine G, Kuter U, Rebguns A, Green D, Spears D. Learning and verifying safety constraints for planners in a knowledge-impoverished system. Comput Intell 2012. [DOI: 10.1111/j.1467-8640.2012.00416.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022]
|
22
|
Fernández Arregui S, Jiménez Celorrio S, de la Rosa Turbides T. Improving Automated Planning with Machine Learning. Mach Learn 2012. [DOI: 10.4018/978-1-60960-818-7.ch510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
This chapter reports the latest machine learning techniques for assisting automated planning. Recent discoveries in automated planning have broadened the scope of planners, from toy problems to real-world applications, bringing new challenges into focus. The planning community believes that machine learning can help address these new challenges. The chapter collects the latest machine learning techniques for assisting automated planners, classified into techniques for improving the planning search process and techniques for the automatic definition of planning action models. For each technique, the chapter provides an in-depth analysis of its domain, advantages and disadvantages. Finally, the chapter outlines promising new avenues for research in learning for planning systems.
|
23
|
de la Rosa T, Fuentetaja R. On the importance of breaking ties in the relaxed plan heuristic. J EXP THEOR ARTIF IN 2011. [DOI: 10.1080/0952813x.2010.503341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Tomás de la Rosa
  - Departamento de Informática, Universidad Carlos III de Madrid, Av. Universidad 30, Leganés (Madrid), 28911, Spain
- Raquel Fuentetaja
  - Departamento de Informática, Universidad Carlos III de Madrid, Av. Universidad 30, Leganés (Madrid), 28911, Spain
|
24
|
Abstract
There are many approaches for solving planning problems. Many of these approaches are based on ‘brute force’ search methods and usually ignore the structure of plans previously computed in particular planning domains. By analyzing these structures, we can obtain useful knowledge that can help us find solutions to more complex planning problems. The method described in this paper gathers macro-operators by analyzing training plans. The analysis is based on investigating action dependencies in the training plans. Knowledge gained by our method can be passed directly to planning algorithms to improve their efficiency.
|
25
|
|