1. Fan M, Wu Y, Cao Z, Song W, Sartoretti G, Liu H, Wu G. Conditional Neural Heuristic for Multiobjective Vehicle Routing Problems. IEEE Transactions on Neural Networks and Learning Systems 2025; 36:4677-4689. PMID: 38517723. DOI: 10.1109/tnnls.2024.3371706.
Abstract
Existing neural heuristics for multiobjective vehicle routing problems (MOVRPs) are primarily conditioned on the instance context and fail to appropriately exploit preference and problem size, which holds back their performance. To thoroughly unleash this potential, we propose a novel conditional neural heuristic (CNH) that fully leverages the instance context, preference, and size with an encoder-decoder structured policy network. In particular, we design a dual-attention-based encoder to relate preferences and instance contexts, so as to better capture their joint effect on approximating the exact Pareto front (PF). We also design a size-aware decoder based on sinusoidal encoding to explicitly incorporate the problem size into the embedding, so that a single trained model can better solve instances of various scales. Besides, we customize the REINFORCE algorithm to train the neural heuristic by leveraging stochastic preferences (SPs), which further enhances training performance. Extensive experimental results on random and benchmark instances reveal that our CNH achieves a favorable approximation to the whole PF, with higher hypervolume (HV) and a lower optimality gap (Gap) than existing neural and conventional heuristics. More importantly, a single trained model of our CNH can outperform other neural heuristics that are exclusively trained on each size. The effectiveness of the key designs is also verified through ablation studies.
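The sinusoidal size encoding mentioned above can be illustrated with a minimal sketch in the style of Transformer positional encodings; the function name, dimension, and frequency schedule below are illustrative assumptions, not details taken from the paper:

```python
import math

def sinusoidal_size_embedding(size: int, dim: int = 8) -> list:
    """Map a scalar problem size to a dim-dimensional sinusoidal vector,
    analogous to Transformer positional encodings."""
    emb = []
    for i in range(dim // 2):
        freq = 1.0 / (10000 ** (2 * i / dim))  # geometric frequency schedule
        emb.append(math.sin(size * freq))
        emb.append(math.cos(size * freq))
    return emb

# Distinct problem sizes map to distinct, bounded embeddings.
v20, v100 = sinusoidal_size_embedding(20), sinusoidal_size_embedding(100)
```

Because each coordinate is bounded in [-1, 1], such an embedding can be concatenated with node features regardless of the instance scale, which is one plausible way a single model could condition on size.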
2. Zhang K, Ye BL, Xia X, Wang Z, Zhang X, Jiang H. A Space Telescope Scheduling Approach Combining Observation Priority Coding with Problem Decomposition Strategies. Biomimetics (Basel) 2024; 9:718. PMID: 39727722. DOI: 10.3390/biomimetics9120718.
Abstract
With the increasing amount of space debris, the demand for telescopes to observe it is also constantly growing. The telescope observation scheduling problem requires algorithms that schedule telescopes to maximize observation value within the visible-time constraints of space debris, especially when dealing with large-scale problems. This paper proposes a practical heuristic algorithm for the space debris telescope observation scheduling problem. To accelerate solving on large-scale problems, we exploit the structure of the problem and partition the large-scale problem into multiple sub-problems according to observation time. For each sub-problem, a coding method based on the priority with which targets enter the queue is proposed, informed by actual observation data, together with a matching decoding method. Each sub-problem is then solved with an adaptive variable neighborhood search to obtain its space debris observation plan. Once all sub-problems are solved, their observation plans are combined to form the observation plan of the original problem.
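The decomposition-by-observation-time idea can be sketched as follows; the task fields and the greedy priority ordering are simplified stand-ins for the paper's priority-based coding and adaptive variable neighborhood search:

```python
def partition_by_time(tasks, window):
    """Split tasks (each with a 'start' time) into sub-problems
    covering fixed observation-time windows."""
    buckets = {}
    for t in tasks:
        buckets.setdefault(t["start"] // window, []).append(t)
    return [buckets[k] for k in sorted(buckets)]

def greedy_schedule(sub):
    """Within a sub-problem, queue targets by descending priority
    (a simplified stand-in for the neighborhood search)."""
    return sorted(sub, key=lambda t: -t["priority"])

tasks = [
    {"id": 1, "start": 5,  "priority": 2},
    {"id": 2, "start": 12, "priority": 9},
    {"id": 3, "start": 3,  "priority": 7},
]
# Solve each time window independently, then concatenate the plans.
plan = [t["id"] for sub in partition_by_time(tasks, window=10)
                for t in greedy_schedule(sub)]
```

The point of the decomposition is that each window's sub-problem is small enough to search quickly, and concatenating per-window plans respects the time ordering of the original problem.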
Affiliation(s)
- Kaiyuan Zhang
- School of Information Science and Engineering, Jiaxing University, Jiaxing 314001, China
- School of Science, Jiangxi University of Science and Technology, Ganzhou 341000, China
- Bao-Lin Ye
- School of Information Science and Engineering, Jiaxing University, Jiaxing 314001, China
- Xiaoyun Xia
- School of Information Science and Engineering, Jiaxing University, Jiaxing 314001, China
- Zijia Wang
- School of Computer Science and Cyber Engineering, Guangzhou University, Guangzhou 510006, China
- Xianchao Zhang
- Institute of Information Network and Artificial Intelligence, Jiaxing University, Jiaxing 314001, China
- Hai Jiang
- National Astronomical Observatories, Chinese Academy of Sciences, Beijing 100012, China
3. Zhao S, Gu S. A deep reinforcement learning algorithm framework for solving multi-objective traveling salesman problem based on feature transformation. Neural Networks 2024; 176:106359. PMID: 38733797. DOI: 10.1016/j.neunet.2024.106359.
Abstract
As a special type of multi-objective combinatorial optimization problem (MOCOP), the multi-objective traveling salesman problem (MOTSP) plays an important role in practical fields such as transportation and robot control. However, due to the complexity of its solution space and the conflicts between different objectives, it is difficult to obtain satisfactory solutions in a short time. This paper proposes an end-to-end algorithm framework for solving the MOTSP based on deep reinforcement learning (DRL). Through a decomposition strategy, solving the MOTSP is transformed into solving multiple single-objective optimization subproblems. Through a linear transformation, the features of the MOTSP are combined with the weights of the objective function. A modified graph pointer network (GPN) model is then used to solve the decomposed subproblems. Compared with previous DRL models, the proposed algorithm can solve all the subproblems using only one model, without adding weight information as input features. Furthermore, our algorithm outputs a corresponding solution for each weight, which increases the diversity of solutions. To verify the performance of the proposed algorithm, it is compared with four classical evolutionary algorithms and two DRL algorithms on several MOTSP instances. The comparison shows that our proposed algorithm outperforms the compared algorithms in both training time and the quality of the resulting solutions.
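The decomposition step that turns a multi-objective problem into single-objective subproblems is commonly done by weighted-sum scalarization; a minimal sketch for the bi-objective case (the weight spacing and cost vectors below are illustrative, not the paper's):

```python
def decompose(num_subproblems: int):
    """Generate evenly spaced weight vectors for a bi-objective problem."""
    return [(i / (num_subproblems - 1), 1 - i / (num_subproblems - 1))
            for i in range(num_subproblems)]

def scalarize(costs, weights):
    """Weighted-sum scalarization: one weight vector defines one
    single-objective subproblem over the bi-objective cost vector."""
    return weights[0] * costs[0] + weights[1] * costs[1]

ws = decompose(5)
# For the first subproblem, pick the candidate tour with the lowest
# scalarized cost among some illustrative (obj1, obj2) pairs.
best = min([(3.0, 1.0), (1.0, 4.0), (2.0, 2.0)],
           key=lambda c: scalarize(c, ws[0]))
```

Solving each weighted subproblem yields one point of the approximate Pareto front, which is why one solution per weight increases solution diversity.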
Affiliation(s)
- Shijie Zhao
- School of Mechatronic Engineering and Automation, Shanghai University, 99 Shangda Road, Shanghai 200444, China
- Shenshen Gu
- School of Mechatronic Engineering and Automation, Shanghai University, 99 Shangda Road, Shanghai 200444, China
4. Ai C, Yang H, Liu X, Dong R, Ding Y, Guo F. MTMol-GPT: De novo multi-target molecular generation with transformer-based generative adversarial imitation learning. PLoS Computational Biology 2024; 20:e1012229. PMID: 38924082. PMCID: PMC11233020. DOI: 10.1371/journal.pcbi.1012229.
Abstract
De novo drug design is crucial to advancing drug discovery, aiming to generate new drugs with specific pharmacological properties. Recently, deep generative models have achieved inspiring progress in generating drug-like compounds. However, these models prioritize single-target drug generation for pharmacological intervention, neglecting the complicated inherent mechanisms of diseases, which are influenced by multiple factors. Consequently, developing novel multi-target drugs that simultaneously act on specific targets can enhance anti-tumor efficacy and address issues related to resistance mechanisms. To address this issue, and inspired by Generative Pre-trained Transformer (GPT) models, we propose an upgraded GPT model with generative adversarial imitation learning for multi-target molecular generation, called MTMol-GPT. The multi-target molecular generator employs a dual-discriminator model using the inverse reinforcement learning (IRL) method for concurrent multi-target molecular generation. Extensive results show that MTMol-GPT generates valid, novel, and effective multi-target molecules for various complex diseases, demonstrating robustness and generalization capability. In addition, molecular docking and pharmacophore mapping experiments demonstrate the drug-likeness and effectiveness of the generated molecules, which could potentially improve neuropsychiatric interventions. Furthermore, our model's generalizability is exemplified by a case study on multi-targeted drug design for breast cancer. As a broadly applicable solution for multiple targets, MTMol-GPT provides new insight into enhancing therapeutics for complex diseases by generating high-quality multi-target molecules in drug discovery.
Affiliation(s)
- Chengwei Ai
- School of Computer Science and Engineering, Central South University, Changsha, China
- Hongpeng Yang
- Department of Computer Science and Engineering, University of South Carolina, Columbia, South Carolina, United States of America
- Xiaoyi Liu
- School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
- Ministry of Education, Engineering Research Center for Pharmaceutics of Chinese Materia Medica and New Drug Development, Beijing, China
- Ruihan Dong
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Yijie Ding
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha, China
5. Zhang Z, Wu Z, Zhang H, Wang J. Meta-Learning-Based Deep Reinforcement Learning for Multiobjective Optimization Problems. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:7978-7991. PMID: 35171781. DOI: 10.1109/tnnls.2022.3148435.
Abstract
Deep reinforcement learning (DRL) has recently shown its success in tackling complex combinatorial optimization problems. When these problems are extended to multiobjective ones, it becomes difficult for the existing DRL approaches to flexibly and efficiently deal with multiple subproblems determined by the weight decomposition of objectives. This article proposes a concise meta-learning-based DRL approach. It first trains a meta-model by meta-learning. The meta-model is fine-tuned with a few update steps to derive submodels for the corresponding subproblems. The Pareto front is then built accordingly. Compared with other learning-based methods, our method can greatly shorten the training time of multiple submodels. Due to the rapid and excellent adaptability of the meta-model, more submodels can be derived so as to increase the quality and diversity of the found solutions. The computational experiments on multiobjective traveling salesman problems and multiobjective vehicle routing problems with time windows demonstrate the superiority of our method over most of the learning-based and iteration-based approaches.
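The "few update steps" adaptation described above can be sketched on a toy objective; the quadratic loss, learning rate, and scalar parameter below are illustrative assumptions rather than the paper's actual training setup:

```python
def fine_tune(meta_param, target, steps=3, lr=0.5):
    """Derive a submodel from the meta-model with a few gradient steps
    on the subproblem's loss (here a toy quadratic (p - target)^2)."""
    p = meta_param
    for _ in range(steps):
        grad = 2 * (p - target)  # d/dp of (p - target)^2
        p -= lr * grad
    return p

# One shared meta-parameter adapts cheaply to each subproblem,
# instead of training every submodel from scratch.
meta = 0.0
submodels = [fine_tune(meta, t) for t in (1.0, 2.0, 3.0)]
```

The design point is the same as in the abstract: because adaptation from the meta-model is cheap, many more submodels (and thus more Pareto points) can be derived for the same budget.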
6. Wu T, Wang H, Liu Y, Li T, Yu Y. Learning-based interfered fluid avoidance guidance for hypersonic reentry vehicles with multiple constraints. ISA Transactions 2023; 139:291-307. PMID: 37076373. DOI: 10.1016/j.isatra.2023.04.004.
Abstract
To address the problem of no-fly zone avoidance for hypersonic reentry vehicles in the gliding phase with multiple constraints, a learning-based avoidance guidance framework is proposed. First, the reference heading angle determination problem is solved efficiently by introducing a nature-inspired methodology based on the concept of the interfered fluid dynamic system (IFDS), in which the distance and relative position relationships of all no-fly zones can be comprehensively considered and additional rules are no longer needed. Then, by incorporating the predictor-corrector method, the heading angle corridor, and bank angle reversal logic, a fundamental interfered fluid avoidance guidance algorithm is proposed to steer the vehicle toward the target zone while avoiding no-fly zones. In addition, a learning-based online optimization mechanism is used to optimize the IFDS parameters in real time to improve the avoidance guidance performance of the proposed algorithm over the entire gliding phase. Finally, the adaptability and robustness of the proposed guidance algorithm are verified via comparative and Monte Carlo simulations.
Affiliation(s)
- Tiancai Wu
- School of Automation Science and Electrical Engineering, Beihang University, 100191, Beijing, China; Shenyuan Honors College of Beihang University, 100191, Beijing, China; The Science and Technology on Aircraft Control Laboratory, Beihang University, 100191, Beijing, China
- Honglun Wang
- School of Automation Science and Electrical Engineering, Beihang University, 100191, Beijing, China; The Science and Technology on Aircraft Control Laboratory, Beihang University, 100191, Beijing, China
- Yiheng Liu
- School of Automation Science and Electrical Engineering, Beihang University, 100191, Beijing, China; Shenyuan Honors College of Beihang University, 100191, Beijing, China; The Science and Technology on Aircraft Control Laboratory, Beihang University, 100191, Beijing, China
- Tianren Li
- R & D Center, China Academy of Launch Vehicle Technology, Beijing 100071, China
- Yue Yu
- Beijing Aerospace Automatic Control Institute, Beijing 100854, China
7. Jiang Y, Cao Z, Zhang J. Learning to Solve 3-D Bin Packing Problem via Deep Reinforcement Learning and Constraint Programming. IEEE Transactions on Cybernetics 2023; 53:2864-2875. PMID: 34748508. DOI: 10.1109/tcyb.2021.3121542.
Abstract
Recently, there has been growing attention on applying deep reinforcement learning (DRL) to solve the 3-D bin packing problem (3-D BPP). However, due to the relatively uninformative yet computationally heavy encoder, and the considerably large action space inherent to the 3-D BPP, existing DRL methods can only handle up to 50 boxes. In this article, we propose to alleviate this issue via a DRL agent that sequentially addresses three subtasks: sequence, orientation, and position. Specifically, we exploit a multimodal encoder, where a sparse attention subencoder embeds the box state to mitigate computation while learning the packing policy, and a convolutional neural network subencoder embeds the view state to produce an auxiliary spatial representation. We also leverage action representation learning in the decoder to cope with the large action space of the position subtask. Besides, we integrate the proposed DRL agent into constraint programming (CP) to further improve solution quality iteratively by exploiting the powerful search framework in CP. The experiments show that both the sole DRL and hybrid methods enable the agent to solve large-scale instances of 120 boxes or more, and that both deliver superior performance against the baselines on instances of various scales.
8. Shao Y, Lin JCW, Srivastava G, Guo D, Zhang H, Yi H, Jolfaei A. Multi-Objective Neural Evolutionary Algorithm for Combinatorial Optimization Problems. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:2133-2143. PMID: 34473629. DOI: 10.1109/tnnls.2021.3105937.
Abstract
There has been a recent surge of success in optimizing deep reinforcement learning (DRL) models with neural evolutionary algorithms. This type of method is inspired by biological evolution and uses different genetic operations to evolve neural networks. Previous neural evolutionary algorithms mainly focused on single-objective optimization problems (SOPs). In this article, we present an end-to-end multi-objective neural evolutionary algorithm based on decomposition and dominance (MONEADD) for combinatorial optimization problems. The proposed MONEADD utilizes genetic operations and reward signals to evolve neural networks for different combinatorial optimization problems without further engineering. To accelerate convergence, a set of nondominated neural networks is maintained in each generation, based on the notions of dominance and decomposition. At inference time, the trained model can be directly utilized to solve similar problems efficiently, whereas conventional heuristic methods must start from scratch for every given test problem. To further enhance model performance at inference time, three multi-objective search strategies are introduced in this work. Our experimental results clearly show that the proposed MONEADD has competitive and robust performance on bi-objective versions of the classic traveling salesman problem (TSP) and the knapsack problem with sizes up to 200. We also empirically show that MONEADD scales well when distributed across multiple graphics processing units (GPUs).
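The nondominated set maintained in each generation rests on the standard Pareto dominance test, which can be sketched as follows (minimization assumed; the sample points are illustrative):

```python
def dominates(a, b):
    """a dominates b (minimization): a is no worse in every objective
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated(front):
    """Keep only solutions not dominated by any other solution,
    i.e. the elite set retained per generation."""
    return [a for a in front if not any(dominates(b, a) for b in front)]

pts = [(1, 5), (2, 4), (3, 3), (2, 6), (4, 4)]
front = nondominated(pts)
```

Here (2, 6) is dropped because (2, 4) matches its first objective and beats its second, and (4, 4) is dropped because (3, 3) beats it in both.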
9. Guan Y, Ren Y, Sun Q, Li SE, Ma H, Duan J, Dai Y, Cheng B. Integrated Decision and Control: Toward Interpretable and Computationally Efficient Driving Intelligence. IEEE Transactions on Cybernetics 2023; 53:859-873. PMID: 35439160. DOI: 10.1109/tcyb.2022.3163816.
Abstract
Decision and control are core functionalities of high-level automated vehicles. Current mainstream methods, such as functional decomposition and end-to-end reinforcement learning (RL), suffer from high time complexity or poor interpretability and adaptability on real-world autonomous driving tasks. In this article, we present an interpretable and computationally efficient framework called integrated decision and control (IDC) for automated vehicles, which hierarchically decomposes the driving task into static path planning and dynamic optimal tracking. First, static path planning generates several candidate paths considering only static traffic elements. Then, dynamic optimal tracking tracks the optimal path while accounting for dynamic obstacles. To that end, we formulate a constrained optimal control problem (OCP) for each candidate path, optimize them separately, and follow the one with the best tracking performance. To unload the heavy online computation, we propose a model-based RL algorithm that can serve as an approximate constrained-OCP solver. Specifically, the OCPs for all paths are considered together to construct a single complete RL problem, which is then solved offline in the form of value and policy networks for real-time online path selecting and tracking, respectively. We verify our framework in both simulation and the real world. Results show that, compared with baseline methods, IDC has an order of magnitude higher online computing efficiency as well as better driving performance, including traffic efficiency and safety. In addition, it yields great interpretability and adaptability across different driving scenarios and tasks.
10. Bai C, Wang L, Wang Y, Wang Z, Zhao R, Bai C, Liu P. Addressing Hindsight Bias in Multigoal Reinforcement Learning. IEEE Transactions on Cybernetics 2023; 53:392-405. PMID: 34495860. DOI: 10.1109/tcyb.2021.3107202.
Abstract
Multigoal reinforcement learning (RL) extends typical RL with goal-conditional value functions and policies. One efficient multigoal RL algorithm is hindsight experience replay (HER). By treating a hindsight goal from failed experiences as the original goal, HER enables the agent to receive rewards frequently. However, a key assumption of HER is that the hindsight goals do not change the likelihood of the sampled transitions and trajectories used in training, which our analysis shows is not the case. More specifically, we show that using hindsight goals changes this likelihood and results in a biased learning objective for multigoal RL. We analyze the hindsight bias due to the use of hindsight goals and propose bias-corrected HER (BHER), an efficient algorithm that corrects the hindsight bias in training. We further show that BHER outperforms several state-of-the-art multigoal RL approaches on challenging robotics tasks.
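Hindsight relabeling itself, the operation whose effect on sampling likelihood the paper analyzes, can be sketched as follows; the field names are hypothetical, and BHER's actual bias-correction weights are not reproduced here:

```python
def her_relabel(trajectory):
    """Hindsight relabeling: treat the final achieved state as the goal,
    so a failed episode yields reward signal. Note the relabeled goal
    depends on the trajectory itself, which is exactly what changes the
    likelihood of the sampled transitions and introduces the bias."""
    hindsight_goal = trajectory[-1]["achieved"]
    relabeled = []
    for step in trajectory:
        reward = 1.0 if step["achieved"] == hindsight_goal else 0.0
        relabeled.append({**step, "goal": hindsight_goal, "reward": reward})
    return relabeled

# A failed episode (never reached goal "g") becomes a success for goal "s2".
traj = [{"achieved": "s1", "goal": "g", "reward": 0.0},
        {"achieved": "s2", "goal": "g", "reward": 0.0}]
out = her_relabel(traj)
```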
11. Li K, Zhang T, Wang R, Wang Y, Han Y, Wang L. Deep Reinforcement Learning for Combinatorial Optimization: Covering Salesman Problems. IEEE Transactions on Cybernetics 2022; 52:13142-13155. PMID: 34437087. DOI: 10.1109/tcyb.2021.3103811.
Abstract
This article introduces a new deep learning approach to approximately solve the covering salesman problem (CSP). In this approach, given the city locations of a CSP as input, a deep neural network model is designed to directly output the solution. It is trained using deep reinforcement learning, without supervision. Specifically, the model applies multihead attention (MHA) to capture the structural patterns of the problem and uses a dynamic embedding to handle its dynamic patterns. Once trained, the model generalizes to various types of CSP tasks (different sizes and topologies) without retraining. In controlled experiments, the proposed approach shows desirable time complexity: it runs more than 20 times faster than traditional heuristic solvers with a tiny optimality gap. Moreover, it significantly outperforms the current state-of-the-art deep learning approaches for combinatorial optimization in both training and inference. In comparison with traditional solvers, this approach is highly desirable for most challenging practical tasks, which are usually large scale and require quick decisions.
12. Xu Y, Fang M, Chen L, Xu G, Du Y, Zhang C. Reinforcement Learning With Multiple Relational Attention for Solving Vehicle Routing Problems. IEEE Transactions on Cybernetics 2022; 52:11107-11120. PMID: 34236983. DOI: 10.1109/tcyb.2021.3089179.
Abstract
In this article, we study reinforcement learning (RL) for vehicle routing problems (VRPs). Recent works have shown that attention-based RL models outperform recurrent neural network-based methods on these problems in terms of both effectiveness and efficiency. However, existing RL models simply aggregate node embeddings to generate the context embedding without taking the dynamic network structures into account, making them incapable of modeling state transition and action selection dynamics. In this work, we develop a new attention-based RL model that provides enhanced node embeddings via batch normalization reordering and gate aggregation, as well as a dynamics-aware context embedding through an attentive aggregation module over multiple relational structures. We conduct experiments on five types of VRPs: 1) the travelling salesman problem (TSP); 2) the capacitated VRP (CVRP); 3) the split delivery VRP (SDVRP); 4) the orienteering problem (OP); and 5) the prize collecting TSP (PCTSP). The results show that our model not only outperforms the learning-based baselines but also solves the problems much faster than the traditional baselines. In addition, our model shows improved generalizability when evaluated on large-scale problems and on problems with different data distributions.
13. Wei X, Wu C, Yu H, Liu S, Yuan Y. A coin selection strategy based on the greedy and genetic algorithm. Complex & Intelligent Systems 2022. DOI: 10.1007/s40747-022-00799-2.
Abstract
A coin selection method is the process of selecting a set of unspent transaction outputs (UTXOs) from a cryptocurrency wallet or account to use as inputs in each transaction. The coin selection method that UTXO-based cryptocurrencies most commonly employ is an algorithm that decides on a set of UTXOs that matches the target amount while limiting the transaction fee. However, this approach trades favorable maintenance overhead for the entire network against low transaction fees, as many low-value UTXOs known as "dust" are produced. Over time, this impacts the scalability and management of the cryptocurrency network as the global UTXO set grows larger. There is therefore an urgent need for a higher-performing coin selection method suitable for UTXO-based cryptocurrencies. This paper proposes a method based on the greedy and genetic algorithms for effectively choosing sets of UTXOs in Bitcoin. The main objective of this coin selection strategy is to get as close as possible to the target amount while maintaining, and possibly reducing, the number of UTXO inputs.
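The greedy stage of such a strategy can be sketched as follows; the genetic stage, which would further refine the selected set, is omitted, and the amounts and target are purely illustrative:

```python
def greedy_select(utxos, target):
    """Greedy phase: take the largest UTXOs until the target amount is
    covered. A genetic refinement stage could then search for sets that
    overshoot the target by less or use fewer inputs."""
    chosen, total = [], 0
    for u in sorted(utxos, reverse=True):
        if total >= target:
            break
        chosen.append(u)
        total += u
    return chosen, total

chosen, total = greedy_select([5, 1, 12, 3, 7], target=15)
```

Taking large UTXOs first keeps the input count low, which is exactly the trade-off the paper targets: fewer inputs per transaction and fewer dust outputs left in the global UTXO set.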
14. Yang Y, Wu Z, Yao X, Kang Y, Hou T, Hsieh CY, Liu H. Exploring Low-Toxicity Chemical Space with Deep Learning for Molecular Generation. Journal of Chemical Information and Modeling 2022; 62:3191-3199. PMID: 35713712. DOI: 10.1021/acs.jcim.2c00671.
Abstract
Creating a wide range of new compounds that not only have ideal pharmacological properties but also easily pass long-term toxicity evaluation is still a challenging task in current drug discovery. In this study, we developed a conditional generative model by combining a semisupervised variational autoencoder (SSVAE) with an MGA toxicity predictor, aiming to generate molecules with low toxicity, good drug-like properties, and structural diversity. For multiobjective optimization, we developed a method with hierarchical constraints on the toxicity space of small molecules to generate drug-like small molecules while minimizing the effect on the diversity of the generated results. The evaluation metrics indicate that the developed model has good effectiveness, novelty, and diversity. The molecules it generates are mainly distributed in low-toxicity regions, which suggests that our model can efficiently constrain the generation of toxic structures. In contrast to simply filtering out toxic molecules after generation, the low-toxicity molecular generative model can generate molecules with structural diversity. Our strategy can be used in target-based drug discovery to improve the quality of generated molecules, yielding low-toxicity, drug-like, and highly active candidates.
Affiliation(s)
- Yuwei Yang
- School of Pharmacy, Lanzhou University, Lanzhou 730000, China
- Zhenxing Wu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
- Xiaojun Yao
- College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, China
- Yu Kang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
- Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
- Chang-Yu Hsieh
- Tencent Quantum Laboratory, Tencent, Shenzhen 518000, China
- Huanxiang Liu
- School of Pharmacy, Lanzhou University, Lanzhou 730000, China; Faculty of Applied Science, Macao Polytechnic University, Macao SAR 999078, China
15. Li X, Zhang Z, Gao L, Wen L. A New Semi-Supervised Fault Diagnosis Method via Deep CORAL and Transfer Component Analysis. IEEE Transactions on Emerging Topics in Computational Intelligence 2022. DOI: 10.1109/tetci.2021.3115666.
Affiliation(s)
- Xinyu Li
- State Key Laboratory of Digital Manufacturing Equipment & Technology, Huazhong University of Science and Technology, Wuhan, China
- Zhao Zhang
- State Key Laboratory of Digital Manufacturing Equipment & Technology, Huazhong University of Science and Technology, Wuhan, China
- Liang Gao
- State Key Laboratory of Digital Manufacturing Equipment & Technology, Huazhong University of Science and Technology, Wuhan, China
- Long Wen
- School of Mechanical Engineering and Electronic Information, China University of Geosciences, Wuhan, China
16. A Fast and Robust Algorithm with Reinforcement Learning for Large UAV Cluster Mission Planning. Remote Sensing 2022. DOI: 10.3390/rs14061304.
Abstract
Large Unmanned Aerial Vehicle (UAV) clusters, containing hundreds of UAVs, have been widely used in the modern world, and mission planning is the core of large UAV cluster collaborative systems. In this paper, we propose a mission planning method that introduces the Simple Attention Model (SAM) into Dynamic Information Reinforcement Learning (DIRL), named DIRL-SAM. To reduce the computational complexity of the original attention model, we derive the SAM with a lightweight interactive model to rapidly extract high-dimensional features of the cluster information. In DIRL, dynamic training conditions are considered to simulate different mission environments. Meanwhile, data expansion in DIRL guarantees the convergence of the model in these dynamic environments, improving the robustness of the algorithm. Finally, simulation experiments show that the proposed method can adaptively provide feasible mission planning schemes with solution times on the order of seconds, and that it exhibits excellent generalization performance on large-scale cluster planning problems.
17. Li S, Luo T, Wang L, Xing L, Ren T. Tourism route optimization based on improved knowledge ant colony algorithm. Complex & Intelligent Systems 2022. DOI: 10.1007/s40747-021-00635-z.
Abstract
With the rapid development of the tourism economy, popular demand for tourism is also increasing. Unreasonable distribution of tourists gives rise to a series of problems, such as reduced tourist satisfaction and decreased income at tourist attractions. Starting from tourism route planning, a mathematical model is established that takes the maximization of the overall satisfaction of all tourist groups as the objective function, with constraints including the age and preferences of tourists and the upper limits of the tourist carrying capacity of the various routes. It aims to maximize income at tourist attractions while improving tourist satisfaction. Based on the tourist data of a travel agency, the statistical methods of hierarchical clustering and random sampling are used to process the acquired data and obtain the simulation examples in this article. For this model, a knowledge-based hybrid ant colony algorithm is designed. On this basis, a mechanism from the bacterial foraging algorithm is introduced, which improves the performance of the algorithm and avoids convergence to local optima. Two knowledge models are additionally incorporated to improve the solution quality of the algorithm. Typical simulations indicate that the improved ant colony algorithm can find the optimal solution more efficiently when solving the tourism route planning problem. The model also satisfies the economic interests of enterprises and achieves a favorable path optimization effect under different optional routes, further verifying the effectiveness of the model.
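The core ant colony transition rule, which scores candidate next stops by pheromone level and inverse distance, can be sketched as follows; this is a deterministic argmax variant for illustration, whereas real ant colony construction samples from these scores probabilistically, and the pheromone and distance values are invented:

```python
def next_stop(current, unvisited, pheromone, dist, alpha=1.0, beta=2.0):
    """Choose the next attraction by pheromone^alpha * (1/distance)^beta,
    the standard ant colony desirability score."""
    def score(j):
        return (pheromone[(current, j)] ** alpha) * ((1.0 / dist[(current, j)]) ** beta)
    return max(unvisited, key=score)

# With equal pheromone, the closer attraction wins.
pher = {(0, 1): 1.0, (0, 2): 1.0}
dist = {(0, 1): 2.0, (0, 2): 4.0}
nxt = next_stop(0, [1, 2], pher, dist)
```

In the paper's hybrid variant, the pheromone values would additionally be shaped by the knowledge models and the bacterial foraging mechanism to steer ants away from local optima.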
Collapse
|
18
|
Bi Y, Meixner CC, Bunyakitanon M, Vasilakos X, Nejabati R, Simeonidou D. Multi-Objective Deep Reinforcement Learning Assisted Service Function Chains Placement. IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT 2021. [DOI: 10.1109/tnsm.2021.3127685] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
|
19
|
Analysis of public opinion evolution of COVID-19 based on LDA-ARMA hybrid model. COMPLEX INTELL SYST 2021; 7:3165-3178. [PMID: 34777976 PMCID: PMC8416577 DOI: 10.1007/s40747-021-00514-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2021] [Accepted: 08/22/2021] [Indexed: 12/28/2022]
Abstract
The aim of this study was to explore a method for developing an emotional evolution classification model for large-scale online public opinion on events such as Coronavirus Disease 2019 (COVID-19), in order to guide government departments in adopting differentiated forms of emergency management and correctly guiding online public opinion for severely afflicted areas such as Wuhan and those afflicted elsewhere in China. We propose the LDA-ARMA deep neural network for the dynamic presentation and fine-grained categorization of public opinion events. It was applied to a huge quantity of online public opinion text in a complicated setting and integrated with the proposed sentiment measurement algorithm. First, Latent Dirichlet Allocation (LDA) was employed to extract information about the topics of comments. The autoregressive moving average (ARMA) model was then utilized to perform multidimensional sentiment analysis and evolution prediction on large-scale textual data related to COVID-19 published on Sina Weibo by netizens from Wuhan and from other regions of China. The results show that Wuhan netizens paid more attention to the development of the situation, treatment measures, and policies related to COVID-19 than to other issues, and were under greater emotional pressure, whereas netizens in the rest of the country paid more attention to overall COVID-19 prevention and control and were more positive and optimistic, given the assistance of the government and NGOs. The average error in predicting public opinion sentiment was less than 5.64%, demonstrating that this approach may be effectively applied to the analysis of large-scale online public sentiment evolution.
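The ARMA forecasting step described above can be illustrated in miniature. The sketch below fits only an AR(1) model (a simplification of ARMA, chosen to stay dependency-free; a real implementation would use a statistics library) to a hypothetical daily sentiment series and extrapolates a few steps ahead; all data values are assumed for illustration:

```python
def fit_ar1(y):
    """Least-squares fit of y_t = c + phi * y_{t-1} (AR(1) only,
    a simplified stand-in for full ARMA estimation)."""
    x, z = y[:-1], y[1:]
    mx = sum(x) / len(x)
    mz = sum(z) / len(z)
    cov = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    var = sum((a - mx) ** 2 for a in x)
    phi = cov / var
    c = mz - phi * mx
    return c, phi

def forecast(y, c, phi, steps=3):
    """Iterate the fitted recurrence forward from the last observation."""
    out, last = [], y[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out

# Hypothetical daily average sentiment scores in [-1, 1] (assumed data).
sentiment = [-0.4, -0.35, -0.3, -0.28, -0.2, -0.15, -0.1, -0.05]
c, phi = fit_ar1(sentiment)
pred = forecast(sentiment, c, phi, steps=3)
```

In the paper's pipeline, the input series would instead be the per-topic sentiment scores produced by the LDA and sentiment-measurement stages, with one fitted model per topic dimension.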
Collapse
|
20
|
Malektaji S, Ebrahimzadeh A, Elbiaze H, Glitho RH, Kianpisheh S. Deep Reinforcement Learning-Based Content Migration for Edge Content Delivery Networks With Vehicular Nodes. IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT 2021. [DOI: 10.1109/tnsm.2021.3086721] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|