Tan Y, Zhang Z, Li M, Pan F, Duan H, Huang Z, Deng H, Yu Z, Yang C, Shen G, Qi P, Yue C, Liu Y, Hong L, Yu H, Fan G, Tang Y. MedChatZH: A tuning LLM for traditional Chinese medicine consultations. Comput Biol Med 2024; 172:108290. [PMID: 38503097 DOI: 10.1016/j.compbiomed.2024.108290] [Received: 12/31/2023] [Revised: 02/18/2024] [Accepted: 03/12/2024] [Indexed: 03/21/2024]
Abstract
Generative large language models (LLMs) have achieved significant success in various natural language processing tasks, including question answering (QA) and dialogue systems. However, most models are trained on English data and generalize poorly when answering in Chinese. This limitation is especially evident in specialized domains such as traditional Chinese medical QA, where performance suffers from the absence of fine-tuning and high-quality datasets. To address this, we introduce MedChatZH, a dialogue model optimized for Chinese medical QA and based on a transformer decoder with the LLaMA architecture. The model first undergoes continued pre-training on a curated corpus of Chinese medical books and is then fine-tuned on a carefully selected medical instruction dataset. As a result, MedChatZH outperforms several Chinese dialogue baselines on a real-world medical dialogue dataset. Our model, code, and dataset are publicly available on GitHub (https://github.com/tyang816/MedChatZH) to encourage further research in traditional Chinese medicine and LLMs.
Affiliation(s)
- Yang Tan
- Department of Computer Science and Technology, East China University of Science and Technology, Shanghai, 200237, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200240, China; Chongqing Artificial Intelligence Research Institute of Shanghai Jiao Tong University, 200240, China
- Zhixing Zhang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Mingchen Li
- Department of Computer Science and Technology, East China University of Science and Technology, Shanghai, 200237, China; Shanghai Artificial Intelligence Laboratory, Shanghai, 200240, China; Chongqing Artificial Intelligence Research Institute of Shanghai Jiao Tong University, 200240, China
- Fei Pan
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Hao Duan
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Zijie Huang
- Department of Computer Science and Technology, East China University of Science and Technology, Shanghai, 200237, China
- Hua Deng
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Zhuohang Yu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Chen Yang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Guoyang Shen
- Chongqing Artificial Intelligence Research Institute of Shanghai Jiao Tong University, 200240, China
- Peng Qi
- Chongqing Artificial Intelligence Research Institute of Shanghai Jiao Tong University, 200240, China
- Chengyuan Yue
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China
- Yuxian Liu
- The University of Sydney, Sydney, 2050, Australia
- Liang Hong
- Shanghai Artificial Intelligence Laboratory, Shanghai, 200240, China; Chongqing Artificial Intelligence Research Institute of Shanghai Jiao Tong University, 200240, China; School of Physics and Astronomy & School of Pharmacy, Shanghai Jiao Tong University, Shanghai, 200240, China
- Huiqun Yu
- Department of Computer Science and Technology, East China University of Science and Technology, Shanghai, 200237, China
- Guisheng Fan
- Department of Computer Science and Technology, East China University of Science and Technology, Shanghai, 200237, China
- Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, 200237, China.