1. Rezapour M, Yazdinejad M, Rajabi Kouchi F, Habibi Baghi M, Khorrami Z, Khavanin Zadeh M, Pourbaghi E, Rezapour H. Text mining of hypertension researches in the west Asia region: a 12-year trend analysis. Ren Fail 2024; 46:2337285. [PMID: 38616180 PMCID: PMC11018045 DOI: 10.1080/0886022x.2024.2337285] [Received: 01/08/2024] [Accepted: 03/27/2024] [Indexed: 04/16/2024] Open
Abstract
More than half of the world's population lives in Asia, and hypertension (HTN) is the most prevalent risk factor found in Asia. Numerous articles have been published about HTN in the Eastern Mediterranean Region (EMRO), and artificial intelligence (AI) methods can analyze these articles and extract the top trends in each country. The present analysis uses latent Dirichlet allocation (LDA), a topic modeling (TM) algorithm in text mining, to obtain the subjective topic-word distribution from 2790 studies across the EMRO. The studies examined cover the last 12 years, and the results of the LDA analyses show that HTN research published in the EMRO discusses changes in BP and the factors affecting it. Among the countries in the region, most of these articles are related to I.R. Iran and Egypt, which show an increasing trend from 2017 to 2018 and reached the highest level in 2021. Meanwhile, Iraq and Lebanon have been conducting research since 2010. The EMRO word cloud illustrates 'BMI', 'mortality', 'age', and 'meal', which represent important indicators, dangerous outcomes of high BP, and gender of HTN patients in EMRO, respectively.
Affiliation(s)
- Mohammad Rezapour: Faculty Member of the Iranian Ministry of Science, Research and Technology, Tehran, Iran
- Faezeh Rajabi Kouchi: Department of Computer Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran
- Zahra Khorrami: Ophthalmic Epidemiology Research Center, Research Institute for Ophthalmology and Vision Science, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Morteza Khavanin Zadeh: Hasheminejad Kidney Center, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Elmira Pourbaghi: Faculty of Advanced Sciences and Technology, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
- Hassan Rezapour: Department of Transportation and Urban Infrastructure Studies, Morgan State University, Baltimore, MD, USA
2. Labbo MS, Qu L, Xu C, Bai W, Ayele Atumo E, Jiang X. Understanding risky driving behaviors among young novice drivers in Nigeria: A latent class analysis coupled with association rule mining approach. Accid Anal Prev 2024; 200:107557. [PMID: 38537532 DOI: 10.1016/j.aap.2024.107557] [Received: 10/23/2023] [Revised: 02/22/2024] [Accepted: 03/21/2024] [Indexed: 04/14/2024]
Abstract
Traffic crashes are a significant public health concern in Nigeria, particularly among young drivers. The study aims to explore the underlying patterns of risky driving behaviors and their associations with demographic factors among young drivers in Nigeria. A combined approach of Latent Class Analysis (LCA) and Association Rule Mining is applied to a dataset comprising responses from 684 young drivers who completed the "Behavior of Young Novice Drivers Scale" (BYND) questionnaire. The LCA identifies four distinct classes of drivers based on their risky behavior profiles: Reckless-Speedsters, Cautious Drivers, Distracted Multitaskers, and Emotion-Impacted Drivers. Association rule mining further connects these driver classes to demographic and driving history variables, uncovering intriguing insights. Reckless-Speedsters predominantly consist of young males who engage in riskier driving behaviors, including exceeding speed limits and disregarding traffic rules. Conversely, Cautious Drivers, also predominantly young males, exhibit a safer driving profile marked by rule adherence and a notably lower crash rate. Distracted Multitaskers, sharing a demographic profile with Cautious Drivers, diverge significantly due to their higher crash involvement, hinting at a propensity for distracted driving practices. Lastly, Emotion-Impacted Drivers, primarily comprising young employed males, display behaviors influenced by emotions, shorter driving distances, and prior unsupervised driving experience. Most of the behaviors are attributed to inadequate traffic control, the absence of traffic signs on most roads, preferential treatment, and a lack of strict law enforcement in the country. The findings hold substantial implications for road safety interventions in Nigeria, urging targeted approaches to address the unique challenges presented by each driver class. By acknowledging the study limitations and advocating for future research on objective measures and emotion-behavior interactions, the comprehensive approach provides a robust foundation for enhancing road safety in the Nigerian context.
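The abstract above does not say how its association rules were scored. As a rough illustration only, the sketch below computes the three metrics most commonly used in association rule mining (support, confidence, and lift) for a single rule. The function name `rule_metrics` and the toy records are hypothetical and are not taken from the study.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    `transactions` is a list of sets of items. This is a plain counting
    sketch, not the procedure used in the cited study.
    """
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if a <= t)         # records containing the antecedent
    n_c = sum(1 for t in transactions if c <= t)         # records containing the consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # records containing both

    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical toy records (illustrative only, not study data).
records = [
    {"male", "speeding", "crash"},
    {"male", "speeding"},
    {"female", "cautious"},
    {"male", "cautious"},
]

support, confidence, lift = rule_metrics(records, {"speeding"}, {"male"})
print(support, confidence, lift)
```

A lift above 1 suggests the antecedent and consequent co-occur more often than they would if independent. A full rule-mining run would also enumerate candidate itemsets (for example with Apriori), which this sketch leaves out.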
Affiliation(s)
- Muwaffaq Safiyanu Labbo: School of Transportation and Logistics, Southwest Jiaotong University, West Park, High-Tech District, Chengdu 611756, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, West Park, High-Tech District, Chengdu 611756, China; Department of Civil Engineering, Aliko Dangote University of Science and Technology, Wudil, Kano, Nigeria
- Lin Qu: School of Transportation and Logistics, Southwest Jiaotong University, West Park, High-Tech District, Chengdu 611756, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, West Park, High-Tech District, Chengdu 611756, China
- Chuan Xu: School of Transportation and Logistics, Southwest Jiaotong University, West Park, High-Tech District, Chengdu 611756, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, West Park, High-Tech District, Chengdu 611756, China
- Wei Bai: Department of Road Traffic Management, Sichuan Police College, Luzhou, Sichuan, China
- Xinguo Jiang: School of Transportation and Logistics, Southwest Jiaotong University, West Park, High-Tech District, Chengdu 611756, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, West Park, High-Tech District, Chengdu 611756, China; School of Transportation, Fujian University of Technology, Fuzhou, China
3. Ju W, Fang Z, Gu Y, Liu Z, Long Q, Qiao Z, Qin Y, Shen J, Sun F, Xiao Z, Yang J, Yuan J, Zhao Y, Wang Y, Luo X, Zhang M. A Comprehensive Survey on Deep Graph Representation Learning. Neural Netw 2024; 173:106207. [PMID: 38442651 DOI: 10.1016/j.neunet.2024.106207] [Received: 08/28/2023] [Revised: 01/23/2024] [Accepted: 02/21/2024] [Indexed: 03/07/2024]
Abstract
Graph representation learning aims to effectively encode high-dimensional sparse graph-structured data into low-dimensional dense vectors, which is a fundamental task that has been widely studied in a range of fields, including machine learning and data mining. Classic graph embedding methods follow the basic idea that the embedding vectors of interconnected nodes in the graph can still maintain a relatively close distance, thereby preserving the structural information between the nodes in the graph. However, this is sub-optimal because: (i) traditional methods have limited model capacity, which limits the learning performance; (ii) existing techniques typically rely on unsupervised learning strategies and fail to couple with the latest learning paradigms; (iii) representation learning and downstream tasks depend on each other and should be jointly enhanced. With the remarkable success of deep learning, deep graph representation learning has shown great potential and advantages over shallow (traditional) methods, and a large number of deep graph representation learning techniques, especially graph neural networks, have been proposed in the past decade. In this work, we conduct a comprehensive survey of current deep graph representation learning algorithms by proposing a new taxonomy of existing state-of-the-art literature. Specifically, we systematically summarize the essential components of graph representation learning and categorize existing approaches by graph neural network architecture and by the most recent advanced learning paradigms. Moreover, this survey also presents practical and promising applications of deep graph representation learning. Last but not least, we state new perspectives and suggest challenging directions that deserve further investigation in the future.
Affiliation(s)
- Wei Ju: School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Zheng Fang: School of Intelligence Science and Technology, Peking University, Beijing, 100871, China
- Yiyang Gu: School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Zequn Liu: School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Qingqing Long: Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100086, China
- Ziyue Qiao: Artificial Intelligence Thrust, The Hong Kong University of Science and Technology, Guangzhou, 511453, China
- Yifang Qin: School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Jianhao Shen: School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Fang Sun: Department of Computer Science, University of California, Los Angeles, 90095, USA
- Zhiping Xiao: Department of Computer Science, University of California, Los Angeles, 90095, USA
- Junwei Yang: School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Jingyang Yuan: School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Yusheng Zhao: School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
- Yifan Wang: School of Information Technology & Management, University of International Business and Economics, Beijing, 100029, China
- Xiao Luo: Department of Computer Science, University of California, Los Angeles, 90095, USA
- Ming Zhang: School of Computer Science, National Key Laboratory for Multimedia Information Processing, Peking University, Beijing, 100871, China
4. Tamakloe R, Zhang K, Hossain A, Kim I, Park SH. Critical risk factors associated with fatal/severe crash outcomes in personal mobility device rider at-fault crashes: A two-step inter-cluster rule mining technique. Accid Anal Prev 2024; 199:107527. [PMID: 38428242 DOI: 10.1016/j.aap.2024.107527] [Received: 11/13/2023] [Revised: 01/28/2024] [Accepted: 02/25/2024] [Indexed: 03/03/2024]
Abstract
Personal Mobility Devices (PMDs) have witnessed an extraordinary surge in popularity, emerging as a favored mode of urban transportation. This has sparked significant safety concerns, paralleled by a stark increase in PMD-involved crashes. Research indicates that PMD user behavior, especially in urban areas, is crucial in these crashes, underscoring the need for an extensive investigation into key factors, particularly those causing fatal/severe outcomes. Remarkably, there exists a noticeable gap in the research concerning the analysis of determinants behind fatal/severe PMD crashes, specifically in PMD rider-at-fault collisions. This study addresses this gap by identifying uniform groups of PMD rider-at-fault crashes and investigating cluster-specific key factor associations and determinants of fatal/severe crash outcomes using Seoul's PMD rider-at-fault crash data from 2017 to 2021. A comprehensive two-step framework, integrating Cluster Correspondence Analysis (CCA) and Association Rules Mining (ARM) techniques is employed to segment PMD rider-at-fault crash data into homogeneous groups, revealing unique risk factor patterns within each cluster and further exploring the combination of factors associated with fatal/severe PMD rider-at-fault crash outcomes. CCA revealed three distinct groups: PMD-vehicle, PMD-pedestrian, and single-PMD crashes. From the ARM, it was found that fatal/severe crashes were linked to dry road conditions, male PMD users, and weekdays, irrespective of the cluster. Whereas speeding violations and side collisions were associated with fatal/severe PMD-vehicle rider-at-fault crashes, traffic control violations were related to fatal/severe PMD-pedestrian rider-at-fault crashes at pedestrian crossings. Unsafe riding practices predominantly caused single-PMD crashes during daytime hours. From the findings, engineering improvements, awareness campaigns, education, and law enforcement actions are recommended. 
The new insights gleaned from this research provide a foundation for informed decision-making and the implementation of policies designed to enhance PMD safety.
Affiliation(s)
- Reuben Tamakloe: The Cho Chun Shik Graduate School of Mobility, Korea Advanced Institute of Science and Technology, 193 Munji-ro, Yuseong-gu, Daejeon, 34051, South Korea
- Kaihan Zhang: The Cho Chun Shik Graduate School of Mobility, Korea Advanced Institute of Science and Technology, 193 Munji-ro, Yuseong-gu, Daejeon, 34051, South Korea
- Ahmed Hossain: Department of Civil Engineering, University of Louisiana at Lafayette, Lafayette, LA, 70503, United States
- Inhi Kim: The Cho Chun Shik Graduate School of Mobility, Korea Advanced Institute of Science and Technology, 193 Munji-ro, Yuseong-gu, Daejeon, 34051, South Korea
- Shin Hyoung Park: Department of Transportation Engineering, University of Seoul, 163 Seoulsiripdae-ro Dongdaemun-gu, Seoul 02504, South Korea
5. Tang XE, Lu T, Zhou YC, Zhan MJ, Chen W, Peng Z, Liu JH, Gui YF, Deng ZH, Fan F. Adult age estimation from the sternum using maximum intensity projection images of CT and data mining in a Chinese population. Int J Legal Med 2024; 138:961-970. [PMID: 38240839 DOI: 10.1007/s00414-024-03161-y] [Received: 09/05/2023] [Accepted: 01/08/2024] [Indexed: 04/11/2024]
Abstract
This study aimed to explore and develop data mining models for adult age estimation based on CT reconstruction images from the sternum. Maximum intensity projection (MIP) images of chest CT were retrospectively collected from a modern Chinese population, and data from 2700 patients (1349 males and 1351 females) aged 20 to 70 years were obtained. A staging technique within four indicators was applied. Several data mining models were established, and mean absolute error (MAE) was the primary comparison parameter. The intraobserver and interobserver agreement levels were good. Within internal validation, the optimal data mining model obtained the lowest MAE of 9.08 in males and 10.41 in females. For the external validation (N = 200), MAEs were 7.09 in males and 7.15 in females. In conclusion, the accuracy of our model for adult age estimation was among similar studies. MIP images of the sternum could be a potential age indicator. However, it should be combined with other indicators since the accuracy level is still unsatisfactory.
Affiliation(s)
- Xian-E Tang: West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, 610041, People's Republic of China
- Ting Lu: West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, 610041, People's Republic of China
- Yu-Chi Zhou: West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, 610041, People's Republic of China
- Meng-Jun Zhan: West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, 610041, People's Republic of China
- Wang Chen: West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, 610041, People's Republic of China
- Zhao Peng: Department of Radiology, West China Hospital, Sichuan University, Chengdu, 610041, People's Republic of China
- Jun-Hong Liu: West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, 610041, People's Republic of China
- Yu-Fan Gui: West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, 610041, People's Republic of China
- Zhen-Hua Deng: West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, 610041, People's Republic of China
- Fei Fan: West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, 610041, People's Republic of China
6. Clarke DJB, Marino GB, Deng EZ, Xie Z, Evangelista JE, Ma'ayan A. Rummagene: massive mining of gene sets from supporting materials of biomedical research publications. Commun Biol 2024; 7:482. [PMID: 38643247 PMCID: PMC11032387 DOI: 10.1038/s42003-024-06177-7] [Received: 10/11/2023] [Accepted: 04/10/2024] [Indexed: 04/22/2024] Open
Abstract
Many biomedical research publications contain gene sets in their supporting tables, and these sets are currently not available for search and reuse. By crawling PubMed Central, the Rummagene server provides access to hundreds of thousands of such mammalian gene sets. So far, we scanned 5,448,589 articles to find 121,237 articles that contain 642,389 gene sets. These sets are served for enrichment analysis, free text, and table title search. Investigating statistical patterns within the Rummagene database, we demonstrate that Rummagene can be used for transcription factor and kinase enrichment analyses, and for gene function predictions. By combining gene set similarity with abstract similarity, Rummagene can find surprising relationships between biological processes, concepts, and named entities. Overall, Rummagene brings to surface the ability to search a massive collection of published biomedical datasets that are currently buried and inaccessible. The Rummagene web application is available at https://rummagene.com .
Affiliation(s)
- Daniel J B Clarke: Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Giacomo B Marino: Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Eden Z Deng: Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Zhuorui Xie: Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- John Erol Evangelista: Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Avi Ma'ayan: Department of Pharmacological Sciences, Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
7. Masoumi S, Amirkhani H, Sadeghian N, Shahraz S. Natural language processing (NLP) to facilitate abstract review in medical research: the application of BioBERT to exploring the 20-year use of NLP in medical research. Syst Rev 2024; 13:107. [PMID: 38622611 PMCID: PMC11020656 DOI: 10.1186/s13643-024-02470-y] [Received: 07/23/2022] [Accepted: 01/28/2024] [Indexed: 04/17/2024] Open
Abstract
BACKGROUND Abstract review is a time- and labor-consuming step in systematic and scoping literature reviews in medicine. Text mining methods, typically natural language processing (NLP), may efficiently replace manual abstract screening. This study applies NLP to a deliberately selected literature review problem, the trend of using NLP in medical research, to demonstrate the performance of this automated abstract review model. METHODS Scanning the PubMed, Embase, PsycINFO, and CINAHL databases, we identified 22,294 abstracts, with a final selection of 12,817 English abstracts published between 2000 and 2021. We devised a manual classification of medical fields with three variables, i.e., the context of use (COU), text source (TS), and primary research field (PRF). A training dataset was developed after reviewing 485 abstracts. We used a language model called Bidirectional Encoder Representations from Transformers to classify the abstracts. To evaluate the performance of the trained models, we report micro F1-score and accuracy. RESULTS The trained models' micro F1-scores for classifying abstracts into the three variables were 77.35% for COU, 76.24% for TS, and 85.64% for PRF. The average annual growth rate (AAGR) of the publications was 20.99% between 2000 and 2020 (72.01 articles (95% CI: 56.80-78.30) yearly increase), with 81.76% of the abstracts published between 2010 and 2020. Studies on neoplasms constituted 27.66% of the entire corpus with an AAGR of 42.41%, followed by studies on mental conditions (AAGR = 39.28%). While electronic health or medical records comprised the highest proportion of text sources (57.12%), omics databases had the highest growth among all text sources with an AAGR of 65.08%. The most common NLP application was clinical decision support (25.45%). CONCLUSIONS BioBERT showed an acceptable performance in the abstract review. If future research shows the high performance of this language model, it can reliably replace manual abstract reviews.
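The abstract reports average annual growth rates (AAGR) without giving the formula. The sketch below assumes the common definition, the arithmetic mean of year-over-year growth rates, which may differ from the authors' exact calculation. The function name and the sample counts are hypothetical.

```python
def average_annual_growth_rate(counts):
    """Arithmetic mean of year-over-year growth rates.

    `counts` holds one value per consecutive year, oldest first.
    Assumed definition; the cited study may have computed AAGR differently.
    """
    if len(counts) < 2:
        raise ValueError("need at least two yearly counts")
    rates = []
    for prev, curr in zip(counts, counts[1:]):
        if prev == 0:
            raise ValueError("growth rate is undefined when a prior-year count is zero")
        rates.append((curr - prev) / prev)
    return sum(rates) / len(rates)


# Hypothetical yearly publication counts (not study data).
yearly_counts = [100, 120, 150]
aagr = average_annual_growth_rate(yearly_counts)
print(f"AAGR: {aagr:.2%}")
```

Because AAGR averages the yearly rates without compounding, it can differ noticeably from a compound annual growth rate computed over the same period.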
Affiliation(s)
- Safoora Masoumi: Pediatric Infectious Diseases Research Center, Mazandaran University of Medical Sciences, Sari, Iran
- Hossein Amirkhani: Computer and Information Technology Department, University of Qom, Qom, Iran
- Najmeh Sadeghian: Student Research Committee, Mazandaran University of Medical Sciences, Sari, Iran
- Saeid Shahraz: Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
8. Lei B, Mahajan A, Mallick B. Identifying and overcoming COVID-19 vaccination impediments using Bayesian data mining techniques. Sci Rep 2024; 14:8595. [PMID: 38615084 PMCID: PMC11016065 DOI: 10.1038/s41598-024-58902-1] [Received: 11/23/2023] [Accepted: 04/04/2024] [Indexed: 04/15/2024] Open
Abstract
The COVID-19 pandemic has profoundly reshaped human life. The development of COVID-19 vaccines has offered a semblance of normalcy. However, obstacles to vaccination have led to substantial loss of life and economic burdens. In this study, we analyze data from a prominent health insurance provider in the United States to uncover the underlying reasons behind the inability, refusal, or hesitancy to receive vaccinations. Our research proposes a methodology for pinpointing affected population groups and suggests strategies to mitigate vaccination barriers and hesitations. Furthermore, we estimate potential cost savings resulting from the implementation of these strategies. To achieve our objectives, we employed Bayesian data mining methods to streamline data dimensions and identify significant variables (features) influencing vaccination decisions. Comparative analysis reveals that the Bayesian method outperforms cutting-edge alternatives, demonstrating superior performance.
Affiliation(s)
- Bowen Lei: Department of Statistics, Texas A&M University, College Station, TX, USA
- Arvind Mahajan: Department of Finance, Texas A&M University, College Station, TX, USA
- Bani Mallick: Department of Statistics, Texas A&M University, College Station, TX, USA
9. Li D, Deng Y, Liu L, Wang J, Huang Z, Zhang X. Analysis of heavy metal and polycyclic aromatic hydrocarbon pollution characteristics of a typical metal rolling industrial site based on data mining. Environ Geochem Health 2024; 46:146. [PMID: 38578375 DOI: 10.1007/s10653-024-01928-1] [Received: 08/09/2023] [Accepted: 02/20/2024] [Indexed: 04/06/2024]
Abstract
With the transformation and upgrading of industries, the environmental problems caused by residual contaminated industrial sites are becoming increasingly prominent. Based on actual investigation cases, this study analyzed the soil pollution status of a remaining site of the copper and zinc rolling industry and found that the pollutants exceeding the screening values included Cu, Ni, Zn, Pb, total petroleum hydrocarbons (TPH), and 6 polycyclic aromatic hydrocarbon (PAH) monomers. Based on traditional analysis methods such as the correlation coefficient and spatial distribution, combined with machine learning methods such as SOM + K-means, it is inferred that the heavy metals Zn and Pb may be mainly related to the production history of zinc rolling, while Cu and Ni may have mainly originated from the production history of copper rolling. PAHs are mainly due to the incomplete combustion of fossil fuels in the melting equipment. TPH pollution is speculated to be related to oil leakage during the period of industrial use and the later period of vehicle parking. The results showed that traditional analysis methods can quickly identify the correlation between site pollutants, while SOM + K-means machine learning methods can further extract complex hidden relationships in the data and achieve in-depth mining of site monitoring data.
Affiliation(s)
- De'an Li: Guangdong Laboratory of Soil Pollution Fate and Risk Management in Earth's Critical Zone and Guangdong Key Laboratory of Contaminated Environmental Management and Remediation, Guangdong Provincial Academy of Environmental Science, Guangzhou, 510045, China
- Yirong Deng: Guangdong Laboratory of Soil Pollution Fate and Risk Management in Earth's Critical Zone and Guangdong Key Laboratory of Contaminated Environmental Management and Remediation, Guangdong Provincial Academy of Environmental Science, Guangzhou, 510045, China
- LiLi Liu: Guangdong Laboratory of Soil Pollution Fate and Risk Management in Earth's Critical Zone and Guangdong Key Laboratory of Contaminated Environmental Management and Remediation, Guangdong Provincial Academy of Environmental Science, Guangzhou, 510045, China
- Jun Wang: Guangdong Laboratory of Soil Pollution Fate and Risk Management in Earth's Critical Zone and Guangdong Key Laboratory of Contaminated Environmental Management and Remediation, Guangdong Provincial Academy of Environmental Science, Guangzhou, 510045, China
- Zaoquan Huang: Guangdong Laboratory of Soil Pollution Fate and Risk Management in Earth's Critical Zone and Guangdong Key Laboratory of Contaminated Environmental Management and Remediation, Guangdong Provincial Academy of Environmental Science, Guangzhou, 510045, China
- Xiaolu Zhang: Guangdong Laboratory of Soil Pollution Fate and Risk Management in Earth's Critical Zone and Guangdong Key Laboratory of Contaminated Environmental Management and Remediation, Guangdong Provincial Academy of Environmental Science, Guangzhou, 510045, China
10. Nishimura Y, Matsumoto S, Sasaki T, Kubo T. Impacts of workplace verbal aggression classified via text mining on workers' mental health. Occup Med (Lond) 2024; 74:186-192. [PMID: 38346110 PMCID: PMC10990467 DOI: 10.1093/occmed/kqae009] [Indexed: 04/05/2024] Open
Abstract
BACKGROUND Exposure to workplace aggression adversely affects workers' health; however, little is known regarding the impact of specific types of verbal content. AIMS We aimed to examine the relationship between exposure to several types of aggressive words at work and the victim's depressive symptoms and sleep disturbance using text mining. METHODS We conducted a longitudinal survey with 800 workers in wholesale and retail companies; of which, 500 responded to the follow-up survey. The Centre for Epidemiologic Studies-Depression Scale and Pittsburgh Sleep Quality Index were filled out by the participants, and their responses were analysed by logistic regression to evaluate the risk of depression or sleep problems. We collected exact aggressive words encountered at work over the past year as a dependent variable and classified it into four types using text mining, such as words criticizing one's performance. RESULTS The follow-up rate was 63%. Exposure to words threatening one's life showed a significant relationship with the risk of depression (odds ratio [OR] = 13.94, 95% confidence interval [CI] = 1.76-110.56). The exposure to words criticizing one's job performance is significantly related to the risk of sleep disturbance (OR = 5.56, 95% CI = 2.08-14.88). CONCLUSIONS These findings suggest that different contents of verbal aggression can have different impacts on workers' health. This indicates that not only overtly threatening and abusive language but also words related to one's performance can be a risk factor for workers, depending on how they are delivered. To mitigate the adverse effects, promoting effective communication and cultivating psychological detachment from work may be beneficial.
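The odds ratios in this abstract come from logistic regression, which is not reproduced here. As a simpler illustration of how an odds ratio and its 95% confidence interval relate, the sketch below uses the unadjusted 2x2-table estimate with a Wald interval on the log scale. The function name and the cell counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All four cells must be positive. z=1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (not study data).
or_value, ci_low, ci_high = odds_ratio_ci(a=10, b=20, c=5, d=40)
print(f"OR = {or_value:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

Small cell counts inflate the standard error, which is one common reason for very wide intervals such as the 1.76-110.56 range reported above.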
Collapse
Affiliation(s)
- Y Nishimura
- Occupational Stress and Health Management Research Group, National Institute of Occupational Safety and Health, Kawasaki, Japan
| | - S Matsumoto
- Occupational Stress and Health Management Research Group, National Institute of Occupational Safety and Health, Kawasaki, Japan
| | - T Sasaki
- Occupational Stress and Health Management Research Group, National Institute of Occupational Safety and Health, Kawasaki, Japan
| | - T Kubo
- Occupational Stress and Health Management Research Group, National Institute of Occupational Safety and Health, Kawasaki, Japan
|
11
|
Zhang JM, Wang Y, Mouton M, Zhang J, Shi M. Public Discourse, User Reactions, and Conspiracy Theories on the X Platform About HIV Vaccines: Data Mining and Content Analysis. J Med Internet Res 2024; 26:e53375. [PMID: 38568723 PMCID: PMC11024739 DOI: 10.2196/53375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 11/08/2023] [Accepted: 02/28/2024] [Indexed: 04/05/2024] Open
Abstract
BACKGROUND The initiation of clinical trials for messenger RNA (mRNA) HIV vaccines in early 2022 revived public discussion on HIV vaccines after 3 decades of unsuccessful research. These trials followed the success of mRNA technology in COVID-19 vaccines but unfolded amid intense vaccine debates during the COVID-19 pandemic. It is crucial to gain insights into public discourse and reactions about potential new vaccines, and social media platforms such as X (formerly known as Twitter) provide important channels. OBJECTIVE Drawing from infodemiology and infoveillance research, this study investigated the patterns of public discourse and message-level drivers of user reactions on X regarding HIV vaccines by analyzing posts using machine learning algorithms. We examined how users used different post types to contribute to topics and valence and how these topics and valence influenced like and repost counts. In addition, the study identified salient aspects of HIV vaccines related to COVID-19 and prominent anti-HIV vaccine conspiracy theories through manual coding. METHODS We collected 36,424 English-language original posts about HIV vaccines on the X platform from January 1, 2022, to December 31, 2022. We used topic modeling and sentiment analysis to uncover latent topics and valence, which were subsequently analyzed across post types in cross-tabulation analyses and integrated into linear regression models to predict user reactions, specifically likes and reposts. Furthermore, we manually coded the 1000 most engaged posts about HIV and COVID-19 to uncover salient aspects of HIV vaccines related to COVID-19 and the 1000 most engaged negative posts to identify prominent anti-HIV vaccine conspiracy theories. RESULTS Topic modeling revealed 3 topics: HIV and COVID-19, mRNA HIV vaccine trials, and HIV vaccine and immunity. 
HIV and COVID-19 underscored the connections between HIV vaccines and COVID-19 vaccines, as evidenced by subtopics about their reciprocal impact on development and various comparisons. The overall valence of the posts was marginally positive. Compared to self-composed posts initiating new conversations, there was a higher proportion of HIV and COVID-19-related and negative posts among quote posts and replies, which contribute to existing conversations. The topic of mRNA HIV vaccine trials, most evident in self-composed posts, increased repost counts. Positive valence increased like and repost counts. Prominent anti-HIV vaccine conspiracy theories often falsely linked HIV vaccines to concurrent COVID-19 and other HIV-related events. CONCLUSIONS The results highlight COVID-19 as a significant context for public discourse and reactions regarding HIV vaccines from both positive and negative perspectives. The success of mRNA COVID-19 vaccines shed a positive light on HIV vaccines. However, COVID-19 also situated HIV vaccines in a negative context, as observed in some anti-HIV vaccine conspiracy theories misleadingly connecting HIV vaccines with COVID-19. These findings have implications for public health communication strategies concerning HIV vaccines.
Affiliation(s)
- Jueman M Zhang
- Harrington School of Communication and Media, University of Rhode Island, Kingston, RI, United States
| | - Yi Wang
- Department of Communication, University of Louisville, Louisville, KY, United States
| | - Magali Mouton
- School of Rehabilitation Sciences, University of Ottawa, Ottawa, ON, Canada
| | - Jixuan Zhang
- Polk School of Communications, Long Island University, Brooklyn, NY, United States
| | - Molu Shi
- College of Business, University of Louisville, Louisville, KY, United States
|
12
|
Bellomo RK, Zavalis EA, Ioannidis JPA. Assessment of transparency indicators in space medicine. PLoS One 2024; 19:e0300701. [PMID: 38564591 PMCID: PMC10986997 DOI: 10.1371/journal.pone.0300701] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 03/04/2024] [Indexed: 04/04/2024] Open
Abstract
Space medicine is a vital discipline with often time-intensive and costly projects and constrained opportunities for studying various elements such as space missions, astronauts, and simulated environments. Moreover, private interests gain increasing influence in this discipline. In scientific disciplines with these features, transparent and rigorous methods are essential. Here, we undertook an evaluation of transparency indicators in publications within the field of space medicine. A meta-epidemiological assessment of PubMed Central Open Access (PMC OA) eligible articles within the field of space medicine was performed for prevalence of code sharing, data sharing, pre-registration, conflicts of interest, and funding. Text mining was performed with the rtransparent text mining algorithms with manual validation of 200 random articles to obtain corrected estimates. Across 1215 included articles, 39 (3%) shared code, 258 (21%) shared data, 10 (1%) were registered, 110 (90%) contained a conflict-of-interest statement, and 1141 (93%) included a funding statement. After manual validation, the corrected estimates for code sharing, data sharing, and registration were 5%, 27%, and 1%, respectively. Data sharing was 32% when limited to original articles and highest in space/parabolic flights (46%). Overall, across space medicine we observed modest rates of data sharing, rare sharing of code and almost non-existent protocol registration. Enhancing transparency in space medicine research is imperative for safeguarding its scientific rigor and reproducibility.
Affiliation(s)
- Rosa Katia Bellomo
- Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, CA, United States of America
- Department of Public Health and Infectious Diseases, Sapienza University of Rome, Rome, Italy
| | - Emmanuel A. Zavalis
- Department of Learning Informatics Management and Ethics, Karolinska Institutet, Stockholm, Sweden
- Departments of Medicine, of Epidemiology and Population Health, of Biomedical Data Science, and of Statistics, Stanford University, Stanford, CA, United States of America
| | - John P. A. Ioannidis
- Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, CA, United States of America
- Departments of Medicine, of Epidemiology and Population Health, of Biomedical Data Science, and of Statistics, Stanford University, Stanford, CA, United States of America
|
13
|
Kirchhof B. 170 years of data-mining: history and future. Graefes Arch Clin Exp Ophthalmol 2024; 262:1013-1014. [PMID: 38231246 PMCID: PMC10995019 DOI: 10.1007/s00417-023-06359-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 12/20/2023] [Accepted: 12/28/2023] [Indexed: 01/18/2024] Open
Affiliation(s)
- Bernd Kirchhof
- Center of Ophthalmology, University of Koeln, Koeln, Germany.
|
14
|
Paul J, Jacob J, Mahmud M, Vaka M, Krishnan SG, Arifutzzaman A, Thesiya D, Xiong T, Kadirgama K, Selvaraj J. A data mining approach to analyze the role of biomacromolecules-based nanocomposites in sustainable packaging. Int J Biol Macromol 2024; 265:130850. [PMID: 38492706 DOI: 10.1016/j.ijbiomac.2024.130850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 03/09/2024] [Accepted: 03/11/2024] [Indexed: 03/18/2024]
Abstract
Recent decades have witnessed a surge in research interest in bio-nanocomposite-based packaging materials, yet systematic analysis of this domain is still lacking. Bio-based packaging materials offer a sustainable alternative to petroleum-based packaging materials. The current work employs bibliometric analysis to deliver a comprehensive outline of the role of bio-nanocomposites in packaging. India, Iran, and China were the top three nations by total publications in this domain. Islamic Azad University in Iran and Universiti Putra Malaysia in Malaysia are among the world's leading institutions for active research and publications in this field. The extensive collaboration between nations and institutions highlights the significance of a holistic approach towards bio-nanocomposites. The National Natural Science Foundation of China is the leading funding body in this field of research. Among authors, Jong-Whan Rhim had the most citations (2234) in this domain (13 publications). Among journals, Carbohydrate Polymers had the highest citation count (4629) from 36 articles, the first of which was published in 2011. 'Bio nanocomposite' is the most frequently used keyword. Researchers and policymakers focusing on sustainable packaging solutions will gain crucial insights into the current status of research on packaging solutions using bio-nanocomposites from the conclusions.
Affiliation(s)
- John Paul
- Faculty of Mechanical & Automotive Engineering Technology, University Malaysia Pahang Al-Sultan Abdullah, Malaysia.
| | - Jeeja Jacob
- Higher Institution Centre of Excellence, UM Power Energy Dedicated Advanced Centre (UMPEDAC), University of Malaya, Kuala Lumpur, Malaysia.
| | - Md Mahmud
- Phillip M. Drayer Department of Electrical and Computer Engineering, College of Engineering, Lamar University, Beaumont, TX 77710, USA
| | - Mahesh Vaka
- Thermal Energy Storage department, Iberian Energy Storage Research Center (CIIAE), 10003 Caceres, Spain
| | - Syam G Krishnan
- Department of Chemical Engineering, Faculty of Engineering and Information Technology, The University of Melbourne, Victoria 3010, Australia
| | - A Arifutzzaman
- Tyndall National Institute, University College Cork, Lee Maltings, Cork T12 R5CP, Ireland
| | | | - Teng Xiong
- Department of the Built Environment, College of Design and Engineering, National University of Singapore, Singapore 117566, Singapore
| | - K Kadirgama
- Faculty of Mechanical & Automotive Engineering Technology, University Malaysia Pahang Al-Sultan Abdullah, Malaysia; Department of Civil Engineering, College of Engineering, Almaaqal University, Iraq.
| | - Jeyraj Selvaraj
- Higher Institution Centre of Excellence, UM Power Energy Dedicated Advanced Centre (UMPEDAC), University of Malaya, Kuala Lumpur, Malaysia
|
15
|
Mateu-Sanz M, Fuenteslópez CV, Uribe-Gomez J, Haugen HJ, Pandit A, Ginebra MP, Hakimi O, Krallinger M, Samara A. Redefining biomaterial biocompatibility: challenges for artificial intelligence and text mining. Trends Biotechnol 2024; 42:402-417. [PMID: 37858386 DOI: 10.1016/j.tibtech.2023.09.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 09/25/2023] [Accepted: 09/26/2023] [Indexed: 10/21/2023]
Abstract
The surge in 'Big data' has significantly influenced biomaterials research and development, with vast data volumes emerging from clinical trials, scientific literature, electronic health records, and other sources. Biocompatibility is essential in developing safe medical devices and biomaterials to perform as intended without provoking adverse reactions. Therefore, establishing an artificial intelligence (AI)-driven biocompatibility definition has become decisive for automating data extraction and profiling safety effectiveness. This definition should both reflect the attributes related to biocompatibility and be compatible with computational data-mining methods. Here, we discuss the need for a comprehensive and contemporary definition of biocompatibility and the challenges in developing one. We also identify the key elements that comprise biocompatibility, and propose an integrated biocompatibility definition that enables data-mining approaches.
Affiliation(s)
- Miguel Mateu-Sanz
- Biomaterials, Biomechanics, and Tissue Engineering Group, Department of Materials Science and Engineering, Universitat Politècnica de Catalunya, Barcelona 08019, Spain
| | - Carla V Fuenteslópez
- Institute of Biomedical Engineering, Botnar Research Centre, Nuffield Orthopaedic Centre, University of Oxford, Oxford OX3 7LD, UK
| | - Juan Uribe-Gomez
- CÚRAM, SFI Research Centre for Medical Devices, University of Galway, Galway H92 W2TY, Ireland
| | - Håvard Jostein Haugen
- Department of Biomaterials, Center for Functional Tissue Reconstruction, Faculty of Dentistry, University of Oslo, Oslo 0317, Norway
| | - Abhay Pandit
- CÚRAM, SFI Research Centre for Medical Devices, University of Galway, Galway H92 W2TY, Ireland
| | - Maria-Pau Ginebra
- Biomaterials, Biomechanics, and Tissue Engineering Group, Department of Materials Science and Engineering, Universitat Politècnica de Catalunya, Barcelona 08019, Spain
| | - Osnat Hakimi
- aMoon Ventures, Yerushalaim Rd 34, Ra'anana 4350108, Israel
| | | | - Athina Samara
- Department of Biomaterials, Center for Functional Tissue Reconstruction, Faculty of Dentistry, University of Oslo, Oslo 0317, Norway.
|
16
|
Li JJ, Chen L, Zhao Y, Yang XQ, Hu FB, Wang L. Data mining and safety analysis of traditional immunosuppressive drugs: a pharmacovigilance investigation based on the FAERS database. Expert Opin Drug Saf 2024; 23:513-525. [PMID: 38533933 DOI: 10.1080/14740338.2024.2327503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 10/13/2023] [Indexed: 03/28/2024]
Abstract
OBJECTIVE This study aimed to explore new and serious adverse events (AEs) of tacrolimus (FK506), cyclosporine (CsA), azathioprine (AZA), mycophenolate mofetil (MMF), cyclophosphamide (CTX), and methotrexate (MTX) that have received little attention. METHODS FAERS data from January 2016 to December 2022 were selected for disproportionality analysis to discover the potential risks of traditional immunosuppressive drugs. RESULTS Compared with CsA, FK506 was associated with more frequent transplant rejection and was more strongly related to renal impairment, COVID-19, cytomegalovirus infection, and aspergillus infection. However, CsA had a high infection-related fatality rate. In addition, we found some serious and rare AEs for other drugs that were rarely reported in previous studies. For example, AZA was closely related to hepatosplenic T-cell lymphoma with a high fatality rate, and MTX was strongly related to hypofibrinogenemia. CONCLUSION The AE reports in this study were largely consistent with previous studies, but there were also some important safety signals that were inconsistent with, or not mentioned in, previously published studies. EXPERT OPINION The opinion section discusses some of the limitations and shortcomings, proposing the areas where more effort should be invested in order to improve the safety of immunosuppressive drugs.
Affiliation(s)
- Juan-Juan Li
- Department of Pharmacy, West China Second University Hospital, Sichuan University, Chengdu, Sichuan, China
- Ministry of Education, Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Chengdu, Sichuan, China
- Department of Pharmacy, Guangyuan Central Hospital, Guangyuan, Sichuan, China
| | - Li Chen
- Department of Pharmacy, West China Second University Hospital, Sichuan University, Chengdu, Sichuan, China
- Ministry of Education, Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Chengdu, Sichuan, China
| | - Yang Zhao
- Department of Pharmacy, Guangyuan Central Hospital, Guangyuan, Sichuan, China
| | - Xue-Qin Yang
- Department of Pharmacy, Guangyuan Central Hospital, Guangyuan, Sichuan, China
| | - Fa-Bin Hu
- Department of Pharmacy, West China Second University Hospital, Sichuan University, Chengdu, Sichuan, China
- Ministry of Education, Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Chengdu, Sichuan, China
- Department of Pharmacy, Jinniu Maternity and Child Health Hospital of Chengdu, Chengdu, Sichuan, China
| | - Li Wang
- Department of Pharmacy, West China Second University Hospital, Sichuan University, Chengdu, Sichuan, China
- Ministry of Education, Key Laboratory of Birth Defects and Related Diseases of Women and Children (Sichuan University), Chengdu, Sichuan, China
|
17
|
Limsomwong P, Ingviya T, Fumaneeshoat O. Identifying cancer patients who received palliative care using the SPICT-LIS in medical records: a rule-based algorithm and text-mining technique. BMC Palliat Care 2024; 23:83. [PMID: 38556869 PMCID: PMC10983682 DOI: 10.1186/s12904-024-01419-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Accepted: 03/25/2024] [Indexed: 04/02/2024] Open
Abstract
BACKGROUND Due to limited numbers of palliative care specialists and/or resources, access to palliative care remains limited in many low and middle-income countries. Data science methods, such as rule-based algorithms and text mining, have the potential to improve palliative care by facilitating analysis of electronic healthcare records. This study aimed to develop and evaluate a rule-based algorithm for identifying cancer patients who may benefit from palliative care based on the Thai version of the Supportive and Palliative Care Indicators for a Low-Income Setting (SPICT-LIS) criteria. METHODS The medical records of 14,363 cancer patients aged 18 years and older, diagnosed between 2016 and 2020 at Songklanagarind Hospital, were analyzed. Two rule-based algorithms, strict and relaxed, were designed to identify key SPICT-LIS indicators in the electronic medical records using tokenization and sentiment analysis. The inter-rater reliability between these two algorithms and palliative care physicians was assessed using percentage agreement and Cohen's kappa coefficient. Additionally, factors associated with patients who might benefit from palliative care were examined. RESULTS The strict rule-based algorithm demonstrated a high degree of accuracy, with 95% agreement and Cohen's kappa coefficient of 0.83. In contrast, the relaxed rule-based algorithm demonstrated a lower agreement (71% agreement and Cohen's kappa of 0.16). Advanced-stage cancer with symptoms such as pain, dyspnea, edema, delirium, xerostomia, and anorexia were identified as significant predictors of potentially benefiting from palliative care. CONCLUSION The integration of rule-based algorithms with electronic medical records offers a promising method for enhancing the timely and accurate identification of patients with cancer who might benefit from palliative care.
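The two agreement measures named in the METHODS can be sketched in a few lines. This is a minimal illustration and not the authors' code: the label lists are made-up toy data, and `algorithm` and `physician` are hypothetical names. Cohen's kappa corrects the observed agreement for the agreement expected by chance from each rater's label frequencies.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of items on which the two raters give the same label."""
    return sum(x == y for x, y in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    Undefined (division by zero) when chance agreement p_e equals 1,
    i.e. when both raters always use the same single label.
    """
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    # Chance agreement from the product of each rater's marginal frequencies.
    p_e = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Toy data: 1 = flagged as likely to benefit from palliative care, 0 = not flagged.
algorithm = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
physician = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]

agreement = percent_agreement(algorithm, physician)
kappa = cohens_kappa(algorithm, physician)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Because kappa subtracts chance agreement, a fairly high raw agreement can coexist with a low kappa when one label dominates, which is one possible reading of the relaxed algorithm's 71% agreement alongside a kappa of 0.16.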
Affiliation(s)
- Pawita Limsomwong
- Department of Family and Preventive Medicine, Prince of Songkla University, Songkhla, 90110, Thailand
| | - Thammasin Ingviya
- Department of Family and Preventive Medicine, Prince of Songkla University, Songkhla, 90110, Thailand
- Division of Digital Innovation and Data Analytics, Faculty of Medicine, Prince of Songkla University, Hat Yai Campus, Songkhla, 90110, Thailand
- Department of Clinical Research and Medical Data Science, Faculty of Medicine, Prince of Songkla University, Songkhla, 90110, Thailand
| | - Orapan Fumaneeshoat
- Department of Family and Preventive Medicine, Prince of Songkla University, Songkhla, 90110, Thailand.
|
18
|
Zhao K, Ebrahimie E, Mohammadi-Dehcheshmeh M, Lewsey MG, Zheng L, Hoogenraad NJ. Transcriptomic signature of cancer cachexia by integration of machine learning, literature mining and meta-analysis. Comput Biol Med 2024; 172:108233. [PMID: 38452471 DOI: 10.1016/j.compbiomed.2024.108233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2023] [Revised: 01/23/2024] [Accepted: 02/25/2024] [Indexed: 03/09/2024]
Abstract
BACKGROUND Cancer cachexia is a severe metabolic syndrome marked by skeletal muscle atrophy. A successful clinical intervention for cancer cachexia is currently lacking. The study of cachexia mechanisms is largely based on preclinical animal models and the availability of high-throughput transcriptomic datasets of cachectic mouse muscles is increasing through the extensive use of next generation sequencing technologies. METHODS Cachectic mouse muscle transcriptomic datasets of ten different studies were combined and mined by seven attribute weighting models, which analysed both categorical variables and numerical variables. The transcriptomic signature of cancer cachexia was identified by attribute weighting algorithms and was used to evaluate the performance of eleven pattern discovery models. The signature was employed to find the best combination of drugs (drug repurposing) for developing cancer cachexia treatment strategies, as well as to evaluate currently used cachexia drugs by literature mining. RESULTS Attribute weighting algorithms ranked 26 genes as the transcriptomic signature of muscle from mice with cancer cachexia. Deep Learning and Random Forest models performed better in differentiating cancer cachexia cases based on muscle transcriptomic data. Literature mining revealed that a combination of melatonin and infliximab has negative interactions with 2 key genes (Rorc and Fbxo32) upregulated in the transcriptomic signature of cancer cachexia in muscle. CONCLUSIONS The integration of machine learning, meta-analysis and literature mining was found to be an efficient approach to identifying a robust transcriptomic signature for cancer cachexia, with implications for improving clinical diagnosis and management of this condition.
Affiliation(s)
- Kening Zhao
- Department of Laboratory Medicine, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China; La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC, 3086, Australia.
| | - Esmaeil Ebrahimie
- Genomics Research Platform, School of Agriculture, Biomedicine and Environment, La Trobe University, Melbourne, VIC, 3086, Australia; School of Animal and Veterinary Science, The University of Adelaide, Adelaide, SA 5371, Australia; School of BioSciences, The University of Melbourne, Melbourne, VIC, 3010, Australia.
| | - Manijeh Mohammadi-Dehcheshmeh
- Genomics Research Platform, School of Agriculture, Biomedicine and Environment, La Trobe University, Melbourne, VIC, 3086, Australia; School of Animal and Veterinary Science, The University of Adelaide, Adelaide, SA 5371, Australia.
| | - Mathew G Lewsey
- Australian Research Council Research Hub for Medicinal Agriculture, La Trobe University, AgriBio Building, Bundoora, VIC, 3086, Australia; La Trobe Institute for Sustainable Agriculture and Food, Department of Plant, Animal and Soil Sciences, La Trobe University, AgriBio Building, Bundoora, VIC, 3086, Australia; Australian Research Council Centre of Excellence in Plants for Space, AgriBio Building, La Trobe University, Bundoora, VIC, 3086, Australia.
| | - Lei Zheng
- Department of Laboratory Medicine, Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China.
| | - Nick J Hoogenraad
- La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC, 3086, Australia; Tumour Targeting Laboratory, Olivia Newton-John Cancer Research Institute, School of Cancer Medicine, La Trobe University, Melbourne, VIC, 3084, Australia.
|
19
|
Zhang S, Wang Y, Qi Z, Tong S, Zhu D. Data mining and analysis of adverse event signals associated with teprotumumab using the Food and Drug Administration adverse event reporting system database. Int J Clin Pharm 2024; 46:471-479. [PMID: 38245664 DOI: 10.1007/s11096-023-01676-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Accepted: 11/20/2023] [Indexed: 01/22/2024]
Abstract
BACKGROUND Teprotumumab was approved by the US Food and Drug Administration (FDA) for the treatment of thyroid eye disease in 2020. However, its adverse events (AEs) have not been investigated in real-world settings. AIM This study aimed to detect and evaluate AEs associated with teprotumumab in the real-world setting by conducting a pharmacovigilance analysis of the FDA Adverse Event Reporting System (FAERS) database. METHOD Reporting odds ratio (ROR) was used to detect risk signals from the data from January 2020 to March 2023 in the FAERS database. RESULTS A total of 3,707,269 cases were retrieved, of which 1542 were related to teprotumumab. The FAERS analysis identified 99 teprotumumab-related AE signals in 14 System Organ Classes (SOCs). The most frequent AEs were muscle spasms (n = 287), fatigue (n = 174), blood glucose increase (n = 121), alopecia (n = 120), nausea (n = 118), hyperacusis (n = 117), and headache (n = 117). The AEs with strongest signal strengths were autophony (ROR = 14,475.49), deafness permanent (ROR = 1853.35), gingival recession (ROR = 190.74), deafness neurosensory (ROR = 129.89), nail growth abnormal (ROR = 103.67), onychoclasis (ROR = 73.58), ear discomfort (ROR = 72.88), and deafness bilateral (ROR = 62.46). Eleven positive AE signals were found at the standardized MedDRA queries (SMQs) level, of which the top five SMQs were hyperglycemia/new-onset diabetes mellitus, hearing impairment, gastrointestinal nonspecific symptoms and therapeutic procedures, noninfectious diarrhea, and hypertension. Age significantly increased the risk of hearing impairment. CONCLUSION This study identified potential new and unexpected AE signals of teprotumumab. Our findings emphasize the importance of pharmacovigilance analysis in the real world to identify and manage AEs effectively, ultimately improving patient safety during teprotumumab treatment.
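The reporting odds ratio used in the METHOD is computed from a 2x2 table of spontaneous reports. The sketch below uses the standard formula with made-up counts, not the study's data. The signal rule shown (at least 3 cases and a lower 95% bound above 1) is a common convention and an assumption here, since the abstract does not state the threshold used.

```python
import math

def reporting_odds_ratio(a, b, c, d, z=1.96):
    """ROR and its 95% CI from a 2x2 table of report counts.

    a: target drug, event of interest     b: target drug, all other events
    c: other drugs, event of interest     d: other drugs, all other events
    """
    ror = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper

def is_signal(a, lower, min_cases=3):
    """Assumed convention: enough cases and a lower CI bound above 1."""
    return a >= min_cases and lower > 1

# Hypothetical counts for one drug-event pair.
a, b, c, d = 20, 980, 100, 98900
ror, lower, upper = reporting_odds_ratio(a, b, c, d)
print(f"ROR = {ror:.2f} (95% CI {lower:.2f}-{upper:.2f}), signal = {is_signal(a, lower)}")
```

Very large ROR values, such as the one reported for autophony, typically arise when the event is almost never reported for other drugs (a tiny c), so they indicate disproportionate reporting and not an incidence rate.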
Affiliation(s)
- Sha Zhang
- Department of Pharmacy, Tongji Hospital, School of Medicine, Tongji University, Shanghai, China
| | - Yidong Wang
- Department of Pharmacy, Ren Ji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Zhan Qi
- Department of Pharmacy, Tongji Hospital, School of Medicine, Tongji University, Shanghai, China
| | - Shanshan Tong
- Department of Pharmacy, Tongji Hospital, School of Medicine, Tongji University, Shanghai, China
| | - Deqiu Zhu
- Department of Pharmacy, Tongji Hospital, School of Medicine, Tongji University, Shanghai, China.
|
20
|
Li T, Hu K, Ye L, Ma J, Huang L, Guo C, Huang X, Jiang J, Xie X, Guo C, He Q. Association of Antipsychotic Drugs with Venous Thromboembolism: Data Mining of Food and Drug Administration Adverse Event Reporting System and Mendelian Randomization Analysis. J Atheroscler Thromb 2024; 31:396-418. [PMID: 38030236 PMCID: PMC10999720 DOI: 10.5551/jat.64461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 09/25/2023] [Indexed: 12/01/2023] Open
Abstract
AIMS Past observational studies have reported on the association between antipsychotic drugs and venous thromboembolism (VTE); however, the conclusions remain controversial, and the mechanisms are yet to be fully understood. Thus, in this study, we aim to determine the associations of antipsychotic drugs with VTE, including deep vein thrombosis (DVT) and pulmonary embolism (PE), and their potential mechanisms. METHODS We first mined the adverse event signals of VTE, DVT, and PE caused by antipsychotic drugs in the Food and Drug Administration Adverse Event Reporting System (FAERS). Next, we used the two-sample Mendelian randomization (MR) method to investigate the association of antipsychotic drug target gene expression with VTE, DVT, and PE, using single-nucleotide polymorphisms as genetic instruments. We not only used the expression of all antipsychotic drug target genes as exposure to perform MR analyses but also analyzed the effect of single target gene expression on the outcomes. RESULTS In the FAERS, 1694 cases of VTE events were reported for 16 drugs. However, using the MR approach, no significant association was determined between the expression of all antipsychotic target genes and VTE, DVT, or PE, either in blood or brain tissue. Although the analysis of single gene expression data showed that the expression of nine genes was associated with VTE events, these targets lacked significant pharmacological action. CONCLUSIONS Adverse event mining results have supported the claim that antipsychotic drugs can increase the risk of VTE. However, we failed to find any genetic evidence for this causal association or its potential mechanisms. Thus, vigilance is still needed for antipsychotic drug-related VTE despite the limited supporting evidence.
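For readers unfamiliar with two-sample MR, the sketch below shows one widely used estimator, the fixed-effect inverse-variance weighted (IVW) combination of per-SNP Wald ratios. The abstract does not say which estimator the authors used, so this is an illustrative assumption, and the summary statistics are made-up numbers.

```python
import math

def wald_ratios(beta_exposure, beta_outcome):
    """Per-SNP causal estimates: SNP-outcome effect / SNP-exposure effect."""
    return [by / bx for bx, by in zip(beta_exposure, beta_outcome)]

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and its standard error (first-order weights)."""
    numerator = sum(bx * by / se**2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx**2 / se**2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1 / denominator)

# Hypothetical summary statistics for three SNP instruments.
beta_x = [0.10, 0.20, 0.15]   # SNP effect on target gene expression (exposure)
beta_y = [0.02, 0.04, 0.03]   # SNP effect on the outcome (log odds of VTE)
se_y = [0.01, 0.01, 0.01]     # standard errors of the SNP-outcome effects

ratios = wald_ratios(beta_x, beta_y)
estimate, se = ivw_estimate(beta_x, beta_y, se_y)
print(f"Wald ratios = {ratios}, IVW estimate = {estimate:.3f} (SE {se:.3f})")
```

In this toy case every SNP implies the same ratio, so the IVW estimate equals it. With real data the ratios differ, and heterogeneity among them is one warning sign of invalid instruments.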
Affiliation(s)
- Tong Li
- Center of Clinical Pharmacology, The Third Xiangya Hospital, Central South University, Hunan, P.R. China
| | - Kai Hu
- Department of Neurology, Xiangya Hospital, Central South University, Hunan, P.R. China
- National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Hunan, P.R. China
- Clinical Research Center for Epileptic disease of Hunan Province, Central South University, Hunan, P.R. China
| | - Ling Ye
- Center of Clinical Pharmacology, The Third Xiangya Hospital, Central South University, Hunan, P.R. China
| | - Junlong Ma
- Center of Clinical Pharmacology, The Third Xiangya Hospital, Central South University, Hunan, P.R. China
| | - Longjian Huang
- Youjiang Medical University for Nationalities, Guangxi, P.R. China
| | - Chengjun Guo
- School of Applied Mathematics, Guangdong University of Technology, Guangdong, P.R. China
| | - Xin Huang
- Center of Clinical Pharmacology, The Third Xiangya Hospital, Central South University, Hunan, P.R. China
| | - Jie Jiang
- Department of Pediatrics, The Affiliated Changsha Central Hospital, University of South China Hengyang Medical School, University of South China, Hunan, P.R. China
| | - Xiaoxue Xie
- Department of Radiotherapy, Hunan Provincial Tumor Hospital and Affiliated Tumor Hospital of Xiangya Medical School, Central South University, Hunan, P.R. China
| | - Chengxian Guo
- Center of Clinical Pharmacology, The Third Xiangya Hospital, Central South University, Hunan, P.R. China
| | - Qingnan He
- Department of Pediatrics, The Third Xiangya Hospital, Central South University, Hunan, P.R. China
| |
|
21
|
Wermers Z, Yoo S, Radenbaugh B, Douglass A, Biesecker LG, Johnston JJ. Comparison of literature mining tools for variant classification: Through the lens of 50 RYR1 variants. Genet Med 2024; 26:101083. [PMID: 38281099 DOI: 10.1016/j.gim.2024.101083] [Received: 10/05/2023] [Revised: 01/19/2024] [Accepted: 01/22/2024] [Indexed: 01/29/2024]
Abstract
PURPOSE The American College of Medical Genetics and Genomics and the Association for Molecular Pathology have outlined a schema that allows for systematic classification of variant pathogenicity. Although gnomAD is generally accepted as a reliable source of population frequency data and ClinGen has provided guidance on the utility of specific bioinformatic predictors, there is no consensus source for identifying publications relevant to a variant. Multiple tools are available to aid in the identification of relevant variant literature, including manually curated databases and literature search engines. We set out to determine the utility of 4 literature mining tools used for ascertainment to inform the discussion of the use of these tools. METHODS Four literature mining tools including the Human Gene Mutation Database, Mastermind, ClinVar, and LitVar 2.0 were used to identify relevant variant literature for 50 RYR1 variants. Sensitivity and precision were determined for each tool. RESULTS Sensitivity among the 4 tools ranged from 0.332 to 0.687. Precision ranged from 0.389 to 0.906. No single tool retrieved all relevant publications. CONCLUSION At the current time, the use of multiple tools is necessary to completely identify the literature relevant to curate a variant.
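The sensitivity and precision figures reported above can be illustrated with a minimal Python sketch. It assumes each tool returns a set of publication identifiers and that a curated set of relevant publications exists for the variant; the identifiers and the names `sensitivity_and_precision`, `tool_a_hits`, and `tool_b_hits` are hypothetical and are not taken from the study.

```python
def sensitivity_and_precision(retrieved, relevant):
    """Return (sensitivity, precision) for one tool's retrieved publications."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Sensitivity: share of the relevant publications that the tool found.
    sensitivity = true_positives / len(relevant) if relevant else 0.0
    # Precision: share of the tool's hits that are actually relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    return sensitivity, precision


# Hypothetical identifiers for a single variant (illustration only).
relevant = {"PUB_A", "PUB_B", "PUB_C", "PUB_D"}
tool_a_hits = {"PUB_A", "PUB_B", "PUB_X"}
tool_b_hits = {"PUB_C", "PUB_D"}

sens_a, prec_a = sensitivity_and_precision(tool_a_hits, relevant)
# Pooling the hits of several tools can only keep or raise sensitivity.
sens_pooled, prec_pooled = sensitivity_and_precision(tool_a_hits | tool_b_hits, relevant)
```

In this toy case neither tool alone retrieves every relevant publication, but the pooled result does, which mirrors the abstract's conclusion that multiple tools are currently needed.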
Affiliation(s)
- Zara Wermers
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Seeley Yoo
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Bailey Radenbaugh
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Amber Douglass
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Leslie G Biesecker
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD
| | - Jennifer J Johnston
- Center for Precision Health Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD.
| |
|
22
|
Rocha HAL, Solha EZM, Furtado V, Justino FL, Barreto LAL, da Silva RG, de Oliveira ÍM, Bates DW, de Góes Cavalcanti LP, Lima Neto AS, de Oliveira EA. COVID-19 outbreaks surveillance through text mining applied to electronic health records. BMC Infect Dis 2024; 24:359. [PMID: 38549109 PMCID: PMC10976796 DOI: 10.1186/s12879-024-09250-y] [Received: 01/04/2024] [Accepted: 03/24/2024] [Indexed: 04/01/2024]
Abstract
BACKGROUND The COVID-19 pandemic has caused significant disruptions to everyday life and has had social, political, and financial consequences that will persist for years. Several initiatives with intensive use of technology were quickly developed in this scenario. However, technologies that enhance epidemiological surveillance in contexts with low testing capacity and healthcare resources are scarce. Therefore, this study aims to address this gap by developing a data science model that uses routinely generated healthcare encounter records to detect possible new outbreaks early and in real time. METHODS We defined an epidemiological indicator that is a proxy for suspected cases of COVID-19 using the health records of Emergency Care Unit (ECU) patients and text mining techniques. The open-field dataset comprises 2,760,862 medical records from nine ECUs, where each record has information about the patient's age, reported symptoms, and the time and date of admission. We also used a dataset where 1,026,804 cases of COVID-19 were officially confirmed. The records range from January 2020 to May 2022. Sample cross-correlation between two finite stochastic time series was used to evaluate the models. RESULTS For patients with age 18 years, we find a time lag of 72 days with a cross-correlation of ~0.82, 25 days with ~0.93, and 17 days with ~0.88 for the first, second, and third waves, respectively. CONCLUSIONS The developed model can aid in the early detection of signs of possible new COVID-19 outbreaks, weeks before traditional surveillance systems, so that preventive and control actions in public health can be initiated earlier and with a higher likelihood of success.
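The lagged comparison of two time series described in the abstract can be sketched in a few lines of Python. This is a simplified illustration rather than the authors' implementation: it computes a Pearson correlation on the overlapping window for each candidate lag and picks the lag with the highest value. The function names and the toy series are assumptions made for the example.

```python
from statistics import mean, pstdev


def lagged_correlation(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag], for lag >= 0."""
    a = x[:len(x) - lag]  # early indicator, trimmed at the end
    b = y[lag:]           # later series, trimmed at the start
    mean_a, mean_b = mean(a), mean(b)
    covariance = mean([(p - mean_a) * (q - mean_b) for p, q in zip(a, b)])
    return covariance / (pstdev(a) * pstdev(b))


def best_lag(x, y, max_lag):
    """Lag in 0..max_lag that maximises the lagged correlation."""
    return max(range(max_lag + 1), key=lambda k: lagged_correlation(x, y, k))


# Toy daily counts: the confirmed series repeats the indicator two days later.
indicator = [0, 1, 3, 7, 3, 1, 0, 0, 0, 0]
confirmed = [0, 0, 0, 1, 3, 7, 3, 1, 0, 0]

lag = best_lag(indicator, confirmed, max_lag=4)
peak = lagged_correlation(indicator, confirmed, lag)
```

Here the symptom-based indicator leads the confirmed series by the recovered lag, which is the sense in which such an indicator could give early warning.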
Affiliation(s)
- Hermano Alexandre Lima Rocha
- Department of Community Health, Federal University of Ceará, Street Papi Júnior, 1223, 5th. Floor, Fortaleza, CE, Brazil.
| | - Erik Zarko Macêdo Solha
- Postgraduate Program in Applied Informatics, University of Fortaleza, Fortaleza, CE, 60811-905, Brazil
| | - Vasco Furtado
- Postgraduate Program in Applied Informatics, University of Fortaleza, Fortaleza, CE, 60811-905, Brazil
| | - Francion Linhares Justino
- Postgraduate Program in Applied Informatics, University of Fortaleza, Fortaleza, CE, 60811-905, Brazil
| | - Lucas Arêa Leão Barreto
- Department of Community Health, Federal University of Ceará, Street Papi Júnior, 1223, 5th. Floor, Fortaleza, CE, Brazil
| | - Ronaldo Guedes da Silva
- Department of Community Health, Federal University of Ceará, Street Papi Júnior, 1223, 5th. Floor, Fortaleza, CE, Brazil
| | | | | | - Luciano Pamplona de Góes Cavalcanti
- Department of Community Health, Federal University of Ceará, Street Papi Júnior, 1223, 5th. Floor, Fortaleza, CE, Brazil
- School of Public Health of Ceará, Fortaleza, CE, Brazil
- Faculty of Medicine, Christus University Center, Fortaleza, CE, Brazil
| | - Antônio Silva Lima Neto
- Postgraduate Program in Applied Informatics, University of Fortaleza, Fortaleza, CE, 60811-905, Brazil
- Health Secretariat, Ceará State Government, Fortaleza, CE, Brazil
| | - Erneson Alves de Oliveira
- Postgraduate Program in Applied Informatics, University of Fortaleza, Fortaleza, CE, 60811-905, Brazil
- Laboratory of Data Science and Artificial Intelligence, University of Fortaleza, Fortaleza, Ceará, 60811-905, Brazil
- Professional Masters in City Science, University of Fortaleza, Fortaleza, Ceará, 60811-905, Brazil
| |
|
23
|
Lu S, Yang J, Gu Y, He D, Wu H, Sun W, Xu D, Li C, Guo C. Advances in Machine Learning Processing of Big Data from Disease Diagnosis Sensors. ACS Sens 2024; 9:1134-1148. [PMID: 38363978 DOI: 10.1021/acssensors.3c02670] [Indexed: 02/18/2024]
Abstract
Exploring accurate, noninvasive, and inexpensive disease diagnostic sensors is a critical task in the fields of chemistry, biology, and medicine. The complexity of biological systems and the explosive growth of biomarker data have driven machine learning to become a powerful tool for mining and processing big data from disease diagnosis sensors. With the development of bioinformatics and artificial intelligence (AI), machine learning models formed by data mining have been able to guide more sensitive and accurate molecular computing. This review presents an overview of big data collection approaches and fundamental machine learning algorithms and discusses recent advances in machine learning and molecular computational disease diagnostic sensors. More specifically, we highlight existing modular workflows and key opportunities and challenges for machine learning to achieve disease diagnosis through big data mining.
Affiliation(s)
- Shasha Lu
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
| | - Jianyu Yang
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
| | - Yu Gu
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
| | - Dongyuan He
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
| | - Haocheng Wu
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
| | - Wei Sun
- College of Chemistry and Chemical Engineering, Hainan Normal University, Haikou 571158, China
| | - Dong Xu
- Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou 310022, China
| | - Changming Li
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
| | - Chunxian Guo
- School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
| |
|
24
|
Huang L, Fan Y, Lin R, Zhao Y, Mo Y, Luo S, Li Z. Investigating acupoint selection and combinations of acupuncture for primary idiopathic tinnitus using data mining. Medicine (Baltimore) 2024; 103:e37107. [PMID: 38518013 PMCID: PMC10956944 DOI: 10.1097/md.0000000000037107] [Received: 11/08/2023] [Accepted: 01/08/2024] [Indexed: 03/24/2024]
Abstract
BACKGROUND Acupuncture is widely used in the treatment of tinnitus worldwide because of its good efficacy and safety. However, the criteria for selecting acupoint prescriptions and combinations have not been summarized. Therefore, data mining was used herein to determine the treatment principles and the most effective acupoint selection for the treatment of idiopathic tinnitus. METHODS Clinical research literature on acupuncture in the treatment of idiopathic tinnitus, published from the establishment of each database to September 1, 2023, was retrieved and extracted from the China National Knowledge Infrastructure, China Medical Journal Full-text Database, PubMed, Embase, Cochrane Library and Web of Science databases. Microsoft Excel 2016 was used to establish the acupoint prescription database, and frequency statistics of acupoints, meridians and specific acupoints were carried out. IBM SPSS Statistics 25.0 software was used for cluster analysis of acupoints, and IBM SPSS Modeler 18.0 software was used for association rule analysis of acupoints. RESULTS A total of 112 articles were included, involving 221 acupuncture prescriptions, including 99 acupoints, with a total frequency of 1786 times. The 5 most frequently used acupoints were Tinggong (SI19), Tinghui (GB2), Yifeng (TE17), Ermen (TE21), and Zhongzhu (TE3). The commonly used meridians were the Sanjiao meridian of hand-shaoyang, the Gallbladder meridian of foot-shaoyang and the Small intestine meridian of hand-taiyang. The specific points were mostly Crossing points, Five-shu points and Yuan-primary points. The core acupoint combination of association rules was Ermen (TE21)-Tinggong (SI19)-Tinghui (GB2)-Yifeng (TE17), and 3 effective clustering groups were obtained by cluster analysis of high-frequency acupoints.
CONCLUSION In this study, the published literature on acupuncture treatment of idiopathic tinnitus was analyzed by data mining, and the relationship between acupoints was explored, which provides a better-informed choice for clinical acupuncture treatment of idiopathic tinnitus.
Affiliation(s)
- Liangliang Huang
- Faculty of Acupuncture, Moxibustion and Tui Na of Guangxi University of Chinese Medicine, Nanning, People’s Republic of China
- Liuzhou Workers’ Hospital, Guangxi, China
| | - Yushan Fan
- Faculty of Acupuncture, Moxibustion and Tui Na of Guangxi University of Chinese Medicine, Nanning, People’s Republic of China
| | - Rui Lin
- Faculty of Acupuncture, Moxibustion and Tui Na of Guangxi University of Chinese Medicine, Nanning, People’s Republic of China
| | - Yiping Zhao
- Faculty of Acupuncture, Moxibustion and Tui Na of Guangxi University of Chinese Medicine, Nanning, People’s Republic of China
| | - Yaru Mo
- Faculty of Acupuncture, Moxibustion and Tui Na of Guangxi University of Chinese Medicine, Nanning, People’s Republic of China
| | - Sen Luo
- Faculty of Acupuncture, Moxibustion and Tui Na of Guangxi University of Chinese Medicine, Nanning, People’s Republic of China
| | - Zhan Li
- Faculty of Acupuncture, Moxibustion and Tui Na of Guangxi University of Chinese Medicine, Nanning, People’s Republic of China
| |
|
25
|
Karystianis G, Lukmanjaya W, Buchan I, Simpson P, Ginnivan N, Nenadic G, Butler T. An analysis of published study designs in PubMed prisoner health abstracts from 1963 to 2023: a text mining study. BMC Med Res Methodol 2024; 24:68. [PMID: 38494501 PMCID: PMC10944606 DOI: 10.1186/s12874-024-02186-6] [Received: 08/28/2023] [Accepted: 02/20/2024] [Indexed: 03/19/2024]
Abstract
BACKGROUND The challenging nature of studies with incarcerated populations and other offender groups can impede the conduct of research, particularly that involving complex study designs such as randomised control trials and clinical interventions. Providing an overview of study designs employed in this area can offer insights into this issue and how research quality may impact on health and justice outcomes. METHODS We used a rule-based approach to extract study designs from a sample of 34,481 PubMed abstracts related to epidemiological criminology published between 1963 and 2023. The results were compared against an accepted hierarchy of scientific evidence. RESULTS We evaluated our method in a random sample of 100 PubMed abstracts. An F1-Score of 92.2% was returned. Of 34,481 study abstracts, almost 40.0% (13,671) had an extracted study design. The most common study design was observational (37.3%; 5101) while experimental research in the form of trials (randomised, non-randomised) was present in 16.9% (2319). Mapped against the current hierarchy of scientific evidence, 13.7% (1874) of extracted study designs could not be categorised. Among the remaining studies, most were observational (17.2%; 2343) followed by systematic reviews (10.5%; 1432) with randomised controlled trials accounting for 8.7% (1196) of studies and meta-analysis for 1.4% (190) of studies. CONCLUSIONS It is possible to extract epidemiological study designs from a large-scale PubMed sample computationally. However, the number of trials, systematic reviews, and meta-analysis is relatively small - just 1 in 5 articles. Despite an increase over time in the total number of articles, study design details in the abstracts were missing. Epidemiological criminology still lacks the experimental evidence needed to address the health needs of the marginalized and isolated population that is prisoners and offenders.
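A rule-based extraction step of the kind described above, together with an F1 evaluation against manual labels, might look like the following Python sketch. The patterns, labels, and function names are hypothetical examples chosen for illustration; the study's actual rules are not given in the abstract.

```python
import re

# Hypothetical rules, checked in order; the first matching rule wins.
RULES = [
    ("randomised controlled trial",
     re.compile(r"randomi[sz]ed\s+control(?:led)?\s+trial", re.I)),
    ("systematic review", re.compile(r"systematic\s+review", re.I)),
    ("meta-analysis", re.compile(r"meta-?\s?analys[ie]s", re.I)),
    ("cohort study", re.compile(r"cohort\s+study", re.I)),
    ("cross-sectional study", re.compile(r"cross-?\s?sectional", re.I)),
]


def extract_design(abstract):
    """Return the first study design whose pattern matches, else None."""
    for label, pattern in RULES:
        if pattern.search(abstract):
            return label
    return None


def f1_score(predicted, gold):
    """F1 over extracted labels; None means no design was extracted or annotated."""
    pairs = list(zip(predicted, gold))
    tp = sum(1 for p, g in pairs if p is not None and p == g)
    fp = sum(1 for p, g in pairs if p is not None and p != g)
    fn = sum(1 for p, g in pairs if g is not None and p != g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


design = extract_design("We conducted a randomized controlled trial in two prisons.")
score = f1_score(["cohort study", None, "meta-analysis"],
                 ["cohort study", "systematic review", "meta-analysis"])
```

Ordering the rules from most to least specific is one simple way to resolve abstracts that mention several designs; a real system would likely need many more patterns and negation handling.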
Affiliation(s)
- George Karystianis
- School of Population Health, University of New South Wales, Sydney, Australia.
| | - Wilson Lukmanjaya
- School of Population Health, University of New South Wales, Sydney, Australia
| | - Iain Buchan
- Institute of Population Health, University of Liverpool, Liverpool, UK
| | - Paul Simpson
- School of Population Health, University of New South Wales, Sydney, Australia
| | - Natasha Ginnivan
- School of Population Health, University of New South Wales, Sydney, Australia
| | - Goran Nenadic
- School of Computer Science, University of Manchester, Manchester, UK
| | - Tony Butler
- School of Population Health, University of New South Wales, Sydney, Australia
| |
|
26
|
Liu C, Sun K, Zhou Q, Duan Y, Shu J, Kan H, Gu Z, Hu J. CPMI-ChatGLM: parameter-efficient fine-tuning ChatGLM with Chinese patent medicine instructions. Sci Rep 2024; 14:6403. [PMID: 38493251 PMCID: PMC10944515 DOI: 10.1038/s41598-024-56874-w] [Received: 09/20/2023] [Accepted: 03/12/2024] [Indexed: 03/18/2024]
Abstract
Chinese patent medicine (CPM) is a typical type of traditional Chinese medicine (TCM) preparation that uses Chinese herbs as raw materials and is an important means of treating diseases in TCM. Chinese patent medicine instructions (CPMI) serve as a guide for patients to use drugs safely and effectively. In this study, we apply a pre-trained language model to the domain of CPM. We have meticulously assembled, processed, and released the first CPMI dataset and fine-tuned the ChatGLM-6B base model, resulting in the development of CPMI-ChatGLM. We employed consumer-grade graphics cards for parameter-efficient fine-tuning and investigated the impact of LoRA and P-Tuning v2, as well as different data scales and instruction data settings on model performance. We evaluated CPMI-ChatGLM using BLEU, ROUGE, and BARTScore metrics. Our model achieved scores of 0.7641, 0.8188, 0.7738, 0.8107, and -2.4786 on the BLEU-4, ROUGE-1, ROUGE-2, ROUGE-L and BARTScore metrics, respectively. In comparison experiments and human evaluation with four large language models of similar parameter scales, CPMI-ChatGLM demonstrated state-of-the-art performance. CPMI-ChatGLM demonstrates commendable proficiency in CPM recommendations, making it a promising tool for auxiliary diagnosis and treatment. Furthermore, the various attributes in the CPMI dataset can be used for data mining and analysis, providing practical application value and research significance.
Affiliation(s)
- Can Liu
- School of Medical Informatics Engineering, Anhui University of Traditional Chinese Medicine, Hefei, 230012, China
- Anhui Computer Application Research Institute of Chinese Medicine, China Academy of Chinese Medical Sciences, Hefei, 230012, China
| | - Kaijie Sun
- School of Medical Informatics Engineering, Anhui University of Traditional Chinese Medicine, Hefei, 230012, China
| | - Qingqing Zhou
- School of Medical Informatics Engineering, Anhui University of Traditional Chinese Medicine, Hefei, 230012, China
| | - Yuchen Duan
- School of Medical Informatics Engineering, Anhui University of Traditional Chinese Medicine, Hefei, 230012, China
| | - Jianhua Shu
- School of Medical Informatics Engineering, Anhui University of Traditional Chinese Medicine, Hefei, 230012, China
| | - Hongxing Kan
- School of Medical Informatics Engineering, Anhui University of Traditional Chinese Medicine, Hefei, 230012, China
- Anhui Computer Application Research Institute of Chinese Medicine, China Academy of Chinese Medical Sciences, Hefei, 230012, China
| | - Zongyun Gu
- School of Medical Informatics Engineering, Anhui University of Traditional Chinese Medicine, Hefei, 230012, China
| | - Jili Hu
- School of Medical Informatics Engineering, Anhui University of Traditional Chinese Medicine, Hefei, 230012, China.
- Anhui Computer Application Research Institute of Chinese Medicine, China Academy of Chinese Medical Sciences, Hefei, 230012, China.
| |
|
27
|
Huang L, Shi F, Hu D, Kang D. Analysis of research topics and trends in investigator-initiated research/trials (IIRs/IITs): A topic modeling study. Medicine (Baltimore) 2024; 103:e37375. [PMID: 38457583 PMCID: PMC10919521 DOI: 10.1097/md.0000000000037375] [Received: 09/21/2023] [Revised: 12/26/2023] [Accepted: 02/05/2024] [Indexed: 03/10/2024]
Abstract
BACKGROUND With the exponential growth of publications in the field of investigator-initiated research/trials (IIRs/IITs), it has become necessary to employ text mining and bibliometric analysis as tools for gaining deeper insights into this area of study. By using these methods, researchers can effectively identify and analyze research topics within the field. METHODS This study retrieved relevant publications from the Web of Science Core Collection and conducted bibliometric analysis. The latent Dirichlet allocation model, which is based on machine learning, was utilized to identify subfield research topics. RESULTS A total of 4315 articles related to IIRs/IITs were obtained from the Web of Science Core Collection. After excluding duplicates and articles with missing abstracts, a final dataset of 3333 articles was included for bibliometric analysis. The number of publications showed a steady increase over time, particularly since 2000. The United States, Germany, the United Kingdom, the Netherlands, Canada, Denmark, Japan, Switzerland, and France emerged as the most productive countries in terms of IIRs/IITs. The citation analysis revealed intriguing trends, with certain highly cited articles showing a significant increase in citation frequency in recent years. A model with 45 topics was deemed the best fit for characterizing the extensively researched fields within IIRs/IITs. Our analysis revealed 10 top topics that have garnered significant attention, spanning domains such as community health, cancer treatment, brain development and disease mechanisms, nursing research, and stem cell therapy. These top topics offer researchers valuable directions for further investigation and innovation. Additionally, we identified 12 hot topics, which represent the most cutting-edge and highly regarded research areas within the field.
CONCLUSION This study contributes to a comprehensive understanding of the current research landscape and provides valuable insights for researchers working in this domain.
Affiliation(s)
- Litao Huang
- Chinese Evidence-Based Medicine Center, National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
- Department of Clinical Research Management, West China Hospital of Sichuan University, Chengdu, China
| | - Fanfan Shi
- Department of Clinical Research Management, West China Hospital of Sichuan University, Chengdu, China
| | - Dan Hu
- Department of Clinical Research Management, West China Hospital of Sichuan University, Chengdu, China
| | - Deying Kang
- Department of Clinical Research Management, West China Hospital of Sichuan University, Chengdu, China
- Department of Evidence-Based Medicine and Clinical Epidemiology, West China Hospital, Sichuan University, Chengdu, China
| |
|
28
|
Saheb T. Mapping Ethical Artificial Intelligence Policy Landscape: A Mixed Method Analysis. Sci Eng Ethics 2024; 30:9. [PMID: 38451328 PMCID: PMC10920462 DOI: 10.1007/s11948-024-00472-6] [Received: 06/20/2022] [Accepted: 01/30/2024] [Indexed: 03/08/2024]
Abstract
As more national governments adopt policies addressing the ethical implications of artificial intelligence, a comparative analysis of policy documents on these topics can provide valuable insights into emerging concerns and areas of shared importance. This study critically examines 57 policy documents pertaining to ethical AI originating from 24 distinct countries, employing a combination of computational text mining methods and qualitative content analysis. The primary objective is to methodically identify common themes throughout these policy documents and perform a comparative analysis of the ways in which various governments give priority to crucial matters. A total of nineteen topics were initially retrieved. Through an iterative coding process, six overarching themes were identified: principles, the protection of personal data, governmental roles and responsibilities, procedural guidelines, governance and monitoring mechanisms, and epistemological considerations. Furthermore, the research revealed 31 ethical dilemmas pertaining to AI that had been overlooked previously but are now emerging. These dilemmas have been referred to in different extents throughout the policy documents. This research makes a scholarly contribution to the expanding field of technology policy formulations at the national level by analyzing similarities and differences among countries. Furthermore, this analysis has practical ramifications for policymakers who are attempting to comprehend prevailing trends and potentially neglected domains that demand focus in the ever-evolving field of artificial intelligence.
Affiliation(s)
- Tahereh Saheb
- School of Business, Menlo College, Atherton, CA, USA.
- Faculty of Law, Tarbiat Modares University, Tehran, Iran.
| |
|
29
|
Pei Y, O'Brien KH. Use of Social Media Data Mining to Examine Needs, Concerns, and Experiences of People With Traumatic Brain Injury. Am J Speech Lang Pathol 2024; 33:831-847. [PMID: 38147471 DOI: 10.1044/2023_ajslp-23-00297] [Indexed: 12/28/2023]
Abstract
PURPOSE Given the limited availability of topic-specific resources, many people turn to anonymous social media platforms such as Reddit to seek information and connect to others with similar experiences and needs. Mining of such data can therefore identify unmet needs within the community and allow speech-language pathologists to incorporate clients' real-life insights into clinical practices. METHOD A mixed-method analysis was performed on 3,648 traumatic brain injury (TBI) subreddit posts created between 2013 and 2021. Sentiment analysis was used to determine the sentiment expressed in each post; topic modeling and qualitative content analysis were used to uncover the main topics discussed across posts. Subgroup analyses were conducted based on injury severity, chronicity, and whether the post was authored by a person with TBI or a close other. RESULTS There was no significant difference between the number of posts with positive sentiment and the number of posts with negative sentiment. Comparisons between subgroups showed significantly higher positive sentiment in posts by or about people with moderate-to-severe TBI (compared to mild TBI) and who were more than 1 month postinjury (compared to less than 1 month). Posts by close others had significantly higher positive sentiment than posts by people with TBI. Topic modeling identified three meta-themes: Recovery, Symptoms, and Medical Care. Qualitative content analysis further revealed that returning to productivity and life as well as sharing recovery tips were the primary focus under the Recovery theme. Symptom-related posts often discussed symptom management and validation of experiences. The Medical Care theme encompassed concerns regarding diagnosis, medication, and treatment. CONCLUSIONS Concerns and needs shift over time following TBI, and they extend beyond health and functioning to participation in meaningful daily activities. 
The findings can inform the development of tailored educational resources and rehabilitative approaches, facilitating recovery and community building for individuals with TBI. SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.24881340.
Affiliation(s)
- Yalian Pei
- Department of Communication Sciences and Special Education, University of Georgia, Athens
- Department of Communication Sciences and Disorders, Syracuse University, NY
| | - Katy H O'Brien
- Department of Communication Sciences and Special Education, University of Georgia, Athens
- Courage Kenny Rehabilitation Institute, Allina Health, Minneapolis, MN
| |
|
30
|
Park YJ, Yang GJ, Sohn CB, Park SJ. GPDminer: a tool for extracting named entities and analyzing relations in biological literature. BMC Bioinformatics 2024; 25:101. [PMID: 38448845 PMCID: PMC10916184 DOI: 10.1186/s12859-024-05710-z] [Received: 11/08/2023] [Accepted: 02/19/2024] [Indexed: 03/08/2024]
Abstract
PURPOSE The expansion of research across various disciplines has led to a substantial increase in published papers and journals, highlighting the necessity for reliable text mining platforms for database construction and knowledge acquisition. This abstract introduces GPDMiner (Gene, Protein, and Disease Miner), a platform designed for the biomedical domain, addressing the challenges posed by the growing volume of academic papers. METHODS GPDMiner is a text mining platform that utilizes advanced information retrieval techniques. It operates by searching PubMed for specific queries, extracting and analyzing information relevant to the biomedical field. This system is designed to discern and illustrate relationships between biomedical entities obtained from automated information extraction. RESULTS The implementation of GPDMiner demonstrates its efficacy in navigating the extensive corpus of biomedical literature. It efficiently retrieves, extracts, and analyzes information, highlighting significant connections between genes, proteins, and diseases. The platform also allows users to save their analytical outcomes in various formats, including Excel and images. CONCLUSION GPDMiner offers a notable additional functionality among the array of text mining tools available for the biomedical field. This tool presents an effective solution for researchers to navigate and extract relevant information from the vast unstructured texts found in biomedical literature, thereby providing distinctive capabilities that set it apart from existing methodologies. Its application is expected to greatly benefit researchers in this domain, enhancing their capacity for knowledge discovery and data management.
Affiliation(s)
- Yeon-Ji Park
  - Department of Electronics and Communications Engineering, Kwangwoon University, 20 Gwangun-ro, Seoul, 01897, Republic of Korea
- Geun-Je Yang
  - Department of Electronics and Communications Engineering, Kwangwoon University, 20 Gwangun-ro, Seoul, 01897, Republic of Korea
- Chae-Bong Sohn
  - Department of Electronics and Communications Engineering, Kwangwoon University, 20 Gwangun-ro, Seoul, 01897, Republic of Korea
- Soo Jun Park
  - Welfare & Medical ICT Research Department, Electronics and Telecommunications Research Institute, 218 Gajeong-ro, Daejeon, 34129, Republic of Korea
|
31
|
Wu X, Wen Q, Zhu J. Association rule mining with a special rule coding and dynamic genetic algorithm for air quality impact factors in Beijing, China. PLoS One 2024; 19:e0299865. [PMID: 38437225 PMCID: PMC10911623 DOI: 10.1371/journal.pone.0299865] [Received: 10/11/2023] [Accepted: 02/16/2024] [Indexed: 03/06/2024] Open
Abstract
Understanding air quality requires a comprehensive understanding of its various factors. Most association rule techniques focus on high-frequency terms, ignoring the potential importance of low-frequency terms and wasting storage space. Therefore, a dynamic genetic association rule mining algorithm is proposed in this paper, which combines an improved dynamic genetic algorithm with an association rule mining algorithm to mine the importance of low-frequency terms. Firstly, in the chromosome coding phase of the genetic algorithm, an innovative multi-information coding strategy is proposed, which selectively stores similar values of different levels in one storage unit. It avoids storing all the values at once and facilitates efficient mining of valid rules later. Secondly, by weighting evaluation indicators such as support, confidence, and lift in association rule mining, a new evaluation index is formed, avoiding the need to set a minimum threshold for high-interest rules. Finally, to improve the mining performance of the rules, the crossover rate and mutation rate are set dynamically to improve the search efficiency of the algorithm. In the experimental stage, this paper adopts the 2016 annual air quality data set of Beijing to verify three aspects: the effectiveness of the unit point multi-information coding strategy in reducing rule storage space, the effectiveness of mining the rules formed by low-frequency item sets, and the effectiveness of combining the rule mining algorithm with the swarm intelligence optimization algorithm in terms of search time and convergence. The unit point multi-information coding strategy reduced rule storage space consumption by 50%; the new evaluation index mined more interesting rules, with interest levels of up to 90%, including rules formed by lower-frequency terms; and search time was reduced by about 20% compared with some meta-heuristic algorithms, while convergence improved.
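As a rough illustration of the evaluation indicators named in this abstract, the sketch below computes support, confidence, and lift for a single rule and folds them into one weighted score. The equal weights, the function names, and the toy records are assumptions made for illustration; the paper's actual index, weights, and data are not given here. Note that lift is unbounded above, so a real index would likely rescale it before weighting.

```python
from typing import Dict, FrozenSet, List, Tuple

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], itemset: Transaction) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rule_metrics(
    transactions: List[Transaction],
    antecedent: Transaction,
    consequent: Transaction,
) -> Dict[str, float]:
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    s_ac = support(transactions, antecedent | consequent)
    confidence = s_ac / s_a if s_a > 0 else 0.0
    lift = confidence / s_c if s_c > 0 else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


def weighted_index(
    metrics: Dict[str, float],
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """Hypothetical combined score: a plain weighted sum of the three metrics."""
    w_s, w_c, w_l = weights
    return (
        w_s * metrics["support"]
        + w_c * metrics["confidence"]
        + w_l * metrics["lift"]
    )


# Hypothetical discretised daily records (not data from the paper).
records = [
    frozenset({"high_pm25", "low_wind"}),
    frozenset({"high_pm25", "low_wind", "high_no2"}),
    frozenset({"low_wind"}),
    frozenset({"high_no2"}),
]

metrics = rule_metrics(records, frozenset({"low_wind"}), frozenset({"high_pm25"}))
print(metrics, weighted_index(metrics))
```

Ranking rules by a combined score of this kind is one way to avoid fixing a minimum support threshold, which is what lets low-frequency rules survive.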
Affiliation(s)
- Xiaoxuan Wu
  - School of Artificial Intelligence and Big Data, Hefei University, Hefei, Anhui, China
  - Key Laboratory of Intelligent Building and Building Energy Efficiency, Anhui Jianzhu University, Hefei, Anhui, China
- Qiang Wen
  - School of Artificial Intelligence and Big Data, Hefei University, Hefei, Anhui, China
- Jun Zhu
  - School of Artificial Intelligence and Big Data, Hefei University, Hefei, Anhui, China
|
32
|
Grotenhuis Z, Mosteiro PJ, Leeuwenberg AM. Modest performance of text mining to extract health outcomes may be almost sufficient for high-quality prognostic model development. Comput Biol Med 2024; 170:108014. [PMID: 38301515 DOI: 10.1016/j.compbiomed.2024.108014] [Received: 07/26/2023] [Revised: 01/03/2024] [Accepted: 01/19/2024] [Indexed: 02/03/2024]
Abstract
BACKGROUND Across medicine, prognostic models are used to estimate patient risk of certain future health outcomes (e.g., cardiovascular or mortality risk). To develop (or train) prognostic models, historic patient-level training data is needed containing both the predictive factors (i.e., features) and the relevant health outcomes (i.e., labels). Sometimes, when the health outcomes are not recorded in structured data, these are first extracted from textual notes using text mining techniques. Because there exist many studies utilizing text mining to obtain outcome data for prognostic model development, our aim is to study the impact of the text mining quality on downstream prognostic model performance. METHODS We conducted a simulation study charting the relationship between text mining quality and prognostic model performance using an illustrative case study about in-hospital mortality prediction in intensive care unit patients. We repeatedly developed and evaluated a prognostic model for in-hospital mortality, using outcome data extracted by multiple text mining models of varying quality. RESULTS Interestingly, we found in our case study that a relatively low-quality text mining model (F1 score ≈ 0.50) could already be used to train a prognostic model with quite good discrimination (area under the receiver operating characteristic curve of around 0.80). The calibration of the risks estimated by the prognostic model seemed unreliable across the majority of settings, even when text mining models were of relatively high quality (F1 ≈ 0.80). DISCUSSION Developing prognostic models on text-extracted outcomes using imperfect text mining models seems promising. However, it is likely that prognostic models developed using this approach may not produce well-calibrated risk estimates, and require recalibration in (possibly a smaller amount of) manually extracted outcome data.
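The two quantities this abstract compares, the F1 score of the text mining step and the area under the ROC curve of the downstream prognostic model, can be computed as in the minimal sketch below. The function names and the toy labels and scores are illustrative assumptions, not material from the study.

```python
from typing import List


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """F1 for binary labels (1 = outcome present), e.g. text-extracted vs. manual."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true: List[int], scores: List[float]) -> float:
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in sample size, so this is
    only suitable for small illustrative inputs.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: manual labels vs. labels from a weak text mining model.
manual = [1, 1, 1, 1, 0, 0, 0, 0]
extracted = [1, 1, 0, 0, 1, 1, 0, 0]
print(f1_score(manual, extracted))

# Hypothetical predicted risks from a prognostic model, scored against manual labels.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

Discrimination (AUROC) depends only on how patients are ranked, which is one plausible reason it can tolerate noisy training labels better than calibration does.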
Affiliation(s)
- Zwierd Grotenhuis
  - Department of Information and Computing Sciences, Utrecht University, The Netherlands
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, The Netherlands
- Pablo J Mosteiro
  - Department of Information and Computing Sciences, Utrecht University, The Netherlands
- Artuur M Leeuwenberg
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, The Netherlands
|
33
|
Kung JY, Ly K, Shiri A. Text mining applications to support health library practice: A case study on marijuana legalization Twitter analytics. Health Info Libr J 2024; 41:53-63. [PMID: 36598110 DOI: 10.1111/hir.12473] [Received: 11/03/2021] [Revised: 11/29/2022] [Accepted: 12/14/2022] [Indexed: 01/05/2023]
Abstract
BACKGROUND Twitter is rich in data for text and data analytics research, with the ability to capture trends. OBJECTIVES This study examines Canadian tweets on marijuana legalization and the terminology used. Presented as a case study, Twitter analytics demonstrates the varied ways in which this kind of research method may be used to inform library practice. METHODS The Twitter API was used to extract a subset of tweets using seven relevant hashtags. Using open-source programming tools, tweets sampled from September to November 2018 were analysed to identify themes, frequently used terms, sentiment, and co-occurring hashtags. RESULTS More than 1,176,000 tweets were collected. The most popular hashtag co-occurrence (two hashtags appearing together) was #cannabis and #CdnPoli. There was high variance in the sentiment analysis of all collected tweets, but most scores indicated neutral sentiment. DISCUSSION The case study presents text-mining applications that can help inform decisions in library practice through service analysis, quality analysis, and collection analysis. CONCLUSIONS Findings from sentiment analysis may reveal usage patterns among users. There are several ways in which libraries may use text mining to make evidence-informed decisions, such as examining all possible terminologies used by the public to help inform comprehensive evidence synthesis projects and build taxonomies for digital libraries and repositories.
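Hashtag co-occurrence, as described in the results above, amounts to counting unordered pairs of hashtags that appear in the same tweet. The sketch below shows one way to do that with the Python standard library; the regular expression, the lowercasing step, and the example tweets are assumptions for illustration and are not taken from the study's pipeline.

```python
import re
from collections import Counter
from itertools import combinations
from typing import Iterable, Tuple

# Simple hashtag pattern; real tweet parsing may need a stricter rule.
HASHTAG = re.compile(r"#(\w+)")


def hashtag_pairs(tweets: Iterable[str]) -> Counter:
    """Count unordered hashtag pairs that appear together in one tweet."""
    counts: Counter = Counter()
    for text in tweets:
        # Deduplicate and normalise case so #CdnPoli and #cdnpoli match.
        tags = sorted({tag.lower() for tag in HASHTAG.findall(text)})
        counts.update(combinations(tags, 2))
    return counts


# Hypothetical example tweets (not from the collected data set).
sample = [
    "Legal as of today #cannabis #CdnPoli",
    "#CdnPoli debate on #cannabis and #legalization",
    "No hashtags in this one",
]

pairs = hashtag_pairs(sample)
top_pair: Tuple[Tuple[str, str], int] = pairs.most_common(1)[0]
print(top_pair)
```

Sorting the tags before pairing means each pair has one canonical order, so (a, b) and (b, a) are never counted separately.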
Affiliation(s)
- Janice Y Kung
  - John W. Scott Health Sciences Library, University of Alberta, Edmonton, Canada
- Kynan Ly
  - Digital Humanities, University of Alberta, Edmonton, Canada
- Ali Shiri
  - School of Library and Information Studies, University of Alberta, Edmonton, Canada
|
34
|
Lawal O, Ochei LC. Lichen - air quality association rule mining for urban environments in the tropics. Int J Environ Health Res 2024; 34:1713-1724. [PMID: 37489590 DOI: 10.1080/09603123.2023.2239716] [Received: 03/19/2023] [Accepted: 07/18/2023] [Indexed: 07/26/2023]
Abstract
There are significant gaps in air quality monitoring across many low- and middle-income countries, which can be filled by bioindicators such as lichen. This study examined the links between lichen and air quality across urban environments in Nigeria. Lichen surveys and air quality monitoring were carried out across four major cities, focusing on NO2, SO2, PM2.5, and PM10. Association rule mining was used to identify robust rules defining the association between lichen and air quality categories. For the maximal frequent set with lichen in the antecedent, 9 and 5 rules were identified by Apriori and Eclat, respectively. These indicated that three genera (Diorygma, Pyxine, and Physcia) are the lichens most commonly associated with poor air quality, particularly for NO2 and SO2. Their consistent occurrence across the rules from different algorithms suggests that these lichens are viable indicators of long-term air quality.
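Of the two algorithms named above, Eclat is the one that works on a vertical layout: each item maps to the set of transaction ids containing it, and the support of a larger itemset is the size of the intersection of those sets. The sketch below is a minimal Eclat-style frequent itemset search under that idea; the site observations, item labels, and the absolute support threshold are hypothetical and do not come from the study.

```python
from typing import Dict, List, Set, Tuple


def eclat(transactions: List[Set[str]], min_support: int) -> Dict[Tuple[str, ...], int]:
    """Return frequent itemsets (as sorted tuples) with their absolute support."""
    # Vertical representation: item -> ids of the transactions that contain it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    frequent: Dict[Tuple[str, ...], int] = {}

    def extend(prefix: Tuple[str, ...], candidates: List[Tuple[str, Set[int]]]) -> None:
        for i, (item, tids) in enumerate(candidates):
            if len(tids) < min_support:
                continue
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Intersect with the tidsets of later items to grow the itemset.
            suffix = [(other, tids & other_tids) for other, other_tids in candidates[i + 1:]]
            extend(itemset, suffix)

    extend((), sorted(tidsets.items(), key=lambda kv: kv[0]))
    return frequent


# Hypothetical survey sites: lichen genera observed plus an air quality category.
sites = [
    {"Pyxine", "NO2_poor"},
    {"Pyxine", "NO2_poor", "SO2_poor"},
    {"Physcia", "SO2_poor"},
    {"Pyxine", "NO2_poor"},
]

for itemset, count in eclat(sites, min_support=2).items():
    print(itemset, count)
```

Rules such as "Pyxine implies poor NO2" would then be derived from these frequent itemsets by comparing the support of the pair with the support of the antecedent.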
Affiliation(s)
- Olanrewaju Lawal
  - Department of Geography and Environmental Management, University of Port Harcourt, Port Harcourt, Nigeria
- Laud Charles Ochei
  - Department of Computer Science, University of Port Harcourt, Port Harcourt, Nigeria
|
35
|
Xiong J, Liu X, Li Z, Xiao H, Wang G, Niu Z, Fei C, Zhong F, Wang G, Zhang W, Fu Z, Liu Z, Chen K, Jiang H, Zheng M. αExtractor: a system for automatic extraction of chemical information from biomedical literature. Sci China Life Sci 2024; 67:618-621. [PMID: 37758905 DOI: 10.1007/s11427-023-2388-x] [Received: 02/22/2023] [Accepted: 06/07/2023] [Indexed: 09/29/2023]
Affiliation(s)
- Jiacheng Xiong
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Xiaohong Liu
  - AI Department, Suzhou Alphama Biotechnology Co., Ltd., Suzhou, 215125, China
- Zhaojun Li
  - AI Department, Suzhou Alphama Biotechnology Co., Ltd., Suzhou, 215125, China
  - College of Computer and Information Engineering, Dezhou University, Dezhou, 253023, China
- Hongzhong Xiao
  - AI Department, Suzhou Alphama Biotechnology Co., Ltd., Suzhou, 215125, China
- Guangchao Wang
  - College of Computer and Information Engineering, Dezhou University, Dezhou, 253023, China
- Zhenjiang Niu
  - AI Department, Suzhou Alphama Biotechnology Co., Ltd., Suzhou, 215125, China
- Chaoyuan Fei
  - AI Department, Suzhou Alphama Biotechnology Co., Ltd., Suzhou, 215125, China
- Feisheng Zhong
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Gang Wang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Wei Zhang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Zunyun Fu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Zhiguo Liu
  - AI Department, Suzhou Alphama Biotechnology Co., Ltd., Suzhou, 215125, China
- Kaixian Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Hualiang Jiang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Mingyue Zheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, 201203, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
|
36
|
Wang L, Wang Y, Zhao Q. Data mining and analysis of the adverse events derived signals of 4 gadolinium-based contrast agents based on the US Food and drug administration adverse event reporting system. Expert Opin Drug Saf 2024; 23:339-352. [PMID: 37837355 DOI: 10.1080/14740338.2023.2271834] [Received: 08/12/2023] [Accepted: 10/13/2023] [Indexed: 10/16/2023]
Abstract
BACKGROUND To detect and analyze risk signals of drug-related adverse events (AEs) for 4 gadolinium-based contrast agents (GBCAs) (gadopentetate dimeglumine (Gd-DTPA), gadobenate dimeglumine (Gd-BOPTA), gadoteridol (Gd-HP-DO3A), and gadobutrol (Gd-BT-DO3A)) in the US Food and Drug Administration Adverse Event Reporting System (FAERS) database, in order to support clinical safety. RESEARCH DESIGN AND METHODS The AEs associated with the 4 GBCAs were collected from the FAERS database from 2004Q1 to 2022Q3. The risk signals were mined using the reporting odds ratio (ROR) and proportional reporting ratio (PRR). RESULTS A total of 424 risk signals were identified, of which 151 were associated with Gd-DTPA, 93 with Gd-BOPTA, 79 with Gd-HP-DO3A, and 101 with Gd-BT-DO3A. The AE signals involved 20 system organ classes (SOCs). Two of the top four SOCs were identical, namely 'skin and subcutaneous tissue disorders' and 'general disorders and administration site conditions.' CONCLUSIONS Safety signals of the 4 GBCAs were detected, and the SOCs associated with the AEs differed among the 4 GBCAs. In addition, some AEs identified in this study are not mentioned in the package inserts and need more attention and research to ensure clinical safety.
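Both disproportionality measures named in the methods are computed from a 2x2 table of report counts: a (target drug, target event), b (target drug, other events), c (other drugs, target event), and d (other drugs, other events). The sketch below uses the standard definitions ROR = (a/b)/(c/d) and PRR = [a/(a+b)]/[c/(c+d)], with a Wald-type 95% confidence interval for the ROR. The counts are hypothetical, and the signal thresholds the authors applied are not stated in the abstract, so none are encoded here.

```python
import math
from typing import Tuple


def reporting_odds_ratio(a: int, b: int, c: int, d: int) -> Tuple[float, float, float]:
    """ROR with a 95% confidence interval computed on the log scale."""
    ror = (a / b) / (c / d)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - 1.96 * se_log)
    upper = math.exp(math.log(ror) + 1.96 * se_log)
    return ror, lower, upper


def proportional_reporting_ratio(a: int, b: int, c: int, d: int) -> float:
    """PRR: share of the event among the drug's reports vs. among all other reports."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts for one drug-event pair (not FAERS data).
a, b, c, d = 20, 80, 100, 9800

ror, ror_lo, ror_hi = reporting_odds_ratio(a, b, c, d)
prr = proportional_reporting_ratio(a, b, c, d)
print(f"ROR = {ror:.2f} (95% CI {ror_lo:.2f} to {ror_hi:.2f}), PRR = {prr:.2f}")
```

Both functions assume all four cells are non-zero; sparse drug-event pairs would need a continuity correction or exclusion before this calculation.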
Affiliation(s)
- Lu Wang
  - Department of Pharmacy, Yantai Yuhuangding Hospital, Yantai, Shandong, P. R. China
- Yinglin Wang
  - Department of Pharmacy, Yantai Yuhuangding Hospital, Yantai, Shandong, P. R. China
- Quan Zhao
  - Department of Pharmacy, Yantai Yuhuangding Hospital, Yantai, Shandong, P. R. China
|
37
|
Hellali R, Chelly Dagdia Z, Ktaish A, Zeitouni K, Annane D. Corticosteroid sensitivity detection in sepsis patients using a personalized data mining approach: A clinical investigation. Comput Methods Programs Biomed 2024; 245:108017. [PMID: 38241801 DOI: 10.1016/j.cmpb.2024.108017] [Received: 04/12/2023] [Revised: 10/29/2023] [Accepted: 01/09/2024] [Indexed: 01/21/2024]
Abstract
BACKGROUND AND OBJECTIVE Sepsis is a life-threatening disease with high mortality, incidence, and morbidity. Corticosteroids (CS) are a recommended treatment for sepsis, but some patients respond negatively to CS therapy. Early prediction of corticosteroid responsiveness can help clinicians intervene and reduce mortality. In this study, we aim to develop a data mining methodology for predicting the CS responsiveness of septic patients. METHODS We used data from a randomized controlled trial called APROCCHSS, which recruited 1241 sepsis patients to study the effectiveness of corticotherapy. We conducted a thorough study of multiple machine learning models to select the most efficient prediction model, called the "signature". We evaluated the performance of the signature using precision, sensitivity, and specificity values. RESULTS We found that Logistic Regression was the best model, with an AUC of 72%. We conducted further experiments to examine the impact of additional features and the model's generalizability to different groups of patients. We also performed a statistical analysis of the effect of the treatment at the individual level and on the population as a whole. CONCLUSIONS Our data mining methodology can accurately predict cortico-sensitivity or resistance in sepsis patients. The signature has been deployed into the Assistance Publique - Hôpitaux de Paris (APHP) information system as a web service, taking patient information as input and providing a prediction of cortico-sensitivity or resistance. Early prediction of corticosteroid responsiveness can help clinicians intervene promptly and improve patient outcomes.
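The three evaluation measures listed in the methods (precision, sensitivity, specificity) all derive from the four cells of a binary confusion matrix. A minimal sketch follows, assuming "cortico-sensitive" is treated as the positive class; the counts are hypothetical and are not results from the APROCCHSS data.

```python
from typing import Dict


def confusion_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Precision, sensitivity (recall), and specificity from confusion counts."""
    return {
        # Of the patients predicted sensitive, how many truly were.
        "precision": tp / (tp + fp),
        # Of the truly sensitive patients, how many were detected.
        "sensitivity": tp / (tp + fn),
        # Of the truly resistant patients, how many were correctly ruled out.
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for illustration only.
scores = confusion_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Unlike the AUC reported in the results, these three values depend on the decision threshold chosen for the classifier.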
Affiliation(s)
- Rahma Hellali
  - Université Paris-Saclay, UVSQ, DAVID, Paris, France
- Zaineb Chelly Dagdia
  - Université Paris-Saclay, UVSQ, DAVID, Paris, France
  - Université de Tunis, Institut supérieur de gestion de Tunis, LARODEC, Tunis, Tunisia
- Djillali Annane
  - Réanimation medico-chirurgicale, hôpital Raymond-Poincaré, AP-HP, Garches, France
|
38
|
Mol MJ, Belfi B, Bakk Z. Unravelling the skills of data scientists: A text mining analysis of Dutch university master programs in data science and artificial intelligence. PLoS One 2024; 19:e0299327. [PMID: 38422040 PMCID: PMC10903789 DOI: 10.1371/journal.pone.0299327] [Received: 05/31/2023] [Accepted: 02/07/2024] [Indexed: 03/02/2024] Open
Abstract
The growing demand for data scientists in both the global and Dutch labour markets has led to an increase in data science and artificial intelligence (AI) master programs offered by universities. However, there is still a lack of clarity regarding the specific skills of data scientists. This study addresses this issue by employing Correlated Topic Modeling (CTM) to analyse the content of 41 master programs offered by 11 Dutch universities and an interuniversity combined program. We assess the differences and similarities in the core skills taught by these programs, determine the subject-specific and general nature of the skills, and provide a comparison between the different types of universities offering these programs. Our analysis reveals that data processing, statistics, research, and ethics are the core competencies in Dutch data science and AI master programs. General universities tend to focus on research skills, while technical universities lean more towards IT and electronics skills. Broad-focussed data science and AI programs generally concentrate on data processing, information technology, electronics, and research, while subject-specific programs give priority to statistics and ethics. This research enhances the understanding of the diverse skills of Dutch data science graduates, providing valuable insights for employers, academic institutions, and prospective students.
Affiliation(s)
- Mathijs J. Mol
  - Gastroenterology and Hepatology (AMC), Amsterdam University Medical Center, Amsterdam, North Holland, The Netherlands
- Barbara Belfi
  - ROA, Maastricht University, Maastricht, Limburg, The Netherlands
- Zsuzsa Bakk
  - The unit of Methodology and Statistics, Institute of Psychology, Leiden University, Leiden, South Holland, The Netherlands
|
39
|
Yu Y, Hu G, Yang X, Yin Y, Tong K, Yu R. A strategic study of acupuncture for diabetic kidney disease based on meta-analysis and data mining. Front Endocrinol (Lausanne) 2024; 15:1273265. [PMID: 38469137 PMCID: PMC10925656 DOI: 10.3389/fendo.2024.1273265] [Received: 08/05/2023] [Accepted: 01/22/2024] [Indexed: 03/13/2024] Open
Abstract
Objective The specific benefit and selection of acupoints in acupuncture for diabetic kidney disease (DKD) remain controversial. This study aims to explore the specific benefits and acupoint selection of acupuncture for DKD through meta-analysis and data mining. Methods Clinical trials of acupuncture for DKD were searched in eight common databases. Meta-analysis was used to evaluate its efficacy and safety, and data mining was used to explore its acupoint selection. Results Meta-analysis showed that, compared with the conventional drug group, the combined acupuncture group significantly increased the clinical effective rate (risk ratio [RR] 1.35, 95% confidence interval [CI] 1.20 to 1.51, P < 0.00001) and high-density lipoprotein cholesterol (mean difference [MD] 0.36, 95% CI 0.27 to 0.46, P < 0.00001), and significantly reduced urinary albumin (MD -0.39, 95% CI -0.42 to -0.36, P < 0.00001), urinary microalbumin (MD -32.63, 95% CI -42.47 to -22.79, P < 0.00001), urine β2-microglobulin (MD -0.45, 95% CI -0.66 to -0.24, P < 0.0001), serum creatinine (MD -15.36, 95% CI -21.69 to -9.03, P < 0.00001), glycated hemoglobin A1c (MD -0.69, 95% CI -1.18 to -0.19, P = 0.006), fasting blood glucose (MD -0.86, 95% CI -0.90 to -0.82, P < 0.00001), 2h postprandial plasma glucose (MD -0.87, 95% CI -0.92 to -0.82, P < 0.00001), total cholesterol (MD -1.23, 95% CI -2.05 to -0.40, P = 0.003), and triglyceride (MD -0.69, 95% CI -1.23 to -0.15, P = 0.01), while adverse events were comparable. Data mining revealed that CV12, SP8, SP10, ST36, SP6, BL20, BL23, and SP9 were the core acupoints for DKD treated by acupuncture. Conclusion Acupuncture improved clinical symptoms, renal function indices such as uALB, umALB, uβ2-MG, and SCR, as well as blood glucose and blood lipids in patients with DKD, and has a favorable safety profile. CV12, SP8, SP10, ST36, SP6, BL20, BL23, and SP9 are the core acupoints for acupuncture in DKD, and this program is expected to become a supplementary treatment for DKD.
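The pooled risk ratio and confidence interval reported in this abstract are the kind of output an inverse-variance meta-analysis produces. The sketch below pools log risk ratios with a fixed-effect model; the choice of a fixed-effect model is an assumption (the abstract does not say which model or software was used), and the trial counts are hypothetical.

```python
import math
from typing import List, Tuple

# One trial: (events in treatment, n treatment, events in control, n control).
Trial = Tuple[int, int, int, int]


def log_risk_ratio(e1: int, n1: int, e0: int, n0: int) -> Tuple[float, float]:
    """Log risk ratio and its standard error for a single two-arm trial."""
    rr = (e1 / n1) / (e0 / n0)
    se = math.sqrt(1 / e1 - 1 / n1 + 1 / e0 - 1 / n0)
    return math.log(rr), se


def pooled_risk_ratio(trials: List[Trial]) -> Tuple[float, float, float]:
    """Fixed-effect inverse-variance pooled RR with a 95% confidence interval."""
    weighted_sum = 0.0
    weight_total = 0.0
    for e1, n1, e0, n0 in trials:
        log_rr, se = log_risk_ratio(e1, n1, e0, n0)
        weight = 1 / se**2  # more precise trials get more weight
        weighted_sum += weight * log_rr
        weight_total += weight
    pooled = weighted_sum / weight_total
    pooled_se = math.sqrt(1 / weight_total)
    return (
        math.exp(pooled),
        math.exp(pooled - 1.96 * pooled_se),
        math.exp(pooled + 1.96 * pooled_se),
    )


# Hypothetical trials reporting "clinically effective" counts in each arm.
trials = [(45, 50, 30, 50), (45, 50, 30, 50)]
rr, lower, upper = pooled_risk_ratio(trials)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

A random-effects model would add a between-trial variance term to each weight, which typically widens the interval when trials disagree.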
Affiliation(s)
- Yunfeng Yu
  - Department of Endocrinology, The First Hospital of Hunan University of Chinese Medicine, Changsha, Hunan, China
- Gang Hu
  - Department of Endocrinology, The First Hospital of Hunan University of Chinese Medicine, Changsha, Hunan, China
- Xinyu Yang
  - College of Chinese Medicine, Hunan University of Chinese Medicine, Changsha, Hunan, China
- Yuman Yin
  - College of Chinese Medicine, Hunan University of Chinese Medicine, Changsha, Hunan, China
- Keke Tong
  - Department of Gastroenterology, The Hospital of Hunan University of Traditional Chinese Medicine, Changde, Hunan, China
- Rong Yu
  - Department of Endocrinology, The First Hospital of Hunan University of Chinese Medicine, Changsha, Hunan, China
|
40
|
He X, Zhang H, Huang J, Zhao D, Li Y, Nie R, Liu X. [Research on fault diagnosis of patient monitor based on text mining]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi 2024; 41:168-176. [PMID: 38403618 PMCID: PMC10894744 DOI: 10.7507/1001-5515.202306017] [Indexed: 02/27/2024]
Abstract
The conventional fault diagnosis of patient monitors relies heavily on manual experience, resulting in low diagnostic efficiency and ineffective utilization of fault maintenance text data. To address these issues, this paper proposes an intelligent fault diagnosis method for patient monitors based on multi-feature text representation, an improved bidirectional gated recurrent unit (BiGRU), and an attention mechanism. Firstly, the fault text data were preprocessed, and word vectors containing multiple linguistic features were generated by a linguistically-motivated bidirectional encoder representation from Transformer. Then, the bidirectional fault features were extracted by the improved BiGRU and weighted by the attention mechanism. Finally, a weighted loss function was used to reduce the impact of class imbalance on the model. To validate the effectiveness of the proposed method, this paper uses a patient monitor fault dataset for verification, on which the macro F1 value reached 91.11%. The results show that the model built in this study can automatically classify fault text and may provide decision support for the intelligent fault diagnosis of patient monitors in the future.
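A weighted loss of the kind mentioned above usually scales each sample's cross-entropy term by a weight for its class, so that rare fault categories contribute more to training. The sketch below uses inverse-frequency class weights, which is one common choice and an assumption here, since the abstract does not state the weighting scheme. Labels and probabilities are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def inverse_frequency_weights(labels: List[int]) -> Dict[int, float]:
    """Class weight n / (k * count_c): rarer classes receive larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def weighted_cross_entropy(
    probs: List[List[float]],
    labels: List[int],
    weights: Dict[int, float],
) -> float:
    """Weighted mean of -log p(true class), normalised by the sum of weights."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Hypothetical fault labels: class 0 is common, class 1 is rare.
labels = [0, 0, 0, 1]
weights = inverse_frequency_weights(labels)

# Hypothetical predicted class probabilities for the four samples.
probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.4, 0.6]]
print(weights, weighted_cross_entropy(probs, labels, weights))
```

With these weights, a single misclassified rare-class sample costs as much as three misclassified common-class samples, which is the intended counterweight to imbalance.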
Affiliation(s)
- Xiangfei He
  - School of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Hehua Zhang
  - Department of Medical Engineering, Daping Hospital of Army Medical University, Chongqing 400042, P. R. China
  - School of Biological Information, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Jing Huang
  - Department of Medical Engineering, Daping Hospital of Army Medical University, Chongqing 400042, P. R. China
- Dechun Zhao
  - School of Biological Information, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Yang Li
  - Department of Medical Engineering, Daping Hospital of Army Medical University, Chongqing 400042, P. R. China
- Rui Nie
  - Department of Medical Engineering, Daping Hospital of Army Medical University, Chongqing 400042, P. R. China
- Xianghua Liu
  - Department of Medical Engineering, Daping Hospital of Army Medical University, Chongqing 400042, P. R. China
|
41
|
He YJ, Fan YS, Miao FR, Zhao XY, Zhang FZ, He C, Zhang H. Acupoint selection rules of acupuncture and moxibustion in treating neurogenic bladder based on data mining. Zhen Ci Yan Jiu 2024; 49:198-207. [PMID: 38413042 DOI: 10.13702/j.1000-0607.20230018] [Indexed: 02/29/2024]
Abstract
OBJECTIVES To explore the rules of acupoint selection in the treatment of neurogenic bladder (NB) with acupuncture and moxibustion by using data mining. METHODS Clinical research literature on acupuncture treatment of NB was collected from PubMed, Embase, Cochrane Library, CNKI, Wanfang Database, VIP Database, and China Biology Medicine, from database inception to January 1, 2023. The acupoint prescription database was established using Excel 2019. SPSS Modeler 18.0 and SPSS Statistics 26.0 were used to analyze the frequency, meridians, locations, and specific acupoints, and to perform association rule analysis, factor analysis, cluster analysis, etc., in order to explore the characteristics and rules of acupoint selection in acupuncture and moxibustion treatment of NB. RESULTS A total of 313 papers were included, involving 110 acupoints with a total frequency of 1,995. The high-frequency acupoints were Zhongji (CV3), Guanyuan (CV4), Sanyinjiao (SP6), etc. The commonly used meridians were the Bladder Meridian of Foot Taiyang and the Conception Vessel. The acupoints involved were mostly located in the lumbosacral region and abdomen, and intersection acupoints, mu-front acupoints, and back-shu acupoints made up the majority of the specific acupoints. The core acupoint group was analyzed, and 17 groups of association rules, 7 factors, and 6 effective cluster groups were obtained. CONCLUSIONS Acupuncture and moxibustion treatment of NB follows the therapeutic principles of tonifying the kidney, invigorating the spleen, and soothing the liver. The core acupoint group is CV3-CV4-SP6.
Affiliation(s)
- Yu-Jun He
  - Faculty of Acupuncture, Moxibustion and Tuina of Guangxi University of Chinese Medicine, Nanning 530001, China
- Yu-Shan Fan
  - Faculty of Acupuncture, Moxibustion and Tuina of Guangxi University of Chinese Medicine, Nanning 530001, China
- Fu-Rui Miao
  - Faculty of Acupuncture, Moxibustion and Tuina of Guangxi University of Chinese Medicine, Nanning 530001, China
- Xin-Yi Zhao
  - Zhuang Medical College of Guangxi University of Traditional Chinese Medicine, Nanning 530001
- Fang-Zhi Zhang
  - Faculty of Acupuncture, Moxibustion and Tuina of Guangxi University of Chinese Medicine, Nanning 530001, China
- Cai He
  - Faculty of Acupuncture, Moxibustion and Tuina of Guangxi University of Chinese Medicine, Nanning 530001, China
- Hui Zhang
  - Faculty of Acupuncture, Moxibustion and Tuina of Guangxi University of Chinese Medicine, Nanning 530001, China
|
42
|
Mughaz D, HaCohen-Kerner Y, Gabbay D. Extraction of time-related expressions using text mining with application to Hebrew. PLoS One 2024; 19:e0293196. [PMID: 38394097 PMCID: PMC10889890 DOI: 10.1371/journal.pone.0293196] [Received: 06/14/2022] [Accepted: 10/08/2023] [Indexed: 02/25/2024] Open
Abstract
In this research, we extract time-related expressions from a rabbinic text in a semi-automatic manner. These expressions usually appear next to rabbinic references (name / nickname / acronym / book-name). The first step toward our goal is to find all the expressions near references in the corpus. However, not all of the phrases around the references are time-related expressions, so these phrases are initially treated as potential time-related expressions. To extract the time-related expressions, we formulate two new statistical functions, and we use screening and heuristic methods. We tested these statistical functions, grammatical screenings, and heuristic methods on a corpus containing responsa documents. In this corpus, many rabbinic citations are known and marked. The statistical functions and the screening methods filtered the potential time-related expressions and reduced the number of initial expressions by 99.88% (from 484,681 to 575).
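The reduction figure quoted at the end of this abstract can be checked directly from the two counts it gives. The helper name below is an illustrative assumption; the numbers are the ones stated in the abstract.

```python
def reduction_rate(initial: int, remaining: int) -> float:
    """Fraction of candidates removed by filtering."""
    return 1 - remaining / initial


# Counts stated in the abstract: 484,681 candidate expressions, 575 retained.
rate = reduction_rate(484_681, 575)
print(f"{rate:.2%}")
```

Put the other way, roughly 1 in 843 candidate phrases survives the statistical and grammatical screening.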
Collapse
Affiliation(s)
- Dror Mughaz
- Dept. of Computer Science, Jerusalem College of Technology–Lev Academic Center, Jerusalem, Israel
- Dept. of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
| | - Yaakov HaCohen-Kerner
- Dept. of Computer Science, Jerusalem College of Technology–Lev Academic Center, Jerusalem, Israel
| | - Dov Gabbay
- Dept. of Computer Science, Bar-Ilan University, Ramat-Gan, Israel
- Dept. of Informatics, King's College London, Strand, London, United Kingdom
|
43
|
Qiao H, Chen Y, Qian C, Guo Y. Clinical data mining: challenges, opportunities, and recommendations for translational applications. J Transl Med 2024; 22:185. [PMID: 38378565 PMCID: PMC10880222 DOI: 10.1186/s12967-024-05005-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 02/18/2024] [Indexed: 02/22/2024] Open
Abstract
Clinical data mining of predictive models offers significant advantages for re-evaluating and leveraging large amounts of complex clinical real-world data and experimental comparison data for tasks such as risk stratification, diagnosis, classification, and survival prediction. However, its translational application is still limited. One challenge is that the proposed clinical requirements and data mining are not synchronized. Additionally, the exotic predictions of data mining are difficult to apply directly in local medical institutions. Hence, it is necessary to incisively review the translational application of clinical data mining, providing an analytical workflow for developing and validating prediction models to ensure the scientific validity of analytic workflows in response to clinical questions. This review systematically revisits the purpose, process, and principles of clinical data mining and discusses the key causes contributing to the detachment from practice and the misuse of model verification in developing predictive models for research. Based on this, we propose a niche-targeting framework of four principles: Clinical Contextual, Subgroup-Oriented, Confounder- and False Positive-Controlled (CSCF), to provide guidance for clinical data mining prior to the model's development in clinical settings. Eventually, it is hoped that this review can help guide future research and develop personalized predictive models to achieve the goal of discovering subgroups with varied remedial benefits or risks and ensuring that precision medicine can deliver its full potential.
Affiliation(s)
- Huimin Qiao
- Medical Big Data and Bioinformatics Research Centre, First Affiliated Hospital of Gannan Medical University, Ganzhou, China
| | - Yijing Chen
- School of Public Health and Health Management, Gannan Medical University, Ganzhou, China
| | - Changshun Qian
- School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, China
| | - You Guo
- Medical Big Data and Bioinformatics Research Centre, First Affiliated Hospital of Gannan Medical University, Ganzhou, China.
- School of Public Health and Health Management, Gannan Medical University, Ganzhou, China.
- School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, China.
- Ganzhou Key Laboratory of Medical Big Data, Ganzhou, China.
|
44
|
Chandrasekaran R, Konaraddi K, Sharma SS, Moustakas E. Text-Mining and Video Analytics of COVID-19 Narratives Shared by Patients on YouTube. J Med Syst 2024; 48:21. [PMID: 38358554 DOI: 10.1007/s10916-024-02047-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Accepted: 02/13/2024] [Indexed: 02/16/2024]
Abstract
This study explores how individuals who have experienced COVID-19 share their stories on YouTube, focusing on the nature of information disclosure, public engagement, and emotional impact pertaining to consumer health. Using a dataset of 186 YouTube videos, we used text mining and video analytics techniques to analyze textual transcripts and visual frames to identify themes, emotions, and their relationship with viewer engagement metrics. Findings reveal eight key themes: infection origins, symptoms, treatment, mental well-being, isolation, prevention, government directives, and vaccination. While viewers engaged most with videos about infection origins, treatment, and vaccination, fear and sadness in the text consistently drove views, likes, and comments. Visuals primarily conveyed happiness and sadness, but their influence on engagement varied. This research highlights the crucial role YouTube plays in disseminating COVID-19 patient narratives and suggests its potential for improving health communication strategies. By understanding how emotions and content influence viewer engagement, healthcare professionals and public health officials can tailor their messaging to better connect with the public and address pandemic-related anxieties.
Affiliation(s)
| | - Karthik Konaraddi
- Department of Information & Decision Sciences, University of Illinois at Chicago, Chicago, IL, USA
| | - Sakshi S Sharma
- Department of Information & Decision Sciences, University of Illinois at Chicago, Chicago, IL, USA
|
45
|
Hong S, Wang T, Fu X, Li G. Research on quantitative evaluation of digital economy policy in China based on the PMC index model. PLoS One 2024; 19:e0298312. [PMID: 38359065 PMCID: PMC10868804 DOI: 10.1371/journal.pone.0298312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Accepted: 01/21/2024] [Indexed: 02/17/2024] Open
Abstract
The development of the digital economy is a strategic choice to grasp the revolution of new science and technology and the new opportunities of industrial reform. It depends on good support from the policy and theoretical system. Therefore, the quantitative evaluation of policy texts provides a basis for decision-making and suggestions for path optimization in the formulation and improvement of China's digital economy policy. By selecting the texts of digital economy policies issued by the Chinese government, the paper constructs a quantitative evaluation model of digital economy policy using the methods of content analysis and text mining. The empirical research results show that the overall design of the selected policies is relatively reasonable. Six policies were evaluated as excellent and two as acceptable. In view of problems such as the lack of predictive policy in the policy type, the lack of specific policy in the policy timeliness, the imbalance in the use of policy guarantees, and the lack of comprehensive coverage in the policy objectives, the paper puts forward corresponding countermeasures and suggestions.
Affiliation(s)
- Shuai Hong
- Institute of Economic Research, Hebei University of Economics and Business, Shijiazhuang, China
- Hebei Coordinated Innovation Center for BTH Coordinated Development, Hebei University of Economics and Business, Shijiazhuang, China
| | - Tianzun Wang
- Institute of Economic Research, Hebei University of Economics and Business, Shijiazhuang, China
- College of Economics and Management, Beijing University of Technology, Beijing, China
| | - Xiaoyi Fu
- Institute of Economic Research, Hebei University of Economics and Business, Shijiazhuang, China
| | - Guo Li
- Institute of Economic Research, Hebei University of Economics and Business, Shijiazhuang, China
|
46
|
Valdez D, Mena-Meléndez L, Crawford BL, Jozkowski KN. Analyzing Reddit Forums Specific to Abortion That Yield Diverse Dialogues Pertaining to Medical Information Seeking and Personal Worldviews: Data Mining and Natural Language Processing Comparative Study. J Med Internet Res 2024; 26:e47408. [PMID: 38354044 PMCID: PMC10902765 DOI: 10.2196/47408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2023] [Revised: 09/27/2023] [Accepted: 12/20/2023] [Indexed: 02/16/2024] Open
Abstract
BACKGROUND Attitudes toward abortion have historically been characterized via dichotomized labels, yet research suggests that these labels do not appropriately encapsulate beliefs on abortion. Rather, contexts, circumstances, and lived experiences often shape views on abortion into more nuanced and complex perspectives. Qualitative data have also been shown to underpin belief systems regarding abortion. Social media, as a form of qualitative data, could reveal how attitudes toward abortion are communicated publicly in web-based spaces. Furthermore, in some cases, social media can also be leveraged to seek health information. OBJECTIVE This study applies natural language processing and social media mining to analyze Reddit (Reddit, Inc) forums specific to abortion, including r/Abortion (the largest subreddit about abortion) and r/AbortionDebate (a subreddit designed to discuss and debate worldviews on abortion). Our analytical pipeline intends to identify potential themes within the data and the affect from each post. METHODS We applied a neural network-based topic modeling pipeline (BERTopic) to uncover themes in the r/Abortion (n=2151) and r/AbortionDebate (n=2815) subreddits. After deriving the optimal number of topics per subreddit using an iterative coherence score calculation, we performed a sentiment analysis using the Valence Aware Dictionary and Sentiment Reasoner to assess positive, neutral, and negative affect and an emotion analysis using the Text2Emotion lexicon to identify potential emotionality per post. Differences in affect and emotion by subreddit were compared. RESULTS The iterative coherence score calculation revealed 10 topics for both r/Abortion (coherence=0.42) and r/AbortionDebate (coherence=0.35). 
Topics in the r/Abortion subreddit primarily centered on information sharing or offering a source of social support; in contrast, topics in the r/AbortionDebate subreddit centered on contextualizing shifting or evolving views on abortion across various ethical, moral, and legal domains. The average compound Valence Aware Dictionary and Sentiment Reasoner scores for the r/Abortion and r/AbortionDebate subreddits were 0.01 (SD 0.44) and -0.06 (SD 0.41), respectively. Emotionality scores were consistent across the r/Abortion and r/AbortionDebate subreddits; however, r/Abortion had a marginally higher average fear score of 0.36 (SD 0.39). CONCLUSIONS Our findings suggest that people posting on abortion forums on Reddit are willing to share their beliefs, which manifested in diverse ways, such as sharing abortion stories including how their worldview changed, which critiques the value of dichotomized abortion identity labels, and information seeking. Notably, the style of discourse varied significantly by subreddit. r/Abortion was principally leveraged as an information and outreach source; r/AbortionDebate largely centered on debating across various legal, ethical, and moral abortion domains. Collectively, our findings suggest that abortion remains an opaque yet politically charged issue for people and that social media can be leveraged to understand views and circumstances surrounding abortion.
Affiliation(s)
- Danny Valdez
- Department of Applied Health Science, Indiana University School of Public Health, Bloomington, IN, United States
| | - Lucrecia Mena-Meléndez
- Department of Applied Health Science, Indiana University School of Public Health, Bloomington, IN, United States
| | - Brandon L Crawford
- Department of Applied Health Science, Indiana University School of Public Health, Bloomington, IN, United States
| | - Kristen N Jozkowski
- Department of Applied Health Science, Indiana University School of Public Health, Bloomington, IN, United States
|
47
|
Li S, Wang J. Exploration of the methods and rules of syndrome/pattern differentiation and treatment of headache from the acupuncture-moxibustion prescriptions of ancient literature based on the data mining technology. Zhongguo Zhen Jiu 2024; 44:224-230. [PMID: 38373772 DOI: 10.13703/j.0255-2930.20230629-k0001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
The study aims to identify and explore the methods and rules of the syndrome/pattern differentiation and treatment of headache through collating acupuncture-moxibustion prescriptions recorded earliest in ancient literature. Using Excel 2016 software, the structural data table was prepared with "name of disease", "location of disease", "etiology and pathogenesis", "complicated symptoms", "sites for acupuncture and moxibustion" and "techniques of acupuncture and moxibustion" included. The normative approach was conducted on "name of disease", "etiology and pathogenesis", "complicated symptoms" and "nomenclature of acupoint". Using conventional literature statistical method, combined with Apriori algorithm of association rule, the implicit multi-dimensional correlation rules were explored among various elements of syndrome/pattern differentiation of headache and corresponding therapeutic methods. Based on the findings of the study, the regularity was distinct regarding the treatment at "distal acupoints along the affected meridian and the local acupoints at the affected area" after identifying the location of headache; the strong association was presented between "etiology and pathogenesis" and "acupoint selection", and between "etiology and pathogenesis" and "therapeutic methods", including 9 and 12 rules, respectively. Guanyuan (CV 4) selected in treatment of headache was associated with kidney deficiency, the combination of Zhongwan (CV 12) and Zusanli (ST 36) was with phlegm, Fengfu (GV 16), Fengchi (GB 20), Xinghui (GV 22) and Baihui (GV 20) was with wind, and Hegu (LI 4) was with cold. Moxibustion was dominant in treatment if headache was caused by pathogenic cold or related to deficiency syndrome; acupuncture was used specially for the case caused by phlegm, or interaction of wind and phlegm or wind and heat.
For heat syndrome, either acupuncture or moxibustion was applicable; in general, acupuncture was more commonly used in comparison with moxibustion for headache. There were 6 association rules regarding the acupoint selection and the techniques of acupuncture and moxibustion. Moxibustion was generally applied to Xinghui (GV 22), Shangxing (GV 23) and Baihui (GV 20); and acupuncture was to Fengfu (GV 16), Hegu (LI 4) and Zusanli (ST 36). There were few association rules between the complicated symptoms and acupoint selection. Among nearly 100 complications, there were only 3 feature associations. Zhongwan (CV 12) was selected for the case with poor appetite, Chengjiang (CV 24) was with neck stiffness, and Fengchi (GB 20) combined with Fenglong (ST 40) or Jiexi (ST 41) was used if vertigo was present. In ancient times, regarding the treatment of headache, acupuncture and moxibustion are delivered based on the three aspects, i.e. the location of illness, the etiology and pathogenesis, and the complicated symptoms. For acupoint selection, in line with the courses of affected meridians, the adjacent and distal acupoints are combined according to the location of headache. The acupoint prescription is composed in terms of the etiology and pathogenesis. The techniques of acupuncture and moxibustion are optimized in consideration of the sites where acupuncture and moxibustion are operated.
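The abstract above relies on association rules between elements such as etiology, acupoint, and technique. As a rough illustration of what support and confidence mean for such rules, here is a minimal Python sketch. It is not the authors' workflow: the records are invented toy data, the item labels (`cause:`, `point:`, `method:`) and the thresholds are assumptions, and the function only handles rules between single items (the first two levels of an Apriori-style search), not the full algorithm.

```python
from itertools import permutations

# Invented toy records for illustration only; they are NOT data from the study.
records = [
    {"cause:cold", "point:LI4", "method:moxibustion"},
    {"cause:cold", "point:LI4", "method:moxibustion"},
    {"cause:wind", "point:GB20", "method:acupuncture"},
    {"cause:wind", "point:GV16", "method:acupuncture"},
    {"cause:cold", "point:CV4", "method:moxibustion"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    return sum(1 for r in records if itemset <= r) / len(records)


def pair_rules(records, min_support=0.4, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) for single-item rules."""
    items = sorted(set().union(*records))
    # Apriori-style pruning: an infrequent item cannot be part of a frequent pair.
    frequent = [i for i in items if support({i}, records) >= min_support]
    rules = []
    for a, b in permutations(frequent, 2):
        s_ab = support({a, b}, records)
        if s_ab < min_support:
            continue
        confidence = s_ab / support({a}, records)
        if confidence >= min_confidence:
            rules.append((a, b, s_ab, confidence))
    return rules


rules = pair_rules(records)
for antecedent, consequent, s, c in rules:
    print(f"{antecedent} -> {consequent}  support={s:.2f}  confidence={c:.2f}")
```

In this toy data, `cause:cold -> method:moxibustion` passes both thresholds, while `cause:cold -> point:LI4` is dropped because only two of the three cold records use that point. A dedicated library would be the usual choice for itemsets larger than pairs.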
Affiliation(s)
- Suyun Li
- Institute of Acupuncture and Moxibustion, China Academy of Chinese Medical Sciences, Beijing 100700, China.
| | - Jianing Wang
- Institute of Acupuncture and Moxibustion, China Academy of Chinese Medical Sciences, Beijing 100700, China
|
48
|
Guérin J, Nahid A, Tassy L, Deloger M, Bocquet F, Thézenas S, Desandes E, Le Deley MC, Durando X, Jaffré A, Es-Saad I, Crochet H, Le Morvan M, Lion F, Raimbourg J, Khay O, Craynest F, Giro A, Laizet Y, Bertaut A, Joly F, Livartowski A, Heudel P. Consore: A Powerful Federated Data Mining Tool Driving a French Research Network to Accelerate Cancer Research. Int J Environ Res Public Health 2024; 21:189. [PMID: 38397680 PMCID: PMC10887639 DOI: 10.3390/ijerph21020189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 01/28/2024] [Accepted: 01/31/2024] [Indexed: 02/25/2024]
Abstract
BACKGROUND Real-world data (RWD) related to the health status and care of cancer patients reflect the ongoing medical practice, and their analysis yields essential real-world evidence. Advanced information technologies are vital for their collection, qualification, and reuse in research projects. METHODS UNICANCER, the French federation of comprehensive cancer centres, has innovated a unique research network: Consore. This potent federated tool enables the analysis of data from millions of cancer patients across eleven French hospitals. RESULTS Currently operational within eleven French cancer centres, Consore employs natural language processing to structure the therapeutic management data of approximately 1.3 million cancer patients. These data originate from their electronic medical records, encompassing about 65 million medical records. Thanks to the structured data, which are harmonized within a common data model, and its federated search tool, Consore can create patient cohorts based on patient or tumor characteristics, and treatment modalities. This ability to derive larger cohorts is particularly attractive when studying rare cancers. CONCLUSIONS Consore serves as a tremendous data mining instrument that propels French cancer centres into the big data era. With its federated technical architecture and unique shared data model, Consore facilitates compliance with regulations and acceleration of cancer research projects.
Affiliation(s)
| | - Amine Nahid
- Coexya, 69370 Saint-Didier-au-Mont-d’Or, France; (A.N.); (F.J.)
| | - Louis Tassy
- Institut Paoli-Calmettes, 13009 Marseille, France; (L.T.); (M.L.M.)
| | - Marc Deloger
- Gustave Roussy, 94805 Villejuif, France; (M.D.); (F.L.)
| | - François Bocquet
- Data Factory & Analytics Department, Institut de Cancérologie de l’Ouest, 44805 Nantes-Angers, France (J.R.)
| | - Simon Thézenas
- Institut Régional du Cancer de Montpellier, 34090 Montpellier, France
| | - Emmanuel Desandes
- Institut de Cancérologie de Lorraine, 54519 Nancy, France; (E.D.); (O.K.)
| | | | - Xavier Durando
- Centre Jean Perrin, 63011 Clermont Ferrand, France; (X.D.); (A.G.)
| | - Anne Jaffré
- Institut Bergonié, 33076 Bordeaux, France; (A.J.); (Y.L.)
| | - Ikram Es-Saad
- Centre Georges Francois Leclerc, 21000 Dijon, France; (I.E.-S.); (A.B.)
| | | | - Marie Le Morvan
- Institut Paoli-Calmettes, 13009 Marseille, France; (L.T.); (M.L.M.)
| | - François Lion
- Gustave Roussy, 94805 Villejuif, France; (M.D.); (F.L.)
| | - Judith Raimbourg
- Data Factory & Analytics Department, Institut de Cancérologie de l’Ouest, 44805 Nantes-Angers, France (J.R.)
| | - Oussama Khay
- Institut de Cancérologie de Lorraine, 54519 Nancy, France; (E.D.); (O.K.)
| | - Franck Craynest
- Centre Oscar Lambret, 59000 Lille, France; (M.-C.L.D.); (F.C.)
| | - Alexia Giro
- Centre Jean Perrin, 63011 Clermont Ferrand, France; (X.D.); (A.G.)
| | - Yec’han Laizet
- Institut Bergonié, 33076 Bordeaux, France; (A.J.); (Y.L.)
| | - Aurélie Bertaut
- Centre Georges Francois Leclerc, 21000 Dijon, France; (I.E.-S.); (A.B.)
| | - Frederik Joly
- Coexya, 69370 Saint-Didier-au-Mont-d’Or, France; (A.N.); (F.J.)
|
49
|
Fins IS, Davies H, Farrell S, Torres JR, Pinchbeck G, Radford AD, Noble P. Evaluating ChatGPT text mining of clinical records for companion animal obesity monitoring. Vet Rec 2024; 194:e3669. [PMID: 38058223 PMCID: PMC10952314 DOI: 10.1002/vetr.3669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 09/25/2023] [Accepted: 11/07/2023] [Indexed: 12/08/2023]
Abstract
BACKGROUND Veterinary clinical narratives remain a largely untapped resource for addressing complex diseases. Here we compare the ability of a large language model (ChatGPT) and a previously developed regular expression (RegexT) to identify overweight body condition scores (BCS) in veterinary narratives pertaining to companion animals. METHODS BCS values were extracted from 4415 anonymised clinical narratives using either RegexT or by appending the narrative to a prompt sent to ChatGPT, prompting the model to return the BCS information. Data were manually reviewed for comparison. RESULTS The precision of RegexT was higher (100%, 95% confidence interval [CI] 94.81%-100%) than that of ChatGPT (89.3%, 95% CI 82.75%-93.64%). However, the recall of ChatGPT (100%, 95% CI 96.18%-100%) was considerably higher than that of RegexT (72.6%, 95% CI 63.92%-79.94%). LIMITATIONS Prior anonymisation and subtle prompt engineering are needed to improve ChatGPT output. CONCLUSIONS Large language models create diverse opportunities and, while complex, present an intuitive interface to information. However, they require careful implementation to avoid unpredictable errors.
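The precision/recall trade-off reported above can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the pattern is a hypothetical stand-in and not the study's RegexT, the narratives and manual labels are invented, and the overweight cut-off of 6 on a 9-point scale is an assumed threshold.

```python
import re

# Hypothetical pattern for scores written like "BCS 7/9" or "bcs: 6 / 9".
# This is NOT the RegexT expression used in the study.
BCS_PATTERN = re.compile(r"\bBCS\s*:?\s*(\d)\s*/\s*9\b", re.IGNORECASE)


def extract_overweight_bcs(narrative, threshold=6):
    """Return the first matched BCS if it is >= threshold (assumed cut-off), else None."""
    match = BCS_PATTERN.search(narrative)
    if match is None:
        return None
    score = int(match.group(1))
    return score if score >= threshold else None


def precision_recall(predicted, actual):
    """Compute precision and recall from parallel lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented narratives with invented manual-review labels (True = overweight BCS recorded).
narratives = [
    "Weight check. BCS 7/9, advised diet.",
    "Vaccination booster. BCS: 5/9.",
    "Body condition score seven out of nine, owner concerned.",
    "Lame left hind. bcs 8 / 9.",
]
manual_labels = [True, False, True, True]

predicted = [extract_overweight_bcs(n) is not None for n in narratives]
precision, recall = precision_recall(predicted, manual_labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

The third narrative spells the score out in words, so the rigid pattern misses it: every match it returns is correct, but not every overweight record is found. That is the same qualitative pattern the abstract reports for a regular expression compared with a language model.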
Affiliation(s)
- Ivo S. Fins
- Small Animal Veterinary Surveillance Network, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
| | - Heather Davies
- Small Animal Veterinary Surveillance Network, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
| | - Sean Farrell
- Department of Computer Science, Durham University, Durham, UK
| | - Jose R. Torres
- Institute for Animal Health and Food Safety, University of Las Palmas de Gran Canaria, Las Palmas, Gran Canaria, Spain
| | - Gina Pinchbeck
- Small Animal Veterinary Surveillance Network, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
| | - Alan D. Radford
- Small Animal Veterinary Surveillance Network, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
| | - Peter-John Noble
- Small Animal Veterinary Surveillance Network, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
|
50
|
Kilicoglu H, Ensan F, McInnes B, Wang LL. Semantics-enabled biomedical literature analytics. J Biomed Inform 2024; 150:104588. [PMID: 38244957 DOI: 10.1016/j.jbi.2024.104588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 01/10/2024] [Indexed: 01/22/2024]
Affiliation(s)
- Halil Kilicoglu
- School of Information Sciences, University of Illinois Urbana Champaign, Champaign, IL, USA.
| | - Faezeh Ensan
- Department of Electrical, Computer, and Biomedical Engineering, Toronto Metropolitan University, Toronto, ON, Canada.
| | - Bridget McInnes
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.
| | - Lucy Lu Wang
- Information School, University of Washington, Seattle, WA, USA.
|