Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

31
(from Reference Citation Analysis)

Article PDFs (9)

Cited by > 0 (16)

Searched Name

Shoko Wakamiya

Kamba M, She WJ, Ferawati K, Wakamiya S, Aramaki E. Exploring the Impact of the COVID-19 Pandemic on Twitter in Japan: Qualitative Analysis of Disrupted Plans and Consequences. JMIR Infodemiology 2024;4:e49699. [PMID: 38557446 PMCID: PMC10986681 DOI: 10.2196/49699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 08/11/2023] [Accepted: 03/06/2024] [Indexed: 04/04/2024]

Abstract

BACKGROUND

Although COVID-19 is a pandemic, the impact of its spread extends beyond public health, influencing areas such as the economy, education, work style, and social relationships. Research studies that document public opinions and estimate the long-term potential impact after the pandemic can be of value to the field.

OBJECTIVE

This study aims to uncover and track concerns in Japan throughout the COVID-19 pandemic by analyzing Japanese individuals' self-disclosure of disruptions to their life plans on social media. This approach offers alternative evidence for identifying concerns that may require further attention for individuals living in Japan.

METHODS

We extracted 300,778 tweets using the query phrase Corona-no-sei ("due to COVID-19," "because of COVID-19," or "considering COVID-19"), enabling us to identify the activities and life plans disrupted by the pandemic. The correlation between the number of tweets and COVID-19 cases was analyzed, along with an examination of frequently co-occurring words.
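The co-occurrence step described above can be illustrated with a minimal Python sketch. It assumes the posts have already been split into tokens (the abstract does not say which tokenizer was used), and the function name `cooccurrence_counts` and the three toy posts are hypothetical, not data from the study.

```python
from collections import Counter


def cooccurrence_counts(tokenized_posts, query):
    """Count how many posts containing `query` also contain each other token."""
    counts = Counter()
    for tokens in tokenized_posts:
        if query in tokens:
            # Count each co-occurring token at most once per post.
            counts.update(set(tokens) - {query})
    return counts


# Toy, pre-tokenized posts (illustrative only, not study data).
posts = [
    ["corona-no-sei", "graduation ceremony", "cancel"],
    ["corona-no-sei", "school", "cancel"],
    ["weather", "school"],
]

top = cooccurrence_counts(posts, "corona-no-sei").most_common(1)
print(top)
```

Counting each token once per post keeps a single repetitive post from dominating the ranking; counting raw occurrences instead would be an equally defensible choice.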

RESULTS

The top 20 nouns, verbs, and noun plus verb pairs co-occurring with Corona-no-sei were extracted. The top 5 keywords were graduation ceremony, cancel, school, work, and event. The top 5 verbs were disappear, go, rest, can go, and end. Our findings indicate that education emerged as the top concern when the Japanese government announced the first state of emergency. We also observed a sudden surge in anxiety about material shortages such as toilet paper. As the pandemic persisted and more states of emergency were declared, we noticed a shift toward long-term concerns, including careers, social relationships, and education.

CONCLUSIONS

Our study incorporated machine learning techniques for disease monitoring through the use of tweet data, allowing the identification of underlying concerns (eg, disrupted education and work conditions) throughout the 3 stages of Japanese government emergency announcements. The comparison with COVID-19 case numbers provides valuable insights into the short- and long-term societal impacts, emphasizing the importance of considering citizens' perspectives in policy-making and supporting those affected by the pandemic, particularly in the context of Japanese government decision-making.

Yao LFL, Liew K, Wakamiya S, Aramaki E. Extracting Spatio-Temporal Trends in Medical Research Prioritization Through Natural Language Processing of Case Report Abstracts. Stud Health Technol Inform 2024;310:634-638. [PMID: 38269886 DOI: 10.3233/shti231042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2024]

Otsuka N, Kawanishi Y, Doi F, Takeda T, Okumura K, Yamauchi T, Yada S, Wakamiya S, Aramaki E, Makinodan M. Diagnosing psychiatric disorders from history of present illness using a large-scale linguistic model. Psychiatry Clin Neurosci 2023;77:597-604. [PMID: 37526294 DOI: 10.1111/pcn.13580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 07/22/2023] [Accepted: 07/27/2023] [Indexed: 08/02/2023]

Azuaje G, Liew K, Buening R, She WJ, Siriaraya P, Wakamiya S, Aramaki E. Exploring the use of AI text-to-image generation to downregulate negative emotions in an expressive writing application. R Soc Open Sci 2023;10:220238. [PMID: 36636309 PMCID: PMC9810434 DOI: 10.1098/rsos.220238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2022] [Accepted: 11/30/2022] [Indexed: 06/17/2023]

Yao LF, Ferawati K, Liew K, Wakamiya S, Aramaki E. The Disruption of the Cystic Fibrosis Community’s Experiences and Concerns during the COVID-19 Pandemic: Topic Modeling and Time Series Analysis of Reddit Comments (Preprint). J Med Internet Res 2022;25:e45249. [PMID: 37079359 PMCID: PMC10160941 DOI: 10.2196/45249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 03/15/2023] [Accepted: 03/16/2023] [Indexed: 03/18/2023]

Abstract

BACKGROUND

The COVID-19 pandemic disrupted the needs and concerns of the cystic fibrosis community. Patients with cystic fibrosis were particularly vulnerable during the pandemic due to overlapping symptoms in addition to the challenges patients with rare diseases face, such as the need for constant medical aid and limited information regarding their disease or treatments. Even before the pandemic, patients vocalized these concerns on social media platforms like Reddit and formed communities and networks to share insight and information. This data can be used as a quick and efficient source of information about the experiences and concerns of patients with cystic fibrosis in contrast to traditional survey- or clinical-based methods.

OBJECTIVE

This study applies topic modeling and time series analysis to identify the disruption caused by the COVID-19 pandemic and its impact on the cystic fibrosis community's experiences and concerns. This study illustrates the utility of social media data in gaining insight into the experiences and concerns of patients with rare diseases.

METHODS

We collected comments from the subreddit r/CysticFibrosis to represent the experiences and concerns of the cystic fibrosis community. The comments were preprocessed before being used to train the BERTopic model to assign each comment to a topic. The number of comments and active users for each data set was aggregated monthly per topic and then fitted with an autoregressive integrated moving average (ARIMA) model to study the trends in activity. To verify the disruption in trends during the COVID-19 pandemic, we assigned a dummy variable in the model where a value of "1" was assigned to months in 2020 and "0" otherwise and tested for its statistical significance.
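The dummy variable described above can be sketched in Python as follows. This is only an illustration of the indicator itself, built over the collection window reported in the Results; the helper names `month_range` and `covid_dummy` are hypothetical, and the ARIMA fit is not reproduced here (the indicator would be supplied to whatever ARIMA implementation is used, for example as an exogenous regressor).

```python
from datetime import date


def month_range(start, end):
    """List the first day of every month from start to end, inclusive."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def covid_dummy(months):
    """Return 1 for months that fall in 2020 and 0 otherwise."""
    return [1 if m.year == 2020 else 0 for m in months]


# Monthly index spanning March 2011 through August 2022.
months = month_range(date(2011, 3, 1), date(2022, 8, 1))
dummy = covid_dummy(months)
print(len(months), sum(dummy))
```

Testing the coefficient on this indicator for statistical significance is what lets the analysis call 2020 a disruption for a given topic.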

RESULTS

A total of 120,738 comments from 5827 users were collected from March 24, 2011, until August 31, 2022. We found 22 topics representing the cystic fibrosis community's experiences and concerns. Our time series analysis showed that for 9 topics, the COVID-19 pandemic was a statistically significant event that disrupted the trends in user activity. Of the 9 topics, only 1 showed significantly increased activity during this period, while the other 8 showed decreased activity. This mixture of increased and decreased activity for these topics indicates a shift in attention or focus on discussion topics during this period.

CONCLUSIONS

There was a disruption in the experiences and concerns the cystic fibrosis community faced during the COVID-19 pandemic. By studying social media data, we were able to quickly and efficiently study the impact on the lived experiences and daily struggles of patients with cystic fibrosis. This study shows how social media data can be used as an alternative source of information to gain insight into the needs of patients with rare diseases and how external factors disrupt them.

Nishiyama T, Yada S, Wakamiya S, Hori S, Aramaki E. Transferability Based on Drug Structure Similarity in Automatic Classification of Noncompliant Drug Use on Social Media: Natural Language Processing Approach (Preprint). J Med Internet Res 2022;25:e44870. [PMID: 37133915 DOI: 10.2196/44870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Revised: 03/17/2023] [Accepted: 03/29/2023] [Indexed: 03/31/2023]

Abstract

BACKGROUND

Medication noncompliance is a critical issue because of the increased number of drugs sold on the web. Web-based drug distribution is difficult to control, causing problems such as drug noncompliance and abuse. The existing medication compliance surveys lack completeness because it is impossible to cover patients who do not go to the hospital or provide accurate information to their doctors, so a social media-based approach is being explored to collect information about drug use. Social media data, which includes information on drug usage by users, can be used to detect drug abuse and medication compliance in patients.

OBJECTIVE

This study aimed to assess how the structural similarity of drugs affects the efficiency of machine learning models for text classification of drug noncompliance.

METHODS

This study analyzed 22,022 tweets about 20 different drugs. The tweets were labeled as either noncompliant use or mention, noncompliant sales, general use, or general mention. The study compares 2 methods for training machine learning models for text classification: single-sub-corpus transfer learning, in which a model is trained on tweets about a single drug and then tested on tweets about other drugs, and multi-sub-corpus incremental learning, in which models are trained on tweets about drugs in order of their structural similarity. The performance of a machine learning model trained on a single subcorpus (a data set of tweets about a specific category of drugs) was compared to the performance of a model trained on multiple subcorpora (data sets of tweets about multiple categories of drugs).

RESULTS

The results showed that the performance of the model trained on a single subcorpus varied depending on the specific drug used for training. The Tanimoto similarity (a measure of the structural similarity between compounds) was weakly correlated with the classification results. The model trained by transfer learning a corpus of drugs with close structural similarity performed better than the model trained by randomly adding a subcorpus when the number of subcorpora was small.
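For readers unfamiliar with the Tanimoto similarity mentioned above, the following Python sketch computes it for two binary fingerprints: the number of shared features divided by the number of features present in either compound. Representing a fingerprint as a set of "on" bit indices is an assumption made for illustration; the abstract does not state which fingerprint type was used, and the two example fingerprints are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen here: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical fingerprints, not real compounds.
drug_a = {1, 4, 7, 9}
drug_b = {1, 4, 8}

score = tanimoto(drug_a, drug_b)
print(score)
```

The score ranges from 0 (no shared features) to 1 (identical fingerprints), which is what makes it usable for ordering drugs by structural closeness.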

CONCLUSIONS

The results suggest that structural similarity improves the classification performance of messages about unknown drugs if the drugs in the training corpus are few. On the other hand, this indicates that there is little need to consider the influence of the Tanimoto structural similarity if a sufficient variety of drugs are ensured.

Uehara M, Fujita S, Shimizu N, Liew K, Wakamiya S, Aramaki E. Measuring concerns about the COVID-19 vaccine among Japanese internet users through search queries. Sci Rep 2022;12:15037. [PMID: 36057657 PMCID: PMC9440921 DOI: 10.1038/s41598-022-18307-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2022] [Accepted: 08/09/2022] [Indexed: 11/09/2022]

Mutinda FW, Liew K, Yada S, Wakamiya S, Aramaki E. Automatic data extraction to support meta-analysis statistical analysis: a case study on breast cancer. BMC Med Inform Decis Mak 2022;22:158. [PMID: 35717167 PMCID: PMC9206132 DOI: 10.1186/s12911-022-01897-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/22/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022]

Abstract

Background

Meta-analyses aggregate results of different clinical studies to assess the effectiveness of a treatment. Despite their importance, meta-analyses are time-consuming and labor-intensive as they involve reading hundreds of research articles and extracting data. The number of research articles is increasing rapidly and most meta-analyses are outdated shortly after publication as new evidence has not been included. Automatic extraction of data from research articles can expedite the meta-analysis process and allow for automatic updates when new results become available. In this study, we propose a system for automatically extracting data from research abstracts and performing statistical analysis.

Materials and methods

Our corpus consists of 1011 PubMed abstracts of breast cancer randomized controlled trials annotated with the core elements of clinical trials: Participants, Intervention, Control, and Outcomes (PICO). We proposed a BERT-based named entity recognition (NER) model to identify PICO information from research abstracts. After extracting the PICO information, we parse numeric outcomes to identify the number of patients having certain outcomes for statistical analysis.

Results

The NER model extracted PICO elements with relatively high accuracy, achieving F1-scores greater than 0.80 in most entities. We assessed the performance of the proposed system by reproducing the results of an existing meta-analysis. The data extraction step achieved high accuracy; however, the statistical analysis step achieved low performance because abstracts sometimes lack all the required information.
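As a reminder of what the reported F1-scores summarize, the Python sketch below computes F1 as the harmonic mean of precision and recall from entity counts. The counts are hypothetical, and how predicted entities were matched to gold annotations (exact or partial span match) is not specified in the abstract.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one entity type, not results from the study.
score = f1_score(tp=80, fp=20, fn=20)
print(score)
```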

Conclusion

We proposed a system for automatically extracting data from research abstracts and performing statistical analysis. We evaluated the performance of the system by reproducing an existing meta-analysis and the system achieved a relatively good performance, though more substantiation is required.

Nakamura Y, Hanaoka S, Nomura Y, Hayashi N, Abe O, Yada S, Wakamiya S, Aramaki E. Clinical Comparable Corpus Describing the Same Subjects with Different Expressions. Stud Health Technol Inform 2022;290:253-257. [PMID: 35673012 DOI: 10.3233/shti220073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]

Mutinda FW, Yada S, Wakamiya S, Aramaki E. AUTOMETA: Automatic Meta-Analysis System Employing Natural Language Processing. Stud Health Technol Inform 2022;290:612-616. [PMID: 35673089 DOI: 10.3233/shti220150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]

Aramaki E, Wakamiya S, Yada S, Nakamura Y. Natural Language Processing: from Bedside to Everywhere. Yearb Med Inform 2022;31:243-253. [PMID: 35654422 DOI: 10.1055/s-0042-1742510] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 10/18/2022]

Ferawati K, Liew K, Aramaki E, Wakamiya S. Monitoring Mentions of COVID-19 Vaccine Side Effects from Japanese and Indonesian Twitter: Infodemiological Study (Preprint). JMIR Infodemiology 2022;2:e39504. [PMID: 36277140 PMCID: PMC9578292 DOI: 10.2196/39504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2022] [Revised: 07/29/2022] [Accepted: 09/19/2022] [Indexed: 11/13/2022]

Abstract

Background

The year 2021 was marked by vaccinations against COVID-19, which spurred wider discussion among the general population, with some in favor and some against vaccination. Twitter, a popular social media platform, was instrumental in providing information about the COVID-19 vaccine and has been effective in observing public reactions. We focused on tweets from Japan and Indonesia, 2 countries with a large Twitter-using population, where concerns about side effects were consistently stated as a strong reason for vaccine hesitancy.

Objective

This study aimed to investigate how Twitter was used to report vaccine-related side effects and to compare the mentions of these side effects from 2 messenger RNA (mRNA) vaccine types developed by Pfizer and Moderna, in Japan and Indonesia.

Methods

We obtained tweet data from Twitter using Japanese and Indonesian keywords related to COVID-19 vaccines and their side effects from January 1, 2021, to December 31, 2021. We then removed users with a high frequency of tweets and merged the tweets from multiple users as a single sentence to focus on user-level analysis, resulting in a total of 214,165 users (Japan) and 12,289 users (Indonesia). Then, we filtered the data to select tweets mentioning Pfizer or Moderna only and removed tweets mentioning both. We compared the side effect counts to the public reports released by Pfizer and Moderna. Afterward, logistic regression models were used to compare the side effects for the Pfizer and Moderna vaccines for each country.

Results

We observed some differences in the ratio of side effects between the public reports and tweets. Specifically, fever was mentioned much more frequently in tweets than would be expected based on the public reports. We also observed differences in side effects reported between Pfizer and Moderna vaccines from Japan and Indonesia, with more side effects reported for the Pfizer vaccine in Japanese tweets and more side effects with the Moderna vaccine reported in Indonesian tweets.

Conclusions

We note the possible consequences of vaccine side effect surveillance on Twitter and information dissemination, in that fever appears to be over-represented. This could be due to fever possibly having a higher severity or measurability, and further implications are discussed.

Wakamiya S, Morimoto O, Omichi K, Hara H, Kawase I, Koshiba R, Aramaki E. Exploring Relationships Between Tweet Numbers and Over-the-counter Drug Sales for Allergic Rhinitis: Retrospective Analysis. JMIR Form Res 2022;6:e33941. [PMID: 35107434 PMCID: PMC8851323 DOI: 10.2196/33941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2021] [Revised: 12/19/2021] [Accepted: 01/04/2022] [Indexed: 11/13/2022]

Abstract

BACKGROUND

Health-related social media data are increasingly being used in disease surveillance studies. In particular, surveillance of infectious diseases such as influenza has demonstrated high correlations between the number of social media posts mentioning the disease and the number of patients who went to the hospital and were diagnosed with the disease. However, the prevalence of some diseases, such as allergic rhinitis, cannot be estimated based on the number of patients alone. Specifically, individuals with allergic rhinitis typically self-medicate by taking over-the-counter (OTC) medications without going to the hospital. Although allergic rhinitis is not a life-threatening disease, it represents a major social problem because it reduces people's quality of life, making it essential to understand its prevalence and people's motives for self-medication behavior.

OBJECTIVE

This study aims to explore the relationship between the number of social media posts mentioning the main symptoms of allergic rhinitis and the sales volume of OTC rhinitis medications in Japan.

METHODS

We collected tweets over 4 years (from 2017 to 2020) that included keywords corresponding to the main nasal symptoms of allergic rhinitis: "sneezing," "runny nose," and "stuffy nose." We also obtained the sales volume of OTC drugs, including oral medications and nasal sprays, for the same period. We then calculated the Pearson correlation coefficient between time series data on the number of tweets per week and time series data on the sales volume of OTC drugs per week.
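The correlation step described above can be sketched in plain Python. This is a minimal illustration of the Pearson correlation coefficient between two equally long weekly series; the function name `pearson_r` and the four-week toy series are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations, and the two deviation norms.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (norm_x * norm_y)


# Toy weekly series (illustrative only).
weekly_tweets = [10, 20, 30, 40]
weekly_sales = [15, 25, 35, 45]

r = pearson_r(weekly_tweets, weekly_sales)
print(round(r, 4))
```

Both series must be aligned on the same weeks before the coefficient is computed; a shift of even one week can change r substantially for strongly seasonal data.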

RESULTS

The results showed a much higher correlation (r=0.8432) between the time series data on the number of tweets mentioning "stuffy nose" and the time series data on the sales volume of nasal sprays than for the other two symptoms. There was also a high correlation (r=0.9317) between the seasonal components of these time series data.

CONCLUSIONS

We investigated the relationships between social media data and behavioral patterns, such as OTC drug sales volume. Exploring these relationships can help us understand the prevalence of allergic rhinitis and the motives for self-care treatment using social media data, which would be useful as a marketing indicator to reduce the number of out-of-stocks in stores, provide (sell) rhinitis medicines to consumers in a stable manner, and reduce the loss of sales opportunities. In the future, in-depth investigations are required to estimate sales volume using social media data, and future research could investigate other diseases and countries.

Kamba M, Manabe M, Wakamiya S, Yada S, Aramaki E, Odani S, Miyashiro I. Medical Needs Extraction for Breast Cancer Patients from Question and Answer Services: Natural Language Processing-Based Approach. JMIR Cancer 2021;7:e32005. [PMID: 34709187 PMCID: PMC8587180 DOI: 10.2196/32005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/14/2021] [Revised: 09/25/2021] [Accepted: 10/04/2021] [Indexed: 11/13/2022]

Manabe M, Liew K, Yada S, Wakamiya S, Aramaki E. Estimation of Psychological Distress in Japanese Youth Through Narrative Writing: Text-Based Stylometric and Sentiment Analyses. JMIR Form Res 2021;5:e29500. [PMID: 34387556 PMCID: PMC8391726 DOI: 10.2196/29500] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/13/2021] [Revised: 06/29/2021] [Accepted: 07/06/2021] [Indexed: 11/13/2022]

Abstract

Background

Internalizing mental illnesses associated with psychological distress are often underdetected. Text-based detection using natural language processing (NLP) methods is increasingly being used to complement conventional detection efforts. However, these approaches often rely on self-disclosure through autobiographical narratives that may not always be possible, especially in the context of the collectivistic Japanese culture.

Objective

We propose the use of narrative writing as an alternative resource for mental illness detection in youth. Accordingly, in this study, we investigated the textual characteristics of narratives written by youth with psychological distress; our research focuses on the detection of psychopathological tendencies in written imaginative narratives.

Methods

Using NLP tools such as stylometric measures and lexicon-based sentiment analysis, we examined short narratives from 52 Japanese youth (mean age 19.8 years, SD 3.1) obtained through crowdsourcing. Participants wrote a short narrative introduction to an imagined story before completing a questionnaire to quantify their tendencies toward psychological distress. Based on this score, participants were categorized into higher distress and lower distress groups. The written narratives were then analyzed using NLP tools and examined for between-group differences. Although outside the scope of this study, we also carried out a supplementary analysis of narratives written by adults using the same procedure.

Results

Youth demonstrating higher tendencies toward psychological distress used significantly more positive (happiness-related) words, revealing differences in valence of the narrative content. No other significant differences were observed between the high and low distress groups.

Conclusions

Youth with tendencies toward mental illness were found to write more positive stories that contained more happiness-related terms. These results may potentially have widespread implications on psychological distress screening on online platforms, particularly in cultures such as Japan that are not accustomed to self-disclosure. Although the mechanisms that we propose in explaining our results are speculative, we believe that this interpretation paves the way for future research in online surveillance and detection efforts.

Gao Z, Fujita S, Shimizu N, Liew K, Murayama T, Yada S, Wakamiya S, Aramaki E. Measuring Public Concern About COVID-19 in Japanese Internet Users Through Search Queries: Infodemiological Study. JMIR Public Health Surveill 2021;7:e29865. [PMID: 34174781 PMCID: PMC8294121 DOI: 10.2196/29865] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/23/2021] [Revised: 06/01/2021] [Accepted: 06/13/2021] [Indexed: 01/19/2023]

Abstract

Background

COVID-19 has disrupted lives and livelihoods and caused widespread panic worldwide. Emerging reports suggest that people living in rural areas in some countries are more susceptible to COVID-19. However, there is a lack of quantitative evidence that can shed light on whether residents of rural areas are more concerned about COVID-19 than residents of urban areas.

Objective

This infodemiology study investigated attitudes toward COVID-19 in different Japanese prefectures by aggregating and analyzing Yahoo! JAPAN search queries.

Methods

We measured COVID-19 concerns in each Japanese prefecture by aggregating search counts of COVID-19–related queries of Yahoo! JAPAN users and data related to COVID-19 cases. We then defined two indices—the localized concern index (LCI) and localized concern index by patient percentage (LCIPP)—to quantitatively represent the degree of concern. To investigate the impact of emergency declarations on people's concerns, we divided our study period into three phases according to the timing of the state of emergency in Japan: before, during, and after. In addition, we evaluated the relationship between the LCI and LCIPP in different prefectures by correlating them with prefecture-level indicators of urbanization.

Results

Our results demonstrated that the concerns about COVID-19 in the prefectures changed in accordance with the declaration of the state of emergency. The correlation analyses also indicated that the differentiated types of public concern measured by the LCI and LCIPP reflect the prefectures’ level of urbanization to a certain extent (ie, the LCI appears to be more suitable for quantifying COVID-19 concern in urban areas, while the LCIPP seems to be more appropriate for rural areas).

Conclusions

We quantitatively defined Japanese Yahoo users’ concerns about COVID-19 by using the search counts of COVID-19–related search queries. Our results also showed that the LCI and LCIPP have external validity.

Mutinda FW, Yada S, Wakamiya S, Aramaki E. Semantic Textual Similarity in Japanese Clinical Domain Texts Using BERT. Methods Inf Med 2021;60:e56-e64. [PMID: 34237783 PMCID: PMC8294940 DOI: 10.1055/s-0041-1731390] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/02/2021] [Accepted: 05/18/2021] [Indexed: 11/13/2022]

Abstract

BACKGROUND

Semantic textual similarity (STS) captures the degree of semantic similarity between texts. It plays an important role in many natural language processing applications such as text summarization, question answering, machine translation, information retrieval, dialog systems, plagiarism detection, and query ranking. STS has been widely studied in the general English domain. However, there exist few resources for STS tasks in the clinical domain and in languages other than English, such as Japanese.

OBJECTIVE

The objective of this study is to capture semantic similarity between Japanese clinical texts (Japanese clinical STS) by creating a Japanese dataset that is publicly available.

MATERIALS

We created two datasets for Japanese clinical STS: (1) Japanese case reports (CR dataset) and (2) Japanese electronic medical records (EMR dataset). The CR dataset was created from publicly available case reports extracted from the CiNii database. The EMR dataset was created from Japanese electronic medical records.

METHODS

We used an approach based on bidirectional encoder representations from transformers (BERT) to capture the semantic similarity between the clinical domain texts. BERT is a popular approach for transfer learning and has been proven to be effective in achieving high accuracy for small datasets. We implemented two Japanese pretrained BERT models: a general Japanese BERT and a clinical Japanese BERT. The general Japanese BERT is pretrained on Japanese Wikipedia texts while the clinical Japanese BERT is pretrained on Japanese clinical texts.

RESULTS

The BERT models performed well in capturing semantic similarity in our datasets. The general Japanese BERT outperformed the clinical Japanese BERT and achieved a high correlation with human score (0.904 in the CR dataset and 0.875 in the EMR dataset). It was unexpected that the general Japanese BERT outperformed the clinical Japanese BERT on the clinical domain datasets. This could be because the general Japanese BERT is pretrained on a wider range of texts than the clinical Japanese BERT.

Murayama T, Wakamiya S, Aramaki E, Kobayashi R. Modeling the spread of fake news on Twitter. PLoS One 2021;16:e0250419. [PMID: 33886665 PMCID: PMC8062041 DOI: 10.1371/journal.pone.0250419] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 11/25/2020] [Accepted: 04/06/2021] [Indexed: 11/18/2022]

Ujiie S, Yada S, Wakamiya S, Aramaki E. Identification of Adverse Drug Event-Related Japanese Articles: Natural Language Processing Analysis. JMIR Med Inform 2020;8:e22661. [PMID: 33245290 PMCID: PMC7732716 DOI: 10.2196/22661] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/21/2020] [Revised: 10/05/2020] [Accepted: 10/28/2020] [Indexed: 12/23/2022]

Hisada S, Murayama T, Tsubouchi K, Fujita S, Yada S, Wakamiya S, Aramaki E. Surveillance of early stage COVID-19 clusters using search query logs and mobile device-based location information. Sci Rep 2020;10:18680. [PMID: 33122686 PMCID: PMC7596075 DOI: 10.1038/s41598-020-75771-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/10/2020] [Accepted: 10/01/2020] [Indexed: 12/18/2022]

Murayama T, Shimizu N, Fujita S, Wakamiya S, Aramaki E. Robust two-stage influenza prediction model considering regular and irregular trends. PLoS One 2020;15:e0233126. [PMID: 32437380 PMCID: PMC7241782 DOI: 10.1371/journal.pone.0233126] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/18/2019] [Accepted: 04/28/2020] [Indexed: 11/18/2022]

Aramaki E, Honda C, Wakamiya S, Sato A, Miyashiro I. Quick Cognitive Impairment Test for Cancer Patients Using Emotional Stroop Effect. Stud Health Technol Inform 2019;264:1629-1630. [PMID: 31438264 DOI: 10.3233/shti190568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]

Aramaki E, Miyabe M, Honda C, Isozaki S, Wakamiya S, Sato A, Miyashiro I. KOTOBAKARI Study: Using Natural Language Processing of Patient Short Narratives to Detect Cancer Related Cognitive Impairment. Stud Health Technol Inform 2019;264:1111-1115. [PMID: 31438097 DOI: 10.3233/shti190398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]

Wakamiya S, Morita M, Kano Y, Ohkuma T, Aramaki E. Tweet Classification Toward Twitter-Based Disease Surveillance: New Data, Methods, and Evaluations. J Med Internet Res 2019;21:e12783. [PMID: 30785407 PMCID: PMC6401666 DOI: 10.2196/12783] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/09/2018] [Revised: 12/12/2018] [Accepted: 12/13/2018] [Indexed: 11/13/2022]

Abstract

Background

The amount of medical and clinical-related information on the Web is increasing. Among the different types of information available, social media–based data obtained directly from people are particularly valuable and are attracting significant attention. To encourage medical natural language processing (NLP) research exploiting social media data, the 13th NII Testbeds and Community for Information access Research (NTCIR-13) Medical natural language processing for Web document (MedWeb) provides pseudo-Twitter messages in a cross-language and multi-label corpus, covering 3 languages (Japanese, English, and Chinese) and annotated with 8 symptom labels (such as cold, fever, and flu). Then, participants classify each tweet into 1 of the 2 categories: those containing a patient’s symptom and those that do not.

Objective

This study aimed to present the results of groups participating in a Japanese subtask, English subtask, and Chinese subtask along with discussions, to clarify the issues that need to be resolved in the field of medical NLP.

Methods

In summary, 8 groups (19 systems) participated in the Japanese subtask, 4 groups (12 systems) participated in the English subtask, and 2 groups (6 systems) participated in the Chinese subtask. In total, 2 baseline systems were constructed for each subtask. The performance of the participant and baseline systems was assessed using the exact match accuracy, F-measure based on precision and recall, and Hamming loss.
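The three evaluation measures named above can be illustrated with a minimal sketch for multi-label predictions. The function names, the toy label matrices, and the choice of micro-averaging for the F-measure are assumptions made for illustration; the abstract does not state how the task organizers averaged precision and recall.

```python
from typing import Sequence

Labels = Sequence[Sequence[int]]  # one row per tweet, one 0/1 entry per symptom label


def exact_match_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of tweets whose full label vector is predicted correctly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of individual label decisions that are wrong."""
    wrong = sum(ti != pi for t, p in zip(y_true, y_pred) for ti, pi in zip(t, p))
    total = sum(len(t) for t in y_true)
    return wrong / total


def micro_f_measure(y_true: Labels, y_pred: Labels) -> float:
    """F-measure from precision and recall pooled over all labels (micro-average, an assumption)."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        for ti, pi in zip(t, p):
            if ti == 1 and pi == 1:
                tp += 1
            elif ti == 0 and pi == 1:
                fp += 1
            elif ti == 1 and pi == 0:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with 4 tweets and 3 labels (made-up data, not from the MedWeb corpus).
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 0], [1, 0, 1]]

print(exact_match_accuracy(y_true, y_pred))  # 2 of 4 tweets fully correct
print(hamming_loss(y_true, y_pred))          # 3 of 12 label decisions wrong
print(micro_f_measure(y_true, y_pred))
```

Exact match accuracy is the strictest of the three because a single wrong label makes the whole tweet count as an error, whereas Hamming loss gives partial credit per label.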

Results

The best system achieved 0.880 in exact match accuracy, 0.920 in F-measure, and 0.019 in Hamming loss. The averages of exact match accuracy, F-measure, and Hamming loss for the Japanese subtask were 0.720, 0.820, and 0.051; those for the English subtask were 0.770, 0.850, and 0.037; and those for the Chinese subtask were 0.810, 0.880, and 0.032, respectively.

Conclusions

This paper presented and discussed the performance of systems participating in the NTCIR-13 MedWeb task. As the MedWeb task settings can be formalized as the factualization of text, the achievement of this task could be directly applied to practical clinical applications.

Wakamiya S, Matsune S, Okubo K, Aramaki E. Causal Relationships Among Pollen Counts, Tweet Numbers, and Patient Numbers for Seasonal Allergic Rhinitis Surveillance: Retrospective Analysis. J Med Internet Res 2019;21:e10450. [PMID: 30785411 PMCID: PMC6401667 DOI: 10.2196/10450] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/20/2018] [Revised: 11/08/2018] [Accepted: 12/10/2018] [Indexed: 12/29/2022]

Abstract

Background

Health-related social media data are increasingly used in disease-surveillance studies, which have demonstrated moderately high correlations between the number of social media posts and the number of patients. However, there is a need to understand the causal relationship between the behavior of social media users and the actual number of patients in order to increase the credibility of disease surveillance based on social media data.

Objective

This study aimed to clarify the causal relationships among pollen count, the posting behavior of social media users, and the number of patients with seasonal allergic rhinitis in the real world.

Methods

This analysis was conducted using datasets of pollen counts, tweet numbers, and numbers of patients with seasonal allergic rhinitis from Kanagawa Prefecture, Japan. We examined daily pollen counts for Japanese cedar (the major cause of seasonal allergic rhinitis in Japan) and hinoki cypress (which commonly complicates seasonal allergic rhinitis) from February 1 to May 31, 2017. The daily numbers of tweets that included the keyword “kafunshō” (or seasonal allergic rhinitis) were calculated between January 1 and May 31, 2017. Daily numbers of patients with seasonal allergic rhinitis from January 1 to May 31, 2017, were obtained from three healthcare institutes that participated in the study. The Granger causality test was used to examine the causal relationships among pollen count, tweet numbers, and the number of patients with seasonal allergic rhinitis from February to May 2017. To determine if time-variant factors affect these causal relationships, we analyzed the main seasonal allergic rhinitis phase (February to April) when Japanese cedar trees actively produce and release pollen.
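The idea behind the Granger causality test can be sketched as follows: a series x is said to Granger-cause y if past values of x improve the prediction of y beyond what past values of y alone achieve. The sketch below is a simplified illustration that uses a single lag and reports only the F statistic; the lag order, software, and preprocessing used in the study are not given in the abstract, and the function names and synthetic series here are hypothetical.

```python
import random
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def residual_sum_of_squares(design: List[List[float]], target: Sequence[float]) -> float:
    """Fit ordinary least squares via the normal equations and return the RSS."""
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * t for row, t in zip(design, target)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, t in zip(design, target))


def granger_f_lag1(cause: Sequence[float], effect: Sequence[float]) -> float:
    """F statistic for 'cause Granger-causes effect' with one lag (an illustrative simplification)."""
    target = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    unrestricted = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_r = residual_sum_of_squares(restricted, target)
    rss_u = residual_sum_of_squares(unrestricted, target)
    n = len(target)
    # One restriction is tested; the unrestricted model has three parameters.
    return (rss_r - rss_u) / (rss_u / (n - 3))


# Synthetic data only: y follows x with a one-step delay plus small noise.
rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(60)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.uniform(-1, 1) for t in range(1, 60)]

f_x_to_y = granger_f_lag1(x, y)  # expected to be large
f_y_to_x = granger_f_lag1(y, x)  # expected to be much smaller
print(f_x_to_y, f_y_to_x)
```

A P value, as reported in the Results, would additionally require comparing the F statistic against the F distribution with the corresponding degrees of freedom, which is omitted here.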

Results

Increases in pollen count were found to increase the number of tweets during the overall study period (P=.04), but not the main seasonal allergic rhinitis phase (P=.05). In contrast, increases in pollen count were found to increase patient numbers in both the study period (P=.04) and the main seasonal allergic rhinitis phase (P=.01). Increases in the number of tweets increased the patient numbers during the main seasonal allergic rhinitis phase (P=.02), but not the overall study period (P=.89). Patient numbers did not affect the number of tweets in either the overall study period (P=.24) or the main seasonal allergic rhinitis phase (P=.47).

Conclusions

Understanding the causal relationships among pollen counts, tweet numbers, and numbers of patients with seasonal allergic rhinitis is an important step to increasing the credibility of surveillance systems that use social media data. Further in-depth studies are needed to identify the determinants of social media posts described in this exploratory analysis.

Usui M, Aramaki E, Iwao T, Wakamiya S, Sakamoto T, Mochizuki M. Extraction and Standardization of Patient Complaints from Electronic Medication Histories for Pharmacovigilance: Natural Language Processing Analysis in Japanese. JMIR Med Inform 2018;6:e11021. [PMID: 30262450 PMCID: PMC6231790 DOI: 10.2196/11021] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 05/12/2018] [Revised: 08/07/2018] [Accepted: 08/25/2018] [Indexed: 12/13/2022]

Wakamiya S, Kawai Y, Aramaki E. Twitter-Based Influenza Detection After Flu Peak via Tweets With Indirect Information: Text Mining Study. JMIR Public Health Surveill 2018;4:e65. [PMID: 30274968 PMCID: PMC6231889 DOI: 10.2196/publichealth.8627] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 08/02/2017] [Revised: 02/24/2018] [Accepted: 07/18/2018] [Indexed: 11/13/2022]

Abstract

Background

The recent rise in popularity and scale of social networking services (SNSs) has resulted in an increasing need for SNS-based information extraction systems. A popular application of SNS data is health surveillance for predicting an outbreak of epidemics by detecting diseases from text messages posted on SNS platforms. Such applications share the following logic: they incorporate SNS users as social sensors. These social sensor–based approaches also share a common problem: SNS-based surveillance is much more reliable when sufficient numbers of users are active, whereas small or inactive populations produce inconsistent results.

Objective

This study proposes a novel approach to estimate the trend of patient numbers using indirect information covering both urban areas and rural areas within the posts.

Methods

We presented a TRAP model by embedding both direct information and indirect information. A collection of tweets spanning 3 years (7 million influenza-related tweets in Japanese) was used to evaluate the model. Both direct information and indirect information that mention other places were used. As indirect information is less reliable (too noisy or too old) than direct information, the indirect information data were not used directly and were considered as inhibiting direct information. For example, when indirect information appeared often, it was considered as signifying that everyone already had a known disease, leading to a small amount of direct information.

Results

The estimation performance of our approach was evaluated using the correlation coefficient between the number of influenza cases (the gold standard values) and the values estimated by the proposed models. The results revealed that the baseline model (BASELINE+NLP) achieved a correlation of .36 and that the proposed model (TRAP+NLP) improved this to .70 (+.34 points).
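The evaluation described above compares estimated values against gold standard case counts using a correlation coefficient. The abstract does not specify which coefficient was used; the sketch below assumes the Pearson correlation, and the function name and the weekly numbers are made up for illustration.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly values: reported cases versus a model's estimates.
gold_cases = [1, 2, 3, 4, 5]
estimates = [2, 1, 4, 3, 5]
print(pearson_r(gold_cases, estimates))
```

A value near 1 means the estimates rise and fall together with the reported cases, which is the property a surveillance model needs even if its absolute scale differs from the case counts.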

Conclusions

The proposed approach by which the indirect information inhibits direct information exhibited improved estimation performance not only in rural cities but also in urban cities, which demonstrated the effectiveness of the proposed method consisting of a TRAP model and natural language processing (NLP) classification.

Aramaki E, Yano K, Wakamiya S. MedEx/J: A One-Scan Simple and Fast NLP Tool for Japanese Clinical Texts. Stud Health Technol Inform 2017;245:285-288. [PMID: 29295100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]

Iso H, Wakamiya S, Aramaki E. Conditional Density Estimation of Tweet Location: A Feature-Dependent Approach. Stud Health Technol Inform 2017;245:408-411. [PMID: 29295126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]

Wakamiya S, Lee R, Sumiya K. Crowd-Powered TV Viewing Rates: Measuring Relevancy between Tweets and TV Programs. Database Systems for Advanced Applications 2011. [DOI: 10.1007/978-3-642-20244-5_37] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/13/2023]

Dagalakis NG, Muehlhouse C, Wakamiya S, Yang JC. Loss of control biomechanics of the human arm-elbow system. J Biomech 1987;20:385-96. [PMID: 3597455 DOI: 10.1016/0021-9290(87)90046-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 01/06/2023]