1
Taira RK, Garlid AO, Speier W. Design considerations for a hierarchical semantic compositional framework for medical natural language understanding. PLoS One 2023; 18:e0282882. [PMID: 36928721 PMCID: PMC10019629 DOI: 10.1371/journal.pone.0282882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2020] [Accepted: 02/24/2023] [Indexed: 03/18/2023] Open
Abstract
Medical natural language processing (NLP) systems are a key enabling technology for transforming Big Data from clinical report repositories to information used to support disease models and validate intervention methods. However, current medical NLP systems fall considerably short when faced with the task of logically interpreting clinical text. In this paper, we describe a framework inspired by mechanisms of human cognition in an attempt to jump the NLP performance curve. The design centers on a hierarchical semantic compositional model (HSCM), which provides an internal substrate for guiding the interpretation process. The paper describes insights from four key cognitive aspects: semantic memory, semantic composition, semantic activation, and hierarchical predictive coding. We discuss the design of a generative semantic model and an associated semantic parser used to transform a free-text sentence into a logical representation of its meaning. The paper discusses supportive and antagonistic arguments for the key features of the architecture as a long-term foundational framework.
Affiliation(s)
- Ricky K. Taira
- Medical and Imaging Informatics (MII) Group, Department of Radiological Sciences, University of California, Los Angeles, Los Angeles, California, United States of America
- Anders O. Garlid
- Medical and Imaging Informatics (MII) Group, Department of Radiological Sciences, University of California, Los Angeles, Los Angeles, California, United States of America
- William Speier
- Medical and Imaging Informatics (MII) Group, Department of Radiological Sciences, University of California, Los Angeles, Los Angeles, California, United States of America
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, California, United States of America
2
Deng L, Zhang X, Yang T, Liu M, Chen L, Jiang T. PIAT: an evolutionarily intelligent system for deep phenotyping of Chinese electronic health records. IEEE J Biomed Health Inform 2022; 26:4142-4152. PMID: 35609107; DOI: 10.1109/jbhi.2022.3177421.
Abstract
Electronic health record (EHR) resources are valuable but remain underexplored because most clinical information, especially phenotype information, is buried in the free text of EHRs. An intelligent annotation tool plays an important role in unlocking the full potential of EHRs by transforming free-text phenotype information into a computer-readable form. Deep phenotyping has shown its advantage in representing phenotype information in EHRs with high fidelity; however, most existing annotation tools are not suitable for the deep phenotyping task. Here, we developed an intelligent annotation tool named PIAT with a major focus on the deep phenotyping of Chinese EHRs. PIAT improves annotation efficiency for EHR-based deep phenotyping through a simple but effective interactive interface, automatic preannotation support, and a learning mechanism. Specifically, experts proofread automatic annotation results from the annotation algorithm in the web-based interactive interface, and the expert-reviewed EHRs are then used to evolve the underlying annotation algorithm. In this way, annotating EHRs for deep phenotyping becomes progressively easier. In conclusion, we created a powerful intelligent system for the deep phenotyping of Chinese EHRs. We hope that our work will inspire further studies on intelligent systems for the deep phenotyping of both English and non-English EHRs.
3
Li S, Deng L, Zhang X, Chen L, Yang T, Qi Y, Jiang T. Deep Phenotyping on Chinese Electronic Health Records by Recognizing Linguistic Patterns of Phenotypic Narratives with a Sequence Motif Discovery Tool: Algorithm Development and Validation. J Med Internet Res 2022; 24:e37213. PMID: 35657661; PMCID: PMC9206202; DOI: 10.2196/37213.
Abstract
Background Phenotype information in electronic health records (EHRs) is mainly recorded in unstructured free text, which cannot be directly used for clinical research. EHR-based deep-phenotyping methods can structure phenotype information in EHRs with high fidelity, making it the focus of medical informatics. However, developing a deep-phenotyping method for non-English EHRs (ie, Chinese EHRs) is challenging. Although numerous EHR resources exist in China, fine-grained annotation data that are suitable for developing deep-phenotyping methods are limited, making this a low-resource scenario. Objective In this study, we aimed to develop a deep-phenotyping method with good generalization ability for Chinese EHRs based on limited fine-grained annotation data. Methods The core of the methodology was to identify linguistic patterns of phenotype descriptions in Chinese EHRs with a sequence motif discovery tool and perform deep phenotyping of Chinese EHRs by recognizing linguistic patterns in free text. Specifically, 1000 Chinese EHRs were manually annotated based on a fine-grained information model, PhenoSSU (Semantic Structured Unit of Phenotypes). The annotation data set was randomly divided into a training set (n=700, 70%) and a testing set (n=300, 30%). The process for mining linguistic patterns was divided into three steps. First, free text in the training set was encoded as single-letter sequences (P: phenotype, A: attribute). Second, a biological sequence analysis tool, MEME (Multiple Expectation Maximization for Motif Elicitation), was used to identify motifs in the single-letter sequences. Finally, the identified motifs were reduced to a series of regular expressions representing linguistic patterns of PhenoSSU instances in Chinese EHRs.
Based on the discovered linguistic patterns, we developed a deep-phenotyping method for Chinese EHRs, including a deep learning–based method for named entity recognition and a pattern recognition–based method for attribute prediction. Results In total, 51 sequence motifs with statistical significance were mined from 700 Chinese EHRs in the training set and were combined into six regular expressions. These six regular expressions could be learned from a mean of 134 (SD 9.7) annotated EHRs in the training set. The deep-phenotyping algorithm for Chinese EHRs recognized PhenoSSU instances with an overall accuracy of 0.844 on the test set. For the subtask of entity recognition, the algorithm achieved an F1 score of 0.898 with the Bidirectional Encoder Representations from Transformers–bidirectional long short-term memory and conditional random field model; for the subtask of attribute prediction, it achieved a weighted accuracy of 0.940 with the linguistic pattern–based method. Conclusions We developed a simple but effective strategy for the deep phenotyping of Chinese EHRs with limited fine-grained annotation data. Our work will promote the secondary use of Chinese EHRs and may inspire similar efforts in other non-English-speaking countries.
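The encode-then-match pipeline described in this abstract can be illustrated with a toy sketch; the tokens, tags, and the single hand-written pattern below are invented for illustration (the paper's actual patterns were mined with MEME from annotated Chinese text, not written by hand):

```python
import re

# Toy tagged sentence: each token is (text, tag); the tags are hypothetical.
# 'P' = phenotype entity, 'A' = attribute (severity, duration, ...), 'O' = other.
tokens = [
    ("fever", "P"), ("mild", "A"), (",", "O"),
    ("cough", "P"), ("severe", "A"), ("persistent", "A"),
]

# Step 1: encode the token stream as a single-letter sequence.
sequence = "".join(tag for _, tag in tokens)  # "PAOPAA"

# Step 2: a regular expression standing in for one mined linguistic pattern:
# a phenotype followed by one or more attributes.
pattern = re.compile(r"PA+")

# Step 3: map regex matches back to token spans (candidate PhenoSSU instances).
instances = []
for m in pattern.finditer(sequence):
    span = tokens[m.start():m.end()]
    phenotype = span[0][0]
    attributes = [text for text, tag in span[1:] if tag == "A"]
    instances.append((phenotype, attributes))

print(instances)
```

Running this yields one candidate instance per phenotype, each paired with its trailing attributes.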
Affiliation(s)
- Shicheng Li
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
- Lizong Deng
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
- Xu Zhang
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
- Luming Chen
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
- Guangzhou Laboratory, Guangzhou, China
- Tao Yang
- Guangzhou Laboratory, Guangzhou, China
- Guangzhou Medical University, Guangzhou, China
- Yifan Qi
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
- Taijiao Jiang
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
- Guangzhou Laboratory, Guangzhou, China
4
He W, Kirchoff KG, Sampson RR, McGhee KK, Cates AM, Obeid JS, Lenert LA. Research Integrated Network of Systems (RINS): a virtual data warehouse for the acceleration of translational research. J Am Med Inform Assoc 2021; 28:1440-1450. PMID: 33729486; DOI: 10.1093/jamia/ocab023.
Abstract
OBJECTIVE Integrated, real-time data are crucial to evaluate translational efforts to accelerate innovation into care. Too often, however, the needed data are fragmented across disparate systems. The South Carolina Clinical & Translational Research Institute at the Medical University of South Carolina (MUSC) developed and implemented a universal study identifier, the Research Master Identifier (RMID), for tracking research studies across disparate systems, and a data warehouse-inspired model, the Research Integrated Network of Systems (RINS), for integrating data from those systems. MATERIALS AND METHODS In 2017, MUSC began requiring the use of RMIDs in informatics systems that support human subject studies. We developed a web-based tool to create RMIDs and application programming interfaces to synchronize research records and visualize linkages to protocols across systems. Selected data from these disparate systems were extracted and merged nightly into an enterprise data mart, and performance dashboards were created to monitor key translational processes. RESULTS Within 4 years, 5513 RMIDs were created. Of these, 726 (13%) bridged systems needed to evaluate research study performance, and 982 (18%) were linked to the electronic health record, enabling patient-level reporting. DISCUSSION Barriers posed by data fragmentation to assessing program impact have largely been eliminated at MUSC through the requirement for an RMID, its distribution via RINS to disparate systems, and the mapping of system-level data to a single integrated data mart. CONCLUSION By applying data warehousing principles to federate data at the "study" level, the RINS project reduced data fragmentation and promoted research systems integration.
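The study-level federation idea can be sketched as follows; the system names, field names, and `federate` helper are hypothetical, and the real RINS uses web APIs and a nightly extract into an enterprise data mart rather than in-memory dictionaries:

```python
# Hypothetical records from three disparate research systems, each carrying
# the same universal study identifier (RMID). All field names are invented.
irb_system = [{"rmid": "R001", "protocol": "IRB-2020-17"}]
ctms_system = [{"rmid": "R001", "enrolled": 42}]
ehr_system = [
    {"rmid": "R001", "linked_patients": 42},
    {"rmid": "R002", "linked_patients": 7},
]

def federate(*sources):
    """Merge study-level records from disparate systems, keyed on the RMID."""
    mart = {}
    for source in sources:
        for record in source:
            mart.setdefault(record["rmid"], {}).update(
                {k: v for k, v in record.items() if k != "rmid"}
            )
    return mart

mart = federate(irb_system, ctms_system, ehr_system)
print(mart["R001"])  # fields from all three systems, joined on the shared RMID
```

Because every system carries the same identifier, no patient-level linkage is needed to assemble a study-level view.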
Affiliation(s)
- Wenjun He
- College of Medicine, South Carolina Clinical & Translational Research Institute, Medical University of South Carolina, Charleston, SC, USA
- Katie G Kirchoff
- Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, USA
- Royce R Sampson
- College of Medicine, South Carolina Clinical & Translational Research Institute, Medical University of South Carolina, Charleston, SC, USA
- Department of Psychiatry & Behavioral Sciences, Medical University of South Carolina, Charleston, SC, USA
- Kimberly K McGhee
- College of Medicine, South Carolina Clinical & Translational Research Institute, Medical University of South Carolina, Charleston, SC, USA
- Academic Affairs Faculty, Medical University of South Carolina, Charleston, SC, USA
- Andrew M Cates
- Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, USA
- Jihad S Obeid
- College of Medicine, South Carolina Clinical & Translational Research Institute, Medical University of South Carolina, Charleston, SC, USA
- Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, USA
- Department of Public Health Sciences, Medical University of South Carolina, Charleston, SC, USA
- Leslie A Lenert
- College of Medicine, South Carolina Clinical & Translational Research Institute, Medical University of South Carolina, Charleston, SC, USA
- Biomedical Informatics Center, Medical University of South Carolina, Charleston, SC, USA
- Department of Medicine, Medical University of South Carolina, Charleston, SC, USA
5
Park JA, Sung MD, Kim HH, Park YR. Weight-Based Framework for Predictive Modeling of Multiple Databases With Noniterative Communication Without Data Sharing: Privacy-Protecting Analytic Method for Multi-Institutional Studies. JMIR Med Inform 2021; 9:e21043. PMID: 33818396; PMCID: PMC8056295; DOI: 10.2196/21043.
Abstract
Background Securing the representativeness of study populations is crucial in biomedical research to ensure high generalizability. In this regard, using multi-institutional data has advantages in medicine. However, combining data physically is difficult because the confidential nature of biomedical data raises privacy issues. Therefore, a methodological approach is needed for developing models from multi-institutional medical data without sharing data between institutions. Objective This study aims to develop a weight-based integrated predictive model for multi-institutional data, which does not require iterative communication between institutions, to improve average predictive performance by increasing the generalizability of the model under privacy-preserving conditions without sharing patient-level data. Methods The weight-based integrated model generates a weight for each institutional model and builds an integrated model for multi-institutional data based on these weights. We performed 3 simulations to show the weight characteristics and to determine the number of repetitions of weight generation required to obtain stable values. We also conducted an experiment using real multi-institutional data to verify the developed weight-based integrated model. We selected 10 hospitals (2845 intensive care unit [ICU] stays in total) from the eICU Collaborative Research Database to predict ICU mortality with 11 features. To evaluate the validity of our model against a centralized model developed by combining the data of all 10 hospitals, we used proportional overlap (ie, 0.5 or less indicates a significant difference at a level of .05, and 2 indicates 2 CIs overlapping completely). Standard and Firth logistic regression models were applied for the 2 simulations and the experiment.
Results The results of these simulations indicate that the weight of each institution is determined by 2 factors (ie, the data size of each institution and how well each institutional model fits into the overall institutional data) and that repeatedly generating 200 weights is necessary per institution. In the experiment, the estimated area under the receiver operating characteristic curve (AUC) and 95% CIs were 81.36% (79.37%-83.36%) and 81.95% (80.03%-83.87%) in the centralized model and weight-based integrated model, respectively. The proportional overlap of the CIs for AUC in both the weight-based integrated model and the centralized model was approximately 1.70, and that of overlap of the 11 estimated odds ratios was over 1, except for 1 case. Conclusions In the experiment where real multi-institutional data were used, our model showed similar results to the centralized model without iterative communication between institutions. In addition, our weight-based integrated model provided a weighted average model by integrating 10 models overfitted or underfitted, compared with the centralized model. The proposed weight-based integrated model is expected to provide an efficient distributed research approach as it increases the generalizability of the model and does not require iterative communication.
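A minimal sketch of the integration step, assuming each institution shares only its fitted coefficients and a weight. The coefficient vectors and weight values below are invented; in the paper, weights depend on each institution's data size and on how well its model fits the overall institutional data, and weight generation is repeated about 200 times per institution:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficient vectors (intercept + 2 features) fitted locally
# at three institutions; only coefficients and weights leave each site.
local_coefs = {
    "hospital_A": [-1.2, 0.8, 0.3],
    "hospital_B": [-0.9, 1.1, 0.2],
    "hospital_C": [-1.5, 0.7, 0.4],
}
# Illustrative weights standing in for the paper's weighting scheme.
weights = {"hospital_A": 0.5, "hospital_B": 0.3, "hospital_C": 0.2}

def integrate(coefs, weights):
    """Weighted average of per-institution coefficient vectors."""
    total = sum(weights.values())
    n = len(next(iter(coefs.values())))
    return [
        sum(weights[h] * coefs[h][j] for h in coefs) / total
        for j in range(n)
    ]

integrated = integrate(local_coefs, weights)
# Predict ICU mortality risk for one patient with feature values (1.0, 0.5).
risk = sigmoid(integrated[0] + integrated[1] * 1.0 + integrated[2] * 0.5)
```

No patient-level data and no iterative rounds are exchanged: one pass of coefficients and weights suffices to build the integrated model.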
Affiliation(s)
- Ji Ae Park
- Department of Biomedical System Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
- Min Dong Sung
- Department of Biomedical System Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
- Ho Heon Kim
- Department of Biomedical System Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
- Yu Rang Park
- Department of Biomedical System Informatics, Yonsei University College of Medicine, Seoul, Republic of Korea
6
Sáez C, Gutiérrez-Sacristán A, Kohane I, García-Gómez JM, Avillach P. EHRtemporalVariability: delineating temporal data-set shifts in electronic health records. Gigascience 2020; 9:giaa079. PMID: 32729900; PMCID: PMC7391413; DOI: 10.1093/gigascience/giaa079.
Abstract
BACKGROUND Temporal variability in health-care processes or protocols is intrinsic to medicine. Such variability can potentially introduce dataset shifts, a data quality issue when reusing electronic health records (EHRs) for secondary purposes. Temporal dataset shifts can present as trends, as well as abrupt or seasonal changes in the statistical distributions of data over time. The latter are particularly complicated to address in multimodal and highly coded data. These changes, if not delineated, can harm population- and data-driven research, such as machine learning. Given that biomedical research repositories are increasingly being populated with large sets of historical data from EHRs, there is a need for specific software methods to help delineate temporal dataset shifts and ensure reliable data reuse. RESULTS EHRtemporalVariability is an open-source R package and Shiny app designed to explore and identify temporal dataset shifts. EHRtemporalVariability estimates the statistical distributions of coded and numerical data over time; projects their temporal evolution through non-parametric information geometric temporal plots; and enables the exploration of changes in variables through data temporal heat maps. We demonstrate the capability of EHRtemporalVariability to delineate dataset shifts in three impact case studies, one of which is available for reproducibility. CONCLUSIONS EHRtemporalVariability enables the exploration and identification of dataset shifts, contributing to the broad examination and repurposing of large, longitudinal data sets. Our goal is to help ensure reliable data reuse for a wide range of biomedical data users.
EHRtemporalVariability is designed for technical users working programmatically with the R package, as well as for users unfamiliar with programming via the Shiny user interface.
Availability: https://github.com/hms-dbmi/EHRtemporalVariability/
Reproducible vignette: https://cran.r-project.org/web/packages/EHRtemporalVariability/vignettes/EHRtemporalVariability.html
Online demo: http://ehrtemporalvariability.upv.es/
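The kind of temporal shift the package delineates can be illustrated with a small sketch that compares code distributions from two time windows using the Jensen-Shannon divergence; the code batches are invented, and this is a conceptual stand-in rather than the package's actual API:

```python
import math
from collections import Counter

def jensen_shannon(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    keys = set(p) | set(q)
    def kl(a, b):
        return sum(a.get(k, 0.0) * math.log2(a[k] / b[k])
                   for k in keys if a.get(k, 0.0) > 0)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def distribution(codes):
    """Empirical probability distribution of a batch of codes."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}

# Hypothetical diagnosis-code batches from two time windows; the coding
# protocol "changed" between them (ICD-9-style vs ICD-10-style codes).
january = ["250.0", "250.0", "401.9", "250.0"]
june = ["E11.9", "E11.9", "I10", "E11.9"]

shift = jensen_shannon(distribution(january), distribution(june))
```

Completely disjoint code sets yield the maximal divergence of 1, while identical distributions yield 0; tracking this value across consecutive windows exposes abrupt or seasonal dataset shifts.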
Affiliation(s)
- Carlos Sáez
- Biomedical Data Science Lab, Instituto Universitario de Tecnologías de la Información y Comunicaciones, Universitat Politècnica de València, Camino de Vera s/n, Valencia 46022, España
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Isaac Kohane
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Juan M García-Gómez
- Biomedical Data Science Lab, Instituto Universitario de Tecnologías de la Información y Comunicaciones, Universitat Politècnica de València, Camino de Vera s/n, Valencia 46022, España
- Paul Avillach
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Computational Health Informatics Program, Boston Children’s Hospital, Boston, Massachusetts, USA
7
Platt JE, Raj M, Wienroth M. An Analysis of the Learning Health System in Its First Decade in Practice: Scoping Review. J Med Internet Res 2020; 22:e17026. PMID: 32191214; PMCID: PMC7118548; DOI: 10.2196/17026.
Abstract
Background In the past decade, Lynn Etheredge presented a vision for the Learning Health System (LHS) as an opportunity for increasing the value of health care via rapid learning from data and immediate translation to practice and policy. An LHS is defined in the literature as a system that seeks to continuously generate and apply evidence, innovation, quality, and value in health care. Objective This review aimed to examine themes in the literature and rhetoric on the LHS in the past decade to understand efforts to realize the LHS in practice and to identify gaps and opportunities to continue to take the LHS forward. Methods We conducted a thematic analysis in 2018 to analyze progress and opportunities over time as compared with the initial Knowledge Gaps and Uncertainties proposed in 2007. Results We found that the literature on the LHS has increased over the past decade, with most articles focused on theory and implementation; articles have been increasingly concerned with policy. Conclusions There is a need for attention to understanding the ethical and social implications of the LHS and for exploring opportunities to ensure that these implications are salient in implementation, practice, and policy efforts.
Affiliation(s)
- Jodyn E Platt
- Department of Learning Health Sciences, University of Michigan Medical School, Ann Arbor, MI, United States
- Minakshi Raj
- Department of Health Management and Policy, University of Michigan School of Public Health, Ann Arbor, MI, United States
- Matthias Wienroth
- School of Geography, Politics & Sociology, Newcastle University, Newcastle upon Tyne, United Kingdom
8
Lowes LP, Noritz GH, Newmeyer A, Embi PJ, Yin H, Smoyer WE. 'Learn From Every Patient': implementation and early results of a learning health system. Dev Med Child Neurol 2017; 59:183-191. PMID: 27545839; DOI: 10.1111/dmcn.13227.
Abstract
AIM The convergence of three major trends in medicine, namely conversion to electronic health records (EHRs), prioritization of translational research, and the need to control healthcare expenditures, has created unprecedented interest and opportunities to develop systems that improve care while reducing costs. However, operationalizing a 'learning health system' requires systematic changes that have not yet been widely demonstrated in clinical practice. METHOD We developed, implemented, and evaluated a model of EHR-supported care in a cohort of 131 children with cerebral palsy that integrated clinical care, quality improvement, and research, entitled 'Learn From Every Patient' (LFEP). RESULTS Children treated in the LFEP Program for a 12-month period experienced a 43% reduction in total inpatient days (p=0.030 vs prior 12mo period), a 27% reduction in inpatient admissions, a 30% reduction in emergency department visits (p=0.001), and a 29% reduction in urgent care visits (p=0.046). LFEP Program implementation also resulted in reductions in healthcare costs of 210% (US$7014/child) versus a Time control group, and reductions of 176% ($6596/child) versus a Program Activities control group. Importantly, clinical implementation of the LFEP Program has also driven the continuous accumulation of robust research-quality data for both publication and implementation of evidence-based improvements in clinical care. INTERPRETATION These results demonstrate that a learning health system can be developed and implemented in a cost-effective manner, and can integrate clinical care and research to systematically drive simultaneous clinical quality improvement and reduced healthcare costs.
Affiliation(s)
- Garey H Noritz
- Nationwide Children's Hospital, Columbus, OH, USA
- Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Amy Newmeyer
- Children's Hospital of the King's Daughters, Norfolk, VA, USA
- Peter J Embi
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Han Yin
- Nationwide Children's Hospital, Columbus, OH, USA
- William E Smoyer
- Nationwide Children's Hospital, Columbus, OH, USA
- Department of Pediatrics, The Ohio State University, Columbus, OH, USA
9
Fedorov A, Clunie D, Ulrich E, Bauer C, Wahle A, Brown B, Onken M, Riesmeier J, Pieper S, Kikinis R, Buatti J, Beichel RR. DICOM for quantitative imaging biomarker development: a standards based approach to sharing clinical data and structured PET/CT analysis results in head and neck cancer research. PeerJ 2016; 4:e2057. PMID: 27257542; PMCID: PMC4888317; DOI: 10.7717/peerj.2057.
Abstract
Background. Imaging biomarkers hold tremendous promise for precision medicine clinical applications. Development of such biomarkers relies heavily on image post-processing tools for automated image quantitation. Their deployment in the context of clinical research necessitates interoperability with the clinical systems. Comparison with the established outcomes and evaluation tasks motivate integration of the clinical and imaging data, and the use of standardized approaches to support annotation and sharing of the analysis results and semantics. We developed the methodology and tools to support these tasks in Positron Emission Tomography and Computed Tomography (PET/CT) quantitative imaging (QI) biomarker development applied to head and neck cancer (HNC) treatment response assessment, using the Digital Imaging and Communications in Medicine (DICOM®) international standard and free open-source software. Methods. Quantitative analysis of PET/CT imaging data collected on patients undergoing treatment for HNC was conducted. Processing steps included Standardized Uptake Value (SUV) normalization of the images, segmentation of the tumor using manual and semi-automatic approaches, automatic segmentation of the reference regions, and extraction of the volumetric segmentation-based measurements. Suitable components of the DICOM standard were identified to model the various types of data produced by the analysis. A developer toolkit of conversion routines and an Application Programming Interface (API) were contributed and applied to create a standards-based representation of the data. Results. DICOM Real World Value Mapping, Segmentation and Structured Reporting objects were utilized for standards-compliant representation of the PET/CT QI analysis results and relevant clinical data. A number of correction proposals to the standard were developed. The open-source DICOM toolkit (DCMTK) was improved to simplify the task of DICOM encoding by introducing new API abstractions.
Conversion and visualization tools utilizing this toolkit were developed. The encoded objects were validated for consistency and interoperability. The resulting dataset was deposited in the QIN-HEADNECK collection of The Cancer Imaging Archive (TCIA). Supporting tools for data analysis and DICOM conversion were made available as free open-source software. Discussion. We presented a detailed investigation of the development and application of the DICOM model, as well as the supporting open-source tools and toolkits, to accommodate representation of the research data in QI biomarker development. We demonstrated that the DICOM standard can be used to represent the types of data relevant in HNC QI biomarker development, and encode their complex relationships. The resulting annotated objects are amenable to data mining applications, and are interoperable with a variety of systems that support the DICOM standard.
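One of the processing steps named above, SUV normalization, follows the standard body-weight formula. The helper below is a sketch: the acquisition values are invented, a tissue density of 1 g/mL is implied, and the injected dose is assumed to be already decay-corrected to scan start:

```python
def suv_bw(voxel_bq_per_ml, injected_dose_bq, body_weight_kg):
    """Body-weight Standardized Uptake Value (SUVbw) for one voxel.

    SUVbw = tissue activity concentration / (injected dose / body weight).
    Units: Bq/mL, Bq, kg; with 1 g/mL tissue density the result is
    dimensionless.
    """
    body_weight_g = body_weight_kg * 1000.0
    return voxel_bq_per_ml / (injected_dose_bq / body_weight_g)

# Hypothetical acquisition: 5 kBq/mL voxel, 370 MBq injected, 70 kg patient.
suv = suv_bw(5_000.0, 370_000_000.0, 70.0)
print(round(suv, 3))  # 0.946
```

In a DICOM workflow, a Real World Value Mapping object can record exactly this linear transformation so that stored pixel values remain traceable to SUV.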
Affiliation(s)
- Andriy Fedorov
- Department of Radiology, Brigham and Women’s Hospital, Boston, MA, United States of America
- Harvard Medical School, Harvard University, Boston, MA, United States of America
- David Clunie
- PixelMed Publishing, LLC, Bangor, PA, United States of America
- Ethan Ulrich
- Department of Electrical and Computer Engineering, University of Iowa, Iowa City, IA, United States of America
- Iowa Institute for Biomedical Imaging, University of Iowa, Iowa City, IA, United States of America
- Christian Bauer
- Department of Electrical and Computer Engineering, University of Iowa, Iowa City, IA, United States of America
- Iowa Institute for Biomedical Imaging, University of Iowa, Iowa City, IA, United States of America
- Andreas Wahle
- Department of Electrical and Computer Engineering, University of Iowa, Iowa City, IA, United States of America
- Iowa Institute for Biomedical Imaging, University of Iowa, Iowa City, IA, United States of America
- Bartley Brown
- Center for Bioinformatics and Computational Biology, University of Iowa, Iowa City, IA, United States of America
- Steve Pieper
- Isomics, Inc., Cambridge, MA, United States of America
- Ron Kikinis
- Department of Radiology, Brigham and Women’s Hospital, Boston, MA, United States of America
- Harvard Medical School, Harvard University, Boston, MA, United States of America
- Fraunhofer MEVIS, Bremen, Germany
- Mathematics/Computer Science Faculty, University of Bremen, Bremen, Germany
- John Buatti
- Department of Radiation Oncology, University of Iowa Carver College of Medicine, Iowa City, IA, United States of America
- Reinhard R. Beichel
- Department of Electrical and Computer Engineering, University of Iowa, Iowa City, IA, United States of America
- Iowa Institute for Biomedical Imaging, University of Iowa, Iowa City, IA, United States of America
- Department of Internal Medicine, University of Iowa Carver College of Medicine, Iowa City, IA, United States of America
10
Abstract
The published biomedical research literature encompasses most of our understanding of how drugs interact with gene products to produce physiological responses (phenotypes). Unfortunately, this information is distributed throughout the unstructured text of over 23 million articles. The creation of structured resources that catalog the relationships between drugs and genes would accelerate the translation of basic molecular knowledge into discoveries of genomic biomarkers for drug response and prediction of unexpected drug-drug interactions. Extracting these relationships from natural language sentences on such a large scale, however, requires text mining algorithms that can recognize when different-looking statements are expressing similar ideas. Here we describe a novel algorithm, Ensemble Biclustering for Classification (EBC), that learns the structure of biomedical relationships automatically from text, overcoming differences in word choice and sentence structure. We validate EBC's performance against manually curated sets of (1) pharmacogenomic relationships from PharmGKB and (2) drug-target relationships from DrugBank, and use it to discover new drug-gene relationships for both knowledge bases. We then apply EBC to map the complete universe of drug-gene relationships based on their descriptions in Medline, revealing unexpected structure that challenges current notions about how these relationships are expressed in text. For instance, we learn that newer experimental findings are described in consistently different ways than established knowledge, and that seemingly pure classes of relationships can exhibit interesting chimeric structure. The EBC algorithm is flexible and adaptable to a wide range of problems in biomedical text mining.
Virtually all important biomedical knowledge is described in the published research literature, but Medline currently contains over 23 million articles and is growing at the rate of several hundred thousand new articles each year. In this environment, we need computational algorithms that can efficiently extract, aggregate, annotate and store information from the raw text. Because authors describe their results using natural language, descriptions of similar phenomena vary considerably with respect to both word choice and sentence structure. Any algorithm capable of mining the biomedical literature on a large scale must be able to overcome these differences and recognize when two different-looking statements are saying the same thing. Here we describe a novel algorithm, Ensemble Biclustering for Classification (EBC), that learns the structure of drug-gene relationships automatically from the unstructured text of biomedical research abstracts. By applying EBC to the entirety of Medline, we learn from the structure of the text itself approximately 20 key ways that drugs and genes can interact, discover new facts for two biomedical knowledge bases, and reveal rich and unexpected structure in how scientists describe drug-gene relationships.
Collapse
Affiliation(s)
- Bethany Percha
- Biomedical Informatics Training Program, Stanford University, Stanford, California, United States of America
| | - Russ B. Altman
- Departments of Medicine, Genetics and Bioengineering, Stanford University, Stanford, California, United States of America
- * E-mail:
| |
Collapse
|
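The core intuition behind EBC-style approaches — that textual patterns connecting similar sets of drug-gene pairs tend to express the same kind of relationship — can be illustrated with a toy sketch. This is not the authors' actual biclustering algorithm; the dependency-path strings, drug-gene pairs, and similarity threshold below are all hypothetical, chosen only to make the grouping idea concrete.

```python
# Toy illustration: group dependency paths by the overlap of the
# drug-gene pairs they co-occur with (illustrative data, not EBC itself).

# Each path maps to the set of (drug, gene) pairs it was observed with.
paths = {
    "DRUG inhibits GENE":          {("warfarin", "VKORC1"), ("imatinib", "ABL1")},
    "DRUG blocks GENE activity":   {("warfarin", "VKORC1"), ("imatinib", "ABL1")},
    "GENE metabolizes DRUG":       {("warfarin", "CYP2C9"), ("codeine", "CYP2D6")},
    "DRUG is a substrate of GENE": {("codeine", "CYP2D6")},
}

def jaccard(a, b):
    """Jaccard similarity between two sets of drug-gene pairs."""
    return len(a & b) / len(a | b)

def group_paths(paths, threshold=0.3):
    """Greedy single-link grouping: a path joins an existing group if it
    is similar enough to any member; otherwise it starts a new group."""
    groups = []
    for name, pairs in paths.items():
        for group in groups:
            if any(jaccard(pairs, paths[m]) >= threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

groups = group_paths(paths)
for g in groups:
    print(g)
```

With these toy inputs the two "inhibition"-flavored paths fall into one group and the two "metabolism"-flavored paths into another, despite having no words in common — which is the recognition problem the abstract describes, solved here only in miniature.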
11
|
Knowledge retrieval from PubMed abstracts and electronic medical records with the Multiple Sclerosis Ontology. PLoS One 2015; 10:e0116718. [PMID: 25665127 PMCID: PMC4321837 DOI: 10.1371/journal.pone.0116718] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2014] [Accepted: 12/13/2014] [Indexed: 12/03/2022] Open
Abstract
Background In order to retrieve useful information from scientific literature and electronic medical records (EMR) we developed an ontology specific for Multiple Sclerosis (MS). Methods The MS Ontology was created using scientific literature and expert review under the Protégé OWL environment. We developed a dictionary with semantic synonyms and translations to different languages for mining EMR. The MS Ontology was integrated with other ontologies and dictionaries (diseases/comorbidities, gene/protein, pathways, drug) into the text-mining tool SCAIView. We analyzed the EMRs from 624 patients with MS using the MS Ontology dictionary in order to identify drug usage and comorbidities in MS. Testing competency questions and functional evaluation using F statistics further validated the usefulness of the MS Ontology. Results Validation of the lexicalized ontology by means of named entity recognition-based methods showed an adequate performance (F score = 0.73). The MS Ontology retrieved 80% of the genes associated with MS from scientific abstracts and identified additional pathways targeted by approved disease-modifying drugs (e.g. apoptosis pathways associated with mitoxantrone, rituximab and fingolimod). The analysis of the EMR from patients with MS identified current usage of disease modifying drugs and symptomatic therapy as well as comorbidities, which are in agreement with recent reports. Conclusion The MS Ontology provides a semantic framework that is able to automatically extract information from both scientific literature and EMR from patients with MS, revealing new pathogenesis insights as well as new clinical information.
Collapse
|
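An F score like the 0.73 reported above is the harmonic mean of precision and recall over the recognizer's output. A minimal computation, using hypothetical true-positive/false-positive/false-negative counts chosen only to illustrate the arithmetic:

```python
# Precision, recall, and F1 for a named-entity-recognition evaluation.
# The counts below are hypothetical, picked to yield F1 = 0.73.
tp, fp, fn = 146, 54, 54   # true positives, false positives, false negatives

precision = tp / (tp + fp)   # fraction of predicted entities that are correct
recall = tp / (tp + fn)      # fraction of gold entities that were found
f1 = 2 * precision * recall / (precision + recall)

print(round(f1, 2))
```

Because F1 is a harmonic mean, it is pulled toward the weaker of the two components: a system with high precision but poor recall (or vice versa) cannot score well.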
12
|
Papoulias C, Robotham D, Drake G, Rose D, Wykes T. Staff and service users' views on a 'Consent for Contact' research register within psychosis services: a qualitative study. BMC Psychiatry 2014; 14:377. [PMID: 25539869 PMCID: PMC4296527 DOI: 10.1186/s12888-014-0377-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/01/2014] [Accepted: 12/19/2014] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Recruitment to mental health research can be challenging. 'Consent for Contact' (C4C) is a novel framework which may expedite recruitment and contribute to equitable access to research. This paper discusses stakeholder perspectives on using a C4C model in services for people with psychosis. METHOD This is a cross-sectional study investigating the views of service users and staff using qualitative methods. Eight focus groups were recruited: five with service users (n = 26) and three with clinicians (n = 17). Purposive sampling was applied in order to reflect the local population in terms of ethnicity, experience of psychiatric services and attitudes towards research. RESULTS Staff and service users alike associated the principle of 'consent for contact' with greater service user autonomy and favourable conditions for research recruitment. Fears around coercion and inappropriate uses of clinical records were common and most marked in service users identifying as having a negative view of research participation. Staff working in inpatient services reported that consenting for future contact might contribute to paranoid ideation. All groups agreed that implementation should highlight safeguards and the opt-in nature of the register. CONCLUSIONS Staff and service users responded positively to C4C. Clinicians explaining C4C to service users should allay anxieties around coercion, degree of commitment, and use of records. For some service users, researcher access to records is likely to be the most challenging aspect of the consultation.
Collapse
Affiliation(s)
- Constantina Papoulias
- Department of Psychology, Institute of Psychiatry, King's College London, London, UK.
| | - Dan Robotham
- Department of Psychology, Institute of Psychiatry, King's College London, London, UK.
| | - Gareth Drake
- Department of Clinical, Educational & Health Psychology, University College London, London, UK.
| | - Diana Rose
- Service User Research Enterprise, Institute of Psychiatry, King's College London, London, UK.
| | - Til Wykes
- Department of Psychology, Institute of Psychiatry, King's College London, London, UK; Service User Research Enterprise, Institute of Psychiatry, King's College London, London, UK.
| |
Collapse
|
13
|
Sim I, Tu SW, Carini S, Lehmann HP, Pollock BH, Peleg M, Wittkowski KM. The Ontology of Clinical Research (OCRe): an informatics foundation for the science of clinical research. J Biomed Inform 2013; 52:78-91. [PMID: 24239612 DOI: 10.1016/j.jbi.2013.11.002] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/10/2013] [Revised: 10/11/2013] [Accepted: 11/03/2013] [Indexed: 11/25/2022]
Abstract
To date, the scientific process for generating, interpreting, and applying knowledge has received less informatics attention than operational processes for conducting clinical studies. The activities of these scientific processes - the science of clinical research - are centered on the study protocol, which is the abstract representation of the scientific design of a clinical study. The Ontology of Clinical Research (OCRe) is an OWL 2 model of the entities and relationships of study design protocols for the purpose of computationally supporting the design and analysis of human studies. OCRe's modeling is independent of any specific study design or clinical domain. It includes a study design typology and a specialized module called ERGO Annotation for capturing the meaning of eligibility criteria. In this paper, we describe the key informatics use cases of each phase of a study's scientific lifecycle, present OCRe and the principles behind its modeling, and describe applications of OCRe and associated technologies to a range of clinical research use cases. OCRe captures the central semantics that underlies the scientific processes of clinical research and can serve as an informatics foundation for supporting the entire range of knowledge activities that constitute the science of clinical research.
Collapse
Affiliation(s)
- Ida Sim
- Department of Medicine, University of California, San Francisco, CA, United States.
| | - Samson W Tu
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, United States
| | - Simona Carini
- Department of Medicine, University of California, San Francisco, CA, United States
| | - Harold P Lehmann
- Division of Health Sciences Informatics, Johns Hopkins University, Baltimore, MD, United States
| | - Brad H Pollock
- Department of Epidemiology and Biostatistics, University of Texas Health Science Center at San Antonio, San Antonio, TX, United States
| | - Mor Peleg
- Department of Information Systems, University of Haifa, Haifa, Israel
| | - Knut M Wittkowski
- Department of Research Design and Biostatistics, The Rockefeller University, New York, NY, United States
| |
Collapse
|
14
|
McDonald CJ, Vreeman DJ, Abhyankar S. Comment on "time to integrate clinical and research informatics". Sci Transl Med 2013; 5:179le1. [PMID: 23552367 DOI: 10.1126/scitranslmed.3005700] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
Abstract
The same code standards should be used in both research and clinical care to facilitate data integration across domains.
Collapse
|
15
|
Witt CM. Clinical research on traditional drugs and food items--the potential of comparative effectiveness research for interdisciplinary research. JOURNAL OF ETHNOPHARMACOLOGY 2013; 147:254-258. [PMID: 23458921 DOI: 10.1016/j.jep.2013.02.024] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/22/2012] [Revised: 02/14/2013] [Accepted: 02/16/2013] [Indexed: 06/01/2023]
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE In the traditional context, herbs are often used as herbal whole system therapies; however, most clinical trials included highly selected patients and applied standardized treatment protocols with the aim to exclude as much bias as possible. These studies have contributed important information on the efficacy of herbal medicine extracts; however, their results are only marginally helpful to understand the value of herbal medicine and food items in a more traditional usual care context. METHODS The new development of comparative effectiveness research (CER) will be introduced and synergies with ethnopharmacology will be outlined. RESULTS CER provides great opportunities for guiding researchers and clinicians in improving management of disease. CER compares two or more health interventions in order to determine which of these options works best for which types of patients in settings that are similar to those in which the intervention will be used in practice. CER uses a broad spectrum of methodologies including randomized pragmatic trials that can also be applied to herbal whole system therapies. Ethnopharmacological research can provide highly relevant information for CER including data on characteristics of typical patients as well as traditional usage including methods of collection, extraction, and preparation. Recommendations for future research on traditional herbal medicine and food items are (1) a systematic cooperation between ethnopharmacology and clinical researchers and (2) a call for more CER on traditional herbal medicines and food items. CONCLUSION Multiple stakeholders, including ethnopharmacologists, should cooperate to identify relevant study questions as well as share their knowledge to determine the optimal placement of a clinical trial in the efficacy-effectiveness-continuum.
Collapse
Affiliation(s)
- Claudia M Witt
- Institute for Social Medicine, Epidemiology and Health Economics, Charité-Universitätsmedizin Berlin, Berlin, Germany.
| |
Collapse
|
16
|
Katzan IL, Rudick RA. Author Response to Comment on “Time to Integrate Clinical and Research Informatics”. Sci Transl Med 2013; 5:179lr1. [DOI: 10.1126/scitranslmed.3006031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/02/2022]
Abstract
Lack of structured clinical data limits research potential of EHRs, and efforts to establish clinical data standards should be a priority.
Collapse
Affiliation(s)
- Irene L. Katzan
- Neurological Institute, Cleveland Clinic, Cleveland, OH 44195, USA
| | | |
Collapse
|