1
Kalyan KS, Rajasekharan A, Sangeetha S. AMMU: A survey of transformer-based biomedical pretrained language models. J Biomed Inform 2021; 126:103982. [PMID: 34974190 DOI: 10.1016/j.jbi.2021.103982] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 05/14/2021] [Revised: 12/12/2021] [Accepted: 12/20/2021] [Indexed: 01/04/2023]
Abstract
Transformer-based pretrained language models (PLMs) have started a new era in modern natural language processing (NLP). These models combine the power of transformers, transfer learning, and self-supervised learning (SSL). Following the success of these models in the general domain, the biomedical research community has developed various in-domain PLMs, from BioBERT to the latest BioELECTRA and BioALBERT models. We strongly believe there is a need for a paper that provides a comprehensive survey of the various transformer-based biomedical pretrained language models (BPLMs). In this survey, we start with a brief overview of foundational concepts like self-supervised learning, the embedding layer, and transformer encoder layers. We discuss core concepts of transformer-based PLMs like pretraining methods, pretraining tasks, fine-tuning methods, and various embedding types specific to the biomedical domain. We introduce a taxonomy for transformer-based BPLMs and then discuss all the models. We discuss various challenges and present possible solutions. We conclude by highlighting some of the open issues that will drive the research community to further improve transformer-based BPLMs. The list of all the publicly available transformer-based BPLMs along with their links is provided at https://mr-nlp.github.io/posts/2021/05/transformer-based-biomedical-pretrained-language-models-list/.
2
McMurry R, Lenehan P, Awasthi S, Silvert E, Puranik A, Pawlowski C, Venkatakrishnan AJ, Anand P, Agarwal V, O'Horo JC, Gores GJ, Williams AW, Badley AD, Halamka J, Virk A, Swift MD, Carlson K, Doddahonnaiah D, Metzger A, Kayal N, Berner G, Ramudu E, Carpenter C, Wagner T, Rajasekharan A, Soundararajan V. Real-time analysis of a mass vaccination effort confirms the safety of FDA-authorized mRNA COVID-19 vaccines. Med (N Y) 2021; 2:965-978.e5. [PMID: 34230920 PMCID: PMC8248717 DOI: 10.1016/j.medj.2021.06.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 03/05/2021] [Revised: 05/04/2021] [Accepted: 06/15/2021] [Indexed: 02/04/2023]
Abstract
Background: As the coronavirus disease 2019 (COVID-19) vaccination campaign unfolds, it is important to continuously assess the real-world safety of Food and Drug Administration (FDA)-authorized vaccines. Curation of large-scale electronic health records (EHRs) enables near-real-time safety evaluations that were not previously possible.
Methods: In this retrospective study, we deployed deep neural networks over a large EHR system to automatically curate the adverse effects mentioned by physicians in over 1.2 million clinical notes between December 1, 2020 and April 20, 2021. We compared notes from 68,266 individuals who received at least one dose of BNT162b2 (n = 51,795) or mRNA-1273 (n = 16,471) to notes from 68,266 unvaccinated individuals who were matched by demographic, geographic, and clinical features.
Findings: Individuals vaccinated with BNT162b2 or mRNA-1273 had a higher rate of return to the clinic, but not the emergency department, after both doses compared to unvaccinated controls. The most frequently documented adverse effects within 7 days of each vaccine dose included myalgia, headache, and fatigue, but the rates of EHR documentation for each side effect were remarkably low compared to those derived from active solicitation during clinical trials. Severe events, including anaphylaxis, facial paralysis, and cerebral venous sinus thrombosis, were rare and occurred at similar frequencies in vaccinated and unvaccinated individuals.
Conclusions: This analysis of vaccine-related adverse effects from over 1.2 million EHR notes of more than 130,000 individuals reaffirms the safety and tolerability of the FDA-authorized mRNA COVID-19 vaccines in practice.
Funding: This study was funded by nference.
This is a study of the mRNA COVID-19 vaccines developed by Pfizer/BioNTech and Moderna. Although these vaccines have been shown to be safe and tolerated in clinical trials, it is important to confirm their safety profiles in practice. The results from this study show that individuals receiving these vaccines are likely to experience muscle and joint soreness, but they are not more likely to seek out emergent clinical care or experience severe medical events than unvaccinated individuals. As one of the largest real-world safety studies of COVID-19 vaccines to date, these data reinforce that we should continue expanding efforts to deliver more vaccines with high confidence in their safety.
Affiliation(s)
- Reid McMurry
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Patrick Lenehan
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Samir Awasthi
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Eli Silvert
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Arjun Puranik
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Colin Pawlowski
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Praveen Anand
- nference Labs, 2nd Floor, 22 3rd Cross Rd, Murgesh Pallya, Bengaluru, Karnataka 560017, India
- Vineet Agarwal
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Katie Carlson
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Anna Metzger
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Nikhil Kayal
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Gabi Berner
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Eshwan Ramudu
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Tyler Wagner
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- Venky Soundararajan
- nference, One Main Street, East Arcade, Cambridge, MA 02142, USA
- nference Labs, 2nd Floor, 22 3rd Cross Rd, Murgesh Pallya, Bengaluru, Karnataka 560017, India
3
Murugadoss K, Rajasekharan A, Malin B, Agarwal V, Bade S, Anderson JR, Ross JL, Faubion WA, Halamka JD, Soundararajan V, Ardhanari S. Building a best-in-class automated de-identification tool for electronic health records through ensemble learning. Patterns (N Y) 2021; 2:100255. [PMID: 34179842 PMCID: PMC8212138 DOI: 10.1016/j.patter.2021.100255] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/06/2021] [Revised: 02/24/2021] [Accepted: 04/07/2021] [Indexed: 10/29/2022]
Abstract
The presence of personally identifiable information (PII) in natural language portions of electronic health records (EHRs) constrains their broad reuse. Despite continuous improvements in automated detection of PII, residual identifiers require manual validation and correction. Here, we describe an automated de-identification system that employs an ensemble architecture, incorporating attention-based deep-learning models and rule-based methods, supported by heuristics for detecting PII in EHR data. Detected identifiers are then transformed into plausible, though fictional, surrogates to further obfuscate any leaked identifier. Our approach outperforms existing tools, with a recall of 0.992 and precision of 0.979 on the i2b2 2014 dataset and a recall of 0.994 and precision of 0.967 on a dataset of 10,000 notes from the Mayo Clinic. The de-identification system presented here enables the generation of de-identified patient data at the scale required for modern machine-learning applications to help accelerate medical discoveries.
Affiliation(s)
- Bradley Malin
- Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Jeff R. Anderson
- Mayo Clinic, Rochester, MN 55905, USA
- Mayo Clinic Platform, Rochester, MN 55905, USA
- John D. Halamka
- Mayo Clinic, Rochester, MN 55905, USA
- Mayo Clinic Platform, Rochester, MN 55905, USA
4
Sivakumar KK, Rajasekharan A, Rao R, Narasimhan B. Synthesis, SAR Study and Evaluation of Mannich and Schiff Bases of Pyrazol-5(4H)-one Moiety Containing 3-(Hydrazinyl)-2-phenylquinazolin-4(3H)-one. Indian J Pharm Sci 2013; 75:463-75. [PMID: 24302802 PMCID: PMC3831729 DOI: 10.4103/0250-474x.119832] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/29/2012] [Revised: 06/03/2013] [Accepted: 06/10/2013] [Indexed: 11/12/2022] Open
Abstract
In the present investigation, a series of 12 Mannich bases (QP1-12) and 5 Schiff bases (QSP1-5) of pyrazol-5(4H)-one moiety containing 3-(hydrazinyl)-2-phenylquinazolin-4(3H)-one has been synthesized and characterized by physicochemical as well as spectral means. The synthesized Mannich and Schiff bases were screened for their preliminary antimicrobial activity against Gram-positive and Gram-negative bacterial as well as fungal strains by determination of the zone of inhibition. Mannich bases (QP1-12) were found to be more potent antibacterial agents against Gram-positive bacteria, whereas Schiff bases (QSP1-5) were more potent against Gram-negative bacteria and fungi. Minimum inhibitory concentration results demonstrated that the Mannich base compound QP7, which has ortho -OH and para -COOH groups, showed some improvement in antibacterial activity (minimum inhibitory concentration of 48.88×10⁻³ μM/ml) among the tested Gram-positive organisms, and it also exhibited a minimum inhibitory concentration of 12.22×10⁻³ μM/ml for Klebsiella pneumoniae. The antitubercular activity of the synthesized compounds against Mycobacterium tuberculosis (H37Rv) was determined using the microplate alamar blue assay. Compound QP11 showed appreciable antitubercular activity (minimum inhibitory concentration of 6.49×10⁻³ μM/ml) and was more active than the standard drugs ethambutol (minimum inhibitory concentration of 7.60×10⁻³ μM/ml) and ciprofloxacin (9.4×10⁻³ μM/ml). Compounds QP11, QP9, QSP1, QSP2, and QSP5 have a good selectivity index and may be selected as lead compounds for the development of novel antitubercular agents.
Affiliation(s)
- K K Sivakumar
- Department of Pharmaceutical Chemistry, Karpagam University, Coimbatore-641 021, India
5
Gopinath R, Hanna LE, Kumaraswami V, Pillai SV, Kavitha V, Vijayasekaran V, Rajasekharan A, Nutman TB. Long-term persistence of cellular hyporesponsiveness to filarial antigens after clearance of microfilaremia. Am J Trop Med Hyg 1999; 60:848-53. [PMID: 10344663 DOI: 10.4269/ajtmh.1999.60.848] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022] Open
Abstract
The persistence of parasite-specific cellular hyporesponsiveness after clearance of blood microfilariae (mf) was studied in 18 individuals who had been treated with a single dose of ivermectin, diethylcarbamazine, or a combination 2-3 years previously and who had initially cleared their parasitemia. At recruitment into the present study, 50% were again mf+ and 50% remained mf-. There were no significant differences between the mf+ and mf- groups in the amount of interferon-gamma (IFN-gamma) produced by peripheral blood mononuclear cells in response to adult or microfilarial antigens, although IFN-gamma production in response to purified protein derivative was greater in the mf+ group (geometric mean [gm] = 3,791 pg/ml; P = 0.02) than in the mf- group (gm = 600 pg/ml). These data suggest that although microfilaremic individuals may temporarily regain the ability to produce IFN-gamma to parasite antigens post-treatment, they subsequently revert to a state of hyporesponsiveness to mf-containing antigens that appears to be independent of the recurrence of microfilaremia and the response to nonparasite antigens.
Affiliation(s)
- R Gopinath
- Helminth Immunology Section, Laboratory of Parasitic Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
6
Pushpagadan P, Rajasekharan A, Ratheeshkumar P, Jawahar C, Radhakrishnan K, Nair C, Amma LS, Bhatt Aicrpe AV. 'AMRITHAPALA' (Janakia arayalpatra, Joseph & Chandrasekharan), A NEW DRUG FROM THE KANI TRIBE OF KERALA. Anc Sci Life 1990; 9:212-4. [PMID: 22557701 PMCID: PMC3331335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/1989] [Accepted: 12/10/1989] [Indexed: 11/26/2022] Open
Abstract
Amrithapala (Janakia arayalpatra), a rare and endemic plant species found in the southern forests of the Western Ghats region of Kerala, is used by the local 'Kani' tribe as an effective remedy for peptic ulcer and cancer-like afflictions and as a rejuvenating tonic. A search of the Ayurvedic literature indicates that the plant may be the divine drug named variously as MRITHA SANJEEVINI (the drug that can revive the unconscious or dead) or SANJEEVINI, THAMPRA RASAYANI in the Oushadha Nighantu (Dictionary of Medicinal Drugs) of Tayyil Kumaran Krishnan (1906).
Affiliation(s)
- P. Pushpagadan
- Regional Research Laboratory (CSIR), Jammu Tawi-180 001, India
- A. Rajasekharan
- Regional Research Institute (Drug Research), CCRAS, Trivandrum, India
- C.R. Jawahar
- Regional Research Institute (Drug Research), CCRAS, Trivandrum, India
- K. Radhakrishnan
- Regional Research Institute (Drug Research), CCRAS, Trivandrum, India
- C.P.R. Nair
- Regional Research Institute (Drug Research), CCRAS, Trivandrum, India
- L. Sarada Amma
- Regional Research Institute (Drug Research), CCRAS, Trivandrum, India