Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Krauthammer M, Rzhetsky A, Morozov P, Friedman C. Using BLAST for identifying gene and protein names in journal articles. Gene 2000;259:245-52. [PMID: 11163982 DOI: 10.1016/s0378-1119(00)00431-5] [Citation(s) in RCA: 80] [Impact Index Per Article: 3.3] [Indexed: 10/18/2022]

Cited by Other Article(s)

Fu L, Weng Z, Zhang J, Xie H, Cao Y. MMBERT: a unified framework for biomedical named entity recognition. Med Biol Eng Comput 2024;62:327-341. [PMID: 37833517 DOI: 10.1007/s11517-023-02934-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Accepted: 09/07/2023] [Indexed: 10/15/2023]

Groza T, Wu H, Dinger ME, Danis D, Hilton C, Bagley A, Davids JR, Luo L, Lu Z, Robinson PN. Term-BLAST-like alignment tool for concept recognition in noisy clinical texts. Bioinformatics 2023;39:btad716. [PMID: 38001031 PMCID: PMC10710372 DOI: 10.1093/bioinformatics/btad716] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 10/20/2023] [Accepted: 11/23/2023] [Indexed: 11/26/2023]

Tsujimura T, Miwa M, Sasaki Y. Large-scale neural biomedical entity linking with layer overwriting. J Biomed Inform 2023:104433. [PMID: 37385326 DOI: 10.1016/j.jbi.2023.104433] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/16/2023] [Revised: 06/10/2023] [Accepted: 06/19/2023] [Indexed: 07/01/2023]

Abstract

MOTIVATION

Entity linking is the task of linking entity mentions to the database entries corresponding to the entity mentions. Entity linking enables the treatment of superficially different but semantically identical mentions as the same entity. Since millions of concepts are listed in biomedical databases, selecting the correct database entry for each targeted entity is challenging. Simple string matching between the word and each synonym in biomedical databases is insufficient to handle a wide variety of variants of biomedical entities appearing in the biomedical literature. Recent progress in neural approaches is promising for entity linking. Still, existing neural methods require sufficient data, which is difficult to prepare in biomedical entity linking that deals with millions of biomedical concepts. Therefore, we need to develop a new neural method to train entity-linking models over the sparse training data covering a very limited part of the biomedical concepts.

RESULTS

We have devised a pure neural model that classifies biomedical entity mentions into millions of biomedical concepts. The classifier employs (1) the layer overwriting that breaks through the performance ceiling during training, (2) training data augmentation using database entries that compensate for the problem of insufficient training data, and (3) the cosine similarity-based loss function that helps distinguish the millions of biomedical concepts. Our system using the proposed classifier was ranked first in the official run of the National NLP Clinical Challenges (n2c2) 2019 Track 3, which targeted linking medical/clinical entity mentions to 434,056 Concept Unique Identifier (CUI) entries. We also applied our system to the MedMentions dataset, which has 3.2M candidate concepts. Experimental results confirmed the same advantages of our proposed method. We further evaluated our system on the NLM-CHEM corpus with 350K candidate concepts, and our system achieved a new state-of-the-art performance on the corpus.

AVAILABILITY

https://github.com/tti-coin/bio-linking Contact: makoto.miwa@toyota-ti.ac.jp.

Tharmakulasingam M, Gardner B, La Ragione R, Fernando A. Rectified Classifier Chains for Prediction of Antibiotic Resistance From Multi-Labelled Data With Missing Labels. IEEE/ACM Trans Comput Biol Bioinform 2023;20:625-636. [PMID: 35130168 DOI: 10.1109/tcbb.2022.3148577] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]

Zheng X, Du H, Luo X, Tong F, Song W, Zhao D. BioByGANS: biomedical named entity recognition by fusing contextual and syntactic features through graph attention network in node classification framework. BMC Bioinformatics 2022;23:501. [PMID: 36418937 PMCID: PMC9682683 DOI: 10.1186/s12859-022-05051-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/05/2022] [Accepted: 11/10/2022] [Indexed: 11/24/2022]

Abstract

BACKGROUND

Automatic and accurate recognition of various biomedical named entities from literature is an important task of biomedical text mining, which is the foundation of extracting biomedical knowledge from unstructured texts into structured formats. Using the sequence labeling framework and deep neural networks to implement biomedical named entity recognition (BioNER) is a common method at present. However, this method often underutilizes syntactic features such as the dependencies and topology of sentences. Integrating semantic and syntactic features into the BioNER model is therefore an urgent problem.

RESULTS

In this paper, we propose a novel biomedical named entity recognition model, named BioByGANS (BioBERT/SpaCy-Graph Attention Network-Softmax), which uses a graph to model the dependencies and topology of a sentence and formulates the BioNER task as a node classification problem. This formulation can introduce more topological features of language and is no longer concerned only with the distance between words in the sequence. First, we use periods to segment sentences and spaces and symbols to segment words. Second, contextual features are encoded by BioBERT, and syntactic features such as parts of speech, dependencies and topology are preprocessed by SpaCy. A graph attention network is then used to generate a fused representation considering both the contextual features and syntactic features. Last, a softmax function is used to calculate the probabilities and get the results. We conduct experiments on 8 benchmark datasets, and our proposed model outperforms existing BioNER state-of-the-art methods on the BC2GM, JNLPBA, BC4CHEMD, BC5CDR-chem, BC5CDR-disease, NCBI-disease, Species-800, and LINNAEUS datasets, and achieves F1-scores of 85.15%, 78.16%, 92.97%, 94.74%, 87.74%, 91.57%, 75.01%, 90.99%, respectively.

CONCLUSION

The experimental results on 8 biomedical benchmark datasets demonstrate the effectiveness of our model, and indicate that formulating the BioNER task into a node classification problem and combining syntactic features into the graph attention networks can significantly improve model performance.

Allaoui H, Rached N, Marrakchi N, Cherif A, Mosbah A, Messadi E. In Silico Study of the Mechanisms Underlying the Action of the Snake Natriuretic-Like Peptide Lebetin 2 during Cardiac Ischemia. Toxins (Basel) 2022;14. [PMID: 36422961 DOI: 10.3390/toxins14110787] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/18/2022] [Revised: 11/07/2022] [Accepted: 11/09/2022] [Indexed: 11/16/2022]

Lücking A, Driller C, Stoeckel M, Abrami G, Pachzelt A, Mehler A. Multiple annotation for biodiversity: developing an annotation framework among biology, linguistics and text technology. Lang Resour Eval 2021. [DOI: 10.1007/s10579-021-09553-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]

Johnson NA, Smith CH. Novel Molecular Resources to Facilitate Future Genetics Research on Freshwater Mussels (Bivalvia: Unionidae). Data 2020;5:65. [DOI: 10.3390/data5030065] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]

Wang Y, Xu J, Kong L, Liu T, Yi L, Wang H, Huang WE, Zheng C. Raman-deuterium isotope probing to study metabolic activities of single bacterial cells in human intestinal microbiota. Microb Biotechnol 2019;13:572-583. [PMID: 31821744 PMCID: PMC7017835 DOI: 10.1111/1751-7915.13519] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 09/16/2019] [Accepted: 11/15/2019] [Indexed: 12/22/2022]

Labbé C, Grima N, Gautier T, Favier B, Byrne JA. Semi-automated fact-checking of nucleotide sequence reagents in biomedical research publications: The Seek & Blastn tool. PLoS One 2019;14:e0213266. [PMID: 30822319 PMCID: PMC6396917 DOI: 10.1371/journal.pone.0213266] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 07/12/2018] [Accepted: 02/18/2019] [Indexed: 12/14/2022]

Wu H, Lu D, Hyder M, Zhang S, Quinney SK, Desta Z, Li L. DrugMetab: An Integrated Machine Learning and Lexicon Mapping Named Entity Recognition Method for Drug Metabolite. CPT Pharmacometrics Syst Pharmacol 2018;7:709-717. [PMID: 30033622 PMCID: PMC6263660 DOI: 10.1002/psp4.12340] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/22/2018] [Accepted: 06/25/2018] [Indexed: 11/29/2022]

Xing W, Yuan X, Li L, Hu L, Peng J. Phenotype Extraction Based on Word Embedding to Sentence Embedding Cascaded Approach. IEEE Trans Nanobioscience 2018;17:172-180. [PMID: 29994536 DOI: 10.1109/tnb.2018.2838137] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/06/2022]

Sumathipala S, Yamada K, Unehara M, Suzuki I. Protein Entity Name Recognition Using Orthographic, Morphological and Proteinhood Features. J Adv Comput Intell Intell Inform 2015. [DOI: 10.20965/jaciii.2015.p0843] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022]

Rzhetsky A, Foster JG, Foster IT, Evans JA. Choosing experiments to accelerate collective discovery. Proc Natl Acad Sci U S A 2015;112:14569-74. [PMID: 26554009 DOI: 10.1073/pnas.1509757112] [Citation(s) in RCA: 72] [Impact Index Per Article: 8.0] [Indexed: 11/18/2022]

Sheth BP, Thaker VS. Identification of a Herbal Powder by Deoxyribonucleic Acid Barcoding and Structural Analyses. Pharmacogn Mag 2015;11:S570-4. [PMID: 27013796 PMCID: PMC4787090 DOI: 10.4103/0973-1296.172963] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/04/2022]

Abstract

BACKGROUND

Authentic identification of plants is essential both for exploiting their medicinal properties and for stopping adulteration and malpractice in their trade.

OBJECTIVE

To identify a herbal powder obtained from a herbalist in the local vicinity of Rajkot, Gujarat, using deoxyribonucleic acid (DNA) barcoding and molecular tools.

MATERIALS AND METHODS

DNA was extracted from a herbal powder and selected Cassia species, followed by polymerase chain reaction (PCR) and sequencing of the rbcL barcode locus. Thereafter, the sequences were subjected to National Center for Biotechnology Information (NCBI) basic local alignment search tool (BLAST) analysis, followed by determination of the three-dimensional structure of the rbcL protein from the herbal powder and from the Cassia species, namely Cassia fistula, Cassia tora and Cassia javanica (sequences obtained in the present study), and Cassia roxburghii and Cassia abbreviata (sequences retrieved from Genbank). Further, multiple and pairwise structural alignments were carried out in order to identify the herbal powder.

RESULTS

The nucleotide sequences obtained from the selected species of Cassia were submitted to Genbank (Accession No. JX141397, JX141405, JX141420). The NCBI BLAST analysis of the rbcL protein from the herbal powder showed an equal sequence similarity (with reference to different parameters like E value, maximum identity, total score, query coverage) to C. javanica and C. roxburghii. In order to solve the ambiguities of the BLAST result, a protein structural approach was implemented. The protein homology models obtained in the present study were submitted to the protein model database (PM0079748-PM0079753). The pairwise structural alignment of the herbal powder (as template) and C. javanica and C. roxburghii (as targets individually) revealed a close similarity of the herbal powder with C. javanica.

CONCLUSION

The strategy used here, which integrates DNA barcoding and protein structural analyses, could be adopted as a novel, rapid and economical procedure, especially when protein-coding loci are considered.

SUMMARY

Authentic identification of plants is essential both for exploiting their medicinal properties and for stopping adulteration and malpractice in their trade. A herbal powder was obtained from a herbalist in the local vicinity of Rajkot, Gujarat. An integrated approach using DNA barcoding and structural analyses was carried out to identify the herbal powder. The herbal powder was identified as Cassia javanica L.

Blair DR, Wang K, Nestorov S, Evans JA, Rzhetsky A. Quantifying the impact and extent of undocumented biomedical synonymy. PLoS Comput Biol 2014;10:e1003799. [PMID: 25255227 PMCID: PMC4177665 DOI: 10.1371/journal.pcbi.1003799] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/18/2013] [Accepted: 06/26/2014] [Indexed: 12/14/2022]

Abstract

Synonymous relationships among biomedical terms are extensively annotated within specialized terminologies, implying that synonymy is important for practical computational applications within this field. It remains unclear, however, whether text mining actually benefits from documented synonymy and whether existing biomedical thesauri provide adequate coverage of these linguistic relationships. In this study, we examine the impact and extent of undocumented synonymy within a very large compendium of biomedical thesauri. First, we demonstrate that missing synonymy has a significant negative impact on named entity normalization, an important problem within the field of biomedical text mining. To estimate the amount of synonymy currently missing from thesauri, we develop a probabilistic model for the construction of synonym terminologies that is capable of handling a wide range of potential biases, and we evaluate its performance using the broader domain of near-synonymy among general English words. Our model predicts that over 90% of these relationships are currently undocumented, a result that we support experimentally through “crowd-sourcing.” Finally, we apply our model to biomedical terminologies and predict that they are missing the vast majority (>90%) of the synonymous relationships they intend to document. Overall, our results expose the dramatic incompleteness of current biomedical thesauri and suggest the need for “next-generation,” high-coverage lexical terminologies.

Automated systems that extract and integrate information from the research literature have become common in biomedicine. As the same meaning can be expressed in many distinct but synonymous ways, access to comprehensive thesauri may enable such systems to maximize their performance. Here, we establish the importance of synonymy for a specific text-mining task (named-entity normalization), and we suggest that current thesauri may be woefully inadequate in their documentation of this linguistic phenomenon. To test this claim, we develop a model for estimating the amount of missing synonymy. We apply our model to both biomedical terminologies and general-English thesauri, predicting massive amounts of missing synonymy for both lexicons. Furthermore, we verify some of our predictions for the latter domain through “crowd-sourcing.” Overall, our work highlights the dramatic incompleteness of current biomedical thesauri, and to mitigate this issue, we propose the creation of “living” terminologies, which would automatically harvest undocumented synonymy and help smart machines enrich biomedicine.

Collier N, Tran MV, Le HQ, Ha QT, Oellrich A, Rebholz-Schuhmann D. Learning to recognize phenotype candidates in the auto-immune literature using SVM re-ranking. PLoS One 2013;8:e72965. [PMID: 24155869 PMCID: PMC3796529 DOI: 10.1371/journal.pone.0072965] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 02/11/2013] [Accepted: 07/15/2013] [Indexed: 11/19/2022]

Dinh D, Tamine L, Boubekeur F. Factors affecting the effectiveness of biomedical document indexing and retrieval based on terminologies. Artif Intell Med 2013;57:155-67. [DOI: 10.1016/j.artmed.2012.08.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 10/09/2011] [Revised: 08/26/2012] [Accepted: 08/30/2012] [Indexed: 11/26/2022]

García-Remesal M, García-Ruiz A, Pérez-Rey D, de la Iglesia D, Maojo V. Using nanoinformatics methods for automatically identifying relevant nanotoxicology entities from the literature. Biomed Res Int 2012;2013:410294. [PMID: 23509721 PMCID: PMC3591181 DOI: 10.1155/2013/410294] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 05/08/2012] [Revised: 07/03/2012] [Accepted: 07/10/2012] [Indexed: 01/12/2023]

Thessen AE, Cui H, Mozzherin D. Applications of natural language processing in biodiversity science. Adv Bioinformatics 2012;2012:391574. [PMID: 22685456 DOI: 10.1155/2012/391574] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 11/04/2011] [Accepted: 02/15/2012] [Indexed: 12/11/2022]

Galvez C, de Moya-Anegón F. A dictionary-based approach to normalizing gene names in one domain of knowledge from the biomedical literature. Journal of Documentation 2012. [DOI: 10.1108/00220411211200301] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/17/2022]

Ng EYK, Tay LL. Study of BLAST DNA matching toolkits. J Mech Med Biol 2011. [DOI: 10.1142/s0219519404001090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]

Garten Y, Coulet A, Altman RB. Recent progress in automatically extracting information from the pharmacogenomic literature. Pharmacogenomics 2011;11:1467-89. [PMID: 21047206 DOI: 10.2217/pgs.10.136] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Indexed: 11/21/2022]

DeSantis TZ, Keller K, Karaoz U, Alekseyenko AV, Singh NNS, Brodie EL, Pei Z, Andersen GL, Larsen N. Simrank: Rapid and sensitive general-purpose k-mer search tool. BMC Ecol 2011;11:11. [PMID: 21524302 PMCID: PMC3097142 DOI: 10.1186/1472-6785-11-11] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 07/09/2010] [Accepted: 04/27/2011] [Indexed: 02/01/2023]

Ruiz JC, D'Afonseca V, Silva A, Ali A, Pinto AC, Santos AR, Rocha AAMC, Lopes DO, Dorella FA, Pacheco LGC, Costa MP, Turk MZ, Seyffert N, Moraes PMRO, Soares SC, Almeida SS, Castro TLP, Abreu VAC, Trost E, Baumbach J, Tauch A, Schneider MPC, McCulloch J, Cerdeira LT, Ramos RTJ, Zerlotini A, Dominitini A, Resende DM, Coser EM, Oliveira LM, Pedrosa AL, Vieira CU, Guimarães CT, Bartholomeu DC, Oliveira DM, Santos FR, Rabelo ÉM, Lobo FP, Franco GR, Costa AF, Castro IM, Dias SRC, Ferro JA, Ortega JM, Paiva LV, Goulart LR, Almeida JF, Ferro MIT, Carneiro NP, Falcão PRK, Grynberg P, Teixeira SMR, Brommonschenkel S, Oliveira SC, Meyer R, Moore RJ, Miyoshi A, Oliveira GC, Azevedo V. Evidence for reductive genome evolution and lateral acquisition of virulence functions in two Corynebacterium pseudotuberculosis strains. PLoS One 2011;6:e18551. [PMID: 21533164 PMCID: PMC3078919 DOI: 10.1371/journal.pone.0018551] [Citation(s) in RCA: 69] [Impact Index Per Article: 5.3] [Received: 11/29/2010] [Accepted: 03/11/2011] [Indexed: 02/02/2023]

Abstract

Background

Corynebacterium pseudotuberculosis, a Gram-positive, facultative intracellular pathogen, is the etiologic agent of the disease known as caseous lymphadenitis (CL). CL mainly affects small ruminants, such as goats and sheep; it also causes infections in humans, though rarely. This species is distributed worldwide, but it has the most serious economic impact in Oceania, Africa and South America. Although C. pseudotuberculosis causes major health and productivity problems for livestock, little is known about the molecular basis of its pathogenicity.

Methodology and Findings

We characterized two C. pseudotuberculosis genomes (Cp1002, isolated from goats; and CpC231, isolated from sheep). Analysis of the predicted genomes showed high similarity in genomic architecture, gene content and genetic order. When C. pseudotuberculosis was compared with other Corynebacterium species, it became evident that this pathogenic species has lost numerous genes, resulting in one of the smallest genomes in the genus. Other differences that could be part of the adaptation to pathogenicity include a lower GC content, of about 52%, and a reduced gene repertoire. The C. pseudotuberculosis genome also includes seven putative pathogenicity islands, which contain several classical virulence factors, including genes for fimbrial subunits, adhesion factors, iron uptake and secreted toxins. Additionally, all of the virulence factors in the islands have characteristics that indicate horizontal transfer.

Conclusions

These particular genome characteristics of C. pseudotuberculosis, as well as its acquired virulence factors in pathogenicity islands, provide evidence of its lifestyle and of the pathogenicity pathways used by this pathogen in the infection process. All genomes cited in this study are available in the NCBI Genbank database (http://www.ncbi.nlm.nih.gov/genbank/) under accession numbers CP001809 and CP001829.

Affiliation(s)

Jerônimo C. Ruiz Research Center René Rachou, Oswaldo Cruz Foundation, Belo Horizonte, Minas Gerais, Brazil
Vívian D'Afonseca Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Artur Silva Department of Genetics, Federal University of Pará, Belém, Pará, Brazil
Amjad Ali Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Anne C. Pinto Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Anderson R. Santos Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Aryanne A. M. C. Rocha Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Débora O. Lopes Health Sciences Center, Federal University of São João Del Rei, Divinópolis, Minas Gerais, Brazil
Fernanda A. Dorella Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Luis G. C. Pacheco Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil Department of Biointeraction Sciences, Federal University of Bahia, Salvador, Bahia, Brazil
Marcília P. Costa Department of Veterinary Medicine, State University of Ceará, Fortaleza, Ceará, Brazil
Meritxell Z. Turk Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Núbia Seyffert Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Pablo M. R. O. Moraes Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Siomar C. Soares Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Sintia S. Almeida Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Thiago L. P. Castro Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Vinicius A. C. Abreu Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Eva Trost Department of Genetics, University of Bielefeld, CeBiTech, Bielefeld, Nordrhein-Westfalen, Germany
Jan Baumbach Department of Computer Science, Max-Planck-Institut für Informatik, Saarbrücken, Saarland, Germany
Andreas Tauch Department of Genetics, University of Bielefeld, CeBiTech, Bielefeld, Nordrhein-Westfalen, Germany
Maria Paula C. Schneider Department of Genetics, Federal University of Pará, Belém, Pará, Brazil
John McCulloch Department of Genetics, Federal University of Pará, Belém, Pará, Brazil
Louise T. Cerdeira Department of Genetics, Federal University of Pará, Belém, Pará, Brazil
Rommel T. J. Ramos Department of Genetics, Federal University of Pará, Belém, Pará, Brazil
Adhemar Zerlotini Research Center René Rachou, Oswaldo Cruz Foundation, Belo Horizonte, Minas Gerais, Brazil
Anderson Dominitini Research Center René Rachou, Oswaldo Cruz Foundation, Belo Horizonte, Minas Gerais, Brazil
Daniela M. Resende Research Center René Rachou, Oswaldo Cruz Foundation, Belo Horizonte, Minas Gerais, Brazil Department of Pharmaceutical Sciences, Federal University of Ouro Preto, Ouro Preto, Minas Gerais, Brazil
Elisângela M. Coser Research Center René Rachou, Oswaldo Cruz Foundation, Belo Horizonte, Minas Gerais, Brazil
Luciana M. Oliveira Department of Physics, Federal University of Ouro Preto, Ouro Preto, Minas Gerais, Brazil
André L. Pedrosa Department of Pharmaceutical Sciences, Federal University of Ouro Preto, Ouro Preto, Minas Gerais, Brazil Department of Biological Sciences, Federal University of Triangulo Mineiro, Uberaba, Minas Gerais, Brazil
Carlos U. Vieira Department of Genetics and Biochemistry, Federal University of Uberlândia, Uberlândia, Minas Gerais, Brazil
Cláudia T. Guimarães Brazilian Agricultural Research Corporation (EMBRAPA), Sete Lagoas, Minas Gerais, Brazil
Daniela C. Bartholomeu Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Diana M. Oliveira Department of Veterinary Medicine, State University of Ceará, Fortaleza, Ceará, Brazil
Fabrício R. Santos Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Élida Mara Rabelo Department of Parasitology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Francisco P. Lobo Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Glória R. Franco Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Ana Flávia Costa Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Ieso M. Castro Department of Pharmacy, Federal University of Ouro Preto, Ouro Preto, Minas Gerais, Brazil
Sílvia Regina Costa Dias Department of Parasitology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Jesus A. Ferro Department of Technology, State University of São Paulo, Jaboticabal, São Paulo, Brazil
José Miguel Ortega Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Luciano V. Paiva Department of Chemistry, Federal University of Lavras, Lavras, Minas Gerais, Brazil
Luiz R. Goulart Department of Genetics and Biochemistry, Federal University of Uberlândia, Uberlândia, Minas Gerais, Brazil
Juliana Franco Almeida Department of Genetics and Biochemistry, Federal University of Uberlândia, Uberlândia, Minas Gerais, Brazil
Maria Inês T. Ferro Department of Technology, State University of São Paulo, Jaboticabal, São Paulo, Brazil
Newton P. Carneiro Brazilian Agricultural Research Corporation (EMBRAPA), Sete Lagoas, Minas Gerais, Brazil
Paula R. K. Falcão Brazilian Agricultural Research Corporation (EMBRAPA), Campinas, São Paulo, Brazil
Priscila Grynberg Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Santuza M. R. Teixeira Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Sérgio Brommonschenkel Department of Plant Pathology, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
Sérgio C. Oliveira Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Roberto Meyer Department of Biointeraction Sciences, Federal University of Bahia, Salvador, Bahia, Brazil
Robert J. Moore CSIRO Livestock Industries, Australia
Anderson Miyoshi Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
Guilherme C. Oliveira Research Center René Rachou, Oswaldo Cruz Foundation, Belo Horizonte, Minas Gerais, Brazil; Center of Excellence in Bioinformatics, National Institute of Science and Technology, Research Center René Rachou, Oswaldo Cruz Foundation, Belo Horizonte, Minas Gerais, Brazil
Vasco Azevedo Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil

Smalheiser NR, Torvik VI. Author name disambiguation. Annu Rev Inf Sci Technol 2011. [DOI: 10.1002/aris.2009.1440430113] [Citation(s) in RCA: 130] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Khordad M, Mercer RE, Rogan P. Improving Phenotype Name Recognition. Advances in Artificial Intelligence 2011. [DOI: 10.1007/978-3-642-21043-3_30] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]

Hudson DM, Mattatall NR, Uribe E, Richards RC, Gong H, Ewart KV. Cystine-mediated oligomerization of the Atlantic salmon serum C-type lectin. Biochim Biophys Acta 2011;1814:283-9. [PMID: 21109028 DOI: 10.1016/j.bbapap.2010.11.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/16/2010] [Revised: 11/08/2010] [Accepted: 11/10/2010] [Indexed: 11/20/2022]

Cukrowska B, Motyl I, Kozáková H, Schwarzer M, Górecki RK, Klewicka E, Śliżewska K, Libudzisz Z. Probiotic Lactobacillus strains: in vitro and in vivo studies. Folia Microbiol (Praha) 2010;54:533-7. [DOI: 10.1007/s12223-009-0077-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2009] [Revised: 11/19/2009] [Indexed: 11/29/2022]

Yan D, Kang J, Liu D. Genomic analysis of the aromatic catabolic pathways from Silicibacter pomeroyi DSS-3. Ann Microbiol 2009;59:789-800. [DOI: 10.1007/bf03179225] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022] Open

Tatar S, Cicekli I. Two learning approaches for protein name extraction. J Biomed Inform 2009;42:1046-55. [DOI: 10.1016/j.jbi.2009.05.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2009] [Accepted: 05/07/2009] [Indexed: 10/20/2022]

Bobby P, Balaji S, Sathyanath V, Eapen SJ. JUZBOX: a web server for extracting biomedical words from the protein sequence. Bioinformation 2009;4:179-81. [PMID: 20461154 PMCID: PMC2859571 DOI: 10.6026/97320630004179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2009] [Revised: 07/31/2009] [Accepted: 09/11/2009] [Indexed: 11/25/2022] Open

Park J, Rosania GR, Saitou K. Tunable machine vision-based strategy for automated annotation of chemical databases. J Chem Inf Model 2009;49:1993-2001. [PMID: 19621901 PMCID: PMC2907084 DOI: 10.1021/ci900029v] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Huang KC, Geller J, Halper M, Perl Y, Xu J. Using WordNet synonym substitution to enhance UMLS source integration. Artif Intell Med 2009;46:97-109. [PMID: 19117739 PMCID: PMC2755556 DOI: 10.1016/j.artmed.2008.11.008] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2008] [Revised: 08/15/2008] [Accepted: 11/09/2008] [Indexed: 11/21/2022]

Abstract

OBJECTIVE

Synonym-substitution algorithms have been developed for the purpose of matching source vocabulary terms with existing Unified Medical Language System (UMLS) terms during the integration process. A drawback is the possible explosion in the number of newly generated (potential) synonyms, which can tax computational and expert review resources. Experiments are run using a synonym-substitution approach based on WordNet to see how constraining two methodological parameters, namely, "maximum number of substitutions per term" and "maximum term length," affects performance. Our hypothesis is that these values can be constrained rather tightly, thus greatly speeding up the methodology, without a marked decline in the additional matches produced. Furthermore, we investigate whether a limitation on only the first of the two parameters is sufficient to achieve the same results.

METHODS

A four-stage synonym-substitution methodology using WordNet is presented. A group of experiments is carried out in which the two methodological parameters "maximum number of substitutions per term" and "maximum term length" are varied. The purpose is to examine their effect on the growth in the number of potential synonyms generated and the associated loss of results. The experiments are based on the re-integration of the "Minimal Standard Terminology" (MST) into the UMLS. Synonym-substitution matches found to be inconsistent with the current content of the UMLS and thus deemed to be incorrect are further manually scrutinized as an audit of the original integration of the MST.

RESULTS

An increase of 11% in the number of "MST term/UMLS term" matches was achieved using the synonym-substitution methodology. Importantly, this result prevailed when tight threshold values (such as a maximum of two synonym substitutions per term) were imposed on the parameters. Furthermore, it was found that limiting only the "maximum number of substitutions per term" parameter was sufficient to obtain the performance enhancement. During the additional audit phase, a number of the reported mismatches were actually seen to be correct, representing an additional 10% increase in the number of matches obtained.

CONCLUSION

A synonym-substitution methodology that utilizes WordNet is a useful automated aid in UMLS source integration. Experiments showed that there was a significant speed-up but no degradation in match results when the methodology's "maximum number of substitutions per term" parameter was relatively tightly constrained. The methodology also helped to discover errors in the MST's original integration and to improve the quality of the UMLS's conceptual content.

Yoo IH, Song M. Biomedical Ontologies and Text Mining for Biomedicine and Healthcare: A Survey. J Comput Sci Eng 2008. [DOI: 10.5626/jcse.2008.2.2.109] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]

Tsuruoka Y, McNaught J, Ananiadou S. Normalizing biomedical terms by minimizing ambiguity and variability. BMC Bioinformatics 2008;9 Suppl 3:S2. [PMID: 18426547 DOI: 10.1186/1471-2105-9-S3-S2] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022] Open

Dimililer N, Varoğlu E, Altınçay H. Classifier subset selection for biomedical named entity recognition. Appl Intell 2009;31:267-82. [DOI: 10.1007/s10489-008-0124-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]

Chae JM, Oh HB, Choi SE, Cha CH, Kim MH, Jung SY. [Development of a system for extracting the information of candidate tumor markers reported in biomedical literatures]. Korean J Lab Med 2008;28:79-87. [PMID: 18309259 DOI: 10.3343/kjlm.2008.28.1.79] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open

Furlong LI, Dach H, Hofmann-Apitius M, Sanz F. OSIRISv1.2: a named entity recognition system for sequence variants of genes in biomedical literature. BMC Bioinformatics 2008;9:84. [PMID: 18251998 PMCID: PMC2277400 DOI: 10.1186/1471-2105-9-84] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2007] [Accepted: 02/05/2008] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Single Nucleotide Polymorphisms, among other types of sequence variants, constitute key elements in genetic epidemiology and pharmacogenomics. While sequence data about genetic variation is found in databases such as dbSNP, clues about the functional and phenotypic consequences of the variations are generally found in biomedical literature. The identification of the relevant documents and the extraction of the information from them are hampered by the large size of literature databases and the lack of widely accepted standard notation for biomedical entities. Thus, automatic systems for the identification of citations of allelic variants of genes in biomedical texts are required.

RESULTS

Our group has previously reported the development of OSIRIS, a system aimed at the retrieval of literature about allelic variants of genes (http://ibi.imim.es/osirisform.html). Here we describe the development of a new version of OSIRIS (OSIRISv1.2, http://ibi.imim.es/OSIRISv1.2.html) which incorporates a new entity recognition module and is built on top of a local mirror of the MEDLINE collection and HgenetInfoDB: a database that collects data on human gene sequence variations. The new entity recognition module is based on a pattern-based search algorithm for the identification of variation terms in the texts and their mapping to dbSNP identifiers. The performance of OSIRISv1.2 was evaluated on a manually annotated corpus, resulting in 99% precision, 82% recall, and an F-score of 0.89. As an example, the application of the system for collecting literature citations for the allelic variants of genes related to the diseases intracranial aneurysm and breast cancer is presented.

CONCLUSION

OSIRISv1.2 can be used to link literature references to dbSNP database entries with high accuracy, and therefore is suitable for collecting current knowledge on gene sequence variations and supporting the functional annotation of variation databases. The application of OSIRISv1.2 in combination with controlled vocabularies like MeSH provides a way to identify associations of biomedical interest, such as those that relate SNPs with diseases.

Soanes KH, Ewart KV, Mattatall NR. Recombinant production and characterization of the carbohydrate recognition domain from Atlantic salmon C-type lectin receptor C (SCLRC). Protein Expr Purif 2008;59:38-46. [PMID: 18272393 DOI: 10.1016/j.pep.2008.01.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2007] [Revised: 01/08/2008] [Accepted: 01/09/2008] [Indexed: 11/20/2022]

Quiñones KD, Su H, Marshall B, Eggers S, Chen H. User-centered evaluation of Arizona BioPathway: an information extraction, integration, and visualization system. IEEE Trans Inf Technol Biomed 2007;11:527-36. [PMID: 17912969 DOI: 10.1109/titb.2006.889706] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Tsuruoka Y, McNaught J, Tsujii J, Ananiadou S. Learning string similarity measures for gene/protein name dictionary look-up using logistic regression. Bioinformatics 2007;23:2768-74. [PMID: 17698493 DOI: 10.1093/bioinformatics/btm393] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Crasto CJ, Shepherd GM. Managing knowledge in neuroscience. Methods Mol Biol 2007;401:3-21. [PMID: 18368357 DOI: 10.1007/978-1-59745-520-6_1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]

Tulipano PK, Tao Y, Millar WS, Zanzonico P, Kolbert K, Xu H, Yu H, Chen L, Lussier YA, Friedman C. Natural language processing and visualization in the molecular imaging domain. J Biomed Inform 2006;40:270-81. [PMID: 17084109 DOI: 10.1016/j.jbi.2006.08.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2005] [Revised: 08/25/2006] [Accepted: 08/29/2006] [Indexed: 11/16/2022]

Zhou W, Torvik VI, Smalheiser NR. ADAM: another database of abbreviations in MEDLINE. Bioinformatics 2006;22:2813-8. [PMID: 16982707 DOI: 10.1093/bioinformatics/btl480] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Wren JD. A scalable machine-learning approach to recognize chemical names within large text databases. BMC Bioinformatics 2006;7 Suppl 2:S3. [PMID: 17118146 PMCID: PMC1683569 DOI: 10.1186/1471-2105-7-s2-s3] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open

Malik R, Franke L, Siebes A. Combination of text-mining algorithms increases the performance. Bioinformatics 2006;22:2151-7. [PMID: 16766558 DOI: 10.1093/bioinformatics/btl281] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Jensen LJ, Saric J, Bork P. Literature mining for the biologist: from information retrieval to biological discovery. Nat Rev Genet 2006;7:119-29. [PMID: 16418747 DOI: 10.1038/nrg1768] [Citation(s) in RCA: 356] [Impact Index Per Article: 19.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Couté Y, Burgess JA, Diaz JJ, Chichester C, Lisacek F, Greco A, Sanchez JC. Deciphering the human nucleolar proteome. Mass Spectrom Rev 2006;25:215-34. [PMID: 16211575 DOI: 10.1002/mas.20067] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]

Natarajan J, Berrar D, Hack CJ, Dubitzky W. Knowledge discovery in biology and biotechnology texts: a review of techniques, evaluation strategies, and applications. Crit Rev Biotechnol 2005;25:31-52. [PMID: 15999851 DOI: 10.1080/07388550590935571] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/24/2023]