1
Caufield JH, Liem DA, Garlid AO, Zhou Y, Watson K, Bui AAT, Wang W, Ping P. A Metadata Extraction Approach for Clinical Case Reports to Enable Advanced Understanding of Biomedical Concepts. J Vis Exp 2018. [PMID: 30295669 PMCID: PMC6235242 DOI: 10.3791/58392] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/30/2023]
Abstract
Clinical case reports (CCRs) are a valuable means of sharing observations and insights in medicine. The form of these documents varies, and their content includes descriptions of numerous, novel disease presentations and treatments. Thus far, the text data within CCRs is largely unstructured, requiring significant human and computational effort to render these data useful for in-depth analysis. In this protocol, we describe methods for identifying metadata corresponding to specific biomedical concepts frequently observed within CCRs. We provide a metadata template as a guide for document annotation, recognizing that imposing structure on CCRs may be pursued by combinations of manual and automated effort. The approach presented here is appropriate for organization of concept-related text from a large literature corpus (e.g., thousands of CCRs) but may be easily adapted to facilitate more focused tasks or small sets of reports. The resulting structured text data includes sufficient semantic context to support a variety of subsequent text analysis workflows: meta-analyses to determine how to maximize CCR detail, epidemiological studies of rare diseases, and the development of models of medical language may all be made more realizable and manageable through the use of structured text data.
Affiliation(s)
- John Harry Caufield
- The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Physiology, University of California, Los Angeles
- David A Liem
- The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Physiology, University of California, Los Angeles; Department of Medicine/Cardiology, University of California, Los Angeles
- Anders O Garlid
- The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Physiology, University of California, Los Angeles
- Yijiang Zhou
- Department of Cardiology, First Affiliated Hospital, Zhejiang University School of Medicine
- Karol Watson
- The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Medicine/Cardiology, University of California, Los Angeles
- Alex A T Bui
- The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Radiological Sciences, University of California, Los Angeles; Department of Bioengineering, University of California, Los Angeles; Scalable Analytics Institute (ScAi), University of California, Los Angeles
- Wei Wang
- The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Scalable Analytics Institute (ScAi), University of California, Los Angeles; Department of Bioinformatics, University of California, Los Angeles; Department of Computer Science, University of California, Los Angeles
- Peipei Ping
- The NIH BD2K Center of Excellence in Biomedical Computing, University of California, Los Angeles; Department of Physiology, University of California, Los Angeles; Department of Medicine/Cardiology, University of California, Los Angeles; Scalable Analytics Institute (ScAi), University of California, Los Angeles; Department of Bioinformatics, University of California, Los Angeles
2
Medical concept normalization in social media posts with recurrent neural networks. J Biomed Inform 2018; 84:93-102. [DOI: 10.1016/j.jbi.2018.06.006] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 10/30/2017] [Revised: 04/24/2018] [Accepted: 06/10/2018] [Indexed: 12/11/2022]
3
Abstract
MOTIVATION: Despite the central role of diseases in biomedical research, there have been far fewer attempts to automatically determine which diseases are mentioned in a text (the task of disease name normalization, DNorm) than for other normalization tasks in biomedical text mining research.
METHODS: In this article we introduce the first machine learning approach for DNorm, using the NCBI disease corpus and the MEDIC vocabulary, which combines MeSH® and OMIM. Our method is a high-performing and mathematically principled framework for learning similarities between mentions and concept names directly from training data. The technique is based on pairwise learning to rank, which has not previously been applied to the normalization task but has proven successful in large optimization problems for information retrieval.
RESULTS: We compare our method with several techniques based on lexical normalization and matching, MetaMap and Lucene. Our algorithm achieves 0.782 micro-averaged F-measure and 0.809 macro-averaged F-measure, an increase over the highest performing baseline method of 0.121 and 0.098, respectively.
AVAILABILITY: The source code for DNorm is available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/DNorm, along with a web-based demonstration and links to the NCBI disease corpus. Results on PubMed abstracts are available in PubTator: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/PubTator.
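The abstract above reports both a micro-averaged and a macro-averaged F-measure. As a minimal sketch of how the two differ, the Python below pools counts across documents for the micro average and averages per-document scores for the macro average. The unit of averaging (documents), the `Counts` tuple, and the example numbers are illustrative assumptions, not values or code from the DNorm evaluation.

```python
from collections import namedtuple

# Hypothetical per-document counts of true positives, false positives,
# and false negatives (illustrative only).
Counts = namedtuple("Counts", ["tp", "fp", "fn"])


def f_measure(tp, fp, fn):
    """Balanced F-measure (F1) from raw counts; 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f(per_doc):
    """Pool the counts over all documents, then compute one F-measure."""
    tp = sum(c.tp for c in per_doc)
    fp = sum(c.fp for c in per_doc)
    fn = sum(c.fn for c in per_doc)
    return f_measure(tp, fp, fn)


def macro_f(per_doc):
    """Compute an F-measure per document, then take the unweighted mean."""
    return sum(f_measure(c.tp, c.fp, c.fn) for c in per_doc) / len(per_doc)


docs = [Counts(tp=8, fp=2, fn=2), Counts(tp=1, fp=0, fn=3)]
micro = micro_f(docs)
macro = macro_f(docs)
```

With these assumed counts the micro average is dominated by the larger document, while the macro average weights both documents equally, which is why the two figures can diverge.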
Affiliation(s)
- Robert Leaman
- National Center for Biotechnology Information, 8600 Rockville Pike, Bethesda, MD 20894, USA and Department of Biomedical Informatics, Arizona State University, 13212 East Shea Blvd, Scottsdale, AZ 85259, USA
4
A computational method based on the integration of heterogeneous networks for predicting disease-gene associations. PLoS One 2011; 6:e24171. [PMID: 21912671 PMCID: PMC3166294 DOI: 10.1371/journal.pone.0024171] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 05/11/2011] [Accepted: 08/01/2011] [Indexed: 01/22/2023]
Abstract
The identification of disease-causing genes is a fundamental challenge in human health and of great importance in improving medical care, and provides a better understanding of gene functions. Recent computational approaches based on the interactions among human proteins and disease similarities have shown their power in tackling the issue. In this paper, a novel systematic and global method that integrates two heterogeneous networks for prioritizing candidate disease-causing genes is provided, based on the observation that genes causing the same or similar diseases tend to lie close to one another in a network of protein-protein interactions. In this method, the association score function between a query disease and a candidate gene is defined as the weighted sum of all the association scores between similar diseases and neighbouring genes. Moreover, the topological correlation of these two heterogeneous networks can be incorporated into the definition of the score function, and finally an iterative algorithm is designed for this issue. This method was tested with 10-fold cross-validation on all 1,126 diseases that have at least one known causal gene, and it ranked the correct gene as one of the top ten in 622 of all the 1,428 cases, significantly outperforming a state-of-the-art method called PRINCE. The results brought about by this method were applied to study three multi-factorial disorders: breast cancer, Alzheimer disease and diabetes mellitus type 2, and some suggestions of novel causal genes and candidate disease-causing subnetworks were provided for further investigation.
5
Biesecker LG. Polydactyly: how many disorders and how many genes? 2010 update. Dev Dyn 2011; 240:931-42. [PMID: 21445961 PMCID: PMC3088011 DOI: 10.1002/dvdy.22609] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Accepted: 12/07/2010] [Indexed: 01/26/2023]
Abstract
Limb development is clinically and biologically important. Polydactyly is common and caused by aberrant anterior-posterior patterning. Human disorders that include polydactyly are diverse. To facilitate an understanding of the biology of limb development, cataloging the genes that are mutated in patients with polydactyly would be useful. In 2002, I characterized human phenotypes that included polydactyly. Subsequently, many advances have occurred with refinement of clinical entities and identification of numerous genes. Here, I update human polydactyly entities by phenotype and mutated gene. This survey demonstrates phenotypes with overlapping manifestations, genetic heterogeneity, and distinct phenotypes generated from mutations in single genes. Among 310 clinical entities, 80 are associated with mutations in 99 genes. These results show that knowledge of limb patterning genetics is improving rapidly. Soon, we will have a comprehensive toolkit of genes important for limb development, which will lead to regenerative therapies for limb anomalies.
Affiliation(s)
- Leslie G Biesecker
- Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, USA.
6
Biesecker LG. An introduction to standardized clinical nomenclature for dysmorphic features: the Elements of Morphology project. BMC Med 2010; 8:56. [PMID: 20920337 PMCID: PMC2958881 DOI: 10.1186/1741-7015-8-56] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2010] [Accepted: 10/04/2010] [Indexed: 11/10/2022]
Abstract
Human structural malformations (anomalies or birth defects) have an enormous and complex range of manifestations and severity. The description of these findings can be challenging because the variation of many of the features is continuous and only some of them can be objectively assessed (that is, measured), among other factors. An international group of clinicians resolved to develop a set of terms that could be used to describe human structural malformations, under the general project name 'Elements of Morphology'. Here, the background to the project, progress to date, and the practical implementation of the terminology in research reporting is discussed.
Affiliation(s)
- Leslie G Biesecker
- Genetic Disease Research Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
7
Li Y, Patra JC. Genome-wide inferring gene-phenotype relationship by walking on the heterogeneous network. Bioinformatics 2010; 26:1219-24. [PMID: 20215462 DOI: 10.1093/bioinformatics/btq108] [Citation(s) in RCA: 245] [Impact Index Per Article: 17.5] [Indexed: 12/17/2022]
Abstract
MOTIVATION: Clinical diseases are characterized by distinct phenotypes. To identify disease genes is to elucidate the gene-phenotype relationships. Mutations in functionally related genes may result in similar phenotypes, so it is reasonable to predict disease-causing genes by integrating phenotypic data and genomic data. Some genetic diseases are genetically or phenotypically similar and may share common pathogenetic mechanisms. Identifying the relationships between diseases will facilitate a better understanding of these mechanisms.
RESULTS: In this article, we constructed a heterogeneous network by connecting the gene network and phenotype network using the phenotype-gene relationship information from the OMIM database. We extended the random walk with restart algorithm to the heterogeneous network. The algorithm prioritizes the genes and phenotypes simultaneously. We used leave-one-out cross-validation to evaluate its ability to find gene-phenotype relationships. Results showed improved performance over previous work. We also used the algorithm to disclose hidden disease associations that cannot be found by the gene network or phenotype network alone. We identified 18 hidden disease associations, most of which were supported by literature evidence.
AVAILABILITY: The MATLAB code of the program is available at http://www3.ntu.edu.sg/home/aspatra/research/Yongjin_BI2010.zip.
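The abstract above builds on the random walk with restart algorithm. The sketch below shows only the plain, single-network form of that algorithm in pure Python; it is not the authors' heterogeneous-network extension or their MATLAB code. The function name, the restart probability of 0.7, the convergence tolerance, and the four-node path graph are all assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Plain random walk with restart on one undirected graph.

    adj     -- square adjacency matrix as a list of lists (assumed input format)
    seeds   -- indices of seed nodes, e.g. known disease genes
    restart -- probability of jumping back to the seeds (assumed value)
    """
    n = len(adj)
    # Column-normalise the adjacency matrix into a transition matrix.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # Initial distribution: uniform over the seed nodes.
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        # One step: walk along an edge, or restart at a seed.
        nxt = [
            (1 - restart) * sum(trans[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        # Stop when the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical path graph 0 - 1 - 2 - 3 with node 0 as the only seed.
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(path_graph, seeds=[0])
```

In this toy graph the steady-state score falls off with distance from the seed, which is the property that makes the scores usable as a ranking of candidate nodes.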
Affiliation(s)
- Yongjin Li
- School of Computer Engineering, Nanyang Technological University, Singapore.
8
Abstract
A standardized, controlled vocabulary allows phenotypic information to be described in an unambiguous fashion in medical publications and databases. The Human Phenotype Ontology (HPO) is being developed in an effort to provide such a vocabulary. The use of an ontology to capture phenotypic information allows the use of computational algorithms that exploit semantic similarity between related phenotypic abnormalities to define phenotypic similarity metrics, which can be used to perform database searches for clinical diagnostics or as a basis for incorporating the human phenome into large-scale computational analysis of gene expression patterns and other cellular phenomena associated with human disease. The HPO is freely available at http://www.human-phenotype-ontology.org.
Affiliation(s)
- P N Robinson
- Institute for Medical Genetics, Augustenburger Platz 1, 13353 Berlin, Germany.
9
Oti M, Huynen MA, Brunner HG. The biological coherence of human phenome databases. Am J Hum Genet 2009; 85:801-8. [PMID: 20004759 DOI: 10.1016/j.ajhg.2009.10.026] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Received: 07/27/2009] [Revised: 10/15/2009] [Accepted: 10/20/2009] [Indexed: 11/28/2022]
Abstract
Disease networks are increasingly explored as a complement to networks centered around interactions between genes and proteins. The quality of disease networks is heavily dependent on the amount and quality of phenotype information in phenotype databases of human genetic diseases. We explored which aspects of phenotype database architecture and content best reflect the underlying biology of disease. We used the OMIM-based HPO, Orphanet, and POSSUM phenotype databases for this purpose and devised a biological coherence score based on the sharing of gene ontology annotation to investigate the degree to which phenotype similarity in these databases reflects related pathobiology. Our analyses support the notion that a fine-grained phenotype ontology enhances the accuracy of phenome representation. In addition, we find that the OMIM database that is most used by the human genetics community is heavily underannotated. We show that this problem can easily be overcome by simply adding data available in the POSSUM database to improve OMIM phenotype representations in the HPO. Also, we find that the use of feature frequency estimates (currently implemented only in the Orphanet database) significantly improves the quality of the phenome representation. Our data suggest that there is much to be gained by improving human phenome databases and that some of the measures needed to achieve this are relatively easy to implement. More generally, we propose that curation and more systematic annotation of human phenome databases can greatly improve the power of the phenotype for genetic disease analysis.
Affiliation(s)
- Martin Oti
- Centre for Molecular and Biomolecular Informatics, Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen Medical Centre, Geert Grooteplein 26-28, 6525 GA Nijmegen, The Netherlands
10
Systematic genotype-phenotype analysis of autism susceptibility loci implicates additional symptoms to co-occur with autism. Eur J Hum Genet 2009; 18:588-95. [PMID: 19935830 DOI: 10.1038/ejhg.2009.206] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
Abstract
Many genetic studies in autism have been performed, resulting in the identification of multiple linkage regions and cytogenetic aberrations, but little unequivocal evidence for the involvement of specific genes exists. By identifying novel symptoms in these patients, enhanced phenotyping of autistic individuals not only improves understanding and diagnosis but also helps to define biologically more homogeneous groups of patients, improving the potential to detect causative genes. Supported by recent copy number variation findings in autism, we hypothesized that for some susceptibility loci, autism resembles a contiguous gene syndrome, caused by aberrations within multiple (contiguous) genes, which jointly increases autism susceptibility. This would result in various different clinical manifestations that might be rather atypical, but that also co-occur with autism. To test this hypothesis, 13 susceptibility loci, identified through genetic linkage and cytogenetic analyses, were systematically analyzed. The Online Mendelian Inheritance in Man database was used to identify syndromes caused by mutations in the genes residing in each of these loci. Subsequent analysis of the symptoms expressed within these disorders allowed us to identify 33 symptoms (significantly more than expected, P=0.037) that were over-represented in previous reports mapping to these loci. Some of these symptoms, including seizures and craniofacial abnormalities, support our hypothesis as they are already known to co-occur with autism. These symptoms, together with ones that have not previously been described to co-occur with autism, might be considered for use as inclusion or exclusion criteria toward defining etiologically more homogeneous groups for molecular genetic studies of autism.
11
Mapping gene associations in human mitochondria using clinical disease phenotypes. PLoS Comput Biol 2009; 5:e1000374. [PMID: 19390613 PMCID: PMC2668170 DOI: 10.1371/journal.pcbi.1000374] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.5] [Received: 10/27/2008] [Accepted: 03/24/2009] [Indexed: 01/11/2023]
Abstract
Nuclear genes encode most mitochondrial proteins, and their mutations cause diverse and debilitating clinical disorders. To date, 1,200 of these mitochondrial genes have been recorded, while no standardized catalog exists of the associated clinical phenotypes. Such a catalog would be useful to develop methods to analyze human phenotypic data, to determine genotype-phenotype relations among many genes and diseases, and to support the clinical diagnosis of mitochondrial disorders. Here we establish a clinical phenotype catalog of 174 mitochondrial disease genes and study associations of diseases and genes. Phenotypic features such as clinical signs and symptoms were manually annotated from full-text medical articles and classified based on the hierarchical MeSH ontology. This classification of phenotypic features of each gene allowed for the comparison of diseases between different genes. In turn, we were then able to measure the phenotypic associations of disease genes for which we calculated a quantitative value that is based on their shared phenotypic features. The results showed that genes sharing more similar phenotypes have a stronger tendency for functional interactions, proving the usefulness of phenotype similarity values in disease gene network analysis. We then constructed a functional network of mitochondrial genes and discovered a higher connectivity for non-disease than for disease genes, and a tendency of disease genes to interact with each other. Utilizing these differences, we propose 168 candidate genes that resemble the characteristic interaction patterns of mitochondrial disease genes. Through their network associations, the candidates are further prioritized for the study of specific disorders such as optic neuropathies and Parkinson disease. 
Most mitochondrial disease phenotypes involve several clinical categories including neurologic, metabolic, and gastrointestinal disorders, which might indicate the effects of gene defects within the mitochondrial system. The accompanying knowledgebase (http://www.mitophenome.org/) supports the study of clinical diseases and associated genes. An important prerequisite for successful disease gene identification is the assessment, with minimal ambiguity, of a particular clinical trait or phenotype. Even with years of experience, recognizing and diagnosing mitochondrial diseases is still a major hurdle in clinical medicine. Computational tools supporting clinicians not only help identify affected individuals, but also guide studies of the genetic and biological causes of these disorders. In this study we dissect and categorize individual clinical features, signs, and symptoms of 174 disease genes and then identify gene similarities based on their shared phenotypic features. We demonstrate that genes sharing more similar phenotypes have a stronger tendency for functional interactions, proving the usefulness of phenotype similarity values in disease gene network analysis. Our study of a large functional network of mitochondrial genes revealed distinct properties that differentiate disease and non-disease genes. Disease genes showed a lower average total connectivity but a tendency to interact with each other; a finding that we used to predict 168 high-probability disease candidates. The accompanying knowledgebase allows for easy navigation between disease and gene information. We believe the open source format will support and encourage further research that will benefit this and other human phenome projects.
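The abstract above describes a quantitative value for gene pairs based on their shared phenotypic features, without stating the formula. As a hedged illustration only, the sketch below uses the Jaccard index (shared features divided by all features of either gene) as a stand-in measure. The gene labels, the feature sets, and the choice of Jaccard are assumptions, not the study's actual metric or data.

```python
def phenotype_similarity(features_a, features_b):
    """Jaccard index of two phenotype feature sets (assumed stand-in metric)."""
    a, b = set(features_a), set(features_b)
    union = a | b
    # Two genes with no annotated features are treated as having similarity 0.
    return len(a & b) / len(union) if union else 0.0


# Hypothetical annotations: gene label -> phenotypic features.
annotations = {
    "GENE_A": {"seizures", "optic atrophy", "lactic acidosis"},
    "GENE_B": {"optic atrophy", "lactic acidosis", "myopathy"},
    "GENE_C": {"deafness"},
}

sim_ab = phenotype_similarity(annotations["GENE_A"], annotations["GENE_B"])
sim_ac = phenotype_similarity(annotations["GENE_A"], annotations["GENE_C"])
```

A set-based measure like this ignores the MeSH hierarchy mentioned in the abstract; a hierarchy-aware score would also give credit for related but non-identical features.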
12
Girirajan S, Truong HT, Blanchard CL, Elsea SH. A functional network module for Smith-Magenis syndrome. Clin Genet 2009; 75:364-74. [PMID: 19236431 DOI: 10.1111/j.1399-0004.2008.01135.x] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 12/31/2022]
Abstract
Disorders with overlapping diagnostic features are grouped into a network module. Based on phenotypic similarities or differential diagnoses, it is possible to identify functional pathways leading to individual features. We generated a Smith-Magenis syndrome (SMS)-specific network module utilizing patient clinical data, text mining from the Online Mendelian Inheritance in Man database, and in vitro functional analysis. We tested our module by functional studies based on a hypothesis that RAI1 acts through phenotype-specific pathways involving several downstream genes, which are altered due to RAI1 haploinsufficiency. A preliminary genome-wide gene expression study was performed using microarrays on RAI1 haploinsufficient cells created by RNAi-based approximately 50% knockdown of RAI1 in HEK293T cells. The top dysregulated genes were involved in growth signaling and insulin sensitivity, neuronal differentiation, lipid biosynthesis and fat mobilization, circadian activity, behavior, renal, cardiovascular and skeletal development, gene expression, and cell-cycle regulation and recombination, reflecting the spectrum of clinical features observed in SMS. Validation using real-time quantitative reverse transcriptase polymerase chain reaction confirmed the gene expression profile of 75% of the selected genes analyzed in both HEK293T RAI1 knockdown cells and SMS lymphoblastoid cell lines. Overall, these data support a method for identifying genes and pathways responsible for individual clinical features in a complex disorder such as SMS.
Affiliation(s)
- S Girirajan
- Department of Human and Molecular Genetics, Medical College of Virginia Campus, Virginia Commonwealth University, Richmond, VA 23298, USA
13
Wu X, Liu Q, Jiang R. Align human interactome with phenome to identify causative genes and networks underlying disease families. Bioinformatics 2008; 25:98-104. [PMID: 19010805 DOI: 10.1093/bioinformatics/btn593] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.9] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Understanding the complexity in gene-phenotype relationships is vital for revealing the genetic basis of common diseases. Recent studies on the basis of the human interactome and phenome not only uncover prevalent phenotypic overlap and genetic overlap between diseases, but also reveal a modular organization of the genetic landscape of human diseases, providing new opportunities to reduce the complexity in dissecting the gene-phenotype association.
RESULTS: We provide systematic and quantitative evidence that phenotypic overlap implies genetic overlap. With these results, we perform the first heterogeneous alignment of human interactome and phenome via a network alignment technique and identify 39 disease families with corresponding causative gene networks. Finally, we propose AlignPI, an alignment-based framework to predict disease genes, and identify plausible candidates for 70 diseases. Our method scales well to the whole genome, as demonstrated by prioritizing 6154 genes across 37 chromosome regions for Crohn's disease (CD). Results are consistent with a recent meta-analysis of genome-wide association studies for CD.
AVAILABILITY: Bi-modules and disease gene predictions are freely available at http://bioinfo.au.tsinghua.edu.cn/alignpi/
Affiliation(s)
- Xuebing Wu
- MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China
14
Network-based global inference of human disease genes. Mol Syst Biol 2008; 4:189. [PMID: 18463613 PMCID: PMC2424293 DOI: 10.1038/msb.2008.27] [Citation(s) in RCA: 430] [Impact Index Per Article: 26.9] [Received: 09/18/2007] [Accepted: 03/17/2008] [Indexed: 01/04/2023]
Abstract
Deciphering the genetic basis of human diseases is an important goal of biomedical research. On the basis of the assumption that phenotypically similar diseases are caused by functionally related genes, we propose a computational framework that integrates human protein–protein interactions, disease phenotype similarities, and known gene–phenotype associations to capture the complex relationships between phenotypes and genotypes. We develop a tool named CIPHER to predict and prioritize disease genes, and we show that the global concordance between the human protein network and the phenotype network reliably predicts disease genes. Our method is applicable to genetically uncharacterized phenotypes, effective in the genome-wide scan of disease genes, and also extendable to explore gene cooperativity in complex diseases. The predicted genetic landscape of over 1000 human phenotypes reveals the global modular organization of phenotype–genotype relationships. The genome-wide prioritization of candidate genes for over 5000 human phenotypes, including those with under-characterized disease loci or even those lacking known association, is publicly released to facilitate future discovery of disease genes.
15
Phenome connections. Trends Genet 2008; 24:103-6. [DOI: 10.1016/j.tig.2007.12.005] [Citation(s) in RCA: 98] [Impact Index Per Article: 6.1] [Received: 11/07/2007] [Revised: 12/18/2007] [Accepted: 12/18/2007] [Indexed: 11/23/2022]
16
Abstract
Facial appearance can be a significant clue in the initial identification of genetic conditions, but their low incidence limits exposure during training and inhibits the development of skills in recognising the facial "gestalt" characteristic of many dysmorphic syndromes. Here we describe the potential of computer-based models of three-dimensional (3D) facial morphology to assist in dysmorphology training, in clinical diagnosis and in multidisciplinary studies of phenotype-genotype correlations.
17
Van Vooren S, Coessens B, De Moor B, Moreau Y, Vermeesch JR. Array comparative genomic hybridization and computational genome annotation in constitutional cytogenetics: suggesting candidate genes for novel submicroscopic chromosomal imbalance syndromes. Genet Med 2007; 9:642-9. [PMID: 17873653 DOI: 10.1097/gim.0b013e318145b27b] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 12/20/2022]
Abstract
Genome-wide array comparative genomic hybridization screening is uncovering pathogenic submicroscopic chromosomal imbalances in patients with developmental disorders. In those patients, imbalances appear now to be scattered across the whole genome, and most patients carry different chromosomal anomalies. Screening patients with developmental disorders can be considered a forward functional genome screen. The imbalances pinpoint the location of genes that are involved in human development. Because most imbalances encompass regions harboring multiple genes, the challenge is to (1) identify those genes responsible for the specific phenotype and (2) disentangle the role of the different genes located in an imbalanced region. In this review, we discuss novel tools and relevant databases that have recently been developed to aid this gene discovery process. Identification of the functional relevance of genes will not only deepen our understanding of human development but will, in addition, aid in the data interpretation and improve genetic counseling.
Affiliation(s)
- Steven Van Vooren
- Department of Electrotechnical Engineering, Katholieke Universiteit Leuven, Leuven, Belgium.
18
Abstract
The recent completion of the Human Genome Project has made possible a high-throughput "systems approach" for accelerating the elucidation of molecular underpinnings of human diseases, and subsequent derivation of molecular-based strategies to more effectively prevent, diagnose, and treat these diseases. Although altered phenotypes are among the most reliable manifestations of altered gene functions, research using systematic analysis of phenotype relationships to study human biology is still in its infancy. This article focuses on the emerging field of high-throughput phenotyping (HTP) phenomics research, which aims to capitalize on novel high-throughput computation and informatics technology developments to derive genomewide molecular networks of genotype-phenotype associations, or "phenomic associations." The HTP phenomics research field faces the challenge of technological research and development to generate novel tools in computation and informatics that will allow researchers to amass, access, integrate, organize, and manage phenotypic databases across species and enable genomewide analysis to associate phenotypic information with genomic data at different scales of biology. Key state-of-the-art technological advancements critical for HTP phenomics research are covered in this review. In particular, we highlight the power of computational approaches to conduct large-scale phenomics studies.
Affiliation(s)
- Yves A Lussier
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, Illinois 60637, USA.
|
19
|
Abstract
With the explosion in genomic and functional genomics information, methods for disease gene identification are rapidly evolving. Databases are now essential to the process of selecting candidate disease genes. Combining positional information with disease characteristics and functional information is the usual strategy by which candidate disease genes are selected. Enrichment for candidate disease genes, however, depends on the skills of the operating researcher. Over the past few years, a number of bioinformatics methods that enrich for the most likely candidate disease genes have been developed. Such in silico prioritisation methods may further improve by completion of datasets, by development of standardised ontologies across databases and species and, ultimately, by the integration of different strategies.
Affiliation(s)
- Marc A van Driel
- Molecular Biology Department, Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands
- Han G Brunner
- Department of Human Genetics, University Medical Centre Nijmegen, Geert Grooteplein 10, Nijmegen, The Netherlands
|
20
|
Feenstra I, Brunner HG, van Ravenswaaij CMA. Cytogenetic genotype-phenotype studies: improving genotyping, phenotyping and data storage. Cytogenet Genome Res 2006; 115:231-9. [PMID: 17124405 DOI: 10.1159/000095919] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 02/17/2006] [Accepted: 05/02/2006] [Indexed: 11/19/2022]
Abstract
High-resolution molecular cytogenetic techniques such as genomic array CGH and MLPA detect submicroscopic chromosome aberrations in patients with unexplained mental retardation. These techniques rapidly change the practice of cytogenetic testing. Additionally, these techniques may improve genotype-phenotype studies of patients with microscopically visible chromosome aberrations, such as Wolf-Hirschhorn syndrome, 18q deletion syndrome and 1p36 deletion syndrome. In order to make the most of high-resolution karyotyping, a similar accuracy of phenotyping is needed to allow researchers and clinicians to make optimal use of the recent advances. International agreements on phenotype nomenclature and the use of computerized 3D face surface models are examples of such improvements in the practice of phenotyping patients with chromosomal anomalies. The combination of high-resolution cytogenetic techniques, a comprehensive, systematic system for phenotyping and optimal data storage will facilitate advances in genotype-phenotype studies and a further deconstruction of chromosomal syndromes. As a result, critical regions or single genes can be determined to be responsible for specific features and malformations.
Affiliation(s)
- I Feenstra
- Radboud University Nijmegen Medical Centre, Department of Human Genetics, Nijmegen, The Netherlands.
|
21
|
Abstract
Evidence from many sources suggests that similar phenotypes are begotten by functionally related genes. This is most obvious in the case of genetically heterogeneous diseases such as Fanconi anemia, Bardet-Biedl or Usher syndrome, where the various genes work together in a single biological module. Such modules can be a multiprotein complex, a pathway, or a single cellular or subcellular organelle. This observation suggests a number of hypotheses about the human phenome that are now beginning to be explored. First, there is now good evidence from bioinformatic analyses that human genetic diseases can be clustered on the basis of their phenotypic similarities and that such a clustering represents true biological relationships of the genes involved. Second, one may use such phenotypic similarity to predict and then test for the contribution of apparently unrelated genes to the same functional module. This concept is now being systematically tested for several diseases. Most recently, a systematic yeast two-hybrid screen of all known genes for inherited ataxias indicated that they all form part of a single extended protein-protein interaction network. Third, one can use bioinformatics to make predictions about new genes for diseases that form part of the same phenotype cluster. This is done by starting from the known disease genes and then searching for genes that share one or more functional attributes such as gene expression pattern, coevolution, or gene ontology. Ultimately, one may expect that a modular view of disease genes should help the rapid identification of additional disease genes for multifactorial diseases once the first few contributing genes (or environmental factors) have been reliably identified.
Affiliation(s)
- M Oti
- Centre for Molecular and Biomolecular Informatics, Nijmegen Centre for Molecular Life Sciences, Radboud University Nijmegen, The Netherlands
|
22
|
Abstract
The author gives a personal account of how he was introduced to the field of clinical genetics as a student of John Opitz in Helena, MT. That process was facilitated by the study of several malformation syndromes. Particularly instructive were the approaches to the cardio-facio-cutaneous, Perlman, and FG syndromes. These three conditions are briefly revisited with a critical perspective made possible by the 20 years that have elapsed since the author first became acquainted with them.
Affiliation(s)
- Giovanni Neri
- Istituto di Genetica Medica, Facoltà di Medicina A. Gemelli, Università Cattolica del S. Cuore, Roma, Italy.
|
23
|
van Driel MA, Bruggeman J, Vriend G, Brunner HG, Leunissen JAM. A text-mining analysis of the human phenome. Eur J Hum Genet 2006; 14:535-42. [PMID: 16493445 DOI: 10.1038/sj.ejhg.5201585] [Citation(s) in RCA: 393] [Impact Index Per Article: 21.8] [Indexed: 11/09/2022]
Abstract
A number of large-scale efforts are underway to define the relationships between genes and proteins in various species. But, few attempts have been made to systematically classify all such relationships at the phenotype level. Also, it is unknown whether such a phenotype map would carry biologically meaningful information. We have used text mining to classify over 5000 human phenotypes contained in the Online Mendelian Inheritance in Man database. We find that similarity between phenotypes reflects biological modules of interacting functionally related genes. These similarities are positively correlated with a number of measures of gene function, including relatedness at the level of protein sequence, protein motifs, functional annotation, and direct protein-protein interaction. Phenotype grouping reflects the modular nature of human disease genetics. Thus, phenotype mapping may be used to predict candidate genes for diseases as well as functional relations between genes and proteins. Such predictions will further improve if a unified system of phenotype descriptors is developed. The phenotype similarity data are accessible through a web interface at http://www.cmbi.ru.nl/MimMiner/.
Affiliation(s)
- Marc A van Driel
- Centre for Molecular and Biomolecular Informatics, Radboud University Nijmegen, Toernooiveld 1, 6525ED Nijmegen, the Netherlands
|