1
|
A review from biological mapping to computation-based subcellular localization. MOLECULAR THERAPY. NUCLEIC ACIDS 2023; 32:507-521. [PMID: 37215152 PMCID: PMC10192651 DOI: 10.1016/j.omtn.2023.04.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Subcellular localization is crucial to the study of viruses and diseases. Specifically, research on protein subcellular localization can help identify clues between viruses and host cells that can aid in the design of targeted drugs. Research on RNA subcellular localization is significant for human diseases (such as Alzheimer's disease and colon cancer). To date, only reviews addressing subcellular localization of proteins have been published, which are outdated for reference, and reviews of RNA subcellular localization are not comprehensive. Therefore, we collated the most up-to-date literature on protein and RNA subcellular localization to help researchers understand changes in the field of protein and RNA subcellular localization. Extensive and complete methods for constructing subcellular localization models have also been summarized, which can help readers understand the changes in application of biotechnology and computer science in subcellular localization research and explore how to use biological data to construct improved subcellular localization models. This paper is the first review to cover both protein subcellular localization and RNA subcellular localization. We urge researchers from biology and computational biology to jointly pay attention to transformation patterns, interrelationships, differences, and causality of protein subcellular localization and RNA subcellular localization.
|
2
|
Genome-Wide Association Study of Growth Traits in a Four-Way Crossbred Pig Population. Genes (Basel) 2022; 13:genes13111990. [DOI: 10.3390/genes13111990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2022] [Revised: 10/28/2022] [Accepted: 10/28/2022] [Indexed: 11/04/2022] Open
Abstract
Growth traits are crucial economic traits in the commercial pig industry and have a substantial impact on pig production. However, the genetic mechanism of growth traits is not very clear. In this study, we performed a genome-wide association study (GWAS) based on the specific-locus amplified fragment sequencing (SLAF-seq) to analyze ten growth traits on 223 four-way intercross pigs. A total of 227,921 highly consistent single nucleotide polymorphisms (SNPs) uniformly dispersed throughout the entire genome were used to conduct GWAS. A total of 53 SNPs were identified for ten growth traits using the mixed linear model (MLM), of which 18 SNPs were located in previously reported quantitative trait loci (QTL) regions. Two novel QTLs on SSC4 and SSC7 were related to average daily gain from 30 to 60 kg (ADG30–60) and body length (BL), respectively. Furthermore, 13 candidate genes (ATP5O, GHRHR, TRIM55, EIF2AK1, PLEKHA1, BRAP, COL11A2, HMGA1, NHLRC1, SGSM1, NFATC2, MAML1, and PSD3) were found to be associated with growth traits in pigs. The GWAS findings will enhance our comprehension of the genetic architecture of growth traits. We suggested that these detected SNPs and corresponding candidate genes might provide a biological foundation for improving the growth and production performance of pigs in swine breeding.
|
3
|
A Kit Mutation Associated with Black-Eyed White Phenotype in the Grey Red-Backed Vole, Myodes rufocanus. MAMMAL STUDY 2022. [DOI: 10.3106/ms2022-0003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
|
4
|
Systems biology analysis of human genomes points to key pathways conferring spina bifida risk. Proc Natl Acad Sci U S A 2021; 118:2106844118. [PMID: 34916285 PMCID: PMC8713748 DOI: 10.1073/pnas.2106844118] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 10/20/2021] [Indexed: 12/15/2022] Open
Abstract
Genetic investigations of most structural birth defects, including spina bifida (SB), congenital heart disease, and craniofacial anomalies, have been underpowered for genome-wide association studies because of their rarity, genetic heterogeneity, incomplete penetrance, and environmental influences. Our systems biology strategy to investigate SB predisposition controls for population stratification and avoids much of the bias inherent in candidate gene searches that are pervasive in the field. We examine both protein coding and noncoding regions of whole genomes to analyze sequence variants, collapsed by gene or regulatory region, and apply machine learning, gene enrichment, and pathway analyses to elucidate molecular pathways and genes contributing to human SB. Spina bifida (SB) is a debilitating birth defect caused by multiple gene and environment interactions. Though SB shows non-Mendelian inheritance, genetic factors contribute to an estimated 70% of cases. Nevertheless, identifying human mutations conferring SB risk is challenging due to its relative rarity, genetic heterogeneity, incomplete penetrance, and environmental influences that hamper genome-wide association studies approaches to untargeted discovery. Thus, SB genetic studies may suffer from population substructure and/or selection bias introduced by typical candidate gene searches. We report a population based, ancestry-matched whole-genome sequence analysis of SB genetic predisposition using a systems biology strategy to interrogate 298 case-control subject genomes (149 pairs). Genes that were enriched in likely gene disrupting (LGD), rare protein-coding variants were subjected to machine learning analysis to identify genes in which LGD variants occur with a different frequency in cases versus controls and so discriminate between these groups. 
Those genes with high discriminatory potential for SB significantly enriched pathways pertaining to carbon metabolism, inflammation, innate immunity, cytoskeletal regulation, and essential transcriptional regulation consistent with their having impact on the pathogenesis of human SB. Additionally, an interrogation of conserved noncoding sequences identified robust variant enrichment in regulatory regions of several transcription factors critical to embryonic development. This genome-wide perspective offers an effective approach to the interrogation of coding and noncoding sequence variant contributions to rare complex genetic disorders.
|
5
|
Unraveling the complex genetics of neural tube defects: From biological models to human genomics and back. Genesis 2021; 59:e23459. [PMID: 34713546 DOI: 10.1002/dvg.23459] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/14/2021] [Revised: 09/08/2021] [Accepted: 09/17/2021] [Indexed: 12/11/2022]
Abstract
Neural tube defects (NTDs) are a classic example of preventable birth defects for which there is a proven-effective intervention, folic acid (FA); however, further methods of prevention remain unrealized. In the decades following implementation of FA nutritional fortification programs throughout at least 87 nations, it has become apparent that not all NTDs can be prevented by FA. In the United States, FA fortification only reduced NTD rates by 28-35% (Williams et al., 2015). As such, it is imperative that further work is performed to understand the risk factors associated with NTDs and their underlying mechanisms so that alternative prevention strategies can be developed. However, this is complicated by the sheer number of genes associated with neural tube development, the heterogeneity of observable phenotypes in human cases, the rareness of the disease, and the myriad of environmental factors associated with NTD risk. Given the complex genetic architecture underlying NTD pathology and the way in which that architecture interacts dynamically with environmental factors, further prevention initiatives will undoubtedly require precision medicine strategies that utilize the power of human genomics and modern tools for assessing genetic risk factors. Herein, we review recent advances in genomic strategies for discovering genetic variants associated with these defects, and new ways in which biological models, such as mice and cell culture-derived organoids, are leveraged to assess mechanistic functionality, the way these variants interact with other genetic or environmental factors, and their ultimate contribution to human NTD risk.
|
6
|
Genomic Characteristics and Selection Signatures in Indigenous Chongming White Goat (Capra hircus). Front Genet 2020; 11:901. [PMID: 32973871 PMCID: PMC7472782 DOI: 10.3389/fgene.2020.00901] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/14/2020] [Accepted: 07/21/2020] [Indexed: 12/23/2022] Open
Abstract
The Chongming white goat (CM) is an indigenous goat breed that exhibits unique traits that are adapted to the local environment and artificial selection. By performing whole-genome re-sequencing, we generated 14–20× coverage sequences from 10 domestic goat breeds to explore the genomic characteristics and selection signatures of the CM breed. We identified a total of 23,508,551 single-nucleotide polymorphisms (SNPs) and 2,830,800 insertion–deletion mutations (indels) after read mapping and variant calling. We further specifically identified 1.2% SNPs (271,713) and 0.9% indels (24,843) unique to the CM breed in comparison with the other nine goat breeds. Missense (SIFT < 0.05), frameshift, splice-site, start-loss, stop-loss, and stop-gain variants were identified in 183 protein-coding genes of the CM breed. Of the 183, 36 genes, including AP4E1, FSHR, COL11A2, and DYSF, are involved in phenotype ontology terms related to the nervous system, short stature, and skeletal muscle morphology. Moreover, based on genome-wide FST and pooled heterozygosity (Hp) calculation, we further identified selection signature genes between the CM and the other nine goat breeds. These genes are significantly associated with the nervous system (C2CD3, DNAJB13, UCP2, ZMYND11, CEP126, SCAPER, and TSHR), growth (UCP2, UCP3, TSHR, FGFR1, ERLIN2, and ZNF703), and coat color (KITLG, ASIP, AHCY, RALY, and MC1R). Our results suggest that the CM breed may be differentiated from other goat breeds in terms of the nervous system owing to natural or artificial selection. The whole-genome analysis provides an improved understanding of genetic diversity and trait exploration for this indigenous goat breed.
|
7
|
The essentiality of drug targets: an analysis of current literature and genomic databases. Drug Discov Today 2019; 24:544-550. [DOI: 10.1016/j.drudis.2018.11.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/15/2018] [Revised: 09/18/2018] [Accepted: 11/05/2018] [Indexed: 12/14/2022]
|
8
|
Nicotinamide Mononucleotide: Exploration of Diverse Therapeutic Applications of a Potential Molecule. Biomolecules 2019; 9:biom9010034. [PMID: 30669679 PMCID: PMC6359187 DOI: 10.3390/biom9010034] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 01/04/2019] [Revised: 01/15/2019] [Accepted: 01/15/2019] [Indexed: 12/13/2022] Open
Abstract
Nicotinamide mononucleotide (NMN) is a nucleotide that is most recognized for its role as an intermediate of nicotinamide adenine dinucleotide (NAD+) biosynthesis. Although the biosynthetic pathway of NMN varies between eukaryotes and prokaryotes, two pathways are mainly followed in humans: one is the salvage pathway using nicotinamide, while the other follows phosphorylation of nicotinamide riboside. Due to the unavailability of a suitable transporter, NMN enters the mammalian cell in the form of nicotinamide riboside, followed by its subsequent conversion to NMN and NAD+. This particular molecule has demonstrated several beneficial pharmacological activities in preclinical studies, which suggest its potential therapeutic use. Mostly mediated by its involvement in NAD+ biosynthesis, the pharmacological activities of NMN include its role in cellular biochemical functions, cardioprotection, diabetes, Alzheimer's disease, and complications associated with obesity. The recent groundbreaking discovery of anti-ageing activities of this chemical moiety has added further value to the research involving this molecule. This review focuses on the biosynthesis of NMN in mammalian and prokaryotic cells and the mechanism of absorption, along with the reported pharmacological activities in murine models.
|
9
|
Complementing preclinical safety assessments through genomic analyses. CURRENT OPINION IN TOXICOLOGY 2018. [DOI: 10.1016/j.cotox.2019.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|
10
|
Threshold for neural tube defect risk by accumulated singleton loss-of-function variants. Cell Res 2018; 28:1039-1041. [PMID: 29976953 PMCID: PMC6170406 DOI: 10.1038/s41422-018-0061-3] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 01/26/2018] [Revised: 05/03/2018] [Accepted: 06/11/2018] [Indexed: 11/23/2022] Open
|
11
|
Burly1 is a mouse QTL for lean body mass that maps to a 0.8-Mb region of chromosome 2. Mamm Genome 2018; 29:325-343. [PMID: 29737391 DOI: 10.1007/s00335-018-9746-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/13/2018] [Accepted: 04/26/2018] [Indexed: 11/25/2022]
Abstract
To fine map a mouse QTL for lean body mass (Burly1), we used information from intercross, backcross, consomic, and congenic mice derived from the C57BL/6ByJ (host) and 129P3/J (donor) strains. The results from these mapping populations were concordant and showed that Burly1 is located between 151.9 and 152.7 Mb (rs33197365 to rs3700604) on mouse chromosome 2. The congenic region harboring Burly1 contains 26 protein-coding genes, 11 noncoding RNA elements (e.g., lncRNA), and 4 pseudogenes, with 1949 predicted functional variants. Of the protein-coding genes, 7 have missense variants, including genes that may contribute to lean body weight, such as Angpt41, Slc52c3, and Rem1. Lean body mass was increased by the B6-derived variant relative to the 129-derived allele. Burly1 influenced lean body weight at all ages but not food intake or locomotor activity. However, congenic mice with the B6 allele produced more heat per kilogram of lean body weight than did controls, pointing to a genotype effect on lean mass metabolism. These results show the value of integrating information from several mapping populations to refine the map location of body composition QTLs and to identify a short list of candidate genes.
|
12
|
Adiposity QTL Adip20 decomposes into at least four loci when dissected using congenic strains. PLoS One 2017; 12:e0188972. [PMID: 29194435 PMCID: PMC5711020 DOI: 10.1371/journal.pone.0188972] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 02/14/2017] [Accepted: 11/16/2017] [Indexed: 01/03/2023] Open
Abstract
An average mouse in midlife weighs between 25 and 30 g, with about a gram of tissue in the largest adipose depot (gonadal), and the weight of this depot differs between inbred strains. Specifically, C57BL/6ByJ mice have heavier gonadal depots on average than do 129P3/J mice. To understand the genetic contributions to this trait, we mapped several quantitative trait loci (QTLs) for gonadal depot weight in an F2 intercross population. Our goal here was to fine-map one of these QTLs, Adip20 (formerly Adip5), on mouse chromosome 9. To that end, we analyzed the weight of the gonadal adipose depot from newly created congenic strains. Results from the sequential comparison method indicated at least four rather than one QTL; two of the QTLs were less than 0.5 Mb apart, with opposing directions of allelic effect. Different types of evidence (missense and regulatory genetic variation, human adiposity/body mass index orthologues, and differential gene expression) implicated numerous candidate genes from the four QTL regions. These results highlight the value of mouse congenic strains and the value of this sequential method to dissect challenging genetic architecture.
|
13
|
Bioinformatics Resources for Cancer Research with an Emphasis on Gene Function and Structure Prediction Tools. Cancer Inform 2017. [DOI: 10.1177/117693510600200020] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/19/2023] Open
Abstract
The immensely popular fields of cancer research and bioinformatics overlap in many different areas, e.g. large data repositories that allow users to analyze data from many experiments (data handling, databases), pattern mining, microarray data analysis, and interpretation of proteomics data. There are many newly available resources in these areas that may be unfamiliar to most cancer researchers wanting to incorporate bioinformatics tools and analyses into their work, and also to bioinformaticians looking for real data to develop and test algorithms. This review reveals the interdependence of cancer research and bioinformatics, and highlights the most appropriate and useful resources available to cancer researchers. These include not only public databases, but general and specific bioinformatics tools which can be useful to the cancer researcher. The primary foci are function and structure prediction tools of protein genes. The result is a useful reference to cancer researchers and bioinformaticians studying cancer alike.
|
14
|
Increased burden of deleterious variants in essential genes in autism spectrum disorder. Proc Natl Acad Sci U S A 2016; 113:15054-15059. [PMID: 27956632 DOI: 10.1073/pnas.1613195113] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 12/21/2022] Open
Abstract
Autism spectrum disorder (ASD) is a heterogeneous, highly heritable neurodevelopmental syndrome characterized by impaired social interaction, communication, and repetitive behavior. It is estimated that hundreds of genes contribute to ASD. We asked if genes with a strong effect on survival and fitness contribute to ASD risk. Human orthologs of genes with an essential role in pre- and postnatal development in the mouse [essential genes (EGs)] are enriched for disease genes and under strong purifying selection relative to human orthologs of mouse genes with a known nonlethal phenotype [nonessential genes (NEGs)]. This intolerance to deleterious mutations, commonly observed haploinsufficiency, and the importance of EGs in development suggest a possible cumulative effect of deleterious variants in EGs on complex neurodevelopmental disorders. With a comprehensive catalog of 3,915 mammalian EGs, we provide compelling evidence for a stronger contribution of EGs to ASD risk compared with NEGs. By examining the exonic de novo and inherited variants from 1,781 ASD quartet families, we show a significantly higher burden of damaging mutations in EGs in ASD probands compared with their non-ASD siblings. The analysis of EGs in the developing brain identified clusters of coexpressed EGs implicated in ASD. Finally, we suggest a high-priority list of 29 EGs with potential ASD risk as targets for future functional and behavioral studies. Overall, we show that large-scale studies of gene function in model organisms provide a powerful approach for prioritization of genes and pathogenic variants identified by sequencing studies of human disease.
|
15
|
Immunotherapy using inhibin antiserum enhanced the efficacy of equine chorionic gonadotropin on superovulation in major inbred and outbred mice strains. Theriogenology 2016; 86:1341-6. [PMID: 27242176 DOI: 10.1016/j.theriogenology.2016.04.076] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 02/03/2016] [Revised: 04/25/2016] [Accepted: 04/25/2016] [Indexed: 11/29/2022]
Abstract
Improvement of the superovulation technique will help to enhance the efficiency of embryo and animal production. Blocking inhibin using inhibin antiserum (IAS) is known to promote follicular development by increasing the level of FSH. Previously, we reported that coadministration of IAS and eCG produced more than 100 oocytes from a single female C57BL/6 mouse at 4 weeks old. The oocytes derived from the IAS + eCG (IASe) treatment were able to fertilize and develop normally into offspring. In this study, we examined the effect of IASe treatment on the numbers of ovulated oocytes in major inbred (A/J, BALB/cByJ, C3HeJ, DBA/2J, and FVB/NJ) and outbred (CD1) mouse strains at 4 weeks old. We confirmed the fertilization and developmental ability of the IASe-derived oocytes. IASe treatment yielded 1.5 to 3.2 times as many ovulated oocytes as eCG treatment alone. The fertilization rate of IASe-derived oocytes was similar to that of eCG-derived oocytes. In vitro and in vivo developmental rates of the embryos derived from IASe were similar to the rates of embryos derived from eCG. We have shown that superovulation by IASe is very effective in obtaining high numbers of ovulated oocytes from small numbers of oocyte donors in a number of mouse strains. The superovulation technique will contribute to the archiving of cryopreserved embryos of genetically engineered mice using small numbers of donors and has the potential to produce more live animals for rederivation of the archived mouse lines in mouse repositories.
|
16
|
Functional dissection of virus-human crosstalk mediated by miRNAs based on the VmiReg database. MOLECULAR BIOSYSTEMS 2016; 11:1319-28. [PMID: 25787233 DOI: 10.1039/c5mb00095e] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 01/09/2023]
Abstract
Recently, a number of viruses have been shown to encode microRNAs (miRNAs), and they play important roles in several biological processes, enhancing the intricacies of the virus-host crosstalk. However, systematically deciphering the characteristics of crosstalk mediated by viral and human miRNAs has been hampered by the lack of high-confidence targets. Here, a user-friendly platform is developed to provide experimentally validated and predicted target genes of viral miRNAs as well as their functions, named VmiReg. To explore the virus-human crosstalk mediated by miRNAs, validated human cellular targets of viral and cellular miRNAs are analyzed. As a result, target genes of viral miRNAs are prone to be silenced by human miRNAs. Two kinds of targets have globally significantly high functional similarities and are more often found simultaneously in many important biological functions, even in disease genes, particularly cancer genes, and essential genes. In addition, viral and human miRNA targets are in close proximity within the protein-protein interaction network, indicating frequent communication via physical interactions to participate in the same functions. Finally, multiple dense modules intuitively exhibit crosstalk between viral and cellular miRNAs. Furthermore, most co-regulated genes tend to be in important locations of modules. The lymphoma-related module is one of the typical examples. Our study suggests that the functional importance of cellular genes targeted by viral miRNAs and the intricate virus-host crosstalk mediated by miRNAs may be performed via the sharing of target genes or physical interactions, providing a new direction in further researching the roles of miRNAs in infection.
|
17
|
Investigation of Pathogenic Genes in Chinese sporadic Hypertrophic Cardiomyopathy Patients by Whole Exome Sequencing. Sci Rep 2015; 5:16609. [PMID: 26573135 PMCID: PMC4647833 DOI: 10.1038/srep16609] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 06/18/2015] [Accepted: 10/16/2015] [Indexed: 11/08/2022] Open
Abstract
Hypertrophic cardiomyopathy (HCM) is a cardiovascular disease with high heterogeneity. Limited knowledge concerning the genetic background of nearly 40% of HCM cases indicates there is a clear need for further investigation to explore the genetic pathogenesis of the disease. In this study, we undertook a whole exome sequencing (WES) approach to identify novel candidate genes and mutations associated with HCM. The cohort consisted of 74 unrelated patients with sporadic HCM (sHCM) previously determined to be negative for mutations in eight sarcomere genes. The results showed that 7 of 74 patients (9.5%) had damaging mutations in 43 known HCM disease genes. Furthermore, after analysis combining the Transmission and De novo Association (TADA) program and the ToppGene program, 10 putative genes gained priority. A thorough review of public databases and related literature revealed that there is strong supporting evidence for most of the genes playing roles in various aspects of heart development. Findings from recent studies suggest that the putative and known disease genes converge on three functional pathways: sarcomere function, calcium signaling, and metabolism. This study illustrates the benefit of WES, in combination with rare variant analysis tools, in providing valuable insight into the genetic etiology of a heterogeneous sporadic disease.
|
18
|
Body Composition QTLs Identified in Intercross Populations Are Reproducible in Consomic Mouse Strains. PLoS One 2015; 10:e0141494. [PMID: 26551037 PMCID: PMC4638354 DOI: 10.1371/journal.pone.0141494] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 06/20/2015] [Accepted: 10/07/2015] [Indexed: 12/16/2022] Open
Abstract
Genetic variation contributes to individual differences in obesity, but defining the exact relationships between naturally occurring genotypes and their effects on fatness remains elusive. As a step toward positional cloning of previously identified body composition quantitative trait loci (QTLs) from F2 crosses of mice from the C57BL/6ByJ and 129P3/J inbred strains, we sought to recapture them on a homogenous genetic background of consomic (chromosome substitution) strains. Male and female mice from reciprocal consomic strains originating from the C57BL/6ByJ and 129P3/J strains were bred and measured for body weight, length, and adiposity. Chromosomes 2, 7, and 9 were selected for substitution because previous F2 intercross studies revealed body composition QTLs on these chromosomes. We considered a QTL confirmed if one or both sexes of one or both reciprocal consomic strains differed significantly from the host strain in the expected direction after correction for multiple testing. Using these criteria, we confirmed two of two QTLs for body weight (Bwq5-6), three of three QTLs for body length (Bdln3-5), and three of three QTLs for adiposity (Adip20, Adip26 and Adip27). Overall, this study shows that despite the biological complexity of body size and composition, most QTLs for these traits are preserved when transferred to consomic strains; in addition, studying reciprocal consomic strains of both sexes is useful in assessing the robustness of a particular QTL.
|
19
|
The International Mouse Strain Resource (IMSR): cataloging worldwide mouse and ES cell line resources. Mamm Genome 2015; 26:448-55. [PMID: 26373861 PMCID: PMC4602064 DOI: 10.1007/s00335-015-9600-0] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 05/15/2015] [Accepted: 08/10/2015] [Indexed: 11/04/2022]
Abstract
The availability of and access to quality genetically defined, health-status known mouse resources is critical for biomedical research. By ensuring that mice used in research experiments are biologically, genetically, and health-status equivalent, we enable knowledge transfer, hypothesis building based on multiple data streams, and experimental reproducibility based on common mouse resources (reagents). Major repositories for mouse resources have developed over time and each has significant unique resources to offer. Here we (a) describe The International Mouse Strain Resource that offers users a combined catalog of worldwide mouse resources (live, cryopreserved, embryonic stem cells), with direct access to repository sites holding resources of interest and (b) discuss the commitment to nomenclature standards among resources that remain a challenge in unifying mouse resource catalogs.
|
20
|
Network Modules of the Cross-Species Genotype-Phenotype Map Reflect the Clinical Severity of Human Diseases. PLoS One 2015; 10:e0136300. [PMID: 26301634 PMCID: PMC4547739 DOI: 10.1371/journal.pone.0136300] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/14/2015] [Accepted: 08/02/2015] [Indexed: 01/09/2023] Open
Abstract
Recent advances in genome sequencing techniques have improved our understanding of the genotype-phenotype relationship between genetic variants and human diseases. However, genetic variations uncovered from patient populations do not provide enough information to understand the mechanisms underlying the progression and clinical severity of human diseases. Moreover, building a high-resolution genotype-phenotype map is difficult due to the diverse genetic backgrounds of the human population. We built a cross-species genotype-phenotype map to explain the clinical severity of human genetic diseases. We developed a data-integrative framework to investigate network modules composed of human diseases mapped with gene essentiality measured from a model organism. Essential and nonessential genes connect diseases of different types which form clusters in the human disease network. In a large patient population study, we found that disease classes enriched with essential genes tended to show a higher mortality rate than disease classes enriched with nonessential genes. Moreover, high disease mortality rates are explained by the multiple comorbid relationships and the high pleiotropy of disease genes found in the essential gene-enriched diseases. Our results reveal that the genotype-phenotype map of a model organism can facilitate the identification of human disease-gene associations and predict human disease progression.
|
21
|
Abstract
From its inception in 1989, the mission of the Mouse Genome Informatics (MGI) resource remains to integrate genetic, genomic, and biological data about the laboratory mouse to facilitate the study of human health and disease. This mission is ever more feasible as the revolution in genetics knowledge, the ability to sequence genomes, and the ability to specifically manipulate mammalian genomes are now at our fingertips. Through major paradigm shifts in biological research and computer technologies, MGI has adapted and evolved to become an integral part of the larger global bioinformatics infrastructure and honed its ability to provide authoritative reference datasets used and incorporated by many other established bioinformatics resources. Here, we review some of the major changes in research approaches over the last quarter century, how these changes are reflected in the MGI resource you use today, and what may be around the next corner.
|
22
|
Regulator of G-protein signaling-5 is a marker of hepatic stellate cells and expression mediates response to liver injury. PLoS One 2014; 9:e108505. [PMID: 25290689 PMCID: PMC4188519 DOI: 10.1371/journal.pone.0108505] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 06/06/2014] [Accepted: 08/22/2014] [Indexed: 12/11/2022] Open
Abstract
Liver fibrosis is mediated by hepatic stellate cells (HSCs), which respond to a variety of cytokines and growth factors to moderate the response to injury and create extracellular matrix at the site of injury. G-protein coupled receptor (GPCR)-mediated signaling, via endothelin-1 (ET-1) and angiotensin II (AngII), increases HSC contraction, migration and fibrogenesis. Regulator of G-protein signaling-5 (RGS5), an inhibitor of vasoactive GPCR agonists, functions to control GPCR-mediated contraction and hypertrophy in pericytes and smooth muscle cells (SMCs). Therefore, we hypothesized that RGS5 controls GPCR signaling in activated HSCs in the context of liver injury. In this study, we localize RGS5 to the HSCs and demonstrate that Rgs5 expression is regulated during carbon tetrachloride (CCl4)-induced acute and chronic liver injury in Rgs5LacZ/LacZ reporter mice. Furthermore, CCl4-treated RGS5-null mice develop increased hepatocyte damage and fibrosis in response to CCl4 and have increased expression of markers of HSC activation. Knockdown of Rgs5 enhances ET-1-mediated signaling in HSCs in vitro. Taken together, we demonstrate that RGS5 is a critical regulator of GPCR signaling in HSCs and regulates HSC activation and fibrogenesis in liver injury.
|
23
|
Effect of duplicate genes on mouse genetic robustness: an update. BIOMED RESEARCH INTERNATIONAL 2014; 2014:758672. [PMID: 25110693 PMCID: PMC4119742 DOI: 10.1155/2014/758672] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 05/15/2014] [Revised: 06/15/2014] [Accepted: 06/16/2014] [Indexed: 12/28/2022]
Abstract
In contrast to S. cerevisiae and C. elegans, analyses based on the current knockout (KO) mouse phenotypes led to the conclusion that duplicate genes had almost no role in mouse genetic robustness. It has been suggested that the bias of the mouse KO database toward ancient duplicates may possibly cause this knockout duplicate puzzle, that is, a very similar proportion of essential genes (PE) between duplicate genes and singletons. In this paper, we conducted an extensive and careful analysis of the mouse KO phenotype data and corroborated a strong effect of duplicate genes on mouse genetic robustness. Moreover, the effect of duplicate genes on mouse genetic robustness is duplication-age dependent, which holds after ruling out the potential confounding effect from coding-sequence conservation, protein-protein connectivity, functional bias, or the bias of duplicates generated by whole genome duplication (WGD). Our findings suggest that two factors, the sampling bias toward ancient duplicates and very ancient duplicates with a proportion of essential genes higher than that of singletons, have caused the mouse knockout duplicate puzzle; meanwhile, the effect of genetic buffering may be correlated with sequence conservation as well as protein-protein interactivity.
|
24
|
Abstract
Orthologs are an indispensable bridge to transfer biological knowledge between species, from protein annotations to sophisticated disease models. However, orthology assignment is not trivial. A large number of resources now exist, each with its own idiosyncrasies. The goal of this review is to compare their contents and clarify which database is most suited for a certain task.
|
25
|
Role of the bicarbonate-responsive soluble adenylyl cyclase in pH sensing and metabolic regulation. Front Physiol 2014; 5:42. [PMID: 24575049 PMCID: PMC3918592 DOI: 10.3389/fphys.2014.00042] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 10/28/2013] [Accepted: 01/22/2014] [Indexed: 12/18/2022] Open
Abstract
The evolutionarily conserved soluble adenylyl cyclase (sAC, adcy10) was recently identified as a unique source of cAMP in the cytoplasm and the nucleus. Its activity is regulated by bicarbonate and fine-tuned by calcium. As such, and in conjunction with carbonic anhydrase (CA), sAC constitutes an HCO3−/CO2/pH sensor. In both alpha-intercalated cells of the collecting duct and the clear cells of the epididymis, sAC is expressed at significant levels and is involved in pH homeostasis via apical recruitment of vacuolar H+-ATPase (VHA) in a PKA-dependent manner. In addition to maintenance of pH homeostasis, sAC is also involved in metabolic regulation such as coupling of the Krebs cycle to oxidative phosphorylation via bicarbonate/CO2 sensing. Additionally, sAC regulates the CFTR channel and plays an important role in regulation of barrier function and apoptosis. These observations suggest that sAC, via bicarbonate-sensing, plays an important role in maintaining homeostatic status of cells against fluctuations in their microenvironment.
|
26
|
Mutations in MAPT give rise to aneuploidy in animal models of tauopathy. Neurogenetics 2013; 15:31-40. [PMID: 24218087 PMCID: PMC3968519 DOI: 10.1007/s10048-013-0380-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 04/09/2013] [Accepted: 09/30/2013] [Indexed: 12/02/2022]
Abstract
Tau is a major microtubule-associated protein in brain neurons. Its misfolding and accumulation cause neurodegenerative diseases characterized by brain atrophy and dementia, named tauopathies. Genetic forms are caused by mutations of the microtubule-associated protein tau gene (MAPT). Tau is also expressed in nonneural tissues such as lymphocytes. Tau has been recently recognized as a multifunctional protein, and in particular, some findings supported a role in genome stability. In fact, peripheral cells of patients affected by frontotemporal dementia carrying different MAPT mutations showed structural and numerical chromosome aberrations. The aim of this study was to assess chromosome stability in peripheral cells from two animal models of genetic tauopathy, JNPL3 and PS19 mouse strains expressing the human tau carrying the P301L and P301S mutations, respectively, to confirm the previous data on humans. After demonstrating the presence of mutated tau in spleen, we performed standard cytogenetic analysis of splenic lymphocytes from homozygous and hemizygous JNPL3, hemizygous PS19, and relevant controls. Losses and gains of chromosomes (aneuploidy) were evaluated. We detected a significantly higher level of aneuploidy in JNPL3 and PS19 than in control mice. Moreover, in JNPL3, the aneuploidy was higher in homozygotes than in hemizygotes, demonstrating a gene dose effect, which appeared also to be age independent. Our results show that mutated tau is associated with chromosome instability. It is conceivable to hypothesize that in genetic tauopathies the aneuploidy may also be present in the central nervous system, possibly contributing to neurodegeneration.
|
27
|
Abstract
BACKGROUND Multiple competing bioinformatics tools exist for next-generation sequencing data analysis. Many of these tools are available as R/Bioconductor modules, and it can be challenging for the bench biologist without any programming background to quickly analyse genomics data. Here, we present an application that is designed to be simple to use, while leveraging the power of R as the analysis engine behind the scenes. RESULTS Genome Informatics Data Explorer (Guide) is a desktop application designed for the bench biologist to analyse RNA-seq and microarray gene expression data. It requires a text file of summarised read counts or expression values as input data, and performs differential expression analyses at both the gene and pathway level. It uses well-established R/Bioconductor packages such as limma for its analyses, without requiring the user to have specific knowledge of the underlying R functions. Results are presented in figures or interactive tables which integrate useful data from multiple sources such as gene annotation and orthologue data. Advanced options include the ability to edit R commands to customise the analysis pipeline. CONCLUSIONS Guide is a desktop application designed to query gene expression data in a user-friendly way while automatically communicating with R. Its customisation options make it possible to use different bioinformatics tools available through R/Bioconductor for its analyses, while keeping the core usage simple. Guide is written in the cross-platform framework of Qt, and is freely available for use from http://guide.wehi.edu.au.
|
28
|
Dual analysis of the murine cytomegalovirus and host cell transcriptomes reveal new aspects of the virus-host cell interface. PLoS Pathog 2013; 9:e1003611. [PMID: 24086132 PMCID: PMC3784481 DOI: 10.1371/journal.ppat.1003611] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Received: 03/05/2013] [Accepted: 07/26/2013] [Indexed: 11/19/2022] Open
Abstract
Major gaps in our knowledge of pathogen genes and how these gene products interact with host gene products to cause disease represent a major obstacle to progress in vaccine and antiviral drug development for the herpesviruses. To begin to bridge these gaps, we conducted a dual analysis of Murine Cytomegalovirus (MCMV) and host cell transcriptomes during lytic infection. We analyzed the MCMV transcriptome during lytic infection using both classical cDNA cloning and sequencing of viral transcripts and next generation sequencing of transcripts (RNA-Seq). We also investigated the host transcriptome using RNA-Seq combined with differential gene expression analysis, biological pathway analysis, and gene ontology analysis. We identify numerous novel spliced and unspliced transcripts of MCMV. Unexpectedly, the most abundantly transcribed viral genes are of unknown function. We found that the most abundant viral transcript, recently identified as a noncoding RNA regulating cellular microRNAs, also codes for a novel protein. To our knowledge, this is the first viral transcript that functions both as a noncoding RNA and an mRNA. We also report that lytic infection elicits a profound cellular response in fibroblasts. Highly upregulated and induced host genes included those involved in inflammation and immunity, but also many unexpected transcription factors and host genes related to development and differentiation. Many top downregulated and repressed genes are associated with functions whose roles in infection are obscure, including host long intergenic noncoding RNAs, antisense RNAs or small nucleolar RNAs. Correspondingly, many differentially expressed genes cluster in biological pathways that may shed new light on cytomegalovirus pathogenesis. Together, these findings provide new insights into the molecular warfare at the virus-host interface and suggest new areas of research to advance the understanding and treatment of cytomegalovirus-associated diseases.
|
29
|
Targeted mutagenesis tools for modelling psychiatric disorders. Cell Tissue Res 2013; 354:9-25. [PMID: 24078022 DOI: 10.1007/s00441-013-1708-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 06/13/2013] [Accepted: 07/16/2013] [Indexed: 12/15/2022]
Abstract
In the 1980s, the basic principles of gene targeting were discovered and forged into sharp tools for efficient and precise engineering of the mouse genome. Since then, genetic mouse models have substantially contributed to our understanding of major neurobiological concepts and are of utmost importance for our comprehension of neuropsychiatric disorders. The "domestication" of site-specific recombinases and the continuous creative technological developments involving the implementation of previously identified biological principles such as transcriptional and posttranslational control now enable conditional mutagenesis with high spatial and temporal resolution. The initiation and successful accomplishment of large-scale efforts to annotate functionally the entire mouse genome and to build strategic resources for the research community have significantly accelerated the rapid proliferation and broad propagation of mouse genetic tools. Addressing neurobiological processes with the assistance of genetic mouse models is a routine procedure in psychiatric research and will be further extended in order to improve our understanding of disease mechanisms. In light of the highly complex nature of psychiatric disorders and the current lack of strong causal genetic variants, a major future challenge is to model psychiatric disorders more appropriately. Humanized mice, and the recently developed toolbox of site-specific nucleases for more efficient and simplified tailoring of the genome, offer the perspective of significantly improved models. Ultimately, these tools will push the limits of gene targeting beyond the mouse to allow genome engineering in any model organism of interest.
|
30
|
Network topologies and convergent aetiologies arising from deletions and duplications observed in individuals with autism. PLoS Genet 2013; 9:e1003523. [PMID: 23754953 PMCID: PMC3675007 DOI: 10.1371/journal.pgen.1003523] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 01/25/2013] [Accepted: 04/06/2013] [Indexed: 12/24/2022] Open
Abstract
Autism Spectrum Disorders (ASD) are highly heritable and characterised by impairments in social interaction and communication, and restricted and repetitive behaviours. Considering four sets of de novo copy number variants (CNVs) identified in 181 individuals with autism and exploiting mouse functional genomics and known protein-protein interactions, we identified a large and significantly interconnected interaction network. This network contains 187 genes affected by CNVs drawn from 45% of the patients we considered and 22 genes previously implicated in ASD, of which 192 form a single interconnected cluster. On average, those patients with copy number changed genes from this network possess changes in 3 network genes, suggesting that epistasis mediated through the network is extensive. Correspondingly, genes that are highly connected within the network, and thus whose copy number change is predicted by the network to be more phenotypically consequential, are significantly enriched among patients that possess only a single ASD-associated network copy number changed gene (p = 0.002). Strikingly, deleted or disrupted genes from the network are significantly enriched in GO-annotated positive regulators (2.3-fold enrichment, corrected p = 2×10⁻⁵), whereas duplicated genes are significantly enriched in GO-annotated negative regulators (2.2-fold enrichment, corrected p = 0.005). The direction of copy change is highly informative in the context of the network, providing the means through which perturbations arising from distinct deletions or duplications can yield a common outcome. These findings reveal an extensive ASD-associated molecular network, whose topology indicates ASD-relevant mutational deleteriousness and that mechanistically details how convergent aetiologies can result extensively from CNVs affecting pathways causally implicated in ASD.
Autism Spectrum Disorders (ASD) are characterised by impairments in social interaction and communication, and restricted and repetitive behaviours. ASD are highly heritable and many different stretches of DNA have been found to be duplicated or deleted in individuals with ASD. We found that an unusually high number of genes affected by these DNA deletions/duplications are associated with the functioning of synaptic transmission between nerve cells. The proteins made by many of these genes are known to interact with each other and, together with proteins from other deleted/duplicated genes, form a large interlinked biological network. This network was affected by almost 50% of the deletions/duplications in the ASD patients considered. Many individual ASD patients had deletions or duplications of multiple genes within this network, but for those patients with just a single gene from the network changed, that single gene appeared to play an important role. Furthermore, the network predicts that the effects arising from the genes in the deletions are similar to the effects arising from the genes in the duplications. Thus, the way that this ASD-associated network is wired together contributes to the understanding of the impact of these DNA deletions and duplications.
|
31
|
Molecular architecture of the chick vestibular hair bundle. Nat Neurosci 2013; 16:365-74. [PMID: 23334578 PMCID: PMC3581746 DOI: 10.1038/nn.3312] [Citation(s) in RCA: 139] [Impact Index Per Article: 12.6] [Received: 08/20/2012] [Accepted: 12/17/2012] [Indexed: 12/31/2022]
Abstract
Hair bundles of the inner ear have a specialized structure and protein composition that underlies their sensitivity to mechanical stimulation. Using mass spectrometry, we identified and quantified >1,100 proteins, present from a few to 400,000 copies per stereocilium, from purified chick bundles; 336 of these were significantly enriched in bundles. Bundle proteins that we detected have been shown to regulate cytoskeleton structure and dynamics, energy metabolism, phospholipid synthesis and cell signaling. Three-dimensional imaging using electron tomography allowed us to count the number of actin-actin cross-linkers and actin-membrane connectors; these values compared well to those obtained from mass spectrometry. Network analysis revealed several hub proteins, including RDX (radixin) and SLC9A3R2 (NHERF2), which interact with many bundle proteins and may perform functions essential for bundle structure and function. The quantitative mass spectrometry of bundle proteins reported here establishes a framework for future characterization of dynamic processes that shape bundle structure and function.
|
32
|
Abstract
The laboratory mouse is the premier animal model for studying human biology because all life stages can be accessed experimentally, a completely sequenced reference genome is publicly available and there exists a myriad of genomic tools for comparative and experimental research. In the current era of genome scale, data-driven biomedical research, the integration of genetic, genomic and biological data is essential for realizing the full potential of the mouse as an experimental model. The Mouse Genome Database (MGD; http://www.informatics.jax.org), the community model organism database for the laboratory mouse, is designed to facilitate the use of the laboratory mouse as a model system for understanding human biology and disease. To achieve this goal, MGD integrates genetic and genomic data related to the functional and phenotypic characterization of mouse genes and alleles and serves as a comprehensive catalog for mouse models of human disease. Recent enhancements to MGD include the addition of human ortholog details to mouse Gene Detail pages, the inclusion of microRNA knockouts in MGD’s catalog of alleles and phenotypes, the addition of video clips to phenotype images, access to genotype and phenotype data associated with quantitative trait loci (QTL) and improvements to the layout and display of Gene Ontology annotations.
|
33
|
Evolutionary history of human disease genes reveals phenotypic connections and comorbidity among genetic diseases. Sci Rep 2012; 2:757. [PMID: 23091697 PMCID: PMC3477654 DOI: 10.1038/srep00757] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 07/05/2012] [Accepted: 10/03/2012] [Indexed: 01/02/2023] Open
Abstract
The extent to which evolutionary changes have impacted the phenotypic relationships among human diseases remains unclear. In this work, we report that phenotypically similar diseases are connected by the evolutionary constraints on human disease genes. Human disease groups can be classified into slowly or rapidly evolving classes, where the diseases in the slowly evolving class are enriched with morphological phenotypes and those in the rapidly evolving class are enriched with physiological phenotypes. Our findings establish a clear evolutionary connection between disease classes and disease phenotypes for the first time. Furthermore, the high comorbidity found between diseases connected by similar evolutionary constraints enables us to improve the predictability of the relative risk of human diseases. We find the evolutionary constraints on disease genes are a new layer of molecular connection in the network-based exploration of human diseases.
|
34
|
Gene expression profiling for mechanistic understanding of cellular aggregation in mammalian cell perfusion cultures. Biotechnol Bioeng 2012; 110:483-90. [PMID: 23007466 DOI: 10.1002/bit.24730] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 05/19/2012] [Revised: 08/03/2012] [Accepted: 09/10/2012] [Indexed: 12/14/2022]
Abstract
Aggregation of baby hamster kidney (BHK) cells cultivated in perfusion mode for manufacturing recombinant proteins was characterized. The potential impact of cultivation time on cell aggregation for an aggregating culture (cell line A) was studied by comparing expression profiles of 84 genes in the extracellular adhesion molecules (ECM) pathway by qRT-PCR from 9 and 25 day shake flask samples and 80 and 94 day bioreactor samples. Significant up-regulation of THBS2 (4.4- to 6.9-fold) was seen in both the 25 day shake flask and 80 and 94 day bioreactor samples compared to the 9 day shake flask while NCAM1 was down-regulated 5.1- to 8.9-fold in the 80 and 94 day bioreactor samples. Subsequent comparisons were made between cell line A and a non-aggregating culture (cell line B). A 65 day perfusion bioreactor sample from cell line B served as the control for 80 and 94 day samples from four different perfusion bioreactors for cell line A. Of the 84 genes in the ECM pathway, four (COL1A1, COL4A1, THBS2, and VCAN) were consistently up-regulated in cell line A while two (NCAM1 and THBS1) were consistently down-regulated. The magnitudes of differential gene expression were much higher when cell lines were compared (4.1- to 44.6-fold) than when early and late cell line A samples were compared (4.4- to 6.9-fold), indicating greater variability between aggregating and non-aggregating cell lines. Based on the differential gene expression results, two mechanistic models were proposed for aggregation of BHK cells in perfusion cultures.
|
35
|
Topological analysis of protein co-abundance networks identifies novel host targets important for HCV infection and pathogenesis. BMC SYSTEMS BIOLOGY 2012; 6:28. [PMID: 22546282 PMCID: PMC3383540 DOI: 10.1186/1752-0509-6-28] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Received: 03/13/2012] [Accepted: 04/30/2012] [Indexed: 01/12/2023]
Abstract
Background High-throughput methods for obtaining global measurements of transcript and protein levels in biological samples have provided a large amount of data for identification of 'target' genes and proteins of interest. These targets may be mediators of functional processes involved in disease and therefore represent key points of control for viruses and bacterial pathogens. Genes and proteins that are the most highly differentially regulated are generally considered to be the most important. We present topological analysis of co-abundance networks as an alternative to differential regulation for confident identification of target proteins from two related global proteomics studies of hepatitis C virus (HCV) infection. Results We analyzed global proteomics data sets from a cell culture study of HCV infection and from a clinical study of liver biopsies from HCV-positive patients. Using lists of proteins known to be interaction partners with pathogen proteins we show that the most differentially regulated proteins in both data sets are indeed enriched in pathogen interactors. We then use these data sets to generate co-abundance networks that link proteins based on similar abundance patterns in time or across patients. Analysis of these co-abundance networks using a variety of network topology measures revealed that both degree and betweenness could be used to identify pathogen interactors with better accuracy than differential regulation alone, though betweenness provides the best discrimination. We found that though overall differential regulation was not correlated between the cell culture and liver biopsy data, network topology was conserved to an extent. Finally, we identified a set of proteins that has high betweenness topology in both networks including a protein that we have recently shown to be essential for HCV replication in cell culture.
Conclusions The results presented show that the network topology of protein co-abundance networks can be used to identify proteins important for viral replication. These proteins represent targets for further experimental investigation that will provide biological insight and potentially could be exploited for novel therapeutic approaches to combat HCV infection.
|
36
|
TEMPORAL GRAPHICAL MODELS FOR CROSS-SPECIES GENE REGULATORY NETWORK DISCOVERY. J Bioinform Comput Biol 2011; 9:231-50. [DOI: 10.1142/s0219720011005525] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/01/2011] [Revised: 02/28/2011] [Accepted: 03/01/2011] [Indexed: 11/18/2022]
Abstract
Many genes and biological processes function in similar ways across different species. Cross-species gene expression analysis, as a powerful tool to characterize the dynamical properties of the cell, has found a number of applications, such as identifying a conserved core set of cell cycle genes. However, to the best of our knowledge, there is limited effort on developing appropriate techniques to capture the causality relations between genes from time-series microarray data across species. In this paper, we present hidden Markov random field regression with L1 penalty to uncover the regulatory network structure for different species. The algorithm provides a framework for sharing information across species via hidden component graphs and is able to incorporate domain knowledge across species easily. We demonstrate our method on two synthetic datasets and apply it to discover causal graphs from innate immune response data.
|
37
|
Abstract
Background Despite intense investment growth and technology development, there is an observed bottleneck in drug discovery and development over the past decade. NIH started the Molecular Libraries Initiative (MLI) in 2003 to enlarge the pool for potential drug targets, especially from the “undruggable” part of human genome, and potential drug candidates from much broader types of drug-like small molecules. All results are being made publicly available in a web portal called PubChem. Results In this paper we construct a network from bioassay data in PubChem, apply network biology concepts to characterize this bioassay network, integrate information from multiple biological databases (e.g. DrugBank, OMIM, and UniHI), and systematically analyze the potential of bioassay targets being new drug targets in the context of complex biological networks. We propose a model to quantitatively prioritize the druggability of bioassay targets, and literature evidence was found to confirm our prioritization of bioassay targets at a roughly 70% accuracy. Conclusions Our analysis provides some measures of the value of the MLI data as a resource for both basic chemical biology research and future therapeutic discovery.
|
38
|
Audiogenic seizure proneness requires the contribution of two susceptibility loci in mice. Neurogenetics 2011; 12:253-7. [PMID: 21681693 DOI: 10.1007/s10048-011-0289-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/13/2011] [Accepted: 06/03/2011] [Indexed: 10/18/2022]
Abstract
Juvenile mice of the DBA/2J strain undergo generalised seizures when exposed to a high-intensity auditory stimulus. Genetic analysis identified three different loci underlying this audiogenic seizure proneness (ASP): Asp1, Asp2 and Asp3 on chromosomes 12, 4 and 7, respectively. Asp1 is thought to have the strongest influence, and mice with only Asp1 derived from the DBA/2J strain are reported to exhibit ASP. The aim of this study was to characterise more accurately the contributions of the Asp1 and Asp3 loci in ASP using congenic strains. Each congenic strain contains a DBA/2J-derived interval encompassing either Asp1 or Asp3 on a C57BL/6J genetic background. A double congenic C57BL/6J strain containing both Asp loci derived from DBA/2J was also generated. Here, we report that DBA/2J alleles at both of these Asp loci are required to confer ASP because congenic C57BL/6 mice harbouring DBA/2J alleles at only Asp1 or Asp3 do not exhibit ASP, whereas DBA/2J alleles at both loci resulted in increased susceptibility for audiogenic seizure in double congenic C57BL/6 mice.
|
39
|
The implications of relationships between human diseases and metabolic subpathways. PLoS One 2011; 6:e21131. [PMID: 21695054 PMCID: PMC3117879 DOI: 10.1371/journal.pone.0021131] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 01/26/2011] [Accepted: 05/20/2011] [Indexed: 01/08/2023] Open
Abstract
One of the challenging problems in the etiology of diseases is to explore the relationships between the initiation and progression of diseases and abnormalities in local regions of metabolic pathways. To gain insight into such relationships, we applied the “k-clique” subpathway identification method to all disease-related gene sets. For each disease, the disease risk regions of metabolic pathways were then identified and considered as subpathways associated with the disease. We finally built a disease-metabolic subpathway network (DMSPN). Through analyses based on network biology, we found that a few subpathways, such as that of cytochrome P450, were highly connected with many diseases, and most belonged to fundamental metabolisms, suggesting that abnormalities of fundamental metabolic processes tend to cause more types of diseases. According to the categories of diseases and subpathways, we tested the clustering phenomenon of diseases and metabolic subpathways in the DMSPN. The results showed that both disease nodes and subpathway nodes displayed a slight clustering phenomenon. We also tested correlations between network topology and genes within disease-related metabolic subpathways, and found that within a disease-related subpathway in the DMSPN, the ratio of disease genes and the ratio of tissue-specific genes significantly increased as the number of diseases caused by the subpathway increased. Surprisingly, the ratio of essential genes significantly decreased and the ratio of housekeeping genes remained relatively unchanged. Furthermore, the coexpression levels between disease genes and other types of genes were calculated for each subpathway in the DMSPN. The results indicated that those genes intensely influenced by disease genes, including essential genes and tissue-specific genes, might be significantly associated with the disease diversity of subpathways, suggesting that different kinds of genes within a disease-related subpathway may play significantly different roles in the diversity of diseases caused by the corresponding subpathway.
|
40
|
A computational approach to candidate gene prioritization for X-linked mental retardation using annotation-based binary filtering and motif-based linear discriminatory analysis. Biol Direct 2011; 6:30. [PMID: 21668950 PMCID: PMC3142252 DOI: 10.1186/1745-6150-6-30] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 01/20/2011] [Accepted: 06/13/2011] [Indexed: 01/07/2023] Open
Abstract
Background: Several computational candidate gene selection and prioritization methods have recently been developed. These in silico selection and prioritization techniques are usually based on two central approaches: the examination of similarities to known disease genes and/or the evaluation of functional annotation of genes. Each of these approaches has its own caveats. Here we employ a previously described method of candidate gene prioritization based mainly on gene annotation, together with a technique based on the evaluation of pertinent sequence motifs or signatures, in an attempt to refine the gene prioritization approach. We apply this approach to X-linked mental retardation (XLMR), a group of heterogeneous disorders for which some of the underlying genetics is known. Results: The gene annotation-based binary filtering method yielded a ranked list of putative XLMR candidate genes with good plausibility of being associated with the development of mental retardation. In parallel, a motif finding approach based on linear discriminatory analysis (LDA) was employed to identify short sequence patterns that may discriminate XLMR from non-XLMR genes. High rates (>80%) of correct classification were achieved, suggesting that the identification of these motifs effectively captures genomic signals associated with XLMR vs. non-XLMR genes. The computational tools developed for the motif-based LDA are integrated into the freely available genomic analysis portal Galaxy (http://main.g2.bx.psu.edu/). Nine genes (APLN, ZC4H2, MAGED4, MAGED4B, RAP2C, FAM156A, FAM156B, TBL1X, and UXT) were highlighted as highly-ranked XLMR candidates. Conclusions: The combination of gene annotation information and sequence motif-orientated computational candidate gene prediction methods offers an added benefit in generating a list of plausible candidate genes, as has been demonstrated for XLMR.
Reviewers: This article was reviewed by Dr Barbara Bardoni (nominated by Prof Juergen Brosius); Prof Neil Smalheiser and Dr Dustin Holloway (nominated by Prof Charles DeLisi).
|
41
|
Prioritizing candidate disease genes by network-based boosting of genome-wide association data. Genome Res 2011; 21:1109-21. [PMID: 21536720 DOI: 10.1101/gr.118992.110] [Citation(s) in RCA: 488] [Impact Index Per Article: 37.5] [Indexed: 12/19/2022]
Abstract
Network "guilt by association" (GBA) is a proven approach for identifying novel disease genes based on the observation that similar mutational phenotypes arise from functionally related genes. In principle, this approach could account even for nonadditive genetic interactions, which underlie the synergistic combinations of mutations often linked to complex diseases. Here, we analyze a large-scale, human gene functional interaction network (dubbed HumanNet). We show that candidate disease genes can be effectively identified by GBA in cross-validated tests using label propagation algorithms related to Google's PageRank. However, GBA has been shown to work poorly in genome-wide association studies (GWAS), where many genes are somewhat implicated, but few are known with very high certainty. Here, we resolve this by explicitly modeling the uncertainty of the associations and incorporating the uncertainty for the seed set into the GBA framework. We observe a significant boost in the power to detect validated candidate genes for Crohn's disease and type 2 diabetes by comparing our predictions to results from follow-up meta-analyses, with incorporation of the network serving to highlight the JAK-STAT pathway and associated adaptors GRB2/SHC1 in Crohn's disease and BACH2 in type 2 diabetes. Consideration of the network during GWAS thus conveys some of the benefits of enrolling more participants in the GWAS study. More generally, we demonstrate that a functional network of human genes provides a valuable statistical framework for prioritizing candidate disease genes, both for candidate gene-based and GWAS-based studies.
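As a rough illustration of the kind of PageRank-related label propagation this abstract describes, the sketch below runs a random walk with restart in which the restart probabilities are proportional to uncertain seed weights (for example, GWAS-derived evidence). It is not the authors' implementation: the toy network, the gene names, the weights, and the `damping` value are all assumptions made for the example.

```python
def propagate_scores(network, seed_weights, damping=0.85, iterations=100):
    """Random walk with restart on an undirected, weighted gene network.

    network: dict mapping gene -> {neighbor: edge_weight}, symmetric,
             with every gene having at least one neighbor.
    seed_weights: dict mapping gene -> non-negative evidence weight.
    """
    genes = sorted(network)
    total_seed = sum(seed_weights.values())
    # Restart distribution: uncertain seeds contribute in proportion to evidence.
    restart = {g: seed_weights.get(g, 0.0) / total_seed for g in genes}
    # Total edge weight leaving each gene, used to normalize outgoing flow.
    strength = {g: sum(network[g].values()) for g in genes}
    scores = dict(restart)
    for _ in range(iterations):
        updated = {}
        for g in genes:
            incoming = sum(
                scores[nb] * w / strength[nb] for nb, w in network[g].items()
            )
            updated[g] = (1 - damping) * restart[g] + damping * incoming
        scores = updated
    return scores


# Hypothetical four-gene path network: A - B - C - D.
toy_network = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
# Hypothetical evidence: A is strongly implicated, D only weakly.
toy_seeds = {"A": 0.9, "D": 0.1}
toy_scores = propagate_scores(toy_network, toy_seeds)
ranked = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

In this toy setting, unseeded genes that sit next to a strongly supported seed end up ranked above those next to a weakly supported one, which is the intuition behind weighting the seed set by its uncertainty.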
|
42
|
Abstract
Human genetic variation is expected to play a central role in personalized medicine. Yet only a fraction of the natural genetic variation that is harbored by humans has been discovered to date. Here we report almost 2 million small insertions and deletions (INDELs) that range from 1 bp to 10,000 bp in length in the genomes of 79 diverse humans. These variants include 819,363 small INDELs that map to human genes. Small INDELs frequently were found in the coding exons of these genes, and several lines of evidence indicate that such variation is a major determinant of human biological diversity. Microarray-based genotyping experiments revealed several interesting observations regarding the population genetics of small INDEL variation. For example, we found that many of our INDELs had high levels of linkage disequilibrium (LD) with both HapMap SNPs and with high-scoring SNPs from genome-wide association studies. Overall, our study indicates that small INDEL variation is likely to be a key factor underlying inherited traits and diseases in humans.
|
43
|
Abstract
The potential use of neural stem cells (NSCs) in basic research, drug testing, and the development of therapeutic strategies depends on their large-scale in vitro amplification, which, however, introduces considerable risks of genetic instability and transformation. NSCs have been derived from different sources, but the occurrence of chromosomal instability has been monitored only to a limited extent in relation to the source of derivation, growth procedure, long-term culture, and genetic manipulation. Here we have systematically investigated the effect of these parameters on the chromosomal stability of pure populations of mouse NSCs obtained after neuralization from embryonic stem cells (ESCs) or directly from fetal or adult mouse brain. We found that the procedure of NSC establishment is not accompanied by genetic instability or chromosomal aberration. In contrast, we observed that a composite karyotype appears in NSCs upon extensive passaging. This phenomenon is more evident in ESC- and adult sub-ventricular zone-derived NSCs and deteriorates further after genetic engineering of the cells. Fetal-derived NSCs showed the greatest euploidy state, with negligible clonal structural aberrations but persistent clonal numerical abnormalities. It was previously reported that long-term passaged ESC- and adult sub-ventricular zone-derived NSCs showed no defects in proliferative or differentiative capacity and did not induce tumour formation in vivo, although we here report chromosomal abnormalities in these cells. Although chromosomal aberrations are known to occur less frequently in human cells, studies performed on murine stem cells provide an important complement for understanding the biological events occurring in human lines.
|
44
|
Controlling the response: predictive modeling of a highly central, pathogen-targeted core response module in macrophage activation. PLoS One 2011; 6:e14673. [PMID: 21339814 PMCID: PMC3038849 DOI: 10.1371/journal.pone.0014673] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 12/15/2010] [Accepted: 01/17/2011] [Indexed: 11/19/2022] Open
Abstract
We have investigated macrophage activation using computational analyses of a compendium of transcriptomic data covering responses to agonists of the TLR pathway, Salmonella infection, and manufactured amorphous silica nanoparticle exposure. We inferred regulatory relationship networks using this compendium and discovered that genes with high betweenness centrality, so-called bottlenecks, code for proteins targeted by pathogens. Furthermore, by combining topological analysis with analysis of genes differentially expressed under the different stimuli, using a novel set of bioinformatics tools, we identified a conserved core response module that is differentially expressed in response to all studied conditions. This module occupies a highly central position in the inferred network and is also enriched in genes preferentially targeted by pathogens. The module includes cytokines, interferon induced genes such as Ifit1 and 2, effectors of inflammation, Cox1 and Oas1 and Oasl2, and transcription factors including AP1, Egr1 and 2 and Mafb. Predictive modeling using a reverse-engineering approach reveals dynamic differences between the responses to each stimulus and predicts the regulatory influences directing this module. We speculate that this module may be an early checkpoint for progression to apoptosis and/or inflammation during macrophage activation.
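To make the notion of a "bottleneck" concrete, the following minimal sketch computes betweenness centrality with Brandes' algorithm on an unweighted, undirected graph. It is only an illustration of the metric named in the abstract, not the authors' pipeline, and the five-node toy network and its node labels are hypothetical.

```python
from collections import deque


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph (adjacency lists)."""
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        predecessors = {node: [] for node in graph}
        path_count = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        path_count[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in graph[node]:
                if distance[neighbor] < 0:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    path_count[neighbor] += path_count[node]
                    predecessors[neighbor].append(node)
        # Accumulate pair dependencies in reverse breadth-first order.
        dependency = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                centrality[node] += dependency[node]
    # Each unordered pair was visited from both endpoints, so halve the totals.
    return {node: value / 2 for node, value in centrality.items()}


# Hypothetical regulatory network: A - B - C, with C also linked to D and E.
toy_graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D", "E"],
    "D": ["C"],
    "E": ["C"],
}
toy_centrality = betweenness_centrality(toy_graph)
bottleneck = max(toy_centrality, key=toy_centrality.get)
```

Here node C lies on the most shortest paths between other nodes, so it would be flagged as the bottleneck in this toy network.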
|
45
|
Molecular-genetic systems of development: Functional dynamics and molecular evolution. BIOCHEMISTRY (MOSCOW) 2011; 73:219-30. [DOI: 10.1134/s0006297908020144] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
|
46
|
Genes and biological processes commonly disrupted in rare and heterogeneous developmental delay syndromes. Hum Mol Genet 2010; 20:880-93. [PMID: 21147756 DOI: 10.1093/hmg/ddq527] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022] Open
Abstract
Rare copy number variations (CNVs) are a recognized cause of common human disease. Predicting the genetic element(s) within a small CNV whose copy number loss or gain underlies a specific phenotype might be achieved reasonably rapidly for single patients. Identifying the biological processes that are commonly disrupted within a large patient cohort that possesses larger CNVs, however, requires a more objective approach that exploits genomic resources. In this study, we first identified 98 large, rare CNVs within patients exhibiting multiple congenital anomalies. All patients presented with global developmental delay (DD), while other secondary symptoms such as cardiac defects, craniofacial features and seizures were variably present. By applying a robust statistical procedure that matches patients' clinical phenotypes to laboratory mouse gene knockouts, we were able to strongly implicate anomalies in brain morphology and, separately, in long-term potentiation as manifestations of these DD patients' disorders. These and other significantly enriched model phenotypes provide insights into the pathoetiology of human DD and behavioral and anatomical secondary symptoms that are specific to DD patients. These enrichments set apart 103 genes, from among thousands overlapped by these CNVs, as strong candidates whose copy number change causally underlies approximately 46% of the cohort's DD syndromes and between 59 and 80% of the cohort's secondary symptoms. We also identified significantly enriched model phenotypes among genes overlapped by CNVs in both DD and learning disability cohorts, indicating a congruent etiology. These results demonstrate the high predictive potential of model organism phenotypes when implicating candidate genes for rare genomic disorders.
|
47
|
A comparison of machine learning techniques for detection of drug target articles. J Biomed Inform 2010; 43:902-13. [DOI: 10.1016/j.jbi.2010.07.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 10/29/2009] [Revised: 06/30/2010] [Accepted: 07/28/2010] [Indexed: 11/24/2022]
|
48
|
Functional genomics complements quantitative genetics in identifying disease-gene associations. PLoS Comput Biol 2010; 6:e1000991. [PMID: 21085640 PMCID: PMC2978695 DOI: 10.1371/journal.pcbi.1000991] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.4] [Received: 04/23/2010] [Accepted: 10/07/2010] [Indexed: 11/25/2022] Open
Abstract
An ultimate goal of genetic research is to understand the connection between genotype and phenotype in order to improve the diagnosis and treatment of diseases. The quantitative genetics field has developed a suite of statistical methods to associate genetic loci with diseases and phenotypes, including quantitative trait loci (QTL) linkage mapping and genome-wide association studies (GWAS). However, each of these approaches has technical and biological shortcomings. For example, the amount of heritable variation explained by GWAS is often surprisingly small and the resolution of many QTL linkage mapping studies is poor. The predictive power and interpretation of QTL and GWAS results are consequently limited. In this study, we propose a complementary approach to quantitative genetics by interrogating the vast amount of high-throughput genomic data in model organisms to functionally associate genes with phenotypes and diseases. Our algorithm combines the genome-wide functional relationship network for the laboratory mouse and a state-of-the-art machine learning method. We demonstrate the superior accuracy of this algorithm through predicting genes associated with each of 1157 diverse phenotype ontology terms. Comparison between our prediction results and a meta-analysis of quantitative genetic studies reveals both overlapping candidates and distinct, accurate predictions uniquely identified by our approach. Focusing on bone mineral density (BMD), a phenotype related to osteoporotic fracture, we experimentally validated two of our novel predictions (not observed in any previous GWAS/QTL studies) and found significant bone density defects for both Timp2 and Abcg8 deficient mice. Our results suggest that the integration of functional genomics data into networks, which itself is informative of protein function and interactions, can successfully be utilized as a complementary approach to quantitative genetics to predict disease risks. All supplementary material is available at http://cbfg.jax.org/phenotype.
Many recent efforts to understand the genetic origins of complex diseases utilize statistical approaches to analyze phenotypic traits measured in genetically well-characterized populations. While these quantitative genetics methods are powerful, their success is limited by sampling biases and other confounding factors, and the biological interpretation of results can be challenging since these methods are not based on any functional information for candidate loci. On the other hand, the functional genomics field has greatly expanded in past years, both in terms of experimental approaches and analytical algorithms. However, functional approaches have been applied to understanding phenotypes in only the most basic ways. In this study, we demonstrate that functional genomics can complement traditional quantitative genetics by analytically extracting protein function information from large collections of high-throughput data, which can then be used to predict genotype-phenotype associations. We applied our prediction methodology to the laboratory mouse, and we experimentally confirmed a role in osteoporosis for two of our predictions that were not candidates from any previous quantitative genetics study. The ability of our approach to produce accurate and unique predictions implies that functional genomics can complement quantitative genetics and can help address previous limitations in identifying disease genes.
|
49
|
Integrating genetic and toxicogenomic information for determining underlying susceptibility to developmental disorders. Birth Defects Res A Clin Mol Teratol 2010; 88:920-30. [DOI: 10.1002/bdra.20708] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/13/2023]
|
50
|
Integrating heterogeneous sequence information for transcriptome-wide microarray design; a Zebrafish example. BMC Res Notes 2010; 3:192. [PMID: 20626891 PMCID: PMC2913925 DOI: 10.1186/1756-0500-3-192] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 03/16/2010] [Accepted: 07/13/2010] [Indexed: 11/10/2022] Open
Abstract
Background: A complete gene-expression microarray should preferably detect all genomic sequences that can be expressed as RNA in an organism, i.e. the transcriptome. However, our knowledge of the transcriptome of any organism is still incomplete, and transcriptome information is continuously being updated. Here, we present a strategy to integrate heterogeneous sequence information that can be used as input for an up-to-date microarray design. Findings: Our algorithm consists of four steps. In the first step, transcripts from different resources are grouped into Transcription Clusters (TCs) by looking at the similarity of all transcripts. TCs are groups of transcripts with a similar length. If a transcript is much smaller than a TC to which it is highly similar, it will be annotated as a subsequence of that TC and is used for probe design only if the probe designed for the TC does not query the subsequence. Secondly, all TCs are mapped to a genome assembly and gene information is added to the design. Thirdly, TC members are ranked according to their trustworthiness and the most reliable sequence is used for the probe design. The last step is the actual array design. We have used this strategy to build an up-to-date zebrafish microarray. Conclusions: With our strategy and the software developed, it is possible to use a set of heterogeneous transcript resources for microarray design, reduce the number of candidate target sequences on which the design is based, and reduce redundancy. By changing the parameters in the procedure it is possible to control the similarity within the TCs and thus the number of candidate sequences for the design. The annotation of the microarray is carried out simultaneously with the design.
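The third step of the strategy above (rank TC members by trustworthiness and keep the most reliable sequence) can be sketched as follows. This is a hedged illustration only: the abstract does not name the resources or the ranking criteria, so the source categories, the `SOURCE_RANK` table, the tie-break on length, and the transcript records are all hypothetical, and the clustering itself is assumed to have been done already.

```python
from collections import defaultdict

# Hypothetical trust ranking of transcript resources (lower is more trusted).
SOURCE_RANK = {"curated": 0, "predicted": 1, "est": 2}


def pick_representatives(transcripts):
    """Return the most trusted transcript id for each Transcription Cluster."""
    clusters = defaultdict(list)
    for transcript in transcripts:
        clusters[transcript["cluster"]].append(transcript)
    representatives = {}
    for cluster, members in clusters.items():
        # Prefer the most trusted source; break ties with the longer sequence.
        best = min(
            members,
            key=lambda t: (SOURCE_RANK[t["source"]], -t["length"]),
        )
        representatives[cluster] = best["id"]
    return representatives


# Hypothetical transcripts already assigned to clusters in an earlier step.
toy_transcripts = [
    {"id": "t1", "cluster": "TC1", "source": "est", "length": 2100},
    {"id": "t2", "cluster": "TC1", "source": "curated", "length": 2000},
    {"id": "t3", "cluster": "TC2", "source": "predicted", "length": 900},
    {"id": "t4", "cluster": "TC2", "source": "predicted", "length": 950},
]
toy_representatives = pick_representatives(toy_transcripts)
```

Keeping one representative per cluster is what reduces the number of candidate target sequences, and with it the redundancy, before probes are designed.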
|