1
A Transcriptional Link between HER2, JAM-A and FOXA1 in Breast Cancer. Cells 2022; 11:735. [PMID: 35203384 PMCID: PMC8870165 DOI: 10.3390/cells11040735] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/22/2021] [Revised: 02/11/2022] [Accepted: 02/17/2022] [Indexed: 01/03/2023]
Abstract
Overexpression of the human epidermal growth factor receptor-2 (HER2) is associated with aggressive disease in breast and certain other cancers. At a cellular level, the adhesion protein Junctional Adhesion Molecule-A (JAM-A) has been reported to regulate the expression of HER3 via a transcriptional pathway involving FOXA1. Since FOXA1 is also a suggested transcription factor for HER2, this study set out to determine if JAM-A regulates HER2 expression via a similar mechanism. An integrated tripartite approach was taken, involving cellular expression studies after targeted disruption of individual players in the putative pathway, in silico identification of relevant HER2 promoter regions and, finally, interrogation of cancer patient survival databases to deconstruct functionally important links between HER2, JAM-A and FOXA1 gene expression. The outcome of these investigations revealed a unidirectional pathway in which JAM-A expression transcriptionally regulates that of HER2 by influencing the binding of FOXA1 to a specific site in the HER2 gene promoter. Moreover, a correlation between JAM-A and HER2 gene expression was identified in 75% of a sample of 40 cancer types from The Cancer Genome Atlas, and coincident high mean mRNA expression of JAM-A, HER2 and FOXA1 was associated with poorer survival outcomes in HER2-positive (but not HER2-negative) patients with either breast or gastric tumors. These investigations provide the first evidence of a transcriptional pathway linking JAM-A, HER2 and FOXA1 in cancer settings, and support potential future pharmacological targeting of JAM-A as an upstream regulator of HER2.
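The pan-cancer figure above (a JAM-A/HER2 correlation in 75% of 40 cancer types, i.e. 30 types) can be illustrated with a minimal sketch. The abstract does not say how correlation was tested, so the Pearson coefficient, the `min_r` cutoff, the cohort names and the toy expression values below are all assumptions made for illustration, not the authors' method or data.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


def fraction_correlated(
    cohorts: Dict[str, Tuple[Sequence[float], Sequence[float]]],
    min_r: float = 0.3,  # hypothetical cutoff, not taken from the paper
) -> Tuple[float, List[str]]:
    """Fraction (and names) of cohorts whose two genes correlate at r >= min_r."""
    hits = [name for name, (x, y) in cohorts.items() if pearson_r(x, y) >= min_r]
    return len(hits) / len(cohorts), hits


# Toy expression values per hypothetical cancer type: (gene 1, gene 2).
toy_cohorts = {
    "type_A": ([1, 2, 3, 4], [2, 4, 6, 8]),
    "type_B": ([1, 2, 3, 4], [4, 3, 2, 1]),
    "type_C": ([1, 2, 3, 4], [1, 3, 2, 4]),
    "type_D": ([1, 2, 3, 4], [2, 1, 4, 3]),
}
fraction, correlated = fraction_correlated(toy_cohorts)
```

A real analysis would also need a significance test and correction for multiple comparisons across cancer types, which this sketch leaves out.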
2
Cruz RGB, Madden SF, Richards CE, Vellanki SH, Jahns H, Hudson L, Fay J, O’Farrell N, Sheehan K, Jirström K, Brennan K, Hopkins AM. Human Epidermal Growth Factor Receptor-3 Expression Is Regulated at Transcriptional Level in Breast Cancer Settings by Junctional Adhesion Molecule-A via a Pathway Involving Beta-Catenin and FOXA1. Cancers (Basel) 2021; 13:871. [PMID: 33669586 PMCID: PMC7922773 DOI: 10.3390/cancers13040871] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/06/2021] [Revised: 02/08/2021] [Accepted: 02/15/2021] [Indexed: 01/29/2023]
Abstract
Simple Summary: Signaling from the human epidermal growth factor receptor (HER) family of proteins increases in many cancers, including breast. HER2-high breast cancers are successfully treated with anti-HER2 therapies, but these drugs are limited by the fact that patients frequently develop resistance to them. One common mechanism by which resistance develops is when tumors acquire high levels of a family member called HER3. We had previously shown that a protein called JAM-A regulates the level of HER2 in breast cancer cells, and is associated with the development of resistance to HER2-targeted therapies. In this study we show for the first time that JAM-A levels also regulate those of HER3. Using breast cancer cell and tissue models and culminating in patient tissue material, we provide evidence that JAM-A regulates HER3 expression via a pathway involving the transcription factors β-catenin and FOXA1. We suggest that JAM-A merits future investigation as a novel drug target for its potential to reduce HER3 tumorigenic signaling and to offset the development of resistance to HER2-targeted therapies.
Abstract: The success of breast cancer therapies targeting the human epidermal growth factor receptor-2 (HER2) is limited by the development of drug resistance by mechanisms including upregulation of HER3. Having reported that HER2 expression and resistance to HER2-targeted therapies can be regulated by Junctional Adhesion Molecule-A (JAM-A), this study investigated if JAM-A regulates HER3 expression. Expressional alteration of JAM-A in breast cancer cells was used to test expressional effects on HER3 and its effectors, alongside associated functional behaviors, in vitro and semi-in vivo. HER3 transcription factors were identified and tested for regulation by JAM-A. Finally a patient tissue microarray was used to interrogate connections between putative pathway components connecting JAM-A and HER3. This study reveals for the first time that HER3 and its effectors are regulated at gene/protein expression level by JAM-A in breast cancer cell lines; with functional consequences in in vitro and semi-in vivo models. In bioinformatic, cellular and patient tissue models, this was associated with regulation of the HER3 transcription factor FOXA1 by JAM-A via a pathway involving β-catenin. Our data suggest a novel model whereby JAM-A expression regulates β-catenin localization, in turn regulating FOXA1 expression, which could drive HER3 gene transcription. JAM-A merits investigation as a novel target to prevent upregulation of HER3 during the development of resistance to HER2-targeted therapies, or to reduce HER3-dependent tumorigenic signaling.
Affiliation(s)
- Rodrigo G. B. Cruz
  - Department of Surgery, Royal College of Surgeons in Ireland, Beaumont Hospital, Dublin 9, Ireland; (R.G.B.C.); (C.E.R.); (S.H.V.); (L.H.); (K.B.)
- Stephen F. Madden
  - Data Science Centre, Royal College of Surgeons in Ireland, Dublin 2, Ireland;
- Cathy E. Richards
  - Department of Surgery, Royal College of Surgeons in Ireland, Beaumont Hospital, Dublin 9, Ireland; (R.G.B.C.); (C.E.R.); (S.H.V.); (L.H.); (K.B.)
- Sri HariKrishna Vellanki
  - Department of Surgery, Royal College of Surgeons in Ireland, Beaumont Hospital, Dublin 9, Ireland; (R.G.B.C.); (C.E.R.); (S.H.V.); (L.H.); (K.B.)
- Hanne Jahns
  - Pathobiology Section, UCD School of Veterinary Medicine, University College Dublin, Dublin 4, Ireland;
- Lance Hudson
  - Department of Surgery, Royal College of Surgeons in Ireland, Beaumont Hospital, Dublin 9, Ireland; (R.G.B.C.); (C.E.R.); (S.H.V.); (L.H.); (K.B.)
- Joanna Fay
  - Department of Pathology, Royal College of Surgeons in Ireland, Beaumont Hospital, Dublin 9, Ireland; (J.F.); (N.O.); (K.S.)
- Naoimh O’Farrell
  - Department of Pathology, Royal College of Surgeons in Ireland, Beaumont Hospital, Dublin 9, Ireland; (J.F.); (N.O.); (K.S.)
- Katherine Sheehan
  - Department of Pathology, Royal College of Surgeons in Ireland, Beaumont Hospital, Dublin 9, Ireland; (J.F.); (N.O.); (K.S.)
- Karin Jirström
  - Department of Clinical Sciences Lund, Division of Oncology and Therapeutic Pathology, Lund University, SE 221 85 Lund, Sweden;
- Kieran Brennan
  - Department of Surgery, Royal College of Surgeons in Ireland, Beaumont Hospital, Dublin 9, Ireland; (R.G.B.C.); (C.E.R.); (S.H.V.); (L.H.); (K.B.)
- Ann M. Hopkins
  - Department of Surgery, Royal College of Surgeons in Ireland, Beaumont Hospital, Dublin 9, Ireland; (R.G.B.C.); (C.E.R.); (S.H.V.); (L.H.); (K.B.)
  - Correspondence: Tel.: +353-1-809-3858
3
Poczai P, Hyvönen J. The complete chloroplast genome sequence of the CAM epiphyte Spanish moss (Tillandsia usneoides, Bromeliaceae) and its comparative analysis. PLoS One 2017; 12:e0187199. [PMID: 29095905 PMCID: PMC5667773 DOI: 10.1371/journal.pone.0187199] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 12/06/2016] [Accepted: 10/16/2017] [Indexed: 11/24/2022]
Abstract
Spanish moss (Tillandsia usneoides) is an epiphytic bromeliad widely distributed throughout tropical and warm temperate America. This plant is highly adapted to extreme environmental conditions. Striking features of this species include specialized trichomes (scales) covering the surface of its shoots aiding the absorption of water and nutrients directly from the atmosphere and a specific photosynthesis using crassulacean acid metabolism (CAM). Here we report the plastid genome of Spanish moss and present the comparison of genome organization and sequence evolution within Poales. The plastome of Spanish moss has a quadripartite structure consisting of a large single copy (LSC, 87,439 bp), two inverted regions (IRa and IRb, 26,803 bp) and short single copy (SSC, 18,612 bp) region. The plastid genome had 37.2% GC content and 134 genes with 88 being unique protein-coding genes and 20 of these are duplicated in the IR, similar to other reported bromeliads. Our study shows that early diverging lineages of Poales do not have high substitution rates as compared to grasses, and plastid genomes of bromeliads show structural features considered to be ancestral in graminids. These include the loss of the introns in the clpP and rpoC1 genes and the complete loss or partial degradation of accD and ycf genes in the Graminid clade. Further structural rearrangements appeared in the graminids lacking in Spanish moss, which include a 28-kb inversion between the trnG-UCC-rps14 region and 6-kb in the trnG-UCC-psbD, followed by a third <1kb inversion in the trnT sequence.
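The quadripartite layout described above lends itself to a quick arithmetic check. The sketch below sums the reported region lengths (counting the inverted repeat twice, once for IRa and once for IRb), which gives 159,657 bp. The abstract does not state a total, so that figure is derived here rather than quoted. The `gc_content` helper is a generic illustration of how a value such as 37.2% GC would be computed from a sequence, and its handling of ambiguous bases is an assumption.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 87_439   # large single copy
IR = 26_803    # each inverted repeat copy (IRa and IRb)
SSC = 18_612   # short single copy


def plastome_length(lsc: int, ir: int, ssc: int) -> int:
    """Total length of a quadripartite plastome: LSC + IRa + SSC + IRb."""
    return lsc + 2 * ir + ssc


def gc_content(seq: str) -> float:
    """Percent G+C among the unambiguous (A/C/G/T) bases of a DNA sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


total = plastome_length(LSC, IR, SSC)
ir_share = 100.0 * 2 * IR / total  # percent of the plastome inside the two IR copies
```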
Affiliation(s)
- Péter Poczai
  - Finnish Museum of Natural History (Botany), University of Helsinki, Helsinki, Finland
- Jaakko Hyvönen
  - Finnish Museum of Natural History (Botany), University of Helsinki, Helsinki, Finland
  - Dept. Biosci. (Plant Biology), University of Helsinki, Helsinki, Finland
4
Elnitski L, Burhans R, Riemer C, Hardison R, Miller W. MultiPipMaker: a comparative alignment server for multiple DNA sequences. Curr Protoc Bioinformatics 2010; Chapter 10:10.4.1-10.4.14. [PMID: 20521245 DOI: 10.1002/0471250953.bi1004s30] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
The MultiPipMaker World Wide Web server (http://www.bx.psu.edu) provides a tool for aligning multiple DNA sequences and visualizing regions of conservation among them. This unit describes its use and gives an explanation of the resulting output files and supporting tools. Features provided by the server include alignment of up to 20 very long genomic sequences, output choices of a true, nucleotide-level multiple alignment and/or stacked, pairwise percent identity plots, and support for user-specified annotations of genomic features and arbitrary regions, with clickable links to additional information. Input sequences other than the reference can be fragmented, unordered, and unoriented.
Affiliation(s)
- Laura Elnitski
  - The Pennsylvania State University, University Park, Pennsylvania, USA
5
Johnson KR, Nicodemus-Johnson J, Danziger RS. An evolutionary analysis of cAMP-specific Phosphodiesterase 4 alternative splicing. BMC Evol Biol 2010; 10:247. [PMID: 20701803 PMCID: PMC2929239 DOI: 10.1186/1471-2148-10-247] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 04/26/2010] [Accepted: 08/11/2010] [Indexed: 01/21/2023]
Abstract
BACKGROUND: Cyclic nucleotide phosphodiesterases (PDEs) hydrolyze the intracellular second messengers cyclic adenosine monophosphate (cAMP) and cyclic guanosine monophosphate (cGMP). The cAMP-specific PDE family 4 (PDE4) is widely expressed in vertebrates. Each of the four PDE4 gene isoforms (PDE4A-D) undergoes extensive alternative splicing via alternative transcription initiation sites, producing unique amino termini and yielding multiple splice variant forms from each gene isoform termed long, short, super-short and truncated super-short. Many species across the vertebrate lineage contain multiple splice variants of each gene type, which are characterized by length and amino termini.
RESULTS: A phylogenetic approach was used to visualize splice variant form genesis and identify conserved splice variants (genome conservation with EST support) across the vertebrate taxa. Bayesian and maximum likelihood phylogenetic inference indicated PDE4 gene duplication occurred at the base of the vertebrate lineage and reveals additional gene duplications specific to the teleost lineage. Phylogenetic inference and PDE4 splice variant presence, or absence as determined by EST screens, were further supported by the genomic analysis of select vertebrate taxa. Two conserved PDE4 long form splice variants were found in each of the PDE4A, PDE4B, and PDE4C genes, and eight conserved long forms from the PDE4D gene. Conserved short and super-short splice variants were found from each of the PDE4A, PDE4B, and PDE4D genes, while truncated super-short variants were found from the PDE4C and PDE4D genes. PDE4 long form splice variants were found in all taxa sampled (invertebrate through mammals); short, super-short, and truncated super-short are detected primarily in tetrapods and mammals, indicating an increasing complexity in both alternative splicing and cAMP metabolism through vertebrate evolution.
CONCLUSIONS: There was a progressive independent incorporation of multiple PDE4 splice variant forms and amino termini, increasing PDE4 proteome complexity from primitive vertebrates to humans. While PDE4 gene isoform duplicates with limited alternative splicing were found in teleosts, an expansion of both PDE4 splice variant forms and alternatively spliced amino termini predominantly occurs in mammals. Since amino termini have been linked to intracellular targeting of the PDE4 enzymes, the conservation of amino termini in PDE4 splice variants in evolution highlights the importance of compartmentalization of PDE4-mediated cAMP hydrolysis.
Affiliation(s)
- Keven R Johnson
  - Department of Physiology and Biophysics, University of Illinois at Chicago 835 S. Wolcott Avenue, M/C 901, Chicago, IL 60612-7342, USA
6
Kusik BW, Hammond DR, Udvadia AJ. Transcriptional regulatory regions of gap43 needed in developing and regenerating retinal ganglion cells. Dev Dyn 2010; 239:482-95. [PMID: 20034105 DOI: 10.1002/dvdy.22190] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 12/14/2022]
Abstract
Mammals and fish differ in their ability to express axon growth-associated genes in response to CNS injury, which contributes to the differences in their ability for CNS regeneration. Previously we demonstrated that for the axon growth-associated gene, gap43, regions of the rat promoter that are sufficient to promote reporter gene expression in the developing zebrafish nervous system are not sufficient to promote expression in regenerating retinal ganglion cells in zebrafish. Recently, we identified a 3.6-kb gap43 promoter fragment from the pufferfish, Takifugu rubripes (fugu), that can promote reporter gene expression during both development and regeneration. Using promoter deletion analysis, we have found regions of the 3.6-kb fugu gap43 promoter that are necessary for expression in regenerating, but not developing, retinal ganglion cells. Within the 3.6-kb promoter, we have identified elements that are highly conserved among fish, as well as elements conserved among fish, mammals, and birds.
Affiliation(s)
- Brandon W Kusik
  - Department of Biological Sciences, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA
7
Guisinger MM, Chumley TW, Kuehl JV, Boore JL, Jansen RK. Implications of the plastid genome sequence of Typha (Typhaceae, Poales) for understanding genome evolution in Poaceae. J Mol Evol 2010; 70:149-66. [PMID: 20091301 PMCID: PMC2825539 DOI: 10.1007/s00239-009-9317-3] [Citation(s) in RCA: 142] [Impact Index Per Article: 9.5] [Received: 04/07/2009] [Accepted: 12/16/2009] [Indexed: 11/21/2022]
Abstract
Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the first non-grass Poales sequenced to date, and we present comparisons of genome organization and sequence evolution within Poales. Our results confirm that grass plastid genomes exhibit acceleration in both genomic rearrangements and nucleotide substitutions. Poaceae have multiple structural rearrangements, including three inversions, three genes losses (accD, ycf1, ycf2), intron losses in two genes (clpP, rpoC1), and expansion of the inverted repeat (IR) into both large and small single-copy regions. These rearrangements are restricted to the Poaceae, and IR expansion into the small single-copy region correlates with the phylogeny of the family. Comparisons of 73 protein-coding genes for 47 angiosperms including nine Poaceae genera confirm that the branch leading to Poaceae has significantly accelerated rates of change relative to other monocots and angiosperms. Furthermore, rates of sequence evolution within grasses are lower, indicating a deceleration during diversification of the family. Overall there is a strong correlation between accelerated rates of genomic rearrangements and nucleotide substitutions in Poaceae, a phenomenon that has been noted recently throughout angiosperms. The cause of the correlation is unknown, but faulty DNA repair has been suggested in other systems including bacterial and animal mitochondrial genomes.
Affiliation(s)
- Mary M Guisinger
  - Section of Integrative Biology, University of Texas, Austin, TX 78712, USA.
8
Ladunga I. Finding Homologs in Amino Acid Sequences Using Network BLAST Searches. Curr Protoc Bioinformatics 2009; Chapter 3:3.4.1-3.4.34. [DOI: 10.1002/0471250953.bi0304s25] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]
9
Elisaphenko EA, Kolesnikov NN, Shevchenko AI, Rogozin IB, Nesterova TB, Brockdorff N, Zakian SM. A dual origin of the Xist gene from a protein-coding gene and a set of transposable elements. PLoS One 2008; 3:e2521. [PMID: 18575625 PMCID: PMC2430539 DOI: 10.1371/journal.pone.0002521] [Citation(s) in RCA: 132] [Impact Index Per Article: 7.8] [Received: 01/22/2008] [Accepted: 05/15/2008] [Indexed: 11/18/2022]
Abstract
X-chromosome inactivation, which occurs in female eutherian mammals is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons including those with simple tandem repeats detectable in their structure have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA.
Affiliation(s)
- Eugeny A. Elisaphenko
  - Institute of Cytology and Genetics, Russian Academy of Sciences, Siberian Department, Novosibirsk, Russia
- Nikolay N. Kolesnikov
  - Institute of Cytology and Genetics, Russian Academy of Sciences, Siberian Department, Novosibirsk, Russia
- Alexander I. Shevchenko
  - Institute of Cytology and Genetics, Russian Academy of Sciences, Siberian Department, Novosibirsk, Russia
- Igor B. Rogozin
  - Institute of Cytology and Genetics, Russian Academy of Sciences, Siberian Department, Novosibirsk, Russia
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Tatyana B. Nesterova
  - Medical Research Council, Clinical Sciences Centre, Imperial College Faculty of Medicine, London, United Kingdom
- Neil Brockdorff
  - Department of Biochemistry, University of Oxford, Oxford, United Kingdom
- Suren M. Zakian
  - Institute of Cytology and Genetics, Russian Academy of Sciences, Siberian Department, Novosibirsk, Russia
10
Elnitski L, Riemer C, Schwartz S, Hardison R, Miller W. PipMaker: a World Wide Web server for genomic sequence alignments. Curr Protoc Bioinformatics 2008; Chapter 10:Unit 10.2. [PMID: 18428692 DOI: 10.1002/0471250953.bi1002s00] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 01/22/2023]
Abstract
PipMaker is a World-Wide Web site used to compare two long genomic sequences and identify conserved segments between them. This unit describes the use of the PipMaker server and explains the resulting output files. PipMaker provides an efficient method of aligning genomic sequences and returns a compact, but easy-to-interpret form of output, the percent identity plot (pip). For each aligning segment between two sequences the pip shows both the position relative to the first sequence and the degree of similarity. Optional annotations on the pip provide additional information to assist in the interpretation of the alignment. The default parameters of the underlying blastz alignment program are tuned for human-mouse alignments.
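The idea behind a percent identity plot (one point or bar per aligning segment, placed by its position in the first sequence and by its degree of similarity) can be sketched as follows. This is a simplified illustration and not PipMaker's implementation: it assumes the alignment is already available as two gapped rows, treats each gap-free run as one segment, and does not involve blastz. The function name `pip_segments` is hypothetical.

```python
from typing import List, Tuple


def pip_segments(ref: str, other: str) -> List[Tuple[int, int, float]]:
    """Return (start, end, percent_identity) for each gap-free aligned block.

    `ref` and `other` are the two rows of one pairwise alignment: equal
    length, with '-' marking a gap. `start` and `end` are 1-based positions
    in the ungapped reference (first) sequence.
    """
    if len(ref) != len(other):
        raise ValueError("alignment rows must have equal length")

    segments: List[Tuple[int, int, float]] = []
    ref_pos = 0                       # position in the ungapped reference
    start, matches, length = None, 0, 0

    for a, b in zip(ref.upper(), other.upper()):
        if a != "-":
            ref_pos += 1
        if a == "-" or b == "-":      # a gap closes the current block
            if length:
                segments.append((start, start + length - 1, 100.0 * matches / length))
            start, matches, length = None, 0, 0
            continue
        if start is None:
            start = ref_pos
        length += 1
        matches += a == b

    if length:                        # flush the final block
        segments.append((start, start + length - 1, 100.0 * matches / length))
    return segments


example = pip_segments("ACGT-ACGTAC", "ACGAAAC--AC")
```

Each tuple in `example` corresponds to one horizontal bar on a pip: the x extent is the reference interval and the y value is the percent identity.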
Affiliation(s)
- Laura Elnitski
  - The Pennsylvania State University, University Park, Pennsylvania, USA
11
Yang Z, Jiang H, Zhao F, Shankar DB, Sakamoto KM, Zhang MQ, Lin S. A highly conserved regulatory element controls hematopoietic expression of GATA-2 in zebrafish. BMC Dev Biol 2007; 7:97. [PMID: 17708765 PMCID: PMC1988811 DOI: 10.1186/1471-213x-7-97] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 11/01/2006] [Accepted: 08/20/2007] [Indexed: 01/30/2023]
Abstract
BACKGROUND: GATA-2 is a transcription factor required for hematopoietic stem cell survival as well as for neuronal development in vertebrates. It has been shown that specific expression of GATA-2 in blood progenitor cells requires distal cis-acting regulatory elements. Identification and characterization of these elements should help elucidate the transcriptional regulatory mechanisms of GATA-2 expression in the hematopoietic lineage.
RESULTS: By pair-wise alignments of the zebrafish genomic sequences flanking GATA-2 to orthologous regions of fugu, mouse, rat and human genomes, we identified three highly conserved non-coding sequences in the genomic region flanking GATA-2, two upstream of GATA-2 and another downstream. Using both transposon and bacterial artificial chromosome mediated germline transgenic zebrafish analyses, one of the sequences was established as necessary and sufficient to direct hematopoietic GFP expression in a manner that recapitulates that of GATA-2. In addition, we demonstrated that this element has enhancer activity in mammalian myeloid leukemia cell lines, thus validating its functional conservation among vertebrate species. Further analysis of potential transcription factor binding sites suggested that integrity of the putative HOXA3 and LMO2 sites is required for regulating GATA-2/GFP hematopoietic expression.
CONCLUSION: Regulation of GATA-2 expression in hematopoietic cells is likely conserved among vertebrate animals. The integrated approach described here, drawing on embryological, transgenesis and computational methods, should be generally applicable to analyze tissue-specific gene regulation involving distal DNA cis-acting elements.
Affiliation(s)
- Zhongan Yang
  - Department of Molecular, Cell and Developmental Biology, University of California Los Angeles, Los Angeles, California 90095-1606, USA
- Hong Jiang
  - Department of Molecular, Cell and Developmental Biology, University of California Los Angeles, Los Angeles, California 90095-1606, USA
- Fang Zhao
  - Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Deepa B Shankar
  - Division of Hematology-Oncology and Pathology and Laboratory Medicine, Gwynne Hazen Cherry Memorial Laboratories, David Geffen School of Medicine at UCLA, Los Angeles, California 90095-1752, USA
- Kathleen M Sakamoto
  - Division of Hematology-Oncology and Pathology and Laboratory Medicine, Gwynne Hazen Cherry Memorial Laboratories, David Geffen School of Medicine at UCLA, Los Angeles, California 90095-1752, USA
- Michael Q Zhang
  - Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Shuo Lin
  - Department of Molecular, Cell and Developmental Biology, University of California Los Angeles, Los Angeles, California 90095-1606, USA
12
Saski C, Lee SB, Fjellheim S, Guda C, Jansen RK, Luo H, Tomkins J, Rognli OA, Daniell H, Clarke JL. Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes. Theor Appl Genet 2007; 115:571-90. [PMID: 17534593 PMCID: PMC2674615 DOI: 10.1007/s00122-007-0567-4] [Citation(s) in RCA: 134] [Impact Index Per Article: 7.4] [Received: 11/22/2006] [Accepted: 04/23/2007] [Indexed: 05/07/2023]
Abstract
Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5' end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19-37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16-21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C-U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae.
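The SSR screen mentioned above can be illustrated with a small regular-expression search for perfect tandem repeats. The abstract does not give the motif lengths or repeat thresholds used, so the defaults below (`unit_len=1`, `min_repeats=10`) and the function name `find_ssrs` are assumptions chosen only to make the sketch concrete. It does not reproduce the authors' repeat-analysis software.

```python
import re
from typing import List, Tuple


def find_ssrs(seq: str, unit_len: int = 1, min_repeats: int = 10) -> List[Tuple[int, str, int]]:
    """Find perfect tandem repeats of a `unit_len`-bp motif.

    Returns (1-based start, motif, repeat count) for every run with at least
    `min_repeats` copies. The thresholds are illustrative defaults.
    """
    if unit_len < 1 or min_repeats < 2:
        raise ValueError("unit_len must be >= 1 and min_repeats >= 2")

    # One captured motif followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))

    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        # Skip e.g. "AA" repeats, which are really mononucleotide runs.
        if unit_len > 1 and len(set(motif)) == 1:
            continue
        hits.append((match.start() + 1, motif, len(match.group(0)) // unit_len))
    return hits


toy = "GATTACA" + "A" * 12 + "CGCG" + "AT" * 6 + "G"
mono = find_ssrs(toy)                             # mononucleotide runs of >= 10 bp
di = find_ssrs(toy, unit_len=2, min_repeats=5)    # dinucleotide repeats, >= 5 copies
```

Judging whether an SSR is potentially polymorphic, as the abstract describes, would additionally require comparing the same locus across genomes, which this sketch does not attempt.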
Affiliation(s)
- Christopher Saski
  - Clemson University Genomics Institute, Clemson University, Biosystems Research Complex, 51 New Cherry Street, Clemson, SC 29634, USA
- Seung-Bum Lee
  - 4000 Central Florida Blvd, Department of Molecular Biology and Microbiology, Biomolecular Science, University of Central Florida, Building #20, Orlando, FL 32816-2364, USA
- Siri Fjellheim
  - Department of Plant and Environmental Sciences, Norwegian University of Life Sciences, 1432 Aas, Norway
- Chittibabu Guda
  - Gen*NY*Sis Center for Excellence in Cancer Genomics and Department of Epidemiology and Biostatistics, State University of New York at Albany, 1 Discovery Dr Rensselaer, New York, NY 12144, USA
- Robert K. Jansen
  - Section of Integrative Biology and Institute of Cellular and Molecular Biology, Biological Laboratories 404, University of Texas, Austin, TX 78712, USA
- Hong Luo
  - Department of Genetics and Biochemistry, Clemson University, 51 New Cherry Street, Clemson, SC 29634, USA
- Jeffrey Tomkins
  - Clemson University Genomics Institute, Clemson University, Biosystems Research Complex, 51 New Cherry Street, Clemson, SC 29634, USA
- Odd Arne Rognli
  - Department of Plant and Environmental Sciences, Norwegian University of Life Sciences, 1432 Aas, Norway
- Henry Daniell
  - 4000 Central Florida Blvd, Department of Molecular Biology and Microbiology, Biomolecular Science, University of Central Florida, Building #20, Orlando, FL 32816-2364, USA
- Jihong Liu Clarke
  - Department of Genetics and Biotechnology, Norwegian Institute for Agricultural and Environmental Sciences, 1432 Aas, Norway
13
Jeffery IB, Madden SF, McGettigan PA, Perrière G, Culhane AC, Higgins DG. Integrating transcription factor binding site information with gene expression datasets. Bioinformatics 2006; 23:298-305. [PMID: 17127681 DOI: 10.1093/bioinformatics/btl597] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Microarrays are widely used to measure gene expression differences between sets of biological samples. Many of these differences will be due to differences in the activities of transcription factors. In principle, these differences can be detected by associating motifs in promoters with differences in gene expression levels between the groups. In practice, this is hard to do.
RESULTS: We combine correspondence analysis, between group analysis and co-inertia analysis to determine which motifs, from a database of promoter motifs, are strongly associated with differences in gene expression levels. Given a database of motifs and gene expression levels from a set of arrays, the method produces a ranked list of motifs associated with any specified split in the arrays. We give an example using the Gene Atlas compendium of gene expression levels for human tissues where we search for motifs that are associated with expression in central nervous system (CNS) or muscle tissues. Most of the motifs that we find are known from previous work to be strongly associated with expression in CNS or muscle. We give a second example using a published prostate cancer dataset where we can simply and clearly find which transcriptional pathways are associated with differences between benign and metastatic samples.
AVAILABILITY: The source code is freely available upon request from the authors.
Affiliation(s)
- Ian B Jeffery
- UCD Conway Institute, University College Dublin, Belfield, Dublin 4, Ireland.
|
14
|
Annilo T, Chen ZQ, Shulenin S, Costantino J, Thomas L, Lou H, Stefanov S, Dean M. Evolution of the vertebrate ABC gene family: analysis of gene birth and death. Genomics 2006; 88:1-11. [PMID: 16631343 DOI: 10.1016/j.ygeno.2006.03.001] [Citation(s) in RCA: 134] [Impact Index Per Article: 7.1] [Received: 10/13/2005] [Revised: 02/28/2006] [Accepted: 03/02/2006] [Indexed: 10/24/2022]
Abstract
Vertebrate evolution has been largely driven by the duplication of genes that allow for the acquisition of new functions. The ATP-binding cassette (ABC) proteins constitute a large and functionally diverse family of membrane transporters. The members of this multigene family are found in all cellular organisms, most often engaged in the translocation of a wide variety of substrates across lipid membranes. Because of the diverse function of these genes, their large size, and the large number of orthologs, ABC genes represent an excellent tool to study gene family evolution. We have identified ABC proteins from the sea squirt (Ciona intestinalis), zebrafish (Danio rerio), and chicken (Gallus gallus) and, using phylogenetic analysis, identified those genes with a one-to-one orthologous relationship to human ABC proteins. All ABC protein subfamilies found in Ciona and zebrafish correspond to the human subfamilies, with the exception of a single ABCH subfamily gene found only in zebrafish. Multiple gene duplication and deletion events were identified in different lineages, indicating an ongoing process of gene evolution. As many ABC genes are involved in human genetic diseases, and important drug transport phenotypes, the understanding of ABC gene evolution is important to the development of animal models and functional studies.
Affiliation(s)
- Tarmo Annilo
- Laboratory of Genomic Diversity, Building 560, Room 21-18, NCI-Frederick, Frederick, MD 21702, USA
|
15
|
Daniell H, Lee SB, Grevich J, Saski C, Quesada-Vargas T, Guda C, Tomkins J, Jansen RK. Complete chloroplast genome sequences of Solanum bulbocastanum, Solanum lycopersicum and comparative analyses with other Solanaceae genomes. Theor Appl Genet 2006; 112:1503-18. [PMID: 16575560 DOI: 10.1007/s00122-006-0254-x] [Citation(s) in RCA: 114] [Impact Index Per Article: 6.0] [Received: 11/07/2005] [Accepted: 02/24/2006] [Indexed: 05/07/2023]
Abstract
Despite the agricultural importance of both potato and tomato, very little is known about their chloroplast genomes. Analysis of the complete sequences of tomato, potato, tobacco, and Atropa chloroplast genomes reveals significant insertions and deletions within certain coding regions or regulatory sequences (e.g., deletion of repeated sequences within 16S rRNA, ycf2 or ribosomal binding sites in ycf2). RNA, photosynthesis, and atp synthase genes are the least divergent and the most divergent genes are clpP, cemA, ccsA, and matK. Repeat analyses identified 33-45 direct and inverted repeats ≥30 bp with a sequence identity of at least 90%; all but five of the repeats shared by all four Solanaceae genomes are located in the same genes or intergenic regions, suggesting a functional role. A comprehensive genome-wide analysis of all coding sequences and intergenic spacer regions was done for the first time in chloroplast genomes. Only four spacer regions are fully conserved (100% sequence identity) among all genomes; deletions or insertions within some intergenic spacer regions result in less than 25% sequence identity, underscoring the importance of choosing appropriate intergenic spacers for plastid transformation and providing valuable new information for phylogenetic utility of the chloroplast intergenic spacer regions. Comparison of coding sequences with expressed sequence tags showed considerable amount of variation, resulting in amino acid changes; none of the C-to-U conversions observed in potato and tomato were conserved in tobacco and Atropa. It is possible that there has been a loss of conserved editing sites in potato and tomato.
Affiliation(s)
- Henry Daniell
- Department of Molecular Biology & Microbiology, Biomolecular Science, University of Central Florida, 4000 Central Florida Blvd, Bldg # 20, Room 336, Orlando, FL 32816-2364, USA.
|
16
|
Liu GE, Adams MD. Genome resources and comparative analysis tools for cardiovascular research. Methods Mol Med 2006; 128:101-23. [PMID: 17071992 DOI: 10.1007/978-1-59745-159-8_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
Disorders of the cardiovascular system are often caused by the interaction of genetic and environmental factors that jointly contribute to individual susceptibility. Genomic data and bioinformatics tools generated from genome projects, coupled with functional verification, offer novel approaches to study both rare single-gene and complex multigenic cardiovascular diseases. These approaches include gene mapping using genome variation, especially single-nucleotide polymorphisms and comparative genomics within and between species. This chapter illustrates the major genome resources, associated bioinformatics tools, and their potential application in cardiovascular research.
Affiliation(s)
- George E Liu
- Bovine Functional Genomics Laboratory, Animal and Natural Resources Institute, US Department of Agriculture-Agriculture Research Service, Beltsville, MD, USA
|
17
|
Saski C, Lee SB, Daniell H, Wood TC, Tomkins J, Kim HG, Jansen RK. Complete chloroplast genome sequence of Gycine max and comparative analyses with other legume genomes. Plant Mol Biol 2005; 59:309-22. [PMID: 16247559 DOI: 10.1007/s11103-005-8882-0] [Citation(s) in RCA: 169] [Impact Index Per Article: 8.5] [Received: 04/19/2005] [Accepted: 06/16/2005] [Indexed: 05/05/2023]
Abstract
Lack of complete chloroplast genome sequences is still one of the major limitations to extending chloroplast genetic engineering technology to useful crops. Therefore, we sequenced the soybean chloroplast genome and compared it to the other completely sequenced legumes, Lotus and Medicago. The chloroplast genome of Glycine is 152,218 basepairs (bp) in length, including a pair of inverted repeats of 25,574 bp of identical sequence separated by a small single copy region of 17,895 bp and a large single copy region of 83,175 bp. The genome contains 111 unique genes, and 19 of these are duplicated in the inverted repeat (IR). Comparisons of Glycine, Lotus and Medicago confirm the organization of legume chloroplast genomes based on previous studies. Gene content of the three legumes is nearly identical. The rpl22 gene is missing from all three legumes, and Medicago is missing rps16 and one copy of the IR. Gene order in Glycine, Lotus, and Medicago differs from the usual gene order for angiosperm chloroplast genomes by the presence of a single, large inversion of 51 kilobases (kb). Detailed analyses of repeated sequences indicate that many of the Glycine repeats that are located in the intergenic spacer regions and introns occur in the same location in the other legumes and in Arabidopsis, suggesting that they may play some functional role. The presence of small repeats of psbA and rbcL in legumes that have lost one copy of the IR indicate that this loss has only occurred once during the evolutionary history of legumes.
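The size figures quoted in this abstract can be checked against each other. The short sketch below assumes the usual quadripartite layout (one large single copy region, one small single copy region and two copies of the inverted repeat); the variable names are illustrative and do not come from the paper.

```python
# Region lengths in base pairs, as quoted in the abstract above.
large_single_copy = 83_175
small_single_copy = 17_895
inverted_repeat = 25_574  # length of ONE copy; the genome carries a pair

# Assumed quadripartite layout: LSC + SSC + two IR copies.
genome_length = large_single_copy + small_single_copy + 2 * inverted_repeat

print(genome_length)
```

Under that assumption the regions sum to the reported total of 152,218 bp.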
Affiliation(s)
- Christopher Saski
- Clemson University Genomics Institute, Clemson University, Biosystems Research Complex, 51 New Cherry Street, Clemson, SC 29634, USA
|
18
|
Brown CT, Xie Y, Davidson EH, Cameron RA. Paircomp, FamilyRelationsII and Cartwheel: tools for interspecific sequence comparison. BMC Bioinformatics 2005; 6:70. [PMID: 15790396 PMCID: PMC1087472 DOI: 10.1186/1471-2105-6-70] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Received: 11/18/2004] [Accepted: 03/24/2005] [Indexed: 11/28/2022] Open
Abstract
Background Comparative sequence analysis is an effective and increasingly common way to identify cis-regulatory regions in animal genomes. Results We describe three tools for comparative analysis of pairs of BAC-sized genomic regions. Paircomp is a tool that does windowed (ungapped) comparisons of two sequences and reports all matches above a set threshold. FamilyRelationsII is a graphical viewer for comparisons that enables interactive exploration of several different kinds of comparisons. Cartwheel is a Web site and compute-cluster management system used to execute and store comparisons for display by FamilyRelationsII. These tools are specialized for the discovery of cis-regulatory regions in animal genomes. All tools and their source code are freely available at . Conclusion These tools have been shown to effectively identify regulatory regions in echinoderms, mammals, and nematodes.
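As a rough illustration of what a windowed, ungapped comparison with a reporting threshold involves, here is a minimal sketch. It is not Paircomp's implementation: the function name `windowed_matches`, its parameters and the brute-force all-against-all loop are assumptions made for this example only.

```python
def windowed_matches(seq_a, seq_b, window=10, threshold=0.9):
    """Compare every ungapped window of seq_a with every window of seq_b.

    Returns a list of (start_in_a, start_in_b, identity) tuples for each
    window pair whose fraction of identical positions is >= threshold.
    Hypothetical helper for illustration; O(len_a * len_b * window) time.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            # Position-by-position comparison, no gaps allowed.
            same = sum(1 for x, y in zip(win_a, win_b) if x == y)
            identity = same / window
            if identity >= threshold:
                hits.append((i, j, identity))
    return hits


# Two short toy sequences, 8-base windows, report matches at >= 80% identity.
print(windowed_matches("ACGTACGTTT", "TTACGTACGA", window=8, threshold=0.8))
```

A real tool would need a faster search than this quadratic scan for BAC-sized inputs; the sketch only shows the window-and-threshold idea.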
Affiliation(s)
- C Titus Brown
- Division of Biological Sciences, California Institute of Technology, Pasadena, CA 91125, USA
- Center for Computational Regulatory Genomics, California Institute of Technology, Pasadena, CA 91125, USA
- Yuan Xie
- Center for Computational Regulatory Genomics, California Institute of Technology, Pasadena, CA 91125, USA
- Eric H Davidson
- Division of Biological Sciences, California Institute of Technology, Pasadena, CA 91125, USA
- R Andrew Cameron
- Division of Biological Sciences, California Institute of Technology, Pasadena, CA 91125, USA
- Center for Computational Regulatory Genomics, California Institute of Technology, Pasadena, CA 91125, USA
|
19
|
Martin N, Patel S, Segre JA. Long-range comparison of human and mouse Sprr loci to identify conserved noncoding sequences involved in coordinate regulation. Genome Res 2005; 14:2430-8. [PMID: 15574822 PMCID: PMC534667 DOI: 10.1101/gr.2709404] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Indexed: 11/24/2022]
Abstract
Mammalian epidermis provides a permeability barrier between an organism and its environment. Under homeostatic conditions, epidermal cells produce structural proteins, which are cross-linked in an orderly fashion to form a cornified envelope (CE). However, under genetic or environmental stress, specific genes are induced to rapidly build a temporary barrier. Small proline-rich (SPRR) proteins are the primary constituents of the CE. Under stress the entire family of 14 Sprr genes is upregulated. The Sprr genes are clustered within the larger epidermal differentiation complex on mouse chromosome 3, human chromosome 1q21. The clustering of the Sprr genes and their upregulation under stress suggest that these genes may be coordinately regulated. To identify enhancer elements that regulate this stress response activation of the Sprr locus, we utilized bioinformatic tools and classical biochemical dissection. Long-range comparative sequence analysis identified conserved noncoding sequences (CNSs). Clusters of epidermal-specific DNaseI-hypersensitive sites (HSs) mapped to specific CNSs. Increased prevalence of these HSs in barrier-deficient epidermis provides in vivo evidence of the regulation of the Sprr locus by these conserved sequences. Individual components of these HSs were cloned, and one was shown to have strong enhancer activity specific to conditions when the Sprr genes are coordinately upregulated.
Affiliation(s)
- Natalia Martin
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
|
20
|
Jansen RK, Raubeson LA, Boore JL, dePamphilis CW, Chumley TW, Haberle RC, Wyman SK, Alverson AJ, Peery R, Herman SJ, Fourcade HM, Kuehl JV, McNeal JR, Leebens-Mack J, Cui L. Methods for obtaining and analyzing whole chloroplast genome sequences. Methods Enzymol 2005; 395:348-84. [PMID: 15865976 DOI: 10.1016/s0076-6879(05)95020-9] [Citation(s) in RCA: 294] [Impact Index Per Article: 14.7] [Indexed: 12/05/2022]
Abstract
During the past decade, there has been a rapid increase in our understanding of plastid genome organization and evolution due to the availability of many new completely sequenced genomes. There are 45 complete genomes published and ongoing projects are likely to increase this sampling to nearly 200 genomes during the next 5 years. Several groups of researchers including ours have been developing new techniques for gathering and analyzing entire plastid genome sequences and details of these developments are summarized in this chapter. The most important developments that enhance our ability to generate whole chloroplast genome sequences involve the generation of pure fractions of chloroplast genomes by whole genome amplification using rolling circle amplification, cloning genomes into Fosmid or bacterial artificial chromosome (BAC) vectors, and the development of an organellar annotation program (Dual Organellar GenoMe Annotator [DOGMA]). In addition to providing details of these methods, we provide an overview of methods for analyzing complete plastid genome sequences for repeats and gene content, as well as approaches for using gene order and sequence data for phylogeny reconstruction. This explosive increase in the number of sequenced plastid genomes and improved computational tools will provide many insights into the evolution of these genomes and much new data for assessing relationships at deep nodes in plants and other photosynthetic organisms.
Affiliation(s)
- Robert K Jansen
- Section of Integrative Biology, The University of Texas at Austin, Institute of Cellular and Molecular Biology, Austin, Texas 78712-0253, USA
|
21
|
ABC: software for interactive browsing of genomic multiple sequence alignment data. BMC Bioinformatics 2004; 5:192. [PMID: 15588288 PMCID: PMC539296 DOI: 10.1186/1471-2105-5-192] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 10/22/2004] [Accepted: 12/08/2004] [Indexed: 01/14/2023] Open
Abstract
Background Alignment and comparison of related genome sequences is a powerful method to identify regions likely to contain functional elements. Such analyses are data intensive, requiring the inclusion of genomic multiple sequence alignments, sequence annotations, and scores describing regional attributes of columns in the alignment. Visualization and browsing of results can be difficult, and there are currently limited software options for performing this task. Results The Application for Browsing Constraints (ABC) is interactive Java software for intuitive and efficient exploration of multiple sequence alignments and data typically associated with alignments. It is used to move quickly from a summary view of the entire alignment via arbitrary levels of resolution to individual alignment columns. It allows for the simultaneous display of quantitative data, (e.g., sequence similarity or evolutionary rates) and annotation data (e.g. the locations of genes, repeats, and constrained elements). It can be used to facilitate basic comparative sequence tasks, such as export of data in plain-text formats, visualization of phylogenetic trees, and generation of alignment summary graphics. Conclusions The ABC is a lightweight, stand-alone, and flexible graphical user interface for browsing genomic multiple sequence alignments of specific loci, up to hundreds of kilobases or a few megabases in length. It is coded in Java for cross-platform use and the program and source code are freely available under the General Public License. Documentation and a sample data set are also available .
|
22
|
Annilo T, Dean M. Degeneration of an ATP-binding cassette transporter gene, ABCC13, in different mammalian lineages. Genomics 2004; 84:34-46. [PMID: 15203202 DOI: 10.1016/j.ygeno.2004.02.010] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.4] [Received: 09/08/2003] [Accepted: 02/19/2004] [Indexed: 11/18/2022]
Abstract
The ABC transporter gene family has evolved by a gene "birth-and-death" process; however, the number of ABC pseudogenes in the human genome is surprisingly small. On chromosome 21q11.2, spanning 90 kb, is an ABC gene-like sequence (recently annotated as ABCC13) with the highest similarity to ABCC2. Here we show that while comparative analysis and in silico prediction methods indicate the presence of at least 28 exons, the major ABCC13 transcript in humans consists of only 6 exons with a total length of 1.1 kb. The open reading frame of this transcript is capable of encoding a polypeptide of only 274 amino acids, compared to the more than 1500 amino acids of related ABC transporters. The truncated ABCC13 transcript shows tissue-specific expression, highest in fetal liver, bone marrow, and colon. Since the last exon of the ABCC13 transcript contains an apparent frameshift, we sequenced the respective region from several primates and found that the frameshift is due to an 11-bp deletion that is shared between human, chimpanzee, and gorilla, but is not found in monkeys. In addition, the human ABCC13 gene contains two other frameshift indels in the exons that encode the second nucleotide-binding domain, indicating that ABCC13 is not capable of encoding a functional ABC protein. In an attempt to identify an intact ABCC13 ortholog, we have sequenced the full-length cDNA from rhesus macaque, which contains an open reading frame of 1296 amino acids, producing an apparently functional ABC transporter. Although the mouse and rat genomes contain long-range similarity in the locus where Abcc13 is expected to reside, most of the Abcc13 exons in rodents are degraded below the threshold of sequence homology searches or have been deleted completely.
Affiliation(s)
- Tarmo Annilo
- Human Genetics Section, Laboratory of Genomic Diversity, National Cancer Institute-Frederick, Building 560, Room 21-31, Frederick, MD 21702, USA.
|
23
|
Wasserman WW, Sandelin A. Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet 2004; 5:276-87. [PMID: 15131651 DOI: 10.1038/nrg1315] [Citation(s) in RCA: 805] [Impact Index Per Article: 38.3] [Indexed: 11/08/2022]
Affiliation(s)
- Wyeth W Wasserman
- Centre for Molecular Medicine and Therapeutics and British Columbia Women's and Children's Hospitals, and Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia V5Z 4H4, Canada
|
24
|
Abrams KL, Xu J, Nativelle-Serpentini C, Dabirshahsahebi S, Rogers MB. An evolutionary and molecular analysis of Bmp2 expression. J Biol Chem 2004; 279:15916-28. [PMID: 14757762 DOI: 10.1074/jbc.m313531200] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.0] [Indexed: 11/06/2022] Open
Abstract
The coding regions of many metazoan genes are highly similar. For example, homologs to the key developmental factor bone morphogenetic protein (BMP) 2 have been cloned by sequence identity from arthropods, mollusks, cnidarians, and nematodes. Wide conservation of protein sequences suggests that differential gene expression explains many of the vast morphological differences between species. To test the hypothesis that the regulatory mechanisms controlling this evolutionarily ancient and critical gene are conserved, we compared sequences flanking Bmp2 genes of several species. We identified numerous conserved noncoding sequences including some retained since the fish lineage separated 450 million years ago. We tested the function of some of these sequences in the F9 cell model system of Bmp2 expression. We demonstrated that both mouse and primate Bmp2 promoters drive a reporter gene in an expression pattern resembling that of the endogenous transcript in F9 cells. A conserved Sp1 site contributes to the retinoic acid responsiveness of the Bmp2 promoter, which lacks a classical retinoic acid response element. We have also discovered a sequence downstream of the stop codon whose conservation between humans, rodents, deer, chickens, frogs, and fish is striking. A fragment containing this region influences reporter gene expression in F9 cells. The conserved region contains elements that may mediate the half-life of the Bmp2 transcript. Together, our molecular and evolutionary analysis has identified new regulatory elements controlling Bmp2 expression.
Affiliation(s)
- Kevin L Abrams
- Department of Biology, University of South Florida, Tampa, Florida 33620, USA
|
25
|
Nazina AG, Papatsenko DA. Statistical extraction of Drosophila cis-regulatory modules using exhaustive assessment of local word frequency. BMC Bioinformatics 2003; 4:65. [PMID: 14690551 PMCID: PMC341902 DOI: 10.1186/1471-2105-4-65] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.9] [Received: 07/19/2003] [Accepted: 12/22/2003] [Indexed: 11/13/2022] Open
Abstract
Background Transcription regulatory regions in higher eukaryotes are often represented by cis-regulatory modules (CRM) and are responsible for the formation of specific spatial and temporal gene expression patterns. These extended, ~1 KB, regions are found far from coding sequences and cannot be extracted from genome on the basis of their relative position to the coding regions. Results To explore the feasibility of CRM extraction from a genome, we generated an original training set, containing annotated sequence data for most of the known developmental CRMs from Drosophila. Based on this set of experimental data, we developed a strategy for statistical extraction of cis-regulatory modules from the genome, using exhaustive analysis of local word frequency (LWF). To assess the performance of our analysis, we measured the correlation between predictions generated by the LWF algorithm and the distribution of conserved non-coding regions in a number of Drosophila developmental genes. Conclusions In most of the cases tested, we observed high correlation (up to 0.6–0.8, measured on the entire gene locus) between the two independent techniques. We discuss computational strategies available for extraction of Drosophila CRMs and possible extensions of these methods.
Affiliation(s)
- Anna G Nazina
- Department of Biology, New York University, New York, USA
|
26
|
Ureta-Vidal A, Ettwiller L, Birney E. Comparative genomics: genome-wide analysis in metazoan eukaryotes. Nat Rev Genet 2003; 4:251-62. [PMID: 12671656 DOI: 10.1038/nrg1043] [Citation(s) in RCA: 143] [Impact Index Per Article: 6.5] [Indexed: 11/10/2022]
Abstract
The increasing number of complete and nearly complete metazoan genome sequences provides a significant amount of material for large-scale comparative genomic analysis. Finding new effective methods to analyse such enormous datasets has been the object of intense research. Three main areas in comparative genomics have recently shown important developments: whole-genome alignment, gene prediction and regulatory-region prediction. Each of these areas improves the methods of deciphering long genomic sequences and uncovering what lies hidden in them.
Affiliation(s)
- Abel Ureta-Vidal
- EnsEMBL Project, Room A2-06, EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
27
|
Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2003. [PMCID: PMC2448450 DOI: 10.1002/cfg.228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
|