1
Abstract
INTRODUCTION: 1α,25-Dihydroxyvitamin D3 (1,25-D3) is antiproliferative in preclinical models of lung cancer, but in tumor tissues its efficacy may be limited by CYP24A1 expression. CYP24A1 is the rate-limiting catabolic enzyme for 1,25-D3 and is overexpressed in human lung adenocarcinoma (AC) by unknown mechanisms.
METHODS: The DNA methylation status of CYP24A1 was determined by bisulfite DNA pyrosequencing in a panel of 30 lung cell lines and 90 surgically resected lung AC. The level of CYP24A1 methylation was correlated with CYP24A1 expression in lung AC cell lines and tumors. In addition, histone modifications were assessed by quantitative chromatin immunoprecipitation-polymerase chain reaction (ChIP-qPCR) in A549, NCI-H460, and SK-LU-1.
RESULTS: Bisulfite DNA pyrosequencing analysis revealed that the CYP24A1 gene was heterogeneously methylated in lung AC. Expression of CYP24A1 was inversely correlated with promoter DNA methylation in lung AC cell lines and tumors. Treatment with 5-aza-2'-deoxycytidine (5-Aza) and trichostatin A (TSA) increased CYP24A1 expression in lung AC. We observed that CYP24A1 promoter hypermethylation decreased CYP24A1 enzyme activity in vitro, whereas treatment with 5-Aza and/or TSA increased CYP24A1 enzyme affinity for its substrate 1,25-D3. In addition, ChIP-qPCR analysis revealed specific histone modifications within the CYP24A1 promoter region. Treatment with TSA increased H3K4me2 and H3K9ac and simultaneously decreased H3K9me2 at the CYP24A1 promoter, and treatment with 5-Aza and/or TSA increased the recruitment of vitamin D receptor (VDR) to vitamin D response elements (VDRE) of the CYP24A1 promoter.
CONCLUSIONS: Expression of the CYP24A1 gene in human lung AC is in part epigenetically regulated by promoter DNA methylation and repressive histone modifications. These findings should be taken into consideration when targeting CYP24A1 to optimize the antiproliferative effects of 1,25-D3 in lung AC.
2
Andorf CM, Kopylov M, Dobbs D, Koch KE, Stroupe ME, Lawrence CJ, Bass HW. G-Quadruplex (G4) Motifs in the Maize (Zea mays L.) Genome Are Enriched at Specific Locations in Thousands of Genes Coupled to Energy Status, Hypoxia, Low Sugar, and Nutrient Deprivation. J Genet Genomics 2014; 41:627-47. [DOI: 10.1016/j.jgg.2014.10.004] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 08/23/2014] [Revised: 10/16/2014] [Accepted: 10/24/2014] [Indexed: 02/07/2023]
3
Managadze D, Lobkovsky AE, Wolf YI, Shabalina SA, Rogozin IB, Koonin EV. The vast, conserved mammalian lincRNome. PLoS Comput Biol 2013; 9:e1002917. [PMID: 23468607 PMCID: PMC3585383 DOI: 10.1371/journal.pcbi.1002917] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 06/13/2012] [Accepted: 12/26/2012] [Indexed: 01/22/2023] Open
Abstract
We compare the sets of experimentally validated long intergenic non-coding (linc)RNAs from human and mouse and apply a maximum likelihood approach to estimate the total number of lincRNA genes as well as the size of the conserved part of the lincRNome. Under the assumption that the sets of experimentally validated lincRNAs are random samples of the lincRNomes of the corresponding species, we estimate the total lincRNome size at approximately 40,000 to 50,000 species, at least twice the number of protein-coding genes. We further estimate that the fraction of the human and mouse euchromatic genomes encoding lincRNAs is more than twofold greater than the fraction of protein-coding sequences. Although the sequences of most lincRNAs are much less strongly conserved than protein sequences, the extent of orthology between the lincRNomes is unexpectedly high, with 60 to 70% of the lincRNA genes shared between human and mouse. The orthologous mammalian lincRNAs can be predicted to perform equivalent functions; accordingly, it appears likely that thousands of evolutionarily conserved functional roles of lincRNAs remain to be characterized.
Genome analysis of humans and other mammals reveals a surprisingly small number of protein-coding genes, only slightly over 20,000 (although the diversity of actual proteins is substantially augmented by alternative transcription and alternative splicing). Recent analysis of mammalian genomes and transcriptomes, in particular using RNA-seq technology, shows that, in addition to protein-coding genes, mammalian genomes encode many long non-coding RNAs. For some of these transcripts, various regulatory functions have been demonstrated, but on the whole the repertoire of long non-coding RNAs remains poorly characterized. We compared the identified long intergenic non-coding (linc)RNAs from human and mouse, and employed a specially developed statistical technique to estimate the size and evolutionary conservation of the human and mouse lincRNomes. The estimates show that there are at least twice as many human and mouse lincRNAs as there are protein-coding genes. Moreover, about two thirds of the lincRNA genes appear to be conserved between human and mouse, implying thousands of conserved but still uncharacterized functions.
Affiliation(s)
- David Managadze
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Alexander E. Lobkovsky
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Yuri I. Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Svetlana A. Shabalina
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Igor B. Rogozin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
4
Yin PY, Shyu SJ, Yang SR, Chang YC. Reinforcement Learning for Improving Gene Identification Accuracy by Combination of Gene-Finding Programs. International Journal of Applied Metaheuristic Computing 2012. [DOI: 10.4018/jamc.2012010104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Owing to the explosive growth of genome databases, gene discovery has become one of the most computationally intensive tasks in bioinformatics. Many systems have been developed to find genes; however, there is still room to improve prediction accuracy. This paper proposes a reinforcement learning model for combining gene predictions from existing gene-finding programs. The model learns the optimal policy for accepting the best predictions. The fitness of a policy is reinforced if the selected prediction at a nucleotide site correctly corresponds to the true annotation. The model searches for the optimal policy that maximizes the expected prediction accuracy over all nucleotide sites in the sequences. The experimental results demonstrate that the proposed model yields higher prediction accuracy than that obtained by the single best program.
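The idea of reinforcing a policy whenever the selected prediction matches the annotation can be illustrated with a minimal sketch. This is not the model from the paper: it is a simplified contextual-bandit stand-in with optimistic initial values and greedy selection, and the function names, the three hypothetical programs, and the toy site data are all assumptions made for illustration. Each site is a tuple of per-program calls (1 = coding, 0 = non-coding) plus the true annotation.

```python
from collections import defaultdict

def train_policy(sites, n_programs, epochs=3):
    """Learn, for each pattern of program outputs, which program to trust."""
    # Optimistic start (1.0) makes the greedy rule try every untested program.
    q = defaultdict(lambda: [1.0] * n_programs)
    n = defaultdict(lambda: [0] * n_programs)
    for _ in range(epochs):
        for preds, truth in sites:
            a = max(range(n_programs), key=lambda i: q[preds][i])  # greedy choice
            reward = 1.0 if preds[a] == truth else 0.0             # reinforce correct picks
            n[preds][a] += 1
            q[preds][a] += (reward - q[preds][a]) / n[preds][a]    # running mean of rewards
    return q

def accuracy(sites, choose):
    """Fraction of sites where the chosen program's call matches the annotation."""
    return sum(preds[choose(preds)] == truth for preds, truth in sites) / len(sites)

# Toy data (hypothetical): calls from programs A, B, C, then the true label.
sites = [
    ((1, 0, 0), 1),
    ((0, 1, 1), 1),
    ((0, 0, 1), 0),
    ((1, 1, 0), 1),
    ((1, 0, 1), 0),
    ((0, 0, 0), 0),
] * 5

q = train_policy(sites, n_programs=3)
policy_acc = accuracy(sites, lambda p: max(range(3), key=lambda i: q[p][i]))
single_accs = [accuracy(sites, lambda p, k=k: k) for k in range(3)]
print(policy_acc, single_accs)
```

On this toy set the learned policy is evaluated on the same sites it was trained on, so the comparison with the single programs only illustrates the mechanism; it says nothing about the accuracy reported in the paper.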
5
Luo W, Karpf AR, Deeb KK, Muindi JR, Morrison CD, Johnson CS, Trump DL. Epigenetic regulation of vitamin D 24-hydroxylase/CYP24A1 in human prostate cancer. Cancer Res 2010; 70:5953-62. [PMID: 20587525 DOI: 10.1158/0008-5472.can-10-0617] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.9] [Indexed: 11/16/2022]
Abstract
Calcitriol, a regulator of calcium homeostasis with antitumor properties, is degraded by the product of the CYP24A1 gene, which is downregulated in human prostate cancer by unknown mechanisms. We found that CYP24A1 expression is inversely correlated with promoter DNA methylation in prostate cancer cell lines. Treatment with the DNA methyltransferase inhibitor 5-aza-2'-deoxycytidine (DAC) activates CYP24A1 expression in prostate cancer cells. In vitro methylation of the CYP24A1 promoter represses its promoter activity. Furthermore, inhibition of histone deacetylases by trichostatin A (TSA) enhances the expression of CYP24A1 in prostate cancer cells. Quantitative chromatin immunoprecipitation-PCR (ChIP-qPCR) reveals that specific histone modifications are associated with the CYP24A1 promoter region. Treatment with TSA increases H3K9ac and H3K4me2 and simultaneously decreases H3K9me2 at the CYP24A1 promoter. ChIP-qPCR assay reveals that treatment with DAC and TSA increases the recruitment of vitamin D receptor to the CYP24A1 promoter. Reverse transcriptase-PCR analysis of paired human prostate samples revealed that CYP24A1 expression is downregulated in prostate malignant lesions compared with adjacent histologically benign lesions. Bisulfite pyrosequencing shows that CYP24A1 gene is hypermethylated in malignant lesions compared with matched benign lesions. Our findings indicate that repression of CYP24A1 gene expression in human prostate cancer cells is mediated in part by promoter DNA methylation and repressive histone modifications.
Affiliation(s)
- Wei Luo
- Department of Pharmacology and Therapeutics, Roswell Park Cancer Institute, Buffalo, New York 14263, USA
6
Guselnikov SV, Reshetnikova ES, Najakshin AM, Mechetina LV, Robert J, Taranin AV. The amphibians Xenopus laevis and Silurana tropicalis possess a family of activating KIR-related Immunoglobulin-like receptors. Developmental and Comparative Immunology 2010; 34:308-15. [PMID: 19896971 PMCID: PMC2813978 DOI: 10.1016/j.dci.2009.10.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 09/08/2009] [Revised: 10/26/2009] [Accepted: 10/26/2009] [Indexed: 05/09/2023]
Abstract
In this study, we searched the amphibian species Xenopus laevis and Silurana (Xenopus) tropicalis for the presence of genes homologous to mammalian KIRs and avian CHIRs (KRIR family). By experimental and computational procedures, we identified four related ILR (Ig-like Receptors) genes in S. tropicalis and three in X. laevis. ILRs encode type I transmembrane receptors with 3-4 Ig-like extracellular domains. All predicted ILR proteins appear to be activating receptors. ILRs have a broad expression pattern: the gene transcripts were found in both lymphoid and non-lymphoid tissues. Phylogenetic analysis shows that the amphibian KRIR family receptors evolved independently from their mammalian and avian counterparts. The only conserved structural element of tetrapod KRIRs is the NxxR motif-containing transmembrane domain that facilitates association with the FcRgamma subunit. Our findings suggest that if KRIRs of various vertebrates have any common function at all, such a function is activating rather than inhibitory.
Affiliation(s)
- Sergey V Guselnikov
- Laboratory of Immunogenetics, Division of Molecular and Cellular Biology, Institute of Chemical Biology and Fundamental Medicine, Prospekt Lavrentyeva 8, Novosibirsk 630090, Russian Federation.
7
Childs KL. Genomic and genetic database resources for the grasses. Plant Physiology 2009; 149:132-6. [PMID: 19126704 PMCID: PMC2613734 DOI: 10.1104/pp.108.129593] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 09/08/2008] [Accepted: 11/06/2008] [Indexed: 05/20/2023]
Affiliation(s)
- Kevin L Childs
- Department of Plant Biology, Michigan State University, East Lansing, Michigan 48824, USA.
8
Guselnikov SV, Ramanayake T, Erilova AY, Mechetina LV, Najakshin AM, Robert J, Taranin AV. The Xenopus FcR family demonstrates continually high diversification of paired receptors in vertebrate evolution. BMC Evol Biol 2008; 8:148. [PMID: 18485190 PMCID: PMC2413239 DOI: 10.1186/1471-2148-8-148] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Received: 09/26/2007] [Accepted: 05/16/2008] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND: Recent studies have revealed an unexpected diversity of domain architecture among FcR-like receptors that presumably fulfill regulatory functions in the immune system. Different species of mammals, as well as chicken and catfish, have been found to possess strikingly different sets of these receptors. To better understand the evolutionary history of paired receptors, we extended the study of FcR-like genes to the amphibian representatives Xenopus tropicalis and Xenopus laevis.
RESULTS: The diploid genome of X. tropicalis contains at least 75 genes encoding paired FcR-related receptors designated XFLs. The allotetraploid X. laevis displays many similar genes primarily expressed in lymphoid tissues. Up to 35 domain architectures generated by combinatorial joining of six Ig-domain subtypes and two subtypes of the transmembrane regions were found in XFLs. None of these variants are shared by FcR-related proteins from other studied species. Putative activating XFLs associate with the FcRgamma subunit, and their transmembrane domains are highly similar to those of activating mammalian KIR-related receptors. This argues in favor of a common origin for the FcR and the KIR families. Phylogenetic analysis shows that the entire repertoires of the Xenopus and mammalian FcR-related proteins have emerged after the amphibian-amniotes split.
CONCLUSION: FcR- and KIR-related receptors evolved through continual species-specific diversification, most likely by extensive domain shuffling and birth-and-death processes. This mode of evolution raises the possibility that the ancestral function of these paired receptors was a direct interaction with pathogens and that many physiological functions found in the mammalian receptors were secondary acquisitions or specializations.
Affiliation(s)
- Jacques Robert
- University of Rochester Medical Centre, Rochester, NY, USA
9
Zhao XF, Fjose A, Larsen N, Helvik JV, Drivenes Ø. Treatment with small interfering RNA affects the microRNA pathway and causes unspecific defects in zebrafish embryos. FEBS J 2008; 275:2177-84. [PMID: 18384379 DOI: 10.1111/j.1742-4658.2008.06371.x] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Indexed: 12/01/2022]
Abstract
MicroRNAs (miRNAs) are generated from primary transcripts through sequential processing by two RNase III enzymes, Drosha and Dicer, in association with other proteins. This maturation is essential for their function as post-transcriptional regulators. Notably, Dicer is also a component of RNA-induced silencing complexes, which incorporate either miRNA or small interfering RNA (siRNA) as guides to target specific mRNAs. In zebrafish, processed miRNAs belonging to the miR-430 family have previously been shown to promote deadenylation and degradation of maternal mRNAs during early embryogenesis. We show that injection of one-cell-stage zebrafish embryos with siRNA causes a significant reduction in the endogenous levels of processed miR-430 and other miRNAs, leading to unspecific developmental defects. Coinjection of siRNA with preprocessed miR-430 efficiently rescued development. This indicates that the abnormalities generally observed in siRNA-treated zebrafish embryos could be due to inhibition of miR-430 processing and/or activity. Our results also suggest that the miRNA pathway in mammals, under some experimental or therapeutic conditions, may be affected by siRNA.
Affiliation(s)
- Xiao-Feng Zhao
- Department of Molecular Biology, University of Bergen, Norway
10
Abstract
Background: The CSL (CBF1/RBP-Jκ/Suppressor of Hairless/LAG-1) transcription factor family members are well-known components of the transmembrane receptor Notch signaling pathway, which plays a critical role in metazoan development. They function as context-dependent activators or repressors of transcription of their responsive genes, the promoters of which harbor the GTG(G/A)GAA consensus elements. Recently, several studies described Notch-independent activities of the CSL proteins.
Conclusion: Our findings support the evolutionary origin of the CSL transcription factor family in the last common ancestor of fungi and metazoans. We hypothesize that the ancestral CSL function involved DNA binding and Notch-independent regulation of transcription and that this function may still be shared, to a certain degree, by the present CSL family members from both fungi and metazoans.
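As an illustration of what the GTG(G/A)GAA consensus element means in practice, the following minimal sketch scans a DNA string for exact matches on both strands. The function names and the example sequence are hypothetical, and exact regular-expression matching is an assumption made for simplicity; it is not the method used in the study.

```python
import re

# Consensus element quoted in the abstract: GTG(G/A)GAA
CSL_MOTIF = re.compile(r"GTG[GA]GAA")

def reverse_complement(seq: str) -> str:
    """Reverse-complement an upper-case DNA string of A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_csl_sites(seq: str):
    """Return sorted (forward-strand start, strand, matched sequence) tuples."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group()) for m in CSL_MOTIF.finditer(seq)]
    for m in CSL_MOTIF.finditer(reverse_complement(seq)):
        # Map the reverse-strand match back to forward-strand coordinates.
        hits.append((len(seq) - m.end(), "-", m.group()))
    return sorted(hits)

# Hypothetical promoter fragment with one site on each strand.
example = "AAGTGGGAACCTTCTCACGG"
print(find_csl_sites(example))
# → [(2, '+', 'GTGGGAA'), (11, '-', 'GTGAGAA')]
```

Positions are 0-based starts on the forward strand, so a reverse-strand hit reports where its footprint begins when the sequence is read left to right.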
11
Andreini C, Banci L, Bertini I, Elmi S, Rosato A. Non-heme iron through the three domains of life. Proteins 2007; 67:317-24. [PMID: 17286284 DOI: 10.1002/prot.21324] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.7] [Indexed: 11/08/2022]
Abstract
Metalloproteins are proteins capable of binding one or more metal ions, which are often required for their biological function or for regulation of their activities or for structural purposes. In high-throughput genome-level protein investigation efforts, such as Structural Genomics, the systematic experimental characterization of metal-binding properties (i.e. the investigation of the metalloproteome) is not always pursued, and remains far from trivial. In the present work we have applied a bioinformatic approach to investigate the occurrence of (putative) non-heme iron-binding proteins in 57 different organisms spanning the entire tree of life. It is found that the non-heme iron-proteome constitutes between 1% and 10% of the entire proteome of an organism. However, the iron-proteome constitutes a higher fraction of the proteome in archaea (on average 7.1% +/- 2.1%) than in bacteria (3.9% +/- 1.6%) and in eukaryota (1.1% +/- 0.4%). The analysis of the function of each putative iron-protein identified suggests that extant organisms have inherited the large majority of their iron-proteome from the last common ancestor.
Affiliation(s)
- Claudia Andreini
- Magnetic Resonance Center (CERM) and Department of Chemistry, University of Florence, 50019 Sesto Fiorentino, Italy
12
Fayngerts SA, Najakshin AM, Taranin AV. Species-specific evolution of the FcR family in endothermic vertebrates. Immunogenetics 2007; 59:493-506. [PMID: 17356879 DOI: 10.1007/s00251-007-0208-8] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Received: 01/08/2007] [Accepted: 02/19/2007] [Indexed: 10/23/2022]
Abstract
In primates and rodents, the extended FcR family is comprised of three subsets: classical FcRs, structurally diverse cell surface receptors currently designated FCRL1-FCRL6, and intracellular proteins FCRLA and FCRLB. Using bioinformatic analysis, we revealed the FcR-like genes of the same three subsets in the genome of dog, another representative of placental mammals, and in the genome of short-tailed opossum, a representative of marsupials. In contrast, a single FcR-like gene was found in the current version of the chicken genome. This in silico finding was confirmed by the gene cloning and subsequent Southern blot hybridization. The chicken FCRL gene encodes a cell surface receptor with the extracellular region composed of four Ig-like domains of the D1-, D2-, D3-, and D4-subtypes. The gene is expressed in lymphoid and non-lymphoid tissues. Phylogenetic analysis of the mammalian and chicken genes suggested that classical FcRs, FCRLA, and FCRLB emerged after the mammalian-avian split but before the eutherian-marsupial radiation. The data obtained show that the repertoire of the classical FcRs and surface FcR-like proteins in mammalian species was shaped by an extensive recombination process, which resulted in domain shuffling and species-specific gain and loss of distinct exons or entire genes.
Affiliation(s)
- Svetlana A Fayngerts
- Laboratory of Immunogenetics, Institute of Cytology and Genetics, Novosibirsk, Russia
13
Andersen SU, Algreen-Petersen RG, Hoedl M, Jurkiewicz A, Cvitanich C, Braunschweig U, Schauser L, Oh SA, Twell D, Jensen EØ. The conserved cysteine-rich domain of a tesmin/TSO1-like protein binds zinc in vitro and TSO1 is required for both male and female fertility in Arabidopsis thaliana. Journal of Experimental Botany 2007; 58:3657-3670. [PMID: 18057042 DOI: 10.1093/jxb/erm215] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.9] [Indexed: 05/25/2023]
Abstract
Development of reproductive tissue and control of cell division are common challenges to all sexually reproducing eukaryotes. The Arabidopsis thaliana TSO1 gene is involved in both these processes. Mild tso1 mutant alleles influence only ovule development, whereas strong alleles have an effect on all floral tissues and cause cell division defects. The tso1 mutants described so far carry point mutations in a conserved cysteine-rich domain, the CRC domain, but the reason for the range of phenotypes observed is poorly understood. In the present study, the tesmin/TSO1-like CXC (TCX) proteins are characterized at the biochemical, genomic, transcriptomic, and functional level to address this question. It is shown that the CRC domain binds zinc, offering an explanation for the severity of tso1 alleles where cysteine residues are affected. In addition, the phylogenetic and expression analysis of the TCX genes suggested an overlap in function between AtTSO1 and the related gene AtTCX2. Their expression ratios indicated that pollen, in addition to ovules, would be sensitive to loss of TSO1 function. This was confirmed by analysis of novel tso1 T-DNA insertion alleles where the development of both pollen and ovules was affected.
Affiliation(s)
- Stig Uggerhøj Andersen
- Laboratory of Gene Expression, Department of Molecular Biology, University of Aarhus, Gustav Wieds Vej 10, DK-8000 Aarhus C, Denmark.
14
Abstract
In human adenoviruses (e.g., human adenovirus 5), the conserved eukaryotic promoter motifs of E1A, such as the TATA box and CAAT box sequences, lie between the left inverted terminal repeat (ITR) and the ATG of E1A. However, analysis of the left end of the bovine adenovirus 3 (BAdV-3) genome revealed that the conserved sequences of the E1A promoter are present only in the ITR. As such, the promoter activity of the ITR was tested in the context of a BAdV-3 vector or a plasmid-based system. Different regions of the left end of the BAdV-3 genome initiated transcription of the red fluorescent protein gene in a plasmid-based system. Moreover, BAdV-3 mutants in which the open reading frame of E1A was placed immediately downstream of the ITR produced E1A transcript and could be propagated in non-E1A-complementing Madin-Darby bovine kidney cells. These results suggest that the left ITR contains the sole BAdV-3 E1A promoter.
Affiliation(s)
- Li Xing
- Vectored Vaccine Program, Vaccine and Infectious Disease Organization, 120 Veterinary Road, University of Saskatchewan, Saskatoon, SK S7N 5E3, Canada
- Suresh Kumar Tikoo
- Vectored Vaccine Program, Vaccine and Infectious Disease Organization, 120 Veterinary Road, University of Saskatchewan, Saskatoon, SK S7N 5E3, Canada
15
Andreini C, Banci L, Bertini I, Rosato A. Counting the zinc-proteins encoded in the human genome. J Proteome Res 2006; 5:196-201. [PMID: 16396512 DOI: 10.1021/pr050361j] [Citation(s) in RCA: 685] [Impact Index Per Article: 38.1] [Indexed: 01/20/2023]
Abstract
Metalloproteins are proteins capable of binding one or more metal ions, which may be required for their biological function, for regulation of their activities, or for structural purposes. Genome sequencing projects have provided a huge number of protein primary sequences, but, even though several different elaborate analyses and annotations have been enabled by a rich and ever-increasing portfolio of bioinformatic tools, metal-binding properties remain difficult to predict as well as to investigate experimentally. Consequently, the present knowledge about metalloproteins is only partial. The present bioinformatic research proposes a strategy to answer the question of how many and which proteins encoded in the human genome may require zinc for their physiological function. This is achieved by a combination of approaches, which include: (i) searching in the proteome for the zinc-binding patterns that, in turn, are obtained from all available X-ray data; (ii) using libraries of metal-binding protein domains based on multiple sequence alignments of known metalloproteins obtained from the Pfam database; and (iii) mining the annotations of human gene sequences, which are based on any type of information available. It is found that 1684 proteins in the human proteome are independently identified by all three approaches as zinc-proteins, 746 are identified by two, and 777 are identified by only one method. By assuming that all proteins identified by at least two approaches are truly zinc-binding and inspecting the proteins identified by a single method, it can be proposed that ca. 2800 human proteins are potentially zinc-binding in vivo, corresponding to 10% of the human proteome, with an uncertainty of 400 sequences. Available functional information suggests that the large majority of human zinc-binding proteins are involved in the regulation of gene expression. The most abundant class of zinc-binding proteins in humans is that of zinc-fingers, with Cys4 and Cys2His2 being the most common types of coordination environment.
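The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative, and the number of single-method hits accepted after inspection is inferred from the totals rather than stated in the abstract.

```python
# Counts quoted in the abstract.
by_three, by_two, by_one = 1684, 746, 777

confident = by_three + by_two    # identified by at least two approaches
candidates = confident + by_one  # identified by at least one approach

estimate, uncertainty = 2800, 400  # "ca. 2800 ... with an uncertainty of 400"

# Inference, not stated in the abstract: single-method hits that would have
# to be accepted after inspection to reach the ca. 2800 estimate.
accepted_single = estimate - confident

# "corresponding to 10% of the human proteome" implies this proteome size.
proteome_percent = 10
implied_proteome = estimate * 100 // proteome_percent

print(confident, candidates, accepted_single, implied_proteome)
# → 2430 3207 370 28000
```

The stated range of 2800 ± 400 (2400 to 3200) sits close to the span between the two-method count (2430) and the all-candidates count (3207), which seems consistent with how the uncertainty is described.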
Affiliation(s)
- Claudia Andreini
- Magnetic Resonance Center (CERM), University of Florence, Via L. Sacconi 6, 50019 Sesto Fiorentino, Italy
16
Kasho VN, Smirnova IN, Kaback HR. Sequence alignment and homology threading reveals prokaryotic and eukaryotic proteins similar to lactose permease. J Mol Biol 2006; 358:1060-70. [PMID: 16574153 PMCID: PMC2785551 DOI: 10.1016/j.jmb.2006.02.049] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Received: 01/19/2006] [Revised: 02/16/2006] [Accepted: 02/17/2006] [Indexed: 11/16/2022]
Abstract
Certain prokaryotic transport proteins similar to the lactose permease of Escherichia coli (LacY) have been identified by BLAST searches of available genomic databanks. These proteins exhibit conservation of amino acid residues that participate in sugar binding and H(+) translocation in LacY. Homology threading of prokaryotic transporters based on the X-ray structure of LacY (PDB ID: 1PV7) and sequence similarities reveals a common overall fold for sugar transporters belonging to the Major Facilitator Superfamily (MFS) and suggests new targets for study. Evolution-based searches for sequence similarities also identify eukaryotic proteins bearing striking resemblance to MFS sugar transporters. Like LacY, the eukaryotic proteins are predicted to have 12 transmembrane domains (TMDs), and many of the irreplaceable residues for sugar binding and H(+) translocation in LacY appear to be largely conserved. The overall size of the eukaryotic homologs is about twice that of prokaryotic permeases, with longer N and C termini and loops between TMDs III-IV and VI-VII. The human gene encoding protein FLJ20160 consists of six exons located on more than 60,000 bp of DNA sequence and requires splicing to produce mature mRNA. Cellular localization predictions suggest membrane insertion with possible proteolysis at the N terminus, and expression studies with the human protein FLJ20160 demonstrate membrane insertion in both E. coli and Pichia pastoris. Widespread expression of the eukaryotic sugar transport candidates suggests an important role in cellular metabolism, particularly in brain and tumors. Homology is observed in the TMDs of both the eukaryotic and prokaryotic proteins that contain residues involved in sugar binding and H(+) translocation in LacY.
17
Sacconi S, Trevisson E, Pistollato F, Baldoin MC, Rezzonico R, Bourget I, Desnuelle C, Tenconi R, Basso G, DiMauro S, Salviati L. hCOX18 and hCOX19: Two human genes involved in cytochrome c oxidase assembly. Biochem Biophys Res Commun 2005; 337:832-9. [PMID: 16212937 DOI: 10.1016/j.bbrc.2005.09.127] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 09/16/2005] [Accepted: 09/17/2005] [Indexed: 10/25/2022]
Abstract
We identified the human homologues of yCOX18 and yCOX19, two Saccharomyces cerevisiae genes involved in the biogenesis of mitochondrial respiratory chain complexes. In yeast, these two genes are required for the expression of cytochrome c oxidase: Cox18p catalyses the insertion of Cox2p COOH-tail into the mitochondrial inner membrane, and Cox19p is probably involved in metal transport to the intermembrane space. Both hCox18p and hCox19p present significant amino acid identity with the corresponding yeast polypeptides and reveal highly conserved functional domains. In addition, their subcellular localization is analogous to that of the yeast proteins. These data strongly suggest that the human gene products share similar functions with their yeast homologues. These two COX-assembly genes represent new candidates for mutational analysis in patients with isolated COX deficiency of unknown etiology.
Affiliation(s)
- Sabrina Sacconi
- INSERM U638, Faculté de Médicine, Université de Nice, France
18
Wang M, Marín A. Characterization and prediction of alternative splice sites. Gene 2005; 366:219-27. [PMID: 16226402 DOI: 10.1016/j.gene.2005.07.015] [Citation(s) in RCA: 182] [Impact Index Per Article: 9.6] [Received: 10/30/2004] [Revised: 04/20/2005] [Accepted: 07/08/2005] [Indexed: 11/16/2022]
Abstract
Human alternative isoform, cryptic, skipped, and constitutive splice sites from the ALTEXTRON database were analysed regarding splice site strength, composition, GC content, and the position and binding site strength of the polypyrimidine tract and branch site. Several features were identified that distinguish alternative isoform and cryptic splice sites, but not skipped splice sites, from constitutive ones. These include splice site strength, intron GC content, U2AF35 binding site score, and oligonucleotide frequencies. For the predictive classification of splice sites, pattern recognition models for different splicing factor binding sites and oligonucleotide frequency models (OFMs) were combined using backpropagation networks. Networks trained to distinguish constitutive from alternative isoform/cryptic splice sites correctly classify 67.45% of acceptor sites and 71.23% of donor sites. A web application for the prediction of alternative splice sites is available at http://es.embnet.org/~mwang/assp.html.
Affiliation(s)
- Magnus Wang
- Departamento de Genética, Facultad de Biología, Universidad de Sevilla, Avenida de Reina Mercedes 6, E-41012 Sevilla, Spain.
|
19
|
Minczuk M, Lilpop J, Boros J, Stepien PP. The 5′ region of the human hSUV3 gene encoding mitochondrial DNA and RNA helicase: Promoter characterization and alternative pre-mRNA splicing. Biochim Biophys Acta 2005; 1729:81-7. [PMID: 15919122 DOI: 10.1016/j.bbaexp.2005.04.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 01/18/2005] [Revised: 04/19/2005] [Accepted: 04/20/2005] [Indexed: 10/25/2022]
Abstract
The human nuclear hSUV3 gene encodes an ATP-dependent RNA and DNA helicase, which predominantly localizes in the mitochondria. In yeast, the Suv3 helicase is a component of the mitochondrial degradosome, a two-subunit complex that degrades aberrant mtRNAs. In contrast to the well-documented physiological role of the yeast SUV3, the function of its human orthologue remains unknown. In this report, we have analyzed the hSUV3 5' genomic region. Our data suggest that hSUV3 is a housekeeping gene. Deletion analysis and in vitro mutagenesis revealed the presence of an enhancer region and regulatory elements in the basal promoter, including: (i) direct 10-bp-long repeats, which share significant sequence similarity with the consensus for the NF-kappaB/Rel family transcription factors, (ii) an Sp1 general transcription factor binding site, and (iii) NRF-1 transcription factor binding sites, the latter typical for nuclear-encoded mitochondrial genes. Furthermore, we show that the 5' region of the hSUV3 pre-mRNA can be alternatively spliced.
Affiliation(s)
- Michal Minczuk
- Department of Genetics, University of Warsaw, Pawinskiego 5A, 02-106 Warsaw, Poland.
|
20
|
Jovelin R, Phillips PC. Functional constraint and divergence in the G protein family in Caenorhabditis elegans and Caenorhabditis briggsae. Mol Genet Genomics 2005; 273:299-310. [PMID: 15856303 DOI: 10.1007/s00438-004-1105-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 07/02/2004] [Accepted: 12/09/2004] [Indexed: 10/25/2022]
Abstract
Part of the challenge of the post-genomic world is to identify functional elements within the wide array of information generated by genome sequencing. Although cross-species comparisons and investigation of rates of sequence divergence are an efficient approach, the relationship between sequence divergence and functional conservation is not clear. Here, we use a comparative approach to examine questions of evolutionary rates and conserved function within the guanine nucleotide-binding protein (G protein) gene family in nematodes of the genus Caenorhabditis. In particular, we show that, in cases where the Caenorhabditis elegans ortholog shows a loss-of-function phenotype, G protein genes of C. elegans and Caenorhabditis briggsae diverge on average three times more slowly than G protein genes that do not exhibit any phenotype when mutated in C. elegans, suggesting that genes with loss of function phenotypes are subject to stronger selective constraints in relation to their function in both species. Our results also indicate that selection is as strong on G proteins involved in environmental perception as it is on those controlling other important processes. Finally, using phylogenetic footprinting, we identify a conserved non-coding motif present in multiple copies in the genomes of four species of Caenorhabditis. The presence of this motif in the same intron in the gpa-1 genes of C. elegans, C. briggsae and Caenorhabditis remanei suggests that it plays a role in the regulation of gpa-1, as well as other loci.
Affiliation(s)
- Richard Jovelin
- Center for Ecology and Evolutionary Biology, University of Oregon, Eugene, OR, 97403-5289, USA
|
21
|
Bench AJ, Li J, Huntly BJP, Delabesse E, Fourouclas N, Hunt AR, Deloukas P, Green AR. Characterization of the imprinted polycomb gene L3MBTL, a candidate 20q tumour suppressor gene, in patients with myeloid malignancies. Br J Haematol 2004; 127:509-18. [PMID: 15566354 DOI: 10.1111/j.1365-2141.2004.05278.x] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Indexed: 12/01/2022]
Abstract
Chromosome 20q deletion is a recurrent chromosomal abnormality associated with myeloid malignancies. L3MBTL represents a strong candidate tumour suppressor gene since it lies within the common deleted region, is a member of the Polycomb-like family, encodes the human homologue of a Drosophila tumour suppressor and is expressed within haematopoietic progenitor cells. We describe the structure of L3MBTL, identify two putative promoters each associated with two CpG islands and characterize a complex pattern of alternative splicing events. Mutation analysis of the gene in patients with and without a 20q deletion identified several polymorphisms but no acquired mutations. The two CpG islands spanning promoter 2 undergo monoallelic methylation in normal haematopoietic cells consistent with imprinting of L3MBTL. Samples from patients with a 20q deletion retained either the methylated or unmethylated allele but retention of the methylated allele did not correlate with reduction in L3MBTL mRNA levels. The absence of a correlation between L3MBTL methylation and transcription could be shown to reflect loss of imprinting in one patient. In addition, our results demonstrate that inactivation of L3MBTL is not a common occurrence in patients with a 20q deletion or in cytogenetically normal patients with polycythaemia vera.
Affiliation(s)
- Anthony J Bench
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge CB2 2XY, UK
|
22
|
Milanesi L, Rogozin IB. ESTMAP: a system for expressed sequence tags mapping on genomic sequences. IEEE Trans Nanobioscience 2004; 2:75-8. [PMID: 15382662 DOI: 10.1109/tnb.2003.813928] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/07/2022]
Abstract
The completion of a number of large genome sequencing projects emphasizes the importance of protein-coding gene predictions. Most of the problems associated with gene prediction are caused by the complex exon-intron structures commonly found in eukaryotic genomes. However, information from homologous sequences can significantly improve the accuracy of the prediction. In particular, expressed sequence tags (ESTs) are very useful for this purpose, since currently existing EST collections are very large. We developed an ESTMAP system, which utilizes homology searches against a database of repetitive elements using the RepeatView program and the EST Division of GenBank using the BLASTN program. ESTMAP extracts "exact" matches with EST sequences (> 95% of homology) from BLASTN output file and predicts introns in DNA comparing ESTs and a query sequence. ESTMAP is implemented as a part of the WebGene system (http://www.cnr.it/webgene).
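The idea described in this abstract, keeping only near-exact EST matches and reading introns off the gaps between consecutive aligned blocks, can be sketched as follows. This is a simplified illustration under stated assumptions, not the ESTMAP implementation: the `Hit` record, the function `predict_introns`, and the thresholds `min_intron` and `max_est_gap` are hypothetical, only forward-strand alignments are handled, and the hits are assumed to have been parsed already from BLASTN output.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One aligned block between an EST and the genomic query (hypothetical record)."""
    est_id: str
    pct_identity: float
    est_start: int  # 1-based, inclusive, on the EST
    est_end: int
    gen_start: int  # 1-based, inclusive, on the genomic sequence
    gen_end: int


def predict_introns(
    hits: List[Hit],
    min_identity: float = 95.0,  # the abstract's ">95%" cutoff
    min_intron: int = 20,        # assumed minimum intron length
    max_est_gap: int = 5,        # assumed tolerance for EST gap or overlap
) -> List[Tuple[str, int, int]]:
    """Return (est_id, intron_start, intron_end) in genomic coordinates."""
    # Keep only near-exact blocks, grouped by EST.
    by_est = {}
    for h in hits:
        if h.pct_identity > min_identity:
            by_est.setdefault(h.est_id, []).append(h)

    introns = []
    for est_id, blocks in by_est.items():
        blocks.sort(key=lambda h: h.gen_start)
        for left, right in zip(blocks, blocks[1:]):
            est_gap = right.est_start - left.est_end - 1
            gen_gap = right.gen_start - left.gen_end - 1
            # Blocks adjacent on the EST but far apart on the genome imply an intron.
            if abs(est_gap) <= max_est_gap and gen_gap >= min_intron:
                introns.append((est_id, left.gen_end + 1, right.gen_start - 1))
    return introns


example_hits = [
    Hit("EST1", 99.0, 1, 100, 1000, 1099),
    Hit("EST1", 98.0, 101, 180, 1600, 1679),
    Hit("EST2", 90.0, 1, 50, 1000, 1049),  # dropped: identity not above 95%
]
print(predict_introns(example_hits))  # [('EST1', 1100, 1599)]
```

A fuller treatment would also mask repetitive elements before the search, as the abstract describes, and check splice-site consensus at the predicted intron boundaries.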
|
23
|
Cvrčková F, Novotný M, Pícková D, Žárský V. Formin homology 2 domains occur in multiple contexts in angiosperms. BMC Genomics 2004; 5:44. [PMID: 15256004 PMCID: PMC509240 DOI: 10.1186/1471-2164-5-44] [Citation(s) in RCA: 78] [Impact Index Per Article: 3.9] [Received: 04/22/2004] [Accepted: 07/15/2004] [Indexed: 11/10/2022]
Abstract
Background: Involvement of conserved molecular modules and cellular mechanisms in the widely diversified processes of eukaryotic cell morphogenesis leads to the intriguing question: how do similar proteins contribute to dissimilar morphogenetic outputs? Formins (FH2 proteins) play a central part in the control of actin organization and dynamics, providing a good example of evolutionarily versatile use of a conserved protein domain in the context of a variety of lineage-specific structural and signalling interactions. Results: In order to identify possible plant-specific sequence features within the FH2 protein family, we performed a detailed analysis of angiosperm formin-related sequences available in public databases, with particular focus on the complete Arabidopsis genome and the nearly finished rice genome sequence. This has led to revision of the current annotation of half of the 22 Arabidopsis formin-related genes. Comparative analysis of the two plant genomes revealed a good conservation of the previously described two subfamilies of plant formins (Class I and Class II), as well as several subfamilies within them that appear to predate the separation of monocot and dicot plants. Moreover, a number of plant Class II formins share an additional conserved domain, related to the protein phosphatase/tensin/auxilin fold. However, considerable inter-species variability sets limits to generalization of any functional conclusions reached on a single species such as Arabidopsis. Conclusions: The plant-specific domain context of the conserved FH2 domain, as well as plant-specific features of the domain itself, may reflect distinct functional requirements in plant cells. The variability of formin structures found in plants far exceeds that known from both fungi and metazoans, suggesting a possible contribution of FH2 proteins to the evolution of the plant type of multicellularity.
Affiliation(s)
- Fatima Cvrčková
- Department of Plant Physiology, Faculty of Sciences, Charles University, Viničná 5, CZ 128 44 Praha 2, Czech Republic
- Marian Novotný
- Department of Cell and Molecular Biology, Uppsala University, Biomedical Centre, Husargatan 3, Box 570, S 751 23 Uppsala, Sweden
- Denisa Pícková
- Department of Plant Physiology, Faculty of Sciences, Charles University, Viničná 5, CZ 128 44 Praha 2, Czech Republic
- Institute of Experimental Botany, Faculty of Sciences of the Czech Republic, Rozvojová 135, CZ 165 02 Praha 6, Czech Republic
- Viktor Žárský
- Department of Plant Physiology, Faculty of Sciences, Charles University, Viničná 5, CZ 128 44 Praha 2, Czech Republic
- Institute of Experimental Botany, Faculty of Sciences of the Czech Republic, Rozvojová 135, CZ 165 02 Praha 6, Czech Republic
|
24
|
Li J, Bench AJ, Vassiliou GS, Fourouclas N, Ferguson-Smith AC, Green AR. Imprinting of the human L3MBTL gene, a polycomb family member located in a region of chromosome 20 deleted in human myeloid malignancies. Proc Natl Acad Sci U S A 2004; 101:7341-6. [PMID: 15123827 PMCID: PMC409920 DOI: 10.1073/pnas.0308195101] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.1] [Indexed: 11/18/2022]
Abstract
L3MBTL encodes a member of the Polycomb family of proteins, which, together with Trithorax group proteins, is responsible for the coordinated regulation of patterns of gene activity. Members of the Polycomb family also regulate self renewal of normal and malignant hematopoietic stem cells. L3MBTL lies in a region of chromosome 20, deletion of which is associated with myeloid malignancies and represents a good candidate for a 20q target gene. However, mutations of L3MBTL have not been identified in patients with 20q deletions or in cytogenetically normal patients. Here we demonstrate that monoallelic methylation of two CpG islands correlates with transcriptional silencing of L3MBTL, and that L3MBTL transcription occurs from the paternally derived allele in five individuals from two families. Expression of the paternally derived allele was observed in multiple hematopoietic cell types as well as in bone marrow derived mesenchymal cells. Deletions of 20q associated with myeloid malignancies resulted in loss of either the unmethylated or methylated allele. Our results demonstrate that L3MBTL represents a previously undescribed imprinted locus, a vertebrate Polycomb group gene shown to be regulated by this mechanism, and has implications for the pathogenesis of myeloid malignancies associated with 20q deletions.
Affiliation(s)
- Juan Li
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Hills Road, Cambridge CB2 2XY, United Kingdom
|
25
|
Malinowski R, Filipecki M, Tagashira N, Wiśniewska A, Gaj P, Plader W, Malepszy S. Xyloglucan endotransglucosylase/hydrolase genes in cucumber (Cucumis sativus) - differential expression during somatic embryogenesis. Physiol Plant 2004; 120:678-685. [PMID: 15032830 DOI: 10.1111/j.0031-9317.2004.0289.x] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 05/08/2023]
Abstract
Defined changes in the cell wall directed by many proteins accompany every morphogenetic process in plants. Xyloglucan endotransglucosylase/hydrolase proteins (XTH; EC 2.4.1.207) have the potential to modify the hemicellulose matrix within the cell wall. Cs-XTH1 and Cs-XTH3 genes, which encode XTH proteins, were found among numerous genes that are differentially expressed after the induction of cucumber somatic embryogenesis. The expression of these genes increased during somatic embryogenesis. The Cs-XTH1 gene was localized on the second chromosome near the centromere region, whereas Cs-XTH3 was found in the middle of the fifth chromosome's longer arm. Northern blot hybridization showed that both genes were preferentially expressed in roots. We also observed higher accumulation of both transcripts in somatic embryos than in the proembryogenic mass. The localization of mRNA by in situ hybridization revealed that the Cs-XTH1 transcripts were largely accumulated in the presumptive cotyledon primordia of somatic embryos. The XTH gene family consists of a number of genes with a high degree of structural similarity. Screening a cucumber genomic library has identified other members of this gene family. The intron/exon structure, sequence similarities and the close chromosomal distance between some members suggest their common evolutionary origin. The involvement of XTH-related genes in somatic embryo formation is discussed.
Affiliation(s)
- Robert Malinowski
- Department of Plant Genetics Breeding and Biotechnology, Warsaw Agricultural University, Nowoursynowska 166, 02-787 Warsaw, Poland
|
26
|
Frébort I, Sebela M, Hirota S, Yamada M, Tamaki H, Kumagai H, Adachi O, Pec P. Gene organization and molecular modeling of copper amine oxidase from Aspergillus niger: re-evaluation of the cofactor structure. Biol Chem 2003; 384:1451-61. [PMID: 14669988 DOI: 10.1515/bc.2003.161] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract
Amine oxidase AO-I from Aspergillus niger AKU 3302 has been reported to contain topa quinone (TPQ) as a cofactor; however, analysis of the p-nitrophenylhydrazine-derivatized enzyme and purified active site peptides showed the presence of a carboxylate ester linkage of TPQ to a glutamate. The catalytic functionality of such a cross-linked cofactor has recently been shown to be unlikely by spectroscopic and voltammetric studies on synthesized model compounds. We have obtained resonance Raman spectra of native and substrate-reduced AO-I demonstrating that the catalytically active cofactor is unmodified TPQ. The primary structure of the enzyme (GenBank acc. no. U31869) has been reviewed and updated by repeated isolation and sequencing of AO-I cDNA. This allowed rectification of several errors that account for the previously reported low homology to other amine oxidases in the regions around the copper-binding histidyl residues. The results were confirmed by cloning the ao-1 structural gene (GenBank acc. no. AF362473). Analysis of the 5'-upstream region of the gene revealed potential binding sites for an analog of NIT2, the nitrogen metabolism regulatory protein found in Neurospora crassa and other fungi. The molecular structure of AO-I was modeled by a comparative method using published crystal structures of amine oxidases as templates.
Affiliation(s)
- Ivo Frébort
- Department of Biochemistry, Faculty of Science, Palacký University, Slechtitelů 11, CZ-783 71 Olomouc, Czech Republic.
|
27
|
Schmitt I, Evert BO, Khazneh H, Klockgether T, Wuellner U. The human MJD gene: genomic structure and functional characterization of the promoter region. Gene 2003; 314:81-8. [PMID: 14527720 DOI: 10.1016/s0378-1119(03)00706-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Indexed: 11/25/2022]
Abstract
Machado-Joseph disease (MJD) is a progressive neurodegenerative disorder caused by expansion of a CAG motif within the translated region of the human MJD (hMJD) gene which has been mapped to chromosome 14q. In this study, the hMJD gene was identified in two overlapping bacterial artificial chromosome (BAC) clones and contained 11 exons resulting in a 6.14 kb transcript. The 5'-flanking region of the hMJD gene included a TATA-less promoter with GC-rich regions, a CCAAT box and multiple potential SP1 binding sites. Luciferase reporter assays performed in neuronal and non-neuronal human cell lines demonstrated a core promoter within the 200 bp region immediately upstream of the putative transcriptional start site (-89 according to the start codon). DNA-protein interactions defined by electrophoretic mobility shift assays (EMSA) revealed specific binding of nuclear proteins to the putative core promoter region.
Affiliation(s)
- Ina Schmitt
- Department of Neurology, Neurobiology, University of Bonn, Sigmund-Freud-Str. 25, 53105, Bonn, Germany.
|
28
|
Overman RG, Enderle PJ, Farrow JM, Wiley JE, Farwell MA. The human mitochondrial translation initiation factor 2 gene (MTIF2): transcriptional analysis and identification of a pseudogene. Biochim Biophys Acta 2003; 1628:195-205. [PMID: 12932832 DOI: 10.1016/s0167-4781(03)00144-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/24/2022]
Abstract
Mitochondrial translation initiation factor 2 (MTIF2) is nuclear-encoded and functions in mitochondria to initiate the translation of proteins encoded by the mitochondrial genome. To gain insight into mechanisms that regulate MTIF2 gene expression, the genomic copy and the 5' and 3' flanking regions of MTIF2 were isolated using a combination of genomic library screening and polymerase chain reaction (PCR). MTIF2 is approximately 33.5-kb long and contains 16 exons, confirming data from the Human Genome Project. With RNA ligase-mediated rapid amplification of cDNA ends (RLM-RACE), we mapped the transcription start point in human heart tissue to a cytosine residue 296 bp upstream from the translation initiation site. The region surrounding the transcription start point contains consensus binding sites for transcription factors Sp1, nuclear respiratory factor 2 (NRF-2) and estrogen receptor, while enhancer binding sites were identified upstream. Promoter constructs were prepared in a luciferase reporter vector and transiently transfected into 293T cells. The minimal promoter gave an expression level 3.5x higher than the SV40 control (P=0.001), while the construct containing the minimal promoter plus the enhancer region gave a 3.8x higher level of expression compared to the control (P<0.001). We also discovered a pseudogene of MTIF2 and mapped it to chromosome 1p13-12.
Affiliation(s)
- R Glenn Overman
- Department of Biology, East Carolina University, Greenville, NC 27858, USA
|
29
|
Culi J, Mann RS. Boca, an endoplasmic reticulum protein required for wingless signaling and trafficking of LDL receptor family members in Drosophila. Cell 2003; 112:343-54. [PMID: 12581524 DOI: 10.1016/s0092-8674(02)01279-5] [Citation(s) in RCA: 115] [Impact Index Per Article: 5.5] [Indexed: 11/29/2022]
Abstract
The maturation of cell surface receptors through the secretory pathway often requires chaperones that aid in protein folding and trafficking from one organelle to another. Here we describe boca, an evolutionarily conserved gene in Drosophila melanogaster, which encodes an endoplasmic reticulum protein that is specifically required for the intracellular trafficking of members of the low-density lipoprotein family of receptors (LDLRs). Two LDLRs in flies, Arrow, which is required for Wingless signal transduction, and Yolkless, which is required for yolk protein uptake during oogenesis, both require boca function. Consequently, boca is an essential component of the Wingless pathway but is more generally required for the activities of multiple LDL receptor family members.
MESH Headings
- Animals
- Cell Compartmentation/genetics
- Cell Membrane/genetics
- Cell Membrane/metabolism
- Cells, Cultured
- Congenital Abnormalities/genetics
- Congenital Abnormalities/metabolism
- DNA, Complementary/analysis
- DNA, Complementary/genetics
- Drosophila Proteins/genetics
- Drosophila Proteins/isolation & purification
- Drosophila Proteins/metabolism
- Drosophila melanogaster/genetics
- Drosophila melanogaster/growth & development
- Drosophila melanogaster/metabolism
- Egg Proteins/genetics
- Egg Proteins/metabolism
- Endoplasmic Reticulum/genetics
- Endoplasmic Reticulum/metabolism
- Female
- Genes, Lethal
- Homeodomain Proteins/genetics
- Homeodomain Proteins/metabolism
- LIM-Homeodomain Proteins
- Male
- Molecular Chaperones/genetics
- Molecular Chaperones/isolation & purification
- Molecular Chaperones/metabolism
- Molecular Sequence Data
- Phenotype
- Protein Folding
- Protein Transport/genetics
- Proto-Oncogene Proteins/genetics
- Proto-Oncogene Proteins/metabolism
- Receptors, Cell Surface/genetics
- Receptors, Cell Surface/metabolism
- Receptors, LDL/genetics
- Receptors, LDL/metabolism
- Sequence Homology, Amino Acid
- Sequence Homology, Nucleic Acid
- Signal Transduction/genetics
- Transcription Factors/genetics
- Transcription Factors/metabolism
- Wnt1 Protein
Affiliation(s)
- Joaquim Culi
- Department of Biochemistry and Molecular Biophysics, Center for Neurobiology and Behavior, Columbia University, 701 West 168th Street, HHSC 1104, New York, NY 10032, USA
|
30
|
Chuang TJ, Lin WC, Lee HC, Wang CW, Hsiao KL, Wang ZH, Shieh D, Lin SC, Ch'ang LY. A complexity reduction algorithm for analysis and annotation of large genomic sequences. Genome Res 2003; 13:313-22. [PMID: 12566410 PMCID: PMC420370 DOI: 10.1101/gr.313703] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/24/2022]
Abstract
DNA is a universal language encrypted with biological instruction for life. In higher organisms, the genetic information is preserved predominantly in an organized exon/intron structure. When a gene is expressed, the exons are spliced together to form the transcript for protein synthesis. We have developed a complexity reduction algorithm for sequence analysis (CRASA) that enables direct alignment of cDNA sequences to the genome. This method features a progressive data structure in hierarchical orders to facilitate a fast and efficient search mechanism. CRASA implementation was tested with already annotated genomic sequences in two benchmark data sets and compared with 15 annotation programs (10 ab initio and 5 homology-based approaches) against the EST database. By the use of layered noise filters, the complexity of CRASA-matched data was reduced exponentially. The results from the benchmark tests showed that CRASA annotation excelled in both the sensitivity and specificity categories. When CRASA was applied to the analysis of human Chromosomes 21 and 22, an additional 83 potential genes were identified. With its large-scale processing capability, CRASA can be used as a robust tool for genome annotation with high accuracy by matching the EST sequences precisely to the genomic sequences.
MESH Headings
- Algorithms
- Chromosomes, Human, Pair 21/genetics
- Chromosomes, Human, Pair 22/genetics
- DNA/analysis
- DNA/genetics
- DNA, Complementary/analysis
- DNA, Complementary/genetics
- Exons/genetics
- Expressed Sequence Tags
- Genes/genetics
- Genome, Human
- Humans
- Pseudogenes/genetics
- Reproducibility of Results
- Sensitivity and Specificity
- Sequence Alignment/methods
- Sequence Analysis, DNA/methods
- Sequence Analysis, DNA/trends
- Sequence Homology, Nucleic Acid
Affiliation(s)
- Trees-Juen Chuang
- Bioinformatics Research Center, Institute of Biomedical Sciences, Academia Sinica, Taipei 11529, Taiwan
|
31
|
Wong ML, Islas-Trejo A, Medrano JF. Structural characterization of the mouse high growth deletion and discovery of a novel fusion transcript between suppressor of cytokine signaling-2 (Socs-2) and viral encoded semaphorin receptor (Plexin C1). Gene 2002; 299:153-63. [PMID: 12459263 DOI: 10.1016/s0378-1119(02)01052-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.9] [Indexed: 10/27/2022]
Abstract
The high growth (HG) mouse mutation is a 460 kb deletion of chromosome 10 which causes a 30-50% increase in growth in the homozygous animal. We have shotgun sequenced six bacterial artificial chromosomes which span the length of the deletion to an average depth of 13.2x to generate a 649,868 bp sequence. Sequence analysis revealed the presence of three genes: suppressor of cytokine signaling-2 (Socs-2), caspase and RIP adaptor with death domain (Raidd/Cradd), and viral encoded semaphorin receptor (Plexin C1). The two deletion breakpoints lie within the second introns of both Socs-2 and Plexin C1, resulting in the formation of a novel expressed fusion transcript between Socs-2 and Plexin C1 in HG mice. Expression of the fusion transcript, the presence of four splice variants of Raidd/Cradd and the exon structure of Socs-2 were illustrated using polymerase chain reaction. Genomic comparisons of the mouse and human sequence were used to verify the sequence assembly.
Affiliation(s)
- Marisa L Wong
- Department of Animal Science, University of California, One Shields Avenue, Davis, CA 95616-8521, USA
|
32
|
Kang HG, Evers MR, Xia G, Baenziger JU, Schachner M. Molecular cloning and characterization of chondroitin-4-O-sulfotransferase-3. A novel member of the HNK-1 family of sulfotransferases. J Biol Chem 2002; 277:34766-72. [PMID: 12080076 DOI: 10.1074/jbc.m204907200] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.0] [Indexed: 11/06/2022]
Abstract
We have identified and characterized an N-acetylgalactosamine-4-O-sulfotransferase designated chondroitin-4-sulfotransferase-3 (C4ST-3) (GenBank accession number AY120869) based on its homology to HNK-1 sulfotransferase (HNK-1 ST). The cDNA predicts an open reading frame encoding a type II membrane protein of 341 amino acids with a 12-amino acid cytoplasmic domain and a 311-amino acid luminal domain containing a single potential N-linked glycosylation site. C4ST-3 has the greatest amino acid sequence identity when aligned with chondroitin-4-O-sulfotransferase 1 (C4ST-1) (45%) but also shows significant amino acid identity with chondroitin-4-O-sulfotransferase 2 (C4ST-2) (27%), dermatan-4-O-sulfotransferase 1 (29%), HNK-1 ST (26%), N-acetylgalactosamine-4-O-sulfotransferase 1 (26%), and N-acetylgalactosamine-4-O-sulfotransferase 2 (23%). C4ST-3 transfers sulfate to the C-4 hydroxyl of beta1,4-linked GalNAc that is substituted with a beta-linked glucuronic acid at the C-3 hydroxyl. The open reading frame of C4ST-3 is encoded by three exons located on human chromosome 3q21.3. Northern blot analysis reveals a single 2.1-kilobase transcript. C4ST-3 message is expressed in adult liver and at lower levels in adult kidney, lymph nodes, and fetal liver. Although C4ST-3 and C4ST-1 have similar specificities, the highly restricted pattern of expression seen for C4ST-3 suggests that it has a different role than C4ST-1.
Affiliation(s)
- Hyung-Gyoo Kang
- Department of Pathology, Washington University School of Medicine, St. Louis, Missouri 63110 and Zentrum fuer Molekulare Neurobiologie, Universitaet Hamburg, Martinistrasse 52, D-20246 Hamburg, Germany
|
33
|
Petersenn S, Rasch AC, Böhnke C, Schulte HM. Identification of an upstream pituitary-active promoter of human somatostatin receptor subtype 5. Endocrinology 2002; 143:2626-34. [PMID: 12072395 DOI: 10.1210/endo.143.7.8883] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/19/2022]
Abstract
Somatostatin receptor subtype 5 (sst5) has been linked to inhibition of PRL and insulin secretion. We characterized the genomic structure of the human sst5. The transcription start site was located 94 nucleotides upstream of the initiator ATG codon. Sequence analysis of 5'-inverse PCR products revealed the presence of a 6.1-kb intron in the 5'-untranslated region. RT-PCR analysis indicated tissue-specific activation of the newly identified upstream promoter in pituitary, but not in small intestine, lung, or placenta. A -1741 promoter directed significant levels of luciferase expression in GH4 rat pituitary cells, Skut-1B endometrium cells, and JEG3 chorion carcinoma cells, which was absent in COS-7 monkey kidney cells. A minimal -101 promoter was sufficient to allow tissue-specific expression. Its activity in COS-7 cells was not enhanced by cotransfection of the pituitary-specific transcription factor Pit-1. Analysis of deletion constructs revealed a GC-rich region immediately upstream of the transcription start site, which is necessary for promoter activity. Somatostatin led to a significant inhibition, and forskolin and thyroid hormone to a significant stimulation of pituitary-specific promoter activity. Further mapping suggested a cAMP-responsive element located between -101 and the transcription start site, and thyroid hormone-responsive elements between -1741 and -1269 and between -317 and -101. These studies identified an upstream promoter of the sst5 gene with tissue-specific activity.
Affiliation(s)
- S Petersenn
- IHF Institute for Hormone and Fertility Research, University of Hamburg, Germany.
|
34
|
Migeon BR, Chowdhury AK, Dunston JA, McIntosh I. Identification of TSIX, encoding an RNA antisense to human XIST, reveals differences from its murine counterpart: implications for X inactivation. Am J Hum Genet 2001; 69:951-60. [PMID: 11555794 PMCID: PMC1274371 DOI: 10.1086/324022] [Citation(s) in RCA: 101] [Impact Index Per Article: 4.4] [Received: 08/08/2001] [Accepted: 08/27/2001] [Indexed: 11/03/2022] Open
Abstract
X inactivation is the mammalian method for X-chromosome dosage compensation, but some features of this developmental process vary among mammals. Such species variations provide insights into the essential components of the pathway. Tsix encodes a transcript antisense to the murine Xist transcript and is expressed in the mouse embryo only during the initial stages of X inactivation; it has been shown to play a role in imprinted X inactivation in the mouse placenta. We have identified its counterpart within the human X inactivation center (XIC). Human TSIX produces a >30-kb transcript that is expressed only in cells of fetal origin; it is expressed from human XIC transgenes in mouse embryonic stem cells and from human embryoid-body-derived cells, but not from human adult somatic cells. Differences in the structure of human and murine genes indicate that human TSIX was truncated during evolution. These differences could explain the fact that X inactivation is not imprinted in human placenta, and they raise questions about the role of TSIX in random X inactivation.
MESH Headings
- Aging/genetics
- Animals
- Cell Line
- Dosage Compensation, Genetic
- Embryo, Mammalian/metabolism
- Evolution, Molecular
- Fetus/metabolism
- Genomic Imprinting/genetics
- Humans
- Mice
- Molecular Sequence Data
- Open Reading Frames/genetics
- Placenta/metabolism
- RNA, Antisense/analysis
- RNA, Antisense/biosynthesis
- RNA, Antisense/genetics
- RNA, Antisense/isolation & purification
- RNA, Long Noncoding
- RNA, Untranslated/analysis
- RNA, Untranslated/biosynthesis
- RNA, Untranslated/genetics
- RNA, Untranslated/isolation & purification
- Sequence Deletion/genetics
- Sequence Homology, Nucleic Acid
- Species Specificity
- Stem Cells/metabolism
- Transcription Factors/genetics
- Transcription Initiation Site
- Transcription, Genetic
- Transgenes/genetics
Affiliation(s)
- B R Migeon
- McKusick-Nathans Institute of Genetic Medicine and Department of Pediatrics, The Johns Hopkins University School of Medicine, Baltimore, MD, 21287, USA.
|
35
|
Evers MR, Xia G, Kang HG, Schachner M, Baenziger JU. Molecular cloning and characterization of a dermatan-specific N-acetylgalactosamine 4-O-sulfotransferase. J Biol Chem 2001; 276:36344-53. [PMID: 11470797 DOI: 10.1074/jbc.m105848200] [Citation(s) in RCA: 85] [Impact Index Per Article: 3.7] [Indexed: 11/06/2022] Open
Abstract
We have identified and characterized an N-acetylgalactosamine-4-O-sulfotransferase designated dermatan-4-sulfotransferase-1 (D4ST-1) (GenBank™ accession number AF401222) based on its homology to HNK-1 sulfotransferase. The cDNA predicts an open reading frame encoding a type II membrane protein of 376 amino acids with a 43-amino acid cytoplasmic domain and a 316-amino acid luminal domain containing two potential N-linked glycosylation sites. D4ST-1 has significant amino acid identity with HNK-1 sulfotransferase (21.4%), N-acetylgalactosamine-4-O-sulfotransferase 1 (GalNAc-4-ST1) (24.7%), N-acetylgalactosamine-4-O-sulfotransferase 2 (GalNAc-4-ST2) (21.0%), chondroitin-4-O-sulfotransferase 1 (27.3%), and chondroitin-4-O-sulfotransferase 2 (22.8%). D4ST-1 transfers sulfate to the C-4 hydroxyl of β1,4-linked GalNAc that is substituted with an α-linked iduronic acid (IdoUA) at the C-3 hydroxyl. D4ST-1 shows a strong preference in vitro for sulfate transfer to IdoUAα1,3GalNAcβ1,4 that is flanked by GlcUAβ1,3GalNAcβ1,4 as compared with IdoUAα1,3GalNAcβ1,4 flanked by IdoUAα1,3GalNAcβ1,4. The specificity of D4ST-1 when assayed in vitro suggests that the addition of sulfate to GalNAc occurs immediately after epimerization of GlcUA to IdoUA. The open reading frame of D4ST-1 is encoded by a single exon located on human chromosome 15q14. Northern blot analysis reveals a single 2.4-kilobase transcript. D4ST-1 message is expressed in virtually all tissues at some level but is most highly expressed in pituitary, placenta, uterus, and thyroid. The properties of D4ST-1 indicate that sulfation of the GalNAc moieties in dermatan is mediated by a distinct GalNAc-4-O-sulfotransferase and occurs following epimerization of GlcUA to IdoUA.
MESH Headings
- Amino Acid Sequence
- Animals
- Base Sequence
- Blotting, Northern
- CHO Cells
- Carbohydrate Sequence
- Chromatography, Gel
- Chromosomes, Human, Pair 15
- Cloning, Molecular
- Cricetinae
- DNA, Complementary/metabolism
- Dermatan Sulfate/chemistry
- Dose-Response Relationship, Drug
- Exons
- Humans
- Models, Chemical
- Models, Genetic
- Molecular Sequence Data
- Oligonucleotide Array Sequence Analysis
- Open Reading Frames
- Protein Binding
- Protein Structure, Tertiary
- RNA, Messenger/metabolism
- Sequence Homology, Amino Acid
- Sulfotransferases/biosynthesis
- Sulfotransferases/chemistry
- Sulfotransferases/genetics
- Time Factors
- Tissue Distribution
- Transfection
Affiliation(s)
- M R Evers
- Department of Pathology, Washington University School of Medicine, St. Louis, Missouri 63110, USA
|
36
|
Abstract
The Genome Annotation Assessment Project tested current methods of gene identification, including a critical assessment of the accuracy of different methods. Two new databases have provided new resources for gene annotation: these are the InterPro database of protein domains and motifs, and the Gene Ontology database for terms that describe the molecular functions and biological roles of gene products. Efforts in genome annotation are most often based upon advances in computer systems that are specifically designed to deal with the tremendous amounts of data being generated by current sequencing projects. These efforts in analysis are being linked to new ways of visualizing computationally annotated genomes.
Affiliation(s)
- S Lewis
- Department of Molecular and Cell Biology, Berkeley Drosophila Genome Project, University of California, Berkeley, CA 94720-3200, USA.
|