1
Srinivasan E, Rajasekaran R. A Systematic and Comprehensive Review on Disease-Causing Genes in Amyotrophic Lateral Sclerosis. J Mol Neurosci 2020; 70:1742-1770. [PMID: 32415434 DOI: 10.1007/s12031-020-01569-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 02/27/2020] [Accepted: 04/22/2020] [Indexed: 12/13/2022]
Abstract
Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disorder characterized by degeneration and axon loss of the upper motor neurons, which descend from the brain, and of the lower motor neurons in the brainstem and spinal cord. Over time, findings from clinical, molecular pathogenesis, and structural and biophysical studies have provided insight into the importance of disease-causing genes in ALS. Consequently, numerous mechanisms have been proposed for the pathogenesis of ALS, involving protein mutation, aggregation, and misfolding. Nevertheless, the causes of the majority of ALS cases, which are sporadic, remain obscure. The discovery of genetic susceptibility factors in ALS remains to be explored further in the coming years as research methods advance. Hence, this review revisits the breakthroughs on the disease-causing genes associated with ALS.
Affiliation(s)
- E Srinivasan
- Bioinformatics Lab, Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology (deemed to be university), Vellore, Tamil Nadu, 632014, India
- R Rajasekaran
- Bioinformatics Lab, Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology (deemed to be university), Vellore, Tamil Nadu, 632014, India
2
Rappoport N, Stern A, Linial N, Linial M. Entropy-driven partitioning of the hierarchical protein space. Bioinformatics 2015; 30:i624-30. [PMID: 25161256 PMCID: PMC4147929 DOI: 10.1093/bioinformatics/btu478] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 02/03/2023]
Abstract
Motivation: Modern protein sequencing techniques have led to the determination of >50 million protein sequences. ProtoNet is a clustering system that provides a continuous hierarchical agglomerative clustering tree for all proteins. While ProtoNet performs unsupervised classification of all included proteins, finding an optimal level of granularity for the purpose of focusing on protein functional groups remains elusive. Here, we ask whether knowledge-based annotations on protein families can support the automatic unsupervised methods for identifying high-quality protein families. We present a method that yields within the ProtoNet hierarchy an optimal partition of clusters, relative to manual annotation schemes. The method’s principle is to minimize the entropy-derived distance between annotation-based partitions and all available hierarchical partitions. We describe the best front (BF) partition of 2 478 328 proteins from UniRef50. Of 4 929 553 ProtoNet tree clusters, the BF based on Pfam annotations contains 26 891 clusters. The high quality of the partition is validated by the close correspondence with the set of clusters that best describe thousands of keywords of Pfam. The BF is shown to be superior to a naïve cut in the ProtoNet tree that yields a similar number of clusters. Finally, we used parameters intrinsic to the clustering process to enrich a priori the BF’s clusters. We present the entropy-based method’s benefit in overcoming the unavoidable limitations of nested clusters in ProtoNet. We suggest that this automatic information-based cluster selection can be useful for other large-scale annotation schemes, as well as for systematically testing and comparing putative families derived from alternative clustering methods. Availability and implementation: A catalog of BF clusters for thousands of Pfam keywords is provided at http://protonet.cs.huji.ac.il/bestFront/ Contact: michall@cc.huji.ac.il
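The principle stated in this abstract, choosing the hierarchical partition closest to an annotation-based partition under an entropy-derived distance, can be illustrated with a minimal sketch. It assumes the variation of information as the distance, which is one standard entropy-based measure between partitions and not necessarily the measure the authors use. The function names and the toy labels are hypothetical and do not come from ProtoNet or Pfam.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of the cluster-size distribution of a labeling."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def variation_of_information(a, b):
    """Assumed distance: VI(a, b) = 2*H(a, b) - H(a) - H(b).

    `a` and `b` assign a cluster label to the same items, in the same order.
    The value is 0 when both labelings induce the same partition.
    """
    if len(a) != len(b):
        raise ValueError("both labelings must cover the same items")
    joint = list(zip(a, b))  # each item labeled by its pair of clusters
    return 2 * entropy(joint) - entropy(a) - entropy(b)


def best_partition(annotation, candidates):
    """Return the name of the candidate partition closest to the annotation."""
    return min(
        candidates,
        key=lambda name: variation_of_information(annotation, candidates[name]),
    )


# Toy data (hypothetical): four proteins with a two-family annotation,
# and three cuts of an imagined clustering tree.
annotation = ["kinase", "kinase", "protease", "protease"]
candidates = {
    "coarse": [0, 0, 0, 0],    # everything in one cluster
    "matching": [0, 0, 1, 1],  # agrees with the annotation
    "fine": [0, 1, 2, 3],      # every protein in its own cluster
}
print(best_partition(annotation, candidates))
```

In this toy case both the overly coarse and the overly fine cuts lie at a positive distance from the annotation, so the cut that reproduces the two annotated families is selected.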
Affiliation(s)
- Nadav Rappoport
- School of Computer Science and Engineering and Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, 91904, Israel
- Amos Stern
- School of Computer Science and Engineering and Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, 91904, Israel
- Nathan Linial
- School of Computer Science and Engineering and Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, 91904, Israel
- Michal Linial
- School of Computer Science and Engineering and Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University of Jerusalem, 91904, Israel
3
Vodkin LO, Khanna A, Shealy R, Clough SJ, Gonzalez DO, Philip R, Zabala G, Thibaud-Nissen F, Sidarous M, Strömvik MV, Shoop E, Schmidt C, Retzel E, Erpelding J, Shoemaker RC, Rodriguez-Huete AM, Polacco JC, Coryell V, Keim P, Gong G, Liu L, Pardinas J, Schweitzer P. Microarrays for global expression constructed with a low redundancy set of 27,500 sequenced cDNAs representing an array of developmental stages and physiological conditions of the soybean plant. BMC Genomics 2004; 5:73. [PMID: 15453914 PMCID: PMC526184 DOI: 10.1186/1471-2164-5-73] [Citation(s) in RCA: 80] [Impact Index Per Article: 3.8] [Received: 04/12/2004] [Accepted: 09/29/2004] [Indexed: 11/18/2022]
Abstract
BACKGROUND Microarrays are an important tool with which to examine coordinated gene expression. Soybean (Glycine max) is one of the most economically valuable crop species in the world food supply. In order to accelerate both gene discovery and hypothesis-driven research in soybean, global expression resources needed to be developed. The applications of microarrays for determining patterns of expression in different tissues or during conditional treatments by dual labeling of the mRNAs are unlimited. In addition, discovery of the molecular basis of traits through examination of naturally occurring variation in hundreds of mutant lines could be enhanced by the construction and use of soybean cDNA microarrays. RESULTS We report the construction and analysis of a low-redundancy 'unigene' set of 27,513 clones that represent a variety of soybean cDNA libraries made from a wide array of source tissue and organ systems, developmental stages, and stress or pathogen-challenged plants. The set was assembled from the 5' sequence data of the cDNA clones using cluster analysis programs. The selected clones were then physically reracked and sequenced at the 3' end. In order to increase gene discovery from immature cotyledon libraries that contain abundant mRNAs representing storage protein gene families, we utilized a high-density filter normalization approach to preferentially select more weakly expressed cDNAs. All 27,513 cDNA inserts were amplified by polymerase chain reaction. The amplified products, along with some repetitively spotted control or 'choice' clones, were used to produce three 9,728-element microarrays that have been used to examine tissue-specific gene expression and global expression in mutant isolines. CONCLUSIONS Global expression studies will be greatly aided by the availability of the sequence-validated and low-redundancy cDNA sets described in this report. These cDNAs and ESTs represent a wide array of developmental stages and physiological conditions of the soybean plant. We also demonstrate that the quality of the data from the soybean cDNA microarrays is sufficiently reliable to examine isogenic lines that differ with respect to a mutant phenotype and thereby to define a small list of candidate genes potentially encoding or modulated by the mutant phenotype.
Affiliation(s)
- Lila O Vodkin
- Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Anupama Khanna
- Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Epicentre, 726 Post Road, Madison, WI, 53713, USA
- Robin Shealy
- Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Steven J Clough
- Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- USDA/ARS, National Soybean Research Laboratory, University of Illinois, Urbana, IL, 61801, USA
- Reena Philip
- Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Food and Drug Administration, Rockville, MD, 20850, USA
- Gracia Zabala
- Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Françoise Thibaud-Nissen
- Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- The Institute for Genome Research, 9212 Medical Center Drive, Rockville, MD, 20850, USA
- Mark Sidarous
- Department of Crop Sciences, University of Illinois, Urbana, IL, 61801, USA
- Martina V Strömvik
- Center for Computational Genomics and Bioinformatics, University of Minnesota, Minneapolis, MN, 55455, USA
- Department of Plant Science, McGill University, 2111 Lakeshore, St. Anne-de-Bellevue, QC, H9X3V9, Canada
- Elizabeth Shoop
- Center for Computational Genomics and Bioinformatics, University of Minnesota, Minneapolis, MN, 55455, USA
- Mathematics and Computer Science, Macalester College, St. Paul, MN, 55105, USA
- Christina Schmidt
- Center for Computational Genomics and Bioinformatics, University of Minnesota, Minneapolis, MN, 55455, USA
- Ernest Retzel
- Center for Computational Genomics and Bioinformatics, University of Minnesota, Minneapolis, MN, 55455, USA
- John Erpelding
- USDA/ARS, Department of Agronomy, Iowa State University, Ames, IA, 50011, USA
- Randy C Shoemaker
- USDA/ARS, Department of Agronomy, Iowa State University, Ames, IA, 50011, USA
- Alicia M Rodriguez-Huete
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA
- Department of Microbiology, School of Medicine, University of Nevada-Reno, Reno, NV, USA
- Joseph C Polacco
- Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA
- Virginia Coryell
- Department of Biology, Northern Arizona University, Flagstaff, AZ, 86011, USA
- Paul Keim
- Department of Biology, Northern Arizona University, Flagstaff, AZ, 86011, USA
- George Gong
- Keck Center for Comparative and Functional Genomics, University of Illinois, Urbana, IL, 61801, USA
- Lei Liu
- Keck Center for Comparative and Functional Genomics, University of Illinois, Urbana, IL, 61801, USA
- Jose Pardinas
- Keck Center for Comparative and Functional Genomics, University of Illinois, Urbana, IL, 61801, USA
- Peter Schweitzer
- Keck Center for Comparative and Functional Genomics, University of Illinois, Urbana, IL, 61801, USA
- Biotechnology Resource Center, Cornell University, Ithaca, NY, 14853, USA
7
Lamblin AFJ, Crow JA, Johnson JE, Silverstein KAT, Kunau TM, Kilian A, Benz D, Stromvik M, Endré G, VandenBosch KA, Cook DR, Young ND, Retzel EF. MtDB: a database for personalized data mining of the model legume Medicago truncatula transcriptome. Nucleic Acids Res 2003; 31:196-201. [PMID: 12519981 PMCID: PMC165566 DOI: 10.1093/nar/gkg119] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022]
Abstract
In order to identify the genes and gene functions that underlie key aspects of legume biology, researchers have selected the cool season legume Medicago truncatula (Mt) as a model system for legume research. A set of >170 000 Mt ESTs has been assembled based on in-depth sampling from various developmental stages and pathogen-challenged tissues. MtDB is a relational database that integrates Mt transcriptome data and provides a wide range of user-defined data mining options. The database is interrogated through a series of interfaces with 58 options grouped into two filters. In addition, the user can select and compare unigene sets generated by different assemblers: Phrap, Cap3 and Cap4. Sequence identifiers from all public Mt sites (e.g. IDs from GenBank, CCGB, TIGR, NCGR, INRA) are fully cross-referenced to facilitate comparisons between different sites, and hypertext links to the appropriate database records are provided for all queries' results. MtDB's goal is to provide researchers with the means to quickly and independently identify sequences that match specific research interests based on user-defined criteria. The underlying database and query software have been designed for ease of updates and portability to other model organisms. Public access to the database is at http://www.medicago.org/MtDB.
Affiliation(s)
- Anne-Françoise J Lamblin
- Center for Computational Genomics and Bioinformatics, University of Minnesota, MMC43, 420 Delaware Street S.E., Minneapolis, MN 55455, USA
8
Wu CH, Xiao C, Hou Z, Huang H, Barker WC. iProClass: an integrated, comprehensive and annotated protein classification database. Nucleic Acids Res 2001; 29:52-4. [PMID: 11125047 PMCID: PMC29833 DOI: 10.1093/nar/29.1.52] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.1] [Received: 09/05/2000] [Revised: 10/27/2000] [Accepted: 10/27/2000] [Indexed: 11/13/2022]
Abstract
The iProClass database is an integrated resource that provides comprehensive family relationships and structural and functional features of proteins, with rich links to various databases. It is extended from ProClass, a protein family database that integrates PIR superfamilies and PROSITE motifs. iProClass currently consists of more than 200,000 non-redundant PIR and SWISS-PROT proteins organized into more than 28,000 superfamilies, 2600 domains, 1300 motifs, 280 post-translational modification sites and links to more than 30 databases of protein families, structures, functions, genes, genomes, literature and taxonomy. Protein and family summary reports provide rich annotations, including membership information with length, taxonomy and keyword statistics, full family relationships, comprehensive enzyme and PDB cross-references and graphical feature display. The database facilitates classification-driven annotation for protein sequence databases and complete genomes, and supports structural and functional genomic research. iProClass is implemented in the Oracle 8i object-relational system and is available for sequence search and report retrieval at http://pir.georgetown.edu/iproclass/.
Affiliation(s)
- C H Wu
- Protein Information Resource, National Biomedical Research Foundation, Georgetown University Medical Center, 3900 Reservoir Road, NW Washington, DC 20007-2195, USA.