1. Duan B, Qiu C, Lockless SW, Sze SH, Kaplan CD. Higher-order epistasis within Pol II trigger loop haplotypes. bioRxiv 2024:2024.01.20.576280. [PMID: 38293233 PMCID: PMC10827151 DOI: 10.1101/2024.01.20.576280] [Indexed: 02/01/2024]
Abstract
RNA polymerase II (Pol II) has a highly conserved domain, the trigger loop (TL), that controls transcription fidelity and speed. We previously probed pairwise genetic interactions between residues within and surrounding the TL and identified widespread incompatibility between TLs of different species when placed in the Saccharomyces cerevisiae Pol II context, indicating epistasis between the TL and its surrounding context. We sought to understand the nature of this incompatibility and probe higher order epistasis internal to the TL. We have employed deep mutational scanning with selected natural TL variants ("haplotypes"), and all possible intermediate substitution combinations between them and the yeast Pol II TL. We identified both positive and negative higher-order residue interactions within example TL haplotypes. Intricate higher-order epistasis formed by TL residues was sometimes only apparent from analysis of intermediate genotypes, emphasizing complexity of epistatic interactions. Furthermore, we distinguished TL substitutions with distinct classes of epistatic patterns, suggesting specific TL residues that potentially influence TL evolution. Our examples of complex residue interactions suggest possible pathways for epistasis to facilitate Pol II evolution.
Affiliation(s)
- Bingbing Duan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
- Chenxi Qiu
  - Department of Genetics, Harvard Medical School, Boston, MA 02215
- Steve W Lockless
  - Department of Biology, Texas A&M University, College Station, TX 77843
- Sing-Hoi Sze
  - Department of Computer Science & Engineering, Texas A&M University, College Station, TX 77843
  - Department of Biochemistry & Biophysics, Texas A&M University, College Station, TX 77843
- Craig D Kaplan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
2. Zhu Y, Vvedenskaya IO, Sze SH, Nickels BE, Kaplan CD. Quantitative analysis of transcription start site selection reveals control by DNA sequence, RNA polymerase II activity and NTP levels. Nat Struct Mol Biol 2024; 31:190-202. [PMID: 38177677 PMCID: PMC10928753 DOI: 10.1038/s41594-023-01171-9] [Received: 12/14/2021] [Accepted: 11/03/2023] [Indexed: 01/06/2024]
Abstract
Transcription start site (TSS) selection is a key step in gene expression and occurs at many promoter positions over a wide range of efficiencies. Here we develop a massively parallel reporter assay to quantitatively dissect contributions of promoter sequence, nucleoside triphosphate substrate levels and RNA polymerase II (Pol II) activity to TSS selection by 'promoter scanning' in Saccharomyces cerevisiae (Pol II MAssively Systematic Transcript End Readout, 'Pol II MASTER'). Using Pol II MASTER, we measure the efficiency of Pol II initiation at 1,000,000 individual TSS sequences in a defined promoter context. Pol II MASTER confirms proposed critical qualities of S. cerevisiae TSS -8, -1 and +1 positions, quantitatively, in a controlled promoter context. Pol II MASTER extends quantitative analysis to surrounding sequences and determines that they tune initiation over a wide range of efficiencies. These results enabled the development of a predictive model for initiation efficiency based on sequence. We show that genetic perturbation of Pol II catalytic activity alters initiation efficiency mostly independently of TSS sequence, but selectively modulates preference for the initiating nucleotide. Intriguingly, we find that Pol II initiation efficiency is directly sensitive to guanosine-5'-triphosphate levels at the first five transcript positions and to cytosine-5'-triphosphate and uridine-5'-triphosphate levels at the second position genome wide. These results suggest individual nucleoside triphosphate levels can have transcript-specific effects on initiation, representing a cryptic layer of potential regulation at the level of Pol II biochemical properties. The results establish Pol II MASTER as a method for quantitative dissection of transcription initiation in eukaryotes.
Affiliation(s)
- Yunye Zhu
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Irina O Vvedenskaya
  - Department of Genetics and Waksman Institute, Rutgers University, Piscataway, NJ, USA
- Sing-Hoi Sze
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, USA
  - Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA
- Bryce E Nickels
  - Department of Genetics and Waksman Institute, Rutgers University, Piscataway, NJ, USA
- Craig D Kaplan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA.
3. Duan B, Qiu C, Sze SH, Kaplan C. Widespread epistasis shapes RNA Polymerase II active site function and evolution. bioRxiv 2023:2023.02.27.530048. [PMID: 36909581 PMCID: PMC10002619 DOI: 10.1101/2023.02.27.530048] [Indexed: 03/04/2023]
Abstract
Multi-subunit RNA Polymerases (msRNAPs) are responsible for transcription in all kingdoms of life. At the heart of these msRNAPs is an ultra-conserved active site domain, the trigger loop (TL), coordinating transcription speed and fidelity by critical conformational changes impacting multiple steps in substrate selection, catalysis, and translocation. Previous studies have observed several different types of genetic interactions between eukaryotic RNA polymerase II (Pol II) TL residues, suggesting that the TL's function is shaped by functional interactions of residues within and around the TL. The extent of these interaction networks and how they control msRNAP function and evolution remain to be determined. Here we have dissected the Pol II TL interaction landscape by deep mutational scanning in Saccharomyces cerevisiae Pol II. Through analysis of over 15000 alleles, representing all single mutants, a rationally designed subset of double mutants, and evolutionarily observed TL haplotypes, we identify interaction networks controlling TL function. Substituting residues creates allele-specific networks and propagates epistatic effects across the Pol II active site. Furthermore, the interaction landscape further distinguishes alleles with similar growth phenotypes, suggesting increased resolution over the previously reported single mutant phenotypic landscape. Finally, co-evolutionary analyses reveal groups of co-evolving residues across Pol II converge onto the active site, where evolutionary constraints interface with pervasive epistasis. Our studies provide a powerful system to understand the plasticity of RNA polymerase mechanism and evolution, and provide the first example of pervasive epistatic landscape in a highly conserved and constrained domain within an essential enzyme.
Affiliation(s)
- Bingbing Duan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
- Chenxi Qiu
  - Department of Genetics, Harvard Medical School, Boston, MA 02215
- Sing-Hoi Sze
  - Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843
  - Department of Biochemistry & Biophysics, Texas A&M University, College Station, TX 77843
- Craig Kaplan
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
4. Zhang M, Liu YH, Wang Y, Sze SH, Scheuring CF, Qi X, Ekinci O, Pekar J, Murray SC, Zhang HB. Genome-wide identification of genes enabling accurate prediction of hybrid performance from parents across environments and populations for gene-based breeding in maize. Plant Sci 2022; 324:111424. [PMID: 35995113 DOI: 10.1016/j.plantsci.2022.111424] [Received: 06/14/2022] [Revised: 08/07/2022] [Accepted: 08/16/2022] [Indexed: 06/15/2023]
Abstract
Accurate prediction of hybrid offspring complex trait phenotype from parents is paramount to enhanced plant breeding, animal breeding, and human medicine. Here we report genome-wide identification of genes enabling accurate prediction of hybrid offspring complex traits from parents using maize grain yield as the target trait. We identified 181 ZmF1GY genes enabling prediction of maize (Zea mays L.) F1 hybrid grain yield from parents and tested their utility and efficiency for predicting F1 hybrid grain yields from parents using their expressions, genic SNPs, and number of favorable alleles (NFAs), respectively. The ZmF1GY genes predicted hybrid grain yields from parents at an accuracy of 0.86, presented by correlation coefficient between predicted and observed phenotypes, within an environment, 0.74 across environments, and 0.64 across populations, outperforming genomic prediction by 27-406%, 23%, and 40%, respectively. Furthermore, we identified nine of the ZmF1GY genes containing SNPs or InDels in parents that increased or decreased hybrid grain yields by 14-46%. When the NFAs of these nine ZmF1GY genes were used for hybrid grain yield prediction from parents, they predicted hybrid grain yields at an accuracy of 0.79, outperforming genomic prediction by 21% that was based on up to tens of thousands of genome-wide SNPs. These results demonstrate the feasibility of developing a gene toolkit for a species enabling gene-based breeding across environments and populations that is much more powerful and efficient than current breeding, thereby helping secure the world's food production. The methodology is applicable to all crops, livestock, and humans.
Affiliation(s)
- Meiping Zhang
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
- Yun-Hua Liu
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
- Yinglei Wang
  - Department of Computer Science, Cornell University, Ithaca, NY 14853, USA.
- Sing-Hoi Sze
  - Department of Computer Science and Engineering and Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843, USA.
- Chantel F Scheuring
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
- Xiaoli Qi
  - College of Life Science, Jiamusi University, Jiamusi, Heilongjiang 154007, China.
- Ozge Ekinci
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
- Jacob Pekar
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
- Seth C Murray
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
- Hong-Bin Zhang
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
5. Hjelmen CE, Yuan Y, Parrott JJ, McGuane AS, Srivastav SP, Purcell AC, Pimsler ML, Sze SH, Tarone AM. Identification and Characterization of Small RNA Markers of Age in the Blow Fly Cochliomyia macellaria (Fabricius) (Diptera: Calliphoridae). Insects 2022; 13:948. [PMID: 36292896 PMCID: PMC9603907 DOI: 10.3390/insects13100948] [Received: 09/08/2022] [Revised: 10/13/2022] [Accepted: 10/15/2022] [Indexed: 06/16/2023]
Abstract
Blow fly development is important in decomposition ecology, agriculture, and forensics. Much of the impact of these species is from immature samples, thus knowledge of their development is important to enhance or ameliorate their effects. One application of this information is the estimation of immature insect age to provide temporal information for death investigations. While traditional markers of age such as stage and size are generally accurate, they lack precision in later developmental stages. We used miRNA sequencing to measure miRNA expression, throughout development, of the secondary screwworm, Cochliomyia macellaria (Fabricius) (Diptera: Calliphoridae) and identified 217 miRNAs present across the samples. Ten were identified to be significantly differentially expressed in larval samples and seventeen were found to be significantly differentially expressed in intrapuparial samples. Twenty-eight miRNAs were identified to be differentially expressed between sexes. Expression patterns of two miRNAs, miR-92b and bantam, were qPCR-validated in intrapuparial samples; these and likely food-derived miRNAs appear to be stable markers of age in C. macellaria. Our results support the use of miRNAs for developmental markers of age and suggest further investigations across species and under a range of abiotic and biotic conditions.
Affiliation(s)
- Carl E. Hjelmen
  - Department of Biology, Utah Valley University, Orem, UT 84058, USA
  - Department of Entomology, Texas A&M University, College Station, TX 77843, USA
- Ye Yuan
  - Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843, USA
- Jonathan J. Parrott
  - Department of Entomology, Texas A&M University, College Station, TX 77843, USA
  - School of Mathematical and Natural Sciences, Arizona State University, Glendale, AZ 85306, USA
- Satyam P. Srivastav
  - Department of Entomology, Texas A&M University, College Station, TX 77843, USA
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
- Amanda C. Purcell
  - Centre for Forensic Science, Department of Pure and Applied Chemistry, University of Strathclyde, Glasgow G1 1XQ, UK
- Meaghan L. Pimsler
  - Department of Entomology, Texas A&M University, College Station, TX 77843, USA
- Sing-Hoi Sze
  - Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843, USA
  - Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843, USA
- Aaron M. Tarone
  - Department of Entomology, Texas A&M University, College Station, TX 77843, USA
6. Liu YH, Zhang M, Sze SH, Smith CW, Zhang HB. Analysis of the genes controlling cotton fiber length reveals the molecular basis of plant breeding and the genetic potential of current cultivars for continued improvement. Plant Sci 2022; 321:111318. [PMID: 35696918 DOI: 10.1016/j.plantsci.2022.111318] [Received: 01/09/2022] [Revised: 05/02/2022] [Accepted: 05/08/2022] [Indexed: 06/15/2023]
Abstract
Stagnated crop improvement has raised questions of whether and how current crop cultivars can be further improved. Genes are the core determinants of performance of all cultivars. Here, we report the molecular basis of plant breeding and address these questions by analyzing 226 GFL genes controlling and accurately predicting fiber length, an important breeding objective trait, in cotton (Gossypium sp.). We first identified the favorable allele and the number of favorable alleles (NFAs) of each GFL gene, calculated the total NFAs of the 226 GFL genes accumulated in 198 advanced breeding lines, and analyzed them against fiber lengths. Fiber lengths of the breeding lines were strongly correlated with the total NFAs of the GFL genes (r = 0.85, P < 0.0001), suggesting that accumulation of the favorable alleles of the genes controlling objective traits is the molecular basis of cotton breeding. Surprisingly, a breeding line with a fiber length of present cultivars having the longest fibers contained only about 51% of the total NFAs of the 226 GFL genes. The genetic potentials of current cultivars were then predicted using linear and non-linear models, respectively, revealing that a breeding line or cultivar with a fiber length of 33.8 mm could be further improved in fiber length by up to 118%. Finally, we showed that the genetic potential of such a breeding line can be realized through gene-based breeding. Therefore, these findings shed light on continued crop improvement in general and provide 740 genic biomarkers desirable for enhanced cotton fiber breeding.
Affiliation(s)
- Yun-Hua Liu
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
- Meiping Zhang
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
- Sing-Hoi Sze
  - Department of Computer Science and Engineering, and Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843, USA.
- C Wayne Smith
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
- Hong-Bin Zhang
  - Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
7. Zhang Y, Xiao X, Elhag O, Cai M, Zheng L, Huang F, Jordan HR, Tomberlin JK, Sze SH, Yu Z, Zhang J. Hermetia illucens L. larvae-associated intestinal microbes reduce the transmission risk of zoonotic pathogens in pig manure. Microb Biotechnol 2022; 15:2631-2644. [PMID: 35881487 PMCID: PMC9518977 DOI: 10.1111/1751-7915.14113] [Received: 11/05/2021] [Revised: 06/16/2022] [Accepted: 06/18/2022] [Indexed: 12/03/2022] Open
Abstract
Black soldier fly (BSF) larvae are considered a promising biological reactor to convert organic waste and reduce the impact of zoonotic pathogens on the environment. We analysed the effects of BSF larvae on Staphylococcus aureus and Salmonella spp. populations in pig manure (PM), which showed that BSF larvae can significantly reduce the counts of the associated S. aureus and Salmonella spp. Then, using a sterile BSF larval system, we validated the function of BSF larval intestinal microbiota in vivo to suppress pathogens, and lastly, we isolated eight bacterial strains from the BSF larval gut that inhibit S. aureus. Results indicated that functional microbes are essential for BSF larvae to antagonise S. aureus. Moreover, the analysis results of the relationship between the intestinal microbiota and S. aureus and Salmonella spp. showed that Myroides, Tissierella, Oblitimonas, Paenalcaligenes, Terrisporobacter, Clostridium, Fastidiosipila, Pseudomonas, Ignatzschineria, Savagea, Moheibacter and Sphingobacterium were negatively correlated with S. aureus and Salmonella. Overall, these results suggested that the potential ability of BSF larvae to inhibit S. aureus and Salmonella spp. present in PM is accomplished primarily by gut‐associated microorganisms.
Affiliation(s)
- Yuanpu Zhang
  - State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Xiaopeng Xiao
  - State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Osama Elhag
  - State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
  - Faculty of Science and Technology, Omdurman Islamic University, Khartoum, Sudan
- Minmin Cai
  - State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Longyu Zheng
  - State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Feng Huang
  - State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Heather R Jordan
  - Department of Biology, Mississippi State University, Mississippi State, Mississippi, USA
- Sing-Hoi Sze
  - Department of Computer Science and Engineering, Texas A&M University, College Station, Texas, USA
- Ziniu Yu
  - State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
- Jibin Zhang
  - State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Hubei Hongshan Laboratory, Wuhan, China
8. Wells KM, He K, Pandey A, Cabello A, Zhang D, Yang J, Gomez G, Liu Y, Chang H, Li X, Zhang H, Feng X, da Costa LF, Metz R, Johnson CD, Martin CL, Skrobarczyk J, Berghman LR, Patrick KL, Leibowitz J, Ficht A, Sze SH, Song J, Qian X, Qin QM, Ficht TA, de Figueiredo P. Brucella activates the host RIDD pathway to subvert BLOS1-directed immune defense. eLife 2022; 11:73625. [PMID: 35587649 PMCID: PMC9119680 DOI: 10.7554/elife.73625] [Received: 09/05/2021] [Accepted: 04/26/2022] [Indexed: 11/18/2022] Open
Abstract
The phagocytosis and destruction of pathogens in lysosomes constitute central elements of innate immune defense. Here, we show that Brucella, the causative agent of brucellosis, the most prevalent bacterial zoonosis globally, subverts this immune defense pathway by activating regulated IRE1α-dependent decay (RIDD) of Bloc1s1 mRNA encoding BLOS1, a protein that promotes endosome–lysosome fusion. RIDD-deficient cells and mice harboring a RIDD-incompetent variant of IRE1α were resistant to infection. Inactivation of the Bloc1s1 gene impaired the ability to assemble BLOC-1-related complex (BORC), resulting in differential recruitment of BORC-related lysosome trafficking components, perinuclear trafficking of Brucella-containing vacuoles (BCVs), and enhanced susceptibility to infection. The RIDD-resistant Bloc1s1 variant maintains the integrity of BORC and a higher-level association of BORC-related components that promote centrifugal lysosome trafficking, resulting in enhanced BCV peripheral trafficking and lysosomal destruction, and resistance to infection. These findings demonstrate that host RIDD activity on BLOS1 regulates Brucella intracellular parasitism by disrupting BORC-directed lysosomal trafficking. Notably, coronavirus murine hepatitis virus also subverted the RIDD–BLOS1 axis to promote intracellular replication. Our work establishes BLOS1 as a novel immune defense factor whose activity is hijacked by diverse pathogens.
Affiliation(s)
- Kelsey Michelle Wells
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Bryan, United States
- Kai He
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, United States
- Aseem Pandey
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Bryan, United States
  - Department of Veterinary Pathobiology, Texas A&M University, College Station, United States
- Ana Cabello
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Bryan, United States
  - Department of Veterinary Pathobiology, Texas A&M University, College Station, United States
- Dongmei Zhang
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Bryan, United States
- Jing Yang
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Bryan, United States
- Gabriel Gomez
  - Texas A&M Veterinary Medical Diagnostic Laboratory, Texas A&M University, College Station, United States
- Yue Liu
  - College of Plant Sciences, Key Laboratory of Zoonosis Research, Ministry of Education, Jilin University, Jilin, China
- Haowu Chang
  - Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
- Xueqiang Li
  - Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
- Hao Zhang
  - Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
- Xuehuang Feng
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Bryan, United States
- Richard Metz
  - Genomics and Bioinformatics Services, Texas A&M University, College Station, United States
- Charles D Johnson
  - Genomics and Bioinformatics Services, Texas A&M University, College Station, United States
- Cameron Lee Martin
  - Department of Poultry Science, Texas A&M University, College Station, United States
- Jill Skrobarczyk
  - Department of Poultry Science, Texas A&M University, College Station, United States
- Luc R Berghman
  - Department of Poultry Science, Texas A&M University, College Station, United States
- Kristin L Patrick
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Bryan, United States
- Julian Leibowitz
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Bryan, United States
- Allison Ficht
  - Department of Molecular and Cellular Medicine, College of Medicine, Texas A&M Health Science Center, College Station, United States
- Sing-Hoi Sze
  - Department of Computer Science and Engineering, Dwight Look College of Engineering, Texas A&M University, College Station, United States
  - Department of Biochemistry & Biophysics, Texas A&M University, College Station, United States
- Jianxun Song
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Bryan, United States
- Xiaoning Qian
  - Department of Electrical and Computer Engineering, Texas A&M University, College Station, United States
  - TEES-AgriLife Center for Bioinformatics & Genomic Systems Engineering, Texas A&M University, College Station, United States
- Qing-Ming Qin
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Bryan, United States
  - College of Plant Sciences, Key Laboratory of Zoonosis Research, Ministry of Education, Jilin University, Jilin, China
- Thomas A Ficht
  - Department of Veterinary Pathobiology, Texas A&M University, College Station, United States
- Paul de Figueiredo
  - Department of Microbial Pathogenesis and Immunology, College of Medicine, Texas A&M Health Science Center, Bryan, United States
  - Department of Veterinary Pathobiology, Texas A&M University, College Station, United States
9. Liu YH, Zhang M, Scheuring CF, Cilkiz M, Sze SH, Smith CW, Murray SC, Xu W, Zhang HB. Accurate prediction of complex traits for individuals and offspring from parents using a simple, rapid, and efficient method for gene-based breeding in cotton and maize. Plant Sci 2022; 316:111153. [PMID: 35151437 DOI: 10.1016/j.plantsci.2021.111153] [Received: 11/02/2021] [Accepted: 12/11/2021] [Indexed: 06/14/2023]
Abstract
Accurate, simple, rapid, and inexpensive prediction of complex traits controlled by numerous genes is paramount to enhanced plant breeding, animal breeding, and human medicine. Here we report a novel method that enables accurate, simple, and rapid prediction of complex traits of individuals or offspring from parents based on the number of favorable alleles (NFAs) of the genes controlling the objective traits. The NFAs of 226 cotton fiber length (GFL) genes and nine maize hybrid grain yield related (ZmF1GY) genes were directly used to predict cotton fiber lengths of individual plants and maize grain yields of F1 hybrids from parents, respectively, using prediction model-based methods as controls. The NFAs of the 226 GFL genes predicted cotton fiber lengths at an accuracy of 0.85, as the model methods and outperforming genomic prediction by 82 % - 170 %. The NFAs of the nine ZmF1GY genes predicted grain yields of maize hybrids from parents at an accuracy of 0.80, outperforming genomic prediction by 67 %. Moreover, the prediction accuracies of these traits were consistent across years, environments, and eco-agricultural systems. Importantly, the accurate prediction of these traits directly using the NFAs of the genes allows breeding to be performed in greenhouse, phytotron, or off-season, without the need of the model training and validation steps essential and costly for model-based genomic or genic prediction. Therefore, this new method dramatically outperforms the current model-based genomic methods used for phenotype prediction and streamlines the process of breeding, thus promising to substantially enhance current plant and animal breeding.
Affiliation(s)
- Yun-Hua Liu
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA
| | - Meiping Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA
| | - Chantel F Scheuring
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA
| | - Mustafa Cilkiz
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA
| | - Sing-Hoi Sze
- Department of Computer Science and Engineering and Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843, USA
| | - C Wayne Smith
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA
| | - Seth C Murray
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA
| | - Wenwei Xu
- Texas A&M AgriLife Research, Lubbock, TX 79403, USA
| | - Hong-Bin Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA.
|
10
|
Olafson PU, Aksoy S, Attardo GM, Buckmeier G, Chen X, Coates CJ, Davis M, Dykema J, Emrich SJ, Friedrich M, Holmes CJ, Ioannidis P, Jansen EN, Jennings EC, Lawson D, Martinson EO, Maslen GL, Meisel RP, Murphy TD, Nayduch D, Nelson DR, Oyen KJ, Raszick TJ, Ribeiro JMC, Robertson HM, Rosendale AJ, Sackton TB, Saelao P, Swiger SL, Sze SH, Tarone AM, Taylor DB, Warren WC, Waterhouse RM, Weirauch MT, Werren JH, Wilson RK, Zdobnov EM, Benoit JB. Publisher Correction: The genome of the stable fly, Stomoxys calcitrans, reveals potential mechanisms underlying reproduction, host interactions, and novel targets for pest control. BMC Biol 2021; 19:150. [PMID: 34325695 PMCID: PMC8320157 DOI: 10.1186/s12915-021-01098-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022] Open
Affiliation(s)
- Pia U Olafson
- Livestock Arthropod Pests Research Unit, USDA-ARS, Kerrville, TX, USA.
| | - Serap Aksoy
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
| | - Geoffrey M Attardo
- Department of Entomology and Nematology, University of California - Davis, Davis, CA, USA
| | - Greta Buckmeier
- Livestock Arthropod Pests Research Unit, USDA-ARS, Kerrville, TX, USA
| | - Xiaoting Chen
- The Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Craig J Coates
- Department of Entomology, Texas A & M University, College Station, TX, USA
| | - Megan Davis
- Livestock Arthropod Pests Research Unit, USDA-ARS, Kerrville, TX, USA
| | - Justin Dykema
- Department of Biological Sciences, Wayne State University, Detroit, MI, USA
| | - Scott J Emrich
- Department of Electrical Engineering & Computer Science, University of Tennessee, Knoxville, TN, USA
| | - Markus Friedrich
- Department of Biological Sciences, Wayne State University, Detroit, MI, USA
| | - Christopher J Holmes
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA
| | - Panagiotis Ioannidis
- Department of Genetic Medicine and Development, University of Geneva Medical School and Swiss Institute of Bioinformatics, 1211, Geneva, Switzerland
| | - Evan N Jansen
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA
| | - Emily C Jennings
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA
| | - Daniel Lawson
- The European Molecular Biology Laboratory, The European Bioinformatics Institute, The Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | | | - Gareth L Maslen
- The European Molecular Biology Laboratory, The European Bioinformatics Institute, The Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Richard P Meisel
- Department of Biology and Biochemistry, University of Houston, Houston, TX, USA
| | - Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Dana Nayduch
- Arthropod-borne Animal Diseases Research Unit, USDA-ARS, Manhattan, KS, USA
| | - David R Nelson
- Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Kennan J Oyen
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA
| | - Tyler J Raszick
- Department of Entomology, Texas A & M University, College Station, TX, USA
| | - José M C Ribeiro
- Section of Vector Biology, Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, Rockville, MD, USA
| | - Hugh M Robertson
- Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | | | - Timothy B Sackton
- Informatics Group, Faculty of Arts and Sciences, Harvard University, Cambridge, MA, USA
| | - Perot Saelao
- Livestock Arthropod Pests Research Unit, USDA-ARS, Kerrville, TX, USA
| | - Sonja L Swiger
- Department of Entomology, Texas A&M AgriLife Research and Extension Center, Stephenville, TX, USA
| | - Sing-Hoi Sze
- Department of Computer Science & Engineering, Department of Biochemistry & Biophysics, Texas A & M University, College Station, TX, USA
| | - Aaron M Tarone
- Department of Entomology, Texas A & M University, College Station, TX, USA
| | - David B Taylor
- Agroecosystem Management Research Unit, USDA-ARS, Lincoln, NE, USA
| | - Wesley C Warren
- University of Missouri, Bond Life Sciences Center, Columbia, MO, USA
| | - Robert M Waterhouse
- Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
| | - Matthew T Weirauch
- Center for Autoimmune Genomics and Etiology, Divisions of Biomedical Informatics and Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA.,Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
| | - John H Werren
- Department of Biology, University of Rochester, Rochester, NY, USA
| | - Richard K Wilson
- Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA.,College of Medicine, Ohio State University, Columbus, OH, USA
| | - Evgeny M Zdobnov
- Department of Genetic Medicine and Development, University of Geneva Medical School and Swiss Institute of Bioinformatics, 1211, Geneva, Switzerland
| | - Joshua B Benoit
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA.
|
11
|
Pimsler ML, Hjelmen CE, Jonika MM, Sharma A, Fu S, Bala M, Sze SH, Tomberlin JK, Tarone AM. Sexual Dimorphism in Growth Rate and Gene Expression Throughout Immature Development in Wild Type Chrysomya rufifacies (Diptera: Calliphoridae) Macquart. Front Ecol Evol 2021. [DOI: 10.3389/fevo.2021.696638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Reliability of forensic entomology analyses to produce relevant information to a given case requires an understanding of the underlying arthropod population(s) of interest and the factors contributing to variability. Common traits for analyses are affected by a variety of genetic and environmental factors. One trait of interest in forensic investigations has been species-specific temperature-dependent growth rates. Recent work indicates sexual dimorphism may be important in the analysis of such traits and related genetic markers of age. However, studying sexually dimorphic patterns of gene expression throughout immature development in wild-type insects can be difficult due to a lack of genetic tools and the limits of most sex-determination mechanisms. Chrysomya rufifacies, however, is a particularly tractable system to address these issues as it has a monogenic sex determination system, meaning each female produces offspring of only a single sex throughout her life. Using modified breeding procedures (to ensure single-female egg clutches) and transcriptomics, we investigated sexual dimorphism in development rate and gene expression. Females develop more slowly than males, even at 30°C, with an average egg-to-eclosion time of 225 h for males and 234 h for females (a 9 h difference). Given that many key genes rely on sex-specific splicing for the development and maintenance of sexually dimorphic traits, we used a transcriptomic approach to identify differential expression of gene splice variants. We find that 98.4% of assembled nodes exhibited sex-specific, stage-specific, or sex-by-stage specific patterns of expression. However, the greatest signal in the expression data is differentiation by developmental stage, indicating that sexual dimorphism in gene expression during development may not be investigatively important and that markers of age may be relatively independent of sex. Subtle differences in these gene expression patterns can be detected as early as 4 h post-oviposition, and 12 of these nodes demonstrate homology with key Drosophila sex determination genes, providing clues regarding the distinct sex determination mechanism of C. rufifacies. Finally, we validated the transcriptome analyses through qPCR and have identified five genes that are developmentally informative within and between sexes.
|
12
|
Song JM, Arif M, Zi Y, Sze SH, Zhang M, Zhang HB. Molecular and genetic dissection of the USDA rice mini-core collection using high-density SNP markers. Plant Sci 2021; 308:110910. [PMID: 34034867 DOI: 10.1016/j.plantsci.2021.110910] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/12/2020] [Revised: 04/05/2021] [Accepted: 04/10/2021] [Indexed: 06/12/2023]
Abstract
Molecular tools and knowledge of crop germplasm are vital for their effective utilization. In this study, we developed 40,866 high-quality and well distributed SNPs for a rice mini-core collection (RMC) developed by the United States Department of Agriculture (USDA). The high-quality SNPs clustered the USDA-RMC into five subpopulations (Ind, indica; Aus, aus; Afr, African rice; TeJ, temperate japonica; TrJ, tropical japonica) and one admixture (Adm). This classification was further confirmed by phylogenetic and principal component analyses. The rice ARO (aromatic) subpopulation of previous studies was re-assigned to Adm and the WD (wild-type) subpopulation was re-defined as the Afr subpopulation because most of its accessions are African cultivated rice. The Aus and Ind subpopulations had substantially wider genetic variation than the TrJ and TeJ subpopulations. The genetic diversities were much larger between the Ind or Aus subpopulation and the TrJ or TeJ subpopulation than between the Afr subpopulation and the Ind, Aus, TrJ or TeJ subpopulation. Comparative agronomic trait analysis between the subpopulations also supported the genetic structure and variation of the RMC, and suggested the existence of extensive variation in the genes controlling agronomic traits among them. Furthermore, analysis of ancestral membership of the RMC accessions revealed that a reproductive barrier, or wide incompatibility, existed between the Indica and Japonica groups, although gene flow occurred between them. These results provide high-quality SNPs and knowledge of genetic structure and diversity of the USDA-RMC necessary for enhanced rice research and breeding.
Affiliation(s)
- Jian-Min Song
- Crop Research Institute/National Engineering Laboratory for Wheat and Maize, Shandong Academy of Agricultural Sciences (SAAS), Jinan, 250100, PR China; Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, 77843-2474, USA.
| | - Muhammad Arif
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, 77843-2474, USA; Agricultural Biotechnology Division, National Institute for Biotechnology and Genetic Engineering, Faisalabad, Pakistan.
| | - Yan Zi
- Crop Research Institute/National Engineering Laboratory for Wheat and Maize, Shandong Academy of Agricultural Sciences (SAAS), Jinan, 250100, PR China
| | - Sing-Hoi Sze
- Department of Computer Science and Engineering and Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843, USA.
| | - Meiping Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, 77843-2474, USA.
| | - Hong-Bin Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, 77843-2474, USA.
|
13
|
Olafson PU, Aksoy S, Attardo GM, Buckmeier G, Chen X, Coates CJ, Davis M, Dykema J, Emrich SJ, Friedrich M, Holmes CJ, Ioannidis P, Jansen EN, Jennings EC, Lawson D, Martinson EO, Maslen GL, Meisel RP, Murphy TD, Nayduch D, Nelson DR, Oyen KJ, Raszick TJ, Ribeiro JMC, Robertson HM, Rosendale AJ, Sackton TB, Saelao P, Swiger SL, Sze SH, Tarone AM, Taylor DB, Warren WC, Waterhouse RM, Weirauch MT, Werren JH, Wilson RK, Zdobnov EM, Benoit JB. The genome of the stable fly, Stomoxys calcitrans, reveals potential mechanisms underlying reproduction, host interactions, and novel targets for pest control. BMC Biol 2021; 19:41. [PMID: 33750380 PMCID: PMC7944917 DOI: 10.1186/s12915-021-00975-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/20/2020] [Accepted: 02/03/2021] [Indexed: 01/01/2023] Open
Abstract
Background The stable fly, Stomoxys calcitrans, is a major blood-feeding pest of livestock that has near worldwide distribution, causing an annual cost of over $2 billion for control and product loss in the USA alone. Control of these flies has been limited to increased sanitary management practices and insecticide application for suppressing larval stages. Few genetic and molecular resources are available to help in developing novel methods for controlling stable flies. Results This study examines stable fly biology by utilizing a combination of high-quality genome sequencing and RNA-Seq analyses targeting multiple developmental stages and tissues. In conjunction, 1600 genes were manually curated to characterize genetic features related to stable fly reproduction, vector host interactions, host-microbe dynamics, and putative targets for control. Most notable was characterization of genes associated with reproduction and identification of expanded gene families with functional associations to vision, chemosensation, immunity, and metabolic detoxification pathways. Conclusions The combined sequencing, assembly, and curation of the male stable fly genome followed by RNA-Seq and downstream analyses provide insights necessary to understand the biology of this important pest. These resources and new data will provide the groundwork for expanding the tools available to control stable fly infestations. The close relationship of Stomoxys to other blood-feeding (horn flies and Glossina) and non-blood-feeding flies (house flies, medflies, Drosophila) will facilitate understanding of the evolutionary processes associated with development of blood feeding among the Cyclorrhapha. Supplementary Information The online version contains supplementary material available at 10.1186/s12915-021-00975-9.
Affiliation(s)
- Pia U Olafson
- Livestock Arthropod Pests Research Unit, USDA-ARS, Kerrville, TX, USA.
| | - Serap Aksoy
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
| | - Geoffrey M Attardo
- Department of Entomology and Nematology, University of California - Davis, Davis, CA, USA
| | - Greta Buckmeier
- Livestock Arthropod Pests Research Unit, USDA-ARS, Kerrville, TX, USA
| | - Xiaoting Chen
- The Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Craig J Coates
- Department of Entomology, Texas A & M University, College Station, TX, USA
| | - Megan Davis
- Livestock Arthropod Pests Research Unit, USDA-ARS, Kerrville, TX, USA
| | - Justin Dykema
- Department of Biological Sciences, Wayne State University, Detroit, MI, USA
| | - Scott J Emrich
- Department of Electrical Engineering & Computer Science, University of Tennessee, Knoxville, TN, USA
| | - Markus Friedrich
- Department of Biological Sciences, Wayne State University, Detroit, MI, USA
| | - Christopher J Holmes
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA
| | - Panagiotis Ioannidis
- Department of Genetic Medicine and Development, University of Geneva Medical School and Swiss Institute of Bioinformatics, 1211, Geneva, Switzerland
| | - Evan N Jansen
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA
| | - Emily C Jennings
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA
| | - Daniel Lawson
- The European Molecular Biology Laboratory, The European Bioinformatics Institute, The Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | | | - Gareth L Maslen
- The European Molecular Biology Laboratory, The European Bioinformatics Institute, The Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Richard P Meisel
- Department of Biology and Biochemistry, University of Houston, Houston, TX, USA
| | - Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Dana Nayduch
- Arthropod-borne Animal Diseases Research Unit, USDA-ARS, Manhattan, KS, USA
| | - David R Nelson
- Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, TN, USA
| | - Kennan J Oyen
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA
| | - Tyler J Raszick
- Department of Entomology, Texas A & M University, College Station, TX, USA
| | - José M C Ribeiro
- Section of Vector Biology, Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, Rockville, MD, USA
| | - Hugh M Robertson
- Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | | | - Timothy B Sackton
- Informatics Group, Faculty of Arts and Sciences, Harvard University, Cambridge, MA, USA
| | - Perot Saelao
- Livestock Arthropod Pests Research Unit, USDA-ARS, Kerrville, TX, USA
| | - Sonja L Swiger
- Department of Entomology, Texas A&M AgriLife Research and Extension Center, Stephenville, TX, USA
| | - Sing-Hoi Sze
- Department of Computer Science & Engineering, Department of Biochemistry & Biophysics, Texas A & M University, College Station, TX, USA
| | - Aaron M Tarone
- Department of Entomology, Texas A & M University, College Station, TX, USA
| | - David B Taylor
- Agroecosystem Management Research Unit, USDA-ARS, Lincoln, NE, USA
| | - Wesley C Warren
- University of Missouri, Bond Life Sciences Center, Columbia, MO, USA
| | - Robert M Waterhouse
- Department of Ecology and Evolution, University of Lausanne, and Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
| | - Matthew T Weirauch
- Center for Autoimmune Genomics and Etiology, Divisions of Biomedical Informatics and Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA.,Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
| | - John H Werren
- Department of Biology, University of Rochester, Rochester, NY, USA
| | - Richard K Wilson
- Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA.,College of Medicine, Ohio State University, Columbus, OH, USA
| | - Evgeny M Zdobnov
- Department of Genetic Medicine and Development, University of Geneva Medical School and Swiss Institute of Bioinformatics, 1211, Geneva, Switzerland
| | - Joshua B Benoit
- Department of Biological Sciences, University of Cincinnati, Cincinnati, OH, USA.
|
14
|
Abstract
Background Single-cell RNA sequencing (scRNA-seq) is a powerful profiling technique at single-cell resolution. Appropriate analysis of scRNA-seq data can characterize molecular heterogeneity and shed light on the underlying cellular processes to better understand development and disease mechanisms. The unique analytic challenge is to appropriately model highly over-dispersed scRNA-seq count data with prevalent dropouts (zero counts), making zero-inflated dimensionality reduction techniques popular for scRNA-seq data analyses. Employing zero-inflated distributions, however, may place extra emphasis on zero counts, leading to potential bias when identifying the latent structure of the data. Results In this paper, we propose a fully generative hierarchical gamma-negative binomial (hGNB) model of scRNA-seq data, obviating the need for explicitly modeling zero inflation. At the same time, hGNB can naturally account for covariate effects at both the gene and cell levels to identify complex latent representations of scRNA-seq data, without the need for commonly adopted pre-processing steps such as normalization. Efficient Bayesian model inference is derived by exploiting conditional conjugacy via novel data augmentation techniques. Conclusion Experimental results on both simulated data and several real-world scRNA-seq datasets suggest that hGNB is a powerful tool for cell cluster discovery as well as cell lineage inference.
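Why an over-dispersed negative binomial can absorb many zero counts without a separate zero-inflation component can be seen from its zero probability alone. The sketch below is not the hGNB model; it only compares P(count = 0) under a Poisson and under a negative binomial with the same mean, using the standard mean/dispersion parameterization. The function names and the example values of `mean` and `dispersion` are illustrative assumptions.

```python
from math import exp

def poisson_zero_prob(mean):
    """P(X = 0) for a Poisson distribution with the given mean."""
    return exp(-mean)

def nb_zero_prob(mean, dispersion):
    """P(X = 0) for a negative binomial with the given mean and dispersion r.

    Uses the mean/dispersion form, where variance = mean + mean**2 / r,
    so smaller r means stronger over-dispersion.
    """
    return (dispersion / (dispersion + mean)) ** dispersion

# Illustrative values: same mean, strong over-dispersion in the NB case.
mean = 2.0
p0_poisson = poisson_zero_prob(mean)
p0_nb = nb_zero_prob(mean, dispersion=0.5)

# With a very large dispersion parameter the NB approaches the Poisson.
p0_nb_large_r = nb_zero_prob(mean, dispersion=1e6)
```

At the same mean, the over-dispersed negative binomial places far more mass on zero than the Poisson does, which is the intuition behind modeling dropouts without an explicit zero-inflation term.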
Affiliation(s)
- Siamak Zamani Dadaneh
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA
| | - Paul de Figueiredo
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, Bryan, Texas, USA.,Department of Veterinary Pathobiology, Texas A&M University, College Station, Texas, USA.,Norman Borlaug Center, Texas A&M University, College Station, Texas, USA
| | - Sing-Hoi Sze
- Department of Computer Science and Engineering, Texas A&M University, College Station, Texas, USA
| | - Mingyuan Zhou
- McCombs School of Business, The University of Texas at Austin, Austin, Texas, USA
| | - Xiaoning Qian
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas, USA.,TEES-AgriLife Center for Bioinformatics & Genomic Systems Engineering, Texas A&M University, College Station, Texas, USA.
|
15
|
Qiu C, Jin H, Vvedenskaya I, Llenas JA, Zhao T, Malik I, Visbisky AM, Schwartz SL, Cui P, Čabart P, Han KH, Lai WKM, Metz RP, Johnson CD, Sze SH, Pugh BF, Nickels BE, Kaplan CD. Universal promoter scanning by Pol II during transcription initiation in Saccharomyces cerevisiae. Genome Biol 2020; 21:132. [PMID: 32487207 PMCID: PMC7265651 DOI: 10.1186/s13059-020-02040-0] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 10/24/2019] [Accepted: 05/08/2020] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND The majority of eukaryotic promoters utilize multiple transcription start sites (TSSs). How multiple TSSs are specified at individual promoters across eukaryotes is not understood for most species. In Saccharomyces cerevisiae, a pre-initiation complex (PIC) comprised of Pol II and conserved general transcription factors (GTFs) assembles and opens DNA upstream of TSSs. Evidence from model promoters indicates that the PIC scans from upstream to downstream to identify TSSs. Prior results suggest that TSS distributions at promoters where scanning occurs shift in a polar fashion upon alteration in Pol II catalytic activity or GTF function. RESULTS To determine the extent of promoter scanning across promoter classes in S. cerevisiae, we perturb Pol II catalytic activity and GTF function and analyze their effects on TSS usage genome-wide. We find that alterations to Pol II, TFIIB, or TFIIF function widely alter the initiation landscape consistent with promoter scanning operating at all yeast promoters, regardless of promoter class. Promoter architecture, however, can determine the extent of promoter sensitivity to altered Pol II activity in ways that are predicted by a scanning model. CONCLUSIONS Our observations coupled with previous data validate key predictions of the scanning model for Pol II initiation in yeast, which we term the shooting gallery. In this model, Pol II catalytic activity and the rate and processivity of Pol II scanning together with promoter sequence determine the distribution of TSSs and their usage.
Affiliation(s)
- Chenxi Qiu
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843-2128, USA
- Present Address: Department of Medicine, Division of Translational Therapeutics, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, 02215, USA
| | - Huiyan Jin
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843-2128, USA
| | - Irina Vvedenskaya
- Waksman Institute of Microbiology, Rutgers University, Piscataway, NJ, 08854, USA
- Department of Genetics, Rutgers University, Piscataway, NJ, 08854, USA
| | - Jordi Abante Llenas
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, 77843-3128, USA
- Present Address: Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, 21218, USA
| | - Tingting Zhao
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA
| | - Indranil Malik
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843-2128, USA
- Present Address: Department of Neurology, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Alex M Visbisky
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA
| | - Scott L Schwartz
- Genomics and Bioinformatics Service, Texas A&M AgriLife, College Station, TX, 77845, USA
| | - Ping Cui
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843-2128, USA
| | - Pavel Čabart
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843-2128, USA
- Present Address: First Faculty of Medicine, Charles University, BIOCEV, 252 42, Vestec, Czech Republic
| | - Kang Hoo Han
- Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA, 16802, USA
| | - William K M Lai
- Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA, 16802, USA
- Present Address: Department of Molecular Biology and Genetics, 458 Biotechnology, Cornell University, New York, 14853, USA
| | - Richard P Metz
- Genomics and Bioinformatics Service, Texas A&M AgriLife, College Station, TX, 77845, USA
| | - Charles D Johnson
- Genomics and Bioinformatics Service, Texas A&M AgriLife, College Station, TX, 77845, USA
| | - Sing-Hoi Sze
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, 77843-2128, USA
- Department of Computer Science and Engineering, Texas A&M University, College Station, TX, 77843-3127, USA
| | - B Franklin Pugh
- Department of Biochemistry and Molecular Biology, Penn State University, University Park, PA, 16802, USA
- Present Address: Department of Molecular Biology and Genetics, 458 Biotechnology, Cornell University, New York, 14853, USA
| | - Bryce E Nickels
- Waksman Institute of Microbiology, Rutgers University, Piscataway, NJ, 08854, USA
- Department of Genetics, Rutgers University, Piscataway, NJ, 08854, USA
| | - Craig D Kaplan
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA.
|
16
|
Liu YH, Xu Y, Zhang M, Cui Y, Sze SH, Smith CW, Xu S, Zhang HB. Accurate Prediction of a Quantitative Trait Using the Genes Controlling the Trait for Gene-Based Breeding in Cotton. Front Plant Sci 2020; 11:583277. [PMID: 33281846 PMCID: PMC7690289 DOI: 10.3389/fpls.2020.583277] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/14/2020] [Accepted: 10/15/2020] [Indexed: 05/03/2023]
Abstract
Accurate phenotype prediction of quantitative traits is paramount to enhanced plant research and breeding. Here, we report the accurate prediction of cotton fiber length, a typical quantitative trait, using 474 cotton (Gossypium spp.) fiber length (GFL) genes and nine prediction models. When the SNPs/InDels contained in 226 of the GFL genes or the expressions of all 474 GFL genes were used for fiber length prediction, a prediction accuracy of r = 0.83 was obtained, approaching the maximally possible prediction accuracy of a quantitative trait. This is a 116% improvement over the prediction accuracies of fiber length achieved thus far by genomic selection using genome-wide random DNA markers. Moreover, analysis of the GFL genes identified 125 of the GFL genes that are key to accurate prediction of fiber length, with which a prediction accuracy similar to that of all 474 GFL genes was obtained. The fiber lengths of the plants predicted with expressions of the 125 key GFL genes were significantly correlated with those predicted with the SNPs/InDels of the above 226 SNP/InDel-containing GFL genes (r = 0.892, P = 0.000). The prediction accuracies of fiber length using both genic datasets were highly consistent across environments or generations. Finally, we found that a training population consisting of 100-120 plants was sufficient to train a model for accurate prediction of a quantitative trait using the genes controlling the trait. Therefore, the genes controlling a quantitative trait are capable of accurately predicting its phenotype, thereby dramatically improving the ability, accuracy, and efficiency of phenotype prediction and promoting gene-based breeding in cotton and other species.
Affiliation(s)
- Yun-Hua Liu
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, United States
| | - Yang Xu
- Botany and Plant Sciences, University of California, Riverside, Riverside, CA, United States
| | - Meiping Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, United States
| | - Yanru Cui
- Botany and Plant Sciences, University of California, Riverside, Riverside, CA, United States
| | - Sing-Hoi Sze
- Department of Computer Science and Engineering and Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, United States
| | - C. Wayne Smith
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, United States
| | - Shizhong Xu
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX, United States
- *Correspondence: Shizhong Xu
| | - Hong-Bin Zhang
- Botany and Plant Sciences, University of California, Riverside, Riverside, CA, United States
- *Correspondence: Hong-Bin Zhang
|
17
|
Fu S, Chang PL, Friesen ML, Teakle NL, Tarone AM, Sze SH. Identifying similar transcripts in a related organism from de Bruijn graphs of RNA-Seq data, with applications to the study of salt and waterlogging tolerance in Melilotus. BMC Genomics 2019; 20:425. [PMID: 31167652 PMCID: PMC6551239 DOI: 10.1186/s12864-019-5702-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
Background A popular strategy to study alternative splicing in non-model organisms starts from sequencing the entire transcriptome, then assembling the reads by using de novo transcriptome assembly algorithms to obtain predicted transcripts. A similarity search algorithm is then applied to a related organism to infer possible function of these predicted transcripts. While some of these predictions may be inaccurate and transcripts with low coverage are often missed, we observe that it is possible to obtain a more complete set of transcripts to facilitate possible functional assignments by starting the search from the intermediate de Bruijn graph that contains all branching possibilities. Results We develop an algorithm to extract similar transcripts in a related organism by starting the search from the de Bruijn graph that represents the transcriptome instead of from predicted transcripts. We show that our algorithm is able to recover more similar transcripts than existing algorithms, with large improvements in obtaining longer transcripts and a finer resolution of isoforms. We apply our algorithm to study salt and waterlogging tolerance in two Melilotus species by constructing new RNA-Seq libraries. Conclusions We have developed an algorithm to identify paths in the de Bruijn graph that correspond to similar transcripts in a related organism directly. Our strategy bypasses the transcript prediction step in RNA-Seq data and makes use of support from evolutionary information. Electronic supplementary material The online version of this article (10.1186/s12864-019-5702-5) contains supplementary material, which is available to authorized users.
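The phrase "the intermediate de Bruijn graph that contains all branching possibilities" can be made concrete with a generic construction. The sketch below is the textbook de Bruijn graph built from reads, not the authors' search algorithm; the reads, the value of `k`, and the helper names `build_de_bruijn` and `branching_nodes` are hypothetical.

```python
from collections import defaultdict

def build_de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, one edge per distinct k-mer."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)

def branching_nodes(graph):
    """Return nodes with more than one successor, where paths diverge."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)

# Two hypothetical reads that share a prefix and then diverge.
reads = ["ACGTC", "ACGAC"]
graph = build_de_bruijn(reads, k=3)
branches = branching_nodes(graph)
```

A transcript assembler has to commit to particular paths through such branch points; searching the graph directly, as the abstract describes, keeps every alternative available until the comparison with the related organism decides between them.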
Affiliation(s)
- Shuhua Fu
- Department of Biochemistry & Biophysics, Texas A&M University, College Station, 77843, TX, USA
| | - Peter L Chang
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, 90089, CA, USA
| | - Maren L Friesen
- Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, 90089, CA, USA; Department of Crop and Soil Sciences, Washington State University, Pullman, 99164, WA, USA; Department of Plant Pathology, Washington State University, Pullman, 99164, WA, USA
| | - Natasha L Teakle
- Centre for Ecohydrology, The University of Western Australia, 35 Stirling Highway, Crawley, 6009, WA, Australia; School of Plant Biology (M084), Faculty of Natural and Agricultural Sciences, The University of Western Australia, 35 Stirling Highway, Crawley, 6009, WA, Australia
| | - Aaron M Tarone
- Department of Entomology, Texas A&M University, College Station, 77843, TX, USA
| | - Sing-Hoi Sze
- Department of Biochemistry & Biophysics, Texas A&M University, College Station, 77843, TX, USA; Department of Computer Science and Engineering, Texas A&M University, College Station, 77843, TX, USA
|
18
|
de Figueiredo P, Pandey A, Ding SL, Qin QM, Gupta R, Gomez G, Lin F, Feng X, de Costa LF, Chaki SP, Katepalli M, Case E, Van Schaik E, Sidiq T, Khalaf O, Arenas A, Kobayashi KS, Samuel JE, Rivera G, Alaniz RC, Sze SH, Qian X, Brown WJ, Rice-Ficht A, Russell W, Ficht TA. Mechanisms controlling Cryptococcus Intracellular Parasitism. The Journal of Immunology 2019. [DOI: 10.4049/jimmunol.202.supp.190.31] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023]
Abstract
Cryptococcus neoformans (Cn) is a deadly fungal pathogen whose intracellular lifestyle is important for virulence. Host mechanisms controlling fungal phagocytosis and replication remain obscure. Here, we describe insights that have emerged from a global phosphoproteomic analysis of the host response to Cryptococcus infection. Our analysis revealed diverse host proteins that were differentially phosphorylated following fungal infection, indicating global reprogramming of host kinase signaling during this process. Notably, phagocytosis of the pathogen activated the host autophagy initiation complex (AIC) as well as regulatory components that reside upstream of this complex. Cn-containing vacuoles (CnCVs) were found to be decorated with the cell surface marker CD44, which colocalized with components of the AIC complex. Taken together, these findings suggest that associations between CD44 and AIC proteins confer susceptibility to infection, thereby implicating novel host mechanisms in regulating fungal intracellular parasitism.
|
19
|
Song JM, Arif M, Zhang M, Sze SH, Zhang HB. Phenotypic and molecular dissection of grain quality using the USDA rice mini-core collection. Food Chem 2019; 284:312-322. [PMID: 30744863 DOI: 10.1016/j.foodchem.2019.01.009] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 07/16/2018] [Revised: 11/21/2018] [Accepted: 01/03/2019] [Indexed: 12/16/2022]
Abstract
Grain quality is a major breeding objective and paramount to food production. This study aimed to phenotypically and molecularly dissect rice grain quality, especially amylose content (AC), grain protein content (GPC) and alkali spreading value (ASV), using the USDA rice mini-core collection, which represents worldwide rice germplasm lines. Grain chemical analysis was combined with a genome-wide association study (GWAS). Wide genetic variation was observed for these grain quality traits in the mini-core collection. Germplasm lines unique in AC, GPC and ASV and desirable for grain quality improvement were identified. The genetic diversity of the collection was re-analyzed using new SNPs, thus providing more precise genotypic information about the collection. Furthermore, ten loci significantly associated with these grain quality traits were identified through GWAS using 22,947 high-quality SNPs. These results, therefore, provide knowledge, resources and molecular tools for efficient rice grain quality improvement.
Affiliation(s)
- Jian-Min Song
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843-2474, USA; Crop Research Institute, Shandong Academy of Agricultural Sciences, Jinan, Shandong 250100, China
| | - Muhammad Arif
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843-2474, USA; Agricultural Biotechnology Division, National Institute for Biotechnology and Genetic Engineering (NIBGE), Jhang Road, P.O. Box 577, Faisalabad, Pakistan
| | - Meiping Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843-2474, USA.
| | - Sing-Hoi Sze
- Department of Computer Science and Engineering and Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843, USA.
| | - Hong-Bin Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843-2474, USA.
|
20
|
Zhu Z, Rehman KU, Yu Y, Liu X, Wang H, Tomberlin JK, Sze SH, Cai M, Zhang J, Yu Z, Zheng J, Zheng L. De novo transcriptome sequencing and analysis revealed the molecular basis of rapid fat accumulation by black soldier fly (Hermetia illucens, L.) for development of insectival biodiesel. Biotechnol Biofuels 2019; 12:194. [PMID: 31413730 PMCID: PMC6688347 DOI: 10.1186/s13068-019-1531-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/01/2019] [Accepted: 07/20/2019] [Indexed: 05/06/2023]
Abstract
BACKGROUND Black soldier fly (BSF, Hermetia illucens L.) can efficiently degrade organic wastes and transform them into high-fat insect biomass that could be used as feedstock for biodiesel production. However, the molecular regulatory basis of fat accumulation by BSF is still unclear; it is necessary to identify vital genes and regulators that are involved in fat accumulation. RESULTS This study analyzed the dynamic state of fat content and fatty-acid composition of BSF larvae in eight different stages. The late prepupa stage exhibited the highest crude fat, with lauric acid being the main component. Therefore, to provide insight into this unexplained phenomenon, the molecular regulation of rapid fat accumulation by BSF larvae was investigated. Twelve developmental stages of BSF were selected for transcriptome analysis, including the eight stages used for investigation of fat content and fatty-acid composition. By Illumina sequencing, 218,295,450,000 nt were generated. Through assembly by Trinity, 70,475 unigenes were obtained with an average length of 1064 nt and an N50 of 1749 nt. The differentially expressed unigenes were identified by DESeq, with 9159 up-regulated and 10,101 down-regulated. Several putative genes that are involved in the formation of pyruvate, acetyl-CoA biosynthesis, acetyl-CoA transcription, fatty-acid biosynthesis, and triacylglycerol biosynthesis were identified. Four vital metabolic genes that are associated with fat accumulation were validated by quantitative real-time PCR (qRT-PCR). The molecular mechanism of fat accumulation in BSF was clarified in this investigation through the construction of a detailed fat accumulation model from our results. CONCLUSION The study provides an unprecedented level of insight from transcriptome sequencing to reveal the crude fat accumulation mechanism in developing BSF. The finding holds considerable promise for insectival biodiesel production, and the fat content and fatty-acid composition can be altered by genetic engineering approaches in the future for the insect production industry.
Affiliation(s)
- Zhaolu Zhu
- State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, People’s Republic of China
- College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Kashif ur Rehman
- State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, People’s Republic of China
- Livestock and Dairy Development Department, Poultry Research Institute, Rawalpindi, Pakistan
- Insectplus, Apfelbaumstrasse 22, 8050 Zurich, Switzerland
| | - Yongqiang Yu
- State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, People’s Republic of China
| | - Xiu Liu
- State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, People’s Republic of China
| | - Hui Wang
- State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, People’s Republic of China
| | | | - Sing-Hoi Sze
- Department of Computer Science and Engineering, Department of Biochemistry & Biophysics, Texas A&M University, College Station, TX USA
| | - Minmin Cai
- State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, People’s Republic of China
| | - Jibin Zhang
- State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, People’s Republic of China
| | - Ziniu Yu
- State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, People’s Republic of China
| | - Jinshui Zheng
- State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, People’s Republic of China
- College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Longyu Zheng
- State Key Laboratory of Agricultural Microbiology, National Engineering Research Center of Microbial Pesticides, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, People’s Republic of China
|
21
|
Qiu C, Erinne OC, Dave JM, Cui P, Jin H, Muthukrishnan N, Tang LK, Ganesh Babu S, Lam KC, Vandeventer PJ, Strohner R, Van den Brulle J, Sze SH, Kaplan CD. Correction: High-Resolution Phenotypic Landscape of the RNA Polymerase II Trigger Loop. PLoS Genet 2018; 14:e1007158. [PMID: 29298339 PMCID: PMC5751974 DOI: 10.1371/journal.pgen.1007158] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/22/2022]
|
22
|
Pandey A, Ding SL, Qin QM, Gupta R, Gomez G, Lin F, Feng X, Fachini da Costa L, Chaki SP, Katepalli M, Case ED, van Schaik EJ, Sidiq T, Khalaf O, Arenas A, Kobayashi KS, Samuel JE, Rivera GM, Alaniz RC, Sze SH, Qian X, Brown WJ, Rice-Ficht A, Russell WK, Ficht TA, de Figueiredo P. Global Reprogramming of Host Kinase Signaling in Response to Fungal Infection. Cell Host Microbe 2017; 21:637-649.e6. [PMID: 28494245 DOI: 10.1016/j.chom.2017.04.008] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 08/22/2016] [Revised: 03/12/2017] [Accepted: 04/24/2017] [Indexed: 12/26/2022]
Abstract
Cryptococcus neoformans (Cn) is a deadly fungal pathogen whose intracellular lifestyle is important for virulence. Host mechanisms controlling fungal phagocytosis and replication remain obscure. Here, we perform a global phosphoproteomic analysis of the host response to Cryptococcus infection. Our analysis reveals numerous and diverse host proteins that are differentially phosphorylated following fungal ingestion by macrophages, thereby indicating global reprogramming of host kinase signaling. Notably, phagocytosis of the pathogen activates the host autophagy initiation complex (AIC) and the upstream regulatory components LKB1 and AMPKα, which regulate autophagy induction through their kinase activities. Deletion of Prkaa1, the gene encoding AMPKα1, in monocytes results in resistance to fungal colonization of mice. Finally, the recruitment of AIC components to nascent Cryptococcus-containing vacuoles (CnCVs) regulates the intracellular trafficking and replication of the pathogen. These findings demonstrate that host AIC regulatory networks confer susceptibility to infection and establish a proteomic resource for elucidating host mechanisms that regulate fungal intracellular parasitism.
Affiliation(s)
- Aseem Pandey
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA; Norman Borlaug Center, Texas A&M University, College Station, Texas 77843, USA; Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas 77843, USA
| | - Sheng Li Ding
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA; Norman Borlaug Center, Texas A&M University, College Station, Texas 77843, USA; Department of Plant Pathology, College of Plant Protection, Henan Agricultural University, Zhengzhou 450002, Henan, China
| | - Qing-Ming Qin
- College of Plant Sciences, Jilin University, Changchun 130062, Jilin, China; Key Laboratory of Zoonosis Research, Ministry of Education, Jilin University, Changchun 130062, Jilin, China
| | - Rahul Gupta
- Health and Engineering Group, Leidos Inc., 2295 Parklake Drive, Atlanta, GA 30345, USA
| | - Gabriel Gomez
- Texas A&M Veterinary Medical Diagnostic Laboratory, Texas A&M University, College Station, Texas 77843, USA
| | - Furong Lin
- Norman Borlaug Center, Texas A&M University, College Station, Texas 77843, USA
| | - Xuehuan Feng
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA; Norman Borlaug Center, Texas A&M University, College Station, Texas 77843, USA
| | - Luciana Fachini da Costa
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA; Norman Borlaug Center, Texas A&M University, College Station, Texas 77843, USA; Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas 77843, USA
| | - Sankar P Chaki
- Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas 77843, USA
| | - Madhu Katepalli
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA
| | - Elizabeth D Case
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA
| | - Erin J van Schaik
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA
| | - Tabasum Sidiq
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA
| | - Omar Khalaf
- Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas 77843, USA
| | - Angela Arenas
- Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas 77843, USA
| | - Koichi S Kobayashi
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA
| | - James E Samuel
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA
| | - Gonzalo M Rivera
- Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas 77843, USA
| | - Robert C Alaniz
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA
| | - Sing-Hoi Sze
- Center for Bioinformatics & Genomic Systems Engineering, Texas A&M University, College Station, Texas 77843, USA; Department of Computer Science and Engineering, Dwight Look College of Engineering, Texas A&M University, College Station, Texas 77843, USA; Department of Biochemistry & Biophysics, Texas A&M University, College Station, Texas 77843, USA
| | - Xiaoning Qian
- Center for Bioinformatics & Genomic Systems Engineering, Texas A&M University, College Station, Texas 77843, USA; Department of Electrical and Computer Engineering, Dwight Look College of Engineering, Texas A&M University, College Station, Texas 77843, USA
| | - William J Brown
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853-2703, USA
| | - Allison Rice-Ficht
- Department of Molecular and Cellular Medicine, College of Medicine, Texas A&M Health Science Center, College Station, Texas 77843, USA
| | - William K Russell
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, TX 77555, USA.
| | - Thomas A Ficht
- Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas 77843, USA.
| | - Paul de Figueiredo
- Department of Microbial Pathogenesis and Immunology, Texas A&M Health Science Center, College Station, Texas 77843, USA; Norman Borlaug Center, Texas A&M University, College Station, Texas 77843, USA; Department of Veterinary Pathobiology, College of Veterinary Medicine, Texas A&M University, College Station, Texas 77843, USA.
|
23
|
Sze SH, Parrott JJ, Tarone AM. A divide-and-conquer algorithm for large-scale de novo transcriptome assembly through combining small assemblies from existing algorithms. BMC Genomics 2017; 18:895. [PMID: 29244008 PMCID: PMC5731495 DOI: 10.1186/s12864-017-4270-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Background While the continued development of high-throughput sequencing has facilitated studies of entire transcriptomes in non-model organisms, the incorporation of an increasing amount of RNA-Seq libraries has made de novo transcriptome assembly difficult. Although algorithms that can assemble a large amount of RNA-Seq data are available, they are generally very memory-intensive and can only be used to construct small assemblies. Results We develop a divide-and-conquer strategy that allows these algorithms to be utilized, by subdividing a large RNA-Seq data set into small libraries. Each individual library is assembled independently by an existing algorithm, and a merging algorithm is developed to combine these assemblies by picking a subset of high quality transcripts to form a large transcriptome. When compared to existing algorithms that return a single assembly directly, this strategy achieves accuracy comparable to or higher than that of memory-efficient algorithms that can be used to process a large amount of RNA-Seq data, and comparable to or lower than that of memory-intensive algorithms that can only be used to construct small assemblies. Conclusions Our divide-and-conquer strategy allows memory-intensive de novo transcriptome assembly algorithms to be utilized to construct large assemblies.
Affiliation(s)
- Sing-Hoi Sze
- Department of Computer Science and Engineering, Texas A&M University, College Station, 77843, TX, USA; Department of Biochemistry & Biophysics, Texas A&M University, College Station, 77843, TX, USA
| | - Jonathan J Parrott
- Department of Entomology, Texas A&M University, College Station, 77843, TX, USA
| | - Aaron M Tarone
- Department of Entomology, Texas A&M University, College Station, 77843, TX, USA
|
24
|
Abstract
Background With increased availability of de novo assembly algorithms, it is feasible to study entire transcriptomes of non-model organisms. While algorithms are available that are specifically designed for performing transcriptome assembly from high-throughput sequencing data, they are very memory-intensive, limiting their applications to small data sets with few libraries. Results We develop a transcriptome assembly algorithm that recovers alternatively spliced isoforms and expression levels while utilizing as many RNA-Seq libraries as possible that contain hundreds of gigabases of data. New techniques are developed so that computations can be performed on a computing cluster with a moderate amount of physical memory. Conclusions Our strategy minimizes memory consumption while simultaneously obtaining comparable or improved accuracy over existing algorithms. It provides support for incremental updates of assemblies when new libraries become available.
Affiliation(s)
- Sing-Hoi Sze
- Department of Computer Science and Engineering, Texas A&M University, College Station, 77843, TX, USA; Department of Biochemistry & Biophysics, Texas A&M University, College Station, 77843, TX, USA
| | - Meaghan L Pimsler
- Department of Entomology, Texas A&M University, College Station, 77843, TX, USA
| | - Jeffery K Tomberlin
- Department of Entomology, Texas A&M University, College Station, 77843, TX, USA
| | - Corbin D Jones
- Department of Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Aaron M Tarone
- Department of Entomology, Texas A&M University, College Station, 77843, TX, USA
|
25
|
Fu S, Tarone AM, Sze SH. Heuristic pairwise alignment of de Bruijn graphs to facilitate simultaneous transcript discovery in related organisms from RNA-Seq data. BMC Genomics 2015; 16 Suppl 11:S5. [PMID: 26576690 PMCID: PMC4652555 DOI: 10.1186/1471-2164-16-s11-s5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/23/2023]
Abstract
BACKGROUND The advance of high-throughput sequencing has made it possible to obtain new transcriptomes and study splicing mechanisms in non-model organisms. In these studies, there is often a need to investigate the transcriptomes of two related organisms at the same time in order to find the similarities and differences between them. The traditional approach to address this problem is to perform de novo transcriptome assemblies to obtain predicted transcripts for these organisms independently and then employ similarity comparison algorithms to study them. RESULTS Instead of obtaining predicted transcripts for these organisms separately from the intermediate de Bruijn graph structures employed by de novo transcriptome assembly algorithms, we develop an algorithm to allow direct comparisons between paths in two de Bruijn graphs by first enumerating short paths in both graphs, and iteratively extending paths in one graph that have high similarity to paths in the other graph to obtain longer corresponding paths between the two graphs. These paths represent predicted transcripts that are present in both organisms. CONCLUSIONS Our approach generalizes the pairwise sequence alignment problem to allow the input to be non-linear structures, and provides a heuristic to reliably recover similar paths from the two structures. Our algorithm allows detailed investigation of the similarities and differences in alternative splicing between the two organisms at both the sequence and structure levels, even in the absence of reference transcriptomes or a closely related model organism.
|
26
|
Sze SH, Tarone AM. A memory-efficient algorithm to obtain splicing graphs and de novo expression estimates from de Bruijn graphs of RNA-Seq data. BMC Genomics 2014; 15 Suppl 5:S6. [PMID: 25082000 PMCID: PMC4120145 DOI: 10.1186/1471-2164-15-s5-s6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
Background The recent advance of high-throughput sequencing makes it feasible to study entire transcriptomes through the application of de novo sequence assembly algorithms. While a popular strategy is to first construct an intermediate de Bruijn graph structure to represent the transcriptome, an additional step is needed to construct predicted transcripts from the graph. Results Since the de Bruijn graph contains all branching possibilities, we develop a memory-efficient algorithm to recover alternative splicing information and library-specific expression information directly from the graph without prior genomic knowledge. We implement the algorithm as a postprocessing module of the Velvet assembler. We validate our algorithm by simulating the transcriptome assembly of Drosophila using its known genome, and by performing Drosophila transcriptome assembly using publicly available RNA-Seq libraries. Under a range of conditions, our algorithm recovers sequences and alternative splicing junctions with higher specificity than Oases or Trans-ABySS. Conclusions Since our postprocessing algorithm does not consume as much memory as Velvet and is less memory-intensive than Oases, it allows biologists to assemble large libraries with limited computational resources. Our algorithm has been applied to perform transcriptome assembly of the non-model blow fly Lucilia sericata that was reported in a previous article, which shows that the assembly is of high quality and it facilitates comparison of the Lucilia sericata transcriptome to Drosophila and two mosquitoes, prediction and experimental validation of alternative splicing, investigation of differential expression among various developmental stages, and identification of transposable elements.
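The remark that a de Bruijn graph "contains all branching possibilities" can be illustrated with a toy example. The sketch below is not the authors' algorithm and is unrelated to the Velvet postprocessing module; it only builds a simple de Bruijn graph from reads and lists the nodes with more than one successor, which is where alternative paths (for example, isoforms) diverge. The function names, the reads, and the choice of k are all hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Edge from the k-mer's prefix to its suffix
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Return nodes with more than one successor, with successors sorted."""
    return {node: sorted(succ) for node, succ in graph.items() if len(succ) > 1}


# Two toy sequences that share a prefix and then diverge (hypothetical data)
reads = ["ACGTTGCA", "ACGTTCCA"]
graph = build_de_bruijn(reads, k=4)
branches = branching_nodes(graph)
print(branches)
```

In this toy input the only branch point is the node `GTT`, which is followed by `TTG` in one read and `TTC` in the other. A real assembler additionally handles sequencing errors, reverse complements, and coverage, none of which is modeled here.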
|
27
|
Radulović ŽM, Kim TK, Porter LM, Sze SH, Lewis L, Mulenga A. A 24-48 h fed Amblyomma americanum tick saliva immuno-proteome. BMC Genomics 2014; 15:518. [PMID: 24962723 PMCID: PMC4099483 DOI: 10.1186/1471-2164-15-518] [Citation(s) in RCA: 74] [Impact Index Per Article: 7.4] [Received: 04/01/2014] [Accepted: 06/12/2014] [Indexed: 01/31/2023]
Abstract
BACKGROUND Multiple tick saliva proteins, the majority of which are unknown, confer tick resistance in repeatedly infested animals. The objective of this study was to identify the 24-48 h fed Amblyomma americanum tick saliva immuno-proteome. The 24-48 h tick-feeding phase is critical to tick parasitism as it precedes important events in tick biology, blood meal feeding and disease agent transmission. Fed male, 24 and 96 h fed female phage display cDNA expression libraries were biopanned using rabbit antibodies to 24 and 48 h fed A. americanum female tick saliva proteins. Biopanned immuno-cDNA libraries were subjected to next generation sequencing, de novo assembly, and bioinformatic analysis. RESULTS More than 800 transcripts that code for 24-48 h fed A. americanum immuno-proteins are described. Of the 895 immuno-proteins, 52% (464/895) were provisionally identified based on matches in GenBank. Of these, ~19% (86/464) show high level of identity to other tick hypothetical proteins, and the rest include putative proteases (serine, cysteine, leukotriene A-4 hydrolase, carboxypeptidases, and metalloproteases), protease inhibitors (serine and cysteine protease inhibitors, tick carboxypeptidase inhibitor), and transporters and/or ligand binding proteins (histamine binding/lipocalin, fatty acid binding, calreticulin, hemelipoprotein, IgG binding protein, ferritin, insulin-like growth factor binding proteins, and evasin). Others include enzymes (glutathione transferase, cytochrome oxidase, protein disulfide isomerase), ribosomal proteins, and those of miscellaneous functions (histamine release factor, selenoproteins, tetraspanin, defensin, heat shock proteins). CONCLUSIONS Data here demonstrate that A. americanum secretes a complex cocktail of immunogenic tick saliva proteins during the first 24-48 h of feeding. Of significance, previously validated immunogenic tick saliva proteins including AV422 protein, calreticulin, histamine release factor, histamine binding/lipocalins, selenoproteins, and paramyosin were identified in this screen, supporting the specificity of the approach in this study. While descriptive, this study opens opportunities for in-depth tick feeding physiology studies.
Affiliation(s)
- Željko M Radulović
- Department of Entomology, AgriLife Research, Texas A&M University, 2475 TAMU, College Station, TX 77843, USA
| | - Tae K Kim
- Department of Entomology, AgriLife Research, Texas A&M University, 2475 TAMU, College Station, TX 77843, USA
| | - Lindsay M Porter
- Department of Entomology, AgriLife Research, Texas A&M University, 2475 TAMU, College Station, TX 77843, USA
| | - Sing-Hoi Sze
- Department of Computer Sciences and Engineering, Texas A&M University, College Station, TX 77843, USA
- Department of Biochemistry & Biophysics, Texas A&M University, College Station, TX 77843, USA
| | - Lauren Lewis
- Department of Entomology, AgriLife Research, Texas A&M University, 2475 TAMU, College Station, TX 77843, USA
| | - Albert Mulenga
- Department of Entomology, AgriLife Research, Texas A&M University, 2475 TAMU, College Station, TX 77843, USA
|
28
|
Abstract
As the amount of data describing biological interactions increases, it becomes possible to analyze the complex interactions of genes and proteins across multiple networks at the genome scale. While the most popular techniques to study conservation of patterns in biological networks are through the use of network alignment techniques or the identification of network motifs, we show that it is possible to exhaustively enumerate all graphlet alignments, which consist of at least two vertex-disjoint subgraphs that share a common topology and contain homologous proteins at the same position in the topology. We compare the performance of our algorithm to network alignment algorithms and show that our algorithm is able to cover significantly more proteins in the given networks while maintaining comparable or higher sensitivity and specificity with respect to functional enrichment.
Affiliation(s)
- Mu-Fen Hsieh
- Department of Computer Science and Engineering, Texas A&M University, College Station, Texas
29
Fan JH, Chen J, Sze SH. Identifying complexes from protein interaction networks according to different types of neighborhood density. J Comput Biol 2013; 19:1284-94. [PMID: 23210476 DOI: 10.1089/cmb.2012.0195] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Abstract
To facilitate the realization of biological functions, proteins are often organized into complexes. While computational techniques are used to predict these complexes, detailed understanding of their organization remains inadequate. Apart from complexes that reside in very dense regions of a protein interaction network, which most algorithms are able to identify, we observe that many other complexes, while not residing in very dense regions, reside in regions with low neighborhood density. We develop an algorithm for identifying protein complexes by considering these two types of complexes separately. We test our algorithm on a few yeast protein interaction networks, and show that our algorithm is able to identify complexes more accurately than existing algorithms. A software program, NDComplex, implementing the algorithm is available at http://faculty.cse.tamu.edu/shsze/ndcomplex.
Affiliation(s)
- Jia-Hao Fan
- Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843-3112, USA
30
Abstract
The goal of protein family classification is to group proteins into families so that proteins within the same family have a common function or are related by ancestry. While supervised classification algorithms are available for this purpose, most of these approaches focus on assigning unclassified proteins to known families but do not allow for progressive construction of new families from proteins that cannot be assigned. Although unsupervised clustering algorithms are also available, they do not make use of information from known families. By computing similarities between proteins based on pairwise sequence comparisons, we develop supervised classification algorithms that achieve improved accuracy over previous approaches while allowing for construction of new families. We show that our algorithm has a higher accuracy rate and a lower misclassification rate when compared to algorithms that are based on the use of multiple sequence alignments and hidden Markov models, and that our algorithm performs well even on families with very few proteins and on families with low sequence similarity. A software program implementing the algorithm (SClassify) is available online (http://faculty.cse.tamu.edu/shsze/sclassify).
Affiliation(s)
- Gangman Yi
- Department of Computer Science and Engineering, Gangneung-Wonju National University, Gangwon-do, South Korea
31
Madina BR, Kuppan G, Vashisht AA, Liang YH, Downey KM, Wohlschlegel JA, Ji X, Sze SH, Sacchettini JC, Read LK, Cruz-Reyes J. Guide RNA biogenesis involves a novel RNase III family endoribonuclease in Trypanosoma brucei. RNA 2011; 17:1821-30. [PMID: 21810935 PMCID: PMC3185915 DOI: 10.1261/rna.2815911] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 05/13/2011] [Accepted: 06/29/2011] [Indexed: 05/29/2023]
Abstract
The mitochondrial genome of kinetoplastids, including species of Trypanosoma and Leishmania, is an unprecedented DNA structure of catenated maxicircles and minicircles. Maxicircles represent the typical mitochondrial genome encoding components of the respiratory complexes and ribosomes. However, most mRNA sequences are cryptic, and their maturation requires a unique U insertion/deletion RNA editing. Minicircles encode hundreds of small guide RNAs (gRNAs) that partially anneal with unedited mRNAs and direct the extensive editing. Trypanosoma brucei gRNAs and mRNAs are transcribed as polycistronic precursors, which undergo processing preceding editing; however, the relevant nucleases are unknown. We report the identification and functional characterization of a close homolog of editing endonucleases, mRPN1 (mitochondrial RNA precursor-processing endonuclease 1), which is involved in gRNA biogenesis. Recombinant mRPN1 is a dimeric dsRNA-dependent endonuclease that requires Mg2+, a critical catalytic carboxylate, and generates 2-nucleotide 3' overhangs. The cleavage specificity of mRPN1 is reminiscent of bacterial RNase III and thus is fundamentally distinct from editing endonucleases, which target a single scissile bond just 5' of short duplexes. An inducible knockdown of mRPN1 in T. brucei results in loss of gRNA and accumulation of precursor transcripts (pre-gRNAs), consistent with a role of mRPN1 in processing. mRPN1 stably associates with three proteins previously identified in relatively large complexes that do not contain mRPN1, and have been linked with multiple aspects of mitochondrial RNA metabolism. One protein, TbRGG2, directly binds mRPN1 and is thought to modulate gRNA utilization by editing complexes. The proposed participation of mRPN1 in processing of polycistronic RNA and its specific protein interactions in gRNA expression are discussed.
Affiliation(s)
- Bhaskara Reddy Madina
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas 77843, USA
- Gokulan Kuppan
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas 77843, USA
- Ajay A. Vashisht
- Department of Biological Chemistry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California 90095-1737, USA
- Yu-He Liang
- Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, USA
- Kurtis M. Downey
- Department of Microbiology and Immunology, University at Buffalo, State University of New York, Buffalo, New York 14214, USA
- James A. Wohlschlegel
- Department of Biological Chemistry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, California 90095-1737, USA
- Xinhua Ji
- Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, USA
- Sing-Hoi Sze
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas 77843, USA
- Department of Computer Science and Engineering, Texas A&M University, College Station, Texas 77843, USA
- James C. Sacchettini
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas 77843, USA
- Laurie K. Read
- Department of Microbiology and Immunology, University at Buffalo, State University of New York, Buffalo, New York 14214, USA
- Jorge Cruz-Reyes
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas 77843, USA
32
Zhao X, Sze SH. Motif finding in DNA sequences based on skipping nonconserved positions in background Markov chains. J Comput Biol 2011; 18:759-70. [PMID: 21554019 DOI: 10.1089/cmb.2010.0197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
One strategy to identify transcription factor binding sites is through motif finding in upstream DNA sequences of potentially co-regulated genes. Despite extensive efforts, none of the existing algorithms performs very well. We consider a string representation that allows arbitrary ignored positions within the nonconserved portion of single motifs, and use O(2^l) Markov chains to model the background distributions of motifs of length l while skipping these positions within each Markov chain. By focusing initially on positions that have fixed nucleotides to define core occurrences, we develop an algorithm to identify motifs of moderate lengths. We compare the performance of our algorithm to other motif finding algorithms on a few benchmark data sets, and show that significant improvement in accuracy can be obtained when the sites are sufficiently conserved within a given sample, while comparable performance is obtained when the site conservation rate is low. A software program (PosMotif) and detailed results are available online at http://faculty.cse.tamu.edu/shsze/posmotif.
Affiliation(s)
- Xiaoyan Zhao
- Department of Computer Science & Engineering, Texas A&M University, College Station, Texas 77843, USA
33
Zhu H, Hu F, Wang R, Zhou X, Sze SH, Liou LW, Barefoot A, Dickman M, Zhang X. Arabidopsis Argonaute10 specifically sequesters miR166/165 to regulate shoot apical meristem development. Cell 2011; 145:242-56. [PMID: 21496644 DOI: 10.1016/j.cell.2011.03.024] [Citation(s) in RCA: 320] [Impact Index Per Article: 24.6] [Received: 07/24/2010] [Revised: 12/13/2010] [Accepted: 03/07/2011] [Indexed: 12/29/2022]
Abstract
The shoot apical meristem (SAM) comprises a group of undifferentiated cells that divide to maintain the plant meristem and also give rise to all shoot organs. SAM fate is specified by class III HOMEODOMAIN-LEUCINE ZIPPER (HD-ZIP III) transcription factors, which are targets of miR166/165. In Arabidopsis, AGO10 is a critical regulator of SAM maintenance, and here we demonstrate that AGO10 specifically interacts with miR166/165. The association is determined by a distinct structure of the miR166/165 duplex. Deficient loading of miR166 into AGO10 results in a defective SAM. Notably, the miRNA-binding ability of AGO10, but not its catalytic activity, is required for SAM development, and AGO10 has a higher binding affinity for miR166 than does AGO1, a principal contributor to miRNA-mediated silencing. We propose that AGO10 functions as a decoy for miR166/165 to maintain the SAM, preventing their incorporation into AGO1 complexes and the subsequent repression of HD-ZIP III gene expression.
Affiliation(s)
- Hongliang Zhu
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843, USA
34
Abstract
An important strategy to study genome evolution is to investigate the clustering of orthologous genes among multiple genomes, in which the most popular approaches require that the distance between adjacent genes in a cluster be small. We investigate a different formulation based on constraining the overall size of a cluster and develop statistical significance estimates that allow direct comparison of clusters of different sizes. We first consider a restricted version which requires that orthologous genes are strictly ordered within each cluster and show that it can be solved in polynomial time. We then develop practical exact algorithms for the unrestricted problem that allows paralogous genes within a genome and clusters that may not appear in every genome while considering a general model in which a gene is allowed to appear in more than one orthologous group. We show that our algorithm can identify biologically relevant gene clusters on four bacterial genomes Bacillus subtilis, Streptococcus pyogenes, Streptococcus pneumoniae, and Clostridium acetobutylicum. We also show that our algorithm can identify significantly more functionally enriched gene clusters on four yeast genomes Saccharomyces cerevisiae, Saccharomyces paradoxus, Saccharomyces mikatae, and Saccharomyces bayanus than previous algorithms. A software program (GCFinder) and a list of gene clusters found on the bacterial and the yeast genomes are available at http://faculty.cse.tamu.edu/shsze/gcfinder.
Affiliation(s)
- Qingwu Yang
- Department of Computer Science and Engineering, Texas A&M University, College Station, Texas 77843-3112, USA
35
Abstract
High-throughput techniques for measuring protein interactions have enabled the systematic study of complex protein networks. Comparing the networks of different organisms and identifying their common substructures can lead to a better understanding of the regulatory mechanisms underlying various cellular functions. To facilitate such comparisons, we present an efficient framework based on hidden Markov models (HMMs) that can be used for finding homologous pathways in a network of interest. Given a query path, our method identifies the top k matching paths in the network, which may contain any number of consecutive insertions and deletions. We demonstrate that our method is able to identify biologically significant pathways in protein interaction networks obtained from the DIP database, and the retrieved paths are closer to the curated pathways in the KEGG database when compared to the results from previous approaches. Unlike most existing algorithms that suffer from exponential time complexity, our algorithm has a polynomial complexity that grows linearly with the query size. This enables the search for very long paths with more than 10 proteins within a few minutes on a desktop computer. A software program implementing the algorithm is available upon request from the authors.
Affiliation(s)
- Xiaoning Qian
- Department of Electrical & Computer Engineering, Texas A&M University, College Station, TX 77843-3128, USA
36
Abstract
While most of the recent improvements in multiple sequence alignment accuracy are due to better use of vertical information, which includes the incorporation of consistency-based pairwise alignments and the use of profile alignments, we observe that it is possible to further improve accuracy by taking into account alignment of neighboring residues when aligning two residues, thus making better use of horizontal information. By modifying existing multiple alignment algorithms to make use of horizontal information, we show that this strategy is able to consistently improve over existing algorithms on a few sets of benchmark alignments that are commonly used to measure alignment accuracy, and the average improvements in accuracy can be as much as 1–3% on protein sequence alignment and 5–10% on DNA/RNA sequence alignment. Unlike previous algorithms, consistent average improvements can be obtained across all identity levels.
Affiliation(s)
- Yue Lu
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX 77843, USA
37
Affiliation(s)
- Yue Lu
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas
- Sing-Hoi Sze
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, Texas
- Computer Science, Texas A&M University, College Station, Texas
38
Abstract
An important strategy to study operons and their evolution is to investigate clustering of related genes across multiple bacterial genomes. Although existing algorithms are available that can identify gene clusters across two or more genomes, very few algorithms are efficient enough to study gene clusters across hundreds of genomes. We observe that a querying strategy can be used to analyze gene clusters across a large number of genomes and develop an efficient algorithm to identify all related clusters on a genome from a given query cluster. We use this algorithm to study gene clustering in 400 bacterial genomes by starting from a well-characterized list of operons in Escherichia coli K12 and perform comparative analysis of operon occurrences, gene orientations, and rearrangements both within and across clusters. We show that important biological insights can be obtained by comparing results across these categories. A software program implementing the algorithm (GCQuery) and supplementary data containing detailed results are available at http://faculty.cs.tamu.edu/shsze/gcquery.
Affiliation(s)
- Qingwu Yang
- Department of Computer Science, Texas A&M University, College Station, Texas 77843, USA
39
Abstract
We develop algorithms for the following path matching and graph matching problems: (i) given a query path p and a graph G, find a path p' that is most similar to p in G; (ii) given a query graph G0 and a graph G, find a graph G0' that is most similar to G0 in G. In these problems, p and G0 represent a given substructure of interest to a biologist, and G represents a large network in which the biologist desires to find a related substructure. These algorithms allow the study of common substructures in biological networks in order to understand how these networks evolve both within and between organisms. We reduce the path matching problem to finding a longest weighted path in a directed acyclic graph and show that the problem of finding top k suboptimal paths can be solved in polynomial time. This is in contrast with most previous approaches that used exponential time algorithms to find simple paths which are practical only when the paths are short. We reduce the graph matching problem to finding highest scoring subgraphs in a graph and give an exact algorithm to solve the problem when the query graph G0 is of moderate size. This eliminates the need for less accurate heuristic or randomized algorithms. We show that our algorithms are able to extract biologically meaningful pathways from protein interaction networks in the DIP database and metabolic networks in the KEGG database. Software programs implementing these techniques (PathMatch and GraphMatch) are available at http://faculty.cs.tamu.edu/shsze/pathmatch and http://faculty.cs.tamu.edu/shsze/graphmatch.
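The abstract reduces path matching to finding a longest weighted path in a directed acyclic graph. As a rough illustration of that subproblem only (this is not the PathMatch implementation, and it does not cover the reduction itself or the top-k extension), the sketch below finds a highest-weight path by dynamic programming over a topological order. The function name, the edge-list encoding, and the example weights are assumptions made for illustration.

```python
from collections import defaultdict, deque


def longest_weighted_path(num_nodes, edges):
    """Return (score, path) for a highest-weight path in a DAG.

    Hypothetical helper for illustration. `edges` is a list of
    (u, v, weight) tuples over nodes 0..num_nodes-1 (num_nodes >= 1).
    A path may start at any node; a single node is a path of weight 0.
    """
    adj = defaultdict(list)
    indegree = [0] * num_nodes
    for u, v, w in edges:
        adj[u].append((v, w))
        indegree[v] += 1

    # Kahn's algorithm: repeatedly remove nodes with no remaining predecessors.
    queue = deque(n for n in range(num_nodes) if indegree[n] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != num_nodes:
        raise ValueError("graph contains a cycle")

    # best[v] is the weight of the best path ending at v; prev[v] is its predecessor.
    best = [0.0] * num_nodes
    prev = [None] * num_nodes
    for u in order:
        for v, w in adj[u]:
            if best[u] + w > best[v]:
                best[v] = best[u] + w
                prev[v] = u

    # Trace back from the best end node to recover the path.
    end = max(range(num_nodes), key=lambda n: best[n])
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = prev[node]
    path.reverse()
    return best[end], path


if __name__ == "__main__":
    example_edges = [(0, 1, 2.0), (1, 2, 3.0), (0, 2, 4.0), (2, 3, 1.0)]
    print(longest_weighted_path(4, example_edges))
```

Each edge is relaxed once after its source node is finalized, so the running time is linear in the number of nodes and edges, which is what makes the acyclic formulation tractable for long paths.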
Affiliation(s)
- Qingwu Yang
- Department of Computer Science, Texas A&M University, College Station, Texas 77843, USA
40
Abstract
MOTIVATION: An increasing body of literature shows that genomes of eukaryotes can contain clusters of functionally related genes. Most approaches to identify gene clusters utilize microarray data or metabolic pathway databases to find groups of genes on chromosomes that are linked by common attributes. A generalized method that can find gene clusters regardless of the mechanism of origin would provide researchers with an unbiased method for finding clusters and studying the evolutionary forces that give rise to them. RESULTS: We present an algorithm to identify gene clusters in eukaryotic genomes that utilizes functional categories defined in graph-based vocabularies such as the Gene Ontology (GO). Clusters identified in this manner need only have a common function and are not constrained by gene expression or other properties. We tested the algorithm by analyzing genomes of a representative set of species. We identified species-specific variation in percentage of clustered genes as well as in properties of gene clusters including size distribution and functional annotation. These properties may be diagnostic of the evolutionary forces that lead to the formation of gene clusters. AVAILABILITY: A software implementation of the algorithm and example output files are available at http://fcg.tamu.edu/C_Hunter/.
Affiliation(s)
- Gangman Yi
- Department of Computer Science, Texas A&M University, College Station, TX 77845, USA
41
Yang SS, Cheung F, Lee JJ, Ha M, Wei NE, Sze SH, Stelly DM, Thaxton P, Triplett B, Town CD, Chen ZJ. Accumulation of genome-specific transcripts, transcription factors and phytohormonal regulators during early stages of fiber cell development in allotetraploid cotton. Plant J 2006; 47:761-75. [PMID: 16889650 PMCID: PMC4367961 DOI: 10.1111/j.1365-313x.2006.02829.x] [Citation(s) in RCA: 98] [Impact Index Per Article: 5.4] [Indexed: 05/11/2023]
Abstract
Gene expression during the early stages of fiber cell development and in allopolyploid crops is poorly understood. Here we report computational and expression analyses of 32 789 high-quality ESTs derived from Gossypium hirsutum L. Texas Marker-1 (TM-1) immature ovules (GH_TMO). The ESTs were assembled into 8540 unique sequences including 4036 tentative consensus sequences (TCs) and 4504 singletons, representing approximately 15% of the unique sequences in the cotton EST collection. Compared with approximately 178 000 existing ESTs derived from elongating fibers and non-fiber tissues, GH_TMO ESTs showed a significant increase in the percentage of genes encoding putative transcription factors such as MYB and WRKY and genes encoding predicted proteins involved in auxin, brassinosteroid (BR), gibberellic acid (GA), abscisic acid (ABA) and ethylene signaling pathways. Cotton homologs related to MIXTA, MYB5, GL2 and eight genes in the auxin, BR, GA and ethylene pathways were induced during fiber cell initiation but repressed in the naked seed mutant (N1N1) that is impaired in fiber formation. The data agree with the known roles of MYB and WRKY transcription factors in Arabidopsis leaf trichome development and the well-documented phytohormonal effects on fiber cell development in immature cotton ovules cultured in vitro. Moreover, the phytohormonal pathway-related genes were induced prior to the activation of MYB-like genes, suggesting an important role of phytohormones in cell fate determination. Significantly, AA sub-genome ESTs of all functional classifications including cell-cycle control and transcription factor activity were selectively enriched in G. hirsutum L., an allotetraploid derived from polyploidization between AA and DD genome species, a result consistent with the production of long lint fibers in AA genome species. 
These results suggest general roles for genome-specific, phytohormonal and transcriptional gene regulation during the early stages of fiber cell development in cotton allopolyploids.
Affiliation(s)
- S. Samuel Yang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas 77843, USA
- Foo Cheung
- The Institute for Genomic Research, Rockville, Maryland 20850, USA
- Jinsuk J. Lee
- Section of Molecular Cell and Developmental Biology, The University of Texas, Austin, Texas 78712, USA
- Misook Ha
- Section of Molecular Cell and Developmental Biology, The University of Texas, Austin, Texas 78712, USA
- Ning E. Wei
- Department of Computer Science, Texas A&M University, College Station, Texas 77843, USA
- Sing-Hoi Sze
- Department of Computer Science, Texas A&M University, College Station, Texas 77843, USA
- David M. Stelly
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas 77843, USA
- Peggy Thaxton
- Delta Research and Extension Center, Mississippi State University, Stoneville, Mississippi 38776, USA
- Barbara Triplett
- USDA-ARS Southern Regional Research Center, New Orleans, Louisiana 70179, USA
- Z. Jeffrey Chen
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas 77843, USA
- Section of Molecular Cell and Developmental Biology, The University of Texas, Austin, Texas 78712, USA
- Author for correspondence: Institute for Cellular and Molecular Biology, The University of Texas, Austin, Texas 78712-0159, USA. Phone: 512-475-9327; Fax: 512-232-3432
42
Wang N, Lu SE, Yang Q, Sze SH, Gross DC. Identification of the syr-syp box in the promoter regions of genes dedicated to syringomycin and syringopeptin production by Pseudomonas syringae pv. syringae B301D. J Bacteriol 2006; 188:160-8. [PMID: 16352832 PMCID: PMC1317596 DOI: 10.1128/jb.188.1.160-168.2006] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022]
Abstract
The phytotoxins syringopeptin and syringomycin are synthesized by nonribosomal peptide synthetases which are encoded by the syringomycin (syr) and syringopeptin (syp) genomic island of Pseudomonas syringae pv. syringae. Previous studies demonstrated that expression of the syr-syp genes was controlled by the salA-syrF regulatory pathway, which in turn was induced by plant signal molecules. In this study, the 132-kb syr-syp genomic island was found to be organized into five polycistronic operons along with eight individual genes based on reverse transcriptional PCR and bioinformatic analysis. The transcriptional start sites of the salA gene and operons III and IV were located 63, 75, and 104 bp upstream of the start codons of salA, syrP, and syrB1, respectively, using primer extension analysis. The predicted -10/-35 promoter region of operon IV was confirmed based on deletion and site-directed mutagenesis analyses of the syrB1::uidA reporter with beta-glucuronidase assays. A 20-bp conserved sequence (TGtCccgN6cggGaCA, termed the syr-syp box) with dyad symmetry around the -35 region was identified via computer analysis for the syr-syp genes/operons responsible for biosynthesis and secretion of syringomycin and syringopeptin. Expression of the syrB1::uidA fusion was decreased 59% when 6 bp was deleted from the 5' end of the syr-syp box in the promoter region of operon IV. These results demonstrate that the conserved promoter sequences of the syr-syp genes contribute to the coregulation of syringomycin and syringopeptin production.
Affiliation(s)
- Nian Wang
- Department of Plant Pathology and Microbiology, Texas A&M University, College Station, TX 77843, USA
43
Abstract
Since traditional multiple alignment formulations are NP-hard, heuristics are commonly employed to find acceptable alignments with no guaranteed performance bound. This causes a substantial difficulty in understanding what the resulting alignment means and in assessing the quality of these alignments. We propose an alternative formulation of multiple alignment based on the idea of finding a multiple alignment of k sequences which preserves k - 1 pairwise alignments as specified by edges of a given tree. Although it is well known that such a preserving alignment always exists, it did not become a mainstream method for multiple alignment since it seems that a lot of information is lost from ignoring pairwise similarities outside the tree. In contrast, by using pairwise alignments that incorporate consistency information from other sequences, we show that it is possible to obtain very good accuracy with the preserving alignment formulation. We show that a reasonable objective function to use is to find the shortest preserving alignment, and, by a reduction to a graph-theoretic problem, that the problem of finding the shortest preserving multiple alignment can be solved in polynomial time. We demonstrate the success of this approach on three sets of benchmark multiple alignments by using consistency-based pairwise alignments from the first stage of two of the best performing progressive alignment algorithms TCoffee and ProbCons and replace the second heuristic progressive step of these algorithms by the exact preserving alignment step. We apply this strategy to TCoffee and show that our approach outperforms TCoffee on two of the three test sets. We apply the strategy to a variant of ProbCons with no iterative refinements and show that our approach achieves similar or better accuracy except on one test set. 
We also compare our performance to ProbCons with iterative refinements and show that our approach achieves similar or better accuracy on many subcategories even without further refinements. The most important advantage of the preserving alignment formulation is that we are certain that we can solve the problem in polynomial time without using a heuristic. A software program implementing this approach (PSAlign) is available at http://faculty.cs.tamu.edu/shsze/psalign.
Affiliation(s)
- Sing-Hoi Sze
- Department of Computer Science, Texas A&M University, College Station, TX 77843, USA
44
Lee JJ, Hassan OSS, Gao W, Wei NE, Kohel RJ, Chen XY, Payton P, Sze SH, Stelly DM, Chen ZJ. Developmental and gene expression analyses of a cotton naked seed mutant. Planta 2006; 223:418-32. [PMID: 16254724 DOI: 10.1007/s00425-005-0098-7] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.6] [Received: 04/13/2005] [Accepted: 07/25/2005] [Indexed: 05/05/2023]
Abstract
Cotton fiber development is a fundamental biological phenomenon, yet the molecular basis of fiber cell initiation is poorly understood. We examined molecular and cellular events of fiber cell development in the naked seed mutant (N1N1) and its isogenic line of cotton (Gossypium hirsutum L. cv. Texas Marker-1, TM-1). The dominant mutation not only delayed the process of fiber cell formation and elongation but also reduced the total number of fiber cells, resulting in sparsely distributed short fibers. Gene expression changes in TM-1 and N1N1 mutant lines among four tissues were analyzed using spotted cotton oligo-gene microarrays. Using the Arabidopsis genes, we selected and designed approximately 1,334 70-mer oligos from a subset of cotton fiber ESTs. Statistical analysis of the microarray data indicates that the number of significantly differentially expressed genes was 856 in the leaves compared to the ovules (3 days post-anthesis, DPA), 632 in the petals relative to the ovules (3 DPA), and 91 in the ovules at 0 DPA compared to 3 DPA, all in TM-1. Moreover, 117 and 30 genes were expressed significantly differently in the ovules at 3 and 0 DPA, respectively, between TM-1 and N1N1. Quantitative RT-PCR analysis of 23 fiber-associated genes in seven tissues including ovules, fiber-bearing ovules, fibers, and non-fiber tissues in TM-1 and N1N1 indicates a mode of temporal regulation of the genes involved in transcriptional and translational regulation, signal transduction, and cell differentiation during early stages of fiber development. Suppression of the fiber-associated genes in the mutant may suggest that the N1N1 mutation disrupts temporal regulation of gene expression, leading to a defective process of fiber cell elongation and development.
Affiliation(s)
- Jinsuk J Lee
- Department of Soil and Crop Sciences and Intercollegiate Program in Genetics, Texas A&M University, MS 2474/Molecular Genetics, College Station, TX 77843, USA
45
Wang J, Adelson DL, Yilmaz A, Sze SH, Jin Y, Zhu JJ. Genomic organization, annotation, and ligand-receptor inferences of chicken chemokines and chemokine receptor genes based on comparative genomics. BMC Genomics 2005; 6:45. [PMID: 15790398 PMCID: PMC1082905 DOI: 10.1186/1471-2164-6-45] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.6] [Received: 11/12/2004] [Accepted: 03/24/2005] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Chemokines and their receptors play important roles in host defense, organogenesis, hematopoiesis, and neuronal communication. Forty-two chemokines and 19 cognate receptors have been found in the human genome. Prior to this report, only 11 chicken chemokines and 7 receptors had been reported. The objectives of this study were to systematically identify chicken chemokines and their cognate receptor genes in the chicken genome and to annotate these genes and ligand-receptor binding by a comparative genomics approach. RESULTS: Twenty-three chemokine and 14 chemokine receptor genes were identified in the chicken genome. All of the chicken chemokines contained a conserved CC, CXC, CX3C, or XC motif, whereas all the chemokine receptors had seven conserved transmembrane helices, four extracellular domains with a conserved cysteine, and a conserved DRYLAIV sequence in the second intracellular domain. The number of coding exons in these genes and the syntenies are highly conserved between human, mouse, and chicken although the amino acid sequence homologies are generally low between mammalian and chicken chemokines. Chicken genes were named with the systematic nomenclature used in humans and mice based on phylogeny, synteny, and sequence homology. CONCLUSION: The independent nomenclature of chicken chemokines and chemokine receptors suggests that the chicken may have ligand-receptor pairings similar to mammals. All identified chicken chemokines and their cognate receptors were identified in the chicken genome except CCR9, whose ligand was not identified in this study. The organization of these genes suggests that there were a substantial number of these genes present before divergence between aves and mammals and more gene duplications of CC, CXC, CCR, and CXCR subfamilies in mammals than in aves after the divergence.
Affiliation(s)
- Jixin Wang
- Department of Poultry Science, Texas A & M University, College Station, TX 77843, USA
- David L Adelson
- Department of Animal Science, Texas A & M University, College Station, TX 77843, USA
- Ahmet Yilmaz
- Department of Poultry Science, Texas A & M University, College Station, TX 77843, USA
- Sing-Hoi Sze
- Department of Computer Science, Texas A & M University, College Station, TX 77843, USA
- Yuan Jin
- Department of Computer Science, Texas A & M University, College Station, TX 77843, USA
- James J Zhu
- Department of Poultry Science, Texas A & M University, College Station, TX 77843, USA
|
46
|
Abstract
MOTIVATION: The traditional approach to annotate alternative splicing is to investigate every splicing variant of the gene in a case-by-case fashion. This approach, while useful, has some serious shortcomings. Recent studies indicate that alternative splicing is more frequent than previously thought and some genes may produce tens of thousands of different transcripts. A list of alternatively spliced variants for such genes would be difficult to build and hard to analyse. Moreover, such a list does not show the relationships between different transcripts and does not show the overall structure of all transcripts. A better approach would be to represent all splicing variants for a given gene in a way that captures the relationships between different splicing variants. RESULTS: We introduce the notion of the splicing graph that is a natural and convenient representation of all splicing variants. The key difference with the existing approaches is that we abandon the linear (sequence) representation of each transcript and replace it with a graph representation where each transcript corresponds to a path in the graph. We further design an algorithm to assemble EST reads into the splicing graph rather than assembling them into each splicing variant in a case-by-case fashion.
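The idea that each transcript corresponds to a path in a single graph can be illustrated with a minimal sketch. It assumes transcripts are already given as ordered lists of exon labels (the labels `E1` to `E5` and the function names are hypothetical); it is not the EST assembly algorithm from the paper, which works from reads rather than from annotated exons.

```python
from collections import defaultdict


def build_splicing_graph(transcripts):
    """Merge transcripts (ordered lists of exon labels) into one directed graph.

    Vertices are exon labels; an edge a -> b means some transcript
    splices exon a directly to exon b.
    """
    graph = defaultdict(set)
    for exons in transcripts:
        for exon in exons:
            graph[exon]  # touching the key registers the vertex
        for a, b in zip(exons, exons[1:]):
            graph[a].add(b)
    return {vertex: sorted(successors) for vertex, successors in graph.items()}


def is_path(graph, exons):
    """Return True if the exon sequence is a path in the graph."""
    if any(exon not in graph for exon in exons):
        return False
    return all(b in graph[a] for a, b in zip(exons, exons[1:]))


def enumerate_paths(graph, start, end):
    """List every path from start to end (assumes the graph is acyclic)."""
    if start == end:
        return [[end]]
    return [
        [start] + rest
        for successor in graph[start]
        for rest in enumerate_paths(graph, successor, end)
    ]


# Two hypothetical splicing variants of one gene.
transcripts = [
    ["E1", "E2", "E3", "E5"],
    ["E1", "E3", "E4", "E5"],
]
graph = build_splicing_graph(transcripts)
paths = enumerate_paths(graph, "E1", "E5")
```

In this toy case both input transcripts are paths in `graph`, and `paths` also contains combinations that were not observed directly, which is one reason a graph is more compact than an explicit list of variants.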
Affiliation(s)
- Steffen Heber
- Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA, 92093-0114, USA.
|
47
|
Lunyak VV, Burgess R, Prefontaine GG, Nelson C, Sze SH, Chenoweth J, Schwartz P, Pevzner PA, Glass C, Mandel G, Rosenfeld MG. Corepressor-dependent silencing of chromosomal regions encoding neuronal genes. Science 2002; 298:1747-52. [PMID: 12399542 DOI: 10.1126/science.1076469] [Citation(s) in RCA: 368] [Impact Index Per Article: 16.7] [Indexed: 12/21/2022]
Abstract
The molecular mechanisms by which central nervous system-specific genes are expressed only in the nervous system and repressed in other tissues remain a central issue in developmental and regulatory biology. Here, we report that the zinc-finger gene-specific repressor element RE-1 silencing transcription factor/neuronal restricted silencing factor (REST/NRSF) can mediate extraneuronal restriction by imposing either active repression via histone deacetylase recruitment or long-term gene silencing using a distinct functional complex. Silencing of neuronal-specific genes requires the recruitment of an associated corepressor, CoREST, that serves as a functional molecular beacon for the recruitment of molecular machinery that imposes silencing across a chromosomal interval, including transcriptional units that do not themselves contain REST/NRSF response elements.
Affiliation(s)
- Victoria V Lunyak
- Howard Hughes Medical Institute (HHMI), Department of Computer Science and Engineering, School of Medicine, University of California, San Diego, 9500 Gilman Drive, Room 345, La Jolla, CA 92093-0648, USA
|
48
|
Abstract
Recognition of regulatory sites in unaligned DNA sequences is an old and well-studied problem in computational molecular biology. Recently, large-scale expression studies and comparative genomics brought this problem into a spotlight by generating a large number of samples with unknown regulatory signals. Here we develop algorithms for recognition of signals in corrupted samples (where only a fraction of sequences contain sites) with biased nucleotide composition. We further benchmark these and other algorithms on several bacterial and archaeal sites in a setting specifically designed to imitate the situations arising in comparative genomics studies.
Affiliation(s)
- S H Sze
- Department of Computer Science and Engineering, University of California at San Diego, La Jolla, CA 92093-0114, USA
|
49
|
Pevzner PA, Sze SH. Combinatorial approaches to finding subtle signals in DNA sequences. Proc Int Conf Intell Syst Mol Biol 2001; 8:269-78. [PMID: 10977088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2023]
Abstract
Signal finding (pattern discovery in unaligned DNA sequences) is a fundamental problem in both computer science and molecular biology with important applications in locating regulatory sites and drug target identification. Despite many studies, this problem is far from being resolved: most signals in DNA sequences are so complicated that we don't yet have good models or reliable algorithms for their recognition. We complement existing statistical and machine learning approaches to this problem by a combinatorial approach that proved to be successful in identifying very subtle signals.
Affiliation(s)
- P A Pevzner
- Department of Mathematics, University of Southern California, Los Angeles 90089-1113, USA.
|
50
|
Abstract
MOTIVATION: Gene annotation is the final goal of gene prediction algorithms. However, these algorithms frequently make mistakes and therefore the use of gene predictions for sequence annotation is hardly possible. As a result, biologists are forced to conduct time-consuming gene identification experiments by designing appropriate PCR primers to test cDNA libraries or applying RT-PCR, exon trapping/amplification, or other techniques. This process frequently amounts to 'guessing' PCR primers on top of unreliable gene predictions and frequently leads to wasting of experimental efforts. RESULTS: The present paper proposes a simple and reliable algorithm for experimental gene identification which bypasses the unreliable gene prediction step. Studies of the performance of the algorithm on a sample of human genes indicate that an experimental protocol based on the algorithm's predictions achieves an accurate gene identification with relatively few PCR primers. Predictions of PCR primers may be used for exon amplification in preliminary mutation analysis during an attempt to identify a gene responsible for a disease. We propose a simple approach to find a short region from a genomic sequence that with high probability overlaps with some exon of the gene. The algorithm is enhanced to find one or more segments that are probably contained in the translated region of the gene and can be used as PCR primers to select appropriate clones in cDNA libraries by selective amplification. The algorithm is further extended to locate a set of PCR primers that uniformly cover all translated regions and can be used for RT-PCR and further sequencing of (unknown) mRNA.
Affiliation(s)
- S H Sze
- Department of Computer Science, University of Southern California, Los Angeles 90089-1113, USA
|