Reference Citation Analysis

Total Articles

22
(from Reference Citation Analysis)

Article PDFs (12)

Cited by > 0 (15)

Searched Name

Thomas K F Wong

Wong TKF, Cherryh C, Rodrigo AG, Hahn MW, Minh BQ, Lanfear R. MAST: Phylogenetic Inference with Mixtures Across Sites and Trees. Syst Biol 2024:syae008. [PMID: 38421146 DOI: 10.1093/sysbio/syae008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Indexed: 03/02/2024] Open

Abstract

Hundreds or thousands of loci are now routinely used in modern phylogenomic studies. Concatenation approaches to tree inference assume that there is a single topology for the entire dataset, but different loci may have different evolutionary histories due to incomplete lineage sorting, introgression, and/or horizontal gene transfer; even single loci may not be treelike due to recombination. To overcome this shortcoming, we introduce an implementation of a multi-tree mixture model that we call MAST. This model extends a prior implementation by Boussau et al. (2009) by allowing users to estimate the weight of each of a set of pre-specified bifurcating trees in a single alignment. The MAST model allows each tree to have its own weight, topology, branch lengths, substitution model, nucleotide or amino acid frequencies, and model of rate heterogeneity across sites. We implemented the MAST model in a maximum-likelihood framework in the popular phylogenetic software, IQ-TREE. Simulations show that we can accurately recover the true model parameters, including branch lengths and tree weights for a given set of tree topologies, under a wide range of biologically realistic scenarios. We also show that we can use standard statistical inference approaches to reject a single-tree model when data are simulated under multiple trees (and vice versa). We applied the MAST model to multiple primate datasets and found that it can recover the signal of incomplete lineage sorting in the Great Apes, as well as the asymmetry in minor trees caused by introgression among several macaque species. When applied to a dataset of four Platyrrhine species for which standard concatenated maximum likelihood and gene tree approaches disagree, we observe that MAST gives the highest weight (i.e. the largest proportion of sites) to the tree also supported by gene tree approaches. 
These results suggest that the MAST model is able to analyse a concatenated alignment using maximum likelihood, while avoiding some of the biases that come with assuming there is only a single tree. We discuss how the MAST model can be extended in the future.
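
The weighting idea in this abstract can be illustrated with a minimal sketch. It assumes the generic form of a multi-tree mixture, in which the likelihood of each site is a weighted sum of its likelihoods under each tree. The function names and the per-tree site likelihoods are hypothetical inputs for illustration; this is not IQ-TREE code or its interface.

```python
import math

def mast_log_likelihood(site_likelihoods, weights):
    """Log-likelihood of an alignment under a mixture of trees.

    site_likelihoods[k][i] is the (hypothetical, precomputed) likelihood
    of site i under tree k; weights[k] is the weight of tree k.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("tree weights must sum to 1")
    n_sites = len(site_likelihoods[0])
    total = 0.0
    for i in range(n_sites):
        # Mixture likelihood of site i: weighted sum over trees.
        mix = sum(w * per_tree[i] for w, per_tree in zip(weights, site_likelihoods))
        total += math.log(mix)
    return total

def site_posteriors(site_likelihoods, weights, i):
    """Posterior probability that site i belongs to each tree."""
    joint = [w * per_tree[i] for w, per_tree in zip(weights, site_likelihoods)]
    norm = sum(joint)
    return [j / norm for j in joint]
```

Under this sketch, a tree's weight can be read as the expected proportion of sites drawn from that tree, which matches the abstract's description of the weight as a proportion of sites.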

Li X, Zou Y, Li T, Wong TKF, Bushey RT, Campa MJ, Gottlin EB, Liu H, Wei Q, Rodrigo A, Patz EF. Genetic Variants of CLPP and M1AP Are Associated With Risk of Non-Small Cell Lung Cancer. Front Oncol 2021;11:709829. [PMID: 34604049 PMCID: PMC8479179 DOI: 10.3389/fonc.2021.709829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Accepted: 08/20/2021] [Indexed: 11/23/2022] Open

Affiliation(s)

Xianghan Li Research School of Biology, Australian National University, Canberra, ACT, Australia; School of Biological Sciences, University of Auckland, Auckland, New Zealand
Yiran Zou Research School of Biology, Australian National University, Canberra, ACT, Australia; School of Biological Sciences, University of Auckland, Auckland, New Zealand
Teng Li Research School of Biology, Australian National University, Canberra, ACT, Australia; School of Biological Sciences, University of Auckland, Auckland, New Zealand
Thomas K F Wong Research School of Biology, Australian National University, Canberra, ACT, Australia
Ryan T Bushey Department of Radiology, Duke University Medical Center, Durham, NC, United States
Michael J Campa Department of Radiology, Duke University Medical Center, Durham, NC, United States
Elizabeth B Gottlin Department of Radiology, Duke University Medical Center, Durham, NC, United States
Hongliang Liu Duke Cancer Institute, Duke University Medical Center, Durham, NC, United States; Department of Population Health Sciences, Duke University School of Medicine, Durham, NC, United States
Qingyi Wei Duke Cancer Institute, Duke University Medical Center, Durham, NC, United States; Department of Population Health Sciences, Duke University School of Medicine, Durham, NC, United States; Department of Medicine, Duke University School of Medicine, Durham, NC, United States
Allen Rodrigo Research School of Biology, Australian National University, Canberra, ACT, Australia; School of Biological Sciences, University of Auckland, Auckland, New Zealand
Edward F Patz Department of Radiology, Duke University Medical Center, Durham, NC, United States; Duke Cancer Institute, Duke University Medical Center, Durham, NC, United States; Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC, United States

Li T, Wong TKF, Ranjard L, Rodrigo AG. pgHMA: Application of the heteroduplex mobility assay analysis in phylogenetics and population genetics. Mol Ecol Resour 2021;22:653-663. [PMID: 34551204 DOI: 10.1111/1755-0998.13508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2021] [Revised: 09/01/2021] [Accepted: 09/06/2021] [Indexed: 11/26/2022]

Wong TKF, Li T, Ranjard L, Wu SH, Sukumaran J, Rodrigo AG. An assembly-free method of phylogeny reconstruction using short-read sequences from pooled samples without barcodes. PLoS Comput Biol 2021;17:e1008949. [PMID: 34516547 PMCID: PMC8460051 DOI: 10.1371/journal.pcbi.1008949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2021] [Revised: 09/23/2021] [Accepted: 09/01/2021] [Indexed: 12/01/2022] Open

Abstract

A current strategy for obtaining haplotype information from several individuals involves short-read sequencing of pooled amplicons, where fragments from each individual are identified by a unique DNA barcode. In this paper, we report a new method to recover the phylogeny of haplotypes from short-read sequences obtained using pooled amplicons from a mixture of individuals, without barcoding. The method, AFPhyloMix, accepts an alignment of the mixture of reads against a reference sequence, obtains the single-nucleotide polymorphism (SNP) patterns along the alignment, and constructs the phylogenetic tree according to the SNP patterns. AFPhyloMix adopts a Bayesian inference model to estimate the phylogeny of the haplotypes and their relative abundances, given that the number of haplotypes is known. In our simulations, AFPhyloMix achieved at least 80% accuracy at recovering the phylogenies and relative abundances of the constituent haplotypes, for mixtures with up to 15 haplotypes. AFPhyloMix also worked well on a real data set of kangaroo mitochondrial DNA sequences.

In evolutionary studies, it is customary to obtain homologous sequences from different individuals in a population or a species to construct a phylogeny. Frequently, sequences from different individuals will be identical; we refer to a set of identical sequences as a haplotype. If short-read sequencing technologies are used to obtain sequences from many individuals, the sequence from each individual is tagged with a unique barcode, and a mixed sample of tagged sequences is subsequently sequenced. The tagged sequences can be identified using the appropriate bioinformatics tools, for further downstream analyses. We have developed a novel method, AFPhyloMix, to reconstruct the phylogeny of a mixed sample of homologous sequences, and the relative abundance of different haplotypes, from different individuals without the need for barcoding. AFPhyloMix aligns the short reads obtained to a reference alignment, and identifies the variable sites along the alignment. On the basis of the patterns of nucleotide frequencies at these and neighbouring sites, AFPhyloMix uses a Bayesian inference model to compute the phylogenetic tree and the haplotype relative abundances. Our results show that AFPhyloMix works well on both the simulated data set and the real data set.
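
The step of identifying variable sites from reads aligned to a reference can be sketched as follows. This is only a toy illustration under stated assumptions: reads are gapless, given as hypothetical `(start, sequence)` pairs, and a site counts as variable when its second most common base reaches an arbitrary frequency threshold. It is not the AFPhyloMix implementation, and the Bayesian inference step is not shown.

```python
from collections import Counter

def column_frequencies(aligned_reads, n_cols):
    """Count bases per reference column.

    aligned_reads: list of (start, sequence) pairs, with start being the
    0-based reference column where the gapless read begins (assumption).
    """
    counts = [Counter() for _ in range(n_cols)]
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            col = start + offset
            if 0 <= col < n_cols and base in "ACGT":
                counts[col][base] += 1
    return counts

def variable_sites(counts, min_minor_freq=0.1):
    """Return columns whose second most common base is frequent enough."""
    sites = []
    for col, c in enumerate(counts):
        depth = sum(c.values())
        if depth == 0:
            continue  # no read covers this column
        ranked = c.most_common()
        if len(ranked) > 1 and ranked[1][1] / depth >= min_minor_freq:
            sites.append(col)
    return sites
```

The threshold `min_minor_freq` is a hypothetical parameter; in practice it would need to sit above the sequencing error rate.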

Wong TKF, Kalyaanamoorthy S, Meusemann K, Yeates DK, Misof B, Jermiin LS. A minimum reporting standard for multiple sequence alignments. NAR Genom Bioinform 2020;2:lqaa024. [PMID: 33575581 PMCID: PMC7671350 DOI: 10.1093/nargab/lqaa024] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 01/25/2020] [Revised: 03/12/2020] [Accepted: 03/30/2020] [Indexed: 12/19/2022] Open

Ranjard L, Wong TKF, Rodrigo AG. Correction to: Effective machine-learning assembly for next-generation amplicon sequencing with very low coverage. BMC Bioinformatics 2020;21:24. [PMID: 31969110 PMCID: PMC6977291 DOI: 10.1186/s12859-019-3318-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/30/2023] Open

Ranjard L, Wong TKF, Rodrigo AG. Effective machine-learning assembly for next-generation amplicon sequencing with very low coverage. BMC Bioinformatics 2019;20:654. [PMID: 31829137 PMCID: PMC6907241 DOI: 10.1186/s12859-019-3287-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/03/2019] [Accepted: 11/20/2019] [Indexed: 01/20/2023] Open

Abstract

BACKGROUND

In short-read DNA sequencing experiments, the read coverage is a key parameter to successfully assemble the reads and reconstruct the sequence of the input DNA. When coverage is very low, the original sequence reconstruction from the reads can be difficult because of the occurrence of uncovered gaps. Reference guided assembly can then improve these assemblies. However, when the available reference is phylogenetically distant from the sequencing reads, the mapping rate of the reads can be extremely low. Some recent improvements in read mapping approaches aim at modifying the reference according to the reads dynamically. Such approaches can significantly improve the alignment rate of the reads onto distant references but the processing of insertions and deletions remains challenging.

RESULTS

Here, we introduce a new algorithm to update the reference sequence according to previously aligned reads. Substitutions, insertions and deletions are performed in the reference sequence dynamically. We evaluate this approach to assemble a western-grey kangaroo mitochondrial amplicon. Our results show that more reads can be aligned and that this method produces assemblies of length comparable to the truth while limiting error rate when classic approaches fail to recover the correct length. Finally, we discuss how the core algorithm of this method could be improved and combined with other approaches to analyse larger genomic sequences.

CONCLUSIONS

We introduced an algorithm to perform dynamic alignment of reads on a distant reference. We showed that such an approach can improve the reconstruction of an amplicon compared to classically used bioinformatic pipelines. Although not portable to genomic scale in the current form, we suggested several improvements to be investigated to make this method more flexible and allow dynamic alignment to be used for large genome assemblies.
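
The idea of editing a reference with substitutions, insertions and deletions can be shown with a toy sketch. The edit format `(pos, op, base)` and the function name are hypothetical, and at most one edit per position is assumed. How the paper's algorithm derives edits from aligned reads is not described in the abstract and is not modelled here.

```python
def apply_edits(reference, edits):
    """Apply substitutions, insertions and deletions to a reference string.

    edits: list of (pos, op, base) with pos 0-based in the ORIGINAL
    reference and op in {"sub", "ins", "del"}. An insertion places
    `base` before position pos. One edit per position is assumed.
    """
    ref = list(reference)
    # Work from right to left so earlier coordinates stay valid
    # after an insertion or deletion changes the length.
    for pos, op, base in sorted(edits, key=lambda e: e[0], reverse=True):
        if op == "sub":
            ref[pos] = base
        elif op == "ins":
            ref.insert(pos, base)
        elif op == "del":
            del ref[pos]
        else:
            raise ValueError(f"unknown edit operation: {op}")
    return "".join(ref)
```

Applying edits from the rightmost position first is a design choice that avoids recomputing coordinates after each length change.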

Wong TKF, Ranjard L, Lin Y, Rodrigo AG. HaploJuice: accurate haplotype assembly from a pool of sequences with known relative concentrations. BMC Bioinformatics 2018;19:389. [PMID: 30348075 PMCID: PMC6198429 DOI: 10.1186/s12859-018-2424-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/25/2018] [Accepted: 10/09/2018] [Indexed: 11/10/2022] Open

Ranjard L, Wong TKF, Rodrigo AG. Reassembling haplotypes in a mixture of pooled amplicons when the relative concentrations are known: A proof-of-concept study on the efficient design of next-generation sequencing strategies. PLoS One 2018;13:e0195090. [PMID: 29621260 PMCID: PMC5886459 DOI: 10.1371/journal.pone.0195090] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/17/2017] [Accepted: 03/18/2018] [Indexed: 12/02/2022] Open

Ranjard L, Wong TKF, Külheim C, Rodrigo AG, Ragg NLC, Patel S, Dunphy BJ. Complete mitochondrial genome of the green-lipped mussel, Perna canaliculus (Mollusca: Mytiloidea), from long nanopore sequencing reads. Mitochondrial DNA B Resour 2018;3:175-176. [PMID: 33490494 PMCID: PMC7801018 DOI: 10.1080/23802359.2018.1437810] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]

Tay WT, Walsh TK, Downes S, Anderson C, Jermiin LS, Wong TKF, Piper MC, Chang ES, Macedo IB, Czepak C, Behere GT, Silvie P, Soria MF, Frayssinet M, Gordon KHJ. Mitochondrial DNA and trade data support multiple origins of Helicoverpa armigera (Lepidoptera, Noctuidae) in Brazil. Sci Rep 2017;7:45302. [PMID: 28350004 PMCID: PMC5368605 DOI: 10.1038/srep45302] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 10/20/2016] [Accepted: 02/23/2017] [Indexed: 01/31/2023] Open

Affiliation(s)

Wee Tek Tay CSIRO, Black Mountain Laboratories, Clunies Ross Street, ACT 2601, Australia
Thomas K. Walsh CSIRO, Black Mountain Laboratories, Clunies Ross Street, ACT 2601, Australia
Sharon Downes CSIRO, Myall Vale Laboratories, Kamilaroi Highway, Narrabri, NSW 2390, Australia
Craig Anderson CSIRO, Black Mountain Laboratories, Clunies Ross Street, ACT 2601, Australia; Biological and Environmental Sciences, University of Stirling, Stirling, FK9 4LA, UK
Lars S. Jermiin CSIRO, Black Mountain Laboratories, Clunies Ross Street, ACT 2601, Australia; Research School of Biology, Australian National University, Acton, ACT 2601, Australia
Thomas K. F. Wong CSIRO, Black Mountain Laboratories, Clunies Ross Street, ACT 2601, Australia; Research School of Biology, Australian National University, Acton, ACT 2601, Australia
Melissa C. Piper CSIRO, Black Mountain Laboratories, Clunies Ross Street, ACT 2601, Australia
Ester Silva Chang CSIRO, Black Mountain Laboratories, Clunies Ross Street, ACT 2601, Australia; Universidade de São Paulo, Instituto de Biociências, São Paulo, SP, 05508-090, Brazil
Isabella Barony Macedo CSIRO, Black Mountain Laboratories, Clunies Ross Street, ACT 2601, Australia; Universidade Federal de Minas Gerais, Faculdade de Farmácia, Belo Horizonte, MG, 31270-901, Brazil
Cecilia Czepak Universidade Federal de Goiás, Escola de Agronomia, Goiânia, GO, 75804-020, Brazil
Gajanan T. Behere Division of Crop Protection, ICAR Research Complex for North East Hill Region, Umroi Road, Umiam, Meghalaya, 793103, India
Pierre Silvie IRD, UMR EGCE, FR-91198 Gif-sur-Yvette Cedex, France; CIRAD, UPR AÏDA, F-34398 Montpellier Cedex 05, France
Miguel F. Soria Bayer S.A., Crop Science Division, São Paulo, SP, 04779-900, Brazil
Marie Frayssinet DGIMI, INRA, Université Montpellier, Montpellier, 34095, France
Karl H. J. Gordon CSIRO, Black Mountain Laboratories, Clunies Ross Street, ACT 2601, Australia

Jayaswal V, Wong TKF, Robinson J, Poladian L, Jermiin LS. Mixture models of nucleotide sequence evolution that account for heterogeneity in the substitution process across sites and across lineages. Syst Biol 2014;63:726-42. [PMID: 24927722 DOI: 10.1093/sysbio/syu036] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Indexed: 11/14/2022] Open

Abstract

Molecular phylogenetic studies of homologous sequences of nucleotides often assume that the underlying evolutionary process was globally stationary, reversible, and homogeneous (SRH), and that a model of evolution with one or more site-specific and time-reversible rate matrices (e.g., the GTR rate matrix) is enough to accurately model the evolution of data over the whole tree. However, an increasing body of data suggests that evolution under these conditions is an exception, rather than the norm. To address this issue, several non-SRH models of molecular evolution have been proposed, but they either ignore heterogeneity in the substitution process across sites (HAS) or assume it can be modeled accurately using the Γ distribution. As an alternative to these models of evolution, we introduce a family of mixture models that approximate HAS without the assumption of an underlying predefined statistical distribution. This family of mixture models is combined with non-SRH models of evolution that account for heterogeneity in the substitution process across lineages (HAL). We also present two algorithms for searching model space and identifying an optimal model of evolution that is less likely to over- or underparameterize the data. The performance of the two new algorithms was evaluated using alignments of nucleotides with 10 000 sites simulated under complex non-SRH conditions on a 25-tipped tree. The algorithms were found to be very successful, identifying the correct HAL model with a 75% success rate (the average success rate for assigning rate matrices to the tree's 48 edges was 99.25%) and, for the correct HAL model, identifying the correct HAS model with a 98% success rate. Finally, parameter estimates obtained under the correct HAL-HAS model were found to be accurate and precise. 
The merits of our new algorithms were illustrated with an analysis of 42 337 second codon sites extracted from a concatenation of 106 alignments of orthologous genes encoded by the nuclear genomes of Saccharomyces cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, S. castellii, S. kluyveri, S. bayanus, and Candida albicans. Our results show that second codon sites in the ancestral genome of these species contained 49.1% invariable sites, 39.6% variable sites belonging to one rate category (V1), and 11.3% variable sites belonging to a second rate category (V2). The ancestral nucleotide content was found to differ markedly across these three sets of sites, and the evolutionary processes operating at the variable sites were found to be non-SRH and best modeled by a combination of eight edge-specific rate matrices (four for V1 and four for V2). The number of substitutions per site at the variable sites also differed markedly, with sites belonging to V1 evolving slower than those belonging to V2 along the lineages separating the seven species of Saccharomyces. Finally, sites belonging to V1 appeared to have ceased evolving along the lineages separating S. cerevisiae, S. paradoxus, S. mikatae, S. kudriavzevii, and S. bayanus, implying that they might have become so selectively constrained that they could be considered invariable sites in these species.
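
The three site classes reported for the yeast data can be written as a generic finite mixture. This is a sketch under an assumption: the abstract does not give the paper's exact parameterisation, so the notation below (site pattern $D_i$, class proportions $p_0$, $p_1$, $p_2$) is introduced here for illustration only.

```latex
% Likelihood of site pattern D_i as a mixture over an invariable class
% and two variable rate categories (V1, V2); notation is illustrative.
\begin{align}
L(D_i) &= p_0 \, L(D_i \mid \text{invariable})
        + p_1 \, L(D_i \mid V_1)
        + p_2 \, L(D_i \mid V_2),
        \qquad p_0 + p_1 + p_2 = 1 \\
% Proportions reported in the abstract for second codon sites:
\hat{p}_0 + \hat{p}_1 + \hat{p}_2 &= 0.491 + 0.396 + 0.113 = 1.000
\end{align}
```

The reported percentages (49.1%, 39.6%, and 11.3%) sum to 100%, as a set of mixture proportions should.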

Affiliation(s)

Vivek Jayaswal School of Biomedical Sciences, Queensland University of Technology, Brisbane, QLD 4000, Australia; School of Mathematics and Statistics, University of Sydney, Sydney, NSW 2006, Australia; CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia; and Centre for Mathematical Biology, University of Sydney, Sydney, NSW 2006, Australia
Thomas K F Wong School of Biomedical Sciences, Queensland University of Technology, Brisbane, QLD 4000, Australia; School of Mathematics and Statistics, University of Sydney, Sydney, NSW 2006, Australia; CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia; and Centre for Mathematical Biology, University of Sydney, Sydney, NSW 2006, Australia
John Robinson School of Biomedical Sciences, Queensland University of Technology, Brisbane, QLD 4000, Australia; School of Mathematics and Statistics, University of Sydney, Sydney, NSW 2006, Australia; CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia; and Centre for Mathematical Biology, University of Sydney, Sydney, NSW 2006, Australia
Leon Poladian School of Biomedical Sciences, Queensland University of Technology, Brisbane, QLD 4000, Australia; School of Mathematics and Statistics, University of Sydney, Sydney, NSW 2006, Australia; CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia; and Centre for Mathematical Biology, University of Sydney, Sydney, NSW 2006, Australia
Lars S Jermiin School of Biomedical Sciences, Queensland University of Technology, Brisbane, QLD 4000, Australia; School of Mathematics and Statistics, University of Sydney, Sydney, NSW 2006, Australia; CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia; and Centre for Mathematical Biology, University of Sydney, Sydney, NSW 2006, Australia

Hu X, Wong TKF, Lu ZJ, Chan TF, Lau TCK, Yiu SM, Yip KY. Computational identification of protein binding sites on RNAs using high-throughput RNA structure-probing data. Bioinformatics 2013;30:1049-1055. [PMID: 24376038 DOI: 10.1093/bioinformatics/btt757] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/05/2013] [Accepted: 12/13/2013] [Indexed: 11/14/2022]

Affiliation(s)

Xihao Hu Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia, MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China 100084, School of Life Sciences, Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong and Department of Biology and Chemistry, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
Thomas K F Wong Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia, MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China 100084, School of Life Sciences, Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong and Department of Biology and Chemistry, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
Zhi John Lu Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia, MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China 100084, School of Life Sciences, Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong and Department of Biology and Chemistry, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
Ting Fung Chan Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia, MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China 100084, School of Life Sciences, Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong and Department of Biology and Chemistry, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
Terrence Chi Kong Lau Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia, MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China 100084, School of Life Sciences, Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong and Department of Biology and Chemistry, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
Siu Ming Yiu Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia, MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China 100084, School of Life Sciences, Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong and Department of Biology and Chemistry, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong
Kevin Y Yip Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong, CSIRO Ecosystem Sciences, Canberra, ACT 2601, Australia, MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China 100084, School of Life Sciences, Hong Kong Bioinformatics Centre, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong and Department of Biology and Chemistry, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong

Ma C, Wong TKF, Lam TW, Hon WK, Sadakane K, Yiu SM. An efficient alignment algorithm for searching simple pseudoknots over long genomic sequence. IEEE/ACM Trans Comput Biol Bioinform 2012;9:1629-1638. [PMID: 22848134 DOI: 10.1109/tcbb.2012.104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]

Wong TKF, Chiu YS, Lam TW, Yiu SM. Memory efficient algorithms for structural alignment of RNAs with pseudoknots. IEEE/ACM Trans Comput Biol Bioinform 2012;9:161-168. [PMID: 21464506 DOI: 10.1109/tcbb.2011.66] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2023]

Wong TKF, Wan KL, Hsu BY, Cheung BWY, Hon WK, Lam TW, Yiu SM. RNASAlign: RNA structural alignment system. Bioinformatics 2011;27:2151-2. [PMID: 21659321 DOI: 10.1093/bioinformatics/btr338] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022] Open

Jia Y, Wong TKF, Song YQ, Yiu SM, Smith DK. Refining orthologue groups at the transcript level. BMC Genomics 2010;11 Suppl 4:S11. [PMID: 21143794 PMCID: PMC3005912 DOI: 10.1186/1471-2164-11-s4-s11] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/16/2023] Open

Abstract

Background

Orthologues are genes in different species that are related through divergent evolution from a common ancestor and are expected to have similar functions. Many databases have been created to describe orthologous genes based on existing sequence data. However, alternative splicing (in eukaryotes) is usually disregarded in the determination of orthologue groups, and the functional consequences of alternative splicing have not been considered. Most multi-exon genes can encode multiple protein isoforms, which often have different functions and can be disease-related. Extending the definition of orthologue groups to take account of alternative splicing and the functional differences it causes requires further examination.

Results

A subset of the orthologous gene groups between human and mouse was selected from the InParanoid database for this study. Each orthologue group was divided into sub-clusters, at the transcript level, using a method based on the sequence similarity of the isoforms. Transcript-based sub-clusters were verified by functional signatures of the cluster members in the InterPro database. Functional similarity was higher within than between transcript-based sub-clusters of a defined orthologous group. In certain cases, cancer-related isoforms of a gene could be distinguished from other isoforms of the gene. Predictions of intrinsic disorder in protein regions were also correlated with the isoform sub-clusters within an orthologue group.

Conclusions

Sub-clustering of orthologue groups at the transcript level is an important step towards defining functionally equivalent orthologue groups more accurately. This work appears to be the first effort to refine orthologous groupings of genes based on the consequences of alternative splicing on function. Further investigation and refinement of the methodology to classify and verify isoform sub-clusters are needed, particularly to extend the technique to more distantly related species.

Wong TKF, Lam TW, Sung WK, Yiu SM. Adjacent nucleotide dependence in ncRNA and order-1 SCFG for ncRNA identification. PLoS One 2010;5. [PMID: 20927402 PMCID: PMC2946929 DOI: 10.1371/journal.pone.0012848] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 03/24/2010] [Accepted: 08/25/2010] [Indexed: 12/31/2022] Open

Wong TKF, Lam TW, Yiu SM, Wong SCK. Improving the accuracy of signal transduction pathway construction using Level-2 neighbours. Int J Bioinform Res Appl 2010;6:542-555. [PMID: 21354961 DOI: 10.1504/ijbra.2010.038736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2023]

Wong TKF, Lam TW, Chan PY, Yiu SM. Correcting short reads with high error rates for improved sequencing result. Int J Bioinform Res Appl 2009;5:224-37. [PMID: 19324607 DOI: 10.1504/ijbra.2009.024039] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]

Wong TKF, Lam TW, Yang W, Yiu SM. Finding alternative splicing patterns with strong support from expressed sequences on individual exons/introns. J Bioinform Comput Biol 2009;6:1021-33. [PMID: 18942164 DOI: 10.1142/s0219720008003825] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 10/10/2007] [Revised: 02/27/2008] [Accepted: 03/22/2008] [Indexed: 11/18/2022]

Yang W, Ng P, Zhao M, Wong TKF, Yiu SM, Lau YL. Promoter-sharing by different genes in human genome--CPNE1 and RBM12 gene pair as an example. BMC Genomics 2008;9:456. [PMID: 18831769 PMCID: PMC2568002 DOI: 10.1186/1471-2164-9-456] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 04/28/2008] [Accepted: 10/03/2008] [Indexed: 11/27/2022] Open