1
|
Towards integrated oncogenic marker recognition through mutual information-based statistically significant feature extraction: an association rule mining based study on cancer expression and methylation profiles. Quantitative Biology 2017; 5:302-327. [PMID: 30221015 DOI: 10.1007/s40484-017-0119-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 02/07/2023]
Abstract
Background Marker detection is an important task in complex disease studies. Here we provide an association rule mining (ARM) based approach for identifying integrated markers through mutual information (MI) based statistically significant feature extraction, and apply it to acute myeloid leukemia (AML) and prostate carcinoma (PC) gene expression and methylation profiles. Methods We first collect the genes having both expression and methylation values in AML as well as PC. Next, we run the Jarque-Bera normality test on the expression/methylation data to divide the whole dataset into two parts: one that follows a normal distribution and one that does not. We thus have four parts of the dataset: normally distributed expression data, normally distributed methylation data, non-normally distributed expression data, and non-normally distributed methylation data. A feature-extraction technique, "mRMR", is then applied to each part. This results in a list of top-ranked genes. Next, we apply the Welch t-test (parametric test) and the Shrink t-test (non-parametric test) on the expression/methylation data for the top selected normally distributed genes and non-normally distributed genes, respectively. We then use a recent weighted ARM method, "RANWAR", to combine all/specific resultant genes to generate top oncogenic rules along with the respective integrated markers. Finally, we perform a literature search as well as KEGG pathway and Gene Ontology (GO) analyses using the Enrichr database for in silico validation of the prioritized oncogenes as markers and for labeling the markers as existing or novel. Results The novel markers of AML are {ABCB11↑∪KRT17↓} (i.e., ABCB11 as up-regulated, & KRT17 as down-regulated), and {AP1S1-∪KRT17↓∪NEIL2-∪DYDC1↓} (i.e., AP1S1 and NEIL2 both as hypo-methylated, & KRT17 and DYDC1 both as down-regulated). The novel marker of PC is {UBIAD1¶∪APBA2‡∪C4orf31‡} (i.e., UBIAD1 as up-regulated and hypo-methylated, & APBA2 and C4orf31 both as down-regulated and hyper-methylated). Conclusion The identified novel markers might have critical roles in AML as well as PC. The approach can be applied to other complex diseases.
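The normality split and the parametric test described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the `alpha` threshold of 0.05, and the dictionary layout of `profiles` are assumptions made here, and the mRMR, Shrink t-test, and RANWAR steps are not covered. The Jarque-Bera p-value uses the asymptotic chi-square distribution with 2 degrees of freedom, whose survival function is exp(-x/2).

```python
import math
from statistics import mean, variance


def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value) for one gene's values."""
    n = len(values)
    mu = mean(values)
    # Biased (divide-by-n) central moments, as used in the JB statistic.
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("constant profile: skewness and kurtosis undefined")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Chi-square survival function with 2 degrees of freedom.
    return jb, math.exp(-jb / 2.0)


def split_by_normality(profiles, alpha=0.05):
    """Split genes into (normal, non_normal) lists by the JB p-value.

    `profiles` maps a gene name to its values across samples
    (hypothetical layout chosen for this sketch).
    """
    normal, non_normal = [], []
    for gene, values in profiles.items():
        _, p_value = jarque_bera(values)
        (normal if p_value >= alpha else non_normal).append(gene)
    return normal, non_normal


def welch_t(group_a, group_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    # Per-group squared standard errors from unbiased sample variances.
    sa = variance(group_a) / na
    sb = variance(group_b) / nb
    t_stat = (mean(group_a) - mean(group_b)) / math.sqrt(sa + sb)
    dof = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t_stat, dof
```

In practice the t statistic and degrees of freedom would be converted to a p-value with a Student t distribution from a statistics library; that step is left out so the sketch depends only on the standard library.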
|
2
|
Abstract
BACKGROUND Bone marrow plays a key role in bone formation and healing. Although a subset of marrow explants ossifies in vitro without excipient osteoinductive factors, some explants do not undergo ossification. The disparity of outcome suggests a significant heterogeneity in marrow tissue in terms of its capacity to undergo osteogenesis. QUESTIONS/PURPOSES We sought to identify: (1) proteins and signaling pathways associated with osteogenesis by contrasting the proteomes of ossified and poorly ossified marrow explants; and (2) temporal changes in proteome and signaling pathways of marrow ossification in the early and late phases of bone formation. METHODS Explants of marrow were cultured. Media conditioned by ossified (n = 4) and poorly ossified (n = 4) subsets were collected and proteins unique to each group were identified by proteomic analysis. Proteomic data were processed to assess proteins specific to the early phase (Days 1-14) and late phase (Days 15-28) of the culture period. Pathways involved in bone marrow ossification were identified through bioinformatics. RESULTS Twenty-eight proteins were unique to ossified samples and eight were unique to poorly ossified ones. Twelve proteins were expressed during the early phase and 15 proteins were specific to the late phase. Several identified pathways corroborated those reported for bone formation in the literature. Immune and inflammatory pathways were specific to ossified samples. CONCLUSIONS The marrow explant model indicates the inflammatory and immune pathways to be an integral part of the osteogenesis process.
|
3
|
Progress in research of molecular markers for hepatic oval cells. You-Lin Yu, Bao-Ming Shi. Shijie Huaren Xiaohua Zazhi 2011; 19:3610-3615. [DOI: 10.11569/wcjd.v19.i35.3610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2023]
Abstract
Hepatic stem cells have the capacity for self-renewal, proliferation and differentiation and can produce progeny cells that have the same phenotype and genotype as the parental cells. The cells originate from the foregut endoderm and exist in the form of hepatic cells in the embryonic liver, and as small oval cells (OCs) with a large nuclear/cytoplasmic ratio and special cell markers in the adult liver. Hepatic stem cells are normally in a dormant state and divide at a very slow rate. The cells are activated to proliferate quickly and transit from the quiescent phase to the proliferative phase when the liver is surgically resected or injured by drugs. In recent years, numerous studies have confirmed that hepatic OCs are hepatic stem cells that have the bipotential capability of differentiating into mature hepatocytes and biliary epithelial cells when hepatocyte proliferation is inhibited and liver regeneration is compromised. Research on the role of hepatic OCs in the management of acute and chronic liver dysfunction, advanced cirrhosis, other liver diseases, and diabetes caused by pancreatic lesions has attracted wide attention. Great efforts have been made to find and isolate hepatic OCs. This review discusses the progress in research of molecular markers for hepatic OCs.
|
4
|
Umbilical cord blood stem cells: Towards a proteomic approach. J Proteomics 2010; 73:468-82. [DOI: 10.1016/j.jprot.2009.06.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 05/13/2009] [Revised: 06/04/2009] [Accepted: 06/16/2009] [Indexed: 02/07/2023]
|
5
|
Analysis of phenotype-genotype connection: the story of dissecting disease pathogenesis in genomic era in China, and beyond. Philos Trans R Soc Lond B Biol Sci 2007; 362:1043-61. [PMID: 17327209 PMCID: PMC2435570 DOI: 10.1098/rstb.2007.2033] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/21/2022]
Abstract
DNA is the ultimate depository of biological complexity. Thus, in order to understand life and gain insights into disease pathogenesis, the genetic information embedded in the sequence of DNA base pairs comprising chromosomes should be deciphered. The stories of investigating the association between phenotype and genotype in China and other countries further demonstrate that genomics can serve as a probe for disease biology. We now know that in Mendelian disorders, one gene is not only a dictator of one phenotype but can also be a dictator of two or more distinct disorders. Dissecting the genetic abnormalities of complex diseases, including diabetes, hypertension, mental diseases, coronary heart disease and cancer, may unravel the complicated networks and crosstalk, and help to simplify the complexity of the disease. Transcriptomic and proteomic analyses in medicine not only deepen our understanding of disease pathogenesis, but also provide novel diagnostic and therapeutic strategies. Taken together, genomic research offers a new opportunity for determining how diseases occur, by taking advantage of experiments of nature and a growing array of sophisticated research tools to identify the molecular abnormalities underlying disease processes. We should be ready for the advent of genomic medicine, and put the genome into the doctor's bag, so that we can help patients to conquer diseases.
|
6
|
Insights into human CD34+ hematopoietic stem/progenitor cells through a systematically proteomic survey coupled with transcriptome. Proteomics 2006; 6:2673-92. [PMID: 16596711 DOI: 10.1002/pmic.200500032] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/10/2022]
Abstract
Hematopoietic stem cells are capable of self-renewal and differentiation into different hematopoietic lineages. To gain a comprehensive understanding of hematopoietic stem/progenitor cells, a systematic proteomic survey of human CD34+ cells collected from human umbilical cord blood was performed, in which the proteins were separated by 1- and 2-DE, as well as by nano-LC, and subsequently identified by MS. A total of 370 distinct proteins identified from those cells provided new insights into the potential of the stem/progenitor cells because nerve-, gonad-, and eye-associated proteins were reliably identified. Interestingly, the transcripts of 133 (35.9%) of the identified proteins were not found by the prevalent transcriptome approaches, although several selected transcripts could be detected by RT-PCR. Moreover, the heterogeneity of 33 proteins identified from 2-DE was attributable primarily to post-translational processes rather than to alternative splicing at the transcriptional level. Furthermore, the biosynthesis of 15 proteins identified in this study appears not to be completely interrupted even though corresponding antisense RNAs were found in the existing transcriptome data. The integrated proteomic and transcriptomic analyses employed here provided a unique view of human stem/progenitor cells.
|
7
|
Cloning, expression and characterization of a novel human CAP10-like gene hCLP46 from CD34(+) stem/progenitor cells. Gene 2006; 371:7-15. [PMID: 16524674 DOI: 10.1016/j.gene.2005.08.027] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Received: 04/12/2005] [Revised: 08/06/2005] [Accepted: 08/12/2005] [Indexed: 11/21/2022]
Abstract
A novel human gene, named human CAP10-like protein 46 kDa (hCLP46), was isolated and identified from CD34(+) cells of human acute myeloid leukemia transformed from myelodysplastic syndrome (MDS-AML). hCLP46 (3q13.33) contains 11 exons encoding a putative protein of 392 amino acids, with a highly conserved CAP10 domain, a hydrophobic signal peptide at its N-terminus, and an endoplasmic reticulum (ER) retention signal motif KTEL at the C-terminus. Homologs of hCLP46 exist in different organisms from the plant to the animal kingdoms. Subcellular localization analysis showed that hCLP46 is an ER-resident protein. hCLP46 was expressed in most human adult tissues at different intensities, with transcript lengths of 3.5 kb and 1.9 kb. The hCLP46 transcript was not detectable in colon, thymus, and small intestine, but was abundant in liver, indicating that hCLP46 may be involved in important physiological functions in the liver. U937 cells over-expressing hCLP46 had a higher growth rate than cells without exogenous hCLP46 protein expression, suggesting that the hCLP46 protein possesses the ability to promote cell proliferation.
MESH Headings
- Amino Acid Sequence
- Antigens, CD34
- Chromosomes, Human, Pair 3/genetics
- Endoplasmic Reticulum/genetics
- Gene Expression Regulation, Leukemic/genetics
- Glucosyltransferases
- Humans
- Leukemia, Myeloid, Acute/genetics
- Leukemia, Myeloid, Acute/metabolism
- Leukemia, Myeloid, Acute/pathology
- Molecular Sequence Data
- Myelodysplastic Syndromes/genetics
- Myelodysplastic Syndromes/metabolism
- Myelodysplastic Syndromes/pathology
- Neoplasm Proteins/biosynthesis
- Neoplasm Proteins/genetics
- Neoplastic Stem Cells/metabolism
- Neoplastic Stem Cells/pathology
- Organ Specificity/genetics
- Protein Structure, Tertiary/genetics
- Proteins/genetics
- Proteins/metabolism
- Sequence Homology, Amino Acid
- U937 Cells
|
8
|
Abstract
The Chinese genome project was initiated in 1993 with the goal of contributing 1% to the Human Genome Program. The study of gene expression profiles with cDNA microarrays, and large-scale sequencing and analysis of 130928 expressed sequence tags (ESTs), allowed isolation and characterization of over 1000 novel full-length human cDNAs derived from human hematopoietic stem/progenitor cells, neuroendocrine tissues, liver, and cardiovascular cells. In addition, EST sequencing for model organisms, including rat, zebrafish, Schistosoma japonicum and rice was performed, aiming at identifying genes associated with physiological and/or pathological characteristics.
|
9
|
Abstract
Significant progress in human genome research has been made in China since 1994. This review aims to give a brief and incomplete introduction to the major research institutions and their achievements in human genome sequencing and functional genomics in medicine, with emphasis on the "1% Sequencing Project", the generation of single nucleotide polymorphism and haplotype maps of the human genome, disease gene identification, and the molecular characterization of leukemia and other diseases. Chinese efforts towards the sequencing of pathogenic microbial genomes and of the rice (Oryza sativa ssp. Indica) genome are also described.
|
10
|
Gene-expression profiling of CD34+ cells from various hematopoietic stem-cell sources reveals functional differences in stem-cell activity. J Leukoc Biol 2003; 75:314-23. [PMID: 14634063 DOI: 10.1189/jlb.0603287] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.3] [Indexed: 11/24/2022]
Abstract
The replacement of bone marrow (BM) as a conventional source of stem cell (SC) by umbilical cord blood (UCB) and granulocyte-colony stimulating factor-mobilized peripheral blood SC (PBSC) has brought about clinical advantages. However, several studies have demonstrated that UCB CD34(+) cells and PBSC significantly differ from BM CD34(+) cells qualitatively and quantitatively. Here, we quantified the number of SC in purified BM, UCB CD34(+) cells, and CD34(+) PBSC using in vitro and in vivo assays for human hematopoietic SC (HSC) activity. A cobblestone area-forming cell (CAFC) assay showed that UCB CD34(+) cells contained the highest frequency of CAFC(wk6) (3.6- to tenfold higher than BM CD34(+) cells and PBSC, respectively), and the engraftment capacity in vivo by nonobese diabetic/severe combined immunodeficiency repopulation assay was also significantly greater than BM CD34(+), with a higher proportion of CD45(+) cells detected in the recipients at a lower cell dose. To understand the molecular characteristics underlying these functional differences, we performed several DNA microarray experiments using Affymetrix gene chips, containing 12,600 genes. Comparative analysis of gene-expression profiles showed differential expression of 51 genes between BM and UCB CD34(+) SC and 64 genes between BM CD34(+) cells and PBSC. These genes are involved in proliferation, differentiation, apoptosis, and engraftment capacity of SC. Thus, the molecular expression profiles reported here confirmed functional differences observed among the SC sources. Moreover, this report provides new insights to describe the molecular phenotype of CD34(+) HSC and leads to a better understanding of the discrepancy among the SC sources.
|
11
|
Abstract
The myelodysplastic syndromes (MDS) are characterized by hemopoietic insufficiency associated with cytopenias leading to serious morbidity plus the additional risk of leukemic transformation. Therapeutic dilemmas exist in MDS because of the disease's multifactorial pathogenetic features, heterogeneous stages, and the patients' generally elderly ages. Underlying the cytopenias and evolutionary potential in MDS are innate stem cell lesions, cellular/cytokine-mediated stromal defects, and immunologic derangements. This article reviews the developing understanding of biologic and molecular lesions in MDS and recently available biospecific drugs that are potentially capable of abrogating these abnormalities. Dr. Peter Greenberg's discussion centers on decision-making approaches for these therapeutic options, considering the patient's clinical factors and risk-based prognostic category. One mechanism underlying the marrow failure present in a portion of MDS patients is immunologic attack on the hemopoietic stem cells. Considerable overlap exists between aplastic anemia, paroxysmal nocturnal hemoglobinuria, and subsets of MDS. Common or intersecting pathophysiologic mechanisms appear to underlie hemopoietic cell destruction and genetic instability, which are characteristic of these diseases. Treatment results and new therapeutic strategies using immune modulation, as well as the role of the immune system in possible mechanisms responsible for genetic instability in MDS, will be the subject of discussion by Dr. Neal Young. A common morphological change found within MDS marrow cells, most sensitively demonstrated by electron microscopy, is the presence of ringed sideroblasts. Such assessment shows that this abnormal mitochondrial iron accumulation is not confined to the refractory anemia with ring sideroblast (RARS) subtype of MDS and may also contribute to numerous underlying MDS pathophysiological processes. Generation of abnormal sideroblast formation appears to be due to malfunction of the mitochondrial respiratory chain, attributable to mutations of mitochondrial DNA, to which aged individuals are most vulnerable. Such dysfunction leads to accumulation of toxic ferric iron in the mitochondrial matrix. Understanding the broad biologic consequences of these derangements is the focus of the discussion by Dr. Norbert Gattermann.
|
12
|
Insight into hepatocellular carcinogenesis at transcriptome level by comparing gene expression profiles of hepatocellular carcinoma with those of corresponding noncancerous liver. Proc Natl Acad Sci U S A 2001; 98:15089-94. [PMID: 11752456 PMCID: PMC64988 DOI: 10.1073/pnas.241522398] [Citation(s) in RCA: 272] [Impact Index Per Article: 11.8] [Indexed: 02/06/2023]
Abstract
Human hepatocellular carcinoma (HCC) is one of the most common cancers worldwide. In this work, we report on a comprehensive characterization of gene expression profiles of hepatitis B virus-positive HCC through the generation of a large set of 5'-read expressed sequence tag (EST) clusters (11,065 in total) from HCC and noncancerous liver samples, which then were applied to a cDNA microarray system containing 12,393 genes/ESTs and to comparison with a public database. The commercial cDNA microarray, which contains 1,176 known genes related to oncogenesis, was used also for profiling gene expression. Integrated data from the above approaches identified 2,253 genes/ESTs as candidates with differential expression. A number of genes related to oncogenesis and hepatic function/differentiation were selected for further semiquantitative reverse transcriptase-PCR analysis in 29 paired HCC/noncancerous liver samples. Many genes involved in cell cycle regulation such as cyclins, cyclin-dependent kinases, and cell cycle negative regulators were deregulated in most patients with HCC. Aberrant expression of the Wnt-beta-catenin pathway and enzymes for DNA replication also could contribute to the pathogenesis of HCC. The alteration of transcription levels was noted in a large number of genes implicated in metabolism, whereas a profile change of others might represent a status of dedifferentiation of the malignant hepatocytes, both considered as potential markers of diagnostic value. Notably, the altered transcriptome profiles in HCC could be correlated to a number of chromosome regions with amplification or loss of heterozygosity, providing one of the underlying causes of the transcription anomaly of HCC.
|
13
|
The pattern of gene expression in human CD34(+) stem/progenitor cells. Proc Natl Acad Sci U S A 2001; 98:13966-71. [PMID: 11717454 PMCID: PMC61150 DOI: 10.1073/pnas.241526198] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.1] [Accepted: 10/04/2001] [Indexed: 11/18/2022]
Abstract
We have analyzed the pattern of gene expression in human primary CD34(+) stem/progenitor cells. We identified 42,399 unique serial analysis of gene expression (SAGE) tags among 106,021 SAGE tags collected from 2.5 x 10(6) CD34(+) cells purified from bone marrow. Of these unique SAGE tags, 21,546 matched known expressed sequences, including 3,687 known genes, and 20,854 were novel without a match. The SAGE tags that matched known sequences tended to be at higher levels, whereas the novel SAGE tags tended to be at lower levels. By using the generation of longer sequences from SAGE tags for gene identification (GLGI) method, we identified the correct gene for 385 of 440 high-copy SAGE tags that matched multiple genes and we generated 198 novel 3' expressed sequence tags from 138 high-copy novel SAGE tags. We observed that many different SAGE tags were derived from the same genes, reflecting the high heterogeneity of the 3' untranslated region in the expressed genes. We compared the quantitative relationship for genes known to be important in hematopoiesis. The qualitative identification and quantitative measure for each known gene, expressed sequence tag, and novel SAGE tag provide a base for studying normal gene expression in hematopoietic stem/progenitor cells and for studying abnormal gene expression in hematopoietic diseases.
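The tag bookkeeping described above (total tags, unique tags, tags matching known sequences versus novel tags, and their relative abundance) can be illustrated with a short sketch. The function name, the input layout, and the toy tag strings in the usage note are hypothetical and chosen only for illustration; this is not the pipeline used in the study.

```python
from collections import Counter


def summarize_sage_tags(tags, known_tags):
    """Tally SAGE tags and compare matched versus novel tag abundance.

    `tags` is an iterable of observed tag sequences (one entry per copy);
    `known_tags` is a set of tag sequences that match known expressed
    sequences. Both layouts are assumptions made for this sketch.
    """
    counts = Counter(tags)
    matched = [c for tag, c in counts.items() if tag in known_tags]
    novel = [c for tag, c in counts.items() if tag not in known_tags]

    def average(copies):
        # Mean copy number per unique tag; 0.0 when the group is empty.
        return sum(copies) / len(copies) if copies else 0.0

    return {
        "total_tags": sum(counts.values()),
        "unique_tags": len(counts),
        "matched_unique": len(matched),
        "novel_unique": len(novel),
        "mean_copies_matched": average(matched),
        "mean_copies_novel": average(novel),
    }
```

Calling `summarize_sage_tags(["AAAA", "AAAA", "CCCC"], {"AAAA"})` would report three total tags, two unique tags, and a higher mean copy number for the matched tag than for the novel one, which is the kind of comparison the abstract summarizes.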
|
14
|
Cloning and functional analysis of cDNAs with open reading frames for 300 previously undefined genes expressed in CD34+ hematopoietic stem/progenitor cells. Genome Res 2000; 10:1546-60. [PMID: 11042152 PMCID: PMC310934 DOI: 10.1101/gr.140200] [Citation(s) in RCA: 138] [Impact Index Per Article: 5.8] [Received: 03/09/2000] [Accepted: 07/19/2000] [Indexed: 11/24/2022]
Abstract
Three hundred cDNAs containing putatively entire open reading frames (ORFs) for previously undefined genes were obtained from CD34+ hematopoietic stem/progenitor cells (HSPCs), based on EST cataloging, clone sequencing, in silico cloning, and rapid amplification of cDNA ends (RACE). The cDNA sizes ranged from 360 to 3496 bp and their ORFs coded for peptides of 58-752 amino acids. A public database search indicated that 225 cDNAs exhibited sequence similarities to genes identified across a variety of species. Homology analysis led to the recognition of 50 basic structural motifs/domains among these cDNAs. Genomic exon-intron organization could be established in 243 genes by integration of cDNA data with genome sequence information. Interestingly, a new gene named HSPC070 on 3p was found to share a sequence of 105 bp in the 3' UTR with the RAF gene in reversed transcription orientation. Chromosomal localizations were obtained using electronic mapping for 192 genes and with radiation hybrid (RH) mapping for 38 genes. A macroarray technique was applied to screen the gene expression patterns in five hematopoietic cell lines (NB4, HL60, U937, K562, and Jurkat) and a number of genes with differential expression were found. This resource work has provided a wide range of information useful not only for expression genomics and annotation of genomic DNA sequence, but also for further research on the function of genes involved in hematopoietic development and differentiation.
|