1
Li L. An intelligent model to decode students' behavioral states in physical education using back propagation neural network and Hidden Markov Model. BMC Psychol 2024; 12:249. [PMID: 38711093 PMCID: PMC11071333 DOI: 10.1186/s40359-024-01743-4] [Received: 01/16/2024] [Accepted: 04/21/2024] [Indexed: 05/08/2024] Open
Abstract
This paper highlights the need for intelligent analysis of students' behavioral states in physical education tasks. The hand-ring inertial data is used to identify students' motion sequence states. First, statistical feature extraction is performed based on the acceleration and angular velocity data collected from the bracelet. After completing the filtering and noise reduction of the data, we perform feature extraction by Back Propagation Neural Network (BPNN) and use the sliding window method for analysis. Finally, the classification capability of the model sequence is enhanced by the Hidden Markov Model (HMM). The experimental results indicate that the classification accuracy of student action sequences in physical education exceeds 96% after optimization by the HMM method. This provides intelligent means and new ideas for future student state recognition in physical education and teaching reform.
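The last step described in this abstract, using an HMM to stabilise a sequence of per-window classifier outputs, can be sketched roughly as follows. This is an illustration only, not the paper's implementation: the two state names, the transition matrix and the per-window class probabilities (standing in for BPNN outputs) are invented for the example, and the classifier probabilities are used directly as emission scores, which is a common simplification.

```python
import math


def viterbi_smooth(window_probs, transition, initial):
    """Most likely state sequence given per-window class probabilities.

    window_probs[t][s] is treated as the emission score of state s at
    window t (an assumption; it stands in for a classifier output).
    All probabilities must be strictly positive because logs are taken.
    """
    n_states = len(initial)
    # Work in the log domain to avoid underflow on long sequences.
    score = [math.log(initial[s]) + math.log(window_probs[0][s])
             for s in range(n_states)]
    backptr = []
    for probs in window_probs[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor for state s at this window.
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + math.log(transition[p][s]))
            pointers.append(best_prev)
            new_score.append(score[best_prev]
                             + math.log(transition[best_prev][s])
                             + math.log(probs[s]))
        score = new_score
        backptr.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backptr):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical setup: state 0 = "running", state 1 = "resting".
STICKY = [[0.9, 0.1], [0.1, 0.9]]  # assumed: activities rarely switch
START = [0.5, 0.5]
# Assumed classifier outputs; the third window is an isolated outlier.
PROBS = [[0.8, 0.2], [0.7, 0.3], [0.4, 0.6], [0.8, 0.2], [0.9, 0.1]]

raw_labels = [max(range(2), key=lambda s: p[s]) for p in PROBS]
smoothed = viterbi_smooth(PROBS, STICKY, START)
```

With these toy numbers the per-window argmax flips to state 1 for a single window, while the decoded sequence stays in state 0 throughout, which is the kind of sequence-level correction the abstract attributes to the HMM stage.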
Affiliation(s)
- Liyan Li
  - Sports Department, Luoyang Normal University, Luoyang, 471934, Henan, China
2
Pu H, Wang C, Yu T, Chen X, Li G, Zhu D, Pan X, Wang Y. A synergistic strategy based on active hydroxymethyl amine compounds and fucoidan for bioprosthetic heart valves with enhancing anti-coagulation and anti-calcification properties. Int J Biol Macromol 2024; 266:130715. [PMID: 38462108 DOI: 10.1016/j.ijbiomac.2024.130715] [Received: 12/08/2023] [Revised: 02/04/2024] [Accepted: 03/05/2024] [Indexed: 03/12/2024]
Abstract
With an aging population, the number of patients with valvular heart disease (VHD) is growing worldwide, and valve replacement is a primary choice for patients with severe valvular disease. Among replacement valves, bioprosthetic heart valves (BHVs), especially those delivered through transcatheter aortic valve replacement, are widely accepted by patients on account of their good hemodynamics and biocompatibility. Commercial BHVs in clinical use are prepared from glutaraldehyde-crosslinked pericardial tissue, which carries a risk of calcification and thrombotic complications. In the present study, a strategy that combines improved hemocompatibility and anti-calcification properties for BHVs has been developed based on a novel non-glutaraldehyde BHV crosslinker, hexakis(hydroxymethyl)melamine (HMM), and the anticoagulant fucoidan. Besides showing similar mechanical properties and enhanced component stability compared to glutaraldehyde-crosslinked PP (G-PP), the fucoidan-modified HMM-crosslinked PPs (HMM-Fu-PPs) also exhibit significantly enhanced anticoagulation performance, with a 72% decrease in thrombus weight compared with G-PP in an ex vivo shunt assay, along with superior biocompatibility and satisfactory anti-calcification properties confirmed by subcutaneous implantation. Owing to the good overall performance of these HMM-Fu-PPs, this simple and feasible strategy may offer great potential for BHV fabrication in the future and open a new avenue for exploring more N-hydroxymethyl compound-based crosslinkers with excellent performance in the field of biomaterials.
Affiliation(s)
- Hongxia Pu
  - National Engineering Research Center for Biomaterials and College of Biomedical Engineering, Sichuan University, Chengdu, China
- Canyu Wang
  - National Engineering Research Center for Biomaterials and College of Biomedical Engineering, Sichuan University, Chengdu, China
- Tao Yu
  - Frontiers Medical Center, Tianfu Jincheng Laboratory, Chengdu, China
- Xiaotong Chen
  - National Engineering Research Center for Biomaterials and College of Biomedical Engineering, Sichuan University, Chengdu, China
- Gaocan Li
  - National Engineering Research Center for Biomaterials and College of Biomedical Engineering, Sichuan University, Chengdu, China
- Da Zhu
  - Department of Structure Heart Center, Fuwai Yunnan Cardiovascular Hospital, Kunming, China
- Xiangbin Pan
  - Department of Structure Heart Center, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China; Department of Structure Heart Center, Fuwai Yunnan Cardiovascular Hospital, Kunming, China
- Yunbing Wang
  - National Engineering Research Center for Biomaterials and College of Biomedical Engineering, Sichuan University, Chengdu, China
3
Baranowski B, Krysińska M, Gradowski M. KINtaro: protein kinase-like database. BMC Res Notes 2024; 17:50. [PMID: 38365785 PMCID: PMC10870513 DOI: 10.1186/s13104-024-06713-y] [Received: 10/21/2023] [Accepted: 02/01/2024] [Indexed: 02/18/2024] Open
Abstract
OBJECTIVE The superfamily of protein kinases features a common Protein Kinase-like (PKL) three-dimensional fold. Proteins with PKL structure can also possess enzymatic activities other than protein phosphorylation, such as AMPylation or glutamylation. PKL proteins play a vital role in the world of living organisms, contributing to the survival of pathogenic bacteria inside host cells, as well as being involved in carcinogenesis and neurological diseases in humans. The superfamily of PKL proteins is constantly growing. Therefore, it is crucial to gather new information about PKL families. RESULTS To this end, the KINtaro database ( http://bioinfo.sggw.edu.pl/kintaro/ ) has been created as a resource for collecting and sharing such information. KINtaro combines protein sequence information and additional annotations for more than 70 PKL families, including 32 families not associated with PKL superfamily in established protein domain databases. KINtaro is searchable by keywords and by protein sequence and provides family descriptions, sequences, sequence alignments, HMM models, 3D structure models, experimental structures with PKL domain annotations and sequence logos with catalytic residue annotations.
Affiliation(s)
- Bartosz Baranowski
  - Laboratory of Plant Pathogenesis, Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland
- Marianna Krysińska
  - Department of Biochemistry and Microbiology, Warsaw University of Life Sciences (SGGW), Warsaw, Poland
- Marcin Gradowski
  - Department of Biochemistry and Microbiology, Warsaw University of Life Sciences (SGGW), Warsaw, Poland
4
El Ghazi D, Miere A, Crincoli E, Le HM, Souied EH. In vivo cone-photoreceptor density comparison between eyes with subretinal drusenoid deposits and healthy eyes using high magnification imaging. Int Ophthalmol 2024; 44:82. [PMID: 38358437 DOI: 10.1007/s10792-024-03023-x] [Received: 07/17/2023] [Accepted: 01/11/2024] [Indexed: 02/16/2024]
Abstract
PURPOSE To compare photoreceptor density automated quantification in eyes with subretinal drusenoid deposits (SDD) and healthy controls using Heidelberg Spectralis High Magnification Module (HMM) imaging. METHODS Twelve eyes of 6 patients with intermediate AMD, presenting with SDD were included, as well as twelve eyes of healthy controls. Individual dot SDD within the central 30° retina were examined with infrared confocal laser ophthalmoscopy, HMM, and spectral-domain optical coherence tomography (SD-OCT). Photoreceptor density analysis was performed on the best-quality image using the ImageJ Foci Picker plugin, after the removal of SDD from the HMM image. Correlations were made between the HMM quantified photoreceptor density, SD-OCT characteristics, stage, and number of SDD. RESULTS Mean age was 75.17 ± 2.51 years in the SDD group (3 males, 3 females) versus 73.17 ± 3.15 years in the healthy control group (p = 0.2). Defects in the overlying ellipsoid zone were present on SD-OCT in 8/12 (66.66%) eyes. The mean ± standard deviation foci detected (i.e., cone photoreceptors) was 7123.75 ± 3683.32 foci/mm² in the SDD group versus 13,253 ± 3331.00 foci/mm² in the healthy control group (p = 0.0003). The number of SDD was associated with a reduction in foci density, p = 0.0055, r = -0.7622. CONCLUSION The decreased cone density in eyes with SDD may correlate with a decrease in retinal function in intermediate AMD eyes independent of neovascular complications or outer retinal pigment epithelial atrophy.
Affiliation(s)
- Djazia El Ghazi
  - Department of Ophthalmology, Center Hospitalier Intercommunal de Créteil, University Paris Est Créteil, 40 avenue de Verdun, 94010, Créteil, France
- Alexandra Miere
  - Department of Ophthalmology, Center Hospitalier Intercommunal de Créteil, University Paris Est Créteil, 40 avenue de Verdun, 94010, Créteil, France
- Emanuele Crincoli
  - Department of Ophthalmology, Center Hospitalier Intercommunal de Créteil, University Paris Est Créteil, 40 avenue de Verdun, 94010, Créteil, France
- Hoang Mai Le
  - Department of Ophthalmology, Center Hospitalier Intercommunal de Créteil, University Paris Est Créteil, 40 avenue de Verdun, 94010, Créteil, France
- Eric H Souied
  - Department of Ophthalmology, Center Hospitalier Intercommunal de Créteil, University Paris Est Créteil, 40 avenue de Verdun, 94010, Créteil, France
5
Safi K, Aly WHF, Kanj H, Khalifa T, Ghedira M, Hutin E. Hidden Markov Model for Parkinson's Disease Patients Using Balance Control Data. Bioengineering (Basel) 2024; 11:88. [PMID: 38247965 PMCID: PMC10813155 DOI: 10.3390/bioengineering11010088] [Received: 12/22/2023] [Revised: 01/09/2024] [Accepted: 01/15/2024] [Indexed: 01/23/2024] Open
Abstract
Understanding the behavior of the human postural system has become a very attractive topic for many researchers. This system plays a crucial role in maintaining balance during both stationary and moving states. Parkinson's disease (PD) is a prevalent degenerative movement disorder that significantly impacts human stability, leading to falls and injuries. This research introduces an innovative approach that utilizes a hidden Markov model (HMM) to distinguish healthy individuals and those with PD. Interestingly, this methodology employs raw data obtained from stabilometric signals without any preprocessing. The dataset used for this study comprises 60 subjects divided into healthy and PD patients. Impressively, the proposed method achieves an accuracy rate of up to 98% in effectively differentiating healthy subjects from those with PD.
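The abstract does not say how the HMM is turned into a classifier. One common scheme, assumed here purely for illustration, is to fit one HMM per class and assign a recording to the class whose model gives it the higher likelihood. The sketch below uses the scaled forward algorithm on discrete symbols; the two models, their parameters and the symbol coding (0 = small sway, 1 = large sway) are hypothetical and are not taken from the paper, which works on raw stabilometric signals.

```python
import math


def log_likelihood(obs, initial, transition, emission):
    """Scaled forward algorithm: returns log P(obs | model) for a
    discrete-observation HMM given as plain nested lists."""
    n = len(initial)
    alpha = [initial[s] * emission[s][obs[0]] for s in range(n)]
    total = sum(alpha)
    log_prob = math.log(total)
    alpha = [a / total for a in alpha]  # rescale to avoid underflow
    for symbol in obs[1:]:
        alpha = [sum(alpha[p] * transition[p][s] for p in range(n))
                 * emission[s][symbol] for s in range(n)]
        total = sum(alpha)
        log_prob += math.log(total)
        alpha = [a / total for a in alpha]
    return log_prob


def classify(obs, models):
    """Pick the class whose HMM explains the sequence best."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical per-class models: (initial, transition, emission).
MODELS = {
    "healthy": ([0.9, 0.1],
                [[0.9, 0.1], [0.5, 0.5]],
                [[0.9, 0.1], [0.4, 0.6]]),
    "pd": ([0.5, 0.5],
           [[0.6, 0.4], [0.3, 0.7]],
           [[0.6, 0.4], [0.1, 0.9]]),
}

steady = classify([0, 0, 0, 0, 0, 0], MODELS)
unsteady = classify([1, 1, 1, 1, 1, 1], MODELS)
```

In practice the model parameters would be estimated from training recordings (for example with Baum-Welch) rather than written by hand as they are here.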
Affiliation(s)
- Khaled Safi
  - Computer Science Department, Jinan University, Tripoli P.O. Box 818, Lebanon
- Wael Hosny Fouad Aly
  - College of Engineering and Technology, American University of the Middle East, Egaila 54200, Kuwait
- Hassan Kanj
  - College of Engineering and Technology, American University of the Middle East, Egaila 54200, Kuwait
- Tarek Khalifa
  - College of Engineering and Technology, American University of the Middle East, Egaila 54200, Kuwait
- Mouna Ghedira
  - Laboratory Analysis and Restoration of Movement (ARM), Henri Mondor University Hospitals, Assistance Publique-Hôpitaux de Paris, 94000 Créteil, France
- Emilie Hutin
  - Laboratory Analysis and Restoration of Movement (ARM), Henri Mondor University Hospitals, Assistance Publique-Hôpitaux de Paris, 94000 Créteil, France
6
Zhang M, Fu X, Gu R, Zhao B, Zhao X, Song H, Zheng H, Xu J, Bai W. A novel starch-active lytic polysaccharide monooxygenase discovered with bioinformatics screening and its application in textile desizing. BMC Biotechnol 2024; 24:2. [PMID: 38200466 PMCID: PMC10782670 DOI: 10.1186/s12896-023-00826-1] [Received: 10/27/2023] [Accepted: 12/18/2023] [Indexed: 01/12/2024] Open
Abstract
BACKGROUND Lytic polysaccharide monooxygenases (LPMOs), which catalyze the oxidative cleavage of different types of polysaccharides, have the potential to be used in various industries. However, the AA13 family of LPMOs, which specifically act on starch substrates, has relatively fewer members than the AA9 and AA10 families, which limits its application range. Amylase has been used in the enzymatic desizing of cotton fabric for half a century, and new auxiliary enzymes are urgently needed to improve reaction efficiency and reduce cost so as to promote its application in the textile industry. RESULTS A total of 380 unannotated new genes that probably encode AA13 family LPMOs were discovered by hidden Markov model scanning in this study. Ten of them were successfully heterologously overexpressed. AlLPMO13, which had the highest activity, was purified, and its optimum pH and temperature were determined to be pH 5.0 and 50 °C. It also showed different oxidative activities on different substrates (modified corn starch > amylose > amylopectin > corn starch). In the enzymatic textile desizing application, the best combination of amylase (5 g/L), AlLPMO13 (5 mg/L), and H2O2 (3 g/L) increased the desizing level by 3 grades and the capillary effect by more than 20% compared with treatment by amylase alone. CONCLUSION The hidden Markov model constructed from 34 AA13 family LPMOs proved to be a valid bioinformatics tool for discovering novel starch-active LPMOs. The novel enzyme AlLPMO13 has strong development potential in the enzymatic textile industry in terms of both economy and application effect.
Grants
- 145209322 Heilongjiang Province Fundamental Research Funds
- 2021YFC2100405 National Key Research and Development Program of China
- TSBICIP-CXRC-037, TSBICIP-KJGG-009-0202, and TSBICIP-PTJJ-007-13 Tianjin Synthetic Biotechnology Innovation Capacity Improvement Project
Affiliation(s)
- Meijuan Zhang
  - College of Life Science and Agriculture Forestry, Qiqihar University, Qiqihar, 161006, China
- Xiaoping Fu
  - Key Laboratory of Engineering Biology for Low-Carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
  - Industrial Enzymes National Engineering Research Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
  - National Center of Technology Innovation for Synthetic Biology, Tianjin, 300308, China
- Rongrong Gu
  - Industrial Enzymes National Engineering Research Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
- Bohua Zhao
  - Key Laboratory of Engineering Biology for Low-Carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
  - Industrial Enzymes National Engineering Research Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
  - National Center of Technology Innovation for Synthetic Biology, Tianjin, 300308, China
- Xingya Zhao
  - Industrial Enzymes National Engineering Research Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
  - National Center of Technology Innovation for Synthetic Biology, Tianjin, 300308, China
- Hui Song
  - Industrial Enzymes National Engineering Research Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
  - National Center of Technology Innovation for Synthetic Biology, Tianjin, 300308, China
- Hongchen Zheng
  - Key Laboratory of Engineering Biology for Low-Carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
  - Industrial Enzymes National Engineering Research Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
  - National Center of Technology Innovation for Synthetic Biology, Tianjin, 300308, China
- Jianyong Xu
  - Industrial Enzymes National Engineering Research Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
  - National Center of Technology Innovation for Synthetic Biology, Tianjin, 300308, China
- Wenqin Bai
  - Key Laboratory of Engineering Biology for Low-Carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
  - Industrial Enzymes National Engineering Research Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, 300308, China
  - National Center of Technology Innovation for Synthetic Biology, Tianjin, 300308, China
7
Zhang H, Dierkes RF, Perez-Garcia P, Costanzi E, Dittrich J, Cea PA, Gurschke M, Applegate V, Partus K, Schmeisser C, Pfleger C, Gohlke H, Smits SHJ, Chow J, Streit WR. The metagenome-derived esterase PET40 is highly promiscuous and hydrolyses polyethylene terephthalate (PET). FEBS J 2024; 291:70-91. [PMID: 37549040 DOI: 10.1111/febs.16924] [Received: 01/31/2023] [Revised: 07/24/2023] [Accepted: 08/07/2023] [Indexed: 08/09/2023]
Abstract
Polyethylene terephthalate (PET) is a widely used synthetic polymer and is known to contaminate marine and terrestrial ecosystems. Only a few PET-active microorganisms and enzymes (PETases) are currently known, and it is debated whether degradation activity for PET originates from promiscuous enzymes with broad substrate spectra that primarily act on natural polymers or other bulky substrates, or whether microorganisms evolved their genetic makeup to accept PET as a carbon source. Here, we present a predicted diene lactone hydrolase designated PET40, which acts on a broad spectrum of substrates, including PET. It is the first esterase with activity on PET from a GC-rich Gram-positive Amycolatopsis species belonging to the Pseudonocardiaceae (Actinobacteria). It is highly conserved within the genera Amycolatopsis and Streptomyces. PET40 was identified by sequence-based metagenome search using a PETase-specific hidden Markov model. Besides acting on PET, PET40 has a versatile substrate spectrum, hydrolyzing δ-lactones, β-lactam antibiotics, the polyester-polyurethane Impranil® DLN, and various para-nitrophenyl ester substrates. Molecular docking suggests that the PET degradative activity is likely a result of the promiscuity of PET40, as potential binding modes were found for substrates encompassing mono(2-hydroxyethyl) terephthalate, bis(2-hydroxyethyl) terephthalate, and a PET trimer. We also solved the crystal structure of the inactive PET40 variant S178A to 1.60 Å resolution. PET40 is active over a wide pH range (pH 4-10) and temperature range (4-65 °C) and is remarkably stable in the presence of 5% SDS, making it a promising starting point for further investigation and optimization.
Affiliation(s)
- Hongli Zhang
  - Department of Microbiology and Biotechnology, University of Hamburg, Germany
- Robert F Dierkes
  - Department of Microbiology and Biotechnology, University of Hamburg, Germany
- Pablo Perez-Garcia
  - Department of Microbiology and Biotechnology, University of Hamburg, Germany
  - Molecular Microbiology, Institute for General Microbiology, Kiel University, Germany
- Elisa Costanzi
  - Center for Structural Studies, Heinrich Heine University, Düsseldorf, Germany
- Jonas Dittrich
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University, Düsseldorf, Germany
- Pablo A Cea
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University, Düsseldorf, Germany
- Marno Gurschke
  - Department of Microbiology and Biotechnology, University of Hamburg, Germany
- Violetta Applegate
  - Center for Structural Studies, Heinrich Heine University, Düsseldorf, Germany
- Kristina Partus
  - Department of Microbiology and Biotechnology, University of Hamburg, Germany
- Christel Schmeisser
  - Department of Microbiology and Biotechnology, University of Hamburg, Germany
- Christopher Pfleger
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University, Düsseldorf, Germany
- Holger Gohlke
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University, Düsseldorf, Germany
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics), John von Neumann Institute for Computing and Jülich Supercomputing Centre, Forschungszentrum Jülich GmbH, Germany
- Sander H J Smits
  - Center for Structural Studies, Heinrich Heine University, Düsseldorf, Germany
  - Institute of Biochemistry, Heinrich Heine University, Düsseldorf, Germany
- Jennifer Chow
  - Department of Microbiology and Biotechnology, University of Hamburg, Germany
- Wolfgang R Streit
  - Department of Microbiology and Biotechnology, University of Hamburg, Germany
8
Richardson MO, Eddy SR. ORFeus: a computational method to detect programmed ribosomal frameshifts and other non-canonical translation events. BMC Bioinformatics 2023; 24:471. [PMID: 38093195 PMCID: PMC10720069 DOI: 10.1186/s12859-023-05602-8] [Received: 04/24/2023] [Accepted: 12/05/2023] [Indexed: 12/17/2023] Open
Abstract
BACKGROUND In canonical protein translation, ribosomes initiate translation at a specific start codon, maintain a single reading frame throughout elongation, and terminate at the first in-frame stop codon. However, ribosomal behavior can deviate at each of these steps, sometimes in a programmed manner. Certain mRNAs contain sequence and structural elements that cause ribosomes to begin translation at alternative start codons, shift reading frame, read through stop codons, or reinitiate on the same mRNA. These processes represent important translational control mechanisms that can allow an mRNA to encode multiple functional protein products or regulate protein expression. The prevalence of these events remains uncertain, due to the difficulty of systematic detection. RESULTS We have developed a computational model to infer non-canonical translation events from ribosome profiling data. CONCLUSION ORFeus identifies known examples of alternative open reading frames and recoding events across different organisms and enables transcriptome-wide searches for novel events.
Affiliation(s)
- Mary O Richardson
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
- Sean R Eddy
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
9
Xu X, Bhalla N, Ståhl P, Jaldén J. Lokatt: a hybrid DNA nanopore basecaller with an explicit duration hidden Markov model and a residual LSTM network. BMC Bioinformatics 2023; 24:461. [PMID: 38062356 PMCID: PMC10704643 DOI: 10.1186/s12859-023-05580-x] [Received: 09/12/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND Basecalling long DNA sequences is a crucial step in nanopore-based DNA sequencing protocols. In recent years, the CTC-RNN model has become the leading basecalling model, supplanting preceding hidden Markov models (HMMs) that relied on pre-segmenting ion current measurements. However, the CTC-RNN model operates independently of prior biological and physical insights. RESULTS We present a novel basecaller named Lokatt: explicit duration Markov model and residual-LSTM network. It leverages an explicit duration HMM (EDHMM) designed to model the nanopore sequencing processes. Trained on a newly generated library with methylation-free E. coli samples and MinION R9.4.1 chemistry, the Lokatt basecaller achieves a median single-read identity score of 0.930 and a genome coverage ratio of 99.750%, on par with the existing state-of-the-art structure when trained on the same datasets. CONCLUSION Our research underlines the potential of incorporating prior knowledge into the basecalling processes, particularly through integrating HMMs and recurrent neural networks. The Lokatt basecaller showcases the efficacy of a hybrid approach, emphasizing its capacity to achieve high-quality basecalling performance while accommodating the nuances of nanopore sequencing. These outcomes pave the way for advanced basecalling methodologies, with potential implications for enhancing the accuracy and efficiency of nanopore-based DNA sequencing protocols.
Affiliation(s)
- Xuechun Xu
  - Division of Information Science and Engineering, KTH Royal Institute of Technology, 11428, Stockholm, Sweden
- Nayanika Bhalla
  - Department of Gene Technology, Science for Life Laboratory, KTH Royal Institute of Technology, Solna, 17165, Stockholm, Sweden
- Patrik Ståhl
  - Department of Gene Technology, Science for Life Laboratory, KTH Royal Institute of Technology, Solna, 17165, Stockholm, Sweden
- Joakim Jaldén
  - Division of Information Science and Engineering, KTH Royal Institute of Technology, 11428, Stockholm, Sweden
10
Singleton M, Eisen M. Leveraging genomic redundancy to improve inference and alignment of orthologous proteins. G3 (Bethesda) 2023; 13:jkad222. [PMID: 37770067 DOI: 10.1093/g3journal/jkad222] [Received: 07/28/2023] [Revised: 09/11/2023] [Accepted: 09/19/2023] [Indexed: 10/03/2023]
Abstract
Identifying protein sequences with common ancestry is a core task in bioinformatics and evolutionary biology. However, methods for inferring and aligning such sequences in annotated genomes have not kept pace with the increasing scale and complexity of the available data. Thus, in this work, we implemented several improvements to the traditional methodology that more fully leverage the redundancy of closely related genomes and the organization of their annotations. Two highlights include the application of the more flexible k-clique percolation algorithm for identifying clusters of orthologous proteins and the development of a novel technique for removing poorly supported regions of alignments with a phylogenetic hidden Markov model (phylo-HMM). In making the latter, we wrote a fully documented Python package Homomorph that implements standard HMM algorithms and created a set of tutorials to promote its use by a wide audience. We applied the resulting pipeline to a set of 33 annotated Drosophila genomes, generating 22,813 orthologous groups and 8,566 high-quality alignments.
Affiliation(s)
- Marc Singleton
  - Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA 94720, USA
- Michael Eisen
  - Howard Hughes Medical Institute, University of California Berkeley, Berkeley, CA 94720, USA
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA 94720, USA
11
Long Z, Liu X, Niu Y, Shang H, Lu H, Zhang J, Yao L. Improved dynamic functional connectivity estimation with an alternating hidden Markov model. Cogn Neurodyn 2023; 17:1381-1398. [PMID: 37786659 PMCID: PMC10542089 DOI: 10.1007/s11571-022-09874-3] [Received: 03/11/2022] [Revised: 08/08/2022] [Accepted: 08/13/2022] [Indexed: 11/06/2022] Open
Abstract
Dynamic functional connectivity (DFC) analysis has been widely applied to functional magnetic resonance imaging (fMRI) data to reveal the time-varying functional interactions between brain regions. Although the sliding window (SW) method is popular for DFC analysis, the window length is hard to select, and the temporal resolution is limited by the window length. The hidden Markov model (HMM), which is not limited by window length, has been shown to be able to estimate time-varying brain states from fMRI data. However, HMM tends to overfit in DFC analysis of fMRI data because of the high spatial dimension and the limited sample size of fMRI data. In this study, we proposed an alternating HMM (aHMM) method that used the functional connectivity estimation of SW to initialize the covariance matrix of HMM and adopted an alternating HMM procedure to reduce the number of parameters during each optimization. Simulated data and real resting-state fMRI data from the Human Connectome Project showed that aHMM was more robust to noise, parameter number and sample size in DFC estimation than SW and HMM. For the real resting-state fMRI data of cerebral small vessel disease (CSVD), results of aHMM revealed that amnesia and mild cognitive impairment (aMCI) caused the CSVD with aMCI (CSVD-aMCI) group to spend more time in the brain state with overall weak connections and less time in the state with overall strong connections than the CSVD controls. Moreover, the CSVD-aMCI group showed significantly lower connectivity amplitude and higher connectivity fluctuation than the CSVD controls. In contrast, HMM did not detect intergroup differences in connectivity amplitude or fluctuations, and SW did not detect intergroup differences in connectivity fluctuations or fraction of time. The results further indicated that aHMM outperformed HMM and SW in detecting intergroup differences in the temporal properties of DFC and in connectivity fluctuations.
Supplementary Information The online version contains supplementary material available at 10.1007/s11571-022-09874-3.
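As a rough illustration of the sliding window (SW) baseline that the abstract contrasts with HMM-based estimation, the sketch below computes one correlation matrix per window. It is not the authors' aHMM; the function name, the synthetic data, and the window and step values are assumptions chosen only for the example.

```python
import numpy as np

def sliding_window_fc(ts, window, step=1):
    """Sliding-window functional connectivity (hypothetical helper).

    ts: array of shape (T, R), T time points by R regions.
    Returns an array of shape (n_windows, R, R) holding one
    Pearson correlation matrix per window.
    """
    ts = np.asarray(ts, dtype=float)
    n_time = ts.shape[0]
    # Each window covers `window` consecutive time points.
    starts = range(0, n_time - window + 1, step)
    return np.stack(
        [np.corrcoef(ts[s:s + window], rowvar=False) for s in starts]
    )

# Synthetic signals for illustration only (not fMRI data).
rng = np.random.default_rng(0)
signals = rng.standard_normal((120, 4))
fc = sliding_window_fc(signals, window=30, step=10)
```

The window length fixes how many time points feed each estimate, which is why the temporal resolution of this baseline cannot be finer than the window itself.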
Affiliation(s)
- Zhiying Long
- School of Artificial Intelligence, Beijing Normal University, Beijing, 100875 China
| | - Xuanping Liu
- The State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875 China
| | - Yantong Niu
- The State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875 China
| | - Huajie Shang
- The State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875 China
- BABRI Centre, Beijing Normal University, Beijing, 100875 China
| | - Hui Lu
- The State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875 China
- BABRI Centre, Beijing Normal University, Beijing, 100875 China
| | - Junying Zhang
- Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, 100700 China
- BABRI Centre, Beijing Normal University, Beijing, 100875 China
| | - Li Yao
- School of Artificial Intelligence, Beijing Normal University, Beijing, 100875 China
- The State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, 100875 China
| |
|
12
|
Saldanha S, Cox SL, Militão T, González-Solís J. Animal behaviour on the move: the use of auxiliary information and semi-supervision to improve behavioural inferences from Hidden Markov Models applied to GPS tracking datasets. Mov Ecol 2023; 11:41. [PMID: 37488611 PMCID: PMC10367325 DOI: 10.1186/s40462-023-00401-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 06/21/2023] [Indexed: 07/26/2023]
Abstract
BACKGROUND State-space models, such as Hidden Markov Models (HMMs), are increasingly used to classify animal tracks into behavioural states. Typically, step length and turning angles of successive locations are used to infer where and when an animal is resting, foraging, or travelling. However, the accuracy of behavioural classifications is seldom validated, which may badly contaminate posterior analyses. In general, models appear to efficiently infer behaviour in species with discrete foraging and travelling areas, but classification is challenging for species foraging opportunistically across homogenous environments, such as tropical seas. Here, we use a subset of GPS loggers deployed simultaneously with wet-dry data from geolocators, activity measurements from accelerometers, and dive events from Time Depth Recorders (TDR), to improve the classification of HMMs of a large GPS tracking dataset (478 deployments) of red-billed tropicbirds (Phaethon aethereus), a poorly studied pantropical seabird. METHODS We classified a subset of fixes as either resting, foraging or travelling based on the three auxiliary sensors and evaluated the increase in overall accuracy, sensitivity (true positive rate), specificity (true negative rate) and precision (positive predictive value) of the models in relation to the increasing inclusion of fixes with known behaviours. RESULTS We demonstrate that even with a small informed sub-dataset (representing only 9% of the full dataset), we can significantly improve the overall behavioural classification of these models, increasing model accuracy from 0.77 ± 0.01 to 0.85 ± 0.01 (mean ± sd). Despite overall improvements, the sensitivity and precision of foraging behaviour remained low (reaching 0.37 ± 0.06, and 0.06 ± 0.01, respectively). 
CONCLUSIONS This study demonstrates that the use of a small subset of auxiliary data with known behaviours can both validate and notably improve behavioural classifications of state space models of opportunistic foragers. However, the improvement is state-dependent and caution should be taken when interpreting inferences of foraging behaviour from GPS data in species foraging on the go across homogenous environments.
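The four measures named in the abstract (overall accuracy, sensitivity, specificity and precision) can all be read off a confusion matrix of known versus predicted states. The following is a minimal sketch under that assumption; the function name and the counts are made up for illustration and are not data from the study.

```python
import numpy as np

def per_state_metrics(confusion, labels):
    """confusion[i, j] = number of fixes with true state i predicted as j.

    Assumes every state is both observed and predicted at least once,
    so none of the denominators below is zero.
    """
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    metrics = {"accuracy": np.trace(confusion) / total}
    for k, name in enumerate(labels):
        tp = confusion[k, k]
        fn = confusion[k, :].sum() - tp   # true state k, predicted otherwise
        fp = confusion[:, k].sum() - tp   # predicted k, true state otherwise
        tn = total - tp - fn - fp
        metrics[name] = {
            "sensitivity": tp / (tp + fn),  # true positive rate
            "specificity": tn / (tn + fp),  # true negative rate
            "precision": tp / (tp + fp),    # positive predictive value
        }
    return metrics

# Made-up counts for illustration only.
example = [[50, 5, 5],
           [10, 20, 10],
           [5, 5, 90]]
result = per_state_metrics(example, ["resting", "foraging", "travelling"])
```

Because sensitivity and precision are computed per state, a model can reach a high overall accuracy while one state, such as foraging, remains poorly classified.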
Affiliation(s)
- Sarah Saldanha
- Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona (UB), Barcelona, Spain.
- Dept Biologia Evolutiva, Ecologia i Ciències Ambientals, Universitat de Barcelona, Av Diagonal 643, Barcelona, 08028, Spain.
| | - Sam L Cox
- Centre National d'Études Spatiales (CNES), Toulouse, 31400, France
- MARBEC, Univ Montpellier, CNRS, Ifremer, IRD, Sète, France
- Institut de Recherche pour le Développement (IRD), Sète, France
- MaREI Centre, University College Cork, Cork, Ireland
| | - Teresa Militão
- Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona (UB), Barcelona, Spain
- Dept Biologia Evolutiva, Ecologia i Ciències Ambientals, Universitat de Barcelona, Av Diagonal 643, Barcelona, 08028, Spain
| | - Jacob González-Solís
- Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona (UB), Barcelona, Spain
- Dept Biologia Evolutiva, Ecologia i Ciències Ambientals, Universitat de Barcelona, Av Diagonal 643, Barcelona, 08028, Spain
| |
|
13
|
Ma J, Guo J, Fan Z, Zhao W, Zhou X. CVAM: CNA Profile Inference of the Spatial Transcriptome Based on the VGAE and HMM. Biomolecules 2023; 13:767. [PMID: 37238637 PMCID: PMC10216626 DOI: 10.3390/biom13050767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2023] [Revised: 04/20/2023] [Accepted: 04/25/2023] [Indexed: 05/28/2023] Open
Abstract
Tumors are often polyclonal due to copy number alteration (CNA) events. Through the CNA profile, we can understand the tumor heterogeneity and consistency. CNA information is usually obtained through DNA sequencing. However, many existing studies have shown a positive correlation between the gene expression and gene copy number identified from DNA sequencing. With the development of spatial transcriptome technologies, it is urgent to develop new tools to identify genomic variation from the spatial transcriptome. Therefore, in this study, we developed CVAM, a tool to infer the CNA profile from spatial transcriptome data. Compared with existing tools, CVAM integrates spatial information with each spot's gene expression information, so that spatial information is introduced indirectly into the CNA inference. By applying CVAM to simulated and real spatial transcriptome data, we found that CVAM performed better in identifying CNA events. In addition, we analyzed the potential co-occurrence and mutual exclusion between CNA events in tumor clusters, which helps in analyzing the potential interaction between mutated genes. Finally, Ripley's K-function is applied to multi-distance spatial pattern analysis of CNAs to reveal differences in the spatial distribution of CNA events across genes, which is helpful for tumor analysis and for implementing more effective treatment measures based on the spatial characteristics of genes.
Affiliation(s)
- Jian Ma
- College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
| | - Jingjing Guo
- West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Med-X Center for Informatics, Sichuan University, Chengdu 610041, China
| | - Zhiwei Fan
- West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu 610040, China
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Weiling Zhao
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Xiaobo Zhou
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| |
|
14
|
Malonzo MH, Lähdesmäki H. Lux HMM: DNA methylation analysis with genome segmentation via hidden Markov model. BMC Bioinformatics 2023; 24:58. [PMID: 36810075 PMCID: PMC9945676 DOI: 10.1186/s12859-023-05174-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2022] [Accepted: 02/06/2023] [Indexed: 02/23/2023] Open
Abstract
BACKGROUND DNA methylation plays an important role in studying the epigenetics of various biological processes including many diseases. Although differential methylation of individual cytosines can be informative, methylation of neighboring CpGs is typically correlated, so analysis of differentially methylated regions is often of more interest. RESULTS We have developed a probabilistic method and software, LuxHMM, that uses a hidden Markov model (HMM) to segment the genome into regions and a Bayesian regression model, which allows handling of multiple covariates, to infer differential methylation of regions. Moreover, our model includes experimental parameters that describe the underlying biochemistry in bisulfite sequencing, and model inference is done using either variational inference for efficient genome-scale analysis or Hamiltonian Monte Carlo (HMC). CONCLUSIONS Analyses of real and simulated bisulfite sequencing data demonstrate the competitive performance of LuxHMM compared with other published differential methylation analysis methods.
Affiliation(s)
- Maia H. Malonzo
- Department of Computer Science, Aalto University, 00076 Espoo, Finland
| | - Harri Lähdesmäki
- Department of Computer Science, Aalto University, 00076 Espoo, Finland
| |
|
15
|
Martins A, Fonseca I, Farinha JT, Reis J, Cardoso AJM. Online Monitoring of Sensor Calibration Status to Support Condition-Based Maintenance. Sensors (Basel) 2023; 23:2402. [PMID: 36904607 PMCID: PMC10007291 DOI: 10.3390/s23052402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 02/07/2023] [Accepted: 02/20/2023] [Indexed: 06/18/2023]
Abstract
Condition-Based Maintenance (CBM), based on sensors, can only be reliable if the data used to extract information are also reliable. Industrial metrology plays a major role in ensuring the quality of the data collected by the sensors. To guarantee that the values collected by the sensors are reliable, it is necessary to have metrological traceability established by successive calibrations from higher standards down to the sensors used in the factories. To ensure the reliability of the data, a calibration strategy must be put in place. Usually, sensors are only calibrated on a periodic basis, so they often go for calibration when it is not necessary or collect data inaccurately. In addition, the sensors are checked often, increasing the need for manpower, and sensor errors are frequently overlooked when the redundant sensor drifts in the same direction. A calibration strategy based on the sensor condition is therefore needed. Through online monitoring of sensor calibration status (OLM), it is possible to perform calibrations only when they are really necessary. To this end, this paper aims to provide a strategy to classify the health status of the production equipment and of the reading equipment using the same dataset. A measurement signal from four sensors was simulated, for which Artificial Intelligence and Machine Learning with unsupervised algorithms were used. This paper demonstrates how distinct information can be obtained from the same dataset. This relies on a carefully designed feature creation process, followed by Principal Component Analysis (PCA), K-means clustering, and classification based on Hidden Markov Models (HMM). Through three hidden states of the HMM, which represent the health states of the production equipment, we first detect, through correlations, the features of its status. After that, an HMM filter is used to eliminate those errors from the original signal. Next, the same methodology is applied to each sensor individually, using statistical features in the time domain, so that the failures of each sensor can be obtained through the HMM.
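The abstract does not say how the hidden health states are decoded from the HMM. One common choice is the Viterbi algorithm, sketched below for a three-state model. All probabilities, the function name, and the variable names are hypothetical and serve only to illustrate the technique.

```python
import numpy as np

def viterbi(log_start, log_trans, log_emit):
    """Most likely hidden-state path of an HMM, computed in log space.

    log_start: (S,) log initial state probabilities.
    log_trans: (S, S), entry [i, j] = log P(next state j | current state i).
    log_emit:  (T, S) log-likelihood of each observation under each state.
    """
    n_steps, n_states = log_emit.shape
    score = log_start + log_emit[0]
    back = np.zeros((n_steps, n_states), dtype=int)
    for t in range(1, n_steps):
        # cand[i, j]: best path ending in state i at t-1, then moving to j.
        cand = score[:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    # Trace the best path backwards from the best final state.
    path = np.empty(n_steps, dtype=int)
    path[-1] = score.argmax()
    for t in range(n_steps - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Hypothetical three-state model (for example healthy, degraded, faulty).
start = np.full(3, 1.0 / 3.0)
trans = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.1, 0.1, 0.8]])
emit = np.array([[0.90, 0.05, 0.05],
                 [0.90, 0.05, 0.05],
                 [0.05, 0.90, 0.05],
                 [0.05, 0.90, 0.05],
                 [0.05, 0.05, 0.90]])
states = viterbi(np.log(start), np.log(trans), np.log(emit))
```

Working in log space avoids numerical underflow when the observation sequence is long.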
Affiliation(s)
- Alexandre Martins
- EIGeS—Research Centre in Industrial Engineering, Management and Sustainability, Lusófona University, Campo Grande 376, 1749-024 Lisboa, Portugal
- CISE—Electromechatronic Systems Research Centre, University of Beira Interior, Calçada Fonte do Lameiro, 62001-001 Covilhã, Portugal
| | - Inácio Fonseca
- Instituto Superior de Engenharia de Coimbra, Polytechnic of Coimbra, 3045-093 Coimbra, Portugal
| | - José Torres Farinha
- Instituto Superior de Engenharia de Coimbra, Polytechnic of Coimbra, 3045-093 Coimbra, Portugal
- Centre for Mechanical Engineering, Materials and Processes—CEMMPRE, University of Coimbra, 3030-788 Coimbra, Portugal
| | - João Reis
- EIGeS—Research Centre in Industrial Engineering, Management and Sustainability, Lusófona University, Campo Grande 376, 1749-024 Lisboa, Portugal
| | - António J. Marques Cardoso
- CISE—Electromechatronic Systems Research Centre, University of Beira Interior, Calçada Fonte do Lameiro, 62001-001 Covilhã, Portugal
| |
|
16
|
Hassan M, Ali S, Kim JY, Saadia A, Sanaullah M, Alquhayz H, Safdar K. Developing a Novel Methodology by Integrating Deep Learning and HMM for Segmentation of Retinal Blood Vessels in Fundus Images. Interdiscip Sci 2023. [PMID: 36611082 DOI: 10.1007/s12539-022-00545-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2022] [Revised: 12/06/2022] [Accepted: 12/08/2022] [Indexed: 01/09/2023]
Abstract
Accurate segmentation of the retinal blood vessel network plays a crucial role in clinical assessment, treatment, and rehabilitation. Owing to acquisition and instrumentation anomalies, precise tracking of the vessel network is challenging. To address this, a new fundus image segmentation framework is proposed that combines deep neural networks and a hidden Markov model. It has three main modules: the atrous spatial pyramid pooling-based encoder, the decoder, and the hidden Markov model vessel tracker. The encoder uses a modified ResNet18 deep neural network for low- and high-level feature extraction. These features are concatenated in module II by the decoder, which performs convolution operations to obtain the initial segmentation. These first two modules detect the main vessel structure but overlook some small capillaries. For improved segmentation, the hidden Markov model vessel tracker is integrated with modules I and II to detect the overlooked small capillaries of the vessel network. In the last module, the final segmentation is obtained by combining multi-oriented sub-images using a logical OR operation. This framework is validated experimentally using two standard datasets, DRIVE and STARE. The developed model offers high average values of accuracy, area under the curve, and sensitivity of 99.8, 99.0, and 98.2%, respectively. Analysis of the results revealed that the developed approach improved sensitivity by 18%, accuracy by 3%, and specificity by 1% over the state-of-the-art approaches. Owing to better learning and generalization capability, the developed approach tracked the blood vessel network efficiently and automatically compared to other approaches. The proposed approach can be helpful for human eye assessment, disease diagnosis, and rehabilitation.
|
17
|
Shrestha P, Karmacharya J, Han SR, Park H, Oh TJ. In silico analysis and a comparative genomics approach to predict pathogenic trehalase genes in the complete genome of Antarctica Shigella sp. PAMC28760. Virulence 2022; 13:1502-1514. [PMID: 36040103 PMCID: PMC9450901 DOI: 10.1080/21505594.2022.2117679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022] Open
Abstract
Although four Shigella species (S. flexneri, S. sonnei, S. dysenteriae, and S. boydii) have been reported, S. sp. PAMC 28760, an Antarctica isolate, is the only one with a complete genome deposited in the NCBI database as an uncharacterized isolate. Because it is the world's driest, windiest, and coldest continent, Antarctica provides an unfavourable environment for microorganisms. Computational analysis of genomic sequences of four Shigella species and our uncategorized Antarctica isolate Shigella sp. PAMC28760 was performed using the MP3 (offline version) program to predict trehalase-encoding genes as pathogenic or non-pathogenic. Additionally, we employed the RAST and Prokka (offline version) annotation programs to determine the locations of periplasmic (treA) and cytoplasmic (treF) trehalase genes in the studied genomes. Our results showed that only 56 out of 134 Shigella strains had two different trehalase genes (treF and treA). It was revealed that the treF gene tends to be prevalent in Shigella species. In addition, both treA and treF genes were present in our strain S. sp. PAMC28760. The main objective of this study was to predict the prevalence of two different trehalase genes (treF and treA) in the complete genome of Shigella sp. PAMC28760 and other complete genomes of Shigella species. To date, it is the first study to show that two types of trehalase genes are involved in Shigella species, which could offer insight into how the bacteria use accessible carbohydrates such as glucose produced from the trehalose degradation pathway, and into the importance of periplasmic trehalase involvement in bacterial virulence.
Affiliation(s)
- Prasansah Shrestha
- Department of Life Science and Biochemical Engineering, Graduate School, SunMoon University, Asan, Korea
| | - Jayram Karmacharya
- Department of Life Science and Biochemical Engineering, Graduate School, SunMoon University, Asan, Korea
| | - So-Ra Han
- Department of Life Science and Biochemical Engineering, Graduate School, SunMoon University, Asan, Korea
| | - Hyun Park
- Division of Biotechnology, College of Life Sciences and Biotechnology, Korea University, Seoul, Korea
| | - Tae-Jin Oh
- Department of Life Science and Biochemical Engineering, Graduate School, SunMoon University, Asan, Korea
- Department of Life Science and Biochemical Engineering, SunMoon University, Genome-based BioIT Convergence Institute, Asan, Korea
- Department of Pharmaceutical Engineering and Biotechnology, SunMoon University, Asan, Korea
| |
|
18
|
Qin C, Zhu X, Ye L, Peng L, Li L, Wang J, Ma J, Liu T. Autism detection based on multiple time scale model. J Neural Eng 2022; 19. [PMID: 35985297 DOI: 10.1088/1741-2552/ac8b39] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Accepted: 08/19/2022] [Indexed: 11/12/2022]
Abstract
OBJECTIVE Current clinical detection of autism relies on doctor observation and the completion of clinical scales, which is subjective and prone to misdetection. Existing functional magnetic resonance imaging (fMRI) research on autism over-compresses the time-scale information and generalizes poorly. This study extracts brain features from fMRI at multiple time scales to provide objective detection. APPROACH We first use the least absolute shrinkage and selection operator (LASSO) to build a sparse network and extract features with a time scale of 1. Then, we use a hidden Markov model (HMM) to extract features that describe the dynamic changes of the brain, with a time scale of 2. Additionally, to analyze the features of the potential network activity of autism at a higher time scale, we use long short-term memory (LSTM) to construct an auto-encoder that re-encodes the original data and extracts features with a time scale of T, where T is the time length of the fMRI recording. We use Recursive Feature Elimination (RFE) for feature selection on the three sets of time scale features, merge them into multiple time scale features, and finally use a one-dimensional convolutional neural network (1DCNN) for classification. MAIN RESULTS Compared with well-established models, our method achieved better results. Tested on completely independent data, the accuracy of our method is 76.0% and the area under the ROC curve is 0.83, indicating better generalization ability. SIGNIFICANCE This research analyzes fMRI sequences at multiple time scales to detect autism, and it also provides a new framework and research ideas for subsequent fMRI analysis.
Affiliation(s)
- Chi Qin
- Xi'an Jiaotong University, School of Life Science and Technology, Xi'an, 710049, CHINA
| | - Xiaofei Zhu
- Tangdu Hospital Fourth Military Medical University, Department of Radiology, Xi'an, Shaanxi, 710038, CHINA
| | - Lin Ye
- Xi'an Jiaotong University, School of Life Science and Technology, Xi'an, 710049, CHINA
| | - Li Peng
- Tongji Hospital of Tongji Medical College of Huazhong University of Science and Technology, Department of Radiology, Wuhan, Hubei, 430030, CHINA
| | - Long Li
- Xi'an Jiaotong University, School of Life Science and Technology, Xi'an, 710049, CHINA
| | - Jue Wang
- Xi'an Jiaotong University, School of Life Science and Technology, Xi'an, 710049, CHINA
| | - Jin Ma
- Air Force Medical University, School of Aerospace Medicine, Xi'an, 710032, CHINA
| | - Tian Liu
- Xi'an Jiaotong University, School of Life Science and Technology, Xi'an, 710049, CHINA
| |
|
19
|
Wu X, Zhang Q. Design of Aging Smart Home Products Based on Radial Basis Function Speech Emotion Recognition. Front Psychol 2022; 13:882709. [PMID: 35602743 PMCID: PMC9114816 DOI: 10.3389/fpsyg.2022.882709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Accepted: 03/14/2022] [Indexed: 11/13/2022] Open
Abstract
The rapid development of computer technology and artificial intelligence is affecting people's daily lives, in which spoken language is the most common means of communication. To apply the emotion information contained in voice signals to artificial intelligence products, this article proposes a design for aging-oriented smart home products based on voice emotion recognition with a Radial Basis Function (RBF) network. Targeting smart home design and the weak adaptability and learning ability of the aging population, the authors propose a speech emotion recognition method based on a hybrid Hidden Markov Model/Radial Basis Function Neural Network (HMM/RBF). This method combines the strong dynamic temporal modeling capability of the HMM with the strong classification and decision-making ability of the RBF network, and combining the two models greatly improves the speech emotion recognition rate. Furthermore, by introducing the concept of the dynamic optimal learning rate, the convergence time of the network is reduced to 40.25 s and the operation efficiency is optimized. MATLAB simulation tests show that the recognition speed of the HMM/RBF hybrid model is 9.82-12.28% higher than that of the HMM model and the RBF model alone, confirming the accuracy and superiority of the algorithm and model.
Affiliation(s)
- Xu Wu
- School of Art and Design, Tianjin University of Technology, Tianjin, China
| | - Qian Zhang
- School of Control and Mechanical Engineering, Tianjin Chengjian University, Tianjin, China
| |
|
20
|
Lund D, Kieffer N, Parras-Moltó M, Ebmeyer S, Berglund F, Johnning A, Larsson DGJ, Kristiansson E. Large-scale characterization of the macrolide resistome reveals high diversity and several new pathogen-associated genes. Microb Genom 2022; 8. [PMID: 35084301 PMCID: PMC8914350 DOI: 10.1099/mgen.0.000770] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/12/2023] Open
Abstract
Macrolides are broad-spectrum antibiotics used to treat a range of infections. Resistance to macrolides is often conferred by mobile resistance genes encoding Erm methyltransferases or Mph phosphotransferases. New erm and mph genes keep being discovered in clinical settings but their origins remain unknown, as is the type of macrolide resistance genes that will appear in the future. In this study, we used optimized hidden Markov models to characterize the macrolide resistome. Over 16 terabases of genomic and metagenomic data, representing a large taxonomic diversity (11 030 species) and diverse environments (1944 metagenomic samples), were searched for the presence of erm and mph genes. From these data, we predicted 28 340 macrolide resistance genes encoding 2892 unique protein sequences, which were clustered into 663 gene families (<70 % amino acid identity), of which 619 (94 %) were previously uncharacterized. This included six new resistance gene families, which were located on mobile genetic elements in pathogens. The function of ten predicted new resistance genes was experimentally validated in Escherichia coli using a growth assay. Among the ten tested genes, seven conferred increased resistance to erythromycin, with five genes additionally conferring increased resistance to azithromycin, showing that our models can be used to predict new functional resistance genes. Our analysis also showed that macrolide resistance genes have diverse origins and have transferred horizontally over large phylogenetic distances into human pathogens. This study expands the known macrolide resistome more than ten-fold, provides insights into its evolution, and demonstrates how computational screening can identify new resistance genes before they become a significant clinical problem.
Collapse
Affiliation(s)
- David Lund
- Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden
- Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
| | - Nicolas Kieffer
- Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
- Department of Infectious Diseases, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Marcos Parras-Moltó
- Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden
- Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
| | - Stefan Ebmeyer
- Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
- Department of Infectious Diseases, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Fanny Berglund
- Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
- Department of Infectious Diseases, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Anna Johnning
- Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden
- Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
- Department of Systems and Data Analysis, Fraunhofer-Chalmers Centre, Gothenburg, Sweden
| | - D. G. Joakim Larsson
- Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
- Department of Infectious Diseases, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Erik Kristiansson
- Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, Gothenburg, Sweden
- Centre for Antibiotic Resistance Research (CARe), University of Gothenburg, Gothenburg, Sweden
- *Correspondence: Erik Kristiansson,
| |
|
21
|
Dixson JD, Azad RK. A Protocol for Prion Discovery in Plants. Methods Mol Biol 2022; 2396:215-226. [PMID: 34786686 DOI: 10.1007/978-1-0716-1822-6_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Recently a likely prion was found in the proteome of Arabidopsis thaliana based on inclusive compositional similarity to known yeast prion-like domains (PrLDs) and gene ontology analysis. A total of 474 proteins in the Arabidopsis thaliana proteome showed significant compositional similarity to known PrLDs in yeast warranting further analysis. In this chapter, we describe the use and limitations of the PLAAC (Prion-Like Amino Acid Composition) software for the identification of prions, specifically as it has recently been applied to identifying the first prion in plants. Our interest in this method, though presented from a plant-based perspective here, is broad and is primarily in using the method for comparative assessment with novel prion identification algorithms currently under development in our lab. This chapter is not meant to serve as a replete description of the architecture and use of HMM in prion prediction in general but is intended to serve as a reference for implementation and interpretation of output from PLAAC and its application to plant proteomes.
Affiliation(s)
- Jamie D Dixson
- Department of Biological Sciences and BioDiscovery Institute, University of North Texas, Denton, TX, USA
| | - Rajeev K Azad
- Department of Biological Sciences and BioDiscovery Institute, University of North Texas, Denton, TX, USA.
- Department of Mathematics, University of North Texas, Denton, TX, USA.
| |
|
22
|
Bin Hafeez A, Jiang X, Bergen PJ, Zhu Y. Antimicrobial Peptides: An Update on Classifications and Databases. Int J Mol Sci 2021; 22:11691. [PMID: 34769122 PMCID: PMC8583803 DOI: 10.3390/ijms222111691] [Citation(s) in RCA: 77] [Impact Index Per Article: 25.7] [Received: 09/22/2021] [Revised: 10/24/2021] [Accepted: 10/25/2021] [Indexed: 02/06/2023] Open
Abstract
Antimicrobial peptides (AMPs) are distributed across all kingdoms of life and are an indispensable component of host defenses. They consist of predominantly short cationic peptides with a wide variety of structures and targets. Given the ever-emerging resistance of various pathogens to existing antimicrobial therapies, AMPs have recently attracted extensive interest as potential therapeutic agents. As the discovery of new AMPs has increased, many databases specializing in AMPs have been developed to collect both fundamental and pharmacological information. In this review, we summarize the sources, structures, modes of action, and classifications of AMPs. Additionally, we examine current AMP databases, compare valuable computational tools used to predict antimicrobial activity and mechanisms of action, and highlight new machine learning approaches that can be employed to improve AMP activity to combat global antimicrobial resistance.
Affiliation(s)
- Ahmer Bin Hafeez
- Centre of Biotechnology and Microbiology, University of Peshawar, Peshawar 25120, Pakistan;
| | - Xukai Jiang
- Infection and Immunity Program, Department of Microbiology, Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia; (X.J.); (P.J.B.)
- National Glycoengineering Research Center, Shandong University, Qingdao 266237, China
| | - Phillip J. Bergen
- Infection and Immunity Program, Department of Microbiology, Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia; (X.J.); (P.J.B.)
| | - Yan Zhu
- Infection and Immunity Program, Department of Microbiology, Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia; (X.J.); (P.J.B.)
| |
|
23
|
Roth N, Küderle A, Prossel D, Gassner H, Eskofier BM, Kluge F. An Inertial Sensor-Based Gait Analysis Pipeline for the Assessment of Real-World Stair Ambulation Parameters. Sensors (Basel) 2021; 21:6559. [PMID: 34640878 DOI: 10.3390/s21196559] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/17/2021] [Revised: 09/27/2021] [Accepted: 09/28/2021] [Indexed: 12/27/2022]
Abstract
Climbing stairs is a fundamental part of daily life, adding additional demands on the postural control system compared to level walking. Although real-world gait analysis studies likely contain stair ambulation sequences, algorithms dedicated to the analysis of such activities are still missing. Therefore, we propose a new gait analysis pipeline for foot-worn inertial sensors, which can segment, parametrize, and classify strides from continuous gait sequences that include level walking, stair ascending, and stair descending. For segmentation, an existing approach based on the hidden Markov model and a feature-based gait event detection were extended, reaching an average segmentation F1 score of 98.5% and gait event timing errors below ±10 ms for all conditions. Stride types were classified with an accuracy of 98.2% using spatial features derived from a Kalman filter-based trajectory reconstruction. The evaluation was performed on a dataset of 20 healthy participants walking on three different staircases at different speeds. The entire pipeline was additionally validated end-to-end on an independent dataset of 13 Parkinson’s disease patients. The presented work aims to extend real-world gait analysis by including stair ambulation parameters in order to gain new insights into mobility impairments that can be linked to clinically relevant conditions such as a patient’s fall risk and disease state or progression.
|
24
|
Wang Y, Song S, Schraiber JG, Sedghifar A, Byrnes JK, Turissini DA, Hong EL, Ball CA, Noto K. Ancestry inference using reference labeled clusters of haplotypes. BMC Bioinformatics 2021; 22:459. [PMID: 34563119 PMCID: PMC8466715 DOI: 10.1186/s12859-021-04350-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/08/2021] [Accepted: 08/31/2021] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND We present ARCHes, a fast and accurate haplotype-based approach for inferring an individual's ancestry composition. Our approach works by modeling haplotype diversity from a large, admixed cohort of hundreds of thousands, then annotating those models with population information from reference panels of known ancestry. RESULTS The running time of ARCHes does not depend on the size of a reference panel because training and testing are separate processes, and the inferred population-annotated haplotype models can be written to disk and reused to label large test sets in parallel (in our experiments, it averages less than one minute to assign ancestry from 32 populations using 10 CPUs). We test ARCHes on public data from the 1000 Genomes Project and the Human Genome Diversity Project (HGDP) as well as simulated examples of known admixture. CONCLUSIONS Our results demonstrate that ARCHes outperforms RFMix at correctly assigning both global and local ancestry at finer population scales regardless of the amount of population admixture.
Affiliation(s)
- Yong Wang
- AncestryDNA, San Francisco, CA, 94107, USA
| | - Shiya Song
- AncestryDNA, San Francisco, CA, 94107, USA
| | | | | | | | | | | | | | - Keith Noto
- AncestryDNA, San Francisco, CA, 94107, USA.
| |
|
25
|
Ahmed T, Thopalli K, Rikakis T, Turaga P, Kelliher A, Huang JB, Wolf SL. Automated Movement Assessment in Stroke Rehabilitation. Front Neurol 2021; 12:720650. [PMID: 34489855 PMCID: PMC8417323 DOI: 10.3389/fneur.2021.720650] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/04/2021] [Accepted: 07/19/2021] [Indexed: 11/13/2022] Open
Abstract
We are developing a system for long-term Semi-Automated Rehabilitation At the Home (SARAH) that relies on low-cost and unobtrusive video-based sensing. We present a cyber-human methodology used by the SARAH system for automated assessment of upper extremity stroke rehabilitation at the home. We propose a hierarchical model for automatically segmenting stroke survivors' movements and generating training task performance assessment scores during rehabilitation. The hierarchical model fuses expert therapist knowledge-based approaches with data-driven techniques. The expert knowledge is more observable in the higher layers of the hierarchy (task and segment) and therefore more accessible to algorithms incorporating high-level constraints relating to activity structure (i.e., type and order of segments per task). We utilize an HMM and a Decision Tree model to connect these high-level priors to data-driven analysis. The lower layers (RGB images and raw kinematics) need to be addressed primarily through data-driven techniques. We use a transformer-based architecture operating on low-level action features (tracking of individual body joints and objects) and a Multi-Stage Temporal Convolutional Network (MS-TCN) operating on raw RGB images. We develop a sequence combining these complementary algorithms effectively, thus encoding the information from different layers of the movement hierarchy. Through this combination, we produce robust segmentation and task assessment results on noisy, variable and limited data, which is characteristic of low-cost video capture of rehabilitation at the home. Our proposed approach achieves 85% accuracy in per-frame labeling, 99% accuracy in segment classification and 93% accuracy in task completion assessment. Although the methodology proposed in this paper applies to upper extremity rehabilitation using the SARAH system, it can potentially be used, with minor alterations, to assist automation in many other movement rehabilitation contexts (e.g., lower extremity training for neurological accidents).
Affiliation(s)
- Tamim Ahmed
- Department of Biomedical Engineering, Virginia Tech, Blacksburg, VA, United States
| | - Kowshik Thopalli
- Geometric Media Lab, School of Arts, Media and Engineering, Arizona State University, Tempe, AZ, United States
| | - Thanassis Rikakis
- Department of Biomedical Engineering, Virginia Tech, Blacksburg, VA, United States
| | - Pavan Turaga
- Geometric Media Lab, School of Arts, Media and Engineering, Arizona State University, Tempe, AZ, United States
| | - Aisling Kelliher
- Department of Computer Science, Virginia Tech, Blacksburg, VA, United States
| | - Jia-Bin Huang
- Department of Electrical and Communication Engineering, Virginia Tech, Blacksburg, VA, United States
| | - Steven L Wolf
- Department of Rehabilitation Medicine, Emory University, Atlanta, GA, United States
| |
|
26
|
Kämmerle J, Taubmann J, Andrén H, Fiedler W, Coppes J. Environmental and seasonal correlates of capercaillie movement traits in a Swedish wind farm. Ecol Evol 2021; 11:11762-11773. [PMID: 34522339 PMCID: PMC8427587 DOI: 10.1002/ece3.7922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2021] [Revised: 06/20/2021] [Accepted: 06/25/2021] [Indexed: 11/24/2022] Open
Abstract
Animals continuously interact with their environment through behavioral decisions, rendering the appropriate choice of movement speed and directionality an important phenotypic trait. Anthropogenic activities may alter animal behavior, including movement. A detailed understanding of movement decisions is therefore of great relevance for science and conservation alike. The study of movement decisions in relation to environmental and seasonal cues requires continuous observation of movement behavior, recently made possible by high-resolution telemetry. We studied movement traits of 13 capercaillie (Tetrao urogallus), a mainly ground-moving forest bird species of conservation interest, over two summer seasons in a Swedish windfarm using high-resolution GPS tracking data (5-min sampling interval). We filtered and removed unreliable movement steps using accelerometer data and step characteristics. We explored variation in movement speed and directionality in relation to environmental and seasonal covariates using generalized additive mixed models (GAMMs). We found evidence for clear daily and seasonal variation in speed and directionality of movement that reflected behavioral adjustments to biological and environmental seasonality. Capercaillie moved slower when more turbines were visible and faster close to turbine access roads. Movement speed and directionality were highest on open bogs, lowest on recent clear-cuts (<5 y.o.), and intermediate in all types of forest. Our results provide novel insights into the seasonal and environmental correlates of capercaillie movement patterns and supplement previous behavioral observations on lekking behavior and wind turbine avoidance with a more mechanistic understanding.
Affiliation(s)
- Jim‐Lino Kämmerle
- FVA Wildlife Institute, Forest Research Institute of Baden‐Wuerttemberg FVA, Freiburg, Germany
- Chair of Wildlife Ecology and Management, University of Freiburg, Freiburg, Germany
| | - Julia Taubmann
- FVA Wildlife Institute, Forest Research Institute of Baden‐Wuerttemberg FVA, Freiburg, Germany
- Chair of Wildlife Ecology and Management, University of Freiburg, Freiburg, Germany
| | - Henrik Andrén
- Grimsö Wildlife Research Station, Department of Ecology, Swedish University of Agricultural Sciences, Riddarhyttan, Sweden
| | - Wolfgang Fiedler
- Department of Migration and Immuno‐Ecology, Max Planck Institute of Animal Behavior, Radolfzell, Germany
| | - Joy Coppes
- FVA Wildlife Institute, Forest Research Institute of Baden‐Wuerttemberg FVA, Freiburg, Germany
| |
|
27
|
Roth N, Küderle A, Ullrich M, Gladow T, Marxreiter F, Klucken J, Eskofier BM, Kluge F. Hidden Markov Model based stride segmentation on unsupervised free-living gait data in Parkinson's disease patients. J Neuroeng Rehabil 2021; 18:93. [PMID: 34082762 DOI: 10.1186/s12984-021-00883-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/24/2021] [Accepted: 05/20/2021] [Indexed: 12/28/2022] Open
Abstract
Background To objectively assess a patient’s gait, a robust identification of stride borders is one of the first steps in inertial sensor-based mobile gait analysis pipelines. While many different methods for stride segmentation have been presented in the literature, an out-of-lab evaluation of respective algorithms on free-living gait is still missing. Method To address this issue, we present a comprehensive free-living evaluation dataset, including 146,574 semi-automatically labeled strides of 28 Parkinson’s Disease patients. This dataset was used to evaluate the segmentation performance of a new Hidden Markov Model (HMM) based stride segmentation approach compared to an available dynamic time warping (DTW) based method. Results The proposed HMM achieved a mean F1-score of 92.1% and outperformed the DTW approach significantly. Further analysis revealed a dependency of segmentation performance on the number of strides within respective walking bouts. Shorter bouts (<30 strides) resulted in worse performance, which could be related to more heterogeneous gait and an increased diversity of different stride types in short free-living walking bouts. In contrast, the HMM reached F1-scores of more than 96.2% for longer bouts (>50 strides). Furthermore, we showed that an HMM, which was trained on at-lab data only, could be transferred to a free-living context with a negligible decrease in performance. Conclusion The generalizability of the proposed HMM is a promising feature, as fully labeled free-living training data might not be available for many applications. To the best of our knowledge, this is the first evaluation of stride segmentation performance on a large scale free-living dataset. Our proposed HMM-based approach was able to address the increased complexity of free-living gait data, and thus will help to enable a robust assessment of stride parameters in future free-living gait analysis applications. Supplementary Information The online version contains supplementary material available at 10.1186/s12984-021-00883-7.
|
28
|
Queirós P, Delogu F, Hickl O, May P, Wilmes P. Mantis: flexible and consensus-driven genome annotation. Gigascience 2021; 10:6291114. [PMID: 34076241 PMCID: PMC8170692 DOI: 10.1093/gigascience/giab042] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 11/04/2020] [Revised: 03/22/2021] [Accepted: 05/14/2021] [Indexed: 12/22/2022] Open
Abstract
Background The rapid development of the (meta-)omics fields has produced an unprecedented amount of high-resolution and high-fidelity data. Through the use of these datasets we can infer the role of previously functionally unannotated proteins from single organisms and consortia. In this context, protein function annotation can be described as the identification of regions of interest (i.e., domains) in protein sequences and the assignment of biological functions. Despite the existence of numerous tools, challenges remain in terms of speed, flexibility, and reproducibility. In the big data era, it is also increasingly important to cease limiting our findings to a single reference, coalescing knowledge from different data sources, and thus overcoming some limitations in overly relying on computationally generated data from single sources. Results We implemented a protein annotation tool, Mantis, which uses database identifiers intersection and text mining to integrate knowledge from multiple reference data sources into a single consensus-driven output. Mantis is flexible, allowing for the customization of reference data and execution parameters, and is reproducible across different research goals and user environments. We implemented a depth-first search algorithm for domain-specific annotation, which significantly improved annotation performance compared to sequence-wide annotation. The parallelized implementation of Mantis results in short runtimes while also outputting high coverage and high-quality protein function annotations. Conclusions Mantis is a protein function annotation tool that produces high-quality consensus-driven protein annotations. It is easy to set up, customize, and use, scaling from single genomes to large metagenomes. Mantis is available under the MIT license at https://github.com/PedroMTQ/mantis.
Affiliation(s)
- Pedro Queirós
- Systems Ecology, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg
| | - Francesco Delogu
- Systems Ecology, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg
| | - Oskar Hickl
- Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg
| | - Patrick May
- Bioinformatics Core, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg
| | - Paul Wilmes
- Systems Ecology, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 Avenue du Swing, 4367 Esch-sur-Alzette, Luxembourg
| |
|
29
|
Wu J, Liu Y, Zhao Y. Systematic Review on Local Ancestor Inference From a Mathematical and Algorithmic Perspective. Front Genet 2021; 12:639877. [PMID: 34108987 PMCID: PMC8181461 DOI: 10.3389/fgene.2021.639877] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/10/2020] [Accepted: 04/12/2021] [Indexed: 11/20/2022] Open
Abstract
Genotypic data provide deep insights into population history and medical genetics. The local ancestry inference (LAI) (also termed local ancestry deconvolution) method uses the hidden Markov model (HMM) to solve the mathematical problem of ancestry reconstruction based on genomic data. HMM is combined with other statistical models and machine learning techniques for particular genetic tasks in a series of computer tools. In this article, we survey the mathematical structure, application characteristics, historical development, and benchmark analysis of the LAI method in detail, which will help researchers better understand and further develop LAI methods. First, we extensively explore the mathematical structure of each model and its characteristic applications. Next, we use bibliometrics to show detailed model application fields and list articles to elaborate on the historical development. LAI publications peaked during 2006-2016 and have continued to appear in the following years. The efficiency, accuracy, and stability of the existing models were evaluated by the benchmark. We find that phased data had higher accuracy in comparison with unphased data. We summarize these models with their distinct advantages and disadvantages. The Loter model uses dynamic programming to obtain a globally optimal solution with its parameter-free advantage. Aligned bases can be used directly in the Seqmix model if the genotype is hard to call. This research may help model developers to recognize current challenges, develop more advanced models, and enable scholars to select appropriate models according to given populations and datasets.
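As a rough illustration of how an HMM can label ancestry along a haplotype, the sketch below runs generic Viterbi decoding (a dynamic-programming search for the most probable state path). It is not the algorithm of any tool named in the abstract. The function name `viterbi_local_ancestry`, the uniform prior, the single per-site switch probability, and the toy allele frequencies are all assumptions made for the example.

```python
import math


def viterbi_local_ancestry(alleles, freqs, switch=0.01):
    """Most probable ancestry label per site under a simple HMM.

    alleles: list of 0/1 alleles observed along one haplotype.
    freqs:   freqs[k][i] = P(allele 1 | ancestry k) at site i,
             strictly between 0 and 1 (assumed, to keep logs finite).
    switch:  assumed probability of changing ancestry between
             adjacent sites; needs at least two ancestries.
    """
    n_states = len(freqs)
    log_stay = math.log(1.0 - switch)
    log_move = math.log(switch / (n_states - 1))

    def log_emit(k, i):
        p = freqs[k][i]
        return math.log(p if alleles[i] == 1 else 1.0 - p)

    # Initialise with a uniform prior over ancestries.
    score = [math.log(1.0 / n_states) + log_emit(k, 0) for k in range(n_states)]
    backpointers = []

    for i in range(1, len(alleles)):
        new_score, pointers = [], []
        for k in range(n_states):
            # Best predecessor state for ending in state k at site i.
            candidates = [
                score[j] + (log_stay if j == k else log_move)
                for j in range(n_states)
            ]
            best_prev = max(range(n_states), key=lambda j: candidates[j])
            new_score.append(candidates[best_prev] + log_emit(k, i))
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy input: ancestry 0 mostly carries allele 1, ancestry 1 mostly allele 0.
toy_alleles = [1, 1, 1, 1, 0, 0, 0, 0]
toy_freqs = [[0.9] * 8, [0.1] * 8]
toy_path = viterbi_local_ancestry(toy_alleles, toy_freqs, switch=0.05)
```

Working in log space avoids numerical underflow on long haplotypes. Real LAI tools additionally model recombination distance, phasing error, and reference-panel uncertainty, none of which is attempted here.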
Affiliation(s)
- Jie Wu
- State Key Laboratory of Agrobiotechnology, China Agricultural University, Beijing, China
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, China
| | - Yangxiu Liu
- State Key Laboratory of Agrobiotechnology, China Agricultural University, Beijing, China
| | - Yiqiang Zhao
- State Key Laboratory of Agrobiotechnology, China Agricultural University, Beijing, China
| |
|
30
|
Abstract
Adaptive introgression, the flow of adaptive genetic variation between species or populations, has attracted significant interest in recent years and has been implicated in a number of cases of adaptation, from pesticide resistance and immunity to local adaptation. Despite this, methods for identification of adaptive introgression from population genomic data are lacking. Here, we present Ancestry_HMM-S, a hidden Markov model-based method for identifying genes undergoing adaptive introgression and quantifying the strength of selection acting on them. Through extensive validation, we show that this method performs well on moderately sized data sets for realistic population and selection parameters. We apply Ancestry_HMM-S to a data set of an admixed Drosophila melanogaster population from South Africa and we identify 17 loci which show signatures of adaptive introgression, four of which have previously been shown to confer resistance to insecticides. Ancestry_HMM-S provides a powerful method for inferring adaptive introgression in data sets that are typically collected when studying admixed populations. This method will enable powerful insights into the genetic consequences of admixture across diverse populations. Ancestry_HMM-S can be downloaded from https://github.com/jesvedberg/Ancestry_HMM-S/.
Affiliation(s)
- Jesper Svedberg
- Department of Biomolecular Engineering, Genomics Institute, UC Santa Cruz, Santa Cruz, CA, USA
| | - Vladimir Shchur
- National Research University Higher School of Economics, Moscow, Russian Federation
| | - Solomon Reinman
- Department of Biomolecular Engineering, Genomics Institute, UC Santa Cruz, Santa Cruz, CA, USA
| | - Rasmus Nielsen
- National Research University Higher School of Economics, Moscow, Russian Federation
- Department of Integrative Biology and Department of Statistics, UC Berkeley, Berkeley, CA, USA
- Center for GeoGenetics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Russell Corbett-Detig
- Department of Biomolecular Engineering, Genomics Institute, UC Santa Cruz, Santa Cruz, CA, USA
- National Research University Higher School of Economics, Moscow, Russian Federation
| |
|
31
|
Si Y, Vanderwerff B, Zöllner S. Why are rare variants hard to impute? Coalescent models reveal theoretical limits in existing algorithms. Genetics 2021; 217:iyab011. [PMID: 33686438 PMCID: PMC8049559 DOI: 10.1093/genetics/iyab011] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/09/2020] [Accepted: 12/15/2020] [Indexed: 01/13/2023] Open
Abstract
Genotype imputation is an indispensable step in human genetic studies. Large reference panels with deeply sequenced genomes now allow interrogating variants with minor allele frequency < 1% without sequencing. Although it is critical to consider limits of this approach, imputation methods for rare variants have only done so empirically; the theoretical basis of their imputation accuracy has not been explored. To provide theoretical consideration of imputation accuracy under the current imputation framework, we develop a coalescent model of imputing rare variants, leveraging the joint genealogy of the sample to be imputed and reference individuals. We show that broadly used imputation algorithms include model misspecifications about this joint genealogy that limit the ability to correctly impute rare variants. We develop closed-form solutions for the probability distribution of this joint genealogy and quantify the inevitable error rate resulting from the model misspecification across a range of allele frequencies and reference sample sizes. We show that the probability of a falsely imputed minor allele decreases with reference sample size, but the proportion of falsely imputed minor alleles mostly depends on the allele count in the reference sample. We summarize the impact of this error on association tests by calculating the r² between imputed and true genotype and show that, even when modeling other sources of error, the model misspecification has a significant impact on the r² of rare variants. To evaluate these predictions in practice, we compare the imputation of the same dataset across imputation panels of different sizes. Although this empirical imputation accuracy is substantially lower than our theoretical prediction, modeling misspecification seems to further decrease imputation accuracy for variants with low allele counts in the reference. These results provide a framework for developing new imputation algorithms and for interpreting rare variant association analyses.
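The r² accuracy measure mentioned in the abstract can be read as the squared Pearson correlation between true genotypes and imputed dosages at one variant. The sketch below computes it that way; the function name `dosage_r2` and the toy vectors are assumptions for illustration and are not data or code from the study.

```python
import math


def dosage_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes (0, 1, 2)
    and imputed dosages (floats in [0, 2]) at a single variant.

    Returns NaN when either vector has zero variance, for example a
    rare variant that is monomorphic in the sample.
    """
    n = len(true_genotypes)
    if n == 0 or n != len(imputed_dosages):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n

    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)

    if var_t == 0.0 or var_d == 0.0:
        return math.nan
    return (cov * cov) / (var_t * var_d)


# Hypothetical rare variant: two carriers among ten samples,
# one of them imputed with low confidence.
toy_true = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
toy_dosage = [0.0, 0.1, 0.0, 0.9, 0.0, 0.0, 0.2, 0.0, 0.3, 0.0]
toy_r2 = dosage_r2(toy_true, toy_dosage)
```

For rare variants a single mis-imputed carrier moves r² substantially because the variance of the true genotype is small, which is one reason the measure is sensitive to allele count.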
Affiliation(s)
- Yichen Si
- Department of Biostatistics, School of Public Health, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA
| | - Brett Vanderwerff
- Department of Biostatistics, School of Public Health, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA
| | - Sebastian Zöllner
- Department of Biostatistics, School of Public Health, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA
- Department of Psychiatry, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA
| |
|
32
|
Parente JD, Chase JG, Moeller K, Shaw GM. High Inter-Patient Variability in Sepsis Evolution: A Hidden Markov Model Analysis. Comput Methods Programs Biomed 2021; 201:105956. [PMID: 33561709 DOI: 10.1016/j.cmpb.2021.105956] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/11/2020] [Accepted: 01/24/2021] [Indexed: 06/12/2023]
Abstract
BACKGROUND Severe sepsis and septic shock are common in the intensive care unit (ICU) and contribute significantly to cost and mortality. Early treatment is critical but is confounded by the difficulty of real-time diagnosis. This study uses hidden Markov models (HMMs) to examine whether the time evolution of sepsis can add diagnostic accuracy or value using a proven set of bio-signals. METHODS Clinical data (N=36 patients; 6071 hours), including an hourly personalised insulin sensitivity metric, were used. A two-hidden-state HMM is created to discriminate diagnosed case states (Severe Sepsis, Septic Shock) from control states (SIRS, Sepsis). Diagnostic performance is measured by ROC curves, likelihood ratios (LHRs), sensitivity/specificity, and diagnostic odds-ratios (DOR), for a best-case resubstitution estimate and a worst-case 80/20% repeated holdout analysis. RESULTS The HMM delivered near perfect results (95% Sensitivity; 96% Specificity) for best-case resubstitution estimates, but was comparatively poor (59% Sensitivity; 61% Specificity) for worst-case repeated holdout estimations. Adding the time evolution of sepsis did not add to the accuracy of diagnosis from using the signals alone without time history. CONCLUSIONS These potentially surprising results indicate significant inter-patient variability in the time evolution of sepsis, preventing effective diagnosis in the context of the bio-signals, data, and HMM topology used. Efforts for improved real-time, early sepsis diagnosis should concentrate on the robustness and efficacy of the bio-signals and data used, as well as the level of model complexity, to create more effective real-time classifiers.
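The diagnostic measures listed in the abstract (sensitivity, specificity, likelihood ratios, diagnostic odds ratio) all follow from a 2x2 confusion table. The sketch below shows the standard definitions; the `Confusion` class, the function name, and the counts are hypothetical, chosen only so that the result matches the reported 95% sensitivity and 96% specificity, and are not the study's data.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary diagnostic test (all assumed non-zero here)."""
    tp: int  # diagnosed cases flagged as cases
    fp: int  # controls flagged as cases
    tn: int  # controls flagged as controls
    fn: int  # diagnosed cases flagged as controls


def diagnostic_metrics(c):
    """Return the usual summary measures for a 2x2 diagnostic table."""
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Positive likelihood ratio: how much a positive result raises the odds.
        "lr_positive": sensitivity / (1.0 - specificity),
        # Negative likelihood ratio: how much a negative result lowers the odds.
        "lr_negative": (1.0 - sensitivity) / specificity,
        # Diagnostic odds ratio, equal to lr_positive / lr_negative.
        "dor": (c.tp * c.tn) / (c.fp * c.fn),
    }


# Hypothetical counts reproducing 95% sensitivity and 96% specificity.
best_case = diagnostic_metrics(Confusion(tp=95, fp=4, tn=96, fn=5))
```

Applying the same function to holdout predictions rather than resubstitution predictions is what separates the best-case and worst-case estimates contrasted in the abstract.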
Affiliation(s)
| | | | | | - Geoffrey M Shaw
- Otago University School of Medicine; and ICU, Christchurch Hospital; Christchurch, New Zealand.
| |
|
33
|
Irisarri I, Burki F, Whelan S. Automated Removal of Non-homologous Sequence Stretches with PREQUAL. Methods Mol Biol 2021; 2231:147-162. [PMID: 33289892 DOI: 10.1007/978-1-0716-1036-7_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Large-scale multigene datasets used in phylogenomics and comparative genomics often contain sequence errors inherited from source genomes and transcriptomes. These errors typically manifest as stretches of non-homologous characters and derive from sequencing, assembly, and/or annotation errors. The lack of automatic tools to detect and remove sequence errors leads to the propagation of these errors in large-scale datasets. PREQUAL is a command line tool that identifies and masks regions with non-homologous adjacent characters in sets of unaligned homologous sequences. PREQUAL uses a full probabilistic approach based on pair hidden Markov models. On the front end, PREQUAL is user-friendly and simple to use while also allowing full customization to adjust filtering sensitivity. It is primarily aimed at amino acid sequences but can handle protein-coding nucleotide sequences. PREQUAL is computationally efficient and shows high sensitivity and accuracy. In this chapter, we briefly introduce the motivation for PREQUAL and its underlying methodology, followed by a description of basic and advanced usage, and conclude with some notes and recommendations. PREQUAL fills an important gap in the current bioinformatics tool kit for phylogenomics, contributing toward increased accuracy and reproducibility in future studies.
Affiliation(s)
- Iker Irisarri
- Department of Organismal Biology (Program in Systematic Biology), Uppsala University, Uppsala, Sweden.
- Department of Biodiversity and Evolutionary Biology, Museo Nacional de Ciencias Naturales, Madrid, Spain.
- Department of Applied Bioinformatics, Institute for Microbiology and Genetics, University of Göttingen, Göttingen, Germany.
| | - Fabien Burki
- Department of Organismal Biology (Program in Systematic Biology), Uppsala University, Uppsala, Sweden
- Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Simon Whelan
- Department of Evolutionary Genetics (Program in Evolutionary Biology), Uppsala University, Uppsala, Sweden
| |
|
34
|
Liu Y, Wang X. Differences in Driving Intention Transitions Caused by Driver's Emotion Evolutions. Int J Environ Res Public Health 2020; 17:E6962. [PMID: 32977577 DOI: 10.3390/ijerph17196962] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/13/2020] [Revised: 09/14/2020] [Accepted: 09/21/2020] [Indexed: 11/17/2022]
Abstract
Joining worldwide efforts to understand the relationship between driving emotion and behavior, the current study examined the influence of emotions on driving intention transitions. In Study 1, taking a car-following scene as an example, we designed driving experiments to obtain driving data in drivers’ natural states, and a driving intention prediction model was constructed based on the HMM. Then, we analyzed the probability distribution and transition probability of driving intentions. In Study 2, we designed a series of emotion-induction experiments for eight typical driving emotions, and drivers with induced emotions participated in driving experiments similar to those in Study 1. We thus obtained driving data for drivers in eight typical emotional states, and driving intention prediction models adapted to each emotional state were constructed separately based on the HMM. Finally, we analyzed the probabilistic differences in driving intention between drivers’ natural states and different emotional states, and the findings showed how the probability distribution and transition probability of driving intention change as emotion evolves. The findings of this study can promote the development of driving behavior prediction technology and active safety early warning systems.
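The transition probabilities analyzed in this abstract can be illustrated with a minimal sketch that estimates a first-order transition matrix by counting adjacent pairs in a decoded intention sequence. The intention labels and the sequence below are hypothetical and are not taken from the study, which fits its HMMs to real driving data.

```python
from collections import Counter, defaultdict


def transition_matrix(sequence):
    """Estimate first-order transition probabilities by counting adjacent pairs."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    matrix = {}
    for state, row in counts.items():
        total = sum(row.values())
        # Normalize each row so the outgoing probabilities sum to 1.
        matrix[state] = {nxt: n / total for nxt, n in row.items()}
    return matrix


# Hypothetical decoded intention labels for one car-following drive.
intentions = ["keep", "keep", "accelerate", "keep",
              "decelerate", "decelerate", "keep"]
probs = transition_matrix(intentions)
for state, row in probs.items():
    print(state, row)
```

Comparing such matrices across emotional states is one simple way to see differences in intention transitions; a fitted HMM would instead estimate these values jointly with the emission parameters.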
|
35
|
Fu W. Application of an Isolated Word Speech Recognition System in the Field of Mental Health Consultation: Development and Usability Study. JMIR Med Inform 2020; 8:e18677. [PMID: 32384054 PMCID: PMC7301261 DOI: 10.2196/18677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2020] [Revised: 03/21/2020] [Accepted: 03/21/2020] [Indexed: 11/16/2022] Open
Abstract
Background Speech recognition is a technology that enables machines to understand human language. Objective In this study, speech recognition of isolated words from a small vocabulary was applied to the field of mental health counseling. Methods A software platform was used to establish a human-machine chat for psychological counselling. The software uses voice recognition technology to decode the user's voice information. The software system analyzes and processes the user's voice information according to many internal related databases, and then gives the user accurate feedback. For users who need psychological treatment, the system provides them with psychological education. Results The speech recognition system included features such as speech extraction, endpoint detection, feature value extraction, training data, and speech recognition. Conclusions The Hidden Markov Model was adopted, based on multithread programming under a VC2005 compilation environment, to realize the parallel operation of the algorithm and improve the efficiency of speech recognition. After the design was completed, simulation debugging was performed in the laboratory. The experimental results showed that the designed program met the basic requirements of a speech recognition system.
Affiliation(s)
- Weifeng Fu
- Liberal Arts College, Hunan Normal University, Changsha, China
| |
|
36
|
Huang J, Luo H, Shao W, Zhao F, Yan S. Accurate and Robust Floor Positioning in Complex Indoor Environments. Sensors (Basel) 2020; 20:E2698. [PMID: 32397404 DOI: 10.3390/s20092698] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/24/2020] [Revised: 04/26/2020] [Accepted: 05/05/2020] [Indexed: 12/04/2022]
Abstract
With the widespread development of location-based services, the demand for accurate indoor positioning is increasingly urgent. Floor positioning, as a prerequisite for indoor positioning in multi-story buildings, is particularly important. Although much work has been done on floor positioning, existing methods still cannot meet application requirements in complex multi-story buildings with large hollow areas spanning multiple floors, because of their low accuracy and robustness. To obtain accurate and robust floor estimation in such buildings, we propose a novel floor positioning method that combines Wi-Fi based floor positioning (BWFP), barometric pressure-based floor positioning (BPFP) with HMM, and XGBoost-based user motion detection. Extensive experiments show that our proposed method achieves 99.2% accuracy, outperforming other state-of-the-art floor estimation methods.
|
37
|
Abstract
BACKGROUND An important part of the rehabilitation process using exoskeleton robots has been the creation of a friendly Human Robot Interaction (HRI) system. OBJECTIVE To integrate SEMG signals into the HRI system, an SEMG-angle model based on the Hidden Markov Model (HMM) is put forward in this paper. METHODS Feature extraction, a critical issue in signal preprocessing, was handled by Principal Component Analysis (PCA), which reduced the dimension of the signal data and solved the common problem of redundant features. A comparison study was conducted to show the performance of EMG-angle models based on HMM, Back Propagation (BP) neural network, and Radial Basis Function (RBF) neural network, respectively. RESULTS The HMM modeling method, which has lower computational complexity, achieved better modeling performance (average accuracy 93.063%) than the BP neural network (average accuracy 88.180%) and the RBF neural network (average accuracy 88.752%). CONCLUSIONS SEMG signals have characteristic properties similar to those of a quasi-stationary filtered white-noise stochastic process. The structure of HMMs makes them ideally suited for classifying and modeling SEMG signals, and the results of this study show that HMMs can achieve better performance than the commonly used methods (BP and RBF).
Affiliation(s)
- Yanyan Chen
- Lianyungang Jari deepsoft Technology Co., LTD, Lianyungang, Jiangsu 222000, China; The 716th Research Institute of CSIC, Lianyungang, Jiangsu 222000, China
| | - Le Liang
- Lianyungang Jari deepsoft Technology Co., LTD, Lianyungang, Jiangsu 222000, China; The 716th Research Institute of CSIC, Lianyungang, Jiangsu 222000, China
| | - Maochuan Wu
- Lianyungang Jari deepsoft Technology Co., LTD, Lianyungang, Jiangsu 222000, China
| | - Qi Dong
- Lianyungang Jari deepsoft Technology Co., LTD, Lianyungang, Jiangsu 222000, China
| |
|
38
|
Tseng YT, Kawashima S, Kobayashi S, Takeuchi S, Nakamura K. Forecasting the seasonal pollen index by using a hidden Markov model combining meteorological and biological factors. Sci Total Environ 2020; 698:134246. [PMID: 31505344 DOI: 10.1016/j.scitotenv.2019.134246] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/28/2019] [Revised: 08/29/2019] [Accepted: 09/01/2019] [Indexed: 06/10/2023]
Abstract
The seasonal pollen index (SPI) is a continuing concern within the fields of aerobiology, ecology, botany, and epidemiology. The SPI of anemophilous trees, which varies substantially from year to year, reflects the flowering intensity. This intensity is regulated by two factors: weather conditions during flower formation and the inner resource for assimilation. A deterministic approach has to date been employed for predicting SPI, in which the forecast is made entirely by parameters. However, given the complexity of the masting mechanism (which has intrinsic stochastic properties), few attempts have been made to apply a stochastic model that considers the inter-annual SPI variation as a stochastic process. We propose a hidden Markov model that can integrate the stochastic process of mast flowering and the meteorological conditions influencing flower formation to predict the annual birch pollen concentration. In experiments conducted, the model was trained and validated by using data in Hokkaido, Japan covering 22 years. In the model, the hidden Markov sequence was assigned to represent the recurrence of mast years via a transition matrix, and the observation sequences were designated as meteorological conditions in the previous summer, which are governed by hidden states with emission distribution. The proposed model achieved accuracies of 83.3% in the training period and 75.0% in the test period. Thus, the proposed model can provide an alternative perspective toward the SPI forecast and probabilistic information of pollen levels as a useful reference for allergy stakeholders.
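The structure described in this abstract, a hidden mast/non-mast state linked by a transition matrix with weather observations emitted by the hidden state, can be sketched as a forward filter followed by a one-step state prediction. This is a minimal illustration only: the state names, observation labels, and all probabilities below are hypothetical and are not the fitted values from the study.

```python
def forward_filter(initial, transition, emission, observations):
    """Return P(hidden state | all observations so far) after the last one."""
    states = list(initial)
    belief = {s: initial[s] * emission[s][observations[0]] for s in states}
    norm = sum(belief.values())
    belief = {s: v / norm for s, v in belief.items()}
    for obs in observations[1:]:
        # Propagate the belief one step through the transition matrix.
        predicted = {s: sum(belief[r] * transition[r][s] for r in states)
                     for s in states}
        # Weight by how well each state explains the new observation.
        belief = {s: predicted[s] * emission[s][obs] for s in states}
        norm = sum(belief.values())
        belief = {s: v / norm for s, v in belief.items()}
    return belief


def predict_next(belief, transition):
    """One-step-ahead distribution over hidden states."""
    states = list(belief)
    return {s: sum(belief[r] * transition[r][s] for r in states)
            for s in states}


# Hypothetical parameters for illustration only.
initial = {"mast": 0.5, "non_mast": 0.5}
transition = {"mast": {"mast": 0.2, "non_mast": 0.8},
              "non_mast": {"mast": 0.6, "non_mast": 0.4}}
emission = {"mast": {"warm_summer": 0.7, "cool_summer": 0.3},
            "non_mast": {"warm_summer": 0.3, "cool_summer": 0.7}}

belief = forward_filter(initial, transition, emission, ["warm_summer"])
next_year = predict_next(belief, transition)
print(belief, next_year)
```

The output of `predict_next` is a probability for each state rather than a single label, which corresponds to the probabilistic information on pollen levels that the abstract mentions.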
Affiliation(s)
- Yi-Ting Tseng
- Graduate School of Agriculture, Kyoto University, Kitashirakawa-Oiwakecho, Sakyo-Ku, Kyoto 606-8502, Japan
| | - Shigeto Kawashima
- Graduate School of Agriculture, Kyoto University, Kitashirakawa-Oiwakecho, Sakyo-Ku, Kyoto 606-8502, Japan.
| | - Satoshi Kobayashi
- Hokkaido Institute of Public Health, 12 Chome Kita 19 Jonishi, Kita Ward, Sapporo, Hokkaido 060-0819, Japan
| | - Shinji Takeuchi
- Hokkaido Institute of Public Health, 12 Chome Kita 19 Jonishi, Kita Ward, Sapporo, Hokkaido 060-0819, Japan
| | - Kimihito Nakamura
- Graduate School of Agriculture, Kyoto University, Kitashirakawa-Oiwakecho, Sakyo-Ku, Kyoto 606-8502, Japan
| |
|
39
|
Brekkan A, Jönsson S, Karlsson MO, Plan EL. Handling underlying discrete variables with bivariate mixed hidden Markov models in NONMEM. J Pharmacokinet Pharmacodyn 2019; 46:591-604. [PMID: 31654267 PMCID: PMC6868114 DOI: 10.1007/s10928-019-09658-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2019] [Accepted: 10/09/2019] [Indexed: 11/26/2022]
Abstract
Non-linear mixed effects models typically deal with stochasticity in observed processes but models accounting for only observed processes may not be the most appropriate for all data. Hidden Markov models (HMMs) characterize the relationship between observed and hidden variables where the hidden variables can represent an underlying and unmeasurable disease status for example. Adding stochasticity to HMMs results in mixed HMMs (MHMMs) which potentially allow for the characterization of variability in unobservable processes. Further, HMMs can be extended to include more than one observation source and are then multivariate HMMs. In this work MHMMs were developed and applied in a chronic obstructive pulmonary disease example. The two hidden states included in the model were remission and exacerbation and two observation sources were considered, patient reported outcomes (PROs) and forced expiratory volume (FEV1). Estimation properties in the software NONMEM of model parameters were investigated with and without random and covariate effect parameters. The influence of including random and covariate effects of varying magnitudes on the parameters in the model was quantified and a power analysis was performed to compare the power of a single bivariate MHMM with two separate univariate MHMMs. A bivariate MHMM was developed for simulating and analysing hypothetical COPD data consisting of PRO and FEV1 measurements collected every week for 60 weeks. Parameter precision was high for all parameters with the exception of the variance of the transition rate dictating the transition from remission to exacerbation (relative root mean squared error [RRMSE] > 150%). Parameter precision was better with higher magnitudes of the transition probability parameters. A drug effect was included on the transition rate probability and the precision of the drug effect parameter improved with increasing magnitude of the parameter. 
The power to detect the drug effect was improved by using a bivariate MHMM over the univariate MHMMs: the number of subjects required for 80% power was 25 with the bivariate MHMM versus 63 with the univariate MHMM for FEV1 and > 100 with the univariate MHMM for PRO. The results advocate the use of bivariate MHMMs when implementation is possible.
Affiliation(s)
- A Brekkan
- Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 75124, Uppsala, Sweden
| | - S Jönsson
- Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 75124, Uppsala, Sweden
| | - M O Karlsson
- Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 75124, Uppsala, Sweden
| | - E L Plan
- Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 75124, Uppsala, Sweden.
| |
|
40
|
Luo X, Shen Z. A Sensing and Tracking Algorithm for Multiple Frequency Line Components in Underwater Acoustic Signals. Sensors (Basel) 2019; 19:E4866. [PMID: 31717380 DOI: 10.3390/s19224866] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/11/2019] [Revised: 10/24/2019] [Accepted: 11/01/2019] [Indexed: 11/16/2022]
Abstract
Reliable and efficient sensing and tracking of multiple weak or time-varying frequency line components in underwater acoustic signals is the topic of this paper. We propose a method for automatic detection and tracking of multiple frequency lines in a lofargram based on a hidden Markov model (HMM). Instead of being directly subjected to frequency line tracking, the whole lofargram is first segmented into several sub-lofargrams. Then, the sub-lofargrams suspected to contain frequency lines are screened. In these sub-lofargrams, the HMM-based method is used to detect multiple frequency lines. Using image stitching and a statistical model method, the frequency lines with overlapping parts detected in different sub-lofargrams are merged to obtain the final detection results. The method can effectively detect multiple time-varying frequency lines in underwater acoustic signals while maintaining performance at low signal-to-noise ratio (SNR). The proposed algorithm provides better sensing of multiple frequency lines while greatly reducing the amount of computation, and offers a potential technique for feature sensing and tracking on unattended equipment such as sonar buoys and submerged buoys.
|
41
|
Abstract
Insight into the inter- and intra-family relationships of protein families is important, since it can aid understanding of substrate specificity evolution and help assign putative functions to proteins of unknown function. To study both inter- and intra-family relationships, the ability to build phylogenetic trees using the most sensitive sequence similarity search methods (e.g. profile hidden Markov model (pHMM)-pHMM alignments) is required. However, existing solutions require a very long calculation time to obtain the phylogenetic tree. Therefore, a faster protocol is required to make this approach efficient for research. To contribute to this goal, we extended the original Profile Comparer program (PRC) for the construction of large pHMM phylogenetic trees at speeds several orders of magnitude faster than pHMM-tree. As an example, PRC Extended (PRCx) was used to study the phylogeny of over 10,000 sequences of lytic polysaccharide monooxygenase (LPMO) from over seven families. Using the newly developed program, we were able to reveal previously unknown homologs of LPMOs, namely the PFAM Egh16-like family. Moreover, we show that the substrate specificities have evolved independently several times within the LPMO superfamily. Furthermore, the LPMO phylogenetic tree does not seem to follow taxonomy-based classification.
Affiliation(s)
- Gerben P. Voshol
- Department of Microbial Biotechnology and Health, Institute of Biology Leiden, Leiden, 2333BE, The Netherlands
- Dutch DNA Biotech B.V., Utrecht, 3584CH, The Netherlands
| | - Peter J. Punt
- Department of Microbial Biotechnology and Health, Institute of Biology Leiden, Leiden, 2333BE, The Netherlands
- Dutch DNA Biotech B.V., Utrecht, 3584CH, The Netherlands
| | - Erik Vijgenboom
- Department of Microbial Biotechnology and Health, Institute of Biology Leiden, Leiden, 2333BE, The Netherlands
| |
|
42
|
Wiedenhoeft J, Cagan A, Kozhemyakina R, Gulevich R, Schliep A. Bayesian localization of CNV candidates in WGS data within minutes. Algorithms Mol Biol 2019; 14:20. [PMID: 31572486 PMCID: PMC6757390 DOI: 10.1186/s13015-019-0154-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/11/2018] [Accepted: 08/08/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Full Bayesian inference for detecting copy number variants (CNV) from whole-genome sequencing (WGS) data is still largely infeasible due to computational demands. A recently introduced approach to perform Forward-Backward Gibbs sampling using dynamic Haar wavelet compression has alleviated issues of convergence and, to some extent, speed. Yet, the problem remains challenging in practice. RESULTS In this paper, we propose an improved algorithmic framework for this approach. We provide new space-efficient data structures to query sufficient statistics in logarithmic time, based on a linear-time, in-place transform of the data, which also improves on the compression ratio. We also propose a new approach to efficiently store and update marginal state counts obtained from the Gibbs sampler. CONCLUSIONS Using this approach, we discover several CNV candidates in two rat populations divergently selected for tame and aggressive behavior, consistent with earlier results concerning the domestication syndrome as well as experimental observations. Computationally, we observe a 29.5-fold decrease in memory, an average 5.8-fold speedup, as well as a 191-fold decrease in minor page faults. We also observe that metrics varied greatly in the old implementation, but not the new one. We conjecture that this is due to the better compression scheme. The fully Bayesian segmentation of the entire WGS data set required 3.5 min and 1.24 GB of memory, and can hence be performed on a commodity laptop.
|
43
|
Essa E, Jones JL, Xie X. Coupled s-excess HMM for vessel border tracking and segmentation. Int J Numer Method Biomed Eng 2019; 35:e3206. [PMID: 30968570 DOI: 10.1002/cnm.3206] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/04/2018] [Revised: 03/28/2019] [Accepted: 03/29/2019] [Indexed: 06/09/2023]
Abstract
In this paper, we present a novel image segmentation technique, based on a hidden Markov model (HMM), which we then apply to simultaneously segment the interior and exterior walls in fluorescent confocal images of lymphatic vessels. Our proposed method achieves this by tracking hidden states, which are used to indicate the locations of both the inner and outer wall borders throughout the sequence of images. We parameterize these vessel borders using radial basis functions (RBFs), thus enabling us to minimize the number of points we need to track as we progress through multiple layers and therefore reduce computational complexity. Information about each border is detected using patch-wise convolutional neural networks (CNN). We use the softmax function to infer the emission probability and use a proposed new training algorithm based on s-excess optimization to learn the transition probability. We also introduce a new optimization method to determine the optimum sequence of the hidden states. Thus, we transform the segmentation problem into one that minimizes an s-excess graph cut, where each hidden state is represented as a graph node and the weights of these nodes are defined by their emission probabilities. The transition probabilities are used to define relationships between neighboring nodes in the constructed graph. We compare our proposed method to the Viterbi and Baum-Welch algorithms. Both qualitative and quantitative analyses show superior performance of the proposed methods.
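The step in which softmax turns network outputs into emission probabilities can be sketched as follows. This is a generic, numerically stable softmax; the raw scores are hypothetical stand-ins for patch-wise CNN outputs, and the authors' actual network and score layout are not given in the abstract.

```python
import math


def softmax(scores):
    """Map raw scores to probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three candidate border positions.
patch_scores = [2.0, 0.5, -1.0]
emission = softmax(patch_scores)
print(emission)
```

Because softmax is unchanged when a constant is added to every score, subtracting the maximum changes nothing mathematically and only keeps `math.exp` within range.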
Affiliation(s)
- Ehab Essa
- Department of Computer Science, Faculty of Computers and Information, Mansoura University, Mansoura, Egypt
- Department of Computer Science, Swansea University, Swansea, UK
| | | | - Xianghua Xie
- Department of Computer Science, Swansea University, Swansea, UK
| |
|
44
|
Abstract
Tandem mass spectrometry has become the method of choice for high-throughput, quantitative analysis in proteomics. Peptide spectrum matching algorithms score the concordance between the experimental and the theoretical spectra of candidate peptides by evaluating the number (or proportion) of theoretically possible fragment ions observed in the experimental spectra without any discrimination. However, the assumption that each theoretical fragment is just as likely to be observed is inaccurate. On the contrary, MS2 spectra often have few dominant fragments. Using millions of MS/MS spectra we show that there is high reproducibility across different fragmentation spectra given the precursor peptide and charge state, implying that there is a pattern to fragmentation. To capture this pattern we propose a novel prediction algorithm based on hidden Markov models with an efficient training process. We investigated the performance of our interpolated-HMM model, trained on millions of MS2 spectra, and found that our model picks up meaningful patterns in peptide fragmentation. Second, looking at the variability of the prediction performance by varying the train/test data split, we observed that our model performs well independent of the specific peptides that are present in the training data. Furthermore, we propose that the real value of this model is as a preprocessing step in the peptide identification process. The model can discern fragment ions that are unlikely to be intense for a given candidate peptide rather than using the actual predicted intensities. As such, probabilistic measures of concordance between experimental and theoretical spectra will leverage better statistics.
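The baseline scoring idea criticized in this abstract, counting the proportion of theoretical fragment ions found in the experimental spectrum with every fragment treated as equally likely, can be sketched as below. The m/z values and the 0.02 tolerance are hypothetical, and the function is a simplified illustration rather than the scoring function of any particular search engine.

```python
def matched_fraction(theoretical_mz, observed_mz, tolerance=0.02):
    """Fraction of theoretical fragment m/z values with an observed peak nearby."""
    if not theoretical_mz:
        return 0.0
    matched = 0
    for mz in theoretical_mz:
        # Every theoretical fragment counts equally, whatever its expected intensity.
        if any(abs(mz - peak) <= tolerance for peak in observed_mz):
            matched += 1
    return matched / len(theoretical_mz)


# Hypothetical fragment masses for one candidate peptide and one spectrum.
theoretical = [175.119, 276.166, 389.251, 502.335]
observed = [175.120, 389.250, 610.000]
score = matched_fraction(theoretical, observed)
print(score)
```

A fragmentation model such as the one proposed in the abstract could be used before this step to drop theoretical fragments that are unlikely to be intense, so that the denominator reflects only fragments that are expected to appear.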
Affiliation(s)
- Ufuk Kirik
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Science, University of Copenhagen, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark
| | - Jan C Refsgaard
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Science, University of Copenhagen, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark; Intomics A/S, Lottenborgvej 26, DK-2800 Kongens Lyngby, Denmark
| | - Lars J Jensen
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Science, University of Copenhagen, Blegdamsvej 3B, DK-2200 Copenhagen, Denmark
| |
|
45
|
Kirsip H, Abroi A. Protein Structure-Guided Hidden Markov Models (HMMs) as A Powerful Method in the Detection of Ancestral Endogenous Viral Elements. Viruses 2019; 11:v11040320. [PMID: 30986983 PMCID: PMC6520822 DOI: 10.3390/v11040320] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2019] [Revised: 03/23/2019] [Accepted: 03/27/2019] [Indexed: 12/19/2022] Open
Abstract
It has been believed for a long time that the transfer and fixation of genetic material from RNA viruses to eukaryote genomes is very unlikely. However, during the last decade, there have been several cases in which “virus-to-host” gene transfer from various viral families into various eukaryotic phyla have been described. These transfers have been identified by sequence similarity, which may disappear very quickly, especially in the case of RNA viruses. However, compared to sequences, protein structure is known to be more conserved. Applying protein structure-guided protein domain-specific Hidden Markov Models, we detected homologues of the Virgaviridae capsid protein in Schizophora flies. Further data analysis supported “virus-to-host” transfer into Schizophora ancestors as a single transfer event. This transfer was not identifiable by BLAST or by other methods we applied. Our data show that structure-guided Hidden Markov Models should be used to detect ancestral virus-to-host transfers.
Affiliation(s)
- Heleri Kirsip
- Department of Bioinformatics, University of Tartu, Tartu, 51010, Riia 23, Estonia.
| | - Aare Abroi
- Institute of Technology, University of Tartu, Tartu, 50411, Nooruse 1, Estonia.
| |
|
46
|
Ambidi N, Katta RLR. Adaptive Risk Prediction and Anonymous Secured Communication in MANET for Medical Informatics. J Med Syst 2019; 43:115. [PMID: 30905047 DOI: 10.1007/s10916-019-1231-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/26/2018] [Accepted: 02/27/2019] [Indexed: 11/27/2022]
Abstract
Location-based services (LBS) and information security are major concerns in communication systems. With the increasing popularity of location-based services, more attention is being paid to preserving location information to protect data. Various location-based anonymity protocols, such as location-based k-anonymity, exist to protect MANETs and location-based services, but these protocols incur high overhead because of the dynamic mobility of ad hoc networks. In this paper, we propose an Adaptive Risk Prediction and Anonymous Secured Communication protocol that predicts risk before processing anonymous communication. The proposed protocol estimates the risk against adjacent nodes and estimates vulnerable paths using a hidden Markov model and a decision tree. The decision tree determines the evidence used to identify trusted paths. The anonymous communication message authentication scheme assigns the anonymous communication and organizes the secured authentication scheme. We simulated the network under different attacks, using the NS2 simulator, to determine the efficiency of Adaptive Risk Prediction and Anonymous Secured Communication.
Affiliation(s)
- Naveena Ambidi
- G. Narayanamma Institute of Technology and Science, Hyderabad, India.
| | | |
|
47
|
Golmohammadi M, Harati Nejad Torbati AH, Lopez de Diego S, Obeid I, Picone J. Automatic Analysis of EEGs Using Big Data and Hybrid Deep Learning Architectures. Front Hum Neurosci 2019; 13:76. [PMID: 30914936 PMCID: PMC6423064 DOI: 10.3389/fnhum.2019.00076] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2018] [Accepted: 02/13/2019] [Indexed: 11/13/2022] Open
Abstract
Brain monitoring combined with automatic analysis of EEGs provides a clinical decision support tool that can reduce time to diagnosis and assist clinicians in real-time monitoring applications (e.g., neurological intensive care units). Clinicians have indicated that a sensitivity of 95% with specificity below 5% was the minimum requirement for clinical acceptance. In this study, a high-performance automated EEG analysis system based on principles of machine learning and big data is proposed. This hybrid architecture integrates hidden Markov models (HMMs) for sequential decoding of EEG events with deep learning-based post-processing that incorporates temporal and spatial context. These algorithms are trained and evaluated using the Temple University Hospital EEG, which is the largest publicly available corpus of clinical EEG recordings in the world. This system automatically processes EEG records and classifies three patterns of clinical interest in brain activity that might be useful in diagnosing brain disorders: (1) spike and/or sharp waves, (2) generalized periodic epileptiform discharges, (3) periodic lateralized epileptiform discharges. It also classifies three patterns used to model the background EEG activity: (1) eye movement, (2) artifacts, and (3) background. Our approach delivers a sensitivity above 90% while maintaining a specificity below 5%. We also demonstrate that this system delivers a low false alarm rate, which is critical for any spike detection application.
Affiliation(s)
- Meysam Golmohammadi
- The Neural Engineering Data Consortium, Temple University, Philadelphia, PA, United States
| | | | - Silvia Lopez de Diego
- The Neural Engineering Data Consortium, Temple University, Philadelphia, PA, United States
| | - Iyad Obeid
- The Neural Engineering Data Consortium, Temple University, Philadelphia, PA, United States
| | - Joseph Picone
- The Neural Engineering Data Consortium, Temple University, Philadelphia, PA, United States
| |
|
48
|
Tang M, Hasan MS, Zhu H, Zhang L, Wu X. vi-HMM: a novel HMM-based method for sequence variant identification in short-read data. Hum Genomics 2019; 13:9. [PMID: 30795817 PMCID: PMC6387560 DOI: 10.1186/s40246-019-0194-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2018] [Accepted: 01/29/2019] [Indexed: 12/30/2022] Open
Abstract
Background Accurate and reliable identification of sequence variants, including single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (INDELs), plays a fundamental role in next-generation sequencing (NGS) applications. Existing methods for calling these variants often make simplified assumptions of positional independence and fail to leverage the dependence between genotypes at nearby loci that is caused by linkage disequilibrium (LD). Results and conclusion We propose vi-HMM, a hidden Markov model (HMM)-based method for calling SNPs and INDELs in mapped short-read data. This method allows transitions between hidden states (defined as “SNP,” “Ins,” “Del,” and “Match”) of adjacent genomic bases and determines an optimal hidden state path by using the Viterbi algorithm. The inferred hidden state path provides a direct solution to the identification of SNPs and INDELs. Simulation studies show that, under various sequencing depths, vi-HMM outperforms commonly used variant calling methods in terms of sensitivity and F1 score. When applied to the real data, vi-HMM demonstrates higher accuracy in calling SNPs and INDELs. Electronic supplementary material The online version of this article (10.1186/s40246-019-0194-6) contains supplementary material, which is available to authorized users.
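The decoding step named in this abstract, finding an optimal hidden state path with the Viterbi algorithm, can be sketched in log space as below. This is a deliberately reduced illustration and not the vi-HMM implementation: it keeps only two of the four states ("Match" and "SNP", omitting "Ins" and "Del"), summarizes each genomic position as a hypothetical "ref" or "alt" symbol, and uses made-up probabilities.

```python
import math


def log_table(table):
    """Convert a nested dict of probabilities to log probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()}
            for k, row in table.items()}


def viterbi(states, log_start, log_trans, log_emit, observations):
    """Return the most probable hidden state path for the observations."""
    score = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    back = []  # one dict of back-pointers per observation after the first
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: score[r] + log_trans[r][s])
            new_score[s] = (score[best_prev] + log_trans[best_prev][s]
                            + log_emit[s][obs])
            pointers[s] = best_prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical two-state toy model; all numbers are for illustration only.
states = ["Match", "SNP"]
log_start = {"Match": math.log(0.99), "SNP": math.log(0.01)}
log_trans = log_table({"Match": {"Match": 0.99, "SNP": 0.01},
                       "SNP": {"Match": 0.99, "SNP": 0.01}})
log_emit = log_table({"Match": {"ref": 0.999, "alt": 0.001},
                      "SNP": {"ref": 0.05, "alt": 0.95}})

path = viterbi(states, log_start, log_trans, log_emit,
               ["ref", "ref", "alt", "ref"])
print(path)
```

Working with log probabilities keeps the scores from underflowing on long sequences, which matters when the path spans many genomic positions.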
Affiliation(s)
- Man Tang
- Department of Statistics, Virginia Tech, 250 Drillfield Drive, Blacksburg, 24061, VA, USA
- Mohammad Shabbir Hasan
- Department of Computer Science, Virginia Tech, 225 Stanger Street, Blacksburg, 24060, VA, USA
- Hongxiao Zhu
- Department of Statistics, Virginia Tech, 250 Drillfield Drive, Blacksburg, 24061, VA, USA
- Liqing Zhang
- Department of Computer Science, Virginia Tech, 225 Stanger Street, Blacksburg, 24060, VA, USA
- Xiaowei Wu
- Department of Statistics, Virginia Tech, 250 Drillfield Drive, Blacksburg, 24061, VA, USA.
|
49
|
Elzobi M, Al-Hamadi A. Generative vs. Discriminative Recognition Models for Off-Line Arabic Handwriting. Sensors (Basel) 2018; 18:s18092786. [PMID: 30149549 PMCID: PMC6164492 DOI: 10.3390/s18092786] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/27/2018] [Revised: 07/13/2018] [Accepted: 07/17/2018] [Indexed: 11/16/2022]
Abstract
The majority of handwritten word recognition strategies are constructed on learning-based generative frameworks from letter or word training samples. Theoretically, constructing recognition models through discriminative learning should be the more effective alternative. The primary goal of this research is to compare the performances of discriminative and generative recognition strategies, which are described by generatively-trained hidden Markov modeling (HMM), discriminatively-trained conditional random fields (CRF) and discriminatively-trained hidden-state CRF (HCRF). With learning samples obtained from two dissimilar databases, we initially trained and applied an HMM classification scheme. To enable HMM classifiers to effectively reject incorrect and out-of-vocabulary segmentation, we enhance the models with adaptive threshold schemes. Aside from proposing such schemes for HMM classifiers, this research introduces CRF and HCRF classifiers in the recognition of offline Arabic handwritten words. Furthermore, the efficiencies of all three strategies are fully assessed using two dissimilar databases. Recognition outcomes for both words and letters are presented, with the pros and cons of each strategy emphasized.
Affiliation(s)
- Moftah Elzobi
- Institute for Information Technology and Communications (IIKT), Otto von Guericke University, 39106 Magdeburg, Germany.
- Ayoub Al-Hamadi
- Institute for Information Technology and Communications (IIKT), Otto von Guericke University, 39106 Magdeburg, Germany.
|
50
|
Szalkai B, Grolmusz V. MetaHMM: A webserver for identifying novel genes with specified functions in metagenomic samples. Genomics 2019; 111:883-5. [PMID: 29802977 DOI: 10.1016/j.ygeno.2018.05.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 02/06/2018] [Revised: 05/17/2018] [Accepted: 05/18/2018] [Indexed: 11/20/2022]
Abstract
The fast and affordable sequencing of large clinical and environmental metagenomic datasets opens up new horizons in medical and biotechnological applications. It is believed that only about 1% of the microorganisms on Earth have been described so far; therefore, metagenomic analysis mostly deals with unknown species in the samples. Microbial communities in extreme environments may contain genes with high biotechnological potential, and clinical metagenomes related to diseases may uncover still unknown pathogens and pathological mechanisms in known diseases. While species-level identification and description of the taxa in the samples do not seem to be possible today, we can search for novel genes with known functions in these samples using numerous techniques, including artificial intelligence tools such as hidden Markov models (HMMs). Here we describe a simple-to-use webserver, MetaHMM, which is capable of homology-based automatic model-building for the genes to be searched for and also finds the closest matches in the metagenome. The webserver uses already highly successful building blocks: it performs multiple alignments by applying Clustal Omega, builds a hidden Markov model with the HMMER component hmmbuild, and uses hmmsearch to find sequences similar to the specified model in the metagenomes. The webserver is publicly available at https://metahmm.pitgroup.org.
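The three building blocks named in this abstract (Clustal Omega, hmmbuild, hmmsearch) can be chained locally in a similar spirit. The sketch below only assembles the command lines and runs them on request; it is not the MetaHMM webserver's own code. The function names, file names, and the choice of Stockholm as the intermediate alignment format are assumptions for illustration, and the sequence database is assumed to use the same alphabet (for example, protein) as the seed sequences.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(seed_fasta, sequence_db, workdir="."):
    """Assemble align -> build model -> search commands (hypothetical file names)."""
    work = Path(workdir)
    alignment = work / "seed.sto"   # multiple alignment in Stockholm format
    model = work / "seed.hmm"       # profile HMM written by hmmbuild
    hits = work / "hits.tbl"        # tabular per-sequence hits from hmmsearch
    return [
        # Clustal Omega: align the seed sequences.
        ["clustalo", "-i", str(seed_fasta), "-o", str(alignment),
         "--outfmt=st", "--force"],
        # HMMER: build a profile HMM from the alignment.
        ["hmmbuild", str(model), str(alignment)],
        # HMMER: search the sequence database with the profile.
        ["hmmsearch", "--tblout", str(hits), str(model), str(sequence_db)],
    ]


def run_pipeline(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        if shutil.which(argv[0]) is None:
            raise FileNotFoundError(f"{argv[0]} is not installed or not on PATH")
        subprocess.run(argv, check=True)


commands = build_commands("seed_genes.fasta", "metagenome_proteins.fasta")
for argv in commands:
    print(" ".join(argv))
```

Calling `run_pipeline(commands)` would execute the steps, provided both tools are installed and the input files exist.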
|