1
Gu Y, Li J, Kang H, Zhang B, Zheng S. Employing Molecular Conformations for Ligand-Based Virtual Screening with Equivariant Graph Neural Network and Deep Multiple Instance Learning. Molecules 2023; 28:5982. [PMID: 37630234] [PMCID: PMC10459669] [DOI: 10.3390/molecules28165982] [Received: 06/04/2023; Revised: 07/27/2023; Accepted: 08/03/2023]
Abstract
Ligand-based virtual screening (LBVS) is a promising approach for rapid and low-cost screening of potentially bioactive molecules in the early stage of drug discovery. Compared with traditional similarity-based machine learning methods, deep learning frameworks for LBVS can more effectively extract high-order structure representations from molecular fingerprints or structures. However, the 3D conformation of a molecule strongly influences its bioactivity and physical properties, yet it has rarely been considered in previous deep learning-based LBVS methods. Moreover, a suitable bioactivity benchmark dataset is still lacking. To address these issues, we introduce a novel end-to-end deep learning architecture trained on molecular conformers for LBVS. We first extracted molecular conformers from multiple public molecular bioactivity datasets and consolidated them into a large-scale bioactivity benchmark dataset comprising millions of endpoints and molecules corresponding to 954 targets. We then devised a deep learning-based LBVS method, EquiVS, that learns molecule representations from conformers for bioactivity prediction. Specifically, a graph convolutional network (GCN) and an equivariant graph neural network (EGNN) are sequentially stacked to learn high-order molecule-level and conformer-level representations, followed by attention-based deep multiple-instance learning (MIL) that aggregates these representations to predict the potential bioactivity of a query molecule against a given target. We conducted various experiments to validate the data quality of our benchmark dataset and confirmed that EquiVS outperforms 10 traditional machine learning or deep learning-based LBVS methods. Further ablation studies demonstrate the significant contribution of molecular conformation to bioactivity prediction, as well as the soundness and non-redundancy of the deep learning architecture in EquiVS. Finally, a model interpretation case study on CDK2 shows the potential of EquiVS for optimal conformer discovery. Overall, the proposed benchmark dataset and EquiVS method have promising prospects in virtual screening applications.
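The attention-based MIL aggregation step described in this abstract can be illustrated with a minimal sketch. This is not the authors' EquiVS code: the 2-D embeddings and the single attention vector `w` below are hypothetical stand-ins for the learned GCN/EGNN conformer representations and the learned attention parameters.

```python
import math

def softmax(xs):
    # numerically stable softmax
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def mil_attention_pool(conformer_embeddings, w):
    """Aggregate per-conformer embeddings into one bag-level molecule embedding.

    Each conformer i gets an attention score w . h_i; scores are softmax-
    normalized and used as weights in a convex combination of the embeddings.
    (Gated variants additionally pass h_i through tanh/sigmoid projections.)
    """
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in conformer_embeddings]
    alphas = softmax(scores)
    dim = len(conformer_embeddings[0])
    bag = [sum(a * h[d] for a, h in zip(alphas, conformer_embeddings))
           for d in range(dim)]
    return bag, alphas

# two hypothetical 2-D conformer embeddings; w strongly favors the first
bag, alphas = mil_attention_pool([[1.0, 0.0], [0.0, 1.0]], w=[10.0, 0.0])
```

A downstream prediction head on `bag` would then produce the bioactivity estimate for the query molecule on a given target.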
Affiliation(s)
- Yaowen Gu
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
- Department of Chemistry, New York University, New York, NY 10027, USA
- Jiao Li
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
- Hongyu Kang
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
- Department of Biomedical Engineering, School of Life Science, Beijing Institute of Technology, Beijing 100081, China
- Bowen Zhang
- Beijing StoneWise Technology Co., Ltd., Beijing 100080, China
- Si Zheng
- Institute of Medical Information (IMI), Chinese Academy of Medical Sciences and Peking Union Medical College (CAMS & PUMC), Beijing 100020, China
- Institute for Artificial Intelligence, Department of Computer Science and Technology, BNRist, Tsinghua University, Beijing 100084, China
2
Atzeni D, Ramjattan R, Figliè R, Baldi G, Mazzei D. Data-Driven Insights through Industrial Retrofitting: An Anonymized Dataset with Machine Learning Use Cases. Sensors (Basel) 2023; 23:6078. [PMID: 37447927] [DOI: 10.3390/s23136078] [Received: 05/26/2023; Revised: 06/22/2023; Accepted: 06/26/2023]
Abstract
Small and medium-sized enterprises (SMEs) often encounter practical challenges and limitations when extracting valuable insights from the data of retrofitted or brownfield equipment. The existing literature fails to reflect the full reality and potential of data-driven analysis in current SME environments. In this paper, we provide an anonymized dataset obtained from two medium-sized companies using a non-invasive and scalable data-collection procedure. The dataset comprises mainly machine power-consumption data, collected from the two companies over periods of 7 months and 1 year, respectively. Using this dataset, we demonstrate how machine learning (ML) techniques can enable SMEs to extract useful information even in the short term and from a small variety of data types. We develop several ML models to address various tasks, such as power consumption forecasting, item classification, next-machine-state prediction, and item production count forecasting. By providing this anonymized dataset and showcasing its application through various ML use cases, our paper aims to offer practical insights for SMEs seeking to leverage ML techniques with their limited data resources. The findings contribute to a better understanding of how ML can be effectively utilized to extract actionable insights from limited datasets, offering valuable implications for SMEs in practical settings.
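A minimal walk-forward moving-average baseline, of the kind such power-consumption forecasting work is typically benchmarked against, can be sketched as follows. This is illustrative only; the paper's models and data columns are not reproduced here, and the toy series is hypothetical.

```python
def walk_forward_moving_average(series, window):
    """Forecast each point from the mean of the preceding `window` observations."""
    preds = []
    for t in range(window, len(series)):
        preds.append(sum(series[t - window:t]) / window)
    return preds

def mae(actual, predicted):
    """Mean absolute error between aligned actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(predicted)

power = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # toy hourly kW readings
preds = walk_forward_moving_average(power, window=2)
error = mae(power[2:], preds)
```

Any learned model should at least beat such a naive baseline on held-out data before being deployed.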
Affiliation(s)
- Daniele Atzeni
- Department of Computer Science, University of Pisa, 56126 Pisa, Italy
- Reshawn Ramjattan
- Department of Computer Science, University of Pisa, 56126 Pisa, Italy
- Roberto Figliè
- Department of Computer Science, University of Pisa, 56126 Pisa, Italy
- Daniele Mazzei
- Department of Computer Science, University of Pisa, 56126 Pisa, Italy
- Zerynth, 56124 Pisa, Italy
3
Papadimos T, Andreadis S, Gialampoukidis I, Vrochidis S, Kompatsiaris I. Flood-Related Multimedia Benchmark Evaluation: Challenges, Results and a Novel GNN Approach. Sensors (Basel) 2023; 23:3767. [PMID: 37050827] [PMCID: PMC10098572] [DOI: 10.3390/s23073767] [Received: 03/07/2023; Revised: 03/30/2023; Accepted: 04/03/2023]
Abstract
This paper discusses the importance of detecting breaking events in real time to support emergency response workers, and how social media can be used to process large amounts of data quickly. Most event detection techniques have focused on either images or text, but combining the two can improve performance. The authors present lessons learned from the Flood-related Multimedia Task in MediaEval 2020, provide a dataset for reproducibility, and propose a new multimodal fusion method that uses Graph Neural Networks to combine image, text, and time information. Their method outperforms state-of-the-art approaches and can handle settings with few labelled samples.
4
Das A, Das Choudhury S, Das AK, Samal A, Awada T. EmergeNet: A novel deep-learning based ensemble segmentation model for emergence timing detection of coleoptile. Front Plant Sci 2023; 14:1084778. [PMID: 36818836] [PMCID: PMC9936151] [DOI: 10.3389/fpls.2023.1084778] [Received: 10/30/2022; Accepted: 01/11/2023]
Abstract
The emergence timing of a plant, i.e., the time at which the plant first becomes visible above the soil surface, is an important phenotypic event and an indicator of the successful establishment and growth of a plant. This paper introduces a novel deep-learning-based model called EmergeNet, with a customized loss function that adapts to plant growth, for detecting the emergence timing of the coleoptile (a rigid plant tissue that encloses the first leaves of a seedling). EmergeNet can also track coleoptile growth in a time-lapse sequence of images with cluttered backgrounds and extreme variations in illumination. It is a novel ensemble segmentation model that integrates three different but promising networks, namely SEResNet, InceptionV3, and VGG19, in the encoder part of its base model, the UNet. EmergeNet can correctly detect the coleoptile at its first emergence, when it is tiny and therefore barely visible on the soil surface. Its performance is evaluated on a benchmark dataset called the University of Nebraska-Lincoln Maize Emergence Dataset (UNL-MED), which contains top-view time-lapse images of maize coleoptiles starting before emergence and continuing until they are about one inch tall. EmergeNet detects the emergence timing with 100% accuracy compared with human-annotated ground truth. Furthermore, it significantly outperforms UNet by generating very high-quality segmented masks of the coleoptiles in both natural-light and dark environmental conditions.
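EmergeNet fuses SEResNet, InceptionV3, and VGG19 encoders inside a UNet; at a much coarser level, the general idea of combining several segmentation models can be sketched as soft voting over per-pixel foreground probabilities. This is a hypothetical illustration of ensembling, not the paper's architecture, and the probability maps are made up.

```python
def ensemble_segment(prob_maps, threshold=0.5):
    """Average per-pixel foreground probabilities from several models, then binarize."""
    n = len(prob_maps)
    rows, cols = len(prob_maps[0]), len(prob_maps[0][0])
    return [[1 if sum(pm[i][j] for pm in prob_maps) / n >= threshold else 0
             for j in range(cols)] for i in range(rows)]

# three hypothetical 2x2 probability maps for "coleoptile" pixels
a = [[0.9, 0.2], [0.4, 0.8]]
b = [[0.8, 0.1], [0.6, 0.9]]
c = [[0.7, 0.3], [0.2, 0.7]]
mask = ensemble_segment([a, b, c])
```

Averaging tends to suppress pixels on which the individual models disagree, which is one reason ensembles yield cleaner masks than any single model.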
Affiliation(s)
- Aankit Das
- Institute of Radio Physics and Electronics, University of Calcutta, Kolkata, West Bengal, India
- Sruti Das Choudhury
- School of Computing, University of Nebraska-Lincoln, Lincoln, NE, United States
- School of Natural Resources, University of Nebraska-Lincoln, Lincoln, NE, United States
- Amit Kumar Das
- Department of Computer Science and Engineering, Institute of Engineering and Management, Kolkata, West Bengal, India
- Ashok Samal
- Institute of Radio Physics and Electronics, University of Calcutta, Kolkata, West Bengal, India
- School of Computing, University of Nebraska-Lincoln, Lincoln, NE, United States
- Tala Awada
- School of Natural Resources, University of Nebraska-Lincoln, Lincoln, NE, United States
- Agricultural Research Division, University of Nebraska-Lincoln, Lincoln, NE, United States
5
Chen X, Wang Z, Liu J, Gong C, Pang Y. A Neural Network-Based Mesh Quality Indicator for Three-Dimensional Cylinder Modelling. Entropy (Basel) 2022; 24:1245. [PMID: 36141132] [PMCID: PMC9497966] [DOI: 10.3390/e24091245] [Received: 07/31/2022; Revised: 08/26/2022; Accepted: 09/02/2022]
Abstract
Evaluating mesh quality prior to performing a computational fluid dynamics (CFD) simulation is an essential step in ensuring acceptable accuracy in cylinder modelling. However, traditional mesh quality indicators are often insufficient, since they only check geometric information on individual distorted elements. To yield more accurate results, the current evaluation process usually requires careful manual re-evaluation of quality properties such as mesh distribution and local refinement, which substantially increases the meshing overhead. In this paper, we introduce an efficient quality indicator for cylinder meshes of various sizes, consisting of a mesh pre-processing method and a neural network-based indicator, Mesh-Net. We also publish a cylinder mesh benchmark dataset. The proposed indicator is trained to study the effect of CFD meshes on the accuracy of numerical simulations. It considers both element geometry (e.g., orthogonality) and quality properties (e.g., smoothness and distribution). The well-trained indicator is then used as a black box to predict the overall quality of an input mesh automatically. Experimental results demonstrate that the proposed indicator is accurate and can be applied in the mesh quality evaluation process without manual interaction.
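For contrast with the learned indicator, a classical per-element geometric quality measure of the kind traditional indicators rely on can be sketched for a 2D triangle. The measure below is the standard normalized shape quality (equilateral elements score 1, degenerate slivers score near 0); it is illustrative background, not part of Mesh-Net.

```python
import math

def triangle_quality(p1, p2, p3):
    """Normalized shape quality: 4*sqrt(3)*area / (sum of squared edge lengths)."""
    def d2(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    edge_sum = d2(p1, p2) + d2(p2, p3) + d2(p3, p1)
    # absolute triangle area via the cross product of two edge vectors
    area = abs((p2[0] - p1[0]) * (p3[1] - p1[1])
               - (p3[0] - p1[0]) * (p2[1] - p1[1])) / 2.0
    return 4.0 * math.sqrt(3.0) * area / edge_sum

q_equilateral = triangle_quality((0, 0), (1, 0), (0.5, math.sqrt(3) / 2))
q_sliver = triangle_quality((0, 0), (1, 0), (0.5, 0.01))
```

Such per-element scores capture distortion but not mesh-level properties like distribution or refinement, which is exactly the gap the learned indicator targets.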
Affiliation(s)
- Xinhai Chen
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China
- Laboratory of Software Engineering for Complex System, National University of Defense Technology, Changsha 410073, China
- Zhichao Wang
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China
- Laboratory of Software Engineering for Complex System, National University of Defense Technology, Changsha 410073, China
- Jie Liu
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China
- Laboratory of Software Engineering for Complex System, National University of Defense Technology, Changsha 410073, China
- Chunye Gong
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China
- Laboratory of Software Engineering for Complex System, National University of Defense Technology, Changsha 410073, China
- Yufei Pang
- China Aerodynamics Research and Development Center, Mianyang 621000, China
6
Otgonbold ME, Gochoo M, Alnajjar F, Ali L, Tan TH, Hsieh JW, Chen PY. SHEL5K: An Extended Dataset and Benchmarking for Safety Helmet Detection. Sensors (Basel) 2022; 22:2315. [PMID: 35336491] [PMCID: PMC8950768] [DOI: 10.3390/s22062315] [Received: 01/06/2022; Revised: 02/23/2022; Accepted: 03/08/2022]
Abstract
Wearing a safety helmet is important in construction and manufacturing activities to prevent head injuries. Safety compliance can be ensured by developing an automatic helmet detection system using computer vision and deep learning approaches. Developing a deep-learning-based helmet detection model usually requires an enormous amount of training data. However, very few public safety helmet datasets are available in the literature; most are not fully labeled, and the labeled ones contain few classes. This paper presents SHEL5K (Safety HELmet dataset with 5K images), an enhanced version of the SHD dataset. The proposed dataset consists of six completely labeled classes (helmet, head, head with helmet, person with helmet, person without helmet, and face). It was tested on multiple state-of-the-art object detection models, i.e., YOLOv3 (YOLOv3, YOLOv3-tiny, and YOLOv3-SPP), YOLOv4 (YOLOv4 and YOLOv4pacsp-x-mish), YOLOv5-P5 (YOLOv5s, YOLOv5m, and YOLOv5x), the Faster Region-based Convolutional Neural Network (Faster-RCNN) with the Inception V2 architecture, and YOLOR. The experimental results from the various models on the proposed dataset were compared and showed an improvement in mean Average Precision (mAP). The SHEL5K dataset has an advantage over other safety helmet datasets, as it contains fewer images with better labels and more classes, making helmet detection more accurate.
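Detectors on such a benchmark are compared by mean Average Precision (mAP), which is built on the intersection-over-union (IoU) between predicted and ground-truth boxes. A minimal IoU implementation (boxes as (x1, y1, x2, y2); the example boxes are hypothetical):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# a predicted helmet box partially overlapping a ground-truth box
score = iou((0, 0, 2, 2), (1, 0, 3, 2))
```

A prediction typically counts as a true positive when its IoU with a ground-truth box of the same class exceeds a threshold (commonly 0.5), and mAP averages the resulting precision over recall levels and classes.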
Affiliation(s)
- Munkh-Erdene Otgonbold
- Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, Al Ain 15551, United Arab Emirates
- Munkhjargal Gochoo
- Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, Al Ain 15551, United Arab Emirates
- Correspondence:
- Fady Alnajjar
- Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, Al Ain 15551, United Arab Emirates
- RIKEN Center for Brain Science (CBS), Wako 463-0003, Japan
- Luqman Ali
- Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, Al Ain 15551, United Arab Emirates
- Tan-Hsu Tan
- Department of Electrical Engineering, National Taipei University of Technology, Taipei 10608, Taiwan
- Jun-Wei Hsieh
- College of Artificial Intelligence and Green Energy, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan
- Ping-Yang Chen
- Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan
7
Zhang H, Zhao M, Wei C, Mantini D, Li Z, Liu Q. EEGdenoiseNet: a benchmark dataset for deep learning solutions of EEG denoising. J Neural Eng 2021; 18. [PMID: 34596046] [DOI: 10.1088/1741-2552/ac2bf8] [Received: 04/03/2021; Accepted: 09/29/2021]
Abstract
Objective. Deep learning (DL) networks are increasingly attracting attention across various fields, including electroencephalography (EEG) signal processing, where they provide performance comparable to that of traditional techniques. At present, however, the lack of well-structured, standardized datasets with specific benchmarks limits the development of DL solutions for EEG denoising. Approach. Here, we present EEGdenoiseNet, a benchmark EEG dataset suited for training and testing DL-based denoising models, as well as for performance comparisons across models. EEGdenoiseNet contains 4514 clean EEG segments, 3400 ocular artifact segments, and 5598 muscular artifact segments, allowing users to synthesize contaminated EEG segments with ground-truth clean EEG. Main results. We used EEGdenoiseNet to evaluate the denoising performance of four classical networks (a fully connected network, a simple and a complex convolutional network, and a recurrent neural network). Our results suggest that DL methods have great potential for EEG denoising even under high noise contamination. Significance. Through EEGdenoiseNet, we hope to accelerate the development of the emerging field of DL-based EEG denoising. The dataset and code are available at https://github.com/ncclabsustech/EEGdenoiseNet.
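The "synthesize contaminated EEG" step can be sketched as additive mixing of a clean segment with a scaled artifact segment at a target signal-to-noise ratio. This sketch uses the common amplitude convention SNR_dB = 20·log10(RMS_signal / RMS_noise); the dataset's own scripts may define SNR differently, and the toy signals below are hypothetical.

```python
import math

def rms(x):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def mix_at_snr(clean, artifact, snr_db):
    """Return clean + lam * artifact, with lam chosen to hit the target SNR in dB."""
    lam = rms(clean) / (rms(artifact) * 10 ** (snr_db / 20.0))
    return [c + lam * a for c, a in zip(clean, artifact)], lam

clean = [1.0, -1.0] * 128      # toy 'EEG' segment with RMS 1
artifact = [2.0, -2.0] * 128   # toy 'ocular artifact' segment with RMS 2
noisy, lam = mix_at_snr(clean, artifact, snr_db=0.0)
```

Because the clean segment is kept as ground truth, a denoising network can be trained to map `noisy` back to `clean` and scored against it directly.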
Affiliation(s)
- Haoming Zhang
- Shenzhen Key Laboratory of Smart Healthcare Engineering, Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- Mingqi Zhao
- Shenzhen Key Laboratory of Smart Healthcare Engineering, Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- Movement Control and Neuroplasticity Research Group, KU Leuven, Leuven 3001, Belgium
- Chen Wei
- Shenzhen Key Laboratory of Smart Healthcare Engineering, Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- Dante Mantini
- Movement Control and Neuroplasticity Research Group, KU Leuven, Leuven 3001, Belgium
- Brain Imaging and Neural Dynamics Research Group, IRCCS San Camillo Hospital, Venice 30126, Italy
- Zherui Li
- Shenzhen Key Laboratory of Smart Healthcare Engineering, Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- Quanying Liu
- Shenzhen Key Laboratory of Smart Healthcare Engineering, Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
8
Yuan H, Hoogenkamp T, Veltkamp RC. RobotP: A Benchmark Dataset for 6D Object Pose Estimation. Sensors (Basel) 2021; 21:1299. [PMID: 33670325] [DOI: 10.3390/s21041299] [Received: 12/20/2020; Revised: 02/03/2021; Accepted: 02/07/2021]
Abstract
Deep learning has achieved great success in robotic vision tasks. However, compared with other vision-based tasks, it is difficult to collect a representative and sufficiently large training set for six-dimensional (6D) object pose estimation, due to the inherent difficulty of data collection. In this paper, we propose the RobotP dataset, consisting of commonly used objects, for benchmarking 6D object pose estimation. To create the dataset, we apply a 3D reconstruction pipeline to produce high-quality depth images, ground-truth poses, and 3D models for well-selected objects. Based on the generated data, we then produce object segmentation masks and two-dimensional (2D) bounding boxes automatically. To further enrich the data, we synthesize a large number of photo-realistic color-and-depth image pairs with ground-truth 6D poses. Our dataset is freely distributed to research groups through the Shape Retrieval Challenge benchmark on 6D pose estimation. Based on our benchmark, different learning-based approaches are trained and tested on the unified dataset. The evaluation results indicate that there is considerable room for improvement in 6D object pose estimation, particularly for objects with dark colors, and that photo-realistic images help increase the performance of pose estimation algorithms.
9
Das Choudhury S, Maturu S, Samal A, Stoerger V, Awada T. Leveraging Image Analysis to Compute 3D Plant Phenotypes Based on Voxel-Grid Plant Reconstruction. Front Plant Sci 2020; 11:521431. [PMID: 33362806] [PMCID: PMC7755976] [DOI: 10.3389/fpls.2020.521431] [Received: 12/18/2019; Accepted: 11/17/2020]
Abstract
High-throughput image-based plant phenotyping facilitates the non-invasive extraction of morphological and biophysical traits from a large number of plants in a relatively short time. It facilitates the computation of advanced phenotypes by considering the plant as a single object (holistic phenotypes) or via its components, i.e., leaves and the stem (component phenotypes). The architectural complexity of plants increases over time due to variations in self-occlusion and phyllotaxy, i.e., the arrangement of leaves around the stem. A central challenge in computing phenotypes from 2-dimensional (2D) single-view images of plants, especially at the advanced vegetative stage in the presence of self-occluding leaves, is that the information captured in 2D images is incomplete, and hence the computed phenotypes are inaccurate. We introduce a novel algorithm to compute 3-dimensional (3D) plant phenotypes from multiview images using voxel-grid reconstruction of the plant (3DPhenoMV). The paper also presents a novel method to reliably detect and separate the individual leaves and the stem from the 3D voxel grid of the plant using a voxel overlapping consistency check and point cloud clustering techniques. To evaluate the performance of the proposed algorithm, we introduce the University of Nebraska-Lincoln 3D Plant Phenotyping Dataset (UNL-3DPPD). A generic taxonomy of 3D image-based plant phenotypes is also presented to promote 3D plant phenotyping research, and a subset of these phenotypes is computed using computer vision algorithms, with discussion of their significance in the context of plant science. The central contributions of the paper are (a) an algorithm for 3D voxel-grid reconstruction of maize plants at advanced vegetative stages using images from multiple 2D views; (b) a generic taxonomy of 3D image-based plant phenotypes and a public benchmark dataset, UNL-3DPPD, to promote the development of 3D image-based plant phenotyping research; and (c) novel voxel overlapping consistency check and point cloud clustering techniques to detect and isolate individual leaves and the stem of maize plants to compute component phenotypes. Detailed experimental analyses demonstrate the efficacy of the proposed method and show the potential of 3D phenotypes to explain the morphological characteristics of plants regulated by genetic and environmental interactions.
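Voxel-grid reconstruction from multiple views can be sketched as space carving: a candidate voxel survives only if it projects inside the plant silhouette in every view. The toy below substitutes two orthographic projections for real calibrated cameras, and all silhouettes are hypothetical; the paper's 3DPhenoMV pipeline is considerably more involved.

```python
from itertools import product

def carve(voxels, views):
    """Keep voxels whose projection lies inside the silhouette of every view.

    views: list of (projection_fn, silhouette_set) pairs.
    """
    return [v for v in voxels
            if all(proj(v) in sil for proj, sil in views)]

# a 2x2x2 candidate grid and two orthographic 'cameras'
grid = list(product(range(2), repeat=3))
top = (lambda v: (v[0], v[1]), {(0, 0), (1, 1)})                   # looking down z
side = (lambda v: (v[0], v[2]), {(0, 0), (0, 1), (1, 0), (1, 1)})  # looking down y
plant_voxels = carve(grid, [top, side])
```

The surviving voxel set is the visual hull of the object; component phenotypes are then computed by segmenting this voxel cloud into leaves and stem.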
Affiliation(s)
- Sruti Das Choudhury
- School of Natural Resources, University of Nebraska-Lincoln, Lincoln, NE, United States
- Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln, NE, United States
- Srikanth Maturu
- Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln, NE, United States
- Ashok Samal
- Department of Computer Science and Engineering, University of Nebraska-Lincoln, Lincoln, NE, United States
- Vincent Stoerger
- Agricultural Research Division, University of Nebraska-Lincoln, Lincoln, NE, United States
- Tala Awada
- School of Natural Resources, University of Nebraska-Lincoln, Lincoln, NE, United States
- Agricultural Research Division, University of Nebraska-Lincoln, Lincoln, NE, United States
10
Simões M, Borra D, Santamaría-Vázquez E, Bittencourt-Villalpando M, Krzemiński D, Miladinović A, Schmid T, Zhao H, Amaral C, Direito B, Henriques J, Carvalho P, Castelo-Branco M. BCIAUT-P300: A Multi-Session and Multi-Subject Benchmark Dataset on Autism for P300-Based Brain-Computer-Interfaces. Front Neurosci 2020; 14:568104. [PMID: 33100959] [PMCID: PMC7556208] [DOI: 10.3389/fnins.2020.568104] [Received: 05/31/2020; Accepted: 08/24/2020]
Abstract
There is a lack of multi-session P300 datasets for brain-computer interfaces (BCI). Publicly available datasets are usually limited to a small number of participants with few BCI sessions. The lack of large, comprehensive datasets with many individuals and multiple sessions has limited advances in the development of more effective data processing and analysis methods for BCI systems. This is particularly evident when exploring the feasibility of deep learning methods, which require large datasets. Here we present the BCIAUT-P300 dataset, containing recordings from 15 individuals with autism spectrum disorder undergoing 7 sessions of P300-based BCI joint-attention training, for a total of 105 sessions. The dataset was used for the 2019 IFMBE Scientific Challenge organized during MEDICON 2019, where, in two phases, teams from all over the world tried to achieve the best possible object-detection accuracy based on the P300 signals. This paper presents the characteristics of the dataset and the approaches followed by the 9 finalist teams during the competition. The winner obtained an average accuracy of 92.3% with a convolutional neural network based on EEGNet. The dataset is now publicly released and stands as a benchmark for future P300-based BCI algorithms based on multi-session data.
Affiliation(s)
- Marco Simões
- Coimbra Institute for Biomedical Imaging and Translational Research (CIBIT), Institute of Nuclear Sciences Applied to Health (ICNAS), University of Coimbra, Coimbra, Portugal
- Centre for Informatics and Systems (CISUC), Department of Informatics Engineering, University of Coimbra, Coimbra, Portugal
- Davide Borra
- Department of Electrical, Electronic and Information Engineering "Guglielmo Marconi" (DEI), University of Bologna, Cesena, Italy
- Eduardo Santamaría-Vázquez
- Grupo de Ingeniería Biomédica, Universidad de Valladolid, Valladolid, Spain
- Centro de Investigación Biomédica en Red, Biomateriales y Nanomedicina, Madrid, Spain
- Dominik Krzemiński
- CUBRIC, School of Psychology, Cardiff University, Cardiff, United Kingdom
- Thomas Schmid
- Machine Learning Group, Universität Leipzig, Leipzig, Germany
- Haifeng Zhao
- The University of Sydney, Camperdown, NSW, Australia
- Carlos Amaral
- Coimbra Institute for Biomedical Imaging and Translational Research (CIBIT), Institute of Nuclear Sciences Applied to Health (ICNAS), University of Coimbra, Coimbra, Portugal
- Bruno Direito
- Coimbra Institute for Biomedical Imaging and Translational Research (CIBIT), Institute of Nuclear Sciences Applied to Health (ICNAS), University of Coimbra, Coimbra, Portugal
- Jorge Henriques
- Centre for Informatics and Systems (CISUC), Department of Informatics Engineering, University of Coimbra, Coimbra, Portugal
- Paulo Carvalho
- Centre for Informatics and Systems (CISUC), Department of Informatics Engineering, University of Coimbra, Coimbra, Portugal
- Miguel Castelo-Branco
- Coimbra Institute for Biomedical Imaging and Translational Research (CIBIT), Institute of Nuclear Sciences Applied to Health (ICNAS), University of Coimbra, Coimbra, Portugal
11
Guan ZX, Li SH, Zhang ZM, Zhang D, Yang H, Ding H. A Brief Survey for MicroRNA Precursor Identification Using Machine Learning Methods. Curr Genomics 2020; 21:11-25. [PMID: 32655294] [PMCID: PMC7324890] [DOI: 10.2174/1389202921666200214125102] [Received: 10/24/2019; Revised: 01/24/2020; Accepted: 01/30/2020]
Abstract
MicroRNAs, a group of short non-coding RNA molecules, regulate gene expression, and many diseases are associated with abnormal miRNA expression. Accurate identification of miRNA precursors (pre-miRNAs) is therefore necessary. Over the past 10 years, experimental methods, comparative genomics methods, and artificial intelligence methods have been used to identify pre-miRNAs. However, experimental and comparative genomics methods have disadvantages, such as being time-consuming; machine learning-based methods are a better choice. This review therefore summarizes current advances in pre-miRNA recognition based on computational methods, including the construction of benchmark datasets, feature extraction methods, prediction algorithms, and the results of the models, and provides valid information about the predictors currently available. Finally, we give future perspectives on the identification of pre-miRNAs. The review provides scholars with the full background of pre-miRNA identification using machine learning methods, helping researchers gain a clear understanding of the progress of research in this field.
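A typical feature-extraction step in such predictors is the k-mer composition of the candidate hairpin sequence. The sketch below computes dinucleotide frequencies over an RNA alphabet; it is a generic illustration, not any specific predictor's code, and the example sequence is hypothetical.

```python
from itertools import product

def kmer_frequencies(seq, k=2, alphabet="ACGU"):
    """Frequency vector of all k-mers over `alphabet`, in lexicographic order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = {km: 0 for km in kmers}
    windows = max(len(seq) - k + 1, 1)
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] in counts:
            counts[seq[i:i + k]] += 1
    return [counts[km] / windows for km in kmers]

features = kmer_frequencies("ACAC")  # 16-dimensional dinucleotide vector
```

Vectors like this (often combined with secondary-structure features) are then fed to a classifier such as an SVM or random forest to separate real pre-miRNAs from pseudo-hairpins.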
Collapse
Affiliation(s)
- Zheng-Xing Guan, Shi-Hao Li, Zi-Mei Zhang, Dan Zhang, Hui Yang, Hui Ding: Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
12
Noor-Ul-Huda M, Tehsin S, Ahmed S, Niazi FAK, Murtaza Z. Retinal images benchmark for the detection of diabetic retinopathy and clinically significant macular edema (CSME). Biomed Tech (Berl) 2019; 64:297-307. [PMID: 30055096 DOI: 10.1515/bmt-2018-0098] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2018] [Accepted: 06/15/2018] [Indexed: 11/15/2022]
Abstract
Diabetes mellitus is a chronic disease associated with significant morbidity and mortality, largely through its numerous micro- and macrovascular complications. In developing countries, diabetic retinopathy (DR), which is caused by chronically high blood glucose levels and can result in vision loss or permanent blindness, is one of the major sources of vision impairment in the working-age population. DR is classified into two categories: proliferative diabetic retinopathy (PDR) and non-proliferative diabetic retinopathy (NPDR). NPDR is further classified into mild, moderate and severe, while PDR is further classified into early PDR, high-risk PDR and advanced diabetic eye disease. Advances in biomedical image processing have sped up automated disease diagnosis and analysis. Much research has been conducted and computerized systems have been designed to detect and analyze retinal diseases through image processing. Likewise, a number of algorithms have been designed to detect and grade DR by analyzing different symptoms, including microaneurysms, soft exudates, hard exudates, cotton wool spots, fibrotic bands, neovascularization on disc (NVD), neovascularization elsewhere (NVE), hemorrhages and tractional bands. Visual examination of the retina is a vital test for diagnosing DR-related complications; however, every computer-aided DR diagnostic system requires a standard dataset for the estimation of its efficiency, performance and accuracy. This research presents a benchmark for the evaluation of computer-based DR diagnostic systems. Existing DR benchmarks are small and do not cover all DR stages and categories. Our dataset contains 1445 high-quality fundus photographs of retinal images, acquired over 2 years from the records of patients who presented to the Department of Ophthalmology, Holy Family Hospital, Rawalpindi. This benchmark provides an evaluation platform for medical image analysis researchers and supplies evaluation data for all stages of DR.
Affiliation(s)
- Samabia Tehsin: Department of Computer Science, Bahria University, Islamabad, Pakistan
- Sairam Ahmed, Fuad A K Niazi, Zeerish Murtaza: Department of Ophthalmology, Rawalpindi Medical University (RMU), Rawalpindi, Pakistan
13
Zheng J, Li J, Li Y, Peng L. A Benchmark Dataset and Deep Learning-Based Image Reconstruction for Electrical Capacitance Tomography. Sensors (Basel) 2018; 18:E3701. [PMID: 30384432 PMCID: PMC6263896 DOI: 10.3390/s18113701] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/27/2018] [Revised: 10/26/2018] [Accepted: 10/29/2018] [Indexed: 11/16/2022]
Abstract
Electrical Capacitance Tomography (ECT) image reconstruction has been developed over decades and has made great achievements, but a new theoretical framework is still needed to make it better and faster. In recent years, machine learning theory has been introduced in the ECT area to solve the image reconstruction problem. However, there is still no public benchmark dataset in the ECT field for training and testing machine learning-based image reconstruction algorithms, even though such a dataset would provide a standard framework for evaluating and comparing the results of different image reconstruction methods. In this paper, a benchmark dataset for ECT image reconstruction is presented. Like ImageNet, whose contribution transformed machine learning research, this benchmark dataset is intended to help the community investigate new image reconstruction algorithms, since the relationship between permittivity distribution and capacitance can be better mapped. In addition, different machine learning-based image reconstruction algorithms can be trained and tested on the unified dataset and their results evaluated and compared under the same standard, making ECT image reconstruction research more open and enabling a breakthrough.
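The learning problem the abstract describes, mapping a capacitance measurement vector to a permittivity image, can be sketched in its simplest form as a regularized linear regression. This is a minimal illustration under synthetic data, not the paper's method or dataset; the sensor geometry (12 electrodes, 66 capacitance pairs) and image size are assumed placeholders.

```python
# Minimal sketch: learn a linear map from ECT capacitance vectors to
# permittivity images via ridge regression on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
n_caps, n_pixels, n_samples = 66, 32 * 32, 500   # 12 electrodes -> 66 pairs

# Synthetic "dataset": a hidden sensitivity matrix links image -> capacitance.
S = rng.normal(size=(n_caps, n_pixels))
images = rng.random(size=(n_samples, n_pixels))          # permittivity maps
caps = images @ S.T + 0.01 * rng.normal(size=(n_samples, n_caps))

# Ridge-regularized least squares: W maps capacitances back to pixel values.
lam = 1e-3
W = np.linalg.solve(caps.T @ caps + lam * np.eye(n_caps), caps.T @ images)

recon = caps @ W
rel_err = np.linalg.norm(recon - images) / np.linalg.norm(images)
print(f"relative reconstruction error: {rel_err:.3f}")
```

With only 66 measurements per 1024-pixel image, the problem is severely underdetermined, which is exactly why the deep learning approaches the paper targets, trained on a large benchmark dataset, are of interest.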
Affiliation(s)
- Jin Zheng, Jinku Li, Lihui Peng: Tsinghua National Laboratory for Information Science and Technology, Department of Automation, Tsinghua University, Beijing 100084, China
- Yi Li: Graduate School at Shenzhen, Tsinghua University, Shenzhen 518055, China
14
Hannink J, Ollenschläger M, Kluge F, Roth N, Klucken J, Eskofier BM. Benchmarking Foot Trajectory Estimation Methods for Mobile Gait Analysis. Sensors (Basel) 2017; 17:E1940. [PMID: 28832511 DOI: 10.3390/s17091940] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/19/2017] [Revised: 08/18/2017] [Accepted: 08/19/2017] [Indexed: 11/17/2022]
Abstract
Mobile gait analysis systems based on inertial sensing on the shoe are used in a wide range of applications. Especially in medical applications, they can give new insights into motor impairment in, e.g., neurodegenerative disease and help objectify patient assessment. One key component in these systems is the reconstruction of foot trajectories from inertial data. In the literature, various methods for this task have been proposed; however, performance is evaluated on a variety of datasets due to the lack of a large, generally accepted benchmark dataset, which hinders fair comparison of methods. In this work, we implement three orientation estimation and three double integration schemes for use in a foot trajectory estimation pipeline. All methods are drawn from the literature and evaluated against a marker-based motion capture reference. We provide a fair comparison on a single dataset consisting of 735 strides from 16 healthy subjects. As a result, the implemented methods are ranked and we identify the most suitable processing pipeline for foot trajectory estimation in the context of mobile gait analysis.
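The double-integration step in such a pipeline can be sketched as follows. This is an illustrative one-axis toy, not one of the paper's six evaluated schemes: it assumes the foot is flat (zero velocity) at both stride boundaries and removes the resulting linear velocity drift before the second integration.

```python
# Illustrative double integration of one stride's forward acceleration
# with a zero-velocity assumption at both stride ends (toy sketch).
import numpy as np

def stride_displacement(acc: np.ndarray, fs: float) -> np.ndarray:
    """Acceleration (m/s^2, one axis, one stride) -> displacement (m)."""
    dt = 1.0 / fs
    vel = np.cumsum(acc) * dt                    # first integration
    # zero-velocity update: subtract the linear drift that makes the
    # velocity non-zero at the end of the stride
    vel -= np.linspace(0.0, vel[-1], len(vel))
    return np.cumsum(vel) * dt                   # second integration

fs = 100.0                                       # assumed sampling rate, Hz
t = np.arange(0.0, 1.0, 1.0 / fs)
acc = np.sin(2 * np.pi * t)                      # toy forward acceleration
pos = stride_displacement(acc, fs)
print(f"stride length estimate: {pos[-1]:.3f} m")
```

Real pipelines additionally rotate the accelerometer readings into a global frame using an orientation estimate and subtract gravity, which is exactly the orientation-estimation stage this paper benchmarks alongside the integration schemes.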