1. Podobnik G, Strojan P, Peterlin P, Ibragimov B, Vrtovec T. HaN-Seg: The head and neck organ-at-risk CT and MR segmentation dataset. Med Phys 2023; 50:1917-1927. [PMID: 36594372] [DOI: 10.1002/mp.16197]
Abstract
PURPOSE: Radiotherapy (RT) is an important treatment modality for cancer in the head and neck (HaN). Segmentation of organs-at-risk (OARs) is the starting point of RT planning; however, existing approaches focus on either computed tomography (CT) or magnetic resonance (MR) images, while multimodal segmentation has not been thoroughly explored yet. We present a dataset of CT and MR images of the same patients with curated reference HaN OAR segmentations for an objective evaluation of segmentation methods.
ACQUISITION AND VALIDATION METHODS: The cohort consists of HaN images of 56 patients who underwent both CT and T1-weighted MR imaging for image-guided RT. For each patient, reference segmentations of up to 30 OARs were obtained by experts performing manual pixel-wise image annotation. The patients were randomly split into training Set 1 (42 cases, 75%) and test Set 2 (14 cases, 25%) while maintaining the distribution of patient age, gender, and annotation type. Baseline auto-segmentation results are also provided by training the publicly available nnU-Net deep learning architecture on Set 1 and evaluating its performance on Set 2.
DATA FORMAT AND USAGE NOTES: The data are publicly available through an open-access repository under the name HaN-Seg: The Head and Neck Organ-at-Risk CT & MR Segmentation Dataset. Images and reference segmentations are stored in the NRRD file format, where the OAR filenames follow the nomenclature recommended by the American Association of Physicists in Medicine, and OAR and demographic information is stored in separate comma-separated value files.
POTENTIAL APPLICATIONS: The HaN-Seg: The Head and Neck Organ-at-Risk CT & MR Segmentation Challenge is launched in parallel with the dataset release to promote the development of automated techniques for OAR segmentation in the HaN. Other potential applications include out-of-challenge algorithm development and benchmarking, as well as external validation of the developed algorithms.
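Since the images and segmentations ship as NRRD files with separate CSV metadata, a minimal Python loading sketch is given below; the pynrrd and pandas packages are assumed, and all file and column names are hypothetical placeholders rather than the dataset's documented layout.

```python
# Hedged sketch: load one HaN-Seg case (paths and filenames are assumptions).
import nrrd          # pip install pynrrd
import pandas as pd

case_dir = "HaN-Seg/set_1/case_01"                                   # hypothetical path
ct, ct_header = nrrd.read(f"{case_dir}/case_01_IMG_CT.nrrd")         # CT volume
mr, mr_header = nrrd.read(f"{case_dir}/case_01_IMG_MR_T1.nrrd")      # T1-weighted MR volume
oar, _ = nrrd.read(f"{case_dir}/case_01_OAR_Parotid_L.nrrd")         # one binary OAR mask

print("CT shape:", ct.shape)
print("CT voxel spacing (from NRRD header):", ct_header.get("space directions"))

# Demographics and OAR metadata are shipped as separate CSV files (filename assumed here).
demographics = pd.read_csv("HaN-Seg/patient_demographics.csv")
print(demographics.head())
```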
Affiliation(s)
- Gašper Podobnik
- Faculty of Electrical Engineering, University of Ljubljana, Ljubljana, Slovenia
- Bulat Ibragimov
- Faculty of Electrical Engineering, University of Ljubljana, Ljubljana, Slovenia
- Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
- Tomaž Vrtovec
- Faculty of Electrical Engineering, University of Ljubljana, Ljubljana, Slovenia
2. Kitzler F, Barta N, Neugschwandtner RW, Gronauer A, Motsch V. WE3DS: An RGB-D Image Dataset for Semantic Segmentation in Agriculture. Sensors (Basel) 2023; 23:2713. [PMID: 36904917] [PMCID: PMC10007111] [DOI: 10.3390/s23052713]
Abstract
Smart farming (SF) applications rely on robust and accurate computer vision systems. An important computer vision task in agriculture is semantic segmentation, which aims to classify each pixel of an image and can be used for selective weed removal. State-of-the-art implementations use convolutional neural networks (CNNs) trained on large image datasets. In agriculture, publicly available RGB image datasets are scarce and often lack detailed ground-truth information. Unlike agriculture, other research areas feature RGB-D datasets that combine color (RGB) with additional distance (D) information, and results from those areas show that including distance as an additional modality can further improve model performance. Therefore, we introduce WE3DS as the first RGB-D image dataset for multi-class plant species semantic segmentation in crop farming. It contains 2568 RGB-D images (color image and distance map) and corresponding hand-annotated ground-truth masks. Images were taken under natural light conditions using an RGB-D sensor consisting of two RGB cameras in a stereo setup. Further, we provide a benchmark for RGB-D semantic segmentation on the WE3DS dataset and compare it with a solely RGB-based model. Our trained models achieve up to 70.7% mean Intersection over Union (mIoU) for discriminating between soil, seven crop species, and ten weed species. Finally, our work confirms the finding that additional distance information improves segmentation quality.
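As a reference for how a reported mIoU can be computed from predicted and ground-truth label maps, here is a minimal NumPy sketch; the class count and array shapes are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np

def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int) -> float:
    """Mean Intersection over Union over all classes present in prediction or ground truth."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:          # class absent in both prediction and ground truth
            continue
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

# Toy example: 18 classes (soil + 7 crop species + 10 weed species), random label maps.
rng = np.random.default_rng(0)
pred = rng.integers(0, 18, size=(480, 640))
gt = rng.integers(0, 18, size=(480, 640))
print(f"mIoU: {mean_iou(pred, gt, num_classes=18):.3f}")
```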
Affiliation(s)
- Florian Kitzler
- Department of Sustainable Agricultural Systems, Institute of Agricultural Engineering, University of Natural Resources and Life Sciences Vienna, Peter-Jordan-Straße 82, 1190 Vienna, Austria
- Norbert Barta
- Department of Sustainable Agricultural Systems, Institute of Agricultural Engineering, University of Natural Resources and Life Sciences Vienna, Peter-Jordan-Straße 82, 1190 Vienna, Austria
- Reinhard W. Neugschwandtner
- Department of Crop Sciences, Institute of Agronomy, University of Natural Resources and Life Sciences Vienna, Konrad Lorenz-Straße 24, 3430 Tulln an der Donau, Austria
- Andreas Gronauer
- Department of Sustainable Agricultural Systems, Institute of Agricultural Engineering, University of Natural Resources and Life Sciences Vienna, Peter-Jordan-Straße 82, 1190 Vienna, Austria
- Viktoria Motsch
- Department of Sustainable Agricultural Systems, Institute of Agricultural Engineering, University of Natural Resources and Life Sciences Vienna, Peter-Jordan-Straße 82, 1190 Vienna, Austria
3. Shi L, Li X, Hu W, Chen H, Chen J, Fan Z, Gao M, Jing Y, Lu G, Ma D, Ma Z, Meng Q, Tang D, Sun H, Grzegorzek M, Qi S, Teng Y, Li C. EBHI-Seg: A novel enteroscope biopsy histopathological hematoxylin and eosin image dataset for image segmentation tasks. Front Med (Lausanne) 2023; 10:1114673. [PMID: 36760405] [PMCID: PMC9902656] [DOI: 10.3389/fmed.2023.1114673]
Abstract
Background and purpose: Colorectal cancer is a common fatal malignancy, the fourth most common cancer in men and the third most common cancer in women worldwide. Timely detection of cancer in its early stages is essential for treating the disease. Currently, there is a lack of datasets for histopathological image segmentation of colorectal cancer, which often hampers assessment accuracy when computer technology is used to aid diagnosis.
Methods: This study provides a new publicly available Enteroscope Biopsy Histopathological Hematoxylin and Eosin Image Dataset for Image Segmentation Tasks (EBHI-Seg). To demonstrate the validity and breadth of EBHI-Seg, experiments on EBHI-Seg are evaluated using both classical machine learning methods and deep learning methods.
Results: The experiments show that deep learning methods achieve better image segmentation performance on EBHI-Seg. The best Dice score obtained by the classical machine learning methods is 0.948, while the best deep learning method reaches 0.965.
Conclusion: This publicly available dataset contains 4,456 images covering six tumor differentiation stages and the corresponding ground-truth images. The dataset can help researchers develop new segmentation algorithms for the medical diagnosis of colorectal cancer, which can in turn be used in the clinical setting to help doctors and patients. EBHI-Seg is publicly available at: https://figshare.com/articles/dataset/EBHI-SEG/21540159/1.
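For readers who want to reproduce Dice scores on their own predictions against the EBHI-Seg ground truth, a minimal Dice-coefficient sketch for binary masks follows; this is a generic formulation, not the authors' exact evaluation code.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2*|A ∩ B| / (|A| + |B|) for binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))

# Toy example on two random 224x224 binary masks.
rng = np.random.default_rng(42)
pred = rng.random((224, 224)) > 0.5
gt = rng.random((224, 224)) > 0.5
print(f"Dice: {dice_coefficient(pred, gt):.3f}")
```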
Affiliation(s)
- Liyu Shi
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Xiaoyan Li
- Department of Pathology, Cancer Hospital of China Medical University, Liaoning Cancer Hospital and Institute, Shenyang, China
- Weiming Hu
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Haoyuan Chen
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Jing Chen
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Zizhen Fan
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Minghe Gao
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Yujie Jing
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Guotao Lu
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Deguo Ma
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Zhiyu Ma
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Qingtao Meng
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Dechao Tang
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Hongzan Sun
- Shengjing Hospital, China Medical University, Shenyang, China
- Marcin Grzegorzek
- Institute of Medical Informatics, University of Lübeck, Lübeck, Germany
- Department of Knowledge Engineering, University of Economics in Katowice, Katowice, Poland
- Shouliang Qi
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Yueyang Teng
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
- Chen Li
- Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China
4. Conrad R, Narayan K. Instance segmentation of mitochondria in electron microscopy images with a generalist deep learning model trained on a diverse dataset. Cell Syst 2023; 14:58-71.e5. [PMID: 36657391] [PMCID: PMC9883049] [DOI: 10.1016/j.cels.2022.12.006]
Abstract
Mitochondria are extremely pleomorphic organelles. Automatically annotating each one accurately and precisely in any 2D or volume electron microscopy (EM) image is an unsolved computational challenge. Current deep learning-based approaches train models on images that provide limited cellular contexts, precluding generality. To address this, we amassed a highly heterogeneous ∼1.5 × 10⁶-image 2D unlabeled cellular EM dataset and segmented ∼135,000 mitochondrial instances therein. MitoNet, a model trained on these resources, performs well on challenging benchmarks and on previously unseen volume EM datasets containing tens of thousands of mitochondria. We release a Python package and napari plugin, empanada, to rapidly run inference, visualize, and proofread instance segmentations. A record of this paper's transparent peer review process is included in the supplemental information.
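Inference and proofreading are handled by the released empanada package; purely as an illustration of what an instance segmentation is (and explicitly not the empanada API), the sketch below turns a hypothetical binary mitochondria mask into labeled instances with SciPy connected-component labeling.

```python
import numpy as np
from scipy import ndimage

# Hypothetical binary semantic mask (1 = mitochondrion pixel, 0 = background).
rng = np.random.default_rng(1)
semantic_mask = (rng.random((256, 256)) > 0.97).astype(np.uint8)
semantic_mask = ndimage.binary_dilation(semantic_mask, iterations=3)

# Connected-component labeling: each connected blob becomes one instance ID.
instance_labels, num_instances = ndimage.label(semantic_mask)
print(f"Found {num_instances} candidate instances")

# Per-instance size in pixels, e.g. for filtering out tiny spurious detections.
sizes = ndimage.sum_labels(semantic_mask, instance_labels, index=range(1, num_instances + 1))
print("Instance sizes (px):", sizes.astype(int))
```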
Affiliation(s)
- Ryan Conrad
- Center for Molecular Microscopy, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda 20892, Maryland, USA
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick 21702, Maryland, USA
- Kedar Narayan
- Center for Molecular Microscopy, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda 20892, Maryland, USA
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick 21702, Maryland, USA
5. Wang Z, Wu Y, Yang L, Thirunavukarasu A, Evison C, Zhao Y. Fast Personal Protective Equipment Detection for Real Construction Sites Using Deep Learning Approaches. Sensors (Basel) 2021; 21:3478. [PMID: 34067601] [PMCID: PMC8156681] [DOI: 10.3390/s21103478]
Abstract
Existing deep learning-based Personal Protective Equipment (PPE) detectors can only detect limited types of PPE, and their performance needs to be improved, particularly for deployment on real construction sites. This paper introduces an approach to train and evaluate eight deep learning detectors, for real application purposes, based on You Only Look Once (YOLO) architectures for six classes: helmets in four colours, person, and vest. Meanwhile, a dedicated high-quality dataset, CHV, consisting of 1330 images, is constructed by considering real construction site backgrounds, different gestures, varied angles and distances, and multiple PPE classes. The comparison among the eight models shows that YOLO v5x achieves the best mAP (86.55%) and YOLO v5s is the fastest (52 FPS) on GPU. The detection accuracy of the helmet classes on blurred faces decreases by 7%, while the person and vest classes are unaffected. The proposed detectors trained on the CHV dataset outperform other deep learning approaches on the same datasets. The novel multiclass CHV dataset is open for public use.
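Since mAP is built on per-box IoU matching between predictions and ground truth, a minimal IoU sketch for axis-aligned boxes is shown below; the (x1, y1, x2, y2) box format and the sample values are illustrative assumptions.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    xa1, ya1, xa2, ya2 = box_a
    xb1, yb1, xb2, yb2 = box_b
    inter_w = max(0.0, min(xa2, xb2) - max(xa1, xb1))
    inter_h = max(0.0, min(ya2, yb2) - max(ya1, yb1))
    inter = inter_w * inter_h
    area_a = (xa2 - xa1) * (ya2 - ya1)
    area_b = (xb2 - xb1) * (yb2 - yb1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Toy example: a predicted helmet box vs. its ground-truth box.
print(f"IoU: {box_iou((50, 40, 120, 110), (55, 45, 125, 115)):.3f}")
```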
Affiliation(s)
- Zijian Wang
- School of Civil Engineering, Central South University, Changsha 410075, China
- School of Aerospace, Transport and Manufacturing, Cranfield University, Bedfordshire MK43 0AL, UK
- Yimin Wu
- School of Civil Engineering, Central South University, Changsha 410075, China
- Lichao Yang
- School of Aerospace, Transport and Manufacturing, Cranfield University, Bedfordshire MK43 0AL, UK
- Colin Evison
- BAM Nuttall, St James House, Knoll Road, Camberley GU15 3XW, UK
- Yifan Zhao
- School of Aerospace, Transport and Manufacturing, Cranfield University, Bedfordshire MK43 0AL, UK
6. Conrad R, Narayan K. CEM500K, a large-scale heterogeneous unlabeled cellular electron microscopy image dataset for deep learning. eLife 2021; 10:e65894. [PMID: 33830015] [PMCID: PMC8032397] [DOI: 10.7554/elife.65894]
Abstract
Automated segmentation of cellular electron microscopy (EM) datasets remains a challenge. Supervised deep learning (DL) methods that rely on region-of-interest (ROI) annotations yield models that fail to generalize to unrelated datasets. Newer unsupervised DL algorithms require relevant pre-training images; however, pre-training on currently available EM datasets is computationally expensive and shows little value for unseen biological contexts, as these datasets are large and homogeneous. To address this issue, we present CEM500K, a nimble 25 GB dataset of 0.5 × 10⁶ unique 2D cellular EM images curated from nearly 600 three-dimensional (3D) and 10,000 two-dimensional (2D) images from >100 unrelated imaging projects. We show that models pre-trained on CEM500K learn features that are biologically relevant and resilient to meaningful image augmentations. Critically, we evaluate transfer learning from these pre-trained models on six publicly available and one newly derived benchmark segmentation task and report state-of-the-art results on each. We release the CEM500K dataset, pre-trained models and curation pipeline for model building and further expansion by the EM community. Data and code are available at https://www.ebi.ac.uk/pdbe/emdb/empiar/entry/10592/ and https://git.io/JLLTz.
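To illustrate the transfer-learning workflow that such pre-trained models enable, here is a minimal PyTorch sketch of freezing a pre-trained encoder and fine-tuning only a small task head; it uses an ImageNet-pretrained torchvision backbone purely as a stand-in, since the actual CEM500K-pretrained weights and architecture are distributed by the authors.

```python
import torch
import torch.nn as nn
from torchvision import models

# Stand-in pre-trained encoder (replace with the authors' released weights in practice).
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = nn.Identity()                      # drop the classification head

# Freeze the pre-trained encoder; fine-tune only a small task-specific head.
for p in backbone.parameters():
    p.requires_grad = False

head = nn.Linear(2048, 2)                        # e.g. foreground/background logits
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randn(4, 3, 224, 224)                  # dummy batch of EM patches
with torch.no_grad():
    features = backbone(x)                       # (4, 2048) feature vectors
logits = head(features)
print(logits.shape)
```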
Affiliation(s)
- Ryan Conrad
- Center for Molecular Microscopy, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, United States
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, United States
- Kedar Narayan
- Center for Molecular Microscopy, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, United States
- Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Frederick, United States
7. Wagner M, Reinke S, Hänsel R, Klapper W, Braumann UD. An image dataset related to automated macrophage detection in immunostained lymphoma tissue samples. Gigascience 2020; 9:giaa016. [PMID: 32161948] [PMCID: PMC7066390] [DOI: 10.1093/gigascience/giaa016]
Abstract
Background: We present an image dataset related to automated segmentation and counting of macrophages in diffuse large B-cell lymphoma (DLBCL) tissue sections. For the classification of DLBCL subtypes, as well as for providing a prognosis of the clinical outcome, the analysis of the tumor microenvironment and, particularly, of the different types and functions of tumor-associated macrophages is indispensable. Until now, however, most information about macrophages has been obtained either completely indirectly by gene expression profiling or by manual counts in immunohistochemically (IHC) fluorescence-stained tissue samples, while automated recognition of single IHC-stained macrophages remains a difficult task. In an accompanying publication, a reliable approach to this problem has been established, and a large set of related images has been generated and analyzed.
Results: The provided image data comprise (i) fluorescence microscopy images of 44 multiple immunohistostained DLBCL tumor subregions, captured in four channels corresponding to CD14, CD163, Pax5, and DAPI; (ii) "cartoon-like" total variation-filtered versions of these images, generated by Rudin-Osher-Fatemi denoising; (iii) an automatically generated mask of the evaluation subregion, based on information from the DAPI channel; and (iv) automatically generated segmentation masks for macrophages (using information from the CD14 and CD163 channels), B-cells (using information from the Pax5 channel), and all cell nuclei (using information from the DAPI channel).
Conclusions: A large set of IHC-stained DLBCL specimens is provided together with segmentation masks for different cell populations generated by a reference method for automated image analysis, thus featuring considerable reuse potential.
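The "cartoon-like" channels come from Rudin-Osher-Fatemi total-variation denoising; a minimal scikit-image sketch of the same type of filtering on a single fluorescence channel is given below. The denoising weight, the synthetic input, and the Otsu threshold are illustrative assumptions, not the authors' parameters or pipeline.

```python
import numpy as np
from skimage.restoration import denoise_tv_chambolle
from skimage.filters import threshold_otsu

# Hypothetical single-channel fluorescence image (e.g. the CD163 channel), scaled to [0, 1].
rng = np.random.default_rng(7)
channel = rng.random((512, 512)).astype(np.float32)

# ROF-style total-variation denoising; the weight controls the "cartoon-like" smoothing.
smoothed = denoise_tv_chambolle(channel, weight=0.15)

# A simple global threshold as a stand-in for the mask-generation step.
mask = smoothed > threshold_otsu(smoothed)
print("Foreground fraction:", float(mask.mean()))
```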
Affiliation(s)
- Marcus Wagner
- Institute for Medical Informatics, Statistics and Epidemiology (IMISE), Leipzig University, Härtelstr. 16-18, D-04107 Leipzig, Germany
- Sarah Reinke
- Department of Pathology, Hematopathology Section and Lymph Node Registry, University of Kiel/University Hospital Schleswig-Holstein, Arnold-Heller-Str. 3, Haus 14, D-24105 Kiel, Germany
- René Hänsel
- Institute for Medical Informatics, Statistics and Epidemiology (IMISE), Leipzig University, Härtelstr. 16-18, D-04107 Leipzig, Germany
- Wolfram Klapper
- Department of Pathology, Hematopathology Section and Lymph Node Registry, University of Kiel/University Hospital Schleswig-Holstein, Arnold-Heller-Str. 3, Haus 14, D-24105 Kiel, Germany
- Ulf-Dietrich Braumann
- Faculty of Engineering, Leipzig University of Applied Sciences (HTWK), P.O.B. 30 11 66, D-04251 Leipzig, Germany
- Fraunhofer Institute for Cell Therapy and Immunology (IZI), Perlickstr. 1, D-04103 Leipzig, Germany