1
Li G, Wu C, Wang D, Srinivasan V, Kaeli DR, Dy JG, Gu AZ. Machine Learning-Based Determination of Sampling Depth for Complex Environmental Systems: Case Study with Single-Cell Raman Spectroscopy Data in EBPR Systems. Environ Sci Technol 2022; 56:13473-13484. [PMID: 36048618 DOI: 10.1021/acs.est.1c08768] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/15/2023]
Abstract
Rapid progress in advanced analytical methods, such as single-cell technologies, enables an unprecedented and deeper understanding of microbial ecology beyond the resolution of conventional approaches. A major application challenge lies in determining a sufficient sample size without adequate prior knowledge of community complexity, and in the need to balance statistical power against limited time or resources. This hinders the desired standardization and wider application of these technologies. Here, we proposed, tested, and validated a computational sampling size assessment protocol that takes advantage of a metric named kernel divergence. This metric has two advantages. First, it directly compares data set-wise distributional differences, with no requirement for human intervention or prior knowledge-based preclassification. Second, minimal assumptions about distribution and sample space are made in data processing, which broadens its application domain. This enables test-verified appropriate handling of data sets with both linear and nonlinear relationships. The model was then validated in a case study with single-cell Raman spectroscopy (SCRS) phenotyping data sets from eight different enhanced biological phosphorus removal (EBPR) activated sludge communities located across North America. The model allows the determination of sufficient sampling size for any targeted or customized information capture capacity or resolution level. Given its flexibility and minimal restrictions on input data types, the proposed method is expected to serve as a standardized approach for sampling size optimization, enabling more comparable and reproducible experiments and analyses on complex environmental samples. Finally, these advantages enable the extension of the capability to other single-cell technologies or environmental applications with data sets exhibiting continuous features.
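The idea of comparing a subsample against the full data set with a kernel-based distributional distance can be illustrated with a minimal sketch. This is not the kernel divergence defined in the paper: it substitutes the well-known (biased) squared maximum mean discrepancy with an RBF kernel as a stand-in, works on one-dimensional data only, and the names `mmd2`, `sufficient_size`, `gamma`, and `tol` are hypothetical.

```python
import math
import random


def rbf(x, y, gamma=1.0):
    """RBF kernel between two scalars (gamma is an assumed bandwidth)."""
    return math.exp(-gamma * (x - y) ** 2)


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of squared maximum mean discrepancy between two samples.

    Used here only as a stand-in for a kernel-based divergence.
    """
    def mean_k(a, b):
        return sum(rbf(p, q, gamma) for p in a for q in b) / (len(a) * len(b))

    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)


def sufficient_size(data, sizes, tol, seed=0):
    """Return the smallest candidate size whose random subsample lies within
    `tol` of the full data set, or None if no candidate qualifies."""
    rng = random.Random(seed)
    for n in sorted(sizes):
        subsample = rng.sample(data, n)  # sampling without replacement
        if mmd2(subsample, data) < tol:
            return n
    return None


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic bimodal data standing in for a continuous single-cell feature.
    data = [rng.gauss(0.0, 1.0) for _ in range(100)] + [rng.gauss(4.0, 1.0) for _ in range(100)]
    print(sufficient_size(data, sizes=[5, 10, 25, 50, 100, 200], tol=0.01))
```

In practice one would repeat the subsampling several times per size and look at the spread of the divergence, since a single draw can be lucky or unlucky.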
Affiliation(s)
- Guangyu Li
  - Department of Civil and Environmental Engineering, Northeastern University, Boston, Massachusetts 02115-5026, United States
  - School of Civil and Environmental Engineering, Cornell University, Ithaca, New York 14853-0001, United States
- Chieh Wu
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts 02115-5005, United States
- Dongqi Wang
  - Department of Civil and Environmental Engineering, Northeastern University, Boston, Massachusetts 02115-5026, United States
  - Department of Municipal and Environmental Engineering, School of Water Resources and Hydro-Electric Engineering, Xi'an University of Technology, Xi'an, Shaanxi 710048, PRC
- Varun Srinivasan
  - Department of Civil and Environmental Engineering, Northeastern University, Boston, Massachusetts 02115-5026, United States
  - Brown and Caldwell, One Tech Drive, Andover, Massachusetts 01810, United States
- David R Kaeli
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts 02115-5005, United States
- Jennifer G Dy
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts 02115-5005, United States
- April Z Gu
  - Department of Civil and Environmental Engineering, Northeastern University, Boston, Massachusetts 02115-5026, United States
  - School of Civil and Environmental Engineering, Cornell University, Ithaca, New York 14853-0001, United States
2
Boueiz A, Xu Z, Chang Y, Masoomi A, Gregory A, Lutz S, Qiao D, Crapo JD, Dy JG, Silverman EK, Castaldi PJ. Machine Learning Prediction of Progression in Forced Expiratory Volume in 1 Second in the COPDGene® Study. Chronic Obstr Pulm Dis 2022; 9:349-365. [PMID: 35649102 PMCID: PMC9448009 DOI: 10.15326/jcopdf.2021.0275] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 05/18/2022] [Indexed: 05/24/2023]
Abstract
BACKGROUND The heterogeneous nature of chronic obstructive pulmonary disease (COPD) complicates the identification of the predictors of disease progression. We aimed to improve the prediction of disease progression in COPD by using machine learning and incorporating a rich dataset of phenotypic features. METHODS We included 4496 smokers with available data from their enrollment and 5-year follow-up visits in the COPD Genetic Epidemiology (COPDGene®) study. We constructed linear regression (LR) and supervised random forest models to predict 5-year progression in forced expiratory volume in 1 second (FEV1) from 46 baseline features. Using cross-validation, we randomly partitioned participants into training and testing samples. We also validated the results in the COPDGene 10-year follow-up visit. RESULTS Predicting the change in FEV1 over time is more challenging than simply predicting the future absolute FEV1 level. For random forest, R-squared was 0.15 and the area under the receiver operating characteristic (ROC) curve for the prediction of participants in the top quartile of observed progression was 0.71 in the testing sample; the corresponding values in the validation sample were 0.10 and 0.70. Random forest provided slightly better performance than LR. The accuracy was best for Global initiative for chronic Obstructive Lung Disease (GOLD) grades 1-2 participants, and it was harder to achieve accurate prediction in advanced stages of the disease. Predictive variables differed in their relative importance, both overall and for predictions by GOLD grade. CONCLUSION Random forest, along with deep phenotyping, predicts FEV1 progression with reasonable accuracy. There is significant room for improvement in future models. This prediction model facilitates the identification of smokers at increased risk for rapid disease progression. Such findings may be useful in the selection of patient populations for targeted clinical trials.
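The two metrics reported above, R-squared for the continuous prediction and ROC AUC for identifying the top quartile of observed progression, can be sketched in plain Python. This is an illustration under stated assumptions, not the study's code: the function names are hypothetical, the toy numbers are invented, and "top quartile" is taken here to mean the largest quarter of observed values.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


def top_quartile_labels(observed):
    """Label the largest quarter of observed values as positives.

    Assumes at least four values; ties at the threshold are all labeled positive.
    """
    k = len(observed) // 4
    threshold = sorted(observed, reverse=True)[k - 1]
    return [o >= threshold for o in observed]


def roc_auc(labels, scores):
    """ROC AUC as the probability that a positive outscores a negative
    (ties count one half). Quadratic in sample size, fine for a sketch."""
    pos = [s for l, s in zip(labels, scores) if l]
    neg = [s for l, s in zip(labels, scores) if not l]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Invented toy values, standing in for observed and predicted FEV1 decline.
    observed = [10, 20, 30, 40, 50, 60, 70, 80]
    predicted = [12, 18, 33, 35, 55, 52, 75, 70]
    labels = top_quartile_labels(observed)
    print(r_squared(observed, predicted), roc_auc(labels, predicted))
```

A model can rank participants reasonably well (moderate AUC) while explaining little variance (low R-squared), which is consistent with the pattern of 0.71 and 0.15 reported in the abstract.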
Affiliation(s)
- Adel Boueiz
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States
  - Pulmonary and Critical Care Division, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States
  - *These authors contributed equally
- Zhonghui Xu
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States
  - *These authors contributed equally
- Yale Chang
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts, United States
- Aria Masoomi
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts, United States
- Andrew Gregory
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States
- Sharon Lutz
  - Department of Population Medicine, Harvard Pilgrim Health Care Institute, Boston, Massachusetts, United States
- Dandi Qiao
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States
- James D. Crapo
  - Division of Pulmonary Medicine, Department of Medicine, National Jewish Health, Denver, Colorado, United States
- Jennifer G. Dy
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts, United States
- Edwin K. Silverman
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States
  - Pulmonary and Critical Care Division, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States
- Peter J. Castaldi
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States
  - Division of General Medicine and Primary Care, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States
3
Bozkurt A, Kose K, Coll-Font J, Alessi-Fox C, Brooks DH, Dy JG, Rajadhyaksha M. Skin strata delineation in reflectance confocal microscopy images using recurrent convolutional networks with attention. Sci Rep 2021; 11:12576. [PMID: 34131165 PMCID: PMC8206415 DOI: 10.1038/s41598-021-90328-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/23/2020] [Accepted: 04/12/2021] [Indexed: 11/29/2022] Open
Abstract
Reflectance confocal microscopy (RCM) is an effective non-invasive tool for cancer diagnosis. However, acquiring and reading RCM images require extensive training and experience, and novice clinicians exhibit high discordance in diagnostic accuracy. Quantitative tools to standardize image acquisition could reduce both required training and diagnostic variability. To perform diagnostic analysis, clinicians collect a set of RCM mosaics (RCM images concatenated in a raster fashion to extend the field of view) at 4-5 specific layers in skin, all localized in the junction between the epidermal and dermal layers (dermal-epidermal junction, DEJ), necessitating locating that junction before mosaic acquisition. In this study, we automate DEJ localization using deep recurrent convolutional neural networks to delineate skin strata in stacks of RCM images collected at consecutive depths. Success would lead to automated and quantitative mosaic acquisition, thus reducing inter-operator variability and bringing standardization to imaging. Testing our model against an expert-labeled dataset of 504 RCM stacks, we achieved [Formula: see text] classification accuracy and a nine-fold reduction in the number of anatomically impossible errors compared to the previous state-of-the-art.
Affiliation(s)
- Alican Bozkurt
  - Northeastern University, Boston, MA, 02115, USA.
  - Paige AI, New York, NY, USA.
- Kivanc Kose
  - Memorial Sloan Kettering Cancer Center, New York, NY, 10022, USA
- Jaume Coll-Font
  - Northeastern University, Boston, MA, 02115, USA
  - Massachusetts General Hospital, Boston, MA, USA
4
Lee SI, Adans-Dester CP, OBrien AT, Vergara-Diaz GP, Black-Schaffer R, Zafonte R, Dy JG, Bonato P. Predicting and Monitoring Upper-Limb Rehabilitation Outcomes Using Clinical and Wearable Sensor Data in Brain Injury Survivors. IEEE Trans Biomed Eng 2021; 68:1871-1881. [PMID: 32997621 PMCID: PMC8723794 DOI: 10.1109/tbme.2020.3027853] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/25/2022]
Abstract
OBJECTIVE Rehabilitation specialists have shown considerable interest in the development of models, based on clinical data, to predict the response to rehabilitation interventions in stroke and traumatic brain injury survivors. However, accurate predictions are difficult to obtain due to the variability in patients' response to rehabilitation interventions. This study aimed to investigate the use of wearable technology in combination with clinical data to predict and monitor the recovery process and assess the responsiveness to treatment on an individual basis. METHODS Gaussian Process Regression-based algorithms were developed to estimate rehabilitation outcomes (i.e., Functional Ability Scale scores) using either clinical or wearable sensor data or a combination of the two. RESULTS The algorithm based on clinical data predicted rehabilitation outcomes with a Pearson's correlation of 0.79 compared to actual clinical scores provided by clinicians but failed to model the variability in responsiveness to the intervention observed across individuals. In contrast, the algorithm based on wearable sensor data generated rehabilitation outcome estimates with a Pearson's correlation of 0.91 and modeled the individual responses to rehabilitation more accurately. Furthermore, we developed a novel approach to combine estimates derived from the clinical data and the sensor data using a constrained linear model. This approach resulted in a Pearson's correlation of 0.94 between estimated and clinician-provided scores. CONCLUSION This algorithm could enable the design of patient-specific interventions based on predictions of rehabilitation outcomes relying on clinical and wearable sensor data. SIGNIFICANCE This is important in the context of developing precision rehabilitation interventions.
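One simple way to combine two sets of estimates with a constrained linear model is a convex combination whose single weight is fitted by least squares and kept in [0, 1]. The abstract does not specify the constraint used in the study, so the form below is an assumption for illustration, and the names `fit_convex_weight`, `combine`, and `pearson` as well as the toy scores are hypothetical.

```python
import math


def fit_convex_weight(est_a, est_b, target):
    """Least-squares weight w in [0, 1] for w * est_a + (1 - w) * est_b.

    The unconstrained optimum is sum(d * r) / sum(d * d) with d = a - b and
    r = target - b; clipping is exact because the objective is a 1-D convex quadratic.
    """
    d = [a - b for a, b in zip(est_a, est_b)]
    r = [t - b for t, b in zip(target, est_b)]
    denom = sum(x * x for x in d)
    if denom == 0.0:
        return 0.5  # both estimators agree everywhere; any weight is optimal
    w = sum(x * y for x, y in zip(d, r)) / denom
    return min(1.0, max(0.0, w))


def combine(est_a, est_b, w):
    """Apply the convex combination with weight w on est_a."""
    return [w * a + (1.0 - w) * b for a, b in zip(est_a, est_b)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


if __name__ == "__main__":
    clinical = [1.0, 2.0, 3.0, 4.0]   # invented clinical-data estimates
    sensor = [2.0, 2.0, 4.0, 4.0]     # invented sensor-data estimates
    scores = [1.5, 2.0, 3.5, 4.0]     # invented clinician-provided scores
    w = fit_convex_weight(clinical, sensor, scores)
    print(w, pearson(combine(clinical, sensor, w), scores))
```

Keeping the weight in [0, 1] means the combined estimate always lies between the two inputs, which is one plausible reason to constrain such a model.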
5
Kose K, Bozkurt A, Alessi-Fox C, Gill M, Longo C, Pellacani G, Dy JG, Brooks DH, Rajadhyaksha M. Segmentation of cellular patterns in confocal images of melanocytic lesions in vivo via a multiscale encoder-decoder network (MED-Net). Med Image Anal 2021; 67:101841. [PMID: 33142135 PMCID: PMC7885250 DOI: 10.1016/j.media.2020.101841] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 12/31/2019] [Revised: 09/17/2020] [Accepted: 09/18/2020] [Indexed: 12/11/2022]
Abstract
In-vivo optical microscopy is advancing into routine clinical practice for non-invasively guiding diagnosis and treatment of cancer and other diseases, and is thus beginning to reduce the need for traditional biopsy. However, reading and analysis of the optical microscopic images are generally still qualitative, relying mainly on visual examination. Here we present an automated semantic segmentation method called "Multiscale Encoder-Decoder Network (MED-Net)" that provides pixel-wise labeling into classes of patterns in a quantitative manner. The novelty in our approach is the modeling of textural patterns at multiple scales (magnifications, resolutions). This mimics the traditional procedure for examining pathology images, which routinely starts with low magnification (low resolution, large field of view) followed by closer inspection of suspicious areas with higher magnification (higher resolution, smaller fields of view). We trained and tested our model on non-overlapping partitions of 117 reflectance confocal microscopy (RCM) mosaics of melanocytic lesions, an extensive dataset for this application, collected at four clinics in the US and two in Italy. With patient-wise cross-validation, we achieved pixel-wise mean sensitivity and specificity of 74% and 92%, respectively, with 0.74 Dice coefficient over six classes. In a second scenario, we partitioned the data clinic-wise and tested the generalizability of the model over multiple clinics. In this setting, we achieved pixel-wise mean sensitivity and specificity of 77% and 94%, respectively, with 0.77 Dice coefficient. We compared MED-Net against the state-of-the-art semantic segmentation models and achieved better quantitative segmentation performance. Our results also suggest that, due to its nested multiscale architecture, the MED-Net model annotated RCM mosaics more coherently, avoiding unrealistically fragmented annotations.
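Pixel-wise sensitivity, specificity, and Dice coefficient for a multi-class segmentation are commonly computed per class in a one-vs-rest fashion and then averaged. The sketch below follows that convention; whether the study averaged in exactly this way is an assumption, and the function name `per_class_metrics` and the toy label arrays are hypothetical.

```python
def per_class_metrics(pred, truth, cls):
    """One-vs-rest sensitivity, specificity, and Dice for class `cls`.

    `pred` and `truth` are flat sequences of integer class labels, one per pixel.
    Assumes the class is present and absent at least once in `truth`.
    """
    tp = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
    fp = sum(1 for p, t in zip(pred, truth) if p == cls and t != cls)
    fn = sum(1 for p, t in zip(pred, truth) if p != cls and t == cls)
    tn = sum(1 for p, t in zip(pred, truth) if p != cls and t != cls)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    dice = 2 * tp / (2 * tp + fp + fn)
    return sensitivity, specificity, dice


def mean_metrics(pred, truth, classes):
    """Unweighted mean of each metric over the given classes."""
    rows = [per_class_metrics(pred, truth, c) for c in classes]
    return tuple(sum(col) / len(rows) for col in zip(*rows))


if __name__ == "__main__":
    # Invented six-pixel example with three classes.
    pred = [0, 1, 1, 2, 2, 0]
    truth = [0, 1, 2, 2, 2, 1]
    print(mean_metrics(pred, truth, classes=[0, 1, 2]))
```

Because true negatives dominate in one-vs-rest counts, specificity tends to be higher than sensitivity, which matches the pattern of the reported figures (92% vs 74%).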
Affiliation(s)
- Kivanc Kose
  - Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, 11377, NY, USA.
- Alican Bozkurt
  - Electrical and Computer Engineering Department, Northeastern University, Boston, 02115, MA, USA.
- Melissa Gill
  - Department of Pathology at SUNY Downstate Medical Center, New York, 11203, NY, USA; SkinMedical Research Diagnostics, P.L.L.C., Dobbs Ferry, 10522, NY, USA; Faculty of Medicine and Health Sciences, University of Alcala de Henares, Madrid, Spain.
- Caterina Longo
  - University of Modena and Reggio Emilia, Reggio Emilia, Italy; Azienda Unità Sanitaria Locale - IRCCS di Reggio Emilia, Centro Oncologico ad Alta Tecnologia Diagnostica-Dermatologia, Reggio Emilia, Italy.
- Jennifer G Dy
  - Electrical and Computer Engineering Department, Northeastern University, Boston, 02115, MA, USA.
- Dana H Brooks
  - Electrical and Computer Engineering Department, Northeastern University, Boston, 02115, MA, USA.
- Milind Rajadhyaksha
  - Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, 11377, NY, USA.
6
Kose K, Bozkurt A, Alessi-Fox C, Brooks DH, Dy JG, Rajadhyaksha M, Gill M. Utilizing Machine Learning for Image Quality Assessment for Reflectance Confocal Microscopy. J Invest Dermatol 2020; 140:1214-1222. [PMID: 31838127 PMCID: PMC7967900 DOI: 10.1016/j.jid.2019.10.018] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/28/2019] [Revised: 10/09/2019] [Accepted: 10/16/2019] [Indexed: 10/25/2022]
Abstract
In vivo reflectance confocal microscopy (RCM) enables clinicians to examine lesions' morphological and cytological information in epidermal and dermal layers while reducing the need for biopsies. As RCM is being adopted more widely, the workflow is expanding from real-time diagnosis at the bedside to include a capture, store, and forward model with image interpretation and diagnosis occurring offsite, similar to radiology. As the patient may no longer be present at the time of image interpretation, quality assurance is key during image acquisition. Herein, we introduce a quality assurance process by means of automatically quantifying diagnostically uninformative areas within the lesional area by using RCM and coregistered dermoscopy images together. We trained and validated a pixel-level segmentation model on 117 RCM mosaics collected by international collaborators. The model delineates diagnostically uninformative areas with 82% sensitivity and 93% specificity. We further tested the model on a separate set of 372 coregistered RCM-dermoscopic image pairs and illustrate how the results of the RCM-only model can be improved via a multimodal (RCM + dermoscopy) approach, which can help quantify the uninformative regions within the lesional area. Our data suggest that machine learning-based automatic quantification offers a feasible objective quality control measure for RCM imaging.
Affiliation(s)
- Kivanc Kose
  - Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, New York, USA.
- Alican Bozkurt
  - Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts, USA
- Dana H Brooks
  - Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts, USA
- Jennifer G Dy
  - Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts, USA
- Milind Rajadhyaksha
  - Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, New York, USA
- Melissa Gill
  - Department of Pathology, SUNY Downstate Medical Center, Brooklyn, New York, USA; SkinMedical Research and Diagnostics, PLLC, Dobbs Ferry, New York, USA
7
Castaldi PJ, Boueiz A, Yun J, Estepar RSJ, Ross JC, Washko G, Cho MH, Hersh CP, Kinney GL, Young KA, Regan EA, Lynch DA, Criner GJ, Dy JG, Rennard SI, Casaburi R, Make BJ, Crapo J, Silverman EK, Hokanson JE. Machine Learning Characterization of COPD Subtypes: Insights From the COPDGene Study. Chest 2019; 157:1147-1157. [PMID: 31887283 DOI: 10.1016/j.chest.2019.11.039] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 08/20/2019] [Revised: 10/18/2019] [Accepted: 11/29/2019] [Indexed: 12/17/2022] Open
Abstract
COPD is a heterogeneous syndrome. Many COPD subtypes have been proposed, but there is not yet consensus on how many COPD subtypes there are and how they should be defined. The COPD Genetic Epidemiology Study (COPDGene), which has generated 10-year longitudinal chest imaging, spirometry, and molecular data, is a rich resource for relating COPD phenotypes to underlying genetic and molecular mechanisms. In this article, we place COPDGene clustering studies in context with other highly cited COPD clustering studies, and summarize the main COPD subtype findings from COPDGene. First, most manifestations of COPD occur along a continuum, which explains why continuous aspects of COPD or disease axes may be more accurate and reproducible than subtypes identified through clustering methods. Second, continuous COPD-related measures can be used to create subgroups through the use of predictive models to define cut-points, and we review COPDGene research on blood eosinophil count thresholds as a specific example. Third, COPD phenotypes identified or prioritized through machine learning methods have led to novel biological discoveries, including novel emphysema genetic risk variants and systemic inflammatory subtypes of COPD. Fourth, trajectory-based COPD subtyping captures differences in the longitudinal evolution of COPD, addressing a major limitation of clustering analyses that are confounded by disease severity. Ongoing longitudinal characterization of subjects in COPDGene will provide useful insights about the relationship between lung imaging parameters, molecular markers, and COPD progression that will enable the identification of subtypes based on underlying disease processes and distinct patterns of disease progression, with the potential to improve the clinical relevance and reproducibility of COPD subtypes.
Affiliation(s)
- Peter J Castaldi
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; General Medicine and Primary Care, Brigham and Women's Hospital, Harvard Medical School, Boston, MA.
- Adel Boueiz
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Jeong Yun
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Raul San Jose Estepar
  - Applied Chest Imaging Laboratory, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- James C Ross
  - Applied Chest Imaging Laboratory, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- George Washko
  - Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Applied Chest Imaging Laboratory, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Michael H Cho
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Craig P Hersh
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Gregory L Kinney
  - Department of Epidemiology, University of Colorado, Denver, Aurora, CO
- Kendra A Young
  - Department of Epidemiology, University of Colorado, Denver, Aurora, CO
- David A Lynch
  - Department of Radiology, National Jewish Health, Denver, CO
- Gerald J Criner
  - Department of Thoracic Medicine and Surgery, Lewis Katz School of Medicine at Temple University, Philadelphia, PA
- Jennifer G Dy
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, MA
- Stephen I Rennard
  - Pulmonary and Critical Care Medicine, University of Nebraska Medical Center, Omaha, NE
- Richard Casaburi
  - Rehabilitation Clinical Trials Center, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, CA
- Edwin K Silverman
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- John E Hokanson
  - Department of Epidemiology, University of Colorado, Denver, Aurora, CO
8
Lowe KE, Regan EA, Anzueto A, Austin E, Austin JHM, Beaty TH, Benos PV, Benway CJ, Bhatt SP, Bleecker ER, Bodduluri S, Bon J, Boriek AM, Boueiz ARE, Bowler RP, Budoff M, Casaburi R, Castaldi PJ, Charbonnier JP, Cho MH, Comellas A, Conrad D, Costa Davis C, Criner GJ, Curran-Everett D, Curtis JL, DeMeo DL, Diaz AA, Dransfield MT, Dy JG, Fawzy A, Fleming M, Flenaugh EL, Foreman MG, Fortis S, Gebrekristos H, Grant S, Grenier PA, Gu T, Gupta A, Han MK, Hanania NA, Hansel NN, Hayden LP, Hersh CP, Hobbs BD, Hoffman EA, Hogg JC, Hokanson JE, Hoth KF, Hsiao A, Humphries S, Jacobs K, Jacobson FL, Kazerooni EA, Kim V, Kim WJ, Kinney GL, Koegler H, Lutz SM, Lynch DA, MacIntye Jr. NR, Make BJ, Marchetti N, Martinez FJ, Maselli DJ, Mathews AM, McCormack MC, McDonald MLN, McEvoy CE, Moll M, Molye SS, Murray S, Nath H, Newell Jr. JD, Occhipinti M, Paoletti M, Parekh T, Pistolesi M, Pratte KA, Putcha N, Ragland M, Reinhardt JM, Rennard SI, Rosiello RA, Ross JC, Rossiter HB, Ruczinski I, San Jose Estepar R, Sciurba FC, Sieren JC, Singh H, Soler X, Steiner RM, Strand MJ, Stringer WW, Tal-Singer R, Thomashow B, Vegas Sánchez-Ferrero G, Walsh JW, Wan ES, Washko GR, Michael Wells J, Wendt CH, Westney G, Wilson A, Wise RA, Yen A, Young K, Yun J, Silverman EK, Crapo JD. COPDGene ® 2019: Redefining the Diagnosis of Chronic Obstructive Pulmonary Disease. Chronic Obstr Pulm Dis 2019; 6:384-399. [PMID: 31710793 PMCID: PMC7020846 DOI: 10.15326/jcopdf.6.5.2019.0149] [Citation(s) in RCA: 83] [Impact Index Per Article: 16.6] [Accepted: 10/11/2019] [Indexed: 12/27/2022]
Abstract
BACKGROUND Chronic obstructive pulmonary disease (COPD) remains a major cause of morbidity and mortality. Present-day diagnostic criteria are based largely on spirometry alone. Accumulating evidence has identified a substantial number of individuals without spirometric evidence of COPD who suffer from respiratory symptoms and/or increased morbidity and mortality. There is a clear need for an expanded definition of COPD that is linked to physiologic, structural (computed tomography [CT]) and clinical evidence of disease. Using data from the COPD Genetic Epidemiology study (COPDGene®), we hypothesized that an integrated approach that includes environmental exposure, clinical symptoms, chest CT imaging and spirometry better defines disease and captures the likelihood of progression of respiratory obstruction and mortality. METHODS Four key disease characteristics - environmental exposure (cigarette smoking), clinical symptoms (dyspnea and/or chronic bronchitis), chest CT imaging abnormalities (emphysema, gas trapping and/or airway wall thickening), and abnormal spirometry - were evaluated in a group of 8784 current and former smokers who were participants in COPDGene® Phase 1. Using these 4 disease characteristics, 8 categories of participants were identified and evaluated for odds of spirometric disease progression (FEV1 > 350 ml loss over 5 years), and the hazard ratio for all-cause mortality was examined. RESULTS Using smokers without symptoms, CT imaging abnormalities or airflow obstruction as the reference population, individuals were classified as Possible COPD, Probable COPD and Definite COPD. Current Global initiative for chronic Obstructive Lung Disease (GOLD) criteria would diagnose 4062 (46%) of the 8784 study participants with COPD. The proposed COPDGene® 2019 diagnostic criteria would add an additional 3144 participants. Under the new criteria, 82% of the 8784 study participants would be diagnosed with Possible, Probable or Definite COPD.
These COPD groups showed increased risk of disease progression and mortality. Mortality increased in patients as the number of their COPD characteristics increased, with a maximum hazard ratio for all-cause mortality of 5.18 (95% confidence interval [CI]: 4.15-6.48) in those with all 4 disease characteristics. CONCLUSIONS A substantial portion of smokers with respiratory symptoms and imaging abnormalities do not manifest spirometric obstruction as defined by population normals. These individuals are at significant risk of death and spirometric disease progression. We propose to redefine the diagnosis of COPD through an integrated approach using environmental exposure, clinical symptoms, CT imaging and spirometric criteria. These expanded criteria offer the potential to stimulate both current and future interventions that could slow or halt disease progression in patients before disability or irreversible lung structural changes develop.
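The counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below also enumerates the 8 categories under one plausible reading, stated here as an assumption: since every participant was a current or former smoker, the exposure characteristic is constant, and the 8 categories are the 2^3 presence/absence combinations of the remaining three characteristics. The variable names are hypothetical.

```python
from itertools import product

TOTAL = 8784       # COPDGene Phase 1 participants in the analysis
GOLD_COPD = 4062   # diagnosed under current GOLD criteria
ADDED = 3144       # additionally diagnosed under the proposed criteria


def percent(part, whole):
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


# Assumption: exposure is present in all participants, so categories are the
# combinations of (symptoms, CT abnormality, abnormal spirometry).
categories = list(product([False, True], repeat=3))

gold_pct = percent(GOLD_COPD, TOTAL)
expanded = GOLD_COPD + ADDED
expanded_pct = percent(expanded, TOTAL)

print(len(categories), expanded, round(gold_pct), round(expanded_pct))
```

The totals are consistent with the abstract: 4062 of 8784 rounds to 46%, and 4062 + 3144 = 7206 of 8784 rounds to 82%.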
Affiliation(s)
- Katherine E. Lowe
  - Cleveland Clinic Lerner College of Medicine of Case Western Reserve School of Medicine, Cleveland, Ohio
- Jessica Bon
  - University of Pittsburgh, Pittsburgh, Pennsylvania
  - VA Pittsburgh Healthcare System, Pittsburgh, Pennsylvania
- Matthew Budoff
  - Los Angeles Biomedical Research Institute at Harbor-University of California Los Angeles Medical Center, Torrance
- Richard Casaburi
  - Los Angeles Biomedical Research Institute at Harbor-University of California Los Angeles Medical Center, Torrance
- Margaret Fleming
  - Novartis Institute for Biomedical Research, Cambridge, Massachusetts
- Sarah Grant
  - Novartis Institute for Biomedical Research, Cambridge, Massachusetts
- Tian Gu
  - University of Michigan, Ann Arbor
- Abhya Gupta
  - Boehringer Ingelheim, Biberach an der Riss, Germany
- Victor Kim
  - Temple University, Philadelphia, Pennsylvania
- Woo Jin Kim
  - Kangwon National University, Chuncheon, Korea
- Matthew Moll
  - Brigham and Women's Hospital, Boston, Massachusetts
- Stephen I. Rennard
  - AstraZeneca, Cambridge, United Kingdom
  - University of Nebraska Medical Center, Omaha
- Harry B. Rossiter
  - Los Angeles Biomedical Research Institute at Harbor-University of California Los Angeles Medical Center, Torrance
  - University of Leeds, Leeds, United Kingdom
- Xavier Soler
  - University of California at San Diego
  - GlaxoSmithKline, Research Triangle Park, North Carolina
- William W. Stringer
  - Los Angeles Biomedical Research Institute at Harbor-University of California Los Angeles Medical Center, Torrance
- Emily S. Wan
  - Brigham and Women's Hospital, Boston, Massachusetts
  - VA Boston Healthcare System, Jamaica Plain, Massachusetts
- Kendra Young
  - University of Colorado Anschutz Medical Campus, Aurora
- Jeong Yun
  - Brigham and Women's Hospital, Boston, Massachusetts
|
9
|
Sourati J, Gholipour A, Dy JG, Tomas-Fernandez X, Kurugol S, Warfield SK. Intelligent Labeling Based on Fisher Information for Medical Image Segmentation Using Deep Learning. IEEE Trans Med Imaging 2019; 38:2642-2653. [PMID: 30932833 PMCID: PMC7179938 DOI: 10.1109/tmi.2019.2907805] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 06/09/2023]
Abstract
Deep convolutional neural networks (CNN) have recently achieved superior performance at the task of medical image segmentation compared to classic models. However, training a generalizable CNN requires a large amount of training data, which is difficult, expensive, and time-consuming to obtain in medical settings. Active Learning (AL) algorithms can facilitate training CNN models by proposing a small number of the most informative data samples to be annotated to achieve a rapid increase in performance. We proposed a new active learning method based on Fisher information (FI) for CNNs for the first time. Using efficient backpropagation methods for computing gradients together with a novel low-dimensional approximation of FI enabled us to compute FI for CNNs with a large number of parameters. We evaluated the proposed method for brain extraction with a patch-wise segmentation CNN model in two different learning scenarios: universal active learning and active semi-automatic segmentation. In both scenarios, an initial model was obtained using labeled training subjects of a source data set and the goal was to annotate a small subset of new samples to build a model that performs well on the target subject(s). The target data sets included images that differed from the source data by either age group (e.g. newborns with different image contrast) or underlying pathology that was not available in the source data. In comparison to several recently proposed AL methods and brain extraction baselines, the results showed that FI-based AL outperformed the competing methods in improving the performance of the model after labeling a very small portion of target data set (<0.25%).
|
10
|
Ross JC, Castaldi PJ, Cho MH, Hersh CP, Rahaghi FN, Sánchez-Ferrero GV, Parker MM, Litonjua AA, Sparrow D, Dy JG, Silverman EK, Washko GR, San José Estépar R. Longitudinal Modeling of Lung Function Trajectories in Smokers with and without Chronic Obstructive Pulmonary Disease. Am J Respir Crit Care Med 2018; 198:1033-1042. [PMID: 29671603 PMCID: PMC6221566 DOI: 10.1164/rccm.201707-1405oc] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 07/11/2017] [Accepted: 04/17/2018] [Indexed: 11/16/2022]
Abstract
RATIONALE The relationship between longitudinal lung function trajectories, chest computed tomography (CT) imaging, and genetic predisposition to chronic obstructive pulmonary disease (COPD) has not been explored. OBJECTIVES 1) To model trajectories using a data-driven approach applied to longitudinal data spanning adulthood in the Normative Aging Study (NAS), and 2) to apply these models to demographically similar subjects in the COPDGene (Genetic Epidemiology of COPD) Study with detailed phenotypic characterization including chest CT. METHODS We modeled lung function trajectories in 1,060 subjects in NAS with a median follow-up time of 29 years. We assigned 3,546 non-Hispanic white males in COPDGene to these trajectories for further analysis. We assessed phenotypic and genetic differences between trajectories and across age strata. MEASUREMENTS AND MAIN RESULTS We identified four trajectories in NAS with differing levels of maximum lung function and rate of decline. In COPDGene, 617 subjects (17%) were assigned to the lowest trajectory and had the greatest radiologic burden of disease (P < 0.01); 1,283 subjects (36%) were assigned to a low trajectory with evidence of airway disease preceding emphysema on CT; 1,411 subjects (40%) and 237 subjects (7%) were assigned to the remaining two trajectories and tended to have preserved lung function and negligible emphysema. The genetic contribution to these trajectories was as high as 83% (P = 0.02), and membership in lower lung function trajectories was associated with greater parental histories of COPD, decreased exercise capacity, greater dyspnea, and more frequent COPD exacerbations. CONCLUSIONS Data-driven analysis identifies four lung function trajectories. Trajectory membership has a genetic basis and is associated with distinct lung structural abnormalities.
Affiliation(s)
- Peter J. Castaldi
  - Channing Division of Network Medicine
  - Division of General Medicine
- Michael H. Cho
  - Channing Division of Network Medicine
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
- Craig P. Hersh
  - Channing Division of Network Medicine
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
- Farbod N. Rahaghi
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
- David Sparrow
  - VA Normative Aging Study, Veterans Affairs Boston Healthcare System, Boston, Massachusetts
  - Department of Medicine, Boston University School of Medicine, Boston, Massachusetts
- Jennifer G. Dy
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts
- Edwin K. Silverman
  - Channing Division of Network Medicine
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
- George R. Washko
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
|
11
|
Kinney GL, Santorico SA, Young KA, Cho MH, Castaldi PJ, San José Estépar R, Ross JC, Dy JG, Make BJ, Regan EA, Lynch DA, Everett DC, Lutz SM, Silverman EK, Washko GR, Crapo JD, Hokanson JE. Identification of Chronic Obstructive Pulmonary Disease Axes That Predict All-Cause Mortality: The COPDGene Study. Am J Epidemiol 2018; 187:2109-2116. [PMID: 29771274 DOI: 10.1093/aje/kwy087] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 10/11/2017] [Accepted: 04/13/2018] [Indexed: 11/12/2022]
Abstract
Chronic obstructive pulmonary disease (COPD) is a syndrome caused by damage to the lungs that results in decreased pulmonary function and reduced structural integrity. Pulmonary function testing (PFT) is used to diagnose and stratify COPD into severity groups, and computed tomography (CT) imaging of the chest is often used to assess structural changes in the lungs. We hypothesized that the combination of PFT and CT phenotypes would provide a more powerful tool for assessing underlying morphologic differences associated with pulmonary function in COPD than does PFT alone. We used factor analysis of 26 variables to classify 8,157 participants recruited into the COPDGene cohort between January 2008 and June 2011 from 21 clinical centers across the United States. These factors were used as predictors of all-cause mortality using Cox proportional hazards modeling. Five factors explained 80% of the covariance and represented the following domains: factor 1, increased emphysema and decreased pulmonary function; factor 2, airway disease and decreased pulmonary function; factor 3, gas trapping; factor 4, CT variability; and factor 5, hyperinflation. After more than 46,079 person-years of follow-up, factors 1 through 4 were associated with mortality and there was a significant synergistic interaction between factors 1 and 2 on death. Considering CT measures along with PFT in the assessment of COPD can identify patients at particularly high risk for death.
Affiliation(s)
- Gregory L Kinney
  - Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado
- Stephanie A Santorico
  - Department of Mathematical and Statistical Sciences, University of Colorado Denver, Denver, Colorado
  - Human Medical Genetics and Genomics Program, University of Colorado School of Medicine, Aurora, Colorado
  - Division of Biostatistics and Bioinformatics, Office of Academic Affairs, National Jewish Health, Denver, Colorado
- Kendra A Young
  - Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado
- Michael H Cho
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- Peter J Castaldi
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- Raul San José Estépar
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- James C Ross
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- Jennifer G Dy
  - Department of Electrical & Computer Engineering, Northeastern University, Boston, Massachusetts
- Barry J Make
  - Department of Medicine, National Jewish Health, Denver, Colorado
- David A Lynch
  - Department of Radiology, National Jewish Health, Denver, Colorado
- Douglas C Everett
  - Division of Biostatistics and Bioinformatics, Office of Academic Affairs, National Jewish Health, Denver, Colorado
- Sharon M Lutz
  - Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, Colorado
- Edwin K Silverman
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- George R Washko
  - Channing Division of Network Medicine, Brigham and Women’s Hospital, Boston, Massachusetts
- James D Crapo
  - Department of Medicine, National Jewish Health, Denver, Colorado
- John E Hokanson
  - Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado
|
12
|
Sourati J, Akcakaya M, Erdogmus D, Leen TK, Dy JG. A Probabilistic Active Learning Algorithm Based on Fisher Information Ratio. IEEE Trans Pattern Anal Mach Intell 2018; 40:2023-2029. [PMID: 28858784 DOI: 10.1109/tpami.2017.2743707] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 06/07/2023]
Abstract
The task of labeling samples is demanding and expensive. Active learning aims to generate the smallest possible training data set that results in a classifier with high performance in the test phase. It usually consists of two steps of selecting a set of queries and requesting their labels. Among the suggested objectives to score the query sets, information theoretic measures have become very popular. Yet among them, those based on Fisher information (FI) have the advantage of considering the diversity among the queries and tractable computations. In this work, we provide a practical algorithm based on Fisher information ratio to obtain query distribution for a general framework where, in contrast to the previous FI-based querying methods, we make no assumptions over the test distribution. The empirical results on synthetic and real-world data sets indicate that this algorithm gives competitive results.
|
13
|
Boueiz A, Chang Y, Cho MH, Washko GR, San José Estépar R, Bowler RP, Crapo JD, DeMeo DL, Dy JG, Silverman EK, Castaldi PJ. Lobar Emphysema Distribution Is Associated With 5-Year Radiological Disease Progression. Chest 2017; 153:65-76. [PMID: 28943279 DOI: 10.1016/j.chest.2017.09.022] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 04/03/2017] [Revised: 07/13/2017] [Accepted: 09/06/2017] [Indexed: 01/06/2023]
Abstract
BACKGROUND Emphysema has considerable variability in its regional distribution. Craniocaudal emphysema distribution is an important predictor of the response to lung volume reduction. However, there is little consensus regarding how to define upper lobe-predominant and lower lobe-predominant emphysema subtypes. Consequently, the clinical and genetic associations with these subtypes are poorly characterized. METHODS We sought to identify subgroups characterized by upper-lobe or lower-lobe emphysema predominance and comparable amounts of total emphysema by analyzing data from 9,210 smokers without alpha-1-antitrypsin deficiency in the Genetic Epidemiology of COPD (COPDGene) cohort. CT densitometric emphysema was measured in each lung lobe. Random forest clustering was applied to lobar emphysema variables after regressing out the effects of total emphysema. Clusters were tested for association with clinical and imaging outcomes at baseline and at 5-year follow-up. Their associations with genetic variants were also compared. RESULTS Three clusters were identified: minimal emphysema (n = 1,312), upper lobe-predominant emphysema (n = 905), and lower lobe-predominant emphysema (n = 796). Despite a similar amount of total emphysema, the lower-lobe group had more severe airflow obstruction at baseline and higher rates of metabolic syndrome compared with subjects with upper-lobe predominance. The group with upper-lobe predominance had greater 5-year progression of emphysema, gas trapping, and dyspnea. Differential associations with known COPD genetic risk variants were noted. CONCLUSIONS Subgroups of smokers defined by upper-lobe or lower-lobe emphysema predominance exhibit different functional and radiological disease progression rates, and the upper-lobe predominant subtype shows evidence of association with known COPD genetic risk variants. These subgroups may be useful in the development of personalized treatments for COPD.
Affiliation(s)
- Adel Boueiz
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine Division, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Yale Chang
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, MA
- Michael H Cho
  - Pulmonary and Critical Care Medicine Division, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- George R Washko
  - Pulmonary and Critical Care Medicine Division, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Raul San José Estépar
  - Surgical Planning Laboratory, Laboratory of Mathematics in Imaging, Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Russell P Bowler
  - Division of Pulmonary Medicine, Department of Medicine, National Jewish Health, Denver, CO
- James D Crapo
  - Division of Pulmonary Medicine, Department of Medicine, National Jewish Health, Denver, CO
- Dawn L DeMeo
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine Division, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Jennifer G Dy
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, MA
- Edwin K Silverman
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Pulmonary and Critical Care Medicine Division, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
- Peter J Castaldi
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA; Division of General Medicine and Primary Care, Brigham and Women's Hospital, Harvard Medical School, Boston, MA
|
14
|
Sourati J, Kazmierczak SC, Akcakaya M, Dy JG, Leen TK, Erdogmus D. Assessing subsets of analytes in context of detecting laboratory errors. Annu Int Conf IEEE Eng Med Biol Soc 2017; 2016:5793-5796. [PMID: 28269571 DOI: 10.1109/embc.2016.7592044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Laboratory error detection is a hard task yet plays an important role in efficient care of patients. Quality controls are inadequate in detecting pre-analytic errors and are not frequent enough. Hence population- and patient-based detectors are developed. However, it is not clear what set of analytes leads to the most efficient error detectors. Here, we use three different scoring functions that can be used in detecting errors, to rank a set of analytes in terms of their strength in distinguishing erroneous measurements. We also observe that using evaluations of larger subsets of analytes in our analysis does not necessarily lead to a more accurate error detector. In our data set obtained from renal kidney disease inpatients, calcium, potassium, and sodium emerged as the top-3 indicators of an erroneous measurement. Using the joint likelihood of these three analytes, we obtain an estimated AUC of 0.73 in error detection.
|
15
|
Ross JC, Castaldi PJ, Cho MH, Chen J, Chang Y, Dy JG, Silverman EK, Washko GR, Jose Estepar RS. A Bayesian Nonparametric Model for Disease Subtyping: Application to Emphysema Phenotypes. IEEE Trans Med Imaging 2017; 36:343-354. [PMID: 28060702 PMCID: PMC5267575 DOI: 10.1109/tmi.2016.2608782] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 06/04/2023]
Abstract
We introduce a novel Bayesian nonparametric model that uses the concept of disease trajectories for disease subtype identification. Although our model is general, we demonstrate that by treating fractions of tissue patterns derived from medical images as compositional data, our model can be applied to study distinct progression trends between population subgroups. Specifically, we apply our algorithm to quantitative emphysema measurements obtained from chest CT scans in the COPDGene Study and show several distinct progression patterns. As emphysema is one of the major components of chronic obstructive pulmonary disease (COPD), the third leading cause of death in the United States [1], an improved definition of emphysema and COPD subtypes is of great interest. We investigate several models with our algorithm, and show that one with age, pack years (a measure of cigarette exposure), and smoking status as predictors gives the best compromise between estimated predictive performance and model complexity. This model identified nine subtypes which showed significant associations to seven single nucleotide polymorphisms (SNPs) known to associate with COPD. Additionally, this model gives better predictive accuracy than multiple, multivariate ordinary least squares regression as demonstrated in a five-fold cross validation analysis. We view our subtyping algorithm as a contribution that can be applied to bridge the gap between CT-level assessment of tissue composition to population-level analysis of compositional trends that vary between disease subtypes.
|
16
|
Ghanta S, Jordan MI, Kose K, Brooks DH, Rajadhyaksha M, Dy JG. A Marked Poisson Process Driven Latent Shape Model for 3D Segmentation of Reflectance Confocal Microscopy Image Stacks of Human Skin. IEEE Trans Image Process 2017; 26:172-184. [PMID: 27723590 PMCID: PMC5258843 DOI: 10.1109/tip.2016.2615291] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 05/28/2023]
Abstract
Segmenting objects of interest from 3D data sets is a common problem encountered in biological data. Small field of view and intrinsic biological variability combined with optically subtle changes of intensity, resolution, and low contrast in images make the task of segmentation difficult, especially for microscopy of unstained living or freshly excised thick tissues. Incorporating shape information in addition to the appearance of the object of interest can often help improve segmentation performance. However, the shapes of objects in tissue can be highly variable and design of a flexible shape model that encompasses these variations is challenging. To address such complex segmentation problems, we propose a unified probabilistic framework that can incorporate the uncertainty associated with complex shapes, variable appearance, and unknown locations. The driving application that inspired the development of this framework is a biologically important segmentation problem: the task of automatically detecting and segmenting the dermal-epidermal junction (DEJ) in 3D reflectance confocal microscopy (RCM) images of human skin. RCM imaging allows noninvasive observation of cellular, nuclear, and morphological detail. The DEJ is an important morphological feature as it is where disorder, disease, and cancer usually start. Detecting the DEJ is challenging, because it is a 2D surface in a 3D volume which has strong but highly variable number of irregularly spaced and variably shaped "peaks and valleys." In addition, RCM imaging resolution, contrast, and intensity vary with depth. Thus, a prior model needs to incorporate the intrinsic structure while allowing variability in essentially all its parameters. We propose a model which can incorporate objects of interest with complex shapes and variable appearance in an unsupervised setting by utilizing domain knowledge to build appropriate priors of the model. 
Our novel strategy to model this structure combines a spatial Poisson process with shape priors and performs inference using Gibbs sampling. Experimental results show that the proposed unsupervised model is able to automatically detect the DEJ with physiologically relevant accuracy in the range 10-20 μm.
|
17
|
Bozkurt A, Kose K, Alessi-Fox C, Dy JG, Brooks DH, Rajadhyaksha M. Unsupervised delineation of stratum corneum using reflectance confocal microscopy and spectral clustering. Skin Res Technol 2016; 23:176-185. [PMID: 27516408 DOI: 10.1111/srt.12316] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Accepted: 06/21/2016] [Indexed: 11/30/2022]
Abstract
BACKGROUND Measuring the thickness of the stratum corneum (SC) in vivo is often required in pharmacological, dermatological, and cosmetological studies. Reflectance confocal microscopy (RCM) offers a non-invasive imaging-based approach. However, RCM-based measurements currently rely on purely visual analysis of images, which is time-consuming and suffers from inter-user subjectivity. METHODS We developed an unsupervised segmentation algorithm that can automatically delineate the SC layer in stacks of RCM images of human skin. We represent the unique textural appearance of SC layer using complex wavelet transform and distinguish it from deeper granular layers of skin using spectral clustering. Moreover, through localized processing in a matrix of small areas (called 'tiles'), we obtain lateral variation of SC thickness over the entire field of view. RESULTS On a set of 15 RCM stacks of normal human skin, our method estimated SC thickness with a mean error of 5.4 ± 5.1 μm compared to the 'ground truth' segmentation obtained from a clinical expert. CONCLUSION Our algorithm provides a non-invasive RCM imaging-based solution which is automated, rapid, objective, and repeatable.
Affiliation(s)
- A Bozkurt
  - Electrical and Computer Engineering Department, Northeastern University, Boston, MA, USA
- K Kose
  - Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- C Alessi-Fox
  - Caliber Imaging and Diagnostics, Rochester, NY, USA
- J G Dy
  - Electrical and Computer Engineering Department, Northeastern University, Boston, MA, USA
- D H Brooks
  - Electrical and Computer Engineering Department, Northeastern University, Boston, MA, USA
- M Rajadhyaksha
  - Dermatology Service, Memorial Sloan Kettering Cancer Center, New York, NY, USA
|
18
|
Lee JH, Cho MH, McDonald MLN, Hersh CP, Castaldi PJ, Crapo JD, Wan ES, Dy JG, Chang Y, Regan EA, Hardin M, DeMeo DL, Silverman EK. Phenotypic and genetic heterogeneity among subjects with mild airflow obstruction in COPDGene. Respir Med 2014; 108:1469-80. [PMID: 25154699 DOI: 10.1016/j.rmed.2014.07.018] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 03/01/2014] [Revised: 07/29/2014] [Accepted: 07/31/2014] [Indexed: 01/21/2023]
Abstract
BACKGROUND Chronic obstructive pulmonary disease (COPD) is characterized by marked phenotypic heterogeneity. Most previous studies have focused on COPD subjects with FEV1 < 80% predicted. We investigated the clinical and genetic heterogeneity in subjects with mild airflow limitation in spirometry grade 1 defined by the Global Initiative for chronic Obstructive Lung Disease (GOLD 1). METHODS Data from current and former smokers participating in the COPDGene Study (NCT00608764) were analyzed. K-means clustering was performed to explore subtypes within 794 GOLD 1 subjects. For all subjects with GOLD 1 and with each cluster, a genome-wide association study and candidate gene testing were performed using smokers with normal lung function as a control group. Combinations of COPD genome-wide significant single nucleotide polymorphisms (SNPs) were tested for association with FEV1 (% predicted) in GOLD 1 and in a combined group of GOLD 1 and smoking control subjects. RESULTS K-means clustering of GOLD 1 subjects identified putative "near-normal", "airway-predominant", "emphysema-predominant" and "lowest FEV1% predicted" subtypes. In non-Hispanic whites, the only SNP nominally associated with GOLD 1 status relative to smoking controls was rs7671167 (FAM13A) in logistic regression models with adjustment for age, sex, pack-years of smoking, and genetic ancestry. The emphysema-predominant GOLD 1 cluster was nominally associated with rs7671167 (FAM13A) and rs161976 (BICD1). The lowest FEV1% predicted cluster was nominally associated with rs1980057 (HHIP) and rs1051730 (CHRNA3). Combinations of COPD genome-wide significant SNPs were associated with FEV1 (% predicted) in a combined group of GOLD 1 and smoking control subjects. CONCLUSIONS Our results indicate that GOLD 1 subjects show substantial clinical heterogeneity, which is at least partially related to genetic heterogeneity.
Affiliation(s)
- Jin Hwa Lee
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Department of Internal Medicine, School of Medicine, Ewha Womans University, Seoul, Republic of Korea
- Michael H Cho
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Merry-Lynn N McDonald
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Craig P Hersh
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Peter J Castaldi
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- James D Crapo
  - National Jewish Health and University of Colorado Denver, Denver, CO, USA
- Emily S Wan
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Jennifer G Dy
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA
- Yale Chang
  - Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA
- Elizabeth A Regan
  - National Jewish Health and University of Colorado Denver, Denver, CO, USA
- Megan Hardin
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Dawn L DeMeo
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Edwin K Silverman
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
|
19
|
Sourati J, Erdogmus D, Dy JG, Brooks DH. Accelerated learning-based interactive image segmentation using pairwise constraints. IEEE Trans Image Process 2014; 23:3057-3070. [PMID: 24860031 PMCID: PMC4096329 DOI: 10.1109/tip.2014.2325783] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/03/2023]
Abstract
Algorithms for fully automatic segmentation of images are often not sufficiently generic with suitable accuracy, and fully manual segmentation is not practical in many settings. There is a need for semiautomatic algorithms, which are capable of interacting with the user and taking into account the collected feedback. Typically, such methods have simply incorporated user feedback directly. Here, we employ active learning of optimal queries to guide user interaction. Our work in this paper is based on constrained spectral clustering that iteratively incorporates user feedback by propagating it through the calculated affinities. The original framework does not scale well to large data sets, and hence is not straightforward to apply to interactive image segmentation. In order to address this issue, we adopt advanced numerical methods for eigen-decomposition implemented over a subsampling scheme. Our key innovation, however, is an active learning strategy that chooses pairwise queries to present to the user in order to increase the rate of learning from the feedback. Performance evaluation is carried out on the Berkeley segmentation and Graz-02 image data sets, confirming that convergence to high accuracy levels is realizable in relatively few iterations.
20
Abstract
Complex data can be grouped and interpreted in many different ways. Most existing clustering algorithms, however, only find one clustering solution, and provide little guidance to data analysts who may not be satisfied with that single clustering and may wish to explore alternatives. We introduce a novel approach that provides several clustering solutions to the user for the purposes of exploratory data analysis. Our approach additionally captures the notion that alternative clusterings may reside in different subspaces (or views). We present an algorithm that simultaneously finds these subspaces and the corresponding clusterings. The algorithm is based on an optimization procedure that incorporates terms for cluster quality and novelty relative to previously discovered clustering solutions. We present a range of experiments that compare our approach to alternatives and explore the connections between simultaneous and iterative modes of discovery of multiple clusterings.
21
Fan J, Dy JG, Chang CC, Zhou X. Identification of SNP-containing regulatory motifs in the myelodysplastic syndromes model using SNP arrays and gene expression arrays. Chin J Cancer 2013; 32:170-85. [PMID: 23327800 PMCID: PMC3845573 DOI: 10.5732/cjc.012.10113] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 01/19/2023]
Abstract
Myelodysplastic syndromes have increased in frequency and incidence in the American population, but patient prognosis has not significantly improved over the last decade. Such improvements could be realized if biomarkers for accurate diagnosis and prognostic stratification were successfully identified. In this study, we propose a method that associates two state-of-the-art array technologies, single nucleotide polymorphism (SNP) array and gene expression array, with gene motifs considered transcription factor-binding sites (TFBS). We are particularly interested in SNP-containing motifs introduced by genetic variation and mutation as TFBS. SNP-containing motifs exert their potential regulatory effect only when certain mutations occur. These motifs can be identified from a group of co-expressed genes with copy number variation. We then used a sliding window to identify motif candidates near SNPs on gene sequences. The candidates were filtered by coarse thresholding and fine statistical testing. Using the regression-based LARS-EN algorithm and a level-wise sequence combination procedure, we identified 28 SNP-containing motifs as candidate TFBS. We confirmed 21 of the 28 motifs with ChIP-chip fragments in the TRANSFAC database. Another six motifs were validated by TRANSFAC via searching binding fragments on co-regulated genes. The identified motifs and their location genes can be considered potential biomarkers for myelodysplastic syndromes. Thus, our proposed method, a novel strategy for associating two data categories, is capable of integrating information from different sources to identify reliable candidate regulatory SNP-containing motifs introduced by genetic variation and mutation.
Affiliation(s)
- Jing Fan
- Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, USA.
22
Sourati J, Brooks DH, Dy JG, Erdogmus D. Constrained spectral clustering for image segmentation. IEEE Int Workshop Mach Learn Signal Process 2012; 2013:1-6. [PMID: 24466500 PMCID: PMC3898593 DOI: 10.1109/mlsp.2012.6349765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/03/2023]
Abstract
Constrained spectral clustering with affinity propagation in its original form is not practical for large-scale problems like image segmentation. In this paper we employ a novelty selection sub-sampling strategy, together with efficient numerical eigen-decomposition methods, to make this algorithm work efficiently for images. In addition, entropy-based active learning is employed to select the queries posed to the user more wisely in an interactive image segmentation framework. We evaluate the algorithm on general and medical images to show that constrained clustering improves the segmentation results even if one works with a subset of pixels. Furthermore, the improvement is obtained more efficiently when the pixels to be labeled are selected actively.
Affiliation(s)
- Jamshid Sourati
- Electrical and Computer Engineering Department, Northeastern University, Boston, MA
- Dana H Brooks
- Electrical and Computer Engineering Department, Northeastern University, Boston, MA
- Jennifer G Dy
- Electrical and Computer Engineering Department, Northeastern University, Boston, MA
- Deniz Erdogmus
- Electrical and Computer Engineering Department, Northeastern University, Boston, MA
23
Tan H, Fan J, Bao J, Dy JG, Zhou X. A computational model for compressed sensing RNAi cellular screening. BMC Bioinformatics 2012; 13:337. [PMID: 23270311 PMCID: PMC3544734 DOI: 10.1186/1471-2105-13-337] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/25/2012] [Accepted: 12/15/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND: RNA interference (RNAi) has become an increasingly important and effective genetic tool for studying the function of target genes by suppressing specific genes of interest. This system approach helps identify signaling pathways and cellular phase types by tracking intensity and/or morphological changes of cells. The traditional RNAi screening scheme, in which one siRNA is designed to knock down one specific mRNA target, needs a large library of siRNAs and is time-consuming and expensive.
RESULTS: In this paper, we propose a conceptual model, called compressed sensing RNAi (csRNAi), which employs a unique combination of a group of small interfering RNAs (siRNAs) to knock down a much larger number of genes. This strategy is based on the fact that one gene can be partially bound by several siRNAs and, conversely, one siRNA can bind to a few genes with distinct binding affinities. This model constructs a many-to-many correspondence between siRNAs and their targets, with far fewer siRNAs than mRNA targets compared with the conventional scheme. Mathematically, this problem involves an underdetermined system of equations (linear or nonlinear), which is ill-posed in general. However, the recently developed compressed sensing (CS) theory can solve this problem. We present a mathematical model to describe the csRNAi system based on both CS theory and biological concerns. To build this model, we first search for nucleotide motifs in a target gene set. Then we propose a machine learning based method to find the effective siRNAs with novel features, such as image features and speech features, to describe an siRNA sequence. Numerical simulations show that we can reduce the siRNA library to one third of that in the conventional scheme. In addition, the features used to describe siRNAs substantially outperform the existing ones.
CONCLUSIONS: This csRNAi system is very promising for saving both time and cost in large-scale RNAi screening experiments, which may benefit biological research on cellular processes and pathways.
Affiliation(s)
- Hua Tan
- Department of Radiology, The Methodist Hospital Research Institute, Weill Medical College of Cornell University, Houston, TX 77030, USA
24
Fan J, Xia X, Li Y, Dy JG, Wong STC. A quantitative analytic pipeline for evaluating neuronal activities by high-throughput synaptic vesicle imaging. Neuroimage 2012; 62:2040-54. [PMID: 22732566 PMCID: PMC3437259 DOI: 10.1016/j.neuroimage.2012.06.020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/27/2012] [Accepted: 06/12/2012] [Indexed: 11/26/2022]
Abstract
Synaptic vesicle dynamics play an important role in the study of neuronal and synaptic activities of neurodegenerative diseases ranging from the epidemic Alzheimer's disease to the rare Rett syndrome. A high-throughput assay with a large population of neurons would be useful and efficient to characterize neuronal activity based on the dynamics of synaptic vesicles for the study of mechanisms or to discover drug candidates for neurodegenerative and neurodevelopmental disorders. However, the massive amounts of image data generated via high-throughput screening require enormous manual processing time and effort, restricting the practical use of such an assay. This paper presents an automated analytic system to process and interpret the huge data set generated by such assays. Our system enables the automated detection, segmentation, quantification, and measurement of neuron activities based on the synaptic vesicle assay. To overcome challenges such as noisy background, inhomogeneity, and tiny object size, we first employ MSVST (Multi-Scale Variance Stabilizing Transform) to obtain a denoised and enhanced map of the original image data. Then, we propose an adaptive thresholding strategy to solve the inhomogeneity issue, based on the local information, and to accurately segment synaptic vesicles. We design algorithms to address the issue of tiny objects of interest overlapping. Several post-processing criteria are defined to filter false positives. A total of 152 features are extracted for each detected vesicle. A score is defined for each synaptic vesicle image to quantify the neuron activity. We also compare the unsupervised strategy with the supervised method. Our experiments on hippocampal neuron assays showed that the proposed system can automatically detect vesicles and quantify their dynamics for evaluating neuron activities. The availability of such an automated system will open opportunities for investigation of synaptic neuropathology and identification of candidate therapeutics for neurodegeneration.
Affiliation(s)
- Jing Fan
- The Ting Tsung and Wei Fong Chao Center for Bioinformatics Research and Imaging for Neurosciences, The Methodist Hospital Research Institute, Weill Cornell Medical College, Houston, TX 77030, USA
25
Kurugol S, Bas E, Erdogmus D, Dy JG, Sharp GC, Brooks DH. Centerline extraction with principal curve tracing to improve 3D level set esophagus segmentation in CT images. Annu Int Conf IEEE Eng Med Biol Soc 2012; 2011:3403-6. [PMID: 22255070 DOI: 10.1109/iembs.2011.6090921] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
For radiotherapy planning, contouring of target volume and healthy structures at risk in CT volumes is essential. To automate this process, one of the available segmentation techniques can be used for many thoracic organs except the esophagus, which is very hard to segment due to low contrast. In this work we propose to initialize our previously introduced model based 3D level set esophagus segmentation method with a principal curve tracing (PCT) algorithm, which we adapted to solve the esophagus centerline detection problem. To address challenges due to low intensity contrast, we enhanced the PCT algorithm by learning spatial and intensity priors from a small set of annotated CT volumes. To locate the esophageal wall, the model based 3D level set algorithm including a shape model that represents the variance of esophagus wall around the estimated centerline is utilized. Our results show improvement in esophagus segmentation when initialized by PCT compared to our previous work, where an ad hoc centerline initialization was performed. Unlike previous approaches, this work does not need a very large set of annotated training images and has similar performance.
Affiliation(s)
- Sila Kurugol
- Dept of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA
26
Kurugol S, Rajadhyaksha M, Dy JG, Brooks DH. Validation Study of Automated Dermal/Epidermal Junction Localization Algorithm in Reflectance Confocal Microscopy Images of Skin. Proc SPIE Int Soc Opt Eng 2012; 8207. [PMID: 24376908 DOI: 10.1117/12.909227] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022]
Abstract
Reflectance confocal microscopy (RCM) has seen increasing clinical application for noninvasive diagnosis of skin cancer. Identifying the location of the dermal-epidermal junction (DEJ) in the image stacks is key for effective clinical imaging. For example, one clinical imaging procedure acquires a dense stack of 0.5×0.5mm FOV images and then, after manual determination of DEJ depth, collects a 5×5mm mosaic at that depth for diagnosis. However, especially in lightly pigmented skin, RCM images have low contrast at the DEJ which makes repeatable, objective visual identification challenging. We have previously published proof of concept for an automated algorithm for DEJ detection in both highly- and lightly-pigmented skin types based on sequential feature segmentation and classification. In lightly-pigmented skin the change of skin texture with depth was detected by the algorithm and used to locate the DEJ. Here we report on further validation of our algorithm on a more extensive collection of 24 image stacks (15 fair skin, 9 dark skin). We compare algorithm performance against classification by three clinical experts. We also evaluate inter-expert consistency among the experts. The average correlation across experts was 0.81 for lightly pigmented skin, indicating the difficulty of the problem. The algorithm achieved epidermis/dermis misclassification rates smaller than 10% (based on 25×25 mm tiles) and average distance from the expert labeled boundaries of ~6.4 μm for fair skin and ~5.3 μm for dark skin, well within average cell size and less than 2x the instrument resolution in the optical axis.
Affiliation(s)
- Sila Kurugol
- Electrical and Comp. Eng., Northeastern University, 360 Huntington Av., Boston, MA
- Milind Rajadhyaksha
- Dermatology Service, Memorial Sloan Kettering Cancer Cnt., 160 East 53 St., New York, NY
- Jennifer G Dy
- Electrical and Comp. Eng., Northeastern University, 360 Huntington Av., Boston, MA
- Dana H Brooks
- Electrical and Comp. Eng., Northeastern University, 360 Huntington Av., Boston, MA
27
Kurugol S, Dy JG, Rajadhyaksha M, Gossage KW, Weissman J, Brooks DH. Semi-automated Algorithm for Localization of Dermal/Epidermal Junction in Reflectance Confocal Microscopy Images of Human Skin. Proc SPIE Int Soc Opt Eng 2011; 7904:7901A. [PMID: 21709746 DOI: 10.1117/12.875392] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022]
Abstract
The examination of the dermis/epidermis junction (DEJ) is clinically important for skin cancer diagnosis. Reflectance confocal microscopy (RCM) is an emerging tool for detection of skin cancers in vivo. However, visual localization of the DEJ in RCM images, with high accuracy and repeatability, is challenging, especially in fair skin, due to low contrast, heterogeneous structure and high inter- and intra-subject variability. We recently proposed a semi-automated algorithm to localize the DEJ in z-stacks of RCM images of fair skin, based on feature segmentation and classification. Here we extend the algorithm to dark skin. The extended algorithm first decides the skin type and then applies the appropriate DEJ localization method. In dark skin, strong backscatter from the pigment melanin causes the basal cells above the DEJ to appear with high contrast. To locate those high contrast regions, the algorithm operates on small tiles (regions) and finds the peaks of the smoothed average intensity depth profile of each tile. However, for some tiles, due to heterogeneity, multiple peaks in the depth profile exist and the strongest peak might not be the basal layer peak. To select the correct peak, basal cells are represented with a vector of texture features. The peak with most similar features to this feature vector is selected. The results show that the algorithm detected the skin types correctly for all 17 stacks tested (8 fair, 9 dark). The DEJ detection algorithm achieved an average distance from the ground truth DEJ surface of around 4.7μm for dark skin and around 7-14μm for fair skin.
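The tile-wise peak search described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the moving-average smoother, the window size, the Euclidean distance between feature vectors, the function names (`smooth_profile`, `find_peaks_1d`, `select_basal_peak`) and all numeric values are hypothetical.

```python
import numpy as np

def smooth_profile(profile, window=3):
    """Moving-average smoothing of a 1D depth profile (assumed smoother)."""
    kernel = np.ones(window) / window
    return np.convolve(profile, kernel, mode="same")

def find_peaks_1d(profile):
    """Indices of strict local maxima, excluding the two end points."""
    p = np.asarray(profile, dtype=float)
    interior = np.arange(1, len(p) - 1)
    is_peak = (p[interior] > p[interior - 1]) & (p[interior] > p[interior + 1])
    return interior[is_peak]

def select_basal_peak(peaks, peak_features, reference_features):
    """Pick the peak whose feature vector is closest to the reference
    (Euclidean distance is an assumption made for this sketch)."""
    distances = np.linalg.norm(peak_features - reference_features, axis=1)
    return int(peaks[np.argmin(distances)])

# Hypothetical tile: mean intensity at each depth slice of the z-stack.
depth_profile = np.array([1.0, 2.0, 6.0, 2.0, 1.0, 3.0, 9.0, 3.0, 1.0])
peaks = find_peaks_1d(smooth_profile(depth_profile))

# Hypothetical texture features, one row per candidate peak, plus a
# hypothetical reference vector standing in for the basal-cell texture.
peak_features = np.array([[0.9, 0.1], [0.2, 0.8]])
reference = np.array([1.0, 0.0])
chosen = select_basal_peak(peaks, peak_features, reference)
```

In this toy profile the strongest peak is the deeper one, yet the shallower peak is chosen because its features are closer to the reference, which mirrors the case the abstract raises where the strongest peak is not the basal layer peak.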
Affiliation(s)
- Sila Kurugol
- Electrical and Comp. Eng., Northeastern University, 360 Huntington Av., Boston, MA
28
Abstract
Traditional clustering focuses on finding a single best clustering solution from data. However, a single data set can be interpreted in different ways. This is particularly true of the complex data that has become prevalent in the data mining community: text, video, images and biological data, to name a few. It is thus of practical interest to find all possible alternative and interesting clustering solutions from data. Recently there has been increasing interest in developing algorithms to discover multiple clustering solutions from complex data. This report provides a description of the first international workshop on this emerging topic, SIGKDD MultiClust10: Discovering, Summarizing and Using Multiple Clusterings, which was held in Washington DC on July 25th, 2010. The workshop program consisted of three invited talks and presentations of four full research papers and three short papers.
29
Kurugol S, Dy JG, Brooks DH, Rajadhyaksha M. Pilot study of semiautomated localization of the dermal/epidermal junction in reflectance confocal microscopy images of skin. J Biomed Opt 2011; 16:036005. [PMID: 21456869 PMCID: PMC3077965 DOI: 10.1117/1.3549740] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 07/01/2010] [Revised: 01/07/2011] [Accepted: 01/10/2011] [Indexed: 05/21/2023]
Abstract
Reflectance confocal microscopy (RCM) continues to be translated toward the detection of skin cancers in vivo. Automated image analysis may help clinicians and accelerate clinical acceptance of RCM. For screening and diagnosis of cancer, the dermal/epidermal junction (DEJ), at which melanomas and basal cell carcinomas originate, is an important feature in skin. In RCM images, the DEJ is marked by optically subtle changes and features and is difficult to detect purely by visual examination. Challenges for automation of DEJ detection include heterogeneity of skin tissue, high inter-, intra-subject variability, and low optical contrast. To cope with these challenges, we propose a semiautomated hybrid sequence segmentation/classification algorithm that partitions z-stacks of tiles into homogeneous segments by fitting a model of skin layer dynamics and then classifies tile segments as epidermis, dermis, or transitional DEJ region using texture features. We evaluate two different training scenarios: 1. training and testing on portions of the same stack; 2. training on one labeled stack and testing on one from a different subject with similar skin type. Initial results demonstrate the detectability of the DEJ in both scenarios with epidermis/dermis misclassification rates smaller than 10% and average distance from the expert labeled boundaries around 8.5 μm.
Affiliation(s)
- Sila Kurugol
- Northeastern University, Electrical and Computer Engineering, Boston, Massachusetts 02115, USA.
30
Kurugol S, Ozay N, Dy JG, Sharp GC, Brooks DH. Locally Deformable Shape Model to Improve 3D Level Set based Esophagus Segmentation. Proc IAPR Int Conf Pattern Recogn 2010:3955-3958. [PMID: 21731883 DOI: 10.1109/icpr.2010.962] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/07/2022]
Abstract
In this paper we propose a supervised 3D segmentation algorithm to locate the esophagus in thoracic CT scans using a variational framework. To address challenges due to low contrast, several priors are learned from a training set of segmented images. Our algorithm first estimates the centerline based on a spatial model learned at a few manually marked anatomical reference points. Then an implicit shape model is learned by subtracting the centerline and applying PCA to these shapes. To allow local variations in the shapes, we propose to use nonlinear smooth local deformations. Finally, the esophageal wall is located within a 3D level set framework by optimizing a cost function including terms for appearance, the shape model, smoothness constraints and an air/contrast model.
Affiliation(s)
- Sila Kurugol
- Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA
31
Fan J, Zhou X, Dy JG, Zhang Y, Wong STC. An automated pipeline for dendrite spine detection and tracking of 3D optical microscopy neuron images of in vivo mouse models. Neuroinformatics 2009; 7:113-30. [PMID: 19434521 DOI: 10.1007/s12021-009-9047-0] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Received: 04/01/2008] [Accepted: 04/01/2009] [Indexed: 11/24/2022]
Abstract
The variations in dendritic branch morphology and spine density provide insightful information about the brain function and possible treatment to neurodegenerative disease, for example investigating structural plasticity during the course of Alzheimer's disease. Most automated image processing methods aiming at analyzing these problems are developed for in vitro data. However, in vivo neuron images provide real time information and direct observation of the dynamics of a disease process in a live animal model. This paper presents an automated approach for detecting spines and tracking spine evolution over time with in vivo image data in an animal model of Alzheimer's disease. We propose an automated pipeline starting with curvilinear structure detection to determine the medial axis of the dendritic backbone and spines connected to the backbone. We, then, propose the adaptive local binary fitting (aLBF) energy level set model to accurately locate the boundary of dendritic structures using the central line of curvilinear structure as initialization. To track the growth or loss of spines, we present a maximum likelihood based technique to find the graph homomorphism between two image graph structures at different time points. We employ dynamic programming to search for the optimum solution. The pipeline enables us to extract dynamically changing information from real time in vivo data. We validate our proposed approach by comparing with manual results generated by neurologists. In addition, we discuss the performance of 3D based segmentation and conclude that our method is more accurate in identifying weak spines. Experiments show that our approach can quickly and accurately detect and quantify spines of in vivo neuron images and is able to identify spine elimination and formation.
Affiliation(s)
- Jing Fan
- Center for Biotechnology and Informatics, Department of Radiology, The Methodist Hospital Research Institute & The Methodist Hospital, Weill Cornell Medical College, Houston, TX 77030, USA
32
Abstract
In lung cancer radiotherapy, radiation to a mobile target can be delivered by respiratory gating, for which we need to know whether the target is inside or outside a predefined gating window at any time point during the treatment. This can be achieved by tracking one or more fiducial markers implanted inside or near the target, either fluoroscopically or electromagnetically. However, the clinical implementation of marker tracking is limited for lung cancer radiotherapy mainly due to the risk of pneumothorax. Therefore, gating without implanted fiducial markers is a promising clinical direction. We have developed several template-matching methods for fluoroscopic marker-less gating. Recently, we have modeled the gating problem as a binary pattern classification problem, in which principal component analysis (PCA) and support vector machine (SVM) are combined to perform the classification task. Following the same framework, we investigated different combinations of dimensionality reduction techniques (PCA and four nonlinear manifold learning methods) and two machine learning classification methods (artificial neural networks-ANN and SVM). Performance was evaluated on ten fluoroscopic image sequences of nine lung cancer patients. We found that among all combinations of dimensionality reduction techniques and classification methods, PCA combined with either ANN or SVM achieved a better performance than the other nonlinear manifold learning methods. ANN when combined with PCA achieves a better performance than SVM in terms of classification accuracy and recall rate, although the target coverage is similar for the two classification methods. Furthermore, the running time for both ANN and SVM with PCA is within tolerance for real-time applications. Overall, ANN combined with PCA is a better candidate than other combinations we investigated in this work for real-time gated radiotherapy.
Affiliation(s)
- Tong Lin
- Department of Radiation Oncology, University of California San Diego, La Jolla, CA 92093, USA
33
Cui Y, Dy JG, Alexander B, Jiang SB. Fluoroscopic gating without implanted fiducial markers for lung cancer radiotherapy based on support vector machines. Phys Med Biol 2008; 53:N315-27. [DOI: 10.1088/0031-9155/53/16/n01] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Indexed: 11/12/2022]
34
Azmandian F, Kaeli D, Dy JG, Hutchinson E, Ancukiewicz M, Niemierko A, Jiang SB. Towards the development of an error checker for radiotherapy treatment plans: a preliminary study. Phys Med Biol 2007; 52:6511-24. [DOI: 10.1088/0031-9155/52/21/012] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Indexed: 11/11/2022]
35
Cui Y, Dy JG, Sharp GC, Alexander B, Jiang SB. Multiple template-based fluoroscopic tracking of lung tumor mass without implanted fiducial markers. Phys Med Biol 2007; 52:6229-42. [DOI: 10.1088/0031-9155/52/20/010] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.6] [Indexed: 11/11/2022]
36
Affiliation(s)
- Ting Su
- Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, USA
- Jennifer G. Dy
- Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, USA
37
Abstract
For gated lung cancer radiotherapy, it is difficult to generate accurate gating signals due to the large uncertainties when using external surrogates and the risk of pneumothorax when using implanted fiducial markers. We have previously investigated and demonstrated the feasibility of generating gating signals using the correlation scores between the reference template image and the fluoroscopic images acquired during the treatment. In this paper, we present an in-depth study, aiming at the improvement of robustness of the algorithm and its validation using multiple sets of patient data. Three different template generating and matching methods have been developed and evaluated: (1) single template method, (2) multiple template method, and (3) template clustering method. Using the fluoroscopic data acquired during patient setup before each fraction of treatment, reference templates are built that represent the tumour position and shape in the gating window, which is assumed to be at the end-of-exhale phase. For the single template method, all the setup images within the gating window are averaged to generate a composite template. For the multiple template method, each setup image in the gating window is considered as a reference template and used to generate an ensemble of correlation scores. All the scores are then combined to generate the gating signal. For the template clustering method, clustering (grouping of similar objects together) is performed to reduce the large number of reference templates into a few representative ones. Each of these methods has been evaluated against the reference gating signal as manually determined by a radiation oncologist. Five patient datasets were used for evaluation. In each case, gated treatments were simulated at both 35% and 50% duty cycles. False positive, negative and total error rates were computed. Experiments show that the single template method is sensitive to noise; the multiple template and clustering methods are more robust to noise due to the smoothing effect of aggregation of correlation scores; and the clustering method results in the best performance in terms of computational efficiency and accuracy.
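The single template and multiple template methods described in this abstract can be sketched as below. This is a minimal illustration under stated assumptions rather than the authors' algorithm: the abstract does not say how the correlation score is normalized, how the ensemble of scores is combined, or how the gating threshold is set, so the normalized cross-correlation, the mean-based combination, the 0.5 threshold, the function names and the synthetic images are all hypothetical. The template clustering method is not shown.

```python
import numpy as np

def correlation_score(template, frame):
    """Normalized cross-correlation between two equally sized images."""
    t = template - template.mean()
    f = frame - frame.mean()
    denom = np.sqrt((t * t).sum() * (f * f).sum())
    return float((t * f).sum() / denom) if denom > 0 else 0.0

def single_template_scores(templates, frames):
    """Average the setup images into one composite template, then score each frame."""
    composite = np.mean(templates, axis=0)
    return np.array([correlation_score(composite, f) for f in frames])

def multiple_template_scores(templates, frames):
    """Score each frame against every template; combine by averaging (assumed rule)."""
    return np.array([
        np.mean([correlation_score(t, f) for t in templates]) for f in frames
    ])

def gating_signal(scores, threshold=0.5):
    """Beam-on (True) when the combined score exceeds an assumed threshold."""
    return scores > threshold

# Hypothetical data: three noisy setup images of the same target, and two
# treatment frames (one matching the target, one anti-correlated with it).
rng = np.random.default_rng(0)
target = rng.random((8, 8))
templates = [target + 0.01 * rng.random((8, 8)) for _ in range(3)]
frames = [target, 1.0 - target]

signal_single = gating_signal(single_template_scores(templates, frames))
signal_multi = gating_signal(multiple_template_scores(templates, frames))
```

Averaging several per-template scores is one simple way to obtain the smoothing effect of aggregation that the abstract credits for the robustness of the multiple template method; other combination rules would fit the same structure.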
Affiliation(s)
- Ying Cui
- Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, USA