1
Xu Z, Lu D, Luo J, Zheng Y, Tong RKY. Separated collaborative learning for semi-supervised prostate segmentation with multi-site heterogeneous unlabeled MRI data. Med Image Anal 2024; 93:103095. PMID: 38310678. DOI: 10.1016/j.media.2024.103095.
Abstract
Segmenting prostate from magnetic resonance imaging (MRI) is a critical procedure in prostate cancer staging and treatment planning. Considering the nature of labeled data scarcity for medical images, semi-supervised learning (SSL) becomes an appealing solution since it can simultaneously exploit limited labeled data and a large amount of unlabeled data. However, SSL relies on the assumption that the unlabeled images are abundant, which may not be satisfied when the local institute has limited image collection capabilities. An intuitive solution is to seek support from other centers to enrich the unlabeled image pool. However, this further introduces data heterogeneity, which can impede SSL that works under identical data distribution with certain model assumptions. Aiming at this under-explored yet valuable scenario, in this work, we propose a separated collaborative learning (SCL) framework for semi-supervised prostate segmentation with multi-site unlabeled MRI data. Specifically, on top of the teacher-student framework, SCL exploits multi-site unlabeled data by: (i) Local learning, which advocates local distribution fitting, including the pseudo label learning that reinforces confirmation of low-entropy easy regions and the cyclic propagated real label learning that leverages class prototypes to regularize the distribution of intra-class features; (ii) External multi-site learning, which aims to robustly mine informative clues from external data, mainly including the local-support category mutual dependence learning, which takes the spirit that mutual information can effectively measure the amount of information shared by two variables even from different domains, and the stability learning under strong adversarial perturbations to enhance robustness to heterogeneity. 
Extensive experiments on prostate MRI data from six different clinical centers show that our method can effectively generalize SSL on multi-site unlabeled data and significantly outperform other semi-supervised segmentation methods. Besides, we validate the extensibility of our method on the multi-class cardiac MRI segmentation task with data from four different clinical centers.
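The pseudo-label step the abstract describes, reinforcing only "low-entropy easy regions," can be sketched as an entropy filter on the teacher model's softmax output. This is an illustrative reconstruction, not the paper's code; the function name and threshold value are assumptions.

```python
import numpy as np

def entropy_mask(probs, threshold=0.2):
    """Keep only pixels whose prediction entropy is below `threshold`.

    probs: (C, H, W) softmax output of the teacher model.
    Returns (pseudo_labels, mask), where mask marks low-entropy "easy" pixels
    that are safe to use as pseudo-label supervision for the student.
    """
    eps = 1e-8
    entropy = -np.sum(probs * np.log(probs + eps), axis=0)  # per-pixel entropy
    mask = entropy < threshold
    pseudo_labels = np.argmax(probs, axis=0)
    return pseudo_labels, mask

# Toy 2-class map with one confident pixel and one uncertain pixel.
probs = np.array([[[0.99, 0.55]],
                  [[0.01, 0.45]]])          # shape (C=2, H=1, W=2)
labels, mask = entropy_mask(probs, threshold=0.2)
print(labels)  # argmax labels for every pixel
print(mask)    # only the confident pixel passes the entropy filter
```

In the full framework this mask would gate the pseudo-label loss, so the student is never penalized on ambiguous boundary regions.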
Affiliation(s)
- Zhe Xu: Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China.
- Donghuan Lu: Tencent Jarvis Research Center, Youtu Lab, Shenzhen, China.
- Jie Luo: Massachusetts General Hospital, Harvard Medical School, Boston, USA.
- Yefeng Zheng: Tencent Jarvis Research Center, Youtu Lab, Shenzhen, China.
- Raymond Kai-Yu Tong: Department of Biomedical Engineering, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China.
2
Thwal CM, Nguyen MNH, Tun YL, Kim ST, Thai MT, Hong CS. OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning. Neural Netw 2024; 170:635-649. PMID: 38100846. DOI: 10.1016/j.neunet.2023.11.044.
Abstract
Federated learning (FL) has emerged as a promising approach to collaboratively train machine learning models across multiple edge devices while preserving privacy. The success of FL hinges on the efficiency of participating models and their ability to handle the unique challenges of distributed learning. While several variants of Vision Transformer (ViT) have shown great potential as alternatives to modern convolutional neural networks (CNNs) for centralized training, their unprecedented size and high computational demands hinder deployment on resource-constrained edge devices, limiting their widespread application in FL. Since client devices in FL typically have limited computing resources and communication bandwidth, models intended for such devices must strike a balance between model size, computational efficiency, and the ability to adapt to the diverse and non-IID data distributions encountered in FL. To address these challenges, we propose OnDev-LCT: Lightweight Convolutional Transformers for On-Device vision tasks with limited training data and resources. Our models incorporate image-specific inductive biases through the LCT tokenizer by leveraging efficient depthwise separable convolutions in residual linear bottleneck blocks to extract local features, while the multi-head self-attention (MHSA) mechanism in the LCT encoder implicitly facilitates capturing global representations of images. Extensive experiments on benchmark image datasets indicate that our models outperform existing lightweight vision models while having fewer parameters and lower computational demands, making them suitable for FL scenarios with data heterogeneity and communication bottlenecks.
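The parameter savings from the depthwise separable convolutions that the LCT tokenizer relies on can be verified with a back-of-the-envelope count. These are the standard textbook formulas, not code from the paper:

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    return c_in * k * k + c_in * c_out

# Example layer: 128 -> 128 channels with a 3 x 3 kernel.
std = conv_params(128, 128, 3)                 # 128*128*9  = 147456
sep = depthwise_separable_params(128, 128, 3)  # 128*9 + 128*128 = 17536
print(std, sep, round(std / sep, 1))           # roughly 8x fewer weights
```

For a 3 x 3 kernel the savings factor approaches 9 as the channel count grows, which is exactly why the design suits resource-constrained FL clients.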
Affiliation(s)
- Chu Myaet Thwal: Department of Computer Science and Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do 17104, South Korea.
- Minh N H Nguyen: Vietnam - Korea University of Information and Communication Technology, Danang, Viet Nam.
- Ye Lin Tun: Department of Computer Science and Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do 17104, South Korea.
- Seong Tae Kim: Department of Computer Science and Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do 17104, South Korea.
- My T Thai: Department of Computer and Information Science and Engineering, University of Florida, Gainesville, Florida 32611, USA.
- Choong Seon Hong: Department of Computer Science and Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do 17104, South Korea.
3
Tun YL, Nguyen MNH, Thwal CM, Choi J, Hong CS. Contrastive encoder pre-training-based clustered federated learning for heterogeneous data. Neural Netw 2023; 165:689-704. PMID: 37385023. DOI: 10.1016/j.neunet.2023.06.010.
Abstract
Federated learning (FL) is a promising approach that enables distributed clients to collaboratively train a global model while preserving their data privacy. However, FL often suffers from data heterogeneity problems, which can significantly affect its performance. To address this, clustered federated learning (CFL) has been proposed to construct personalized models for different client clusters. One effective client clustering strategy is to allow clients to choose their own local models from a model pool based on their performance. However, without pre-trained model parameters, such a strategy is prone to clustering failure, in which all clients choose the same model. Unfortunately, collecting a large amount of labeled data for pre-training can be costly and impractical in distributed environments. To overcome this challenge, we leverage self-supervised contrastive learning to exploit unlabeled data for the pre-training of FL systems. Together, self-supervised pre-training and client clustering can be crucial components for tackling the data heterogeneity issues of FL. Leveraging these two crucial strategies, we propose contrastive pre-training-based clustered federated learning (CP-CFL) to improve the model convergence and overall performance of FL systems. In this work, we demonstrate the effectiveness of CP-CFL through extensive experiments in heterogeneous FL settings, and present various interesting observations.
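The clustering strategy described, where each client chooses its own model from a pool based on local performance, can be sketched in a few lines. The collapse check mirrors the "clustering failure" mode the abstract warns about, in which every client picks the same model; names and shapes are illustrative.

```python
import numpy as np

def assign_clusters(client_losses):
    """Each client picks the pool model with the lowest local evaluation loss.

    client_losses: (n_clients, n_models) array of per-client losses.
    Returns the chosen model index per client, plus a flag for clustering
    failure (every client choosing the same model) -- the situation that
    contrastive pre-training of the pool models helps avoid.
    """
    choices = np.argmin(client_losses, axis=1)
    collapsed = len(set(choices.tolist())) == 1
    return choices, collapsed

# Three clients evaluating a pool of three candidate models.
losses = np.array([[0.9, 0.2, 0.8],
                   [0.3, 0.7, 0.6],
                   [0.8, 0.1, 0.9]])
choices, collapsed = assign_clusters(losses)
print(choices, collapsed)  # clients 0 and 2 form one cluster, client 1 another
```

In the full CP-CFL loop this assignment would be recomputed each round, with model aggregation performed within each cluster rather than globally.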
Affiliation(s)
- Ye Lin Tun: Department of Computer Science and Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do 17104, South Korea.
- Minh N H Nguyen: Vietnam - Korea University of Information and Communication Technology, Danang, Viet Nam.
- Chu Myaet Thwal: Department of Computer Science and Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do 17104, South Korea.
- Jinwoo Choi: Department of Computer Science and Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do 17104, South Korea.
- Choong Seon Hong: Department of Computer Science and Engineering, Kyung Hee University, Yongin-si, Gyeonggi-do 17104, South Korea.
4
Reitemeyer F, Fritz D, Jacobi N, Díaz-Bone L, Mariño Viteri C, Kropp JP. Quantification of urban mitigation potentials - coping with data heterogeneity. Heliyon 2023; 9:e16733. PMID: 37303575. PMCID: PMC10250789. DOI: 10.1016/j.heliyon.2023.e16733.
Abstract
Cities are at the forefront of European and international climate action. However, in many cities, the ever-growing urban population is putting pressure on settlement and infrastructure development, increasing attention to urban planning, infrastructure and buildings. This paper introduces a set of quantification approaches, capturing impacts of urban planning measures in three fields of action: sustainable building, transport and redensification. The quantification approaches have been developed to account for different levels of data availability, thus providing users with quantification approaches that are applicable across cities. The mitigation potentials of various measures such as a modal shift, the substitution of building materials with wood, and different redensification scenarios were calculated. The substitution of conventional building materials with wood was found to have a high mitigation potential. Building construction, in combination with urban planning and design, is a key driver for mitigating climate change in cities. Given the data heterogeneity among cities, mixed quantification approaches could be defined and the measures and policy areas with the greatest climate mitigation potential identified.
Affiliation(s)
- Fabian Reitemeyer: Potsdam Institute for Climate Impact Research (PIK), Member of Leibniz Association, P.O. Box 601203, Potsdam, 14412, Germany.
- David Fritz: Environment Agency Austria, Spittelauer Lände 5, Vienna, 1090, Austria.
- Nikolai Jacobi: ICLEI European Secretariat, Leopoldring 3, Freiburg, 79098, Germany.
- León Díaz-Bone: ICLEI - Local Governments for Sustainability e.V., Kaiser-Friedrich-Str. 7, Bonn, 53113, Germany.
- Carla Mariño Viteri: ICLEI - Local Governments for Sustainability e.V., Kaiser-Friedrich-Str. 7, Bonn, 53113, Germany.
- Juergen P. Kropp: Potsdam Institute for Climate Impact Research (PIK), Member of Leibniz Association, P.O. Box 601203, Potsdam, 14412, Germany; Bauhaus Earth, Dortustraße 46, Potsdam, 14467, Germany; University of Potsdam, Institute of Environmental Science and Geography, Karl-Liebknecht-Str. 24-25, Potsdam, 14476, Germany.
5
Li J, Zhang W, Wang P, Li Q, Zhang K, Liu Y. Nonparametric prediction distribution from resolution-wise regression with heterogeneous data. J Bus Econ Stat 2022; 41:1157-1172. PMID: 38046827. PMCID: PMC10691808. DOI: 10.1080/07350015.2022.2115498.
Abstract
Modeling and inference for heterogeneous data have gained great interest recently due to rapid developments in personalized marketing. Most existing regression approaches are based on the conditional mean and may require additional cluster information to accommodate data heterogeneity. In this paper, we propose a novel nonparametric resolution-wise regression procedure to provide an estimated distribution of the response instead of one single value. We achieve this by decomposing the information of the response and the predictors into resolutions and patterns respectively based on marginal binary expansions. The relationships between resolutions and patterns are modeled by penalized logistic regressions. Combining the resolution-wise prediction, we deliver a histogram of the conditional response to approximate the distribution. Moreover, we show a sure independence screening property and the consistency of the proposed method for growing dimensions. Simulations and a real estate valuation dataset further illustrate the effectiveness of the proposed method.
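The marginal binary expansion underlying the resolution-wise decomposition can be illustrated for a response rescaled to (0, 1). This sketch follows the standard dyadic expansion, not the authors' implementation: bit d answers "is the value in the upper half at resolution d?", so the bits jointly locate the value in one of 2^depth equal-width cells, and each bit can then be modeled separately (in the paper, via penalized logistic regressions).

```python
import numpy as np

def binary_expansion(u, depth=3):
    """Marginal binary expansion of values in (0, 1) into `depth` resolution bits."""
    u = np.asarray(u, dtype=float)
    bits = []
    for _ in range(depth):
        u = 2 * u
        b = (u >= 1).astype(int)   # upper or lower half at this resolution?
        bits.append(b)
        u = u - b                  # keep the fractional remainder for the next bit
    return np.stack(bits, axis=-1)

# 0.3 falls in cell [0.25, 0.375) -> bits 0,1,0; 0.8 in [0.75, 0.875) -> 1,1,0.
print(binary_expansion([0.3, 0.8], depth=3))
```

Combining the per-bit predicted probabilities across resolutions recovers a histogram over the 2^depth cells, which is the paper's approximation to the conditional response distribution.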
Affiliation(s)
- Jialu Li: School of Mathematics and Statistics, Beijing Institute of Technology, Beijing 100081, China.
- Wan Zhang: Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
- Peiyao Wang: Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
- Qizhai Li: LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, and University of Chinese Academy of Sciences, Beijing 100190, China.
- Kai Zhang: Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
- Yufeng Liu: Department of Statistics and Operations Research, Department of Genetics, Department of Biostatistics, Carolina Center for Genome Science, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.
6
Qu L, Balachandar N, Zhang M, Rubin D. Handling data heterogeneity with generative replay in collaborative learning for medical imaging. Med Image Anal 2022; 78:102424. PMID: 35390737. DOI: 10.1016/j.media.2022.102424.
Abstract
Collaborative learning, which enables collaborative and decentralized training of deep neural networks at multiple institutions in a privacy-preserving manner, is rapidly emerging as a valuable technique in healthcare applications. However, its distributed nature often leads to significant heterogeneity in data distributions across institutions. In this paper, we present a novel generative replay strategy to address the challenge of data heterogeneity in collaborative learning methods. Different from traditional methods that directly aggregate the model parameters, we leverage generative adversarial learning to aggregate the knowledge from all the local institutions. Specifically, instead of directly training a model for task performance, we develop a novel dual model architecture: a primary model learns the desired task, and an auxiliary "generative replay model" allows aggregating knowledge from the heterogeneous clients. The auxiliary model is then broadcast to the central server to regulate the training of the primary model with an unbiased target distribution. Experimental results demonstrate the capability of the proposed method in handling heterogeneous data across institutions. On highly heterogeneous data partitions, our model achieves ∼4.88% improvement in the prediction accuracy on a diabetic retinopathy classification dataset, and ∼49.8% reduction in mean absolute error on a Bone Age prediction dataset, respectively, compared to state-of-the-art collaborative learning methods.
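The core replay idea, training the primary model on an unbiased mixture sampled from every client's generator rather than on any one client's skewed data, can be sketched with toy Gaussian "generators" standing in for the trained generative replay models. All names and distributions here are illustrative, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the clients' trained replay generators: each client's
# local data comes from a different (heterogeneous) distribution.
client_generators = [
    lambda n: rng.normal(-2.0, 1.0, size=n),   # client A's distribution
    lambda n: rng.normal(3.0, 0.5, size=n),    # client B's distribution
]

def replay_batch(generators, n_per_client=1000):
    """Server-side training batch: sample equally from every client's
    generator so the primary model sees an unbiased mixture instead of
    one institution's skewed data."""
    return np.concatenate([g(n_per_client) for g in generators])

batch = replay_batch(client_generators)
print(batch.shape, batch.mean())  # mixture mean near the midpoint, 0.5
```

The equal per-client sampling is what gives the "unbiased target distribution" the abstract refers to; directly averaging models trained on such disparate distributions would not.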
7
Chen D, Hosner PA, Dittmann DL, O'Neill JP, Birks SM, Braun EL, Kimball RT. Divergence time estimation of Galliformes based on the best gene shopping scheme of ultraconserved elements. BMC Ecol Evol 2021; 21:209. PMID: 34809586. PMCID: PMC8609756. DOI: 10.1186/s12862-021-01935-1.
Abstract
BACKGROUND Divergence time estimation is fundamental to understanding many aspects of the evolution of organisms, such as character evolution, diversification, and biogeography. With the development of sequencing technology, improved analytical methods, and knowledge of fossils for calibration, it is possible to obtain robust molecular dating results. However, while phylogenomic datasets show great promise in phylogenetic estimation, the best ways to leverage the large amounts of data for divergence time estimation have not been well explored. A potential solution is to focus on a subset of data for divergence time estimation, which can significantly reduce the computational burdens and avoid problems with data heterogeneity that may bias results. RESULTS In this study, we obtained thousands of ultraconserved elements (UCEs) from 130 extant galliform taxa, including representatives of all genera, to determine the divergence times throughout galliform history. We tested the effects of different "gene shopping" schemes on divergence time estimation using a carefully vetted and previously validated set of fossils. Our results indicate that commonly used clock-like schemes may not be suitable for UCE dating (or other data types) where some loci have little information. We suggest that partitioning (e.g., with PartitionFinder) and selecting tree-like partitions may be good strategies to select a subset of data for divergence time estimation from UCEs. Our galliform time tree is largely consistent with other molecular clock studies of mitochondrial and nuclear loci. With our increased taxon sampling, a well-resolved topology, carefully vetted fossil calibrations, and suitable molecular dating methods, we obtained a high-quality galliform time tree.
CONCLUSIONS We provide a robust galliform backbone time tree that can be combined with more fossil records to further facilitate our understanding of the evolution of Galliformes and can be used as a resource for comparative and biogeographic studies in this group.
Affiliation(s)
- De Chen: MOE Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, China; Department of Biology, University of Florida, Gainesville, FL, USA.
- Peter A Hosner: Department of Biology, University of Florida, Gainesville, FL, USA; Natural History Museum of Denmark and Center for Global Mountain Biodiversity, University of Copenhagen, Copenhagen, Denmark.
- Donna L Dittmann: Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA.
- John P O'Neill: Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA.
- Sharon M Birks: Burke Museum of Natural History and Culture, University of Washington, Seattle, WA, USA.
- Edward L Braun: Department of Biology, University of Florida, Gainesville, FL, USA.
8
Wang T, Chen R, Liu W, Yu M. Structure-preserving integrated analysis for risk stratification with application to cancer staging. Biostatistics 2021; 23:990-1006. PMID: 33738474. DOI: 10.1093/biostatistics/kxab005.
Abstract
To provide an appropriate and practical level of health care, it is critical to group patients into relatively few strata that have distinct prognosis. Such grouping or stratification is typically based on well-established risk factors and clinical outcomes. A well-known example is the American Joint Committee on Cancer staging for cancer that uses tumor size, node involvement, and metastasis status. We consider a statistical method for such grouping based on individual patient data from multiple studies. The method encourages a common grouping structure as a basis for borrowing information, but acknowledges data heterogeneity including unbalanced data structures across multiple studies. We build on the "lasso-tree" method, which is more versatile than the well-known classification and regression tree method in generating possible grouping patterns. In addition, the parametrization of the lasso-tree method makes it very natural to incorporate the underlying order information in the risk factors. In this article, we also strengthen the lasso-tree method by establishing its theoretical properties, which Lin and others (2013. Lasso tree for cancer staging with survival data. Biostatistics 14, 327-339) did not pursue. We evaluate our method in extensive simulation studies and an analysis of multiple breast cancer data sets.
Affiliation(s)
- Tianjie Wang: Department of Statistics, University of Wisconsin, Madison, WI, USA.
- Rui Chen: Department of Statistics, University of Wisconsin, Madison, WI, USA.
- Wenshuo Liu: Department of Research & Innovation, Interactions LLC, 31 Hayward Street Suite E, Franklin, MA 02038, USA.
- Menggang Yu: Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA.
9
Zhang Y, Bernau C, Parmigiani G, Waldron L. The impact of different sources of heterogeneity on loss of accuracy from genomic prediction models. Biostatistics 2020; 21:253-268. PMID: 30202918. DOI: 10.1093/biostatistics/kxy044.
Abstract
Cross-study validation (CSV) of prediction models is an alternative to traditional cross-validation (CV) in domains where multiple comparable datasets are available. Although many studies have noted potential sources of heterogeneity in genomic studies, to our knowledge none have systematically investigated their intertwined impacts on prediction accuracy across studies. We employ a hybrid parametric/non-parametric bootstrap method to realistically simulate publicly available compendia of microarray, RNA-seq, and whole metagenome shotgun microbiome studies of health outcomes. Three types of heterogeneity between studies are manipulated and studied: (i) imbalances in the prevalence of clinical and pathological covariates, (ii) differences in gene covariance that could be caused by batch, platform, or tumor purity effects, and (iii) differences in the "true" model that associates gene expression and clinical factors to outcome. We assess model accuracy, while altering these factors. Lower accuracy is seen in CSV than in CV. Surprisingly, heterogeneity in known clinical covariates and differences in gene covariance structure have very limited contributions in the loss of accuracy when validating in new studies. However, forcing identical generative models greatly reduces the within/across study difference. These results, observed consistently for multiple disease outcomes and omics platforms, suggest that the most easily identifiable sources of study heterogeneity are not necessarily the primary ones that undermine the ability to accurately replicate the accuracy of omics prediction models in new studies. Unidentified heterogeneity, such as could arise from unmeasured confounding, may be more important.
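Cross-study validation as described, training on one study and scoring on every other, reduces to filling a study-by-study accuracy matrix whose off-diagonal entries reveal the loss of accuracy in new studies. The sketch below uses a deliberately trivial threshold learner and a simulated batch-effect shift; everything here is illustrative, not the authors' pipeline.

```python
import numpy as np

def csv_matrix(studies, fit, score):
    """Cross-study validation: train on study i, evaluate on study j.

    Diagonal entries mimic within-study performance; off-diagonal entries
    show the accuracy drop when validating in a different study.
    """
    k = len(studies)
    acc = np.zeros((k, k))
    for i, (Xi, yi) in enumerate(studies):
        model = fit(Xi, yi)
        for j, (Xj, yj) in enumerate(studies):
            acc[i, j] = score(model, Xj, yj)
    return acc

# Minimal stand-in learner: a threshold midway between the class means.
def fit(X, y):
    return 0.5 * X[y == 1].mean() + 0.5 * X[y == 0].mean()

def score(thr, X, y):
    return float(((X > thr).astype(int) == y).mean())

rng = np.random.default_rng(1)
def make_study(shift):
    y = rng.integers(0, 2, 200)
    X = y * 2.0 + shift + rng.normal(0, 0.5, 200)   # shift = batch effect
    return X, y

studies = [make_study(0.0), make_study(1.5)]
acc = csv_matrix(studies, fit, score)
print(acc)   # off-diagonal entries drop well below the diagonal
```

Even this toy shows the paper's headline pattern: a shift in the generative model between studies degrades transferred accuracy while leaving within-study accuracy intact.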
Affiliation(s)
- Yuqing Zhang: Graduate Program in Bioinformatics, Boston University, 24 Cummington Mall, Boston, MA, USA.
- Christoph Bernau: Department of Medical Informatics, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, Munich, Germany.
- Giovanni Parmigiani: Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 3 Blackfan Cir, Boston, MA, USA; Department of Biostatistics, Harvard TH Chan School of Public Health, 677 Huntington Ave, Boston, MA, USA.
- Levi Waldron: Graduate School of Public Health and Health Policy, Institute for Implementation Science in Population Health, City University of New York, 55 W 125th St, New York, NY, USA.
10
Wang K, Zhao S, Jackson E. Investigating exposure measures and functional forms in urban and suburban intersection safety performance functions using generalized negative binomial-P model. Accid Anal Prev 2020; 148:105838. PMID: 33125923. DOI: 10.1016/j.aap.2020.105838.
Abstract
Selecting an appropriate exposure measure and functional form for Safety Performance Functions (SPFs) is critical in precisely predicting crash counts by different crash types for intersections. This study proposes a new approach, namely the Generalized Negative Binomial-P (GNB-P) model, to model the complex relationship between crashes and different exposure measures by crash type for intersections, which helps not only identify the most reliable exposure measure for intersection SPFs, but also explore the most appropriate functional form of the NB models. To this end, three types of SPF functional forms, namely the Power function, Hoerl function 1 and Hoerl function 2, with different exposure measures including major road AADT, minor road AADT and total AADT, were estimated by crash type for stop-controlled and two types of signalized intersections. The over-dispersion of the SPF models was estimated using the exposure measures to account for crash data variation across different intersections. The SPF estimation results highlighted that the mean-variance structure of NB models is not consistent and varies by crash data. The over-dispersion of SPFs by crash type is not constant and varies across different intersections. The minor road AADT is shown to be positively correlated with the over-dispersion of SPFs in estimating crash counts for Same-Direction Crashes (SDC), Intersecting-Direction Crashes (IDC) and Single-Vehicle Crashes (SVC). Estimating the over-dispersion using exposure measures results in more reliable SPF results. Furthermore, it is found that the Power function with major road and minor road AADT as the exposure measure performs the best in estimating SPFs for Opposite-Direction Crashes (ODC). The Hoerl function 2 with total AADT and the proportion of minor road AADT over the total as the exposure measure performs the best in estimating SVC SPFs for intersections. The Hoerl function 1 with major road and minor road AADT as the exposure measure is more accurate in estimating SPFs for both SDC and IDC.
Affiliation(s)
- Kai Wang: Connecticut Transportation Safety Research Center, Connecticut Transportation Institute, University of Connecticut, 270 Middle Turnpike, Unit 5202, Storrs, CT 06269-5202, USA.
- Shanshan Zhao: Connecticut Transportation Safety Research Center, Connecticut Transportation Institute, University of Connecticut, 270 Middle Turnpike, Unit 5202, Storrs, CT 06269-5202, USA.
- Eric Jackson: Connecticut Transportation Safety Research Center, Connecticut Transportation Institute, University of Connecticut, 270 Middle Turnpike, Unit 5202, Storrs, CT 06269-5202, USA.
11
Wang K, Zhao S, Jackson E. Functional forms of the negative binomial models in safety performance functions for rural two-lane intersections. Accid Anal Prev 2019; 124:193-201. PMID: 30665054. DOI: 10.1016/j.aap.2019.01.015.
Abstract
Safety Performance Functions (SPFs) play a prominent role in estimating intersection crashes, and identifying the sites with the highest potential for safety improvement. To maximize the crash prediction accuracy, this paper describes the application of different functional forms of the Negative Binomial (NB) models (i.e. NB-1, NB-2 and NB-P) in estimating safety performance functions by crash type for three types of rural two-lane intersections, including three-leg stop-controlled (3ST) intersections, four-leg stop-controlled (4ST) intersections and four-leg signalized (4SG) intersections. Crash types were aggregated into same-direction, opposite-direction, intersecting-direction and single-vehicle crashes. Major and minor road Annual Average Daily Traffic (AADT) were used as predictors in the SPF estimation. In addition, major and minor road AADT were also used as predictors in the estimation of the over-dispersion parameter of the NB models to account for the crash data heterogeneity. In the end, all NB models were compared based on both the model estimation goodness-of-fit and the prediction performance. The model goodness-of-fit indicates that the NB-P model outperforms the NB-1 and NB-2 models for most crash types and intersection types, by providing a flexible variance structure to the NB approaches. The parameterization of the over-dispersion factor verifies that the over-dispersion parameter of the NB models highly depends on how the variance structure is defined in the model, and the over-dispersion parameter is shown to vary among different intersections for each crash type and can be estimated using both the major and minor road AADT at rural two-lane intersections. The NB-P model is found to more effectively capture the variation of over-dispersion among intersections in NB models, which benefits the accommodation of data heterogeneity in intersection SPF development. The prediction performance comparison illustrates that the NB-P model slightly improves the crash prediction accuracy compared with the other two models, especially for the 3ST and 4SG intersections. In conclusion, the NB-P model with parameterized over-dispersion factor is recommended to provide more unbiased parameter estimates when estimating SPFs by crash type for rural two-lane intersections.
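The NB-1/NB-2/NB-P distinction that both of these intersection-safety articles turn on is simply the exponent in the mean-variance relationship, Var(y) = mu + alpha * mu^p. A minimal illustration of the standard formulation (not the papers' estimation code) makes the flexibility concrete:

```python
def nb_variance(mu, alpha, p):
    """Variance of a Negative Binomial-P model: Var(y) = mu + alpha * mu**p.

    p = 1 gives NB-1 (over-dispersion linear in the mean), p = 2 gives NB-2
    (the usual quadratic form); NB-P estimates p from the data, which is
    what lets the mean-variance structure adapt to each crash type.
    """
    return mu + alpha * mu ** p

mu, alpha = 4.0, 0.5
print(nb_variance(mu, alpha, 1))    # NB-1: 4 + 0.5 * 4
print(nb_variance(mu, alpha, 2))    # NB-2: 4 + 0.5 * 16
print(nb_variance(mu, alpha, 1.5))  # NB-P with an intermediate p
```

Because the NB-1 and NB-2 variances can differ substantially at the same mean, fixing p in advance bakes in an assumption about crash-count dispersion that NB-P instead lets the data decide.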
Affiliation(s)
- Kai Wang: Connecticut Transportation Safety Research Center, Connecticut Transportation Institute, University of Connecticut, 270 Middle Turnpike, Unit 5202, Storrs, CT 06269-5202, USA.
- Shanshan Zhao: Connecticut Transportation Safety Research Center, Connecticut Transportation Institute, University of Connecticut, 270 Middle Turnpike, Unit 5202, Storrs, CT 06269-5202, USA.
- Eric Jackson: Connecticut Transportation Safety Research Center, Connecticut Transportation Institute, University of Connecticut, 270 Middle Turnpike, Unit 5202, Storrs, CT 06269-5202, USA.
|
12
|
Cui L, Zeng N, Kim M, Mueller R, Hankosky ER, Redline S, Zhang GQ. X-search: an open access interface for cross-cohort exploration of the National Sleep Research Resource. BMC Med Inform Decis Mak 2018; 18:99. [PMID: 30424756 PMCID: PMC6234631 DOI: 10.1186/s12911-018-0682-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2017] [Accepted: 10/18/2018] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND The National Sleep Research Resource (NSRR) is a large-scale, openly shared data repository of de-identified, highly curated clinical sleep data from multiple NIH-funded epidemiological studies. Although many data repositories allow users to browse their content, few support fine-grained, cross-cohort query and exploration at the study-subject level. We introduce a cross-cohort query and exploration system, called X-search, to enable researchers to query patient cohort counts across a growing number of completed, NIH-funded studies in NSRR and to explore the feasibility of reusing the data for research studies. METHODS X-search is designed as a general framework with two loosely coupled components: a semantically annotated data repository and a cross-cohort exploration engine. The semantically annotated data repository comprises a canonical data dictionary, data sources each with their own data dictionary, and mappings between each individual data dictionary and the canonical one. The cross-cohort exploration engine consists of five modules: query builder, graphical exploration, case-control exploration, query translation, and query execution. The canonical data dictionary serves as the unified metadata that drives the visual exploration interfaces and facilitates query translation through the mappings. RESULTS X-search is publicly available at https://www.x-search.net/ with nine NSRR datasets consisting of over 26,000 unique subjects. The canonical data dictionary contains over 900 common data elements across the datasets. X-search has received over 1800 cross-cohort queries from users in 16 countries. CONCLUSIONS X-search provides a powerful cross-cohort exploration interface for querying and exploring heterogeneous datasets in the NSRR data repository, enabling researchers to evaluate the feasibility of potential research studies and generate hypotheses using the NSRR data.
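The mapping-driven query translation described above can be illustrated with a toy sketch. This is not the X-search implementation; the cohort names, variable names, and dictionary structure below are all hypothetical, chosen only to show how a canonical data element resolves to a cohort-specific variable name through the mappings.

```python
# Illustrative sketch (not X-search's actual code): a canonical data element
# is translated into each cohort's local variable name via per-cohort
# mappings, mirroring the canonical-dictionary-driven query translation.

CANONICAL_TO_LOCAL = {  # hypothetical mappings for two hypothetical cohorts
    "cohort_a": {"bmi": "body_mass_index", "ahi": "apnea_hypopnea_idx"},
    "cohort_b": {"bmi": "bmi_v2", "ahi": "ahi_total"},
}

def translate(canonical_term: str, cohort: str):
    """Map a canonical data element to a cohort-specific variable name."""
    return CANONICAL_TO_LOCAL[cohort].get(canonical_term)

assert translate("bmi", "cohort_a") == "body_mass_index"
assert translate("ahi", "cohort_b") == "ahi_total"
```

A single canonical query can then be fanned out to every cohort whose mapping covers the requested elements, which is what makes cross-cohort counts possible.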
Affiliation(s)
- Licong Cui
- Department of Computer Science, University of Kentucky, Lexington, KY, USA
- Institute for Biomedical Informatics, University of Kentucky, Lexington, KY, USA
- Ningzhou Zeng
- Department of Computer Science, University of Kentucky, Lexington, KY, USA
- Institute for Biomedical Informatics, University of Kentucky, Lexington, KY, USA
- Matthew Kim
- Brigham and Women’s Hospital, Boston, MA, USA
- Harvard Medical School, Harvard University, Boston, MA, USA
- Remo Mueller
- Brigham and Women’s Hospital, Boston, MA, USA
- Harvard Medical School, Harvard University, Boston, MA, USA
- Emily R. Hankosky
- Institute for Biomedical Informatics, University of Kentucky, Lexington, KY, USA
- Susan Redline
- Brigham and Women’s Hospital, Boston, MA, USA
- Harvard Medical School, Harvard University, Boston, MA, USA
- Guo-Qiang Zhang
- Department of Computer Science, University of Kentucky, Lexington, KY, USA
- Institute for Biomedical Informatics, University of Kentucky, Lexington, KY, USA
|
13
|
Kumar K, Cava F. Principal coordinate analysis assisted chromatographic analysis of bacterial cell wall collection: A robust classification approach. Anal Biochem 2018; 550:8-14. [PMID: 29649471 DOI: 10.1016/j.ab.2018.04.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2018] [Revised: 03/28/2018] [Accepted: 04/08/2018] [Indexed: 11/20/2022]
Abstract
In the present work, principal coordinate analysis (PCoA) is introduced to develop a robust model for classifying chromatographic data sets of peptidoglycan samples. PCoA captures the heterogeneity present in the data sets by using the dissimilarity matrix as input. Thus, in principle, it can capture even subtle differences in bacterial peptidoglycan composition and provide a more robust and fast approach for classifying bacterial collections and identifying novel cell wall targets for further biological and clinical studies. The utility of the proposed approach is successfully demonstrated by analysing two different kinds of bacterial collections. The first set comprised peptidoglycan samples belonging to different subclasses of Alphaproteobacteria, whereas the second set, which is more intricate for chemometric analysis, consisted of wild-type Vibrio cholerae and mutants with subtle differences in their peptidoglycan composition. The present work proposes a useful approach for classifying chromatographic data sets of peptidoglycan samples with subtle differences, and suggests that PCoA can be a method of choice in any data analysis workflow.
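PCoA (classical multidimensional scaling) works exactly as the abstract says: it takes a dissimilarity matrix as input and recovers sample coordinates from it. A minimal numpy sketch of the standard algorithm (double-centering followed by eigendecomposition; this is the textbook method, not the authors' specific pipeline):

```python
import numpy as np

def pcoa(D, k=2):
    """Classical PCoA: embed n samples from an n-by-n dissimilarity matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1]               # largest eigenvalues first
    vals, vecs = vals[order][:k], vecs[:, order][:, :k]
    return vecs * np.sqrt(np.maximum(vals, 0.0)) # principal coordinates

# Three samples whose dissimilarities come from points 0, 1, 3 on a line:
D = np.array([[0., 1., 3.],
              [1., 0., 2.],
              [3., 2., 0.]])
X = pcoa(D, k=1)
# The 1-D embedding reproduces the input dissimilarities exactly.
assert np.allclose(abs(X[0, 0] - X[2, 0]), 3.0)
```

For Euclidean dissimilarities the embedding is exact; for non-Euclidean chromatographic dissimilarities, small negative eigenvalues appear and are clamped to zero, which is why PCoA degrades gracefully on heterogeneous data.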
|
14
|
Shi Q, Zhang C, Guo W, Zeng T, Lu L, Jiang Z, Wang Z, Liu J, Chen L. Local network component analysis for quantifying transcription factor activities. Methods 2017; 124:25-35. [PMID: 28710010 DOI: 10.1016/j.ymeth.2017.06.018] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2017] [Revised: 05/02/2017] [Accepted: 06/17/2017] [Indexed: 12/16/2022] Open
Abstract
Transcription factors (TFs) can regulate physiological transitions or determine stable phenotypic diversity. Accurate estimation of TF regulatory signals or functional activities is of great significance for guiding biological experiments and elucidating molecular mechanisms, but remains challenging. Traditional methods identify TF regulatory signals at the population level, which masks heterogeneous regulation mechanisms in individuals or subgroups and thus yields inaccurate analyses. Here, we propose a novel computational framework, local network component analysis (LNCA), to exploit data heterogeneity and automatically quantify accurate transcription factor activity (TFA) by integrating partitioned expression sets (i.e., local information) with prior TF-gene regulatory knowledge. Specifically, LNCA adopts an adaptive optimization strategy, which evaluates the local similarities of regulation controls and corrects biases during data integration, to construct the TFA landscape. We first numerically demonstrate the effectiveness of LNCA on simulated data sets, compared with traditional methods such as FastNCA, ROBNCA and NINCA. We then apply our model to two real data sets with implicit temporal or spatial regulation variations. The results show that LNCA not only recognizes the periodic mode along the S. cerevisiae cell cycle process, but also substantially outperforms the other methods in terms of accuracy and consistency. In addition, a cross-validation study on glioblastoma multiforme (GBM) indicates that the TFAs identified by LNCA distinguish clinically distinct tumor groups better than the expression values of the corresponding TFs, opening a new way to classify tumor subtypes and providing novel insight into cancer heterogeneity.
AVAILABILITY LNCA was implemented as a Matlab package, which is available at http://sysbio.sibcb.ac.cn/cb/chenlab/software.htm/LNCApackage_0.1.rar.
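The network component analysis idea underlying LNCA and its baselines (FastNCA, ROBNCA, NINCA) can be sketched in a few lines. This is the generic noise-free NCA setup, not the authors' LNCA code: given a known TF-gene connectivity matrix A, expression E ≈ A · S, and TF activities S are recovered by least squares. The matrices below are synthetic.

```python
import numpy as np

# Hedged sketch of the core NCA step (not the LNCA package): with known
# TF-gene connectivity A, recover TF activities S from expression E = A @ S.

A = np.array([[1.0, 0.0],    # gene 1 regulated by TF1 only
              [0.5, 1.0],    # gene 2 regulated by both TFs
              [0.0, 2.0]])   # gene 3 regulated by TF2 only

rng = np.random.default_rng(0)
S_true = rng.normal(size=(2, 5))   # 2 TF activities across 5 samples
E = A @ S_true                     # noise-free synthetic expression

# Least-squares estimate of the TF activity matrix.
S_est, *_ = np.linalg.lstsq(A, E, rcond=None)
assert np.allclose(S_est, S_true)
```

LNCA's contribution, per the abstract, is to run this kind of decomposition on partitioned ("local") expression sets and adaptively integrate the results, rather than fitting one population-level model.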
|
15
|
Qian P, Zhao K, Jiang Y, Su KH, Deng Z, Wang S, Muzic RF Jr. Knowledge-leveraged transfer fuzzy C-Means for texture image segmentation with self-adaptive cluster prototype matching. Knowl Based Syst 2017; 130:33-50. [PMID: 30050232 DOI: 10.1016/j.knosys.2017.05.018] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
We study a novel fuzzy clustering method that improves segmentation performance on a target texture image by leveraging knowledge from a prior texture image. Two knowledge transfer mechanisms, knowledge-leveraged prototype transfer (KL-PT) and knowledge-leveraged prototype matching (KL-PM), are first introduced as the bases. Applying them, the knowledge-leveraged transfer fuzzy C-means (KL-TFCM) method and its three-stage interlinked framework, comprising knowledge extraction, knowledge matching, and knowledge utilization, are developed. There are two specific versions, KL-TFCM-c and KL-TFCM-f, i.e. the crisp and flexible forms, which use the strategies of maximum matching degree and weighted sum, respectively. The significance of our work is fourfold: 1) owing to the adjustable degree of reference between the source and target domains, KL-PT can appropriately learn insightful knowledge, i.e. the cluster prototypes, from the source domain; 2) KL-PM can self-adaptively determine reasonable pairwise relationships between cluster prototypes in the source and target domains, even if the numbers of clusters differ in the two domains; 3) the joint action of KL-PM and KL-PT effectively resolves the data inconsistency and heterogeneity between the source and target domains, e.g. data distribution diversity and cluster number difference; thus, through the three-stage knowledge transfer, beneficial knowledge from the source domain can be extensively and self-adaptively leveraged in the target domain, and as evidence of this, both KL-TFCM-c and KL-TFCM-f surpass many existing clustering methods in texture image segmentation; and 4) when the source and target domains have different cluster numbers, KL-TFCM-f achieves higher clustering effectiveness and segmentation performance than KL-TFCM-c.
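The prototype-matching step can be illustrated with a toy sketch. This is an assumption-laden simplification, not the paper's KL-PM formulation: here "matching degree" is reduced to plain Euclidean distance, and each target-domain cluster prototype is paired with its nearest source-domain prototype, which works even when the two domains have different numbers of clusters.

```python
import numpy as np

# Illustrative nearest-prototype matching in the spirit of KL-PM (the
# distance-based matching rule here is a simplifying assumption, not the
# paper's exact matching-degree definition).

def match_prototypes(source, target):
    """For each target prototype, return the index of the nearest source one."""
    d = np.linalg.norm(target[:, None, :] - source[None, :, :], axis=2)
    return d.argmin(axis=1)

src = np.array([[0.0, 0.0], [10.0, 10.0], [5.0, 0.0]])  # 3 source prototypes
tgt = np.array([[9.0, 9.5], [0.5, -0.2]])               # 2 target prototypes
assert match_prototypes(src, tgt).tolist() == [1, 0]
```

In the transfer step, the matched source prototypes would then regularize the target clustering, either crisply (best match only, as in KL-TFCM-c) or as a weighted sum over matches (as in KL-TFCM-f).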
|
16
|
Abraham A, Milham MP, Di Martino A, Craddock RC, Samaras D, Thirion B, Varoquaux G. Deriving reproducible biomarkers from multi-site resting-state data: An Autism-based example. Neuroimage 2016; 147:736-745. [PMID: 27865923 DOI: 10.1016/j.neuroimage.2016.10.045] [Citation(s) in RCA: 292] [Impact Index Per Article: 36.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2016] [Revised: 10/16/2016] [Accepted: 10/21/2016] [Indexed: 12/30/2022] Open
Abstract
Resting-state functional Magnetic Resonance Imaging (R-fMRI) holds the promise of revealing functional biomarkers of neuropsychiatric disorders. However, extracting such biomarkers is challenging for complex, multi-faceted neuropathologies such as autism spectrum disorders. Large multi-site datasets increase sample sizes to compensate for this complexity, at the cost of uncontrolled heterogeneity. This heterogeneity raises new challenges, akin to those faced in realistic diagnostic applications. Here, we demonstrate the feasibility of inter-site classification of neuropsychiatric status, with an application to the Autism Brain Imaging Data Exchange (ABIDE) database, a large (N=871) multi-site autism dataset. For this purpose, we investigate pipelines that extract the most predictive biomarkers from the data. These R-fMRI pipelines build participant-specific connectomes from functionally defined brain areas. Connectomes are then compared across participants to learn patterns of connectivity that differentiate typical controls from individuals with autism. We predict this neuropsychiatric status for participants from the same acquisition sites or from different, unseen ones. Good choices of methods for the various steps of the pipeline lead to 67% prediction accuracy on the full ABIDE data, which is significantly better than previously reported results. We perform extensive validation on multiple subsets of the data defined by different inclusion criteria, which enables detailed analysis of the factors contributing to successful connectome-based prediction. First, prediction accuracy improves as more subjects are included, up to the maximum number available. Second, the definition of functional brain areas is of paramount importance for biomarker discovery: brain areas extracted from large R-fMRI datasets outperform reference atlases in the classification tasks.
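The connectome-building step described above can be sketched in numpy. This is a generic featurization under assumed details (correlation connectivity, upper-triangle vectorization), not the authors' full pipeline: per-region time series are correlated, and the unique off-diagonal entries become one participant's feature vector for a downstream classifier.

```python
import numpy as np

# Sketch of connectome featurization (assumed simplification of the
# pipeline): correlate region time series, keep the upper triangle.

def connectome_features(ts):
    """ts: (timepoints, regions) array -> vector of unique edge correlations."""
    corr = np.corrcoef(ts.T)                 # regions x regions connectivity
    iu = np.triu_indices_from(corr, k=1)     # drop diagonal and duplicates
    return corr[iu]

rng = np.random.default_rng(42)
ts = rng.normal(size=(120, 4))               # 120 timepoints, 4 brain areas
feats = connectome_features(ts)
assert feats.shape == (6,)                   # 4 * (4 - 1) / 2 unique edges
```

Stacking one such vector per participant yields the samples-by-edges matrix on which a linear classifier can learn connectivity patterns that separate the diagnostic groups.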
Affiliation(s)
- Alexandre Abraham
- Parietal Team, INRIA Saclay Île-de-France, Saclay, France; CEA, Neurospin bât 145, 91191 Gif-sur-Yvette, France.
- Michael P Milham
- Center for the Developing Brain, Child Mind Institute, New York, USA; Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, Orangeburg, NY, USA
- R Cameron Craddock
- Center for the Developing Brain, Child Mind Institute, New York, USA; Center for Biomedical Imaging and Neuromodulation, Nathan S. Kline Institute for Psychiatric Research, Orangeburg, NY, USA
- Dimitris Samaras
- Stony Brook University, NY 11794, USA; Ecole Centrale, 92290 Châtenay-Malabry, France
- Bertrand Thirion
- Parietal Team, INRIA Saclay Île-de-France, Saclay, France; CEA, Neurospin bât 145, 91191 Gif-sur-Yvette, France
- Gael Varoquaux
- Parietal Team, INRIA Saclay Île-de-France, Saclay, France; CEA, Neurospin bât 145, 91191 Gif-sur-Yvette, France
|