1. A fully integrated wearable ultrasound system to monitor deep tissues in moving subjects. Nat Biotechnol 2024; 42:448-457. PMID: 37217752. DOI: 10.1038/s41587-023-01800-0.
Abstract
Recent advances in wearable ultrasound technologies have demonstrated the potential for hands-free data acquisition, but technical barriers remain as these probes require wire connections, can lose track of moving targets and create data-interpretation challenges. Here we report a fully integrated autonomous wearable ultrasonic-system-on-patch (USoP). A miniaturized flexible control circuit is designed to interface with an ultrasound transducer array for signal pre-conditioning and wireless data communication. Machine learning is used to track moving tissue targets and assist the data interpretation. We demonstrate that the USoP allows continuous tracking of physiological signals from tissues as deep as 164 mm. On mobile subjects, the USoP can continuously monitor physiological signals, including central blood pressure, heart rate and cardiac output, for as long as 12 h. This result enables continuous autonomous surveillance of deep tissue signals toward the internet-of-medical-things.
2. Deep Learning Estimation of 10-2 Visual Field Map Based on Macular Optical Coherence Tomography Angiography Measurements. Am J Ophthalmol 2024; 257:187-200. PMID: 37734638. DOI: 10.1016/j.ajo.2023.09.014.
Abstract
PURPOSE To develop deep learning (DL) models estimating the central visual field (VF) from optical coherence tomography angiography (OCTA) vessel density (VD) measurements. DESIGN Development and validation of a deep learning model. METHODS A total of 1051 10-2 VF OCTA pairs from healthy, glaucoma suspect, and glaucoma eyes were included. DL models were trained on en face macula VD images from OCTA to estimate 10-2 mean deviation (MD), pattern standard deviation (PSD), and 68 total deviation (TD) and pattern deviation (PD) values, and were compared with a linear regression (LR) model with the same input. Accuracy of the models was evaluated by calculating the average mean absolute error (MAE) and the R2 (squared Pearson correlation coefficient) of the estimated and actual VF values. RESULTS The DL model achieved an R2 of 0.85 (95% confidence interval [CI], 0.74-0.92) and an MAE of 1.76 dB (95% CI, 1.39-2.17 dB) for 10-2 MD estimation, significantly better than the linear estimates. The DL model outperformed the LR model for the estimation of pointwise TD values, with an average MAE of 2.48 dB (95% CI, 1.99-3.02) and R2 of 0.69 (95% CI, 0.57-0.76) over all test points. The DL model also outperformed the LR model for the estimation of all sectors. CONCLUSIONS DL models enable the estimation of VF loss from OCTA images with high accuracy. Applying DL to OCTA images may enhance clinical decision making. It may also improve individualized patient care and risk stratification of patients who are at risk for central VF damage.
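As a minimal sketch of the two evaluation metrics named in this abstract (MAE and the squared Pearson correlation), the following is illustrative only; the MD values are hypothetical toy numbers, not data from the study:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the input (here, dB)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def r2_pearson(y_true, y_pred):
    """Squared Pearson correlation between estimated and actual values."""
    r = np.corrcoef(np.asarray(y_true, float), np.asarray(y_pred, float))[0, 1]
    return float(r ** 2)

# Hypothetical 10-2 MD values (dB) for five eyes, purely for illustration.
actual = [-1.2, -3.5, -0.4, -7.8, -2.1]
estimated = [-1.0, -3.0, -0.9, -7.0, -2.5]
print(mae(actual, estimated))        # average absolute error in dB
print(r2_pearson(actual, estimated))  # agreement between estimate and truth
```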
3. A Generalized Explanation Framework for Visualization of Deep Learning Model Predictions. IEEE Transactions on Pattern Analysis and Machine Intelligence 2023; PP:1-18. PMID: 37022375. DOI: 10.1109/tpami.2023.3241106.
Abstract
Attribution-based explanations are popular in computer vision but of limited use for fine-grained classification problems typical of expert domains, where classes differ by subtle details. In these domains, users also seek understanding of "why" a class was chosen and "why not" an alternative class. A new GenerAlized expLanatiOn fRamEwork (GALORE) is proposed to satisfy all these requirements, by unifying attributive explanations with explanations of two other types. The first is a new class of explanations, denoted deliberative, proposed to address the "why" question, by exposing the network insecurities about a prediction. The second is the class of counterfactual explanations, which have been shown to address the "why not" question but are now more efficiently computed. GALORE unifies these explanations by defining them as combinations of attribution maps with respect to various classifier predictions and a confidence score. An evaluation protocol that leverages object recognition (CUB200) and scene classification (ADE20K) datasets combining part and attribute annotations is also proposed. Experiments show that confidence scores can improve explanation accuracy, deliberative explanations provide insight into the network deliberation process, the latter correlates with that performed by humans, and counterfactual explanations enhance the performance of human students in machine teaching experiments.
4. P–334 CT virtual Histerotomography: a new method for the evaluation of fallopian tube patency and pelvic organs in patients seeking pregnancy. Hum Reprod 2021. DOI: 10.1093/humrep/deab130.333.
Abstract
Study question
Can computerized virtual histerotomography (CT-HSG) be used for the evaluation of fallopian tube patency and pelvic organs in patients seeking pregnancy?
Summary answer
CT-HSG seems to be an adequate test for the evaluation of fallopian tube patency, pelvic organs, and the uterine cavity.
What is known already
CT-HSG is a minimally invasive exam that diagnoses variations in the female reproductive system, uses low radiation doses, and is well tolerated by patients. It simultaneously evaluates the uterine wall, cavity and cervix, tubes, and adjacent pelvic structures. The exam enables virtual navigation, which consists of the endoluminal view of the cervical canal and uterine cavity, and allows 3D reconstruction of images. The exam remains underused to assess infertility, but previous studies have shown its potential, and its use may become widespread.
Study design, size, duration
Retrospective cohort study that included data from 317 women seeking pregnancy between January 2019 and January 2021. The CT-HSG was indicated for the investigation of infertility (90.3%) and recurrent pregnancy loss (RPL; 0.9%), and for the evaluation of the tubal stump in patients who were planning tubal reversal surgery (8.8%). Patients filled out a questionnaire about their pain symptoms, and data were collected from electronic records.
Participants/materials, setting, methods
The study analyzed patients’ clinical characteristics and image findings regarding the tubes, uterine cavity, and ovaries. For the exam, a catheter was positioned in the cervix, through which the contrast medium (iopromide) was injected by an infusion pump at 0.30 ml/s, to a total of 20 ml. The tomographic slices were obtained at the 50th second. The CT-HSG images were interpreted by the same gynecologist and radiologist. Data were analyzed using SPSS version 20.0.
Main results and the role of chance
Women’s and partners’ mean ages were 32.7 ± 5.6 and 34.6 ± 7.7 years, respectively, and women’s mean BMI was 28.4 ± 6.4 kg/m². The pain scale was applied in 103 patients, who reported a mean score of 5.4 ± 3.2 at the end of the exam. Among the infertile patients, 67% were nulliparous. Regarding the exam findings, most uterine findings were normal (72.6%). The variations found were uterine malformations (including unicornuate uterus, uterus didelphys, bicornuate uterus, septate uterus, and arcuate uterus), synechiae, fibroids, endometrial polyps, adenomyosis, and retractions/lateralizations that may suggest endometriosis. The tubal findings on the right/left (%) were: 65/67.5 patent tubes; 18.9/17.7 obstructed tubes; 4/4.1 dilatation/hydrosalpinx; and 9.4/9.1 with a previous history of tubal ligation or salpingectomy; 1.5% of the tubal evaluations were inconclusive. Eleven of the 317 patients had to repeat the exam due to occurrences during its execution (for example, improper catheter positioning, cuff fall, stenosis of the internal cervical ostium, or severe pain). The 3D analysis and virtual navigation assist in the assessment of findings, in addition to being simpler for the gynecologist’s evaluation.
Limitations, reasons for caution
The sample size is small because the exam is a new technique. Patient follow-up and correlation with laparoscopy and hysteroscopy, when indicated, are still under study.
Wider implications of the findings
The exam seems promising for assessing infertility, RPL, and the tubal stump. Moreover, it may be a good alternative to hysterosalpingography, as it seems to cause less pain and allows evaluation of the ovaries and the uterine contour, in addition to providing 3D reconstructions and virtual uterine navigation.
Trial registration number
Not applicable
5. Cascade R-CNN: High Quality Object Detection and Instance Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 2021; 43:1483-1498. PMID: 31794388. DOI: 10.1109/tpami.2019.2956516.
Abstract
In object detection, the intersection over union (IoU) threshold is frequently used to define positives/negatives. The threshold used to train a detector defines its quality. While the commonly used threshold of 0.5 leads to noisy (low-quality) detections, detection performance frequently degrades for larger thresholds. This paradox of high-quality detection has two causes: 1) overfitting, due to vanishing positive samples for large thresholds, and 2) inference-time quality mismatch between detector and test hypotheses. A multi-stage object detection architecture, the Cascade R-CNN, composed of a sequence of detectors trained with increasing IoU thresholds, is proposed to address these problems. The detectors are trained sequentially, using the output of a detector as training set for the next. This resampling progressively improves hypotheses quality, guaranteeing a positive training set of equivalent size for all detectors and minimizing overfitting. The same cascade is applied at inference, to eliminate quality mismatches between hypotheses and detectors. An implementation of the Cascade R-CNN without bells or whistles achieves state-of-the-art performance on the COCO dataset, and significantly improves high-quality detection on generic and specific object datasets, including VOC, KITTI, CityPerson, and WiderFace. Finally, the Cascade R-CNN is generalized to instance segmentation, with nontrivial improvements over the Mask R-CNN.
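The stage-wise labeling criterion described above — the same proposals become positives or negatives under increasing IoU thresholds — can be sketched as follows. The boxes and threshold schedule are illustrative assumptions, not the paper's implementation:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def cascade_labels(proposals, ground_truth, thresholds=(0.5, 0.6, 0.7)):
    """Label the same proposals positive/negative at each cascade stage's
    increasingly strict IoU threshold (the stage-wise quality criterion)."""
    best_ious = [max(iou(p, g) for g in ground_truth) for p in proposals]
    return {t: [v >= t for v in best_ious] for t in thresholds}

# Toy example: one ground-truth box and three proposals of decreasing overlap.
gt = [(0, 0, 10, 10)]
proposals = [(0, 0, 10, 10), (2, 0, 12, 10), (5, 0, 15, 10)]
print(cascade_labels(proposals, gt))  # fewer positives at stricter stages
```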
6. Semantic Fisher Scores for Task Transfer: Using Objects to Classify Scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020; 42:3102-3118. PMID: 31180842. DOI: 10.1109/tpami.2019.2921960.
Abstract
The transfer of a neural network (CNN) trained to recognize objects to the task of scene classification is considered. A Bag-of-Semantics (BoS) representation is first induced, by feeding scene image patches to the object CNN, and representing the scene image by the ensuing bag of posterior class probability vectors (semantic posteriors). The encoding of the BoS with a Fisher vector (FV) is then studied. A link is established between the FV of any probabilistic model and the Q-function of the expectation-maximization (EM) algorithm used to estimate its parameters by maximum likelihood. This enables 1) immediate derivation of FVs for any model for which an EM algorithm exists, and 2) leveraging efficient implementations from the EM literature for the computation of FVs. It is then shown that standard FVs, such as those derived from Gaussian or even Dirichlet mixtures, are unsuccessful for the transfer of semantic posteriors, due to the highly non-linear nature of the probability simplex. The analysis of these FVs shows that significant benefits can ensue by 1) designing FVs in the natural parameter space of the multinomial distribution, and 2) adopting sophisticated probabilistic models of semantic feature covariance. The combination of these two insights leads to the encoding of the BoS in the natural parameter space of the multinomial, using a vector of Fisher scores derived from a mixture of factor analyzers (MFA). A network implementation of the MFA Fisher Score (MFA-FS), denoted as the MFAFSNet, is finally proposed to enable end-to-end training. Experiments with various object CNNs and datasets show that the approach has state-of-the-art transfer performance. Somewhat surprisingly, the scene classification results are superior to those of a CNN explicitly trained for scene classification, using a large scene dataset (Places). This suggests that holistic analysis is insufficient for scene classification. 
The modeling of local object semantics appears to be at least equally important. The two approaches are also shown to be strongly complementary, leading to very large scene classification gains when combined, and outperforming all previous scene classification approaches by a sizable margin.
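The link stated above between the Fisher score and the EM Q-function is the classical Fisher identity; in the notation below (assumed here, not taken from the paper), Q is the usual EM auxiliary function for a latent-variable model with hidden variable z:

```latex
% Fisher's identity: the score of the marginal likelihood equals the
% gradient of the EM auxiliary function evaluated at the current parameters.
\nabla_{\theta}\log p(x;\theta)
  \;=\; \left.\nabla_{\theta'}\,Q(\theta';\theta)\right|_{\theta'=\theta},
\qquad
Q(\theta';\theta) \;=\; \mathbb{E}_{z\sim p(z\mid x;\theta)}
  \big[\log p(x,z;\theta')\big].
```

This is what makes a Fisher score immediately derivable for any model with an EM algorithm: the E-step posterior gives the expectation, and the M-step gradient gives the score.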
7. Learning Complexity-Aware Cascades for Pedestrian Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020; 42:2195-2211. PMID: 30990173. DOI: 10.1109/tpami.2019.2910514.
Abstract
The problem of pedestrian detection is considered. The design of complexity-aware cascaded pedestrian detectors, combining features of very different complexities, is investigated. A new cascade design procedure is introduced, by formulating cascade learning as the Lagrangian optimization of a risk that accounts for both accuracy and complexity. A boosting algorithm, denoted as complexity aware cascade training (CompACT), is then derived to solve this optimization. CompACT cascades are shown to seek an optimal trade-off between accuracy and complexity by pushing features of higher complexity to the later cascade stages, where only a few difficult candidate patches remain to be classified. This enables the use of features of vastly different complexities in a single detector. As a result, the feature pool can be expanded to features previously impractical for cascade design, such as the responses of a deep convolutional neural network (CNN). This is demonstrated through the design of pedestrian detectors with a pool of features whose complexities span orders of magnitude. The resulting cascade generalizes the combination of a CNN with an object proposal mechanism: rather than a pre-processing stage, CompACT cascades seamlessly integrate CNNs in their stages. This enables accurate detection at fairly fast speeds.
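The Lagrangian risk referred to above can be sketched as a trade-off of this general form (notation assumed here, not taken from the paper): with a classification risk term and an expected per-example complexity term, cascade learning minimizes

```latex
% Complexity-aware cascade learning as a Lagrangian trade-off:
% R_E measures detection error, R_C the expected feature-evaluation cost,
% and eta >= 0 is the multiplier controlling the accuracy/complexity balance.
\mathcal{L}[f] \;=\; R_{E}[f] \;+\; \eta\, R_{C}[f]
```

Large eta pushes expensive features (such as CNN responses) to late stages, where few candidate patches survive, which is exactly the behavior the abstract describes.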
8. Deficient Endoplasmic Reticulum-Mitochondrial Phosphatidylserine Transfer Causes Liver Disease. Cell 2019; 177:881-895.e17. PMID: 31051106. DOI: 10.1016/j.cell.2019.04.010.
Abstract
Non-alcoholic fatty liver is the most common liver disease worldwide. Here, we show that the mitochondrial protein mitofusin 2 (Mfn2) protects against liver disease. Reduced Mfn2 expression was detected in liver biopsies from patients with non-alcoholic steatohepatitis (NASH). Moreover, reduced Mfn2 levels were detected in mouse models of steatosis or NASH, and its re-expression in a NASH mouse model ameliorated the disease. Liver-specific ablation of Mfn2 in mice provoked inflammation, triglyceride accumulation, fibrosis, and liver cancer. We demonstrate that Mfn2 binds phosphatidylserine (PS) and can specifically extract PS into membrane domains, favoring PS transfer to mitochondria and mitochondrial phosphatidylethanolamine (PE) synthesis. Consequently, hepatic Mfn2 deficiency reduces PS transfer and phospholipid synthesis, leading to endoplasmic reticulum (ER) stress and the development of a NASH-like phenotype and liver cancer. Ablation of Mfn2 in liver reveals that disruption of ER-mitochondrial PS transfer is a new mechanism involved in the development of liver disease.
9. Super Diffusion for Salient Object Detection. IEEE Transactions on Image Processing 2019; 29:2903-2917. PMID: 31765314. DOI: 10.1109/tip.2019.2954209.
Abstract
One major branch of salient object detection methods is diffusion-based: these methods construct a graph model on a given image and diffuse seed saliency values to the whole graph through a diffusion matrix. While their performance is sensitive to the specific feature spaces and scales used to define the diffusion matrix, little work has been published to systematically promote the robustness and accuracy of salient object detection under the generic mechanism of diffusion. In this work, we first present a novel view of the working mechanism of the diffusion process based on mathematical analysis, which reveals that the diffusion process actually computes the similarity of nodes with respect to the seeds based on diffusion maps. Following this analysis, we propose super diffusion, a novel inclusive learning-based framework for salient object detection, which achieves optimal and robust performance by integrating a large pool of feature spaces, scales, and even features originally computed for non-diffusion-based salient object detection. A closed-form solution of the optimal parameters for the integration is determined through supervised learning. At the local level, we propose to promote each individual diffusion before the integration. Our mathematical analysis reveals the close relationship between saliency diffusion and spectral clustering. Based on this, we propose to re-synthesize each individual diffusion matrix from the most discriminative eigenvectors and the constant eigenvector (for saliency normalization). The proposed framework is implemented and experimented on prevalently used benchmark datasets, consistently leading to state-of-the-art performance.
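A minimal sketch of the generic diffusion mechanism discussed above, in a manifold-ranking style: seed saliency is propagated over a graph by a diffusion matrix. The affinity matrix, seed vector, and alpha value are illustrative assumptions, not the paper's learned integration:

```python
import numpy as np

def diffusion_saliency(W, seeds, alpha=0.99):
    """Propagate seed saliency values over a graph.

    W     : (n, n) symmetric affinity matrix between nodes (e.g. superpixels).
    seeds : (n,) initial saliency values, nonzero at seed nodes.
    Returns diffused saliency, normalized to [0, 1].
    """
    D = np.diag(W.sum(axis=1))                 # degree matrix
    A = np.linalg.inv(D - alpha * W)           # the diffusion matrix
    s = A @ seeds                              # similarity to the seeds
    s = s - s.min()
    return s / s.max() if s.max() > 0 else s

# Toy 3-node chain graph with the seed at node 0: saliency decays with
# graph distance from the seed.
W = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
print(diffusion_saliency(W, np.array([1.0, 0.0, 0.0])))
```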
10. Volumetric Attention for 3D Medical Image Segmentation and Detection. Lecture Notes in Computer Science 2019. DOI: 10.1007/978-3-030-32226-7_20.
11. Comparison of different radiographic techniques in the detection of endo-perio lesions. J Clin Exp Dent 2017. DOI: 10.4317/medoral.176438636.
12. Parametric Regression on the Grassmannian. IEEE Transactions on Pattern Analysis and Machine Intelligence 2016; 38:2284-2297. PMID: 26766216. DOI: 10.1109/tpami.2016.2516533.
Abstract
We address the problem of fitting parametric curves on the Grassmann manifold for the purpose of intrinsic parametric regression. We start from the energy minimization formulation of linear least-squares in Euclidean space and generalize this concept to general nonflat Riemannian manifolds, following an optimal-control point of view. We then specialize this idea to the Grassmann manifold and demonstrate that it yields a simple, extensible and easy-to-implement solution to the parametric regression problem. In fact, it allows us to extend the basic geodesic model to (1) a "time-warped" variant and (2) cubic splines. We demonstrate the utility of the proposed solution on different vision problems, such as shape regression as a function of age, traffic-speed estimation and crowd-counting from surveillance video clips. Most notably, these problems can be conveniently solved within the same framework without any specifically-tailored steps along the processing pipeline.
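The energy-minimization formulation referred to above can be sketched, under assumed notation (mine, not the paper's), as the Riemannian generalization of linear least-squares: the first term penalizes covariant acceleration (zero for a geodesic), and the second term sums squared geodesic distances from the curve to the data points y_i observed at times t_i:

```latex
% Riemannian curve-fitting energy: smoothness + geodesic data fidelity.
E[\gamma] \;=\; \frac{\alpha}{2}\int_0^1
  \big\|\nabla_{\dot\gamma(t)}\,\dot\gamma(t)\big\|^2_{\gamma(t)}\,dt
\;+\; \frac{1}{2\sigma^2}\sum_{i=1}^{N} d_g^2\big(\gamma(t_i),\,y_i\big)
```

In Euclidean space this reduces to ordinary least-squares line fitting; specializing the manifold to the Grassmannian yields the paper's setting.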
13
14. Robust deformable and occluded object tracking with dynamic graph. IEEE Transactions on Image Processing 2014; 23:5497-5509. PMID: 25350927. DOI: 10.1109/tip.2014.2364919.
Abstract
Although some effort has been devoted to handling deformation and occlusion in visual tracking, they remain great challenges. In this paper, a dynamic graph-based tracker (DGT) is proposed to address these two challenges in a unified framework. In the dynamic target graph, nodes are the target local parts encoding appearance information, and edges are the interactions between nodes encoding inner geometric structure information. This graph representation provides much more information for tracking in the presence of deformation and occlusion. The target tracking is then formulated as tracking this dynamic undirected graph, which is also a matching problem between the target graph and the candidate graph. The local parts within the candidate graph are separated from the background with a Markov random field, and spectral clustering is used to solve the graph matching. The final target state is determined through a weighted voting procedure according to the reliability of part correspondence, and refined with recourse to a foreground/background segmentation. An effective online updating mechanism is proposed to update the model, allowing DGT to robustly adapt to variations of target structure. Experimental results show improved performance over several state-of-the-art trackers, in various challenging scenarios.
15. Object recognition with hierarchical discriminant saliency networks. Front Comput Neurosci 2014; 8:109. PMID: 25249971. PMCID: PMC4158795. DOI: 10.3389/fncom.2014.00109.
Abstract
The benefits of integrating attention and object recognition are investigated. While attention is frequently modeled as a pre-processor for recognition, we investigate the hypothesis that attention is an intrinsic component of recognition and vice-versa. This hypothesis is tested with a recognition model, the hierarchical discriminant saliency network (HDSN), whose layers are top-down saliency detectors, tuned for a visual class according to the principles of discriminant saliency. As a model of neural computation, the HDSN has two possible implementations. In a biologically plausible implementation, all layers comply with the standard neurophysiological model of visual cortex, with sub-layers of simple and complex units that implement a combination of filtering, divisive normalization, pooling, and non-linearities. In a convolutional neural network implementation, all layers are convolutional and implement a combination of filtering, rectification, and pooling. The rectification is performed with a parametric extension of the now popular rectified linear units (ReLUs), whose parameters can be tuned for the detection of target object classes. This enables a number of functional enhancements over neural network models that lack a connection to saliency, including optimal feature denoising mechanisms for recognition, modulation of saliency responses by the discriminant power of the underlying features, and the ability to detect both feature presence and absence. In either implementation, each layer has a precise statistical interpretation, and all parameters are tuned by statistical learning. Each saliency detection layer learns more discriminant saliency templates than its predecessors and higher layers have larger pooling fields. This enables the HDSN to simultaneously achieve high selectivity to target object classes and invariance. 
The performance of the network in saliency and object recognition tasks is compared to those of models from the biological and computer vision literatures. This demonstrates benefits for all the functional enhancements of the HDSN, the class tuning inherent to discriminant saliency, and saliency layers based on templates of increasing target selectivity and invariance. Altogether, these experiments suggest that there are non-trivial benefits in integrating attention and recognition.
16. On the role of correlation and abstraction in cross-modal multimedia retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence 2014; 36:521-535. PMID: 24457508. DOI: 10.1109/tpami.2013.142.
Abstract
The problem of cross-modal retrieval from multimedia repositories is considered. This problem addresses the design of retrieval systems that support queries across content modalities, for example, using an image to search for texts. A mathematical formulation is proposed, equating the design of cross-modal retrieval systems to that of isomorphic feature spaces for different content modalities. Two hypotheses are then investigated regarding the fundamental attributes of these spaces. The first is that low-level cross-modal correlations should be accounted for. The second is that the space should enable semantic abstraction. Three new solutions to the cross-modal retrieval problem are then derived from these hypotheses: correlation matching (CM), an unsupervised method which models cross-modal correlations, semantic matching (SM), a supervised technique that relies on semantic representation, and semantic correlation matching (SCM), which combines both. An extensive evaluation of retrieval performance is conducted to test the validity of the hypotheses. All approaches are shown successful for text retrieval in response to image queries and vice versa. It is concluded that both hypotheses hold, in a complementary form, although evidence in favor of the abstraction hypothesis is stronger than that for correlation.
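The correlation-matching (CM) idea above — projecting two modalities into a shared space where their correlation is maximal — is classically implemented with canonical correlation analysis. Here is a minimal NumPy sketch of generic CCA (not the paper's code; the data and all names are illustrative):

```python
import numpy as np

def cca(X, Y, k=1, eps=1e-8):
    """Tiny CCA via SVD of the whitened cross-covariance.

    X, Y : (n, dx) and (n, dy) feature matrices for two modalities
           (e.g. image features and text features for the same documents).
    Returns projection matrices mapping each modality to a shared k-dim space.
    """
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = len(X)
    Cxx = X.T @ X / n + eps * np.eye(X.shape[1])   # regularized covariances
    Cyy = Y.T @ Y / n + eps * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        w, V = np.linalg.eigh(C)                   # symmetric eigendecomposition
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)        # s holds canonical correlations
    return Wx @ U[:, :k], Wy @ Vt.T[:, :k]
```

Cross-modal retrieval then amounts to projecting a query from one modality and ranking items of the other modality by similarity in the shared space.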
17. Anomaly detection and localization in crowded scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence 2014; 36:18-32. PMID: 24231863. DOI: 10.1109/tpami.2013.111.
Abstract
The detection and localization of anomalous behaviors in crowded scenes is considered, and a joint detector of temporal and spatial anomalies is proposed. The proposed detector is based on a video representation that accounts for both appearance and dynamics, using a set of mixture of dynamic textures models. These models are used to implement 1) a center-surround discriminant saliency detector that produces spatial saliency scores, and 2) a model of normal behavior that is learned from training data and produces temporal saliency scores. Spatial and temporal anomaly maps are then defined at multiple spatial scales, by considering the scores of these operators at progressively larger regions of support. The multiscale scores act as potentials of a conditional random field that guarantees global consistency of the anomaly judgments. A data set of densely crowded pedestrian walkways is introduced and used to evaluate the proposed anomaly detector. Experiments on this and other data sets show that the latter achieves state-of-the-art anomaly detection results.
18. Latent Dirichlet allocation models for image classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 2013; 35:2665-2679. PMID: 24051727. DOI: 10.1109/tpami.2013.69.
Abstract
Two new extensions of latent Dirichlet allocation (LDA), denoted topic-supervised LDA (ts-LDA) and class-specific-simplex LDA (css-LDA), are proposed for image classification. An analysis of the supervised LDA models currently used for this task shows that the impact of class information on the topics discovered by these models is very weak in general. This implies that the discovered topics are driven by general image regularities, rather than the semantic regularities of interest for classification. To address this, ts-LDA models are introduced which replace the automated topic discovery of LDA with specified topics, identical to the classes of interest for classification. While this results in improvements in classification accuracy over existing LDA models, it compromises the ability of LDA to discover unanticipated structure of interest. This limitation is addressed by the introduction of css-LDA, an LDA model with class supervision at the level of image features. In css-LDA topics are discovered per class, i.e., a single set of topics shared across classes is replaced by multiple class-specific topic sets. The css-LDA model is shown to combine the labeling strength of topic-supervision with the flexibility of topic-discovery. Its effectiveness is demonstrated through an extensive experimental evaluation, involving multiple benchmark datasets, where it is shown to outperform existing LDA-based image classification approaches.
19. Localizing target structures in ultrasound video - a phantom study. Med Image Anal 2013; 17:712-22. PMID: 23746488. PMCID: PMC3737575. DOI: 10.1016/j.media.2013.05.003.
Abstract
The problem of localizing specific anatomic structures using ultrasound (US) video is considered. This involves automatically determining when an US probe is acquiring images of a previously defined object of interest, during the course of an US examination. Localization using US is motivated by the increased availability of portable, low-cost US probes, which inspire applications where inexperienced personnel and even first-time users acquire US data that is then sent to experts for further assessment. This process is of particular interest for routine examinations in underserved populations as well as for patient triage after natural disasters and large-scale accidents, where experts may be in short supply. The proposed localization approach is motivated by research in the area of dynamic texture analysis and leverages several recent advances in the field of activity recognition. For evaluation, we introduce an annotated and publicly available database of US video, acquired on three phantoms. Several experiments reveal the challenges of applying video analysis approaches to US images and demonstrate that good localization performance is possible with the proposed solution.
Collapse
|
20
|
Biologically Inspired Object Tracking Using Center-Surround Saliency Mechanisms. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2013; 35:541-554. [PMID: 22529325 DOI: 10.1109/tpami.2012.98] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
A biologically inspired discriminant object tracker is proposed. It is argued that discriminant tracking is a consequence of top-down tuning of the saliency mechanisms that guide the deployment of visual attention. The principle of discriminant saliency is then used to derive a tracker that implements a combination of center-surround saliency, a spatial spotlight of attention, and feature-based attention. In this framework, the tracking problem is formulated as one of continuous target-background classification, implemented in two stages. The first, or learning stage, combines a focus of attention (FoA) mechanism, and bottom-up saliency to identify a maximally discriminant set of features for target detection. The second, or detection stage, uses a feature-based attention mechanism and a target-tuned top-down discriminant saliency detector to detect the target. Overall, the tracker iterates between learning discriminant features from the target location in a video frame and detecting the location of the target in the next. The statistics of natural images are exploited to derive an implementation which is conceptually simple and computationally efficient. The saliency formulation is also shown to establish a unified framework for classifier design, target detection, automatic tracker initialization, and scale adaptation. Experimental results show that the proposed discriminant saliency tracker outperforms a number of state-of-the-art trackers in the literature.
Collapse
|
21
|
Glial cell activation in the spinal cord and dorsal root ganglia induced by surgery in mice. Eur J Pharmacol 2013; 702:126-34. [PMID: 23396227 DOI: 10.1016/j.ejphar.2013.01.047] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2012] [Revised: 01/18/2013] [Accepted: 01/29/2013] [Indexed: 01/10/2023]
Abstract
In rodents, surgery and/or remifentanil induce postoperative pain hypersensitivity together with glial cell activation. The same stimulus also produces long-lasting adaptive changes resulting in latent pain sensitization, substantiated after naloxone administration. Glial contribution to postoperative latent sensitization is unknown. In the incisional pain model in mice, surgery was performed under sevoflurane+remifentanil anesthesia and 21 days later, 1 mg/kg of (-) or (+) naloxone was administered subcutaneously. Mechanical thresholds (von Frey) and glial activation were repeatedly assessed from 30 min to 21 days. We used ionized calcium binding adaptor molecule 1 (Iba1) and glial fibrillary acidic protein (GFAP) to identify glial cells in the spinal cord and dorsal root ganglia by immunohistochemistry. Postoperative hypersensitivity was present up to 10 days, but the administration of (-) but not (+) naloxone at 21 days, induced again hyperalgesia. A transient microglia/macrophage and astrocyte activation was present between 30 min and 2 days postoperatively, while increased immunoreactivity in satellite glial cells lasted 21 days. At this time point, (-) naloxone, but not (+) naloxone, increased GFAP in satellite glial cells; conversely, both naloxone stereoisomers similarly increased GFAP in the spinal cord. The report shows for the first time that surgery induces long-lasting morphological changes in astrocytes and satellite cells, involving opioid and toll-like receptors, that could contribute to the development of latent pain sensitization in mice.
Collapse
|
22
|
Learning optimal embedded cascades. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2012; 34:2005-2018. [PMID: 22213762 DOI: 10.1109/tpami.2011.281] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
The problem of automatic and optimal design of embedded object detector cascades is considered. Two main challenges are identified: optimization of the cascade configuration and optimization of individual cascade stages, so as to achieve the best tradeoff between classification accuracy and speed, under a detection rate constraint. Two novel boosting algorithms are proposed to address these problems. The first, RCBoost, formulates boosting as a constrained optimization problem which is solved with a barrier penalty method. The constraint is the target detection rate, which is met at all iterations of the boosting process. This enables the design of embedded cascades of known configuration without extensive cross validation or heuristics. The second, ECBoost, searches over cascade configurations to achieve the optimal tradeoff between classification risk and speed. The two algorithms are combined into an overall boosting procedure, RCECBoost, which optimizes both the cascade configuration and its stages under a detection rate constraint, in a fully automated manner. Extensive experiments in face, car, pedestrian, and panda detection show that the resulting detectors achieve an accuracy versus speed tradeoff superior to those of previous methods.
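The evaluation side of an embedded cascade can be sketched independently of the RCBoost/ECBoost training machinery: each stage accumulates weak-learner scores on top of the previous stages' sum, and a window is rejected as soon as a stage threshold is missed. The stage functions and thresholds below are made-up assumptions, chosen only to show the early-exit control flow.

```python
import numpy as np

def cascade_detect(x, stages):
    """stages: list of (score_fn, threshold). Returns (accepted, stages_used)."""
    score = 0.0
    for k, (score_fn, thr) in enumerate(stages, start=1):
        score += score_fn(x)          # embedded: later stages reuse earlier sums
        if score < thr:
            return False, k           # early rejection saves computation
    return True, len(stages)

# toy stages: weak scores are fixed linear functions of the input
stages = [(lambda x: x[0], -1.0),
          (lambda x: x[1], 0.0),
          (lambda x: x[0] + x[1], 1.0)]

ok, used = cascade_detect(np.array([1.0, 0.5]), stages)        # passes all stages
bad, used_bad = cascade_detect(np.array([-2.0, 0.5]), stages)  # fails stage 1
print(ok, used, bad, used_bad)
```

The speed/accuracy tradeoff the paper optimizes comes precisely from this structure: most negatives exit after one or two cheap stages, so deeper, more accurate stages run only on rare ambiguous windows.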
Collapse
|
23
|
Endoscopic image analysis in semantic space. Med Image Anal 2012; 16:1415-22. [PMID: 22717411 DOI: 10.1016/j.media.2012.04.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2012] [Revised: 04/20/2012] [Accepted: 04/29/2012] [Indexed: 11/20/2022]
Abstract
A novel approach to the design of a semantic, low-dimensional, encoding for endoscopic imagery is proposed. This encoding is based on recent advances in scene recognition, where semantic modeling of image content has gained considerable attention over the last decade. While the semantics of scenes are mainly comprised of environmental concepts such as vegetation, mountains or sky, the semantics of endoscopic imagery are medically relevant visual elements, such as polyps, special surface patterns, or vascular structures. The proposed semantic encoding differs from the representations commonly used in endoscopic image analysis (for medical decision support) in that it establishes a semantic space, where each coordinate axis has a clear human interpretation. It is also shown to establish a connection to Riemannian geometry, which enables principled solutions to a number of problems that arise in both physician training and clinical practice. This connection is exploited by leveraging results from information geometry to solve problems such as (1) recognition of important semantic concepts, (2) semantically-focused image browsing, and (3) estimation of the average-case semantic encoding for a collection of images that share a medically relevant visual detail. The approach can provide physicians with an easily interpretable, semantic encoding of visual content, upon which further decisions, or operations, can be naturally carried out. This is contrary to the prevalent practice in endoscopic image analysis for medical decision support, where image content is primarily captured by discriminative, high-dimensional, appearance features, which possess discriminative power but lack human interpretability.
Collapse
|
24
|
Holistic context models for visual recognition. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2012; 34:902-917. [PMID: 21844625 DOI: 10.1109/tpami.2011.175] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
A novel framework for context modeling based on the probability of co-occurrence of objects and scenes is proposed. The modeling is quite simple, and builds upon the availability of robust appearance classifiers. Images are represented by their posterior probabilities with respect to a set of contextual models, built upon the bag-of-features image representation, through two layers of probabilistic modeling. The first layer represents the image in a semantic space, where each dimension encodes an appearance-based posterior probability with respect to a concept. Due to the inherent ambiguity of classifying image patches, this representation suffers from a certain amount of contextual noise. The second layer enables robust inference in the presence of this noise by modeling the distribution of each concept in the semantic space. A thorough and systematic experimental evaluation of the proposed context modeling is presented. It is shown that it captures the contextual “gist” of natural images. Scene classification experiments show that contextual classifiers outperform their appearance-based counterparts, irrespective of the precise choice and accuracy of the latter. The effectiveness of the proposed approach to context modeling is further demonstrated through a comparison to existing approaches on scene classification and image retrieval, on benchmark data sets. In all cases, the proposed approach achieves superior results.
Collapse
|
25
|
Counting people with low-level features and Bayesian regression. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2012; 21:2160-2177. [PMID: 22020684 DOI: 10.1109/tip.2011.2172800] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
An approach to the problem of estimating the size of inhomogeneous crowds, which are composed of pedestrians that travel in different directions, without using explicit object segmentation or tracking is proposed. Instead, the crowd is segmented into components of homogeneous motion, using the mixture of dynamic-texture motion model. A set of holistic low-level features is extracted from each segmented region, and a function that maps features into estimates of the number of people per segment is learned with Bayesian regression. Two Bayesian regression models are examined. The first is a combination of Gaussian process regression with a compound kernel, which accounts for both the global and local trends of the count mapping but is limited by the real-valued outputs that do not match the discrete counts. We address this limitation with a second model, which is based on a Bayesian treatment of Poisson regression that introduces a prior distribution on the linear weights of the model. Since exact inference is analytically intractable, a closed-form approximation is derived that is computationally efficient and kernelizable, enabling the representation of nonlinear functions. An approximate marginal likelihood is also derived for kernel hyperparameter learning. The two regression-based crowd counting methods are evaluated on a large pedestrian data set, containing very distinct camera views, pedestrian traffic, and outliers, such as bikes or skateboarders. Experimental results show that regression-based counts are accurate regardless of the crowd size, outperforming the count estimates produced by state-of-the-art pedestrian detectors. Results on 2 h of video demonstrate the efficiency and robustness of the regression-based crowd size estimation over long periods of time.
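The first of the two regression models can be sketched compactly: a Gaussian process maps holistic segment features to a count via its posterior mean. The compound kernel and the Bayesian Poisson treatment of the paper are replaced here by a plain RBF kernel and a rounded real-valued output; the one-dimensional "segment area" feature and the counts are toy data.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    # squared-exponential kernel between row-vector sets A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_posterior_mean(Xtr, ytr, Xte, noise=1e-2):
    K = rbf(Xtr, Xtr) + noise * np.eye(len(Xtr))
    return rbf(Xte, Xtr) @ np.linalg.solve(K, ytr)

# toy data: count grows roughly linearly with segment area
Xtr = np.array([[0.0], [1.0], [2.0], [3.0]])
ytr = np.array([0.0, 3.0, 6.0, 9.0])
mean = gp_posterior_mean(Xtr, ytr, np.array([[1.5]]))
count = int(round(mean[0]))   # discrete count from the real-valued GP output
print(mean[0], count)
```

The rounding step is exactly the mismatch the paper's second model addresses: Poisson regression produces discrete counts natively instead of thresholding a real-valued GP output.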
Collapse
|
26
|
Cost-sensitive boosting. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2011; 33:294-309. [PMID: 21193808 DOI: 10.1109/tpami.2010.71] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]
Abstract
A novel framework is proposed for the design of cost-sensitive boosting algorithms. The framework is based on the identification of two necessary conditions for optimal cost-sensitive learning that 1) expected losses must be minimized by optimal cost-sensitive decision rules and 2) empirical loss minimization must emphasize the neighborhood of the target cost-sensitive boundary. It is shown that these conditions enable the derivation of cost-sensitive losses that can be minimized by gradient descent, in the functional space of convex combinations of weak learners, to produce novel boosting algorithms. The proposed framework is applied to the derivation of cost-sensitive extensions of AdaBoost, RealBoost, and LogitBoost. Experimental evidence, with a synthetic problem, standard data sets, and the computer vision problems of face and car detection, is presented in support of the cost-sensitive optimality of the new algorithms. Their performance is also compared to those of various previous cost-sensitive boosting proposals, as well as the popular combination of large-margin classifiers and probability calibration. Cost-sensitive boosting is shown to consistently outperform all other methods.
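A simpler stand-in for the paper's derived losses conveys the basic mechanism: standard AdaBoost on decision stumps, with the example weights of the costlier class scaled up so that its errors dominate the empirical loss. This cost-weighted initialization is a generic heuristic, not the paper's principled cost-sensitive extension; the data and stump bank are toy assumptions.

```python
import numpy as np

def stump_bank(X):
    # weak learners: axis-aligned thresholds, both polarities
    hs = []
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            hs.append(lambda Z, j=j, t=t, s=1: s * np.sign(Z[:, j] - t + 1e-12))
            hs.append(lambda Z, j=j, t=t, s=-1: s * np.sign(Z[:, j] - t + 1e-12))
    return hs

def cs_adaboost(X, y, c_pos=2.0, c_neg=1.0, rounds=10):
    # positives start with weight c_pos, negatives with c_neg
    w = np.where(y > 0, c_pos, c_neg).astype(float); w /= w.sum()
    hs, F = stump_bank(X), []
    for _ in range(rounds):
        errs = [w[h(X) != y].sum() for h in hs]
        k = int(np.argmin(errs)); e = max(errs[k], 1e-12)
        a = 0.5 * np.log((1 - e) / e)
        w *= np.exp(-a * y * hs[k](X)); w /= w.sum()
        F.append((a, hs[k]))
    return lambda Z: np.sign(sum(a * h(Z) for a, h in F))

# toy 1-D separable problem
X = np.array([[0.0], [0.5], [1.5], [2.0]])
y = np.array([-1, -1, 1, 1])
clf = cs_adaboost(X, y, rounds=3)
pred = clf(X)
print(pred)
```

The paper's point is that such ad hoc reweighting lacks the optimality guarantees of losses derived from the two stated conditions; the sketch only illustrates where cost asymmetry enters the boosting loop.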
Collapse
|
27
|
Learning pit pattern concepts for gastroenterological training. MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION : MICCAI ... INTERNATIONAL CONFERENCE ON MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION 2011; 14:280-7. [PMID: 22003710 DOI: 10.1007/978-3-642-23626-6_35] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
In this article, we propose an approach to learn the characteristics of colonic mucosal surface structures, the so-called pit patterns, commonly observed during high-magnification colonoscopy. Since the discrimination of the pit pattern types usually requires an experienced physician, an interesting question is whether we can automatically find a collection of images which most typically show a particular pit pattern characteristic. This is of considerable practical interest, since it is imperative for gastroenterological training to have a representative image set for the textbook descriptions of the pit patterns. Our approach exploits recent research on semantic image retrieval and annotation. This makes it possible to learn a semantic space for the pit pattern concepts, which eventually leads to a very natural formulation of our task.
Collapse
|
28
|
A novel approach to FRUC using discriminant saliency and frame segmentation. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2010; 19:2924-2934. [PMID: 20494851 DOI: 10.1109/tip.2010.2050928] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
Abstract
Motion-compensated frame interpolation (MCFI) is a technique used extensively for increasing the temporal frequency of a video sequence. In order to obtain a high quality interpolation, the motion field between frames must be well-estimated. However, many current techniques for determining the motion are prone to errors in occlusion regions, as well as regions with repetitive structure. We propose an algorithm for improving both the objective and subjective quality of MCFI by refining the motion vector field. We first utilize a discriminant saliency classifier to determine which regions of the motion field are most important to a human observer. These regions are refined using a multistage motion vector refinement (MVR), which promotes motion vector candidates based upon their likelihood given a local neighborhood. For regions which fall below the saliency-threshold, a frame segmentation is used to locate regions of homogeneous color and texture via normalized cuts. Motion vectors are promoted such that each homogeneous region has a consistent motion. Experimental results demonstrate an improvement over previous frame rate up-conversion (FRUC) methods in both objective and subjective picture quality.
Collapse
|
29
|
Biologically plausible saliency mechanisms improve feedforward object recognition. Vision Res 2010; 50:2295-307. [PMID: 20594959 DOI: 10.1016/j.visres.2010.05.034] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2009] [Revised: 02/13/2010] [Accepted: 05/26/2010] [Indexed: 11/25/2022]
Abstract
The biological plausibility of statistical inference and learning, tuned to the statistics of natural images, is investigated. It is shown that a rich family of statistical decision rules, confidence measures, and risk estimates, can be implemented with the computations attributed to the standard neurophysiological model of V1. In particular, different statistical quantities can be computed through simple re-arrangement of lateral divisive connections, non-linearities, and pooling. It is then shown that a number of proposals for the measurement of visual saliency can be implemented in a biologically plausible manner, through such re-arrangements. This enables the implementation of biologically plausible feedforward object recognition networks that include explicit saliency models. The potential of combined attention and recognition is illustrated by replacing the first layer of the HMAX architecture with a saliency network. Various saliency measures are compared, to investigate whether (1) saliency can substantially benefit visual recognition and (2) the benefits depend on the specific saliency mechanisms implemented. Experimental evaluation shows that saliency does indeed enhance recognition, but the gains are not independent of the saliency mechanisms. Best results are obtained with top-down mechanisms that equate saliency to classification confidence.
Collapse
|
30
|
A decision-theoretic saliency, its biological plausibility and implications for pre-attentive vision. J Vis 2010. [DOI: 10.1167/7.9.953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open
|
31
|
Spatiotemporal saliency in dynamic scenes. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2010; 32:171-177. [PMID: 19926907 DOI: 10.1109/tpami.2009.112] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/28/2023]
Abstract
A spatiotemporal saliency algorithm based on a center-surround framework is proposed. The algorithm is inspired by biological mechanisms of motion-based perceptual grouping and extends a discriminant formulation of center-surround saliency previously proposed for static imagery. Under this formulation, the saliency of a location is equated to the power of a predefined set of features to discriminate between the visual stimuli in a center and a surround window, centered at that location. The features are spatiotemporal video patches and are modeled as dynamic textures, to achieve a principled joint characterization of the spatial and temporal components of saliency. The combination of discriminant center-surround saliency with the modeling power of dynamic textures yields a robust, versatile, and fully unsupervised spatiotemporal saliency algorithm, applicable to scenes with highly dynamic backgrounds and moving cameras. The related problem of background subtraction is treated as the complement of saliency detection, by classifying nonsalient (with respect to appearance and motion dynamics) points in the visual field as background. The algorithm is tested for background subtraction on challenging sequences, and shown to substantially outperform various state-of-the-art techniques. Quantitatively, its average error rate is almost half that of the closest competitor.
Collapse
|
32
|
Layered dynamic textures. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2009; 31:1862-1879. [PMID: 19696455 DOI: 10.1109/tpami.2009.110] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/28/2023]
Abstract
A novel video representation, the layered dynamic texture (LDT), is proposed. The LDT is a generative model, which represents a video as a collection of stochastic layers of different appearance and dynamics. Each layer is modeled as a temporal texture sampled from a different linear dynamical system. The LDT model includes these systems, a collection of hidden layer assignment variables (which control the assignment of pixels to layers), and a Markov random field prior on these variables (which encourages smooth segmentations). An EM algorithm is derived for maximum-likelihood estimation of the model parameters from a training video. It is shown that exact inference is intractable, a problem which is addressed by the introduction of two approximate inference procedures: a Gibbs sampler and a computationally efficient variational approximation. The trade-off between the quality of the two approximations and their complexity is studied experimentally. The ability of the LDT to segment videos into layers of coherent appearance and dynamics is also evaluated, on both synthetic and natural videos. These experiments show that the model possesses an ability to group regions of globally homogeneous, but locally heterogeneous, stochastic dynamics currently unparalleled in the literature.
Collapse
|
33
|
Discriminant saliency, the detection of suspicious coincidences, and applications to visual recognition. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2009; 31:989-1005. [PMID: 19372605 DOI: 10.1109/tpami.2009.27] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
A discriminant formulation of top-down visual saliency, intrinsically connected to the recognition problem, is proposed. The new formulation is shown to be closely related to a number of classical principles for the organization of perceptual systems, including infomax, inference by detection of suspicious coincidences, classification with minimal uncertainty, and classification with minimum probability of error. The implementation of these principles with computational parsimony, by exploitation of the statistics of natural images, is investigated. It is shown that Barlow's principle of inference by the detection of suspicious coincidences enables computationally efficient saliency measures which are nearly optimal for classification. This principle is adopted for the solution of the two fundamental problems in discriminant saliency, feature selection and saliency detection. The resulting saliency detector is shown to have a number of interesting properties, and to act effectively as a focus of attention mechanism for the selection of interest points according to their relevance for visual recognition. Experimental evidence shows that the selected points have good performance with respect to 1) the ability to localize objects embedded in significant amounts of clutter, 2) the ability to capture information relevant for image classification, and 3) the richness of the set of visual attributes that can be considered salient.
Collapse
|
34
|
Decision-theoretic saliency: computational principles, biological plausibility, and implications for neurophysiology and psychophysics. Neural Comput 2009; 21:239-71. [PMID: 19210172 DOI: 10.1162/neco.2009.11-06-391] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
A decision-theoretic formulation of visual saliency, first proposed for top-down processing (object recognition) (Gao & Vasconcelos, 2005a), is extended to the problem of bottom-up saliency. Under this formulation, optimality is defined in the minimum probability of error sense, under a constraint of computational parsimony. The saliency of the visual features at a given location of the visual field is defined as the power of those features to discriminate between the stimulus at the location and a null hypothesis. For bottom-up saliency, this is the set of visual features that surround the location under consideration. Discrimination is defined in an information-theoretic sense and the optimal saliency detector derived for a class of stimuli that complies with known statistical properties of natural images. It is shown that under the assumption that saliency is driven by linear filtering, the optimal detector consists of what is usually referred to as the standard architecture of V1: a cascade of linear filtering, divisive normalization, rectification, and spatial pooling. The optimal detector is also shown to replicate the fundamental properties of the psychophysics of saliency: stimulus pop-out, saliency asymmetries for stimulus presence versus absence, disregard of feature conjunctions, and Weber's law. Finally, it is shown that the optimal saliency architecture can be applied to the solution of generic inference problems. In particular, for the class of stimuli studied, it performs the three fundamental operations of statistical inference: assessment of probabilities, implementation of Bayes decision rule, and feature selection.
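The cascade the paper identifies with the "standard architecture of V1" can be sketched directly: linear filtering, divisive normalization across the filter bank, half-wave rectification, and spatial pooling. The filter bank, normalization constant, and test image below are toy choices to show the four stages, not the paper's tuned model.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def v1_response(img, filters, sigma=0.1):
    k = filters.shape[-1]
    # 1) linear filtering (valid cross-correlation with each filter)
    patches = sliding_window_view(img, (k, k))              # (H', W', k, k)
    resp = np.einsum('xyij,fij->fxy', patches, filters)     # (F, H', W')
    # 2) divisive normalization across the filter bank
    norm = resp / (sigma**2 + (resp**2).sum(0, keepdims=True))
    # 3) half-wave rectification
    rect = np.maximum(norm, 0.0)
    # 4) spatial pooling (mean over space per channel)
    return rect.mean(axis=(1, 2))

# two toy derivative filters: vertical-edge and horizontal-edge selective
f_v = np.array([[-1, 0, 1]] * 3, dtype=float)
f_h = f_v.T
filters = np.stack([f_v, f_h])

img = np.zeros((8, 8)); img[:, 4:] = 1.0    # image containing a vertical edge
out = v1_response(img, filters)
print(out)   # channel 0 (vertical-edge filter) should respond more strongly
```

Under the paper's reading, rearrangements of exactly these components compute the statistical quantities (probabilities, Bayes decisions, feature selection) that define optimal saliency.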
Collapse
|
35
|
Natural image statistics and low-complexity feature selection. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2009; 31:228-244. [PMID: 19110490 DOI: 10.1109/tpami.2008.77] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/27/2023]
Abstract
Low-complexity feature selection is analyzed in the context of visual recognition. It is hypothesized that high-order dependences of bandpass features contain little information for discrimination of natural images. This hypothesis is characterized formally by the introduction of the concepts of conjunctive interference and decomposability order of a feature set. Necessary and sufficient conditions for the feasibility of low-complexity feature selection are then derived in terms of these concepts. It is shown that the intrinsic complexity of feature selection is determined by the decomposability order of the feature set and not its dimension. Feature selection algorithms are then derived for all levels of complexity and are shown to be approximated by existing information-theoretic methods, which they consistently outperform. The new algorithms are also used to objectively test the hypothesis of low decomposability order through comparison of classification performance. It is shown that, for image classification, the gain of modeling feature dependencies has strongly diminishing returns: best results are obtained under the assumption of decomposability order 1. This suggests a generic law for bandpass features extracted from natural images: that the effect, on the dependence of any two features, of observing any other feature is constant across image classes.
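The decomposability-order-1 regime the paper finds sufficient corresponds to the simplest selection rule: rank each feature by its individual mutual information with the class label, ignoring feature dependences. A histogram-based sketch of that rule follows; the binning scheme and the toy features are assumptions for illustration.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """MI between a continuous feature x and discrete labels y, via histograms."""
    edges = np.histogram_bin_edges(x, bins=bins)
    q = np.digitize(x, edges[1:-1])                 # quantized feature, 0..bins-1
    labels = np.unique(y)
    joint = np.zeros((bins, len(labels)))
    for c, yc in enumerate(labels):
        joint[:, c] = np.bincount(q[y == yc], minlength=bins)
    joint /= joint.sum()
    px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 500)
informative = y + 0.3 * rng.standard_normal(500)   # shifts with the class
noise = rng.standard_normal(500)                   # independent of the class
scores = [mutual_information(f, y) for f in (informative, noise)]
print(scores)   # the informative feature should score higher
```

Order-1 selection scores features independently, which is exactly why it is low-complexity: no joint densities over feature pairs or tuples are ever estimated.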
Collapse
|
37
|
On the plausibility of the discriminant center-surround hypothesis for visual saliency. J Vis 2008; 8:13.1-18. [PMID: 19146246 DOI: 10.1167/8.7.13] [Citation(s) in RCA: 205] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2007] [Accepted: 03/17/2008] [Indexed: 11/24/2022] Open
Abstract
It has been suggested that saliency mechanisms play a role in perceptual organization. This work evaluates the plausibility of a recently proposed generic principle for visual saliency: that all saliency decisions are optimal in a decision-theoretic sense. The discriminant saliency hypothesis is combined with the classical assumption that bottom-up saliency is a center-surround process to derive a (decision-theoretic) optimal saliency architecture. Under this architecture, the saliency of each image location is equated to the discriminant power of a set of features with respect to the classification problem that opposes stimuli at center and surround. The optimal saliency detector is derived for various stimulus modalities, including intensity, color, orientation, and motion, and shown to make accurate quantitative predictions of various psychophysics of human saliency for both static and motion stimuli. These include some classical nonlinearities of orientation and motion saliency and a Weber law that governs various types of saliency asymmetries. The discriminant saliency detectors are also applied to various saliency problems of interest in computer vision, including the prediction of human eye fixations on natural scenes, motion-based saliency in the presence of ego-motion, and background subtraction in highly dynamic scenes. In all cases, the discriminant saliency detectors outperform previously proposed methods from both the saliency and the general computer vision literatures.
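The discriminant center-surround principle can be sketched with a common surrogate: the saliency of a location is the discriminability of feature responses in a center window versus its surround, measured here by the KL divergence between intensity histograms. The window radii, bin count, and pop-out stimulus are assumptions; the paper's detector uses richer feature channels and a different discriminant measure.

```python
import numpy as np

def kl(p, q, eps=1e-9):
    p, q = p + eps, q + eps
    p, q = p / p.sum(), q / q.sum()
    return float((p * np.log(p / q)).sum())

def saliency(img, r_c=2, r_s=6, bins=8):
    H, W = img.shape
    S = np.zeros_like(img, dtype=float)
    edges = np.linspace(img.min(), img.max() + 1e-9, bins + 1)
    for i in range(r_s, H - r_s):
        for j in range(r_s, W - r_s):
            c = img[i - r_c:i + r_c + 1, j - r_c:j + r_c + 1]   # center window
            s = img[i - r_s:i + r_s + 1, j - r_s:j + r_s + 1]   # surround window
            hc, _ = np.histogram(c, bins=edges)
            hs, _ = np.histogram(s, bins=edges)
            S[i, j] = kl(hc.astype(float), hs.astype(float))
    return S

# toy stimulus: a bright square "pops out" of a dark background
img = np.zeros((24, 24)); img[10:14, 10:14] = 1.0
S = saliency(img)
print(S[12, 12], S[6, 6])   # on-target vs background saliency
```

Locations whose center statistics match their surround get near-zero saliency, which is also how the paper casts background subtraction as the complement of saliency detection.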
Collapse
|
38
|
Modeling, clustering, and segmenting video with mixtures of dynamic textures. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2008; 30:909-926. [PMID: 18369258 DOI: 10.1109/tpami.2007.70738] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/26/2023]
Abstract
A dynamic texture is a spatio-temporal generative model for video, which represents video sequences as observations from a linear dynamical system. This work studies the mixture of dynamic textures, a statistical model for an ensemble of video sequences that is sampled from a finite collection of visual processes, each of which is a dynamic texture. An expectation-maximization (EM) algorithm is derived for learning the parameters of the model, and the model is related to previous works in linear systems, machine learning, time-series clustering, control theory, and computer vision. Through experimentation, it is shown that the mixture of dynamic textures is a suitable representation for both the appearance and dynamics of a variety of visual processes that have traditionally been challenging for computer vision (e.g. fire, steam, water, vehicle and pedestrian traffic, etc.). When compared with state-of-the-art methods in motion segmentation, including both temporal texture methods and traditional representations (e.g. optical flow or other localized motion representations), the mixture of dynamic textures achieves superior performance in the problems of clustering and segmenting video of such processes.
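A single dynamic texture x_{t+1} = A x_t + v_t, y_t = C x_t + w_t can be fit by the standard SVD-based procedure: PCA of the stacked frames gives the observation matrix C, and A is a least-squares fit on the resulting state trajectory. The mixture model of the paper instead learns several such systems jointly with EM; the synthetic sequence below is a toy assumption.

```python
import numpy as np

def fit_dynamic_texture(Y, n_states):
    """Y: (pixels, frames). Returns (A, C, X) for the learned LDS."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    C = U[:, :n_states]                          # spatial appearance basis
    X = s[:n_states, None] * Vt[:n_states]       # state trajectory over time
    # least-squares transition matrix: A = X[:, 1:] X[:, :-1]^+
    A = X[:, 1:] @ np.linalg.pinv(X[:, :-1])
    return A, C, X

# synthetic video: 100 pixels driven by a shared 2-D oscillating state
rng = np.random.default_rng(1)
t = np.arange(60)
basis = rng.standard_normal((100, 2))
states = np.stack([np.cos(0.3 * t), np.sin(0.3 * t)])
Y = basis @ states
A, C, X = fit_dynamic_texture(Y, n_states=2)
# one-step prediction error of the learned system should be near zero here
err = np.linalg.norm(X[:, 1:] - A @ X[:, :-1]) / np.linalg.norm(X[:, 1:])
print(err)
```

Because the toy sequence is noiseless and exactly rank 2, the recovered state obeys exact linear dynamics; with real video the same fit yields an approximate A, and the mixture's EM step soft-assigns sequences among several candidate systems.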
Collapse
|
39
|
Supervised learning of semantic classes for image annotation and retrieval. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2007; 29:394-410. [PMID: 17224611 DOI: 10.1109/tpami.2007.61] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/13/2023]
Abstract
A probabilistic formulation for semantic image annotation and retrieval is proposed. Annotation and retrieval are posed as classification problems where each class is defined as the group of database images labeled with a common semantic label. It is shown that, by establishing this one-to-one correspondence between semantic labels and semantic classes, a minimum probability of error annotation and retrieval are feasible with algorithms that are 1) conceptually simple, 2) computationally efficient, and 3) do not require prior semantic segmentation of training images. In particular, images are represented as bags of localized feature vectors, a mixture density estimated for each image, and the mixtures associated with all images annotated with a common semantic label pooled into a density estimate for the corresponding semantic class. This pooling is justified by a multiple instance learning argument and performed efficiently with a hierarchical extension of expectation-maximization. The benefits of the supervised formulation over the more complex, and currently popular, joint modeling of semantic label and visual feature distributions are illustrated through theoretical arguments and extensive experiments. The supervised formulation is shown to achieve higher accuracy than various previously published methods at a fraction of their computational cost. Finally, the proposed method is shown to be fairly robust to parameter tuning.
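The one-class-per-label formulation admits a compact sketch: pool the feature vectors of all images sharing a label into one class-conditional density, then annotate a new image by the class with the highest likelihood over its bag of features. The paper pools Gaussian mixtures with hierarchical EM; here each class density is simplified to a single Gaussian, and the 2-D "localized features" and labels are toy assumptions.

```python
import numpy as np

def fit_class_densities(feats_by_label):
    # one Gaussian per semantic class, pooled over that class's images
    return {lab: (F.mean(0), np.cov(F.T) + 1e-6 * np.eye(F.shape[1]))
            for lab, F in feats_by_label.items()}

def log_gauss(X, mu, S):
    d = X - mu
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * (np.sum(d @ np.linalg.inv(S) * d, 1)
                   + logdet + len(mu) * np.log(2 * np.pi))

def annotate(bag, densities):
    """bag: (n_patches, d) features of one image; returns the best label."""
    scores = {lab: log_gauss(bag, mu, S).sum() for lab, (mu, S) in densities.items()}
    return max(scores, key=scores.get)

rng = np.random.default_rng(2)
train = {"sky":   rng.standard_normal((200, 2)) + [0, 5],
         "water": rng.standard_normal((200, 2)) + [5, 0]}
densities = fit_class_densities(train)
test_bag = rng.standard_normal((30, 2)) + [0, 5]     # resembles "sky"
print(annotate(test_bag, densities))
```

Note what is absent: no joint model of labels and features, and no segmentation of training images; the class density is estimated directly from pooled per-image features, which is the source of the efficiency the abstract claims.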
Collapse
|
40
|
Prognostic markers in treated hypertensive diabetic patients. 28-month follow-up. Rev Port Cardiol 2004; 23:1119-35. [PMID: 15587573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/01/2023] Open
Abstract
INTRODUCTION Diabetes mellitus has a prevalence of about 6 to 10% in western populations, with a rising tendency due to inappropriate increases in calorie intake and decreased physical activity. In diabetic patients hypertension (HT) has a prevalence of over 60%, and cerebro- and cardiovascular disease is responsible for two-thirds of the mortality in these patients. PATIENTS AND METHODS We prospectively and consecutively studied 97 patients (age 63 +/- 8 years; range 39-89) with treated type 2 diabetes and HT. The objective was to identify cardio- and cerebrovascular risk markers. The majority of the patients were evaluated by clinical and laboratory examination, 24 h ambulatory blood pressure monitoring (ABPM), HbA1c, total cholesterol, HDL-C and triglycerides, microalbuminuria, echocardiogram (left ventricular mass index) and carotid-femoral pulse wave velocity. The patients were later re-evaluated using the same diagnostic methodology after a mean follow-up of 28 months. RESULTS The population was at high risk of cardio- and cerebrovascular disease (60% dyslipidemic, 39% with previous cerebro- or cardiovascular accidents, 73% non-dippers, 69% with decreased vascular distensibility [<12 m/sec] and 35% with microalbuminuria) despite treatment. Diabetes was controlled in only 55% of cases and blood pressure (BP) in 10%, although by ABPM it was controlled in 40% of cases. Simultaneous control of diabetes and HT was present in only one third of the patients. At the end of follow-up these values had not changed significantly, which can only be considered positive with respect to the reduction in microalbuminuria (due to ACEIs and AIIRAs). Thirty cardio- and cerebrovascular events occurred (5 deaths), related to inadequate control of diabetes at the initial evaluation (p=0.012), night-time systolic BP (SBP) and non-dipper status (p=0.02), and vascular distensibility at the end of the study (p=0.03). 
On stepwise multiple linear regression analysis, the only variable significantly associated with cardio- and cerebrovascular mortality and morbidity was night-time SBP. CONCLUSIONS Overall analysis of the data confirmed the elevated risk of these patients and the importance of more frequent and aggressive control. The study also confirms the importance of evaluation by ABPM in these patients, which may lead to more efficacious, tailor-made treatment.
Collapse
|
41
|
The Kullback-Leibler Kernel as a Framework for Discriminant and Localized Representations for Visual Recognition. LECTURE NOTES IN COMPUTER SCIENCE 2004. [DOI: 10.1007/978-3-540-24672-5_34] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]
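The Kullback-Leibler kernel of this paper compares two generative models through their symmetrized KL divergence. A minimal sketch for diagonal Gaussians, where the divergence has a closed form, might look as follows; the scale parameter `a` and the restriction to diagonal Gaussians are assumptions of this sketch, not the paper's general construction.

```python
import numpy as np

def kl_gauss(mu0, var0, mu1, var1):
    """KL divergence KL(p0 || p1) between two diagonal Gaussians."""
    return 0.5 * np.sum(var0 / var1 + (mu1 - mu0) ** 2 / var1
                        - 1.0 + np.log(var1 / var0))

def kl_kernel(p, q, a=1.0):
    """Symmetrized, exponentiated-KL kernel between two Gaussian models
    p = (mean, variance) and q = (mean, variance)."""
    d = kl_gauss(*p, *q) + kl_gauss(*q, *p)
    return np.exp(-a * d)

p = (np.array([0.0, 0.0]), np.array([1.0, 1.0]))
q = (np.array([0.5, 0.0]), np.array([1.0, 2.0]))
print(kl_kernel(p, p))   # 1.0 — identical models give maximal similarity
```

Because the symmetrized divergence is zero only for identical models and grows as the models separate, the kernel value decays from 1 toward 0, which is what makes it usable as a similarity measure inside a discriminant classifier.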
|
42
|
Does orthostatic hypotension predict the occurrence of nocturnal arterial hypertension in the elderly patient? Rev Port Cardiol 2003; 22:607-15. [PMID: 12940176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/04/2023] Open
Abstract
OBJECTIVE To determine whether the presence of orthostatic hypotension--which, in this age-group, could be due to varying degrees of autonomic dysfunction--is an indicator of nocturnal arterial hypertension. PATIENTS Between 1999 and 2001 we prospectively and consecutively studied 93 elderly patients with untreated (office) arterial hypertension, 65 (70%) of whom were true hypertensives according to 24 h ambulatory blood pressure monitoring (ABPM). INTERVENTIONS The patients were studied by clinical examination, including blood pressure (BP) measurement in the dorsal decubitus and orthostatic positions, 24 h ABPM, evaluation of vascular distensibility by carotid-femoral pulse wave velocity (PWV) and Doppler echocardiography. In particular, we analyzed the ambulatory behavior of BP, relating the variation of systolic blood pressure (SBP) during orthostatism to non-dipper status for SBP and to absolute nocturnal SBP values. MEASUREMENTS AND RESULTS The results indicated that a greater decrease of blood pressure with orthostatism corresponded to a greater probability of nocturnal hypertension (p = 0.005) and of non-dipper status (p = 0.02). These results agree with those subsequently found by other authors (Kario et al., 2002). CONCLUSIONS Thus, by means of a simple clinical maneuver that should always be performed in an elderly hypertensive patient, we can suspect the presence of nocturnal hypertension--a high-risk cardiovascular situation--and use this information to help select patients for 24 hour ABPM.
Collapse
|
43
|
[Pulmonary hypertension. Topic review. 2 clinical cases]. Rev Port Cardiol 2002; 21:173-80. [PMID: 11963287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/24/2023] Open
Abstract
The authors describe two cases of pulmonary hypertension (PHT). In the first case it is secondary to pulmonary thromboembolism, a frequent and serious occurrence which is well known as a cause of PHT. In the second case the PHT is probably secondary to infection by the human immunodeficiency virus, also a serious and frequent condition in clinical practice but one only recently identified as a cause of PHT. Formerly these patients were considered to be suffering from primary PHT. The authors give a brief review of the literature on pulmonary hypertension.
Collapse
|
44
|
[Is ambulatory blood pressure monitoring reliable in hypertensive patients with atrial fibrillation?]. Rev Port Cardiol 2001; 20:647-50. [PMID: 11529254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/21/2023] Open
Abstract
Atrial fibrillation (AF) is commonly seen in patients (pts) with systemic hypertension. They are usually excluded from ambulatory blood pressure monitoring (ABPM) because its accuracy is unknown. The aim of our study was to determine whether ABPM can be used to assess 24 hour BP in pts with AF. We included hypertensive pts with chronic (> 6 months) AF, controlled heart rate (60-100 c.p.m.), under therapy, and also hypertensive pts in sinus rhythm (control group--CG). They were submitted to 24 hour ABPM (Spacelabs 90207). Manual BP with a standard mercury sphygmomanometer was taken during 3 visits (office BP) and on the day of ambulatory monitoring. Simultaneous measurements with a T-tube were also performed. Thirty pts with chronic AF (63% males), mean age 73 +/- 8 years (52-85), and 18 pts in sinus rhythm (CG) were studied. Age, gender, office BP, ambulatory BP and the proportion of successful measurements were similar in the 2 groups. In the CG, systolic and diastolic office BP did not differ from daytime ambulatory BP (148 +/- 14/84 +/- 7 vs 138 +/- 18/76 +/- 11 mmHg), and the same was seen in pts in AF (table). In this group, only the systolic BP taken immediately before the ambulatory device was put on differed significantly from daytime ambulatory BP (148 +/- 21 vs 137 +/- 19 mmHg, p = 0.04). The proportion of successful measurements in the AF group was 94 +/- 8% (65-98), with 93% > 80%. In 64 simultaneous measurements the differences were 6 +/- 5 and 5 +/- 5 mmHg for systolic and diastolic BP. Casual and ambulatory heart rates were also similar in the two groups (76 +/- 7/76 +/- 12--AF group; 78 +/- 10/78 +/- 8--control group). In conclusion, this study demonstrates that ABPM can be used to assess BP in patients with atrial fibrillation. There was a high percentage of successful recordings (93%). As in patients in sinus rhythm, there was no significant difference between mean office blood pressure and daytime ambulatory blood pressure.
Collapse
|
45
|
Long-term (four years) follow-up of patients with treated nocturnal hypertension assessed by ambulatory blood pressure monitoring. Rev Port Cardiol 2001; 20:135-50; discussion 153-4. [PMID: 11293873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/19/2023] Open
Abstract
STUDY OBJECTIVE Nocturnal hypertension (NH) is an independent risk factor for cardiovascular morbidity and mortality (M-M). However, an inappropriate decrease in diastolic BP during the night significantly increases morbidity. There are no prospective studies on the long-term consequences for M-M in treated NH. We accordingly studied M-M in 107 consecutive patients with treated NH, assessed by ambulatory blood pressure monitoring (ABPM), during a four-year follow-up. PATIENTS AND METHODS Of the initial 107 patients, six died (5 from brain or cardiovascular causes). In 65 patients it was possible to repeat the ABPM during the follow-up period. These were class I-II hypertensive patients (JNC IV), aged 62 +/- 10 years, 56 of them male, observed before and after starting treatment over a four-year follow-up period. We considered age, sex, body mass index, previous cerebral and cardiovascular accidents, type and number of drugs administered, smoking habits, plasma cholesterol, glycemia, and casual and ambulatory blood pressure (24 hr, 6 am-10 am, 10 pm-6 am and pulse pressure) before and after follow-up, dipper status and the period of follow-up. RESULTS The patients who died were older and had a significantly higher systolic blood pressure than the survivors. We considered two groups: with (A - n = 18) or without (B - n = 47) cerebral and cardiac morbidity. Group A had more previous cerebral and cardiovascular accidents (p = 0.05), more intensive treatment (p = 0.02), and a greater fall in diastolic blood pressure (DBP) during the night, in both absolute and percentage terms, after treatment, than group B. However, after regression analysis, the only independent risk marker differentiating the two groups was the percentage fall in DBP after treatment (dipper phenomenon) (p = 0.01). 
CONCLUSIONS In 65 treated hypertensive (NH) patients assessed by ABPM before and after treatment (four-year follow-up) we identified a group with cerebral and cardiovascular morbidity. These patients, in contrast with the group without morbidity, had more previous cerebral and cardiovascular accidents, were more intensively treated, and had a greater fall in diastolic blood pressure after therapy (in absolute and percentage values). However, after regression analysis, the diastolic nocturnal blood pressure dipper phenomenon after treatment was the only risk marker associated with morbidity. In such cases it is possible that treatment guided by ABPM could decrease morbidity.
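The dipper phenomenon referred to above is the percentage nocturnal fall in blood pressure. A small sketch using the conventional 10-20% cut-offs follows; the abstract itself does not state numeric cut-offs, so these thresholds (and the function name) are assumptions of this sketch.

```python
def dipper_status(day_bp, night_bp):
    """Percentage nocturnal BP fall and the corresponding conventional
    category: <10% fall is 'non-dipper', 10-20% 'dipper', >20% an
    'extreme dipper' (the excessive fall the abstract flags as harmful).
    Cut-offs are the conventional ones, not taken from this study."""
    fall = 100.0 * (day_bp - night_bp) / day_bp
    if fall < 10.0:
        return fall, "non-dipper"
    if fall <= 20.0:
        return fall, "dipper"
    return fall, "extreme dipper"

fall, status = dipper_status(day_bp=150.0, night_bp=140.0)
```

A 150/140 mmHg day/night pair gives a fall of about 6.7%, i.e. a non-dipper pattern.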
Collapse
|
46
|
Statistical models of video structure for content analysis and characterization. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2000; 9:3-19. [PMID: 18255369 DOI: 10.1109/83.817595] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/25/2023]
Abstract
Content structure plays an important role in the understanding of video. In this paper, we argue that knowledge about structure can be used both as a means to improve the performance of content analysis and to extract features that convey semantic information about the content. We introduce statistical models for two important components of this structure, shot duration and activity, and demonstrate the usefulness of these models with two practical applications. First, we develop a Bayesian formulation for the shot segmentation problem that is shown to extend the standard thresholding model in an adaptive and intuitive way, leading to improved segmentation accuracy. Second, by applying the transformation into the shot duration/activity feature space to a database of movie clips, we also illustrate how the Bayesian model captures semantic properties of the content. We suggest ways in which these properties can be used as a basis for intuitive content-based access to movie libraries.
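The adaptive-threshold idea can be caricatured in a few lines: a prior on shot duration makes a boundary unlikely immediately after a cut, so the detection threshold starts high and decays as the shot ages. The sketch below is not the paper's Bayesian formulation; the Weibull hazard, its parameters, and the mapping from prior odds to a threshold offset are ad-hoc assumptions for illustration.

```python
import math

def adaptive_threshold(tau, base=0.5, lam=40.0, k=2.0):
    """Threshold on frame dissimilarity that adapts to shot duration:
    a Weibull-style hazard makes a boundary unlikely right after a cut,
    so the threshold starts high and decays as the shot ages (tau frames)."""
    hazard = (k / lam) * (tau / lam) ** (k - 1)   # boundary prior odds grow with tau
    return base - 0.1 * math.log(hazard + 1e-9)

def segment(dissim):
    """Mark shot boundaries in a frame-dissimilarity trace using the
    duration-adaptive rule instead of a fixed global threshold."""
    boundaries, last = [], 0
    for t, d in enumerate(dissim):
        tau = t - last + 1
        if d > adaptive_threshold(tau):
            boundaries.append(t)
            last = t
    return boundaries

# toy trace: two clear cuts at frames 50 and 120 over low baseline noise
trace = [0.1] * 200
trace[50] = trace[120] = 3.0
print(segment(trace))   # → [50, 120]
```

The payoff over plain thresholding is that a single spurious spike right after a detected cut faces a raised threshold, while a genuine cut late in a long shot is accepted more readily.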
Collapse
|
47
|
[Arterial hypertension difficult to control in the elderly patient. The significance of the "white coat effect"]. Rev Port Cardiol 1999; 18:897-906. [PMID: 10590654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/14/2023] Open
Abstract
OBJECTIVE Previous studies have revealed a high prevalence of the white coat effect among treated hypertensive patients. The difference between clinic and ambulatory blood pressure seems to be more pronounced in older patients. This abnormal rise in blood pressure (BP) in treated hypertensive patients can lead to a misdiagnosis of refractory hypertension. Clinicians may increase the dosage of antihypertensive drugs or add further medication, increasing costs and producing harmful secondary effects. Our aim was to evaluate the discrepancy between clinic and ambulatory blood pressure in hypertensive patients on adequate antihypertensive treatment and to analyse the magnitude of the white coat effect and its relationship with age, gender, clinic blood pressure and cardiovascular or cerebrovascular events. POPULATION AND METHODS We included 50 consecutive moderate/severe hypertensive patients, 58% female, mean age 68 +/- 10 years (48-88), with clinic blood pressure (3 visits) > 160/90 mm Hg, on adequate antihypertensive treatment for > 2 months, with good compliance and without pseudohypertension. The patients were submitted to clinical evaluation (risk score), clinic blood pressure and heart rate measurement, electrocardiogram and ambulatory blood pressure monitoring (Spacelabs 90207). Systolic and diastolic 24 hour, daytime and night-time blood pressure and heart rate were recorded. We considered patients above 60 years of age (80%) to be elderly. We defined the white coat effect as a difference between systolic clinic blood pressure and daytime systolic blood pressure > 20 mm Hg or a difference between diastolic clinic blood pressure and daytime diastolic blood pressure > 10 mm Hg, and a severe white coat effect as systolic clinic blood pressure minus daytime systolic blood pressure > 40 mm Hg or diastolic clinic blood pressure minus daytime diastolic blood pressure > 20 mm Hg. The patients were asked to take blood pressure measurements out of hospital (at home or by a nurse). 
The majority of them underwent an echocardiographic examination. RESULTS Clinic blood pressure was significantly different from daytime ambulatory blood pressure (189 +/- 19/96 +/- 13 vs 139 +/- 18/78 +/- 10 mm Hg, p < 0.005). The magnitude of the white coat effect was 50 +/- 17 (8-84) mm Hg for systolic blood pressure and 18 +/- 11 (-9 to 41) mm Hg for diastolic blood pressure. A marked white coat effect (> 40 mm Hg) was observed in 78% of our hypertensive patients. In elderly people (> 60 years), this difference was greater (50 +/- 15 vs 45 +/- 21 mm Hg), though not significantly. We did not find significant differences between sexes (males 54 +/- 16 mm Hg vs 48 +/- 17 mm Hg). In 66% of these patients, ambulatory blood pressure monitoring showed daytime blood pressure values < 140/90 mm Hg, so refractory hypertension was excluded. In 8 patients (18%) there was a previous history of ischemic cardiovascular or cerebrovascular disease, and all of them had a marked difference between systolic clinic and daytime blood pressure (> 40 mm Hg). Blood pressure measurements performed out of hospital did not help clinicians to identify this phenomenon, as only 16% were similar (+/- 5 mm Hg) to ambulatory daytime values. CONCLUSIONS Some hypertensive patients on adequate antihypertensive treatment have a significant difference between clinic and ambulatory blood pressure measurements. This difference (white coat effect) is greater in elderly patients and in men (NS). Although clinic blood pressure values were significantly increased, the majority of these patients had controlled blood pressure on ambulatory monitoring. In this population, ambulatory blood pressure monitoring was of great value in identifying a misdiagnosis of refractory hypertension, which could lead to improper decisions in the therapeutic management of elderly patients (increasing treatment) and compromise cerebrovascular or coronary circulation.
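The study's grading rule is just a pair of thresholds on the clinic-minus-daytime difference, which can be written directly (the function name and return labels are illustrative, not from the paper):

```python
def white_coat_effect(clinic_sbp, clinic_dbp, day_sbp, day_dbp):
    """Grade the white coat effect with the thresholds used in the study:
    clinic minus daytime ambulatory BP > 20/10 mm Hg (systolic/diastolic)
    defines a white coat effect, > 40/20 mm Hg a severe one."""
    ds = clinic_sbp - day_sbp
    dd = clinic_dbp - day_dbp
    if ds > 40 or dd > 20:
        return "severe"
    if ds > 20 or dd > 10:
        return "present"
    return "absent"

print(white_coat_effect(189, 96, 139, 78))   # → "severe" (the study's mean values)
```

Applied to the study's mean readings (clinic 189/96 vs daytime 139/78 mm Hg), the systolic difference of 50 mm Hg alone already puts the average patient in the "severe" band, matching the 78% figure reported for a marked effect.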
Collapse
|
48
|
["White-coat hypertension": variation of normality or of hypertension?]. Rev Port Cardiol 1998; 17:505-12. [PMID: 9677828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023] Open
Abstract
INTRODUCTION Previous studies have demonstrated a high prevalence of "white coat" hypertension (20%), but whether it implies an increase in cardiovascular risk is still controversial. PATIENTS Between 1992 and 1995 we prospectively studied 175 untreated hypertensive patients aged over 18 years (V Joint National Committee stage I-II), and 91 controls. DESIGN AND METHODS The subjects were submitted to clinical evaluation, ambulatory blood pressure monitoring, 24-hour Holter monitoring, signal-averaged ECG, echocardiography/Doppler and ergometry. "White coat" hypertension was defined as mean daytime (6.00-22.00 h) ambulatory blood pressure < 136/87 mm Hg (males) and < 131/86 mm Hg (females). RESULTS "White coat" hypertension was present in 29 patients (18%). "White coat" hypertension patients had the same prevalence of smoking, family history of cardiovascular disease, abnormal ECG and retinopathy (> Keith-Wagener II) as patients with daytime hypertension. Ambulatory blood pressure values (24 hour, 6.00-22.00 h, 22.00-6.00 h, sleep, blood pressure load, heart rate) were all significantly different from controls (p < 0.03 to 0.0007). In patients with daytime hypertension, only 24 hour and daytime diastolic ambulatory blood pressure (p < 0.005) differed from "white coat" hypertension patients. Exercise testing blood pressure values (6 min exercise, maximal, 3 min recovery) were significantly different between "white coat" hypertension patients and the control group (n = 70) (p varying from 0.05 to 0.005) but not between "white coat" hypertension and daytime hypertension (n = 33) patients. Diastolic function was studied in only 39 daytime hypertension patients, 10 individuals with "white coat" hypertension and 34 controls (for technical reasons and because we only analyzed individuals younger than 55 years). 
E velocity and the E/A ratio were similar in "white coat" hypertension and daytime hypertension, but only in daytime hypertension patients did they differ significantly from controls (p = 0.04; p = 0.01), probably because of the small number of patients. CONCLUSIONS These data (clinical, ambulatory blood pressure, ergometric, diastolic function) suggest that "white coat" hypertension may not be a benign entity.
Collapse
|