1
Mika D. Fast gradient algorithm for complex ICA and its application to the MIMO systems. Sci Rep 2023; 13:11633. [PMID: 37468514 DOI: 10.1038/s41598-023-36628-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Accepted: 06/07/2023] [Indexed: 07/21/2023] Open
Abstract
This paper proposes a new gradient-descent algorithm for complex independent component analysis (ICA) and presents its application to Multiple-Input Multiple-Output (MIMO) communication systems. The algorithm uses the Lie structure of the optimization landscape and the toral decomposition of the gradient matrix. The theoretical results are validated by computer simulation and compared with several classes of algorithms: gradient descent, quasi-Newton, and complex JADE. The simulations show excellent results for the algorithm in terms of speed, stability of operation, and quality of separation. A characteristic feature of gradient methods is their quick response to changes in the input signal. The good results of the proposed algorithm indicate potential use in on-line applications.
Affiliation(s)
- Dariusz Mika
  - The University College of Applied Sciences in Chełm, 22-100, Chełm, Poland.
2
Muehlmann C, De Iaco S, Nordhausen K. Blind recovery of sources for multivariate space-time random fields. STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT : RESEARCH JOURNAL 2022; 37:1593-1613. [PMID: 37041981 PMCID: PMC10081984 DOI: 10.1007/s00477-022-02348-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/14/2022] [Indexed: 06/19/2023]
Abstract
With advances in modern technology, huge datasets that show dependencies in space as well as in time occur frequently in practice. As an example, several monitoring stations at different geographical locations track hourly concentration measurements of a number of air pollutants for several years. Such a dataset contains thousands of multivariate observations; thus, proper statistical analysis needs to account for dependencies in space and time between and among the different monitored variables. To simplify the consequent multivariate spatio-temporal statistical analysis, it might be of interest to detect linear transformations of the original observations that result in straightforwardly interpretable, spatio-temporally uncorrelated processes that are also highly likely to have a real physical meaning. Blind source separation (BSS) is a statistical methodology that aims to recover so-called latent processes that meet exactly these requirements. BSS has already been used successfully in purely temporal and purely spatial applications, but it has not yet been introduced for the spatio-temporal case. In this contribution, a reasonable and innovative generalization of BSS for multivariate space-time random fields (stBSS), under second-order stationarity, is proposed, together with two space-time extensions of the well-known algorithm for multiple unknown signals extraction (stAMUSE) and second-order blind identification (stSOBI), which solve the formulated problem. Furthermore, symmetry and separability properties of the model are elaborated, and connections to the space-time linear model of coregionalization and to classical principal component analysis are drawn. Finally, the usefulness of the new methods is shown in a thorough simulation study and on a real environmental application.
Affiliation(s)
- C. Muehlmann
  - Institute of Statistics and Mathematical Methods in Economics, TU Wien / Technische Universität Wien / Vienna University of Technology, Vienna, Austria
- S. De Iaco
  - Department of Economic Sciences-Sect. of Mathematics and Statistics, University of Salento, Lecce, Italy
  - Centro Nazionale di Biodiversità, University of Salento, Lecce, Italy
- K. Nordhausen
  - Department of Mathematics and Statistics, University of Jyväskylä, Jyväskylä, Finland
3
Pan Y, Matilainen M, Taskinen S, Nordhausen K. A review of second-order blind identification methods. WILEY INTERDISCIPLINARY REVIEWS. COMPUTATIONAL STATISTICS 2022; 14:e1550. [PMID: 36249858 PMCID: PMC9540980 DOI: 10.1002/wics.1550] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/30/2020] [Revised: 01/06/2021] [Accepted: 01/07/2021] [Indexed: 11/24/2022]
Abstract
Second-order source separation (SOS) is a data analysis tool which can be used for revealing hidden structures in multivariate time series data or as a tool for dimension reduction. Such methods are nowadays increasingly important as more and more high-dimensional multivariate time series data are measured in numerous fields of applied science. Dimension reduction is crucial, as modeling such high-dimensional data with multivariate time series models is often impractical as the number of parameters describing dependencies between the component time series is usually too high. SOS methods have their roots in the signal processing literature, where they were first used to separate source signals from an observed signal mixture. The SOS model assumes that the observed time series (signals) is a linear mixture of latent time series (sources) with uncorrelated components. The methods make use of the second-order statistics, hence the name "second-order source separation." In this review, we discuss the classical SOS methods and their extensions to more complex settings. An example illustrates how SOS can be performed. This article is categorized under: Statistical Models > Time Series Models; Statistical and Graphical Methods of Data Analysis > Dimension Reduction; Data: Types and Structure > Time Series, Stochastic Processes, and Functional Data.
Affiliation(s)
- Yan Pan
  - Department of Mathematics and Statistics, University of Jyväskylä, Finland
- Markus Matilainen
  - Turku PET Centre, Turku University Hospital and University of Turku, Finland
- Sara Taskinen
  - Department of Mathematics and Statistics, University of Jyväskylä, Finland
- Klaus Nordhausen
  - Department of Mathematics and Statistics, University of Jyväskylä, Finland
  - Institute of Statistics and Mathematical Methods in Economics, TU Vienna, Austria
4
5
Loperfido N. Some theoretical properties of two kurtosis matrices, with application to invariant coordinate selection. J MULTIVARIATE ANAL 2021. [DOI: 10.1016/j.jmva.2021.104809] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
6
Virta J, Lietzén N, Ilmonen P, Nordhausen K. Fast tensorial JADE. Scand Stat Theory Appl 2021; 48:164-187. [PMID: 33664538 PMCID: PMC7891388 DOI: 10.1111/sjos.12445] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/30/2018] [Revised: 10/24/2019] [Accepted: 01/05/2020] [Indexed: 11/28/2022]
Abstract
We propose a novel method for tensorial-independent component analysis. Our approach is based on TJADE and k-JADE, two recently proposed generalizations of the classical JADE algorithm. Our novel method achieves the consistency and the limiting distribution of TJADE under mild assumptions and at the same time offers notable improvement in computational speed. Detailed mathematical proofs of the statistical properties of our method are given and, as a special case, a conjecture on the properties of k-JADE is resolved. Simulations and timing comparisons demonstrate remarkable gain in speed. Moreover, the desired efficiency is obtained approximately for finite samples. The method is applied successfully to large-scale video data, for which neither TJADE nor k-JADE is feasible. Finally, an experimental procedure is proposed to select the values of a set of tuning parameters. Supplementary material including the R-code for running the examples and the proofs of the theoretical results is available online.
Affiliation(s)
- Joni Virta
  - Department of Mathematics and Systems Analysis, Aalto University School of Science
  - Department of Mathematics and Statistics, University of Turku
- Niko Lietzén
  - Department of Mathematics and Systems Analysis, Aalto University School of Science
- Pauliina Ilmonen
  - Department of Mathematics and Systems Analysis, Aalto University School of Science
- Klaus Nordhausen
  - Institute of Statistics & Mathematical Methods in Economics, Vienna University of Technology
7
8
Radojičić U, Nordhausen K, Virta J. Large-sample properties of unsupervised estimation of the linear discriminant using projection pursuit. Electron J Stat 2021. [DOI: 10.1214/21-ejs1956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Una Radojičić
  - Institute of Statistics & Mathematical Methods in Economics, Vienna University of Technology, Austria
- Klaus Nordhausen
  - Institute of Statistics & Mathematical Methods in Economics, Vienna University of Technology, Austria
- Joni Virta
  - Department of Mathematics and Statistics, University of Turku, Finland
9
Lee S, Shen H, Truong Y. Sampling Properties of color Independent Component Analysis. J MULTIVARIATE ANAL 2020; 181. [PMID: 33162620 DOI: 10.1016/j.jmva.2020.104692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Abstract
Independent Component Analysis (ICA) offers an effective data-driven approach for blind source extraction encountered in many signal and image processing problems. Although many ICA methods have been developed, they have received relatively little attention in the statistics literature, especially in terms of rigorous theoretical investigation for statistical inference. The current paper aims at narrowing this gap and investigates the statistical sampling properties of the colorICA (cICA) method. The cICA incorporates the correlation structure within sources through parametric time series models in the frequency domain and outperforms several existing ICA alternatives numerically. We establish the consistency and asymptotic normality of the cICA estimates, which then enables statistical inference based on the estimates. These asymptotic properties are further validated using simulation studies.
Affiliation(s)
- Seonjoo Lee
  - Department of Psychiatry and Biostatistics, Columbia University, New York, NY, USA
  - Mental Health Data Science, New York State Psychiatric Institute and Research Foundation for Mental Hygiene, Inc., New York, NY, USA
- Haipeng Shen
  - Innovation and Information Management, Faculty of Business and Economics, University of Hong Kong, Hong Kong, China
- Young Truong
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
10
Virta J, Li B, Nordhausen K, Oja H. Independent component analysis for multivariate functional data. J MULTIVARIATE ANAL 2020. [DOI: 10.1016/j.jmva.2019.104568] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
11
Miettinen J, Matilainen M, Nordhausen K, Taskinen S. Extracting Conditionally Heteroskedastic Components using Independent Component Analysis. JOURNAL OF TIME SERIES ANALYSIS 2020; 41:293-311. [PMID: 32508370 PMCID: PMC7266430 DOI: 10.1111/jtsa.12505] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/21/2018] [Revised: 08/12/2019] [Accepted: 08/13/2019] [Indexed: 06/11/2023]
Abstract
In the independent component model, the multivariate data are assumed to be a mixture of mutually independent latent components. Independent component analysis (ICA) then aims at estimating these latent components. In this article, we study an ICA method which combines the use of linear and quadratic autocorrelations to enable efficient estimation of various kinds of stationary time series. Statistical properties of the estimator are studied by finding its limiting distribution under general conditions, and the asymptotic variances are derived in the case of the ARMA-GARCH model. We use the asymptotic results and a finite sample simulation study to compare different choices of a weight coefficient. As it is often of interest to identify all those components which exhibit stochastic volatility features, we suggest a test statistic for this problem. We also show that a slightly modified version of the principal volatility component analysis can be seen as an ICA method. Finally, we apply the estimators in analysing a data set which consists of time series of exchange rates of seven currencies to the US dollar. Supporting information including proofs of the theorems is available online.
Affiliation(s)
- Jari Miettinen
  - Department of Signal Processing and Acoustics, Aalto University, Helsinki, Finland
- Markus Matilainen
  - Department of Mathematics and Statistics, University of Turku, Turku, Finland
  - Turku PET Centre, Turku University Hospital and University of Turku, Finland
- Klaus Nordhausen
  - Institute of Statistics & Mathematical Methods in Economics, Vienna University of Technology, Wien, Austria
- Sara Taskinen
  - Department of Mathematics and Statistics, University of Jyväskylä, Jyväskylä, Finland
12
13
Di J, Spira A, Bai J, Urbanek J, Leroux A, Wu M, Resnick S, Simonsick E, Ferrucci L, Schrack J, Zipunnikov V. Joint and Individual Representation of Domains of Physical Activity, Sleep, and Circadian Rhythmicity. STATISTICS IN BIOSCIENCES 2019; 11:371-402. [PMID: 32440309 PMCID: PMC7241438 DOI: 10.1007/s12561-019-09236-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 12/18/2017] [Revised: 03/07/2019] [Accepted: 04/02/2019] [Indexed: 10/27/2022]
Abstract
Developments in wearable technology have enabled researchers to continuously and objectively monitor various aspects and physiological domains of real life, including levels of physical activity, quality of sleep, and strength of circadian rhythm, in many epidemiological and clinical studies. Current analytical practice is to summarize each of these three domains individually via a standard inventory of interpretable features, and explore individual associations between the features and clinical variables. However, the features often exhibit significant interaction and correlation both within and between domains. Integration of features across multiple domains remains methodologically challenging. To address this problem, we propose to use joint and individual variation explained (JIVE), a dimension reduction technique that efficiently deals with multivariate data representing multiple domains. In this paper, we review the most frequently used features to characterize the domains of physical activity, sleep, and circadian rhythmicity and illustrate the approach using wrist-worn actigraphy data from 198 participants of the Baltimore Longitudinal Study of Aging.
Affiliation(s)
- Junrui Di
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
- Adam Spira
  - Johns Hopkins Center on Aging and Health
  - Department of Mental Health, Johns Hopkins Bloomberg School of Public Health
  - Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine
- Jiawei Bai
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
- Jacek Urbanek
  - Department of Medicine, Division of Geriatric Medicine and Gerontology, Johns Hopkins University School of Medicine
- Andrew Leroux
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
- Mark Wu
  - Departments of Neurology and Neuroscience, Johns Hopkins University School of Medicine
- Susan Resnick
  - Intramural Research Program, National Institute on Aging, National Institutes of Health
- Eleanor Simonsick
  - Intramural Research Program, National Institute on Aging, National Institutes of Health
- Luigi Ferrucci
  - Intramural Research Program, National Institute on Aging, National Institutes of Health
- Jennifer Schrack
  - Johns Hopkins Center on Aging and Health
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health
- Vadim Zipunnikov
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health
  - Johns Hopkins Center on Aging and Health
14
15
Virta J, Nordhausen K. Estimating the number of signals using principal component analysis. Stat (Int Stat Inst) 2019. [DOI: 10.1002/sta4.231] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/07/2022]
Affiliation(s)
- Joni Virta
  - Department of Mathematics and Systems Analysis, Aalto University, Espoo, Finland
- Klaus Nordhausen
  - Institute of Statistics & Mathematical Methods in Economics, Vienna University of Technology, Vienna, Austria
16
Matilainen M, Croux C, Nordhausen K, Oja H. Sliced average variance estimation for multivariate time series. STATISTICS-ABINGDON 2019. [DOI: 10.1080/02331888.2019.1605515] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/27/2022]
Affiliation(s)
- M. Matilainen
  - Department of Mathematics and Statistics, University of Turku, Turku, Finland
  - Turku PET Centre, Turku, Finland
- C. Croux
  - EDHEC Business School, Lille, France
- K. Nordhausen
  - Institute of Statistics & Mathematical Methods in Economics, Vienna University of Technology, Wien, Austria
- H. Oja
  - Department of Mathematics and Statistics, University of Turku, Turku, Finland
17
Risk BB, Matteson DS, Ruppert D. Linear Non-Gaussian Component Analysis Via Maximum Likelihood. J Am Stat Assoc 2019. [DOI: 10.1080/01621459.2017.1407772] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/18/2022]
Affiliation(s)
- Benjamin B. Risk
  - Department of Biostatistics & Bioinformatics, Emory University, Atlanta, GA
- David Ruppert
  - Department of Statistical Science, Cornell University, Ithaca, NY
18
19
20
Affiliation(s)
- Joni Virta
  - Department of Mathematics and Statistics, University of Turku, Turku, Finland
- Bing Li
  - Department of Statistics, Pennsylvania State University, University Park, PA
- Klaus Nordhausen
  - Institute of Statistics & Mathematical Methods in Economics, Vienna University of Technology, Vienna, Austria
- Hannu Oja
  - Department of Mathematics and Statistics, University of Turku, Turku, Finland
21
Virta J, Li B, Nordhausen K, Oja H. Independent component analysis for tensor-valued data. J MULTIVARIATE ANAL 2017. [DOI: 10.1016/j.jmva.2017.09.008] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 10/18/2022]
22
Fischer D, Honkatukia M, Tuiskula-Haavisto M, Nordhausen K, Cavero D, Preisinger R, Vilkki J. Subgroup detection in genotype data using invariant coordinate selection. BMC Bioinformatics 2017; 18:173. [PMID: 28302061 PMCID: PMC5356247 DOI: 10.1186/s12859-017-1589-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/09/2016] [Accepted: 03/09/2017] [Indexed: 12/01/2022] Open
Abstract
BACKGROUND: The current gold standard in dimension reduction methods for high-throughput genotype data is Principal Component Analysis (PCA). PCA is so dominant that other methods are usually absent from the analyst's toolbox and hence are only rarely applied. RESULTS: We present a modern dimension reduction method called 'Invariant Coordinate Selection' (ICS) and its application to high-throughput genotype data. The more commonly known Independent Component Analysis (ICA) is, in this framework, just a special case of ICS. We use ICS on both a simulated and a real dataset. First, we demonstrate some deficiencies of PCA and show how ICS is able to recover the correct subgroups within the simulated data. Second, we apply the ICS method to a chicken dataset and detect two subgroups there as well. These subgroups are then further investigated with respect to their genotype to provide further evidence of the biological relevance of the detected subgroup division. Further, we compare the performance of ICS with five other popular dimension reduction methods. CONCLUSION: The ICS method was able to detect subgroups in data where PCA fails to detect anything. Hence, we promote the application of ICS to high-throughput genotype data in addition to the established PCA. Especially in statistical programming environments such as R, its application does not add any computational burden to the analysis pipeline.
Affiliation(s)
- Daniel Fischer
  - Natural Resources Institute Finland (LUKE), Myllytie 1, Jokioinen, Finland
- Mervi Honkatukia
  - Natural Resources Institute Finland (LUKE), Myllytie 1, Jokioinen, Finland
- Klaus Nordhausen
  - Department of Mathematics and Statistics, University of Turku, Turku, Finland
  - University of Tampere, School of Health Sciences, Medisiinarinkatu 3, Tampere, 33014 Finland
- David Cavero
  - Lohmann Tierzucht GmbH, Am Seedeich 9-11, Cuxhaven, 27454 Germany
- Johanna Vilkki
  - Natural Resources Institute Finland (LUKE), Myllytie 1, Jokioinen, Finland
23