In Bayesian model selection and hypothesis testing, choosing suitable prior distributions is an important problem, and users should do so with caution. More often than not, objective Bayesian analyses rely on noninformative priors such as Jeffreys priors. However, since these noninformative priors are often improper, the Bayes factors associated with them are not well defined. To circumvent this indeterminacy, the Bayes factor can be corrected by intrinsic and fractional methods. These adjusted Bayes factors are asymptotically equivalent to ordinary Bayes factors computed with proper priors, called intrinsic priors. In this article, we derive intrinsic priors for testing a point null hypothesis under a zero-inflated Poisson distribution. Extensive simulation studies support the theoretical results on asymptotic equivalence, and two real datasets are analyzed to illustrate the methodology developed in this paper.
{"title":"Default Bayesian testing for the zero-inflated Poisson distribution","authors":"Yewon Han, Haewon Hwang, Hon Keung Ng, Seong Kim","doi":"10.4310/22-sii750","DOIUrl":"https://doi.org/10.4310/22-sii750","url":null,"abstract":"In a Bayesian model selection and hypothesis testing, users should be cautious when choosing suitable prior distributions, as it is an important problem. More often than not, objective Bayesian analyses utilize noninformative priors such as Jeffreys priors. However, since these noninformative priors are often improper, the Bayes factor associated with these improper priors is not well-defined. To circumvent this indeterminate issue, the Bayes factor can be corrected by intrinsic and fractional methods. These adjusted Bayes factors are asymptotically equivalent to the ordinary Bayes factors calculated with proper priors, called intrinsic priors. In this article, we derive intrinsic priors for testing the point null hypothesis under a zero-inflated Poisson distribution. Extensive simulation studies are performed to support the theoretical results on asymptotic equivalence, and two real datasets are analyzed to illustrate the methodology developed in this paper.","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface","volume":"5 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141743410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modern statistical analyses often encounter datasets of massive size with heavy-tailed distributions. For such massive datasets, traditional estimation methods can hardly be used to estimate the extreme value index directly. To address this issue, we propose a subsampling-based method. Specifically, multiple subsamples are drawn from the whole dataset by simple random subsampling with replacement. An approximate maximum likelihood estimator is computed on each subsample, and the resulting estimators are averaged to form a more accurate one. Under appropriate regularity conditions, we show theoretically that the proposed estimator is consistent and asymptotically normal. With the estimated extreme value index, high-level quantiles and tail probabilities of a heavy-tailed random variable can be estimated consistently. Extensive simulation experiments demonstrate the promising performance of our method, and a real data analysis is presented for illustration purposes.
{"title":"Estimating extreme value index by subsampling for massive datasets with heavy-tailed distributions","authors":"Yongxin Li, Liujun Chen, Deyuan Li, Hansheng Wang","doi":"10.4310/22-sii749","DOIUrl":"https://doi.org/10.4310/22-sii749","url":null,"abstract":"Modern statistical analyses often encounter datasets with massive sizes and heavy-tailed distributions. For datasets with massive sizes, traditional estimation methods can hardly be used to estimate the extreme value index directly. To address the issue, we propose here a subsampling-based method. Specifically, multiple subsamples are drawn from the whole dataset by using the technique of simple random subsampling with replacement. Based on each subsample, an approximate maximum likelihood estimator can be computed. The resulting estimators are then averaged to form a more accurate one. Under appropriate regularity conditions, we show theoretically that the proposed estimator is consistent and asymptotically normal. With the help of the estimated extreme value index, we can estimate high-level quantiles and tail probabilities of a heavy-tailed random variable consistently. Extensive simulation experiments are provided to demonstrate the promising performance of our method. A real data analysis is also presented for illustration purpose.","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface","volume":"27 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141743408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this work, we consider a random projection method for large-scale community detection. We introduce a random Gaussian matrix that generates several projections onto the column space of the network adjacency matrix. The $k$-means algorithm is then applied to the low-dimensional projected matrix. The computational complexity is much lower than that of classic spectral clustering methods. Furthermore, the algorithm is easy to implement and amenable to privacy preservation. We theoretically establish a strong consistency result for the algorithm under the stochastic block model. Extensive numerical studies verify the theoretical findings and illustrate the usefulness of the proposed method.
{"title":"A random projection method for large-scale community detection","authors":"Haobo Qi, Hansheng Wang, Xuening Zhu","doi":"10.4310/22-sii752","DOIUrl":"https://doi.org/10.4310/22-sii752","url":null,"abstract":"In this work, we consider a random projection method for a large-scale community detection task. We introduce a random Gaussian matrix that generates several projections on the column space of the network adjacency matrix. The $k$-means algorithm is then applied with the low-dimensional projected matrix. The computational complexity is much lower than that of the classic spectral clustering methods. Furthermore, the algorithm is easy to implement and accessible for privacy preservation. We can theoretically establish a strong consistency result of the algorithm under the stochastic block model. Extensive numerical studies are conducted to verify the theoretical findings and illustrate the usefulness of the proposed method.","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface","volume":"1 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139659440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Positive-definite matrix-variate data are becoming popular in computer vision. Computer-vision descriptors in the form of Region Covariance Descriptors (RCDs) are positive definite matrices that extract the key features of images, and RCDs are extensively used in image set classification. Several classification methods treating RCDs as Wishart-distributed random matrices have been proposed. However, most current methods ignore the potential correlation among RCDs induced by auxiliary information (e.g., subjects' ages and nose widths). Modeling correlated Wishart matrices is difficult because their joint density function is difficult to obtain. In this paper, we propose an expectation-maximization composite likelihood-based algorithm for Wishart matrices to tackle this issue. In numerical studies based on synthetic data and real data (the Chicago face dataset), the proposed algorithm outperforms alternative methods that do not account for the correlation induced by auxiliary information.
{"title":"Correlated Wishart matrices classification via an expectation-maximization composite likelihood-based algorithm","authors":"Zhou Lan","doi":"10.4310/22-sii770","DOIUrl":"https://doi.org/10.4310/22-sii770","url":null,"abstract":"Positive-definite matrix-variate data is becoming popular in computer vision. The computer vision data descriptors in the form of Region Covariance Descriptors (RCD) are positive definite matrices, which extract the key features of the images. The RCDs are extensively used in image set classification. Some classification methods treating RCDs as Wishart distributed random matrices are being proposed. However, the majority of the current methods preclude the potential correlation among the RCDs caused by the so-called auxiliary information (e.g., subjects’ ages and nose widths, etc). Modeling correlated Wishart matrices is difficult since the joint density function of correlated Wishart matrices is difficult to be obtained. In this paper, we propose an Expectation-Maximization composite likelihoodbased algorithm of Wishart matrices to tackle this issue. Given the numerical studies based on the synthetic data and the real data (Chicago face data-set), our proposed algorithm performs better than the alternative methods which do not consider the correlation caused by the so-called auxiliary information.","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface","volume":"27 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139659192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-02-01, DOI: 10.4310/sii.2024.v17.n2.a10
Ning Wang, Xin Zhang
Tensor data analysis is gaining increasing popularity in modern multivariate statistics. When analyzing real-world tensor data, many existing tensor estimation approaches are sensitive to heavy-tailed data and outliers, in addition to the apparent high-dimensionality. In this article, we develop a robust and covariance-assisted tensor response regression model based on a recently proposed tensor t‑distribution to address these issues in tensor data. This model assumes that the tensor regression coefficient has a low-rank structure that can be learned more effectively using the additional covariance information. This enables a fast and robust decomposition-based estimation method. Theoretical analysis and numerical experiments demonstrate the superior performance of our approach. By addressing the heavy-tail, high-order, and high-dimensional issues, our work contributes to robust and effective estimation methods for tensor response regression, with broad applicability in various domains.
{"title":"Robust and covariance-assisted tensor response regression","authors":"Ning Wang, Xin Zhang","doi":"10.4310/sii.2024.v17.n2.a10","DOIUrl":"https://doi.org/10.4310/sii.2024.v17.n2.a10","url":null,"abstract":"Tensor data analysis is gaining increasing popularity in modern multivariate statistics. When analyzing real-world tensor data, many existing tensor estimation approaches are sensitive to heavy-tailed data and outliers, in addition to the apparent high-dimensionality. In this article, we develop a robust and covariance-assisted tensor response regression model based on a recently proposed tensor t‑distribution to address these issues in tensor data. This model assumes that the tensor regression coefficient has a low-rank structure that can be learned more effectively using the additional covariance information. This enables a fast and robust decomposition-based estimation method. Theoretical analysis and numerical experiments demonstrate the superior performance of our approach. By addressing the heavy-tail, high-order, and high-dimensional issues, our work contributes to robust and effective estimation methods for tensor response regression, with broad applicability in various domains.","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface","volume":"24 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139659201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a Bayesian tensor-on-tensor regression approach to predict a multidimensional array (tensor) of arbitrary dimensions from another tensor of arbitrary dimensions, building upon the Tucker decomposition of the regression coefficient tensor. Traditional tensor regression methods making use of the Tucker decomposition either assume the dimension of the core tensor to be known or estimate it via cross-validation or some model selection criteria. However, no existing method can simultaneously estimate the model dimension (the dimension of the core tensor) and other model parameters. To fill this gap, we develop an efficient Markov Chain Monte Carlo (MCMC) algorithm to estimate both the model dimension and parameters for posterior inference. Besides the MCMC sampler, we also develop an ultra-fast optimization-based computing algorithm wherein the maximum a posteriori estimators for parameters are computed, and the model dimension is optimized via a simulated annealing algorithm. The proposed Bayesian framework provides a natural way for uncertainty quantification. Through extensive simulation studies, we evaluate the proposed Bayesian tensor-on-tensor regression model and show its superior performance compared to alternative methods. We also demonstrate its practical effectiveness by applying it to two real-world datasets, including facial imaging data and 3D motion data.
{"title":"Bayesian tensor-on-tensor regression with efficient computation","authors":"Kunbo Wang, Yanxun Xu","doi":"10.4310/23-sii786","DOIUrl":"https://doi.org/10.4310/23-sii786","url":null,"abstract":"We propose a Bayesian tensor-on-tensor regression approach to predict a multidimensional array (tensor) of arbitrary dimensions from another tensor of arbitrary dimensions, building upon the Tucker decomposition of the regression coefficient tensor. Traditional tensor regression methods making use of the Tucker decomposition either assume the dimension of the core tensor to be known or estimate it via cross-validation or some model selection criteria. However, no existing method can simultaneously estimate the model dimension (the dimension of the core tensor) and other model parameters. To fill this gap, we develop an efficient Markov Chain Monte Carlo (MCMC) algorithm to estimate both the model dimension and parameters for posterior inference. Besides the MCMC sampler, we also develop an ultra-fast optimization-based computing algorithm wherein the maximum <i>a posteriori</i> estimators for parameters are computed, and the model dimension is optimized via a simulated annealing algorithm. The proposed Bayesian framework provides a natural way for uncertainty quantification. Through extensive simulation studies, we evaluate the proposed Bayesian tensor-on-tensor regression model and show its superior performance compared to alternative methods. 
We also demonstrate its practical effectiveness by applying it to two real-world datasets, including facial imaging data and 3D motion data.","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface","volume":"23 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139658968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
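The Tucker decomposition at the heart of this model can be illustrated with a plain (non-Bayesian) higher-order SVD, which exactly recovers a tensor of low multilinear rank. This is only the deterministic building block, not the MCMC or simulated-annealing procedures of the paper:

```python
import numpy as np

def unfold(T, mode):
    # mode-n matricization of a tensor
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    # higher-order SVD: per-mode factor matrices from the leading left
    # singular vectors of each unfolding, then project to get the core
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = np.moveaxis(
            np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

def tucker_reconstruct(core, factors):
    # multiply the core tensor by each factor matrix along its mode
    T = core
    for mode, U in enumerate(factors):
        T = np.moveaxis(
            np.tensordot(U, np.moveaxis(T, mode, 0), axes=1), 0, mode)
    return T
```

When the tensor has exact multilinear rank equal to `ranks`, reconstructing from the HOSVD core and factors reproduces it to machine precision.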
With the emergence of tensor data (also known as multi-dimensional arrays) in many modern applications such as image processing and digital marketing, tensor classification is gaining increasing attention. Although there is a rich toolbox of classification methods for vector-based data, these traditional methods may not be adequate for tensor data classification. In this paper, we propose a new classifier called the density-convoluted tensor support vector machine (DCT-SVM). The method is motivated by applying kernel density convolution to the SVM loss to induce a new family of classification loss functions. To establish the theoretical foundation of DCT-SVM, we systematically study the probabilistic order of magnitude of its excess risk. To compute DCT-SVM efficiently, we develop a fast monotone accelerated proximal gradient descent algorithm and prove its convergence. Simulation studies demonstrate the superior performance of DCT-SVM over many popular classification methods, and a modern data application to online advertising further demonstrates its practical potential.
{"title":"Density-convoluted tensor support vector machines","authors":"Boxiang Wang, Le Zhou, Jian Yang, Qing Mai","doi":"10.4310/23-sii796","DOIUrl":"https://doi.org/10.4310/23-sii796","url":null,"abstract":"With the emergence of tensor data (also known as multi-dimensional arrays) in many modern applications such as image processing and digital marketing, tensor classification is gaining increasing attention. Although there is a rich toolbox of classification methods for vector-based data, these traditional methods may not be adequate for tensor data classification. In this paper, we propose a new classifier called density-convoluted tensor support vector machine (DCT‑SVM). This method is motivated by applying a kernel density convolution method on the SVM loss to induce a new family of classification loss functions. To establish the theoretical foundation of DCT‑SVM, the probabilistic order of magnitude for its excess risk is systematically studied. For efficiently computing DCT‑SVM, we develop a fast monotone accelerated proximal gradient descent algorithm and show the convergence of the algorithm. With simulation studies, we demonstrate the superior performance of DCT‑SVM over many popular classification methods. We further demonstrate the real potential of DCT‑SVM using a modern data application for online advertising.","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface","volume":"36 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139659198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The development of modern sequencing technologies provides great opportunities to measure gene expression in multiple tissues from different individuals. The three-way variation across genes, tissues, and individuals makes statistical inference a challenging task. In this paper, we propose a Bayesian multi-way clustering approach to cluster genes, tissues, and individuals simultaneously. The proposed model adaptively trichotomizes the observed data into three latent categories and uses a Bayesian hierarchical construction to further decompose the latent variables into lower-dimensional features, which can be interpreted as overlapping clusters. With a Bayesian nonparametric prior, the Indian buffet process, our method determines the number of clusters automatically. The utility of our approach is demonstrated through simulation studies and an application to the Genotype-Tissue Expression (GTEx) RNA-seq data. The clustering result reveals some interesting findings about depression-related genes in the human brain, which are consistent with biological domain knowledge. The detailed algorithm and some numerical results are available in the online Supplementary Material at https://intlpress.com/site/pub/files/supp/sii/2024/0017/0002/sii-2024-0017-0002-s001.pdf.
{"title":"Multi-way overlapping clustering by Bayesian tensor decomposition","authors":"Zhuofan Wang, Fangting Zhou, Kejun He, Yang Ni","doi":"10.4310/23-sii790","DOIUrl":"https://doi.org/10.4310/23-sii790","url":null,"abstract":"The development of modern sequencing technologies provides great opportunities to measure gene expression of multiple tissues from different individuals. The three-way variation across genes, tissues, and individuals makes statistical inference a challenging task. In this paper, we propose a Bayesian multi-way clustering approach to cluster genes, tissues, and individuals simultaneously. The proposed model adaptively trichotomizes the observed data into three latent categories and uses a Bayesian hierarchical construction to further decompose the latent variables into lower-dimensional features, which can be interpreted as overlapping clusters. With a Bayesian nonparametric prior, i.e., the Indian buffet process, our method determines the cluster number automatically. The utility of our approach is demonstrated through simulation studies and an application to the Genotype-Tissue Expression (GTEx) RNA-seq data. The clustering result reveals some interesting findings about depression-related genes in human brain, which are also consistent with biological domain knowledge. 
The detailed algorithm and some numerical results are available in the online Supplementary Material, available at $href{https://intlpress.com/site/pub/files/supp/sii/2024/0017/0002/sii-2024-0017-0002-s001.pdf}{ https://intlpress.com/site/pub/files/supp/sii/2024/0017/0002/sii-2024-0017-0002-s001.pdf}.","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface","volume":"12 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139659442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
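The Indian buffet process prior mentioned above can be simulated directly from its sequential "restaurant" description: customer i takes each existing dish k with probability m_k/i (where m_k is the number of previous takers) and then samples Poisson(α/i) new dishes. A minimal sampler:

```python
import numpy as np

def sample_ibp(n, alpha, rng=None):
    # draw a binary feature-allocation matrix Z (n customers x K dishes)
    rng = np.random.default_rng(rng)
    counts = []   # m_k: how many customers have chosen dish k so far
    rows = []
    for i in range(1, n + 1):
        row = [1 if rng.random() < m / i else 0 for m in counts]
        for k, taken in enumerate(row):
            counts[k] += taken
        new = rng.poisson(alpha / i)        # brand-new dishes for customer i
        row.extend([1] * new)
        counts.extend([1] * new)
        rows.append(row)
    Z = np.zeros((n, len(counts)), dtype=int)
    for i, row in enumerate(rows):
        Z[i, :len(row)] = row
    return Z
```

The expected total number of dishes is α Σᵢ 1/i (roughly α log n), which is how the prior lets the number of clusters grow with the data instead of being fixed in advance.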
Tianchen Gao, Rui Pan, Junfei Zhang, Hansheng Wang
In the era of big data, network analysis has attracted widespread attention. Detecting and tracking community evolution in temporal networks can uncover important and interesting behaviors. In this paper, we analyze a temporal citation network constructed from publications in 44 statistical journals between 2001 and 2018. We propose an approach named Tensor-based Directed Spectral Clustering On Ratios of Eigenvectors (TD-SCORE), which corrects for degree heterogeneity when detecting the community structure of the temporal citation network. We first explore the characteristics of the temporal network via in-degree distributions and visualizations of different snapshots, and find that both the community structure and the key nodes change over time. We then apply TD-SCORE to the core network of our temporal citation network. Seven communities are identified, including variable selection, Bayesian analysis, functional data analysis, and others. Finally, we track the evolution of these communities and summarize our findings.
{"title":"Community detection in temporal citation network via a tensor-based approach","authors":"Tianchen Gao, Rui Pan, Junfei Zhang, Hansheng Wang","doi":"10.4310/22-sii751","DOIUrl":"https://doi.org/10.4310/22-sii751","url":null,"abstract":"In the era of big data, network analysis has attracted widespread attention. Detecting and tracking community evolution in temporal networks can uncover important and interesting behaviors. In this paper, we analyze a temporal citation network constructed by publications collected from 44 statistical journals between 2001 and 2018. We propose an approach named Tensor-based Directed Spectral Clustering On Ratios of Eigenvectors (TD-SCORE) which can correct for degree heterogeneity to detect the community structure of the temporal citation network. We first explore the characteristics of the temporal network via in-degree distribution and visualization of different snapshots, and we find that both the community structure and the key nodes change over time. Then, we apply the TD-SCORE method to the core network of our temporal citation network. Seven communities are identified, including variable selection, Bayesian analysis, functional data analysis, and many others. Finally, we track the evolution of the above communities and reach some conclusions.","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface","volume":"26 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139659443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tensors, also known as multidimensional arrays, are useful data structures in machine learning and statistics. In recent years, Bayesian methods have emerged as a popular direction for analyzing tensor-valued data, since they provide a convenient way to introduce sparsity into the model and to conduct uncertainty quantification. In this article, we provide an overview of frequentist and Bayesian methods for solving tensor completion and regression problems, with a focus on Bayesian methods. We review common Bayesian tensor approaches, including model formulation, prior assignment, posterior computation, and theoretical properties. We also discuss potential future directions in this field.
{"title":"Bayesian methods in tensor analysis","authors":"Shi Yiyao, Shen Weining","doi":"10.4310/23-sii802","DOIUrl":"https://doi.org/10.4310/23-sii802","url":null,"abstract":"Tensors, also known as multidimensional arrays, are useful data structures in machine learning and statistics. In recent years, Bayesian methods have emerged as a popular direction for analyzing tensor-valued data since they provide a convenient way to introduce sparsity into the model and conduct uncertainty quantification. In this article, we provide an overview of frequentist and Bayesian methods for solving tensor completion and regression problems, with a focus on Bayesian methods. We review common Bayesian tensor approaches including model formulation, prior assignment, posterior computation, and theoretical properties.We also discuss potential future directions in this field.","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface","volume":"49 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139659387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}