
Asta-Advances in Statistical Analysis: Latest Articles

Goodness-of-fit testing in bivariate count time series based on a bivariate dispersion index
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-09-17, DOI: 10.1007/s10182-024-00512-3
Huiqiao Wang, Christian H. Weiß, Mingming Zhang

A common choice for the marginal distribution of a bivariate count time series is the bivariate Poisson distribution. In practice, however, when the count data exhibit zero inflation, overdispersion or non-stationarity, a marginal bivariate Poisson distribution is not suitable. To test the discrepancy between the actual count data and the bivariate Poisson distribution, we propose a new goodness-of-fit test based on a bivariate dispersion index. The asymptotic distribution of the test statistic under the null hypothesis of a first-order bivariate integer-valued autoregressive model with marginal bivariate Poisson distribution is derived, and the finite-sample performance of the goodness-of-fit test is analyzed by simulations. A real-data example illustrates the application and usefulness of the test in practice.
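
To illustrate the idea behind a dispersion-based check, the sketch below computes the classical univariate index of dispersion (variance divided by mean), which equals 1 for a Poisson distribution. This is not the bivariate index or the test statistic of the paper, neither of which is given in the abstract; the function name `dispersion_index` and the toy data are assumptions made for illustration only.

```python
from statistics import fmean, pvariance


def dispersion_index(counts):
    """Classical index of dispersion: population variance divided by mean.

    Values near 1 are compatible with a Poisson margin, values well above 1
    indicate overdispersion. (Illustrative helper, not the paper's statistic.)
    """
    mean = fmean(counts)
    if mean == 0:
        raise ValueError("index of dispersion is undefined for a zero mean")
    return pvariance(counts) / mean


# Hypothetical toy margins of a bivariate count series.
x_counts = [2, 2, 2, 2]   # no variation at all
y_counts = [0, 0, 0, 4]   # many zeros and one large count

print(dispersion_index(x_counts))
print(dispersion_index(y_counts))
```

Applying the function to each margin separately ignores the cross-correlation between the two series, which is exactly what a genuinely bivariate index would have to account for.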

Citations: 0
Bayesian joint relatively quantile regression of latent ordinal multivariate linear models with application to multirater agreement analysis
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-08-20, DOI: 10.1007/s10182-024-00509-y
YuZhu Tian, ChunHo Wu, ManLai Tang, MaoZai Tian

In this paper, we propose a Bayesian quantile regression (QR) approach to jointly model multivariate ordinal data. Firstly, a multivariate latent variable model is used to link the multivariate ordinal data and latent continuous responses, and the multivariate asymmetric Laplace (MAL) distribution is employed to construct the joint QR-based working likelihood for the considered model. Secondly, adaptive \(L_{1/2}\) penalization priors on the regression parameters are incorporated into the working likelihood to implement high-dimensional Bayesian joint QR inference. A Markov chain Monte Carlo (MCMC) algorithm is used to derive the full conditional posterior distributions of all parameters. Thirdly, a Bayesian joint relatively QR estimation approach is recommended to obtain more efficient estimates. Finally, Monte Carlo simulation studies and a real-data analysis of multirater agreement data are presented to illustrate the performance of the proposed Bayesian joint relatively QR approach.

Citations: 0
A Finite-sample bias correction method for general linear model in the presence of differential measurement errors
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-08-14, DOI: 10.1007/s10182-024-00510-5
Ali Al-Sharadqah, Karine Bagdasaryan, Ola Nusierat

This paper focuses on the general linear measurement error model, in which some or all predictors are measured with error, while others are measured precisely. We propose a semi-parametric estimator that works under general mechanisms of measurement error, including differential and non-differential errors. Other popular methods, such as the corrected score and conditional score methods, only work for non-differential measurement error models, but our estimator works in all scenarios. We develop our estimator by considering a family of objective functions that depend on an unspecified weight function. Using statistical error analysis and perturbation theory, we derive the optimal weight function under the small-sigma regime. The resulting estimator is statistically optimal in all senses. Even though we develop it under the small-sigma regime, we also establish its consistency and asymptotic normality under the large sample regime. Finally, we conduct a series of numerical experiments to confirm that the proposed estimator outperforms other existing methods.

Citations: 0
Classes of probability measures built on the properties of Benford’s law
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-08-08, DOI: 10.1007/s10182-024-00505-2
Roy Cerqueti, Mario Maggi

Benford’s law is a particular discrete probability distribution that is often satisfied by the significant digits of a dataset. The nonconformity with Benford’s law suggests the possible presence of data manipulation. This paper introduces two novel generalized versions of Benford’s law that are less restrictive than the original Benford’s law—hence, leading to more probable conformity of a given dataset. Such generalizations are grounded on the existing mathematical relations between Benford’s law probability distribution elements. Moreover, one of them leads to a set of probability distributions that is a proper subset of that of the other one. We show that the considered versions of Benford’s law have a geometric representation on the three-dimensional Euclidean space. Through suitable optimization models, we show that all the probability distributions satisfying the more restrictive generalization exhibit at least acceptable conformity with Benford’s law, according to the most popular distance measures. We also present some examples to highlight the practical usefulness of the introduced devices.
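
For orientation, the original first-digit form of Benford’s law assigns probability log10(1 + 1/d) to leading digit d = 1, ..., 9. The sketch below computes these probabilities and a mean absolute deviation between observed first-digit frequencies and the law, which is one commonly used conformity measure. It does not implement the generalized versions introduced in the paper, and the helper names and the powers-of-two sample are illustrative assumptions.

```python
import math


def benford_first_digit_probs():
    """First-digit Benford probabilities P(d) = log10(1 + 1/d), d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Leading significant digit of a nonzero number."""
    # Scientific notation puts the leading digit first, e.g. '3.42...e-03'.
    return int(f"{abs(x):.15e}"[0])


def first_digit_mad(values):
    """Mean absolute deviation of observed first-digit frequencies from Benford."""
    probs = benford_first_digit_probs()
    counts = {d: 0 for d in range(1, 10)}
    n = 0
    for v in values:
        if v == 0:
            continue  # zero has no leading significant digit
        counts[first_digit(v)] += 1
        n += 1
    if n == 0:
        raise ValueError("need at least one nonzero value")
    return sum(abs(counts[d] / n - probs[d]) for d in range(1, 10)) / 9


# Powers of two are a standard example of a sequence that follows the law closely.
sample = [2 ** k for k in range(1, 200)]
print(benford_first_digit_probs()[1])
print(first_digit_mad(sample))
```

A small deviation suggests conformity; the thresholds for what counts as "acceptable" depend on the chosen distance measure and are not specified here.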

Citations: 0
Wasserstein barycenter regression: application to the joint dynamics of regional GDP and life expectancy in Italy
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-07-16, DOI: 10.1007/s10182-024-00506-1
Susanna Levantesi, Andrea Nigri, Paolo Pagnottoni, Alessandro Spelta

We propose to investigate the joint dynamics of regional gross domestic product and life expectancy in Italy through Wasserstein barycenter regression derived from optimal transport theory. Wasserstein barycenter regression has the advantage of being flexible in modeling complex data distributions, given its ability to capture multimodal relationships, while maintaining the possibility of incorporating uncertainty and priors, in addition to yielding interpretable results. The main findings reveal that regional clusters tend to emerge, highlighting inequalities among Italian regions in economic and life expectancy terms. This suggests that targeted policy actions at a regional level fostering equitable development, especially from an economic viewpoint, might reduce regional inequality. Our results are validated by a robustness check on a human mobility dataset and by an illustrative forecasting exercise, which confirms the model’s ability to estimate and predict joint distributions and produce novel empirical evidence.

Citations: 0
A spatio-temporal model for binary data and its application in analyzing the direction of COVID-19 spread
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-07-08, DOI: 10.1007/s10182-024-00507-0
Anagh Chattopadhyay, Soudeep Deb

It is often of primary interest to analyze and forecast the levels of a continuous phenomenon as a categorical variable. In this paper, we propose a new spatio-temporal model to deal with this problem in a binary setting, with an interesting application related to the COVID-19 pandemic, a phenomenon that depends on both spatial proximity and temporal auto-correlation. Our model is defined through a hierarchical structure for the latent variable, which corresponds to the probit-link function. The mean of the latent variable in the proposed model is designed to capture the trend and the seasonal pattern as well as the lagged effects of relevant regressors. The covariance structure of the model is defined as an additive combination of a zero-mean spatio-temporally correlated process and a white noise process. The parameters associated with the space-time process enable us to analyze the effect of proximity of two points with respect to space or time and its influence on the overall process. For estimation and prediction, we adopt a complete Bayesian framework along with suitable prior specifications and utilize the concepts of Gibbs sampling. Using county-level data from the state of New York, we show that the proposed methodology performs better than benchmark techniques. We also use our model to devise a novel mechanism for predictive clustering which can be leveraged to develop localized policies.

Citations: 0
Artwork pricing model integrating the popularity and ability of artists
IF 1.4, Zone 4 (Mathematics), Q2 STATISTICS & PROBABILITY, Pub Date: 2024-07-02, DOI: 10.1007/s10182-024-00504-3
Jinsu Park, Yoonjin Lee, Daewon Yang, Jongho Park, Hohyun Jung

Considerable research has been devoted to understanding the popularity effect on the art market dynamics, meaning that artworks by popular artists tend to have high prices. The hedonic pricing model has employed artists’ reputation attributes, such as survey results, to understand the popularity effect, but the reputation attributes are constant and not properly defined at the point of artwork sales. Moreover, the artist’s ability has been measured via random effect in the hedonic model, which fails to reflect ability changes. To remedy these problems, we present a method to define the popularity measure using the artwork sales dataset without relying on the artist’s reputation attributes. Also, we propose a novel pricing model to appropriately infer the time-dependent artist’s abilities using the presented popularity measure. An inference algorithm is presented using the EM algorithm and Gibbs sampling to estimate model parameters and artist abilities. We use the Artnet dataset to investigate the size of the rich-get-richer effect and the variables affecting artwork prices in real-world art market dynamics. We further conduct inferences about artists’ abilities under the popularity effect and examine how ability changes over time for various artists with remarkable interpretations.

Citations: 0
Markov-switching decision trees
IF 1.4, Zone 4 (Mathematics), Q2 Social Sciences, Pub Date: 2024-05-29, DOI: 10.1007/s10182-024-00501-6
Timo Adam, Marius Ötting, Rouven Michels

Decision trees constitute a simple yet powerful and interpretable machine learning tool. While tree-based methods are designed only for cross-sectional data, we propose an approach that combines decision trees with time series modeling and thereby bridges the gap between machine learning and statistics. In particular, we combine decision trees with hidden Markov models where, for any time point, an underlying (hidden) Markov chain selects the tree that generates the corresponding observation. We propose an estimation approach that is based on the expectation-maximisation algorithm and assess its feasibility in simulation experiments. In our real-data application, we use eight seasons of National Football League (NFL) data to predict play calls conditional on covariates, such as the current quarter and the score, where the model’s states can be linked to the teams’ strategies. R code that implements the proposed method is available on GitHub.

Citations: 0
Markov switching stereotype logit models for longitudinal ordinal data affected by unobserved heterogeneity in responding behavior
IF 1.4, Zone 4 (Mathematics), Q2 Social Sciences, Pub Date: 2024-05-15, DOI: 10.1007/s10182-024-00500-7
Roberto Colombi, Sabrina Giordano

When asked to assess their opinion about attitudes or perceptions on a Likert scale, respondents often endorse the midpoint or extremes of the scale and agree or disagree regardless of the content. These responding behaviors are known in the psychometric literature as middle, extreme, acquiescence and disacquiescence response styles, which generally introduce bias in the results. One of the key motivations behind our approach is to account for these attitudes and how they evolve over time. The novelty of our proposal, in the context of longitudinal ordered categorical data, is in considering simultaneously the temporal dynamics of the responses (observable ordinal variables) and unobservable answering behaviors, possibly influenced by response styles, through a Markov switching logit model with two latent components. One component accommodates serial dependence and respondent’s unobserved heterogeneity, the other component determines the responding attitude (due to response styles or not). The dependence of the responses on covariates is modelled by a stereotype logit model with parameters varying according to the two latent components. The stereotype logit model is adopted because it is a flexible extension of the proportional odds logit model that retains the advantage of using a single parameter to describe a regressor effect. In the paper, a new interpretation of the parameters of the stereotype model is given by defining the allocation sets as intervals of values of the linear predictor that identify the most probable response. Unobserved heterogeneity, serial dependence and tendency to response style are modelled through our approach on longitudinal data collected by the Bank of Italy.

Citations: 0
Deducing neighborhoods of classes from a fitted model
IF 1.4 Zone 4 Mathematics Q2 Social Sciences Pub Date: 2024-05-08 DOI: 10.1007/s10182-024-00502-5
Alexander Gerharz, Andreas Groll, Gunther Schauberger

In this article, a new interpretable machine learning method is presented. It uses quantile shifts to help understand how a classification model partitions the feature space into predicted classes, and in this way makes the underlying statistical or machine learning model more trustworthy. Real data points (or specific points of interest) are taken, and the changes in the prediction after slightly increasing or decreasing specific features are observed. By comparing the predictions before and after the shifts, the observed changes can, under certain conditions, be interpreted as neighborhoods of the classes with regard to the shifted features. Chord diagrams are used to visualize the observed changes. For illustration, this quantile shift method (QSM) is applied to an artificial example with medical labels and to a real data example.
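The shift-and-compare idea described in this abstract can be sketched roughly as follows. The abstract does not define the shift precisely, so this sketch assumes that a point's feature value is moved from its empirical quantile level q to the value at level q + delta; `toy_predict` is a stand-in threshold classifier, and all function names are hypothetical rather than taken from the paper.

```python
from collections import Counter

def quantile_level(sorted_vals, x):
    """Share of sample values strictly below x, scaled to [0, 1]."""
    below = sum(v < x for v in sorted_vals)
    return below / (len(sorted_vals) - 1)

def empirical_quantile(sorted_vals, q):
    """Nearest-rank sample value at quantile level q (clamped to [0, 1])."""
    q = min(max(q, 0.0), 1.0)
    return sorted_vals[round(q * (len(sorted_vals) - 1))]

def quantile_shift_transitions(rows, feature, predict, delta):
    """Shift one feature of every row by delta on the quantile scale and
    count (prediction before, prediction after) pairs."""
    sorted_vals = sorted(r[feature] for r in rows)
    transitions = Counter()
    for row in rows:
        before = predict(row)
        level = quantile_level(sorted_vals, row[feature])
        shifted = dict(row)  # copy so the original data stay untouched
        shifted[feature] = empirical_quantile(sorted_vals, level + delta)
        transitions[(before, predict(shifted))] += 1
    return transitions

def toy_predict(row):
    """Stand-in for a fitted classifier: thresholds on feature x."""
    if row["x"] < 3:
        return "low"
    if row["x"] < 7:
        return "medium"
    return "high"

rows = [{"x": float(i)} for i in range(11)]  # x = 0, 1, ..., 10
up = quantile_shift_transitions(rows, "x", toy_predict, delta=0.2)
```

In this toy setting, upward shifts move some points from "low" to "medium" and from "medium" to "high" but none directly from "low" to "high", which is the kind of pattern that would be read as "low" and "medium" being neighbors with respect to `x`. Counts of this form are also what a chord diagram would display.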

Citations: 0