Journal of Classification: Latest Articles

Computing Finite Mixture Estimators in the Tails
IF 2; Zone 4 (Computer Science); Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2023-04-13; DOI: 10.1007/s00357-023-09433-3
Marilena Furno
Journal of Classification, vol. 40, pp. 267–297
Citations: 0
Local and Overall Deviance R-Squared Measures for Mixtures of Generalized Linear Models.
IF 2; Zone 4 (Computer Science); Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2023-04-04; DOI: 10.1007/s00357-023-09432-4
Roberto Di Mari, Salvatore Ingrassia, Antonio Punzo

In generalized linear models (GLMs), measures of lack of fit are typically defined as the deviance between two nested models, and a deviance-based R² is commonly used to evaluate the fit. In this paper, we extend deviance measures to mixtures of GLMs, whose parameters are estimated by maximum likelihood (ML) via the EM algorithm. Such measures are defined both locally, i.e., at the cluster level, and globally, i.e., with reference to the whole sample. At the cluster level, we propose a normalized two-term decomposition of the local deviance into explained and unexplained local deviances. At the sample level, we introduce an additive normalized decomposition of the total deviance into three terms, each of which evaluates a different aspect of the fitted model: (1) the cluster separation on the dependent variable, (2) the proportion of the total deviance explained by the fitted model, and (3) the proportion of the total deviance that remains unexplained. We use the local and global decompositions to define, respectively, local and overall deviance R² measures for mixtures of GLMs, which we illustrate for Gaussian, Poisson and binomial responses by means of a simulation study. The proposed fit measures are then used to assess and interpret clusters of COVID-19 spread in Italy at two time points.
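
As background for the abstract above, the sketch below computes the ordinary deviance-based R² for a single Poisson GLM, defined as one minus the ratio of the fitted model's deviance to the deviance of the intercept-only (null) model. It is not the paper's local or overall decomposition for mixtures, which the abstract does not spell out. The function names, the counts in `y`, and the fitted means in `mu_fitted` are all hypothetical, and the fitted means are assumed to come from some model fitted elsewhere.

```python
import math


def poisson_deviance(y, mu):
    """Poisson deviance 2 * sum(y*log(y/mu) - (y - mu)); y*log(y/mu) is 0 when y == 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total


def deviance_r2(y, mu_fitted):
    """Deviance-based R^2 = 1 - D(fitted) / D(null), with an intercept-only null model."""
    y_bar = sum(y) / len(y)  # the null model predicts the sample mean everywhere
    d_null = poisson_deviance(y, [y_bar] * len(y))
    d_fit = poisson_deviance(y, mu_fitted)
    return 1.0 - d_fit / d_null


# Hypothetical observed counts and fitted means (assumed, not from the paper).
y = [1, 3, 2, 8, 10, 12]
mu_fitted = [1.5, 2.5, 2.0, 9.0, 10.0, 11.0]
r2 = deviance_r2(y, mu_fitted)
print(f"deviance R^2 = {r2:.4f}")
```

Two sanity checks follow from the definition: passing the null model's own predictions gives an R² of 0, and passing the observed counts as fitted means (a saturated fit) gives an R² of 1.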

Journal of Classification, pp. 1–34. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10071261/pdf/
Citations: 0
Characteristics of Distance Matrices Based on Euclidean, Manhattan and Hausdorff Coefficients
IF 2; Zone 4 (Computer Science); Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2023-04-03; DOI: 10.1007/s00357-023-09435-1
J. T. Temple, R. Bateman
Journal of Classification, vol. 40, pp. 214–232
Citations: 2
Finding the Proverbial Needle: Improving Minority Class Identification Under Extreme Class Imbalance
IF 2; Zone 4 (Computer Science); Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2023-02-23; DOI: 10.1007/s00357-023-09431-5
Trent Geisler, Herman Ray, Ying Xie
Journal of Classification, vol. 40, pp. 192–212
Citations: 0
Classification Trees with Mismeasured Responses
IF 2; Zone 4 (Computer Science); Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2023-02-16; DOI: 10.1007/s00357-023-09430-6
L. Diao, Grace Y. Yi
Journal of Classification, vol. 40, pp. 168–191
Citations: 0
Uncertainty Diagnostics of Binomial Regression Trees for Ordered Rating Data
IF 2; Zone 4 (Computer Science); Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2023-01-21; DOI: 10.1007/s00357-022-09429-5
R. Simone
Journal of Classification, vol. 40, pp. 79–105
Citations: 0
DDCAL: Evenly Distributing Data into Low Variance Clusters Based on Iterative Feature Scaling.
IF 2; Zone 4 (Computer Science); Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2023-01-01; DOI: 10.1007/s00357-022-09428-6
Marian Lux, Stefanie Rinderle-Ma

This work studies the problem of clustering one-dimensional data points such that they are evenly distributed over a given number of low variance clusters. One application is the visualization of data on choropleth maps or on business process models without over-emphasizing outliers, which enables the detection and differentiation of smaller clusters. The problem is tackled with a heuristic algorithm called DDCAL (1d distribution cluster algorithm), which uses iterative feature scaling to produce stable clustering results. The effectiveness of the DDCAL algorithm is demonstrated on 5 artificial data sets with different distributions and 4 real-world data sets reflecting different use cases. Moreover, DDCAL's results on these data sets are compared with those of 11 existing clustering algorithms. The application of the DDCAL algorithm is illustrated through the visualization of pandemic and population data on choropleth maps as well as process mining results on process models.
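
The two criteria named in the abstract, even cluster sizes and low within-cluster variance, can pull against each other when outliers are present. The sketch below makes that tension concrete with a plain equal-frequency split of sorted one-dimensional data. This is a baseline chosen for illustration and is not the DDCAL algorithm, whose steps the abstract does not give; the function names and the sample values are hypothetical.

```python
from statistics import pvariance


def equal_frequency_clusters(values, k):
    """Split sorted 1-d values into k contiguous groups of near-equal size.

    This is a simple baseline for illustration only, NOT the DDCAL algorithm.
    """
    data = sorted(values)
    n = len(data)
    bounds = [round(i * n / k) for i in range(k + 1)]
    return [data[bounds[i]:bounds[i + 1]] for i in range(k)]


def summarize(clusters):
    """Return cluster sizes (evenness) and population variances (tightness)."""
    sizes = [len(c) for c in clusters]
    variances = [pvariance(c) if len(c) > 1 else 0.0 for c in clusters]
    return sizes, variances


# Hypothetical data with a single large outlier.
values = [1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 250]
clusters = equal_frequency_clusters(values, 3)
sizes, variances = summarize(clusters)
print("sizes:", sizes)
print("variances:", variances)
```

On this sample the split is perfectly even, but the cluster that absorbs the outlier has a far larger variance than the others, which is the kind of imbalance between the two goals that the abstract describes.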

Journal of Classification, vol. 40, pp. 106–144. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9873542/pdf/
Citations: 1
A Semi-parametric Density Estimation with Application in Clustering
IF 2; Zone 4 (Computer Science); Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2022-12-14; DOI: 10.1007/s00357-022-09425-9
M. Salehi, A. Bekker, M. Arashi
Journal of Classification, vol. 40, pp. 52–78
Citations: 0
Merging Components in Linear Gaussian Cluster-Weighted Models
IF 2; Zone 4 (Computer Science); Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2022-12-07; DOI: 10.1007/s00357-022-09424-w
Sangkon Oh, Byungtae Seo
Journal of Classification, vol. 40, pp. 25–51
Citations: 1
Imputation Strategies for Clustering Mixed-Type Data with Missing Values
IF 2; Zone 4 (Computer Science); Q2 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS; Pub Date: 2022-11-26; DOI: 10.1007/s00357-022-09422-y
Rabea Aschenbruck, G. Szepannek, A. Wilhelm
Journal of Classification, vol. 40, pp. 2–24
Citations: 2