
Latest publications: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining

Direct optimization of ranking measures for learning to rank models
Ming Tan, Tian Xia, L. Guo, Shaojun Wang
We present a novel learning algorithm, DirectRank, which directly and exactly optimizes ranking measures without resorting to any upper bounds or approximations. Our approach is essentially an iterative coordinate ascent method. In each iteration, we choose one coordinate and update only the corresponding parameter, with all others remaining fixed. Since the ranking measure is a stepwise function of a single parameter, we propose a novel line search algorithm that efficiently locates the interval with the best ranking measure along this coordinate. To stabilize our system on small datasets, we construct a probabilistic framework for document-query pairs to maximize the likelihood of the objective permutation of the top-$\tau$ documents. This iterative procedure ensures convergence. Furthermore, we integrate regression trees as our weak learners in order to capture correlations between different features. Experiments on the LETOR datasets and two large datasets, the Yahoo challenge data and the Microsoft 30K web data, show an improvement over state-of-the-art systems.
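The single-coordinate line search can be pictured with a toy sketch (illustrative only; the function names and probe-based interval search below are ours, not the paper's). With all other parameters fixed, each document's score is linear in the chosen coordinate w, so the ranking, and hence any ranking measure, changes only where two score lines cross:

```python
def best_interval(base, feat, relevance, metric):
    """Toy one-coordinate line search: with scores s_i(w) = base[i] + w*feat[i],
    the induced ranking is piecewise constant in w, so it suffices to evaluate
    the metric once per interval between consecutive crossing points.
    Assumes at least one pair of score lines actually crosses."""
    n = len(base)
    # crossing of s_i(w) and s_j(w): w = (base[j] - base[i]) / (feat[i] - feat[j])
    crossings = sorted(
        (base[j] - base[i]) / (feat[i] - feat[j])
        for i in range(n) for j in range(i + 1, n)
        if feat[i] != feat[j]
    )
    if not crossings:
        return 0.0
    # one probe point per interval, plus one below and one above all crossings
    probes = [crossings[0] - 1.0]
    probes += [(a + b) / 2 for a, b in zip(crossings, crossings[1:])]
    probes += [crossings[-1] + 1.0]
    return max(probes, key=lambda w: metric(
        sorted(range(n), key=lambda i: -(base[i] + w * feat[i])), relevance))

def precision_at_1(order, relevance):
    """Example stepwise ranking measure: relevance of the top-ranked document."""
    return float(relevance[order[0]])
```

With two documents whose score lines cross once, the search returns a probe from whichever side of the crossing ranks the relevant document first.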
Citations: 29
SVMpAUCtight: a new support vector method for optimizing partial AUC based on a tight convex upper bound
H. Narasimhan, S. Agarwal
The area under the ROC curve (AUC) is a well known performance measure in machine learning and data mining. In an increasing number of applications, however, ranging from ranking applications to a variety of important bioinformatics applications, performance is measured in terms of the partial area under the ROC curve between two specified false positive rates. In recent work, we proposed a structural SVM based approach for optimizing this performance measure (Narasimhan and Agarwal, 2013). In this paper, we develop a new support vector method, SVMpAUCtight, that optimizes a tighter convex upper bound on the partial AUC loss, which leads to both improved accuracy and reduced computational complexity. In particular, by rewriting the empirical partial AUC risk as a maximum over subsets of negative instances, we derive a new formulation, where a modified form of the earlier optimization objective is evaluated on each of these subsets, leading to a tighter hinge relaxation on the partial AUC loss. As with our previous method, the resulting optimization problem can be solved using a cutting-plane algorithm, but the new method has better run time guarantees. We also discuss a projected subgradient method for solving this problem, which offers additional computational savings in certain settings. We demonstrate on a wide variety of bioinformatics tasks, ranging from protein-protein interaction prediction to drug discovery tasks, that the proposed method does, in many cases, perform significantly better on the partial AUC measure than the previous structural SVM approach. In addition, we also develop extensions of our method to learn sparse and group sparse models, often of interest in biological applications.
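A rough empirical version of the partial-AUC quantity being optimized can be computed by counting positive/negative score pairs whose negative falls in the specified FPR band (one of several discrete definitions; ties are ignored, and the paper's exact formulation differs):

```python
def partial_auc(pos_scores, neg_scores, alpha, beta):
    """Empirical partial AUC over the FPR range [alpha, beta], ties ignored.
    A negative contributes to the band if its rank among negatives (by
    descending score) falls in positions (alpha*n, beta*n]."""
    neg_sorted = sorted(neg_scores, reverse=True)  # most positive-looking first
    lo, hi = int(alpha * len(neg_sorted)), int(beta * len(neg_sorted))
    band = neg_sorted[lo:hi]                       # negatives in the FPR band
    wins = sum(1 for p in pos_scores for n in band if p > n)
    return wins / (len(pos_scores) * len(band))
```

With alpha=0 and beta=1 this reduces to the usual pairwise (Wilcoxon) estimate of the full AUC.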
Citations: 45
When TEDDY meets GrizzLY: temporal dependency discovery for triggering road deicing operations
C. Robardet, Vasile-Marian Scuturici, M. Plantevit, A. Fraboulet
Temporal dependencies between multiple sensor data sources link two types of events if the occurrence of one is repeatedly followed by the appearance of the other within a certain time interval. The TEDDY algorithm aims at discovering such dependencies, identifying the statistically significant time intervals with a chi-squared test. We show how these dependencies can be used within the GrizzLY project to tackle an environmental and technical issue: the deicing of roads. This project aims to organize the deicing operations of an urban area wisely, based on several sensor-network measurements of local atmospheric phenomena. A spatial and temporal dependency-based model is built from these data to predict freezing alerts.
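The core measurement can be sketched generically: count how often one event type is followed by the other within a candidate window, then score a 2x2 contingency table with a Pearson chi-squared statistic (an illustration only; TEDDY's interval search and test setup are more involved than this):

```python
import bisect

def follows_within(a_times, b_times, lo, hi):
    """Count occurrences of A followed by at least one B in the window (t+lo, t+hi]."""
    b = sorted(b_times)
    count = 0
    for t in a_times:
        i = bisect.bisect_right(b, t + lo)   # first B strictly after t + lo
        if i < len(b) and b[i] <= t + hi:
            count += 1
    return count

def chi_squared_2x2(o11, o12, o21, o22):
    """Pearson chi-squared statistic for a 2x2 contingency table of observed counts."""
    n = o11 + o12 + o21 + o22
    r1, r2 = o11 + o12, o21 + o22            # row totals
    c1, c2 = o11 + o21, o12 + o22            # column totals
    expected = [r1 * c1 / n, r1 * c2 / n, r2 * c1 / n, r2 * c2 / n]
    observed = [o11, o12, o21, o22]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

A large statistic relative to the chi-squared critical value for one degree of freedom flags the candidate interval as significant.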
Citations: 1
An efficient ADMM algorithm for multidimensional anisotropic total variation regularization problems
Sen Yang, Jie Wang, Wei Fan, Xiatian Zhang, Peter Wonka, Jieping Ye
Total variation (TV) regularization has important applications in signal processing, including image denoising, image deblurring, and image reconstruction. A significant challenge in the practical use of TV regularization lies in the nondifferentiable convex optimization, which is difficult to solve, especially for large-scale problems. In this paper, we propose an efficient alternating direction method of multipliers (ADMM) to solve total variation regularization problems. The proposed algorithm is applicable to tensors, and thus it can solve multidimensional total variation regularization problems. One appealing feature of the proposed algorithm is that it does not need to solve a linear system of equations, which is often the most expensive part of previous ADMM-based methods. In addition, each step of the proposed algorithm involves a set of independent and smaller problems, which can be solved in parallel. Thus, the proposed algorithm scales to large problems. Furthermore, the global convergence of the proposed algorithm is guaranteed, and its time complexity is O(dN/ε) on a d-mode tensor with N entries for achieving an ε-optimal solution. Extensive experimental results demonstrate the superior performance of the proposed algorithm in comparison with current state-of-the-art methods.
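As a toy instance of the problem class, here is a textbook ADMM baseline for 1-D anisotropic TV denoising, minimizing 0.5*||x - y||^2 + lam*sum_i |x[i+1] - x[i]|. Note the contrast with the paper: this baseline does solve a (tridiagonal) linear system in each x-update, which is exactly the cost the paper's tensor algorithm avoids:

```python
def tv_denoise_admm(y, lam, rho=1.0, iters=200):
    """Toy ADMM for 1-D anisotropic TV denoising via the splitting z = Dx,
    where D takes first differences.  Per iteration:
      x-update: solve (I + rho*D'D) x = y + rho*D'(z - u)   (tridiagonal)
      z-update: elementwise soft-thresholding of Dx + u at lam/rho
      u-update: u += Dx - z."""
    n = len(y)
    soft = lambda v, k: max(v - k, 0.0) + min(v + k, 0.0)  # prox of k*|.|
    # tridiagonal coefficients of A = I + rho*D'D (constant off-diagonal)
    diag = [1 + rho] + [1 + 2 * rho] * (n - 2) + [1 + rho]
    off = -rho
    z, u, x = [0.0] * (n - 1), [0.0] * (n - 1), list(y)
    for _ in range(iters):
        # right-hand side: y + rho*D'(z - u)
        w = [zi - ui for zi, ui in zip(z, u)]
        rhs = list(y)
        for i in range(n - 1):
            rhs[i] -= rho * w[i]
            rhs[i + 1] += rho * w[i]
        # Thomas algorithm for the tridiagonal solve
        c, d = [0.0] * n, [0.0] * n
        c[0], d[0] = off / diag[0], rhs[0] / diag[0]
        for i in range(1, n):
            m = diag[i] - off * c[i - 1]
            c[i] = off / m
            d[i] = (rhs[i] - off * d[i - 1]) / m
        x[n - 1] = d[n - 1]
        for i in range(n - 2, -1, -1):
            x[i] = d[i] - c[i] * x[i + 1]
        dx = [x[i + 1] - x[i] for i in range(n - 1)]
        z = [soft(dx[i] + u[i], lam / rho) for i in range(n - 1)]
        u = [u[i] + dx[i] - z[i] for i in range(n - 1)]
    return x
```

On a noisy piecewise-constant signal, the result has both a lower objective value and a lower total variation than the input.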
Citations: 57
Collaborative boosting for activity classification in microblogs
Yangqiu Song, Zhengdong Lu, C. Leung, Qiang Yang
Users' daily activities, such as dining and shopping, inherently reflect their habits, intents, and preferences, and thus provide invaluable information for services such as personalized information recommendation and targeted advertising. Users' activity information, although ubiquitous on social media, has largely gone unexploited. This paper addresses the task of user activity classification in microblogs, where users can publish short messages and maintain social networks online. We identify the importance of modeling a user's individuality, and of exploiting the opinions of the user's friends, for accurate activity classification. In this light, we propose a novel collaborative boosting framework comprising a text-to-activity classifier for each user, and a mechanism for collaboration between classifiers of users having social connections. The collaboration between two classifiers includes exchanging their own training instances and their dynamically changing labeling decisions. We propose an iterative learning procedure that is formulated as gradient descent in learning function space, while opinion exchange between classifiers is implemented with a weighted vote in each learning iteration. We show through experiments on real-world data from Sina Weibo that our method outperforms existing off-the-shelf algorithms that do not take users' individuality or social connections into account.
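The per-iteration weighted vote among socially connected classifiers might be sketched as follows (purely illustrative; the function and the weighting scheme are hypothetical, not taken from the paper):

```python
def collaborative_label(own_score, friend_scores, friend_weights):
    """Hypothetical weighted vote: blend a user's own classifier score with the
    current scores of socially connected users' classifiers.  The user's own
    score carries unit weight; each friend's score is weighted by the strength
    of the social connection."""
    total = own_score + sum(w * s for w, s in zip(friend_weights, friend_scores))
    norm = 1.0 + sum(friend_weights)
    return total / norm
```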
Citations: 22
Speeding up large-scale learning with a social prior
Deepayan Chakrabarti, R. Herbrich
Slow convergence and poor initial accuracy are two problems that plague efforts to use very large feature sets in online learning. This is especially true when only a few features are "active" in any training example, and the frequency of activations of different features is skewed. We show how these problems can be mitigated if a graph of relationships between features is known. We study this problem in a fully Bayesian setting, focusing on the problem of using Facebook user-IDs as features, with the social network giving the relationship structure. Our analysis uncovers significant problems with the obvious regularizations, and motivates a two-component mixture-model "social prior" that is provably better. Empirical results on large-scale click prediction problems show that our algorithm can learn as well as the baseline with 12M fewer training examples, and continuously outperforms it for over 60M examples. On a second problem using binned features, our model outperforms the baseline even after the latter sees 5x as much data.
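The "borrow strength from the social graph" idea behind such a prior can be sketched with a hypothetical helper (the paper's prior is a two-component Bayesian mixture, not this simple neighbor average):

```python
def social_prior_mean(weights, graph, uid, global_mean=0.0):
    """Illustrative graph-informed prior mean for a sparse feature's weight:
    a rarely activated feature (here, a user-ID) inherits the average learned
    weight of its observed graph neighbors, falling back to the global mean
    when no neighbor has been seen yet."""
    seen = [weights[v] for v in graph.get(uid, []) if v in weights]
    return sum(seen) / len(seen) if seen else global_mean
```

Centering a new user-ID's weight at this value, instead of at zero, is one way such a prior could raise initial accuracy when only a few features are active per example.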
Citations: 2
Massively parallel expectation maximization using graphics processing units
M. C. Altinigneli, C. Plant, C. Böhm
Composed of several hundreds of processors, the Graphics Processing Unit (GPU) has become a very interesting platform for computationally demanding tasks on massive data. A special hierarchy of processors and fast memory units allows very powerful and efficient parallelization but also demands novel parallel algorithms. Expectation Maximization (EM) is a widely used technique for maximum likelihood estimation. In this paper, we propose an innovative EM clustering algorithm particularly suited for the GPU platform on NVIDIA's Fermi architecture. The central idea of our algorithm is to allow the parallel threads to exchange their local information asynchronously, updating their cluster representatives on demand via a technique called Asynchronous Model Updates (Async-EM). Async-EM enables our algorithm not only to accelerate convergence but also to reduce the overhead induced by memory bandwidth limitations and synchronization requirements. We demonstrate (1) how to reformulate the EM algorithm to exchange information using Async-EM and (2) how to exploit the special memory and processor architecture of a modern GPU to share this information among threads in an optimal way. As a perspective, Async-EM is not limited to EM but can be applied to a variety of algorithms.
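For reference, the synchronous baseline being parallelized is standard EM. A minimal sequential sketch for a two-component 1-D Gaussian mixture (fixed unit variances and equal mixing weights; no GPU or asynchrony, which is precisely what Async-EM adds):

```python
import math

def em_gmm_1d(xs, mu, iters=50):
    """Sequential EM for a two-component 1-D Gaussian mixture with fixed unit
    variances and equal mixing weights; only the two means are learned."""
    m1, m2 = mu
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point
        r = []
        for x in xs:
            p1 = math.exp(-0.5 * (x - m1) ** 2)
            p2 = math.exp(-0.5 * (x - m2) ** 2)
            r.append(p1 / (p1 + p2))
        # M-step: update the means from the soft assignments
        s1 = sum(r)
        s2 = len(xs) - s1
        m1 = sum(ri * x for ri, x in zip(r, xs)) / s1
        m2 = sum((1 - ri) * x for ri, x in zip(r, xs)) / s2
    return m1, m2
```

On well-separated data the means converge to the two cluster centers within a few iterations; the E-step's per-point responsibilities are the part that maps naturally onto parallel GPU threads.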
Citations: 20
Unsupervised link prediction using aggregative statistics on heterogeneous social networks
Tsung-Ting Kuo, Rui Yan, Yu-Yang Huang, Perng-Hwa Kung, Shou-de Lin
Privacy has become an important concern for online social networks. In services such as Foursquare.com, whether a person likes an article is considered private and therefore not disclosed; only aggregative statistics about articles (i.e., how many people like an article) are revealed. This paper tries to answer a question: can we predict the opinion holder in a heterogeneous social network without any labeled data? This question can be generalized to a link prediction with aggregative statistics problem. This paper devises a novel unsupervised framework to solve this problem, including two main components: (1) a three-layer factor graph model with three types of potential functions; and (2) a ranked-margin learning and inference algorithm. Finally, we evaluate our method on four diverse prediction scenarios using four datasets: preference (Foursquare), repost (Twitter), response (Plurk), and citation (DBLP). We further exploit nine unsupervised models as baselines for this problem. Our approach not only wins in all scenarios, but on average achieves a 9.90% AUC and 12.59% NDCG improvement over the best competitors. The resources are available at http://www.csie.ntu.edu.tw/~d97944007/aggregative/
Citations: 52
Improving quality control by early prediction of manufacturing outcomes
S. Weiss, Amit Dhurandhar, R. Baseman
We describe methods for continual prediction of manufactured product quality prior to final testing. In our most expansive modeling approach, an estimated final characteristic of a product is updated after each manufacturing operation. Our initial application is the manufacture of microprocessors, where we predict final microprocessor speed. Using these predictions, early corrective manufacturing actions may be taken to increase the speed of expected slow wafers (a collection of microprocessors) or reduce the speed of fast wafers. Such predictions may also be used to initiate corrective supply chain management actions. Developing statistical learning models for this task has many complicating factors: (a) a temporally unstable population; (b) missing data resulting from sparsely sampled measurements; and (c) relatively few available measurements prior to corrective-action opportunities. In a real manufacturing pilot application, our automated models selected 125 fast wafers in real time. As predicted, those wafers were significantly faster than average. During manufacture, downstream corrective processing restored 25 nominally unacceptable wafers to normal operation.
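The continual-update scheme described in the abstract can be sketched in a toy form: for every prefix length k, fit a simple model from the measurements observed so far to the final product characteristic, then re-predict after each operation. The function names and the one-variable least-squares model are illustrative assumptions, not the authors' models.

```python
def fit_step_models(histories, outcomes):
    """For each prefix length k, fit a one-variable least-squares model
    mapping the mean of the first k measurements to the final outcome."""
    n_steps = len(histories[0])
    models = []
    for k in range(1, n_steps + 1):
        xs = [sum(h[:k]) / k for h in histories]
        ys = outcomes
        n = len(xs)
        mx = sum(xs) / n
        my = sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        var = sum((x - mx) ** 2 for x in xs)
        slope = cov / var if var else 0.0
        models.append((slope, my - slope * mx))
    return models

def predict(models, partial):
    """Re-estimate the final outcome from a partial measurement history,
    using the model for the current prefix length."""
    k = len(partial)
    slope, intercept = models[k - 1]
    return slope * (sum(partial) / k) + intercept
```

After each manufacturing operation one would call `predict` with the wafer's measurements so far and flag wafers whose estimate falls outside the acceptable speed range.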
Published 2013-08-11 · DOI: https://doi.org/10.1145/2487575.2488192
Citations: 17
Model selection in markovian processes
Assaf Hallak, Dotan Di Castro, Shie Mannor
When analyzing data that originated from a dynamical system, a common practice is to cast the problem in the well-known frameworks of Markov Decision Processes (MDPs) and Reinforcement Learning (RL). The state space in these solutions is usually chosen in some heuristic fashion, and the resulting MDP can then be used to simulate and predict data, as well as to indicate the best possible action in each state. The model chosen to characterize the data affects the complexity and accuracy of any further action we may wish to apply, yet few methods have been suggested that rely on the dynamic structure to select such a model. In this work we address the problem of how to use time-series data to choose from a finite set of candidate discrete state spaces, where these spaces are constructed by a domain expert. We formalize the notion of model selection consistency in the proposed setup. We then discuss the difference between our proposed framework and the classical Maximum Likelihood (ML) framework, and give an example where ML fails. Afterwards, we suggest alternative selection criteria and show them to be weakly consistent. We then define weak consistency for a model construction algorithm and exhibit a simple algorithm that is weakly consistent. Finally, we test the performance of the suggested criteria and algorithm on both simulated and real-world data.
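The classical maximum-likelihood baseline that the abstract contrasts against (and shows can fail) can be sketched as scoring each candidate state space by the log-likelihood of the state trajectory it induces, optionally with a BIC-style parameter penalty. The function names and the penalty form are illustrative assumptions, not the selection criteria proposed in the paper.

```python
import math
from collections import Counter

def trajectory_score(obs, state_map, penalty_per_param=0.0):
    # Map raw observations onto the candidate discrete state space.
    states = [state_map(o) for o in obs]
    # Maximum-likelihood transition probabilities are the empirical
    # frequencies of observed transitions.
    trans = Counter(zip(states, states[1:]))
    outgoing = Counter(states[:-1])
    log_lik = sum(c * math.log(c / outgoing[s]) for (s, _), c in trans.items())
    # BIC-style penalty: each of the k states has (k - 1) free transition
    # probabilities, since rows of the transition matrix sum to one.
    k = len(set(states))
    return log_lik - penalty_per_param * k * (k - 1)

def select_model(obs, candidates, penalty_per_param=0.0):
    # Pick the candidate state map whose induced chain scores highest.
    return max(candidates,
               key=lambda name: trajectory_score(obs, candidates[name],
                                                 penalty_per_param))
```

With no penalty this always favors the finest state space that fits the data; the penalty trades fit against model size, which is exactly the tension the paper's alternative criteria are designed to resolve.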
Published 2013-08-11 · DOI: https://doi.org/10.1145/2487575.2487613
Citations: 25