
Latest articles from ACM Transactions on Knowledge Discovery from Data (TKDD)

Hybrid Variational Autoencoder for Recommender Systems
Pub Date : 2021-09-04 DOI: 10.1145/3470659
Hangbin Zhang, R. Wong, Victor W. Chu
E-commerce platforms rely heavily on automatic personalized recommender systems, e.g., collaborative filtering models, to improve customer experience. Several hybrid models have recently been proposed to address the deficiencies of existing models. However, their performance drops significantly when the dataset is sparse. Most recent works fail to fully address this shortcoming; at best, some alleviate the problem by considering either user-side or item-side content information. In this article, we propose a novel recommender model called Hybrid Variational Autoencoder (HVAE) to improve performance on sparse datasets. Unlike existing approaches, we encode both user and item information into a latent space for semantic relevance measurement. In parallel, we use collaborative filtering to find the implicit factors of users and items, and combine the outputs to deliver a hybrid solution. In addition, we compare Gaussian and multinomial distributions for learning representations of the textual data. Our experimental results show that HVAE significantly outperforms state-of-the-art models with robust performance.
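The key mechanism of encoding user and item information into a Gaussian latent space can be illustrated with a minimal reparameterization sketch. This is not the authors' implementation; the linear encoder, array sizes, and the concatenation of a collaborative user vector with an item content vector are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W_mu, W_logvar):
    # Illustrative linear encoder: map input features to the mean and
    # log-variance of a Gaussian over the latent space.
    return x @ W_mu, x @ W_logvar

def reparameterize(mu, logvar, rng):
    # Sample z = mu + sigma * eps, the standard VAE reparameterization trick,
    # so gradients can flow through mu and logvar during training.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# Toy "hybrid" input: a user's collaborative-filtering vector concatenated
# with an item's content vector (both invented for this sketch).
user_vec = rng.standard_normal(8)
item_vec = rng.standard_normal(8)
x = np.concatenate([user_vec, item_vec])

W_mu = rng.standard_normal((16, 4))
W_logvar = rng.standard_normal((16, 4)) * 0.01

mu, logvar = encode(x, W_mu, W_logvar)
z = reparameterize(mu, logvar, rng)
print(z.shape)  # (4,)
```

A real model would train these weights by maximizing the evidence lower bound; the sketch only shows how user and item sides share one latent code.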
Citations: 4
Assessing Large-Scale Power Relations among Locations from Mobility Data
Pub Date : 2021-09-04 DOI: 10.1145/3470770
L. S. Oliveira, P. V. D. Melo, A. C. Viana
The pervasiveness of smartphones has shaped our lives, social norms, and the structures that dictate human behavior. They now directly influence how individuals demand resources and interact with network services. In this scenario, identifying key locations in cities is fundamental to the investigation of human mobility and to the understanding of social problems. In this context, we propose the first graph-based methodology in the literature to quantify the power of Points-of-Interest (POIs) over their vicinity by means of user mobility trajectories. Unlike prior work, our analysis considers the flow of people, rather than the number of neighboring POIs or their structural locations in the city. We model POI visits with a multiflow graph in which each POI is a node and the transitions of users between POIs are weighted directed edges. Using this multiflow graph model, we compute the attract, support, and independence powers. The attract power and support power measure how many visits a POI gathers from and disseminates over its neighborhood, respectively, while the independence power captures the capacity of a POI to receive visitors independently of other POIs. We tested our methodology on well-known university campus mobility datasets and validated it on Location-Based Social Network (LBSN) datasets from various cities around the world. Our findings show that on a university campus: (i) buildings have low support and attract powers; (ii) people tend to move among a few buildings and spend most of their time in the same building; and (iii) there is slight dependence among buildings: even those with high independence power receive user visits from other buildings on campus. Globally, we reveal that (i) our metrics capture places that impact the number of visits in their neighborhood; (ii) cities on the same continent have similar independence patterns; and (iii) places with high visitation and city central areas are the regions with the highest degree of independence.
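The directional intuition behind attract and support powers can be sketched on a toy transition graph. The paper's actual metrics involve normalization over the multiflow graph, and the independence power is omitted here; this is only a simplified, unnormalized proxy, with POI names invented for illustration:

```python
# Toy transition flows between POIs on a campus:
# (origin, destination) -> number of user transitions observed.
flows = {
    ("library", "cafe"): 30,
    ("dorm", "cafe"): 20,
    ("cafe", "library"): 5,
    ("cafe", "dorm"): 10,
}

def attract_power(poi):
    # Proxy for attract power: visits the POI gathers from its neighborhood
    # (total weight of incoming edges).
    return sum(w for (src, dst), w in flows.items() if dst == poi)

def support_power(poi):
    # Proxy for support power: visits the POI disseminates over its
    # neighborhood (total weight of outgoing edges).
    return sum(w for (src, dst), w in flows.items() if src == poi)

print(attract_power("cafe"), support_power("cafe"))  # 50 15
```

Modeling flows (edge weights) rather than mere adjacency is what distinguishes this from counting neighboring POIs.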
Citations: 0
Streaming Data Preprocessing via Online Tensor Recovery for Large Environmental Sensor Networks
Pub Date : 2021-09-01 DOI: 10.1145/3532189
Yue Hu, Ao Qu, Yanbing Wang, D. Work
Measuring the built and natural environment at a fine-grained scale is now possible with low-cost urban environmental sensor networks. However, fine-grained city-scale data analysis is complicated by tedious data cleaning, including removing outliers and imputing missing data. While many methods exist to automatically correct anomalies and impute missing entries, challenges remain for data with large spatio-temporal scales and shifting patterns. To address these challenges, we propose an online robust tensor recovery (OLRTR) method to preprocess streaming high-dimensional urban environmental datasets. A small dictionary that captures the underlying patterns of the data is computed and constantly updated as new data arrive. OLRTR enables online recovery for large-scale sensor networks that provide continuous data streams, with lower memory usage than offline batch counterparts. In addition, we formulate the objective function so that OLRTR can detect structured outliers, such as faulty readings over a long period of time. We validate OLRTR on a synthetically degraded National Oceanic and Atmospheric Administration temperature dataset and apply it to the Array of Things city-scale sensor network in Chicago, IL, showing superior results compared with several established online and batch-based low-rank decomposition methods.
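The streaming outlier-correction step can be illustrated with a minimal sketch: project each incoming sensor snapshot onto a small dictionary and flag large residuals. This is not OLRTR itself (the paper's method also updates the dictionary online and uses a robust objective); the one-atom dictionary and threshold below are assumptions for illustration:

```python
import numpy as np

def process_sample(D, y, thresh):
    # Fit the snapshot y with the current dictionary D by least squares,
    # flag entries whose residual exceeds thresh as outliers, and replace
    # them with the low-rank reconstruction.
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    recon = D @ coef
    outliers = np.abs(y - recon) > thresh
    y_clean = np.where(outliers, recon, y)
    return y_clean, outliers

D = np.ones((3, 1))             # trivial dictionary: all sensors should agree
y = np.array([1.0, 1.0, 10.0])  # third sensor reading is faulty
y_clean, outliers = process_sample(D, y, thresh=4.0)
print(outliers)  # [False False  True]
```

In a full pipeline the dictionary would then be refined with the cleaned sample, keeping memory bounded as the stream grows.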
Citations: 4
Self-Supervised Transformer for Sparse and Irregularly Sampled Multivariate Clinical Time-Series
Pub Date : 2021-07-29 DOI: 10.1145/3516367
Sindhu Tipirneni, C. Reddy
Multivariate time-series data are frequently observed in critical care settings and are typically characterized by sparsity (missing information) and irregular time intervals. Existing approaches for learning representations in this domain handle these challenges by either aggregation or imputation of values, which in turn suppresses fine-grained information and adds undesirable noise/overhead to the machine learning model. To tackle this problem, we propose a Self-supervised Transformer for Time-Series (STraTS) model, which overcomes these pitfalls by treating a time-series as a set of observation triplets instead of using the standard dense matrix representation. It employs a novel Continuous Value Embedding technique to encode continuous times and variable values without the need for discretization. It is composed of a Transformer component with multi-head attention layers, which enables it to learn contextual triplet embeddings while avoiding the recurrence and vanishing-gradient problems of recurrent architectures. In addition, to tackle the limited availability of labeled data (typical of many healthcare applications), STraTS utilizes self-supervision, leveraging unlabeled data to learn better representations with time-series forecasting as an auxiliary proxy task. Experiments on real-world multivariate clinical time-series benchmark datasets demonstrate that STraTS has better prediction performance than state-of-the-art methods for mortality prediction, especially when labeled data are limited. Finally, we also present an interpretable version of STraTS, which can identify important measurements in the time-series data. Our data preprocessing and model implementation codes are available at https://github.com/sindhura97/STraTS.
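The observation-triplet representation at the heart of STraTS can be sketched directly. The variable names and values below are invented for illustration; the point is that a sparse, irregularly sampled record becomes a flat set of (time, variable, value) triplets rather than a mostly-empty dense matrix on a fixed time grid:

```python
# Sparse, irregularly sampled clinical record: variable -> [(time, value), ...]
record = {
    "heart_rate": [(0.0, 80.0), (2.5, 82.0)],
    "blood_pressure": [(1.0, 120.0)],
}

# Flatten into (time, variable, value) observation triplets, ordered by time.
# Only observed values appear: no imputation, no aggregation into time bins.
triplets = sorted(
    (t, var, val) for var, obs in record.items() for t, val in obs
)
print(triplets[0])  # (0.0, 'heart_rate', 80.0)
```

Each triplet is then embedded (with the continuous time and value encoded without discretization) and the set is fed to the Transformer's attention layers.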
Citations: 40
Establishing Smartphone User Behavior Model Based on Energy Consumption Data
Pub Date : 2021-07-21 DOI: 10.1145/3461459
M. Ding, Tianyu Wang, Xudong Wang
In smartphone data analysis, both energy consumption modeling and user behavior mining have been explored extensively, but the relationship between energy consumption and user behavior has rarely been studied. This article explores such a relationship over large-scale users. Based on energy consumption data, where each user's feature vector is represented by the energy breakdown over the hardware components of different apps, User Behavior Models (UBM) are established to capture user behavior patterns (i.e., app preference, usage time). The challenge lies in the high diversity of user behaviors (i.e., massive numbers of apps and usage styles), which leads to high dimension and dispersion of the data. To overcome this challenge, three mechanisms are designed. First, to reduce the dimension, apps are ranked, with the top ones identified as typical apps that represent all of them. Second, dispersion is reduced by scaling each user's feature vector over typical apps to unit ℓ1 norm. The scaled vector becomes the Usage Pattern, while the ℓ1 norm of the vector before scaling is treated as the Usage Intensity. Third, the usage pattern is analyzed with a two-layer clustering approach to further reduce data dispersion. In the upper layer, each typical app is studied across its users with respect to hardware components to identify Typical Hardware Usage Patterns (THUP). In the lower layer, users are studied with respect to these THUPs to identify Typical App Usage Patterns (TAUP). The analytical results of these two layers are consolidated into Usage Pattern Models (UPM), and UBMs are finally established by a union of UPMs and Usage Intensity Distributions (UID). Through experiments on energy consumption data from 18,308 distinct users over 10 days, 33 UBMs are extracted from the training data. On the test data, these UBMs cover 94% of user behaviors and achieve up to a 20% improvement in the accuracy of energy representation compared with the baseline method, PCA. Besides, potential applications and implications of these UBMs are illustrated for smartphone manufacturers, app developers, network providers, and so on.
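The second mechanism, splitting a user's energy vector into Usage Pattern and Usage Intensity via the ℓ1 norm, is simple enough to sketch. The per-app energy values below are invented; the split itself follows the description in the abstract:

```python
def split_pattern_intensity(energy_vec):
    # Usage Intensity = the vector's ℓ1 norm (total energy across typical apps).
    intensity = sum(abs(v) for v in energy_vec)
    if intensity == 0:
        return list(energy_vec), 0.0
    # Usage Pattern = the vector scaled to unit ℓ1 norm (relative app mix),
    # which removes per-user scale and reduces dispersion across users.
    pattern = [v / intensity for v in energy_vec]
    return pattern, intensity

# Toy per-app energy breakdown (e.g., Joules) for one user over three apps.
pattern, intensity = split_pattern_intensity([2.0, 1.0, 1.0])
print(pattern, intensity)  # [0.5, 0.25, 0.25] 4.0
```

Clustering then operates on the scale-free patterns, while intensities are modeled separately as distributions.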
Citations: 4
Opinion Dynamics Optimization by Varying Susceptibility to Persuasion via Non-Convex Local Search
Pub Date : 2021-07-21 DOI: 10.1145/3466617
Rediet Abebe, T-H. Hubert Chan, J. Kleinberg, Zhibin Liang, D. Parkes, Mauro Sozio, Charalampos E. Tsourakakis
A long line of work in social psychology has studied variations in people's susceptibility to persuasion, that is, the extent to which they are willing to modify their opinions on a topic. This body of literature suggests an interesting perspective on theoretical models of opinion formation by interacting parties in a network: in addition to interventions that directly modify people's intrinsic opinions, it is natural to consider interventions that modify people's susceptibility to persuasion. Motivated by this fact, we propose an influence optimization problem. Specifically, we adopt a popular model for social opinion dynamics, in which each agent has a fixed innate opinion and a resistance that measures the importance it places on its innate opinion; agents influence one another's opinions through an iterative process. Under certain conditions, this iterative process converges to an equilibrium opinion vector. In the unbudgeted variant of the problem, the goal is to modify the resistance of any number of agents (within some given range) such that the sum of the equilibrium opinions is minimized; in the budgeted variant, the algorithm is additionally given an upfront restriction on the number of agents whose resistance may be modified. We prove that the objective function is in general non-convex. Hence, formulating the problem as a convex program, as in an early version of this work (Abebe et al., KDD'18), might have potential correctness issues. We instead analyze the structure of the objective function and show that any local optimum is also a global optimum, which is somewhat surprising given that the objective function might not be convex. Furthermore, we combine the iterative process with the local search paradigm to design very efficient algorithms that can solve the unbudgeted variant optimally on large-scale graphs containing millions of nodes. Finally, we propose and experimentally evaluate a family of heuristics for the budgeted variant of the problem.
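The iterative process the abstract refers to can be sketched with a Friedkin-Johnsen-style update, a standard instance of the innate-opinion-plus-resistance dynamics described above (assumed here for illustration, not necessarily the paper's exact formulation; the two-agent network is a toy example):

```python
import numpy as np

def equilibrium_opinions(innate, resistance, W, iters=500):
    # Repeatedly mix each agent's innate opinion (weighted by its resistance)
    # with the weighted average of its neighbors' current opinions.
    # With resistances in (0, 1] and row-stochastic W, this contracts to a
    # unique equilibrium opinion vector.
    x = innate.astype(float).copy()
    for _ in range(iters):
        x = resistance * innate + (1.0 - resistance) * (W @ x)
    return x

innate = np.array([0.0, 1.0])
resistance = np.array([0.5, 0.5])       # importance placed on the innate opinion
W = np.array([[0.0, 1.0],
              [1.0, 0.0]])              # row-stochastic neighbor weights
x = equilibrium_opinions(innate, resistance, W)
print(x)  # approximately [1/3, 2/3]
```

Raising an agent's resistance pins its equilibrium opinion closer to its innate one, which is exactly the lever the optimization problem tunes.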
Citations: 11
Dynamically Adjusting Diversity in Ensembles for the Classification of Data Streams with Concept Drift
Pub Date : 2021-07-21 DOI: 10.1145/3466616
Juan Isidro González Hidalgo, S. G. T. C. Santos, Roberto S. M. Barros
A data stream can be defined as a system that continually generates large amounts of data over time. Today, processing data streams imposes new demands and challenging tasks in the data mining and machine learning areas. Concept drift is a problem commonly characterized as a change in the distribution of the data within a data stream. Implementing new methods for data streams where concept drifts occur requires algorithms that can adapt to several scenarios to improve their performance in the different experimental situations where they are tested. This research proposes a strategy for dynamic parameter adjustment in the presence of concept drifts. The Parameter Estimation Procedure (PEP) is a general method for dynamically adjusting parameters, applied here to the diversity parameter (λ) of several classification ensembles commonly used in the area. To this end, PEP was used to create Boosting-like Online Learning Ensemble with Parameter Estimation (BOLE-PE), Online AdaBoost-based M1 with Parameter Estimation (OABM1-PE), and Oza and Russell's Online Bagging with Parameter Estimation (OzaBag-PE), based on the existing ensembles BOLE, OABM1, and OzaBag, respectively. To validate them, experiments were performed with artificial and real-world datasets using Hoeffding Tree (HT) as the base classifier. The accuracy results were statistically evaluated using a variation of the Friedman test and the Nemenyi post-hoc test. The experimental results showed that dynamic estimation of the diversity parameter (λ) produced good results in most scenarios, i.e., the modified methods improved accuracy in experiments with both artificial and real-world datasets.
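The role of λ can be illustrated with the sampling step of Oza and Russell's online bagging, where each base model trains on an incoming example k times with k drawn from Poisson(λ); larger λ yields more varied per-model training weights and thus more ensemble diversity. This is a sketch of that sampling step only, not of PEP's estimation procedure, and the ensemble size is an assumption:

```python
import numpy as np

rng = np.random.default_rng(7)

def online_bagging_weights(n_models, lam, rng):
    # For one incoming example, draw how many times each of the n_models
    # base classifiers trains on it: k ~ Poisson(lam) per model.
    # A dynamic scheme such as PEP would adjust lam as the stream drifts.
    return rng.poisson(lam, size=n_models)

k = online_bagging_weights(n_models=10, lam=1.0, rng=rng)
print(k.shape)  # (10,)
```

With λ = 1 this reproduces classic online bagging; drift-aware variants raise or lower λ to trade stability against diversity.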
数据流可以定义为随着时间的推移不断生成大量数据的系统。今天,处理数据流在数据挖掘和机器学习领域提出了新的要求和具有挑战性的任务。概念漂移是一个通常以数据流中数据分布变化为特征的问题。实现处理发生概念漂移的数据流的新方法需要能够适应多种场景的算法,以提高其在测试的不同实验情况下的性能。本研究提出了一种存在概念漂移时的动态参数调整策略。参数估计程序(PEP)是一种动态调整参数的通用方法,应用于该领域常用的几种分类系统的分集参数(λ)。为此,利用所提出的估计方法(PEP),分别在现有的集成系统BOLE、OABM1和OzaBag的基础上,创建了类boost在线学习集成与参数估计(BOLE- pe)、基于adaboost的在线学习集成与参数估计(OABM1- pe)和Oza和Russell的在线Bagging与参数估计(OzaBag- pe)。为了验证它们,使用Hoeffding Tree (HT)作为基础分类器,在人工和现实世界的数据集上进行了实验。使用Friedman检验和Nemenyi事后检验的变体对准确性结果进行统计评估。实验结果表明,对分集参数(λ)的动态估计在大多数情况下都取得了良好的效果,即改进的方法在人工和真实数据集的实验中都提高了精度。
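The diversity mechanism being tuned here is the Poisson(λ) instance weighting used by OzaBag-style online bagging. The sketch below illustrates the idea; the accuracy-driven λ rule is a hypothetical stand-in for PEP, and the names (`OnlineBaggingSketch`, `MajorityClass`) and the 0.6 threshold are illustrative assumptions, not taken from the article:

```python
import math
import random
from collections import deque


def poisson(lam, rng):
    # Knuth's method for sampling Poisson(lam)
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1


class MajorityClass:
    """Trivial base learner: predicts the most frequent label seen so far."""
    def __init__(self):
        self.counts = {}

    def fit_one(self, x, y):
        self.counts[y] = self.counts.get(y, 0) + 1

    def predict(self, x):
        return max(self.counts, key=self.counts.get) if self.counts else 0


class OnlineBaggingSketch:
    """OzaBag-style ensemble whose diversity parameter lambda is adjusted online.

    PEP itself is not specified here; lowering lambda when windowed accuracy
    drops is a crude heuristic standing in for the estimation procedure.
    """
    def __init__(self, make_learner, n_models=5, lam=1.0, window=50):
        self.models = [make_learner() for _ in range(n_models)]
        self.lam = lam
        self.recent = deque(maxlen=window)

    def predict(self, x):
        votes = [m.predict(x) for m in self.models]
        return max(set(votes), key=votes.count)  # majority vote

    def partial_fit(self, x, y, rng):
        self.recent.append(int(self.predict(x) == y))
        if len(self.recent) == self.recent.maxlen:
            acc = sum(self.recent) / len(self.recent)
            self.lam = 0.5 if acc < 0.6 else 1.0  # stand-in for PEP
        for m in self.models:
            for _ in range(poisson(self.lam, rng)):  # Poisson(lambda) weighting
                m.fit_one(x, y)
```

Each arriving instance is replayed `Poisson(λ)` times per base learner, so changing λ directly changes how differently the ensemble members are trained.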
Citations: 2
High-Value Token-Blocking: Efficient Blocking Method for Record Linkage 高值令牌阻塞:记录链接的有效阻塞方法
Pub Date : 2021-07-21 DOI: 10.1145/3450527
K. O'Hare, Anna Jurek-Loughrey, Cassio P. de Campos
Data integration is an important component of Big Data analytics. One of the key challenges in data integration is record linkage, that is, matching records that represent the same real-world entity. Because of computational costs, methods referred to as blocking are employed as a part of the record linkage pipeline in order to reduce the number of comparisons among records. In the past decade, a range of blocking techniques have been proposed. Real-world applications require approaches that can handle heterogeneous data sources and do not rely on labelled data. We propose high-value token-blocking (HVTB), a simple and efficient approach for blocking that is unsupervised and schema-agnostic, based on a crafted use of Term Frequency-Inverse Document Frequency. We compare HVTB with multiple methods and over a range of datasets, including a novel unstructured dataset composed of titles and abstracts of scientific papers. We thoroughly discuss results in terms of accuracy, use of computational resources, and different characteristics of datasets and records. The simplicity of HVTB yields fast computations and does not harm its accuracy when compared with existing approaches. It is shown to be significantly superior to other methods, suggesting that simpler methods for blocking should be considered before resorting to more sophisticated methods.
数据集成是大数据分析的重要组成部分。数据集成中的关键挑战之一是记录链接,即匹配代表相同现实世界实体的记录。由于计算成本,称为阻塞的方法被用作记录链接管道的一部分,以减少记录之间的比较次数。在过去的十年中,已经提出了一系列的阻塞技术。现实世界的应用程序需要能够处理异构数据源并且不依赖于标记数据的方法。我们提出了高价值令牌阻塞(HVTB),这是一种简单有效的无监督和模式无关的阻塞方法,基于精心使用术语频率-逆文档频率。我们将HVTB与多种方法和一系列数据集进行比较,包括一个由科学论文标题和摘要组成的新型非结构化数据集。我们将从准确性、计算资源的使用以及数据集和记录的不同特征等方面全面讨论结果。与现有方法相比,HVTB的简单性使计算速度更快,而且不影响其准确性。它明显优于其他方法,这表明在采用更复杂的方法之前,应考虑更简单的方法进行阻塞。
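A minimal sketch of TF-IDF-driven token blocking in the spirit of HVTB follows; the exact scoring and selection scheme of the article is not reproduced here, so the `top_k` cutoff and function name are illustrative assumptions:

```python
import math
from collections import defaultdict


def high_value_token_blocks(records, top_k=2):
    """Sketch of TF-IDF-based token blocking (selection scheme assumed).

    records: dict record_id -> text. Each record keeps only its top_k tokens
    by TF-IDF; a block is the set of records sharing one such token.
    """
    tokenized = {rid: text.lower().split() for rid, text in records.items()}
    n = len(records)
    df = defaultdict(int)  # document frequency of each token
    for toks in tokenized.values():
        for t in set(toks):
            df[t] += 1
    blocks = defaultdict(set)
    for rid, toks in tokenized.items():
        tf = defaultdict(int)
        for t in toks:
            tf[t] += 1
        score = {t: (tf[t] / len(toks)) * math.log(n / df[t]) for t in tf}
        for t in sorted(score, key=score.get, reverse=True)[:top_k]:
            blocks[t].add(rid)
    # only blocks with at least two records generate comparisons
    return {t: rids for t, rids in blocks.items() if len(rids) > 1}
```

Because each record contributes only its highest-value tokens, common stop-word-like tokens (low IDF) rarely form blocks, which is what keeps the number of candidate comparisons small.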
Citations: 3
Constrained Dual-Level Bandit for Personalized Impression Regulation in Online Ranking Systems 在线排名系统中个性化印象调节的约束双级强盗
Pub Date : 2021-07-21 DOI: 10.1145/3461340
Zhao Li, Junshuai Song, Zehong Hu, Zhen Wang, Jun Gao
Impression regulation plays an important role in various online ranking systems; e.g., e-commerce ranking systems need to meet local commercial demands on some pre-labeled target items, such as fresh-item cultivation and fraudulent-item counteracting, while maximizing their global revenue. However, local impression regulation may cause “butterfly effects” on the global scale; e.g., in e-commerce, price-preference fluctuation in the initial conditions (overpriced or underpriced items) may create a significantly different outcome, affecting the shopping experience and bringing economic losses to platforms. To prevent “butterfly effects”, some researchers define their regulation objectives with global constraints, using a contextual bandit at the page level that requires all items on one page to share the same regulation action, which fails to conduct impression regulation on individual items. To address this problem, in this article, we propose a personalized impression regulation method that directly makes regulation decisions for each user-item pair. Specifically, we model the regulation problem as a Constrained Dual-level Bandit (CDB) problem, where the local regulation actions and reward signals are at the item level, while the global effect constraint on the platform impression can be calculated only at the page level. To handle the asynchronous signals, we first expand the page-level constraint to the item level and then derive the policy update as a second-order cone optimization problem. Our CDB approaches the optimal policy by iteratively solving the optimization problem. Experiments are performed on both offline and online datasets, and the results, theoretically and empirically, demonstrate that CDB outperforms state-of-the-art algorithms.
印象调节在各种在线排名系统中发挥着重要的作用,例如电子商务排名系统总是需要在实现其全球收益最大化的同时,对一些预先标记的目标物品(如生鲜物品培育、欺诈物品抵消)实现当地的商业需求。然而,局部印象调控可能会在全球范围内产生“蝴蝶效应”,例如在电子商务中,初始条件下(商品价格过高或过低)的价格偏好波动可能会产生明显不同的结果,从而影响购物体验,给平台带来经济损失。为了防止“蝴蝶效应”,一些研究者用全局约束来定义他们的调节目标,通过在页面级别使用上下文强盗,要求一个页面上的所有项目共享相同的调节动作,这不能对单个项目进行印象调节。为了解决这一问题,本文提出了一种个性化印象调节方法,可以直接对每个用户-物品对进行调节决策。具体来说,我们将监管问题建模为约束双级强盗(CDB)问题,其中局部监管行为和奖励信号在项目层面,而平台印象的全局效应约束只能在页面层面计算。为了处理异步信号,我们首先将页面级约束扩展到项目级,然后将策略更新导出为二阶锥优化问题。我们的CDB通过迭代求解优化问题来逼近最优策略。实验在离线和在线数据集上进行,结果在理论上和经验上都证明CDB优于最先进的算法。
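The dual-level structure (item-level reward estimates, page-level constraint) can be illustrated with a much-simplified stand-in; CDB solves a second-order cone program, whereas this sketch keeps running mean rewards per item and greedily boosts the best items up to a per-page cap (all names and the greedy rule are hypothetical simplifications):

```python
from collections import defaultdict


class ItemLevelRegulator:
    """Toy stand-in for CDB's dual-level structure (names assumed).

    Item level: a running mean of observed regulation reward per item.
    Page level: the constraint that at most max_boosted items per page
    receive the regulation action.
    """
    def __init__(self):
        self.n = defaultdict(int)
        self.mean = defaultdict(float)

    def update(self, item, reward):
        # incremental mean of the observed regulation reward for this item
        self.n[item] += 1
        self.mean[item] += (reward - self.mean[item]) / self.n[item]

    def regulate_page(self, page_items, max_boosted):
        # page-level constraint: at most max_boosted items get the action
        ranked = sorted(page_items, key=lambda i: self.mean[i], reverse=True)
        return set(ranked[:max_boosted])
```

The point of the sketch is the separation of concerns: rewards are attributed per item, while feasibility is enforced per page, mirroring the asynchronous item-level/page-level signals described above.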
Citations: 3
A Synopsis Based Approach for Itemset Frequency Estimation over Massive Multi-Transaction Stream 一种基于概要的海量多事务流项集频率估计方法
Pub Date : 2021-07-21 DOI: 10.1145/3465238
Guangtao Wang, G. Cong, Ying Zhang, Zhen Hai, Jieping Ye
Streams in which multiple transactions are associated with the same key are prevalent in practice; e.g., a customer has multiple shopping records arriving at different times. Itemset frequency estimation on such streams is very challenging, since sampling-based methods, such as the popularly used reservoir sampling, cannot be applied. In this article, we propose a novel k-Minimum Value (KMV) synopsis-based method to estimate the frequency of itemsets over multi-transaction streams. First, we extract the KMV synopsis for each item from the stream. Then, we propose a novel estimator to estimate the frequency of an itemset over the KMV synopses. Compared to the existing estimator, our method is not only more accurate and efficient to compute but also satisfies the downward-closure property. These properties enable the incorporation of our new estimator into existing frequent itemset mining (FIM) algorithms (e.g., FP-Growth) to mine frequent itemsets over multi-transaction streams. To demonstrate this, we implement a KMV synopsis-based FIM algorithm by integrating our estimator into existing FIM algorithms, and we prove that it is capable of guaranteeing the accuracy of FIM with a bounded KMV synopsis size. Experimental results on massive streams show that our estimator significantly improves accuracy, for both itemset frequency estimation and FIM, compared to existing estimators.
多个事务与同一个键相关联的流在实践中很普遍,例如,一个客户有多个在不同时间到达的购物记录。由于不能使用基于采样的方法,例如常用的储层采样,因此对此类流的项集频率估计非常具有挑战性。在本文中,我们提出了一种新的基于k-最小值(KMV)概要的方法来估计多事务流上项目集的频率。首先,我们从流中提取每个项目的KMV概要。然后,我们提出了一种新的估计器来估计项目集在KMV集上的频率。与现有的估计方法相比,该方法不仅计算精度高,效率高,而且遵循下闭包特性。这些属性使我们的新估计器与现有的频繁项集挖掘(FIM)算法(例如,FP-Growth)相结合,可以在多事务流上挖掘频繁项集。为了证明这一点,我们将我们的估计器集成到现有的FIM算法中,实现了一个基于KMV概要的FIM算法,并证明了它能够在KMV概要有界的情况下保证FIM的准确性。在海量流上的实验结果表明,与现有的估计器相比,我们的估计器在估计项目集频率和FIM方面都能显著提高精度。
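A minimal KMV sketch is easy to state: per item, keep the k smallest hash values of the keys it co-occurs with; the k-th smallest value yields a distinct-count estimate, and combining two synopses yields a co-occurrence estimate. The pairwise estimator below uses the textbook Jaccard-times-union identity, not necessarily the estimator derived in the article, and all names are illustrative:

```python
import hashlib


def _h(key):
    # deterministic hash of a key into (0, 1]
    d = hashlib.sha1(str(key).encode()).digest()
    return (int.from_bytes(d[:8], "big") + 1) / 2 ** 64


class KMV:
    """k-minimum-values synopsis over the distinct keys seen with one item."""
    def __init__(self, k=256):
        self.k = k
        self.vals = set()

    def add(self, key):
        self.vals.add(_h(key))
        if len(self.vals) > self.k:
            self.vals.discard(max(self.vals))  # keep only the k smallest

    def estimate(self):
        # distinct-key estimator: (k - 1) / (k-th smallest hash value)
        if len(self.vals) < self.k:
            return float(len(self.vals))  # exact while under capacity
        return (self.k - 1) / max(self.vals)


def pair_frequency(a, b):
    """Estimated number of distinct keys containing both items {a, b}.

    The k smallest values of a.vals | b.vals form a valid KMV synopsis of
    the union; scale the Jaccard estimate by the union-size estimate.
    """
    union = KMV(min(a.k, b.k))
    for h in a.vals | b.vals:
        union.vals.add(h)
        if len(union.vals) > union.k:
            union.vals.discard(max(union.vals))
    if not union.vals:
        return 0.0
    matched = sum(1 for h in union.vals if h in a.vals and h in b.vals)
    return matched / len(union.vals) * union.estimate()
```

Because duplicate transactions for the same key hash to the same value, the synopsis naturally counts each key once, which is exactly what reservoir sampling fails to do on multi-transaction streams.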
Citations: 1