2020 International Conference on Data Mining Workshops (ICDMW): Latest Publications

AttentionFM: Incorporating Attention Mechanism and Factorization Machine for Credit Scoring

2020 International Conference on Data Mining Workshops (ICDMW)

Pub Date : 2020-11-01 DOI: 10.1109/ICDMW51313.2020.00056

Ying Liu, Wei Wang, Tianlin Zhang, Zhenyu Cui

Learning effective feature interactions behind user behavior is challenging in credit scoring. Existing machine learning methods seem to have a strong bias towards either low-order or high-order interactions, or require expert feature engineering. In this paper, we present a novel neural network approach, AttentionFM, which incorporates Factorization Machines and an attention mechanism for credit scoring. The proposed model focuses more on critical features and emphasizes both low- and high-order feature interactions, with no need for manual feature engineering on the raw data representation. Experimental results on two public datasets demonstrate that our proposed model significantly outperforms the baselines.
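As background for the Factorization Machine component named in the abstract, the following is a minimal sketch of a standard second-order FM score. It is not the AttentionFM model itself: the attention layer and the high-order part are omitted, and the function names (`fm_pairwise_naive`, `fm_pairwise_fast`, `fm_predict`) and the toy inputs are assumptions made for illustration.

```python
from itertools import combinations


def fm_pairwise_naive(x, v):
    """Sum of <v_i, v_j> * x_i * x_j over all feature pairs i < j."""
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(v[i], v[j]))
        total += dot * x[i] * x[j]
    return total


def fm_pairwise_fast(x, v):
    """Same quantity via the linear-time identity:
    0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    k = len(v[0])  # size of each latent factor vector
    total = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        total += s * s - sq
    return 0.5 * total


def fm_predict(x, w0, w, v):
    """Bias + linear term + pairwise interaction term."""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    return w0 + linear + fm_pairwise_fast(x, v)


# Toy input (hypothetical): three features, two latent factors each.
x = [1.0, 2.0, 3.0]
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
score = fm_predict(x, w0=0.5, w=[1.0, 1.0, 1.0], v=v)
```

The naive and fast versions compute the same interaction term; the fast form is what keeps FM linear in the number of features.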

Citations: 0

Detecting Dynamic Critical Links within Large Scale Network for Traffic State Prediction

2020 International Conference on Data Mining Workshops (ICDMW)

Pub Date : 2020-11-01 DOI: 10.1109/ICDMW51313.2020.00119

Pierre-Antoine Laharotte, Romain Billot, Nour-Eddin El Faouzi

Can we expose the relationship between the physical dynamics of a network and its predictability? To contribute to this question, we propose a dimensionality reduction method for network state prediction based on spatiotemporal data. The method is intended for large-scale networks, where only a subset of critical links may be relevant for accurate multidimensional (MIMO) prediction. The algorithm is based on Latent Dirichlet Allocation (LDA) to highlight relevant topics in terms of network dynamics. The feature selection step relies on the assumption that the most representative links of the most dominant topics are critical links for short-term prediction. The method is applied to an original application field: short-term road traffic prediction on large-scale urban networks based on GPS data. Results highlight significant reductions in dimensionality and execution time, an overall improvement in prediction performance, and better resilience to non-recurrent traffic flow conditions.

Citations: 0

Classification of Dementia Associated Disorders Using EEG based Frequent Subgraph Technique

2020 International Conference on Data Mining Workshops (ICDMW)

Pub Date : 2020-11-01 DOI: 10.1109/ICDMW51313.2020.00087

A. T. Adebisi, V. Gonuguntla, Ho-Won Lee, K. Veluvolu

Dementia associated disorders such as vascular dementia, frontotemporal dementia and Alzheimer's dementia lead to cognitive impairment. Discrimination of dementia associated disorders has remained a challenging task, as they have overlapping underlying complex structures and display similar clinical features. In this work, we explore an EEG based frequent subgraph searching technique to characterize stages of brain functional networks of mild cognitive impairment (MCI), Alzheimer's disease (AD) and vascular dementia (VD) subjects in comparison with healthy control (HC) subjects. To identify the frequent subgraphs related to dementia, we first formulated the brain functional network based on the phase information of EEG, with mutual information as a measure. The whole network is then divided into sub-regions and a frequent subgraph search is performed. The identified frequent subgraphs were employed to discriminate the dementia associated disorders using data recorded from 10 healthy and 32 dementia subjects in various stages. Results show that the proposed method has the potential to quantify disease progression using brain functional connectivity, and that the identified networks can aid in the diagnosis of dementia associated disorders.

Citations: 1

StreamDL: Deep Learning Serving Platform for AMI Stream Forecasting

2020 International Conference on Data Mining Workshops (ICDMW)

Pub Date : 2020-11-01 DOI: 10.1109/ICDMW51313.2020.00104

Eunju Yang, Changha Lee, Ji-Hwan Kim, Tuan Manh Tao, Chan-Hyun Youn

Advanced Metering Infrastructures (AMIs) facilitate individual load forecasting. Individual load forecasting not only improves the accuracy of aggregated load forecasting but is also a fundamental component of various power applications. With the prominence of deep learning (DL) in individual load forecasting, a serving platform specialized in deep learning is required to forecast with AMI stream data. However, existing serving platforms for DL models do not consider stream data as an input; they usually support image or text data through a RESTful API. To solve this problem, we propose StreamDL, a serving framework that provides deep learning inference with AMI stream data. It leverages Apache Kafka to support stream data and Kubernetes to support the cloud environment. StreamDL addresses the specific requirements of stream data: it supports stream parsing to fit any DL model, especially recurrent networks, and continual training to alleviate the accuracy degradation caused by changes in the stream distribution. In this paper, we introduce the details of the StreamDL platform and its use cases with real AMI data.
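The abstract does not describe how stream parsing is implemented. As a rough illustration of the general idea of turning an unbounded stream of meter readings into fixed-length inputs for a recurrent model, here is a minimal sliding-window sketch. The function name `sliding_windows`, its parameters, and the sample readings are hypothetical, and no Kafka or Kubernetes integration is shown.

```python
from collections import deque


def sliding_windows(stream, window, horizon=1):
    """Yield (inputs, targets) pairs from an iterable of readings.

    inputs  : the last `window` readings
    targets : the following `horizon` readings to be forecast
    """
    buf = deque(maxlen=window + horizon)  # old readings fall off the left
    for reading in stream:
        buf.append(reading)
        if len(buf) == window + horizon:
            items = list(buf)
            yield items[:window], items[window:]


# Hypothetical meter readings arriving one at a time.
readings = [1, 2, 3, 4, 5]
pairs = list(sliding_windows(readings, window=3))
```

Because the generator holds only `window + horizon` readings at a time, it works on streams of any length.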

Citations: 1

An Improved Wide-Kernel CNN for Classifying Multivariate Signals in Fault Diagnosis

2020 International Conference on Data Mining Workshops (ICDMW)

Pub Date : 2020-11-01 DOI: 10.1109/ICDMW51313.2020.00046

J. V. D. Hoogen, Stefan Bloemheuvel, M. Atzmüller

Deep Learning (DL) provides considerable opportunities for increased efficiency and performance in fault diagnosis. The ability of DL methods for automatic feature extraction can reduce the need for time-intensive feature construction and prior knowledge on complex signal processing. In this paper, we propose two models that are built on the Wide-Kernel Deep Convolutional Neural Network (WDCNN) framework to improve performance of classifying fault conditions using multivariate time series data, also with respect to limited and/or noisy training data. In our experiments, we use the renowned benchmark dataset from the Case Western Reserve University (CWRU) bearing experiment [1] to assess our models' performance, and to investigate their usability towards large-scale applications by simulating noisy industrial environments. Here, the proposed models show an exceptionally good performance without any preprocessing or data augmentation and outperform traditional Machine Learning applications as well as state-of-the-art DL models considerably, even in such complex multi-class classification tasks. We show that both models are also able to adapt well to noisy input data, which makes them suitable for condition-based maintenance contexts. Furthermore, we investigate and demonstrate explainability and transparency of the models which is particularly important in large-scale industrial applications.

Citations: 9

TrustedChain: A Blockchain-based Data Sharing Scheme for Supply Chain

2020 International Conference on Data Mining Workshops (ICDMW)

Pub Date : 2020-11-01 DOI: 10.1109/ICDMW51313.2020.00128

Gejun Le, Qifeng Gu, Qingshan Jiang, Weiyi Lin

A supply chain involves mutually independent and mutually distrustful stakeholders and large amounts of sensitive order data. Sharing data among stakeholders is essential because it improves the efficiency of various workflows among them. This paper proposes TrustedChain, a blockchain-based data sharing scheme for supply chains, which has two advantages: (a) trusted: we present a Trusted Environment (TE) based on blockchain that allows mutually distrustful stakeholders to manage data collaboratively; (b) secure: we provide a secure design that first stores order forms in a Distributed Database (DDB) and then records the URI in the Contract Account (CA) of the TE. In addition, Supply-Business Contract Management (SCM) manages all CAs, and Node Communication (NC) allows communication over the network. The security analysis and evaluation demonstrate the effectiveness of TrustedChain.

Citations: 2

Design of Neural Network-based Boost Charging for Reducing the Charging Time of Li-ion Battery

2020 International Conference on Data Mining Workshops (ICDMW)

Pub Date : 2020-11-01 DOI: 10.1109/ICDMW51313.2020.00109

Sue Hyang Lim, S. Kim, Hyeong Min Lee, Sijun Kim, Y. Shin

Rapid charging of Li-ion batteries is vital for the commercialization of electric propulsion systems. However, during fast charging, the reduction in battery capacity and the increase in temperature must be considered in real time. Most Li-ion battery chargers follow the charging profile of an open-loop system, which is determined from prior knowledge. Such a system does not reflect the temperature change of the battery or its degree of aging. Therefore, in this study, we propose a neural network-based charging profile model that applies a closed-loop system to reflect the various states of batteries; we also show two battery-state characteristics in addition to temperature. Consequently, we show battery characteristics other than those shown in the past, such as the battery voltage and temperature trends. In addition to the design of the charging current, an improvement of approximately 22–50% based on the mean absolute error (MAE) is achieved. When the various characteristics are considered, the long short-term memory (LSTM) network performs better than the feed-forward neural network, improving performance by 35% based on MAE.
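For readers unfamiliar with the metric, the sketch below shows how MAE and a relative MAE improvement are commonly computed. The abstract does not state how its percentages were derived, so the formula in `relative_improvement` is an assumption, and the numbers are made up for illustration rather than taken from the paper.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_improvement(baseline_mae, model_mae):
    """Fraction by which the model reduces the baseline MAE (assumed definition)."""
    return (baseline_mae - model_mae) / baseline_mae


# Made-up values, not results from the paper.
mae = mean_absolute_error([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])
gain = relative_improvement(baseline_mae=2.0, model_mae=1.3)
```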

Citations: 1

Efficient Mining of Non-Dominated High Quantity-Utility Patterns

2020 International Conference on Data Mining Workshops (ICDMW)

Pub Date : 2020-11-01 DOI: 10.1109/ICDMW51313.2020.00097

J. Wu, Qian Teng, Gautam Srivastava, Matin Pirouz, Chun-Wei Lin

In this paper, we propose a new pattern called the skyline quantity-utility pattern (SQUP), which provides better estimations in the decision-making process by considering quantity and utility together. Two algorithms, SQUM-1 and SQUM-2, are presented to efficiently mine the set of SQUPs. Two new efficient utility-max structures, one used in each algorithm, are also introduced to reduce the number of candidate itemsets. Our in-depth experimental results show that the proposed algorithms achieve good performance in terms of runtime and memory usage.
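To make the notion of a non-dominated (skyline) pattern concrete, here is a minimal sketch that filters candidates by the usual dominance rule over (quantity, utility) pairs. It uses a quadratic brute-force comparison and is not the SQUM-1 or SQUM-2 algorithm. The function name `skyline` and the pattern values are hypothetical.

```python
def skyline(patterns):
    """Return the patterns that no other pattern dominates.

    patterns: dict mapping a pattern name to a (quantity, utility) tuple.
    A pattern a dominates b when a is at least as good on both measures
    and differs from b, which implies it is strictly better on one.
    """
    def dominates(a, b):
        return a[0] >= b[0] and a[1] >= b[1] and a != b

    return {
        name: qu
        for name, qu in patterns.items()
        if not any(dominates(other, qu) for other in patterns.values())
    }


# Hypothetical candidates as (quantity, utility) pairs.
candidates = {"A": (5, 10), "B": (3, 12), "C": (2, 8), "D": (5, 9)}
result = skyline(candidates)
```

Here A and B survive because each is best on one measure, while C and D are both dominated by A.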

Citations: 0

Attentive-Feature Transfer based on Mapping for Cross-domain Recommendation

2020 International Conference on Data Mining Workshops (ICDMW)

Pub Date : 2020-11-01 DOI: 10.1109/ICDMW51313.2020.00030

Zhen Liu, J. Tian, Lingxi Zhao, Yanling Zhang

Recommendation systems have been widely developed for numerous applications. Existing systems may still suffer from negative transfer or cold starts. These drawbacks are essentially due to overlooking domain-specific users' personal preferences or cross-domain user-item interactions. To address these problems, we propose a cross-domain recommendation algorithm built on a mapping-based attentive feature transfer (MAFT) model. Our MAFT model utilizes matrix factorization and an attention mechanism for fine-grained modeling of user preferences. Then, overlapping cross-domain user features are combined through feature fusion. Moreover, a multilayer perceptron (MLP) is built to map the obtained user features to target-domain user features. Finally, the user-item ratings can be predicted in the target domain. We carried out experiments on the large-scale MovieLens dataset as well as the real Douban Book and Douban Movie datasets. The results show that the precision of the MAFT-based method is clearly higher than those of other cross-domain recommendation methods, especially for cold-start users with few item interactions.

Citations: 1

Restructuring of Hoeffding Trees for Trapezoidal Data Streams

2020 International Conference on Data Mining Workshops (ICDMW)

Pub Date : 2020-11-01 DOI: 10.1109/ICDMW51313.2020.00064

Christian Schreckenberger, Tim Glockner, H. Stuckenschmidt, Christian Bartelt

Trapezoidal Data Streams are an emerging topic, where not only the data volume increases, but also the data dimension, i.e. new features emerge. In this paper, we address the challenges that arise from this problem by providing a novel approach to restructure and prune Hoeffding trees. We evaluate our approach on synthetic datasets, where we can show that the approach significantly improves the performance compared to the baseline of an adjusted Hoeffding tree algorithm without restructuring and pruning.
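As background on the Hoeffding trees mentioned in the abstract, the sketch below shows the standard Hoeffding bound and the split test that classical Hoeffding tree learners apply at a leaf. It does not cover the restructuring or pruning approach proposed in the paper. The function names and example values are assumptions for illustration.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean
    observed over n independent samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split when the best attribute beats the runner-up by more than epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Example values (hypothetical): gains measured after n examples at a leaf.
eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=1000)
split_late = should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=1000)
split_early = should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=100)
```

The same gain difference justifies a split after 1000 examples but not after 100, because the bound shrinks as more examples arrive.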

Citations: 4
