Providing decision support for questions such as "Which Patient to Treat Next?" requires a combination of stream-based reasoning and probabilistic reasoning. The former arises from a multitude of sensors constantly collecting data (data streams). The latter stems from the underlying decision-making problem, which is based on a probabilistic model of the scenario at hand. The STARQL engine handles temporal data streams efficiently, and the lifted dynamic junction tree algorithm handles temporal probabilistic relational data efficiently. In this paper, we leverage the two approaches and propose probabilistic stream-based reasoning. Additionally, we demonstrate that our proposed solution runs in linear time w.r.t. the maximum number of time steps, allowing for real-time decision support and monitoring.
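The linear-time claim can be illustrated with a minimal recursive filtering loop over a temporal probabilistic model: each incoming observation triggers one constant-cost belief update, so the total cost is linear in the number of time steps. The two-state model and all parameters below are invented for illustration and are not taken from the paper.

```python
# Toy forward filtering over a stream: one constant-cost belief update
# per time step, hence linear time in the number of time steps.

def filter_stream(observations, trans, emit, prior):
    """Recursively update the belief state for each incoming observation."""
    belief = dict(prior)
    for obs in observations:          # one update per time step
        new_belief = {}
        for s in trans:
            # predict: propagate yesterday's belief through the transition model
            pred = sum(belief[s0] * trans[s0][s] for s0 in trans)
            # correct: weight by the likelihood of the observation
            new_belief[s] = pred * emit[s][obs]
        z = sum(new_belief.values())  # normalize to a probability distribution
        belief = {s: p / z for s, p in new_belief.items()}
    return belief

# Illustrative patient-monitoring model (hypothetical numbers)
trans = {"stable": {"stable": 0.9, "critical": 0.1},
         "critical": {"stable": 0.2, "critical": 0.8}}
emit = {"stable": {"normal": 0.8, "alarm": 0.2},
        "critical": {"normal": 0.3, "alarm": 0.7}}
prior = {"stable": 0.5, "critical": 0.5}

belief = filter_stream(["normal", "alarm", "alarm"], trans, emit, prior)
```

After two consecutive alarms, the belief mass shifts toward the critical state, which is the kind of quantity a "which patient next?" decision rule would rank on.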
{"title":"Which Patient to Treat Next? Probabilistic Stream-Based Reasoning for Decision Support and Monitoring","authors":"M. Gehrke, Simon Schiff, Tanya Braun, R. Möller","doi":"10.1109/ICBK.2019.00018","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00018","url":null,"abstract":"Providing decision support for questions such as \"Which Patient to Treat Next?\" requires a combination of stream-based reasoning and probabilistic reasoning. The former arises due to a multitude of sensors constantly collecting data (data streams). The latter stems from the underlying decision making problem based on a probabilistic model of the scenario at hand. The STARQL engine handles temporal data streams efficiently and the lifted dynamic junction tree algorithm handles temporal probabilistic relational data efficiently. In this paper, we leverage the two approaches and propose probabilistic stream-based reasoning. Additionally, we demonstrate that our proposed solution runs in linear time w.r.t. the maximum number of time steps to allow for real-time decision support and monitoring.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"172 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116155951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fangshu Chen, Pengfei Zhang, Huaizhong Lin, S. Tang
In this paper, we study continuous path-based range keyword queries, which maintain the answer set continuously as the query point q moves along a given path P on a road network. Such queries have many real-world applications but pose challenges, as issuing the query at every point on P is expensive and infeasible. To answer the query, we transform it into the problem of identifying a set of event points. Specifically, an event point captures a query point where the answer set changes, and query points between two adjacent event points share the same answer set. To identify event points efficiently, we develop a backbone network index (BNI) over a simplified network topology, which supports efficient distance computations and offers insights for keyword tests. Moreover, we develop a two-phase progressive (TPP) query processing framework over the BNI. The first phase performs range keyword queries to obtain answer sets for a fraction of the vertices on P; notably, this requires issuing the query only once. In the second phase, event points are identified from these retrieved answer sets. Extensive experiments on both real and synthetic datasets show that our algorithm outperforms its competitors by several orders of magnitude.
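The event-point idea can be sketched in a few lines: once answer sets are known at consecutive sample points along the path, an event point is simply any point whose answer set differs from its predecessor's, and the answer set is constant between adjacent event points. The point and answer names below are illustrative, not the paper's API.

```python
# Identify event points from answer sets sampled along a path.

def event_points(path_points, answer_sets):
    """Return the points where the answer set changes w.r.t. the previous point."""
    events = []
    for i in range(1, len(path_points)):
        if answer_sets[i] != answer_sets[i - 1]:
            events.append(path_points[i])
    return events

# Hypothetical path vertices and their range-keyword answer sets
points = ["v1", "v2", "v3", "v4", "v5"]
answers = [{"cafeA"}, {"cafeA"}, {"cafeA", "cafeB"}, {"cafeA", "cafeB"}, {"cafeB"}]
evts = event_points(points, answers)
```

Here the answer set changes at v3 and v5, so a moving query point only needs its result recomputed at those two event points.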
{"title":"Continuous Path-Based Range Keyword Queries on Road Networks","authors":"Fangshu Chen, Pengfei Zhang, Huaizhong Lin, S. Tang","doi":"10.1109/ICBK.2019.00014","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00014","url":null,"abstract":"In this paper, we study the continuous path-based range keyword queries, which find the answer set continuously when the query point q moves along a given path P on the road network. This type of queries have many real applications, whereas leading to challenges as issuing the query at each point on P is expensive and infeasible. To answer the query, we transform it to the issue of identifying a set of event points. Specifically, the event point captures the query point where the answer set changes, and query points between two adjacent event points share the same answer set. To identify event points efficiently, we develop a backbone network index (BNI) over a simplified network topology, which supports efficient distance computations and offers insights for keyword tests. Moreover, we develop a two-phase progressive (TPP) query processing framework over BNI. The first phase performs range keyword queries to get answer sets for a fraction of vertices on P . Note that this can be achieved by only issuing the query once. In the second phase, event points are identified with these retrieved answer sets. 
Extensive experiments on both real and synthetic datasets show that our algorithm outperforms competitor by several orders of magnitude.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"170 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122328860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anomalous login detection is a critical step toward building a secure and trustworthy system. When a new user appears in the login record, traditional methods determine that an anomalous login has occurred. In fact, however, the first-time login subject may be a new employee rather than an attacker. In this paper, we propose an asynchronous anomalous login detection model of "off-line learning + on-line detection" to solve the real-time detection problem. In addition, based on an analysis of multi-source logs, we extract users' operating features to distinguish malicious users from legitimate users who log on to a host for the first time. Extensive experimental evaluations over large log data show that our algorithm not only catches the first abnormal account effectively but also reduces the running time by tens of times compared with K-means and other clustering algorithms.
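The "off-line learning + on-line detection" split can be sketched as follows: an off-line phase builds per-user profiles from historical log features, and a cheap on-line check flags a login whose feature vector is far from every known profile. The features, distance measure, and threshold are invented for illustration and are not the paper's construction.

```python
# Off-line: learn a mean feature profile per user from historical logs.
def offline_learn(history):
    """history: list of (user, feature_vector) pairs -> mean profile per user."""
    sums, counts = {}, {}
    for user, vec in history:
        acc = sums.setdefault(user, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[user] = counts.get(user, 0) + 1
    return {u: [x / counts[u] for x in acc] for u, acc in sums.items()}

# On-line: a login is suspicious if it is far from every learned profile.
def online_detect(profiles, vec, threshold=2.0):
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return all(dist(vec, p) > threshold for p in profiles.values())

# Hypothetical two-dimensional operating features
history = [("alice", [1.0, 0.0]), ("alice", [1.2, 0.2]), ("bob", [5.0, 5.0])]
profiles = offline_learn(history)
```

A first-time login resembling an existing profile (e.g. a new employee behaving like peers) passes, while one far from all profiles is flagged, which is the distinction the paper draws.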
{"title":"An Abnormal Login Detection Method Based on Multi-source Log Fusion Analysis","authors":"Jing Tao, Waner Wang, Ning Zheng, Ting Han, Yue Chang, Xuna Zhan","doi":"10.1109/ICBK.2019.00038","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00038","url":null,"abstract":"Anomaly login detection is a critical step towards building a secure and trustworthy system. When a new user appears in the login record, the traditional method determines that an anomaly behavior of login has occurred. However, in fact, the first login subject may be a new employee other than the attacker. In this paper, we propose an asynchronous anomaly login detection algorithm model of \"Off-line Learning + On-line detection\" to solve the real-time anomaly login detection problem. In addition, based on the analysis of multi-source logs, we extract the operating features of users to solve the problem of how to distinguish malicious users from legitimate users who log on to the host for the first time. Extensive experimental evaluations over large log data have shown that our algorithm can not only catch the first abnormal account effectively but also reduce the running time by tens of times compared with K-means and other cluster algorithms","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124044145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Welcome to IEEE ICBK 2019, the 10th IEEE International Conference on Big Knowledge, and to Beijing, China! Beijing is China's capital and has a history of three millennia. It offers an amazing combination of modern and traditional architecture, including the Great Wall, the Forbidden City (the largest palace in the world), Tiananmen Square (the second largest city square in the world) and the Summer Palace. Big Knowledge seeks to systematically combine fragmented knowledge from heterogeneous, autonomous information sources for complex and evolving relationships, in addition to consolidating domain expertise. The IEEE International Conference on Big Knowledge (ICBK) is a premier international forum for the presentation of original research results on Big Knowledge opportunities and challenges, as well as for the exchange and dissemination of innovative, practical development experiences. The conference covers all aspects of Big Knowledge, including algorithms, software, systems, and applications. ICBK attracts researchers and application developers from a wide range of areas related to Big Knowledge, such as statistics, machine learning, pattern recognition, knowledge visualization, expert systems, high-performance computing, the World Wide Web, and big data analytics. By promoting novel, high-quality research findings and innovative solutions to challenging Big Knowledge problems, the conference continuously advances the state of the art in Big Knowledge. ICBK 2019 is the third international edition, and the 10th edition in the conference's history. The first eight editions (held annually from 2010 through 2017) were all organized in Hefei, China, while the latest edition was held in Singapore last year.
ICBK 2019 is co-located with the 19th IEEE International Conference on Data Mining (ICDM 2019), and we share our prominent keynote speakers: Ronald Fagin (IBM Research – Almaden, Fellow of the US National Academy of Engineering) and Joseph Halpern (Cornell University, Fellow of the National Academy of Engineering). We also share the 2019 ICDM/ICBK Knowledge Graph Contest and the 2019 ICDM/ICBK Panel on Marketing Intelligence – Let Marketing Drive Efficiency and Innovation. The organization of a successful conference would not be possible without the dedicated efforts of many individuals. We would like to express our gratitude to all functional chairs on our organizing committee, listed on a separate page of these proceedings. We owe special thanks to our conference sponsors: the financial and organizational support of the National Key Research and Development Program of China under grant 2016YFB1000900 and its 15 participating institutions, Mininglamp Technology, and the IEEE Computer Society. We are especially grateful to all local institutions that have supported the conference, in particular Hefei University of Technology. Last but not least, we would like to thank all authors who submitted research papers to the conference, and all participants. We are encouraged by your scie
{"title":"Welcome Message from Conference Organizers","authors":"Xindong Wu","doi":"10.1109/icbk.2017.7","DOIUrl":"https://doi.org/10.1109/icbk.2017.7","url":null,"abstract":"Welcome to IEEE ICBK 2019, the 10 th IEEE International Conference on Big Knowledge, and to Beijing, China! Beijing is China’s capital, and has a history of 3 millennia. It has an amazing combination of modern and traditional architecture, including the Great Wall, the Forbidden City (largest palace in the world), Tiananmen Square (the second largest city square in the world) and the Summer Palace. Big Knowledge seeks to systematically combine fragmented knowledge from heterogeneous, autonomous information sources for complex and evolving relationships, in addition to consolidating domain expertise. The IEEE International Conference on Big Knowledge (ICBK) is a premier international forum for the presentation of original research results in Big Knowledge opportunities and challenges, as well as for exchange and dissemination of innovative, practical development experiences. The conference covers all aspects of Big Knowledge, including algorithms, software, systems, and applications. ICBK attracts researchers and application developers from a wide range of areas related to Big Knowledge such as statistics, machine learning, pattern recognition, knowledge visualization, expert systems, high performance computing, World Wide Web, and big data analytics. By promoting novel, high quality research findings and innovative solutions to challenging Big Knowledge problems, the conference continuously advances the state-of-the-art in Big Knowledge. ICBK 2019 is the third International Edition, and the 10 th Edition in its conference history. The first 8 editions (annually from 2010 through 2017) were all organized in Hefei, China, while the latest edition was held in Singapore last year. 
ICBK 2019 is co-located with the 19 IEEE International Conference on Data Mining (ICDM 2019), and we share our prominent keynote speakers: Ronald Fagin (IBM Research – Almaden Fellow of the US National Academy of Engineering) and Joseph Halpern (Cornell University, Fellow of the National Academy of Engineering). We also share the 2019 ICDM/ICBK Knowledge Graph Contest and the 2019 ICDM/ICBK Panel on Marketing Intelligence – Let Marketing Drive Efficiency and Innovation. The organization of a successful conference would not be possible without the dedicated efforts of many individuals. We would like to express our gratitude to all functional chairs on our organizing committee listed on a separate page of this proceedings. We owe special thanks to our conference sponsors: the financial and organizational support of the National Key Research and Development Program of China under grant 2016YFB1000900 and its 15 participating institutions, Mininglamp Technology and the IEEE Computer Society. We are especially grateful to all local institutions that have supported the conference, in particular Hefei University of Technology. Last but not least, we would like to thank all authors who submitted research papers to the conference, and all participants. We are encouraged by your scie","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126398289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Selecting an appropriate Autoregressive Moving Average (ARMA) model for a given time series is a classic problem in statistics that is encountered in many applications. Typically this involves a human in the loop and repeated parameter evaluation of candidate models, which is not ideal for learning at scale. We propose a Long Short-Term Memory (LSTM) classification model for automatic ARMA model selection. Our numerical experiments show that the proposed method is fast and provides better accuracy than the traditional Box-Jenkins approach based on autocorrelations and model selection criteria. We demonstrate the application of our approach with a case study on volatility prediction of daily stock prices.
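For context, the criterion-based baseline the paper improves on looks like the loop below: fit each candidate order, score it with an information criterion, and keep the minimizer. This sketch restricts to pure AR(p) candidates fitted by least squares with AIC, an assumption made here for brevity; the paper's LSTM classifier replaces this whole fit-and-compare loop with a single forward pass.

```python
# Traditional order selection: fit AR(p) for each candidate p, pick min AIC.
import numpy as np

def fit_ar(series, p):
    """Least-squares AR(p) fit; returns the residual variance."""
    y = series[p:]
    # column k holds the series lagged by k steps
    X = np.column_stack([series[p - k:len(series) - k] for k in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(np.var(resid))

def select_ar_order(series, max_p=4):
    n = len(series)
    best_p, best_aic = None, float("inf")
    for p in range(1, max_p + 1):
        sigma2 = fit_ar(series, p)
        aic = n * np.log(sigma2) + 2 * p   # AIC up to an additive constant
        if aic < best_aic:
            best_p, best_aic = p, aic
    return best_p

# Simulate an AR(2) process and recover a plausible order
rng = np.random.default_rng(0)
x = np.zeros(500)
for t in range(2, 500):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + rng.normal()
p_hat = select_ar_order(x)
```

Even this simplified loop refits a model per candidate order, which is exactly the repeated evaluation the abstract calls out as unsuited to learning at scale.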
{"title":"Recurrent Neural Networks for Autoregressive Moving Average Model Selection","authors":"Bei Chen, Beat Buesser, Kelsey L. DiPietro","doi":"10.1109/ICBK.2019.00013","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00013","url":null,"abstract":"Selecting an appropriate Autoregressive Moving Average (ARMA) model for a given time series is a classic problem in statistics that is encountered in many applications. Typically this involves a human-in-the-loop and repeated parameter evaluation of candidate models, which is not ideal for learning at scale. We propose a Long Short Term Memory (LSTM) classification model for automatic ARMA model selection. Our numerical experiments show that the proposed method is fast and provides better accuracy than the traditional Box-Jenkins approach based on autocorrelations and model selection criterion. We demonstrate the application of our approach with a case study on volatility prediction of daily stock prices.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133121155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Open-domain automatic question answering models have been widely studied in recent years. RNN-based models are the most commonly used in automatic question answering systems. In contrast, we choose a CNN-based model to construct our question answering models and use an attention mechanism to enhance performance. We test our models on the Microsoft open-domain automatic question answering dataset. Experiments show that, compared with models without the attention mechanism, our models achieve the best results. Experiments also show that adding an RNN network to our model can further improve performance.
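The attention mechanism the models rely on can be sketched generically: score each document token against the question representation, softmax the scores into weights, and form a weighted sum as the context vector. The toy vectors below stand in for CNN outputs; they are illustrative values, not the paper's architecture.

```python
# Generic dot-product attention over document token vectors.
import math

def attend(question, doc_tokens):
    """Return attention weights over doc_tokens and the attended context vector."""
    scores = [sum(q * t for q, t in zip(question, tok)) for tok in doc_tokens]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    d = len(question)
    context = [sum(w * tok[i] for w, tok in zip(weights, doc_tokens))
               for i in range(d)]
    return weights, context

# A token aligned with the question receives the larger weight
w, ctx = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```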
{"title":"Open-Domain Document-Based Automatic QA Models Based on CNN and Attention Mechanism","authors":"Guangjie Zhang, Xumin Fan, Canghong Jin, Ming-hui Wu","doi":"10.1109/ICBK.2019.00051","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00051","url":null,"abstract":"The open domain automatic question answering models have been widely studied in recent years. When dealing with automatic question answering systems, the RNN-based models are the most commonly used models. However we choose the CNN-based model to construct our question answering models, and use the attention mechanism to enhance the performance. We test our models on Microsoft open domain automatic question answering dataset. Experiments show that compared with the models without attention mechanism, our models get the best results. Experiments also show that adding the RNN network in our model can further improve the performance.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132219430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a differentiable nonparametric algorithm, the Delaunay triangulation learner (DTL), to solve the functional approximation problem on the basis of a p-dimensional feature space. By conducting the Delaunay triangulation algorithm on the data points, the DTL partitions the feature space into a series of p-dimensional simplices in a geometrically optimal way, and fits a linear model within each simplex. We study its theoretical properties by exploring the geometric properties of the Delaunay triangulation, and compare its performance with other statistical learners in numerical studies.
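The partition-then-fit-locally idea is easiest to see in one dimension, where the Delaunay simplices are just the intervals between sorted data points and the per-simplex linear model reduces to piecewise-linear interpolation. This toy sketch only illustrates that special case, not the general p-dimensional DTL.

```python
# 1-D Delaunay triangulation learner: intervals as simplices,
# one linear model (an interpolating segment) per simplex.

def dtl_1d(xs, ys):
    pts = sorted(zip(xs, ys))
    def predict(x):
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:             # x falls inside this simplex
                t = (x - x0) / (x1 - x0)
                return (1 - t) * y0 + t * y1
        raise ValueError("x outside the convex hull of the data")
    return predict

f = dtl_1d([0.0, 1.0, 2.0], [0.0, 1.0, 4.0])
```

In higher dimensions the intervals become p-simplices from the Delaunay triangulation (e.g. as computed by `scipy.spatial.Delaunay`), and each simplex gets its own affine fit, but the prediction logic is the same: locate the containing simplex, then evaluate its local linear model.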
{"title":"Nonparametric Functional Approximation with Delaunay Triangulation Learner","authors":"Yehong Liu, G. Yin","doi":"10.1109/ICBK.2019.00030","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00030","url":null,"abstract":"We propose a differentiable nonparametric algorithm, the Delaunay triangulation learner (DTL), to solve the functional approximation problem on the basis of a p-dimensional feature space. By conducting the Delaunay triangulation algorithm on the data points, the DTL partitions the feature space into a series of p-dimensional simplices in a geometrically optimal way, and fits a linear model within each simplex. We study its theoretical properties by exploring the geometric properties of the Delaunay triangulation, and compare its performance with other statistical learners in numerical studies.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132717678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Feature representation learning is a research focus in domain adaptation. Recently, owing to its fast training speed, the marginalized Denoising Autoencoder (mDA), a standard deep learning model, has been widely utilized for feature representation learning. However, the training of mDA lacks nonlinear relationships and does not explicitly consider the distribution discrepancy between domains. To address these problems, this paper proposes a novel method for feature representation learning, namely Nonlinear cross-domain Feature learning based on Dual Constraints (NFDC), which consists of kernelization and dual constraints. First, we introduce kernelization to effectively extract nonlinear relationships in feature representation learning. Second, we design dual constraints, Maximum Mean Discrepancy (MMD) and Manifold Regularization (MR), in order to minimize the distribution discrepancy during training. Experimental results show that our approach is superior to several state-of-the-art methods on domain adaptation tasks.
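As a concrete anchor for the MMD constraint: with a linear kernel, the biased empirical MMD reduces to the squared distance between the mean feature vectors of the source and target samples, and minimizing it pulls the two domains' feature distributions together. The kernel choice and the toy data are assumptions for illustration, not the paper's exact setup.

```python
# Biased empirical MMD with a linear kernel: squared distance
# between the source and target sample means.

def mmd_linear(source, target):
    d = len(source[0])
    mean_s = [sum(x[i] for x in source) / len(source) for i in range(d)]
    mean_t = [sum(x[i] for x in target) / len(target) for i in range(d)]
    return sum((a - b) ** 2 for a, b in zip(mean_s, mean_t))

src = [[0.0, 0.0], [1.0, 1.0]]   # hypothetical source-domain features
tgt = [[2.0, 2.0], [3.0, 3.0]]   # hypothetical target-domain features
gap = mmd_linear(src, tgt)       # means (0.5, 0.5) vs (2.5, 2.5)
```

A training objective would add this quantity (computed on the learned representations, typically with a richer kernel) as a penalty term, so that reducing the loss also reduces the domain gap.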
{"title":"Nonlinear Cross-Domain Feature Representation Learning Method Based on Dual Constraints","authors":"Han Ding, Yuhong Zhang, Shuai Yang, Yaojin Lin","doi":"10.1109/ICBK.2019.00017","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00017","url":null,"abstract":"Feature representation learning is a research focus in domain adaptation. Recently, due to the fast training speed, the marginalized Denoising Autoencoder (mDA) as a standing deep learning model has been widely utilized for feature representation learning. However, the training of mDA suffers from the lack of nonlinear relationship and does not explicitly consider the distribution discrepancy between domains. To address these problems, this paper proposes a novel method for feature representation learning, namely Nonlinear cross-domain Feature learning based Dual Constraints (NFDC), which consists of kernelization and dual constraints. Firstly, we introduce kernelization to effectively extract nonlinear relationship in feature representation learning. Secondly, we design dual constraints including Maximum Mean Discrepancy (MMD) and Manifold Regularization (MR) in order to minimize distribution discrepancy during the training process. Experimental results show that our approach is superior to several state-of-the-art methods in domain adaptation tasks.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116234699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The main purpose of co-location pattern mining is to find sets of spatial features whose instances are frequently located together in space. Because previous methods choose a single distance threshold when generating neighborhood relationships, some interesting spatial co-location patterns cannot be extracted. In addition, previous methods do not take the degree of neighborhood into consideration and rely on the participation index (PI) to measure the prevalence of co-locations, which makes them very sensitive to the PI threshold and also leads to missed co-location patterns. To overcome these limitations of traditional co-location pattern mining, and considering that the neighbor relationship is a fuzzy concept, this paper introduces fuzzy theory into co-location pattern mining and proposes a new fuzzy spatial neighborhood relationship measure between instances and a reasonable proximity measure between spatial features. We then propose a novel algorithm, FCB, based on the fuzzy C-medoids clustering algorithm. Extensive experiments on synthetic and real-world datasets demonstrate the practicability and efficiency of the proposed mining algorithm, and show that it has low sensitivity to thresholds and high robustness.
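The contrast with a crisp threshold can be made concrete: instead of a hard cut-off at a single distance, a fuzzy neighbor relationship assigns a membership degree in [0, 1] that decays between an inner and an outer radius, so borderline instance pairs contribute partially rather than being dropped. The linear decay and the two radii below are illustrative assumptions, not the paper's exact measure.

```python
# Fuzzy neighbor relationship: full membership inside `inner`,
# none beyond `outer`, linear decay in between.

def fuzzy_neighbor(dist, inner=1.0, outer=3.0):
    """Degree in [0, 1] to which two instances count as neighbors."""
    if dist <= inner:
        return 1.0
    if dist >= outer:
        return 0.0
    return (outer - dist) / (outer - inner)
```

A crisp method with threshold 2.0 would treat distances 1.9 and 2.1 as categorically different; here they get the nearby degrees 0.55 and 0.45, which is why the fuzzy formulation is less sensitive to the threshold choice.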
{"title":"Mining Spatial Co-location Patterns by the Fuzzy Technology","authors":"Le Lei, Lizhen Wang, Xiaoxuan Wang","doi":"10.1109/ICBK.2019.00025","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00025","url":null,"abstract":"The main purpose of co-location pattern mining is to mine the set of spatial features whose instances are frequently located together in space. Because a single distance threshold is chosen in the previous methods when generating the neighbourhood relationships, some interesting spatial colocation patterns can't be extracted. In addition, previous methods don't take the neighborhood degree into consideration and they depend upon the PI (participation index) to measure the prevalence of the co-locations, which these methods are very sensitive to PI and also lead to the absence of co-location patterns. In order to overcome these limitations of traditional co-location pattern mining, considering that the neighbor relationship is a fuzzy concept, this paper introduces the fuzzy theory into co-location pattern mining, a new fuzzy spatial neighborhood relationship measurement between instances and a reasonable feature proximity measurement between spatial features are proposed. 
Then, a novel algorithm based on fuzzy C-medoids clustering algorithm, FCB, is proposed, extensive experiments on synthetic and real-world data sets prove the practicability and efficiency of the proposed mining algorithm, it also proves that the algorithm has low sensitivity to thresholds and has high robustness.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"50 10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120906767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Most existing data stream algorithms assume a single label as the target variable. However, in many applications each observation is assigned several labels with latent dependencies among them, and their target function may change over time. Classifying such non-stationary multi-label streaming data while accounting for label dependencies and potential drifts is a challenging task. The few existing studies mostly cope with drifts implicitly, and all learn models in the original label space, which requires a lot of time and memory. None of them consider recurrent drifts in multi-label streams, particularly drifts and recurrences visible in a latent label space. In this paper, we propose a graph-based framework that maintains a pool of multi-label concepts, the transitions among them, and the corresponding multi-label classifiers. As the base classifier, we develop a fast linear label space dimension reduction method that transforms the labels into a random encoded space and trains models in the reduced space. An analytical method updates the decoding matrix, which is used during the test phase to map the labels back into the original space. Experimental results show the effectiveness of the proposed framework in terms of prediction performance and pool management.
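The encode/train/decode pipeline for label space dimension reduction can be sketched as follows: project d-dimensional label vectors into k < d dimensions with a random encoding matrix, train regressors on the reduced targets, and map predictions back with a decoding matrix. Using the Moore-Penrose pseudo-inverse as the decoder and a 0.5 threshold are simplifying assumptions here; the paper updates its decoding matrix analytically over the stream.

```python
# Random label-space encoding with pseudo-inverse decoding.
import numpy as np

rng = np.random.default_rng(0)
d, k = 8, 4                            # original vs reduced label dimension
P = rng.standard_normal((d, k))        # random encoding matrix (d x k)
D = np.linalg.pinv(P)                  # decoding matrix (k x d)

labels = rng.integers(0, 2, size=(10, d)).astype(float)
encoded = labels @ P                   # regressors are trained on these targets
decoded = (encoded @ D > 0.5).astype(float)   # map back and threshold
```

Training k regressors instead of d is where the time and memory savings come from; the decoder is the only component that must track the stream as label dependencies drift.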
{"title":"Modeling Multi-label Recurrence in Data Streams","authors":"Zahra Ahmadi, S. Kramer","doi":"10.1109/ICBK.2019.00010","DOIUrl":"https://doi.org/10.1109/ICBK.2019.00010","url":null,"abstract":"Most of the existing data stream algorithms assume a single label as the target variable. However, in many applications, each observation is assigned to several labels with latent dependencies among them, which their target function may change over time. Classification of such non-stationary multi-label streaming data with the consideration of dependencies among labels and potential drifts is a challenging task. The few existing studies mostly cope with drifts implicitly, and all learn models on the original label space, which requires a lot of time and memory. None of them consider recurrent drifts in multi-label streams and particularly drifts and recurrences visible in a latent label space. In this paper, we propose a graph-based framework that maintains a pool of multi-label concepts with transitions among them and the corresponding multi-label classifiers. As a base classifier, a fast linear label space dimension reduction method is developed that transforms the labels into a random encoded space and trains models in the reduced space. An analytical method updates the decoding matrix which is used during the test phase to map the labels back into the original space. 
Experimental results show the effectiveness of the proposed framework in terms of prediction performance and pool management.","PeriodicalId":383917,"journal":{"name":"2019 IEEE International Conference on Big Knowledge (ICBK)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124546312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}