Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00091
Yufei He, Yao Ma
As Graph Neural Networks (GNNs) are widely used in various fields, there is a growing demand for improving their efficiency and scalability. Knowledge Distillation (KD), a classical method for model compression and acceleration, has been gradually introduced into the field of graph learning. More recently, it has been shown that, through knowledge distillation, the predictive capability of a well-trained GNN model can be transferred to lightweight and easy-to-deploy MLP models. Such distilled MLPs are able to achieve performance comparable to that of their corresponding GNN teachers while being significantly more efficient in terms of both space and time. However, research on KD for graph learning is still at an early stage, and the existing KD framework has several limitations. The major issues are that distilled MLPs lack useful information about the graph structure and that the teacher's logits are not always reliable. In this paper, we propose a Scalable and effective graph neural network Knowledge Distillation framework (SGKD) to address these issues. Specifically, to include the graph, we use feature propagation as preprocessing to provide MLPs with graph structure-aware features in the original feature space; to address the teacher's unreliable logits, we introduce simple yet effective training strategies such as masking and temperature. With these innovations, our framework is more effective while remaining scalable and efficient in training and inference. We conducted comprehensive experiments on eight datasets of different sizes - up to 100 million nodes - under various settings. The results demonstrate that SGKD significantly outperforms existing KD methods and even achieves performance comparable to that of the state-of-the-art GNN teachers.
Title: SGKD: A Scalable and Effective Knowledge Distillation Framework for Graph Representation Learning. Venue: 2022 IEEE International Conference on Data Mining Workshops (ICDMW).
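The two ingredients named in the SGKD abstract above, feature propagation as preprocessing and temperature/masking on the teacher's logits, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the function names, the symmetric normalization with self-loops, the number of hops, and in particular the reading of "masking" as dropping nodes where the teacher is not confident. The abstract does not specify any of these details, so this should not be taken as the authors' implementation.

```python
import numpy as np

def propagate_features(adj, x, hops=2):
    """Smooth node features over the graph; output keeps the input feature dimension.

    Assumption: symmetric normalization D^-1/2 (A + I) D^-1/2, applied `hops` times.
    """
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                       # add self-loops
    deg = a_hat.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    out = x.astype(float)
    for _ in range(hops):
        out = a_norm @ out                        # one round of neighbour averaging
    return out

def soft_targets(logits, temperature=2.0):
    """Temperature-scaled softmax; a higher temperature gives a flatter distribution."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)          # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def confident_mask(teacher_logits, threshold=0.6):
    """Hypothetical masking rule: keep only nodes where the teacher's top probability is high."""
    probs = soft_targets(teacher_logits, temperature=1.0)
    return probs.max(axis=1) >= threshold
```

In this sketch the propagated features would be computed once, offline, and then fed to an ordinary MLP student, which is what keeps inference free of any graph operations.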
Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00026
Fabian Fingerhut, Chaitra Harsha, Amirmohammad Eghbalian, Tom Jacobs, Mahdi Tabassian, R. Verbeke, E. Tsiporkova
Manufacturing companies have considerable room to become more sustainable. During machining operations, cutting tools are not replaced in an optimal way, resulting in sub-optimal usage of resources and inefficiencies during the production process. Using data-driven approaches to extend the usage of tools can greatly improve on this shortcoming by optimizing the tool replacement process. This study therefore seeks to investigate the value of several data-driven approaches, applied to an industrial dataset, to achieve this goal. Although the examined data-driven methods were applied to a dataset that was generated under a wide variety of machining conditions and lacks reliable ground truth, the experimental results confirm that these methods are indeed capable of extracting informative profiles from tool usage and can identify anomalous patterns and signs in the time-series datasets collected during different machining processes.
Title: Data-Driven Usage Profiling and Anomaly Detection in Support of Sustainable Machining Processes. Venue: 2022 IEEE International Conference on Data Mining Workshops (ICDMW).
Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00063
Ashok Kumar, T. Trueman, E. Cambria
Online social networks have become one of the primary means of communication for individuals. They rapidly generate a large volume of textual and non-textual data such as images, audio, and videos. In particular, textual data plays a vital role in detecting mental health-related problems such as stress, depression, anxiety, and emotional and behavioral disorders. In this paper, we identify the mental stress of online users in social networks using a transformer-based RoBERTa model and an autoregressive model, XLNet. We implement these models in both a constrained system and an unconstrained system. The constrained system keeps the gold-standard training, validation, and test splits. The unconstrained system, on the other hand, divides the given dataset into user-specific training, validation, and test sets. Our results indicate that the proposed transformer-based RoBERTa model achieves better results than state-of-the-art models in both the constrained and unconstrained systems.
Title: Stress Identification in Online Social Networks. Venue: 2022 IEEE International Conference on Data Mining Workshops (ICDMW).
Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00062
Wojciech Korczynski, Jan Kocoń
Transformer models like BERT have significantly improved performance on many NLP tasks, e.g., sentiment analysis. However, their large number of parameters makes real-world applications difficult because of computational costs and latency. Many compression methods have been proposed to solve this problem using quantization, weight pruning, and knowledge distillation. In this work, we explore some of these task-specific and task-agnostic methods by comparing their effectiveness and quality on the MultiEmo sentiment analysis dataset. Additionally, we analyze their ability to generalize and capture sentiment features by conducting domain-sentiment experiments. The results show that the compression methods reduce the model size by 8.6 times and the inference time by 6.9 times compared to the original model without degrading quality. Smaller models perform better on tasks with less data and retain greater generalization ability after fine-tuning because they are less prone to overfitting. The best trade-off is obtained using the task-agnostic XtremeDistil model.
Title: Compression Methods for Transformers in Multidomain Sentiment Analysis. Venue: 2022 IEEE International Conference on Data Mining Workshops (ICDMW).
Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00152
P. Caso, Martino Trevisan, L. Vassio
Online Social Networks (OSNs) are an integral part of modern life for sharing thoughts, stories, and news. An ecosystem of influencers generates a flood of content in the form of posts, some of which have an unusually high level of engagement with the influencer's fan base. These posts relate to blossoming topics of discussion that generate particular interest among users; the COVID-19 pandemic is a prominent example. Studying these phenomena provides an understanding of the OSN landscape and requires appropriate methods. This paper presents a methodology to discover notable posts and group them according to their related topic. By combining anomaly detection, graph modelling and community detection techniques, we pinpoint salient events automatically, with the ability to tune how many of them are reported. We showcase our approach using a large Instagram dataset of 1.4 million posts, from which we extract notable weekly topics that gained momentum. We then illustrate some use cases ranging from the COVID-19 outbreak to sporting events.
Title: Disentangling the Information Flood on OSNs: Finding Notable Posts and Topics. Venue: 2022 IEEE International Conference on Data Mining Workshops (ICDMW).
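As a rough illustration of the anomaly-detection step mentioned in the abstract above, the sketch below flags posts whose engagement is unusually high relative to the same influencer's other posts. The robust z-score (median and median absolute deviation), the function name, and the default threshold are all assumptions chosen for illustration; the abstract does not say which detector the authors use. The `threshold` parameter stands in for the tunable number of reported events.

```python
from statistics import median

def notable_posts(engagements, threshold=3.5):
    """Return indices of posts with unusually high engagement.

    Assumption: a robust z-score based on the median absolute deviation (MAD);
    raising `threshold` reports fewer posts, lowering it reports more.
    """
    med = median(engagements)
    mad = median(abs(e - med) for e in engagements)
    if mad == 0:
        return []  # no spread in the data, so nothing can be ranked as unusual
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    return [i for i, e in enumerate(engagements)
            if 0.6745 * (e - med) / mad > threshold]
```

Only positive deviations are flagged, since the abstract is concerned with posts that attract more engagement than usual, not less.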
Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00028
Shi Jer Low, V. Raghavan, H. Gopalan, Jian Cheng Wong, J. Yeoh, C. Ooi
Data-driven approaches, including deep learning, have shown great promise as surrogate models across many domains, including computer vision and natural language processing. These extend to various areas in sustainability, including satellite image analysis to obtain information such as land usage and extent of development. An interesting direction to which data-driven methods have not yet been applied much is the quick quantitative evaluation of urban layouts for planning and design. In particular, urban designs typically involve complex trade-offs between multiple objectives, including limits on urban build-up and/or consideration of the urban heat island effect. Hence, it can be beneficial for urban planners to have a fast surrogate model to predict urban characteristics of a hypothetical layout, e.g. pedestrian-level wind velocity, without having to run computationally expensive and time-consuming high-fidelity numerical simulations each time. This fast surrogate can then potentially be integrated into other design optimization frameworks, including generative models or other gradient-based methods. Here we present an investigation into the use of convolutional neural networks as a surrogate for urban layout characterization that is typically done via high-fidelity numerical simulation. We then apply this model in a first demonstration of its utility for data-driven pedestrian-level wind velocity prediction. The data set in this work comprises results from high-fidelity numerical simulations of wind velocities for a diverse set of realistic urban layouts, based on randomized samples from a real-world, highly built-up city. We then provide prediction results obtained from the neural network trained on this data set, demonstrating test errors of under 0.1 m/s for previously unseen urban layouts. We further illustrate how this can be useful for purposes such as rapid evaluation of pedestrian wind velocity for a potential new layout. In addition, it is hoped that this data set will further inspire, facilitate and accelerate research in data-driven urban AI, even as our baseline model facilitates quantitative comparison to future, more innovative methods.
Title: FastFlow: AI for Fast Urban Wind Velocity Prediction. Venue: 2022 IEEE International Conference on Data Mining Workshops (ICDMW).
Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00044
André Picado, A. Finamore, Ana Moura Santos, C. Antunes
Online education has gained significant relevance over the last few years, and the pandemic has shown that it now plays a fundamental role. However, even with the increasing number of students enrolled in online courses, these courses still do not allow for enough personalization, often leading students to become demotivated and drop out. Better adapting online courses to students aims to support them in an inclusive and equitable way, since learners often come from quite diverse backgrounds. The continuous demand for online learning, and the need to customize it according to each student's profile, has led to a succession of attempts at recommendation systems. Nevertheless, many of them were entirely based on collaborative filtering, almost ignoring profiling requirements. In this paper, we propose a recommendation system to be integrated into MOOCs (Massive Open Online Courses), following a hybrid architecture. In our proposal, learning resources are described by a set of terms, extracted directly from the supporting texts in the MOOC. From these terms, those that are included in the exercises are used to specify the important skills learners must acquire, and the results achieved by each learner in them are used to characterize that student's state at a given moment. Those states are then used to make the recommendation collaboratively, allowing for different recommendations for each particular student over time. The system is validated across several MOOCs.
Title: Students Temporal Profiling and e-Learning Resources Recommendation Based on NLP's Terms Extraction. Venue: 2022 IEEE International Conference on Data Mining Workshops (ICDMW).
Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00011
Yanan Xiao, Minyu Liu, Zichen Zhang, Lu Jiang, Minghao Yin, Jianan Wang
Traffic flow prediction is an important part of smart transportation. The goal is to predict future traffic conditions based on historical data recorded by sensors and the traffic network. As the city continues to build, parts of the transportation network will be added or modified. Accurately predicting expanding and evolving long-term streaming networks is therefore of great significance. To this end, we propose a new simulation-based criterion that considers teaching autonomous agents to mimic sensor patterns, planning their next visit based on the sensor's profile (e.g., traffic, speed, occupancy). The data recorded by the sensor is most accurate when the agent can perfectly simulate the sensor's activity pattern. We propose to formulate the problem as a continuous reinforcement learning task, where the agent is the next flow value predictor, the action is the next time-series flow value in the sensor, and the environment state is a dynamically fused representation of the sensor and transportation network. Actions taken by the agent change the environment, which in turn forces the agent's mode to update, while the agent further explores changes in the dynamic traffic network, which helps the agent predict its next visit more accurately. Therefore, we develop a strategy in which sensors and traffic networks update each other and incorporate temporal context to quantify state representations evolving over time. Along these lines, we propose a streaming traffic flow prediction model based on continuous reinforcement learning (ST-CRL), a predictive model that combines reinforcement learning and continuous learning, and an analytical algorithm based on KL divergence that incorporates long-term novel patterns into model induction. Second, we introduce a prioritized experience replay strategy to consolidate and aggregate previously learned core knowledge into the model. The proposed model is able to continuously learn and predict as the traffic flow network expands and evolves over time. Extensive experiments show that the algorithm has great potential in predicting long-term streaming media networks, while achieving data privacy protection to a certain extent.
Title: Streaming Traffic Flow Prediction Based on Continuous Reinforcement Learning. Venue: 2022 IEEE International Conference on Data Mining Workshops (ICDMW).
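The ST-CRL abstract above mentions an analytical algorithm based on KL divergence for picking up novel long-term patterns. One plausible way to use KL divergence for that purpose, sketched below under stated assumptions, is to compare the histogram of a sensor's recent flow values with the histogram of its historical values and treat a large divergence as a sign of a new pattern. The function names, the binning, and the smoothing constant `eps` are hypothetical; the abstract does not describe the actual algorithm.

```python
import numpy as np

def flow_histogram(values, bins, eps=1e-9):
    """Normalized histogram of flow values; `eps` keeps every bin strictly positive."""
    counts, _ = np.histogram(values, bins=bins)
    p = counts.astype(float) + eps
    return p / p.sum()

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return float(np.sum(p * np.log(p / q)))

# Illustrative data only: historical flows versus two recent windows.
bins = np.linspace(0, 100, 11)
historical = np.array([10, 12, 11, 13, 50, 52])
recent_similar = np.array([11, 12, 10, 14, 51, 53])
recent_shifted = np.array([80, 82, 85, 90, 88, 81])

h_hist = flow_histogram(historical, bins)
score_similar = kl_divergence(flow_histogram(recent_similar, bins), h_hist)
score_shifted = kl_divergence(flow_histogram(recent_shifted, bins), h_hist)
```

A window whose score exceeds some chosen cutoff could then be prioritized for further training, which is one way a replay strategy and a novelty measure might be combined.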
Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00154
Zhangyue Shi, Yuxuan Li, Chenang Liu
In advanced manufacturing, online sensing technologies have created great potential for effective in-situ process monitoring via machine learning-based approaches. In manufacturing practice, online sensor data are usually collected progressively, and the stream data collected at later stages may also contain knowledge that is informative for process monitoring. It is therefore highly valuable for a machine learning-based monitoring model to learn incrementally. To achieve this goal, this paper develops a multi-stage incremental learning approach enabled by knowledge distillation, which distills representative information from the machine learning model trained at the early/offline stage and uses it to enhance monitoring performance at the later stages. To validate its effectiveness, a real-world case study is conducted in additive manufacturing, an emerging advanced manufacturing technology. The experimental results show that the developed knowledge distillation-enabled multi-stage incremental learning approach is very promising for improving online monitoring performance in advanced manufacturing.
{"title":"Knowledge Distillation-enabled Multi-stage Incremental Learning for Online Process Monitoring in Advanced Manufacturing","authors":"Zhangyue Shi, Yuxuan Li, Chenang Liu","doi":"10.1109/ICDMW58026.2022.00154","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00154","journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)"},"publicationDate":"2022-11-01"}
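To make the distillation step in this entry more concrete, here is a minimal sketch of the standard temperature-based knowledge distillation loss, in which a later-stage model (the student) is trained against both the true label and the softened predictions of an earlier-stage model (the teacher). This is the generic formulation, not necessarily the loss used in the paper; the function names and the `alpha` and `temperature` defaults are assumptions chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, stabilized by subtracting the maximum."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Weighted sum of a hard-label loss and a soft-label (teacher) loss.

    alpha weights the cross-entropy with the true label; (1 - alpha) weights
    the KL divergence from teacher to student at the given temperature.
    """
    # Hard loss: cross-entropy between the student's prediction and the label.
    hard = -math.log(softmax(student_logits)[label])

    # Soft loss: KL(teacher || student) on temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * (math.log(t) - math.log(s))
               for t, s in zip(p_teacher, p_student) if t > 0)

    # The T^2 factor keeps the soft-loss gradient scale comparable across temperatures.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft
```

When the student reproduces the teacher's logits exactly, the soft term vanishes and only the hard-label term remains, which is a quick sanity check for any implementation of this loss.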
Pub Date : 2022-11-01 DOI: 10.1109/ICDMW58026.2022.00056
Yang Syu, Chien-Min Wang
Before conducting any further application or performing more advanced processing, analyzing and identifying the probability distribution of the data is a crucial task. Traditionally, statistical methods have been developed for this procedure. In recent years, researchers in computer science have proposed and applied various machine learning-based techniques to address this problem. However, the existing solutions remain problematic and inconvenient; for example, they need human intervention and produce complex models. Thus, this paper proposes, implements, and tests a genetic programming-based approach for identifying probability functions that avoids these deficiencies and inconveniences. Based on our empirical trials, the proposed approach can effectively recognize (retrieve) the probability distribution function behind data within an immense search space of mathematical expressions.
{"title":"Using Genetic Programming to Identify Probability Distribution behind Data: A Preliminary Trial","authors":"Yang Syu, Chien-Min Wang","doi":"10.1109/ICDMW58026.2022.00056","DOIUrl":"https://doi.org/10.1109/ICDMW58026.2022.00056","journal":{"name":"2022 IEEE International Conference on Data Mining Workshops (ICDMW)"},"publicationDate":"2022-11-01"}
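A genetic programming search over mathematical expressions needs some way to score how well a candidate expression matches the data. The sketch below shows one plausible fitness function: the mean squared error between a candidate density and a histogram estimate of the empirical density (lower is better). It covers only the scoring step, not the evolutionary search itself, and the choice of a histogram-based error, the function names, and the bin count are all assumptions rather than the paper's method.

```python
import math
import random


def histogram_density(samples, bins=20):
    """Return bin centers and normalized histogram densities for the samples."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        raise ValueError("samples must not all be identical")
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        # Clamp so that the maximum value falls into the last bin.
        counts[min(int((x - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    densities = [c / (len(samples) * width) for c in counts]
    return centers, densities


def fitness(candidate_pdf, samples, bins=20):
    """Mean squared error between the candidate density and the histogram (lower is better)."""
    centers, densities = histogram_density(samples, bins)
    return sum((candidate_pdf(c) - d) ** 2
               for c, d in zip(centers, densities)) / bins


def standard_normal_pdf(x):
    """Example candidate expression: the standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


# Usage: score two candidate expressions against samples drawn from N(0, 1).
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(5000)]
score_normal = fitness(standard_normal_pdf, data)
score_constant = fitness(lambda x: 0.1, data)
```

A search procedure would evaluate many such candidates and keep those with the lowest score; a likelihood-based score would be a reasonable alternative to the histogram error used here.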