Pub Date: 2022-10-19; DOI: 10.1109/KSE56063.2022.9953616
Phuc-Thinh Nguyen, M. Nazmudeen, Minh-Son Dao, Duy-Dong Le
Regular exercise and a science-based diet can support weight control and benefit everyone’s health, especially that of athletes. Although much research has been conducted in this field in recent years, most studies covered only small groups of people, and only a few models revealed links between weight or speed and related attributes (e.g., activities, wellbeing, habits) that could yield tips to help people control their weight and running speed. In this research, we propose an approach that uses pattern mining and correlation discovery techniques to find the optimal attributes over time for forecasting the weight and speed of an athlete for a sports event. Furthermore, we propose Adaptive Learning Models, which learn from personal and public data to forecast a person’s weight or speed across groups such as young adults, middle-aged adults, and female or male members. Based on this analysis, we examine different approaches to building prediction models of athletes’ weight or running speed from the primary data. The suggested approach yields encouraging results when tested on public and private data sets.
Title: Adaptive Learning Models for Getting Insights into Multimodal Lifelog Data. Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE).
Pub Date: 2022-10-19; DOI: 10.1109/KSE56063.2022.9953759
Bao Le, M. Nguyen, Nhi Kieu-Phuong Nguyen, Binh T. Nguyen
Intelligent systems, especially smartphones, have become a crucial part of daily life. These devices support a wide range of human tasks, from long-distance communication to healthcare assistance. Customer feedback on smartphones plays an integral role in their development. This paper presents an improved approach for the Vietnamese Smartphone Feedback Dataset (UIT-ViSFD), which was carefully collected and annotated in 2021 and contains 11,122 comments with their labels. The approach employs the pretrained PhoBERT model with a suitable pre-processing method. In the experiments, we compare the approach with other transformer-based models such as XLM-R, DistilBERT, RoBERTa, and BERT. The experimental results show that the proposed method surpasses the state-of-the-art methods on the UIT-ViSFD corpus, achieving macro-F1 scores of 86.03% and 78.76% on the Aspect Detection and Sentiment Detection tasks, respectively. In addition, our approach could improve results on other Aspect-Based Sentiment Analysis datasets in Vietnamese.
Title: A New Approach for Vietnamese Aspect-Based Sentiment Analysis. Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE).
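The macro-F1 scores reported above average the per-class F1 without weighting by class frequency. The following is a minimal sketch of that metric for a single-label task, assuming the standard definition; the labels are hypothetical and are not taken from UIT-ViSFD, and the paper's multi-label aspect detection may aggregate differently.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    f1_scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        f1_scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical sentiment labels for illustration only.
gold = ["pos", "pos", "neg", "neu", "neg", "neu"]
pred = ["pos", "neg", "neg", "neu", "neg", "pos"]
score = macro_f1(gold, pred)
print(f"macro-F1 = {score:.4f}")
```

Because every class counts equally, a rare class that is predicted poorly lowers macro-F1 as much as a frequent one, which is why it is a common choice for imbalanced sentiment data.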
Pub Date: 2022-10-19; DOI: 10.1109/KSE56063.2022.9953802
Thuy-Anh Nguyen Thi, Thi-Hong Vuong, Thi-Hanh Le, X. Phan, Thi-Thao Le, Quang-Thuy Ha
Knowledge base completion (KBC) is the task of predicting and filling in missing information based on the data already present in a knowledge base. One of the most feasible recent approaches, introduced by V. Kocijan and T. Lukasiewicz (2021), transfers knowledge from one collection of information to another without the need for entity or relation matching. However, that work neither scaled pre-training to larger models and datasets nor investigated the impact of the encoder architecture. In this work, we propose a method that combines the benefits of Bidirectional Encoder Representations from Transformers (BERT), fastText, a Gated Recurrent Unit (GRU), and a Fully Connected (FC) layer to improve the KBC task in Kocijan and Lukasiewicz’s model. The experimental results show the effectiveness of our proposed model on several popular datasets, including ReVerb20K, ReVerb45K, FB15K237, and WN18RR.
Title: Knowledge Base Completion with transfer learning using BERT and fastText. Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE).
Pub Date: 2022-10-19; DOI: 10.1109/KSE56063.2022.9953621
Kieu Dang Nam, Thi-Oanh Nguyen, N. T. Thuy, D. V. Hang, D. Long, Tran Quang Trung, D. V. Sang
The goal of Unsupervised Domain Adaptation (UDA) is to transfer the knowledge of a model learned on a labeled source domain to a target domain for which no labels are available. However, the performance of UDA can suffer greatly from domain shift, which is caused by the misalignment of the data distributions of the two sources. Endoscopy can be performed under different light modes, including white-light imaging (WLI) and image-enhanced endoscopy (IEE). Most current polyp datasets are collected in WLI mode, since it is the standard and most popular mode in all endoscopy systems. Therefore, AI models trained on such WLI datasets can degrade strongly when applied to other light modes. To address this issue, this paper proposes a coarse-to-fine UDA method that first coarsely aligns the two data distributions at the input level using the Fourier transform in chromatic space, then finely aligns them at the feature level using fine-grained adversarial training. The backbone of our model is based on a powerful transformer architecture. Experimental results show that our proposed method effectively mitigates the domain shift issue and achieves a substantial performance improvement on cross-mode polyp segmentation for endoscopy.
Title: A Coarse-to-fine Unsupervised Domain Adaptation Method for Cross-Mode Polyp Segmentation. Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE).
Pub Date: 2022-10-19; DOI: 10.1109/KSE56063.2022.9953796
Thu Trang Hoa, Minh Anh Nguyen
The talent scheduling problem seeks to determine the movie shooting sequence that minimizes the total cost of the actors involved, which usually accounts for a significant portion of the cost of any real-world movie production. This paper introduces an extension of the talent scheduling problem that takes into account the costs of both filming locations and actors. To better capture reality, we allow the rental cost of a filming location to vary across the planning horizon. The objective is to find the shooting sequence and the start date of each scene that minimize the total cost, including actor and location costs, while ensuring that all scenes are completed within the planning horizon. We first formulate the problem as a mixed integer linear programming (MILP) model, from which small instances can be solved to optimality by MILP solvers. Next, we develop an iterated local search heuristic that can efficiently solve larger instances. We then provide a new benchmark data set for our new variant of the talent scheduling problem. The results of computational experiments on the new benchmark instances suggest that our heuristic can outperform the MILP model solved by a commercial solver in terms of both solution quality and runtime.
Title: An Iterated Local Search for the Talent Scheduling Problem with Location Costs. Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE).
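To make the objective and the heuristic described above concrete, here is a minimal sketch of an iterated local search on a toy instance. Everything in it is an assumption for illustration: the instance data, the rule that scenes are shot back-to-back from day 0 (the paper also decides start dates), the swap neighbourhood, and the re-insertion perturbation. It is not the authors' algorithm.

```python
import random

# Hypothetical toy instance: duration in days, actors on set, and location per scene.
SCENES = [
    {"days": 1, "actors": {"A", "B"}, "loc": "studio"},
    {"days": 2, "actors": {"B", "C"}, "loc": "beach"},
    {"days": 1, "actors": {"A"}, "loc": "beach"},
    {"days": 1, "actors": {"C"}, "loc": "studio"},
    {"days": 2, "actors": {"A", "C"}, "loc": "studio"},
]
WAGE = {"A": 10, "B": 4, "C": 7}  # daily wage per actor
LOC_COST = {  # rental cost per location for each of the 7 days in the horizon
    "studio": [5, 5, 8, 8, 5, 5, 9],
    "beach": [9, 3, 3, 9, 9, 3, 3],
}


def total_cost(order):
    """Actor plus location cost when scenes run back-to-back from day 0 in this order."""
    first, last, day, loc_cost = {}, {}, 0, 0
    for s in order:
        scene = SCENES[s]
        end = day + scene["days"] - 1
        for actor in scene["actors"]:
            first.setdefault(actor, day)
            last[actor] = end
        loc_cost += sum(LOC_COST[scene["loc"]][day:end + 1])
        day = end + 1
    # Each actor is paid for every day from first to last scene, idle days included.
    actor_cost = sum(WAGE[a] * (last[a] - first[a] + 1) for a in first)
    return actor_cost + loc_cost


def local_search(order):
    """Descent over pairwise swaps until no swap lowers the cost."""
    order = list(order)
    best = total_cost(order)
    improved = True
    while improved:
        improved = False
        for i in range(len(order)):
            for j in range(i + 1, len(order)):
                order[i], order[j] = order[j], order[i]
                cost = total_cost(order)
                if cost < best:
                    best, improved = cost, True
                else:
                    order[i], order[j] = order[j], order[i]  # undo the swap
    return order, best


def iterated_local_search(iterations=50, seed=0):
    rng = random.Random(seed)
    best_order, best_cost = local_search(range(len(SCENES)))
    for _ in range(iterations):
        cand = list(best_order)
        for _ in range(2):  # perturbation: re-insert two random scenes elsewhere
            scene = cand.pop(rng.randrange(len(cand)))
            cand.insert(rng.randrange(len(cand) + 1), scene)
        cand, cost = local_search(cand)
        if cost < best_cost:  # keep only strict improvements
            best_order, best_cost = cand, cost
    return best_order, best_cost


order, cost = iterated_local_search()
print("shooting order:", order, "total cost:", cost)
```

The perturbation deliberately uses a different move (re-insertion) than the local search (swap), so that the descent does not simply undo it and return to the same local optimum.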
Pub Date: 2022-10-19; DOI: 10.1109/KSE56063.2022.9953618
D. Nguyen, Hieu Nguyen, Tung Le, Le-Minh Nguyen
Domain-specific document retrieval has been an important and challenging research topic in NLP, particularly for legal documents. The main challenge in the legal domain is the heavy reliance on specialized knowledge from experts, which makes the entire data collection and evaluation procedure complex and time-consuming. In this study, we propose a training data augmentation procedure and an unsupervised embedding learning method, and apply them to the Legal Document Retrieval task at the Automated Legal Question Answering Competition 2022 (ALQAC 2022). In this task, our method outperformed current standard models and achieved competitive results at ALQAC 2022.
Title: An Unsupervised Learning Method to improve Legal Document Retrieval task at ALQAC 2022. Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE).
Pub Date: 2022-10-19; DOI: 10.1109/KSE56063.2022.9953753
Joon-Choul Shin, Wansu Kim, Jusang Lee, Jieun Park, Cheolyoung Ock
In machine learning, the frequency of a feature in the training data can be used as the value of that feature; in this case, a sparse feature is likely to cause overfitting during weight optimization. This is called the sparse data problem, and this paper proposes a method that lowers the probability of a weight update the sparser the feature is. We evaluated this method on four Natural Language Processing tasks, and the results showed positive effects on all of them. On average, the method eliminated 8 out of every 100 errors. It also reduced the number of weight updates, so the learning time was reduced to 81% in the Named Entity Recognition task.
Title: Controlling Weight Update Probability of Sparse Features in Machine Learning. Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE).
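The idea of updating the weights of sparse features less often can be sketched with a simple perceptron. The abstract does not give the actual probability rule, so `update_probability` and its `threshold` parameter below are hypothetical stand-ins, as are the training examples.

```python
import random
from collections import Counter


def update_probability(freq, threshold=5):
    """Hypothetical rule: features seen fewer than `threshold` times update less often."""
    return min(1.0, freq / threshold)


def train_perceptron(data, epochs=5, threshold=5, seed=0):
    """data: list of (set_of_active_features, label) pairs with label in {-1, +1}."""
    rng = random.Random(seed)
    freq = Counter(f for feats, _ in data for f in feats)
    weights = Counter()
    skipped = 0
    for _ in range(epochs):
        for feats, y in data:
            score = sum(weights[f] for f in feats)
            if y * score <= 0:  # misclassified, so apply the perceptron update
                for f in feats:
                    if rng.random() < update_probability(freq[f], threshold):
                        weights[f] += y
                    else:
                        skipped += 1  # sparse feature left untouched this time
    return weights, skipped


# Hypothetical toy data; "zyzzyva" and "film" each occur once, so they are sparse.
data = [
    ({"the", "good"}, +1),
    ({"the", "bad"}, -1),
    ({"the", "good", "zyzzyva"}, +1),
    ({"the", "bad", "film"}, -1),
]
weights, skipped = train_perceptron(data)
print("weights:", dict(weights), "skipped updates:", skipped)
```

Each skipped update is work that is not done, which is consistent with the abstract's observation that fewer weight updates shorten training time.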
Pub Date: 2022-10-19; DOI: 10.1109/KSE56063.2022.9953615
Suong N. Hoang, Binh Duc Nguyen, Nam-Phong Nguyen, Son T. Luu, Hieu T. Phan, H. Nguyen
The explosion of free-text content on social media has brought an exponential propagation of hate speech. Hate speech is clearly defined in the community guidelines of many popular platforms such as Facebook, TikTok, and Twitter, where any communication that passes judgment on minority or protected groups is considered hateful content. This paper first points out the sophisticated word-play of malicious users in a Vietnamese Hate Speech (VHS) Dataset. We then propose using Center Loss during training to disambiguate the task-based sentence embedding and improve the model's generalization. Moreover, we propose a task-based lexical attention pooling that highlights lexicon-level information, which is then combined into the sentence embedding. The experimental results show that the proposed method improves the F1 score on the ViHSD dataset, while training time and inference speed change only insignificantly.
Title: Enhanced Task-based Knowledge for Lexicon-based Approach in Vietnamese Hate Speech Detection. Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE).
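Center Loss, mentioned above, penalizes the distance between each embedding and a learned center for its class, which pulls same-class sentence embeddings together. The sketch below assumes the commonly used formulation (half the squared Euclidean distance, here averaged over the batch, with centers moved toward their class members); the embeddings, labels, and `alpha` step size are hypothetical, and how the paper combines this term with its classification loss is not shown.

```python
def center_loss(embeddings, labels, centers):
    """Half the mean squared Euclidean distance of each embedding to its class center."""
    total = 0.0
    for x, y in zip(embeddings, labels):
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, centers[y]))
    return 0.5 * total / len(embeddings)


def update_centers(embeddings, labels, centers, alpha=0.5):
    """Move each center a step of size alpha toward its class members."""
    updated = {}
    for cls, c in centers.items():
        members = [x for x, y in zip(embeddings, labels) if y == cls]
        # The 1 + count denominator keeps the step finite for classes absent from the batch.
        delta = [
            sum(c[d] - x[d] for x in members) / (1 + len(members))
            for d in range(len(c))
        ]
        updated[cls] = [c[d] - alpha * delta[d] for d in range(len(c))]
    return updated


# Hypothetical 2-D sentence embeddings and labels.
embeddings = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]
labels = ["hate", "hate", "clean"]
centers = {"hate": [0.0, 0.0], "clean": [0.0, 0.0]}

loss_before = center_loss(embeddings, labels, centers)
centers = update_centers(embeddings, labels, centers)
loss_after = center_loss(embeddings, labels, centers)
print(f"center loss before: {loss_before:.4f}, after one center update: {loss_after:.4f}")
```

In training, this term would typically be added to the classification loss with a small weight, so that embeddings stay separable while becoming more compact within each class.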
Pub Date: 2022-10-19; DOI: 10.1109/KSE56063.2022.9953797
T. Hoang, Viet-Cuong Ta
Graph neural networks (GNNs) are among the dominant approaches for learning from graph-structured data and are used in applications such as social networks and product recommendation. A GNN operates mainly through message passing, in which a node receives information from related nodes to improve its internal representation. However, as the depth of the GNN increases, message passing cuts off the high-frequency components of the nodes’ representations, leading to the over-smoothing issue. In this paper, we propose cluster-based sampling to reduce the smoothing effect of a large number of layers in a GNN. Given that each node is assigned to a specific region of the embedding space, cluster-based sampling is expected to propagate this information to the node’s neighbours, thereby improving the nodes’ expressivity. We test our approach with several popular GNN architectures, and the experiments show that, measured by the Mean Average Distance metric, it reduces the smoothing effect compared with standard approaches.
Title: Effect of Cluster-based Sampling on the Over-smoothing Issue in Graph Neural Network. Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE).
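Over-smoothing shows up as node representations becoming nearly indistinguishable, and the Mean Average Distance (MAD) metric quantifies this. The sketch below assumes the commonly used cosine-distance definition computed over all node pairs; restricting the pairs with a mask (for example to neighbours only) is omitted, and the example representations are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; defined as 0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 0.0


def mean_average_distance(h):
    """Average, over nodes, of each node's mean non-zero cosine distance to the others."""
    per_node = []
    for i, hi in enumerate(h):
        dists = [cosine_distance(hi, hj) for j, hj in enumerate(h) if j != i]
        nonzero = [d for d in dists if d > 1e-12]  # ignore numerically zero distances
        if nonzero:
            per_node.append(sum(nonzero) / len(nonzero))
    return sum(per_node) / len(per_node) if per_node else 0.0


# Hypothetical node representations before and after many message-passing layers.
sharp = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
smoothed = [[1.0, 0.9], [0.9, 1.0], [1.0, 1.0]]
print("MAD (distinct representations):", mean_average_distance(sharp))
print("MAD (over-smoothed):", mean_average_distance(smoothed))
```

A lower MAD means the representations point in more similar directions, so a method that mitigates over-smoothing should keep MAD higher at the same depth.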
Pub Date: 2022-10-19; DOI: 10.1109/KSE56063.2022.9953790
Duy-Dong Le, Anh-Khoa Tran, Minh-Son Dao, M. Nazmudeen, Viet-Tiep Mai, Nhat-Ha Su
Forecasting the air quality index in big cities is an exciting area of study within smart cities and Internet of Things healthcare. In recent years, a large number of empirical, academic, and review papers using machine learning for air quality analysis have been published. However, most of those studies focused on traditional centralized processing on a single machine, and there have been few surveys of federated learning in this field. This overview aims to fill this gap and provide newcomers with a broader perspective to inform future research on this topic, especially for the multi-model approach. We examined over 70 carefully selected papers in this scope and found that multi-model federated learning is the most effective technique for enhancing air quality index prediction. Therefore, this mechanism deserves consideration by the scientific community in the coming years.
Title: Federated Learning for Air Quality Index Prediction: An Overview. Published in: 2022 14th International Conference on Knowledge and Systems Engineering (KSE).
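For readers new to federated learning, the contrast with centralized processing can be illustrated by the basic aggregation step of federated averaging (FedAvg): each client trains locally and the server combines the resulting parameters, so raw data never leaves the client. This is a generic sketch of that step, not the multi-model technique the overview highlights; the station parameters and sample counts are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Weighted mean of client parameter vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]


# Hypothetical model parameters from two air quality monitoring stations.
station_weights = [[1.0, 2.0], [3.0, 4.0]]
station_samples = [100, 300]
global_weights = federated_average(station_weights, station_samples)
print("aggregated parameters:", global_weights)
```

Weighting by sample count lets stations with more observations influence the global model more, which matters when monitoring sites report at very different rates.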