Data Technologies and Applications: latest articles

A novel similarity measure SF-IPF for CBKNN with implicit feedback data
IF 1.6 Zone 4 Computer Science Q1 Social Sciences Pub Date: 2024-06-04 DOI: 10.1108/dta-07-2023-0370
Rajalakshmi Sivanaiah, Mirnalinee T T, Sakaya Milton R

Purpose

The increasing popularity of music streaming services increases the need to customize the services for each user in order to attract and retain customers. Most music streaming services do not have explicit ratings for songs; they have only implicit feedback data, i.e., users' listening history. For efficient music recommendation, the users' preferences have to be inferred, which is a challenging task.

Design/methodology/approach

Users' preferences can be identified from their listening history. In this paper, a hybrid music recommendation system is proposed that infers features from users' implicit feedback and combines content-based and collaborative filtering to recommend songs. A Content Boosted K-Nearest Neighbours (CBKNN) filtering technique is proposed, which uses the users' listening history, the popularity of songs, song features and the songs of users with similar interests to recommend songs. The song features are taken as content features. A Song Frequency–Inverse Popularity Frequency (SF-IPF) metric is proposed to measure the similarity among neighbours in collaborative filtering. The Million Song Dataset and the Echo Nest Taste Profile Subset are used as data sets.
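
The abstract names SF-IPF but does not give its formula. By analogy with TF-IDF, one plausible reading is that a user's share of plays for a song (song frequency) is down-weighted by how many users listen to that song (popularity). The sketch below assumes that reading only; the weighting `sf * log(n_users / listeners)`, the function names and the toy play counts are hypothetical and are not the authors' definition.

```python
import math
from collections import defaultdict

def sf_ipf_vectors(play_counts):
    """Map {user: {song: plays}} to {user: {song: weight}}.

    ASSUMPTION: TF-IDF-style weighting, not the formula from the paper.
    """
    n_users = len(play_counts)
    listeners = defaultdict(int)  # number of users who played each song
    for songs in play_counts.values():
        for song in songs:
            listeners[song] += 1
    vectors = {}
    for user, songs in play_counts.items():
        total = sum(songs.values())
        vectors[user] = {
            song: (plays / total) * math.log(n_users / listeners[song])
            for song, plays in songs.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse weight vectors."""
    dot = sum(w * v[s] for s, w in u.items() if s in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def nearest_neighbours(vectors, target, k):
    """Return the k users most similar to target."""
    scores = [(cosine(vectors[target], vec), user)
              for user, vec in vectors.items() if user != target]
    scores.sort(reverse=True)
    return [user for _, user in scores[:k]]

# Hypothetical listening histories: everyone plays "hit", so it carries no weight.
plays = {
    "ann": {"hit": 10, "indie_a": 5},
    "bob": {"hit": 8, "indie_a": 4},
    "cat": {"hit": 9, "indie_b": 6},
}
print(nearest_neighbours(sf_ipf_vectors(plays), "ann", 1))  # → ['bob']
```

Under this assumed weighting, a song that every user plays contributes nothing to similarity, so neighbours are chosen by shared niche taste rather than shared exposure to popular songs.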

Findings

The proposed CBKNN technique, using the SF-IPF similarity measure to identify neighbours with similar interests, performs better than other machine learning techniques such as linear regression, decision trees, random forest, support vector machines, XGBoost and AdaBoost. SF-IPF was also compared with other similarity metrics such as Pearson and cosine similarity, and it gave better performance.

Originality/value

This method was devised to infer user preferences from implicit feedback data and convert them into rating preferences. The importance of combining content features with collaborative information in hybrid filtering is analysed. A new similarity metric, SF-IPF, is formulated to measure the similarity between users in collaborative filtering.

Citations: 0
Practice challenge recommendations in online judge using implicit rating extraction and utility sequence patterns
IF 1.6 Zone 4 Computer Science Q1 Social Sciences Pub Date: 2024-05-29 DOI: 10.1108/dta-10-2023-0688
Ramesh P Natarajan, Kannimuthu S, Bhanu D

Purpose

Existing traditional recommendation approaches based on content-based filtering (CBF), collaborative filtering (CF) and hybrid methods are inadequate for recommending practice challenges in programming online judges (POJ). These systems consider only the preferences of the target user or of similar users when recommending items. In a learning environment, recommender systems should also consider the learning path, knowledge level and ability of the learner. Another major problem in POJ is that learners do not rate practice challenges the way users of e-commerce and video streaming portals rate items. The purpose of the proposed approach is to overcome these shortcomings.

Design/methodology/approach

To achieve context-aware practice challenge recommendation, the proposed approach uses data preparation techniques including implicit rating extraction, data preprocessing to remove outliers, sequence-based learner clustering and utility sequence pattern mining. The approach ensures that the recommender system considers the knowledge level, learning path and learning goals of the learner when recommending practice challenges.

Findings

Experiments on practice challenge recommendation conducted using a real-world POJ dataset show that the proposed system outperforms other traditional approaches. The experiments also demonstrate that the proposed system recommends challenges based on the learner's current context. The implicit ratings extracted using the proposed approach work accurately in the recommender system.

Originality/value

The proposed system contains the following novel approaches to address the lack of ratings and of context-aware recommendations. A mathematical model is used to extract ratings from learner submissions. A statistical approach is used in data preprocessing. Sequence similarity-based learner clustering is used in the transition matrix. Utilizing the rating as a utility in the USPAN algorithm provides useful insights into learner–challenge relationships.

Citations: 0
Efficient software mutation test by clustering the single-line redundant mutants
IF 1.6 Zone 4 Computer Science Q1 Social Sciences Pub Date: 2024-04-24 DOI: 10.1108/dta-05-2023-0152
Bahman Arasteh, Ali Ghaffari

Purpose

The main goals of this study are to reduce the number of generated mutants by clustering redundant mutants, to reduce execution time by decreasing the number of generated mutants, and to reduce the cost of mutation testing.

Design/methodology/approach

In this study, a method is suggested to identify and prune redundant mutants. First, the program source code is analyzed by the developed parser to filter out the effectless instructions; the remaining instructions are then mutated by the standard mutation operators. The single-line mutants are partially executed by the developed instruction evaluator. Next, a clustering method is used to group the single-line mutants that produce the same results. There is only one complete run per cluster.

Findings

The results of experiments on the Java benchmarks indicate that the proposed method yields a 53.51 per cent reduction in the number of mutants and a 57.64 per cent reduction in time compared with similar experiments in the MuJava and MuClipse tools.

Originality/value

The contributions are: (1) developing a classifier that takes the source code of the program and classifies the program's instructions into effective and effectless classes using a dependency graph; filtering out the effectless instructions reduces the total number of mutants generated; (2) developing and implementing an instruction parser and instruction-level mutant generator for Java programs; the mutant generator takes an instruction in the original program as a string and generates its single-line mutants based on the standard mutation operators in MuJava; (3) developing a stack-based evaluator that takes an instruction (original or mutant) and the test data and evaluates its result without executing the whole program.
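
The clustering step in this abstract (evaluate each single-line mutant in isolation, then group mutants with identical results so that only one full run per group is needed) can be sketched roughly as follows. This is an illustration only: the paper targets Java and a stack-based evaluator, whereas this sketch uses Python's `eval` on simple arithmetic expressions, and the operator set, function names and test inputs are all assumptions.

```python
from collections import defaultdict

# ASSUMPTION: a tiny arithmetic operator-replacement set, not MuJava's full list.
ARITHMETIC_OPS = ["+", "-", "*"]

def single_line_mutants(expr):
    """Replace each arithmetic operator in expr with every other operator."""
    mutants = []
    for i, ch in enumerate(expr):
        if ch in ARITHMETIC_OPS:
            for op in ARITHMETIC_OPS:
                if op != ch:
                    mutants.append(expr[:i] + op + expr[i + 1:])
    return mutants

def evaluate(expr, inputs):
    """Evaluate one instruction in isolation on every test input."""
    return tuple(eval(expr, {"__builtins__": {}}, dict(env)) for env in inputs)

def cluster_mutants(expr, inputs):
    """Group mutants whose results agree on all test inputs."""
    clusters = defaultdict(list)
    for mutant in single_line_mutants(expr):
        clusters[evaluate(mutant, inputs)].append(mutant)
    return dict(clusters)

# Hypothetical test data on which both mutants of "a + b" behave identically.
tests = [{"a": 0, "b": 0}, {"a": -2, "b": 2}]
print(cluster_mutants("a + b", tests))  # → {(0, -4): ['a - b', 'a * b']}
```

Here both mutants fall into one cluster, so a single complete program run would stand in for the pair. Adding a test input such as `{"a": 1, "b": 1}` splits them into separate clusters.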
Citations: 0
A novel neural network architecture and cross-model transfer learning for multi-task autonomous driving
IF 1.6 Zone 4 Computer Science Q1 Social Sciences Pub Date: 2024-04-12 DOI: 10.1108/dta-08-2022-0307
Youwei Li, Jian Qu

Purpose

The purpose of this research is to achieve multi-task autonomous driving by adjusting the network architecture of the model. Meanwhile, after achieving multi-task autonomous driving, the authors found that the trained neural network model performs poorly in untrained scenarios. Therefore, the authors proposed to improve the transfer efficiency of the model for new scenarios through transfer learning.

Design/methodology/approach

First, the authors achieved multi-task autonomous driving by training a model that combines a convolutional neural network with differently structured long short-term memory (LSTM) layers. Second, the authors achieved fast transfer of neural network models to new scenarios by cross-model transfer learning. Finally, the authors combined data collection and data labeling to improve the efficiency of deep learning. Furthermore, the authors verified through light and shadow tests that the model has good robustness.

Findings

This research achieved road tracking, real-time acceleration–deceleration, obstacle avoidance and left/right sign recognition. The model proposed by the authors (UniBiCLSTM) outperforms the existing models tested with model cars in terms of autonomous driving performance. Furthermore, the CMTL-UniBiCL-RL model trained by the authors through cross-model transfer learning improves the efficiency of model adaptation to new scenarios. Meanwhile, this research proposed an automatic data annotation method, which can save 1/4 of the time for deep learning.

Originality/value

This research provides novel solutions for achieving multi-task autonomous driving and for transferring neural network models to new scenarios through transfer learning. The experiments were carried out on a single camera with an embedded chip and a scale model car, which is expected to simplify the hardware for autonomous driving.

Citations: 0
Application of deep learning model incorporating domain knowledge in international migration forecasting
IF 1.6 Zone 4 Computer Science Q1 Social Sciences Pub Date: 2024-04-12 DOI: 10.1108/dta-08-2023-0523
Tongzheng Pu, Chongxing Huang, Haimo Zhang, Jingjing Yang, Ming Huang

Purpose

Forecasting population movement trends is crucial for implementing effective policies to regulate labor force growth and understand demographic changes. Combining migration theory expertise and neural network technology can bring a fresh perspective to international migration forecasting research.

Design/methodology/approach

This study proposes a conditional generative adversarial network model that incorporates migration knowledge, the migration knowledge conditional generative adversarial network (MK-CGAN). By using migration knowledge to design the parameters, MK-CGAN can effectively address the limited data problem, thereby enhancing the accuracy of migration forecasts.

Findings

The model was tested by forecasting migration flows between different countries and showed good generalizability and validity. The results are robust: compared with long short-term memory (LSTM), gated recurrent unit, generative adversarial network (GAN) and the traditional gravity model, the proposed solution achieves lower mean absolute error, mean squared error, root mean square error and mean absolute percentage error, and an R2 value reaching 0.9855.
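
For reference, the five evaluation metrics named in these findings have standard definitions, sketched below in plain Python. The function name and the toy migration-flow numbers are hypothetical and are not taken from the paper.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, MAPE (in per cent) and R^2 for paired values.

    MAPE is undefined when any true value is zero.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    r2 = 1 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "mape": mape, "r2": r2}

# Hypothetical observed vs. forecast flows.
metrics = regression_metrics([100, 200, 300], [110, 190, 300])
print(metrics["mape"], metrics["r2"])
```

Lower values are better for the four error metrics, while R2 is better the closer it is to 1, which is why a single "lower is better" reading does not apply to all five.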

Originality/value

This study is significant because it demonstrates a highly effective technique for predicting international migration using conditional GANs. By incorporating migration knowledge into our models, we can achieve accurate predictions and gain valuable insights into the differences between various model characteristics. We used SHapley Additive exPlanations to enhance our understanding of these differences and to provide clear and concise explanations for our model predictions. The results demonstrated the theoretical significance and practical value of the MK-CGAN model in predicting international migration.

Citations: 0
Exploring cross-cultural disparities in tourists' perceived images: a text mining and sentiment analysis study using LDA and BERT-BILSTM models
IF 1.6 Zone 4 Computer Science Q1 Social Sciences Pub Date: 2024-03-20 DOI: 10.1108/dta-10-2023-0645
Qiuying Chen, Ronghui Liu, Qingquan Jiang, Shangyue Xu

Purpose

Tourists with different cultural backgrounds think and behave differently. Accurately capturing and correctly understanding cultural differences will help tourist destinations in product/service planning, marketing communication and attracting and retaining tourists. This research employs Hofstede's cultural dimensions theory to analyse the variations in destination image perceptions of Chinese-speaking and English-speaking tourists to Xiamen, a prominent tourist attraction in China.

Design/methodology/approach

The evaluation utilizes a two-stage approach, incorporating LDA and BERT-BILSTM models. By leveraging text mining, sentiment analysis and t-tests, this research investigates the variations in tourists' perceptions of Xiamen across different cultures.
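
The abstract does not say which t-test variant was used or what was compared. As one possible illustration, the sketch below computes Welch's two-sample t statistic and its degrees of freedom for per-review sentiment scores from two language groups; the function name and the scores are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    ASSUMPTION: unequal-variance two-sample test; the paper may use another variant.
    """
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean, group A
    vb = variance(sample_b) / nb  # squared standard error of mean, group B
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical sentiment scores for Chinese-language and English-language reviews.
chinese_reviews = [1, 2, 3, 4, 5]
english_reviews = [2, 3, 4, 5, 6]
print(welch_t(chinese_reviews, english_reviews))
```

The statistic and degrees of freedom would then be compared against a t distribution to obtain a p-value, for example with a statistics library.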

Findings

The results reveal that cultural disparities significantly impact tourists' perceived image of Xiamen, particularly regarding their preferences for renowned tourist destinations and the factors influencing their travel experience.

Originality/value

This research pioneers the application of natural language processing methods and machine learning techniques to affirm the substantial differences in the perceptions of tourist destinations among Chinese-speaking and English-speaking tourists based on Hofstede's cultural theory. The findings furnish theoretical insights for destination marketing organizations to target culturally diverse tourists through precise marketing strategies and illuminate the practical application of Hofstede's cultural theory in tourism and hospitality.

Citations: 0
Light field image coding using a residual channel attention network–based view synthesis
IF 1.6 Zone 4 Computer Science Q1 Social Sciences Pub Date: 2024-02-21 DOI: 10.1108/dta-03-2023-0071
Faguo Liu, Qian Zhang, Tao Yan, Bin Wang, Ying Gao, Jiaqi Hou, Feiniu Yuan

Purpose

Light field images (LFIs) have gained popularity as a technology to increase the field of view (FoV) of plenoptic cameras, since they can capture information about light rays with a large FoV. A wide FoV causes light field (LF) data to grow rapidly, which restricts the use of LF imaging in image processing, visual analysis and user interfaces. Effective LFI coding methods are therefore of paramount importance. This paper aims to eliminate more redundancy by exploring sparsity and correlation in the angular domain of LFIs, and to mitigate the loss of perceptual quality caused by encoding.

Design/methodology/approach

This work proposes a new, efficient LF coding framework. On the encoding side, a new sampling scheme and a hierarchical prediction structure are used to eliminate redundancy in the LFI's angular and spatial domains. On the decoding side, a high-quality dense LF is reconstructed using a view synthesis method based on the residual channel attention network (RCAN).
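The abstract does not describe the synthesis network in detail, so the following is only a minimal sketch of the channel attention gating that RCAN-style networks are generally built around: global average pooling per channel, a small bottleneck with ReLU, and a sigmoid gate that rescales each channel. The function name `channel_attention`, the weight matrices `w_down` and `w_up`, and the omission of bias terms are assumptions made for illustration; this is not the authors' implementation.

```python
import math

def channel_attention(features, w_down, w_up):
    """Rescale feature channels by learned per-channel gates.

    features: list of C channels, each a 2-D list (H x W) of floats.
    w_down:   (C/r) x C weight matrix of the bottleneck (hypothetical).
    w_up:     C x (C/r) weight matrix restoring C gates (hypothetical).
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in features]
    # Excitation: channel-reducing layer followed by ReLU.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled)))
              for row in w_down]
    # Channel-restoring layer followed by a sigmoid gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_up]
    # Scale: multiply every value in a channel by that channel's gate.
    return [[[g * v for v in row] for row in ch]
            for g, ch in zip(gates, features)]
```

In a full residual channel attention block, the gated features would typically be added back to the block input through a skip connection; that part is left out here to keep the sketch short.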

Findings

On three different LF datasets, the proposed coding framework not only reduces the transmitted bit rate but also maintains higher view quality than current advanced methods.

Originality/value

(1) A new sampling scheme is designed to synthesize high-quality LFIs while better ensuring sparsity in the LF angular domain. (2) To further eliminate redundancy in the spatial domain, new ranking schemes and hierarchical prediction structures are designed. (3) A synthesis network based on RCAN and a novel loss function is designed to mitigate the perceptual quality loss due to the coding process.

Citations: 0
False alarm detection in intensive care unit for monitoring arrhythmia condition using bio-signals
IF 1.6 Zone 4 Computer Science Q1 Social Sciences Pub Date: 2024-02-13 DOI: 10.1108/dta-08-2023-0437
Aleena Swetapadma, Tishya Manna, Maryam Samami

Purpose

A novel method is proposed to reduce the rate of false alarms for life-threatening conditions in arrhythmia patients in the intensive care unit. For this purpose, the arterial blood pressure, photoplethysmogram (PLETH), electrocardiogram (ECG) and respiratory (RESP) signals are considered as input signals.

Design/methodology/approach

Three machine learning approaches, a feed-forward artificial neural network (ANN), an ensemble learning method and a k-nearest neighbors search method, are used to detect false alarms. The proposed method has been implemented using Arduino and MATLAB/SIMULINK for real-time monitoring data from ICU arrhythmia patients.
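Of the three classifiers named above, k-nearest neighbors is the simplest to illustrate. The sketch below assumes each alarm has already been reduced to a fixed-length numeric feature vector and uses Euclidean distance with a majority vote; the function name `knn_predict`, the label strings and the default `k=3` are illustrative assumptions and are not taken from the paper, which was implemented in MATLAB/SIMULINK.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    train_x: list of feature vectors (hypothetical signal-derived features).
    train_y: list of labels, e.g. "true_alarm" or "false_alarm".
    """
    # Euclidean distance from the query to every training sample.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    # Majority vote over the labels of the k closest samples.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

An odd `k` avoids tied votes in the two-class case; in practice the value would be chosen by validation on held-out alarms.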

Findings

The proposed method detects false alarms with an accuracy of 99.4 per cent during asystole, 100 per cent during ventricular flutter, 98.5 per cent during ventricular tachycardia, 99.6 per cent during bradycardia and 100 per cent during tachycardia. The proposed framework is adaptable to many scenarios, easy to implement, computationally friendly, highly accurate and robust to overfitting.

Originality/value

As ECG signals consist of PQRST waves, any deviation from the normal pattern may signify an alarming condition. These deviations can be used as input to classifiers for the detection of false alarms; hence, no other feature extraction techniques are needed. A feed-forward ANN trained with the Levenberg–Marquardt algorithm has shown a higher rate of convergence than other neural network algorithms, which helps provide better accuracy with no overfitting.

Citations: 0
Community relations discovery methods for users in Fancircle based on sentiment analysis in China
IF 1.6 Zone 4 Computer Science Q1 Social Sciences Pub Date: 2024-01-29 DOI: 10.1108/dta-09-2023-0570
Kai Wang

Purpose

Identifying network user relationships in Fancircle helps quantify the violence index of user text and mine the internal correlation of network behaviors among users, which provides necessary data support for the construction of a knowledge graph.

Design/methodology/approach

A correlation identification method based on sentiment analysis (CRDM-SA) is put forward, which extracts user semantic information and introduces violent sentiment membership. Specifically, by extracting user text information, the topics for topology mapping in the community are obtained on the basis of a self-built domain-specific violent sentiment dictionary (VSD). The violence index of the user text is then calculated to quantify the fuzzy sentiment representation between the user and the topic. Finally, multi-granularity mining of violence association rules in user text is realized by constructing a violence fuzzy concept lattice.
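The abstract does not give the formula for the violence index, so the following is only a hypothetical illustration of how a dictionary-based score could be turned into a fuzzy membership value in [0, 1]. The function name `violence_index`, the per-term weights and the mean-weight normalization are all assumptions; the paper's actual definition may differ.

```python
def violence_index(tokens, vsd):
    """Return a membership value in [0, 1] for a tokenized user text.

    tokens: list of words from one user text.
    vsd:    dict mapping dictionary terms to weights in [0, 1]
            (a stand-in for the violent sentiment dictionary).
    """
    if not tokens:
        return 0.0
    # Sum the weights of tokens found in the dictionary; unknown tokens count as 0.
    total = sum(vsd.get(tok, 0.0) for tok in tokens)
    # Normalize by text length (assumed) and clip so the result stays a valid membership.
    return min(1.0, total / len(tokens))
```

Values of this kind, computed per user and per topic, are the sort of input a fuzzy formal context would need before a concept lattice can be built over it.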

Findings

The method helps reveal the internal relationships of online violence in a complex network environment, so that the sentiment dependence of users can be characterized from a granular perspective.

Originality/value

The membership degree of violent sentiment is introduced into user relationship recognition in the Fancircle community, and a text sentiment association recognition method based on the VSD is proposed. By calculating the value of violent sentiment in the user text, violent sentiment is annotated in the topic dimension of the text, and the partial order relation between fuzzy concepts of violence under the effective confidence threshold is used to obtain the association relation.

Citations: 0
A Bayesian Inference-based approach for extracting driving data with implicit intention
IF 1.6 Zone 4 Computer Science Q1 Social Sciences Pub Date: 2024-01-19 DOI: 10.1108/dta-03-2023-0074
Ping Huang, Haitao Ding, Hong Chen, Jianwei Zhang, Zhenjia Sun

Purpose

The growing availability of naturalistic driving datasets (NDDs) presents a valuable opportunity to develop various models for autonomous driving. However, while current NDDs include data on vehicles with and without intended driving behavior changes, they do not explicitly identify data on vehicles that intend to change their driving behavior but do not execute the change because of safety, efficiency or other factors. These missing data are essential for autonomous driving decisions. This study aims to extract driving data with implicit intentions to support the development of decision-making models.

Design/methodology/approach

According to Bayesian inference, drivers who have the same intended changes likely share similar influencing factors and states. Building on this principle, this study proposes an approach to extract data on vehicles that intended to execute specific behaviors but failed to do so. This is achieved by computing driving similarities between candidate vehicles and benchmark vehicles using standard similarity metrics, which take into account the location topology of surrounding vehicles and the motion states of individual vehicles. In this way, the method enables a more comprehensive analysis of driving behavior and intention.
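The abstract refers only to "standard similarity metrics" without naming one. As a minimal sketch, the comparison between candidate and benchmark vehicles could use cosine similarity over feature vectors that combine surrounding-vehicle positions and motion states. The feature layout, the function names `cosine_similarity` and `select_similar`, and the threshold value are assumptions made for illustration, not the authors' method.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat an all-zero vector as dissimilar to everything.
    return dot / (norm_a * norm_b)

def select_similar(candidates, benchmark, threshold=0.95):
    """Return ids of candidate vehicles whose features resemble the benchmark.

    candidates: dict mapping a vehicle id to its feature vector, e.g. gaps to
                surrounding vehicles and own speed (hypothetical layout).
    benchmark:  feature vector of a vehicle that did execute the behavior.
    """
    return [vid for vid, feats in candidates.items()
            if cosine_similarity(feats, benchmark) >= threshold]
```

Candidates returned by `select_similar` that never executed the maneuver would be the ones flagged as having an unexecuted intention under this simplified view.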

Findings

The proposed method is verified on the Next Generation SIMulation dataset (NGSim), which confirms its ability to reveal similarities between vehicles executing similar behaviors during the natural decision-making process. The approach is also validated using simulated data, achieving an accuracy of 96.3 per cent in recognizing vehicles with specific driving behavior intentions that are not executed.

Originality/value

This study provides an innovative approach to extracting driving data with implicit intentions and offers strong support for developing data-driven decision-making models for autonomous driving. With the support of this approach, the development of autonomous vehicles can capture more real driving experience from human drivers, moving towards a safer and more efficient future.

Citations: 0