
Knowledge Engineering and Data Science: Latest Articles

Optimizing Random Forest Algorithm to Classify Player's Memorisation via In-game Data
Pub Date : 2023-10-02 DOI: 10.17977/um018v6i12023p103-113
Akmal Vrisna Alzuhdi, Harits Ar Rosyid, M. Chuttur, Shah Nazir
Assessing a player's knowledge in educational games is a long-standing practice. However, traditional evaluation in and around a gaming session may disrupt the player's immersion. This research uses an optimized Random Forest (RF) to build a non-invasive predictor of an educational-game player's Memorization from in-game data. First, we obtained the dataset from a 3-month survey that recorded in-game data of 50 players who played 4-15 stages of Chem Fight (a test-case game). Next, we generated three dataset variants through preprocessing: resampling (SMOTE), normalization (min-max), and a combination of resampling and normalization. Then, we trained and optimized three Random Forest (RF) classifiers to predict the player's Memorization. We chose RF because it generalizes well on high-dimensional datasets. The classifier was optimized through a single hyperparameter, n_estimators. We used Grid Search Cross Validation (GSCV) to identify the best value of n_estimators, and then used the statistics of the GSCV results to reduce n_estimators by observing the region of interest in the classifiers' performance graphs. Overall, the classifiers fitted with the BEST n_estimators from GSCV (i.e., 89, 31, 89, and 196 trees) performed well, with around 80% accuracy. Moreover, we identified a smaller number of trees (OPTIMAL), at most half of the BEST n_estimators. All classifiers were retrained using the OPTIMAL n_estimators (37, 12, 37, and 41 trees), and their performance remained relatively steady at ~80%. This means that we successfully optimized the Random Forest for predicting a player's Memorization when playing Chem Fight. The automated technique presented in this paper can monitor student interactions and evaluate their abilities based on in-game data. As such, it can offer objective data about the skills used.
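The move from the BEST to the OPTIMAL n_estimators can be sketched as a simple selection rule over grid-search results. The abstract describes reading a region of interest off performance graphs; the tolerance rule below is only an assumed stand-in for that visual step, and the function name, the `tolerance` parameter, and the scores in the usage note are hypothetical rather than taken from the paper.

```python
def pick_tree_counts(mean_scores, tolerance=0.01):
    """Return (best, optimal) tree counts from grid-search results.

    mean_scores maps each candidate n_estimators to its mean
    cross-validated accuracy. `best` has the highest score (ties go to
    the smaller forest). `optimal` is the smallest candidate whose score
    is within `tolerance` of the best score; this rule is an assumption
    made for the sketch, not the paper's procedure.
    """
    if not mean_scores:
        raise ValueError("mean_scores must not be empty")
    best_score = max(mean_scores.values())
    best = min(n for n, s in mean_scores.items() if s == best_score)
    optimal = min(
        n for n, s in mean_scores.items() if s >= best_score - tolerance
    )
    return best, optimal
```

For example, with made-up scores `{10: 0.74, 20: 0.795, 40: 0.80, 80: 0.805, 160: 0.80}` and `tolerance=0.02`, the rule keeps 80 trees as BEST and proposes 20 trees as OPTIMAL, since 20 is the smallest forest within two points of the top score.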
Citations: 0
Long-Term Traffic Prediction Based on Stacked GCN Model
Pub Date : 2023-09-24 DOI: 10.17977/um018v6i12023p92-102
Atkia Akila Karim, Naushin Nower
With the recent surge in road traffic in major cities, both short- and long-term traffic flow forecasting have become paramount for city authorities. Previous research has predominantly focused on short-term traffic flow estimation for specific road segments and paths. However, important applications such as traffic management and schedule routing planning demand a deep understanding of long-term traffic flow. Because of the intricate interplay of underlying factors, few studies are dedicated to long-term traffic prediction. Previous research has also highlighted that long-term predictions suffer from lower accuracy owing to error propagation within the model. The proposed model combines the capacity of a Graph Convolutional Network (GCN) to extract spatial characteristics from the road network with the aptitude of stacked GCNs for capturing temporal context. The model is then applied to traffic flow forecasting in urban road networks. We compare our method against baseline techniques using two real-world datasets. Our approach reduces prediction errors by 40% to 60% compared to other methods. The experimental results underscore the model's ability to uncover spatiotemporal dependencies in traffic data and its superior predictive performance over the baseline models.
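A single graph-convolution step of the kind this abstract builds on can be sketched as follows. The abstract does not give the layer equation, so the sketch assumes the widely used symmetric-normalized form, ReLU(D^{-1/2} (A + I) D^{-1/2} H W); the function names and the tiny two-node road graph are illustrative only.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    deg = [sum(row) for row in a_hat]
    return [
        [a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weights):
    """One propagation step: aggregate neighbours, project, apply ReLU."""
    aggregated = matmul(normalized_adjacency(adj), features)
    projected = matmul(aggregated, weights)
    return [[max(0.0, value) for value in row] for row in projected]
```

Stacking, in this reading, means feeding the output of one `gcn_layer` call into the next with a new weight matrix; how the paper arranges its stack to capture temporal context is not stated in the abstract.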
Citations: 0
Round-Robin Algorithm in Load Balancing for National Data Centers
Pub Date : 2023-09-22 DOI: 10.17977/um018v6i12023p79-91
I. K. W. Sudiatmika, G. Indrawan, S. Sariyasa
The Provincial Government of Bali plays a crucial role in administering public service applications that meet the needs of its community, traditional villages, and regional apparatus. However, growing traffic and an uneven distribution of requests have placed a substantial load on its servers, which may jeopardize application operation and increase the likelihood of downtime. Efficient load distribution is essential to tackling these difficulties, and the Round Robin algorithm is often used for this purpose. The existing literature, however, has not extensively examined the particular circumstances of on-premise servers in the Bali Provincial Government. This study addresses that gap through a comprehensive evaluation of the Round Robin algorithm's effectiveness in load-balancing on-premise servers within the Bali Provincial Government. It assesses the suitability of the algorithm in this context, with the ultimate goal of providing practical, implementable recommendations. The findings can help optimize system efficiency and minimize downtime, thereby improving the provision of vital public services across Bali. Through empirical evaluation and comprehensive analysis, the study provides insights for improving server infrastructure and load-balancing strategies. Its findings are valuable for the Bali Provincial Government and serve as a reference for other organizations facing challenges in managing server loads. The study marks a notable step toward reliable and practical public service applications in Bali.
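Round Robin itself is simple enough to show in a few lines: each incoming request goes to the next server in a fixed rotation, regardless of current load. The sketch below is a minimal illustration with hypothetical server names; it is not the configuration evaluated in the study.

```python
from collections import Counter
from itertools import cycle


def round_robin_assign(requests, servers):
    """Pair each request with the next server in a fixed rotation."""
    if not servers:
        raise ValueError("at least one server is required")
    rotation = cycle(servers)
    return [(request, next(rotation)) for request in requests]


def load_per_server(assignments):
    """Count how many requests each server received."""
    return Counter(server for _, server in assignments)
```

With seven requests and three servers, the counts differ by at most one (3, 2, 2). That even split by request count is the algorithm's main appeal; it does not account for requests of unequal cost.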
Citations: 0
K-Means Clustering and Multilayer Perceptron for Categorizing Student Business Groups
Pub Date : 2023-09-18 DOI: 10.17977/um018v6i12023p69-78
Miftahul Walid, Norfiah Lailatin Nispi Sahbaniya, Hozairi Hozairi, Fajar Baskoro, Arya Yudhi Wijaya
This study was driven by the East Java provincial government's requirement to assess the transaction levels of the Student Business Group (KUS) in the SMA Double Track program. These transaction levels are the basis for allocating supplementary financial aid to each business group. The system's primary objective is to help the provincial government of East Java make well-informed decisions about distributing supplementary capital to the KUS. The classification technique employed is the Multilayer Perceptron. However, because target data were of limited availability for classification, the K-Means Clustering method is used to generate them by dividing the transaction levels into three distinct groups: (0) low transactions, (1) medium transactions, and (2) high transactions. The clustering process uses three features: (1) income, (2) spending, and (3) profit. These three features are also used as input data for the classification procedure. The Multilayer Perceptron classifier was applied to a dataset of 1383 data points, of which 80% were used for training and the remaining 20% for testing. To evaluate the constructed model, the training error was assessed using K-Fold cross-validation, yielding an average accuracy score of 0.92. The classification itself achieved an accuracy of 0.96. The model is intended for classification scenarios in which the dataset lacks prior target data.
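The label-generation step described here (cluster first, then treat cluster membership as the classification target) can be sketched in plain Python. Everything specific below is an assumption for illustration: the initial centroids are passed in explicitly, clusters are ranked low/medium/high by their mean income, and the six sample rows are invented. The Multilayer Perceptron that would then be trained on these labels is omitted.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))


def kmeans(points, initial_centroids, iterations=50):
    """Plain k-means; returns the final centroids as tuples."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        updated = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else old  # keep an empty cluster's centroid as is
            for old, members in zip(centroids, clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def pseudo_labels(points, centroids):
    """Label each point 0 (low), 1 (medium) or 2 (high).

    Clusters are ranked by the first feature of their centroid
    (income in this sketch); the ranking rule is an assumption.
    """
    order = sorted(range(len(centroids)), key=lambda c: centroids[c][0])
    rank = {cluster: level for level, cluster in enumerate(order)}
    return [rank[nearest(p, centroids)] for p in points]
```

Each row is an (income, spending, profit) tuple. Ranking the clusters after fitting matters because k-means assigns arbitrary cluster indices, whereas the target classes in the abstract are ordered.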
Citations: 0
Maximum Marginal Relevance and Vector Space Model for Summarizing Students' Final Project Abstracts
Pub Date : 2023-08-01 DOI: 10.17977/um018v6i12023p57-68
Gunawan Gunawan, Fitria Fitria, Esther Irawati Setiawan, Kimiya Fujisawa
Automatic summarization uses a computer program to reduce a text document to a summary that retains the essential parts of the original. It is needed to cope with information overload as the amount of data keeps increasing. A summary presents the main contents of an article in concise form and tells the reader the essence of its central idea. The basic idea is to take the essential parts of the full article and present them again in summary form. In this research, the user first selects or searches for the text documents to be summarized, with the keywords in the abstract serving as the query. The proposed approach then preprocesses the documents: sentence breaking, case folding, word tokenizing, filtering, and stemming. The preprocessed text is weighted by term frequency-inverse document frequency (tf-idf), then scored for query relevance using the vector space model and for sentence similarity using cosine similarity. The next stage applies maximum marginal relevance for sentence extraction. The proposed approach provides comprehensive summarization compared with another approach. Compared with manual summaries, the test results yield an average precision of 88%, recall of 61%, and f-measure of 70%.
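The pipeline in this abstract (tf-idf weighting, cosine similarity, then maximum marginal relevance) can be sketched end to end with the standard library. Several details are assumptions because the abstract does not specify them: the smoothed idf formula, the trade-off value `lam`, and the tokenizer, which here only lowercases and splits on whitespace (no filtering or stemming).

```python
import math
from collections import Counter


def tfidf_vectors(token_lists):
    """Return sparse tf-idf vectors (dicts) and the idf table."""
    n_docs = len(token_lists)
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    # Smoothed idf keeps every weight positive (an assumed variant).
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0
           for t, df in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(tokens).items()}
               for tokens in token_lists]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(sentences, query, k=2, lam=0.5):
    """Return the indices of k sentences chosen by MMR.

    Each step picks the sentence maximizing
    lam * sim(sentence, query) - (1 - lam) * max sim(sentence, chosen).
    """
    vectors, idf = tfidf_vectors([s.lower().split() for s in sentences])
    query_terms = Counter(query.lower().split())
    query_vec = {t: tf * idf[t] for t, tf in query_terms.items() if t in idf}
    relevance = [cosine(v, query_vec) for v in vectors]

    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance[i] - (1 - lam) * redundancy

        pick = max(candidates, key=score)
        selected.append(pick)
        candidates.remove(pick)
    return selected
```

The redundancy term is what separates MMR from plain relevance ranking: a sentence that repeats one already chosen is penalized even if it matches the query well, so with `lam=0.5` an exact duplicate loses to a less relevant but novel sentence.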
Citations: 0
Deep Learning for Multi-Structured Javanese Gamelan Note Generator
Pub Date : 2023-07-18 DOI: 10.17977/um018v6i12023p41-56
Arik Kurniawati, E. M. Yuniarno, Y. Suprapto
Javanese gamelan, a traditional Indonesian musical style, has several song structures called gendhing. Gendhing (songs) are written in conventional notation and require gamelan musicians to recognize patterns in the structure of each song. Previous research on gendhing has usually taken artistic and ethnomusicological perspectives, whereas this study explores the connection between gendhing as traditional Indonesian music and deep learning technology that takes over the task of gamelan composers. This research proposes a CNN-LSTM to generate notation for ricikan struktural instruments as an accompaniment to Javanese gamelan compositions, based on balungan notation, rhythm, song structure, and gatra information. The proposed method (CNN-LSTM) is compared with LSTM and CNN. The musical data in this study are represented using numerical notation for the main melody in balungan notation. The experimental results show that the CNN-LSTM model performs better than the LSTM and CNN models, with accuracy values of 91.9%, 91.5%, and 91.2% for CNN-LSTM, LSTM, and CNN, respectively. The note distance for the Sampak song structure is 4 for the CNN-LSTM model, 8 for the LSTM model, and 12 for the CNN model. The smaller the note distance, the closer the output is to the original notation provided by the gamelan composer. This study is relevant to novice gamelan musicians who are interested in learning karawitan, especially in understanding ricikan struktural music notation and the gamelan art of composing a song.
Citations: 0
Exploring the Impact of Students Demographic Attributes on Performance Prediction through Binary Classification in the KDP Model
Pub Date : 2023-06-25 DOI: 10.17977/um018v6i12023p24-40
Issah Iddrisu, Peter Appiahene, Obed Appiah, Inusah Fuseini
This research used binary classification and the Knowledge Discovery Process (KDP). The experiments and analysis were run in the RapidMiner 9.10.010 educational environment with five different classifiers. The analysis covered 2334 records with 17 attributes and one class variable containing the students' average score for the semester. Twenty experiments were carried out, using 10-fold cross-validation and ratio split validation together with bootstrap sampling. Five methods were evaluated as candidates: Random Forest (RF), Rule Induction (RI), Naive Bayes (NB), Logistic Regression (LR), and Deep Learning (DL). RF outperformed the other four methods in all six selection measures, with an accuracy of 93.96%. According to the RF classifier model, the parents' level of education is a major factor in a child's academic performance before entering higher education.
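The 10-fold cross-validation mentioned here partitions the records so that each one is used for testing exactly once. The sketch below generates such index splits in plain Python; it is a generic illustration rather than RapidMiner's implementation, and it uses contiguous folds without shuffling (in practice the records would usually be shuffled or stratified first).

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one: the first n_samples % k folds
    receive one extra sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

For the 2334 records in this study, ten folds work out to four test folds of 234 records and six of 233, with the remaining records in each round used for training.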
Citations: 0
Ant Colony Optimization for Resistor Color Code Detection
Pub Date : 2023-06-06 DOI: 10.17977/um018v6i12023p15-23
S. Wibawanto, K. Kirana, Hani Ramadhan
When first learning about resistors, students need to be introduced to color-coded values. Moreover, some combinations require a resistor trip analysis to identify. Unfortunately, the resistor body color is treated as a local solution, which often confuses identification of the resistor colors. Ant Colony Optimization (ACO) is a heuristic algorithm that solves problems by simulating a group of traveling ants. ACO is proposed to select commercial matrix values to be computed without preventing local solutions. In this study, each ant explores the matrix based on pheromones and heuristic information to generate a local solution. Global solutions are selected based on their high degree of similarity with other local solutions. The first stage of testing explores variations of the parameter values. Applying the best parameters yielded 85% accuracy with a run time of 43 seconds for 20 resistor images. This method is expected to prevent local solutions without wasteful computation of the matrix.
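The exploration step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the heuristic matrix `eta`, the parameter names (`alpha`, `beta`, `rho`), and the agreement-count similarity measure are all assumptions made for illustration. Each ant picks one candidate per position with probability proportional to pheromone times heuristic score, and the global solution is the local solution that agrees most with the others.

```python
import random


def run_aco(eta, n_ants=20, n_iters=30, alpha=1.0, beta=2.0, rho=0.1, seed=0):
    """Toy ACO over a matrix of heuristic scores.

    eta[i][j] is a hypothetical, strictly positive desirability of
    candidate j (e.g. a color) at position i (e.g. a resistor band).
    """
    rng = random.Random(seed)
    n_pos, n_cand = len(eta), len(eta[0])
    # Pheromone matrix, initialised uniformly.
    tau = [[1.0] * n_cand for _ in range(n_pos)]
    best = None
    for _ in range(n_iters):
        # Each ant builds one local solution, position by position.
        solutions = []
        for _ in range(n_ants):
            sol = []
            for i in range(n_pos):
                weights = [(tau[i][j] ** alpha) * (eta[i][j] ** beta)
                           for j in range(n_cand)]
                sol.append(rng.choices(range(n_cand), weights=weights)[0])
            solutions.append(sol)

        # Global solution: the local solution most similar to all others,
        # measured here (an assumption) as the number of matching positions.
        def agreement(s):
            return sum(sum(a == b for a, b in zip(s, t)) for t in solutions)

        best = max(solutions, key=agreement)

        # Evaporate pheromone everywhere, then reinforce the chosen solution.
        for i in range(n_pos):
            for j in range(n_cand):
                tau[i][j] *= (1.0 - rho)
            tau[i][best[i]] += 1.0
    return best


# Hypothetical scores: three positions, three candidates each.
example_eta = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]
result = run_aco(example_eta)
```

In this toy setting the scores strongly favour one candidate per position, so the ants converge quickly; real image data would need a heuristic derived from pixel colors, which the abstract does not specify.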
Citations: 0
Human Facial Expressions Identification using Convolutional Neural Network with VGG16 Architecture
Pub Date : 2022-06-07 DOI: 10.17977/um018v5i12022p78-86
L. Latumakulita, Sandy Laurentius Lumintang, Deiby Tineke Salakia, S. R. Sentinuwo, A. Sambul, N. Islam
A human facial expression identification system is essential to the development of human interaction and technology. Artificial Intelligence for monitoring human emotions can be helpful in the workplace. Commonly, there are six basic human expressions that such a system can identify: anger, disgust, fear, happiness, sadness, and surprise. This study aims to create a facial expression identification system based on these basic expressions using a Convolutional Neural Network (CNN) with the 16-layer VGG architecture. A total of 2,137 facial expression images were selected from the FER2013, JAFFE, and MUG datasets. With image augmentation, network parameters set to 100 epochs and a learning rate of 0.0001, and 5-fold cross-validation, the system achieves an average accuracy of 84%. The results show that the model is suitable for identifying basic human facial expressions.
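The 5-fold cross-validation protocol mentioned in this abstract can be sketched as follows. This is only an outline under stated assumptions: the helper names `k_fold_indices` and `cross_validate` are hypothetical, and `train_and_score` stands in for whatever routine trains the CNN on the training indices and returns accuracy on the held-out fold.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(n_samples, train_and_score, k=5, seed=0):
    """Average the score returned by train_and_score over k folds.

    train_and_score(train_idx, val_idx) is a hypothetical callback that
    trains a fresh model on train_idx and returns accuracy on val_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, val_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i
                     for j in fold]
        scores.append(train_and_score(train_idx, val_idx))
    return sum(scores) / k


# Fold sizes for a dataset of 2,137 images, the size given in the abstract.
fold_sizes = [len(f) for f in k_fold_indices(2137, k=5)]
```

Each image appears in exactly one validation fold, so the reported average reflects performance on every sample once.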
Citations: 0
Fish Image Classification Using Adaptive Learning Rate In Transfer Learning Method
Pub Date : 2022-06-07 DOI: 10.17977/um018v5i12022p67-77
R. Suhana, W. Mahmudy, Agung Setia Budi
The diversity of fish species in coastal ecosystems, which include mangrove forests, seagrass beds, and coral reefs, is one of the benchmarks for assessing coastal ecosystem health. These ecosystems must be maintained and preserved, so conservation efforts need to be carried out in water areas. Many experts at the Indonesian Fisheries and Marine Research and Development Agency classify fish images manually, which takes a long time; current technology can speed up this work. One reliable technique for image classification is the Convolutional Neural Network (CNN). The demand for learning and solving new problems faster and better led to transfer learning, which adopts part of a CNN, here called the modified convolution layer. Observing the needs of experts in marine conservation, the researchers decided to address this problem with a modified transfer learning approach. The transfer learning model is based on the pre-trained MobileNetV2 architecture, which is known for its lightweight computation and can be deployed on mobile gadgets and other embedded devices. The research data consist of 49,281 images of various sizes covering 18 types of fish; during pre-processing, each image is resized to 224x224 pixels. Testing with the modified transfer learning architecture obtained an accuracy of 99.54%, which indicates that the model is quite reliable in classifying fish images.
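The abstract does not say which adaptive learning rate rule was used. As an illustration only, the sketch below shows one common scheme, reducing the rate when validation loss stops improving; the function name `reduce_on_plateau` and all parameter values are assumptions, not details from the paper.

```python
def reduce_on_plateau(val_losses, initial_lr=1e-3, factor=0.5,
                      patience=2, min_lr=1e-6):
    """Return the learning rate used at each epoch.

    The rate is multiplied by `factor` whenever the validation loss has
    not improved for `patience` consecutive epochs (an assumed rule).
    """
    lr = initial_lr
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        schedule.append(lr)  # rate in effect during this epoch
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
    return schedule


# Hypothetical validation losses over seven epochs.
schedule = reduce_on_plateau([1.0, 0.8, 0.9, 0.85, 0.7, 0.7, 0.7],
                             initial_lr=0.1)
```

Here the rate stays at 0.1 until two epochs pass without improvement, then halves; a fine-tuning loop would feed the returned rate to its optimizer at the start of each epoch.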
Citations: 0