
Journal of Applied Data Sciences: Latest Articles

Analysis of Real Time Twitter Sentiments using Deep Learning Models
Pub Date : 2023-12-01 DOI: 10.47738/jads.v4i4.146
Raed Alsini
Citations: 0
Gold Prices Time-Series Forecasting: Comparison of Statistical Techniques
Pub Date : 2023-12-01 DOI: 10.47738/jads.v4i4.135
Indra Maryati
Citations: 0
Multiple Choice Question Difficulty Level Classification with Multi Class Confusion Matrix in the Online Question Bank of Education Gallery
Pub Date : 2023-12-01 DOI: 10.47738/jads.v4i4.132
Pariang Sonang Siregar
Citations: 0
Image Classifier based on Histogram Matching and Outlier Detection using Hellinger distance
Pub Date : 2023-12-01 DOI: 10.47738/jads.v4i4.114
Anamika Gupta
Citations: 0
Predictive and Analytics using Data Mining and Machine Learning for Customer Churn Prediction
Pub Date : 2023-12-01 DOI: 10.47738/jads.v4i4.131
Chandra Lukita
Citations: 0
Unsupervised Learning Methods for Topic Extraction and Modeling in Large-scale Text Corpora using LSA and LDA
Pub Date : 2023-09-15 DOI: 10.47738/jads.v4i3.102
Henderi Henderi
This research compares unsupervised learning methods for topic extraction and modeling in large-scale text corpora. The methods used are Singular Value Decomposition (SVD) and Latent Dirichlet Allocation (LDA). SVD extracts important features by decomposing the term-document matrix, while LDA identifies hidden topics based on the probability distribution of words. The research involves data collection, exploratory data analysis (EDA), topic extraction using SVD, data preprocessing, and topic extraction using LDA. The data used were large-scale text corpora, and EDA was conducted to understand their characteristics and structure before topic extraction was performed. The results showed that both SVD and LDA were successful in extracting and modeling topics in large-scale text corpora: SVD revealed cohesive patterns and thematically related topics, while LDA identified hidden topics from word probability distributions. These findings have important implications for text processing and analysis. The resulting topic representations can be used for information mining, document categorization, and more in-depth text analysis. However, this research has limitations. The success of the methods depends on the quality and representativeness of the text corpora, and topic interpretation still requires further analysis. Future research can develop methods and techniques to improve the accuracy and efficiency of topic extraction and text corpus modeling.
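To make the SVD step concrete, the following is a minimal sketch, not the authors' implementation. It approximates the leading singular triplet of a tiny term-document matrix by power iteration; the left singular vector then weights the terms of the dominant "topic". The vocabulary, the counts in `X`, and the function name `top_singular_triplet` are illustrative assumptions.

```python
import math

def top_singular_triplet(matrix, iterations=200):
    """Approximate the leading singular triplet (u, sigma, v) of `matrix`
    by power iteration on A^T A. Hypothetical helper for illustration."""
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols  # start from a uniform unit vector
    for _ in range(iterations):
        # u = A v, then w = A^T u, i.e. one multiplication by A^T A
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    return [x / sigma for x in u], sigma, v

# Toy term-document counts (rows = terms, columns = documents); made-up data.
terms = ["gold", "price", "tweet", "sentiment"]
X = [
    [2, 3, 0],
    [1, 2, 0],
    [0, 0, 1],
    [0, 0, 1],
]

u, sigma, v = top_singular_triplet(X)
# Rank terms by their weight in the dominant latent dimension.
ranked = sorted(zip(terms, u), key=lambda pair: abs(pair[1]), reverse=True)
print(ranked[:2], round(sigma, 4))
```

A full LSA pipeline would keep several singular vectors and usually apply TF-IDF weighting first; this sketch keeps only the first dimension to stay short.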
Citations: 0
Mean-Median Smoothing Backpropagation Neural Network to Forecast Unique Visitors Time Series of Electronic Journal
Pub Date : 2023-09-15 DOI: 10.47738/jads.v4i3.97
Aji Prasetya Wibawa
Sessions, or unique visitors, is the number of visitors from one IP address who access a journal portal for the first time in a given period. A high daily average of unique visits to an electronic journal's pages indicates that the periodical is in high demand. The number of unique visitors is therefore an important indicator of an electronic journal's performance and a measure of dissemination that can accelerate journal accreditation. Numerous methods can be used for forecasting, one of which is the backpropagation neural network (BPNN). Data quality is critical to building a good BPNN model because the model's success depends heavily on its input data, and one way to improve data quality is to smooth the data. This study forecast the unique-visitor time series of electronic journals with three models: BPNN, BPNN with mean smoothing, and BPNN with median smoothing. The smallest error was obtained by the BPNN with mean smoothing (MSE 0.00129, RMSE 0.03518) using a learning rate of 0.4 on a 1-2-1 architecture, which can be used to forecast unique visitors of electronic journals.
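The smoothing and error measures named in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the abstract does not give a window size, so the centered window of 3, the shrinking window at the edges, and the sample series are all hypothetical.

```python
from math import sqrt
from statistics import mean, median

def smooth(series, window=3, reducer=mean):
    """Centered moving-window smoothing; the window shrinks at the edges.
    `window=3` is an assumed value, not taken from the abstract."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(reducer(series[lo:hi]))
    return smoothed

def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))

def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(mse(actual, predicted))

# Made-up daily visitor counts with one spike.
visitors = [10, 12, 90, 11, 13]
print(smooth(visitors, reducer=mean))    # the spike is spread across neighbors
print(smooth(visitors, reducer=median))  # the spike is suppressed
```

Median smoothing is more robust to isolated spikes, while mean smoothing preserves more of the overall level; which one helps a downstream model is an empirical question.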
Citations: 0
A Comparative Study of Feature Selection Techniques in Machine Learning for Predicting Stock Market Trends
Pub Date : 2023-09-15 DOI: 10.47738/jads.v4i3.99
Adi Suryaputra Paramita
This study compares the effectiveness of three feature selection techniques, namely Principal Component Analysis (PCA), Information Gain (IG), and Recursive Feature Elimination (RFE), in predicting stock market conditions. It uses three distinct Kaggle datasets that contain data for predicting stock market values. The results show that RFE performs better than PCA and IG in predicting market value with fairly precise accuracy. Using RFE, the study was able to identify the most influential features for prediction, reduce the dimensionality of the data, and improve the performance of the prediction model. These results offer significant benefits for stock trading, including improved investment decisions, reduced investment risk, improved trading strategy performance, and identification of promising investment opportunities. Future research can extend the comparison to other feature selection techniques. The research is novel in several respects. First, it applies PCA, IG, and RFE in the context of stock market prediction; using these techniques to select the most relevant features provides a deeper understanding of how those features influence stock price movements. Second, it uses different Kaggle datasets representing various stock market value predictions, which adds variation to the data and allows the feature selection techniques to be examined in multiple stock market contexts. In conclusion, this research provides insight into the effectiveness of feature selection techniques in stock market value prediction. It also provides actionable guidance for market participants to improve investment decisions and trading performance in the stock market.
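Of the three techniques compared, Information Gain is the simplest to show in a few lines. The sketch below computes IG for a discrete feature as the reduction in label entropy after splitting on that feature. The feature names and toy values are hypothetical, and continuous stock features would need to be binned first, which this sketch assumes has already been done.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Made-up data: next-day market direction and two binned candidate features.
direction = ["up", "up", "down", "down"]
volume_bin = ["high", "high", "low", "low"]   # perfectly separates the labels
weekday_bin = ["mon", "tue", "mon", "tue"]    # carries no information here

scores = {
    "volume_bin": information_gain(volume_bin, direction),
    "weekday_bin": information_gain(weekday_bin, direction),
}
# Features would then be ranked by score and the top ones kept.
print(sorted(scores.items(), key=lambda item: item[1], reverse=True))
```

Unlike RFE, this scoring is independent of any downstream model, which makes it cheap but blind to interactions between features.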
Citations: 0
LSTM-Based Machine Translation for Madurese-Indonesian
Pub Date : 2023-09-15 DOI: 10.47738/jads.v4i3.113
Danang Arbian Sulistyo
Madurese is one of the regional languages of Indonesia, spoken predominantly in East Java and on Madura Island. Its use as a daily language has declined significantly because of a language shift among children and adolescents, driven in part by considerations of prestige and the difficulty of learning Madurese. The scarcity of research and scientific publications on the Madurese language also contributes to declining literacy in it. Our research focuses on building a Madurese-to-Indonesian machine translation system to help preserve the Madurese language so that it can be learned through digital media. This study uses the latest Madurese-Indonesian dataset, a corpus of 30,000 Madurese-Indonesian sentence pairs from the online Bible. The study scraped online Bible pages to organize the corpus from the Indonesian and Madurese bilingual Bible. The text was then manually processed to align the scraped results for the two languages, followed by normalization and tokenization to remove non-printable characters and punctuation from the corpus. To perform neural machine translation (NMT), the study connected an RNN encoder to the RNN decoder of the language model; a sequential model with LSTM was used for training and testing, and the BLEU measure was used to assess the accuracy of the translation results. The study used the SoftMax function with the Adam optimizer and added further settings, including 128 layers in the training process and a Dropout layer. Across five tests, the average scores were 0.798068 for BLEU-1, 0.680932 for BLEU-2, 0.623489 for BLEU-3, and 0.523546 for BLEU-4. Given the language differences between Madurese and Indonesian, this can be the best approach for machine translation of Indonesian to Madurese.
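For readers unfamiliar with the BLEU-1 to BLEU-4 scores reported here, the following is a minimal single-reference sketch of cumulative BLEU: the geometric mean of modified n-gram precisions multiplied by a brevity penalty. It is an assumption-laden illustration, not the evaluation code used in the study; it applies no smoothing, and the English example sentences are made up.

```python
from collections import Counter
from math import exp, log

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of the candidate against one reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

def bleu(candidate, reference, max_n=4):
    """Cumulative BLEU-max_n with uniform weights and no smoothing."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # without smoothing, any zero precision zeroes the score
    log_mean = sum(log(p) for p in precisions) / max_n
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) > len(reference):
        penalty = 1.0
    else:
        penalty = exp(1 - len(reference) / len(candidate))
    return penalty * exp(log_mean)

reference = "the cat sat on the mat".split()
candidate = "the cat sat on mat".split()
for n in range(1, 5):
    print(f"BLEU-{n}: {bleu(candidate, reference, max_n=n):.4f}")
```

Because higher-order n-grams are harder to match, cumulative scores fall as n grows, which matches the pattern of the scores reported in the abstract.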
Citations: 0
Bank Soundness Level Prediction: ANFIS vs Deep Learning
Pub Date : 2023-09-15 DOI: 10.47738/jads.v4i3.116
Satia Nur Maharani
The systemic nature of bankruptcy risk in financial institutions has become an important issue in maintaining the existence and stability of domestic and global finance. Statistical methods for bankruptcy prediction have so far proven beneficial. However, this approach has limitations: because the models are built on systematic relationships, their linearity and normality assumptions are often weaknesses. These limitations can be overcome efficiently through the linear and non-linear patterns learned by artificial intelligence models. One of the most popular of these techniques is the Artificial Neural Network (ANN). Many studies show that ANN and fuzzy set theory are more accurate, adaptive, and robust in prediction than statistical models. One technique for integrating ANN with fuzzy logic systems is the Adaptive-Network-Based Fuzzy Inference System (ANFIS). ANFIS is an adaptive network that is functionally equivalent to a fuzzy inference system and combines the advantages of ANN and fuzzy logic. An important feature of ANFIS is its adaptation capability: the membership function parameters can adapt and change during learning. The use of ANN models and fuzzy logic for bankruptcy prediction is still very limited in Indonesia. Therefore, this study aims to construct a bankruptcy prediction model for financial institutions that is more accurate, faster to operate, and more effective by using ANFIS as a hybrid of fuzzy logic and ANN. The results showed that ANFIS can be used to predict the bankruptcy of financial institutions, with a best MAPE of 0.140335507.
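The MAPE figure reported above is a mean absolute percentage error. A minimal sketch of the measure follows; the function name and the sample values are hypothetical, and the abstract does not say whether its MAPE is expressed as a fraction or as a percentage, so this version returns a fraction.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.1 means 10%).
    Raises ValueError on a zero actual value, where MAPE is undefined."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Made-up soundness scores and model outputs.
observed = [100.0, 200.0]
forecast = [110.0, 180.0]
print(mape(observed, forecast))
```

Because each error is divided by the actual value, MAPE weights errors on small actual values more heavily than errors on large ones.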
Citations: 0