Adidah Lajis, Tiliza Awang Mat, Norsuhaili Seid, Zanariah Abu Bakar, Ahmad Shahrafidz Khalid, Faiza Roslee, Nor Azlina Ali
Information is crucial in any organization, especially with regard to university ranking assessment. Retrieving records with the search phrase "universiti kuala lumpur" also returns publications from other university entities whose names contain those words; the most common false matches came from Infrastructure University Kuala Lumpur. In addition, the data received from Scopus list authors only by their initials, so the Publication Unit officers need considerable time to identify the author of each paper and to assign the campus, which lengthens the preparation of the master list. This initiative aims to shorten the process of preparing the university ranking documents. The processed data are visualized for heads of campus and top university management so that publication performance can be assessed. The system developed enables the Publication Unit officers to process 200 records received from the library within three days, whereas the process previously took almost two weeks or more because the owners of the publications had to be traced. This finding confirms the argument made by Patel (2021) that an information system is an enabler for achieving the final goal, which in this case is to prepare the MyRA and SETARA publication master lists and to provide publication analyses to the university and its campuses for decision making and strategic planning.
"Unstructured Data to Visualized Information: Improving process of master list preparation for university ranking assessment and decision making." Adidah Lajis, Tiliza Awang Mat, Norsuhaili Seid, Zanariah Abu Bakar, Ahmad Shahrafidz Khalid, Faiza Roslee, Nor Azlina Ali. In Proceedings of the 2023 12th International Conference on Software and Computer Applications, February 2023. DOI: https://doi.org/10.1145/3587828.3587831.
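The affiliation-matching problem described above can be illustrated with a minimal sketch. The record fields, the loose matching rule, and the exclusion list are all assumptions for illustration; the real Scopus export format and search semantics differ. The point is that a search-engine-style match for "universiti kuala lumpur" also catches "Infrastructure University Kuala Lumpur", so an explicit exclusion list is needed:

```python
# Minimal sketch of the affiliation-filtering step (hypothetical fields).
QUERY = "universiti kuala lumpur"
EXCLUDE = ("infrastructure university kuala lumpur",)

def loose_match(affiliation: str) -> bool:
    """Search-engine-like match: treat 'university' and 'universiti'
    as the same token, which is what produces the false positives."""
    text = affiliation.lower().replace("university", "universiti")
    return QUERY in text

def belongs_to_unikl(affiliation: str) -> bool:
    """Accept a loose match unless it names a known other institution."""
    if any(bad in affiliation.lower() for bad in EXCLUDE):
        return False
    return loose_match(affiliation)

records = [
    {"title": "Paper A", "affiliation": "Universiti Kuala Lumpur, Malaysia"},
    {"title": "Paper B", "affiliation": "Infrastructure University Kuala Lumpur"},
    {"title": "Paper C", "affiliation": "Universiti Malaya, Kuala Lumpur"},
]
master_list = [r["title"] for r in records if belongs_to_unikl(r["affiliation"])]
print(master_list)  # only "Paper A" survives the filter
```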
The effectiveness of emotion regulation based on traditional Chinese music has been verified in clinical trials over thousands of years, but the reasons remain unclear. This paper uses feature engineering to find music features that are effective for classifying different types of music, and thus aims to provide an automatic recognition framework for building music libraries that can be used for mood regulation and music therapy. In this work, five modes (equivalent to the scales of Western music) of the traditional Chinese music repertoire, which can be used to regulate loneliness, anxiety, anger, joy, and fear, are used. Features including Chroma, Mel-spectrogram, Tonnetz, and a full feature vector are extracted from fragments of different lengths of each piece and then used to build a five-class classification model for the modes using a convolutional neural network (CNN). The results show that the highest five-class classification accuracy, 71.09%, is achieved from Mel-spectrograms of 5 s music clips. A music mode labeling model is then constructed using a weighted combination of the individual feature models. This model was qualitatively evaluated on 13 pieces of music in different musical styles, and the results were reasonable from a music theory perspective. In future work, this music labeling model will be tested on more types of tracks to better assess its reliability.
"A Music Labeling Model Based on Traditional Chinese Music Characteristics for Emotional Regulation." Zhenghao He, Ruifan Chen, Yayue Hou, Fei Xie, Xiaoliang Gong, A. Cohn. In Proceedings of the 2023 12th International Conference on Software and Computer Applications, February 2023. DOI: https://doi.org/10.1145/3587828.3587837.
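The weighted combination of per-feature models can be sketched as follows. The probabilities and weights below are invented for illustration (the paper's actual components are CNNs trained on Chroma, Mel-spectrogram, and Tonnetz inputs); the sketch only shows how a weighted sum of softmax outputs yields the final mode label:

```python
import numpy as np

# Hypothetical per-feature model outputs: softmax probabilities over the
# five modes for one music clip (values invented for illustration).
probs = {
    "chroma":  np.array([0.10, 0.20, 0.40, 0.20, 0.10]),
    "mel":     np.array([0.05, 0.10, 0.70, 0.10, 0.05]),
    "tonnetz": np.array([0.20, 0.20, 0.30, 0.15, 0.15]),
}
# Weights could reflect each feature's validation accuracy; the Mel model
# was the strongest in the paper (71.09%), so it gets the largest weight.
weights = {"chroma": 0.2, "mel": 0.6, "tonnetz": 0.2}

combined = sum(w * probs[k] for k, w in weights.items())
predicted_mode = int(np.argmax(combined))
print(predicted_mode)  # index of the winning mode
```

Because each input is a probability vector and the weights sum to one, the combined vector is again a valid probability distribution over the five modes.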
L. P. Yulianti, A. Trisetyarso, J. Santoso, K. Surendro
The basis of ensemble learning is the use of multiple learning algorithms to improve predictive performance over individual learners. Alongside its various advantages, ensemble learning raises several issues that need attention, one of which is finding a set of diverse base learners. Recently, clustering has been used to generate diverse base learners, as opposed to bagging. The main advantages of cluster-based ensemble learners are their robustness and versatility. The key parameters for implementing a clustering algorithm are the cluster size and the distance metric. The contribution of this study is to compare four distance metrics (the Euclidean, Manhattan, Chebyshev, and Canberra distances) in the clustering method for ensemble generation and to evaluate them based on accuracy, purity, and diversity. The methodology is tested on 10 benchmark UCI datasets. The results show that the Chebyshev and Canberra distances achieved superior accuracy to the Euclidean and Manhattan distances, while the Chebyshev distance outperformed the other three in purity and diversity.
"Comparison of Distance Metrics for Generating Cluster-based Ensemble Learning." L. P. Yulianti, A. Trisetyarso, J. Santoso, K. Surendro. In Proceedings of the 2023 12th International Conference on Software and Computer Applications, February 2023. DOI: https://doi.org/10.1145/3587828.3587833.
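The four distance metrics compared in the study can be written down directly; only the Canberra distance needs care with zero denominators. The zero-handling convention below (a zero denominator contributes a zero term, as in SciPy) is an assumption about the paper's implementation:

```python
import numpy as np

def euclidean(x, y):
    return float(np.sqrt(np.sum((x - y) ** 2)))

def manhattan(x, y):
    return float(np.sum(np.abs(x - y)))

def chebyshev(x, y):
    # Largest coordinate-wise difference.
    return float(np.max(np.abs(x - y)))

def canberra(x, y):
    # Weighted L1 distance; terms with a zero denominator contribute zero.
    num = np.abs(x - y)
    den = np.abs(x) + np.abs(y)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(den > 0, num / den, 0.0)
    return float(np.sum(terms))

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, 3.0])
print(euclidean(x, y), manhattan(x, y), chebyshev(x, y), canberra(x, y))
```

Swapping one of these functions into a clustering routine (e.g. as the k-means assignment metric) is all that is needed to reproduce the kind of comparison the paper performs.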
For a multi-criteria recommender system (MCRS), a complete set of criteria ratings is necessary to produce accurate recommendations. Incomplete preferences, known as the "partial preferences problem," are one of the problems in MCRS. This issue degrades the performance of an MCRS by increasing data sparsity. Criteria rating prediction is one method for completing the preferences. Therefore, this study proposes a new method for preference completion, namely a multi-attribute Bidirectional Encoder Representations from Transformers (BERT) model. The proposed method incorporates reviews and overall ratings to predict incomplete criteria ratings. A rule-based adjustment is also performed to improve the method's ability to predict the worst rating. This study shows that the proposed method outperforms the baseline method. The proposed method is also evaluated in an MCRS using a user-based multi-criteria collaborative filtering approach, where it has a positive impact on the recommendation results.
"Multi-Attribute BERT for Preferences Completion in Multi-Criteria Recommender System." Rita Rismala, N. Maulidevi, K. Surendro. In Proceedings of the 2023 12th International Conference on Software and Computer Applications, February 2023. DOI: https://doi.org/10.1145/3587828.3587875.
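The rule-based adjustment for the worst rating can be conveyed with a toy sketch. The rating scale, trigger words, and rule are all invented here (the abstract does not specify the actual rules); the sketch only shows the shape of the idea, post-processing a model's predicted rating with an explicit rule:

```python
# Toy sketch of rule-based adjustment of a predicted criteria rating.
# Scale, trigger words, and the rule itself are hypothetical.
NEGATIVE_TRIGGERS = {"terrible", "awful", "worst", "filthy"}
MIN_RATING, MAX_RATING = 1, 5

def adjust_rating(predicted: float, review: str) -> int:
    """Round the model's predicted rating to the scale, then force the
    minimum when the review contains a strongly negative trigger word."""
    words = set(review.lower().split())
    if words & NEGATIVE_TRIGGERS:
        return MIN_RATING
    return max(MIN_RATING, min(MAX_RATING, round(predicted)))

print(adjust_rating(3.4, "the room was filthy"))   # forced to 1
print(adjust_rating(3.4, "quite a pleasant stay")) # rounds to 3
```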
The string figure is a traditional game played with a loop of string: strands of the loop are hooked and unhooked from the fingers to produce patterns representing certain objects. The patterns change dynamically as the string is manipulated by the fingers. A computer-based method based on knot theory has been proposed to generate string figure patterns. This method represents the string as an extended knot diagram and generates patterns by monotonically decreasing the number of crossing points. However, there are string figures whose patterns this method does not generate correctly. In contrast, this paper proposes a method using physical simulation, based on the view that string figure patterns are determined by the tension and frictional forces applied to the string. To evaluate the proposed method, we conducted two types of experiments. In the first, we showed that physical simulations using an adaptive multi-resolution wire model are capable of computing string figure patterns. In the second, we generated patterns for specific string figure instances. The experimental results indicate that the proposed method generates string figure patterns correctly in most cases. We also found that it can produce some patterns that conventional methods cannot.
"String Figure Simulation with Multiresolution Wire Model." Seikoh Nishita. In Proceedings of the 2023 12th International Conference on Software and Computer Applications, February 2023. DOI: https://doi.org/10.1145/3587828.3587839.
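The physical intuition behind such a simulation, that tension determines the settled shape of the string, can be conveyed by a minimal damped mass-spring model. This is a generic sketch, not the paper's adaptive multi-resolution wire model: particles connected by springs, with pinned endpoints standing in for fingers, relax until tension is balanced:

```python
import numpy as np

# Minimal damped mass-spring string (generic sketch, NOT the paper's
# adaptive multi-resolution wire model).
n, rest_len = 10, 1.5
k, damping, dt = 50.0, 0.9, 0.01

# Unevenly spaced particles along a line; both ends are pinned, as if
# held by fingers. Total span 13.5 = 9 segments * rest length 1.5.
pos = np.zeros((n, 2))
pos[:, 0] = [0.0, 0.5, 1.0, 2.5, 3.0, 5.0, 6.0, 8.0, 9.0, 13.5]
vel = np.zeros_like(pos)

for _ in range(4000):
    seg = pos[1:] - pos[:-1]
    length = np.linalg.norm(seg, axis=1, keepdims=True)
    dirn = seg / np.maximum(length, 1e-9)
    tension = k * (length - rest_len) * dirn   # Hooke's law per segment
    force = np.zeros_like(pos)
    force[:-1] += tension                      # pulled toward next particle
    force[1:] -= tension                       # pulled toward previous one
    force[[0, -1]] = 0.0                       # pinned endpoints
    vel = damping * (vel + dt * force)         # damped explicit integration
    pos += dt * vel

seg_len = np.linalg.norm(pos[1:] - pos[:-1], axis=1)
print(seg_len.round(3))  # tension equalizes all segment lengths toward 1.5
```

With the ends pinned, the unique equilibrium has all nine segments at equal length, which is what the damped iteration converges to.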
The Mekong Delta is an area with many rivers and is therefore very convenient for waterway traffic. However, the transportation of agricultural and aquatic products by large ships is one possible cause of adverse impacts on riverbanks. For these reasons, this paper presents the design of a monitoring system that collects hydrological parameters, such as wave pressure, water level, and soil moisture, that can seriously affect an embankment. A programmable system on chip (PSoC) was chosen for the sensor nodes because of its flexibility in interfacing with the sensors. A wireless sensor network using LoRa technology allows continuous operation and can be deployed over a wide area. A Raspberry Pi board serves as the gateway, aggregating data and uploading it to Firebase's real-time database. An Android application has been developed for easy monitoring of the recorded values. To evaluate the system's performance, an experiment was conducted at an embankment at high risk of landslide on a tributary of the Hau River in Can Tho City, Vietnam.
"Hydrological Monitoring System Design and Implementation for River Embankment Protection." T. Truong, Doan Thanh Nguyen, Nghia Trong Nguyen. In Proceedings of the 2023 12th International Conference on Software and Computer Applications, February 2023. DOI: https://doi.org/10.1145/3587828.3587848.
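The kind of compact payload a LoRa node might send to the gateway can be sketched with Python's struct module. The frame layout below (field order, widths, and scaling) is invented for illustration; the paper does not specify its packet format:

```python
import struct

# Hypothetical 11-byte LoRa payload: node id (uint8), wave pressure in Pa
# (float32), water level in cm (uint16), soil moisture in 0.1 % (uint16),
# battery in mV (uint16). Little-endian, no padding.
FMT = "<BfHHH"

def encode(node_id, pressure_pa, level_cm, moisture_tenths, battery_mv):
    return struct.pack(FMT, node_id, pressure_pa, level_cm,
                       moisture_tenths, battery_mv)

def decode(payload: bytes) -> dict:
    node_id, pressure, level, moisture, battery = struct.unpack(FMT, payload)
    return {
        "node": node_id,
        "wave_pressure_pa": pressure,
        "water_level_cm": level,
        "soil_moisture_pct": moisture / 10.0,
        "battery_v": battery / 1000.0,
    }

packet = encode(3, 152.5, 412, 473, 3698)
print(len(packet), decode(packet))
```

A fixed binary layout like this keeps LoRa airtime low; the gateway would decode each frame and forward the resulting dictionary to the cloud database.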
Innovations in machine learning algorithms have enhanced the effectiveness of malware detection systems over the previous decades. However, high-throughput technologies produce high-dimensional malware data, making feature selection useful, and indeed mandatory, on such datasets. Feature selection is an information retrieval technique that aims to improve classifiers by identifying important features, which also reduces computational load. However, different feature selection algorithms select representative features using different criteria, making it difficult to determine the optimal technique for a given domain dataset. Ensemble feature selection approaches, which integrate the results of several feature selectors, can overcome the inadequacies of single feature selection methods. Therefore, this paper examines whether a heterogeneous ensemble of filter and embedded feature selection approaches (the ANOVA F-test, ReliefF, L1-penalized logistic regression, LASSO regression, Extra-Trees classifier, and XGBoost feature selection techniques, abbreviated HEFS-ARLLEX) can provide better classification performance than single feature selection techniques and other ensemble feature selection approaches on malware classification data. The experimental results show that HEFS-ARLLEX, which combines filter and embedded methods, is the better choice, providing consistently high classification accuracy, recall, precision, specificity, and F-measure, with a reasonable feature reduction rate, on the malware classification dataset.
"Ensemble of Filter and Embedded Feature Selection Techniques for Malware Classification using High-dimensional Jar Extension Dataset." Yi Wei Tye, U. K. Yusof, Samat Tulpar. In Proceedings of the 2023 12th International Conference on Software and Computer Applications, February 2023. DOI: https://doi.org/10.1145/3587828.3587849.
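The core mechanics of a heterogeneous ensemble of feature selectors, scoring features with several methods and aggregating the per-method rankings, can be reduced to a few lines. The two scorers below are simple stand-ins (a correlation filter and a class-mean-difference proxy for an embedded method), not the six techniques used by HEFS-ARLLEX, and the aggregation rule (mean rank) is one of several reasonable choices:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
# Synthetic labels driven by features 0 and 3 only.
y = (X[:, 0] + X[:, 3] + 0.1 * rng.normal(size=200) > 0).astype(int)

# Stand-in scorers (higher = more relevant): a filter (absolute
# point-biserial correlation) and an embedded-style proxy (absolute
# difference of class means).
corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
mean_diff = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0))

def to_ranks(scores):
    # rank 0 = best feature under this scorer
    order = np.argsort(-scores)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(scores))
    return ranks

# Aggregate by mean rank across methods, then keep the top-k features.
mean_rank = (to_ranks(corr) + to_ranks(mean_diff)) / 2.0
k = 2
selected = np.argsort(mean_rank)[:k]
print(sorted(selected.tolist()))  # the two truly informative features
```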
We are currently witnessing the proliferation of blockchain environments that support a wide spectrum of corporate applications through the use of smart contracts. It is no surprise that smart contract programming language technology constantly evolves to include not only specialized languages such as Solidity, but also general-purpose languages such as GoLang and JavaScript. Furthermore, blockchain technology poses unique challenges related to the monetary cost of deploying smart contracts and to handling roll-back when a smart contract fails. It is therefore evident that the complexity of systems involving smart contracts will only increase over time, making their maintenance and evolution very challenging. One solution to these problems is to approach the implementation and deployment of such systems in a disciplined and automated way. In this paper, we propose a model-driven approach in which the structure and inter-dependencies of smart contracts, as well as stakeholder objectives, are denoted by extended goal models that can then be transformed to yield Solidity code conforming to those models. More specifically, we present, first, a Domain Specific Language (DSL) for denoting extended goal models and, second, a transformation process that converts the Abstract Syntax Tree of a DSL program into Solidity smart contract source code. The transformation ensures that the generated smart contract skeleton code yields a system conformant with the model, which then serves as a specification of the system so that subsequent analysis, understanding, and maintenance are easier to achieve.
"Goal Driven Code Generation for Smart Contract Assemblies." Konstantinos Tsiounis, K. Kontogiannis. In Proceedings of the 2023 12th International Conference on Software and Computer Applications, February 2023. DOI: https://doi.org/10.1145/3587828.3587846.
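The flavor of a goal-model-to-Solidity transformation can be shown with a toy generator. Both the input schema (a contract name plus goals with responsible actors) and the emitted skeleton are invented for illustration; the paper's DSL and transformation rules are far richer than this:

```python
# Toy goal-model-to-Solidity skeleton generator. The model schema and
# the emitted code shape are hypothetical illustrations.
goal_model = {
    "contract": "Escrow",
    "goals": [
        {"name": "deposit", "actor": "buyer"},
        {"name": "release", "actor": "arbiter"},
    ],
}

def to_solidity(model: dict) -> str:
    lines = [
        "// SPDX-License-Identifier: MIT",
        "pragma solidity ^0.8.0;",
        "",
        f"contract {model['contract']} {{",
    ]
    for goal in model["goals"]:
        # One skeleton function per goal, annotated with its actor so the
        # generated code stays traceable back to the goal model.
        lines.append(f"    // Goal '{goal['name']}' assigned to {goal['actor']}")
        lines.append(f"    function {goal['name']}() public {{")
        lines.append("        // TODO: implement to satisfy the goal")
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)

print(to_solidity(goal_model))
```

Keeping the goal-to-function mapping explicit in comments is what lets the generated skeleton serve as a checkable specification of the system.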
Duy Trong Nguyen, B. Q. Tran, A. Tran, Dat Trong Than, D. Tran
Technical analysis is a chart-based method that uses price movements and the trading volume of stocks to predict trends and make actual buying and selling decisions. Pattern-based technical analysis is one of the most effective approaches for handling stock market volatility. A problem analysts often face is that searching for patterns across thousands of stock symbols wastes time and effort. This research applies object detection techniques to analyze and recognize chart patterns, and thus to evaluate the accuracy of price-action candlestick analysis. However, image data of patterns on candlestick charts is scarce, so we built an image dataset consisting of four patterns: Head and Shoulders, inverse Head and Shoulders, Double Top, and Double Bottom. The distinctive shape of candlestick charts makes it challenging to discern precise patterns, so segmentation is used in the data processing stage to reduce chart noise. Moreover, because data collection is also costly in time and effort, we generate additional data from plausible pattern variations to enrich the dataset. The resulting pattern detection performance is reported in the experiments described in this article.
"Object Detection Approach for Stock Chart Patterns Recognition in Financial Markets." Duy Trong Nguyen, B. Q. Tran, A. Tran, Dat Trong Than, D. Tran. In Proceedings of the 2023 12th International Conference on Software and Computer Applications, February 2023. DOI: https://doi.org/10.1145/3587828.3587851.
Social media has been flooded with enormous amounts of COVID-19-related information ever since the COVID-19 pandemic started back in 2020. Since then, Malaysian citizens have become more reliant than ever on social media for consuming COVID-19 information. However, the lack of regulation of COVID-19 news on social media platforms has encouraged people to post unverified, fake, and misleading COVID-19-related information. Because fact-checking is time-consuming, people often take this unverified COVID-19 news at face value. Consequently, they inadvertently spread fake COVID-19 news to their families, friends, and relatives on social messaging platforms like WhatsApp. The spread of COVID-19 fake news online in Malaysia can have severe consequences, causing widespread panic among fellow Malaysians. In this paper, we propose a supervised learning approach to detect COVID-19 fake news. Fake news on COVID-19 was scraped from the Sebenarnya website, and real news was scraped from The Star website. We applied a semantic model with different word representations, including Bag of Words (BOW), Term Frequency - Inverse Document Frequency (TF-IDF), Word2Vec, and Global Vectors (GloVe). In the evaluation step, six supervised machine learning algorithms were applied: Multinomial Naive Bayes, Support Vector Machines, Decision Tree, Random Forest, Logistic Regression, and AdaBoost. Ten-fold cross-validation was then used to train and evaluate the six algorithms on performance metrics including accuracy, precision, recall, AUC-ROC, and F1-score. The results showed that Random Forest with the TF-IDF word representation performed best, with over 97% accuracy, in contrast to numerous conventional supervised machine learning classifiers.
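The paper's best-performing configuration (TF-IDF features fed to a Random Forest, evaluated with k-fold cross-validation) can be sketched with scikit-learn. The texts, labels, and hyperparameters below are toy placeholders, not the Sebenarnya / The Star corpora or the authors' exact settings, and the toy set is too small for the paper's 10 folds, so 3 are used here:

```python
# Hedged sketch of a TF-IDF + Random Forest fake-news classifier
# with cross-validation, in the spirit of the paper's pipeline.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Toy placeholder corpus: 1 = fake, 0 = real.
texts = [
    "drinking hot water kills the virus instantly",
    "garlic cures covid overnight say forwarded message",
    "secret vaccine chip tracks every citizen",
    "miracle herb stops infection in one day",
    "5g towers spread the virus claims viral post",
    "holding breath test detects covid at home",
    "ministry of health reports new daily case figures",
    "hospitals expand icu capacity amid rising admissions",
    "vaccination drive opens new centres nationwide",
    "researchers publish study on variant transmissibility",
    "government announces updated quarantine guidelines",
    "who issues advisory on mask usage in public",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

clf = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True, stop_words="english")),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
])

# The paper uses cv=10 on the full corpus; 3 folds fit this toy set.
scores = cross_val_score(clf, texts, labels, cv=3, scoring="accuracy")
print(scores.mean())
```

Swapping the `tfidf` step for a `CountVectorizer` yields the BOW variant; Word2Vec and GloVe representations require external embedding models and a custom transformer, so they are omitted from this sketch.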
{"title":"COVID-19 Fake News Detection in Malaysia – A Supervised Approach","authors":"R. Kalaimagal, Balakrishnan Vimala, Soo Mun Chong","doi":"10.1145/3587828.3587853","DOIUrl":"https://doi.org/10.1145/3587828.3587853","url":null,"abstract":"Social media has been flooded with enormous amounts of COVID-19-related information ever since the COVID- 19 pandemic started back in 2020. Since then, Malaysian citizens have become more reliant than ever on social media for consumption of COVID-19 information. However, the lack of COVID-19 news regulations on social media platforms encouraged people to post unverified, fake and misleading COVID-19 related information. Because of the time-consuming nature of fact-checking, people often take these unverified COVID-19 news for granted. Consequently, people inadvertently spread these fake COVID-19 news to their families, friends and relatives on social messaging platforms like WhatsApp. The spread of COVID- 19 fake news online in Malaysia can have severe sequences, causing widespread panic among fellow Malaysians. In this paper, we proposed a supervised learning approach to detect COVID-19 fake news. The fake news on COVID-19 were scraped from the website called Sebenarnya, and real news were scraped from The Star website. We applied a semantic model with different word representations which include Bag of Words (BOW), Term Frequency - Inverse Document Frequency (TF-IDF), Word2Vec and Global Vectors (GloVe). In the evaluation step, 6 supervised machine learning algorithms were applied such as Multinomial Naive Bayes, Support Vector Machines, Decision Tree, Random Forest, Logistic Regression and Adaboost. Afterward, 10-fold cross validation was used to train and evaluate the 6 supervised algorithms according to performance metrics such as accuracy, precision, recall, AUC-ROC, F1-score. 
The results showed that Random Forest with the word representation of TF-IDF performed the best with over 97% accuracy in contrast to numerous conventional supervised machine learning classifiers.","PeriodicalId":340917,"journal":{"name":"Proceedings of the 2023 12th International Conference on Software and Computer Applications","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130183113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}