
Proceedings of the 2023 12th International Conference on Software and Computer Applications: latest articles

Unstructured Data to Visualized Information: Improving process of master list preparation for university ranking assessment and decision making
Adidah Lajis, Tiliza Awang Mat, Norsuhaili Seid, Zanariah Abu Bakar, Ahmad Shahrafidz Khalid, Faiza Roslee, Nor Azlina Ali
Information is crucial in any organization, especially for university ranking assessment. Searching with the words “universiti kuala lumpur” also retrieves publications from other universities whose names contain those words; the most common false matches came from Infrastructure University Kuala Lumpur. In addition, the data received from Scopus list authors by their initials, so the Publication Unit officer needs time to identify the author of each paper and to assign the campus. Both problems lengthen the preparation of the master list. This initiative aims to shorten the preparation of the university ranking document. The processed data are visualized for heads of campus and top university management so that publication performance can be assessed. The developed system enables Publication Unit officers to process 200 records received from the library within 3 days. Previously, the process took almost two weeks or more because of the time needed to find the owner of each publication. This finding supports the argument made by Patel (2021) that an information system is an enabler for achieving the final goal, which in this case is to prepare the MyRA and SETARA publication master list and to provide publication analysis for the university and its campuses for decision making and strategic planning.
DOI: https://doi.org/10.1145/3587828.3587831 | Published: 2023-02-23
Citations: 0
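The affiliation problem described in the abstract above (searches for “universiti kuala lumpur” also returning records from Infrastructure University Kuala Lumpur) can be illustrated with a small filter. This is only a sketch under stated assumptions, not the authors' system: the record layout (a dict with an `affiliations` list), the function names, and the spelling variants are hypothetical, and only the excluded institution is taken from the abstract.

```python
import re

# Spelling variants of the target university (assumed for illustration).
TARGET_VARIANTS = ("universiti kuala lumpur", "university kuala lumpur")
# Institutions whose names contain the query words but are different entities.
# Only this one is named in the abstract.
EXCLUDED = ("infrastructure university kuala lumpur",)


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def is_target_affiliation(affiliation: str) -> bool:
    """Return True if one affiliation string names the target university."""
    norm = normalize(affiliation)
    if any(excluded in norm for excluded in EXCLUDED):
        return False
    return any(variant in norm for variant in TARGET_VARIANTS)


def filter_records(records: list) -> list:
    """Keep records where at least one affiliation is the target university."""
    return [
        record
        for record in records
        if any(is_target_affiliation(a) for a in record["affiliations"])
    ]
```

Checking each affiliation separately keeps co-authored papers that list both institutions, while a record whose only match is the excluded institution is dropped.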
A Music Labeling Model Based on Traditional Chinese Music Characteristics for Emotional Regulation
Zhenghao He, Ruifan Chen, Yayue Hou, Fei Xie, Xiaoliang Gong, A. Cohn
The effectiveness of emotion regulation based on traditional Chinese music has been verified in clinical trials over thousands of years, but the reasons are unclear. This paper uses feature engineering to find music features that are effective for classifying different types of music, with the aim of providing an automatic recognition framework for building music libraries for mood regulation and music therapy. The work uses repertoire in five modes of traditional Chinese music (equivalent to the scales of Western music), which can be used to regulate loneliness, anxiety, anger, joy, and fear. Features including Chroma, Mel-spectrogram, Tonnetz, and full feature vectors are extracted from fragments of different lengths of a piece of music and are then used to build a classification model for the five modes with a convolutional neural network (CNN). The results show that the highest five-class classification accuracy, 71.09%, is achieved with the Mel-spectrogram of 5 s music clips. A music mode labeling model is then constructed as a weighted combination of the individual feature models. This model was qualitatively evaluated on 13 pieces of music in different musical styles, and the results were reasonable from a music theory perspective. In future work, the music labeling model will be tested on more types of tracks to better assess its reliability.
DOI: https://doi.org/10.1145/3587828.3587837 | Published: 2023-02-23
Citations: 0
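The "weighted combination of the different individual feature models" mentioned in the abstract above could, for example, be a weighted average of each model's class probabilities. The sketch below assumes exactly that; the abstract does not state the combination rule, the weights and probabilities are invented, and the mode names (Gong, Shang, Jue, Zhi, Yu) are the conventional names of the five modes rather than labels taken from the paper.

```python
# Conventional names of the five traditional Chinese modes (assumed labels).
MODES = ["Gong", "Shang", "Jue", "Zhi", "Yu"]


def combine_predictions(model_probs: dict, weights: dict) -> list:
    """Weighted average of per-model class probabilities.

    model_probs maps a model name to a list of five probabilities;
    weights maps the same names to non-negative weights (hypothetical values).
    """
    total = sum(weights[name] for name in model_probs)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = [0.0] * len(MODES)
    for name, probs in model_probs.items():
        if len(probs) != len(MODES):
            raise ValueError(f"model {name!r} must output {len(MODES)} values")
        share = weights[name] / total  # normalize so the shares sum to 1
        for i, p in enumerate(probs):
            combined[i] += share * p
    return combined


def label_clip(model_probs: dict, weights: dict) -> str:
    """Return the mode with the highest combined probability."""
    combined = combine_predictions(model_probs, weights)
    best = max(range(len(MODES)), key=combined.__getitem__)
    return MODES[best]
```

Normalizing the weights keeps the combined vector a probability distribution whenever each model's output is one.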
Comparison of Distance Metrics for Generating Cluster-based Ensemble Learning
L. P. Yulianti, A. Trisetyarso, J. Santoso, K. Surendro
The basis of ensemble learning is using multiple learning algorithms to improve predictive performance compared to individual learners. Behind the various advantages of ensemble learning, there are several issues that need attention, one of which is related to finding a set of diverse base learners. Recently, clustering has been used to generate diverse base learners as opposed to bagging. The main advantages of cluster-based ensemble learners are their robustness and versatility. The key parameters for implementing a clustering algorithm are the cluster size and distance metrics. The contribution of this study is to compare four distance metrics, including the Euclidean, Manhattan, Chebyshev, and Canberra distances, in the clustering method for ensemble generation and evaluate them based on accuracy, purity, and diversity. The methodology is tested on 10 benchmark UCI datasets. The results show that the use of the Chebyshev and Canberra distances achieved superior accuracy to both the Euclidean and Manhattan distances, while the purity and diversity values of the use of the Chebyshev distance outperformed the other three.
DOI: https://doi.org/10.1145/3587828.3587833 | Published: 2023-02-23
Citations: 1
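For readers unfamiliar with the four distance metrics compared in the abstract above, a plain-Python sketch of their standard definitions follows. It is not the authors' code. The Canberra distance is written with the common convention that a term whose denominator is zero contributes nothing, which is an assumption about how zero coordinates are handled.

```python
import math


def _check(x, y):
    """Both vectors must have the same number of coordinates."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")


def euclidean(x, y):
    """Square root of the sum of squared coordinate differences."""
    _check(x, y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Sum of absolute coordinate differences."""
    _check(x, y)
    return sum(abs(a - b) for a, b in zip(x, y))


def chebyshev(x, y):
    """Largest absolute coordinate difference."""
    _check(x, y)
    return max((abs(a - b) for a, b in zip(x, y)), default=0.0)


def canberra(x, y):
    """Sum of |a - b| / (|a| + |b|); terms with a zero denominator are skipped."""
    _check(x, y)
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:
            total += abs(a - b) / denom
    return total
```

Because Canberra divides each difference by the magnitude of the coordinates, it is far more sensitive to changes near zero than the other three, which is one reason the choice of metric can change the clusters and therefore the base learners.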
Multi-Attribute BERT for Preferences Completion in Multi-Criteria Recommender System
Rita Rismala, N. Maulidevi, K. Surendro
For a multi-criteria recommender system (MCRS), a complete set of criteria ratings is necessary to produce an accurate recommendation. Incomplete preferences, known as the "partial preferences problem," is one of the problems in MCRS. This issue affects the performance of MCRS due to an increase in data sparsity. Criteria rating prediction is one method for completing the preferences. Therefore, this study proposes a new method for preferences completion, that is a multi-attribute Bidirectional Encoder Representations from Transformers (BERT). The proposed method incorporates reviews and overall ratings to predict incomplete criteria ratings. Rule-based adjustment is also performed to enhance the performance of the proposed method in predicting the worst rating. This study shows that the proposed method outperforms the baseline method. The proposed method is also evaluated on MCRS using a user-based multi-criteria collaborative filtering approach. The result is that it has a positive impact on the recommendation system.
DOI: https://doi.org/10.1145/3587828.3587875 | Published: 2023-02-23
Citations: 0
String Figure Simulation with Multiresolution Wire Model
Seikoh Nishita
A string figure is a traditional game played with a loop of string: the player hooks and unhooks strands of the loop with the fingers to produce patterns representing certain objects. The patterns change dynamically as the fingers manipulate the string. A computer-based method based on knot theory has been proposed for making string figure patterns. It represents the string as an extended knot diagram and generates patterns by monotonically decreasing the number of crossing points. However, there are string figures for which this method does not produce the correct patterns. In contrast, this paper proposes a method based on physical simulation, from the viewpoint that string figure patterns are determined by the tension and frictional force applied to the string. To evaluate the proposed method, we conducted two types of experiments. In the first, we showed that physical simulations using an adaptive multi-resolution wire model are capable of computing string figure patterns. In the second, we generated patterns for instances of string figures. The experimental results indicate that the proposed method produces the correct patterns in most cases. We also found that it can produce some string figure patterns that conventional methods cannot generate.
DOI: https://doi.org/10.1145/3587828.3587839 | Published: 2023-02-23
Citations: 0
Hydrological Monitoring System Design and Implementation for River Embankment Protection
T. Truong, Doan Thanh Nguyen, Nghia Trong Nguyen
The Mekong Delta has many rivers, which makes it very convenient for waterway traffic. However, the transportation of agricultural and aquatic products by large ships is one possible cause of adverse impacts on the riverbank. This paper therefore presents the design of a monitoring system that collects hydrological parameters that can seriously affect the embankment, such as wave pressure, water level, and soil moisture. A programmable system on chip (PSoC) was chosen for the sensor nodes because of its flexibility in interfacing with sensors. A wireless sensor network using LoRa technology allows continuous operation and deployment over a wide area. A Raspberry Pi board serves as the gateway, aggregating data and uploading it to Firebase's real-time database. An Android application allows easy monitoring of the recorded values. To evaluate its performance, the experimental system was deployed at a high-risk landslide embankment on a tributary of the Hau River in Can Tho City, Vietnam.
DOI: https://doi.org/10.1145/3587828.3587848 | Published: 2023-02-23
Citations: 0
Ensemble of Filter and Embedded Feature Selection Techniques for Malware Classification using High-dimensional Jar Extension Dataset
Yi Wei Tye, U. K. Yusof, Samat Tulpar
Innovations in machine learning algorithms have enhanced the effectiveness of malware detection systems over the previous decades. However, the advancement of high-throughput technologies results in high-dimensional malware data, making feature selection both useful and necessary for such datasets. Feature selection is an information retrieval tool that aims to improve classifiers by identifying important features, which also helps to reduce computational load. However, different feature selection algorithms select representative features using different criteria, making it difficult to determine the optimal technique for a given domain dataset. Ensemble feature selection approaches, which integrate the results of several feature selection methods, can be used to overcome the inadequacies of single feature selection methods. This paper therefore investigates whether a heterogeneous ensemble of filter and embedded feature selection techniques (ANOVA F-test, ReliefF, L1-penalized logistic regression, LASSO regression, Extra-Tree Classifier, and XGBoost), named HEFS-ARLLEX, can provide better classification performance than a single feature selection technique and other ensemble feature selection approaches on malware classification data. The experimental results show that HEFS-ARLLEX, which combines filter and embedded methods, is a better choice, providing consistently high classification accuracy, recall, precision, specificity, and F-measure, together with a reasonable feature reduction rate, on the malware classification dataset.
DOI: https://doi.org/10.1145/3587828.3587849 | Published: 2023-02-23
Citations: 0
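The idea of integrating the results of several feature selectors, as described in the abstract above, can be sketched with a simple mean-rank aggregation. The abstract does not say how HEFS-ARLLEX combines its six selectors, so the aggregation rule here is an assumption, and the scores in the usage note are invented. Each selector is represented only by a dict of importance scores per feature.

```python
def ranks_from_scores(scores: dict) -> dict:
    """Map each feature to its rank (1 = most important) for one selector.

    Ties in score are broken by feature name so the result is deterministic.
    """
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {feature: position + 1 for position, feature in enumerate(ordered)}


def ensemble_select(score_tables: list, k: int) -> list:
    """Return the k features with the best (lowest) mean rank across selectors.

    Mean-rank aggregation is an illustrative choice, not the paper's rule.
    """
    if not score_tables:
        raise ValueError("at least one selector is required")
    features = set(score_tables[0])
    if any(set(table) != features for table in score_tables):
        raise ValueError("all selectors must score the same features")
    rank_tables = [ranks_from_scores(table) for table in score_tables]
    mean_rank = {
        f: sum(ranks[f] for ranks in rank_tables) / len(rank_tables)
        for f in features
    }
    return sorted(features, key=lambda f: (mean_rank[f], f))[:k]
```

For example, with one filter-style table `{"f1": 0.9, "f2": 0.1, "f3": 0.5}` and one embedded-style table `{"f1": 0.2, "f2": 0.1, "f3": 0.7}`, features `f1` and `f3` share the best mean rank and `f2` is dropped when `k` is 2. Working on ranks rather than raw scores avoids having to put scores from different selectors on a common scale.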
Goal Driven Code Generation for Smart Contract Assemblies
Konstantinos Tsiounis, K. Kontogiannis
We are currently witnessing the proliferation of blockchain environments that support a wide spectrum of corporate applications through the use of smart contracts. It is no surprise that smart contract programming language technology constantly evolves to include not only specialized languages such as Solidity, but also general-purpose languages such as GoLang and JavaScript. Furthermore, blockchain technology imposes unique challenges related to the monetary cost of deploying smart contracts and to handling roll-back issues when a smart contract fails. It is therefore evident that the complexity of systems involving smart contracts will only increase over time, making the maintenance and evolution of such systems a very challenging task. One solution to these problems is to approach the implementation and deployment of such systems in a disciplined and automated way. In this paper, we propose a model-driven approach in which the structure and inter-dependencies of smart contracts, as well as stakeholder objectives, are denoted by extended goal models that can then be transformed to yield Solidity code that conforms with those models. More specifically, we present, first, a Domain Specific Language (DSL) to denote extended goal models and, second, a transformation process that allows the Abstract Syntax Trees of such a DSL program to be transformed into Solidity smart contract source code. The transformation process ensures that the generated smart contract skeleton code yields a system that conforms with the model, which serves as a specification of the system so that subsequent analysis, understanding, and maintenance will be easier to achieve.
DOI: https://doi.org/10.1145/3587828.3587846. Published: 2023-02-23. Citations: 0
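The model-to-code step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' DSL or transformation process: the `Goal` class, the `leaf_goals` and `to_solidity` helpers, and the "one stub function per leaf goal" rule are all assumptions made here for illustration, and goal names are assumed to be valid Solidity identifiers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Goal:
    """A node in a toy goal tree (hypothetical; not the paper's DSL)."""
    name: str
    subgoals: List["Goal"] = field(default_factory=list)


def leaf_goals(goal: Goal) -> List[str]:
    """Collect the names of leaf goals below `goal`, depth-first, left to right."""
    names: List[str] = []
    for sub in goal.subgoals:
        names.extend(leaf_goals(sub) if sub.subgoals else [sub.name])
    return names


def to_solidity(root: Goal) -> str:
    """Emit a skeleton contract: the root names the contract, each leaf becomes a stub."""
    lines = [
        "// SPDX-License-Identifier: MIT",
        "pragma solidity ^0.8.0;",
        "",
        f"contract {root.name} {{",
    ]
    for name in leaf_goals(root):
        lines.append(f"    function {name}() public {{")
        lines.append(f"        // TODO: satisfy goal '{name}'")
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


# Made-up example model: an escrow contract with three leaf goals.
model = Goal("Escrow", [Goal("deposit"), Goal("release", [Goal("verify"), Goal("payout")])])
print(to_solidity(model))
```

Because the skeleton is derived mechanically from the tree, every leaf goal in the model has exactly one corresponding stub in the output, which is the kind of model-to-code conformance the abstract refers to.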
Object Detection Approach for Stock Chart Patterns Recognition in Financial Markets
Duy Trong Nguyen, B. Q. Tran, A. Tran, Dat Trong Than, D. Tran
Technical analysis is a chart-based method that uses the price movements and trading volume of stocks to predict trends and inform buying and selling decisions. Pattern-based technical analysis is among the most effective approaches for dealing with stock market volatility. Analysts often find, however, that searching for patterns across thousands of stock symbols costs considerable time and effort. This research applies object detection techniques to analyze and recognize chart patterns, and thereby to evaluate the accuracy of price-action candlesticks. Because image data for candlestick chart patterns is scarce, we built an image dataset covering four patterns: Head and Shoulders, Reverse Head and Shoulders, Double Top, and Double Bottom. The distinctive shape of candlestick charts makes precise patterns hard to discern, so segmentation is used during data processing to reduce candlestick chart noise. Data collection was also time-consuming and laborious, so a data generation method that varies the possible patterns is used to enrich the dataset. Detection performance is reported in the experiments later in the article.
DOI: https://doi.org/10.1145/3587828.3587851. Published: 2023-02-23. Citations: 0
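The abstract above mentions generating varied pattern data to enrich the dataset without describing the method. One way such a generator could look is sketched below for the Double Top pattern. The function name `double_top`, its parameters, and the piecewise-linear shape with uniform jitter are all assumptions made here. The sketch produces a closing-price series only. Rendering it as a candlestick image and annotating a bounding box would be separate steps.

```python
import random
from typing import List


def double_top(n_per_leg: int = 10, peak: float = 110.0, trough: float = 104.0,
               base: float = 100.0, noise: float = 0.3, seed: int = 0) -> List[float]:
    """Return synthetic closing prices tracing base, peak, trough, peak, base.

    Each leg is a straight line sampled at `n_per_leg` points, with uniform
    jitter in [-noise, +noise] added to every sample except the final one.
    """
    rng = random.Random(seed)  # seeded so a given configuration is reproducible
    anchors = [base, peak, trough, peak, base]
    prices: List[float] = []
    for start, end in zip(anchors, anchors[1:]):
        for i in range(n_per_leg):
            t = i / n_per_leg
            prices.append(start + (end - start) * t + rng.uniform(-noise, noise))
    prices.append(anchors[-1])  # close the pattern back at the base level
    return prices


# Varying the seed and the peak/trough levels yields many distinct instances.
series = double_top(seed=1)
print(len(series), round(max(series), 1))
```

Changing `peak`, `trough`, `n_per_leg`, and `seed` gives differently shaped instances of the same pattern, which is the general idea behind enriching a scarce dataset with generated variants.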
COVID-19 Fake News Detection in Malaysia – A Supervised Approach
R. Kalaimagal, Balakrishnan Vimala, Soo Mun Chong
Social media has been flooded with COVID-19-related information since the COVID-19 pandemic began in 2020. Since then, Malaysian citizens have relied more than ever on social media for COVID-19 information. However, the lack of regulation of COVID-19 news on social media platforms has encouraged people to post unverified, fake, and misleading COVID-19-related information. Because fact-checking is time-consuming, people often accept this unverified news at face value and inadvertently spread fake COVID-19 news to their families, friends, and relatives on social messaging platforms such as WhatsApp. The online spread of COVID-19 fake news in Malaysia can have severe consequences, causing widespread panic among Malaysians. In this paper, we propose a supervised learning approach to detect COVID-19 fake news. Fake news on COVID-19 was scraped from the Sebenarnya website, and real news was scraped from The Star website. We applied a semantic model with different word representations, including Bag of Words (BOW), Term Frequency-Inverse Document Frequency (TF-IDF), Word2Vec, and Global Vectors (GloVe). In the evaluation step, six supervised machine learning algorithms were applied: Multinomial Naive Bayes, Support Vector Machines, Decision Tree, Random Forest, Logistic Regression, and AdaBoost. Ten-fold cross-validation was then used to train and evaluate the six algorithms on accuracy, precision, recall, AUC-ROC, and F1-score. Random Forest with the TF-IDF word representation performed best, with over 97% accuracy, compared with numerous conventional supervised machine learning classifiers.
DOI: https://doi.org/10.1145/3587828.3587853. Published: 2023-02-23. Citations: 0
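The TF-IDF word representation named in the abstract above can be sketched with the standard library alone. The weighting used here (raw term frequency times `log(N / df)`) is one common textbook variant and an assumption on our part. The abstract does not say which variant or library the authors used, and smoothed variants give different numbers. The function name `tf_idf` and the toy token lists are made up for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} map per tokenized document.

    weight = (count of term in doc / doc length) * log(N / document frequency)
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights: List[Dict[str, float]] = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy, made-up headlines, already lowercased and tokenized.
docs = [["covid", "cure", "found"], ["covid", "cases", "rise"]]
for vec in tf_idf(docs):
    print(vec)
```

A term that appears in every document ("covid" here) gets weight zero, while terms unique to one document get the highest weights. In a pipeline like the one described, vectors of this kind would be the input to a classifier such as Random Forest, evaluated under 10-fold cross-validation.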