
2022 7th International Conference on Data Science and Machine Learning Applications (CDMA): Latest Papers

Legal Judgment Prediction for Canadian Appeal Cases
Intisar Almuslim, D. Inkpen
Law is one of the knowledge domains that are most reliant on textual material. Nowadays, however, it is very difficult and time-consuming for legal professionals to read, understand, and analyze all the available documents, due to the vast volume of case law that is published every day. In this age of legal big data, and with the increased availability of legal text online, many researchers have given more focus to the development of legal intelligent systems and applications. These intelligent systems can provide great services and solve many problems in the legal domain. In recent years, researchers have focused on predicting judicial case outcomes using Natural Language Processing (NLP) and Machine Learning (ML) methods over case documents. Legal Judgment Prediction (LJP) is the task of automatically predicting the outcome of a court case given only the text of the case. To the best of our knowledge, no prior research with this intention has been conducted in English for appeal courts in Canada, as of 2021. The NLP application that our proposed methodology focuses on is predicting the outcomes of cases by looking only at the description of cases written by the court. Because appeal court decisions are often binary, as in accept or reject, the task is defined as a binary classification problem between 'Allow' and 'Dismiss'. This is the general approach in the literature as well. We employ various classification methods, including classical classifiers and Deep Learning (DL) models, and compare their performance. Our best results are obtained using DL models, with accuracy values reaching 93.46% and F1-scores reaching 0.92, which are on par with the best results in the literature. Through this study, we hope to establish the basis for future research on the legal system of Canada and offer a baseline for future work.
DOI: https://doi.org/10.1109/CDMA54072.2022.00032 | Published: 2022-03-01
Citations: 2
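The legal judgment prediction abstract reports accuracy and F1 for a two-label task ('Allow' versus 'Dismiss'). As a minimal sketch of how those two metrics are computed, the function below scores a list of predictions against gold labels. The function name, the choice of 'Allow' as the positive class, and the sample labels are assumptions made for illustration; none of them come from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str = "Allow"
) -> Dict[str, float]:
    """Return accuracy, precision, recall and F1 for a two-label task."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels for illustration only; this is not data from the paper.
gold = ["Allow", "Dismiss", "Dismiss", "Allow", "Dismiss"]
pred = ["Allow", "Dismiss", "Allow", "Allow", "Dismiss"]
metrics = binary_metrics(gold, pred)
```

Because F1 depends on which label is treated as positive, a reported F1 is only comparable across studies when the positive class or the averaging scheme is the same.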
Automatic Classification of Accessibility User Reviews in Android Apps
Wajdi Aljedaani, Mohamed Wiem Mkaouer, S. Ludi, Yasir Javed
In recent years, mobile applications have gained popularity for providing information, digital services, and content to users, including users with disabilities. However, recent studies have shown that even popular mobile apps face accessibility issues, which hinder their usability for people with disabilities. To discover these issues in new app releases, developers consider user reviews published on the official app stores. However, manually identifying the type of accessibility-related reviews is a challenging and time-consuming task. Therefore, in this study, we used supervised learning techniques, namely Extra Tree Classifier (ETC), Random Forest, Support Vector Classification, Decision Tree, K-Nearest Neighbors (KNN), and Logistic Regression, for automated classification of 2,663 Android app reviews based on four types of accessibility guidelines, i.e., Principles, Audio/Images, Design, and Focus. The results show that the ETC classifier produces the best results in the automated classification of accessibility app reviews, with 93% accuracy.
DOI: https://doi.org/10.1109/CDMA54072.2022.00027 | Published: 2022-03-01
Citations: 8
Phishing Attacks Detection using Machine Learning and Deep Learning Models
M. Aljabri, Samiha Mirza
Because of the fast expansion of internet users, phishing attacks have become a significant menace in which the attacker poses as a trusted entity in order to steal sensitive data, causing reputational damage, loss of money, ransomware, or other malware infections. Intelligent techniques, mainly Machine Learning (ML) and Deep Learning (DL), are increasingly applied in the field of cybersecurity due to their ability to learn from available data in order to extract useful insight and predict future events. This paper investigates the effectiveness of applying such intelligent approaches to detecting phishing websites. We used two separate datasets and selected the most highly correlated features, which comprised a combination of content-based, URL lexical-based, and domain-based features. A set of ML models was then applied, and a comparative performance evaluation was conducted. The results demonstrated the importance of feature selection in improving the models' performance. Furthermore, the analysis also aimed to identify the features that most influence the model in identifying phishing websites. For classification performance, the Random Forest (RF) algorithm achieved the highest accuracy on both datasets.
DOI: https://doi.org/10.1109/CDMA54072.2022.00034 | Published: 2022-03-01
Citations: 14
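The phishing abstract mentions selecting the most highly correlated features before training. One plausible reading, sketched below, ranks each feature by the absolute Pearson correlation between its values and the 0/1 phishing label and keeps the top k. The abstract does not say which correlation measure or cutoff was used, so the measure, the feature names, and the toy values here are all assumptions.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def top_correlated(
    features: Dict[str, Sequence[float]], labels: Sequence[float], k: int
) -> List[str]:
    """Return the k feature names with the largest |correlation| to the label."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], labels)),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: three features for six websites (1 = phishing).
features = {
    "url_length": [20, 95, 30, 110, 25, 120],
    "has_ip_address": [0, 1, 0, 1, 0, 1],
    "domain_age_days": [900, 800, 1000, 700, 850, 950],
}
labels = [0, 1, 0, 1, 0, 1]
selected = top_correlated(features, labels, k=2)
```

Ranking by absolute value keeps strongly negative predictors as well as positive ones; a filter like this looks at each feature in isolation and ignores redundancy between features.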
Evaluation of Machine Learning to Early Detection of Highly Cited Papers
G. M. Binmakhashen, Hamdi A. Al-Jamimi
As one of the fastest-growing topics, machine learning has many applications that span different domains, including image and signal recognition, text mining, information retrieval, and robotics. It enables information extraction and analysis for better insights and decision-based systems. The Web of Science (WoS) citation database is a leading provider of citation data for high-quality published research. WoS has its own metrics for labeling published articles as Highly Cited Papers (HCP). Machine learning (ML) can help researchers identify the key characteristics of HCP. Moreover, it can allow research evaluation units to forecast significant scientific articles. In other words, it may allow researchers and/or research evaluators to detect potential scientific breakthrough ideas and stay current. In this study, more than 26 thousand records of published articles indexed by WoS were analyzed. All the records are drawn from the Technology research area as defined by WoS. Four ML algorithms are evaluated to verify the influence of common HCP factors on raising citations and interest in scientific articles. The ensemble algorithms show promising results in identifying HCP articles using only four factors.
DOI: https://doi.org/10.1109/CDMA54072.2022.00006 | Published: 2022-03-01
Citations: 2
A comparative analysis of Graph Neural Networks and commonly used machine learning algorithms on fake news detection
Fahim Mahmud, Mahi Md. Sadek Rayhan, Mahdi Hasan Shuvo, Islam Sadia, Md. Kishor Morol
Fake news on social media is increasingly regarded as one of the most concerning issues. Low cost, simple accessibility via social platforms, and a plethora of low-budget online news sources are some of the factors that contribute to the spread of false news. Most existing fake news detection algorithms focus solely on news content, but engaged users' prior posts and social activities provide a wealth of information about their views on news and can significantly improve fake news identification. Graph Neural Networks (GNNs) are a form of deep learning that conducts prediction on graph-structured data. Social media platforms follow a graph structure in their representation, and GNNs are a special type of neural network that can be applied directly to graphs, making it much easier to perform edge-, node-, and graph-level prediction. Therefore, in this paper, we present a comparative analysis of some commonly used machine learning algorithms and Graph Neural Networks for detecting the spread of false news on social media platforms. In this study, we take the UPFD dataset and implement several existing machine learning algorithms on text data only. In addition, we create different GNN layers that fuse graph-structured news propagation data with the text data as the node features in our GNN models. In our research, GNNs provide the best solution to the problem of identifying false news.
DOI: https://doi.org/10.1109/CDMA54072.2022.00021 | Published: 2022-03-01
Citations: 5
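The fake news abstract describes fusing a news propagation graph with text features placed on the nodes. The sketch below illustrates only the core idea of one message-passing step: each node averages its own features with those of its neighbours, and a graph-level readout averages all nodes into a single vector. It has no learned weights or nonlinearity, and it is not one of the paper's GNN layers; the tiny graph and the two-dimensional features are hypothetical.

```python
from typing import Dict, List, Tuple

Features = Dict[int, List[float]]


def mean_aggregate(features: Features, edges: List[Tuple[int, int]]) -> Features:
    """One untrained message-passing step: average over self and neighbours."""
    # Start every neighbourhood with the node itself (a self-loop).
    neighbourhood = {node: [node] for node in features}
    for src, dst in edges:  # treat edges as undirected
        neighbourhood[src].append(dst)
        neighbourhood[dst].append(src)

    dim = len(next(iter(features.values())))
    return {
        node: [
            sum(features[n][d] for n in members) / len(members) for d in range(dim)
        ]
        for node, members in neighbourhood.items()
    }


def graph_readout(features: Features) -> List[float]:
    """Graph-level embedding: mean of all node vectors."""
    dim = len(next(iter(features.values())))
    return [sum(vec[d] for vec in features.values()) / len(features) for d in range(dim)]


# Hypothetical propagation graph: node 0 is the news item, nodes 1 and 2
# are users who shared it. Feature vectors stand in for text embeddings.
node_features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [0.0, 1.0]}
share_edges = [(0, 1), (0, 2)]

hidden = mean_aggregate(node_features, share_edges)
embedding = graph_readout(hidden)
```

In a trained GNN the aggregated vectors would pass through learned weight matrices and a classifier; a content-only baseline, by contrast, would use just the news node's own features.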
An Investigation of Forecasting Tadawul All Share Index (TASI) Using Machine Learning
G. M. Binmakhashen, A. Bakather, A. Bin-Salem
Stock markets are among the most complex and dynamic environments. Making predictions about stock prices may require combining several sources of market information. Another possibility is to monitor and predict the stock index prices of a target market. In this study, we investigated several machine learning algorithms to predict the Saudi stock price index using Bloomberg's most used indicators. The collected data represents 26 years of Tadawul All Share Index (TASI) prices. Several machine learning algorithms were investigated for forecasting midterm TASI index pricing. Two Recurrent Neural Network (RNN) architectures (one deeper and one shallower) were created, trained, and tested, and their performance in forecasting TASI index prices is contrasted. Furthermore, several traditional machine learning methods, such as linear regression, decision trees, and random forests, are also studied for index price prediction. The experiments suggest that, with 26 years of TASI index transactions, simple machine learning (ML) models generally produce better midterm index price forecasts than more complex ML models.
DOI: https://doi.org/10.1109/CDMA54072.2022.00009 | Published: 2022-03-01
Citations: 0
Efficient Face-Swap-Verification Using PRNU
Ali Hassani, H. Malik
Facial recognition is becoming the go-to method of identifying users for convenience applications. While great advances have been made in achieving strong false acceptance and false rejection rates on authentic images, these systems can be vulnerable to face-swap attacks. This research addresses face-swap attacks via camera forensics. Whenever an image is modified, there is necessarily an impact on the noise profile (in this case, Photo Response Non-Uniformity, PRNU). Hence, a framework is proposed to enroll the facial recognition camera's “noiseprint” and assess the authenticity of future images based on deviation from the expected value. This is done using down-sampling compression to improve run time, where images are further segmented into sub-zones to retain local sensitivity. Framework performance is evaluated by recording identical facial images using multiple cameras of the same make. Next, a subset is modified via hand-crafted and AI-tool face-swaps. 100% of images are correctly identified as authentic or tampered when using full-image analysis at full scale. Efficiency is then optimized by dividing the image into sub-zones and applying compression. Run time is improved to 4.6 ms on CPU, a 99.1% reduction, by applying quarter-scale down-sampling with 16 sub-zones (this retains 93.5% verification accuracy). These results are validated against three existing state-of-the-art algorithms, which in comparison show over-fitting when compressed. This demonstrates that compressed PRNU can be used to efficiently verify facial images, including against AI facial manipulation tools.
DOI: https://doi.org/10.1109/CDMA54072.2022.00012 | Published: 2022-03-01
Citations: 1
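The PRNU abstract combines down-sampling with per-zone comparison against an enrolled noiseprint. The sketch below shows one way such a check could be structured, assuming the noise residual of the probe image has already been extracted (the denoising step is omitted). It block-averages both arrays, splits them into a 4 x 4 grid of 16 sub-zones, and flags the image if any zone correlates poorly with the noiseprint. The down-sampling factor, the correlation measure, the threshold, and the synthetic data are all assumptions, not the authors' parameters.

```python
import math
import random
from typing import List, Sequence

Image = List[List[float]]


def downsample(img: Image, factor: int) -> Image:
    """Block-average down-sampling; partial blocks at the edges are dropped."""
    rows, cols = len(img) // factor, len(img[0]) // factor
    return [
        [
            sum(
                img[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ) / factor ** 2
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def split_zones(img: Image, grid: int) -> List[List[float]]:
    """Split into grid x grid sub-zones, each returned as a flat pixel list."""
    zh, zw = len(img) // grid, len(img[0]) // grid
    return [
        [img[r][c] for r in range(zr * zh, (zr + 1) * zh)
         for c in range(zc * zw, (zc + 1) * zw)]
        for zr in range(grid)
        for zc in range(grid)
    ]


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized correlation; 0.0 if either zone is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den else 0.0


def zone_scores(residual: Image, noiseprint: Image,
                factor: int = 4, grid: int = 4) -> List[float]:
    """Per-zone correlation between the compressed residual and noiseprint."""
    small_r = split_zones(downsample(residual, factor), grid)
    small_n = split_zones(downsample(noiseprint, factor), grid)
    return [correlation(a, b) for a, b in zip(small_r, small_n)]


def is_authentic(residual: Image, noiseprint: Image, threshold: float = 0.5,
                 factor: int = 4, grid: int = 4) -> bool:
    """Accept only if every sub-zone matches the enrolled noiseprint."""
    return min(zone_scores(residual, noiseprint, factor, grid)) >= threshold


# Synthetic demo data (not real sensor noise): a 64 x 64 noiseprint, an
# untouched residual, and a residual whose top-left region has been altered.
rng = random.Random(0)
noiseprint = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(64)]
authentic = [row[:] for row in noiseprint]
tampered = [row[:] for row in noiseprint]
for r in range(16):
    for c in range(16):
        tampered[r][c] = -tampered[r][c]  # sign flip stands in for a swap
```

Taking the minimum over zones is what keeps the check locally sensitive: a swap confined to one region can leave the whole-image correlation high while driving a single zone's score down.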
A Deep Learning Framework for Temperature Forecasting
Patil Malini, B. Qureshi
Among the many global warming issues, the rise in global temperatures has caused summer heatwaves, which trigger heat strokes leading to the untimely deaths of thousands of people. Heatwaves are meteorological events with prolonged periods of excessive heat. Methods such as Auto-Regressive Integrated Moving Average (ARIMA), ensemble learning, and Long Short-Term Memory (LSTM) networks have recently been used to forecast weather conditions. Optimizing the hyperparameters for accurate temperature forecasting is challenging. This paper presents a Cauchy Particle Swarm Optimization (CPSO) technique for finding the hyperparameters of the LSTM. The proposed technique minimizes the validation mean square error rate (MSER) to improve accuracy. We test the proposed technique on 30-year Riyadh city temperature datasets. In our experimental evaluation, the proposed CPSO-LSTM outperforms LSTM and Grid-search LSTM by 50% and 55%, respectively.
DOI: https://doi.org/10.1109/CDMA54072.2022.00016 | Published: 2022-03-01
Citations: 1
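The entry above ("A Deep Learning Framework for Temperature Forecasting") describes tuning LSTM hyperparameters with Cauchy particle-swarm optimization (CPSO) by minimizing a validation error. A minimal sketch of that idea follows, under loud assumptions: the abstract does not specify the CPSO variant, so this uses one common form (standard PSO plus a greedy Cauchy mutation of the global best); `surrogate_validation_mse` is a hypothetical toy function standing in for "train an LSTM and return its validation MSE"; and the two hyperparameters, their bounds, and the coefficients `w`, `c1`, `c2` are illustrative choices, not values from the paper.

```python
import math
import random


def surrogate_validation_mse(params):
    """Hypothetical stand-in for training an LSTM and returning validation MSE.

    params = (log10 of learning rate, number of hidden units). The bowl-shaped
    formula is invented for illustration; its minimum is at (-3, 64).
    """
    log_lr, units = params
    return (log_lr + 3.0) ** 2 + ((units - 64.0) / 64.0) ** 2


def cauchy_sample(rng, scale):
    """Draw from a zero-centred Cauchy distribution via the inverse CDF."""
    return scale * math.tan(math.pi * (rng.random() - 0.5))


def clamp(value, low, high):
    return max(low, min(high, value))


def cpso(objective, bounds, n_particles=20, n_iters=100, seed=0):
    """PSO with a Cauchy mutation of the global best (one possible CPSO variant)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed values)

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = clamp(pos[i][d] + vel[i][d], lo, hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        # Heavy-tailed Cauchy jump around the global best; kept only if it improves.
        trial = [clamp(gbest[d] + cauchy_sample(rng, 0.05 * (hi - lo)), lo, hi)
                 for d, (lo, hi) in enumerate(bounds)]
        trial_val = objective(trial)
        if trial_val < gbest_val:
            gbest, gbest_val = trial, trial_val
    return gbest, gbest_val


# Search over (log10 learning rate, hidden units); a real search would round units.
best, best_mse = cpso(surrogate_validation_mse, bounds=[(-5.0, -1.0), (8.0, 256.0)])
print(best, best_mse)
```

In a real experiment the objective would train the network on the training split and score it on a held-out validation split, which makes each evaluation expensive; that cost is the usual reason for preferring a swarm search over an exhaustive grid.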
On the Capabilities of Quantum Machine Learning
Sarah Alghamdi, Sultan Almuhammadi
Machine learning techniques give impressive results in many areas. However, physical limitations of integrated circuits restrict the growth of their computational power, and quantum computing is advancing rapidly, so quantum machine learning (QML) has recently attracted a large body of research. QML is a technique that uses quantum algorithms as part of the implementation. Quantum algorithms exploit quantum mechanics and have the potential to outperform classical algorithms for a given problem. In this paper, we discuss three widely used machine learning algorithms and present their quantum versions: the quantum neural network, the quantum autoencoder, and the quantum kernel method. We also discuss the potential capabilities of these QML algorithms and review recent work employing them. As a proof of concept, a quantum neural network prototype is implemented using Qiskit and tested on a real quantum computer. Empirical results show that quantum neural networks can be trained efficiently.
DOI: https://doi.org/10.1109/CDMA54072.2022.00035
Citations: 0
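To make the quantum neural network idea in the entry above ("On the Capabilities of Quantum Machine Learning") more concrete, here is a minimal sketch of training a one-qubit variational circuit. It is not the authors' Qiskit prototype: the circuit is simulated classically using the closed form for its output, and the data, circuit layout, learning rate, and function names are all assumptions made for illustration. The circuit applies RY(x) to encode an input and RY(theta) as the trainable layer to the state |0>, so the measured expectation of Z is cos(x + theta); the gradient comes from the parameter-shift rule, which on hardware would need only two extra circuit evaluations.

```python
import math


def expectation_z(x, theta):
    """<Z> after RY(x) then RY(theta) on |0>; the rotations add, giving cos(x + theta)."""
    return math.cos(x + theta)


def parameter_shift_grad(x, theta):
    """Gradient of <Z> with respect to theta from two shifted circuit evaluations."""
    shift = math.pi / 2
    return 0.5 * (expectation_z(x, theta + shift) - expectation_z(x, theta - shift))


def loss(xs, ys, theta):
    """Mean squared error between circuit outputs and the +/-1 labels."""
    return sum((expectation_z(x, theta) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, theta=0.5, lr=0.1, steps=300):
    """Plain gradient descent on the single trainable angle."""
    for _ in range(steps):
        grad = sum(2 * (expectation_z(x, theta) - y) * parameter_shift_grad(x, theta)
                   for x, y in zip(xs, ys)) / len(xs)
        theta -= lr * grad
    return theta


# Toy data (invented): small angles are labelled -1, angles near pi are labelled +1.
xs = [0.0, 0.4, 2.8, 3.1]
ys = [-1, -1, 1, 1]

theta = train(xs, ys)
predictions = [1 if expectation_z(x, theta) >= 0 else -1 for x in xs]
final_loss = loss(xs, ys, theta)
print(theta, predictions, final_loss)
```

With the untrained angle every point is misclassified, so the training loop has to rotate theta most of the way toward pi before the predicted signs match the labels. A version run on quantum hardware would replace `expectation_z` with an estimate from repeated measurements, which adds sampling noise that this sketch ignores.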
A Robot-based Arabic Sign Language Translating System
Dina A. Alabbad, Nouha O. Alsaleh, Naimah A. Alaqeel, Yara A. Alshehri, Nashwa A. Alzahrani, Maha K. Alhobaishi
An evaluation of the services provided to deaf people in the Eastern Province of Saudi Arabia confirmed a strong need to support the deaf community. This paper proposes using the Pepper robot to recognize and translate Arabic Sign Language (ArSL): the robot recognizes the static hand gestures of ArSL letters in each keyframe extracted from the input video and translates them into written text, and vice versa. The project aims to provide two-way translation of ArSL in order to bridge the communication gap between deaf and non-deaf people in Saudi Arabia. The proposed methods are computer vision, to make use of the Pepper robot's camera and sensors; natural language processing, to convert natural speech to sign language; and deep learning, to build a convolutional neural network model that classifies sign language gestures and converts them into their corresponding written and spoken forms. Two datasets were used: the first is a collection of hand gestures for training the model, and the second consists of 39 animated signs covering all the Arabic letters and special letters.
DOI: https://doi.org/10.1109/CDMA54072.2022.00030
Citations: 3