Pub Date: 2024-07-03 | DOI: 10.1016/j.array.2024.100355
Zafar Hussain, Jukka K. Nurminen, Perttu Ranta-aho
To protect systems from malicious activities, it is important to differentiate between valid and harmful commands. One way to achieve this is by learning the syntax of the commands, which is a complex task because of the expansive and evolving nature of command syntax. To address this, we harnessed the power of a language model. Our methodology involved constructing a specialized vocabulary from our commands dataset, and training a custom tokenizer with a Masked Language Model head, resulting in the development of a BERT-like language model. This model exhibits proficiency in learning command syntax by predicting masked tokens. In comparative analyses, our language model outperformed the Markov Model in categorizing commands using clustering algorithms (DBSCAN, HDBSCAN, OPTICS). The language model achieved higher Silhouette scores (0.72, 0.88, 0.85) compared to the Markov Model (0.53, 0.25, 0.06) and demonstrated significantly lower noise levels (2.63%, 5.39%, 8.49%) versus the Markov Model’s higher noise rates (9.31%, 29.85%, 50.35%). Further validation with manually crafted syntax and BERTScore assessments consistently produced metrics above 0.90 for precision, recall, and F1-score. Our language model excels at learning command syntax, enhancing protective measures against malicious activities.
{"title":"Training a language model to learn the syntax of commands","authors":"Zafar Hussain , Jukka K. Nurminen , Perttu Ranta-aho","doi":"10.1016/j.array.2024.100355","DOIUrl":"https://doi.org/10.1016/j.array.2024.100355","url":null,"abstract":"<div><p>To protect systems from malicious activities, it is important to differentiate between valid and harmful commands. One way to achieve this is by learning the syntax of the commands, which is a complex task because of the expansive and evolving nature of command syntax. To address this, we harnessed the power of a language model. Our methodology involved constructing a specialized vocabulary from our commands dataset, and training a custom tokenizer with a Masked Language Model head, resulting in the development of a BERT-like language model. This model exhibits proficiency in learning command syntax by predicting masked tokens. In comparative analyses, our language model outperformed the Markov Model in categorizing commands using clustering algorithms (DBSCAN, HDBSCAN, OPTICS). The language model achieved higher Silhouette scores (0.72, 0.88, 0.85) compared to the Markov Model (0.53, 0.25, 0.06) and demonstrated significantly lower noise levels (2.63%, 5.39%, 8.49%) versus the Markov Model’s higher noise rates (9.31%, 29.85%, 50.35%). Further validation with manually crafted syntax and BERTScore assessments consistently produced metrics above 0.90 for precision, recall, and F1-score. Our language model excels at learning command syntax, enhancing protective measures against malicious activities.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"23 ","pages":"Article 100355"},"PeriodicalIF":2.3,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000213/pdfft?md5=68aae0cad29d029f8b3ee94e2999445f&pid=1-s2.0-S2590005624000213-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141592737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-07-02 | DOI: 10.1016/j.array.2024.100356
Leonardo Horn Iwaya, Ala Sarah Alaqra, Marit Hansen, Simone Fischer-Hübner
Privacy Impact Assessments (PIAs) offer a process for assessing the privacy impacts of a project or system. As a privacy engineering strategy, they are one of the main approaches to privacy by design, supporting the early identification of threats and controls. However, there is still a shortage of empirical evidence on their use and proven effectiveness in practice. To better understand the current literature and research, this paper provides a comprehensive Scoping Review (ScR) on the topic of PIAs “in the wild,” following the well-established Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines. This ScR includes 45 studies, providing an extensive synthesis of the existing body of knowledge, classifying types of research and publications, appraising the methodological quality of primary research, and summarising the positive and negative aspects of PIAs in practice, as reported by those studies. This ScR also identifies significant research gaps (e.g., evidence gaps from contradictory results and methodological gaps from research design deficiencies), future research pathways, and implications for researchers, practitioners, and policymakers developing and using PIA frameworks. As we conclude, there is still a significant need for more primary research on the topic, both qualitative and quantitative. A critical appraisal of qualitative studies revealed deficiencies in the methodological quality, and only four quantitative studies were identified, suggesting that current primary research remains incipient. Nonetheless, PIAs can be regarded as a prominent sub-area in the broader field of empirical privacy engineering, in which further scientific research to support existing practices is needed.
{"title":"Privacy impact assessments in the wild: A scoping review","authors":"Leonardo Horn Iwaya , Ala Sarah Alaqra , Marit Hansen , Simone Fischer-Hübner","doi":"10.1016/j.array.2024.100356","DOIUrl":"https://doi.org/10.1016/j.array.2024.100356","url":null,"abstract":"<div><p>Privacy Impact Assessments (PIAs) offer a process for assessing the privacy impacts of a project or system. As a privacy engineering strategy, they are one of the main approaches to privacy by design, supporting the early identification of threats and controls. However, there is still a shortage of empirical evidence on their use and proven effectiveness in practice. To better understand the current literature and research, this paper provides a comprehensive Scoping Review (ScR) on the topic of PIAs “in the wild,” following the well-established Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines. This ScR includes 45 studies, providing an extensive synthesis of the existing body of knowledge, classifying types of research and publications, appraising the methodological quality of primary research, and summarising the positive and negative aspects of PIAs in practice, as reported by those studies. This ScR also identifies significant research gaps (e.g., evidence gaps from contradictory results and methodological gaps from research design deficiencies), future research pathways, and implications for researchers, practitioners, and policymakers developing and using PIA frameworks. As we conclude, there is still a significant need for more primary research on the topic, both qualitative and quantitative. A critical appraisal of qualitative studies revealed deficiencies in the methodological quality, and only four quantitative studies were identified, suggesting that current primary research remains incipient. Nonetheless, PIAs can be regarded as a prominent sub-area in the broader field of empirical privacy engineering, in which further scientific research to support existing practices is needed.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"23 ","pages":"Article 100356"},"PeriodicalIF":2.3,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000225/pdfft?md5=fc78c3586c447695244b568609d2c91f&pid=1-s2.0-S2590005624000225-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141604898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-06-17 | DOI: 10.1016/j.array.2024.100354
Wenhua Zeng, Junjie Liu, Bo Zhang
The task of next basket recommendation is pivotal for recommender systems. It involves predicting user actions, such as the next product purchase or movie selection, by exploring sequential purchase behavior and integrating users’ general preferences. These elements may converge and influence users’ subsequent choices. The challenge intensifies with the presence of varied user purchase sequences in the training set, as indiscriminate incorporation of these sequences can introduce superfluous noise. In response to these challenges, we propose an innovative approach: the Selective Hierarchical Representation Model (SHRM). This model effectively integrates transactional data and user profiles to discern both sequential purchase transactions and general user preferences. The SHRM’s adaptability, particularly in employing nonlinear aggregation operations on user representations, enables it to model complex interactions among various influencing factors. Notably, the SHRM employs a Recurrent Neural Network (RNN) to capture extended dependencies in recent purchasing activities. Moreover, it incorporates an innovative sequence similarity task, grounded in a k-plet sampling strategy. This strategy clusters similar sequences, significantly mitigating the learning process’s noise impact. Through empirical validation on three diverse real-world datasets, we demonstrate that our model consistently surpasses leading benchmarks across various evaluation metrics, establishing a new standard in next-basket recommendation.
{"title":"Hierarchical representation learning for next basket recommendation","authors":"Wenhua Zeng , Junjie Liu , Bo Zhang","doi":"10.1016/j.array.2024.100354","DOIUrl":"https://doi.org/10.1016/j.array.2024.100354","url":null,"abstract":"<div><p>The task of next basket recommendation is pivotal for recommender systems. It involves predicting user actions, such as the next product purchase or movie selection, by exploring sequential purchase behavior and integrating users’ general preferences. These elements may converge and influence users’ subsequent choices. The challenge intensifies with the presence of varied user purchase sequences in the training set, as indiscriminate incorporation of these sequences can introduce superfluous noise. In response to these challenges, we propose an innovative approach: the Selective Hierarchical Representation Model (SHRM). This model effectively integrates transactional data and user profiles to discern both sequential purchase transactions and general user preferences. The SHRM’s adaptability, particularly in employing nonlinear aggregation operations on user representations, enables it to model complex interactions among various influencing factors. Notably, the SHRM employs a Recurrent Neural Network (RNN) to capture extended dependencies in recent purchasing activities. Moreover, it incorporates an innovative sequence similarity task, grounded in a k-plet sampling strategy. This strategy clusters similar sequences, significantly mitigating the learning process’s noise impact. Through empirical validation on three diverse real-world datasets, we demonstrate that our model consistently surpasses leading benchmarks across various evaluation metrics, establishing a new standard in next-basket recommendation.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"23 ","pages":"Article 100354"},"PeriodicalIF":2.3,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000201/pdfft?md5=78ee4b9a97b496d96fbd334c5bf79bfb&pid=1-s2.0-S2590005624000201-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141485906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-06-04 | DOI: 10.1016/j.array.2024.100353
Ntivuguruzwa Jean De La Croix, Tohari Ahmad, Fengling Han
Steganalysis, a field devoted to detecting concealed information in various forms of digital media, including text, images, audio, and video files, has evolved significantly over time. This evolution aims to improve the accuracy of revealing potential hidden data. Traditional machine learning approaches, such as support vector machines (SVM) and ensemble classifiers (ECs), were previously employed in steganalysis. However, they proved ineffective against contemporary and prevalent steganographic methods. The field of steganalysis has experienced noteworthy advancements by transitioning from traditional machine learning methods to deep learning techniques, resulting in superior outcomes. More specifically, deep learning-based steganalysis approaches exhibit rapid detection of steganographic payloads and demonstrate remarkable accuracy and efficiency across a spectrum of modern steganographic algorithms. This paper is dedicated to investigating recent developments in deep learning-based steganalysis schemes, exploring their evolution, and conducting a thorough analysis of the techniques incorporated in these schemes. Furthermore, the research aims to delve into the current trends in steganalysis, explicitly focusing on digital image steganography. By examining the latest techniques and methodologies, this work contributes to an enhanced understanding of the evolving landscape of steganalysis, shedding light on the advancements achieved through deep learning-based approaches.
{"title":"Comprehensive survey on image steganalysis using deep learning","authors":"Ntivuguruzwa Jean De La Croix , Tohari Ahmad , Fengling Han","doi":"10.1016/j.array.2024.100353","DOIUrl":"https://doi.org/10.1016/j.array.2024.100353","url":null,"abstract":"<div><p>Steganalysis, a field devoted to detecting concealed information in various forms of digital media, including text, images, audio, and video files, has evolved significantly over time. This evolution aims to improve the accuracy of revealing potential hidden data. Traditional machine learning approaches, such as support vector machines (SVM) and ensemble classifiers (ECs), were previously employed in steganalysis. However, they demonstrated ineffective against contemporary and prevalent steganographic methods. The field of steganalysis has experienced noteworthy advancements by transitioning from traditional machine learning methods to deep learning techniques, resulting in superior outcomes. More specifically, deep learning-based steganalysis approaches exhibit rapid detection of steganographic payloads and demonstrate remarkable accuracy and efficiency across a spectrum of modern steganographic algorithms. This paper is dedicated to investigating recent developments in deep learning-based steganalysis schemes, exploring their evolution, and conducting a thorough analysis of the techniques incorporated in these schemes. Furthermore, the research aims to delve into the current trends in steganalysis, explicitly focusing on digital image steganography. By examining the latest techniques and methodologies, this work contributes to an enhanced understanding of the evolving landscape of steganalysis, shedding light on the advancements achieved through deep learning-based approaches.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100353"},"PeriodicalIF":0.0,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000195/pdfft?md5=3dd7fe4cac4a2f244f4a326af65ea83d&pid=1-s2.0-S2590005624000195-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141292436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-06-03 | DOI: 10.1016/j.array.2024.100352
Mritunjoy Chakraborty, Nishat Naoal, Sifat Momen, Nabeel Mohammed
Alzheimer’s disease, characterized by progressive and irreversible deterioration of cognitive functions, represents a significant health concern, particularly among older adults, as it stands as the foremost cause of dementia. Despite its debilitating nature, early detection of Alzheimer’s disease holds considerable advantages for affected individuals. This study investigates machine-learning methodologies for the early diagnosis of Alzheimer’s disease, utilizing datasets sourced from OASIS and ADNI. The initial classification methods consist of a 5-class ADNI classification and a 3-class OASIS classification. Three unique methodologies encompass binary-class inter-dataset models, which involve training on a single dataset and subsequently testing on another dataset for both ADNI and OASIS datasets. Additionally, a hybrid dataset model is also considered. The proposed methodology entails the concatenation of both datasets, followed by shuffling and subsequently conducting training and testing on the amalgamated dataset. The findings demonstrate impressive levels of accuracy: the Light Gradient Boosting Machine (LGBM) achieved a 99.63% accuracy rate for 5-class ADNI classification and the Multilayer Perceptron (MLP) achieved a 95.75% accuracy rate for 3-class OASIS classification, both with hyperparameter tuning. The K-nearest neighbor algorithm achieved an accuracy of 87.50% in the ADNI-OASIS (2-class) setting when utilizing the Select K Best method, and the Gaussian Naive Bayes algorithm attained an accuracy of 77.97% in the OASIS-ADNI approach using Chi-squared feature selection. The hybrid method, which utilized LGBM with hyperparameter optimization, achieved an accuracy of 99.21%. Furthermore, Explainable AI approaches, particularly LIME, were employed to augment the interpretability of the models.
{"title":"ANALYZE-AD: A comparative analysis of novel AI approaches for early Alzheimer’s detection","authors":"Mritunjoy Chakraborty, Nishat Naoal, Sifat Momen, Nabeel Mohammed","doi":"10.1016/j.array.2024.100352","DOIUrl":"10.1016/j.array.2024.100352","url":null,"abstract":"<div><p>Alzheimer’s disease, characterized by progressive and irreversible deterioration of cognitive functions, represents a significant health concern, particularly among older adults, as it stands as the foremost cause of dementia. Despite its debilitating nature, early detection of Alzheimer’s disease holds considerable advantages for affected individuals. This study investigates machine-learning methodologies for the early diagnosis of Alzheimer’s disease, utilizing datasets sourced from OASIS and ADNI. The initial classification methods consist of a 5-class ADNI classification and a 3-class OASIS classification. Three unique methodologies encompass binary-class inter-dataset models, which involve training on a single dataset and subsequently testing on another dataset for both ADNI and OASIS datasets. Additionally, a hybrid dataset model is also considered. The proposed methodology entails the concatenation of both datasets, followed by shuffling and subsequently conducting training and testing on the amalgamated dataset. The findings demonstrate impressive levels of accuracy, as Light Gradient Boosting Machine (LGBM) achieved a 99.63% accuracy rate for 5-class ADNI classification and a 95.75% accuracy rate by Multilayer Perceptron (MLP) for 3-class OASIS classification, both when hyperparameter tweaking was implemented. The K-nearest neighbor algorithm demonstrated exceptional performance, achieving an accuracy of 87.50% in ADNI-OASIS (2 Class) when utilizing the Select K Best method. The Gaussian Naive Bayes algorithm demonstrated exceptional performance in the OASIS-ADNI approach, attaining an accuracy of 77.97% using Chi-squared feature selection. The accuracy achieved by the Hybrid method, which utilized LGBM with hyperparameter optimization, was 99.21%. Furthermore, the utilization of Explainable AI approaches, particularly Lime, was implemented in order to augment the interpretability of the model.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100352"},"PeriodicalIF":0.0,"publicationDate":"2024-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000183/pdfft?md5=e9c710d51ce1b8bb949bd1c6ac280602&pid=1-s2.0-S2590005624000183-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141276619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This study delves into the comparative efficacy of YOLOv5 and YOLOv8 in corrosion segmentation tasks. We employed three unique datasets, comprising 4942, 5501, and 6136 images, aiming to thoroughly evaluate the models’ adaptability and robustness in diverse scenarios. The assessment metrics included precision, recall, F1-score, and mean average precision. Furthermore, graphical tests offered a visual perspective on the segmentation capabilities of each architecture. Our results highlight YOLOv8’s superior speed and segmentation accuracy across datasets, further corroborated by graphical evaluations. These visual assessments were instrumental in emphasizing YOLOv8’s proficiency in handling complex corroded surfaces. However, in the largest dataset, both models encountered challenges, particularly with overlapping bounding boxes. YOLOv5 notably lagged, struggling to achieve the performance standards set by YOLOv8, especially with irregular corroded surfaces. In conclusion, our findings underscore YOLOv8’s enhanced capabilities, establishing it as a preferable choice for real-world corrosion detection tasks. This research thus offers invaluable insights, poised to redefine corrosion management strategies and guide future explorations in corrosion identification.
{"title":"A comparative study of YOLOv5 and YOLOv8 for corrosion segmentation tasks in metal surfaces","authors":"Edmundo Casas , Leo Ramos , Cristian Romero , Francklin Rivas-Echeverría","doi":"10.1016/j.array.2024.100351","DOIUrl":"10.1016/j.array.2024.100351","url":null,"abstract":"<div><p>This study delves into the comparative efficacy of YOLOv5 and YOLOv8 in corrosion segmentation tasks. We employed three unique datasets, comprising 4942, 5501, and 6136 images, aiming to thoroughly evaluate the models’ adaptability and robustness in diverse scenarios. The assessment metrics included precision, recall, F1-score, and mean average precision. Furthermore, graphical tests offered a visual perspective on the segmentation capabilities of each architecture. Our results highlight YOLOv8’s superior speed and segmentation accuracy across datasets, further corroborated by graphical evaluations. These visual assessments were instrumental in emphasizing YOLOv8’s proficiency in handling complex corroded surfaces. However, in the largest dataset, both models encountered challenges, particularly with overlapping bounding boxes. YOLOv5 notably lagged, struggling to achieve the performance standards set by YOLOv8, especially with irregular corroded surfaces. In conclusion, our findings underscore YOLOv8’s enhanced capabilities, establishing it as a preferable choice for real-world corrosion detection tasks. This research thus offers invaluable insights, poised to redefine corrosion management strategies and guide future explorations in corrosion identification.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100351"},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000171/pdfft?md5=9e4e2adc95d4bf31d6930cbc85e19fa3&pid=1-s2.0-S2590005624000171-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141230560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-05-18 | DOI: 10.1016/j.array.2024.100350
Ahmad Mustapha, Wael Khreich, Wes Masri
Since the earliest machine learning models, metrics such as accuracy and precision have been the de facto way to evaluate and compare trained models. However, a single metric number does not fully capture model similarities and differences, especially in the computer vision domain. A model with high accuracy on a certain dataset might provide a lower accuracy on another dataset without further insights. To address this problem, we build on a recent interpretability technique called Dissect to introduce inter-model interpretability, which determines how models relate or complement each other based on the visual concepts they have learned (such as objects and materials). Toward this goal, we project 13 top-performing self-supervised models into a Learned Concepts Embedding (LCE) space that reveals proximities among models from the perspective of learned concepts. We further crossed this information with the performance of these models on four computer vision tasks and 15 datasets. The experiment allowed us to categorize the models into three categories and revealed, for the first time, the type of visual concepts different tasks require. This is a step forward for designing cross-task learning algorithms.
{"title":"Inter-model interpretability: Self-supervised models as a case study","authors":"Ahmad Mustapha , Wael Khreich , Wes Masri","doi":"10.1016/j.array.2024.100350","DOIUrl":"https://doi.org/10.1016/j.array.2024.100350","url":null,"abstract":"<div><p>Since early machine learning models, metrics such as accuracy and precision have been the de facto way to evaluate and compare trained models. However, a single metric number does not fully capture model similarities and differences, especially in the computer vision domain. A model with high accuracy on a certain dataset might provide a lower accuracy on another dataset without further insights. To address this problem, we build on a recent interpretability technique called Dissect to introduce <em>inter-model interpretability</em>, which determines how models relate or complement each other based on the visual concepts they have learned (such as objects and materials). Toward this goal, we project 13 top-performing self-supervised models into a Learned Concepts Embedding (LCE) space that reveals proximities among models from the perspective of learned concepts. We further crossed this information with the performance of these models on four computer vision tasks and 15 datasets. The experiment allowed us to categorize the models into three categories and revealed the type of visual concepts different tasks required for the first time. This is a step forward for designing cross-task learning algorithms.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100350"},"PeriodicalIF":0.0,"publicationDate":"2024-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S259000562400016X/pdfft?md5=33f9642cc8597d6783b926660acecf8c&pid=1-s2.0-S259000562400016X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141095322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Network Intrusion Detection Systems (NIDSes) are essential for safeguarding critical information systems. However, the lack of adaptability of Machine Learning (ML) based NIDSes to different environments could cause slow adoption. In this paper, we propose a multimodal NIDS that combines flow and payload features to detect cyber-attacks. The focus of the paper is on evaluating the use of multimodal traffic features in detecting attacks, not on a practical online implementation. In the multimodal NIDS, two random forest models are trained to classify network traffic using selected flow-based features and the first few bytes of protocol payload, respectively. Predictions from the two models are combined using a soft voting approach to get the final traffic classification results. We evaluate the multimodal NIDS using flow-based features and the corresponding payloads extracted from Packet Capture (PCAP) files of the publicly available UNSW-NB15 dataset. The experimental results show that the proposed multimodal NIDS can detect most attacks with average Accuracy, Recall, Precision and F1 scores ranging from 98% to 99% using only six flow-based traffic features and the first 32 bytes of protocol payload. The proposed multimodal NIDS provides a reliable approach to detecting cyber-attacks in different environments.
{"title":"Network intrusion detection leveraging multimodal features","authors":"Aklil Kiflay, Athanasios Tsokanos, Mahmood Fazlali, Raimund Kirner","doi":"10.1016/j.array.2024.100349","DOIUrl":"10.1016/j.array.2024.100349","url":null,"abstract":"<div><p>Network Intrusion Detection Systems (NIDSes) are essential for safeguarding critical information systems. However, the lack of adaptability of Machine Learning (ML) based NIDSes to different environments could cause slow adoption. In this paper, we propose a multimodal NIDS that combines flow and payload features to detect cyber-attacks. The focus of the paper is to evaluate the use of multimodal traffic features in detecting attacks, but not on a practical online implementation. In the multimodal NIDS, two random forest models are trained to classify network traffic using selected flow-based features and the first few bytes of protocol payload, respectively. Predictions from the two models are combined using a soft voting approach to get the final traffic classification results. We evaluate the multimodal NIDS using flow-based features and the corresponding payloads extracted from Packet Capture (PCAP) files of a publicly available UNSW-NB15 dataset. The experimental results show that the proposed multimodal NIDS can detect most attacks with average Accuracy, Recall, Precision and F<span><math><msub><mrow></mrow><mrow><mn>1</mn></mrow></msub></math></span> scores ranging from 98% to 99% using only six flow-based traffic features, and the first 32 bytes of protocol payload. The proposed multimodal NIDS provides a reliable approach to detecting cyber-attacks in different environments.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100349"},"PeriodicalIF":0.0,"publicationDate":"2024-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000158/pdfft?md5=571a5eb4d14694ec615bacb4ecbc6a5f&pid=1-s2.0-S2590005624000158-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141029468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-05-05 | DOI: 10.1016/j.array.2024.100348
Jiang Huang, Xianglin Huang, Lifang Yang, Zhulin Tao
Dance is generally paired with music to enhance the effect of a stage performance. Manual music arrangement, however, consumes considerable time and manpower, whereas automatic music arrangement based on an input dance video solves this problem. In this cross-modal music generation task, we take advantage of the complementary information between two input modalities: facial expressions and dance movements. We present Dance2MusicNet (D2MNet), an autoregressive generation model based on dilated convolution, which adopts two feature vectors, dance style and beats, as control signals to generate realistic and diverse music that matches the dance video. Finally, a comprehensive evaluation method covering qualitative and quantitative experiments is proposed. Compared to baseline methods, D2MNet performs better on all evaluation metrics, which clearly demonstrates the effectiveness of our framework.
{"title":"D2MNet for music generation joint driven by facial expressions and dance movements","authors":"Jiang Huang, Xianglin Huang, Lifang Yang, Zhulin Tao","doi":"10.1016/j.array.2024.100348","DOIUrl":"https://doi.org/10.1016/j.array.2024.100348","url":null,"abstract":"<div><p>In general, dance is always associated with music to improve stage performance effect. As we know, artificial music arrangement consumes a lot of time and manpower. While automatic music arrangement based on input dance video perfectly solves this problem. In the cross-modal music generation task, we take advantage of the complementary information between two input modalities of facial expressions and dance movements. Then we present Dance2MusicNet (D2MNet), an autoregressive generation model based on dilated convolution, which adopts two feature vectors, dance style and beats, as control signals to generate real and diverse music that matches dance video. Finally, a comprehensive evaluation method for qualitative and quantitative experiment is proposed. Compared to baseline methods, D2MNet outperforms better in all evaluating metrics, which clearly demonstrates the effectiveness of our framework.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100348"},"PeriodicalIF":0.0,"publicationDate":"2024-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000146/pdfft?md5=57bcf00a132e600642ca5c16a65b9121&pid=1-s2.0-S2590005624000146-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140893510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-04-29 | DOI: 10.1016/j.array.2024.100347
Jianlin Zhang, Chen Hou, Xu Yang, Xuechao Yang, Wencheng Yang, Hui Cui
Advances in convolutional neural networks (CNNs) have markedly progressed the field of face detection, significantly enhancing accuracy and recall metrics. Precision and recall remain pivotal for evaluating CNN-based detection models; however, there is a prevalent inclination to focus on improving true positive rates at the expense of addressing false positives. A critical issue contributing to this discrepancy is the lack of pseudo-face images within training and evaluation datasets. This deficiency impairs the regression capabilities of detection models, leading to numerous erroneous detections and inadequate localization. To address this gap, we introduce the WIDERFACE dataset, enriched with a considerable number of pseudo-face images created by amalgamating human and animal facial features. This dataset aims to bolster the detection of false positives during training phases. Furthermore, we propose a new face detection architecture that incorporates a classification model into the conventional face detection model to diminish the false positive rate and augment detection precision. Our comparative analysis on the WIDERFACE and other renowned datasets reveals that our architecture secures a lower false positive rate while preserving the true positive rate in comparison to existing top-tier face detection models.
{"title":"Advancing face detection efficiency: Utilizing classification networks for lowering false positive incidences","authors":"Jianlin Zhang , Chen Hou , Xu Yang , Xuechao Yang , Wencheng Yang , Hui Cui","doi":"10.1016/j.array.2024.100347","DOIUrl":"https://doi.org/10.1016/j.array.2024.100347","url":null,"abstract":"<div><p>The advancement of convolutional neural networks (CNNs) has markedly progressed in the field of face detection, significantly enhancing accuracy and recall metrics. Precision and recall remain pivotal for evaluating CNN-based detection models; however, there is a prevalent inclination to focus on improving true positive rates at the expense of addressing false positives. A critical issue contributing to this discrepancy is the lack of pseudo-face images within training and evaluation datasets. This deficiency impairs the regression capabilities of detection models, leading to numerous erroneous detections and inadequate localization. To address this gap, we introduce the WIDERFACE dataset, enriched with a considerable number of pseudo-face images created by amalgamating human and animal facial features. This dataset aims to bolster the detection of false positives during training phases. Furthermore, we propose a new face detection architecture that incorporates a classification model into the conventional face detection model to diminish the false positive rate and augment detection precision. Our comparative analysis on the WIDERFACE and other renowned datasets reveals that our architecture secures a lower false positive rate while preserving the true positive rate in comparison to existing top-tier face detection models.</p></div>","PeriodicalId":8417,"journal":{"name":"Array","volume":"22 ","pages":"Article 100347"},"PeriodicalIF":0.0,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2590005624000134/pdfft?md5=be911996c21c7c166881a8828f984b70&pid=1-s2.0-S2590005624000134-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140825066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}