2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC): Latest Papers, Page 2

Risk-Aware Mobile App Security Testing: Safeguarding Sensitive User Inputs

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)

Pub Date : 2024-02-07 DOI: 10.1109/ICAIC60265.2024.10433804

Trishla Shah, Raghav V. Sampangi, Angela Siegel

Over the years, mobile applications have transformed how users interact with digital services. Many of these apps, however, are free and offer convenience in exchange for personal data, which brings inherent risks to user privacy and security. This paper introduces a comprehensive methodology that evaluates the risks associated with sharing sensitive data through mobile applications. Building upon the Hierarchical Weighted Risk Scoring Model (HWRSM), it proposes an evaluation methodology for HWRSM, keeping in mind the implications of such risk scoring for real-world security scenarios. The methodology employs risk scoring that considers various factors to assess potential security vulnerabilities related to sensitive terms. Practical assessments involving a diverse set of Android applications, particularly in data-intensive categories, reveal insights into data privacy practices, vulnerabilities, and alignment with HWRSM scores. By offering insights into testing, validation, real-world findings, and model effectiveness, the paper aims to contribute practical considerations to mobile application security discussions, facilitating informed approaches to addressing security and privacy concerns.

Citations: 0

Mobile Application Security Risk Score: A sensitive user input-based approach

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)

Pub Date : 2024-02-07 DOI: 10.1109/ICAIC60265.2024.10433828

Trishla Shah, Raghav V. Sampangi, Angela Siegel

This research paper introduces a Hierarchical Weighted Risk Scoring Model designed to assess the risk levels of mobile applications based on user inputs. Through an extensive review of the literature on risk score calculation models and term sensitivity identification techniques, this study categorizes terms by their sensitivity, particularly sensitive user inputs that may lead to data leaks. The sensitivity of user terms is defined based on the guidelines from PIPEDA. By integrating these terms with test outcomes and weights, the model accurately calculates risk scores. The resulting risk assessments provide users with valuable insights, empowering them to make informed decisions and effectively manage risks associated with mobile application usage. This research contributes to the field by offering a user-centric framework that combines various risk score calculation models and term sensitivity identification techniques, tailored specifically to mobile applications and addressing the potential risks arising from sensitive user inputs.
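
The abstract does not state the scoring formula, so the following is only a minimal sketch of how a hierarchical weighted score could combine term sensitivity, test outcomes, and weights. The `TermResult` fields, the category weights, and the averaging scheme are all illustrative assumptions, not values or rules taken from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TermResult:
    term: str            # sensitive input term (hypothetical example: "card number")
    sensitivity: float   # assumed weight in [0, 1]; higher means more sensitive
    failed_tests: int    # security tests the app failed for this term
    total_tests: int     # security tests run for this term


def term_risk(result: TermResult) -> float:
    """Risk of one term: sensitivity weight times the fraction of failed tests."""
    if result.total_tests == 0:
        return 0.0
    return result.sensitivity * result.failed_tests / result.total_tests


def app_risk(categories: dict[str, tuple[float, list[TermResult]]]) -> float:
    """Weighted average, over categories, of the mean term risk in each category.

    Assumption: each category maps to (category_weight, term_results).
    """
    total_weight = sum(weight for weight, _ in categories.values())
    if total_weight == 0:
        return 0.0
    score = 0.0
    for weight, terms in categories.values():
        if terms:
            mean_risk = sum(term_risk(t) for t in terms) / len(terms)
            score += weight * mean_risk
    return score / total_weight
```

In this sketch the hierarchy has two levels (terms within categories); a deeper hierarchy would apply the same weighted average at each level.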

Citations: 0

A Holistic Review on Detection of Malicious Browser Extensions and Links using Deep Learning

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)

Pub Date : 2024-02-07 DOI: 10.1109/ICAIC60265.2024.10433842

Rama Abirami K, Tiago Zonta, Mithileysh Sathiyanarayanan

The growth of the Internet has drawn attention to network security. A secure network environment is fundamental to the rapid and sound development of the Internet. The majority of internet-based tasks can be completed with the help of a web browser. Although many web applications add browser extensions to improve their functionality, some of these extensions are malicious and can access sensitive data without the user’s knowledge. Browser extensions with malicious intent present a growing security concern and have quickly become one of the most prevalent methods used to compromise Internet security, largely because of their widespread usage and the extensive privileges they possess. After being installed, these malicious extensions execute and attempt to compromise the victim’s browser, which makes them particularly elusive and challenging to combat. It is crucial to promptly develop an effective strategy to address the threats posed by these extensions. This paper presents a comprehensive review of the research on browser extension vulnerabilities. The role of malicious links in web browser extensions is examined for several attacks. Detection of malicious browser extensions is examined from several angles, namely intrusion detection, machine learning based detection methods, and deep learning based techniques for mitigating malicious web browser extensions. This study investigates the critical function of detecting malicious extensions in protecting web browsers, looking at the changing threats and risk-reduction tactics. By recognizing the significance of proactive detection, robust cybersecurity frameworks can be created that not only respond to known threats but also anticipate and thwart the strategies of future cyber adversaries. Thus, this survey provides a detailed comparison of various solutions for malicious browser extensions.

Citations: 0

ICAIC 2024 Cover Page

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)

Pub Date : 2024-02-07 DOI: 10.1109/icaic60265.2024.10433833

Citations: 0

Enhanced Network Intrusion Detection System Using PCGSO-Optimized BI-GRU Model in AI-Driven Cybersecurity

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)

Pub Date : 2024-02-07 DOI: 10.1109/ICAIC60265.2024.10443675

Priyan Malarvizhi Kumar, Kavya Vedantham, Jeeva Selvaraj, B. P. Kavin

The detection of complex attacks by Network Intrusion Detection Systems (NIDS) is hindered by evasion strategies including encrypted traffic and polymorphic malware. Attackers frequently take advantage of holes in NIDS algorithms, underscoring the never-ending cat-and-mouse game between cybersecurity defences and dynamic attack tactics. This study offers a method for strengthening NIDS. The approach includes a thorough preprocessing stage with normalisation and standardisation functions to improve the accuracy and consistency of the input data. The Perceptive Craving Game Search Optimisation (PCGSO) algorithm is then used for feature selection, maximising the effectiveness of the NIDS. Bidirectional Gated Recurrent Unit (BI-GRU) models are used in the classification phase because of their ability to identify sequential dependencies in network traffic data. A second PCGSO run carries out hyperparameter tuning to obtain the best possible model performance. ISCXIDS2012, a popular benchmark dataset in the field, was selected as the evaluation dataset. The suggested approach demonstrates how PCGSO may be used to improve feature selection and hyperparameter tuning, leading to an NIDS that is more accurate and more resilient to cyberattacks. Performance evaluations and experimental findings show that the suggested technique outperforms other current models, with 99% accuracy.
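
The abstract names normalisation and standardisation without defining them. As a minimal sketch, assuming the common definitions (min-max scaling to the range 0 to 1, and z-score scaling to zero mean and unit variance), the two preprocessing functions for a single feature column could look as follows. The function names are hypothetical, and the paper may use different variants.

```python
from statistics import mean, pstdev


def min_max_normalise(values: list[float]) -> list[float]:
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardise(values: list[float]) -> list[float]:
    """Shift and scale values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

For example, `min_max_normalise([2.0, 4.0, 6.0])` returns `[0.0, 0.5, 1.0]`.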

Citations: 0

Simulations and Advancements in MRI-Guided Power-Driven Ferric Tools for Wireless Therapeutic Interventions

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)

Pub Date : 2024-02-07 DOI: 10.1109/ICAIC60265.2024.10433835

Wenhui Chu, Aobo Jin, Hardik A. Gohel

Designing a robotic system that functions effectively within the specific environment of a Magnetic Resonance Imaging (MRI) scanner requires solving numerous technical issues, such as maintaining the robot’s precision and stability under strong magnetic fields. This research focuses on enhancing MRI’s role in medical imaging, especially its use in guiding intravascular interventions with robot-assisted devices. A newly developed computational system is introduced, designed for seamless integration with the MRI scanner and comprising a computational unit and a user interface. This system processes MR images to delineate the vascular network, establishing virtual paths and boundaries within vessels to prevent procedural damage. Key findings reveal the system’s capability to create tailored magnetic field gradient patterns for device control, considering the vessel’s geometry and safety norms, and adapting to different blood flow characteristics for finer navigation. Additionally, the system’s modeling component assesses the safety and feasibility of navigating pre-set vascular paths. In conclusion, this system, built on the Qt framework and C/C++ with specialized software modules, represents a major step forward in merging imaging technology with robotic aid, significantly enhancing precision and safety in intravascular procedures.

Citations: 0

DataAgent: Evaluating Large Language Models’ Ability to Answer Zero-Shot, Natural Language Queries

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)

Pub Date : 2024-02-07 DOI: 10.1109/ICAIC60265.2024.10433803

Manit Mishra, Abderrahman Braham, Charles Marsom, Bryan Chung, Gavin Griffin, Dakshesh Sidnerlikar, Chatanya Sarin, Arjun Rajaram

Conventional processes for analyzing datasets and extracting meaningful information are often time-consuming and laborious. Previous work has identified manual, repetitive coding and data collection as major obstacles that hinder data scientists from undertaking more nuanced labor and high-level projects. To combat this, we evaluated OpenAI’s GPT-3.5 as a "Language Data Scientist" (LDS) that can extrapolate key findings, including correlations and basic information, from a given dataset. The model was tested on a diverse set of benchmark datasets to evaluate its performance across multiple standards, including data science code-generation based tasks involving libraries such as NumPy, Pandas, Scikit-Learn, and TensorFlow, and was broadly successful in correctly answering a given data science query related to the benchmark dataset. The LDS used various novel prompt engineering techniques to effectively answer a given question, including Chain-of-Thought reinforcement and SayCan prompt engineering. Our findings demonstrate great potential for leveraging Large Language Models for low-level, zero-shot data analysis.

Citations: 0

Navigating Data Privacy and Analytics: The Role of Large Language Models in Masking conversational data in data platforms

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)

Pub Date : 2024-02-07 DOI: 10.1109/ICAIC60265.2024.10433801

Mandar Khoje

In the rapidly evolving landscape of data analytics, safeguarding conversational data privacy presents a pivotal challenge, especially with third-party enterprises commonly offering analytic services. This paper delves into the innovative application of Large Language Models (LLMs) for real-time masking of sensitive information in conversational data. The focus is on balancing privacy protection and data utility for analytics within a multi-stakeholder framework. The significance of data privacy is underscored across sectors, with specific attention to challenges in industries like healthcare, particularly when analytics involve external entities. A comprehensive literature review reveals limitations in existing data masking techniques and explores the role of LLMs in diverse contexts, extending beyond direct healthcare applications. The proposed methodology utilizes LLMs for real-time entity recognition and replacement, effectively masking sensitive information while adhering to privacy regulations. This approach is particularly pertinent for third-party analytics providers dealing with conversational data from various sources. Hypothetical case studies, including healthcare scenarios, showcase the practical application and efficacy of the method in real-world settings with external data analytics providers. The dual assessment evaluates the method’s efficiency in preserving privacy and maintaining data utility for analytical purposes. Experimental results using synthetically generated healthcare conversational data sets further illustrate the effectiveness of the approach in typical third-party analytics service scenarios. The discussion highlights broader implications, addressing challenges and limitations [1] across industries, and emphasizes ethical considerations in handling sensitive data by external entities. 
In conclusion, the paper summarizes the significant strides achievable with LLMs for data masking, with implications for diverse sectors and analytics providers. Future research directions, especially fine-tuning LLMs for enhanced performance in varied analytic scenarios, are suggested. This study sets the stage for a harmonious coexistence of customer data protection and utility in the intricate ecosystem of data analytics services, facilitated by the advanced capabilities of LLM technology.
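
The "entity recognition and replacement" step described in the abstract can be illustrated with a small sketch. It does not call an LLM: two regular expressions stand in for the recognition step, and the labels, patterns, and placeholder format are hypothetical choices, not the paper's. What the sketch shows is only the replacement logic, in which the same entity always maps to the same placeholder so that masked conversations remain usable for analytics.

```python
import re

# Assumption: these regexes are a stand-in for the LLM's entity recognition;
# the labels and patterns are hypothetical and not taken from the paper.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def mask(text: str) -> str:
    """Replace each recognised entity with a stable, typed placeholder."""
    placeholders: dict[tuple[str, str], str] = {}
    for label, pattern in PATTERNS.items():
        def substitute(match: re.Match, label: str = label) -> str:
            key = (label, match.group(0))
            if key not in placeholders:
                # Number placeholders per label in order of first appearance.
                index = sum(1 for known, _ in placeholders if known == label) + 1
                placeholders[key] = f"[{label}_{index}]"
            return placeholders[key]
        text = pattern.sub(substitute, text)
    return text
```

For example, `mask("Mail me at ann@example.com or call 555-123-4567; ann@example.com is best.")` returns `"Mail me at [EMAIL_1] or call [PHONE_1]; [EMAIL_1] is best."`.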

Citations: 0

Identifying Race and Gender Bias in Stable Diffusion AI Image Generation

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)

Pub Date : 2024-02-07 DOI: 10.1109/ICAIC60265.2024.10433840

Aadi Chauhan, Taran Anand, Tanisha Jauhari, Arjav Shah, Rudransh Singh, Arjun Rajaram, Rithvik Vanga

In this study, we set out to measure race and gender bias prevalent in text-to-image (TTI) AI image generation, focusing on the popular model Stable Diffusion from Stability AI. Previous investigations into the biases of word embedding models, which serve as the basis for image generation models, have demonstrated that models tend to overstate the relationship between semantic values and gender, ethnicity, or race. These biases are not limited to straightforward stereotypes; more deeply rooted biases may manifest as microaggressions or imposed opinions on policies, such as paid paternity leave decisions. In this analysis, we use the image captioning software OpenFlamingo together with Stable Diffusion to identify and classify bias within text-to-image models. Utilizing data from the Bureau of Labor Statistics, we engineered 50 prompts for professions and 50 prompts for actions to coax out biases in the model ranging from shallow to systemic. Prompts included generating images for "CEO", "nurse", "secretary", "playing basketball", and "doing homework". After generating 20 images for each prompt, we document the model's results. We find that biases do exist within the model across a variety of prompts. For example, 95% of the images generated for "playing basketball" depicted African American men. We then analyze our results by categorizing our prompts into a series of income and education levels corresponding to data from the Bureau of Labor Statistics. Ultimately, we find that racial and gender biases are present yet not drastic.



Citations: 0

A Novel Deep Learning Method for Segmenting the Left Ventricle in Cardiac Cine MRI

2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC)

Pub Date : 2024-02-07 DOI: 10.1109/ICAIC60265.2024.10433830

Wenhui Chu, Aobo Jin, Hardik A. Gohel

This research aims to develop a novel deep learning network, GBU-Net, utilizing a group-batch-normalized U-Net framework, specifically designed for the precise semantic segmentation of the left ventricle in short-axis cine MRI scans. The methodology includes a down-sampling pathway for feature extraction and an up-sampling pathway for detail restoration, enhanced for medical imaging. Key modifications include techniques for better contextual understanding, which is crucial in cardiac MRI segmentation. The dataset consists of 805 left ventricular MRI scans from 45 patients, with comparative analysis using established metrics such as the Dice coefficient and mean perpendicular distance. GBU-Net significantly improves the accuracy of left ventricle segmentation in cine MRI scans. In tests, its design outperforms existing methods on standard metrics such as the Dice coefficient and mean perpendicular distance. The approach is unique in its ability to capture contextual information that is often missed in traditional CNN-based segmentation. An ensemble of GBU-Net models attains a 97% Dice score on the SunnyBrook testing dataset. GBU-Net offers enhanced precision and contextual understanding in left ventricle segmentation for surgical robotics and medical analysis.
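For readers unfamiliar with the Dice coefficient cited in this abstract, the sketch below shows how it is commonly computed for two binary segmentation masks, as 2|A ∩ B| / (|A| + |B|). The function name `dice_coefficient`, the empty-mask convention, and the toy 4x4 masks are assumptions for illustration only; none of them come from the GBU-Net paper.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[Sequence[int]],
                     target: Sequence[Sequence[int]]) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks of equal size."""
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)
    if total == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * intersection / total


# Toy masks (hypothetical data): 4 foreground pixels each, 3 of them shared.
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
target = [[0, 1, 1, 0],
          [0, 1, 0, 0],
          [0, 1, 0, 0],
          [0, 0, 0, 0]]
print(dice_coefficient(pred, target))  # 0.75
```

A score of 1.0 means the predicted and reference masks coincide exactly, so the 97% figure reported above corresponds to a value of 0.97 on this scale.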



Citations: 0