Securing Big Data in the Age of AI

2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). Pub Date: 2019-12-01. DOI: 10.1109/TPS-ISA48467.2019.00035

Murat Kantarcioglu, Fahad Shaon


Citations: 13

Abstract

Organizations are increasingly collecting ever larger amounts of data to build complex data analytics, machine learning, and AI models. The data needed to build such models may be unstructured (e.g., text, images, and video), so it may be stored in different data management systems, ranging from relational databases to newer NoSQL databases tailored for storing unstructured data. Furthermore, data scientists are increasingly using programming languages such as Python and R to process data with many existing libraries. In some cases, the developed code is automatically executed by the NoSQL system on the stored data. These developments indicate the need for a data security and privacy solution that can uniformly protect data stored in many different data management systems and enforce security policies even when sensitive data is processed by a complex program submitted by a data scientist. In this paper, we introduce our vision for building such a solution for protecting big data. Specifically, our proposed system allows organizations to 1) enforce policies that control access to sensitive data, 2) automatically keep the audit logs necessary for data governance and regulatory compliance, 3) sanitize and redact sensitive data on the fly based on data sensitivity and AI model needs, 4) detect potentially unauthorized or anomalous access to sensitive data, and 5) automatically create attribute-based access control policies based on data sensitivity and data type.
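To make capabilities 1) to 3) more concrete, the following is a minimal illustrative sketch and not the authors' system. It assumes a hypothetical three-level sensitivity labeling (`public`, `internal`, `sensitive`), a hypothetical `Guard` class that mediates every read, and a simple regular expression for U.S. Social Security numbers as a stand-in for a real sensitive-data detector. None of these names come from the paper.

```python
import re
from dataclasses import dataclass, field

# Hypothetical sensitivity levels; the paper does not prescribe a labeling scheme.
SENSITIVITY_RANK = {"public": 0, "internal": 1, "sensitive": 2}

# Illustrative detector for SSN-like strings embedded in free text.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class User:
    name: str
    clearance: str  # one of the keys of SENSITIVITY_RANK


@dataclass
class Guard:
    """Mediates reads: enforces attribute-based policy, redacts, and logs."""
    column_labels: dict  # column name -> sensitivity label
    audit_log: list = field(default_factory=list)

    def read(self, user, row):
        user_rank = SENSITIVITY_RANK[user.clearance]
        visible, redacted = {}, []
        for column, value in row.items():
            # Unlabeled columns are treated as sensitive (default deny).
            label = self.column_labels.get(column, "sensitive")
            if SENSITIVITY_RANK[label] > user_rank:
                # Policy enforcement: mask columns above the user's clearance.
                visible[column] = "[REDACTED]"
                redacted.append(column)
            elif isinstance(value, str) and user_rank < SENSITIVITY_RANK["sensitive"]:
                # On-the-fly sanitization of identifiers inside permitted text.
                visible[column] = SSN_PATTERN.sub("[SSN]", value)
            else:
                visible[column] = value
        # Audit logging: record who read which columns and what was masked.
        self.audit_log.append(
            {"user": user.name, "columns": sorted(row), "redacted": sorted(redacted)}
        )
        return visible


guard = Guard({"name": "public", "notes": "internal", "diagnosis": "sensitive"})
row = {"name": "Ada", "notes": "SSN 123-45-6789 on file", "diagnosis": "flu"}
analyst_view = guard.read(User("analyst", "internal"), row)
doctor_view = guard.read(User("doctor", "sensitive"), row)
```

In this sketch the analyst sees the `diagnosis` column masked and the SSN scrubbed from `notes`, while the doctor sees the row unchanged. Both reads are appended to `audit_log`, which a separate component could later scan for anomalous access patterns.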
