{"title":"Securing Big Data in the Age of AI","authors":"Murat Kantarcioglu, Fahad Shaon","doi":"10.1109/TPS-ISA48467.2019.00035","DOIUrl":null,"url":null,"abstract":"Increasingly organizations are collecting ever larger amounts of data to build complex data analytics, machine learning and AI models. Furthermore, the data needed for building such models may be unstructured (e.g., text, image, and video). Hence such data may be stored in different data management systems ranging from relational databases to newer NoSQL databases tailored for storing unstructured data. Furthermore, data scientists are increasingly using programming languages such as Python, R etc. to process data using many existing libraries. In some cases, the developed code will be automatically executed by the NoSQL system on the stored data. These developments indicate the need for a data security and privacy solution that can uniformly protect data stored in many different data management systems and enforce security policies even if sensitive data is processed using a data scientist submitted complex program. In this paper, we introduce our vision for building such a solution for protecting big data. Specifically, our proposed system system allows organizations to 1) enforce policies that control access to sensitive data, 2) keep necessary audit logs automatically for data governance and regulatory compliance, 3) sanitize and redact sensitive data on-the-fly based on the data sensitivity and AI model needs, 4) detect potentially unauthorized or anomalous access to sensitive data, 5) automatically create attribute-based access control policies based on data sensitivity and data type.","PeriodicalId":129820,"journal":{"name":"2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TPS-ISA48467.2019.00035","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13

Abstract

Increasingly organizations are collecting ever larger amounts of data to build complex data analytics, machine learning and AI models. Furthermore, the data needed for building such models may be unstructured (e.g., text, image, and video). Hence such data may be stored in different data management systems ranging from relational databases to newer NoSQL databases tailored for storing unstructured data. Furthermore, data scientists are increasingly using programming languages such as Python, R etc. to process data using many existing libraries. In some cases, the developed code will be automatically executed by the NoSQL system on the stored data. These developments indicate the need for a data security and privacy solution that can uniformly protect data stored in many different data management systems and enforce security policies even if sensitive data is processed using a data scientist submitted complex program. In this paper, we introduce our vision for building such a solution for protecting big data. Specifically, our proposed system system allows organizations to 1) enforce policies that control access to sensitive data, 2) keep necessary audit logs automatically for data governance and regulatory compliance, 3) sanitize and redact sensitive data on-the-fly based on the data sensitivity and AI model needs, 4) detect potentially unauthorized or anomalous access to sensitive data, 5) automatically create attribute-based access control policies based on data sensitivity and data type.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
在人工智能时代保护大数据
越来越多的组织正在收集越来越多的数据来构建复杂的数据分析、机器学习和人工智能模型。此外,构建这种模型所需的数据可能是非结构化的(例如,文本、图像和视频)。因此,这些数据可以存储在不同的数据管理系统中,从关系数据库到专门用于存储非结构化数据的较新的NoSQL数据库。此外,数据科学家越来越多地使用编程语言,如Python、R等,使用许多现有的库来处理数据。在某些情况下,开发的代码将由NoSQL系统对存储的数据自动执行。这些发展表明,需要一种数据安全和隐私解决方案,能够统一保护存储在许多不同数据管理系统中的数据,并执行安全策略,即使使用数据科学家提交的复杂程序处理敏感数据。在本文中,我们介绍了构建这样一个保护大数据的解决方案的愿景。具体来说,我们建议的系统系统允许组织1)执行控制敏感数据访问的策略,2)自动保留必要的审计日志以实现数据治理和法规遵从性,3)根据数据敏感性和人工智能模型需求实时清理和编辑敏感数据,4)检测对敏感数据的潜在未经授权或异常访问,5)根据数据敏感性和数据类型自动创建基于属性的访问控制策略。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
A Performance Evaluation of CAN Encryption Title Page I Disincentivizing Double Spend Attacks Across Interoperable Blockchains User Acceptance of Usable Blockchain-Based Research Data Sharing System: An Extended TAM-Based Study Next Generation Smart Built Environments: The Fusion of Empathy, Privacy and Ethics
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1