2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2...

BusinessDetect: An Advanced Business Information Mining Application for Intelligent Marketing

Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00074

Ye Qiu, Xiaolong Gong, Zhiyi Ma

With the arrival of the era of big data, information technology is widely applied in almost all industries. Traditional marketing is largely dependent on manpower, which is quite inefficient. Data mining combined with big data technology has become an effective solution for intelligent marketing. However, the existing marketing applications mainly concentrate on providing business information retrieval but have limited capability to discover business insights. Hence, in this paper, we propose BusinessDetect, a business information mining application that integrates complete business information and extracts appropriate knowledge to support intelligent marketing. Furthermore, we design different interfaces to display information and interact with users. The evaluation results show that BusinessDetect can provide comprehensive support for developing customers and making decisions more efficiently.

Citations: 3

An Approach for Schema Extraction of NoSQL Graph Databases

Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00046

A. A. Frozza, Salomão Rodrigues Jacinto, R. Mello

Currently, a large volume of heterogeneous data is generated and consumed by several classes of applications, which has given rise to a new family of database models called NoSQL. NoSQL graph databases are members of this family. They provide high scalability and are schemaless, i.e., they do not require an explicit schema, as relational databases do. However, knowledge of how data is structured may be of great importance for data integration or data analysis processes. There are some works in the literature that extract the schema from graph structures or graph-based data sources. Different from them, this work proposes a comprehensive approach that considers all the common NoSQL graph database data model concepts and generates a schema in the recent JSON Schema recommendation. Experimental evaluations show that our solution generates a suitable schema representation with linear complexity.
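
As a rough illustration of deriving a schema from schemaless graph data, the sketch below infers a JSON Schema fragment per node label from property dictionaries. The input layout (`label`, `props`), the function name `infer_node_schemas`, and the type mapping are assumptions made for this example. It is not the authors' algorithm, which also covers relationships and other graph data model concepts.

```python
import json

# Hypothetical mapping from Python value types to JSON Schema type names.
JSON_TYPES = {
    bool: "boolean",
    int: "integer",
    float: "number",
    str: "string",
    list: "array",
    type(None): "null",
}


def infer_node_schemas(nodes):
    """Infer one JSON Schema object per node label in a single pass over the nodes.

    `nodes` is an iterable of dicts like {"label": "Person", "props": {...}};
    this input layout is an assumption for the sketch.
    """
    seen = {}  # label -> {"count": int, "props": {name: {"types": set, "count": int}}}
    for node in nodes:
        entry = seen.setdefault(node["label"], {"count": 0, "props": {}})
        entry["count"] += 1
        for name, value in node["props"].items():
            prop = entry["props"].setdefault(name, {"types": set(), "count": 0})
            prop["types"].add(JSON_TYPES.get(type(value), "object"))
            prop["count"] += 1

    schemas = {}
    for label, entry in seen.items():
        properties = {}
        for name, prop in entry["props"].items():
            types = sorted(prop["types"])
            properties[name] = {"type": types[0] if len(types) == 1 else types}
        schemas[label] = {
            "title": label,
            "type": "object",
            "properties": properties,
            # A property is required only if every node with this label has it.
            "required": sorted(
                n for n, p in entry["props"].items() if p["count"] == entry["count"]
            ),
        }
    return schemas


# Made-up sample nodes for illustration.
nodes = [
    {"label": "Person", "props": {"name": "Ana", "age": 31}},
    {"label": "Person", "props": {"name": "Rui"}},
    {"label": "City", "props": {"name": "Lisbon", "population": 545000}},
]
schemas = infer_node_schemas(nodes)
print(json.dumps(schemas["Person"], indent=2))
```

The sketch visits each node and property once, so its cost grows linearly with the size of the input. That is in the spirit of the linear complexity reported in the abstract, though it says nothing about the authors' actual implementation.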

Citations: 4

Artificial Intelligence and Data Science Governance: Roles and Responsibilities at the C-Level and the Board

Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00052

B. Thuraisingham

Corporate governance and the roles and responsibilities of corporate officers and the board of directors have received increasing interest since the Enron scandal of the early 2000s. This scandal resulted in the enactment of policies, laws and regulations such as the Sarbanes-Oxley Act. More recently, almost every corporation has been focusing on applications of Artificial Intelligence (AI) and Data Science (DS) for its business, in industries including finance and banking, healthcare and medicine, manufacturing and retail, and defense and intelligence. It is therefore critical that these corporations take a serious look at the roles and responsibilities of the corporate officers and the board with respect to the governance of AI and DS operations. This paper discusses the issues and challenges for AI and DS governance with an emphasis on the potential roles and responsibilities of the corporate officers and the board of directors.

Pages: 314-318

Citations: 6

An intelligent baby monitoring system based on Raspberry PI, IoT sensors and convolutional neural network

Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00059

R. Cheggou, Siham Si hadj mohand, Oussama Annad, E. Khoumeri

Taking care of a baby is a challenging task for working parents. In this paper, we present an intelligent baby monitoring system that allows parents to check on their baby remotely and in real time. The proposed system is based on the Raspberry Pi 3 B+ board, a Pi camera, and sound and temperature sensors. To be more efficient, the system uses a convolutional neural network to identify and interpret the baby's status in the cradle. The implementation and experimental results demonstrate the system's efficiency and accuracy and show how it can greatly help parents take care of their baby.

Citations: 3

Building Damage Evaluation from Satellite Imagery using Deep Learning

Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00020

Fei Zhao, Chengcui Zhang

In recent decades, millions of people have been killed by natural disasters such as wildfires, landslides, tsunamis, and volcanic eruptions. The efficiency of post-disaster emergency responses and humanitarian assistance has become crucial in minimizing the expected casualties. This paper focuses on the task of building damage level evaluation, which is a key step for maximizing the deployment efficiency of post-event rescue activities. In this paper, we implement a Mask R-CNN based building damage evaluation model with a practical two-stage training strategy. The motivation of Stage-1 is to train a ResNet 101 backbone in Mask R-CNN as a Building Feature Extractor. In Stage-2, we build on top of the model trained in Stage-1 a deep learning architecture that performs more sophisticated tasks and is able to classify buildings with different damage levels from satellite images. In particular, in order to take advantage of pre-disaster satellite images, we extract the ResNet 101 backbone from the Mask R-CNN trained on pre-disaster images in Stage-1 and utilize it to build a Siamese-based semantic segmentation model for classifying the building damage level at the pixel level. The pre- and post-disaster satellite images are simultaneously fed into the proposed Siamese-based model during the training and inference process. The outputs of these two models have the same size as the input satellite images. Buildings with different damage levels, i.e., 'no damage', 'minor damage', 'major damage', and 'destroyed', are represented as segments of different damage classes in the output. Comparative experiments against multiple state-of-the-art methods are conducted on the xBD satellite imagery dataset. The experimental results indicate that the proposed Siamese-based method improves damage evaluation accuracy by 16 times and by 80% compared with a baseline model implemented by the xBD team and the Mask R-CNN framework, respectively.
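
The Siamese arrangement described above can be sketched in a few lines: one encoder with a single set of weights processes both the pre- and post-disaster image, the two feature maps are combined per pixel, and a classifier head assigns each pixel one of the four damage classes. This is a toy stand-in only. The per-pixel `encode` function replaces the ResNet 101 backbone, and all weights, image values, and function names are made up for illustration.

```python
# Damage classes named in the abstract, indexed 0..3.
CLASSES = ["no damage", "minor damage", "major damage", "destroyed"]


def encode(image, weights):
    """Shared encoder: map each pixel intensity to a small feature vector.

    The same `weights` are used for both inputs, which is what makes the
    arrangement Siamese. This is a toy replacement for a CNN backbone.
    """
    return [[[max(0.0, w * px + b) for (w, b) in weights] for px in row] for row in image]


def classify(pre, post, weights, head):
    """Return a per-pixel class map with the same height and width as the inputs."""
    f_pre, f_post = encode(pre, weights), encode(post, weights)
    out = []
    for row_pre, row_post in zip(f_pre, f_post):
        out_row = []
        for a, b in zip(row_pre, row_post):
            features = a + b  # concatenate pre- and post-disaster features
            scores = [sum(h * f for h, f in zip(head_c, features)) for head_c in head]
            out_row.append(scores.index(max(scores)))  # argmax over the classes
        out.append(out_row)
    return out


# Hypothetical, untrained parameters: two encoder units, four class rows.
weights = [(1.0, 0.0), (-1.0, 1.0)]
head = [
    [0.5, 0.0, 0.5, 0.0],
    [0.4, 0.1, 0.2, 0.3],
    [0.3, 0.2, 0.0, 0.6],
    [0.2, 0.3, -0.5, 1.0],
]
pre = [[0.9, 0.8], [0.7, 0.9]]
post = [[0.9, 0.1], [0.7, 0.5]]
damage = classify(pre, post, weights, head)
for row in damage:
    print([CLASSES[c] for c in row])
```

Because the parameters are untrained, the printed labels carry no meaning. The point is the data flow: shared weights, two inputs, and an output map with the same size as the input images.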

Pages: 82-89

Citations: 12

Toward Data-Driven Assessment of Caregiver’s Burden for Persons with Dementia using Machine Learning Models

Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00061

Hilda Goins, SeyyedPooya HekmatiAthar, G. Byfield, Raymond Samuel, Mohd Anwar

Giving care to persons with dementia (PwD) places a significant strain on the quality of life of familial caregivers. Due to the overdependent nature of PwD, caregivers are burdened with health issues, stress, depression, loneliness, and social isolation. As a result, there is a need for understanding the nature and severity of this burden. In this paper, we introduce a novel data-driven approach based on machine learning modeling to ascertain caregiver burden using multimodal data from multitudinal sources. In particular, we propose to leverage data from smart devices, wearables, and psychometric surveys to assess caregiver burden, employing both shallow and deep neural network architectures.

Citations: 3

Topic Diffusion Discovery based on Deep Non-negative Autoencoder

Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00067

Sheng-Tai Huang, Yihuang Kang, Shao-Min Hung, Bowen Kuo, I-Ling Cheng

Researchers have been overwhelmed by the explosion of research articles published by various research communities. Many scholarly websites, search engines, and digital libraries have been created to help researchers identify potential research topics and keep up with recent progress on research of interest. However, it is still difficult for researchers to keep track of research topic diffusion and evolution without spending a large amount of time reviewing numerous relevant and irrelevant articles. In this paper, we consider a novel topic diffusion discovery technique. Specifically, we propose using a Deep Non-negative Autoencoder with an information divergence measurement that monitors the evolutionary distance of the topic diffusion to understand how research topics change with time. The experimental results show that the proposed approach is able to identify the evolution of research topics as well as to discover topic diffusions in an online fashion.
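
To make the idea of an information divergence between time windows concrete, the sketch below computes the Kullback-Leibler divergence between normalized topic-word weights of consecutive years. The abstract does not say which divergence the authors use or how the autoencoder's topic weights are laid out, so the choice of KL divergence, the `windows` data, and the function names are all assumptions for illustration.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) for two discrete distributions.

    `eps` guards against log(0) when q has a zero where p does not.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical non-negative topic-word weights for one topic in three time windows.
windows = {
    2018: [8.0, 1.0, 1.0, 0.0],
    2019: [7.0, 2.0, 1.0, 0.0],
    2020: [1.0, 2.0, 3.0, 4.0],
}
years = sorted(windows)
drift = {
    (a, b): kl_divergence(normalize(windows[a]), normalize(windows[b]))
    for a, b in zip(years, years[1:])
}
for pair, value in drift.items():
    print(pair, round(value, 4))
```

In this made-up data the topic barely moves between 2018 and 2019 and shifts sharply between 2019 and 2020, so the second divergence is much larger than the first. A monitoring scheme along these lines would flag the second transition as a change in the topic.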

Pages: 405-408

Citations: 3

A New Emulation Platform for Real-time Machine Learning in Substance Use Data Streams

Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00054

Stefan A. Bruendl, Hua Fang, H. Ngo, E. Boyer, Honggang Wang

With 5G networks on the rise, it becomes more and more important to grant researchers access to tools that allow for development and experimentation in the field of 5G transmission. Healthcare can benefit greatly from these developments. In this paper a real-time transmission technique is described and tested that, if implemented, allows wearable devices to transmit multiple streams of data on various frequencies. These tests will be used to explain how this presented platform works, what drawbacks and benefits exist with the proposed scheme, and how to further develop the solution of real-time transmission of sensitive data, such as substance-use data, at higher frequencies.

Citations: 1

Mining Frequent Differences in File Collections

Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00058

S. Chawathe

Collections of textual files, or documents, with substantial inter-document similarities are common in diverse domains. A practically significant class of such similarities, and the dual differences, are well characterized by edit scripts, or colloquially diffs, that use a simple sequence model for documents. The study of such diffs provides valuable insights into the inter-document relationships within a collection and can guide data integration within and across collections. This paper describes a framework for such study that is based on frequently occurring inter-document differences. It motivates and defines a general problem of mining frequent differences and outlines some specific instances. It presents the design and implementation of a prototype system for interactively discovering and visualizing frequent differences. A notable feature of this method is its use of difference-components, or deltas, to bootstrap the discovery of interesting structure in file collections. The paper describes a preliminary experimental evaluation of the method and implementation on a widely used corpus of file-collections.
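
A minimal sketch of the general idea, assuming a line-based sequence model: compute an edit script for every pair of files with Python's standard `difflib`, treat each non-equal operation as a delta, and keep the deltas that recur across pairs. The file contents, the `min_support` threshold, and the function names are hypothetical, and the paper's own problem definition and prototype may differ substantially.

```python
import difflib
from collections import Counter
from itertools import combinations


def deltas(a_lines, b_lines):
    """Yield the non-equal opcodes of a line-based diff as hashable deltas."""
    matcher = difflib.SequenceMatcher(a=a_lines, b=b_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            yield (tag, tuple(a_lines[i1:i2]), tuple(b_lines[j1:j2]))


def frequent_deltas(files, min_support=2):
    """Count deltas over all file pairs; keep those seen in at least `min_support` pairs."""
    counts = Counter()
    for (_, a), (_, b) in combinations(sorted(files.items()), 2):
        counts.update(set(deltas(a, b)))  # count each delta once per pair
    return {d: n for d, n in counts.items() if n >= min_support}


# Made-up collection of small configuration files.
files = {
    "a.conf": ["port=80", "debug=false", "host=example"],
    "b.conf": ["port=8080", "debug=false", "host=example"],
    "c.conf": ["port=80", "debug=false", "host=example", "tls=on"],
    "d.conf": ["port=8080", "debug=false", "host=example", "tls=on"],
}
found = frequent_deltas(files)
for delta, support in sorted(found.items(), key=lambda kv: -kv[1]):
    print(support, delta)
```

On this toy collection two deltas recur: the change of the port line and the added `tls=on` line. Recurring deltas like these hint at the structure of the collection, here two independent configuration choices. Comparing all pairs is quadratic in the number of files, so a real system would need a more selective strategy.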

Pages: 357-364

Citations: 0

An Empirical Analysis on the Usability and Security of Passwords

Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00009

Kanwardeep Singh Walia, S. Shenoy, Yuan Cheng

Security and usability are two essential aspects of a system, but they usually move in opposite directions. Sometimes, to achieve security, usability has to be compromised, and vice versa. Password-based authentication systems require both security and usability. However, to increase password security, absurd rules are introduced, which often drive users to compromise the usability of their passwords. Users tend to forget complex passwords and use techniques such as writing them down, reusing them, and storing them in vulnerable ways. Enhancing the strength while maintaining the usability of a password has become one of the biggest challenges for users and security experts. In this paper, we define the pronounceability of a password as a means to measure how easy it is to memorize - an aspect we associate with usability. We examine a dataset of more than 7 million passwords to determine whether the user-generated passwords are secure. Moreover, we convert the user-generated passwords into phonemes and measure the pronounceability of the phoneme-based representations. We then establish a relationship between the two and suggest how password creation strategies can be adapted to better align with both security and usability.
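
The trade-off between the two measures can be illustrated with a small sketch that scores a password on both axes. Neither score is taken from the paper. The strength estimate is a common character-pool approximation, and `pronounceability` is a crude vowel/consonant alternation proxy rather than the phoneme-based measure the authors describe. Both functions and the sample passwords are assumptions for illustration.

```python
import math
import string

VOWELS = set("aeiou")


def charset_entropy_bits(password):
    """Estimate strength as length * log2(size of the character pool in use)."""
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)
    return len(password) * math.log2(pool) if pool else 0.0


def pronounceability(password):
    """Crude proxy: fraction of adjacent character pairs that alternate vowel/consonant.

    Pairs involving a non-letter never count. This is not the phoneme-based
    measure from the paper, only a stand-in for illustration.
    """
    if len(password) < 2:
        return 0.0
    lowered = password.lower()
    hits = 0
    for a, b in zip(lowered, lowered[1:]):
        if a.isalpha() and b.isalpha() and (a in VOWELS) != (b in VOWELS):
            hits += 1
    return hits / (len(password) - 1)


for pw in ["banana", "tomato42", "xq7#Lp2!"]:
    print(pw, round(charset_entropy_bits(pw), 1), round(pronounceability(pw), 2))
```

Under these toy measures, "banana" is maximally pronounceable but has a low strength estimate, while "xq7#Lp2!" is the reverse. That is the tension the abstract describes. The pool-based estimate also overstates the strength of dictionary words, which is one reason an empirical analysis of real passwords is needed.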

Pages: 1-8

Citations: 3