2020 IEEE 21st International Conference on Information Reuse and Integration for Data Science : IRI 2020 : proceedings : virtual conference, 11-13 August 2020. IEEE International Conference on Information Reuse and Integration (21st : 2... Latest Articles
Title : BusinessDetect: An Advanced Business Information Mining Application for Intelligent Marketing
Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00074
Ye Qiu, Xiaolong Gong, Zhiyi Ma
With the arrival of the era of big data, information technology is widely applied in almost all industries. Traditional marketing is largely dependent on manpower, which is quite inefficient. Data mining combined with big data technology has become an effective solution for intelligent marketing. However, the existing marketing applications mainly concentrate on providing business information retrieval but have limited capability to discover business insights. Hence, in this paper, we propose BusinessDetect, a business information mining application that integrates complete business information and extracts appropriate knowledge to support intelligent marketing. Furthermore, we design different interfaces to display information and interact with users. The evaluation results show that BusinessDetect can provide comprehensive support for developing customers and making decisions more efficiently.
Title : An Approach for Schema Extraction of NoSQL Graph Databases
Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00046
A. A. Frozza, Salomão Rodrigues Jacinto, R. Mello
Currently, a large volume of heterogeneous data is generated and consumed by several classes of applications, which has given rise to a new family of database models called NoSQL. NoSQL graph databases are members of this family. They provide high scalability and are schemaless, i.e., they do not require an explicit schema as relational databases do. However, knowledge of how data is structured may be of great importance for data integration or data analysis processes. There are some works in the literature that extract the schema from graph structures or graph-based data sources. Unlike them, this work proposes a comprehensive approach that considers all the common concepts of NoSQL graph database data models and generates a schema in the recent JSON Schema recommendation. Experimental evaluations show that our solution generates a suitable schema representation with linear complexity.
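To make the idea in the schema-extraction abstract above concrete, the following is a minimal sketch of deriving a JSON Schema description from graph nodes in a single pass. It is not the authors' algorithm: the input shape (a list of `(label, properties)` pairs), the function name `infer_node_schemas`, and the rule that a property is "required" only when every node with that label carries it are all assumptions made for illustration. Edges and other graph concepts covered by the paper are left out.

```python
from collections import defaultdict

# Mapping from Python value types to JSON Schema type names.
JSON_TYPES = {
    bool: "boolean", int: "integer", float: "number", str: "string",
    list: "array", dict: "object", type(None): "null",
}

def infer_node_schemas(nodes):
    """Infer one JSON Schema object per node label.

    `nodes` is an iterable of (label, properties) pairs; this input shape is
    an assumption made for the sketch, not a format taken from the paper.
    """
    counts = defaultdict(int)                        # label -> number of nodes
    types = defaultdict(lambda: defaultdict(set))    # label -> key -> type names
    present = defaultdict(lambda: defaultdict(int))  # label -> key -> occurrences

    for label, props in nodes:  # single pass over the data
        counts[label] += 1
        for key, value in props.items():
            # Unknown Python types fall back to "string" (a simplification).
            types[label][key].add(JSON_TYPES.get(type(value), "string"))
            present[label][key] += 1

    schemas = {}
    for label, total in counts.items():
        properties = {}
        for key, names in sorted(types[label].items()):
            ordered = sorted(names)
            properties[key] = {"type": ordered[0] if len(ordered) == 1 else ordered}
        schemas[label] = {
            "type": "object",
            "properties": properties,
            # A property is required only if every node with this label has it.
            "required": sorted(k for k, n in present[label].items() if n == total),
        }
    return schemas
```

Because each property value is visited once, the work grows linearly with the size of the input, which is in the spirit of the linear complexity the abstract reports, though the paper's own procedure may differ.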
Title : Artificial Intelligence and Data Science Governance: Roles and Responsibilities at the C-Level and the Board
Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00052
B. Thuraisingham
Corporate governance and the roles and responsibilities of corporate officers and the board of directors have received increasing interest since the Enron scandal of the early 2000s. This scandal resulted in the enactment of policies, laws, and regulations such as the Sarbanes-Oxley Act. More recently, almost every corporation has been focusing on applications of Artificial Intelligence (AI) and Data Science (DS) for its business, in industries including finance and banking, healthcare and medicine, manufacturing and retail, and defense and intelligence. It is therefore critical that these corporations take a serious look at the roles and responsibilities of the corporate officers and the board with respect to the governance of AI and DS operations. This paper discusses the issues and challenges for AI and DS governance with an emphasis on the potential roles and responsibilities of the corporate officers and the board of directors.
Title : An intelligent baby monitoring system based on Raspberry PI, IoT sensors and convolutional neural network
Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00059
R. Cheggou, Siham Si hadj mohand, Oussama Annad, E. Khoumeri
Taking care of a baby is a challenging task for working parents. In this paper, we present an intelligent baby monitoring system that allows parents to check on their baby remotely and in real time. The proposed system is based on a Raspberry Pi 3 B+ board, a Pi camera, and sound and temperature sensors. To be more efficient, the system uses a convolutional neural network to identify and interpret the baby's status in the cradle. The implementation and the experimental results of the proposed system demonstrate its efficiency and accuracy and how it can greatly help parents to take care of their baby.
Title : Building Damage Evaluation from Satellite Imagery using Deep Learning
Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00020
Fei Zhao, Chengcui Zhang
In recent decades, millions of people have been killed by natural disasters such as wildfires, landslides, tsunamis, and volcanic eruptions. The efficiency of post-disaster emergency responses and humanitarian assistance has become crucial in minimizing the expected casualties. This paper focuses on the task of building damage level evaluation, which is a key step for maximizing the deployment efficiency of post-event rescue activities. In this paper, we implement a Mask R-CNN based building damage evaluation model with a practical two-stage training strategy. The motivation of Stage-1 is to train a ResNet 101 backbone in Mask R-CNN as a Building Feature Extractor. In Stage-2, we build, on top of the model trained in Stage-1, a deep learning architecture that performs more sophisticated tasks and is able to classify buildings with different damage levels from satellite images. In particular, in order to take advantage of pre-disaster satellite images, we extract the ResNet 101 backbone from the Mask R-CNN trained on pre-disaster images in Stage-1 and use it to build a Siamese based semantic segmentation model for classifying the building damage level at the pixel level. The pre- and post-disaster satellite images are simultaneously fed into the proposed Siamese based model during the training and inference process. The outputs of these two models have the same size as the input satellite images. Buildings with different damage levels, i.e., ‘no damage’, ‘minor damage’, ‘major damage’, and ‘destroyed’, are represented as segments of different damage classes in the output. Comparative experiments are conducted on the xBD satellite imagery dataset against multiple state-of-the-art methods. The experimental results indicate that the proposed Siamese based method is able to improve the damage evaluation accuracy by 16 times and 80%, compared with a baseline model implemented by the xBD team and the Mask R-CNN framework, respectively.
Title : Toward Data-Driven Assessment of Caregiver’s Burden for Persons with Dementia using Machine Learning Models
Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00061
Hilda Goins, SeyyedPooya HekmatiAthar, G. Byfield, Raymond Samuel, Mohd Anwar
Giving care to persons with dementia (PwD) places a significant strain on the quality of life of familial caregivers. Due to the overdependent nature of PwD, caregivers are burdened with health issues, stress, depression, loneliness, and social isolation. As a result, there is a need to understand the nature and severity of this burden. In this paper, we introduce a novel data-driven approach based on machine learning modeling to ascertain caregiver burden using multimodal data from multiple sources. In particular, we propose to leverage data from smart devices, wearables, and psychometric surveys to assess caregiver burden, employing both shallow and deep neural network architectures.
Title : Topic Diffusion Discovery based on Deep Non-negative Autoencoder
Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00067
Sheng-Tai Huang, Yihuang Kang, Shao-Min Hung, Bowen Kuo, I-Ling Cheng
Researchers have been overwhelmed by the explosion of research articles published by various research communities. Many scholarly websites, search engines, and digital libraries have been created to help researchers identify potential research topics and keep up with recent progress on research of interest. However, it is still difficult for researchers to keep track of research topic diffusion and evolution without spending a large amount of time reviewing numerous relevant and irrelevant articles. In this paper, we consider a novel topic diffusion discovery technique. Specifically, we propose using a Deep Non-negative Autoencoder with an information divergence measurement that monitors the evolutionary distance of the topic diffusion to understand how research topics change with time. The experimental results show that the proposed approach is able to identify the evolution of research topics as well as to discover topic diffusions in an online fashion.
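The topic-diffusion abstract above refers to an "information divergence measurement" between topic representations over time without naming the measure. As a hedged illustration only, the sketch below uses Kullback-Leibler divergence, one common information divergence, to score how far the topic mix of each time window has moved from the previous one. The choice of KL, the function names `kl_divergence` and `drift`, and the representation of a window as a list of non-negative topic weights are assumptions; the autoencoder that would produce such weights is not shown.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two non-negative weight vectors of equal length.

    Both vectors are normalised to sum to 1 first; `eps` guards against
    division by zero when q assigns no weight to a topic that p uses.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    p_sum, q_sum = sum(p), sum(q)
    total = 0.0
    for pi, qi in zip(p, q):
        pi, qi = pi / p_sum, qi / q_sum
        if pi > 0:  # terms with pi == 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total

def drift(windows):
    """Divergence of each window's topic mix from the preceding window."""
    return [kl_divergence(cur, prev) for prev, cur in zip(windows, windows[1:])]
```

A drift value near zero suggests a stable topic mix, while a spike marks a window whose topics differ sharply from the one before it; in an online setting, each new window needs only one extra comparison.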
Title : A New Emulation Platform for Real-time Machine Learning in Substance Use Data Streams
Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00054
Stefan A. Bruendl, Hua Fang, H. Ngo, E. Boyer, Honggang Wang
With 5G networks on the rise, it becomes more and more important to grant researchers access to tools that allow for development and experimentation in the field of 5G transmission. Healthcare can benefit greatly from these developments. In this paper a real-time transmission technique is described and tested that, if implemented, allows wearable devices to transmit multiple streams of data on various frequencies. These tests will be used to explain how this presented platform works, what drawbacks and benefits exist with the proposed scheme, and how to further develop the solution of real-time transmission of sensitive data, such as substance-use data, at higher frequencies.
Title : Mining Frequent Differences in File Collections
Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00058
S. Chawathe
Collections of textual files, or documents, with substantial inter-document similarities are common in diverse domains. A practically significant class of such similarities, and the dual differences, are well characterized by edit scripts, or colloquially diffs, that use a simple sequence model for documents. The study of such diffs provides valuable insights into the inter-document relationships within a collection and can guide data integration within and across collections. This paper describes a framework for such study that is based on frequently occurring inter-document differences. It motivates and defines a general problem of mining frequent differences and outlines some specific instances. It presents the design and implementation of a prototype system for interactively discovering and visualizing frequent differences. A notable feature of this method is its use of difference-components, or deltas, to bootstrap the discovery of interesting structure in file collections. The paper describes a preliminary experimental evaluation of the method and implementation on a widely used corpus of file-collections.
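The notion of counting recurring deltas across a file collection, as described in the abstract above, can be sketched with Python's standard `difflib` module. This is only an illustration under stated assumptions, not the paper's method: files are modelled as lists of lines, a delta is taken to be one non-equal opcode of a pairwise line diff, every pair of files is compared once in name order, and the names `deltas`, `frequent_deltas`, and `min_support` are hypothetical.

```python
import difflib
from collections import Counter
from itertools import combinations

def deltas(a_lines, b_lines):
    """Return the non-equal parts of the edit script from a_lines to b_lines.

    Each delta is a hashable tuple: (operation, removed lines, added lines).
    """
    matcher = difflib.SequenceMatcher(a=a_lines, b=b_lines, autojunk=False)
    found = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            found.append((tag, tuple(a_lines[i1:i2]), tuple(b_lines[j1:j2])))
    return found

def frequent_deltas(collection, min_support=2):
    """Count deltas over all file pairs and keep those seen in enough pairs.

    `collection` maps a file name to its list of lines. Pairs are diffed in
    sorted-name order, so a delta and its reverse are counted separately.
    """
    counts = Counter()
    for (_, a), (_, b) in combinations(sorted(collection.items()), 2):
        counts.update(set(deltas(a, b)))  # count each delta once per pair
    return {delta: n for delta, n in counts.items() if n >= min_support}
```

Comparing all pairs is quadratic in the number of files, so a sketch like this suits small collections; a larger corpus would need some way to limit which pairs are diffed.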
Title : An Empirical Analysis on the Usability and Security of Passwords
Pub Date : 2020-08-01 DOI: 10.1109/IRI49571.2020.00009
Kanwardeep Singh Walia, S. Shenoy, Yuan Cheng
Security and usability are two essential aspects of a system, but they usually move in opposite directions. Sometimes, to achieve security, usability has to be compromised, and vice versa. Password-based authentication systems require both security and usability. However, to increase password security, absurd rules are introduced, which often drive users to compromise the usability of their passwords. Users tend to forget complex passwords and use techniques such as writing them down, reusing them, and storing them in vulnerable ways. Enhancing the strength while maintaining the usability of a password has become one of the biggest challenges for users and security experts. In this paper, we define the pronounceability of a password as a means to measure how easy it is to memorize - an aspect we associate with usability. We examine a dataset of more than 7 million passwords to determine whether the user-generated passwords are secure. Moreover, we convert the user-generated passwords into phonemes and measure the pronounceability of the phoneme-based representations. We then establish a relationship between the two and suggest how password creation strategies can be adapted to better align with both security and usability.