Manasi S. Patwardhan, A. Sainani, Richa Sharma, S. Karande, S. Ghaisas
{"title":"Towards Automating Disambiguation of Regulations: Using the Wisdom of Crowds","authors":"Manasi S. Patwardhan, A. Sainani, Richa Sharma, S. Karande, S. Ghaisas","doi":"10.1145/3238147.3240727","DOIUrl":null,"url":null,"abstract":"Compliant software is a critical need of all modern businesses. Disambiguating regulations to derive requirements is therefore an important software engineering activity. Regulations however are ridden with ambiguities that make their comprehension a challenge, seemingly surmountable only by legal experts. Since legal experts' involvement in every project is expensive, approaches to automate the disambiguation need to be explored. These approaches however require a large amount of annotated data. Collecting data exclusively from experts is not a scalable and affordable solution. In this paper, we present the results of a crowd sourcing experiment to collect annotations on ambiguities in regulations from professional software engineers. We discuss an approach to automate the arduous and critical step of identifying ground truth labels by employing crowd consensus using Expectation Maximization (EM). We demonstrate that the annotations reaching a consensus match those of experts with an accuracy of 87%.","PeriodicalId":6622,"journal":{"name":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"4 1","pages":"850-855"},"PeriodicalIF":0.0000,"publicationDate":"2018-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3238147.3240727","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5
Abstract
Compliant software is a critical need of all modern businesses. Disambiguating regulations to derive requirements is therefore an important software engineering activity. Regulations however are ridden with ambiguities that make their comprehension a challenge, seemingly surmountable only by legal experts. Since legal experts' involvement in every project is expensive, approaches to automate the disambiguation need to be explored. These approaches however require a large amount of annotated data. Collecting data exclusively from experts is not a scalable and affordable solution. In this paper, we present the results of a crowd sourcing experiment to collect annotations on ambiguities in regulations from professional software engineers. We discuss an approach to automate the arduous and critical step of identifying ground truth labels by employing crowd consensus using Expectation Maximization (EM). We demonstrate that the annotations reaching a consensus match those of experts with an accuracy of 87%.