Pub Date : 2017-10-30DOI: 10.1109/ASE.2017.8115646
Thomas Rupprecht, Xi Chen, D. H. White, Jan H. Boockmann, Gerald Lüttgen, H. Bos
Reverse engineering binary code is notoriously difficult and, especially, understanding a binary's dynamic data structures. Existing data structure analyzers are limited wrt. program comprehension: they do not detect complex structures such as skip lists, or lists running through nodes of different types such as in the Linux kernel's cyclic doubly-linked list. They also do not reveal complex parent-child relationships between structures. The tool DSI remedies these shortcomings but requires source code, where type information on heap nodes is available. We present DSIbin, a combination of DSI and the type excavator Howard for the inspection of C/C++ binaries. While a naive combination already improves upon related work, its precision is limited because Howard's inferred types are often too coarse. To address this we auto-generate candidates of refined types based on speculative nested-struct detection and type merging; the plausibility of these hypotheses is then validated by DSI. We demonstrate via benchmarking that DSIbin detects data structures with high precision.
{"title":"DSIbin: Identifying dynamic data structures in C/C++ binaries","authors":"Thomas Rupprecht, Xi Chen, D. H. White, Jan H. Boockmann, Gerald Lüttgen, H. Bos","doi":"10.1109/ASE.2017.8115646","DOIUrl":"https://doi.org/10.1109/ASE.2017.8115646","url":null,"abstract":"Reverse engineering binary code is notoriously difficult and, especially, understanding a binary's dynamic data structures. Existing data structure analyzers are limited wrt. program comprehension: they do not detect complex structures such as skip lists, or lists running through nodes of different types such as in the Linux kernel's cyclic doubly-linked list. They also do not reveal complex parent-child relationships between structures. The tool DSI remedies these shortcomings but requires source code, where type information on heap nodes is available. We present DSIbin, a combination of DSI and the type excavator Howard for the inspection of C/C++ binaries. While a naive combination already improves upon related work, its precision is limited because Howard's inferred types are often too coarse. To address this we auto-generate candidates of refined types based on speculative nested-struct detection and type merging; the plausibility of these hypotheses is then validated by DSI. We demonstrate via benchmarking that DSIbin detects data structures with high precision.","PeriodicalId":382876,"journal":{"name":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121437749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-10-30DOI: 10.1109/ASE.2017.8115662
A. D. Franco, Hui Guo, Cindy Rubio-González
Numerical software is used in a wide variety of applications including safety-critical systems, which have stringent correctness requirements, and whose failures have catastrophic consequences that endanger human life. Numerical bugs are known to be particularly difficult to diagnose and fix, largely due to the use of approximate representations of numbers such as floating point. Understanding the characteristics of numerical bugs is the first step to combat them more effectively. In this paper, we present the first comprehensive study of real-world numerical bugs. Specifically, we identify and carefully examine 269 numerical bugs from five widely-used numerical software libraries: NumPy, SciPy, LAPACK, GNU Scientific Library, and Elemental. We propose a categorization of numerical bugs, and discuss their frequency, symptoms and fixes. Our study opens new directions in the areas of program analysis, testing, and automated program repair of numerical software, and provides a collection of real-world numerical bugs.
{"title":"A comprehensive study of real-world numerical bug characteristics","authors":"A. D. Franco, Hui Guo, Cindy Rubio-González","doi":"10.1109/ASE.2017.8115662","DOIUrl":"https://doi.org/10.1109/ASE.2017.8115662","url":null,"abstract":"Numerical software is used in a wide variety of applications including safety-critical systems, which have stringent correctness requirements, and whose failures have catastrophic consequences that endanger human life. Numerical bugs are known to be particularly difficult to diagnose and fix, largely due to the use of approximate representations of numbers such as floating point. Understanding the characteristics of numerical bugs is the first step to combat them more effectively. In this paper, we present the first comprehensive study of real-world numerical bugs. Specifically, we identify and carefully examine 269 numerical bugs from five widely-used numerical software libraries: NumPy, SciPy, LAPACK, GNU Scientific Library, and Elemental. We propose a categorization of numerical bugs, and discuss their frequency, symptoms and fixes. Our study opens new directions in the areas of program analysis, testing, and automated program repair of numerical software, and provides a collection of real-world numerical bugs.","PeriodicalId":382876,"journal":{"name":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115860038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-10-30DOI: 10.1109/ASE.2017.8115712
Lakhdar Meftah, María Gómez, Romain Rouvoy, Isabelle Chrisment
WiFi P2P allows mobile apps to connect to each other via WiFi without an intermediate access point. This communication mode is widely used by mobile apps to support interactions with one or more devices simultaneously. However, testing such P2P apps remains a challenge for app developers as i) existing testing frameworks lack support for WiFi P2P, and ii) WiFi P2P testing fails to scale when considering a deployment on more than two devices. In this paper, we therefore propose an acceptance testing framework, named Androfleet, to automate testing of WiFi P2P mobile apps at scale. Beyond the capability of testing point-to-point interactions under various conditions, An-drofleet supports the deployment and the emulation of a fleet of mobile devices as part of an alpha testing phase in order to assess the robustness of a WiFi P2P app once deployed in the field. To validate Androfleet, we demonstrate the detection of failing black-box acceptance tests for WiFi P2P apps and we capture the conditions under which such a mobile app can correctly work in the field. The demo video of Androfleet is made available from https://youtu.be/gJ5_Ed7XL04.
{"title":"ANDROFLEET: Testing WiFi peer-to-peer mobile apps in the large","authors":"Lakhdar Meftah, María Gómez, Romain Rouvoy, Isabelle Chrisment","doi":"10.1109/ASE.2017.8115712","DOIUrl":"https://doi.org/10.1109/ASE.2017.8115712","url":null,"abstract":"WiFi P2P allows mobile apps to connect to each other via WiFi without an intermediate access point. This communication mode is widely used by mobile apps to support interactions with one or more devices simultaneously. However, testing such P2P apps remains a challenge for app developers as i) existing testing frameworks lack support for WiFi P2P, and ii) WiFi P2P testing fails to scale when considering a deployment on more than two devices. In this paper, we therefore propose an acceptance testing framework, named Androfleet, to automate testing of WiFi P2P mobile apps at scale. Beyond the capability of testing point-to-point interactions under various conditions, An-drofleet supports the deployment and the emulation of a fleet of mobile devices as part of an alpha testing phase in order to assess the robustness of a WiFi P2P app once deployed in the field. To validate Androfleet, we demonstrate the detection of failing black-box acceptance tests for WiFi P2P apps and we capture the conditions under which such a mobile app can correctly work in the field. The demo video of Androfleet is made available from https://youtu.be/gJ5_Ed7XL04.","PeriodicalId":382876,"journal":{"name":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116315463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-10-30DOI: 10.1109/ASE.2017.8115664
Yoshiki Higo, Akio Ohtani, S. Kusumoto
In software development, there are many situations in which developers need to understand given source code changes in detail. Until now, a variety of techniques have been proposed to support understanding source code changes. Tree-based differencing techniques are expected to have better understandability than text-based ones, which are widely used nowadays (e.g., diff in Unix). In this paper, we propose to consider copy-and-paste as a kind of editing action forming tree-based edit script, which is an editing sequence that transforms a tree to another one. Software developers often perform copy- and-paste when they are writing source code. Introducing copy- and-paste action into edit script contributes to not only making simpler (more easily understandable) edit scripts but also making edit scripts closer to developers' actual editing sequences. We conducted experiments on an open dataset. As a result, we confirmed that our technique made edit scripts shorter for 18% of the code changes with a little more computational time. For the other 82% code changes, our technique generated the same edit scripts as an existing technique. We also confirmed that our technique provided more helpful visualizations.
{"title":"Generating simpler AST edit scripts by considering copy-and-paste","authors":"Yoshiki Higo, Akio Ohtani, S. Kusumoto","doi":"10.1109/ASE.2017.8115664","DOIUrl":"https://doi.org/10.1109/ASE.2017.8115664","url":null,"abstract":"In software development, there are many situations in which developers need to understand given source code changes in detail. Until now, a variety of techniques have been proposed to support understanding source code changes. Tree-based differencing techniques are expected to have better understandability than text-based ones, which are widely used nowadays (e.g., diff in Unix). In this paper, we propose to consider copy-and-paste as a kind of editing action forming tree-based edit script, which is an editing sequence that transforms a tree to another one. Software developers often perform copy- and-paste when they are writing source code. Introducing copy- and-paste action into edit script contributes to not only making simpler (more easily understandable) edit scripts but also making edit scripts closer to developers' actual editing sequences. We conducted experiments on an open dataset. As a result, we confirmed that our technique made edit scripts shorter for 18% of the code changes with a little more computational time. For the other 82% code changes, our technique generated the same edit scripts as an existing technique. We also confirmed that our technique provided more helpful visualizations.","PeriodicalId":382876,"journal":{"name":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"623 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123325745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-10-30DOI: 10.1109/ASE.2017.8115629
Gias Uddin, Foutse Khomh
With the proliferation of online developer forums as informal documentation, developers often share their opinions about the APIs they use. However, given the plethora of opinions available for an API in various online developer forums, it can be challenging for a developer to make informed decisions about the APIs. While automatic summarization of opinions have been explored for other domains (e.g., cameras, cars), we found little research that investigates the benefits of summaries of public API reviews. In this paper, we present two algorithms (statistical and aspect-based) to summarize opinions about APIs. To investigate the usefulness of the techniques, we developed, Opiner, an online opinion summarization engine that presents summaries of opinions using both our proposed techniques and existing six off-the-shelf techniques. We investigated the usefulness of Opiner using two case studies, both involving professional software engineers. We found that developers were interested to use our proposed summaries much more frequently than other summaries (daily vs once a year) and that while combined with Stack Overflow, Opiner helped developers to make the right decision with more accuracy and confidence and in less time.
{"title":"Automatic summarization of API reviews","authors":"Gias Uddin, Foutse Khomh","doi":"10.1109/ASE.2017.8115629","DOIUrl":"https://doi.org/10.1109/ASE.2017.8115629","url":null,"abstract":"With the proliferation of online developer forums as informal documentation, developers often share their opinions about the APIs they use. However, given the plethora of opinions available for an API in various online developer forums, it can be challenging for a developer to make informed decisions about the APIs. While automatic summarization of opinions have been explored for other domains (e.g., cameras, cars), we found little research that investigates the benefits of summaries of public API reviews. In this paper, we present two algorithms (statistical and aspect-based) to summarize opinions about APIs. To investigate the usefulness of the techniques, we developed, Opiner, an online opinion summarization engine that presents summaries of opinions using both our proposed techniques and existing six off-the-shelf techniques. We investigated the usefulness of Opiner using two case studies, both involving professional software engineers. We found that developers were interested to use our proposed summaries much more frequently than other summaries (daily vs once a year) and that while combined with Stack Overflow, Opiner helped developers to make the right decision with more accuracy and confidence and in less time.","PeriodicalId":382876,"journal":{"name":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121205398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-10-30DOI: 10.1109/ASE.2017.8115659
Mona Rahimi, Wandi Xiong, J. Cleland-Huang, R. Lutz
Problems with the correctness and completeness of environmental assumptions contribute to many accidents in safety-critical systems. The problem is exacerbated when products are modified in new releases or in new products of a product line. In such cases existing sets of environmental assumptions are often carried forward without sufficiently rigorous analysis. This paper describes a new technique that exploits the traceability required by many certifying bodies to reason about the likelihood that environmental assumptions are omitted or incorrectly retained in new products. An analysis of over 150 examples of environmental assumptions in historical systems informs the approach. In an evaluation on three safety-related product lines the approach caught all but one of the assumption-related problems. It also provided clearly defined steps for mitigating the identified issues. The contribution of the work is to arm the safety analyst with useful information for assessing the validity of environmental assumptions for a new product.
{"title":"Diagnosing assumption problems in safety-critical products","authors":"Mona Rahimi, Wandi Xiong, J. Cleland-Huang, R. Lutz","doi":"10.1109/ASE.2017.8115659","DOIUrl":"https://doi.org/10.1109/ASE.2017.8115659","url":null,"abstract":"Problems with the correctness and completeness of environmental assumptions contribute to many accidents in safety-critical systems. The problem is exacerbated when products are modified in new releases or in new products of a product line. In such cases existing sets of environmental assumptions are often carried forward without sufficiently rigorous analysis. This paper describes a new technique that exploits the traceability required by many certifying bodies to reason about the likelihood that environmental assumptions are omitted or incorrectly retained in new products. An analysis of over 150 examples of environmental assumptions in historical systems informs the approach. In an evaluation on three safety-related product lines the approach caught all but one of the assumption-related problems. It also provided clearly defined steps for mitigating the identified issues. The contribution of the work is to arm the safety analyst with useful information for assessing the validity of environmental assumptions for a new product.","PeriodicalId":382876,"journal":{"name":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125155701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-10-30DOI: 10.1109/ASE.2017.8115694
Shayan Zamanirad, B. Benatallah, M. C. Barukh, F. Casati, Carlos Rodríguez
At present, bots are still in their preliminary stages of development. Many are relatively simple, or developed ad-hoc for a very specific use-case. For this reason, they are typically programmed manually, or utilize machine-learning classifiers to interpret a fixed set of user utterances. In reality, real world conversations with humans require support for dynamically capturing users expressions. Moreover, bots will derive immeasurable value by programming them to invoke APIs for their results. Today, within the Web and Mobile development community, complex applications are being stringed together with a few lines of code — all made possible by APIs. Yet, developers today are not as empowered to program bots in much the same way. To overcome this, we introduce BotBase, a bot programming platform that dynamically synthesizes natural language user expressions into API invocations. Our solution is two faceted: Firstly, we construct an API knowledge graph to encode and evolve APIs; secondly, leveraging the above we apply techniques in NLP, ML and Entity Recognition to perform the required synthesis from natural language user expressions into API calls.
{"title":"Programming bots by synthesizing natural language expressions into API invocations","authors":"Shayan Zamanirad, B. Benatallah, M. C. Barukh, F. Casati, Carlos Rodríguez","doi":"10.1109/ASE.2017.8115694","DOIUrl":"https://doi.org/10.1109/ASE.2017.8115694","url":null,"abstract":"At present, bots are still in their preliminary stages of development. Many are relatively simple, or developed ad-hoc for a very specific use-case. For this reason, they are typically programmed manually, or utilize machine-learning classifiers to interpret a fixed set of user utterances. In reality, real world conversations with humans require support for dynamically capturing users expressions. Moreover, bots will derive immeasurable value by programming them to invoke APIs for their results. Today, within the Web and Mobile development community, complex applications are being stringed together with a few lines of code — all made possible by APIs. Yet, developers today are not as empowered to program bots in much the same way. To overcome this, we introduce BotBase, a bot programming platform that dynamically synthesizes natural language user expressions into API invocations. Our solution is two faceted: Firstly, we construct an API knowledge graph to encode and evolve APIs; secondly, leveraging the above we apply techniques in NLP, ML and Entity Recognition to perform the required synthesis from natural language user expressions into API calls.","PeriodicalId":382876,"journal":{"name":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125383954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-10-30DOI: 10.1109/ASE.2017.8115696
Lin Cheng, Z. Yang, Chao Wang
Graphic user interface (GUI) is an integral part of many software applications. However, GUI testing remains a challenging task. The main problem is to generate a set of high-quality test cases, i.e., sequences of user events to cover the often large input space. Since manually crafting event sequences is labor-intensive and automated testing tools often have poor performance, we propose a new GUI testing framework to efficiently generate progressively longer event sequences while avoiding redundant sequences. Our technique for identifying the redundancy among these sequences relies on statically checking a set of simple and syntactic-level conditions, whose reduction power matches and sometimes exceeds that of classic techniques based on partial order reduction. We have evaluated our method on 17 Java Swing applications. Our experimental results show the new technique, while being sound and systematic, can achieve more than 10X reduction in the number of test sequences compared to the state-of-the-art GUI testing tools.
{"title":"Systematic reduction of GUI test sequences","authors":"Lin Cheng, Z. Yang, Chao Wang","doi":"10.1109/ASE.2017.8115696","DOIUrl":"https://doi.org/10.1109/ASE.2017.8115696","url":null,"abstract":"Graphic user interface (GUI) is an integral part of many software applications. However, GUI testing remains a challenging task. The main problem is to generate a set of high-quality test cases, i.e., sequences of user events to cover the often large input space. Since manually crafting event sequences is labor-intensive and automated testing tools often have poor performance, we propose a new GUI testing framework to efficiently generate progressively longer event sequences while avoiding redundant sequences. Our technique for identifying the redundancy among these sequences relies on statically checking a set of simple and syntactic-level conditions, whose reduction power matches and sometimes exceeds that of classic techniques based on partial order reduction. We have evaluated our method on 17 Java Swing applications. Our experimental results show the new technique, while being sound and systematic, can achieve more than 10X reduction in the number of test sequences compared to the state-of-the-art GUI testing tools.","PeriodicalId":382876,"journal":{"name":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129033395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-10-30DOI: 10.1109/ASE.2017.8115628
Yuan Tian, Ferdian Thung, Abhishek Sharma, D. Lo
As the carrier of Application Programming Interfaces (APIs) knowledge, API documentation plays a crucial role in how developers learn and use an API. It is also a valuable information resource for answering API-related questions, especially when developers cannot find reliable answers to their questions online/offline. However, finding answers to API-related questions from API documentation might not be easy because one may have to manually go through multiple pages before reaching the relevant page, and then read and understand the information inside the relevant page to figure out the answers. To deal with this challenge, we develop APIBot, a bot that can answer API questions given API documentation as an input. APIBot is built on top of SiriusQA, the QA system from Sirius, a state of the art intelligent personal assistant. To make SiriusQA work well under software engineering scenario, we make several modifications over SiriusQA by injecting domain specific knowledge. We evaluate APIBot on 92 API questions, answers of which are known to be present in Java 8 documentation. Our experiment shows that APIBot can achieve a Hit@5 score of 0.706.
{"title":"APIBot: Question answering bot for API documentation","authors":"Yuan Tian, Ferdian Thung, Abhishek Sharma, D. Lo","doi":"10.1109/ASE.2017.8115628","DOIUrl":"https://doi.org/10.1109/ASE.2017.8115628","url":null,"abstract":"As the carrier of Application Programming Interfaces (APIs) knowledge, API documentation plays a crucial role in how developers learn and use an API. It is also a valuable information resource for answering API-related questions, especially when developers cannot find reliable answers to their questions online/offline. However, finding answers to API-related questions from API documentation might not be easy because one may have to manually go through multiple pages before reaching the relevant page, and then read and understand the information inside the relevant page to figure out the answers. To deal with this challenge, we develop APIBot, a bot that can answer API questions given API documentation as an input. APIBot is built on top of SiriusQA, the QA system from Sirius, a state of the art intelligent personal assistant. To make SiriusQA work well under software engineering scenario, we make several modifications over SiriusQA by injecting domain specific knowledge. We evaluate APIBot on 92 API questions, answers of which are known to be present in Java 8 documentation. Our experiment shows that APIBot can achieve a Hit@5 score of 0.706.","PeriodicalId":382876,"journal":{"name":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129293115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-10-30DOI: 10.1109/ASE.2017.8115726
M. Guerriero
The rise of Big Data is leading to an increasing demand for data-intensive applications (DIAs), which, in many cases, are expected to process massive amounts of sensitive data. In this context, ensuring data privacy becomes paramount. While the way we design and develop DIAs has radically changed over the last few years in order to deal with Big Data, there has been relatively little effort to make such design privacy-aware. As a result, enforcing privacy policies in large-scale data processing is currently an open research problem. This thesis proposal makes one step towards this investigation: after identifying the dataflow model as the reference computational model for large-scale DIAs, (1) we propose a novel language for specifying privacy policies on dataflow applications along with (2) a dataflow rewriting mechanism to enforce such policies during DIA execution. Although a systematic evaluation still needs to be carried out, preliminary results are promising. We plan to implement our approach within a model-driven solution to ultimately simplify the design and development of privacy-aware DIAs, i.e. DIAs that ensure privacy policies at runtime.
{"title":"Privacy-aware data-intensive applications","authors":"M. Guerriero","doi":"10.1109/ASE.2017.8115726","DOIUrl":"https://doi.org/10.1109/ASE.2017.8115726","url":null,"abstract":"The rise of Big Data is leading to an increasing demand for data-intensive applications (DIAs), which, in many cases, are expected to process massive amounts of sensitive data. In this context, ensuring data privacy becomes paramount. While the way we design and develop DIAs has radically changed over the last few years in order to deal with Big Data, there has been relatively little effort to make such design privacy-aware. As a result, enforcing privacy policies in large-scale data processing is currently an open research problem. This thesis proposal makes one step towards this investigation: after identifying the dataflow model as the reference computational model for large-scale DIAs, (1) we propose a novel language for specifying privacy policies on dataflow applications along with (2) a dataflow rewriting mechanism to enforce such policies during DIA execution. Although a systematic evaluation still needs to be carried out, preliminary results are promising. We plan to implement our approach within a model-driven solution to ultimately simplify the design and development of privacy-aware DIAs, i.e. DIAs that ensure privacy policies at runtime.","PeriodicalId":382876,"journal":{"name":"2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114606503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}