Mohamed A. Oumaziz, Jean-Rémy Falleri, Xavier Blanc, Tegawendé F. Bissyandé, Jacques Klein
Handling Duplicates in Dockerfiles Families: Learning from Experts
2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), published 2019-09-01
DOI: 10.1109/ICSME.2019.00086 (https://doi.org/10.1109/ICSME.2019.00086)
Citations: 7
Abstract
Docker is becoming a popular tool that developers and end-users use to deploy and run software applications. Dockerfiles now belong to software projects like any other software artefact, such as source code or configuration files. Many projects are even starting to maintain families of Dockerfiles rather than a single Dockerfile; the Python project, for example, simultaneously maintains a family of 43 Dockerfiles (for specific versions and dependencies). In this paper, we investigate whether the traditional maintenance challenge of handling duplicates arises in such projects, since this challenge is classical in software development, even for non-code software artefacts. Our goal is to give practitioners a clear explanation of why duplicates arise in projects and of the different means of handling them, with their pros and cons. To do so, we observe the practices of expert Dockerfile maintainers in Official Docker projects (128 projects) and survey 25 maintainers from our corpus. We show that duplicates in Dockerfiles are frequent in our corpus, and that developers are aware of them, face them frequently, and are split in their opinion of them (error-prone, but easy to maintain with the right tools). Finally, we show that some maintainers manage to limit duplicates by using ad-hoc tools. These tools, while sometimes hard to set up, can reduce the amount of duplicates by up to 85%.