Accelerating Scientific Discovery with SCAIGATE Science Gateway
C. Jiang, David Ojika, Bhavesh Patel, A. Gordon-Ross, H. Lam
Pub Date: 2019-09-01 | DOI: 10.1109/eScience.2019.00085
The demand for computational accelerators (GPUs, FPGAs, ASICs, etc.) is growing due to the widening variety of datacenter applications fueled by recent scientific breakthroughs that leverage artificial intelligence (AI). Although these applications (e.g., cosmology, physics) have continued to achieve record-breaking predictive accuracy thanks to AI's widespread influence, the infrastructure and workflows needed to take them out of research labs and into production and business use cases continue to lag. To address these infrastructural challenges, we present SCAIGATE, a prototype science gateway with a simplified workflow aimed at facilitating model building and validation in large-scale scientific applications.
EDISON Data Science Framework (EDSF) Extension to Address Transversal Skills Required by Emerging Industry 4.0 Transformation
Y. Demchenko, T. Wiktorski, J. Cuadrado-Gallego, Steve Brewer
Pub Date: 2019-09-01 | DOI: 10.1109/eScience.2019.00076
The emerging data-driven economy (also called Industry 4.0, or simply 4IR), encompassing industry, research and business, requires new types of specialists able to support all stages of the data lifecycle, from data production and input, through data processing, to the delivery of actionable results, visualisation and reporting; these can be collectively defined as the Data Science family of professions. Data Science as a research and academic discipline provides a basis for Data Analytics and ML/AI applications. The education and training of the data-related professions must reflect the full multi-disciplinary set of knowledge and competences required of Data Science and data-handling practitioners in modern, data-driven research and the digital economy. In an era of ever faster technology change, matched by strong demand for skills, Data Science education and training programmes should be customizable and deliverable in multiple forms, tailored to different categories of professional roles and profiles. Building on the authors' other publications on customizable and interoperable Data Science curricula for different types of learners and target application domains, this paper focuses on defining the set of transversal competences and skills required of current and future Data Science professionals. These include workplace and professional skills covering the critical thinking, problem solving and creativity needed to work in highly automated and dynamic environments. The proposed approach is based on the EDISON Data Science Framework (EDSF), initially developed within the EU-funded EDISON project and currently being further developed in the EU-funded MATES and FAIRsFAIR projects.
Support for HTCondor high-Throughput Computing Workflows in the REANA Reusable Analysis Platform
Rokas Maciulaitis, T. Simko, P. Brenner, Scott S. Hampton, M. Hildreth, K. H. Anampa, Irena Johnson, Cody Kankel, Jan Okraska, D. Rodríguez
Pub Date: 2019-09-01 | DOI: 10.1109/eScience.2019.00091
REANA is a reusable and reproducible data analysis platform that allows researchers to structure their analysis pipelines and run them on remote containerised compute clouds. REANA supports several workflow systems (CWL, Serial, Yadage) and uses Kubernetes as its job execution backend. We have designed an abstract job execution component that extends the platform's job execution capabilities to support multiple compute backends. We have tested this component with HTCondor and verified the scalability of the designed solution. The results show that the REANA platform can support hybrid scientific workflows in which different parts of an analysis pipeline are executed on different computing backends.
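The abstract job execution component described above amounts to a backend-agnostic interface that a workflow engine programs against. The following Python sketch illustrates the general pattern only; the class and method names (`JobBackend`, `submit`, `status`) are assumptions for illustration, not REANA's actual API.

```python
from abc import ABC, abstractmethod


class JobBackend(ABC):
    """Common interface every compute backend implements
    (hypothetical names; REANA's real component differs)."""

    @abstractmethod
    def submit(self, image: str, command: str) -> str:
        """Submit a containerised job and return a backend job id."""

    @abstractmethod
    def status(self, job_id: str) -> str:
        """Return 'running', 'finished' or 'failed'."""


class KubernetesBackend(JobBackend):
    def submit(self, image, command):
        # A real implementation would create a Kubernetes Job object here.
        return f"k8s-{hash((image, command)) & 0xffff}"

    def status(self, job_id):
        return "finished"


class HTCondorBackend(JobBackend):
    def submit(self, image, command):
        # A real implementation would write a submit file and call condor_submit.
        return f"condor-{hash((image, command)) & 0xffff}"

    def status(self, job_id):
        return "finished"


def run_step(backend: JobBackend, image: str, command: str) -> str:
    """The workflow engine talks only to the abstract interface,
    so each step can be dispatched to any backend."""
    job_id = backend.submit(image, command)
    while backend.status(job_id) == "running":
        pass  # poll until the job leaves the running state
    return backend.status(job_id)
```

A hybrid workflow then simply passes a different `JobBackend` instance for different pipeline stages.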
Modeling and Matching Digital Data Marketplace Policies
Sara Shakeri, Valentina Maccatrozzo, L. Veen, R. Bakhshi, L. Gommans, C. D. Laat, P. Grosso
Pub Date: 2019-09-01 | DOI: 10.1109/eScience.2019.00078
Digital Data Marketplaces (DDMs) have recently been gaining wide attention as platforms for sharing data among different organizations, since sharing information and participating in research collaborations play an important role in addressing many scientific challenges. To increase trust among participating organizations, contracts and agreements must be established that determine the regulations and policies governing who has access to what. Describing these agreements in a general model applicable across different DDMs is therefore of utmost importance. In this paper, we present a semantic model for describing access policies by means of semantic web technologies. In particular, we use and extend the Open Digital Rights Language (ODRL) to describe the pre-established agreements in a DDM.
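As a rough illustration of the kind of matching such a model enables, the sketch below checks a data-access request against a list of ODRL-style permissions. The dictionary layout and field names are simplifying assumptions for illustration; they do not reproduce the paper's model or the full ODRL information model.

```python
# ODRL-style agreement: each permission names an assigner, an
# assignee, an action, and a target asset (illustrative layout only).
policy = {
    "permissions": [
        {"assigner": "OrgA", "assignee": "OrgB",
         "action": "read", "target": "dataset-1"},
        {"assigner": "OrgA", "assignee": "OrgB",
         "action": "aggregate", "target": "dataset-1"},
    ]
}


def is_permitted(policy: dict, assignee: str, action: str, target: str) -> bool:
    """Return True if some permission in the agreement covers the request."""
    return any(
        p["assignee"] == assignee
        and p["action"] == action
        and p["target"] == target
        for p in policy["permissions"]
    )
```

In an actual semantic-web setting the same check would be phrased as a SPARQL query over an RDF graph rather than a scan over Python dictionaries.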
SATVAM: Toward an IoT Cyber-Infrastructure for Low-Cost Urban Air Quality Monitoring
Yogesh L. Simmhan, M. Hegde, Rajesh Zele, S. Tripathi, S. Nair, S. Monga, R. Sahu, Kuldeep Dixit, R. Sutaria, Brijesh Mishra, Anamika Sharma, A. Svr
Pub Date: 2019-09-01 | DOI: 10.1109/eScience.2019.00014
Air pollution is a public health emergency in large cities. The availability of commodity sensors and the advent of the Internet of Things (IoT) enable the deployment of city-wide networks of thousands of low-cost, real-time air quality monitors to help manage this challenge. Such networks need to be supported by an IoT cyber-infrastructure for reliable and scalable data acquisition from the edge to the Cloud. The low accuracy of these sensors also motivates data-driven calibration models that can accurately predict the science variables from the raw sensor signals. Here, we report our experiences designing and deploying such an IoT software platform and calibration models, and validate them through a pilot field deployment in two mega-cities, Delhi and Mumbai. Our edge data service is able to even out the differential bandwidths between the sensing devices and the Cloud repository, and to recover from transient failures. Our analytical models reduce sensor errors from a best case of 63% with the factory baseline to as low as 21%, substantially advancing the state of the art in this domain.
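The calibration idea — learning to predict a reference ("science") value from raw low-cost sensor channels — can be sketched as an ordinary least-squares fit against a co-located reference monitor. This is a generic illustration on synthetic data; the paper's actual models, features and error metrics are not specified here.

```python
import numpy as np

# Synthetic stand-in for field data: three raw sensor channels
# (e.g. gas voltage, temperature, humidity) and a co-located
# reference monitor providing the ground-truth pollutant value.
rng = np.random.default_rng(0)
raw = rng.uniform(0, 1, size=(200, 3))
true_coef = np.array([40.0, -5.0, 3.0])
reference = raw @ true_coef + 10.0 + rng.normal(0, 0.5, 200)

# Fit y = X w + b via least squares on the augmented matrix [X | 1].
X = np.hstack([raw, np.ones((200, 1))])
w, *_ = np.linalg.lstsq(X, reference, rcond=None)

# Calibrated predictions and mean relative error against the reference.
calibrated = X @ w
rel_err = np.mean(np.abs(calibrated - reference) / reference)
```

Richer models (nonlinear regressors, per-site terms) follow the same shape: raw channels in, reference-calibrated science variable out.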
Towards a Computer-Interpretable Actionable Formal Model to Encode Data Governance Rules
Rui Zhao, M. Atkinson
Pub Date: 2019-09-01 | DOI: 10.1109/eScience.2019.00082
Driven by the needs of science and business, data sharing and re-use have become intensive activities in many areas. In many cases, governance imposes rules concerning data use, but no existing computational technique helps data users comply with such rules. We argue that intelligent systems can improve this situation by recording provenance during processing, encoding the rules, and performing reasoning over them. We present our initial work, designing formal models for data rules and flow rules together with a reasoning system, as a first step towards helping data providers and data users sustain productive relationships.
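A toy illustration of the intended mechanism — checking recorded provenance events against encoded data-use rules — might look as follows. The record fields and the rule encoding are invented for illustration and are not the paper's formal models.

```python
# Provenance events recorded during processing (fields are
# illustrative assumptions, not the paper's model).
provenance = [
    {"actor": "analyst", "action": "aggregate", "dataset": "patients"},
    {"actor": "analyst", "action": "export", "dataset": "patients"},
]

# Each rule is a predicate that flags a provenance event it forbids;
# here: raw patient data may not leave the processing environment.
rules = [
    lambda ev: ev["action"] == "export" and ev["dataset"] == "patients",
]


def violations(provenance, rules):
    """Return the provenance events that break at least one rule."""
    return [ev for ev in provenance if any(rule(ev) for rule in rules)]
```

A full reasoning system would also track how rules propagate along data flows (the paper's "flow rules"), rather than checking events one at a time.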
Timing is Everything: Identifying Diverse Interaction Dynamics in Scenario and Non-Scenario Meetings
Chreston A. Miller, Christa Miller
Pub Date: 2019-09-01 | DOI: 10.1109/eScience.2019.00029
In this paper we explore the use of temporal patterns to characterise the interaction dynamics of different kinds of meetings. Meetings occur on a daily basis and exhibit different behavioral dynamics between participants, such as floor shifts and intense dialog. These dynamics tell a story of the meeting and provide insight into how participants interact. We focus our investigation on defining diversity metrics to compare the interaction dynamics of scenario and non-scenario meetings; these metrics can provide insight into the similarities and differences between the two. We observe that certain interaction dynamics can be identified through temporal patterns of speech intervals, i.e., when a participant is talking. We apply the principles of Parallel Episodes to identify moments of speech overlap, e.g., interaction "bursts", and introduce Situated Data Mining, an approach for identifying repeated behavior patterns based on situated context. Applying these algorithms provides an overview of certain meeting dynamics and defines metrics for meeting comparison and diversity of interaction. We tested our approach on a subset of the AMI corpus and developed three diversity metrics to describe similarities and differences between meetings. These metrics also give the researcher an overview of interaction dynamics and highlight points of interest for analysis.
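The core primitive — finding moments when two participants' speech intervals overlap, and flagging sustained overlaps as interaction "bursts" — can be sketched as follows. The interval representation and the burst threshold are illustrative assumptions, not the Parallel Episodes algorithm itself.

```python
def overlaps(intervals_a, intervals_b):
    """Return the time spans (start, end) where two speakers'
    talk intervals overlap. Intervals are (start, end) pairs in seconds."""
    out = []
    for a_start, a_end in intervals_a:
        for b_start, b_end in intervals_b:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:
                out.append((start, end))
    return sorted(out)


def burst_count(intervals_a, intervals_b, min_len=1.0):
    """Count overlaps lasting at least min_len seconds: a crude
    marker of intense, competitive dialog between two participants."""
    return sum(1 for s, e in overlaps(intervals_a, intervals_b)
               if e - s >= min_len)
```

Counting and timing such bursts per speaker pair is one way a diversity metric over meetings could be built up.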
Toward an Elastic Data Transfer Infrastructure
Joaquín Chung, Zhengchun Liu, R. Kettimuthu, Ian T Foster
Pub Date: 2019-09-01 | DOI: 10.1109/eScience.2019.00036
Data transfer over wide area networks is an integral part of many science workflows that must, for example, move data from scientific facilities to remote resources for analysis, sharing, and storage. Yet despite continued enhancements in data transfer infrastructure (DTI), our previous analyses of approximately 40 billion GridFTP command logs collected over four years from the Globus transfer service show that data transfer nodes (DTNs) are idle (i.e., performing no transfers) 94.3% of the time. On the other hand, we have also observed periods in which CPU resource scarcity negatively impacts DTN throughput. Motivated by the opportunity to optimize DTI performance, we present an elastic DTI architecture in which the pool of nodes allocated to DTN activities expands and shrinks over time based on demand. Our results show that this elastic DTI can save up to ~95% of resources compared with a typical static DTN deployment, while the median slowdown incurred remains close to one for most of the evaluated scenarios.
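A pool that "expands and shrinks over time based on demand" can be approximated by a simple scaling rule evaluated once per scheduling interval. This sketch is a generic illustration of the idea; the per-node capacity, bounds and one-node step are assumptions, not the paper's actual policy.

```python
def next_pool_size(current_nodes, active_transfers,
                   per_node_capacity=4, min_nodes=1, max_nodes=16):
    """Return the DTN pool size for the next scheduling interval.

    Grows the pool when transfers exceed capacity, shrinks it when
    nodes sit idle, and moves one node at a time to avoid thrashing.
    All parameters are illustrative assumptions.
    """
    needed = -(-active_transfers // per_node_capacity)  # ceil division
    target = max(min_nodes, min(max_nodes, needed))
    if target > current_nodes:
        return current_nodes + 1
    if target < current_nodes:
        return current_nodes - 1
    return current_nodes
```

With demand at zero most of the time (the 94.3% idle figure above), such a rule keeps the pool near its minimum, which is where the resource savings come from.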
Workflow Automation in Liquid Chromatography Mass Spectrometry
R. Gentz, H. Martín, Edward Baidoo, S. Peisert
Pub Date: 2019-09-01 | DOI: 10.1109/eScience.2019.00095
We describe a fully automated workflow developed for the ingestion and analysis of liquid chromatography mass spectrometry (LCMS) data. With this computational workflow, we were able to replace two person-days of manual data analysis with two hours of unsupervised computation. In addition, the tool can compute confidence intervals for all of its results, based on the noise level present in the data. The workflow leverages only open-source tools and libraries.
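One standard way to derive confidence intervals from the noise present in the data is bootstrap resampling, sketched below for a set of replicate measurements. The abstract does not specify the paper's CI method; this is an illustrative stand-in using only the Python standard library.

```python
import random
import statistics


def bootstrap_ci(samples, n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap confidence interval for the mean of
    `samples` (e.g. replicate peak-area measurements).

    Resamples with replacement n_boot times and reads the interval
    off the sorted resampled means; wider noise gives a wider CI.
    """
    rng = random.Random(seed)
    means = sorted(
        statistics.mean(rng.choices(samples, k=len(samples)))
        for _ in range(n_boot)
    )
    lo = means[int(alpha / 2 * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

The same resampling applies to any derived quantity in a pipeline, not just the mean, which is what makes it convenient for reporting CIs on "all results".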
Teaching DevOps and Cloud Based Software Engineering in University Curricula
Y. Demchenko, Zhiming Zhao, Jayachander Surbiryala, Spiros Koulouzis, Zeshun Shi, X. Liao, Jelena Gordiyenko
Pub Date: 2019-09-01 | DOI: 10.1109/eScience.2019.00075
This paper presents recommendations on the design and pilot implementation of DevOps and Cloud-based Software Development curricula for Computer Science and Software Engineering masters programmes. The central part of the proposed approach is the Body of Knowledge in DevOps technologies for Software Engineering (DevOpsSE-BoK), which defines the set of Knowledge Areas and Knowledge Units required for SE professionals to work efficiently as DevOps engineers or application developers. Defining the DevOpsSE-BoK provides a basis for specifying the required professional competences and skills, and allows consistent curricula structuring and profiling. The paper also reports on the experience of the first course run, in the 2018/2019 academic year at the University of Amsterdam. It presents the structure of the course and explains the instructional methodologies used in its development, such as project-based learning, which builds the students' team-based skills in both mastering the Agile development process and sharing skills. The paper provides a short summary of commonly used DevOps definitions, concepts, models and tools, focusing specifically on the cloud-based DevOps tools for software development, deployment and operation that enable the core DevOps principles of continuous development and continuous improvement, which are critical for modern agile, data-driven companies.