Pub Date: 2022-03-01 | DOI: 10.1109/ICECCS54210.2022.00025
Eduardo Rosales, Andrea Rosà, Matteo Basso, A. Villazón, Adriana Orellana, Ángel Zenteno, Jhon Rivero, Walter Binder
Since Java 8, streams ease the development of data transformations using a declarative style based on functional programming. Some recent studies aim at shedding light on how streams are used. However, they consider only small sets of applications and mainly apply static analysis techniques, leaving the large-scale analysis of dynamic metrics focused on stream processing an open research question. In this paper, we present the first large-scale empirical study on the use of streams in Java. We present a novel dynamic analysis for collecting runtime information and key metrics that enable the fine-grained characterization of sequential and parallel stream processing. We apply our dynamic analysis massively, using a fully automated approach supported by a distributed infrastructure to mine public software projects hosted on GitHub. Our findings advance the understanding of the use of streams, both confirming some results of previous studies at a much larger scale and revealing previously unobserved patterns in the use of streams.
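The kind of pipelines such a study characterizes can be pictured with a minimal sketch; the data and operations below are illustrative, not taken from the paper:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class StreamSketch {
    /** Sequential pipeline: intermediate ops (filter, map) are lazy;
     *  the terminal op (collect) triggers evaluation. */
    static List<Integer> squaresOfEvens(int max) {
        return IntStream.rangeClosed(1, max)
                .filter(n -> n % 2 == 0)
                .map(n -> n * n)
                .boxed()
                .collect(Collectors.toList());
    }

    /** Parallel variant: the same declarative pipeline run on the common
     *  fork-join pool. Whether the parallelism actually pays off at runtime
     *  is exactly the kind of question dynamic analysis can answer. */
    static long countEvensParallel(int max) {
        return IntStream.rangeClosed(1, max)
                .parallel()
                .filter(n -> n % 2 == 0)
                .count();
    }

    public static void main(String[] args) {
        System.out.println(squaresOfEvens(10)); // [4, 16, 36, 64, 100]
        System.out.println(countEvensParallel(10)); // 5
    }
}
```

Note that switching between the two variants is a one-call change (`parallel()`), which is why static inspection alone says little about actual sequential vs. parallel execution behavior.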
{"title":"Characterizing Java Streams in the Wild","authors":"Eduardo Rosales, Andrea Rosà, Matteo Basso, A. Villazón, Adriana Orellana, Ángel Zenteno, Jhon Rivero, Walter Binder","doi":"10.1109/ICECCS54210.2022.00025","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00025","url":null,"abstract":"Since Java 8, streams ease the development of data transformations using a declarative style based on functional programming. Some recent studies aim at shedding light on how streams are used. However, they consider only small sets of applications and mainly apply static analysis techniques, leaving the large-scale analysis of dynamic metrics focusing on stream processing an open research question. In this paper, we present the first large-scale empirical study on the use of streams in Java. We present a novel dynamic analysis for collecting runtime information and key metrics that enable the fine-grained characterization of sequential and parallel stream processing. We massively apply our dynamic analysis using a fully automated approach, supported by a distributed infrastructure to mine public software projects hosted on GitHub. 
Our findings advance the understanding of the use of streams, both confirming some of the results of previous studies at a much larger scale, as well as revealing previously unobserved findings in the use of streams.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126184631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-03-01 | DOI: 10.1109/ICECCS54210.2022.00024
Jiaqi Yin, Huibiao Zhu
In the era of 5G, users are extremely sensitive to time delay and have strict reliability requirements. The MEC architecture can effectively reduce or even eliminate the impact of return delay; its core idea is to localize data reasonably. However, most existing work concentrates on balancing the efficiency and energy consumption of task-offloading strategies, and little work has analyzed and expounded the offloading characteristics of MEC from the perspective of formal methods. In this paper, we therefore propose rMECal, a real-time secure hierarchical process calculus of task offloading for MEC. We present the operational semantics of this calculus at the process and network levels to describe how programs work, in particular the parallel composition rule for many-to-many broadcast communication. In addition, we formalize the calculus and rules in Real-Time Maude, and adopt an Internet of Vehicles example to illustrate the applicability of the calculus and its operational semantics. Moreover, we give the denotational semantics of the calculus, expressing what programs execute, based on the Unifying Theories of Programming (UTP) approach, and show its fundamental algebraic properties. We believe that this paper can provide guidance for exploring formal theories in MEC.
{"title":"The Operational and Denotational Semantics of rMECal Calculus for Mobile Edge Computing","authors":"Jiaqi Yin, Huibiao Zhu","doi":"10.1109/ICECCS54210.2022.00024","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00024","url":null,"abstract":"In the era of 5G, users are extremely sensitive to time delay and have strict reliability requirements. The architecture of MEC can effectively reduce or even eliminate the impact of return delay, whose core idea is to localize the data reasonably. Actually, most of the work still concentrated on the balance between the efficiency and energy consumption of task offloading strategy, but few work analyzed and expounded its offloading characteristics from the perspective of formal methods. Henceforth, In this paper, we propose a real-time secure hierarchical process calculus rMECal of task offloading for MEC. Then we show the operational semantics of this calculus from the process and network levels to describe how the program works, especially the parallel composition rule for many-to-many broadcast communication. In addition, we formalize the calculus and rules with real-time Maude, and adopt the example of Internet of Vehicles to illustrate the availability of the calculus and operational semantics. Moreover, we give the denotational semantics of this calculus to express what the program executes based on the Unifying Theories of Programming (UTP) approach, and show the fundamental algebraic properties. 
We believe that this paper can provide a guidance for exploring the formal theories in MEC.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130808456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-03-01 | DOI: 10.1109/ICECCS54210.2022.00022
Zhé Hóu, Qinyi Li, Ernest Foo, J. Dong, Paulo de Souza
This paper presents the conceptualisation of a framework that combines digital twins with runtime verification and applies the techniques in the context of security monitoring and verification for satellites. We focus on special considerations needed for space missions and satellites, and we discuss how digital twins in such applications can be developed and how the states of the twins should be synchronised. In particular, we present state synchronisation methods to ensure secure and efficient long-distance communication between the satellite and its digital twin on the ground. Building on top of this, we develop a runtime verification engine for the digital twin that can verify properties in multiple temporal logic languages. We end the paper with our proposal to develop a fully verified satellite digital twin system as future work.
{"title":"A Digital Twin Runtime Verification Framework for Protecting Satellites Systems from Cyber Attacks","authors":"Zhé Hóu, Qinyi Li, Ernest Foo, J. Dong, Paulo de Souza","doi":"10.1109/ICECCS54210.2022.00022","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00022","url":null,"abstract":"This paper presents the conceptualisation of a framework that combines digital twins with runtime verification and applies the techniques in the context of security monitoring and verification for satellites. We focus on special considerations needed for space missions and satellites, and we discuss how digital twins in such applications can be developed and how the states of the twins should be synchronised. In particular, we present state synchronisation methods to ensure secure and efficient long-distance communication between the satellite and its digital twin on the ground. Building on top of this, we develop a runtime verification engine for the digital twin that can verify properties in multiple temporal logic languages. We end the paper with our proposal to develop a fully verified satellite digital twin system as future work.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116487405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-03-01 | DOI: 10.1109/ICECCS54210.2022.00014
Sree Ram Boyapati, Claudia Szabo
Most software companies deploy microservices behind API gateways or load balancers to separate their business logic while serving their customers according to their SLAs. Today, internet companies serve an average of 150–200 million users efficiently under rapidly changing conditions, where autonomic self-adaptation solutions are critical. At such a large scale, self-adaptation has to address challenges related to high availability and reliability in a variety of scenarios. In this industry experience report, we present the implementation of a self-adaptation approach for microservice architectures that can operate at a large scale and address availability and reliability concerns. Our prototype builds on current industry-standard observability tools used to track the system's internal state. We implement a lightweight MAPE-K loop that reduces both the time taken to add self-adaptability and the total cost of ownership. Our case study focuses on dynamic rate limiting, where our architecture was able to trigger and execute self-adaptation in under 1 second. We present our architecture, give an overview of our prototype implementation and the suite of tools used, and discuss our empirical observations.
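The MAPE-K idea behind dynamic rate limiting can be sketched in a few lines; the thresholds, class name, and adaptation rule below are invented for illustration and are not the report's implementation:

```java
/** Minimal MAPE-K sketch for dynamic rate limiting. The thresholds and
 *  the halving/additive-increase rule are illustrative assumptions. */
public class RateAdapter {
    private int limitPerSecond; // Knowledge: the currently enforced rate limit

    public RateAdapter(int initialLimit) { this.limitPerSecond = initialLimit; }

    public int limit() { return limitPerSecond; }

    /** One MAPE iteration: Monitor (an observed error rate from the
     *  observability stack) -> Analyze (compare against thresholds) ->
     *  Plan/Execute (adjust the enforced limit). */
    public void iterate(double observedErrorRate) {
        if (observedErrorRate > 0.05) {
            // Too many failures: shed load by halving the limit (floor of 10).
            limitPerSecond = Math.max(10, limitPerSecond / 2);
        } else if (observedErrorRate < 0.01) {
            // Healthy: cautiously restore capacity (cap of 1000).
            limitPerSecond = Math.min(1000, limitPerSecond + 50);
        }
    }

    public static void main(String[] args) {
        RateAdapter a = new RateAdapter(400);
        a.iterate(0.10);  // error spike: limit halves to 200
        a.iterate(0.002); // recovery: limit grows to 250
        System.out.println(a.limit());
    }
}
```

The loop is "lightweight" in the sense the report emphasizes: monitoring data flows in from existing observability tooling, and the adaptation logic itself is a small, fast decision over that knowledge.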
{"title":"Self-adaptation in Microservice Architectures: A Case Study","authors":"Sree Ram Boyapati, Claudia Szabo","doi":"10.1109/ICECCS54210.2022.00014","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00014","url":null,"abstract":"Most software companies deploy microservices be-hind API Gateways or load balancers to separate their business logic while at the same time serving their customers according to their SLAs. Today, internet companies serve an average of 150–200 million users efficiently in rapidly changing conditions, where autonomic self-adaptation solutions are critical. At such a large scale, self-adaptation has to address challenges related to high availability and reliability, in a variety of scenarios. In this industry experience report, we present the implementation of a self-adaptation approach for microservice architectures that can operate at a large scale and address availability and reliability concerns. Our prototype builds on current industry standards of observability tools used to track the system's internal state. We implement a lightweight MAPE-K loop that reduces the time taken to add self-adaptability and the total cost of ownership. Our case study focuses on dynamic rate limiting, where the implementation of our architecture was able to trigger and execute self-adaptation in under 1 second. 
We present our architecture, an overview of our prototype implementation and suite of tools used, and discuss our empirical observations.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"143 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127393723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-03-01 | DOI: 10.1109/ICECCS54210.2022.00019
Meryem Afendi, A. Mammar, Régine Laleau
Cyber-physical systems interact with the physical world through a network of sensors and actuators. They also form the basis of future technologies, driving innovation in many crucial fields: health, transport, smart grids, etc. Modeling cyber-physical systems requires handling the evolution of continuous measurements. Generally, this evolution is represented by ordinary differential equations whose unknown is a set of functions depending on a single independent variable. The aim of our work is to propose a correct-by-construction formal approach, based on the refinement technique of the Event-B method, to model and verify such systems. However, Event-B does not handle the resolution of ordinary differential equations. To overcome this limitation, we propose combining Event-B with the differential equation solver SageMath. This paper presents our approach by means of the hybrid smart heating system case study.
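The kind of continuous dynamics the approach delegates to a solver can be pictured with a simple relaxation-toward-a-setpoint model; the equation dT/dt = k(target − T), its constants, and the explicit-Euler discretization are illustrative assumptions, not the case study's actual model:

```java
/** Illustrative heating dynamics: dT/dt = k * (target - T), i.e. the room
 *  temperature relaxes toward the setpoint. The model, constants, and
 *  explicit-Euler scheme are assumptions for illustration; the paper
 *  delegates ODE resolution to SageMath rather than discretizing by hand. */
public class HeatingModel {
    /** Integrate the ODE with explicit Euler steps of size dt. */
    public static double simulate(double t0, double target, double k,
                                  double dt, int steps) {
        double temp = t0;
        for (int i = 0; i < steps; i++) {
            temp += dt * k * (target - temp); // one Euler step
        }
        return temp;
    }

    public static void main(String[] args) {
        // Starting at 15°C with a 20°C setpoint, the temperature rises
        // monotonically toward (but never past) the target.
        System.out.println(simulate(15.0, 20.0, 0.1, 1.0, 50));
    }
}
```

An invariant a correct-by-construction development would maintain here is exactly the monotone, bounded behavior visible in the simulation: the temperature never overshoots the setpoint.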
{"title":"Building Correct Hybrid Systems using Event-B and Sagemath: Illustration by the Hybrid Smart Heating System Case Study","authors":"Meryem Afendi, A. Mammar, Régine Laleau","doi":"10.1109/ICECCS54210.2022.00019","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00019","url":null,"abstract":"Cyber-physical systems allow interactions with the physical world using a network of sensors and actuators. They also form basis of future technologies via engaging in innovating within many crucial fields: health, transport, smart grid, etc. Modeling cyber-physical systems requires handling the evolution of continuous measurements. Generally this evolution is repre-sented by ordinary differential equations where the unknown variable denotes a set of functions that depend on a single independent variable. The aim of our work is to propose a correct-by-construction formal approach, based on the refinement technique of the Event-B method, to model and verify such systems. However, Event-B does not handle the resolution of ordinary differential equations. To overcome this limit, we suggest to combine Event-B with the differential equation solver SageMath. This paper presents our approach by means of the hybrid smart heating system case study.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128388277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-03-01 | DOI: 10.1109/ICECCS54210.2022.00017
Peter Riviere, N. Singh, Y. A. Ameur
Event-B is a correct-by-construction, rigorous state-based method offering features for formal modelling and proof automation. An inductive proof schema makes it possible to prove system properties, in particular invariants. In the current setup, verifying other properties such as deadlock-freeness, reachability, event scheduling, and liveness requires ad hoc modelling. These properties can be established partially using model checkers or third-party interactive provers. Other crucial aspects, such as deadlock-freeness, are difficult to express. A meta-modelling mechanism for the explicit manipulation of Event-B concepts would make it possible to deal with higher-order modelling concepts and to define generic properties and associated proof obligations. In this paper, we propose EB4EB, an Event-B-based modelling framework that allows Event-B features to be manipulated explicitly using meta-modelling concepts. The framework relies on a set of Event-B theories defining data types, operators, well-definedness conditions, theorems, and proof rules. It preserves the core logical foundation, including the semantics, of original Event-B models. Based on the instantiation of the introduced features at the meta level, deep and shallow modelling approaches are proposed to exploit the framework. In addition, a case study demonstrates the use of our framework with both the deep and shallow embedding approaches. The whole framework is supported by the Rodin platform, which handles Event-B models and proofs.
{"title":"EB4EB: A Framework for Reflexive Event-B","authors":"Peter Riviere, N. Singh, Y. A. Ameur","doi":"10.1109/ICECCS54210.2022.00017","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00017","url":null,"abstract":"Event-B is a correct-by-construction rigorous state-based method offering features for formal modelling and proof automation. An inductive proof schema allows to prove system properties, in particular invariants. In the current setup, verifying other properties such as deadlock-freeness, reachability, event scheduling, liveness, etc., requires adhoc modelling. These prop-erties can be established partially using model checkers or by using third party interactive provers. Other crucial aspects, such as deadlock-freeness, are difficult to express. The availabilty of a meta-modelling mechanism for explicit manipulation of Event-B concepts would allow to deal with higher order modelling concepts and to define generic properties and associated proof obligations. In this paper, we propose EB4EB, an Event-B based modelling framework allowing to manipulate Event- B features explicitly based on meta modelling concepts. This framework relies on a set of Event-B theories defining data-types, operators, well-defined conditions, theorems and proof rules. It preserves the core logical foundation, including semantics, of original Event- B models. Based on the instantiation of the introduced features at meta level, deep and shallow modelling approaches are proposed to exploit this framework. In addition, a case study is developed to demonstrate the use of our framework applying the deep and shallow embedding approaches. 
The whole framework is supported by the Rodin platform handling Event- B models and proofs.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"194 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131922419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-03-01 | DOI: 10.1109/ICECCS54210.2022.00020
Wenhui Zhang, Ya Gao
One of the main concerns of automated verification and error detection for software designs is efficiency. Although bounded model checking (BMC) has been proven effective for error detection, further improving its efficiency is of great importance to the practical application of such methods. BMC approaches are built on bounded semantics of temporal logics, so the design of the bounded semantics is essential to the resulting BMC approach. In this work, we propose a non-monotone bounded semantics for linear temporal logic (LTL) and, consequently, a non-monotone BMC approach for improving the efficiency of bounded model checking. The information that a formula is unsatisfiable at an early step of checking is partly carried over to later steps, making it possible to dismiss some irrelevant paths quickly when checking the later, more complicated bounded models. Experimental results show that this approach has a clear advantage in efficiency over the traditional one on our test cases. A comparison of the non-monotone BMC approach with the traditional one implemented in the well-known model checkers NuSMV and nuXmv is also reported.
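The core BMC question — is a bad state reachable within k transition steps? — can be sketched as an explicit bounded search; real BMC instead encodes the unrolled transition relation as a SAT problem, and the toy transition system below is an invented example:

```java
import java.util.*;

/** Bounded reachability sketch: can a "bad" state be reached within k
 *  steps? Real BMC encodes this unrolling as SAT; the explicit breadth-
 *  first search and the toy transition relation are only illustrative. */
public class BoundedCheck {
    public static boolean reachableWithin(Map<String, List<String>> trans,
                                          String init, String bad, int k) {
        Set<String> frontier = Set.of(init);
        for (int depth = 0; depth <= k; depth++) {
            // frontier = states reachable in exactly `depth` steps
            if (frontier.contains(bad)) return true; // counterexample of length `depth`
            Set<String> next = new HashSet<>();
            for (String s : frontier) {
                next.addAll(trans.getOrDefault(s, List.of()));
            }
            frontier = next;
        }
        return false; // no counterexample up to bound k (says nothing beyond k)
    }

    public static void main(String[] args) {
        Map<String, List<String>> trans = Map.of(
                "s0", List.of("s1"),
                "s1", List.of("s2"),
                "s2", List.of("err"));
        System.out.println(reachableWithin(trans, "s0", "err", 2)); // false
        System.out.println(reachableWithin(trans, "s0", "err", 3)); // true
    }
}
```

The bound-sensitivity visible here (false at k = 2, true at k = 3) is what makes the checking sequence expensive: each larger bound yields a bigger problem, which is why reusing unsatisfiability information from earlier bounds, as the paper proposes, can pay off.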
{"title":"A Bounded Semantics for Improving the Efficiency of Bounded Model Checking","authors":"Wenhui Zhang, Ya Gao","doi":"10.1109/ICECCS54210.2022.00020","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00020","url":null,"abstract":"One of the main concerns of automated verification and error detection of software designs is the efficiency. Although bounded model checking (BMC) has been proven to be effective for error detection, further improvement of the efficiency is of great importance to the practical application of such methods. The development of BMC approaches is based on bounded semantics of temporal logics. Therefore the design of bounded semantics is essential for the subsequent BMC approaches. In this work, we propose a non-monotone bounded semantics for the linear temporal logic (LTL), and consequently a non-monotone BMC approach for improving the efficiency of bounded model checking. To this end, the information that a formula is unsatisfiable in an early step of checking is partly taken into consideration in a later one (in the sequence) in order to provide possibility for dismissing some of the irrelevant paths quickly in checking the later more complicated bounded model. The experimental results have shown that this approach has clear advantage over the traditional one on the test cases with respect to the efficiency. 
A comparison of such a non-monotone BMC approach with the traditional one implemented in the well-known model checking tools NuSMV and nuXmv is also reported.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"26 6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134288880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-03-01 | DOI: 10.1109/ICECCS54210.2022.00028
Weilin Wu, Na Wang, Yixiang Chen
To assess the fire risk of intelligent buildings, we developed a trustworthy classification model that supports the classification and assessment of fire risk under urban intelligent firefighting construction. The model integrates Bayesian Networks (BNs) with the theory and methods of trustworthy software computing, and designs metric elements and attributes to assess fire risk along four dimensions: fire situation, building, environment, and personnel. A BN is used to calculate the risk value of each fire attribute; the attribute risk values are then fused into a fire-risk trustworthy value by the trustworthy assessment model. Based on the trustworthy value and the attribute values, the model classifies fire risk into five ranks. Taking the Shanghai Jing'an 11.15 fire as an example case, the results show that the proposed method can perform fire risk assessment and classification.
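The fusion step — combining per-dimension risk values into one trustworthy value and then a rank — can be sketched as a weighted aggregation; the weighted-average fusion, the weights, and the rank thresholds below are illustrative assumptions, not the paper's model:

```java
/** Sketch of fusing per-dimension risk values into a trustworthy value
 *  and a rank. The weighted-average fusion, weights, and uniform rank
 *  thresholds are illustrative assumptions, not the paper's method. */
public class FireRisk {
    // Dimensions follow the paper: fire situation, building, environment, personnel.
    private static final double[] WEIGHTS = {0.4, 0.25, 0.2, 0.15};

    /** Fuse one risk value per dimension (each in [0,1]) into a single value. */
    public static double fuse(double[] attributeRisks) {
        double v = 0.0;
        for (int i = 0; i < WEIGHTS.length; i++) {
            v += WEIGHTS[i] * attributeRisks[i];
        }
        return v;
    }

    /** Map a fused value in [0,1] to one of five ranks (1 = lowest risk). */
    public static int rank(double fused) {
        return Math.min(5, (int) (fused * 5) + 1);
    }

    public static void main(String[] args) {
        double fused = fuse(new double[]{0.9, 0.7, 0.5, 0.4});
        System.out.println(fused + " -> rank " + rank(fused));
    }
}
```

In the paper the per-dimension values come from a Bayesian Network rather than being given directly; the sketch only shows the downstream fusion and ranking.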
{"title":"A Novel Intelligent-Building-Fire-Risk Classification Method*","authors":"Weilin Wu, Na Wang, Yixiang Chen","doi":"10.1109/ICECCS54210.2022.00028","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00028","url":null,"abstract":"In order to assess the fire risk of the intelligent buildings, a trustworthy classification model was developed, which provides model supporting for the classification assessment of fire risk in intelligent buildings under the urban intelligent firefight construction. The model integrates Bayesian Network (BN) and software trustworthy computing theory and method, designs metric elements and attributes to assess fire risk from four dimensions of fire situation, building, environment and personnel; BN is used to calculate the risk value of fire attributes; Then, the fire risk attribute value is fused into the fire risk trustworthy value by using the trustworthy assessment model; This paper constructs a trustworthy classification model for intelligent building fire risk, and classifies the fire risk into five ranks according to the trustworthy value and attribute value. Taking the Shanghai Jing'an 11.15 fire as an example case, the result shows that the method provided in this paper can perform fire risk assessment and classification.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132147576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-03-01 | DOI: 10.1109/ICECCS54210.2022.00011
Guanhong Chen, Shuang Liu
Malware detection and malware family classification are of great importance to network and system security. The wide adoption of deep learning models has greatly improved the performance of both tasks. However, deep-learning-based methods rely heavily on large-scale, high-quality datasets, which require manual labeling. Obtaining such a labeled dataset is extremely difficult for malware because of the domain knowledge required. In this work, we propose to reduce the manual labeling effort by selecting a representative subset of instances that has the same distribution as the original full dataset. Our method effectively reduces the labeling workload while keeping the accuracy degradation of the classification model within an acceptable threshold. We compare our method with random sampling on two widely adopted datasets, and the evaluation results show that our method achieves significant improvements over the baseline. In particular, with only 20% of the data selected, our method loses only 2.68% in classification performance compared to the full set, while the baseline loses 6.78%. We also compare the effects of factors such as training strategy and model structure on the final results, providing guidance for subsequent research.
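The distribution-preserving selection idea can be sketched with plain stratified sampling: take the same fraction from each malware family so that class proportions in the subset match the full set. This is a simplified stand-in for the paper's prototype-selection method, and the families and data are invented:

```java
import java.util.*;

/** Sketch of distribution-preserving subset selection via stratified
 *  sampling: the same fraction is drawn from each malware family, so
 *  class proportions match the full dataset. This is a simplified
 *  stand-in for the paper's prototype-selection method. */
public class PrototypeSelect {
    public static Map<String, List<Integer>> select(Map<String, List<Integer>> byFamily,
                                                    double fraction, long seed) {
        Random rnd = new Random(seed);
        Map<String, List<Integer>> subset = new LinkedHashMap<>();
        for (var e : byFamily.entrySet()) {
            List<Integer> pool = new ArrayList<>(e.getValue());
            Collections.shuffle(pool, rnd);
            // Keep at least one instance per family so no class disappears.
            int take = Math.max(1, (int) Math.round(pool.size() * fraction));
            subset.put(e.getKey(), pool.subList(0, take));
        }
        return subset;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> data = new LinkedHashMap<>();
        data.put("familyA", List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10));
        data.put("familyB", List.of(11, 12, 13, 14, 15));
        // 20% of each family, mirroring the 20% setting from the evaluation.
        Map<String, List<Integer>> s = select(data, 0.2, 42L);
        System.out.println(s.get("familyA").size() + " " + s.get("familyB").size());
    }
}
```

The paper's method goes further than random-within-stratum selection by choosing representative prototypes, but the stratification shown here is what keeps the subset's distribution aligned with the full dataset.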
{"title":"Reducing Malware labeling Efforts Through Efficient Prototype Selection","authors":"Guanhong Chen, Shuang Liu","doi":"10.1109/ICECCS54210.2022.00011","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00011","url":null,"abstract":"Malware detection and malware family classification are of great importance to network and system security. Currently, the wide adoption of deep learning models has greatly improved the performance of those tasks. However, deep-learning-based methods greatly rely on large-scale high-quality datasets, which require manual labeling. Obtaining a large-scale high-quality labeled dataset is extremely difficult for malware due to the domain knowledge required. In this work, we propose to reduce the manual labeling efforts by selecting a representative subset of instances, which has the same distribution as the original full dataset. Our method effectively reduces the workload of labeling while maintaining the accuracy degradation of the classification model within an acceptable threshold. We compare our method with the random sampling method on two widely adopted datasets and the evaluation results show that our method achieves significant improvements over the baseline method. In particular, with only 20% of the data selected, our method has only a 2.68 % degradation in classification performance compared to the full set, while the baseline method has a 6.78 % performance loss. 
We also compare the effects of factors such as training strategy and model structure on the final results, providing some guidance for subsequent research.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121163016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2022-03-01 | DOI: 10.1109/ICECCS54210.2022.00030
Enze Ma
In deep learning libraries like TensorFlow, computations are manually batched as computation graphs. Graph replacement is an optimization that replaces one subgraph of a computation graph with another while keeping the graphs before and after replacement functionally equivalent. In practice, performing graph replacements efficiently remains a challenge: replacement is usually conducted by human engineers, and thus incurs considerable human effort, since a wide variety of deep learning models exist and many model-specific replacements can be performed; the functional equivalence of graphs before and after replacement is also not easy to guarantee. To tackle this challenge, we introduce DLGR, a rule-based approach to graph replacement for deep learning. The core idea of DLGR is to define a set of replacement rules, each of which specifies a source graph pattern, a target graph pattern, and constraints on the replacement. Given a computation graph, DLGR performs an iterative process of matching and replacing subgraphs in the source graph, generating a replaced, and usually optimized, computation graph. We conduct experiments to evaluate the capabilities of DLGR. The results clearly show its strengths: compared with two existing graph replacement techniques, DLGR provides more replacement rules and saves engineers' development effort, reducing lines of code by up to 68%.
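The shape of a replacement rule — a source pattern, an equivalent cheaper target, and a rewrite applied wherever the pattern matches — can be sketched on a tiny expression graph; the node type and the single rule (x * 1 -> x) are illustrative assumptions, not DLGR's rule language:

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of rule-based graph replacement on a tiny expression graph.
 *  The Node type and the single rule (mul(x, const(1)) -> x) are
 *  illustrative assumptions, not DLGR's actual rule language. */
public class GraphRewrite {
    static final class Node {
        final String op;        // "mul", "const", "var"
        final double value;     // meaningful only for "const"
        final List<Node> inputs;
        Node(String op, double value, List<Node> inputs) {
            this.op = op; this.value = value; this.inputs = inputs;
        }
    }

    /** Apply the rule bottom-up wherever the source pattern matches. */
    static Node rewrite(Node n) {
        List<Node> ins = new ArrayList<>();
        for (Node in : n.inputs) ins.add(rewrite(in)); // rewrite children first
        Node rebuilt = new Node(n.op, n.value, ins);
        // Rule: mul(x, const(1)) -> x  (functionally equivalent, one op fewer)
        if (rebuilt.op.equals("mul") && ins.size() == 2
                && ins.get(1).op.equals("const") && ins.get(1).value == 1.0) {
            return ins.get(0);
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        Node x = new Node("var", 0, List.of());
        Node one = new Node("const", 1.0, List.of());
        Node expr = new Node("mul", 0, List.of(x, one));
        System.out.println(rewrite(expr).op); // the mul collapses to the var
    }
}
```

Encoding each rewrite as a declarative (pattern, replacement, constraint) rule, as DLGR does, is what lets many model-specific optimizations share one matching engine instead of each being hand-coded.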
{"title":"DLGR: A Rule-Based Approach to Graph Replacement for Deep Learning","authors":"Enze Ma","doi":"10.1109/ICECCS54210.2022.00030","DOIUrl":"https://doi.org/10.1109/ICECCS54210.2022.00030","url":null,"abstract":"In deep learning libraries like TensorFlow, compu-tations are manually batched as computation graphs. Graph replacement is then an optimization that replaces one subgraph of a computation graph with another whilst keeping the graphs before and after replacement functionally equivalent. Meanwhile, in practice, it remains a challenge how graph replacements can be performed efficiently: graph replacement is usually conducted by human engineers, and thus it incurs many human efforts since a variety of deep learning models do exist and a number of model-specific replacements can be performed; the functionality equivalence of graphs before and after replacement is also not easy to guarantee. To tackle with this challenge, we introduce in this paper DLGR, a rule-based approach to graph replacement for deep learning. The core idea of DLGR is to define a set of replacement rules, each of which specifies the source and the tar-get graph patterns and constraints on graph replacement. Given a computation graph, DLGR then performs an iterative process of matching and replacing subgraphs in the source graph, and generates a replaced, and usually optimized computation graph. We conduct experiments to evaluate the capabilities of DLGR. 
The results clearly show the strengths of DLGR: compared with two existing graph replacement techniques, it provides with more replacement rules and saves engineers' development efforts in reducing up to 68 % lines of code.","PeriodicalId":344493,"journal":{"name":"2022 26th International Conference on Engineering of Complex Computer Systems (ICECCS)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130160013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}