Zhijian He, Bohuan Xue, Xiangcheng Hu, Zhaoyan Shen, Xiangyue Zeng, Ming Liu
Autonomous driving demands precise multi-sensor fusion positioning on resource-limited embedded systems. LiDAR-centered sensor fusion systems serve as the mainstream navigation solution owing to their insensitivity to illumination and viewpoint changes. However, such systems struggle to process large-scale sequential LiDAR data with the limited resources available on board, making LiDAR-centered sensor fusion impractical. As a result, most mainstream positioning methods fall back on hand-crafted features such as planes and edges to mitigate this limitation, which has become a cornerstone of LiDAR-inertial sensor fusion. Such extremely lightweight feature extraction, although it meets the real-time constraints of LiDAR-centered sensor fusion, is severely vulnerable under high-speed rotational or translational perturbation. In this paper, we propose a sparse-tensor-based LiDAR-inertial fusion method for autonomous driving embedded systems. Leveraging the power of sparse tensors, global geometric features are extracted so that the defect caused by point cloud sparsity is alleviated. An inertial sensor is deployed to overcome the time-consuming coarse-level point-wise inlier matching step. We conduct experiments on both representative dataset benchmarks and realistic scenes. The evaluation results show the robustness and accuracy of our proposed solution compared to classical methods.
{"title":"Robust Embedded Autonomous Driving Positioning System Fusing LiDAR and Inertial Sensors","authors":"Zhijian He, Bohuan Xue, Xiangcheng Hu, Zhaoyan Shen, Xiangyue Zeng, Ming Liu","doi":"10.1145/3626098","DOIUrl":"https://doi.org/10.1145/3626098","url":null,"abstract":"Autonomous driving emphasizes precise multi-sensor fusion positioning on limit resource embedded system. LiDAR-centered sensor fusion system serves as mainstream navigation system due to its insensitivity to illumination and viewpoint change. However, these types of system suffer from handling large-scale sequential LiDAR data using limit resouce on board, leading LiDAR-centralized sensor fusion unpractical. As a result, hand-crafted feature such as plane and edge are leveraged in majority mainstream positioning methods to alleviate this unsatisfaction, triggering a new cornerstone in LiDAR Inertial sensor fusion. However, such super light weight feature extraction, although achieves real-time constraint in LiDAR-centered sensor fusion, encounters severe vulnerability under high speed rotational or translational perturbation. In this paper, we propose a sparse tensor based LiDAR Inertial fusion method for autonomous driving embedded system. Leveraging the power of sparse tensor, the global geometrical feature is fetched so that the point cloud sparsity defect is alleviated. Inertial sensor is deployed to conquer the time-consuming step caused by the coarse level point-wise inlier matching. We construct our experiments on both representative dataset benchmarks and realistic scenes. The evaluation results show the robustness and accuracy of our proposed solution comparing to classical methods.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135944886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sung Woo Choi, Michael Ivashchenko, Luan V. Nguyen, Hoang-Dung Tran
This paper extends the star set reachability approach to verify the robustness of feed-forward neural networks (FNNs) with sigmoidal activation functions such as Sigmoid and TanH. The main drawbacks of the star set approach in Sigmoid/TanH FNN verification are scalability, feasibility, and optimality issues that arise in some cases from the use of a linear programming solver. We overcome this challenge by proposing a relaxed star (RStar) with symbolic intervals, which allows the use of the back-substitution technique from DeepPoly to find bounds when overapproximating activation functions while maintaining the valuable features of a star set. RStar can overapproximate a sigmoidal activation function using four linear constraints (RStar4), two linear constraints (RStar2), or only the output bounds (RStar0). We implement our RStar reachability algorithms in NNV and compare them to DeepPoly via robustness verification of image classification DNN benchmarks. The experimental results show that the original star approach (i.e., no relaxation) is the least conservative of all methods yet the slowest. RStar4 is computationally much faster than the original star method and is the second least conservative approach. It certifies up to 40% more images against adversarial attacks than DeepPoly and is on average 51 times faster than the star set. Last but not least, RStar0 is the most conservative method, which could only verify two cases for the CIFAR10 small Sigmoid network at δ = 0.014. However, it is the fastest method: it can verify neural networks up to 3528 times faster than the star set and up to 46 times faster than DeepPoly in our evaluation.
{"title":"Reachability Analysis of Sigmoidal Neural Networks","authors":"Sung Woo Choi, Michael Ivashchenko, Luan V. Nguyen, Hoang-Dung Tran","doi":"10.1145/3627991","DOIUrl":"https://doi.org/10.1145/3627991","url":null,"abstract":"This paper extends the star set reachability approach to verify the robustness of feed-forward neural networks (FNNs) with sigmoidal activation functions such as Sigmoid and TanH. The main drawbacks of the star set approach in Sigmoid/TanH FNN verification are scalability, feasibility, and optimality issues in some cases due to the linear programming solver usage. We overcome this challenge by proposing a relaxed star (RStar) with symbolic intervals, which allows the usage of the back-substitution technique in DeepPoly to find bounds when overapproximating activation functions while maintaining the valuable features of a star set. RStar can overapproximate a sigmoidal activation function using four linear constraints (RStar4) or two linear constraints (RStar2), or only the output bounds (RStar0). We implement our RStar reachability algorithms in NNV and compare them to DeepPoly via robustness verification of image classification DNNs benchmarks. The experimental results show that the original star approach (i.e., no relaxation) is the least conservative of all methods yet the slowest. RStar4 is computationally much faster than the original star method and is the second least conservative approach. It certifies up to 40% more images against adversarial attacks than DeepPoly and on average 51 times faster than the star set. Last but not least, RStar0 is the most conservative method, which could only verify two cases for the CIFAR10 small Sigmoid network, δ = 0.014. However, it is the fastest method that can verify neural networks up to 3528 times faster than the star set and up to 46 times faster than DeepPoly in our evaluation.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135994171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marten Lohstroh, Soroush Bateni, Christian Menard, Alexander Schulz-Rosengarten, Jeronimo Castrillon, Edward A. Lee
We discuss a novel approach for constructing deterministic reactive systems that revolves around a temporal model incorporating a multiplicity of timelines. This model is central to Lingua Franca (LF), a polyglot coordination language and compiler toolchain we are developing for the definition and composition of concurrent components called reactors, which are objects that react to and emit discrete events. Our temporal model differs from existing models like the logical execution time (LET) paradigm and synchronous languages in that it reflects that there are always at least two distinct timelines involved in a reactive system, a logical one and a physical one, and possibly multiple of each kind. This paper explains how the relationship between events across timelines facilitates reasoning about consistency and availability across components in Cyber-Physical Systems (CPS).
{"title":"Deterministic Coordination Across Multiple Timelines","authors":"Marten Lohstroh, Soroush Bateni, Christian Menard, Alexander Schulz-Rosengarten, Jeronimo Castrillon, Edward A. Lee","doi":"10.1145/3615357","DOIUrl":"https://doi.org/10.1145/3615357","url":null,"abstract":"We discuss a novel approach for constructing deterministic reactive systems that revolves around a temporal model that incorporates a multiplicity of timelines. This model is central to Lingua Franca (LF), a polyglot coordination language and compiler toolchain we are developing for the definition and composition of concurrent components called reactors, which are objects that react to and emit discrete events. Our temporal model differs from existing models like the logical execution time (LET) paradigm and synchronous languages in that it reflects that there are always at least two distinct timelines involved in a reactive system; a logical one and a physical one—and possibly multiple of each kind. This paper explains how the relationship between events across timelines facilitates reasoning about consistency and availability across components in Cyber-Physical Systems (CPS).","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136078825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joaquín Aguado, Alejandra Duenas
In this paper, a formal generic framework for defining and reasoning about deterministic concurrency in synchronous systems is implemented in the Spin model checker. Concretely, the paper implements the clock-synchronised shared memory (csm) theory, which extends synchronous programming with additional, higher-level csm data types. These csm data types are equipped with a synchronisation policy prescribing how concurrent calls to object methods must be organised. In a policy-constructive system, all methods of every object can be scheduled in a policy-conformant manner without deadlocking. In our framework, synchronous policies are codified as Promela never-claims. In this form, the model checker can search for executions (interleavings) that satisfy the synchronous product of all the never-claims, namely policy-conformant schedules for all the csm objects. The existence of such a policy-conformant schedule verifies that the concurrent synchronous system is deterministic. The approach extends beyond a single semantics, since it can handle the synchronous programming model as well as the various forms of the sequentially constructive model found in the literature.
{"title":"Synchronised Shared Memory and Model Checking","authors":"Joaquín Aguado, Alejandra Duenas","doi":"10.1145/3626188","DOIUrl":"https://doi.org/10.1145/3626188","url":null,"abstract":"In this paper, a formal generic framework for defining and reasoning about deterministic concurrency in synchronous systems is implemented in the Spin model checker. Concretely, the paper implements the clock-synchronised shared memory ( csm ) theory , which extends synchronous programming with more and higher level csm data types. These csm data types are equipped with a synchronisation policy prescribing how concurrent calls to objects methods must be organised. In a policy constructive system, all methods of every object can be scheduled in a policy-conformant manner without deadlocking. In our framework, synchronous policies get codified as Promela never-claims. In this form, the model checker can search for executions (interleavings) that satisfy the synchronous product of all the never-claims, namely policy-conformant schedules for all the csm objects. The existence of such a policy-conformant schedules, verifies that the concurrent synchronous system is deterministic. The approach of this paper extends beyond a single semantics since it can handle the synchronous programming model as well as the various forms of the sequentially constructive model found in the literature.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135834569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hui Sun, Bendong Lou, Chao Zhao, Deyan Kong, Chaowei Zhang, Jianzhong Huang, Yinliang Yue, Xiao Qin
LSM-tree-based key-value stores (KV stores) convert random-write requests into sequential-write ones to achieve high I/O performance. Meanwhile, compaction operations in KV stores update SSTables by reorganizing low-level data components into high-level ones, thereby guaranteeing an orderly data layout in each component. Repeated writes caused by compaction (a.k.a. write amplification) impact I/O bandwidth and overall system performance. Near-data processing (NDP) is an effective approach to addressing this write-amplification issue. Most NDP-based techniques adopt synchronous parallel schemes that perform a compaction task on both the host and its NDP-enabled device. In synchronous parallel compaction schemes, the execution time of compaction is determined by whichever subsystem has lower compaction performance, coupled with under-utilized computing resources in the NDP framework. To solve this problem, we propose an asynchronous parallel scheme named PStore to improve compaction performance in KV stores. In PStore, we design a multi-task queue and three priority-based scheduling methods. PStore selects suitable compaction tasks to offload to the host- and device-side compaction modules. Our proposed cross-level compaction mechanism mitigates the write amplification induced by asynchronous compaction. Featuring the asynchronous compaction mechanism, PStore fully utilizes computing resources in both the host- and device-side subsystems. Compared with two popular synchronous-compaction KV stores (TStore and LevelDB), PStore improves throughput by up to factors of 14 and 10.52, with averages of 2.09 and 1.73, respectively.
{"title":"An Asynchronous Compaction Acceleration Scheme for Near-Data Processing-enabled LSM-Tree-based KV Stores","authors":"Hui Sun, Bendong Lou, Chao Zhao, Deyan Kong, Chaowei Zhang, Jianzhong Huang, Yinliang Yue, Xiao Qin","doi":"10.1145/3626097","DOIUrl":"https://doi.org/10.1145/3626097","url":null,"abstract":"LSM-tree-based key-value stores (KV stores) convert random-write requests to sequence-write ones to achieve high I/O performance. Meanwhile, compaction operations in KV stores update SSTables in forms of reorganizing low-level data components to high-level ones, thereby guaranteeing an orderly data layout in each component. Repeated writes caused by compaction ( a.k.a, write amplification) impacts I/O bandwidth and overall system performance. Near-data processing (NDP) is one of effective approaches to addressing this write-amplification issue. Most NDP-based techniques adopt synchronous parallel schemes to perform a compaction task on both the host and its NDP-enabled device. In synchronous parallel compaction schemes, the execution time of compaction is determined by a subsystem that has lower compaction performance coupled by under-utilized computing resources in a NDP framework. To solve this problem, we propose an asynchronous parallel scheme named PStore to improve the compaction performance in KV stores. In PStore, we designed a multi-tasks queue and three priority-based scheduling methods. PStore elects proper compaction tasks to be offloaded in host- and device-side compaction modules. Our proposed cross-leveled compaction mechanism mitigates write amplification induced by asynchronous compaction. PStore featured with the asynchronous compaction mechanism fully utilizes computing resources in both host and device-side subsystems. Compared with the two popular synchronous compaction modes based on KV stores (TStore and LevelDB), our PStore immensely improves the throughput by up to a factor of 14 and 10.52 with an average of a factor of 2.09 and 1.73, respectively.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135194176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ruiqi Hu, Kairong Liu, Zhikun She
In this work, we investigate the reach-avoid problem for a class of time-varying analytic systems with disturbances described by uncertain parameters. First, by proposing the concepts of maximal and minimal reachable sets, we connect avoidability and reachability with the maximal and minimal reachable sets, respectively. Then, for a given disturbance parameter, we introduce the evolution function that exactly describes the reachable set, and derive a series representation of this evolution function in terms of its Lie derivatives, which can also be regarded as a series in the uncertain parameter. Afterward, based on the partial sums of this series, over- and under-approximations of the evolution function are constructed, which can be realized with interval arithmetic at a designated precision. Further, we propose sufficient conditions for avoidability and reachability and design a numerical quantifier-elimination-based algorithm to verify these conditions; moreover, we improve the algorithm with a time-splitting technique. We implement the algorithms and use benchmarks with comparisons to show that our methodology is both efficient and promising. Finally, we extend our methodology to handle systems with complex initial sets and time-dependent switching. The performance of the extended method on these systems is demonstrated through four examples with comparisons and discussions.
{"title":"Evolution Function Based Reach-Avoid Verification for Time-varying Systems with Disturbances","authors":"Ruiqi Hu, Kairong Liu, Zhikun She","doi":"10.1145/3626099","DOIUrl":"https://doi.org/10.1145/3626099","url":null,"abstract":"In this work, we investigate the reach-avoid problem of a class of time-varying analytic systems with disturbances described by uncertain parameters. Firstly, by proposing the concepts of maximal and minimal reachable sets, we connect the avoidability and reachability with maximal and minimal reachable sets respectively. Then, for a given disturbance parameter, we introduce the evolution function for exactly describing the reachable set, and find a series representation of this evolution function with its Lie derivatives, which can also be regarded as a series function with respect to the uncertain parameter. Afterward, based on the partial sums of this series, over- and under-approximations of the evolution function are constructed, which can be realized by interval arithmetics with designated precision. Further, we propose sufficient conditions for avoidability and reachability and design a numerical quantifier elimination based algorithm to verify these conditions; moreover, we improve the algorithm with a time-splitting technique. We implement the algorithms and use some benchmarks with comparisons to show that our methodology is both efficient and promising. Finally, we additionally extend our methodology to deal with systems with complex initial sets and time-dependent switchings. The performance of our extended method for these systems is also shown by four examples with comparisons and discussions.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135385297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bastien Sultan, Léon Frénot, Ludovic Apvrille, Philippe Jaillon, Sophie Coudert
SysML models are widely used for designing and analyzing complex systems. Model-based design methods often require successive modifications of the models, whether for incrementally refining the design (e.g., in agile development methods) or for testing different design options. Such modifications, or mutations, are also used in mutation-based testing approaches. However, defining mutation operators can be a complex task, and applying them to models is sometimes performed by hand, which is time-consuming and error-prone. The paper addresses this issue by introducing AMULET, the first mutation language for SysML. AMULET encompasses modifications targeting SysML block and state-machine diagrams and is supported by a compiler that the paper presents. This compiler is integrated into TTool, an open-source SysML toolkit, enabling full support of design methods, including model design, mutation, and verification tasks, within a single toolkit. The paper also introduces two case studies that provide concrete examples of using AMULET to model vulnerabilities and cyber attacks, highlighting the benefits of AMULET for SysML mutations.
{"title":"AMULET: a Mutation Language Enabling Automatic Enrichment of SysML Models","authors":"Bastien Sultan, Léon Frénot, Ludovic Apvrille, Philippe Jaillon, Sophie Coudert","doi":"10.1145/3624583","DOIUrl":"https://doi.org/10.1145/3624583","url":null,"abstract":"SysML models are widely used for designing and analyzing complex systems. Model-based design methods often require successive modifications of the models, whether for incrementally refining the design (e.g. in agile development methods) or for testing different design options. Such modifications, or mutations, are also used in mutation-based testing approaches. However, the definition of mutation operators can be a complex issue and applying them to models is sometimes performed by hand: this is time consuming and error prone. The paper addresses this issue thanks to the introduction of AMULET, the first mutation language for SysML. AMULET encompasses the modifications targeting SysML block and state-machine diagrams, and is supported by a compiler the paper presents. This compiler is integrated in TTool, an open-source SysML toolkit, enabling the full support of design methods including model design, mutation and verification tasks in a unique toolkit. The paper also introduces two case-studies providing concrete examples of AMULET use for modeling vulnerabilities and cyber attacks, and highlighting the benefits of AMULET for SysML mutations.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135308316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hongsu Byun, Safdar Jamil, Jungwook Han, Sungyong Park, Myungcheol Lee, Changsoo Kim, Beongjun Choi, Youngjae Kim
The data movement in large-scale computing facilities (from compute nodes to data nodes) is one of the major contributors to high cost and energy utilization. To tackle this, in-storage processing (ISP) within storage devices, such as solid-state drives (SSDs), has been explored actively. The introduction of computational storage drives (CSDs) enabled ISP within the same form factor as regular SSDs and made it easy to replace SSDs within traditional compute nodes. With CSDs, host systems can offload various operations such as search, filter, and count. However, commercialized CSDs have different hardware resources and performance characteristics. Thus, building a CSD-based storage system within a compute node requires careful consideration of hardware, performance, and workload characteristics. Storage architects are therefore hesitant to build storage systems based on CSDs, as there are no tools to determine whether CSD-based compute nodes can meet performance requirements compared to traditional SSD-based nodes. In this work, we propose an analytical model-based storage capacity planner called CsdPlan that helps system architects build performance-effective CSD-based compute nodes. Our model takes into account the performance characteristics of the host system, the targeted workloads, and the hardware and performance characteristics of the CSDs to be deployed, and provides an optimal configuration of the number of CSDs per compute node. Furthermore, CsdPlan estimates and reduces the total cost of ownership (TCO) of building a CSD-based compute node. To evaluate the efficacy of CsdPlan, we selected two commercially available CSDs and four representative big data analysis workloads.
{"title":"An Analytical Model-based Capacity Planning Approach for Building CSD-based Storage Systems","authors":"Hongsu Byun, Safdar Jamil, Jungwook Han, Sungyong Park, Myungcheol Lee, Changsoo Kim, Beongjun Choi, Youngjae Kim","doi":"10.1145/3623677","DOIUrl":"https://doi.org/10.1145/3623677","url":null,"abstract":"The data movement in large-scale computing facilities (from compute nodes to data nodes) is categorized as one of the major contributors to high cost and energy utilization. To tackle it, in-storage processing (ISP) within storage devices, such as Solid-State Drives (SSDs), has been explored actively. The introduction of computational storage drives (CSDs) enabled ISP within the same form factor as regular SSDs and made it easy to replace SSDs within traditional compute nodes. With CSDs, host systems can offload various operations such as search, filter, and count. However, commercialized CSDs have different hardware resources and performance characteristics. Thus, it requires careful consideration of hardware, performance, and workload characteristics for building a CSD-based storage system within a compute node. Therefore, storage architects are hesitant to build a storage system based on CSDs as there are no tools to determine the benefits of CSD-based compute nodes to meet the performance requirements compared to traditional nodes based on SSDs. In this work, we proposed an analytical model-based storage capacity planner called CsdPlan for system architects to build performance-effective CSD-based compute nodes. Our model takes into account the performance characteristics of the host system, targeted workloads, and hardware and performance characteristics of CSDs to be deployed and provides optimal configuration based on the number of CSDs for a compute node. Furthermore, CsdPlan estimates and reduces the total cost of ownership (TCO) for building a CSD-based compute node. To evaluate the efficacy of CsdPlan , we selected two commercially available CSDs and 4 representative big data analysis workloads.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"214 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134911851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ahmed El Yaacoub, Luca Mottola, Thiemo Voigt, Philipp Rümmer
We present NeRTA (Next Release Time Analysis), a technique that enables dynamic software updates for the low-level control software of mobile robots. Dynamic software updates enable software correction and evolution during system operation. In mobile robotics, they are crucial for resolving software defects without interrupting system operation and for enabling on-the-fly extensions. Low-level control software for mobile robots, however, is time-sensitive and runs on resource-constrained hardware with no operating system support. To minimize the impact of the update process, NeRTA safely schedules updates during times when the computing unit would otherwise be idle. It does so by utilizing information from the existing scheduling algorithm without impacting its operation. As such, NeRTA works orthogonally to the existing scheduler, retains existing platform-specific optimizations and fine-tuning, and may simply operate as a plug-in component. To enable larger dynamic updates, we further conceive an additional mechanism called bounded reactive control and apply mixed-criticality concepts. The former cautiously reduces the overall control frequency, whereas the latter excludes less critical tasks from NeRTA processing. Their use increases the available idle times. We combine real-world experiments on embedded hardware with simulations to evaluate NeRTA. Our experimental evaluation shows that the difference between NeRTA's estimated idle times and the measured idle times is less than 15% in more than three-quarters of the samples. The combined effect of bounded reactive control and mixed-criticality concepts results in a 150+% increase in available idle times. We also show that the processing overhead of NeRTA and of the additional mechanisms is essentially negligible.
{"title":"Scheduling Dynamic Software Updates in Mobile Robots","authors":"Ahmed El Yaacoub, Luca Mottola, Thiemo Voigt, Philipp Rümmer","doi":"10.1145/3623676","DOIUrl":"https://doi.org/10.1145/3623676","url":null,"abstract":"We present NeRTA ( Ne xt R elease T ime A nalysis), a technique to enable dynamic software updates for low-level control software of mobile robots. Dynamic software updates enable software correction and evolution during system operation. In mobile robotics, they are crucial to resolve software defects without interrupting system operation or to enable on-the-fly extensions. Low-level control software for mobile robots, however, is time sensitive and runs on resource-constrained hardware with no operating system support. To minimize the impact of the update process, NeRTA safely schedules updates during times when the computing unit would otherwise be idle. It does so by utilizing information from the existing scheduling algorithm without impacting its operation. As such, NeRTA works orthogonal to the existing scheduler, retaining the existing platform-specific optimizations and fine-tuning, and may simply operate as a plug-in component. To enable larger dynamic updates, we further conceive an additional mechanism called bounded reactive control and apply mixed criticality concepts. The former cautiously reduces the overall control frequency, whereas the latter excludes less critical tasks from NeRTA processing. Their use increases the available idle times. We combine real-world experiments on embedded hardware with simulations to evaluate NeRTA. Our experimental evaluation shows that the difference between NeRTA’s estimated idle times and the measured idle times is less than 15% in more than three-quarters of the samples. The combined effect of bounded reactive control and mixed-criticality concepts results in a 150+% increase in available idle times. We also show that the processing overhead of NeRTA and of the additional mechanisms is essentially negligible.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135740150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dehua Liang, Hiromitsu Awano, Noriyuki Miura, Jun Shiomi
Voltage scaling is one of the most promising approaches to improving energy efficiency, but it also makes it challenging to fully guarantee stable operation in modern VLSI. To tackle this issue, we extend DependableHD to a second version, DependableHDv2, a hyperdimensional computing (HDC) system that tolerates bit-level memory failure in the low-voltage region with high robustness. DependableHDv2 introduces the concept of margin enhancement for model retraining and utilizes noise injection to improve robustness; both are applicable to most state-of-the-art HDC algorithms. We additionally propose a dimension-swapping technique that handles the stuck-at errors induced by aggressive voltage scaling in memory cells. Our experiments show that under 8% memory stuck-at errors, DependableHDv2 exhibits a 2.42% accuracy loss on average, a 14.1× robustness improvement over the baseline HDC solution. The hardware evaluation shows that DependableHDv2 allows the supply voltage to be reduced from 430 mV to 340 mV for both the item memory and the associative memory, providing a 41.8% reduction in energy consumption while maintaining competitive accuracy.
{"title":"A Robust and Energy Efficient Hyperdimensional Computing System for Voltage-scaled Circuits","authors":"Dehua Liang, Hiromitsu Awano, Noriyuki Miura, Jun Shiomi","doi":"10.1145/3620671","DOIUrl":"https://doi.org/10.1145/3620671","url":null,"abstract":"Voltage scaling is one of the most promising approaches for energy efficiency improvement but also brings challenges to fully guaranteeing stable operation in modern VLSI. To tackle such issues, we further extend the DependableHD to the second version DependableHDv2 , a HyperDimensional Computing (HDC) system that can tolerate bit-level memory failure in the low voltage region with high robustness. DependableHDv2 introduces the concept of margin enhancement for model retraining and utilizes noise injection to improve the robustness, which is capable of application in most state-of-the-art HDC algorithms. We additionally propose the dimension-swapping technique, which aims at handling the stuck-at errors induced by aggressive voltage scaling in the memory cells. Our experiment shows that under 8% memory stuck-at error, DependableHDv2 exhibits a 2.42% accuracy loss on average, which achieves a 14.1 × robustness improvement compared to the baseline HDC solution. The hardware evaluation shows that DependableHDv2 supports the systems to reduce the supply voltage from 430mV to 340mV for both item Memory and Associative Memory, which provides a 41.8% energy consumption reduction while maintaining competitive accuracy performance.","PeriodicalId":50914,"journal":{"name":"ACM Transactions on Embedded Computing Systems","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135980673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}