Pub Date : 2010-03-08DOI: 10.1109/DATE.2010.5457060
Lei Zhang, Yue Yu, Jianbo Dong, Yinhe Han, Shangping Ren, Xiaowei Li
Topology virtualization techniques are proposed for NoC-based many-core processors with core-level redundancy to isolate hardware changes caused by on-chip defective cores. Prior work focuses on homogeneous cores with symmetric performance and optimizes on-chip communication only. However, core-to-core performance asymmetry due to manufacturing process variations poses new challenges for constructing virtual topologies. Lower performance cores may scatter over a virtual topology, while operating systems typically allocate tasks to continuous cores. As a result, parallel applications are probably assigned to a region containing many slower cores that become bottlenecks. To tackle the above problem, in this paper we present a novel performance-asymmetry-aware reconfiguration algorithm Bubble-Up based on a new metric called core fragmentation factor (CFF). Bubble-Up can arrange cores with similar performance closer, yet maintaining reasonable hop distances between virtual neighbors, thus accelerating applications with higher degree of parallelism, without changing existing allocation strategies for OS. Experimental results show its effectiveness.
{"title":"Performance-asymmetry-aware topology virtualization for defect-tolerant NoC-based many-core processors","authors":"Lei Zhang, Yue Yu, Jianbo Dong, Yinhe Han, Shangping Ren, Xiaowei Li","doi":"10.1109/DATE.2010.5457060","DOIUrl":"https://doi.org/10.1109/DATE.2010.5457060","url":null,"abstract":"Topology virtualization techniques are proposed for NoC-based many-core processors with core-level redundancy to isolate hardware changes caused by on-chip defective cores. Prior work focuses on homogeneous cores with symmetric performance and optimizes on-chip communication only. However, core-to-core performance asymmetry due to manufacturing process variations poses new challenges for constructing virtual topologies. Lower performance cores may scatter over a virtual topology, while operating systems typically allocate tasks to continuous cores. As a result, parallel applications are probably assigned to a region containing many slower cores that become bottlenecks. To tackle the above problem, in this paper we present a novel performance-asymmetry-aware reconfiguration algorithm Bubble-Up based on a new metric called core fragmentation factor (CFF). Bubble-Up can arrange cores with similar performance closer, yet maintaining reasonable hop distances between virtual neighbors, thus accelerating applications with higher degree of parallelism, without changing existing allocation strategies for OS. Experimental results show its effectiveness.","PeriodicalId":432902,"journal":{"name":"2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)","volume":"490 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127648152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-03-08DOI: 10.1109/DATE.2010.5457159
J. Villena, L. M. Silveira
This paper describes a Model Order Reduction algorithm for multi-dimensional parameterized systems, based on a sampling procedure which incorporates a low order moment matching paradigm into a multi-point based methodology. The procedure seeks to maximize the subspace generated by a given number of samples, selected among an initial candidate set. The selection is based on a global criteria that chooses the sample whose associated vector adds more information to the existing subspace. However, the initial candidate set can be extremely large for high-dimensional systems, and thus the procedure can be costly. To improve efficiency we propose a scheme to incorporate information from low order moments to the basis with small extra cost, in order to extend the approximation to a wider region around the selected point. This will allow reduction of the initial candidate set without decreasing the level of confidence. We further improve the procedure by generating the global subspace based on the composition of local approximations. To achieve this, the initial candidates will be split into subsets that will be considered as independent regions, and in a first phase the procedure applied locally thus enabling improved efficiency and providing a framework for almost perfect parallelization.
{"title":"HORUS - high-dimensional Model Order Reduction via low moment-matching upgraded sampling","authors":"J. Villena, L. M. Silveira","doi":"10.1109/DATE.2010.5457159","DOIUrl":"https://doi.org/10.1109/DATE.2010.5457159","url":null,"abstract":"This paper describes a Model Order Reduction algorithm for multi-dimensional parameterized systems, based on a sampling procedure which incorporates a low order moment matching paradigm into a multi-point based methodology. The procedure seeks to maximize the subspace generated by a given number of samples, selected among an initial candidate set. The selection is based on a global criteria that chooses the sample whose associated vector adds more information to the existing subspace. However, the initial candidate set can be extremely large for high-dimensional systems, and thus the procedure can be costly. To improve efficiency we propose a scheme to incorporate information from low order moments to the basis with small extra cost, in order to extend the approximation to a wider region around the selected point. This will allow reduction of the initial candidate set without decreasing the level of confidence. We further improve the procedure by generating the global subspace based on the composition of local approximations. To achieve this, the initial candidates will be split into subsets that will be considered as independent regions, and in a first phase the procedure applied locally thus enabling improved efficiency and providing a framework for almost perfect parallelization.","PeriodicalId":432902,"journal":{"name":"2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131311170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-03-08DOI: 10.1109/DATE.2010.5456930
Abhishek Das, G. Memik, Joseph Zambreno, A. Choudhary
An increasing concern amongst designers and integrators of military and defense-related systems is the underlying security of the individual microprocessor components that make up these systems. Malicious circuitry can be inserted and hidden at several stages of the design process through the use of third-party Intellectual Property (IP), design tools, and manufacturing facilities. Such hardware Trojan circuitry has been shown to be capable of shutting down the main processor after a random number of cycles, broadcasting sensitive information over the bus, and bypassing software authentication mechanisms. In this work, we propose an architecture that can prevent information leakage due to such malicious hardware. Our technique is based on guaranteeing certain behavior in the memory system, which will be checked at an external guardian core that “approves” each memory request. By sitting between off-chip memory and the main core, the guardian core can monitor bus activity and verify the compiler-defined correctness of all memory writes. Experimental results on a conventional x86 platform demonstrate that application binaries can be statically reinstrumented to coordinate with the guardian core to monitor off-chip access, resulting in less than 60% overhead for the majority of the studied benchmarks.
{"title":"Detecting/preventing information leakage on the memory bus due to malicious hardware","authors":"Abhishek Das, G. Memik, Joseph Zambreno, A. Choudhary","doi":"10.1109/DATE.2010.5456930","DOIUrl":"https://doi.org/10.1109/DATE.2010.5456930","url":null,"abstract":"An increasing concern amongst designers and integrators of military and defense-related systems is the underlying security of the individual microprocessor components that make up these systems. Malicious circuitry can be inserted and hidden at several stages of the design process through the use of third-party Intellectual Property (IP), design tools, and manufacturing facilities. Such hardware Trojan circuitry has been shown to be capable of shutting down the main processor after a random number of cycles, broadcasting sensitive information over the bus, and bypassing software authentication mechanisms. In this work, we propose an architecture that can prevent information leakage due to such malicious hardware. Our technique is based on guaranteeing certain behavior in the memory system, which will be checked at an external guardian core that “approves” each memory request. By sitting between off-chip memory and the main core, the guardian core can monitor bus activity and verify the compiler-defined correctness of all memory writes. Experimental results on a conventional x86 platform demonstrate that application binaries can be statically reinstrumented to coordinate with the guardian core to monitor off-chip access, resulting in less than 60% overhead for the majority of the studied benchmarks.","PeriodicalId":432902,"journal":{"name":"2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115863942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-03-08DOI: 10.1109/DATE.2010.5457191
R. Ayoub, Shervin Sharifi, T. Simunic
In state of the art systems, workload scheduling and server fan speed operate independently leading to cooling inefficiencies. We propose GentleCool, a proactive multi-tier approach for significantly lowering the fan cooling costs without compromising the performance. Our technique manages the fan speed through intelligently allocating the workload across different machines. The experimental results show our approach delivers average cooling energy savings of 72% and improves the mean time between failures (MTBF) of the fans by 2.3X compared to the state of the art.
{"title":"GentleCool: Cooling aware proactive workload scheduling in multi-machine systems","authors":"R. Ayoub, Shervin Sharifi, T. Simunic","doi":"10.1109/DATE.2010.5457191","DOIUrl":"https://doi.org/10.1109/DATE.2010.5457191","url":null,"abstract":"In state of the art systems, workload scheduling and server fan speed operate independently leading to cooling inefficiencies. We propose GentleCool, a proactive multi-tier approach for significantly lowering the fan cooling costs without compromising the performance. Our technique manages the fan speed through intelligently allocating the workload across different machines. The experimental results show our approach delivers average cooling energy savings of 72% and improves the mean time between failures (MTBF) of the fans by 2.3X compared to the state of the art.","PeriodicalId":432902,"journal":{"name":"2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114403312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-03-08DOI: 10.1109/DATE.2010.5457203
H. Dadgour, K. Banerjee
Time-dependent performance degradation due to transistor aging caused by mechanisms such as Negative Bias Temperature Instability (NBTI) and Hot Carrier Injection (HCI) is one of the most important reliability concerns for deep nano-scale regime VLSI circuits. Hence, aging-resilient design methodologies are necessary to address this issue in order to improve reliability, preferably with minimal impact on the area, power and performance. This work offers two major contributions to the aging-resilient circuit design methodology literature. First, it introduces a novel sensor circuit that can detect the aging of pipeline architectures by monitoring the arrival time of data signals at flip-flops. The area overhead of the proposed circuit is estimated to be less than 45% compared to that of previous approaches, which are over 95%. To ensure the accuracy of its operation, a comprehensive timing analysis is performed on the proposed circuit including the influence of process variations. As a second contribution, this work presents an innovative correction technique to reduce the probability of timing failures caused by aging. This method employs novel reconfigurable flip-flops, which operate as normal flip-flops as long as the circuit is fresh, but function as time-borrowing flip-flops once the circuit ages. This unique flip-flop design allows utilization of the advantages of the time-borrowing technique while avoiding potential race conditions that can be created by employing such a technique. It is shown via simulations that by employing the proposed design methodology, the probability of timing failures in the aged circuits can be reduced by as much as 10X for various benchmark circuits.
{"title":"Aging-resilient design of pipelined architectures using novel detection and correction circuits","authors":"H. Dadgour, K. Banerjee","doi":"10.1109/DATE.2010.5457203","DOIUrl":"https://doi.org/10.1109/DATE.2010.5457203","url":null,"abstract":"Time-dependent performance degradation due to transistor aging caused by mechanisms such as Negative Bias Temperature Instability (NBTI) and Hot Carrier Injection (HCI) is one of the most important reliability concerns for deep nano-scale regime VLSI circuits. Hence, aging-resilient design methodologies are necessary to address this issue in order to improve reliability, preferably with minimal impact on the area, power and performance. This work offers two major contributions to the aging-resilient circuit design methodology literature. First, it introduces a novel sensor circuit that can detect the aging of pipeline architectures by monitoring the arrival time of data signals at flip-flops. The area overhead of the proposed circuit is estimated to be less than 45% compared to that of previous approaches, which are over 95%. To ensure the accuracy of its operation, a comprehensive timing analysis is performed on the proposed circuit including the influence of process variations. As a second contribution, this work presents an innovative correction technique to reduce the probability of timing failures caused by aging. This method employs novel reconfigurable flip-flops, which operate as normal flip-flops as long as the circuit is fresh, but function as time-borrowing flip-flops once the circuit ages. This unique flip-flop design allows utilization of the advantages of the time-borrowing technique while avoiding potential race conditions that can be created by employing such a technique. It is shown via simulations that by employing the proposed design methodology, the probability of timing failures in the aged circuits can be reduced by as much as 10X for various benchmark circuits.","PeriodicalId":432902,"journal":{"name":"2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117297896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Economakos, S. Xydis, Ioannis Koutras, D. Soudris
High-level synthesis has recently started to gain industrial acceptance, due to the improved quality of results and the multi-objective optimizations offered. One optimization area lately addressed is reconfigurable computing, where parts of a DFG are merged and mapped into coarse grained reconfigurable components. This paper presents an alternative approach, the construction of dual mode components which are exchanged with regular components in the resulting RTL architecture. The dual mode components are constructed by exhaustive search for dual mode functional primitives inside the datapath of complicated RTL components. Such components, like multipliers and dividers, that would remain idle in certain control steps, are able to work full-time in two different modes, without any reconfiguration overhead applied to the critical path of the application. The results obtained with different DSP benchmarks show an average performance gain of 15%, without any practical datapath area increase, offering uniform and balanced resource utilization.
{"title":"Construction of dual mode components for reconfiguration aware high-level synthesis","authors":"G. Economakos, S. Xydis, Ioannis Koutras, D. Soudris","doi":"10.5555/1870926.1871252","DOIUrl":"https://doi.org/10.5555/1870926.1871252","url":null,"abstract":"High-level synthesis has recently started to gain industrial acceptance, due to the improved quality of results and the multi-objective optimizations offered. One optimization area lately addressed is reconfigurable computing, where parts of a DFG are merged and mapped into coarse grained reconfigurable components. This paper presents an alternative approach, the construction of dual mode components which are exchanged with regular components in the resulting RTL architecture. The dual mode components are constructed by exhaustive search for dual mode functional primitives inside the datapath of complicated RTL components. Such components, like multipliers and dividers, that would remain idle in certain control steps, are able to work full-time in two different modes, without any reconfiguration overhead applied to the critical path of the application. The results obtained with different DSP benchmarks show an average performance gain of 15%, without any practical datapath area increase, offering uniform and balanced resource utilization.","PeriodicalId":432902,"journal":{"name":"2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116287087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-03-08DOI: 10.1109/DATE.2010.5457124
P. Ghosh, Arunabha Sen
Power consumption can be significantly reduced in Systems-on-Chip (SoC) by scaling down the voltage levels of the Processing Elements (PEs). The power efficiency of this Voltage Islanding technique comes at the cost of energy and area overhead due to the level shifters between voltage islands. Moreover, from the physical design perspective it is not desirable to have an excessive number of voltage islands on the chip. Considering voltage islanding at an early phase of design as during floorplanning of the PEs can address various of these issues. In this paper, we propose a new cost function for the floorplanning objective different from the traditional floorplanning objective. The new cost function not only includes the overall area requirement, but also incorporates the overall power consumption and the design constraint imposed on the maximum number of voltage islands. We propose a greedy heuristic based on the proposed cost function for the floorplanning of the PEs with several voltage islands. Experimental results using benchmark data study the effect of several parameters on the outcome of the heuristic. It is evident from the results that power consumption can be significantly reduced using our algorithm without significant area overhead. The area obtained from the heuristic is also compared with the optimal, and found to be within 4% of the optimal on average, when area minimization is given the priority.
{"title":"Power efficient voltage islanding for Systems-on-chip from a floorplanning perspective","authors":"P. Ghosh, Arunabha Sen","doi":"10.1109/DATE.2010.5457124","DOIUrl":"https://doi.org/10.1109/DATE.2010.5457124","url":null,"abstract":"Power consumption can be significantly reduced in Systems-on-Chip (SoC) by scaling down the voltage levels of the Processing Elements (PEs). The power efficiency of this Voltage Islanding technique comes at the cost of energy and area overhead due to the level shifters between voltage islands. Moreover, from the physical design perspective it is not desirable to have an excessive number of voltage islands on the chip. Considering voltage islanding at an early phase of design as during floorplanning of the PEs can address various of these issues. In this paper, we propose a new cost function for the floorplanning objective different from the traditional floorplanning objective. The new cost function not only includes the overall area requirement, but also incorporates the overall power consumption and the design constraint imposed on the maximum number of voltage islands. We propose a greedy heuristic based on the proposed cost function for the floorplanning of the PEs with several voltage islands. Experimental results using benchmark data study the effect of several parameters on the outcome of the heuristic. It is evident from the results that power consumption can be significantly reduced using our algorithm without significant area overhead. The area obtained from the heuristic is also compared with the optimal, and found to be within 4% of the optimal on average, when area minimization is given the priority.","PeriodicalId":432902,"journal":{"name":"2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)","volume":"405 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123528724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-03-08DOI: 10.1109/DATE.2010.5457070
Fahimeh Jafari, Zhonghai Lu, A. Jantsch, M. Moghaddam
We have proposed (σ, ρ)-based flow regulation to reduce delay and backlog bounds in SoC architectures, where σ bounds the traffic burstiness and ρ the traffic rate. The regulation is conducted per-flow for its peak rate and traffic burstiness. In this paper, we optimize these regulation parameters in networks on chips where many flows may have conflicting regulation requirements. We formulate an optimization problem for minimizing total buffers under performance constraints. We solve the problem with the interior point method. Our case study results exhibit 48% reduction of total buffers and 16% reduction of total latency for the proposed problem. The optimization solution has low run-time complexity, enabling quick exploration of large design space.
{"title":"Optimal regulation of traffic flows in networks-on-chip","authors":"Fahimeh Jafari, Zhonghai Lu, A. Jantsch, M. Moghaddam","doi":"10.1109/DATE.2010.5457070","DOIUrl":"https://doi.org/10.1109/DATE.2010.5457070","url":null,"abstract":"We have proposed (σ, ρ)-based flow regulation to reduce delay and backlog bounds in SoC architectures, where σ bounds the traffic burstiness and ρ the traffic rate. The regulation is conducted per-flow for its peak rate and traffic burstiness. In this paper, we optimize these regulation parameters in networks on chips where many flows may have conflicting regulation requirements. We formulate an optimization problem for minimizing total buffers under performance constraints. We solve the problem with the interior point method. Our case study results exhibit 48% reduction of total buffers and 16% reduction of total latency for the proposed problem. The optimization solution has low run-time complexity, enabling quick exploration of large design space.","PeriodicalId":432902,"journal":{"name":"2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123547793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-03-08DOI: 10.1109/DATE.2010.5456981
Zuochang Ye, L. M. Silveira, J. Phillips
An efficient algorithmbased on the Extended Hamiltonian Pencil was proposed in [1] for systems with hybrid representation. Here we further extend the Extended Hamiltonian Pencil method to systems described with scattering representation, i.e. S-parameter systems. The derivation of the Extended Hamiltonian Pencil for S-parameter systems is presented. Some properties that allow passivity enforcement based on eigenvalue displacement are reported. Experimental results demonstrate the effectiveness of the proposed method.
{"title":"Extended Hamiltonian Pencil for passivity assessment and enforcement for S-parameter systems","authors":"Zuochang Ye, L. M. Silveira, J. Phillips","doi":"10.1109/DATE.2010.5456981","DOIUrl":"https://doi.org/10.1109/DATE.2010.5456981","url":null,"abstract":"An efficient algorithmbased on the Extended Hamiltonian Pencil was proposed in [1] for systems with hybrid representation. Here we further extend the Extended Hamiltonian Pencil method to systems described with scattering representation, i.e. S-parameter systems. The derivation of the Extended Hamiltonian Pencil for S-parameter systems is presented. Some properties that allow passivity enforcement based on eigenvalue displacement are reported. Experimental results demonstrate the effectiveness of the proposed method.","PeriodicalId":432902,"journal":{"name":"2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117214704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2010-03-08DOI: 10.1109/DATE.2010.5457040
Roy Shea, M. Srivastava, Young H. Cho
Detailed diagnostic data is a prerequisite for debugging problems and understanding runtime performance in distributed wireless embedded systems. Severe bandwidth limitations, tight timing constraints, and limited program text space hinder the application of standard diagnostic tools within this domain. This work introduces the Log Instrumentation Specification (LIS), which provides a high level logging interface to developers and is able to create extremely compact diagnostic logs. LIS uses a token scoping technique to aggressively compact identifiers that are packed into bit aligned log buffers. LIS is evaluated in the context of recording call traces within a network of wireless sensor nodes. Our evaluation shows that logs generated using LIS require less than 50% of the bandwidth utilized by alternate logging mechanisms. Through microbench-marking of a complete LIS implementation for the TinyOS operating system, we demonstrate that LIS can comfortably fit onto low-end embedded systems. By significantly reducing log bandwidth, LIS enables extraction of a more complete picture of runtime behavior from distributed wireless embedded systems.
{"title":"Scoped identifiers for efficient bit aligned logging","authors":"Roy Shea, M. Srivastava, Young H. Cho","doi":"10.1109/DATE.2010.5457040","DOIUrl":"https://doi.org/10.1109/DATE.2010.5457040","url":null,"abstract":"Detailed diagnostic data is a prerequisite for debugging problems and understanding runtime performance in distributed wireless embedded systems. Severe bandwidth limitations, tight timing constraints, and limited program text space hinder the application of standard diagnostic tools within this domain. This work introduces the Log Instrumentation Specification (LIS), which provides a high level logging interface to developers and is able to create extremely compact diagnostic logs. LIS uses a token scoping technique to aggressively compact identifiers that are packed into bit aligned log buffers. LIS is evaluated in the context of recording call traces within a network of wireless sensor nodes. Our evaluation shows that logs generated using LIS require less than 50% of the bandwidth utilized by alternate logging mechanisms. Through microbench-marking of a complete LIS implementation for the TinyOS operating system, we demonstrate that LIS can comfortably fit onto low-end embedded systems. By significantly reducing log bandwidth, LIS enables extraction of a more complete picture of runtime behavior from distributed wireless embedded systems.","PeriodicalId":432902,"journal":{"name":"2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123975940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}