The number of parameters in deep neural networks (DNNs) is scaling at about 5× the rate of Moore's Law. To sustain this growth, photonic computing is a promising avenue, as it enables higher throughput than electrical hardware for the general matrix-matrix multiplication (GEMM) operations that dominate DNNs. However, purely photonic systems face several challenges, including the lack of photonic memory and the accumulation of noise. In this paper, we present an electro-photonic accelerator, ADEPT, which leverages a photonic computing unit for performing GEMM operations, a vectorized digital electronic ASIC for performing non-GEMM operations, and SRAM arrays for storing DNN parameters and activations. In contrast to prior works on photonic DNN accelerators, we adopt a system-level perspective and show that the gains, while large, are tempered relative to prior expectations. Our goal is to encourage architects to explore photonic technology more pragmatically, considering the system as a whole, to understand its general applicability in accelerating today's DNNs. Our evaluation shows that ADEPT provides, on average, 5.73× higher throughput per watt than traditional systolic arrays (SAs) in a full-system comparison, and at least 6.8× and 2.5× better throughput per watt than state-of-the-art electronic and photonic accelerators, respectively.
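To make the division of labor concrete, the following minimal sketch shows how a compiler or runtime might route DNN operations between a photonic GEMM core and a digital vector unit; the Layer type, operation names, and dispatch helper are illustrative assumptions, not the ADEPT implementation.

from dataclasses import dataclass

@dataclass
class Layer:
    op: str                 # e.g. "matmul", "relu"
    payload: object = None  # weights/activations, omitted here

GEMM_OPS = {"matmul", "conv2d", "linear"}  # assumed to map to the photonic core

def dispatch(layer, photonic_gemm, digital_vector_unit):
    """Route a layer to the unit assumed to execute it most efficiently."""
    if layer.op in GEMM_OPS:
        return photonic_gemm(layer)        # analog matrix-matrix multiply
    return digital_vector_unit(layer)      # element-wise / non-GEMM ops in digital logic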
For successful printed circuit board (PCB) reverse engineering (RE), the resulting device must retain the physical characteristics and functionality of the original. Although the applications of RE are within the discretion of the executing party, establishing a viable, non-destructive framework for analysis is vital for any stakeholder in the PCB industry. A widely regarded approach in PCB RE uses non-destructive x-ray computed tomography (CT) to produce three-dimensional volumes with several slices of data corresponding to multi-layered PCBs. However, the noise sources specific to x-ray CT and the variability introduced by designers hamper the thorough acquisition of features necessary for successful RE. This article investigates a deep learning approach as a successor to the current state-of-the-art for detecting vias in PCB x-ray CT images; vias are a key building block of PCB designs, and during RE they reveal the PCB's electrical connections across multiple layers. Our method improves on an earlier iteration and demonstrates significantly faster runtime with quality of results comparable to or better than the current state-of-the-art, an unsupervised iterative Hough-based method. Compared with the Hough-based method, the current framework is 4.5 times faster for the discrete image scenario and 24.1 times faster for the volumetric image scenario. The upgrades over the prior deep learning version include faster feature-based detection for real-world usability and adaptive post-processing methods that improve the quality of detections.
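As a rough illustration of patch-level via detection (not the paper's network; the layer sizes and the 32×32 patch size are assumptions), a small convolutional classifier over grayscale CT patches might look like the following.

import torch
import torch.nn as nn

class ViaPatchClassifier(nn.Module):
    """Scores a grayscale x-ray CT patch as via / not-via (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32 * 8 * 8, 2)  # assumes 32x32 input patches

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

# Usage: scores for a batch of four 32x32 patches.
scores = ViaPatchClassifier()(torch.randn(4, 1, 32, 32))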
Due to advances in transistor technology, a single-chip processor can now have hundreds of cores. The network-on-chip (NoC) has become the dominant interconnect fabric for multi/many-core on-chip systems because of its scalability and parallelism. With the rise of dark silicon following the end of Dennard scaling, it has become essential to design energy-efficient and high-performance heterogeneous NoC-based multi/many-core architectures. The design space, however, is large and complex, making it difficult to explore within a reasonable time for optimal energy-performance-reliability trade-offs. Furthermore, reactive resource management is not effective at preventing problems in adaptive systems. Therefore, in this work, we explore machine learning techniques to design and configure NoC resources based on learned characteristics of the system and application workloads. Machine learning can automatically learn from past experience and guide the NoC intelligently toward its performance, power, and reliability objectives. We present the challenges of NoC design and resource management and propose a generalized machine learning framework to uncover near-optimal solutions quickly. Using this framework, we propose and implement a NoC design and optimization solution enabled by neural networks. Simulation results demonstrate that the proposed neural-network-based design and optimization solution improves performance by 15% and reduces energy consumption by 6% compared to an existing non-machine-learning-based solution, and improves NoC latency and throughput compared to two existing machine-learning-based NoC optimization solutions. We also discuss the challenges of adapting machine learning techniques to multi/many-core NoCs to guide future research.
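As an illustration of the general idea (not the paper's framework), a learned surrogate model can map NoC configuration parameters to predicted latency and energy and then rank candidate designs; the feature names, regressor choice, and numbers below are toy assumptions.

import numpy as np
from sklearn.neural_network import MLPRegressor

# Each row: [num_virtual_channels, buffer_depth, link_width_bits, injection_rate]
configs = np.array([[2, 4, 64, 0.1], [4, 8, 128, 0.1], [2, 8, 64, 0.2]])
# Each row: [latency_cycles, energy_mJ] -- toy values for illustration only.
targets = np.array([[34.0, 1.2], [22.0, 2.1], [41.0, 1.4]])

# Train a small neural-network surrogate on previously simulated design points.
surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000).fit(configs, targets)

# Rank unseen candidate configurations with a toy weighted latency/energy objective.
candidates = np.array([[2, 4, 128, 0.1], [4, 4, 64, 0.1]])
pred = surrogate.predict(candidates)
best = candidates[np.argmin(pred[:, 0] + 10.0 * pred[:, 1])]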
Laboratory protocols are critical to biological research and development, yet difficult to communicate and reproduce across projects, investigators, and organizations. While many attempts have been made to address this challenge, there is currently no available protocol representation that is unambiguous enough for precise interpretation and automation, yet simultaneously “human friendly” and abstract enough to enable reuse and adaptation. The Laboratory Open Protocol language (LabOP) is a free and open protocol representation aiming to address this gap, building on a foundation of UML, Autoprotocol, Aquarium, SBOL RDF, and the Provenance Ontology. LabOP provides a linked-data representation both for protocols and for records of their execution and the resulting data, as well as a framework for exporting from LabOP for execution by either humans or laboratory automation. LabOP is currently implemented in the form of an RDF knowledge representation, specification document, and Python library, and supports execution as manual “paper protocols,” via Autoprotocol, or by Opentrons. From this initial implementation, LabOP is being further developed as an open community effort.
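To illustrate the linked-data flavor of such a representation (using placeholder terms, not the actual LabOP ontology), a single liquid-transfer step could be captured as RDF triples with rdflib as follows.

from rdflib import Graph, Namespace, Literal, RDF

# Placeholder namespace and terms; the real LabOP vocabulary differs.
EX = Namespace("http://example.org/protocol#")
g = Graph()

g.add((EX.step1, RDF.type, EX.TransferStep))
g.add((EX.step1, EX.source, EX.reagent_plate_A1))
g.add((EX.step1, EX.destination, EX.assay_plate_B2))
g.add((EX.step1, EX.volume, Literal("50 microliters")))

print(g.serialize(format="turtle"))  # human-readable linked-data serialization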
Modern network-on-chip (NoC) hardware is an emerging target for side-channel security attacks. A recent work implemented and characterized timing-based software side-channel attacks that target NoC hardware on a real multicore machine. This article studies the impact of system noise on prior attack setups and shows that high noise is sufficient to defeat the attacker. We propose an information-theory-based attack setup that uses repetition codes and differential signaling techniques to remove unwanted noise from the NoC channel and successfully implement a practical covert-communication attack on a real multicore machine. The evaluation demonstrates an attack efficacy of 97%, 88%, and 78% under low, medium, and high external noise, respectively. Our attack characterization reveals that noise-based mitigation schemes are inadequate to prevent practical covert communication; isolation-based mitigation schemes must therefore be considered to ensure strong security. Isolation-based schemes are shown to mitigate timing-based side-channel attacks; however, their impact on the performance of real-world security-critical workloads is not well understood in the literature. This article evaluates the performance implications of state-of-the-art spatial and temporal isolation schemes. The performance impact is shown to range from 2% to 3% for a set of graph and machine learning workloads, making isolation-based mitigations practical.
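The coding idea can be sketched independently of the attack itself: each covert bit is repeated several times and recovered by majority vote, tolerating occasional timing-measurement errors caused by system noise. The repetition factor below is an assumption, not the paper's parameter.

from collections import Counter

REPEAT = 5  # repetition factor (assumed for illustration)

def encode(bits):
    """Repeat each covert bit REPEAT times before transmission."""
    return [b for b in bits for _ in range(REPEAT)]

def decode(symbols):
    """Recover each bit by majority vote over its REPEAT received symbols."""
    return [Counter(symbols[i:i + REPEAT]).most_common(1)[0][0]
            for i in range(0, len(symbols), REPEAT)]

# Example: noise flips one received symbol, but decoding still succeeds.
sent = encode([1, 0, 1])
noisy = sent.copy()
noisy[2] ^= 1
assert decode(noisy) == [1, 0, 1]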
System-on-chip (SoC) developers increasingly rely on pre-verified hardware intellectual property (IP) blocks, often acquired from untrusted third-party vendors. These IPs might contain hidden malicious functionalities or hardware Trojans that could compromise the security of the fabricated SoCs. The lack of golden or reference models and the vast space of possible Trojan attacks are among the major barriers to detecting hardware Trojans in these third-party IP (3PIP) blocks. Recently, supervised machine learning (ML) techniques have shown promising capability in identifying nets of potential Trojans in 3PIPs without the need for golden models. However, they present several major challenges. First, they do not guide us to an optimal choice of features that reliably covers diverse classes of Trojans. Second, they require multiple Trojan-free/trusted designs into which known Trojans are inserted to generate a trained model. Even if a set of trusted designs is available for training, the suspect IP can have an inherently very different structure from the trusted designs, which may negatively impact the verification outcome. Third, these techniques only identify a set of suspect Trojan nets, which requires manual intervention to understand the potential threat. In this article, we present VIPR, a systematic ML-based trust verification solution for 3PIPs that eliminates the need for trusted designs for training. We present a comprehensive framework, associated algorithms, and a tool flow for obtaining an optimal set of features, training a targeted ML model, detecting suspect nets, and identifying Trojan circuitry from the suspect nets. We evaluate the framework on several Trust-Hub Trojan benchmarks and provide a comparative analysis of detection performance across different trained models, feature selections, and post-processing techniques. We demonstrate promising Trojan detection accuracy for VIPR, with the proposed post-processing algorithm reducing false positives by up to 92.85%.
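To illustrate the per-net classification step in general terms (not VIPR's training procedure, which avoids trusted designs), a classifier can be trained on structural features extracted for each net and then used to flag suspect nets; the feature names and toy data below are assumptions.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row: [fan_in, fan_out, distance_to_primary_output, toggle_rate]
net_features = np.array([[2, 3, 5, 0.40], [6, 1, 12, 0.01], [3, 2, 4, 0.35]])
labels = np.array([0, 1, 0])  # 1 = synthetically inserted Trojan net (toy data)

# Train on nets labeled via synthetic Trojan insertion, then score unseen nets.
clf = RandomForestClassifier(n_estimators=100).fit(net_features, labels)
suspect_mask = clf.predict(np.array([[5, 1, 11, 0.02]])) == 1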
Fluidic automation, the practice of programmatically manipulating small volumes of fluid to execute laboratory protocols, has led to vastly increased productivity for biologists and chemists. Most fluidic programs, commonly referred to as protocols, are written using APIs that couple the protocol to specific hardware by referring to physical locations on the device. This coupling makes isolation impossible, preventing portability, concurrent execution, and composition of protocols on the same device.
We propose a system for virtualizing existing fluidic protocols, without modification, on top of a single runtime system. Our system presents an isolated view of the device to each running protocol, allowing each protocol to assume it has sole access to the hardware. We provide a proof-of-concept implementation that can concurrently execute and compose protocols written using the popular Opentrons Python API. Concurrent execution achieves near-linear speedup over serial execution, since protocols spend much of their time waiting.
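The coupling described above is visible in a typical Opentrons-style protocol, which hard-codes deck slots and instrument mounts; a virtualization layer can remap exactly these physical references before execution. The labware and pipette names below are illustrative choices, not part of the proposed system.

from opentrons import protocol_api

metadata = {"apiLevel": "2.13"}

def run(protocol: protocol_api.ProtocolContext):
    # Deck slots 1 and 2 and the "right" mount are physical locations baked
    # into the protocol; running two such protocols at once would conflict.
    plate = protocol.load_labware("corning_96_wellplate_360ul_flat", 1)
    tips = protocol.load_labware("opentrons_96_tiprack_300ul", 2)
    pipette = protocol.load_instrument("p300_single_gen2", "right", tip_racks=[tips])
    pipette.transfer(100, plate["A1"], plate["B1"])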