The number of parameters in deep neural networks (DNNs) is scaling at about 5× the rate of Moore's Law. To sustain this growth, photonic computing is a promising avenue, as it enables higher throughput than electrical hardware for the general matrix-matrix multiplication (GEMM) operations that dominate DNNs. However, purely photonic systems face several challenges, including the lack of photonic memory and the accumulation of noise. In this paper, we present an electro-photonic accelerator, ADEPT, which leverages a photonic computing unit for performing GEMM operations, a vectorized digital electronic ASIC for performing non-GEMM operations, and SRAM arrays for storing DNN parameters and activations. In contrast to prior works on photonic DNN accelerators, we adopt a system-level perspective and show that the gains, while large, are tempered relative to prior expectations. Our goal is to encourage architects to explore photonic technology more pragmatically, considering the system as a whole, to understand its general applicability in accelerating today's DNNs. Our evaluation shows that ADEPT provides, on average, 5.73× higher throughput per watt than traditional systolic arrays (SAs) in a full-system comparison, and at least 6.8× and 2.5× better throughput per watt than state-of-the-art electronic and photonic accelerators, respectively.
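To make the division of labor concrete, the following minimal sketch shows how a compiler or runtime might route DNN operations between a photonic GEMM core and a digital vector unit; the Layer type, operation names, and dispatch helper are illustrative assumptions, not the ADEPT implementation.

from dataclasses import dataclass

@dataclass
class Layer:
    op: str                 # e.g. "matmul", "relu"
    payload: object = None  # weights/activations, omitted here

GEMM_OPS = {"matmul", "conv2d", "linear"}  # assumed to map to the photonic core

def dispatch(layer, photonic_gemm, digital_vector_unit):
    """Route a layer to the unit assumed to execute it most efficiently."""
    if layer.op in GEMM_OPS:
        return photonic_gemm(layer)        # analog matrix-matrix multiply
    return digital_vector_unit(layer)      # element-wise / non-GEMM ops in digital logic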
For successful printed circuit board (PCB) reverse engineering (RE), the resulting device must retain the physical characteristics and functionality of the original. Although the applications of RE are within the discretion of the executing party, establishing a viable, non-destructive framework for analysis is vital for any stakeholder in the PCB industry. A widely regarded approach in PCB RE uses non-destructive x-ray computed tomography (CT) to produce three-dimensional volumes with several slices of data corresponding to multi-layered PCBs. However, the noise sources specific to x-ray CT and the variability introduced by designers hamper the thorough acquisition of features necessary for successful RE. This article investigates a deep learning approach as a successor to the current state-of-the-art for detecting vias in PCB x-ray CT images; vias are a key building block of PCB designs, and during RE they reveal the PCB's electrical connections across multiple layers. Our method improves on an earlier iteration and demonstrates significantly faster runtime with quality of results comparable to or better than the current state-of-the-art, an unsupervised iterative Hough-based method. Compared with the Hough-based method, the current framework is 4.5 times faster for the discrete image scenario and 24.1 times faster for the volumetric image scenario. The upgrades over the prior deep learning version include faster feature-based detection for real-world usability and adaptive post-processing methods that improve the quality of detections.
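As a rough illustration of patch-level via detection (not the paper's network; the layer sizes and the 32×32 patch size are assumptions), a small convolutional classifier over grayscale CT patches might look like the following.

import torch
import torch.nn as nn

class ViaPatchClassifier(nn.Module):
    """Scores a grayscale x-ray CT patch as via / not-via (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32 * 8 * 8, 2)  # assumes 32x32 input patches

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

# Usage: scores for a batch of four 32x32 patches.
scores = ViaPatchClassifier()(torch.randn(4, 1, 32, 32))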
Due to advances in transistor technology, a single-chip processor can now have hundreds of cores. The network-on-chip (NoC) has become the dominant interconnect fabric for multi/many-core on-chip systems because of its scalability and parallelism. With the rise of dark silicon following the end of Dennard scaling, it has become essential to design energy-efficient and high-performance heterogeneous NoC-based multi/many-core architectures. The design space, however, is large and complex, making it difficult to explore within a reasonable time for optimal energy-performance-reliability trade-offs. Furthermore, reactive resource management is not effective at preventing problems in adaptive systems. Therefore, in this work, we explore machine learning techniques to design and configure NoC resources based on learned characteristics of the system and application workloads. Machine learning can automatically learn from past experience and guide the NoC intelligently toward its performance, power, and reliability objectives. We present the challenges of NoC design and resource management and propose a generalized machine learning framework to uncover near-optimal solutions quickly. Using this framework, we propose and implement a NoC design and optimization solution enabled by neural networks. Simulation results demonstrate that the proposed neural-network-based design and optimization solution improves performance by 15% and reduces energy consumption by 6% compared to an existing non-machine-learning-based solution, and improves NoC latency and throughput compared to two existing machine-learning-based NoC optimization solutions. We also discuss the challenges of adapting machine learning techniques to multi/many-core NoCs to guide future research.
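As an illustration of the general idea (not the paper's framework), a learned surrogate model can map NoC configuration parameters to predicted latency and energy and then rank candidate designs; the feature names, regressor choice, and numbers below are toy assumptions.

import numpy as np
from sklearn.neural_network import MLPRegressor

# Each row: [num_virtual_channels, buffer_depth, link_width_bits, injection_rate]
configs = np.array([[2, 4, 64, 0.1], [4, 8, 128, 0.1], [2, 8, 64, 0.2]])
# Each row: [latency_cycles, energy_mJ] -- toy values for illustration only.
targets = np.array([[34.0, 1.2], [22.0, 2.1], [41.0, 1.4]])

# Train a small neural-network surrogate on previously simulated design points.
surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000).fit(configs, targets)

# Rank unseen candidate configurations with a toy weighted latency/energy objective.
candidates = np.array([[2, 4, 128, 0.1], [4, 4, 64, 0.1]])
pred = surrogate.predict(candidates)
best = candidates[np.argmin(pred[:, 0] + 10.0 * pred[:, 1])]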
Laboratory protocols are critical to biological research and development, yet difficult to communicate and reproduce across projects, investigators, and organizations. While many attempts have been made to address this challenge, there is currently no available protocol representation that is unambiguous enough for precise interpretation and automation, yet simultaneously “human friendly” and abstract enough to enable reuse and adaptation. The Laboratory Open Protocol language (LabOP) is a free and open protocol representation aiming to address this gap, building on a foundation of UML, Autoprotocol, Aquarium, SBOL RDF, and the Provenance Ontology. LabOP provides a linked-data representation both for protocols and for records of their execution and the resulting data, as well as a framework for exporting from LabOP for execution by either humans or laboratory automation. LabOP is currently implemented in the form of an RDF knowledge representation, specification document, and Python library, and supports execution as manual “paper protocols,” via Autoprotocol, or by Opentrons. From this initial implementation, LabOP is being further developed as an open community effort.
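To illustrate the linked-data flavor of such a representation (using placeholder terms, not the actual LabOP ontology), a single liquid-transfer step could be captured as RDF triples with rdflib as follows.

from rdflib import Graph, Namespace, Literal, RDF

# Placeholder namespace and terms; the real LabOP vocabulary differs.
EX = Namespace("http://example.org/protocol#")
g = Graph()

g.add((EX.step1, RDF.type, EX.TransferStep))
g.add((EX.step1, EX.source, EX.reagent_plate_A1))
g.add((EX.step1, EX.destination, EX.assay_plate_B2))
g.add((EX.step1, EX.volume, Literal("50 microliters")))

print(g.serialize(format="turtle"))  # human-readable linked-data serialization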
Modern network-on-chip (NoC) hardware is an emerging target for side-channel security attacks. A recent work implemented and characterized timing-based software side-channel attacks that target NoC hardware on a real multicore machine. This article studies the impact of system noise on prior attack setups and shows that high noise is sufficient to defeat the attacker. We propose an information-theory-based attack setup that uses repetition codes and differential signaling techniques to remove unwanted noise from the NoC channel and successfully implement a practical covert-communication attack on a real multicore machine. The evaluation demonstrates an attack efficacy of 97%, 88%, and 78% under low, medium, and high external noise, respectively. Our attack characterization reveals that noise-based mitigation schemes are inadequate to prevent practical covert communication; isolation-based mitigation schemes must therefore be considered to ensure strong security. Isolation-based schemes are shown to mitigate timing-based side-channel attacks; however, their impact on the performance of real-world security-critical workloads is not well understood in the literature. This article evaluates the performance implications of state-of-the-art spatial and temporal isolation schemes. The performance impact is shown to range from 2% to 3% for a set of graph and machine learning workloads, making isolation-based mitigations practical.
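The coding idea can be sketched independently of the attack itself: each covert bit is repeated several times and recovered by majority vote, tolerating occasional timing-measurement errors caused by system noise. The repetition factor below is an assumption, not the paper's parameter.

from collections import Counter

REPEAT = 5  # repetition factor (assumed for illustration)

def encode(bits):
    """Repeat each covert bit REPEAT times before transmission."""
    return [b for b in bits for _ in range(REPEAT)]

def decode(symbols):
    """Recover each bit by majority vote over its REPEAT received symbols."""
    return [Counter(symbols[i:i + REPEAT]).most_common(1)[0][0]
            for i in range(0, len(symbols), REPEAT)]

# Example: noise flips one received symbol, but decoding still succeeds.
sent = encode([1, 0, 1])
noisy = sent.copy()
noisy[2] ^= 1
assert decode(noisy) == [1, 0, 1]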
System-on-chip (SoC) developers increasingly rely on pre-verified hardware intellectual property (IP) blocks, often acquired from untrusted third-party vendors. These IPs might contain hidden malicious functionalities or hardware Trojans that could compromise the security of the fabricated SoCs. The lack of golden or reference models and the vast space of possible Trojan attacks are among the major barriers to detecting hardware Trojans in these third-party IP (3PIP) blocks. Recently, supervised machine learning (ML) techniques have shown promising capability in identifying nets of potential Trojans in 3PIPs without the need for golden models. However, they present several major challenges. First, they do not guide us to an optimal choice of features that reliably covers diverse classes of Trojans. Second, they require multiple Trojan-free/trusted designs into which known Trojans are inserted to generate a trained model. Even if a set of trusted designs is available for training, the suspect IP can have an inherently very different structure from the trusted designs, which may negatively impact the verification outcome. Third, these techniques only identify a set of suspect Trojan nets, which requires manual intervention to understand the potential threat. In this article, we present VIPR, a systematic ML-based trust verification solution for 3PIPs that eliminates the need for trusted designs for training. We present a comprehensive framework, associated algorithms, and a tool flow for obtaining an optimal set of features, training a targeted ML model, detecting suspect nets, and identifying Trojan circuitry from the suspect nets. We evaluate the framework on several Trust-Hub Trojan benchmarks and provide a comparative analysis of detection performance across different trained models, feature selections, and post-processing techniques. We demonstrate promising Trojan detection accuracy for VIPR, with the proposed post-processing algorithm reducing false positives by up to 92.85%.
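To illustrate the per-net classification step in general terms (not VIPR's training procedure, which avoids trusted designs), a classifier can be trained on structural features extracted for each net and then used to flag suspect nets; the feature names and toy data below are assumptions.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Each row: [fan_in, fan_out, distance_to_primary_output, toggle_rate]
net_features = np.array([[2, 3, 5, 0.40], [6, 1, 12, 0.01], [3, 2, 4, 0.35]])
labels = np.array([0, 1, 0])  # 1 = synthetically inserted Trojan net (toy data)

# Train on nets labeled via synthetic Trojan insertion, then score unseen nets.
clf = RandomForestClassifier(n_estimators=100).fit(net_features, labels)
suspect_mask = clf.predict(np.array([[5, 1, 11, 0.02]])) == 1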
Fluidic automation, the practice of programmatically manipulating small volumes of fluid to execute laboratory protocols, has led to vastly increased productivity for biologists and chemists. Most fluidic programs, commonly referred to as protocols, are written using APIs that couple the protocol to specific hardware by referring to physical locations on the device. This coupling makes isolation impossible, preventing portability, concurrent execution, and composition of protocols on the same device.
We propose a system for virtualizing existing fluidic protocols, without modification, on top of a single runtime system. Our system presents an isolated view of the device to each running protocol, allowing each protocol to assume it has sole access to the hardware. We provide a proof-of-concept implementation that can concurrently execute and compose protocols written using the popular Opentrons Python API. Concurrent execution achieves near-linear speedup over serial execution, since protocols spend much of their time waiting.
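The coupling described above is visible in a typical Opentrons-style protocol, which hard-codes deck slots and instrument mounts; a virtualization layer can remap exactly these physical references before execution. The labware and pipette names below are illustrative choices, not part of the proposed system.

from opentrons import protocol_api

metadata = {"apiLevel": "2.13"}

def run(protocol: protocol_api.ProtocolContext):
    # Deck slots 1 and 2 and the "right" mount are physical locations baked
    # into the protocol; running two such protocols at once would conflict.
    plate = protocol.load_labware("corning_96_wellplate_360ul_flat", 1)
    tips = protocol.load_labware("opentrons_96_tiprack_300ul", 2)
    pipette = protocol.load_instrument("p300_single_gen2", "right", tip_racks=[tips])
    pipette.transfer(100, plate["A1"], plate["B1"])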