IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems: Latest Articles


PARS: A Pattern-Aware Spatial Data Prefetcher Supporting Multiple Region Sizes

IF 2.7 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Pub Date : 2024-11-06 DOI: 10.1109/TCAD.2024.3442981

Yiquan Lin;Wenhai Lin;Jiexiong Xu;Yiquan Chen;Zhen Jin;Jingchang Qin;Jiahao He;Shishun Cai;Yuzhong Zhang;Zonghui Wang;Wenzhi Chen

Hardware data prefetching is a well-studied technique to bridge the processor-memory performance gap. Bit-pattern-based prefetchers are among the most promising spatial data prefetchers and achieve substantial performance gains. In bit-pattern-based prefetchers, the region size is a crucial parameter, which denotes the memory size that can be recorded by a pattern or prefetched by a prediction. However, existing bit-pattern-based prefetchers only support one fixed region size. Our experiment shows that a fixed region size cannot meet the requirements of numerous applications and leads to suboptimal performance and high hardware overhead. In this article, we propose PARS, a pattern-aware spatial data prefetcher supporting multiple region sizes. The key idea of PARS is that it supports multiple region sizes, enabling it to enhance application performance while reducing the hardware overhead. Moreover, PARS supports dynamically switching to appropriate region sizes for different patterns through an adaptive RS-switching mechanism. We evaluated PARS on numerous workloads, and the results show that PARS provides an average performance improvement of 40.6% over a baseline with no data prefetchers and outperforms the two state-of-the-art prefetchers Bingo by 2.1% (up to 24.4%) and Pythia by 3.9% (up to 111.2%) in the single-core system. In the four-core system, PARS outperforms Bingo by 5.0% (up to 66.0%) and Pythia by 5.4% (up to 177.9%).
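To make the role of the region size concrete, the following is a minimal Python sketch of how a bit pattern can record which cache blocks of a region were accessed. It is an illustration only and not the PARS design: the 64-byte block size, the function names `pattern_bits` and `footprints`, and the sample addresses are all assumptions introduced here.

```python
BLOCK_SIZE = 64  # bytes per cache block (assumed, not taken from the article)


def pattern_bits(region_size):
    """Bits needed by one pattern: one bit per cache block in the region."""
    return region_size // BLOCK_SIZE


def footprints(addresses, region_size):
    """Map each touched region's base address to its access bit pattern."""
    patterns = {}
    for addr in addresses:
        offset = addr % region_size          # position inside the region
        base = addr - offset                 # region-aligned base address
        patterns[base] = patterns.get(base, 0) | (1 << (offset // BLOCK_SIZE))
    return patterns


accesses = [0, 64, 192, 2176]                # hypothetical byte addresses
small = footprints(accesses, 2048)           # two regions, 32-bit patterns
large = footprints(accesses, 4096)           # one region, 64-bit pattern
```

With 2 KB regions the four accesses are split across two 32-bit patterns, while with 4 KB regions they fit in a single 64-bit pattern. A larger region covers more memory per pattern but costs more bits per entry, which is the kind of trade-off a fixed region size cannot adjust per application.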


Volume 43, Issue 11, pp. 3638-3649

Citations: 0

Revisiting Dynamic Scheduling of Control Tasks: A Performance-Aware Fine-Grained Approach

IF 2.7 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Pub Date : 2024-11-06 DOI: 10.1109/TCAD.2024.3443007

Sunandan Adhikary;Ipsita Koley;Saurav Kumar Ghosh;Sumana Ghosh;Soumyajit Dey

Modern cyber-physical systems (CPSs) employ an increasingly large number of software control loops to enhance their autonomous capabilities. Such large task sets and their dependencies may lead to deadline misses caused by platform-level timing uncertainties, resource contention, etc. To ensure the schedulability of the task set in the embedded platform in the presence of these uncertainties, there exist co-design techniques that assign task periodicities such that control costs are minimized. Another line of work addresses the same platform schedulability issue by skipping a bounded number of control executions within a fixed number of control instances. Considering that control tasks are designed to perform robustly against delayed actuation (due to deadline misses, network packet drops, etc.), a bounded number of control skips can be applied while ensuring a certain performance margin. Our work combines these two control scheduling co-design disciplines and develops a strategy to adaptively employ control skips or update periodicities of the control tasks depending on their current performance requirements. For this, we leverage a novel theory of automata-based control skip sequence generation while ensuring periodicity, safety, and stability constraints. We demonstrate the effectiveness of this dynamic resource sharing approach in an automotive hardware-in-the-loop setup with realistic control task set implementations.
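The idea of "a bounded number of control skips within a fixed number of control instances" can be illustrated with a small checker. This is a hedged sketch under one assumed reading (at most `max_skips` skipped executions in every sliding window of `window` consecutive instances); it is not the automata-based generation method of the article, and the function name and parameters are hypothetical.

```python
def satisfies_skip_bound(sequence, max_skips, window):
    """Check a skip sequence against a sliding-window bound.

    sequence: list of 1 (control executed) or 0 (control skipped).
    Returns True if no window of `window` consecutive instances
    contains more than `max_skips` skips.
    """
    seq = list(sequence)
    if len(seq) <= window:
        return seq.count(0) <= max_skips
    skips = seq[:window].count(0)            # skips in the first window
    if skips > max_skips:
        return False
    for i in range(window, len(seq)):
        # Slide the window by one: add the new instance, drop the oldest.
        skips += (seq[i] == 0) - (seq[i - window] == 0)
        if skips > max_skips:
            return False
    return True


ok = satisfies_skip_bound([1, 0, 1, 1, 0, 1], max_skips=1, window=3)
bad = satisfies_skip_bound([1, 0, 0, 1], max_skips=1, window=3)
```

Here `ok` is true because every three-instance window contains at most one skip, while `bad` is false because the window `[1, 0, 0]` contains two.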


Volume 43, Issue 11, pp. 3662-3673

Citations: 0

Efficient Image Processing via Memristive-Based Approximate In-Memory Computing

IF 2.7 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Pub Date : 2024-11-06 DOI: 10.1109/TCAD.2024.3438113

Fabian Seiler;Nima TaheriNejad

Image processing algorithms continue to demand higher performance from computers. However, computer performance is not improving at the same rate as before. In response to the current challenges in enhancing computing performance, a wave of new technologies and computing paradigms is surfacing. Among these, memristors stand out as one of the most promising components due to their technological prospects and low power consumption. With efficient data storage capabilities and their ability to directly perform logical operations within the memory, they are well-suited for in-memory computation (IMC). Approximate computing emerges as another promising paradigm, offering improved performance metrics, notably speed. The tradeoff for this gain is a reduction in accuracy. In this article, we use the stateful logic material implication (IMPLY) in the semi-serial topology and combine both paradigms to further enhance the computational performance. We present three novel approximated adders that drastically improve speed and energy consumption with a normalized mean error distance (NMED) lower than 0.02 for most scenarios. We evaluated partially approximated ripple carry adders (RCAs) at the circuit level and compared them to the state of the art (SoA). The proposed adders are applied in different image processing applications and the quality metrics are calculated. While maintaining acceptable quality, our approach achieves significant energy savings of 6%–38% and reduces the delay (number of computation cycles) by 5%–35%, demonstrating notable efficiency compared to exact calculations.
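As a rough illustration of how an NMED figure is obtained for a partially approximated adder, the sketch below approximates the lowest `k` bits with a bitwise OR and adds the upper bits exactly. This is a generic lower-part-OR style approximation chosen for simplicity, not one of the three adders proposed in the article; the function names are hypothetical, and normalizing by the maximum exact sum is an assumed convention.

```python
def approx_add(a, b, k):
    """Add a and b, approximating the lowest k bits with a bitwise OR."""
    mask = (1 << k) - 1
    low = (a | b) & mask                     # approximate part, no carries
    high = ((a >> k) + (b >> k)) << k        # exact part, carry-in dropped
    return high | low


def nmed(n_bits, k):
    """Normalized mean error distance over all pairs of n_bits-wide inputs."""
    limit = 1 << n_bits
    max_sum = 2 * (limit - 1)                # largest exact result (assumed normalizer)
    total = 0
    for a in range(limit):
        for b in range(limit):
            total += abs((a + b) - approx_add(a, b, k))
    return total / (limit * limit) / max_sum


exact_case = nmed(8, 0)                      # no approximated bits
four_low_bits = nmed(8, 4)                   # lowest four bits approximated
```

For this particular approximation the error of a single addition equals the bitwise AND of the two low parts, so the error grows with `k`, and `nmed(8, 0)` is zero because no bits are approximated.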


Volume 43, Issue 11, pp. 3312-3323. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10745792

Citations: 0

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems society information

IF 2.7 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Pub Date : 2024-09-19 DOI: 10.1109/TCAD.2024.3454934

Citations: 0

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems publication information

IF 2.7 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Pub Date : 2024-09-19 DOI: 10.1109/TCAD.2024.3449609

Citations: 0

LithoHoD: A Litho Simulator-Powered Framework for IC Layout Hotspot Detection

IF 2.9 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Pub Date : 2024-09-18 DOI: 10.1109/tcad.2024.3463539

Hao-Chiang Shao, Guan-Yu Chen, Yu-Hsien Lin, Chia-Wen Lin, Shao-Yun Fang, Pin-Yian Tsai, Yan-Hsiu Liu

Citations: 0

Buffer and Splitter Insertion for Adiabatic Quantum-Flux-Parametron Circuits

IF 2.9 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Pub Date : 2024-09-18 DOI: 10.1109/tcad.2024.3461573

Rongliang Fu, Mengmeng Wang, Yirong Kan, Olivia Chen, Nobuyuki Yoshikawa, Bei Yu, Tsung-Yi Ho

Citations: 0

Sobol Sequence Optimization for Hardware-Efficient Vector Symbolic Architectures

IF 2.9 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Pub Date : 2024-09-18 DOI: 10.1109/tcad.2024.3463544

Sercan Aygun, M. Hassan Najafi

Citations: 0

ATOM: An Automatic Topology Synthesis Framework for Operational Amplifiers

IF 2.9 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Pub Date : 2024-09-18 DOI: 10.1109/tcad.2024.3463534

Jinyi Shen, Fan Yang, Li Shang, Changhao Yan, Zhaori Bi, Dian Zhou, Xuan Zeng

Citations: 0

Automated Topology Synthesis of Analog Integrated Circuits With Frequency Compensation

IF 2.9 | Zone 3 (Computer Science) | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Pub Date : 2024-09-18 DOI: 10.1109/tcad.2024.3462904

Zhenxin Zhao, Jun Liu, Wensheng Zhao, Lihong Zhang

Citations: 0
