Pub Date: 2026-01-22 DOI: 10.1109/TVLSI.2026.3653075
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information","authors":"","doi":"10.1109/TVLSI.2026.3653075","DOIUrl":"https://doi.org/10.1109/TVLSI.2026.3653075","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"C3-C3"},"PeriodicalIF":3.1,"publicationDate":"2026-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11361321","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-29 DOI: 10.1109/TVLSI.2025.3641351
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information","authors":"","doi":"10.1109/TVLSI.2025.3641351","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3641351","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 1","pages":"C3-C3"},"PeriodicalIF":3.1,"publicationDate":"2025-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11318116","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145847796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural radiance fields (NeRFs) have transformed 3-D reconstruction and rendering, facilitating photorealistic image synthesis from sparse viewpoints. This work introduces an explicit data reuse neural rendering (EDR-NR) architecture, which reduces frequent external memory accesses (EMAs) and cache misses by exploiting spatial locality at three levels: rays, ray packets (RPs), and samples. The EDR-NR architecture features a four-stage scheduler that clusters rays in $Z$-order, prioritizes lagging rays when ray divergence occurs, reorders RPs based on spatial proximity, and issues samples out of order (OoO) according to the availability of on-chip feature data. In addition, a four-tier hierarchical RP marching (HRM) technique is integrated with an axis-aligned bounding box (AABB) to facilitate spatial skipping (SS), reducing redundant computations and improving throughput. Moreover, a balanced allocation strategy for feature storage is proposed to mitigate SRAM bank conflicts. Fabricated in a 40-nm process with a die area of 10.5 mm$^2$, the EDR-NR chip demonstrates a $2.41\times$ enhancement in normalized energy efficiency, a $1.21\times$ improvement in normalized area efficiency, a $1.20\times$ increase in normalized throughput, and a 53.42% reduction in on-chip SRAM consumption compared with state-of-the-art accelerators.
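The two locality ideas named in the abstract, Z-order ray clustering and AABB-based spatial skipping, can be illustrated in software. The following is a minimal sketch, not the chip's scheduler: it assumes rays are keyed by integer pixel coordinates, that packets are fixed-size slices of the Morton-sorted list, and that skipping uses the standard slab test. The names `morton2d`, `cluster_rays`, and `ray_aabb` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def part1by1(v: int) -> int:
    """Spread the low 16 bits of v so that a zero bit separates each original bit."""
    v &= 0xFFFF
    v = (v | (v << 8)) & 0x00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F
    v = (v | (v << 2)) & 0x33333333
    v = (v | (v << 1)) & 0x55555555
    return v


def morton2d(x: int, y: int) -> int:
    """Interleave the bits of x and y into a 2-D Z-order (Morton) key."""
    return part1by1(x) | (part1by1(y) << 1)


def cluster_rays(pixels: Sequence[Tuple[int, int]],
                 packet_size: int) -> List[List[Tuple[int, int]]]:
    """Sort rays by Morton key and cut the result into fixed-size packets.

    Assumption: one ray per pixel and a fixed packet size; the real
    scheduler may form packets differently.
    """
    ordered = sorted(pixels, key=lambda p: morton2d(p[0], p[1]))
    return [ordered[i:i + packet_size] for i in range(0, len(ordered), packet_size)]


def ray_aabb(origin: Sequence[float], direction: Sequence[float],
             box_min: Sequence[float], box_max: Sequence[float]
             ) -> Optional[Tuple[float, float]]:
    """Slab test: return the (t_near, t_far) interval inside the box, or None on a miss.

    Samples outside the returned interval can be skipped.
    """
    t_near, t_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this slab: it must already lie inside it.
            if o < lo or o > hi:
                return None
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near, t_far
```

Sorting by Morton key keeps rays from neighbouring pixels adjacent in the issue order, which is what lets consecutive rays touch nearby feature data.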
{"title":"An Energy-Efficient Edge Coprocessor for Neural Rendering With Explicit Data Reuse Strategies","authors":"Binzhe Yuan;Xiangyu Zhang;Zeyu Zheng;Yuefeng Zhang;Haochuan Wan;Zhechen Yuan;Junsheng Chen;Yunxiang He;Junran Ding;Xiaoming Zhang;Chaolin Rao;Wenyan Su;Pingqiang Zhou;Jingyi Yu;Xin Lou","doi":"10.1109/TVLSI.2025.3641653","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3641653","url":null,"abstract":"Neural radiance fields (NeRFs) have transformed 3-D reconstruction and rendering, facilitating photorealistic image synthesis from sparse viewpoints. This work introduces an explicit data reuse neural rendering (EDR-NR) architecture, which reduces frequent external memory accesses (EMAs) and cache misses by exploiting the spatial locality from three phases, including rays, ray packets (RPs), and samples. The EDR-NR architecture features a four-stage scheduler that clusters rays on the basis of <inline-formula> <tex-math>$Z$ </tex-math></inline-formula>-order, prioritize lagging rays when ray divergence happens, reorders RPs based on spatial proximity, and issues samples out-of-orderly (OoO) according to the availability of on-chip feature data. In addition, a four-tier hierarchical RP marching (HRM) technique is integrated with an axis-aligned bounding box (AABB) to facilitate spatial skipping (SS), reducing redundant computations and improving throughput. Moreover, a balanced allocation strategy for feature storage is proposed to mitigate SRAM bank conflicts. 
Fabricated using a 40-nm process with a die area of 10.5 mm<sup>2</sup>, the EDR-NR chip demonstrates a <inline-formula> <tex-math>$2.41times $ </tex-math></inline-formula> enhancement in normalized energy efficiency, a <inline-formula> <tex-math>$1.21times $ </tex-math></inline-formula> improvement in normalized area efficiency, a <inline-formula> <tex-math>$1.20times $ </tex-math></inline-formula> increase in normalized throughput, and a 53.42% reduction in on-chip SRAM consumption compared with state-of-the-art accelerators.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"620-633"},"PeriodicalIF":3.1,"publicationDate":"2025-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-10 DOI: 10.1109/TVLSI.2025.3640215
Youngki Moon;Juyong Lee;Nayeun Kim;Yeonho Choi;Byungsoo Kim;Sungho Kang
The increasing density of dynamic random access memory (DRAM) renders permanent faults and soft errors more prevalent, which critically reduces yield and reliability. Although error correction code (ECC) can mitigate this issue, existing ECCs are not optimized for fault correction. As a result, fault tolerance remains insufficient, and the error correction capability in the presence of faults is degraded. Therefore, to improve DRAM robustness by efficiently addressing both permanent faults and soft errors, this brief proposes a fault-aware adaptive on-die ECC (FADE) in which two ECC engines independently operate in either fault mode (FM) or error mode (EM) according to the number of faulty symbols (FSs). In FM, a fault polynomial is reconstructed by reusing the fault addresses that the built-in self-repair (BISR) stores in content-addressable memory (CAM). To calculate the corresponding fault magnitudes, a modified decoding equation is employed. As a result, the number of correctable FSs in FM doubles compared to the conventional ECC. Moreover, with the proposed symbol-based fault isolation, both fault tolerance and error correction capability in the presence of faults are drastically enhanced. Additionally, the experimental results show that the proposed design can be implemented with a reasonable overhead in terms of delay and area.
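The doubling of correctable faulty symbols when fault addresses are already known matches the textbook trade-off between errors and erasures. The sketch below assumes a symbol-based code with minimum distance `d_min` and bounded-distance decoding, for which `2 * errors + erasures <= d_min - 1` must hold. The abstract does not name the code or give its parameters, so the function names and the value `d_min = 5` are illustrative assumptions only.

```python
def max_correctable_symbols(d_min: int, locations_known: bool) -> int:
    """Largest number of bad symbols a distance-d_min code can correct.

    With unknown locations (errors) the limit is floor((d_min - 1) / 2);
    with known locations (erasures, e.g. fault addresses read from a CAM)
    the limit is d_min - 1.
    """
    if d_min < 1:
        raise ValueError("d_min must be at least 1")
    return d_min - 1 if locations_known else (d_min - 1) // 2


def decodable(d_min: int, unknown_errors: int, known_faults: int) -> bool:
    """Check the combined errors-and-erasures bound 2e + f <= d_min - 1."""
    return 2 * unknown_errors + known_faults <= d_min - 1


# Illustrative value only: a code with d_min = 5.
print(max_correctable_symbols(5, locations_known=False))
print(max_correctable_symbols(5, locations_known=True))
```

The same bound also shows how the two cases mix: every known fault location consumes one unit of redundancy, while every soft error at an unknown location consumes two.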
{"title":"FADE: Fault-Aware Adaptive On-Die ECC for Improving Robustness","authors":"Youngki Moon;Juyong Lee;Nayeun Kim;Yeonho Choi;Byungsoo Kim;Sungho Kang","doi":"10.1109/TVLSI.2025.3640215","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3640215","url":null,"abstract":"The increasing density of dynamic random access memory (DRAM) renders permanent faults and soft errors more prevalent, which critically reduces yield and reliability. Although error correction code (ECC) can mitigate this issue, existing ECCs are not optimized for fault correction. As a result, fault tolerance remains insufficient, and the error correction capability in the presence of faults is degraded. Therefore, to improve DRAM robustness by efficiently addressing both permanent faults and soft errors, this brief proposes a fault-aware adaptive on-die ECC (FADE) in which two ECC engines independently operate in either fault mode (FM) or error mode (EM) according to the number of faulty symbols (FSs). In FM, a fault polynomial is reconstructed by reusing the fault addresses that the built-in self-repair (BISR) stores in content-addressable memory (CAM). To calculate the corresponding fault magnitudes, a modified decoding equation is employed. As a result, the number of correctable FSs in FM doubles compared to the conventional ECC. Moreover, with the proposed symbol-based fault isolation, both fault tolerance and error correction capability in the presence of faults are drastically enhanced. 
Additionally, the experimental results show that the proposed design can be implemented with a reasonable overhead in terms of delay and area.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"707-710"},"PeriodicalIF":3.1,"publicationDate":"2025-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-09 DOI: 10.1109/TVLSI.2025.3637272
Shatadal Chatterjee;Jitumani Sarma
Flash analog-to-digital converters (ADCs), essential for high-speed embedded systems, face inherent linearity constraints due to device mismatch in the resistor ladder and comparator stages. While individual analytical models exist for these mismatch sources, designers rely on Monte Carlo simulations to evaluate the combined errors. This brief introduces a unified analytical framework with closed-form expressions that capture both mismatch sources, enabling efficient estimation of root mean square (rms) integral nonlinearity/differential nonlinearity (INL/DNL). Validated against circuit simulations, the model achieves a mean absolute error (MAE) of 2.71% ($\sigma_{\text{DNL}}$) and 2.51% ($\sigma_{\text{INL}}$), and the maximum absolute error (MaxE) remains within 5.44%. This predictive capability guides high-yield, precision, power, and area (PPA)-optimized system-on-chip (SoC) design, enabling over $3\times$ silicon area reduction through application-specific optimization.
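For context, the Monte Carlo baseline that such an analytical model replaces can be sketched in a few lines. This is a simplified behavioural model under stated assumptions, not the authors' framework or their closed-form expressions: resistor mismatch and comparator offsets are independent Gaussians, thresholds are expressed in LSB, and INL is taken as the raw deviation from the ideal threshold without endpoint or best-fit correction. All function and parameter names are hypothetical.

```python
import math
import random
from typing import List, Tuple


def flash_thresholds(n_bits: int, sigma_r: float, sigma_off_lsb: float,
                     rng: random.Random) -> List[float]:
    """Return the 2**n_bits - 1 comparator thresholds in LSB for one random instance."""
    n_res = 2 ** n_bits
    # Unit resistors with relative Gaussian mismatch sigma_r.
    resistors = [1.0 + rng.gauss(0.0, sigma_r) for _ in range(n_res)]
    total = sum(resistors)
    thresholds = []
    acc = 0.0
    for r in resistors[:-1]:
        acc += r
        tap_lsb = acc / total * n_res  # ladder tap voltage, full scale = n_res LSB
        # Add the input-referred comparator offset (assumed Gaussian, in LSB).
        thresholds.append(tap_lsb + rng.gauss(0.0, sigma_off_lsb))
    return thresholds


def dnl_inl(thresholds: List[float]) -> Tuple[List[float], List[float]]:
    """DNL = code width minus 1 LSB; INL = threshold minus its ideal position."""
    inl = [t - (k + 1) for k, t in enumerate(thresholds)]
    dnl = [thresholds[k + 1] - thresholds[k] - 1.0 for k in range(len(thresholds) - 1)]
    return dnl, inl


def rms(values: List[float]) -> float:
    return math.sqrt(sum(v * v for v in values) / len(values))


def monte_carlo(n_bits: int, sigma_r: float, sigma_off_lsb: float,
                runs: int, seed: int = 0) -> Tuple[float, float]:
    """Estimate rms DNL and rms INL (in LSB) over `runs` random instances."""
    rng = random.Random(seed)
    all_dnl: List[float] = []
    all_inl: List[float] = []
    for _ in range(runs):
        dnl, inl = dnl_inl(flash_thresholds(n_bits, sigma_r, sigma_off_lsb, rng))
        all_dnl.extend(dnl)
        all_inl.extend(inl)
    return rms(all_dnl), rms(all_inl)
```

With the ladder mismatch set to zero, each DNL value is the difference of two independent offsets, so the estimated rms DNL should approach `sqrt(2) * sigma_off_lsb`. That limiting case gives a quick sanity check on the simulation before both mismatch sources are enabled.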
{"title":"An Analytical Model of Mismatch Dominance Crossover in High-Speed Flash ADC Cores","authors":"Shatadal Chatterjee;Jitumani Sarma","doi":"10.1109/TVLSI.2025.3637272","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3637272","url":null,"abstract":"The flash analog-to-digital converters (ADCs), essential for high-speed embedded systems, face inherent linearity constraints due to device mismatch in the resistor ladder and comparator stages. While individual analytical models exist for these mismatch sources, designers rely on Monte Carlo simulations to evaluate the combined errors. This brief introduces a unified analytical framework with closed-form expressions that capture both mismatch sources, enabling efficient estimation of root mean square (rms) integral nonlinearity/differential nonlinearity (INL/DNL). Validated against circuit simulations, the model achieves a mean absolute error (MAE) of 2.71% (<inline-formula> <tex-math>$boldsymbol {sigma _{textbf {DNL}}}$ </tex-math></inline-formula>) and 2.51% (<inline-formula> <tex-math>$sigma _{text {INL}}$ </tex-math></inline-formula>), and the maximum absolute error (MaxE) remains within 5.44%. 
This predictive capability guides high-yield, precision, power, and area (PPA)-optimized system-on-chip (SoC) design, enabling over <inline-formula> <tex-math>$3{times }$ </tex-math></inline-formula> silicon area reduction through application-specific optimization.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"702-706"},"PeriodicalIF":3.1,"publicationDate":"2025-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-09 DOI: 10.1109/TVLSI.2025.3639588
Junhak Kim;Young-Wook Kim;Sinho Lee;Yoojin Jung;Min-Seong Choo;Kwanseo Park
This work presents a ring oscillator (RO)-based low-jitter injection-locked clock and data recovery (ILCDR) with a pattern-dependent pulse filtering (PDPF) technique. The conventional ILCDR has a drawback that data jitter is transferred to the recovered clock. To reduce jitter, the PDPF technique is employed to filter out the injection pulses occurring in data patterns that cause high data-dependent jitter (DDJ). Adopting the PDPF technique with an injection timing control loop, the ILCDR optimizes injection timing and maximizes timing margin. Fabricated in a 28-nm CMOS technology, the proposed ILCDR occupies an active area of 0.03 mm$^2$ and consumes 13.6 mW at 10 Gb/s. The measured jitter tolerance (JTOL) is 1 UI$_{\text{pp}}$ at 35 MHz with a bit error rate (BER) of $10^{-12}$.
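The filtering idea can be pictured with a behavioural model: every data transition would normally produce an injection pulse, and pulses are dropped when the preceding bits match a pattern flagged as jitter-prone. The abstract does not say which patterns are filtered or how many bits of history are examined, so the blocked set (transitions after a run of three identical bits) and the names `injection_pulses`, `history_len`, and `blocked_histories` are assumptions made purely for illustration.

```python
from typing import List, Sequence, Set, Tuple


def injection_pulses(bits: Sequence[int], history_len: int,
                     blocked_histories: Set[Tuple[int, ...]]) -> List[int]:
    """Return the bit indices at which an injection pulse is allowed.

    A pulse candidate exists at index i whenever bits[i] != bits[i - 1].
    The candidate is filtered out if the history_len bits before index i
    form a pattern listed in blocked_histories (hypothetical rule).
    """
    pulses = []
    for i in range(1, len(bits)):
        if bits[i] == bits[i - 1]:
            continue  # no data transition, so no injection pulse candidate
        if i >= history_len:
            history = tuple(bits[i - history_len:i])
            if history in blocked_histories:
                continue  # pattern assumed to carry high DDJ: drop the pulse
        pulses.append(i)
    return pulses


# Example: block transitions that follow three identical bits.
data = [0, 0, 0, 1, 0, 1, 1, 1, 0]
print(injection_pulses(data, 3, {(0, 0, 0), (1, 1, 1)}))
```

In this model the recovered clock is realigned only by the surviving pulses, which is the trade-off the technique makes: fewer injection events in exchange for less data jitter passed to the clock.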
{"title":"A Pattern-Dependent Pulse Filtering Technique for Low-Jitter Injection-Locked CDR in 28-nm CMOS","authors":"Junhak Kim;Young-Wook Kim;Sinho Lee;Yoojin Jung;Min-Seong Choo;Kwanseo Park","doi":"10.1109/TVLSI.2025.3639588","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3639588","url":null,"abstract":"This work presents a ring oscillator (RO)-based low-jitter injection-locked clock and data recovery (ILCDR) with a pattern-dependent pulse filtering (PDPF) technique. The conventional ILCDR has a drawback that data jitter is transferred to the recovered clock. To reduce jitter, the PDPF technique is employed to filter out the injection pulses occurring in data patterns that cause high data-dependent jitter (DDJ). Adopting the PDPF technique with an injection timing control loop, the ILCDR optimizes injection timing and maximizes timing margin. Fabricated in a 28-nm CMOS technology, the proposed ILCDR occupies an active area of 0.03 mm<sup>2</sup> and consumes 13.6 mW at 10 Gb/s. The measured jitter tolerance (JTOL) is 1 UI<sub>pp</sub> at 35 MHz with a bit error rate (BER) of <inline-formula> <tex-math>$10^{-12}$ </tex-math></inline-formula>.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"711-715"},"PeriodicalIF":3.1,"publicationDate":"2025-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-12-02 DOI: 10.1109/TVLSI.2025.3634734
Xiao Wang;Xin Sun;Yibin Zheng;Runkun Li;Kong-Pang Pun
This brief presents an amplifierless current-to-digital converter (CDC) that uniquely integrates an open-loop pseudo-differential current mirror with a current-integration successive-approximation-register analog-to-digital converter (ADC). The proposed architecture enables the CDC to achieve high-speed operation at low power consumption, which is critical for the intended applications in dynamic optical coherence tomography (OCT) systems. Fabricated in 65-nm CMOS, the prototype occupies 0.019 mm$^2$, consumes $380~\mu$W from a 1-V supply, and achieves a 47-dB dynamic range (DR) with a 50-MS/s sample rate. It achieves Walden’s and Schreier’s figures of merit of 92 fJ/step and 148 dB, respectively, both being the best among reported CDCs.
{"title":"A 0.38-mW, 50-MS/s, 2.3-μApp Current-Integration SAR-Based Current-to-Digital Converter for Real-Time OCT Imaging","authors":"Xiao Wang;Xin Sun;Yibin Zheng;Runkun Li;Kong-Pang Pun","doi":"10.1109/TVLSI.2025.3634734","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3634734","url":null,"abstract":"This brief presents an amplifierless current-to-digital converter (CDC) that uniquely integrates an open-loop pseudo-differential current mirror with a current-integration successive-approximation-register analog-to-digital converter (ADC). The proposed architecture enables the CDC to achieve high-speed operation at low power consumption, which is critical for the intended applications in dynamic optical coherence tomography (OCT) systems. Fabricated in 65-nm CMOS, the prototype occupies 0.019 mm<sup>2</sup>, consumes <inline-formula> <tex-math>$380~mu $ </tex-math></inline-formula>W from a 1-V supply, and achieves a 47-dB dynamic range (DR) with a 50-MS/s sample rate. It achieves Walden’s and Schreier’s figures of merit of 92 fJ/step and 148 dB, respectively, both being the best among reported CDCs.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"697-701"},"PeriodicalIF":3.1,"publicationDate":"2025-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-25 DOI: 10.1109/TVLSI.2025.3630312
{"title":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems Society Information","authors":"","doi":"10.1109/TVLSI.2025.3630312","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3630312","url":null,"abstract":"","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 12","pages":"C3-C3"},"PeriodicalIF":3.1,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11268918","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145595110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-11-25 DOI: 10.1109/TVLSI.2025.3621790
Saeed Aghapour;Kiarash Sedghighadikolaei;Attila A. Yavuz;Bechir Hamdaoui;Mehran Mozaffari-Kermani
After the acceptance of [1], an error was introduced, which we aim to resolve here. The abbreviation ML stands for module lattice-based, not “machine learning.” The first sentence of the first paragraph is corrected from the version that was published in Early Access. It should have read, “Barrett modular reduction and multiplication are essential primitives for efficient modular computation in cryptographic schemes, including postquantum standards such as module lattice-based (ML) key encapsulation mechanism (KEM) and ML-digital signature algorithm (DSA).” In the Introduction, the same correction has been made for the abbreviation ML.
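For readers unfamiliar with the primitive named in the corrected sentence, a plain software version of Barrett reduction is sketched below. It shows only the textbook algorithm, not the fault-detection architectures of the corrected paper. The precomputation choice `k = 2 * bit_length(q)` and the function names are assumptions of this sketch; the moduli 3329 (ML-KEM) and 8380417 (ML-DSA) are the values standardized for those schemes.

```python
from typing import Tuple


def barrett_setup(q: int) -> Tuple[int, int]:
    """Precompute the shift k and the constant mu = floor(2**k / q)."""
    k = 2 * q.bit_length()
    mu = (1 << k) // q
    return k, mu


def barrett_reduce(x: int, q: int, k: int, mu: int) -> int:
    """Reduce 0 <= x < 2**k modulo q without a division by q."""
    if not 0 <= x < (1 << k):
        raise ValueError("x out of range for this precomputation")
    t = (x * mu) >> k   # quotient estimate, at most 1 below floor(x / q)
    r = x - t * q       # therefore 0 <= r < 2q
    if r >= q:
        r -= q          # a single conditional subtraction suffices
    return r


def barrett_mulmod(a: int, b: int, q: int, k: int, mu: int) -> int:
    """Modular multiplication of a, b in [0, q) via Barrett reduction."""
    return barrett_reduce(a * b, q, k, mu)


q = 3329  # ML-KEM modulus
k, mu = barrett_setup(q)
print(barrett_mulmod(1234, 3000, q, k, mu) == (1234 * 3000) % q)
```

The correctness argument is short: because `mu` underestimates `2**k / q` by less than 1 and `x < 2**k`, the estimated quotient is off by at most one, which is why one correction step is enough.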
{"title":"Corrections to “Efficient Fault-Detection Architectures for Barrett Reduction and Multiplication in Classical and Post-Quantum Cryptographic Systems”","authors":"Saeed Aghapour;Kiarash Sedghighadikolaei;Attila A. Yavuz;Bechir Hamdaoui;Mehran Mozaffari-Kermani","doi":"10.1109/TVLSI.2025.3621790","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3621790","url":null,"abstract":"After the acceptance of [1], an error was introduced, which we aim to resolve here. The abbreviation ML stands for module lattice-based, not “machine learning.” The first sentence of the first paragraph is corrected from the version that was published in Early Access. It should have read, “Barrett modular reduction and multiplication are essential primitives for efficient modular computation in cryptographic schemes, including postquantum standards such as module lattice-based (ML) key encapsulation mechanism (KEM) and ML-digital signature algorithm (DSA).” In the Introduction, the same correction has been made for the abbreviation ML.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 12","pages":"3545-3545"},"PeriodicalIF":3.1,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11268919","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145595105","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic vision sensors (DVSs), renowned for their low latency and sparse event-driven output, have garnered significant attention in machine vision, particularly in latency-sensitive applications such as object tracking. However, current DVS-based object tracking systems have not fully utilized the advantages of DVS due to their frame-based processing or the high computing intensity of their algorithms. This brief presents EDMOT, a fully event-driven multiobject tracking system for DVS, achieving 25–65 ns latency at 200 MHz. EDMOT introduces three key innovations: 1) a novel event-driven update mechanism that processes only the latest event and the expired oldest event, minimizing computational overhead; 2) a dual-threshold tracking strategy that decouples object formation and motion phases, significantly improving tracking accuracy; and 3) a row–column feature memory with flag registers, enabling object separation within eight clock cycles. The proposed EDMOT is evaluated on public datasets, demonstrating superior tracking accuracy compared to prior methods. Finally, EDMOT was implemented in the HLMC 55-nm process, supporting 20–100 Me/s throughput. To the best of our knowledge, this is the multiobject tracking system with the lowest latency and the highest event processing throughput.
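The first and third innovations can be pictured together in software: a window of recent events is kept, and per-row and per-column counters are adjusted only for the newest event and for the one that expires, so the cost per event is constant. The sketch below is a loose illustration, not the EDMOT hardware: it assumes a fixed-count FIFO window (the abstract does not say whether expiry is by count or by time), and the class `EventWindow` and the helper `spans` are hypothetical names.

```python
from collections import deque
from typing import Deque, List, Tuple


class EventWindow:
    """Sliding window of events with incrementally updated row/column counts."""

    def __init__(self, width: int, height: int, capacity: int) -> None:
        self.capacity = capacity
        self.events: Deque[Tuple[int, int]] = deque()
        self.row_count = [0] * height
        self.col_count = [0] * width

    def push(self, x: int, y: int) -> None:
        """Add the latest event and retire the oldest one if the window is full."""
        self.events.append((x, y))
        self.col_count[x] += 1
        self.row_count[y] += 1
        if len(self.events) > self.capacity:
            old_x, old_y = self.events.popleft()
            self.col_count[old_x] -= 1
            self.row_count[old_y] -= 1


def spans(counts: List[int], threshold: int) -> List[Tuple[int, int]]:
    """Return inclusive index ranges where counts stay at or above threshold."""
    out: List[Tuple[int, int]] = []
    start = None
    for i, c in enumerate(counts):
        if c >= threshold and start is None:
            start = i
        elif c < threshold and start is not None:
            out.append((start, i - 1))
            start = None
    if start is not None:
        out.append((start, len(counts) - 1))
    return out


window = EventWindow(width=8, height=8, capacity=3)
for event in [(1, 1), (1, 2), (2, 1), (6, 6)]:
    window.push(*event)
print(spans(window.col_count, 1))
```

Gaps in the column and row projections separate candidate objects, which is one plausible reading of how a row–column feature memory supports object separation; the actual flag-register scheme is not described in the abstract.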
{"title":"EDMOT: A 25–65 ns Latency Event-Driven Multiobject Tracker for Dynamic Vision Sensors","authors":"Feiqiang Li;Yujie Huang;Yaoyi Chen;Mingyu Wang;Minge Jing;Wenhong Li;Xiaoyang Zeng","doi":"10.1109/TVLSI.2025.3633693","DOIUrl":"https://doi.org/10.1109/TVLSI.2025.3633693","url":null,"abstract":"Dynamic vision sensors (DVSs), renowned for their low latency and sparse event-driven output, have garnered significant attention in machine vision applications, particularly in latency-sensitive applications like object tracking. However, current DVS-based object tracking systems have not fully utilized the advantages of DVS due to their frame-based processing or the high computing intensity of their algorithms. This brief presents EDMOT, a fully event-driven multiobject tracking system for DVS, achieving 25–65 ns latency at 200 MHz. EDMOT introduces three key innovations: 1) a novel event-driven update mechanism that processes only the latest event and the expired oldest event, minimizing computational overhead; 2) a dual-threshold tracking strategy that decouples object formation and motion phases, significantly improving tracking accuracy; and 3) a row–column feature memory with flag registers, enabling object separation within eight clock cycles. The proposed EDMOT is evaluated on public datasets, demonstrating superior tracking accuracy compared to prior methods. Finally, EDMOT was implemented at the HLMC 55 nm, supporting 20–100 Me/s throughput. 
To the best of our knowledge, this is a multiobject tracking system with minimal delay and the highest event processing throughput.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"34 2","pages":"692-696"},"PeriodicalIF":3.1,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146015919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}