
IEEE Transactions on Computers: Latest Publications

2025 Reviewers List
IF 3.8 | CAS Region 2, Computer Science | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2026-02-11 | DOI: 10.1109/TC.2026.3652671
IEEE Transactions on Computers, vol. 75, no. 3, pp. 1224-1231. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11393623
Citations: 0
Latency Optimization in Hybrid Memory System for GNNs
IF 3.8 | CAS Region 2, Computer Science | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2025-12-29 | DOI: 10.1109/TC.2025.3648646
Zhaoyang Zeng;Yujuan Tan;Wei Chen;Jiali Li;Zhuoxin Bai;Ao Ren;Duo Liu;Xianzhang Chen
Graph Neural Networks (GNNs) require high-capacity, low-latency memory systems to process large graphs. A hierarchical hybrid memory architecture combining high-capacity Non-Volatile Memory (NVM) and low-latency DRAM offers a promising solution. However, the inherent sparsity of graph data results in poor locality for GNN memory requests, leading to low DRAM cache hit rates and numerous misses, which significantly impairs the hybrid memory system’s performance. A critical issue is that DRAM misses in serial access mode incur substantial latency. While parallel access mode can mitigate this for misses, it introduces long-tail latency and wastes bandwidth for DRAM hits. In this paper, we focus on addressing these issues from two aspects: increasing the cache hit rate and decreasing the miss latency. We mainly propose two predictors: a future data access predictor that enables accurate prefetching to DRAM, thereby improving cache hit rates, and a data location predictor that determines whether data resides in DRAM or NVM, optimizing the choice between serial and parallel access modes to reduce miss latency. By integrating these predictors, we achieve efficient data access in both DRAM and NVM. Our experiments show a 49.5% reduction in memory delay and a 38.1% increase in memory bandwidth utilization compared to the baseline.
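The role of the data location predictor above can be illustrated with a toy latency model (a sketch, not the paper's implementation: the latency constants and the predictor interface are assumptions). A predicted DRAM hit uses serial access (probe DRAM, fall back to NVM on a miss), while a predicted miss issues the parallel mode:

```python
# Toy model of serial vs. parallel access in a DRAM/NVM hybrid memory.
# Latencies are hypothetical round numbers for illustration only.
DRAM_LAT = 50   # ns, assumed DRAM access latency
NVM_LAT = 300   # ns, assumed NVM access latency

def access_latency(in_dram: bool, predicted_in_dram: bool) -> int:
    """Latency of one request given the location predictor's decision."""
    if predicted_in_dram:
        # Serial mode: probe DRAM first; a miss pays DRAM + NVM.
        return DRAM_LAT if in_dram else DRAM_LAT + NVM_LAT
    # Parallel mode: both issued together; a miss pays only NVM latency.
    return DRAM_LAT if in_dram else NVM_LAT

def mean_latency(trace, predictor) -> float:
    """Average latency over a trace of (addr, actually_in_dram) pairs."""
    total = sum(access_latency(loc, predictor(addr)) for addr, loc in trace)
    return total / len(trace)
```

With an accurate predictor, each actual DRAM miss pays only the NVM latency instead of the serial DRAM+NVM penalty, which is the miss-latency reduction the abstract describes.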
IEEE Transactions on Computers, vol. 75, no. 3, pp. 1183-1196.
Citations: 0
Dual-Pronged Deep Learning Preprocessing on Heterogeneous Platforms With CPU, Accelerator and CSD
IF 3.8 | CAS Region 2, Computer Science | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2025-12-29 | DOI: 10.1109/TC.2025.3649209
Jia Wei;Xingjun Zhang;Witold Pedrycz;Longxiang Wang;Jie Zhao
For image-related deep learning tasks, the first step usually involves reading data from external storage and preprocessing it on the CPU. As accelerators become faster and compute nodes host more of them, the gap in computing and data-transfer capability between accelerators and CPUs keeps widening, and data reading and preprocessing progressively become the bottleneck of these tasks. Our work, DDLP, addresses this data computing and transfer bottleneck of deep learning preprocessing using Computable Storage Devices (CSDs). DDLP lets the CPU and the CSD preprocess the dataset in parallel from its two ends. To this end, we propose two adaptive dynamic selection strategies that let DDLP direct the accelerator to automatically read data from either source; the two strategies trade off consistency against efficiency. DDLP achieves substantial computational overlap between CSD preprocessing, CPU preprocessing, accelerator computation, and accelerator data reading. In addition, DDLP leverages direct-storage technology to enable efficient SSD-to-accelerator data transfer, and it replaces expensive CPU and DRAM resources with more energy-efficient CSDs, alleviating preprocessing bottlenecks while significantly reducing power consumption. Extensive experimental results show that DDLP improves learning speed by up to 23.5% on the ImageNet dataset while reducing energy consumption by 19.7% and CPU and DRAM usage by 37.6%. DDLP also improves learning speed by up to 27.6% on the CIFAR-10 dataset.
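The dual-pronged idea above (CPU working from the front of the dataset, CSD from the back) can be sketched as a rate-balancing calculation. This is an illustrative assumption about how the two ends might be split, not DDLP's actual scheduler; the rates and rounding policy are hypothetical:

```python
def dual_pronged_plan(n_items: int, cpu_rate: float, csd_rate: float):
    """Split n_items so a CPU working from the front and a CSD working
    from the back finish preprocessing at (about) the same time.

    Balances i / cpu_rate == (n_items - i) / csd_rate for the meeting
    index i, i.e. each side gets work proportional to its throughput.
    Returns (meeting_index, makespan_in_seconds).
    """
    i = round(n_items * cpu_rate / (cpu_rate + csd_rate))
    makespan = max(i / cpu_rate, (n_items - i) / csd_rate)
    return i, makespan
```

For example, if the CPU preprocesses three times faster than the CSD, it should take roughly the front three quarters of the dataset while the CSD takes the back quarter.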
IEEE Transactions on Computers, vol. 75, no. 3, pp. 1209-1223.
Citations: 0
Evaluation of Radiation Resilience, Performance, and Vmin of Sub-3 nm FSFET Based SRAM Arrays
IF 3.8 | CAS Region 2, Computer Science | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2025-12-29 | DOI: 10.1109/TC.2025.3649150
Hafeez Raza;Mahdi Benkhelifa;Koshal Kumar;Shivendra Singh Parihar;Yogesh Singh Chauhan;Hussam Amrouch;Avinash Lahgere
In this work, we present a single-event upset (SEU) analysis for Forksheet FET (FSFET) based CMOS circuits. Next, we present an array-level power and performance analysis along with the V$_{min}$ evaluation for the FSFET-based SRAM. Physics-based TCAD and industry-standard BSIM-CMG compact models are calibrated for accurate circuit analysis in SPICE. The impact of varying Heavy-Ion Radiation (HIR) doses and strike orientations is investigated for the FSFETs. The robustness of the CMOS inverter against HIR is also reported in terms of failure time (t$_{fail}$) and output voltage swing ($\Delta V_{Drop}$). For the SRAM, we determine the critical Linear Energy Transfer (LET). For the FSFET, the individual n-/p-FETs are more vulnerable to irradiation incident on nearby devices. At the circuit level, in comparison to perpendicular strikes, $\Delta V_{Drop}$ increases by 1.25 V and 2.75 V for oblique and transverse incidences, respectively, at a dose of 2.0 MeV·cm$^2$/mg. The t$_{fail}$ also increases by 43% and 60%, and the SRAM critical LET decreases by 85% and 57.5%, respectively. The array-level SRAM evaluation shows that the FSFET enables reliable operation with low power consumption, impressive noise margins, and low minimum operating voltage (V$_{min}$) values. FSFET SRAM power dissipation during the read and write operations is as low as 7.02 $\mu$W and 3.00 $\mu$W, respectively. At V$_{DD}$ = 0.70 V, the noise margins for hold, read, and write operations are 289.27 mV, 122.89 mV, and 297.79 mV. The V$_{min}$ values for read and write operations are 0.30 V and 0.35 V, respectively.
IEEE Transactions on Computers, vol. 75, no. 3, pp. 1197-1208.
Citations: 0
Fused FP8 Many-Terms Dot Product With Scaling and FP32 Accumulation
IF 3.8 | CAS Region 2, Computer Science | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2025-12-25 | DOI: 10.1109/TC.2025.3648544
David R. Lutz;Anisha Saini;Mairin Kroes;Thomas Elmer;Harsha Valsaraju;Javier D. Bruguera
In the era of Deep Learning, hardware acceleration has become essential for meeting the immense computational demands of modern applications. In many Machine Learning applications, Generalized Matrix Multiplication (GEMM) with dot product is a ubiquitous and computationally intensive operation. This paper introduces two innovative microarchitectures for executing a fused FP8 $m$-way dot product with dynamic range scaling and FP32 accumulation. Both microarchitectures have been synthesized in a 3 nm technology node at 3.6 GHz, and were designed to deliver power- and area-efficient performance, targeting a 4-cycle latency for $m = 4,8$ and 5+ cycles for larger $m$ values. The first design – termed dot product with late accumulation – computes the dot product in the first cycles, then expands intermediate products to a fixed-point format (2 cycles for $m = 4, 8$ and 3+ cycles for $m > 8$), before using an additional two cycles for accumulation. This approach enables the reuse of a modified, FMA-capable FP32 adder. The second design – dot product with early accumulation – employs a dedicated FP8 datapath that concurrently computes the FP8 sum-of-products while aligning the FP32 accumulator, followed by the addition of the significands (2 cycles for $m = 4, 8$ and 3+ cycles for $m > 8$). This is then followed by two cycles for normalization and a single rounding operation. This design aligns addends (products and accumulator) from an “anchor” for efficient, arithmetically fused, $m$-way FP8 dot product computation. Comparative analysis with previous proposals reveals that, despite challenges in establishing a fair comparison, our designs achieve significant area savings.
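The payoff of arithmetic fusion described above (exact products, a wide internal sum, and a single final rounding) can be illustrated numerically. This sketch emulates only the rounding behavior in software; FP8 quantization of the inputs and the actual hardware datapaths are omitted:

```python
import struct
from fractions import Fraction

def to_fp32(x: float) -> float:
    """Round a Python float (binary64) to IEEE-754 binary32."""
    return struct.unpack('f', struct.pack('f', x))[0]

def fused_dot(a, b, scale=1.0):
    """Fused dot product: exact products and exact sum, then one rounding."""
    exact = sum(Fraction(x) * Fraction(y) for x, y in zip(a, b))
    # Single rounding step at the end (via double; adequate for this sketch).
    return to_fp32(float(exact * Fraction(scale)))

def sequential_dot(a, b, scale=1.0):
    """Naive alternative: round after every multiply and every add."""
    acc = 0.0
    for x, y in zip(a, b):
        acc = to_fp32(acc + to_fp32(x * y))
    return to_fp32(acc * scale)
```

For a = [1e8, 1.0, -1e8] and b = [1, 1, 1], the fused version returns 1.0, while the sequential version loses the small term to intermediate rounding and returns 0.0 — the accuracy motivation for fusing the many-terms sum.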
IEEE Transactions on Computers, vol. 75, no. 3, pp. 1171-1182.
Citations: 0
FSEdit: Privacy-Preserving and Security-Enhanced Controllable Editing Framework for Cloud Storage
IF 3.8 | CAS Region 2, Computer Science | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2025-12-25 | DOI: 10.1109/TC.2025.3648284
Qinlong Huang;Caiqun Shi;Xiyu Liang
The popularity of cloud storage makes it easy to share others’ data, such as accessing and editing electronic medical records. To control editing, sanitizable signatures were developed to allow designated editors to modify restricted parts of the data with sanitizing keys. In particular, the attribute-based sanitizable signature (ABSS) has been explored for fine-grained editing control, i.e., editors are determined by the owner through a policy. However, existing ABSS schemes can neither protect the anonymity of the owner, because signatures are verified under the owner’s public key, nor protect data confidentiality, because data is stored in plaintext on the cloud server. Moreover, they do not consider the sanitizing-key exposure issue. To this end, we propose FSEdit, a privacy-preserving and security-enhanced controllable editing framework for cloud storage. Specifically, we introduce an attribute-based sanitizable and puncturable signed encryption (AB-SPSE) primitive for FSEdit, where encrypted data can be anonymously verified via the owner’s attributes, and its admissible blocks can be modified by policy-authorized editors. Meanwhile, the editors’ sanitizing keys can be further punctured to guarantee forward secrecy. We design two novel building blocks, namely attribute-based equivalence-class signature and attribute-based puncturable combined encryption and signature, and then construct FSEdit by leveraging them to instantiate AB-SPSE in asymmetric pairings. Finally, we demonstrate the security and efficiency of FSEdit through extensive security analysis and through experiments against existing ABSS schemes.
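To make the sanitizable-signature idea above concrete, here is a deliberately simplified symmetric MAC-based toy. It is only a sketch of the access split: real ABSS uses public-key, attribute-based primitives and additionally provides owner anonymity, confidentiality, and key puncturing, none of which this toy attempts. The owner key authenticates the fixed blocks, while a derived sanitizing key authenticates the admissible blocks, so an editor holding only the sanitizing key can replace exactly those blocks:

```python
import hashlib
import hmac

def derive_san_key(owner_key: bytes) -> bytes:
    """Sanitizing key handed to authorized editors (toy derivation)."""
    return hashlib.sha256(owner_key + b"|sanitize").digest()

def sign(owner_key: bytes, blocks, admissible):
    """Owner authenticates fixed blocks under the owner key and
    admissible blocks under the sanitizing key."""
    san_key = derive_san_key(owner_key)
    fixed = b"|".join(b for i, b in enumerate(blocks) if i not in admissible)
    return {
        "fixed": hmac.new(owner_key, fixed, hashlib.sha256).digest(),
        "adm": {i: hmac.new(san_key, blocks[i], hashlib.sha256).digest()
                for i in admissible},
    }

def sanitize(san_key: bytes, sig, blocks, i, new_block: bytes):
    """An editor holding only the sanitizing key replaces block i."""
    assert i in sig["adm"], "block is not admissible"
    blocks = list(blocks)
    blocks[i] = new_block
    sig["adm"][i] = hmac.new(san_key, new_block, hashlib.sha256).digest()
    return blocks, sig

def verify(owner_key: bytes, blocks, sig, admissible) -> bool:
    """Check fixed blocks against the owner key and admissible blocks
    against the derived sanitizing key."""
    san_key = derive_san_key(owner_key)
    fixed = b"|".join(b for i, b in enumerate(blocks) if i not in admissible)
    if not hmac.compare_digest(
            sig["fixed"], hmac.new(owner_key, fixed, hashlib.sha256).digest()):
        return False
    return all(hmac.compare_digest(
                   tag, hmac.new(san_key, blocks[i], hashlib.sha256).digest())
               for i, tag in sig["adm"].items())
```

Editing an admissible block keeps the signature valid, while tampering with a fixed block breaks verification; this is the controllable-editing property the scheme generalizes.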
IEEE Transactions on Computers, vol. 75, no. 3, pp. 1156-1170.
Citations: 0
Designing Spatial Architectures for Sparse Attention: STAR Accelerator via Cross-Stage Tiling
IF 3.8 | CAS Region 2, Computer Science | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2025-12-24 | DOI: 10.1109/TC.2025.3648055
Huizheng Wang;Taiquan Wei;Hongbin Wang;Zichuan Wang;Xinru Tang;Zhiheng Yue;Shaojun Wei;Yang Hu;Shouyi Yin
Large language models (LLMs) rely on self-attention for contextual understanding, demanding high-throughput inference and large-scale token parallelism (LTPP). Existing dynamic sparsity accelerators falter under LTPP scenarios due to stage-isolated optimizations. Revisiting the end-to-end sparsity acceleration flow, we identify an overlooked opportunity: cross-stage coordination can substantially reduce redundant computation and memory access. We propose STAR, a cross-stage compute- and memory-efficient algorithm–hardware co-design tailored for Transformer inference under LTPP. STAR introduces a leading-zero-based sparsity prediction using log-domain add-only operations to minimize prediction overhead. It further employs distributed sorting and a sorted updating FlashAttention mechanism, guided by a coordinated tiling strategy that enables fine-grained stage interaction for improved memory efficiency and latency. These optimizations are supported by a dedicated STAR accelerator architecture, achieving up to $9.2\times$ speedup and $71.2\times$ energy efficiency over the A100, and surpassing SOTA accelerators with up to $16.1\times$ energy-efficiency and $27.1\times$ area-efficiency gains. Further, we deploy STAR onto a multi-core spatial architecture, optimizing dataflow and execution orchestration for ultra-long sequence processing. Architectural evaluation shows that, compared to the baseline design, Spatial-STAR achieves a $20.1\times$ throughput improvement.
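The leading-zero-based, add-only prediction mentioned above can be caricatured in a few lines. Binary exponents (which a leading-zero count yields in hardware) stand in for log-magnitudes, so scoring a key against a query needs only additions and no multipliers. This proxy and the top-k selection policy are illustrative assumptions, not the paper's exact predictor:

```python
import math

def lz_score(q, k) -> int:
    """Add-only log-domain proxy for |q . k|: sum the binary exponents
    of the nonzero elementwise operands instead of multiplying them."""
    s = 0
    for x, y in zip(q, k):
        if x and y:
            # math.frexp(v) -> (mantissa, exponent); adding exponents
            # approximates log2 of the product magnitude.
            s += math.frexp(x)[1] + math.frexp(y)[1]
    return s

def predict_topk(q, keys, topk):
    """Keep the topk keys with the largest predicted attention scores."""
    ranked = sorted(range(len(keys)),
                    key=lambda i: lz_score(q, keys[i]), reverse=True)
    return sorted(ranked[:topk])
```

A full attention kernel would then be evaluated only on the predicted keys, which is where the sparsity savings come from.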
IEEE Transactions on Computers, vol. 75, no. 3, pp. 1125-1140.
Citations: 0
QoS Awareness and Improved Throughput of Point Cloud Services With Dynamic Workloads
IF 3.8 | CAS Region 2, Computer Science | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2025-12-23 | DOI: 10.1109/TC.2025.3648132
Kaihua Fu;Jiuchen Shi;Yao Chen;Quan Chen;Weng-Fai Wong;Wei Wang;Bingsheng He;Minyi Guo
Deep learning on 3D point clouds plays a vital role in a wide range of applications such as AR/VR visualization, 3D cloth virtual try-on, and game rendering. As some applications require low latency, point cloud services are also deployed in datacenters with powerful GPUs. While queries to point cloud services exhibit varied workload-change patterns owing to differing degrees of sparsity, current batching-based serving schemes result in either long latency or low throughput. We propose a scheme called Volans to address the above challenges and effectively support point cloud services. Volans comprises a workload predictor, a topology deployer, and a progress-aware scheduler. The predictor grids the input query and estimates the workload changes. Afterward, the deployer splits the model into several stages and determines the batch size for each stage based on the workload changes. The scheduler reduces QoS violations when queries run slower due to unpredicted workload spikes. Experiments show that Volans enhances the peak supported throughput by up to 31.1% while maintaining the required 99%-ile latencies compared to state-of-the-art techniques.
IEEE Transactions on Computers, vol. 75, no. 3, pp. 1141-1155.
Citations: 0
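The abstract describes a predictor that grids the input query to estimate workload, and a deployer that sets a batch size per stage from that estimate. A minimal sketch of this idea follows; it is not the Volans implementation, and the function names, grid size, and per-stage "voxel budget" are invented for illustration. The premise is that sparse 3D networks do work proportional to the number of occupied voxels, so that count is a cheap workload proxy.

```python
import numpy as np

def estimate_workload(points, grid_size=0.05):
    """Grid a point cloud and count occupied voxels.

    Sparse convolutions touch only non-empty voxels, so the occupied-voxel
    count serves as a cheap proxy for a query's compute workload.
    """
    voxels = np.floor(points / grid_size).astype(np.int64)
    occupied = len(np.unique(voxels, axis=0))
    return occupied

def pick_batch_size(occupied, budget=200_000, min_batch=1, max_batch=32):
    """Pick the largest batch whose total estimated workload fits a
    per-stage budget (expressed here as a voxel budget)."""
    if occupied == 0:
        return max_batch
    return int(np.clip(budget // occupied, min_batch, max_batch))
```

Denser queries (more occupied voxels) yield smaller batches, keeping per-stage latency roughly constant; sparse queries can be batched aggressively for throughput.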
AdaptiveCore: Adaptive Parallel Core Decomposition Framework
IF 3.8 | CAS Tier 2, Computer Science | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2025-12-19 | DOI: 10.1109/TC.2025.3646191
Chen Zhao;Zhigao Zheng;Hao Huang;Hao Liu;Dacheng Tao
Core decomposition is a widely used hierarchical analysis algorithm for large-scale graphs. It computes the decomposition by iteratively peeling vertices, together with their incident edges, into different hierarchies. Driven by the timeliness requirements of modern applications, many researchers have introduced accelerators, particularly GPUs, to improve the computational efficiency of graph algorithms. However, the empty, sparse, and numerous hierarchies in large graphs lead to inefficient computation and parallelism: not only unnecessary searching for each hierarchy's vertices, but also significant thread wastage when peeling off the edges incident to those vertices. In this paper, we propose an adaptive parallel framework for core decomposition, named AdaptiveCore. First, it improves vertex-searching efficiency by adaptively skipping empty hierarchies and reducing the search space. Moreover, it greatly improves thread utilization by adaptively allocating the available threads when peeling off incident edges. Comprehensive experiments show that, compared with state-of-the-art works, the proposed framework achieves an average speedup of $7.1\times$ on the GPU platform and up to $2.0\times$ on the multi-core CPU platform.
IEEE Transactions on Computers, vol. 75, no. 3, pp. 1111-1124.
Citations: 0
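The peeling process that AdaptiveCore parallelizes is easiest to see in its classic sequential bucket form: repeatedly remove a vertex of minimum remaining degree, and the level at which a vertex is removed is its core number. The sketch below is a generic Batagelj-Zaversnik-style reference implementation, not AdaptiveCore's GPU code; it makes the "hierarchies" concrete as degree buckets, including the empty ones the paper skips.

```python
from collections import defaultdict

def core_decomposition(adj):
    """Peel-based k-core decomposition.

    `adj` maps each vertex to a set of neighbors. Returns a dict mapping
    each vertex to its core number (the hierarchy it is peeled into).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    buckets = defaultdict(set)               # remaining degree -> vertices
    for v, d in degree.items():
        buckets[d].add(v)
    core, remaining, k = {}, set(adj), 0
    while remaining:
        while not buckets[k]:                # skip empty hierarchies
            k += 1
        v = buckets[k].pop()                 # peel a minimum-degree vertex
        remaining.discard(v)
        core[v] = k
        for u in adj[v]:                     # peel v's incident edges
            if u in remaining and degree[u] > k:
                buckets[degree[u]].discard(u)
                degree[u] -= 1               # clamp at k: degree never drops below
                buckets[degree[u]].add(u)    # the current hierarchy
    return core
```

On a triangle with one pendant vertex, the pendant is peeled at level 1 and the triangle at level 2. The inner `while not buckets[k]` loop is exactly the empty-hierarchy scan that AdaptiveCore's adaptive skipping targets.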
Dynamic Quantum Circuit Compilation
IF 3.8 | CAS Tier 2, Computer Science | Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2025-12-15 | DOI: 10.1109/TC.2025.3643826
Kun Fang;Munan Zhang;Ruqi Shi;Yinan Li
The practical applications of quantum computing are currently limited by the small number of available qubits. Recent advances in quantum hardware have introduced mid-circuit measurements and resets, enabling the reuse of measured qubits and thus reducing the qubit requirements for executing quantum algorithms. In this work, we present a systematic study of dynamic quantum circuit compilation, a process that transforms static quantum circuits into their dynamic equivalents with fewer qubits through qubit reuse. We establish the first graph-based framework for optimizing qubit-reuse compilation. In particular, we characterize the task of finding the optimal compilation strategy for maximizing qubit reuse using binary integer programming and provide efficient heuristic algorithms for devising general compilation strategies. We conduct a thorough analysis of quantum circuits with practical relevance and offer their optimal qubit-reuse compilation strategies. We also perform a comparative analysis against state-of-the-art approaches, demonstrating the superior performance of our methods in both structured and random quantum circuits. Our framework lays a rigorous foundation for understanding dynamic quantum circuit compilation via qubit reuse, holding significant promise for the practical implementation of large-scale quantum algorithms on quantum computers with limited resources.
IEEE Transactions on Computers, vol. 75, no. 2, pp. 748-759.
Citations: 0
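To make the qubit-reuse idea concrete: once a qubit is measured mid-circuit and reset, it can stand in for a logical qubit whose first gate comes later. If the causal dependencies introduced by entangling gates are ignored, each logical qubit occupies a time interval between its first and last gate, and minimizing physical qubits reduces to interval scheduling. The paper formulates the real dependency-aware problem with binary integer programming; the greedy min-heap sketch below covers only this dependency-free toy case, and all names in it are illustrative.

```python
import heapq

def physical_qubits_needed(lifetimes):
    """Greedy qubit-reuse sketch over (first_use, last_use) intervals.

    A physical qubit becomes reusable (after measure-and-reset) once its
    logical qubit's interval ends. Process intervals in start order and
    reuse the earliest-released physical qubit when possible.
    """
    free_at = []                              # min-heap of release times
    count = 0
    for start, end in sorted(lifetimes):
        if free_at and free_at[0] < start:    # a qubit was already measured: reuse it
            heapq.heappop(free_at)
        else:
            count += 1                        # allocate a fresh physical qubit
        heapq.heappush(free_at, end)
    return count
```

Three logical qubits with lifetimes (0,2), (1,3), (3,5) fit on two physical qubits, since the third can reuse the qubit released at time 2. Entangling gates add precedence constraints that can forbid such reuse, which is why the exact formulation in the paper is needed.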