
Latest Publications: IEEE Transactions on Parallel and Distributed Systems

2025 Reviewers List*
IF 6.0 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2026-01-14 | DOI: 10.1109/TPDS.2025.3639693
Citations: 0
Optimizing Management of Persistent Data Structures in High-Performance Analytics
IF 6.0 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-31 | DOI: 10.1109/TPDS.2025.3646133
Karim Youssef;Keita Iwabuchi;Maya Gokhale;Wu-chun Feng;Roger Pearce
Large-scale data analytics workflows ingest massive input data into various data structures, including graphs and key-value datastores. These data structures undergo multiple transformations and computations and are typically reused in incremental and iterative analytics workflows. Persisting in-memory views of these data structures enables reusing them beyond the scope of a single program run while avoiding repetitive raw data ingestion overheads. Memory-mapped I/O enables persisting in-memory data structures without data serialization and deserialization overheads. However, memory-mapped I/O lacks a key feature: persisting consistent snapshots of these data structures for incremental ingestion and processing. The obstacles to efficient virtual memory snapshots using memory-mapped I/O include background writebacks outside the application’s control and the high storage footprint of such snapshots. To address these limitations, we present Privateer, a memory and storage management tool that enables storage-efficient virtual memory snapshotting while also optimizing snapshot I/O performance. We integrated Privateer into Metall, a state-of-the-art persistent memory allocator for C++, and the Lightning Memory-Mapped Database (LMDB), a widely used key-value datastore in data analytics and machine learning. Privateer improved application performance by 1.22× when storing data structure snapshots to node-local storage, and by up to 16.7× when storing snapshots to a parallel file system. Privateer also improves the storage efficiency of incremental data structure snapshots by up to 11× using data deduplication and compression.
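The core mechanism the abstract builds on, persisting a live data structure through memory-mapped I/O with an explicit snapshot point, can be sketched in a few lines. This is a minimal illustration, not the Privateer or Metall API; the file name, slot layout, and counter semantics are invented for the example.

```python
import mmap
import os
import struct

# Minimal sketch of memory-mapped persistence (illustrative only -- not
# the Privateer or Metall API). A data structure living in a file-backed
# mapping is mutated in place, and flush() (msync) marks an explicit,
# consistent snapshot point -- no serialize/deserialize step is needed.
PATH = "counters.bin"   # hypothetical backing file
N = 1024                # 8-byte slots in the "data structure"
SIZE = N * 8

if not os.path.exists(PATH):
    with open(PATH, "wb") as f:
        f.truncate(SIZE)  # allocate the backing store once

with open(PATH, "r+b") as f:
    mm = mmap.mmap(f.fileno(), SIZE)
    # increment slot 0 in place; later runs see the persisted value
    old = struct.unpack_from("<q", mm, 0)[0]
    struct.pack_into("<q", mm, 0, old + 1)
    mm.flush()            # snapshot: force dirty pages to storage
    run_count = struct.unpack_from("<q", mm, 0)[0]
    mm.close()

print(run_count)          # grows by one on every program run
```

The problem the paper addresses is what this sketch lacks: the OS may write back dirty pages at any time, so without extra machinery there is no guarantee the on-disk state between `flush()` calls is a consistent snapshot.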
Citations: 0
Faster Vertex Cover Algorithms on GPUs With Component-Aware Parallel Branching
IF 6.0 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-23 | DOI: 10.1109/TPDS.2025.3641049
Hussein Amro;Basel Fakhri;Amer E. Mouawad;Izzat El Hajj
Algorithms for finding minimum or bounded vertex covers in graphs use a branch-and-reduce strategy, which involves exploring a highly imbalanced search tree. Prior GPU solutions assign different thread blocks to different sub-trees, while using a shared worklist to balance the load. However, these prior solutions do not scale to large and complex graphs because they are unaware of when the graph splits into components, which causes them to solve these components redundantly. Moreover, their high memory footprint limits the number of workers that can execute concurrently. We propose a novel GPU solution for vertex cover problems that detects when a graph splits into components and branches on the components independently. Although the need to aggregate the solutions of different components introduces non-tail-recursive branches which interfere with load balancing, we overcome this challenge by delegating the post-processing to the last descendant of each branch. We also reduce the memory footprint by reducing the graph and inducing a subgraph before exploring the search tree. Our solution substantially outperforms the state-of-the-art GPU solution, finishing in seconds when the state-of-the-art solution exceeds 6 hours. To the best of our knowledge, our work is the first to parallelize non-tail-recursive branching patterns on GPUs in a load balanced manner.
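The component-aware idea is easy to state sequentially: once the graph splits, the minimum vertex cover of the whole graph is the sum of the minimum covers of its connected components, so each component can be solved independently. A small brute-force sketch of that decomposition (illustrative only; the paper's contribution is doing this at scale on GPUs with load-balanced, non-tail-recursive branching):

```python
from itertools import combinations

# Sequential sketch of the component-aware decomposition: find connected
# components, solve each independently, and sum the cover sizes.

def components(adj):
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:
            v = stack.pop()
            if v in comp:
                continue
            comp.add(v)
            stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def mvc_size(adj):
    # brute force on one component: smallest vertex set covering all edges
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    verts = list(adj)
    for k in range(len(verts) + 1):
        for cand in combinations(verts, k):
            c = set(cand)
            if all(u in c or v in c for u, v in edges):
                return k
    return 0

def mvc(adj):
    return sum(mvc_size({v: adj[v] & comp for v in comp})
               for comp in components(adj))

# two disjoint triangles: each needs 2 cover vertices -> total 4
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1},
     3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
print(mvc(g))  # 4
```

Solving each triangle once instead of exploring the product of both components' search trees is exactly the redundancy the component-aware branching avoids.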
Citations: 0
Cost-Effective Empirical Performance Modeling
IF 6.0 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-18 | DOI: 10.1109/TPDS.2025.3646119
Marcus Ritter;Benedikt Naumann;Alexandru Calotoiu;Sebastian Rinke;Thorsten Reimann;Torsten Hoefler;Felix Wolf
Performance models help us to understand how HPC applications scale, which is crucial for efficiently utilizing HPC resources. They describe the performance (e.g., runtime) as a function of one or more execution parameters (e.g., problem size and the degree of parallelism). Creating one manually for a given program is challenging and time-consuming. Automatically learning a model from performance data is a viable alternative, but potentially resource-intensive. Extra-P is a tool that implements this approach. The user begins by selecting values for each parameter. Each combination of values defines a possible measurement point. The choice of measurement points affects the quality and cost of the resulting models, creating a complex optimization problem. A naive approach takes measurements for all possible measurement points, the number of which grows exponentially with the number of parameters. In our earlier work, we demonstrated that a quasi-linear number of points is sufficient and that prioritizing the least expensive points is a generic strategy with a good trade-off between cost and quality. Here, we present an improved selection strategy based on Gaussian process regression (GPR) that selects points individually for each modeling task. In our synthetic evaluation, which was based on tens of thousands of artificially generated functions, the naive approach achieved 66% accuracy with two model parameters and 5% artificial noise. At only 10% of the naive approach’s cost, the generic approach already achieved 47.3% accuracy, while the GPR-based approach achieved 77.8% accuracy. Similar improvements were observed in experiments involving different numbers of model parameters and noise levels, as well as in case studies with realistic applications.
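The trade-off the abstract describes can be made concrete with a toy parameter grid. The parameter values and the cost model below are invented for illustration; Extra-P's actual generic and GPR-based strategies are considerably more sophisticated than this cheapest-first cut.

```python
from itertools import product

# Toy illustration of the measurement-point selection problem. The naive
# approach measures the full cartesian grid (exponential in the number
# of parameters); the generic strategy keeps only a cheap, quasi-linear
# subset, prioritizing the least expensive points.

procs = [2, 4, 8, 16, 32]      # degree of parallelism (assumed values)
sizes = [10, 20, 30, 40, 50]   # problem size (assumed values)

def cost(point):
    p, n = point
    return p * n               # assumed cost model: core-hours ~ p * n

naive = list(product(procs, sizes))                   # all 25 points
budgeted = sorted(naive, key=cost)[:2 * len(procs)]   # quasi-linear subset

print(len(naive), len(budgeted))  # 25 10
print(budgeted[0])                # cheapest point first: (2, 10)
```

The GPR-based strategy replaces the fixed cheapest-first ordering with a per-task choice: it picks the next point whose expected information gain per unit cost is highest for the model currently being fitted.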
Citations: 0
Computational Burst Buffers: Accelerating HPC I/O via In-Storage Compression Offloading
IF 6.0 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-11 | DOI: 10.1109/TPDS.2025.3643175
Xiang Chen;Bing Lu;Haoquan Long;Huizhang Luo;Yili Ma;Guangming Tan;Dingwen Tao;Fei Wu;Tao Lu
Burst buffers (BBs) act as an intermediate storage layer between compute nodes and parallel file systems (PFS), effectively alleviating the I/O performance gap in high-performance computing (HPC). As scientific simulations and AI workloads generate larger checkpoints and analysis outputs, BB capacity shortages and PFS bandwidth bottlenecks are emerging, and CPU-based compression is not an effective solution due to its high overhead. We introduce Computational Burst Buffers (CBBs), a storage paradigm that embeds hardware compression engines such as application-specific integrated circuit (ASIC) inside computational storage drives (CSDs) at the BB tier. CBB transparently offloads both lossless and error-bounded lossy compression from CPUs to CSDs, thereby (i) expanding effective SSD-backed BB capacity, (ii) reducing BB–PFS traffic, and (iii) eliminating contention and energy overheads of CPU-based compression. Unlike prior CSD-based compression designs targeting databases or flash caching, CBB co-designs the burst-buffer layer and CSD hardware for HPC and quantitatively evaluates compression offload in BB–PFS hierarchies. We prototype CBB using a PCIe 5.0 CSD with an ASIC Zstd-like compressor and an FPGA prototype of an SZ entropy encoder, and evaluate CBB on a 16-node cluster. Experiments with four representative HPC applications and a large-scale workflow simulator show up to 61% lower application runtime, 8–12× higher cache hit ratios, and substantially reduced compute-node CPU utilization compared to software compression and conventional BBs. These results demonstrate that compression-aware BBs with CSDs provide a practical, scalable path to next-generation HPC storage.
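Error-bounded lossy compression, one of the two modes CBB offloads to the drive, can be illustrated with a toy quantize-then-deflate scheme. This is not the actual SZ pipeline (SZ adds prediction and entropy coding), just a demonstration of the error-bound guarantee: quantizing to multiples of 2·eps keeps every reconstructed value within eps of the original.

```python
import random
import struct
import zlib

# Toy error-bounded lossy compressor (illustrative; not the SZ encoder
# the paper prototypes in hardware). Quantize each value to the nearest
# multiple of 2*eps, then losslessly deflate the integer codes.

def compress(values, eps):
    codes = [round(v / (2 * eps)) for v in values]
    raw = b"".join(struct.pack("<i", c) for c in codes)
    return zlib.compress(raw)

def decompress(blob, eps):
    raw = zlib.decompress(blob)
    codes = struct.unpack(f"<{len(raw) // 4}i", raw)
    return [c * 2 * eps for c in codes]

random.seed(0)
data = [random.uniform(0.0, 1.0) for _ in range(4096)]
eps = 1e-3
blob = compress(data, eps)
out = decompress(blob, eps)

# every reconstructed value is within the requested error bound
assert all(abs(a - b) <= eps for a, b in zip(data, out))
print(len(blob) < len(data) * 8)  # smaller than raw 8-byte doubles
```

In CBB the quantization and encoding run on the drive's ASIC/FPGA instead of this host-side loop, which is what removes the CPU contention and energy cost of compression.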
Citations: 0
Enabling Tile-Based Direct Query on Adaptively Compressed Data With GPU Acceleration
IF 6.0 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-02 | DOI: 10.1109/TPDS.2025.3639485
Yu Zhang;Feng Zhang;Yani Liu;Huanchen Zhang;Jidong Zhai;Wenchao Zhou;Xiaoyong Du
The explosive growth of data poses significant challenges for GPU-based databases, which must balance limited memory capacity with the need for high-speed query execution. Compression has become an essential technique for optimizing memory utilization and reducing data movement. However, its benefits have been limited by the need to decompress data before querying. Querying compressed data conventionally requires decompression, which causes the query process to be significantly slower than a direct query on uncompressed data. To address this problem, this article presents a novel GPU-accelerated tile-based direct query framework that eliminates this limitation, significantly enhancing query performance. By employing direct query strategies, the framework minimizes data movement and maximizes memory bandwidth utilization. It incorporates tile-based hardware-conscious execution strategies for direct query, including memory management and control flow coordination, to improve execution efficiency. Additionally, adaptive data-driven compression formats are paired with tailored SQL operators to enable efficient support for diverse queries. Our experiments, conducted using the Star Schema Benchmark, show an average improvement of 3.5× compared to the state-of-the-art tile-based decompression scheme, while maintaining the space-saving advantages of compression. Notably, our solution consistently outperforms existing direct execution schemes for compressed data across all query types.
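What "direct query" means in principle can be shown with the simplest compressed format, run-length encoding: an aggregate with a predicate is evaluated per run, so the column is never materialized. This sketch is only a conceptual analogue; the paper's framework handles adaptive formats, tiles, and GPU execution.

```python
# Sketch of querying compressed data without decompressing it: a
# filtered SUM evaluated directly on a run-length-encoded column.

def rle_encode(col):
    runs = []
    for v in col:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs  # list of [value, run_length]

def sum_where_gt(runs, threshold):
    # one comparison and one multiply cover an entire run, instead of
    # one comparison per decompressed element
    return sum(v * n for v, n in runs if v > threshold)

col = [3, 3, 3, 7, 7, 1, 1, 1, 1, 9]
runs = rle_encode(col)

assert sum_where_gt(runs, 2) == sum(v for v in col if v > 2)
print(sum_where_gt(runs, 2))  # 3*3 + 7*2 + 9 = 32
```

The saving is the same one the abstract targets: operators touch the compressed representation, so both memory traffic and per-element work shrink with the compression ratio.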
Citations: 0
Cross-Rack Aware Recycle Technique in Erasure-Coded Data Centers
IF 6.0 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-01 | DOI: 10.1109/TPDS.2025.3639066
Hai Zhou;Dan Feng
Data centers commonly use erasure codes to maintain high data reliability with lower storage overhead than replication. However, recycling invalid data blocks caused by deletion and update operations is challenging in erasure-coded data centers. Erasure codes organize data blocks into stripes, and unlike with replication, we cannot directly delete invalid data blocks, because the redundancy of the remaining valid blocks within a stripe must be preserved. When considering recycling in data centers, existing studies leave the following problems unaddressed: heavy cross-rack traffic and load imbalance during recycling, and high disk seek overhead that degrades write performance after recycling. This paper presents the first systematic study of data recycling in erasure-coded data centers and proposes a Cross-rack Aware Recycle (CARecycle) technique. The key idea is to migrate valid data blocks from certain stripes to rewrite invalid ones in others, thereby releasing the invalid blocks of those stripes. Specifically, CARecycle first carefully examines the block distribution of each stripe and generates an efficient recycle solution for migrating and releasing, with the primary objective of reducing cross-rack traffic and the disk seek load of nodes. Because rewriting invalid data blocks requires concurrently updating the parity blocks of multiple stripes, CARecycle further batch-processes multiple stripes and selectively arranges appropriate stripes into a batch to achieve a uniform cross-rack traffic load distribution. In addition, CARecycle can be extended to different erasure codes and to boost recycling in heterogeneous network environments. Large-scale simulations and Amazon EC2 experiments show that, compared to a state-of-the-art recycling technique, CARecycle reduces cross-rack traffic by up to 33.8% and recycle time by 28.64%–59.64% while incurring low disk seek overhead.
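The stripe constraint behind the migrate-and-rewrite idea is easiest to see with a single XOR parity (real deployments use Reed-Solomon, but the algebra is analogous): an invalid block cannot simply be dropped, yet overwriting it with a migrated valid block only needs a parity *delta* update, not a full stripe re-encode. A minimal sketch:

```python
# Toy stripe with one XOR parity block (illustrative; the paper targets
# general erasure codes). Rewriting one data block patches parity with
# old XOR new rather than re-encoding the whole stripe.

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

k = 4
stripe = [bytes([i]) * 8 for i in range(k)]  # 4 data blocks
parity = stripe[0]
for blk in stripe[1:]:
    parity = xor(parity, blk)

# recycle step: overwrite invalid block 2 with a valid block migrated
# from another stripe, applying the parity delta
new_blk = bytes([42]) * 8
parity = xor(parity, xor(stripe[2], new_blk))
stripe[2] = new_blk

# parity recomputed from scratch still matches the patched parity
check = stripe[0]
for blk in stripe[1:]:
    check = xor(check, blk)
assert check == parity
print(check == parity)
```

CARecycle's batching exists because each such rewrite touches the parity blocks of its stripe, so stripes must be grouped to spread those parity updates evenly across racks.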
Citations: 0
Scheduling Jobs Under a Variable Number of Processors
IF 6.0 | CAS Tier 2 (Computer Science) | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-01 | DOI: 10.1109/TPDS.2025.3638703
Anne Benoit;Joachim Cendrier;Frédéric Vivien
Even though it is usually assumed that data centers can always operate at maximum capacity, there have been recent scenarios where the amount of electricity that can be used by data centers evolves over time. Hence, the number of available processors is not a constant anymore. In this work, we assume that jobs can be checkpointed before a resource change. Indeed, in the scenarios that we consider, the resource provider warns the user before a change in the number of processors. It is thus possible to anticipate and take checkpoints before the change happens, such that no work is ever lost. The goal is then to maximize the goodput and/or the minimum yield of jobs within the next section (the time between two changes in the number of processors). We model the problem and design greedy solutions and sophisticated dynamic programming algorithms with some optimality results for jobs of infinite duration, and adapt the algorithms to finite jobs. A comprehensive set of simulations, building on real-life job sets, demonstrates the performance of the proposed algorithms. Most algorithms achieve a useful platform utilization (goodput) of over 95%. With infinite jobs, the algorithms also maintain fairness by keeping the relative minimum yield above 0.8, meaning that each job gets good access to the platform (80% of the time that it would have had if each job had its perfect share of the platform). For finite jobs, the minimum yield can be low since very short new jobs may have to wait until the beginning of the next section to start (and finish), significantly impacting their yield. However, for 75% of the jobs within each workload, the yield ratio between these jobs is at most a factor of two, hence demonstrating the fairness of the proposed algorithms.
IEEE Transactions on Parallel and Distributed Systems, vol. 37, no. 2, pp. 427-442, 2025.
Citations: 0
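The goodput and minimum-yield metrics at the heart of this abstract can be illustrated with a toy computation. This is a minimal sketch under assumed simplified definitions (yield as the fraction of wall-clock time a job holds a processor; goodput as useful processor-time over platform capacity) — the paper's formal definitions may differ.

```python
# Toy illustration of the goodput / minimum-yield metrics (assumed simplified
# definitions; the paper's formal ones may differ).

def job_yield(alloc_time, elapsed_time):
    """Fraction of wall-clock time during which the job held a processor."""
    return alloc_time / elapsed_time if elapsed_time > 0 else 1.0

def platform_goodput(useful_work, capacity):
    """Fraction of available processor-time spent on useful (never-lost) work."""
    return useful_work / capacity if capacity > 0 else 0.0

# Three jobs over a 10-unit section on a 3-processor platform:
jobs = [(8.0, 10.0), (5.0, 10.0), (9.5, 10.0)]  # (allocated, elapsed) per job
min_yield = min(job_yield(a, e) for a, e in jobs)              # -> 0.5
goodput = platform_goodput(sum(a for a, _ in jobs), 3 * 10.0)  # -> 0.75
```

Because checkpointing before each resource change loses no work, all allocated time counts toward goodput in this toy model; the scheduling problem is then how to distribute that time so the minimum yield stays high.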
EdgeDup: Popularity-Aware Communication-Efficient Decentralized Edge Data Deduplication
IF 6.0 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-12-01 | DOI: 10.1109/TPDS.2025.3638945
Ruikun Luo;Wang Yang;Qiang He;Feifei Chen;Song Wu;Hai Jin;Yun Yang
Data deduplication, originally designed for cloud storage systems, is increasingly popular in edge storage systems due to the costly, limited resources and prevalent data redundancy of edge computing environments. The geographical distribution of edge servers makes it challenging to aggregate all data-storage information for global decision-making. Existing edge data deduplication (EDD) methods rely on centralized cloud control, which raises issues of timeliness and system scalability. They also overlook data popularity, leading to significantly increased data retrieval latency. A promising alternative is distributed EDD without cloud control, performing regional deduplication centered on the edge server that requires it. However, our investigation reveals that existing distributed EDD approaches either fail to account for the impact of collaborative caching on data availability or generate excessive information exchange between edge servers, incurring high communication overhead. To tackle this challenge, this paper presents EdgeDup, which implements effective EDD in a distributed manner while keeping data retrieval latency low to preserve data availability. EdgeDup achieves its goals by: 1) identifying data redundancies across the system's edge servers; 2) deduplicating data based on their popularity; and 3) reducing communication overheads using a novel data dependency index. Extensive experimental results show that EdgeDup significantly enhances performance, reducing data retrieval latency by an average of 47.78% compared to state-of-the-art EDD approaches while maintaining a comparable deduplication ratio.
IEEE Transactions on Parallel and Distributed Systems, vol. 37, no. 2, pp. 459-471, 2025.
Citations: 0
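Step 1) of EdgeDup — identifying redundant data across edge servers — can be sketched with content hashing: identical blocks hash to the same identifier regardless of which server holds them. All names below (`find_duplicates`, the toy server map) are illustrative assumptions, not EdgeDup's actual interface; the popularity-aware deduplication policy and the data dependency index are not modeled.

```python
import hashlib

# Content-hash sketch of cross-server redundancy detection
# (illustrative names; not EdgeDup's actual interface).

def block_id(data):
    """Content-defined identifier: identical blocks hash identically."""
    return hashlib.sha256(data).hexdigest()

def find_duplicates(servers):
    """Map each content hash held by more than one server to its holders."""
    locations = {}
    for server, blocks in servers.items():
        for b in blocks:
            locations.setdefault(block_id(b), []).append(server)
    return {h: s for h, s in locations.items() if len(s) > 1}

# Two edge servers that both store the same "photo" block:
servers = {"edge-1": [b"photo", b"log"], "edge-2": [b"photo", b"cfg"]}
dups = find_duplicates(servers)  # one redundant block, held by both servers
```

A popularity-aware scheme like EdgeDup's would then keep extra replicas of hot blocks among these duplicates rather than deduplicating them all, trading storage for retrieval latency.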
FLUXLog: A Federated Mixture-of-Experts Framework for Unified Log Anomaly Detection
IF 6.0 | CAS Tier 2, Computer Science | Q1 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2025-11-28 | DOI: 10.1109/TPDS.2025.3638693
Yixiao Xia;Yinghui Zhao;Jian Wan;Congfeng Jiang
Traditional log anomaly detection systems are centralized, which risks privacy leakage during data transmission. Previous research has mainly focused on single-domain logs, requiring domain-specific models and retraining, which limits flexibility and scalability. In this paper, we propose FLUXLog, a unified federated cross-domain log anomaly detection approach based on a Mixture of Experts (MoE) that handles heterogeneous log data. Based on our insights, we establish a two-phase training process: pre-training the gating network to assign expert weights according to the data distribution, followed by expert-driven top-down feature fusion. Subsequent training of the gating network fine-tunes adapters, giving the model the flexibility to adapt across domains while maintaining expert specialization. This training paradigm enables a Hybrid Specialization Strategy, fostering both domain-specific expertise and cross-domain generalization. The Cross Gated-Experts Module (CGEM) then fuses the expert weights with dual-channel outputs. Experiments on public datasets demonstrate that our model outperforms baseline models in handling unified cross-domain log data.
IEEE Transactions on Parallel and Distributed Systems, vol. 37, no. 2, pp. 395-409, 2025.
Citations: 0
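The mixture-of-experts forward pass that FLUXLog builds on can be sketched minimally: a gating network produces per-expert weights via softmax, and the output is the weighted sum of the expert outputs. This is a generic MoE illustration under assumed toy experts and gate, not FLUXLog's architecture — the CGEM fusion and adapter fine-tuning are omitted.

```python
import math

# Generic mixture-of-experts forward pass (not FLUXLog's architecture).

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, gate):
    """Weight each expert's output by the gating network's softmax scores."""
    weights = softmax(gate(x))
    outputs = [f(x) for f in experts]
    return sum(w * o for w, o in zip(weights, outputs)), weights

experts = [lambda x: x + 1.0, lambda x: 2.0 * x]  # two toy experts
gate = lambda x: [0.0, 0.0]   # toy gate: uniform logits -> equal weights
fused, weights = moe_forward(1.0, experts, gate)  # fused == 2.0
```

In FLUXLog's two-phase scheme, the gate would first be pre-trained to match the data distribution and then fine-tuned via adapters, keeping each expert specialized to its log domain.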