VPI: Vehicle Programming Interface for Vehicle Computing
Pub Date: 2024-01-30 DOI: 10.1007/s11390-024-4035-2
Bao-Fu Wu, Ren Zhong, Yuxin Wang, Jian Wan, Ji-Lin Zhang, Weisong Shi

The emergence of software-defined vehicles (SDVs), combined with autonomous driving technologies, has enabled a new era of vehicle computing (VC), in which vehicles serve as mobile computing platforms. However, the interdisciplinary complexity of automotive systems and their diverse technological requirements make developing applications for autonomous vehicles challenging. To simplify the development of applications running on SDVs, we propose a comprehensive suite of vehicle programming interfaces (VPIs). In this study, we rigorously explore the requirements for application development in VC, centering our analysis on the architecture of the Open Vehicular Data Analytics Platform (OpenVDAP). We then detail a suite of standardized VPIs spanning five critical categories: Hardware, Data, Computation, Service, and Management. To validate the design of the VPIs, we conduct experiments with the indoor autonomous vehicle Zebra and develop a prototype of OpenVDAP. Compared with the industry-influential AUTOSAR interface, our VPIs deliver significant gains in programming efficiency, marking an important advance in SDV application development. We also present a case study and evaluate its performance. Our work shows that VPIs substantially improve the efficiency of developing VC applications, meet both current and future technological demands, and propel the software-defined automotive industry toward a more interconnected and intelligent future.
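The abstract does not show the interfaces themselves; as a rough illustration of how an application might compose the five VPI categories, here is a minimal sketch in which every class and method name (Camera, DataBus, Scheduler, and so on) is a hypothetical stand-in, not the paper's actual API.

```python
# Hypothetical sketch of an application composed from the five VPI
# categories named in the abstract (Hardware, Data, Computation,
# Service, Management). Every name here is an illustrative assumption.

class Camera:                                  # Hardware: sensor access
    def read_frame(self) -> bytes:
        return b"\x00" * (640 * 480)           # placeholder frame

class DataBus:                                 # Data: on-vehicle pub/sub
    def __init__(self):
        self.topics: dict[str, list] = {}
    def publish(self, topic: str, msg) -> None:
        self.topics.setdefault(topic, []).append(msg)

class Scheduler:                               # Computation: task placement
    def submit(self, fn, *args):
        return fn(*args)                       # run locally; a real VPI
                                               # might offload to the edge

def detect_obstacles(frame: bytes) -> list:    # Service: reusable perception
    return []                                  # stub detector

def main() -> None:
    cam, bus, sched = Camera(), DataBus(), Scheduler()
    obstacles = sched.submit(detect_obstacles, cam.read_frame())
    bus.publish("perception/obstacles", obstacles)
    # a Management VPI would meter, monitor, and police this data flow

if __name__ == "__main__":
    main()
```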
{"title":"VPI: Vehicle Programming Interface for Vehicle Computing","authors":"Bao-Fu Wu, Ren Zhong, Yuxin Wang, Jian Wan, Ji-Lin Zhang, Weisong Shi","doi":"10.1007/s11390-024-4035-2","DOIUrl":"https://doi.org/10.1007/s11390-024-4035-2","url":null,"abstract":"<p>The emergence of software-defined vehicles (SDVs), combined with autonomous driving technologies, has enabled a new era of vehicle computing (VC), where vehicles serve as a mobile computing platform. However, the interdisciplinary complexities of automotive systems and diverse technological requirements make developing applications for autonomous vehicles challenging. To simplify the development of applications running on SDVs, we propose a comprehensive suite of vehicle programming interfaces (VPIs). In this study, we rigorously explore the nuanced requirements for application development within the realm of VC, centering our analysis on the architectural intricacies of the Open Vehicular Data Analytics Platform (OpenVDAP). We then detail our creation of a comprehensive suite of standardized VPIs, spanning five critical categories: Hardware, Data, Computation, Service, and Management, to address these evolving programming requirements. To validate the design of VPIs, we conduct experiments using the indoor autonomous vehicle, Zebra, and develop the OpenVDAP prototype system. By comparing it with the industry-influential AUTOSAR interface, our VPIs demonstrate significant enhancements in programming efficiency, marking an important advancement in the field of SDV application development. We also show a case study and evaluate its performance. Our work highlights that VPIs significantly enhance the efficiency of developing applications on VC. They meet both current and future technological demands and propel the software-defined automotive industry toward a more interconnected and intelligent future.</p>","PeriodicalId":50222,"journal":{"name":"Journal of Computer Science and Technology","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140581911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

10-Million Atoms Simulation of First-Principle Package LS3DF
Pub Date: 2024-01-30 DOI: 10.1007/s11390-023-3011-6
Yu-Jin Yan, Hai-Bo Li, Tong Zhao, Lin-Wang Wang, Lin Shi, Tao Liu, Guang-Ming Tan, Wei-Le Jia, Ning-Hui Sun
The growing demand for semiconductor device simulation poses a major challenge for large-scale electronic structure calculations. Among various methods, the linearly scaling three-dimensional fragment (LS3DF) method exhibits excellent scalability in large-scale simulations. Based on algorithmic and system-level optimizations, we propose a highly scalable and efficient implementation of LS3DF on a domestic heterogeneous supercomputer equipped with accelerators. On the algorithmic side, the original all-band conjugate gradient algorithm is refined for faster convergence, and mixed-precision computing is adopted to increase overall efficiency. On the system side, the original two-layer parallel structure is replaced by a coarse-grained parallel method, and optimization strategies such as multi-stream execution, kernel fusion, and redundant computation removal further increase utilization of the computational power of the heterogeneous machines. As a result, our optimized LS3DF scales to a 10-million-atom silicon system, attaining a peak performance of 34.8 PFLOPS (21.2% of the machine peak). All the improvements can be adapted to next-generation supercomputers for larger simulations.
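As a generic illustration of the mixed-precision idea mentioned above (not the LS3DF all-band CG solver itself), the following sketch performs cheap float32 inner iterations with float64 residual corrections, a standard iterative-refinement pattern:

```python
# A minimal sketch, assuming a diagonally dominant system: run cheap
# float32 Jacobi sweeps inside, correct with float64 residuals outside.
import numpy as np

def solve_mixed_precision(A, b, inner_iters=50, outer_iters=5):
    x = np.zeros_like(b, dtype=np.float64)
    A32 = A.astype(np.float32)
    for _ in range(outer_iters):
        r = b - A @ x                          # residual in float64
        d = np.zeros(len(b), dtype=np.float32)
        r32 = r.astype(np.float32)
        for _ in range(inner_iters):           # float32 Jacobi sweeps
            d = d + (r32 - A32 @ d) / np.diag(A32)
        x = x + d.astype(np.float64)           # float64 correction step
    return x

rng = np.random.default_rng(0)
A = np.diag(np.full(100, 4.0)) + 0.01 * rng.standard_normal((100, 100))
b = rng.standard_normal(100)
x = solve_mixed_precision(A, b)
print(np.linalg.norm(A @ x - b))               # small final residual
```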
{"title":"10-Million Atoms Simulation of First-Principle Package LS3DF","authors":"Yu-Jin Yan, Hai-Bo Li, Tong Zhao, Lin-Wang Wang, Lin Shi, Tao Liu, Guang-Ming Tan, Wei-Le Jia, Ning-Hui Sun","doi":"10.1007/s11390-023-3011-6","DOIUrl":"https://doi.org/10.1007/s11390-023-3011-6","url":null,"abstract":"<p>The growing demand for semiconductor devices simulation poses a big challenge for large-scale electronic structure calculations. Among various methods, the linearly scaling three-dimensional fragment (LS3DF) method exhibits excellent scalability in large-scale simulations. Based on algorithmic and system-level optimizations, we propose a highly scalable and highly efficient implementation of LS3DF on a domestic heterogeneous supercomputer equipped with accelerators. In terms of algorithmic optimizations, the original all-band conjugate gradient algorithm is refined to achieve faster convergence, and mixed precision computing is adopted to increase overall efficiency. In terms of system-level optimizations, the original two-layer parallel structure is replaced by a coarse-grained parallel method. Optimization strategies such as multi-stream, kernel fusion, and redundant computation removal are proposed to increase further utilization of the computational power provided by the heterogeneous machines. As a result, our optimized LS3DF can scale to a 10-million silicon atoms system, attaining a peak performance of 34.8 PFLOPS (21.2% of the peak). All the improvements can be adapted to the next-generation supercomputers for larger simulations.</p>","PeriodicalId":50222,"journal":{"name":"Journal of Computer Science and Technology","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140581920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

SMEC: Scene Mining for E-Commerce
Pub Date: 2024-01-30 DOI: 10.1007/s11390-021-1277-0
Gang Wang, Xiang Li, Zi-Yi Guo, Da-Wei Yin, Shuai Ma
Scene-based recommendation has proven its usefulness in E-commerce by recommending commodities that fit a given scene. However, scenes are typically unknown in advance, which necessitates scene discovery. In this article, we study scene discovery for E-commerce systems. We first formalize a scene as a set of commodity categories that occur simultaneously and frequently in real-world situations, and model an E-commerce platform as a heterogeneous information network (HIN) whose nodes and links represent different types of objects and the relationships between them, respectively. We then formulate scene mining for E-commerce as an unsupervised learning problem that finds overlapping clusters of commodity categories in the HIN. To solve the problem, we propose SMEC (Scene Mining for E-Commerce), a method based on non-negative matrix factorization, and theoretically prove its convergence. Using six real-world E-commerce datasets, we conduct an extensive experimental study evaluating SMEC against 13 other methods, and show that SMEC consistently outperforms its competitors on various evaluation measures.
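To make the factorization idea concrete, here is a toy sketch, under the simplifying assumption of a plain category co-occurrence matrix rather than SMEC's heterogeneous information network: NMF factor loadings are thresholded to read off overlapping "scenes".

```python
# A minimal sketch of overlapping clustering via NMF. The categories
# and co-occurrence counts are invented for illustration.
import numpy as np
from sklearn.decomposition import NMF

categories = ["tent", "sleeping_bag", "stove", "phone", "charger", "case"]
X = np.array([[0, 9, 7, 0, 0, 0],      # toy co-occurrence counts
              [9, 0, 6, 0, 0, 0],
              [7, 6, 0, 1, 0, 0],
              [0, 0, 1, 0, 8, 9],
              [0, 0, 0, 8, 0, 7],
              [0, 0, 0, 9, 7, 0]], dtype=float)

W = NMF(n_components=2, init="nndsvda", random_state=0).fit_transform(X)
threshold = 0.5 * W.max()
for k in range(W.shape[1]):
    scene = [c for c, w in zip(categories, W[:, k]) if w > threshold]
    print(f"scene {k}: {scene}")   # a category may belong to many scenes
```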
{"title":"SMEC: Scene Mining for E-Commerce","authors":"Gang Wang, Xiang Li, Zi-Yi Guo, Da-Wei Yin, Shuai Ma","doi":"10.1007/s11390-021-1277-0","DOIUrl":"https://doi.org/10.1007/s11390-021-1277-0","url":null,"abstract":"<p>Scene-based recommendation has proven its usefulness in E-commerce, by recommending commodities based on a given scene. However, scenes are typically unknown in advance, which necessitates scene discovery for E-commerce. In this article, we study scene discovery for E-commerce systems. We first formalize a scene as a set of commodity categories that occur simultaneously and frequently in real-world situations, and model an E-commerce platform as a heterogeneous information network (HIN), whose nodes and links represent different types of objects and different types of relationships between objects, respectively. We then formulate the scene mining problem for E-commerce as an unsupervised learning problem that finds the overlapping clusters of commodity categories in the HIN. To solve the problem, we propose a non-negative matrix factorization based method SMEC (Scene Mining for E-Commerce), and theoretically prove its convergence. Using six real-world E-commerce datasets, we finally conduct an extensive experimental study to evaluate SMEC against 13 other methods, and show that SMEC consistently outperforms its competitors with regard to various evaluation measures.</p>","PeriodicalId":50222,"journal":{"name":"Journal of Computer Science and Technology","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140602367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

DIR: Dynamic Request Interleaving for Improving the Read Performance of Aged Solid-State Drives
Pub Date: 2024-01-30 DOI: 10.1007/s11390-023-1601-y
Shi-Qiang Nie, Chi Zhang, Wei-Guo Wu
Triple-level cell (TLC) NAND flash is increasingly adopted to build solid-state drives (SSDs) for modern computer systems. While TLC NAND flash effectively improves storage density, it faces severe reliability issues; in particular, different page types exhibit different raw bit error rates (RBERs). Integrating strong low-density parity-check (LDPC) codes improves reliability but prolongs read latency in proportion to the multiple read retries needed for worse pages. The straightforward idea is that dispersing page-sized data across several pages of different types can achieve a lower average RBER and thus reduce read latency. However, directly implementing this idea in the flash translation layer (FTL) induces read amplification, because a logical page residing in more than one physical page requires several read operations. In this paper, we propose Dynamic Request Interleaving (DIR), a technique for improving the performance of TLC NAND flash-based SSDs, in particular aged ones with large RBERs. DIR exploits the observation that the latency of an I/O request is determined, queuing time aside, by the access to the slowest device page, i.e., the page with the highest RBER. By grouping consecutive logical pages that have high locality and interleaving their encoded data across device page types with different RBERs, DIR effectively reduces the number of LDPC read retries with limited read amplification. To meet the requirement of allocating hybrid page types for interleaved data, we also design a page-interleaving-friendly page allocation scheme, which splits the planes into multi-plane regions that store interleaved data and single-plane regions that store normal data. Pages in a multi-plane region can be read and written in parallel by the proposed multi-plane command, avoiding the read amplification issue. Based on the DIR scheme and the proposed page allocation scheme, we build a DIR-enabled FTL that integrates both with modest modifications. Our experimental results show that adopting DIR in aged SSDs exploits nearly 33% of the locality in I/O requests and reduces read latency by 43% on average compared with conventional aged SSDs.
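A back-of-the-envelope sketch of why interleaving helps: striping a codeword across page types averages their RBERs, so the worst page no longer dictates LDPC read retries. The RBER values below are made up for illustration, not measurements from the paper.

```python
# Assumed per-page-type raw bit error rates (illustrative only).
lsb, csb, msb = 1e-4, 5e-4, 2e-3

worst_case = msb                        # whole codeword on the worst page
interleaved = (lsb + csb + msb) / 3     # codeword striped evenly across types

print(f"worst page RBER:  {worst_case:.2e}")
print(f"interleaved RBER: {interleaved:.2e}")   # ~2.3x lower in this toy case
```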
{"title":"DIR: Dynamic Request Interleaving for Improving the Read Performance of Aged Solid-State Drives","authors":"Shi-Qiang Nie, Chi Zhang, Wei-Guo Wu","doi":"10.1007/s11390-023-1601-y","DOIUrl":"https://doi.org/10.1007/s11390-023-1601-y","url":null,"abstract":"<p>Triple-level cell (TLC) NAND flash is increasingly adopted to build solid-state drives (SSDs) for modern computer systems. While TLC NAND flash effectively improves storage density, it faces severe reliability issues; in particular, the pages exhibit different raw bit error rates (RBERs). Integrating strong low-density parity-check (LDPC) code helps to improve reliability but suffers from prolonged and proportional read latency due to multiple read retries for worse pages. The straightforward idea is that dispersing page-size data across several pages in different types can achieve a lower average RBER and reduce the read latency. However, directly implementing this simple idea into flash translation layer (FTL) induces the read amplification issue as one logic page residing in more than one physical page brings several read operations. In this paper, we propose the Dynamic Request Interleaving (DIR) technology for improving the performance of TLC NAND flash-based SSDs, in particular, the aged ones with large RBERs. DIR exploits the observation that the latency of an I/O request is determined, without considering the queuing time, by the access of the slowest device page, i.e., the page that has the highest RBER. By grouping consecutive logical pages that have high locality and interleaving their encoded data in different types of device pages that have different RBERs, DIR effectively reduces the number of read retries for LDPC with limited read amplification. To meet the requirement of allocating hybrid page types for interleaved data, we also design a page-interleaving friendly page allocation scheme, which splits all the planes into multi-plane regions for storing the interleaved data and single-plane regions for storing the normal data. The pages in the multi-plane region can be read/written in parallel by the proposed multi-plane command and avoid the read amplification issue. Based on the DIR scheme and the proposed page allocation scheme, we build DIR-enable FTL, which integrates the proposed schemes into the FTL with some modifications. Our experimental results show that adopting DIR in aged SSDs exploits nearly 33% locality from I/O requests and, on average, reduces 43% read latency over conventional aged SSDs.</p>","PeriodicalId":50222,"journal":{"name":"Journal of Computer Science and Technology","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140581824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Research on General-Purpose Brain-Inspired Computing Systems
Pub Date: 2024-01-30 DOI: 10.1007/s11390-023-4002-3
Peng Qu, Xing-Long Ji, Jia-Jie Chen, Meng Pang, Yu-Chen Li, Xiao-Yi Liu, You-Hui Zhang

Brain-inspired computing is a new technology that draws on the principles of brain science and is oriented toward the efficient development of artificial general intelligence (AGI); a brain-inspired computing system is a hierarchical system composed of neuromorphic chips, basic software and hardware, and the algorithms/applications that embody this technology. While the field is developing rapidly, it faces various challenges and opportunities brought by interdisciplinary research, including the issue of software and hardware fragmentation. This paper analyzes the status quo of brain-inspired computing systems. Enlightened by design principles and methodologies of general-purpose computers, we propose constructing “general-purpose” brain-inspired computing systems: hierarchies built on the design philosophy of decoupling software from hardware, which can flexibly support various brain-inspired computing applications and neuromorphic chips with different architectures. We further introduce our recent work in these areas, including ANN (artificial neural network)/SNN (spiking neural network) development tools, a hardware-agnostic compilation infrastructure, and a chip micro-architecture that combines high programming flexibility with high performance. These studies show that a “general-purpose” system can remarkably improve the efficiency of application development and enhance the productivity of basic software, thereby helping to accelerate the advancement of various brain-inspired algorithms and applications. We believe this is the key to collaborative research and development, to the co-evolution of applications, basic software, and chips in this field, and to building a favorable software/hardware ecosystem for brain-inspired computing.
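A minimal sketch of the software/hardware decoupling argued for above: applications target one abstract interface, and per-chip backends are swapped in underneath. The backend classes and the network IR format are hypothetical illustrations, not the authors' toolchain.

```python
# Illustrative decoupling: application code never names a concrete chip.
from abc import ABC, abstractmethod

class NeuromorphicBackend(ABC):
    @abstractmethod
    def compile(self, network_ir: dict) -> str: ...

class ChipABackend(NeuromorphicBackend):
    def compile(self, network_ir):
        return f"chipA binary for {len(network_ir['layers'])} layers"

class ChipBBackend(NeuromorphicBackend):
    def compile(self, network_ir):
        return f"chipB config for {len(network_ir['layers'])} layers"

def deploy(network_ir: dict, backend: NeuromorphicBackend) -> str:
    return backend.compile(network_ir)   # one entry point, many chips

snn = {"layers": [{"type": "lif", "n": 128}, {"type": "lif", "n": 10}]}
print(deploy(snn, ChipABackend()))
print(deploy(snn, ChipBBackend()))
```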
{"title":"Research on General-Purpose Brain-Inspired Computing Systems","authors":"Peng Qu, Xing-Long Ji, Jia-Jie Chen, Meng Pang, Yu-Chen Li, Xiao-Yi Liu, You-Hui Zhang","doi":"10.1007/s11390-023-4002-3","DOIUrl":"https://doi.org/10.1007/s11390-023-4002-3","url":null,"abstract":"<p>Brain-inspired computing is a new technology that draws on the principles of brain science and is oriented to the efficient development of artificial general intelligence (AGI), and a brain-inspired computing system is a hierarchical system composed of neuromorphic chips, basic software and hardware, and algorithms/applications that embody this technology. While the system is developing rapidly, it faces various challenges and opportunities brought by interdisciplinary research, including the issue of software and hardware fragmentation. This paper analyzes the status quo of brain-inspired computing systems. Enlightened by some design principle and methodology of general-purpose computers, it is proposed to construct “general-purpose” brain-inspired computing systems. A general-purpose brain-inspired computing system refers to a brain-inspired computing hierarchy constructed based on the design philosophy of decoupling software and hardware, which can flexibly support various brain-inspired computing applications and neuromorphic chips with different architectures. Further, this paper introduces our recent work in these aspects, including the ANN (artificial neural network)/SNN (spiking neural network) development tools, the hardware agnostic compilation infrastructure, and the chip micro-architecture with high flexibility of programming and high performance; these studies show that the “general-purpose” system can remarkably improve the efficiency of application development and enhance the productivity of basic software, thereby being conductive to accelerating the advancement of various brain-inspired algorithms and applications. We believe that this is the key to the collaborative research and development, and the evolution of applications, basic software and chips in this field, and conducive to building a favorable software/hardware ecosystem of brain-inspired computing.</p>","PeriodicalId":50222,"journal":{"name":"Journal of Computer Science and Technology","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140582163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Motion-Inspired Real-Time Garment Synthesis with Temporal-Consistency
Pub Date: 2023-12-01 DOI: 10.1007/s11390-022-1887-1
Synthesizing garment dynamics according to body motions is a vital technique in computer graphics. Physics-based simulation depends on an accurate model of the kinetics of cloth, which is time-consuming, hard to implement, and complex to control. Existing data-driven approaches either lack temporal consistency or fail to handle garments whose topology differs from the body's. In this paper, we present a motion-inspired real-time garment synthesis workflow that enables high-level control of garment shape. Given a sequence of body motions, our workflow generates corresponding garment dynamics with both spatial and temporal coherence. To that end, we develop a transformer-based garment synthesis network to learn the mapping from body motions to garment dynamics, employing frame-level attention to capture the dependency between garments and body motions. A post-processing procedure then performs penetration removal and auto-texturing, yielding textured clothing animation that is collision-free and temporally consistent. We evaluated the proposed workflow quantitatively and qualitatively from different aspects. Extensive experiments demonstrate that our network delivers clothing dynamics that retain the wrinkles of physics-based simulation while running 1 000 times faster. Our workflow also achieves superior synthesis performance compared with alternative approaches. To stimulate further research in this direction, our code will be publicly available soon.
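To make the architecture description concrete, here is a minimal sketch of a transformer encoder mapping a pose sequence to per-frame garment vertex offsets; all dimensions, layer counts, and names are illustrative assumptions, not the paper's configuration.

```python
# Sketch: frame-level attention over a body-motion sequence, regressing
# per-frame, per-vertex garment offsets. Sizes are placeholders.
import torch
import torch.nn as nn

class MotionToGarment(nn.Module):
    def __init__(self, pose_dim=72, n_verts=4000, d_model=256):
        super().__init__()
        self.embed = nn.Linear(pose_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, n_verts * 3)

    def forward(self, poses):                  # (batch, frames, pose_dim)
        h = self.encoder(self.embed(poses))    # frame-level attention
        offsets = self.head(h)                 # (batch, frames, n_verts*3)
        return offsets.view(*poses.shape[:2], -1, 3)

model = MotionToGarment()
motion = torch.randn(1, 30, 72)                # 30 frames of body poses
print(model(motion).shape)                     # torch.Size([1, 30, 4000, 3])
```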
{"title":"Motion-Inspired Real-Time Garment Synthesis with Temporal-Consistency","authors":"","doi":"10.1007/s11390-022-1887-1","DOIUrl":"https://doi.org/10.1007/s11390-022-1887-1","url":null,"abstract":"<h3>Abstract</h3> <p>Synthesizing garment dynamics according to body motions is a vital technique in computer graphics. Physics-based simulation depends on an accurate model of the law of kinetics of cloth, which is time-consuming, hard to implement, and complex to control. Existing data-driven approaches either lack temporal consistency, or fail to handle garments that are different from body topology. In this paper, we present a motion-inspired real-time garment synthesis workflow that enables high-level control of garment shape. Given a sequence of body motions, our workflow is able to generate corresponding garment dynamics with both spatial and temporal coherence. To that end, we develop a transformerbased garment synthesis network to learn the mapping from body motions to garment dynamics. Frame-level attention is employed to capture the dependency of garments and body motions. Moreover, a post-processing procedure is further taken to perform penetration removal and auto-texturing. Then, textured clothing animation that is collision-free and temporally-consistent is generated. We quantitatively and qualitatively evaluated our proposed workflow from different aspects. Extensive experiments demonstrate that our network is able to deliver clothing dynamics which retain the wrinkles from the physics-based simulation, while running 1 000 times faster. Besides, our workflow achieved superior synthesis performance compared with alternative approaches. To stimulate further research in this direction, our code will be publicly available soon.</p>","PeriodicalId":50222,"journal":{"name":"Journal of Computer Science and Technology","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139656281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Automatic Target Description File Generation
Pub Date: 2023-12-01 DOI: 10.1007/s11390-022-1919-x
Agile hardware design is gaining momentum and bringing new chips to market faster and in larger quantities. However, it also poses new challenges for compiler developers, who must retarget existing compilers to these new chips in less time than ever before. Currently, retargeting a compiler backend, e.g., an LLVM backend, to a new target requires compiler developers to manually write a set of target description files (totalling 10 300+ lines of code (LOC) for RISC-V in LLVM), which is error-prone and time-consuming. In this paper, we introduce a new approach, Automatic Target Description File Generation (ATG), which accelerates the generation of a compiler backend for a new target by generating its target description files automatically. Given a new target, ATG proceeds in two stages. First, ATG synthesizes a small list of target-specific properties and a list of code-layout templates from the target description files of existing targets with similar instruction set architectures (ISAs). Second, ATG asks compiler developers to fill in the information for each instruction of the new target in tabular form, according to the synthesized list of target-specific properties, and then generates the target description files automatically according to the synthesized code-layout templates. The first stage can often be reused across new targets that share similar ISAs. We evaluate ATG using nine RISC-V instruction sets drawn from a total of 1 029 instructions in LLVM 12.0. ATG enables compiler developers to generate compiler backends for these ISAs that emit the same assembly code as the existing RISC-V backends, but with significantly less development effort (each instruction is specified in terms of at most 61 target-specific properties).
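A minimal sketch of the second stage as described: developers fill in per-instruction properties in tabular form, and the tool renders description records from code-layout templates. The record syntax below is a simplified invention for illustration, not LLVM TableGen or ATG's actual output format.

```python
# Sketch: tabular instruction properties rendered through templates.
# Field names, encodings, and the record syntax are all assumptions.
instructions = [
    {"name": "ADD", "fmt": "R", "opcode": "0110011",
     "funct3": "000", "funct7": "0000000"},
    {"name": "ADDI", "fmt": "I", "opcode": "0010011", "funct3": "000"},
]

templates = {   # one code-layout template per instruction format
    "R": 'def {name} : RType<"{name}", {opcode}, {funct3}, {funct7}>;',
    "I": 'def {name} : IType<"{name}", {opcode}, {funct3}>;',
}

def generate(instrs):
    return "\n".join(templates[i["fmt"]].format(**i) for i in instrs)

print(generate(instructions))
```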
{"title":"Automatic Target Description File Generation","authors":"","doi":"10.1007/s11390-022-1919-x","DOIUrl":"https://doi.org/10.1007/s11390-022-1919-x","url":null,"abstract":"<h3>Abstract</h3> <p>Agile hardware design is gaining increasing momentum and bringing new chips in larger quantities to the market faster. However, it also takes new challenges for compiler developers to retarget existing compilers to these new chips in shorter time than ever before. Currently, retargeting a compiler backend, e.g., an LLVM backend to a new target, requires compiler developers to write manually a set of target description files (totalling 10 300+ lines of code (LOC) for RISC-V in LLVM), which is error-prone and time-consuming. In this paper, we introduce a new approach, Automatic Target Description File Generation (ATG), which accelerates the generation of a compiler backend for a new target by generating its target description files automatically. Given a new target, ATG proceeds in two stages. First, ATG synthesizes a small list of target-specific properties and a list of code-layout templates from the target description files of a set of existing targets with similar instruction set architectures (ISAs). Second, ATG requests compiler developers to fill in the information for each instruction in the new target in tabular form according to the list of target-specific properties synthesized and then generates its target description files automatically according to the list of code-layout templates synthesized. The first stage can often be reused by different new targets sharing similar ISAs. We evaluate ATG using nine RISC-V instruction sets drawn from a total of 1 029 instructions in LLVM 12.0. ATG enables compiler developers to generate compiler backends for these ISAs that emit the same assembly code as the existing compiler backends for RISC-V but with significantly less development effort (by specifying each instruction in terms of up to 61 target-specific properties only).</p>","PeriodicalId":50222,"journal":{"name":"Journal of Computer Science and Technology","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139656170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Hadamard Encoding Based Frequent Itemset Mining under Local Differential Privacy
Pub Date: 2023-12-01 DOI: 10.1007/s11390-023-1346-7
Local differential privacy (LDP) approaches to collecting sensitive information for frequent itemset mining (FIM) can reliably guarantee privacy. Because each user transaction represents a set of items, most current approaches to FIM under LDP add “padding and sampling” steps to obtain frequent itemsets and their frequencies. The current state-of-the-art approach, set-value itemset mining (SVSM), must balance variance and bias to achieve accurate results; an unbiased FIM approach with lower variance is therefore highly desirable. To narrow this gap, we propose an item-level LDP frequency oracle, named the Integrated-with-Hadamard-Transform-Based Frequency Oracle (IHFO). For the first time, Hadamard encoding is applied to set values: all items are encoded into a fixed vector, to which perturbation can subsequently be applied. We further propose an FIM approach, optimized united itemset mining (O-UISM), which combines the padding-and-sampling-based frequency oracle (PSFO) and IHFO in a single framework for acquiring accurate frequent itemsets and their frequencies. Finally, we demonstrate both theoretically and experimentally that O-UISM significantly outperforms extant approaches in finding frequent itemsets and estimating their frequencies under the same privacy guarantee.
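To illustrate the flavor of a Hadamard-based LDP frequency oracle, here is a sketch of generic Hadamard randomized response: each user reports one randomly chosen, randomized Hadamard coefficient of their item, and the server debiases and inverts the transform. This is an assumption-laden toy, not necessarily IHFO's exact mechanism.

```python
# Minimal Hadamard randomized-response frequency oracle (illustrative).
import numpy as np

def H(j, v):
    # entry (j, v) of the 2^m x 2^m Hadamard matrix: (-1)^popcount(j & v)
    return 1 - 2 * (bin(j & v).count("1") & 1)

def perturb(v, m, eps, rng):
    j = int(rng.integers(2 ** m))            # sampled coefficient index
    p = np.exp(eps) / (np.exp(eps) + 1)      # keep-probability
    bit = H(j, v) if rng.random() < p else -H(j, v)
    return j, bit

def estimate(reports, m, eps):
    D = 2 ** m
    p = np.exp(eps) / (np.exp(eps) + 1)
    sums, counts = np.zeros(D), np.zeros(D)
    for j, bit in reports:
        sums[j] += bit
        counts[j] += 1
    coef = np.divide(sums, counts, out=np.zeros(D), where=counts > 0)
    coef /= 2 * p - 1                        # debias randomized response
    Hm = np.array([[H(r, c) for c in range(D)] for r in range(D)])
    return Hm @ coef / D                     # invert the Hadamard transform

rng = np.random.default_rng(0)
m, eps, n = 4, 2.0, 200_000
true_items = rng.choice(2 ** m, size=n, p=np.r_[0.5, np.full(15, 0.5 / 15)])
reports = [perturb(int(v), m, eps, rng) for v in true_items]
print(np.round(estimate(reports, m, eps), 3))  # item 0 should be near 0.5
```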
{"title":"Hadamard Encoding Based Frequent Itemset Mining under Local Differential Privacy","authors":"","doi":"10.1007/s11390-023-1346-7","DOIUrl":"https://doi.org/10.1007/s11390-023-1346-7","url":null,"abstract":"<h3>Abstract</h3> <p>Local differential privacy (LDP) approaches to collecting sensitive information for frequent itemset mining (FIM) can reliably guarantee privacy. Most current approaches to FIM under LDP add “padding and sampling” steps to obtain frequent itemsets and their frequencies because each user transaction represents a set of items. The current state-of-the-art approach, namely set-value itemset mining (SVSM), must balance variance and bias to achieve accurate results. Thus, an unbiased FIM approach with lower variance is highly promising. To narrow this gap, we propose an Item-Level LDP frequency oracle approach, named the Integrated-with-Hadamard-Transform-Based Frequency Oracle (IHFO). For the first time, Hadamard encoding is introduced to a set of values to encode all items into a fixed vector, and perturbation can be subsequently applied to the vector. An FIM approach, called optimized united itemset mining (O-UISM), is proposed to combine the padding-and-sampling-based frequency oracle (PSFO) and the IHFO into a framework for acquiring accurate frequent itemsets with their frequencies. Finally, we theoretically and experimentally demonstrate that O-UISM significantly outperforms the extant approaches in finding frequent itemsets and estimating their frequencies under the same privacy guarantee.</p>","PeriodicalId":50222,"journal":{"name":"Journal of Computer Science and Technology","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139659401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

2k-Vertex Kernels for Cluster Deletion and Strong Triadic Closure
Pub Date: 2023-11-30 DOI: 10.1007/s11390-023-1420-1
Wen-Yu Gao, Hang Gao
Cluster deletion and strong triadic closure are two important NP-complete problems that have received significant attention due to their applications in various areas, including social networks and data analysis. Although cluster deletion and strong triadic closure are closely linked by induced paths on three vertices, there are subtle differences between them, and in some cases their solutions differ considerably. In this paper, we study parameterized algorithms for these two problems, focusing on their kernels. Instead of separating the critical clique and its neighbors for analysis, we consider them as a whole, which allows us to bound the number of related vertices more effectively. In addition, in analyzing the kernel of strong triadic closure, we introduce the concept of edge-disjoint induced paths on three vertices, which enables us to obtain a lower bound on the number of weak edges in a more concise way. Our analysis demonstrates that cluster deletion and strong triadic closure both admit 2k-vertex kernels. These results improve on the previously best-known kernels for both problems. Furthermore, our analysis provides additional insights into the relationship between cluster deletion and strong triadic closure.
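The edge-disjoint lower-bound idea can be made concrete with a small sketch: greedily collect induced paths on three vertices (P3s) that share no edge. Since any cluster-deletion or STC solution must pay at least once per induced P3, and no two collected P3s share an edge, the number collected lower-bounds the solution size. This toy version is illustrative only.

```python
# Greedily collect edge-disjoint induced P3s (a-v-b with a,b non-adjacent).
import networkx as nx
from itertools import combinations

def greedy_edge_disjoint_p3(G):
    used, p3s = set(), []
    for v in G:
        for a, b in combinations(G[v], 2):
            if G.has_edge(a, b):
                continue                     # a-v-b is not an induced path
            e1, e2 = frozenset((a, v)), frozenset((v, b))
            if used & {e1, e2}:
                continue                     # one of its edges is spent
            used |= {e1, e2}
            p3s.append((a, v, b))
    return p3s

G = nx.path_graph(6)                         # 0-1-2-3-4-5
p3s = greedy_edge_disjoint_p3(G)
print(len(p3s), "edge-disjoint induced P3s:", p3s)   # 2: (0,1,2), (2,3,4)
```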
{"title":"2k-Vertex Kernels for Cluster Deletion and Strong Triadic Closure","authors":"Wen-Yu Gao, Hang Gao","doi":"10.1007/s11390-023-1420-1","DOIUrl":"https://doi.org/10.1007/s11390-023-1420-1","url":null,"abstract":"<p>Cluster deletion and strong triadic closure are two important NP-complete problems that have received significant attention due to their applications in various areas, including social networks and data analysis. Although cluster deletion and strong triadic closure are closely linked by induced paths on three vertices, there are subtle differences between them. In some cases, the solutions of strong triadic closure and cluster deletion are quite different. In this paper, we study the parameterized algorithms for these two problems. More specifically, we focus on the kernels of these two problems. Instead of separating the critical clique and its neighbors for analysis, we consider them as a whole, which allows us to more effectively bound the number of related vertices. In addition, in analyzing the kernel of strong triadic closure, we introduce the concept of edge-disjoint induced path on three vertices, which enables us to obtain the lower bound of weak edge number in a more concise way. Our analysis demonstrates that cluster deletion and strong triadic closure both admit 2<i>k</i>-vertex kernels. These results represent improvements over previously best-known kernels for both problems. Furthermore, our analysis provides additional insights into the relationship between cluster deletion and strong triadic closure.</p>","PeriodicalId":50222,"journal":{"name":"Journal of Computer Science and Technology","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139656204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Composing Like an Ancient Chinese Poet: Learn to Generate Rhythmic Chinese Poetry
Pub Date: 2023-11-30 DOI: 10.1007/s11390-023-1295-1
Ming He, Yan Chen, Hong-Ke Zhao, Qi Liu, Le Wu, Yu Cui, Gui-Hua Zeng, Gui-Quan Liu
Automatic generation of Chinese classical poetry remains a challenging problem in artificial intelligence. Recently, encoder-decoder models have provided a few viable methods for poetry generation. However, reviewing prior methods reveals two major issues that still need to be settled: 1) most of them are one-stage generation methods without further polishing; 2) they rarely take into account the restrictions of poetry, such as tone and rhyme. Intuitively, some ancient Chinese poets tended to first write a coarse poem satisfying the aesthetic constraints and then deliberate over its semantics, while others first created a semantically complete poem and then refined its aesthetics. On this basis, to better imitate how humans create poems, we propose a two-stage (restricted polishing) generation method in which each stage focuses on a different aspect of the poem (semantics or aesthetics), producing generated poems of higher quality. The two-stage method accordingly develops into two symmetrical generation procedures: aesthetics-to-semantics and semantics-to-aesthetics. In particular, we design a sampling method and a gate to formulate the tone and rhyme restrictions, which further improve the rhythm of the generated poems. Experimental results demonstrate the superiority of our two-stage method over baselines in both automatic and human evaluation metrics, especially in yielding consistent improvements in tone and rhyme.
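As a toy illustration of the rhyme "gate" idea, the following sketch masks a sampling distribution at a line-final position so that only tokens in the required rhyme group survive. The six-character vocabulary, rhyme table, and scores are invented assumptions, not the paper's model.

```python
# Sketch: constrained sampling with a rhyme gate at line-final positions.
import numpy as np

vocab = ["山", "川", "月", "雪", "天", "年"]
rhyme_group = {"山": "an", "川": "an", "月": "ue", "雪": "ue",
               "天": "an", "年": "an"}

def gated_sample(logits, required_rhyme=None, rng=None):
    rng = rng or np.random.default_rng()
    mask = np.zeros_like(logits)
    if required_rhyme is not None:           # line-final position: gate on
        mask = np.array([0.0 if rhyme_group[w] == required_rhyme
                         else -np.inf for w in vocab])
    p = np.exp(logits + mask)                # -inf mask zeroes out tokens
    p /= p.sum()
    return vocab[rng.choice(len(vocab), p=p)]

logits = np.array([1.0, 0.5, 2.0, 1.5, 0.2, 0.1])  # fake decoder scores
print(gated_sample(logits))                        # unconstrained token
print(gated_sample(logits, required_rhyme="an"))   # must rhyme with "an"
```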
{"title":"Composing Like an Ancient Chinese Poet: Learn to Generate Rhythmic Chinese Poetry","authors":"Ming He, Yan Chen, Hong-Ke Zhao, Qi Liu, Le Wu, Yu Cui, Gui-Hua Zeng, Gui-Quan Liu","doi":"10.1007/s11390-023-1295-1","DOIUrl":"https://doi.org/10.1007/s11390-023-1295-1","url":null,"abstract":"<p>Automatic generation of Chinese classical poetry is still a challenging problem in artificial intelligence. Recently, Encoder-Decoder models have provided a few viable methods for poetry generation. However, by reviewing the prior methods, two major issues still need to be settled: 1) most of them are one-stage generation methods without further polishing; 2) they rarely take into consideration the restrictions of poetry, such as tone and rhyme. Intuitively, some ancient Chinese poets tended first to write a coarse poem underlying aesthetics and then deliberated its semantics; while others first create a semantic poem and then refine its aesthetics. On this basis, in order to better imitate the human creation procedure of poems, we propose a two-stage method (i.e., restricted polishing generation method) of which each stage focuses on the different aspects of poems (i.e., semantics and aesthetics), which can produce a higher quality of generated poems. In this way, the two-stage method develops into two symmetrical generation methods, the aesthetics-to-semantics method and the semantics-to-aesthetics method. In particular, we design a sampling method and a gate to formulate the tone and rhyme restrictions, which can further improve the rhythm of the generated poems. Experimental results demonstrate the superiority of our proposed two-stage method in both automatic evaluation metrics and human evaluation metrics compared with baselines, especially in yielding consistent improvements in tone and rhyme.</p>","PeriodicalId":50222,"journal":{"name":"Journal of Computer Science and Technology","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139657219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}