Frontiers of Computer Science: latest articles, page 5

A program logic for obstruction-freedom

IF 4.2; Zone 3, Computer Science; Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Frontiers of Computer Science

Pub Date : 2023-12-28 DOI: 10.1007/s11704-023-2774-9

Zhao-Hui Li, Xin-Yu Feng

Though the obstruction-freedom progress property is weaker than other non-blocking properties, including lock-freedom and wait-freedom, it has advantages that have led to the use of obstruction-free implementations for software transactional memory (STM) and in anonymous and fault-tolerant distributed computing. However, existing work can verify obstruction-freedom only for specific data structures (e.g., STM and list-based algorithms).

In this paper, to fill this gap, we propose a program logic that can formally verify obstruction-freedom of practical implementations and, at the same time, verify linearizability, a safety property. We also propose informal principles for extending a logic for verifying linearizability to one that verifies obstruction-freedom. With this approach, an existing proof of linearizability can be reused directly to construct the proof of both linearizability and obstruction-freedom. Finally, we have successfully applied our logic to verify a practical obstruction-free double-ended queue implementation from the first classic paper to propose the definition of obstruction-freedom.


Citations: 0

IP2vec: an IP node representation model for IP geolocation

IF 4.2; Zone 3, Computer Science; Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Frontiers of Computer Science

Pub Date : 2023-12-28 DOI: 10.1007/s11704-023-2616-9

Fan Zhang, Meijuan Yin, Fenlin Liu, Xiangyang Luo, Shuodi Zu

IP geolocation is essential for the territorial analysis of sensitive network entities, location-based services (LBS), and network fraud detection, and it has both theoretical significance and application value. Measurement-based IP geolocation is an active research topic. However, existing IP geolocation algorithms cannot effectively utilize the distance characteristics of delay or the connection relations between nodes, which results in high geolocation error. Obtaining the mapping between delay, node connection relations, and geographical location is challenging. Based on the idea of network representation learning, we propose a representation learning model for IP nodes (IP2vec for short) and apply it to street-level IP geolocation. The IP2vec model vectorizes nodes according to the connection relations and delay between them, so that the IP vectors reflect the distance and topological proximity between IP nodes. The street-level IP geolocation algorithm based on the IP2vec model proceeds as follows. First, we measure the landmarks and the target IP to obtain delay and path information and construct the network topology. Second, we use the IP2vec model to obtain the IP vectors from the network topology. Third, we train a neural network to fit the mapping between the vectors and the locations of the landmarks. Finally, the vector of the target IP is fed into the neural network to obtain its geographical location. The algorithm can accurately infer the geographical locations of target IPs based on the delay and topological proximity embedded in the IP vectors. Cross-validation experiments on 10023 target IPs in New York, Beijing, Hong Kong, and Zhengzhou demonstrate that the proposed algorithm can achieve street-level geolocation. Compared with existing algorithms such as Hop-Hot, IP-geolocater and SLG, the proposed algorithm reduces the mean geolocation error by 33%, 39%, and 51%, respectively.



Citations: 0

Model gradient: unified model and policy learning in model-based reinforcement learning

IF 4.2; Zone 3, Computer Science; Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Frontiers of Computer Science

Pub Date : 2023-12-27 DOI: 10.1007/s11704-023-3150-5

Abstract

Model-based reinforcement learning is a promising direction for improving the sample efficiency of reinforcement learning by learning a model of the environment. Previous model learning methods aim at fitting the transition data and commonly employ a supervised learning approach to minimize the distance between the predicted state and the real state. These supervised model learning methods, however, diverge from the ultimate goal of model learning, i.e., optimizing the policy learned in the model. In this work, we investigate how model learning and policy learning can share the same objective of maximizing the expected return in the real environment. We find that model learning towards this objective leads to a target of enhancing the similarity between the gradient on generated data and the gradient on real data. We thus derive the gradient of the model from this target and propose the Model Gradient algorithm (MG), which integrates this novel model learning approach with policy-gradient-based policy optimization. We conduct experiments on multiple locomotion control tasks and find that MG not only achieves high sample efficiency but also converges to better performance than traditional model-based reinforcement learning approaches.



Citations: 0

ARCHER: a ReRAM-based accelerator for compressed recommendation systems

IF 4.2; Zone 3, Computer Science; Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Frontiers of Computer Science

Pub Date : 2023-12-23 DOI: 10.1007/s11704-023-3397-x

Xinyang Shen, Xiaofei Liao, Long Zheng, Yu Huang, Dan Chen, Hai Jin

Recommendation systems are widely used in modern data centers. Random and sparse embedding lookup operations are the main performance bottleneck when processing recommendation systems on traditional platforms, because they induce heavy data movement between computing units and memory. ReRAM-based processing-in-memory (PIM) can resolve this problem by processing embedding vectors where they are stored. However, the embedding table can easily exceed the capacity of a monolithic ReRAM-based PIM chip, which induces off-chip accesses that may offset the benefits of PIM. Therefore, we deploy the decomposed model on-chip and leverage the high computing efficiency of ReRAM to compensate for the performance loss of decompression. In this paper, we propose ARCHER, a ReRAM-based PIM architecture that implements fully on-chip recommendation under resource constraints. First, we fully analyze the computation pattern and the access pattern on the decomposed table. Based on the computation pattern, we unify the operations of each layer of the decomposed model into multiply-and-accumulate operations. Based on the observed access pattern, we propose a hierarchical mapping schema and a specialized hardware design to maximize resource utilization. Under the unified computation and mapping strategy, we can coordinate the pipeline across processing elements. The evaluation shows that ARCHER outperforms the state-of-the-art GPU-based DLRM system, the state-of-the-art near-memory processing recommendation system RecNMP, and the ReRAM-based recommendation accelerator REREC by 15.79×, 2.21×, and 1.21× in terms of performance and 56.06×, 6.45×, and 1.71× in terms of energy savings, respectively.
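To illustrate how a lookup on a decomposed embedding table reduces to multiply-and-accumulate operations, here is a minimal sketch. It assumes a simple two-factor low-rank decomposition (table ≈ A · B); the abstract does not state which decomposition ARCHER uses, and the names `mac_lookup`, `factor_a`, and `factor_b` are hypothetical.

```python
def mac_lookup(index, factor_a, factor_b):
    """Rebuild one embedding row as sum_k A[index][k] * B[k][:].

    Only multiply-and-accumulate steps are used, which is the kind of
    operation an in-memory crossbar can evaluate in place.
    """
    row = [0.0] * len(factor_b[0])
    for k, weight in enumerate(factor_a[index]):
        for j, value in enumerate(factor_b[k]):
            row[j] += weight * value  # one multiply-and-accumulate
    return row


# Toy factors (assumed values): a 6 x 4 table stored as 6 x 2 and 2 x 4.
factor_a = [[1.0, 0.0], [0.5, 2.0], [0.0, 1.0],
            [2.0, 0.0], [1.0, 1.0], [0.0, 0.5]]
factor_b = [[1.0, 2.0, 3.0, 4.0],
            [0.0, 1.0, 0.0, 1.0]]

stored = len(factor_a) * len(factor_a[0]) + len(factor_b) * len(factor_b[0])
full = len(factor_a) * len(factor_b[0])
print(mac_lookup(1, factor_a, factor_b), stored, full)
```

In this toy case the factors hold 20 values instead of the 24 of the full table. The saving comes at the cost of extra arithmetic per lookup, which is the decompression overhead the abstract says ReRAM's computing efficiency is meant to absorb.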



Citations: 0

Provable secure authentication key agreement for wireless body area networks

IF 4.2; Zone 3, Computer Science; Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Frontiers of Computer Science

Pub Date : 2023-12-23 DOI: 10.1007/s11704-023-2548-4

Yuqian Ma, Wenbo Shi, Xinghua Li, Qingfeng Cheng

Wireless body area networks (WBANs) guarantee timely data processing and secure information preservation within the range of the wireless access network, which is in urgent need of a new type of security technology. However, with the rapid development of hardware, existing security schemes can no longer meet the new requirements for anonymity and lightweight operation. New solutions that do not require complex calculations, such as certificateless cryptography, have attracted great attention from researchers. To resolve these difficulties, Wang et al. designed a new authentication architecture for the WBAN environment, which was claimed to be secure and efficient. In this paper, however, we show that this scheme is prone to ephemeral key leakage attacks. Further, based on this authentication scheme, we propose an anonymous certificateless scheme for lightweight devices in which user anonymity is fully protected. The proposed scheme is proved secure under a specific security model. In addition, we assess the security attributes our scheme meets using BAN logic and the Scyther tool. Comparisons of time consumption and communication cost are given at the end of the paper to demonstrate that our scheme outperforms several previous schemes.



Citations: 0

BVDFed: Byzantine-resilient and verifiable aggregation for differentially private federated learning

IF 4.2; Zone 3, Computer Science; Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Frontiers of Computer Science

Pub Date : 2023-12-23 DOI: 10.1007/s11704-023-3142-5

Abstract

Federated Learning (FL) has emerged as a powerful technology for collaborative training between multiple clients and a server while maintaining the data privacy of clients. To enhance privacy in FL, Differentially Private Federated Learning (DPFL) has gradually become one of the most effective approaches. Because DPFL operates in distributed settings, potential malicious adversaries may manipulate some clients and the aggregation server to produce malicious parameters and disturb the learning model. However, existing aggregation protocols for DPFL address either the existence of some corrupted clients (Byzantines) or a corrupted server. Owing to the complicated threat model, such protocols are limited in their ability to eliminate the effects of corrupted clients and a corrupted server when both exist simultaneously. In this paper, we elaborate this adversarial threat model and propose BVDFed. To the best of our knowledge, it is the first Byzantine-resilient and Verifiable aggregation for Differentially private FEDerated learning. Specifically, we propose the Differentially Private Federated Averaging algorithm (DPFA) as the primary workflow of BVDFed, which is more lightweight and more easily portable than traditional DPFL algorithms. We then introduce the Loss Score to indicate the trustworthiness of disguised gradients in DPFL. Based on the Loss Score, we propose an aggregation rule, DPLoss, to eliminate faulty gradients from Byzantine clients during server aggregation while preserving the privacy of clients’ data. Additionally, we design a secure verification scheme, DPVeri, that is compatible with DPFA and DPLoss and supports honest clients in verifying the integrity of received aggregated results. DPVeri also provides resistance to collusion attacks involving no more than t participants. Theoretical analysis and experimental results demonstrate that our aggregation is feasible and effective in practice.



Citations: 0

Label distribution similarity-based noise correction for crowdsourcing

IF 4.2; Zone 3, Computer Science; Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Frontiers of Computer Science

Pub Date : 2023-12-23 DOI: 10.1007/s11704-023-2751-3

Lijuan Ren, Liangxiao Jiang, Wenjun Zhang, Chaoqun Li

Abstract

In crowdsourcing scenarios, we can obtain each instance’s multiple noisy labels from different crowd workers and then infer its integrated label via label aggregation. In spite of the effectiveness of label aggregation methods, there still remains a certain level of noise in the integrated labels. Thus, some noise correction methods have been proposed to reduce the impact of noise in recent years. However, to the best of our knowledge, existing methods rarely consider an instance’s information from both its features and multiple noisy labels simultaneously when identifying a noise instance. In this study, we argue that the more distinguishable an instance’s features but the noisier its multiple noisy labels, the more likely it is a noise instance. Based on this premise, we propose a label distribution similarity-based noise correction (LDSNC) method. To measure whether an instance’s features are distinguishable, we obtain each instance’s predicted label distribution by building multiple classifiers using instances’ features and their integrated labels. To measure whether an instance’s multiple noisy labels are noisy, we obtain each instance’s multiple noisy label distribution using its multiple noisy labels. Then, we use the Kullback-Leibler (KL) divergence to calculate the similarity between the predicted label distribution and multiple noisy label distribution and define the instance with the lower similarity as a noise instance. The extensive experimental results on 34 simulated and four real-world crowdsourced datasets validate the effectiveness of our method.
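The comparison step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Laplace smoothing, the direction of the divergence (predicted distribution first), and the names `label_distribution`, `kl_divergence`, and `predicted` are all assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter


def label_distribution(noisy_labels, classes, smoothing=1.0):
    """Laplace-smoothed empirical distribution of one instance's crowd labels."""
    counts = Counter(noisy_labels)
    total = len(noisy_labels) + smoothing * len(classes)
    return [(counts[c] + smoothing) / total for c in classes]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


classes = ["cat", "dog"]
predicted = [0.9, 0.1]  # hypothetical classifier output for one instance

# Crowd labels that mostly agree with the prediction versus ones that conflict.
agreeing = label_distribution(["cat", "cat", "cat", "dog"], classes)
conflicting = label_distribution(["dog", "dog", "cat", "dog"], classes)

print(kl_divergence(predicted, agreeing))
print(kl_divergence(predicted, conflicting))
```

A larger divergence means a lower similarity, so under these assumptions the instance with the conflicting crowd labels would be the one flagged as a candidate noise instance.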



Citations: 0

Delegable zk-SNARKs with proxies

IF 4.2; Zone 3, Computer Science; Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Frontiers of Computer Science

Pub Date : 2023-12-23 DOI: 10.1007/s11704-023-2782-9

Abstract

In this paper, we propose the concept of delegable zero-knowledge succinct non-interactive arguments of knowledge (zk-SNARKs). A delegable zk-SNARK is parameterized by (μ, k, k′, k″). The delegable property allows the prover to delegate its proving ability to μ proxies. Any k honest proxies are able to generate a correct proof for a statement, but a collusion of fewer than k proxies obtains no information about the witness of the statement. We also define k′-soundness and k″-zero-knowledge by taking multiple proxies into account.

We propose a construction of a (μ, 2t + 1, t, t)-delegable zk-SNARK for the NP-complete language of arithmetic circuit satisfiability. Our delegable zk-SNARK stems from Groth’s zk-SNARK scheme (Groth16). We take advantage of the additive and multiplicative properties of polynomial-based secret sharing schemes to achieve delegation for zk-SNARKs. Our secret sharing scheme works well with pairing groups, so the succinctness of Groth’s zk-SNARK scheme is preserved while the delegable property is added and soundness and zero-knowledge are maintained in the multi-proxy scenario.
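The additive and multiplicative properties of polynomial-based secret sharing mentioned above can be illustrated with a small Shamir-style sketch. It is not the paper's construction: the field modulus `P`, the helper names `share` and `reconstruct`, and the toy parameters are assumptions, and no pairing groups are involved.

```python
import random

P = 2**61 - 1  # a Mersenne prime used as a toy field modulus (assumption)


def share(secret, t, n, rng):
    """Split `secret` into n points of a random degree-t polynomial f with f(0) = secret."""
    coeffs = [secret % P] + [rng.randrange(P) for _ in range(t)]
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P)
            for x in range(1, n + 1)]


def reconstruct(points):
    """Lagrange-interpolate the polynomial through `points` and evaluate it at 0."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P  # needs Python 3.8+
    return total


rng = random.Random(0)
t, n = 2, 7
a, b = 1234, 5678
sa, sb = share(a, t, n, rng), share(b, t, n, rng)

# Additive property: adding shares pointwise keeps the polynomial degree at t.
sum_shares = [(x, (ya + yb) % P) for (x, ya), (_, yb) in zip(sa, sb)]
# Multiplicative property: multiplying shares pointwise raises the degree to 2t.
prod_shares = [(x, ya * yb % P) for (x, ya), (_, yb) in zip(sa, sb)]

print(reconstruct(sum_shares[:t + 1]) == (a + b) % P)
print(reconstruct(prod_shares[:2 * t + 1]) == a * b % P)
```

In this sketch a sum is recovered from t + 1 shares, while a product needs 2t + 1 shares because the product polynomial has degree 2t. This is consistent with the threshold 2t + 1 in the (μ, 2t + 1, t, t) parameters, although the abstract does not spell out that reasoning.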



Citations: 0

Contactless interaction recognition and interactor detection in multi-person scenes

IF 4.2 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Frontiers of Computer Science

Pub Date : 2023-12-23 DOI: 10.1007/s11704-023-2418-0

Jiacheng Li, Ruize Han, Wei Feng, Haomin Yan, Song Wang

Human interaction recognition is an essential task in video surveillance. Current work on human interaction recognition mainly focuses on scenarios that contain only the close-contact interacting subjects and no other people. In this paper, we handle more practical but more challenging scenarios in which the interacting subjects are contactless and other subjects, not involved in the interactions of interest, are also present in the scene. To address this problem, we propose an Interactive Relation Embedding Network (IRE-Net) that simultaneously identifies the subjects involved in the interaction and recognizes their interaction category. Because this is a new problem, we also build a new dataset with annotations and metrics for performance evaluation. Experimental results on this dataset show that the proposed method significantly outperforms current methods developed for human interaction recognition and group activity recognition.


Citations: 0

Empirically revisiting and enhancing automatic classification of bug and non-bug issues

IF 4.2 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

Frontiers of Computer Science

Pub Date : 2023-12-23 DOI: 10.1007/s11704-023-2771-z

Zhong Li, Minxue Pan, Yu Pei, Tian Zhang, Linzhang Wang, Xuandong Li

A large body of research has been dedicated to automated issue classification for Issue Tracking Systems (ITSs). Although existing approaches have shown promising performance, their design choices, including the textual fields, feature representation methods, and machine learning algorithms they adopt, have not been comprehensively compared and analyzed. To fill this gap, we perform the first extensive study of automated issue classification, covering 9 state-of-the-art issue classification approaches. Our experimental results on a widely studied dataset reveal multiple practical guidelines for automated issue classification, including: (1) Training separate models for the issue titles and descriptions and then combining these two models tends to achieve better issue classification performance; (2) Word embedding with Long Short-Term Memory (LSTM) can better extract features from the textual fields in the issues and hence leads to better issue classification models; (3) Certain terms in the textual fields help build more discriminating classifiers between bug and non-bug issues; (4) The performance of the issue classification model is not sensitive to the choice of ML algorithm. Based on our study outcomes, we further propose an advanced issue classification approach, DeepLabel, which achieves better performance than the existing issue classification approaches.
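Guideline (1) can be pictured as late fusion of two per-field classifiers. The abstract does not say how the two models are combined, so the weighted average below is only an assumption, and the names `fuse`, `classify`, `w_title`, and `threshold` are hypothetical. The two inputs stand for the bug probabilities produced by a title-only model and a description-only model.

```python
def fuse(p_title, p_desc, w_title=0.5):
    """Weighted average of two per-field bug probabilities (assumed fusion rule)."""
    if not 0.0 <= w_title <= 1.0:
        raise ValueError("w_title must lie in [0, 1]")
    return w_title * p_title + (1.0 - w_title) * p_desc


def classify(p_title, p_desc, threshold=0.5, w_title=0.5):
    """Label an issue from the fused probability of its two text fields."""
    fused = fuse(p_title, p_desc, w_title)
    return "bug" if fused >= threshold else "non-bug"


# A title that strongly suggests a bug can outweigh a vaguer description.
label_a = classify(p_title=0.9, p_desc=0.3)
# Two weak signals stay below the decision threshold.
label_b = classify(p_title=0.2, p_desc=0.4)
```

Keeping the two models separate lets each one specialize on the vocabulary and length of its own field; the fusion weight would normally be tuned on held-out data.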



Citations: 0