A program logic for obstruction-freedom
Pub Date: 2023-12-28 | DOI: 10.1007/s11704-023-2774-9
Zhao-Hui Li, Xin-Yu Feng
Though the obstruction-free progress property is weaker than other non-blocking properties, including lock-freedom and wait-freedom, it has advantages that have led to the use of obstruction-free implementations in software transactional memory (STM) and in anonymous and fault-tolerant distributed computing. However, existing work can only verify obstruction-freedom of specific data structures (e.g., STM and list-based algorithms).
In this paper, to fill this gap, we propose a program logic that can formally verify obstruction-freedom of practical implementations and, at the same time, verify linearizability, a safety property. We also propose informal principles for extending a logic for verifying linearizability into one that also verifies obstruction-freedom. With this approach, an existing linearizability proof can be reused directly to construct a proof of both linearizability and obstruction-freedom. Finally, we have successfully applied our logic to verify a practical obstruction-free double-ended queue implementation from the classic paper that first proposed the definition of obstruction-freedom.
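To make the progress property concrete, below is a minimal Python sketch of the read-compute-CAS retry pattern typical of obstruction-free operations (in the style of the double-ended queue paper referenced above). It illustrates the property only, not the paper's program logic, and the atomic reference simulates the hardware compare-and-swap primitive with a lock purely for demonstration.

```python
import threading

class AtomicRef:
    """Toy atomic cell: the lock only simulates a hardware compare-and-swap instruction."""
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        with self._lock:
            if self._value == expected:
                self._value = new
                return True
            return False

def obstruction_free_increment(counter: AtomicRef):
    # Read, compute, then try to install the result with CAS; retry on interference.
    # If the calling thread runs in isolation, the very first CAS succeeds and the
    # operation finishes in a bounded number of its own steps -- the obstruction-freedom
    # guarantee. Under contention an individual thread may retry indefinitely, which is
    # why obstruction-freedom gives no per-thread (wait-free) progress bound.
    while True:
        old = counter.load()
        if counter.compare_and_swap(old, old + 1):
            return old + 1

counter = AtomicRef(0)
threads = [threading.Thread(target=obstruction_free_increment, args=(counter,)) for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(counter.load())  # 4
```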
IP2vec: an IP node representation model for IP geolocation
Pub Date: 2023-12-28 | DOI: 10.1007/s11704-023-2616-9
Fan Zhang, Meijuan Yin, Fenlin Liu, Xiangyang Luo, Shuodi Zu
IP geolocation is essential for the territorial analysis of sensitive network entities, location-based services (LBS), and network fraud detection; it has important theoretical significance and application value. Measurement-based IP geolocation is an active research topic. However, existing IP geolocation algorithms cannot effectively exploit the distance characteristics of delay or the connection relations between nodes, resulting in high geolocation error, and it is challenging to obtain the mapping between delay, node connection relations, and geographical location. Based on the idea of network representation learning, we propose a representation learning model for IP nodes (IP2vec for short) and apply it to street-level IP geolocation. The IP2vec model vectorizes nodes according to the connection relations and delays between nodes, so that the IP vectors reflect the distance and topological proximity between IP nodes. The street-level IP geolocation algorithm based on the IP2vec model proceeds as follows: first, we measure landmarks and the target IP to obtain delay and path information and construct the network topology; second, we use the IP2vec model to obtain IP vectors from the network topology; third, we train a neural network to fit the mapping between the vectors and the locations of landmarks; finally, the vector of the target IP is fed into the neural network to obtain its geographical location. The algorithm can accurately infer the geographical locations of target IPs from the delay and topological proximity embedded in the IP vectors. Cross-validation experiments on 10,023 target IPs in New York, Beijing, Hong Kong, and Zhengzhou demonstrate that the proposed algorithm achieves street-level geolocation. Compared with existing algorithms such as Hop-Hot, IP-geolocater, and SLG, the mean geolocation error of the proposed algorithm is reduced by 33%, 39%, and 51%, respectively.
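As a hedged illustration of the four-step pipeline, the following Python sketch substitutes plain random walks plus word2vec for the IP2vec objective (whose exact combination of delay and connectivity is not reproduced here); the graph, coordinates, and hyperparameters are toy placeholders rather than the authors' implementation.

```python
import random
import networkx as nx
from gensim.models import Word2Vec
from sklearn.neural_network import MLPRegressor

# Step 1: network topology measured from landmarks and the target (toy delay-weighted graph)
G = nx.Graph()
G.add_weighted_edges_from([("ip_a", "ip_b", 2.0), ("ip_b", "ip_c", 5.0), ("ip_a", "ip_c", 9.0)],
                          weight="delay")

# Step 2: learn node vectors from short random walks over the topology
# (delay-aware sampling is omitted here; walks use connectivity only)
def random_walks(graph, walks_per_node=10, walk_len=8):
    walks = []
    for _ in range(walks_per_node):
        for node in graph.nodes():
            walk = [node]
            for _ in range(walk_len - 1):
                nbrs = list(graph.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(random.choice(nbrs))
            walks.append(walk)
    return walks

emb = Word2Vec(random_walks(G), vector_size=32, window=4, min_count=1, sg=1, epochs=20)

# Step 3: fit a neural network from landmark vectors to known (lat, lon) coordinates
landmark_ips = ["ip_a", "ip_b"]
landmark_coords = [[40.71, -74.00], [40.72, -73.99]]   # toy coordinates
regressor = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
regressor.fit([emb.wv[ip] for ip in landmark_ips], landmark_coords)

# Step 4: geolocate the target IP from its vector
print(regressor.predict([emb.wv["ip_c"]]))
```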
Model gradient: unified model and policy learning in model-based reinforcement learning
Pub Date: 2023-12-27 | DOI: 10.1007/s11704-023-3150-5
Abstract
Model-based reinforcement learning is a promising direction for improving the sample efficiency of reinforcement learning by learning a model of the environment. Previous model learning methods aim at fitting the transition data and commonly employ a supervised learning approach that minimizes the distance between the predicted state and the real state. Such supervised model learning, however, diverges from the ultimate goal of model learning, i.e., optimizing the policy learned in the model. In this work, we investigate how model learning and policy learning can share the same objective of maximizing the expected return in the real environment. We find that model learning towards this objective leads to a target of enhancing the similarity between the gradient on generated data and the gradient on the real data. We thus derive the gradient of the model from this target and propose the Model Gradient algorithm (MG), which integrates this novel model learning approach with policy-gradient-based policy optimization. We conduct experiments on multiple locomotion control tasks and find that MG not only achieves high sample efficiency but also leads to better convergence performance than traditional model-based reinforcement learning approaches.
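The following PyTorch sketch illustrates the gradient-similarity idea only: it computes a REINFORCE-style policy gradient on a real batch and on a batch generated by differentiable model rollouts, and trains the model to align the two gradients via cosine similarity. The linear policy and model, the per-step rewards used as returns, and all dimensions are assumed placeholders, not the paper's MG algorithm.

```python
import torch

obs_dim, act_dim, horizon = 3, 2, 5
policy = torch.nn.Linear(obs_dim, act_dim)                 # outputs action logits
model = torch.nn.Linear(obs_dim + act_dim, obs_dim + 1)    # predicts (next_state, reward)

def policy_grad_vector(states, actions, returns, create_graph=False):
    """Flattened REINFORCE-style policy gradient on one batch."""
    logp = torch.log_softmax(policy(states), dim=-1).gather(1, actions.unsqueeze(1)).squeeze(1)
    surrogate = -(logp * returns).mean()
    grads = torch.autograd.grad(surrogate, list(policy.parameters()), create_graph=create_graph)
    return torch.cat([g.reshape(-1) for g in grads])

# Gradient on real data (placeholder rollout standing in for environment interaction)
s_real = torch.randn(32, obs_dim)
a_real = torch.randint(0, act_dim, (32,))
ret_real = torch.randn(32)
g_real = policy_grad_vector(s_real, a_real, ret_real).detach()

# Gradient on data generated by differentiable model rollouts from real start states
s = s_real[:8]
gen_s, gen_a, gen_ret = [], [], []
for _ in range(horizon):
    a = torch.distributions.Categorical(logits=policy(s)).sample()
    pred = model(torch.cat([s, torch.nn.functional.one_hot(a, act_dim).float()], dim=-1))
    gen_s.append(s); gen_a.append(a); gen_ret.append(pred[:, obs_dim])
    s = pred[:, :obs_dim]
g_gen = policy_grad_vector(torch.cat(gen_s), torch.cat(gen_a), torch.cat(gen_ret),
                           create_graph=True)              # keep the graph through the model

# Model objective: make the generated-data gradient point like the real-data gradient.
model_loss = 1.0 - torch.nn.functional.cosine_similarity(g_gen, g_real, dim=0)
model_loss.backward()   # gradients reach `model` through the generated states and rewards
print(float(model_loss), model.weight.grad.norm().item())
```

In practice only a model optimizer would step on these gradients; the policy itself would be updated with the usual policy-gradient step on generated data.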
ARCHER: a ReRAM-based accelerator for compressed recommendation systems
Pub Date: 2023-12-23 | DOI: 10.1007/s11704-023-3397-x
Xinyang Shen, Xiaofei Liao, Long Zheng, Yu Huang, Dan Chen, Hai Jin
Modern recommendation systems are widely deployed in data centers. Random and sparse embedding lookups are the main performance bottleneck when processing recommendation systems on traditional platforms, as they induce abundant data movement between computing units and memory. ReRAM-based processing-in-memory (PIM) can resolve this problem by processing embedding vectors where they are stored. However, the embedding table can easily exceed the capacity of a monolithic ReRAM-based PIM chip, inducing off-chip accesses that may offset the benefits of PIM. We therefore deploy a decomposed (compressed) model on-chip and leverage the high computing efficiency of ReRAM to compensate for the decompression overhead. In this paper, we propose ARCHER, a ReRAM-based PIM architecture that implements fully on-chip recommendation under resource constraints. First, we analyze the computation and access patterns on the decomposed table. Based on the computation pattern, we unify the operations of each layer of the decomposed model as multiply-and-accumulate operations. Based on the access pattern, we propose a hierarchical mapping scheme and a specialized hardware design to maximize resource utilization. Under the unified computation and mapping strategy, we coordinate the pipeline across processing elements. The evaluation shows that ARCHER outperforms the state-of-the-art GPU-based DLRM system, the state-of-the-art near-memory processing recommendation system RecNMP, and the ReRAM-based recommendation accelerator REREC by 15.79×, 2.21×, and 1.21× in performance, and by 56.06×, 6.45×, and 1.71× in energy savings, respectively.
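To illustrate why a decomposed embedding table maps naturally onto ReRAM crossbars, the sketch below expresses a compressed lookup as multiply-and-accumulate operations. The low-rank factorization used here is an assumption for illustration; ARCHER's actual decomposition, mapping scheme, and pipeline are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

# Full embedding table would be num_ids x dim (too large for a monolithic PIM chip)
num_ids, dim, rank = 1_000_000, 64, 8

# Decomposed (compressed) table: E ~= A @ B, with A (num_ids x rank) and B (rank x dim).
# Storage drops from num_ids*dim to num_ids*rank + rank*dim values.
A = rng.standard_normal((num_ids, rank)).astype(np.float32)
B = rng.standard_normal((rank, dim)).astype(np.float32)

def lookup(ids):
    """Reconstruct embedding rows for `ids` as a multiply-and-accumulate:
    each output element is a dot product A[i, :] . B[:, j], i.e., exactly the
    vector-times-matrix MAC that a ReRAM crossbar computes in place."""
    return A[ids] @ B            # (batch, rank) @ (rank, dim) -> (batch, dim)

# Pooling over a multi-hot feature is also a MAC (a ones-vector times the looked-up rows)
batch_ids = np.array([3, 17, 42])
pooled = lookup(batch_ids).sum(axis=0)
print(pooled.shape)              # (64,)
```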
Provable secure authentication key agreement for wireless body area networks
Pub Date: 2023-12-23 | DOI: 10.1007/s11704-023-2548-4
Yuqian Ma, Wenbo Shi, Xinghua Li, Qingfeng Cheng
Wireless body area networks (WBANs) guarantee timely data processing and secure information preservation within the range of the wireless access network, and they are in urgent need of new security technology. However, with the rapid development of hardware, existing security schemes can no longer meet the new requirements of anonymity and lightweight operation. New solutions that do not require complex computation, such as certificateless cryptography, have attracted great attention from researchers. To address these difficulties, Wang et al. designed a new authentication architecture for the WBAN environment, which was claimed to be secure and efficient. In this paper, however, we show that this scheme is vulnerable to ephemeral key leakage attacks. Further, based on this authentication scheme, we propose an anonymous certificateless scheme for lightweight devices in which user anonymity is fully protected. The proposed scheme is proven secure under a specific security model. In addition, we assess the security attributes our scheme meets using BAN logic and the Scyther tool. Comparisons of time consumption and communication cost are given at the end of the paper to demonstrate that our scheme outperforms several previous schemes.
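As a hedged, generic illustration of what an ephemeral key leakage attack exploits (this is a toy protocol, not Wang et al.'s scheme or the scheme proposed here), the sketch below derives a session key only from ephemeral Diffie-Hellman values, so leaking one party's ephemeral secret immediately reveals the key.

```python
import hashlib
import secrets

# Toy modulus and generator for illustration only; a real protocol uses proper group parameters
# and binds long-term (static) keys into the key derivation.
p = 0xFFFFFFFFFFFFFFC5
g = 5

# Each party picks an ephemeral secret and exchanges g^x mod p
x = secrets.randbelow(p - 2) + 1          # client ephemeral secret
y = secrets.randbelow(p - 2) + 1          # hub/server ephemeral secret
X, Y = pow(g, x, p), pow(g, y, p)         # public ephemeral values (visible to an eavesdropper)

def kdf(shared):
    return hashlib.sha256(str(shared).encode()).hexdigest()

# Session key derived ONLY from the ephemeral DH value
k_client = kdf(pow(Y, x, p))
k_server = kdf(pow(X, y, p))
assert k_client == k_server

# Ephemeral key leakage: an attacker who learns x (weak RNG, memory dump, ...) and has
# recorded Y recomputes the same session key without touching any long-term secret.
k_attacker = kdf(pow(Y, x, p))
assert k_attacker == k_client
print("session key recovered from a leaked ephemeral secret")
```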
BVDFed: Byzantine-resilient and verifiable aggregation for differentially private federated learning
Pub Date: 2023-12-23 | DOI: 10.1007/s11704-023-3142-5
Abstract
Federated Learning (FL) has emerged as a powerful technology for collaborative training between multiple clients and a server while preserving the data privacy of clients. To further enhance privacy in FL, Differentially Private Federated Learning (DPFL) has gradually become one of the most effective approaches. Because DPFL operates in a distributed setting, there may be malicious adversaries who manipulate some clients and the aggregation server to produce malicious parameters and disturb the learning model. However, existing aggregation protocols for DPFL consider either the existence of some corrupted clients (Byzantines) or a corrupted server; owing to the complicated threat model, they cannot eliminate the effects of corrupted clients and a corrupted server when both exist simultaneously. In this paper, we elaborate such an adversarial threat model and propose BVDFed, to the best of our knowledge the first Byzantine-resilient and Verifiable aggregation for Differentially private FEDerated learning. Specifically, we propose the Differentially Private Federated Averaging algorithm (DPFA) as the primary workflow of BVDFed, which is more lightweight and more easily portable than traditional DPFL algorithms. We then introduce the Loss Score to indicate the trustworthiness of disguised gradients in DPFL. Based on the Loss Score, we propose an aggregation rule, DPLoss, to eliminate faulty gradients from Byzantine clients during server aggregation while preserving the privacy of clients' data. Additionally, we design a secure verification scheme, DPVeri, compatible with DPFA and DPLoss, to support honest clients in verifying the integrity of the received aggregated results. DPVeri also resists collusion attacks by no more than t participants in our aggregation. Theoretical analysis and experimental results demonstrate that our aggregation is feasible and effective in practice.
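The sketch below illustrates the overall flow with stand-ins: clients clip and noise their gradients (DP-style updates), and the server scores each update and drops the worst before averaging (a DPLoss-style rule). The score used here is a validation loss on a small server-held set, which is an assumption made for illustration; the paper's Loss Score definition, DPFA details, and the DPVeri verification scheme are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)

def client_update(weights, local_grad, clip=1.0, sigma=0.5):
    """DP-style local step: clip the gradient and add Gaussian noise before sharing."""
    norm = np.linalg.norm(local_grad)
    g = local_grad * min(1.0, clip / (norm + 1e-12))
    return g + rng.normal(0.0, sigma * clip, size=g.shape)

def loss_score(weights, update, X_val, y_val, lr=0.1):
    """Stand-in trust score: validation loss after tentatively applying the update
    (lower = more trustworthy). The paper defines its Loss Score differently."""
    w = weights - lr * update
    return float(np.mean((X_val @ w - y_val) ** 2))

def aggregate(weights, updates, X_val, y_val, n_byzantine):
    """DPLoss-style rule (sketch): drop the n_byzantine worst-scoring updates, average the rest."""
    scores = [loss_score(weights, u, X_val, y_val) for u in updates]
    keep = np.argsort(scores)[: len(updates) - n_byzantine]
    return np.mean([updates[i] for i in keep], axis=0)

# Toy linear-regression federation: 5 honest clients + 2 Byzantine clients
d = 4
w_true = rng.standard_normal(d)
weights = np.zeros(d)
X_val = rng.standard_normal((64, d)); y_val = X_val @ w_true

honest = [client_update(weights, 2 * (weights - w_true)) for _ in range(5)]
byzantine = [rng.standard_normal(d) * 100 for _ in range(2)]   # arbitrary malicious updates
agg = aggregate(weights, honest + byzantine, X_val, y_val, n_byzantine=2)
weights -= 0.1 * agg
print(np.linalg.norm(weights - w_true))   # distance to the true model after one round
```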
Label distribution similarity-based noise correction for crowdsourcing
Pub Date: 2023-12-23 | DOI: 10.1007/s11704-023-2751-3
Lijuan Ren, Liangxiao Jiang, Wenjun Zhang, Chaoqun Li
Abstract
In crowdsourcing scenarios, we can obtain each instance's multiple noisy labels from different crowd workers and then infer its integrated label via label aggregation. Despite the effectiveness of label aggregation methods, a certain level of noise remains in the integrated labels. Thus, several noise correction methods have been proposed in recent years to reduce the impact of this noise. However, to the best of our knowledge, existing methods rarely consider an instance's information from both its features and its multiple noisy labels simultaneously when identifying a noise instance. In this study, we argue that the more distinguishable an instance's features are and the noisier its multiple noisy labels, the more likely it is a noise instance. Based on this premise, we propose a label distribution similarity-based noise correction (LDSNC) method. To measure whether an instance's features are distinguishable, we obtain each instance's predicted label distribution by building multiple classifiers using instances' features and their integrated labels. To measure whether an instance's multiple noisy labels are noisy, we obtain each instance's multiple noisy label distribution using its multiple noisy labels. Then, we use the Kullback-Leibler (KL) divergence to calculate the similarity between the predicted label distribution and the multiple noisy label distribution, and define instances with lower similarity as noise instances. Extensive experimental results on 34 simulated and four real-world crowdsourced datasets validate the effectiveness of our method.
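A hedged sketch of the comparison step: estimate each instance's predicted label distribution with a cross-validated classifier trained on features and integrated labels, build the empirical distribution of its crowd labels, and flag instances whose KL divergence between the two is large. The single classifier and fixed threshold below are simplifications of the paper's multiple-classifier setup and noise-instance definition.

```python
import numpy as np
from scipy.stats import entropy
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

def noisy_label_distribution(worker_labels, n_classes, eps=1e-6):
    """Empirical distribution of one instance's multiple noisy labels."""
    counts = np.bincount(worker_labels, minlength=n_classes).astype(float) + eps
    return counts / counts.sum()

def flag_noise(X, integrated_labels, worker_label_lists, n_classes, kl_threshold=1.0):
    """Flag instances whose predicted label distribution diverges most from their
    crowd-label distribution (low similarity = likely noise instance)."""
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    predicted = cross_val_predict(clf, X, integrated_labels, cv=5, method="predict_proba")
    kl = np.array([entropy(predicted[i] + 1e-6,
                           noisy_label_distribution(worker_label_lists[i], n_classes))
                   for i in range(len(X))])
    return kl > kl_threshold

# Toy crowdsourced task: 100 instances, 3 workers per instance, 20% label-flip noise
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
true_y = (X[:, 0] > 0).astype(int)
worker_labels = [np.array([y if rng.random() > 0.2 else 1 - y for _ in range(3)]) for y in true_y]
integrated = np.array([np.bincount(wl, minlength=2).argmax() for wl in worker_labels])
print(flag_noise(X, integrated, worker_labels, n_classes=2).sum(), "instances flagged as noise")
```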
Delegable zk-SNARKs with proxies
Pub Date: 2023-12-23 | DOI: 10.1007/s11704-023-2782-9
Abstract
In this paper, we propose the concept of delegable zero-knowledge succinct non-interactive arguments of knowledge (zk-SNARKs). A delegable zk-SNARK is parameterized by (μ, k, k′, k″). The delegable property allows the prover to delegate its proving ability to μ proxies. Any k honest proxies are able to generate a correct proof for a statement, but a collusion of fewer than k proxies obtains no information about the witness of the statement. We also define k′-soundness and k″-zero-knowledge by taking multiple proxies into consideration.
We propose a construction of a (μ, 2t + 1, t, t)-delegable zk-SNARK for the NP-complete language of arithmetic circuit satisfiability. Our delegable zk-SNARK stems from Groth's zk-SNARK scheme (Groth16). We take advantage of the additive and multiplicative properties of polynomial-based secret sharing schemes to achieve delegation for the zk-SNARK. Our secret sharing scheme works well with the pairing groups, so that the succinctness of Groth's zk-SNARK scheme is preserved while the delegable property is added and soundness and zero-knowledge are maintained in the multi-proxy scenario.
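The additive and multiplicative properties being exploited can be seen in a toy Shamir sharing over a small prime field (the real construction works in the pairing groups of Groth16, which are not reproduced here): shares of two secrets can be added point-wise and reconstructed from t + 1 shares, while point-wise products share the product but need 2t + 1 shares, matching the (μ, 2t + 1, t, t) parameters.

```python
import random

P = 2_147_483_647   # toy prime field; a real scheme works modulo the pairing group order

def share(secret, t, n):
    """Shamir sharing: random degree-t polynomial f with f(0) = secret; share i is (i, f(i))."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t)]
    return [(i, sum(c * pow(i, k, P) for k, c in enumerate(coeffs)) % P) for i in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

t, n = 2, 7                      # mu = 7 proxies, threshold t = 2
a_shares = share(123456, t, n)
b_shares = share(654321, t, n)

# Additive property: point-wise sums share the sum (still degree t, so t+1 shares suffice)
sum_shares = [(x, (ya + yb) % P) for (x, ya), (_, yb) in zip(a_shares, b_shares)]
assert reconstruct(sum_shares[: t + 1]) == (123456 + 654321) % P

# Multiplicative property: point-wise products share the product, but the degree doubles
# to 2t, so 2t+1 = 5 shares are needed -- hence the (mu, 2t+1, t, t) parameters above.
prod_shares = [(x, (ya * yb) % P) for (x, ya), (_, yb) in zip(a_shares, b_shares)]
assert reconstruct(prod_shares[: 2 * t + 1]) == (123456 * 654321) % P
print("additive and multiplicative share properties verified")
```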
Contactless interaction recognition and interactor detection in multi-person scenes
Pub Date: 2023-12-23 | DOI: 10.1007/s11704-023-2418-0
Jiacheng Li, Ruize Han, Wei Feng, Haomin Yan, Song Wang
Human interaction recognition is an essential task in video surveillance. Current work on human interaction recognition mainly focuses on scenarios that contain only the close-contact interactive subjects, without other people. In this paper, we handle more practical but more challenging scenarios in which the interactive subjects are contactless and other subjects not involved in the interactions of interest are also present in the scene. To address this problem, we propose an Interactive Relation Embedding Network (IRE-Net) to simultaneously identify the subjects involved in an interaction and recognize their interaction category. As this is a new problem, we also build a new dataset with annotations and metrics for performance evaluation. Experimental results on this dataset show significant improvements of the proposed method over current methods developed for human interaction recognition and group activity recognition.
Empirically revisiting and enhancing automatic classification of bug and non-bug issues
Pub Date: 2023-12-23 | DOI: 10.1007/s11704-023-2771-z
Zhong Li, Minxue Pan, Yu Pei, Tian Zhang, Linzhang Wang, Xuandong Li
A large body of research effort has been dedicated to automated issue classification for Issue Tracking Systems (ITSs). Although existing approaches have shown promising performance, their different design choices, including the textual fields, feature representation methods, and machine learning algorithms they adopt, have not been comprehensively compared and analyzed. To fill this gap, we perform the first extensive study of automated issue classification on 9 state-of-the-art issue classification approaches. Our experimental results on the widely studied dataset reveal multiple practical guidelines for automated issue classification, including: (1) training separate models for issue titles and descriptions and then combining the two models tends to achieve better performance for issue classification; (2) word embedding with Long Short-Term Memory (LSTM) can better extract features from the textual fields of issues and hence leads to better issue classification models; (3) certain terms in the textual fields are helpful for building classifiers that better discriminate between bug and non-bug issues; (4) the performance of the issue classification model is not sensitive to the choice of ML algorithm. Based on our study outcomes, we further propose an advanced issue classification approach, DeepLabel, which achieves better performance than the existing issue classification approaches.
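Guidelines (1) and (2) can be made concrete with a small PyTorch sketch: separate embedding-plus-LSTM branches for the title and the description, whose representations are combined for bug/non-bug prediction. The vocabulary size, dimensions, and combination layer are illustrative assumptions, not the DeepLabel architecture.

```python
import torch
import torch.nn as nn

class TwoBranchIssueClassifier(nn.Module):
    """Separate branches encode the title and the description; their last LSTM hidden
    states are concatenated and fed to a linear head producing bug/non-bug logits."""
    def __init__(self, vocab_size=20000, emb_dim=100, hidden=64):
        super().__init__()
        self.title_embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.desc_embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.title_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.desc_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(2 * hidden, 2)

    def forward(self, title_ids, desc_ids):
        _, (t_h, _) = self.title_lstm(self.title_embed(title_ids))   # last hidden state per branch
        _, (d_h, _) = self.desc_lstm(self.desc_embed(desc_ids))
        return self.head(torch.cat([t_h[-1], d_h[-1]], dim=-1))

# Toy forward pass on padded token-id batches (real input comes from a tokenized issue corpus)
model = TwoBranchIssueClassifier()
titles = torch.randint(1, 20000, (8, 16))     # 8 issues, 16 title tokens each
descs = torch.randint(1, 20000, (8, 128))     # 8 issues, 128 description tokens each
logits = model(titles, descs)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 2, (8,)))
print(logits.shape, float(loss))
```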