Pub Date : 2023-12-28 DOI: 10.1007/s11704-023-2662-3
Liangxuan Zhu, Han Li, Xuelin Zhang, Lingjuan Wu, Hong Chen
Interpretability has drawn increasing attention in machine learning. Most existing work focuses on post-hoc explanations rather than on building self-explaining models. We therefore propose the Neural Partially Linear Additive Model (NPLAM), which automatically distinguishes insignificant, linear, and nonlinear features within a neural network. On the one hand, the neural network construction fits data better than spline functions with the same number of parameters; on the other hand, the learnable gate design and the sparsity regularization term preserve the ability to perform feature selection and structure discovery. We theoretically establish generalization error bounds for the proposed method using Rademacher complexity. Experiments on both simulated and real-world datasets verify its good performance and interpretability.
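For orientation, the generic partially linear additive model that the name refers to can be written as below. This is the textbook form only; the index sets and symbols are notation assumed here, and the abstract does not give NPLAM's exact gate parameterization.

```latex
% Generic partially linear additive model (notation assumed, not taken from the paper):
%   \mathcal{L} = indices of features with a linear effect
%   \mathcal{N} = indices of features with a nonlinear effect
%   features in neither set are treated as insignificant and drop out
y = \beta_0
  + \sum_{j \in \mathcal{L}} \beta_j x_j
  + \sum_{j \in \mathcal{N}} f_j(x_j)
  + \varepsilon
```

Structure discovery then amounts to deciding, for each feature, whether it belongs to the linear set, the nonlinear set, or neither.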
{"title":"Neural partially linear additive model","authors":"Liangxuan Zhu, Han Li, Xuelin Zhang, Lingjuan Wu, Hong Chen","doi":"10.1007/s11704-023-2662-3","DOIUrl":"https://doi.org/10.1007/s11704-023-2662-3","journal":{"name":"Frontiers of Computer Science"},"PeriodicalIF":4.2,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"139056402"}
Pub Date : 2023-12-28 DOI: 10.1007/s11704-023-3087-8
Jiantong Huo, Zhisheng Huo, Limin Xiao, Zhenxue He
In high-performance computing over a WAN, national supercomputing centers are geographically scattered and the network topology is complex, so it is difficult to form a unified view of resources. To aggregate the widely dispersed storage resources of the national supercomputing centers in China, we previously proposed a global virtual data space named GVDS as part of the project “High Performance Computing Virtual Data Space”, within the National Key Research and Development Program of China. GVDS enables large-scale high-performance computing applications to run efficiently across a WAN. However, applications running on GVDS are often data-intensive and require large amounts of data from multiple supercomputing centers across WANs, so GVDS suffers from performance bottlenecks in cross-WAN data migration and access. To solve this problem, this paper proposes a performance optimization framework for GVDS that comprises a multitask-oriented data migration method and a request access-aware IO proxy resource allocation strategy (the IO proxy is a module of GVDS). In a WAN environment, the framework makes efficient migration decisions based on the amount of data to be migrated and the number of data sources, which guarantees lower average migration latency when multiple data migration tasks run in parallel. In addition, it ensures that the thread resources of the IO proxy node are allocated fairly among different types of requests, improving application performance across WANs. The experimental results show that the framework effectively reduces the average data access delay of GVDS and greatly improves application performance.
{"title":"Research on performance optimization of virtual data space across WAN","authors":"Jiantong Huo, Zhisheng Huo, Limin Xiao, Zhenxue He","doi":"10.1007/s11704-023-3087-8","DOIUrl":"https://doi.org/10.1007/s11704-023-3087-8","journal":{"name":"Frontiers of Computer Science"},"PeriodicalIF":4.2,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"139056882"}
Pub Date : 2023-12-28 DOI: 10.1007/s11704-023-2492-3
Mojtaba Noorallahzadeh, Mohammad Mosleh, Kamalika Datta
With the recent demonstration of quantum computers, interest in reversible logic synthesis and optimization has taken a different turn. As every quantum operation is inherently reversible, there is strong motivation to explore reversible circuit design and optimization. With respect to circuit faults, the parity-preserving feature contributes to the detection of permanent and temporary faults. In the context of reversible circuits, the parity-preserving property ensures that the input and output parities are equal. In this paper we propose six parity-preserving reversible blocks (Z, F, A, T, S, and L) with improved quantum cost. The reversible blocks are synthesized using an existing synthesis method that generates a netlist of multiple-control Toffoli (MCT) gates. Various optimization rules are applied at the reversible circuit level, followed by transformation into a netlist of elementary quantum gates from the NCV library. Designs of a full adder and of unsigned and signed multipliers are proposed using the parity-preserving functional blocks. The proposed designs are compared with state-of-the-art methods and found to be better in terms of cost of realization. Compared with recent works, the 4-bit unsigned multiplier shows average savings of 25.04%, 20.89%, 21.17%, and 51.03% in quantum cost, garbage outputs, constant inputs, and gate count, respectively, and the 5-bit signed multiplier shows average savings of 18.59%, 13.82%, 13.82%, and 27.65% in the same metrics.
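The parity-preserving property described above can be checked exhaustively for small gates. The following is a minimal sketch, not the paper's Z, F, A, T, S, or L blocks (which the abstract does not define): it uses the well-known Fredkin (controlled-swap) and Toffoli gates as stand-ins, and all function names are illustrative.

```python
from itertools import product


def fredkin(a, b, c):
    """Controlled swap: exchange b and c when the control bit a is 1."""
    return (a, c, b) if a else (a, b, c)


def toffoli(a, b, c):
    """Flip the target bit c when both control bits a and b are 1."""
    return (a, b, c ^ (a & b))


def parity(bits):
    """Parity of a bit tuple: 0 if the number of ones is even, else 1."""
    return sum(bits) % 2


def is_reversible(gate, n=3):
    """A gate is reversible if it maps the 2**n inputs to 2**n distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n)}
    return len(outputs) == 2 ** n


def is_parity_preserving(gate, n=3):
    """Input parity must equal output parity for every input pattern."""
    return all(
        parity(bits) == parity(gate(*bits))
        for bits in product((0, 1), repeat=n)
    )
```

Both gates are reversible, but only the Fredkin gate preserves parity: the Toffoli gate maps (1, 1, 0) to (1, 1, 1), which changes the parity. In a parity-preserving circuit, any fault that flips a single output bit therefore shows up as a parity mismatch.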
{"title":"A new design of parity-preserving reversible multipliers based on multiple-control toffoli synthesis targeting emerging quantum circuits","authors":"Mojtaba Noorallahzadeh, Mohammad Mosleh, Kamalika Datta","doi":"10.1007/s11704-023-2492-3","DOIUrl":"https://doi.org/10.1007/s11704-023-2492-3","journal":{"name":"Frontiers of Computer Science"},"PeriodicalIF":4.2,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"139056398"}
Spatial crowdsourcing (SC) is a popular data collection paradigm for numerous applications. As the number of tasks and workers in SC grows, heterogeneity becomes an unavoidable difficulty in task allocation. Existing research focuses only on single-heterogeneous task allocation. However, a variety of heterogeneous objects coexist in real-world SC systems. This dramatically expands the search space for the optimal task allocation solution and affects the quality and efficiency of data collection. In this paper, an aggregation-based dual heterogeneous task allocation algorithm is put forth. It investigates the impact of dual heterogeneity on the task allocation problem and seeks to maximize the quality of task completion while minimizing the average travel distance. The problem is first proved to be NP-hard. Then, a task aggregation method based on locations and requirements is built to reduce task failures. A time-constrained shortest path planning method is also developed to shorten the travel distance within a community. After that, two evolutionary task allocation schemes are presented. Finally, extensive experiments are conducted on real-world datasets in various contexts. Compared with baseline algorithms, the proposed schemes improve the quality of task completion by up to 25% and reduce the average travel distance by 34%.
{"title":"Aggregation-based dual heterogeneous task allocation in spatial crowdsourcing","authors":"Xiaochuan Lin, Kaimin Wei, Zhetao Li, Jinpeng Chen, Tingrui Pei","doi":"10.1007/s11704-023-3133-6","DOIUrl":"https://doi.org/10.1007/s11704-023-3133-6","journal":{"name":"Frontiers of Computer Science"},"PeriodicalIF":4.2,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"139056401"}
Pub Date : 2023-12-28 DOI: 10.1007/s11704-023-2791-8
Abstract
Learning-outcome prediction (LOP) is a longstanding and critical problem in education. Many studies have contributed to developing effective models, but these often suffer from data shortage and poor generalization across institutions because of privacy-protection constraints. To this end, this study proposes a distributed grade prediction model, dubbed FecMap, which exploits the federated learning (FL) framework to keep the private data of local clients in place while communicating with other clients through a global generalized model. FecMap combines local subspace learning (LSL), which explicitly learns local features as distinct from global features, with multi-layer privacy protection (MPP), which hierarchically protects private features, including features that may be shared through the model and features that may not be shared, to obtain high-performance, client-specific LOP classifiers for each institution. FecMap is trained iteratively with all datasets remaining distributed on the clients: each client trains a local neural network composed of a global part, a local part, and a classification head, and the server averages the global parts from the clients. To evaluate FecMap, we collected three higher-education datasets of student academic records from engineering majors. Experimental results show that FecMap benefits from the proposed LSL and MPP and achieves steady performance on the LOP task compared with state-of-the-art models. This study makes a fresh attempt at using federated learning for a learning-analytics task, potentially paving the way to personalized education with privacy protection.
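The server-side step described above, averaging only the global parts while local parts and classification heads stay on each client, can be sketched as follows. This is a simplified illustration under stated assumptions: models are plain dictionaries of flat weight lists, the mean is unweighted, and the names `average_global_parts` and `server_round` are hypothetical. The abstract does not specify FecMap's actual aggregation weights or data structures.

```python
def average_global_parts(client_models):
    """Element-wise unweighted mean of every client's 'global' weights.

    Assumption: each model is a dict with 'global', 'local', and 'head'
    entries, each a flat list of floats of matching length across clients.
    """
    n_clients = len(client_models)
    dim = len(client_models[0]["global"])
    return [
        sum(model["global"][i] for model in client_models) / n_clients
        for i in range(dim)
    ]


def server_round(client_models):
    """One aggregation round: broadcast the averaged global part.

    The 'local' and 'head' entries are never read or modified here,
    which is what keeps them private to each client in this sketch.
    """
    averaged = average_global_parts(client_models)
    for model in client_models:
        model["global"] = list(averaged)  # copy so clients do not share a list
    return client_models
```

For example, two clients with global parts `[1.0, 2.0]` and `[3.0, 4.0]` both end the round with `[2.0, 3.0]`, while their local parts and heads are left untouched. A size-weighted mean, as in standard federated averaging, would be a natural variant.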
{"title":"Federated learning-outcome prediction with multi-layer privacy protection","authors":"","doi":"10.1007/s11704-023-2791-8","DOIUrl":"https://doi.org/10.1007/s11704-023-2791-8","journal":{"name":"Frontiers of Computer Science"},"PeriodicalIF":4.2,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"139056476"}
Pub Date : 2023-12-28 DOI: 10.1007/s11704-023-2616-9
Fan Zhang, Meijuan Yin, Fenlin Liu, Xiangyang Luo, Shuodi Zu
IP geolocation is essential for the territorial analysis of sensitive network entities, location-based services (LBS), and network fraud detection. It has important theoretical significance and application value. Measurement-based IP geolocation is an active research topic. However, existing IP geolocation algorithms cannot effectively utilize the distance characteristics of delay or the connection relations between nodes, which results in high geolocation error. Obtaining the mapping between delay, node connection relations, and geographical location is challenging. Based on the idea of network representation learning, we propose a representation learning model for IP nodes (IP2vec for short) and apply it to street-level IP geolocation. The IP2vec model vectorizes nodes according to the connection relations and delays between them, so that the IP vectors reflect the distance and topological proximity between IP nodes. The street-level IP geolocation algorithm based on the IP2vec model proceeds as follows. First, we measure the landmarks and the target IP to obtain delay and path information and construct the network topology. Second, we use the IP2vec model to obtain IP vectors from the network topology. Third, we train a neural network to fit the mapping between the vectors and the locations of the landmarks. Finally, the vector of the target IP is fed into the neural network to obtain its geographical location. The algorithm can accurately infer the geographical locations of target IPs based on the delay and topological proximity embedded in the IP vectors. Cross-validation experiments on 10023 target IPs in New York, Beijing, Hong Kong, and Zhengzhou demonstrate that the proposed algorithm achieves street-level geolocation. Compared with the existing algorithms Hop-Hot, IP-geolocater, and SLG, the mean geolocation error of the proposed algorithm is reduced by 33%, 39%, and 51%, respectively.
{"title":"IP2vec: an IP node representation model for IP geolocation","authors":"Fan Zhang, Meijuan Yin, Fenlin Liu, Xiangyang Luo, Shuodi Zu","doi":"10.1007/s11704-023-2616-9","DOIUrl":"https://doi.org/10.1007/s11704-023-2616-9","journal":{"name":"Frontiers of Computer Science"},"PeriodicalIF":4.2,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"139056534"}
Pub Date : 2023-12-28 DOI: 10.1007/s11704-023-2774-9
Zhao-Hui Li, Xin-Yu Feng
Though obstruction-freedom is a weaker progress property than other non-blocking properties such as lock-freedom and wait-freedom, it has advantages that have led to the use of obstruction-free implementations in software transactional memory (STM) and in anonymous and fault-tolerant distributed computing. However, existing work can only verify obstruction-freedom of specific data structures (e.g., STM and list-based algorithms).
In this paper, to fill this gap, we propose a program logic that can formally verify obstruction-freedom of practical implementations and, at the same time, verify linearizability, a safety property. We also propose informal principles for extending a logic for verifying linearizability into one that also verifies obstruction-freedom. With this approach, an existing linearizability proof can be reused directly to construct a proof of both linearizability and obstruction-freedom. Finally, we have successfully applied our logic to verify the practical obstruction-free double-ended queue implementation from the classic paper that first defined obstruction-freedom.
{"title":"A program logic for obstruction-freedom","authors":"Zhao-Hui Li, Xin-Yu Feng","doi":"10.1007/s11704-023-2774-9","DOIUrl":"https://doi.org/10.1007/s11704-023-2774-9","journal":{"name":"Frontiers of Computer Science"},"PeriodicalIF":4.2,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"139056647"}
Pub Date : 2023-12-27 DOI: 10.1007/s11704-023-3150-5
Abstract
Model-based reinforcement learning is a promising direction for improving the sample efficiency of reinforcement learning by learning a model of the environment. Previous model learning methods aim to fit the transition data and commonly employ supervised learning to minimize the distance between the predicted state and the real state. Supervised model learning, however, diverges from the ultimate goal of model learning, i.e., optimizing the policy that is learned in the model. In this work, we investigate how model learning and policy learning can share the same objective of maximizing the expected return in the real environment. We find that model learning towards this objective leads to a target of increasing the similarity between the gradient on generated data and the gradient on real data. We thus derive the gradient of the model from this target and propose the Model Gradient algorithm (MG), which integrates this novel model learning approach with policy-gradient-based policy optimization. We conduct experiments on multiple locomotion control tasks and find that MG not only achieves high sample efficiency but also converges to better performance than traditional model-based reinforcement learning approaches.
{"title":"Model gradient: unified model and policy learning in model-based reinforcement learning","authors":"","doi":"10.1007/s11704-023-3150-5","DOIUrl":"https://doi.org/10.1007/s11704-023-3150-5","journal":{"name":"Frontiers of Computer Science"},"PeriodicalIF":4.2,"publicationDate":"2023-12-27","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"139056399"}
Pub Date : 2023-12-23 DOI: 10.1007/s11704-023-3397-x
Xinyang Shen, Xiaofei Liao, Long Zheng, Yu Huang, Dan Chen, Hai Jin
Recommendation systems are widely used in modern data centers. Random and sparse embedding lookup operations are the main performance bottleneck when running recommendation systems on traditional platforms, because they induce heavy data movement between computing units and memory. ReRAM-based processing-in-memory (PIM) can resolve this problem by processing embedding vectors where they are stored. However, the embedding table can easily exceed the capacity of a monolithic ReRAM-based PIM chip, which induces off-chip accesses that may offset the benefits of PIM. Therefore, we deploy the decomposed model on-chip and leverage the high computing efficiency of ReRAM to compensate for the performance loss caused by decompression. In this paper, we propose ARCHER, a ReRAM-based PIM architecture that implements fully on-chip recommendation under resource constraints. First, we thoroughly analyze the computation and access patterns on the decomposed table. Based on the computation pattern, we unify the operations of each layer of the decomposed model into multiply-and-accumulate operations. Based on the observed access pattern, we propose a hierarchical mapping scheme and a specialized hardware design to maximize resource utilization. Under the unified computation and mapping strategy, we can coordinate the pipeline across processing elements. The evaluation shows that ARCHER outperforms the state-of-the-art GPU-based DLRM system, the state-of-the-art near-memory processing recommendation system RecNMP, and the ReRAM-based recommendation accelerator REREC by 15.79×, 2.21×, and 1.21× in terms of performance and 56.06×, 6.45×, and 1.71× in terms of energy savings, respectively.
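To illustrate why a lookup on a decomposed table reduces to multiply-and-accumulate operations, the sketch below assumes a simple two-factor low-rank decomposition, table ≈ A · B. This choice is an assumption made for illustration only: the abstract does not say which decomposition ARCHER uses, and the function names are hypothetical. A pooled (sum) lookup over several rows equals a multi-hot row vector multiplied through the factors, so the full table never has to be materialized.

```python
def vec_mat(vec, mat):
    """Row vector times matrix, written as explicit multiply-and-accumulate."""
    out = [0.0] * len(mat[0])
    for v, row in zip(vec, mat):
        if v == 0:
            continue  # sparse input: skip rows that are not selected
        for j, w in enumerate(row):
            out[j] += v * w
    return out


def mat_mat(a, b):
    """Dense product a @ b, used here only to build a reference table."""
    return [vec_mat(row, b) for row in a]


def pooled_lookup_dense(table, indices):
    """Baseline: fetch the selected rows of the full table and sum them."""
    out = [0.0] * len(table[0])
    for i in indices:
        for j, w in enumerate(table[i]):
            out[j] += w
    return out


def pooled_lookup_factored(a, b, indices):
    """Same result computed on the factors: (multi_hot @ a) @ b."""
    multi_hot = [0.0] * len(a)
    for i in indices:
        multi_hot[i] += 1.0
    return vec_mat(vec_mat(multi_hot, a), b)
```

With a 4×2 factor `a` and a 2×3 factor `b`, the factored lookup stores 14 values instead of the 12-entry table in this toy case, but for realistic tables (millions of rows, small rank) the factors are far smaller than the table, at the cost of the extra multiply-and-accumulate work that the abstract says ReRAM's compute efficiency is meant to absorb.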
{"title":"ARCHER: a ReRAM-based accelerator for compressed recommendation systems","authors":"Xinyang Shen, Xiaofei Liao, Long Zheng, Yu Huang, Dan Chen, Hai Jin","doi":"10.1007/s11704-023-3397-x","DOIUrl":"https://doi.org/10.1007/s11704-023-3397-x","url":null,"PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":null,"pages":null},"PeriodicalIF":4.2,"publicationDate":"2023-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139026723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-12-23 DOI: 10.1007/s11704-023-2548-4
Yuqian Ma, Wenbo Shi, Xinghua Li, Qingfeng Cheng
Wireless body area networks (WBANs) guarantee timely data processing and secure information preservation within the range of the wireless access network, and they urgently need a new type of security technology. However, with the rapid development of hardware, existing security schemes can no longer meet the new requirements of anonymity and lightweight operation. New solutions that do not require complex computation, such as certificateless cryptography, have attracted great attention from researchers. To resolve these difficulties, Wang et al. designed a new authentication architecture for the WBAN environment, which was claimed to be secure and efficient. However, in this paper, we show that this scheme is prone to ephemeral key leakage attacks. Further, based on this authentication scheme, we propose an anonymous certificateless scheme for lightweight devices that fully protects user anonymity. The proposed scheme is proved secure under a specific security model. In addition, we assess the security attributes our scheme satisfies using BAN logic and the Scyther tool. Comparisons of time consumption and communication cost, given at the end of the paper, demonstrate that our scheme outperforms several previous schemes.
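To make the notion of an ephemeral key leakage attack concrete, here is a minimal toy sketch of a generic Diffie-Hellman-style key agreement. It is not the scheme of Wang et al. nor the scheme proposed in the paper, neither of which is specified in the abstract; the group parameters, secrets, and the `kdf` helper are hypothetical and far too weak for real use.

```python
import hashlib

# Toy parameters, chosen only for illustration: a real scheme would use a
# standardized group or elliptic curve, not this modulus and generator.
P = 2**127 - 1  # a Mersenne prime
G = 3


def kdf(*shared_values: int) -> str:
    """Hash the shared group elements into a session key (hex string)."""
    digest = hashlib.sha256()
    for value in shared_values:
        digest.update(value.to_bytes(16, "big"))
    return digest.hexdigest()


# Hypothetical fixed secrets so the example is reproducible.
a_static, b_static = 0x1234567, 0x89ABCDE  # long-term private keys
x_eph, y_eph = 0x0FEDCBA, 0x7654321        # per-session ephemeral secrets

A_PUB, B_PUB = pow(G, a_static, P), pow(G, b_static, P)
X_PUB, Y_PUB = pow(G, x_eph, P), pow(G, y_eph, P)  # sent in the clear

# Variant 1: the session key depends only on the ephemeral exchange.
k1_alice = kdf(pow(Y_PUB, x_eph, P))
k1_bob = kdf(pow(X_PUB, y_eph, P))

# An eavesdropper who learns x_eph (ephemeral key leakage) and observed
# Y_PUB on the wire recomputes exactly the same key.
k1_attacker = kdf(pow(Y_PUB, x_eph, P))

# Variant 2: also mix in a value that requires a long-term private key.
k2_alice = kdf(pow(Y_PUB, x_eph, P), pow(B_PUB, a_static, P))
k2_bob = kdf(pow(X_PUB, y_eph, P), pow(A_PUB, b_static, P))

# With x_eph alone the attacker can form only the first KDF input.
k2_attacker_guess = kdf(pow(Y_PUB, x_eph, P))

print(k1_attacker == k1_alice)
print(k2_alice == k2_bob)
print(k2_attacker_guess == k2_alice)
```

In the first variant the leaked ephemeral secret is enough to recover the session key, which is the failure mode the abstract refers to. Binding the key to long-term secrets, as in the second variant, is one common way to resist this particular leak, though it does not by itself make a scheme secure against other attacks.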
{"title":"Provable secure authentication key agreement for wireless body area networks","authors":"Yuqian Ma, Wenbo Shi, Xinghua Li, Qingfeng Cheng","doi":"10.1007/s11704-023-2548-4","DOIUrl":"https://doi.org/10.1007/s11704-023-2548-4","url":null,"PeriodicalId":12640,"journal":{"name":"Frontiers of Computer Science","volume":null,"pages":null},"PeriodicalIF":4.2,"publicationDate":"2023-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139025450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}