Performance Evaluation of Parallel Sortings on the Supercomputer Fugaku
Pub Date: 2023-05-09 | DOI: 10.48550/arXiv.2305.05245
Tomoyuki Tokuue, T. Ishiyama
Sorting is one of the most basic algorithms, and developing highly parallel sorting programs is becoming increasingly important in high-performance computing because the number of CPU cores per node in modern supercomputers tends to increase. In this study, we have implemented two multi-threaded sorting algorithms based on samplesort and compared their performance on the supercomputer Fugaku. The first algorithm divides an input sequence into multiple blocks, sorts each block, and then selects pivots by sampling from each block at regular intervals. Each block is then partitioned using the pivots, and corresponding partitions in different blocks are merged into a single sorted sequence. The second algorithm differs from the first only in how pivots are selected: a binary search is used to choose pivots such that the number of elements in each partition is equal. We compare the performance of the two algorithms with different sequential sorting and multiway merging algorithms. We demonstrate that the second algorithm, with BlockQuicksort (a quicksort accelerated by reducing conditional branches) for sequential sorting and a selection tree for merging, shows consistently high speed and high parallel efficiency for various input data types and data sizes.
{"title":"Performance Evaluation of Parallel Sortings on the Supercomputer Fugaku","authors":"Tomoyuki Tokuue, T. Ishiyama","doi":"10.48550/arXiv.2305.05245","DOIUrl":"https://doi.org/10.48550/arXiv.2305.05245","url":null,"abstract":"Sorting is one of the most basic algorithms, and developing highly parallel sorting programs is becoming increasingly important in high-performance computing because the number of CPU cores per node in modern supercomputers tends to increase. In this study, we have implemented two multi-threaded sorting algorithms based on samplesort and compared their performance on the supercomputer Fugaku. The first algorithm divides an input sequence into multiple blocks, sorts each block, and then selects pivots by sampling from each block at regular intervals. Each block is then partitioned using the pivots, and partitions in different blocks are merged into a single sorted sequence. The second algorithm differs from the first one in only selecting pivots, where the binary search is used to select pivots such that the number of elements in each partition is equal. We compare the performance of the two algorithms with different sequential sorting and multiway merging algorithms. We demonstrate that the second algorithm with BlockQuicksort (a quicksort accelerated by reducing conditional branches) for sequential sorting and the selection tree for merging shows consistently high speed and high parallel efficiency for various input data types and data sizes.","PeriodicalId":430763,"journal":{"name":"J. Inf. Process.","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115622974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Type checking data structures more complex than trees
Pub Date: 2022-09-12 | DOI: 10.48550/arXiv.2209.05149
J. Sano, Naoki Yamamoto, K. Ueda
Graphs are a generalized concept that encompasses data structures more complex than trees, such as difference lists, doubly-linked lists, skip lists, and leaf-linked trees. Normally, these structures are handled with destructive assignments to heaps, which is at odds with a purely functional programming style and makes verification difficult. We propose a new purely functional language, $\lambda_{GT}$, that handles graphs as immutable, first-class data structures with a pattern-matching mechanism based on Graph Transformation, and we develop a new type system, $F_{GT}$, for the language. Our approach contrasts with the analysis of pointer-manipulating programs using separation logic, shape analysis, etc., in that (i) we do not consider destructive operations but rather pattern matching over graphs, provided by the new higher-level language that abstracts pointers and heaps away, and (ii) we pursue which properties can be established automatically using a rather simple typing framework.
{"title":"Type checking data structures more complex than trees","authors":"J. Sano, Naoki Yamamoto, K. Ueda","doi":"10.48550/arXiv.2209.05149","DOIUrl":"https://doi.org/10.48550/arXiv.2209.05149","url":null,"abstract":"Graphs are a generalized concept that encompasses more complex data structures than trees, such as difference lists, doubly-linked lists, skip lists, and leaf-linked trees. Normally, these structures are handled with destructive assignments to heaps, which is opposed to a purely functional programming style and makes verification difficult. We propose a new purely functional language, $lambda_{GT}$, that handles graphs as immutable, first-class data structures with a pattern matching mechanism based on Graph Transformation and developed a new type system, $F_{GT}$, for the language. Our approach is in contrast with the analysis of pointer manipulation programs using separation logic, shape analysis, etc. in that (i) we do not consider destructive operations but pattern matchings over graphs provided by the new higher-level language that abstract pointers and heaps away and that (ii) we pursue what properties can be established automatically using a rather simple typing framework.","PeriodicalId":430763,"journal":{"name":"J. Inf. Process.","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124625875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nearest Neighbor Non-autoregressive Text Generation
Pub Date: 2022-08-26 | DOI: 10.48550/arXiv.2208.12496
Ayana Niwa, Sho Takase, Naoaki Okazaki
Non-autoregressive (NAR) models can generate sentences with less computation than autoregressive models but sacrifice generation quality. Previous studies addressed this issue through iterative decoding. This study proposes using nearest neighbors as the initial state of an NAR decoder and editing them iteratively. We present a novel training strategy to learn the edit operations on neighbors so as to improve NAR text generation. Experimental results show that the proposed method (NeighborEdit) achieves higher translation quality (1.69 points higher than the vanilla Transformer) with fewer decoding iterations (one-eighteenth fewer iterations) on the JRC-Acquis En-De dataset, the common benchmark dataset for machine translation using nearest neighbors. We also confirm the effectiveness of the proposed method on a data-to-text task (WikiBio). In addition, the proposed method outperforms an NAR baseline on the WMT'14 En-De dataset. We also report an analysis of the neighbor examples used in the proposed method.
{"title":"Nearest Neighbor Non-autoregressive Text Generation","authors":"Ayana Niwa, Sho Takase, Naoaki Okazaki","doi":"10.48550/arXiv.2208.12496","DOIUrl":"https://doi.org/10.48550/arXiv.2208.12496","url":null,"abstract":"Non-autoregressive (NAR) models can generate sentences with less computation than autoregressive models but sacrifice generation quality. Previous studies addressed this issue through iterative decoding. This study proposes using nearest neighbors as the initial state of an NAR decoder and editing them iteratively. We present a novel training strategy to learn the edit operations on neighbors to improve NAR text generation. Experimental results show that the proposed method (NeighborEdit) achieves higher translation quality (1.69 points higher than the vanilla Transformer) with fewer decoding iterations (one-eighteenth fewer iterations) on the JRC-Acquis En-De dataset, the common benchmark dataset for machine translation using nearest neighbors. We also confirm the effectiveness of the proposed method on a data-to-text task (WikiBio). In addition, the proposed method outperforms an NAR baseline on the WMT'14 En-De dataset. We also report analysis on neighbor examples used in the proposed method.","PeriodicalId":430763,"journal":{"name":"J. Inf. Process.","volume":"140 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126896782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving Logical-Level Natural Language Generation with Topic-Conditioned Data Augmentation and Logical Form Generation
Pub Date: 2021-12-12 | DOI: 10.2197/ipsjjip.31.332
Ao Liu, Congjian Luo, Naoaki Okazaki
Logical natural language generation, i.e., generating textual descriptions that can be logically entailed by a structured table, has been challenging because of the low fidelity of the generated text. Chen et al. (2020) addressed this problem by annotating interim logical programs to control the generation content and semantics, and presented the task of table-aware logical-form-to-text (Logic2text) generation. However, although table instances are abundant in the real world, logical forms paired with textual descriptions require costly human annotation, which limits the performance of neural models. To mitigate this, we propose topic-conditioned data augmentation (TopicDA), which uses GPT-2 to generate unpaired logical forms and textual descriptions directly from tables. We further introduce logical form generation (LG), a dual task of Logic2text that requires generating a valid logical form from the text description of a table. We also propose a semi-supervised learning approach to jointly train a Logic2text model and an LG model on both labeled and augmented data. The two models benefit from each other by providing extra supervision signals through back-translation. Experimental results on the Logic2text dataset and the LG task demonstrate that our approach can effectively utilize the augmented data and outperforms supervised baselines by a substantial margin.
Localization with Portable APs in Ultra-Narrow-Band-based LPWA Networks
Pub Date: 2021-02-15 | DOI: 10.2197/ipsjjip.29.149
Miya Fukumoto, Takuya Yoshihiro
For IoT applications, LPWA is a useful communication choice that enables tiny devices spread over the land to be connected to the Internet. Since many low-price IoT devices need to work within a limited power budget, this kind of low-power, long-range communication technique is a strong tool for spreading IoT deployment. Because LPWA devices offer limited functionality, localization of devices is one of the important practical problems. UNB (Ultra Narrow Band)-based LPWA networks such as Sigfox are among the major LPWA services for IoT applications and have a communication range of more than 10 km. However, due to the long-range communication and the properties of UNB-based modulation, state-of-the-art high-accuracy localization techniques cannot be used; UNB-based LPWA must rely on simple methods based on RSSI (Received Signal Strength Indicator), which involve large position-estimation errors. In this paper, we propose a method to improve the accuracy of device localization in UNB-based LPWA networks by utilizing portable access points (APs). By introducing a distance-based weighting technique, we improve the localization accuracy with a combination of stationary and portable APs. We demonstrate that the portable APs and the new weighting technique work effectively in UNB-based LPWA networks.
An Adaptive Traffic Signal Control Scheme Based on Back-pressure with Global Information
Pub Date: 2021-02-15 | DOI: 10.2197/ipsjjip.29.124
Arnan Maipradit, Tomoya Kawakami, Ying Liu, Juntao Gao, Minuro Ito
Nowadays traffic congestion has increasingly become a significant problem, which results in longer travel times and aggravates air pollution. Previous work has shown that back-pressure-based traffic control algorithms can effectively reduce traffic congestion. However, those works control traffic based on either inaccurate traffic information or local traffic information, which causes inefficient traffic scheduling. In this paper, we propose an adaptive traffic control algorithm based on back-pressure and Q-learning, which can efficiently reduce congestion. Our algorithm controls traffic based on accurate real-time traffic information and global traffic information learned by Q-learning. As verified by simulation, our algorithm significantly decreases average vehicle traveling time, by 17% to 38%, when compared with a state-of-the-art algorithm under the tested scenarios.
{"title":"An Adaptive Traffic Signal Control Scheme Based on Back-pressure with Global Information","authors":"Arnan Maipradit, Tomoya Kawakami, Ying Liu, Juntao Gao, Minuro Ito","doi":"10.2197/ipsjjip.29.124","DOIUrl":"https://doi.org/10.2197/ipsjjip.29.124","url":null,"abstract":": Nowadays tra ffi c congestion has increasingly been a significant problem, which results in a longer travel time and aggravates air pollution. Available works showed that back-pressure based tra ffi c control algorithms can ef-fectively reduce tra ffi c congestion. However, those works control tra ffi c based on either inaccurate tra ffi c information or local tra ffi c information, which causes ine ffi cient tra ffi c scheduling. In this paper, we propose an adaptive tra ffi c control algorithm based on back-pressure and Q-learning, which can e ffi ciently reduce congestion. Our algorithm controls tra ffi c based on accurate real-time tra ffi c information and global tra ffi c information learned by Q-learning. As verified by simulation, our algorithm significantly decreases average vehicle traveling time from 17% to 38% when compared with a state-of-the-art algorithm under tested scenarios.","PeriodicalId":430763,"journal":{"name":"J. Inf. Process.","volume":"2013 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121359976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Split-Paper Testing: A Novel Approach to Evaluate Programming Performance
Pub Date: 2020-10-29 | DOI: 10.2197/ipsjjip.28.733
Yasuichi Nakayama, Y. Kuno, Hiroyasu Kakuda
There is a great need to evaluate and/or test programming performance. For this purpose, two schemes have been used. Constructed response (CR) tests let the examinee write programs on a blank sheet (or with a computer keyboard). This scheme can evaluate programming performance. However, it is difficult to apply at large scale because skilled human graders are required (automatic evaluation has been attempted but is not yet widely used). Multiple choice (MC) tests let the examinee choose the correct answer from a list (often corresponding to a "hidden" portion of a complete program). This scheme can be used at large scale with computer-based testing or mark-sense cards. However, many teachers and researchers are suspicious of it, because a good score does not necessarily mean the ability to write programs from scratch. We propose a third method, split-paper (SP) testing. Our scheme splits a correct program into its individual lines, shuffles the lines, adds "wrong answer" lines, and prepends choice symbols to them. The examinee answers with the list of choice symbols corresponding to the correct program, which can easily be graded automatically by computer. In particular, we propose the use of edit distance (Levenshtein distance) in the scoring scheme, which seems to have an affinity with the SP scheme. The research question is whether SP tests scored with an edit-distance-based scoring scheme measure programming performance as CR tests do. We therefore conducted an experiment in college programming classes with 60 students to compare SP tests against CR tests. As a result, SP and CR test scores were correlated across multiple settings, and the results were statistically significant. Therefore, we might conclude that SP tests with automatic scoring using edit distance are useful tools for evaluating programming performance.
CENTAURUS: A Dynamic Parser Generator for Parallel Ad Hoc Data Extraction
Pub Date: 2020-10-23 | DOI: 10.2197/ipsjjip.28.724
Shigeyuki Sato, Hiroka Ihara, K. Taura
It is important to handle large-scale data in text formats such as XML, JSON, and CSV because such data very often appear in data exchange. For these data, ad hoc data extraction is highly desirable instead of ingesting the data into databases. The main issue in ad hoc data extraction is to provide both the programmability to handle various types of data intuitively and the performance required for large-scale data. To pursue this, we develop Centaurus, a dynamic parser generator library for parallel ad hoc data extraction. This paper presents the design and implementation of Centaurus. Experimental results on ad hoc data extraction demonstrate that Centaurus outperformed fast dedicated parser libraries in C++ for XML and JSON, and achieved excellent scalability with actions implemented in Python.