Li Yuxiang, Zhang Zhiyong, Wang Xinyong, Huang Shuaina, Su Yaning
{"title":"IDaTPA: importance degree based thread partitioning approach in thread level speculation","authors":"Li Yuxiang, Zhang Zhiyong, Wang Xinyong, Huang Shuaina, Su Yaning","doi":"10.1007/s10791-024-09440-x","DOIUrl":null,"url":null,"abstract":"<p>As an auto-parallelization technique with the level of thread on multi-core, Thread-Level Speculation (TLS) which is also called Speculative Multithreading (SpMT), partitions programs into multiple threads and speculatively executes them under conditions of ambiguous data and control dependence. Thread partitioning approach plays a key role to the performance enhancement in TLS. The existing heuristic rules-based approach (HR-based approach) which is an one-size-fits-all strategy, can not guarantee to achieve the satisfied thread partitioning. In this paper, an importance <u>d</u>egree b<u>a</u>sed <u>t</u>hread <u>p</u>artitioning <u>a</u>pproach (IDaTPA) is proposed to realize the partition of irregular programs into multithreads. IDaTPA implements biasing partitioning for every procedure with a machine learning method. It mainly includes: constructing sample set, expression of knowledge, calculation of similarity, prediction model and the partition of the irregular programs is performed by the prediction model. Using IDaTPA, the subprocedures in unseen irregular programs can obtain their satisfied partition. On a generic SpMT processor (called Prophet) to perform the performance evaluation for multithreaded programs, the IDaTPA is evaluated and averagely delivers a speedup of 1.80 upon a 4-core processor. Furthermore, in order to obtain the portability evaluation of IDaTPA, we port IDaTPA to 8-core processor and obtain a speedup of 2.82 on average. Experiment results show that IDaTPA obtains a significant speedup increasement and Olden benchmarks respectively deliver a 5.75% performance improvement on 4-core and a 6.32% performance improvement on 8-core, and SPEC2020 benchmarks obtain a 38.20% performance improvement than the conventional HR-based approach.</p>","PeriodicalId":54352,"journal":{"name":"Information Retrieval Journal","volume":"50 1","pages":""},"PeriodicalIF":1.7000,"publicationDate":"2024-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Retrieval Journal","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10791-024-09440-x","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
As an auto-parallelization technique with the level of thread on multi-core, Thread-Level Speculation (TLS) which is also called Speculative Multithreading (SpMT), partitions programs into multiple threads and speculatively executes them under conditions of ambiguous data and control dependence. Thread partitioning approach plays a key role to the performance enhancement in TLS. The existing heuristic rules-based approach (HR-based approach) which is an one-size-fits-all strategy, can not guarantee to achieve the satisfied thread partitioning. In this paper, an importance degree based thread partitioning approach (IDaTPA) is proposed to realize the partition of irregular programs into multithreads. IDaTPA implements biasing partitioning for every procedure with a machine learning method. It mainly includes: constructing sample set, expression of knowledge, calculation of similarity, prediction model and the partition of the irregular programs is performed by the prediction model. Using IDaTPA, the subprocedures in unseen irregular programs can obtain their satisfied partition. On a generic SpMT processor (called Prophet) to perform the performance evaluation for multithreaded programs, the IDaTPA is evaluated and averagely delivers a speedup of 1.80 upon a 4-core processor. Furthermore, in order to obtain the portability evaluation of IDaTPA, we port IDaTPA to 8-core processor and obtain a speedup of 2.82 on average. Experiment results show that IDaTPA obtains a significant speedup increasement and Olden benchmarks respectively deliver a 5.75% performance improvement on 4-core and a 6.32% performance improvement on 8-core, and SPEC2020 benchmarks obtain a 38.20% performance improvement than the conventional HR-based approach.
期刊介绍:
The journal provides an international forum for the publication of theory, algorithms, analysis and experiments across the broad area of information retrieval. Topics of interest include search, indexing, analysis, and evaluation for applications such as the web, social and streaming media, recommender systems, and text archives. This includes research on human factors in search, bridging artificial intelligence and information retrieval, and domain-specific search applications.