BenchCouncil Transactions on Benchmarks, Standards and Evaluations: latest articles

Enhanced deep learning based decision support system for kidney tumour detection
Pub Date : 2024-06-01 DOI: 10.1016/j.tbench.2024.100174

This study presents a high-accuracy deep learning-based decision support system for kidney cancer detection. The research uses a relatively large dataset of 10,000 CT images, including both healthy and tumour-detected kidney scans. After data preprocessing and optimization, several deep learning architectures (AlexNet, EfficientNet, Darknet-53, Xception, and DenseNet-201) were compared across different learning rates, with DenseNet-201 emerging as the top performer at an accuracy of 99.75%. Performance metrics such as accuracy, precision, sensitivity, F1-score, and specificity are analysed using confusion matrices. The proposed system outperforms the other deep learning networks evaluated, and the improvement is attributed to effective data engineering and hyperparameter optimization of the networks. This research contributes to the field of medical image analysis by providing a robust decision support tool for early and rapid diagnosis of kidney cancer. The high accuracy and efficiency of the proposed system make it a promising aid for healthcare professionals in clinical settings.
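
The five metrics named in the abstract all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the class name and the counts are illustrative assumptions and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix; 'positive' means a tumour was detected."""
    tp: int  # tumour scans correctly flagged
    fp: int  # healthy scans wrongly flagged
    tn: int  # healthy scans correctly cleared
    fn: int  # tumour scans missed

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def sensitivity(self) -> float:  # also called recall
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def f1_score(self) -> float:
        p, r = self.precision(), self.sensitivity()
        return 2 * p * r / (p + r)


# Hypothetical counts for illustration only; not results from the paper.
cm = ConfusionMatrix(tp=90, fp=5, tn=95, fn=10)
print(f"accuracy={cm.accuracy():.3f}  precision={cm.precision():.3f}")
print(f"sensitivity={cm.sensitivity():.3f}  specificity={cm.specificity():.3f}")
print(f"F1={cm.f1_score():.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters in a screening setting, because a model can reach high accuracy on an imbalanced dataset while still missing tumour cases.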

Citations: 0
Analyzing the impact of opportunistic maintenance optimization on manufacturing industries in Bangladesh: An empirical study
Pub Date : 2024-06-01 DOI: 10.1016/j.tbench.2024.100172

The study investigates the impact of opportunistic maintenance (OM) optimization on manufacturing industries, especially in Bangladesh, with the aim of reducing maintenance costs. To that end, OM strategies are proposed and optimized for multi-unit manufacturing systems, whereas most existing research addresses single- or two-unit systems. Each OM strategy in this research follows one of three policies: preventive replacement, preventive repair, or a two-level maintenance approach. The proposed two-level approach combines lower-level maintenance (preventive repair) with higher-level maintenance (preventive replacement). Simulation optimization (SO) techniques implemented in Python were used to evaluate the strategies. Historical data from two of Bangladesh's most promising and significant sectors, the footwear and railway industries, served as the case studies. Compared with the corrective maintenance approach currently in use, the two-level maintenance approach is the most effective in both case studies, with cost savings of 16.9% and 22.4% for the footwear and railway industries, respectively. The study shows that manufacturing industries can achieve significant cost savings by implementing the proposed OM strategies, a concept that has yet to be explored in developing countries like Bangladesh. However, the study applied the proposed approaches only to the major components of each system; greater benefits could be achieved by applying them to all critical components.
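
One way to picture the two-level policy is as a pair of age thresholds that are checked whenever a maintenance opportunity arises, for example when the line is already stopped for another unit. The sketch below is a minimal illustration under that assumption; the function names, thresholds, and costs are hypothetical and are not the authors' model.

```python
from enum import Enum


class Action(Enum):
    NONE = "no action"
    PREVENTIVE_REPAIR = "preventive repair"            # lower-level maintenance
    PREVENTIVE_REPLACEMENT = "preventive replacement"  # higher-level maintenance


def two_level_action(age: float, repair_age: float, replace_age: float) -> Action:
    """Choose what to do with a still-working unit at a maintenance opportunity.

    repair_age and replace_age are hypothetical decision thresholds with
    repair_age <= replace_age; a simulation optimizer would tune them.
    """
    if repair_age > replace_age:
        raise ValueError("repair_age must not exceed replace_age")
    if age >= replace_age:
        return Action.PREVENTIVE_REPLACEMENT
    if age >= repair_age:
        return Action.PREVENTIVE_REPAIR
    return Action.NONE


def cost_saving_percent(baseline_cost: float, policy_cost: float) -> float:
    """Saving of a policy relative to the corrective-maintenance baseline."""
    return 100.0 * (baseline_cost - policy_cost) / baseline_cost


# Made-up ages (in operating hours) and costs, for illustration only.
for unit_age in (120.0, 450.0, 900.0):
    print(unit_age, two_level_action(unit_age, repair_age=400.0, replace_age=800.0).value)
print(round(cost_saving_percent(baseline_cost=1000.0, policy_cost=800.0), 1))
```

A simulation optimization loop would evaluate many threshold pairs against simulated failure histories and keep the pair with the lowest total cost.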

Citations: 0
BinCodex: A comprehensive and multi-level dataset for evaluating binary code similarity detection techniques
Pub Date : 2024-06-01 DOI: 10.1016/j.tbench.2024.100163
Peihua Zhang, Chenggang Wu, Zhe Wang

Binary code similarity detection (BCSD) quantitatively measures the differences between two given binaries and reports matches at a predefined granularity (e.g., function). It is widely used in scenarios including software vulnerability search, security patch analysis, malware detection, and code clone detection. With the help of deep learning, BCSD techniques have achieved high accuracy in their own evaluations. However, without a standard dataset, these high accuracy figures are hard to tell apart and do not reveal the techniques' real abilities. In addition, since binary code can be easily changed, it is essential to gain a holistic understanding of the underlying transformations, including default optimization options, non-default optimization options, and commonly used code obfuscations, and to assess their impact on the accuracy and adaptability of BCSD techniques. This paper presents our observations regarding the diversity of BCSD datasets and proposes a comprehensive dataset for BCSD. We evaluate various BCSD works in detail, applying different classifications for different types of BCSD tasks, including pure function pairing and vulnerable code detection. Our results show that most BCSD works cope with default compiler options but perform unsatisfactorily on non-default compiler options and code obfuscation. We take a layered perspective on the BCSD task and point to opportunities for future optimizations in the technologies we consider.
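
To make function-level matching concrete, the sketch below scores two functions by the Jaccard similarity of their opcode bigrams. This is a deliberately simple lexical baseline chosen for illustration, not one of the deep learning techniques the paper evaluates, and the opcode sequences are made up.

```python
def opcode_ngrams(opcodes: list[str], n: int = 2) -> set[tuple[str, ...]]:
    """Set of consecutive opcode n-grams for one function."""
    return {tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}


def jaccard_similarity(a: list[str], b: list[str], n: int = 2) -> float:
    """Jaccard similarity in [0, 1] between the n-gram sets of two functions."""
    grams_a, grams_b = opcode_ngrams(a, n), opcode_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two empty functions are treated as identical
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def best_match(query: list[str], candidates: dict[str, list[str]]) -> str:
    """Name of the candidate function most similar to the query."""
    return max(candidates, key=lambda name: jaccard_similarity(query, candidates[name]))


# Made-up opcode sequences: one function at two optimization levels, plus a decoy.
f_o0 = ["push", "mov", "mov", "add", "pop", "ret"]
f_o2 = ["mov", "add", "ret"]
other = ["push", "call", "test", "jne", "pop", "ret"]

print(jaccard_similarity(f_o0, f_o2))
print(best_match(f_o2, {"f_o0": f_o0, "other": other}))
```

Even in this toy case the optimized and unoptimized versions share only one bigram, which hints at why compiler options and obfuscation are hard for similarity detectors.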

Citations: 0
TensorTable: Extending PyTorch for mixed relational and linear algebra pipelines
Pub Date : 2024-03-01 DOI: 10.1016/j.tbench.2024.100161
Xu Wen

Pipelines that mix relational algebra (RA) and linear algebra (LA) have become increasingly common in recent years. However, widely used contemporary frameworks struggle to support both RA and LA operators effectively and fail to ensure optimal end-to-end performance because of the cost of LA operators and data conversion. This underscores the demand for a system that integrates RA and LA seamlessly while delivering robust end-to-end performance. This paper proposes TensorTable, a tensor system that extends PyTorch to enable mixed RA and LA pipelines. TensorTable serves as the unified data representation, storing data in a tensor format to prioritize the performance of LA operators and reduce data conversion costs. Relational tables from RA, as well as vectors, matrices, and tensors from LA, can be seamlessly converted into TensorTables. Additionally, we provide TensorTable-based implementations of RA operators and build a system that supports mixed LA and RA pipelines. We implement TensorTable on top of PyTorch, achieving comparable performance for both RA and LA operators, particularly on small datasets. Compared with the state-of-the-art frameworks AIDA and RMA, TensorTable achieves a 1.15x to 5.63x speedup for mixed pipelines.
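
The core idea, keeping relational data in one dense numeric layout so that linear algebra can run on it without conversion, can be sketched in a few lines. The toy class below is an assumption-laden illustration: nested Python lists stand in for a PyTorch tensor, and the class and method names are hypothetical rather than TensorTable's actual API.

```python
class TinyTensorTable:
    """Toy stand-in for the idea: rows live in one dense numeric 2-D array.

    Nested lists replace a real tensor here; this is not the paper's API.
    """

    def __init__(self, columns: list[str], rows: list[list[float]]):
        self.columns = columns
        self.rows = rows

    def select(self, column: str, predicate) -> "TinyTensorTable":
        """Relational selection: keep rows whose column value passes the predicate."""
        j = self.columns.index(column)
        return TinyTensorTable(self.columns, [r for r in self.rows if predicate(r[j])])

    def project(self, columns: list[str]) -> "TinyTensorTable":
        """Relational projection: keep only the named columns."""
        idx = [self.columns.index(c) for c in columns]
        return TinyTensorTable(columns, [[r[j] for j in idx] for r in self.rows])

    def matvec(self, weights: list[float]) -> list[float]:
        """Linear algebra step: multiply the row matrix by a weight vector."""
        return [sum(x * w for x, w in zip(r, weights)) for r in self.rows]


# Made-up data: filter and project relationally, then apply a linear model.
table = TinyTensorTable(
    ["age", "height", "weight"],
    [[25.0, 1.80, 70.0], [40.0, 1.70, 82.0], [31.0, 1.65, 58.0]],
)
scores = (
    table.select("age", lambda a: a > 30)
    .project(["height", "weight"])
    .matvec([10.0, 0.5])
)
print(scores)
```

In a tensor-backed version, the selection would typically become a boolean mask and the final step a single matrix-vector product, so no data leaves the tensor format between the RA and LA stages.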

Citations: 0
Evaluatology: The science and engineering of evaluation
Pub Date : 2024-03-01 DOI: 10.1016/j.tbench.2024.100162
Jianfeng Zhan, Lei Wang, Wanling Gao, Hongxiao Li, Chenxi Wang, Yunyou Huang, Yatao Li, Zhengxin Yang, Guoxin Kang, Chunjie Luo, Hainan Ye, Shaopeng Dai, Zhifei Zhang

Evaluation is a crucial aspect of human existence and plays a vital role in each field. However, it is often approached in an empirical and ad-hoc manner, lacking consensus on universal concepts, terminologies, theories, and methodologies. This lack of agreement has significant consequences. This article aims to formally introduce the discipline of evaluatology, which encompasses the science and engineering of evaluation. We propose a universal framework for evaluation, encompassing concepts, terminologies, theories, and methodologies that can be applied across various disciplines, if not all disciplines.

Our research reveals that the essence of evaluation lies in conducting experiments that intentionally apply a well-defined evaluation condition to individuals or systems under scrutiny, which we refer to as the subjects. This process allows for the creation of an evaluation system or model. By measuring and/or testing this evaluation system or model, we can infer the impact of different subjects. Derived from the essence of evaluation, we propose five axioms focusing on key aspects of evaluation outcomes as the foundational evaluation theory. These axioms serve as the bedrock upon which we build universal evaluation theories and methodologies. When evaluating a single subject, it is crucial to create evaluation conditions with different levels of equivalency. By applying these conditions to diverse subjects, we can establish reference evaluation models. These models allow us to alter a single independent variable at a time while keeping all other variables as controls. When evaluating complex scenarios, the key lies in establishing a series of evaluation models that maintain transitivity. Building upon the science of evaluation, we propose a formal definition of a benchmark as a simplified and sampled evaluation condition that guarantees different levels of equivalency. This concept serves as the cornerstone for a universal benchmark-based engineering approach to evaluation across various disciplines, which we refer to as benchmarkology.

Citations: 0
An approach to workload generation for modern data centers: A view from Alibaba trace
Pub Date : 2024-03-01 DOI: 10.1016/j.tbench.2024.100164
Yi Liang, Nianyi Ruan, Lan Yi, Xing Su

Modern data centers provide the foundational infrastructure of cloud computing. Workload generation, which involves simulating or constructing tasks and transactions to replicate the actual resource usage patterns of real-world systems or applications, plays an essential role in efficient resource management in these centers. Data center traces, rich in information about workload execution and resource utilization, are thus ideal data for workload generation. Traditional traces provide detailed temporal resource usage data that enable fine-grained workload generation. However, modern data centers tend to favor tracing statistical metrics to reduce overhead. Accurately reconstructing temporal resource consumption without detailed, time-stamped trace information therefore becomes a major challenge for trace-based workload generation. To address this challenge, we propose STWGEN, a novel method that leverages statistical trace data for workload generation. STWGEN is specifically designed to generate batch task workloads based on the Alibaba trace. It contains two key components: a suite of flexible workload building blocks written as C programs, and a heuristic strategy that assembles the building blocks for workload generation. Both components are carefully designed to produce synthetic batch tasks that closely replicate the resource usage patterns observed in a representative data center. Experimental results demonstrate that STWGEN outperforms state-of-the-art workload generation methods, emulating workload-level and machine-level resource usage with much higher accuracy.
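
The abstract does not describe the heuristic itself, so the sketch below only illustrates the general shape of the problem: picking building blocks whose combined average CPU demand approaches a target taken from statistical trace data. The greedy rule, the `Block` type, and all numbers are assumptions made for illustration and should not be read as STWGEN's algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical building block with a known average CPU demand (in cores)."""
    name: str
    avg_cpu: float


def assemble(blocks: list[Block], target_avg_cpu: float) -> list[Block]:
    """Greedy heuristic: repeatedly add the largest block that still fits
    under the target average CPU. Blocks may be reused."""
    chosen: list[Block] = []
    remaining = target_avg_cpu
    for block in sorted(blocks, key=lambda b: b.avg_cpu, reverse=True):
        # The small tolerance absorbs floating-point error in the subtraction.
        while block.avg_cpu > 0 and block.avg_cpu <= remaining + 1e-9:
            chosen.append(block)
            remaining -= block.avg_cpu
    return chosen


# Made-up blocks and target; a real target would come from trace statistics.
library = [Block("spin", 1.0), Block("half", 0.5), Block("light", 0.1)]
plan = assemble(library, target_avg_cpu=2.7)
print([b.name for b in plan], round(sum(b.avg_cpu for b in plan), 2))
```

A fuller heuristic would also have to match other statistics at the same time, such as peak usage, memory, and task duration, which is where a single-metric greedy rule stops being sufficient.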

Citations: 0
Benchmarking ChatGPT for Prototyping Theories: Experimental Studies Using the Technology Acceptance Model
Pub Date : 2024-02-01 DOI: 10.1016/j.tbench.2024.100153
Yanwu Yang, T. Goh, Xin Dai
Citations: 0
A pluggable single-image super-resolution algorithm based on second-order gradient loss
Pub Date : 2023-12-01 DOI: 10.1016/j.tbench.2023.100148
Shuran Lin, Chunjie Zhang, Yanwu Yang
Citations: 0
Analyzing the potential benefits and use cases of ChatGPT as a tool for improving the efficiency and effectiveness of business operations
Pub Date : 2023-09-01 DOI: 10.1016/j.tbench.2023.100140
Rohit Raj, Arpit Singh, Vimal Kumar, Pratima Verma

The study examines the potential benefits for companies of adopting ChatGPT, a popular chatbot built on a large-scale transformer-based language model known as a generative pre-trained transformer (GPT). Chatbots like ChatGPT may improve customer service, handle several client inquiries at once, and reduce operational costs. Moreover, ChatGPT may automate routine processes such as order tracking and billing, allowing human employees to focus on more complex and strategic responsibilities. Nevertheless, before deploying ChatGPT, enterprises must carefully analyze its use cases and restrictions, as well as its strengths and weaknesses. ChatGPT, for example, requires training data specific to the business domain and may produce erroneous or ambiguous results. Drawing on the currently available literature on ChatGPT, large language models, and artificial intelligence, the study identifies areas in which deploying ChatGPT may benefit enterprises. The potential advantages are then assessed and prioritized using the PSI (Preference Selection Index) and COPRAS (Complex Proportional Assessment) approaches. By highlighting current trends and possible advantages in the industry, this editorial seeks to provide insight into the present state of employing ChatGPT in enterprises and research. ChatGPT may also learn biases from training data and generate replies that reinforce those biases. As a result, enterprises must train and fine-tune ChatGPT for specific operations, set explicit boundaries and limitations for its use, and implement appropriate security measures to guard against malicious input. The study addresses the research gap left by the dearth of literature by outlining ChatGPT's potential benefits for businesses, analyzing its strengths and limits, and offering insights into how organizations might use ChatGPT's capabilities to enhance their operations.

Citations: 4