
Journal of Computer Science and Technology — Latest Publications

A Dataset and Post-Processing Method for Pointing Device Human-Machine Interface Evaluation
CAS Tier 3, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2023-10-25 | DOI: 10.24215/16666038.23.e11
Rocío Madou, Federico N. Guerrero, Enrique M. Spinelli
The evaluation of human-machine interfaces (HMIs) requires quantitative metrics that capture a person's ability to effectively achieve their goals using the HMI. In particular, for pointing-device HMIs such as the computer mouse, an experiment that quantifies movement through repetitive target selections allows defining a useful metric known as throughput (TP) via the Fitts' Law test. In this work, a dataset obtained from an automated protocol application is presented and made publicly available through an online platform. A post-processing method to obtain performance parameters from the dataset is also presented, and its output is used to validate the data against similar experiments in the literature.
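The throughput metric from the Fitts' Law test can be sketched as follows. This is a minimal illustration of the standard Shannon formulation (ID = log2(D/W + 1) bits, TP = ID/MT), not the authors' post-processing method; the trial values are hypothetical.

```python
import math

def index_of_difficulty(distance, width):
    """Shannon formulation of Fitts' index of difficulty, in bits."""
    return math.log2(distance / width + 1)

def mean_throughput(trials):
    """trials: iterable of (distance_px, target_width_px, movement_time_s).
    Returns the mean of per-trial ID/MT, in bits per second."""
    rates = [index_of_difficulty(d, w) / mt for d, w, mt in trials]
    return sum(rates) / len(rates)

# hypothetical target-selection trials
trials = [(256, 32, 1.5), (512, 32, 2.0)]
tp = mean_throughput(trials)        # ~2.08 bits/s
```

In practice the effective values (distance and width corrected by the spread of selection endpoints) are used, but the computation has the same shape.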
Citations: 0
Publication of Linked Open Data – A Systematic Literature Review for Identifying Problems and Technical Tools Supporting the Process
CAS Tier 3, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2023-10-25 | DOI: 10.24215/16666038.23.e16
Jairo H. Silva Aguilar, Rommel Torres T., Elsa Estevez
On the Internet, we find a large amount of information from government institutions that has been published in open format. However, only a part of these data is available in standard formats such as Resource Description Framework (RDF), and to a lesser extent is published as Linked Open Data (LOD). The main objective of the research presented in this paper is to identify the problems and tools involved in the process of publishing LOD, with the purpose of establishing a basis for a future framework that will help public institutions facilitate such processes. To fulfill this objective, we conducted a systematic literature review to assess the state of the art on this matter. The contribution of this work is to identify the frequent problems that arise in the LOD publishing process. It also details the frameworks proposed in scientific papers, grouping the technical tools by the phases of the LOD publication life cycle. In addition, it compiles the characteristics of the ETL (Extract-Transform-Load) tools that predominate in this review, such as Pentaho Data Integration (Kettle) and OpenRefine.
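As a minimal illustration of the transform phase of such an ETL pipeline (not of Kettle or OpenRefine themselves), the sketch below turns a CSV fragment of open data into RDF N-Triples; the namespace and column names are hypothetical.

```python
import csv
import io

BASE = "http://example.org/resource/"   # hypothetical namespace

def rows_to_ntriples(csv_text, subject_col):
    """Transform step of a toy ETL pipeline: each CSV row becomes one RDF
    resource; every other non-empty column becomes a literal-valued triple."""
    reader = csv.DictReader(io.StringIO(csv_text))
    triples = []
    for row in reader:
        subj = f"<{BASE}{row[subject_col].replace(' ', '_')}>"
        for col, val in row.items():
            if col == subject_col or not val:
                continue
            pred = f"<{BASE}property/{col}>"
            triples.append(f'{subj} {pred} "{val}" .')
    return triples
```

A real pipeline would additionally reuse standard vocabularies and mint stable URIs, which is where many of the surveyed problems arise.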
Citations: 0
Intermediate Task Fine-Tuning in Cancer Classification
CAS Tier 3, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2023-10-25 | DOI: 10.24215/16666038.23.e12
Mario Alejandro García, Martín Nicolás Gramática, Juan Pablo Ricapito
Reducing the amount of annotated data required to train predictive models is one of the main challenges in applying artificial intelligence to histopathology. In this paper, we propose a method to enhance the performance of deep learning models trained with limited data in the field of digital pathology. The method relies on a two-stage transfer learning process, where an intermediate model serves as a bridge between a model pre-trained on ImageNet and the final cancer classification model. The intermediate model is fine-tuned on a dataset of over 4,000,000 images weakly labeled with clinical data extracted from the TCGA program. The model obtained through the proposed method significantly outperforms a model trained with a traditional transfer learning process.
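The two-stage idea can be illustrated on a toy model: first train on plentiful but weakly labeled data, then fine-tune briefly on scarce accurate labels. This is a minimal one-parameter sketch of the training schedule, not the paper's deep learning pipeline; all data values are hypothetical.

```python
def fit(w, data, lr, steps):
    """One-parameter linear model y = w*x trained by gradient descent on MSE."""
    for _ in range(steps):
        grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
        w -= lr * grad
    return w

# stage 1: long "intermediate" training on large, weakly labeled data (labels ~ 2.8*x)
weak = [(1.0, 2.8), (2.0, 5.6)]
# stage 2: short fine-tuning on small, accurately labeled target data (labels = 3*x)
target = [(1.0, 3.0), (2.0, 6.0)]

w_two_stage = fit(fit(0.0, weak, 0.05, 200), target, 0.05, 5)
w_scratch = fit(0.0, target, 0.05, 5)   # same target budget, no intermediate stage
```

With the same small target-task budget, the two-stage model starts much closer to the target solution, which is the effect the paper exploits at scale.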
Citations: 0
A Model of Reusable Assets in AIE Software Systems
CAS Tier 3, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2023-10-25 | DOI: 10.24215/16666038.23.e13
Agustina Buccella, Alejandra Cechich, Carolina Villegas, Ayelén Montenegro, Angel Muñoz, Andrea Rodriguez
Nowadays, due to the increasing presence of artificial intelligence in software systems, development teams face the challenge of working together to integrate tasks, resources, and roles in a new field named AI Engineering. Proposals, in the form of models, highlight the need to integrate two different perspectives: the software system and the decision-making support system (big data, machine learning, and so on). But there is something more: both systems must achieve high quality levels for different properties, and this is not a straightforward task. Quality properties such as reusability, traditionally evaluated and reinforced through modeling in software systems, do not apply in quite the same way to decision-making support systems. In this paper, we propose a model for managing reusable assets in AI-engineered systems by linking software product line modeling and variety identification. The proposal is exemplified through a case study in the agriculture domain.
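A software product line expresses such reuse through a feature model with variability constraints. The sketch below checks a product configuration against hypothetical `requires` and `excludes` constraints; the feature names are invented for illustration and are not taken from the paper's case study.

```python
# hypothetical mini feature model for reusable AI assets in an agriculture SPL
REQUIRES = {
    "crop_yield_prediction": {"ml_pipeline"},   # selecting a child requires its parent
    "ml_pipeline": {"data_ingestion"},
}
EXCLUDES = {("batch_ingestion", "stream_ingestion")}  # mutually exclusive variants

def valid_configuration(selected):
    """Return True if the selected features satisfy all cross-tree constraints."""
    selected = set(selected)
    for feat in selected:
        if not REQUIRES.get(feat, set()) <= selected:
            return False
    for a, b in EXCLUDES:
        if a in selected and b in selected:
            return False
    return True
```

Each valid configuration corresponds to one derivable product, which is how an SPL scopes which assets may be reused together.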
Citations: 0
A Methodology for Generating Virtual Reality Immersion Metrics based on System Variables
CAS Tier 3, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2023-10-25 | DOI: 10.24215/16666038.23.e08
Matias Selzer, Silvia M. Castro
Technological advances in recent years have promoted the development of virtual reality systems that have a wide variety of hardware and software characteristics, providing varying degrees of immersion. Immersion is an objective property of the virtual reality system that depends on both its hardware and software characteristics. Virtual reality systems are currently attempting to improve immersion as much as possible. However, there is no metric to measure the level of immersion of a virtual reality system based on its characteristics. To date, the influence of these hardware and software variables on immersion has only been considered individually or in small groups. The way these system variables simultaneously affect immersion has not been analyzed either. In this paper, we propose immersion metrics for virtual reality systems based on their hardware and software variables, as well as the development process that led to their formulation. From the conducted experiment and the obtained data, we followed a methodology to generate immersion models based on the variables of the system. The immersion metrics presented in this work offer a useful tool in the area of virtual reality and immersive technologies, not only to measure the immersion of any virtual reality system but also to analyze the relationship and importance of the variables of these systems.
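A minimal sketch of generating an immersion model from a system variable, assuming a simple linear relationship: an ordinary least-squares fit of immersion scores against field of view. The methodology in the paper covers many more variables and their interactions; the data here are invented.

```python
def fit_linear(xs, ys):
    """Ordinary least-squares fit of y ≈ a + b*x, in closed form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

# hypothetical observations: field of view (degrees) vs. an immersion score;
# a real immersion model would combine many hardware and software variables
fov = [60, 90, 110, 120]
score = [4.0, 5.5, 6.5, 7.0]
a, b = fit_linear(fov, score)       # immersion ≈ a + b * fov
```

The fitted coefficients make the contribution of each system variable to immersion explicit, which is what lets such models rank variables by importance.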
Citations: 0
Dalea: A Persistent Multi-Level Extendible Hashing with Improved Tail Performance
IF 1.9 | CAS Tier 3, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2023-09-30 | DOI: 10.1007/s11390-023-2957-8
Zi-Wei Xiong, De-Jun Jiang, Jin Xiong, Ren Ren

Persistent memory (PM) promises byte-addressability, large capacity, and durability. Main memory systems, such as key-value stores and in-memory databases, benefit from these features of PM. Due to the great popularity of hash indexes in main memory systems, a number of research efforts have been made to provide persistent hashing with high average performance. However, suboptimal tail performance, in terms of tail throughput and tail latency, is still observed for existing persistent hashing. In this paper, we analyze the major sources of suboptimal tail performance arising from key design issues of persistent hashing. We identify the global hash structure and concurrency control as the remaining explorable design spaces for improving tail performance. We propose Directory-sharing Multi-level Extendible Hashing (Dalea) for PM. Dalea designs ancestor-link-based extendible hashing as well as fine-grained transient locks to address the two main sources (rehashing and locking) affecting tail performance. The evaluation results show that, compared with the state-of-the-art persistent hashing scheme Dash, Dalea increases tail throughput by 4.1x and reduces tail latency by 5.4x. Moreover, in order to provide design guidelines for improving tail performance, we adopt Dalea as a testbed to identify the different impacts of four factors on tail performance: fine-grained rehashing, transient locking, memory pre-allocation, and fingerprinting.
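For background, classic extendible hashing (the structure Dalea builds on) grows by doubling a directory of bucket pointers and splitting one bucket at a time, so most inserts touch only a single bucket. The sketch below is a minimal in-memory version of that classic scheme, not Dalea's persistent, multi-level, ancestor-link design.

```python
class Bucket:
    def __init__(self, depth, capacity=2):
        self.depth = depth            # local depth: hash bits this bucket distinguishes
        self.capacity = capacity
        self.items = {}

class ExtendibleHash:
    """Classic in-memory extendible hashing: the directory doubles only when a
    full bucket's local depth already equals the global depth."""
    def __init__(self):
        self.global_depth = 1
        self.dir = [Bucket(1), Bucket(1)]

    def _index(self, key):
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key):
        return self.dir[self._index(key)].items.get(key)

    def put(self, key, value):
        b = self.dir[self._index(key)]
        if key in b.items or len(b.items) < b.capacity:
            b.items[key] = value
            return
        self._split(b)
        self.put(key, value)          # retry; may trigger further splits

    def _split(self, b):
        if b.depth == self.global_depth:
            self.dir += self.dir      # directory doubling
            self.global_depth += 1
        b.depth += 1
        sibling = Bucket(b.depth, b.capacity)
        high_bit = 1 << (b.depth - 1)
        for i, slot in enumerate(self.dir):
            if slot is b and i & high_bit:
                self.dir[i] = sibling
        for k in list(b.items):       # redistribute entries between the pair
            if self.dir[self._index(k)] is sibling:
                sibling.items[k] = b.items.pop(k)
```

Because a split rehashes only one bucket's entries, latency spikes come mainly from the rare directory doubling and from locking, which are exactly the tail sources the paper targets.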

Citations: 0
Chinese Named Entity Recognition Augmented with Lexicon Memory
IF 1.9 | CAS Tier 3, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2023-09-30 | DOI: 10.1007/s11390-021-1153-y
Yi Zhou, Xiao-Qing Zheng, Xuan-Jing Huang

Inspired by the concept of content-addressable retrieval from cognitive science, we propose a novel fragment-based Chinese named entity recognition (NER) model augmented with a lexicon-based memory in which both character-level and word-level features are combined to generate better feature representations for possible entity names. Observing that the boundary information of entity names is particularly useful to locate and classify them into pre-defined categories, position-dependent features, such as prefix and suffix, are introduced and taken into account for NER tasks in the form of distributed representations. The lexicon-based memory is built to help generate such position-dependent features and deal with the problem of out-of-vocabulary words. Experimental results show that the proposed model, called LEMON, achieved state-of-the-art performance with an increase in the F1-score up to 3.2% over the state-of-the-art models on four different widely-used NER datasets.
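The kind of lexicon-derived, position-dependent feature such a model consumes can be sketched as follows: for every dictionary word covering a character, record the word together with the character's position in it (Begin/Middle/End/Single). This illustrates the general idea, not LEMON's actual memory architecture; the lexicon entries are hypothetical.

```python
LEXICON = {"南京", "南京市", "长江", "长江大桥", "大桥"}  # hypothetical dictionary

def lexicon_features(sentence, lexicon, max_len=4):
    """For each character, collect (word, position-tag) pairs for every lexicon
    word covering it: B(egin), M(iddle), E(nd), or S(ingle-character word)."""
    feats = [[] for _ in sentence]
    for i in range(len(sentence)):
        for j in range(i + 1, min(i + max_len, len(sentence)) + 1):
            w = sentence[i:j]
            if w not in lexicon:
                continue
            if len(w) == 1:
                feats[i].append((w, "S"))
            else:
                feats[i].append((w, "B"))
                for k in range(i + 1, j - 1):
                    feats[k].append((w, "M"))
                feats[j - 1].append((w, "E"))
    return feats
```

Characters near entity boundaries thus receive explicit B/E evidence from the lexicon, which is the boundary information the paper exploits.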

Citations: 4
VTensor: Using Virtual Tensors to Build a Layout-Oblivious AI Programming Framework
IF 1.9 | CAS Tier 3, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2023-09-30 | DOI: 10.1007/s11390-022-1457-6
Feng Yu, Jia-Cheng Zhao, Hui-Min Cui, Xiao-Bing Feng, Jingling Xue

Tensors are a popular programming interface for developing artificial intelligence (AI) algorithms. Layout refers to the order in which tensor data are placed in memory and affects performance through data locality; deep neural network libraries therefore adopt layout conventions. Since AI applications can use arbitrary layouts, and existing AI systems do not provide programming abstractions to shield the layout conventions of libraries, operator developers need to write a lot of layout-related code, which reduces the efficiency of integrating new libraries or developing new operators. Furthermore, developers delegate layout conversion operations to internal operators to deal with the uncertainty of the input layout, thus losing opportunities for layout optimization. Based on the idea of polymorphism, we propose a layout-agnostic virtual tensor programming interface, namely the VTensor framework, which enables developers to write new operators without caring about the underlying physical layout of tensors. In addition, the VTensor framework performs global layout inference at runtime to transparently resolve the required layouts of virtual tensors, and runtime layout-oriented optimizations to globally minimize the number of layout transformation operations. Experimental results demonstrate that with VTensor, developers can avoid writing layout-dependent code. Compared with TensorFlow, for the 16 operations used in 12 popular networks, VTensor reduces the lines of code (LOC) needed to write a new operation by 47.82% on average, and improves overall performance by 18.65% on average.
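The core idea of a layout-oblivious tensor abstraction can be sketched as a wrapper that exposes logical NCHW indexing regardless of whether the data are physically stored as NCHW or NHWC. This is a toy illustration of the concept, not the VTensor framework's implementation.

```python
class VTensor:
    """Logical NCHW view over a flat buffer stored in 'NCHW' or 'NHWC' order."""
    def __init__(self, flat, shape, layout="NCHW"):
        self.flat, self.layout = flat, layout
        self.n, self.c, self.h, self.w = shape      # logical shape, always NCHW

    def get(self, n, c, h, w):
        # operator code uses logical indices; layout is resolved here
        if self.layout == "NCHW":
            idx = ((n * self.c + c) * self.h + h) * self.w + w
        else:  # NHWC
            idx = ((n * self.h + h) * self.w + w) * self.c + c
        return self.flat[idx]

    def to_layout(self, layout):
        """Materialize the tensor in another physical layout."""
        if layout == self.layout:
            return self
        if layout == "NCHW":
            order = [(n, c, h, w) for n in range(self.n) for c in range(self.c)
                     for h in range(self.h) for w in range(self.w)]
        else:
            order = [(n, c, h, w) for n in range(self.n) for h in range(self.h)
                     for w in range(self.w) for c in range(self.c)]
        flat = [self.get(n, c, h, w) for n, c, h, w in order]
        return VTensor(flat, (self.n, self.c, self.h, self.w), layout)
```

Because operators only ever call `get` with logical indices, the same operator code works for either physical layout, and conversions can be deferred or batched by the runtime.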

Citations: 0
Cognition: Accurate and Consistent Linear Log Parsing Using Template Correction
IF 1.9 | CAS Tier 3, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2023-09-30 | DOI: 10.1007/s11390-021-1691-3
Ran Tian, Zu-Long Diao, Hai-Yang Jiang, Gao-Gang Xie

Logs contain runtime information for both systems and users. As many of them use natural language, a typical log-based analysis needs to parse logs into a structured format first. Existing parsing approaches often take two steps. The first step is to group similar words (tokens) or sentences. Second, parsers extract log templates by replacing varying tokens with variable placeholders. However, we observe that most parsers concentrate on precisely grouping similar tokens or logs but lack a well-designed template extraction process, which leads to inconsistent accuracy on particular datasets. The root cause is the ambiguous definition of variable placeholders and similar templates. The consequences include the abuse of variable placeholders, incorrectly divided templates, and an excessive number of templates over time. In this paper, we propose an online log parsing approach, Cognition. It first redefines variable placeholders via a strict lower bound to avoid ambiguity. Then, it applies our template correction technique to merge and absorb similar templates, eliminating the interference of commonly used parameters and thus isolating template quantity. Evaluation on 16 public datasets shows that Cognition has better accuracy and consistency than state-of-the-art approaches, and saves up to 52.1% of time cost on average.
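The two generic steps described above can be sketched as follows: mask tokens that look like variables to form a template, then merge two templates of the same length by wildcarding the positions where they differ. This illustrates generic log parsing, not Cognition's strict lower bound or its correction technique; the variable patterns are simplified.

```python
import re

# simplified patterns for "variable-looking" tokens: integers, dotted numbers
# (versions, IPs), and hex values
VAR_TOKEN = re.compile(r"\d+(\.\d+)*|0x[0-9a-fA-F]+")

def to_template(line):
    """Replace likely variable tokens with the '<*>' placeholder."""
    return " ".join("<*>" if VAR_TOKEN.fullmatch(t) else t
                    for t in line.split())

def merge(t1, t2):
    """Merge two same-length templates, wildcarding differing positions;
    returns None if the templates cannot be aligned."""
    a, b = t1.split(), t2.split()
    if len(a) != len(b):
        return None
    return " ".join(x if x == y else "<*>" for x, y in zip(a, b))
```

Over-eager merging is exactly where placeholder abuse arises: `merge` can wildcard a constant token that merely looks variable, which is the ambiguity the paper's stricter definitions address.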

Citations: 0
Model Checking for Probabilistic Multiagent Systems
IF 1.9 | CAS Tier 3, Computer Science | Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | Pub Date: 2023-09-30 | DOI: 10.1007/s11390-022-1218-6
Chen Fu, Andrea Turrini, Xiaowei Huang, Lei Song, Yuan Feng, Li-Jun Zhang

In multiagent systems, agents usually do not have complete information about the whole system, which makes the analysis of such systems hard. The incompleteness of information is normally modelled by means of accessibility relations, and schedulers consistent with such relations are called uniform. In this paper, we consider probabilistic multiagent systems with accessibility relations and focus on the model checking problem for a probabilistic epistemic temporal logic, which can specify both temporal and epistemic properties. The problem is undecidable in general, but we show that it becomes decidable when restricted to memoryless uniform schedulers. We then present two algorithms for this case: the first reduces the model checking problem to a mixed integer non-linear programming (MINLP) problem, which can then be solved by Satisfiability Modulo Theories (SMT) solvers; the second is an approximate algorithm based on the upper confidence bounds applied to trees (UCT) algorithm, which can return a result whenever queried. Both algorithms have been implemented in an existing model checker and validated experimentally. The results show the efficiency and extensibility of these algorithms, with the UCT-based algorithm outperforming the MINLP-based one in most cases.
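The memoryless restriction that makes the problem decidable can be illustrated on a toy example: with finitely many states and actions, memoryless deterministic schedulers can simply be enumerated, and the reachability probability under each can be computed by value iteration. This is only a sketch of the general idea — the MDP, state names, and action names below are invented for illustration and have nothing to do with the paper's MINLP or UCT algorithms:

```python
import itertools

# Toy MDP. trans[state][action] = list of (probability, next_state).
# States: 0 (start), 1 (intermediate), 2 (goal, absorbing), 3 (sink, absorbing).
trans = {
    0: {"a": [(0.5, 1), (0.5, 3)], "b": [(1.0, 1)]},
    1: {"a": [(0.9, 2), (0.1, 3)], "b": [(0.5, 2), (0.5, 0)]},
}
GOAL, SINK = 2, 3

def reach_prob(scheduler, iters=200):
    """Probability of reaching GOAL from state 0 under a fixed
    memoryless deterministic scheduler (value iteration)."""
    p = {0: 0.0, 1: 0.0, GOAL: 1.0, SINK: 0.0}
    for _ in range(iters):
        for s in (0, 1):
            p[s] = sum(pr * p[t] for pr, t in trans[s][scheduler[s]])
    return p[0]

# Enumerate all memoryless deterministic schedulers and take the best one.
best = max(
    (dict(zip((0, 1), acts)) for acts in itertools.product("ab", repeat=2)),
    key=reach_prob,
)
print(best, round(reach_prob(best), 4))
```

Here the scheduler choosing action `b` in both states reaches the goal with probability 1, because the loop 0 → 1 → 0 gives unboundedly many chances to take the 0.5-probability transition into the goal. Under uniformity constraints the enumeration would additionally have to discard schedulers that distinguish states an agent cannot tell apart.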

Citations: 0