Information (Switzerland): Latest Articles

Defending IoT Devices against Bluetooth Worms with Bluetooth OBEX Proxy
Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-09-27 DOI: 10.3390/info14100525
Fu-Hau Hsu, Min-Hao Wu, Yan-Ling Hwang, Jian-Xin Chen, Jian-Hong Huang, Hao-Jyun Wang, Yi-Wen Lai
The number of Internet of Things (IoT) devices has increased dramatically in recent years, and Bluetooth technology is critical for communication between them. BlueZ’s Bluetooth File Transfer Filter (BTF) can protect electronic communications, IoT devices, and big data from malware and data theft: it uses a configurable filter to block unauthorized Bluetooth file transfers, is available for various Linux distributions, and can protect many Bluetooth-enabled devices, including smartphones, tablets, laptops, and IoT devices. However, the increased number and density of Bluetooth devices have also created a serious problem, the Bluetooth worm, which poses a severe threat to the security of Bluetooth devices. In this paper, we propose a Bluetooth OBEX Proxy (BOP) that filters malicious files transferred to devices via the OBEX system service in BlueZ. The method prevents illegal Bluetooth file transfers and applies to Bluetooth devices running many Linux distributions. In the evaluation, detection produced zero false positives and a miss rate of 2.29%.
Citations: 0
Automatic Construction of Educational Knowledge Graphs: A Word Embedding-Based Approach
Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-09-27 DOI: 10.3390/info14100526
Qurat Ul Ain, Mohamed Amine Chatti, Komlan Gluck Charles Bakar, Shoeb Joarder, Rawaa Alatrash
Knowledge graphs (KGs) are widely used in the education domain to offer learners a semantic representation of domain concepts from educational content and their relations, termed as educational knowledge graphs (EduKGs). Previous studies on EduKGs have incorporated concept extraction and weighting modules. However, these studies face limitations in terms of accuracy and performance. To address these challenges, this work aims to improve the concept extraction and weighting mechanisms by leveraging state-of-the-art word and sentence embedding techniques. Concretely, we enhance the SIFRank keyphrase extraction method by using SqueezeBERT and we propose a concept-weighting strategy based on SBERT. Furthermore, we conduct extensive experiments on different datasets, demonstrating significant improvements over several state-of-the-art keyphrase extraction and concept-weighting techniques.
Citations: 0
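The abstract above names an SBERT-based concept-weighting strategy without giving its formula. As a rough illustration only, one common way to weight a concept against a document is the cosine similarity of their embeddings. The sketch below assumes that form; the function names and the toy two-dimensional vectors are hypothetical and stand in for real sentence embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def weight_concepts(doc_vec, concept_vecs):
    """Assumed weighting rule: score each concept by its similarity to the document vector."""
    return {name: cosine_similarity(doc_vec, vec) for name, vec in concept_vecs.items()}


# Toy vectors standing in for sentence embeddings (hypothetical data).
weights = weight_concepts(
    [1.0, 1.0],
    {"graph": [1.0, 1.0], "cooking": [1.0, -1.0]},
)
ranked = sorted(weights, key=weights.get, reverse=True)
```

In practice the vectors would come from an embedding model; the ranking step is the part this sketch is meant to show.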
A Benchmark Dataset to Distinguish Human-Written and Machine-Generated Scientific Papers
Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-09-26 DOI: 10.3390/info14100522
Mohamed Hesham Ibrahim Abdalla, Simon Malberg, Daryna Dementieva, Edoardo Mosca, Georg Groh
As generative NLP can now produce content nearly indistinguishable from human writing, it is becoming difficult to identify genuine research contributions in academic writing and scientific publications. Moreover, information in machine-generated text can be factually wrong or even entirely fabricated. In this work, we introduce a novel benchmark dataset containing human-written and machine-generated scientific papers from SCIgen, GPT-2, GPT-3, ChatGPT, and Galactica, as well as papers co-created by humans and ChatGPT. We also experiment with several types of classifiers—linguistic-based and transformer-based—for detecting the authorship of scientific text. A strong focus is put on generalization capabilities and explainability to highlight the strengths and weaknesses of these detectors. Our work makes an important step towards creating more robust methods for distinguishing between human-written and machine-generated scientific papers, ultimately ensuring the integrity of scientific literature.
Citations: 0
Monitoring Key Pair Usage through Distributed Ledgers and One-Time Signatures
Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-09-26 DOI: 10.3390/info14100523
Lucas Mayr, Lucas Palma, Gustavo Zambonin, Wellington Silvano, Ricardo Custódio
Private key management is a complex obstacle arising from the traditional public key infrastructure model. However, before any related security breach can be addressed, it must first be reliably detected. Certificate Transparency (CT) is an example of a certificate issuance monitoring strategy, developed to detect the possible malfeasance of certification authorities (CAs). To the best of our knowledge, CT and other detection mechanisms do not cover digitally signed documents made by an end user, which are also susceptible to CA misbehavior. We modify the CT framework to handle signed documents via logging certificates in the blockchain to enable the secure and user-friendly monitoring of one-time signatures, backdating protection, and effective CA misbehavior detection. Moreover, to demonstrate the feasibility of our proposal, we present distinct deployment scenarios and analyze the storage, performance, and monetary costs.
Citations: 0
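The abstract above relies on one-time signatures but does not say which scheme is used. For readers unfamiliar with the idea, the sketch below shows the classic Lamport construction over SHA-256 as a generic example. It is an assumption that this resembles the scheme in the paper, and the ledger logging the paper describes is not modelled here.

```python
import hashlib
import secrets


def keygen():
    """Generate a Lamport key pair: 256 pairs of random secrets and their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(hashlib.sha256(a).digest(), hashlib.sha256(b).digest()) for a, b in sk]
    return sk, pk


def _digest_bits(message):
    """Return the 256 bits of SHA-256(message), most significant bit first."""
    digest = hashlib.sha256(message).digest()
    return [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]


def sign(message, sk):
    """Reveal one secret per digest bit. The key must never sign a second message."""
    return [sk[i][bit] for i, bit in enumerate(_digest_bits(message))]


def verify(message, signature, pk):
    """Check that each revealed secret hashes to the public value chosen by the digest bit."""
    bits = _digest_bits(message)
    if len(signature) != len(bits):
        return False
    return all(
        hashlib.sha256(sig).digest() == pk[i][bit]
        for i, (sig, bit) in enumerate(zip(signature, bits))
    )


sk, pk = keygen()
signature = sign(b"signed document", sk)
```

Because signing reveals half of the private key, reusing the key for a second message weakens it, which is why usage of such keys is worth monitoring.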
Challenges of Automated Identification of Access to Education and Training in Germany
Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-09-26 DOI: 10.3390/info14100524
Jens Dörpinghaus, David Samray, Robert Helmrich
The German labor market relies heavily on vocational training, retraining, and continuing education. In order to match training seekers with training offers and to make the available data interoperable, we present a novel approach to automatically detect access to education and training in German training offers and advertisements and identify open research questions and areas for further research. In particular, we focus on (a) general education and school leaving certificates, (b) work experience, (c) previous apprenticeship, and (d) a list of skills provided by the German Federal Employment Agency. This novel approach combines several methods: First, we provide technical terms and classes of the education system that are used synonymously, combining different qualifications and adding obsolete terms. Second, we provide rule-based matching to identify the need for work experience or education. However, not all qualification requirements can be matched due to incompatible data schemas or non-standardized requirements such as initial tests or interviews. Although there are several shortcomings, the presented approach shows promising results for two data sets: training and retraining advertisements.
Citations: 0
A Multi-Objective Improved Cockroach Swarm Algorithm Approach for Apartment Energy Management Systems
Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-09-25 DOI: 10.3390/info14100521
Bilal Naji Alhasnawi, Basil H. Jasim, Ali M. Jasim, Vladimír Bureš, Arshad Naji Alhasnawi, Raad Z. Homod, Majid Razaq Mohamed Alsemawai, Rabeh Abbassi, Bishoy E. Sedhom
Electrical demand and generation in power systems are currently the biggest sources of uncertainty for an electricity provider. Demand response (DR) through household appliance energy management has attracted significant attention as a route to a dependable and financially advantageous electricity system. Because electricity rates and usage trends fluctuate, determining the best schedule for apartment appliances can be difficult. In this context, the Improved Cockroach Swarm Optimization Algorithm (ICSOA) is combined with the Innovative Apartments Appliance Scheduling (IAAS) framework. Using the proposed technique, electricity cost reduction, user comfort maximization, and peak-to-average ratio reduction are analyzed for apartment appliances. The proposed framework is evaluated by comparing it with BFOA and with the case without (W/O) scheduling. Compared with the W/O scheduling case, the BFOA method lowered energy cost by 17.75%, whereas the ICSOA approach reduced it by 46.085%. According to the results, ICSOA performed better than both the BFOA and W/O scheduling cases on the stated objectives and was advantageous to both utilities and consumers.
Citations: 1
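The abstract above reports cost reductions and a peak-to-average ratio (PAR) objective without defining either. The sketch below shows the usual definitions as a worked example: cost as load times price per time slot, PAR as peak load over mean load, and reduction as a percentage of the baseline. The load and price figures are made-up toy values, not data from the paper.

```python
def energy_cost(load_kwh, price_per_kwh):
    """Total cost of a schedule: sum of load times price over all time slots."""
    return sum(load * price for load, price in zip(load_kwh, price_per_kwh))


def peak_to_average_ratio(load_kwh):
    """Peak slot load divided by the mean slot load."""
    return max(load_kwh) / (sum(load_kwh) / len(load_kwh))


def percent_reduction(baseline, improved):
    """Relative saving of `improved` versus `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical four-slot day: same total energy, shifted away from the expensive peak slot.
prices = [0.10, 0.10, 0.30, 0.20]
unscheduled = [1.0, 1.0, 4.0, 2.0]
scheduled = [2.0, 2.0, 2.0, 2.0]

cost_saving = percent_reduction(
    energy_cost(unscheduled, prices), energy_cost(scheduled, prices)
)
par_before = peak_to_average_ratio(unscheduled)
par_after = peak_to_average_ratio(scheduled)
```

A scheduler such as the one described would search over appliance start times to lower both quantities while respecting user comfort; only the objective calculations are shown here.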
Can Triplet Loss Be Used for Multi-Label Few-Shot Classification? A Case Study
Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-09-23 DOI: 10.3390/info14100520
Gergely Márk Csányi, Renátó Vági, Andrea Megyeri, Anna Fülöp, Dániel Nagy, János Pál Vadász, István Üveges
Few-shot learning is a deep learning subfield that is the focus of research nowadays. This paper addresses the research question of whether a triplet-trained Siamese network, initially designed for multi-class classification, can effectively handle multi-label classification. We conducted a case study to identify any limitations in its application. The experiments were conducted on a dataset containing Hungarian legal decisions of administrative agencies in tax matters belonging to a major legal content provider. We also tested how different Siamese embeddings compare on classifying a previously non-existing label on a binary and a multi-label setting. We found that triplet-trained Siamese networks can be applied to perform classification but with a sampling restriction during training. We also found that the overlap between labels affects the results negatively. The few-shot model, seeing only ten examples for each label, provided competitive results compared to models trained on tens of thousands of court decisions using tf-idf vectorization and logistic regression.
Citations: 0
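For context on the abstract above, the standard triplet loss is max(0, d(a, p) - d(a, n) + margin), which pulls an anchor toward a positive example and pushes it away from a negative one. The sketch below implements that formula with Euclidean distance. The abstract mentions a sampling restriction for the multi-label case but does not spell it out, so `valid_triplet` is only one plausible rule, assumed here for illustration; the label names are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def valid_triplet(anchor_labels, positive_labels, negative_labels):
    """Assumed multi-label rule: the positive shares a label with the anchor, the negative shares none."""
    return bool(anchor_labels & positive_labels) and not (anchor_labels & negative_labels)


# Well-separated triplet: the negative is already far beyond the margin, so the loss is zero.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
# Swapping positive and negative violates the margin and yields a positive loss.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

With overlapping label sets, a rule like this shrinks the pool of usable negatives, which is one way label overlap could hurt results as the abstract reports.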
Learnability in Automated Driving (LiAD): Concepts for Applying Learnability Engineering (CALE) Based on Long-Term Learning Effects
Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-09-22 DOI: 10.3390/info14100519
Naomi Y. Mbelekani, Klaus Bengler
Learnability in Automated Driving (LiAD) is a neglected research topic, especially considering the unpredictable and intricate ways humans learn to interact with and use automated driving systems (ADS) over time. Moreover, few publications are dedicated to LiAD (specifically extended learnability methods) to guide the scientific paradigm. This generates scientific discord and leaves many facets of the long-term learning effects associated with automated driving in need of significant research attention, which, we believe, constrains knowledge discovery on quality interaction design phenomena. It is imperative to abstract knowledge on how long-term effects and learning effects may affect (negatively and positively) users’ learning and mental models and induce changeable behavioural configurations and performances. It may therefore be necessary to examine operational concepts that help researchers envision future scenarios with automation by assessing users’ learning ability, how they learn, and what they learn over time, and to construct a theory of effects (from micro, meso, and macro perspectives) that may help profile ergonomic quality design aspects that stand the test of time. To this end, we reviewed the literature on learnability, which we mined for LiAD knowledge discovery from the experience perspective of long-term learning effects. The paper offers the resulting discussion points, formulated under the Learnability Engineering Life Cycle: first, contextualisation of LiAD with emphasis on extended LiAD; second, conceptualisation and operationalisation of the operational mechanics of LiAD as a concept in ergonomic quality engineering (with an introduction of Concepts for Applying Learnability Engineering (CALE) research based on LiAD knowledge discovery); third, the systemisation of implementable long-term research strategies towards comprehending behaviour modification associated with extended LiAD. As the vehicle industry moves rapidly towards automation and artificially intelligent (AI) systems, this knowledge is useful for illuminating and instructing quality interaction strategies and Quality Automated Driving (QAD).
Citations: 0
SUCCEED: Sharing Upcycling Cases with Context and Evaluation for Efficient Software Development
Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-09-21 DOI: 10.3390/info14090518
Takuya Nakata, Sinan Chen, Sachio Saiki, Masahide Nakamura
Software upcycling, a form of software reuse, is a concept that efficiently generates novel, innovative, and value-added development projects by utilizing knowledge extracted from past projects. However, how to integrate the materials derived from these projects for upcycling remains uncertain. This study defines a systematic model for upcycling cases and develops the Sharing Upcycling Cases with Context and Evaluation for Efficient Software Development (SUCCEED) system to support the implementation of new upcycling initiatives by effectively sharing cases within the organization. To ascertain the efficacy of upcycling within our proposed model and system, we formulated three research questions and conducted two distinct experiments. Through surveys, we identified motivations and characteristics of shared upcycling-relevant development cases. Development tasks were divided into groups, those that employed the SUCCEED system and those that did not, in order to discern the enhancements brought about by upcycling. As a result of this research, we accomplished a comprehensive structuring of both technical and experiential knowledge beneficial for development, a feat previously unrealizable through conventional software reuse, and successfully realized reuse in a proactive and closed environment through construction of the wisdom of crowds for upcycling cases. Consequently, it becomes possible to systematically perform software upcycling by leveraging knowledge from existing projects for streamlining of software development.
Citations: 0
Machine Translation of Electrical Terminology Constraints
Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2023-09-20 DOI: 10.3390/info14090517
Zepeng Wang, Yuan Chen, Juwei Zhang
In practical applications, the accuracy of domain terminology translation is an important criterion for the performance evaluation of domain machine translation models. Aiming at the problem of phrase mismatch and improper translation caused by word-by-word translation of English terminology phrases, this paper constructs a dictionary of terminology phrases in the field of electrical engineering and proposes three schemes to integrate the dictionary knowledge into the translation model. Scheme 1 replaces the terminology phrases of the source language. Scheme 2 uses the residual connection at the encoder end after the terminology phrase is replaced. Scheme 3 uses a segmentation method of combining character segmentation and terminology segmentation for the target language and uses an additional loss module in the training process. The results show that all three schemes are superior to the baseline model in two aspects: BLEU value and correct translation rate of terminology words. In the test set, the highest accuracy of terminology words was 48.3% higher than that of the baseline model. The BLEU value is up to 3.6 higher than the baseline model. The phenomenon is also analyzed and discussed in this paper.
Citations: 0