
Open Computer Science: Latest Articles

A novel similarity measure of link prediction in bipartite social networks based on neighborhood structure
IF 1.5 Q3 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2022-01-01 DOI: 10.1515/comp-2022-0233
Fariba Sarhangnia, Shima Mahjoobi, Samaneh Jamshidi
Abstract Link prediction is one of the methods of social network analysis. Bipartite networks are a type of complex network that can be used to model many natural events. In this study, a novel similarity measure for link prediction in bipartite networks is presented. Because classical link prediction methods for social networks are less efficient and effective on bipartite networks, bipartite-specific methods are needed. The purpose of this study is to provide a centralized and comprehensive method based on the neighborhood structure that performs better than the existing classical methods. The proposed method combines several criteria based on the neighborhood structure: the classical link prediction criteria are redefined for bipartite networks, and these modified criteria constitute the main component of the proposed similarity measure. In addition to being simple and having low complexity, the method is highly efficient. The simulation results show that the proposed method performs best on the F-measure, outperforming MetaPath by 0.5%, FriendLink by 1.32%, and Katz by 1.8%.
Open Computer Science, vol. 12, no. 1, pp. 112-122.
Citations: 0
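The abstract above does not spell out the proposed measure, so the following is only a minimal sketch of the general idea it builds on: adapting a classical neighborhood criterion to a bipartite graph. In a bipartite network a user and an item never share a direct neighbor, so one common adaptation (an assumption here, not necessarily the authors' definition) compares the user's neighbors with the item's two-hop neighborhood. The function names and the toy edge list are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from (user, item) edges."""
    adj = {}
    for user, item in edges:
        adj.setdefault(user, set()).add(item)
        adj.setdefault(item, set()).add(user)
    return adj


def two_hop(adj, node):
    """Nodes exactly two steps from `node` (same side of the bipartition), excluding `node`."""
    reach = set()
    for neighbor in adj.get(node, ()):
        reach |= adj[neighbor]
    reach.discard(node)
    return reach


def bipartite_common_neighbors(adj, user, item):
    """Count items the user already links to that share at least one user with `item`."""
    return len(adj.get(user, set()) & two_hop(adj, item))


def bipartite_jaccard(adj, user, item):
    """Jaccard-style normalization of the count above; 0.0 when both sets are empty."""
    a = adj.get(user, set())
    b = two_hop(adj, item)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy user-item graph (hypothetical data).
edges = [("u1", "i1"), ("u1", "i2"), ("u2", "i2"), ("u2", "i3"), ("u3", "i3")]
adj = build_adjacency(edges)
print(bipartite_common_neighbors(adj, "u1", "i3"), bipartite_jaccard(adj, "u1", "i3"))
```

Candidate (user, item) pairs would then be ranked by such a score, with the highest-scoring non-edges predicted as future links.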
Greatest-common-divisor dependency of juggling sequence rotation efficient performance
IF 1.5 Q3 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2022-01-01 DOI: 10.1515/comp-2022-0234
Joseph A. Erho, J. I. Consul, B. R. Japheth
Abstract In a previous experimental study of the three-way-reversal and juggling sequence rotation algorithms, using 20,000,000 elements of type long in Java, the average execution times were 49.66761 ms and 246.4394 ms, respectively. These results revealed appreciably low performance in the juggling algorithm despite its proven optimality. However, the juggling algorithm was also efficient for some offset ranges. Because of this pattern, the current study investigates the source of the inefficiency in its average performance. Samples were extracted from the previous experimental data, presented differently, and analyzed both graphically and in tabular form. Greatest-common-divisor values from the data that equal their offsets were used. As in the previous study, the rotation was implemented in Java to simulate the ordering of tasks for safety and efficiency in the context of real-time task scheduling. The investigation shows that juggling rotation competes favorably with three-way-reversal rotation (and is even better in a few cases) for certain offsets, but poorly for the rest. The poorest performance occurs for offsets in the neighborhood of the square root of the sequence size. The study therefore strongly advises application developers (especially of real-time systems) to be mindful of where and how they use juggling rotation.
Open Computer Science, vol. 12, no. 1, pp. 92-102.
Citations: 1
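For readers unfamiliar with the two algorithms compared in the entry above, here is a minimal sketch of both as in-place left rotations. The study used Java arrays of long; this Python version is an illustrative assumption and says nothing about the measured timings. It does show where the greatest common divisor enters: the juggling variant runs gcd(n, k) cycles, each stepping through the array in strides of k.

```python
from math import gcd


def rotate_reversal(a, k):
    """Left-rotate list `a` by `k` positions in place using three reversals."""
    n = len(a)
    if n == 0:
        return
    k %= n

    def reverse(i, j):
        while i < j:
            a[i], a[j] = a[j], a[i]
            i += 1
            j -= 1

    reverse(0, k - 1)   # reverse the first k elements
    reverse(k, n - 1)   # reverse the remaining n - k elements
    reverse(0, n - 1)   # reverse the whole list


def rotate_juggling(a, k):
    """Left-rotate list `a` by `k` positions in place using gcd(n, k) cycles."""
    n = len(a)
    if n == 0:
        return
    k %= n
    if k == 0:
        return
    for start in range(gcd(n, k)):
        saved = a[start]
        i = start
        while True:
            j = (i + k) % n      # each step jumps k positions ahead
            if j == start:
                break
            a[i] = a[j]
            i = j
        a[i] = saved             # close the cycle


data = list(range(10))
rotate_juggling(data, 4)
print(data)
```

Both functions move every element exactly where a left rotation requires; they differ in memory access pattern, which is the aspect the study examines.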
Designing of fault-tolerant computer system structures using residue number systems
IF 1.5 Q3 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2022-01-01 DOI: 10.1515/comp-2020-0171
V. Krasnobayev, A. Kuznetsov, A. Kiian
Abstract This article discusses computing systems that operate in residue number systems (RNSs). The main direction for improving computer systems (CSs) is to increase the speed of arithmetic operations and the reliability of their functioning. Encoding data in RNS solves the problem of optimal redundancy, i.e., it allows the creation of computing systems that provide maximum reliability under restrictions on weight and size. This article proposes new structures of fault-tolerant CSs operating in RNS when an active fault-tolerant method is applied. The use of the active fault-tolerant method (dynamic redundancy) in RNSs provides higher reliability. In addition, as the number of digits of the CSs increases, the efficiency of the proposed structures increases.
Open Computer Science, vol. 12, no. 1, pp. 66-74.
Citations: 0
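As background to the entry above, the sketch below shows how a residue number system represents integers and why an extra modulus adds fault detection. It is a generic textbook illustration, not the fault-tolerant structures proposed in the article. The moduli (3, 5, 7) plus a redundant modulus 11 are assumed values chosen for the example; decoding uses the Chinese remainder theorem, and `pow(x, -1, m)` requires Python 3.8 or later.

```python
from math import prod

INFO_MODULI = (3, 5, 7)          # information moduli (assumed for this example)
REDUNDANT = 11                   # redundant modulus, larger than every information modulus
MODULI = INFO_MODULI + (REDUNDANT,)
LEGAL_RANGE = prod(INFO_MODULI)  # valid values are 0 .. 104


def to_rns(x):
    """Encode an integer as a tuple of residues, one per modulus."""
    return tuple(x % m for m in MODULI)


def rns_add(a, b):
    """Add two RNS numbers digit by digit; no carries cross between digits."""
    return tuple((x + y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues):
    """Decode with the Chinese remainder theorem over all moduli."""
    total_range = prod(MODULI)
    value = 0
    for r, m in zip(residues, MODULI):
        partial = total_range // m
        value += r * partial * pow(partial, -1, m)
    return value % total_range


def is_consistent(residues):
    """A decoded value outside the legal range signals a corrupted residue."""
    return from_rns(residues) < LEGAL_RANGE


total = rns_add(to_rns(17), to_rns(4))
print(from_rns(total), is_consistent(total))

corrupted = (1,) + total[1:]     # simulate a fault in the first residue channel
print(from_rns(corrupted), is_consistent(corrupted))
```

Because each residue channel is computed independently, a fault stays confined to one digit, and the redundant modulus exposes it when the decoded value leaves the legal range.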
Security and privacy issues in federated healthcare – An overview
IF 1.5 Q3 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2022-01-01 DOI: 10.1515/comp-2022-0230
Jansi Rani Amalraj, Robert Lourdusamy
Abstract Securing medical records is a significant task in healthcare communication. The major obstacle when transferring medical data electronically is the inherent difficulty of preserving data confidentiality and patients’ privacy. Technological innovation and improvements in the medical field have produced numerous advances in transferring medical data with foolproof security. In today’s healthcare industry, federated network operation is gaining significance for handling distributed network resources because it deals with privacy issues efficiently. The design of a federated security system for healthcare services is an active research topic. This article highlights the importance of federated learning in healthcare and discusses the privacy and security issues in communicating e-health data.
Open Computer Science, vol. 12, no. 1, pp. 57-65.
Citations: 1
Cross-modal biometric fusion intelligent traffic recognition system combined with real-time data operation
IF 1.5 Q3 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2022-01-01 DOI: 10.1515/comp-2022-0252
Wei Xu, Yujin Zhai
Abstract Intelligent traffic recognition systems are the development direction of future traffic systems. Such a system integrates advanced information technology, data communication and transmission technology, electronic sensing technology, control technology, and computer technology into the entire ground traffic management system, establishing a real-time, accurate, and efficient integrated transportation management system that operates over a wide range and in all directions. The aim of this article is to integrate cross-modal biometrics into an intelligent traffic recognition system combined with real-time data operations. Using a cross-modal recognition algorithm, a model is built to better re-identify vehicles across modalities. First, this article gives a general introduction to the cross-modal recognition method. Then, experiments are conducted on the classification of vehicle images recognized by the intelligent transportation system, the complexity of vehicle logo recognition, and the recognition of vehicle images under different lighting. Finally, the cross-modal recognition algorithm is introduced into the dynamic analysis of the intelligent traffic recognition system, and a cross-modal traffic recognition experiment is carried out. The experimental results show that the intraclass distribution loss function improves the Rank-1 recognition rate and the mAP value by 6–7 percentage points over the baseline method. This shows that improving the modal invariance of features by reducing the distribution difference between images of the same vehicle in different modalities can effectively deal with the feature information imbalance caused by modal changes.
Open Computer Science, vol. 12, no. 1, pp. 332-344.
Citations: 1
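The entry above reports gains in Rank-1 recognition rate and mAP. As a reminder of what these two re-identification metrics measure, here is a minimal sketch under the usual definitions: each query ranks a gallery by distance, Rank-1 is the share of queries whose closest gallery item has the same identity, and mAP averages the per-query average precision. The distance matrix and identity labels are hypothetical toy data; the article's own evaluation protocol may differ in details such as camera or modality filtering.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1 rate, mean average precision) for a query-by-gallery distance matrix."""
    rank1_hits = 0
    ap_sum = 0.0
    for q, row in enumerate(dist):
        # Gallery indices sorted from closest to farthest for this query.
        order = sorted(range(len(row)), key=row.__getitem__)
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        rank1_hits += matches[0]
        found = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                found += 1
                precisions.append(found / rank)   # precision at each correct hit
        ap_sum += sum(precisions) / len(precisions) if precisions else 0.0
    n = len(dist)
    return rank1_hits / n, ap_sum / n


# Toy example: two queries against a gallery of three vehicle images.
gallery_ids = ["a", "b", "a"]
query_ids = ["a", "b"]
dist = [
    [0.1, 0.5, 0.3],   # query "a": both "a" images rank first
    [0.2, 0.4, 0.9],   # query "b": the correct image ranks second
]
print(rank1_and_map(dist, query_ids, gallery_ids))
```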
A student-based central exam scheduling model using A* algorithm
IF 1.5 Q3 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2022-01-01 DOI: 10.1515/comp-2022-0237
M. S. Başar, Sinan Kul
Abstract In this study, a student-based placement model using the A* algorithm is proposed and applied to the problem of placing courses in exam sessions. The model is applied to the midterm and final exams conducted by the Open Education Faculty. Open education exams were chosen because they are held across the country and more than 100,000 students participate. The main problem is to obtain a suitable distribution that satisfies many constraints simultaneously. In the current system, the courses were placed in sessions once, using curriculum information, and this placement plan is applied in all exams. When placement follows the curriculum information, courses cannot be placed effectively and efficiently because of the large number of common courses and the large number of students taking the exam. This makes the booklets more expensive and the organization more prone to errors. Both the opening of new programs and the regular increase in the number of students make it necessary to place the courses in sessions dynamically each semester. In addition, to prevent conflicts with the calendars of other central exams, all exams must be conducted in three sessions. A better solution was obtained by using a model different from the one currently in use. With this solution, the courses of successful students who have few courses are distributed across all sessions, and the difficult courses of unsuccessful students who have many courses are gathered in the same session. This study can support future work on two issues. The first is the approach of selecting the course to be placed in the booklet according to the course taken by the most students rather than the course taught in the most departments. The second is finding the most suitable solution by running performance tests on many algorithms whose performance has been established in academic studies.
Open Computer Science, vol. 12, no. 1, pp. 181-190.
Citations: 0
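To make the idea in the entry above concrete, the following toy sketch uses A* to assign courses to a fixed number of sessions based on student enrollments. It is not the authors' model: the cost function (number of student clashes, i.e. students with two courses in the same session), the heuristic, and all names and data are assumptions made for illustration. The heuristic adds, for each unplaced course, the cheapest clash count against courses already placed, which never overestimates the remaining cost.

```python
import heapq
from itertools import count


def schedule_exams(enrollments, n_sessions=3):
    """Assign each course to a session, minimizing student clashes, with A* search.

    `enrollments` maps course name -> set of student ids.
    Returns (total_clashes, {course: session_index}).
    """
    courses = sorted(enrollments)
    overlap = {
        (a, b): len(enrollments[a] & enrollments[b]) for a in courses for b in courses
    }

    def added_cost(course, session, assignment):
        # Clashes created by putting `course` next to courses already in `session`.
        return sum(
            overlap[course, courses[i]]
            for i, s in enumerate(assignment)
            if s == session
        )

    def heuristic(assignment):
        # Admissible lower bound: every unplaced course costs at least its best option.
        return sum(
            min(added_cost(c, s, assignment) for s in range(n_sessions))
            for c in courses[len(assignment):]
        )

    tie = count()  # tie-breaker so heap entries never compare tuples of sessions
    frontier = [(heuristic(()), next(tie), 0, ())]
    while frontier:
        _, _, cost, assignment = heapq.heappop(frontier)
        if len(assignment) == len(courses):
            return cost, dict(zip(courses, assignment))
        course = courses[len(assignment)]
        for session in range(n_sessions):
            new_cost = cost + added_cost(course, session, assignment)
            extended = assignment + (session,)
            heapq.heappush(
                frontier, (new_cost + heuristic(extended), next(tie), new_cost, extended)
            )
    return None


# Hypothetical enrollments: three courses pairwise share one student each.
enrollments = {
    "math": {"s1", "s2"},
    "physics": {"s1", "s3"},
    "chemistry": {"s2", "s3"},
    "art": {"s4"},
}
print(schedule_exams(enrollments, n_sessions=3))
```

This exhaustive formulation only scales to small instances; a setting with more than 100,000 students would need the additional constraints and modeling choices described in the article.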
Word2Vec: Optimal hyperparameters and their impact on natural language processing downstream tasks
IF 1.5 Q3 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2022-01-01 DOI: 10.1515/comp-2022-0236
Tosin P. Adewumi, F. Liwicki, M. Liwicki
Abstract Word2Vec is a prominent model for natural language processing tasks. Similar inspiration is found in distributed embeddings (word vectors) in recent state-of-the-art deep neural networks. However, a wrong combination of hyperparameters can produce poor-quality embeddings. The objective of this work is to show empirically that an optimal combination of Word2Vec hyperparameters exists and to evaluate various combinations. We compare them with the publicly released, original Word2Vec embedding. Both intrinsic and extrinsic (downstream) evaluations are carried out, including named entity recognition and sentiment analysis. Our main contributions include showing that the best model is usually task-specific, that high analogy scores do not necessarily correlate positively with F1 scores, and that performance does not depend on data size alone. If ethical considerations to save time, energy, and the environment are taken into account, relatively smaller corpora may do just as well or even better in some cases. Increasing the embedding dimension beyond a certain point leads to poor quality or performance. In addition, using a relatively small corpus, we obtain better WordSim scores, corresponding Spearman correlation, and better downstream performance (with significance tests) than the original model, which is trained on a 100-billion-word corpus.
Open Computer Science, vol. 12, no. 1, pp. 134-141.
Citations: 11
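The intrinsic evaluation mentioned in the entry above (WordSim scores with Spearman correlation) can be sketched as follows: compute the cosine similarity of each word pair from the embeddings, then measure how well the model's ranking of pairs agrees with human similarity ratings. The embeddings and ratings below are tiny made-up values, not data from the paper, and the helper names are hypothetical; a real evaluation would load trained vectors and a published similarity dataset.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical 2-d embeddings and human ratings (0-10 scale).
embeddings = {
    "cat": [1.0, 0.1], "dog": [0.9, 0.2],
    "car": [0.1, 1.0], "truck": [0.3, 0.9],
}
rated_pairs = [("cat", "dog", 9.0), ("car", "truck", 8.5), ("cat", "car", 1.5)]

model_scores = [cosine(embeddings[a], embeddings[b]) for a, b, _ in rated_pairs]
human_scores = [score for _, _, score in rated_pairs]
print(spearman(model_scores, human_scores))
```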
Research on the virtual simulation experiment evaluation model of e-commerce logistics smart warehousing based on multidimensional weighting
IF 1.5 Q3 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2022-01-01 DOI: 10.1515/comp-2022-0249
Ganglong Fan, Bo Fan, Hongsheng Xu, Chuqiao Wang
Abstract Through an analysis of the current state of research at home and abroad, this article finds that virtual simulation experiments for e-commerce logistics smart warehousing lack evaluation standards and methods, which seriously affects the standardization and rationality of the experiments. To solve this problem, this article proposes a virtual simulation experiment evaluation model for e-commerce logistics smart warehousing based on multidimensional weighting. The article first outlines the basic process of e-commerce logistics smart warehousing experiment activities and establishes the evaluation object. Then, based on the duality degree of the output results of the experimental steps, it proposes a method that conforms to the corresponding operation steps. From this, a three-dimensional evaluation model is constructed, covering the degree of completion of the operation steps, the degree of reasonableness of the operation steps, and the completion time of the operation steps. An automatic scoring model is proposed that combines the three-dimensional weighted evaluations of the experimental steps. Finally, the feasibility and convenience of the evaluation model are verified through experimental analysis.
Open Computer Science, vol. 12, no. 1, pp. 314-322.
Citations: 0
Big data network security defense mode of deep learning algorithm
IF 1.5 Q3 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2022-01-01 DOI: 10.1515/comp-2022-0257
Ying Yu
Abstract With the rapid development of big data technology, people can already use big data to judge the transmission and distribution of network information and make better decisions in time, but big data networks also face major threats such as Trojan horses and viruses. Traditional network security functions generally wait until the network power is turned on to a certain extent before starting, which makes it difficult to ensure the security of big data networks. To protect the network security of big data and improve its ability to defend against attacks, this article introduces a deep learning algorithm into research on the big data network security defense mode. The test results show that introducing deep learning algorithms into the network security model can enhance the network's security defense capability by 5.12% and proactively detect and eliminate cyberattacks that pose threats. At the same time, the security defense mode evaluates the network security of big data and analyzes potential network security risks in detail, which helps prevent risks before they occur and effectively protects network security in the context of big data.
Open Computer Science, volume 12, issue 1, pages 345 - 356.
Citations: 1
Post-quantum cryptography-driven security framework for cloud computing
IF 1.5 Q3 COMPUTER SCIENCE, THEORY & METHODS Pub Date : 2022-01-01 DOI: 10.1515/comp-2022-0235
H. C. Ukwuoma, A. J. Gabriel, A. Thompson, B. Alese
Abstract Data security in the cloud has been a major issue since the inception and adoption of cloud computing. Various frameworks have been proposed, yet data breaches persist. With encryption being the dominant method of cloud data security, the advent of quantum computing implies an urgent need for a model that provides adequate data security for both classical and quantum computing. Most cryptosystems will be rendered susceptible and obsolete, though some will stand the test of quantum computing. The article proposes a model that applies a variant of the McEliece cryptosystem, which has been tipped to replace Rivest–Shamir–Adleman (RSA) in the quantum computing era, to secure access control data, and a variant of the N-th degree truncated polynomial ring units (NTRU) cryptosystem to secure cloud user data. The simulation of the proposed McEliece algorithm showed that it has a better time complexity than the existing McEliece cryptosystem. Furthermore, the novel tweaking of parameters S and P further improves the security of the proposed algorithms. In contrast, the simulation of the proposed NTRU algorithm revealed that the existing NTRU cryptosystem had a better time complexity than the proposed NTRU cryptosystem.
Open Computer Science, volume 12, issue 1, pages 142 - 153.
Citations: 3