
International journal of artificial intelligence & applications: latest articles

Understanding Negative Sampling in Knowledge Graph Embedding
Pub Date : 2021-01-31 DOI: 10.5121/IJAIA.2021.12105
Jing Qian, Gangmin Li, Katie Atkinson, Yong Yue
Knowledge graph embedding (KGE) projects the entities and relations of a knowledge graph (KG) into a low-dimensional vector space, and has made steady progress in recent years. Conventional KGE methods, especially translational distance-based models, are trained by discriminating positive samples from negative ones. Most KGs store only positive samples for space efficiency, so negative sampling plays a crucial role in encoding the triples of a KG. The quality of the generated negative samples has a direct impact on the performance of the learnt knowledge representation in a myriad of downstream tasks, such as recommendation, link prediction and node classification. We summarize current negative sampling approaches in KGE into three categories: static distribution-based, dynamic distribution-based and custom cluster-based. Based on this categorization we discuss the most prevalent existing approaches and their characteristics. We hope this review can provide guidelines for new thinking about negative sampling in KGE.
International journal of artificial intelligence & applications, Vol. 12, No. 1, pp. 71-81.
Citations: 4
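As an illustration of the first category in the survey above (static distribution-based), the classic uniform strategy corrupts the head or tail of a positive triple with an entity drawn from a fixed uniform distribution. The sketch below uses toy data and is not tied to any particular KGE model:

```python
import random

def uniform_negative_samples(positive, entities, k=2, seed=0):
    """Static distribution-based negative sampling: corrupt the head or
    tail of each positive (h, r, t) triple with a uniformly drawn entity,
    rejecting corruptions that collide with known positive triples."""
    rng = random.Random(seed)
    known = set(positive)
    negatives = []
    for h, r, t in positive:
        drawn = 0
        while drawn < k:
            e = rng.choice(entities)
            # Corrupt head or tail with equal probability.
            cand = (e, r, t) if rng.random() < 0.5 else (h, r, e)
            if cand not in known:
                negatives.append(cand)
                drawn += 1
    return negatives

# Toy KG: entity and relation names are illustrative only.
triples = [("paris", "capital_of", "france"), ("berlin", "capital_of", "germany")]
ents = ["paris", "berlin", "france", "germany", "rome"]
negs = uniform_negative_samples(triples, ents, k=2)
```

Dynamic distribution-based methods differ only in that the sampling distribution adapts during training (e.g. to the current embedding scores) rather than staying fixed.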
Towards Predicting Software Defects with Clustering Techniques
Pub Date : 2021-01-31 DOI: 10.5121/IJAIA.2021.12103
Waheeda Almayyan
The purpose of software defect prediction is to improve the quality of a software project by building a predictive model that decides whether a software module is fault-prone. In recent years, much research has applied machine learning techniques to this topic. Our aim was to evaluate the performance of clustering techniques combined with feature selection schemes for software defect prediction. We analyzed the National Aeronautics and Space Administration (NASA) dataset benchmarks using three clustering algorithms: (1) Farthest First, (2) X-Means, and (3) self-organizing maps (SOM). To evaluate different feature selection algorithms, this article presents a comparative analysis of software defect prediction based on Bat, Cuckoo, Grey Wolf Optimizer (GWO), and particle swarm optimizer (PSO). The results obtained with the proposed clustering models enabled us to build an efficient predictive model with a satisfactory detection rate and an acceptable number of features.
International journal of artificial intelligence & applications, Vol. 12, No. 1, pp. 39-54.
Citations: 4
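The Farthest First clusterer named in the abstract can be sketched with the textbook farthest-first traversal: pick a first center, repeatedly add the point farthest from all chosen centers, then assign each point to its nearest center. This is a minimal illustration, not the exact configuration the paper used:

```python
import math

def farthest_first(points, k):
    """Farthest-first traversal clustering on 2-D points.

    Returns (centers, labels), where labels[i] is the index of the
    nearest center for points[i]."""
    centers = [points[0]]  # first center chosen arbitrarily
    while len(centers) < k:
        # Next center: the point maximizing its distance to the nearest center.
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(far)
    labels = [min(range(k), key=lambda i: math.dist(p, centers[i]))
              for p in points]
    return centers, labels

centers, labels = farthest_first([(0, 0), (0, 1), (10, 10), (10, 11)], 2)
```

Because centers are actual data points and each step needs only distance comparisons, the traversal is fast, which is one reason it is attractive for large defect datasets.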
Analysis of Enterprise Shared Resource Invocation Scheme based on Hadoop and R
Pub Date : 2021-01-31 DOI: 10.5121/IJAIA.2021.12104
H. Xiong
The response rate and performance of enterprise resource calls have become an important measure of differences in enterprise user experience. An efficient enterprise shared resource calling system can significantly improve the office efficiency of enterprise users and the fluency of their resource calls. Hadoop offers powerful data integration and analysis capabilities for resource extraction, while R offers excellent statistical capabilities and personalized decomposition and display of resources in data calling. This article proposes an integration scheme for enterprise shared resource invocation based on Hadoop and R to further improve the efficiency of shared resource utilization, improve the efficiency of system operation, and give enterprise users a better experience. First, we use Hadoop to extract the shared resources required by enterprise users from nearby resource storage machine rooms and terminal equipment to increase the call rate, and use R to convert the user's search results into linear correlations, which are displayed in order of correlation strength to improve response speed and experience. This article proposes feasible solutions to the shortcomings of current enterprise shared resource invocation: public data sets can be used to perform personalized regression analysis of user needs, and the most relevant information can be optimized and integrated.
International journal of artificial intelligence & applications, Vol. 12, No. 1, pp. 55-69.
Citations: 1
Supervised and Unsupervised Machine Learning Methodologies for Crime Pattern Analysis
Pub Date : 2021-01-31 DOI: 10.5121/IJAIA.2021.12106
D. Sardana, S. Marwaha, R. Bhatnagar
Crime is a grave problem that affects all countries in the world. The level of crime in a country has a big impact on its economic growth and on citizens' quality of life. In this paper, we survey trends in supervised and unsupervised machine learning methods used for crime pattern analysis. We use a spatiotemporal dataset of crimes in San Francisco, CA to demonstrate some of these strategies. We use classification models, namely Logistic Regression, Random Forest, Gradient Boosting and Naive Bayes, to predict crime types such as Larceny and Theft, and propose model optimization strategies. Further, we use a graph-based unsupervised machine learning technique called core-periphery structures to analyze how crime behavior evolves over time. These methods can be generalized to other counties and can be greatly helpful in planning police task forces for law enforcement and crime prevention.
International journal of artificial intelligence & applications, Vol. 12, No. 1, pp. 83-99.
Citations: 6
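As a minimal illustration of one of the listed classifiers, a categorical Naive Bayes over toy (district, time-of-day) features might look as follows. The feature names, values and Laplace smoothing are illustrative assumptions, not the authors' setup:

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Categorical Naive Bayes with Laplace smoothing.

    Scores a class as log P(class) + sum_i log P(feature_i | class)
    and returns a predict(row) function."""
    classes = Counter(labels)
    counts = defaultdict(Counter)  # (class, feature index) -> value counts
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            counts[(y, i)][v] += 1

    def predict(row):
        def log_post(y):
            lp = math.log(classes[y] / len(labels))
            for i, v in enumerate(row):
                c = counts[(y, i)]
                # +1 smoothing so unseen feature values get nonzero mass.
                lp += math.log((c[v] + 1) / (sum(c.values()) + len(c) + 1))
            return lp
        return max(classes, key=log_post)

    return predict

# Toy rows: (district, time bucket) -> crime type; the data is invented.
X = [("soma", "night"), ("soma", "night"), ("mission", "day"), ("mission", "day")]
y = ["theft", "theft", "larceny", "larceny"]
predict = train_nb(X, y)
```

The same interface generalizes to the other models in the paper (Logistic Regression, Random Forest, Gradient Boosting) via any standard ML library.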
A Brief Survey of Question Answering Systems
Pub Date : 2021-01-01 DOI: 10.5121/ijaia.2021.12501
Michael Caballero
Question Answering (QA) is a subfield of Natural Language Processing (NLP) and computer science focused on building systems that automatically answer questions from humans in natural language. This survey summarizes the history and current state of the field and is intended as an introductory overview of QA systems. After discussing QA history, this paper summarizes the different approaches to the architecture of QA systems -- whether they are closed or open-domain and whether they are text-based, knowledge-based, or hybrid systems. Lastly, some common datasets in this field are introduced and different evaluation metrics are discussed.
Citations: 4
Answer Set Programming to Model Plan Agent Scenarios
Pub Date : 2020-11-30 DOI: 10.5121/ijaia.2020.11606
F. Z. Flores, Rosalba Cuapa Canto, José María Ángeles López
One of the most challenging aspects of reasoning, planning, and acting in an agent domain is reasoning about what an agent knows about its environment when planning and acting. Various proposals have addressed this problem using modal, epistemic and other logics. In this paper we explore how to take advantage of the properties of Answer Set Programming for this purpose. Answer Set Programming's non-monotonicity allows us to express causality in an elegant fashion. We begin our discussion by showing how Answer Set Programming can be used to model the frogs problem. We then illustrate how this problem can be represented and solved using these concepts. In addition, our proposal allows us to solve the generalization of this problem, that is, for any number of frogs.
International journal of artificial intelligence & applications, Vol. 11, No. 1, pp. 55-63.
Citations: 0
Automatic Transfer Rate Adjustment for Transfer Reinforcement Learning
Pub Date : 2020-11-30 DOI: 10.5121/ijaia.2020.11605
H. Kono, Yuto Sakamoto, Yonghoon Ji, Hiromitsu Fujii
This paper proposes a novel parameter for transfer reinforcement learning to avoid over-fitting when an agent uses a policy transferred from a source task. Learning robot systems have recently been studied for many applications, such as home robots, communication robots, and warehouse robots. However, if the agent reuses knowledge that has been sufficiently learned in the source task, deadlock may occur and appropriate transfer learning may not be realized. In previous work, a parameter called the transfer rate was proposed to adjust the ratio of transfer, and its contributions include avoiding deadlock in the target task. However, adjusting the parameter depends on human intuition and experience, and a method for deciding the transfer rate has not been discussed. Therefore, this paper proposes an automatic method for adjusting the transfer rate using a sigmoid function. Computer simulations are used to evaluate the effectiveness of the proposed method at improving environmental adaptation performance in a target task, i.e., in the situation of reusing knowledge.
International journal of artificial intelligence & applications, Vol. 11, No. 1, pp. 47-54.
Citations: 1
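The abstract does not give the paper's exact parameterization, but a sigmoid schedule that shifts control from the transferred source policy to the target policy as learning progresses could look like the following sketch, with the midpoint and gain as assumed hyper-parameters:

```python
import math

def transfer_rate(episode, midpoint=50.0, gain=0.1):
    """Sigmoid schedule for the transfer rate in [0, 1].

    Early episodes (rate near 1) rely on the transferred source policy;
    late episodes (rate near 0) rely on the target policy, which is one
    way to avoid over-fitting to source-task knowledge. The midpoint and
    gain values here are illustrative, not the paper's."""
    return 1.0 / (1.0 + math.exp(gain * (episode - midpoint)))
```

A typical use is to act with the source policy with probability `transfer_rate(episode)` and with the learned target policy otherwise, so the hand-off needs no manual tuning per task.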
Intelligent Portfolio Management via NLP Analysis of Financial 10-k Statements
Pub Date : 2020-11-30 DOI: 10.5121/ijaia.2020.11602
Purva Singh
The paper analyzes whether the sentiment stability of financial 10-K reports over time can determine a company's future mean returns. A diverse portfolio of stocks was selected to test this hypothesis. The proposed framework downloads companies' 10-K reports from the SEC's EDGAR database and passes them through a preprocessing pipeline that extracts the critical sections of the filings for NLP analysis. Using the Loughran-McDonald sentiment word list, the framework generates sentiment TF-IDF vectors from the 10-K documents, calculates the cosine similarity between two consecutive 10-K reports, and proposes to leverage this cosine similarity as the alpha factor. To analyze the effectiveness of the alpha factor at predicting future returns, the framework uses the alphalens library to perform factor return analysis and turnover analysis, and to compare the Sharpe ratios of potential alpha factors. The results show a strong correlation between the sentiment stability of the portfolio's 10-K statements and its future mean returns. For the benefit of the research community, the code and Jupyter notebooks related to this paper have been open-sourced on GitHub.
International journal of artificial intelligence & applications, Vol. 11, No. 1, pp. 13-25.
Citations: 0
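The core alpha-factor computation described above, cosine similarity between the sentiment term-weight vectors of two consecutive filings, can be sketched as below. The five-word list stands in for the Loughran-McDonald word list, and raw term frequencies stand in for the paper's TF-IDF weights:

```python
import math
from collections import Counter

# Illustrative stand-in for the Loughran-McDonald negative word list.
NEGATIVE_WORDS = {"loss", "impairment", "litigation", "adverse", "decline"}

def sentiment_vector(tokens):
    """Term-frequency vector restricted to the sentiment word list."""
    return Counter(t for t in tokens if t in NEGATIVE_WORDS)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy tokenized filings for two consecutive years.
doc_2019 = "adverse litigation loss loss decline".split()
doc_2020 = "adverse litigation loss decline decline".split()
alpha = cosine(sentiment_vector(doc_2019), sentiment_vector(doc_2020))
```

A similarity near 1 indicates stable sentiment between consecutive filings; the paper's hypothesis is that this stability is informative about future mean returns.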
Using Contextual Graphs as a Decision-making Tool in the Process of Hiring Candidates
Pub Date : 2020-11-30 DOI: 10.5121/ijaia.2020.11604
H. Tahir, P. Brézillon
Poor selection of employees can be a first step towards lack of motivation, poor performance, and high turnover. It is no wonder that organizations try to avoid these slippages by finding the best possible person for the job. It is therefore important to understand the context of the hiring process and which recruiting mistakes are most damaging to the organization, in order to reduce the recruiting challenges faced by human resource managers and build their capacity to ensure optimal HR performance. This paper initiates research on how Contextual Graphs Formalism can be used to improve decision making in the process of hiring potential candidates. An example of a typical procedure for visualizing recruiting phases shows how to add contextual elements and practices in order to communicate the recruitment policy in a concrete and memorable way to both hiring teams and candidates.
International journal of artificial intelligence & applications, Vol. 11, No. 1, pp. 37-46.
Citations: 0
Problem Decomposition and Information Minimization for the Global, Concurrent, On-line Validation of Neutron Noise Signals and Neutron Detector Operation
Pub Date : 2020-11-30 DOI: 10.5121/ijaia.2020.11601
Tatiana Tambouratzis
This piece of research introduces a purely data-driven, directly reconfigurable, divide-and-conquer on-line monitoring (OLM) methodology for automatically selecting the minimum number of neutron detectors (NDs) – and corresponding neutron noise signals (NSs) – which are currently necessary, as well as sufficient, for inspecting the entire nuclear reactor (NR) in-core area. The proposed implementation builds upon the 3-tuple configuration, according to which three sufficiently pairwise-correlated NSs are capable of on-line (I) verifying each NS of the 3-tuple and (II) endorsing correct functioning of each corresponding ND, implemented herein via straightforward pairwise comparisons of fixed-length sliding time-windows (STWs) between the three NSs of the 3-tuple. A pressurized water NR (PWR) model – developed for H2020 CORTEX – is used for deriving the optimal ND/NS configuration, where (i) the evident partitioning of the 36 NDs/NSs into six clusters of six NDs/NSs each, and (ii) the high cross-correlations (CCs) within every 3-tuple of NSs, endorse the use of a constant pair comprising the two most highly CC-ed NSs per cluster as the first two members of the 3-tuple, with the third member being each remaining NS of the cluster, in turn, thereby computationally streamlining OLM without compromising the identification of either deviating NSs or malfunctioning NDs. Tests on the in-core dataset of the PWR model demonstrate the potential of the proposed methodology in terms of suitability for, efficiency at, as well as robustness in ND/NS selection, further establishing the “directly reconfigurable” property of the proposed approach at every point in time while using one-third only of the original NDs/NSs.
{"title":"Problem Decomposition and Information Minimization for the Global, Concurrent, On-line Validation of Neutron Noise Signals and Neutron Detector Operation","authors":"Tatiana Tambouratzis","doi":"10.5121/ijaia.2020.11601","DOIUrl":"https://doi.org/10.5121/ijaia.2020.11601","abstract":"This piece of research introduces a purely data-driven, directly reconfigurable, divide-and-conquer on-line monitoring (OLM) methodology for automatically selecting the minimum number of neutron detectors (NDs) – and corresponding neutron noise signals (NSs) – which are currently necessary, as well as sufficient, for inspecting the entire nuclear reactor (NR) in-core area. The proposed implementation builds upon the 3-tuple configuration, according to which three sufficiently pairwise-correlated NSs are capable of on-line (I) verifying each NS of the 3-tuple and (II) endorsing correct functioning of each corresponding ND, implemented herein via straightforward pairwise comparisons of fixed-length sliding time-windows (STWs) between the three NSs of the 3-tuple. A pressurized water NR (PWR) model – developed for H2020 CORTEX – is used for deriving the optimal ND/NS configuration, where (i) the evident partitioning of the 36 NDs/NSs into six clusters of six NDs/NSs each, and (ii) the high cross-correlations (CCs) within every 3-tuple of NSs, endorse the use of a constant pair comprising the two most highly CC-ed NSs per cluster as the first two members of the 3-tuple, with the third member being each remaining NS of the cluster, in turn, thereby computationally streamlining OLM without compromising the identification of either deviating NSs or malfunctioning NDs. Tests on the in-core dataset of the PWR model demonstrate the potential of the proposed methodology in terms of suitability for, efficiency at, as well as robustness in ND/NS selection, further establishing the “directly reconfigurable” property of the proposed approach at every point in time while using one-third only of the original NDs/NSs.","PeriodicalId":93188,"journal":{"name":"International journal of artificial intelligence & applications","volume":"11 1","pages":"1-12"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45024519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
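The validation step described in the abstract — pairwise comparisons of fixed-length sliding time-windows between the three neutron noise signals of a 3-tuple — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the window length, correlation threshold, and function names are assumptions chosen for demonstration.

```python
import numpy as np

def windowed_cc(x, y, win):
    """Pearson cross-correlation of two equal-length signals over
    consecutive fixed-length (non-overlapping) sliding time-windows."""
    ccs = []
    for start in range(0, len(x) - win + 1, win):
        xs, ys = x[start:start + win], y[start:start + win]
        ccs.append(np.corrcoef(xs, ys)[0, 1])
    return np.array(ccs)

def validate_3tuple(ns_a, ns_b, ns_c, win=256, threshold=0.8):
    """Check all three signal pairs of a 3-tuple window-by-window.
    A window whose pairwise CC falls below the threshold flags a
    possible deviating signal (the one shared by both failing pairs)
    or a malfunctioning detector."""
    pairs = {"ab": windowed_cc(ns_a, ns_b, win),
             "ac": windowed_cc(ns_a, ns_c, win),
             "bc": windowed_cc(ns_b, ns_c, win)}
    flags = {k: ccs < threshold for k, ccs in pairs.items()}
    return pairs, flags

# Hypothetical usage: two well-correlated signals and one deviating one.
rng = np.random.default_rng(0)
base = np.sin(np.linspace(0, 40 * np.pi, 1024))
ns_a = base + 0.01 * rng.normal(size=1024)
ns_b = base + 0.01 * rng.normal(size=1024)
ns_c = rng.normal(size=1024)  # uncorrelated: simulates a deviating NS
pairs, flags = validate_3tuple(ns_a, ns_b, ns_c)
```

In this toy setup the a–b pair stays highly correlated in every window, while both pairs involving the decorrelated third signal are flagged, singling it out as the deviating member of the 3-tuple.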