
Latest publications in J. Chem. Inf. Comput. Sci.

Constructing Metadata Schema of Scientific and Technical Report Based on FRBR
Pub Date: 2018-03-26 DOI: 10.5539/CIS.V11N2P34
Xiaozhu Zou, Siyi Xiong, Zhi Li, P. Jiang
The scientific and technical report is an important document type with high intelligence value. However, report resources held in different carrier forms are not integrated, and existing resource descriptions are neither deep nor specific enough for this document type, which reduces the accuracy and efficiency of users' information searches. Functional Requirements for Bibliographic Records (FRBR), an emerging model in the bibliographic domain, offers promising possibilities for the cataloguing, representation and semantic enrichment of bibliographic data. This study employs the FRBR conceptual model and entity-relationship analysis to design an in-depth descriptive metadata schema for scientific and technical reports, analyzing the entities and mapping the bibliographic attributes that correspond to the characteristics of the report, which can help integrate and disclose scientific and technical report resources.
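A minimal sketch (not the paper's schema) of how the four FRBR Group 1 entities could be instantiated for a technical report, with a few report-specific attributes such as report number and issuing agency; all class and field names below are illustrative assumptions.

```python
from dataclasses import dataclass

# FRBR Group 1 entities applied to a scientific/technical report.
# Attribute names are illustrative assumptions, not the paper's metadata schema.

@dataclass
class Work:                      # the distinct intellectual creation
    title: str
    subject: str
    report_number: str           # report-specific attribute

@dataclass
class Expression:                # a specific intellectual realization of the Work
    work: Work
    language: str
    version: str                 # e.g. "draft" or "final"

@dataclass
class Manifestation:             # a physical or digital embodiment of the Expression
    expression: Expression
    carrier: str                 # e.g. "print" or "PDF"
    issuing_agency: str

@dataclass
class Item:                      # a single exemplar of the Manifestation
    manifestation: Manifestation
    holding_institution: str
    identifier: str

# One report described at all four FRBR levels.
work = Work("Soil Survey of Region X", "geology", "STR-2018-007")
expr = Expression(work, "en", "final")
mani = Manifestation(expr, "PDF", "National Survey Agency")
item = Item(mani, "University Library", "urn:example:item:1")
print(item.manifestation.expression.work.title)
```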
Citations: 2
Finding Nutritional Deficiency and Disease Pattern of Rural People Using Fuzzy Logic and Big Data Techniques on Hadoop
Pub Date: 2018-03-20 DOI: 10.5539/cis.v11n2p11
Sadia Yeasmin, Muhammad Abrar Hussain, Noor Yazdani Sikder, R. Rahman
For decades there has been strong demand for a tool to identify the nutritional needs of the people of Bangladesh, since the country has an alarmingly high rate of undernutrition compared with other countries. This analysis focuses on how the diseases caused by malnutrition differ across the districts of Bangladesh. Among the 64 districts, not a single one was found where people have developed proper nutritional food habits. Low income and limited knowledge are the triggering factors, and the situation is worse in rural areas. In this research, a distributed framework for processing large data sets is built on big data models. Fuzzy logic can model the nutrition problem by helping to calculate the suitability between food calories and a user's profile. A MapReduce-based K-nearest neighbor (mrK-NN) classifier is applied to classify the data. We have designed a balanced model that applies fuzzy logic and big data analysis on Hadoop to food habits, food nutrition and disease, especially for rural people.
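A minimal, single-machine sketch of the map/reduce split behind a MapReduce-based K-nearest-neighbor classifier such as the mrK-NN mentioned above: each mapper emits distances from a query profile to its shard of labelled records, and the reducer keeps the k smallest and takes a majority vote. The toy feature vectors and labels are assumptions for illustration, not the authors' data or Hadoop code.

```python
import heapq
from math import dist  # Euclidean distance (Python 3.8+)

# Toy training shards as mappers would see them: (features, deficiency/disease label).
shards = [
    [((62, 1.2), "anemia"), ((75, 2.1), "none")],
    [((58, 0.9), "rickets"), ((70, 1.8), "none")],
]

def mapper(shard, query):
    # Emit (distance, label) pairs for one data partition.
    return [(dist(features, query), label) for features, label in shard]

def reducer(mapped_outputs, k):
    # Keep the k globally nearest neighbours and take a majority vote.
    nearest = heapq.nsmallest(k, (pair for out in mapped_outputs for pair in out))
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

query = (60, 1.0)                      # assumed profile, e.g. (daily protein g, iron mg)
mapped = [mapper(s, query) for s in shards]
print(reducer(mapped, k=3))            # predicted deficiency/disease label
```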
Citations: 2
Vehicle Scheduling Model Based on Data Mining
Pub Date: 2018-01-31 DOI: 10.5539/cis.v11n1p104
Guohua Zhang, Ting Xie, Min Liu, Yang Liu
The article presents a shortest-path model of vehicle scheduling, based on an analysis of how data mining can be applied to vehicle scheduling models, a review of the research status of data mining, and a description of the logistics distribution process. The article also provides algorithmic support by keeping the Dijkstra algorithm for the shortest-path model simple and rational.
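The shortest-path computation underlying the model is Dijkstra's algorithm; a minimal priority-queue version is sketched below on an assumed toy distribution network (node names and distances are illustrative, not taken from the article).

```python
import heapq

def dijkstra(graph, source):
    """Return shortest distances from source in a weighted graph given as an adjacency dict."""
    dist = {source: 0}
    pq = [(0, source)]                       # (distance-so-far, node)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                         # stale queue entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# Toy distribution network: a depot and four delivery points (edge weights in km).
roads = {
    "depot": {"A": 4, "B": 2},
    "B": {"A": 1, "C": 7},
    "A": {"C": 3, "D": 6},
    "C": {"D": 2},
}
print(dijkstra(roads, "depot"))   # {'depot': 0, 'B': 2, 'A': 3, 'C': 6, 'D': 8}
```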
Citations: 0
A Simulation Model of IEEE 802.15.4 GTS Mechanism and GTS Attacks in OMNeT++ / MiXiM + NETA
Pub Date: 2018-01-27 DOI: 10.5539/cis.v11n1p78
Y. M. Amin, A. T. Abdel-Hamid
The IEEE 802.15.4 standard defines the PHY and MAC layer specifications for Low-Rate Wireless Personal Area Networks (LR-WPANs). With the proliferation of many time-critical applications with real-time delivery, low latency, and/or specific bandwidth requirements, Guaranteed Time Slots (GTS) are increasingly being used for reliable contention-free data transmission by nodes within beacon-enabled WPANs. To evaluate the performance of the 802.15.4 GTS management scheme, this paper introduces a new GTS simulation model for OMNeT++ / MiXiM. Our GTS model considers star-topology WPANs within the 2.4 GHz frequency band, and is in full conformance with the IEEE 802.15.4-2006 standard. To enable thorough investigation of the behaviors and impacts of different attacks against the 802.15.4 GTS mechanism, a new GTS attack simulation model for OMNeT++ is also introduced in this paper. Our GTS attack model is developed for OMNeT++ / NETA, and is integrated with our GTS model to provide a single inclusive OMNeT++ simulation model for both the GTS mechanism and all attacks against it known to date.
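As a rough illustration of what the GTS mechanism manages (not the authors' OMNeT++ / MiXiM model and not a standard-conformant implementation), the toy coordinator below hands out contention-free slots from the end of a 16-slot superframe while honouring the standard's cap of seven GTS allocations; the minimum-CAP floor, slot lengths and device names are simplifying assumptions.

```python
NUM_SUPERFRAME_SLOTS = 16   # slots per 802.15.4 superframe
MAX_GTS = 7                 # the standard allows at most seven GTS allocations
MIN_CAP_SLOTS = 9           # simplification: keep a fixed minimum contention access period

class ToyCoordinator:
    """Grossly simplified GTS bookkeeping for a star-topology PAN coordinator."""

    def __init__(self):
        self.grants = {}                        # device -> (start_slot, length)
        self.next_free = NUM_SUPERFRAME_SLOTS   # CFP grows downward from the last slot

    def request_gts(self, device, length):
        cfp_used = NUM_SUPERFRAME_SLOTS - self.next_free
        if (len(self.grants) >= MAX_GTS or
                cfp_used + length > NUM_SUPERFRAME_SLOTS - MIN_CAP_SLOTS):
            return None                         # request denied: no capacity left
        self.next_free -= length
        self.grants[device] = (self.next_free, length)
        return self.grants[device]

coord = ToyCoordinator()
print(coord.request_gts("node-1", 2))   # (14, 2): slots 14-15 become contention free
print(coord.request_gts("node-2", 3))   # (11, 3)
print(coord.request_gts("node-3", 4))   # None: would shrink the CAP below the floor
```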
Citations: 3
Enhancing Big Data Auditing
Pub Date: 2018-01-27 DOI: 10.5539/cis.v11n1p90
Sara Alomari, Mona Alghamdi, F. Alotaibi
Auditing services for outsourced data, especially big data, have recently been an active research area, and many remote data auditing (RDA) schemes have been proposed. The two categories of RDA, Provable Data Possession (PDP) and Proof of Retrievability (PoR), provide the core schemes from which most researchers derive new schemes that support additional capabilities such as batch and dynamic auditing. In this paper, we investigate the most popular PDP schemes, since many PDP techniques have been further improved to achieve efficient integrity verification. We first review the literature to establish the required knowledge about auditing services and the related schemes. Secondly, we specify the methodology followed to attain the research goals. Then, we define each selected PDP scheme and the auditing properties used to compare the chosen schemes. Finally, we determine, where possible, which scheme is optimal for handling big data auditing.
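The flavor of the schemes surveyed can be illustrated with a toy keyed-hash challenge over randomly sampled blocks: the verifier checks possession without downloading the whole file. This naive check still requires the verifier to hold the data (or precomputed answers), which is exactly the limitation that real PDP constructions remove by using homomorphic verifiable tags; the block size, key and sample size below are assumptions.

```python
import hashlib, hmac, os, random

BLOCK_SIZE = 4096

def split_blocks(data):
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]

def challenge(num_blocks, sample_size, nonce):
    # Verifier picks random block indices and a fresh nonce.
    return random.sample(range(num_blocks), sample_size), nonce

def prove(blocks, indices, nonce, key):
    # Prover (storage server) answers with an HMAC over the challenged blocks.
    mac = hmac.new(key, nonce, hashlib.sha256)
    for i in indices:
        mac.update(blocks[i])
    return mac.digest()

def verify(local_blocks, indices, nonce, key, proof):
    expected = prove(local_blocks, indices, nonce, key)
    return hmac.compare_digest(expected, proof)

key = os.urandom(32)
data = os.urandom(10 * BLOCK_SIZE)             # the outsourced file
blocks = split_blocks(data)

idx, nonce = challenge(len(blocks), sample_size=3, nonce=os.urandom(16))
proof = prove(blocks, idx, nonce, key)         # computed by the remote server
print(verify(blocks, idx, nonce, key, proof))  # True if the sampled blocks are intact
```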
Citations: 2
The Analysis and Implementation of the K-Means Algorithm Based on Hadoop Platform
Pub Date: 2018-01-27 DOI: 10.5539/cis.v11n1p98
L. Wei
Today's society has entered the era of big data, and the growing diversity and volume of data pose great challenges for data storage and processing; Hadoop HDFS and MapReduce solve these two problems well. The classical K-means algorithm is the most widely used partition-based clustering algorithm. After completing the cluster configuration, this paper examines the operating principle of the K-means algorithm in cluster mode, implements K-means on the cluster, studies and analyzes the experimental results, and summarizes the strengths and limitations of running the K-means algorithm on the Hadoop platform.
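A minimal, single-process sketch of one K-means pass expressed as a map step (assign each point to its nearest centroid) and a reduce step (recompute centroids as group means), in the spirit of running K-means over MapReduce as described above; the sample points, initial centroids and iteration count are assumptions, not the paper's Hadoop job.

```python
from collections import defaultdict
from math import dist

def map_assign(points, centroids):
    # Map step: emit (nearest-centroid-index, point) for every point.
    for p in points:
        idx = min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        yield idx, p

def reduce_update(assignments):
    # Reduce step: average the points assigned to each centroid index.
    groups = defaultdict(list)
    for idx, p in assignments:
        groups[idx].append(p)
    return {idx: tuple(sum(coord) / len(pts) for coord in zip(*pts))
            for idx, pts in groups.items()}

points = [(1, 1), (1.5, 2), (8, 8), (9, 9), (0.5, 1.2)]
centroids = [(0, 0), (10, 10)]               # assumed initial centroids, k = 2

for _ in range(5):                           # a few iterations suffice on this toy data
    new = reduce_update(map_assign(points, centroids))
    centroids = [new.get(i, centroids[i]) for i in range(len(centroids))]

print(centroids)   # roughly [(1.0, 1.4), (8.5, 8.5)]
```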
Citations: 1
Matrix Factorization Techniques for Context-Aware Collaborative Filtering Recommender Systems: A Survey
Pub Date: 2018-01-24 DOI: 10.5539/cis.v11n2p1
M. H. Abdi, G. Okeyo, R. Mwangi
Collaborative Filtering Recommender Systems predict user preferences for online information, products or services by learning from past user-item relationships. A predominant approach to Collaborative Filtering is neighborhood-based, where a user-item preference rating is computed from the ratings of similar items and/or users. This approach encounters data sparsity and scalability limitations as the volume of accessible information and the number of active users continue to grow, leading to performance degradation, poor-quality recommendations and inaccurate predictions. Despite these drawbacks, the problem of information overload has led to great interest in personalization techniques. The incorporation of context information and of Matrix and Tensor Factorization techniques has proved to be a promising solution to some of these challenges. We conducted a focused review of the literature on Context-aware Recommender Systems that utilize Matrix Factorization approaches. This survey paper presents a detailed literature review of Context-aware Recommender Systems, of approaches to improving performance on large-scale datasets, and of the impact of incorporating contextual information on the quality and accuracy of recommendations. The results of this survey can be used as a basic reference for improving and optimizing existing Context-aware Collaborative Filtering based Recommender Systems. The main contribution of this paper is a survey of Matrix Factorization techniques for Context-aware Collaborative Filtering Recommender Systems.
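The core of the factorization techniques surveyed is predicting a rating as the dot product of a learned user factor vector and item factor vector, trained by stochastic gradient descent on the observed ratings. The sketch below shows this basic, non-context-aware form on assumed toy ratings; context-aware variants extend it with context dimensions or tensor factorization.

```python
import numpy as np

rng = np.random.default_rng(0)
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 1, 1.0)]  # (user, item, rating)
n_users, n_items, k = 3, 2, 4

P = 0.1 * rng.standard_normal((n_users, k))     # user latent factors
Q = 0.1 * rng.standard_normal((n_items, k))     # item latent factors
lr, reg = 0.05, 0.02

for epoch in range(200):
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]                    # prediction error for this rating
        P[u] += lr * (err * Q[i] - reg * P[u])   # SGD updates with L2 regularization
        Q[i] += lr * (err * P[u] - reg * Q[i])

print(round(P[0] @ Q[1], 2))   # reconstructed rating for user 0, item 1 (close to 3)
```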
Citations: 47
A Class Validation Proposal of a Pedagogic Domain Ontology based on Clustering Analysis
Pub Date: 2018-01-16 DOI: 10.5539/cis.v11n1p65
Yuridiana Alemán, M. J. S. García, D. V. Ayala
The knowledge bases of the Web are fundamentally organized as ontologies in order to answer queries based on semantics. The ontology learning process comprises three fundamental steps: creation of classes and relationships, population, and evaluation. This paper focuses on class creation, introducing a class validation proposal that uses clustering analysis. A pedagogical domain was selected as the case study, for which a corpus was semi-automatically built from Social Sciences articles written in Spanish. Moreover, a dictionary containing classes, concepts and synonyms was included in the experiments. The clustering analysis made it possible to verify the concepts that the experts considered most important for the domain. For the selected case study, the cluster analysis step reports clusters with the same instances as the clusters defined by the experts.
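One way to make the comparison between clusters and expert-defined classes concrete is a purity score: for each cluster, count the items whose expert-assigned class matches the cluster's majority class. The terms, classes and cluster assignments below are assumptions for illustration, not the paper's Spanish corpus.

```python
from collections import Counter

# Expert-assigned class per term, and the cluster each term landed in (assumed toy data).
expert_class = {"exam": "assessment", "quiz": "assessment", "rubric": "assessment",
                "lecture": "teaching", "seminar": "teaching", "syllabus": "teaching"}
cluster_of   = {"exam": 0, "quiz": 0, "rubric": 0, "lecture": 1, "seminar": 1, "syllabus": 0}

def purity(expert_class, cluster_of):
    clusters = {}
    for term, c in cluster_of.items():
        clusters.setdefault(c, []).append(expert_class[term])
    # For each cluster, count the members that share the cluster's majority class.
    majority_hits = sum(Counter(labels).most_common(1)[0][1] for labels in clusters.values())
    return majority_hits / len(cluster_of)

print(purity(expert_class, cluster_of))   # 5/6 ≈ 0.83: clusters mostly match the expert classes
```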
Citations: 1
Stock Market Classification Model Using Sentiment Analysis on Twitter Based on Hybrid Naive Bayes Classifiers
Pub Date: 2018-01-11 DOI: 10.5539/cis.v11n1p52
Ghaith Abdulsattar A. Jabbar Alkubaisi, S. S. Kamaruddin, H. Husni
Sentiment analysis has become one of the most popular processes for predicting stock market behaviour from consumer reactions. At the same time, the availability of data from Twitter has attracted researchers to this research area. Most models related to sentiment analysis still suffer from inaccuracies, and low classification accuracy has a direct effect on the reliability of stock market indicators. The study primarily focuses on the analysis of a Twitter dataset. Moreover, an improved model is proposed in this study; it is designed to enhance classification accuracy. The first phase of this model is data collection, and the second involves filtration and transformation, which are conducted to retain only relevant data. The most crucial phase is labelling, in which the polarity of the data is determined and negative, positive or neutral values are assigned to people's opinions. The fourth phase is classification, in which suitable stock market patterns are identified by hybridizing Naive Bayes Classifiers (NBCs), and the final phase is performance evaluation. This study proposes Hybrid Naive Bayes Classifiers (HNBCs) as a machine learning method for stock market classification. The outcome is instrumental for investors, companies, and researchers, enabling them to formulate their plans according to people's sentiments. The proposed method has produced a significant result, achieving an accuracy of 90.38%.
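A minimal multinomial Naive Bayes classifier with Laplace smoothing, shown to illustrate the base classifier behind the hybrid NBC approach; the hybridization itself is the paper's contribution and is not reproduced here, and the labelled example tweets are assumptions.

```python
import math
from collections import Counter, defaultdict

# Assumed toy labelled tweets (text, sentiment).
train = [("stocks rally strong buy", "positive"),
         ("market crash heavy losses", "negative"),
         ("great earnings buy now", "positive"),
         ("sell off fears losses", "negative")]

class_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)                 # per-class word frequencies
for text, label in train:
    word_counts[label].update(text.split())
vocab = {w for counts in word_counts.values() for w in counts}

def predict(text):
    scores = {}
    for label in class_counts:
        # log prior + sum of log likelihoods with add-one (Laplace) smoothing
        score = math.log(class_counts[label] / len(train))
        total = sum(word_counts[label].values())
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("buy the rally"))      # positive
print(predict("fears of a crash"))   # negative
```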
Citations: 30
Enhanced Firefly Algorithm Using Fuzzy Parameter Tuner
Pub Date: 2018-01-07 DOI: 10.5539/cis.v11n1p26
Mahdi Bidar, S. Sadaoui, Malek Mouhoub, Mohsen Bidar
Exploitation and exploration are the two main search strategies of every metaheuristic algorithm, and the ratio between them has a significant impact on the performance of these algorithms when dealing with optimization problems. In this study, we introduce a complete fuzzy system to tune the firefly algorithm parameters efficiently and dynamically, in order to keep exploration and exploitation in balance at each search step. This prevents the firefly algorithm from getting stuck in local optima, a challenging issue in metaheuristic algorithms. To evaluate the quality of the solutions returned by the fuzzy-based firefly algorithm, we conduct extensive experiments on a set of high- and low-dimensional benchmark functions as well as two constrained engineering problems. We compare the improved firefly algorithm with the standard one and with other well-known metaheuristic algorithms. The experimental results demonstrate the superiority of the fuzzy-based firefly algorithm over the standard firefly algorithm and its comparability to other metaheuristic algorithms.
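A compact version of the standard firefly update that the paper improves on: each firefly moves toward brighter (better) ones with an attractiveness that decays with distance, plus a small random walk. The fuzzy system that adapts the parameters alpha, beta0 and gamma during the search is the paper's contribution and is not reproduced here; the fixed values, simple alpha decay and sphere benchmark below are assumptions.

```python
import numpy as np

def sphere(x):
    return float(np.sum(x ** 2))            # benchmark objective to minimise

def firefly(obj, dim=5, n=20, iters=200, alpha=0.2, beta0=1.0, gamma=0.01, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5, 5, (n, dim))         # firefly positions
    I = np.array([obj(x) for x in X])        # light intensity: lower objective = brighter
    for _ in range(iters):
        for i in range(n):
            for j in range(n):
                if I[j] < I[i]:              # j is brighter, so i moves toward j
                    r2 = np.sum((X[i] - X[j]) ** 2)
                    beta = beta0 * np.exp(-gamma * r2)
                    X[i] += beta * (X[j] - X[i]) + alpha * (rng.random(dim) - 0.5)
                    I[i] = obj(X[i])
        alpha *= 0.98                        # fixed decay; the paper tunes this with fuzzy rules
    best = int(np.argmin(I))
    return X[best], I[best]

x_best, f_best = firefly(sphere)
print(round(f_best, 4))                      # best sphere value found (should be small)
```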
Citations: 7