
Data Intelligence: Latest Articles

A Civil Aviation Customer Service Ontology and Its Applications
IF 3.9; Zone 3 (Computer Science); Q3 (Computer Science, Information Systems); Pub Date: 2023-11-01; DOI: 10.1162/dint_a_00237
Meixiang Lv, Xudong Cao, Tianxing Wu, Yuehua Li
ABSTRACT While developing the customer service intelligence system for the C919 large aircraft, we found that heterogeneous and incomplete data lead to inefficient and inaccurate decision making. To address this problem, we propose introducing ontology modeling and reasoning into the construction of competitive intelligence systems. We first present the principles and methods for building the civil aviation customer service ontology. We then define the classes and properties of a real-world civil aviation customer service ontology, which is published on the Web (http://www.openkg.cn/dataset/cacso). Finally, we design SWRL rules corresponding to different intelligence analysis targets to support reasoning in our competitive intelligence system.
Pages: 1063-1081
Citations: 0
Evaluation Index System of Green Public Open Space Based on Internet of Things and Mental Health
Zone 3 (Computer Science); Q3 (Computer Science, Information Systems); Pub Date: 2023-10-24; DOI: 10.1162/dint_a_00219
Jiexu Li, Faziawati binti Abdul Aziz, Ning Zhang
Abstract With the emergence of the IoT era, wireless sensor networks are becoming ever more widely used. In addition to collecting, transmitting and processing simple data such as humidity, temperature and density of the dome, they can also provide multimedia information services such as video and images, which enables more comprehensive and accurate environmental monitoring. Therefore, MSDs are in great demand in military, daily-life, forestry, biomedical and other fields. The intensive city model has obvious advantages in meeting people's diverse needs and supporting a comfortable life. Most obviously, it speeds up the rhythm of life for residents, thereby increasing efficiency and saving time. Starting from this aspect, this paper studies an evaluation index system for green public open space based on the IoT and mental health. A generalized regression neural network (GRNN) model is constructed: the conditional mean is calculated, the density function is estimated, the network output is obtained, and the schematic diagram of the GRNN is improved. Using the established system, 2018 is selected as the base year, and after transformation the standardized values of the past years are formed and substituted into cells to form different matrices. The value of each cell is counted to obtain the subsystem coordination degree, and the global coordination degree is obtained through calculation. The evaluation results of ecological civilization construction and development in 2018, 2019, 2020 and 2021 were compared. The experimental data show that, compared with 2018, economic development changes from 1 to 2.000, social harmony changes from 1 to 2.480, ecological health decreases to 0.850, environmental friendliness decreases to 0.750, and the comprehensive evaluation decreases to 0.513. This shows that, while the economy has been developing, the construction of ecological civilization has been carried out gradually and has achieved good results, which reflects the effectiveness of the system. The study of the evaluation index system of green public open space based on the Internet of Things and mental health has thus been completed.
Citations: 0
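The GRNN step named in the abstract above can be illustrated with the textbook form of a generalized regression neural network, which estimates the conditional mean of the output as a Gaussian-kernel-weighted average of the training targets. The sketch below is a minimal NumPy version under that assumption; the function name `grnn_predict`, the smoothing parameter `sigma`, and the toy data are illustrative and are not taken from the paper, whose improved network is not described in this listing.

```python
import numpy as np


def grnn_predict(x_train, y_train, x_query, sigma=0.5):
    """Textbook GRNN: kernel-weighted average of the training targets.

    x_train: (n_train, n_features), y_train: (n_train,),
    x_query: (n_query, n_features) or a single (n_features,) sample.
    sigma is the kernel bandwidth (a hypothetical default, not from the paper).
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.atleast_2d(np.asarray(x_query, dtype=float))

    # Pattern layer: squared Euclidean distance from each query to each sample.
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    weights = np.exp(-d2 / (2.0 * sigma ** 2))

    # Summation and output layers: weighted sum of targets over sum of weights.
    # If sigma is very small and a query is far from all samples, the weights
    # can underflow to zero; a production version would guard against that.
    return (weights @ y_train) / weights.sum(axis=1)


if __name__ == "__main__":
    # Toy data: two training points with targets 0 and 2.
    x = [[0.0], [1.0]]
    y = [0.0, 2.0]
    print(grnn_predict(x, y, [[0.5]]))  # midpoint query, equal weights
```

A smaller `sigma` makes the prediction follow the nearest training point more closely, while a larger one smooths toward the global mean of the targets.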
Slide-Detect: An Accurate Deep Learning Diagnosis of Lung Infiltration
Zone 3 (Computer Science); Q3 (Computer Science, Information Systems); Pub Date: 2023-10-02; DOI: 10.1162/dint_a_00233
Ahmed E. Mohamed, Magda B. Fayek, Mona Farouk
ABSTRACT Lung infiltration is a non-communicable condition in which materials denser than air exist in the parenchyma tissue of the lungs. It can be hard to detect in an X-ray scan even for a radiologist, especially at the early stages, making it a leading cause of death. In response, several deep learning approaches have been developed to address this problem. This paper proposes the Slide-Detect technique, a Deep Neural Network (DNN) model based on Convolutional Neural Networks (CNNs) that is trained to diagnose lung infiltration with an Area Under Curve (AUC) of up to 91.47%, an accuracy of 93.85%, and relatively low computational resources.
Citations: 0
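The two headline metrics in the abstract above, AUC and accuracy, can be computed from labels and classifier scores as sketched below. This is a minimal pure-Python illustration, not the authors' evaluation code: the function names `roc_auc` and `accuracy`, the 0.5 threshold, and the toy scores are all assumptions made for the example. AUC is computed here through its pairwise-ranking interpretation, the fraction of (positive, negative) pairs in which the positive case gets the higher score.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at a fixed score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


if __name__ == "__main__":
    # Toy data: 1 = infiltration present, 0 = absent (illustrative only).
    y_true = [0, 0, 1, 1]
    y_score = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(y_true, y_score), accuracy(y_true, y_score))
```

Because AUC is threshold-free while accuracy depends on the chosen cut-off, the two numbers can differ, which is one reason an abstract may report both.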
Evaluation on ChatGPT for Chinese Language Understanding
Zone 3 (Computer Science); Q3 (Computer Science, Information Systems); Pub Date: 2023-09-12; DOI: 10.1162/dint_a_00232
Linhan Li, Huaping Zhang, Chunjin Li, Haowen You, Wenyao Cui
Abstract ChatGPT has attracted extensive attention from academia and industry. This paper evaluates ChatGPT's Chinese language understanding capability on 6 tasks using 11 datasets. Experiments indicate that ChatGPT achieves competitive results in sentiment analysis, summarization, and reading comprehension in Chinese, while it is prone to factual errors in closed-book QA. Further, on two more difficult Chinese understanding tasks, namely idiom fill-in-the-blank and cant understanding, we found that a simple chain-of-thought prompt can improve the accuracy of ChatGPT in complex reasoning. Based on the results, this paper further analyses the possible risks of using ChatGPT. Finally, we briefly describe the research and development progress of our ChatBIT.
Citations: 0
Smart management information systems (SMIS): Concept, evolution, research hotspots and applications
IF 3.9; Zone 3 (Computer Science); Q3 (Computer Science, Information Systems); Pub Date: 2023-08-31; DOI: 10.1162/dint_a_00231
Changyong Liang, Xiaoxiao Wang, Dong-xiao Gu, Pengyu Li, Hui Chen, Zhengfei Xu
Management information system (MIS), a human-computer system that deeply integrates next-generation information technology and management services, has become the nerve center of society and organizations. With the development of next-generation information technology, MIS has gradually entered the smart period. However, research on smart management information systems (SMIS) is still limited, lacking systematic summarization of its conceptual definition, evolution, research hotspots, and typical applications. Therefore, this paper defines the conceptual characteristics of SMIS, provides an overview of the evolution of SMIS, examines research focus areas using bibliometric methods, and elaborates on typical application practices of SMIS in fields such as health care, elderly care, manufacturing, and transportation. Furthermore, we discuss the future development directions of SMIS in four key areas: smart interaction, smart decision-making, efficient resource allocation, and flexible system architecture. These discussions provide guidance and a foundation for the theoretical development and practical application of SMIS.
Citations: 0
Application of Medical Image Detection Technology Based on Deep Learning in Pneumoconiosis Diagnosis
IF 3.9; Zone 3 (Computer Science); Q3 (Computer Science, Information Systems); Pub Date: 2023-07-10; DOI: 10.1162/dint_a_00228
Shengguang Peng
Pneumoconiosis is a disease characterized by pulmonary tissue deposition caused by dust exposure in the workplace. In China, owing to the large number and wide distribution of pneumoconiosis patients, there is a high demand for lung biopsy case data during the diagnosis of pneumoconiosis. This paper studies the application of deep learning (DL)-based medical image detection technology in pneumoconiosis diagnosis. DL-based medical image detection and the convolutional neural network (CNN) are analyzed, and the application of DL medical imaging technology in pneumoconiosis diagnosis is investigated. The experimental results show that, in the last round of testing, the accuracy of the ResNet model with a deconvolution structure reached 95.2%. The area under the receiver operating characteristic curve (AUC) was 0.987, the sensitivity was 99.66%, and the specificity was 88.61%. The non-staging diagnosis of pneumoconiosis improved diagnostic sensitivity while ensuring high specificity. In addition, the DeLong test was used to compare the AUCs of the three models; the results showed that model C was more effective than models A and B, while there was no significant difference in diagnostic efficiency between model A and model B. In short, the model's diagnosis has high sensitivity and a low probability of missed diagnosis, which can greatly reduce the workload of diagnosing physicians and effectively improve the efficiency of diagnosis.
Citations: 1
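Sensitivity and specificity, as reported in the abstract above, follow directly from the confusion matrix of a binary classifier: sensitivity is the share of diseased cases correctly flagged, and specificity is the share of healthy cases correctly cleared. The sketch below shows the standard definitions; the function name `sensitivity_specificity` and the toy labels are assumptions for illustration and do not reproduce the paper's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = disease present."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # true positive rate; 1 - missed-diagnosis rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy labels (illustrative only): three diseased cases, four healthy cases.
    truth = [1, 1, 1, 0, 0, 0, 0]
    preds = [1, 1, 0, 0, 0, 0, 1]
    print(sensitivity_specificity(truth, preds))
```

A high sensitivity corresponds to the low probability of missed diagnosis that the abstract emphasizes, since every false negative is a missed case.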
The State of the Art of Natural Language Processing—A Systematic Automated Review of NLP Literature Using NLP Techniques
IF 3.9; Zone 3 (Computer Science); Q3 (Computer Science, Information Systems); Pub Date: 2023-07-03; DOI: 10.1162/dint_a_00213
Jan Sawicki, M. Ganzha, M. Paprzycki
ABSTRACT Nowadays, natural language processing (NLP) is one of the most popular areas of broadly understood artificial intelligence. New research contributions are therefore posted every day, for instance to the arXiv repository. Hence, it is rather difficult to capture the current “state of the field” and thus to enter it. This motivated the use of state-of-the-art NLP techniques to analyse the NLP-focused literature. As a result, (1) meta-level knowledge concerning the current state of NLP has been captured, and (2) a guide to the use of basic NLP tools is provided. All the tools and the dataset described in this contribution are publicly available. Furthermore, the originality of this review lies in its full automation, which allows easy reproduction, continuation and updating of this research as new work emerges in the field of NLP.
Pages: 707-749
Citations: 1
Revealing the trends in the academic landscape of the health care system using contextual topic modeling
IF 3.9; Zone 3 (Computer Science); Q3 (Computer Science, Information Systems); Pub Date: 2023-06-13; DOI: 10.1162/dint_a_00217
Muhammad Inaam ul haq, Qianmu Li
The health care system encompasses the individuals, groups, agencies, and resources that offer services to meet the health needs of people, communities, and populations. Alongside the growing debates on healthcare systems in relation to diseases, treatments, interventions, medication, and clinical practice guidelines, the world is currently discussing the healthcare industry, technology perspectives, and healthcare costs. To gain a comprehensive understanding of the healthcare systems research paradigm, we propose a novel contextual topic modeling approach that links the CombinedTM model with our healthcare BERT to discover contextual topics in the healthcare domain. This work discovered 60 contextual topics: fifteen are the hottest, including smart medical monitoring systems, causes and effects of stress and anxiety, and healthcare cost estimation; twelve are the coldest; and thirty-three show insignificant trends. We further investigated clusters and correlations among the topics by exploring inter-topic distance maps, which adds depth to the understanding of the research structure of this scientific domain. The current study enhances prior topic modeling methodologies that examine the healthcare literature from a particular disciplinary perspective. It further extends existing topic modeling approaches that do not incorporate contextual information in the topic discovery process, adding contextual information by creating sentence embedding vectors with transformer-based models. We also used corpus tuning, the mean pooling technique, and the Hugging Face tool. Our method gives a higher coherence score than the state-of-the-art models (LSA, LDA, and BERTopic).
Citations: 0
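The mean pooling technique mentioned in the abstract above is commonly used to turn per-token transformer outputs into a single sentence embedding: token vectors are averaged, with padding positions excluded through the attention mask. The sketch below shows that operation in NumPy on hand-made arrays. The function name `mean_pool` and the toy tensors are illustrative assumptions; the snippet does not load any model and is not the authors' pipeline.

```python
import numpy as np


def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sentence, ignoring padded positions.

    token_embeddings: (batch, seq_len, dim) array of per-token vectors.
    attention_mask:   (batch, seq_len) array, 1 for real tokens, 0 for padding.
    Returns a (batch, dim) array of sentence embeddings.
    """
    tokens = np.asarray(token_embeddings, dtype=float)
    mask = np.asarray(attention_mask, dtype=float)[..., None]  # (batch, seq, 1)

    summed = (tokens * mask).sum(axis=1)             # sum over real tokens only
    counts = np.clip(mask.sum(axis=1), 1e-9, None)   # avoid division by zero
    return summed / counts


if __name__ == "__main__":
    # One sentence with two real tokens and one padding token (toy values).
    tokens = [[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]]
    mask = [[1, 1, 0]]
    print(mean_pool(tokens, mask))  # padding vector does not affect the mean
```

Masking matters because padded positions would otherwise pull short sentences toward whatever vector the model emits for padding.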
A knowledge graph-based deep learning framework for efficient content similarity search of Sustainable Development Goals data
Zone 3 (Computer Science); Q3 (Computer Science, Information Systems); Pub Date: 2023-06-13; DOI: 10.1162/dint_a_00206
Irene Kilanioti, George A. Papadopoulos
Abstract Sustainable development denotes the enhancement of living standards in the present without compromising future generations’ resources. Sustainable Development Goals (SDGs) quantify the accomplishment of sustainable development and pave the way for a world worth living in for future generations. Scholars can contribute to the achievement of the SDGs by guiding the actions of practitioners based on the analysis of SDG data, as intended by this work. We propose a framework of algorithms based on dimensionality reduction methods using Hilbert Space Filling Curves (HSFCs) in order to semantically cluster new uncategorised SDG data and novel indicators, and to place them efficiently in a distributed knowledge graph store. First, we describe a framework of algorithms for inserting new indicators and projecting them onto the HSFC based on a transformer-based similarity assessment, for retrieving indicators, and for load balancing, along with an approach for classifying entrant indicators. Then, a thorough case study in a distributed knowledge graph environment experimentally evaluates our framework. The results are presented and discussed in light of theory, along with the actual impact they can have for practitioners analysing SDG data, including intergovernmental organizations, government agencies and social welfare organizations. Our approach empowers SDG knowledge graphs for causal analysis, inference, and manifold interpretations of the societal implications of SDG-related actions, as data are accessed in reduced retrieval times. It facilitates quicker measurement of the influence of users and communities on specific goals and serves faster distributed knowledge matching, as the semantic cohesion of data is preserved.
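To illustrate why a Hilbert space-filling curve helps with locality-preserving placement, as the abstract above describes, the sketch below maps a cell of a 2-D grid to its position along the curve using the standard iterative algorithm. It is a generic illustration under stated assumptions, not the authors' framework: the function name `hilbert_index`, the restriction to two dimensions, and the power-of-two grid size are choices made for the example, and how the paper derives coordinates from transformer-based similarity is not shown here.

```python
def hilbert_index(n, x, y):
    """Position of grid cell (x, y) along the Hilbert curve of an n x n grid.

    n must be a power of two and 0 <= x, y < n. Cells that are close on the
    curve are also close in the grid, which is the locality property used
    when one-dimensional key ranges are assigned to storage nodes.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


if __name__ == "__main__":
    # Order in which the curve visits the cells of a 4 x 4 grid.
    cells = sorted(((hilbert_index(4, x, y), (x, y))
                    for x in range(4) for y in range(4)))
    print([cell for _, cell in cells])
```

Splitting the resulting one-dimensional index range into contiguous intervals is one simple way to spread items across nodes while keeping similar items together.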
Citations: 0
MillenniumDB: An Open-Source Graph Database System
Zone 3, Computer Science; Q3, COMPUTER SCIENCE, INFORMATION SYSTEMS; Pub Date: 2023-06-13; DOI: 10.1162/dint_a_00209
Domagoj Vrgoč, Carlos Rojas, Renzo Angles, Marcelo Arenas, Diego Arroyuelo, Carlos Buil-Aranda, Aidan Hogan, Gonzalo Navarro, Cristian Riveros, Juan Romero
Abstract In this systems paper, we present MillenniumDB: a novel graph database engine that is modular, persistent, and open source. MillenniumDB is based on a graph data model, which we call domain graphs, that provides a simple abstraction upon which a variety of popular graph models can be supported, thus providing a flexible data management engine for diverse types of knowledge graphs. The engine itself is founded on a combination of tried and tested techniques from relational data management, state-of-the-art algorithms for worst-case-optimal joins, as well as graph-specific algorithms for evaluating path queries. In this paper, we present the main design principles underlying MillenniumDB, describing the abstract graph model and query semantics supported, the concrete data model and query syntax implemented, as well as the storage, indexing, query planning and query evaluation techniques used. We evaluate MillenniumDB over real-world data and queries from the Wikidata knowledge graph, where we find that it outperforms other popular persistent graph database engines (including both enterprise and open-source alternatives) that support similar query features.
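For readers unfamiliar with worst-case-optimal joins, the sketch below gives a minimal illustration of the general idea on the classic triangle query Q(a, b, c) :- E(a, b), E(b, c), E(a, c). It binds one variable at a time and intersects the candidate sets for the last variable rather than joining two relations at a time. The function name and the sample edges are hypothetical, and nothing here is taken from MillenniumDB's code or query syntax.

```python
from collections import defaultdict


def triangle_join(edges):
    """Variable-at-a-time evaluation of E(a,b), E(b,c), E(a,c).

    Illustrative sketch only: a production engine would use sorted
    indexes and a leapfrog-style intersection rather than Python sets.
    """
    succ = defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)

    results = []
    for a in sorted(succ):                    # bind a
        for b in sorted(succ[a]):             # bind b using E(a, b)
            # bind c: must satisfy both E(a, c) and E(b, c)
            for c in sorted(succ[a] & succ.get(b, set())):
                results.append((a, b, c))
    return results


if __name__ == "__main__":
    sample_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]  # made-up data
    print(triangle_join(sample_edges))
```

The point of the intersection step is that no intermediate result larger than necessary is materialised, which a pairwise join of E with itself could produce on skewed graphs.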
Citations: 0
Journal
Data Intelligence