Dhiaa Musleh, Haya Almossaeed, Fay Balhareth, Ghadah Alqahtani, Norah Alobaidan, Jana Altalag, May Issa Aldossary
The rise of artificial intelligence has facilitated and automated numerous everyday tasks in a variety of industries, including dentistry. Dentists have utilized X-rays for diagnosing patients’ ailments for many years. However, the procedure is typically performed manually, which can be challenging and time-consuming for non-specialists and carries a significant risk of error. As a result, researchers have turned to machine and deep learning modeling approaches to precisely identify dental disorders from X-ray images. This review is motivated by the need to address these challenges and to explore the potential of AI to enhance diagnostic accuracy, efficiency, and reliability in dental practice. Although artificial intelligence is frequently employed in dentistry, the outcomes of these approaches are still influenced by factors such as dataset availability and size, class balance, and data interpretation capability. Consequently, it is critical for the research community to address these issues in order to identify the most effective approaches for ongoing investigations. This article, based on a literature review, provides a concise summary of the diagnosis process using X-ray imaging systems, offers a thorough understanding of the difficulties that dental researchers face, and presents an amalgamated evaluation of the performances and methodologies assessed using publicly available benchmarks.
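To make the surveyed pipeline concrete, here is a minimal transfer-learning sketch in PyTorch of the kind of X-ray classifier such studies build; it is not any specific surveyed method, and the backbone, four disorder classes, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models, transforms

# Assumed preprocessing for grayscale dental X-rays; pretrained backbones expect 3 channels.
preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

NUM_CLASSES = 4  # hypothetical: caries, periapical lesion, impacted tooth, healthy
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # replace the ImageNet head

# Class weights in the loss are one common remedy for the class imbalance noted above.
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```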
“Advancing Dental Diagnostics: A Review of Artificial Intelligence Applications and Challenges in Dentistry.” Big Data and Cognitive Computing, 7 June 2024. https://doi.org/10.3390/bdcc8060066
Konstantinos I. Roumeliotis, Nikolaos D. Tselikas, Dimitrios K. Nasiopoulos
Cryptocurrencies are becoming increasingly prominent in financial investments, with more investors diversifying their portfolios and individuals drawn to their ease of use and decentralized financial opportunities. However, this accessibility also brings significant risks and rewards, often influenced by news and the sentiments of crypto investors, known as crypto signals. This paper explores the capabilities of large language models (LLMs) and natural language processing (NLP) models in analyzing sentiment from cryptocurrency-related news articles. We fine-tune state-of-the-art models such as GPT-4, BERT, and FinBERT for this specific task, evaluating their performance and comparing their effectiveness in sentiment classification. By leveraging these advanced techniques, we aim to enhance the understanding of sentiment dynamics in the cryptocurrency market, providing insights that can inform investment decisions and risk management strategies. The outcomes of this comparative study contribute to the broader discourse on applying advanced NLP models to cryptocurrency sentiment analysis, with implications for both academic research and practical applications in financial markets.
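As a hedged illustration of the fine-tuning setup described above, the sketch below adapts a public FinBERT checkpoint (ProsusAI/finbert on the Hugging Face Hub) to crypto-news sentiment with the transformers Trainer; the two inline headlines and the label scheme are invented placeholders, not the paper’s dataset.

```python
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# Toy stand-in for a labeled crypto-news corpus (0=negative, 1=neutral, 2=positive).
data = Dataset.from_dict({
    "text": ["Bitcoin ETF approval fuels record inflows",
             "Exchange hack shakes investor confidence"],
    "label": [2, 0],
})

model_name = "ProsusAI/finbert"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finbert-crypto", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=data.map(tokenize, batched=True),
)
trainer.train()
```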
“LLMs and NLP Models in Cryptocurrency Sentiment Analysis: A Comparative Classification Study.” Big Data and Cognitive Computing, 5 June 2024. https://doi.org/10.3390/bdcc8060063
Diana Martínez-Mosquera, Rosa Navarrete, Sergio Luján-Mora, Lorena Recalde, Andres Andrade-Cabrera
The growing importance of data analytics is leading to a shift in data management strategy at many companies, moving away from simple data storage towards adopting Online Analytical Processing (OLAP) query analysis. Concurrently, NoSQL databases are gaining ground as the preferred choice for storing and querying analytical data. This article presents a comprehensive systematic mapping aimed at consolidating research efforts on the integration of OLAP with NoSQL databases in Big Data environments. After identifying 1646 initial research studies from scientific digital repositories, a thorough examination of their content resulted in the acceptance of 22 studies. Utilizing the snowballing technique, an additional three studies were selected, culminating in a final corpus of 25 relevant articles. This review addresses the growing importance of leveraging NoSQL databases for OLAP query analysis in response to increasing data analytics demands. By identifying the NoSQL databases most commonly used with OLAP (such as column-oriented and document-oriented stores), prevalent OLAP modeling methods (such as Relational Online Analytical Processing (ROLAP) and Multidimensional Online Analytical Processing (MOLAP)), and suggested models for batch and real-time processing, among other results, this research provides a roadmap for organizations navigating the integration of OLAP with NoSQL. Additionally, exploring computational resource requirements and performance benchmarks facilitates informed decision-making and promotes advancements in Big Data analytics. The main findings of this review provide valuable insights and updated information regarding the integration of OLAP cubes with NoSQL databases to benefit future research, industry practitioners, and academia alike. This consolidation of research efforts not only promotes innovative solutions but also promises reduced operational costs compared to traditional database systems.
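To ground the idea, here is a minimal sketch (with pymongo) of a ROLAP-style roll-up executed directly on a document-oriented store, one of the patterns the mapped studies describe; the connection string, the warehouse.sales collection, and all field names are hypothetical.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")   # assumed local instance
sales = client["warehouse"]["sales"]                # hypothetical fact collection

# Roll-up of revenue along two dimensions (region, year) via the aggregation pipeline.
pipeline = [
    {"$group": {
        "_id": {"region": "$region", "year": {"$year": "$date"}},
        "revenue": {"$sum": {"$multiply": ["$unit_price", "$quantity"]}},
        "orders": {"$sum": 1},
    }},
    {"$sort": {"_id.year": 1, "revenue": -1}},
]
for row in sales.aggregate(pipeline):
    print(row["_id"], row["revenue"], row["orders"])
```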
“Integrating OLAP with NoSQL Databases in Big Data Environments: Systematic Mapping.” Big Data and Cognitive Computing, 5 June 2024. https://doi.org/10.3390/bdcc8060064
During the COVID-19 pandemic, pro-vaccine and anti-vaccine groups emerged, influencing others to vaccinate or abstain and leading to polarized debates. Due to incomplete user data and the complexity of social network interactions, understanding the dynamics of these discussions is challenging. This study aims to discover and quantify the factors driving the controversy related to vaccine stances across Kuwaiti social networks. To tackle these challenges, a graph convolutional network (GCN) and feature propagation (FP) were utilized to accurately detect users’ stances despite incomplete features, achieving an accuracy of 96%. Additionally, the random walk controversy (RWC) score was employed to quantify polarization points within the social networks. Experiments were conducted using a dataset of vaccine-related retweets and discussions from X (formerly Twitter) during the Kuwait COVID-19 vaccine rollout period. The analysis revealed high polarization periods correlating with specific vaccination rates and governmental announcements. This research provides a novel approach to accurately detecting user stances in low-resource languages like the Kuwaiti dialect without the need for costly annotations, offering valuable insights to help policymakers understand public opinion and address misinformation effectively.
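Feature propagation here refers to completing missing node features by diffusing observed ones over the graph before the GCN runs. Below is a minimal numpy sketch of that generic technique; the toy graph and features are invented, and the GCN stage that consumes the completed features is omitted.

```python
import numpy as np

def feature_propagation(adj, x, known_mask, iters=40):
    """Fill missing node features by diffusing known ones over the graph.

    adj: (n, n) symmetric adjacency; x: (n, d) features (unknown rows arbitrary);
    known_mask: (n,) boolean, True where the feature row is observed.
    """
    deg = adj.sum(axis=1)
    deg[deg == 0] = 1.0                                      # guard isolated nodes
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_hat = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # D^-1/2 A D^-1/2

    x = x.copy()
    x[~known_mask] = 0.0
    x_known = x[known_mask].copy()
    for _ in range(iters):
        x = a_hat @ x
        x[known_mask] = x_known   # clamp observed rows after each diffusion step
    return x

# Toy 4-user ring: features of users 1 and 3 are missing and get reconstructed.
adj = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], float)
x = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
known = np.array([True, False, True, False])
print(feature_propagation(adj, x, known))
```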
Yeonjung Lee, Hana Alostad, Hasan Davulcu. “Quantifying Variations in Controversial Discussions within Kuwaiti Social Networks.” Big Data and Cognitive Computing, 4 June 2024. https://doi.org/10.3390/bdcc8060060
Alexander Lehnert, Falko Gawantka, Jonas During, Franz Just, Marc Reichenbach
Wild and forest fires pose a threat to forests and thereby, by extension, to wildlife and humanity. Recent history shows an increase in devastating damage caused by fires. Traditional fire detection systems, such as video surveillance, fail in the early stages of a rural forest fire: such systems see the fire only once the damage is already immense. Novel low-power smoke detection units based on gas sensors can detect smoke fumes in the early development stages of fires. The required proximity is only achieved using a distributed network of sensors interconnected via 5G. For battery-powered sensor nodes, energy efficiency becomes a key metric. Combining AI classification with explainable AI (XAI) improves confidence in the measurements. In this work, we present both a low-power gas sensor for smoke detection and a system elaboration covering energy-efficient communication schemes and XAI-based evaluation. We show that smart use of edge processing, combined with buffered data samples in a 5G communication network, yields optimal energy efficiency and evaluation results.
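The abstract does not spell out the buffering protocol, so the following is only a generic sketch of the buffered-uplink idea: classify locally at the edge and wake the radio either on an alarm or when a batch is full. The constants and callback names are assumptions, not the paper’s design.

```python
import collections
import statistics

BUFFER_SIZE = 32        # assumed samples per routine 5G uplink burst
ALARM_THRESHOLD = 0.9   # assumed classifier score that forces an immediate send

buffer = collections.deque(maxlen=BUFFER_SIZE)

def on_sensor_sample(gas_ppm, classify, transmit):
    """Buffer gas readings; transmit on alarm or when the batch is full."""
    buffer.append(gas_ppm)
    score = classify(gas_ppm)  # cheap on-device smoke classifier (hypothetical)
    if score >= ALARM_THRESHOLD:
        transmit({"alarm": True, "score": score, "window": list(buffer)})
        buffer.clear()
    elif len(buffer) == BUFFER_SIZE:
        transmit({"alarm": False,                 # routine batched upload
                  "mean": statistics.fmean(buffer),
                  "max": max(buffer)})
        buffer.clear()
```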
“XplAInable: Explainable AI Smoke Detection at the Edge.” Big Data and Cognitive Computing, 17 May 2024. https://doi.org/10.3390/bdcc8050050
The intelligent warehouse is a modern logistics management system that uses technologies like the Internet of Things, robots, and artificial intelligence to realize automated management and optimize warehousing operations. The multi-robot system (MRS) is an important carrier for implementing an intelligent warehouse, completing various tasks through cooperation and coordination between robots. As an extension of reinforcement learning and a kind of swarm intelligence, multi-agent reinforcement learning (MARL) can effectively realize multi-robot systems in intelligent warehouses. However, MARL-based multi-robot systems in intelligent warehouses face serious safety issues, such as collisions, conflicts, and congestion. To deal with these issues, this paper proposes a safe MARL method based on runtime verification, i.e., an optimized safety policy-generation framework, for multi-robot systems in intelligent warehouses. The framework consists of three stages. In the first stage, a runtime model, SCMG (safety-constrained Markov Game), is defined for the multi-robot system in the intelligent warehouse. In the second stage, rPATL (probabilistic alternating-time temporal logic with rewards) is used to express safety properties, and the SCMG is cyclically verified and refined through runtime verification (RV) to ensure safety. This stage guarantees the safety of the robots’ behaviors before training. In the third stage, the verified SCMG guides SCPO (safety-constrained policy optimization) to obtain an optimized safety policy for the robots. Finally, a multi-robot warehouse (RWARE) scenario is used for experimental evaluation. The results show that the policy obtained by our framework is safer than those of existing frameworks and includes a certain degree of optimization.
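The paper’s SCPO algorithm is not reproduced here; as a hedged sketch of the general shape of safety-constrained policy optimization, the primal-dual update below improves reward while a Lagrange multiplier grows whenever the measured safety cost (e.g., a collision rate) exceeds its limit. All step sizes and the toy gradients are assumptions.

```python
import numpy as np

def scpo_style_update(theta, lam, grad_reward, grad_cost, avg_cost,
                      lr=0.01, lam_lr=0.05, cost_limit=0.1):
    """One primal-dual step: improve reward while penalizing expected safety cost."""
    theta = theta + lr * (grad_reward - lam * grad_cost)   # primal ascent on reward
    lam = max(0.0, lam + lam_lr * (avg_cost - cost_limit))  # dual ascent on violation
    return theta, lam

# Toy usage: random gradients stand in for policy-gradient estimates.
rng = np.random.default_rng(0)
theta, lam = np.zeros(8), 0.0
for _ in range(100):
    grad_r, grad_c = rng.normal(size=8), rng.normal(size=8)
    avg_cost = rng.uniform(0.0, 0.3)  # measured collision/conflict rate (hypothetical)
    theta, lam = scpo_style_update(theta, lam, grad_r, grad_c, avg_cost)
```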
Yang Liu, Jiankun Li. “Runtime Verification-Based Safe MARL for Optimized Safety Policy Generation for Multi-Robot Systems.” Big Data and Cognitive Computing, 16 May 2024. https://doi.org/10.3390/bdcc8050049
Time series forecasting has been a challenging area in the field of Artificial Intelligence. Various approaches, such as linear neural networks, recurrent neural networks, Convolutional Neural Networks, and, more recently, transformers, have been attempted for the time series forecasting domain. Although transformer-based architectures have been outstanding in the Natural Language Processing domain, especially in autoregressive language modeling, initial attempts to use transformers in the time series arena have met with mixed success. A recent influential work indicated that simple linear networks outperform transformer-based designs. We investigate this paradox in detail, comparing linear neural network- and transformer-based designs and providing insights into why a certain approach may be better suited to a particular type of problem. We also improve upon the recently proposed simple linear neural network-based architecture by using dual pipelines with batch normalization and reversible instance normalization. Our enhanced architecture outperforms all existing architectures for time series forecasting on a majority of the popular benchmarks.
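Of the two normalization components named above, reversible instance normalization (RevIN) is the less familiar; here is a minimal PyTorch sketch of that generic technique, not the authors’ code: each series instance is normalized with its own statistics on the way in, and the forecast is de-normalized with the same statistics on the way out, mitigating distribution shift.

```python
import torch
import torch.nn as nn

class RevIN(nn.Module):
    """Reversible instance normalization for forecasting (sketch)."""

    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.affine_weight = nn.Parameter(torch.ones(num_features))
        self.affine_bias = nn.Parameter(torch.zeros(num_features))
        self.eps = eps

    def forward(self, x):  # x: (batch, length, features)
        # Per-instance statistics over the time dimension, stored for inversion.
        self.mean = x.mean(dim=1, keepdim=True).detach()
        self.std = torch.sqrt(x.var(dim=1, keepdim=True, unbiased=False) + self.eps).detach()
        return (x - self.mean) / self.std * self.affine_weight + self.affine_bias

    def invert(self, y):  # y: (batch, horizon, features), the model's forecast
        return (y - self.affine_bias) / (self.affine_weight + self.eps) * self.std + self.mean
```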
Musleh Alharthi, Ausif Mahmood. “Enhanced Linear and Vision Transformer-Based Architectures for Time Series Forecasting.” Big Data and Cognitive Computing, 16 May 2024. https://doi.org/10.3390/bdcc8050048
Ilyas Aden, Christopher H. T. Child, C. Reyes-Aldasoro
The International Classification of Diseases (ICD) serves as a widely employed framework for assigning diagnosis codes to patients’ electronic health records. These codes encapsulate the diagnoses and procedures conducted during a patient’s hospitalisation. This study aims to devise a predictive model for ICD codes based on the MIMIC-III clinical text dataset. Leveraging natural language processing techniques and deep learning architectures, we constructed a pipeline to distill pertinent information from the Medical Information Mart for Intensive Care III (MIMIC-III), a sizable, de-identified, and publicly accessible repository of medical records. Our method entails predicting diagnosis codes from unstructured data, such as discharge summaries and notes encompassing symptoms. We used state-of-the-art deep learning algorithms, such as recurrent neural networks (RNNs), long short-term memory (LSTM) networks, bidirectional LSTM (BiLSTM), and BERT models, after tokenizing the clinical text with Bio-ClinicalBERT, a pre-trained model from Hugging Face. To evaluate the efficacy of our approach, we conducted experiments on the discharge dataset within MIMIC-III. Employing the BERT model, our methodology exhibited commendable accuracy in predicting the top 10 and top 50 diagnosis codes within the MIMIC-III dataset, achieving average accuracies of 88% and 80%, respectively. In comparison to recent studies by Biseda and Kerang, as well as Gangavarapu, which reported F1 scores of 0.72 in predicting the top 10 ICD-10 codes, our model demonstrated better performance, with an F1 score of 0.87. Similarly, in predicting the top 50 ICD-10 codes, previous research achieved an F1 score of 0.75, whereas our method attained an F1 score of 0.81. These results underscore the superior performance of deep learning models over conventional machine learning approaches in this domain, thus validating our findings. The ability to predict diagnoses early from clinical notes holds promise in assisting physicians in determining effective treatments, thereby reshaping the conventional diagnosis-then-treatment paradigm of care. Our code is available online.
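As a hedged sketch of the inference step, the snippet below loads the Bio_ClinicalBERT checkpoint from the Hugging Face Hub for multi-label ICD prediction; the ten-code head, the 0.5 threshold, and the toy note are illustrative assumptions rather than the paper’s configuration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "emilyalsentzer/Bio_ClinicalBERT"
NUM_CODES = 10  # e.g., the top-10 ICD codes; illustrative

tokenizer = AutoTokenizer.from_pretrained(MODEL)
# The classification head is freshly initialized here and would be fine-tuned
# on discharge summaries, as the paper describes, before real use.
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL, num_labels=NUM_CODES, problem_type="multi_label_classification")

note = "Patient admitted with chest pain and shortness of breath..."  # toy note
inputs = tokenizer(note, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)       # one probability per code
predicted = (probs > 0.5).nonzero(as_tuple=True)[1]     # indices of predicted codes
```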
“International Classification of Diseases Prediction from MIMIIC-III Clinical Text Using Pre-Trained ClinicalBERT and NLP Deep Learning Models Achieving State of the Art.” Big Data and Cognitive Computing, 10 May 2024. https://doi.org/10.3390/bdcc8050047
Classification methods based on fine-tuning pre-trained language models often require a large number of labeled samples; therefore, few-shot text classification has attracted considerable attention. Prompt learning is an effective method for addressing few-shot text classification tasks in low-resource settings. The essence of prompt tuning is to insert tokens into the input, thereby converting a text classification task into a masked language modeling problem. However, constructing appropriate prompt templates and verbalizers remains challenging, as manual prompts often require expert knowledge, while auto-constructing prompts is time-consuming. In addition, the extensive knowledge contained in entities and relations should not be ignored. To address these issues, we propose a structured knowledge prompt tuning (SKPT) method, which is a knowledge-enhanced prompt tuning approach. Specifically, SKPT includes three components: prompt template, prompt verbalizer, and training strategies. First, we insert virtual tokens into the prompt template based on open triples to introduce external knowledge. Second, we use an improved knowledgeable verbalizer to expand and filter the label words. Finally, we use structured knowledge constraints during the training phase to optimize the model. Through extensive experiments on few-shot text classification tasks with different settings, the effectiveness of our model has been demonstrated.
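To make the template-plus-verbalizer mechanics concrete, here is a minimal sketch of vanilla prompt-based classification with a masked language model; the backbone, template, and label words are illustrative, and SKPT’s virtual knowledge tokens and structured constraints are not implemented here.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL = "bert-base-uncased"  # stand-in backbone; SKPT's exact setup differs
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForMaskedLM.from_pretrained(MODEL)

# Illustrative verbalizer: each class is represented by a few label words.
verbalizer = {"sports": ["football", "game"], "politics": ["election", "senate"]}

def classify(text):
    # Template converts classification into masked-token prediction.
    prompt = f"{text} This topic is about {tokenizer.mask_token}."
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    # Score each class by its best-scoring label word at the mask position.
    scores = {label: max(logits[tokenizer.convert_tokens_to_ids(word)].item()
                         for word in words)
              for label, words in verbalizer.items()}
    return max(scores, key=scores.get)

print(classify("The quarterback threw for three touchdowns."))
```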
Jinshuo Liu, Lu Yang. “Knowledge-Enhanced Prompt Learning for Few-Shot Text Classification.” Big Data and Cognitive Computing, 18 April 2024. https://doi.org/10.3390/bdcc8040043
Olga Narushynska, V. Teslyuk, Anastasiya Doroshenko, Maksym Arzubov
The precise categorization of brief texts holds significant importance in various applications within the ever-changing realm of artificial intelligence (AI) and natural language processing (NLP). Short texts are everywhere in the digital world, from social media updates to customer reviews and feedback. Nevertheless, the limited length and context of short texts pose unique challenges for accurate classification. This research article delves into the influence of data sorting methods on the quality of manual labeling in hierarchical classification, with a particular focus on short texts. The study is set against the backdrop of the increasing reliance on manual labeling in AI and NLP, highlighting its significance for the accuracy of hierarchical text classification. Methodologically, the study integrates AI, notably zero-shot learning, with human annotation processes to examine the efficacy of various data-sorting strategies. The results demonstrate how different sorting approaches impact the accuracy and consistency of manual labeling, a critical aspect of creating high-quality datasets for NLP applications. The study’s findings reveal a notable improvement in labeling time efficiency: ordered manual labeling required 760 min per 1000 samples, compared to 800 min for traditional manual labeling, illustrating the practical benefits of optimized data-sorting strategies. Moreover, ordered manual labeling achieved the highest mean accuracy rates across all hierarchical levels, reaching up to 99% for segments, 95% for families, 92% for classes, and 90% for bricks, underscoring the efficiency of structured data sorting. The study offers valuable insights and practical guidelines for improving labeling quality in hierarchical classification tasks, thereby advancing the precision of text analysis in AI-driven research.
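As a hedged sketch of how zero-shot predictions can order a labeling queue, the snippet below groups short texts by their predicted label and sorts them by confidence before manual review; the model choice, texts, and label set are illustrative, not the study’s setup.

```python
from transformers import pipeline

# Zero-shot classifier used only to propose an ordering for human annotators.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

texts = ["Great battery life", "Package arrived late", "Screen cracked on day one"]
labels = ["product quality", "delivery", "battery"]

scored = [(t, classifier(t, candidate_labels=labels)) for t in texts]
# Group by predicted label, then highest confidence first within each group.
ordered = sorted(scored, key=lambda p: (p[1]["labels"][0], -p[1]["scores"][0]))
for text, res in ordered:
    print(f"{res['labels'][0]:>16} {res['scores'][0]:.2f}  {text}")
```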
“Data Sorting Influence on Short Text Manual Labeling Quality for Hierarchical Classification.” Big Data and Cognitive Computing, 7 April 2024. https://doi.org/10.3390/bdcc8040041