Dharahas Tallapally, John Wang, Katerina Potika, Magdalini Eirinaki
Recommender systems have revolutionized the way users discover and engage with content. Moving beyond the collaborative filtering approach, most modern recommender systems leverage additional sources of information, such as context and social network data. Such data can be modeled using graphs, and the recent advances in Graph Neural Networks (GNNs) have led to the prominence of a new family of graph-based recommender system algorithms. In this work, we propose the RelationalNet algorithm, which models not only user–item and user–user relationships but also item–item relationships as graphs and uses them as input to the recommendation process. The rationale for utilizing item–item interactions is to enrich the item embeddings by leveraging the similarities between items. By using GNNs, RelationalNet incorporates social influence and similar-item influence into the recommendation process and captures user interests more accurately, especially when traditional methods fall short due to data sparsity. Such models improve the accuracy and effectiveness of recommendation systems by leveraging social connections and item interactions. Results demonstrate that RelationalNet outperforms current state-of-the-art social recommendation algorithms.
{"title":"Using Graph Neural Networks for Social Recommendations","authors":"Dharahas Tallapally, John Wang, Katerina Potika, Magdalini Eirinaki","doi":"10.3390/a16110515","DOIUrl":"https://doi.org/10.3390/a16110515","journal":{"name":"Algorithms"},"publicationDate":"2023-11-10","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"135188167"}
Xinbo Huang, Zhiwei Song, Chao Ji, Ye Zhang, Luya Yang
Various types of surface defects occur during the production of strip steel. To ensure production quality, it is essential to classify these defects. Our research indicates that existing strip steel surface defect classification methods have two main problems: (1) they cannot handle the unbalanced, few-shot data encountered in practice, and (2) they cannot meet the requirement of online real-time classification. To solve these problems, a relational knowledge distillation self-adaptive residual shrinkage network (RKD-SARSN) is presented in this work. First, a data enhancement strategy based on Cycle GAN defective sample migration is designed. Second, the self-adaptive residual shrinkage network (SARSN) is designed as the backbone network for feature extraction, and an adaptive loss function based on accuracy and geometric mean (Gmean) is proposed to solve the problem of unbalanced samples. Finally, a relational knowledge distillation model (RKD) is proposed, and a GUI operation interface that encapsulates these functions is designed by combining image processing technology. SARSN is used as the teacher model, and its generalization performance is transferred to the lightweight network ResNet34, which serves as the student model and is convenient to deploy. The results show that the proposed method can improve the deployment efficiency of the model and ensure the real-time performance of the classification algorithm. It is superior to other mainstream algorithms for classifying fine-grained images with unbalanced data.
{"title":"Research on a Classification Method for Strip Steel Surface Defects Based on Knowledge Distillation and a Self-Adaptive Residual Shrinkage Network","authors":"Xinbo Huang, Zhiwei Song, Chao Ji, Ye Zhang, Luya Yang","doi":"10.3390/a16110516","DOIUrl":"https://doi.org/10.3390/a16110516","journal":{"name":"Algorithms"},"publicationDate":"2023-11-10","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"135186031"}
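The geometric-mean criterion named in the strip steel abstract above can be illustrated with a short sketch. It assumes the common definition of G-mean as the geometric mean of per-class recalls; the function name `g_mean` and the toy labels are hypothetical, and the paper's adaptive loss function itself is not described in the abstract and is not reproduced here.

```python
from collections import Counter
from math import prod


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall (a common definition of G-mean)."""
    support = Counter(y_true)  # number of samples per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    recalls = [hits[c] / support[c] for c in support]
    return prod(recalls) ** (1.0 / len(recalls))


# Toy, imbalanced example (hypothetical labels): class "a" dominates.
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 8 + ["a", "b"]  # one of the two "b" samples is missed

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = g_mean(y_true, y_pred)
```

In this toy case accuracy is 0.9 while G-mean is about 0.71, because the missed minority-class sample halves the recall of class "b". This sensitivity to minority-class errors is the usual reason for pairing G-mean with accuracy on unbalanced data.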
Holger Boche, Yannik N. Böck, Ullrich J. Mönich, Frank H. P. Fitzek
This article compares two methods of algorithmically processing bandlimited time-continuous signals in light of the general problem of finding “suitable” representations of analog information on digital hardware. Although this problem is abstract, we argue that it is fundamental to digital twinning, a signal-processing paradigm on which the upcoming 6G communication-technology standard relies heavily. Using computable analysis, we formalize a general framework of machine-readable descriptions for representing analytic objects on Turing machines. Subsequently, we apply this framework to sampling and interpolation theory, providing a thoroughly formalized method for digitally processing the information carried by bandlimited analog signals. We investigate discrete-time descriptions, which form the implicit quasi-standard in digital signal processing, and establish continuous-time descriptions that take the signal’s continuous-time behavior into account. Motivated by an exemplary application of digital twinning, we analyze a textbook model of digital communication systems accordingly. We show that technologically fundamental properties, such as a signal’s (Banach-space) norm, can be computed from continuous-time, but not from discrete-time, descriptions of the signal. Given the high trustworthiness requirements within 6G (e.g., employed software must satisfy assessment criteria in a provable manner), we conclude that the problem of “trustworthy” digital representations of analog information is indeed essential to near-future information technology.
{"title":"Trustworthy Digital Representations of Analog Information—An Application-Guided Analysis of a Fundamental Theoretical Problem in Digital Twinning","authors":"Holger Boche, Yannik N. Böck, Ullrich J. Mönich, Frank H. P. Fitzek","doi":"10.3390/a16110514","DOIUrl":"https://doi.org/10.3390/a16110514","journal":{"name":"Algorithms"},"publicationDate":"2023-11-09","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"135244385"}
Pietro Dell’Oglio, Alessandro Bondielli, Francesco Marcelloni
Today, most newspapers utilize social media to disseminate news. On the one hand, this results in an overload of related articles for social media users. On the other hand, since social media tends to form echo chambers around its users, different opinions and information may be hidden. Enabling users to access different information, possibly from outside their echo chambers and without the burden of reading entire articles that often contain redundant information, may be a step forward in allowing them to form their own opinions. To address this challenge, we propose a system that integrates Transformer neural models and text summarization models along with decision rules. Given a reference article already read by the user, our system first collects articles related to the same topic from a configurable number of different sources. Then, it identifies and summarizes the information that differs from the reference article and outputs the summary to the user. The core of the system is the sentence classification algorithm, which classifies sentences in the collected articles into three classes based on similarity with the reference article; sentences classified as dissimilar are then summarized by using a pre-trained abstractive summarization model. We evaluated the proposed system in two steps. First, we assessed its effectiveness in identifying content differences between the reference article and the related articles by using human judgments obtained through crowdsourcing as ground truth. We obtained an average F1 score of 0.772 against average F1 scores of 0.797 and 0.676 achieved by two state-of-the-art approaches based, respectively, on model tuning and prompt tuning, which require an appropriate tuning phase and, therefore, greater computational effort. Second, we asked a sample of people to evaluate how well the summary generated by the system represents the information that is not present in the article read by the user. The results are extremely encouraging. Finally, we present a use case.
{"title":"A System to Support Readers in Automatically Acquiring Complete Summarized Information on an Event from Different Sources","authors":"Pietro Dell’Oglio, Alessandro Bondielli, Francesco Marcelloni","doi":"10.3390/a16110513","DOIUrl":"https://doi.org/10.3390/a16110513","journal":{"name":"Algorithms"},"publicationDate":"2023-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"135342397"}
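The three-way sentence classification described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: bag-of-words cosine similarity stands in for the Transformer models, the thresholds `low` and `high` are hypothetical, and the label names other than "dissimilar" are invented here because the abstract does not name them.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lower-cased bag-of-words vector (a stand-in for a Transformer embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(sentence, reference_sentences, low=0.3, high=0.7):
    """Label a sentence by its best match among the reference article's sentences.

    The thresholds and the labels other than "dissimilar" are assumptions.
    """
    vec = bow(sentence)
    best = max((cosine(vec, bow(r)) for r in reference_sentences), default=0.0)
    if best >= high:
        return "similar"
    if best >= low:
        return "partially_similar"
    return "dissimilar"


# Hypothetical reference article and candidate sentences.
reference = ["The council approved the new budget on Monday."]
candidates = [
    "The council approved the new budget on Monday.",
    "The new budget was criticised by the council opposition.",
    "Heavy rain flooded several streets downtown.",
]
labels = [classify(s, reference) for s in candidates]
# Only sentences labelled "dissimilar" would be passed to the summarizer.
to_summarize = [s for s, lab in zip(candidates, labels) if lab == "dissimilar"]
```

With these toy inputs, only the third candidate sentence is labelled "dissimilar" and forwarded for summarization.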
Currently, decision support systems (DSSs) are essential tools that provide information and support for decision making on possible problems that, due to their level of complexity, cannot be easily solved by humans [...]
{"title":"Special Issue on Algorithms in Decision Support Systems Vol.2","authors":"Edward Rolando Núñez-Valdez","doi":"10.3390/a16110512","DOIUrl":"https://doi.org/10.3390/a16110512","journal":{"name":"Algorithms"},"publicationDate":"2023-11-08","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"135342395"}
Silvia Carpitella, Bruno Brentan, Antonella Certa, Joaquín Izquierdo
This paper introduces a recommendation system aimed at enhancing the sustainable process of risk management within airport operations, with a special focus on Occupational Stress Risks (OSRs). The recommendation system is implemented in flexible Python code that can be integrated into various operational contexts. It leverages Fuzzy Cognitive Maps (FCMs) to conduct comprehensive risk assessments, subsequently generating prioritized recommendations for predefined risk management measures aimed at preventing and/or reducing the most critical OSRs. The system’s reliability has been validated by iterating the procedure with diverse input data (i.e., matrices of varying sizes) and measures, which supports the system’s effectiveness across a broad spectrum of engineering scenarios.
{"title":"A Recommendation System Supporting the Implementation of Sustainable Risk Management Measures in Airport Operations","authors":"Silvia Carpitella, Bruno Brentan, Antonella Certa, Joaquín Izquierdo","doi":"10.3390/a16110511","DOIUrl":"https://doi.org/10.3390/a16110511","journal":{"name":"Algorithms"},"publicationDate":"2023-11-07","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"135432506"}
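To make the Fuzzy Cognitive Map step in the airport risk abstract above more concrete, here is a minimal sketch of FCM inference. It assumes a widely used update rule with self-memory and a sigmoid squashing function; the concept names, weights, and function names are hypothetical and are not taken from the paper's Python code.

```python
import math


def sigmoid(x, lam=1.0):
    """Squashing function that keeps activations in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-lam * x))


def fcm_run(weights, state, max_iter=100, tol=1e-6):
    """Iterate A_j <- f(A_j + sum_i A_i * w[i][j]) until the state stops changing.

    weights[i][j] is the causal influence of concept i on concept j, in [-1, 1].
    """
    n = len(state)
    for _ in range(max_iter):
        new_state = [
            sigmoid(state[j] + sum(state[i] * weights[i][j] for i in range(n) if i != j))
            for j in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_state, state)) < tol:
            return new_state
        state = new_state
    return state


# Hypothetical concepts and weights, for illustration only.
concepts = ["workload", "shift irregularity", "fatigue"]
weights = [
    [0.0, 0.0, 0.7],  # workload -> fatigue
    [0.0, 0.0, 0.5],  # shift irregularity -> fatigue
    [0.0, 0.0, 0.0],  # fatigue influences nothing in this toy map
]
final = fcm_run(weights, [0.5, 0.5, 0.5])
# Rank concepts by steady-state activation, highest first.
ranking = sorted(zip(concepts, final), key=lambda pair: pair[1], reverse=True)
```

In this toy map the concept that receives the most causal input ("fatigue") ends with the highest activation, which is how a ranking of critical risks could be derived from the steady state.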
Laurent Risser, Agustin Martin Picard, Lucas Hervier, Jean-Michel Loubes
The problem of algorithmic bias in machine learning has recently gained a lot of attention due to its potentially strong impact on our societies. In much the same manner, algorithmic biases can alter industrial and safety-critical machine learning applications, where high-dimensional inputs are used. This issue has, however, been mostly left out of the spotlight in the machine learning literature. In societal applications, a set of potentially sensitive variables, such as gender or race, can be defined by common sense or by regulations to draw attention to potential risks. In industrial and safety-critical applications, by contrast, the sensitive variables are often unsuspected. In addition, these unsuspected sensitive variables may be indirectly represented as a latent feature of the input data. For instance, the predictions of an image classifier may be altered by reconstruction artefacts in a small subset of the training images. This raises serious and well-founded concerns about the commercial deployment of AI-based solutions, especially in a context where new regulations address bias issues in AI. The purpose of our paper is, then, to first give a broad overview of recent advances in robust machine learning. Then, we propose a new procedure to detect and treat such unknown biases. As far as we know, no equivalent procedure has been proposed in the literature so far. The procedure is also generic enough to be used in a wide variety of industrial contexts. Its relevance is demonstrated on a set of satellite images used to train a classifier. In this illustration, our technique detects that a subset of the training images has reconstruction faults, leading to systematic prediction errors that would have gone unsuspected using conventional cross-validation techniques.
{"title":"Detecting and Processing Unsuspected Sensitive Variables for Robust Machine Learning","authors":"Laurent Risser, Agustin Martin Picard, Lucas Hervier, Jean-Michel Loubes","doi":"10.3390/a16110510","DOIUrl":"https://doi.org/10.3390/a16110510","journal":{"name":"Algorithms"},"publicationDate":"2023-11-07","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"135432502"}
Predicting the price gap between the day-ahead market (DAM) and the real-time market (RTM) plays a vital role in the convergence bidding mechanism of Independent System Operators (ISOs) in wholesale electricity markets. This paper presents a model to predict the values of the price gap between the DAM and RTM using statistical machine learning algorithms and deep neural networks. We seek to answer two questions: What is the impact of predicting the DAM and RTM price gap directly on the prediction performance of learning methods? How can exogenous weather data affect the price gap prediction? Several exogenous features are collected, and their impacts are examined to capture the best relations between the features and the target variable. An ensemble learning algorithm, namely the Random Forest (RF), is used to select the most important features. A Long Short-Term Memory (LSTM) network is used to capture long-term dependencies in directly predicting the gap values between the two markets. Moreover, the advantages of directly predicting the price gap rather than subtracting the price predictions of the DAM and RTM are shown. The presented results are based on two years of electricity market data from the California Independent System Operator (CAISO). The results show that direct gap prediction using exogenous weather features decreases the error of learning methods by 46%. The presented method therefore reduces the prediction error of the price gap between the DAM and RTM, so convergence bidders can increase their profit and ISOs can tune their mechanisms accordingly.
{"title":"Predicting the Gap in the Day-Ahead and Real-Time Market Prices Leveraging Exogenous Weather Data","authors":"Nika Nizharadze, Arash Farokhi Soofi, Saeed Manshadi","doi":"10.3390/a16110508","DOIUrl":"https://doi.org/10.3390/a16110508","journal":{"name":"Algorithms"},"publicationDate":"2023-11-04","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"135773349"}
Sai Bharadwaj Appakaya, Ruchira Pratihar, Ravi Sankar
Parkinson’s disease (PD) classification through speech has been an advancing field of research because speech is easy to acquire and process. The minimal infrastructure requirements of such a system have also made it suitable for telemonitoring applications. Researchers have studied the effects of PD on speech from various perspectives using different speech tasks. Typical speech deficits due to PD include voice monotony (e.g., monopitch), breathy or rough quality, and articulatory errors. In connected speech, these symptoms are more pronounced, which is also the basis for speech assessment in popular rating scales used for PD, like the Unified Parkinson’s Disease Rating Scale (UPDRS) and Hoehn and Yahr (HY). The current study introduces an innovative framework that integrates pitch-synchronous segmentation and an optimized set of features to investigate and analyze continuous speech from both PD patients and healthy controls (HC). Comparison of the proposed framework against existing methods has shown its superiority in classification performance and mitigation of overfitting in machine learning models. A set of optimal classifiers with unbiased decision-making was identified after comparing several machine learning models. The outcomes yielded by the classifiers demonstrate that the framework effectively learns the intrinsic characteristics of PD from connected speech, which can potentially offer valuable assistance in clinical diagnosis.
{"title":"Parkinson’s Disease Classification Framework Using Vocal Dynamics in Connected Speech","authors":"Sai Bharadwaj Appakaya, Ruchira Pratihar, Ravi Sankar","doi":"10.3390/a16110509","DOIUrl":"https://doi.org/10.3390/a16110509","journal":{"name":"Algorithms"},"publicationDate":"2023-11-04","publicationTypes":"Journal Article"}
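The abstract above names pitch-synchronous segmentation but does not describe how it is done. As a rough illustration only, the sketch below assumes that pitch marks (sample indices of successive glottal cycles) are already available from some pitch tracker and cuts the signal at those marks, so that each segment spans a whole number of pitch periods. The function name `pitch_synchronous_segments`, the `periods_per_segment` parameter, and the toy data are hypothetical and are not taken from the paper.

```python
def pitch_synchronous_segments(signal, pitch_marks, periods_per_segment=1):
    """Cut `signal` at pitch marks so each segment spans whole pitch periods.

    Assumption: `pitch_marks` are sample indices produced elsewhere
    (e.g. by a pitch tracker); this sketch does not estimate them.
    """
    if periods_per_segment < 1:
        raise ValueError("periods_per_segment must be at least 1")
    # Keep only unique marks that fall inside the signal, in ascending order.
    marks = sorted(m for m in set(pitch_marks) if 0 <= m <= len(signal))
    segments = []
    for i in range(0, len(marks) - periods_per_segment, periods_per_segment):
        start = marks[i]
        end = marks[i + periods_per_segment]
        segments.append(signal[start:end])
    return segments


# Toy example: 12 samples with uneven pitch periods of 4, 3 and 5 samples.
toy_signal = list(range(12))
toy_marks = [0, 4, 7, 12]
toy_segments = pitch_synchronous_segments(toy_signal, toy_marks)
```

Unlike fixed-length framing, the segment lengths here follow the speaker's pitch, so every segment covers the same number of vocal cycles regardless of fundamental frequency.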
The online spread of fake news on various platforms has emerged as a significant concern, posing threats to public opinion, political stability, and the dissemination of reliable information. To address this issue, researchers have turned to advanced technologies, including machine learning (ML) and deep learning (DL) techniques, to detect and classify fake news. This study explores fake news classification using diverse ML and DL approaches. We used a well-known “Fake News” dataset sourced from Kaggle, comprising a labelled collection of news. We implemented several ML models, including multinomial naïve Bayes (MNB), Gaussian naïve Bayes (GNB), Bernoulli naïve Bayes (BNB), logistic regression (LR), and the passive aggressive classifier (PAC). We also explored DL models, such as long short-term memory (LSTM), convolutional neural networks (CNN), and CNN-LSTM. We compared the performance of these models on key evaluation metrics: accuracy, precision, recall, and the F1 score. In addition, we conducted cross-validation and hyperparameter tuning to ensure optimal performance. The results provide valuable insights into the strengths and weaknesses of each model in classifying fake news. We observed that DL models, particularly LSTM and CNN-LSTM, performed better than traditional ML models, achieving higher accuracy and demonstrating robustness in classification tasks. These findings emphasize the potential of DL models to tackle the spread of fake news effectively and highlight the importance of utilizing advanced techniques to address this challenging problem.
{"title":"Deep Dive into Fake News Detection: Feature-Centric Classification with Ensemble and Deep Learning Methods","authors":"Fawaz Khaled Alarfaj, Jawad Abbas Khan","doi":"10.3390/a16110507","DOIUrl":"https://doi.org/10.3390/a16110507","journal":{"name":"Algorithms"},"publicationDate":"2023-11-03","publicationTypes":"Journal Article"}
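The fake news abstract above compares models on accuracy, precision, recall, and the F1 score. As a minimal sketch of how those four metrics relate for a binary fake/real task, the snippet below computes them from a confusion count. The helper `binary_metrics` and the toy labels are hypothetical and are not the authors' code or data.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for one positive class."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for illustration: 1 = fake, 0 = real.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters when the classes are imbalanced, since a model that labels everything "real" can still score high accuracy while missing every fake article.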