Machine learning (ML) algorithms have demonstrated strong performance in dysphonia detection from speech samples. However, their efficacy often diminishes when they are tested on languages that differ from the training data, which raises questions about their suitability in clinical settings. This study aims to develop a robust method for cross- and multi-lingual dysphonia detection that overcomes the language dependency of existing ML methods. We propose an approach that leverages speech embeddings from speaker verification models, specifically ECAPA and x-vector, and employs a majority voting ensemble classifier. We use speech features extracted from ECAPA and x-vector embeddings to train three distinct classifiers. The main advantage of these embedding models lies in their capability to capture speaker characteristics in a language-independent manner, forming fixed-dimensional feature spaces. Additionally, we investigate the impact of generating synthetic data within the embedding feature space using the Synthetic Minority Oversampling Technique (SMOTE). Our experimental results demonstrate the effectiveness of the proposed method for dysphonia detection. Compared with x-vector embeddings, ECAPA consistently performs better at distinguishing healthy from dysphonic speech, achieving accuracies of 93.33% and 96.55% in the cross-lingual and multi-lingual scenarios, respectively. This highlights the capability of speaker verification models, especially ECAPA, to capture language-independent features that enhance overall detection performance. The proposed method effectively addresses the challenge of language dependency in dysphonia detection. ECAPA embeddings, combined with majority voting ensemble classifiers, show significant potential for improving the accuracy and reliability of dysphonia detection in cross- and multi-lingual scenarios.
Automatic cross- and multi-lingual recognition of dysphonia by ensemble classification using deep speaker embedding models. Dosti Aziz, Dávid Sztahó. Expert Systems, published 2024-06-12. DOI: 10.1111/exsy.13660. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/exsy.13660
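The two generic techniques named in the dysphonia abstract above, majority voting and SMOTE-style oversampling, can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the abstract does not name the three classifiers or the SMOTE settings, so the predictions, the toy 2-D "embeddings", and the parameters `k` and `seed` below are all hypothetical.

```python
import random
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most classifiers for one sample."""
    # With an odd number of binary classifiers a tie cannot occur.
    return Counter(predictions).most_common(1)[0][0]


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority points by interpolating between a
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [m for m in minority if m is not x]
        neighbours = sorted(others, key=lambda o: squared_distance(x, o))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Hypothetical outputs of three classifiers for four recordings
# (0 = healthy, 1 = dysphonic); one inner list per classifier.
per_classifier = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]
ensemble = [majority_vote(votes) for votes in zip(*per_classifier)]
print(ensemble)  # [1, 1, 1, 0]

# Hypothetical 2-D stand-ins for minority-class embedding vectors.
minority_embeddings = [[0.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
new_points = smote(minority_embeddings, n_new=5)
```

Because every synthetic point lies on a segment between two real minority samples, oversampling in the embedding space never leaves the region those samples span.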
Alexânder Araújo Reis, Rafael Ângelo Santos Leite, Cícero Eduardo Walter, Igor Bezerra Reis, Ramiro Gonçalves, J. Martins, Frederico Branco, M. Au‐Yong‐Oliveira
The purpose of this study is to ascertain the hierarchical importance of a patent's characteristics to licensing. The research is causal‐exploratory, in that it seeks to establish relationships between variables. It aims to identify which characteristics influence the licensing of Brazilian academic patents in the biotechnology and pharmaceutical technology fields, based on mining the data contained in licensed and unlicensed patent documents. The research question is: which characteristics of Brazilian academic patents are most influential in their licensing potential? To answer it, an analysis using Random Forest was performed. To the best of our knowledge, there are no studies in Brazil that use machine learning to identify which characteristics influence the licensing of a particular academic patent, especially given the difficulty of gathering this information. We found that, regardless of the importance measure used, the three most critical licensing characteristics for the biotechnology and pharmaceutical patents analysed are Patent Scope, Life Cycle, and Claims, while the least important is the Patent Cooperation Treaty. The relevance of this research is that, once the intrinsic characteristics that influence the final value and licensing probability of a given patent have been identified, it becomes possible to develop mathematical models that provide accurate information for establishing technology transfer agreements. In practical terms, the results suggest that greater patent versatility, combined with lifecycle management and a technical effort to build strong claims, increases the licensing potential of academic biopharmaceutical patents.
The hierarchical importance of patent's characteristics to licensing: An analysis through Random Forest. Alexânder Araújo Reis, Rafael Ângelo Santos Leite, Cícero Eduardo Walter, Igor Bezerra Reis, Ramiro Gonçalves, J. Martins, Frederico Branco, M. Au‐Yong‐Oliveira. Expert Systems, published 2024-06-12. DOI: 10.1111/exsy.13661
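Random forests usually rank features either by the impurity decrease they produce when used for splits, averaged over the trees, or by the drop in accuracy when a feature is permuted. The patent abstract above does not say which measures were used, so the sketch below only illustrates the building block of the first kind: the Gini impurity decrease of a single split. The feature values, labels, and thresholds are hypothetical and are not data from the study.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = licensed, 0 = not)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of label 1
    return 2 * p * (1 - p)


def impurity_decrease(feature, labels, threshold):
    """Gini impurity removed by splitting the samples at feature <= threshold."""
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Hypothetical toy data: four patents, two candidate features.
licensed = [0, 0, 1, 1]
scope = [1, 2, 5, 6]     # hypothetical "patent scope" counts
pct_filed = [0, 1, 0, 1]  # hypothetical "filed via PCT" flags

print(impurity_decrease(scope, licensed, threshold=3))        # 0.5
print(impurity_decrease(pct_filed, licensed, threshold=0.5))  # 0.0
```

In this toy example the split on `scope` separates the classes perfectly and removes all of the parent impurity, while the split on `pct_filed` removes none. A forest-level importance score averages such decreases over every split and every tree.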
Jing Wang, Rundong Xin, Osama Alfarraj, Amr M. Tolba, Qitao Tang
Recently, face recognition based on homomorphic encryption for privacy preservation has garnered significant attention. However, there are two major challenges with homomorphic encryption methods: the security and efficiency of face recognition systems. We present a more efficient and secure PUM (Privacy preserving security Using Multi‐key homomorphic encryption) mechanism for facial recognition. By integrating feature grouping with parallel computing, we enhance the efficiency of homomorphic operations. The use of multi‐key encryption ensures the security of the facial recognition system. This approach improves the security and speed of facial recognition systems in cloud computing scenarios, increasing the original 128‐bit security to a maximum of 1664‐bit security. In terms of efficiency, comparing encrypted images takes only 0.302 s, with an accuracy rate of 99.425%. When applied to a campus scenario, the average search time for a facial template library containing 700 encrypted features is approximately 1.5 s. Consequently, our solution not only ensures user privacy but also demonstrates superior operational efficiency and practical value. In comparison to recently emerged ciphertext facial recognition systems, our solution has demonstrated notable enhancements in both security and time efficiency.
Privacy preserving security using multi‐key homomorphic encryption for face recognition. Jing Wang, Rundong Xin, Osama Alfarraj, Amr M. Tolba, Qitao Tang. Expert Systems, published 2024-06-11. DOI: 10.1111/exsy.13645
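The face-recognition abstract above credits its speed-up to combining feature grouping with parallel computing. The sketch below shows only that structural idea, on plaintext integers: a feature vector is split into groups, a partial squared distance is computed per group in parallel, and the partial results are summed. No encryption is performed here. In an encrypted system each per-group operation would be a homomorphic operation on ciphertexts, and the distance measure, group size, and function names are assumptions made for illustration, not the PUM design.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_groups(vector, group_size):
    """Split a feature vector into consecutive groups of at most group_size."""
    return [vector[i:i + group_size] for i in range(0, len(vector), group_size)]


def partial_squared_distance(pair):
    """Squared distance between two feature groups (plaintext stand-in)."""
    a, b = pair
    return sum((x - y) ** 2 for x, y in zip(a, b))


def grouped_squared_distance(u, v, group_size):
    """Compute the squared distance group by group, with groups in parallel."""
    pairs = list(zip(split_into_groups(u, group_size),
                     split_into_groups(v, group_size)))
    with ThreadPoolExecutor() as pool:
        # Each group is independent, so the partial results can be
        # computed concurrently and summed at the end.
        return sum(pool.map(partial_squared_distance, pairs))


# Hypothetical 6-dimensional integer feature vectors.
probe = [1, 2, 3, 4, 5, 6]
template = [1, 0, 3, 1, 5, 2]
print(grouped_squared_distance(probe, template, group_size=2))  # 29
```

The grouped result equals the distance computed over the whole vector, because a sum of squares can be partitioned freely. That independence between groups is what makes the parallel evaluation possible.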
Gazi Md Daud Iqbal, Jay Rosenberger, Matthew Rosenberger, Muhammad Shah Alam, Lidan Ha, Emmanuel Anoruo, Sadie Gregory, Tom Mazzone
Global temperature is increasing at an alarming rate, which increases the number of heatwaves. Heatwaves have significant direct and indirect impacts on human and natural systems and can create considerable risk to public health. Predicting the occurrence of a heatwave can save lives, increase crop production, improve water quality, and reduce transportation restrictions. Because of its geographical location, Bangladesh is particularly vulnerable to cyclones, droughts, earthquakes, floods, and heatwaves. The Bangladesh Meteorological Department collects temperature data at multiple weather stations, and we use data from 10 of these stations in this research. The data show that most heatwaves occur in the summer months, namely April, May, and June. In this research, we develop Classification and Regression Tree (CART) models that use daily temperature data for March, April, May, and June to predict the likelihood of a heatwave within the next 7 days, within the next 28 days, and on any particular day, based on daily high temperatures from the previous 14 days. We also vary the model parameters to evaluate the accuracy of the models. Finally, we develop treed Stepwise Logistic Regression models to predict the probability of heatwaves occurring. Although this research uses data from the Bangladesh Meteorological Department, the modeling approach can be applied in other geographic regions.
A supervised learning tool for heatwave predictions using daily high summer temperatures. Gazi Md Daud Iqbal, Jay Rosenberger, Matthew Rosenberger, Muhammad Shah Alam, Lidan Ha, Emmanuel Anoruo, Sadie Gregory, Tom Mazzone. Expert Systems, published 2024-06-11. DOI: 10.1111/exsy.13656
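The heatwave abstract above describes predicting a heatwave within the next 7 days from the previous 14 daily highs. A classifier such as CART needs that framing turned into labelled examples, which the sketch below illustrates. The abstract does not define a heatwave, so the single-day temperature `threshold` used for labelling is a hypothetical placeholder, as are the function name and the toy temperature series.

```python
def build_examples(daily_highs, threshold, history=14, horizon=7):
    """Turn a series of daily high temperatures into (features, label) pairs.

    features: the `history` daily highs before day t
    label:    1 if any of the next `horizon` days (starting at day t)
              reaches `threshold`, else 0 (hypothetical heatwave definition)
    """
    examples = []
    for t in range(history, len(daily_highs) - horizon + 1):
        features = daily_highs[t - history:t]
        upcoming = daily_highs[t:t + horizon]
        label = int(any(temp >= threshold for temp in upcoming))
        examples.append((features, label))
    return examples


# Hypothetical series: 29 days at 30 degrees C except one hot day (index 16).
highs = [30.0] * 16 + [37.0] + [30.0] * 12
examples = build_examples(highs, threshold=36.0)
labels = [label for _, label in examples]
print(labels)  # [1, 1, 1, 0, 0, 0, 0, 0, 0]
```

Each pair could then be passed to any tree learner. Changing `horizon` to 28 gives the longer prediction window mentioned in the abstract.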
Generative Artificial Intelligence (GAI) is an emerging field that promises the creation of synthetic data and outputs in different modalities. GAI has recently shown impressive results across a large spectrum of applications, including biology, medicine, education, legislation, computer science, and finance. As one strives for enhanced safety, efficiency, and sustainability, generative AI emerges as a key differentiator and promises a paradigm shift in the field. This article explores the potential applications of generative AI and large language models in geoscience. Recent developments in machine learning and deep learning have made generative models useful for tackling diverse prediction, simulation, and multi-criteria decision-making problems related to geoscience and Earth system dynamics. This survey discusses several GAI models that have been used in geoscience, including generative adversarial networks (GANs), physics-informed neural networks (PINNs), and generative pre-trained transformer (GPT)-based structures. These tools have helped the geoscience community in several applications, including (but not limited to) data generation/augmentation, super-resolution, panchromatic sharpening, haze removal, restoration, and land surface change. Some challenges remain, such as ensuring physical interpretability, preventing nefarious use, and establishing trustworthiness. Beyond that, GAI models show promise for the geoscience community, especially in supporting climate change research, urban science, atmospheric science, marine science, and planetary science through their capacity for data-driven modelling and uncertainty quantification.
When geoscience meets generative AI and large language models: Foundations, trends, and future challenges. Abdenour Hadid, Tanujit Chakraborty, Daniel Busby. Expert Systems, published 2024-06-11. DOI: 10.1111/exsy.13654. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1111/exsy.13654