Despite the high potential shown by spiking neural networks (e.g., for processing temporal patterns), training them remains an open and complex problem. In theory, these networks are computationally as powerful as mainstream artificial neural networks, but in practice they have not yet reached the same accuracy levels. The major reason seems to be the lack of adequate training algorithms for deep spiking neural networks: spike signals are not differentiable, so there is no direct way to compute a gradient. Recently, a novel training method based on the (digital) simulation of certain quantum systems has been suggested. It has already shown interesting advantages, among them that no gradient needs to be computed. In this work, we apply this approach to the problem of training spiking neural networks and show that it is capable of training deep and complex spiking neural networks on the MNIST data set.
{"title":"On Training Spiking Neural Networks by Means of a Novel Quantum Inspired Machine Learning Method","authors":"Jean Michel Sellier, Alexandre Martini","doi":"10.1002/ail2.114","DOIUrl":"https://doi.org/10.1002/ail2.114","url":null,"PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"6 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.114","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143565188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ghalia Nassreddine, Amal El Arid, Mohamad Nassereddine, Obada Al Khatib
The deployment of solar photovoltaic (PV) panel systems as renewable energy sources has risen recently. Consequently, it is imperative to implement efficient methods for the accurate detection and diagnosis of PV system faults to prevent unexpected power disruptions. This paper introduces a strategy for fault identification and classification using machine learning (ML) techniques. The study aimed to use ML algorithms to identify and classify normal operation and seven different types of faults in two operational modes (maximum power point tracking and intermediate power point tracking). Four ML algorithms, including ensemble methods (decision trees, k-nearest neighbors, random forest, and extreme gradient boosting (XGBoost)), were employed, followed by hyperparameter tuning and cross-validation to determine the best configuration. The results indicated that ensemble methods, particularly XGBoost, excelled in detecting and classifying faults in PV systems, achieving 99% accuracy after hyperparameter tuning. The true positive rate (TPR) values show a high sensitivity of 0.999, with some reaching a perfect score of 1.000. The false positive rate (FPR) values are very low, with the majority at or close to 0%. This performance is crucial in the solar energy context, as failing to detect faults can result in significant energy loss and increased maintenance costs.
{"title":"Fault Detection and Classification for Photovoltaic Panel System Using Machine Learning Techniques","authors":"Ghalia Nassreddine, Amal El Arid, Mohamad Nassereddine, Obada Al Khatib","doi":"10.1002/ail2.115","DOIUrl":"https://doi.org/10.1002/ail2.115","url":null,"PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"6 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.115","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143554350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
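The TPR and FPR figures quoted in the photovoltaic fault-classification abstract above are per-class rates. The abstract does not say how they were computed, so the following is only a minimal sketch of the usual one-vs-rest definition for a multi-class classifier. The function name `per_class_rates`, the label strings, and the toy predictions are illustrative assumptions, not data from the paper.

```python
from collections import Counter


def per_class_rates(y_true, y_pred):
    """Return {label: (TPR, FPR)}, treating each class as one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    total = len(y_true)
    rates = {}
    for c in labels:
        tp = pairs[(c, c)]
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        fp = sum(n for (t, p), n in pairs.items() if t != c and p == c)
        tn = total - tp - fn - fp
        tpr = tp / (tp + fn) if tp + fn else 0.0  # sensitivity
        fpr = fp / (fp + tn) if fp + tn else 0.0  # false-alarm rate
        rates[c] = (tpr, fpr)
    return rates


# Toy labels (hypothetical): one fault_1 sample is misread as fault_2.
y_true = ["normal", "normal", "fault_1", "fault_1", "fault_2", "fault_2"]
y_pred = ["normal", "normal", "fault_1", "fault_2", "fault_2", "fault_2"]
rates = per_class_rates(y_true, y_pred)
for label, (tpr, fpr) in rates.items():
    print(f"{label}: TPR={tpr:.3f} FPR={fpr:.3f}")
```

In a fault-detection setting, a low TPR for a fault class means missed faults, while a high FPR for a fault class means false alarms on healthy panels.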
Company earnings calls are pivotal events that offer crucial insights into a company's financial well-being and future outlook. Large language models (LLMs) present a promising avenue for automatically generating the initial draft of earnings call scripts, leveraging financial data and past examples. We evaluate two distinct methods: (1) few-shot learning prompt engineering with a large language model (LLM) and (2) fine-tuning a large language model on earnings call transcript data. Our findings indicate that both methods can produce coherent scripts encompassing key metrics, updates, and guidance. However, there are inherent trade-offs in comprehensiveness, potential hallucinations, writing style, ease of use, and cost. We discuss the pros and cons of each method to guide practitioners on effectively harnessing LLMs for earnings call script generation. Notably, we employ a human and two different LLMs to act as judges to compare the outcomes generated by the two approaches.
{"title":"Earnings Call Scripts Generation With Large Language Models Using Few-Shot Learning Prompt Engineering and Fine-Tuning Methods","authors":"Sovik Kumar Nath, Yanyan Zhang, Jia Vivian Li","doi":"10.1002/ail2.110","DOIUrl":"https://doi.org/10.1002/ail2.110","url":null,"PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.110","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143404332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dean Wyatte, Fatemeh Tahmasbi, Ming Li, Thomas Markovich
Modern large language models (LLMs) represent a paradigm shift in what can plausibly be expected of machine learning models. The fact that LLMs can effectively generate sensible answers to a diverse range of queries suggests that they would be useful in customer support applications. While powerful, LLMs have been observed to be prone to hallucination, which unfortunately makes their near-term use in customer support applications challenging. To address this issue, we present a system that allows us to use an LLM to augment our customer support advocates by re-framing the language modeling task as a discriminative classification task. In this framing, we seek to present the Top-K best template responses for a customer support advocate to use when responding to a customer. We present the results of both offline and online experiments, in which we observed offline gains and statistically significant online lifts for our experimental system. Along the way, we present observed scaling curves for validation loss and Top-K accuracy, resulting from model parameter ablation studies. We close by discussing the space of trade-offs with respect to model size, latency, and accuracy, as well as suggesting future applications to explore.
{"title":"Scaling Laws for Discriminative Classification in Large Language Models","authors":"Dean Wyatte, Fatemeh Tahmasbi, Ming Li, Thomas Markovich","doi":"10.1002/ail2.109","DOIUrl":"https://doi.org/10.1002/ail2.109","url":null,"PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.109","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143404303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
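The discriminative framing in the scaling-laws abstract above ranks a fixed set of template responses and reports Top-K accuracy. The sketch below shows one common way to compute that metric from per-template scores. The score values, the three-template setup, and the helper names `top_k` and `top_k_accuracy` are assumptions made for illustration. The abstract does not describe the authors' evaluation code.

```python
def top_k(scores, k):
    """Indices of the k highest-scoring templates, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def top_k_accuracy(score_rows, targets, k):
    """Fraction of examples whose correct template is among the top k."""
    hits = sum(1 for scores, t in zip(score_rows, targets) if t in top_k(scores, k))
    return hits / len(targets)


# Hypothetical classifier scores over three templates for three messages.
score_rows = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
targets = [1, 1, 0]  # index of the template a human advocate would pick

print(top_k_accuracy(score_rows, targets, k=1))
print(top_k_accuracy(score_rows, targets, k=2))
```

Showing an advocate K candidates rather than one trades a longer list for a higher chance that the right template is present, which is why Top-K accuracy rises with K.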
Jaspreet Kaur, Kang Tan, Muhammad Z. Khan, Olaoluwa R. Popoola, Muhammad A. Imran, Qammer H. Abbasi, Hasan T. Abbas
Accurately determining the indoor location of mobile devices has garnered significant interest due to the complex challenges posed by non-line-of-sight (NLOS) propagation and multipath effects. To address this challenge, this paper proposes a new approach to indoor positioning that utilises channel state information (CSI) and machine learning (ML) techniques to improve accuracy. The proposed method extracts the amplitude and phase differences of the subcarriers from the CSI data to create fingerprints. ML algorithms and a network architecture are trained on the CSI data from two antennas, in the form of phase and amplitude. Experiments conducted in a standard indoor environment demonstrate the effectiveness of the proposed method.
{"title":"Fingerprinting-Based Indoor Localization in a 3 × 3 Meter Grid Using OFDM Signals at Sub-6 GHz","authors":"Jaspreet Kaur, Kang Tan, Muhammad Z. Khan, Olaoluwa R. Popoola, Muhammad A. Imran, Qammer H. Abbasi, Hasan T. Abbas","doi":"10.1002/ail2.104","DOIUrl":"https://doi.org/10.1002/ail2.104","url":null,"PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.104","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143404563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
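The indoor-localization abstract above builds fingerprints from subcarrier amplitudes and phase differences measured at two antennas, but it does not give the exact feature layout. As a rough sketch only, one plausible construction takes the complex CSI value of each subcarrier at both antennas and stores the two amplitudes plus the inter-antenna phase difference. The function name `csi_fingerprint`, the feature ordering, and the sample values are assumptions, not the authors' pipeline.

```python
import cmath


def csi_fingerprint(ant_a, ant_b):
    """Feature vector from per-subcarrier complex CSI of two antennas.

    For every subcarrier: |h_a|, |h_b|, and the phase difference between
    the antennas wrapped to (-pi, pi]. The layout is an illustrative choice.
    """
    if len(ant_a) != len(ant_b):
        raise ValueError("both antennas must report the same subcarriers")
    features = []
    for h_a, h_b in zip(ant_a, ant_b):
        # The phase of h_a * conj(h_b) equals phase(h_a) - phase(h_b), already wrapped.
        phase_diff = cmath.phase(h_a * h_b.conjugate())
        features.extend([abs(h_a), abs(h_b), phase_diff])
    return features


# Two subcarriers with made-up complex channel values.
ant_a = [1 + 1j, 2j]
ant_b = [1 + 0j, 2 + 0j]
fingerprint = csi_fingerprint(ant_a, ant_b)
print(fingerprint)
```

Using the phase difference between antennas rather than the raw phase is a common choice because an offset shared by both receive chains cancels in the product `h_a * conj(h_b)`.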
Miguel G. Folgado, Verónica Sanz, Johannes Hirn, Edgar Lorenzo-Sáez, Javier F. Urchueguía
Traffic congestion represents a significant urban challenge, with notable implications for public health and environmental well-being. Consequently, urban decision-makers prioritize the mitigation of congestion. This study examines how extensive data on urban traffic dynamics, coupled with comprehensive knowledge of road networks, can enable Artificial Intelligence (AI) to forecast traffic flux well in advance. Such forecasts hold promise for informing emission reduction measures, particularly those aligned with Low Emission Zone policies. The investigation centers on Valencia, leveraging its robust traffic sensor infrastructure, one of the most densely deployed worldwide, encompassing approximately 3500 sensors strategically positioned across the city. Using historical data spanning 2016 and 2017, we train and characterize a Long Short-Term Memory (LSTM) Neural Network for the prediction of temporal traffic patterns. Our findings demonstrate the LSTM's efficacy in real-time forecasting of traffic flow evolution, facilitated by its ability to discern salient patterns within the dataset.
{"title":"Towards Predictive Pollution Control Through Traffic Flux Forecasting With Deep Learning: A Case Study in the City of Valencia","authors":"Miguel G. Folgado, Verónica Sanz, Johannes Hirn, Edgar Lorenzo-Sáez, Javier F. Urchueguía","doi":"10.1002/ail2.106","DOIUrl":"https://doi.org/10.1002/ail2.106","url":null,"PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.106","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143404766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
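Training a sequence model such as the LSTM in the Valencia traffic abstract above usually starts by cutting each sensor's time series into fixed-length input windows paired with the values to be forecast. The sketch below shows that windowing step only. The window lengths, the helper name `make_windows`, and the toy flux values are assumptions, since the abstract does not state the authors' preprocessing.

```python
def make_windows(series, n_in, n_out):
    """Split a 1-D series into (input window, forecast target) pairs."""
    samples = []
    for start in range(len(series) - n_in - n_out + 1):
        x = series[start:start + n_in]                  # past readings fed to the model
        y = series[start + n_in:start + n_in + n_out]   # future readings to predict
        samples.append((x, y))
    return samples


# Hypothetical vehicle counts from a single sensor.
flux = [10, 12, 15, 30, 45, 40, 22]
pairs = make_windows(flux, n_in=3, n_out=2)
for x, y in pairs:
    print(x, "->", y)
```

Splitting such windows chronologically, rather than at random, keeps future readings out of the training set when the data span consecutive years.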
Mechanical ventilation (MV) is used in subjects with respiratory problems for assistance in breathing. Various MV settings are adjusted at the clinician's discretion based on the patient's respiratory condition. In this study, an AI (artificial intelligence) model using artificial neural networks (ANNs) along with Bayesian Optimization (BO) was developed to estimate the desired MV settings for various subject scenarios. The ANN model with two hidden layers was trained with experimental data collected from subjects (canines and felines) in our previous work. Inverse mapping of the trained ANNs was conducted using BO to predict the acceptable MV settings for specific subject outcomes. Our results suggest that the model can support veterinarians in estimating the proper MV parameters for optimal subject outcome.
{"title":"Mechanical Ventilator Settings Estimation From an AI Model","authors":"Ali Moghadam, Ramana M. Pidaparti","doi":"10.1002/ail2.103","DOIUrl":"https://doi.org/10.1002/ail2.103","url":null,"PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.103","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143404746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biological function often depends on complex mechanisms of a dynamic, time-variant nature. An example is certain bat species (horseshoe bats—Rhinolophidae) that use intricate pinna musculatures to execute a variety of pinna deformations. While prior work has indicated the potential significance of these motions for sensory information encoding, it remains unclear how the complex time-variant pinna geometries could be controlled to enhance sensory performance. To address this issue, this work has investigated deep neural network models as digital twins for biomimetic pinnae. The networks were trained to predict the acoustic impacts of the deformed pinna geometries. A total of three network architectures have been evaluated for this purpose using physical numerical simulations (boundary element method) as ground truth. The networks predicted the acoustic beampattern function from pinna shape or even directly from the states of actuators that were used to deform the pinna shapes in simulation. Inserting prior knowledge in the form of beam-shaped basis functions did not improve network performance. The ability of the networks to produce beampattern predictions with low computational effort (in about three milliseconds each) should lend itself readily to supporting learning methods such as deep reinforcement learning that require many such functional evaluations.
{"title":"Deep Learning-Driven Modeling of Dynamic Acoustic Sensing in Biomimetic Soft-Robotic Pinnae","authors":"Sounak Chakrabarti, Rolf Müller","doi":"10.1002/ail2.107","DOIUrl":"https://doi.org/10.1002/ail2.107","url":null,"PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.107","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143404434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sentiment classification deals with extracting and classifying the sentiment of text. The Fuzzy Deep Belief Network (DBN) has proved efficient for sentiment analysis and suitable for classifying unlabeled or semi-labeled data. Previous deep belief network structures mostly use traditional activation functions such as the sigmoid. In this paper, a new activation function, the hyperbolic secant function, is proposed. The new activation function not only solves the gradient zeroing problem but also increases accuracy and efficiency. In addition, an extreme learning machine (ELM) is proposed as the decision layer to increase accuracy and improve generalizability by addressing the gradient-based learning problem. The proposed method was evaluated on the “IMDB” movie critic dataset, the 20-newspaper dataset, and the Sentiment Analysis dataset. The results of the proposed method are more accurate and precise than those of previous approaches.
{"title":"A Hybrid Fuzzy Deep Belief Network Extreme Learning Machine Framework With Hyperbolic Secant Activation Function for Robust Semi-Supervised Sentiment Classification","authors":"Maryam Mozafari, Mohammad Hossein Moattar","doi":"10.1002/ail2.102","DOIUrl":"https://doi.org/10.1002/ail2.102","url":null,"PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.102","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143404388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
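The sentiment-classification abstract above names the hyperbolic secant as its activation function without giving a formula. The sketch below uses the standard mathematical definition, sech(x) = 1 / cosh(x), whose derivative is -sech(x) tanh(x). Whether the paper scales or shifts the function is not stated, so this plain form and the helper names `sech` and `sech_grad` are assumptions.

```python
import math


def sech(x):
    """Hyperbolic secant, 1 / cosh(x), written to avoid overflow for large |x|."""
    e = math.exp(-abs(x))            # always in (0, 1], so it cannot overflow
    return 2.0 * e / (1.0 + e * e)   # algebraically equal to 2 / (e^x + e^-x)


def sech_grad(x):
    """Derivative of sech: d/dx sech(x) = -sech(x) * tanh(x)."""
    return -sech(x) * math.tanh(x)


for x in (-2.0, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  sech={sech(x):.4f}  grad={sech_grad(x):+.4f}")
```

The function peaks at 1 for x = 0 and is symmetric about zero, unlike the monotonic sigmoid it replaces in the abstract's framing.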
Faisal Ahmed Sifat, Md Sahadul Hasan Arian, Saif Ahmed, Taseef Hasan Farook, Nabeel Mohammed, James Dudley
The aim was to create and validate a transformer-based deep neural network architecture for classifying 3D scans of teeth for computer-assisted manufacturing and dental prosthetic rehabilitation, surpassing previously reported validation accuracies obtained with convolutional neural networks (CNNs). Two forms of preprocessing, voxel-based representation and encoding of input data in a high-dimensional space, were investigated using 34 3D models of teeth obtained from intraoral scanning. Independent CNNs and vision transformers (ViTs), and their combination (a CNN and ViT hybrid model), were implemented to classify the 3D scans directly from standard tessellation language (.stl) files, and an Explainable AI (ExAI) model was generated to qualitatively explore the deterministic patterns that influenced the outcomes of the automation process. The results demonstrate that the CNN and ViT hybrid model architecture surpasses a conventional supervised CNN, achieving a consistent validation accuracy of 90% through three-fold cross-validation. This process validated our initial findings: each instance had the opportunity to be part of the validation set, ensuring it remained unseen during training. Furthermore, employing high-dimensional encoding of input data solely with a 3DCNN yields a validation accuracy of 80%. When voxel data preprocessing is utilized, ViT outperforms CNN, achieving validation accuracies of 80% and 50%, respectively. The study also highlighted the saliency map's ability to identify areas of tooth cavity preparation of restorative importance, which can theoretically enable more accurate 3D printed prosthetic outputs. The investigation introduced a CNN and ViT hybrid model for classification of 3D tooth models in digital dentistry, and it was the first to employ ExAI in the efforts to automate the process of dental computer-assisted manufacturing.
{"title":"An Application of 3D Vision Transformers and Explainable AI in Prosthetic Dentistry","authors":"Faisal Ahmed Sifat, Md Sahadul Hasan Arian, Saif Ahmed, Taseef Hasan Farook, Nabeel Mohammed, James Dudley","doi":"10.1002/ail2.101","DOIUrl":"https://doi.org/10.1002/ail2.101","url":null,"PeriodicalId":72253,"journal":{"name":"Applied AI letters","volume":"5 4","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/ail2.101","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142859877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
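The prosthetic-dentistry abstract above relies on three-fold cross-validation over 34 scans, with every scan appearing in a validation set and staying unseen by the model trained for that fold. The sketch below shows how such folds can be generated. The shuffling seed, the round-robin fold assignment, and the helper name `k_fold_indices` are assumptions for illustration and are not taken from the study.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return k (train, validation) index splits covering every sample once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # fixed seed for repeatability
    folds = [indices[i::k] for i in range(k)]     # round-robin assignment to folds
    splits = []
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, val))
    return splits


splits = k_fold_indices(n_samples=34, k=3)
for fold_no, (train, val) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(train)} training scans, {len(val)} validation scans")
```

With only 34 samples each validation fold holds 11 or 12 scans, so a single misclassified scan moves the fold accuracy by roughly 8 to 9 percentage points.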