Pub Date: 2023-12-07 | DOI: 10.1109/JTEHM.2023.3340345
Dosti Aziz;Sztahó Dávid
Objective: Beyond serving as the primary communication medium, speech carries valuable information about a speaker’s health, emotions, and identity. Various conditions can affect the vocal organs, leading to speech difficulties. Voice clinicians and academic researchers have studied speech analysis extensively, but previous approaches primarily focused on a single task, such as differentiating between normal and dysphonic speech, classifying different voice disorders, or estimating the severity of voice disorders. Methods and procedures: This study proposes an approach that combines transfer learning and multitask learning (MTL) to perform dysphonia classification and severity estimation simultaneously. Both tasks use a shared representation; each task-specific network head is learned from these shared features. We employed five computer vision models and modified their architectures to support multitask learning. Additionally, we conducted binary ‘healthy vs. dysphonia’ and multiclass ‘healthy vs. organic and functional dysphonia’ classification using multitask learning, with the speaker’s sex as an auxiliary task. Results: The proposed method achieved improved performance across all classification metrics compared to single-task learning (STL), which performs only classification or severity estimation: the model achieved F1 scores of 93% in MTL versus 90% in STL. Moreover, evaluating the beta values that weight the sex-predicting auxiliary task showed considerable improvements in both classification tasks, with MTL reaching an accuracy of 77% compared to 73.2% for STL. The performance of severity estimation in MTL, however, was comparable to STL. Conclusion: Our goal is to improve how voice pathologists and clinicians understand patients’ conditions, to make it easier to track their progress, and to enhance the monitoring of vocal quality and treatment procedures. Clinical and Translational Impact Statement: By integrating classification and severity estimation of dysphonia through multitask learning, we aim to enable clinicians to better understand a patient’s situation and to effectively monitor their progress and voice quality.
{"title":"Multitask and Transfer Learning Approach for Joint Classification and Severity Estimation of Dysphonia","authors":"Dosti Aziz;Sztahó Dávid","doi":"10.1109/JTEHM.2023.3340345","DOIUrl":"https://doi.org/10.1109/JTEHM.2023.3340345","url":null,"abstract":"Objective: Despite speech being the primary communication medium, it carries valuable information about a speaker’s health, emotions, and identity. Various conditions can affect the vocal organs, leading to speech difficulties. Extensive research has been conducted by voice clinicians and academia in speech analysis. Previous approaches primarily focused on one particular task, such as differentiating between normal and dysphonic speech, classifying different voice disorders, or estimating the severity of voice disorders. Methods and procedures: This study proposes an approach that combines transfer learning and multitask learning (MTL) to simultaneously perform dysphonia classification and severity estimation. Both tasks use a shared representation; network is learned from these shared features. We employed five computer vision models and changed their architecture to support multitask learning. Additionally, we conducted binary ‘healthy vs. dysphonia’ and multiclass ‘healthy vs. organic and functional dysphonia’ classification using multitask learning, with the speaker’s sex as an auxiliary task. Results: The proposed method achieved improved performance across all classification metrics compared to single-task learning (STL), which only performs classification or severity estimation. Specifically, the model achieved F1 scores of 93% and 90% in MTL and STL, respectively. Moreover, we observed considerable improvements in both classification tasks by evaluating beta values associated with the weight assigned to the sex-predicting auxiliary task. MTL achieved an accuracy of 77% compared to the STL score of 73.2%. However, the performance of severity estimation in MTL was comparable to STL. Conclusion: Our goal is to improve how voice pathologists and clinicians understand patients’ conditions, make it easier to track their progress, and enhance the monitoring of vocal quality and treatment procedures. Clinical and Translational Impact Statement: By integrating both classification and severity estimation of dysphonia using multitask learning, we aim to enable clinicians to gain a better understanding of the patient’s situation, effectively monitor their progress and voice quality.","PeriodicalId":54255,"journal":{"name":"IEEE Journal of Translational Engineering in Health and Medicine-Jtehm","volume":"12 ","pages":"233-244"},"PeriodicalIF":3.4,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10347235","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139034042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-12-01 | DOI: 10.1109/JTEHM.2023.3338564
Mehrana Mohtasebi;Chong Huang;Mingjun Zhao;Siavash Mazdeyasna;Xuhui Liu;Samaneh Rabienia Haratbar;Faraneh Fathi;Jinghong Sun;Thomas Pittman;Guoqiang Yu
Malignant glioma (MG) is the most common type of primary malignant brain tumor. Surgical resection of MG remains the cornerstone of therapy, and the extent of resection correlates with patient survival. A limiting factor for resection, however, is the difficulty of differentiating tumor from normal tissue during surgery. Fluorescence imaging is an emerging technique for real-time intraoperative visualization of MGs and their boundaries. However, most clinical-grade neurosurgical operative microscopes with fluorescence imaging capability suffer from low adoption rates due to high cost, limited portability, limited operational flexibility, and a lack of skilled professionals with the required technical knowledge. To overcome these limitations, we integrated miniaturized light sources, flippable filters, and a recording camera into surgical eye loupes to create a wearable fluorescence eye loupe (FLoupe) device for intraoperative imaging of fluorescent MGs. Two FLoupe prototypes were constructed for imaging of fluorescein and 5-aminolevulinic acid (5-ALA), respectively. The wearable FLoupe devices were tested on tumor-simulating phantoms and on patients with MGs. Results were comparable to those of the standard neurosurgical operative microscope (PENTERO® 900) with fluorescence kits. The affordable and wearable FLoupe devices enable visualization of both color and fluorescence images with the same quality as large and expensive stationary operative microscopes, while allowing a greater range of movement, less obstruction, and faster, easier operation. The device thus reduces surgery time and adapts more easily to the surgical environment than unwieldy neurosurgical operative microscopes. Clinical and Translational Impact Statement: The affordable and wearable fluorescence imaging device developed in this study enables neurosurgeons to observe brain tumors with the same clarity and greater flexibility compared to bulky and costly operative microscopes.
{"title":"A Wearable Fluorescence Imaging Device for Intraoperative Identification of Human Brain Tumors","authors":"Mehrana Mohtasebi;Chong Huang;Mingjun Zhao;Siavash Mazdeyasna;Xuhui Liu;Samaneh Rabienia Haratbar;Faraneh Fathi;Jinghong Sun;Thomas Pittman;Guoqiang Yu","doi":"10.1109/JTEHM.2023.3338564","DOIUrl":"https://doi.org/10.1109/JTEHM.2023.3338564","url":null,"abstract":"Malignant glioma (MG) is the most common type of primary malignant brain tumors. Surgical resection of MG remains the cornerstone of therapy and the extent of resection correlates with patient survival. A limiting factor for resection, however, is the difficulty in differentiating the tumor from normal tissue during surgery. Fluorescence imaging is an emerging technique for real-time intraoperative visualization of MGs and their boundaries. However, most clinical grade neurosurgical operative microscopes with fluorescence imaging ability are hampered by low adoption rates due to high cost, limited portability, limited operation flexibility, and lack of skilled professionals with technical knowledge. To overcome the limitations, we innovatively integrated miniaturized light sources, flippable filters, and a recording camera to the surgical eye loupes to generate a wearable fluorescence eye loupe (FLoupe) device for intraoperative imaging of fluorescent MGs. Two FLoupe prototypes were constructed for imaging of Fluorescein and 5-aminolevulinic acid (5-ALA), respectively. The wearable FLoupe devices were tested on tumor-simulating phantoms and patients with MGs. Comparable results were observed against the standard neurosurgical operative microscope (PENTERO® 900) with fluorescence kits. The affordable and wearable FLoupe devices enable visualization of both color and fluorescence images with the same quality as the large and expensive stationary operative microscopes. The wearable FLoupe device allows for a greater range of movement, less obstruction, and faster/easier operation. Thus, it reduces surgery time and is more easily adapted to the surgical environment than unwieldy neurosurgical operative microscopes. Clinical and Translational Impact Statement—The affordable and wearable fluorescence imaging device developed in this study enables neurosurgeons to observe brain tumors with the same clarity and greater flexibility compared to bulky and costly operative microscopes.","PeriodicalId":54255,"journal":{"name":"IEEE Journal of Translational Engineering in Health and Medicine-Jtehm","volume":"12 ","pages":"225-232"},"PeriodicalIF":3.4,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10339301","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139081243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-11-27 | DOI: 10.1109/JTEHM.2023.3336889
Kaya Kuru;Darren Ansell;Dave Hughes;Benjamin Jon Watkinson;Fabrizio Gaudenzi;Martin Jones;David Lunardi;Noreen Caswell;Adela Rabella Montiel;Peter Leather;Daniel Irving;Kina Bennett;Corrin McKenzie;Paula Sugden;Carl Davies;Christian Degoede
Our study was designed to develop a customisable, wearable, and comfortable medical device, the so-called “MyPAD”, that monitors the fullness of the bladder and triggers an alarm indicating the need to void, in order to prevent bedwetting, i.e., to treat Nocturnal Enuresis (NE) at the pre-void stage using miniaturised mechatronics with Artificial Intelligence (AI). The developed features include: multiple bespoke ultrasound (US) probes for sensing, a bespoke electronic device housing custom US electronics for signal processing, a bedside alarm box for processing the echoed pulses and generating alarms, and a phantom to mimic the human body. The system was validated on the tissue-mimicking phantom and on volunteers using Bidirectional Long Short-Term Memory Recurrent Neural Networks (Bi-LSTM-RNN) and Reinforcement Learning (RL). A sensitivity (Se) of 99% and a specificity (Sp) of 99.5%, with an overall accuracy of 99.3%, were observed. The results provide empirical evidence for the viability of the device, both in monitoring bladder expansion to determine voiding need and in reinforcing the continuous learning and customisation of the device for bladder control through consecutive uses. Clinical impact: MyPAD will treat NE more effectively and efficiently than techniques currently in use (e.g., post-void alarms) and will i) quickly replace those techniques, considering sufferers’ condition while being treated by other approaches, and ii) enable children to gain control of incontinence over time and consistently have dry nights. Category: Early/Pre-Clinical Research
{"title":"Treatment of Nocturnal Enuresis Using Miniaturised Smart Mechatronics With Artificial Intelligence","authors":"Kaya Kuru;Darren Ansell;Dave Hughes;Benjamin Jon Watkinson;Fabrizio Gaudenzi;Martin Jones;David Lunardi;Noreen Caswell;Adela Rabella Montiel;Peter Leather;Daniel Irving;Kina Bennett;Corrin McKenzie;Paula Sugden;Carl Davies;Christian Degoede","doi":"10.1109/JTEHM.2023.3336889","DOIUrl":"https://doi.org/10.1109/JTEHM.2023.3336889","url":null,"abstract":"Our study was designed to develop a customisable, wearable, and comfortable medical device – the text so-called “MyPAD” that monitors the fullness of the bladder, triggering an alarm indicating the need to void, in order to prevent badwetting – i.e., treating Nocturnal Enuresis (NE) at the text pre-void stage using miniaturised mechatronics with Artificial Intelligence (AI). The developed features include: multiple bespoke ultrasound (US) probes for sensing, a bespoke electronic device housing custom US electronics for signal processing, a bedside alarm box for processing the echoed pulses and generating alarms, and a phantom to mimic the human body. The validation of the system is conducted on the text tissue-mimicking phantom and volunteers using Bidirectional Long Short-Term Memory Recurrent Neural Networks (Bi-LSTM-RNN) and Reinforcement Learning (RL). A Se value of 99% and a Sp value of 99.5% with an overall accuracy rate of 99.3% are observed. The obtained results demonstrate successful empirical evidence for the viability of the device, both in monitoring bladder expansion to determine voiding need and in reinforcing the continuous learning and customisation of the device for bladder control through consecutive uses. Clinical impact: MyPAD will treat the NE better and efficiently against other techniques currently used (e.g., post-void alarms) and will i) replace those techniques quickly considering sufferers’ condition while being treated by other approaches, and ii) enable children to gain control of incontinence over time and consistently have dry nights. Category: Early/Pre-Clinical Research","PeriodicalId":54255,"journal":{"name":"IEEE Journal of Translational Engineering in Health and Medicine-Jtehm","volume":"12 ","pages":"204-214"},"PeriodicalIF":3.4,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10328832","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138485029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-11-21 | DOI: 10.1109/JTEHM.2023.3335608
Andrea Moglia;Luca Marsilio;Matteo Rossi;Maria Pinelli;Emanuele Lettieri;Luca Mainardi;Alfonso Manzotti;Pietro Cerveri
Objective: Recent advancements in augmented reality have led to planning and navigation systems for orthopedic surgery. However, little is known about mixed reality (MR) in orthopedics. Furthermore, artificial intelligence (AI) has the potential to boost the capabilities of MR by enabling automation and personalization. The purpose of this work is to assess the Holoknee prototype, based on AI and MR for multimodal data visualization and surgical planning in knee osteotomy, developed to run on the HoloLens 2 headset. Methods: Two preclinical test sessions were performed with 11 participants (eight surgeons, two residents, and one medical student), each executing six tasks three times, corresponding to a number of holographic data interactions and preoperative planning steps. At the end of each session, participants answered a questionnaire on user perception and usability. Results: During the second trial, participants were faster on all tasks than in the first, while in the third the execution time decreased only for two tasks (“Patient selection” and “Scrolling through radiograph”) with respect to the second attempt, but without a statistically significant difference (respectively $p$
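The abstract above compares per-task execution times across repeated trials and reports (truncated) p-values. As context, the sketch below shows one conventional way to run such a paired comparison: a Wilcoxon signed-rank test over the 11 participants' times for a single task in two consecutive trials. The truncated abstract does not name the test actually used, and the numbers here are placeholders.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
# Placeholder per-participant times (seconds) for one task in trials 2 and 3.
trial2 = rng.uniform(20.0, 60.0, size=11)
trial3 = trial2 + rng.uniform(-5.0, 5.0, size=11)

stat, p = wilcoxon(trial2, trial3)      # paired, non-parametric comparison
print(f"W = {stat:.1f}, p = {p:.3f}")   # p >= 0.05: no significant change
```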