Brain Informatics: Latest Articles
Explainability of random survival forests in predicting conversion risk from mild cognitive impairment to Alzheimer's disease.
Q1 Computer Science | Pub Date: 2023-11-18 | DOI: 10.1186/s40708-023-00211-w
Alessia Sarica, Federica Aracri, Maria Giovanna Bianco, Fulvia Arcuri, Andrea Quattrone, Aldo Quattrone

Random Survival Forests (RSF) have recently shown better performance than statistical survival methods such as Cox proportional hazards (CPH) in predicting conversion risk from mild cognitive impairment (MCI) to Alzheimer's disease (AD). However, RSF application in real-world clinical settings is still limited due to its black-box nature. For this reason, we aimed to provide a comprehensive study of RSF explainability with SHapley Additive exPlanations (SHAP) on biomarkers of stable and progressive patients (sMCI and pMCI) from the Alzheimer's Disease Neuroimaging Initiative. We evaluated three global explanations (RSF feature importance, permutation importance and SHAP importance) and quantitatively compared them with Rank-Biased Overlap (RBO). Moreover, we assessed whether multicollinearity among variables may perturb SHAP outcomes. Lastly, we stratified pMCI test patients into high, medium and low risk grades to investigate the individual SHAP explanation of one pMCI patient per risk group. We confirmed that RSF had higher accuracy (0.890) than CPH (0.819), and its stability and robustness were demonstrated by high overlap (RBO > 90%) between feature rankings within the first eight features. SHAP local explanations with and without correlated variables showed no substantial difference, indicating that multicollinearity did not alter the model. FDG, ABETA42 and HCI were the top-ranked features in global explanations, with the highest contribution also in local explanations. FAQ, mPACCdigit, mPACCtrailsB and RAVLT immediate had the highest influence among all clinical and neuropsychological assessments in increasing progression risk, as particularly evident in pMCI patients' individual explanations. In conclusion, our findings suggest that RSF represents a useful tool to support clinicians in estimating conversion-to-AD risk and that the SHAP explainer boosts its clinical utility with intelligible and interpretable individual outcomes that highlight key features associated with AD prognosis.
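
The ranking comparison mentioned in this abstract can be illustrated with a minimal sketch of Rank-Biased Overlap. This is not the authors' code. It assumes the truncated form RBO = (1 - p) * sum over d = 1..k of p^(d-1) * A_d, where A_d is the overlap of the two top-d prefixes divided by d, and it assumes a persistence parameter p = 0.9. The function name `rbo_truncated` and the two example rankings are hypothetical; the rankings only reuse feature names from the abstract and are not results from the study.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9):
    """Truncated Rank-Biased Overlap of two rankings (no extrapolation).

    Evaluates (1 - p) * sum_{d=1..k} p**(d-1) * A_d, where A_d is the
    fraction of items shared by the two top-d prefixes and k is the
    length of the shorter ranking. Identical rankings score 1 - p**k.
    """
    k = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    total = 0.0
    for d in range(1, k + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        agreement = len(seen_a & seen_b) / d  # A_d: overlap at depth d
        total += p ** (d - 1) * agreement
    return (1 - p) * total


# Hypothetical rankings, for illustration only (not results from the study).
rsf_rank = ["FDG", "ABETA42", "HCI", "FAQ"]
shap_rank = ["FDG", "HCI", "ABETA42", "FAQ"]
score = rbo_truncated(rsf_rank, shap_rank)
```

Because the sum is truncated at depth k, identical rankings score 1 - p^k rather than 1. Extrapolated RBO variants correct for this and would be needed to compare against a threshold such as the 90% overlap quoted above.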

Brain Informatics 10(1):31. Published 2023-11-18. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10657350/pdf/
Citations: 0
Semantic representation of neural circuit knowledge in Caenorhabditis elegans.
Q1 Computer Science | Pub Date: 2023-11-10 | DOI: 10.1186/s40708-023-00208-5
Sharan J Prakash, Kimberly M Van Auken, David P Hill, Paul W Sternberg

In modern biology, new knowledge is generated quickly, making it challenging for researchers to efficiently acquire and synthesise new information from the large volume of primary publications. To address this problem, computational approaches that generate machine-readable representations of scientific findings in the form of knowledge graphs have been developed. These representations can integrate different types of experimental data from multiple papers and biological knowledge bases in a unifying data model, providing a complementary method to manual review for interacting with published knowledge. The Gene Ontology Consortium (GOC) has created a semantic modelling framework that extends individual functional gene annotations to structured descriptions of causal networks representing biological processes (Gene Ontology-Causal Activity Modelling, or GO-CAM). In this study, we explored whether the GO-CAM framework could represent knowledge of the causal relationships between environmental inputs, neural circuits and behavior in the model nematode C. elegans [C. elegans Neural-Circuit Causal Activity Modelling (CeN-CAM)]. We found that, given extensions to several relevant ontologies, a wide variety of author statements from the literature about the neural circuit basis of egg-laying and carbon dioxide (CO2) avoidance behaviors could be faithfully represented with CeN-CAM. Through this process, we were able to generate generic data models for several categories of experimental results. We also discuss how semantic modelling may be used to functionally annotate the C. elegans connectome. Thus, Gene Ontology-based semantic modelling has the potential to support various machine-readable representations of neurobiological knowledge.

Brain Informatics 10(1):30. Published 2023-11-10. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638142/pdf/
Citations: 0
Predicting object properties based on movement kinematics.
Q1 Computer Science | Pub Date: 2023-11-04 | DOI: 10.1186/s40708-023-00209-4
Lena Kopnarski, Laura Lippert, Julian Rudisch, Claudia Voelcker-Rehage

In order to grasp and transport an object, grip and load forces must be scaled according to the object's properties (such as weight). To select the appropriate grip and load forces, the object weight is estimated based on experience or, in the case of robots, usually by use of image recognition. We propose a new approach that makes a robot's weight estimation less dependent on prior learning and, thereby, allows it to successfully grasp a wider variety of objects. This study evaluates whether it is feasible to predict an object's weight class in a replacement task based on the time series of upper body angles of the active arm or on object velocity profiles. Furthermore, we wanted to investigate how prediction accuracy is affected by (i) the length of the time series and (ii) different cross-validation (CV) procedures. To this end, we recorded and analyzed the movement kinematics of 12 participants during a replacement task. The participants' kinematics were recorded by an optical motion tracking system while transporting an object, 80 times in total from varying starting positions to a predefined end position on a table. The object's weight was modified (made lighter and heavier) without changing the object's visual appearance. Throughout the experiment, the object's weight (light/heavy) was randomly changed without the participant's knowledge. To predict the object's weight class, we used a discrete cosine transform to smooth and compress the time series and a support vector machine for supervised learning from the achieved discrete cosine transform parameters. Results showed good prediction accuracy (up to [Formula: see text], depending on the CV procedure and the length of the time series). Even at the beginning of a movement (after only 300 ms), we were able to predict the object weight reliably (within a classification rate of [Formula: see text]).
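
The compression step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: it uses an unnormalized DCT-II written in plain Python, keeps an arbitrary number of low-frequency coefficients (`n_keep=8`), and runs on a made-up bell-shaped velocity profile. The function names `dct2` and `dct_features` are hypothetical.

```python
import math


def dct2(signal):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(signal)
    return [
        sum(x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(signal))
        for k in range(n_samples)
    ]


def dct_features(signal, n_keep):
    """Keep the first n_keep (low-frequency) DCT coefficients as features."""
    return dct2(signal)[:n_keep]


# Toy bell-shaped velocity profile (made-up numbers, not study data).
velocity = [math.sin(math.pi * t / 49) ** 2 for t in range(50)]
features = dct_features(velocity, n_keep=8)  # n_keep=8 is an arbitrary choice
```

Dropping the high-frequency coefficients both smooths and shortens the series, which matches the "smooth and compress" role the abstract assigns to the transform. The resulting fixed-length vectors could then be passed to a support vector machine, for example scikit-learn's `SVC`, which is one possible choice and not confirmed by the source.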

Brain Informatics 10(1):29. Published 2023-11-04. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10625504/pdf/
Citations: 1
Attribute Selection Hybrid Network Model for risk factors analysis of postpartum depression using Social media.
Q1 Computer Science | Pub Date: 2023-10-31 | DOI: 10.1186/s40708-023-00206-7
Abinaya Gopalakrishnan, Raj Gururajan, Revathi Venkataraman, Xujuan Zhou, Ka Chan Ching, Arul Saravanan, Maitrayee Sen

Background and objective: Postpartum Depression (PPD) is a frequently ignored birth-related consequence. Social network analysis can be used to address this issue because social media networks serve as a platform for users to communicate with their friends and share opinions, photos, and videos, which reflect their moods, feelings, and sentiments. In this work, depression in mothers who have given birth is identified using the PPD score, and the mothers are separated into control and depressed groups. Deep learning methods have recently played a vital role in detecting depression. However, these methods still do not clarify why some people have been identified as depressed.

Methods: We developed the Attribute Selection Hybrid Network (ASHN) as a framework for detecting postpartum depression. We then analyzed the posts of mothers whose status had been confirmed by a score that domain experts calculated from a physiological questionnaire. The model analyzes the attributes of negative posts on Facebook, a large general forum, to diagnose depressed users. The framework explains the process of analyzing posts containing sentiment, depressive symptoms, and reflective thinking, and it suggests psycho-linguistic and stylistic attributes of depression in posts.

Results: The experimental results show that ASHN works well and is easy to understand. Four attribute networks based on psychological studies were used to analyze the different parts of posts by depressed users. The experiments covered the extraction of attributes based on psycho-linguistic markers and the recording of assessment metrics, including precision, recall and F1 score. The attributes were visualized title-wise and word-wise using word clouds and compared across daily-life, depressed and postpartum-depressed users. Furthermore, the ASHN model was compared against a baseline model.

Conclusions: The Attribute Selection Hybrid Network (ASHN) mimics the importance of attributes in social media posts to predict depressed mothers. Those mothers were identified as likely depressed based on their answers to a questionnaire designed by domain experts with prior knowledge of depression. This work will help researchers look at social media posts to find useful evidence for other depressive symptoms.
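
The assessment metrics named in the Results can be computed as in the following sketch. It uses the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 as their harmonic mean). The function name `precision_recall_f1` and the label vectors are made up for illustration and do not come from the study.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one positive class in a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up labels: 1 = depressed, 0 = control (not study data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Returning 0.0 when a denominator is empty is a convention assumed here; some libraries raise a warning in that case instead.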

Brain Informatics 10(1):28. Published 2023-10-31. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10618142/pdf/
Citations: 0
Investigating the mental health of university students during the COVID-19 pandemic in a UK university: a machine learning approach using feature permutation importance.
Q1 Computer Science | Pub Date: 2023-10-10 | DOI: 10.1186/s40708-023-00205-8
Tianhua Chen

Mental wellbeing of university students is a growing concern that has been worsening during the COVID-19 pandemic. Numerous studies have gathered empirical data to explore the mental health impact of the pandemic on university students and investigate factors associated with higher levels of distress. While the online questionnaire survey has been a prevalent means of collecting data, regression analysis has been observed to be the dominant approach for interpreting and understanding the impact of independent factors on a mental wellbeing state of interest. Drawbacks such as sensitivity to outliers and ineffectiveness when multiple predictors are highly correlated may limit the use of regression in complex scenarios. These observations motivate this research to propose alternative computational methods for investigating questionnaire data. Inspired by recent machine learning advances, this research aims to construct a framework based on feature permutation importance that enables the application of a variety of machine learning algorithms originating from different computational frameworks and learning theories, including algorithms that cannot directly provide exact numerical contributions of individual factors. Using different algorithms makes it possible to explore the quantitative impact of predictors on student mental wellbeing from multiple perspectives, thus complementing the single view that results from the dominant use of regression. Applying the proposed approach to an online survey in a UK university, the analysis suggests that past medical record, wellbeing history and the experience of adversity contribute significantly to mental wellbeing states, and that frequent communication with family and friends to maintain good relationships, as well as regular exercise, generally contributes to improved mental wellbeing.
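
The model-agnostic idea behind feature permutation importance can be sketched as below. This is an illustration under stated assumptions, not the paper's framework: importance is taken as the mean drop in accuracy after shuffling one feature column, the model is any callable mapping a row to a label, and the data, the names `accuracy`, `permutation_importance` and `toy_model`, and the parameters `n_repeats` and `seed` are all hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            permuted = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data: the label depends only on feature 0; feature 1 is noise.
data_rng = random.Random(42)
rows = [[i % 2, data_rng.random()] for i in range(40)]
labels = [r[0] for r in rows]


def toy_model(row):
    """Stand-in for any fitted classifier: predicts from feature 0 alone."""
    return row[0]


importances = permutation_importance(toy_model, rows, labels)
```

Because the procedure only calls the model as a black box, the same function applies to any learning algorithm, which is the property the abstract relies on when it mentions algorithms that cannot directly report the contribution of individual factors.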

Brain Informatics 10(1):27. Published 2023-10-10. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10564685/pdf/
Citations: 0
Deep learning-based algorithm for postoperative glioblastoma MRI segmentation: a promising new tool for tumor burden assessment.
Q1 Computer Science | Pub Date: 2023-10-06 | DOI: 10.1186/s40708-023-00207-6
Andrea Bianconi, Luca Francesco Rossi, Marta Bonada, Pietro Zeppa, Elsa Nico, Raffaele De Marco, Paola Lacroce, Fabio Cofano, Francesco Bruno, Giovanni Morana, Antonio Melcarne, Roberta Ruda, Luca Mainardi, Pietro Fiaschi, Diego Garbossa, Lia Morra

Objective: Clinical and surgical decisions for glioblastoma patients depend on a tumor imaging-based evaluation. Artificial Intelligence (AI) can be applied to magnetic resonance imaging (MRI) assessment to support clinical practice, surgery planning and prognostic predictions. In a real-world context, the current obstacles for AI are low-quality imaging and postoperative reliability. The aim of this study is to train an automatic algorithm for glioblastoma segmentation on a clinical MRI dataset and to obtain reliable results both pre- and post-operatively.

Methods: The dataset used for this study comprises 237 (71 preoperative and 166 postoperative) MRIs from 71 patients affected by a histologically confirmed Grade IV Glioma. The implemented U-Net architecture was trained by transfer learning to perform the segmentation task on postoperative MRIs. The training was carried out first on the BraTS2021 dataset for preoperative segmentation. Performance is evaluated using DICE score (DS) and Hausdorff 95% (H95).

Results: In the preoperative scenario, overall DS is 91.09 (± 0.60) and H95 is 8.35 (± 1.12), considering tumor core, enhancing tumor and whole tumor (ET and edema). In the postoperative context, overall DS is 72.31 (± 2.88) and H95 is 23.43 (± 7.24), considering resection cavity (RC), gross tumor volume (GTV) and whole tumor (WT). Remarkably, the RC segmentation obtained a mean DS of 63.52 (± 8.90) in postoperative MRIs.

Conclusions: The performance achieved by the algorithm is consistent with previous literature for both preoperative and postoperative glioblastoma MRI evaluation. Through the proposed algorithm, it is possible to reduce the impact of low-quality images and missing sequences.
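
For readers unfamiliar with the DICE score used in this abstract, the following sketch shows the standard definition, Dice = 2|P ∩ T| / (|P| + |T|), on two binary masks. It is not the study's evaluation code. The function name `dice_score`, the flattened 0/1 list representation, the convention for empty masks and the toy masks are all assumptions made for illustration.

```python
def dice_score(pred, truth):
    """Dice overlap of two binary masks given as flat lists of 0/1 values."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p and t for p, t in zip(pred, truth))
    size_sum = sum(pred) + sum(truth)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if size_sum == 0 else 2.0 * intersection / size_sum


# Toy flattened masks (made-up values, not study data).
predicted = [0, 1, 1, 1, 0, 0, 1, 0]
reference = [0, 1, 1, 0, 0, 1, 1, 0]
dice = dice_score(predicted, reference)
```

The function returns a fraction between 0 and 1. The abstract appears to report DS on a 0 to 100 scale, so the returned value would be multiplied by 100 for comparison with figures such as 91.09.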

Brain Informatics 10(1):26. Published 2023-10-06. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10558414/pdf/
引用次数: 0
Common spatial pattern for classification of loving kindness meditation EEG for single and multiple sessions.
Q1 Computer Science Pub Date : 2023-09-09 DOI: 10.1186/s40708-023-00204-9
Nalinda D Liyanagedera, Ali Abdul Hussain, Amardeep Singh, Sunil Lal, Heather Kempton, Hans W Guesgen

While very few studies have classified loving kindness meditation (LKM) and non-meditation electroencephalography (EEG) data for a single session, no such studies exist for multiple-session EEG data. Thus, this study aims to classify existing raw EEG meditation data from single and multiple sessions to draw meaningful inferences that will be highly beneficial when developing algorithms to support meditation practices. In this analysis, data were collected on Pre-Resting (before meditation), Post-Resting (after meditation), LKM-Self and LKM-Others for 32 participants, allowing us to conduct six pairwise comparisons among the four mind tasks. Common Spatial Patterns (CSP) is a feature extraction method widely used in motor imagery brain-computer interfaces (BCI), but not in meditation EEG data. Therefore, using CSP to extract features from meditation EEG data and classify meditation/non-meditation instances, particularly for multiple sessions, will create a new path in future meditation EEG research. Classification was done using Linear Discriminant Analysis (LDA), with both meditation techniques (LKM-Self and LKM-Others) compared against Pre-Resting and Post-Resting instances. The results show that for a single session of 32 participants, around 99.5% accuracy was obtained for classifying meditation/Pre-Resting instances. For the 15 participants with five sessions of EEG data, around 83.6% accuracy was obtained for classifying meditation/Pre-Resting instances. The results demonstrate the ability to classify meditation/Pre-Resting data. Most importantly, this classification is possible for multiple-session data as well. In addition, when comparing the classification accuracies of the six mind task pairs, the pairs formed among LKM-Self, LKM-Others and Post-Resting produced relatively lower accuracies than the pairs classifying Pre-Resting against the other three. This indicates that Pre-Resting has distinguishing features that make it easier to classify, suggesting that it differs from the other three mind tasks.
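The "six pairwise comparisons" follow from choosing 2 of the 4 mind tasks. A minimal sketch, using only the task names given in the abstract (the variable names are illustrative), enumerates them:

```python
from itertools import combinations

# The four mind tasks named in the abstract
tasks = ["Pre-Resting", "Post-Resting", "LKM-Self", "LKM-Others"]

# Every unordered pair of tasks defines one binary classification problem
pairs = list(combinations(tasks, 2))

for first, second in pairs:
    print(f"{first} vs {second}")

print(len(pairs))  # C(4, 2) = 6
```

Three of these pairs involve Pre-Resting, which are the ones the abstract reports as easier to classify.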

Transformers for autonomous recognition of psychiatric dysfunction via raw and imbalanced EEG signals.
Q1 Computer Science Pub Date : 2023-09-09 DOI: 10.1186/s40708-023-00201-y
Neha Gour, Taimur Hassan, Muhammad Owais, Iyyakutti Iyappan Ganapathi, Pritee Khanna, Mohamed L Seghier, Naoufel Werghi

Early identification of mental disorders based on subjective interviews is extremely challenging in the clinical setting. There is growing interest in developing automated screening tools for potential mental health problems based on biological markers. Here, we demonstrate the feasibility of AI-powered diagnosis of different mental disorders using EEG data. Specifically, this work aims to accurately classify different mental disorders in the following ecological context: (1) using raw EEG data, (2) collected during rest, (3) during both eyes-open and eyes-closed conditions, (4) at a short 2-min duration, (5) on participants with different psychiatric conditions, (6) with some overlapping symptoms, and (7) with strongly imbalanced classes. To tackle this challenge, we designed and optimized a transformer-based architecture, where class imbalance is addressed through focal loss and class weight balancing. Using the recently released TDBRAIN dataset (n = 1274 participants), our method classifies each participant as either neurotypical or suffering from major depressive disorder (MDD), attention deficit hyperactivity disorder (ADHD), subjective memory complaints (SMC), or obsessive-compulsive disorder (OCD). We evaluate the performance of the proposed architecture at both the window level and the patient level. The classification of the 2-min raw EEG data into five classes achieved a window-level accuracy of 63.2% and 65.8% for eyes-open and eyes-closed conditions, respectively. When the classification is limited to three main classes (MDD, ADHD, SMC), window-level accuracy improved to 75.1% and 69.9% for eyes-open and eyes-closed conditions, respectively. Our work paves the way for developing novel AI-based methods for accurately diagnosing mental disorders using raw resting-state EEG data.
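The abstract names focal loss and class weight balancing but does not give the exact formulation, so the following is only a sketch under stated assumptions: inverse-frequency ("balanced") class weights and the standard multi-class focal loss, FL = -w_c (1 - p_c)^gamma log(p_c). The function names, the gamma value and the toy label counts are all hypothetical.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: this is one common balancing scheme, not necessarily the paper's.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def focal_loss(probs, target, weights, gamma=2.0):
    """Weighted focal loss for one sample.

    probs   -- dict mapping class name to predicted probability
    target  -- true class name
    weights -- dict of per-class weights
    gamma   -- focusing parameter; gamma = 0 gives weighted cross-entropy
    """
    p_t = probs[target]
    if not 0.0 < p_t <= 1.0:
        raise ValueError("target probability must be in (0, 1]")
    # (1 - p_t)^gamma shrinks the loss of well-classified samples
    return -weights[target] * (1.0 - p_t) ** gamma * math.log(p_t)


# Toy, deliberately imbalanced label set (hypothetical counts)
labels = ["MDD"] * 6 + ["ADHD"] * 3 + ["OCD"] * 1
weights = balanced_class_weights(labels)

easy = focal_loss({"MDD": 0.9}, "MDD", weights)  # confident, correct prediction
hard = focal_loss({"OCD": 0.3}, "OCD", weights)  # uncertain prediction, rare class
print(weights)
print(easy < hard)
```

The class weight raises the cost of errors on rare classes, while the focusing term reduces the influence of windows the model already classifies confidently.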

Automatic identification of scientific publications describing digital reconstructions of neural morphology.
Q1 Computer Science Pub Date : 2023-09-08 DOI: 10.1186/s40708-023-00202-x
Patricia Maraver, Carolina Tecuatl, Giorgio A Ascoli

The increasing number of peer-reviewed publications constitutes a challenge for biocuration. For example, NeuroMorpho.Org, a sharing platform for digital reconstructions of neural morphology, must evaluate more than 6000 potentially relevant articles per year to identify data of interest. Here, we describe a tool that uses natural language processing and deep learning to assess the likelihood that a publication is relevant for the project. The tool automatically identifies articles describing digitally reconstructed neural morphologies with high accuracy. Its processing rate of 900 publications per hour is not only amply sufficient to autonomously track new research, but has also allowed the successful evaluation of older publications backlogged due to limited human resources. The number of bio-entities found since launching the tool has almost doubled, while manual labor has been greatly reduced. The classification tool is open source, configurable, and simple to use, making it extensible to other biocuration projects.

Cerebrovascular disease case identification in inpatient electronic medical record data using natural language processing.
Q1 Computer Science Pub Date : 2023-09-02 DOI: 10.1186/s40708-023-00203-w
Jie Pan, Zilong Zhang, Steven Ray Peters, Shabnam Vatanpour, Robin L Walker, Seungwon Lee, Elliot A Martin, Hude Quan

Background: Abstracting cerebrovascular disease (CeVD) from inpatient electronic medical records (EMRs) through natural language processing (NLP) is pivotal for automated disease surveillance and improving patient outcomes. Existing methods rely on coders' abstraction, which has time delays and under-coding issues. This study sought to develop an NLP-based method to detect CeVD using EMR clinical notes.

Methods: CeVD status was confirmed through a chart review of randomly selected hospitalized patients who were 18 years or older and discharged from 3 hospitals in Calgary, Alberta, Canada, between January 1 and June 30, 2015. These patients' chart data were linked to the administrative discharge abstract database (DAD) and Sunrise Clinical Manager (SCM) EMR database records by Personal Health Number (a unique lifetime identifier) and admission date. We trained multiple NLP predictive models by combining two clinical concept extraction methods and two supervised machine learning (ML) methods: random forest and XGBoost. Using chart review as the reference standard, we compared the model performance with that of the commonly applied International Classification of Diseases (ICD-10-CA) codes, on the metrics of sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).

Results: Of the study sample (n = 3036), the prevalence of CeVD was 11.8% (n = 360); the median patient age was 63; and females accounted for 50.3% (n = 1528) based on chart data. Among 49 extracted clinical documents from the EMR, four document types were identified as the most influential text sources for identifying CeVD ("nursing transfer report," "discharge summary," "nursing notes," and "inpatient consultation"). The best performing NLP model was XGBoost, combining the Unified Medical Language System concepts extracted by cTAKES (e.g., the top-ranked concepts "Cerebrovascular accident" and "Transient ischemic attack") and the term frequency-inverse document frequency vectorizer. Compared with ICD codes, the model achieved higher validity overall (values given as ICD codes vs model): sensitivity (25.0% vs 70.0%), specificity (99.3% vs 99.1%), PPV (82.6% vs 87.8%), and NPV (90.8% vs 97.1%).

Conclusion: The NLP algorithm developed in this study performed better than the ICD code algorithm in detecting CeVD. The NLP models could yield an automated EMR tool for identifying CeVD cases and could be applied in future work such as surveillance and longitudinal studies.
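The four validity metrics named in the Methods all derive from a 2x2 confusion matrix against the chart-review reference standard. The sketch below shows the arithmetic; the function name `diagnostic_metrics` and the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts.

    tp/fn -- reference-positive cases flagged / missed by the algorithm
    fp/tn -- reference-negative cases flagged / correctly not flagged
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly excluded
        "ppv": tp / (tp + fp),          # share of flagged cases that are true cases
        "npv": tn / (tn + fn),          # share of unflagged cases that are true non-cases
    }


# Hypothetical counts for illustration only (not taken from the paper)
metrics = diagnostic_metrics(tp=70, fp=10, fn=30, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Because CeVD prevalence is low, specificity and NPV can stay high even when many cases are missed, which is why sensitivity separates the two approaches most clearly.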
