
Annals of Data Science: Latest Articles

Drinkers Voice Recognition Intelligent System: An Ensemble Stacking Machine Learning Approach
Q1 Decision Sciences Pub Date : 2024-07-07 DOI: 10.1007/s40745-024-00559-8
P. Terlapu
Citations: 0
A New Kernel Density Estimation-Based Entropic Isometric Feature Mapping for Unsupervised Metric Learning
Q1 Decision Sciences Pub Date : 2024-07-06 DOI: 10.1007/s40745-024-00548-x
Alaor Cervati Neto, A. Levada, Michel Ferreira Cardia Haddad
Citations: 0
Power Evaluation of Some Tests for Inverse Rayleigh Distribution
Q1 Decision Sciences Pub Date : 2024-07-05 DOI: 10.1007/s40745-024-00536-1
Vahideh Ahrari, P. Hasanalipour
Citations: 0
Research on Pricing of Data Based on Bi-level Programming Model
Q1 Decision Sciences Pub Date : 2024-06-16 DOI: 10.1007/s40745-024-00549-w
Yurong Ding, Yingjie Tian

Effective value measurement and pricing methods can greatly promote the healthy development of data sharing, exchange and reuse. However, the uncertainty of data value and the neglect of interactivity lead to information asymmetry in the transaction process. A sound pricing system and a well-designed data trading market (hereafter, data market) can broadly promote data transactions. We take a three-agent data market as an example to construct a sound data trading process. The three agents are the data owner, who provides data records; the model buyer, who is interested in buying machine learning (ML) model instances; and the data broker, who mediates between the data owner and the model buyer. Based on the characteristics of a data market, such as truthfulness, revenue maximization, version control, fairness and non-arbitrage, we propose a data pricing method based on different model versions. First, we use market research to construct a revenue maximization (RM) problem that prices the different versions of ML models, and we solve it with the RM-ILP process. However, the RM model based on market research has two major problems: first, the model buyer has no incentive to tell the truth, that is, the model buyer will lie in the market research to obtain a lower model price; second, it requires the data broker to release the version menu in advance, resulting in inefficient operation of the data market. In view of these defects of the RM transaction model, we analyze model buyers' behavior and formulate the revenue maximization function over different data versions as a bi-level linear programming model. We further add an incentive compatibility constraint and an individual rationality constraint, taking both the utility of the model buyer and the revenue of the data broker into account. This reflects a consumer-driven model of data transactions. Finally, the RM-BLP process is proposed to transform the RM problem into an equivalent single-level integer programming problem, which we solve with the Gurobi solver. The validity of the model is verified by experiments.
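The two-level structure described in this abstract can be illustrated with a toy sketch. It is not the authors' RM-BLP formulation: the buyer valuations, the price grid, and the function names `buyer_choice`, `broker_revenue` and `best_menu` are all hypothetical, and a brute-force search over a small price grid stands in for the integer programming solver. The lower level models a buyer choosing the version with the highest non-negative utility (individual rationality); the upper level models the broker choosing the price menu that maximizes revenue given those choices.

```python
from itertools import product

def buyer_choice(valuations, prices):
    """Lower level: pick the version with the highest utility (valuation - price).

    Returns the version index, or None if every version has negative utility
    (individual rationality). Ties go to the lowest index, and a buyer with
    zero utility is assumed to buy.
    """
    utilities = [v - p for v, p in zip(valuations, prices)]
    best = max(range(len(utilities)), key=lambda k: utilities[k])
    return best if utilities[best] >= 0 else None

def broker_revenue(buyers, prices):
    """Revenue the broker collects when every buyer responds optimally."""
    revenue = 0.0
    for valuations in buyers:
        choice = buyer_choice(valuations, prices)
        if choice is not None:
            revenue += prices[choice]
    return revenue

def best_menu(buyers, price_grid, n_versions):
    """Upper level: enumerate price menus and keep the most profitable one."""
    best_prices, best_rev = None, -1.0
    for prices in product(price_grid, repeat=n_versions):
        rev = broker_revenue(buyers, prices)
        if rev > best_rev:
            best_prices, best_rev = prices, rev
    return best_prices, best_rev

# Hypothetical data: two buyers, two model versions (basic, premium).
buyers = [(4, 6), (2, 9)]
menu, revenue = best_menu(buyers, price_grid=[2, 4, 6, 9], n_versions=2)
print(menu, revenue)  # (4, 9) 13.0
```

Enumeration is only workable for tiny instances; with more versions or a finer price grid, a single-level reformulation handed to a solver, as the abstract describes, is the practical route.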

Annals of Data Science, vol. 11, no. 4, pp. 1391-1419
Citations: 0
UAV-YOLOv5: A Swin-Transformer-Enabled Small Object Detection Model for Long-Range UAV Images
Q1 Decision Sciences Pub Date : 2024-05-25 DOI: 10.1007/s40745-024-00546-z
Jun Li, Chong Xie, Sizheng Wu, Yawei Ren

This paper tackles the challenges of low recognition accuracy and occlusion when detecting long-range, small targets (such as UAVs). We introduce a detection framework named UAV-YOLOv5, which combines the strengths of Swin Transformer V2 and YOLOv5. First, we introduce Focal-EIOU, a refinement of the K-means algorithm tailored to generate anchor boxes better suited to the current dataset, thereby improving detection performance. Second, the convolutional and pooling layers in the network with step size greater than 1 are replaced to prevent information loss during feature extraction. Third, the Swin Transformer V2 module is introduced in the Neck to improve the accuracy of the model, and the BiFormer module is introduced to improve the model's ability to acquire global and local feature information at the same time. In addition, BiFPN replaces the original FPN structure so that the network can acquire richer semantic information and fuse features across scales more effectively. Lastly, a small-target detection head is appended to the existing architecture, improving the model's precision on smaller targets. Experiments on the comprehensive dataset verify the effectiveness of UAV-YOLOv5, which achieves an average accuracy of 87%. Compared with YOLOv5, the mAP of UAV-YOLOv5 is improved by 8.5%, which verifies that it has high-precision long-range small-target UAV optoelectronic detection capability.
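For readers unfamiliar with anchor-box generation, the sketch below shows the IoU-based K-means clustering commonly used in the YOLO family to fit anchors to a dataset. It is an illustration only and does not reproduce the paper's refinement; the function names `wh_iou` and `kmeans_anchors`, the box sizes, and the initial anchors are assumptions made for the example.

```python
def wh_iou(box, anchor):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, anchors, iterations=10):
    """Cluster (w, h) boxes using 1 - IoU as the distance to each anchor."""
    anchors = [tuple(a) for a in anchors]
    for _ in range(iterations):
        clusters = [[] for _ in anchors]
        for box in boxes:
            # Highest IoU is the same as the smallest 1 - IoU distance.
            nearest = max(range(len(anchors)), key=lambda k: wh_iou(box, anchors[k]))
            clusters[nearest].append(box)
        new_anchors = []
        for anchor, members in zip(anchors, clusters):
            if members:
                w = sum(m[0] for m in members) / len(members)
                h = sum(m[1] for m in members) / len(members)
                new_anchors.append((w, h))
            else:
                new_anchors.append(anchor)  # keep an anchor that attracted no boxes
        if new_anchors == anchors:
            break  # assignments are stable
        anchors = new_anchors
    return anchors

# Hypothetical ground-truth box sizes (pixels): two small and two large targets.
boxes = [(10, 12), (12, 10), (50, 60), (60, 50)]
print(kmeans_anchors(boxes, anchors=[(8, 8), (64, 64)]))
# [(11.0, 11.0), (55.0, 55.0)]
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which matters when most targets are small.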

Annals of Data Science, vol. 11, no. 4, pp. 1109-1138
Citations: 0
Spatial Data Analysis for Robust Classification of Network Topology Through Synthetic Combinatorics
Q1 Decision Sciences Pub Date : 2024-05-20 DOI: 10.1007/s40745-024-00523-6
Samrat Hore, Stabak Roy, Malabika Boruah, Saptarshi Mitra

The measurement of network topology through spatial topological indices such as Alpha, Beta and Gamma is widely used in spatial data analysis. However, classifying the network topology of a city based on the Alpha, Beta and Gamma indices is not conclusive, because the individual indices give different results. To classify network topology efficiently, a Modified Synthetic Indicator (MSI) has been proposed and compared critically with existing synthetic indicators, based on the Composite Weighted Connectivity Index (CWCI), a linear combination of the Alpha, Beta and Gamma indices. The application of the proposed MSI to micro-level (ward-level) classification of network topology, i.e., road network connectivity, has been verified in Agartala City, and the efficiency of CWCI over the Alpha, Beta and Gamma indices is calibrated. The study reveals that the proposed CWCI is more robust than any individual graph-theoretic measure.
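The three indices named in this abstract have standard textbook definitions for planar networks, sketched below for a network with a given number of nodes and edges. The `composite_index` function and its equal weights are purely hypothetical placeholders for a linear combination; the abstract does not state the weights used in CWCI.

```python
def beta_index(edges, nodes):
    """Beta: average number of edges per node."""
    return edges / nodes

def gamma_index(edges, nodes):
    """Gamma: observed edges over the maximum possible in a planar graph."""
    return edges / (3 * (nodes - 2))

def alpha_index(edges, nodes, components=1):
    """Alpha: observed circuits over the maximum possible in a planar graph."""
    return (edges - nodes + components) / (2 * nodes - 5)

def composite_index(edges, nodes, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Hypothetical linear combination of Alpha, Beta and Gamma.

    The equal weights are an assumption for illustration only.
    """
    w_alpha, w_beta, w_gamma = weights
    return (w_alpha * alpha_index(edges, nodes)
            + w_beta * beta_index(edges, nodes)
            + w_gamma * gamma_index(edges, nodes))

# Hypothetical ward: 10 junctions connected by 15 road segments.
print(alpha_index(15, 10), beta_index(15, 10), gamma_index(15, 10))
print(composite_index(15, 10))
```

Alpha and Gamma lie between 0 and 1 for planar networks, while Beta is unbounded above, so a practical composite would probably rescale the indices before weighting them.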

Annals of Data Science, vol. 11, no. 4, pp. 1341-1359
Citations: 0
Unified Image Harmonization with Region Augmented Attention Normalization
Q1 Decision Sciences Pub Date : 2024-05-11 DOI: 10.1007/s40745-024-00531-6
Junjie Hou, Yuqi Zhang, Duo Su

The image harmonization task adjusts foreground information in a synthesized image to achieve visual consistency by leveraging background information. In academic research, this task conventionally takes simple synthesized images and matching masks as inputs. However, obtaining precise masks for image harmonization in practical applications is a significant challenge, which creates a notable gap between research findings and real-world applicability. To narrow this gap, we propose redefining the image harmonization task as “Unified Image Harmonization,” where the input comprises only a single image, thereby enhancing its applicability in real-world scenarios. To address this challenge, we have developed a novel framework. Within this framework, we first employ inharmonious region localization to detect the mask, which is then used for harmonization. The pivotal aspect of the harmonization process is normalization, which is responsible for information transfer. However, current background-to-foreground information transfer and guidance mechanisms are limited by single-layer guidance, which constrains their effectiveness. To overcome this limitation, we introduce Region Augmented Attention Normalization (RA2N), which enhances the attention mechanism for foreground feature alignment and consequently improves alignment and transfer capabilities. Through qualitative and quantitative comparisons on the iHarmony4 dataset, our model exhibits exceptional performance not only in unified image harmonization but also in conventional image harmonization tasks.
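The idea of normalization as a channel for background-to-foreground information transfer can be sketched in a few lines. This is a deliberately simplified, single-channel illustration of mask-guided normalization and is not RA2N: it omits the attention component entirely, and the function name `harmonize_region` and the sample values are assumptions. Foreground values are standardized with their own statistics and then rescaled with the background mean and standard deviation.

```python
from statistics import fmean, pstdev

def harmonize_region(values, mask, eps=1e-5):
    """Re-style foreground values with background statistics.

    values: flat list of feature values for one channel.
    mask:   same length, 1 for foreground and 0 for background.
            Both regions must be non-empty.
    """
    fg = [v for v, m in zip(values, mask) if m]
    bg = [v for v, m in zip(values, mask) if not m]
    fg_mean, fg_std = fmean(fg), pstdev(fg)
    bg_mean, bg_std = fmean(bg), pstdev(bg)
    out = []
    for v, m in zip(values, mask):
        if m:
            # Standardize with foreground stats, then apply background stats.
            out.append((v - fg_mean) / (fg_std + eps) * bg_std + bg_mean)
        else:
            out.append(v)  # background is left untouched
    return out

# Hypothetical channel: a bright pasted foreground on a dark background.
values = [10, 12, 1, 3, 2, 2]
mask = [1, 1, 0, 0, 0, 0]
print(harmonize_region(values, mask))
```

After the transfer, the foreground mean matches the background mean, which is the basic effect the abstract attributes to the normalization step.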

Annals of Data Science, vol. 11, no. 5, pp. 1865-1886
Citations: 0
Predicting the Functional Changes in Protein Mutations Through the Application of BiLSTM and the Self-Attention Mechanism
Q1 Decision Sciences Pub Date : 2024-04-25 DOI: 10.1007/s40745-024-00530-7
Zixuan Fan, Yan Xu

In the field of bioinformatics, changes in protein functionality are mainly influenced by protein mutations. Accurately predicting these functional changes can enhance our understanding of evolutionary mechanisms, promote developments in protein engineering-related fields, and accelerate progress in medical research. In this study, we introduce two different models: one based on bidirectional long short-term memory (BiLSTM), and the other based on self-attention. These models are integrated using a weighted fusion method to predict protein functional changes associated with mutation sites. The findings indicate that the model's predictive precision and capacity for generalization match those of the current model. Furthermore, the ensemble model surpasses the performance of the single models, highlighting the value of combining them. This finding may improve the accuracy of predicting protein functional changes associated with mutations and has potential applications in protein engineering and drug research. We evaluated the efficacy of our models under different scenarios by comparing the predicted protein functional changes across various numbers of mutation sites. As the number of mutation sites increases, the prediction accuracy decreases significantly, highlighting the inherent limitations of these models in handling cases with more mutation sites.
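Weighted fusion of two models can be as simple as a convex combination of their outputs. The sketch below assumes each model emits one probability per sample; the function name `weighted_fusion`, the weight value and the sample probabilities are hypothetical, and the abstract does not say how the fusion weights were chosen.

```python
def weighted_fusion(probs_a, probs_b, weight_a=0.5):
    """Blend two models' predicted probabilities sample by sample.

    weight_a is the share given to model A; model B gets 1 - weight_a.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same samples")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(probs_a, probs_b)]

# Hypothetical outputs of a BiLSTM-style model and a self-attention-style model.
bilstm_probs = [0.9, 0.2]
attention_probs = [0.5, 0.6]
print(weighted_fusion(bilstm_probs, attention_probs, weight_a=0.75))
```

In practice the weight would typically be tuned on held-out validation data rather than fixed in advance.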

Annals of Data Science, vol. 11, no. 3, pp. 1077-1094
Citations: 0
Research on Intelligent Courses in English Education based on Neural Networks
Q1 Decision Sciences Pub Date : 2024-04-25 DOI: 10.1007/s40745-024-00528-1
Huimin Yao, Haiyan Wang

Accurately predicting students’ performance plays a crucial role in making courses intelligent. This paper studies intelligent courses in English education based on neural networks and designs a firefly algorithm-back propagation neural network (FA-BPNN) method. The correlation between various features and final grades was calculated using the students’ online learning data. Features with higher correlation were selected as the input for the FA-BPNN algorithm to estimate the final score that students achieved in the “College English” course. The training time of the FA-BPNN algorithm was 3.42 s, and its root-mean-square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) values were 0.986, 0.622, and 0.205, respectively. These values were lower than those of the BPNN, genetic algorithm (GA)-BPNN, and particle swarm optimization (PSO)-BPNN algorithms, as well as the adaptive neuro-fuzzy inference system approach. The results indicate the efficacy of the FA for optimizing the parameters of the BPNN algorithm. The comparison between the predicted results and actual values showed that the average error of the FA-BPNN algorithm was only 0.5, the smallest among the compared methods. The experimental results demonstrate the reliability of the FA-BPNN algorithm for performance prediction and the feasibility of applying it in practice.
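The three error metrics reported in this abstract follow standard definitions, sketched below. The sample scores are invented for illustration, and MAPE is returned here as a fraction; whether the abstract's 0.205 is a fraction or a percentage is not stated.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be non-zero."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical final scores and model predictions.
actual = [80, 90, 100]
predicted = [78, 93, 100]
print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

RMSE penalizes large misses more heavily than MAE, so reporting both gives a sense of whether errors are spread evenly or concentrated in a few students.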

Annals of Data Science, vol. 11, no. 3, pp. 1095-1107
Citations: 0
Bayesian Inference for the Entropy of the Rayleigh Model Based on Ordered Ranked Set Sampling
Q1 Decision Sciences Pub Date : 2024-02-27 DOI: 10.1007/s40745-024-00514-7
Mohammed S. Kotb, Haidy A. Newer, Marwa M. Mohie El-Din

Recently, ranked set sampling schemes have become quite popular in reliability analysis and life-testing problems. Based on an ordered ranked set sample, the Bayesian estimators and credible intervals for the entropy of the Rayleigh model are studied and compared with the corresponding estimators based on simple random sampling. These Bayes estimators for entropy are developed and computed under various loss functions, such as the squared error, linear-exponential, Al-Bayyati, and general entropy loss functions. The various entropy estimates are compared on the basis of mean squared error. A real-life data set and a simulation are used to illustrate our procedures.
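As background for the quantity being estimated, the sketch below evaluates the closed-form differential entropy of a Rayleigh distribution and a simple plug-in estimate from a sample. It assumes the common parametrization with density f(x) = (x / sigma^2) exp(-x^2 / (2 sigma^2)), which may differ from the one used in the paper, and the plug-in estimator uses the maximum-likelihood scale from a simple random sample rather than the Bayesian, ranked-set estimators the abstract studies. The function names and sample values are assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def rayleigh_entropy(sigma):
    """Differential entropy (in nats) of Rayleigh(sigma): 1 + ln(sigma / sqrt(2)) + gamma / 2."""
    return 1.0 + math.log(sigma / math.sqrt(2.0)) + EULER_GAMMA / 2.0

def sigma_mle(sample):
    """Maximum-likelihood scale from a simple random sample: sqrt(sum(x^2) / (2n))."""
    return math.sqrt(sum(x * x for x in sample) / (2.0 * len(sample)))

def plug_in_entropy(sample):
    """Plug-in entropy estimate: evaluate the closed form at the MLE of sigma."""
    return rayleigh_entropy(sigma_mle(sample))

# Hypothetical lifetimes.
sample = [0.8, 1.3, 2.1, 0.5, 1.7]
print(rayleigh_entropy(1.0), plug_in_entropy(sample))
```

Because the entropy depends on the data only through the scale parameter, any estimator of sigma, Bayesian or otherwise, induces an entropy estimator through this formula.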

Annals of Data Science, vol. 11, no. 4, pp. 1435-1458
Citations: 0