Training with Input Selection and Testing (TWIST) Algorithm: A Significant Advance in Pattern Recognition Performance of Machine Learning

M. Buscema, Marco Breda, W. Lodwick
Journal of Intelligent Learning Systems and Applications, vol. 5, no. 1, pp. 29-38. Published 2013-02-22.
DOI: 10.4236/JILSA.2013.51004 (https://doi.org/10.4236/JILSA.2013.51004)
Citations: 40

Abstract

This article shows the efficacy of TWIST, a methodology for designing the training and testing subsets extracted from a given dataset associated with a problem to be solved via artificial neural networks (ANNs). The methodology is embedded in algorithms and implemented in computer software. We compare this implementation with the current standard methods of random cross-validation (10-fold CV and a random split into two subsets) and with the more advanced T&T. For each strategy, 13 learning machines, representing different families of the main algorithms, were trained and tested. All algorithms were implemented using the well-known WEKA software package. On the one hand, a falsification test with a randomly distributed dependent variable shows that T&T and TWIST behave like the other two strategies: when the dataset carries no information, all four are equivalent. On the other hand, on the real Statlog (Heart) dataset, the experiments show a strong difference in accuracy. Our results show that TWIST is superior to the current methods. Pairs of subsets with similar probability density functions are generated, without coding noise, according to an optimal strategy that extracts the most useful information for pattern classification.
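As a rough illustration of the two baseline strategies named in the abstract (a random split into two subsets and 10-fold CV), and of one possible way to set up a falsification test, here is a minimal Python sketch. It is not the TWIST or T&T algorithm and does not use WEKA. The function names, the 50/50 split fraction, and the use of label permutation to obtain a "randomly distributed dependent variable" are all assumptions made for illustration.

```python
import random


def random_split(n, train_fraction=0.5, seed=0):
    """Shuffle the indices 0..n-1 and cut them into (train, test) lists.

    The 50/50 default is an assumption, not a value taken from the paper.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(n * train_fraction)
    return idx[:cut], idx[cut:]


def k_fold(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    # Deal the shuffled indices into k folds, round-robin.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def permute_labels(labels, seed=0):
    """Return a shuffled copy of the labels.

    Shuffling breaks any link between inputs and the dependent variable,
    which is one assumed way to build a falsification dataset.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    n = 100  # illustrative dataset size only
    train, test = random_split(n)
    print(len(train), len(test))
    for fold_train, fold_test in k_fold(n, k=10):
        print(len(fold_train), len(fold_test))
```

On a falsification dataset built this way, any splitting strategy should yield roughly chance-level accuracy, which is the equivalence the abstract describes for the case where the data carry no information.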