Application of support vector machine system introducing multiple submodels in data mining

Weinan Tang
Systems and Soft Computing, vol. 6, Article 200096. Published 2024-04-02.
DOI: 10.1016/j.sasc.2024.200096
URL: https://www.sciencedirect.com/science/article/pii/S2772941924000255
Citations: 0

Abstract

As the information age develops, the scale and form of data become increasingly diverse, so effective means of processing information are needed. For large-scale data mining problems, a clustering-based kernel matrix inner product filtering method is introduced to decompose the original quadratic programming problem into multiple sub-problems that can be trained in parallel, and a Spark-based parallel support vector machine with multiple submodels is proposed. With open-source tools such as OpenCV, image features can be extracted from large-scale video data. Finally, the designed parallel support vector machine algorithm is applied to face and facial expression recognition in video. The experiments confirmed that the method achieved a maximum acceleration ratio of 2090 times when processing the Covtype dataset, and that the model could reach an accuracy of over 99%. In the experiment at the largest data scale, the model improved prediction accuracy by 21 percentage points at an acceptable additional time cost of only about 4 min. Task-parallel processing made fuller use of cluster performance, increasing by approximately 3.5 m/s² from 30 to 150 cores. The model also had the highest recognition accuracy for facial expressions, further demonstrating the effectiveness and superiority of the method. The method improves the efficiency of big data analysis and mining and is of great significance for the parallel analysis of video data.
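The idea of splitting one large SVM training problem into independently trainable sub-problems can be illustrated with a small sketch. This is not the paper's algorithm: plain k-means with nearest-centroid routing stands in for the clustering-based kernel matrix inner product filtering, a thread pool stands in for Spark, and each submodel is a linear SVM fitted by stochastic gradient descent on the hinge loss rather than a kernel quadratic program. All function names and the synthetic XOR-style data are hypothetical, chosen only because no single linear model can separate that data while each partition is linearly separable.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda k: dist2(p, centroids[k]))


def kmeans(points, k=2, iters=10):
    """Plain k-means with farthest-point initialisation (assumed stand-in)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def train_submodel(task, lr=0.05, lam=0.01, epochs=50):
    """Fit one linear SVM on one partition by SGD on the L2-regularised hinge loss."""
    center, xs, ys = task
    xs = [[xi - ci for xi, ci in zip(x, center)] for x in xs]  # centre on the cluster
    w, b = [0.0] * len(center), 0.0
    rng = random.Random(0)
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = xs[i], ys[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            w = [wj * (1 - lr * lam) for wj in w]  # L2 shrinkage
            if margin < 1:  # hinge subgradient step
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def fit(points, labels, k=2, workers=2):
    """Partition the data, then train one submodel per partition concurrently."""
    centroids = kmeans(points, k)
    tasks = []
    for j, c in enumerate(centroids):
        idx = [i for i, p in enumerate(points) if nearest(p, centroids) == j]
        tasks.append((c, [points[i] for i in idx], [labels[i] for i in idx]))
    # Threads only illustrate the task structure; they do not speed up pure Python.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        models = list(pool.map(train_submodel, tasks))
    return centroids, models


def predict(x, centroids, models):
    """Route x to its nearest partition and apply that partition's submodel."""
    j = nearest(x, centroids)
    w, b = models[j]
    score = sum(wj * (xj - cj) for wj, xj, cj in zip(w, x, centroids[j])) + b
    return 1 if score >= 0 else -1


def make_xor_blobs(n_per_blob=25, sigma=0.2, seed=0):
    """Four Gaussian blobs labelled in an XOR pattern (synthetic toy data)."""
    rng = random.Random(seed)
    points, labels = [], []
    for cx, cy, label in [(-5, 2, 1), (-5, -2, -1), (5, 2, -1), (5, -2, 1)]:
        for _ in range(n_per_blob):
            points.append([rng.gauss(cx, sigma), rng.gauss(cy, sigma)])
            labels.append(label)
    return points, labels


points, labels = make_xor_blobs()
centroids, models = fit(points, labels)
accuracy = sum(predict(p, centroids, models) == y
               for p, y in zip(points, labels)) / len(points)
print(f"training accuracy: {accuracy:.2f}")
```

Because each submodel only sees its own partition, the training tasks share no state and could be handed to separate workers; on a real cluster the thread pool would be replaced by whatever scheduler is in use, which is where any actual speedup would come from.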
