Logistic regression within DBMS

J. Isaac, Sandhya Harikumar
2016 2nd International Conference on Contemporary Computing and Informatics (IC3I), published 2016-12-01
DOI: 10.1109/IC3I.2016.7918045 (https://doi.org/10.1109/IC3I.2016.7918045)
Citations: 13

Abstract

This paper develops an analytical query model for data categorization within a DBMS. Because the DBMS is a core asset for most organizations, classifying the data it holds can give better insight into, and control over, that data. Conventionally, classification algorithms such as logistic regression and KNN are applied after exporting the data from the DBMS to non-DBMS tools such as R, matrix packages, generic data mining programs, or large-scale systems such as Hadoop and Spark. However, exporting incurs I/O overhead, since the data within the DBMS is updated frequently and usually does not fit in main memory. This paper proposes an alternative strategy, based on SQL and user-defined functions (UDFs), that integrates logistic regression into the DBMS for both data categorization and prediction query processing. Experiments on real datasets compare SQL with UDFs as well as with statistical packages such as R. The empirical results show the viability and validity of this approach for predicting the class of a given query.
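
The abstract does not give the paper's SQL or UDF definitions, so the following is only a minimal sketch of the general idea, assuming SQLite (through Python's standard sqlite3 module) as a stand-in DBMS. The table name `points`, its columns, the toy rows, the learning rate, and the epoch count are all hypothetical. The sigmoid is registered as a scalar UDF, each gradient is computed by a single SQL aggregate query, and prediction is itself a query, so the rows never leave the database.

```python
import math
import sqlite3


def sigmoid(z):
    """Numerically stable logistic function; registered below as a SQL UDF."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def build_demo_db():
    """Create an in-memory database with a small, hypothetical labeled table."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("sigmoid", 1, sigmoid)  # expose the UDF to SQL
    conn.execute(
        "CREATE TABLE points (id INTEGER PRIMARY KEY, x1 REAL, x2 REAL, y INTEGER)"
    )
    rows = [
        (0.0, 0.0, 0), (0.5, 1.0, 0), (1.0, 0.5, 0), (1.0, 1.0, 0),
        (2.0, 2.5, 1), (2.5, 2.0, 1), (3.0, 3.0, 1), (3.5, 2.5, 1),
    ]
    conn.executemany("INSERT INTO points (x1, x2, y) VALUES (?, ?, ?)", rows)
    return conn


def train(conn, epochs=5000, lr=0.3):
    """Batch gradient descent; every gradient is one SQL aggregate over the table."""
    w0 = w1 = w2 = 0.0
    gradient_sql = """
        SELECT AVG(err), AVG(err * x1), AVG(err * x2)
        FROM (SELECT x1, x2, sigmoid(? + ? * x1 + ? * x2) - y AS err
              FROM points)
    """
    for _ in range(epochs):
        g0, g1, g2 = conn.execute(gradient_sql, (w0, w1, w2)).fetchone()
        w0 -= lr * g0
        w1 -= lr * g1
        w2 -= lr * g2
    return w0, w1, w2


def predict_all(conn, weights):
    """Prediction as a query: returns (id, true label, predicted label) rows."""
    predict_sql = """
        SELECT id, y, sigmoid(? + ? * x1 + ? * x2) >= 0.5
        FROM points ORDER BY id
    """
    return conn.execute(predict_sql, weights).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    weights = train(conn)
    for row_id, label, predicted in predict_all(conn, weights):
        print(row_id, label, predicted)
```

The sketch keeps only the three model weights in application memory; the per-row work (scoring, error, and gradient accumulation) is pushed into the query engine, which is the property the abstract argues for. A production DBMS would more likely use its native UDF mechanism rather than a Python callback.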