
Proceedings of the 2020 the 4th International Conference on Compute and Data Analysis: Latest Articles

A Genetic-Based Incremental Local Outlier Factor Algorithm for Efficient Data Stream Processing
Omar Alghushairy, Raed Alsini, Xiaogang Ma, T. Soule
Interest in outlier detection methods is growing because detecting outliers is important in many applications, such as detecting fraudulent credit card transactions, network intrusion detection, and data analysis in various domains. In the big data era, data streams are an important type of big data. As the need to analyze high-velocity data streams grows, older outlier detection methods become difficult to apply efficiently. Local Outlier Factor (LOF) is a well-known outlier detection algorithm. A major challenge with LOF is that it requires the entire dataset and the distance values to be stored in memory. Another issue is that LOF must be recalculated from scratch whenever the dataset changes. This paper proposes a novel local outlier detection algorithm for data streams, called Genetic-based Incremental Local Outlier Factor (GILOF). The algorithm works without any prior knowledge of the data distribution and executes in limited memory. Experiments with various real-world datasets show that GILOF outperforms other state-of-the-art LOF algorithms in execution time and accuracy.
DOI: 10.1145/3388142.3388160 (https://doi.org/10.1145/3388142.3388160). Published 2020-03-09.
Citations: 7
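To make the memory cost mentioned in the abstract concrete, here is a minimal sketch of the classic batch LOF computation in Python. It is not GILOF and not the authors' code. The function name `lof_scores` is hypothetical, ties in the k-nearest-neighbor set are ignored, and the points are assumed to be distinct so that no density is infinite.

```python
import math

def lof_scores(points, k):
    """Batch LOF for a small list of points (tuples); returns one score per point."""
    n = len(points)
    # Full pairwise distance matrix: this is the O(n^2) memory cost of batch LOF.
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors = []  # indices of the k nearest neighbors of each point
    k_dist = []     # distance from each point to its k-th nearest neighbor
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbors' densities to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]
```

Scores near 1 indicate a point about as dense as its neighbors, and scores well above 1 indicate a likely outlier. Because every score depends on the stored distances, adding one point means rebuilding these structures, which is the recomputation problem the abstract describes.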
Predictive Modelling for Chronic Disease: Machine Learning Approach
Md. Rakibul Hoque, Mohammed Sajedur Rahman
Chronic diseases are responsible for half of annual mortality (51%) and almost half of the burden of all diseases (41%) in Bangladesh. Developing countries like Bangladesh could lose approximately $7.3 trillion to chronic diseases by 2025. Healthcare industries in Bangladesh now generate, collect, and store large amounts of data. With the emergence of big data analytics, the approach to determining the factors that cause specific effects on health is increasingly based on machine learning techniques. It is therefore important to conduct a predictive big data analysis using machine learning techniques to understand the likelihood of chronic diseases, specifically diabetes, hypertension, and heart disease, that are caused by age, income, and years of disease. The aim of this research is to develop a predictive analytics tool for chronic diseases using machine learning techniques. Applying machine learning in the healthcare sector can minimize treatment costs and help in taking proactive action.
DOI: 10.1145/3388142.3388174 (https://doi.org/10.1145/3388142.3388174). Published 2020-03-09.
Citations: 2
Computing System Congestion Management Using Exponential Smoothing Forecasting
Ja Brady
An overloaded computer must finish what it starts and not start what will fail or hang. A congestion management algorithm developed by the author effectively manages traffic overload with its unique formulation of Exponential Smoothing forecasting. This set of equations resolves forecasting startup issues that have limited the model's adoption as a discrete time series predictor. These expressions also satisfy the implementation requirements of performing calculations with integer math and resetting the forecast seamlessly. A computer program written in C that exercises the methodology is downloadable from GitHub.
DOI: 10.1145/3388142.3388146 (https://doi.org/10.1145/3388142.3388146). Published 2019-08-21.
Citations: 0
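As a rough illustration of what integer-only exponential smoothing with a resettable forecast can look like, here is a minimal Python sketch. It is not the author's formulation, which the abstract does not spell out. It uses textbook simple exponential smoothing with a smoothing factor of 1 / 2**shift, seeds the forecast with the first observation as one common startup choice, and uses a hypothetical class name `IntExpSmoother` with hypothetical parameters `shift` and `scale_bits`.

```python
class IntExpSmoother:
    """Simple exponential smoothing with alpha = 1 / 2**shift, integers only."""

    def __init__(self, shift=3, scale_bits=8):
        self.shift = shift            # alpha = 1 / 2**shift (assumed parameter)
        self.scale_bits = scale_bits  # fixed-point fraction bits to limit rounding loss
        self.state = None             # scaled forecast; None means "not started"

    def reset(self):
        """Discard history; the next observation re-seeds the forecast."""
        self.state = None

    def update(self, observation):
        """Fold one integer observation into the forecast and return it."""
        scaled = observation << self.scale_bits
        if self.state is None:
            self.state = scaled  # startup: seed with the first observation
        else:
            # S += alpha * (x - S), with the multiply replaced by a right shift.
            self.state += (scaled - self.state) >> self.shift
        return self.forecast()

    def forecast(self):
        """Current forecast as an integer, or None before the first observation."""
        return None if self.state is None else self.state >> self.scale_bits
```

A congestion controller could compare the value returned by `forecast()` against a capacity threshold before admitting new work, although how the paper does this is not stated in the abstract.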