Entropy-extreme concept of data gaps filling in a small-sized collection

Impact factor: 5 | Zone 3 (Computer Science) | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Egyptian Informatics Journal | Pub Date: 2025-02-10 | DOI: 10.1016/j.eij.2025.100621
Viacheslav Kovtun, Krzysztof Grochla, Mohammed Al-Maitah, Saad Aldosary, Oleksii Kozachko
Egyptian Informatics Journal, vol. 29, Article 100621. Source: https://www.sciencedirect.com/science/article/pii/S1110866525000143

Abstract

The article investigates gap filling in a small data collection that summarizes periodic measurements of the input and output parameters of a target object. To fill the gaps, a concept is proposed that generates a committee of entropy-optimal trajectories by sampling the probability density functions of the parameters of a stochastic parameterized model trained on relevant data. The concept covers three cases: gaps in the output data, gaps in the input data, and gaps in both. Gaps in the output data are filled using entropy-extreme estimation of the probability density functions of the model parameters and of the measurement errors. Missing values in the input data are interpreted as the result of transforming a sequence of independent stochastic vectors fed into a model structurally identical to the one formalized for the output-data case. The concept thus combines the benefits of parametric estimation, which relies on a trained model of the target process, with those of non-parametric estimation of the undefined characteristics that distort the data. The concept was tested on filling gaps in a collection of 35 tuples, each holding measurements of three attributes. Imperfection of the measurement procedure was assumed to cause variability in the data of 15% of their absolute value. Less than 20% of the collection was used to train the corresponding entropy-extreme model. The relative error of the filled-in values was 0.21.
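The committee idea described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the entropy-extreme estimation step is replaced here by fixed uniform densities, and the toy model y = a * x * (1 + e), the function names, and all numeric values are assumptions made purely for illustration. Only the 15% variability level is taken from the abstract. Each committee member samples one parameter value, produces a trajectory at the gap positions, and the gap is filled with the committee mean.

```python
import random
import statistics


def fill_output_gaps(xs, ys, a_range, noise_frac, n_members=500, seed=0):
    """Fill None entries of ys with the mean of a committee of sampled trajectories.

    Assumed toy model (not from the article): y = a * x * (1 + e),
    with a ~ Uniform(a_range) and e ~ Uniform(-noise_frac, noise_frac).
    """
    rng = random.Random(seed)
    gaps = [i for i, y in enumerate(ys) if y is None]
    committee = []  # one trajectory (values at the gap positions) per member
    for _ in range(n_members):
        a = rng.uniform(a_range[0], a_range[1])  # sample the model parameter
        trajectory = [
            a * xs[i] * (1.0 + rng.uniform(-noise_frac, noise_frac))
            for i in gaps
        ]
        committee.append(trajectory)
    filled = list(ys)
    for k, i in enumerate(gaps):
        # The committee mean at this position becomes the filled-in value.
        filled[i] = statistics.fmean([member[k] for member in committee])
    return filled


def relative_error(truth, estimate, gap_idx):
    """Euclidean norm of the error over the gaps, divided by the norm of the truth."""
    num = sum((truth[i] - estimate[i]) ** 2 for i in gap_idx) ** 0.5
    den = sum(truth[i] ** 2 for i in gap_idx) ** 0.5
    return num / den


# Hypothetical data: the true process is y = 2x, with two output values missing.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
truth = [2.0 * x for x in xs]
ys = [truth[0], truth[1], None, truth[3], None, truth[5]]

# a_range is an assumed stand-in for a density estimated from training data.
filled = fill_output_gaps(xs, ys, a_range=(1.8, 2.2), noise_frac=0.15)
err = relative_error(truth, filled, [2, 4])
print(filled, err)
```

The relative error is computed only over the gap positions, so observed values, which are copied through unchanged, do not dilute it. The error definition used here (ratio of Euclidean norms) is one common choice; the abstract does not state which definition the authors used.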
Source journal
Egyptian Informatics Journal (Decision Sciences: Management Science and Operations Research)
CiteScore: 11.10
Self-citation rate: 1.90%
Articles published: 59
Review time: 110 days
About the journal: The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. The journal provides a forum for state-of-the-art research and development in the fields of computing, including computer sciences, information technologies, information systems, operations research and decision support. Submissions of innovative, previously unpublished work in subjects covered by the journal are encouraged, whether from academic, research or commercial sources.