Medical data mining: A case study of a Paracoccidioidomycosis patient's database

E. Ferreira, H. Rausch, S. Campos, A. Faria-Campos, Enio Pietra, Lílian Silva dos Santos
2014 IEEE 16th International Conference on e-Health Networking, Applications and Services (Healthcom), 2014-10-01
DOI: 10.1109/HealthCom.2014.7001854 (https://doi.org/10.1109/HealthCom.2014.7001854)
Citations: 4

Abstract

Applying data mining to medical databases is challenging. Large data sources are often unavailable and the data are complex, especially for rare and neglected diseases. Such databases are generally small, wide and sparse, which makes them difficult to analyze. There are also ethical, legal and social issues concerning privacy and the clinical validation of findings. This work proposes a way of dealing with these challenges through a case study of data mining applied to a database of Paracoccidioidomycosis (PCM) patients. PCM is a typical Brazilian disease caused by the yeast Paracoccidioides brasiliensis. It is an important public health issue because of its high incapacitating potential and the number of premature deaths it causes if untreated. This paper discusses methods for analyzing this complex dataset, to help increase the understanding of both the disease and this type of data. Despite the challenges of the dataset, several interesting findings were made: flaws in form-filling protocols, notably the lack of a chest X-ray in 40% of the records; a possible new relation between smoking habits and PCM evolution time, with the average evolution time for smoking patients being 2.8 times longer; and the successful classification/prediction of the cutaneous form of the disease with a 93% precision rate.
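
The abstract reports three kinds of figures: a missing-data rate, a ratio of group averages, and a classifier precision. The sketch below shows one plausible way such quantities could be computed. It is not the authors' method: the records, field names (`smoker`, `evolution_months`, `chest_xray`) and helper functions are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical toy records; field names and values are illustrative only
# and are NOT taken from the PCM database described in the abstract.
records = [
    {"smoker": True,  "evolution_months": 12.0, "chest_xray": "normal"},
    {"smoker": True,  "evolution_months": 8.0,  "chest_xray": None},
    {"smoker": False, "evolution_months": 4.0,  "chest_xray": "abnormal"},
    {"smoker": False, "evolution_months": 6.0,  "chest_xray": None},
    {"smoker": False, "evolution_months": 5.0,  "chest_xray": "normal"},
]


def missing_rate(rows, field):
    """Fraction of rows in which `field` is absent or None."""
    return sum(1 for r in rows if r.get(field) is None) / len(rows)


def group_mean_ratio(rows, group_field, value_field):
    """Mean of `value_field` where `group_field` is true,
    divided by the mean where it is false."""
    in_group = [r[value_field] for r in rows if r[group_field]]
    out_group = [r[value_field] for r in rows if not r[group_field]]
    return mean(in_group) / mean(out_group)


def precision(y_true, y_pred):
    """TP / (TP + FP) for boolean labels; 0.0 if nothing was predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    return tp / (tp + fp) if (tp + fp) else 0.0


if __name__ == "__main__":
    print("missing chest X-ray rate:", missing_rate(records, "chest_xray"))
    print("smoker / non-smoker mean evolution time:",
          group_mean_ratio(records, "smoker", "evolution_months"))
    print("precision:", precision([True, False, True, True],
                                  [True, True, True, False]))
```

On a small, wide and sparse dataset of the kind the abstract describes, checking per-field missing rates first is a reasonable way to decide which attributes can support later comparisons or classification.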