The method of primary processing of poorly structured medical data

Dmytro Bychko, V. Shendryk, Yuliia Parfenenko
DOI: 10.23939/sisn2020.08.001 (https://doi.org/10.23939/sisn2020.08.001)
Journal: Vìsnik Nacìonalʹnogo unìversitetu "Lʹvìvsʹka polìtehnìka". Serìâ Ìnformacìjnì sistemi ta merežì
Published: 2020-12-05 (Journal Article)

Abstract

The article presents an approach to the primary processing of poorly structured textual data from medical protocols that are stored and distributed as PDF files. The work is relevant because there is no universal structure for presenting medical protocols and no established method for processing them. The problem of primary processing of clinical protocol data is solved using the example of a unified clinical protocol of primary, secondary (specialized) and tertiary (highly specialized) medical care. The primary data processing method was developed to create a clear structure of the symptoms of a disease. The first step in structuring clinical protocol data is to divide the protocol information into four basic parts, which allows it to be quickly converted to other formats. This process is implemented by an algorithm written in C#. The algorithm parses the information from a PDF file and converts it to a TXT file. The extracted text is then processed: it is analyzed syntactically, and the structural parts of the protocol are selected according to the section headings: title page; introduction; list of abbreviations used in the protocol; main part of the protocol; list of references. The disease name in a medical protocol is identified by comparing the protocol data with the list of disease names in the international classification MKH-10 (ICD-10). The headings "Introduction", "List of abbreviations used in the protocol" and the main part of the protocol were analyzed, and an algorithm was proposed for removing uninformative sections, for example the literature sources, from the protocol. An algorithm is also proposed for finding information in the main part of the medical protocol by processing the input data by tables, diagrams, headings, words, phrases and special symbols.
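The heading-based split and the disease-name lookup described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the authors' C# implementation. The heading patterns, the function names `split_sections` and `find_disease`, the sample text and the two-entry disease list are all assumptions made for this example; the text is assumed to have already been extracted from the PDF.

```python
import re

# Hypothetical heading patterns; real protocols may word their headings differently.
SECTION_HEADINGS = [
    ("introduction", r"^\s*introduction\s*$"),
    ("abbreviations", r"^\s*list of abbreviations.*$"),
    ("main", r"^\s*main part.*$"),
    ("references", r"^\s*(list of )?(references|literature).*$"),
]

# Made-up sample text standing in for the TXT output of the PDF conversion step.
SAMPLE_TEXT = """Unified clinical protocol
Asthma
Introduction
This protocol covers diagnosis.
List of abbreviations used in the protocol
ICS - inhaled corticosteroids
Main part of the protocol
Symptoms: wheezing, cough.
List of references
(placeholder entry)
"""

# Tiny illustrative excerpt of a code-to-name list, not the full classification.
SAMPLE_DISEASES = {"I10": "Essential (primary) hypertension", "J45": "Asthma"}


def split_sections(text):
    """Split plain protocol text into named sections by heading lines.

    Everything before the first recognized heading is treated as the title page.
    """
    sections = {"title_page": []}
    current = "title_page"
    for line in text.splitlines():
        for name, pattern in SECTION_HEADINGS:
            if re.match(pattern, line, flags=re.IGNORECASE):
                current = name  # a heading starts a new section
                sections.setdefault(current, [])
                break
        else:
            sections[current].append(line)  # ordinary line: keep in current section
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


def find_disease(text, disease_names):
    """Return (code, name) for the first listed disease found in text, else None."""
    lowered = text.lower()
    for code, name in disease_names.items():
        if name.lower() in lowered:
            return code, name
    return None


sections = split_sections(SAMPLE_TEXT)
sections.pop("references", None)  # drop the uninformative reference list
disease = find_disease(sections["title_page"], SAMPLE_DISEASES)
print(disease)
```

Plain substring matching is the simplest possible comparison and would miss inflected or abbreviated disease names; the abstract does not say which matching rule the authors use.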
Applying the clinical protocol processing algorithms produces a new clinical protocol file that is about one third the size of the original. It contains only the meaningful information from the clinical protocol, which speeds up further work with the file, namely its use in medical decision support. A disease card based on a medical protocol is presented in JSON format.
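The abstract does not give the schema of the disease card, so the following is only a sketch of what such a JSON card might look like. It is in Python, not the authors' C#, and the function name `build_disease_card`, every field name and all sample values are assumptions made for illustration.

```python
import json


def build_disease_card(code, name, abbreviations, symptoms):
    """Assemble a disease card as a plain dict (all field names are hypothetical)."""
    return {
        "disease": {"code": code, "name": name},
        "abbreviations": abbreviations,
        "symptoms": symptoms,
    }


# Made-up values standing in for data extracted from a processed protocol.
card = build_disease_card(
    code="J45",
    name="Asthma",
    abbreviations={"ICS": "inhaled corticosteroids"},
    symptoms=["wheezing", "cough"],
)

# ensure_ascii=False keeps non-Latin text (for example Ukrainian) readable in the output.
card_json = json.dumps(card, ensure_ascii=False, indent=2)
print(card_json)
```

Keeping the card as a plain dictionary until the final `json.dumps` call makes it easy to add or rename fields once the real structure of the protocol is known.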