S. Krivenko, A. Pulavskyi, L. Kryvenko, O. Krylova, Sergey A. Krivenko
Published in: 2021 Computing in Cardiology (CinC), 2021-09-13
DOI: 10.23919/cinc53138.2021.9662929 (https://doi.org/10.23919/cinc53138.2021.9662929)
Citations: 3
Using Mel-Frequency Cepstrum and Amplitude-Time Heart Variability as XGBoost Handcrafted Features for Heart Disease Detection
We developed XGBoost models to identify 27 heart pathologies for the challenge "Will Two Do? Varying Dimensions in Electrocardiography: The PhysioNet/Computing in Cardiology Challenge 2021". The technical part included several stages. At the first stage, each ECG was truncated to 10 seconds. At the second stage, the signals were resampled to 125 Hz and 500 Hz and filtered to the 0.5-45 Hz band. At the third stage, heart rate variability (HRV) and symbolic dynamics features were extracted from the signal sampled at 125 Hz, and mel spectrograms were calculated from the signal sampled at 500 Hz. The features calculated for each lead were then concatenated to obtain the final feature vector. The task required constructing 27 independent binary classifiers, each of which detects one pathology. The fourth important step was to build balanced datasets for the algorithm. For robustness, the control group for each classifier contained almost all pathologies present in the databases except the target disease. Our team, Sunset, scored 0.22, 0.21, 0.22, 0.21, and 0.20 for the 12-lead, 6-lead, 4-lead, 3-lead, and 2-lead models, respectively, ranking 32nd out of 39 teams for the first four lead combinations and 31st out of 39 teams for the last.
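The balanced, one-classifier-per-pathology dataset construction described in the abstract could look roughly like the sketch below. It is an illustration only, not the authors' code: the function name `build_balanced_dataset`, the 1:1 positive-to-control ratio, the uniform random sampling of controls, and the toy record labels are all assumptions, since the abstract states only that the datasets were balanced and that each control group mixed the non-target pathologies.

```python
import random
from typing import Dict, List, Set, Tuple


def build_balanced_dataset(
    labels_by_record: Dict[str, Set[str]],
    target: str,
    seed: int = 0,
) -> List[Tuple[str, int]]:
    """Return shuffled (record_id, binary_label) pairs for one target pathology.

    Positives are all records annotated with `target`. Controls are sampled
    at random from records without `target`, whatever other labels they
    carry, so the control group mixes the remaining pathologies.
    The 1:1 ratio is an assumption, not a detail taken from the abstract.
    """
    positives = sorted(r for r, labs in labels_by_record.items() if target in labs)
    candidates = sorted(r for r, labs in labels_by_record.items() if target not in labs)

    rng = random.Random(seed)
    n_controls = min(len(positives), len(candidates))
    controls = rng.sample(candidates, n_controls)

    dataset = [(r, 1) for r in positives] + [(r, 0) for r in controls]
    rng.shuffle(dataset)
    return dataset


# Hypothetical toy annotations (not taken from the challenge databases).
toy_labels = {
    "rec1": {"AF"},
    "rec2": {"AF", "LBBB"},
    "rec3": {"LBBB"},
    "rec4": {"NSR"},
    "rec5": {"PVC"},
    "rec6": {"NSR", "PVC"},
}

# One independent binary dataset per pathology (27 in the paper, 4 here).
all_targets = sorted(set().union(*toy_labels.values()))
datasets = {t: build_balanced_dataset(toy_labels, t) for t in all_targets}
```

Each entry of `datasets` could then be joined with the concatenated per-lead feature vectors and used to train one binary classifier for that pathology.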