An Oversampling Technique for Handling Imbalanced Data in Patients with Metabolic Syndrome and Periodontitis

Cumhuriyet Dental Journal (Dentistry, Q3). Pub Date: 2023-11-29. DOI: 10.7126/cumudj.1332452
S. M. Altingöz, B. Bakırarar, Elif Ünsal, Ş. Kurgan, M. Günhan

Abstract

Objectives: Periodontitis has been suggested to be associated with several systemic diseases and conditions, including obesity, metabolic syndrome, diabetes, chronic renal disease, respiratory disorders, and cardiovascular diseases. Metabolic syndrome (MetS) is a collection of impairments and is a risk factor for type 2 diabetes and cardiovascular disease. This study aimed to handle imbalanced MetS data using the synthetic minority over-sampling technique (SMOTE) to increase accuracy and reliability.

Materials and Methods: Six metabolic syndrome patients and 26 systemically healthy subjects with periodontitis were recruited for this study. Clinical parameters (plaque index (PI), gingival index (GI), probing pocket depth (PPD), clinical attachment loss (CAL), and bleeding on probing (BOP)) were obtained, and smoking status, body-mass index (BMI), systemic diseases, fasting glucose levels, hemoglobin A1c (HbA1c) levels, and serum advanced glycation end-product (AGE) levels were recorded by one examiner. First, the data were pre-processed by removing missing values and outliers and by normalizing the data. Then, SMOTE was used to oversample the minority class. SMOTE works by creating synthetic data points that are similar to the existing minority class instances. Several machine learning algorithms were applied to the experimental dataset, and accuracy was assessed both before and after oversampling.

Results: Our findings suggest that increasing a study's sample size yields more accurate and reliable results. This is especially important when the study population is small, because results from small samples may be skewed.

Conclusion: SMOTE may result in overfitting on numerous copies of minority class samples.
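The oversampling step named in the Materials and Methods can be illustrated with a minimal sketch. It assumes the standard SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours; it is not the authors' implementation. The function name `smote`, its parameters, and the two-feature data are hypothetical, and only the class sizes (6 versus 26) come from the abstract.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    For each synthetic point: pick a minority sample, pick one of its k
    nearest minority neighbours, and take a random point on the segment
    joining the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest first (Euclidean distance).
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical, already-normalised feature vectors; NOT data from the study.
# Class sizes mirror the abstract: 6 minority (MetS) vs 26 majority subjects.
data_rng = random.Random(42)
mets = [[data_rng.uniform(0.6, 1.0), data_rng.uniform(0.5, 1.0)] for _ in range(6)]
healthy = [[data_rng.uniform(0.0, 0.6), data_rng.uniform(0.0, 0.7)] for _ in range(26)]

# Generate enough synthetic minority points to match the majority class size.
new_points = smote(mets, n_new=len(healthy) - len(mets))
balanced_mets = mets + new_points
print(len(balanced_mets), len(healthy))  # 26 26
```

Every synthetic point lies on a line segment between two of the six original minority samples, so the balanced class carries no information beyond those six subjects. This is consistent with the overfitting concern raised in the Conclusion.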