A refined spirometry dataset for comparing segmented (piecewise) linear models to that of GAMLSS

Data in Brief (IF 1.0, JCR Q3, Multidisciplinary Sciences) | Pub Date: 2024-10-23 | DOI: 10.1016/j.dib.2024.111062
Gerald Stanley Zavorsky
Data in Brief, volume 57, Article 111062. Full text: https://www.sciencedirect.com/science/article/pii/S2352340924010242
Citations: 0

Abstract

Generalized Additive Models for Location, Scale, and Shape (GAMLSS) are widely used for developing spirometric reference equations but are often complex, requiring additional spline tables. This study explores Segmented (piecewise) Linear Regression as an alternative, comparing its predictive accuracy with that of GAMLSS and examining the agreement between the two methods. Spirometry data from nearly 16,600 patients in the NHANES 2007-2012 dataset, graded “A” or “B” for acceptability, were analyzed. The dataset includes both nominal and scalar variables. Reference equations for forced expiratory volume in 1 s (FEV1), forced vital capacity (FVC), and their ratio (FEV1/FVC) were generated using GAMLSS (FEV1, FVC, FEV1/FVC), Segmented Linear Regression (FEV1, FVC), and multiple linear regression (FEV1/FVC). K-fold cross-validation was used to compare prediction accuracy in terms of root-mean-square error (RMSE) and correlation coefficients. Agreement in classifying spirometric patterns (i.e., airway obstruction, restrictive spirometry pattern, and mixed obstructive and restrictive disorder) was evaluated with the kappa statistic. This study uniquely compares the models by incorporating the lower limit of normal (LLN), defined by fitted z-scores of –1.645 or –1.96. The dataset is publicly available in SPSS (.sav) and .csv formats through the Mendeley Data repository.
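The agreement analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the author's code. The classification rule (obstruction when the FEV1/FVC z-score falls below the LLN cutoff, restrictive pattern when only the FVC z-score does, mixed when both do) is a common convention assumed here, since the abstract does not spell out the rule. The function names `classify` and `cohen_kappa` and all z-score values are hypothetical and are not taken from the dataset.

```python
from collections import Counter

# LLN cutoffs named in the abstract (5th and 2.5th percentiles of a standard normal).
Z_LLN_5TH = -1.645
Z_LLN_2_5TH = -1.96


def classify(fev1_fvc_z, fvc_z, cutoff=Z_LLN_5TH):
    """Label a spirometric pattern from two fitted z-scores.

    Assumed rule (not stated in the abstract): a value is 'low' when its
    z-score is below the LLN cutoff.
    """
    low_ratio = fev1_fvc_z < cutoff
    low_fvc = fvc_z < cutoff
    if low_ratio and low_fvc:
        return "mixed"
    if low_ratio:
        return "obstruction"
    if low_fvc:
        return "restrictive"
    return "normal"


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two label lists."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance, from each rater's marginal frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:  # both raters used a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical (FEV1/FVC z, FVC z) pairs for the same six people under two models.
gamlss_z = [(-2.1, 0.3), (-0.5, -2.0), (-1.8, -1.9), (0.2, 0.1), (-1.7, 0.0), (0.4, -0.3)]
segmented_z = [(-1.9, 0.2), (-0.4, -1.7), (-1.5, -2.0), (0.1, 0.0), (-1.6, 0.1), (0.3, -0.2)]

gamlss_labels = [classify(ratio_z, fvc_z) for ratio_z, fvc_z in gamlss_z]
segmented_labels = [classify(ratio_z, fvc_z) for ratio_z, fvc_z in segmented_z]

kappa = cohen_kappa(gamlss_labels, segmented_labels)
print(gamlss_labels)
print(segmented_labels)
print(round(kappa, 3))
```

With these made-up values the two models agree on four of six labels, which gives a kappa of 7/13 (about 0.54). Passing `cutoff=Z_LLN_2_5TH` to `classify` repeats the comparison with the stricter –1.96 threshold.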
Source journal: Data in Brief (Multidisciplinary Sciences)
CiteScore: 3.10
Self-citation rate: 0.00%
Articles published: 996
Review time: 70 days
Journal description: Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that thoroughly describe your data, facilitating reproducibility; make your data, which is often buried in supplementary material, easier to find; increase traffic towards associated research articles and data, leading to more citations; and open up doors for new collaborations. Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.