S. Sabour. The Journal of Nuclear Medicine Technology, page 386. Published 2019-12-06. DOI: https://doi.org/10.2967/jnmt.119.232546
Self-Reported Weight and Height in Nuclear Medicine Patients: A Common Mistake Confusing Reliability and Accuracy
TO THE EDITOR: I read with great interest the article by Blum et al. recently published in the Journal of Nuclear Medicine Technology (1). The authors aimed to assess the reliability of the self-reported weight and height of nuclear medicine patients in view of recommendations for weight-dependent tracer application for imaging and therapy. In total, 824 patients (334 men and 490 women) were asked to report their weight and height before imaging or therapy, along with their level of confidence that the values they were reporting were correct. Subsequently, the weight and height of each patient were measured, and body mass index, body surface area, and lean body mass were calculated. Differences between the reported and true values were compared for statistical significance. The results indicated that an over- or underestimation of weight by at least 10% was observed in 2% of the patients, and height was overestimated by 1% of the patients. Surprisingly, the authors concluded that most self-reported weights and heights of nuclear medicine patients are accurate.
However, there were some methodologic issues regarding accuracy and reliability. First, it is crucial to realize that accuracy and reliability are two completely different methodologic issues. The term accuracy means the degree to which the result of a measurement, calculation, or specification conforms to the correct value or a standard. In other words, accuracy is the most important criterion for the quality of a test and refers to whether the test measures what it claims to measure. The core design for determining and measuring the accuracy of a test is a comparison between an index test and a reference standard, with both applied to similar people who are suspected of having the target result of interest. The term reliability denotes refinement of a measurement, calculation, or specification, especially as represented by the number of digits given.
Accuracy studies should report significant and comprehensive information together with the absolute number of true-positive, false-positive, false-negative, and true-negative results or should provide information that allows calculation of at least one diagnostic performance indicator (i.e., sensitivity, specificity, predictive values, or likelihood ratio). Therefore, we recommend applying the most appropriate estimates to evaluate the accuracy of the self-reported weight and height. The Pearson r or the Spearman r can be applied to assess accuracy for quantitative variables. However, for qualitative (binary) variables, some of the well-known ways to assess accuracy include sensitivity, specificity, positive predictive value, negative predictive value, positive likelihood ratio (ranging from 1 to infinity; the higher the positive likelihood ratio, the more accurate the test), negative likelihood ratio (ranging from 0 to 1; the lower the negative likelihood ratio, the more accurate the test), diagnostic accuracy, and odds ratio (ratio of true results to false results) (2–8).
Second, what is critically important is reliability, which is conceptually different from accuracy. Consequently, our methodologic and statistical approach to assessing reliability should be different. Depending on the type of variable, appropriate estimates to assess reliability are completely different from those used to assess accuracy. For quantitative variables, we can apply either the intraclass correlation coefficient or Bland–Altman plots. For qualitative variables, we can apply the weighted κ or the Fleiss κ to assess intra- or interobserver reliability, respectively.
Thus, because of the inappropriate use of statistical tests (Student t test and ANOVA) for accuracy and reliability analyses, as well as misinterpretation of the results, there may be a high level of uncertainty about the conclusion of Blum et al.
The evidence is insufficient to conclude that the self-reported weights and heights of nuclear medicine patients are accurate.
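As a minimal illustration of two of the estimates named in this letter, the following Python sketch computes the diagnostic performance indicators from a 2×2 table and the bias and 95% limits of agreement that underlie a Bland–Altman plot. It is a sketch under stated assumptions, not the analysis of Blum et al.: the function names, the example counts, and the example weights are hypothetical, and the limits of agreement use the conventional mean difference ± 1.96 standard deviations.

```python
from statistics import mean, stdev


def diagnostic_metrics(tp, fp, fn, tn):
    """Indicators from a 2x2 table of index test vs. reference standard.

    tp, fp, fn, tn are the absolute counts of true-positive, false-positive,
    false-negative, and true-negative results. The likelihood ratios are
    undefined (division by zero) when specificity is exactly 1 or 0.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),  # positive predictive value
        "npv": tn / (tn + fn),  # negative predictive value
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


def bland_altman(reported, measured):
    """Bias and 95% limits of agreement for paired quantitative values.

    Returns (bias, lower_limit, upper_limit), where bias is the mean of the
    paired differences (reported - measured).
    """
    diffs = [r - m for r, m in zip(reported, measured)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample standard deviation
    return bias, bias - half_width, bias + half_width


if __name__ == "__main__":
    # Hypothetical counts, e.g. "misreport of at least 10%" as the binary outcome.
    print(diagnostic_metrics(tp=45, fp=5, fn=5, tn=45))
    # Hypothetical self-reported and measured weights in kilograms.
    print(bland_altman([70, 82, 65, 90, 77], [72, 81, 66, 93, 78]))
```

Reporting the bias together with the limits of agreement shows how far an individual self-reported value may lie from the measured one, which a test of mean differences alone does not convey.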