Assessing the reliability of the FricTest® 4.0 for diagnosing symptomatic dermographism

Clinical and Translational Allergy (IF 4.6; Zone 2, Medicine; JCR Q2, Allergy). Pub Date: 2024-11-13. DOI: 10.1002/clt2.70005

Annika Gutsche, Martin Metz, Melba Munoz, Kit Wong, Ted Omachi, Rui Zhao, Marcus Maurer, Vasiliki Zampeli, Markus Magerl



Abstract

To the Editor,

Symptomatic dermographism (SD), a common subtype of chronic inducible urticaria (CIndU), involves transient, strip-shaped wheals that itch and burn when the skin is stroked or scratched.¹ SD affects ≥0.5% of the population,² yet despite its high frequency and marked impact on quality of life, diagnostic tools and treatment options are limited.¹,³

Diagnosis is based on the patient's medical history and provocation testing.² Historically, provocation testing used a smooth, blunt object to stroke the skin, but variations in individuals' disease presentation highlighted the need for validated, reproducible tools.³ The FricTest® 4.0⁴ is a hand-held, flat, comb-like plastic tool with four smooth pins (3.0–4.5 mm long) that are stroked firmly along the skin. The resulting wheals determine the critical friction threshold (CFT), the shortest pin length/minimum pressure that elicits a positive wheal response.⁵
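
The CFT definition above can be illustrated with a minimal sketch. It assumes that each pin is recorded as a positive or negative wheal response and that the four pins are spaced at 3.0, 3.5, 4.0, and 4.5 mm; the source states only the 3.0–4.5 mm range, so the intermediate lengths, the function name, and the data layout are hypothetical.

```python
from typing import Mapping, Optional


def critical_friction_threshold(wheal_by_pin_mm: Mapping[float, bool]) -> Optional[float]:
    """Return the shortest pin length (mm) that elicited a wheal, or None if no pin did."""
    positive_lengths = [length for length, wheal in wheal_by_pin_mm.items() if wheal]
    return min(positive_lengths) if positive_lengths else None


# Hypothetical reading: the two longest pins produced a wheal.
# Pin lengths are an assumption (only the 3.0-4.5 mm range is given in the text).
reading = {3.0: False, 3.5: False, 4.0: True, 4.5: True}
print(critical_friction_threshold(reading))  # 4.0
```

A lower threshold in this sketch means that a shorter pin, and therefore less pressure, was enough to provoke a wheal.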

This study aimed to assess the reliability (reproducibility and repeatability) of the FricTest in terms of inter-rater agreement (results from two different raters on the same patient at the same visit) and intra-rater agreement (results from the same rater on the same patient 7–14 days apart). Reliable results are important for monitoring treatment effects, helping patients understand triggers, and improving management. We assumed that each patient's left and right forearms were equivalent, that the two visits were equivalent, and that previous provocation did not affect the reaction in subsequent tests.

This single-center study was conducted at the Urticaria Center of Reference and Excellence⁶ at the Charité Hospital, Berlin, Germany. Participants were adults who had had SD for >6 weeks, had active SD at enrollment, and gave written informed consent. The study followed the Declaration of Helsinki principles, and the Berlin Charité Ethics Committee approved the protocol.

The primary endpoint, inter-rater agreement, was the intraclass correlation coefficient (ICC) between the CFT assessments of two raters within the same patient. The CFT scale is 0–4 (0 = no response, 4 = maximum response). ICCs with upper and lower 95% confidence intervals (CI) were calculated using a mixed-effects linear model methodology.⁷ Agreement between Raters A and B was quantified using weighted kappa across four categories: left forearm, right forearm, Visit 1, and Visit 2. An ICC <0.4 indicated poor reliability, and 0.6–0.8 indicated substantial reliability.
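
To make the weighted kappa statistic concrete, the following is a minimal sketch of Cohen's weighted kappa for two raters scoring on the 0–4 CFT scale. The weighting scheme is an assumption: the text does not say whether linear or quadratic weights were used, so the sketch defaults to linear weights and exposes the exponent as a parameter. The function name and the example scores are hypothetical and are not study data.

```python
from collections import Counter
from typing import Sequence


def weighted_kappa(
    rater_a: Sequence[int],
    rater_b: Sequence[int],
    categories: Sequence[int] = (0, 1, 2, 3, 4),  # CFT scale from the text
    power: int = 1,  # 1 = linear weights, 2 = quadratic weights (assumed choice)
) -> float:
    """Cohen's weighted kappa: 1 - (weighted observed / weighted expected disagreement)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of scores")
    n = len(rater_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    observed = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    row_totals = Counter(index[a] for a in rater_a)
    col_totals = Counter(index[b] for b in rater_b)

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = (abs(i - j) / (k - 1)) ** power  # 0 on the diagonal
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * (row_totals[i] / n) * (col_totals[j] / n)

    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when both raters use one identical category")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical CFT scores for eight patients (not study data).
scores_a = [0, 1, 2, 2, 3, 4, 4, 1]
scores_b = [0, 1, 2, 3, 3, 4, 3, 1]
print(round(weighted_kappa(scores_a, scores_b), 2))
```

With ordinal scores such as the CFT, weighting penalizes a one-step disagreement less than a four-step disagreement, which is why a weighted rather than an unweighted kappa suits this scale.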

Two raters, randomized to the order of assessments (Forearm 1 [right], Forearm 2 [left]), administered and recorded all FricTest results. At Visit 1, Rater A applied the FricTest to Forearm 1, covered the arm, and left the room. Rater B then applied the FricTest to Forearm 2, covered the arm, and left the room. Ten minutes after the first application, Rater A returned, uncovered Forearm 1, and documented the reaction before re-covering the arm. Rater B then repeated this process for Forearm 2. Both raters repeated these procedures at Visit 2 after 7–14 days, each using the opposite arm from Visit 1.
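
The crossover of raters and forearms described above can be written out as a small schedule. This is only an illustrative sketch: the function name is hypothetical, and treating Rater A's first arm as a parameter is an assumption about how the randomized order might be represented.

```python
from typing import List, Tuple

ARMS = ("right", "left")  # Forearm 1 = right, Forearm 2 = left, as in the text


def crossover_schedule(rater_a_first_arm: str = "right") -> List[Tuple[int, str, str]]:
    """Return (visit, rater, forearm) assignments; raters swap arms at Visit 2."""
    if rater_a_first_arm not in ARMS:
        raise ValueError("arm must be 'right' or 'left'")
    other_arm = "left" if rater_a_first_arm == "right" else "right"
    return [
        (1, "A", rater_a_first_arm),
        (1, "B", other_arm),
        (2, "A", other_arm),
        (2, "B", rater_a_first_arm),
    ]


for visit, rater, arm in crossover_schedule():
    print(f"Visit {visit}: Rater {rater} tests the {arm} forearm")
```

Under this layout each rater tests both forearms once, and each forearm is tested by both raters, one at each visit.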

Among 16 participants (62.5% female), we observed substantial inter-rater agreement when Raters A and B assessed the same patient during the same visit, with a weighted kappa of 0.86 for the left and 0.77 for the right forearm (Table 1, Figure 1). When the two raters assessed the same patient at different visits, the agreement was moderate at Visit 1 (weighted kappa of 0.56) but improved to substantial at Visit 2 (0.79; Table 1).

When the same rater assessed one patient at different visits, we found substantial intra-rater agreement. The weighted kappas for Raters A and B were 0.72 and 0.73, respectively, indicating consistent and substantial agreement between Visits 1 and 2.

Overall, there was high consistency in ICC measurements, with 0.89 (95% CI 0.71, 0.94) for inter-rater reliability and 0.81 (95% CI 0.60, 0.92) for intra-rater reliability. At Visits 1 and 2, the mean differences between Raters A and B were 0.06 and 0.13, respectively. Similarly, the mean differences for the left and right forearms were 0.06 and 0.25, respectively, indicating the high agreement that is essential for reliable measurements.
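
For readers unfamiliar with these summary statistics, the sketch below shows how an ICC and a mean difference can be formed from paired CFT scores. It is not the authors' method: the study used a mixed-effects linear model, whereas this sketch swaps in the simpler one-way random-effects ANOVA estimator, ICC(1,1), for two measurements per patient. Function names and example scores are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple

Pair = Tuple[float, float]  # (score from rating 1, score from rating 2) for one patient


def icc_oneway(pairs: Sequence[Pair]) -> float:
    """One-way random-effects ICC(1,1) for two ratings per patient (ANOVA estimator)."""
    n = len(pairs)
    k = 2  # ratings per patient
    if n < 2:
        raise ValueError("need at least two patients")
    grand_mean = mean(score for pair in pairs for score in pair)
    patient_means = [mean(pair) for pair in pairs]
    # Between-patient and within-patient mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in patient_means) / (n - 1)
    ms_within = sum(
        (score - m) ** 2 for pair, m in zip(pairs, patient_means) for score in pair
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_between - ms_within) / denominator


def mean_difference(pairs: Sequence[Pair]) -> float:
    """Average of (rating 1 - rating 2) across patients."""
    return mean(first - second for first, second in pairs)


# Hypothetical paired CFT scores for six patients (not study data).
example = [(0, 0), (1, 1), (2, 3), (3, 3), (4, 4), (2, 2)]
print(round(icc_oneway(example), 2), round(mean_difference(example), 2))
```

A mean difference near zero shows that neither rating is systematically higher, while the ICC reflects how much of the total variation comes from real differences between patients rather than from rating noise.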

Our study validates the FricTest, demonstrating its reliability as a diagnostic and monitoring tool in SD. Measurement noise from repeated tests by the same or different raters is a well-recognized source of error in medical assessments.⁸ Our results showed almost perfect inter-rater and substantial intra-rater agreement when inducing wheals in SD patients. Accurately determining a patient's CFT could enhance SD diagnosis and ultimately lead to better management of the patient's condition.

We observed changes in one patient that primarily drove variability between visits. Despite these fluctuations, the consistency with which each rater detected changes in disease activity confirms the robustness of our assessment tool in identifying improvements and deteriorations in SD.

Our study has limitations, mostly relating to its small size, duration of follow-up, and applicability to real-world conditions. Itch was documented at baseline to ensure the correct diagnosis; it could not be measured as an outcome because the intensity of itch cannot be differentiated in such tightly condensed strips. Further real-world evidence is needed to confirm the FricTest's reproducibility over longer periods and its clinical validity in busy settings, which differ from the controlled conditions of this study. While we strove to ensure comparable physical environments at both visits, we cannot rule out minor differences in the experimental environment, nor can we exclude disease-activity modifying factors between the two visits, which may have caused response fluctuations.

The simple, compact and affordable FricTest should be utilized in clinical studies, especially those investigating potential CIndU treatments, and in primary care settings, where continuity of care can be challenging.

Annika Gutsche: Methodology; writing—review & editing; writing—original draft; data curation. Martin Metz: writing—original draft; methodology; writing—review & editing; data curation. Melba Munoz: Writing—original draft; methodology; writing—review & editing; data curation. Kit Wong: Methodology; writing—review & editing; data curation. Ted Omachi: Data curation; methodology; writing—review & editing. Rui Zhao: Methodology; data curation; writing—review & editing. Marcus Maurer: Methodology; writing—original draft; writing—review & editing; data curation. Vasiliki Zampeli: Study protocol; study investigator; methodology; writing—review & editing; writing—original draft; data curation. Markus Magerl: Study protocol; study investigator; writing—original draft; methodology; writing—review & editing; data curation.

A. Gutsche declares no conflict of interest. M. Metz is or recently was a speaker and/or consultant for Amgen, AstraZeneca, Argenx, Celldex, Celltrion, Escient, Jasper Therapeutics, Novartis, Pharvaris, Regeneron, Sanofi, and Third Harmonic Bio. M. Munoz is or recently was a speaker and/or advisor for, and/or received research funding from, Jasper Therapeutics, Celldex Therapeutics, Takeda, GA²LEN, UNEV, AstraZeneca, and Roche outside the submitted work. K. Wong is a former employee of Genentech Inc. T. Omachi is an employee of Genentech Inc. R. Zhao is a former employee of Genentech/Roche. M. Maurer was a speaker and/or advisor for and/or has received research funds from Allakos, Amgen, Aralez, ArgenX, AstraZeneca, Celldex, Centogene, CSL Behring, FAES, Genentech, GI Innovation, Innate Pharma, Kyowa Kirin, Leo Pharma, Lilly, Menarini, Moxie, Novartis, Roche, Sanofi/Regeneron, Third Harmonic Bio, UCB, and Urich. V. Zampeli was a speaker and/or advisor for and/or has received research funding from Pharming, Takeda, and CSL Behring. M. Magerl is or recently was a speaker and/or advisor for and/or has received research funding from Biocryst, Pharming, Takeda, CSL Behring, Ionis Pharmaceuticals, KalVista, and Pharvaris.


Source journal

Clinical and Translational Allergy (Immunology and Microbiology: Immunology)

CiteScore: 7.50

Self-citation rate: 4.50%

Articles published: 117

Review time: 12 weeks

Journal description: Clinical and Translational Allergy, one of several journals in the portfolio of the European Academy of Allergy and Clinical Immunology (EAACI), provides a platform for disseminating allergy research and reviews, as well as EAACI position papers, task force reports, and guidelines, to an international scientific audience. The journal accepts clinical and translational research in the following areas and other related topics: asthma, rhinitis, rhinosinusitis, drug hypersensitivity, allergic conjunctivitis, allergic skin diseases, atopic eczema, urticaria, angioedema, venom hypersensitivity, anaphylaxis, food allergy, immunotherapy, immune modulators and biologics, animal models of allergic disease, immune mechanisms, or any other topic related to allergic disease.