An extended w-test for outlier diagnostics in linear models

Journal of Geodesy | Impact factor: 3.9 | Zone 2 (Earth Sciences) | JCR Q1 (Geochemistry & Geophysics) | Published: 2024-06-18 | DOI: 10.1007/s00190-024-01855-0
Yangkang Yu, Ling Yang, Yunzhong Shen
Citations: 0

Abstract

Outliers have long been a research focus in geodesy. Data snooping, which is based on a statistical test known as the w-test, and its iterative form, iterative data snooping (IDS), are commonly used to diagnose outliers in linear models. When multiple outliers are present, however, these methods may suffer from masking and swamping effects, which limit their detection and identification capabilities. This contribution investigates the cause of the masking and swamping effects and proposes a new method to mitigate them. First, based on a division of the data, an extended form of the w-test with its reliability measure is presented, and a theoretical reinterpretation of data snooping and IDS is provided. Then, to alleviate masking and swamping, a new outlier diagnostic method and its iterative form are proposed, namely data refining and iterative data refining (IDR). In general, if the observations are initially divided into an inlying set and an outlying set, data snooping can be regarded as a process of moving outliers from the inlying set to the outlying set. Data refining is the reverse process: it transfers inliers from the outlying set back to the inlying set. Both theoretical analysis and practical examples show that IDR is more robust than IDS because it alleviates the masking and swamping effects, although it may carry a higher risk of precision loss when the data are insufficient.
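
As a rough illustration of the iterative data snooping (IDS) procedure mentioned in the abstract, the following Python sketch applies a w-test to the simplest linear model: repeated direct measurements of one unknown with an equal, known standard deviation. It is not the authors' implementation and does not cover their extended w-test or IDR. The function names, the sample critical value 3.29 (roughly the two-sided 0.1% level of the standard normal distribution), and the stopping rule are assumptions made for this example.

```python
import math


def w_statistics(values, sigma):
    """w-test statistics for the model y_i = x + e_i with e_i ~ N(0, sigma^2).

    For this model the least-squares estimate of x is the sample mean, and
    every residual has variance sigma^2 * (1 - 1/n), so the statistic is
    w_i = (y_i - mean) / (sigma * sqrt(1 - 1/n)).
    """
    n = len(values)
    mean = sum(values) / n
    scale = sigma * math.sqrt(1.0 - 1.0 / n)
    return [(v - mean) / scale for v in values]


def iterative_data_snooping(values, sigma, critical=3.29):
    """Move one observation at a time from the inlying to the outlying set.

    In each pass the observation with the largest |w| is tested; if it
    exceeds the critical value it is moved and the statistics are recomputed.
    Returns (inlying_indices, outlying_indices) into the original list.
    """
    inlying = list(range(len(values)))
    outlying = []
    # Assumed stopping rule: keep at least two observations in the inlying set.
    while len(inlying) > 2:
        w = w_statistics([values[i] for i in inlying], sigma)
        k = max(range(len(w)), key=lambda j: abs(w[j]))
        if abs(w[k]) <= critical:
            break  # no remaining observation is flagged
        outlying.append(inlying.pop(k))
    return inlying, outlying
```

For example, `iterative_data_snooping([10.0, 10.1, 9.9, 10.05, 9.95, 15.0], sigma=0.1)` moves only the last observation to the outlying set. In the first pass the inflated mean also pushes the w-statistics of the good observations past the critical value, which gives a small-scale picture of the swamping effect the abstract describes.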

Source journal
Journal of Geodesy (Earth Sciences: Geochemistry & Geophysics)
CiteScore: 8.60
Self-citation rate: 9.10%
Articles published: 85
Review time: 9 months
About the journal: The Journal of Geodesy is an international journal concerned with the study of scientific problems of geodesy and related interdisciplinary sciences. Peer-reviewed papers are published on theoretical or modeling studies, and on results of experiments and interpretations. Besides original research papers, the journal includes commissioned review papers on topical subjects and special issues arising from chosen scientific symposia or workshops. The journal covers the whole range of geodetic science and reports on theoretical and applied studies in research areas such as positioning, reference frames, geodetic networks, modeling and quality control, space geodesy, remote sensing, gravity fields, and geodynamics.