Statistical analysis of very high-dimensional data sets of hierarchically structured binary variables with missing data: An application to Marine Corps readiness evaluations

S. Zacks, W. Marlow, S. Brier
Journal: Naval Research Logistics Quarterly
Publication date: 1985-08-01
DOI: 10.1002/NAV.3800320310 (https://doi.org/10.1002/NAV.3800320310)

Abstract

The present analysis deals with very high-dimensional data sets, each containing close to nine hundred binary variables. Each data set corresponds to an evaluation of one complex system. The data sets are characterized by large portions of missing data; moreover, the unobserved variables differ from one evaluation to another. Thus the statistical analysis faces the problems of multivariate binary data analysis in which the number of variables is much larger than the sample size and the pattern of missing data varies with the sample elements. The variables, however, are hierarchically structured, so the problem of clustering variables into groups does not arise in the present study. To motivate the statistical problem under consideration, the Marine Corps Combat Readiness Evaluation System (MCCRES) is described for infantry battalions and then used for exposition. The paper provides a statistical model for data from MCCRES and develops estimation and prediction procedures that utilize the dependence structure. The EM (expectation-maximization) algorithm is applied to obtain maximum likelihood estimates of the parameters of interest. Numerical examples illustrate the proposed methods.
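
To give a feel for how the EM algorithm handles dependent binary variables whose missing entries differ from record to record, here is a minimal sketch. It is not the model of the paper, which the abstract does not specify. It assumes a toy two-level hierarchy with one binary parent `Z` and one binary child `X`, values missing at random (coded as `None`), and three hypothetical parameters: `p = P(Z=1)`, `a = P(X=1 | Z=1)`, and `b = P(X=1 | Z=0)`. All function and variable names are illustrative.

```python
import math


def em_step(data, p, a, b):
    """One EM iteration; data is a list of (z, x) pairs with 0, 1, or None."""
    n = len(data)
    n_z1 = 0.0    # expected count of Z = 1
    n_z1x1 = 0.0  # expected count of (Z = 1, X = 1)
    n_z0x1 = 0.0  # expected count of (Z = 0, X = 1)
    for z, x in data:
        # E-step: w = P(Z = 1 | observed part of this record)
        if z is not None:
            w = float(z)
        elif x is not None:
            like1 = p * (a if x == 1 else 1 - a)
            like0 = (1 - p) * (b if x == 1 else 1 - b)
            total = like1 + like0
            w = like1 / total if total > 0 else p
        else:
            w = p
        # Expected X under Z = 1 and under Z = 0 (observed value if present)
        x1 = float(x) if x is not None else a
        x0 = float(x) if x is not None else b
        n_z1 += w
        n_z1x1 += w * x1
        n_z0x1 += (1 - w) * x0
    # M-step: ratios of expected counts (keep old value if undefined)
    p_new = n_z1 / n
    a_new = n_z1x1 / n_z1 if n_z1 > 0 else a
    b_new = n_z0x1 / (n - n_z1) if n - n_z1 > 0 else b
    return p_new, a_new, b_new


def log_likelihood(data, p, a, b):
    """Observed-data log-likelihood, summing over any missing values."""
    total = 0.0
    for z, x in data:
        prob = 0.0
        for zv in (0, 1):
            if z is not None and z != zv:
                continue
            pz = p if zv == 1 else 1 - p
            if x is None:
                prob += pz
            else:
                px1 = a if zv == 1 else b
                prob += pz * (px1 if x == 1 else 1 - px1)
        total += math.log(prob)
    return total


def fit(data, p=0.5, a=0.6, b=0.4, iterations=100):
    """Run a fixed number of EM iterations from interior starting values."""
    for _ in range(iterations):
        p, a, b = em_step(data, p, a, b)
    return p, a, b


if __name__ == "__main__":
    records = [(1, 1), (1, None), (None, 1), (0, 0), (None, 0),
               (0, 1), (1, 1), (None, None), (0, 0), (1, 0)]
    print(fit(records))
```

When every record is fully observed, a single step reduces to the usual sample proportions; with missing entries, each step replaces the unobserved counts by their conditional expectations, and the observed-data log-likelihood does not decrease from one iteration to the next.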