Walking fingerprinting
Authors: Lily Koffman, Ciprian Crainiceanu, Andrew Leroux
Journal: Journal of the Royal Statistical Society Series C-Applied Statistics, volume 73, issue 5, pages 1221-1241
Published: 2024-07-29
DOI: https://doi.org/10.1093/jrsssc/qlae033
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11561731/pdf/
Impact factor: 1.0 (JCR Q3, Statistics & Probability)
Citations: 0
Abstract
We consider the problem of predicting an individual's identity from accelerometry data collected during walking. In a previous paper, we transformed the accelerometry time series into an image by constructing the joint distribution of the acceleration and lagged acceleration for a vector of lags. Predictors derived by partitioning this image into grid cells were used in logistic regression to predict individuals. Here, we (a) implement machine learning methods for prediction using the grid cell-derived predictors; (b) derive inferential methods to screen for the most predictive grid cells while adjusting for correlation and multiple comparisons; and (c) develop a novel multivariate functional regression model that avoids partitioning the predictor space. Prediction methods are compared on two open-source accelerometry data sets collected from: (a) 32 individuals walking on a 1.06 km path; and (b) six repetitions of walking on a 20 m path on two occasions at least 1 week apart for 153 study participants. In the 32-individual study, all methods achieve at least 95% rank-1 accuracy, while in the 153-individual study, accuracy varies from 41% to 98%, depending on the method and prediction task. Methods provide insights into why some individuals are easier to predict than others.
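The grid-cell construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `grid_cell_predictors` and `rank1_accuracy`, the grid size, the acceleration range, and the use of a single acceleration magnitude signal are all assumptions made for illustration. For each lag, the pairs (acceleration at time t, acceleration at time t + lag) are binned on a square grid, and the normalized cell counts serve as predictors. Rank-1 accuracy is taken here in its usual sense: the fraction of test samples whose top-scoring predicted identity is the true one.

```python
import numpy as np


def grid_cell_predictors(accel, lags, n_bins=4, value_range=(0.0, 3.0)):
    """Bin (a(t), a(t + lag)) pairs on an n_bins x n_bins grid for each lag.

    Hypothetical helper: grid size and value_range are illustrative choices,
    not values taken from the paper. Each lag must be a positive integer.
    Returns a flat vector of length len(lags) * n_bins**2 in which each
    lag's block is an empirical joint distribution (cells sum to 1).
    """
    accel = np.asarray(accel, dtype=float)
    edges = np.linspace(value_range[0], value_range[1], n_bins + 1)
    features = []
    for lag in lags:
        x = accel[:-lag]  # acceleration at time t
        y = accel[lag:]   # acceleration at time t + lag
        counts, _, _ = np.histogram2d(x, y, bins=[edges, edges])
        total = counts.sum()
        # Normalize counts to proportions; leave zeros if nothing fell in range.
        features.append((counts / total if total > 0 else counts).ravel())
    return np.concatenate(features)


def rank1_accuracy(scores, true_ids):
    """Fraction of rows whose highest-scoring column equals the true identity.

    scores: array of shape (n_samples, n_individuals), e.g. predicted
    probabilities from any classifier; true_ids: integer column indices.
    """
    scores = np.asarray(scores)
    return float(np.mean(np.argmax(scores, axis=1) == np.asarray(true_ids)))


# Usage on a synthetic signal (illustrative data only, not from either study).
rng = np.random.default_rng(0)
signal = rng.uniform(0.0, 3.0, size=1000)
predictors = grid_cell_predictors(signal, lags=[15, 30, 45])
print(predictors.shape)  # one block of n_bins**2 cells per lag
```

The resulting vector could be passed to any classifier, for example a one-vs-rest logistic regression as in the earlier paper mentioned in the abstract. The functional regression model in item (c) avoids this binning step altogether, so the sketch applies only to the grid cell-derived predictors.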
About the journal:
The Journal of the Royal Statistical Society, Series C (Applied Statistics) is a journal of international repute for statisticians both inside and outside the academic world. The journal is concerned with papers which deal with novel solutions to real-life statistical problems by adapting or developing methodology, or by demonstrating the proper application of new or existing statistical methods to them. At their heart, therefore, the papers in the journal are motivated by examples and statistical data of all kinds. The subject matter covers the whole range of interdisciplinary fields, e.g. applications in agriculture, genetics, industry, medicine and the physical sciences, and papers on design issues (e.g. in relation to experiments, surveys or observational studies).
A deep understanding of statistical methodology is not necessary to appreciate the content. Although papers describing developments in statistical computing driven by practical examples are within its scope, the journal is not concerned with purely numerical illustrations or simulation studies. The emphasis of Series C is on case studies of statistical analyses in practice.