Raydonal Ospina, Ranah Duarte Costa, Leandro Chaves Rêgo, Fernando Marmolejo‐Ramos
{"title":"Application of nonparametric quantifiers for online handwritten signature verification: A statistical learning approach","authors":"Raydonal Ospina, Ranah Duarte Costa, Leandro Chaves Rêgo, Fernando Marmolejo‐Ramos","doi":"10.1002/sam.11673","DOIUrl":null,"url":null,"abstract":"This work explores the use of nonparametric quantifiers in the signature verification problem of handwritten signatures. We used the MCYT‐100 (MCYT Fingerprint subcorpus) database, widely used in signature verification problems. The discrete‐time sequence positions in the <jats:italic>x</jats:italic> ‐axis and <jats:italic>y</jats:italic>‐axis provided in the database are preprocessed, and time causal information based on nonparametric quantifiers such as entropy, complexity, Fisher information, and trend are employed. The study also proposes to evaluate these quantifiers with the time series obtained, applying the first and second derivatives of each sequence position to evaluate the dynamic behavior by looking at their velocity and acceleration regimes, respectively. The signatures in the MCYT‐100 database are classified via Logistic Regression, Support Vector Machines (SVM), Random Forest, and Extreme Gradient Boosting (XGBoost). The quantifiers were used as input features to train the classifiers. To assess the ability and impact of nonparametric quantifiers to distinguish forgery and genuine signatures, we used variable selection criteria, such as: information gain, analysis of variance, and variance inflation factor. The performance of classifiers was evaluated by measures of classification error such as specificity and area under the curve. The results show that the SVM and XGBoost classifiers present the best performance.","PeriodicalId":48684,"journal":{"name":"Statistical Analysis and Data Mining","volume":"13 1","pages":""},"PeriodicalIF":2.1000,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Statistical Analysis and Data Mining","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1002/sam.11673","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
This work explores the use of nonparametric quantifiers in the signature verification problem of handwritten signatures. We used the MCYT‐100 (MCYT Fingerprint subcorpus) database, widely used in signature verification problems. The discrete‐time sequence positions in the x ‐axis and y‐axis provided in the database are preprocessed, and time causal information based on nonparametric quantifiers such as entropy, complexity, Fisher information, and trend are employed. The study also proposes to evaluate these quantifiers with the time series obtained, applying the first and second derivatives of each sequence position to evaluate the dynamic behavior by looking at their velocity and acceleration regimes, respectively. The signatures in the MCYT‐100 database are classified via Logistic Regression, Support Vector Machines (SVM), Random Forest, and Extreme Gradient Boosting (XGBoost). The quantifiers were used as input features to train the classifiers. To assess the ability and impact of nonparametric quantifiers to distinguish forgery and genuine signatures, we used variable selection criteria, such as: information gain, analysis of variance, and variance inflation factor. The performance of classifiers was evaluated by measures of classification error such as specificity and area under the curve. The results show that the SVM and XGBoost classifiers present the best performance.
期刊介绍:
Statistical Analysis and Data Mining addresses the broad area of data analysis, including statistical approaches, machine learning, data mining, and applications. Topics include statistical and computational approaches for analyzing massive and complex datasets, novel statistical and/or machine learning methods and theory, and state-of-the-art applications with high impact. Of special interest are articles that describe innovative analytical techniques, and discuss their application to real problems, in such a way that they are accessible and beneficial to domain experts across science, engineering, and commerce.
The focus of the journal is on papers which satisfy one or more of the following criteria:
Solve data analysis problems associated with massive, complex datasets
Develop innovative statistical approaches, machine learning algorithms, or methods integrating ideas across disciplines, e.g., statistics, computer science, electrical engineering, operation research.
Formulate and solve high-impact real-world problems which challenge existing paradigms via new statistical and/or computational models
Provide survey to prominent research topics.