Randomized Spline Trees for Functional Data Classification: Theory and Application to Environmental Time Series
Donato Riccio, Fabrizio Maturo, Elvira Romano
arXiv - STAT - Methodology, published 2024-09-12, DOI: arxiv-2409.07879
Abstract
Functional data analysis (FDA) and ensemble learning can be powerful tools
for analyzing complex environmental time series. Recent literature has
highlighted the key role of diversity in enhancing accuracy and reducing
variance in ensemble methods. This paper introduces Randomized Spline Trees
(RST), a novel algorithm that bridges these two approaches by incorporating
randomized functional representations into the Random Forest framework. RST
generates diverse functional representations of input data using randomized
B-spline parameters, creating an ensemble of decision trees trained on these
varied representations. We provide a theoretical analysis of how this
functional diversity contributes to reducing generalization error and present
empirical evaluations on six environmental time series classification tasks
from the UCR Time Series Archive. Results show that RST variants outperform
standard Random Forests and Gradient Boosting on most datasets, improving
classification accuracy by up to 14%. The success of RST demonstrates the
potential of adaptive functional representations in capturing complex temporal
patterns in environmental data. This work contributes to the growing field of
machine learning techniques focused on functional data and opens new avenues
for research in environmental time series analysis.
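
The idea summarized in the abstract (each tree sees its own randomized B-spline representation of the curves, and the trees vote) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say which B-spline parameters are randomized, so the sketch assumes the number of basis functions and the spline degree are drawn at random per tree, that each tree is trained on a bootstrap sample, and that predictions are combined by majority vote. The names `bspline_basis`, `RandomizedSplineTreesSketch`, and `make_toy_curves` are hypothetical, and the toy data are synthetic.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def bspline_basis(x, n_basis, degree):
    """Clamped, uniform-knot B-spline basis on [0, 1] (Cox-de Boor recursion)."""
    n_inner = n_basis - degree - 1  # requires n_basis >= degree + 1
    inner = np.linspace(0.0, 1.0, n_inner + 2)[1:-1]
    knots = np.concatenate([np.zeros(degree + 1), inner, np.ones(degree + 1)])
    # Degree 0: indicator of each half-open knot interval.
    B = np.zeros((len(x), len(knots) - 1))
    for j in range(len(knots) - 1):
        B[:, j] = (knots[j] <= x) & (x < knots[j + 1])
    B[x == 1.0, n_basis - 1] = 1.0  # close the last non-empty interval at x = 1
    # Raise the degree one step at a time, skipping zero-width knot spans.
    for d in range(1, degree + 1):
        B_new = np.zeros((len(x), len(knots) - 1 - d))
        for j in range(len(knots) - 1 - d):
            left = knots[j + d] - knots[j]
            right = knots[j + d + 1] - knots[j + 1]
            if left > 0:
                B_new[:, j] += (x - knots[j]) / left * B[:, j]
            if right > 0:
                B_new[:, j] += (knots[j + d + 1] - x) / right * B[:, j + 1]
        B = B_new
    return B  # shape (len(x), n_basis)


class RandomizedSplineTreesSketch:
    """Illustrative ensemble: one random B-spline representation per tree."""

    def __init__(self, n_trees=25, basis_range=(5, 15), degrees=(1, 2, 3), seed=0):
        self.n_trees = n_trees
        self.basis_range = basis_range  # assumed randomized parameter
        self.degrees = degrees          # assumed randomized parameter
        self.rng = np.random.default_rng(seed)

    @staticmethod
    def _coefficients(X, n_basis, degree):
        # Least-squares B-spline coefficients of every curve (rows of X).
        grid = np.linspace(0.0, 1.0, X.shape[1])
        B = bspline_basis(grid, n_basis, degree)
        coef, *_ = np.linalg.lstsq(B, X.T, rcond=None)
        return coef.T  # shape (n_samples, n_basis)

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.members_ = []
        lo, hi = self.basis_range
        for _ in range(self.n_trees):
            degree = int(self.rng.choice(self.degrees))
            n_basis = int(self.rng.integers(max(lo, degree + 1), hi + 1))
            idx = self.rng.integers(0, len(X), len(X))  # bootstrap sample
            tree = DecisionTreeClassifier(random_state=int(self.rng.integers(1 << 31)))
            tree.fit(self._coefficients(X[idx], n_basis, degree), y[idx])
            self.members_.append((n_basis, degree, tree))
        return self

    def predict(self, X):
        votes = np.stack([tree.predict(self._coefficients(X, nb, d))
                          for nb, d, tree in self.members_])
        # Majority vote: count votes per class for every sample.
        counts = (votes[None, :, :] == self.classes_[:, None, None]).sum(axis=1)
        return self.classes_[counts.argmax(axis=0)]


def make_toy_curves(n_per_class=40, n_points=60, seed=0):
    """Synthetic two-class curves: noisy sine waves with different frequencies."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n_points)
    X0 = np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal((n_per_class, n_points))
    X1 = np.sin(4 * np.pi * t) + 0.3 * rng.standard_normal((n_per_class, n_points))
    return np.vstack([X0, X1]), np.repeat([0, 1], n_per_class)
```

A call such as `RandomizedSplineTreesSketch(n_trees=15).fit(X_train, y_train).predict(X_test)` would then classify held-out curves. Because every tree works on a different coefficient vector, the trees disagree in different places, which is the kind of functional diversity the abstract credits with reducing generalization error.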