Adelia Evangelista, Christian Acal, Ana M. Aguilera, Annalina Sarra, Tonio Di Battista, Sergio Palermi
Environmetrics, published 2024-05-03. DOI: 10.1002/env.2852 (https://doi.org/10.1002/env.2852)
High dimensional variable selection through group Lasso for multiple function‐on‐function linear regression: A case study in PM10 monitoring
Summary: Analyzing the effect of chemical and local meteorological variables on the behaviour of PM10 concentrations in the Abruzzo region (Italy), with the objective of forecasting and controlling air quality, motivates the current work. Given that the available data are curves that represent day‐to‐day variations, a multiple function‐on‐function linear regression (MFFLR) model is considered. By assuming the Karhunen‐Loève expansion, the MFFLR model can be reduced to a classical linear regression model for each principal component of the functional response in terms of all principal components (PCs) of the functional predictors. On this basis, a regularization approach for functional principal component regression that combines functional data analysis with the group Lasso is proposed. This novel methodology makes it possible to estimate the model and, simultaneously, to select the functional predictors relevant to the functional response, where each functional predictor is represented by a group of input variables derived from its PCs.
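The idea in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the data are synthetic curves on a 24-point grid, the principal component scores come from a plain SVD of the discretized curves, and the group Lasso is solved with a simple proximal-gradient loop. The function names `fpc_scores` and `group_lasso`, the penalty weight `lam`, and the choice of three PCs per predictor are all hypothetical. Only the first principal component of the response is regressed here, whereas the summary describes one such regression per response component.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_grid, n_pcs = 200, 24, 3
t = np.arange(n_grid) / n_grid
# Orthogonal Fourier basis on the grid (an illustrative choice, not from the paper).
basis = np.vstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t), np.sin(4 * np.pi * t)])


def fpc_scores(curves, n_components):
    """Scores of discretized curves (rows) on their leading principal components."""
    centered = curves - curves.mean(axis=0)
    # Right singular vectors approximate the eigenfunctions evaluated on the grid.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def group_lasso(X, y, groups, lam, n_iter=500):
    """Minimise (1/2n)||y - Xb||^2 + lam * sum_g sqrt(p_g) * ||b_g||_2
    by proximal gradient descent with block soft-thresholding."""
    n, p = X.shape
    beta = np.zeros(p)
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        z = beta - step * (X.T @ (X @ beta - y) / n)
        for g in np.unique(groups):
            idx = groups == g
            norm = np.linalg.norm(z[idx])
            thresh = step * lam * np.sqrt(idx.sum())
            # Shrink the whole group towards zero, or drop it entirely.
            z[idx] = 0.0 if norm <= thresh else (1 - thresh / norm) * z[idx]
        beta = z
    return beta


# Three synthetic functional predictors; each row is one day's curve.
predictors = [(rng.normal(size=(n_days, 3)) * [2.0, 1.0, 0.5]) @ basis for _ in range(3)]

# Synthetic functional response that depends on predictor 0 only.
beta_surface = np.outer(basis[0], basis[0])
response = predictors[0] @ beta_surface / n_grid + 0.1 * rng.normal(size=(n_days, n_grid))

# Design matrix: PC scores of every predictor, one group of columns per predictor.
X = np.hstack([fpc_scores(p, n_pcs) for p in predictors])
X = X / X.std(axis=0)  # put all scores on a common scale before penalising
groups = np.repeat(np.arange(3), n_pcs)

# Regress the first PC score of the response on all predictor scores.
y = fpc_scores(response, 1)[:, 0]
beta = group_lasso(X, y, groups, lam=0.5)

selected = [int(g) for g in np.unique(groups) if np.linalg.norm(beta[groups == g]) > 0]
print("selected functional predictors:", selected)
```

Because the penalty acts on the Euclidean norm of each block of coefficients, all PC scores of a functional predictor enter or leave the model together, which is what turns coefficient shrinkage into selection of whole curves. In practice `lam` would be tuned, for example by cross-validation, rather than fixed as it is here.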
About the journal:
Environmetrics, the official journal of The International Environmetrics Society (TIES), an Association of the International Statistical Institute, is devoted to the dissemination of high-quality quantitative research in the environmental sciences.
The journal welcomes pertinent and innovative submissions from quantitative disciplines developing new statistical and mathematical techniques, methods, and theories that solve modern environmental problems. Articles must proffer substantive, new statistical or mathematical advances to answer important scientific questions in the environmental sciences, or must develop novel or enhanced statistical methodology with clear applications to environmental science. New methods should be illustrated with recent environmental data.