Feilong Wang, Xin Wang, Yuan Hong, R. Tyrrell Rockafellar, Xuegang (Jeff) Ban
Data poisoning attacks on traffic state estimation and prediction
Transportation Research Part C: Emerging Technologies, Volume 168, Article 104577. Published 2024-11-01. DOI: 10.1016/j.trc.2024.104577
https://www.sciencedirect.com/science/article/pii/S0968090X24000986
Citations: 0
Abstract
Data, including vehicular and infrastructure-generated data, has become ubiquitous in transportation. The growing reliance on data exposes transportation systems to cybersecurity risks, among which so-called “data poisoning” attacks by adversaries are becoming increasingly critical. Such attacks aim to compromise a system’s performance by adding systematic, malicious noise, perturbations, or deviations to the dataset the system uses. Formal investigation of data poisoning attacks is essential for understanding them and for developing effective defenses. This study develops a general data poisoning attack model for traffic state estimation and prediction (TSEP), a basic application in transportation. We first formulate data poisoning attacks as a general sensitivity analysis of parameterized optimization problems with respect to parameter changes (i.e., data perturbations) and study the Lipschitz continuity of the solution in the presence of general (equality and inequality) constraints. We then develop attack models that fit a broader spectrum of learning applications (such as TSEP) by extending existing models, widely used in the cybersecurity field, that consider only learning problems with no constraints or with equality constraints. Because the solution of such general problems is often continuous but not differentiable with respect to data changes, we apply the generalized implicit function theorem to compute the semi-derivatives that express how the TSEP solution responds to data perturbations. The semi-derivatives enable us to evaluate the vulnerability of TSEP models (at each data point) and to solve the proposed attack model. We demonstrate the generality and effectiveness of the proposed method on two TSEP models using mobile sensing data.
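The abstract's point that a constrained solution can be continuous in the data yet not differentiable can be seen in a one-dimensional toy problem. The sketch below is an illustration only and is not the paper's TSEP model or attack formulation: the problem min_x 0.5*(x - d)^2 subject to x >= 0, and the names `solve` and `semi_derivative`, are assumptions introduced here. Its solution is x*(d) = max(d, 0), which is Lipschitz continuous in the data d but has a kink at d = 0, so only one-sided directional derivatives (semi-derivatives) exist there.

```python
def solve(d):
    """Closed-form solution of the toy problem
    min_x 0.5 * (x - d)**2  subject to  x >= 0,
    where d plays the role of a single (possibly poisoned) data point."""
    return max(d, 0.0)


def semi_derivative(d, direction, h=1e-6):
    """One-sided finite-difference estimate of how the solution moves
    when the data point d is perturbed along `direction`."""
    return (solve(d + h * direction) - solve(d)) / h


if __name__ == "__main__":
    for d in (-1.0, 0.0, 1.0):
        up = semi_derivative(d, +1.0)    # response to increasing d
        down = semi_derivative(d, -1.0)  # response to decreasing d
        print(f"d={d:+.1f}  direction +1: {up:.3f}  direction -1: {down:.3f}")
```

At d = 0 the response is 1 in the direction +1 and 0 in the direction -1. An ordinary derivative would require the second value to be the negative of the first, so the classical implicit function theorem does not apply at that point, while the directional responses are still well defined. Away from the kink (d = 1 or d = -1) the two directions are consistent with an ordinary derivative of 1 or 0, respectively.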
About the journal:
Transportation Research: Part C (TR_C) is dedicated to showcasing high-quality, scholarly research that delves into the development, applications, and implications of transportation systems and emerging technologies. Our focus lies not solely on individual technologies, but rather on their broader implications for the planning, design, operation, control, maintenance, and rehabilitation of transportation systems, services, and components. In essence, the intellectual core of the journal revolves around the transportation aspect rather than the technology itself. We actively encourage the integration of quantitative methods from diverse fields such as operations research, control systems, complex networks, computer science, and artificial intelligence. Join us in exploring the intersection of transportation systems and emerging technologies to drive innovation and progress in the field.