Linhui Li;Xiaotong Lin;Yejia Huang;Zizhen Zhang;Jian-Fang Hu
Title: Beyond Minimum-of-N: Rethinking the Evaluation and Methods of Pedestrian Trajectory Prediction
Journal: IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 12, pp. 12880-12893
DOI: 10.1109/TCSVT.2024.3439128
Publication date: 2024-08-05
Publication type: Journal Article
Impact factor: 11.1 (JCR Q1, Engineering, Electrical & Electronic)
URL: https://ieeexplore.ieee.org/document/10623470/
Citations: 0
Abstract
Pedestrian trajectory prediction is an essential task in real-world applications; it aims to predict plausible future trajectories from limited observations. In this work, we rethink the standard evaluation metric for this task: Minimum-of-N Average Displacement Error (MoN-ADE). For multi-modal prediction models that generate multiple trajectories for each pedestrian, this metric evaluates the model by considering only the trajectory closest to the ground truth. However, such a protocol cannot comprehensively evaluate the model's predictive ability, and it may encourage models to generate high-variance, dispersed trajectory distributions. This is impractical for real-world settings such as autonomous driving, which require precise and convergent trajectory predictions. To address these limitations, we design a novel metric for comprehensive evaluation of pedestrian trajectory prediction that moves beyond the traditional reliance on the closest prediction. Specifically, we replace the Minimum-of-N strategy with a Random-Sampling-K strategy to calculate the expectations of the minimum ADE, and formulate a novel metric: Area Under the Curve (AUC). Furthermore, motivated by the proposed metric, we introduce a novel objective function named K-Ensemble Loss, which guides state-of-the-art models to optimize the whole prediction distribution and reduce the uncertainty caused by high-variance predictions. Extensive experiments on three real-world datasets demonstrate that the proposed metric and objective function are effective and flexible.
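The contrast between MoN-ADE and an expectation-based metric can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulas, so everything below is an assumption made for illustration: the function names are hypothetical, the K trajectories are assumed to be drawn uniformly without replacement from the N predictions, K is assumed to range over 1..N, and the area is computed with the trapezoidal rule and divided by N - 1. The paper's own definition of AUC may differ on each of these points.

```python
from math import comb

import numpy as np


def ade_per_sample(pred, gt):
    """ADE of each predicted trajectory. pred: (N, T, 2), gt: (T, 2) -> (N,)."""
    return np.linalg.norm(pred - gt[None], axis=-1).mean(axis=-1)


def mon_ade(ades):
    """Minimum-of-N ADE: only the best of the N predictions counts."""
    return float(np.min(ades))


def expected_min_ade(ades, k):
    """Exact expectation of the minimum ADE over a random size-k subset.

    Assumption: subsets are drawn uniformly without replacement.
    With ADEs sorted ascending, the i-th value (0-indexed) is the subset
    minimum when it is chosen together with k-1 of the n-1-i larger values.
    """
    d = np.sort(np.asarray(ades, dtype=float))
    n = len(d)
    weights = np.array([comb(n - 1 - i, k - 1) for i in range(n)], dtype=float)
    return float(weights @ d / comb(n, k))


def min_ade_curve(ades):
    """Expected minimum ADE for every K in 1..N."""
    n = len(ades)
    return np.array([expected_min_ade(ades, k) for k in range(1, n + 1)])


def auc_min_ade(ades):
    """Area under the expected-min-ADE curve (assumed trapezoidal, normalised)."""
    curve = min_ade_curve(ades)
    if len(curve) == 1:
        return float(curve[0])
    area = ((curve[:-1] + curve[1:]) / 2.0).sum()
    return float(area / (len(curve) - 1))
```

Under these assumptions, K = 1 gives the mean ADE over all N predictions and K = N recovers MoN-ADE, so the curve summarises the whole predicted set rather than only its best member. Two models with the same best trajectory but different spread therefore receive the same MoN-ADE and different areas.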
About the journal:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.