Financial statement fraud undermines market integrity and imposes substantial costs on investors, regulators, and companies. Text-based detection methods have emerged as useful complements to traditional financial indicators, but many fail to incorporate domain-specific topics or sentiment cues and therefore often miss subtle changes in deceptive communication. To address this gap, this study proposes a topic-driven financial sentiment analysis (TDFSA) model that detects corporate fraud by analyzing linguistic patterns in the Management Discussion & Analysis (MD&A) sections of annual reports. Our approach uses FinBERT embeddings to capture contextual sentiment within financially relevant topics. To assess the value of these signals for fraud detection, we integrate the TDFSA outputs into a broader cost-sensitive evaluation framework. This framework combines text-based indicators with financial ratios and weighs the need to avoid false alarms against the high cost of undetected fraud. Using data on U.S. firms flagged in SEC Accounting and Auditing Enforcement Releases from 2014 to 2024, together with matched non-fraud peers, we examine trends in financial ratios, textual complexity, and sentiment dynamics in the three years preceding fraud events. The results show that models leveraging TDFSA achieve higher detection accuracy and lower misclassification cost than dictionary-based sentiment, generic topic models, and deep learning baselines.
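To make the cost-sensitive idea concrete, the following is a minimal sketch of how a decision threshold might be chosen when a missed fraud case is costlier than a false alarm. It is not the study's implementation: the function names `expected_cost` and `best_threshold`, the toy scores, and the 10:1 cost ratio are all illustrative assumptions.

```python
from typing import Sequence


def expected_cost(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float,
    cost_fn: float = 10.0,  # assumed cost of an undetected fraud (false negative)
    cost_fp: float = 1.0,   # assumed cost of a false alarm (false positive)
) -> float:
    """Average misclassification cost when flagging scores >= threshold."""
    total = 0.0
    for label, score in zip(y_true, y_score):
        flagged = score >= threshold
        if label == 1 and not flagged:
            total += cost_fn  # fraud firm that was not flagged
        elif label == 0 and flagged:
            total += cost_fp  # non-fraud firm that was flagged
    return total / len(y_true)


def best_threshold(
    y_true: Sequence[int],
    y_score: Sequence[float],
    cost_fn: float = 10.0,
    cost_fp: float = 1.0,
) -> float:
    """Return the candidate threshold with the lowest average cost."""
    # Candidates: every observed score, plus "flag nothing".
    candidates = sorted(set(y_score)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: expected_cost(y_true, y_score, t, cost_fn, cost_fp),
    )


# Toy data (hypothetical): 1 = fraud firm, 0 = matched non-fraud peer.
labels = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.5]
chosen = best_threshold(labels, scores)
```

With asymmetric costs like these, the selected threshold tends to sit lower than an accuracy-maximizing one, accepting some false alarms in order to avoid the more expensive missed fraud cases.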