Tone or term: Machine-learning text analysis, featured vocabulary extraction, and evidence from bond pricing in China
Yueqian Peng, Li Shi, Xiaojun Shi, Songtao Tan
Journal of Empirical Finance, vol. 78, Article 101534. Published 2024-08-22. DOI: 10.1016/j.jempfin.2024.101534
https://www.sciencedirect.com/science/article/pii/S0927539824000690
We apply the machine-learning technique proposed by Zhou et al. (2024) to analyze credit rating reports in China’s bond markets, identifying featured vocabulary and generating text analysis scores. Compared with traditional bag-of-words text analysis, the evidence suggests three advantages of machine-learning scoring. First, it covers featured vocabulary that compensates for missing information; second, it reduces misclassification of words’ sentiments; third, it mitigates the problem of equal weighting inherent in the bag-of-words method. Our findings indicate that the featured vocabulary neglected in the bag-of-words method plays a crucial role in text analysis and significantly contributes to bond pricing. Additionally, we find that machine-learning text analysis can address AAA rating inflation within China’s bond markets to some extent. In contrast, the bag-of-words method exhibits limited efficacy in mitigating this issue.
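To make the equal-weighting point concrete, the following is a minimal Python sketch, not the authors' method or the Zhou et al. (2024) technique. It contrasts a dictionary-based bag-of-words tone score, where every listed word counts the same, with a score that applies per-term weights and includes a term missing from the dictionary. The word lists, the weights, the sample sentence, and the function names `bag_of_words_tone` and `weighted_score` are all hypothetical, invented purely for illustration.

```python
from collections import Counter

# Hypothetical sentiment dictionary (illustration only, not from the paper).
POSITIVE = {"stable", "strong", "growth"}
NEGATIVE = {"default", "decline", "risk"}

# Hypothetical per-term weights, standing in for weights a model might learn.
# "guarantee" is a featured term that the dictionary above does not contain.
TERM_WEIGHTS = {
    "stable": 0.2, "strong": 0.5, "growth": 0.3,
    "default": -1.5, "decline": -0.6, "risk": -0.4,
    "guarantee": 0.8,
}


def bag_of_words_tone(tokens):
    """Equal-weighted tone: (positive count - negative count) / token count."""
    if not tokens:
        return 0.0
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return (pos - neg) / len(tokens)


def weighted_score(tokens, weights):
    """Weighted score: sum of (term weight * term count) / token count."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = sum(weights.get(term, 0.0) * n for term, n in counts.items())
    return total / len(tokens)


# Hypothetical tokenized sentence from a rating report.
report = "strong growth but default risk offset by guarantee".split()

print("bag-of-words tone:", bag_of_words_tone(report))
print("weighted score:   ", weighted_score(report, TERM_WEIGHTS))
```

In this toy sentence the two positive and two negative dictionary words cancel, so the equal-weighted tone is zero, while the weighted score is nonzero because "default" carries a larger weight and "guarantee" is counted at all. How the actual weights and featured vocabulary are obtained is described in the paper, not here.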
About the journal:
The Journal of Empirical Finance is a financial economics journal whose aim is to publish high quality articles in empirical finance. Empirical finance is interpreted broadly to include any type of empirical work in financial economics, financial econometrics, and also theoretical work with clear empirical implications, even when there is no empirical analysis. The Journal welcomes articles in all fields of finance, such as asset pricing, corporate finance, financial econometrics, banking, international finance, microstructure, behavioural finance, etc. The Editorial Team is willing to take risks on innovative research, controversial papers, and unusual approaches. We are also particularly interested in work produced by young scholars. The composition of the editorial board reflects such goals.