While AI technologies have garnered widespread attention for their revolutionary text generation capabilities, concerns have arisen regarding the risks associated with AI-generated text (AIGT), especially when it is used maliciously. Motivated by the recognition that AIGT is generated by selecting high-probability tokens, a process that inherently differs from the biologically based thought processes underlying human-written text (HWT), we trace and build upon theories of the language latent level to explore the fundamental differences between AIGT and HWT, particularly in terms of potentiality, logicality, and complexity. We propose a novel method, LA2HDetect, for automatic AIGT detection. Specifically, we find that HWT exhibits higher potentiality than AIGT, while AIGT and HWT each possess distinct characteristics in terms of logicality and complexity. These human-AI differences collectively form the decision-making mechanism of LA2HDetect. Extensive experiments on general-domain datasets confirm the competitiveness and robustness of LA2HDetect, which outperforms existing methods. In addition, we evaluate the extensibility of LA2HDetect in multiple vertical domains and explore insights across progressively more advanced AI models.
