Generalized Measures of Anticipation and Responsivity in Online Language Processing
Mario Giulianelli, Andreas Opedal, Ryan Cotterell
arXiv:2409.10728 (arXiv - MATH - Information Theory), 2024-09-16

We introduce a generalization of classic information-theoretic measures of
predictive uncertainty in online language processing, based on the simulation
of expected continuations of incremental linguistic contexts. Our framework
provides a formal definition of anticipatory and responsive measures, and it
equips experimenters with the tools to define new, more expressive measures
beyond standard next-symbol entropy and surprisal. While extracting these
standard quantities from language models is convenient, we demonstrate that
using Monte Carlo simulation to estimate alternative responsive and
anticipatory measures pays off empirically: New special cases of our
generalized formula exhibit enhanced predictive power compared to surprisal for
human cloze completion probability as well as ELAN, LAN, and N400 amplitudes,
and greater complementarity with surprisal in predicting reading times.
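
As a rough illustration of the idea of estimating measures by simulating continuations, the following sketch samples next symbols from a toy distribution and averages a scoring function over the samples. Everything here is an assumption made for illustration: the `TOY_LM` table, the `mc_estimate` helper, and the two scoring functions are hypothetical and are not the paper's generalized formula or its experimental setup. With the sampled symbol's own surprisal as the score, the average approximates next-symbol entropy (an anticipatory quantity, computed before the next symbol is seen); with a mismatch indicator against the observed symbol, it approximates one minus that symbol's probability (a responsive quantity, computed once the symbol is known).

```python
import math
import random

# Hypothetical toy model: p(next symbol | context), with made-up probabilities.
TOY_LM = {
    "the": {"cat": 0.5, "dog": 0.25, "idea": 0.25},
}

def surprisal(context, symbol):
    """Surprisal in bits: -log2 p(symbol | context)."""
    return -math.log2(TOY_LM[context][symbol])

def entropy(context):
    """Exact next-symbol entropy in bits, summed over the toy vocabulary."""
    return sum(p * -math.log2(p) for p in TOY_LM[context].values())

def mc_estimate(context, score, n_samples=20000, seed=0):
    """Monte Carlo average of score(sample) over simulated next symbols."""
    rng = random.Random(seed)
    symbols = list(TOY_LM[context])
    weights = [TOY_LM[context][s] for s in symbols]
    samples = rng.choices(symbols, weights=weights, k=n_samples)
    return sum(score(s) for s in samples) / n_samples

# Anticipatory example: expected surprisal of a simulated next symbol,
# which should be close to the exact entropy of the toy distribution.
anticipatory = mc_estimate("the", lambda s: surprisal("the", s))

# Responsive example: how often a simulated symbol differs from the
# symbol that was actually observed (an assumed 0/1 mismatch score).
observed = "dog"
responsive = mc_estimate("the", lambda s: 0.0 if s == observed else 1.0)

print(f"exact entropy:        {entropy('the'):.3f} bits")
print(f"MC anticipatory est.: {anticipatory:.3f} bits")
print(f"surprisal of '{observed}': {surprisal('the', observed):.3f} bits")
print(f"MC responsive est.:   {responsive:.3f}")
```

In this toy setting the exact quantities are available in closed form, so simulation is unnecessary; sampling becomes useful when the score is defined over longer continuations or uses a distance that cannot be summed over the vocabulary directly.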