
Latest Publications from the Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Quantifiers in a Multimodal World: Hallucinating Vision with Language and Sound
Alberto Testoni, Sandro Pezzelle, R. Bernardi
Inspired by the literature on multisensory integration, we develop a computational model to ground quantifiers in perception. The model learns to pick, out of nine quantifiers (‘few’, ‘many’, ‘all’, etc.), the one most likely to describe the percentage of animals in a visual-auditory input containing both animals and artifacts. We show that relying on concurrent sensory inputs increases model performance on the quantification task. Moreover, we evaluate the model in a situation in which only the auditory modality is given, while the visual one is ‘hallucinated’ either from the auditory input itself or from a linguistic caption describing the quantity of entities in the auditory input. This way, the model exploits prior associations between modalities. We show that the model profits from the prior knowledge and outperforms the auditory-only setting.
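The quantification task above can be pictured as mapping an observed proportion of animals to a quantifier label. A minimal sketch in Python, with the caveat that the quantifier inventory and proportion ranges below are illustrative placeholders, not the paper's learned nine-way model:

```python
# Minimal sketch: pick the candidate quantifier whose (hypothetical)
# proportion range best matches the fraction of animals in a scene.
# The inventory and ranges are illustrative, not the paper's model.
QUANTIFIER_RANGES = {
    "none": (0.0, 0.0),
    "few": (0.0, 0.25),
    "some": (0.15, 0.5),
    "many": (0.5, 0.9),
    "most": (0.6, 0.95),
    "all": (1.0, 1.0),
}

def pick_quantifier(n_animals: int, n_artifacts: int) -> str:
    """Return the quantifier whose range midpoint is closest to the
    observed proportion of animals among all entities."""
    total = n_animals + n_artifacts
    p = n_animals / total if total else 0.0

    def distance(item):
        lo, hi = item[1]
        return abs(p - (lo + hi) / 2)

    return min(QUANTIFIER_RANGES.items(), key=distance)[0]
```

In the paper this mapping is learned from multimodal input rather than hard-coded, but the input-output contract is the same: a scene in, one of a fixed set of quantifiers out.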
DOI: 10.18653/v1/W19-2912
Citations: 6
Less Descriptive yet Discriminative: Quantifying the Properties of Multimodal Referring Utterances via CLIP
DOI: 10.18653/v1/2022.cmcl-1.4
Ece Takmaz, Sandro Pezzelle, R. Fernández
In this work, we use a transformer-based pre-trained multimodal model, CLIP, to shed light on the mechanisms employed by human speakers when referring to visual entities. In particular, we use CLIP to quantify the degree of descriptiveness (how well an utterance describes an image in isolation) and discriminativeness (to what extent an utterance is effective in picking out a single image among similar images) of human referring utterances within multimodal dialogues. Overall, our results show that utterances become less descriptive over time while their discriminativeness remains unchanged. Through analysis, we propose that this trend could be due to participants relying on the previous mentions in the dialogue history, as well as being able to distill the most discriminative information from the visual context. In general, our study opens up the possibility of using this and similar models to quantify patterns in human data and shed light on the underlying cognitive mechanisms.
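The two quantities the abstract defines can be operationalized over precomputed CLIP-style embeddings. A stdlib sketch, assuming descriptiveness is utterance-target similarity and discriminativeness is the margin over the best distractor (one plausible reading of the abstract, not necessarily the paper's exact metric):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def descriptiveness(utt_emb, target_emb):
    """How well the utterance describes the target image in isolation."""
    return cosine(utt_emb, target_emb)

def discriminativeness(utt_emb, target_emb, distractor_embs):
    """Margin by which the utterance prefers the target image over the
    best-matching distractor among similar images."""
    best_distractor = max(cosine(utt_emb, d) for d in distractor_embs)
    return cosine(utt_emb, target_emb) - best_distractor
```

A positive margin means the utterance would pick out the target; tracking these two scores over dialogue turns is what reveals the "less descriptive, equally discriminative" trend.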
Citations: 5
The Role of Utterance Boundaries and Word Frequencies for Part-of-speech Learning in Brazilian Portuguese Through Distributional Analysis
Pablo Picasso Feliciano de Faria
In this study, we address the problem of part-of-speech (or syntactic category) learning during language acquisition through distributional analysis of utterances. A model based on Redington et al.’s (1998) distributional learner is used to investigate the informativeness of distributional information in Brazilian Portuguese (BP). The data provided to the learner comes from two publicly available corpora of child directed speech. We present preliminary results from two experiments. The first one investigates the effects of different assumptions about utterance boundaries when presenting the input data to the learner. The second experiment compares the learner’s performance when counting contextual words’ frequencies versus just acknowledging their co-occurrence with a given target word. In general, our results indicate that explicit boundaries are more informative, frequencies are important, and that distributional information is useful to the child as a source of categorial information. These results are in accordance with Redington et al.’s findings for English.
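The abstract's second experiment contrasts counting contextual word frequencies with merely flagging co-occurrence. A minimal sketch of a Redington-style context-vector builder, with the caveat that window size and representation details here are simplified assumptions, not the paper's configuration:

```python
from collections import Counter, defaultdict

def context_vectors(utterances, window=1, binary=False):
    """Build distributional context vectors: for each word, count (or,
    if binary=True, merely flag) the words appearing within `window`
    positions. Context never crosses an utterance boundary, mirroring
    the 'explicit boundaries' condition."""
    vecs = defaultdict(Counter)
    for utt in utterances:
        for i, w in enumerate(utt):
            lo, hi = max(0, i - window), min(len(utt), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vecs[w][utt[j]] += 1
    if binary:
        for w in vecs:
            for c in vecs[w]:
                vecs[w][c] = 1
    return vecs
```

Words with similar context vectors can then be clustered into candidate syntactic categories; the frequency-vs-binary switch isolates the contribution the paper tests.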
DOI: 10.18653/v1/W19-2917
Citations: 2
HkAmsters at CMCL 2022 Shared Task: Predicting Eye-Tracking Data from a Gradient Boosting Framework with Linguistic Features
DOI: 10.18653/v1/2022.cmcl-1.13
Lavinia Salicchi, Rong Xiang, Yu-Yin Hsu
Eye movement data are used in psycholinguistic studies to infer information regarding cognitive processes during reading. In this paper, we describe our proposed method for the Shared Task of Cognitive Modeling and Computational Linguistics (CMCL) 2022 - Subtask 1, which involves data from multiple datasets in six languages. We compared different regression models using features of the target word and its previous word, and target word surprisal, as regression features. Our final system, using a gradient boosting regressor, achieved the lowest mean absolute error (MAE), making it the best system of the competition.
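Two ingredients the abstract names, surprisal as a regression feature and MAE as the evaluation metric, are easy to pin down; the gradient boosting regressor itself is not reimplemented here. A stdlib sketch:

```python
import math

def surprisal(prob: float) -> float:
    """Surprisal of a word in bits: -log2 of its in-context probability.
    Used as a regression feature alongside word-level properties."""
    return -math.log2(prob)

def mean_absolute_error(y_true, y_pred):
    """The shared task's evaluation metric: mean of |target - prediction|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

A rare word (low probability) gets high surprisal, which tends to predict longer reading times, which is precisely why it helps as a feature for eye-tracking regression.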
Citations: 4
Priming vs. Inhibition of Optional Infinitival “to”
R. Melnick, T. Wasow
The word “to” that precedes verbs in English infinitives is optional in at least two environments: in what Wasow et al. (2015) previously called the “do-be” construction, and in the complement of “help”, which we explore in the present work. In the “do-be” construction, Wasow et al. found that a preceding infinitival “to” increases the use of a following optional “to”, but the use of “to” in the complement of “help” is reduced following “to help”. We examine two hypotheses regarding why the same function word is primed by prior use in one construction and inhibited in another. We then test predictions made by the two hypotheses, finding support for one of them.
DOI: 10.18653/v1/W19-2902
Citations: 3
A Modeling Study of the Effects of Surprisal and Entropy in Perceptual Decision Making of an Adaptive Agent
Pyeong Whan Cho, Richard L. Lewis
Processing difficulty in online language comprehension has been explained in terms of surprisal and entropy reduction. Although both hypotheses have been supported by experimental data, we do not fully understand their relative contributions to processing difficulty. To develop a better understanding, we propose a mechanistic model of perceptual decision making that interacts with a simulated task environment with temporal dynamics. The proposed model collects noisy bottom-up evidence over multiple timesteps, integrates it with its top-down expectation, and makes perceptual decisions, producing processing time data directly without relying on any linking hypothesis. Temporal dynamics in the task environment were determined by a simple finite-state grammar, which was designed to create situations where the surprisal and entropy reduction hypotheses predict different patterns. After the model was trained to maximize rewards, it developed an adaptive policy, and both surprisal and entropy effects were observed, especially in a measure reflecting earlier processing.
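The two competing difficulty measures the abstract contrasts have standard definitions over next-word probability distributions. A stdlib sketch (the zero-clipping of entropy reduction follows the usual formulation of that hypothesis):

```python
import math

def surprisal(dist, word):
    """Surprisal (bits) of observing `word` under distribution `dist`,
    a dict mapping words to probabilities."""
    return -math.log2(dist[word])

def entropy(dist):
    """Shannon entropy (bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def entropy_reduction(dist_before, dist_after):
    """Entropy reduction: the drop in uncertainty about upcoming material
    after processing the current word, clipped at zero."""
    return max(0.0, entropy(dist_before) - entropy(dist_after))
```

The finite-state grammar in the paper is built so these two quantities diverge: a word can be unsurprising yet sharply reduce entropy, or vice versa, letting the agent's behavior adjudicate between them.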
DOI: 10.18653/v1/W19-2906
Citations: 0
About Time: Do Transformers Learn Temporal Verbal Aspect?
DOI: 10.18653/v1/2022.cmcl-1.10
Eleni (Lena) Metheniti, Tim Van de Cruys, Nabil Hathout
Aspect is a linguistic concept that describes how an action, event, or state of a verb phrase is situated in time. In this paper, we explore whether different transformer models are capable of identifying aspectual features. We focus on two specific aspectual features: telicity and duration. Telicity marks whether the verb’s action or state has an endpoint or not (telic/atelic), and duration denotes whether a verb expresses an action (dynamic) or a state (stative). These features are integral to the interpretation of natural language, but also hard to annotate and identify with NLP methods. We perform experiments in English and French, and our results show that transformer models adequately capture information on telicity and duration in their vectors, even in their non-finetuned forms, but are somewhat biased with regard to verb tense and word order.
Citations: 4
Modeling Long-Distance Cue Integration in Spoken Word Recognition
Wednesday Bushong, T. Jaeger
Cues to linguistic categories are distributed across the speech signal. Optimal categorization thus requires that listeners maintain gradient representations of incoming input in order to integrate that information with later cues. There is now evidence that listeners can and do integrate cues that occur far apart in time. Computational models of this integration have however been lacking. We take a first step at addressing this gap by mathematically formalizing four models of how listeners may maintain and use cue information during spoken language understanding and test them on two perception experiments. In one experiment, we find support for rational integration of cues at long distances. In a second, more memory and attention-taxing experiment, we find evidence in favor of a switching model that avoids maintaining detailed representations of cues in memory. These results are a first step in understanding what kinds of mechanisms listeners use for cue integration under different memory and attentional constraints.
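Two of the contrasting listener strategies described above can be caricatured in a few lines. This is a simplified stand-in for the paper's four formal models, under the assumptions that "optimal" integration sums independent cues' log-odds and that a "switching" listener commits on the first cue alone:

```python
import math

def logit(p):
    """Log-odds of probability p."""
    return math.log(p / (1 - p))

def sigmoid(x):
    """Inverse of logit."""
    return 1 / (1 + math.exp(-x))

def integrate(cue_probs):
    """'Rational integration' caricature: combine independent cues by
    summing their log-odds for category A (ideal-observer combination),
    which requires maintaining every cue in memory."""
    return sigmoid(sum(logit(p) for p in cue_probs))

def switch(cue_probs, threshold=0.5):
    """'Switching' caricature: categorize from the first cue alone and
    ignore later evidence, avoiding detailed maintenance of cues."""
    return 1.0 if cue_probs[0] > threshold else 0.0
```

The paper's finding maps onto this contrast: behavior resembling `integrate` in the first experiment, and behavior closer to `switch` when memory and attention are taxed.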
DOI: 10.18653/v1/W19-2907
Citations: 2
Eye Gaze and Self-attention: How Humans and Transformers Attend Words in Sentences
DOI: 10.18653/v1/2022.cmcl-1.9
Joshua Bensemann, A. Peng, Diana Benavides Prado, Yang Chen, N. Tan, P. Corballis, Patricia Riddle, Michael Witbrock
Attention describes cognitive processes that are important to many human phenomena including reading. The term is also used to describe the way in which transformer neural networks perform natural language processing. While attention appears to be very different under these two contexts, this paper presents an analysis of the correlations between transformer attention and overt human attention during reading tasks. An extensive analysis of human eye tracking datasets showed that the dwell times of human eye movements were strongly correlated with the attention patterns occurring in the early layers of pre-trained transformers such as BERT. Additionally, the strength of a correlation was not related to the number of parameters within a transformer. This suggests that something about the transformers’ architecture determined how closely the two measures were correlated.
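The core measurement behind this kind of study is a correlation between per-token human dwell times and per-token attention weights. A stdlib sketch of Pearson's r (the abstract does not specify which correlation coefficient the paper uses, so take this as illustrative):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences, e.g.
    per-token human dwell times vs. per-token attention weights
    from one layer of a transformer."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Computing this per layer is what lets the paper localize the human-like pattern to early layers of models such as BERT.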
Citations: 3
Visually Grounded Interpretation of Noun-Noun Compounds in English
DOI: 10.18653/v1/2022.cmcl-1.3
Inga Lang, Lonneke van der Plas, M. Nissim, Albert Gatt
Noun-noun compounds (NNCs) occur frequently in the English language. Accurate NNC interpretation, i.e. determining the implicit relationship between the constituents of a NNC, is crucial for the advancement of many natural language processing tasks. Until now, computational NNC interpretation has been limited to approaches involving linguistic representations only. However, much research suggests that grounding linguistic representations in vision or other modalities can increase performance on this and other tasks. Our work is a novel comparison of linguistic and visuo-linguistic representations for the task of NNC interpretation. We frame NNC interpretation as a relation classification task, evaluating on a large, relationally-annotated NNC dataset. We combine distributional word vectors with image vectors to investigate how visual information can help improve NNC interpretation systems. We find that adding visual vectors increases classification performance on our dataset in many cases.
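Framing NNC interpretation as relation classification means building one feature vector per compound from its constituents' embeddings, optionally augmented with image embeddings. Concatenation is one simple combination scheme; the abstract does not say which scheme the paper uses, so the sketch below is an assumption:

```python
def multimodal_feature(head_vec, mod_vec, image_vec=None):
    """Feature vector for NNC relation classification: concatenate the
    modifier and head word vectors (e.g. for 'olive oil': 'olive' then
    'oil'), optionally appending an image vector for the
    visuo-linguistic setting."""
    feat = list(mod_vec) + list(head_vec)
    if image_vec is not None:
        feat += list(image_vec)
    return feat
```

A relation classifier is then trained on these vectors against the dataset's relation labels; comparing accuracy with and without `image_vec` is the experiment the abstract reports.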
Citations: 0