
Dialogue and Discourse: Latest Articles

GailBot: An automatic transcription system for Conversation Analysis
Q1 Arts and Humanities Pub Date : 2022-04-29 DOI: 10.5210/dad.2022.103
Muhammad Umair, Julia Beret Mertens, Saul Albert, J. D. Ruiter
Researchers studying human interaction, such as conversation analysts, psychologists, and linguists, all rely on detailed transcriptions of language use. Ideally, these should include so-called paralinguistic features of talk, such as overlaps, prosody, and intonation, as they convey important information. However, creating conversational transcripts that include these features by hand requires substantial amounts of time by trained transcribers. There are currently no Speech to Text (STT) systems that are able to integrate these features in the generated transcript. To reduce the resources needed to create detailed conversation transcripts that include representation of paralinguistic features, we developed a program called GailBot. GailBot combines STT services with plugins to automatically generate first drafts of transcripts that largely follow the transcription standards common in the field of Conversation Analysis. It also enables researchers to add new plugins to transcribe additional features, or to improve the plugins it currently uses. We describe GailBot’s architecture and its use of computational heuristics and machine learning. We also evaluate its output in relation to transcripts produced by both human transcribers and comparable automated transcription systems. We argue that despite its limitations, GailBot represents a substantial improvement over existing dialogue transcription software.
Pages: 63-95
Citations: 7
When to Say What and How: Adapting the Elaborateness and Indirectness of Spoken Dialogue Systems
Q1 Arts and Humanities Pub Date : 2022-04-11 DOI: 10.5210/dad.2022.101
Juliana Miehle, W. Minker, Stefan Ultes
With the aim of designing a spoken dialogue system which has the ability to adapt to the user's communication idiosyncrasies, we investigate whether it is possible to carry over insights from the usage of communication styles in human-human interaction to human-computer interaction. In an extensive literature review, it is demonstrated that communication styles play an important role in human communication. Using a multi-lingual data set, we show that there is a significant correlation between the communication style of the system and the preceding communication style of the user. This is why two components that extend the standard architecture of spoken dialogue systems are presented: 1) a communication style classifier that automatically identifies the user communication style and 2) a communication style selection module that selects an appropriate system communication style. We consider the communication styles elaborateness and indirectness, as it has been shown that they influence the user's satisfaction and the user's perception of a dialogue. We present a neural classification approach based on supervised learning for each task. Neural networks are trained and evaluated with features that can be automatically derived during an ongoing interaction in every spoken dialogue system. It is shown that both components yield solid results and outperform the baseline, a majority-class classifier.
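As an illustration of the majority-class baseline mentioned in the abstract, the following is a minimal sketch of how such a baseline is commonly built: it always predicts the most frequent training label, so any useful classifier should beat its accuracy. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_class_baseline(train_labels):
    """Build a classifier that always predicts the most frequent training label."""
    most_common_label, _count = Counter(train_labels).most_common(1)[0]
    # The returned function ignores its input features entirely.
    return lambda _features: most_common_label


def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


# Hypothetical toy labels for a communication-style task (not data from the paper).
train = ["direct", "direct", "indirect", "direct"]
test = ["direct", "indirect", "direct", "direct"]

baseline = majority_class_baseline(train)
baseline_accuracy = accuracy([baseline(None) for _ in test], test)
print(baseline_accuracy)
```

A trained classifier is only informative if its accuracy on the same test labels exceeds this value, which is why skewed label distributions make the baseline a demanding reference point.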
Pages: 1-40
Citations: 2
An Analysis of Japanese Sentence-final Particle Yone: Compare Yone and Ne in Response
Q1 Arts and Humanities Pub Date : 2021-12-18 DOI: 10.5210/dad.2021.206
Jun Xu
Yone, a Japanese sentence-final particle (SFP), is frequently used in conversation, and some functions overlap with ne, another SFP. However, not much discussion has taken place about their differences. This study argues that the two Japanese sentence-final particles, yone and ne, express a distinction about the speaker's state of mind: yone indicates that an idea has been on the speaker's mind, while ne suggests a thought just emerged into the speaker's awareness. Naturally occurring conversation data provides evidence for this claim. The results show that the particles reflect the speaker's choice of presenting his/her state of awareness.
Pages: 174-191
Citations: 0
Can we Fix the Scope for Coreference? Problems and Solutions for Benchmarks beyond OntoNotes
Q1 Arts and Humanities Pub Date : 2021-12-17 DOI: 10.5210/dad.2022.102
Amir Zeldes
Current work on automatic coreference resolution has focused on the OntoNotes benchmark dataset, due to both its size and consistency. However, many aspects of the OntoNotes annotation scheme are not well understood by NLP practitioners, including the treatment of generic NPs, noun modifiers, indefinite anaphora, predication and more. These often lead to counterintuitive claims, results and system behaviors. This opinion piece aims to highlight some of the problems with the OntoNotes rendition of coreference, and to propose a way forward relying on three principles: 1. a focus on semantics, not morphosyntax; 2. cross-linguistic generalizability; and 3. a separation of identity and scope, which can resolve old problems involving temporal and modal domain consistency.
Pages: 41-62
Citations: 5
Lexical Alignment to Non-native Speakers
Q1 Arts and Humanities Pub Date : 2021-10-19 DOI: 10.5210/dad.2021.205
I. Ivanova, H. Branigan, Janet McLean, Albert Costa, M. Pickering
Two picture-matching-game experiments investigated if lexical-referential alignment to non-native speakers is enhanced by a desire to aid communicative success (by saying something the conversation partner can certainly understand), a form of audience design. In Experiment 1, a group of native speakers of British English that was not given evidence of their conversation partners’ picture-matching performance showed more alignment to non-native than to native speakers, while another group that was given such evidence aligned equivalently to the two types of speaker. Experiment 2, conducted with speakers of Castilian Spanish, replicated the greater alignment to non-native than native speakers without feedback. However, Experiment 2 also showed that production of grammatical errors by the confederate produced no additional increase of alignment even though making errors suggests lower communicative competence. We suggest that this pattern is consistent with another collaborative strategy, the desire to model correct usage. Together, these results support a role for audience design in alignment to non-native speakers in structured task-based dialogue, but one that is strategically deployed only when deemed necessary.
Pages: 145-173
Citations: 2
Narrative Elements in Expository Texts: A Corpus Study of Educational Textbooks
Q1 Arts and Humanities Pub Date : 2021-10-12 DOI: 10.5210/dad.2021.204
N. Sangers, J. Evers-Vermeul, T. Sanders, H. Hoeken
While the use of narrative elements in educational texts seems to be an adequate means to enhance students’ engagement and comprehension, we know little about how and to what extent these elements are used in present-day educational practice. In this quantitative corpus-based analysis, we chart how and when narrative elements are used in current Dutch educational texts (N=999). While educational texts have traditionally been considered prime exemplars of expository texts, we show that the distinction between the expository and narrative genre is not that strict in the educational domain: prototypical narrative elements – particularized events, experiencing characters, and landscapes of consciousness – occur in 45% of the corpus’ texts. Their distribution varies between school subjects: while specific events, specific people, and their experiences are often at the heart of the to-be-learned information in history texts, narrativity is less present in the educational content of biology and geography texts. Instead, publishers employ narrative-like strategies to make these texts more concrete and imaginable, such as the addition of fictitious characters and representative entities.
Pages: 115-144
Citations: 1
User Satisfaction Reward Estimation Across Domains: Domain-independent Dialogue Policy Learning
Q1 Arts and Humanities Pub Date : 2021-09-28 DOI: 10.5210/dad.2021.203
Stefan Ultes, Wolfgang Maier
Learning suitable and well-performing dialogue behaviour in statistical spoken dialogue systems has been a focus of research for many years. While most work that is based on reinforcement learning employs an objective measure like task success for modelling the reward signal, we propose to use a reward signal based on user satisfaction. We propose a novel estimator and show that it outperforms all previous estimators while learning temporal dependencies implicitly. We show in simulated experiments that a live user satisfaction estimation model may be applied, resulting in higher estimated satisfaction whilst achieving similar success rates. Moreover, we show that a satisfaction estimation model trained on one domain may be applied in many other domains that cover a similar task. We verify our findings by applying the model to one of the domains to learn a policy from real users and comparing its performance to policies using user satisfaction and task success acquired directly from the users as reward.
Pages: 81-114
Citations: 2
Automatic Essay Scoring Systems Are Both Overstable And Oversensitive: Explaining Why And Proposing Defenses
Q1 Arts and Humanities Pub Date : 2021-09-24 DOI: 10.5210/dad.2023.101
Yaman Kumar Singla, Swapnil Parekh, Somesh Singh, J. Li, R. Shah, Changyou Chen
Deep-learning-based Automatic Essay Scoring (AES) systems are being actively used in various high-stakes applications in education and testing. However, little research has been devoted to understanding and interpreting the black-box nature of deep-learning-based scoring algorithms. While previous studies indicate that scoring models can be easily fooled, in this paper, we explore the reason behind their surprising adversarial brittleness. We utilize recent advances in interpretability to find the extent to which features such as coherence, content, vocabulary, and relevance are important for automated scoring mechanisms. We use this to investigate the oversensitivity (i.e., large change in output score with a little change in input essay content) and overstability (i.e., little change in output scores with large changes in input essay content) of AES. Our results indicate that autoscoring models, despite getting trained as “end-to-end” models with rich contextual embeddings such as BERT, behave like bag-of-words models. A few words determine the essay score without requiring any context, making the model largely overstable. This is in stark contrast to recent probing studies on pre-trained representation learning models, which show that rich linguistic features such as parts-of-speech and morphology are encoded by them. Further, we also find that the models have learnt dataset biases, making them oversensitive. The presence of a few words with high co-occurrence with a certain score class makes the model associate the essay sample with that score. This causes score changes in ∼95% of samples with an addition of only a few words. To deal with these issues, we propose detection-based protection models that can detect oversensitivity and samples causing overstability with high accuracy. We find that our proposed models are able to detect unusual attribution patterns and flag adversarial samples successfully.
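The two failure modes defined in the abstract can be made concrete with a small sketch. The code below is an illustration only, not the authors' method: `toy_score` is a hypothetical keyword-counting scorer standing in for a bag-of-words-like model, the input change is approximated with a token-level `difflib.SequenceMatcher` ratio, and all thresholds are arbitrary assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical cue words that a bag-of-words-like scorer might latch onto.
KEYWORDS = {"therefore", "moreover", "consequently"}


def toy_score(essay: str) -> float:
    """Toy scorer (assumption): counts distinct cue words, ignoring all context."""
    return float(len(KEYWORDS & set(essay.lower().split())))


def input_change(original: str, modified: str) -> float:
    """Token-level dissimilarity in [0, 1]; 0 means identical token sequences."""
    matcher = SequenceMatcher(None, original.split(), modified.split())
    return 1.0 - matcher.ratio()


def is_oversensitive(score_fn, original, modified, small_input=0.2, large_score=1.0):
    """Small change in the essay, large change in the score."""
    score_delta = abs(score_fn(modified) - score_fn(original))
    return input_change(original, modified) <= small_input and score_delta >= large_score


def is_overstable(score_fn, original, modified, large_input=0.5, small_score=0.5):
    """Large change in the essay, little change in the score."""
    score_delta = abs(score_fn(modified) - score_fn(original))
    return input_change(original, modified) >= large_input and score_delta <= small_score


essay = "the city built parks therefore people walk more and feel better"
padded = essay + " moreover consequently"      # a few added words
scrambled = "therefore " + "banana " * 10      # content replaced, cue word kept

print(is_oversensitive(toy_score, essay, padded))
print(is_overstable(toy_score, essay, scrambled))
```

With this toy scorer, appending two cue words flips the oversensitivity check, while replacing nearly all content but keeping one cue word flips the overstability check. A real evaluation would substitute the actual scoring model for `toy_score` and choose thresholds appropriate to its score scale.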
Pages: 1-33
Citations: 6
Studying Alignment in a Collaborative Learning Activity via Automatic Methods: The Link Between What We Say and Do
Q1 Arts and Humanities Pub Date : 2021-04-09 DOI: 10.5210/dad.2022.201
Utku Norman, Tanvi Dinkar, Barbara Bruno, C. Clavel
A dialogue is successful when there is alignment between the speakers, at different linguistic levels. In this work, we consider the dialogue occurring between interlocutors engaged in a collaborative learning task, where they are evaluated on how well they performed and how much they learnt. Our main contribution is to propose new automatic measures to study alignment, focusing on lexical alignment and on a new alignment context that we introduce, termed behavioural alignment (when an instruction given by one interlocutor was followed with concrete actions in a physical environment by another). Thus, we propose methodologies to create a link between what was said and what was done as a consequence. To do so, we focus on expressions related to the task in the situated activity. These expressions are minimally required by the interlocutors to make progress in the task. We then observe how these local alignment contexts build to dialogue-level phenomena, namely success in the task. What distinguishes our approach from other works is the treatment of alignment as a procedure that occurs in stages. Since we utilise a dataset of spontaneous speech dialogues elicited from children, a second contribution of our work is to study how spontaneous speech phenomena (such as when interlocutors say "uh", "oh" ...) are used in the process of alignment. Lastly, we make public the dataset to study alignment in educational dialogues. Our results show that all teams lexically and behaviourally align to some degree regardless of their performance and learning, and our measures capture that teams that did not succeed in the task were simply slower to collaborate. Thus, we find that teams that performed better were faster to align. Furthermore, our methodology captures a productive, collaborative period that includes the time where the interlocutors came up with their best solutions. We also find that well-performing teams verbalise the marker "oh" more when they are behaviourally aligned, compared to other times in the dialogue, showing that this marker is an important cue in alignment. To the best of our knowledge, we are the first to study the role of "oh" as an information management marker in a behavioural context (i.e. in connection to actions taken in a physical environment), rather than only a verbal one. Our measures contribute to the research in the field of educational dialogue and the intersection between dialogue and collaborative learning research.
Citations: 4
Opinion Piece: How People Structure Representations of Discourse
Q1 Arts and Humanities Pub Date : 2021-02-25 DOI: 10.5210/dad.2021.101
Alan Garnham
Mental models or situation models include representations of people, but much of the literature about such models focuses on the representation of eventualities (events, states, and processes) or (small-scale) situations. In the well-known event-indexing model of Zwaan, Langston, and Graesser (1995), for example, protagonists are just one of five dimensions on which situation models are indexed. They are not given any additional special status. Consideration of longer narratives, and the ways in which readers or listeners relate to them, suggests that people have a more central status in the way we think about texts, and hence in discourse representations. Indeed, such considerations suggest that discourse representations are organised around (the representations of) central characters. The paper develops the idea of the centrality of main characters in representations of longer texts, by considering, among other things, the way information is presented in novels, with L’Éducation Sentimentale by Gustave Flaubert as a case study. Conclusions are also drawn about the role of representations of people in the representation of other types of text. Other approaches to discourse and dialogue, and to behaviour more generally, are discussed in relation to the question of whether they adequately account for the central role of protagonists.
Citations: 2