GenAI and Socially Responsible AI in Natural Language Processing Applications: A Linguistic Perspective

Proceedings of the AAAI Symposium Series. Pub Date: 2024-05-20. DOI: 10.1609/aaaiss.v3i1.31230

Christina Alexandris



Abstract

It is widely accepted that processing very large amounts of data with state-of-the-art Natural Language Processing (NLP) practices (i.e., Machine Learning (ML), language-agnostic approaches) has resulted in a dramatic improvement in the speed and efficiency of systems and applications. However, these developments are accompanied by several challenges and difficulties that have been voiced in recent years. Specifically, with regard to NLP, the evident improvement in the speed and efficiency of systems and applications with GenAI also entails some aspects that may be problematic, especially where particular text types, languages and/or user groups are concerned.

State-of-the-art NLP approaches with automated processing of vast amounts of data in GenAI are related to the observed problematic Aspects 1-7, namely (1) Underrepresentation and (2) Standardization. These result in (3) Barriers in Text Understanding, (4) Discouragement of HCI Usage for Special Text Types and/or User Groups, (5) Barriers in Accessing Information, (6) Likelihood of Errors and False Assumptions and (7) Difficulties in Error Detection and Recovery. Additional problems arise in typical cases such as less-resourced languages (A), less experienced users (B) and less agile users (C).
A hybrid approach involving the re-introduction and integration of traditional concepts in state-of-the-art processing approaches, whether automatic or interactive, concerns the following targets (i), (ii) and (iii):
(i) Making more types of information accessible to more types of recipients and user groups;
(ii) Making more types of services accessible and user-friendly to more types of user groups;
(iii) Making more types of feelings, opinions, voices and reactions from more types of user groups visible.

Specifically, in the above-presented cases, traditional and classical theories, principles and models are re-introduced and can be integrated into state-of-the-art data-driven approaches involving Machine Learning and neural networks, functioning as training data and seed data in Natural Language Processing applications where user requirements and customization are of particular interest and importance. A hybrid approach may be considered a compromise between speed and correctness/user-friendliness in those types of NLP applications where achieving this balance plays a crucial role. In other words, a hybrid approach and the examples presented here aim to prevent mechanisms from adopting human biases, ensuring fairness, socially responsible outcomes and responsible Social Media. A hybrid approach and the examples presented here also aim to customize content for different linguistic and cultural groups, ensuring equitable information distribution.

Here, we present characteristic examples with cases employing the re-introduction of four typical types of traditional concepts concerning classical theories, principles and models. These four typical classical theories, principles and models are not considered flawless; however, they can be transformed into practical strategies that can be integrated into evaluation modules, neural networks and training data (including knowledge graphs), and dialogue design.
The proposed re-introduction of traditional concepts is not limited to the particular models, principles and theories presented here. The first example concerns the application of a classic principle from Theoretical Linguistics. The second example employs a model from the field of Linguistics and Translation. The third and fourth examples demonstrate the interdisciplinary application of models and theoretical frameworks from the fields of Linguistics-Cognitive Science and Linguistics-Psychology, respectively.


