RESPECT: A framework for promoting inclusive and respectful conversations in online communications

Natural Language Processing Journal. Pub Date: 2025-01-16. DOI: 10.1016/j.nlp.2025.100126

Shaina Raza, Abdullah Y. Muaad, Emrul Hasan, Muskan Garg, Zainab Al-Zanbouri, Syed Raza Bashir

Volume 10, Article 100126. Source: https://www.sciencedirect.com/science/article/pii/S2949719125000020


Abstract

Toxicity and bias in online conversations hinder respectful interactions, leading to issues such as harassment and discrimination. While advancements in natural language processing (NLP) have improved the detection and mitigation of toxicity on digital platforms, the evolving nature of social media conversations demands continuous innovation. Previous efforts have made strides in identifying and reducing toxicity; however, a unified and adaptable framework for managing toxic content across diverse online discourse is still needed. This paper introduces RESPECT, a comprehensive framework designed to identify and mitigate toxicity in online conversations. The framework comprises two components: an encoder-only model for detecting toxicity and a decoder-only model for generating debiased versions of the text. Toxicity detection is treated as a binary classification problem and handled with transformer-based models. Open-source and proprietary large language models are then used, through prompt-based approaches, to rewrite toxic text into non-toxic, contextually accurate alternatives. Empirical results demonstrate that this approach significantly reduces toxicity across various conversational styles, fostering safer and more respectful communication in online environments.
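The two-stage flow described in the abstract (detect first, rewrite only what is flagged) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the prompt wording, the names `moderate`, `build_prompt`, `toy_classifier`, and `toy_generator`, and the keyword lexicon are all hypothetical. In a real system the classifier would be an encoder-only transformer and the generator a decoder-only LLM; simple stand-ins are used here so the sketch runs without any model.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical prompt wording; the abstract does not give the actual prompts.
REWRITE_PROMPT = (
    "Rewrite the following message so that it is non-toxic while keeping "
    "its meaning and context:\n\n{text}"
)


@dataclass
class Result:
    original: str
    toxic: bool   # output of the binary classification stage
    output: str   # rewritten text if toxic, otherwise the original


def build_prompt(text: str) -> str:
    """Wrap the flagged text in the rewriting instruction."""
    return REWRITE_PROMPT.format(text=text)


def moderate(
    text: str,
    classify: Callable[[str], bool],
    generate: Callable[[str], str],
) -> Result:
    """Stage 1: binary toxicity detection. Stage 2: prompt-based rewrite."""
    if not classify(text):
        # Non-toxic text passes through unchanged.
        return Result(text, False, text)
    return Result(text, True, generate(build_prompt(text)))


# Toy stand-ins (assumptions for illustration only).
TOY_LEXICON = {"idiot", "stupid"}


def toy_classifier(text: str) -> bool:
    # Stand-in for an encoder-only classifier: flags text by keyword match.
    words = {w.strip(".,!?").lower() for w in text.split()}
    return bool(words & TOY_LEXICON)


def toy_generator(prompt: str) -> str:
    # Stand-in for a decoder-only LLM call: returns a fixed reply.
    return "I disagree with this point."


if __name__ == "__main__":
    for message in ["You are an idiot!", "Thanks for the help."]:
        print(moderate(message, toy_classifier, toy_generator))
```

Passing the classifier and generator as callables keeps the two stages independent, so either one could be swapped for a different model without changing the control flow.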
