Text Detoxification System in Dialogue Conversations
M.D. Suvorov, V.I. Vinogradov
Modelirovanie i analiz dannyh, published 2023-03-24
DOI: 10.17759/mda.2023130102 (https://doi.org/10.17759/mda.2023130102)
Citations: 0
Abstract
This work aims to improve the civility of messages exchanged in dialogue systems. Its key feature is a focus on real-time use and on robust detoxification that accounts for the specifics of dialogue communication (typos, noise symbols, transliteration, etc.). The solution combines a neural network approach with programmatic preprocessing to obtain token embeddings, which are then used to solve a classification problem. Unlike traditional message filters, the system aims to preserve the meaning of the source text while removing its toxic content. The system can be tried out in the Telegram messenger, where the model is exposed as a bot. It is deployed on a cloud provider's serverless platform, which lets it adapt to peak loads while remaining easy to maintain.
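The general shape of such a pipeline (normalize noisy tokens, classify each token, keep the non-toxic ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the neural token classifier is replaced here by a tiny hypothetical lexicon, the look-alike character table is an assumption, and English examples are used only for readability. All names (`TOXIC_STEMS`, `LOOKALIKES`, `normalize`, `is_toxic`, `detoxify`) are illustrative.

```python
import re

# Hypothetical toy lexicon standing in for a trained token classifier.
TOXIC_STEMS = {"idiot", "stupid"}

# Hypothetical look-alike substitutions sometimes used to dodge filters.
LOOKALIKES = str.maketrans({"0": "o", "1": "i", "3": "e", "@": "a", "$": "s"})


def normalize(token: str) -> str:
    """Lowercase, map look-alike characters, drop noise symbols, collapse long repeats."""
    t = token.lower().translate(LOOKALIKES)
    t = re.sub(r"[^a-z]", "", t)        # remove noise symbols such as '*' or ','
    t = re.sub(r"(.)\1{2,}", r"\1", t)  # collapse runs of 3+ identical letters
    return t


def is_toxic(token: str) -> bool:
    """Stand-in for the classification step: look the normalized token up in the lexicon."""
    return normalize(token) in TOXIC_STEMS


def detoxify(message: str) -> str:
    """Drop tokens flagged as toxic and keep the rest of the message unchanged."""
    kept = [tok for tok in message.split() if not is_toxic(tok)]
    return " ".join(kept)


if __name__ == "__main__":
    print(detoxify("that was a stuuupid move"))  # prints "that was a move"
    print(detoxify("hello there"))               # prints "hello there"
```

In a real system the `is_toxic` lookup would be replaced by a model operating on token embeddings; the sketch only shows why normalization has to happen before classification if obfuscated spellings are to be caught.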