Explainability to Business: Demystify Transformer Models with Attention-based Explanations

2023 2nd International Conference on Applied Artificial Intelligence and Computing (ICAAIC) Pub Date : 2023-05-04 DOI:10.1109/ICAAIC56838.2023.10141005

Rajasekhar Thiruthuvaraj, Ashly Ann Jo, Ebin Deni Raj


Citations: 1

Abstract

Many companies now rely on Natural Language Processing (NLP) techniques to understand the text data generated daily. Handling this data has become critical, because identifying the sentiment of text and summarizing it helps a company understand the pain points and experiences of customers who post reviews on social media. These requirements have increased the demand for advanced algorithms that can deal with text data. The introduction of Transformers led businesses to adopt NLP methods more and more to keep up with their needs. With models like Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformers (GPT), state-of-the-art results were achieved with billions of learned parameters. Although these advancements improved accuracy and expanded the use of algorithms to a wide range of NLP tasks like language translation, text summarization, and language modeling, businesses are more interested in the explainability of a model than in its accuracy. Explainable Artificial Intelligence (XAI) plays an important role in comprehending the complexities of the model as well as the influence of weights on predictions. In this paper, the complexities of the transformer model are unraveled by presenting a straightforward method for computing explainable predictions. The DistilBERT model is chosen as an example for implementing the explainable system because it is lightweight. Combining the strengths of a post-hoc explanation with those of a self-learning neural network, the method is simple to scale to other algorithms. With technologies like Python, PyTorch, and Hugging Face, a detailed step-by-step algorithmic computation is demonstrated to explain predictions through attention-based explanations.
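The abstract does not spell out the computation, so the following is only a minimal sketch of one common attention-based heuristic, not the authors' method. It assumes that the attention the [CLS] position pays to each token, averaged over heads, can serve as a rough importance score. The toy query/key vectors, the token list, and the helper names (`attention_weights`, `cls_token_scores`) are all hypothetical; with a real Hugging Face model the per-head weights would instead come from calling the model with `output_attentions=True`.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(queries, keys):
    """Scaled dot-product attention weights for one head: softmax(QK^T / sqrt(d))."""
    d = len(keys[0])
    return [
        softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys])
        for q in queries
    ]

def cls_token_scores(heads, tokens, cls_index=0):
    """Average the [CLS] attention row over heads, drop special tokens, renormalise."""
    avg = [
        sum(head[cls_index][j] for head in heads) / len(heads)
        for j in range(len(tokens))
    ]
    kept = [(t, a) for t, a in zip(tokens, avg) if t not in ("[CLS]", "[SEP]")]
    total = sum(a for _, a in kept)
    return [(t, a / total) for t, a in kept]

# Hypothetical toy input: one 2-d vector per token, used as both query and key.
tokens = ["[CLS]", "service", "was", "terrible", "[SEP]"]
head_vectors = [
    [[1.0, 0.0], [0.5, 0.5], [0.0, 0.1], [2.0, 0.0], [0.0, 1.0]],  # head 1
    [[0.0, 1.0], [0.2, 1.0], [0.1, 0.0], [0.0, 1.5], [1.0, 0.0]],  # head 2
]
heads = [attention_weights(v, v) for v in head_vectors]

scores = cls_token_scores(heads, tokens)
ranked = sorted(scores, key=lambda pair: pair[1], reverse=True)
for token, score in ranked:
    print(f"{token:10s} {score:.3f}")
```

On this toy input, "terrible" receives the largest share of [CLS] attention in both heads and therefore ranks first. Raw attention weights are only one heuristic for attribution, so scores like these are best read as a starting point for an explanation rather than a faithful account of the model's reasoning.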



