{"title":"Dirichlet uncertainty wrappers for actionable algorithm accuracy accountability and auditability","authors":"José Mena Roldán, O. Vila, J. V. Marca","doi":"10.1145/3351095.3372825","DOIUrl":null,"url":null,"abstract":"Nowadays, the use of machine learning models is becoming a utility in many applications. Companies deliver pre-trained models encapsulated as application programming interfaces (APIs) that developers combine with third-party components and their own models and data to create complex data products to solve specific problems. The complexity of such products and the lack of control and knowledge of the internals of each component used unavoidable cause effects, such as lack of transparency, difficulty in auditability, and the emergence of potential uncontrolled risks. They are effectively black-boxes. Accountability of such solutions is a challenge for the auditors and the machine learning community. In this work, we propose a wrapper that given a black-box model enriches its output prediction with a measure of uncertainty when applied to a target domain. To develop the wrapper, we follow these steps: Modeling the distribution of the output. In a text classification setting, the output is a probability distribution p(y|X, w*) over the different classes to predict, y, given an input text X and the pre-trained model with parameters w*. We model this output by a random variable to measure the variability that the data noise causes in the output. Here we consider the output distribution coming from a Dirichlet probability density function, thus p(y|X, w*) ~ Dir(α). Decomposition of the Dirichlet concentration parameter. To relate the output of the classifier with the concentration parameter in the Dirichlet distribution, we propose a decomposition of the concentration parameter in two terms: α = βy. The role of this scalar β is to control the spread of the distribution around the expected value, i.e. the original prediction y. Training the wrapper. Sentences are represented as the average value of their word embeddings. This representation feeds a neural network that outputs a single regression value that models the parameter β. For each input, we combine β and the black-box prediction to obtain the corresponding distribution for the output ym,i ~ Dir(αi). By using Monte Carlo sampling, we approximate the expected value of the classification probabilities, [EQUATION] and we train the model applying a cross-entropy loss over the predictions and the labels. Obtaining an uncertainty score from the wrapper. To obtain a numerical value for the uncertainty of a prediction, we draw samples from the resulting Dir(α) to evaluate the predictive entropy with [EQUATION], thus obtaining a numerical score for the uncertainty of each prediction. Using uncertainty for rejection. Based on this wrapper, we provide an actionable mechanism to mitigate risk in the form of decision rejection: once equipped with a value for the uncertainty of a given prediction, we can choose not to issue that prediction when the risk or uncertainty in that decision is significant. This results in a rejection system that selects the more confident predictions, discards those more uncertain, and leads to an improvement in the trustability of the resulting system. We showcase the proposed technique and methodology in a practical scenario where we apply a simulated sentiment analysis API based on NLP to different domains. On each experiment, we train a sentiment classifier using text reviews of products in a source domain. We apply the pre-trained black-box to obtain the predictions for the reviews from a target domain. The tuples of review plus black-box predictions are then used for training the wrapper to obtain the uncertainty. Finally, we use the uncertainty score to sort the predictions from more to less uncertain, and we search for a rejection point that maximizes the three performance measures: non-rejected accuracy, and classification and rejection quality. Experiments demonstrate the effectiveness of the uncertainty measure computed by the wrapper and shows its high correlation to bad quality predictions and misclassifications. In all the cases, the uncertainty metric here proposed outperforms traditional uncertainty measures.","PeriodicalId":377829,"journal":{"name":"Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency","volume":"142 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3351095.3372825","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7
Abstract
Nowadays, the use of machine learning models is becoming a utility in many applications. Companies deliver pre-trained models encapsulated as application programming interfaces (APIs) that developers combine with third-party components and their own models and data to create complex data products to solve specific problems. The complexity of such products and the lack of control and knowledge of the internals of each component used unavoidable cause effects, such as lack of transparency, difficulty in auditability, and the emergence of potential uncontrolled risks. They are effectively black-boxes. Accountability of such solutions is a challenge for the auditors and the machine learning community. In this work, we propose a wrapper that given a black-box model enriches its output prediction with a measure of uncertainty when applied to a target domain. To develop the wrapper, we follow these steps: Modeling the distribution of the output. In a text classification setting, the output is a probability distribution p(y|X, w*) over the different classes to predict, y, given an input text X and the pre-trained model with parameters w*. We model this output by a random variable to measure the variability that the data noise causes in the output. Here we consider the output distribution coming from a Dirichlet probability density function, thus p(y|X, w*) ~ Dir(α). Decomposition of the Dirichlet concentration parameter. To relate the output of the classifier with the concentration parameter in the Dirichlet distribution, we propose a decomposition of the concentration parameter in two terms: α = βy. The role of this scalar β is to control the spread of the distribution around the expected value, i.e. the original prediction y. Training the wrapper. Sentences are represented as the average value of their word embeddings. This representation feeds a neural network that outputs a single regression value that models the parameter β. For each input, we combine β and the black-box prediction to obtain the corresponding distribution for the output ym,i ~ Dir(αi). By using Monte Carlo sampling, we approximate the expected value of the classification probabilities, [EQUATION] and we train the model applying a cross-entropy loss over the predictions and the labels. Obtaining an uncertainty score from the wrapper. To obtain a numerical value for the uncertainty of a prediction, we draw samples from the resulting Dir(α) to evaluate the predictive entropy with [EQUATION], thus obtaining a numerical score for the uncertainty of each prediction. Using uncertainty for rejection. Based on this wrapper, we provide an actionable mechanism to mitigate risk in the form of decision rejection: once equipped with a value for the uncertainty of a given prediction, we can choose not to issue that prediction when the risk or uncertainty in that decision is significant. This results in a rejection system that selects the more confident predictions, discards those more uncertain, and leads to an improvement in the trustability of the resulting system. We showcase the proposed technique and methodology in a practical scenario where we apply a simulated sentiment analysis API based on NLP to different domains. On each experiment, we train a sentiment classifier using text reviews of products in a source domain. We apply the pre-trained black-box to obtain the predictions for the reviews from a target domain. The tuples of review plus black-box predictions are then used for training the wrapper to obtain the uncertainty. Finally, we use the uncertainty score to sort the predictions from more to less uncertain, and we search for a rejection point that maximizes the three performance measures: non-rejected accuracy, and classification and rejection quality. Experiments demonstrate the effectiveness of the uncertainty measure computed by the wrapper and shows its high correlation to bad quality predictions and misclassifications. In all the cases, the uncertainty metric here proposed outperforms traditional uncertainty measures.