Informers: Evaluating Explanation Quality

Student Team: Raghav Chaudhary, Christopher Garcia-Cordova, Kaleen Shrestha, Zachary Sweet

Project Mentor: Marina Danilevsky, IBM Research

Technological advances have reached a point where the outcomes of important decisions can be determined by the output of a machine learning model. This has motivated the development of methods that generate explanations for these models. However, post-hoc explanations of black-box models need to be viewed critically. This project developed a library to evaluate post-hoc explanations for machine learning models. The library provides multiple criteria for evaluating explanation quality, such as consistency, faithfulness, and model confidence indication. It also contains tools that perturb text based on adjuncts and synonyms in order to automatically create synthetic data for further analysis. The project demonstrated the effectiveness and applicability of these metrics through novel user studies designed to understand their usefulness to end users. The objective of the project was to enable researchers and end users to better understand their models and explanations, and to propose guidance for evaluating explanation techniques.
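To illustrate the kind of check such a library might perform, the sketch below computes a simple perturbation-based faithfulness score: mask the tokens an explanation ranks as most important and measure how much the model's confidence in its prediction drops. This is a minimal sketch under generic assumptions; the function name `faithfulness_drop`, the `predict_proba` interface, and the `[MASK]` placeholder are hypothetical and do not describe the project library's actual API.

```python
from typing import Callable, List, Sequence


def faithfulness_drop(
    tokens: List[str],
    importance: Sequence[float],
    predict_proba: Callable[[str], Sequence[float]],
    label: int,
    k: int = 3,
    mask_token: str = "[MASK]",
) -> float:
    """Estimate explanation faithfulness by masking its top-k tokens.

    A large drop in the model's confidence for `label` after masking the
    tokens the explanation ranks highest suggests the explanation points
    at features the model actually relies on.
    """
    # Model confidence on the original, unperturbed input.
    original = predict_proba(" ".join(tokens))[label]

    # Indices of the k tokens the explanation considers most important.
    top_k = sorted(range(len(tokens)), key=lambda i: importance[i], reverse=True)[:k]

    # Replace those tokens with a mask placeholder and re-score.
    masked = [mask_token if i in top_k else tok for i, tok in enumerate(tokens)]
    perturbed = predict_proba(" ".join(masked))[label]

    return original - perturbed
```

A consistency check could be built along the same lines by comparing the explanations produced for synonym-perturbed copies of the same input, which is the role the abstract describes for the library's perturbation tools.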

Informers: Evaluating Explanation Quality (PDF)
