Student Team: Raghav Chaudhary, Christopher Garcia-Cordova, Kaleen Shrestha, Zachary Sweet
Project Mentor: Marina Danilevsky, IBM Research
Technological advances have reached the point where the outcomes of important decisions can be determined by the output of a machine learning model. This has motivated the development of methods that can generate explanations for these models. However, post-hoc explanations for black-box models need to be viewed critically. This project developed a library to evaluate post-hoc explanations for machine learning models. The library implemented multiple criteria for evaluating explanation quality, such as consistency, faithfulness, and model confidence indication. It also contained tools to generate perturbations of text based on adjuncts and synonyms, automatically creating synthetic data for further analysis. The project demonstrated the effectiveness and applicability of these metrics through novel user studies designed to understand their usefulness to end-users. The objective of the project was to enable researchers and end-users to better understand their models and explanations, and to propose guidance for evaluating explanation techniques.
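To make the two ideas in the abstract concrete, the following is a minimal sketch of synonym-based text perturbation and a deletion-style faithfulness check (confidence drop when explanation-marked tokens are removed). It is illustrative only: the classifier, the synonym table, and all function names are hypothetical stand-ins, not the library's actual API.

```python
import random

# Hypothetical stand-in classifier: returns a confidence score for the
# positive class. In practice this would be a real black-box model.
def toy_classifier(tokens):
    positive = {"great", "excellent", "good", "wonderful"}
    hits = sum(1 for t in tokens if t.lower() in positive)
    return min(1.0, 0.5 + 0.25 * hits)

# Small illustrative synonym table; the library's adjunct- and
# synonym-based perturbation tooling is more sophisticated than this.
SYNONYMS = {
    "great": ["excellent", "wonderful"],
    "movie": ["film", "picture"],
}

def synonym_perturbations(tokens, n=3, seed=0):
    """Generate synthetic variants by swapping words for listed synonyms."""
    rng = random.Random(seed)
    variants = []
    for _ in range(n):
        variants.append(
            [rng.choice(SYNONYMS[t]) if t in SYNONYMS else t for t in tokens]
        )
    return variants

def deletion_faithfulness(tokens, important_indices, predict=toy_classifier):
    """Faithfulness as the confidence drop after removing the tokens
    an explanation marks as important; larger drop = more faithful."""
    base = predict(tokens)
    keep = [t for i, t in enumerate(tokens) if i not in set(important_indices)]
    return base - predict(keep)

if __name__ == "__main__":
    text = ["a", "great", "movie"]
    print(synonym_perturbations(text))
    # Suppose an explanation marks "great" (index 1) as most important.
    print(deletion_faithfulness(text, [1]))
```

Under this kind of setup, a consistency check could be run analogously by comparing the explanations produced for the original text against those produced for its synonym-perturbed variants.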