FCE: Feedback based Counterfactual Explanations for Explainable AI

Artificial Intelligence provides accurate predictions for critical applications (e.g., healthcare, finance), but in most applications that require close interaction with humans it lacks the ability to explain its internal mechanism. Although many studies analyze machine learning models and their learning behaviour in order to interpret the inner mechanics of these models, they often rely on a simpler surrogate model that generates explanations in the form of a piece of interpretable information, such as feature scores. The crucial shortcoming of these studies is the lack of human involvement in the design and evaluation of explanations, which gives rise to trust issues and to a lack of acceptance and understanding. We address this limitation by involving humans in the counterfactual explanation generation process: user feedback enriches the automated explanations so that they are better aligned with user expectations. In this paper, we propose a user-feedback-based counterfactual explanation approach for explainable Artificial Intelligence. We utilize feedback in two ways: first, to customize the explanations locally, which helps define the neighbourhood used to discern feasible explanations; and second, to evaluate the generated explanations.

keywords: Explainable Artificial Intelligence
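
To make the first use of feedback concrete, the following is a minimal sketch of a counterfactual search restricted to a user-defined neighbourhood. It is an illustration under stated assumptions, not the method proposed in this paper: the classifier `predict`, the feature names, the per-feature `bounds` standing in for user feedback, and the random-sampling search with an L1 distance are all hypothetical choices made for the example.

```python
import random


def predict(x):
    """Hypothetical black-box classifier (assumption): 1 = approve, 0 = reject."""
    income, debt, age = x
    return 1 if 0.5 * income - 0.8 * debt + 0.01 * age - 10 > 0 else 0


def feedback_counterfactual(x, predict, bounds, n_samples=5000, seed=0):
    """Sample candidates inside the user-supplied bounds and return the one
    closest to x (L1 distance) whose prediction differs from that of x.

    bounds holds one (low, high) pair per feature and plays the role of the
    user feedback: a pair with low == high marks a feature the user
    considers unchangeable. Returns None if no sampled candidate flips
    the prediction.
    """
    rng = random.Random(seed)
    original = predict(x)
    best, best_dist = None, float("inf")
    for _ in range(n_samples):
        # Draw a candidate from the feedback-defined neighbourhood.
        candidate = [rng.uniform(lo, hi) for lo, hi in bounds]
        if predict(candidate) == original:
            continue  # not a counterfactual: the outcome is unchanged
        dist = sum(abs(c - v) for c, v in zip(candidate, x))
        if dist < best_dist:
            best, best_dist = candidate, dist
    return best


# Hypothetical instance: income, debt, age. The toy model rejects it.
x = [30.0, 10.0, 40.0]
# Hypothetical feedback: income may rise to 40, debt may fall to 5, age is fixed.
bounds = [(30.0, 40.0), (5.0, 10.0), (40.0, 40.0)]
cf = feedback_counterfactual(x, predict, bounds)
print(cf)
```

The sketch covers only the neighbourhood-restriction role of feedback. The second role, evaluating the generated explanations, would require collecting user judgements on candidates such as `cf` and is not modelled here.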