An operational framework for guiding human evaluation in Explainable and Trustworthy AI
The assessment of explanations by humans presents a significant challenge in Explainable and Trustworthy AI. This is due not only to the absence of universal metrics and standardized evaluation methods, but also to the difficulty of designing user studies that assess how comprehensible these explanations are to people. To address this gap, we introduce a survey-based methodology for guiding the human evaluation of explanations. The approach consolidates best practices from the existing literature and is implemented as an operational framework. This framework supports researchers throughout the evaluation process, from hypothesis formulation, through the implementation and deployment of online user studies, to the analysis and interpretation of the collected data. We illustrate the application of the framework with two practical user studies.
keywords: Approaches for ensuring calibrated trust in AI; Fairness, accountability, & transparency in AI ethical framework implementations; Evaluation methods for AI ethical framework implementations