<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/style.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-04-13T01:51:23Z</responseDate><request verb="GetRecord" identifier="oai:www.recercat.cat:2117/381159" metadataPrefix="marc">https://recercat.cat/oai/request</request><GetRecord><record><header><identifier>oai:recercat.cat:2117/381159</identifier><datestamp>2025-07-16T22:36:20Z</datestamp><setSpec>com_2072_1033</setSpec><setSpec>col_2072_452949</setSpec></header><metadata><record xmlns="http://www.loc.gov/MARC21/slim" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:doc="http://www.lyncode.com/xoai" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
   <leader>00925nam 22002777a 4500</leader>
   <datafield ind2=" " ind1=" " tag="042">
      <subfield code="a">dc</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Arias Duart, Anna</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Parés, Ferran</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="720">
      <subfield code="a">Garcia-Gasulla, Dario</subfield>
      <subfield code="e">author</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="260">
      <subfield code="c">2022-05</subfield>
   </datafield>
   <datafield ind2=" " ind1=" " tag="520">
      <subfield code="a">Explainability has become a major topic of research in&#xd;
Artificial Intelligence (AI), aimed at increasing trust in models&#xd;
such as Deep Learning (DL) networks. However, trustworthy&#xd;
models cannot be achieved with explainable AI (XAI) methods&#xd;
unless the XAI methods themselves can be trusted.&#xd;
To evaluate XAI methods one may assess interpretability,&#xd;
a qualitative measure of how understandable an explanation is&#xd;
to humans [1]. While this is important to guarantee the proper&#xd;
interaction between humans and the model, interpretability&#xd;
generally involves end-users in the process [2], inducing strong&#xd;
biases. In fact, a qualitative evaluation alone cannot guarantee&#xd;
coherency to reality (i.e., model behavior), as false explanations&#xd;
can be more interpretable than accurate ones. To enable&#xd;
trust on XAI methods, we also need quantitative and objective&#xd;
evaluation metrics, which validate the relation between the&#xd;
explanations produced by the XAI method and the behavior&#xd;
of the trained model under assessment.&#xd;
In this work we propose a novel evaluation score for feature&#xd;
attribution methods, described in §I-A. Our input alteration&#xd;
approach induces in-distribution noise into samples, that is,&#xd;
alterations on the input which correspond to visual patterns&#xd;
found within the original data distribution. To do so we modify&#xd;
the context of the sample instead of the content, leaving the&#xd;
original pixels values untouched. In practice, we create a&#xd;
new sample, composed of samples of different classes, which&#xd;
we call a mosaic image (see examples in Figure 2). Using&#xd;
mosaics as input has a major benefit: each input quadrant is&#xd;
an image from the original distribution, producing blobs of&#xd;
activations in each quadrant which are consequently coherent.&#xd;
Only the pixels forming the borders between images, and&#xd;
the few corresponding activations, may be considered out of&#xd;
distribution.&#xd;
By inducing in-distribution noise, mosaic images introduce&#xd;
a problem in which XAI methods may objectively err (focus on&#xd;
something it should not be focusing on). On those composed&#xd;
mosaics we ask a XAI method to provide explanation for just&#xd;
one of the contained classes, and follow its response. Then,&#xd;
we measure how much of the explanation generated by the&#xd;
XAI is located on the areas corresponding to the target class,&#xd;
quantifying it through the Focus score. This score allows us to&#xd;
compare methods in terms of explanation precision, evaluating&#xd;
the capability of XAI methods to provide explanations related&#xd;
to the requested class. Using mosaics has another benefit. Since&#xd;
the noise introduced is in-distribution, the explanation errors&#xd;
identify and exemplify biases of the model. This facilitates&#xd;
the elimination of biases in models and datasets, potentially&#xd;
resulting in more reliable solutions. We illustrate how to do so&#xd;
in §I-C.</subfield>
   </datafield>
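   <!-- The abstract above describes composing a 2x2 "mosaic image" from
        samples of different classes. A minimal NumPy sketch of that
        construction follows; the function name make_mosaic, the quadrant
        layout, and the returned quadrant-index map are assumptions of this
        sketch, not the authors' code.

        import numpy as np

        def make_mosaic(imgs):
            """Tile four equally sized HxWxC images into a 2Hx2WxC mosaic.

            Returns the mosaic plus a per-pixel quadrant-index map
            (0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right),
            used later to locate where an explanation falls.
            """
            assert len(imgs) == 4
            h, w, _ = imgs[0].shape
            top = np.concatenate([imgs[0], imgs[1]], axis=1)
            bottom = np.concatenate([imgs[2], imgs[3]], axis=1)
            mosaic = np.concatenate([top, bottom], axis=0)
            quadrant = np.zeros((2 * h, 2 * w), dtype=np.int64)
            quadrant[:h, w:] = 1
            quadrant[h:, :w] = 2
            quadrant[h:, w:] = 3
            return mosaic, quadrant
   -->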
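   <!-- The abstract quantifies explanation precision through the Focus
        score: the share of the explanation that lands on the quadrants
        showing the target class. A hedged sketch building on make_mosaic
        above; keeping only positive relevance and the argument names are
        assumptions of this sketch.

        import numpy as np

        def focus_score(relevance, quadrant, target_quadrants):
            """relevance: 2Hx2W attribution map for the requested class.
            quadrant: 2Hx2W quadrant-index map from make_mosaic.
            target_quadrants: indices of quadrants showing the target class.
            """
            pos = np.clip(relevance, 0.0, None)  # keep positive evidence only
            total = pos.sum()
            if total == 0:
                return 0.0  # degenerate case: no explanation mass at all
            mask = np.isin(quadrant, list(target_quadrants))
            return float(pos[mask].sum() / total)

        Usage: for a mosaic whose top row holds the target class,
        focus_score(saliency, quadrant, target_quadrants=(0, 1)) yields a
        value in [0, 1]; higher means the explanation stays on the images of
        the requested class.
   -->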
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Àrees temàtiques de la UPC::Informàtica::Arquitectura de computadors</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">High performance computing</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Bias</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Explainable AI</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Bias detection</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Image classification</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Càlcul intensiu (Informàtica)</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Intel·ligència artificial</subfield>
   </datafield>
   <datafield tag="653" ind2=" " ind1=" ">
      <subfield code="a">Artificial intelligence</subfield>
   </datafield>
   <datafield ind2="0" ind1="0" tag="245">
      <subfield code="a">Focus! rating XAI methods and finding biases</subfield>
   </datafield>
</record></metadata></record></GetRecord></OAI-PMH>