[161] MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks | 25min 2022Q4 XAI ACL
[157] LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity | CLIP XAI 2024Q2
[154] Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment | google XAI evaluation 2024Q2
[147] Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers | ICCV 2021Q1 XAI