Towards A Rigorous Science of Interpretable Machine Learning

28 February 2017

Finale Doshi-Velez

Papers citing "Towards A Rigorous Science of Interpretable Machine Learning"

50 / 402 papers shown

Financial Fraud Detection Using Explainable AI and Stacking Ensemble Methods Fahad Almalki Mehedi Masud 21 0 0 15 May 2025
Engineering Risk-Aware, Security-by-Design Frameworks for Assurance of Large-Scale Autonomous AI Models Krti Tallam 26 0 0 09 May 2025
From Pixels to Perception: Interpretable Predictions via Instance-wise Grouped Feature Selection Moritz Vandenhirtz Julia E. Vogt 35 0 0 09 May 2025
Reasoning Models Don't Always Say What They Think Yanda Chen Joe Benton Ansh Radhakrishnan Jonathan Uesato Carson E. Denison ... Vlad Mikulik Samuel R. Bowman Jan Leike Jared Kaplan E. Perez ReLM LRM 67 12 1 08 May 2025
KERAIA: An Adaptive and Explainable Framework for Dynamic Knowledge Representation and Reasoning Stephen Richard Varey A. D. Stefano Anh Han 68 0 0 07 May 2025
Robustness questions the interpretability of graph neural networks: what to do? Kirill Lukyanov Georgii Sazonov Serafim Boyarsky Ilya Makarov AAML 137 0 0 05 May 2025
A New Approach to Backtracking Counterfactual Explanations: A Causal Framework for Efficient Model Interpretability Pouria Fatemi Ehsan Sharifian Mohammad Hossein Yassaee 43 0 0 05 May 2025
Privacy Risks and Preservation Methods in Explainable Artificial Intelligence: A Scoping Review Sonal Allana Mohan Kankanhalli Rozita Dara 29 0 0 05 May 2025
PhysNav-DG: A Novel Adaptive Framework for Robust VLM-Sensor Fusion in Navigation Applications Trisanth Srinivasan Santosh Patapati 34 0 0 03 May 2025
Adaptive Token Boundaries: Integrating Human Chunking Mechanisms into Multimodal LLMs Dongxing Yu 29 0 0 03 May 2025
Thinking Outside the Template with Modular GP-GOMEA Joe Harrison Peter A. N. Bosman T. Alderliesten 29 0 0 02 May 2025
A Mathematical Philosophy of Explanations in Mechanistic Interpretability -- The Strange Science Part I.i Kola Ayonrinde Louis Jaburi MILM 86 1 0 01 May 2025
Gradient Attention Map Based Verification of Deep Convolutional Neural Networks with Application to X-ray Image Datasets Omid Halimi Milani Amanda Nikho Lauren Mills M. Tliba Ahmet Enis Cetin Mohammed H. Elnagar MedIm 31 0 0 29 Apr 2025
Mitigating Societal Cognitive Overload in the Age of AI: Challenges and Directions Salem Lahlou 60 0 0 28 Apr 2025
REMEMBER: Retrieval-based Explainable Multimodal Evidence-guided Modeling for Brain Evaluation and Reasoning in Zero- and Few-shot Neurodegenerative Diagnosis Duy-Cat Can Quang-Huy Tang Huong Ha Binh T. Nguyen Oliver Y. Chén 26 0 0 12 Apr 2025
A constraints-based approach to fully interpretable neural networks for detecting learner behaviors Juan D. Pinto Luc Paquette 41 0 0 10 Apr 2025
Rubrik's Cube: Testing a New Rubric for Evaluating Explanations on the CUBE dataset Diana Galván-Sosa Gabrielle Gaudeau Pride Kavumba Yunmeng Li Hongyi Gu Zheng Yuan Keisuke Sakaguchi P. Buttery LRM 35 0 0 31 Mar 2025
Investigating the Duality of Interpretability and Explainability in Machine Learning Moncef Garouani Josiane Mothe Ayah Barhrhouj Julien Aligon AAML 39 2 0 27 Mar 2025
Inteligencia Artificial para la conservación y uso sostenible de la biodiversidad, una visión desde Colombia (Artificial Intelligence for conservation and sustainable use of biodiversity, a view from Colombia) Juan Sebastián Canas Camila Parra-Guevara Manuela Montoya-Castrillón Julieta M Ramírez-Mejía Gabriel-Alejandro Perilla ... Mario Murcia Elkin A. Noguera-Urbano Jose Manuel Ochoa-Quintero Susana Rodríguez Buriticá J. Ulloa 39 0 0 17 Mar 2025
Causality Is Key to Understand and Balance Multiple Goals in Trustworthy ML and Foundation Models Ruta Binkyte Ivaxi Sheth Zhijing Jin Mohammad Havaei Bernhard Schölkopf Mario Fritz 119 0 0 28 Feb 2025
Model Lakes Koyena Pal David Bau Renée J. Miller 63 0 0 24 Feb 2025
Verification and Validation for Trustworthy Scientific Machine Learning John D. Jakeman Lorena A. Barba J. Martins Thomas O'Leary-Roseberry AI4CE 56 0 0 21 Feb 2025
The Complexity of Learning Sparse Superposed Features with Feedback Akash Kumar 140 0 0 08 Feb 2025
Revisiting Rogers' Paradox in the Context of Human-AI Interaction K. M. Collins Umang Bhatt Ilia Sucholutsky 44 1 0 16 Jan 2025
Navigating the Maze of Explainable AI: A Systematic Approach to Evaluating Methods and Metrics Lukas Klein Carsten T. Lüth U. Schlegel Till J. Bungert Mennatallah El-Assady Paul F. Jäger XAI ELM 34 2 0 03 Jan 2025
SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation Dennis Fucci Marco Gaido Beatrice Savoldi Matteo Negri Mauro Cettolo L. Bentivogli 51 1 0 03 Nov 2024
CAT: Concept-level backdoor ATtacks for Concept Bottleneck Models Songning Lai Jiayu Yang Yu Huang Lijie Hu Tianlang Xue Zhangyi Hu Jiaxu Li Haicheng Liao Yutao Yue 26 1 0 07 Oct 2024
Synthesizing Interpretable Control Policies through Large Language Model Guided Search Carlo Bosio Mark W. Mueller 24 0 0 07 Oct 2024
Multi-Scale Grouped Prototypes for Interpretable Semantic Segmentation Hugo Porta Emanuele Dalsasso Diego Marcos D. Tuia 93 0 0 14 Sep 2024
Optimizing Neural Network Performance and Interpretability with Diophantine Equation Encoding Ronald Katende 27 0 0 11 Sep 2024
Interpretable Clustering: A Survey Lianyu Hu Mudi Jiang Junjie Dong Xinying Liu Zengyou He 26 1 0 01 Sep 2024
Explainable Artificial Intelligence: A Survey of Needs, Techniques, Applications, and Future Direction Melkamu Mersha Khang Lam Joseph Wood Ali AlShami Jugal Kalita XAI AI4TS 67 28 0 30 Aug 2024
A prototype-based model for set classification Mohammad Mohammadi Sreejita Ghosh VLM 101 1 0 25 Aug 2024
The Evolution of Reinforcement Learning in Quantitative Finance: A Survey Nikolaos Pippas Cagatay Turkay Elliot A. Ludvig AIFin 87 3 0 20 Aug 2024
Unsupervised Machine Learning Hybrid Approach Integrating Linear Programming in Loss Function: A Robust Optimization Technique Andrew Kiruluta Andreas Lemos 31 0 0 19 Aug 2024
On Behalf of the Stakeholders: Trends in NLP Model Interpretability in the Era of LLMs Nitay Calderon Roi Reichart 36 10 0 27 Jul 2024
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models Daking Rai Yilun Zhou Shi Feng Abulhair Saparov Ziyu Yao 79 19 0 02 Jul 2024
Restyling Unsupervised Concept Based Interpretable Networks with Generative Models Jayneel Parekh Quentin Bouniot Pavlo Mozharovskyi A. Newson Florence d'Alché-Buc SSL 55 1 0 01 Jul 2024
Ontology Embedding: A Survey of Methods, Applications and Resources Jiaoyan Chen Olga Mashkova Fernando Zhapa-Camacho R. Hoehndorf Yuan He Ian Horrocks 47 4 0 16 Jun 2024
Graphical Perception of Saliency-based Model Explanations Yayan Zhao Mingwei Li Matthew Berger XAI FAtt 36 2 0 11 Jun 2024
Ask LLMs Directly, "What shapes your bias?": Measuring Social Bias in Large Language Models Jisu Shin Hoyun Song Huije Lee Soyeong Jeong Jong C. Park 38 6 0 06 Jun 2024
AI with Alien Content and Alien Metasemantics H. Cappelen J. Dever 19 4 0 30 May 2024
Exploring a Multimodal Fusion-based Deep Learning Network for Detecting Facial Palsy Nicole Heng Yim Oo Min Hun Lee Jeong Hoon Lim CVBM 19 3 0 26 May 2024
Synthesizing Programmatic Reinforcement Learning Policies with Large Language Model Guided Search Max Liu Chan-Hung Yu Wei-Hsu Lee Cheng-Wei Hung Yen-Chun Chen Shao-Hua Sun 53 4 0 26 May 2024
A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns Asaf Yehudai Taelin Karidi Gabriel Stanovsky Ariel Goldstein Omri Abend 33 1 0 23 May 2024
Securing the Future of GenAI: Policy and Technology Mihai Christodorescu Craven S. Feizi Neil Zhenqiang Gong Mia Hoffmann ... Jessica Newman Emelia Probasco Yanjun Qi Khawaja Shams Turek SILM 38 3 0 21 May 2024
A Survey of Deep Learning-based Radiology Report Generation Using Multimodal Data Xinyi Wang Grazziela Figueredo Ruizhe Li W. Zhang Weitong Chen Xin Chen MedIm ViT 41 2 0 21 May 2024
The Fall of an Algorithm: Characterizing the Dynamics Toward Abandonment Nari Johnson Sanika Moharana Christina Harrington Nazanin Andalibi Hoda Heidari Motahhare Eslami 26 7 0 21 Apr 2024
Safety Implications of Explainable Artificial Intelligence in End-to-End Autonomous Driving Shahin Atakishiyev Mohammad Salameh Randy Goebel 66 6 0 18 Mar 2024
Can Interpretability Layouts Influence Human Perception of Offensive Sentences? Thiago Freitas dos Santos Nardine Osman Marco Schorlemmer 19 0 0 01 Mar 2024