FAT Forensic Events

Hands-on Tutorial on Explainable ML with FAT Forensics



What and How of Machine Learning Transparency

Building Bespoke Explainability Tools with Interoperable Algorithmic Components

An online hands-on tutorial at ECML-PKDD 2020
Where: Virtual, through the ECML-PKDD 2020 conference, Ghent, Belgium.
When: Friday, September 18th, 2020.

We strongly suggest participants prepare for the hands-on exercises beforehand. Please have a look at the slides corresponding to the “hands-on session preparation” section of Part 2 of the tutorial and the notebooks page. They explain different ways to participate:

  • installing the Python package and downloading the Jupyter Notebooks on your own machine;
  • executing the notebooks online via Google Colab (a Google account is required); and
  • running the notebooks via My Binder directly in the browser.

About the Tutorial

In this hands-on tutorial we:

In particular, we focus on popular surrogate explainers such as Local Interpretable Model-agnostic Explanations (LIME) [1]. They are model-agnostic, post-hoc and compatible with a diverse range of data types – image, text and tabular – making them a popular choice for explaining black-box predictions. We embrace this ubiquity and teach participants how to improve upon such off-the-shelf solutions by taking advantage of the aforementioned modularity. Specifically, we demonstrate how to harness it to compose a suite of bespoke transparency forensic tools for black-box models and their predictions.

   Our effort to build custom explainers from interoperable algorithmic modules is grounded in a theoretical introduction followed by interactive coding exercises. This structure supports two distinct goals: research & development as well as deployment of such tools. The hands-on part of the tutorial, therefore, walks the attendees through building and validating their own, tailor-made, transparency techniques in the context of surrogate explainers designed for tabular data. For example, it demonstrates how choosing their three core modules [2] – interpretable data representation, data sampling and explanation generation – influences the type and quality of the resulting explanations of black-box predictions. These programming exercises are delivered using FAT Forensics [3, 4] – a Python toolbox open-sourced under the BSD 3-Clause license and designed for inspecting Fairness, Accountability and Transparency (FAT) aspects of data, models and predictions.
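To make the three core modules of a tabular surrogate explainer more concrete, the sketch below wires them together with NumPy only. It is an illustrative assumption rather than the FAT Forensics API: the toy `black_box`, the fixed `bin_edges`, the Gaussian sampler, the exponential kernel and all function names are hypothetical choices made for this example.

```python
import numpy as np


def black_box(X):
    """Hypothetical opaque model: its output depends on feature 0 only."""
    return 1.0 / (1.0 + np.exp(-8.0 * X[:, 0]))


def sample_around(instance, n_samples, scale, rng):
    """Data sampling module: Gaussian perturbations centred on the instance."""
    return instance + rng.normal(0.0, scale, size=(n_samples, instance.shape[0]))


def to_interpretable(X, instance, bin_edges):
    """Interpretable representation module: 1 if a sample shares the instance's bin."""
    binary = np.empty(X.shape, dtype=float)
    for j, edges in enumerate(bin_edges):
        binary[:, j] = np.digitize(X[:, j], edges) == np.digitize(instance[j], edges)
    return binary


def fit_linear_surrogate(Z, y, weights):
    """Explanation generation module: weighted least squares on the binary features."""
    design = np.column_stack([np.ones(len(Z)), Z])  # intercept + binary features
    root_w = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return coef[1:]  # drop the intercept; the rest are feature importances


def explain(instance, bin_edges, n_samples=2000, scale=1.0, kernel_width=1.0, seed=0):
    """Compose the three modules into one surrogate explainer."""
    rng = np.random.default_rng(seed)
    X = sample_around(instance, n_samples, scale, rng)
    Z = to_interpretable(X, instance, bin_edges)
    distances = np.linalg.norm(X - instance, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)  # closer samples count more
    return fit_linear_surrogate(Z, black_box(X), weights)


# Assumed bin edges for both features; the instance falls into the (0, 1) bin.
edges = [[-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0]]
importances = explain(np.array([0.5, 0.5]), bin_edges=edges)
```

Swapping any single function, for instance a different sampler or a different interpretable representation, changes the explanation without touching the other two modules, which is the kind of trade-off the hands-on exercises explore.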


To reference this tutorial please use:

  author       = {Sokol, Kacper and
                  Hepburn, Alexander and
                  Santos-Rodriguez, Raul and
                  Flach, Peter},
  title        = {{W}hat and How of Machine Learning Transparency:
                  {B}uilding Bespoke Explainability Tools with
                  Interoperable Algorithmic Components},
  month        = sep,
  year         = 2020,
  publisher    = {Zenodo},
  version      = {2020ecmlpkdd},
  doi          = {10.5281/zenodo.4035128},
  url          = {https://doi.org/10.5281/zenodo.4035128}


Many public-domain implementations of transparency, interpretability and explainability (TIE) algorithms are a result of academic research projects. As such, the development of these tools usually starts with a particular research goal in mind, e.g., demonstrating the capabilities of a proposed method. In practice, this often means that the software is engineered without an elaborate design, which tends to be detrimental to its reusability and modularity. Moreover, such tools are usually aimed at a general audience – making them easy to use for non-experts – which requires hiding all of their complexity away from the user. For example, this can be achieved by providing a TIE functionality as an end-to-end “product” and only allowing its customisation through designated parameters exposed via an accessible Application Programming Interface (API). However, more advanced users such as researchers and developers – who want to build on top of these implementations or deploy them in a custom system – may need to resort to tinkering with the source code, which can be frustrating and time-consuming. While creating strictly modular TIE tools may be desirable, this approach comes at the expense of a more complex design and an API that is not necessarily suitable for a lay audience, thus limiting the reach and appeal of such software.

   In this tutorial, we address the challenges highlighted above and provide participants with knowledge and hands-on skills to help them:

To this end, we demonstrate how a modular software design – bundling all of the reusable TIE algorithmic components in a single package under a standardised API – combines the best of both worlds and caters to technical and casual users alike. By offering access to a low-level API of TIE building blocks, we supply a collection of algorithms that can be used by technical users – researchers and developers – to compose their own bespoke explainers. Additionally, we use these components to build popular and easy-to-use TIE algorithms as part of the API targeted at a lay audience. This modular software design is embodied in an open source Python package – FAT Forensics – which we released to improve and advance reproducibility of TIE algorithms and provide native access to their vital components.

   By participating in this tutorial, academics can gain insights into experimenting with the algorithmic design and building blocks of state-of-the-art explainers – a perspective that is uncommon among other educational resources in this space, which simply show how to apply such tools. Attendees from industry, on the other hand, can learn to build and tune bespoke explainers of black-box machine learning models and their predictions to meet their business needs, e.g., improve transparency of deployed predictive models or extract important insights for relevant stakeholders. The tutorial also briefly discusses best practices of software engineering for machine learning research and highlights benefits of modular design, both of which contribute to sustainable and reproducible research in the field. Our hands-on approach provides participants with first-hand experience, leading to a better understanding of how TIE algorithms operate and how to avoid possible pitfalls of using off-the-shelf solutions.

FAT Forensics (Software)

To support the goals of our hands-on tutorial, we employ FAT Forensics – an open source Python package that can inspect selected fairness, accountability and transparency aspects of data (and their features), models and predictions. The toolbox spans all of the FAT domains because many of them share underlying algorithmic components that can be reused in multiple different implementations, often across the FAT borders. This interoperability allows, for example, a counterfactual data point generator to be used as a post-hoc explainer of black-box predictions on one hand, and as an individual fairness (disparate treatment) inspection tool on the other. The modular architecture [3, 4] enables FAT Forensics to deliver robust and tested low-level FAT building blocks as well as a collection of FAT tools built on top of them. Users can choose from these ready-made tools or, alternatively, combine the available building blocks to create their own bespoke algorithms without needing to modify the code base.
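The reuse of a counterfactual generator across the transparency and fairness domains can be illustrated with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in (the `toy_model`, the feature names and the brute-force `counterfactuals` search); it does not reproduce the FAT Forensics implementation.

```python
def toy_model(row):
    """Hypothetical black box: approves (1) high income, or medium income in group 'a'."""
    if row["income"] == "high":
        return 1
    return int(row["income"] == "medium" and row["group"] == "a")


def counterfactuals(model, instance, domains, features=None):
    """Return the single-feature changes that flip the model's prediction."""
    original = model(instance)
    found = []
    for name in (features or domains):
        for value in domains[name]:
            if value == instance[name]:
                continue  # not a change
            candidate = {**instance, name: value}
            if model(candidate) != original:
                found.append((name, value))
    return found


domains = {"income": ["low", "medium", "high"], "group": ["a", "b"]}
applicant = {"income": "medium", "group": "b"}

# Transparency use: which changes would alter the decision for this applicant?
explanation = counterfactuals(toy_model, applicant, domains)
# -> [("income", "high"), ("group", "a")]

# Fairness use: does changing only the protected attribute alter the decision?
disparate_treatment = counterfactuals(toy_model, applicant, domains, features=["group"])
# -> [("group", "a")]
```

The same building block serves both purposes; only the set of features that may be altered differs between the two calls.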

   The modular design of the toolbox also decouples a FAT tool from its presentation medium, thus enabling two distinct “modes of operation”. In the research mode (data in – visualisations out), the tool can be loaded into an interactive Python session, e.g., a Jupyter Notebook, supporting rapid prototyping and exploratory analysis. This mode is intended for FAT researchers who can use it to propose new fairness metrics, compare them with the existing ones or use them to inspect a new system or data set. The deployment mode (data in – data out), on the other hand, can be used to incorporate FAT functionality into a data processing pipeline to provide (numerical) analytics or become the foundation of any kind of automated reporting and dashboarding. This mode is intended for machine learning engineers and data scientists who may use it to monitor or evaluate a predictive system during its development and deployment.
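A small sketch can show what decoupling a tool from its presentation medium looks like in practice. The functions below are assumptions made for illustration (a simple ablation score and a text bar chart), not part of FAT Forensics: one returns plain data suitable for a pipeline, the other turns that same data into something a person can read in an interactive session.

```python
def ablation_importance(model, instance, baseline):
    """Data in, data out: prediction change when each feature is reset to a baseline."""
    reference = model(instance)
    scores = {}
    for name in instance:
        ablated = {**instance, name: baseline[name]}
        scores[name] = reference - model(ablated)
    return scores


def render_bars(scores, width=20):
    """Data in, visualisation out: a text bar chart built from the same scores."""
    largest = max(abs(value) for value in scores.values()) or 1.0
    lines = []
    for name, value in sorted(scores.items(), key=lambda item: -abs(item[1])):
        bar = "#" * round(width * abs(value) / largest)
        sign = "+" if value >= 0 else "-"
        lines.append(f"{name:>8} {sign} {bar}")
    return "\n".join(lines)


def linear_model(row):
    """Hypothetical stand-in for a black-box model."""
    return 2.0 * row["x1"] - 1.0 * row["x2"]


instance = {"x1": 1.0, "x2": 1.0}
baseline = {"x1": 0.0, "x2": 0.0}

scores = ablation_importance(linear_model, instance, baseline)  # deployment-style output
chart = render_bars(scores)                                     # research-style output
```

Because the numerical result never depends on how it is displayed, the same `scores` dictionary can feed a monitoring dashboard or be printed as `chart` in a notebook.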

   FAT Forensics is published under the BSD 3-Clause open source licence, which permits commercial applications. The toolbox has been built with the best software engineering practices in mind to ensure its longevity, sustainability, extensibility and streamlined maintenance. The development workflow established around the package allows for its adoption into new and existing projects, and provides an easy way for the community to contribute novel FAT algorithms. Finally, the documentation [3] is carefully crafted to cater to a wide range of users and applications, and consists of:

Schedule and Resources

The tutorial lasts for 4 hours, including a 30-minute break. The first part – 1 hour and 15 minutes – introduces popular transparency, explainability and interpretability approaches. It focuses on surrogate explainers of tabular data, discussing their pros, cons and modularisation. The next part – 30 minutes in session and a 30-minute break – presents the software underpinning the tutorial. It begins with a 15-minute introduction to FAT Forensics, which covers its algorithmic design and available implementations. The following 15 minutes are used to show participants how to set up the package in preparation for the hands-on session. It involves installing any dependencies as well as downloading required data sets and Jupyter Notebooks for attendees who could not complete these tasks beforehand. Next, we take a 30-minute break during which we help participants to resolve any technical difficulties and setup issues. The final part – 1 hour and 45 minutes – is devoted to hands-on exercises:

Tutorial recordings can be accessed on YouTube.

A more detailed outline of the tutorial is presented below.

Part 1: Identifying Modules of Black-box Explainers

Introduction to modular machine learning transparency, explainability and interpretability through a lens of surrogate explainers for tabular data – 1 hour and 15 minutes in total.

| Time (Duration) | Activities | Instructor | Resources |
| --- | --- | --- | --- |
| 2.00pm CEST (15 minutes) | Background and motivation of research on modular explainers. | Peter Flach | recording |
| 2.15pm CEST (60 minutes) | What and how of modular interpretability: a case study of bespoke surrogate explainers for tabular data. | Kacper Sokol | recording |

Part 2: Getting to Know FAT Forensics

Introduction to hands-on machine learning interpretability with FAT Forensics in preparation for the hands-on exercises – 30 minutes in session and a 30-minute break.

| Time (Duration) | Activities | Instructor | Resources |
| --- | --- | --- | --- |
| 3.15pm CEST (15 minutes) | Introduction to open source interpretability tools using the example of FAT Forensics – promises and perils of modular research software. | Alex Hepburn | recording |
| 3.30pm CEST (15 minutes) | Hands-on session preparation. Setting up the package on a personal machine and experimenting with it online: My Binder and Google Colab. Overview of FAT Forensics' API documentation, online tutorials and how-to guides. | Alex Hepburn | recording |
| 3.45pm CEST (30 minutes) | Break. (An opportunity to resolve any individual issues with the software setup encountered by participants.) | Kacper Sokol & Alex Hepburn | |

Part 3: Building Bespoke Surrogate Explainers (Hands-on)

Hands-on transparency with bespoke surrogate explainers for tabular data built from interoperable algorithmic modules – 1 hour and 45 minutes in total.

| Time (Duration) | Activities | Instructor | Resources |
| --- | --- | --- | --- |
| 4.15pm CEST (15 minutes) | Introduction to the hands-on resources and overview of the Jupyter Notebooks. | Alex Hepburn | recording |
| 4.30pm CEST (80 minutes) | Active participation facilitated by the instructors: bring your own data. Building bespoke surrogate explainers of black-box predictions for tabular data – trade-offs associated with choosing particular algorithmic components when building surrogate explainers [2]. | Kacper Sokol & Alex Hepburn | |
| 5.50pm CEST (10 minutes) | Summary and farewell. | Raul Santos-Rodriguez | recording |


Kacper Sokol

Kacper is a final-year PhD student and research associate at the University of Bristol. His main research focus is transparency – interpretability and explainability – of machine learning systems. In particular, he has done work on enhancing transparency of logical predictive models (and their ensembles) with counterfactual explanations. Kacper is the designer and lead developer of the FAT Forensics package.


Alexander Hepburn

Alex is a third-year PhD student at the University of Bristol. His research is based on including user-defined prior information in loss functions to improve human perceptual systems and cost-sensitive learning, mainly applied to deep learning. Alex is a core developer of the FAT Forensics package.


Raul Santos-Rodriguez

Raul is a Senior Lecturer in Data Science and Intelligent Systems in the Department of Engineering Mathematics at the University of Bristol. His research interests lie in data science, machine learning, artificial intelligence and their applications to signal processing, particularly healthcare, remote sensing and music information retrieval.


Peter Flach

Peter is a Professor of Artificial Intelligence at the University of Bristol. His research interests include mining highly structured data, the evaluation and improvement of machine learning models, and human-centred AI. Peter recently stepped down as Editor-in-Chief of the Machine Learning journal, is President of the European Association for Data Science, and has published several books including “Machine Learning: The Art and Science of Algorithms that Make Sense of Data” (Cambridge University Press, 2012).



  1. Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1135–1144. https://dl.acm.org/doi/10.1145/2939672.2939778 

  2. Kacper Sokol, Alexander Hepburn, Raul Santos-Rodriguez, and Peter Flach. 2019. bLIMEy: Surrogate Prediction Explanations Beyond LIME. 2019 Workshop on Human-Centric Machine Learning (HCML 2019) at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada (2019). https://arxiv.org/abs/1910.13016

  3. Kacper Sokol, Alexander Hepburn, Rafael Poyiadzi, Matthew Clifford, Raul Santos-Rodriguez, and Peter Flach. 2020. FAT Forensics: A Python Toolbox for Implementing and Deploying Fairness, Accountability and Transparency Algorithms in Predictive Systems. Journal of Open Source Software, 5(49), p.1904. https://joss.theoj.org/papers/10.21105/joss.01904

  4. Kacper Sokol, Raul Santos-Rodriguez, and Peter Flach. 2019. FAT Forensics: A Python Toolbox for Algorithmic Fairness, Accountability and Transparency. arXiv preprint arXiv:1909.05167. https://arxiv.org/abs/1909.05167