Here at Synapse, we’re committed to building products that are not only great but also fair and transparent. To that end, we’re working hard to ensure our classification models are free of social biases and favor inclusion over exclusion.
It goes without saying that ethical questions that are difficult for human beings to reach consensus on will not be easily solved by machine learning models. The difficulty involved, however, is not a reason to shy away from pursuing technical solutions. In fact, this difficulty indicates just how pressing and essential it is that more research and more direct applications are developed.
While our solutions may not always be perfect, being transparent about what our models do and why they do it is a necessary first step. In the process, we can help to simplify complexity and to account for everything we build.
In future posts, we’ll go into more detail about the technical approaches we are implementing to accomplish this goal. In the meantime, this post provides a brief overview of some recent libraries and theories we feel are particularly promising.
Christoph Molnar’s online book provides a useful overview of current approaches for explaining black box machine learning models. A number of libraries — most of which are open-source — have also been released over the last few years and tend to have relatively straightforward implementations for explaining the predictions of already trained models.
LIME (Local Interpretable Model-Agnostic Explanations) is probably the most widely used interpretability library. As the name suggests, it is model-agnostic, meaning it can be used on anything from a polynomial regression model to a deep neural network.
LIME’s approach is to perturb the features of a single prediction instance (essentially zeroing out a subset of them) and then to observe how the model’s output changes. By running this process repeatedly, LIME fits a simple linear model around that one instance, and the weights of that local model indicate each feature’s predictive importance (e.g. which pixels contributed the most to the classification of a specific image). The associated paper provides a more rigorous discussion of the approach, though this post by the authors is probably the best place to start.
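To make the perturb-and-fit idea concrete, here is a minimal sketch in plain Python. It is not the LIME library’s API or its exact algorithm: the function names, the binary masking scheme, the exponential proximity kernel, and the `toy_model` stand-in for a trained black box are all assumptions we chose for illustration.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augment the matrix so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def explain_locally(predict, instance, n_samples=500, kernel_width=0.75, seed=0):
    """Return one local linear weight per feature for a single prediction."""
    rng = random.Random(seed)
    d = len(instance)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        # Randomly keep (1) or zero out (0) each feature of the instance.
        mask = [rng.randint(0, 1) for _ in range(d)]
        perturbed = [v if keep else 0.0 for v, keep in zip(instance, mask)]
        # Samples that keep more of the original features get a larger weight.
        distance = math.sqrt(sum(1 - keep for keep in mask) / d)
        weights.append(math.exp(-(distance ** 2) / kernel_width ** 2))
        rows.append([1.0] + [float(keep) for keep in mask])  # leading 1.0 is the intercept
        targets.append(predict(perturbed))
    # Weighted least squares via the normal equations (X^T W X) beta = X^T W y.
    k = d + 1
    xtwx = [
        [sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(k)]
        for i in range(k)
    ]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets)) for i in range(k)]
    beta = solve_linear_system(xtwx, xtwy)
    return beta[1:]  # drop the intercept, keep one weight per feature


# Hypothetical stand-in for a trained black-box model.
def toy_model(x):
    return 3.0 * x[0] - 2.0 * x[1]  # the third feature is ignored on purpose


local_weights = explain_locally(toy_model, [1.0, 1.0, 1.0])
```

Because the toy model is itself linear, the surrogate should recover its coefficients almost exactly. For a real nonlinear model the weights only describe behavior near the chosen instance.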
Skater, another open-source library, incorporates LIME’s local interpretations, while also including global explanations. For instance, there is built-in functionality for producing marginal plots (showing the relationship between different pairs of variables) and partial dependence plots (showing the relationship between each variable and the model output).
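As a rough illustration of what a partial dependence plot computes, the sketch below forces one feature to each value on a grid and averages the model’s output over the dataset. This is not Skater’s API: `partial_dependence`, `toy_model`, and the sample rows are hypothetical names and data we made up for the example.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average model output when one feature is forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # override only the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


# Hypothetical stand-in for a trained model.
def toy_model(x):
    return 2.0 * x[0] + x[1] ** 2


rows = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
curve = partial_dependence(toy_model, rows, feature_index=0, grid=[0.0, 1.0, 2.0])
```

Plotting `curve` against the grid gives the partial dependence plot for the first feature. Here it rises by 2 per unit, matching that feature’s coefficient in the toy model.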
H2O, an open-source machine learning platform, includes various methods for model interpretability along with its more general ML implementations. The authors provide examples and documentation for approaches such as decision tree surrogate models, sensitivity analysis, and monotonic constraints (useful for instances in which you want to ensure that increasing a specific variable only ever increases, or only ever decreases, the model’s output). They’ve also written a post describing different interpretability methods in depth.
SHAP (SHapley Additive exPlanations) unifies multiple different interpretability methods (including LIME) into a single approach. It does this by mathematically defining a class of additive feature attribution methods and demonstrating that six different interpretability methods currently in use fall within this class. See the associated paper for more details.
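The “additive” part is easiest to see with exact Shapley values on a tiny model. The sketch below is a brute-force calculation, not the SHAP library’s API, and it is only practical for a handful of features. We assume that a “missing” feature is replaced by a fixed baseline value; that choice, along with `shapley_values` and `toy_model`, is ours for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attributions; features outside a coalition take baseline values."""
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    attributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi += weight * (value(s | {i}) - value(s))
        attributions.append(phi)
    return attributions


# Hypothetical stand-in for a trained model, with one interaction term.
def toy_model(x):
    return 2.0 * x[0] + 3.0 * x[1] + x[0] * x[2]


instance = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
```

The attributions sum to the difference between the model’s output on the instance and on the baseline, which is the additivity property that gives this class of methods its name. The credit for the interaction term is split evenly between the two features involved.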
Bayesian Deep Learning
Bayesian deep learning has emerged as another way to gain more insight into black box models. Rather than explaining individual feature importance for predictions, a Bayesian approach enables one to measure how confident a deep learning model is about its predictions. This is useful on multiple fronts, but one particularly beneficial result is that predictions that are output with a high degree of uncertainty can be set aside for closer manual analysis by a human being.
A number of probabilistic programming languages have been released over the last few years, starting with Stan back in 2012. Since then there’s been PyMC3 (running Theano on the backend), Edward (running on TensorFlow), and Pyro (released just last November by Uber AI Labs and running on PyTorch).
Pyro was specifically designed for deep learning applications, and the Edward documentation provides a number of tutorials and videos for Bayesian deep learning, including an example of how to use dropout to approximate Bayesian probabilities. A key paper originally published back in 2015 demonstrates that dropout, a standard method of regularizing deep learning models to prevent overfitting, can be interpreted as approximate Bayesian inference in a deep Gaussian process and hence can be used to measure model confidence.
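In practice, using dropout to gauge confidence usually means leaving dropout switched on at prediction time, running the same input through the network many times, and looking at how much the outputs vary. The sketch below shows only that sampling step on a tiny hand-written network. The weights, the drop rate, and the review threshold are made-up values, and it does not reproduce the derivation in the paper or the API of any of the libraries above.

```python
import random
import statistics


def forward_with_dropout(x, hidden_weights, output_weights, drop_rate, rng):
    """One stochastic forward pass through a tiny one-hidden-layer ReLU network."""
    keep = 1.0 - drop_rate
    output = 0.0
    for unit_weights, out_w in zip(hidden_weights, output_weights):
        activation = max(0.0, sum(w * v for w, v in zip(unit_weights, x)))
        if rng.random() < keep:
            # Inverted dropout: rescale kept units so the expected output is unchanged.
            output += out_w * activation / keep
    return output


def mc_dropout_predict(x, hidden_weights, output_weights,
                       drop_rate=0.5, n_passes=200, seed=0):
    """Mean and standard deviation of predictions over repeated dropout passes."""
    rng = random.Random(seed)
    samples = [
        forward_with_dropout(x, hidden_weights, output_weights, drop_rate, rng)
        for _ in range(n_passes)
    ]
    return statistics.mean(samples), statistics.stdev(samples)


# Hypothetical fixed weights standing in for a trained network.
hidden_weights = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4], [0.7, 0.1]]
output_weights = [1.0, -0.5, 0.25, 0.75]

mean, spread = mc_dropout_predict([1.0, 2.0], hidden_weights, output_weights)
needs_review = spread > 0.5  # hypothetical threshold for routing to a human reviewer
```

A large `spread` relative to the mean suggests the model is unsure about that input, which is the signal one could use to set a prediction aside for manual analysis.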
Fair Machine Learning
Annual conferences such as FAT* (Conference on Fairness, Accountability, and Transparency) have helped bring increasing attention to the need to build more equitable models, while also drawing scholars, researchers, and practitioners from different fields into conversation. A number of open-source fairness libraries have also been released in recent years, though most of them are still in the early stages.
Although there is no consensus yet on what measures are most conducive to producing fair outcomes (and of course, much depends on how we define fairness in the first place), a number of compelling criteria and definitions have been proposed. Fairness Measures, Reflections on Quantitative Fairness, and Attacking Discrimination with Smarter Machine Learning all provide useful resources for beginning to think through how best to approach the difficult but essential question of how to build fair models.
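To give a flavor of what such criteria look like in code, here is a small sketch of two commonly discussed ones: demographic parity (do groups receive positive predictions at the same rate?) and equal opportunity (do truly positive members of each group get approved at the same rate?). These are our choice of examples, not necessarily the definitions used in the resources above, and the data and group labels are made up.

```python
def selection_rate(predictions, groups, group):
    """Share of members of `group` that received a positive prediction."""
    members = [p for p, g in zip(predictions, groups) if g == group]
    return sum(members) / len(members)


def true_positive_rate(labels, predictions, groups, group):
    """Share of truly positive members of `group` that were predicted positive."""
    hits = [p for y, p, g in zip(labels, predictions, groups) if g == group and y == 1]
    return sum(hits) / len(hits)


def demographic_parity_gap(predictions, groups, a, b):
    """Absolute difference in positive-prediction rates between two groups."""
    return abs(selection_rate(predictions, groups, a) - selection_rate(predictions, groups, b))


def equal_opportunity_gap(labels, predictions, groups, a, b):
    """Absolute difference in true positive rates between two groups."""
    return abs(
        true_positive_rate(labels, predictions, groups, a)
        - true_positive_rate(labels, predictions, groups, b)
    )


# Made-up toy data: 1 means a positive label or prediction; "a" and "b" are placeholders.
labels = [1, 1, 0, 0, 1, 1, 0, 0]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

parity_gap = demographic_parity_gap(predictions, groups, "a", "b")
opportunity_gap = equal_opportunity_gap(labels, predictions, groups, "a", "b")
```

A gap of zero means the two groups are treated identically under that criterion. The two criteria can disagree with each other, which is part of why choosing a definition of fairness matters so much.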
When we speak of “bias” in machine learning we are usually referring to the mathematical assumptions built into the parameters of a model. It is becoming increasingly urgent, however, that we also consider the other definitions of “bias,” and with them, all the ways our models affect actual human beings, their lives as well as their livelihoods.
As a banking platform, we are confident that optimizing our models for transparency and interpretability is not only the right thing to do but will also lead to better, more inclusive products.