
Explaining ML Models with Shapley Values

Overview

Recently it was in the news that MIT's NANDA (Networked AI Agents in Decentralized Architecture) initiative published a report finding that only five percent (5%) of AI pilot programs generate significant revenue, with the majority failing to have any meaningful impact. Buried within that report are two factors that lead to success with current technology: working with an expert vendor and automating back-office processes.

The Zybe Approach

One positive that can be drawn from the recent report is that there are now general guidelines companies can follow to make success a more likely outcome. Implementing AI/ML solutions can be done using a formulaic approach, and it doesn't require a significant cost footprint. Zybe group has had success partnering with customers to deliver effective AI/ML solutions by focusing on a few particular outcomes:

  • Identifying high-impact, low-risk back-office operations suitable for AI/ML automation.
  • Starting from a preconfigured Infrastructure as Code (IaC) cloud with a framework informed by best practices.
  • Partnering with in-house teams to streamline adoption and build out capabilities.
  • Establishing clear metrics for quantifying and measuring progress towards goals.

Despite the imaginative media landscape around artificial intelligence, we've found that the underlying mechanics of ML models can be explained simply and unambiguously. By embedding with teams to fill in knowledge gaps, we help companies build out an in-house AI/ML practice through practical procedures, metrics, and guidelines.

About Explainability

Explainability is the field of understanding and interpreting why a machine learning model makes a particular decision. For many AI/ML applications, a crucial factor in success is gaining control of the internals of a model from a mechanistic standpoint. This could be to enforce fairness or some other constraint on the model. Part of that process is understanding how a model arrived at a particular set of outcomes. Cooperative game theory provides a solution concept, the Shapley value, that is useful for gaining that understanding.

Game Theory

As stated in Wikipedia, game theory is:

the study of mathematical models of strategic interactions.

In other words, it's the study of the interplay between two or more rational parties. It has found successful application in a wide variety of fields, from economics to warfare, and now to explaining ML models.

The Shapley Value

In a game where players cooperate, the Shapley Value is a formal rule for distributing gains and losses, or attributing credit and blame, to collaborating players. In the context of the model examples below, the prediction of a model is the game and the features included in the model are the players.
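To make the rule concrete, below is a minimal sketch of a toy two-player game (not part of any library) that computes each player's Shapley value by averaging their marginal contribution over every coalition of the other players.

from itertools import combinations
from math import factorial

# A toy cooperative game: v maps each coalition of players to the value
# (payout) that coalition earns on its own.
players = ["A", "B"]
v = {
    frozenset(): 0,
    frozenset({"A"}): 10,
    frozenset({"B"}): 20,
    frozenset({"A", "B"}): 50,
}

def shapley(player):
    """Weighted average of the player's marginal contribution over all coalitions."""
    n = len(players)
    others = [p for p in players if p != player]
    total = 0.0
    for size in range(len(others) + 1):
        for coalition in combinations(others, size):
            s = frozenset(coalition)
            weight = factorial(len(s)) * factorial(n - len(s) - 1) / factorial(n)
            total += weight * (v[s | {player}] - v[s])
    return total

for p in players:
    print(p, shapley(p))  # A -> 20.0, B -> 30.0; together they sum to v({A, B}) = 50

The two values sum to the value of the full coalition, which is the same additivity property SHAP relies on when attributing a model's prediction to its features.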

SHAP (SHapley Additive exPlanations)

SHAP is a Python library for explaining the output of machine learning models. It provides sample datasets and can integrate with matplotlib to provide visualized explanations.

Simple Sentiment Analysis Example

First, ensure any needed dependencies are present in the Jupyter environment.

%pip install torch tensorflow tf_keras transformers matplotlib shap numpy scipy

In this example, a DistilBERT model (a distilled variant of BERT, Bidirectional Encoder Representations from Transformers), which is well suited to text classification, is used. A review from the IMDB dataset provided with the SHAP library is used as a sample input.

import shap
import numpy as np
import scipy as sp
from torch import tensor
from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification

model_id = "lxyuan/distilbert-base-multilingual-cased-sentiments-student"
tokenizer = DistilBertTokenizerFast.from_pretrained(model_id)
model = DistilBertForSequenceClassification.from_pretrained(model_id).cuda()

# BERT uses word-piece tokenization which can create additional tokens so
# additional parameters are specified to ensure fixed length tokenization
def f(x):
    tv = tensor([tokenizer.encode(v, padding="max_length", max_length=500, truncation=True) for v in x]).cuda()
    outputs = model(tv)[0].detach().cpu().numpy()
    # Softmax over the class logits, then return the log-odds of a single
    # class so SHAP has one scalar output per input to explain
    scores = (np.exp(outputs).T / np.exp(outputs).sum(-1)).T
    val = sp.special.logit(scores[:, 1])
    return val

# Build an explainer around the prediction function, using the tokenizer as the masker
explainer = shap.Explainer(f, tokenizer)

# Explain the first ten reviews from the IMDB training split
imdb_train = shap.datasets.imdb()[0]
shap_values = explainer(imdb_train[:10], fixed_context=1)

# Visualize the explanation for the third review
shap.plots.text(shap_values[2], display=True)
shap.plots.waterfall(shap_values[2])

Using SHAP's text plot we're able to see how tokens overlay on top of the text, along with the importance of those tokens. Red regions increase the output of the model while blue regions decrease it; together they give the overall sentiment.

[SHAP text plot of the sample review: base value -1.359, f(inputs) = -1.235. The strongest positive attributions include "never" (+0.146), "ppo" (+0.071), and "lack" (+0.061); the strongest negative attribution is "disa" (-0.174).]

A waterfall plot also helps to visualize which tokens had the most influence on the model. In this sample we can see how the model broke apart the word "disappointed", and that the tokens comprising this word had an outsized impact on the output. [Waterfall plot of the per-token SHAP values for the sample review]
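The same information can also be pulled out of the Explanation object programmatically. Below is a minimal sketch, assuming the shap_values computed above and that each explanation exposes its tokens via .data and its attributions via .values, as in recent SHAP releases.

# List the tokens with the largest absolute attributions for the review
# explained above (assumes shap_values from the previous block).
explanation = shap_values[2]
order = np.argsort(-np.abs(explanation.values))
for idx in order[:10]:
    print(f"{explanation.data[idx]!r:>15}  {explanation.values[idx]:+.3f}")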

Next-Token Selection Example

Another simple way to glance at the internals of a model is to have it complete a sentence and then look at the other possible outcomes it considered. In this example we're asking GPT-2 to complete a common English idiom.

import shap
from transformers import AutoModelForCausalLM, AutoTokenizer

text = "costs an arm and a"

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
shap_model = shap.models.TopKLM(model, tokenizer, k=5)
masker = shap.maskers.Text(tokenizer)
explainer = shap.Explainer(shap_model, masker)
shap_values = explainer([text])

shap.plots.text(shap_values, display=True)


[SHAP text plot: the top-5 next-token candidates are "leg", "hand", "half", "foot", and "shoulder". For the "leg" output (base value -11.73, f(inputs) = 3.03), the per-token attributions are: cost +0.329, s -0.803, an +0.333, arm +6.41, and +1.876, a +6.619.]

Here we can see that "arm" was attributed much more importance than the other words in the input sentence. Given this, we can begin to understand why all of the other potential responses are limb-related body parts.
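As a sanity check, the candidates SHAP reports can be compared against the model's own next-token distribution. Below is a minimal sketch that reuses the model, tokenizer, and text defined above.

import torch

# Read the top next-token candidates directly from GPT-2's output distribution
# (reuses model, tokenizer, and text from the block above).
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits            # shape: (1, sequence_length, vocab_size)
probs = logits[0, -1].softmax(dim=-1)          # distribution over the next token
top = torch.topk(probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item())!r:>10}  {prob.item():.3f}")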

In Practice

In order to be useful, an explanation needs to be contrasted with some kind of baseline. In the context of machine learning models, a baseline is essentially what the model would predict if it had no information about the input features. A common baseline is the average prediction across the dataset. Imagining a hypothetical model that predicts whether an image is a cat, the baseline would be how often the model predicts "cat" in general. With SHAP, the contribution of each feature is calculated by looking at the change from the baseline as features are added.
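SHAP's attributions are additive by construction: the baseline plus the sum of the per-feature contributions recovers the model's output for that input. Below is a minimal sketch that reuses the shap_values computed in the sentiment example above.

# Verify the additivity property for one explained review
# (assumes shap_values from the sentiment analysis example).
explanation = shap_values[2]
print("baseline (base value):", explanation.base_values)
print("sum of attributions  :", explanation.values.sum())
print("reconstructed output :", explanation.base_values + explanation.values.sum())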

After establishing a SHAP baseline, pre- and post-training model bias metrics can be collected to inform what actions should be taken to adjust the behavior of the model. This could mean removing or refactoring features, or performing some procedure on the training data.
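As an illustration of the kind of metric involved, the sketch below computes a pre-training class-imbalance measure and a post-training difference in positive prediction rates. The arrays are made up for the example, and the metric definitions follow common usage rather than any particular library.

import numpy as np

# Hypothetical data: a binary sensitive attribute (facet), observed labels,
# and model predictions. All values are made up for illustration.
facet = np.array([0, 0, 0, 1, 1, 1, 1, 1])
labels = np.array([1, 0, 1, 0, 0, 1, 0, 0])
predictions = np.array([1, 1, 1, 0, 0, 1, 0, 1])

# Pre-training metrics: class imbalance between the facet groups, and the
# difference in positive label proportions between them.
n_a, n_b = (facet == 0).sum(), (facet == 1).sum()
class_imbalance = (n_a - n_b) / (n_a + n_b)
dpl = labels[facet == 0].mean() - labels[facet == 1].mean()

# Post-training metric: difference in positive prediction rates between the groups.
dppl = predictions[facet == 0].mean() - predictions[facet == 1].mean()

print(f"class imbalance                    : {class_imbalance:+.2f}")
print(f"difference in positive labels      : {dpl:+.2f}")
print(f"difference in positive predictions : {dppl:+.2f}")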

How does this help?

On a scale of complexity, models have a range of effectiveness. If they're too simplistic they underperform in general, and if they're too complex they perform well on training data but poorly on new data not seen during training. By establishing a track record of explanations and analyzing how they differ from the baseline, underfitting and overfitting, as well as other kinds of bias, can be detected and avoided (or, if desired, enhanced). Other outliers could indicate problems with the training data or the implementation of a feature.
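One practical way to build that track record is to compare global attribution strength across datasets. Below is a minimal sketch, assuming the explainer and imdb_train from the sentiment example and a hypothetical list of held-out reviews named imdb_holdout.

import numpy as np

def mean_abs_attribution(explanations):
    # Average absolute per-token attribution for each explained example
    return np.array([np.abs(explanations[i].values).mean()
                     for i in range(len(explanations))])

# Explain a handful of training and held-out reviews and compare how strongly
# the model leans on individual tokens in each set (imdb_holdout is hypothetical).
train_strength = mean_abs_attribution(explainer(imdb_train[:10], fixed_context=1))
holdout_strength = mean_abs_attribution(explainer(imdb_holdout[:10], fixed_context=1))

print("mean |SHAP| per token, training :", train_strength.mean())
print("mean |SHAP| per token, held-out :", holdout_strength.mean())

A large gap between the two sets is a signal worth investigating before trusting the model on new data.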

Further Reading

  • Amazon's SageMaker Clarify provides a comprehensive framework for evaluating models and explaining model predictions, allowing developers to spend more time analyzing and adjusting their models, rather than fighting with Jupyter environments and pipelines.
  • The SageMaker Developer Guide contains a comprehensive overview of how explainability and model analysis are implemented within the AWS ecosystem.
  • A blog post about Compliance and Generative AI written by Ryan, one of our engineers.

Want to Talk?

I'm always working to better understand the needs and challenges faced by industry leaders. If you'd like to have a conversation about what you've been seeing in your space, or have questions about how Zybe approaches machine learning and artificial intelligence in general, please reach out to me on LinkedIn or my email below.

Author: patrick@zybe.group

Created: 2025-10-14 Tue 14:10