Case Importance#

Objectives: what you will take away#

  • Definitions and an understanding of the case importance metrics and the situations in which they can be retrieved.

Prerequisites: before you begin#

Data#

Our example dataset for this recipe is the well-known Adult dataset. It is accessible via the pmlb package installed earlier. We use the fetch_data() function to retrieve the dataset in the Complete Code section below.

Concepts & Terminology#

How-To Guide#

Case importance is similar to feature importance in that it comprises two metrics: Accuracy Contributions for Case and Prediction Contributions for Case. Unlike feature importance metrics, which can be computed globally, case contributions are calculated only locally. Conceptually, local metrics use either a specific subset of the cases that are trained into the Trainee or a set of new cases.
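To make the idea of a local, per-case contribution more concrete, the sketch below uses a toy leave-one-out calculation on a tiny nearest-neighbor predictor. It is an illustration only and is not the Howso Engine's algorithm; the functions knn_predict and case_prediction_contributions and the sample data are hypothetical names introduced for this example.

```python
# Illustrative only: a toy leave-one-out view of a "prediction contribution".
# This is NOT the Howso Engine's algorithm; all names here are hypothetical.


def knn_predict(cases, query, k=3):
    """Average the targets of the k cases nearest to `query` (single feature)."""
    nearest = sorted(cases, key=lambda case: abs(case[0] - query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def case_prediction_contributions(cases, query, k=3):
    """For each case, measure how the prediction changes when that case is removed."""
    baseline = knn_predict(cases, query, k)
    contributions = []
    for i in range(len(cases)):
        without_case = cases[:i] + cases[i + 1:]
        contributions.append(baseline - knn_predict(without_case, query, k))
    return contributions


# (feature, target) pairs standing in for trained cases
trained_cases = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (10.0, 100.0)]
contributions = case_prediction_contributions(trained_cases, query=2.0, k=2)
print(contributions)
```

In this toy setup, only the cases close to the query change the prediction when removed, while distant cases contribute nothing. That dependence on the specific case being predicted is what makes the metric local.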

Setup#

This user guide assumes you have created and set up a Trainee as demonstrated in the basic workflow guide. The Trainee will be referenced as trainee in the sections below.

Case Contributions#

Case Prediction Contributions can be retrieved by setting case_robust_prediction_contributions or case_full_prediction_contributions to True.

details = {'case_robust_prediction_contributions': True}

Case Accuracy Contributions#

Case Accuracy Contributions can be retrieved by setting case_robust_accuracy_contributions or case_full_accuracy_contributions to True.

details = {'case_robust_accuracy_contributions': True}
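If both metrics are wanted, the two flags can presumably be combined in a single details dictionary. The helper below is a hypothetical convenience function, not part of the Howso API; only the four key names come from this guide, and requesting both metrics in one react call is an assumption.

```python
def build_case_details(prediction=True, accuracy=True, robust=True):
    """Build a details dict requesting case-level contributions.

    Hypothetical helper; only the key names come from this guide.
    """
    mode = "robust" if robust else "full"
    details = {}
    if prediction:
        details[f"case_{mode}_prediction_contributions"] = True
    if accuracy:
        details[f"case_{mode}_accuracy_contributions"] = True
    return details


# Request both robust metrics at once
details = build_case_details()
print(details)
```

Building the dictionary in one place keeps the robust and full variants from being mixed by accident.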

React#

Since case importance is a local metric, cases or case indices must be provided along with an action feature. The details dictionary defined above is passed to react to request the contributions.

results = trainee.react(
    test_case[context_features],
    context_features=context_features,
    action_features=action_features,
    details=details,
)

Results#

The contributions can be retrieved from the details section of the results.

case_prediction_contributions = pd.DataFrame(results['details']['case_robust_prediction_contributions'][0])
case_accuracy_contributions = pd.DataFrame(results['details']['case_robust_accuracy_contributions'][0])
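A detail that was not requested in the react call will not be present in the results, so a small guard can make that failure easier to diagnose. The sketch below assumes each detail is returned under the same key used to request it (as in the complete code in this guide) and holds one entry per reacted case. The function get_case_detail and the mock_results placeholder are hypothetical; the empty list stands in for real output and is not actual Howso data.

```python
def get_case_detail(results, key, case_index=0):
    """Return the requested detail for one reacted case, or raise a clear error.

    Hypothetical helper; assumes results['details'][key] holds one entry per case.
    """
    details = results["details"]
    if key not in details:
        raise KeyError(
            f"'{key}' not found in details; was it set to True in the react call? "
            f"Available keys: {sorted(details)}"
        )
    return details[key][case_index]


# Placeholder with the assumed shape only; not real react output
mock_results = {"details": {"case_robust_prediction_contributions": [[]]}}
first_case = get_case_detail(mock_results, "case_robust_prediction_contributions")
```

The returned value can then be passed to pd.DataFrame in the same way as the lines above.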

Complete Code#

The code from all of the steps in this guide is combined below:

import pandas as pd
from pmlb import fetch_data

from howso.engine import Trainee
from howso.utilities import infer_feature_attributes

# import data
df = fetch_data('adult')

# Subsample the data to ensure the example runs quickly
df = df.sample(1000)
# Split out the last row for a prediction set and drop the Action Feature
test_case = df.iloc[[-1]].copy()
df.drop(df.index[-1], inplace=True)
test_case = test_case.drop('target', axis=1)

features = infer_feature_attributes(df)

action_features = ['target']
context_features = features.get_names(without=action_features)

trainee = Trainee(features=features)

trainee.train(df)

trainee.analyze(context_features=context_features, action_features=action_features)

details = {'case_robust_prediction_contributions': True}

results = trainee.react(
    test_case[context_features],
    context_features=context_features,
    action_features=action_features,
    details=details
)

case_contributions = pd.DataFrame(results['details']['case_robust_prediction_contributions'][0])

API References#