Conviction#

Objectives: what you will take away#

  • How-To retrieve familiarity, similarity, and prediction residual conviction metrics.

Prerequisites: before you begin#

Data#

Our example dataset for this recipe is the well known Adult dataset. It is accessible via the pmlb package installed earlier. We use the fetch_data() function to retrieve the dataset in Step 1 below.

Concepts & Terminology#

How-To Guide#

Familiarity Conviction and Similarity Conviction are measurements of how surprising a case is. This can be useful for tasks such as anomaly detection. Prediction Residual Conviction can be used to drill down into a specific case and examine its features. It measures how surprising each cases feature values is, thus it can reveal information such as why a case was anomalous. For example, if a NBA player’s height was 3 foot tall, that value would be very surprising since most NBA players are very tall.

Setup#

The user guide assumes you have created and setup a Trainee as demonstrated in basic workflow. The created Trainee will be referenced as trainee in the sections below. This guide also assumes you have installed the pmlb python library for the dataset used.

Familiarity Conviction#

There are two types of Familiarity Conviction available, both accessible when Trainee.react_into_features() is called. familiarity_conviction_addition is the familiarity conviction of adding the specified case and familiarity_conviction_removal is the familiarity conviction of removing the specified case. Trainee.react_into_features() stores these convictions which can be retrieved through Trainee.get_cases()

trainee.react_into_features(
    familiarity_conviction_addition=True,
    familiarity_conviction_removal=True
)
familiarity_conviction_addition = trainee.get_cases(
    session=trainee.active_session,
    features=[
        'familiarity_conviction_addition',
        'familiarity_conviction_removal'
    ]
)

Similarity Conviction#

Similarity Conviction is a singular metric that is also accessible when Trainee.react_into_features() is called.

trainee.react_into_features(similarity_conviction = True)
familiarity_conviction_addition = trainee.get_cases(
    session=trainee.active_session,
    features=['similarity_conviction']
)

Prediction Residual Conviction#

Since Prediction Residual Conviction details the conviction around a prediction, this is retrieved by specifying specific cases in Trainee.react()

details = {
    'feature_residuals_robust': True
}

results = trainee.react(
    test_case[context_features],
    context_features=context_features,
    action_features=action_features,
    details=details
)

Complete Code#

The code from all of the steps in this guide is combined below:

import pandas as pd
from pmlb import fetch_data

from howso.engine import Trainee
from howso.utilities import infer_feature_attributes

# import data
df = fetch_data('adult')

# Subsample the data to ensure the example runs quickly
df = df.sample(2000)
test_case = df.iloc[[-1]].copy()
df.drop(df.index[-1], inplace=True)

features = infer_feature_attributes(df)

action_features = ['target']
context_features = features.get_names(without=action_features)

trainee = Trainee(features=features)

trainee.train(df)

trainee.analyze(context_features=context_features, action_features=action_features)

trainee.react_into_features(
    familiarity_conviction_addition=True,
    familiarity_conviction_removal=True,
    similarity_conviction=True
)

familiarity_conviction_addition = trainee.get_cases(
    session=trainee.active_session,
    features=[
        'familiarity_conviction_addition',
        'familiarity_conviction_removal'
    ]
)

print(familiarity_conviction_addition)

details = {
    'feature_residuals_robust': True,
    'similarity_conviction': True
}

results = trainee.react(
    test_case[context_features],
    context_features=context_features,
    action_features=action_features,
    details=details
)
print(results)

Below is an example of expected output from this sample code:

$ python conviction_example.py
    familiarity_conviction_addition  familiarity_conviction_removal
0                            0.424315                        0.481610
1                           24.344436                       24.373889
2                            0.495148                        0.555847
3                            0.463858                        0.523487
4                            0.288355                        0.248439
...                               ...                             ...
1994                         6.460913                        6.248667
1995                        46.903956                       46.594968
1996                         2.195260                        2.305391
1997                        24.788612                       24.992936
1998                         0.740464                        0.812168

[1999 rows x 2 columns]
target
0       1
{'action_features': ['target'],
'feature_residuals_robust': [{'age': 8.888516681825308,
                            'capital-gain': 416.7392605164004,
                            'capital-loss': 59.906358535804515,
                            'education': 0.4523004291045252,
                            'education-num': 0.4655826176126248,
                            'fnlwgt': 65946.6678484109,
                            'hours-per-week': 6.298493661647657,
                            'marital-status': 0.512476275479471,
                            'native-country': 0.07145970131801563,
                            'occupation': 0.8772108612524578,
                            'race': 0.16017621174491645,
                            'relationship': 0.7104566198137716,
                            'sex': 0.3580994265834227,
                            'target': 0.09681983534852417,
                            'workclass': 0.18761097169888336}],
'similarity_conviction': [0.9699384581322016]}

API References#