Residuals#

Objectives: what you will take away#

  • How to retrieve global and local residuals.

Prerequisites: before you begin#

Data#

Our example dataset for this recipe is the well-known Adult dataset. It is accessible via the pmlb package installed earlier. We use the fetch_data() function to retrieve the dataset in the Setup section below.

Concepts & Terminology#

How-To Guide#

Setup#

This guide assumes you have created and set up a Trainee as demonstrated in the basic workflow guide. The created Trainee is referenced as trainee in the sections below.

[1]:
import pandas as pd
from pmlb import fetch_data

from howso.engine import Trainee
from howso.utilities import infer_feature_attributes

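# Load the Adult dataset and keep a random sample of 1,000 rows
df = fetch_data('adult').sample(1_000)
# Infer feature attributes (types, bounds, and so on) from the data
features = infer_feature_attributes(df)

# Create a Trainee, train it on the sample, and analyze it
trainee = Trainee(features=features)
trainee.train(df)
trainee.analyze()

# Display the inferred feature attributes
features.to_dataframe()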
/home/docs/checkouts/readthedocs.org/user_builds/diveplane-howso-docs/envs/latest/lib/python3.11/site-packages/howso/utilities/feature_attributes/pandas.py:148: UserWarning: You have one or more suggestions to consider for your feature attributes configuration. Please view them by printing the `suggestions` property of your returned feature attributes object (`your_attributes_object.suggestions`).
  warnings.warn(suggestion_warning, UserWarning)
[1]:
                type        decimal_places  bounds                                                  data_type  original_type
                                            min  max        allow_null  observed_min  observed_max             data_type  size
age             continuous  0               0.0  124.0      True        17.0          82.0          number     numeric    8
workclass       nominal     0               NaN  NaN        False       NaN           NaN           number     integer    8
fnlwgt          continuous  0               0.0  1559326.0  True        19847.0       953588.0      number     numeric    8
education       nominal     0               NaN  NaN        False       NaN           NaN           number     integer    8
education-num   continuous  0               0.0  25.0       True        2.0           16.0          number     numeric    8
marital-status  nominal     0               NaN  NaN        False       NaN           NaN           number     integer    8
occupation      nominal     0               NaN  NaN        False       NaN           NaN           number     integer    8
relationship    nominal     0               NaN  NaN        False       NaN           NaN           number     integer    8
race            nominal     0               NaN  NaN        False       NaN           NaN           number     integer    8
sex             nominal     0               NaN  NaN        False       NaN           NaN           number     integer    8
capital-gain    continuous  0               0.0  164870.0   True        0.0           99999.0       number     numeric    8
capital-loss    continuous  0               0.0  3982.0     True        0.0           2415.0        number     numeric    8
hours-per-week  continuous  0               0.0  162.0      True        2.0           99.0          number     numeric    8
native-country  nominal     0               NaN  NaN        False       NaN           NaN           number     integer    8
target          nominal     0               NaN  NaN        False       NaN           NaN           number     integer    8

Local Residuals#

Local metrics are retrieved using Trainee.react(). Both robust and non-robust (full) versions are available, although the full version is recommended for residuals.

[2]:
# Get local full residuals
details = {'feature_full_residuals_for_case': True}
results = trainee.react(
    df.iloc[[-1]],
    context_features=features.get_names(without=["target"]),
    action_features=["target"],
    details=details
)

residuals = results['details']['feature_full_residuals_for_case']
residuals
[2]:
native-country capital-loss target fnlwgt sex workclass relationship age race capital-gain education-num occupation marital-status education hours-per-week
0 [0.10497122025332606, 0] [0, 0] [0, 0] [3892.8891464159533, 0] [0.5347950770078824, 0] [0.11679499294505225, 0] [0.25755208025854126, 0] [0.8980091835356987, 0] [0.04824150751552414, 0] [0, 0] [0, 0] [0.648723288676678, 0] [0.05021972175534373, 0] [0, 0] [8.596570848836706, 0]
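The wide single-row output above can be hard to scan. The sketch below is plain Python and does not call Howso. It flattens a result of that shape into one scalar per feature and prints one feature per line. The values are copied from the output above for a subset of features. The helper name first_value is hypothetical, and treating the first element of each pair as the residual is an assumption, not something this guide confirms.

```python
# Values copied from the local residuals output above (subset of features).
local_residuals = {
    "age": [0.8980091835356987, 0],
    "hours-per-week": [8.596570848836706, 0],
    "fnlwgt": [3892.8891464159533, 0],
    "sex": [0.5347950770078824, 0],
    "capital-gain": [0, 0],
}


def first_value(cell):
    """Return the first element of a list-like cell, or the cell itself if scalar.

    Assumption: the first element of each pair holds the residual value.
    """
    if isinstance(cell, (list, tuple)):
        return float(cell[0])
    return float(cell)


# One scalar per feature, in the original column order.
flat = {name: first_value(cell) for name, cell in local_residuals.items()}

for name, value in flat.items():
    print(f"{name:>15}: {value:.4f}")
```

Because the Setup cell draws a random sample of the dataset, the residual values will differ from run to run.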

Global Residuals#

Howso can retrieve both local and global metrics. Global metrics are retrieved using Trainee.react_aggregate(). Both robust and non-robust (full) versions are available here as well.

[3]:
# Get global full residuals
residuals = trainee.react_aggregate(
    details={'feature_full_residuals': True},
).to_dataframe()
residuals
[3]:
native-country capital-loss target fnlwgt sex workclass relationship age race occupation education-num capital-gain marital-status education hours-per-week
feature_full_residuals [0.17635573108240263, 0] [163.0842452028395, 0] [0.21930926983381843, 0] [81408.23083480468, 0] [0.2607118895803338, 0] [0.3669314376133491, 0] [0.3186086818086218, 0] [8.3280741172867, 0] [0.20994217929625442, 0] [0.7944257293597181, 0] [0.16888651553946177, 0] [2155.029487759862, 0] [0.21729213822237398, 0] [9.032563585975595e-14, 0] [7.691471588572398, 0]
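One way to read the two outputs together is to compare, feature by feature, the local residual for the case against the global residual. The sketch below is plain Python and does not call Howso. It uses values copied from the outputs above for a few continuous features, taking the first element of each pair, which is an assumption. The function name residual_ratio is hypothetical. It also assumes the local and global residuals for a feature are expressed in the same units, so their ratio is meaningful.

```python
# First element of each pair, copied from the outputs above (assumption:
# that element is the residual value).
local = {
    "age": 0.8980091835356987,
    "hours-per-week": 8.596570848836706,
    "fnlwgt": 3892.8891464159533,
    "capital-gain": 0.0,
}
global_ = {
    "age": 8.3280741172867,
    "hours-per-week": 7.691471588572398,
    "fnlwgt": 81408.23083480468,
    "capital-gain": 2155.029487759862,
}


def residual_ratio(local_res, global_res):
    """Return local / global residual for each feature present in both mappings.

    Features with a zero global residual are skipped to avoid dividing by zero.
    """
    ratios = {}
    for name, global_value in global_res.items():
        if name in local_res and global_value > 0:
            ratios[name] = local_res[name] / global_value
    return ratios


ratios = residual_ratio(local, global_)

# List features from the smallest ratio to the largest.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    position = "below" if ratio < 1 else "at or above"
    print(f"{name:>15}: ratio {ratio:.3f} ({position} the global residual)")
```

Under these assumptions, a ratio below 1 suggests that the uncertainty around this particular case is lower than the dataset-wide average for that feature, and a ratio above 1 suggests the opposite.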

API References#