Residuals#
Objectives: what you will take away#
How to retrieve global and local residuals.
Prerequisites: before you begin#
You’ve successfully installed Howso Engine.
You have an understanding of Howso’s basic workflow.
Data#
Our example dataset for this recipe is the well-known Adult dataset. It is accessible via the pmlb package installed earlier. We use the fetch_data() function to retrieve the dataset in the Setup section below.
Concepts & Terminology#
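As a rough intuition for the term, a residual can be thought of as the mean absolute error between the predicted and actual values of a feature. The sketch below illustrates that idea for a continuous feature in plain Python. The function name `mean_absolute_error` and the sample numbers are hypothetical, and this is not a description of how Howso Engine computes residuals internally.

```python
def mean_absolute_error(actual, predicted):
    """Return the average of |actual - predicted| over paired values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual and predicted values for a continuous feature
actual_hours = [40, 38, 45, 50]
predicted_hours = [42, 35, 45, 56]

mae = mean_absolute_error(actual_hours, predicted_hours)
print(mae)  # 2.75
```

A smaller value means the predictions sit closer to the observed values, measured in the feature's own units.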
How-To Guide#
Setup#
This guide assumes you have created and set up a Trainee as demonstrated in the basic workflow. The created Trainee is referenced as trainee in the sections below.
[1]:
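import pandas as pd
from pmlb import fetch_data
from howso.engine import Trainee
from howso.utilities import infer_feature_attributes
# Load the Adult dataset and keep a random sample of 1,000 rows
df = fetch_data('adult').sample(1_000)
# Infer feature attributes (types, bounds) from the data
features = infer_feature_attributes(df)
# Create, train, and analyze the Trainee
trainee = Trainee(features=features)
trainee.train(df)
trainee.analyze()
# Display the inferred feature attributes
features.to_dataframe()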
[1]:
feature | type | decimal_places | bounds: min | bounds: max | bounds: allow_null | bounds: observed_min | bounds: observed_max | data_type | original_type: data_type | original_type: size |
---|---|---|---|---|---|---|---|---|---|---|
age | continuous | 0 | 0.0 | 127.0 | True | 17.0 | 84.0 | number | numeric | 8 |
workclass | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 |
fnlwgt | continuous | 0 | 0.0 | 1902182.0 | True | 19395.0 | 1161363.0 | number | numeric | 8 |
education | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 |
education-num | continuous | 0 | 0.0 | 26.0 | True | 1.0 | 16.0 | number | numeric | 8 |
marital-status | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 |
occupation | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 |
relationship | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 |
race | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 |
sex | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 |
capital-gain | continuous | 0 | 0.0 | 164870.0 | True | 0.0 | 99999.0 | number | numeric | 8 |
capital-loss | continuous | 0 | 0.0 | 4953.0 | True | 0.0 | 3004.0 | number | numeric | 8 |
hours-per-week | continuous | 0 | 0.0 | 162.0 | True | 2.0 | 99.0 | number | numeric | 8 |
native-country | continuous | 0 | 0.0 | 68.0 | False | 0.0 | 41.0 | number | integer | 8 |
target | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 |
Local Residuals#
Local metrics are retrieved using Trainee.react(). Both robust and non-robust (full) versions are available, although full is recommended for residuals.
[2]:
# Get local full residuals
details = {'feature_full_residuals_for_case': True}
results = trainee.react(
df.iloc[[-1]],
context_features=features.get_names(without=["target"]),
action_features=["target"],
details=details
)
residuals = results['details']['feature_full_residuals_for_case']
residuals
[2]:
[{'sex': 0,
'workclass': 0.5331587655580869,
'race': 0,
'occupation': 0.7978098338827468,
'fnlwgt': 135036,
'capital-loss': 133,
'capital-gain': 185,
'target': 0.466527866685404,
'hours-per-week': 6,
'education': 0,
'relationship': 0.05012456999954262,
'age': 12,
'native-country': 26,
'education-num': 0,
'marital-status': 0}]
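In the output above, the residuals come back as a list holding one dictionary for the single case that was reacted to. Nominal residuals fall between 0 and 1, while continuous residuals are in the feature's own units, so the two groups are easier to read separately. The following is a minimal sketch of that split in plain Python; the values and feature types are copied from the outputs above, and the variable names are illustrative only.

```python
# Local residuals for one case, copied from the output above
local_residuals = [{
    'sex': 0, 'workclass': 0.5331587655580869, 'race': 0,
    'occupation': 0.7978098338827468, 'fnlwgt': 135036,
    'capital-loss': 133, 'capital-gain': 185,
    'target': 0.466527866685404, 'hours-per-week': 6, 'education': 0,
    'relationship': 0.05012456999954262, 'age': 12,
    'native-country': 26, 'education-num': 0, 'marital-status': 0,
}]

# Features listed as nominal in the feature attributes table above
nominal_features = {
    'workclass', 'education', 'marital-status', 'occupation',
    'relationship', 'race', 'sex', 'target',
}

case = local_residuals[0]  # first (and only) reacted case
nominal_res = {f: r for f, r in case.items() if f in nominal_features}
continuous_res = {f: r for f, r in case.items() if f not in nominal_features}

# Nominal residuals share a 0-1 scale, so they can be ranked directly
for name, value in sorted(nominal_res.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {value:.3f}")

# Continuous residuals are in each feature's own units; list them as-is
for name, value in continuous_res.items():
    print(f"{name}: {value}")
```

Ranking only within the nominal group avoids comparing values such as the `fnlwgt` residual with values on a 0 to 1 scale.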
Global Residuals#
Howso can retrieve both local and global metrics. Global metrics are retrieved using Trainee.react_aggregate(). Both robust and non-robust (full) versions are also available.
[3]:
# Get global full residuals
residuals = trainee.react_aggregate(
details={'feature_full_residuals': True},
)
residuals
[3]:
{'feature_full_residuals': {'sex': 0.2570762289270188,
'workclass': 0.36810121214714053,
'race': 0.20137945439242821,
'occupation': 0.7711453769837245,
'target': 0.21644047023383686,
'fnlwgt': 78093.13751516423,
'capital-gain': 938.679181774177,
'capital-loss': 133.79947992722927,
'hours-per-week': 8.237219438250245,
'education': 9.896150565680273e-14,
'relationship': 0.33661721607414674,
'marital-status': 0.22055382814740387,
'native-country': 2.88085842471993,
'education-num': 0.0009155718366581049,
'age': 8.238250587342119}}
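One way to relate the two views is to divide a case's local residual by the global residual for the same feature, which gives a unit-free ratio. The sketch below does this in plain Python for a few features, with values copied from the two outputs above. The ratio is an illustrative convention assumed here, not something Howso Engine returns, and the variable names are hypothetical.

```python
# Local residuals for one case (subset copied from the local output above)
local_res = {
    'target': 0.466527866685404,
    'age': 12,
    'hours-per-week': 6,
    'native-country': 26,
    'occupation': 0.7978098338827468,
    'sex': 0,
}

# Global residuals (same features, copied from the global output above)
global_res = {
    'target': 0.21644047023383686,
    'age': 8.238250587342119,
    'hours-per-week': 8.237219438250245,
    'native-country': 2.88085842471993,
    'occupation': 0.7711453769837245,
    'sex': 0.2570762289270188,
}

# Ratio > 1: this case's residual for the feature exceeds the global residual.
# Features with a zero global residual are skipped to avoid dividing by zero.
ratios = {
    name: local_res[name] / global_res[name]
    for name in local_res
    if global_res.get(name, 0) > 0
}

ranked = sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)
for name, ratio in ranked:
    print(f"{name}: {ratio:.2f}")
```

With these numbers, `native-country` has the largest ratio, so that feature's local residual is furthest above its global value for this case.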