Residuals#
Objectives: what you will take away#
- How to retrieve global and local residuals.
Prerequisites: before you begin#
- You’ve successfully installed Howso Engine.
- You have an understanding of Howso’s basic workflow. 
Data#
Our example dataset for this recipe is the well-known Adult dataset. It is accessible via the pmlb package installed earlier. We use the fetch_data() function to retrieve the dataset in the Setup step below.
Concepts & Terminology#
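As a rough illustration of the term, a residual for a continuous feature can be read as a mean absolute error: the average absolute difference between predicted and actual values. The sketch below assumes that reading. The helper `mean_absolute_error` and the sample ages are hypothetical and are not part of the Howso API.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical ages and predictions, invented for illustration only.
actual_ages = [25, 40, 61, 33]
predicted_ages = [30, 38, 50, 33]

age_residual = mean_absolute_error(actual_ages, predicted_ages)
print(age_residual)
```

Because the result is expressed in the feature's own units, residuals for features on different scales (for example age and fnlwgt) are not directly comparable to one another.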
How-To Guide#
Setup#
This guide assumes you have created and set up a Trainee as demonstrated in the basic workflow.
The created Trainee is referenced as trainee in the sections below.
[1]:
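import pandas as pd
from pmlb import fetch_data
from howso.engine import Trainee
from howso.utilities import infer_feature_attributes

# Load the Adult dataset and take a random sample of 1,000 cases
df = fetch_data('adult').sample(1_000)
# Infer feature attributes (types, bounds, etc.) from the data
features = infer_feature_attributes(df)
# Create, train, and analyze the Trainee
trainee = Trainee(features=features)
trainee.train(df)
trainee.analyze()
# Display the inferred feature attributes
features.to_dataframe()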
[1]:
| | type | decimal_places | bounds: min | bounds: max | bounds: allow_null | bounds: observed_min | bounds: observed_max | data_type | original_type: data_type | original_type: size |
|---|---|---|---|---|---|---|---|---|---|---|
| age | continuous | 0 | 0.0 | 121.0 | True | 17.0 | 80.0 | number | numeric | 8 | 
| workclass | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 | 
| fnlwgt | continuous | 0 | 0.0 | 1098084.0 | True | 19678.0 | 673764.0 | number | numeric | 8 | 
| education | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 | 
| education-num | continuous | 0 | 0.0 | 25.0 | True | 2.0 | 16.0 | number | numeric | 8 | 
| marital-status | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 | 
| occupation | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 | 
| relationship | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 | 
| race | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 | 
| sex | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 | 
| capital-gain | continuous | 0 | 0.0 | 164870.0 | True | 0.0 | 99999.0 | number | numeric | 8 | 
| capital-loss | continuous | 0 | 0.0 | 4029.0 | True | 0.0 | 2444.0 | number | numeric | 8 | 
| hours-per-week | continuous | 0 | 0.0 | 162.0 | True | 2.0 | 99.0 | number | numeric | 8 | 
| native-country | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 | 
| target | nominal | 0 | NaN | NaN | False | NaN | NaN | number | integer | 8 | 
Local Residuals#
Local metrics are retrieved using Trainee.react().
Both robust and non-robust (full) versions are available, although full
is recommended for residuals.
[2]:
# Get local full residuals
details = {'feature_full_residuals_for_case': True}
results = trainee.react(
    df.iloc[[-1]],
    context_features=features.get_names(without=["target"]),
    action_features=["target"],
    details=details
)
residuals = results['details']['feature_full_residuals_for_case']
residuals
[2]:
[{'hours-per-week': 28,
  'sex': 0,
  'occupation': 0.7335757605466264,
  'age': 3,
  'native-country': 0,
  'fnlwgt': 51534,
  'education-num': 0,
  'capital-gain': 114,
  'workclass': 0.8115459571184314,
  'relationship': 0.06439061700901239,
  'target': 0.2700843254972193,
  'capital-loss': 0,
  'education': 0,
  'marital-status': 0,
  'race': 0.10703621806027186}]
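The result is a list with one dictionary per reacted case, keyed by feature name. The following sketch shows one way to read it, using the values copied from the output above so that it runs without a Trainee. The variable names are illustrative, and your own numbers will differ because the data is sampled randomly.

```python
# Local residuals copied from the example output above. The sample is drawn
# randomly, so the numbers from your own run will differ.
residuals = [{
    'hours-per-week': 28,
    'sex': 0,
    'occupation': 0.7335757605466264,
    'age': 3,
    'native-country': 0,
    'fnlwgt': 51534,
    'education-num': 0,
    'capital-gain': 114,
    'workclass': 0.8115459571184314,
    'relationship': 0.06439061700901239,
    'target': 0.2700843254972193,
    'capital-loss': 0,
    'education': 0,
    'marital-status': 0,
    'race': 0.10703621806027186,
}]

# One dictionary per reacted case; only one case (df.iloc[[-1]]) was passed in.
case_residuals = residuals[0]

# Residual of the action feature for this case.
target_residual = case_residuals['target']

# Features whose residual is exactly zero for this case.
zero_residual_features = sorted(
    name for name, value in case_residuals.items() if value == 0
)

print(f"target residual: {target_residual:.4f}")
print("zero-residual features:", zero_residual_features)
```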
Global Residuals#
Howso can retrieve both local and global metrics.
Global metrics are retrieved using Trainee.react_aggregate(). Both robust and non-robust (full) versions are also available.
[3]:
# Get global full residuals
residuals = trainee.react_aggregate(
    details={'feature_full_residuals': True},
).to_dataframe()
residuals
[3]:
| | hours-per-week | sex | occupation | age | native-country | fnlwgt | education-num | workclass | capital-gain | relationship | target | capital-loss | education | marital-status | race |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| feature_full_residuals | 8.293756 | 0.256471 | 0.785627 | 8.216137 | 0.178461 | 74784.750172 | 0.011779 | 0.385171 | 2316.958528 | 0.366044 | 0.266758 | 148.943909 | 0.0 | 0.269275 | 0.187492 |
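One possible use of the two views is to compare a case's local residual with the global residual for the same feature. The sketch below does this for a subset of features, using values copied from the outputs above. The helper `residual_ratios` is hypothetical (not part of the Howso API), and reading a ratio above 1 as "harder to predict than average" is an informal interpretation rather than an official metric.

```python
# Values copied from the two example outputs above (a subset of features).
# Your own run will produce different numbers because the data is sampled.
local_residuals = {
    'hours-per-week': 28,
    'age': 3,
    'fnlwgt': 51534,
    'capital-gain': 114,
    'target': 0.2700843254972193,
    'education': 0,
}
global_residuals = {
    'hours-per-week': 8.293756,
    'age': 8.216137,
    'fnlwgt': 74784.750172,
    'capital-gain': 2316.958528,
    'target': 0.266758,
    'education': 0.0,
}


def residual_ratios(local, global_):
    """Return local / global residual per feature.

    Features with a missing or zero global residual are skipped to avoid
    dividing by zero.
    """
    return {
        name: local[name] / global_[name]
        for name in local
        if global_.get(name)
    }


ratios = residual_ratios(local_residuals, global_residuals)

# Print features from the largest ratio to the smallest.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>15}: {ratio:.2f}")
```

Dividing by the global residual puts every feature on a unitless scale, which makes features such as fnlwgt and age comparable in a way the raw residuals are not.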