Model Performance#
Objectives: what you will take away#
How to gauge performance using Howso’s native performance metrics.
Prerequisites: before you begin#
You have successfully installed Howso Engine.
You have an understanding of Howso’s basic workflow.
You have an understanding of global vs local metrics.
Data#
Our example dataset for this recipe is the well-known Adult dataset. It is accessible via the pmlb package installed earlier. We use the fetch_data() function to retrieve the dataset in Step 1 below.
Concepts & Terminology#
How-To Guide#
Setup#
This guide assumes you have created and set up a Trainee as demonstrated in basic workflow. The Trainee will be referenced as trainee in the sections below.
Global model performance#
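Howso calculates global performance internally using a leave-one-out approach over the cases trained into the Trainee: each case is held out in turn and predicted from the remaining cases. This calculation is triggered by the react_into_trainee() method. Setting residuals to True computes the global performance stats, which you can then retrieve with get_prediction_stats().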
Available Stats
continuous features: mean absolute error (“mae”), root mean squared error (“rmse”), R² (“r2”), Spearman coefficient (“spearman_coeff”)
nominal features: Matthews correlation coefficient (“mcc”), accuracy (“accuracy”), precision (“precision”), recall (“recall”), the confusion matrix (“confusion_matrix”)
Note
The string representation of each statistic is listed in parentheses.
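To make the statistic names above concrete, here is a minimal sketch of how several of them are conventionally defined, written in plain Python. It is not Howso's implementation: the helper names `regression_stats` and `binary_nominal_stats` are hypothetical, the nominal case is restricted to a binary target, and the Spearman coefficient and confusion matrix are omitted for brevity.

```python
import math


def regression_stats(actual, predicted):
    """Textbook mae, rmse and r2 for paired continuous values (hypothetical helper)."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}


def binary_nominal_stats(actual, predicted, positive):
    """Textbook accuracy, precision, recall and mcc for a binary nominal feature."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


continuous = regression_stats([1, 2, 3, 4], [1, 2, 3, 6])
nominal = binary_nominal_stats([1, 1, 0, 0], [1, 0, 0, 0], positive=1)
```

The dictionary keys mirror the string representations listed above, so the results can be compared by eye with the values a Trainee reports for the same predictions.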
# Recommended metrics
trainee.react_into_trainee(action_feature=action_features[0], residuals=True)
stats = trainee.get_prediction_stats()
accuracy = stats["accuracy"]
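The leave-one-out idea behind global performance can be sketched without Howso at all. The example below is an illustration under stated assumptions, not the Engine's algorithm: it swaps in a simple 1-nearest-neighbor predictor, and the helper names and the toy rows (including the `hours` and `target` features) are made up for the purpose.

```python
def nearest_neighbor_predict(train_rows, query, action_feature):
    """Predict action_feature for query from its closest training row."""
    def squared_distance(row):
        # Compare on every feature except the one being predicted.
        return sum((row[k] - query[k]) ** 2 for k in query if k != action_feature)

    return min(train_rows, key=squared_distance)[action_feature]


def leave_one_out_accuracy(rows, action_feature):
    """Hold out each row in turn, predict it from the rest, and score the hits."""
    correct = 0
    for i, held_out in enumerate(rows):
        rest = rows[:i] + rows[i + 1:]
        prediction = nearest_neighbor_predict(rest, held_out, action_feature)
        correct += prediction == held_out[action_feature]
    return correct / len(rows)


# Hypothetical toy data; the last row is deliberately hard to predict.
toy_rows = [
    {"age": 25, "hours": 20, "target": 0},
    {"age": 27, "hours": 22, "target": 0},
    {"age": 52, "hours": 45, "target": 1},
    {"age": 55, "hours": 50, "target": 1},
    {"age": 30, "hours": 48, "target": 0},
]

loo_accuracy = leave_one_out_accuracy(toy_rows, "target")
```

Because every case is scored by a model that never saw it, the resulting accuracy estimates performance on unseen data without needing a separate test set.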
Conditional Model Performance#
The Trainee is also able to compute and return performance statistics on subsets of the trained data that match a
certain set of conditions. For example, a user might want to investigate the performance of their Trainee on
cases that represent individuals over the age of 40. These conditioned prediction stats are not cached in the Trainee, so the call to react_into_trainee() is unnecessary. Instead, the user can call get_prediction_stats() directly while passing the condition parameter.
The user from the example would want to do the following:
# Conditions for continuous features take a (min, max) pair
# that represents the range of the condition.
performance_stats = trainee.get_prediction_stats(condition={"age": [40, 9999]})
These conditions can also be more elaborate. For example, a user could be interested in the performance on cases that represent individuals over the age of 40 who are also married and unemployed.
# Conditions for continuous features take a (min, max) pair
# that represents the range of the condition; nominal
# features take the value to match.
performance_stats = trainee.get_prediction_stats(
    condition={
        "age": [40, 9999],
        "marital-status": "married",
        "job-status": "unemployed",
    }
)
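As a rough mental model of what a condition selects, the sketch below filters cases in plain Python and scores only the matching subset. It is an illustration, not Howso's implementation: the helpers `matches` and `conditioned_accuracy` are hypothetical, range bounds are assumed to be inclusive, and the toy cases and predictions are invented.

```python
def matches(case, condition):
    """Return True if the case satisfies every entry in the condition."""
    for feature, wanted in condition.items():
        value = case[feature]
        if isinstance(wanted, (list, tuple)):
            # Two-element entry: treated as a (min, max) range, bounds assumed inclusive.
            low, high = wanted
            if not (low <= value <= high):
                return False
        elif value != wanted:
            # Scalar entry: treated as an exact match on a nominal value.
            return False
    return True


def conditioned_accuracy(cases, predictions, action_feature, condition):
    """Accuracy over only the cases that match the condition (None if no match)."""
    selected = [(c, p) for c, p in zip(cases, predictions) if matches(c, condition)]
    if not selected:
        return None
    return sum(c[action_feature] == p for c, p in selected) / len(selected)


# Hypothetical cases and predictions for demonstration only.
cases = [
    {"age": 45, "marital-status": "married", "target": ">50K"},
    {"age": 38, "marital-status": "married", "target": "<=50K"},
    {"age": 61, "marital-status": "single", "target": "<=50K"},
    {"age": 50, "marital-status": "married", "target": "<=50K"},
]
predictions = [">50K", "<=50K", ">50K", "<=50K"]

over_40 = conditioned_accuracy(cases, predictions, "target", {"age": [40, 9999]})
over_40_married = conditioned_accuracy(
    cases, predictions, "target", {"age": [40, 9999], "marital-status": "married"}
)
```

Adding entries to the condition narrows the subset, so the reported statistic can change sharply when only a few cases match; it is worth checking how many cases a condition selects before reading much into the result.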