Skip to content

Model Evaluation

Overview

Evaluating model performance and detecting bias.

Classification Metrics

Metric Formula Use When
Accuracy (TP+TN)/(TP+TN+FP+FN) Balanced classes
Precision TP/(TP+FP) False positives costly
Recall TP/(TP+FN) False negatives costly
F1 Score 2(PR)/(P+R) Balance precision/recall
AUC-ROC Area under ROC curve Ranking ability

Regression Metrics

Metric Description
MSE Mean Squared Error
RMSE Root Mean Squared Error
MAE Mean Absolute Error
Coefficient of determination
MAPE Mean Absolute Percentage Error

SageMaker Clarify

Detect bias and explain predictions.

Bias Detection

from sagemaker.clarify import SageMakerClarifyProcessor, BiasConfig, DataConfig

clarify_processor = SageMakerClarifyProcessor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge"
)

bias_config = BiasConfig(
    label_values_or_threshold=[1],
    facet_name="gender",
    facet_values_or_threshold=[0]
)

data_config = DataConfig(
    s3_data_input_path="s3://bucket/data/",
    s3_output_path="s3://bucket/clarify-output/",
    label="target",
    headers=["feature1", "feature2", "target"]
)

clarify_processor.run_bias(
    data_config=data_config,
    bias_config=bias_config,
    pre_training_methods="all",
    post_training_methods="all"
)

Bias Metrics

Metric Description
Class Imbalance (CI) Difference in class proportions
Difference in Proportions (DPL) Label distribution difference
Disparate Impact (DI) Ratio of positive outcomes

Model Explainability

SHAP (SHapley Additive exPlanations) values.

from sagemaker.clarify import ModelConfig, SHAPConfig

model_config = ModelConfig(
    model_name="my-model",
    instance_type="ml.m5.xlarge",
    instance_count=1
)

shap_config = SHAPConfig(
    baseline=[baseline_data],
    num_samples=100,
    agg_method="mean_abs"
)

clarify_processor.run_explainability(
    data_config=data_config,
    model_config=model_config,
    explainability_config=shap_config
)

SageMaker Debugger

Monitor training in real-time.

  • Capture tensors during training
  • Built-in rules for common issues
  • Profiling for performance bottlenecks

Exam Tips

!!! warning "Key Points" - Use Clarify for bias detection and explainability - SHAP values explain individual predictions - Debugger monitors training for issues - Choose metrics based on business requirements