Introduction to Machine Learning - Model Evaluation Metrics
Different machine learning tasks require different metrics for evaluation. This article is based on scikit-learn and introduces the evaluation metrics for regression, classification, and clustering models through practical code.
Evaluation Metrics for Regression Models
For regression models, the goal is to make the predicted values fit the actual values as closely as possible. Common performance evaluation metrics include Mean Absolute Error (MAE) and Mean Squared Error (MSE).
Mean Absolute Error (MAE)
It measures the average absolute difference between predicted and true values in regression problems. A smaller MAE indicates a lower average difference between predicted and true values, meaning higher prediction accuracy.
\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_{\text{true},i} - y_{\text{pred},i} \right| \]
Here, \(n\) is the number of samples, \(y_\text{true}\) represents true values, and \(y_\text{pred}\) denotes predicted values.
>>> from sklearn.metrics import mean_absolute_error
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> mean_absolute_error(y_true, y_pred)
0.5
>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> mean_absolute_error(y_true, y_pred)
0.75
Mean Squared Error (MSE)
It calculates the average squared difference between predicted and true values. A smaller value indicates a better fit between predicted and true values.
\[ \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_{\text{true},i} - y_{\text{pred},i} \right)^2 \]
In this formula, \(n\) indicates the number of samples, \(y_\text{true}\) stands for true values, and \(y_\text{pred}\) is for predicted values. Because the differences are squared, MSE penalizes large errors more heavily than MAE does.
>>> from sklearn.metrics import mean_squared_error
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> mean_squared_error(y_true, y_pred)
0.375
>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> mean_squared_error(y_true, y_pred)
0.708...
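As with MAE, the formula can be verified by hand; this sketch (the helper name `mse` is ours) averages the squared differences over all entries and reproduces the scikit-learn results above.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error computed directly from its definition."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Average of the squared differences over all entries
    return np.mean((y_true - y_pred) ** 2)

print(mse([3, -0.5, 2, 7], [2.5, 0.0, 2, 8]))  # 0.375, same as sklearn above
```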
Evaluation Metrics for Classification Models
There are multiple evaluation metrics for classification models, and they can pull in opposite directions: tuning a model to improve one metric (such as precision) often degrades another (such as recall), so the right metric depends on the task.
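The precision/recall trade-off can be seen with scikit-learn's `precision_score` and `recall_score`. The labels below are made-up illustrative data: a "conservative" model that rarely predicts the positive class versus a "liberal" one that predicts it almost always.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical ground-truth labels (1 = positive class)
y_true = [1, 1, 1, 1, 0, 0, 0, 0]

# Conservative model: predicts positive only when very confident
y_conservative = [1, 0, 0, 0, 0, 0, 0, 0]
# Liberal model: predicts positive for almost every sample
y_liberal = [1, 1, 1, 1, 1, 1, 1, 0]

print(precision_score(y_true, y_conservative))  # 1.0   (no false positives)
print(recall_score(y_true, y_conservative))     # 0.25  (misses 3 of 4 positives)
print(precision_score(y_true, y_liberal))       # ~0.571 (3 false positives)
print(recall_score(y_true, y_liberal))          # 1.0   (catches every positive)
```

The same predictions score best on different metrics, which is why a single number is rarely enough to compare classifiers.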