Is Only Accuracy Score Enough?

Data Science
Author

Benjamin DK Luong

Published

October 10, 2019

When we work on classification problems, we often check the accuracy score to see how well our models are doing. However, looking only at the accuracy score is not enough.

MODEL 1      Predict 0    Predict 1
Actual 0     238,541      847
Actual 1     2,648        40

MODEL 2      Predict 0    Predict 1
Actual 0     180,896      58,492
Actual 1     550          2,138

Let’s say we want to detect fraud. A model output of 1 means it predicts the transaction is a fraud. In our dataset, the majority of transactions are not fraud, so most labels are 0. When we train a model on such an imbalanced dataset, we easily get a high accuracy score simply because there are far more 0s than 1s. Comparing the two models, Model 1 has an accuracy score of (238,541 + 40) / (238,541 + 847 + 2,648 + 40) = 0.98556, and Model 2 has an accuracy score of (180,896 + 2,138) / (180,896 + 58,492 + 550 + 2,138) = 0.75610. If we choose a model based on accuracy alone, we will pick Model 1, because it predicts 98.556% of transactions correctly.
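As a quick sanity check, the sketch below recomputes both accuracy scores directly from the confusion-matrix counts shown above. The accuracy helper is just illustrative, not part of any library.

```python
# Counts taken from the two confusion matrices above:
# tn = true negatives, fp = false positives, fn = false negatives, tp = true positives.

def accuracy(tn, fp, fn, tp):
    """Accuracy = correct predictions / all predictions."""
    return (tn + tp) / (tn + fp + fn + tp)

model_1 = {"tn": 238_541, "fp": 847, "fn": 2_648, "tp": 40}
model_2 = {"tn": 180_896, "fp": 58_492, "fn": 550, "tp": 2_138}

print(f"Model 1 accuracy: {accuracy(**model_1):.5f}")  # ~0.98556
print(f"Model 2 accuracy: {accuracy(**model_2):.5f}")  # ~0.75610
```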

However, Model 1 catches only 40 out of the 2,688 actual frauds. Its recall score is 40 / (40 + 2,648) = 0.01488 = 1.488%, which means the model barely captures any fraud: only 1.488% of frauds are predicted. On the other hand, Model 2’s recall score is 2,138 / (550 + 2,138) = 0.79539 = 79.539%, meaning it captures 79.539% of frauds. Model 2 does flag many normal transactions as fraud (58,492 false positives), but it is worth reviewing those to confirm they are not fraudulent.
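Likewise, a short sketch to recompute the recall scores from the same counts. In practice you would typically call sklearn.metrics.recall_score on the true and predicted labels; since we only have the aggregated counts here, the recall helper below is a hypothetical stand-in.

```python
def recall(fn, tp):
    """Recall = true positives / all actual positives (frauds)."""
    return tp / (tp + fn)

print(f"Model 1 recall: {recall(fn=2_648, tp=40):.5f}")   # ~0.01488
print(f"Model 2 recall: {recall(fn=550, tp=2_138):.5f}")  # ~0.79539
```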