Sunday, August 4, 2024

What Is a Good RMSE Value?

The lower the RMSE, the better a given model “fits” a dataset. However, RMSE is expressed in the same units as the variable being predicted, so whether a given RMSE value counts as “low” depends on the range of the dataset you’re working with.

For example, consider the following scenarios:

Scenario 1: We would like to use a regression model to predict the price of homes in a certain city. Suppose the model has an RMSE value of $500. Since the typical range of house prices is between $70,000 and $300,000, this RMSE value is extremely low. This tells us that the model is able to predict house prices accurately.

Scenario 2: Now suppose we would like to use a regression model to predict how much someone will spend per month in a certain city. Suppose the model has an RMSE value of $500. If the typical range of monthly spending is $1,500 – $4,000, this RMSE value is quite high. This tells us that the model is not able to predict monthly spending very accurately.
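Both scenarios take the RMSE value as given. As a reminder of where such a number comes from, here is a minimal Python sketch that computes RMSE as the square root of the average squared difference between actual and predicted values. The function name rmse and the four house prices are made up for illustration and are not taken from a real dataset.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    # Square each prediction error, average the squares, then take the root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical actual and predicted house prices (illustration only).
actual_prices = [150_000, 220_000, 98_000, 275_000]
predicted_prices = [150_400, 219_500, 98_600, 274_500]

print(f"RMSE: ${rmse(actual_prices, predicted_prices):,.2f}")
```

Because the errors are squared before averaging, the result comes out in the same units as the prices themselves, which is why it has to be judged against the range of those prices.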

Normalizing the RMSE Value

One way to gain a better understanding of whether a certain RMSE value is “good” is to normalize it using the following formula:

Normalized RMSE = RMSE / (max value – min value)

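This typically produces a value between 0 and 1, where values closer to 0 represent better fitting models. (A very poor model can have an RMSE larger than the range of the data, in which case the normalized value exceeds 1.)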

For example, suppose our RMSE value is $500 and our range of values is between $70,000 and $300,000. We would calculate the normalized RMSE value as:

Normalized RMSE = $500 / ($300,000 – $70,000) ≈ 0.002

Conversely, suppose our RMSE value is $500 and our range of values is between $1,500 and $4,000. We would calculate the normalized RMSE value as:

Normalized RMSE = $500 / ($4,000 – $1,500) = 0.2

The first normalized RMSE value is much lower. Even though both models have the same RMSE of $500, the first model’s error is tiny relative to the range of values it predicts, so it provides a much better fit to its data than the second model does.
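The two calculations above can be reproduced with a short Python sketch. The helper name normalized_rmse and the scenario labels are assumptions made for this example; the RMSE and range figures are the ones used in the two scenarios.

```python
def normalized_rmse(rmse, min_value, max_value):
    """Divide an RMSE by the range (max - min) of the observed values."""
    value_range = max_value - min_value
    if value_range <= 0:
        raise ValueError("max_value must be greater than min_value")
    return rmse / value_range

# (RMSE, minimum, maximum) for each scenario described in the post.
scenarios = {
    "house prices": (500, 70_000, 300_000),
    "monthly spending": (500, 1_500, 4_000),
}

for name, (rmse, low, high) in scenarios.items():
    print(f"{name}: {normalized_rmse(rmse, low, high):.3f}")
# house prices: 0.002
# monthly spending: 0.200
```

Dividing by the range is only one way to normalize. The helper raises an error when the range is zero or negative rather than returning a misleading value.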
