Empty Model

Empty Model (Null Model)

An empty model is a statistical model that describes a quantitative outcome variable using only a single overall value, typically the mean of the response. It is called empty because it contains no explanatory (predictor) variables.


The empty model answers the question:

“If we ignore all predictors, what single value best represents the data?”


Because it includes no predictors, the empty model is also commonly called a null model. It is also sometimes referred to as a simple model, since it is simpler than models that include one or more explanatory variables.

Why Empty Models Matter

Even though they are very simple, empty models are important because they:

  • Provide a baseline for comparison with more complex models

  • Represent the model you would use if you had no information other than the outcome itself

  • Are often the starting point for model building and hypothesis testing

Many modeling procedures compare a more complex model to the empty model to ask whether adding predictors actually improves prediction.

Mathematical Idea

For an outcome variable Y, the empty model can be represented as follows:


Word Equation:
  Data = Model + Error

OR

  Data = Mean + Error


General Linear Model Notation:
  Yᵢ = b₀ + eᵢ

OR

  Yᵢ = Ȳ + eᵢ


This means:
  • Every observation is predicted to be the same value (the sample mean)

  • All variation in the data is treated as unexplained noise
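The sense in which the mean is the "best" single value can be checked numerically. The following is a minimal sketch, assuming a small invented vector `y` and a hypothetical helper `sse()`, neither of which comes from the article. It treats "best" as "smallest sum of squared errors", which is the criterion a linear model uses.

```r
# Hypothetical outcome values, invented for illustration
y <- c(2, 4, 5, 9, 15)

# Hypothetical helper: sum of squared errors when every
# observation is predicted by the single value b0
sse <- function(b0) sum((y - b0)^2)

y_bar  <- mean(y)      # b0 in the empty model
errors <- y - y_bar    # the e_i in Y_i = b0 + e_i

sum(errors)            # errors around the mean sum to zero (up to rounding)

# Compare the mean with two other candidate values
sse(y_bar)
sse(median(y))
sse(8)

# Numerically search for the single value that minimizes the SSE
best <- optimize(sse, interval = range(y))
best$minimum           # approximately equal to y_bar
```

Any other candidate value gives a larger sum of squared errors than the mean, which is why the empty model uses the mean as its one prediction.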

Example (Conceptual)

Suppose you are modeling exam scores for a class.

  • The empty model predicts that every student will score the class average.

  • It does not use study time, attendance, or prior GPA.

  • Any difference between a student’s actual score and the mean is considered error.

Example in R

Using a linear model with no predictors:


# Generic format
empty_model <- lm(Y ~ NULL, data = data_set)

# Student score example
empty_model <- lm(score ~ NULL, data = exams)


  • ~ NULL means “model the response using only an intercept”; writing the formula as score ~ 1 is the more common base R spelling of the same model

  • The intercept estimate is the mean of score


You can verify this:

mean(exams$score)


This value will match the intercept from the empty model.
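A self-contained version of this check might look like the sketch below. The `exams` data frame here is invented for illustration (the article does not supply data), so the specific scores are assumptions.

```r
# Hypothetical data, invented for illustration
exams <- data.frame(score = c(62, 71, 75, 80, 88, 94))

# Fit the empty model (intercept only)
empty_model <- lm(score ~ NULL, data = exams)

# The single estimated parameter, b0
intercept <- coef(empty_model)[["(Intercept)"]]

intercept
mean(exams$score)

# TRUE if the intercept and the sample mean agree (up to floating-point error)
isTRUE(all.equal(intercept, mean(exams$score)))

# Every fitted value is that same number
fitted(empty_model)
```

Because the model has only one parameter, `fitted()` returns the mean repeated once per row.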

Comparing to a Model with Predictors


model_with_predictor <- lm(score ~ study_hours, data = exams)


  • The empty model uses no predictors, so it predicts the mean for every observation

  • Comparing the two models shows whether including a variable like study_hours reduces prediction error beyond what the mean alone achieves
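One way to make this comparison concrete is to measure how much error each model leaves behind. The sketch below assumes an invented `exams` data frame with `score` and `study_hours` columns; the variable names `sse_empty`, `sse_predictor`, and `pre` are hypothetical and chosen only for this illustration.

```r
# Hypothetical data, invented for illustration
exams <- data.frame(
  study_hours = c(1, 2, 3, 4, 5, 6),
  score       = c(62, 71, 75, 80, 88, 94)
)

empty_model          <- lm(score ~ NULL, data = exams)
model_with_predictor <- lm(score ~ study_hours, data = exams)

# Sum of squared errors left over by each model
sse_empty     <- sum(resid(empty_model)^2)
sse_predictor <- sum(resid(model_with_predictor)^2)

# Proportion of the empty model's error removed by adding study_hours;
# for this comparison it equals the R-squared of the larger model
pre <- (sse_empty - sse_predictor) / sse_empty
pre
summary(model_with_predictor)$r.squared

# F test comparing the two nested models
anova(empty_model, model_with_predictor)
```

A value of `pre` near 0 would suggest the predictor adds little beyond the mean, while a value near 1 would suggest it accounts for most of the variation in these invented scores.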

Key Takeaway

The empty model is the simplest possible model for a quantitative variable.
It uses the mean as the prediction for all observations and serves as a critical reference point for understanding and evaluating more complex models.
