# Multilevel

Multilevel modeling is needed when we want to model a hierarchical structure. Because the world is full of hierarchies (students within classes within schools, time points within people, penalties within goalkeepers, clients within therapists and, what we often forget in science, research subjects within experimenters), we almost always need multilevel models. Through several examples (Clark, 1973; Van Baaren, Holland, Steenaert, & van Knippenberg; Hoeken & Hustinx, 2009) I hope to have shown the necessity of this procedure. However, I have been asked what the assumptions of multilevel models are. Because I did not have a straight answer at the time, I would like to take this opportunity to address the question here.

To understand this, I first have to explain the nature of multilevel modeling briefly. A complete (random intercept, random slope) multilevel model starts from an ordinary regression equation at level 1, $y_{ij} = b_{0j} + b_{1j} X_{ij} + e_{ij}$, where the subscript $i$ indexes individuals and the subscript $j$ indexes groups. The differences between groups are specified by a second set of equations (level 2) for the intercept and slope parameters, $b_{0j} = \gamma_{00} + u_{0j}$ and $b_{1j} = \gamma_{10} + u_{1j}$. Here $\gamma_{00}$ and $\gamma_{10}$ are the overall intercept and slope, and $u_{0j}$ and $u_{1j}$ are the group-level residuals. Dropping the residual from one of these equations gives either a fixed intercept, random slope model or a random intercept, fixed slope model (see all three models in the figures below).
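To make the two levels concrete, here is a minimal sketch that simulates data from such a model using only the Python standard library. The function name, the parameter values, and the names `gamma00` and `gamma10` for the overall intercept and slope are my own choices for illustration. The sketch also assumes that the group-level residuals for intercept and slope are drawn independently, which a real multilevel model does not require.

```python
import random


def simulate_two_level(n_groups=20, n_per_group=30, gamma00=2.0, gamma10=0.5,
                       sd_u0=1.0, sd_u1=0.3, sd_e=1.0, seed=1):
    """Draw data from a random-intercept, random-slope model.

    All parameter values are illustrative assumptions, not estimates.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        # Level 2: every group gets its own intercept and slope,
        # scattered around the overall intercept and slope.
        b0j = gamma00 + rng.gauss(0.0, sd_u0)
        b1j = gamma10 + rng.gauss(0.0, sd_u1)
        for _ in range(n_per_group):
            x = rng.uniform(0.0, 10.0)
            # Level 1: ordinary regression within group j.
            y = b0j + b1j * x + rng.gauss(0.0, sd_e)
            rows.append({"group": j, "x": x, "y": y})
    return rows


data = simulate_two_level()
print(len(data), "observations in", len({r["group"] for r in data}), "groups")
```

Setting `sd_u1=0` yields a random intercept, fixed slope model, and setting `sd_u0=0` yields a fixed intercept, random slope model, which mirrors dropping one of the level-2 residuals.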

Now to the assumptions. These are the same as in ordinary multiple regression analysis:

• linear relationships,
• normal distribution of the residuals,
• and homoscedasticity.

When the assumption of linearity is violated, we can test other functional forms (for instance, by adding the square of the time variable in a longitudinal study). Note that, because of the multiple levels, there is now more than one residual term: one at the individual level and one or more at the group level. Of course this complicates matters a bit, but it has been shown that multilevel estimation methods are quite robust against violations of the normality assumption at the second level (Maas & Hox, 2004). Another great advantage of multilevel modeling is that heteroscedasticity can be modeled directly, to account for violations of the final assumption (cf. Goldstein, 1995, pp. 48–57).
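The idea of having residuals at more than one level can be illustrated with a rough two-stage sketch: fit a separate least-squares line per group, treat the within-group errors as level-1 residuals, and treat the deviations of each group's intercept and slope from their averages as level-2 residuals. This is only an approximation for building intuition. Real multilevel software estimates both levels jointly and shrinks the group estimates toward the overall mean. The helper names and the toy data below are assumptions made for this example.

```python
import random
from statistics import mean


def ols(xs, ys):
    """Return (intercept, slope) of a simple least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def two_stage_residuals(groups):
    """groups maps a group id to a pair (xs, ys) of equal-length lists."""
    fits = {g: ols(xs, ys) for g, (xs, ys) in groups.items()}
    # Level 1: what is left over within each group after its own line.
    level1 = {}
    for g, (xs, ys) in groups.items():
        b0, b1 = fits[g]
        level1[g] = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    # Level 2: how far each group's intercept and slope lie from the average.
    mean_b0 = mean(b0 for b0, _ in fits.values())
    mean_b1 = mean(b1 for _, b1 in fits.values())
    level2 = {g: (b0 - mean_b0, b1 - mean_b1) for g, (b0, b1) in fits.items()}
    return level1, level2


# Toy data: 10 groups with varying intercepts and a common slope (assumed values).
rng = random.Random(7)
groups = {}
for j in range(10):
    b0j = 2.0 + rng.gauss(0.0, 1.0)
    xs = [rng.uniform(0.0, 10.0) for _ in range(25)]
    ys = [b0j + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]
    groups[j] = (xs, ys)

level1, level2 = two_stage_residuals(groups)
```

Each set of residuals can then be inspected separately, for example with a histogram or normal probability plot for normality, and by comparing the spread of the level-1 residuals across groups or across values of the predictor for homoscedasticity.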

So to sum up, USE MULTILEVEL!

Goldstein, H. (1995). Multilevel Statistical Models. London: Edward Arnold; New York: Halsted.

Maas, C. J. M., & Hox, J. J. (2004). The influence of violations of assumptions on multilevel parameter estimates and their standard errors. Computational Statistics & Data Analysis, 46, 427–440.