Press "Enter" to skip to content

Exploring the margins command

It seems to me that the margins command really gives me a better understanding of the (linear) regression method itself. So far I have managed to predict expected values of the dependent variable conditional on the independent ones, as well as to estimate the average marginal effects of the independent variables.

margins, dydx(*)

After a linear regression, this command reports the average marginal effects. If no interaction effect is included in the model, it basically reports the b-coefficients (as they really are dy/dx). But as soon as I include an interaction in the model, it no longer reports the b-coefficient of the interaction; instead, it computes the average effect of each variable across the sample.
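
A minimal sketch of what I mean (the dependent variable y is made up for illustration; the liberalism × age interaction matches the specification I use further below):

regress y c.liberalism##c.age i.gender
margins, dydx(liberalism)

With this specification, margins, dydx(liberalism) averages the effect b_liberalism + b_interaction*age over the observed ages instead of just reporting b_liberalism.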

margins, at(age=(20 40 60) liberalism=(0(1)10) gender=0)

Without dydx(*), the margins command computes expected values of the dependent variable. And it does so in a nice way, as I can condition the expected values on the independent variables as I wish. Here I specified that I want predictions of Y for:

  • age=20, gender=0: for each value of liberalism between 0 and 10 in steps of 1
  • age=40, gender=0: for each value of liberalism between 0 and 10 in steps of 1
  • age=60, gender=0: for each value of liberalism between 0 and 10 in steps of 1

In the end STATA predicts 11 * 3 = 33 values.
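
If I want to keep those 33 predictions around for inspection, one way (just a sketch) is to add the post option, which stores them in e(b):

margins, at(age=(20 40 60) liberalism=(0(1)10) gender=0) post
matrix list e(b)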

margins, at(age=(20 40 60) gender=0) atmeans

If I do not want to specify a whole range of values, I can add atmeans. STATA then basically fixes liberalism at its sample mean, and I only get 3 predicted values.
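
For comparison, a small sketch of the difference: without atmeans, margins averages the predictions over the observed values of liberalism; with atmeans, it predicts at the sample mean of liberalism.

margins, at(age=(20 40 60) gender=0)
margins, at(age=(20 40 60) gender=0) atmeans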

margins, at(age=(20 40 60)) over(gender) atmeans

If I have a categorical variable, I can use over() to condition the expected values (or dy/dx) on it, e.g. gender. I noticed that STATA then also conditions the means of the other variables in the OLS model on gender.
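
The difference becomes visible when I compare over() with putting gender into at() (again just a sketch): with over(gender) atmeans, the mean of liberalism is taken separately within each gender group, whereas fixing gender in at() uses the whole-sample mean.

margins, at(age=(20 40 60)) over(gender) atmeans
margins, at(age=(20 40 60) gender=(0 1)) atmeans
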
So what can I do with all these values? First of all, I can interpret them. But even more importantly, I can draw nice graphs from them.

marginsplot, xdimension(at(liberalism)) recast(line) recastci(rarea) plot1opts(lcolor(black) lpattern(solid)) ci1opts(fcolor(gs12) lcolor(gs12) lpattern(solid))

xdimension(at(liberalism)) is the most important option here. The variable for which I computed the most values should go here. For instance, it would make no sense to put age on the x-axis, as there are only 3 values of age for each level of liberalism, but 11 values of liberalism for each age group. As I defined an interaction in the OLS model, I can thereby plot a nice conditional effects plot:
[Figure: marginsplot showing the interaction]
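
Putting everything together, the full sequence behind such a plot looks roughly like this (the dependent variable y and the exact model are again made up for illustration):

regress y c.liberalism##c.age i.gender
margins, at(age=(20 40 60) liberalism=(0(1)10) gender=0)
marginsplot, xdimension(at(liberalism)) recast(line) recastci(rarea) plot1opts(lcolor(black) lpattern(solid)) ci1opts(fcolor(gs12) lcolor(gs12) lpattern(solid))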

2 Comments

  1. Magdalena (04/13/2017)

    Dear Ben,
    I have a question regarding the interpretation of the confidence intervals. Are two predicted values significantly different from each other only when their respective confidence intervals don't intersect, or also when the confidence interval of one value does not include the other value?
    I hope I expressed myself clearly enough.
    Best,
    Magdalena

    • Benjamin Rosche (04/13/2017)

      Dear Magdalena,
      I quickly googled the question and the answer is: it is always true that if the confidence intervals do not overlap, the statistics are statistically significantly different.
      However, it is not necessarily true that they are not significantly different if the confidence intervals do overlap. The means are significantly different when |X1 – X2| > 1.96*sqrt(SE1^2+SE2^2), whereas there is no overlap between the CIs only when |X1 – X2| > 1.96*(SE1+SE2). Please also see this infobox from Cornell University.
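      A quick illustration with made-up numbers: if X1 – X2 = 3 and SE1 = SE2 = 1, then 1.96*sqrt(1^2+1^2) ≈ 2.77 < 3, so the difference is significant, yet 1.96*(1+1) = 3.92 > 3, so the two 95% confidence intervals still overlap.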
      Best, Ben
