It seems to me that the margins command really gives me a better understanding of the (linear) regression method itself. So far I have managed to predict expected values of the dependent variable conditional on the independent ones, as well as the average marginal effects of the independent variables.
margins, dydx(*)
After a linear regression this command reports the average marginal effects. If the model includes no interaction effect, the command basically reports the b-coefficients (as they really are dy/dx). But as soon as I include an interaction in the model, the command won’t report the b-coefficient of the interaction; instead it computes the average marginal effect of each variable, averaged over the observed values of the variable it interacts with.
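To see why the reported effect differs from the b-coefficient once an interaction is present, here is a short derivation. It assumes, purely for illustration, that the interaction is between age and liberalism; the post does not say which variables interact, and the symbols below are my own notation.

```latex
% Assumed model (illustrative): age x liberalism interaction, gender as a control
\[
  y_i = \beta_0 + \beta_1\,\mathrm{age}_i + \beta_2\,\mathrm{lib}_i
        + \beta_3\,\mathrm{age}_i\,\mathrm{lib}_i + \beta_4\,\mathrm{gender}_i + \varepsilon_i
\]
% The marginal effect of age now depends on liberalism
\[
  \frac{\partial y_i}{\partial\,\mathrm{age}_i} = \beta_1 + \beta_3\,\mathrm{lib}_i
\]
% Averaging over the n observations gives the average marginal effect
\[
  \mathrm{AME}_{\mathrm{age}}
    = \frac{1}{n}\sum_{i=1}^{n}\bigl(\hat\beta_1 + \hat\beta_3\,\mathrm{lib}_i\bigr)
    = \hat\beta_1 + \hat\beta_3\,\overline{\mathrm{lib}}
\]
```

With \(\beta_3 = 0\) the last line collapses to \(\hat\beta_1\), which is the no-interaction case where the average marginal effect equals the b-coefficient.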
margins, at(age=(20 40 60) liberalism=(0(1)10) gender=0)
Without dydx(*), the margins command computes the expected values of the dependent variable. It does so in a convenient way, as I can condition the expected values on the independent variables however I wish. Here I specified that I want predictions of Y for:
- age=20, gender=0: for each value of liberalism between 0 and 10 in steps of 1
- age=40, gender=0: for each value of liberalism between 0 and 10 in steps of 1
- age=60, gender=0: for each value of liberalism between 0 and 10 in steps of 1
In the end Stata predicts 3 * 11 = 33 values.
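For anyone who wants to try the commands so far without my data, the following is a minimal sketch on simulated data. The outcome name y, the data-generating process, and the choice of an age × liberalism interaction are all assumptions made for illustration; they are not the model from this post. Note that margins can only account for the interaction if it is written in factor-variable notation (##), not as a hand-made product variable.

```stata
* Hypothetical simulated data (all names and coefficients are illustrative)
clear
set seed 12345
set obs 500
generate age        = 18 + floor(63*runiform())
generate liberalism = floor(11*runiform())
generate gender     = runiform() < 0.5
generate y = 1 + 0.02*age + 0.3*liberalism - 0.004*age*liberalism ///
             + 0.5*gender + rnormal()

* Interaction in factor-variable notation so that margins knows about it
regress y c.age##c.liberalism i.gender

* Average marginal effects of all covariates
margins, dydx(*)

* Predicted values of y on the 3 x 11 grid, for gender = 0
margins, at(age=(20 40 60) liberalism=(0(1)10) gender=0)
```

The last command should list 33 predictive margins, one for each combination of the three ages and the eleven liberalism values.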
margins, at(age=(20 40 60) gender=0) atmeans
If I do not want to specify a whole range of values, I can add atmeans. Stata then basically holds liberalism at its mean, and I only get 3 predicted values.
margins, at(age=(20 40 60)) over(gender) atmeans
If I have a categorical variable, I can use over() to condition the expected values (or dy/dx) on it, e.g. gender. I noticed that Stata then also computes the means of the other variables in the OLS model separately for each gender.
So what to do with all these values? First of all, I can interpret them. But even more importantly, I can produce nice graphs from them.
marginsplot, xdimension(at(liberalism)) recast(line) recastci(rarea) plot1opts(lcolor(black) lpattern(solid)) ci1opts(fcolor(gs12) lcolor(gs12) lpattern(solid))
xdimension(at(liberalism)) is the most important option here. The variable for which I computed the most values should go here. For instance, it would make no sense to put age on the x-axis, as we only have 3 values of age for each level of liberalism, but 11 values of liberalism for each age group. As I defined an interaction within the OLS model, I can thereby plot a nice conditional effects plot: