Could the Policy Tensor be making a statistical error? Is the Trump-Overdose gradient robust? Does it look different in Obama-Trump counties? I have explained Trump’s election as the result of despair in Flyover country. Is this result robust or is it spurious? We are interested in structural relationships that are robust to estimation error. In particular, since this is an estimate of a cross-sectional gradient, it must be robust to heteroskedasticity and outliers, and indeed to the choice of OLS as the estimation procedure. These concerns motivate the following examination of the stability of the overdose gradient.
I want to abandon weighted OLS and read the result from order statistics. The sample median is a more stable measure of location than the sample mean. The interquartile range, suitably rescaled, is a more stable estimator of the population standard deviation than the traditional sample standard deviation: the square root of the sum of squared deviations from the sample mean over n-1. In order to compare predictors in scale, we standardize them by these stable measures of location and scale. Their stability is due to the fact that both are functions of rank-order statistics and thus robust to outliers.
We compute deviations from the sample median divided by an estimator of the population standard deviation that is a constant times the sample interquartile range. Since we are interested in a neighborhood of the normal family of distributions that is privileged by deep universality theorems, we set the normalizing constant to be 1/(-2 • the value of the inverse of the normal cumulative distribution function at 1/4), or about 0.741. This normalization follows from the exact expression for the interquartile range of a standard normal distribution, which is -2 times the inverse normal cumulative distribution function at 1/4, or about 1.349. We call this standardization by stable estimators of location and scale “regularization”. That is what our Matlab function regularize does to counties’ college graduation rate (College), population growth rate (PopGrowth), and change in the natural log of deaths due to overdose per hundred thousand (Overdose). The response is the difference in county vote share between Trump and Romney (Trump).
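The standardization just described can be sketched in a few lines. This is a hypothetical Python analogue, not the Matlab regularize function itself; the quartile interpolation rule (“inclusive”) and the toy data are assumptions made for illustration. It uses the fact that the interquartile range of a standard normal is -2 times the inverse normal CDF at 1/4, about 1.349.

```python
from statistics import NormalDist, median, quantiles

# IQR of a standard normal: Phi^{-1}(3/4) - Phi^{-1}(1/4) = -2 * Phi^{-1}(1/4), about 1.349
NORMAL_IQR = -2.0 * NormalDist().inv_cdf(0.25)

def regularize(values):
    """Centre on the median and scale by IQR / 1.349.

    Hypothetical Python analogue of the post's Matlab function;
    the quartile interpolation method is an assumption.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    scale = (q3 - q1) / NORMAL_IQR   # robust estimate of the standard deviation
    centre = median(values)          # robust estimate of location
    return [(v - centre) / scale for v in values]

# Toy data (not county data): replacing the largest value with a wild outlier
clean = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
dirty = clean[:-1] + [900.0]

# The median and quartiles do not move, so the other eight
# standardized values are unchanged by the outlier.
print(regularize(clean)[:8] == regularize(dirty)[:8])
```

Because only the median and the quartiles enter the calculation, a single extreme county cannot drag the standardized values of the rest, which is the point of standardizing by rank-based statistics.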
In order to estimate the slope coefficient we use the minimum covariance determinant estimator of Rousseeuw. This estimator is insensitive to outliers and has a high breakdown point. The reweighting scheme of Rousseeuw et al. (2012) yields the highest asymptotic efficiency. This is what the function robustslope() computes; robust() computes Newey and West’s standard errors for the slope coefficient. We can also check visually that the slope estimates are robust.
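To make the link between a robust covariance estimate and a regression slope concrete, here is one standard construction. It is offered as an assumption about how a function like robustslope() might proceed, not a description of its internals, and the symbols (h, H, and the hatted quantities) are notation introduced here. The minimum covariance determinant picks the subset of h observations whose sample covariance has the smallest determinant, and the slope is read off the resulting location and scatter estimates.

```latex
\begin{aligned}
\hat{H} &= \arg\min_{H \subset \{1,\dots,n\},\; |H| = h}
           \det\!\big(\mathrm{Cov}\{(x_i, y_i) : i \in H\}\big), \\
(\hat{\mu}, \hat{\Sigma}) &= \text{mean and (rescaled) covariance of }
           \{(x_i, y_i) : i \in \hat{H}\}, \\
\hat{\beta} &= \hat{\Sigma}_{xy} / \hat{\Sigma}_{xx}, \qquad
\hat{\alpha} = \hat{\mu}_y - \hat{\beta}\,\hat{\mu}_x .
\end{aligned}
```

With regularized predictors, a slope of this form is expressed per robust standard deviation of the predictor, which is how the gradients are reported here.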
We can have good confidence that the gradient of Trump against Overdose is about 2 percent in standard units. A one standard deviation move in overdose deaths is associated with a +2.0% swing toward Donald Trump in the cross-section of US counties. The robust standard error of the slope coefficient is shy of 0.2%, so the estimate sits roughly ten standard errors from zero and the gradient is highly statistically significant.
Controlling for college graduation rate and population growth, a one standard deviation move in overdose deaths is associated with a +1.1% move in Trump swing. The correlation coefficients, of course, live on the interval from -1 to +1. Spearman’s rank-correlation coefficient is more stable than Pearson’s. Both are invariant to regularization, since regularization only shifts each variable and rescales it by a positive constant. The simple correlation between overdose and Trump is one-third. Controlling for college graduation rate and population growth, the partial correlation is one-fourth.
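A small sketch may help fix ideas about rank correlation and partial correlation. Everything here is illustrative: the helper names and the toy data are assumptions, not the code or data behind the numbers above, and partial_corr handles a single control variable (controlling for two, as in the text, applies the same formula recursively).

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ranks 1..n, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def partial_corr(r_xy, r_xz, r_yz):
    """Partial correlation of x and y controlling for one variable z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Toy data (not county data): one extreme x value
x = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

# A shift and positive rescaling, which is all regularization does
x_shifted = [(v - 3.0) / 2.0 for v in x]

print(spearman(x, y), spearman(x_shifted, y))   # identical: ranks are unchanged
print(pearson(x, y), pearson(x_shifted, y))     # equal up to rounding
```

Spearman’s coefficient depends on the data only through ranks, so the extreme x value counts for no more than “largest”; Pearson’s coefficient uses the raw magnitudes and is correspondingly more exposed to it.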
Do counties where both Obama and Trump won stand out in this regard? The answer is yes. But it is complicated. There is a diachronic pattern in the robust slope coefficients for the two sub-samples of Obama-Trump counties. Counties that abandoned Obama in 2012 after voting for him in 2008 show a larger gradient than those that stayed with him in 2012. The slope for the former is +2.2%; for the latter it is +1.7%. Interestingly, both sets of party strongholds (counties that voted for the same party in 2008, 2012, and 2016) have gradients that are virtually identical to the national gradient (+2.0%). The pattern is in no way confined to Republican strongholds (+1.9%). Democratic strongholds (+1.9%), and especially counties that abandoned Obama in 2012 (+2.2%), exhibit a strong gradient. Conditional on counties that only abandoned the Democrats in 2016, the slope is +1.7%. The rank-order of the conditioned slopes is highly suggestive of a disillusionment with Obama already in 2012.
We have subjected our estimates to the most stringent tests. The pattern is robust. The stability of the estimate is not in doubt. The causal diagram underlying the strong correlations between education, population decline, and deaths of despair on the one hand, and the electoral swing to Trump on the other, is clear. There has been a breakdown in elite-mass relations of which Trump’s election and the counter-discourse to Boasian antiracism are merely symptoms. Vast portions of the country are in serious trouble. A lot of faith was invested in Obama in 2008 in the shadow of the financial crisis. That faith was already shattered in 2012, despite the false dawn of Obama’s victory. Trump was no surprise. An enterprising political analyst could have looked at the pattern already evident in 2012 and predicted further instability.
In effect, Trump is a message from Flyover Country for elites. Are American elites listening? Democrats in particular need to get their act together. It is Democrats who repaired elite-mass relations through the 20th century and thereby re-stabilized the system. They must do it again. In order to do so, they must abandon the idea that racism is the key to 2016. It is not. Widespread despair is the key to 2016.
Postscript. The utility of regularization can be seen at a glance from the kernel density of residuals. The next figure displays the empirical density of the residuals of the robust simple regression of Trump on Overdose, and two normal curves fitted to the residuals with the method of moments. (We need both moments since these aren’t OLS residuals and hence don’t sum to zero by construction.) The “iqr-based” curve is a normal distribution with mean equal to the median of the residuals and standard deviation set equal to their interquartile range/(-2 • the value of the inverse standard normal at 1/4), that is, their interquartile range divided by about 1.349; given by N(-0.0024,0.0437). The “sigma-based” curve is a normal distribution with mean equal to the sample mean and standard deviation equal to the sample standard deviation; given by N(-0.0033,0.0543). The former provides a tight fit for the residuals. The latter provides a rather poor fit. This is evidently due to the presence of outliers.
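The contrast between the two fits can be reproduced on synthetic numbers. The sketch below is an illustration under stated assumptions, not the residuals from the regression: it builds a deterministic sample that follows a normal with standard deviation 0.04 and adds four made-up outliers, then computes both pairs of location and scale. The function names iqr_fit and sigma_fit are hypothetical.

```python
from statistics import NormalDist, mean, median, quantiles, stdev

# IQR of a standard normal: -2 * Phi^{-1}(1/4), about 1.349
NORMAL_IQR = -2.0 * NormalDist().inv_cdf(0.25)

def iqr_fit(resid):
    """Median and IQR/1.349: the 'iqr-based' normal parameters."""
    q1, _, q3 = quantiles(resid, n=4, method="inclusive")
    return median(resid), (q3 - q1) / NORMAL_IQR

def sigma_fit(resid):
    """Sample mean and sample standard deviation: the 'sigma-based' parameters."""
    return mean(resid), stdev(resid)

# Synthetic residuals (assumed, for illustration): 200 evenly spaced
# quantiles of N(0, 0.04) plus four invented outliers.
n = 200
core = [NormalDist(0.0, 0.04).inv_cdf((i + 0.5) / n) for i in range(n)]
resid = core + [0.5, -0.5, 0.6, -0.6]

iqr_mu, iqr_sd = iqr_fit(resid)
sig_mu, sig_sd = sigma_fit(resid)
print(f"iqr-based:   N({iqr_mu:.4f}, {iqr_sd:.4f})")
print(f"sigma-based: N({sig_mu:.4f}, {sig_sd:.4f})")
```

In this toy sample the IQR-based scale should stay close to the 0.04 of the bulk of the data, while the sample standard deviation is inflated by the four outliers to roughly twice that, so the sigma-based curve would be too wide for the bulk of the data.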
We observe the same thing in the residuals of the 3-factor model. Rank-based methods of estimation for location and scale parameters are far more robust.