Like the Committee of Operations Analysts, the Policy Tensor has been perfecting the technique of data analysis and visualization. The problem I identified when I decomposed the explained variation into components due to each predictor was that fixed effects are signed: by squaring the coefficients we lose the information contained in their signs. When we compare multiple models of the world, the natural empirical strategy is to embed them in an enveloping model that contains the alternate models as particular cases (given by linear restrictions). In order to distinguish the wheat from the chaff we have to get a handle on how much work each variable is doing. That can tell us an awful lot about what’s going on.

The z-scores, t-statistics, and p-values of the predictors are comparable. But they only tell us about the significance of the estimates (they all measure deviation from chance), not their relative amplitude. The raw slope coefficients, in turn, are not comparable: the covariates may be more or less variable, thereby explaining more or less of the variation in the response. After struggling with this problem for years, I have realized that the best one can do is choose a measure of the typical spread of each predictor around its center. The interquartile range, the difference between the 75th and the 25th percentile, is a good measure of the bulk of the variation: it spans the middle half of the observations of any given predictor. The tricky bit is what to do with dummy variables. The solution is to declare their interquartile range to be 0.5 (as long as the mean of the dummy is not near 0 or 1). One can then roughly compare the “fixed effects” of any set of predictors against each other.
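To make the scaling rule concrete, here is a minimal sketch in Python with NumPy. The function name `iqr_scale` and the rule for detecting a dummy (a predictor whose only values are 0 and 1) are my own illustrative assumptions, not part of any library.

```python
import numpy as np

def iqr_scale(x, dummy_iqr=0.5):
    """Interquartile range of a predictor; dummies get the declared value.

    Hypothetical helper: a predictor is treated as a dummy when its only
    values are 0 and 1. The caveat that the dummy's mean should not be
    near 0 or 1 is left to the analyst and is not checked here.
    """
    x = np.asarray(x, dtype=float)
    if np.isin(x, [0.0, 1.0]).all():
        return dummy_iqr
    q75, q25 = np.percentile(x, [75, 25])
    return q75 - q25
```

For example, `iqr_scale(np.arange(1, 10))` takes the 75th percentile (7) minus the 25th percentile (3), while any 0/1 indicator is assigned 0.5 by convention.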

Cross-sectional regressions run a severe risk of heteroskedasticity. It is imperative to compute robust standard errors for your slope coefficients. Also, we recommend transforming the response variable into percentiles, which makes the amplitude of the fixed effects easier to interpret. Finally, it is meaningless to display estimated fixed effects without appropriate error bars. With these remarks we are ready to illustrate our proposal for a standardized visual representation of fixed effects in linear models.
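The two preparatory steps, percentile-transforming the response and computing robust standard errors, might be sketched as follows. This is only an illustration: the function names are hypothetical, and the HC1 (White) estimator is one common choice of robust covariance that I am assuming here, since the text does not say which variant is intended.

```python
import numpy as np

def to_percentiles(y):
    """Map each response value to its percentile rank in (0, 100].

    Ties are broken by order of appearance, which is a simplification.
    """
    y = np.asarray(y, dtype=float)
    ranks = np.argsort(np.argsort(y)) + 1  # ranks 1..n
    return 100.0 * ranks / len(y)

def ols_robust(X, y):
    """OLS with an intercept and HC1 heteroskedasticity-robust standard errors.

    Returns (coefficients, standard errors), intercept first.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator: the "meat" weights each row by its squared residual.
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = XtX_inv @ meat @ XtX_inv * n / (n - k)  # HC1 small-sample correction
    return beta, np.sqrt(np.diag(cov))
```

In practice one would pass `to_percentiles(y)` as the response to `ols_robust`, so that every slope is expressed in percentile points of the response.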

In our first illustration, the response is the time-depth of the Neolithic transition. The alternate models are as follows. Ashraf and Galor posit that Neolithic time-depth is quadratic in genetic diversity (proxied by migratory distance from Ethiopia, “mdist”). The Heliocentric model suggests that absolute latitude (“abslat”) should influence time-depth. The Diamond hypothesis suggests that the axes of the continents and distance from the Levant should control Neolithic time-depth. We pool these covariates in a single regression and compute robust standard errors. The bars display the fixed effects, defined as the absolute value of the slope coefficient times the interquartile range of the predictor. The error bars display the product of the interquartile range of the predictor and the standard error of the slope coefficient.
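The bar heights and error bars defined above reduce to two elementwise products. A minimal sketch, assuming the slope coefficients, their robust standard errors, and the interquartile ranges have already been computed; the function name `effect_bars` is hypothetical, and the numbers in the usage note are made up purely to show the arithmetic.

```python
import numpy as np

def effect_bars(coefs, std_errs, iqrs):
    """Bar heights and error-bar lengths for a fixed-effects chart.

    height = |slope| * IQR of the predictor
    error  = robust standard error of the slope * IQR of the predictor
    """
    coefs, std_errs, iqrs = (
        np.asarray(a, dtype=float) for a in (coefs, std_errs, iqrs)
    )
    heights = np.abs(coefs) * iqrs
    errors = std_errs * iqrs
    return heights, errors
```

With illustrative values, a slope of -2 with standard error 0.5 on a predictor whose interquartile range is 3 yields a bar of height 6 with an error bar of 1.5; a dummy with slope 0.5 and standard error 0.1 yields a bar of 0.25 with an error bar of 0.05. The two arrays can be handed directly to a bar-chart routine that accepts error-bar lengths.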

In our second illustration, the response is per capita income in 2000. Again we convert the response to percentiles. We use the same predictors as before. The alternate models are as follows. Ashraf and Galor posit that per capita income is a quadratic function of genetic diversity (“mdist”). Köppen posited that global polarization is a function of our Heliocentric geometry (“abslat”). And although he did not put it quite like that, Diamond implied that global polarization is controlled by the same predictors as Neolithic time-depth: axis and distance from the Levant.

The fixed effects thus displayed are not only comparable within the same model; we may also faithfully compare fixed effects across different models as long as they share the same response (preferably in percentile scores). Indeed, if we adopt this practice we can ditch ugly tables and display our results in a manner that is immediately intelligible and less subject to massaging. This would go a long way towards weeding out dicey results. It may even ease the replication crisis.