The Elegy of Serological Racialism: The Search for ‘Biochemical Races’

Just as scientific racialism was coming of age at the turn of the century, more profound developments were afoot. The year after the appearance of Ripley’s The Races of Europe (1899), Mendel’s work was rediscovered independently and almost simultaneously by three different workers, after being ignored by them and everyone else for a whole generation. The birth of genetics led to a major rift between geneticists and naturalists that would not be resolved until the modern evolutionary synthesis (1936-1947). The year after the rediscovery of Mendel, in 1901, Karl Landsteiner discovered human blood groups. It took a few years before variation in blood groups could be properly understood. That had to wait for the basic mathematics of population genetics to be figured out. In 1908, Wilhelm Weinberg, a German physician, and Godfrey H. Hardy, the famous mathematician at Cambridge, independently clarified the elementary mathematics of gene frequencies in a stable population.

Anthropologists immediately sensed the opening of a new frontier. Raciology may have been more popular than ever before, but there were problems aplenty. One could only go so far with skin pigmentation, hair cross-sections, stature, and other skeletal and cranial measurements. For one, there was little agreement between physical anthropologists on the number and identity of the races of man; there seemed to be as many races as race scientists, if not more! For another, progress was excruciatingly slow in understanding the origins of the races. All manner of theories abounded. The one issue on which all concerned agreed was the supremacy of the Nordic race. The only question was whether not just contemporary but all civilizations in history were secretly the creation of Grant’s Great Race: had the ancient Egyptians been Nordic too before mixing led to their racial degeneration? How else could one square the abjection of the contemporary Egyptian races and the astounding achievements of the ancient Egyptians?

Such old wives’ tales were a minor irritant to serious scientists. They were more concerned about the results of Dr Boas, who had shown in 1912 that the cephalic index (the ratio of the breadth of the skull to its length) was not as stable as hitherto believed. More precisely, Boas demonstrated that the cephalic index of second-generation immigrants differed from that of their parents. This came as a great shock, since the cephalic index had been considered the most diagnostic of racial characters ever since Anders Retzius invented it in the mid-nineteenth century. A lot more was riding on the stability of skull measurements than was commonly realized. For if one could no longer rely on cranial measurements to identify prehistoric races in the archeological record, then the entire racial history of Europe would have to be reconsidered. In particular, the dolichocephalic Iron Age conquerors of Europe could not then be identified with contemporary Germanic races and the whole edifice of Germanic supremacism would come crumbling down.

And whatever the implications for racial history, it was supremely important to figure out diagnostic racial characters so as to put physical anthropology on a firm scientific footing. As raciology moved into high gear, the search for absolutely unquestionable racial characters became desperate. There must be a way to identify races that was not confounded by the environment! That is where serology came in. Blood groups offered the tantalizing prospect of an incontrovertibly genetic criterion to identify races. It was firmly held that purely genetic characters like blood groups would be free from environmental influences, and thereby allow incontrovertible identification of the races of man. Genetic characters were thought to be the master key to ‘the race problem’.

Progress in serological racialism, however, was excruciatingly slow because data on racial variation in blood group frequencies was sorely lacking. Only a few whites had been tested. But there were tantalizing hints of population differences in blood group frequencies. The first opportunity for large-scale testing arrived with the World War, when dozens of colonized races came to fight for white men on the battlefields of Europe.

In 1919, the Hirshfelds, a husband and wife team of doctors, reported from the Macedonian front. “Serological Differences Between the Blood of Different Races” appeared in the Lancet, already at that point the world’s leading medical journal. The Hirshfelds sought to attack ‘the race problem’ by identifying ‘biochemical races’. They clarified that the English group I was not an independent blood group at all but merely a heterozygote of alleles A and B, which they labeled, properly enough, group AB. Their argument was statistical: ‘the frequency of the occurrence of A and B in central Europe can be brought into harmony with Mendel’s law’. Furthermore, they clarified that A and B are dominant and O recessive so that when a person has genotype AO or BO, their phenotype, their actual blood group, is A or B: ‘There are within the human species four properties of blood, A, non-A, B, and non-B. A and non-A, B and non-B behave to each other according to Mendel’s law, while A and B, non-A and non-B do not influence each other’.

They go on to ‘confirm Landsteiner’s observation that these group-properties have nothing to do with disease. They appear also not to alter with time.’ The implication was immediate: we may therefore ‘form an anthropological criteria for the discovery of hitherto unknown and anatomically invisible relations between different races.’ After an exhaustive statistical examination, they document a broad northeast to southwest gradient in Eurasia for their ‘biochemical race-index’, the proportion A:B. They propose that ‘A and B had different points of origin and that there are two different biochemical races which arose in different places’.

‘One can imagine,’ they note, ‘that when man appeared on the earth A and B were present in the same proportions in different races’. And the present distribution is the result of diversifying selection: ‘A is more suitable for increased resistance of the organism to disease in a temperate climate, while B is more suitable in a hot climate’. But this hypothesis is rejected out of hand as ‘improbable’. Instead, we should think of ‘India as the cradle of one part of humanity, namely, of the biochemical race B. Both the east (Indo-China) and to the west, towards Europe and Africa, a broad stream of Indians poured out, ever lessening in its flow, which finally although continually diminishing penetrated to Western Europe’. And we must assume that ‘A arose in North or Central Europe and spread out thence southwards and eastwards’.

It would later be discovered that the ABO blood group is a trans-species polymorphism in the primate order that has been under balancing selection for tens of millions of years. So the hopes of serological racialism were bound to be dashed sooner or later. Instead of an unchanging racial character encoding information on the origin of ‘biochemical races’, the gene frequencies responsible for blood group variation respond rapidly to changing environmental conditions. This is a case where selection completely confounds any population history signal.

The only published worldwide study of blood group frequencies that I know of is A.E. Mourant’s The Distribution of Human Blood Groups (1954). Unfortunately, I have not been able to get my hands on it yet. I found data here, via Bob Allison. As a quality check, I examined whether the frequencies matched the scattered frequencies mentioned in a number of papers. The data seems to be kosher, although not knowing the source makes me very nervous. (If you have access to Mourant’s monograph, please let me know.) In what follows, we’ll examine the distribution of blood group frequencies and do some elementary population genetics calculations. The goal is to understand what the pattern looks like and what explains it.

We have blood group frequencies for 104 countries. The phenotypes are A, B, AB and O. They correspond to genotypes AA and AO; BB and BO; AB; and OO, since A and B are dominant over O, which is recessive. If we denote the phenotype frequencies by $A$, $B$, $AB$ and $O$, and the allele frequencies by lowercase $a$, $b$ and $o$, then, for a population in Hardy-Weinberg equilibrium, $A=a^2+2ao$, $B=b^2+2bo$, $AB=2ab$, and $O=o^2$. Solving for the allele frequencies, we obtain $o=\sqrt{O}$ and, since $A+O=(a+o)^2$ and $B+O=(b+o)^2$, $a=\sqrt{A+O}-\sqrt{O}$ and $b=\sqrt{B+O}-\sqrt{O}$. As a quick check, we test whether $AB=2ab$ as expected. The correlation between expected and observed frequencies of AB is $r=0.929,\ p<0.0001$.
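To make the Hardy-Weinberg algebra concrete, here is a minimal Python sketch of the allele-frequency calculation. The function name and the toy phenotype frequencies are illustrative assumptions, not the country data used in this post; the toy numbers are generated from known allele frequencies so that the round trip can be checked.

```python
from math import sqrt

def abo_allele_frequencies(A, B, AB, O):
    """Estimate ABO allele frequencies (a, b, o) from phenotype
    frequencies, assuming Hardy-Weinberg equilibrium."""
    total = A + B + AB + O
    A, B, AB, O = (x / total for x in (A, B, AB, O))  # normalize to proportions
    o = sqrt(O)            # O = o^2
    a = sqrt(A + O) - o    # A + O = (a + o)^2
    b = sqrt(B + O) - o    # B + O = (b + o)^2
    return a, b, o

# Hypothetical phenotype frequencies built from a=0.3, b=0.1, o=0.6.
a0, b0, o0 = 0.3, 0.1, 0.6
A = a0**2 + 2 * a0 * o0
B = b0**2 + 2 * b0 * o0
AB = 2 * a0 * b0
O = o0**2

a, b, o = abo_allele_frequencies(A, B, AB, O)
expected_AB = 2 * a * b  # compare against the observed AB frequency
```

With real data the three estimates need not sum exactly to one, and the gap between `expected_AB` and the observed AB frequency is one rough indication of departure from Hardy-Weinberg proportions.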

If you get the same allele from both parents, you’re a homozygote; if you get a different one from each parent, you’re a heterozygote. The fraction of heterozygotes in a given population is called heterozygosity and is an important measure of genetic variation. Expected heterozygosity for our alleles is given by $H=2ab+2ao+2bo$. We want to understand the global variation in this important measure of genetic variability. The next figure displays the variation in heterozygosity across continents. We can see that Eurasia has more genetic variation for these genes than Africa, and the Americas have the least. We can also see that much of the variation is within continents. How much? We can compute that exactly.
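As a small illustration of the heterozygosity formula, the sketch below computes $H$ two ways: as the sum of the heterozygous genotype frequencies, and as one minus the sum of squared allele frequencies. The two agree whenever $a+b+o=1$. The function names and the allele frequencies are hypothetical.

```python
def heterozygosity_pairwise(a, b, o):
    """H as the sum of the three heterozygous genotype frequencies."""
    return 2 * a * b + 2 * a * o + 2 * b * o

def heterozygosity_from_homozygotes(freqs):
    """H as one minus the total frequency of homozygotes."""
    return 1.0 - sum(p * p for p in freqs)

# Hypothetical allele frequencies that sum to one.
a, b, o = 0.3, 0.1, 0.6
h_pairwise = heterozygosity_pairwise(a, b, o)
h_complement = heterozygosity_from_homozygotes((a, b, o))
```

The second form makes it plain why a population dominated by a single allele (a very high $o$, say) has low heterozygosity.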

Table 1 reports another measure of variation, the F-statistic (fixation index), denoted $F_{st}$. The next figure displays the same. This is the variance in allele frequencies between populations, computed as the mean of the diagonal elements of the R matrix. We can see that, by this measure, Asia and the Americas have more variation than the globe, while Africa and Europe have somewhat less. A simple variance decomposition shows that 15.1 percent of the international variation in mean allele frequencies is accounted for by the intercontinental variation in mean allele frequencies, whereas 84.9 percent is accounted for by variation within continents. Echoes of Lewontin.
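The between-versus-within split can be sketched as an ordinary sum-of-squares decomposition. This is only one plausible reading of the calculation: it treats a single frequency at a time and weights every country equally, both of which are assumptions, and the values and group labels are made up for illustration.

```python
from collections import defaultdict

def variance_shares(values, groups):
    """Return (between_share, within_share) of the total sum of squares
    when `values` are partitioned by the labels in `groups`."""
    grand_mean = sum(values) / len(values)
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    between = 0.0
    within = 0.0
    for members in by_group.values():
        group_mean = sum(members) / len(members)
        between += len(members) * (group_mean - grand_mean) ** 2
        within += sum((v - group_mean) ** 2 for v in members)
    total = between + within
    if total == 0:
        raise ValueError("all values are identical; shares are undefined")
    return between / total, within / total

# Hypothetical frequencies for four countries on two continents.
values = [0.60, 0.64, 0.70, 0.74]
groups = ["X", "X", "Y", "Y"]
between_share, within_share = variance_shares(values, groups)
```

By construction the two shares sum to one, so a small between-continent share means most of the variation sits inside continents.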

Table 1. F-statistics.

| Region | $F_{st}$ | Std Error |
|---|---|---|
| Europe | 0.0181 | 0.0045 |
| Asia | 0.0292 | 0.0049 |
| Africa | 0.0154 | 0.0066 |
| Americas | 0.0470 | 0.0068 |
| Pacific | 0.0071 | 0.0135 |
| Global | 0.0250 | 0.0028 |

We only have data for $N=4$ countries in the Pacific, so we omit that in the plot for clarity.
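For readers who want to see how an $F_{st}$ of this kind can be computed, here is a minimal sketch based on the usual definition of the R matrix, whose diagonal element for population $i$ is $(p_i-\bar{p})^2/(\bar{p}(1-\bar{p}))$ averaged over alleles. Equal population weights and the toy frequencies are assumptions; the exact weighting behind Table 1 is not reproduced here, and the standard errors are not computed.

```python
def fst_from_r_matrix(freqs):
    """Mean diagonal of the R matrix for equally weighted populations.

    `freqs` is a list of per-population allele-frequency tuples, all in
    the same allele order. Every mean allele frequency must lie strictly
    between 0 and 1.
    """
    n_pops = len(freqs)
    n_alleles = len(freqs[0])
    means = [sum(pop[k] for pop in freqs) / n_pops for k in range(n_alleles)]
    diagonal = []
    for pop in freqs:
        r_ii = sum(
            (pop[k] - means[k]) ** 2 / (means[k] * (1.0 - means[k]))
            for k in range(n_alleles)
        ) / n_alleles
        diagonal.append(r_ii)
    return sum(diagonal) / n_pops

# Hypothetical two-allele example: two populations at 0.4 and 0.6.
fst_toy = fst_from_r_matrix([(0.4, 0.6), (0.6, 0.4)])
# Identical populations have no between-population variance.
fst_zero = fst_from_r_matrix([(0.3, 0.7), (0.3, 0.7)])
```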

How do we understand this global variation in allele frequencies for blood groups? We first ask whether geographic distance correlates with genetic distance, as we would expect. That is, we want to test whether populations that are further away from each other are more dissimilar than populations close by. We can do this with the Mantel test. We use Nei’s standard genetic distance. We compute geodesic distance over land from the latitudes and longitudes by the Haversine formula, using the Sinai and the Bering Strait as waypoints where appropriate.
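The two distance measures can be sketched as follows. Nei’s standard distance for a single locus is $D=-\ln\left(J_{xy}/\sqrt{J_x J_y}\right)$, where $J_{xy}=\sum_i x_i y_i$ and $J_x=\sum_i x_i^2$. The overland distance is approximated by summing Haversine legs through whatever waypoints the caller supplies. The Earth radius, function names, and all coordinates below are assumptions for illustration, not the values used for the results in this post.

```python
from math import radians, sin, cos, asin, sqrt, log, pi

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption

def haversine_km(p, q):
    """Great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, p)
    lat2, lon2 = map(radians, q)
    h = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(h)))

def path_km(points):
    """Length of a route through successive waypoints."""
    return sum(haversine_km(p, q) for p, q in zip(points, points[1:]))

def nei_distance(x, y):
    """Nei's standard genetic distance for one locus."""
    jxy = sum(xi * yi for xi, yi in zip(x, y))
    jx = sum(xi * xi for xi in x)
    jy = sum(yi * yi for yi in y)
    return -log(jxy / sqrt(jx * jy))

# Hypothetical coordinates: origin, destination, and one waypoint.
start, waypoint, end = (0.0, 0.0), (30.0, 33.0), (0.0, 90.0)
direct = haversine_km(start, end)
via_waypoint = path_km([start, waypoint, end])

# Hypothetical allele frequencies (a, b, o) for two populations.
d_genetic = nei_distance((0.3, 0.1, 0.6), (0.2, 0.2, 0.6))
```

Routing through a waypoint can only lengthen the path relative to the direct great circle, which is the point of using it for populations separated by open sea.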

We strongly reject the null hypothesis of zero correlation. We also carry out the same test for each continent. Genetic distance is strongly correlated with geographic distance in Europe and the Americas, only weakly (though significantly) in Asia, and not significantly in Africa. (We only have 4 observations in the Pacific.) Why is that?

Table 2. Mantel tests.

| Region | Mantel r | p-value |
|---|---|---|
| Global | 0.2554 | $p<0.0001$ |
| Europe | 0.438 | $p<0.0001$ |
| Asia | 0.108 | $p<0.0001$ |
| Africa | 0.0839 | $p=0.1548$ |
| Pacific | 0.4326 | $p=0.0942$ |
| Americas | 0.3439 | $p<0.0001$ |
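For the curious, a Mantel test can be sketched in a few lines: correlate the upper triangles of the two distance matrices, then compare that correlation against the values obtained when the population labels of one matrix are shuffled. The one-sided alternative, the number of permutations, the seed, and the toy matrices are all assumptions; this is not necessarily the exact procedure behind Table 2.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def upper_triangle(matrix, order):
    """Upper-triangle entries after relabeling rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]

def mantel(d1, d2, permutations=999, seed=0):
    """One-sided permutation Mantel test; returns (r, p_value)."""
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    r_obs = pearson(x, upper_triangle(d2, identity))
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        if pearson(x, upper_triangle(d2, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Hypothetical populations on a line; "genetic" distance is set
# proportional to geographic distance, so r should be 1.
positions = [0.0, 1.0, 3.0, 7.0, 12.0]
geo = [[abs(p - q) for q in positions] for p in positions]
gen = [[0.01 * abs(p - q) for q in positions] for p in positions]
r_mantel, p_mantel = mantel(geo, gen)
```

Permuting labels rather than individual entries is what respects the dependence between the pairwise distances; an ordinary correlation test on the flattened matrices would overstate significance.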

In order to better understand the variation in blood group frequencies, we look at the principal components. These are the directions that explain the vast bulk of the variation. PC1 explains 62.2 percent of the variation; PC2 explains 37.1 percent; together they explain 99.3 percent of the variation. We can see that PC1 separates Americas and Africa from Asia and Europe; while PC2 separates Europe from Asia. But these are blind statistical constructs. What do they correspond to?
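The share of variation explained by each principal component is just the corresponding eigenvalue of the covariance matrix divided by the sum of all eigenvalues. The sketch below extracts the leading eigenvalues by power iteration with deflation, which is a simple stand-in for a proper linear-algebra routine and can converge slowly when eigenvalues are close together. Which variables entered the PCA reported above is not assumed here; the rows are toy data.

```python
from math import sqrt

def covariance_matrix(rows):
    """Sample covariance matrix of the columns of `rows`."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1)
             for j in range(d)] for i in range(d)]

def leading_eigenpair(m, iterations=500):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix."""
    d = len(m)
    v = [1.0 / (i + 1) for i in range(d)]  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(v[i] * mv[i] for i in range(d)), v

def explained_variance_shares(rows, components=2):
    """Fraction of total variance captured by each leading component."""
    cov = covariance_matrix(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))
    shares = []
    for _ in range(components):
        eigenvalue, v = leading_eigenpair(cov)
        shares.append(eigenvalue / total)
        # Deflate: remove the component just found before the next pass.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return shares

# Toy data lying exactly on a line, so one component explains everything.
rows = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
shares = explained_variance_shares(rows)
```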

This is where it gets very interesting. It turns out that PC1 is almost perfectly anti-correlated with heterozygosity ($r=-0.975,\ p<0.0001$) and almost perfectly correlated with the frequency of the homozygote O ($r=0.9935,\ p<0.0001$). (The correlation between heterozygosity and the frequency of blood group O is $r=-0.988,\ p<0.0001$.) So heterozygosity is almost entirely driven by the frequency of the homozygote O, and that alone explains roughly 62 percent of the international variation in blood group frequencies.

PC2, as it turns out, is strongly correlated with the Hirshfelds’ ‘biochemical race-index’, the ratio A/B. The next figure plots the variation of the ratio A/B. We can see that the Hirshfelds weren’t entirely mistaken. Blood group A is much more common relative to blood group B in Europe compared to Africa and Asia. What they could not know at the time, since Latin Americans largely stayed out of the World War, is that the A/B ratio is even higher in parts of the Americas.

So the broad pattern that emerges is that genetic distance in terms of blood group frequencies largely tracks geographic distance; the bulk of the variation is within continents; and subject to that qualification, heterozygosity is higher in Eurasia than elsewhere; blood group A is more common in Europe and the Americas, while B is more common in Asia. What explains these patterns?

First, geographic distance is expected to track gene frequencies because of isolation-by-distance. That is very well understood. What is less understood is the specific pattern of the distribution of A, B and O. Some progress has been made on the last. It turns out that the homozygote O confers protection against malaria. A large-scale study reported odds ratios for susceptibility to severe malaria, cerebral malaria, and malarial anemia in a dozen countries where malaria is endemic. We use the supplementary data from that paper to carry out Fisher’s exact tests. The null hypothesis is that the odds ratio equals 1, i.e., that carrying the homozygote O confers no advantage. Table 3 reports our estimates.

Table 3. Fisher’s exact tests for association between malaria and blood group O.

| Outcome | Odds Ratio | Std Error | N |
|---|---|---|---|
| Severe malaria | 0.7221 | 0.0289 | 28,942 |
| Cerebral malaria | 0.7612 | 0.0458 | 20,478 |
| Severe malarial anaemia | 0.6512 | 0.0558 | 19,325 |
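To show what goes into such a test, here is a minimal sketch of the odds ratio and a two-sided Fisher’s exact test for a 2x2 table whose rows are cases and controls and whose columns are group O and non-O. The counts are a tiny made-up example, not the study’s data. For tables as large as those in Table 3, a library routine such as `scipy.stats.fisher_exact` is the more practical choice.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """Sample odds ratio for the table [[a, b], [c, d]].

    a: cases with O,    b: cases without O,
    c: controls with O, d: controls without O.
    """
    return (a * d) / (b * c)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value: total probability of all tables with the same
    margins that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical counts, chosen only because the answer is easy to verify.
toy_or = odds_ratio(3, 1, 1, 3)
toy_p = fisher_exact_two_sided(3, 1, 1, 3)
```

With this orientation, an odds ratio below 1 means group O is rarer among cases than among controls, which is the direction reported in Table 3.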

Because malaria is endemic in the tropics, another way to test this is to see if the frequency of O is negatively correlated with distance from the equator. In our dataset, we find a moderately strong negative correlation ($r=-0.422,\ p<0.0001$). So the evidence is extremely strong that the homozygote O is under selection in places where malaria is endemic.

The forces keeping A and B under balancing selection are much less understood. One possibility is suggested by the very strong correlation between the frequency of A and distance from the equator ($r=0.640,\ p<0.0001$). Perhaps A confers protection against cold stress. But as far as I know, this hypothesis has not yet been tested.

As for the Hirshfelds’ hypothesis, that is easily debunked. We test whether the frequency of blood group A is correlated with distance from northern Europe after controlling for distance from the equator. We report correlations for the full sample as well as excluding the Americas. None of them are significant at the standard 5 percent level. So the Hirshfelds’ hypothesis for A fares very poorly.

Table 4. Partial correlations between geographic distance and frequency of blood group A, controlling for absolute latitude.

| Distance from | r (full sample) | p (full sample) | r (excluding the Americas) | p (excluding the Americas) |
|---|---|---|---|---|
| Britain | -0.148 | 0.137 | -0.191 | 0.0769 |
| France | -0.150 | 0.1312 | -0.197 | 0.0676 |
| Germany | -0.135 | 0.1725 | -0.173 | 0.1093 |
| Sweden | -0.116 | 0.2444 | -0.130 | 0.2292 |
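A first-order partial correlation of this kind can be computed from three ordinary correlations: $r_{xy\cdot z}=(r_{xy}-r_{xz}r_{yz})/\sqrt{(1-r_{xz}^2)(1-r_{yz}^2)}$. The sketch below uses made-up vectors in which $x$ and $y$ are related only through a shared component $z$, so the raw correlation is sizeable but the partial correlation vanishes. The names and data are illustrative assumptions, and no p-values are computed.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical data: z is the shared driver (think absolute latitude);
# e1 and e2 are mutually orthogonal noise terms.
z = [1, 1, 1, 1, -1, -1, -1, -1]
e1 = [1, 1, -1, -1, 1, 1, -1, -1]
e2 = [1, -1, 1, -1, 1, -1, 1, -1]
x = [zi + ei for zi, ei in zip(z, e1)]
y = [zi + ei for zi, ei in zip(z, e2)]

raw_r = pearson(x, y)
partial_r = partial_correlation(x, y, z)
```

This is the sense in which controlling for latitude can erase an apparent association between the frequency of A and distance from some reference point.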

Their hypothesis for the origins of ‘the biochemical race B’ fares better in such a naive test. Distance from India is strongly correlated with the frequency of B $r=-0.558,\ p<0.0001$, and this correlation becomes stronger once we control for absolute latitude $r=-0.670,\ p<0.0001$. It cannot be ruled out that genetic drift may at least in part be responsible for the geographic variation we observe in the frequencies of blood group A, and especially B.

The scientific racialists of the long midcentury passage meant two different things when they talked about race. First, they really thought that there are discrete anthropological races in man that correspond to the allopatric subspecies or geographic races of evolutionary systematics. That turns out to be wrong. We know there are no subspecies in our species. We also know why that is the case. It is simply because all the population structure that we do observe is too recent for the process of speciation to have gone far enough for subspeciation to have occurred. There is certainly fine-scale population structure, in that situated breeding populations have diverged to various degrees as a result of diversifying selection, founder effects and genetic drift. But the subspecies elsewhere in the primate order have been genetically isolated from each other for millions of years. In contrast, all our ancestors were mating with each other in Africa less than a hundred thousand years ago. This fact of recency was only discovered in 1987, which explains another recency: until quite recently, authorities in the biological sciences thought that there were allopatric subspecies in humans.

Second, the scientific racialists were interested in understanding our phylogeny and population history, what they called ‘the origins of the races’ and ‘racial history’. On that front, genetics has finally delivered. Or, at least, we have come very far indeed. Ancient DNA has completely revolutionized our understanding of population history. In the past couple of years, DNA from hundreds, even thousands, of fossils has yielded extraordinarily fine-scale resolution on not only our population history but that of our sister taxa. And the revolution is ongoing. Every day yields exciting new results. A basic grasp of population genetics is no longer optional for those who want to write the contemporary history of this ongoing scientific revolution. That is why I am teaching myself population genetics while we wait for freedom from the coronavirus.

As the revolution unfolds, we are running a real risk that the discoveries will be misunderstood by lay readers and misconstrued by the scientists themselves. The problem is not only that people have racialized preconceptions or that scientists are usually narrow specialists. What endows biology with the power it has as an ideology is not the strength of the explanations it provides. Instead, it derives its power from the strength of the explananda: the glaring and long-standing inequities of our lifeworld. It is the polarization of world order that makes biological reductionism such a compelling ideology. This means that in order to win the war against biological reductionism we need to abandon the sickly fear of the big questions that now haunts the humanities in the grip of Boasian antiracism.