World Affairs

Trump is Pushing Iran into Russian Arms

The most important consequence of the Bush demolition of the Iraqi state has been the reemergence of Iran as the most influential power in southwest Asia. The core of southwest Asia is now a vast zone of Iranian influence that General Suleimani ominously calls the “Greater Persian Gulf region.” Iran is now the dominant foreign power in Lebanon, Syria and Iraq, and has significant influence in Yemen and Afghanistan as well. With thousands of Iranian troops fighting in the Levant, Iran is projecting its power further west and more deeply than at any time since the peak of Safavid power in the seventeenth century.

Part of the reason is simply that Iran is arguably the most powerful state in the region. Figure 1 shows the distribution of war potential in the Middle East. All data is represented as national shares of selected power resources of the regional powers. Iran has a population of 80 million, a close second to Egypt’s 85 million. Its economy is comparable in size to Turkey’s and Saudi Arabia’s (although the latter is mostly income from oil sales on the global market and is not reflective of national capabilities). Iran has proven oil reserves of 160 billion barrels, second only to Saudi Arabia’s 269 billion. Its endowment of arable land is second only to Turkey’s. Most astonishingly, some 269,000 Iranians graduate with degrees in engineering or the sciences every year compared to just 212,000 in the rest of the regional powers combined.


Figure 1. Distribution of war potential in southwest Asia. Source: CIA, World Economic Forum. 

Israel is barely visible in the spider chart of war potential—a major flaw of these metrics. Israel punches dramatically above its weight for a number of reasons. First, Israel is a settler colony of the crème de la crème of Europe. Perhaps as a result of sustained selection on cognitive ability in medieval Europe or the survival effect of the liquidation of the bulk of European Jewry (smarter Jews presumably escaped the Nazis at higher rates than dumber Jews), Ashkenazi Jews have the highest IQs of any ethnic group ever recorded. Not coincidentally, Jewish people are massively overrepresented among Nobel Laureates. Combined with the traditional Jewish emphasis on education (with universal literacy probably as early as the second century CE), the per capita skill-set and knowhow of the Jewish state has no counterpart anywhere else in the world. Second, Israelis are far more willing to fight for the flag—a very important factor in warfighting capabilities since the rise of nationalism at the end of the eighteenth century—than any other nation for obvious historical reasons. Third, like Prussia in the classical European balance of power, Israel’s geostrategic position has led to the development of a highly effective operational art of war that has made it into a modern-day Sparta. Surrounded by hostile states and with neither the resources nor the manpower to win long, drawn-out wars of attrition, the militaries of both states cultivated an art of war that sought to front-load conflicts and seek the decisive victory. Fourth, Israel has successfully cultivated a close security relationship with the unipole—in no small part due to the influence of American Jewry. This has given Israel greater access to advanced weapons and military knowhow than any other regional power including Turkey (even though Turkey, unlike Israel, is a member of Nato).

Still, modulo the special case of Israel, Figure 1 provides a good approximation of the war potential of regional powers in southwest Asia. It shows that Iran has the most balanced portfolio of intrinsic power resources in the region. Saudi Arabia, by contrast, is a pure petrostate. Iran’s population is 2.7 times as large as Saudi Arabia’s. It has 5 times as much arable land, 4 times as many graduates, and produces 6.7 times as many engineers and scientists every year as Saudi Arabia. The Kingdom has been able to access and sell—with foreign expertise and knowhow—a much greater portion of its oil deposits and has, as a result, accumulated considerably greater financial resources than Iran.

Saudi Arabia has tried to convert its financial firepower into military might by spending gargantuan sums of money on weaponry. Figure 2 displays the real military spending of the regional powers as well as the real price of crude. (We start the clock in 1971, when the British left and the gulf regional security complex emerged.) The salafi oil monarchy is by far the biggest military spender in the region. Since 2003, Saudi military spending has grown rapidly along with the price of crude to reach levels dramatically higher than those of the other regional powers.


Figure 2. Military spending by regional powers in southwest Asia. Source: SIPRI.

But it is extremely difficult, if not outright impossible if other ingredients of national power are lacking, to convert financial resources into warfighting capabilities simply by spending giant sums of money. The most important determinants of national warfighting capability are after all the size and skill-set of the populace and its willingness to fight for the flag of the nation-state. The Saudi populace is much smaller, much less skilled, and not nearly as motivated to fight for the flag as that of Iran. This is why a war between Iran and Saudi Arabia will be pretty much a one-sided affair. However, since Saudi Arabia is a US protectorate it is not at risk of being conquered by its stronger neighbor. Due to the presence of the US pacifier, security competition in the bipolar gulf region has instead been projected onto regional playing fields.

The dominant story of the region since 2003 has been the expansion of the zone of weakness. Three hitherto strong states of the region, Iraq, Syria, and Libya, have joined Lebanon, Palestine, Yemen and Afghanistan (the last is on the border of southwest and south Asia) as the playing fields of the regional powers. The regional players are Iran, Saudi Arabia, Egypt, Turkey, and to a lesser extent Israel. Although each regional power has their own particular security interests, the object of the regional game is to secure the orientation of weak states, or if there is no central authority, to secure influence in the polity or security zone by bankrolling and arming local security actors. Even more important than the push factors of regional security competition are the pull factors of sub-state actors seeking patrons. These features are manifest in the Syrian war but are no less true of other parts of the zone of weakness.

Some players are more in demand than others. No one except the Phalangists wants to be caught hobnobbing with the Israelis. Even the Kurds are tight-lipped about their security cooperation with the Jewish state. More generally, transnational identities allow states in the region to mobilize opinion across borders. Sunni Arab groups, including many salafi jihadists, look to Saudi Arabia and the other Sunni Arab oil monarchies for support. Shiite actors seek Iranian support. Due to the rise in sectarian temperature—most dramatically as a result of the Syrian war—regional Sunni Arab actors, like the Palestinian resistance groups Hamas and Islamic Jihad, that used to be Iranian clients have pulled back. On the other hand, actors that were barely Shiite, such as the Alawi regime in Syria and the Houthis in Yemen, have become Shiite and been pushed further into Iranian arms. So the rise in sectarian temperature cuts both ways.

[Map: the brewing proxy war between Iran and Saudi Arabia]

Figure 3. The regional game.

When the Syrian uprising began, Saudi Arabia saw a major opening to wrest Syria—a state that is central to the Sunni Arab imaginary—away from the Iranian orbit. Weapons, money and fighters poured into the warzone through the Turkish rat line. Much of the flow originated in the oil monarchies and went to salafi jihadist groups such as ISIS, Jabhat al-Nusra and Ahrar al-Sham. But the Iranian-Hezbollah intervention prevented the fall of Assad. Once the Russians intervened on the regime’s side, the great Saudi dream of rolling back Iranian influence in the Levant became tenuous. With the fall of Aleppo to the regime’s forces, all such hopes were dashed.

Meanwhile, the US-Saudi puppet in Yemen had been displaced by the Houthis with the support of the former Yemeni president (a Sunni). Saudi Arabia’s aggressive young leader, Mohammed bin Salman al Saud (MBS), responded by launching an air war with the logistical and diplomatic help of the Obama administration. There was a lot of brouhaha about Iranian influence in Yemen, Saudi Arabia’s backyard. But Iranian influence was always more imagined than real. The Houthi political movement, in fact, enjoyed broad-based, cross-ethnic support and was neither simply a Shiite group nor an Iranian proxy. The main consequence of the Saudi terror campaign in Yemen was to give a boost to Al Qaeda in the Arabian Peninsula (AQAP), one of the most dangerous and capable salafi jihadist groups in the region. (Yemen was not the only place where the main result of Saudi meddling was to strengthen salafi jihadism.)

Lost in the regional narrative was a potential gamechanger. This was an alliance between Russia and Iran—something that has never before obtained in history. Even though both were simply fighting together to save the Assad regime and there were no plans for a broader alliance, there was always the potential for one. The Obama administration was smart enough to know that it would not be in the US interest if Iran acquired a rival great power patron. (Obama went so far as to say that the Saudis and the Iranians would have to “share the region.”) As long as the United States could keep the door ajar just a little bit, Iran had more to lose from defying the Western alliance than it stood to gain from acquiring a great power patron.

On this trip, Trump has slammed the door in Iran’s face. It may further the interests of the oligarchs connected to the Trump White House. But it makes no sense in terms of US interests in the region. We should not be surprised if Iran gets closer to Russia and the Russkies extend their influence in the Middle East as a result. I am not suggesting that this is a certainty. Russia has so far pursued defensive and limited aims—basically shoring up the Assad regime. But that is no guarantee that the Kremlin will not exploit this opening.

There was something deeply shameful about Trump declaring Iran to be a threat to peace while standing in the heart of terror finance. Claims that Iran is a sponsor of terror are entirely bogus. They are based on Iranian patronage of Hezbollah and Hamas, which for all their Islamic rhetoric are nationalist resistance groups, not Islamic terrorists. Islamic terrorists, like the one who murdered young kids in Manchester this week, are without a single exception salafi jihadists who are bankrolled by financiers in the permissive jurisdictions of the gulf oil monarchies that Trump just declared his eternal love for. In fact, Iran is the one Muslim power that is guaranteed to be an ally against salafi jihadism. If the United States were serious about tackling salafi jihadism, the place to start would be to put the oil monarchies in a financial straitjacket—all financial flows out of the gulf ought to be monitored by a terror finance task force set up by Western intelligence agencies.

It’s too easy to blame the Trump administration for following policies that are so manifestly against the US and Western interest. The truth is that the blame lies on a broad swath of the foreign policy community—including Democrats. Somehow the debacles of the Bush administration have failed to kill the rogue states doctrine that is at the root of America’s failed foreign policy.

Thinking

Why did the United States invade Iraq?


Bush’s decision to depose Saddam has always perplexed the Policy Tensor. I have previously argued that US policy with respect to Iraq after 1990 was inconsistent with foreign policy realism; that, during the 1990s, US foreign policy was guided by the rogue states doctrine that served as the justification for the forward-deployment of US forces around the globe (and defense spending high enough to allow for garrisoning the planet after the threat from the Soviet Union vanished into thin air) by inflating the threat posed by confrontation states; that Saddam became a poster child of the rogues’ gallery that the foreign policy elite in Washington said they were determined to contain; and that the US policy consensus on the threat posed by the person of Saddam Hussein meant that Saddam was most at risk from a revisionist policy innovation in Washington.

So when George Bush went about “searching for monsters to destroy,” Saddam was the most tempting target. The consensus in policy circles against Saddam explains why the US invaded Iraq and not Cuba, North Korea, Iran or Libya; or any other confrontation state that could, more or less convincingly, be framed as a “rogue state”, “outlaw state”, “backlash state”, or “Weapon State” (as Krauthammer put it in Foreign Affairs). In short, Bush was following the path of least resistance when he chose to overthrow President Hussein.

But why did Bush want to depose Saddam in the first place? The US veto on potential rivals’ access to gulf energy was already secured by the United States’ impregnable maritime power in the region. That is, with or without a friendly regime in Baghdad, the US could deny any challenger access to gulf energy simply using its overwhelming maritime power.

Moreover, almost any conceivable US national interest could’ve been more easily secured by bringing Saddam in from the cold. If Bush wanted his friends in the oil industry to benefit from access to Iraqi oil, Saddam could easily have been brought in from the cold on that very condition. If Bush wanted to ramp up Iraqi oil production to lower oil prices (and perhaps undermine Saudi Arabia’s position as the swing producer and its hold on OPEC), the easiest way to do that would’ve been to allow Western oil firms to invest in Iraqi production capacity. If Bush wanted an Iraqi regime that was a geopolitical ally of the United States and Israel, even that was within the realm of possibility with Saddam at the helm in Baghdad.

Suppose that for whatever reason it was impossible for the United States to work with Saddam. Then, any conceivable US interest would’ve been better served by replacing the regime led by Saddam Hussein with a more compliant military junta. For a democratic regime in Baghdad was ethno-demographically guaranteed to fall within the orbit of Iran.

In fact, what I found after combing through the archives of the 1990s was that US-Iraq relations had been personalized to an extraordinary degree and that there was an overwhelming consensus in the foreign policy establishment that the ideal scenario would be a military coup by a more accommodative general. The idea was that if Saddam were deposed by a more accommodative general, we would get the best of both worlds. An iron-fisted junta would provide stability in the sense that Iraq would serve as a bulwark against Iran and keep a lid on both ethnic nationalism (Shia, Sunni and Kurdish) and salafi jihadism. And a more accommodative leadership in Iraq would remove Iraq from the ranks of the confrontation states and thereby enhance the security, power and influence of the United States and its regional allies.

These considerations explain why, after kicking Saddam’s army out of Kuwait, Bush’s dad left Saddam in power and watched from the sidelines as Saddam crushed the Iraqi intifada. Bush Senior later explained the decision in his book that he coauthored with Scowcroft:

While we hoped that a popular revolt or coup would topple Saddam, neither the United States nor the countries of the region wished to see the breakup of the Iraqi state. We were concerned about the long-term balance of power at the head of the Gulf. Breaking up the Iraqi state would pose its own destabilizing problems.

The core of the Bush revolution in foreign policy was the decision to break with this policy consensus. Specifically, Bush Jr’s policy innovation was to overthrow Saddam Hussein without replacing him with a more accommodative military junta. What possible US interest could be served by that policy? What did principals in the Bush administration hope to accomplish? What was their grand-strategy? I think I finally have an answer.

My interpretation builds on the findings and arguments of a large number of scholars. For the sake of conciseness, I’ll focus exclusively on the excellent anthology edited by Jane Cramer and Trevor Thrall, Why Did the United States Invade Iraq? In what follows, I’ll summarize their findings before presenting my interpretation. All quotes that follow are from this book unless otherwise specified.


Cramer and Thrall argue that the core foreign policy principals in the Bush administration were President Bush, Vice President Dick Cheney and Defence Secretary Donald Rumsfeld. It’s plausible to imagine that they came under the sway of the neocons and their well-known strategy of regime change in Iraq in the heightened threat environment after 9/11. But that story is inconsistent with the facts.

The record indicates they did not even make a decision after 9/11; they apparently had already made up their minds so they did not need to deliberate or debate. Instead they discussed war preparations and strategies for convincing the public and Congress, with no planning for how to make democracy take shape in Iraq.

Cheney played an extraordinary role in the administration. In particular, he handpicked almost all the neocon hawks who led the drumbeat to war:

Cheney helped appoint thirteen of the eighteen members of the Project for the New American Century.… Cheney lobbied strongly for one open advocate of regime change—Donald Rumsfeld—who was appointed to be Secretary of Defense. And then, Cheney and Rumsfeld together appointed perhaps the most famous advocate for overthrowing Saddam Hussein in order to create a democracy in Iraq, Paul Wolfowitz, as Undersecretary of Defense. Cheney created a powerful dual position for Scooter Libby …John Bolton as special assistant Undersecretary of State for Arms Control and International Security; David Wurmser as Bolton’s chief assistant; Robert Zoellick as US Trade Representative; and Zalmay Khalilzad as head of the Pentagon transition team…. [Cheney appointed] Elliot Abrams, Douglas Feith, Richard Perle and Abram Schulsky.

But, Cramer and Thrall argue, quite convincingly in my opinion, that “the neoconservatives and the Israel lobby were “used” to publicly sell the invasion, while the plans and priorities of the neoconservatives were sidelined during the war by the top Bush leaders.”

The State Department and the oil industry were becoming increasingly alarmed about the neoconservatives’ oil plan and Chalabi’s open advocacy for it. In the eyes of the mainstream oil industry, an aggressive oil grab by the United States might lead to a destabilization of the oil market and a delegitimizing of the Iraq invasion. This was argued in an independent report put out on January 23, 2003 by the Council on Foreign Relations and the Baker Institute entitled Guiding Principles for U.S. Post-Conflict Policy in Iraq. The report cautioned against taking direct control of Iraqi oil, saying, “A heavy American hand will only convince them (the Iraqis), and the rest of the world, that the operation in Iraq was carried out for imperialist rather than disarmament reasons. It is in American interests to discourage such misperceptions….”

The State Department plan triumphed over the neoconservatives’ plan, and this helps demonstrate that Cheney, Rumsfeld and Bush did not allow the neoconservatives and the Israel lobby to dominate US foreign policy even from the inception of the invasion. In fact, Bush appointed Phillip Carroll, the former chief of Shell Oil, to oversee the Iraqi oil business. Carroll executed much of the oil industry’s preferred plans for Iraqi oil. Revealingly, when L. Paul Bremer, the head of the Coalition Provisional Authority, ordered the de-Ba’athification of all government ministries in Iraq, Carroll refused to comply with Bremer’s order because removing the Ba’athist oil technocrats would have hindered the Iraqi oil business. In the end, the Baker plan (aligned with US oil industry interests) was implemented in its entirety. The US official policy was to use Production Sharing Agreements (PSAs) that legally left the ownership of the oil in Iraqi government hands while attempting to ensure new long-term multinational oil corporation profits.

On de-Ba’athification, on Chalabi, on Iraqi participation in OPEC, on the privatization of Iraqi oil, on bombing Iran and Syria, on threatening Saudi Arabia or giving the Saudis access to advanced weaponry, the administration went counter to the neoconservatives’ proposals and policy desiderata. In fact, the “neoconservatives realized that they had been used to sell the war publicly but were marginalized when it came to the creation of Middle East policy. In 2006 prominent neoconservatives broke with the administration and resoundingly attacked Bush’s policies.”

So, if the neoconservative vision of an expanding zone of democratic peace was not the motivation for the invasion, what was? “Cheney, Rumsfeld and Bush,” Cramer and Thrall argue, “were US primacists and not realists.”

Cheney authorized Paul Wolfowitz to manage a group project to write up a new Defense Planning Guidance (DPG) drafted by various authors throughout the Pentagon in full consultation with the Chairman of the Joint Chiefs of Staff, Colin Powell (Burr 2008). The DPG was leaked to the New York Times on March 7, 1992 (Tyler 1992). The radical plan caused a political firestorm as it called for US military primacy over every strategic region on the planet.

The draft DPG leaked in 1992 was widely perceived as a radical neoconservative document that was not endorsed by the high officials in the George H. W. Bush administration. Dick Cheney sought to distance himself from the document publicly while heartily endorsing it privately. Pentagon spokesman Pete Williams claimed that Cheney and Wolfowitz had not read it. Numerous other Pentagon officials stepped forward to say that the report represented the views of one man: Paul Wolfowitz. The campaign to scapegoat Wolfowitz for the unpopular plan was successful and the press dubbed the DPG as the “Wolfowitz Doctrine.” However, recently released classified documents show that the document was based on Powell’s “base force” plan and was drafted with the full consultation of Cheney and many other high Pentagon officials (Burr 2008). In the days after the leak, Wolfowitz and others worried that the plan would be dropped altogether. But in spite of the controversy, Cheney was very happy with the document, telling Zalmay Khalilzad, one of the main authors, “You have discovered a new rationale for our role in the world.”

Cramer and Thrall conclude:

We think a gradual consensus is forming among scholars of the war that Cheney, and to a lesser degree Rumsfeld, were the primary individuals whom Bush trusted. These three leaders together shared the desire to forcefully remove Saddam Hussein, they made the decision, and they made the key appointments of the talented advisers who crafted the arguments to sell the war to the American people. We have shown that President Bush was a zealous participant in the decision to invade, but he was likely not a primary architect to the extent the much more seasoned Cheney and Rumsfeld were. We find that the recently released documents proving intentional intelligence manipulation (especially from the British Iraq Inquiry, see Chapter 9), combined with the long career paths of Cheney and Rumsfeld and the actions of these top leaders before and after 9/11, belie the perception that the administration was swept up by events and acted out of misguided notions of imminent threats, Iraqi connections to Al Qaeda, or crusading idealism. The United States did not emotionally stumble into war because of 9/11. On the contrary, the top leaders took a calculated risk to achieve their goals of US primacy, including proving the effectiveness of the revolution in military affairs, and strengthening the power of the president.


The Policy Tensor agrees with the characterization of principals in the Bush administration as primacists. The problem is that invading Iraq does not follow from the grand-strategy of primacy. The primacists’ argument is straightforward and indeed compelling. The idea is that it was in the US interest to prolong unipolarity as long as possible and that required an active policy to prevent the reemergence of a peer competitor. As the authors of the Defence Planning Guidance put it in 1992,

Our first objective is to prevent the re-emergence of a new rival, either on the territory of the former Soviet Union or elsewhere, that poses a threat on the order of that posed formerly by the Soviet Union. This is a dominant consideration underlying the new regional defense strategy and requires that we endeavor to prevent any hostile power from dominating a region whose resources would, under consolidated control, be sufficient to generate global power. These regions include Western Europe, East Asia, the territory of the former Soviet Union, and Southwest Asia.

Separately, in combination, or even in an alliance with a near-peer, the so-called rogue states were never (and never would be) in a position to pose “a threat on the order of that posed formerly by the Soviet Union.” The combined GDP of the “rogue states”—Iraq, Iran, Libya, North Korea, and Cuba—never exceeded that of California, Texas, or New York. Even if Saddam conquered the Arabian peninsula and consolidated control over its oil resources, he would be in no position to “generate global power.” In any case, the unipole could quite easily deter an Iraqi invasion of the Arabian peninsula.

Even a nuclear-armed Iraq would be in no position to impose its will on US protectorates in the region, much less on the United States itself. Those who argue that a nuclear-armed Iraq or Iran cannot be deterred simply don’t understand the logic of nuclear deterrence. If Saddam had successfully acquired a nuclear deterrent, the United States would not have been able to invade and occupy Iraq. But the Iraqi deterrent would have been useless for the purposes of aggression, conquest, or regional domination. Had he retaken Kuwait, the United States would still have been able to kick him out, simply because he would’ve been in no position to threaten the use of nuclear weapons against US forces: for then he would have been making the incredible threat of committing suicide in order to hold on to his conquests. Put more formally, extended deterrence is hard enough for the unipole; it is well-nigh impossible for a regional power like Iraq under Saddam.

If the United States under Bush had acted in accordance with the grand-strategy of primacy, she would have cared little about minor confrontation states and much more about actual potential rivals. In particular, the United States would have tried hard to thwart the emergence of a peer in the two extremities of Eurasia. A more aggressive strategy to maintain primacy would see the United States not only preventing the consolidation of either of these two regions under a single power, but also undermining the growth rate of the only power that has the potential to become a peer of the United States without conquering a strategically important region. That is, if the Bush administration had followed the grand-strategy of primacy, it would’ve blocked China’s admission into the WTO, and more generally, prevented China’s emergence as the workshop of the world. That would’ve prolonged US primacy with much more certainty than the destruction of the entire rogues’ gallery.

So what was the grand-strategy that made the decision to invade intelligible?

Jonathan Cook has argued for a much more radical proposal in Israel and the Clash of Civilizations. He argues that it was in the Israeli interest to have its regional rivals disappear from the ranks of the confrontation states and be broken up into statelets that would not pose any significant threats to Israel’s regional primacy; and that the Israelis managed to convince principals in the Bush administration of the merits of their revisionist agenda for the region:

I propose a different model for understanding the [Bush] Administration’s wilful pursuit of catastrophic goals in the Middle East, one that incorporates many of the assumptions of both the Chomsky and Walt-Mearsheimer positions. I argue that Israel persuaded the US neocons that their respective goals (Israeli regional dominance and US control of oil) were related and compatible ends. As we shall see, Israel’s military establishment started developing an ambitious vision of Israel as a small empire in the Middle East more than two decades ago. It then sought a sponsor in Washington to help it realise its vision, and found one in the neocons. (p. 91.)

Yinon’s argument that Israel should encourage discord and feuding within states – destabilising them and encouraging them to break up into smaller units – was more compelling [than Sharon’s status-quo, state-centric vision of Israeli regional primacy]. Tribal and sectarian groups could be turned once again into rivals, competing for limited resources and too busy fighting each other to mount effective challenges to Israeli or US power. Also, Israeli alliances with non-Arab and non-Muslim groups such as Christians, Kurds and the Druze could be cultivated without the limitations imposed on joint activity by existing state structures. In this scenario, the US and Israel could manipulate groups by awarding favours – arms, training, oil remittances – to those who were prepared to cooperate while conversely weakening those who resisted. (p. 118.)

Israel and the neocons knew from the outset that invading Iraq and overthrowing its dictator would unleash sectarian violence on an unprecedented scale – and that they wanted this outcome. In a policy paper in late 1996, shortly after the publication of A Clean Break, the key neocon architects of the occupation of Iraq – David Wurmser, Richard Perle and Douglas Feith – predicted the chaos that would follow an invasion. ‘The residual unity of the [Iraqi] nation is an illusion projected by the extreme repression of the state’, they advised. After Saddam Hussein’s fall, Iraq would ‘be ripped apart by the politics of warlords, tribes, clans, sects, and key families.’ (p.133.)

I think Cook is mistaken about the importance of Israeli influence but he is onto something. Even if Israel managed to persuade principals in the Bush administration, there is no evidence to suggest that the Israel lobby, or even the neocons more generally (the lines between the two are blurred), had decisive influence over the Bush administration’s Middle East policy. (My position here is congruent with Cramer and Thrall’s). But what is clear is the frame of reference in which smashing Israel’s rivals would be in the US interest.

More precisely, I think principals in the Bush administration figured that Israel was nearly guaranteed to be a strong ally of the United States in a difficult region. After the reorientation of Egypt (mid-1970s) and the Islamic revolution (1979), three regional poles prevented total US-Israeli domination of the Middle East: Iran, Iraq and Syria. Smashing these confrontation states would guarantee Israel’s regional primacy and therefore, I think principals in the Bush administration reckoned, further the US interest in more easily dominating the region in a permanent alliance with its junior geopolitical ally. In other words, the grand-strategy of the Bush administration was to remove, by threats or by the use of force, Israel’s regional rivals in the Middle East.

They hoped to overthrow, or cow into submission, the regimes of Iraq, Iran and Syria; and thereby establish unchallenged US-Israeli supremacy in the Middle East. What I am saying is that the United States’ grand-strategy was based on an ill-informed regional variant of offensive realism—one whose logic was conditional on a permanent alliance with a regional power—as opposed to the global and unconditional variant of offensive realism assumed by the grand-strategy of primacy (as put forward, say, by Mearsheimer).

It is clear that regional primacy was in the Israeli interest. It’s a bit of a stretch to argue that it was also in the US interest. The problem is that, military primacy or not, Israel simply does not have that much influence in the region. Because it is a pariah in the Middle East, few actors try to seek its patronage (the Kurds are the main exception); most look to Iran, Saudi Arabia, or global powers. It is nearly impossible for Israel to play the role formerly played by Iran under the Shah or Egypt under Nasser. The United States has no choice but to work with other regional powers (Egypt, Turkey, Syria, Saudi Arabia and Iran) to sort out regional problems. Moreover, from the perspective of a global power trying to minimize the costs of ensuring stability in a multipolar region of strategic significance, a balance of power is considerably more attractive than the precarious primacy of a pariah; perhaps even of one guaranteed to be a permanent ally.

But the fundamental flaw of the grand-strategy pursued by the Bush administration was not the conflation of US and Israeli interest. (It can be argued, after all, that since Israel was basically guaranteed to be a permanent ally, Israeli regional primacy was squarely in the US interest.) No, the fundamental flaw of the revisionist strategy was the outright dismissal of the costs of the ensuing instability. No matter how far the prewar consensus was from foreign policy realism, at least the unbounded costs of regional instability were understood. When Bush broke with the consensus and smashed the Iraqi state, he clearly did not appreciate just how bad things could get.

Breaking up confrontation states into ethnic statelets and zones of weakness may sound like a splendid idea to half-baked geopolitical analysts. But instability and weakness are a source of insecurity, not power; as both the United States and Israel have since discovered.

To wrap up: The grand-strategy pursued by the United States when it invaded Iraq was to smash the regional poles that acted as confrontation states in the Middle East, whose removal from the equation promised unchallenged US-Israeli supremacy in this strategically-relevant region. Principals in the Bush administration simply did not appreciate the unbounded costs of the regional instability that would ensue.

Thinking

Balance Sheet Capacity and the Price of Crude

I’ve written before about the macrofinancial importance of broker-dealers (a.k.a. Wall Street banks). I emphasized the key role played by dealers in the so-called shadow banking system and have shown that fluctuations in balance sheet capacity explain the cross-section of stock excess returns. I have also argued for a monetary-financial explanation of the commodities rout. In this post, I will show that fluctuations in dealer balance sheet capacity also explain fluctuations in the price of crude.

The evidence can be read off Figure 1. Recessions are shown as dark bands. The top-left plot shows the real price of crude for reference. The spikes in the 1970s correspond to the oil price shocks in 1973 and 1979. Note the price collapse in 1986 and the price shock that attended the Iraqi occupation of Kuwait (the spike in the 1990 recession). Note also the extraordinary run-up in the price of crude during the 2000s boom and the return of China-driven triple digit prices after the great recession. Finally, note the dramatic oil price collapse in 2014 due to the US fracking revolution. We know that much of the fluctuation in the oil price was a result of geopolitical, supply-side and exogenous demand-side factors. My claim is that much of the rest is driven by the excess elasticity of the financial intermediary sector.


Figure 1. Source: Haver Analytics, author’s calculations.

Specifically, I show that fluctuations in the balance sheet capacity of US securities broker-dealers predict fluctuations in the oil price. We define balance sheet capacity as the log of the ratio of aggregate financial assets of broker-dealers to the aggregate financial assets of US households. We stochastically detrend the quarterly series by subtracting the trailing 4-quarter moving average from the original series. The plot on the top-right displays the stochastically detrended balance sheet capacity. We will show that it predicts 1-quarter ahead excess returns on crude.
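To make the construction concrete, here is a minimal sketch in Python, assuming the quarterly aggregates have already been loaded into a pandas DataFrame with hypothetical columns bd_assets (broker-dealer financial assets) and hh_assets (household financial assets); it illustrates the definition rather than reproducing the exact code behind Figure 1.

```python
import numpy as np
import pandas as pd

def capacity_shock(flow_of_funds: pd.DataFrame) -> pd.Series:
    """Shock to balance sheet capacity: the log ratio of broker-dealer
    to household financial assets, stochastically detrended by
    subtracting its trailing 4-quarter moving average."""
    capacity = np.log(flow_of_funds["bd_assets"] / flow_of_funds["hh_assets"])
    # rolling(4) includes the current quarter; append .shift(1) if a
    # strictly lagged trailing average is intended instead.
    trend = capacity.rolling(window=4).mean()
    return capacity - trend
```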

We run 30-quarter rolling regressions of the form,

R^{crude}_{t+1} = \alpha + \beta \times capacity_t + \varepsilon_{t+1}, \qquad (1)

where R^{crude}_{t+1} is the return on Brent in quarter t+1 in excess of the risk-free rate and capacity_t is the shock to balance sheet capacity in quarter t. We must take care in interpreting rolling regressions: instead of the two parameters suggested by equation (1), we are in effect running 183 separate regressions, each with its own pair of parameters.
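A minimal sketch of these rolling predictive regressions follows, assuming quarterly series of Brent excess returns and the capacity shock constructed above; the names and the statsmodels-based estimation are my own illustration, not the exact code behind the figures.

```python
import pandas as pd
import statsmodels.api as sm

def rolling_predictive_regressions(excess_ret: pd.Series,
                                   capacity: pd.Series,
                                   window: int = 30) -> pd.DataFrame:
    """30-quarter rolling OLS of next-quarter crude excess returns on
    the current capacity shock, i.e. equation (1)."""
    # Align R_{t+1} with capacity_t by shifting returns back one quarter.
    data = pd.concat({"y": excess_ret.shift(-1), "x": capacity}, axis=1).dropna()
    rows = []
    for end in range(window, len(data) + 1):
        chunk = data.iloc[end - window:end]
        fit = sm.OLS(chunk["y"], sm.add_constant(chunk["x"])).fit()
        rows.append({"date": chunk.index[-1],
                     "beta": fit.params["x"],
                     "pvalue": fit.pvalues["x"],
                     "r2": fit.rsquared})
    return pd.DataFrame(rows).set_index("date")
```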

The plot on the bottom right displays the percentage of variation explained in each predictive regression. We see that balance sheet capacity became a significant predictor of the price of crude in the mid-1980s. Its predictive capability diminished in the mid-1990s, before reaching new heights in the 2000s. The period 1999-2007 was the heyday of financially-driven fluctuations in the price of crude. That relationship collapsed in the second quarter of 2007. During the financial crisis and the period of postcrisis financial repression, the relationship disappeared entirely. It only recovered at the very end of our sample, in 2016.

The bottom-left plot in Figure 1 displays a signed measure of the influence of balance sheet capacity on the price of crude. We display the product of the slope coefficient in equation (1) with one minus its p-value. This measure kills three birds with one stone. We can (a) keep track of the sign of the slope coefficients (to see whether or not it reverses direction too much), (b) get an additional handle on the time-variation of the strength of the predictive relationship, and (c) control the noise by attenuating the slope coefficients in inverse proportion to their statistical significance. Note that we have reversed the direction of the Y axis in the plot on the bottom-left.
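Continuing the sketch above, the bottom-left measure is then just the rolling slope attenuated by one minus its p-value (the series names here are hypothetical):

```python
# Hypothetical usage, continuing from the earlier sketches.
shock = capacity_shock(flow_of_funds)
results = rolling_predictive_regressions(brent_excess_returns, shock)

# Signed, significance-attenuated influence of capacity on crude:
# the slope coefficient times (1 - p-value), as in the bottom-left panel.
results["influence"] = results["beta"] * (1.0 - results["pvalue"])
```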

The slope and significance metric tells a story that is very similar to the one told by the percentage of variation explained. Moreover, we can see that the relationship is economically large and negative. The interpretation is that positive shocks to balance sheet capacity compress the risk premium embedded in the price of crude. When balance sheet capacity is plentiful, risk arbitrageurs (speculators who make risky bets) bid away expected excess returns. Conversely, when balance sheet capacity is scarce, risk arbitrageurs are constrained in the amount of leverage they can obtain from their dealers and are therefore compelled to leave expected excess returns on the table.


The main result above—that dealer balance sheet growth predicts returns on crude oil—was originally obtained by Erkko Etula for his doctoral dissertation at Harvard. 

Geopolitics

The Near-Unipolar World Reconsidered


Figure 1. Countries rescaled by the number of people earning more than $200 a day in 2002. Source: WorldMapper.Org.

This is an ongoing conversation with Ted Fertik.

Thanks for the link man. Tooze (2014) was an amazing read! I want to talk about two things. First, I am going to shamelessly insist that I was right about the role of near-unipolarity in Tooze’s schema. Second, I want to talk about how near-unipolarity relates to the history of the twentieth century. All quotes are from Tooze (2014) unless otherwise specified.


“In the wake of World War I, I think, the stakes were higher.” Why were they higher? “What was at stake was a new global order under the sign of what has been variously referred to as ultraimperialism, American hegemony, or Empire”; that Churchill described as “the pyramids of peace” (quoted in The Deluge). [Emphasis mine.]

The “central challenge facing the German political elite” was the “sheer scale of twentieth-century Anglo-American economic predominance.” Tooze shows that the interwar order was one of unabashed Anglo-American cohegemony. The “main question” of the international politics of the interwar era is “how to understand the insurgency against the order.” More pertinently, the question facing the Germans was should they “conform and assimilate themselves to its power” or “mount an insurgency against it”?

“We must view that struggle as more asymmetric, and thus as an expression of the combined and uneven development of the international system…” [Emphasis mine.]

“Neither the international relations of the interwar period, nor World War II itself are well-described by models…derived from the more truly multipolar world of the late nineteenth century.”


I contended that the world from the close of the nineteenth century to the rise of China in the 2000s was secretly near-unipolar. I presented GDP numbers and argued that GDP was a good enough measure to detect near-unipolarity. But I also have strong historical reasons to think carefully about near-unipolarity—as the quotes from Tooze above suggest.

When I say near-unipolar, I mean that there is an especially strong state in the system such that no state could hope to prevail against it in a war or an extended rivalry; that there is no doubt about the identity of the strongest state in the system; and that when statesmen evaluate great power war and great power military alliances they have to care a great deal about the unipole’s position—calculations about the outcome of great power war and confrontation premised on the unipole’s disinterest have to be thrown out of the window if the unipole weighs in.

Note again that this is a weak definition. It just means that there is a football in a pile of tennis balls. The unipole may not even have a standing army. It may or may not exercise influence abroad. A lesser great power may run the maritime world and lesser great powers may worry much more about each other (especially their strong neighbours) than the unipole. In fact, if the unipole is insular and isolationist, it may not cause the other great powers any headaches at all. Indeed, they may even make fun of its extant weakness.

However, in a near-unipolar world, such disdain is contingent on the foreign policy of the unipole. Were the unipole to mobilize its war potential and be willing to use force on the world stage, the lesser great powers would have to eat their insulting words. Moreover, lesser great powers threatened by each other can be expected to try to secure the protection of the unipole. An alliance with the unipole is, after all, very useful given the rule of force in world affairs. The unipole may therefore get pulled into other people’s fights despite itself. Even insularity and isolationism thus do not completely thwart the gravitational pull exerted by the unipole.

One could write a convincing history of the twentieth century in this frame of reference. The philosophy of history that such a work requires is almost insultingly straightforward. The basic fact of near-unipolarity serves as the single explanatory variable. That is, the twentieth century as the story of the clarification of the real balance of forces. Or history catching up with the secret topology of the world.

In this frame of reference, the outcomes of the main great power confrontations of the twentieth century—World War I, World War II, and the Cold War—were more or less known in advance. The game had, in fact, been rigged from the get-go.

What explains the British surrender of naval preponderance in the Western Hemisphere in 1900? What explains the results of 1918? What explains the Washington Naval Conference of 1922? The stability of the interwar European order in the 1920s? The breakdown of that order and the turn to radicalism in 1931? The startling fact that not the winner but the power that basically sat out the Second World War dictated the postwar order? The outright capitulation of the second ranked power in the so-called bipolar world in 1989? All these questions have a single answer: The fact of the asymmetric size of the football.

Is it possible to construct a tighter, more parsimonious narrative frame? Is it not, then, a quite compelling frame of reference?


References

Tooze, Adam. “The Sense of a Vacuum.” Historical Materialism 22.3-4 (2014): 351-370.

World Affairs

The Geopolitics of the French Election


If populism prevails in France, it would have a much more dramatic impact on geopolitical affairs than the victory of populism in the offshore powers.

The immediate geopolitical impact of Brexit is now clear. Britain’s unilateral decision to withdraw has unified the continent against the perfidious Albion. Little England has, in effect, been forced into splendid isolation from the continent. Going forward, Britain will not have a seat at the European table.

Across the pond, Trump pulled off the greatest bait-and-switch in US political history. All promises of economic nationalism and isolationism have been shelved. Instead, the political high tide of the GOP is being mined in the service of plutocratic interests. While the Bannon-Sessions-Miller wing remains committed to constructing an ethnic security state—and is worryingly empowered to do so—the foreign economic and security policies of the United States are back in the hands of the Blob. Despite expectations to the contrary, Liberal Hegemonism is alive and well in the United States. American populism seems to have been tamed at least in so far as US foreign policy is concerned.

The consensus on the impact of a Marine Le Pen victory is that it would spell the demise of the European project. In particular, it would mean the end of the euro. But there is a perfectly feasible alternative scenario that may obtain if she wins. In that scenario, the French withdrawal will leave an even more unified and compact EU; one that would look more and more like a German Delian league.

I will argue that the second scenario is more likely than the first and that it would reconfigure European geopolitics in important and foreseeable ways. But first, how did we get here?


During the 1950s and 1960s, the core of the world economy was tripolar. Global industrial production was dominated by national champions of the United States, Japan and Germany (more generally, western Europe). Northern labor had a quasi-monopoly on Northern knowhow. More precisely, national labor pools had a quasi-monopoly on the knowhow of national industrial champions. Within this context, domestic bargains between labor and capital along the lines of the Treaty of Detroit enabled broad-based growth in the core of the world economy.

The Western economic miracle of the early postwar era came to an end as a result of the Japanese onslaught. Japan was able to combine its relatively low wages with high productivity growth to dramatically swell its share of global product markets, helped along by the container revolution of the late 1960s that enhanced the integration of global product markets. Unable to compete with the Japanese, western firms tried in vain to increase the growth rate of their productivity. The western world slid into a deep stagflation crisis during the 1970s that prepared the ground for the neoliberal counterrevolution, whose main agenda was to put finance firmly back in the saddle and tear up the Treaty of Detroit. That alone would’ve been sufficient to guarantee the rise of plutocracy, precarity and wage polarization. But even more momentous developments were afoot that undermined the geoeconomic foundations of broad-based prosperity in the center countries even more thoroughly.

The 1980s witnessed the telecommunication and intermodal transportation revolution whereby transportation and communication costs collapsed enough to split the atom of national champions. The result was what Richard Baldwin calls ‘the second unbundling’ of global production whereby managers in the headquarter economies (US, Germany, Japan) trained cheap foreign labor within a day’s flying distance of headquarters to create Factory North America, Factory Europe and Factory Asia. This unified national labor markets at the regional level even as global product markets integrated further at the global level.

The addition of hundreds of millions of Chinese workers to Factory Asia created a tremendous imbalance between capital and labor. The result was even greater worker insecurity, wage polarization, and intensification of plutocracy. At the same time, the reemergence of global finance unleashed the financial cycle that also whipsawed market society with bubbles and financial crises.

The consequence of these global-macro fluctuations and structural changes was tremendous trauma in western market societies. This trauma manifested itself as the rise of populism and the destruction of the political center.


Back to geopolitics. A Le Pen victory in France cannot be ruled out. I claimed that if she wins, France and England would likely face a virtual German Delian league. The reason is twofold. First, European states are extraordinarily exposed to the risk of a dramatic unraveling of Factory Europe. A breakup of the eurozone would result in the effective repeal of deep integration on the continent and therefore a wholesale disruption of European value chains. In order to forestall such a catastrophic scenario, other states are likely to stick together. Second, while the renationalization of market society might be a viable strategy for medium-weights like the UK and France, it is decidedly not a viable strategy for either the small rich northern nations of the European core or the poorer nations on Europe’s southern and eastern periphery. Neither the Nordics and the Low Countries on the one hand, nor eastern European nations like Poland and the Czech Republic on the other, have any possibility of maintaining their prosperity after renationalization. Basically, the depth and breadth of skill-sets in every country except maybe Germany and perhaps Italy provide an insufficient basis to compete in global product markets. They can put up tariff walls to protect domestic industry. But then the size of their national markets would sharply limit their firms’ economies of scale. That’s what doomed the import-substitution strategies of countless developing nations.

Le Pen dreams of an independent France that can stand up to Germany. But the reality is far more sobering. The harsh truth is that, after the second unbundling, without combining the knowhow of the North with the cheap labor of the South you can no longer be truly competitive in global product markets. This is even true of the United States. Renationalization is a recipe for geoeconomic irrelevance. Isolationist Britain and France will not become third world states, but they will be marginalized; both in Europe and in the global marketplace.

So what happens if I’m right and the UK and France face a German Delian league on the continent? France and the United Kingdom are independent nuclear powers and the main politico-military actors in Europe. They are essential partners for the United States. If and when they withdraw into isolationism, European security will rest on German shoulders. Le Pen has already declared her intention to reorient Franco-Russian relations—away from deterrence in alliance with the western bloc and toward bilateral cooperation. But the security of the Baltics, the Nordics, and central and eastern Europe depends on more than US engagement. It requires a European great power partner. What this means is that the German Delian league would have to obtain its own conventional and nuclear deterrent. Pressures in this direction are already building as a result of growing doubts about the US commitment to defend Europe. They will intensify with the French exit, if it obtains.

One can think more systematically about the geopolitical implications of populism in England and France through the theory of regional security complexes (RSCs). Buzan and Wæver described the European great power RSC as a security community (a territorial cluster of states for whom war amongst each other is unthinkable) of great powers protected by one global power and threatened by another. This configuration is unlikely to last. The question is, How will it be transformed?

In the Policy Tensor’s view, the answer is that the British and French exits correspond to a major structural transformation of the European great power RSC. In particular, there is a strong potential for the emergence of a new, dominant security actor on the scene; namely, the German Delian league. Whether or not security competition reemerges in western Europe will then depend on whether the secondary powers (France and the UK) exacerbate the Russian threat to the league’s security. A possible withdrawal of the American pacifier—which is no longer unthinkable either—will make it considerably more likely. But whether or not security competition reemerges, we’re looking at a major transformation of the European RSC.

Thinking

Silicon Valley’s Visions of Absolute Power


Omnipotence is in front of us, almost within our reach…


Yuval Noah Harari

The word “disrupt” only appears thrice in Yuval Noah Harari’s Homo Deus: A Brief History of Tomorrow. That fact cannot save the book from being thrown into the Silicon Valley Kool-Aid wastebasket.

Harari is an entertaining writer. There are plenty of anecdotes that stoke the imagination. There is the one about vampire bats loaning blood to each other. Then there’s the memorable quip from Woody Allen: Asked if he hoped to live forever through the silver screen, Allen replied, “I don’t want to achieve immortality through my work. I want to achieve it by not dying.” The book is littered with such clever yarns interspersed with sweeping, evidence-free claims. Many begin with “to the best of our knowledge” or some version thereof. Like this zinger: “To the best of our knowledge, cats are able to imagine only things that actually exist in the world, like mice.” Umm, no, we don’t know that. Such fraudulent claims about scientific knowledge plague the book and undermine the author’s credibility. And they just don’t stop coming.

“To the best of our scientific understanding, the universe is a blind and purposeless process, full of sound and fury but signifying nothing.” How does one even pose this question scientifically?

“To the best of our knowledge” behaviorally modern humans’ decisive advantage over others was that they could exercise “flexible cooperation with countless number of strangers.” Unfortunately for the theory, modern humans eliminated their competitors well before any large-scale organization. During the Great Leap Forward—what’s technically called the Upper Paleolithic Revolution when we spread across the globe and eliminated all competition—mankind lived in small bands. There was virtually no “cooperation with countless strangers.” The reason why we prevailed everywhere and against every foe was because we had language, which allowed for unprecedented coordination within small bands. Harari seems completely unaware of the role of language in the ascent of modern humans. He claims that as people “spread into different lands and climates they lost touch with one another…” Umm, how exactly were modern humans in touch with each other across the vast expanse of Africa?

“To the best of our scientific understanding, determinism and randomness have divided the entire cake between them, leaving not a crumb for ‘freedom’…. Free will exists only in the imaginary stories we humans have invented.” Here, Harari takes one of the hardest open problems and pretends that science has an answer. The truth is much more sobering. Not only is there no scientific consensus on the matter of free will and consciousness, it would be disturbing if there were, since we have failed to develop the conceptual framework to attack the problem in the first place.

“According to the theory of evolution, all the choices animals make –whether of residence, food or mates – reflect their genetic code.… [I]f an animal freely chooses what to eat and with whom to mate, then natural selection is left with nothing to work with.” Nonsense. The theory of evolution, whether in the original or in its modern formulations, is entirely compatible with free will. Natural selection operates statistically and inter-generationally over populations, not on specific individuals. It leaves ample room for free will.


There are eleven chapters in the book. All the sweeping generalizations and hand-waving of the first ten are merely a prelude to the final chapter. Here, Harari goes in for the hard sell.

Dataism considers living organisms to be mere “biochemical algorithms” and “promises to provide the scientific holy grail that has eluded us for centuries: a single overarching theory that unifies all scientific disciplines….”

“You may not agree with the idea that organisms are algorithms” but “you should know that this is current scientific dogma…”

“Science is converging on an all-encompassing dogma, which says that organisms are algorithms, and life is data processing.”

“…capitalism won the Cold War because distributed data processing works better than centralized data processing, at least in periods of accelerating technological changes.”

“When Columbus first hooked up the Eurasian net to the American net, only a few bits of data could cross the ocean each year…”

“Intelligence is decoupling from consciousness” and “non-conscious but highly intelligent algorithms may soon know us better than we know ourselves.”

No, the current scientific dogma isn’t that organisms are algorithms. Nor is science converging on an all-encompassing dogma that says that life is data processing. Lack of incentives for innovation in the Warsaw Pact played a greater role in the outcome of the Cold War than the information-gathering deficiencies of centralized planning. When Columbus first “hooked up the Eurasian net to the American net,” much more than a few bits of data crossed the ocean. For instance, the epidemiological unification of the two worlds annihilated much of the New World population in short order.


There are more fundamental issues with Dataism, or more accurately, Data Supremacism. First, data is simply not enough. Without theory, it is impossible to make inferences from data, big or small. Think of the turkey. All year long, the turkey thinks that the humans will feed and take care of it. Indeed, every day the evidence keeps piling up that the humans want to protect it. Then comes Thanksgiving.

Second, the data itself is not independent of reference frames. This is manifest in modern physics; in particular, in both relativity and quantum physics. What we observe critically depends on our choice of reference frame. For instance, if Alice and Bob measure a spatially-separated (more precisely, spacelike separated) pair of entangled particles, their observations may or may not be correlated depending on the axes onto which they project the quantum state. This is not an issue of decoherence. It is in principle impossible to extract information stored in a qubit without knowledge of the right reference frame. To go a step further, Kent (1999) has shown that observers can mask their communication from an eavesdropper (called Eve, obviously) if she doesn’t share their reference frame. Even more damningly, reference frames are a form of unspeakable information—information that, unlike other classical information, cannot be encoded into bits to be stored on media and transmitted on data links.

Third and most importantly, we do not have the luxury of assuming that an open problem will be solved at all, much less that it will be solved by a particular approach within a specific time-frame. This is a major source of radical uncertainty that is never going to go away. Think about cancer research. Big data and powerful new data science tools make the researchers’ jobs easier. But they cannot guarantee their success.

The main contribution of my doctoral thesis was solving the problem of reference frame alignment for observers trying to communicate in the vicinity of a black hole. The problem has no general solution. I exploited the locally-measurable symmetries of the spacetime to solve the problem. Observers located in the vicinity of a black hole can use my solution to communicate. If they don’t know my solution or don’t want to use it, they need to discover another solution that works. They cannot communicate otherwise. This is just one of countless examples where data plays at best a secondary role in solving concrete problems.

Empirical data is clearly very important for solving scientific, technical, economic, social, and psychological problems. But data is never enough. Much more is needed. Specifically, solving an open problem often requires a reformulation of the problem. That is, it often requires an entirely new theory. We don’t know yet whether AI will ever be able to make the leap from calculator to theoretician. We cannot simply assume that it will; it may run into insurmountable problems for which no solution is ever found. If and when AI does make that leap, however, there is no reason why humans should not be able to comprehend its theories. More powerful theories turn out to be simpler, after all. And if and when that happens, the Policy Tensor for one would welcome our AI overlords.


Harari makes a big fuss about algorithms knowing you better than yourself. “Liberalism will collapse the day the system knows me better than I know myself.” Well, my weighing machine “knows” my weight better than I do. What difference does it make if an AI could tell me that I really and truly have a 92 percent chance of a successful marriage with Mary and only a 42 percent chance with Jane? Assuming that the AI knows me better than I do, why would I treat it any differently from my BMI calculator, which insists that I am testing the upper bound of normality? After all, I already concede that the BMI calculator is more accurate about my fitness than my subjective judgment, just as the AI is about my love life.

Artificial Intelligence without consciousness is just a really fancy weighing machine. And data science is just a fancy version of simple linear regression. Why would Liberalism collapse if Silicon Valley delivers on its AI promises? Won’t we double down on the right to choose precisely because we can calibrate our choices much better?

If AI gains consciousness, on the other hand, all bets are off. Whether as an existential threat or as a beneficial disruption, the arrival of the first Super AI will be an inflection point in human history. The arrival of advanced aliens would pose similar risks to human civilization.

If you are interested in the potential of AI, you’re better off reading Nick Bostrom’s Superintelligence: Paths, Dangers, Strategies. If you are curious about scientific progress and our technological future in deep time as well as the primacy of theory, you should read David Deutsch’s The Beginning of Infinity: Explanations That Transform the World. If you are more interested in the unification of the sciences, look no further than Peter Watson’s Convergence: The Idea at the Heart of Science. (Although I do recommend Watson’s The Modern Mind, The German Genius, and The Great Divide more and in that order.) Finally, for the limits of scientific and technical advance, see John D. Barrow’s Impossibility: The Limits of Science and the Science of Limits.


Silicon Valley’s Kool-Aid encompasses long-term visions of both techno-utopias and techno-dystopias. The unifying fantasy is that, in the long run, technological advance will endow man and/or AI with absolute power. In the utopias, men become gods and mankind conquers the galaxy; and in much more ambitious versions, the entire universe itself. (It would be orders of magnitude harder to reach other galaxies than other stars.) In the more common dystopias, man won’t be able to compete with AI, or the elite will but the commoners won’t (this is Harari’s version). In either case, the Valley’s Kool-Aid is that technology will revolutionize human life and endow some—depending on the narrative: Silicon Valley, tech firms, AIs, the rich, all humans, or AI and humans—with god-like powers. Needless to say, this technology will come out of Silicon Valley.

In reality, a small oligopoly of what Farhad Manjoo calls the Frightful Five (Facebook, Google, Apple, Microsoft and Amazon) has cornered unprecedented market power and stashed its oligopolistic supernormal profits overseas, just to rub it in your face. Apple alone has an untaxed $216 billion parked offshore. Far from obeying the motto “data wants to be free,” these oligopolistic firms hoard your data and sell it to the highest bidder. The dream of tech start-ups is no longer a unicorn IPO. Rather, it is a buyout by one of the oligopolists. If you are a truly successful firm in the Valley, you have either benefited from network externalities (like the Frightful Five, which are all platforms with natural economies of scale), or you have managed to shed costs onto the shoulders of people who would once have been your employees or customers (like Airbnb, Uber and so on). Silicon Valley is, in fact, more neoliberal than Wall Street. While the Street has managed to shed risks and costs onto the state, the Valley has managed to shed risks and costs onto employees and customers. That’s basically the Valley’s business model.

Alongside its hoard of financial resources, the Valley has also cornered an impressive amount of goodwill in the popular consciousness. Who does not admire Google and Apple? This goodwill is the result of the industry’s accomplishments, some of them genuine, some thrust upon it by fate. In the popular imaginary, the Valley is the source of innovation and dynamism, to be celebrated, not decried. Yet the concentration of power in the industry has started to worry the best informed. If mass technological unemployment does come to pass, the Valley should not be surprised to find itself a pariah and a target of virulent populism, in the manner of Wall Street in 2009.


Causal Inference from Linear Models

For the past few decades, empirical research has shunned all talk of causation. Scholars use their causal intuitions but they only ever talk about correlation. Smoking is “associated with” cancer, being overweight is “correlated with” higher morbidity rates, college education is the strongest “correlate of” Trump’s vote gains over Romney, and so on and so forth. Empirical researchers don’t like to use causal language because they think that causal concepts are not well-defined. It is a hegemonic postulate of modern statistics and econometrics that all falsifiable claims can be stated in the language of modern probability. Any talk of causation is frowned upon because causal claims simply cannot be cast in the language of probability. For instance, there is no way to state in the language of probability that smoking causes cancer, that the tides are caused by the moon, or that rain causes the lawn to get wet.

Unfortunately, or rather fortunately, the hegemonic postulate happens to be untrue. Recent developments in the study of causality—long treated as a purely philosophical subject—by Judea Pearl and others have made it possible to talk about causation with mathematical precision and to use causal models in practice. We’ll come back to causal inference and show how to do it in practice after a brief digression on theory.

Theories isolate a portion of reality for study. When we say that Nature is intelligible, we mean that it is possible to discover Nature’s mechanisms theoretically (and perhaps empirically). For instance, the tilt of the earth’s axis is the cause of the seasons. It’s why the northern and southern hemispheres have opposite seasons. We don’t know that from the correlation of the tilt and the seasons, because correlation does not imply causation (and in any case the two are not perfectly correlated). We could, of course, be wrong, but we think that this is a ‘good theory’ in the sense that it is parsimonious and hard to vary—it is impossible to fiddle with the theory without destroying it. [This argument is due to David Deutsch.] In fact, we find this theory so compelling that we don’t even subject it to empirical falsification.

Yes, it is impossible to derive causal inference from the data with absolute certainty. This is because, without theory, causal inference from data is impossible, and theories on their part can only ever be falsified; never proven. Causal inference from data is only possible if the data analyst is willing to entertain theories. The strongest causal claims a scholar can possibly make take the form: “Researchers who accept the qualitative premises of my theory are compelled by the data to accept the quantitative conclusion that the causal effect of X on Y is such and such.”

We can talk about causality with mathematical precision because, under fairly mild regularity conditions, any consistent set of causal claims can be represented faithfully as causal diagrams which are well-defined mathematical objects. A causal diagram is a directed graph with a node for every variable and directed edges or arrows denoting causal influence from one variable to another, e.g., {X\longrightarrow Y} which says that Y is caused by X where, say, X is smoking and Y is lung cancer.

The closest thing to causal analysis in contemporary social science is the structural equation model. In order to illustrate the graphical method for causal inference, we’ll restrict attention to a particularly simple class of structural equation models, that of linear models. The results hold for nonlinear and even nonparametric models. We’ll work only with linear models not only because they are ubiquitous but also for pedagogical reasons. Our goal is to teach rank-and-file researchers how to use the graphical method to draw causal inferences from data. We’ll show when and how structural linear models can be identified. In particular, you’ll learn which variables you should and shouldn’t control for in order to isolate the causal effect of X on Y. For someone with basic undergraduate-level training in statistics and probability, it should take no more than a day’s work. So bring out your pencil and notebook.

A note on attribution: What follows is largely from Judea Pearl’s work on causal inference. Some of the results may be due to other scholars. There is a lot more to causal inference than what you will encounter below. Again, my goal here is purely pedagogical. I want you, a rank-and-file researcher, to start using this method as soon as you are done with the exercises at the end of this lecture. (Yes, I’m going to assign you homework!)

Consider the simple linear model,

{\large Y := \beta X + \varepsilon }

where {\varepsilon} is a standard normal random variable independent of X. This equation is structural in the sense that Y is a deterministic function of X and {\varepsilon} but neither X nor {\varepsilon} is a function of Y. In other words, we assume that Nature chooses X and {\varepsilon} independently, and Y takes values in obedience to the mathematical law above. This is why we use the asymmetric symbol “:=” instead of the symmetric “=” for structural equations.

We can embed this structural model into the simplest causal graph {X\longrightarrow Y}, where the arrow indicates the causal influence of X on Y. We have suppressed the dependence of Y on the error {\varepsilon}. The full graph reads {X\longrightarrow Y \dashleftarrow\varepsilon}, where the dashed arrow denotes the influence of unobserved variables captured by our error term. The path coefficient associated with the link {X\longrightarrow Y} is {\beta}, the structural parameter of the simple linear model. A structural model is said to be identified if the structural parameters can in principle be estimated from the joint distribution of the observed variables. We will show presently that, under our assumptions, the model is indeed identified and the path coefficient {\beta} is equal to the slope of the regression equation,

{\beta=r_{YX}=\rho_{YX}\sigma_{Y}/\sigma_X},

where {\rho_{YX}} is the correlation between X and Y and {\sigma_{X}} and {\sigma_{Y}} are the standard deviations of X and Y respectively.  {r_{YX}} can be estimated from sample data with the usual techniques, say, ordinary least squares (OLS).
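
Here is a minimal Python sketch, not part of the original argument, that simulates the base case and checks that the OLS slope agrees with the formula above. The sample size and the value of {\beta} are arbitrary choices for illustration.

import numpy as np

rng = np.random.default_rng(0)
n, beta = 100_000, 1.7            # hypothetical sample size and true path coefficient

X = rng.normal(size=n)
eps = rng.normal(size=n)          # standard normal error, independent of X
Y = beta * X + eps                # the structural equation Y := beta*X + eps

slope = np.polyfit(X, Y, 1)[0]    # OLS slope of Y on X

# The same quantity via beta = rho_YX * sigma_Y / sigma_X
rho = np.corrcoef(X, Y)[0, 1]
formula = rho * Y.std() / X.std()

print(slope, formula)             # both should be close to 1.7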

What allows straightforward identification in the base case is the assumption that X and {\varepsilon} are independent. If X and {\varepsilon} are dependent then the model cannot be identified. Why? Because in this case there is spurious correlation between X and Y that propagates along the “backdoor path” {X\dashleftarrow\varepsilon\dashrightarrow Y}. See Figure 1.


Figure 1. Identification of the simple linear model.

Here’s what we can do if X and {\varepsilon} are dependent. We simply find another observed variable Z that is a causal “parent” of X (i.e., {Z\longrightarrow X} ) but independent of {\varepsilon}. Then we can use it as an instrumental variable to identify the model. This is because there is no backdoor path between Z and Y (so the regression of Y on Z identifies {\alpha\beta} ) or between Z and X (which identifies {\alpha}). See Figure 2.


Figure 2. Identification with an instrumental variable.

In that case, {\beta}  is given by the instrumental variable formula,

{\beta=r_{YZ}/r_{XZ}}.
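
To see the instrumental variable formula at work, here is another simulation sketch of my own, with hypothetical coefficients. An unobserved confounder U makes the naive regression of Y on X biased, while the ratio {r_{YZ}/r_{XZ}} recovers {\beta}.

import numpy as np

rng = np.random.default_rng(1)
n, alpha, beta = 100_000, 0.8, 1.5

Z = rng.normal(size=n)                     # instrument: a parent of X, independent of the error
U = rng.normal(size=n)                     # unobserved confounder of X and Y
X = alpha * Z + U + rng.normal(size=n)     # X := alpha*Z + (confounded error)
Y = beta * X + U + rng.normal(size=n)      # Y := beta*X + (confounded error)

naive = np.polyfit(X, Y, 1)[0]             # biased by the backdoor path X <- U -> Y
r_yz = np.polyfit(Z, Y, 1)[0]              # identifies alpha*beta
r_xz = np.polyfit(Z, X, 1)[0]              # identifies alpha
print(naive, r_yz / r_xz)                  # the ratio is close to 1.5; the naive slope is not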

More generally, in order to identify the causal influence of X on Y in a graph G, we need to block all spurious correlation between X and Y. This can be achieved by controlling for the right set of covariates (or controls) Z. We’ll come to that presently. First, some graph terminology.

A directed graph is a set of vertices together with arrows between them (some of which may be bidirected). A path is simply a sequence of connected links, e.g., {i\dashrightarrow m\leftrightarrow j\dashleftarrow k} is a path between i and k. A directed path is one in which all the arrows point in the same direction, e.g., {i\longrightarrow j\longrightarrow m\longrightarrow k} is a directed path from i to k. A directed acyclic graph is a directed graph that does not admit closed directed paths. That is, a directed graph is acyclic if there are no directed paths from a node back to itself.

A causal subgraph of the form {i\longrightarrow m\longrightarrow j} is called a chain and corresponds to a mediating or intervening variable m between i and j. A subgraph of the form {i\longleftarrow m\longrightarrow j} is called a fork, and denotes a situation where the variables i and j have a common cause m. A subgraph of the form {i\longrightarrow m\longleftarrow j} is called an inverted fork and corresponds to a common effect. In a chain {i\longrightarrow m\longrightarrow j} or a fork {i\longleftarrow m\longrightarrow j}, i and j are marginally dependent but conditionally independent (once we condition on m). In an inverted fork {i\longrightarrow m\longleftarrow j}, on the other hand, i and j are marginally independent but conditionally dependent (once we condition on m). We use family connections as shorthand for talking about directed graphs. In the graph {i\longrightarrow j}, i is the parent and j is the child. The descendants of i are all nodes that can be reached by a directed path starting at i. Similarly, the ancestors of j are all nodes from which j can be reached by a directed path.

Definition (Blocking). A path p is blocked by a set of nodes Z if and only if p contains at least one arrow-emitting node that is in Z, or p contains at least one inverted fork whose middle node is outside Z and has no descendant in Z. A set of nodes Z is said to block X from Y, written {(X\perp Y |Z)_{G}}, if Z blocks every path from X to Y.

The logic of the definition is that the removal of the set of nodes Z completely stops the flow of information between X and Y. Consider all paths between X and Y. No information passes through an inverted fork {i \longrightarrow m\longleftarrow j} unless m or one of its descendants is conditioned on, so paths that contain inverted forks can be ignored as long as their colliders (and the colliders’ descendants) are kept out of Z. The remaining paths are “live,” and Z must contain at least one arrow-emitting node on each of them in order to stop the flow of information between X and Y along these paths. Note that whether Z blocks X from Y in a causal graph G can be decided by visual inspection when the number of covariates is small, say less than a dozen. If the number of covariates is large, as in many machine learning applications, a simple algorithm can do the job.
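
For the curious, here is one such algorithm, sketched in Python on the assumption that the networkx library is available. It uses the standard trick of checking separation in the moralized ancestral graph, which is equivalent to the blocking criterion defined above. The function name and the example graph are mine.

from itertools import combinations
import networkx as nx

def blocks(G, X, Y, Z):
    """Return True if the set Z blocks every path between X and Y in the DAG G."""
    X, Y, Z = set(X), set(Y), set(Z)
    # 1. Keep only X, Y, Z and their ancestors.
    keep = set().union(X, Y, Z)
    for node in list(keep):
        keep |= nx.ancestors(G, node)
    H = G.subgraph(keep).copy()
    # 2. Moralize: marry the parents of every node, then drop arrow directions.
    M = H.to_undirected()
    for node in H.nodes:
        for a, b in combinations(H.predecessors(node), 2):
            M.add_edge(a, b)
    # 3. Remove Z; any surviving path from X to Y means Z fails to block.
    M.remove_nodes_from(Z)
    return not any(nx.has_path(M, x, y) for x in X for y in Y)

# A chain X -> W -> Y alongside a fork X <- C -> Y:
G = nx.DiGraph([("X", "W"), ("W", "Y"), ("C", "X"), ("C", "Y")])
print(blocks(G, {"X"}, {"Y"}, {"W", "C"}))   # True: both paths are blocked
print(blocks(G, {"X"}, {"Y"}, {"W"}))        # False: the fork through C stays open

You can run every candidate covariate set through such a check before estimating a single regression.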

If Z blocks X from Y in a causal graph G, then X is independent of Y given Z. That is, if Z blocks X from Y then X|Z and Y |Z are independent random variables. We can use this property to figure out precisely which covariates we ought to control for in order to isolate the causal effect of X on Y in a given structural model.

Theorem 1 (Covariate selection criteria for direct effect). Let G be any directed acyclic graph in which {\beta} is the path coefficient of the link {X\longrightarrow Y}, and let {G_{\beta}} be the graph obtained by deleting the link {X\longrightarrow Y}. If there exists a set of variables Z such that no descendant of Y belongs to Z and Z blocks X from Y in {G_{\beta}}, then {\beta} is identifiable and equal to the regression coefficient {r_{YX\cdot Z}}. Conversely, if Z does not satisfy these conditions, then {r_{YX\cdot Z}} is not a consistent estimand of {\beta}.

Theorem 1 says that the direct effect of X on Y can be identified if and only if we have a set of covariates Z that blocks all paths, confounding as well as causal, between X and Y except for the direct path {X\longrightarrow Y}. The path coefficient is then equal to the partial regression coefficient of X in the multivariate regression of Y on X and Z,

{Y =\alpha_1Z_1+\cdots+\alpha_kZ_k+\beta X+\varepsilon.}

The above equation can, of course, be estimated by OLS. Theorem 1 does not say that the model as a whole is identified. In fact, the path coefficients associated with the links {Z_{i}\longrightarrow Y} that the multivariate regression above suggests are not guaranteed to be identified. The regression model would be fully identified if, for every {i=1,\dots,k}, Y is also independent of {Z_{i}} given {\{(Z_{j})_{j\ne i}, X\}} in the graph {G_{i}} obtained by deleting the link {Z_{i}\longrightarrow Y}.
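
The following sketch, again mine and purely illustrative, simulates a model in which an observed variable Z is a common cause of X and Y. Z blocks X from Y once the link {X\longrightarrow Y} is deleted, so the partial regression coefficient of X recovers {\beta}, while the naive bivariate regression does not. All coefficients are made up.

import numpy as np

rng = np.random.default_rng(2)
n, beta, a, b = 100_000, 2.0, 0.7, 1.2

Z = rng.normal(size=n)
X = a * Z + rng.normal(size=n)               # X := a*Z + error
Y = beta * X + b * Z + rng.normal(size=n)    # Y := beta*X + b*Z + error

design = np.column_stack([np.ones(n), X, Z]) # regression of Y on (1, X, Z)
coefs, *_ = np.linalg.lstsq(design, Y, rcond=None)

naive = np.polyfit(X, Y, 1)[0]               # omits Z: biased by the backdoor X <- Z -> Y
print(naive, coefs[1])                       # coefs[1] is close to 2.0; the naive slope is not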

What if you wanted to know the total effect of X on Y? That is, the combined effect of X on Y both through the direct channel (i.e., the path coefficient {\beta}) and through indirect channels, e.g., {X\longrightarrow W\longrightarrow Y}? The following theorem provides the solution.

Theorem 2 (Covariate selection criteria for total effect). Let G be any directed acyclic graph. The total effect of X on Y is identifiable if there exists a set of nodes Z such that no member of Z is a descendant of X and Z blocks X from Y in the subgraph formed by deleting from G all arrows emanating from X. The total effect of X on Y is then given by {r_{YX\cdot Z}}.

Theorem 2 ensures that, after adjustment for Z, the variables X and Y are not associated through confounding paths, which means that the regression coefficient {r_{YX\cdot Z}} is equal to the total effect. Note the difference between the two criteria. For the direct effect, we delete the link {X\longrightarrow Y} and find a set of nodes that blocks all other paths between X and Y. For the total effect, we delete all arrows emanating from X because we do not want to block any indirect causal path from X to Y.
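
Here is an illustrative simulation of the difference, with hypothetical coefficients. X affects Y directly and through a mediator W. There is no confounding, so the empty set satisfies Theorem 2 and the simple regression of Y on X returns the total effect; controlling for W instead returns the direct effect, per Theorem 1.

import numpy as np

rng = np.random.default_rng(3)
n, beta, alpha, gamma = 100_000, 1.0, 0.6, 2.0

X = rng.normal(size=n)
W = alpha * X + rng.normal(size=n)               # W := alpha*X + error
Y = beta * X + gamma * W + rng.normal(size=n)    # Y := beta*X + gamma*W + error

total = np.polyfit(X, Y, 1)[0]                   # close to beta + alpha*gamma = 2.2
design = np.column_stack([np.ones(n), X, W])
direct = np.linalg.lstsq(design, Y, rcond=None)[0][1]   # close to beta = 1.0
print(total, direct)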

Theorem 1 is Theorem 5.3.1 and Theorem 2 is Theorem 5.3.2 in the second edition of Judea Pearl’s book, Causality: Models, Reasoning, and Inference, where the proofs may also be found. These theorems are of extraordinary importance for empirical research. Instead of the ad hoc and informal methods currently used by empirical researchers to choose covariates, they provide mathematically precise criteria for covariate selection. The next few examples show how to use these criteria for a variety of causal graphs.

Figure 3 shows a simple case (top left), {Z\longrightarrow X\longrightarrow Y}, where the errors of Z and Y are correlated. We obtain identification by repeated application of Theorem 1. Specifically, Z blocks X from Y in the graph obtained by deleting the link {X\longrightarrow Y} (top right), so the path coefficient of {X\longrightarrow Y} is identified and equals {r_{YX\cdot Z}}. For the link {Z\longrightarrow X}, the only path left once that link is deleted runs through Y, which is a collider on it (bottom right); the empty set therefore blocks Z from X, and the path coefficient of {Z\longrightarrow X} is identified by the simple regression coefficient {r_{XZ}}. (Conditioning on Y would not do: Y is a descendant of X and would open the collider.)


Figure 3. Identification when a parent of X is correlated with Y.
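
A quick simulation sketch of the Figure 3 case, with made-up coefficients, confirms the two identification claims: controlling for Z recovers the {X\longrightarrow Y} coefficient, and the simple regression of X on Z recovers the {Z\longrightarrow X} coefficient.

import numpy as np

rng = np.random.default_rng(4)
n, a, b = 100_000, 0.9, 1.4        # a on Z -> X, b on X -> Y

U = rng.normal(size=n)             # unobserved source of the correlated errors
Z = U + rng.normal(size=n)         # Z's error shares U with Y's error
X = a * Z + rng.normal(size=n)     # X := a*Z + error
Y = b * X + U + rng.normal(size=n) # Y := b*X + (error correlated with Z's)

design = np.column_stack([np.ones(n), X, Z])
b_hat = np.linalg.lstsq(design, Y, rcond=None)[0][1]   # close to 1.4
a_hat = np.polyfit(Z, X, 1)[0]                         # close to 0.9
print(b_hat, a_hat)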

Figure 4 shows a case where an unobserved disturbance term influences both X and Y. Here, the presence of the intervening variable Z allows for the identification of all the path coefficients. I’ve written the structural equation on the top right and checked the premises of Theorem 1 at the bottom left. Note that the path coefficient of {U\dashrightarrow X} is known to be 1 in accordance with the structural equation for X. Hence, the total effect of X on Y equals {\alpha\beta+\gamma}.


Figure 4. Model identification with an unobserved common cause.

Figure 5 presents a more complicated case where the direct effect can be identified but not the total effect. The identification of {\delta} is impossible because X and Z are spuriously correlated and there is no instrumental or intervening variable available.


Figure 5. A more complicated case where only partial identification is possible.

If you have made it this far, I hope you have acquired a basic grasp of the graphical methods presented in this lecture. You probably feel that you still don’t really know them. This always happens when we learn a new technique or method. The only way to move from “I sorta know what this is about” to “I understand how to do this” is to sit down and work out a few examples. If you do the exercises in the homework below, you will be ready to use this powerful arsenal for live projects. Good luck!

Homework 

  1. Epidemiologists argued in the early postwar period that smoking causes cancer. Big Tobacco countered that both smoking and cancer are correlated with genotype (unobserved), and hence, the effect of smoking on cancer cannot be identified. Show Big Tobacco’s argument in a directed graph. What happens if we have an intervening variable between smoking and cancer that is not causally related to genotype? Say, the accumulation of tar in lungs? What would the causal diagram look like? Prove that it is then possible to identify the causal effect of smoking on cancer. Provide an expression for the path coefficient between smoking and cancer.
  2. Obtain a thousand simulations each of two independent standard normal random variables X and Y. Set Z=X+Y. Check that X and Y are uncorrelated. Check that X|Z and Y|Z are correlated. Ask yourself whether it is a good idea to control for a variable without thinking the causal relations through. (A starter sketch appears below, after Figure 7.)
  3. Obtain a thousand simulations each of three independent standard normal random variables {u,\nu,\varepsilon}. Let {X=u+\nu} and {Y=u+\varepsilon}. Create scatter plots to check that X and Y are marginally dependent but conditionally independent (conditional on u). That is, X|u and Y|u are uncorrelated. Project Y on X using OLS. Check that the slope is significant. Then project Y on X and u. Check that the slope coefficient for X is no longer significant. Should you or should you not control for u?
  4. Using the graphical rules of causal inference, show that the causal effect of X on Y can be identified in each of the seven graphs shown in Figure 6.
  5. Using the graphical rules of causal inference, show that the causal effect of X on Y cannot be identified in each of the eight graphs in Figure 7. Provide an intuitive reason for the failure in each case.

    Figure 6. Graphs where the causal effect of X on Y can be identified.


    Figure 7. Graphs where the causal effect of X on Y cannot be identified.

    P.S. I just discovered that there is a book on this very topic, Stanley A. Mulaik’s Linear Causal Modeling with Structural Equations (2009).
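
For readers who want a head start on exercise 2, here is a starter sketch (not a full solution). The narrow slice of Z is a crude stand-in for conditioning on Z.

import numpy as np

rng = np.random.default_rng(5)
n = 1_000
X = rng.normal(size=n)
Y = rng.normal(size=n)
Z = X + Y                                    # Z is a common effect: X -> Z <- Y

print(np.corrcoef(X, Y)[0, 1])               # close to 0: marginally uncorrelated

mask = np.abs(Z) < 0.1                       # condition (crudely) on Z being near zero
print(np.corrcoef(X[mask], Y[mask])[0, 1])   # strongly negative: spurious dependence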
