It is currently U.S. policy to deploy missile defenses that are “proven, cost-effective, and adaptable.” As outlined in the 2010 Ballistic Missile Defense Review, proven means “extensive testing and assessment,” or “fly before you buy.” Adaptive means that defenses can respond to unexpected threats by being rapidly relocated or “surged to a region,” and by being easily integrated into existing defensive architectures.
While “extensive testing” in the field is an important step towards proven defenses, this article argues that it is insufficient for truly proven—that is, trustworthy—defenses. Defenses against nuclear weapons face a very high burden of proof because a single bomb is utterly devastating. But even if defenses achieve this level of trustworthiness in one context, this article argues that they cannot immediately be trusted when they are adapted to another context. Calls for proven and adaptive defenses thus promote a dangerous fallacy: that defenses which are proven in one context remain proven when they are adapted to another.
To explain why defenses should not be regarded as both proven and adaptable, this article begins by outlining a little-noted yet critical challenge for missile defense: developing, integrating, and maintaining its complex and continually-evolving software. A second section uses experience with missile defense to illustrate three key reasons that software which is proven on testing ranges does not remain proven when it is adapted to the battlefield. A third section outlines some of the challenges associated with rapidly adapting missile defense software to new threat environments. The article concludes that while missile defenses may offer some insurance against an attack, they also come with new risks.
Missile defense as an information problem
Missile defense is a race against time. Intercontinental ballistic missiles travel around the globe in just thirty minutes, while intermediate, medium, and short range ballistic missiles take even less time to reach their targets. Defenders would ideally like to intercept missiles during the three to five minutes in which they climb out of the earth’s atmosphere (boost phase), but geographic and physical constraints have rendered this option impractical for the foreseeable future. The defense has the most time to “kill” a missile during mid-course (as it travels through space), but here a warhead can be disguised by decoys and chaff, making it difficult to find and destroy. As missiles (or warheads) re-enter the earth’s atmosphere, any decoys are slowed down, and the warhead becomes easier to track. But this terminal phase of flight leaves only a few minutes for the defender to act.
These time constraints make missile defense not only a physical problem, but also an informational problem. While most missile defense publicity focuses on the image of a bullet hitting a bullet in the sky, each interception relies critically on a much less visible information system which gathers radar or sensor data about the locations and speeds of targets, and guides defensive weapons to those targets. Faster computers can speed along information processing, but do not ensure that information is processed and interpreted correctly. The work of accurately detecting targets, discriminating targets from decoys or chaff, guiding defensive weapons to targets, and coordinating complementary missile defense systems falls to a very complex software system.
Today’s missile defense systems must manage tremendous informational complexity—a wide range of threats, emerging from different regions, in uncertain and changing ways. Informational complexity stems not only from the diverse threats that defenses aim to counter, but also from the fact that achieving highly effective defenses requires layering multiple defensive systems over large geographic regions; this in turn requires international cooperation. For example, to defend the United States from attack by Iran, the ground-based midcourse defense (GMD) relies not only on radars and missiles in Alaska and California but also on radars and missiles stationed in Europe. Effective defenses require computers and software to “fuse” data from different regions and systems controlled by other nations into a seamless picture of the battle space. Missile defense software requirements constantly evolve with changing threats, domestic politics, and international relations.
Such complex and forever-evolving requirements will limit any engineer. But software engineers such as Fred Brooks have come to recognize the complexity associated with unpredictable and changing human institutions as their “essential” challenge. Brooks juxtaposes the complexity of physics with the “arbitrary complexity” of software. Whereas the complexity of nature is presumed to be governed by universal laws, the arbitrary complexity of software is “forced without rhyme or reason by the many human institutions and systems to which [software] interfaces must conform.”
In other words, the design of software is not driven by predictive and deterministic natural laws, but by the arbitrary requirements of whatever hardware and social organizations it serves. Because arbitrary complexity is the essence of software, it will always be difficult to develop correctly. Despite tremendous technological progress, software engineers have agreed that arbitrary complexity imposes fundamental constraints on our ability to engineer reliable software systems.
In the case of missile defense, software must integrate disparate pieces of equipment (such as missile interceptors, radars, satellites, and command consoles) with the procedures of various countries (such as U.S., European, Japanese, and South Korean missile defense commands). Software can only meet the ad hoc requirements of physical hardware and social organizations by becoming arbitrarily complex.
Software engineers manage the arbitrary complexity of software through modular design, skillful project management, and a variety of automated tools that help to prevent common errors. Nonetheless, as the arbitrary complexity of software grows, so too do unexpected interactions and errors. The only way to make software reliable is to use it operationally and correct the errors that emerge in real-world use. Even if the operating conditions change only slightly, new and unexpected errors may emerge. Decades of experience have shown that it is impossible to develop trustworthy software of any practical scale without operational testing and debugging.
In some contexts, glitches are not catastrophic. For example, in 2007 six F-22 Raptors flew from Hawaii to Japan for the first time, and as they crossed the International Date Line their computers crashed. Repeated efforts to reboot failed and the pilots were left without navigation computers, information about fuel, and much of their communications. Fortunately, weather was clear so they could follow refueling tankers back to Hawaii and land safely. The software glitch was fixed within 48 hours.
Had the weather been bad or had the Raptors been in combat, the error would have had much more serious consequences. In such situations, time becomes much more critical. Similarly, a missile defense system must operate properly within the first few minutes that it is needed; there is no time for software updates.
What has been proven? The difference between field tests and combat experience
Because a small change in operating conditions can cause unexpected interactions in software, missile defenses can only be proven through real-world combat experience. Yet those who describe defenses as “proven” are typically referring to results obtained on a testing range. The phased adaptive approach’s emphasis on “proven” refers to its focus on the SM-3 missile, which has tested better than the ground-based midcourse defense (GMD). The SM-3 Block 1 system is based on technology in the Navy’s Aegis air and missile defense system, and it has succeeded in 19 of 23 intercept attempts (nearly 83 percent), whereas the GMD has succeeded in only half (8 of 16) of its intercept attempts. Similarly, when Army officers and project managers call the theater high altitude area defense (THAAD) proven, they are referring to results on a test range. THAAD, a late midcourse and early terminal phase defense, has intercepted eleven out of eleven test targets since 2005.
While tests are extremely important, they do not prove that missile defenses will be reliable in battle. Experience reveals at least three ways in which differences between real-world operating conditions and the testing range may cause missile defense software to fail.
First, missile defense software and test programs make assumptions about the behavior of their targets which may not be realistic. The qualities of test targets are carefully controlled: between 2002 and 2008, over 11 percent of missile defense tests were aborted because the target failed to behave as expected.
But real targets can also behave unexpectedly. For example, in the 1991 Gulf War, short range Scud missiles launched by Iraq broke up as they reentered the atmosphere, causing them to corkscrew rather than follow a predictable ballistic trajectory. This unpredictable behavior is a major reason that the Patriot (PAC-2) missile defense missed in at least 28 of 29 intercept attempts. Although the Patriot had successfully intercepted six targets on a test range, the unpredictability of real-world targets thwarted its success in combat.
Second, missile defense tests are conducted under very different time pressures than those of real-world battle. Missile defense tests do not require operators to remain watchful over an extended period of days or weeks, until the precise one or two minutes in which a missile is fired. Instead crews are given a “window of interest,” typically spanning several hours, in which to look for an attack. Defenders of such tests argue that information about the window of attack is necessary (to avoid conflicts with normal air and sea traffic), and realistic (presumably because defenses will only be used during a limited period of conflict).
Yet in real-world combat, the “window of interest” may last much longer than a few hours. For example, the Patriot was originally designed with the assumption that it would only be needed for a few hours at a time, but when it was sent to Israel and Saudi Arabia in the first Gulf War, it was suddenly operational for days at a time. In these conditions, the Patriot’s control software began to accrue a timing error which had never shown up when the computer was rebooted every few hours. On February 25, 1991, this software-controlled timing error caused the Patriot to miss a Scud missile, which struck an Army barracks at Dhahran, Saudi Arabia, killing 28 Americans. The fix that might have helped the Patriot defuse the Dhahran attack arrived one day too late.
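The way such an error accumulates can be sketched in a few lines. Public accounts of the Dhahran failure, notably the U.S. General Accounting Office's 1992 report, describe a system clock that counted tenths of a second and converted the count to seconds using a truncated binary approximation of 1/10. The Python sketch below is a simplified model of that arithmetic, not the Patriot's actual code: the function names, the choice of 23 fractional bits, and the Scud speed of about 1,676 meters per second are assumptions made for illustration.

```python
from fractions import Fraction

# Assumption: 1/10 is stored with this many binary digits after the point.
FRACTION_BITS = 23
# Assumption: approximate Scud speed in meters per second.
SCUD_SPEED_M_PER_S = 1676.0


def truncated_tenth(bits: int = FRACTION_BITS) -> Fraction:
    """Return 1/10 chopped (not rounded) to `bits` binary fractional digits."""
    scale = 2 ** bits
    return Fraction(scale // 10, scale)


def clock_drift_seconds(hours_up: float, bits: int = FRACTION_BITS) -> float:
    """Accumulated clock error after `hours_up` hours without a reboot.

    The clock ticks ten times per second, and each tick is converted to
    seconds with the truncated constant, so every tick loses a tiny amount.
    """
    ticks = int(hours_up * 3600 * 10)
    error_per_tick = Fraction(1, 10) - truncated_tenth(bits)
    return float(ticks * error_per_tick)


def tracking_error_meters(hours_up: float,
                          speed: float = SCUD_SPEED_M_PER_S) -> float:
    """Distance a target moving at `speed` covers during the clock drift."""
    return clock_drift_seconds(hours_up) * speed


if __name__ == "__main__":
    for hours in (3, 8, 24, 100):
        print(hours, clock_drift_seconds(hours), tracking_error_meters(hours))
```

Under these assumptions the drift is about a hundredth of a second after three hours of operation, but about a third of a second after 100 hours, enough for a target moving at the assumed speed to travel several hundred meters. A test regime in which the computer is rebooted every few hours would never expose the larger figure.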
A third difference between test ranges and real-world combat is that air traffic is often present in and around combat zones, creating opportunities for friendly fire; the likelihood of friendly fire is increased by the stressful conditions of combat. For example, in the first Gulf War, the Patriot fired two interceptors at U.S. fighter jets (fortunately the fighters evaded the attack). When a more advanced version of the Patriot (PAC-3) was sent to Iraq in 2003, friendly fire caused more casualties. On March 23, 2003, a Patriot battery stationed near the Kuwait border shot down a British Tornado fighter jet, killing both crew members. Just two days later, operators in another battery locked on to a target and prepared to fire, discovering that it was an American F-16 only after the fighter fired back (fortunately only a radar was destroyed). Several days later, another Patriot battery shot down an American Navy Hornet fighter, killing its pilot.
A Defense Science Board task force eventually attributed these failures to several software-related problems. The Patriot’s Identify Friend or Foe (IFF) algorithms (which ought to have clearly distinguished allies from enemies) performed poorly. Command and control systems did not give crews good situational awareness, leaving them completely dependent on the faulty IFF technologies. The Patriot’s protocols, displays, and software made operations “largely automatic,” while “operators were trained to trust the software.” Unfortunately this trust was not warranted.
These three features—less predictable targets, longer “windows of interest,” and the presence of air traffic—are unique to combat, and are among the reasons that software which is proven on a test range may not be reliable in battle. Other differences concern the defensive technology itself—missile seekers are often hand-assembled, and quality is not always assured from one missile to the next. Missile defense aims to overcome such challenges in quality assurance by “layering” defensive systems (i.e. if one system fails to hit a missile, another one might make the kill). But unexpected interactions between missile defense layers could also cause failures. Indeed, some tests which produced “successful” interceptions by individual missile defense systems also revealed limitations in integrating different defensive systems. Layered defenses, like most individual defensive systems, have yet to be proven reliable in real-world battle.
The fallacy of “proven” and “adaptive” defenses
As this brief review suggests, field testing takes place in a significantly different operational environment than that of combat, and the difference matters. Missile defenses that were “proven” in field testing have repeatedly failed when they were adapted to combat environments, either missing missiles completely, or shooting down friendly aircraft. Thus, talk of “proven” and “adaptable” defense furthers a dangerous fallacy—that defensive systems that are proven in one context remain proven as they are adapted to new threats.
Defensive deployments do not simply “plug-and-play” as they are deployed to new operational environments around the world because they must be carefully integrated with other weapons systems. For example, to achieve “layered” defenses of the United States, computers must “fuse” data from geographically dispersed sensors and radars and provide commands in different regions with a seamless picture of the battle space. In the first U.S. missile defense test that attempted to integrate elements such as Aegis and THAAD, the systems successfully intercepted targets, but the test also revealed failures in the interoperability of different computer and communications systems. In the European theater, these systems confront the additional challenge of being integrated with NATO’s separate Active Layered Theater Ballistic Missile Defence (ALTBMD).
Similar challenges exist in the Asia-Pacific region, where U.S. allies have purchased systems such as Patriot and Aegis. It is not yet clear how such elements should interoperate with U.S. forces in the region. The United States and Japan have effectively formed a joint command relationship, with both nations feeding information from their sensors into a common control room. However, command relationships with other countries in the Asia-Pacific region such as South Korea and Taiwan remain unclear.
The challenge of systems integration was a recurring theme at the Atlantic Council’s May 2014 missile defense conference. Attendees noted that U.S. allies such as Japan and South Korea mistrust one another, creating difficulties for integrating computerized command and control systems. They also pointed to U.S. export control laws that create difficulties by restricting the flow of computer and networking technologies to many parts of the world. Atlantic Council senior fellow Bilal Saab noted that the “problem with hardware is it doesn’t operate in a political vacuum.”
Neither does software. All of these constraints—export control laws, mistrust between nations, different computer systems—produce arbitrarily complex requirements for the software, which must integrate data from disparate missile defense elements into a unified picture of the battle space. Interoperability that is proven at one time does not remain proven as it is adapted to new technological and strategic environments.
Although defenses cannot be simultaneously proven and adaptive, it may still make sense to deploy defenses. Missile defenses that have undergone robust field testing may provide some measure of insurance against attack. Additionally, cooperative defenses may provide a means of reducing reliance on massive nuclear arsenals—although efforts to share NATO or U.S. missile defenses with Russia are currently stalled.
But whatever insurance missile defense offers, it also comes with new risks due to its reliance on tremendously complex software. Other analyses of missile defense have pointed to risks associated with strategic instability, and noted that defenses appear to be limiting rather than facilitating reductions of offensive nuclear arsenals. An appreciation for the difficulty of developing, integrating, and maintaining complex missile defense software calls attention to a slightly different set of risks.
The risks of friendly fire are evident from experience with the Patriot. More fundamentally, the inability of complex software to fully anticipate target behavior limits its reliability in battle, as seen in the first Gulf War. The PAC-3 system appears to have performed better in the second Gulf War; according to the Army, the defenses incapacitated nine out of nine missiles headed towards a defended asset. Thus, the PAC-3 system may be regarded as truly proven against a particular set of targets. But however well defenses perform against one set of targets, we cannot be assured that they will perform equally well against a new set of targets.
Additionally, defenses must be exceedingly reliable to defend against nuclear-armed missiles. In World War II, a 10 percent success rate was sufficient for air defenses to deter bombers, but the destructive power of nuclear weapons calls for a much higher success rate. If even one nuclear weapon gets by a defensive system, it can destroy a major city and its surroundings.
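A rough calculation shows how demanding this standard is. The Python sketch below is illustrative only: it assumes that each warhead is engaged independently with the same intercept probability, and it borrows the SM-3 test-range rate (19 of 23) as an input even though, as argued above, test-range results may not carry over to combat. The function names and the five-warhead scenario are hypothetical.

```python
def p_at_least_one_leaks(p_intercept: float, n_warheads: int) -> float:
    """Probability that one or more of `n_warheads` gets through.

    Assumes every engagement is independent and succeeds with the same
    probability `p_intercept` (a simplifying assumption).
    """
    if not 0.0 <= p_intercept <= 1.0:
        raise ValueError("p_intercept must be between 0 and 1")
    if n_warheads < 0:
        raise ValueError("n_warheads must be non-negative")
    return 1.0 - p_intercept ** n_warheads


def required_intercept_probability(max_leak_risk: float,
                                   n_warheads: int) -> float:
    """Per-warhead intercept probability needed to keep the chance of
    any leak at or below `max_leak_risk`, under the same assumptions."""
    if not 0.0 <= max_leak_risk <= 1.0:
        raise ValueError("max_leak_risk must be between 0 and 1")
    if n_warheads <= 0:
        raise ValueError("n_warheads must be positive")
    return (1.0 - max_leak_risk) ** (1.0 / n_warheads)


if __name__ == "__main__":
    sm3_test_rate = 19 / 23  # test-range figure cited in this article
    print(p_at_least_one_leaks(sm3_test_rate, 5))
    print(required_intercept_probability(0.01, 5))
```

Under these assumptions, a defense that intercepts each warhead with the SM-3's test-range success rate would let at least one of five warheads through roughly 62 percent of the time. Holding that risk to 1 percent would require a per-warhead intercept probability of about 99.8 percent.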
The greatest risk of all comes not with defenses themselves, but with overconfidence in their capabilities. In 2002, faith in military technology prompted then Secretary of Defense Donald Rumsfeld to overrule seasoned military planners, insisting that high technology reduced the number of ground troops that were necessary in Iraq. As we now know, this confidence was tragically misplaced.
The decision to rely upon a missile defense deployment should thus weigh the risks of a missile attack against the risks of friendly fire and of unreliable defenses. While the fly-before-you-buy approach is an essential step towards trustworthy defenses, field testing does not yield truly proven, or trustworthy, defenses. However proven a defensive system becomes in one battle context, it does not remain proven when it is adapted to another. Ultimately, the notion of proven and adaptive defenses is a contradiction in terms.
Rebecca Slayton is an Assistant Professor in Science & Technology Studies at the Judith Reppy Institute for Peace and Conflict Studies at Cornell University. Her research examines how experts assess different kinds of risks in new technology, and how their arguments gain influence in distinctive organizational and political contexts. She is author of Arguments that Count: Physics, Computing, and Missile Defense, 1949-2012 (MIT Press: 2013), which compares how two different ways of framing complex technology—physics and computer science—lead to very different understandings of the risks associated with weapons systems. It also shows how computer scientists established a disciplinary repertoire—quantitative rules, codified knowledge, and other tools for assessment—that enabled them to construct authoritative arguments about complex software, and to make those analyses “stick” in the political process.
Slayton earned a Ph.D. in physical chemistry at Harvard University in 2002, and completed postdoctoral training in the Science, Technology, and Society Program at the Massachusetts Institute of Technology. She has also held research fellowships from the Center for International Security and Cooperation at Stanford University. She is currently studying efforts to manage the diverse risks—economic, environmental, and security—associated with a “smarter” electrical grid.