“But does it work?” is a simple, seemingly straightforward question that, unfortunately, is not asked (or answered) nearly enough in the foreign policy space. This may be because policymakers either do not know how to answer the question or do not want to hear the answer. When it comes to international relations, how do you measure goals? What does “stable, prosperous, and friendly,” as denoted in the 2017 National Security Strategy, really mean? Despite the inherent opacity of the term and the resulting challenge of defining it, all foreign policy should, above all, be effective. To prove or argue effectiveness, policymakers must objectively measure the impact or success of a policy or program.
In the joint military world, the United States analyzes, measures, scores, and revises approaches through two concepts: Measures of Effectiveness (MOE) and Measures of Performance (MOP). MOPs are internal: they measure a team’s actions and ask, “Are we doing things the right way?” MOEs, on the other hand, are external: they measure the impact of a team’s or policy’s actions and ask, “Are we doing the right things?” In all foreign policy, whether hard power, soft power, or a combination of the two, MOPs are far easier to determine. We have open and consistent access to friendly behavior. We can require team members to document and report their performance. However, developing and tracking MOEs is much less straightforward because measuring the impact of foreign policy is inherently difficult. If all foreign policy is aimed at shaping the behavior of actors, then how do we measure behavior? The most obvious result is whether a targeted actor exhibits a desired behavior, but this is a bit too simplistic. Take countering terrorist recruitment efforts as an example. A program designed to keep individuals in village X from joining al Qaeda can be deemed successful if, say, 80 percent (our MOE for this scenario) of residents do not join the local cell, but how do we prove that it was the program that produced this outcome? In statistical terms, how do we isolate the variables to determine causation? Foreign policy isn’t conducted in a sterile lab; we cannot have true control groups and wholly independent variables that we can add and subtract to determine direct effects. Even if we achieve our 80 percent goal, how do we logically argue that the specific program, and not a second or third variable, resulted in the preferred outcome? What if we fail to meet the 80 percent threshold? Do we scrap the entire program, believing that MOEs must be binary?
The answer to measuring soft power lies in how we develop MOEs and MOPs.
The United States Agency for International Development (USAID) is familiar with the concept of measuring impact. It uses tools such as light pollution studies to better understand the impact of infrastructure programs. As with their hard power counterparts, soft power MOEs are much simpler at the tactical level. It’s much easier to show the impact of an infrastructure program designed to bring electricity to a city than it is to show the impact of that electricity in lowering crime, infant mortality, or food insecurity rates. Even more difficult is tying those declining rates to a security or policy objective that matters to the U.S. taxpayer.
Returning to our example in village X, we often fail to map out the cumulative steps required to change an actor’s behavior, or what could be considered “micro-goals.” We often fail to tie a policy goal to a policy because we don’t fully understand the link between the two. Even when we do accurately link the two, we still run the risk of failing to identify the individual chain links between them, that is, the multiple steps required to move from a current situation to a desired one. This is where micro-goals as MOEs are helpful. While our ultimate MOE in village X may be to keep 80 percent of residents from joining al Qaeda, it’s helpful to determine micro-goals along the way. Examples of these goals may be a 10 percent decline in al Qaeda approval ratings (determined by polling, for example), a 20 percent decline in activity on a recruitment site from IP addresses in village X, or whatever the particular situation may warrant. These micro-goals can help guide a policy as it is enacted, bolster support and enthusiasm for an approach, justify funding, and ultimately result in more nuanced and effective foreign policy initiatives.
We can also reach into performance psychology for advice on how to develop complete and useful MOEs by using the SMART method, meaning foreign policy micro-goals or MOEs should be “specific, measurable, attainable, relevant, and time-bound.” Each micro-MOE should denote a single specific outcome; that outcome should be assigned a tangible metric, be realistic and relevant to the final MOE, and have a designated time or date by which it should be realized.
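For readers who think in data structures, one way to picture a SMART micro-goal is as a small record with one field per letter. The sketch below is only an illustration under stated assumptions: the `MicroGoal` class, its field names, and every number (a polled approval rating falling from a hypothetical 40 percent to 36 percent, a 10 percent relative decline) are invented for this example and do not come from any real program.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MicroGoal:
    """A hypothetical SMART micro-goal: one outcome, one metric, one deadline."""
    outcome: str      # Specific: a single outcome
    metric: str       # Measurable: how the outcome is observed
    baseline: float   # value of the metric before the program starts
    target: float     # Attainable: the value we hope to reach
    supports: str     # Relevant: the final goal this step feeds into
    deadline: date    # Time-bound: when the target should be reached

    def progress(self, observed: float) -> float:
        """Fraction of the distance from baseline to target covered so far."""
        span = self.target - self.baseline
        if span == 0:
            return 1.0
        return (observed - self.baseline) / span

    def met(self, observed: float, as_of: date) -> bool:
        """True if the target has been reached on or before the deadline."""
        return self.progress(observed) >= 1.0 and as_of <= self.deadline


# All values below are assumptions made up for illustration.
approval_goal = MicroGoal(
    outcome="Lower local approval of al Qaeda",
    metric="approval rating in village X polling (percent)",
    baseline=40.0,
    target=36.0,  # a 10 percent relative decline from the baseline
    supports="80 percent of residents do not join the local cell",
    deadline=date(2025, 12, 31),
)

halfway = approval_goal.progress(38.0)
reached = approval_goal.met(36.0, as_of=date(2025, 6, 1))
```

Reporting `progress` as a fraction rather than a pass/fail flag reflects the point made earlier about the 80 percent threshold: a goal that is half met still carries information about whether to continue, cancel, or recalibrate.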
Along with performance psychologists, economists are perhaps the most well-known practitioners who break large, often convoluted, and decidedly complex goals into measurable micro-goals or indicators. A “good economy” is a concept highly resistant to measurement when considered in its entirety. Economists, however, can break this goal into SMART components: employment rates, stock market indexes, gross domestic product (GDP), GDP per capita, and inflation, to name a few. No single measurement on its own can accurately describe the state of the economy, but together they can help policymakers craft and monitor programs and initiatives to help create that ultimate goal of a “good economy.”
Practitioners in the foreign policy space must do the same.
There’s hope in this regard, with some leading soft power agencies already measuring the progress of their programs. USAID, as discussed above, tracks performance and impact. The State Department, for its part, maintains a robust database of the alumni of cultural and educational exchange programs, allowing policymakers to point to the success of former participants. One such program, the International Visitor Program (IVP), which brings future foreign leaders to the United States on state-sponsored study trips, counts notable alumni such as Tony Blair and Angela Merkel. In another decades-old initiative, embassies were measured on how well they were able to convince host countries to vote or otherwise side with U.S. policy initiatives. A joint resolution passed by Congress in 1983 mandated an annual report from the State Department on foreign nations’ voting practices in the United Nations, prompting the department to pressure embassies to persuade their host countries to cast favorable votes. Unfortunately, these MOPs failed to meet the “relevant” requirement of a SMART goal. They often pushed embassies to achieve an irrelevant quantitative goal rather than a relevant qualitative one, weighting “throw-away” policies equally with truly impactful U.S. policy goals. Put another way, an embassy that easily convinced its host country to side with the United States on five policies was considered more successful than an embassy that truly convinced its host country on one policy, no matter how impactful or critical that one policy might have been.
This program highlights another key pitfall of metrics in foreign policy: an emphasis on MOPs rather than MOEs, or, in plain language, the growing abundance of participation trophies. Measuring friendly actions is useful, but ultimately, in foreign policy, the effect is much more important than the effort. By focusing on performance at the expense of crafting SMART MOEs, this State Department program rewarded the wrong things. Measuring effectiveness is a way to dispassionately and objectively measure a policy against reality, to better determine whether friendly actions are having the desired impact. Only when we are able to do this can we truly measure policies and decide which to continue, cancel, or recalibrate.
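The vote-counting pitfall can be made concrete with a little arithmetic. The sketch below compares a raw tally, which treats every favorable vote alike, with a tally weighted by how much each policy matters. The importance weights (1 for a throw-away policy, 10 for a critical one) and both embassies are hypothetical, chosen only to mirror the five-versus-one example; the sketch does not describe how the State Department actually scored embassies.

```python
def raw_count(votes):
    """Count favorable votes, treating every policy as equally important."""
    return sum(1 for v in votes if v["sided_with_us"])


def weighted_score(votes):
    """Sum the importance of each favorable vote instead of counting votes."""
    return sum(v["importance"] for v in votes if v["sided_with_us"])


# Hypothetical data: embassy A wins five throw-away votes,
# embassy B wins a single vote on a critical policy.
embassy_a = [{"importance": 1, "sided_with_us": True} for _ in range(5)]
embassy_b = [{"importance": 10, "sided_with_us": True}]

for name, votes in (("A", embassy_a), ("B", embassy_b)):
    print(name, raw_count(votes), weighted_score(votes))
```

Under the raw count, embassy A looks five times as successful as embassy B; under the weighted score, the ranking reverses. The weights themselves remain a qualitative judgment, which is why a weighted tally can frame an assessment but cannot replace one.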
Wargamers understand the impact of the truly unquantifiable and, due to the nature of their work, have arguably had the most success in applying hard metrics to the previously deemed unmeasurable—most notably in the areas of atmospherics, gray zone tactics, and information operations. While models that incorporate metrics for these less tangible concepts are much less mature than models incorporating metrics for the tangible components of an armed conflict, wargamers are at least recognizing the importance of their inclusion. As General, and later President, Dwight D. Eisenhower once famously quipped, “Plans are worthless, but planning is everything.” Even if MOEs and MOPs for intangibles are not wholly mature, their inclusion at least forces planners to think about these components and their impact on operations.
This was the intent behind the “Green Country Model” developed by the Johns Hopkins University Applied Physics Laboratory (JHU APL), a wargame that takes into consideration the effects of various social factors. Of the six scored aspects of the game, nearly all incorporate and assign points to less tangible aspects of war. Most notably, players are awarded points for “affinity, hubris, and influence.” Affinity points are awarded or subtracted based on the level of friendship between two players. Hubris represents, well, the hubris of a player, while influence is scored by the respect or fear that other players hold for that player.
Outside of the United States, other organizations have sought to quantify and score soft power resources, most notably the Soft Power 30, which assesses the soft power resources of 30 nations and provides insights into trends over the past five years. This particular approach uses sub-indices to measure a nation’s digital infrastructure, the global reach and appeal of its cultural outputs, the attractiveness of its economic model, its level of human capital in terms of education, the strength of its diplomatic network, and its government’s commitment to freedom, human rights, and democracy. The index also compares soft power scores across nations and complements this research with extensive polling in 25 countries across every region of the world. While the terminal goal of assessing foreign policy is to measure the actual outcomes and impacts of soft power, not simply the potential or the soft power resources measured here, this approach can be useful when tackling the challenge of applying metrics to these less-than-concrete concepts.
Ultimately, the qualitative nature of foreign relations precludes a clean quantitative approach to measuring performance or effectiveness. Still, as in most disciplines, some tangible quantitative data can help frame, illuminate, or otherwise bolster the qualitative assessments most often found in the world of international relations. These quantitative additions can be incredibly useful in garnering monetary and moral support for soft power programs. Thinking in this way is critical to transforming the hyper-militarized foreign policy approach of the United States into something more nuanced, more diverse, and ultimately more effective.
The views expressed in this article are those of the authors alone and do not necessarily reflect the position of the Foreign Policy Research Institute, a non-partisan organization that seeks to publish well-argued, policy-oriented articles on American foreign policy and national security priorities.
 Special thanks to Retired U.S. Ambassador Emil Skodon for his insights into the effects of this report requirement.
 Special thanks to the JHU APL for their contributions to this article, specifically to Teresa Kinyon for her work on the Green Country Model.