The question came to me while I was watching Grantland's hilarious "Men in Blazers" World Cup Preview. They boldly called for the US to win the World Cup despite being in the "Group of Death" with 2 of the top 3 teams by FIFA rank. Were they kidding?
On the one hand, the preview is definitely comedy - they make predictions based on how they feel after eating "World Cup-Cakes." On the other hand, anyone in a blazer seems at least a little serious. Either way, like any high-quality journalism, the Men in Blazers got me thinking. How deadly is the US's Group G? How screwed is the US in the group stage?
The answers are "very deadly" and "less screwed than I thought."
My first goal (GOOOAL!) was to figure out the probability that the US gets through the group round. To do that, I need to predict what is going to happen in the group stage games. I used FIFA's point rankings (as of May) and results from the past three World Cups to build a model of game outcomes as a function of FIFA rank.
I found that point differential varies with the log difference of two teams' FIFA rank. I adjusted the 2002 and 2006 FIFA ranks to make them comparable after FIFA changed scoring methodologies. The adjustments rescale the old ranks and adjust the ranks of less competitive conferences lower. For complete transparency, all the data and number crunching in R are posted to my GitHub repo.
I used that model to simulate the first round of the World Cup 10,000 times and calculate the probability that each team makes it out of the group stage.
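To make the simulation step concrete, here is a minimal sketch in Python (my actual number crunching was in R, and this is not that code). It assumes the goal differential of a match is normally distributed around a multiple of the log ratio of the two teams' ratings, then rounded to a whole number. The ratings, SLOPE, and NOISE values are made-up placeholders, not real FIFA points or fitted coefficients, and the tiebreak rule is simplified.

```python
import itertools
import math
import random

# Hypothetical ratings for illustration only; these are NOT real FIFA points.
GROUP = {"Team A": 1300.0, "Team B": 1250.0, "Team C": 1000.0, "Team D": 700.0}

SLOPE = 1.5  # assumed: expected goal differential per unit of log rating ratio
NOISE = 1.5  # assumed: standard deviation of the goal differential


def simulate_match(rating_a, rating_b, rng):
    """Return a simulated whole-number goal differential (team a minus team b)."""
    expected = SLOPE * math.log(rating_a / rating_b)
    return round(rng.gauss(expected, NOISE))


def simulate_group(ratings, rng):
    """Play one round robin and return the two teams that advance."""
    points = {team: 0 for team in ratings}
    goal_diff = {team: 0 for team in ratings}
    for a, b in itertools.combinations(ratings, 2):
        diff = simulate_match(ratings[a], ratings[b], rng)
        goal_diff[a] += diff
        goal_diff[b] -= diff
        if diff > 0:
            points[a] += 3
        elif diff < 0:
            points[b] += 3
        else:
            points[a] += 1
            points[b] += 1
    # Simplified tiebreak: points, then goal differential, then a coin flip.
    ranked = sorted(
        ratings,
        key=lambda team: (points[team], goal_diff[team], rng.random()),
        reverse=True,
    )
    return ranked[:2]


def advance_probabilities(ratings, n_sims=10_000, seed=0):
    """Estimate each team's chance of finishing in the top two of its group."""
    rng = random.Random(seed)
    counts = {team: 0 for team in ratings}
    for _ in range(n_sims):
        for team in simulate_group(ratings, rng):
            counts[team] += 1
    return {team: counts[team] / n_sims for team in ratings}


if __name__ == "__main__":
    for team, prob in advance_probabilities(GROUP).items():
        print(f"{team}: {prob:.1%}")
```

Because exactly two teams advance in every simulated group, the four probabilities always add up to 2. Swapping in real ratings and fitted coefficients is what turns this toy into an actual forecast.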
For each team, this scatterplot shows the probability of surviving the first round against the team's FIFA rank points. Argentina (Group F), at the top of the chart, has about an 80% chance of getting through. Australia (Group B), at the bottom, has about a 15% chance.
The points are colored by group and you can highlight each by clicking buttons A through H. Or show all the labels at once using the button on the top right. Or hover over any dot with your mouse.
There is a lot of interesting information here. For one, Group G sits low on the scatterplot, and low means deadly. Group G teams have worse chances of making it through the group stage than teams with similar FIFA ranks.
There are seven teams with lower rankings than the US that have a better chance of getting through. Russia (H) stands 4 spots below the US in FIFA ranks, but has over a 15% better chance of surviving the group stage than the US.
When I heard that the US was in a group with the number 2 and 3 ranked teams, my gut reaction was to guess that they had maybe a 20-25% chance of getting through the group stage. But they beat out either Germany or Portugal in 44% of my simulations. That surprised me.
My knee-jerk reaction to underestimate the US prospects highlights a well-known "irrationality". My "gut" forecast overweighted appealing narrative information (Germany and Portugal are two of the top three teams!) and neglected boring "base rate" information (two of the four teams in a group advance, so four teams with the same skill each have a 50% chance of survival). In behavioral economics this is called the base rate fallacy, probably with a bit of fundamental attribution error thrown in. In a sport where games are often decided by one goal, luck plays a very important role and the base rate is stronger than you think.
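The 50% base rate is easy to check with a toy simulation. The sketch below is illustrative only: it assumes four evenly matched teams, treats a win for either side and a draw as equally likely (an arbitrary choice), and breaks ties in points with a coin flip. The team names are placeholders.

```python
import itertools
import random


def equal_teams_advance_rate(n_sims=10_000, seed=1):
    """Estimate how often each of four evenly matched teams finishes top two."""
    rng = random.Random(seed)
    teams = ["T1", "T2", "T3", "T4"]  # placeholder names
    advanced = {team: 0 for team in teams}
    for _ in range(n_sims):
        points = {team: 0 for team in teams}
        for a, b in itertools.combinations(teams, 2):
            # Assumed: either side winning and a draw are equally likely.
            outcome = rng.choice(["a", "b", "draw"])
            if outcome == "a":
                points[a] += 3
            elif outcome == "b":
                points[b] += 3
            else:
                points[a] += 1
                points[b] += 1
        # Ties in points are broken by a coin flip.
        ranked = sorted(teams, key=lambda t: (points[t], rng.random()), reverse=True)
        for team in ranked[:2]:
            advanced[team] += 1
    return {team: advanced[team] / n_sims for team in teams}


if __name__ == "__main__":
    print(equal_teams_advance_rate())
```

Whatever win/draw split you assume, symmetry forces every team toward one half: two spots, four identical teams.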
Featuring 3 of the top 12 teams, Italy, England, and Uruguay, Group D also looks tough. Argentina (F) has the easiest path to the Round of 16. The most threatening team in their group is Bosnia and Herzegovina (which is apparently one country). Group H's top team is Belgium, the 12th ranked FIFA team - just two spots higher than the US.
One Measure to Rule them All
Calculating what percentage of all World Cup teams' FIFA points each group composes puts a single number on how deadly each group is. That percentage also lets you compare the deadliness of the groups in this World Cup to past World Cups.
Group G accounts for 14.42% of all the FIFA points in this World Cup. With eight groups, the average share is 12.5%, so that's almost 2 percentage points more than average. Based on that measure, 2014 Group G is just barely less deadly than the deadliest group of this millennium, Group F in 2002. Barely. Group F accounted for 0.04 percentage points more of the FIFA points in 2002 than Group G does in 2014.
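The share calculation itself is just a ratio of sums. Here is a small Python sketch of it; the group labels and point values in the example are made up to keep the arithmetic obvious and are not real FIFA points.

```python
def group_shares(points_by_group):
    """Return each group's percentage of the total points across all groups."""
    totals = {group: sum(points) for group, points in points_by_group.items()}
    grand_total = sum(totals.values())
    return {group: 100.0 * total / grand_total for group, total in totals.items()}


if __name__ == "__main__":
    # Made-up example with two groups of four teams each.
    example = {
        "X": [100, 100, 100, 100],  # 400 points in total
        "Y": [150, 150, 150, 150],  # 600 points in total
    }
    print(group_shares(example))
```

In this toy example Group X holds 400 of 1,000 points and Group Y holds 600, so the average share is 50% and Group Y is the "deadlier" group. With eight real groups the average share is 12.5%, which is the benchmark for the 14.42% figure.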
The outcome of 2002 Group F is a reassuring case study in what can go right for the US. Sweden, the weakest of the strongest 3 in the group by FIFA rankings, tied England in the first game and won the "gimme" game against Nigeria.
England's win over Argentina left Sweden in position to advance with a tie in their final game against Argentina. They tied and advanced. Argentina, who had the same FIFA rank as eventual champion Brazil, went home.
The friendliest groups are also interesting. Somehow Russia and Belgium both managed to make their way into the easiest groups of the last four World Cups: 2002 and 2014 Group H. If FIFA doesn't launch a full-scale investigation, I'll have to take matters into my own hands (sounds like the plot of the worst political thriller ever - working title 'Tournament of Deception').
Fourth on the "friendliest list," Group F from this World Cup should be a breeze for group favorite Argentina. My simulations show the "Blue and Whites" as having the best chance to advance out of the group stage.