Monday, June 29, 2015

Update: Nadal's Odds after Ferrer's Withdrawal

Eighth-seed David Ferrer has withdrawn from Wimbledon due to an elbow injury. His withdrawal changes the picture for those on the same side of the bracket - Nadal, who might have faced Ferrer in the quarterfinals, now has a much more open path. The highest seed he can possibly meet in the quarterfinals is now Fabio Fognini, ranked 30th. In fact, Nadal's chances of reaching the later stages increase significantly, according to our probability model:



That said, Nadal's chance of winning the tournament remains slim at 1.6%. To look at the odds of other contenders, check out our earlier post previewing the entire tournament.

Saturday, June 27, 2015

Predicting the 2015 Wimbledon Men's Tournament


















The upcoming Wimbledon showdown promises to be a high-stakes contest among the elite of men’s tennis. Roger Federer looks to win a record 8th Wimbledon title on his strongest surface. Novak Djokovic seeks to defend his title after winning his 5th Australian Open title this year. Rafael Nadal, seeded 10th, has a lot to prove and now must do it on his least reliable surface. Andy Murray comes fresh off a win at the Aegon Championships and will look to capitalize on his home-crowd support. Finally, we cannot rule out Stanislas Wawrinka, the "other" Swiss, who recently won the French Open crown.

But who is the favorite to win the most prestigious tournament in tennis in 2015? Inspired by FiveThirtyEight’s World Cup Prediction model, we apply similar principles to men’s tennis to calculate the odds of each player advancing to different stages of the tournament. 

What do our results say? Djokovic is the clear favorite to clinch his 3rd Wimbledon crown.

In fact, we are more confident in Djokovic than the betting odds suggest. Our model gives Djokovic a 55.6% chance of winning, while Bet365's odds imply a win probability of 44.4%. We also believe that Federer has a higher chance of winning Wimbledon than Murray, due to his strong head-to-head record against the British favorite. Finally, we think Bet365 overvalues Nadal’s prospects; given his recent form, we think other Top 10 players such as Berdych, Raonic and Nishikori have higher chances of winning the title.
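For readers unfamiliar with implied probabilities, here is a minimal sketch of how a bookmaker's price can be turned into a win probability. The decimal odds of 2.25 are a hypothetical input chosen because they correspond to roughly 44.4%; the actual Bet365 price is not quoted in this post, and the function names are our own for illustration.

```python
def implied_probability(decimal_odds: float) -> float:
    """Convert decimal odds into the raw implied win probability."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return 1.0 / decimal_odds


def normalize(raw_probs: dict) -> dict:
    """Rescale raw implied probabilities so they sum to 1.

    Raw bookmaker probabilities usually sum to more than 1 because of
    the bookmaker's margin; dividing by the total removes that margin.
    """
    total = sum(raw_probs.values())
    return {player: p / total for player, p in raw_probs.items()}


# Hypothetical price: decimal odds of 2.25 (assumed, not from the post).
print(round(implied_probability(2.25), 3))  # 0.444
```

Comparing a model's probability with the normalized implied probability, rather than the raw one, gives a fairer picture of where the model and the market disagree.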




Our model also shows the probability of each player reaching a given round of Wimbledon (below). This helps us determine who we should bet on at later stages of the tournament. For instance, Kei Nishikori is quite a reasonable bet if he makes it to the semifinals. While he has only an 11% chance of making it to this stage (lower than many other top 10 players), he has the 5th highest chance of winning the title, suggesting that Nishikori will only become stronger in the later stages. (He is also projected to meet Djokovic in the semifinals, which explains his low probabilities.)




We will continue to update these two figures after each round - stay tuned for more of our Wimbledon analysis. The details of our model are described below:

Details of Our Model

FiveThirtyEight uses ESPN’s Soccer Power Index (SPI) to rate a team at a given time. Similarly, we created an Elo-style rating system that tracks a tennis player’s rating at a given point in the tournament. Our system uses current ATP Rankings as a baseline and adjusts player ratings based on matches they have played in the past year. How much a match affects a player’s rating depends on the following factors:

  • Quality of Opponents - An upset (i.e. beating a higher-rated opponent or losing to a lower-rated one) causes a player’s rating to move more
  • Recency - Matches that happened recently are weighted more heavily than matches further back in time
  • Surface - Since we are predicting the Wimbledon tournament, grass court matches matter more than clay and hard court ones
  • Tournament Level - Grand Slam matches, Masters 1000 matches, and ATP 250 and 500 level matches are given different weightings
  • Set-Level Scoring - Players are given more credit for straight-set wins than for more drawn-out affairs
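To make the factors above concrete, here is a rough sketch of a weighted Elo-style update. It uses the standard Elo expected-score formula, which is not necessarily the exact formula behind our ratings, and every number in it (the K-factor of 32, the 180-day half-life, and the surface, level and straight-sets weights) is an assumed value for illustration only.

```python
# All weights below are hypothetical illustration values.
SURFACE_WEIGHT = {"grass": 1.0, "hard": 0.6, "clay": 0.4}
LEVEL_WEIGHT = {"grand_slam": 1.0, "masters_1000": 0.8,
                "atp_500": 0.6, "atp_250": 0.5}


def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo probability that player A beats player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def match_weight(days_ago: float, surface: str, level: str,
                 straight_sets: bool) -> float:
    """Combine recency, surface, tournament level and scoreline."""
    recency = 0.5 ** (days_ago / 180.0)      # assumed 180-day half-life
    margin = 1.2 if straight_sets else 1.0   # assumed straight-sets bonus
    return recency * SURFACE_WEIGHT[surface] * LEVEL_WEIGHT[level] * margin


def update_ratings(winner: float, loser: float,
                   k: float = 32.0, weight: float = 1.0):
    """Return the new (winner, loser) ratings after one match.

    The less expected the win, the larger the change, which is how
    upsets end up moving ratings more than routine results.
    """
    delta = k * weight * (1.0 - expected_score(winner, loser))
    return winner + delta, loser - delta


# Example: a recent straight-sets upset on grass at a Grand Slam.
w = match_weight(days_ago=30, surface="grass",
                 level="grand_slam", straight_sets=True)
print(update_ratings(1400.0, 1600.0, weight=w))
```

Multiplying the factors into a single weight keeps the update rule itself unchanged, so each factor can be tuned or dropped independently.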


Using these ratings, we simulated 5,000 tournaments and from the results estimated each player's odds of reaching different rounds. In calculating the odds of one player beating another, we also took into account their head-to-head record. As with the player ratings, recency, surface, tournament level, and scoring all matter.
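The simulation step can be sketched roughly as follows. This is a simplified illustration rather than our actual code: it assumes a bracket whose size is a power of two, derives match odds from ratings alone with the standard Elo formula, and leaves out the head-to-head adjustment. The player labels and ratings are made up.

```python
import random
from collections import Counter


def win_probability(rating_a: float, rating_b: float) -> float:
    """Standard Elo probability that player A beats player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def simulate_tournament(bracket, ratings, rng):
    """Play one single-elimination draw and return the champion.

    `bracket` lists players in draw order, so neighbours meet first.
    """
    players = list(bracket)
    while len(players) > 1:
        next_round = []
        for i in range(0, len(players), 2):
            a, b = players[i], players[i + 1]
            p_a = win_probability(ratings[a], ratings[b])
            next_round.append(a if rng.random() < p_a else b)
        players = next_round
    return players[0]


def title_odds(bracket, ratings, n_sims=5000, seed=0):
    """Estimate each player's title odds from repeated simulations."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    wins = Counter(simulate_tournament(bracket, ratings, rng)
                   for _ in range(n_sims))
    return {player: wins[player] / n_sims for player in bracket}


# Hypothetical four-player draw with made-up ratings.
ratings = {"A": 2000.0, "B": 1800.0, "C": 1750.0, "D": 1700.0}
print(title_odds(["A", "B", "C", "D"], ratings))
```

Recording the round each player reaches in every simulated draw, instead of only the champion, would give round-by-round probabilities in the same way.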

Saturday, June 6, 2015

Insights Before the French Open Men's Final

http://s3.india.com/wp-content/uploads/2014/11/novak-djokovic-vs-stanislas-wawrinka.jpg
The Roland Garros Men's Final between Novak Djokovic and Stan Wawrinka is an especially important matchup. Djokovic seeks to become only the eighth man in history to win a career Grand Slam, while Wawrinka seeks to cement his legacy in the record books by becoming a multiple Grand Slam winner. In light of this final, DataBucket presents key insights from three different perspectives to keep you up-to-date with the tennis world.

From a Head-to-Head Perspective

The following infographic reveals some interesting facts regarding this blockbuster match-up.

From a Surface Perspective

DataBucket also examined detailed statistics across the Grand Slams, and found that Roland Garros matches are more topsy-turvy in terms of shifts in momentum, but tend to have a more decisive winner in the end. The following graphs and insights support our conclusions:



  1. Roland Garros matches tend to be more straightforward affairs than those at other Grand Slams. At the Australian Open and Wimbledon, 48% and 52% of matches respectively go beyond three sets, compared to 44% at Roland Garros. As a result, don’t expect a tediously long match.
  2. Most people think that clay court matches will have more errors because the surface slows the ball down and lengthens rallies. However, statistics show that the winner/unforced error ratio at Roland Garros is comparable to that at the Australian and US Opens. This means viewers can still expect an entertaining final.
  3. The audience should expect more shifts in momentum. Sets will be won more decisively; as shown, only 13.2% of sets go to tiebreaks, fewer than in any other Grand Slam. However, there are also more breaks of serve - return games are won 23% of the time, more than in any other Grand Slam.

From a Historical Perspective

We also wanted to analyze tennis information independent of the upcoming French Open final. We sought to quantify how consistent players are in the current era compared to the top players in 2000. Rankings outside the top 100 are excluded so that a player's rankings from when he first turned professional are disregarded, since they aren't indicative of the performance he is known for.

The following two graphs plot the average ranking of each such player against the standard deviation of his ranking, over his entire career:

  1. Roger Federer, Pete Sampras, and Andre Agassi all have lower standard deviations than the other players. This is because these players spent a lot of time at certain ranks - Sampras and Federer were consistently #1 for a long time, while Agassi was consistently in the top 5.
  2. Top players today, such as Federer, Nadal, and Murray, have lower overall standard deviations than Sampras and Agassi. This reflects how consistently today's players hold their positions (Federer most often at 1, Nadal most often at 2, and Murray most often at 4).
  3. In 2000 and now, players with a worse (numerically higher) average rank have a higher standard deviation. This is likely because players who are worse on average tend to be less consistent as well. Also, top players only have room to move down the rankings, whereas other players have room to move up or down.
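As a small illustration of the two quantities plotted above, the sketch below computes a player's average ranking and ranking standard deviation after dropping rankings outside the top 100. The function name and the sample history are hypothetical, and we use the population standard deviation here as an assumption; the post does not say which variant was used.

```python
from statistics import mean, pstdev


def ranking_consistency(rankings, cutoff=100):
    """Return (average rank, standard deviation) within the cutoff.

    Rankings worse than `cutoff` are dropped, mirroring the exclusion
    of early-career rankings outside the top 100.
    """
    kept = [r for r in rankings if r <= cutoff]
    if not kept:
        raise ValueError("no rankings inside the cutoff")
    return mean(kept), pstdev(kept)


# Hypothetical ranking history: two early rankings outside the
# top 100 are ignored, the remaining eight are summarized.
history = [310, 145, 9, 7, 5, 5, 4, 4, 4, 2]
avg, spread = ranking_consistency(history)
print(avg, spread)  # 5 2.0
```

A lower standard deviation means the player's ranking stayed closer to his average over the period considered.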