Saturday, July 11, 2015

How DataBucket's Wimbledon Model Can Be Better


This year's Wimbledon tournament culminates tomorrow in a final blockbuster showdown between Novak Djokovic and Roger Federer. Over the past two weeks, we have developed a probability model that determines the odds of each player reaching each stage of the tournament (these odds were updated after every round). Now with two competitors remaining, our model claims that Novak Djokovic has a 63.8% chance of beating Roger Federer.

But should you trust our model's results?


Looking closely at the model's assumptions, I see several areas where our probability model can be improved.

Match Scores are Not Always an Accurate Indicator of Current Form

Here's one of the key concerns: while our model accounts for how close a match was (e.g. winners in straight sets are rewarded more than winners in 5 sets), match scores are not always a true indicator of form or current ability. Roger Federer was rewarded significantly for beating Andy Murray in straight sets, especially since he only had a 55% chance of winning, but in my opinion he should be rewarded even more. Federer faced only one break point in the entire match (and that was in the first game). He hit 20 aces, had over 5 winners for every unforced error he made, and won over 80% of the time when serving at 30-30 or deuce. Legends of the game proclaimed it one of the best serving performances ever witnessed. Even Federer acknowledged this match as "definitely one of the best matches I've played in my career."

Likewise, Andy Murray should not be penalized as heavily for losing this match in straight sets. He hit over 2 winners for every unforced error he made (the average for the tournament was 1.5). He served a respectable 12 aces compared to only 1 double fault. And he managed to stay with Federer to the end of each set, only for his opponent to step up a gear. This was not a demoralizing defeat for Murray, but rather a performance many would call a valiant effort. As Sports Illustrated put it in their live blog, "So good. Too good. Too, too good from Roger Federer." My model can be improved by incorporating some of these detailed match statistics, but how much they should influence these probabilities is very much up for debate.
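To make the margin-of-victory idea concrete, here is a minimal sketch of an Elo-style reward for a match winner. This is not DataBucket's actual model: the function name, the K-factor of 32, the margin formula, and the stats_bonus knob are all assumptions made up for illustration. It only shows how a straight-sets win at a 55% pre-match probability could earn more than a five-set win, and where a match-statistics adjustment might plug in.

```python
def winner_rating_gain(pre_match_win_prob, sets_won, sets_lost,
                       k=32.0, stats_bonus=0.0):
    """Hypothetical Elo-style reward for a match winner (illustration only).

    pre_match_win_prob: the winner's modelled chance of winning before play.
    stats_bonus: assumed extra weight for detailed match statistics
                 (aces, winners per unforced error, ...); 0.0 ignores them.
    """
    if not 0.0 < pre_match_win_prob < 1.0:
        raise ValueError("pre_match_win_prob must be strictly between 0 and 1")
    if sets_won <= sets_lost:
        raise ValueError("the winner must have won more sets than they lost")

    # Share of sets by which the winner led: 1.0 for 3-0, 0.5 for 3-1, 0.2 for 3-2.
    margin = (sets_won - sets_lost) / (sets_won + sets_lost)

    # Bigger margins and stronger underlying stats scale the reward up.
    multiplier = 1.0 + margin + stats_bonus

    # Less-expected wins are worth more, as in a standard Elo update.
    surprise = 1.0 - pre_match_win_prob
    return k * multiplier * surprise


# A straight-sets win at a 55% pre-match probability, versus the same win in five sets.
straight_sets = winner_rating_gain(0.55, sets_won=3, sets_lost=0)
five_sets = winner_rating_gain(0.55, sets_won=3, sets_lost=2)
# The same straight-sets win with an assumed bonus for dominant match statistics.
with_stats = winner_rating_gain(0.55, sets_won=3, sets_lost=0, stats_bonus=0.25)

print(straight_sets > five_sets, with_stats > straight_sets)
```

How large stats_bonus should be is exactly the open question raised above; the sketch only shows that it can be kept as a separate, tunable term rather than folded into the set score.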

A Career-Defining Win Can Go Many Ways

We would probably all agree that this match is one of the highlights of Roger Federer's already illustrious career. We can classify it as one of his career-defining matches. But such a strong performance from Federer can go either way. He may gain plenty of momentum from this performance and play superbly against Djokovic in the final. Or he may have expended too much energy, suffer from mental or physical fatigue, and fall to the steady Serb. This is what our model also lacks - the ability to capture a player's reaction to a career-defining win. Will a player succumb to the pressure in the match that follows the match of his life, like Lukas Rosol after beating Nadal at Wimbledon in 2012 or Kei Nishikori after beating Djokovic at the 2014 US Open? Or will a player rise to the occasion, gain confidence and play at a much higher level after a career-changing win, like Robin Soderling at the 2009 French Open, or Stanislas Wawrinka at the 2014 Australian Open?

For these reasons, while listing Djokovic as a 63.8% favorite seems reasonable to most, there are simply many factors in tennis that are difficult to quantify. DataBucket will continue to try to incorporate as many of these factors as possible, especially as the US Open is just around the corner.

Tuesday, July 7, 2015

Wimbledon QF Preview: Top 4 Seeds Likely to Advance

We are now down to the quarterfinals of Wimbledon, and the top four seeds are still going strong. In fact, according to my probability model, the chances of Djokovic, Federer, Murray and Wawrinka making the semifinals are very high. In my previous 4th round post, each of these players had at least a 67% chance of making it to the final four. Now each player has at least a 78% chance.



With these results come a few interesting observations:

Djokovic's Chances of Winning have Decreased: Before the start of Wimbledon, Djokovic had a 56% chance of winning - this has decreased to 52%. As my model accounts for each player's margin of victory, Djokovic's close five-set encounter against Kevin Anderson actually hurt his chances of winning the tournament.

Wawrinka's Title Prospects Continue to Rise: Before the tournament, my model put Stan the Man's odds of winning at 6.2%. They have now climbed to 9.1%, as he has breezed through the early rounds without dropping a set. We had predicted Wawrinka to have only a 66% chance of reaching the QF. Now that he has reached this stage, he will only become a more dangerous threat.

Federer is More Likely to Win Wimbledon Than Murray: I argued this point in my previous post, and Federer's performances at Wimbledon have continued to put him in front of Murray (Bet365 apparently disagrees). Federer easily handled Bautista Agut on Monday, while Murray fought through a tough four-set encounter against big-serving Ivo Karlovic. Murray also has not beaten Federer or Djokovic since 2013.

Gasquet's Title Odds are Overrated (It's not 2%): Sure, he has one of the prettiest backhands in the world. Sure, he beat Dimitrov and Kyrgios (who was overrated anyway). But he has to beat Wawrinka, Djokovic and Federer along the way and these are clearly very tough hurdles to overcome.
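The reasoning behind that last observation is simple multiplication: to win the title, a player has to win every remaining match, so the individual win probabilities multiply. Here is a rough sketch of that arithmetic. The per-match probabilities are placeholders I made up for illustration, not outputs of my model, and the function assumes the opponents are already known and the matches are independent, which the real model does not need to assume.

```python
import math


def title_probability(match_win_probs):
    """Chance of winning every remaining match, assuming independent matches
    against a fixed, known sequence of opponents (a simplification)."""
    for p in match_win_probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must be between 0 and 1")
    return math.prod(match_win_probs)


# Placeholder chances for an underdog in the quarterfinal, semifinal and final.
# These numbers are illustrative assumptions only.
underdog_path = [0.30, 0.15, 0.25]
print(round(title_probability(underdog_path), 5))
```

Even with a fairly generous 30% in the quarterfinal, three tough hurdles in a row leave a title chance close to 1%, which is why a long path through the top seeds drags an underdog's odds down so quickly.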

To see how my model works, take a look at my original post on predicting the Wimbledon tournament. Any comments are welcome!

Sunday, July 5, 2015

Djokovic Still the Favorite After the 1st Week of Play


The Wimbledon field has dwindled to its final 16 contenders. While some of the top seeds have fallen (think Raonic, Nishikori, and Nadal), Djokovic, Federer, Murray and Wawrinka are still in contention.


Last week, I presented my model that predicts the odds of each player reaching different rounds of the tournament. After three rounds of play, the title odds of the contenders haven’t changed significantly (Djokovic at 57%, Federer at 18%, Murray at 13%, and Wawrinka at 8%). That said, the chance of these players reaching the semifinals has increased dramatically. Mark your calendars for some mouthwatering clashes on Thursday and Friday, as the top 4 seeds all have at least a 67% chance of making it to the final four.



Comparing our results with betting odds, we claim that Bet365 appears to have overestimated the chance that the underdogs will win the tournament. They may be doing this on purpose to hedge against the risk of paying out huge multiples, or they may be wary of the fact that three of the past six slams have been won by a player outside the Big Four. But giving Nick Kyrgios a 1 in 29 chance of winning is far too optimistic given that he will have to beat Djokovic, Wawrinka and Federer/Murray to win the tournament.


Another overly optimistic call by the betting companies is giving Murray a 1 in 3 chance of winning the tournament. Yes, many people are probably betting on Murray. Yes, many people think home-court advantage matters. But keep in mind that Murray is 2-6 in Grand Slam finals and has lost to Federer or Djokovic 11 times in a row. There are also lapses in concentration, such as in the third set against Andreas Seppi, that are worthy of concern and may come back to haunt him when he plays Federer or Djokovic.
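One caveat when lining up bookmaker prices against model probabilities: a bookmaker's implied chances usually add up to more than 100%, because the margin is built into the prices. Here is a small sketch of how one might convert decimal odds into implied probabilities and rescale them to sum to 1 before comparing. The three-player market and its prices are made-up placeholders, not actual Bet365 odds, and proportional rescaling is just one simple way to remove the margin.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds to probabilities that sum to 1.

    decimal_odds: dict mapping player name -> decimal odds (must be > 1).
    Returns (normalized probabilities, overround), where the overround is the
    sum of the raw implied probabilities before rescaling.
    """
    for name, odds in decimal_odds.items():
        if odds <= 1.0:
            raise ValueError(f"decimal odds for {name} must be greater than 1")

    # Raw implied probability of each outcome is 1 / decimal odds.
    raw = {name: 1.0 / odds for name, odds in decimal_odds.items()}

    # A sum above 1.0 reflects the bookmaker's built-in margin.
    overround = sum(raw.values())

    # Rescale proportionally so the probabilities sum to exactly 1.
    normalized = {name: p / overround for name, p in raw.items()}
    return normalized, overround


# Placeholder three-player market (illustrative prices only).
market = {"Player A": 1.8, "Player B": 3.6, "Player C": 3.6}
probs, overround = implied_probabilities(market)
for name, p in probs.items():
    print(f"{name}: {p:.1%}")
print(f"overround: {overround:.3f}")
```

After this adjustment, a quoted price looks a little less generous than its raw "1 in N" reading suggests, so some of the gap between the bookmaker and the model on underdogs may simply be margin.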


Thoughts on my model? Thoughts on Wimbledon in general? Feel free to comment below. Check out the posting on the first week of Wimbledon here.