Saturday, September 12, 2015

The Odds of an All-Italian US Open Final Were Less than 1%

The semifinals of the women's US Open produced two monumental upsets. Just as in last year's men's semifinals, where two significantly lower-ranked players upset the top two seeds, Flavia Pennetta dismissed Simona Halep in straight sets, and Roberta Vinci came from a set down to deny Serena Williams the first calendar-year Grand Slam since Steffi Graf's in 1988.

FiveThirtyEight declared Serena Williams's defeat the greatest loss of all time, based on the Elo ratings of Williams and Vinci at the time of their semifinal matchup. We wanted to measure this upset probabilistically: how likely was it that both Italian players would upset the top seeds?

To answer this question, we refer to our tennis prediction model, which also uses an Elo-style rating to measure a player's ability. However, our system only incorporates matches played in the past year and head-to-head matches between players in the past 5 years. Our method also places more emphasis on detailed tennis metrics such as sets and games won in each match, the court surface, and the stage and quality of the tournaments played. This allows our model to make accurate predictions for any tournament at any given time.
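For readers unfamiliar with Elo, here is a minimal sketch of the textbook Elo expected-score formula. It is not our model, which adds the recency, surface, and tournament adjustments described above; the function name and the example ratings are purely illustrative assumptions.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Textbook Elo win probability for player A against player B.

    This is the standard logistic formula on a 400-point scale, not the
    adjusted metric used by the prediction model described in this post.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Hypothetical ratings, chosen only to illustrate the formula.
favorite_rating = 1900.0
underdog_rating = 1500.0

p_favorite = elo_expected_score(favorite_rating, underdog_rating)
p_underdog = elo_expected_score(underdog_rating, favorite_rating)
print(f"favorite: {p_favorite:.3f}, underdog: {p_underdog:.3f}")
```

The two probabilities always sum to 1, and equal ratings give each player exactly 50%.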

The table above shows each semifinalist's chance of reaching each stage of the tournament (Finalist or Winner) before Friday's matches. Notice that the Italians had only a 21% and a 3.7% chance of winning their matches. Further analysis of past tennis data (from 1968 to 2015) suggested that semifinal outcomes are essentially independent of each other: the chance that the second match is an upset is the same as the chance that it is an upset given that the first match was an upset. Thus, the probability of an all-Italian US Open final was 21% x 3.7% = 0.8%.
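The arithmetic behind that figure can be checked with a short sketch. It assumes, as argued above, that the two semifinals are independent, so the joint probability is simply the product; the function and variable names are ours, for illustration only.

```python
def joint_probability(p_first: float, p_second: float) -> float:
    """Probability that two independent events both occur."""
    return p_first * p_second


# The two pre-match upset probabilities quoted in the text.
p_upset_one = 0.21
p_upset_two = 0.037

p_all_italian_final = joint_probability(p_upset_one, p_upset_two)
print(f"{p_all_italian_final:.1%}")
```

If the outcomes were not independent, the second factor would have to be replaced by the conditional probability of the second upset given the first.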

As for who will win tomorrow's final, the betting odds make Flavia Pennetta a 4/9 favorite, an implied winning probability of about 69.2%. Our model suggests otherwise, giving Pennetta only a 54.7% chance. Because our model weights later-stage matches and strength of opponent more heavily, Vinci's rating improved much more than Pennetta's, since Williams has a significantly higher rating than the rest of the field. Thus, despite being ranked over 15 places higher than Vinci, Pennetta is only a slight favorite in this final. You can essentially treat it as a toss-up.
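For reference, here is a minimal sketch of how fractional odds convert to an implied probability. It assumes the usual reading of "a/b" as a profit of a for a stake of b, and it does not remove the bookmaker's margin, so the result slightly overstates the market's true estimate. The function name is illustrative.

```python
def implied_probability(profit: float, stake: float) -> float:
    """Implied win probability from fractional odds written as profit/stake.

    The bookmaker's margin is not removed, so this is an upper-leaning
    estimate of the market's view.
    """
    return stake / (profit + stake)


# Pennetta at 4/9: win 4 units for every 9 staked.
p_pennetta_market = implied_probability(4, 9)
print(f"{p_pennetta_market:.1%}")
```

Comparing this market figure of roughly 69% with the model's 54.7% shows how much closer our model considers the final to be.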

Stay tuned for our preview of the men's final on Sunday.

