Monday, September 14, 2015

Are Women's Tennis Rankings More Volatile than Men's Rankings?

http://www.irishexaminer.com/media/images/s/SerenaWilliamsUSOpenSep2015_large.jpg
Serena Williams' shocking loss in the 2015 US Open semi-finals left the world number 26 and number 43 to face off in the final. Meanwhile, the men's final featured numbers 1 and 2, Novak Djokovic and Roger Federer, respectively. That, combined with the fact that many female players now far off the radar, such as Ivanovic and Jankovic, are former world number 1s, led us back to the question - how volatile are women's rankings in comparison to men's?

Methodology
Using weekly ATP and WTA rankings data from Jeff Sackmann's GitHub, we analyze the variance of the rankings of players currently in the top 30. We also exclude rankings data outside of the top 100 to minimize the variance contributed by the period when these players first turned professional, which is not indicative of their pro performance.
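As a rough illustration of the calculation described above, here is a minimal Python sketch. The rows are invented toy data, not Jeff Sackmann's files, and the function name `rank_stats`, its parameters, and the use of population variance (`pvariance`) are assumptions made for the example; the post does not say which variance formula was used.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Toy weekly rankings as (week, player, rank) tuples, invented for illustration.
weekly_rankings = [
    ("2015-08-24", "Player A", 1),
    ("2015-08-31", "Player A", 1),
    ("2015-09-07", "Player A", 2),
    ("2015-08-24", "Player B", 140),  # outside the top 100, dropped below
    ("2015-08-31", "Player B", 60),
    ("2015-09-07", "Player B", 20),
]

def rank_stats(rows, current_week, top_n=30, cutoff=100):
    """Return {player: (weeks_in_cutoff, mean_rank, rank_variance)}.

    Only players ranked inside top_n in current_week are kept, and only
    their weeks ranked inside the cutoff contribute to the statistics.
    """
    current_top = {
        player for week, player, rank in rows
        if week == current_week and rank <= top_n
    }
    history = defaultdict(list)
    for week, player, rank in rows:
        if player in current_top and rank <= cutoff:
            history[player].append(rank)
    return {
        player: (len(ranks), mean(ranks), pvariance(ranks))
        for player, ranks in history.items()
    }

stats = rank_stats(weekly_rankings, current_week="2015-09-07")
for player, (weeks, avg, var) in sorted(stats.items()):
    print(f"{player}: weeks={weeks} mean={avg:.1f} variance={var:.1f}")
```

In this sketch the number of weeks inside the cutoff would play the role of the circle size in the charts, with mean rank and variance as the two axes.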


Looking at the WTA rank variance of the current top 30 players, we see that, as expected, strong players like Serena Williams and Maria Sharapova, who rank consistently at the top (excluding injuries), have a low mean rank (that is, close to number 1) and low rank variance. For mid-tier players, such as Sam Stosur and Roberta Vinci, the variance on the whole is much higher.

Smaller circles, which indicate newer players with fewer weeks in the top 100 under their belts - such as Sloane Stephens and Petra Kvitova - show a markedly higher mean rank and variance than the "power cluster" of consistent top players. However, there are also many mid-tier players with many weeks in the top 100 who still have a large variance and a high average rank. For newcomers, long-run ranking behavior is yet to be determined - they could join either the consistent top players or the more variable mid-tier players.


The graph of the top 30 ATP players shows that ranking means are similar across men and women, ranging from 5 to 55. However, the variance is lower for men on the whole. Similar to the WTA results, the small circles indicating newcomers generally trend toward the right and the top of the graph, meaning higher variance and a higher average rank. This is because these players undergo a lot of ranking movement when they first go pro, which is not indicative of their long-run ranking behavior.

Again, as in the women's results, men's ranking behavior breaks into two camps: the consistent top players like Roger Federer and Rafael Nadal, and the mid-tier players who vary more, such as David Ferrer and Philipp Kohlschreiber. One surprise is that Novak Djokovic has such a low average rank but such a high ranking variance - Djokovic has had sharp rises in the rankings, and variance penalizes a sharp jump more heavily than the same climb made in small incremental steps.
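To see why a sharp rise produces more variance than a gradual one, consider two made-up trajectories that both start at rank 50 and end at rank 1. The numbers are purely illustrative and are not Djokovic's actual ranking history; population variance is again an assumption.

```python
from statistics import pvariance

# Two hypothetical six-week paths from rank 50 to rank 1.
sharp_rise = [50, 50, 50, 1, 1, 1]       # one sudden jump
gradual_rise = [50, 40, 30, 20, 10, 1]   # steady weekly improvement

sharp_var = pvariance(sharp_rise)
gradual_var = pvariance(gradual_rise)

print(f"sharp rise variance:   {sharp_var:.2f}")
print(f"gradual rise variance: {gradual_var:.2f}")
print("sharp rise is more variable:", sharp_var > gradual_var)
```

Both paths have nearly the same mean rank, but in the sharp path every week sits far from that mean, so the squared deviations are all large. In the gradual path the middle weeks sit close to the mean and pull the variance down.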


Finally, looking at the graph of WTA rank variance vs ATP rank variance over the years for the current top 30 players, we see that WTA variance is considerably higher than ATP variance. Much of this gap is attributable to periods of extreme variance exhibited by certain players, such as Maria Kirilenko and Jamie Hampton. Even so, looking at the individual variances of the top 30 players, women do have higher rank variance than men on the whole.
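The effect of a few extreme players on a yearly average can be sketched as follows. The post does not spell out how the yearly figures were aggregated, so this example assumes a simple approach: compute each player's rank variance within a year, then summarize across players. The data and the helper name `yearly_variance_summary` are invented for illustration.

```python
from collections import defaultdict
from statistics import mean, median, pvariance

# (year, player, ranks during that year); invented toy numbers.
yearly_ranks = [
    (2013, "Player A", [3, 3, 4, 3]),
    (2013, "Player B", [20, 22, 21, 19]),
    (2013, "Player C", [95, 60, 30, 12]),  # one player with a steep climb
    (2014, "Player A", [3, 2, 2, 3]),
    (2014, "Player B", [19, 18, 20, 21]),
    (2014, "Player C", [12, 11, 13, 12]),
]

def yearly_variance_summary(rows):
    """Return {year: (mean_of_player_variances, median_of_player_variances)}."""
    per_year = defaultdict(list)
    for year, _player, ranks in rows:
        per_year[year].append(pvariance(ranks))
    return {
        year: (mean(variances), median(variances))
        for year, variances in sorted(per_year.items())
    }

summary = yearly_variance_summary(yearly_ranks)
for year, (avg_var, med_var) in summary.items():
    print(f"{year}: mean variance={avg_var:.2f} median variance={med_var:.2f}")
```

In the toy 2013 data a single volatile player drags the mean far above the median, while in 2014 the two stay close. Comparing the two summaries is one way to check whether a tour-wide gap comes from a handful of players or from the field as a whole.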

3 comments:

  1. This is great work and I love the use of Tableau!
    A couple of thoughts:

    One thing I would consider looking into is Stephanie Kovalchik's recent work on how match format explains why the WTA seems more inconsistent than the ATP. Her paper is not out yet, but her website is http://on-the-t.com/
    I had the pleasure of meeting her to get the gist of her work: https://medium.com/the-tennis-notebook/tennis-note-19-f3c30e0a79ea

    I would be interested to see variance dating back to the 70s. I think Jeff has the rankings going back pretty far. We all know how consistent and dominant the top players have been in the last decade, so perhaps this is slightly biased. I also think it would be worth looking more into the actual causes for certain years in the WTA. For instance, in 2006 Serena completely disappeared. 2009 marked the return of Kim Clijsters, who took a wild card at the US Open and won it. 2010 was when Henin returned and Schiavone came out of nowhere. I know you listed a few reasons, but actually finding the cause would take it further.

    Finally, I understand you looked at rank, but I am taking a wild guess: did you look at ranking points to calculate variance? How much does the discrepancy between the ranking point systems affect this difference?

    All questions and suggestions to ponder! Cannot wait for another post :)

    Cheers,
    Nikita Taparia
    the.tennis.notebook@gmail.com
    Feel free to message me at @kryptobanana on Twitter
