Tour de France: Statistics Reveal the Drama
57 seconds. After more than 2,000 miles and nearly three weeks of grueling cycling, Cadel Evans needed to make up 57 seconds to catch the leader. And he would have only 26.4 miles, riding alone, to do it.
He gained two and a half minutes.
When you watch the Tour de France, you realize that amid the extreme physical challenges of the race comes a high level of personal drama. And while most people picture a large pack (the peloton) of virtually every rider in the tour riding together for most of the race, you might be surprised at just how much separation happens during each stage of the race.
Stages are categorized as follows:
- En Ligne - This is a French term that in English we would call "flat".
- High Mountains - Long, steep climbs and extremely fast descents (riders often top 60 mph).
- Individual time-trial - Riders start separated by a couple of minutes and must ride the stage alone.
- Medium Mountains - Not as big as the high mountains, but still mountains that 99% of people would never think of riding a bike up or down.
- Team time-trial - Teams start separately and ride separately, and all members of the team are given the same time.
Rather than total times, results are presented as "gaps": the difference between a rider's time and the stage winner's time. So the winner of a stage has a gap of 0:00, a second-place rider who crossed the finish seven seconds later has a gap of 0:07, and so on. Riders who finish together in a pack are all given the same time. Based on the image of nearly everyone riding together, you would expect a huge group to share the same time on each stage, with only a few finishers faster or slower, right?
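The gap calculation described above can be sketched in a few lines of Python. This is only an illustration: the rider names and finishing times are made up, and the helper names (`to_seconds`, `format_gap`, `stage_gaps`) are hypothetical, not part of any official results feed.

```python
def to_seconds(clock):
    """Convert a time such as '4:10:07' or '0:07' into whole seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def format_gap(seconds):
    """Format a gap in seconds as minutes:seconds, e.g. 7 -> '0:07'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


def stage_gaps(finish_times):
    """Return each rider's gap to the fastest time on the stage."""
    in_seconds = {rider: to_seconds(t) for rider, t in finish_times.items()}
    winning_time = min(in_seconds.values())
    return {rider: format_gap(s - winning_time) for rider, s in in_seconds.items()}


# Hypothetical finishing times, not actual Tour results.
# Riders B and C finish in the same pack, so they share a time.
example = {
    "Rider A": "4:10:00",
    "Rider B": "4:10:07",
    "Rider C": "4:10:07",
    "Rider D": "4:12:30",
}
gaps = stage_gaps(example)
for rider, gap in gaps.items():
    print(rider, gap)
```

The winner always comes out at 0:00, and riders credited with the same time get the same gap.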
Here is a look at the distribution of gaps for each stage in the 2011 Tour de France, colored based on the stage type:
The graph reveals that riders' finishes are much more spread out than it would seem, and it appears that mountain stages present the largest time gaps. Another factor that could affect the time gaps would be the length of the stage, so I plotted the average time gap versus the stage length and kept the grouping:
From that graph it appears that the stage type is much more important than the distance in determining the average time gap. In fact, the following ANOVA table confirms that stage distance is not significant while stage type is highly significant:
Removing the distance term shows that about 82% of the variation in average time gaps is explained by the stage type.
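For readers curious where a figure like that comes from: when the only predictor is a category such as stage type, the share of variation explained (R-squared) is the between-group sum of squares divided by the total sum of squares. The sketch below computes that ratio in plain Python. It is not the analysis behind the table above, the function name `r_squared_by_group` is hypothetical, and the average gaps are toy values chosen for easy arithmetic rather than 2011 data.

```python
from collections import defaultdict


def r_squared_by_group(values, groups):
    """Share of variation in `values` explained by group membership.

    With a single categorical predictor this equals
    SS_between / SS_total from a one-way ANOVA.
    """
    grand_mean = sum(values) / len(values)

    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)

    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in by_group.values()
    )
    return ss_between / ss_total


# Toy average gaps in minutes, one value per stage (not real Tour data).
avg_gap = [1, 2, 3, 10, 12, 14]
stage_type = ["flat", "flat", "flat", "mountain", "mountain", "mountain"]

r2 = r_squared_by_group(avg_gap, stage_type)
print(f"{r2:.4f}")  # 0.9375 for this toy data
```

A value near 1 means the stage type accounts for most of the differences between stages, while a value near 0 means it tells you very little.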
Another aspect of the race is that many riders are unable to finish, whether due to injury, fatigue, or doping violations (which are less common now than a few years ago, but still happen). Here is the number of riders who dropped out of the race at each stage, with stage type still indicated:
The outlier is a day when a single crash injured four riders badly enough that they had to quit (in the past, riders have been known to complete the race even after breaking bones!).
Back to our friend Cadel Evans, who beat the former leader in that final time trial by two and a half minutes. Based on the graphs above, you now have some context for what a great ride that was. Riders in the individual time trial finished, on average, five and a half minutes behind the stage winner. So beating another rider by two and a half minutes may not seem incredible, but remember that Cadel didn't need to beat just an average rider: he needed to beat the race favorite and current leader with only that stage remaining!
Here is a profile of the top 10 finishers in the race, and their overall time gaps (meaning how far back they were cumulatively across all stages to that point, not just for that stage) from the leader among that group:
That's Cadel in blue, and you can see him drop into first place at Stage 20, the individual time trial, after his performance. The last stage of the race is largely symbolic and, other than a sprinting competition, does not change times, so Cadel had effectively won the race after the time trial.
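The overall gaps described above are running totals: add up each rider's stage times, then measure everyone against whoever has the lowest total after each stage. Here is a minimal Python sketch of that idea. The function name `overall_gaps` is hypothetical, the two riders and their times are invented to echo the 57-second deficit and the two-and-a-half-minute gain, and details such as time bonuses are ignored.

```python
from itertools import accumulate


def overall_gaps(stage_times):
    """Cumulative gap to the overall leader after each stage.

    stage_times maps rider -> list of per-stage times in seconds.
    Returns rider -> list of gaps in seconds, one entry per stage.
    """
    totals = {rider: list(accumulate(times)) for rider, times in stage_times.items()}
    n_stages = len(next(iter(totals.values())))

    gaps = {rider: [] for rider in totals}
    for stage in range(n_stages):
        best_total = min(t[stage] for t in totals.values())
        for rider, t in totals.items():
            gaps[rider].append(t[stage] - best_total)
    return gaps


# Invented stage times in seconds (not actual results).
times = {
    "Leader": [1000, 2000, 3300],
    "Chaser": [1010, 2047, 3150],  # gains 150 seconds on the last stage
}
result = overall_gaps(times)
print(result["Leader"])  # [0, 0, 93]
print(result["Chaser"])  # [10, 57, 0]
```

In this made-up example the chaser starts the last stage 57 seconds down, gains 150 seconds, and finishes with a 93-second lead, which is the kind of line crossing the profile shows at Stage 20.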