
The Odds of Throwing a Perfect Game

If you like baseball pitching statistics, then you've loved the month of June. On the first of the month, Johan Santana pitched the first no-hitter in Mets history. Then a week later, the Seattle Mariners used 6 different pitchers to do the same thing. That tied the MLB record for most pitchers used in a no-hitter. And finally, 5 days after that, Matt Cain pitched the 22nd perfect game in major league history. And we're only halfway through June! It doesn't take a Six Sigma Black Belt to realize it's been a crazy month.

But as a stat nerd, the question I have is how crazy has June really been? What are the odds of throwing a perfect game and a no-hitter? (Don't worry, it doesn't take a Six Sigma Black Belt to figure that out, either!) But before we start, we have an important question to answer.

What Year's Data Should We Start With?

There have been 22 perfect games, with the first two both happening in 1880. But in 1880 pitchers threw underhand, it took 8 balls to draw a walk, and a batter was not awarded first base if they were hit by a pitch. In other words, the odds of pitchers in 1880 throwing a perfect game were vastly different than today. To account for this, I'm going to start collecting data at 1900, as people seem to agree that this is when the modern era of Major League Baseball began. Since 1900, there have been 20 perfect games and 235 no-hitters.

How Do We Calculate the Odds of a Perfect Game?

I went to baseball-reference.com and recorded the total number of games played, including any postseason games, for the last 113 seasons (I included all games played through June 13th, 2012). I also recorded the league average for on-base percentage. If you want to follow along, you can get the data here

Since 1900, there have been 181,921 major league baseball games. But in each game, there are two starting pitchers, one for each team. So to get the number of opportunities for a perfect game, we need to double that number. That means since 1900, there have been 363,842 opportunities for a perfect game. And a perfect game has been thrown in only 20 of them! What are the odds?

Odds of throwing a perfect game = 20 / 363,842 = 0.000055 = approx 1 in 18,192

Yeah, that's pretty low. Giants fans who were in attendance at AT&T Park Wednesday night should consider themselves extremely lucky. What about Mets and Mariners fans? How lucky should they consider themselves?

Odds of throwing a no-hitter = 235 / 363,842 = 0.000646 = approx 1 in 1,548

That's still quite lucky, but not nearly as lucky as seeing a perfect game. So, sorry Mets and Mariners fans, we're going to focus on the perfect game from here on out. It's just more interesting. Why? Well...you'll see.
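If you'd like to check the two observed odds above yourself, here is a minimal Python sketch. It only uses the counts quoted in this post; the helper name `one_in` is just something made up for the illustration.

```python
# Observed frequencies since 1900, using the counts quoted in the post.
games = 181_921                # major league games, 1900 through June 13, 2012
opportunities = 2 * games      # each game gives two teams a chance


def one_in(count, chances):
    """Return (probability, N) so the odds can be read as '1 in N'."""
    p = count / chances
    return p, 1 / p


p_perfect, n_perfect = one_in(20, opportunities)    # 20 perfect games
p_nohit, n_nohit = one_in(235, opportunities)       # 235 no-hitters

print(f"Perfect game: {p_perfect:.6f} (about 1 in {n_perfect:,.0f})")
print(f"No-hitter:    {p_nohit:.6f} (about 1 in {n_nohit:,.0f})")
```

Swapping in a different starting year only means changing the game count and the two event counts.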

Is This What We Would Expect to Happen?

The odds above are just what we've observed in the last 113 years. But let's stop for a minute and think about what we would expect. To pitch a perfect game, no runner can reach base. That means you have to get 27 batters out in a row. So the probability of throwing a perfect game is equal to the probability of getting 27 batters out in a row.

Remember when I said for each season I collected the league average for on-base percentage (OBP)? Well, OBP is the percent of the time that a batter reaches base (either by a hit, a walk, or getting hit by a pitch). That means the probability of getting a batter out is 1 minus the on-base percentage. I'll have Minitab calculate the average OBP since 1900.

Minitab gives an average OBP of 0.32856 for all of major league baseball over the last 113 years. So the probability of a pitcher getting a batter out is:

1 - 0.32856 = 0.67144 = 67.1%

This means that over the past 113 years, batters get out 67.1% of the time. Now, this number isn't constant, as it changes slightly depending on the batter and the pitcher. But I can't break down every plate appearance since 1900, so we're going to stick with this number. Now let's calculate the odds!

Odds of throwing a perfect game = 0.67144^27 = 0.00002134 = approx 1 in 46,800

Again, the true odds depend on the pitcher and the team he's pitching against. Some games will have odds slightly better, and some will have odds slightly worse. But they should even out, making our odds of 1 in 46,800 a good estimate for the average game. So using a probability of 0.00002134, how many perfect games would we expect to see in 363,842 opportunities?

Expected number of perfect games = 0.00002134 * 363,842 = 7.8 perfect games
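The arithmetic in this section can be retraced with a few lines of Python. This is only a sketch under the same simplifying assumption made above: every plate appearance is treated as an independent trial with the league-average out probability.

```python
# Expected perfect-game probability from the league-average OBP quoted above.
avg_obp = 0.32856              # average OBP, 1900-2012, as reported in the post
opportunities = 363_842        # two per game, as counted in the post

p_out = 1 - avg_obp            # chance that a single batter is retired
p_perfect = p_out ** 27        # 27 straight outs, assumed independent
expected = p_perfect * opportunities

print(f"P(out)          = {p_out:.5f}")
print(f"P(perfect game) = {p_perfect:.8f}")
print(f"Expected count  = {expected:.1f}")
```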

So there have been more than twice as many perfect games as we would expect! But of course, 7.8 is just the average. Certainly we could get other outcomes. After all, if you flip a coin 100 times, you're not always going to get 50 heads. We can use a probability distribution plot to visualize the other possibilities. We use a binomial distribution with 363,842 trials and an event probability of 0.00002134.



We see that any number of perfect games between 4 and 11 wouldn't be that uncommon. But wait, there have been 20 perfect games. I don't see any gray bars even close to 20! In fact, using Minitab's cumulative distribution function, the probability that we would see at least 20 perfect games since 1900 comes out to about 1 in 5,780. That's very uncommon!
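For anyone without Minitab, the same tail probability can be approximated in plain Python. The sketch below is not Minitab's routine; it builds the binomial probabilities with a simple recurrence (the function name `binomial_pmf_upto` is made up for this example) and subtracts the cumulative total from 1.

```python
def binomial_pmf_upto(n, p, k_max):
    """Return [P(X=0), ..., P(X=k_max)] for X ~ Binomial(n, p)."""
    probs = [(1 - p) ** n]                 # P(X=0)
    ratio = p / (1 - p)
    for k in range(k_max):
        # P(X=k+1) = P(X=k) * (n - k) / (k + 1) * p / (1 - p)
        probs.append(probs[-1] * (n - k) / (k + 1) * ratio)
    return probs


n = 363_842                    # opportunities since 1900
p = 0.00002134                 # expected perfect-game probability from above

pmf = binomial_pmf_upto(n, p, 19)
tail = 1 - sum(pmf)            # P(X >= 20)
middle = sum(pmf[4:12])        # P(4 <= X <= 11)

print(f"P(4 to 11 perfect games)     = {middle:.3f}")
print(f"P(at least 20 perfect games) = {tail:.6f} (about 1 in {1 / tail:,.0f})")
```

The result should land close to the 1 in 5,780 figure quoted above; small differences would come from rounding in the event probability.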

Is It Just Random Variation?

It could be. But think of it this way. Imagine we take the 181,921 games played since 1900, and say they are just one sample. Then we take another sample of 181,921 games. And then another. And another, until we have 5,780 samples (it would take over 650,000 years). In just one of those samples, we would expect to have at least 20 perfect games. So are we just "lucky" enough to have that sample be the very first one we took? I'm thinking not.

Then something has to be wrong with the expected value, right? I guess so, but I'm not sure what it is. And then I found some numbers that really made my head spin. Let's take the fact that there have been 20 perfect games and work backwards:

  • 20 perfect games / 363,842 opportunities = A probability of 0.000055 of getting 27 batters out in a row
  • 0.000055^(1/27) = A probability of 69.5% of getting one batter out
  • 1 - 0.695 = An average OBP of 0.305
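The three steps above can be sketched in Python as well. As before, this assumes all 27 outs are independent and equally likely, which is the simplification used throughout this post.

```python
# Work backwards from the observed frequency to the OBP it would imply.
observed = 20 / 363_842            # observed perfect-game frequency since 1900
p_out = observed ** (1 / 27)       # per-batter out probability implied by 27 straight outs
implied_obp = 1 - p_out

print(f"Implied P(out) = {p_out:.3f}")
print(f"Implied OBP    = {implied_obp:.3f}")
```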

In a league where there have been 20 perfect games in 363,842 opportunities, we would expect the average OBP of the league to be 0.305. Why did this make my head spin? Consider these stats:

  • Batters that have faced Hall of Famer Nolan Ryan had an OBP of 0.307
  • Batters that have faced Yankee Ace CC Sabathia have an average OBP of 0.306
  • Batters that have faced the last 10 pitchers to throw a perfect game have an average OBP of 0.310

So in a league made up of nothing but clones of Nolan Ryan, CC Sabathia, and the last 10 pitchers to throw a perfect game (that includes Randy Johnson), you still wouldn't have a league where the average batter gets out 69.5% of the time. Mind = Blown.

So, What Are the Odds of a Perfect Game?

Well, I can confidently say that they are low: somewhere between 1 in 46,800 (what we'd expect) and 1 in 18,192 (what we've observed). But for the life of me, I can't figure out why these two numbers are so different. If anybody has any theories, I'd love to hear them! In the meantime, I'll finish with some things that definitely have better odds of happening than a perfect game.

  • Winning $400 on a Pirates or Phillies Pennsylvania Lottery scratch off ticket (1 in 12,000)
  • Having a randomly picked clover be a four-leaf clover (1 in 10,000) 
  • Getting four of a kind in a 5 card poker hand (1 in 4,164)
  • Successfully navigating an asteroid field (1 in 3,720... at least according to C-3PO)

Photo by Art Siegel, used under Creative Commons 2.0 license.

Comments

Name: Quentin • Friday, June 15, 2012

I always enjoy your sports-related posts. Interesting you got very different probabilities than the actual results. Don't forget that many games have gone longer than 9 innings. So the 27 at-bats per game is probably a little larger in reality, which would seem to make perfect games for a given pitcher even less likely than what you calculated.


Name: Kevin • Tuesday, June 19, 2012

Thanks Quentin! And I agree that a pitcher could have to make more than 27 outs to throw a perfect game. But that would require the other pitcher(s) to throw a shutout. So the odds of one pitcher throwing a perfect game while the other(s) throws a shutout are a lot lower than that of a normal game going into extra innings.

In the 22 perfect games, none of them have gotten past the 7th inning with the score tied at 0. So I do agree that it could happen, but I think the odds are really low, so it wouldn’t affect the probability a whole lot. But even if it only affects it a little, it definitely would lower the odds that I calculated above.


Name: Ed • Wednesday, June 27, 2012

Thanks for the odds on pitching a perfecto! A good friend of mine was in San Francisco recently and witnessed Matt Cain's perfect game. Actually, he was also at the Mets-Phillies game on Father's Day 1964 to witness Jim Bunning's perfecto. We are wondering about the odds of a single fan witnessing two perfectos... and he has only been to some 50 MLB games ever!


Name: Kevin • Wednesday, June 27, 2012

I would say that your friend is very lucky. I'm glad you told me he's been to 50 games, since the odds of witnessing two perfect games depend on how many games you've seen. The odds also depend on which probability we use, so I'll use both!
• Using what we’ve observed, his odds are 1 in 270,270
• Using what we’d expect, his odds are 1 in 1,666,667. Yes, that’s over 1 in a million.

Now, I’ve thought some more about why the two probabilities in the article are off, and I think I figured out some of it (coming out with a Part II soon). In short, his odds are closer to the 270,270 number than 1.6 million. But either way, he's very fortunate.

And if you’re interested in how to use Minitab to calculate that, you can go to Calc > Probability Distributions > Binomial. Then click “Probability”, enter 50 as your number of trials, 0.000055 as your event probability, and 2 as your input constant. You can use it to find out other stuff too. For example, if you put 1000 as your number of trials and 0 as your input constant, you can get the probability of going to 1,000 baseball games and not seeing a single perfect game (about 95%).
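For readers without Minitab, the same binomial probabilities can be sketched in Python using only the standard library (`math.comb` needs Python 3.8 or later). The helper name `binomial_pmf` is made up for this example. The printed odds may differ a little from the figures quoted above, which look like they were inverted from rounded probabilities.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_observed = 0.000055          # observed perfect-game probability
p_expected = 0.00002134        # probability expected from league-average OBP

two_in_50_obs = binomial_pmf(2, 50, p_observed)
two_in_50_exp = binomial_pmf(2, 50, p_expected)
none_in_1000 = binomial_pmf(0, 1000, p_observed)

print(f"Two perfect games in 50 (observed p): about 1 in {1 / two_in_50_obs:,.0f}")
print(f"Two perfect games in 50 (expected p): about 1 in {1 / two_in_50_exp:,.0f}")
print(f"No perfect games in 1,000 games:      {none_in_1000:.1%}")
```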


Name: Jack Oosterveld • Tuesday, September 4, 2012

My wife and I were on vacation in Seattle from Edmonton, Alberta (what are the chances?). In our 34 years of marriage we had never attended a major league baseball game, so we thought it would be nice to take one in. Well, what are the odds that that afternoon, on August 15, we would witness a PERFECT GAME!!! The thrill, the excitement, and guess who is our favorite team. THE ODDS!! What an afternoon!!


Name: Jimmy • Thursday, December 12, 2013

Hey Kevin, I'm writing a term paper on the movie/novel For Love of the Game and am in need of some information. I'm trying to figure out the mathematical odds of Billy Chapel throwing a perfect game. I noticed that you did research on total games and on-base percentage from 1900 to 2012 and was wondering where you accessed that information. I need this ASAP, please. Thanks and great work here!


Name: Kevin • Thursday, December 12, 2013

Jimmy,

All of my statistics came from http://www.baseball-reference.com

Good luck on your paper!


Name: Eric • Tuesday, March 18, 2014

I think a more correct way to think about it is to have each pitcher's natural OBP be drawn each game from a Beta Distribution with a mean equal to their career average, but a bit of variance. So a very good pitcher with a career average OBP of .305 might have their daily natural OBP be drawn from somewhere between a Beta(305,695) and Beta(30.5,69.5) while a weaker pitcher with a career avg OBP of .250 might have their natural OBP drawn from somewhere between a Beta(250,750) and Beta(25.0,75.0). With this paradigm, we allow for a pitcher to be totally on one day and totally off on another day; not just experiencing the rises and falls of random batter success/failure.


Name: Eric • Tuesday, March 18, 2014

Oops, got that backwards:

*a weaker pitcher with a career avg OBP of .450 might have their natural OBP drawn from somewhere between a Beta(450,550) and Beta(45.0,55.0).


Name: glenn • Wednesday, April 9, 2014

Just curious, did your OBP account for reaching base via error? That would break up a perfect game and may explain some of the difference in the two numbers.


Name: Kevin Rudy • Thursday, April 10, 2014

Glenn, I purposely did not account for errors. For one, I had a lot of data to collect as it was, and didn't have time to get data on errors as well. And second, I figured they are so rare that they wouldn't affect the numbers too much.

And even if they did affect the numbers a lot, it would actually further the difference. Accounting for errors would make it even harder to pitch a perfect game, thereby making it even more puzzling as to why there have been so many.

Eric, I agree that on any given day, a pitcher has the ability to perform above or below his average based on your Beta distribution. If they're having a good day, their odds of a perfect game are higher, and if they're having a bad day it's lower. But then in the long run wouldn't that even out? So if we're trying to determine their probability of throwing a perfect game in general, wouldn't it make the most sense to just use the mean?


Name: Matt • Monday, June 30, 2014

Most people agree that your best chance of hitting against a starter is to get him early. Maybe the more batters you get out, the better chance you have of retiring the next batter. Perhaps you could do a correlation to opponent OBP vs. batters retired previously to figure out how this can cause a drop in OBP. You can use this to update the OBP dynamically. You could test this by running a million monte carlo simulations or so and seeing how many perfect games were thrown.


Name: Kevin Rudy • Tuesday, July 1, 2014

Matt,

I completely agree that looking at OBP as you get more consecutive batters out would be a great analysis. The problem is getting the data. I'm not sure how I would get OBP for different numbers of consecutive batters retired unless I went through individual game logs. And to get a large enough sample size to make the analysis worthwhile would take forever! But if you have any other ideas on how to get the data, let me know!


Name: Frank • Friday, August 22, 2014

Question about a perfect game... If a pitcher was throwing a perfect game, and struck out the batter but the ball got away from the catcher, and the batter ended up on first, is it still considered a perfect game since the out was recorded? And if so, then wouldn't it be possible for that runner to score on stolen bases and a wild pitch or sacrifice fly? So it would be a perfect game even with the run scored? I just know that a description of a perfect game is nobody reaches base, 27 up, 27 down. But technically, in my scenario, that batter did go down as the strikeout was recorded. Right?? Or no??


Name: Kevin Rudy • Tuesday, August 26, 2014

Frank,

If a batter reaches first on a strike 3 passed ball, the pitcher is given a strikeout, but an out is not recorded. So the batter was never actually "out" and thus the perfect game would be ruined.

