
Calculating Baseball Park Factors: Minitab Execs Make it Fast

I’m still exploring my interest in baseball park factors. It intrigues me that parkfactors.com lists 6 neutral parks in Major League Baseball, while the graphs I’ve made from the ESPN scores suggest that only 6 parks are clearly non-neutral.

American League Baseball Parks

National League Baseball Parks

Unfortunately, I’ve noticed that the ESPN park factors don’t match the formula given on the ESPN website. I’m not positive that the discrepancy comes from wrong inputs; the website may employ a more complicated formula than the one it publishes. But lacking a reliable source for the statistic, I’m going to calculate it myself.
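For reference, the calculation used in the rest of this post compares runs per game at home with runs per game on the road, counting the runs scored by both teams. Written out, with symbols that are my own shorthand rather than ESPN's notation (R is runs scored, RA is runs allowed, G is games played):

```latex
\[
\text{ratio} =
\frac{\left(R_{\text{home}} + RA_{\text{home}}\right) / G_{\text{home}}}
     {\left(R_{\text{away}} + RA_{\text{away}}\right) / G_{\text{away}}}
\]
```

A ratio above 1 means more runs per game were scored in the team's home games than in its away games; a ratio below 1 means fewer.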

Because I have to reproduce the same calculation many times to get the data that I want, I’m going to use a Minitab exec to make it go faster. An exec file saves Minitab commands so that you can repeat an analysis without having to use the menus.

Want to try it for yourself? Follow along with the data for the 2012 Giants:

  1. Follow this link to the data source:

http://www.baseball-reference.com/teams/SFG/2012-schedule-scores.shtml

  2. Above the table, choose CSV.
  3. Copy and paste the text into a text editor and save it as a CSV file.

Now that you have the data, let’s set up the exec. Save these commands in a text file with the extension MTB:

#Remove extra header rows
Copy 'Gm_' - 'W-L' 'Gm_' - 'W-L';
  Exclude;
  Where "R = '*'".
#Calculate park factor
Name C22 'away'
Formula 'away' = sum(if(C6="@", c9+c10, 0))
Name C23 'home'
Formula 'home' = sum(if(C6 <> "@", c9+c10, 0))
Name C24 'home games'
Formula 'home games' = sum(if(C6 <> "@", 1, 0))
Name C25 'away games'
#Count the away games directly; count(c9)-c24 should give the same result
Formula 'away games' = sum(if(C6 = "@", 1, 0))
Name C26 'ratio'
Formula 'ratio' = ('home' / 'home games')/('away' / 'away games')
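If you want to cross-check the exec outside Minitab, the same arithmetic can be sketched in Python. This is only a sketch under stated assumptions: the function name park_factor is made up for illustration, and the column positions simply mirror the exec's C6, C9, and C10, so adjust them if your download is laid out differently.

```python
import csv
import io

# Zero-based positions mirroring the exec's C6 ("@" marker), C9 (runs scored),
# and C10 (runs allowed). This layout is an assumption taken from the exec.
AT_COL, R_COL, RA_COL = 5, 8, 9


def park_factor(csv_text):
    """Return (runs per home game) / (runs per away game) for one team."""
    home_runs = away_runs = home_games = away_games = 0
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip header rows (including repeated ones) and rows without scores.
        if len(row) <= RA_COL:
            continue
        if not (row[R_COL].isdigit() and row[RA_COL].isdigit()):
            continue
        total = int(row[R_COL]) + int(row[RA_COL])  # runs by both teams
        if row[AT_COL].strip() == "@":
            away_runs += total
            away_games += 1
        else:
            home_runs += total
            home_games += 1
    return (home_runs / home_games) / (away_runs / away_games)
```

To use it, read the saved CSV file into a string and pass that string to park_factor. The result should agree with the 'ratio' column that the exec produces, as long as the column positions match.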

Now, open the data and run the exec:

  1. Choose File > Open Worksheet.
  2. In Files of Type, select Text (*.csv).
  3. Open the saved CSV file with the data in it.
  4. Choose File > Other Files > Run an Exec.
  5. Press Select File.
  6. Choose the MTB file you saved and click Open.

Now, instead of removing the rows that contain headings instead of data, naming the columns, and calculating the runs scored at home, the runs scored away, the number of home games, and the number of away games by hand, you just run the exec. One step takes a lot less time than many, and that’s always worth your while.

Have an idea about making your own exec? Here’s an example of using an exec to update data from a database, including how to add the exec to a menu or toolbar!

Comments

Name: Jim Bouldin • Tuesday, August 20, 2013

Definitely don't use the ESPN factors, they're a mess.

