*This post is by Matt Eckerle, on work done during his ‘Sprintbattical’. The Climate Corporation has “sprintbatticals,” two-week breaks to work on something a little different: in this case, applying some exploratory data analysis to baseball rather than to agriculture.*

At some point in my mid-20s, I started ~~getting really interested in~~ obsessing over baseball. Something about the pace and complexity of the game really got to me. The more I watched, the more I learned, and the more it fascinated me. Around that time, Sportvision came out with PITCHf/x, a system that now tracks the trajectory of every pitch thrown in Major League Baseball™. MLB™ puts the PITCHf/x (along with much, much more) data on their Gameday web server without any firewall, so anyone can get in there and start crunching the data, even as the game is being played! As a baseball-obsessed developer, it was only a matter of time before I got into it.

One thing that drives everybody nuts in baseball is blown calls at the plate. Sometimes an umpire will call a ball that looks way outside the plate a strike, striking out your team’s batter. Or sometimes the ump will call a ball on what looks like strike three, keeping the other team’s batter alive for one more pitch. What’s particularly frustrating about this is that *you cannot contest a called pitch in Major League Baseball*. Get over it. Talk to the hand. Zip it. But there is an expression that goes, “the truth is in the data.” Could analysis of PITCHf/x data shed light on the quality of umpires’ calls? I set out to answer the question.

The rule book strike zone changes for every pitch based on the stance of the batter, so I needed a normalized coordinate system to aggregate called pitches over time. I came up with a coordinate system where the coordinates at the corners of the strike zone have a magnitude of 1, and built a `CalledPitch` model that transforms itself from the PITCHf/x coordinates to my “strikezone” coordinates.

    class CalledPitch:
        """
        Class representing useful data for a called pitch.

        Inputs:
            px: Horizontal distance from center of plate, ft (float)
            pz: Vertical distance above plate, ft (float)
            sz_top: Top of strike zone, ft (float)
            sz_bot: Bottom of strike zone, ft (float)
            strike: Called strike (boolean)
        """

        def __init__(self, px, pz, sz_top, sz_bot, strike):
            self.px = px
            self.pz = pz
            self.sz_top = sz_top
            self.sz_bot = sz_bot
            self.strike = strike

        def strikezone_coords(self):
            """Pitch coordinates normalized to strikezone."""
            # x is normalized to half home plate width
            half_home_plate = 17.0 / 2 / 12  # feet
            x = self.px / half_home_plate
            # z is measured from the zone's center and normalized to half its height
            h = (self.sz_top - self.sz_bot) / 2
            z = (self.pz - (self.sz_bot + h)) / h
            return x, z
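As a quick sanity check of the normalization, the sketch below restates the same transform as a standalone function and confirms that the four corners of the rule book zone land at (±1, ±1). The function name `to_strikezone_coords` and the sample strike zone heights are made up for illustration.

```python
HALF_PLATE_FT = 17.0 / 2 / 12  # home plate is 17 inches wide


def to_strikezone_coords(px, pz, sz_top, sz_bot):
    """Normalize PITCHf/x coordinates so the zone corners sit at (+/-1, +/-1)."""
    half_height = (sz_top - sz_bot) / 2
    center = sz_bot + half_height
    return px / HALF_PLATE_FT, (pz - center) / half_height


# Illustrative strike zone for one batter (assumed values, in feet).
sz_top, sz_bot = 3.5, 1.5
corners = [
    to_strikezone_coords(px, pz, sz_top, sz_bot)
    for px in (-HALF_PLATE_FT, HALF_PLATE_FT)
    for pz in (sz_bot, sz_top)
]
print(corners)
```

A pitch over the middle of the plate at the vertical center of the zone maps to (0, 0), and anything with |x| > 1 or |z| > 1 is outside the rule book zone.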

With this done, I can use NumPy’s `histogram2d` to build probability matrices giving the likelihood that a pitch in any tile of the strikezone coordinates is called a strike. This kind of data of course leads directly to geeking out over heat maps, like this one I generated from every called pitch in the 2012 regular season.
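Here is a minimal sketch of how such a probability matrix could be built with `numpy.histogram2d`: one histogram of all called pitches, one of called strikes only, divided tile by tile. The function name, the 20×20 grid, and the ±2 extent are assumptions for illustration, and the pitches are synthetic calls from a “perfect umpire” rather than real data.

```python
import numpy as np


def strike_probability(x, z, strike, bins=20, extent=2.0):
    """Fraction of called pitches in each tile that were called strikes."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    strike = np.asarray(strike, dtype=bool)
    edges = np.linspace(-extent, extent, bins + 1)
    total, _, _ = np.histogram2d(x, z, bins=[edges, edges])
    strikes, _, _ = np.histogram2d(x[strike], z[strike], bins=[edges, edges])
    # Tiles with no called pitches get NaN rather than a misleading 0.
    with np.errstate(invalid="ignore", divide="ignore"):
        prob = np.where(total > 0, strikes / total, np.nan)
    return prob, edges


# Synthetic data: an umpire who calls the rule book zone exactly.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 5000)
z = rng.uniform(-2, 2, 5000)
strike = (np.abs(x) < 1) & (np.abs(z) < 1)
prob, edges = strike_probability(x, z, strike)
print(prob.shape)
```

The first axis of `prob` indexes x tiles and the second indexes z tiles, so it needs a transpose before being drawn as an image with z on the vertical axis.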

Suddenly, my goal of measuring the quality of umpires’ calls doesn’t seem so hard. I can measure their ability to call balls and strikes based on the fraction of tiles in and out of the strike zone that are classified correctly at least half the time:

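*ball score = (tiles outside the strike zone with strike probability < 0.5) / (total tiles outside the strike zone)*

*strike score = (tiles inside the strike zone with strike probability >= 0.5) / (total tiles inside the strike zone)*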

I can combine these two scores to make an overall score of the accuracy of the called strike zone:

*total score = strike score – (1 – ball score)*
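The three scores can be sketched as a single function over a tile-by-tile strike probability matrix. This is only an illustration: the name `zone_scores` is made up, a tile is assumed to count as inside the zone when its center is, tiles with no pitches are skipped, and the 4×4 matrix is toy data.

```python
import numpy as np


def zone_scores(prob, edges):
    """Strike, ball, and total scores from a strike probability matrix."""
    centers = (edges[:-1] + edges[1:]) / 2
    cx, cz = np.meshgrid(centers, centers, indexing="ij")
    inside = (np.abs(cx) < 1) & (np.abs(cz) < 1)
    seen = ~np.isnan(prob)  # skip tiles with no called pitches

    strike_score = np.mean(prob[inside & seen] >= 0.5)
    ball_score = np.mean(prob[~inside & seen] < 0.5)
    total_score = strike_score - (1 - ball_score)
    return strike_score, ball_score, total_score


# Toy matrix: 4 tiles inside the zone, 12 outside.
edges = np.linspace(-2, 2, 5)
prob = np.full((4, 4), 0.1)
prob[1:3, 1:3] = [[1.0, 0.9], [0.8, 0.4]]  # one inside tile is mostly called a ball
prob[0, 0] = 0.6                           # one outside tile is mostly called a strike
strike_score, ball_score, total_score = zone_scores(prob, edges)
print(strike_score, ball_score, total_score)
```

In the toy data, 3 of 4 inside tiles and 11 of 12 outside tiles are classified correctly, so the total score is 0.75 − 1/12, or about 0.667.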

Applying these calculations over the data pictured above, I get

| Score | Value |
| --- | --- |
| Strike Score | 0.9725 |
| Ball Score | 0.8650 |
| Total Score | 0.8375 |

So it seems that umpires are much more likely to call a ball a strike than they are to call a strike a ball, which essentially stacks the odds in the pitcher’s favor. Pitchers are innocent until proven guilty. This is in fact exactly how the umpire is instructed to call close pitches. But assuming that the rule book strike zone is the true goal, let’s see what happens if we rank all the umpires based on the total score.

**The Five Best Umpires, by strike zone accuracy (2012 regular season, minimum 1000 pitches called)**

- Todd Tichenor
- Phil Cuzzi
- Jerry Meals
- Lance Barksdale
- Mike Everitt

I think it’s notable that Jerry Meals, famous for blowing calls involving tagged base runners, had the third most accurate strike zone in MLB in 2012.
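The ranking step itself is simple: drop anyone under the pitch minimum, then sort by total score. A rough sketch, assuming per-umpire totals have already been computed; the function name, the input shape, and the placeholder umpires and numbers are all invented for illustration and are not real results.

```python
def rank_umpires(stats, min_pitches=1000, top=5):
    """Rank umpires by total score; stats maps name -> (total_score, pitches_called)."""
    eligible = {
        name: score
        for name, (score, pitches) in stats.items()
        if pitches >= min_pitches
    }
    return sorted(eligible, key=eligible.get, reverse=True)[:top]


# Placeholder data only.
stats = {
    "Umpire A": (0.84, 3200),
    "Umpire B": (0.91, 450),   # best score, but too few pitches to qualify
    "Umpire C": (0.88, 2900),
    "Umpire D": (0.79, 1500),
}
ranking = rank_umpires(stats, top=2)
print(ranking)
```

The minimum matters because an umpire with only a few hundred called pitches leaves many tiles nearly empty, which makes the per-tile probabilities noisy.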

Can we apply this sort of methodology to other aggregations and get meaningful results? Let’s look at one other grouping, by batter. Presumably, some batters are somehow able to get better strike zones than others. First we’ll make up a score for this:

*good for batter score = ball score – strike score*

This score will be positive if the batter has a strike zone called in his favor. Let’s see who the top three batters are.
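For a sense of scale, here is the same formula applied to the league-wide strike and ball scores reported earlier. The function name is made up for this sketch.

```python
def good_for_batter_score(ball_score, strike_score):
    """Positive when the called zone favors the batter, per the definition above."""
    return ball_score - strike_score


# League-wide 2012 scores from the table earlier in the post.
league_wide = good_for_batter_score(ball_score=0.8650, strike_score=0.9725)
print(round(league_wide, 4))  # prints -0.1075
```

Under this definition the league-wide score is negative, so a batter needs a noticeably friendlier zone than the aggregate to end up with a positive score.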

**The Three Most Favored Batters (2012 regular season, minimum 1000 pitches called)**

- B.J. Upton
- Yonder Alonso
- Mike Napoli

Here’s what the best “batter strike zone” in MLB looked like in 2012 (in fact the only batter with a favorable score, B.J. Upton):

Now let’s see the bottom three:

**The Three Least Favored Batters (2012 regular season, minimum 1000 pitches called)**

- Lucas Duda
- Josh Reddick
- Neil Walker

And here is the worst “batter strike zone” in MLB in 2012:

So the next time you see a batter slam his bat to the ground in anger after being called out on a strike that may have been a ball, think about Lucas Duda. His strike zone is consistently called nearly a half plate width outside the plate! Meanwhile I’m wondering, *is there something these poor guys are doing that brings on this home plate umpire wrath?* The truth is in the data, and more analysis may bring it out!
