Keeping Score on the Ump

This post is by Matt Eckerle, on work done during his “sprintbatical.” The Climate Corporation has “sprintbaticals,” two-week breaks to work on something a little different; in this case, applying some exploratory data analysis to baseball rather than to agriculture.

At some point in my mid-20s, I started getting really interested in obsessing over baseball. Something about the pace and complexity of the game really got to me. The more I watched, the more I learned, and the more it fascinated me. Around that time, Sportvision came out with PITCHf/x, a system that now tracks the trajectory of every pitch thrown in Major League Baseball™. MLB™ puts the PITCHf/x data (along with much, much more) on their Gameday web server without any firewall, so anyone can get in there and start crunching the data, even as the game is being played! As a baseball-obsessed developer, it was only a matter of time before I got into it.

One thing that drives everybody nuts in baseball is blown calls at the plate. Sometimes an umpire will call a ball that looks way outside the plate a strike, striking out your team’s batter. Or sometimes the ump will call a ball on what looks like strike three, keeping the other team’s batter alive for one more pitch. What’s particularly frustrating about this is that you cannot contest a called pitch in Major League Baseball. Get over it. Talk to the hand. Zip it. But there is an expression that goes, “the truth is in the data.” Could analysis of PITCHf/x data shed light on the quality of umpires’ calls? I set out to answer the question.

The rule book strike zone changes for every pitch based on the stance of the batter, so I needed a normalized coordinate system to aggregate called pitches over time. I came up with a coordinate system where each coordinate at the corners of the strike zone has a magnitude of 1, and built a CalledPitch model to transform itself from the PITCHf/x coordinates to my “strikezone” coordinates.

class CalledPitch:
    """
    Class representing useful data for a called pitch
    Inputs:
        px: Horizontal distance from center of plate, ft (float)
        pz: Vertical distance above plate, ft (float)
        sz_top: Top of strike zone, ft (float)
        sz_bot: Bottom of strike zone, ft (float)
        strike: Called strike (boolean)
    """
    def __init__(self, px, pz, sz_top, sz_bot, strike):
        self.px = px
        self.pz = pz
        self.sz_top = sz_top
        self.sz_bot = sz_bot
        self.strike = strike

    def strikezone_coords(self):
        """
        Pitch coordinates normalized to strikezone
        """
        # x is normalized to half home plate width
        half_home_plate = 17.0 / 2 / 12  # feet
        x = self.px / half_home_plate
        # z is normalized to half the strikezone height, measured from its center
        h = (self.sz_top - self.sz_bot) / 2
        z = (self.pz - (self.sz_bot + h)) / h
        return x, z
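
As a quick sanity check on the transform, here is a minimal sketch that restates it as a standalone function and runs it on a made-up batter. The function name, the constant name, and the 1.5 ft to 3.5 ft zone are illustrative assumptions, not values from the PITCHf/x data.

```python
# Half of the 17-inch home plate, expressed in feet.
HALF_PLATE_FT = 17.0 / 2 / 12

def strikezone_coords(px, pz, sz_top, sz_bot):
    """Normalize a pitch location so the zone corners land at (+/-1, +/-1)."""
    x = px / HALF_PLATE_FT
    half_height = (sz_top - sz_bot) / 2
    z = (pz - (sz_bot + half_height)) / half_height
    return x, z

# Hypothetical batter whose zone runs from 1.5 ft to 3.5 ft.
corner = strikezone_coords(HALF_PLATE_FT, 3.5, 3.5, 1.5)  # top corner of the zone
middle = strikezone_coords(0.0, 2.5, 3.5, 1.5)            # dead center
print(corner, middle)
```

A pitch on the top corner of the zone maps to (1, 1) and one down the middle maps to (0, 0), whatever the batter's height, which is what lets pitches to different batters share one grid.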

With this done, I can create probability matrices of the likelihood that a pitch in any sector of the strikezone coordinates is called a strike using histogram2d. This kind of data of course leads directly to geeking out over heat maps, like this one I generated from every called pitch in the 2012 regular season.
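
The binning step might look roughly like the sketch below. The post uses histogram2d (presumably NumPy's); to stay dependency-free, this version does the same counting in plain Python, dividing called strikes by all called pitches per tile. The function name, the 20-tile resolution, and the plotting range of ±2 are assumptions, since the post does not state them.

```python
def strike_probability_grid(pitches, bins=20, extent=2.0):
    """
    pitches: iterable of (x, z, strike) in strikezone coordinates.
    Returns grid[row][col], the fraction of called pitches in that tile
    that were called strikes, or None for tiles with no pitches.
    Rows run along z and columns along x, covering [-extent, extent).
    """
    tile = 2 * extent / bins
    called = [[0] * bins for _ in range(bins)]
    strikes = [[0] * bins for _ in range(bins)]
    for x, z, strike in pitches:
        if not (-extent <= x < extent and -extent <= z < extent):
            continue  # ignore pitches outside the plotted range
        col = min(int((x + extent) / tile), bins - 1)
        row = min(int((z + extent) / tile), bins - 1)
        called[row][col] += 1
        strikes[row][col] += bool(strike)
    return [
        [strikes[r][c] / called[r][c] if called[r][c] else None
         for c in range(bins)]
        for r in range(bins)
    ]

# Toy data: three pitches near the middle, one far corner, one out of range.
toy = [(0.1, 0.1, True), (0.2, 0.3, True), (0.4, 0.2, False),
       (1.9, 1.9, False), (5.0, 0.0, True)]
grid = strike_probability_grid(toy, bins=4, extent=2.0)
```

With NumPy, the same grid would come from dividing a 2D histogram of called strikes by a 2D histogram of all called pitches over identical bin edges.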

Probability of a pitch being called a strike, based on pitches called in the 2012 regular season

Suddenly, my goal of measuring the quality of umpires’ calls doesn’t seem so hard. I can measure their ability to call balls and strikes based on the fraction of tiles in and out of the strike zone that are classified correctly at least half the time:

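ball score = (tiles outside the strike zone with strike probability < 0.5) / (total tiles outside the strike zone)

strike score = (tiles inside the strike zone with strike probability >= 0.5) / (total tiles inside the strike zone)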

I can combine these two scores to make an overall score of the accuracy of the called strike zone:

total score = strike score – (1 – ball score)

Applying these calculations over the data pictured above, I get

Strike Score 0.9725
Ball Score 0.8650
Total Score 0.8375
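
One way the three scores defined above could be computed from a probability grid is sketched here. It assumes a square grid in strikezone coordinates, decides whether a tile is inside the zone by its center, and skips tiles with no pitches; those choices, the function name, and the ±2 range are assumptions rather than details given in the post.

```python
def zone_scores(grid, extent=2.0):
    """
    grid: square list of lists of strike probabilities (None = no data),
    rows along z, columns along x, covering [-extent, extent) on both axes.
    Assumes at least one populated tile inside and outside the zone.
    Returns (ball_score, strike_score, total_score).
    """
    bins = len(grid)
    tile = 2 * extent / bins
    inside_ok = inside_total = outside_ok = outside_total = 0
    for row in range(bins):
        for col in range(bins):
            p = grid[row][col]
            if p is None:
                continue  # no called pitches landed in this tile
            x = -extent + (col + 0.5) * tile  # tile center
            z = -extent + (row + 0.5) * tile
            if abs(x) <= 1 and abs(z) <= 1:
                inside_total += 1
                inside_ok += p >= 0.5   # mostly called a strike: correct
            else:
                outside_total += 1
                outside_ok += p < 0.5   # mostly called a ball: correct
    strike_score = inside_ok / inside_total
    ball_score = outside_ok / outside_total
    total_score = strike_score - (1 - ball_score)
    return ball_score, strike_score, total_score

# Check the total-score formula against the published 2012 numbers.
published_total = 0.9725 - (1 - 0.8650)
print(round(published_total, 4))  # 0.8375
```

The total score is the strike score minus the fraction of outside tiles that are usually called strikes, so a perfect zone scores 1.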

So it seems that umpires are much more likely to call a ball a strike than they are to call a strike a ball, which essentially stacks the odds in the pitcher’s favor. Pitchers are innocent until proven guilty. This is in fact exactly how the umpire is instructed to call close pitches. But assuming that the rule book strike zone is the true goal, let’s see what happens if we rank all the umpires based on the total score.

The Five Best Umpires, by strike zone accuracy (2012 regular season, minimum 1000 pitches called)

  1. Todd Tichenor
  2. Phil Cuzzi
  3. Jerry Meals
  4. Lance Barksdale
  5. Mike Everitt

I think it’s notable that Jerry Meals, famous for blowing calls involving tagged base runners, had the third most accurate strike zone in MLB in 2012.
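
The ranking step can be sketched as a group, filter, and sort. In the sketch below the record layout, the field names, the umpire labels, and the stand-in scoring function are all hypothetical; in the real analysis the score would be the total score computed from each umpire's own probability grid.

```python
from collections import defaultdict

def rank_groups(pitches, key, score_fn, min_pitches=1000):
    """Group pitches by `key`, drop small groups, and sort best score first."""
    groups = defaultdict(list)
    for pitch in pitches:
        groups[pitch[key]].append(pitch)
    scored = [
        (name, score_fn(group))
        for name, group in groups.items()
        if len(group) >= min_pitches
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

def fraction_correct(group):
    """Stand-in score: share of calls flagged correct (hypothetical field)."""
    return sum(p["correct"] for p in group) / len(group)

# Toy data with a tiny threshold so the example stays readable.
toy_pitches = (
    [{"umpire": "Ump A", "correct": c} for c in (True, True, False)]
    + [{"umpire": "Ump B", "correct": c} for c in (True, True, True)]
    + [{"umpire": "Ump C", "correct": True}]  # too few pitches to rank
)
ranking = rank_groups(toy_pitches, "umpire", fraction_correct, min_pitches=3)
print(ranking)
```

Passing a different key, such as a batter field, reuses the same function for other groupings.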

Can we apply this sort of methodology to other aggregations and get meaningful results? Let’s look at one other grouping, by batter. Presumably, some batters are somehow able to get better strike zones than others. First we’ll make up a score for this:

good for batter score = ball score – strike score

This score will be positive if the batter has a strike zone called in his favor. Let’s see who the top three batters are.
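
To get a feel for the sign of this score, here is a small worked example that plugs in the league-wide 2012 ball and strike scores reported earlier. The function name is just an illustrative label for the formula.

```python
def good_for_batter_score(ball_score, strike_score):
    """Positive when the called zone favors the batter, negative otherwise."""
    return ball_score - strike_score

# League-wide 2012 values from the scores reported earlier in the post.
league = good_for_batter_score(0.8650, 0.9725)
print(round(league, 4))  # -0.1075
```

The league-wide value is negative, which matches the earlier observation that the called zone leans toward the pitcher, so a batter needs an unusually tight zone to score above zero.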

The Three Most Favored Batters (2012 regular season, minimum 1000 pitches called)

  1. B.J. Upton
  2. Yonder Alonso
  3. Mike Napoli

Here’s what the best “batter strike zone” in MLB looked like in 2012. It belongs to B.J. Upton, who was in fact the only batter with a favorable score:

Best “batter strike zone” based on regular season 2012, minimum 1000 called pitches

Now let’s see the bottom three:

The Three Least Favored Batters (2012 regular season, minimum 1000 pitches called)

  1. Lucas Duda
  2. Josh Reddick
  3. Neil Walker

And here is the worst “batter strike zone” in MLB in 2012:

Worst “batter strike zone” based on regular season 2012, minimum 1000 called pitches

So the next time you see a batter slam his bat to the ground in anger after being called out on a strike that may have been a ball, think about Lucas Duda. His strike zone is consistently called nearly a half plate width outside the plate! Meanwhile I’m wondering, is there something these poor guys are doing that brings on this home plate umpire wrath? The truth is in the data, and more analysis may bring it out!
