The postseason is intended to showcase and reward baseball’s best teams, which make their case for inclusion over a revealing 162-game schedule. And while one wouldn’t know it from the invective that accompanies every questionable October call, the same is supposed to be true for umpires. “The goal each and every season,” MLB spokesman Mike Teevan told me via email, “is to have the most deserving Umpires working Postseason games.”
In the small samples of postseason play, the most deserving players and teams don’t always perform the way they did during the regular season. Presumptive every-award winner Clayton Kershaw took two losses in the NLDS after losing three games all year; in the ALDS, probable MVP Mike Trout went 1-for-12. All four division-series favorites were either swept or eliminated in four games. Umpires, too, have been in the spotlight for undesirable reasons. Fourth-year ump Vic Carapazza, working his first postseason, ejected Nationals second baseman Asdrubal Cabrera and manager Matt Williams after they objected to his strike zone in the 10th inning of an 18-inning Game 2. After the Dodgers’ Game 3 loss, outfielder Matt Kemp called umpire Dale Scott’s zone “terrible” and “by far the worst I’ve ever seen.” And with the Nats on the verge of elimination in the ninth inning of Game 4, plate ump Hunter Wendelstedt drew widespread criticism after ringing up shortstop Ian Desmond on what appeared to be a checked swing without appealing to the first-base umpire for a more informed ruling.
Given the elevated stakes and the national profile of postseason games, it’s only natural that umpires would be subjected to greater scrutiny at this time of year and that we’d fixate on every missed call as intently as we dissect every managerial decision. It’s also not surprising that frustrated players, moments after a setback or a demoralizing defeat in the playoffs, would resort to umpire-blaming more quickly than they would after an equivalent call in a meaningless midsummer matchup. In October, more smoke surrounding umpire calls doesn’t necessarily mean more fire. It might just mean that the inevitable mistakes humans make when asked to do an impossible job provoke stronger reactions when the calls matter more.
MLB’s postseason umpire selection system is supposed to minimize those mistakes. “The main component in the selection of Umpires for Postseason assignments is performance during the season,” Teevan wrote. “We factor in results from the Zone Evaluation system for their plate assignments, accuracy on their calls and rulings, and observations of their work by our Supervisory staff. In addition, there is consideration given to an Umpire’s experience level (overall seniority and previous Postseasons), his proficiency at handling situations, health and time missed during the season, and a number of other administrative factors.”
Theoretically, then, umpires who work in the postseason should show above-average regular-season performance, just as the hitters and pitchers on playoff teams will, on the whole, have better stats than those on losing teams. Similarly, postseason umpiring should be more accurate than regular-season umpiring on the whole.
We can look for evidence of increased October umpire quality with data from PITCHf/x, Major League Baseball Advanced Media’s pitch-tracking technology. Calling pitches is only one aspect of an umpire’s job, but it might be the most important, especially now that the expanded instant replay system gives teams recourse to appeal most other kinds of calls.
With assistance from Daren Willman, proprietor of invaluable advanced-stats resource Baseball Savant, I examined umpire correct-call rates from 2009 to 2014, all of which fell into a narrow band between the lower limits of MLB’s tolerance for idiosyncratic strike zones and the upper limits of the human sensory system. Willman classified strikes on called pitches outside the dimensions of the rulebook strike zone and balls on called pitches inside the zone as incorrect calls. Balls on called pitches outside the zone and strikes on called pitches inside the zone were designated as correct calls. Each umpire’s correct-call rate is simply his tally of correct calls divided by all of his calls.
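As a rough illustration of that bookkeeping, here is a minimal Python sketch. The `CalledPitch` type, its field names, and the four-pitch sample are invented for the example; this is not Willman’s actual code, and it assumes each called pitch has already been tagged as inside or outside the rulebook zone.

```python
from dataclasses import dataclass


@dataclass
class CalledPitch:
    """One pitch the batter took (hypothetical record layout)."""
    in_zone: bool        # True if the pitch crossed the rulebook strike zone
    called_strike: bool  # True if the umpire called it a strike


def is_correct(pitch: CalledPitch) -> bool:
    # Correct calls: strike on an in-zone pitch, ball on an out-of-zone pitch.
    return pitch.in_zone == pitch.called_strike


def correct_call_rate(pitches: list[CalledPitch]) -> float:
    # Tally of correct calls divided by all calls.
    if not pitches:
        raise ValueError("need at least one called pitch")
    return sum(is_correct(p) for p in pitches) / len(pitches)


# Made-up sample: two correct calls, two misses.
sample = [
    CalledPitch(in_zone=True, called_strike=True),    # correct strike
    CalledPitch(in_zone=False, called_strike=False),  # correct ball
    CalledPitch(in_zone=False, called_strike=True),   # out-of-zone strike
    CalledPitch(in_zone=True, called_strike=False),   # in-zone ball
]
sample_rate = correct_call_rate(sample)
```

Every miss counts the same here, whether the pitch was an inch or a foot off the plate, which is the simplification the article’s method makes.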
Among the 79 umpires who called at least 3,000 pitches during the 2014 regular season, the difference between the most accurate (Lance Barksdale, 88.6 percent correct calls) and the least accurate (Brian O’Nora, 84.2 percent) was only 4.4 percentage points. Because full-time umps can call several thousand pitches in a season, though, minor differences in accuracy add up: The gap between Barksdale and O’Nora translates to 193 incorrect calls over the course of a typical umpire’s season, or roughly seven per full game behind the plate (which would, on average, be distributed evenly between teams). Most umpires, however, are clustered so closely together that you’d have a hard time telling the good from the bad by watching.
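To see how a 4.4-point gap becomes 193 calls, the arithmetic can be run backward. The short Python sketch below takes the figures quoted above and backs out the workloads they imply; the implied pitch counts are derived here for illustration and are not numbers reported by MLB or Baseball Savant.

```python
# Figures quoted in the text.
barksdale_rate = 0.886   # most accurate qualified umpire, 2014
onora_rate = 0.842       # least accurate qualified umpire, 2014
extra_misses_per_season = 193
extra_misses_per_game = 7

# Gap in correct-call rate, as a fraction of called pitches.
gap = round(barksdale_rate - onora_rate, 3)

# Called pitches implied by those figures (derived, not sourced).
implied_calls_per_season = extra_misses_per_season / gap
implied_calls_per_game = extra_misses_per_game / gap

print(f"gap: {gap:.1%}")
print(f"implied called pitches per season: {implied_calls_per_season:.0f}")
print(f"implied called pitches per game: {implied_calls_per_game:.0f}")
```

Under those figures, a typical season works out to roughly 4,400 called pitches and a full game behind the plate to roughly 160, which squares with the statement that full-time umps call several thousand pitches a year.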
Roughly one-third of umpires who call games from behind the plate during the regular season also do so during the postseason. The following table compares the regular-season accuracy of postseason umps to the regular-season accuracy of all umps. If the best ball/strike-callers are being picked for the postseason, the accuracy of the “postseason only” group should be above the league average.
|Year||All Umps||Postseason Umps Only|
|2009||84.8 percent||84.8 percent|
It’s not. Only in 2009 did postseason umps even match the league average. In each of the past five seasons, umpires selected for postseason duty have been less accurate than their counterparts who spent October at home. The differences are small, but these are enormous samples — more than 850,000 combined called pitches for the postseason umps alone.
As another check, we can see whether the correct-call rate rises in the postseason relative to the regular season.
|Year||Regular Season||Postseason|
|2009||84.8 percent||83.6 percent|
Again, no. In four of the past six seasons, postseason strike zones haven’t been any more accurate than regular-season strike zones.
So what’s going on here? The words Teevan used to describe the league’s goals — to have the “most deserving” umpires, not necessarily the best or most accurate — might be telling. Here’s how this year’s ALCS and NLCS umpiring crews ranked in regular-season correct-call rate (out of 79 qualified umps), along with their accumulated years of service in the majors. (Asterisks denote crew chiefs.)
|Name||Accuracy Rank||Years of Service|
On the American League side, only one member of the seven-man umpiring crew ranked above the median in accuracy rate. The National League crew fares somewhat better but still contains two of the lowest-ranked umps. Every umpire on the list, however, has logged some serious time in a chest protector. With 37 years of service, Joe West is the active leader among all umps, and even Dan Iassogna, the least-experienced ump in either of this year’s championship series, has put in 13 years. In this group, the top-ranked Barksdale, who has 11 years of major league service, would be the new kid on the black. And Hal Gibson, a 33-year-old ump who debuted in July 2013 but finished third in this year’s accuracy ratings, probably wouldn’t understand any of his colleagues’ pop-culture references.
Although West has been a competent pitch-caller in the past, he wasn’t close to being one of this season’s most accurate umps, and it seems like a stretch to say he’s one of the most proficient at “handling situations.” West earned a one-game suspension in September for grabbing Jonathan Papelbon’s jersey during a dispute; ironically, MLB executive VP for baseball operations Joe Torre, who disciplined West, was once on the receiving end of a shove from him in an incident that resulted in a three-game suspension for West. “Cowboy Joe” had a reputation for being a bit of a brawler more than two decades ago, and he regularly ranks among players’ least-favorite officials. However, he’s the president of the powerful World Umpires Association, and no umpire has been on the job longer, which Teevan called a “consideration.” To be fair, West’s calls were reviewed via replay 53 times and overturned only 22 times, according to data from MLB. That’s a 41.5 percent overturn rate, lower than the league-average 47.3 percent. On the whole, postseason umps had a 43.9 percent overturn rate (although that doesn’t tell us whether their calls were challenged more or less often than average).
In theory, giving preference to experienced umps sounds like a strategy that would improve the quality of the calls. In practice, though, umpiring experience might not matter much more than postseason experience. This season saw the biggest crop of rookie umpires in the past several years, as MLB used 11 first-time major league umps to accommodate the need for replay officials. Despite some anecdotal evidence that rookie/fill-in umps are prone to making mistakes, more comprehensive data shows that rookie umps actually have above-average correct-call rates, and that veteran umps (defined here as those who debuted before 2001) are below average. That would suggest that the replacement level for pitch-calling is high, and that accuracy in judging balls and strikes doesn’t tend to improve over time.
|Year||All Umps||Rookie Umps||Veteran Umps|
I chose pre-2001 as the cutoff for “veteran” umps because 2001 was the first season for QuesTec’s Umpire Information System, a precursor to PITCHf/x in umpire evaluation. The increased role of technology in internal umpire reviews has made strike zones more standardized and brought them closer in line with rulebook zones, as indicated by the rising percentages in the tables above. As a result, the shape of the zone has changed to make low strikes more frequent without any reduction in the rate of high strikes. Might it be that rookie umps have been more adaptable (or selected for their adherence to the rulebook zone), and that veteran umps show up as less accurate because they’re still calling strikes the way they were before QuesTec and PITCHf/x?
|% Strikes on Low Pitches||% Strikes on High Pitches|
|Year||All Umps||Rookies||Veterans||Year||All Umps||Rookies||Veterans|
No: Both rookies and vets have evolved to call low strikes (those in the lower third of the zone) at roughly the same rate. Young umpires haven’t driven the downward expansion of the strike zone, and older umps aren’t less accurate because they’re clinging to an outdated understanding of the zone.
We should note that there are more sophisticated ways of studying umpire accuracy than simply counting out-of-zone strikes and in-zone balls as missed calls. For instance, one could use a more probabilistic model and assign fractional misses to each mistake based on how often pitches in a given location are called strikes. MLB’s Zone Evaluation system, which Teevan referenced and which I wrote about last year in an article about automating the strike zone, might make more adjustments to the data that could further refine the numbers. MLB also reviews every non-ball/strike call through its SURE system, whether it was challenged or not, which provides a more complete picture of an umpire’s overall accuracy — although accuracy on non-reviewable calls that can’t be undone, such as the ball/strike judgments that made Kemp and Cabrera mad, might be a more important criterion.
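One way such a fractional scheme might look is sketched below in Python. The weighting rule is an assumption made for illustration (a wrong call is penalized by how rarely umpires make that same call at that location), and `fractional_miss` and `strike_share` are hypothetical names; nothing here describes how MLB’s Zone Evaluation system actually scores calls.

```python
def fractional_miss(in_zone: bool, called_strike: bool, strike_share: float) -> float:
    """Weight a call between 0 (no miss) and 1 (a miss almost nobody makes).

    strike_share is the assumed fraction of taken pitches at this location
    that umpires league-wide call strikes (0.0 to 1.0).
    """
    if not 0.0 <= strike_share <= 1.0:
        raise ValueError("strike_share must be between 0 and 1")
    if in_zone == called_strike:
        return 0.0  # call agrees with the rulebook zone
    # Share of umpires who would have made the same (incorrect) call here.
    same_call_share = strike_share if called_strike else 1.0 - strike_share
    # The more common the mistake at this spot, the smaller the penalty.
    return 1.0 - same_call_share


# A strike just off the edge, where 45 percent of such pitches are called strikes.
borderline = fractional_miss(in_zone=False, called_strike=True, strike_share=0.45)
# A strike well off the plate, where only 2 percent are called strikes.
egregious = fractional_miss(in_zone=False, called_strike=True, strike_share=0.02)
```

Summing these weights instead of counting whole misses would separate umpires who err on coin-flip pitches from those who miss pitches that nearly everyone else gets right.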
The variation in pitch-calling skill among big league umpires is slim enough that choosing one umpire over another might not make any difference in a given game, just as a manager’s decision to signal for a slightly less effective reliever won’t usually cost his team a win. However, it takes only one crucial blown call to inflame a fan base. By allowing longevity, missed time, and politics to play a role in postseason umpire selection, the sport might be making the same mistake Don Mattingly and Matt Williams made in the NLDS: leaving the best available options on the bench at the most important point in the season. Major League Baseball sought to make postseason assignments more flexible in the last round of collective bargaining with the union, but it could behoove MLB to further redefine “deserving” when the league and its umpires negotiate a new contract at the end of this year.
Nothing preoccupies players, managers and fans during the postseason like an umpire’s strike-zone judgment. While there’s plenty of outcry over balls and strikes in the regular season too, the stakes are much higher — and the complaints correspondingly louder — when an entire championship can hang on one errant call.
But although this year’s playoffs have contained a couple of poorly called games, it’s not quite time to kill the umpires yet. Their overall strike-zone accuracy this postseason has not been significantly lower than what it was in the regular season, nor has it been much worse than you’d expect if you picked a handful of MLB games at random.
Take, for instance, the Chicago Cubs’ NL Division Series Game 1 loss to the St. Louis Cardinals, in which both fans and players found fault with an inconsistent strike zone. During that game, home plate umpire Phil Cuzzi missed 15 ball-strike calls, for a total accuracy rate of 88.4 percent. That’s bad (it ranks among the worst 10 percent of all MLB games this season), but consider as well that 10 postseason games have been played so far. In a sample of that size, the probability that we wouldn’t see a game called as poorly as the Cubs-Cardinals series opener was only about 34.9 percent, so the odds were good that some team was going to be on the receiving end of a bunch of bad calls. The Cubs simply had the misfortune of being that team.
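The 34.9 percent figure can be reproduced with a few lines of Python. This sketch assumes the Cubs-Cardinals game sits right at the 10 percent threshold and that games are independent of one another; both are simplifications made for the example and may not match the exact inputs behind the number above.

```python
p_bad_game = 0.10   # chance a single game lands in the worst 10 percent
games_played = 10   # postseason games played so far

# Probability that none of the 10 games is called that poorly.
p_no_bad_game = (1 - p_bad_game) ** games_played

# Probability that at least one game is.
p_at_least_one = 1 - p_no_bad_game

# Called pitches implied by 15 misses at 88.4 percent accuracy.
implied_called_pitches = 15 / (1 - 0.884)

print(f"no game that bad: {p_no_bad_game:.1%}")
print(f"at least one: {p_at_least_one:.1%}")
print(f"called pitches in the game: {implied_called_pitches:.0f}")
```

In other words, roughly two times out of three, a 10-game stretch will include at least one game as rough as Cuzzi’s.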
Despite that bad game, there’s nothing statistically unusual about this postseason’s umpiring performance. In the playoffs as of Oct. 10, the umps have an accuracy rate of 91.4 percent. Depending on how you feel about robot umps, that kind of accuracy may seem unacceptable, but it’s right in line with the season-long MLB average of 91.6 percent. And if you randomly chose a set of 10 games from the regular season, you’d find that the umpires were less accurate than they’ve been this postseason about 36 percent of the time.
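A comparison like that last one can be approximated by resampling. The Python sketch below is one plausible way to do it, not necessarily the method used for the 36 percent figure: it draws random 10-game sets and counts how often their combined accuracy falls below a benchmark. The function name, parameters, and the per-game totals in the usage example are made-up placeholders, not real umpire data.

```python
import random


def share_of_samples_below(games, benchmark, sample_size=10, trials=10_000, seed=0):
    """Estimate how often a random set of games is called less accurately.

    games: list of (correct_calls, total_calls) tuples, one per game.
    benchmark: accuracy rate to compare against (e.g. 0.914).
    """
    if len(games) < sample_size:
        raise ValueError("not enough games to sample from")
    rng = random.Random(seed)  # seeded so the estimate is repeatable
    below = 0
    for _ in range(trials):
        picked = rng.sample(games, sample_size)  # draw without replacement
        correct = sum(c for c, _ in picked)
        total = sum(t for _, t in picked)
        if correct / total < benchmark:
            below += 1
    return below / trials


# Placeholder season: half the games at 90 percent, half at 93 percent.
fake_season = [(135, 150)] * 50 + [(140, 150)] * 50
estimate = share_of_samples_below(fake_season, benchmark=0.914)
```

With real per-game call counts in place of `fake_season`, the returned share would be the analogue of the 36 percent quoted above.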
Because MLB bases its postseason crew assignments on merit, we might expect the playoff umpires to have a better accuracy rate than the overall regular-season average. But as my Grantland colleague Ben Lindbergh noted last year, since 2009 there’s been essentially no difference in strike-zone accuracy between regular-season and postseason games. So although it would be nice if the strike zone were being called more precisely in the playoffs, the umpires’ execution so far is almost exactly what we’d expect based on their performance during the regular season.