The Question
Umpires miss calls. We know this — our data shows an average of 10.9 wrong calls per game. But how much do those wrong calls actually matter? Not all misses are equal: a wrong call at 0-0 barely shifts the at-bat, while a wrong call at 3-2 can be the difference between a walk and a strikeout.
We computed the challenge value of every wrong call in the first 11 days of the 2026 season — the counterfactual wOBA impact at the count state when the call was made. Then we ranked every game by total impact.
The Top 10 Most Impactful Games
Challenge value = sum of counterfactual wOBA for all wrong calls in the game. Higher = more run expectancy shifted by umpire errors. ABS column shows overturned/total challenges.
The Headline Game
CB Bucknor's March 28 BOS @ CIN game stands out: 21 wrong calls, 4.08 total challenge value. That's the highest single-game umpire impact this season: roughly 4 runs of expectancy shifted by wrong calls. The game was decided 6-5, meaning the umpire's errors may have been larger than the margin of victory.
"May have been" is important. Challenge value measures the theoretical run impact of flipping each call — it doesn't tell us which team benefited. A false strike hurts the batter; a missed strike hurts the pitcher. Without knowing the direction of each error, we can say the umpire shifted 4 runs but not that he gave 4 runs to one side.
The Average Game
Across all 137 games in our database, the average total challenge value is ~2.0 wOBA per game. That means on a typical night, umpire errors shift about 2 runs of expectancy. Most of that value concentrates in a few high-leverage wrong calls at deep counts (3-2, 2-2, 1-2).
The distribution is right-skewed: most games have modest impact (1.5-2.5 wOBA), but a few games — usually those with low accuracy and wrong calls at critical counts — reach 3-4 wOBA.
Ron Kulpa Appears Twice
Ron Kulpa is the only umpire to appear twice in the top 10: MIA @ NYY on April 4 (2.75 CV) and TEX @ BAL on March 30 (2.34 CV). His average accuracy across those games is 91.8% — the second-lowest in our sample among umpires with multiple games. He's also a Wild Expander in our zone style framework — the only umpire in MLB whose wrong calls are more often false strikes than missed strikes (FS% = 55%).
Is this a pattern or small-sample noise? With only 2 games, we can't say. But Kulpa's 2025 season data tells the same story: 91.4% accuracy, 12.3 wrong calls per game — consistently one of the highest-impact umpires in the league.
High Accuracy Doesn't Mean Low Impact
This is the counterintuitive finding: accuracy and impact are not as tightly linked as you'd expect. An umpire can be 95% accurate, but if the 5% of misses all come in high-leverage 3-2 counts, the total run impact is large. Conversely, an umpire with 92% accuracy whose misses cluster in 0-0 and 0-1 counts has lower game impact.
The key variable is when the umpire misses, not just how often. A single wrong call at 3-2 (swing value: 0.305 wOBA) is worth about 4.4 times a wrong call at 0-0 (0.070 wOBA). Umpire reports that show only accuracy percentage are missing half the story.
This is why CalledThird tracks challenge value alongside accuracy in every game report.
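To make the comparison concrete, here is a minimal sketch in Python. The two swing values are the ones quoted above; the 100 called pitches per game and the assumption that every miss lands at a single count state are hypothetical simplifications for illustration, not CalledThird data.

```python
# Swing values quoted in the article (counterfactual wOBA per wrong call).
SWING_VALUE = {"3-2": 0.305, "0-0": 0.070}

def total_impact(called_pitches, accuracy, miss_count_state):
    """Total challenge value if every miss occurs at one count state.

    called_pitches and the single-count assumption are hypothetical.
    """
    misses = round(called_pitches * (1 - accuracy))
    return misses * SWING_VALUE[miss_count_state]

# Hypothetical umpire A: 95% accurate, all misses at 3-2.
impact_a = total_impact(100, 0.95, "3-2")
# Hypothetical umpire B: 92% accurate, all misses at 0-0.
impact_b = total_impact(100, 0.92, "0-0")

print(f"A: {impact_a:.3f} wOBA, B: {impact_b:.3f} wOBA")
```

Under these assumptions the more accurate umpire makes fewer misses (5 versus 8) yet accumulates well over twice the total challenge value, because each of those misses carries the 3-2 swing value.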
The ABS Connection
In the 10 highest-impact games, teams used an average of 5.6 ABS challenges (vs. 3.9 across all games) and won an average of 2.4 of them. High-impact games generate more challenges, which makes sense: when the umpire is missing more calls, teams have more opportunities to correct them.
But the challenge system doesn't fully offset the umpire effect. Even after ABS corrections, the residual impact of unchallenged wrong calls remains. Teams get a limited number of challenges per game, and they can't challenge every borderline pitch. The umpire still matters.
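The averages in this section imply an overturn rate and a challenge-volume ratio that are easy to check. The sketch below uses only the three figures quoted above; the variable names are illustrative, and because the inputs are per-game averages, the result is a ratio of averages rather than an average of per-game rates.

```python
# Averages quoted in the article.
challenges_top10 = 5.6   # ABS challenges per game, 10 highest-impact games
challenges_all = 3.9     # ABS challenges per game, all games
won_top10 = 2.4          # successful challenges per game, top-10 games

# Share of challenges that succeeded in the top-10 games.
overturn_rate = won_top10 / challenges_top10
# How many times more challenges the top-10 games saw than the average game.
volume_ratio = challenges_top10 / challenges_all

print(f"Overturn rate in top-10 games: {overturn_rate:.1%}")
print(f"Challenge volume vs. all games: {volume_ratio:.2f}x")
```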
Methodology
Data: 137 games from March 26 – April 5, 2026. Called pitches from Statcast, classified against ABS zone model. ABS challenge data from Baseball Savant gamefeed API.
Challenge value: Counterfactual wOBA at each count state using Tom Tango's RE288 linear weights. A wrong call at 3-2 has swing value 0.305; at 0-0, 0.070. Total CV = sum of swing values for all wrong calls in a game.
Limitations: Challenge value measures total run expectancy shifted, not directional impact on either team. We cannot say which team benefited without per-pitch directional analysis (false strikes hurt batters, missed strikes hurt pitchers). The 11-day sample is small for umpire-level conclusions.
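The challenge-value calculation described above can be sketched as follows. This is an illustration under stated assumptions, not CalledThird's actual pipeline: the `WrongCall` record, its field names, and the sample calls are hypothetical, and the swing-value table contains only the two count states quoted in this article (a full table would need all 12 counts).

```python
from dataclasses import dataclass

# Only the two swing values quoted in the article; a real table would
# need an entry for every count state.
SWING_VALUE = {(3, 2): 0.305, (0, 0): 0.070}

@dataclass(frozen=True)
class WrongCall:
    balls: int
    strikes: int
    kind: str  # "false_strike" or "missed_strike"

def challenge_value(calls, swing_value=SWING_VALUE):
    """Total CV: sum of swing values over all wrong calls in a game."""
    return sum(swing_value[(c.balls, c.strikes)] for c in calls)

def value_by_kind(calls, swing_value=SWING_VALUE):
    """Split the total by error type, a first step toward direction."""
    totals = {"false_strike": 0.0, "missed_strike": 0.0}
    for c in calls:
        totals[c.kind] += swing_value[(c.balls, c.strikes)]
    return totals

# Hypothetical game with three wrong calls.
game = [
    WrongCall(3, 2, "false_strike"),
    WrongCall(0, 0, "missed_strike"),
    WrongCall(0, 0, "false_strike"),
]
total = challenge_value(game)
split = value_by_kind(game)
print(f"total CV = {total:.3f}", split)
```

Splitting by error type still does not say which team benefited; that also requires knowing which side was batting on each pitch.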
Cite this analysis
CalledThird. "The Umpire Effect: How Much Does It Actually Matter?" CalledThird.com, April 6, 2026. https://calledthird.com/analysis/the-umpire-effect
All CalledThird analysis is original research. If you reference our findings, data, or charts in your work, please link back to the original article. For data inquiries: hello@calledthird.com