
Metrics Glossary

CalledThird invents some metrics and inherits others. This page is the reference: every metric we use, what it measures, and how to read the numbers. Each entry links to the article that introduced or validated it.

Umpire metrics

IOAI — In-Out Asymmetry Index

How much more (or less) generous an umpire is on the outside corner versus the inside corner, relative to the league baseline.

Formula
Residual called-strike rate on the 4-inch outside edge band (|plate_x| ∈ [0.83, 1.17] ft, normalized so “outside” is always away from the batter) minus residual called-strike rate on the inside edge band. Residual = empirical − baseline-model expected.
Range
Roughly −0.20 to +0.20 in 2025 data. Positive = outside-generous. Negative = inside-generous.
Worked example
Stu Scheurwater: +0.120 (12 percentage points more generous on the outside corner than on the inside corner, relative to league baseline). Alex MacKay: −0.070. The full 2025 league spans 18.9 percentage points.
Reliability
Most stable umpire feature. Cross-half 2025 correlation r = 0.51. Cross-method per-umpire correlation r = 0.98.
Read more
The 19-Point Strike Zone
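
A minimal sketch of the IOAI formula above, not our production pipeline. It assumes each called pitch is a dict with hypothetical fields `plate_x` (feet), `stand` ("L" or "R"), `called_strike` (0 or 1), and `expected` (the baseline model's strike probability). The sign flip for left-handed batters is also an assumption about the coordinate convention.

```python
from statistics import mean

EDGE_LO, EDGE_HI = 0.83, 1.17  # edge band in feet, from the formula above


def residual_rate(pitches):
    """Empirical called-strike rate minus baseline-expected rate."""
    empirical = mean(p["called_strike"] for p in pitches)
    expected = mean(p["expected"] for p in pitches)
    return empirical - expected


def ioai(pitches):
    """Outside-band residual minus inside-band residual."""
    outside, inside = [], []
    for p in pitches:
        # Assumption: positive plate_x is away from a right-handed batter,
        # so the sign is flipped for left-handed batters.
        x = p["plate_x"] if p["stand"] == "R" else -p["plate_x"]
        if EDGE_LO <= x <= EDGE_HI:
            outside.append(p)
        elif -EDGE_HI <= x <= -EDGE_LO:
            inside.append(p)
    return residual_rate(outside) - residual_rate(inside)
```

The sketch raises an error if either band is empty, which is reasonable for an umpire with too few edge pitches to score.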

EAR — Edge Aggression Rate

How aggressively an umpire patrols the rulebook edge band — does he give the edge or shrink the zone?

Formula
Residual called-strike rate on pitches with zone_dist_inches ∈ [0, 2] — the “shadow band” just outside the rulebook zone.
Range
Roughly −0.10 to +0.10. Positive = gives the edge. Negative = shrinks the zone.
Reliability
Second-most stable feature. Cross-half r ≈ 0.5. Persistent across both our independent statistical pipelines.
Read more
The 19-Point Strike Zone

BSR — Borderline Strike Rate

Of pitches in the borderline region (~3 inches from the rulebook edge), what fraction are called strikes? A descriptive aggregate, not residualized.

Range
League average is about 50%. Aggressive umpires push it to 60% or higher; conservative umpires fall to about 40%.
Note
Used in our Four Kinds of Zone 2×2 (accuracy × BSR). BSR is related to EAR but is descriptive rather than residualized — it doesn’t correct for the location distribution of pitches.
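
To make the EAR versus BSR distinction concrete, here is a hedged sketch. It assumes a hypothetical signed field `zone_dist_inches` (negative inside the rulebook zone, positive outside) and treats the BSR borderline region as 3 inches on either side of the edge; both are assumptions, not confirmed definitions.

```python
from statistics import mean


def ear(pitches):
    """Residual called-strike rate in the 0 to 2 inch shadow band."""
    band = [p for p in pitches if 0 <= p["zone_dist_inches"] <= 2]
    empirical = mean(p["called_strike"] for p in band)
    expected = mean(p["expected"] for p in band)
    return empirical - expected


def bsr(pitches, width=3.0):
    """Raw called-strike rate within `width` inches of the rulebook edge."""
    band = [p for p in pitches if abs(p["zone_dist_inches"]) <= width]
    return mean(p["called_strike"] for p in band)
```

EAR subtracts the baseline expectation, so it is insensitive to where pitches happened to land; BSR does not, so two umpires with identical tendencies can post different BSRs if they saw different pitch locations.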

Wrong Calls / Game

The average count of false strikes (called strikes that were balls) plus missed strikes (called balls that were strikes) per game an umpire works.

League average 2025
About 10.9 wrong calls per game (about 7.2% of all called pitches).
Read more
The Umpire Effect
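
The definition above can be sketched as follows, assuming hypothetical fields `game_id`, `called_strike` (the umpire's call), and `in_zone` (whether the pitch was a rulebook strike) on each called pitch.

```python
def wrong_calls_per_game(calls):
    """Average false strikes plus missed strikes per game worked."""
    games = {c["game_id"] for c in calls}
    # A call is wrong when the umpire's call disagrees with the rulebook zone:
    # a false strike (strike called on a ball) or a missed strike (ball called on a strike).
    wrong = sum(1 for c in calls if bool(c["called_strike"]) != bool(c["in_zone"]))
    return wrong / len(games)
```

As a consistency check on the league figures, 10.9 wrong calls at 7.2% of called pitches implies roughly 10.9 / 0.072 ≈ 151 called pitches per game.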

HLB — High-Low Bias DISPUTED

Tendency to call strikes more (or less) readily at the top of the zone than at the bottom. We don’t feature it.

Status
Method-dependent. One of our pipelines finds r = 0.42 cross-half; the other finds r = 0.19 with a 95% CI that crosses zero. We report it but don’t rank umpires on it.

CCZE — Count-Conditioned Zone Expansion NOISE (per-umpire)

Per-umpire tendency to give a wider zone on 3-0 vs 0-2 counts. Real at the population level, not at the individual umpire level.

Why it’s here
The broadcast claim “Ump X expands the zone with two strikes” doesn’t hold up — cross-half r ≈ 0 in both our methods. We report it as the canonical example of a folk-wisdom metric that’s actually noise.
Read more
The myth of the two-strike zone
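
The cross-half r values quoted throughout this section are split-half correlations. A minimal sketch, assuming the split is by season half and that each half is a dict mapping a hypothetical umpire identifier to that umpire's metric value:

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def cross_half_r(first_half, second_half):
    """Correlate per-umpire values across two halves of a season."""
    # Only umpires who appear in both halves can be compared.
    umpires = sorted(set(first_half) & set(second_half))
    return pearson([first_half[u] for u in umpires],
                   [second_half[u] for u in umpires])
```

An r near 0.5 (IOAI, EAR) means first-half values carry real information about the second half; an r near 0 (per-umpire CCZE) means they carry essentially none.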

Pitcher metrics

Tunneling Divergence

How separated a pitcher’s pitch types are at the plate, after looking identical at the hitter’s decision point ~23.9 ft from the rubber.

Formula
Jensen-Shannon divergence between per-pitch-type centroids at decision-point coordinates (dec_x, dec_z) vs final plate coordinates (plate_x, plate_z). Higher = pitches look more alike at the decision point but more different at the plate.
Range
Roughly 2 to 20 in raw units (inches of effective separation). Top tunnelers cluster around 10+.
Read more
The Pitch Tunneling Atlas · The physics behind it

Plate Sep / Decision Sep

The raw geometric separations between pitch-type centroids at the plate (plate_sep) and at the decision point (dec_sep). The ratio drives Tunneling Divergence.

Units
Inches. A high-tunneling pitcher has small dec_sep (pitches look alike when the hitter has to commit) and large plate_sep (they end up far apart).
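
One way to compute these separations, offered as a sketch only: take the centroid of each pitch type and average the pairwise distances between centroids. The averaging step and the input layout (a dict of pitch type to a list of `(x, z)` points in inches) are assumptions; the glossary does not specify how more than two pitch types are aggregated.

```python
from itertools import combinations
from math import dist
from statistics import mean


def centroid(points):
    """Mean (x, z) location of one pitch type."""
    xs, zs = zip(*points)
    return (mean(xs), mean(zs))


def mean_centroid_sep(pitches_by_type):
    """Mean pairwise distance between pitch-type centroids, in input units."""
    cents = [centroid(pts) for pts in pitches_by_type.values()]
    return mean(dist(a, b) for a, b in combinations(cents, 2))
```

Calling `mean_centroid_sep` on plate coordinates gives a plate_sep-style number and on decision-point coordinates a dec_sep-style number; a pitcher needs at least two pitch types for either to be defined.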

Command Variance / Starter Scatter

How tightly a starter hits his intended target locations across an outing. Captured as a scatter of intended vs actual locations.

Use
Lower scatter = better command. The Explore tab shows this as a tightness metric on the Pitchers > Command sub-tab.
Read more
Do pitchers lose their command?

Walk Spike Attribution

For a 2026 pitcher, how much of his walk-rate change is attributable to the ABS zone change vs his own pitch-mix behavior.

Read more
Three weeks later: The Walk Spike is fading

Hitter metrics

Coaching Gap (Δ predictable wOBA)

The wOBA edge a hitter extracts specifically on predictable pitches — pitches where the at-bat context strongly forecasts what’s coming.

Headline finding
Low-chase hitters extract roughly 0.04 more wOBA on predictable pitches than high-chase hitters. Power doesn’t matter for this; contact rate doesn’t matter; discipline does.
Read more
The Coaching Gap that lives where hitters don’t chase

Chase Tertile

Hitter discipline classification — low / mid / high — based on how often they swing at pitches outside the zone.

Use
The pivotal split in the Coaching Gap finding. Low-chase tertile is where the predictable-pitch edge lives.
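
A tertile split can be sketched with the standard library. The input layout (a dict of hypothetical hitter identifier to chase rate) and the handling of ties at the cut points are assumptions.

```python
from statistics import quantiles


def chase_tertiles(chase_rates):
    """Label each hitter low / mid / high by chase-rate tertile."""
    # quantiles(..., n=3) returns the two cut points between the three groups.
    lo_cut, hi_cut = quantiles(list(chase_rates.values()), n=3)
    labels = {}
    for hitter, rate in chase_rates.items():
        if rate <= lo_cut:
            labels[hitter] = "low"
        elif rate <= hi_cut:
            labels[hitter] = "mid"
        else:
            labels[hitter] = "high"
    return labels
```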

April Sell Score / Hot Start Stability

For 2026 hot starts, a regression-aware projection of how much of the early-season pace is likely to sustain.

Use
Splits the 2026 hot starts into “sell” (likely regression) vs “sleeper” (probably real, and baseball isn’t talking about them yet). Dual-agent validated.
Read more
The April Sell List

ABS & Challenges

ACG — ABS Conformance Gap

Among pitches that became ABS challenges, the fraction where the umpire’s original call was overturned by the robot.

Status
Defined per-umpire but currently too thin to publish (most umpires have 10–15 challenges in 2026 so far). We will revisit at the All-Star break.

Challenge Overturn Rate

The fraction of challenges (by player, team, or count) that the ABS system overturns.

League average 2026
About 53% league-wide. Catcher-initiated challenges hit ~61%; batter-initiated ~45%.
Read more
The best ABS challengers are catchers

Challenge Value

For each overturned challenge, the wOBA-shift saved or captured based on the count it occurred in.

Why count matters
A 3-2 wrong call is worth roughly 0.690 wOBA. A 1-0 wrong call is worth 0.130. Smart challenges happen at high-value counts.
Read more
Minnesota buys leverage. Cincinnati buys certainty.
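
A hedged sketch of how Challenge Value could be totaled. The lookup table contains only the per-count values quoted in this glossary; treating the Challenge Value and Count Leverage figures as one shared scale is an assumption, and the field names `count` and `overturned` are hypothetical.

```python
# Per-call wOBA values quoted in this glossary; other counts are not listed here.
COUNT_VALUE = {"3-2": 0.690, "2-2": 0.384, "1-0": 0.130, "0-0": 0.095}


def challenge_value(challenges):
    """Sum the count-based wOBA value of overturned challenges."""
    # Only overturned challenges change the call, so only they carry value.
    return sum(COUNT_VALUE[c["count"]] for c in challenges if c["overturned"])
```

Under these numbers, one overturned 3-2 call (0.690) is worth more than five overturned 1-0 calls (5 × 0.130 = 0.650).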

Counts & Game state

Count Leverage / wOBA per Count

The expected wOBA outcome of a plate appearance given the current count state.

Anchors
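3-2: 0.690 · 2-2: 0.384 · 0-0: 0.095. These are on the same scale as the per-call values used in Challenge Value (a 3-2 wrong call is worth roughly 0.690 wOBA), so the cost of a wrong call scales with the leverage of the count.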
Read more
The count tells you everything · The count that matters

Reliever Strand Rate

For relief pitchers, the fraction of inherited runners that fail to score after the reliever enters.

Note
2026 leaderboard live; uses MLB official inherited-runner counts joined from boxscores. Stability is a separate question we’ll revisit at season end.
Read more
The fireman’s dilemma
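
The strand-rate definition above reduces to one line of arithmetic. A minimal sketch, with hypothetical argument names and an assumed choice to return `None` when a reliever has inherited no runners:

```python
def strand_rate(inherited_runners, inherited_scored):
    """Fraction of inherited runners who did not score."""
    if inherited_runners == 0:
        return None  # undefined with no inherited runners
    return 1 - inherited_scored / inherited_runners
```

For example, a reliever who inherited 20 runners and allowed 5 to score has a strand rate of 0.75.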