Metrics Glossary
CalledThird invents some metrics and inherits others. This page is the reference: every metric we use, what it measures, and how to read the numbers. Each entry links to the article that introduced or validated it.
Umpire metrics
IOAI — In-Out Asymmetry Index
How much more (or less) generous an umpire is on the outside corner versus the inside corner, relative to the league baseline.
- Formula
- Residual called-strike rate on the 4-inch outside edge band (|plate_x| ∈ [0.83, 1.17] ft, normalized so “outside” is always away from the batter) minus residual called-strike rate on the inside edge band. Residual = empirical − baseline-model expected.
- Range
- Roughly −0.20 to +0.20 in 2025 data. Positive = outside-generous. Negative = inside-generous.
- Worked example
- Stu Scheurwater: +0.120 (calls outside-corner strikes 12 percentage points above league baseline). Alex MacKay: −0.070. The full 2025 league spans 18.9pp.
- Reliability
- Most stable umpire feature. Cross-half 2025 correlation r = 0.51. Cross-method per-umpire correlation r = 0.98.
- Read more
- The 19-Point Strike Zone
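As a rough sketch of the IOAI formula, the index can be computed from per-pitch records. The field names (x_norm, called_strike, expected) and the toy data are assumptions for illustration, not CalledThird's actual pipeline; expected stands in for the baseline model's called-strike probability.

```python
from statistics import mean

# Edge band from the glossary: |plate_x| in [0.83, 1.17] ft.
BAND_LO, BAND_HI = 0.83, 1.17

def band_residual(pitches):
    """Empirical called-strike rate minus baseline-expected rate."""
    if not pitches:
        return float("nan")
    empirical = mean(p["called_strike"] for p in pitches)
    expected = mean(p["expected"] for p in pitches)
    return empirical - expected

def ioai(pitches):
    """Outside-band residual minus inside-band residual.

    Assumes x_norm is plate_x flipped by batter handedness so that
    positive values are always away from the batter (hypothetical field).
    """
    outside = [p for p in pitches if BAND_LO <= p["x_norm"] <= BAND_HI]
    inside = [p for p in pitches if -BAND_HI <= p["x_norm"] <= -BAND_LO]
    return band_residual(outside) - band_residual(inside)

# Toy data: four called pitches with made-up baseline expectations.
sample = [
    {"x_norm": 0.90, "called_strike": 1, "expected": 0.5},
    {"x_norm": 1.00, "called_strike": 1, "expected": 0.5},
    {"x_norm": -0.90, "called_strike": 0, "expected": 0.5},
    {"x_norm": -1.00, "called_strike": 1, "expected": 0.5},
]
print(round(ioai(sample), 3))
```

In this toy sample the outside band runs 0.5 above expectation and the inside band matches it, so the index comes out positive (outside-generous).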
EAR — Edge Aggression Rate
How aggressively an umpire patrols the rulebook edge band — does he give the edge or shrink the zone?
- Formula
- Residual called-strike rate on pitches with zone_dist_inches ∈ [0, 2], the “shadow band” just outside the rulebook zone.
- Range
- Roughly −0.10 to +0.10. Positive = gives the edge. Negative = shrinks the zone.
- Reliability
- Second-most stable feature. Cross-half r ≈ 0.5. Persistent across both our independent statistical pipelines.
- Read more
- The 19-Point Strike Zone
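A minimal sketch of the EAR calculation, assuming each called pitch carries its distance outside the rulebook zone and a baseline-model strike probability. The called_strike and expected field names and the sample values are hypothetical.

```python
def edge_aggression_rate(pitches):
    """Residual called-strike rate in the shadow band (0 to 2 inches outside the zone)."""
    band = [p for p in pitches if 0.0 <= p["zone_dist_inches"] <= 2.0]
    if not band:
        return None  # no shadow-band pitches, so the metric is undefined
    n = len(band)
    empirical = sum(p["called_strike"] for p in band) / n
    expected = sum(p["expected"] for p in band) / n  # baseline-model mean
    return empirical - expected

# Toy data: the third pitch lies outside the band and is ignored.
sample = [
    {"zone_dist_inches": 0.5, "called_strike": 1, "expected": 0.40},
    {"zone_dist_inches": 1.5, "called_strike": 0, "expected": 0.20},
    {"zone_dist_inches": 3.0, "called_strike": 0, "expected": 0.02},
]
ear = edge_aggression_rate(sample)
print(round(ear, 3))
```

A positive result means the umpire called more shadow-band strikes than the baseline expected, which is the "gives the edge" direction.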
BSR — Borderline Strike Rate
Of pitches in the borderline region (~3 inches from the rulebook edge), what fraction are called strikes? A descriptive aggregate, not residualized.
- Range
- League average is around 50%. Aggressive umpires push it to ~60%+; conservative umpires fall to ~40%.
- Note
- Used in our Four Kinds of Zone 2×2 (accuracy × BSR). BSR is related to EAR but is descriptive rather than residualized: it doesn’t correct for the location distribution of pitches.
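To make the descriptive-versus-residualized distinction concrete, here is a sketch of BSR as a plain fraction with no baseline term. It assumes the borderline region means within 3 inches of the rulebook edge on either side, and the signed edge_dist_inches field is hypothetical.

```python
def borderline_strike_rate(pitches, half_width=3.0):
    """Share of borderline called pitches that were called strikes.

    Assumes edge_dist_inches is the signed distance from the rulebook edge
    (negative inside the zone, positive outside); this field is hypothetical.
    """
    borderline = [p for p in pitches if abs(p["edge_dist_inches"]) <= half_width]
    if not borderline:
        return None
    return sum(p["called_strike"] for p in borderline) / len(borderline)

# Toy data: the last pitch is 5 inches off the edge and is excluded.
sample = [
    {"edge_dist_inches": -2.0, "called_strike": 1},
    {"edge_dist_inches": -0.5, "called_strike": 1},
    {"edge_dist_inches": 1.0, "called_strike": 0},
    {"edge_dist_inches": 2.5, "called_strike": 0},
    {"edge_dist_inches": 5.0, "called_strike": 0},
]
bsr = borderline_strike_rate(sample)
print(bsr)
```

Because nothing is subtracted, an umpire who happens to see more pitches just inside the edge will post a higher BSR without calling the zone any differently.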
Wrong Calls / Game
The average count of false strikes (called strikes that were balls) plus missed strikes (called balls that were strikes) per game an umpire works.
- League average 2025
- About 10.9 wrong calls per game on average (about 7.2% of all called pitches).
- Read more
- The Umpire Effect
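The count can be sketched as follows, assuming each called pitch is tagged with a game identifier, the call, and whether it was actually in the zone. The field names are illustrative, and games with no called pitches in the input are simply not counted.

```python
from collections import defaultdict

def wrong_calls_per_game(calls):
    """Average number of false strikes plus missed strikes per game."""
    per_game = defaultdict(int)
    for c in calls:
        # A call is wrong when the call and the true zone status disagree.
        per_game[c["game_id"]] += int(c["called_strike"] != c["in_zone"])
    if not per_game:
        return None
    return sum(per_game.values()) / len(per_game)

# Toy data: game A has one wrong call, game B has two.
calls = [
    {"game_id": "A", "called_strike": True, "in_zone": True},
    {"game_id": "A", "called_strike": True, "in_zone": False},   # false strike
    {"game_id": "A", "called_strike": False, "in_zone": False},
    {"game_id": "B", "called_strike": False, "in_zone": True},   # missed strike
    {"game_id": "B", "called_strike": True, "in_zone": False},   # false strike
]
avg_wrong = wrong_calls_per_game(calls)
print(avg_wrong)
```

Taken together, the two league figures imply roughly 10.9 / 0.072 ≈ 151 called pitches per game.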
HLB — High-Low Bias DISPUTED
Tendency to call the top of the zone vs the bottom. We don’t feature it.
- Status
- Method-dependent. One of our pipelines finds r = 0.42 cross-half; the other finds r = 0.19 with a 95% CI that crosses zero. We report it but don’t rank umpires on it.
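One common way to check whether a cross-half correlation is distinguishable from zero is a Fisher z-transform interval. The sketch below uses that standard technique; the glossary does not say how the pipelines compute their intervals, and the sample size of 80 umpires is a hypothetical value chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for r via the Fisher z-transform."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

N_UMPIRES = 80  # hypothetical sample size, not taken from the source

lo_weak, hi_weak = fisher_ci(0.19, N_UMPIRES)
lo_strong, hi_strong = fisher_ci(0.42, N_UMPIRES)
print(f"r=0.19: [{lo_weak:.2f}, {hi_weak:.2f}]")
print(f"r=0.42: [{lo_strong:.2f}, {hi_strong:.2f}]")
```

With that assumed sample size, the interval around 0.19 straddles zero while the interval around 0.42 does not, which is the kind of disagreement that keeps HLB out of the rankings.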
CCZE — Count-Conditioned Zone Expansion NOISE (per-umpire)
Per-umpire tendency to give a wider zone on 3-0 vs 0-2 counts. Real at the population level, not at the individual umpire level.
- Why it’s here
- The broadcast claim “Ump X expands the zone with two strikes” doesn’t hold up: cross-half r ≈ 0 in both our methods. We report it as the canonical example of a folk-wisdom metric that’s actually noise.
- Read more
- The myth of the two-strike zone
Pitcher metrics
Tunneling Divergence
How separated a pitcher’s pitch types are at the plate, after looking identical at the hitter’s decision point ~23.9 ft from the rubber.
- Formula
- Jensen-Shannon divergence between per-pitch-type centroids at decision-point coordinates (dec_x, dec_z) vs final plate coordinates (plate_x, plate_z). Higher = pitches look more alike at the decision point but more different at the plate.
- Range
- Roughly 2 to 20 in raw units (inches of effective separation). Top tunnelers cluster around 10+.
- Read more
- The Pitch Tunneling Atlas · The physics behind it
Plate Sep / Decision Sep
The raw geometric separations between pitch-type centroids at the plate (plate_sep) and at the decision point (dec_sep). The ratio drives Tunneling Divergence.
- Units
- Inches. A high-tunneling pitcher has small dec_sep (pitches look alike when the hitter has to commit) and large plate_sep (they end up far apart).
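A sketch of how the two separations might be computed, assuming each one is the mean pairwise Euclidean distance between pitch-type centroids and that coordinates are already in inches. The aggregation rule and the toy coordinates are assumptions; the glossary does not specify how centroid distances are combined.

```python
from itertools import combinations
from math import dist

def centroid(points):
    """Mean (x, z) location of a set of pitches."""
    xs, zs = zip(*points)
    return (sum(xs) / len(xs), sum(zs) / len(zs))

def mean_centroid_sep(points_by_type):
    """Mean pairwise distance between pitch-type centroids (needs 2+ types)."""
    cents = [centroid(pts) for pts in points_by_type.values()]
    pairs = list(combinations(cents, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)

# Toy pitcher: fastball (FF) and slider (SL), coordinates in inches.
at_decision = {"FF": [(0, 0), (2, 0)], "SL": [(1, 1), (1, 3)]}
at_plate = {"FF": [(0, 30), (0, 34)], "SL": [(6, 22), (6, 26)]}

dec_sep = mean_centroid_sep(at_decision)
plate_sep = mean_centroid_sep(at_plate)
print(dec_sep, plate_sep)
```

Here the two pitches sit 2 inches apart at the decision point and 10 inches apart at the plate, the small-then-large pattern the entry describes.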
Command Variance / Starter Scatter
How tightly a starter hits his intended target locations across an outing. Captured as a scatter of intended vs actual locations.
- Use
- Lower scatter = better command. The Explore tab shows this as a tightness metric on the Pitchers > Command sub-tab.
- Read more
- Do pitchers lose their command?
Walk Spike Attribution
For a 2026 pitcher, how much of his walk-rate change is attributable to the ABS zone change vs his own pitch-mix behavior.
Hitter metrics
Coaching Gap (Δ predictable wOBA)
The wOBA edge a hitter extracts specifically on predictable pitches — pitches where the at-bat context strongly forecasts what’s coming.
- Headline finding
- Low-chase hitters extract roughly +0.04 wOBA more on predictable pitches than high-chase hitters. Power doesn’t matter for this; contact rate doesn’t matter; discipline does.
- Read more
- The Coaching Gap that lives where hitters don’t chase
Chase Tertile
Hitter discipline classification — low / mid / high — based on how often they swing at pitches outside the zone.
- Use
- The pivotal split in the Coaching Gap finding. Low-chase tertile is where the predictable-pitch edge lives.
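A minimal sketch of a tertile split, assuming hitters are ranked by chase rate and divided into three roughly equal groups. The rank-based cutoffs, hitter labels, and rates are hypothetical; the source does not state how the boundaries are set.

```python
def chase_tertiles(chase_rates):
    """Label each hitter low, mid, or high by rank of chase rate."""
    ranked = sorted(chase_rates, key=chase_rates.get)  # lowest chase rate first
    n = len(ranked)
    names = ("low", "mid", "high")
    # Integer arithmetic maps ranks 0..n-1 onto buckets 0, 1, 2.
    return {hitter: names[3 * i // n] for i, hitter in enumerate(ranked)}

# Toy data: chase rate = swings at out-of-zone pitches / out-of-zone pitches seen.
rates = {"A": 0.18, "B": 0.22, "C": 0.27, "D": 0.30, "E": 0.34, "F": 0.40}
labels = chase_tertiles(rates)
print(labels)
```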
April Sell Score / Hot Start Stability
For 2026 hot starts, a regression-aware projection of how much of the early-season pace is likely to sustain.
- Use
- Splits the 2026 hot starts into “sell” (likely regression) vs “sleeper” (probably real, baseball isn’t talking about them yet). Dual-agent validated.
- Read more
- The April Sell List
ABS & Challenges
ACG — ABS Conformance Gap
Among pitches that became ABS challenges, the fraction where the umpire’s original call was overturned by the robot.
- Status
- Defined per-umpire but currently too thin to publish (most umpires have 10–15 challenges in 2026 so far). We plan to revisit at the All-Star break.
Challenge Overturn Rate
The fraction of challenges (by player, team, or count) that the ABS system overturns.
- League average 2026
- About 53% league-wide. Catcher-initiated challenges hit ~61%; batter-initiated ~45%.
- Read more
- The best ABS challengers are catchers
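The grouped rate can be sketched as below, assuming each challenge record carries a grouping field and an overturned flag. The initiator key and the toy records are illustrative only.

```python
from collections import Counter

def overturn_rate_by(challenges, key):
    """Overturn rate for each value of the grouping key (player, team, count, ...)."""
    total, overturned = Counter(), Counter()
    for ch in challenges:
        total[ch[key]] += 1
        overturned[ch[key]] += int(ch["overturned"])
    return {group: overturned[group] / total[group] for group in total}

# Toy data: three catcher challenges and two batter challenges.
challenges = [
    {"initiator": "catcher", "overturned": True},
    {"initiator": "catcher", "overturned": True},
    {"initiator": "catcher", "overturned": False},
    {"initiator": "batter", "overturned": True},
    {"initiator": "batter", "overturned": False},
]
rates_by_initiator = overturn_rate_by(challenges, "initiator")
print(rates_by_initiator)
```

Passing a different key (for example a team or count field, if present in the records) gives the other breakdowns mentioned in the definition.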
Challenge Value
For each overturned challenge, the wOBA-shift saved or captured based on the count it occurred in.
- Why count matters
- A 3-2 wrong call is worth roughly 0.690 wOBA. A 1-0 wrong call is worth 0.130. Smart challenges happen at high-value counts.
- Read more
- Minnesota buys leverage. Cincinnati buys certainty.
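As an illustration, Challenge Value can be totaled by looking up each overturned challenge's count in a value table. The table below holds only the count values quoted in this glossary, so it is partial; the record fields are assumptions.

```python
# wOBA value of a wrong call by count, using only values quoted in this glossary.
COUNT_VALUE = {"3-2": 0.690, "2-2": 0.384, "1-0": 0.130, "0-0": 0.095}

def challenge_value(challenges, count_value=COUNT_VALUE):
    """Sum the count-based wOBA value of every overturned challenge.

    Failed challenges contribute nothing. A count missing from the
    partial table raises KeyError rather than being silently skipped.
    """
    return sum(count_value[c["count"]] for c in challenges if c["overturned"])

# Toy data: two successful challenges and one failed one.
team_challenges = [
    {"count": "3-2", "overturned": True},
    {"count": "1-0", "overturned": True},
    {"count": "2-2", "overturned": False},
]
total_value = challenge_value(team_challenges)
print(round(total_value, 3))
```

One overturned 3-2 call outweighs five overturned 1-0 calls (0.690 versus 5 × 0.130 = 0.650), which is why the count matters more than the raw success rate.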
Counts & Game state
Count Leverage / wOBA per Count
The expected wOBA outcome of a plate appearance given the current count state.
- Anchors
- 3-2: 0.690 · 2-2: 0.384 · 0-0: 0.095. Wrong calls scale with this.
- Read more
- The count tells you everything · The count that matters
Reliever Strand Rate
For relief pitchers, the fraction of inherited runners that fail to score after the reliever enters.
- Note
- 2026 leaderboard live; uses MLB official inherited-runner counts joined from boxscores. Stability is a separate question we’ll revisit at season end.
- Read more
- The fireman’s dilemma
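A sketch of the strand-rate arithmetic, assuming each relief appearance is summarized as a pair of inherited runners and inherited runners who scored. The tuple layout and sample numbers are hypothetical.

```python
def strand_rate(appearances):
    """Fraction of inherited runners that did not score.

    appearances: iterable of (inherited_runners, inherited_scored) pairs.
    """
    appearances = list(appearances)
    inherited = sum(a[0] for a in appearances)
    scored = sum(a[1] for a in appearances)
    if inherited == 0:
        return None  # undefined when no runners were inherited
    return (inherited - scored) / inherited

# Toy data: three outings with 6 inherited runners, 2 of whom scored.
outings = [(2, 0), (1, 1), (3, 1)]
rate = strand_rate(outings)
print(round(rate, 3))
```

Pooling runners across outings, rather than averaging per-outing rates, keeps a single one-runner appearance from dominating the result.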