Original Research
Every piece of analysis is original, methodologically rigorous, and transparent about its data and methods. If you can't see the work, you can't trust the numbers.
We Tried to Build a Pressure Grade. Most of It Was Skill in a Costume.
Everyone has a read on who melts down and who locks in. We tried to turn it into a number two independent ways — a Bayesian pipeline and a gradient-boosted ML pipeline, six seasons, 3.85M pitches. The single 'Pressure Grade' doesn't exist: the clutch residual is noise (r=0.08, a coin flip), and no composite beats a pitcher's plain overall skill out-of-sample — it loses or ties. Our own headline hypothesis — that command separates the pressure-proof — failed out-of-sample. But the multi-dimensional read survived: the two methods grade each pitcher's stuff/command/contact nearly identically (stuff agreement r=0.95), and a pitcher's overall skill predicts his future high-leverage results better than his past high-leverage line does. The honest version is now on every pitcher's page.
The Bat-Speed Arms Race: We Went Looking for the Treadmill. It Isn't There.
Hitters keep swinging harder and MLB offense keeps slipping, so the take writes itself: everyone's selling out for bat speed and whiffing for it — a treadmill. We tested it on 254 hitters across two seasons of bat tracking and couldn't find it. Within a hitter, adding 1 mph of bat speed (2025→2026) is worth about +0.27 runs/100 PA — a modest positive lean, not a penalty — and it comes with no whiff cost at all (exit velo +0.66 mph and xwOBAcon +12 pts per mph, whiff flat). The 2026 'offense crisis' is mostly the calendar: against the matched window, the tracked-swing population is flat and bat speed actually rose. The elegant 'free vs bought speed' story is a clean null, and the result survives an explicit attrition correction (bat speed doesn't predict who washes out). Two independent pipelines (interpretability-first + ML) and two cross-review rounds. Borderline by design — re-test at the All-Star break.
The Two-Strike Swing: Not Shorter — Slower, Flatter, and Deeper.
Every coach says 'shorten up with two strikes.' Bat-tracking from 332,111 swings says the swing barely shortens (−1.3%). What hitters actually do is one coordinated compact move: slower (bat speed −1.8 mph), flatter (attack angle down ~1°, and more once you control for location), and deeper (contact point about an inch closer to the catcher). Roughly 60% of hitters make all three adjustments at once — a geometry that, as far as we can tell, has never been published. The brake buys contact (the biggest decelerators cut their two-strike whiff rate ~3pp) at a cost of ~30 points of xwOBAcon. Andrés Giménez is the league's brake artist in both 2025 and 2026; Victor Scott II is the mirror image — he speeds up AND steepens. Every finding was independently reproduced by two analysis pipelines and replicates in 2026.
Umpires Don't Have Personality Types. They Do Have an Outside Corner That Spans 19 Points.
We pre-registered an attempt to cluster 83 home-plate umpires into discrete calling-style types. Two independent statistical methods both refused — four of five kill gates fail in both, including an inter-method ARI of 0.105 against a 0.30 threshold. But on the outside corner, the league spans 18.9 percentage points: Stu Scheurwater calls outside-corner strikes 12pp above league baseline; Alex MacKay calls them 7pp below. The asymmetry isn't catcher framing (r > 0.94 after dropping top framers). And on count-conditioned zone behavior, three independent location-controlled methods agree the broadcaster myth has the sign backwards: umpires CONTRACT the zone on 0-2 by ~18pp, not expand it.
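For readers who want to see what the inter-method ARI gate measures, here is a minimal sketch of the adjusted Rand index between two cluster assignments. The function name, the toy labels, and the gate helper are illustrative assumptions, not the article's code. Only the 0.30 threshold and the 0.105 result come from the blurb above.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items.

    Assumes both lists have the same length and at least two items.
    """
    n = len(labels_a)
    # Pairs of items that share a cluster in both labelings.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs that share a cluster within each labeling on its own.
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:  # degenerate case: both labelings are trivial
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


ARI_GATE = 0.30  # threshold quoted in the blurb


def passes_gate(ari, threshold=ARI_GATE):
    """True when two methods agree well enough to keep the clustering."""
    return ari >= threshold


# Toy labels (hypothetical): same grouping, different cluster names.
same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Toy labels (hypothetical): groupings that cut across each other.
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

An ARI of 1 means the two methods produce the same grouping, and values near 0 mean agreement no better than chance. On that scale the reported 0.105 sits well under the 0.30 gate, so `passes_gate(0.105)` is false.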
Three Weeks Later: The Walk Spike Is Fading, and We Know Who's Paying the Bill
Three weeks ago we said about half the walk spike was the new ABS zone. With two more weeks of data and a much harder analytical pass: the zone-attribution number has dropped to ~+26%, the spike is fading (W1 9.61% → W7 8.79%), and we can now name the pitchers paying the price. Three pitchers cleared bootstrap stability in both pipelines: Kyle Finnegan (command-archetype, +11.4pp walks), Riley O'Brien (stuff-helped, −8.3pp), Camilo Doval (stuff-helped, −7.5pp). Spearman ρ ≈ −0.27 between stuff-minus-command and walk-rate change, both methods.
We Tested the 7-Hole Tax Six Different Ways. It Isn't There.
Two national outlets reported that umpires call a different strike zone for hitters in the 7-hole. We ran two independent statistical pipelines on every angle the claim could hide in — raw replication, pitch selection, borderline-pitch zones, per-umpire, per-hitter, and the catcher mechanism. Six tests, two methods, two cross-reviews. All null. The bias claim is not in the data we have.
The April Sell List
Six MVP-pace April hot starts that two independent statistical methods say won't last — Pages, Rice, Trout, Judge, Carroll, Muncy. Six sleepers baseball isn't talking about that both methods independently flagged as real. And the one big name we couldn't make up our minds about (Murakami, with caveats). Three rounds of dual-agent analysis, two cross-reviews, one retraction.
ABS Took the High Strike — and That's Roughly 40-50% of the Walk Spike. Pitchers Own the Rest.
Two weeks ago we said the walk spike was pitchers, not umpires. Two independent ML pipelines and a counterfactual replay later, we're back to update that. The 2026 ABS-era zone shrank at the top edge, and the zone change accounts for roughly half the walk spike. We were wrong to dismiss it at zero.
The Coaching Gap That Lives Where Hitters Don't Chase
Six rounds of dual-agent research on 2.9M pitches across five seasons. One finding survived: low-chase hitters extract +0.04 wOBA more on predictable pitches than high-chase hitters. Power doesn't matter. Contact rate doesn't matter. Discipline does.
The Physics Behind the Tunneling Atlas
Trajectory equations, model validation (4.9" systematic bias that cancels for relative comparisons), decision-point sensitivity analysis, and how to reproduce the tunneling model.
The Pitch Tunneling Atlas
654 pitchers. 739,820 pitches. The first league-wide tunneling model. Plate diversity explains 11x more whiff variance than decision-point similarity — but tunneling IS real (p=0.016).
Minnesota Buys Leverage. Cincinnati Buys Certainty.
The Twins and Reds are solving ABS in opposite ways. Minnesota: 3.2 challenges per game and the most total value. Cincinnati: a 72% overturn rate, the highest in baseball, with 91.7% in early counts.
The Best ABS Challengers Are Catchers
940 challenges. Catchers overturn 60.6% vs batters' 45.5% (OR=1.85, p<0.001). They challenge closer pitches and still win more — in every count bucket.
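As a rough check on how an odds ratio follows from the two overturn rates, here is a small sketch. The function names are illustrative assumptions. The rates are the rounded figures from the blurb, so the result lands near the reported 1.85 rather than exactly on it. The gap presumably comes from rounding or from a model-based estimate in the article.

```python
def odds(p: float) -> float:
    """Convert a probability into odds: p / (1 - p)."""
    return p / (1.0 - p)


def odds_ratio(p_a: float, p_b: float) -> float:
    """Odds of success for group A relative to group B."""
    return odds(p_a) / odds(p_b)


catcher_rate = 0.606  # catcher overturn rate quoted above
batter_rate = 0.455   # batter overturn rate quoted above

ratio = odds_ratio(catcher_rate, batter_rate)
print(f"odds ratio from rounded rates: {ratio:.2f}")
```

An odds ratio compares odds, not probabilities, which is why a 15-point gap in overturn rate shows up as a ratio well above the simple 60.6/45.5 quotient.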
ABS Isn't Rewriting Every At-Bat. It's Repricing the Last Pitch.
3-2 is just 2.8% of called pitches but generates 24% of all ABS run value. Late counts dominate the challenge economy while 0-0 is noise.
Cam Schlittler's Three-Fastball Blueprint
231 pitches, 22 K, 0 BB. Schlittler rebuilt his arsenal into a perfectly balanced three-fastball attack — 100th percentile usage balance, 20.6" of horizontal spread, and a remade cutter with +7.2" of vertical break.
After a Fight, the Zone Gets Cleaner
7 bench-clearing brawls, 6 complete pairs. The zone doesn't expand or shrink — but accuracy improves unanimously (+2.0pp, p=0.001). Umpires stop giving freebies off the plate.
The Wrong-Way Slider: Imai's Impossible Pitch
Tatsuya Imai's slider breaks 10.4 inches the wrong way — 0th percentile among all RHP sliders. League-wide data shows arm-side-breaking sliders get 4.2pp fewer called strikes on edge pitches.
The Walk Rate Spike: Umpires or Pitchers?
Walk rate is up nearly a full point in 2026. The obvious suspect is ABS squeezing the zone. The data says pitchers are nibbling more — shadow-zone strike rate actually rose +4pp.
CB Bucknor by the Numbers
4,174 pitches, third-worst accuracy in MLB, and the highest average miss distance of any umpire. A data profile of baseball's most controversial umpire, now that ABS can measure it.
The Fireman's Dilemma
4,044 reliever entries, 6,516 inherited runners. The outs gradient determines most of the outcome: 44% strand at 0 outs, 82% at 2 outs. Whether individual strand rate is skill or noise remains an open question.
Do Pitchers Lose Their Command?
4,892 starts, 729K pitches. The average pitcher's scatter is flat. But 14% of starts produce a scatter blow-up, and the distribution is asymmetric. Garrett Crochet's scatter increases 31.5% on average.
Four Kinds of Zone
83 umpires, 380,000 pitches, and a 21-percentage-point gap in borderline strike rates. A two-axis framework that tells you what to expect from tonight's home plate umpire.
The Umpire Effect: How Much Does It Matter?
For every game this season, we computed the run-value impact of every wrong call. CB Bucknor's BOS-CIN game tops the list at 4.08 wOBA, roughly 4 runs shifted by umpire errors in a game decided by one run.
The Challenge System Is Quietly Favoring Defense
541 ABS challenges, 55% overturned, and a 10-percentage-point gap between the fielder side and batters. Plus: the count tradeoff, the late-inning fade, and why defense has captured 4.8 more runs than offense from the challenge system.
Which Pitchers Can You Predict?
We trained XGBoost on 729,827 pitches to predict the next pitch. The broad model barely beat the count baseline — but five pitchers stood out. Chris Sale's next pitch is predictable 58% of the time.
Catcher Framing in the ABS Era
74 catchers, an 18.7 percentage point gap between best and worst framers, and a counterintuitive finding: in the ABS challenge era, good framing may gain value because it influences the batter's decision to challenge.
The Anatomy of a Missed Call
379,155 called pitches, a 7.2% miss rate, and the half-inch cliff where human judgment breaks down. Plus: challenge value by count, catcher framing ranges, and where umpires miss most.
The Count Tells You Everything
We built a pitch prediction model on 729,827 pitches. XGBoost beat the marginal baseline by 2.1pp — but beat the count-conditional baseline by only 0.5pp. The count already encodes nearly all predictive information.
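To make the two baselines concrete, here is a minimal sketch of how a marginal baseline and a count-conditional baseline can be computed. The function names and the eight-row toy sample are hypothetical, and the accuracies are in-sample for brevity. The article's model and evaluation split are not reproduced here.

```python
from collections import Counter, defaultdict


def marginal_baseline(rows):
    """Accuracy of always guessing the single most common pitch type."""
    counts = Counter(pitch for _, pitch in rows)
    return counts.most_common(1)[0][1] / len(rows)


def count_conditional_baseline(rows):
    """Accuracy of guessing the most common pitch type within each count."""
    by_count = defaultdict(Counter)
    for count, pitch in rows:
        by_count[count][pitch] += 1
    hits = sum(c.most_common(1)[0][1] for c in by_count.values())
    return hits / len(rows)


# Hypothetical (count, pitch_type) rows, for illustration only.
rows = [
    ("0-0", "FF"), ("0-0", "FF"), ("0-0", "SL"),
    ("0-2", "SL"), ("0-2", "SL"), ("0-2", "FF"),
    ("3-0", "FF"), ("3-0", "FF"),
]

marginal = marginal_baseline(rows)                 # 5 of 8 rows are FF
conditional = count_conditional_baseline(rows)     # 2 + 2 + 2 hits of 8
```

A model has to clear the stronger of the two baselines to show it has learned anything beyond the count, which is the comparison behind the 2.1pp and 0.5pp margins above.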
Two Myths the Data Kills
Pitchers don't measurably lose their command late in games (r = 0.007). Plate discipline doesn't predict future hitting (r = -0.019). Two baseball beliefs that don't survive 729,827 pitches.
On the Research Queue
Every follow-up we’ve promised to come back to — with the article that owes you the answer and when we expect to deliver it.
Bat-Speed Arms Race — All-Star break re-test
Round 1 landed a borderline positive: within-player, +1 mph of bat speed ≈ +0.27 RV/100 PA (CI −0.07 to +0.62), no whiff cost, survives attrition correction. With ~2× the 2026 sample at the break: does the within-player effect tighten and clear zero (graduating 'no treadmill' to 'swinging harder works'), or fade? Re-confirm the free-vs-bought null and whether the in-news names (Cam Smith −2.1, Caminero ~0) move. Re-run analyze.py on both pipelines against the matched window.
7-Hole Tax — All-Star break re-test
Two open threads from Round 2: (1) Strict spot-7 random-slopes sensitivity — Round 2's H4 ran on a bottom-of-order indicator (spots 7-9) for power; rerun with a strict spot-7 random slope when sample doubles. (2) Low-chase tertile signal — Round 2 found −1.45pp [−3.98, +1.03] on low-chase 7-hole batters (reverse direction, CI crosses zero). Retest when sample tightens.
ABS Walk Spike — All-Star break re-test
Re-run the archetype scatter and per-pitcher adaptation leaderboard at ~2× the sample. Specifically: does the +26% zone-attribution number stabilize, do more pitchers clear bootstrap stability for within-2026 adaptation, and does Mason Miller (Claude-only at 0.68 ML stability) get over the line?
Bucknor 2026 trajectory under ABS
Only 1 Bucknor game was in our 2026 database when the profile shipped. Has his accuracy actually improved under ABS challenge pressure, or is the 2025 baseline still the right read? Re-check once his 2026 sample passes 20 games.
April Sell List — All-Star break re-check
Did the six sleeper hitters (Caglianone, Pereira, Barrosa, Basallo, Mayo, House) actually break out? Did the six fakes (Pages, Rice, Trout, Judge, Carroll, Muncy) regress as projected? Does Murakami's signal hold once we have 80+ MLB games on him?
Catcher Challenge Skill Separation
By mid-season, top catchers will have 30-50 challenges each, enough to run a hierarchical model with shrinkage and separate real skill from small-sample luck. The early leaders (Dingler 7/7, O'Hoppe 10/12, Ramírez 7/9) have CIs that overlap heavily, so a proper hierarchical model is the right tool once samples grow.
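To illustrate what shrinkage does to small samples like these, here is a minimal beta-binomial sketch. It is a simplified stand-in, not the planned hierarchical model: the prior is fixed by hand instead of being estimated from the data. The prior mean of 0.606 borrows the catcher overturn rate quoted elsewhere on this page, and the prior strength of 20 pseudo-challenges is a purely hypothetical choice.

```python
def shrunk_rate(wins, attempts, prior_mean, prior_strength):
    """Posterior mean success rate under a Beta prior (beta-binomial model)."""
    alpha = prior_mean * prior_strength          # prior pseudo-successes
    beta = (1.0 - prior_mean) * prior_strength   # prior pseudo-failures
    return (wins + alpha) / (attempts + alpha + beta)


PRIOR_MEAN = 0.606     # assumption: league catcher overturn rate
PRIOR_STRENGTH = 20.0  # assumption: hypothetical weight in pseudo-challenges

# Early leaders quoted above: (overturned, challenges).
leaders = {"Dingler": (7, 7), "O'Hoppe": (10, 12), "Ramírez": (7, 9)}

shrunk = {
    name: shrunk_rate(wins, attempts, PRIOR_MEAN, PRIOR_STRENGTH)
    for name, (wins, attempts) in leaders.items()
}

for name, (wins, attempts) in leaders.items():
    print(f"{name}: raw {wins / attempts:.3f} -> shrunk {shrunk[name]:.3f}")
```

Under these assumed prior settings the raw rates of 1.000, 0.833, and 0.778 pull in to roughly 0.708, 0.691, and 0.659. The gaps between the three catchers shrink sharply, which is the small-sample caution the blurb describes. A full hierarchical model would learn the prior strength from all catchers at once.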
The ABS Effect on Umpires (full-season)
Early data shows every umpire with 3+ games in 2026 improved vs their 2025 baseline (+2.6pp). Is the challenge system actually making umpires more accurate, or is the early sample lucky? Mid-season checkpoint, then a definitive end-of-season piece.
ABS Challenge Decision Optimizer
Given the count, inning, score differential, and your remaining challenges, should you challenge? A proper optimization model using Savant's opportunity denominators. Targeted for when the sample passes 2,000 total challenges (current pace ~250/week).
ABS Walk Spike — per-umpire / team / catcher decomposition (R4)
Round 3 settled the league-aggregate magnitude (~+26%) and identified the archetype effect. Round 4 (deferred until at least the All-Star break) will surface per-actor breakdowns: which umpires drove the top-edge first-pitch loss, which teams adapted fastest, and whether catcher framing still meaningfully shifts the new zone or ABS feedback has washed it out.
NPB-to-MLB power translator
Murakami's 56-HR NPB peak (2022 record) has no MLB comparable — Ohtani's NPB peak was 22. Our model gave Murakami a league-average prior because no robust translation factor exists for Japanese-born sluggers in his power tier. Build one — even an honest range — before the next major hot-starter analysis.
Reliever strand-rate skill persistence
Is inherited-runner strand rate a stable reliever skill or mostly noise? The 2025-to-2026 cross-season correlation is r = 0.098 on thin 2026 samples (3-5 IR per reliever). By the end of the season we'll have enough data to test whether top-quartile 2025 strand-rate relievers actually outperformed in 2026.