Random Effects and Home Runs

Categories: statistics, r, baseball
Author: Mark Jurries II
Published: February 6, 2026

I was recently reading “Multilevel Modeling in Baseball” by Jim Albert, in which he used a random effects model to show how much influence hitters vs. pitchers had on various stats. It showed that hitters contribute far more to home run rates, which checks out. It also got me thinking - which pitchers perhaps saw more HR just because of who they faced?

I used 2025 play-by-play data from Retrosheet. I also decided to add park to the random effects model, since some parks (e.g. Wrigley) play more as home run parks than others. We’ll fit the model, then predict the probability of a home run for each plate appearance. We’ll then sum up those probabilities and compare them to actuals, first for pitchers and then for batters. If you’re thinking “this sounds a lot like Expected Goals*”, well, the premise very much is. A better model would account for pitch type, wind speed, etc., but this fairly simple approach should yield some interesting results.

*I find xG to be helpful, and the same concept is what drives Outs Above Average.
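For reference (my notation, not Albert’s), the glmer call below fits a logistic regression with crossed random intercepts: each plate appearance i is a Bernoulli trial whose log-odds of a home run decompose into a league-wide average plus pitcher, batter, and park effects:

$$\operatorname{logit}\Pr(\mathrm{HR}_i) = \mu + \alpha_{p(i)} + \beta_{b(i)} + \gamma_{s(i)}$$

with $\alpha_p \sim N(0, \sigma^2_{\text{pitcher}})$, $\beta_b \sim N(0, \sigma^2_{\text{batter}})$, and $\gamma_s \sim N(0, \sigma^2_{\text{park}})$. The estimated variances are what tell you how much of the home run rate each role controls.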
Show the code
library(hrbrthemes)
library(janitor)
library(gt)
library(gtExtras)
library(tidyverse)
library(lme4)

#read retrosheet data
mlb_2025 <- read_csv('2025plays.csv')
players <- read_csv('biofile0.csv') %>%
  mutate(player_name = paste0(usename, ' ', lastname)) %>%
  select(id, player_name)

#filter out stolen base events
mlb_2025 <- mlb_2025 %>%
  filter(sb2 == 0 & sb3 == 0 & cs2 == 0 & cs3 == 0)

#run our random effects model
hr_fit <- glmer(hr ~ (1 | pitcher) + (1 | batter) + (1 | site),
                data = mlb_2025,
                family = binomial)

#create predictions. To get standard errors, we need to predict on the
#log-odds (link) scale, then transform back to probabilities below.
hr_pred <- predict(hr_fit, mlb_2025, se.fit = TRUE, type = "link")

hr_pred_munged <- hr_pred %>%
  as_tibble() %>%
  mutate(low = fit - (se.fit * 1.96),
         high = fit + (se.fit * 1.96)) %>%
  mutate(low = plogis(low),
         fit = plogis(fit),
         high = plogis(high)) %>%
  rename(hr_prediction = fit)
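#quick sanity check (not in the original post): after plogis(), every
#prediction and interval bound should land between 0 and 1
stopifnot(all(hr_pred_munged$low >= 0), all(hr_pred_munged$high <= 1))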

mlb_2025_pred <- mlb_2025 %>%
  cbind(hr_pred_munged)

#aggregate by pitcher
pitcher_pred <- mlb_2025_pred %>%
  summarise(pa = n(),
            hr = sum(hr),
            hr_pred = sum(hr_prediction),
            hr_pred_low = sum(low),
            hr_pred_high = sum(high),
            .by = c(pitcher)) %>%
  mutate(dif = hr - hr_pred) %>%
  arrange(desc(dif)) %>%
  left_join(players, by = join_by(pitcher == id)) %>%
  relocate(player_name) %>%
  select(-pitcher)

head(pitcher_pred, 5) %>%
  bind_rows(tail(pitcher_pred, 5)) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = 4:7, decimals = 1) %>%
  cols_label(player_name = 'Player Name',
             hr_pred = 'Predicted HR',
             hr_pred_low = 'Predicted HR (Low)',
             hr_pred_high = 'Predicted HR (High)')
Player Name pa hr Predicted HR Predicted HR (Low) Predicted HR (High) dif
Shota Imanaga 603 34 26.2 12.8 52.9 7.8
Nestor Cortes 158 13 6.4 3.0 13.3 6.6
Bailey Ober 628 30 23.9 11.6 48.6 6.1
Bowden Francis 292 19 13.2 6.4 26.7 5.8
Elvin Rodriguez 91 9 3.3 1.6 6.5 5.7
Yoshinobu Yamamoto 834 16 21.0 10.6 41.6 −5.0
Adrian Morejon 306 2 7.0 3.4 14.7 −5.0
David Peterson 737 11 16.4 7.8 34.2 −5.4
Jose Soriano 733 12 17.7 8.9 35.3 −5.7
Cristopher Sanchez 863 12 18.9 9.4 37.9 −6.9

This suggests that Shota Imanaga - who has to pitch at Wrigley a lot - gave up about 8 more home runs than we would have expected. That seems relevant, but his 34 HR also fall within his prediction interval of 12.8 to 52.9. That’s a pretty wide range, but it’s common for pitchers. The difference of nearly 8 HR is still extreme, since most pitchers fall between -5 and 5.
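As a quick gut check on just how wide those intervals really are, we can look at interval width relative to the prediction. This snippet isn’t from the original analysis - it’s a sketch that assumes the pitcher_pred table built above:

```r
#hypothetical sanity check: pitcher interval width relative to predicted HR
library(dplyr)

pitcher_pred %>%
  mutate(rel_width = (hr_pred_high - hr_pred_low) / hr_pred) %>%
  summarise(median_rel_width = median(rel_width))
```

If the median relative width is around 1.5-2, a pitcher’s interval spans roughly double his predicted total in either direction, which is why a 34-HR season can still sit inside Imanaga’s bounds.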

Show the code
pitcher_pred %>%
  ggplot(aes(x = dif))+
  geom_histogram()+
  theme_ipsum()+
  theme(text=element_text(size = 16,  family="Oswald"),
        panel.grid.minor = element_blank(),
        plot.title.position = "plot")+
  labs(title = 'Difference Between Expected and Actual HR Totals, Pitchers')

So Imanaga has an extreme overall difference, but it’s not extreme compared to what we’d expect for him. What if we instead looked at pitchers whose value was outside of their high/low bounds?

Show the code
pitcher_pred %>%
  filter(hr < hr_pred_low | hr > hr_pred_high) %>%
  arrange(desc(dif)) %>%
  head(5) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = 4:7, decimals = 1) %>%
  cols_label(player_name = 'Player Name',
             hr_pred = 'Predicted HR',
             hr_pred_low = 'Predicted HR (Low)',
             hr_pred_high = 'Predicted HR (High)')
Player Name pa hr Predicted HR Predicted HR (Low) Predicted HR (High) dif
Elvin Rodriguez 91 9 3.3 1.6 6.5 5.7
Austin Cox 104 9 4.2 2.0 8.8 4.8
Noah Davis 67 7 2.7 1.2 5.6 4.3
Shaun Anderson 57 6 2.4 1.2 4.9 3.6
Kyle Gibson 75 7 3.4 1.7 6.7 3.6

The very first thing to note here is that none of these pitchers saw a lot of action. Only Austin Cox faced over 100 batters. So even though the model helps regress things toward the mean, small sample sizes are still at play here. I don’t think that “Elvin Rodriguez gave up 2.5 more HR than the upper bound of his prediction interval” is an insight that’s going to set the world on fire.
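One way to see that regression toward the mean directly is to pull the estimated random effects out of the model - ranef() returns each pitcher’s conditional mode on the log-odds scale, and low-PA pitchers get shrunk toward zero. A sketch, assuming the hr_fit model fit above:

```r
#sketch: extract pitcher random effects (log-odds scale) from the fitted model
library(lme4)
library(dplyr)
library(tibble)

pitcher_effects <- ranef(hr_fit)$pitcher %>%
  as_tibble(rownames = "pitcher") %>%
  rename(log_odds_effect = `(Intercept)`) %>%
  arrange(desc(log_odds_effect))

head(pitcher_effects)
```

Joining this to a plate-appearance count would show the shrinkage pattern: the most extreme effects in either direction tend to belong to pitchers with enough innings for the model to trust them.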

Let’s take a look at batters as long as we’re at it.

Show the code
#aggregate by batter
batter_pred <- mlb_2025_pred %>%
  summarise(pa = n(),
            hr = sum(hr),
            hr_pred = sum(hr_prediction),
            hr_pred_low = sum(low),
            hr_pred_high = sum(high),
            .by = c(batter)) %>%
  mutate(dif = hr - hr_pred) %>%
  arrange(desc(dif)) %>%
  left_join(players, by = join_by(batter == id)) %>%
  relocate(player_name) %>%
  select(-batter)

head(batter_pred, 5) %>%
  bind_rows(tail(batter_pred, 5)) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = 4:7, decimals = 1) %>%
  cols_label(player_name = 'Player Name',
             hr_pred = 'Predicted HR',
             hr_pred_low = 'Predicted HR (Low)',
             hr_pred_high = 'Predicted HR (High)')
Player Name pa hr Predicted HR Predicted HR (Low) Predicted HR (High) dif
Cal Raleigh 773 65 59.1 31.4 107.5 5.9
Eugenio Suarez 718 52 46.9 25.4 84.3 5.1
Kyle Schwarber 761 58 53.0 23.2 114.7 5.0
Aaron Judge 725 54 49.0 25.2 92.3 5.0
Shohei Ohtani 830 63 58.1 26.1 123.2 4.9
Isiah Kiner-Falefa 507 2 5.8 2.7 12.2 −3.8
Xavier Edwards 626 3 7.0 3.3 14.7 −4.0
Santiago Espinal 332 0 4.1 2.2 7.4 −4.1
Nick Allen 420 0 4.5 2.0 9.9 −4.5
Chandler Simpson 444 0 4.7 2.1 10.5 −4.7

No real surprises there, and I think the model’s being a bit unfair to Aaron Judge, who was put on earth to destroy baseballs. As the Multilevel Modeling article suggested, hitters have more control - just compare their residual chart to the pitchers’:

Show the code
batter_pred %>%
  ggplot(aes(x = dif))+
  geom_histogram()+
  theme_ipsum()+
  theme(text=element_text(size = 16,  family="Oswald"),
        panel.grid.minor = element_blank(),
        plot.title.position = "plot")+
  labs(title = 'Difference Between Expected and Actual HR Totals, Batters')
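The chart makes the point visually; to put a number on it, we can compare the spread of the residuals for the two roles directly (again a quick sketch, not from the original post, assuming the pitcher_pred and batter_pred tables above):

```r
#hypothetical: standard deviation of actual-minus-expected HR, by role
sd(pitcher_pred$dif)
sd(batter_pred$dif)
```

A smaller standard deviation for batters would confirm that their actual totals hug their expectations more tightly than pitchers’ do.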

This is a much tighter distribution, with differences between -3 and 3 compared to -5 and 5 for the pitchers. While we’re here, let’s check the guys who outperformed their expectations:

Show the code
batter_pred %>%
  filter(hr < hr_pred_low | hr > hr_pred_high) %>%
  arrange(desc(dif)) %>%
  head(5) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = 4:7, decimals = 1) %>%
  cols_label(player_name = 'Player Name',
             hr_pred = 'Predicted HR',
             hr_pred_low = 'Predicted HR (Low)',
             hr_pred_high = 'Predicted HR (High)')
Player Name pa hr Predicted HR Predicted HR (Low) Predicted HR (High) dif
Sal Stewart 64 5 2.6 1.6 4.2 2.4
Jared Young 47 4 2.0 1.1 3.3 2.0
Brice Matthews 50 4 2.0 1.0 3.7 2.0
Ryan Fitzgerald 55 4 2.0 1.1 3.4 2.0
Sandy Leon 12 1 0.3 0.2 0.8 0.7

OK, nothing interesting there. We do have one more thing we can look into, though - what does this model consider to be the unlikeliest home runs hit in 2025?

Show the code
mlb_2025_pred %>%
  mutate(dif = hr - hr_prediction) %>%
  select(gid, batter, pitcher, hr, hr_prediction, dif) %>%
  inner_join(players, by = join_by(batter == id)) %>%
  select(-batter) %>%
  relocate(player_name, .after = gid) %>%
  rename(batter = player_name) %>%
  inner_join(players, by = join_by(pitcher == id)) %>%
  select(-pitcher, -hr) %>%
  relocate(player_name, .after = batter) %>%
  rename(pitcher = player_name) %>%
  arrange(desc(dif)) %>%
  head() %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = c(hr_prediction, dif), decimals = 3)
gid batter pitcher hr_prediction dif
NYN202505120 Isiah Kiner-Falefa David Peterson 0.009 0.991
NYN202506010 Tyler Freeman Clay Holmes 0.010 0.990
PIT202504040 Ke'Bryan Hayes Fernando Cruz 0.011 0.989
CLE202508130 Xavier Edwards Gavin Williams 0.011 0.989
PIT202506100 Ke'Bryan Hayes Ronny Henriquez 0.011 0.989
COL202508160 Tyler Freeman Andrew Saalfrank 0.012 0.988

Isiah Kiner-Falefa - who our model earlier said should have had about 6 HR, but only had 2 - hit a home run off David Peterson - who gave up only 11 HR in 168.2 innings last year - in the Pirates-Mets game on May 12, 2025. The model gave it only a 1% chance, but he went yard nonetheless. The broadcasters seemed fairly stunned, so while this model hasn’t been quite as helpful as I hoped for its original intent, it at least found that. And if getting the chance to watch obscure Isiah Kiner-Falefa home runs in February doesn’t justify this whole thing, then nothing does.