Random Effects and Home Runs

Categories: statistics, r, baseball
Author: Mark Jurries II
Published: February 6, 2026

I was recently reading “Multilevel Modeling in Baseball” by Jim Albert, in which he used a random effects model to show how much influence hitters vs. pitchers had on various stats. It showed that hitters contribute far more to home run rates, which checks out. It also got me thinking - which pitchers perhaps saw more HR just because of who they faced?

I used 2025 play-by-play data from Retrosheet. I also decided to add park to the random effects model, since some parks (e.g. Wrigley) play more as home run parks than others. We’ll fit the model, then predict the probability of a home run for each plate appearance. We’ll then sum up those probabilities and compare them to actuals, first for pitchers and then for batters. If you’re thinking “this sounds a lot like Expected Goals*”, well, the premise very much is. A better model would account for pitch type, wind speed, etc., but this fairly simple approach should yield some interesting results.

*I find xG to be helpful, and the same concept is what drives Outs Above Average.
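For reference (my notation, not Albert’s), the glmer call below fits a logistic regression with crossed random intercepts: each plate appearance i is a Bernoulli trial whose log-odds of a home run decompose into a league-wide average plus pitcher, batter, and park effects:

$$\operatorname{logit}\Pr(\mathrm{HR}_i) = \mu + \alpha_{p(i)} + \beta_{b(i)} + \gamma_{s(i)}$$

with $\alpha_p \sim N(0, \sigma^2_{\text{pitcher}})$, $\beta_b \sim N(0, \sigma^2_{\text{batter}})$, and $\gamma_s \sim N(0, \sigma^2_{\text{park}})$. The estimated variances are what tell you how much of the home run rate each role controls.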
Show the code
library(hrbrthemes)
library(janitor)
library(gt)
library(gtExtras)
library(tidyverse)
library(lme4)

#read retrosheet data
mlb_2025 <- read_csv('2025plays.csv')
players <- read_csv('biofile0.csv') %>%
  mutate(player_name = paste0(usename, ' ', lastname)) %>%
  select(id, player_name)

#filter out stolen base events
mlb_2025 <- mlb_2025 %>%
  filter(sb2 == 0 & sb3 == 0 & cs2 == 0 & cs3 == 0)

#run our random effects model
hr_fit <- glmer(hr ~ (1 | pitcher) + (1 | batter) + (1 | site),
                data = mlb_2025,
                family = binomial)

#create predictions. To get standard errors, we need to predict on the
#log-odds (link) scale, then transform back to probabilities below.
hr_pred <- predict(hr_fit, mlb_2025, se.fit = TRUE, type = "link")

hr_pred_munged <- hr_pred %>%
  as_tibble() %>%
  mutate(low = fit - (se.fit * 1.96),
         high = fit + (se.fit * 1.96)) %>%
  mutate(low = plogis(low),
         fit = plogis(fit),
         high = plogis(high)) %>%
  rename(hr_prediction = fit)
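#quick sanity check (not in the original post): after plogis(), every
#prediction and interval bound should land between 0 and 1
stopifnot(all(hr_pred_munged$low >= 0), all(hr_pred_munged$high <= 1))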

mlb_2025_pred <- mlb_2025 %>%
  cbind(hr_pred_munged)

#aggregate by pitcher
pitcher_pred <- mlb_2025_pred %>%
  summarise(pa = n(),
            hr = sum(hr),
            hr_pred = sum(hr_prediction),
            hr_pred_low = sum(low),
            hr_pred_high = sum(high),
            .by = c(pitcher)) %>%
  mutate(dif = hr - hr_pred) %>%
  arrange(desc(dif)) %>%
  left_join(players, by = join_by(pitcher == id)) %>%
  relocate(player_name) %>%
  select(-pitcher)

head(pitcher_pred, 5) %>%
  bind_rows(tail(pitcher_pred, 5)) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = 4:7, decimals = 1) %>%
  cols_label(player_name = 'Player Name',
             hr_pred = 'Predicted HR',
             hr_pred_low = 'Predicted HR (Low)',
             hr_pred_high = 'Predicted HR (High)')
Player Name pa hr Predicted HR Predicted HR (Low) Predicted HR (High) dif
Shota Imanaga 603 34 26.2 12.8 52.9 7.8
Nestor Cortes 158 13 6.4 3.0 13.3 6.6
Bailey Ober 628 30 23.9 11.6 48.6 6.1
Bowden Francis 292 19 13.2 6.4 26.7 5.8
Elvin Rodriguez 91 9 3.3 1.6 6.5 5.7
Yoshinobu Yamamoto 834 16 21.0 10.6 41.6 −5.0
Adrian Morejon 306 2 7.0 3.4 14.7 −5.0
David Peterson 737 11 16.4 7.8 34.2 −5.4
Jose Soriano 733 12 17.7 8.9 35.3 −5.7
Cristopher Sanchez 863 12 18.9 9.4 37.9 −6.9

This suggests that Shota Imanaga - who has to pitch at Wrigley a lot - gave up about 8 more home runs than we would have expected. That seems relevant, but his 34 HR also fall within his prediction interval of 12.8 to 52.9. That’s a pretty wide range, but it’s common for pitchers. The difference of nearly 8 HR is still extreme, since most pitchers fall between -5 and 5.
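As a quick gut check on just how wide those intervals really are, we can look at interval width relative to the prediction. This snippet isn’t from the original analysis - it’s a sketch that assumes the pitcher_pred table built above:

```r
#hypothetical sanity check: pitcher interval width relative to predicted HR
library(dplyr)

pitcher_pred %>%
  mutate(rel_width = (hr_pred_high - hr_pred_low) / hr_pred) %>%
  summarise(median_rel_width = median(rel_width))
```

If the median relative width is around 1.5-2, a pitcher’s interval spans roughly double his predicted total in either direction, which is why a 34-HR season can still sit inside Imanaga’s bounds.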

Show the code
pitcher_pred %>%
  ggplot(aes(x = dif))+
  geom_histogram()+
  theme_ipsum()+
  theme(text=element_text(size = 16,  family="Oswald"),
        panel.grid.minor = element_blank(),
        plot.title.position = "plot")+
  labs(title = 'Difference Between Expected and Actual HR Totals, Pitchers')

So Imanaga has an extreme overall difference, but it’s not extreme compared to what we’d expect for him. What if we instead looked at pitchers whose value was outside of their high/low bounds?

Show the code
pitcher_pred %>%
  filter(hr < hr_pred_low | hr > hr_pred_high) %>%
  arrange(desc(dif)) %>%
  head(5) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = 4:7, decimals = 1) %>%
  cols_label(player_name = 'Player Name',
             hr_pred = 'Predicted HR',
             hr_pred_low = 'Predicted HR (Low)',
             hr_pred_high = 'Predicted HR (High)')
Player Name pa hr Predicted HR Predicted HR (Low) Predicted HR (High) dif
Elvin Rodriguez 91 9 3.3 1.6 6.5 5.7
Austin Cox 104 9 4.2 2.0 8.8 4.8
Noah Davis 67 7 2.7 1.2 5.6 4.3
Shaun Anderson 57 6 2.4 1.2 4.9 3.6
Kyle Gibson 75 7 3.4 1.7 6.7 3.6

The very first thing to note here is that none of these pitchers saw a lot of action. Only Austin Cox faced over 100 batters. So even though the model helps regress things toward the mean, small sample sizes are still at play here. I don’t think that “Elvin Rodriguez gave up 2.5 more HR than the upper bound of his prediction interval” is an insight that’s going to set the world on fire.
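One way to see that regression toward the mean directly is to pull the estimated random effects out of the model - ranef() returns each pitcher’s conditional mode on the log-odds scale, and low-PA pitchers get shrunk toward zero. A sketch, assuming the hr_fit model fit above:

```r
#sketch: extract pitcher random effects (log-odds scale) from the fitted model
library(lme4)
library(dplyr)
library(tibble)

pitcher_effects <- ranef(hr_fit)$pitcher %>%
  as_tibble(rownames = "pitcher") %>%
  rename(log_odds_effect = `(Intercept)`) %>%
  arrange(desc(log_odds_effect))

head(pitcher_effects)
```

Joining this to a plate-appearance count would show the shrinkage pattern: the most extreme effects in either direction tend to belong to pitchers with enough innings for the model to trust them.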

Let’s take a look at batters as long as we’re at it.

Show the code
#aggregate by batter
batter_pred <- mlb_2025_pred %>%
  summarise(pa = n(),
            hr = sum(hr),
            hr_pred = sum(hr_prediction),
            hr_pred_low = sum(low),
            hr_pred_high = sum(high),
            .by = c(batter)) %>%
  mutate(dif = hr - hr_pred) %>%
  arrange(desc(dif)) %>%
  left_join(players, by = join_by(batter == id)) %>%
  relocate(player_name) %>%
  select(-batter)

head(batter_pred, 5) %>%
  bind_rows(tail(batter_pred, 5)) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = 4:7, decimals = 1) %>%
  cols_label(player_name = 'Player Name',
             hr_pred = 'Predicted HR',
             hr_pred_low = 'Predicted HR (Low)',
             hr_pred_high = 'Predicted HR (High)')
Player Name pa hr Predicted HR Predicted HR (Low) Predicted HR (High) dif
Cal Raleigh 773 65 59.1 31.4 107.5 5.9
Eugenio Suarez 718 52 46.9 25.4 84.3 5.1
Kyle Schwarber 761 58 53.0 23.2 114.7 5.0
Aaron Judge 725 54 49.0 25.2 92.3 5.0
Shohei Ohtani 830 63 58.1 26.1 123.2 4.9
Isiah Kiner-Falefa 507 2 5.8 2.7 12.2 −3.8
Xavier Edwards 626 3 7.0 3.3 14.7 −4.0
Santiago Espinal 332 0 4.1 2.2 7.4 −4.1
Nick Allen 420 0 4.5 2.0 9.9 −4.5
Chandler Simpson 444 0 4.7 2.1 10.5 −4.7

No real surprises there, and I think the model’s being a bit unfair to Aaron Judge, who was put on earth to destroy baseballs. As the Multilevel Modeling article suggested, hitters have more control - just compare their residual chart to the pitchers’:

Show the code
batter_pred %>%
  ggplot(aes(x = dif))+
  geom_histogram()+
  theme_ipsum()+
  theme(text=element_text(size = 16,  family="Oswald"),
        panel.grid.minor = element_blank(),
        plot.title.position = "plot")+
  labs(title = 'Difference Between Expected and Actual HR Totals, Batters')
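The chart makes the point visually; to put a number on it, we can compare the spread of the residuals for the two roles directly (again a quick sketch, not from the original post, assuming the pitcher_pred and batter_pred tables above):

```r
#hypothetical: standard deviation of actual-minus-expected HR, by role
sd(pitcher_pred$dif)
sd(batter_pred$dif)
```

A smaller standard deviation for batters would confirm that their actual totals hug their expectations more tightly than pitchers’ do.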

This is a much tighter distribution, with differences between -3 and 3 compared to -5 and 5 for the pitchers. While we’re here, let’s check the guys who outperformed their expectations:

Show the code
batter_pred %>%
  filter(hr < hr_pred_low | hr > hr_pred_high) %>%
  arrange(desc(dif)) %>%
  head(5) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = 4:7, decimals = 1) %>%
  cols_label(player_name = 'Player Name',
             hr_pred = 'Predicted HR',
             hr_pred_low = 'Predicted HR (Low)',
             hr_pred_high = 'Predicted HR (High)')
Player Name pa hr Predicted HR Predicted HR (Low) Predicted HR (High) dif
Sal Stewart 64 5 2.6 1.6 4.2 2.4
Jared Young 47 4 2.0 1.1 3.3 2.0
Brice Matthews 50 4 2.0 1.0 3.7 2.0
Ryan Fitzgerald 55 4 2.0 1.1 3.4 2.0
Sandy Leon 12 1 0.3 0.2 0.8 0.7

OK, nothing interesting there. We do have one more thing we can look into, though - what does this model consider to be the unlikeliest home runs hit in 2025?

Show the code
mlb_2025_pred %>%
  mutate(dif = hr - hr_prediction) %>%
  select(gid, batter, pitcher, hr, hr_prediction, dif) %>%
  inner_join(players, by = join_by(batter == id)) %>%
  select(-batter) %>%
  relocate(player_name, .after = gid) %>%
  rename(batter = player_name) %>%
  inner_join(players, by = join_by(pitcher == id)) %>%
  select(-pitcher, -hr) %>%
  relocate(player_name, .after = batter) %>%
  rename(pitcher = player_name) %>%
  arrange(desc(dif)) %>%
  head() %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = c(hr_prediction, dif), decimals = 3)
gid batter pitcher hr_prediction dif
NYN202505120 Isiah Kiner-Falefa David Peterson 0.009 0.991
NYN202506010 Tyler Freeman Clay Holmes 0.010 0.990
PIT202504040 Ke'Bryan Hayes Fernando Cruz 0.011 0.989
CLE202508130 Xavier Edwards Gavin Williams 0.011 0.989
PIT202506100 Ke'Bryan Hayes Ronny Henriquez 0.011 0.989
COL202508160 Tyler Freeman Andrew Saalfrank 0.012 0.988

Isiah Kiner-Falefa - who our model earlier said should have had about 6 HR, but only had 2 - hit a home run off David Peterson - who gave up only 11 HR in 168.2 innings last year - in the Pirates-Mets game on May 12, 2025. The model gave it only a 1% chance, but he went yard nonetheless. The broadcasters seemed fairly stunned, so while this model hasn’t been quite as helpful as I hoped for its original intent, it at least found that. And if getting the chance to watch obscure Isiah Kiner-Falefa home runs in February doesn’t justify this whole thing, then nothing does.