AL Central Projections

statistics
r
baseball
Author

Mark Jurries II

Published

June 27, 2025

We’re at the halfway point of the MLB season, with most teams having played about 81 games. This is as good a time as any to evaluate the AL Central and estimate the odds of the Tigers winning the division. We could just use the trusty FanGraphs projections, but it’s fun* to dive into binomial simulation.

*I think it’s fun, anyway. Mileage may vary.

R makes it fairly straightforward to simulate binomial trials. If I want to flip a coin 100 times and see how often it comes up heads, I can do so with one line of code. As a bonus, we can run 10,000 sets of 100 flips*, so we can see how typical our numbers are.

*To validate this yourself, flip a coin one million times and record the results.
set.seed(2000)
simmed_flips <- rbinom(10000, 100, .5)
  
summary(simmed_flips)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  29.00   47.00   50.00   49.99   53.00   68.00 

Roughly half of our flips are heads, which is what we’d expect. But a set of 100 flips can produce as few as 29 heads and as many as 68, so there’s still some variation in there. That’s the first thing we need to account for - even something as straightforward as a coin flip can show a lot of variation. We can use the same process to model a team’s chances of winning. It’s obviously more complex than a coin flip*, but the principle holds.

*You are falling into your old error, Jeeves, of thinking that the Tigers are a penny.
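
To put a number on that variation, here is a minimal sketch comparing the simulated flips with the textbook binomial formulas (mean n * p, standard deviation sqrt(n * p * (1 - p))). It uses base R only; the names `n_flips`, `p_heads`, `theory_mean` and `theory_sd` are mine, chosen for illustration.

```r
# Closed-form check on the coin-flip simulation (base R only)
n_flips <- 100
p_heads <- 0.5

theory_mean <- n_flips * p_heads                        # n * p
theory_sd   <- sqrt(n_flips * p_heads * (1 - p_heads))  # sqrt(n * p * (1 - p))

# Same seed and call as the simulation above
set.seed(2000)
simmed_flips <- rbinom(10000, n_flips, p_heads)

# Theory next to simulation
c(theory_mean = theory_mean, sim_mean = mean(simmed_flips))
c(theory_sd = theory_sd, sim_sd = sd(simmed_flips))

# Range that should hold roughly the middle 95% of 100-flip sets
qbinom(c(0.025, 0.975), n_flips, p_heads)
```

A standard deviation of 5 heads per 100 flips is why single sets in the high 20s or high 60s show up once you run 10,000 of them.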

Secondly, we need to know what win rate to feed our model. A team may have a high winning percentage, but given the previous example that may be a fluke. The same goes for a low winning percentage. To accommodate this, we’ll take a quasi-Bayesian approach and average a team’s actual winning percentage with .500*, which pulls every team halfway back toward average. So they still get credit for what they’ve done, but we adjust our expectations a bit as well.

*A better value would be preseason projections. .500 is a bit simplistic. It’s fine for understanding how the model works, but if I were to build something out for all of MLB I’d get much more sophisticated. I’d also weight by games played and games remaining, but since we’re ~50% through the season a straight average is fine.
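
For the curious, here is one way the games-played weighting could look. This is a sketch, not what the model in this post does: `shrink_win_perc()` and its `prior_games` argument are hypothetical names, and the idea is simply to treat the prior as if it were some number of extra games played at the prior winning percentage.

```r
# Hypothetical helper (not part of the model in this post): shrink an observed
# record toward a prior winning percentage, treating the prior as
# `prior_games` games' worth of evidence.
shrink_win_perc <- function(w, l, prior_perc = 0.500, prior_games = w + l) {
  (w + prior_perc * prior_games) / (w + l + prior_games)
}

# With the prior weighted the same as games played (the default),
# this reproduces the straight average used in this post
shrink_win_perc(51, 31)
(0.500 + 51 / (51 + 31)) / 2

# A lighter prior (40 games, an arbitrary choice) trusts the actual record more
shrink_win_perc(51, 31, prior_games = 40)
```

Early in the season the prior would dominate, and by September the actual record would. At the halfway mark the two carry about equal weight, which is why the straight average works here.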

Let’s note as well that we’re simulating each team individually. That is, we’re just asking how many wins we can reasonably expect a given team to end with - we’re not modeling games against other teams, adjusting for strength of schedule, or anything like that. We also don’t know what injuries may come nor who teams may trade for.

Before we simulate anything, let’s check the AL Central standings as of the morning of 6/27/25.

library(gt)
library(gtExtras)
library(hrbrthemes)
library(tidyverse)

al_central <- tribble(~'team', ~'w', ~'l',
                      'Tigers', 51, 31,
                      'Guardians', 40, 39,
                      'Twins', 39, 42,
                      'Royals', 38, 43,
                      'White Sox', 26, 55) %>%
  mutate(win_perc = w / (w + l),
         games_remaining = 162 - w - l,
         # Straight average of actual win% and .500 (regression toward average)
         weighted_win_perc = (.500 + win_perc) / 2)

al_central %>%
  select(-weighted_win_perc) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = c('win_perc'), decimals = 3)
team       w   l   win_perc  games_remaining
Tigers     51  31  0.622     80
Guardians  40  39  0.506     83
Twins      39  42  0.481     81
Royals     38  43  0.469     81
White Sox  26  55  0.321     81

OK, that’s our baseline - these games have been played and will be accounted for in our final numbers. Let’s now simulate the rest of the season.

al_central_simmed <- al_central %>%
  group_by(team) %>%
  mutate(
    # 10,000 simulated rest-of-season win totals for each team
    simulations = list(rbinom(n = 10000, size = games_remaining, prob = weighted_win_perc))
  ) %>%
  unnest(simulations) %>% # Expand the list column into one row per simulation
  ungroup() %>%
  rename(ros_wins = simulations) %>%
  mutate(projected_wins = w + ros_wins,
         ros_projected_win_perc = ros_wins / games_remaining) %>%
  group_by(team) %>%
  mutate(sim_id = row_number()) %>% # Number each team's sims 1 to 10,000
  group_by(sim_id) %>%
  # Rank teams within each sim; tied teams share the better rank
  mutate(ranking = rank(-projected_wins, ties.method = "min")) %>%
  ungroup()
 
al_central_simmed %>%
  ggplot(aes(x = projected_wins, color = team))+
  geom_density()+
  theme_ipsum()+
  # Team colors in alphabetical order: Guardians, Royals, Tigers, Twins, White Sox
  scale_color_manual(values = c("#E31937",  "#004687", "#FA4616", "#002B5C",  "#27251F" ))

As a Tigers fan, this is encouraging - the average number of wins is north of 90, and far enough away from the rest of the division that we’re relatively comfortable. (Although their surge last year from 0.6% playoff odds to 100% is a reminder that unexpected things happen in baseball.) If you’re a White Sox fan, well, at least it’s better than last year.
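
The centers and widths of those curves can be cross-checked without simulating anything, since a binomial has a closed-form mean and standard deviation. A minimal sketch in base R, under the same assumptions as the sim (the `standings` data frame just re-enters the records from the table above, and the column names are mine):

```r
# Closed-form mean and sd of final wins, same assumptions as the simulation
standings <- data.frame(
  team = c("Tigers", "Guardians", "Twins", "Royals", "White Sox"),
  w    = c(51, 40, 39, 38, 26),
  l    = c(31, 39, 42, 43, 55)
)

standings$games_remaining   <- 162 - standings$w - standings$l
standings$weighted_win_perc <- (0.500 + standings$w / (standings$w + standings$l)) / 2

# Binomial mean is n * p, added to the wins already banked
standings$expected_wins <- standings$w +
  standings$games_remaining * standings$weighted_win_perc

# Binomial standard deviation is sqrt(n * p * (1 - p))
standings$sd_wins <- sqrt(
  standings$games_remaining * standings$weighted_win_perc *
    (1 - standings$weighted_win_perc)
)

standings[, c("team", "expected_wins", "sd_wins")]
```

Every team’s standard deviation lands between four and five wins, so the Tigers’ lead over the field is worth several standard deviations of rest-of-season luck.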

One thing this doesn’t tell us is how often the Tigers win the division. Just because they win 85 games in a sim doesn’t mean someone else didn’t win 86, and just because Cleveland wins 93 doesn’t mean Detroit has fewer wins. What we’ll do here is number our simulations and see who has the most wins in each one. Let’s look at our first sim as an example.

al_central_simmed %>%
  ungroup() %>%
  filter(sim_id == 1) %>%
  select(team, w, l, games_remaining, ros_projected_win_perc, projected_wins) %>%
  arrange(desc(projected_wins)) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = c('ros_projected_win_perc'), decimals = 3)
team       w   l   games_remaining  ros_projected_win_perc  projected_wins
Tigers     51  31  80               0.588                   98
Guardians  40  39  83               0.627                   92
Twins      39  42  81               0.556                   84
Royals     38  43  81               0.407                   71
White Sox  26  55  81               0.494                   66

In this example, the Tigers play .588 ball in their remaining 80 games, finishing with 98 wins, 6 up on the second-place Guardians, who do really well in this sim. This seems plausible, but a single sim can’t tell us how likely it is. It’s a good thing we did this 10,000 times so we can average our results.

al_central_simmed %>%
  group_by(team) %>%
  mutate(won_division = ifelse(ranking == 1, 1, 0)) %>%
  summarise(avg_projected_wins = mean(projected_wins),
            projected_10th = quantile(projected_wins, .10),
            projected_90th = quantile(projected_wins, .90),
            won_division = mean(won_division)) %>%
  arrange(desc(avg_projected_wins)) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = c('avg_projected_wins'), decimals = 1) %>%
  fmt_percent(columns = c('won_division'), decimals = 1)
team       avg_projected_wins  projected_10th  projected_90th  won_division
Tigers     95.8                90              101             98.7%
Guardians  81.8                76              88              1.7%
Twins      78.8                73              84              0.3%
Royals     77.2                71              83              0.1%
White Sox  59.3                54              65              0.0%
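
A side note on the won_division column: it adds up to a bit more than 100%. That comes from ties. The rankings use ties.method = "min", so when two teams finish a sim with the same win total they both get rank 1. A small illustration with made-up win totals (not taken from any actual sim):

```r
# Made-up win totals for one sim in which two teams tie for first
projected_wins <- c(Tigers = 90, Guardians = 90, Twins = 84)

# ties.method = "min" gives every tied team the best (lowest) rank
sim_ranking <- rank(-projected_wins, ties.method = "min")
sim_ranking

# Both tied teams count as division winners in this sim
sum(sim_ranking == 1)
```

If I wanted the column to sum to exactly 100%, I would need a tiebreaker rule, such as splitting the credit between the tied teams. For a model this simple I’m happy to let both teams claim it.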

This model has the Tigers winning an average of 96 games and finishing first (or tied for first) in the Central in 98.7% of sims. Their 10th percentile projected wins is 90, which exceeds the 90th percentile for every other team. So, it would require a combination of some other team playing red-hot ball and the Tigers playing poorly for them to not win the division. For instance:

al_central_simmed %>%
  group_by(sim_id) %>%
  mutate(tigers_won = max(ifelse(ranking == 1 & team == 'Tigers', 1, 0))) %>%
  filter(tigers_won == 0) %>%
  ungroup() %>%
  filter(sim_id == min(sim_id)) %>%
  select(team, w, l, games_remaining, ros_projected_win_perc, projected_wins) %>%
  arrange(desc(projected_wins)) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(columns = c('ros_projected_win_perc'), decimals = 3)
team       w   l   games_remaining  ros_projected_win_perc  projected_wins
Twins      39  42  81               0.679                   94
Tigers     51  31  80               0.512                   92
Guardians  40  39  83               0.530                   84
Royals     38  43  81               0.407                   71
White Sox  26  55  81               0.420                   60
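
To see how much has to go right (or wrong) for a result like this, we can ask the binomial distribution directly with pbinom(). This is a rough sketch using the same regressed win rates as the model. The variable names are mine, and the win counts come from the table above: 94 - 39 = 55 rest-of-season wins for the Twins, 92 - 51 = 41 for the Tigers.

```r
# Rest-of-season win probabilities used by the model (regressed toward .500)
twins_p  <- (0.500 + 39 / 81) / 2
tigers_p <- (0.500 + 51 / 82) / 2

# P(Twins win 55 or more of their remaining 81 games)
p_twins_hot <- pbinom(54, size = 81, prob = twins_p, lower.tail = FALSE)

# P(Tigers win 41 or fewer of their remaining 80 games)
p_tigers_cool <- pbinom(41, size = 80, prob = tigers_p)

c(twins_55_plus = p_twins_hot, tigers_41_or_fewer = p_tigers_cool)

# Teams are simulated independently, so the chance of both is the product
p_twins_hot * p_tigers_cool
```

The Tigers cooling off that much is not especially rare on its own. The Twins getting that hot is the part doing most of the work.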

In this sim, the Twins play .679 ball the rest of the year while the Tigers fall to .512. Could it happen? Sure. But the Twins finished first in only 0.3% of sims. That doesn’t make it impossible - the model still says there’s a chance - but it is pretty unlikely. Seeing unlikely things is part of the appeal of baseball. Still, the Tigers winning the Central is far and away the most likely outcome, which means it’s going to be a fun fall.