Team wRC+ Balance

statistics
r
baseball
Author

Mark Jurries II

Published

May 23, 2025

We’re almost 60 games into the MLB season. A lot can happen in 60 games that may not happen in the next 100, but it still gives us a fairly decent sample. One of the fun* questions to ask is “how balanced is a team’s lineup?”

*You may have a more interesting definition of fun than I do.

To answer this, I downloaded batting leader data from Fangraphs, then aggregated by team. I chose wRC+ as my metric of interest, since it adjusts for ballpark and league. Since we want to know how balanced a team is, we’ll take their overall wRC+, as well as the standard deviation of wRC+, both weighted by plate appearances (PA). This means, for instance, that we can still include Wenceel Pérez and his frankly crazy 257 wRC+ (i.e. he’s 157% better than league average) for the Tigers, but since he only has 11 PA, it won’t count for much in either the team total or the standard deviation.

Speaking of standard deviation - an SD of 10 is very different for a team with a mean of 110 than it is for a team with a mean of 90. To account for this, we’ll use the coefficient of variation (COV), which is the standard deviation divided by the mean*.

*I hope you were sitting down when you read this, because this is exciting stuff.
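
To make the weighting concrete, here is a minimal base-R sketch on a made-up four-hitter lineup. The wRC+ and PA values are hypothetical (the last row just borrows the 257 wRC+ in 11 PA mentioned above), the helper name weighted_cov is my own, and the variance formula is what I believe Hmisc's wtd.var computes by default (PA treated as frequency weights, divided by total PA minus one).

```r
# Hypothetical lineup: three regulars plus one hot bat with only 11 PA.
toy <- data.frame(
  w_rc = c(120, 100, 80, 257),
  pa   = c(200, 200, 200, 11)
)

# Helper name is illustrative, not from any package.
weighted_cov <- function(x, w) {
  avg <- weighted.mean(x, w)
  # Frequency-weight variance: squared deviations weighted by PA,
  # divided by (total PA - 1).
  var_w <- sum(w * (x - avg)^2) / (sum(w) - 1)
  sd_w <- sqrt(var_w)
  c(mean = avg, sd = sd_w, cov = sd_w / avg)
}

toy_result <- weighted_cov(toy$w_rc, toy$pa)
print(round(toy_result, 2))
```

Here the 11-PA outlier lifts the weighted mean to roughly 103, versus about 139 for a plain average of the four wRC+ values.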

A wRC+ of 100 means exactly league average, considering both on-base percentage and power (using linear weights). A low COV means a team’s hitters tend to cluster around the team mean, while a high COV means they’re spread out. This measures performance, not talent - so if a player has been unlucky for 60 days and is due for a bounceback, that won’t show here. Without further ado, a chart:

library(gt)
library(gtExtras)
library(Hmisc)
library(hrbrthemes)
library(janitor)
library(mlbplotR)
library(tidyverse)

fg_stats <- read_csv('fangraphs-leaderboards.csv') %>%
  clean_names()

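# Aggregate player rows to one row per team, weighting each hitter's wRC+ by PA
team_stats <- fg_stats %>%
  group_by(team) %>%
  summarise(r = sum(r),
            avg_wrc_plus = weighted.mean(w_rc, pa),
            sd_wrc_plus = sqrt(wtd.var(w_rc, pa))) %>%
  # Coefficient of variation: weighted SD relative to the weighted mean
  mutate(cov_wrc_plus = sd_wrc_plus / avg_wrc_plus) %>%
  mutate(team = clean_team_abbrs(team)) %>%
  mutate(team = ifelse(team == 'ATH', 'OAK', team))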

team_stats %>%
  ggplot(aes(x = avg_wrc_plus, y = cov_wrc_plus))+
  geom_mlb_logos(aes(team_abbr = team), width = 0.055, alpha = 0.7)+
  theme_ipsum()+
  xlab('team_wrc+')+
  ylab('COV')

A few things jump out here: first, the Rockies are just horrible. A team wRC+ of 64 is abysmal, and while it’s highly variable, only one player with significant playing time, Jordan Beck, has a wRC+ over 100. Catcher Jacob Stallings has a wRC+ of 9 - 9! - in 88 PA, with a .152/.230/.190 line.

Next, the Dodgers and Yankees are clearly having good years, and while their variation is higher, that’s easily explainable. Aaron Judge, who may very well be a robot, has a wRC+ of 240. That dwarfs Paul Goldschmidt’s 156, which is an excellent number any player would be happy to have. The Dodgers have Shohei Ohtani (187) and Freddie Freeman (192), with Will Smith trailing with “only” 177. Max Muncy is at 96, which is a little disappointing given his past performance, although his xwOBA of .339 suggests better days ahead.

Finally, my two favorite teams. The Tigers have a wRC+ of 110, but it’s one of the more balanced lineups out there. Justyn-Henry Malloy (94) and Trey Sweeney (75) are the only players with over 100 PA with a wRC+ under 100. The Cubs, meanwhile, have no player with over 100 PA with a wRC+ under 100, making their lineup a tough one to contend with.

Of course, we really care about runs in the end. We can run a very quick linear model to see if team total wRC+ or consistency is more important.

lm_runs <- lm(r ~ avg_wrc_plus + cov_wrc_plus, data = team_stats)
summary(lm_runs)

Call:
lm(formula = r ~ avg_wrc_plus + cov_wrc_plus, data = team_stats)

Residuals:
    Min      1Q  Median      3Q     Max 
-27.570 -13.542  -4.804  11.701  38.552 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  -35.1722    50.5960  -0.695    0.493    
avg_wrc_plus   2.6394     0.3415   7.729  2.6e-08 ***
cov_wrc_plus  47.4837    54.5826   0.870    0.392    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 18.62 on 27 degrees of freedom
Multiple R-squared:  0.7846,    Adjusted R-squared:  0.7687 
F-statistic: 49.18 on 2 and 27 DF,  p-value: 9.961e-10

So the total carries more weight than the distribution. Team wRC+ has a statistically significant relationship with runs (p < 0.001), while the COV term does not (p = 0.39). This is a very simple model, though, so it may be missing some factors. Speed and baserunning, for instance, would be pretty important factors that we’re not considering here.
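
The raw coefficients are also on very different scales: avg_wrc_plus moves in whole points, while cov_wrc_plus only spans about 0.2 to 0.7. One way to compare them more directly is to standardize both predictors first. The sketch below does that with the rounded values from the table at the end of this post, so its numbers will differ slightly from the model above. The names team_tbl, avg_z, cov_z, and lm_runs_z are my own.

```r
# Values copied from the rounded table in this post (30 teams).
team_tbl <- data.frame(
  team = c("NYY", "LAD", "CHC", "AZ", "SEA", "DET",
           "NYM", "OAK", "PHI", "STL", "HOU", "BOS",
           "TOR", "SD", "TB", "WSH", "ATL", "BAL",
           "CIN", "MIN", "SF", "CLE", "MIA", "LAA",
           "MIL", "KC", "TEX", "CWS", "PIT", "COL"),
  r = c(310, 322, 332, 286, 256, 295,
        247, 242, 272, 270, 235, 280,
        232, 236, 240, 253, 230, 215,
        270, 228, 244, 224, 229, 228,
        252, 193, 196, 197, 185, 179),
  avg_wrc_plus = c(129, 125, 120, 114, 112, 110,
                   109, 108, 107, 106, 105, 105,
                   104, 101, 100, 98, 98, 97,
                   97, 96, 94, 94, 94, 92,
                   88, 83, 82, 81, 78, 64),
  cov_wrc_plus = c(0.39, 0.36, 0.26, 0.31, 0.35, 0.34,
                   0.26, 0.28, 0.27, 0.28, 0.22, 0.40,
                   0.28, 0.43, 0.39, 0.35, 0.39, 0.45,
                   0.39, 0.44, 0.35, 0.46, 0.33, 0.40,
                   0.41, 0.47, 0.44, 0.48, 0.42, 0.66)
)

# scale() centers each predictor and divides by its standard deviation,
# so each slope becomes "runs per one-SD change" instead of per raw unit.
team_tbl$avg_z <- as.numeric(scale(team_tbl$avg_wrc_plus))
team_tbl$cov_z <- as.numeric(scale(team_tbl$cov_wrc_plus))

lm_runs_z <- lm(r ~ avg_z + cov_z, data = team_tbl)
std_coefs <- coef(lm_runs_z)[c("avg_z", "cov_z")]
print(round(std_coefs, 1))
```

Each slope then reads as the expected change in runs for a one-standard-deviation change in that predictor, which puts the two on a common footing.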

Still, it’s not totally without its uses. It helps us see which teams are more reliant on certain players to carry the offense and which may be more resilient to injuries.

team_stats %>%
  arrange(desc(avg_wrc_plus)) %>%
  gt() %>%
  gt_theme_espn() %>%
  fmt_number(c(avg_wrc_plus), decimals = 0) %>%
  fmt_number(c(sd_wrc_plus, cov_wrc_plus), decimals = 2)

team    r  avg_wrc_plus  sd_wrc_plus  cov_wrc_plus
NYY   310           129        50.00          0.39
LAD   322           125        45.34          0.36
CHC   332           120        31.39          0.26
AZ    286           114        35.91          0.31
SEA   256           112        39.72          0.35
DET   295           110        37.99          0.34
NYM   247           109        28.48          0.26
OAK   242           108        30.13          0.28
PHI   272           107        29.30          0.27
STL   270           106        29.77          0.28
HOU   235           105        23.50          0.22
BOS   280           105        41.89          0.40
TOR   232           104        29.47          0.28
SD    236           101        43.36          0.43
TB    240           100        38.91          0.39
WSH   253            98        34.64          0.35
ATL   230            98        38.21          0.39
BAL   215            97        43.79          0.45
CIN   270            97        38.27          0.39
MIN   228            96        42.48          0.44
SF    244            94        32.95          0.35
CLE   224            94        42.95          0.46
MIA   229            94        30.54          0.33
LAA   228            92        36.47          0.40
MIL   252            88        35.99          0.41
KC    193            83        39.09          0.47
TEX   196            82        35.77          0.44
CWS   197            81        38.55          0.48
PIT   185            78        32.64          0.42
COL   179            64        42.02          0.66
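
To probe the reliance question for a single team, one option is a leave-one-out check: drop each hitter in turn and recompute the PA-weighted team wRC+. The sketch below uses a made-up lineup (player labels, wRC+ values, and PA are all hypothetical) and ignores who would actually absorb the missing plate appearances, so treat it as a rough illustration rather than an injury projection.

```r
# Hypothetical lineup: labels, wRC+ values, and PA are made up for illustration.
lineup <- data.frame(
  player = c("Star", "Regular A", "Regular B", "Regular C"),
  w_rc   = c(240, 105, 95, 90),
  pa     = c(250, 230, 220, 200)
)

# PA-weighted team wRC+ with everyone included.
team_wrc <- weighted.mean(lineup$w_rc, lineup$pa)

# Recompute the PA-weighted team wRC+ with each hitter removed in turn.
lineup$without <- sapply(seq_len(nrow(lineup)), function(i) {
  weighted.mean(lineup$w_rc[-i], lineup$pa[-i])
})

# Positive drop: the team figure falls when that hitter is removed.
lineup$drop <- team_wrc - lineup$without
print(lineup[order(-lineup$drop), ])
```

A lineup whose largest drop is small is leaning less on any one bat, which is the same idea the COV column summarizes in a single number.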