We’re almost 60 games into the MLB season. A lot can happen in 60 games that may not happen in the next 100, but it still gives us a fairly decent sample. One of the fun* questions to ask is “how balanced is a team’s lineup?”
*You may have a more interesting definition of fun than I do.
To answer this, I downloaded batting leader data from FanGraphs, then aggregated by team. I chose wRC+ as my metric of interest, since it adjusts for ballpark and league. Since we want to know how balanced a team is, we’ll take each team’s overall wRC+, as well as the standard deviation of its players’ wRC+, weighted by plate appearances (PA). This means, for instance, that we can still include Wenceel Pérez and his frankly crazy 257 wRC+ (i.e. he’s 157% better than league average) for the Tigers, but since he only has 11 PA, it won’t count for much in either the team total or the standard deviation.
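As a rough sketch of that aggregation step, here is one way to compute a PA-weighted mean and standard deviation per team in base R. The data frame, its column names, the team labels, and every number in it are made up for illustration; they are not the FanGraphs export, and the weighted-SD formula shown is one common convention rather than necessarily the one used for the chart.

```r
# Hypothetical player-level data: names and values are illustrative only.
players <- data.frame(
  team     = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  pa       = c(250, 200, 11, 240, 220, 150),
  wrc_plus = c(120, 95, 257, 105, 100, 98)
)

# PA-weighted mean: each player counts in proportion to plate appearances.
weighted_mean <- function(x, w) sum(w * x) / sum(w)

# PA-weighted standard deviation around the weighted mean
# (population form; other weighted-variance conventions exist).
weighted_sd <- function(x, w) {
  mu <- weighted_mean(x, w)
  sqrt(sum(w * (x - mu)^2) / sum(w))
}

# One row per team with the weighted mean and weighted SD of wRC+.
team_stats <- do.call(rbind, lapply(split(players, players$team), function(d) {
  data.frame(
    team         = d$team[1],
    avg_wrc_plus = weighted_mean(d$wrc_plus, d$pa),
    sd_wrc_plus  = weighted_sd(d$wrc_plus, d$pa)
  )
}))
print(team_stats)
```

In the made-up team "AAA", the 11-PA hitter with a 257 wRC+ barely moves the weighted mean, whereas an unweighted mean would be pulled far toward him.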
Speaking of standard deviation - an SD of 10 means something very different for a team with a mean of 110 than for a team with a mean of 90. To account for this, we’ll use the coefficient of variation (COV), which is the standard deviation divided by the mean*.
*I hope you were sitting down when you read this, because this is exciting stuff.
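To make the 110-versus-90 comparison concrete, here is a tiny sketch of the arithmetic. The helper name `cov_ratio` is hypothetical (chosen so it doesn't mask R's built-in `cov()`, which computes covariance).

```r
# Coefficient of variation: standard deviation relative to the mean.
cov_ratio <- function(sd, mean) sd / mean

cov_ratio(10, 110)  # SD of 10 on a mean of 110: about 0.091
cov_ratio(10, 90)   # same SD on a mean of 90: about 0.111
```

The same 10 points of spread amounts to a larger share of the weaker lineup’s average, which is the difference the COV is meant to capture.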
A wRC+ of 100 means exactly league average, considering both on-base percentage and power (using linear weights). A low COV means a team’s hitters tend to cluster around the team mean, while a high COV means they’re spread out. This measures performance, not talent - so if a player has been lucky for 60 days and is due for a bounceback, that won’t show here. Without further ado, a chart:
A few things jump out here: first, the Rockies are just horrible. A team wRC+ of 64 is abysmal, and while the lineup is highly variable, only one player with significant playing time, Jordan Beck, has a wRC+ over 100. Catcher Jacob Stallings has a wRC+ of 9 - 9! - in 88 PA, with a .152/.230/.190 line.
Next, the Dodgers and Yankees are clearly having good years, and while their variation is higher, that’s easily explainable. Aaron Judge, who may very well be a robot, has a wRC+ of 240. That dwarfs Paul Goldschmidt’s 156, which is an excellent number any player would be happy to have. The Dodgers have Shohei Ohtani (187) and Freddie Freeman (192), with Will Smith trailing with “only” 177. Max Muncy is at 96, which is a little disappointing given his past performance, although his xwOBA of .339 suggests better days ahead.
Finally, my two favorite teams. The Tigers have a wRC+ of 110, but it’s one of the more balanced lineups out there. Justyn-Henry Malloy (94) and Trey Sweeney (75) are the only players with over 100 PA and a wRC+ under 100. The Cubs, meanwhile, have no player with over 100 PA and a wRC+ under 100, making their lineup a tough one to contend with.
Of course, we really care about runs in the end. We can run a very quick linear model to see if team total wRC+ or consistency is more important.
lm_runs <- lm(r ~ avg_wrc_plus + cov_wrc_plus, data = team_stats)
summary(lm_runs)
Call:
lm(formula = r ~ avg_wrc_plus + cov_wrc_plus, data = team_stats)

Residuals:
    Min      1Q  Median      3Q     Max
-27.570 -13.542  -4.804  11.701  38.552

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  -35.1722    50.5960  -0.695    0.493
avg_wrc_plus   2.6394     0.3415   7.729  2.6e-08 ***
cov_wrc_plus  47.4837    54.5826   0.870    0.392
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 18.62 on 27 degrees of freedom
Multiple R-squared: 0.7846, Adjusted R-squared: 0.7687
F-statistic: 49.18 on 2 and 27 DF, p-value: 9.961e-10
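To put those coefficients in concrete terms, here is a small sketch that plugs the fitted estimates into the model equation by hand. The coefficient values are copied from the output above; the helper `predict_runs` is hypothetical, and the COV of 0.5 is a placeholder I picked for illustration, since the post doesn’t list team COVs or say whether they’re stored as ratios or percentages.

```r
# Coefficient estimates copied from the model summary.
coefs <- c(intercept = -35.1722, avg_wrc_plus = 2.6394, cov_wrc_plus = 47.4837)

# Fitted runs for a given team wRC+ and COV under this linear model.
predict_runs <- function(avg_wrc_plus, cov_wrc_plus) {
  unname(coefs["intercept"] +
    coefs["avg_wrc_plus"] * avg_wrc_plus +
    coefs["cov_wrc_plus"] * cov_wrc_plus)
}

# The COV of 0.5 is a made-up value, held fixed so only wRC+ changes.
runs_100 <- predict_runs(100, 0.5)
runs_110 <- predict_runs(110, 0.5)
runs_110 - runs_100  # ten points of team wRC+ at a fixed COV
```

Because the model is linear, that difference is just ten times the wRC+ coefficient, or roughly 26 runs over the games played so far, whatever COV is held fixed.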
So the total carries more weight than the distribution. Team wRC+ has a statistically significant relationship with runs scored, while the COV does not (p = 0.392). This is a very simple model, though, so it may be missing some factors. Speed and baserunning, for instance, would be pretty important factors that we’re not considering here.
Still, it’s not totally without its uses. It helps us see which teams are more reliant on certain players to carry the offense and which may be more resilient to injuries.