Left on Base
The question first came up, I believe, in May of 1989.
In baseball, our Pirates were leading the league in runners Left On Base. We bemoaned this fact; through poor clutch hitting, we were losing games we should have won.
But a large Left On Base total doesn't necessarily mean that a team is hitting poorly. If, during a game, a team gets twenty runners aboard and scores half of them, it's done much better than usual in the scoring department but has still left an above-average number of runners on the bases. And a weak-hitting team that never gets anybody on will never leave anybody on, either.
So which is it? Is it good or bad to be leading the league in Left On Base?
To find out, while also discovering other relationships between the various team statistics, I set up a program on my home computer to figure the Coefficients of Correlation among some fifty categories. I've analyzed the end-of-season data from the National and American Leagues for the 1989 and 1990 seasons, giving me four Coefficients for each pair of categories (for example, Triples versus Wins).
(A mathematical explanation is in order. Calculating the Coefficient of Correlation assumes a linear relationship between the two categories. For example, every eight Triples might give a team one game in the Win column. The Coefficient then measures how well the actual data fit this ideal relationship. A Coefficient of +1.00 means a perfect fit: the data fall exactly on a rising straight line, so more Triples always means more Wins. A Coefficient of zero means there's no linear fit at all: knowing the number of Triples tells you nothing about the number of Wins. A negative Coefficient means that one of the categories is going up while the other goes down, with -1.00 being a perfect fit in that direction: for example, the Coefficient between Wins and ERA is -0.73, which means that the more Wins a team has, the lower its ERA is likely to be.)
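For readers who want to try the calculation, here is a minimal sketch of the standard (Pearson) Coefficient of Correlation in Python. It is not the program described above, whose details aren't given; the function name and the sample numbers are made up purely for illustration.

```python
import math

def correlation(xs, ys):
    """Pearson Coefficient of Correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (the covariance term).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (the variance terms).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a category with no variation has no defined correlation")
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers for five imaginary teams (NOT real team totals).
triples = [24, 32, 40, 48, 56]
wins = [70, 71, 72, 73, 74]        # exactly one extra Win per eight Triples
era = [4.4, 4.2, 4.0, 3.8, 3.6]    # falls steadily as Wins rise

print(round(correlation(triples, wins), 2))   # 1.0
print(round(correlation(wins, era), 2))       # -1.0
```

Real team totals never line up this neatly, which is why the Coefficients reported here fall somewhere between the two extremes.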
Of course, just because two categories are strongly correlated does not necessarily mean that there's a reason behind it. In the NL'90 stats, Batting Average had a +0.75 correlation with Winning Percentage (WL%) on the Road. But this is probably a coincidence. Why? For one thing, Batting Average vs Home WL% was only -0.01, and I can't think of a good reason why a high average would lead to wins on the road but not at home. For another, Batting Average vs WL% generally, across both leagues and both seasons, home and away, averaged only +0.41.
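The coincidence problem can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not a reconstruction of the analysis: it supposes a twelve-team league, fills two "categories" with unrelated random numbers, and counts how often the Coefficient still reaches 0.40 in either direction, first for a single league-season and then for an average of four. The names `TEAMS`, `TRIALS`, and `THRESHOLD` are illustrative choices.

```python
import random

def correlation(xs, ys):
    """Pearson Coefficient of Correlation between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def random_coefficient(rng, teams):
    """Coefficient between two categories that are unrelated by construction."""
    xs = [rng.random() for _ in range(teams)]
    ys = [rng.random() for _ in range(teams)]
    return correlation(xs, ys)

def share_at_least(values, threshold):
    """Fraction of coefficients whose size reaches the threshold."""
    return sum(abs(v) >= threshold for v in values) / len(values)

TEAMS = 12        # assumed league size
TRIALS = 2000     # number of simulated league-seasons (or sets of four)
THRESHOLD = 0.40  # "looks meaningful" cutoff, chosen for illustration

rng = random.Random(1990)  # fixed seed so the run is repeatable

# One league-season at a time.
single = [random_coefficient(rng, TEAMS) for _ in range(TRIALS)]

# Average of four league-seasons (two leagues, two years).
averaged = [
    sum(random_coefficient(rng, TEAMS) for _ in range(4)) / 4
    for _ in range(TRIALS)
]

print("single league-season:", share_at_least(single, THRESHOLD))
print("average of four:     ", share_at_least(averaged, THRESHOLD))
```

In this toy setup the single-season share should come out well above the averaged one, which is the reason for preferring figures averaged over both leagues and both seasons. Real baseball categories are not independent random numbers, so the exact shares mean little; only the comparison matters.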
With that caveat, what statistical categories were most strongly correlated to a high WL% over the two-year period?
Stating the Obvious
The top two were not surprising: WL% in Night games and against Right-Handed Pitchers, both at +0.91. Most games are played at night, and most are against right-handers, so these records make up the bulk of the overall record.
The next two were WL% at Home and on the Road, both at +0.86. The better a team, the better its record both home and away.
Other correlations between overall WL% and WL% under certain conditions were as follows: on Grass, +0.80; Day games, +0.68; against Left-Handed Pitchers, +0.67; on Artificial surfaces, +0.64; in One-Run games, +0.62; and in Extra Innings, +0.35. Only the last is surprisingly low. Apparently the sudden-death nature of extra innings leads to random results, in which the better team often doesn't win.
Among the more common statistical categories, it appears that pitchers have a greater influence on winning than hitters do.
The strongest correlations here are Runs Allowed (-0.77) and Earned Run Average (-0.73). Saves are next at +0.66; that's natural, since your team can't have a save unless it has a win. Then come Hits Allowed (-0.58) and Unearned Runs (-0.55). And Shutouts correlate at +0.47.
The top offensive category is Runs Batted In at +0.57. Then come On-Base Percentage (+0.52), Slugging Percentage (+0.45), and Batting Average (+0.41). Average Salary has as much connection to winning as Batting Average does, correlating at +0.42.
As the Coefficients get smaller, the connection between categories and winning gets weaker. On defense, both Walks Issued and Errors contribute to losing (-0.39). On offense, Hits are a plus at +0.33. Next come some "speed" categories: Stolen Base Percentage at +0.28, Sacrifice Bunts at +0.27, and Sacrifice Flies at +0.26. (The numbers in the National League only are +0.35, +0.50, +0.33, while in the American League they're +0.22, +0.04, +0.20. These categories, especially bunting, are more essential to winning in the NL.)
Continuing down the list: Home Runs Allowed -0.26, Triples +0.24, Stolen Bases +0.23, Home Runs Hit +0.22, and Complete Games +0.20 (Complete Games are +0.31 in the NL but only +0.09 in the AL).
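The league splits quoted above can be checked against the combined figures with a little arithmetic. The sketch below assumes that each combined Coefficient is a simple average of the NL and AL figures (an assumption; the text only says the figures were averaged) and that the published numbers were rounded to two places.

```python
# (NL, AL, combined) Coefficients versus WL%, as quoted in the text.
quoted = {
    "Stolen Base Percentage": (0.35, 0.22, 0.28),
    "Sacrifice Bunts":        (0.50, 0.04, 0.27),
    "Sacrifice Flies":        (0.33, 0.20, 0.26),
    "Complete Games":         (0.31, 0.09, 0.20),
}

def league_average(nl, al):
    """Simple average of the two league figures (assumed method)."""
    return (nl + al) / 2

for category, (nl, al, combined) in quoted.items():
    average = league_average(nl, al)
    gap = nl - al  # how much more the category matters in the NL
    print(f"{category:24s} average {average:+.3f}  "
          f"quoted {combined:+.2f}  NL-AL gap {gap:+.2f}")
```

Under that assumption every average lands within a rounding step of the quoted figure, and the NL-AL gap is by far the widest for Sacrifice Bunts.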
And the least important categories: Wild Pitches -0.20, Balks +0.18 (balks are good?), Doubles +0.18, Walks by Batters +0.16, Strikeouts by Pitchers +0.15 (surprisingly unimportant), Hit Batsmen -0.06, Strikeouts by Batters +0.05, and Left On Base -0.04. To answer the original question: Left On Base has almost nothing to do with winning.
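The ranking used throughout this section goes by the size of the Coefficient, ignoring its sign. As a rough sketch, here is how a handful of the figures quoted above could be sorted that way; the dictionary holds only a subset of the categories, and the function name is an illustrative choice.

```python
# Coefficients versus overall WL%, copied from the text (a subset only).
reported = {
    "Left On Base": -0.04,
    "Triples": 0.24,
    "Runs Batted In": 0.57,
    "Earned Run Average": -0.73,
    "Strikeouts by Pitchers": 0.15,
    "Saves": 0.66,
    "Errors": -0.39,
    "Runs Allowed": -0.77,
    "Batting Average": 0.41,
    "On-Base Percentage": 0.52,
}

def rank_by_strength(coefficients):
    """Sort categories from strongest to weakest link, ignoring sign."""
    return sorted(coefficients.items(),
                  key=lambda item: abs(item[1]),
                  reverse=True)

ranked = rank_by_strength(reported)
for name, r in ranked:
    trend = "rises with winning" if r > 0 else "falls with winning"
    print(f"{name:24s} {r:+.2f}  ({trend})")
```

Sorting on the absolute value is what puts Runs Allowed (-0.77) at the top and Left On Base (-0.04) at the bottom, even though both are negative.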
Until now, I've been reporting the Coefficients of Correlation between WL% and various other categories. Now let's look at the relationship between a few pairs of categories that don't include WL%.
For example, what does Left On Base correlate with? Walks +0.65, On-Base Percentage +0.55, Home Runs -0.40. Leading the league in Left On Base probably means you're getting a lot of walks, and it may mean you aren't clearing the bases with homers very often.
What are the most important factors in scoring Runs? Well, first of all, driving them in: RBI correlates at +0.99. Slugging Percentage is +0.88, which would imply that the long ball is important; but Hits are +0.63, Doubles +0.44, Triples +0.27, and Home Runs +0.41. Sacrifice Flies correlate at +0.51 because a run scores every time. What else affects run production, however weakly? Good things: Stolen Base Percentage +0.16, Walks by Batters +0.07, Stolen Bases +0.02. Bad things: Sacrifice Bunts -0.12 (a surprise), Strikeouts by Batters -0.12, Caught Stealing -0.11.
How about the defensive side of the equation, correlations with Runs Allowed? Unsurprisingly, Earned Run Average leads the list with +0.97. Allowing more runs: Unearned Runs +0.65, Walks Allowed +0.55 (those walks will kill you), Errors +0.42, Home Runs Allowed +0.41, Wild Pitches +0.28. Allowing fewer runs: Shutouts -0.60, Saves -0.50, Strikeouts by Pitchers -0.26.
Finally, what does money buy a team? What correlates most with Average Salary? The 1990 figures imply that it's pitching in the National League and slugging in the American.