Rehashing Runs Created and OPS
I devoted far too many words in the first incarnation of Walk Like a Sabermetrician to discussing the properties of OPS, its relationship to team runs scored, and my issues with the way it is marketed and utilized by sabermetrically-inclined people. I believe this is the first time I have felt compelled to address OPS here, but the impetus comes from the godfather himself. The 2023 Bill James Handbook contains a short piece authored by Bill James and titled “OPS and Runs Scored”.
The crux of the short piece is that James was approached sometime in the 1980s by sabermetricians who were distributing a signed statement to baseball media stating that they endorsed the use of OPS as a “standard measure of a hitter’s effectiveness”. (Incidentally, as something of a sabermetric historian, I would love to see a copy of this document. Given the timeframe, one has to imagine that Pete Palmer would have been involved, as he was a booster of OPS in The Hidden Game of Baseball and subsequent works.)
James did not sign the statement, but says that one of the arguments that they made to him in appealing for his signature was that “OPS...bore a straight-line relationship to runs scored”. James goes on to point out that, having finally studied the issue in the present day, he found that OPS does not have a 1:1 relationship with runs scored, but rather a square relationship (e.g. if a team’s OPS is 10% higher than league average, they can expect to score 21% more runs than league average, since 1.10 squared is 1.21).
This is all well and good. James is absolutely correct about the relationship of OPS to runs scored, and were I in his shoes I also would have declined to endorse OPS as a standard measure for the many reasons that I have articulated in the aforementioned posts. However, one line in the article perpetuates a common misunderstanding that repeats in almost every forum in which OPS and OPS+ are discussed:
“And if a hitter has an OPS+ of 110, he is not creating 10% more runs than an average hitter, he is creating 21% more runs than an average hitter.”
This is incorrect. It would be true if OPS+ was calculated as OPS/LgOPS, but from The Hidden Game of Baseball on, OPS+ (whether called Normalized OPS or Production+ or OPS+ or aOPS or whatever as the names have changed slightly with each Pete Palmer-authored publication in which the metric has been included) has been calculated as OBA/LgOBA + SLG/LgSLG – 1. This formulation does in fact have a 1:1 relationship with runs scored.
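To make the distinction concrete, here is a minimal sketch in Python of the two ways one might normalize OPS. The function names are mine, the sample hitter is hypothetical (exactly 10% above league in both OBA and SLG), and the league figures are the 2022 MLB averages of .312 and .395. The squared-ratio rule is James’ finding as described above, not something I have fitted here.

```python
def ops_ratio(oba, slg, lg_oba, lg_slg):
    """Naive 'X+' normalization: OPS divided by league OPS."""
    return (oba + slg) / (lg_oba + lg_slg)


def ops_plus(oba, slg, lg_oba, lg_slg):
    """OPS+ as actually defined: OBA/LgOBA + SLG/LgSLG - 1."""
    return oba / lg_oba + slg / lg_slg - 1


LG_OBA, LG_SLG = 0.312, 0.395  # 2022 MLB averages

# Hypothetical hitter exactly 10% better than league in both components.
oba, slg = 1.10 * LG_OBA, 1.10 * LG_SLG

ratio = ops_ratio(oba, slg, LG_OBA, LG_SLG)
plus = ops_plus(oba, slg, LG_OBA, LG_SLG)

print(f"OPS/LgOPS     = {ratio:.3f}")
print(f"(OPS/LgOPS)^2 = {ratio ** 2:.3f}")
print(f"OPS+          = {plus:.3f}")
```

For this hitter the ratio is 1.10, its square is 1.21, and the OPS+ is 1.20. In other words, an OPS+ of 120 corresponds to roughly 21% more runs than average, not the 44% that squaring 1.20 would suggest.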
It is easy to see why this causes confusion, as people are conditioned to expect that a metric called “X+” will be X divided by the league average (except in the case of ERA+ where they expect league average divided by X even though it results in a metric with very misleading properties). When people learn that they have been mistaken about how OPS+ is calculated, they often call for the formula to be changed to use OPS/LgOPS, but of course this would make true James’ claim about the relationship of OPS+ to runs scored and thus make the scale of the metric less useful. One could make the formula 2*OPS/LgOPS – 1 and preserve the 1:1 relationship, but doing so would reduce the accuracy of OPS+ as a predictor of runs scored. This is due to the fact that the most accurate version of adding OBA and SLG to estimate runs scored does not give them equal weight but rather gives significantly more weight (I don’t want to go down that path right now, so let’s just call it something in the neighborhood of 1.8) to OBA. By dividing the two averages by the league average and then adding, OPS+ uses an implicit weight of more than 1.0 for OBA since LgSLG is in any conceivable modern major league context greater than LgOBA (in 2022, the overall MLB OBA was .312 and SLG was .395 for an effective OBA weight of 1.27). This does not maximize accuracy, but it does move in the right direction.
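The implicit weight can be checked with a few lines of Python. This is only a sketch: the function names are my own, and the sample hitter’s OBA and SLG are made up for illustration. Multiplying OBA/LgOBA + SLG/LgSLG through by LgSLG shows that it is equivalent to (w*OBA + SLG)/LgSLG with w = LgSLG/LgOBA.

```python
def implicit_oba_weight(lg_oba, lg_slg):
    """Weight on OBA (relative to 1.0 on SLG) implied by OPS+."""
    return lg_slg / lg_oba


def ops_plus_weighted(oba, slg, lg_oba, lg_slg):
    """OPS+ rewritten as a weighted sum of OBA and SLG over LgSLG."""
    w = implicit_oba_weight(lg_oba, lg_slg)
    return (w * oba + slg) / lg_slg - 1


LG_OBA, LG_SLG = 0.312, 0.395  # 2022 MLB averages

w = implicit_oba_weight(LG_OBA, LG_SLG)
print(f"implicit OBA weight = {w:.2f}")

# Made-up hitter: the rewritten form matches the standard definition.
oba, slg = 0.350, 0.480
direct = oba / LG_OBA + slg / LG_SLG - 1
assert abs(ops_plus_weighted(oba, slg, LG_OBA, LG_SLG) - direct) < 1e-12
```

The weight depends only on the league averages, so it drifts from season to season with the run environment rather than being a fixed design choice.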
You may be wondering where the Runs Created in the title of this post comes from; it is from a separate article by James in the Handbook titled “A Runs Created Method for the Manfred Era”. James notes that due to the extra inning rules, Runs Created was systematically underestimating team runs scored. This is a very important observation, and one that I have been endeavoring to get into the consciousness of sabermetricians and websites that publish data, as statistics polluted by extra innings cannot be used as inputs to standard metrics without causing distortions of the type noted by James. This issue is not in any way isolated to the Runs Created formula. Ideally, I would like to be able to easily remove all extra innings from team totals at least for the purpose of calibrating metrics on the traditional scales. I am not aware of any easy way to do this – the best I can think of using publicly available data sources is to comb through Retrosheet play-by-play data and remove them. Unfortunately, I have not had the time or motivation to do this, and thus am using metrics that I am no longer able to easily calibrate to changing major league conditions of play.
It is not entirely clear from James’ article whether the changes he made to the Runs Created formula were designed to make Runs Created a predictor for team runs scored including the phony runners, or whether the formula’s poor performance inspired him to tinker with it independent of correcting for phony runners, with the phony runners being an additional source of inaccuracy that he noticed while revamping the formula. This ambiguity means that I will not attempt to verify the accuracy of the formula compared to other run estimators, as I’m not sure what the fair standard should be based on what the formula was attempting to accomplish. I will instead focus on analyzing what the new formula says about the relative value of offensive events.
The new Runs Created formula is (note DP is actually GIDP, I am just loath to have abbreviations longer than two letters, although I do for catcher’s interference so I’m really just making up my abbreviation scheme as I go):
A = .9(H + W + HB + INT – CS) – DP
B = TB + .4(SB + SH + SF) + .25(W – IW) + 1.71HR
C = PA – SH
RC = A*B/C
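For anyone who wants to play with the formula, here is a minimal Python sketch of it. The class and function names are my own invention, the field names simply follow the abbreviations above, and the team line at the bottom is made up for illustration (it is not a real club).

```python
from dataclasses import dataclass


@dataclass
class TeamBatting:
    """Team batting totals, using the abbreviations from the formula."""
    PA: int
    H: int
    TB: int
    HR: int
    W: int
    IW: int
    HB: int
    INT: int  # catcher's interference
    SB: int
    CS: int
    SH: int
    SF: int
    DP: int  # grounded into double play (GIDP)


def rc_factors(t):
    """Return the (A, B, C) factors of the 2022 formula."""
    a = 0.9 * (t.H + t.W + t.HB + t.INT - t.CS) - t.DP
    b = t.TB + 0.4 * (t.SB + t.SH + t.SF) + 0.25 * (t.W - t.IW) + 1.71 * t.HR
    c = t.PA - t.SH
    return a, b, c


def runs_created(t):
    """RC = A*B/C."""
    a, b, c = rc_factors(t)
    return a * b / c


# Invented team line, for illustration only.
team = TeamBatting(PA=6100, H=1400, TB=2300, HR=180, W=500, IW=20, HB=60,
                   INT=2, SB=90, CS=30, SH=20, SF=40, DP=110)
print(f"RC = {runs_created(team):.1f}")
```

Returning the three factors separately is deliberate, since the totals of A, B, and C are what is needed to back out the linear weight of each event.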
I will be focusing on the team RC formula, but I should note that the theoretical team method that is used to calculate individual player RC is unchanged.
This is the fourth iteration of what James used to call the “technical” RC formula (actually, the fifth, but the 1983 version was never fully implemented and quickly superseded). The first was published in 1984, the second in 1998, the third in 2004, and the fourth now in 2022. James is correct that there are slight changes in the game that may necessitate adjustment to run estimator coefficients, so it is not necessarily fair to compare all of these as they apply in 2022, but I think doing so is instructive to note how the construct has changed over time even if some of the changes to weights are due to conditions when the formulas were developed.
To make it easier to compare and to calculate the intrinsic linear weights that these various formulas imply given the 2022 major league batting totals, the table below shows each granular-level event included in the formulas (e.g. single is shown as opposed to the category of hits, which includes four distinct events) and the weight it is given in the A, B, and C factors of each of the formulas, along with the number of each of the events in 2022:
The “Total Factor Value” shown at the bottom of each column is the total of A, B, and C for the 2022 majors. These values can be used along with the coefficients for each of the events in each of the factors (which I will call a, b, and c) to calculate the intrinsic linear weight that each event is assigned when calculating RC for the 2022 major league totals. The formula for this is:
LW = (A/C)*b + (B/C)*a – (A/C)*(B/C)*c
If you want more details on this calculation, please see my article in the November 2005 edition of By the Numbers.
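The formula is just the partial derivative of A*B/C with respect to the count of an event whose coefficients in the three factors are a, b, and c. A quick way to convince yourself is the Python sketch below, which compares the formula against a numerical derivative. The factor totals are placeholders that I made up (they are not the 2022 major league totals), and the function names are hypothetical; the home run coefficients are read off the new formula (a = .9 as a hit, b = 4 total bases plus the 1.71 bonus, c = 1 plate appearance).

```python
def intrinsic_lw(A, B, C, a, b, c):
    """Intrinsic linear weight: LW = (A/C)*b + (B/C)*a - (A/C)*(B/C)*c."""
    return (A / C) * b + (B / C) * a - (A / C) * (B / C) * c


def numeric_lw(A, B, C, a, b, c, h=1e-4):
    """Central-difference estimate of the change in A*B/C per added event."""
    up = (A + a * h) * (B + b * h) / (C + c * h)
    down = (A - a * h) * (B - b * h) / (C - c * h)
    return (up - down) / (2 * h)


# Placeholder factor totals, NOT the 2022 major league values.
A, B, C = 1628.8, 2787.8, 6080.0

# Home run in the new formula: a = .9, b = 4 + 1.71, c = 1.
hr = dict(a=0.9, b=4 + 1.71, c=1.0)

exact = intrinsic_lw(A, B, C, **hr)
approx = numeric_lw(A, B, C, **hr)
print(f"formula: {exact:.4f}  numerical: {approx:.4f}")
```

The two figures should agree to several decimal places for any event and any set of totals, which is all the formula claims.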
The results are:
What immediately jumps off the page is the massive weight given to the home run, which marks a big change from the 2004 formula. That formula was the first to attempt to weight the hit types more precisely than simply using total bases. Now James has reverted to using total bases and applying a huge bonus for a home run. The treatment of the home run is the fundamental design difference (although there are others) between Runs Created and David Smyth’s Base Runs, which is in my opinion a much better model of the run scoring process as it recognizes that a home run always produces at least one run and does not in fact add a baserunner.
Another item of note is that while the two previous iterations had a slight penalty for a strikeout compared to other outs, that has been removed. I do not disbelieve that over some sample of data, the new formula may be more accurate. But given the massive weight on the home run, I am very skeptical that any performance improvement would be persistent as additional years are added to the test data.