Tyler's Think Tank: Playing Moneyball (Again)

Could it be that sluggers like Nelson Cruz are undervalued? (TheSportsPost)

Introduction:

This study investigates the role that certain statistics play in determining the salaries of major league position players, with the overarching goal of determining whether some common measures of hitter performance are being valued properly and if there has been a change over time in how they influence salary. Michael Lewis’s 2003 book Moneyball revealed that for a long time, teams were not valuing offensive performance properly, resulting in a market failure. Smarter teams, like Billy Beane’s Oakland A’s and Theo Epstein’s Boston Red Sox, were thus able to gain a competitive edge by acquiring offensive talent cheaply on the open market. As the median cost of purchasing a free agent continues to escalate and the levels of run-scoring keep dwindling, it’s becoming increasingly important that teams allocate their finite financial resources wisely in order to stay competitive (Gaines, 2014). The Rays, A’s, and Pirates have shown it possible for small market teams to compete with wealthier teams through efficient payroll utilization, whereas large market teams like the Phillies, Mets, and Yankees have underperformed lately because they overpaid for poor players. The recent paucity of offensive production and available free agents makes accurate assessments of hitting talent even more imperative, as there’s simply not enough hitting to go around.

By looking at the effects of several traditional and newer statistics on salary, this study hopes to reveal market inefficiencies that could be taken advantage of by a general manager. The hypothesis is that sabermetrics that came into vogue more recently, such as Runs Created, wOBA, and OPS, will be undervalued whereas traditional metrics like the Triple Crown stats (batting average, home runs, and runs batted in) will be overvalued. Therefore, players who excel in sabermetric categories but have less impressive traditional statistics will be undervalued whereas players with strong traditional statistics but weak peripherals may be overvalued.

First, this paper will review background literature, including research studies, economic papers, and expert analysis, on some common and effective measures of hitting performance that were incorporated into the model. Then the variables and statistics that make up the model will be presented, describing what each one means and why it is important. After that the results of this study will be presented based on the 10 most recent years of data (2004-2013) from the Lahman baseball database, covering the post-Moneyball era as well as the post-steroid era, which saw more stable offensive production following the implementation of league-wide PED testing and penalties in 2004. These results will be compared to those from the years 1985 through 2003 to observe the Moneyball effect, seeing what changes, if any, have come about in how offensive performance is valued. This analysis will help determine what inefficiencies there are, if any, in the labor market for position players that a general manager could exploit. This study will then identify which statistics are not being valued properly and explain why, given their relationship with team success, they should be valued differently. Lastly, this paper expects to confirm the theory that aggregation is inappropriate when analyzing position players because the structure of salary rewards varies significantly between the four main groups of position players: infielders, outfielders, catchers, and designated hitters (Hakes and Sauer, 2006). Therefore, it is likely that the author will need to estimate disaggregate models for each type of position player as well.

Moneyball: an important book and great movie

Literature Review:

For a variety of reasons, baseball has become offense-starved in recent years. Runs per game have decreased steadily since 2006, falling from 4.86 per game per team that year to 4.07 in 2014, the lowest rate since 1969. In 2014 there wasn’t a single team that averaged 4.86 runs per game. There are many factors responsible for this downward trend. One is that hitters as a group are striking out more than ever due to increased acceptance of strikeout-prone hitters, a better-regulated (and lower) strike zone, and harder-throwing pitchers. In 2006, 16.8 percent of plate appearances ended in a strikeout; in 2014 that figure reached 20.4 percent. That means batters are putting the ball in play less frequently, and when they do, often it’s into the teeth of a defensive shift, hit towards better-positioned fielders[1]. Fewer balls in play translate to fewer hits, and fewer hits mean fewer home runs. Unsurprisingly, power is way down; there were 1,200 fewer home runs hit in 2014 compared to 2006, a 22.3 percent decrease that coincides with stricter performance enhancing drug penalties (Baseball-Reference, 2014). Pitchers also have the advantage of advanced scouting reports and video preparation to exploit batters’ weaknesses. All of these developments have combined to suffocate offense, resulting in the sport’s lowest scoring period since the pitching-dominated 1960s (a time commonly referred to as the second Deadball Era). Pitching and defense rule the game, meaning talented hitters are harder to come by these days (Schaal, 2014).

Because quality hitting is at such a premium, available bats on the free agent market are in high demand. Thanks to baseball’s recent influx of TV money and exploding revenues across the sport, teams are becoming richer than ever before. More teams can afford to spend money on free agents, thereby increasing the demand for them (Chen, 2012). In the 2013-2014 offseason alone, the 30 MLB teams spent around $2.4 billion on roughly 140 free agents, over 92 percent of which was guaranteed money (Gaines, 2014). But because teams have taken to locking up their young stars with below-market-rate contract extensions (thereby preventing them from reaching the open market in their primes), and because aging curves have reverted to normal in the testing era[2], free agent talent is becoming increasingly scarce. With a smaller talent pool to choose from and more clubs actively pursuing free agents, teams, especially small market teams with tight budgets, need to be smarter than ever before about identifying which players are worth adding to their payroll (Sawchik, 2013).

The best way to do that is through statistical analysis, once an innovative idea but now commonplace throughout the sport in large part because of Michael Lewis’s landmark book Moneyball, which popularized sabermetrics by detailing how the Oakland A’s used them to build juggernaut rosters on shoestring budgets (Hakes and Sauer, 2006). While there are numerous available statistics to measure hitting talent, some are more correlated with team success than others. For instance, studies have shown that an efficient labor market would reward players for their power and on-base skills, the ones most correlated with run production and, by extension, winning. In their 2006 study, Jahn Hakes and Raymond Sauer found that looking at a team’s on-base and slugging percentages together, compared to those of its opponent, explains 88.5 percent of variation in winning percentage (Hakes and Sauer, 2006). Similarly, a 2005 study conducted by Adam Houser observed strong correlations between winning and both OBP and SLG. The best statistic, then, would measure both of these skills. Three that do are OPS (On-Base Plus Slugging), secondary average (SecA), and wOBA (weighted On-Base Average), all relatively new sabermetrics strongly related to team success (Baseball and Sabermetrics, 2013). wOBA in particular has won favor with the sabermetric community for its reputation as “a solid, context-neutral statistic that values hitting properly” (Cameron, 2008). Because OPS, SecA, and wOBA combine power and on-base skills, both of which are heavily correlated with winning, they should drive salary and are thus present in this model.

Breaking down each skill further, the beneficial effects of on-base percentage are well-documented in Moneyball as well as the Hakes and Sauer paper. As the number of baserunners increases, the run expectancy for that inning increases as well because there are more scoring opportunities. This model measures a hitter’s batting eye using walk to strikeout ratio and expects it to be significant. Likewise, hitting for power also fuels run-scoring, for three of the five events with the highest run expectancy are extra-base hits (Tango, 2010). Intuitively this makes sense, as a walk can only score a run when the bases are loaded and for a single to score a run there almost always has to be a runner in scoring position. A home run, on the other hand, is the most valuable outcome of a plate appearance because it scores the batter and any runners on base. Extra base hits are more likely to score baserunners regardless of which base they’re on and put the batter in scoring position, improving the likelihood that he scores as well. One would expect power (represented in this model by home runs and Isolated Power) to be properly valued as sluggers have historically been well-rewarded dating back to Babe Ruth, who in 1931 famously earned a higher salary than President Hoover[3]. On the other hand, extra base hits that don’t go over the wall tend to be underrated because they aren’t as exciting as home runs, so it’s possible that slugging could be somewhat undervalued (Booth, 2013).

Another catch-all statistic is Runs Created, invented in 1979 by Bill James, the “father of sabermetrics.” Runs Created estimates the number of runs a hitter contributes to his team. It has been revised and updated over the years, with the “Technical” version serving as the most widely used iteration because it accounts for all basic, easily available offensive statistics. Runs Created has been shown to be an accurate measure of an individual’s offensive contribution because when used with aggregated team totals, the formula closely approximates (within five percent) how many runs the team actually scores (Appelman, 2008). According to Beyond the Batting Average, “It is useful to look at team RC because it is often a better reflection of a team’s offensive ability and a better predictor of future performance than runs scored” (Panas 39). It is also a key component in James’s popular Win Shares formula and is used by Baseball Prospectus to compute its Equivalent Average (EqA) sabermetric (SABR, 2014). A recent study found that Runs Created predicts a team’s success most effectively (Baseball and Sabermetrics, 2013). It is widely considered to be one of the most accurate measures of offensive contributions.

These statistics do a much better job reflecting a player’s performance than traditional metrics, which individually fail to say much about a hitter’s skill set. For example, the three statistics most commonly used to value hitters have been the Triple Crown stats: batting average, home runs, and runs batted in, none of which are strongly correlated with superior offensive statistics such as OPS and Runs Created. Consequently, “none of the Triple Crown stats are that accurate in telling a player's run producing ability” (Weber, 2014). Batting average measures how often a player reaches base via a base hit but fails to account for walks and hit-by-pitches. It also treats every hit equally, leading John Thorn, official historian of MLB, to criticize the statistic as a “venerable, uncannily durable fraud” (Kenny, 2012). The aforementioned Houser study even concluded that batting average was negatively correlated with wins. Home runs measure power but ignore doubles and triples, and can also be heavily skewed by a batter’s home park. Runs batted in, or RBI, supposedly measure a player’s ability to drive in runs as well as hit for power to some extent (as hitters with high home run totals tend to have high RBI figures), but have been exposed as a team-dependent stat. Team-dependent stats are faulty because they rely too much on outside factors such as luck or the performance of a player’s teammates. When evaluating an individual player, the goal should be to isolate him from his teammates as much as possible (which rate stats and RC do). It’s impossible to do so with a statistic like RBI, which is heavily dependent on the number of RBI opportunities a player has and how skilled the batters in front of him are at reaching base. Runs scored is similarly team-dependent, for a player needs the hitters behind him to drive him in.
Recent baseball literature has made it clear that these kinds of statistics, in addition to the crude Triple Crown measurements, probably shouldn’t be the first ones a GM looks at when deciding whether to pursue a player. However, since they are all common back-of-the-baseball card statistics, they have been included in this model.

The model also must account for different salary structures between the different subsets of position players. A recent study by Matt Swartz of The Hardball Times found that bat-first positions (outfielders, designated hitters, and first basemen) are paid more per win above replacement than glove-first players. Teams have traditionally valued good hitters over good fielders, partially because defensive statistics are less reliable than offensive statistics, making fielding contributions more difficult to judge (Swartz, 2014). Catchers in particular are penalized because the wear and tear they endure typically limits their playing time and hinders their offensive contributions. As a result, backstops have the lowest average salary among position players[4] and make less on the free agent market than all the other positions (Armstrong, 2014). On the other hand, designated hitters had the highest average salary in 2013[5] because they hit well and are less likely to get hurt (Associated Press, 2013). It’s also important to consider that statistics are not valued equally between the groups. For instance, corner infielders and outfielders are expected to be major sources of power and run production, whereas catchers and middle infielders with power are exceptionally rare. Thus, it is likely that using an aggregate model for analyzing salary determinants of position players is not appropriate and will lead to inaccurate conclusions. This paper will implement an F-test to see which model is more appropriate, but expects to find that aggregation is inappropriate and it will therefore be necessary to construct disaggregate models.
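The aggregation test described above can be sketched as a Chow-type F statistic that compares the pooled regression against the separate per-position regressions. This is only an illustration of one common form of the test, not necessarily the exact procedure used in this study: the function name and arguments are hypothetical, and it assumes the pooled model restricts every coefficient (including the intercept) to be equal across groups. If the pooled model already contains position dummies, the numerator degrees of freedom would shrink accordingly.

```python
def aggregation_f_stat(rss_pooled, rss_groups, n_obs, k_params):
    """Chow-type F statistic for H0: all groups share one set of coefficients.

    rss_pooled: residual sum of squares from the aggregate (pooled) regression
    rss_groups: list of RSS values, one per disaggregate regression
    n_obs:      total number of observations across all groups
    k_params:   number of coefficients (including intercept) per regression
    """
    g = len(rss_groups)
    rss_unrestricted = sum(rss_groups)
    df_num = (g - 1) * k_params      # restrictions imposed by pooling
    df_den = n_obs - g * k_params    # residual df of the unrestricted model
    if df_num <= 0 or df_den <= 0:
        raise ValueError("need at least two groups and enough observations")
    f_stat = ((rss_pooled - rss_unrestricted) / df_num) / (rss_unrestricted / df_den)
    return f_stat, df_num, df_den


# Made-up RSS values purely for illustration (not results from this study).
f_stat, df_num, df_den = aggregation_f_stat(120.0, [40.0, 30.0, 20.0, 10.0], 1000, 10)
print(f_stat, df_num, df_den)
```

A large F statistic relative to the critical value for (df_num, df_den) degrees of freedom would reject pooling in favor of the disaggregate models.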

Joe Carter was overrated because of his shiny Triple Crown numbers (SportsNet)

Data Description:

The basic model follows a semi-log function:

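$\ln(\text{salary}) = \beta_0 + \beta_1 \text{GSavg} + \beta_2 \text{R} + \beta_3 \text{HR} + \beta_4 \text{RBI} + \beta_5 \text{RC} + \beta_6 \text{Avg} + \beta_7 \text{OPS} + \beta_8 \text{wOBA} + \beta_9 \text{SecA} + \beta_{10} \text{BBKratio} + \beta_{11} \text{ISO} + \beta_{12} \text{YearsinMLB} + \beta_{13} \text{YearsinMLBsq} + \varepsilon$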
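Because the dependent variable is the natural log of salary, each coefficient in a semi-log model is read as an approximate proportional effect. The short sketch below shows the exact conversion from a coefficient to a percent change in salary. The function name and the coefficient values passed to it are hypothetical and are not estimates from this study.

```python
import math


def pct_change_in_salary(coef, delta=1.0):
    """Percent change in salary implied by a `delta`-unit change in a
    regressor whose coefficient in the ln(salary) equation is `coef`."""
    return (math.exp(coef * delta) - 1.0) * 100.0


# Hypothetical coefficient of 0.01 on a counting stat such as home runs:
# each additional unit raises predicted salary by roughly one percent.
print(pct_change_in_salary(0.01))
```

For small coefficients the percent change is close to 100 times the coefficient; the exponential form matters for larger ones.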

All of the variables in this model are baseball statistics and the data used to calculate them came from the Lahman baseball database. Career statistics were used because general managers typically pay attention to a player’s entire body of work instead of just his most recent season, which could have been a fluke year or cut short by injury. Rate stats were used in combination with raw totals to account for playing time discrepancies and better compare across positions, since up-the-middle players (particularly catchers) play more physically demanding positions and typically suit up less than corner defenders.

Though it’s possible general managers consider playoff performance when evaluating hitters, this study focused solely on regular season statistics because not every player gets the opportunity to appear in playoff games. The sample sizes thus vary wildly from player to player and tend to be very small, sometimes non-existent. One cannot and should not draw meaningful conclusions about a player’s true ability based on a handful of postseason games. Furthermore, there is little to no evidence in the literature supporting the notion that player salary is dependent upon postseason success.

Lastly, a mix of newer sabermetrics and traditional statistics was used to compare which ones have a greater effect on salary. Each variable is described in detail below:

Ln(salary): The salary variable is the salary of the position player measured in dollars. The natural log of salary was used to reduce the effect of extreme outliers on the model.

GSavg: The player’s games started total divided by the number of seasons he’s played measures the average number of games he starts per season. A player’s games played total is a good measure of his durability and is essential to determining his value because nobody is very valuable if he only plays half a season. The best players are in the lineup every day and will typically play 140 to 160 games per year if healthy. While injuries are often beyond a player’s control, he can maximize his durability by preparing for the season, then staying in good shape and taking care of himself once the season is underway. Players with healthy track records are also considered less risky than players who are injury-prone and more likely to break down. Using games is a better measure of durability than at-bats because at-bats are dependent on a player’s batting order position.

R: Runs scored. Typically players who get on base a lot, hit for power, and run the bases well score a lot of runs. This is largely a team-dependent statistic, as the only way a player can drive himself in is with a home run, and also a function of batting order position. Players at the top of the order tend to score more runs than those at the bottom because they hit in front of the team’s best run-producers and have more opportunities to score runs. However, players can improve their likelihood of scoring by getting into scoring position via extra base hits and with aggressive baserunning.

HR: A home run is an event where the batter comes around to score on a single play without the benefit of an error from the defense. Most often this is done by hitting the ball over one of the outfield fences in fair territory, but it’s also possible to have an inside-the-park home run that does not clear the fence, but rather remains in the field of play long enough for the batter to circle the bases. Home runs reflect a player’s power, as the balls that get hit the farthest tend to leave the yard.

RBI: Runs Batted In counts the number of runs a batter drives in, including himself on a home run, via a hit, walk, hit by pitch, sacrifice fly, or groundout (excluding double plays). This statistic has been criticized recently because it’s more a measure of opportunity than anything else, as the players with the most RBI opportunities typically have the highest outputs.

RC: Runs Created is a sabermetric developed by Bill James that estimates the number of runs a player contributed to his team by taking into account all basic offensive statistics. It is a good tool for comparing players against each other because it measures a player’s contribution independent of his teammates (Panas 40). When these totals are added together for all the players on a team, the total is usually within five percent of the team’s real runs scored total (SABR, 2014). The formula for RC is displayed below:

$RC = \frac{(H+BB-CS+HBP-GIDP) \times (TB+(.26 \times (BB - IBB + HBP)) + (.52 \times (SH + SF + SB)))}{AB+BB+HBP+SH+SF}$
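The formula above translates directly into a small function. This is a minimal sketch: the function and argument names are illustrative, and the stat line in the example is invented rather than taken from the Lahman data.

```python
def runs_created(h, bb, cs, hbp, gidp, tb, ibb, sh, sf, sb, ab):
    """Technical Runs Created, following the formula given in the text."""
    on_base = h + bb - cs + hbp - gidp
    advancement = tb + 0.26 * (bb - ibb + hbp) + 0.52 * (sh + sf + sb)
    opportunities = ab + bb + hbp + sh + sf
    if opportunities == 0:
        return 0.0  # avoid dividing by zero for players with no plate appearances
    return on_base * advancement / opportunities


# Invented stat line for illustration only.
rc = runs_created(h=150, bb=60, cs=5, hbp=5, gidp=10, tb=250,
                  ibb=5, sh=0, sf=5, sb=10, ab=550)
print(round(rc, 1))
```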

Avg: Batting average is a simple rate statistic that divides a player’s hit total by his total number of at-bats. It measures how often a player reaches base via a hit, and is also a measure of speed to some extent as faster players are able to leg out more infield hits than slower batters. It is also a good measure of a player’s ability to make contact, as putting the ball in play more often typically leads to more base hits and higher batting averages. The problem with batting average is that it treats all hits equally, and does not differentiate between singles and extra base hits. Thus, having a high batting average is not very useful if it is “empty,” as in obtained mostly by singles. Furthermore, batting average is susceptible to luck, for it is heavily tied to a player’s Batting Average on Balls in Play (BABiP), which can fluctuate wildly from season to season.

OPS: On-Base Plus Slugging, the sum of a player’s on-base percentage and slugging percentage, is widely viewed as one of the best hitting statistics because it measures a hitter’s ability to get on base and hit for power, the two skills most correlated with winning. However, OPS is imperfect because it undervalues getting on base relative to hitting for extra bases and does not properly weight each type of extra base hit (FanGraphs, 2014). OPS is also closely correlated with Runs Created, meaning it accurately measures offensive value (Weber, 2014). Using OPS also precluded including OBP and SLG in the model because of multicollinearity.
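As a brief illustration, OPS can be computed from basic counting stats using the standard definitions of on-base percentage and slugging percentage. The function names and the example stat line below are illustrative assumptions, not values from the study's data.

```python
def on_base_pct(h, bb, hbp, ab, sf):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denom = ab + bb + hbp + sf
    return (h + bb + hbp) / denom if denom else 0.0


def slugging_pct(tb, ab):
    """SLG = total bases / at-bats."""
    return tb / ab if ab else 0.0


def ops(h, bb, hbp, ab, sf, tb):
    """OPS is simply the sum of OBP and SLG."""
    return on_base_pct(h, bb, hbp, ab, sf) + slugging_pct(tb, ab)


# Invented stat line for illustration only.
print(round(ops(h=150, bb=60, hbp=5, ab=550, sf=5, tb=250), 3))
```

Because OPS is an exact sum of OBP and SLG, including all three as regressors would make them perfectly collinear, which is why only OPS appears in the model.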

wOBA: Weighted On-Base Average is a context-neutral sabermetric that measures a hitter’s overall value by combining all the different aspects of hitting into one statistic and weighting them with their actual run value. It can be used to calculate how many runs above or below average a player was using these linear weights. It is one of the best offensive statistics available for capturing a player’s offensive contributions and is scaled to OBP, so .400 is great, .320 is average and anything below .300 is poor (FanGraphs, 2014). The formula used to calculate wOBA is displayed below:

$wOBA = \frac{0.690 \times (BB - IBB) + 0.722 \times HBP + 0.888 \times 1B + 1.271 \times 2B + 1.616 \times 3B + 2.101 \times HR}{AB + BB - IBB + SF + HBP}$
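A minimal sketch of the wOBA calculation follows, using the linear weights quoted in the formula above. The function and argument names are illustrative, and the example stat line is invented.

```python
# Linear weights as given in the formula in the text.
WOBA_WEIGHTS = {
    "ubb": 0.690,  # unintentional walks (BB - IBB)
    "hbp": 0.722,
    "1b": 0.888,
    "2b": 1.271,
    "3b": 1.616,
    "hr": 2.101,
}


def woba(bb, ibb, hbp, singles, doubles, triples, hr, ab, sf):
    """Weighted On-Base Average from counting stats."""
    numerator = (
        WOBA_WEIGHTS["ubb"] * (bb - ibb)
        + WOBA_WEIGHTS["hbp"] * hbp
        + WOBA_WEIGHTS["1b"] * singles
        + WOBA_WEIGHTS["2b"] * doubles
        + WOBA_WEIGHTS["3b"] * triples
        + WOBA_WEIGHTS["hr"] * hr
    )
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator if denominator else 0.0


# Invented stat line for illustration only.
print(round(woba(bb=60, ibb=5, hbp=5, singles=100, doubles=30,
                 triples=2, hr=18, ab=550, sf=5), 3))
```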

SecA: Secondary average is a sabermetric created by Bill James that divides the sum of extra bases gained on hits (TB-H), walks, and stolen bases (minus times caught stealing) by at-bats. It measures the number of bases a player gained independent of batting average. By incorporating extra base hits, walks, and stolen bases, it measures his power, discipline, and speed—the three most important skills for a hitter. A player can thus have a low batting average but high secondary average (and vice versa), so SecA helps identify players who are productive offensive players despite poor batting averages. According to Scott Gray, who worked with Bill James, "Secondary average is a much better indicator of offensive ability than batting average" (Gray, 2006). The formula used to calculate SecA is [BB + (TB-H) + (SB-CS)] / AB
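The SecA formula can be written as a one-line function. The names and the sample inputs below are illustrative only.

```python
def secondary_average(bb, tb, h, sb, cs, ab):
    """SecA = [BB + (TB - H) + (SB - CS)] / AB, as defined in the text."""
    if ab == 0:
        return 0.0  # no at-bats, so the ratio is undefined; report zero
    return (bb + (tb - h) + (sb - cs)) / ab


# Invented stat line: 60 walks, 100 extra bases on hits, 5 net steals, 550 AB.
print(secondary_average(bb=60, tb=250, h=150, sb=10, cs=5, ab=550))
```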

BBKratio: The ratio of a player’s walk total to his strikeout total is an easy way to measure his strike zone knowledge. The best hitters walk more often than they strike out, but any ratio close to 1:1 is exceptional.

ISO: Isolated power is the difference between slugging percentage and batting average. By stripping out the singles from slugging percentage, ISO measures the number of extra bases a player gets per at-bat and is thus a good gauge of his raw power. Good power hitters have an ISO over .200 while the league average tends to fall around .140 (FanGraphs, 2014).
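Since slugging percentage is TB/AB and batting average is H/AB, their difference simplifies to (TB - H)/AB. The sketch below uses that simplified form; the function name and inputs are illustrative.

```python
def isolated_power(tb, h, ab):
    """ISO = SLG - AVG, which simplifies to (TB - H) / AB."""
    return (tb - h) / ab if ab else 0.0


# Invented stat line: 250 total bases on 150 hits in 550 at-bats.
iso = isolated_power(tb=250, h=150, ab=550)
print(round(iso, 3))
```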

YearsinMLB and YearsinMLBsq: These two variables measure a player’s major league experience in years. The squared term is included because the relationship between experience and salary is generally not linear. It usually takes the shape of an inverted U: a player’s salary is suppressed early in his career when he has little bargaining power, peaks when he reaches free agency, and declines toward the end of his career when his durability and skills diminish. Including YearsinMLBsq alongside YearsinMLB lets the model capture this curvature.
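When a regression includes both a linear and a squared experience term, the experience level at which predicted log salary peaks follows from setting the derivative to zero: years = -b1 / (2 × b2), where b1 and b2 are the coefficients on YearsinMLB and YearsinMLBsq. The sketch below applies that rule. The coefficient values are hypothetical and are not estimates from this study.

```python
def experience_peak(b_years, b_years_sq):
    """Years of experience at which predicted ln(salary) is maximized,
    given coefficients on YearsinMLB and YearsinMLBsq."""
    if b_years_sq >= 0:
        raise ValueError("an interior peak requires a negative squared-term coefficient")
    return -b_years / (2.0 * b_years_sq)


# Hypothetical coefficients chosen only to illustrate the calculation.
print(experience_peak(0.30, -0.015))
```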

Dummy Variables: The Lahman database provides data from the 1985 season through the 2013 season. The author compiled data using year dummies from 1985 through 2003 for the pre-Moneyball era and from 2004 through 2013 for the post-Moneyball era in both the aggregate and disaggregate models, omitting the 1985 and 2004 dummies (which serve as the base years for their respective eras) to avoid perfect collinearity.
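The year-dummy construction can be sketched as follows. The helper name, the `yr` column prefix, and the sample years are hypothetical; the point is only that the base year gets no column of its own, so the dummies are not perfectly collinear with the intercept.

```python
def year_dummies(years, base_year):
    """Build 0/1 indicator columns for every year except `base_year`.

    Returns (column_names, rows), where rows[i] holds the dummies for years[i].
    """
    levels = sorted(set(years) - {base_year})
    names = [f"yr{level}" for level in levels]
    rows = [[1 if year == level else 0 for level in levels] for year in years]
    return names, rows


# Small illustrative sample with 2004 as the omitted base year.
names, rows = year_dummies([2004, 2005, 2006, 2005], base_year=2004)
print(names)
print(rows)
```

An observation from the base year has zeros in every dummy column, so each year coefficient is interpreted relative to that base year.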

The aggregate models included infielders, outfielders, catchers, and designated hitters as dummy variables to account for any effect position may have had on salary and avoid omitted variable bias. They were replaced by the clause “if IF==1” for the disaggregate infielder model, “if OF==1” for the disaggregate outfielder model, “if C==1” for the disaggregate catcher model, and “if DH==1” for the disaggregate designated hitter model.

It is also important to note that pre-arbitration eligible players are not included in the model because they are unable to negotiate their salaries, which are usually close to the league minimum regardless of their talent level. A player is typically arbitration-eligible after his third season, at which point he and his team can negotiate a salary or have one set by an arbitrator. It is generally believed that arbitration players are paid close to what they would make in free agency, which a player becomes eligible for after his sixth season. Therefore, in order to exclude the statistics of any player who had not been in the league for three years, all players with fewer than four years of major league experience were dropped from the model.

Lastly, the data were treated as if players signed new deals every year.

Mark Teixeira has good secondary averages despite poor batting averages (CSN)

Results & Analysis:

The aggregate and disaggregate results are summarized in the following tables:

Table 1: Summary statistics (career)

Variable	Observations	Mean	Stand. Dev.	Min	Max
Lnsalary	7,782	14.15927	1.224417	11.0021	17.31202
GSavg	10,243	68.80587	37.19332	0	155.125
R	10,243	378.3355	329.0393	0	2,227
HR	10,243	80.3167	92.94184	0	762
RBI	10,243	353.5923	321.9295	0	1,996
RC	10,243	396.869	364.3221	.9371428	2,857.884
Avg	10,243	.2625083	.0260075	.1025641	.4444444
OPS	10,243	.7330512	.0914736	.2779204	1.555556
wOBA	10,243	.3424909	.046488	.1296341	.8615556
SecA	10,243	.248787	.0725125	0	1.11111
BBKratio	10,243	.587493	.2821653	0	2.246528
ISO	10,243	.14229	.0518394	0	.6666667
YearsinMLB	10,243	8.367666	3.807199	4	25
YearsinMLBsq	10,243	84.51118	79.49687	16	625

Table 2: Variable Correlations for Aggregate Model 1985-2013

Variable	lnsalary	GSavg	R	HR	RBI	RC	Avg	OPS	wOBA	SecA	BBKratio	ISO	YearsinMLB	YearsinMLBsq
lnsalary	1.000
GSavg	0.627	1.000
R	0.527	0.711	1.000
HR	0.539	0.562	0.831	1.000
RBI	0.539	0.666	0.938	0.939	1.000
RC	0.546	0.693	0.987	0.881	0.966	1.000
Avg	0.538	0.579	0.554	0.412	0.516	0.572	1.000
OPS	0.648	0.547	0.587	0.692	0.644	0.645	0.763	1.000
wOBA	0.645	0.542	0.585	0.667	0.627	0.639	0.771	0.995	1.000
SecA	0.503	0.382	0.485	0.660	0.532	0.538	0.358	0.843	0.844	1.000
BBKratio	0.141	0.323	0.413	0.192	0.316	0.427	0.415	0.319	0.360	0.274	1.000
ISO	0.517	0.346	0.387	0.690	0.532	0.456	0.333	0.831	0.800	0.853	-0.053	1.000
YearsinMLB	0.235	0.356	0.792	0.634	0.772	0.776	0.286	0.283	0.281	0.216	0.308	0.158	1.000
YearsinMLBsq	0.180	0.330	0.792	0.638	0.771	0.780	0.267	0.269	0.267	0.212	0.303	0.152	0.974	1.000
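Several regressors in Table 2 are very highly correlated with one another, which is relevant to the multicollinearity concern raised earlier. The sketch below flags pairs above a threshold and computes a two-variable variance inflation factor, 1 / (1 - r²). The correlations are copied from Table 2, but the 0.9 cutoff and the function names are assumptions made for illustration, and a full VIF would use the R² from regressing each variable on all of the others rather than a single pairwise correlation.

```python
# Selected pairwise correlations copied from Table 2.
TABLE2_PAIRS = {
    ("OPS", "wOBA"): 0.995,
    ("R", "RC"): 0.987,
    ("RBI", "RC"): 0.966,
    ("HR", "RBI"): 0.939,
    ("Avg", "OPS"): 0.763,
}


def flag_collinear(pairs, threshold=0.9):
    """Return the variable pairs whose absolute correlation meets the threshold."""
    return sorted(pair for pair, r in pairs.items() if abs(r) >= threshold)


def pairwise_vif(r):
    """Variance inflation factor implied by a single pairwise correlation."""
    if abs(r) >= 1.0:
        raise ValueError("correlation must be strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


print(flag_collinear(TABLE2_PAIRS))
print(round(pairwise_vif(TABLE2_PAIRS[("OPS", "wOBA")]), 1))
```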