Sabermetrics - Colorado Mesa University

Sabermetrics
The Art and Science of Quantifying
an Athlete’s Value
Mark Rogers
April 2, 2010
SABR
• The Society for American Baseball Research
• Founded in 1971 in Cooperstown, New York
• “To foster the research, preservation, and
dissemination of the history and record of baseball”
• This is certainly made easier by being located in town
alongside the National Baseball Hall of Fame.
• 6,700 members
• Mostly statisticians, sports writers, and former players
and officials
A Universe of Statistics
• Baseball fans are particularly fond of using statistics to
measure a player’s ability, for several reasons.
– Low scoring: with so few runs scored, the run total alone (the only
thing affecting the outcome) says little about an individual player.
– No clock: games can have indeterminate length and scoring chances,
so a “fair number of chances to score” can vary
– Consistency: the game is played under almost the exact same rules
(and using mostly the same strategy) as it was when it premiered in
the 19th century, unlike other sports.
• The modern “live-ball era” of baseball: 1920–present
• Football: the forward pass altered the way games were played
• Basketball: the modern “shot clock era” accelerates scoring
• Hockey/soccer: hey, until recently, who cared?
• This consistency allows current players to be readily compared
to almost any other player from the past, unlike most other
sports.
The Pioneer
• The first great baseball statistician was Henry
Chadwick (1824—1908).
– An English transplant to Brooklyn, where he followed
cricket, rounders, and their American cousin, baseball
– Wrote summaries of games for New York newspapers, and
included a summary table of the game’s major statistics,
the box score.
– For his contributions to the
legacy of the game, Chadwick
was elected to the Baseball
Hall of Fame in 1938.
The Original Baseball Statistics
• Chadwick’s early box scores focused primarily on
tabulating easily observable aspects of the game.
– Hits/H: any ball hit in fair territory, but not easily retrievable by
the fielders, resulting in a player safely reaching base
– Walks (abbreviated BB, for “reached base on balls”)
– Strikeouts (usually abbreviated K, or occasionally SO)
– At-bats/AB: # of batting chances not resulting in a walk, since
those “ball” pitches are deemed unhittable
– Stolen bases/SB: safe advances made between hit balls
– Runs/R: # of times a player scores/crosses home plate
– Runs batted in/RBI: # of players crossing home plate because of
that player’s at-bats, and not other issues
The Original Baseball Statistics
• Chadwick’s early box scores focused primarily on
tabulating easily observable aspects of the game.
– Errors/E: mental or physical lapses resulting in a player being
safe who should have theoretically been thrown out, or
“running themselves out” by overrunning
– Single/double/triple (1B/2B/3B): the # of bases safely reached
by a player immediately following their hit, not counting any
fielding error
– Home runs/HR: player scores on their own hit, not due to
fielder’s error; counts as 4 bases
– Total extra-base hits/XBH: 2B + 3B + HR
– Total bases/TB: combined # of bases reached by a player on
their own hits, not counting any fielding errors, for an entire
game (1B + 2×2B + 3×3B + 4×HR)
The Original Baseball Statistics
• Errors and other issues “outside the player’s control” are
often used to distinguish batting performances worthy of
credit from those less so.
– A player who reaches base solely due to a fielding error is not credited
with a hit or an official at-bat.
– Hits followed by an error are scored as the type of hit an errorless
version of the play would have resulted in, plus “advanced (or thrown
out) on error.”
– Walks/BB: appearance not counted as an official at-bat
– Hit by pitch/HBP: player awarded first base as a result, but not given
credit for a hit or an at-bat
– Fielder’s choice/FC: player reached base only because a fielder chose
to throw out a runner closer to scoring; thus, the player is not given
credit for a hit
– Double play/DP: batter causes multiple runners to be thrown out;
deemed so terrible as to not warrant RBI credit even if a run scores!
Raw Data vs. Average Data
• Since baseball games are of varying length, the
number of at-bats (and therefore, the number of
“expected” hits, runs, etc.) can vary widely.
– Therefore, measuring only the raw batting data is not the
fairest measure of who is the “best” player.
– Chadwick devised several alternative statistical methods by
calculating averages based on the ratio between the
number of achievements made (in batting, pitching, or
fielding) and the number of opportunities for them.
The Hitting Averages
• Batting average: a measure of the rate of fair hits made in
appropriate batting opportunities:
$$\mathrm{BA} = \frac{H}{AB}$$
– Chadwick viewed this as superior to the cricket average, which
compared the number of runs to the number of outs.
– “Situational” batting averages can also be measured, such as the
batter’s average with “runners in scoring position” (RISP), or one
factoring in the # of times they grounded into a double play (GIDP).
• Slugging percentage: a measure of the player’s power, which
counts extra-base hits extra, but for the same number of ABs:
$$\mathrm{SLG} = \frac{TB}{AB} = \frac{1B + (2 \times 2B) + (3 \times 3B) + (4 \times HR)}{AB}$$
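The two hitting averages above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the season line (100 singles, 30 doubles, 5 triples, 25 home runs in 550 at-bats) are hypothetical.

```python
def total_bases(singles, doubles, triples, home_runs):
    """TB = 1B + 2*2B + 3*3B + 4*HR."""
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


def batting_average(hits, at_bats):
    """BA = H / AB."""
    return hits / at_bats


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """SLG = TB / AB: same denominator as BA, but bigger hits count extra."""
    return total_bases(singles, doubles, triples, home_runs) / at_bats


# Hypothetical season line, invented for illustration only.
singles, doubles, triples, home_runs, at_bats = 100, 30, 5, 25, 550
hits = singles + doubles + triples + home_runs

print(f"BA  = {batting_average(hits, at_bats):.3f}")
print(f"SLG = {slugging_percentage(singles, doubles, triples, home_runs, at_bats):.3f}")
```

Because both averages share the AB denominator, the SLG can never be lower than the BA for the same player.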
The Hitting Comparisons
• In addition to the “portion of a whole” ratios of the
BA, SLG, OBP, etc., ratios can be measured comparing
one type of statistic to another.
– Walk-to-strikeout ratio (BB/K): measures the hitter’s
ability to maximize one and minimize the other
– Ground-ball-to-fly-ball ratio (G/F): ditto
– At-bats per home run (AB/HR): measures the rate of home
runs, by using its easier-to-work-with reciprocal
• Mark McGwire holds the all-time career record, with an AB/HR of
10.61 (having hit a home run in 9.4% of his official at-bats).
• The league average for AB/HR in 2009 was 32.9 (the average
player hit a home run in 3.0% of his at-bats).
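The reciprocal relationship between AB/HR and the home-run rate can be checked directly. The short sketch below uses only the two AB/HR figures quoted above; the function name is an assumption made for illustration.

```python
def home_run_rate(ab_per_hr):
    """Convert at-bats per home run into the fraction of at-bats ending in a home run."""
    return 1.0 / ab_per_hr


# AB/HR figures quoted in the slides.
for label, ab_per_hr in [("McGwire, career", 10.61), ("2009 league average", 32.9)]:
    print(f"{label}: AB/HR = {ab_per_hr}, HR rate = {home_run_rate(ab_per_hr):.1%}")
```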
The Triple Crown
• Traditionally, the HR and RBI counts, along with the popularly
followed race for the batting-average title, were deemed to
be the “major” hitting categories.
• Players who lead the league in all three are said to have won
the Triple Crown of hitting.
• However, such a feat is difficult because of the wide gap
between a power hitter “swinging for the fences” at the cost
of many strikeouts and someone hitting “for average,” aiming
for numerous hits even if they were “only” singles.
– The most recent Triple Crowns were:
• American League: Carl Yastrzemski (Boston Red Sox), 1967
• National League: Joe “Ducky” Medwick (St. Louis Cardinals), 1937
A Problem with the System
• One problem with the use of the BA as a gauge of a player’s
ability is that it makes no distinction between singles and
“bigger” extra-base hits.
– Thus, a “power hitter” who makes fewer hits, but scores more RBI
with a more powerful selection of hits, would be deemed a worse
player.
– Example: Ryan Howard, 2008 (the year he finished 2nd in MVP voting)
His 48 home runs and 146 RBI led the league (with 331 total bases in
610 official at-bats), but he had 199 strikeouts to go with them, which
helped lower his BA to just .251.
• He got a hit ¼ of the time, but his TB makes it look as if he did so ½ of the time.
• One solution is to use the slugging percentage (SLG), which
gives “extra credit” for these bigger hits.
– Howard’s SLG for 2008 was .543, much closer to the league-best.
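As a quick check using only the figures quoted above (331 total bases, 610 at-bats, and a .251 BA), Howard's slugging percentage and its ratio to his batting average work out as:

$$\mathrm{SLG} = \frac{TB}{AB} = \frac{331}{610} \approx .543, \qquad \frac{\mathrm{SLG}}{\mathrm{BA}} \approx \frac{.543}{.251} \approx 2.16$$

That ratio of roughly 2 is what makes a hit ¼ of the time "look like" a hit ½ of the time.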
Another Problem with the System
• Both the BA and the SLG also fail to count walks as an official at-bat,
which fails to give credit to a player with “good eyes” who is able to
avoid strikeouts long enough to draw a walk.
– Example: Pete Rose, 1974
In an “off” year, he had a .284 BA, but also a career-best 104 walks.
• A solution to this problem is to use the on-base percentage (OBP),
which counts hits and walks (as well as getting hit by a pitch, which
can also have tactical advantages) and uses something more closely
approximating the total “plate appearances” instead of “at-bats”:
$$\mathrm{OBP} = \frac{H + BB + HBP}{AB + BB + HBP + SF}$$
– SF: sacrifice flies
– Rose’s 1974 OBP was .385, close to the league lead.
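A minimal Python sketch of the OBP calculation follows, taking the formula as hits, walks, and hit-by-pitches over at-bats plus walks, hit-by-pitches, and sacrifice flies. The function name and the season line are hypothetical, chosen only to show how walks lift the OBP above the BA.

```python
def on_base_percentage(h, bb, hbp, ab, sf):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


# Hypothetical season line, invented for illustration only.
h, bb, hbp, ab, sf = 160, 80, 5, 550, 5

ba = h / ab
obp = on_base_percentage(h, bb, hbp, ab, sf)
print(f"BA = {ba:.3f}, OBP = {obp:.3f}")
```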
To Count, Or Not To Count?
• Some baseball occurrences vary as to which
statistical categories they will count toward.
– Bunt/sacrifice hit/SH: a deliberate attempt to hit the ball
so as to allow a runner to advance, at the expense of the
batter
• Like a reverse fielder’s choice, the bunt does not count as a hit,
and the attempt does not count as an official at-bat.
• Assuming the batter is indeed thrown out before reaching base, it
will also not count towards the OBP.
• However, if the batter is deemed to primarily be trying to reach
first, it may be scored as a single or out instead.
• If the batter bunts toward a runner on third (to draw away the
third baseman), this squeeze play will result in an RBI credited.
To Count, Or Not To Count?
• Some baseball occurrences vary as to which
statistical categories they will count toward.
– Sacrifice fly/SF: a fly ball hit with fewer than two outs that is
caught far enough from the infield to allow a runner to
score
• Like a sacrifice hit (bunt), the sacrifice fly does not count as a hit,
and the attempt does not count as an official at-bat.
• However, unlike the sacrifice hit, it does count in the OBP denominator
(and so lowers the OBP), as the play is considered more accidental and
less a tactical decision.
• The maneuver’s primary benefit is allowing a runner on third base
to score while the caught ball is relayed; thus, the batter is
credited with an RBI if successful.
The Best of Both Worlds
• Some armchair statisticians have argued that a measurement
fusing the advantages of the SLG and the OBP would be ideal.
• “On-base plus slugging”: OPS = OBP + SLG
• The OPS has become one of the primary statistical benchmarks
used for hitters in the modern-day game.
• However, it is not without its share of controversy.
– The “equal mixture” blends two measurements that normally have
very unequal numbers; typically, SLG > OBP, weighting it preferentially.
– It also has no intrinsic meaning in game-play terms, unlike the BA (the
frequency of getting a hit) or OBP (the frequency of reaching base).
Try 1 from Column A, 1 from Column B
• Each of these statistics can also be added to or subtracted
from each other for a variety of results.
• Isolated power/IsoP: a measure of the hitter’s power effects
without the influence of the number of hits:
IsoP = SLG − BA
• Secondary average/SecA: a measure of the hitter’s number of
bases attained without the influence of the number of hits:
– Including any gained through walks and stolen bases
$$\mathrm{SecA} = \frac{TB - H + BB + SB - CS}{AB}$$
– CS: # of times caught stealing (presumably, additionally subtracted so
as to highlight the difference between two players who achieve the
same number of stolen bases in a very different number of attempts)
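Both derived statistics can be sketched in Python. This is an illustration under stated assumptions: SecA is taken as (TB − H + BB + SB − CS) / AB, with caught-stealing subtracted as the CS note describes, and the function names and season line are hypothetical.

```python
def isolated_power(tb, h, ab):
    """IsoP = SLG - BA, which simplifies to (TB - H) / AB."""
    return (tb - h) / ab


def secondary_average(tb, h, bb, sb, cs, ab):
    """SecA = (TB - H + BB + SB - CS) / AB."""
    return (tb - h + bb + sb - cs) / ab


# Hypothetical season line, invented for illustration only.
tb, h, bb, sb, cs, ab = 275, 160, 80, 20, 5, 550

print(f"IsoP = {isolated_power(tb, h, ab):.3f}")
print(f"SecA = {secondary_average(tb, h, bb, sb, cs, ab):.3f}")
```

Writing IsoP as (TB − H) / AB shows why it ignores singles: each single adds one to both TB and H.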
So…What’s “Good”?
• The variety of measurements necessitates some sort of
benchmark for what would rank a player among the league’s
best in each category.
• Typical career averages (modern “live-ball era,” since 1920):
Stat   “Average”   “Great”   “Elite”   All-time record        Active leader
BA     .267        .300      .325      .366 (Ty Cobb)         .334 (Albert Pujols)
OBP    .330        .370      .400      .482 (Ted Williams)    .427 (Todd Helton/Albert Pujols)
SLG    .420        .460      .500      .690 (Babe Ruth)       .628 (Albert Pujols)
OPS    .750        .830      .900      1.164 (Babe Ruth)      1.055 (Albert Pujols)
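The “Average,” “Great,” and “Elite” benchmarks are internally consistent with OPS = OBP + SLG, which the short check below confirms. The dictionary layout and function name are assumptions made for illustration; the numbers are the ones in the table.

```python
# Benchmark values copied from the table above.
BENCHMARKS = {
    "Average": {"OBP": 0.330, "SLG": 0.420, "OPS": 0.750},
    "Great": {"OBP": 0.370, "SLG": 0.460, "OPS": 0.830},
    "Elite": {"OBP": 0.400, "SLG": 0.500, "OPS": 0.900},
}


def ops(obp, slg):
    """OPS = OBP + SLG."""
    return obp + slg


for tier, row in BENCHMARKS.items():
    computed = ops(row["OBP"], row["SLG"])
    print(f"{tier}: OBP + SLG = {computed:.3f} (table OPS {row['OPS']:.3f})")
```

The all-time record column need not add up the same way, since the OBP and SLG records belong to different players.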
Seeing the Numbers
• In addition to studying the numbers themselves, we can also visualize
them using a scatterplot, searching for a presumed correlation between
the OBP (the ability to get on base using more than just hits) and the SLG
(the ability to get past first base with one’s hits).
• The red lines represent the league average for each statistic.
– Upper-left: + power, − average
– Lower-right: − power, + average
– Lower-left: weaker in both
– Upper-right: stronger in both
Seeing the Numbers
• We can also plot the best-fit trend line for this scatterplot (shown here in
light blue), showing the expected link between hitting for average (OBP)
and hitting for power (SLG) that certain players exceed and others trail.
– Anyone above this line is hitting for more power than their OBP would have suggested.
– Anyone below this line is hitting for less power than their OBP would have suggested (or
possibly getting on base more often than their SLG would have suggested).
• Once again, there is a clear sign of which current player excels at
both of these critical areas.
– Albert Pujols, St. Louis Cardinals
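The trend-line comparison can be sketched with an ordinary least-squares fit of SLG on OBP. The slides do not say how their line was fitted, so least squares is an assumption here, and the five (OBP, SLG) pairs are hypothetical placeholders rather than real player data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical (OBP, SLG) pairs, invented for illustration only.
players = {
    "Player A": (0.300, 0.380),
    "Player B": (0.330, 0.420),
    "Player C": (0.350, 0.520),
    "Player D": (0.380, 0.450),
    "Player E": (0.420, 0.600),
}

obps = [obp for obp, _ in players.values()]
slgs = [slg for _, slg in players.values()]
slope, intercept = fit_line(obps, slgs)

# A positive residual means more power than the player's OBP would suggest.
for name, (obp, slg) in players.items():
    residual = slg - (slope * obp + intercept)
    side = "above" if residual > 0 else "below"
    print(f"{name}: {side} the trend line by {abs(residual):.3f} SLG")
```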
Pitching Statistics
• Much as AB represents the number of “official” at-bats
for a hitter, and PA the actual number of “plate
appearances,” the pitcher’s performance can be measured
by how many batters were faced.
– The equivalent of the PA is the # of “batters faced” (BF).
– The equivalent of the BA is the “opponents’ batting
average” (OBA), which similarly subtracts any plate
appearance not counted as an at-bat for the hitter:
$$\mathrm{OBA} = \frac{H}{BF - BB - HBP - SF - SH - CI}$$
• CI: Catcher Interference with the play (½ times per year per team)
The Pitching Averages
• Earned run average/ERA: a measure of how many earned
runs a pitcher would be expected to give up on average over a
full 9-inning game, regardless of the actual number of innings
pitched (IP):
$$\mathrm{ERA} = \frac{ER}{IP} \times 9$$
– “Earned” runs/ER: those directly or indirectly due to the
pitcher’s actions, including:
• Runners who score due to the pitcher’s actions (not any fielders’)
• Runners left behind by that pitcher (who is “responsible” for them)
who later score when the relief pitcher allows a hit
• But not including runners who score only because a player’s earlier
error gave the team an “extra out” to drive them in
– “Good” ERAs vary widely depending on the era played.
• 2.00 or less in “pitchers’ eras,” 4.00 or more in “hitters’ eras”
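The ERA scaling can be sketched in one function, taking ERA as earned runs per inning pitched multiplied by 9. The function name and the pitcher's line (70 earned runs in 210 innings) are hypothetical.

```python
def earned_run_average(earned_runs, innings_pitched):
    """ERA = ER / IP * 9: earned runs scaled to a full 9-inning game."""
    return earned_runs / innings_pitched * 9


# Hypothetical pitcher, invented for illustration only.
era = earned_run_average(earned_runs=70, innings_pitched=210)
print(f"ERA = {era:.2f}")
```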
The Other Triple Crown
• For pitchers, the three statistical categories deemed to be the
most important have traditionally been wins, strikeouts, and
ERA.
– Like the hitting categories, these were deemed the easiest to follow.
• Players who lead the league in all three are said to have won
the Triple Crown of pitching.
• Because the skill sets involved for pitchers are not as disparate
as for hitters, some (particularly “power pitchers” excelling in
strikeouts as a means to an end) can find winning it easier.
– The most recent pitching Triple Crowns were:
• American League: Johan Santana (Minnesota Twins), 2006
• National League: Jake Peavy (San Diego Padres), 2007
The Pitching Averages
• Much like the batting average “levels the field” of batting
statistics between players with different numbers of at-bats, a
variety of pitching averages attempt to balance pitchers who
have a varying number of innings pitched (IP), in the same
manner as the ERA (by dividing the raw data by the number of
innings pitched and multiplying by 9).
– Hits per 9 innings pitched: “H/9” = H ÷ IP × 9
– Strikeouts per 9 innings: “K/9” = K ÷ IP × 9
– Walks per 9 innings: “BB/9” = BB ÷ IP × 9
• Since excess walks and hits can still exhaust a pitcher (who
must then be replaced) even if they do not result in runs, one
popular modern average combines these “trivial” slip-ups.
– Walks plus hits per inning pitched (WHIP): (BB + H) ÷ IP
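The per-9 rates and WHIP above share one pattern, sketched here in Python. The function names and the pitcher's line (180 hits, 200 strikeouts, and 50 walks in 200 innings) are hypothetical.

```python
def per_nine(count, innings_pitched):
    """Scale a raw count to a 9-inning game: count / IP * 9."""
    return count / innings_pitched * 9


def whip(walks, hits, innings_pitched):
    """WHIP = (BB + H) / IP; note that it is per inning, not per 9 innings."""
    return (walks + hits) / innings_pitched


# Hypothetical pitcher, invented for illustration only.
hits, strikeouts, walks, ip = 180, 200, 50, 200

print(f"H/9  = {per_nine(hits, ip):.2f}")
print(f"K/9  = {per_nine(strikeouts, ip):.2f}")
print(f"BB/9 = {per_nine(walks, ip):.2f}")
print(f"WHIP = {whip(walks, hits, ip):.2f}")
```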
The Pitching Comparisons
• Many of the ratios used to measure hitting prowess
can be inverted to measure pitching prowess.
– Strikeout-to-walk ratio (K/BB): measures the pitcher’s
ability to maximize one and minimize the other
– Fly-ball-to-ground-ball ratio (F/G): ditto
• The ERAs can be adjusted as well for certain
situations, including the “catcher’s ERA” (CERA), the
average ERA of the team with a particular catcher
playing.
– A measure of the catcher’s ability to control the game
– Thus, it is more of a fielding statistic.
Fielding Statistics
• Putouts (PO): the number of outs directly caused by a fielder
• Assists (A): the number of outs in which a player was
indirectly involved
• Total (fielding) chances: TC = PO + A + E
– The total number of opportunities to make a defensive play
• Fielding percentage:
$$\mathrm{FP} = \frac{PO + A}{TC} = \frac{PO + A}{PO + A + E}$$
– Typically 98.5% (0.985) or better for most players, or slightly lower for
difficult defensive positions (third base and shortstop)
• Range factor: RF = (PO + A) ÷ IP × 9
– A proportional extrapolation of a full game, the fielding equivalent of
the ERA; used to gauge the amount
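The two fielding formulas on this slide can be sketched as follows, taking FP as (PO + A) / (PO + A + E) and RF as (PO + A) ÷ IP × 9. The function names and the fielder's line (250 putouts, 450 assists, and 15 errors in 1,300 innings) are hypothetical.

```python
def fielding_percentage(putouts, assists, errors):
    """FP = (PO + A) / (PO + A + E)."""
    return (putouts + assists) / (putouts + assists + errors)


def range_factor(putouts, assists, innings):
    """RF = (PO + A) / IP * 9: successful plays scaled to a 9-inning game."""
    return (putouts + assists) / innings * 9


# Hypothetical fielder, invented for illustration only.
po, a, e, innings = 250, 450, 15, 1300

print(f"FP = {fielding_percentage(po, a, e):.3f}")
print(f"RF = {range_factor(po, a, innings):.2f}")
```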
The Prophet
• As general manager of the St. Louis Cardinals and the
Brooklyn Dodgers, Branch Rickey profoundly altered baseball,
from the integration he pioneered with Jackie Robinson to the
“farm system” he invented to find untested players and train
them for major-league success in the minor leagues.
– With both teams struggling when
he arrived, he worked to maximize
player value by signing them for
the lowest cost and training them
to their fullest potential.
– From Ken Burns’ film Baseball:
“Nobody knew how to put a
dollar sign on the muscle better
than Branch Rickey.”
The Prophet
• Rickey was asked by LIFE magazine if there was a “formula”
for baseball success. Skeptical at first, he worked for six
months to come up with what he thought might hold the key.
$$G = (\text{hitting proficiency}) - (\text{pitching proficiency})$$
$$G = \left[\frac{H + BB + HBP}{AB + BB + HBP} + \frac{3 \times (TB - H)}{4 \times AB} + \frac{R}{H + BB + HBP}\right] - \left[\frac{H}{AB} + \frac{BB + HBP}{BF + BB + HBP} + \frac{ER}{H + BB + HBP} - \frac{K}{8 \times (BF + BB + HBP)} - F\right]$$
– Rickey argued that this carefully balanced formula did an excellent job
of approximating the final standings at season’s end, even if it violated
many long-held beliefs about what was important for a team to win.
– Hall of Famer George Sisler: “I still don’t believe it, but there it is.”
Focusing on What Really Matters
• Rickey’s formulas were the precursors to the OBP and SLG—
new ways of measuring the effectiveness of a hitter in hopes
of finding a better match for the only baseball statistic that
really counts, the number of runs scored.
– Research suggests that a player’s batting average correlates with the
team’s run-scoring success only 75% of the time.
– OPS (OBP + SLG), on the other hand, does so 90% of the time.
• A perfect solution would be to calculate the number of runs
each player is personally responsible for, but since scoring
runs in baseball is such a communal effort, this is difficult.
The Visionary
• In 1977, baseball writer and statistician Bill James coined the
term sabermetrics (based on the acronym SABR) to refer to
the statistical analysis of baseball data.
• For years, James wrote a self-published baseball statistical
abstract after publishers deemed its subject matter too
esoteric for a mainstream audience.
• His work found an obsessive audience of writers, fans, and
baseball officials, and was soon published nationwide.
– It would inspire an entire field of study and copycat publications.
– 2006: Bill James named one of Time’s 100 Most Influential People
Creating the Runs
• James created a number of new statistical categories.
– Range factor (RF)
– Pythagorean expectation: an estimate of how many games the team
should have won, based on the total runs scored and allowed:
$$\text{Pythagorean expectation} = \frac{(\text{runs scored})^2}{(\text{runs scored})^2 + (\text{runs allowed})^2}$$
• The Pythagorean expectation has a strong correlation with the number of games
the team actually goes on to win, although it can be improved still further by using
an exponent of 1.82 instead.
– Win shares: like the winnings divided up by a championship team, this
formula divides up 3w “shares” of w wins among the players according
to the amount each is entitled to, depending on their performance.
• Incorporates hitting, pitching, fielding, and even other “intangible” issues
• However, it is very difficult to calculate; its description in James’s book is 84 pages.
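The Pythagorean expectation is simple enough to sketch directly, with the exponent left as a parameter so that the 1.82 refinement can be compared with the original 2. The function name, the team's run totals, and the 162-game season length are assumptions for illustration.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team: 800 runs scored, 700 allowed, 162-game season assumed.
for exponent in (2.0, 1.82):
    pct = pythagorean_expectation(800, 700, exponent)
    print(f"exponent {exponent}: expected win% = {pct:.3f}, about {pct * 162:.0f} wins")
```

A smaller exponent pulls every estimate toward .500, so the refinement slightly tempers the prediction for lopsided run totals.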
Creating the Runs
• James created a number of new statistical categories.
– Runs created (RC): a general category of formulas combining an on-base
factor A, an advancement factor B, and an opportunity factor C:
$$\mathrm{RC} = \frac{A \times B}{C}$$
– This formula can vary widely depending on how those three categories
are defined (or refined). One basic formula is:
$$\mathrm{RC} = \frac{(H + BB) \times TB}{AB + BB}$$
– Adjustments can be made to this formula by adding in other factors, or
by weighting the various factors with coefficients to make them more
or less important:
$$\mathrm{RC} = \frac{(H + BB - CS) \times (TB + (0.55 \times SB))}{AB + BB}$$
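The A × B / C structure lends itself to a small sketch in which each variant only changes how the three factors are built. The signs used here (caught-stealing subtracted from the on-base factor, 0.55 × SB added to the advancement factor) follow the commonly cited stolen-base version of runs created and are an assumption about the formula above; the function names and season line are hypothetical.

```python
def runs_created(a, b, c):
    """General form: RC = A * B / C (on-base * advancement / opportunity)."""
    return a * b / c


def runs_created_basic(h, bb, tb, ab):
    """Basic version: A = H + BB, B = TB, C = AB + BB."""
    return runs_created(h + bb, tb, ab + bb)


def runs_created_stolen_base(h, bb, cs, tb, sb, ab):
    """Stolen-base version: A = H + BB - CS, B = TB + 0.55 * SB, C = AB + BB."""
    return runs_created(h + bb - cs, tb + 0.55 * sb, ab + bb)


# Hypothetical season line, invented for illustration only.
h, bb, tb, ab, sb, cs = 160, 80, 275, 550, 20, 5

print(f"RC (basic)       = {runs_created_basic(h, bb, tb, ab):.1f}")
print(f"RC (stolen base) = {runs_created_stolen_base(h, bb, cs, tb, sb, ab):.1f}")
```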
Creating the Runs
• James created a number of new statistical categories.
– One particularly elaborate version incorporates many small tweaks:
$$\mathrm{RC} = \frac{(H + BB - CS + HBP - GIDP) \times (TB + (0.26 \times (BB - IBB + HBP)) + (0.52 \times (SH + SF + SB)))}{AB + BB + HBP + SH + SF}$$
• James also developed formulas to help predict a pitcher’s
performance, based on both the ERA and other acts within
the game, all carefully balanced to create the “Game Score.”
• Using these very technical formulas, James became one of the
leading experts on predicting outcomes in baseball.
– Even he, though, cautioned against overdependence on their use.
The Game Changes
• For a century, professional baseball players’ salaries were
restricted by the reserve clause, which guaranteed teams the
right of first renewal even after contracts expired.
– In theory, it meant a form of job security; in practice, it was a “non-compete”
clause, which meant that players could not choose to pursue
higher salaries (or a better team) elsewhere.
– The league had been granted an antitrust exemption by the Supreme Court
in 1922, which Congress left in place.
• Following the first MLB strike in 1972, players won raises and,
more importantly, binding arbitration on salary issues.
– An arbitrator soon struck down the reserve clause, allowing players
whose contract with a team ended after 6 years to declare themselves
“free agents,” who could sign for whatever the open market allowed.
– The result was an explosion in player salaries, which forced many
teams to take a hard look at where their money could best be spent.
The True Believer
• In 1995, new owners inherited the Oakland A’s, and ordered
general manager Sandy Alderson and assistant GM Billy Beane
to slash spending on player salaries.
– In 1998, Beane became the GM, “running the team” at just age 35.
– The A’s had been one of the most successful teams of the previous
two decades, winning 6 pennants and 4 World Series in a small city.
• Unable to spend freely to acquire talent, Beane was forced to
find undervalued players, and used sabermetrics to do so.
– The front office began to emphasize statistics such as OBP, SLG, and
fielding ability rather than the traditional favored stats of BA and RBI.
– In 2001 and 2002, the A’s were one of the best teams in baseball,
winning over 100 games each year despite the second-lowest payroll.
– Michael Lewis’s book Moneyball chronicles the struggles of the “Beane
counters” to convince the team’s skeptical old-guard scouts.
The True Believer
• Since all teams were required to spend a minimum amount on
salaries, and all teams won at least 50 games, the key issue
was how much extra each team paid for each extra victory.
– Oakland had excelled in this area, paying $500,000 for each “extra”
victory on their way to the division title (one of just two teams to
spend less than $1 million per extra win), while richer but poorer-performing
teams were paying $3 million or more for each of theirs.
– To avoid the high costs of free agency, the A’s were also forced to get
as much mileage as possible out of these undervalued stars’ contracts
before their success attracted the attention of the big-market teams.
– Before the 2002 season, Oakland had lost Jason Giambi, Johnny
Damon, and Jason Isringhausen, three All-Stars whose new $33 million
combined annual salaries were as much as the A’s entire team.
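The “cost per extra win” idea can be sketched as the payroll above the league minimum divided by the wins above the 50-win floor. The slides do not give the exact calculation, so this formula, the function name, and every dollar figure below are assumptions for illustration only.

```python
def cost_per_extra_win(payroll, wins, minimum_payroll, baseline_wins=50):
    """Dollars spent above the minimum payroll per win above the baseline."""
    return (payroll - minimum_payroll) / (wins - baseline_wins)


# Hypothetical figures, invented for illustration only.
MINIMUM_PAYROLL = 10_000_000

frugal = cost_per_extra_win(payroll=40_000_000, wins=100, minimum_payroll=MINIMUM_PAYROLL)
lavish = cost_per_extra_win(payroll=100_000_000, wins=80, minimum_payroll=MINIMUM_PAYROLL)

print(f"Low-payroll team:  ${frugal:,.0f} per extra win")
print(f"High-payroll team: ${lavish:,.0f} per extra win")
```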
The Apostles
• Following the success of Billy Beane, several other teams
hired young general managers who used sabermetrics (rather
than a long career as a professional baseball scout) as an
integral part of analyzing potential player acquisitions.
• Their success rate varied widely.
– Paul DePodesta
• Named GM of the Los Angeles Dodgers at age 31
• Fired after just his second season, the Dodgers’ second-worst since moving to L.A.
– Theo Epstein
• Named GM of the Boston Red Sox at age 28, the youngest in history
• Hired Bill James as a sabermetric adviser
• Two years later, the Red Sox broke “The Curse,” winning their first World Series in
86 years (as well as another one, three years later).
A Very Serious Subject
• Several professors now teach sabermetrics courses.
– Jim Albert, Bowling Green State University
• www-math.bgsu.edu/~albert/
– Andy Andres, Tufts University
• www.sabermetrics101.com/
– Steven J. Miller, Williams College
• www.williams.edu/go/math/sjmiller/public_html/399/index.htm
• Typically, the courses are elective follow-ups to a
standard statistics course, created by baseball fans.
– Student projects often involve designing a series of
mathematical models to perform their own analysis.
Play Ball.
PowerPoint slides available at
www.mesastate.edu/~mcrogers
Support your team!
Season Opener:
New York Yankees at Boston Red Sox
Sunday, April 4, 6:00 p.m., ESPN2
Opening Day:
St. Louis Cardinals at Cincinnati Reds
Monday, April 5, 11:10 a.m., ESPN
Bibliography
• Jim Albert and Jay Bennett, Curve Ball: Baseball, Statistics,
and the Role of Chance in the Game.
• Baseball: A Film by Ken Burns.
• Baseball Reference: www.baseball-reference.com/players/
• ESPN statistics page: espn.go.com/mlb/statistics
• David Grabiner, The Sabermetric Manifesto.
(www.baseball1.com/bb-data/grabiner/manifesto.html)
• Bill James’ new website: www.billjamesonline.net/
• Dan Lewis, “Lies, Damn Lies, and RBIs.” National Review,
March 31, 2001. (available online at old.nationalreview.com/
weekend/play-ball/pb-lewis033101.shtml)
• Michael Lewis, Moneyball: The Art of Winning an Unfair Game.
Bibliography
• MLB Official Rules:
mlb.mlb.com/mlb/official_info/official_rules/foreword.jsp
• Branch Rickey, “Goodby to Some Old Baseball Ideas.” LIFE
Magazine, August 2, 1954. (available online at
www.baseballthinkfactory.org/btf/pages/essays/rickey/goodby_to_old_idea.htm
or “scanned” at Google Books)
• SABR: www.sabr.org/
• Alan Schwarz, The Numbers Game: Baseball’s Obsession with
Statistics.
• THE print almanac: John Thorn & Pete Palmer, Total Baseball.
• Tom M. Tango, Mitchel Lichtman, & Andrew Dolphin, The Book:
Playing the Percentages in Baseball. (insidethebook.com)