Title: Adding Complexity Doesn’t Make Spring Stats More Predictive
Date: March 25, 2014
Original Source: Rotographs
Synopsis: This article attempted to improve on earlier methods for predicting breakouts based on spring training statistics.
What Do They Call Doing The Same Thing Over And Over Again And Expecting Different Results?
How Blake Spent Four Hours On A Monday And A Poll If He Should Smash Computer
Do Spring Stats Matter? The Answer May Surprise You
Five Reasons Your Dog Can Identify A Power Breakout
No Signal In Spring Power Noise
Back around 2005 or 2006, John Dewan, founder of STATS Inc. and co-founder of Baseball Info Solutions, made a very fantasy-relevant discovery: He could predict power breakouts with a 60 percent success rate based on spring training statistics.
His methodology was easy to understand: find a player whose spring slugging percentage is 200 points higher than his career mark (minimum 200 career plate appearances and 40 spring plate appearances). It also made it easy for those drafting late to identify breakout candidates.
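For anyone who wants to run the screen themselves, the rule as described above fits in a few lines of Python. This is a minimal sketch rather than Dewan's own code: the `Hitter` class, its field names, and the sample players are hypothetical, and it assumes "200 points higher" means a gap of at least .200.

```python
from dataclasses import dataclass


@dataclass
class Hitter:
    # Hypothetical record layout; the field names are assumptions.
    name: str
    career_slg: float
    career_pa: int
    spring_slg: float
    spring_pa: int


def meets_dewan_rule(h, min_jump=0.200, min_career_pa=200, min_spring_pa=40):
    """True if the hitter clears both sample minimums and the slugging jump."""
    enough_sample = h.career_pa >= min_career_pa and h.spring_pa >= min_spring_pa
    # Round to three decimals so a .420 -> .620 jump is not lost to float error.
    jump = round(h.spring_slg - h.career_slg, 3)
    return enough_sample and jump >= min_jump


# Made-up players, for illustration only.
pool = [
    Hitter("Veteran A", 0.400, 1500, 0.650, 55),   # qualifies
    Hitter("Rookie B", 0.400, 150, 0.650, 55),     # too few career PA
    Hitter("Regular C", 0.450, 2000, 0.560, 60),   # jump too small
]
candidates = [h.name for h in pool if meets_dewan_rule(h)]
print(candidates)
```

The thresholds are keyword arguments so the same function can be reused with a different cutoff.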
Unfortunately, it doesn’t really work.
In the spring of 2013, Ben Lindbergh and Jon Shepherd of Baseball Prospectus tested the so-called Dewan Rule and more or less put it to rest:
Even after adjusting for league slugging percentage (which, again, the Dewan Rule doesn’t specify as a necessary step), the results revealed nothing of use. Of the 218 Dewan Rule batters, 112 improved (51.4 percent). Before their hot springs, the group as a whole slugged 24 points higher than the league. In the seasons after their hot springs, the group slugged 22 points higher than the league.
Basically, the Dewan Rule barely broke even.
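To put a number on "barely broke even," one option is an exact binomial test of the 112-of-218 result against a coin flip. The sketch below is my own check, not part of the Baseball Prospectus study, and it assumes that a 50 percent improvement rate is the right baseline, which is a simplification.

```python
from math import comb


def two_sided_binomial_p(successes, trials, p=0.5):
    """Exact two-sided p-value: sum the probabilities of every outcome
    that is no more likely than the observed one."""
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance so ties are not dropped to float error.
    return sum(x for x in pmf if x <= observed * (1 + 1e-9))


improved, total = 112, 218   # figures quoted from the study above
rate = improved / total
p_value = two_sided_binomial_p(improved, total)
print(f"{rate:.1%} improved; two-sided p = {p_value:.2f} against a coin flip")
```

A p-value this far above the usual 0.05 threshold means a result like 112 of 218 is entirely ordinary for a rule with no predictive power at all.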
Shortly before that article was written, I had taken my own dive into the Dewan Rule over at Beyond the Box Score, hoping that added complexity would be a worthwhile trade-off for improved efficacy. My idea was fairly simple: since Baseball Reference now provides a “quality of opposition” indicator for spring statistics, perhaps I could weed out players whose slugging surges came from feasting on inferior competition. While the QOO metric treats a Wade Davis the same as a Clayton Kershaw, I thought it might provide some additional value, since it does separate a Clayton Kershaw from a Jack Leathersich from a Jimmie Sherfy.
What I did last year was find batters who met the Dewan criteria (a 200-point slugging jump in spring), who also showed that 200-point slugging bump over their prior year slugging (to account for players who had already improved but not for a long enough time to move their career slugging number), and had an average quality of opposition of at least 9.0 (which works out to 50 percent Triple-A pitching, 50 percent MLB pitching, a Quad-A level of sorts).
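The three conditions above, plus the way a 9.0 average falls out of a 50/50 Triple-A and MLB mix, can be sketched as follows. The level values (10 for MLB, 8 for Triple-A) are assumptions chosen to be consistent with that 9.0 midpoint, so check Baseball Reference for the actual scale; the function names and sample numbers are hypothetical.

```python
# Assumed level values, picked so a 50/50 Triple-A/MLB mix averages 9.0.
LEVEL_VALUE = {"MLB": 10.0, "AAA": 8.0}


def average_opp_qual(pa_by_level):
    """Plate-appearance-weighted average of opposing pitcher level."""
    total_pa = sum(pa_by_level.values())
    weighted = sum(LEVEL_VALUE[lvl] * pa for lvl, pa in pa_by_level.items())
    return weighted / total_pa


def qual_adjusted_candidate(spring_slg, career_slg, prior_slg, opp_qual,
                            min_jump=0.200, min_opp_qual=9.0):
    """Spring slugging must beat both the career and prior-year marks by
    min_jump, against opposition averaging at least min_opp_qual."""
    beats_career = round(spring_slg - career_slg, 3) >= min_jump
    beats_prior = round(spring_slg - prior_slg, 3) >= min_jump
    return beats_career and beats_prior and opp_qual >= min_opp_qual


# Made-up spring: half the plate appearances against each level.
mix = average_opp_qual({"MLB": 25, "AAA": 25})
print(mix, qual_adjusted_candidate(0.650, 0.420, 0.440, mix))
```

The plate appearance minimums from the original rule still apply and are left out here only to keep the sketch short.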
That list had 20 players. Unfortunately, only 11 saw gains, against eight who took a step back (one didn’t qualify), and the group as a whole improved by only .020 of slugging over their 2012 numbers and .015 over their career numbers.
|Player||2012 Slg||Career Slg||Spring Slg||OppQual||2013 Slg||2012-2013 Slg Gain||Career-2013 Slg Gain|
When I sat down to write this article, I expected to highlight how much this QOO wrinkle added and give myself the ol’ Barry Horowitz self-pat-on-the-back. That’s because this method identified Jason Castro and Brandon Belt as 2013 breakout candidates, and it was right in both cases – Castro had a slugging jump of .084 and Belt saw a .060 increase. Unfortunately, that was just confirmation bias, and I was only remembering a pair of players it worked for.
And this is kind of the issue with “breakout identifiers,” because it’s really easy to only remember the ones that worked. Maybe that’s not a bad thing – a sleeper who you correctly identify surely helps more than a sleeper you drop in May hurts – but it doesn’t make the method any more reliable than picking players to improve at random.
One thing that occurred to me after trying to use the model last season was that small-sample BABIPs can wreak havoc in spring, so maybe isolated slugging (ISO) was a better breakout identifier than slugging percentage. I went back and re-ran the “Qual-Adjusted Dewan Rule” using an ISO jump of .070 in the spring of 2013 in place of the 200-point slugging jump Dewan had used.
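As a reminder of why ISO is less exposed to singles luck: ISO is slugging minus batting average, which reduces to extra bases per at-bat, or (2B + 2×3B + 3×HR) / AB. The sketch below uses two made-up spring lines with identical extra-base hits but a different number of singles; all figures are hypothetical.

```python
def slugging(singles, doubles, triples, homers, at_bats):
    """Total bases per at-bat."""
    return (singles + 2 * doubles + 3 * triples + 4 * homers) / at_bats


def isolated_power(doubles, triples, homers, at_bats):
    """Extra bases per at-bat, equal to slugging minus batting average."""
    return (doubles + 2 * triples + 3 * homers) / at_bats


# Same 50 at-bats, same 3 doubles and 3 homers, different singles luck.
unlucky_slg = slugging(8, 3, 0, 3, 50)
lucky_slg = slugging(14, 3, 0, 3, 50)
iso = isolated_power(3, 0, 3, 50)

print(f"SLG: {unlucky_slg:.3f} vs {lucky_slg:.3f}, ISO for both: {iso:.3f}")
```

Six extra singles move slugging by 120 points while ISO does not budge. Doubles and triples that come from batted-ball luck still count toward ISO, so this removes only part of the noise.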
That gave us a player pool of 44 players. Of those, 21 saw an increase in ISO in 2013, 22 did not, and one didn’t qualify. On average, players in this group gained just .006 of isolated slugging.
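Put side by side, the two versions of the rule look like this once the non-qualifying player in each pool is set aside. The counts come from the text; the `hit_rate` helper is just illustrative arithmetic.

```python
def hit_rate(improved, declined):
    """Share of qualifying players who improved; non-qualifiers are excluded."""
    return improved / (improved + declined)


slg_version = hit_rate(11, 8)    # 20 players, one did not qualify
iso_version = hit_rate(21, 22)   # 44 players, one did not qualify

print(f"SLG version: {slg_version:.1%}, ISO version: {iso_version:.1%}")
# prints "SLG version: 57.9%, ISO version: 48.8%"
```

With only 19 and 43 qualifying players, neither rate is far enough from 50 percent to suggest a real signal.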
|Age||Tm||OppQual||PA||Spring ISO||Career ISO||Spring-Career||2012 ISO||Spring-2012||2013 ISO||2013-2012|
And once again, there are multiple hits that could have led me to believe there was something here: Marlon Byrd, Castro, Belt, Will Venable, Josh Donaldson and Nate Schierholtz were among the names this version of the breakout predictor would have identified. But it also thought Josh Reddick, Cain, Moustakas, and Wilin Rosario were all in line for big gains that never materialized.
To review, we’ve taken Dewan’s relatively simple model, one that was proven not to work, and tried to make two key improvements – accounting for the quality of opposition, and trying to strip out some BABIP luck by using ISO instead of slugging. And still, there seems to be little in the way of predictive power. If anyone can think of ways to further improve the potential for using spring stats to predict power breakouts, I’m all ears and will gladly try again.
And in the event you still value the “hits” that such a “breakout predictor” may find, here is a list of players who would qualify based on spring stats so far this year (minimum 40 spring plate appearances, 200 career plate appearances, and a .070 gain in isolated slugging over their career and 2013 marks):
|Age||Tm||OppQual||PA||Spring ISO||2013 ISO||Career ISO||Spring-2013||Spring-Career|