Wednesday, November 28, 2007

DIPS Theory and Pitcher BABIP

If you’ve never before heard of Defense Independent Pitching Statistics (aka DIPS) Theory, brace yourself. DIPS Theory contradicts much of what is considered conventional baseball knowledge, but acceptance of it is an invaluable tool for any fantasy player who is committed to winning.

In its earliest form, DIPS Theory suggests that pitchers cannot control what happens to a baseball once it is put into the field of play. In other words, pitchers cannot control how many hits on balls in play they give up. This idea was originally presented by Voros McCracken, who earned a job with the Boston Red Sox for his work.

DIPS Theory recognizes that many statistics – like ERA – that have traditionally fallen squarely on the shoulders of the pitcher are actually influenced by a number of factors, the most prominent being a pitcher’s skill, the defense behind him, his bullpen, and luck. Logically then, why does it make any sense at all to focus on things that are out of a pitcher’s control? If they are uncontrollable and highly variable, what possible sense could it make to use them as an indicator of skill? Hopefully, you’re saying “none” to yourself right now. That means we’re on the same page.

Taking this knowledge into consideration, DIPS Theory attempts to separate pitching skill from everything else to get a feel for how good a pitcher truly is. To do this, Voros said that we should look at all of the statistics that a pitcher can control – those that aren’t compromised by external influence – and none that a pitcher can’t control. We can then create formulas based on them. Here is how he classified different traditional statistics:

Defense Independent

Walks, Strikeouts, Home Runs (essentially), Hit Batsmen, Intentional Walks

Defense Dependent

Wins, Losses, Innings, Runs, Earned Runs, Hits Allowed, Sacrifice Hits, Sacrifice Flies

This breakdown makes complete sense. Defense has essentially nothing to do with a pitcher striking a batter out, and luck has only a minimal effect. Maybe a pitcher faces a few more Adam Dunn-type strikeout hitters per year, but that is about the extent of it. The same thing could be said about the rest of the Defense Independent variables.

If we look at the Defense Dependent variables, though, it becomes obvious how riddled with external noise they are. Wins and Losses have a lot to do with offense, runs have a lot to do with defense and bullpen, hits are heavily influenced by defense, etc.

Since Voros first introduced this DIPS concept in 2001, some nice strides have been made in refining it. As can be observed in his two lists above, Voros realized that Home Runs have some variability. Recently, Home Runs have been replaced in DIPS theory by batted ball types (i.e. ground balls, fly balls, line drives, and pop-ups) that can better predict home run rates than actual homers can. We’ll talk more about that at a later date, though.

Right now, the three most prominent statistics in DIPS Theory, and the ones I primarily focus on when evaluating a pitcher are – in order of importance – strikeouts, walks, and ground balls.

Not only are these statistics the most meaningful stats that a pitcher has nearly complete control over, but they also help solve the riddle of hits on balls in play. Henceforth, the rate of hits on balls in play will be referred to as BABIP (Batting Average on Balls in Play). This statistic is calculated as follows:

BABIP = (H-HR) / (AB-HR-K)


It essentially measures how often a pitcher gives up a hit on batted balls that do not leave the park (Home Runs are treated separately because pitchers have a decent amount of control over them). In Voros’s original article, he suggested that pitchers have virtually no control over how often these balls in play become hits. His exact words were, “There is little if any difference among major-league pitchers in their ability to prevent hits on balls hit in the field of play.” While this was an extraordinary first step, it isn’t entirely true.
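As a quick illustration of the formula above, here is a minimal Python sketch. The function name and the season line are hypothetical, made up purely for the arithmetic, and not any real pitcher's numbers. Note that fuller versions of BABIP also add sacrifice flies to the denominator; this sketch sticks to the simpler form used in this article.

```python
def babip(hits, home_runs, at_bats, strikeouts):
    """BABIP as written in this article: (H - HR) / (AB - HR - K)."""
    balls_in_play = at_bats - home_runs - strikeouts
    if balls_in_play <= 0:
        raise ValueError("no balls in play to measure")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, for illustration only.
example = babip(hits=200, home_runs=20, at_bats=800, strikeouts=180)
print(f"BABIP: {example:.3f}")  # 180 hits on 600 balls in play
```

With these made-up inputs the pitcher allowed 180 non-homer hits on 600 balls in play, which works out to a BABIP right at the league norm.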

Before we explain why, let’s try and understand where Voros was coming from with his theory. Imagine that a pitcher makes a beautiful pitch, and the batter hits a weak-ish, shallow fly ball to left field. The problem for the pitcher is that Albert Belle is sitting out there, and he isn’t able to get to the ball in time. It falls for a hit, though by no fault of the pitcher. There are numerous scenarios like this that happen countless times every year. There are even a number of times when the ball doesn’t fall in because of simple dumb luck. When we think of all the variables involved, Voros’s original hypothesis makes a whole lot of sense.

To restate, his exact words were: “There is little if any difference among major-league pitchers in their ability to prevent hits on balls hit in the field of play.” It was later discovered, though, that BABIP is not a random thing; it is simply highly variable. My colleague at The Hardball Times, David Gassko, penned a fantastic article at the beginning of this year, entitled Uncovering DIPS, about pitcher control over BABIP.

David found that 75% of a pitcher’s BABIP can be found in his peripheral numbers (strikeout rate, walk rate, hit batsmen rate, and home run rate). Voros had originally stated that his belief was “simply that hits allowed are not a particularly meaningful statistic in the evaluation of pitchers.” David’s study shows that BABIP is indeed a meaningful statistic, if treated correctly.

Of course, because of the amount of luck involved, BABIP will fluctuate from year-to-year. If you look at several years’ worth of data, though, it becomes clear that pitchers with good peripheral stats generally have better BABIPs. Not mind-bogglingly better, but better nevertheless. If you’re curious to see some real-life examples of this, here are some career BABIPs to peruse.


It should be noted that league average BABIP is generally between .300 and .305. Also, the difference would be even larger if we had a list of poor pitchers to look at, but pitchers who are poor don’t generally keep their jobs long enough to accumulate a large enough sample.

This is further validated by another article by David, which mentions that one standard deviation of BABIP is 0.009, which equates to 0.20 points of ERA. He explained this clearly and succinctly:

“In other words, 68% of all pitchers are affected by no more than +/- .20 runs due to their ability to prevent hits on balls in play, while 95% are within +/- .40 runs. So the effect is there, but not particularly large.”
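To make the arithmetic in that quote concrete, here is a rough Python sketch. It takes the two figures quoted above (0.009 BABIP and 0.20 ERA per standard deviation) and assumes, as a simplification of my own, that the ERA effect scales linearly with the BABIP gap. The function name is hypothetical.

```python
SD_BABIP = 0.009  # one standard deviation of BABIP skill, as quoted above
SD_ERA = 0.20     # ERA effect of one standard deviation, as quoted above


def era_effect(babip_gap):
    """Approximate ERA swing for a gap from average BABIP skill.

    Assumes the effect scales linearly, which is a simplification.
    """
    return babip_gap / SD_BABIP * SD_ERA


# One and two standard deviations cover roughly 68% and 95% of pitchers.
for n_sd, share in ((1, "68%"), (2, "95%")):
    gap = n_sd * SD_BABIP
    print(f"{share} of pitchers: within +/-{gap:.3f} BABIP, "
          f"+/-{era_effect(gap):.2f} ERA")
```

In other words, even a pitcher two standard deviations better than average at preventing hits on balls in play gains only about 0.40 runs of ERA from that skill.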

So I’ve rambled on for a while now, and hopefully you were able to follow everything I said. At this point, though, you might be saying, “Derek, this is all well and good, but how do I apply this stuff to my fantasy team?”

The most important things that I can stress to you are the three DIPS stats we’ve talked about: strikeout rate, walk rate, and ground ball rate. I’m looking for a catchy name for when I’m talking about all three. Let’s go with the “Triforce of DIPS” and see if it catches on… and yes, that is a Legend of Zelda reference.

We’ll talk more in-depth about each of these stats next week and how to put them all together into an easily understandable format (the ERA scale), but for now, know that if you are going to evaluate pitchers, these are the three most important isolated statistics to look at.

I will, however, talk now about how you can apply BABIP to your fantasy teams. As I’ve hopefully driven home by now, BABIP is highly variable and prone to extreme fluctuations. When you see a player with a really high BABIP, expect that it will regress going forward. When it does, expect a significant effect on the pitcher’s WHIP (which is dependent on hits) and a marginal effect on ERA. The opposite holds true for a really low BABIP. Let’s try a quick exercise and see if you have the hang of it. Here are eleven pitchers and their 2007 BABIPs and WHIPs. See if you can tell me which ones, judging by this data only, should have higher and lower WHIPs next year.


Well, judging solely by this data (we’re ignoring any progress or regression there could be in strikeout or walk rates for the moment), we would expect El Duque, Chris Young, Ubaldo Jimenez, and Carlos Zambrano to have higher WHIPs next year and Zach Duke, Scott Olsen, Matt Garza, Chris Capuano, and Felix Hernandez to have lower WHIPs.

Buddy Carlyle and Cliff Lee were thrown in there to trick you. While their WHIPs are high, this is not because of unlucky BABIPs. Their skills just aren’t that good. Here is a list of each of these pitchers and their DIPS WHIP, as I calculate it, from 2007. They should be in line with our conclusions above.


A word of warning. Just because a player’s high BABIP will regress does not mean you should automatically target him. You need to check his other skills to make sure he is a quality pitcher. While Zach Duke’s WHIP should get lower, a 1.46 WHIP is not something you should be actively pursuing.
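For readers who want to try the BABIP-to-WHIP reasoning on their own numbers, here is a rough Python sketch. To be clear, this is not the DIPS WHIP calculation mentioned above, which isn't spelled out in this column. It simply swaps a pitcher's actual hits on balls in play for what a league-average .300 BABIP would have produced and holds everything else constant. The function names and the stat line are hypothetical.

```python
LEAGUE_BABIP = 0.300  # league average is generally around .300-.305


def whip(walks, hits, innings):
    """Standard WHIP: (walks + hits) per inning pitched."""
    return (walks + hits) / innings


def whip_at_league_babip(walks, home_runs, at_bats, strikeouts, innings,
                         league_babip=LEAGUE_BABIP):
    """WHIP if balls in play had become hits at the league-average rate.

    Simplifications: innings are held fixed, and the pitcher is regressed
    all the way to league average, ignoring his modest control over BABIP.
    """
    balls_in_play = at_bats - home_runs - strikeouts
    expected_hits = league_babip * balls_in_play + home_runs
    return whip(walks, expected_hits, innings)


# Hypothetical pitcher whose actual BABIP was (220 - 20) / 600 = .333.
actual = whip(walks=50, hits=220, innings=200.0)
neutral = whip_at_league_babip(walks=50, home_runs=20, at_bats=800,
                               strikeouts=180, innings=200.0)
print(f"actual WHIP {actual:.2f}, BABIP-neutral WHIP {neutral:.2f}")
```

A gap between the two numbers is a hint that luck or defense moved the WHIP. As the warning above says, whether the BABIP-neutral figure is actually worth owning is a separate question.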

That wraps it up for now. If you have any questions, feel free to e-mail me.

Saturday, November 24, 2007

Introduction

Hello everyone, and welcome to the first edition of Stat Head. Each week, I’ll be breaking down a different statistic. Normally, this will be a statistic that is a little unconventional, one you probably wouldn’t be hearing about on, say, an ESPN.

First, a little about myself. My name is Derek Carty. I am a student in New Jersey and an avid fantasy baseball player. I am the chief fantasy analyst for the Hardball Times, where I write for the Fantasy Focus blog. If you like the types of things I delve into here, feel free to stop by the Hardball Times, where I apply these concepts to actual players.

I won’t be talking about a specific stat today, but rather why the stats I will talk about in the future are important and how they can help you win your fantasy baseball league.

The heart of baseball statistics comes down to this: baseball is a game that is part skill, part luck. For each of the 10 primary fantasy stats, there is some measure of luck involved. Simply by looking at one of these stats, one cannot tell if it is truly reflective of a player’s skill. In fact, for each of these stats, you can predict it better from year to year using some measure other than the stat itself. Pitching strikeouts can predict themselves quite well, but even for these there are other – perhaps better – ways of forecasting.

If up until now you have been unfamiliar with this concept, you may be asking yourself, “So how do we predict the stats, if they can’t predict themselves? How is that even possible?” The answer, at its bare bones, is to separate the luck from the skill. If we look at the components of each stat separately, at the things a player can control (or at the things he can’t), we can see where the expected level of the stat should be. By doing this, we can better predict the path that the stat will take in the future.

If you’re still not convinced, look up the stats of your favorite pitcher. It really doesn’t matter who it is. Just make sure to pick someone who has been in the majors for at least three years or who you have minor league numbers for. Do that now… I’ll wait for you.

Now look down the column labeled “ERA.” Notice how ERA fluctuates from year to year, seemingly without any rhyme or reason. Take Johan Santana, who is pretty much the consensus top pitcher in baseball. In the past four years, his ERA has ranged from 2.61 to 3.33. While those figures are obviously good, if you were to try to put an ERA on Santana for 2008 using only that information, it might be a little difficult, no?

You might think that the difference between putting him at a 2.60 ERA or a 3.30 ERA is negligible; I mean, they are both great figures, ones you would surely take on your fantasy team any day. The truth, though, is that there is a 0.70 difference between the two. That is a huge gap. That’s nearly a full ERA point.

Once you move away from the elite pitchers, that 0.70 could be the difference between a pretty good 4.00 ERA and a pretty poor 4.70 ERA. In many fantasy leagues, that is the difference between a #3 fantasy pitcher and one who isn’t rosterable. And the killer part is that there is just no way to tell whether it will be on the high end or the low end if you only focus on ERA. It seems a little counterintuitive at first, but you need to dig deeper than the stat itself.

That’s where I come in. I’ll tell you the stats you should be looking at, the stats that dictate where a player’s ERA (or any other stat for that matter) should end up. Because there is luck involved, it will be impossible to always get it right, but in the long run you will get far more right than you would by any other means, and that’s really what it’s all about.

Baseball is not about perfection. In conventional terms, a player who hits .300 is considered a success, even though he failed to get a hit 70% of the time. If we look at a somewhat more complete statistic and an absolute freak like Barry Bonds, we see an OBP that was once an absurd .609. Still, failure occurred about 40% of the time. If one consistently gets just 60% of the questions on a high school or college exam correct, that person would not graduate.

Baseball is not about perfection, and predicting baseball statistics is the exact same way. Perfection is not an option because of the high variability of the stats… because of the luck factor. But by digging deeper into the stats, we can begin to sift through the luck, find the skill, and from there we can make better predictions than we ever could before.

That’s all for now, but be sure to stop by on Tuesday as I begin to discuss some of these statistics, starting with the one I like (and hate) the most: Batting Average on Balls in Play (BABIP).
Copyright © 2008 Front Office Sports Enterprise. All Rights Reserved.