New statistic WIP

Many baseball statistics are kept at the forefront of…uh, baseball statistics because they’re pretty easy to calculate and they’ve been around for a while. This doesn’t mean that they’re inherently useful or revealing. One such statistic is wins for a pitcher–because it’s somewhat dependent on how many runs they receive in support, it can’t directly measure whether they’re doing well at retiring batters and not giving up runs. It’s also dependent on when they receive runs in support. Witness the countless “BS, W” lines (which are a little bit, if you’ll pardon my language…um, BS)–a starting pitcher does well, a reliever gives up the lead, and that reliever is credited with the win because the team eventually rallies back anyway.

So–urged on by a couple of nice instigators–I got to playing with an idea for re-awarding pitcher wins, part of it plagiarized from another sport entirely. In the admittedly small sample size of 33 Cubs victories this year (shush), my system “felt” egregiously wrong three times, due to a vulnerability that I was aware of going in. This is, coincidentally, exactly as many as the “egregious” assignments of winning pitchers under the normal system. So I can’t really say my way is an improvement (which is why I’m not going into details. Yet. Maybe later once I’ve tweaked it and/or come up with a funny acronym.)

I do, however, want to discuss some interesting case studies. My system and the standard system agreed in 24 of the 33 cases; setting aside 6 more “egregious” games leaves a couple of other borderline judgments. Maybe looking at some of these will show you how awkward this system is…

April 13: Cubs 9, Astros 5

Carlos Zambrano started the game for the Cubs, who led 6-0 when he took the hill in the bottom of the sixth. He then gave up five runs before being pulled for Marcos Mateo, who recorded one out to get out of the inning. Sean Marshall, Kerry Wood, and Carlos Marmol pitched scoreless seventh, eighth, and ninth innings respectively. Zambrano was the actual winning pitcher, despite almost giving the game away. My system would credit Wood with the victory. Had the Cubs not scored one run in the top of the sixth, Zambrano wouldn’t have gotten the win…on the other hand, that one run was his solo homer, so perhaps he deserves the credit after all.

May 7: Cubs 3, Reds 2

Casey Coleman entered the top of the seventh with the Cubs up by one. He walked the leadoff batter and then gave up a single for first and third with nobody out. Wood came in and got an out at second that brought in one run, followed by a sacrifice bunt and then a single that brought in the go-ahead run. He got out of the inning, James Russell retired two batters in the eighth, and Marmol pitched a scoreless inning-and-a-third before picking up the win when the Cubs walked off (twice, but that’s another story). My system also credits Marmol with the victory…did Coleman really deserve it, or were those two batters too big a liability? I might still need to make some tweaks as far as inherited runners go.

June 14: Cubs 5, Brewers 4, ten innings.

Even more time for more pitchers and more…intrigue.

Randy Wells left the game after six, with the Cubs down by 3. Rodrigo Lopez pitched a scoreless seventh (the Cubs got a run in the bottom half, so it was 3-1), but Lopez walked the leadoff batter in the eighth and was pulled for Russell. Russell gave up a bunt which moved the runner to second, and was removed for Chris “Not That One” Carpenter. Carpenter gave up a double to bring in a run (charged to Lopez), but then got out of the inning. In the bottom of the eighth, the Cubs scored three to tie the game. Marmol and Jeff Samardzija pitched a scoreless ninth and tenth respectively, and the Cubs walked off in the tenth to give Samardzija the win. My system credits Carpenter with the victory, which seems wrong–he pitched two-thirds of an inning, during which one run was scored. On the other hand, did anyone else really deserve the win?

The point is, only timing distinguishes Marmol’s and Samardzija’s performances, and it doesn’t make sense to put a lot of value on a raw statistic that is influenced by factors so arbitrary. I might go off and tweak my system, but for now my recommendation is bipartite. Either a) do not award “blown save/win” under any circumstances, and give the win to the guy whose game would have been saved had the save not been blown, and/or b) track statistics more meaningful than wins (and, for that matter, saves), which you should do anyway. :p

Also, read the actual rules about “effective”ness for winning pitchers, as shown here (pages 110-111). I’d be curious to see how many times a season rules 10.17 (b) and (c) are applied, among all teams.


ZACK Tweak

One modification to the ZACK statistic I defined last summer: I’m pretty sure you should replace x by x/2 only when you are playing your nearest competitor in a given race. Last time I said you should replace it whenever that is not the case…I think I got it flipped around.

A Rockies fan at Coors today would have accumulated a ZACK of 38, which is rather high.

ZACK Non-Attack

Unless the opposing pitcher is on your fantasy team, there’s really almost never a bad time for your team to get hits. There are times when it would be really, really nice for them to get hits, and then times when it’s not so important, but hey, the more the merrier. Fantasy aside, can we therefore conclude that it is always good for your team to get hits, and always the best course of action to root for them to do so?

No, we cannot. For we also appreciate the home fans, seen in the broadcasts of no-hitters we switch over to, who start cheering for the visiting pitcher at the end of the game–tacitly rooting against their own batters. But when does the “end of the game” begin? When is it acceptable to switch your allegiance?

Enter the ZACK, a new statistic that hopes to measure the acceptability of rooting for a no-hitter against your own team. This is the first version of the formula, so the scale of some of the numbers involved might be off. Feel free to give feedback.

ZACK is derived from four variables: ZACK=Z*(A+C+K). Each of these letters stands for a different question that a fan might implicitly consider on these occasions:

Z is for Zone? (Where am I?)
A is for Accomplishment? (How far along is this game?)
C is for Crucial? (Is this an important game to win?)
K is for Killed? (Are we getting killed out there?)

Here’s how you compute these…

Zone. We take as our baseline being actually present at the game; if you’re at the game, Z=1 so it won’t make a difference when you multiply it by the sum and you can move on to A. If you’re not at the game…
…but you’ve been paying attention (through any chosen medium) from the start, Z=3/4
…but you’ve been paying attention for a while, Z=2/3
…but you just started paying attention because someone has alluded to what’s going on, Z=1/2 (You may think that this would be strange for a game involving “your team”, but perhaps they’re only “your team” in the league you don’t care about quite as much and you’re busy watching a different game.)
…but you just started paying attention because someone has explicitly told you what’s going on, step away from the computer (or open a new tab), and give that person a stern talking-to from me.

Accomplishment. The editors of MLB.com apparently believe in a “bright line test” after the fifth inning or so, allowing live look-ins after a specific point in the game. I do not; no inning should be disproportionately important. (If the correct answer to the question posed at the end of the second paragraph is “the seventh inning, period”, we wouldn’t be having this conversation.)

So, my first principle in ZACK is that each inning is equally important. Therefore, A=8 at the beginning of the game, and increases by 1 for every out recorded by the opposing pitcher, whether that out comes in the first inning or the ninth. (Well, actually, I didn’t consider extra innings. Maybe A should go up by 2 for every out beyond the twenty-seventh!)

However, I can appreciate that completing a half-inning may have more subjective value than just three outs. So, A increases by an extra 1 for every half-inning completed. (The A value for a game could go 8-9-10-12-13-14-16 through the first two innings if all the batters go down in order.)

Subtract that original 8 from your A value if your team has gotten on base through other means.
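If it helps to see the bookkeeping spelled out, here is a minimal Python sketch of the A rules above. The function name `accomplishment` and its parameters are hypothetical, made up for illustration. It ignores extra innings, since that part of the rule is still up in the air.

```python
def accomplishment(outs, reached_base=False):
    """Sketch of A, following the rules in this section.

    outs: outs recorded against your team's batters so far (0 at first pitch).
    reached_base: True once your team has a baserunner by any non-hit means.
    Extra innings are not handled specially (an assumption of this sketch).
    """
    a = 8                  # starting value
    a += outs              # +1 for every out recorded
    a += outs // 3         # +1 more for every half-inning completed
    if reached_base:
        a -= 8             # the original 8 is subtracted
    return a


# A through the first two innings with every batter going down in order
print("-".join(str(accomplishment(n)) for n in range(7)))
```

The printed sequence should match the 8-9-10-12-13-14-16 progression described above.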

Crucial. This is the hardest to calculate, and the one that might need the most fine-tuning. I apologize in advance.

The first thing to do is see if your team has either clinched a playoff spot, or been eliminated from playoff contention. If so, C=9. Go directly to K. If not, continue below. However, if the value that you get from continuing below is greater than 8.5, use C=8.5 instead.

Okay, so your team is still in some sort of race; it’s either leading the division or the wild card by x games, or it’s x games out of the division or wild card (consider whichever case makes x smaller–usually you’re closer to the wild card than the division title). If you are playing the team chasing you by x games/in first by x games, go to the next paragraph. Otherwise, take x/2 and go to the next paragraph.

Okay, so the number you got from the paragraph above, which is either x or x/2? To this, add the number of months remaining in the regular season; +5 in April, +4 in May, +3 in June, +2 in July, +1 in August, 0 in September. (If this is October and your team has not clinched or been eliminated, you have other things to worry about than ZACK.)

Okay, now you have C. Unless this value is greater than 8.5, remember, in which case C=8.5.
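Since C has the most moving parts, here is a rough Python sketch of the steps as written in this section. The names `crucial`, `MONTHS_LEFT`, `playing_that_rival`, and `settled` are all hypothetical, and the halving rule is the one stated here (halve x unless you are playing the team on the other end of that x-game gap).

```python
# Months remaining in the regular season, as listed above
MONTHS_LEFT = {"April": 5, "May": 4, "June": 3,
               "July": 2, "August": 1, "September": 0}


def crucial(x, month, playing_that_rival, settled=False):
    """Sketch of C.

    x: games ahead or behind in the closer race (division or wild card).
    month: name of the current month, April through September.
    playing_that_rival: True if today's opponent is the team x games away.
    settled: True if your team has clinched or been eliminated.
    """
    if settled:
        return 9.0
    gap = x if playing_that_rival else x / 2
    return min(gap + MONTHS_LEFT[month], 8.5)   # cap at 8.5


print(crucial(3, "July", playing_that_rival=True))   # 3 + 2 = 5
print(crucial(12, "May", playing_that_rival=False))  # 6 + 4 = 10, capped at 8.5
```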

Killed. K increases by 1 for each run, beyond the second, your opponents have scored.

I originally considered tweaking this, as your team is not necessarily getting killed out there even if your opponents have scored more than two runs. However, I think the novelty factor of watching your team score without getting hits offsets this.

All right, we’re done! Multiply Z*(A+C+K) to get your ZACK score, and then root away.
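To put the pieces together, here is a small Python sketch of the final multiplication. The names (`ZONE`, `killed`, `zack`) and the zone labels are hypothetical. A and C are assumed to have been worked out already by the rules above, and the "someone explicitly told you" case is left out on purpose, since no Z value is given for it.

```python
# Z values from the Zone section; the keys are made-up labels
ZONE = {
    "at_game": 1.0,
    "from_start": 3 / 4,
    "for_a_while": 2 / 3,
    "alluded_to": 1 / 2,
}


def killed(opponent_runs):
    """K: +1 for each opposing run beyond the second."""
    return max(0, opponent_runs - 2)


def zack(zone, a, c, opponent_runs):
    """ZACK = Z * (A + C + K), with A and C supplied by the caller."""
    return ZONE[zone] * (a + c + killed(opponent_runs))


# Hypothetical fan at the park: A=20, C=5, opponents have scored 4 runs
print(zack("at_game", 20, 5, 4))   # 1 * (20 + 5 + 2) = 27.0
```

Keeping A and C as plain inputs means this part does not need to change if either of those rules gets tweaked later.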

What’s that? You’d maybe like an example or two to help with the numbers and stuff? That’s fine.

For a couple of examples, we will take two games last night; the Twins game against the Royals I attended (you will never guess who was pitching for the Royals. Never. At all), and the simultaneous Cubs game against the Astros. Both the Twins and Cubs qualify as “my team”.

First, let’s get Z and C out of the way, as we can do that early for both teams. (I suppose in late September C could change while the game is going on, if another team loses/wins at the same time; again, if this will dramatically alter your result, you have other things than ZACK to worry about at this point.) I was at the Twins game, so Z=1 there, but the Cubs were already losing by the time I started glancing at the scoreboard, so Z=2/3 there.

C for the Twins? They entered the game with a 5.5 game lead over the White Sox, so x=5.5. They weren’t playing the White Sox, so 5.5/2=2.75. It’s September, so that’s all there is to do to find C.

The Cubs are actually farther out of the wild card race than they are the division race. Taking the “mere” 19.5 games by which they trail the Reds, we have 19.5/2=9.75. This is greater than 8.5, so C=8.5 for the Cubs. (That is to say, we’re not worrying too much, playoff wise, about the outcome of this game. Might as well root for a no-hitter.)

So, at the beginning of the game, the Twins’ ZACK was 2.75+8=10.75, and the Cubs’ was (2/3)*(8.5+8)=(2/3)*16.5=11. Here’s what happened through the first few innings…

Graph of Cubs and Twins' ZACK scores

Am I on the right track? There’s a lot of room for improvement, but I feel like this is a good start.

By the way, I’m not going to give you a cutoff value for acceptability, except that I’m pretty sure it’s greater than 14. Beyond that, it’s an issue that each fan must struggle with alone.

Also, although it should be obvious, don’t bother with ZACK once your team has already gotten a hit! Then, just sit back, enjoy, and hope the barrage continues for enough runs that Matt Capps can’t blow it.