I've been reading and enjoying your work, Joe, since you wrote for the KC Star. This piece is beneath you. Going after Houston's weather? From October through April, the weather is great. And my understanding is the commissioner's office controls whether the roof is open or closed, including for game 2. Both teams have been playing in the same ballparks. The most common time for the roof to be open is in April. The Astros are proving they're the better team, thoroughly outplaying the Yankees through game 3.
The funny thing about this is that even in Joe’s story about how the postseason is all luck, he relies on narratives. A good bounce, a favorable wind. In this case, perhaps that happened, but that’s not the way luck usually works. Bregman is a good hitter. So is Judge. In any given game, either of them could easily hit a home run. But there’s usually no particular reason why it happens in one game instead of another.
I didn’t see Judge’s opposite field shot. Bregman’s looked like a home run off the bat, and he reacted to it that way. It went a few rows up. Maybe the wind prevented it from going further. The 4% figure may have more to do with how short left field is in Minute Maid Park than anything else.
The problem with Statcast's estimates is that even though they have all the bells, whistles, and cameras, they don't use any of that for their hit estimates. They take the speed of the ball, the angle of exit (only the up-and-down angle), and how far the ball went before it hit the ground. They plug those numbers in and spit out a number.
They pay absolutely zero attention to where the ball is hit. I found that out because I was researching a double in a game that I watched, softly hit down the first base line, that they gave an absurdly low chance of being a hit. When I watched it, I felt it was a double all day long unless the first baseman was guarding the line or they were in a shift that direction. A weak double, but it happens all the time with a ball right up the line like that.
So, for Bregman's hit they had a ball hit at a pretty average rate (It was 91.4 mph. The average is 87) at a fly ball angle, and it went 360 feet. (Which also means that close to the line, it is out of pretty much any park. I don't like short porches either. I think the lines should all be at least 330. But that will never happen.) But because they don't pay attention to where the ball is hit, obviously anywhere from the beginning of left center to the beginning of right center it is a can of corn everywhere. So they get the 4%.
Because Judge hit his on a line with a lot of mph, and it was deep, it had a 91% chance. An outfielder has to be close to get to something like that. But the Astros were playing him deep, as they should, and Tucker made a great play. Statcast also said that Judge's drive would have been out of exactly one ballpark: Yankee Stadium.
Also, for all the talk of the short porch, the left field line in Houston is 3 feet shorter than the left field line at Yankee Stadium, and actually a foot longer than the right field line at Yankee Stadium.
I don't know if that is a problem with Statcast so much as the way they mean for it to work. I think the goal with that number is to make it purely what the batter (and pitcher) control vs an actual probability of an event. Getting a weak double because of where a soft liner is hit is more luck than skill for the batter. There are factors they could look at to see if the hitter got lucky, but should they?
Boy, Yankees fans are like Michigan Wolverine fans, who complain about wind, rain, refs, “crowning of the field”, and whatever other excuse they can find.
I don't know - the tone of this piece just seems off to me (and honestly, the whole "No BS Preview" series). It reads like "who cares? Nothing you do matters. It's all a coin flip anyway!" That just seems... antithetical to all we love about sports. Yes, I know luck is always in play (in life and in sports). But sports at its core is supposed to be about the competition - line 'em up, mine against yours, who can handle the pressure, who's better. Results matter. If we really believe the statistical analysis selects the best team, and the game is just a random outcome generator, what's the point of playing the game? Why should anyone care? And I can't imagine you believe that.
Yes, it’s pretty nihilistic to believe that the playoffs are just random. Maybe this nihilism is what’s destroying baseball. Not the shift. Not the length of games. Not pitching changes. Not all the strikeouts. It’s the underlying, widespread, and statistically supported belief that nothing a player or coach can do matters.
I think the truth is that randomness, which is definitely real, is viewed incorrectly by many of us. It’s not hopeless and pointless. Players still play. They hustle and hit and pitch and run. They play as a team. The random nature of the sport frees players from destiny or fate, from fear and intimidation. The power shifts to players having to rely only on themselves and each other, which empowers and motivates them to play with a sense of purpose in which they rely on each other.
Dismissing the wins of lesser teams as “luck” completely ignores the great things they achieved as a team. Did they get lucky? You know they did. Did they do what they needed to do to win? Absolutely.
Based on a statistical study of three-game regular season series from 2010-19, at least 60 games into the season, with a win% difference of at least .075: The "better team" won 62% of the time.
From 2010-19, at least 60 games into the season, with a win% difference of at least .150: The "better team" won 67% of the time.
So in roughly one out of three three-game series, even with a meaningful win percentage gap, the lesser team comes out ahead. That means that teams facing a .075 win% gap would post an upset more than a third of the time.
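For anyone who wants to see where percentages like these can come from, here is a rough sketch. It assumes every game is independent and that the better team has the same per-game win probability p each time, which is a simplification and not how the study above was done. The function name and the sample values of p are made up for illustration.

```python
from math import comb


def series_win_prob(p: float, games: int = 3) -> float:
    """Chance of winning a majority of `games` independent games,
    given a fixed per-game win probability p (an assumption)."""
    need = games // 2 + 1  # 2 wins needed in a three-game series
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(need, games + 1)
    )


if __name__ == "__main__":
    # Hypothetical per-game edges for the "better team".
    for p in (0.50, 0.55, 0.60):
        print(f"p = {p:.2f}: takes 2+ of 3 with probability {series_win_prob(p):.3f}")
```

Under those assumptions, a team that wins 60% of its individual games still takes two of three only about 65% of the time, which is in the same neighborhood as the 62% and 67% figures quoted above.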
The difference between Regular Season and Playoff Baseball is that it's more likely that both teams would be playing their very best players without concern for rest or shelving players to recover from the most minor of injuries. IDK how much that would affect this study, if at all.
I would say the playoffs don't determine the best team. What the playoffs determine is... the winner of the playoffs. If you played the whole thing simultaneously in Universe A and parallel Universe B you'd get completely different outcomes. And that's way more true in baseball than in other major American sports.
Doesn't mean it's not exciting or fun to watch. I celebrate when my team wins and feel terrible when they go out. But as for determining who's best, who can handle the pressure, all that jive... nah. It's a crapshoot.
Luck?? How 'bout the Yankees getting the gift of two runs handed to them when Framber Valdez committed two errors on the same play? Or do the Yankees call that skill?
Complaining about playing outside... The last I checked, Yankee Stadium is outside. Also, as I recall, the teams played in the same stadium last night. They say it is a poor carpenter who blames his tools. It is an even poorer carpenter who blames the jobsite for his tools not working.
It’s human nature to only see luck when it goes against us while luck that goes in our favor is really something we did ourselves.
Many times during a game a hitter will ever so slightly miss and foul a pitch straight back, or hit a rope down the line but just foul. Pitchers get lucky too, however quickly they forget. There’s some luck in every pitch and swing - nobody can put the ball exactly where they want or swing exactly in a certain plane.
What I find the Yankee-est about Severino’s quote is that it seems to ignore that the Astros have in fact been the better team this season in terms of both their overall record and (by quite a bit) head to head against New York. If I didn’t know better from that quote I’d think it was the other way around.
One point in favor of Sevy's assessment: the Yankees outscored their opposition by 240 runs this year, while the Astros were +219. (And given the unbalanced schedule, some would say the AL East is a stronger division than the AL West, but I've not looked into that.) The Yankees therefore did more toward actual winning -- because outscoring is the only way to win -- than the Astros did, so they could probably be considered the better team over the course of the season. But the Astros are hotter now, and the Yankees stumbled to the finish line, so recency bias factors in strongly. And it certainly matters: now is when we're playing this tournament!
I think over 162 games, to borrow a line from Bill Parcells, you are what your record says you are. A flawed measure but less flawed than the others I think.
Run differential to me is useful as a rough guide to how sustainable winning is, not as an accomplishment to shoot for. If the Yankees beat the Royals 12-2 and the Astros beat them 9-3, I don’t think it was 67% more of a win by New York even though their run differential might say so.
That’s just me and I guess for subjective things like who’s really better, the beauty is that everyone can choose their own lens. And though someone will win this series, whoever doesn’t can still claim to be better and, if they want, unlucky.
Last I checked, the team that wins more games is the better team, not the one that had the higher run differential. I’m tired of hearing how the Astros have only won these games by one or two runs, as if one or two run victories don’t count as wins. The Astros won 106 games this season, many of them just like the games they have won this postseason. I watched most of them. They are set up to win games like this. This is how they play. If they go 11-0 thru the playoffs with a +16 run differential, are we going to have to hear about how close it was?
I’m just not sure winning close games is a repeatable skill, and therefore a meaningful measure of quality. I guess if you win, you can claim to be the “best,” which is why we play the games, after all. I would say that particular team is the “winner,” but not necessarily the “best” — the best teams don’t always win, which is a big part of the appeal of sports, of course.
But run differential is not as big of an indicator as people think. It can be (and somewhat was this year for the Yankees) an indicator of a good offensive team that has a lot of large run totals in individual games. I do a run spread which takes into account how many runs a team gives up or scores in each game. I haven't done it for the Yanks and Astros yet, but I have 300 seasons doing it, and it is much more accurate than Pythagorean record, which takes only runs scored and runs allowed in the full season into account.
The Yankees had 22 games in which they scored double digit runs this year, and their run differential in those games was +186. That is still only 22 wins. In the other 140 games they only outscored their opponents by 54 (.386 runs per game). The Astros had only 13 such explosions and outscored their opponents in those games by 109. In the other 149 games they outscored their opponents by 107 runs, or .718 runs per game. It is not so hard just from this to see where the Astros made up those 9 wins and then won more.
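The split above is easy to check. A small sketch, using only the game counts and run differentials quoted in this comment (the dictionary layout and function name are just for illustration):

```python
# Figures as quoted in the comment: games with double digit runs scored
# versus all other games, with the run differential in each bucket.
SPLITS = {
    "Yankees": {"big_games": 22, "big_diff": 186, "other_games": 140, "other_diff": 54},
    "Astros": {"big_games": 13, "big_diff": 109, "other_games": 149, "other_diff": 107},
}


def per_game(diff: int, games: int) -> float:
    """Average run differential per game for one bucket of games."""
    return diff / games


if __name__ == "__main__":
    for team, s in SPLITS.items():
        big = per_game(s["big_diff"], s["big_games"])
        other = per_game(s["other_diff"], s["other_games"])
        print(f"{team}: {big:+.3f} per game in blowouts, {other:+.3f} per game otherwise")
```

Outside the double digit games, the Astros' per-game margin (.718) is close to double the Yankees' (.386), which is the point being made.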
I don't have time right now to run the expected wins for both teams, but I will at some point. Just looking at this, I believe the Astros' expected wins will easily be higher than the Yankees'.
That’s fascinating, and a great way to get at the crux of the run distributions. I wrote an article way back when for SABR’s “By the Numbers” newsletter that used standardized run differential adjusted for league size as the criterion, and while the results were solid, there were definitely some outliers that I bet your methodology would catch. Keep us posted!
OK. So I did the Astros and the Yankees for this year. I do actual run scoring for each game, both offensively, and for pitching and defense and come up with a percentage as to what kind of winning percentage each would have if they were exactly an average team on the other side of the ball, adjusting a bit for park factors. Then I crunch some numbers and come up with an expected winning percentage overall, and then how many wins that would be.
When I first started doing that, I did 10 seasons of 30 teams, and compared the differential to Pythagorean differentials for each team. It was significantly more accurate, overall, than either the BR or Fangraphs versions. (They are a little different.) Also, when I added up all the teams together for a full season, it was a little closer to the actual number of wins in the season.
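For readers who haven't seen it, the Pythagorean record used as the baseline here is computed from season totals alone. A minimal sketch, assuming the classic exponent of 2 (published versions tweak the exponent, and this does not try to reproduce any particular site's formula). The run totals are invented purely for illustration, and this is not the per-game method described in this comment.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage from season run totals.

    exponent=2.0 is the classic form; treat it as an assumption.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored: float, runs_allowed: float,
                  games: int = 162, exponent: float = 2.0) -> float:
    """Expected wins over a season of `games` games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


if __name__ == "__main__":
    # Hypothetical team: 800 runs scored, 600 allowed.
    print(f"{expected_wins(800, 600):.1f} expected wins")
```

Because only the two season totals go in, a handful of lopsided games moves the estimate as much as many close ones, which is the variance problem a game-by-game approach is meant to address.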
This year I have the Astros at 103 expected wins, and the Yankees at 98. Though they nail the Astros, I am closer with the Yankees and closer overall for all the teams I have done (about half thus far). Their biggest outlier was being 9 games off on the Rangers (6 games for me). My biggest outlier was the Dodgers at 7 games. They actually underachieved, and had the most expected wins I have had since doing this at 118. (Actually 117.55, but they round, so I do as well.) I have never done the Mariners' 116-win team. Maybe I should do that one.
No one cares about this, but it is something I got obsessed with at one point in time, when I did the 10 season comparison. It just seemed that using the season run total didn't account for variance enough. Now I start doing it (in season) with the Royals (my team), and then start doing teams that interest me, and usually end up doing all the teams at some point in the off season just to get that overall comparison. My method has yet to not finish first, though it has been pretty close some years.
Great philosophical observation about what happened in the game, and what we — even the players — think happened in the game. Reminds me of HBO GoT quote, “A story we agree to tell each other over and over till we forget that it's a lie." Then I also remember the Texas - USC bowl game, and VY, and realize that — numbers be damned — some DO will an outcome into reality. Love. Sports.
What it comes down to is that Judge is consistently, and disproportionately, rewarded at home for that swing… but that doesn’t mean it’ll work on the road…
It's true! This year, Aaron Judge had an AMAZING OPS at home: 1.081. And on the road, he was absolutely pathetic, only having an OPS of 1.141.
Sure, he hit 30 homers in 80 games at Yankee Stadium this year. Anyone can do that. What's really pathetic is how he only hit 32 homers in his 77 road games.
Apropos almost nothing that Joe said, I DO NOT understand the use of the Statcast numbers telling us that there was a 4% hit probability on Bregman's ball or a 91% probability on Judge's (the same complaint on NFL stats like "that pass had a 5% chance of completion"). In fact, there was a 100% chance that Bregman got a hit, and a 0% chance that Judge got a hit (and there is either a 100% or 0% chance the pass was caught.) How do these numbers aid in any way in decision making? Just because we can measure something, doesn't mean we have to. Part of the resistance to a lot of new stats is that we keep throwing out numbers without providing any reason. Almost 90 years ago Red Barber might tell us the wind is blowing in from right field - what does Statcast add to that?
Using my LaffStat formula (an extremely complicated, proprietary formula the details of which I cannot divulge), I was able to tell, within mere seconds of him hitting the ball, that Judge had a 0% chance of getting a hit. I have been using this formula for some time, and it is pretty much flawless.
I forget what network it was (I think I've watched baseball games on 70 different networks this season), but they kept showing percentages on the bottom right hand of the screen and it was super irritating. It would be things like "Mike Trout has a 13% chance of getting a hit" on a 0-2 count or "Dansby Swanson has a 47% chance of getting a walk" on a 3-0 count. I hated every minute of it.
I believe that this is akin to when a meteorologist says "there's a 50% chance of rain," in that yes while it will rain or it won't, the thing they're really saying is that on days with extremely similar weather conditions to the ones happening at that time, it has rained 50% (or so) of the time. So, Bregman's home run had a launch angle and velocity that in the past have translated into a hit 4% of the time, which is why it's so cool and/or weird that this time something (the wind?) happened to turn it into, you know, a game-winning home run. I dunno, does that make sense?
If the meteorologist says there is a 50% chance of rain, that helps me make a decision, and I can carry an umbrella. If she tells me there was a 50% chance of rain yesterday it does nothing for me. I can't picture the dugout conversation that went: Analyst: "Alex, if you hit the ball with a 47% launch angle and at 109 mph you only have a 4% chance of a hit." Alex: "OK, I won't do that."
It makes perfect sense in the right context. When evaluating hitters over a season, Statcast exit velocities and angles can tell us the expected batting and slugging averages, based on the balls they actually hit, rather than the fielding results which have a degree of luck: for example, a 109-mph line drive hit right at a fielder, or a soft fly that just happens to land in the one spot where no fielder gets to it.
As far as giving the percentages for individual hits during a game, many people find them interesting, and many Yankee fans apparently find them useful to explain how their team was robbed of a win, this time.
What good is telling us the distance of each home run? It doesn't change anything, but it's an interesting piece of information that makes the game more enjoyable to many. So it is with the Statcast numbers.
Telling me after the fact that the home run travelled 480 feet is a fact that tells me something about how the ball was hit. Telling me that there was a 47% chance of it going that far tells me nothing. If you're playing poker and hit an inside straight the important fact is that you hit it. The fact that there was only an 8% chance of hitting it is important while you're making the decision to draw. It is meaningless after the fact.
Statcast data tells you how hard the ball was hit and at what angle. It doesn't tell you that there's a 47% chance it will go that far, it tells you that of all balls hit that hard, at that angle, 47% fell in for hits. Many people find this interesting, at least as interesting as how far it went.
Ultimately, nothing matters but the final score. But some of us enjoy watching the games and getting interesting information along the way. Most of us try not to tell other people what kinds of information they should and shouldn't find interesting or informative.
What this really tells us is that Statcast is not able to utilize all the information available (wind speed and direction, air temperature, relative humidity at field level and in the air, placement of players and likely many more). If Statcast could take everything into account, it clearly would predict that in these instances, that chance of Judge getting a hit was 0% and the chance for Bregman was 100%.
It makes a lot of sense to me and is incredibly interesting. It tells me that Bregman didn't actually hit the ball all that well, but circumstances coalesced (ballpark shape and size, wind, humidity, amount of spin) in just such a way that he was rewarded with a home run. It means that if Bregman hit the ball that way 100 times in a variety of ballparks and weather conditions and against a variety of opponents, he would probably be out 96 times. I find that completely logical as well as fascinating.
Context in baseball is super important and adds a lot to the viewing experience. Statcast figures help us determine what effect context is having on actual gameplay.
The breaks would go the Yankees' way a lot more often if they stopped swinging at pitches way outside the strike zone. The first time through the order, yeah, OK, you get fooled; the second and third times, way fewer excuses.
Every time I see the expression "ground ball with eyes," in my head I hear it in that gravelly voice Kevin Costner used to deliver it as Crash Davis in the pool hall, explaining the difference between a .300 and a .250 hitter.
I saw a replay of that at bat maybe a year ago. I was surprised at how hard it was NOT hit. It was a medium-hit line drive, not a scorcher. I think it would have needed to be hit two feet to one side or the other to work. I don’t remember it being that high, maybe head high. And I was rooting for the Giants.
And I thought football was the luckiest game.
Presently, they care because betting by the younger generations is through the roof.
More like lack thereof.
Complaining about playing outside... The last I checked, Yankee stadium is outside. Also, as I recall, the teams played in the same stadium last night. They say it is a poor carpenter who blames his tools. It is even a poorer carpenter who blames the jobsite for his tools not working.
It’s human nature to only see luck when it goes against us while luck that goes in our favor is really something we did ourselves.
Many times during a game a hitter will ever so slightly miss and foul a pitch straight back, or hit a rope down the line but just foul. Pitchers get lucky too, however quickly they forget. There’s some luck in every pitch and swing - nobody can put the ball exactly where they want or swing exactly in a certain plane.
What I find the Yankee-est about Severino’s quote is that it seems to ignore that the Astros have in fact been the better team this season in terms of both their overall record and (by quite a bit) head to head against New York. If I didn’t know better from that quote I’d think it was the other way around.
One point in favor of Sevy's assessment: the Yankees outscored their opposition by 240 runs this year, while the Astros were +219. (And given the unbalanced schedule, some would say the AL East is a stronger division than the AL West, but I've not looked into that.) The Yankees therefore did more toward actual winning -- because outscoring is the only way to win -- than the Astros did, so they could probably be considered the better team over the course of the season. But the Astros are hotter now, and the Yankees stumbled to the finish line, so recency bias factors in strongly. And it certainly matters: now is when we're playing this tournament!
I think over 162 games, to borrow a line from Bill Parcells, you are what your record says you are. A flawed measure but less flawed than the others I think.
Run differential to me is useful as a rough guide to how sustainable winning is, not as an accomplishment to shoot for. If the Yankees beat the Royals 12-2 and the Astros beat them 9-3, I don’t thInk it was 67% more of a win by New York even though their run differential might say so.
That’s just me and I guess for subjective things like who’s really better, the beauty is that everyone can choose their own lens. And though someone will win this series, whoever doesn’t can still claim to be better and, if they want, unlucky.
Last I checked, the team who wins more games is the better team than who had the higher run differential. I’m tired of hearing how the Astros have only won these games by one or two runs, as if one or two run victories don’t count as wins. The Astros won 106 games this season, many of them just like the games they have won this postseason. I watched most of them. They are set up to win games like this. This is how they play. If they go 11-0 thru the playoffs with a +16 win differential, are we going to have to hear about how close it was?
I’m just not sure winning close games is a repeatable skill, and therefore a meaningful measure of quality. I guess if you win, you can claim to be the “best,” which is why we play the games, after all. I would say that particular team is the “winner,” but not necessarily the “best” — the best teams don’t always win, which is a big part of the appeal of sports, of course.
But run differential is not as big of an indicator as people think. It can be (and somewhat was this year for the Yankees) be an indicator of a good offensive team that has a lot of large run number in individual games. I do a run spread which takes into account how many runs a team gives up or scores in each game. I haven't done it for the Yanks and Astros yet, but I have 300 seasons doing it, and it is much more accurate than pythagorean record, which just take onlyruns scored and runs allowed in the full season into account.
The Yankees had 22 games in which they scored double digit runs this year, and their run differential in those games was +186. That is still only 22 wins. In the other 140 games they only outscored their opponent by 54. (.386 runs per game) The Astros had only 13 such explosions and outscored their opponents in those games by 109. In the other games 149 games they outscored their opponents by 107 runs or .718 runs per game. It is not so hard just from this to see where the Astros made up those 9 wins and then won more.
I don't have time right now to run the expected wins for both teams, but I will at some point. Just looking at this, I believe the Astros expected wins will easily be higher than the Yankees.
That’s fascinating, and a great way to get at the crux of the run distributions. I wrote an article way back when for SABR’s “By the Numbers” newsletter that used standardized run differential adjusted for league size as the criterion, and while the results were solid, there were definitely some outliers that I bet your methodology would catch. Keep us posted!
OK. So I did the Astros and the Yankees for this year. I do actual run scoring for each game, both offensively, and for pitching and defense and come up with a percentage as to what kind of winning percentage each would have if they were exactly an average team on the other side of the ball, adjusting a bit for park factors. Then I crunch some numbers and come up with an expected winning percentage overall, and then how many wins that would be.
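A rough sketch of the general idea, for the curious. This is not the commenter's actual formula, which isn't spelled out above. It assumes a made-up sample of league scoring, counts ties as half a win, and ignores park factors and the final step of combining the offensive and pitching sides. It only shows why per-game scoring can separate two teams that season totals treat as identical.

```python
from collections import Counter

def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Classic Pythagorean expectation from season totals only."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def offense_pct_vs_average(game_runs, league_runs):
    """Average chance of winning each game if an average team were on the other side.

    game_runs:   runs the team scored in each game.
    league_runs: hypothetical sample of single-game run totals for an average opponent.
    Ties count as half a win (an assumption made to keep the sketch simple).
    """
    counts = Counter(league_runs)
    total = len(league_runs)
    win_prob = 0.0
    for r in game_runs:
        beaten = sum(c for runs, c in counts.items() if runs < r)
        tied = counts.get(r, 0)
        win_prob += (beaten + 0.5 * tied) / total
    return win_prob / len(game_runs)

# Made-up data: two offenses with the same total (40 runs in 8 games).
league = [1, 2, 3, 3, 4, 4, 5, 5, 6, 8]
steady = [5, 5, 5, 5, 5, 5, 5, 5]
streaky = [14, 12, 2, 2, 3, 3, 2, 2]

# Season totals cannot tell these two offenses apart...
print(pythagorean_pct(sum(steady), 35) == pythagorean_pct(sum(streaky), 35))
# ...but the game-by-game view can (roughly 0.70 versus 0.40 here).
print(offense_pct_vs_average(steady, league))
print(offense_pct_vs_average(streaky, league))
```

The same calculation could be run on runs allowed (counting opponent totals that are higher rather than lower) to get the pitching and defense side.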
When I first started doing that, I did 10 seasons of 30 teams, and compared the differential to pythagorean differentials for each team. It was significantly more accurate, overall, than either BR or Fangraphs versions. (They are a little different) Also when I added up all the teams together for a full season, it was a little closer to the actual number of wins in the season.
This year I have the Astros at 103 expected wins, and the Yankees at 98. Though they nail the Astros, I am closer with the Yankees and closer overall for all the teams I have done (about half thus far). Their biggest outlier was being 9 games off on the Rangers (6 games for me). My biggest outlier was the Dodgers at 7 games. They actually underachieved, and had the most expected wins I have had since doing this, at 118. (Actually 117.55, but they round, so I do as well.) I have never done the Mariners' 116-win team. Maybe I should do that one.
No one cares about this, but it is something I got obsessed with at one point in time, when I did the 10 season comparison. It just seemed that using the season run total didn't account for variance enough. Now I start doing it (in season) with the Royals (my team), and then start doing teams that interest me, and usually end up doing all the teams at some point in the off season just to get that overall comparison. My method has yet to not finish first, though it has been pretty close some years.
Great philosophical observation about what happened in the game, and what we — even the players — think happened in the game. Reminds me of HBO GoT quote, “A story we agree to tell each other over and over till we forget that it's a lie." Then I also remember the Texas - USC bowl game, and VY, and realize that — numbers be damned — some DO will an outcome into reality. Love. Sports.
What it comes down to is that Judge is consistently, and disproportionately, rewarded at home for that swing… but that doesn’t mean it’ll work on the road…
It's true! This year, Aaron Judge had an AMAZING OPS at home: 1.081. And on the road, he was absolutely pathetic, only having an OPS of 1.141.
Sure, he hit 30 homers in 80 games at Yankee Stadium this year. Anyone can do that. What's really pathetic is how he only hit 32 homers in his 77 road games.
Apropos almost nothing that Joe said, I DO NOT understand the use of the Statcast numbers telling us that there was a 4% hit probability on Bregman's ball or a 91% probability on Judge's (the same complaint on NFL stats like "that pass had a 5% chance of completion"). In fact, there was a 100% chance that Bregman got a hit, and a 0% chance that Judge got a hit (and there is either a 100% or 0% chance the pass was caught.) How do these numbers aid in any way in decision making? Just because we can measure something, doesn't mean we have to. Part of the resistance to a lot of new stats is that we keep throwing out numbers without providing any reason. Almost 90 years ago Red Barber might tell us the wind is blowing in from right field - what does Statcast add to that?
Using my LaffStat formula (an extremely complicated, proprietary formula the details of which I cannot divulge), I was able to tell, within mere seconds of him hitting the ball, that Judge had a 0% chance of getting a hit. I have been using this formula for some time, and it is pretty much flawless.
I forget what network it was (I think I've watched baseball games on 70 different networks this season), but they kept showing percentages on the bottom right hand of the screen and it was super irritating. It would be things like "Mike Trout has a 13% chance of getting a hit" on a 0-2 count or "Dansby Swanson has a 47% chance of getting a walk" on a 3-0 count. I hated every minute of it.
That sounds like Apple, but I can't say for sure because I watched so little of their broadcasts because they were so annoying.
I believe that this is akin to when a meteorologist says "there's a 50% chance of rain," in that yes while it will rain or it won't, the thing they're really saying is that on days with extremely similar weather conditions to the ones happening at that time, it has rained 50% (or so) of the time. So, Bregman's home run had a launch angle and velocity that in the past have translated into a hit 4% of the time, which is why it's so cool and/or weird that this time something (the wind?) happened to turn it into, you know, a game-winning home run. I dunno, does that make sense?
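To make the "similar conditions in the past" reading concrete, here is a toy sketch. It is not Statcast's actual model. The history, the tolerances, and the 30.5-degree launch angle are all invented for illustration. It simply counts how often comparable batted balls fell for hits.

```python
def similar_hit_rate(history, exit_velo, launch_angle, velo_tol=2.0, angle_tol=2.0):
    """Share of past batted balls with similar speed and angle that went for hits.

    history: list of (exit_velo_mph, launch_angle_deg, was_hit) tuples.
    Returns None when nothing in the history is close enough to compare.
    """
    similar = [
        was_hit
        for ev, la, was_hit in history
        if abs(ev - exit_velo) <= velo_tol and abs(la - launch_angle) <= angle_tol
    ]
    if not similar:
        return None
    return sum(similar) / len(similar)

# Invented history: 25 medium-speed fly balls (one hit) plus 10 hard line drives.
history = (
    [(91.0, 30.0, False)] * 24
    + [(92.0, 31.0, True)]
    + [(105.0, 12.0, True)] * 10
)

print(similar_hit_rate(history, 91.4, 30.5))  # 1 hit in 25 comparable balls
```

The number describes the group of comparable balls, not the one ball that was just hit, which is the same sense in which a forecast describes days with similar conditions.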
If the meteorologist says there is a 50% chance of rain, that helps me make a decision, and I can carry an umbrella. If she tells me there was a 50% chance of rain yesterday it does nothing for me. I can't picture the dugout conversation that went: Analyst: "Alex, if you hit the ball with a 47-degree launch angle and at 109 mph you only have a 4% chance of a hit." Alex: "OK, I won't do that."
What you write is "true", but it still doesn't "make sense".
It makes perfect sense in the right context. When evaluating hitters over a season, Statcast exit velocities and angles can tell us the expected batting and slugging averages, based on the balls they actually hit, rather than the fielding results which have a degree of luck: for example, a 109-mph line drive hit right at a fielder, or a soft fly that just happens to land in the one spot where no fielder gets to it.
As far as giving the percentages for individual hits during a game, many people find them interesting, and many Yankee fans apparently find them useful to explain how their team was robbed of a win, this time.
What good is telling us the distance of each home run? It doesn't change anything, but it's an interesting piece of information that makes the game more enjoyable to many. So it is with the Statcast numbers.
Telling me after the fact that the home run travelled 480 feet is a fact that tells me something about how the ball was hit. Telling me that there was a 47 % chance of it going that far tells me nothing. If you're playing poker and hit an inside straight the important fact is that you hit it. The fact that there was only an 8% chance of hitting it is important while you're making the decision to draw. It is meaningless after the fact.
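The 8% figure checks out under one common assumption, namely five-card draw where you hold four to an inside straight and draw one card. That setup is assumed here and is not stated in the comment.

```python
from fractions import Fraction

# Assumed setup: five-card draw, you have seen only your own five cards.
deck_size = 52
cards_seen = 5
outs = 4  # the four cards of the one rank that fills an inside straight

p_fill = Fraction(outs, deck_size - cards_seen)
print(p_fill, round(float(p_fill), 3))  # 4/47 0.085
```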
Statcast data tells you how hard the ball was hit and at what angle. It doesn't tell you that there's a 47% chance it will go that far, it tells you that of all balls hit that hard, at that angle, 47% fell in for hits. Many people find this interesting, at least as interesting as how far it went.
Ultimately, nothing matters but the final score. But some of us are watching the games and getting interesting information along the way. Most of us try not to tell other people what kinds of information they should and shouldn't find interesting or informative.
What this really tells us is that Statcast is not able to utilize all the information available (wind speed and direction, air temperature, relative humidity at field level and in the air, placement of players and likely many more). If Statcast could take everything into account, it clearly would predict that in these instances, the chance of Judge getting a hit was 0% and the chance for Bregman was 100%.
Telling you after the fact that Bregman hit the ball x mph tells you a LOT about how well it was hit.
It makes a lot of sense to me and is incredibly interesting. It tells me that Bregman didn't actually hit the ball all that well, but circumstances coalesced (ballpark shape and size, wind, humidity, amount of spin) in just such a way that he was rewarded with a home run. It means that if Bregman hit the ball that way 100 times in a variety of ballparks and weather conditions and against a variety of opponents, he would probably be out 96 times. I find that completely logical as well as fascinating.
Context in baseball is super important and adds a lot to the viewing experience. Statcast figures help us determine what effect context is having on actual gameplay.
The breaks would go the Yankees way a lot more often if they stopped swinging at pitches way outside the strike zone. The first time through the order, yea ok, you get fooled, the second and third times, way fewer excuses.
Bucky Dent's team wants you to know the other team's odd outfield dimensions are unfair.
Every time I see the expression "ground ball with eyes," in my head I hear it in that gravelly voice Kevin Costner used to deliver it as Crash Davis in the pool hall explaining the difference between a .300 and a .250 hitter.
“Or why couldn’t McCovey have hit the ball even TWO feet higher?” - Charlie Brown, Jan. 28, 1963.
(can't post images here, but I'm sure you get the reference)
I saw a replay of that at bat maybe a year ago. I was surprised at how hard it was NOT hit. It was a medium hit line drive, not a scorcher. I think it would have needed to be hit to one side or another for two feet to work. I don’t remember it being that high- maybe head high. And I was rooting for the Giants.