Forecasting Accuracy 2024 - Good or Bad?

Jason N · Tuesday at 8:36 AM

I guess some of that will come down to how many sig tors there were yesterday. I would say 1 for sure, perhaps 2? But I stopped watching after the Barnsdall storm, so I don't know if anything formed after that. If it ends up being just one report of a sig tor, is that considered enough to call it a hit? I mean, their reasoning for upgrading to High was valid. But I wonder, when they updated it last evening to trim the area, should they have just gotten rid of the High and left MDT? But here I am, hindsighting lol.

One thing they don't do is discriminate with SLIGHT, MDT, or HIGH on whether it's tornadic or not. Their categories are irrespective of tornadoes, just severe weather and longevity. So based on those conditions, they probably hit what they forecast.

Jesse Risley · Tuesday at 8:47 AM

Jason N said:
I guess some of that will come down to how many sig tors there were yesterday. I would say 1 for sure, perhaps 2? But I stopped watching after the Barnsdall storm, so I don't know if anything formed after that. If it ends up being just one report of a sig tor, is that considered enough to call it a hit? I mean, their reasoning for upgrading to High was valid. But I wonder, when they updated it last evening to trim the area, should they have just gotten rid of the High and left MDT? But here I am, hindsighting lol.

One thing they don't do is discriminate with SLIGHT, MDT, or HIGH on whether it's tornadic or not. Their categories are irrespective of tornadoes, just severe weather and longevity. So based on those conditions, they probably hit what they forecast.

I haven't seen a good, purely objective categorization of what constitutes a "bust" per se. For the high risk, it may come down to how the wind damage reports are quantified (see below). If we don't see "numerous intense and long-tracked tornadoes" once all of the cards are on the table, then the specific, tornado-driven high risk upgrade wouldn't verify. I have no idea how widespread and intense the wind damage is in eastern OK though. I wouldn't personally call the event a "bust," since there were a plethora of severe weather reports, although it's possible that the final report verifications will show that the event only met a lower categorical threshold for verification, e.g., an enhanced or moderate. I think it's important to message that this doesn't mean a complete bust even though a specific severe weather outlook categorization threshold may not have been met, although I understand the social science optics with the general public are far more complex.

5-HIGH (magenta) - High risk - An area where a severe weather outbreak is expected from either numerous intense and long-tracked tornadoes or a long-lived derecho-producing thunderstorm complex that produces hurricane-force wind gusts and widespread damage. This risk is reserved for when high confidence exists in widespread coverage of severe weather with embedded instances of extreme severe (i.e., violent tornadoes or very damaging convective wind events).

Jason N · Tuesday at 9:20 AM

Jesse Risley said:
I think it's important to message that this doesn't mean a complete bust even though a specific severe weather outlook categorization threshold may not have been met, although I understand the social science optics with the general public are far more complex.

I don't think it was a complete bust at all. But when it comes to whether someone has to calculate a HIT/MISS/FAR, it would be interesting to see how they determine it, separate from the social aspect of course. As you said, that's too complex, and I would agree with you. Better there to just take a poll before and after storm outbreaks, maybe with "did you feel you were adequately warned?" type questions. lol
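For reference, the HIT/MISS/FAR bookkeeping is usually done with a 2x2 contingency table. Below is a minimal sketch of the standard scores (probability of detection, false alarm ratio, critical success index). It assumes each grid box or outlook area has already been tallied as a hit, a miss, or a false alarm. How SPC actually defines those tallies for outlooks isn't stated in this thread, and the function name and counts are made up for illustration.

```python
def contingency_scores(hits, misses, false_alarms):
    """Standard 2x2 contingency-table scores; NaN when a score is undefined."""
    nan = float("nan")
    observed = hits + misses            # events that actually occurred
    forecast = hits + false_alarms      # events that were forecast
    either = hits + misses + false_alarms

    pod = hits / observed if observed else nan          # probability of detection
    far = false_alarms / forecast if forecast else nan  # false alarm ratio
    csi = hits / either if either else nan              # critical success index
    return {"POD": pod, "FAR": far, "CSI": csi}


# Hypothetical tallies, not taken from any real outlook.
scores = contingency_scores(hits=6, misses=2, false_alarms=4)
print(scores)
```

A forecast can have a high POD and still a high FAR, which is roughly the "lots of reports, but not what was forecast" situation being argued about here.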

Jesse Risley · Tuesday at 10:13 AM

@Jeff Duda posted this in another thread. I was previously unaware of Victor's page:
https://atlas.niu.edu/pperfect/

Jason N · Tuesday at 10:27 AM

I can't get to the page. Is there another link? And is there any information on what's being used to produce this?

Jesse Risley · Tuesday at 10:28 AM

Jason N said:
I can't get to the page. Is there another link? And is there any information on what's being used to produce this?

https://www.atmos.albany.edu/facstaff/rfovell/NWP/bryan-fritsch-2000.pdf

Jason N · Tuesday at 10:33 AM

you sent me the link to MAUL lol , I was talking about the practically perfect Probs

Jesse Risley · Tuesday at 10:34 AM

Jason N said:
you sent me the link to MAUL lol , I was talking about the practically perfect Probs

Oh, oops: SPC Practically Perfect Hindcast

Jason N · Tuesday at 10:35 AM

hahaha , no worries.. thank ya!

Sean Ramsey · Tuesday at 10:37 AM

Kind of funny how we treat the weather risk communication after the fact as opposed to say fire danger risk or other areas where danger can occur.

It's a risk, doesn't mean when it doesn't happen it's wrong. They assign high fire danger risk in forests all the time, no one goes back and says bust if a fire didn't occur on a particular day. Climbing Mount Everest is a high risk, but if climbers didn't die on a particular day, no one says the risk didn't verify.

I understand that people modify their days based on weather forecast communication. But, there was an inherent danger yesterday, it just wasn't realized on a mass scale.

Is the forecast made to predict tornadoes (quantity, size & type) or to communicate the risk (or danger) associated with them if they occur, or both?

Seems to me, the (very good) chance was there, right or wrong after the fact. Maybe someday they'll reach a point in forecasting where it's called "High Probability".

Jason N · Tuesday at 11:01 AM

Sean Ramsey said:
It's a risk, doesn't mean when it doesn't happen it's wrong. They assign high fire danger risk in forests all the time, no one goes back and says bust if a fire didn't occur on a particular day.

Yeah, I think that's where we get into the social aspect of the discussion: warning for warning's sake, statistics be damned. I can see that on several levels. I just know that there are others within communities who look at justification, dollars, insurance, business, operations. So say yesterday was only a Marginal Risk, people died, and towns were leveled. They might ask, "Well, you missed a 2 billion dollar severe weather event. Why?"

I think in that instance, people might not get away with "well, it's just risk". They might start asking more about how your performance as a center of excellence is, and why you are here, because you cost the taxpayer money to exist. Beyond it being important to potentially save lives, there are billions of dollars in industry that rely on it as well. So, I think both sides of the discussion have a say on some level.

But human to human, I fall on the side that says it's all informative! Treat it as such. We shouldn't care too much about how they performed, but believe me, someone does.

Jesse Risley · Tuesday at 11:43 AM

Though these aren't my own original thoughts per se, I always try to stress that meteorologists are expected to predict how a fluid will behave in the future. It's as simple as that. It's not an exact science and probably never will be. If the general public understood the nature of fluid dynamics (most get the concept of fortune telling, although that's a completely false science at best) and the probabilities involved with trying to predict future behaviors, forecasting would probably be better understood, but we'll probably never buck the whole "it must be nice to get paid to be wrong half the time" poppycock.

Jeff Duda · Tuesday at 11:44 AM

Yesterday's High risk forecast was a bust. I don't see how you can argue otherwise. Now, I don't know how some differentiate between a "complete bust" and a "regular bust", so maybe the nuance saves that discussion. But a high risk event is supposed to be a rare event that only occurs a few times per year and consists of widespread and extreme severe weather reports. "Widespread" and "extreme" are subjective terms, sure, but even looking at the raw LSR counts shows that what happened yesterday has already been exceeded in 2024 by non-high risk events such as March 14th (Storm Prediction Center 20240314's Storm Reports). Consider further that the official "high" designation was due strictly to the tornado probabilities and not hail or wind, and the forecast was even more errant, since it overforecast one category and underforecast another.

I've noticed that when SPC issues PDS tornado watches they tend to be very large (looking into finding stats for this). But making a watch large in area artificially increases the chances of it capturing >=2 tornadoes and >=1 EF2+ tornadoes. [Side note: 2/1 tor/sigtor already seems like a pretty low bar, and many non-PDS tornado watches readily exceed that threshold, so I'm already not sure the PDS designation is that helpful.] I would think, if anything, you'd want to issue smaller PDS tornado watches to really focus in on the highest threat areas without overly alerting way more people. IDK, I am not an SPC forecaster and cannot speak for their forecasting philosophy, but in this regard I disagree with it.
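The area argument above can be illustrated with a toy model. This is only a sketch: it assumes tornadoes fall uniformly at random over the threat region (a Poisson process), which real events do not, and the densities and areas are invented numbers, not SPC statistics. Under that assumption the expected count scales with watch area, so the chance of meeting the ">=2 tornadoes" and ">=1 EF2+" criteria rises as the watch grows, without any gain in skill.

```python
import math


def prob_at_least(k, mean):
    """P(N >= k) for N ~ Poisson(mean)."""
    cdf_below_k = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf_below_k


def pds_criteria_probs(area, tor_density, sigtor_density):
    """Chance of >=2 tornadoes and of >=1 significant tornado inside a watch.

    area is in arbitrary units; densities are expected counts per unit area
    (both hypothetical, for illustration only).
    """
    p_two_tor = prob_at_least(2, area * tor_density)
    p_one_sig = prob_at_least(1, area * sigtor_density)
    return p_two_tor, p_one_sig


# Same hypothetical environment, watch areas of 1x, 2x, and 4x.
for area in (1.0, 2.0, 4.0):
    p_tor, p_sig = pds_criteria_probs(area, tor_density=0.5, sigtor_density=0.1)
    print(f"area={area:.0f}  P(>=2 tor)={p_tor:.2f}  P(>=1 sig tor)={p_sig:.2f}")
```

The two probabilities are reported separately. Treating them as independent and multiplying them would be a further assumption, since significant tornadoes are a subset of all tornadoes.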

Secondly, this particular PDS tornado watch (Storm Prediction Center PDS Tornado Watch 189) is likely only going to verify by the skin of its teeth, seeing as the only likely sig tor at this point occurred in the far eastern end of Osage County, and even the parallelogram defining the watch seems to end right about at the longitude of where Barnsdall is. So even with that extra huge area, they still almost missed! That to me does not show a particularly high level of skill.

Jason N · Tuesday at 12:53 PM

Jeff Duda said:
Yesterday's High risk forecast was a bust. I don't see how you can argue otherwise.

This is what I mean:

Is it a bust in skill/accuracy of the watch polygon? Ahem, parallelogram......

Or is it a bust in reports relative to the spatial coverage of the high area? Or in the specifics, as in "you didn't get but 1 EF-3 or higher! you suck!!" lol.... You mention how it was exceeded in count by a previous event. So it seems to me that since we look to verify forecasts, and we do that now with reports, we could somehow link the two, say..... (Marginal = 5-9 reports; Slight = 10-19 reports; Enhanced = 20-29 reports; MDT = 30-49 reports; High = 50+). It's arbitrary, just for discussion, and the reports are aggregate totals inside the spatial area of coverage for each, not separated into each category. It seems like this would be the easiest way to see if an area Hit or Missed.
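The count-based idea above could be sketched like this. The thresholds are the arbitrary ones from this post, not anything SPC uses. The "NONE" label for fewer than 5 reports and the function names are assumptions added for the example.

```python
# Categories ordered from lowest to highest; "NONE" is an assumed label.
CATEGORIES = ["NONE", "MRGL", "SLGT", "ENH", "MDT", "HIGH"]

# Arbitrary minimum aggregate report counts per category, from the post above.
MIN_REPORTS = {"MRGL": 5, "SLGT": 10, "ENH": 20, "MDT": 30, "HIGH": 50}


def category_from_reports(n_reports):
    """Highest category whose minimum report count is met."""
    verified = "NONE"
    for name in CATEGORIES[1:]:
        if n_reports >= MIN_REPORTS[name]:
            verified = name
    return verified


def category_error(forecast, n_reports):
    """Observed minus forecast category index.

    0 is a hit, negative is an overforecast, positive is an underforecast.
    """
    observed = category_from_reports(n_reports)
    return CATEGORIES.index(observed) - CATEGORIES.index(forecast)


# Hypothetical counts inside a forecast area.
print(category_from_reports(35))     # MDT
print(category_error("HIGH", 35))    # -1: one-category overforecast
print(category_error("ENH", 60))     # 2: two-category underforecast
```

A raw count ignores area and report intensity, so a large Slight area could "verify" as High on counts alone. That limitation comes from the scheme itself, not from the code.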

Jesse Risley · Tuesday at 1:04 PM

Jason N said:
This is what I mean:

Is it a bust in skill/accuracy of the watch polygon? Ahem, parallelogram...... Or is it a bust in reports relative to the spatial coverage of the high area? Or in the specifics, as in "you didn't get but 1 EF-3 or higher! you suck!!" lol.... You mention how it was exceeded in count by a previous event. So it seems to me that since we look to verify forecasts, and we do that now with reports, we could somehow link the two, say..... (Marginal = 5-9 reports; Slight = 10-19 reports; Enhanced = 20-29 reports; MDT = 30-49 reports; High = 50+). It's arbitrary, just for discussion, and the reports are aggregate totals inside the spatial area of coverage for each, not separated into each category. It seems like this would be the easiest way to see if an area Hit or Missed.

I'd argue that it's two-pronged: it was largely a tornado-driven high risk and those metrics were not met based on the current (and projected) reports and, spatially, the wind reports that would have verified a high risk for damaging wind events were displaced from the delineated high risk area. Those two components fell short of what was forecast. There were plenty of severe weather reports, so it's not like a true bust where severe weather fails to completely materialize as forecast, but that gets into the nuances of language and subjectivity.

Jason N · Tuesday at 1:42 PM

all good points!..

Jamie H · Wednesday at 3:08 AM

Considering I thought it wasn't a bad initial forecast, the past few replies have been really useful, thanks!

JamesCaruso · Wednesday at 5:16 AM

Sean Ramsey said:
Kind of funny how we treat the weather risk communication after the fact as opposed to say fire danger risk or other areas where danger can occur.

It's a risk, doesn't mean when it doesn't happen it's wrong. They assign high fire danger risk in forests all the time, no one goes back and says bust if a fire didn't occur on a particular day. Climbing Mount Everest is a high risk, but if climbers didn't die on a particular day, no one says the risk didn't verify.

I understand that people modify their days based on weather forecast communication. But, there was an inherent danger yesterday, it just wasn't realized on a mass scale.

Is the forecast made to predict tornadoes (quantity, size & type) or to communicate the risk (or danger) associated with them if they occur, or both?

Seems to me, the (very good) chance was there, right or wrong after the fact. Maybe someday they'll reach a point in forecasting where it's called "High Probability".

Some good points generally from a social science perspective and how people interpret risk - which is generally poorly. People focus only on the probability of something happening, forgetting that, unless the forecast probability is 100%, the forecast is implicitly assigning a non-zero probability to that thing NOT happening. I like the fire danger analogy.

But with regard to your last point about “high probability” - that’s already exactly what it’s supposed to be, a probabilistic forecast.

William Monfredo · Wednesday at 2:10 PM

Chased OK high-risks since 2010; that year & 2011 produced. Since then, as you know, not so much. (SPC's Roger Edwards said a degree too warm aloft May 20, 2019, Magnum day, prevented a major outbreak.) May 6, I don't think the storms interfered with each other, at least early on, but the surface temps seemed low for much of the day due to clouds. Yet, elevated instability can still get a job done; go figure. (Fall tornadoes in northern AZ anyone?)

The May 10, 2010 Tornado Outbreak in Oklahoma

The May 24, 2011 Tornado Outbreak in Oklahoma

Jeff Duda · Thursday at 11:57 AM

Yesterday's event appears to have been slightly underforecast* by SPC given the practically perfect observation contour levels exceeded 60% for both wind and hail depending on the source (see below). According to the convective outlook probability-to-category conversion table, however, it appears that the dearth of significant wind reports would have rendered this forecast failing to reach the high risk category (by the narrowest of margins). And apparently there is no PP contour level for hail LSRs that is high enough to verify a high risk. *Since moderate risks were in place based on both wind and hail, technically SPC's day 1 forecast yesterday was pretty accurate. But in my biased opinion it still seems a bit low based on the coverage of the 45 and 60% contours, both in terms of spatial extent and location. Also, neither hazard type was forecast to exceed 60%.
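For anyone wondering what goes into a "practically perfect" field, here is a rough sketch of the general idea as I understand it: mark the grid boxes that had at least one report, then smooth that binary grid with a 2D Gaussian kernel so each report spreads probability over its neighborhood. The grid size, the sigma value, and the function name are assumptions for illustration, not the configuration behind the SPC or HWT pages.

```python
import math


def practically_perfect(event_grid, sigma):
    """Smooth a binary report grid with a 2D Gaussian kernel.

    event_grid: list of rows, 1 where a grid box had a report, else 0.
    sigma: kernel width in grid boxes (an assumed, tunable value).
    Returns a grid of the same shape with values as fractions (0.07 = 7%).
    """
    rows, cols = len(event_grid), len(event_grid[0])
    norm = 1.0 / (2.0 * math.pi * sigma**2)
    events = [(r, c) for r in range(rows) for c in range(cols) if event_grid[r][c]]

    field = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for er, ec in events:
                dist_sq = (r - er) ** 2 + (c - ec) ** 2
                total += norm * math.exp(-dist_sq / (2.0 * sigma**2))
            field[r][c] = total
    return field


# Toy 5x5 grid with a single report in the middle box.
grid = [[0] * 5 for _ in range(5)]
grid[2][2] = 1
pp = practically_perfect(grid, sigma=1.5)
for row in pp:
    print(" ".join(f"{100 * value:5.1f}" for value in row))
```

Contours of the smoothed field are then compared against the forecast probability contours, which is why dense clusters of reports can push the field past the 45% or 60% levels discussed here.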

Practically perfect obs:

[Screenshot: HWT SFE - Experimental Outlook Verification, 2024-05-09 10:56:09]

[Screenshot: HWT SFE - Experimental Outlook Verification, 2024-05-09 10:55:59]

Scaling is a bitch:

There are 421 filtered LSRs as of 10:45 AM MDT the day after this event.

Meanwhile, Monday's event still only has 355 filtered LSRs (spread out over a larger area), and last I checked, OUN has only confirmed 3 tornadoes from the event so far, but they haven't put out an update in more than 24 hours which leads me to believe the surveys are done. Tulsa has only confirmed the Barnsdall tornado and expects to find more. Springfield has confirmed 3 (outside the high risk area).

Jeff Duda · Friday at 12:33 PM

I'm gonna continue to pile on here.

Yesterday technically would have verified a high risk for wind (SIG is required in addition to 60% coverage, and there are a small number of sig wind reports within the 60% PP contour). The 60% PP level was also exceeded for hail, but it seems that there is no hail coverage threshold that is adequate to satisfy a high risk according to SPC's standards, so for the time being I'll let them have that.

[Screenshot: HWT SFE - Experimental Outlook Verification, 2024-05-10 11:24:55]

Without loss of generality (i.e., all of the forecasts maintained the same max category), here is the day 1 convective outlook from yesterday:

The highest category was enhanced - that means this forecast is a two-category underforecast. That also makes arguably (see edit to my previous post) two forecasts this week that missed low. Yet for some reason these two underforecasts aren't making the headlines like the overforecast was.

Jason N · Friday at 12:42 PM

That's an interesting point you bring up. Based on what you're pointing out, it seems that they "reserve" the High for tornado days only? I wonder how many times they have used High for derecho events. Maybe that's how they view it then: tornadoes and derechos get the top spots on High Risk days, while MCCs or long-lived wind events get enhanced, maybe MDT if conditions warrant a higher tornado prob? I don't know, but that's interesting.

Jeff Duda · Friday at 1:07 PM

Jason N said:
That's an interesting point you bring up. Based on what you're pointing out, it seems that they "reserve" the High for tornado days only? I wonder how many times they have used High for derecho events. Maybe that's how they view it then: tornadoes and derechos get the top spots on High Risk days, while MCCs or long-lived wind events get enhanced, maybe MDT if conditions warrant a higher tornado prob? I don't know, but that's interesting.

No, higher-end derechos are deserving of high risk designation, too, in addition to tornado outbreaks.

Andy Wehrle · Saturday at 1:03 PM

I think part of the reason we don't see high risks for 60% hatched wind more often is that the truly exceptional, top-end derechos (the ones with widespread >100 MPH gusts over a long swath) are hard to forecast with accuracy. The most recent such event (the August 10, 2020 Iowa-Illinois derecho, which IIRC produced at least one 120 MPH measured gust and one estimated at 140 MPH [category 4 hurricane!] based on damage) had a marginal risk covering most of the eventual affected area on the initial Day-1 outlook. When it became apparent what was about to occur, they upgraded to a 45% hatched (MDT) for wind and issued several PDS severe thunderstorm watches.

The few times we have seen those risks issued (June 3, 2014 for example), the resulting event has been relatively underwhelming (and in that case, the core of the event tracked further south with the majority of the high risk zone void of severe reports).
