python pattern for checking if there is a most recent forecast available? #272
I had written my own GFS and HRRR grib2 download tool in Python and recently discovered Herbie. It's amazing! It will allow me to abstract the grib generation we are using for ocean navigation forecasts.

However, one challenge remains: determining whether a forecast is even ready yet. For example, if you look at https://www.nco.ncep.noaa.gov/pmb/nwprod/prodstat/index.html, generation of the GFS 00z 0p25 T1534 FORECAST F000-F384 product doesn't start until over 3 hours after the forecast time: for 00z, model generation starts at ~03:20z and finishes at ~05:20z. Right now, I wait until ~30 min after each forecast starts on average (e.g. 00z starts at 3:20, so I start looking around ~3:40z) before checking for an existing folder and grib2 files. Usually the first 24 fxx files are ready within 20-30 min.

But is there a better method than this? It would be nice if there were some kind of model availability service that I could check directly, instead of hitting the URLs over and over until I don't get a 404, or, now with Herbie, rechecking the availability over and over until it returns true. Thanks for this wonderful utility!
Hi @datalus667, I'm happy to hear you like Herbie.

Herbie was designed to retrieve archived forecasts, and it does that by predicting what URLs should exist. I'm not aware of a "model availability service" aside from the page you shared or checking if a file is available.

When I need the latest forecast available, I loop over a set of recent datetimes (hitting URLs until I find something, like you suggested). For example:

from herbie import Herbie
import pandas as pd

# Create a list of dates to try
dates = pd.date_range(
    pd.Timestamp.utcnow().floor("1h").tz_localize(None), periods=6, freq="-1h"
)

# Find first existing Herbie object
for i in dates:
    H = Herbie(i, model="hrrr", priority=["aws", "nomads"])
    if H.grib:
        break

Breaking that up...

I know the HRRR model runs every hour, so I made a list of the 6 most recent hours to try, starting at the most recent hour.

dates = pd.date_range(
    pd.Timestamp.utcnow().floor("1h").tz_localize(None), periods=6, freq="-1h"
)
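
The same kind of candidate list can be built for any run interval using only the standard library. The following is a minimal sketch, not part of Herbie; the helper name `candidate_runs` and its parameters are hypothetical, and it assumes runs start on multiples of the interval counted from 00z (so the interval should divide 24 evenly).

```python
from datetime import datetime, timedelta, timezone


def candidate_runs(now, interval_hours, count):
    """Return `count` naive UTC run times, newest first.

    Hypothetical helper (not part of Herbie). Assumes `now` is a naive
    datetime in UTC and that model runs start on multiples of
    `interval_hours` counted from 00z.
    """
    # Floor `now` to the most recent multiple of the interval.
    floored_hour = now.hour - now.hour % interval_hours
    latest = now.replace(hour=floored_hour, minute=0, second=0, microsecond=0)
    # Step backwards one interval at a time.
    return [latest - timedelta(hours=interval_hours * k) for k in range(count)]


if __name__ == "__main__":
    utc_now = datetime.now(timezone.utc).replace(tzinfo=None)
    print(candidate_runs(utc_now, interval_hours=1, count=6))  # hourly, HRRR-style
    print(candidate_runs(utc_now, interval_hours=6, count=4))  # 6-hourly, GFS-style
```

Each returned datetime can be passed to `Herbie(...)` in the same way as the entries of `dates`.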

Then I loop over each of those dates and tell Herbie to look for the data. I set priority so that Herbie checks AWS first, then NOMADS.

for i in dates:
    H = Herbie(i, model="hrrr", priority=["aws", "nomads"])

The value of H.grib is None when no file is found, so the loop stops at the first date that has data.

    if H.grib:
        break

This took 2 seconds to find the most recent HRRR forecast.

Similarly, the GFS runs every 6 hours, so I would do this:

from herbie import Herbie
import pandas as pd

# Create a list of dates to try. GFS runs every 6 hours
dates = pd.date_range(
    pd.Timestamp.utcnow().floor("6h").tz_localize(None), periods=4, freq="-6h"
)
print(dates)

# Find first existing Herbie object
for i in dates:
    H = Herbie(i, model="gfs", priority=["aws", "nomads"])
    if H.grib:
        break

If you didn't know the interval of the model, you could just check each hour for the last day, which takes a bit more time, but it'll find the latest run available.

%%time
from herbie import Herbie
import pandas as pd

# Create a list of dates to try: every hour for the last 24 hours
dates = pd.date_range(
    pd.Timestamp.utcnow().floor("1h").tz_localize(None), periods=24, freq="-1h"
)
print(dates)

# Find first existing Herbie object
for i in dates:
    H = Herbie(i, model="gfs", priority=["aws", "nomads"])
    if H.grib:
        break

If you are waiting for data for a particular time/forecast to become available, you could use a while loop and a timer like you suggested.

import time
now = pd.Timestamp.utcnow().floor("1h").tz_localize(None)
attempts = 0
H = Herbie(now, model="hrrr", priority=["aws", "nomads"])
while H.grib is None:
    # Wait 5 seconds
    time.sleep(5)
    # Try to find the file again
    H = Herbie(now, model="hrrr", priority=["aws", "nomads"])
    attempts += 1
    print(f"{attempts=}")

I hope this gives you some ideas of what you can do with Herbie. I have thought about implementing a "latest" feature in Herbie using this looping method.

H = Herbie(date='latest', model='hrrr', ...)
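
A "latest" lookup along those lines, plus a bounded version of the polling loop, might be sketched as follows. This is not Herbie's implementation: `find_latest`, `wait_for`, and the `exists` callable are hypothetical names. `exists` stands in for whatever availability check you use, for example `lambda t: Herbie(t, model="hrrr", priority=["aws", "nomads"]).grib is not None`.

```python
import time
from datetime import datetime, timedelta


def find_latest(candidates, exists):
    """Return the first candidate for which `exists(t)` is true, else None.

    `candidates` should be ordered newest first. Hypothetical helper.
    """
    for t in candidates:
        if exists(t):
            return t
    return None


def wait_for(run_time, exists, interval_s=5.0, max_attempts=60, sleep=time.sleep):
    """Poll `exists(run_time)` until it succeeds or attempts run out.

    Returns the number of attempts made on success, or None after giving up.
    The `sleep` argument is injectable so the loop can be tested without waiting.
    """
    for attempt in range(1, max_attempts + 1):
        if exists(run_time):
            return attempt
        if attempt < max_attempts:
            sleep(interval_s)  # pause before the next check
    return None


if __name__ == "__main__":
    # Stand-in availability check: pretend only runs at or before 12z exist.
    published = datetime(2023, 1, 1, 12)

    def fake_exists(t):
        return t <= published

    hours = [datetime(2023, 1, 1, 14) - timedelta(hours=k) for k in range(6)]
    print(find_latest(hours, fake_exists))
```

Capping the number of attempts keeps the loop from running forever if a run is delayed or never published.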