GRIB file input
Hello there!
I have been having a lovely time making use of AIFS ENS as presented on Hugging Face.
This has been super helpful:
https://huggingface.co/ecmwf/aifs-ens-1.0/blob/main/run_AIFS_ENS_v1.ipynb
One thing for me: I want to switch from open data to my own GRIB file (the same data, just held locally).
At a high level, I started by changing:
data = ekd.from_source("ecmwf-open-data", date=date, param=param, levelist=levelist)
to:
data = ekd.from_source("file", "input.grib", date=date, param=param, levelist=levelist)
But it seems that when reading from a GRIB file, the date, param, and levelist arguments don't really carry through.
So my question is: are there more resources someone like me could look at to implement this kind of thing? Really just more tutorials/examples and such!
Regards, and thanks,
Brian E.
Hi Brian,
Assuming your GRIB file has all the necessary variables at both input timesteps t0 and t-6, you can do something like this to construct the input_state:
import datetime
from collections import defaultdict

import numpy as np
import earthkit.data as ekd

DATE = datetime.datetime.fromisoformat("2025-10-01T00:00:00")  # t0, change this to match your input data

ds = ekd.from_source("file", "input.grib")

fields = defaultdict(list)
for date in [DATE - datetime.timedelta(hours=6), DATE]:  # the model needs both t-6 and t0
    data = ds.sel(valid_datetime=date.isoformat())
    for field in data:
        # pressure-level fields are keyed "<param>_<level>", surface fields just "<param>"
        name = (
            f"{field.metadata('param')}_{field.metadata('levelist')}"
            if field.metadata("typeOfLevel") == "isobaricInhPa"
            else field.metadata("param")
        )
        fields[name].append(field.to_numpy())

# stack the two timesteps into one array per variable
for param, values in fields.items():
    fields[param] = np.stack(values)

input_state = dict(date=DATE, fields=fields)
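The resulting input_state then goes into the runner exactly as in the notebook. Roughly (a sketch, assuming the runner you already construct there; the lead time is illustrative):

# Sketch only: `runner` is the anemoi-inference runner already built from the
# checkpoint in the notebook; lead_time here is illustrative.
for step, state in enumerate(runner.run(input_state=input_state, lead_time=72)):
    # each yielded state is a dict with (at least) "date" and "fields"
    print(step, state["date"])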
If your GRIB contains more variables than needed, you can add the param option to ds.sel(...) to select only the AIFS variables. You can find some examples of working with GRIB in earthkit-data here: https://earthkit-data.readthedocs.io/en/latest/examples/grib_overview.html
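For example, something along these lines (the parameter list here is illustrative, not the complete AIFS input set):

# Sketch: restrict the selection to the variables the model expects;
# the param list below is an illustrative subset only.
data = ds.sel(
    valid_datetime=date.isoformat(),
    param=["10u", "10v", "2t", "msl", "t", "u", "v", "q", "z"],
)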
Thank you very much - I shall give it a go!
Hi there! This did indeed get things going, but then I ran into a snag: the runner was not happy with the input.
THE DETAILS:
Using files like so:
aws s3 cp --no-sign-request s3://ecmwf-forecasts/20251015/00z/aifs-ens/0p25/enfo/20251015000000-0h-enfo-cf.grib2 .
WHEN I RUN IT:
two values for number of points 2076480 (GDS) 1038240 (calculated)
two values for number of points 2076480 (GDS) 1038240 (calculated)
THE FINAL ERROR:
Traceback (most recent call last):
File "/home/ec2-user/aifs2/aifs-local.py", line 314, in <module>
for i,state in enumerate(runner.run(input_state=input_state, lead_time=lead)):
File "/home/ec2-user/miniforge3/lib/python3.12/site-packages/anemoi/inference/runner.py", line 219, in run
input_tensor = self.prepare_input_tensor(input_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ec2-user/miniforge3/lib/python3.12/site-packages/anemoi/inference/runner.py", line 392, in prepare_input_tensor
input_state = self.validate_input_state(input_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ec2-user/miniforge3/lib/python3.12/site-packages/anemoi/inference/runner.py", line 822, in validate_input_state
raise ValueError(f"Size mismatch latitudes={nlat}, number_of_grid_points={number_of_grid_points}")
ValueError: Size mismatch latitudes=1038240, number_of_grid_points=542080
As I understand it: the N320 reduced Gaussian grid has 542,080 grid points, while a 0.25° latitude-longitude grid has 1,038,240 grid points (1440 × 721).
From what I can tell, when I regrid to N320 I do get the right number of points, but the metadata in the GRIB file did not change...
So I think the code believes my data is still on the lat-lon grid when I have actually regridded it. Or at least, I THINK I have regridded it.
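For reference, this is roughly the kind of check I mean, comparing what the GRIB header claims against what the data array actually holds (just a sketch; the filename is a stand-in for my regridded file):

regridded = ekd.from_source("file", "input_n320.grib")  # hypothetical regridded file
for field in regridded[:1]:
    print(field.metadata("gridType"))            # e.g. "regular_ll" vs "reduced_gg"
    print(field.metadata("numberOfDataPoints"))  # what the GRIB header says
    print(field.to_numpy(flatten=True).size)     # how many values are actually there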
Any hints and suggestions are welcome on how to drive this from the open-data GRIB files.