I just checked on the iids requested in #72 and noticed that there might have been a bug in the ingestion of the stores previously ingested by LEAP.
I ran the following:
iids = [
    'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HH.highres-future.r1i1p1f1.Omon.so.gn.v20200514',
    'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.Omon.so.gn.v20200514',
    'CMIP6.HighResMIP.NERC.HadGEM3-GC31-HH.hist-1950.r1i1p1f1.Omon.thetao.gn.v20200514',
    'CMIP6.HighResMIP.MOHC.HadGEM3-GC31-HH.highres-future.r1i1p1f1.Omon.thetao.gn.v20200514',
]

import intake

url = "https://storage.googleapis.com/cmip6/cmip6-pgf-ingestion-test/catalog/catalog.json"  # Only stores that pass current tests
col = intake.open_esm_datastore(url)

CMIP6_naming_schema = "mip_era.activity_id.institution_id.source_id.experiment_id.member_id.table_id.variable_id.grid_label.version"

for iid in iids:
    print("==============================================================================")
    print(f'Checking for {iid=}')
    facet_dict = {k: v for k, v in zip(CMIP6_naming_schema.split('.'), iid.split('.'))}
    # we do not catalog the mip_era...which we probably should? TODO raise an issue
    del facet_dict['mip_era']
    cat = col.search(**facet_dict)
    if len(cat) == 0:
        print('Not found in catalog')
    elif len(cat) == 1:
        path = cat.df['zstore'].tolist()[0]
        ddict = cat.to_dataset_dict()
        name, ds = ddict.popitem()
        print(f'{name=} Found in catalog at \n{path=}')
        display(ds)
    else:
        print(f"Found more than one entry. ⛔️ This should not happen.")
This showed that all iids are already in the catalog (under the 'LEAP legacy' prefix 'gs://cmip6/CMIP6_LEAP_legacy/'). Each of these datasets has only 2 timesteps, which leads me to conclude that we have ingested pruned datasets into the main catalog. The way I currently have this set up, this should not happen, but we need to overwrite these now (related to #76) and build better checks in the future.
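As a rough idea of what such a check could look like, here is a minimal sketch that flags a store whose time axis is suspiciously short. The function name `looks_pruned`, the constant `MIN_TIMESTEPS`, and the threshold of 3 are all assumptions for illustration (chosen only because the affected stores have 2 timesteps); a real check would probably compare against the expected time range of the experiment instead.

```python
from typing import Mapping

# Assumption: pruned test stores keep only 2 timesteps, so anything
# shorter than this threshold is treated as suspicious.
MIN_TIMESTEPS = 3


def looks_pruned(
    sizes: Mapping[str, int],
    time_dim: str = "time",
    min_timesteps: int = MIN_TIMESTEPS,
) -> bool:
    """Return True if the time dimension is shorter than `min_timesteps`.

    `sizes` is a mapping of dimension name to length, e.g. `ds.sizes`
    for an xarray Dataset.
    """
    if time_dim not in sizes:
        # No time axis at all (e.g. fixed fields), nothing to check.
        return False
    return sizes[time_dim] < min_timesteps
```

In the loop above this could be called as `looks_pruned(ds.sizes)` right after `ddict.popitem()`, printing a warning for any iid where it returns `True`.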
Here are the steps I am taking next: