Hydrofab2ngen tools #46

Open
wants to merge 108 commits into
base: main


Contributor

@jameshalgren jameshalgren commented Apr 27, 2023

Replaces the original intent of #26.
The remaining checkmarks still need to be examined, possibly through a refreshed #26.
This work is being done with respect to #24.

Work remaining might include the following:

  • Include the basin ID in the output dataframe. (Actually, this is critical...)
  • Add an option to process an entirely remote file set.
  • Label the output with times (it's simply an hourly output now, with no explicit timestamping).
  • Add more flexibility regarding inputs, CLI interaction, etc.
  • Incorporate CSV/.nc file output as an option.
  • Package and distribute (with whatever rewriting that requires).
  • Fix the parallel bug (orphaned objects left behind by parallel execution).
  • Work with the HDF5 group to make reading NetCDF files a truly concurrent activity (one suggestion is to re-compile with the --enable-parallel flag; another is to possibly change the access pattern).
  • Walk through the math and make sure all (or at least one) of the methods produce the expected output.
  • Figure out how to have a more nuanced zonal statistic where the windowing can be partial (or confirm that it doesn't matter).
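To make a few of the items above concrete (basin ID in the output, explicit timestamps, and partial windowing), here is a minimal sketch using only the standard library. It is not the code in this PR: the function names, the `cat-1` ID, the tiny 2x2 grids, and the `{(row, col): fraction}` weight layout are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def weighted_zonal_mean(grid, cell_weights):
    """Mean of grid values over one catchment for a single time step.

    grid: 2-D list of floats (rows x cols).
    cell_weights: {(row, col): fraction of that cell inside the catchment}.
    """
    total_weight = sum(cell_weights.values())
    if total_weight == 0:
        raise ValueError("catchment does not overlap the grid")
    weighted_sum = sum(grid[r][c] * w for (r, c), w in cell_weights.items())
    return weighted_sum / total_weight


def label_series(catchment_id, values, start, step=timedelta(hours=1)):
    """Attach the catchment ID and an explicit timestamp to each value."""
    return [
        {"catchment_id": catchment_id, "time": start + i * step, "value": v}
        for i, v in enumerate(values)
    ]


# Two hourly grids (hypothetical values). The catchment fully covers
# cell (0, 0) and half of cell (0, 1).
grids = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[2.0, 4.0], [6.0, 8.0]],
]
weights = {(0, 0): 1.0, (0, 1): 0.5}

means = [weighted_zonal_mean(g, weights) for g in grids]
rows = label_series("cat-1", means, datetime(2023, 4, 27, 0))
```

Setting every weight to 1.0 reduces this to a plain mean over the touched cells, so comparing the two on a few catchments would be one way to check whether partial windowing matters in practice.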

How to test:

Download a set of test data files to use with python prep_hydrofab_forcings_ngen.py user_input_ngen.json:

The default configuration downloads 18 hours of Short Range forecast for the 03W VPU. Once the .json file with the weights has been generated, the time series for the 30000 features can be produced in just a minute or so. Unfortunately, generating that weights file takes almost 20 minutes and is an opportunity for optimization.
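Since the weights file is the slow step, one option is to build it once and reuse it on later runs. The sketch below shows only that caching pattern; `load_or_build_weights`, `slow_build`, the file name `weights_03W.json`, and the shape of the weights dictionary are hypothetical and do not reflect the actual format written by this PR.

```python
import json
import tempfile
from pathlib import Path


def load_or_build_weights(path, build):
    """Return cached weights from `path`; call `build()` only on a cache miss."""
    path = Path(path)
    if path.exists():
        with path.open() as f:
            return json.load(f)
    weights = build()
    with path.open("w") as f:
        json.dump(weights, f)
    return weights


calls = []


def slow_build():
    # Stand-in for the expensive weight computation (hypothetical content).
    calls.append(1)
    return {"cat-1": [[0, 0], [0, 1]]}


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "weights_03W.json"
    first = load_or_build_weights(cache, slow_build)   # builds and writes the file
    second = load_or_build_weights(cache, slow_build)  # reads the cached file
```

The same pattern would apply if the cached file lived in shared storage rather than on local disk, with the existence check and read swapped for the corresponding remote calls.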

@jameshalgren
Contributor Author

@JordanLaserGit
Could we work with @arpita0911patel to create the template files for each of the VPUs and store those in the CIROH cloud buckets?

That would save everyone quite a bit of time and could be a step towards more efficient data access...

@aaraney

aaraney commented May 1, 2023

watching


def aorc_as_rate(dataFrame):
"""
Convert kg/m^2 -> m/s
Contributor Author


Maybe these two can live in some AORC tool repo. Would that be something to put in hydrotools?
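For reference, the kg/m^2 -> m/s conversion in the snippet under review can be worked through as follows. This is a sketch of the arithmetic only, not the body of `aorc_as_rate`: it assumes each value is a one-hour accumulation and a liquid water density of 1000 kg/m^3, and the helper name `hourly_mass_to_rate` is made up for illustration.

```python
WATER_DENSITY_KG_M3 = 1000.0  # assumed density of liquid water
SECONDS_PER_HOUR = 3600.0     # assumes each value is a one-hour accumulation


def hourly_mass_to_rate(precip_kg_m2):
    """Convert an hourly precipitation accumulation in kg/m^2 to a rate in m/s."""
    depth_m = precip_kg_m2 / WATER_DENSITY_KG_M3  # kg/m^2 -> m of water
    return depth_m / SECONDS_PER_HOUR             # m per hour -> m/s


# 3.6 kg/m^2 in one hour is 3.6 mm/h, or roughly 1e-6 m/s.
rate = hourly_mass_to_rate(3.6)
```

If the accumulation interval is not hourly, the divisor has to change accordingly, which is one reason to keep the interval as an explicit constant or parameter.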

Comment on lines 22 to 23
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
Contributor Author


Not used...

@jameshalgren
Contributor Author

Almost there -- need to resolve subset import issue.

3 participants