Reduce memory requirements #141
@MaggieMarvin This is a great start, thank you. I like the biospec mods that allocate only for specific bioemis species when required, and I think it's OK to fix the time dimension for the "tmp" arrays' running averages. Does this allow you to run the longer simulations that you require?

The initial changes look OK, but given the still-large memory requirement of running on large domains (even regional subsets), I think we still need an actual restart option and output, and I'm not sure about your restart variables. Can this be added here too, or should it wait for a future PR? For example, I am not sure your "rst" variables have the right size for the time dimension. For restart, don't we only need those output at a single timestep (i.e., the most recent 24-hr and 240-hr averages), or actually with no time dimension at all (i,j,k)?

We could write the actual 24-hr and 240-hr variables as restarts (e.g., "ppfd_sun24_3d") to a new file if turned on (at a user-specified frequency: every timestep, hour, 6 hours, 24 hours, etc.). Then, upon a new restart simulation, the model would use those 24- and 240-hr values and update them using the fractions and the new instantaneous values (e.g., ppfd_sun) until the simulation has run long enough to calculate updated 24-hr and 240-hr averages. This is similar to how HEMCO/MEGAN2 does it.

Ultimately, I'm not sure we even need these new "rst" variables; instead we could create the functionality/option to output and read the 24-hr and 240-hr average variables (e.g., ppfd_sun24_3d, ppfd_sun240_3d, etc.) for restart capability.
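To make the restart idea above concrete, here is a minimal sketch of updating a stored running average with a new instantaneous value. It assumes "the fractions" means a simple weight of dt/window on the new value and the remainder on the stored average; that weighting, the module name `restart_blend_sketch`, and the function name `blend_running_avg` are illustrative assumptions, not canopy-app or HEMCO code.

```fortran
module restart_blend_sketch
    ! Hypothetical helper, not part of canopy-app.
    implicit none
    integer, parameter :: rk = selected_real_kind(15, 307)
contains
    ! Blend a stored running average (e.g., read from a restart file)
    ! with one new instantaneous value.
    ! ASSUMPTION: the new value is weighted by dt_hr/window_hr and the
    ! stored average by the remainder.
    elemental function blend_running_avg(avg_old, inst, dt_hr, window_hr) &
        result(avg_new)
        real(rk), intent(in) :: avg_old    ! stored 24-hr or 240-hr average
        real(rk), intent(in) :: inst       ! new instantaneous value
        real(rk), intent(in) :: dt_hr      ! timestep length (hours)
        real(rk), intent(in) :: window_hr  ! averaging window (hours)
        real(rk) :: avg_new
        real(rk) :: frac_new

        ! Cap the weight at 1 so a timestep longer than the window
        ! simply replaces the stored average.
        frac_new = min(dt_hr / window_hr, 1.0_rk)
        avg_new  = (1.0_rk - frac_new) * avg_old + frac_new * inst
    end function blend_running_avg
end module restart_blend_sketch
```

Because the function is elemental, it could be applied to a whole (i,j,k) restart field in one statement. With this approach the restart file would need only the averages themselves, with no time dimension.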
src/canopy_alloc.F90
Outdated
if(.not.allocated(ppfd_shade240_tmp)) allocate(ppfd_shade240_tmp(241,nlat*nlon,modlays))
if(.not.allocated(tmp2mref_tmp)) allocate(tmp2mref_tmp(25,nlat*nlon))
if(.not.allocated(ubzref_tmp)) allocate(ubzref_tmp(25,nlat*nlon))
if(.not.allocated(tleaf_sun24_rst)) allocate(tleaf_sun24_rst(24,nlat*nlon,modlays))
Not sure we need the "rst" variables. Please see my general comment on the idea of outputting the total 24-hr and 240-hr leaf temp and ppfd in i,j,k (and maybe lai) as restart variables, to be read in separately and adjusted with instantaneous values based on fractions until new 24-hr and 240-hr averages become available.
src/canopy_alloc.F90
Outdated
if(.not.allocated(ppfd_shade240_tmp_3d)) allocate(ppfd_shade240_tmp_3d(241,nlon,nlat,modlays))
if(.not.allocated(tmp2mref_tmp_3d)) allocate(tmp2mref_tmp_3d(25,nlon,nlat))
if(.not.allocated(ubzref_tmp_3d)) allocate(ubzref_tmp_3d(25,nlon,nlat))
if(.not.allocated(tleaf_sun24_rst_3d)) allocate(tleaf_sun24_rst_3d(24,nlon,nlat,modlays))
Not sure we need the "rst" variables. Please see my general comment on the idea of outputting the total 24-hr and 240-hr leaf temp and ppfd in i,j,k (and maybe lai) as restart variables, to be read in separately and adjusted with instantaneous values based on fractions until new 24-hr and 240-hr averages become available.
src/canopy_calcs.F90
Outdated
tleaf_ave240_tmp_3d(nn,i,j,:) = tleaf_ave
tmp2mref_tmp_3d(25,i,j) = tmp2mref
ubzref_tmp_3d(25,i,j) = ubzref
ppfd_sun24_rst_3d(:,i,j,:) = ppfd_sun24_tmp_3d(1:24,i,j,:)
So there is no functionality yet to read in the "rst" variables from external restart files? See my earlier comments on what these restart variables could look like too. Maybe we need to chat, as I'm not sure how these work right now.
src/canopy_canvars_mod.F90
Outdated
@@ -52,6 +52,18 @@ MODULE canopy_canvars_mod
real(rk), allocatable :: ppfd_shade240_tmp ( : , :, : ) ! PPFD for shaded leaves (umol phot/m2 s)
real(rk), allocatable :: tmp2mref_tmp ( : , : ) ! 2-meter (AGL) input reference air temperature (K)
real(rk), allocatable :: ubzref_tmp ( : , : ) ! 10-meter (AGL) input reference wind speed (m/s)
real(rk), allocatable :: tleaf_sun24_rst ( : , :, : ) ! Leaf temp for sunlit leaves (K)
Similar comment here and below about having new "rst" variables. Are these really needed?
src/canopy_dealloc.F90
Outdated
@@ -59,6 +59,18 @@ SUBROUTINE canopy_dealloc
if(allocated(ppfd_shade240_tmp)) deallocate(ppfd_shade240_tmp)
if(allocated(tmp2mref_tmp)) deallocate(tmp2mref_tmp)
if(allocated(ubzref_tmp)) deallocate(ubzref_tmp)
if(allocated(tleaf_sun24_rst)) deallocate(tleaf_sun24_rst)
Similar comment here and below about having new "rst" variables. Are these really needed?
src/canopy_init.F90
Outdated
@@ -43,6 +43,18 @@ SUBROUTINE canopy_init
if(allocated(ppfd_shade240_tmp)) ppfd_shade240_tmp(:,:,:) = 0.0_rk
if(allocated(tmp2mref_tmp)) tmp2mref_tmp(:,:) = 0.0_rk
if(allocated(ubzref_tmp)) ubzref_tmp(:,:) = 0.0_rk
if(allocated(tleaf_sun24_rst)) tleaf_sun24_rst(:,:,:) = fillreal
Similar comment here and below about having new "rst" variables. Are these really needed?
Thanks Patrick! It works with the rst variables removed and produces consistent results. I'll work on using the main average variables (like ppfd_sun24_3d) to generate restart outputs next.
@MaggieMarvin Great work, and excellent discussion today working through what I/we have done so far with this important historical effect and its memory requirements. I have edited the title to focus on what you have done here in terms of memory requirements. If you can run that 3-4 day test to confirm everything works OK after 2 days, that would be great, and I will complete the review and eventual merge here. You can then bring back your new 24-hr and 240-hr restart variables in a separate draft PR, and we can work on getting the functionality going with new restart file I/O, an option to use them in the NL and codes, etc.
@drnimbusrain Thanks, and that all sounds great! I'll get a 4-day simulation going and get back to you with the results.
I ended up doing a test with the FluCS simulation so that I could focus on one point that we know generates biogenic emissions, and I added a write statement to track changes in one of the tmp variables over time. In the file rolling.txt, the 25 hourly values in the ppfd_sun24_tmp array for the FluCS tower location at the surface are written to a new line as canopy-app loops through each timestep (nn). The array populates from lines 1 to 24, and then at line 25 the array window starts to shift. I added the write statement after assigning the new instantaneous value to the 25th place and copying over to complete the shift, so the last two elements in the array should be the same, at least as written out here. The values vary from day to day, and because this variable tracks solar radiation we do expect to see those zeros at night. Just a note that ppfd_sun24_tmp is not an integer array; I wrote the values out as integers here only to show the effects more clearly.
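The fill-then-shift behaviour described above can be sketched in isolation as follows. This is a minimal illustration assuming a 1-D buffer of window length plus one extra slot; the module name `rolling_window_sketch` and the subroutine `roll_window` are hypothetical and are not taken from canopy_calcs.F90.

```fortran
module rolling_window_sketch
    ! Hypothetical helper, not part of canopy-app.
    implicit none
    integer, parameter :: rk = selected_real_kind(15, 307)
contains
    ! Store one new instantaneous value in a rolling buffer of size nwin+1
    ! (e.g., 25 for a 24-hour window). nn is the 1-based timestep counter.
    subroutine roll_window(buf, nn, inst)
        real(rk), intent(inout) :: buf(:)  ! size nwin+1
        integer,  intent(in)    :: nn      ! current timestep
        real(rk), intent(in)    :: inst    ! new instantaneous value
        integer :: nwin

        nwin = size(buf) - 1
        if (nn <= nwin) then
            ! Fill phase: the window is not yet fully populated.
            buf(nn) = inst
        else
            ! Shift phase: put the new value in the extra slot, then slide
            ! everything down by one so the oldest value drops out.
            ! After the shift, the last two elements are equal.
            buf(nwin + 1) = inst
            buf(1:nwin)   = buf(2:nwin + 1)
        end if
    end subroutine roll_window
end module rolling_window_sketch
```

With a buffer of 25 elements, calling `roll_window` once per hourly timestep fills slots 1 to 24 and starts shifting at timestep 25, matching the pattern reported for rolling.txt.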
Great news, Maggie; it seems to be working as expected. I don't see any values in your txt file, but I believe you and your tests!
@MaggieMarvin Great, I approve and am merging. I hope this at least gets you the longer simulations for right now, and I'm looking forward to working on restart capability in the next draft PR.
These changes are a first attempt at optimizing memory allocation for time-dependent historical variables, which can otherwise become a barrier to starting simulations for longer time periods when the historical options are turned on (hist_opt = 1). Here those variables are allocated only the necessary 24 and 240 hours respectively, with an extra hour used to facilitate a rolling time window. I've also added new "restart variables" (*_rst and *_rst_3d) that save the historical information from the previous 24 and 240 hour time windows and could potentially be output to generate restart files or read in to provide restart information at simulation start. Lastly, I also included a few adjustments to allocation and processing according to species selection (biospec_opt > 0) to further cut down on memory requirements and run time.
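As a rough way to reason about the footprint of one such buffer, the sketch below multiplies out the array dimensions. The helper name `buffer_bytes` and the grid dimensions in the usage note are illustrative assumptions, not values from this PR.

```fortran
module window_memory_sketch
    ! Hypothetical helper, not part of canopy-app.
    use, intrinsic :: iso_fortran_env, only: int64
    implicit none
contains
    ! Bytes needed for one (ntime, nlon, nlat, nlay) real array.
    ! 64-bit arithmetic avoids overflow on large grids.
    pure function buffer_bytes(ntime, nlon, nlat, nlay, bytes_per_real) &
        result(nbytes)
        integer, intent(in) :: ntime           ! time slots, e.g. 25 or 241
        integer, intent(in) :: nlon, nlat      ! horizontal grid size
        integer, intent(in) :: nlay            ! canopy layers (modlays)
        integer, intent(in) :: bytes_per_real  ! e.g. 8 for double precision
        integer(int64) :: nbytes

        nbytes = int(ntime, int64) * int(nlon, int64) * int(nlat, int64) &
                 * int(nlay, int64) * int(bytes_per_real, int64)
    end function buffer_bytes
end module window_memory_sketch
```

For example, on an assumed 100 x 100 grid with 50 layers and 8-byte reals, `buffer_bytes(241, 100, 100, 50, 8)` gives 964,000,000 bytes for a single 240-hour buffer. If the time dimension previously scaled with the number of simulation timesteps, which the description implies but does not state, the same call with a month of hourly steps would show how quickly the total grows.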
Even with these changes there is still a considerable memory barrier (I need at least 2 TB to run canopy-app for one month on a cropped GFS grid), but it is definitely reduced (at least 3 TB was needed previously) and it does also make some progress towards a restart capability where canopy-app can be run for shorter time periods while maintaining the information needed to apply historical effects.
From a two-day test run, results are very similar to the original historical calculation with slight differences due to processing of 240-hour variables: here 24-hour weights are applied to the 240-hour variables until the 240-hour variables are fully populated.