Several fixes and improvements enabling greater applicability of the algorithm for very large datasets #355
Add a `height_threshold` check in seed detection, line 945. Fixes issue 3dgeo-heidelberg#324
for more information, see https://pre-commit.ci
added parameter and check to either use or not use seed candidates that are unfinished by the end of the time series
added check to ensure algorithm continues when only 1 neighbor is found during seed scoring
…termediate_saving`
Added a psutil virtual memory usage tracker
fixed wrong printing statement of percentage virtual mem used
remove intermediate saving, too much memory used
added missing parenthesis
added new commits from py4dgeo
added intermediate saving and resuming from seed index in segmentatio…
added maximum seed index parameter, defining at which seed to terminate the region growing algorithm.
removed unnecessary print statements
fixed intermediate saving by not putting it at the end but at beginning of loop
…tomatically subsetting the computation, due to time constraints on clusters
Previous `height_threshold` check in seed detection checked the full time series instead of the interval used as seed for determining if `height_threshold` was met
…ferencing the analysis.objects to copying the analysis.objects
changed write to number_of_seeds.txt to write a string
Made writing the number of seeds optional via a parameter. This allows the sizes of seed subsets to be computed automatically so the subsets can be run sequentially, to work around time constraints of HPCs.
for more information, see https://pre-commit.ci
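The sequential subsetting described in the commits above could be driven by a small helper like the following. This is only a sketch, not code from this pull request: the function name `seed_batches` and the `batch_size` argument are hypothetical, and it assumes that `resume_from_seed` and `stop_at_seed` (the parameter names given in the pull request description) are 1-based and that the stop index is inclusive.

```python
def seed_batches(number_of_seeds, batch_size):
    """Split seeds 1..number_of_seeds into consecutive (resume, stop) pairs.

    Hypothetical helper: indices are 1-based and the stop index is assumed
    to be inclusive, which the pull request does not state explicitly.
    """
    if number_of_seeds < 0 or batch_size < 1:
        raise ValueError("number_of_seeds must be >= 0 and batch_size >= 1")

    batches = []
    start = 1
    while start <= number_of_seeds:
        # The last batch may be shorter than batch_size.
        stop = min(start + batch_size - 1, number_of_seeds)
        batches.append((start, stop))
        start = stop + 1
    return batches


if __name__ == "__main__":
    # Each pair would be passed to one job as resume_from_seed / stop_at_seed.
    for resume_from_seed, stop_at_seed in seed_batches(10, 4):
        print(resume_from_seed, stop_at_seed)
```

Each pair can then be handed to a separate cluster job, so that no single job has to process all seeds within one time limit.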
Thanks for these valuable contributions to the 4D-OBC algorithm, @hdaan!
That is interesting. When I look to review the changes it does add them in the code. Do you only see the result from this one commit then? Can you maybe show me what you see when you review?
There is one issue I just found in the … Can you update this before merging? Or should I open a new pull request?
This pull request contains several commits with the following changes:

- … `objects` in the `SpatiotemporalAnalysis` object, the algorithm still goes on to detect more `objects` instead of only returning `precalculated`. This is done here.
- … (`intermediate_saving`), which allows saving objects every *n*th seed. This also enables splitting the 4D-OBC computation into batches of seeds, defined by the new parameters `resume_from_seed` and `stop_at_seed`. The computation then starts and stops at the given seed numbers, which are counted from 1.
- … `height_threshold` parameter. This parameter was previously accepted but never used. Now, a seed is only added if the absolute difference between its minimum and maximum is larger than `height_threshold`. This is done here.
- … `use_unfinished`. If this is `False`, unfinished seeds, i.e., seeds that have not returned to their initial elevation, are not considered as seeds for 4D-OBC segmentation; otherwise they are considered, as before.
- … `seed_sorting_scorefunction` for seeds without neighbors. Previously these would raise a division-by-zero error here. Now they are assigned a very high value, which gives them a very low ranking. As they have no neighbors, they will not grow into 4D-OBCs either way. This is a workaround that should be reconsidered in the future.
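The seed-level checks listed above can be illustrated with a minimal sketch. It is not the py4dgeo implementation: the function names, the `NO_NEIGHBOR_SCORE` sentinel, the mean-distance score, and the rule that a seed is "unfinished" when it runs to the last epoch are all assumptions made for illustration. Only `height_threshold` and `use_unfinished` are names taken from this pull request.

```python
import numpy as np

# Arbitrary large sentinel (assumption); any value above all real scores works.
NO_NEIGHBOR_SCORE = 1e9


def meets_height_threshold(timeseries, start, stop, height_threshold):
    """Compare the amplitude of the seed interval only, not the full series."""
    window = np.asarray(timeseries[start : stop + 1], dtype=float)
    amplitude = np.nanmax(window) - np.nanmin(window)
    return bool(amplitude > height_threshold)


def keep_seed(timeseries, start, stop, height_threshold, use_unfinished):
    """Decide whether a seed candidate is used (hypothetical helper)."""
    # Assumption: a seed that reaches the last epoch has not returned to
    # its initial elevation and therefore counts as unfinished.
    unfinished = stop >= len(timeseries) - 1
    if unfinished and not use_unfinished:
        return False
    return meets_height_threshold(timeseries, start, stop, height_threshold)


def sorting_score(neighbor_distances):
    """Lower is better; seeds without neighbors are ranked last."""
    distances = np.asarray(neighbor_distances, dtype=float)
    if distances.size == 0:
        # Guard against dividing by zero neighbors.
        return NO_NEIGHBOR_SCORE
    return float(distances.sum() / distances.size)


if __name__ == "__main__":
    series = [0.0, 0.1, 5.0, 0.2, 0.0, 0.3, 0.4]
    print(keep_seed(series, 1, 3, 1.0, use_unfinished=False))
    print(keep_seed(series, 3, 5, 1.0, use_unfinished=False))
    print(sorting_score([]))
```

Restricting the amplitude check to `timeseries[start : stop + 1]` reflects the later fix in this pull request, where checking the full time series let seeds pass because of changes outside their own interval.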