Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix various bugs #336

Merged
merged 33 commits into from
Oct 31, 2023
Merged

Fix various bugs #336

merged 33 commits into from
Oct 31, 2023

Conversation

annehaley
Copy link
Collaborator

@annehaley annehaley commented Aug 30, 2023

This PR is essentially a bug hunt; it began with looking at multi-domain project use cases for unexpected behavior. Most of these fixes are related to multi-domain cases, but a few are miscellaneous bugs.

This PR resolves #330.

Results

The following bugs were identified and addressed in this PR:

Multi-domain

  • A new anatomy was assumed for every unique suffix in a subject within an swproj file. One use case project had keys like the following: "shape_1", "shape_2", "alignment_1", "alignment_2", "alignment_global", etc. Only two domain shapes exist for the subject, but a new anatomy was assumed, and this resulted in a third anatomy option "global" in the client, which contained no shapes. This was fixed by raising a warning (and not creating a new anatomy type) when a new suffix is encountered on any non-shape prefixes. Shape prefixes include "shape", "mesh", "segmentation", "contour", and "image".
  • When ReconstructedSample objects were created in the post-Analysis step, all reconstructed shapes of the same subject would reference the first OptimizedParticles object that belonged to the subject. On the client-side, the particles reference is what determines which shapes should be shown on the "reconstructed" layer. Only one reconstructed shape had been appearing because only one particle object had been referenced among multiple ReconstructedSamples. The solution was to properly make one-to-one assignments in the post-Analysis step.
  • The number of particles the user could specify for the Optimization step was being ignored for multi-domain cases. The value expected by the shapeworks executable for "number_of_particles" is actually a string of m numbers, delimited by a space, where m is the number of domains for a subject. We were sending in "32", when for a two-domain case, the expected value should be "32 32". An extra bit of logic was added where we edit the swproj file prior to task execution. Currently, the user can only specify one number, so we repeat that same value m times. In the future, this should be upgraded so that the user can specify m unique values for m domains. This is tracked in Multi-domain optimization: unique n_particles for each domain #338.
  • The store variables that track several file lists for the analysis results were not structured to handle a multidomain use case. Those store variables and the functions that call them were restructured.
    • analysisFileShown -> analysisFilesShown
    • currentAnalysisFileParticles -> currentAnalysisParticlesFiles
    • meanAnalysisFileParticles -> meanAnalysisParticlesFiles
      This fixed all particle comparisons for showing the difference from the mean shape and showing whether particles are good/bad. The attached videos demonstrate the fixed behavior.
shapeworks_multidomain_diff_from_mean.mp4
shapeworks_multidomain_good_bad_particles.mp4
  • Particle comparison occurs in one more place: group comparisons. I haven't been able to test the new store var structures with this use case because I don't have a dataset that has groups of multi domain subjects. With help from @JakeWags, the underlying structures for Group Comparisons were modified to support multi-domain cases as well.

Miscellaneous

  • When re-running a task, the old copy of the result layer would remain until the user removed the layer and put it back. Only then would the contents of the layer update to the new results. This is fixed by automatically removing the old result layer at the time a re-run is requested.
  • When re-running a task, the old result objects are deleted, and objects associated with the old results are also deleted. In many use-cases this does not pose a problem, but in one use-case, particle files are referenced as belonging in a folder with a custom name (rather than the default folder "particles"). When the Optimize step was run on this project after a Groom step in which the old particle files are deleted, the Optimize execution would create new particle files in the wrong location. It expects to replace old particle files if they exist, but will create the new ones in the default location if old particles do not exist. Then, in the post-Optimization step, attempting to upload the particles using the paths in the swproj file would raise an error (it would look for the particles in the custom-named folder, when they now existed in the default-named folder). The solution to this problem was to eliminate the cascading deletion behavior. Database relations not tied to a Subject, Project, or Dataset now use the SET_NULL behavior instead of the CASCADE behavior.

@annehaley annehaley changed the title Simplify prefix parsing for multi-domain projects Fix multi-domain projects Aug 30, 2023
@annehaley annehaley force-pushed the fix-multi-domain branch 2 times, most recently from 2b467be to 9513218 Compare September 10, 2023 13:17
@annehaley annehaley changed the title Fix multi-domain projects Fix various bugs Sep 15, 2023
@annehaley annehaley marked this pull request as ready for review September 15, 2023 15:55
@annehaley
Copy link
Collaborator Author

annehaley commented Sep 15, 2023

Rebased to exclude commit with configuration.MIDDLEWARE += ['allauth.account.middleware.AccountMiddleware']: moved to #339

@annehaley
Copy link
Collaborator Author

As of my recent pair-programming with @JakeWags, it appears that this PR fixes any multi-domain use cases. We confirmed the whole workflow (Groom, Optimize, Analyze) works on both single-domain and multi-domain test datasets.

We can start the review process, but I would like to wait on merging this for the moment. Jake will soon be receiving a six-domain dataset from a collaborator, and it would be a good idea to test with that data before merging. So far, our only multi-domain datasets have had just two domains.

@manthey
Copy link
Contributor

manthey commented Oct 2, 2023

I downloaded https://www.shapeworks-cloud.org/#/dataset/151/project/194 and uploaded it to my local instance then regroomed and reoptimized it. While playing with it, sometimes I don't see all of the domains:
shapeworks_layer_bug

@JakeWags
Copy link
Collaborator

JakeWags commented Oct 2, 2023

I downloaded https://www.shapeworks-cloud.org/#/dataset/151/project/194 and uploaded it to my local instance then regroomed and reoptimized it. While playing with it, sometimes I don't see all of the domains:

@manthey Is this happening after running through every operation, when using the "pre-cached" files, or both?

@manthey
Copy link
Contributor

manthey commented Oct 2, 2023

I downloaded https://www.shapeworks-cloud.org/#/dataset/151/project/194 and uploaded it to my local instance then regroomed and reoptimized it. While playing with it, sometimes I don't see all of the domains:

@manthey Is this happening after running through every operation, when using the "pre-cached" files, or both?

After uploading the project, I reran groom and optimize with the default values. I saw the issue while playing with the results. If I do a hard reload of project, I can see the issue if all I do is turn on and off the original, groomed, and reconstructed layers. I can get it consistently to have an issue by just loading from default, picking groomed, picking reconstructed (then I think I see an issue), turning off original and turning off reconstructed (now I have only one domain showing). I see this in both Chrome and Firefox.

@annehaley annehaley mentioned this pull request Oct 2, 2023
6 tasks
@annehaley
Copy link
Collaborator Author

I believe I found the cause:

When we render shapes (not including particles), we send in shape data to the rendering process as a 2D array. For each domain, there is a list of overlapping shapes. When we cache these shapes, we use a unique label, which is supposed to include the index of the domain.

We had been getting this index from

shapes.flat().forEach((shape, index) => {
    ...
    const cacheLabel = `${label}_${layerName}_${index}`
})

Because of this, our cached labels looked like this:

  1. Showing only original for one subject:
ellipsoid_joint_02_Original_0
ellipsoid_joint_02_Original_1
  1. Showing original and groomed for one subject:
ellipsoid_joint_02_Original_0
ellipsoid_joint_02_Groomed_1
ellipsoid_joint_02_Original_2
ellipsoid_joint_02_Groomed_3
  1. Showing only groomed for one subject:
ellipsoid_joint_02_Groomed_0
ellipsoid_joint_02_Groomed_1

Going in this order of steps, at the third step, the shape stored for ellipsoid_joint_02_Groomed_0 and ellipsoid_joint_02_Groomed_1 is actually the same geometry. Both refer to the first anatomy.

The solution is simple: don't flatten the 2D array. We only care about the index of the domain, not the index of the layer in the overlapping shapes. The layer is already reflected in the cache label. I've made this change in ee31dd3.

@annehaley
Copy link
Collaborator Author

I did find another issue while making the above fix. When viewing the Analysis tab for this dataset, it looks like the two domains are drawn on top of each other. I was able to confirm that these are two separate sets of shape and particle files.

Perhaps this is the output of the analysis itself? I'm not aware that we do any centering / repositioning. We may just have to ask Alan.

image

@JakeWags and I had noticed the same thing happening on the Hip Multiple Domain dataset.
image

@JakeWags
Copy link
Collaborator

The analysis objects all being at the origin is due to no reference alignment being used to compute distance transforms. Alan mentioned potential options for individual domain reference, global reference, and Multi-Level PCA. These would likely be options set before the analysis is run (in optimization) or after the analysis has been generated (see std. dev. range).

@JakeWags
Copy link
Collaborator

The analysis objects all being at the origin is due to no reference alignment being used to compute distance transforms. Alan mentioned potential options for individual domain reference, global reference, and Multi-Level PCA. These would likely be options set before the analysis is run (in optimization) or after the analysis has been generated (see std. dev. range).

SCIInstitute/ShapeWorks#2154 will fix this issue.

@manthey
Copy link
Contributor

manthey commented Oct 25, 2023

We might want to merge this and then address remaining bugs on another PR. I still see a bug when I am plying with "Ellipsoid Multiple Domain". I switch to showing Groomed and Reconstructed, turning off Original. I am looking at one subject. If I turn off anatomy_1, it looks correct. If I turn off just anatomy_2, it does not:
shapeworks_anat_bug

@annehaley
Copy link
Collaborator Author

We might want to merge this and then address remaining bugs on another PR. I still see a bug when I am plying with "Ellipsoid Multiple Domain".

Ok, I agree that we should cut off the scope of this PR. I've made an issue for this bug so we can address it in a future PR: #347

Copy link
Contributor

@manthey manthey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I say let's merge this and follow up with any remaining issues as other PRs.

@annehaley annehaley merged commit e8975f3 into master Oct 31, 2023
3 checks passed
@annehaley annehaley deleted the fix-multi-domain branch October 31, 2023 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multi-Domain use cases
3 participants