Adapting CutoutFactory to account for error-less TICA Cubes #88
Conversation
Codecov Report. Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
##             main      #88      +/-   ##
==========================================
+ Coverage   94.40%   94.45%   +0.05%
==========================================
  Files           8        8
  Lines        1429     1443      +14
==========================================
+ Hits         1349     1363      +14
  Misses         80       80
☔ View full report in Codecov by Sentry.
Force-pushed from c0e89d3 to 190a0dc.
The tesscut code will need to change for this to work. astroquery? Anything else?
Side note: @hmlewis-astro will be doing a science review of this PR (cutout image/WCS inspection, etc.) next sprint, so I'd like to hold off on merging until I get her approval. Thanks!
There are some flake8 failures in CI -- they look relatively trivial to fix: https://github.com/spacetelescope/astrocut/actions/runs/5716882528/job/15489424621?pr=88. It would be good to have CI passing. I didn't look at the "allowed failure" at all -- that can be ignored for now, I think.
Interesting - I haven't seen that codestyle error before. I'll update with a new commit.
Regarding the CI failure for Python 3.11 with everything installed from source, I think we might need an additional step to install OpenBLAS (libblas-dev).
This can be done in a different PR, though. Once this is merged, I don't mind taking a look.
The merge-base changed after approval.
Before we merge this into main: I demoed the work here earlier this week and got a suggestion from @scfleming to keep the error array in the cutout files.
I'd be happy with that change @jaymedina -- it will make this cutout code easier to maintain, as well. It will make the cutouts the same size as the SPOC cutouts, though (not any smaller), but I don't think shrinking them was the intention with any of this.
Updated accordingly. Yep, the cutouts themselves are small, so it's not as big of a deal for them to have the error array. The important thing was to reduce the size of the cubes by removing the error array there, and to adapt CutoutFactory accordingly.
@@ -729,10 +729,10 @@ def _build_tpf(self, cube_fits, img_cube, uncert_cube, cutout_wcs_dict, aperture
    # Adding flux and flux_err (data we actually have!)
    pixel_unit = 'e-/s' if self.product == 'SPOC' else 'e-'
    flux_err_array = uncert_cube if self.product == 'SPOC' else empty_arr
I'm a little uncertain about this. NumPy array assignment is done by reference. I'm just worried that because we are using empty_arr here to assign into cadence_array, or appending it to columns, something might change inside of empty_arr and then be applied to flux_err_array at this point.
E.g., to illustrate what I'm worried about:
>>> import numpy as np
>>> empty_arr = np.zeros((2,3))
>>> other_arr = empty_arr[:,1]
>>> other_arr[0] = 1
>>> other_arr
array([1., 0.])
>>> empty_arr
array([[0., 1., 0.],
[0., 0., 0.]])
Maybe on this line we should assign flux_err_array to just np.zeros(img_cube.shape) to be sure it's empty? Or maybe we're not changing cadence_array, so this isn't an issue?
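For what it's worth, here is a minimal sketch of the two defensive options mentioned here; the shapes and the standalone use of names like img_cube and empty_arr are made up for illustration.

import numpy as np

# Illustrative shapes only; the real arrays come from the cutout code.
img_cube = np.ones((5, 4, 3))          # stand-in for the cutout flux cube
empty_arr = np.zeros(img_cube.shape)   # shared placeholder reused for other columns

# Option 1: take an independent copy, so later edits to empty_arr can't leak in.
flux_err_array = empty_arr.copy()

# Option 2: allocate a fresh zero array outright (the suggestion above).
flux_err_array = np.zeros(img_cube.shape)

# Either way, mutating empty_arr no longer touches flux_err_array.
empty_arr[0, 0, 0] = 99.0
assert flux_err_array[0, 0, 0] == 0.0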
Another thing I'm wondering -- should the error be zero, or should it be np.NaN?
I thought about this too. I lean towards 0 for two reasons at the moment:
- The TICA cubes had the dummy values set to 0, I think, so this maintains that part of it.
- Adding NaN may result in issues for non-NaN-safe calculations, e.g., produce values of NaN for any computation that includes the flux uncertainty. Of course, 0 could also introduce problems depending on what the user is doing with it; it's not obvious to me which is better if someone isn't aware that these are essentially undefined but uses the flux uncertainties in calculations within their code.
E.g., a flux uncertainty of 0 wouldn't impact a propagation-of-error calculation, but a value of NaN would make the result NaN.
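To make that concrete, here is a tiny propagation-of-error illustration; the numbers are made up.

import numpy as np

flux_err_known = 0.05          # a real, known uncertainty
placeholder_zero = 0.0         # undefined TICA uncertainty filled with 0
placeholder_nan = np.nan       # undefined TICA uncertainty filled with NaN

# Quadrature sum for the error on a sum of two fluxes:
print(np.sqrt(flux_err_known**2 + placeholder_zero**2))  # 0.05 -- silently unchanged
print(np.sqrt(flux_err_known**2 + placeholder_nan**2))   # nan  -- the flag propagates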
In that case, wouldn't it be better to nullify the results than to silently make the calculation with an incorrect value for the error?
That was my old philosophy: use NaNs as a forced flag to indicate to an unaware user that something is not defined, so that they either need to mask or add NaN-safe options to their function calls. These days I wonder how many functions filter out NaN values automatically or not. Do you have thoughts on this @hmlewis-astro @jaymedina?
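For reference, a quick look at how NumPy's default reductions versus their nan-safe variants treat such a flag (toy values, not data from this PR):

import numpy as np

flux_err = np.array([0.05, 0.04, np.nan])  # one undefined uncertainty flagged with NaN

print(np.mean(flux_err))     # nan   -- default reductions propagate the flag
print(np.nanmean(flux_err))  # 0.045 -- nan-safe variants quietly drop it
print(np.sum(flux_err), np.nansum(flux_err))  # nan, ~0.09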
I'm in favor of that. I'll go ahead and make a commit unless there are other suggestions!
I am not in favor of using any actual number here -- the error/uncertainty is unknown. Using actual numbers is likely just going to result in confusion. If we're returning an array to conform with some expectation that the cutout has a certain number of extensions or whatever, but the values are not to be used -- well, this is precisely what np.NaN is for. We might also consider writing something in the header of the FITS file about it?
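A rough sketch of that last suggestion, assuming a hypothetical cutout file name and keyword text (not what the PR actually writes):

from astropy.io import fits

# "cutout.fits" and the comment text below are placeholders for illustration.
with fits.open("cutout.fits", mode="update") as hdul:
    hdr = hdul[1].header
    # Append a commentary card flagging that the uncertainty values are placeholders.
    hdr["COMMENT"] = "FLUX_ERR is undefined for TICA products; values are placeholders."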
I'm not opposed to changing the vals to np.nans. One thing to consider, though, is that np.zeros arrays are used for all the other columns we don't have information about, for example CADENCENO, which is the column that was referenced at the start of this thread (when making references to cadence_array). So if one column uses np.nan to fill in values we don't know about and another uses np.zeros, that will be even more confusing.
What we can do for now is stick with the traditional np.zeros array (I can reassign flux_err_array to np.zeros(img_cube.shape) to ensure it's a zero array, as suggested in the first comment of the thread), and once we come to an agreement on what the fill vals should be, I'll open a new PR to update the empty_arr for all relevant columns. Let me know if this works for you all and I'll make a Jira ticket.
As for timeline, I'd like to get this PR merged in by end of this week or beginning of next week so I have enough time to update and redeploy the TESSCut API before the end of this sprint.
I think this solution works for now! Once you make the changes to reassign flux_err_array following Ben's original comment, I think this is ready to merge.
Sounds good, thanks @hmlewis-astro!
Should the title of this PR change?
I agree, keeping the
            getattr(cube[1], cube_data_prop), threads=threads, verbose=verbose
        )
img_cutout, uncert_cutout, aperture = self._get_cutout(getattr(cube[1], cube_data_prop), threads=threads,
                                                       verbose=verbose)
We should probably align on what whitespace formatting we're using on this project to avoid unnecessary changes back and forth...
I use darker, which just applies black selectively to my set of changes (instead of to the entire file).
The merge-base changed after approval.
Force-pushed from 367da13 to fef425a.
Summary
This PR updates CutoutFactory to account for the removed error array in TicaCubeFactory (work that was done in #87). It involves turning the product kwarg from cube_cut() into a class attribute so that it doesn't have to be fed into every method as an argument before being called. It also involves using this new self.product class attribute in conditionals that make the code dynamic, changing the logic depending on whether the product for the cutouts is "TICA" or "SPOC". Unit test changes are also made where appropriate.
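A reader's sketch of the pattern the summary describes (heavily simplified; the real class and its method signatures in astrocut have many more parameters and steps):

import numpy as np

class CutoutFactory:
    """Simplified sketch: product becomes a class attribute set once in cube_cut()."""

    def __init__(self):
        self.product = None  # "SPOC" or "TICA", set by cube_cut()

    def cube_cut(self, cube_file, coordinates, cutout_size, product="SPOC"):
        # Store the product once instead of passing it into every downstream method.
        self.product = product.upper()
        ...  # open the cube, compute the cutout, then build the target pixel file

    def _build_tpf(self, img_cube, uncert_cube):
        # Downstream methods branch on self.product rather than on a passed-in kwarg.
        pixel_unit = 'e-/s' if self.product == 'SPOC' else 'e-'
        flux_err_array = uncert_cube if self.product == 'SPOC' else np.zeros(img_cube.shape)
        return pixel_unit, flux_err_array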
Task List
- CutoutFactory
- cube_cut unit tests
Screenshots