Adapting CutoutFactory to account for error-less TICA Cubes #88
Conversation
Codecov Report. Patch coverage:
Additional details and impacted files
@@ Coverage Diff @@
##             main      #88      +/-   ##
==========================================
+ Coverage   94.40%   94.45%   +0.05%
==========================================
  Files           8        8
  Lines        1429     1443      +14
==========================================
+ Hits         1349     1363      +14
  Misses         80       80
☔ View full report in Codecov by Sentry.
Force-pushed from c0e89d3 to 190a0dc.
The tesscut code will need to change for this to work. astroquery? Anything else?
Side note: @hmlewis-astro will be doing a science review of this PR (cutout image/WCS inspection, etc.) next sprint, so I'd like to hold off on merging until I get her approval. Thanks!
There are some flake8 failures in CI -- they look relatively trivial to fix: https://github.com/spacetelescope/astrocut/actions/runs/5716882528/job/15489424621?pr=88. It would be good to have CI passing. I didn't look at the "allowed failure" at all -- that can be ignored for now, I think.
Interesting - I haven't seen that codestyle error before. I'll update with a new commit.
Regarding the CI failure for Python 3.11 with everything installed from source, I think we might need an additional step to install OpenBLAS (libblas-dev).
This can be done in a different PR, though. Once this is merged, I don't mind taking a look.
The merge-base changed after approval.
Before we merge this into main: I demoed the work here earlier this week and got a suggestion from @scfleming to keep the error array in the cutout files.
I'd be happy with that change @jaymedina -- it will make this cutout code easier to maintain, as well. It will make the cutouts the same size as the SPOC cutouts, though (not any smaller), but I don't think shrinking them was the intention with any of this.
Updated accordingly. Yep, the cutouts themselves are small, so it's not as big of a deal for them to have the error array. The important thing was to reduce the size of the cubes by removing the error array there, and to adapt CutoutFactory accordingly.
@@ -729,10 +729,10 @@ def _build_tpf(self, cube_fits, img_cube, uncert_cube, cutout_wcs_dict, aperture
    # Adding flux and flux_err (data we actually have!)
    pixel_unit = 'e-/s' if self.product == 'SPOC' else 'e-'
    flux_err_array = uncert_cube if self.product == 'SPOC' else empty_arr
I'm a little uncertain about this. NumPy array assignment is done by reference. I'm just worried that because we are using empty_arr here to assign into cadence_array, or appending it to columns, something might change inside of empty_arr and then be applied to flux_err_array at this point.
E.g., to illustrate what I'm worried about:
>>> import numpy as np
>>> empty_arr = np.zeros((2,3))
>>> other_arr = empty_arr[:,1]
>>> other_arr[0] = 1
>>> other_arr
array([1., 0.])
>>> empty_arr
array([[0., 1., 0.],
[0., 0., 0.]])
Maybe on this line we should assign flux_err_array to just np.zeros(img_cube.shape) to be sure it's empty? Or maybe we're not changing cadence_array, so this isn't an issue?
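For what it's worth, here is a minimal sketch of the two defensive options mentioned here; the shapes and the standalone use of names like img_cube and empty_arr are made up for illustration.

import numpy as np

# Illustrative shapes only; the real arrays come from the cutout code.
img_cube = np.ones((5, 4, 3))          # stand-in for the cutout flux cube
empty_arr = np.zeros(img_cube.shape)   # shared placeholder reused for other columns

# Option 1: take an independent copy, so later edits to empty_arr can't leak in.
flux_err_array = empty_arr.copy()

# Option 2: allocate a fresh zero array outright (the suggestion above).
flux_err_array = np.zeros(img_cube.shape)

# Either way, mutating empty_arr no longer touches flux_err_array.
empty_arr[0, 0, 0] = 99.0
assert flux_err_array[0, 0, 0] == 0.0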
Another thing I'm wondering -- should the error be zero, or should it be np.NaN?
I thought about this too. I lean towards 0 for two reasons at the moment:
- The TICA cubes had the dummy values set to 0, I think, so this maintains that part of it.
- Adding NaN may result in issues for non-NaN-safe calculations, e.g., produce values of NaN for any computation that includes the flux uncertainty. Of course, 0 could also introduce problems depending on what the user is doing with it; it's not obvious to me which is better if someone isn't aware that these are essentially undefined but uses the flux uncertainties in calculations within their code.
E.g., a flux uncertainty of 0 wouldn't impact a propagation-of-error calculation, but a value of NaN would make the result NaN.
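To make that concrete, here is a tiny propagation-of-error illustration; the numbers are made up.

import numpy as np

flux_err_known = 0.05          # a real, known uncertainty
placeholder_zero = 0.0         # undefined TICA uncertainty filled with 0
placeholder_nan = np.nan       # undefined TICA uncertainty filled with NaN

# Quadrature sum for the error on a sum of two fluxes:
print(np.sqrt(flux_err_known**2 + placeholder_zero**2))  # 0.05 -- silently unchanged
print(np.sqrt(flux_err_known**2 + placeholder_nan**2))   # nan  -- the flag propagates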
In that case, wouldn't it be better to nullify the results than to silently make the calculation with an incorrect value for the error?
That was my old philosophy: use NaNs as a forced flag to indicate to an unaware user that something is not defined, so that they either need to mask or add NaN-safe options to their function calls. These days I wonder how many functions filter out NaN values automatically or not. Do you have thoughts on this @hmlewis-astro @jaymedina?
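For reference, a quick look at how NumPy's default reductions versus their nan-safe variants treat such a flag (toy values, not data from this PR):

import numpy as np

flux_err = np.array([0.05, 0.04, np.nan])  # one undefined uncertainty flagged with NaN

print(np.mean(flux_err))     # nan   -- default reductions propagate the flag
print(np.nanmean(flux_err))  # 0.045 -- nan-safe variants quietly drop it
print(np.sum(flux_err), np.nansum(flux_err))  # nan, ~0.09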
I'm in favor of that. I'll go ahead and make a commit unless there are other suggestions!
I am not in favor of using any actual number here -- the error/uncertainty is unknown. Using actual numbers is likely just going to result in confusion. If we're returning an array to conform with some expectation that the cutout has a certain number of extensions or whatever, but the values are not to be used -- well, this is precisely what np.NaN is for. We might also consider writing something in the header of the FITS file about it?
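A rough sketch of that last suggestion, assuming a hypothetical cutout file name and keyword text (not what the PR actually writes):

from astropy.io import fits

# "cutout.fits" and the comment text below are placeholders for illustration.
with fits.open("cutout.fits", mode="update") as hdul:
    hdr = hdul[1].header
    # Append a commentary card flagging that the uncertainty values are placeholders.
    hdr["COMMENT"] = "FLUX_ERR is undefined for TICA products; values are placeholders."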
I'm not opposed to changing the vals to np.nans. One thing to consider, though, is that np.zeros arrays are used for all the other columns we don't have information about, for example CADENCENO, which is the column that was referenced at the start of this thread (when making references to cadence_array). So if one column uses np.nan to fill in values we don't know about and another uses np.zeros, that will be even more confusing.
What we can do for now is stick with the traditional np.zeros array (I can reassign flux_err_array to np.zeros(img_cube.shape) to ensure it's a zero array, as suggested in the first comment of the thread), and once we come to an agreement on what the fill vals should be, I'll open a new PR to update the empty_arr for all relevant columns. Let me know if this works for you all and I'll make a Jira ticket.
As for timeline, I'd like to get this PR merged in by end of this week or beginning of next week so I have enough time to update and redeploy the TESSCut API before the end of this sprint.
I think this solution works for now! Once you make the changes to reassign flux_err_array following Ben's original comment, I think this is ready to merge.
Sounds good, thanks @hmlewis-astro!
Should the title of this PR change?
I agree, keeping the
            getattr(cube[1], cube_data_prop), threads=threads, verbose=verbose
        )
img_cutout, uncert_cutout, aperture = self._get_cutout(getattr(cube[1], cube_data_prop), threads=threads,
                                                       verbose=verbose)
We should probably align on what whitespace formatting we're using on this project to avoid unnecessary changes back and forth...
I use darker, which just applies black selectively to my set of changes (instead of to the entire file).
The merge-base changed after approval.
Force-pushed from 367da13 to fef425a.
Summary
This PR updates CutoutFactory to account for the removed error array in TicaCubeFactory (work that was done in #87). It involves turning the product kwarg from cube_cut() into a class attribute so that it doesn't have to be fed into every method as an argument before being called. It also involves using this new self.product class attribute in conditionals that make the code dynamic, changing the logic depending on whether the product for the cutouts is "TICA" or "SPOC". Unit test changes are also made where appropriate.
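A reader's sketch of the pattern the summary describes (heavily simplified; the real class and its method signatures in astrocut have many more parameters and steps):

import numpy as np

class CutoutFactory:
    """Simplified sketch: product becomes a class attribute set once in cube_cut()."""

    def __init__(self):
        self.product = None  # "SPOC" or "TICA", set by cube_cut()

    def cube_cut(self, cube_file, coordinates, cutout_size, product="SPOC"):
        # Store the product once instead of passing it into every downstream method.
        self.product = product.upper()
        ...  # open the cube, compute the cutout, then build the target pixel file

    def _build_tpf(self, img_cube, uncert_cube):
        # Downstream methods branch on self.product rather than on a passed-in kwarg.
        pixel_unit = 'e-/s' if self.product == 'SPOC' else 'e-'
        flux_err_array = uncert_cube if self.product == 'SPOC' else np.zeros(img_cube.shape)
        return pixel_unit, flux_err_array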
Task List
- CutoutFactory
- cube_cut unit tests
Screenshots