Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-44945 Implement code to fill in missing columns #44

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

bbrondel
Copy link
Collaborator

@bbrondel bbrondel commented Oct 14, 2024

This code relies on existing data already being loaded into the database (especially s_ra, s_dec, *_mjd, and physical_filter) to fill in missing data.

@bbrondel bbrondel requested review from ktlim and Vebop October 14, 2024 22:17
@bbrondel bbrondel marked this pull request as ready for review October 14, 2024 23:07
Copy link
Collaborator

@Vebop Vebop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you expect to only need to use fixup this once? (so no unit tests?)
Can you note on how you use/used this in TTS/summit to fix the missing data, and how you know it works/didn't work?

approved with these questions answered

# Bail out because we don't have enough info.
return dict()

if exposure_rec["s_ra"] == 0.0 and exposure_rec["s_dec"] == 0.0:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did I see conversation about this somewhere on why we have these? could that info be added to the comment? Maybe I hallucinated it though

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall mentioning that there were zeroed out RA/Dec present in the database, but not how they got there. Really these values ought to be replaced with NULLs IMO.

# Use K&Y 1989 model to compute airmass from altitude
airmass = None
if altitude >= 0 and altitude <= 90:
airmass = 1 / (sin(radians(altitude)) + 0.50572 * (altitude + 6.07995) ** -1.6364)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put these magic numbers in some constants, if there are names for them?

"zenith_distance_end": zenith_distance_end,
"airmass": airmass,
}
for k, v in calculations.items():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you're editing, can you expand these to useful names, or at least the full words?


if exposure_rec["obs_start_mjd"] is not None and exposure_rec["obs_end_mjd"] is not None:
self.logger.debug("Inferring column: dark_time")
return {"dark_time": 86400 * (exposure_rec["obs_end_mjd"] - exposure_rec["obs_start_mjd"])}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you define this 86400?

python/lsst/consdb/hinfo_fix_missing.py Outdated Show resolved Hide resolved
dictionary in one go. Call this to apply the fixup methods. To add
more fixup methods, just add them to this class, make sure the
method name starts with "fix_", and make sure the signature matches:
def fix_whatever(self, exposure_rec: dict[str, Any])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it would be clearer to add the -> dict[str, Any]: to the end of this?

python/lsst/consdb/hinfo_fix_missing.py Outdated Show resolved Hide resolved


class Fixer:
"""A collection of functions that try to patch up missing data columns."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned about the extent of the coding here. If this information is available from the ObservationInfo produced by the metadata translator but we previously used a header value, we should switch to the ObservationInfo. If the value is in the ObservationInfo but is incorrect, we should fix the translator. If the information is not available in the ObservationInfo, but is available elsewhere in obs_lsst (e.g. filter to band mappings), we should try to use that. We should only compute something here if it's not computed anywhere else (which could be because we remove the other place it's computed).

Otherwise, keeping all of these in sync will be a maintenance nightmare.

# Can't infer band from physical_filter
return dict()

for band in "ugrizy":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems better to match r"SDSS([ugrizy])" or r"([ugrizy])_\d+" and then extract the group. Then no loop is needed.

@ktlim
Copy link
Contributor

ktlim commented Oct 20, 2024

Please omit line change counts in your PR titles.

This may have been my fault. I like having these when people notify me (either in Jira comments or Slack messages or emails) that they're requesting a review from me so that I can quickly judge the scope of the change. But I agree that they don't belong in PR titles.

@bbrondel bbrondel changed the title DM-44945 Implement code to fill in missing columns +259-0 DM-44945 Implement code to fill in missing columns Oct 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants