Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Brokers should not throw exceptions on attempted creation of duplicate objects #253

Closed
dmcollom opened this issue Feb 20, 2020 · 10 comments
Closed
Labels
bug Something isn't working Data Services Data Services

Comments

@dmcollom
Copy link
Contributor

In MARS specifically, when attempting to create and object from an alert when the object has previously been created, the TOM Toolkit throws an exception and should instead attempt to resolve any differences in target properties.

This is likely an issue in all or most brokers.

@dmcollom dmcollom added the bug Something isn't working label Feb 20, 2020
@JulienPeloton
Copy link
Contributor

Hi @dmcollom,

I do not know if this is related, but I tried to automatize query with broker (following https://github.com/TOMToolkit/tom_base/blob/main/tom_alerts/management/commands/runbrokerquery.py) and I am facing similar errors.

For the Fink broker, our to_target method reads:

def to_target(self, alert: dict) -> Target:
    """ Redirect query result to a Target

    Parameters
    ----------
    alert: dict
        Dictionary containing alert data: {column name: value}. See
        `self.fetch_alerts` for more information.

    """
    target = Target.objects.create(
        name=alert['i:objectId'], # unique ID for objects, but shared among several alerts
        type='SIDEREAL',
        ra=alert['i:ra'],
        dec=alert['i:dec'],
    )
    return target

That is if an object periodically sends alerts, the same target will be created - and then I have an error like:

Created target: ZTF17aaaabte # First alert is OK
Traceback (most recent call last): # subsequent alerts throw error
  File "/home/ubuntu/.local/lib/python3.8/site-packages/django/db/backends/utils.py", line 84, in _execute
    return self.cursor.execute(sql, params)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/django/db/backends/sqlite3/base.py", line 413, in execute
    return Database.Cursor.execute(self, query, params)
sqlite3.IntegrityError: UNIQUE constraint failed: tom_targets_target.name

Is there a way to avoid multiple creation of targets for the same object?

@dmcollom
Copy link
Contributor Author

So, the standard way to avoid multiple creation in Django is get_or_create, but I'm not 100% sure that would work in the general case because the data for the target will presumably change in very small ways as more data comes in. As an example, in MARS the most recent alert for ZTFaaaabte has an RA of 27.86472, but the one before that has 27.86447, which is enough to fail the get portion of get_or_create, provided you pass in RA.

What it comes down to is that I'm not sure what the behavior should be in this case, although I agree that it certainly shouldn't cause an error, and that your use case needs to be able to work. Do you have any thoughts about how the differences in coordinates should be handled, or perhaps @rachel3834 has an opinion? Once we work that out, I can put a fix in pretty easily, I should think.

@JulienPeloton
Copy link
Contributor

Thanks - but in my case, it does not even consider ra/dec, and it complains about two targets having the same name: sqlite3.IntegrityError: UNIQUE constraint failed: tom_targets_target.name. I think I would need a mechanism to check the available targets, and not create the target if the name already exists (regardless other properties).

@rachel3834
Copy link
Contributor

This is certainly a general issue in astronomy. In the event that two targets have the same name, then I would say that they best resolution would be to store the original parameter values as a ReducedDatum series, and append the new values.

If the RA, Dec have associated uncertainties - not always the case - then you could include a test to evaluate which set of measurements is more precise, and set the Target's RA, Dec attributes to the most precise set of values.

The converse example is where you have two potential targets with different names with overlapping coordinates. In the case of most objects (ignoring proper motion) you would want to handle this case as two detections of the same target (as above). The exception would be for moving objects, especially Solar System targets.

@dmcollom
Copy link
Contributor Author

dmcollom commented Apr 30, 2021

My apologies for not being clear: I see your point that the failure is on the Target name, but it is indeed considering ra/dec here.

What the solution would be in your case would be this:

def to_target(self, alert: dict) -> Target:
    target, created = Target.objects.get_or_create(name=alert['i:objectId'], type=Target.SIDEREAL)
    if created is False:
        # Decide what to do about RA/Dec differences and do it here
        target.save()
    return target

If RA/Dec are included in the kwargs for get_or_create, then the function will assume it doesn't exist due to the differences. Does that make sense? This is actually a way more convenient solution than what I was considering, which was to make the user confirm that they either want to merge targets or create a new one.

@JulienPeloton
Copy link
Contributor

Thanks for the snippet, it was very useful to debug! One thing was missing for my case. When

target, created = Target.objects.get_or_create(name=alert['i:objectId'], type=Target.SIDEREAL)

is called on a brand new object, it creates a target in the database with name as single property. But then one should make sure to manually add ra/dec (or any other properties) before saving for good. For reference, I slightly modified runbrokerquery.py such that it can deal with alerts with the same name:

from tom_alerts.models import BrokerQuery
from tom_alerts.alerts import get_service_class, GenericAlert
from tom_targets.models import Target

from time import sleep
import sys

class Command(BaseCommand):
    help = 'Runs saved alert queries and saves the results as Targets'

    def add_arguments(self, parser):
        parser.add_argument(
            'query_name',
            help='One or more saved queries to run'
        )

    def to_target_from_generic(self, alert: GenericAlert) -> Target:
        """ Redirect query result to a Target using GenericAlert attributes

        We currently just use the name, position (ra, dec), and type.

        Parameters
        ----------
        alert: dict
            GenericAlert instance

        """
        target, created = Target.objects.get_or_create(
            name=alert.name
        )
        if created:
            target.ra = alert.ra
            target.dec = alert.dec
            target.type = 'SIDEREAL'
        else:
            # you could check if target.ra & alert.ra match etc.
            # and create a new alert is needed
            pass
        return target, created

    def handle(self, *args, **options):
        try:
            query = BrokerQuery.objects.get(name=options['query_name'])
            broker_class = get_service_class(query.broker)
            broker = broker_class()
            alerts = broker.fetch_alerts(query.parameters)
            ntarget = 0
            while True:
                try:
                    generic_alert = broker.to_generic_alert(next(alerts))

                    # Get or create a new target
                    target, created = self.to_target_from_generic(generic_alert)
                    if created:
                        target.save()
                        self.stdout.write('Created target: {}'.format(target))
                        ntarget += 1
                    else:
                        self.stdout.write('Target with the name {} already created - skipping'.format(generic_alert.name))
                except StopIteration:
                    self.stdout.write('Finished creating {} target(s)'.format(ntarget))
                    sys.exit()
                sleep(1)
        except KeyboardInterrupt:
            self.stdout.write('Exiting...')

@dmcollom
Copy link
Contributor Author

dmcollom commented May 3, 2021

Glad you were able to get it to work! We'll have to address this in the base toolkit at some point, as well.

@jchate6 jchate6 moved this to Backlog in TOM Toolkit Oct 6, 2022
@jchate6 jchate6 moved this from Backlog to Icebox in TOM Toolkit Feb 28, 2023
@jchate6 jchate6 added the Data Services Data Services label Jan 20, 2024
@jchate6
Copy link
Contributor

jchate6 commented Jan 20, 2024

I think this has been handled in more recent efforts in name matching.
We don't update anything, but maybe we should.

@jchate6 jchate6 moved this from Icebox to Triage in TOM Toolkit Jan 20, 2024
@jchate6
Copy link
Contributor

jchate6 commented Jan 24, 2024

Linked to #640

@jchate6
Copy link
Contributor

jchate6 commented May 30, 2024

This should be fixed with the addition of the match managers.
The current state of brokers will not allow a target with the same name (as per a fuzzy match) being created twice.
It will not update the target or anything with new data from the broker, but it will also not crash, and give the user the appropriate feedback.

It should fully resolved by the Custom Match Manager functionality in #885 .
Once that branch is merged a user can define for themselves when two targets should be considered the same, and that will be the block instead on just a fuzzy match for the name.

@jchate6 jchate6 closed this as completed May 30, 2024
@github-project-automation github-project-automation bot moved this from Staged to Closed in TOM Toolkit May 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Data Services Data Services
Projects
Archived in project
Development

No branches or pull requests

4 participants