At the moment I have the raw screener/ingestor set up to only attempt to reprocess files that have failed screening fewer than three times. After that they stop getting re-queued.
However, some files never even make it past ingestion because they are missing a header key (usually the "FIELD" key), which trips up the header parsing utility.
I think we could either pass these files off to a junk table, or set up the header parsing to assign a default value to certain keys, e.g. a file missing a "FIELD" key would just get assigned `hdr["FIELD"] = "UNKNOWN"` (see the sketch below).
At the moment these files will just be perpetually cycling through the screener.
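A minimal sketch of the default-value idea, assuming the header is parsed into a dict-like object; the required-key set, function name, and defaults here are illustrative, not the actual parser API:

```python
# Illustrative only: keys listed in DEFAULTS get a fallback value instead of
# raising when missing; any other missing required key still errors out.
DEFAULTS = {"FIELD": "UNKNOWN"}
REQUIRED_KEYS = ("FIELD", "DATE-OBS", "EXPTIME")  # hypothetical required set

def parse_header(hdr):
    """Return the required keys from hdr, filling in defaults where allowed."""
    parsed = {}
    for key in REQUIRED_KEYS:
        if key in hdr:
            parsed[key] = hdr[key]
        elif key in DEFAULTS:
            parsed[key] = DEFAULTS[key]  # e.g. missing FIELD -> "UNKNOWN"
        else:
            raise KeyError(f"Required header key {key!r} is missing")
    return parsed
```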
Why three times? I think failing once should be enough.
Files with a missing FIELD key are basically useless to us. I agree that an `ingest_failed` collection may be a good approach. Maybe we could also store the error string, or even the traceback, in this table.
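For example, a failure record could look something like this. This is a sketch only, assuming a Mongo-style collection; the field names and helper name are made up:

```python
import traceback
from datetime import datetime, timezone

def record_ingest_failure(collection, filename, exc):
    """Store the error string and traceback alongside the filename so
    failures can be inspected later (illustrative field names)."""
    collection.insert_one({
        "filename": filename,
        "error": str(exc),
        "traceback": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
        "failed_at": datetime.now(timezone.utc),
    })
```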
Three times is mostly for the WCS solve: if the solve times out, it should store in the header which index files it has already been compared against, and the next time it tries to solve for WCS it should pick up from where the previous run left off (rough sketch below). We could just set a really long timeout, but I think this way it will quickly try to solve each file and, if it times out, come back to those files once it has processed all the new un-ingested files.
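A rough sketch of the resume-from-where-we-left-off idea. It assumes the solver can report which index files it has been compared against and can be told to skip a set of indexes; the header key, `solver` interface, and result attributes are all hypothetical:

```python
TRIED_KEY = "WCSIDXS"  # hypothetical header key holding the tried index files

def solve_with_resume(filename, hdr, solver, timeout=60):
    """Attempt a WCS solve, skipping index files tried on previous runs."""
    already_tried = set(hdr.get(TRIED_KEY, "").split(",")) - {""}
    result = solver(filename, skip_indexes=already_tried, timeout=timeout)
    if not result.solved:
        # Record every index file compared so far, so the next attempt
        # (after the new files are processed) does not repeat the work.
        tried = already_tried | set(result.tried_indexes)
        hdr[TRIED_KEY] = ",".join(sorted(tried))
    return result
```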
A lot of the files missing the "FIELD" key are actually just darks, so we don't really need the FIELD key, but the header parser raises an error because we have set it as a required column. We could just delete these old darks, as I assume they were taken "manually" rather than via pocs. However, at the moment the screener will just perpetually attempt to ingest these files, so we will be perpetually wasting a few CPU cores on them. There are also some files that fail because of an "invalid datetime format"; I think they also fail ingestion and end up in ingestion purgatory. For now I will just punt these files into a junk table rather than fiddle with the header parsing (sketch below).
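How the punt-to-junk-table path could look, assuming the same Mongo-style collection as above; `parse_header`, `ingest`, and the collection name stand in for the real pipeline steps:

```python
def screen_file(filename, parse_header, ingest, junk_collection):
    """Route unparseable files to a junk table instead of re-queuing forever.

    Any header-parsing error (missing FIELD key, invalid datetime format, ...)
    sends the file to the junk collection; everything else is ingested.
    """
    try:
        hdr = parse_header(filename)
    except Exception as exc:  # e.g. KeyError("FIELD") or a datetime ValueError
        junk_collection.insert_one({"filename": filename, "error": str(exc)})
        return None
    return ingest(filename, hdr)
```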
The current implementation of the FileIngestor keeps track of which files did not succeed and avoids re-queuing them for processing. However, it does not currently remember these files when the service is restarted.
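One way the bookkeeping could survive a restart, sketched with a JSON file on disk purely for illustration; the class name and path are made up, and the same counts could equally live in the `ingest_failed` collection instead:

```python
import json
from pathlib import Path

class FailedFileTracker:
    """Persist the ingestor's per-file failure counts across restarts."""

    def __init__(self, path="failed_files.json", max_attempts=3):
        self._path = Path(path)
        self._max_attempts = max_attempts
        # Reload previous failure counts if the service has run before.
        self._counts = (
            json.loads(self._path.read_text()) if self._path.exists() else {}
        )

    def record_failure(self, filename):
        self._counts[filename] = self._counts.get(filename, 0) + 1
        self._path.write_text(json.dumps(self._counts))

    def should_requeue(self, filename):
        return self._counts.get(filename, 0) < self._max_attempts
```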