CE-136 JSON export download logging #3178
base: develop
Conversation
Kudos, SonarCloud Quality Gate passed! 0 Bugs. No Coverage information.
This is a good start. I think the overall flow of event -> kernel message -> logging -> log destination will work fine.
We could also create a dedicated kernel channel for logging, like we do for job messaging, but I don't really think that's necessary for a starting point.
Also noting that I'd like to see the rather ancient KBError / KBFail messages make use of something like this, but that's out of scope for the prototype.
I guess my only concerns are:
1. Where do the logs go?
2. What format should the logs be?
3. Do we keep a parallel log on the file system / make that an option for running in dev mode?
4. Can we unify how all the logs are sent and processed so we can combine some older stuff and remove some cruft?
I know that expands the scope of this, so we can probably tackle all of that piecemeal. Once we can answer 1 and 2 with some devops input, we can satisfy some metric needs and build up the rest.
 * of which was that returned by the last statement
 * in the Python code.
 */
async function executePython(code) {
This seems reasonable. I think there's some other error catching / wait-until-kernel-is-alive code elsewhere that we can either repurpose or just make use of.
Ok, I'll hunt for that.
# contains multiple objects in sequence.
ui_log = get_logger("narrative_ui_json")
log_id = str(uuid.uuid4())
if not _has_handler_type(ui_log, logging.FileHandler):
I think before going too far down this rabbit hole, we should talk to devops about what's the most reasonable logging format for their use. Though some local file logging would be good, too.
Also, file handling / formatting should be a more global thing for all logging.
Yeah, I just wanted to make sure that this prototyping work actually produced a JSON log in a minimal fashion with the simplest implementation -- a file in the same place we are currently logging (/tmp). And I agree it should be global. I figured, get it working in this case, then later switch over existing logging entry points to use it too.
Logging to a file in a mounted directory and having an external process grab those logs is surely still on the table, too. I don't think that's a good solution for Narrative logging, but it should be considered, as it is a traditional logging technique.
# If logs are combined, we need to tie log entries to
# a specific version of a service in a specific environment.
+1
Exactly -- I think that informs a lot of the rest. E.g. if we use a 3rd party logging service, the where is pretty much solved (unless we decide to hide it behind a service!), the format will be at least constrained, the metadata will be taken care of.
I don't think in general we want to duplicate the logs. There is an argument for logging to file + service if the files are actually preserved, to serve as a backup in case the service is down; we could then backfill the logging service. But that might be a "nice to have".

For development we could use a container to receive logs -- if we use a 3rd party logging service, it should be easy to stand up such a container. That complicates development slightly but simplifies the Narrative code + configuration, and logging services should have nice tools for accessing the logs, filtering them, etc.

Then again, Python logging allows one to retarget logs easily, though we'd need some sort of configuration to be able to switch targets.
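The retargeting mentioned above is well supported by `logging.config.dictConfig`. A hedged sketch, reusing the `narrative_ui_json` logger name from the diff and substituting a console handler as a hypothetical development target -- in production the same key could map to a file or network handler instead:

```python
import logging
import logging.config

# Hypothetical configuration: only the "ui_json" handler entry would need
# to change to retarget the logs (e.g. to logging.FileHandler in prod).
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "ui_json": {
            "class": "logging.StreamHandler",
        },
    },
    "loggers": {
        "narrative_ui_json": {
            "handlers": ["ui_json"],
            "level": "INFO",
        },
    },
}

logging.config.dictConfig(LOGGING_CONFIG)
```

The emitting code never changes; it just calls `logging.getLogger("narrative_ui_json")` and logs, while the destination is decided entirely by configuration.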
It surely is sensible to have all logging be consistent, but we may want to limit the initial implementation to these logs, as it only affects a single activity within the Narrative -- logging from the JSON download, which is not a common user action.
Description of PR purpose/changes
This is a set of speculative changes to support logging of JSON downloads from the Narrative front end. This is primarily for discussion and exposition of one way to implement this in a manner quite similar to how front end logging has been implemented in the past.
The overall effort is to be able to log object JSON downloads, as triggered by the Narrative front end via the data panel's export functionality. These log entries should be available in a timely manner to KBase staff. In our specific case they are needed as part of Narrative object usage statistics.
The logging works by sending Python code to the back end for creating and emitting a log entry in JSON format via the standard Python logging API.
The log entries are currently appended to /tmp/kbase-narrative-ui.json. This is the same location (/tmp) but a different file name and format than the current logging mechanism uses.
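As a rough illustration of the approach (not the PR's exact code), a formatter that emits one JSON object per line plugs directly into the standard Python logging API; the field names below are my assumption, not the PR's actual format:

```python
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "time": self.formatTime(record),
            "logger": record.name,
            "level": record.levelname,
            "event": record.getMessage(),
        }
        return json.dumps(entry)


# Usage sketch: attach to a FileHandler pointed at the log file
# (e.g. /tmp/kbase-narrative-ui.json); consumers then parse the
# file one line at a time with json.loads().
```

One-object-per-line ("JSON Lines") keeps the file appendable and trivially parseable even when the file contains many records, which matches the "multiple objects in sequence" comment in the diff.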
I've separated most of the code so that current logging works unimpeded, both in the JS front end and Python back end code.
After getting the initial functionality working without too much trouble, I invested a couple hours trying to get more context into the log. One such piece of information is the narrative object reference. I tried but failed to get the current narrative version (e.g. after it obtains a new version when saved), but left that code in place anyway.
The log format is not something I put a tremendous amount of thought or any research into. At a minimum I want to satisfy our immediate needs - the time, username, and downloaded object ref. Being straightforward to parse is another important feature, so JSON. There is additional structure to make the log entries a bit more robust in the presence of other types of log entries, but clearly it is just a starting point.
Ultimately, we may want to use a different logging handler to send the log entries to a network endpoint for ingestion into a service and/or database.
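The standard library already ships handlers for this. For instance, swapping the file handler for `logging.handlers.HTTPHandler` would POST the same records to a collection endpoint; the host and URL here are purely hypothetical:

```python
import logging
import logging.handlers

# Hypothetical ingestion endpoint; nothing is sent over the network
# until a record is actually logged through this handler.
http_handler = logging.handlers.HTTPHandler(
    host="logs.example.org:443",
    url="/ingest",
    method="POST",
    secure=True,
)

ui_log = logging.getLogger("narrative_ui_json")
ui_log.addHandler(http_handler)
```

Because the log-emitting code only talks to the logger, the file-based and network-based destinations can coexist or be swapped without touching the Narrative front end.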
Jira Ticket / Issue
Related Jira ticket: https://kbase-jira.atlassian.net/browse/DATAUP-X
DATAUP-69 Adds a PR template
Testing Instructions
Dev Checklist:
Updating Version and Release Notes (if applicable)