CE-136 JSON export download logging #3178
base: develop
Conversation
Kudos, SonarCloud Quality Gate passed! 0 Bugs. No Coverage information.
This is a good start. I think the overall flow of event -> kernel message -> logging -> log destination will work fine.
We could also create a dedicated kernel channel for logging, like we do for job messaging, but I don't really think that's necessary for a starting point.
Also noting that I'd like to see the rather ancient KBError / KBFail messages make use of something like this, but that's out of scope for the prototype.
I guess my only concerns are:
1. Where do the logs go?
2. What format should the logs be?
3. Do we keep a parallel log on the file system / make that an option for running in dev mode?
4. Can we unify how all the logs are sent and processed so we can combine some older stuff and remove some cruft?
I know that expands the scope of this, so we can probably tackle all of that piecemeal. Once we can answer 1 and 2 with some devops input, we can satisfy some metric needs and build up the rest.
 * of which was that returned by the last statement
 * in the Python code.
 */
async function executePython(code) {
This seems reasonable. I think there's some other error catching / wait-until-kernel-is-alive code elsewhere that we can either repurpose or just make use of.
Ok, I'll hunt for that.
# contains multiple objects in sequence.
ui_log = get_logger("narrative_ui_json")
log_id = str(uuid.uuid4())
if not _has_handler_type(ui_log, logging.FileHandler):
I think before going too far down this rabbit hole, we should talk to devops about what's the most reasonable logging format for their use. Though some local file logging would be good, too.
Also, file handling / formatting should be a more global thing for all logging.
Yeah, I just wanted to make sure that this prototyping work actually produced a JSON log in a minimal fashion with the simplest implementation -- a file in the same place we are currently logging (/tmp). And I agree it should be global. I figured, get it working in this case, then later switch over existing logging entry points to use it too.
Logging to a file in a mounted directory and having an external process grab those logs is surely still on the table, too. I don't think that's a good solution for Narrative logging, but it should be considered, as it is a traditional logging technique.
# If logs are combined, we need to tie log entries to
# a specific version of a service in a specific environment.
+1
Exactly -- I think that informs a lot of the rest. E.g. if we use a 3rd party logging service, the where is pretty much solved (unless we decide to hide it behind a service!), the format will be at least constrained, the metadata will be taken care of.
I don't think in general we want to duplicate the logs. There is an argument for logging to file + service if the files are actually preserved, to serve as a backup in case the service is down; we could then backfill the logging service. But that might be a "nice to have".

For development we could use a container to receive logs -- if we use a 3rd party logging service, it should be easy to stand up such a container. That complicates development slightly but simplifies the Narrative code + configuration, and logging services should have nice tools for accessing the logs, filtering them, etc.

Then again, Python logging allows one to retarget logs easily, though we'd need some sort of configuration to be able to switch targets.
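The retargeting mentioned above is well supported by `logging.config.dictConfig`. A hedged sketch, reusing the `narrative_ui_json` logger name from the diff and substituting a console handler as a hypothetical development target -- in production the same key could map to a file or network handler instead:

```python
import logging
import logging.config

# Hypothetical configuration: only the "ui_json" handler entry would need
# to change to retarget the logs (e.g. to logging.FileHandler in prod).
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "ui_json": {
            "class": "logging.StreamHandler",
        },
    },
    "loggers": {
        "narrative_ui_json": {
            "handlers": ["ui_json"],
            "level": "INFO",
        },
    },
}

logging.config.dictConfig(LOGGING_CONFIG)
```

The emitting code never changes; it just calls `logging.getLogger("narrative_ui_json")` and logs, while the destination is decided entirely by configuration.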
It surely is sensible to have all logging be consistent, but we may want to limit the initial implementation to these logs, as it only affects a single activity within the Narrative -- logging from the JSON download, which is not a common user action.
Description of PR purpose/changes
This is a set of speculative changes to support logging of JSON downloads from the Narrative front end. This is primarily for discussion and exposition of one way to implement this in a manner quite similar to how front end logging has been implemented in the past.
The overall effort is to be able to log object JSON downloads, as triggered by the Narrative front end via the data panel's export functionality. These log entries should be available in a timely manner to KBase staff. In our specific case they are needed as part of Narrative object usage statistics.
The logging works by sending Python code to the back end for creating and emitting a log entry in JSON format via the standard Python logging API.
The log entries are currently appended to /tmp/kbase-narrative-ui.json. This is the same location (/tmp) but a different file name and format than the current logging mechanism uses.
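As a rough illustration of the approach (not the PR's exact code), a formatter that emits one JSON object per line plugs directly into the standard Python logging API; the field names below are my assumption, not the PR's actual format:

```python
import json
import logging


class JsonLineFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "time": self.formatTime(record),
            "logger": record.name,
            "level": record.levelname,
            "event": record.getMessage(),
        }
        return json.dumps(entry)


# Usage sketch: attach to a FileHandler pointed at the log file
# (e.g. /tmp/kbase-narrative-ui.json); consumers then parse the
# file one line at a time with json.loads().
```

One-object-per-line ("JSON Lines") keeps the file appendable and trivially parseable even when the file contains many records, which matches the "multiple objects in sequence" comment in the diff.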
I've separated most of the code so that current logging works unimpeded, both in the JS front end and Python back end code.
After getting the initial functionality working without too much trouble, I invested a couple hours trying to get more context into the log. One such piece of information is the narrative object reference. I tried but failed to get the current narrative version (e.g. after it obtains a new version when saved), but left that code in place anyway.
The log format is not something I put a tremendous amount of thought or any research into. At a minimum I want to satisfy our immediate needs - the time, username, and downloaded object ref. Being straightforward to parse is another important feature, so JSON. There is additional structure to make the log entries a bit more robust in the presence of other types of log entries, but clearly it is just a starting point.
Ultimately, we may want to use a different logging handler to send the log entries to a network endpoint for ingestion into a service and/or database.
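The standard library already ships handlers for this. For instance, swapping the file handler for `logging.handlers.HTTPHandler` would POST the same records to a collection endpoint; the host and URL here are purely hypothetical:

```python
import logging
import logging.handlers

# Hypothetical ingestion endpoint; nothing is sent over the network
# until a record is actually logged through this handler.
http_handler = logging.handlers.HTTPHandler(
    host="logs.example.org:443",
    url="/ingest",
    method="POST",
    secure=True,
)

ui_log = logging.getLogger("narrative_ui_json")
ui_log.addHandler(http_handler)
```

Because the log-emitting code only talks to the logger, the file-based and network-based destinations can coexist or be swapped without touching the Narrative front end.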
Jira Ticket / Issue
Related Jira ticket: https://kbase-jira.atlassian.net/browse/DATAUP-X
DATAUP-69 Adds a PR template
Testing Instructions
Dev Checklist:
Updating Version and Release Notes (if applicable)