-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CT-1685: Restore certain aspects of legacy logging behavior important… #6443
Conversation
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide. |
# The default event manager will not log anything, but some tests run code that | ||
# generates events, without configuring the event manager, so we create an empty | ||
# manager here until there is a better testing strategy in place. | ||
EVENT_MANAGER: EventManager = EventManager() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed a functional regression related to the fact that EVENT_MANAGER
does not have any logger configured, prior to setup_event_logger
being called.
If dbt runs into configuration errors during initialization, they are not surfaced to the user.
See #6434
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I was worried about that. I'll address that today with a separate PR. There is a strange issue where file logging is not set up until further into the initialization process. Do you think it would be acceptable to set up only stdout logging until the log file path can be determined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would that work in dbt-rpc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO only stdout logging is acceptable for now. My sense is, we really only expect these logs to fire if something goes wrong during initialization. It's okay to have some hardcoded, generic, unconfigurable behavior that just ensures we're showing the error to the user.
We'll need to figure out the story for dbt-server
, since I think the initial plan is to tail the logs/dbt.log
file. If we think the longer-term answer is registering a custom callback, that might solve our problem.
As far as dbt-rpc, all we need is ongoing compatibility with the legacy logger for v1.4. The way I understand the code in this PR, if the ENABLE_LEGACY_LOGGER
env var is enabled, we'll immediately flip on the legacy logger, without waiting for the call to setup_event_logger
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jtcohen6 Your understanding is correct w.r.t. dbt-rpc
From speaking to @racheldaniel, I have a fairly good idea how dbt-server is using logging now. They do their own configuration of the EVENT_MANAGER before running tasks, so I feel confident that we have a good story for them.
I'm going to prepare a PR to address #6434, now. I believe that will restore all of the important logging behaviors we had in 1.3, and I have to apologize for not considering how important it was to get something set up before the call to setup_event_logger() is made.
With all that said, there are some interesting things to note about the file logging behavior of the legacy logger. I found earlier this week that it starts recording messages after it is set up, which is immediately dbt starts. It then writes the recorded messages to the log file (if any) when the correct file path has been determined, and stops recording. My hope is that in 1.5 we can find a way to start all logging, both console and file, immediately when dbt starts. Does that seem feasible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to apologize for not considering how important it was to get something set up before the call to setup_event_logger() is made.
I'm glad we caught it early! I think the failure here was lack of sufficient inline code comment around the event system initialization, to scare you away from what was a perfectly sensible refactor. "We need to configure these defaults here, even though it looks like we shouldn't, because some log events might arrive before we have the real inputs for proper configuration — namely, if we encounter an early exception."
I found earlier this week that it starts recording messages after it is set up, which is immediately dbt starts. It then writes the recorded messages to the log file (if any) when the correct file path has been determined, and stops recording.
IIRC, it does this by storing logs in a "buffer," and then "replaying" the buffer once the proper log path has been determined. A little too magical for my taste buds, but this might be an acceptable pattern to emulate.
Lines 417 to 421 in 5e4e917
def _replay_buffered(self): | |
assert self._msg_buffer is not None, "_msg_buffer should never be None in _replay_buffered" | |
for record in self._msg_buffer: | |
super().emit(record) | |
self._msg_buffer = None |
My hope is that in 1.5 we can find a way to start all logging, both console and file, immediately when dbt starts. Does that seem feasible?
Maybe! We could roll our own similar version of a stateful "buffer"? Or split log-path
out of dbt_project.yml
, so that we can know where we're logging without having to parse any config/files? (Feels related: #6207)
FWIW in the happy path, when config initialization succeeds (99% of the time), I don't think we're firing any log events before setup_event_logger
is called.
resolves #6437
Description
These changes restore the behavior described in the associated issue. The legacy (logbook) logger is now active by default, and the appropriate log level is applied to ensure the events are seen by dbt-rpc.
Checklist
changie new
to create a changelog entry