Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature COMMANDLOG to record slow execution and large request/reply #1294

Merged
merged 26 commits into from
Jan 24, 2025

Conversation

soloestoy
Copy link
Member

@soloestoy soloestoy commented Nov 12, 2024

As discussed in PR #336.

We have different types of resources like CPU, memory, network, etc. The slowlog can only record commands eat lots of CPU during the processing phase (doesn't include read/write network time), but can not record commands eat too many memory and network. For example:

  1. run "SET key value(10 megabytes)" command would not be recored in slowlog, since when processing it the SET command only insert the value's pointer into db dict. But that command eats huge memory in query buffer and bandwidth from network. In this case, just 1000 tps can cause 10GB/s network flow.
  2. run "GET key" command and the key's value length is 10 megabytes. The get command can eat huge memory in output buffer and bandwidth to network.

This PR introduces a new command COMMANDLOG, to log commands that consume significant network bandwidth, including both input and output. Users can retrieve the results using COMMANDLOG get <count> large-request and COMMANDLOG get <count> large-reply, all subcommands for COMMANDLOG are:

  • COMMANDLOG HELP
  • COMMANDLOG GET <count> <slow|large-request|large-reply>
  • COMMANDLOG LEN <slow|large-request|large-reply>
  • COMMANDLOG RESET <slow|large-request|large-reply>

And the slowlog is also incorporated into the commandlog.

For each of these three types, additional configs have been added for control:

  • commandlog-request-larger-than and commandlog-large-request-max-len represent the threshold for large requests(the unit is Bytes) and the maximum number of commands that can be recorded.
  • commandlog-reply-larger-than and commandlog-large-reply-max-len represent the threshold for large replies(the unit is Bytes) and the maximum number of commands that can be recorded.
  • commandlog-execution-slower-than and commandlog-slow-execution-max-len represent the threshold for slow executions(the unit is microseconds) and the maximum number of commands that can be recorded.
    • Additionally, slowlog-log-slower-than and slowlog-max-len are now set as aliases for these two new configs.

Copy link

codecov bot commented Nov 12, 2024

Codecov Report

Attention: Patch coverage is 93.75000% with 8 lines in your changes missing coverage. Please review.

Project coverage is 70.93%. Comparing base (3032ccd) to head (63d1d88).
Report is 5 commits behind head on unstable.

Files with missing lines Patch % Lines
src/commandlog.c 94.87% 6 Missing ⚠️
src/module.c 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #1294      +/-   ##
============================================
+ Coverage     70.81%   70.93%   +0.12%     
============================================
  Files           121      121              
  Lines         65132    65162      +30     
============================================
+ Hits          46121    46225     +104     
+ Misses        19011    18937      -74     
Files with missing lines Coverage Δ
src/blocked.c 91.38% <100.00%> (ø)
src/commands.def 100.00% <ø> (ø)
src/config.c 78.39% <ø> (ø)
src/latency.c 80.92% <100.00%> (ø)
src/multi.c 97.30% <ø> (ø)
src/networking.c 88.83% <ø> (+0.20%) ⬆️
src/server.c 87.60% <100.00%> (-0.03%) ⬇️
src/server.h 100.00% <ø> (ø)
src/module.c 9.59% <0.00%> (-0.01%) ⬇️
src/commandlog.c 94.87% <94.87%> (ø)

... and 9 files with indirect coverage changes

@hwware
Copy link
Member

hwware commented Nov 15, 2024

Just for Note: Let's first refresh the memory in pr #336, the last comment conclusion is:

After a core team meeting, we decided adding a new command COMMANDLOG with subcommands HEAVYTRAFFIC and SLOW, and then slowlog.c can be renamed to a common commandlog.c. #336 (comment)

@hwware
Copy link
Member

hwware commented Nov 15, 2024

From my understanding, for the command: commandlog get should be:
COMMANDLOG get count slow | heavytraffic-input | heavytraffic-output.

commandlog len should be:
COMMANDLOG len slow | heavytraffic-input | heavytraffic-output.

commandlog reset should be:
COMMANDLOG reset slow | heavytraffic-input | heavytraffic-output.

Can you describe the terms heavytraffic-input and heavytraffic-output in the json file (argument part) because I can only know
them from the source codes and valkey.conf so far?

And I think the existing slowlog commands should be deprecated? If yes, I think you should update the related json files as well.

valkey.conf Outdated Show resolved Hide resolved
valkey.conf Outdated Show resolved Hide resolved
src/commandlog.c Outdated Show resolved Hide resolved
Signed-off-by: zhaozhao.zz <[email protected]>
@soloestoy soloestoy changed the title Feature COMMANDLOG to record slow and heavy traffic Feature COMMANDLOG to record slow execution and large request/reply Nov 20, 2024
Copy link
Member

@hwware hwware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, approved

src/config.c Outdated Show resolved Hide resolved
src/config.c Outdated Show resolved Hide resolved
valkey.conf Outdated Show resolved Hide resolved
src/server.c Outdated Show resolved Hide resolved
@zuiderkwast zuiderkwast added release-notes This issue should get a line item in the release notes major-decision-pending Major decision pending by TSC team labels Dec 6, 2024
@zuiderkwast zuiderkwast added the needs-doc-pr This change needs to update a documentation page. Remove label once doc PR is open. label Dec 6, 2024
@zuiderkwast
Copy link
Contributor

@valkey-io/core-team Let's make a TSC decision on the command syntax and configs as described in the top comment.

Approve or vote 👍 or 👎.

Copy link
Member

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, still agree with high level direction.

src/commandlog.c Outdated Show resolved Hide resolved
src/server.c Outdated Show resolved Hide resolved
valkey.conf Outdated Show resolved Hide resolved
src/server.h Outdated Show resolved Hide resolved
tests/unit/commandlog.tcl Show resolved Hide resolved
src/commandlog.c Outdated Show resolved Hide resolved
valkey.conf Outdated Show resolved Hide resolved
soloestoy and others added 6 commits January 13, 2025 15:46
Copy link
Member

@enjoy-binbin enjoy-binbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall LGTM, didn't review the code carefully in depth though.

@soloestoy soloestoy added major-decision-approved Major decision approved by TSC team and removed major-decision-pending Major decision pending by TSC team labels Jan 16, 2025
src/commandlog.c Outdated Show resolved Hide resolved
src/server.c Outdated Show resolved Hide resolved
src/commandlog.c Outdated Show resolved Hide resolved
src/commandlog.c Outdated Show resolved Hide resolved
src/server.h Show resolved Hide resolved
valkey.conf Outdated Show resolved Hide resolved
valkey.conf Outdated Show resolved Hide resolved
valkey.conf Outdated Show resolved Hide resolved
valkey.conf Outdated Show resolved Hide resolved
src/commandlog.c Show resolved Hide resolved
soloestoy and others added 5 commits January 21, 2025 19:43
Signed-off-by: zhaozhao.zz <[email protected]>
Co-authored-by: Ping Xie <[email protected]>
Signed-off-by: zhaozhao.zz <[email protected]>
Co-authored-by: Ping Xie <[email protected]>
Signed-off-by: zhaozhao.zz <[email protected]>
Co-authored-by: Ping Xie <[email protected]>
Signed-off-by: zhaozhao.zz <[email protected]>
Co-authored-by: Ping Xie <[email protected]>
Signed-off-by: zhaozhao.zz <[email protected]>
Copy link
Member

@PingXie PingXie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @soloestoy!

src/commandlog.c Show resolved Hide resolved
src/commands/slowlog.json Show resolved Hide resolved
@soloestoy soloestoy merged commit 3f21705 into valkey-io:unstable Jan 24, 2025
51 checks passed
@soloestoy
Copy link
Member Author

Merged! Thanks every one, the next work is adding "not-recommended" for command's spec.

kronwerk pushed a commit to kronwerk/valkey that referenced this pull request Jan 27, 2025
…alkey-io#1294)

As discussed in PR valkey-io#336.

We have different types of resources like CPU, memory, network, etc. The
`slowlog` can only record commands eat lots of CPU during the processing
phase (doesn't include read/write network time), but can not record
commands eat too many memory and network. For example:

1. run "SET key value(10 megabytes)" command would not be recored in
slowlog, since when processing it the SET command only insert the
value's pointer into db dict. But that command eats huge memory in query
buffer and bandwidth from network. In this case, just 1000 tps can cause
10GB/s network flow.
2. run "GET key" command and the key's value length is 10 megabytes. The
get command can eat huge memory in output buffer and bandwidth to
network.

This PR introduces a new command `COMMANDLOG`, to log commands that
consume significant network bandwidth, including both input and output.
Users can retrieve the results using `COMMANDLOG get <count>
large-request` and `COMMANDLOG get <count> large-reply`, all subcommands
for `COMMANDLOG` are:

* `COMMANDLOG HELP`
* `COMMANDLOG GET <count> <slow|large-request|large-reply>`
* `COMMANDLOG LEN <slow|large-request|large-reply>`
* `COMMANDLOG RESET <slow|large-request|large-reply>`

And the slowlog is also incorporated into the commandlog.

For each of these three types, additional configs have been added for
control:

* `commandlog-request-larger-than` and
`commandlog-large-request-max-len` represent the threshold for large
requests(the unit is Bytes) and the maximum number of commands that can
be recorded.
* `commandlog-reply-larger-than` and `commandlog-large-reply-max-len`
represent the threshold for large replies(the unit is Bytes) and the
maximum number of commands that can be recorded.
* `commandlog-execution-slower-than` and
`commandlog-slow-execution-max-len` represent the threshold for slow
executions(the unit is microseconds) and the maximum number of commands
that can be recorded.
* Additionally, `slowlog-log-slower-than` and `slowlog-max-len` are now
set as aliases for these two new configs.

---------

Signed-off-by: zhaozhao.zz <[email protected]>
Co-authored-by: Madelyn Olson <[email protected]>
Co-authored-by: Ping Xie <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
major-decision-approved Major decision approved by TSC team needs-doc-pr This change needs to update a documentation page. Remove label once doc PR is open. release-notes This issue should get a line item in the release notes
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

7 participants