feat: auto heap dump by default if MALLOC_CONF=prof:true #12186

Merged: 4 commits into main from eric/auto_dump_if_prof_enabled, Sep 11, 2023

Conversation

fuyufjh
Member

@fuyufjh fuyufjh commented Sep 8, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

As title.

By default, as long as MALLOC_CONF=prof:true is set, a heap dump is produced automatically. Otherwise, the following log line is printed:

Cannot dump heap profile because Jemalloc prof is not enabled

This makes it easier to use auto heap dump and also facilitates https://linear.app/risingwave-labs/issue/CLOUD-1791/feature-enable-auto-heap-dump.

The default dump path is inherited from prof.prefix in MALLOC_CONF, but you may override it with the option server.auto_dump_heap_profile.dir. For example:

[server.auto_dump_heap_profile]
enabled = true
dir = "/my/dump/path"
threshold = 0.9
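
A minimal sketch of how these options might interact, not the code from this PR. The struct `AutoDumpHeapProfileConfig` and the helpers `effective_dump_path` and `exceeds_threshold` are hypothetical names. It also assumes that an empty `dir` means "not set" and that `threshold` is a fraction of total memory; neither is confirmed by the description above.

```rust
/// Hypothetical mirror of the `[server.auto_dump_heap_profile]` section.
#[derive(Debug, Clone)]
struct AutoDumpHeapProfileConfig {
    enabled: bool,
    /// Assumption: an empty string means "not set".
    dir: String,
    /// Assumption: fraction of total memory that triggers a dump.
    threshold: f64,
}

/// An explicit `dir` wins; otherwise fall back to jemalloc's prof.prefix.
fn effective_dump_path(cfg: &AutoDumpHeapProfileConfig, prof_prefix: &str) -> String {
    if cfg.dir.is_empty() {
        prof_prefix.to_string()
    } else {
        cfg.dir.clone()
    }
}

/// True when auto dump is enabled and usage has reached the threshold.
fn exceeds_threshold(cfg: &AutoDumpHeapProfileConfig, used_bytes: u64, total_bytes: u64) -> bool {
    cfg.enabled && (used_bytes as f64) >= (total_bytes as f64) * cfg.threshold
}

fn main() {
    let cfg = AutoDumpHeapProfileConfig {
        enabled: true,
        dir: "/my/dump/path".to_string(),
        threshold: 0.9,
    };
    // "jeprof" stands in for whatever prof.prefix jemalloc reports.
    println!("dump path: {}", effective_dump_path(&cfg, "jeprof"));
    println!("dump now?  {}", exceeds_threshold(&cfg, 950, 1000));
}
```

Keeping the fallback in one small function means the caller never has to know whether the path came from the config file or from jemalloc.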

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

convert(total_memory_bytes as _),
MIN_COMPUTE_MEMORY_MB
);
}
Member Author

I don't know why MIN_COMPUTE_MEMORY_MB was not checked anywhere, so I added the check while I was here.

Contributor

@KeXiangWang KeXiangWang left a comment

LGTM!

@@ -1131,7 +1127,7 @@ pub mod default {

     pub mod auto_dump_heap_profile {
         pub fn dir() -> String {
-            "".to_string()
+            ".".to_string() // current directory
Contributor

I'd suggest putting it in ./.risingwave/profiling/auto

Member Author

"." is not perfect, but ./.risingwave is only used by risedev (for developers), which seems even worse.

Let me try to make it configurable by env var, so that kube-bench or risedev can set a proper output directory.

Contributor

@yuhao-su yuhao-su Sep 8, 2023

It's actually not just used by risedev.

create_dir_all("./.risingwave/sled").expect("should create");

But an env for the prefix will be great. It can be useful for putting all kinds of local files.

Contributor

Also, the CI failed because of this change. You need to update example.toml.

Member Author

@fuyufjh fuyufjh Sep 9, 2023

> It's actually not just used by risedev.
>
> create_dir_all("./.risingwave/sled").expect("should create");
>
> But an env for the prefix will be great. It can be useful for putting all kinds of local files.

Hmm, this is not persuasive. Memory storage is only for tests and the playground; it is never used in production.

Contributor

At least put it in a directory? Writing heap profiles straight into a root directory looks ugly to me. Also, we will have a directory for manually dumped profiles in the dashboard PR.

Anyway, I think an env var for the local file prefix would be great. Maybe we can do it in later PRs.

Member Author

I just realized that the "env var" already exists: MALLOC_CONF.

It now dumps the memory profile to prof.prefix if server.auto_dump_heap_profile.dir is absent. Please take a look. 🥰
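
For readers unfamiliar with the variable: MALLOC_CONF is a comma-separated list of `name:value` options, and in that string the prefix option is spelled `prof_prefix`. The sketch below only illustrates that format. The helper `parse_malloc_conf` and the path `/tmp/profiles/jeprof` are hypothetical, and the PR does not appear to parse the variable itself; jemalloc parses it and exposes the resulting options.

```rust
use std::collections::HashMap;

/// Hypothetical helper: split a MALLOC_CONF-style string into name/value pairs.
/// Entries without a ':' separator are skipped.
fn parse_malloc_conf(conf: &str) -> HashMap<String, String> {
    conf.split(',')
        .filter_map(|pair| pair.trim().split_once(':'))
        .map(|(name, value)| (name.trim().to_string(), value.trim().to_string()))
        .collect()
}

fn main() {
    // Example value only; the path is made up for illustration.
    let conf = "prof:true,prof_prefix:/tmp/profiles/jeprof";
    let opts = parse_malloc_conf(conf);

    let prof_enabled = opts.get("prof").map(String::as_str) == Some("true");
    let prefix = opts.get("prof_prefix").cloned().unwrap_or_default();

    println!("profiling enabled: {prof_enabled}");
    println!("dump prefix: {prefix}");
}
```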

src/compute/src/memory_management/policy.rs (resolved)
@fuyufjh fuyufjh requested a review from a team as a code owner September 9, 2023 08:55
Contributor

@yuhao-su yuhao-su left a comment

LGTM

@fuyufjh fuyufjh enabled auto-merge September 9, 2023 09:17
codecov bot commented Sep 9, 2023

Codecov Report

Merging #12186 (4556122) into main (c924c37) will decrease coverage by 0.02%.
Report is 12 commits behind head on main.
The diff coverage is 18.75%.

@@            Coverage Diff             @@
##             main   #12186      +/-   ##
==========================================
- Coverage   69.76%   69.75%   -0.02%     
==========================================
  Files        1407     1407              
  Lines      235565   235576      +11     
==========================================
- Hits       164351   164317      -34     
- Misses      71214    71259      +45     
| Flag | Coverage Δ | |
| --- | --- | --- |
| rust | 69.75% <18.75%> (-0.02%) | ⬇️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files Changed | Coverage Δ | |
| --- | --- | --- |
| ...rc/compute/src/memory_management/memory_manager.rs | 0.00% <0.00%> (ø) | |
| src/compute/src/memory_management/policy.rs | 0.00% <0.00%> (ø) | |
| src/compute/src/memory_management/mod.rs | 78.61% <37.50%> (+1.40%) | ⬆️ |
| src/common/src/config.rs | 85.63% <100.00%> (+0.49%) | ⬆️ |

... and 9 files with indirect coverage changes

@@ -1130,6 +1127,10 @@ pub mod default {
}

pub mod auto_dump_heap_profile {
pub fn enabled() -> bool {
true
Member

Where will we read the env var MALLOC_CONF? 👀

Member Author

@fuyufjh fuyufjh Sep 11, 2023

Here: https://github.com/risingwavelabs/risingwave/pull/12186/files#diff-22db760b18e3eab5a13a58cf0ef5249aa2defa83c011f73bf72e41e58bb8363bR131-R132

let prof_prefix_mib = jemalloc_prof::prefix::mib().unwrap();
let prof_prefix = prof_prefix_mib.read().unwrap();

It's actually read from the jemalloc library as one of its options.

Note that the auto-dump is enabled by default, but if jemalloc's profiling is disabled (which is also the default), nothing happens except the log line mentioned in the PR description.
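
The interplay of the two switches can be sketched as a small decision function. This is an illustration under assumptions, not the PR's implementation: the enum `AutoDumpDecision` and the function `decide` are hypothetical names, and only the log text is taken from the PR description.

```rust
/// Hypothetical outcome of evaluating the two switches.
#[derive(Debug, PartialEq)]
enum AutoDumpDecision {
    /// `server.auto_dump_heap_profile.enabled` is false.
    Disabled,
    /// Auto dump is on, but jemalloc profiling is off: only log.
    ProfNotEnabled,
    /// Both are on: a heap profile can be dumped.
    Dump,
}

/// Log line quoted in the PR description.
const PROF_DISABLED_MSG: &str = "Cannot dump heap profile because Jemalloc prof is not enabled";

fn decide(auto_dump_enabled: bool, jemalloc_prof_enabled: bool) -> AutoDumpDecision {
    match (auto_dump_enabled, jemalloc_prof_enabled) {
        (false, _) => AutoDumpDecision::Disabled,
        (true, false) => AutoDumpDecision::ProfNotEnabled,
        (true, true) => AutoDumpDecision::Dump,
    }
}

fn main() {
    // Defaults as described above: auto dump on, jemalloc profiling off.
    match decide(true, false) {
        AutoDumpDecision::ProfNotEnabled => println!("{PROF_DISABLED_MSG}"),
        other => println!("decision: {other:?}"),
    }
}
```

With both defaults in place the function lands in the log-only branch, which matches the "nothing happens except the log line" behaviour described in the comment.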

@@ -46,7 +46,11 @@ workspace-hack = { path = "../workspace-hack" }
 task_stats_alloc = { path = "../utils/task_stats_alloc" }

 [target.'cfg(unix)'.dependencies]
-tikv-jemallocator = { git = "https://github.com/yuhao-su/jemallocator.git", features = ["profiling", "stats", "unprefixed_malloc_on_supported_platforms"], rev = "a0911601bb7bb263ca55c7ea161ef308fdc623f8" }
+tikv-jemallocator = { git = "https://github.com/risingwavelabs/jemallocator.git", features = [
Member

What about making it a workspace dependency?

[workspace.dependencies]

Member Author

Let's do that in a future PR. cc @yuhao-su

@fuyufjh fuyufjh added this pull request to the merge queue Sep 11, 2023
Merged via the queue into main with commit 17e81b1 Sep 11, 2023
@fuyufjh fuyufjh deleted the eric/auto_dump_if_prof_enabled branch September 11, 2023 04:18
fuyufjh added a commit that referenced this pull request Sep 29, 2023