Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(java_binding): fix java binding TLS allocation failure #12862

Merged
merged 10 commits into from
Oct 17, 2023

Conversation

chenzl25
Copy link
Contributor

@chenzl25 chenzl25 commented Oct 16, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

nm -gUj librisingwave_java_binding.dylib

_JNI_OnLoad
_Java_com_risingwave_java_binding_Binding_iteratorClose
_Java_com_risingwave_java_binding_Binding_iteratorGetArrayValue
_Java_com_risingwave_java_binding_Binding_iteratorGetBooleanValue
...
__rjem_je_arena_bin_offsets
__rjem_je_arena_binind_div_info
__rjem_je_arena_emap_global
__rjem_je_arena_pa_central_global
__rjem_je_arenas
__rjem_je_arenas_lock

...

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@codecov
Copy link

codecov bot commented Oct 16, 2023

Codecov Report

Merging #12862 (3c037ff) into main (5d97d60) will increase coverage by 0.05%.
Report is 34 commits behind head on main.
The diff coverage is 93.73%.

@@            Coverage Diff             @@
##             main   #12862      +/-   ##
==========================================
+ Coverage   69.14%   69.19%   +0.05%     
==========================================
  Files        1487     1489       +2     
  Lines      244648   245776    +1128     
==========================================
+ Hits       169150   170055     +905     
- Misses      75498    75721     +223     
Flag Coverage Δ
rust 69.19% <93.73%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/common/heap_profiling/src/jeprof.rs 0.00% <ø> (ø)
src/common/heap_profiling/src/profiler.rs 0.00% <ø> (ø)
src/compute/src/server.rs 0.00% <ø> (ø)
src/connector/src/lib.rs 52.08% <ø> (ø)
src/connector/src/sink/encoder/json.rs 84.63% <100.00%> (ø)
src/connector/src/sink/mod.rs 62.04% <ø> (+0.72%) ⬆️
src/expr/core/src/aggregate/def.rs 69.66% <ø> (ø)
.../frontend/src/optimizer/plan_node/generic/union.rs 100.00% <100.00%> (ø)
.../frontend/src/optimizer/plan_node/logical_union.rs 97.09% <100.00%> (+0.17%) ⬆️
src/meta/src/dashboard/mod.rs 0.00% <ø> (ø)
... and 9 more

... and 29 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@StrikeW
Copy link
Contributor

StrikeW commented Oct 17, 2023

Hi @chenzl25, is this a critical issue that will block the release of v1.3? After v1.3, I think we will switch to the embedded jvm totally.

@chenzl25
Copy link
Contributor Author

chenzl25 commented Oct 17, 2023

Hi @chenzl25, is this a critical issue that will block the release of v1.3? After v1.3, I think we will switch to the embedded jvm totally.

This PR doesn't fix the issue, we need further investigations. I think it is not a critical blocker. Yes, we have switched to the embedded jvm totally.

@chenzl25 chenzl25 force-pushed the dylan/fix_java_binding_TLS_allocation_failure branch from 6da334b to 7cdf801 Compare October 17, 2023 06:23
@github-actions github-actions bot requested a review from a team as a code owner October 17, 2023 06:30
This reverts commit fe6820b.
@fuyufjh
Copy link
Member

fuyufjh commented Oct 17, 2023

Can you please explain some context in PR description?

Comment on lines 162 to 164
[features]
default = []
disable_initial_exec_tls = ["tikv-jemalloc-ctl/disable_initial_exec_tls"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this will be enabled in the binary level, so all crates will be affected. Maybe you can directly add it here if it's expected.

tikv-jemalloc-ctl = { git = "https://github.com/risingwavelabs/jemallocator.git", rev = "64a2d9" }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

disable_initial_exec_tls will only be enabled when build java binding, will it still affect rw binary?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will only be enabled when build java binding

This isn't true, as it's unified by hakari. You need to exclude tikv-jemalloc-ctl or risingwave_java_binding completely from hakari if you want to do so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to enable the disable_initial_exec_tls feature only when we build the Java dynamic library. In other cases, the default feature is ok.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little explanation: hakari tries to help you use the same feature sets regardless of what pkg you use in cargo build -p <pkg>, in order to avoid re-compilation. But you want exactly the reverse here: turn on different features for different pkgs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think so, let me try. Not quite familiar with hakari before.

Copy link
Member

@BugenZhao BugenZhao Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, I find java_binding does not utilize jemalloc at all but suffers from its issue. It might not be be a good idea for common to depend on jemalloc.

Note that in this case, there will be two malloc implementations operating in the same process, which will almost certainly result in confusing runtime crashes if pointers leak from one implementation to the other.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might not be be a good idea for common to depend on jemalloc.

Storage depends on jemalloc profiling, which has to depend on jemalloc. We may need to remove the profiling from compactor to address this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might not be be a good idea for common to depend on jemalloc.

Storage depends on jemalloc profiling, which has to depend on jemalloc. We may need to remove the profiling from compactor to address this.

+1 to remove it from common (to risingwave_common_heap_profiling). #11665

This comment was marked as resolved.

@chenzl25 chenzl25 requested review from BugenZhao and xxchan October 17, 2023 09:00
@chenzl25
Copy link
Contributor Author

@BugenZhao @xxchan PTAL again~

Copy link
Member

@fuyufjh fuyufjh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as it only affects java-binding

@xxchan
Copy link
Member

xxchan commented Oct 17, 2023

It might not be be a good idea for common to depend on jemalloc.

Storage depends on jemalloc profiling, which has to depend on jemalloc. We may need to remove the profiling from compactor to address this.

+1 to remove it from common (to risingwave_common_heap_profiling). #11665

I just pushed this alternative solution, which I believe is the better way to solve this issue.

@yuhao-su
Copy link
Contributor

I just pushed this alternative solution, which I believe is the better way to solve this issue.

this LGTM too

@chenzl25 chenzl25 added this pull request to the merge queue Oct 17, 2023
@StrikeW
Copy link
Contributor

StrikeW commented Oct 17, 2023

Shall we pick this PR to v1.3?

@chenzl25
Copy link
Contributor Author

Shall we pick this PR to v1.3?

It just affects our testing. While considering we might run a test on v1.3 after some cherry-picks, we can cherry-pick this PR to v1.3.

@StrikeW
Copy link
Contributor

StrikeW commented Oct 17, 2023

Shall we pick this PR to v1.3?

It just affects our testing. While considering we might run a test on v1.3 after some cherry-picks, we can cherry-pick this PR to v1.3.

I checked that the java binding .so only loaded in non embedded jvm mode, so it won't affect the functionality of connector.

Merged via the queue into main with commit 34ec260 Oct 17, 2023
16 checks passed
@chenzl25 chenzl25 deleted the dylan/fix_java_binding_TLS_allocation_failure branch October 17, 2023 10:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants