-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: detect cycle ref in proto #10499
Conversation
Signed-off-by: tabVersion <[email protected]>
Codecov Report
@@ Coverage Diff @@
## main #10499 +/- ##
=======================================
Coverage 70.33% 70.34%
=======================================
Files 1274 1274
Lines 218988 219023 +35
=======================================
+ Hits 154034 154066 +32
- Misses 64954 64957 +3
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 2 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rest LGTM
message Parent { | ||
string parent_name = 1; | ||
int32 parent_id = 2; | ||
repeated ComplexRecursiveMessage siblings = 3; | ||
} | ||
|
||
Parent parent = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a practical example? Seems we're storing the nodes multiple times since protobuf is by-value instead of by-ref. 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the example is generated by chatgpt. real world ones are welcomed
� | ||
recursive.proto recursive"� | ||
ComplexRecursiveMessage | ||
node_name ( RnodeName | ||
node_id (RnodeIdM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the compiled result? Might be better to git ignore it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the proto dep requires the compiled ver
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean can we compile it when testing? Seem we don't have any other test that depends on the hard coded compile output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, not a big deal. We can always handle it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean can we compile it when testing? Seem we don't have any other test that depends on the hard coded compile output.
There are only a few generated protobuf binary, I think it is not worthy introducing extra scripts and toolchain.
fn detect_loop_and_push(trace: &mut Vec<String>, fd: &FieldDescriptor) -> Result<()> { | ||
let identifier = format!("{}({})", fd.name(), fd.full_name()); | ||
if trace.iter().any(|s| s == identifier.as_str()) { | ||
return Err(RwError::from(ProtocolError(format!( | ||
"circular reference detected: {}, conflict with {}, kind {:?}", | ||
trace.iter().join("->"), | ||
identifier, | ||
fd.kind(), | ||
)))); | ||
} | ||
trace.push(identifier); | ||
Ok(()) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's cool to use an O(N^2) algorithm in this case. But I am worrying if it's possible for users to import a 1k-fields message in the system, which may cause the memory usage suddenly increases.
Given that all the previous types before the current DFS point should be unique, maybe you can use a set instead of a vector:
fn detect_loop_and_push(visited: &mut HashSet<String>, fd: &FieldDescriptor) -> Result<()> {
if visited.contains(fd.name()) {
return Err(...)
}
visited.insert(fd.name());
}
fn protobuf_type_mapping(
field_descriptor: &FieldDescriptor,
parse_trace: &mut HashSet<String>,
) -> Result<DataType> {
detect_loop_and_push(parse_trace, field_descriptor)?;
...
parse_trace.erase(field_descriptor.name())
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use a vec displaying the parser visiting path in order to help to identify the recursive ref occurs at which level.
push and pop op always happen at the end of the vec, there should be little overhead for allocation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest lgtm
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
resolve #10475
Checklist
./risedev check
(or alias,./risedev c
)Documentation
Click here for Documentation
Types of user-facing changes
Please keep the types that apply to your changes, and remove the others.
Release note