mpl2 recurses itself to a seg fault on BoomFrontend #6083
Comments
@jeffng-or Apparently it's not mpl2 itself that is blowing up. During clustering, we call par (TritonPart) to partition big flat clusters, i.e., big clusters made of only leaf macros/std cells. Based on your log, the segfault is happening inside par.
Sure, makes sense. The key point is that breakLargeFlatCluster recurses down 11100 frames (I think I cut the stack trace file off one level too soon, so my bad on that). The fact that we recurse down effectively infinitely will eventually cause a failure somewhere, and it happens to be in par.
@AcKoucher the end of the stack is in par but most of the stack is in mpl2. I think the problem is the recursion in breakLargeFlatCluster. How many parts are we trying to break this cluster down into? I suspect something is off in the cluster size.
@maliberty I see. I'll investigate.
If that much splitting is necessary then you can write it non-recursively.
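One way to read the suggestion above is to replace the self-call with an explicit worklist, so the stack depth stays constant no matter how many splits are needed. The sketch below is not mpl2's code: `break_large_cluster`, `split_in_two`, and `max_size` are hypothetical names, and the simple halving is only a stand-in for the call to par.

```python
from collections import deque


def split_in_two(cells):
    """Hypothetical stand-in for the partitioner call: halve the list."""
    mid = len(cells) // 2
    return cells[:mid], cells[mid:]


def break_large_cluster(cells, max_size, partition=split_in_two):
    """Split `cells` until every piece has at most `max_size` items.

    Uses an explicit worklist instead of recursion, so the call stack
    does not grow with the number of splits.
    """
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    done = []
    work = deque([list(cells)])
    while work:
        cluster = work.popleft()
        if len(cluster) <= max_size:
            done.append(cluster)
            continue
        left, right = partition(cluster)
        if not left or not right:
            # No progress: the partitioner returned an empty side.
            # Fail loudly rather than loop forever on the same cluster.
            raise RuntimeError("partitioner made no progress")
        work.append(left)
        work.append(right)
    return done
```

Even with a deliberately lopsided partitioner that peels off one cell per call, for example `partition=lambda c: (c[:1], c[1:])`, the loop finishes without deep call stacks. The empty-side check is an extra assumption of this sketch, added so that a degenerate split surfaces as an error instead of an endless loop.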
Apparently TritonPart is doing a terrible job when trying to partition (ftq)_glue_logic.
@maliberty I'm not sure how to proceed here. Should mpl2 reject the result and take care of splitting the cluster if the partitions generated by TritonPart are not good?
I think TP should be fixed.
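For illustration only, the "reject the result" option raised above could look like a small balance check on the partition sizes. The names `is_acceptable_split` and `min_fraction`, and the 10% threshold, are hypothetical and not taken from mpl2 or TritonPart; the thread's conclusion is that the partitioner itself should be fixed rather than guarded by the caller.

```python
def is_acceptable_split(part_sizes, min_fraction=0.1):
    """Return True if every part holds at least `min_fraction` of the total.

    `part_sizes` is a list with the number of cells in each partition.
    A split with fewer than two parts, or with no cells at all, is rejected.
    """
    total = sum(part_sizes)
    if total == 0 or len(part_sizes) < 2:
        return False
    return all(size >= min_fraction * total for size in part_sizes)
```

A caller could log or abort when the check fails, which would at least turn an unbounded recursion into an early, readable error.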
Describe the bug
The macro placer runs for 14h before seg faulting on BoomFrontend, which is a sub-module of BoomTile. Note that the segfault isn't seen in the full BoomTile run, which runs for about 1h.
I've re-run the job in GDB and mpl2 is infinitely recursing itself into oblivion. Here's a snippet of the stack trace:
The full-ish stack trace can be found at: https://drive.google.com/file/d/10MMydy8f761RPeXXE5FKgIVFDAtlWwCn/view?usp=sharing
The tarball can be found at: https://drive.google.com/file/d/1PH8jZAREhRn4NIVryR7pes3sKNGSIBqs/view?usp=sharing
Expected Behavior
A successful mpl2 run, without a seg fault, that finishes in less than 1h.
Environment
To Reproduce