-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Only reset unlog bits of objects in modbuf for nursery GCs #1169
base: master
Are you sure you want to change the base?
Conversation
470787e
to
ec6ba72
Compare
src/plan/generational/gc_work.rs
Outdated
// Scan objects in the modbuf and forward pointers | ||
let modbuf = std::mem::take(&mut self.modbuf); | ||
GCWork::do_work( | ||
&mut ScanObjects::<E>::new(modbuf, false, WorkBucketStage::Closure), | ||
worker, | ||
mmtk, | ||
) | ||
} else { | ||
// Flip the per-object unlogged bits to "unlogged" state for objects inside the bootimage | ||
#[cfg(feature = "vm_space")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know what was better: to go through the list or bulk set the unlog bits in vm_space.release()
. I thought this was better since it is more general than bulk setting the unlog bits (in case the VM binding is not using side-metadata for the unlog bits).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get why the VM space needs to be specially dealt here. Why do we need to set the unlog bits in mature GCs for the VM space? What about other spaces like immortal or LOS, how is the VM space different from those spaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. So what we want to do is only reset the unlog bit for objects that are considered live. All ImmixSpace and CopySpace mature objects will set the unlog bit when you copy them over because of the CopySemantics::Mature
. But objects in the VM space (or even immortal and LOS like you mentioned) don't perform copies and so we need to reset their bits iff those objects are live. VM space and immortal space objects are always live so we need to unconditionally reset them, but for LOS we need to reset only when the object is live. So really we're doing this reset at the wrong place and time.
One way to fix it would be to reset the unlog bit for mature LOS objects inside its trace_object
. And then we can reset the unlog bits for VM space and immortal space objects in the ProcessModBuf
work packet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But objects in the VM space (or even immortal and LOS like you mentioned) don't perform copies and so we need to reset their bits iff those objects are live.
Why not do that in the trace_object
method for the policy where we know the object is reached? For ImmixSpace and CopySpace, the log bit is set during copying in trace_object
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right that's what I suggested for LOS objects. But it's not clear to me if ART, for example, will actually end up tracing through all objects in the bootimage or immortal space for a full heap GC given their optimization(s). Maybe that's an ART-specific issue though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another alternative could be to create a new work packet to be executed after the transitive closure (or maybe release?) for major GCs. In that work packet we now have all the correct information about if a current object address is marked or not, and if it is, then we can reset the unlog bit for that object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option is to unconditionally set the log bit in ProcessModBuf
(no matter nursery or mature GCs), and bulk clear log bits when we release memory for dead objects.
In the current code base, for Immix space, we will set log bits during tracing if unlog_object_when_traced
is true, and will clear log bits in prepare if reset_log_bit_in_major_gc
is true. Only in Sticky Immix, both flags are set to true
. The former is needed, as we 'promote' nursery objects in immix space when we trace it (and we do not have to copy it to 'promote'). But the latter may not be necessary -- we can bulk clear log bits for the dead objects after a GC. For other spaces, we can always bulk clear log bits after a GC for reclaimed memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this warrants a group discussion. We can discuss this in the next group meeting. I'll do a hack for ART in the meantime since this current PR is probably still incorrect because of LOS objects' unlog bits not being reset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about cases where we recycle mature LOS objects? I think that is broken as well. If the unlog bit was set for it in a previous GC and it has died since then, if we allocate a new LOS object at the same address, we have a young object with the unlog bit set which is also wrong. We need to clear the unlog bit for dead LOS objects in the sweep here
@wks Looks like OpenJDK actually triggers the assertion 😆 . I think the assertion is sound so this is actually an OpenJDK bug. The assertion is not triggered for ART. |
src/plan/barriers.rs
Outdated
@@ -231,6 +231,7 @@ impl<S: BarrierSemantics> Barrier<S::VM> for ObjectBarrier<S> { | |||
target: Option<ObjectReference>, | |||
) { | |||
if self.log_object(src) { | |||
debug_assert!(src.is_live::<S::VM>(), "{} was logged but is not live", src); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_live
doesn't make sense during mutator time. The liveness of an object is only determined after computing transitive closure. During mutator time, all objects that can be accessed are, by definition, live.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except the unlog bit should only be set for mature objects, so they have to be considered alive. Or at the very least have to be marked. I previously had is_marked
there but objects in the bootimage may not actually have mark bits set so it was triggering the assertion incorrectly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought I actually agree with you and will remove the debug assert from the write barrier code. I think the debug assert inside the ProcessModBuf work packet is correct though
0282ba5
to
4a29049
Compare
Just fixed it like we discussed in a previous meeting |
e26f37c
to
0753a49
Compare
@@ -275,6 +275,17 @@ impl<VM: VMBinding> VMSpace<VM> { | |||
); | |||
debug_assert!(self.in_space(object)); | |||
if self.mark_state.test_and_mark::<VM>(object) { | |||
// Flip the per-object unlogged bits to "unlogged" state for objects inside the | |||
// bootimage | |||
#[cfg(feature = "set_unlog_bits_vm_space")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can remove the conditional compile feature here since it is likely objects in the VM space will want to have the unlog bit set and the feature was more for initialization
@wks Can you re-review this? |
I assume the "issues for future allocations" means newly allocated (young) objects may have unlog bit set, and may be logged when written into, and we will retain more objects than necessary. But I don't think it is sufficient to ensure this by skipping setting unlog bits in mature GCs. There may be dead mature objects that are not written to (i.e. unlogged). When those objects die, their unlog bits will remain set. If future allocations reuse the memory space of those objects, some young objects will also have unlog bits set. If you want to ensure nursery objects never get logged, we may need to (bulk) clear unlog bits for reclaimed regions, like how we clear VO bits. I think this only affects StickyImmix because we never set the unlog bits for objects in the GenCopy or GenImmix nursery in the first place. |
Hence why we do a bulk clear? |
Oh. How odd. This PR doesn't do the bulk clear. I must have messed up when cherry-picking from my branch |
Instead of bulk clearing, it is also possible to copy the mark bits over to the unlog bits (and remove p.s. I think the so-called "unlog bit" should really be named "is candidate for logging". Obviously nursery objects are not such candidates. |
What that suggests is we need to be more systematic with how we deal with stale metadata values |
This fixes a bug wherein we were erroneously resetting the unlog bits for objects in the modbuf for full-heap GCs. If we reset the unlog bits in a full-heap GC, we may end up setting the unlog bit for an object that actually died which may cause issues for future allocations. We now set the unlog bits for mature objects when tracing for VM space (if compiled with `set_unlog_bits_vm_space`), immortal space, and large object space. This PR also adds debug assertions checking that any object that has been added to the modbuf is considered "live" by MMTk, or in other words, it is a mature object.
5e37a4b
to
2b32380
Compare
@wks What needs to be done for this PR to get it to merge? I think it's very important to merge bug fixes ASAP since it leads to duplication of debugging efforts otherwise. This is a very very subtle bug. EDIT: I can remove the |
If I understand correct, the issue is that for dead objects, we do not clear their unlog bits, and new objects allocated will have random unlog bits. The obvious fix would be to just clear the unlog bits before new allocation. The PR tries to maintain the unlog bits before the objects become dead: if an object dies, its unlog bit is 0, and if an object lives, its unlog bit is 1. This doesn't sound like a straight solution to the problem. What is the reason for using this solution rather than just clearing the unlog bits? |
Fast-path allocation. EDIT: It seems like I haven't cherry-picked a further change from here: k-sareen@8f6fcd8 The above commit bulk clears the unlog bits for a global GC and then sets them again while tracing. |
Really what we want is a meta-module that takes care of all relevant metadata when an object dies/a space is released/a TLAB is allocated. What we have currently is a hack since each metadata (like VO-bit or side forwarding bits) has been special cased. But I feel like that is out of scope of this PR. |
We now: - Clear the unlog bits for dead LOS objects - Bulk clear the unlog bits for MarkSweepSpace for full heap GC - Set the unlog bits for LOS and MarkSweepSpace objects during trace object This is because we want to ensure the unlog bits are only set for objects that are alive. We do not reset the unlogs bits inside the `ProcessModBuf` work packet because we do not know what objects are alive at that moment in time.
if mmtk | ||
.get_plan() | ||
.generational() | ||
.unwrap() | ||
.is_current_gc_nursery() | ||
{ | ||
// Flip the per-object unlogged bits to "unlogged" state. | ||
for obj in &self.modbuf { | ||
debug_assert!((*obj).is_live(), "{} was logged but is not live", *obj); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assertion fails for GenCopy in the first nursery GC after the first full-heap GC. It is because it is in the from space of the semispace mature space, and is_live()
returns false for all objects in the from space.
You still need to implement bulk clearing of unlog bits for ImmixSpace. It is sufficient to do the clearing in And I suggest adding a debug assertion that young objects cannot have unlog bits set. We can do this while tracing or while allocating. Possible places include
Probably remove it for now because it fails in GenCopy. We can introduce a similar assertion later. I think the actual assertion you want to make is that the object is a mature object. For StickyImmix, we use the mark bit to distinguish between old and young objects. So |
Yes exactly. My best approximation was |
Yes. It should work in general. It is implemented as |
This fixes a bug wherein we were erroneously resetting the unlog bits
for objects in the modbuf for full-heap GCs. If we reset the unlog bits
in a full-heap GC, we may end up setting the unlog bit for an object
that actually died which may cause issues for future allocations.
We now set the unlog bits for mature objects when tracing for VM
space (if compiled with
set_unlog_bits_vm_space
), immortal space, andlarge object space.
This PR also adds debug assertions checking that any object that has
been added to the modbuf is considered "live" by MMTk, or in other
words, it is a mature object.