-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Annotate mmap ranges using PR_SET_VMA #1236
Conversation
We demand that every invocation of `mmap` within mmtk-core to be accompanied with an "annotation" for the purpose of the mmap. On Linux, we will use `PR_SET_VMA` to set the annotation after `mmap` so that it can be seen in `/proc/pid/maps`. This will greatly improve the debugging experience.
Why not just use the space/metadata spec name instead? Also there's a 80-byte limit for the string (including the NUL byte).
Why is this a problem? If a chunk is recycled, then we should just rename it. |
Good point. Although I don't think any space name + metadata name would exceed this limit, it's safe to perform the check when formatting.
The number of mmap entries is limited at the OS level. So whenever possible, we should merge adjacent entries. If no chunk in the discontiguous space has name (annotated with |
Yes that is a good point. Though in my branch I just use an
I'm not sure this is a real concern? I don't imagine we'll ever be hitting the limit even for 32-bit machines. Even Android always has the annotations enabled. I strongly suggest having these on by default. If we ever hit the limit then we can disable it. |
Well, perhaps it's not a problem. I tried a few benchmarks. We do mmap at chunk granularity. This means we don't call
So I think it's OK to leave it enabled by default. The user can still set |
I think you need to update some of the tests to use the new annotation API as well. Re: mmap chunks. Like I mentioned in a meeting, we can improve this further by allocating spaces from different ends of the address space |
Err.. Why is |
There is an upstream bug report: GuillaumeGomez/sysinfo#1392 |
Should we pin version of sysinfo/libc until this is resolved then? |
I pinned the version of libc in this PR so that we can test and review this PR. There is already a PR for the sysinfo crate that will fix this problem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except some minor stuff in review.
src/util/memory.rs
Outdated
/// FIXME: Since it is set via `Options`, it is in theory a decision per MMTk instance. However, we | ||
/// currently don't have a good design for multiple MMTk instances, so we use static variable for | ||
/// now. | ||
pub(crate) static MMAP_ANNOTATION: AtomicBool = AtomicBool::new(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks bad. Though there is a FIXME
above, we should avoid introducing this at first place.
Both Plan
and Space
have references to the options. You can let plan/space to check options and create a MmapAnnotation
. If the runtime option is not enabled, we can use Option<MmapAnnotation>
or MmapAnnotation::Omit/None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. That looks bad.
Although it is possible to let plans and spaces check options, and optionally pass MmapAnnotation::Omit
or a None
for Option<MmapAnnotation>
, it is not ideal.
One important part of my intention is that the annotation cannot be omitted. I'd like the function signatures to enforce that an annotation must be provided for every call to mmap. If we give the caller the option to pass None
or Omit
, developers will do it if they think it is not worth annotating. Instead, the decision of whether to actually annotate the mmap should be postponed until mmap
is actually called and is successful, and should be controlled by the user via an option rather than at call sites.
And checking at every call site is repetitive and error prone. Developers may forget to do the check or erroneously omit the annotation.
Ideally, functions related to memory mapping should be wrapped into a struct, maybe named MmapSupport
or CommonMapper
. It will be created per MMTK
instance, like Mmapper
and VMMap
(which are currently both global singletons), and referenced by plans, spaces, SFT, etc., whatever needs to use mmap. The mmap_annotation: bool
should be an instance variable of CommonMapper
, and map_fixed
should be an instance method of MmapSupport
. That may involve much refactoring, including moving dzmmap
, dzmmap_noreplace
and mmap_noreserve
into MmapSupport
. For now, creating a global boolean variable seems to be the easiest solution.
In the short term, we may introduce MmapSupport
as a global singleton, like Mmapper
and VMMap
. In the long term, we can make them specific to the MMTK
instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One important part of my intention is that the annotation cannot be omitted. I'd like the function signatures to enforce that an annotation must be provided for every call to mmap.
We could enforce the users to supply MmapAnnotation
, and turn it into Option<MmapAnnotation>
before we send it to mmap functions. Plus the problem of having a global var outweighs the benefits of 'enforcing annotations'.
Alternatively, we can have an instance for memory
which holds the boolean value from the option. I think they are both worth trying before adopting the current implementation. We have arg structs for creating plans and spaces. Creating a Memory
instance and letting plan/space to hold it shouldn't be too hard.
Line 372 in 8640ab8
pub struct CreateGeneralPlanArgs<'a, VM: VMBinding> { |
It will be created per MMTK instance, like Mmapper and VMMap (which are currently both global singletons), and referenced by plans, spaces, SFT, etc., whatever needs to use mmap.
I think it is still arguable whether Mmapper
and VMMap
should be global or per instance. We do have a hacky vm_layout()
that should be per instance, and is global at the moment. But we should avoid introducing more globals like this. The more we have, the more difficult and the more resistance we will face when we try to support multiple instances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we can have an instance for
memory
which holds the boolean value from the option. I think they are both worth trying before adopting the current implementation. We have arg structs for creating plans and spaces. Creating aMemory
instance and letting plan/space to hold it shouldn't be too hard.Line 372 in 8640ab8
pub struct CreateGeneralPlanArgs<'a, VM: VMBinding> {
Yes. This is my intended way to do it. But this may involve too many changes and I think it is better to do the refactoring separately.
The introduced instance (let's call it MmapSupport
for now) should contain dzmap
, dzmap_noreplace
, mmap_noreserve
and mmap_fixed
. I grepped the code and found that the following places will be affected.
MapState
:MapState::transition_to_*
will call those functions directly, as if theMapState
can transition its own state without the help ofMmapper
or theMmapSupport
instance. This needs to be refactored so thatMapState
will become a state holder, andMmapper
implementations (FragmentedMapper
andByteMapMmapper
) can call the mmap functions inMmapSupport
to do the transition.LockFreeImmortalSpace
: It skipped theMmapper
and directly callsdzmmap_noreplace
. I think this is a bug becauseMmapper
should be aware of the mapping of spaces.RawMemoryFreeList
: It callsdzmmap_noreplace
to map its memory. Not sure if it should go throughMmapper
, but probably it should because aRawMemoryFreeList
is part of a space using it.
In the end, we may have all mmap invocations going through Mmapper
, and an Mmapper
holds a reference to MmapSupport
.
We could enforce the users to supply
MmapAnnotation
, and turn it intoOption<MmapAnnotation>
before we send it to mmap functions.
It is good to enforce the users to supply MmapAnnotation
. The difficulty is drawing a border between the "user" and the "mmap functions". And another difficulty is that we lost access to Options
before reaching "mmap functions".
Take the call chain SideMetadataContext::map_metadata_internal
-> try_mmap_contiguous_metadata_space
-> MMAPPER.ensure_mapped
for example.
- In
SideMetadataContext::map_metadata_internal
, we have enough information (space name and side metadata name) to construct the annotation. We constructMmapAnnotation
here. But sinceSideMetadataContext
does not reference anySpace
orPlan
, it cannot accessOptions
. try_mmap_contiguous_metadata_space
is a top-level function. It has access to neither space name norOption
.MMAPER.ensure_mapped
is a function that is more related to mmap. It may be refactored to hold a boolean variableMMAP_ANNOTATION
, or refer toMmapSupport
I mentioned above.
So the availability of Options
or the MMAP_ANNOTATION
variable does not match the distinction between user and mmap. I think it's easier to refactor the code and just let "mmap functions" decide whether to annotate.
Plus the problem of having a global var outweighs the benefits of 'enforcing annotations'.
Having global var is bad, but I think it is worse to let the users check Options::mmap_annotation
before constructing the Option<MmapAnnotation>
because it is repetitive and error-prone. Having a global variable is the easiest workaround before we refactor Mmapper
, MapState
, etc.
If you think having a global variable is unacceptible, I can remove Options::mmap_annotation
and make the annotation controlled by a Cargo feature instead, or leave it always on. I think it is OK to leave it always on because there will be strictly fewer prctl
than mmap
, and we don't call mmap
very often, either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you think having a global variable is unacceptible, I can remove Options::mmap_annotation and make the annotation controlled by a Cargo feature instead, or leave it always on.
Right. We dont really need the option anyway. We can always annotate the mmaps.
The reason that I am against having a global var is that we know it's wrong and this kind of small issues will accumulate over time and eventually make it hard to get rid of them. We should try to avoid introducing them in first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the option, and added a Cargo feature no_mmap_annotation
just in case anyone needs to disable it for any reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This global var can be removed now. I am suprised that we did not get any warning from the compiler for this, this seems to be an unused variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good. The global var MMAP_ANNOTATION
can be removed.
"failed to mmap meta memory" | ||
); | ||
CHUNK_METADATA | ||
.try_map_metadata_space(start, size, space_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good refactoring with clear improvement. The old code put the actual action performed in the assert!
which is bad.
src/util/memory.rs
Outdated
/// FIXME: Since it is set via `Options`, it is in theory a decision per MMTk instance. However, we | ||
/// currently don't have a good design for multiple MMTk instances, so we use static variable for | ||
/// now. | ||
pub(crate) static MMAP_ANNOTATION: AtomicBool = AtomicBool::new(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This global var can be removed now. I am suprised that we did not get any warning from the compiler for this, this seems to be an unused variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good
V8 failed for HTTP 404. I think it might be a transient failure if the Ubuntu server is updating itself and happens to be in an inconsistent state when we ran the test. See: https://github.com/mmtk/mmtk-core/actions/runs/12078239098/job/33686682978?pr=1236 JikesRVM failed for the classic "instruction ... not in RVM space" error. Interestingly, the dumped maps file showed that our annotation is working. See: https://github.com/mmtk/mmtk-core/actions/runs/12078239098/job/33686683158?pr=1236 I'll rerun those tests. |
We demand that every invocation of
mmap
within mmtk-core to be accompanied with an "annotation" for the purpose of the mmap. On Linux, we will usePR_SET_VMA_ANON_NAME
to set the attribute aftermmap
so that it can be seen in/proc/pid/maps
. This will greatly improve the debugging experience.