-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add parser for /proc/[pid]/maps #34
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Left some specific feedback, I think the most important thing to nail down is an ergonomic public API.
src/pid/maps.rs
Outdated
/// will be mangled by the kernel and decoded by this module as: | ||
/// | ||
/// ```rust,ignore | ||
/// MemoryMapping ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These delimiters should be braces, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch :-) Fixed!
src/pid/maps.rs
Outdated
/// MemoryMapping ( | ||
/// ..., | ||
/// path: PathBuf::from("/tmp/a\nfile"), | ||
/// is_deleted:true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add space after colon
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Fixed!
src/pid/maps.rs
Outdated
/// | ||
/// See `man 5 proc`, `Linux/fs/proc/task_mmu.c`, and `Linux/fs/seq_file.c`. | ||
#[derive(Debug, PartialEq, Eq, Hash)] | ||
pub struct MemoryMapping { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you feel about MemoryMap
as opposed to MemoryMapping
? It's a bit shorter, and matches maps
more closely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! It does sound better! I have changed it in the updated pull request. Thanks for the suggestion!
src/pid/maps.rs
Outdated
/// Path to the file backing this mapping (for non-anonymous mappings), | ||
/// pseudo-path (such as `[stack]`, `[heap]`, or `[vdso]`) for some special | ||
/// anonymous mappings, or empty path for other anonymous mappings. | ||
pub path: PathBuf, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to parse out the map type into a proper enum, I think putting invalid paths into a PathBuf
is a no-no. Maybe could look something like this:
pub enum MapType {
Anonymous,
File(pub PathBuf),
Stack,
Heap,
Vdso,
Vvar,
Vsyscall,
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I agree that abusing PathBuf
for anonymous mappings pseudo-paths is not a very clean API. However, enumerating all possible pseudo-paths might be problematic, as the exact list is not documented, is architecture and config dependent, and might change (new pseudo-paths might be added) in future kernel versions (which would break this API).
I do like the idea of using an enum to distinguish between file-backed and anonymous mappings. What do you think of doing something like this?:
pub struct MemoryMap {
pub range: ops::Range<usize>,
pub is_readable: bool,
pub is_writable: bool,
pub is_executable: bool,
pub is_shared: bool,
pub type: MapType,
}
pub enum MapType {
/// File-backed memory mapping.
File {
/// Offset into the file backing this mapping.
pub offset: u64,
/// Device containing the file backing this mapping.
pub dev: libc::dev_t,
/// Inode of the file backing this mapping.
pub inode: libc::ino_t,
/// Path to the file backing this mapping.
pub path: PathBuf,
/// Whether the file backing this mapping has been deleted.
pub is_deleted: bool,
},
/// Anonymous memory mapping.
Anonymous {
/// Pseudo-path (such as `[stack]`, `[heap]`, or `[vdso]`) for some special
/// mappings, or empty path for generic anonymous mappings.
pub name: Vec<u8>,
}
}
i.e., using a Vec<u8>
for anonymous mappings pseudo-paths?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, enumerating all possible pseudo-paths might be problematic, as the exact list is not documented, is architecture and config dependent, and might change (new pseudo-paths might be added) in future kernel versions (which would break this API).
Yeah, it would be nice if the #[non_exhaustive]
attribute was stable. How about adding all of the known/common variants and compromising on an Unknown(String)
variant as a fallback for everything else? (I think String
is OK here, since the file is guaranteed to be valid ascii, right?)
src/pid/maps.rs
Outdated
pub path: PathBuf, | ||
/// Whether the file backing this mapping has been deleted (for | ||
/// non-anonymous mappings). | ||
pub is_deleted: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't find in the linux source where (deleted)
is added to the file path, could you point that out to me? The manual doesn't mention it either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The (deleted)
suffix is part of the dcache
interface. Pathnames are generated in /proc
files using seq_file_path(), which ends up calling d_path() to generate the path for the corresponding dentry. The latter function is the one appending (deleted)
if the entry has been deleted.
I have added a reference to Linux/fs/dcache.c
in the MemoryMap
documentation. Thanks for pointing this out! (It was indeed not obvious).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, thanks.
/// To decode only escaped newlines (leaving other escaped sequences alone): | ||
/// | ||
/// ```rust,ignore | ||
/// let path = unmangled_path(br"a\012\040path", b"\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look like correct usage of br""
, since it doesn't include a delimiter, and why is it ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand the issue re: br""
. According to the Rust Reference, this is valid syntax for raw string byte literals. What delimiter should be included?
Re: ignore
, the unmangle
module is not part of the crate public API, so unmangled_path
cannot be accessed in doctests. As rustdoc --test
tries to run doctests for private items (see rust-lang/rust#30094), the example has to be explicitly ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand the issue re: br"". According to the Rust Reference, this is valid syntax for raw string byte literals. What delimiter should be included?
Ah, so my confusion is that this instance isn't using any #
delimiters. Apparently that's legal, but it works the same as a non-raw literal, so it can be simplified to b"a\012\040path"
.
Re: ignore, the unmangle module is not part of the crate public API, so unmangled_path cannot be accessed in doctests. As rustdoc --test tries to run doctests for private items (see rust-lang/rust#30094), the example has to be explicitly ignored.
Interesting, didn't know about that limitation. SGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think br"..."
can be simplified to b"..."
in this case. The r
prefix ensures that the backslash is interpreted literally (otherwise, it combines with the following b0
byte to generate a NUL byte: https://play.rust-lang.org/?gist=2bd4e99cd3a1468149226af57fb48956&version=stable).
src/unmangle.rs
Outdated
escaped: &'a [u8], | ||
} | ||
|
||
impl<'a> Iterator for UnmangledPath<'a> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The iterator here is obfuscating the logic, could you just fold it into unmangled_path
as a for loop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! The code does indeed look more straightforward without the iterator. Thanks for the feedback :-)
- Rename `MemoryMapping` as `MemoryMap`. - Fix `MemoryMap` example. - Refer to `Linux/fs/dache.c` in `MemoryMap` documentation (to document ` (deleted)` suffix appending). - unmangle: Remove `UnmangledPath` iterator and inline decoding in `unmangled_path()`.
Hi Dan! Thanks a lot for taking the time to review this patch! (and sorry for the delay in answering, I have been sick for a couple of weeks). I have incorporated most of your suggested changes in the update to the PR (thanks for the feedback, the code looks much better now). The only pending point is the |
Sounds good! No worries about the delay, I'm not always the quickest myself :) |
- Add `MemoryMapKind` enum to encode mapping type (file-backed, anonymous, vdso, ...). - Move file-backed mapping specific fields to `MemoryMapKind::File(FileMap)` enum. - Add `MemoryMap.file()` convenience method.
Hi Dan! I finally got around to change the API as per your suggestions. I have added a new I have also moved all fields that only make sense for file-backed mappings to a new I have added a Apart from that, I have also extracted device number parsing onto its own function Let me know if there are any other changes that might improve the code. If not, I can rebase and squash the pull request branch (so it can be merged as a single logical commit). |
Any update? |
Hi!
This adds support for parsing process memory mappings information from
/proc/[pid]/maps
.