Are we reading the UUID correctly? #7

Open
bobbyg603 opened this issue Dec 16, 2023 · 1 comment
bobbyg603 (Member) commented Dec 16, 2023

macho-uuid/src/macho.ts

Lines 127 to 136 in c0d00f6

    while (!uuid) {
      const cmd = buffer.readUInt32LE(offset);
      const cmdsize = buffer.readUInt32LE(offset + 4);
      if (constants.cmdType[cmd] === 'uuid') {
        uuid = buffer.subarray(offset + 8, offset + 24).toString('hex');
      }
      offset += cmdsize;
    }

We might need to read either LE or BE depending on the magic sequence.
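
A minimal sketch of what that could look like, assuming a thin (non-fat) Mach-O file. `readMachoUuid` is a hypothetical helper, not the current macho-uuid code, and it uses `DataView` on a `Uint8Array` (a Node `Buffer` works too) so the byte order can be passed as a flag. The magic values, header sizes, and `LC_UUID` value follow `<mach-o/loader.h>`. Note that the header declares the UUID field as `uint8_t uuid[16]`, so in this sketch only `ncmds`, `cmd`, and `cmdsize` depend on the byte order; the 16 UUID bytes are copied as-is.

```typescript
// Hypothetical helper for illustration; constants follow <mach-o/loader.h>.
const MH_MAGIC = 0xfeedface;    // 32-bit Mach-O
const MH_MAGIC_64 = 0xfeedfacf; // 64-bit Mach-O
const LC_UUID = 0x1b;

function readMachoUuid(bytes: Uint8Array): string | undefined {
  if (bytes.byteLength < 28) return undefined; // too short for any mach_header
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);

  // Read the magic both ways; whichever matches tells us the file's byte order.
  const magicLE = view.getUint32(0, true);
  const magicBE = view.getUint32(0, false);
  let littleEndian: boolean;
  let is64: boolean;
  if (magicLE === MH_MAGIC || magicLE === MH_MAGIC_64) {
    littleEndian = true;
    is64 = magicLE === MH_MAGIC_64;
  } else if (magicBE === MH_MAGIC || magicBE === MH_MAGIC_64) {
    littleEndian = false;
    is64 = magicBE === MH_MAGIC_64;
  } else {
    return undefined; // not a thin Mach-O (fat binaries are not handled here)
  }

  const ncmds = view.getUint32(16, littleEndian); // mach_header.ncmds
  let offset = is64 ? 32 : 28;                    // sizeof(mach_header[_64])
  for (let i = 0; i < ncmds && offset + 8 <= bytes.byteLength; i++) {
    const cmd = view.getUint32(offset, littleEndian);
    const cmdsize = view.getUint32(offset + 4, littleEndian);
    if (cmd === LC_UUID && offset + 24 <= bytes.byteLength) {
      // uuid is a plain byte array, so no swapping is applied to it.
      const raw = bytes.subarray(offset + 8, offset + 24);
      return Array.from(raw, (b) => ('0' + b.toString(16)).slice(-2)).join('');
    }
    if (cmdsize < 8) return undefined; // malformed command; avoid looping forever
    offset += cmdsize;
  }
  return undefined;
}
```

Unlike the `while (!uuid)` loop above, this version stops after `ncmds` commands, so a file without an `LC_UUID` command returns `undefined` instead of reading past the end of the buffer.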

@bobbyg603 bobbyg603 self-assigned this Dec 16, 2023
@csmith0651

It's a 16-byte value, in 4-, 2-, 2-, and 8-byte chunks. From what I've seen, the 8-byte chunk is read raw, but the other sections might have endianness issues. For instance, look at this ELFSharp code:

(https://github.com/konrad-kruczynski/elfsharp/blob/0f06793c31d9c2e431337e67d2abf2d87745995e/ELFSharp/MachO/UUID.cs#L26)

        private Guid ReadUUID()
        {
            var rawBytes = Reader.ReadBytes(16).ToArray();

            // Deal here with UUID endianess. Switch scheme is 4(r)-2(r)-2(r)-8(o)
            // where r is reverse, o is original order.
            Array.Reverse(rawBytes, 0, 4);
            Array.Reverse(rawBytes, 4, 2);
            Array.Reverse(rawBytes, 6, 2);

            var guid = new Guid(rawBytes);
            return guid;
        }

Reader is a simple endian-aware reader. But what confuses me a little is this: why does the code blindly reverse bytes 0-3, 4-5, and 6-7 rather than consulting the endianness of the reader?
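
One possible reading, offered as an assumption rather than something confirmed by the ELFSharp authors: the .NET `Guid(byte[])` constructor always interprets the first three fields (4, 2, and 2 bytes) as little-endian integers, whatever the platform or file. Reversing those fields unconditionally would then make the resulting `Guid` print the bytes in the same order they appear in the file, which would explain why the reader's endianness is never consulted. The TypeScript sketch below illustrates the 4(r)-2(r)-2(r)-8(o) scheme; `formatUuid` and `swapGuidFields` are hypothetical names, and the sample bytes are the example UUID quoted further down.

```typescript
// Hypothetical helpers for illustration only.
function toHex(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => ('0' + b.toString(16)).slice(-2)).join('');
}

// Format 16 bytes as 8-4-4-4-12 hex groups, in the order the bytes are given.
function formatUuid(raw: Uint8Array): string {
  const h = toHex(raw);
  return [h.slice(0, 8), h.slice(8, 12), h.slice(12, 16), h.slice(16, 20), h.slice(20)].join('-');
}

// The 4(r)-2(r)-2(r)-8(o) scheme: reverse the first three fields, keep the last 8 bytes.
function swapGuidFields(raw: Uint8Array): Uint8Array {
  const out = Uint8Array.from(raw); // copy so the input is left untouched
  out.subarray(0, 4).reverse();     // subarray is a view, so this reverses in place
  out.subarray(4, 6).reverse();
  out.subarray(6, 8).reverse();
  return out;
}

const raw = Uint8Array.from([
  0x12, 0x3e, 0x45, 0x67, 0xe8, 0x9b, 0x12, 0xd3,
  0xa4, 0x56, 0x42, 0x66, 0x55, 0x44, 0x00, 0x00,
]);

const rawOrder = formatUuid(raw);                 // "123e4567-e89b-12d3-a456-426655440000"
const swapped = formatUuid(swapGuidFields(raw));  // "67453e12-9be8-d312-a456-426655440000"
```

If that reading is right, a tool that formats the raw bytes directly (as the `toString('hex')` call in macho.ts does) needs no swap at all; the swap only matters when the bytes pass through a type with a mixed-endian layout such as .NET's `Guid`.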

Here's what ChatGPT wrote (which I don't think helps much):

Yes, the bytes in the UUID stored within the `LC_UUID` command in a Mach-O file are typically presented in reverse order compared to the more commonly seen human-readable representation.

The UUID is a 128-bit (16-byte) value, usually represented as a string of 32 hexadecimal characters grouped in five sections separated by hyphens. For example, a typical UUID might look like this: `123e4567-e89b-12d3-a456-426655440000`.

When stored in the `LC_UUID` command within a Mach-O file, the bytes of the UUID are usually stored in little-endian order. This means that the byte order is reversed from how it's commonly displayed in a human-readable format. So, if you were to extract the UUID bytes directly from the Mach-O file, you'd find them in reverse order compared to the standard UUID representation.

For instance, using the previous example UUID (`123e4567-e89b-12d3-a456-426655440000`), the byte order in the Mach-O file would be reversed. However, when read and displayed as a UUID by a tool like `otool`, it will typically reverse the byte order to present it in the standard human-readable format.

It's important to note this byte order reversal is a common representation in the context of storage or encoding within files and memory, especially in little-endian systems, but the order may vary depending on the specific file format or system implementation. Always refer to the specifications or documentation related to the file format or tool you're working with for precise details on byte order and data representation.
