Are we reading the UUID correctly? #7

bobbyg603 · 2023-12-16T17:09:51Z

Lines 127 to 136 in c0d00f6

    
    while (!uuid) {
        const cmd = buffer.readUInt32LE(offset);
        const cmdsize = buffer.readUInt32LE(offset + 4);
        if (constants.cmdType[cmd] === 'uuid') {
            uuid = buffer.subarray(offset + 8, offset + 24).toString('hex');
        }
        offset += cmdsize;
    }

We might need to read either LE or BE depending on the magic sequence.
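A minimal sketch of what choosing the reader from the magic could look like. The function name `readMachoUuid` and the bounds checks are illustrative and not part of the existing code. The magic values, the 28/32-byte header sizes, the `ncmds` offset, and `LC_UUID = 0x1b` are taken from Apple's `mach-o/loader.h`. Fat (universal) binaries are not handled here.

```javascript
// Mach-O magic numbers as they appear when the first 4 bytes are read little-endian.
const MH_MAGIC = 0xfeedface;    // 32-bit, little-endian file
const MH_MAGIC_64 = 0xfeedfacf; // 64-bit, little-endian file
const MH_CIGAM = 0xcefaedfe;    // 32-bit, big-endian file (byte-swapped magic)
const MH_CIGAM_64 = 0xcffaedfe; // 64-bit, big-endian file (byte-swapped magic)
const LC_UUID = 0x1b;

// Hypothetical helper: returns the UUID as 32 hex characters, or null if absent.
function readMachoUuid(buffer) {
    const magic = buffer.readUInt32LE(0);
    const littleEndian = magic === MH_MAGIC || magic === MH_MAGIC_64;
    const bigEndian = magic === MH_CIGAM || magic === MH_CIGAM_64;
    if (!littleEndian && !bigEndian) {
        throw new Error('Not a thin Mach-O file');
    }
    const is64 = magic === MH_MAGIC_64 || magic === MH_CIGAM_64;

    // Pick the integer reader once, based on the file's byte order.
    const readUInt32 = (offset) =>
        littleEndian ? buffer.readUInt32LE(offset) : buffer.readUInt32BE(offset);

    const ncmds = readUInt32(16);   // number of load commands
    let offset = is64 ? 32 : 28;    // load commands start right after the header

    // Bounding the loop by ncmds avoids spinning forever when no LC_UUID exists.
    for (let i = 0; i < ncmds && offset + 8 <= buffer.length; i++) {
        const cmd = readUInt32(offset);
        const cmdsize = readUInt32(offset + 4);
        if (cmd === LC_UUID) {
            // The 16 UUID bytes are taken as-is, regardless of file endianness.
            return buffer.subarray(offset + 8, offset + 24).toString('hex');
        }
        if (cmdsize < 8) break;     // malformed command; stop rather than loop
        offset += cmdsize;
    }
    return null;
}
```

Only `cmd` and `cmdsize` go through the endian-aware reader in this sketch; whether the UUID bytes themselves need any swapping is a separate question.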

csmith0651 · 2023-12-17T12:35:05Z

It's a 16-byte value, laid out in 4-, 2-, 2-, and 8-byte chunks. From what I've seen, the 8-byte chunk is read raw, but the other chunks might have endianness issues. For instance, look at this ELFSharp code:

(https://github.com/konrad-kruczynski/elfsharp/blob/0f06793c31d9c2e431337e67d2abf2d87745995e/ELFSharp/MachO/UUID.cs#L26)

        private Guid ReadUUID()
        {
            var rawBytes = Reader.ReadBytes(16).ToArray();

            // Deal here with UUID endianess. Switch scheme is 4(r)-2(r)-2(r)-8(o)
            // where r is reverse, o is original order.
            Array.Reverse(rawBytes, 0, 4);
            Array.Reverse(rawBytes, 4, 2);
            Array.Reverse(rawBytes, 6, 2);

            var guid = new Guid(rawBytes);
            return guid;
        }

Reader is a simple endian-aware reader. What confuses me a little is why the code blindly reverses bytes 0-3, 4-5, and 6-7 rather than consulting the endianness of the reader.
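One possible explanation, offered as a guess about ELFSharp's intent: .NET's `Guid(byte[])` constructor interprets the first three fields (4, 2, and 2 bytes) as little-endian integers, so the unconditional reversal may simply compensate for that layout rather than for the file's byte order. If that is right, a Node.js reader would not need any swap, since hex-encoding the 16 bytes in file order already gives the canonical string. The sketch below shows both forms side by side; `formatUuid` and `toDotNetGuidLayout` are hypothetical helper names, and the sample bytes are made up.

```javascript
// Format 16 raw bytes as the canonical 8-4-4-4-12 UUID string, in file order.
function formatUuid(bytes) {
    const hex = Buffer.from(bytes).toString('hex');
    return [
        hex.slice(0, 8),
        hex.slice(8, 12),
        hex.slice(12, 16),
        hex.slice(16, 20),
        hex.slice(20),
    ].join('-');
}

// Apply the 4(r)-2(r)-2(r)-8(o) scheme from the ELFSharp snippet:
// reverse the first three fields, keep the last 8 bytes in original order.
function toDotNetGuidLayout(bytes) {
    const out = Buffer.from(bytes);   // copy, so the input is left untouched
    out.subarray(0, 4).reverse();     // subarray is a view; reverse() is in place
    out.subarray(4, 6).reverse();
    out.subarray(6, 8).reverse();
    return out;
}

// Made-up sample bytes, as they might sit in an LC_UUID command.
const raw = Buffer.from('123e4567e89b12d3a456426655440000', 'hex');

console.log(formatUuid(raw));                      // 123e4567-e89b-12d3-a456-426655440000
console.log(toDotNetGuidLayout(raw).toString('hex')); // 67453e129be8d312a456426655440000
```

Comparing the output of `formatUuid` against what `dwarfdump --uuid` reports for the same binary would be a quick way to check whether the current `toString('hex')` approach is already correct.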

Here's what ChatGPT wrote (which I don't think helps much):

Yes, the bytes in the UUID stored within the `LC_UUID` command in a Mach-O file are typically presented in reverse order compared to the more commonly seen human-readable representation.

The UUID is a 128-bit (16-byte) value, usually represented as a string of 32 hexadecimal characters grouped in five sections separated by hyphens. For example, a typical UUID might look like this: `123e4567-e89b-12d3-a456-426655440000`.

When stored in the `LC_UUID` command within a Mach-O file, the bytes of the UUID are usually stored in little-endian order. This means that the byte order is reversed from how it's commonly displayed in a human-readable format. So, if you were to extract the UUID bytes directly from the Mach-O file, you'd find them in reverse order compared to the standard UUID representation.

For instance, using the previous example UUID (`123e4567-e89b-12d3-a456-426655440000`), the byte order in the Mach-O file would be reversed. However, when read and displayed as a UUID by a tool like `otool`, it will typically reverse the byte order to present it in the standard human-readable format.

It's important to note this byte order reversal is a common representation in the context of storage or encoding within files and memory, especially in little-endian systems, but the order may vary depending on the specific file format or system implementation. Always refer to the specifications or documentation related to the file format or tool you're working with for precise details on byte order and data representation.

bobbyg603 self-assigned this Dec 16, 2023
