Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

False negative: HandlerBox (hdlr) name field shall not contain NUL bytes prior to terminator #33

Open
baumanj opened this issue Nov 20, 2021 · 11 comments

Comments

@baumanj
Copy link

baumanj commented Nov 20, 2021

Now that MPEGGroup/FileFormat#35 has clarified the specification, there should be an error for inputs like this one (from https://github.com/AOMediaCodec/av1-avif/blob/master/testFiles/Apple/multilayer_examples/animals_00_multilayer_lsel.avif):

00 00 00 22 // Box size (34 bytes)
68 64 6c 72 // boxtype ("hdlr")
00 00 00 00 // version (1 byte), flags (3 bytes)
00 00 00 00 // pre_defined = 0
70 69 63 74 // handler_type ("pict")
00 00 00 00 // reserved[0] = 0
00 00 00 00 // reserved[1] = 0
00 00 00 00 // reserved[2] = 0
00 00       // name ("\0\0")

Here is a minimal example: invalid-avif-hdlr-name-multiple-nul.avif.zip

See also #28

@leo-barnes
Copy link

@baumanj
Good catch on the files, I'll need to look into why they ended up like this.

I'm not sure I agree that this is non-spec compliant though (although it might be good to output a warning). Since name is a NULL terminated string, it will end at the first \0. I don't think there is anything in the spec disallowing extra data to be in a box. If there is, then I agree that this is non-compliant. But if there isn't it's just extra unused cruft in the box. (Still not ideal of course.)

If name was something like lib\0avif\0, I would expect the avif\0 to simply be ignored.

@baumanj
Copy link
Author

baumanj commented Nov 22, 2021

I don't think there is anything in the spec disallowing extra data to be in a box

That's a good question. If you agree with MPEGGroup/FileFormat#35 that the utf8string ends at the first NUL, I'd tend to think that there would need to be explicit allowance in the spec for extra data in the box. Especially considering hldr is a FullBox, I can't think of a valid purpose for extra unspecified data at the end. The definition of BoxHeader in ISO/IEC 14496-12:2020 § 4.2.1 "Object syntax conventions" reads:

size is an integer that specifies the number of bytes in this box, including all its fields and contained boxes

which doesn't seem to allow for extra bytes or else I imagine it would be mentioned there.

More generally, this concept seems to be addressed by the expandable keyword from ISO/IEC 14496-1:2010 § 8.3.3 "Expandable classes", and the expandable keyword is not present in the definition of HandlerBox, so I'd call this invalid.

@cconcolato: do you have any thoughts?

@cconcolato
Copy link
Member

Interesting question.
The ability to have 'garbage' at the end of a box (FullBox or not) is actually a feature to enable backwards compatible extensions, especially for Boxes but also for FullBoxes when you don't want to introduce a new version. I think it has been used in the past (can't recall for the moment). But it could be clarified in the ISOBMFF spec that readers are expected to consume and ignore those bytes.

@dwsinger opinion?

@dwsinger
Copy link

dwsinger commented Dec 2, 2021

yes, perhaps we should have advice that readers should process the fields that they recognize, and if there is extra data in a box after that, ignore it (presuming that they are processing in accordance with a brand). However, it raises an interesting question for file processors: should they copy that data over, or delete it?

@cconcolato
Copy link
Member

I raised an issue on the MPEG repo.

Now regarding ComplianceWarden, I think it could emit a warning if a box contains data past the last known field.

@rbouqueau
Copy link
Member

Still waiting for clarification in MPEGGroup/FileFormat#46.

@dwsinger
Copy link

OK, two things confused here. (a) should a reader parse what it recognizes and ignore the rest? (b) are you allowed to put garbage after the last parsed field when creating a file?

IMHO, the answers are interlinked; to (a) yes, as that enables backward-compatible extensions to the spec, adding fields, without breaking old readers. (b) no, because those extra bytes would be interpreted by those hypothetical future spec-compatible readers as belonging to the newly defined next field.

So, in my retired-hence-unprofessional opinion, these files are invalid. They contain data after the last byte that would clash with any extension.

@jeanlf
Copy link
Member

jeanlf commented Apr 12, 2023

@dwsinger , I'll be a bit more carefull when saying

no, because those extra bytes would be interpreted by those hypothetical future spec-compatible readers as belonging to the newly defined next field.

I agree that if extensions are defined without versioning, this would break everything.
However, if full box versioning is properly used, any "garbage bytes" at the end of a v0 only would not be problematic to those hypothetical future spec-compatible readers since they should ignore v1+.

So theoretically speaking, I don't think these files are invalid (but there is a lot of should and hypothetical involved I agree...)

@dwsinger
Copy link

I was going to say that's fair, if fields are added then the version should be bumped, and no-one should be trying to parse them under v0.

But we don't bump the version if the change is a pure backwards-compatible addition; readers (I think) are told not to try parsing boxes when they don't understand the version number, as the change of version may indicate a non-backwards-compatible change. So making a box v1 would break v0 compatibility.

So we are back to the general question: do extra bytes aka garbage at the end of the box make a file formally non-compliant? I think so.

@rbouqueau
Copy link
Member

Any news here?

@rbouqueau
Copy link
Member

@cconcolato I was notified that MPEGGroup/FileFormat#46 was completed, please don't forget to update us on the resolution. Thx

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants