Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port to TypeScript #424

Open
wants to merge 51 commits into
base: master
Choose a base branch
from
Open

Port to TypeScript #424

wants to merge 51 commits into from

Conversation

bigmistqke
Copy link

@bigmistqke bigmistqke commented Nov 20, 2024

Hi 👋

I have been exploring WebCodec API for the past weeks and during it I came across this library.
It bothered me that I wasn't able to explore its api easily, because of the pre-module bundling strategy and lack of types.

I had seen that there was interest from @cconcolato to port to typescript, but that this was unlikely to happen soon due to time constraints, so I have been exploring this option. This PR is a result of this exploration.

The changes have been quite drastic, as the javascript version relied on a lot of type-unsafe, pre-es6 coding styles and bundler. I understand that this PR might be a bit overwhelming as it touches a lot of code and I needed to make a lot of choices. It's a bit rude to barge into a project like this and change the code so drastically, my bad.

Main changes

  1. Prototype-class-factory replaced for class-declarations
  2. Use esm-modules instead of concatenating javascript files: use vite instead of grunt to bundle the files.
  3. Import classes and modules where possible instead of accessing BoxParser (prevents accessing boxes that are not exported by the entry)
  4. Declaration of classes and write-methods are not separated in 2 different directories (src/write and src/parse), but are written in 1 file as 1 class declaration (src/boxes).
  5. Bundling of all/simple-entry is done through exporting 2 different entries, instead of grunt-config (see entries-directory)
  6. Breaking API change: StructDefinition is refactored from [key, value, key, value, ...] to [[key, value], [key, value], ...] as this is a lot clearer to type.
  7. Refactored usage of Box.add(boxName) and Box.set(propertyName, value) to Box.addBox(new BoxClass()) and setting the properties directly on the class-instance.
  8. var refactored to const/let and inlined variable declarations instead of declaring all variables up-front
  9. Rely on instanceof checks instead of checking the .type property.
    • I am not sure if this is actually a good approach, since we are also creating generic Box in case BoxRegistry[type] does not exist.

Circular Dependencies

With the introduction of import-paths also the potential for circular dependencies is introduced which occurred in an early version of the port. To prevent circular dependencies while still allowing for allowing all/simple or other customised bundles, there is a register-pattern:

  • An entry registers all the Box-classes it wants to be able to initiate. These are then added to a BoxRegistry-object.
  • Whenever there is a dynamic class generation based on a type-string, we access BoxRegistry[...].
  • BoxRegistry is typed as Partial<AllBoxes>, where AllBoxes are all the boxes exported from the all-entry.

This would make it possible too for third parties to make their own treeshakable bundles too.

We do the same for Descriptor with DescriptorRegistry.

I added dpdm to the dev-dependencies to assist with keeping an eye on this: with npm run circular a check is done for circular dependencies.

Tests

I have tried to port the tests too, but a lot seem to rely on video content that is not available anymore. The only proof I have that this port is somehow successful is my working port of the video-decode-display sample of the w3c/webcodec repo. [EDIT]: the audio-video-player sample is now working too.

Bundle Size

The size grew quite a bit:

  • current all version: 150 kB or 33 kB (gzip) (from bundlejs)
  • typescript all version: 221 kB or 51 kB (gzip)

Conclusion

Screen.Recording.2024-11-20.at.20.30.03.mov

Pretty fun to be able to explore the boxes like this! The typescript declaration file is pretty impressive

I will now stop with exploring this port further. The project is in a workable state as far as I can tell, but I have no way to properly test this. I personally don't have an interest in maintaining a typescript fork, because I know little to nothing about mp4 codecs. I am open for continuing if there is an interest in mergin this PR eventually from the core contributors.

Open Questions

  • The types and the refactor helped in identifying some questionable code too that was present in the codebase, these are annotated with // NOTE comments.
  • There are still some unknowns in the codebase and // @ts-expect-error, but most of it has been typed.
  • I haven't yet figured out what type of stream parse() expects: it seems at some places it is also passed a MP4BoxStream, but writeHeader expects a .writeUInt32 method that isn't present on MP4BoxStream.
  • Unclear what is the signature of ISOFile.onError: it is never called in the codebase.
  • Unsure how to translate Descriptor.parseRemainingDescriptors
    • In the javascript version:
      • Defined in MPEG4DescriptorParser (extends Descriptor's prototype inside MPEG4DescriptorParser's constructor, together with a bunch of other methods on Descriptor and ES_Descriptor)
      • Refers to MPEG4DescriptorParser-specific methods inside Descriptor.parseRemainingDescriptors: var desc = that.parseOneDescriptor(stream); (with that being a reference to MPEG4DescriptorParser)
    • I 'fixed' it by moving the parseOneDescriptor method from MPEG4DescriptorParser to Descriptor, but not 100% sure this covers the original intent.
  • the box-type is encoded in the class-name, this does not always work well:
    • ac-3 becomes ac_3SampleEntry
    • all the UUIDBoxes are defined in an object UUID_BOXES = { ... } because some of them start with a number.
  • currently tsconfig has strict: false. if we would want to enforce strict typescript we will have to either reconsider some of the API-choices (p.ex Box.addBox) or cast a lot more to any :-)

simplifies consuming the types
(and other type fixes)
this allows for greater type narrowing on consumer's end.
NOTE: this box previously did not have a constructor, so I guessed the box-type to extend from.
and update trefBox to extend from ContainerBox instead of Box
- keep only BoxRegistry and DescriptorRegistry
- offer tighter typing to compensate
- export BoxRegistry as BoxParser in all/simple
- move all/simple-entries to ./entries
- move src/lib to src
keeps the diff closer to the original
replace MP4BoxBuffer-type with MP4BoxBuffer-class
@DenizUgur
Copy link
Member

Hi @bigmistqke, I haven't gone through the code but the description is very well written. We have been wanting to port to TypeScript for quite some time now. I guess this is a really good start.

I have some ideas on testing, especially with the new conformance framework.

https://mpeggroup.github.io/FileFormatConformance/

Also for tooling, I was thinking about tsup + vitest instead of vite.

I will go through this PR in more detail soon. Thank you for the efforts.

@bigmistqke
Copy link
Author

bigmistqke commented Nov 20, 2024

Hi @DenizUgur

Great to hear there is interest from the core team too to push this further!

Regarding tsup: great choice 👍 I agree it's a better fit. Should be easy to refactor since there is no vite-specifc syntax used and only 1 vite plugin vite-tsconfig-paths which wouldn't be needed because tsup infers alias-paths from tsconfig by default.

- cleanup package.json
- generate sourcemaps

Opted for scripts instead of tsup.config.ts,
as otherwise tsup would also generate .d.cts files
@bigmistqke
Copy link
Author

@DenizUgur btw I am up for having a videochat, going through the code together high level, if that would be helpful for you.

before
- super('hldr', ...)

after
- super('hdlr', ...)

bug came apparent when porting samples/audio-video-player from w3c/webcodecs
see https://github.com/bigmistqke/webcodecs-vite/tree/main/samples/audio-video-player
and do not register these in BoxRegistry
in the javascript source (see commit 43c8b7a) createBoxCtor adds the given type-argument to `BoxParser.boxCodes.push(type);`, similar pushes were commented out in other createXBoxCtor and still present in createSampleEntry.

this was ported to a call in the constructor, but this is not functionally identical: the calls in the factory-functions were only called when declaring the class, not during each instance creation.

currently I removed this functionality and did not replace it, if we want to re-introduce this functionality I would implement it via exports.
before
- BoxBase has a type-property declared as a public argument
- all classes extending from BoxBase pass type through super(type)
- this makes that for each class-instance the type of .type is 'string'

after
- BoxBase has a type-property but it is not initialised in its constructor
- each class-instance declares its type in the declaration: p.ex class moovBox { type = 'moov' ... }
- UUIDBoxes has their uuid inlined as well
- remove redundant UUIDBox-base
- this way we can narrow depending on the BoxKind.type
- refactored ISOFile.parse to make use of this additional type-information: `switch (box.type) { case 'mdat': ...` would narrow the type of box to `mdatBox`
- added BoxKind to @types: infers all classes that extend Box from BoxRegistry
- added SampleEntryKind to @types: infers all classes that extend SampleEntry from BoxRegistry
before
- in iinfBox.parse we would check if result of parseOneBox is instanceof infeBox
- this would fail in case BoxRegistry does not have infeBox, in which case  generic Box would be created with Box.type = 'infe'

after
- cast result of parseOneBox to BoxKind
- check if box.type === 'infe'
several import fixes
@bigmistqke
Copy link
Author

bigmistqke commented Dec 2, 2024

Played around a bit more with it today:

  • in box.js in the javascript-version (commit 43c8b7a) createBoxCtor adds the type BoxParser.boxCodes.push(type);
    • this was initially implemented in the port by pushing to Box.codes/FullBox.codes/ ... in their respective constructors.
      but this is not functionally identical: the original version only pushed when the factory-function called, not when every instance is created.
    • I currently removed this functionality all together. It was commented out in several factory-functions in the original codebase, so I am not sure if this is still relevant code.
    • if we want to re-introduce this functionality, we could maybe do barrel-files and map over these exports to get all the available types.
  • removing this allowed to have type and uuid as fields instead of passing constructor arguments.
    • class moovBox { type = 'moov' as const } instead of class moovBox { constructor(){ super('moov') } }
    • this means that all the boxes have their type narrowly defined instead of string, which enables type narrowing on the type-field.
    • it also allows to do some cool stuff like: inferring all the type-fields of the MP4Box.BoxRegistry that extends SampleEntry: handy for in ISOFileOptions.type
  • the audio-video-player sample of the w3c/webcodec is now working too
Screen.Recording.2024-12-02.at.22.17.53.mov

@DenizUgur
Copy link
Member

DenizUgur commented Dec 2, 2024

Hi @bigmistqke, that sounds great. I'm preoccupied with other things right now, but I might be able to allocate some time for this in early January.

In the meantime, I'll list a few ideas I had for a partial rewrite. Note that this is not an exhaustive list:

  • Remove Grunt and Karma. They are obsolete.
  • Use a tsup config file unless there's a compelling reason not to.
  • Have only one entry file that loads fundamental boxes with the bundle. Group the remaining boxes into logical groups and load them dynamically using the import() syntax when the box parser encounters them. This could make the benefits of using multiple variants more transparent to the user.
  • Ensure there are no unknown types anywhere.
  • Verify that the file inspection demo has no regressions.
  • (I'll probably have to look at this) Get some tests running with Vitest and File Format Conformance Framework. Since we are changing a lot of things, it might be necessary to get a good coverage before merging.

There might be other things to address, but the most important thing is to stay within the scope of this PR's goal: transitioning to TypeScript. There's always more to implement :)

Thank you again for your ongoing efforts!

@DenizUgur DenizUgur changed the title port to typescript Port to TypeScript Dec 2, 2024
add tsup config, remove concurrently
infer values from given StructDefinitions
before
- Box.addBox(boxClass) does 2 actions:
    - sets this[boxClass.type] to boxClass
    - pushes boxClass to this[boxClass.type + 's']
- this[boxClass.type] was already typed

after
- type each this[boxClass.type + "s"]
- add dev-directory
- port filereader-example to typescript
- add partial typings for jquery, jquery.fancytree, jquery-ui and d3 (chatgpt generated, so we can see the code through the red squiggles)
filereader-demo does not work otherwise, unclear why
@bigmistqke
Copy link
Author

bigmistqke commented Dec 3, 2024

  • Remove Grunt and Karma. They are obsolete.
  • Use a tsup config file unless there's a compelling reason not to.
  • Have only one entry file that loads fundamental boxes with the bundle. Group the remaining boxes into logical groups and load them dynamically using the import() syntax when the box parser encounters them. This could make the benefits of using multiple variants more transparent to the user.
  • Ensure there are no unknown types anywhere.
  • Verify that the file inspection demo has no regressions.

Use a tsup config file unless there's a compelling reason not to.

It's a bit annoying that it creates a declaration-file for both esm and cjs when using 1 tsup.config.ts.

Have only one entry file that loads fundamental boxes with the bundle. Group the remaining boxes into logical groups and load them dynamically using the import() syntax when the box parser encounters them. This could make the benefits of using multiple variants more transparent to the user.

Not yet immediately clear to me how to implement this. Would we then also need to pass a map of all boxes and to which dynamically loaded bundle they belong? If we want to handle dynamically loading during parsing, everything would suddenly become async too, which would complicate things.

Maybe it could be done during initialisation, instead?

import BoxesA from "mp4box/a"
import BoxesB from "mp4box/b"
import { register } from "mp4box"

register(BoxesA, BoxesB)

Then it's upto the user to determine which box-collections they want to include.

Ensure there are no unknown types anywhere.

Went unknown hunting. Only protections in iproBox left. Is it a boolean?
Still some // @ts-expect-error too.

Verify that the file inspection demo has no regressions.

Ported the file inspection demo to vite. It revealed a couple of tiny bugs, but once those were fixed I can't seem to discover any regressions. With my local sample at least.

Ported file inspection demo with mp4box.ts

port.mp4

Original file inspection demo with mp4box.js

original.mp4

@bigmistqke
Copy link
Author

also added typechecking to DataStream.writeStruct from the given struct:

writeStruct.mp4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants