-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow custom asset types in the build system #1623
Conversation
This is the first PR that will increase the flexibility of the Dart Build system as well as makes it more layered. The main points being addressed in this PR are: * A bundling tool has to specify which asset types it supports. So we make `supportedAssetTypes` required everywhere. => Across various layers in the code base we make this required => We remove a baked-in `[CodeAsset.type]` fallback in various places (Making this explicit increases size of LOCs in this CL due to the many tests) * The core building infrastructure in `pkg/native_assets_builder/lib/*` no longer knows anything about `CodeAsset` or `DataAsset`. Instead it only knows about `EncodedAsset` which represents the type of an asset and it's json encoding. => The core still verifies certain protocol consistency (e.g. that a hook only emitted asset types that the config allows) => The higher levels pass the `supportedAssetTypes`. => The higher levels now also pass `buildValidator` / `linkValidator` that validate per-asset related things (e.g. the id of a data asset to start with the package name) for a given package. => The higher levels now also pass a `applicationAssetValidator` that validates consistency of assets across all packages (e.g. uniqueness of dylib filenames across all dependencies) => This centralizes the logic about per-asset-type information in the bundling tool (i.e. user of `package:native_assets_builder`). => Bundling tools now have to expand `CodeAsset`s in dry-run to all architectures. * The bundling tool (e.g. `flutter build`) doesn't have to implement validation logic itself, instead it will support certain asset types by importing the "bundling logic" for that asset types (which includes - for now - validation logic). * All the validation code now simply returns `ValidationError` which is a list of errors. If the list is empty then there were no errors. => This removes redundancy between a `success` boolean and the list of errors (it was successsfull iff there were no errors) => This simplifies code that aggregates errors from different sources. * Moves the `fromJson` / `toJson` towards be purely read/write a json schema and are not responsible for anything else. => The validation routines are responsible for validating semantics. * The hook writer API now changes to work on per-asset-type specific extension methods. For example what used to be ``` output.addAsset(DataAsset(...)); ``` is now ``` output.dataAssets.add(DataAsset(...)); ``` The `BuildOutput.encodedAsset` getter is (with this PR) temporarily visible to hook writers, which could use it like this: ``` output.encodedAssets.add(DataAsset(...).encode()); ``` but a following refactoring that increases the flexibility of the protocol more will also get rid of the user-visible `encodedAsset` getter (if we choose so). So this is only temporarily exposed. * `hook/link.dart` scripts have to mark any input files it's output depends on in the `LinkOutput.addDependency`. If the set of inputs is the same but one input's file changed then a `hook/link.dart` - doesn't have to rerun if it just drops all assets or outputs them unchanged - has to re-run if it copies a input, changes it's contents, merge with other inputs, etc => A `hook/link.dart` should - just like a `hook/build.dart` add dependencies if the inputs are used to produce the output => Before this was somewhat hard-coded into the `package:native_assets_builder`.
PR HealthBreaking changes ✔️
Changelog Entry ✔️
Changes to files need to be accounted for in their respective changelogs. API leaks ✔️The following packages contain symbols visible in the public API, but not exported by the library. Export these symbols or remove them from your publicly visible API.
License Headers ✔️
All source files should start with a license header. Unrelated files missing license headers
Package publish validation ✔️
Documentation at https://github.com/dart-lang/ecosystem/wiki/Publishing-automation. |
Overall for users of these packages very little changes: Hook writersBefore: output.add(DataAsset(...)); now output.dataAssets.add(DataAsset(...)); Bundling toolsBefore: await build(
...
supportedAssetTypes: [ // <-- this was optional and had auto-fallback to `[CodeAsset.type]`
CodeAsset.type,
],
); now: await build(
...
supportedAssetTypes: [ // <-- this is now required
CodeAsset.type,
DataAsset.type,
],
buildValidator: (config, output) async => [
...await validateDataAssetBuildOutput(config, output),
...await validateCodeAssetBuildOutput(config, output),
],
applicationAssetValidator: (assets) async => [
...await validateCodeAssetsInApplication(assets),
],
); => This change mainly reflects that per-asset-type information is moved to the bundling tool which can use per-asset-type helpers to perform validation. (see cl/387940 on how the roll into Dart SDK would look like) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related issue:
High level LGTM. This PR is doing roughly what I expected to do with ExtensibleAsset
and applies it also to data assets and code assets instead.
Also, thanks for the great PR description! ❤️
- We start separating the per-asset-type specific code:
native_assets_cli/lib/src/code_assets/*
&native_assets_cli/lib/src/data_assets/*
.
=> More refactorings will come that will make the corenative_assets_cli/lib/src/*
no longer have any per-asset-type specific things in them
=> Eventually thelib/src/data_assets
andlib/src/code_assets
could become their own packages (if we choose to do so) - similar to how flutter-specific assets can live in other packages.
One thing I noticed while reviewing the code is that we no longer run the validation for data assets and code assets inside the hook. This makes it harder to debug failures for users, because they cannot run the debugger on the hook with a config file. They'd have to run dartdev/flutter_tools and that doesn't enable them to step through their code.
Making the protocol not know about data assets and code assets should also then add a validation callback to build(
and link(
. And the validate methods for data assets and code assets should be made public in the non-internal.dart
.
For adding assets and looping over assets, extension methods suffice, but for validation it does not.
Can we do better to support asset type specific validation? (Dependency injection? 😅 That would be super ugly. But that would also allow for us to register decode
method for every asset type, fixing our problem with not having virtual statics. But dependency injection is too ugly.)
We need a solution for custom asset type validation inside hooks, and then we could use that as well the custom validation of code and data assets inside thebuild(
and link(
methods used by hook writers. We don't need a solution in this PR. But if we don't add a solution in this PR. I think the build(
and link(
methods should call the validators for code and data assets.
Hook writers
Before:
output.add(DataAsset(...));now
output.dataAssets.add(DataAsset(...));
This is only true if we keep the validation for code and data assets in build(
and link(
. Otherwise users are going to have to call validateDataAssetBuildOutput
and friends manually from the bottom of their build(
method.
hook/link.dart
scripts have to mark any input files it's output depends on in theLinkOutput.addDependency
. If the set of inputs is the same but one input's file changed then ahook/link.dart
- doesn't have to re-run if it just drops all assets or outputs them unchanged
- has to re-run if it copies a input, changes it's contents, merge with other inputs, etc
=> Ahook/link.dart
should - just like ahook/build.dart
add dependencies iff changing the input file contents should cause a re-run of the hook
=> Before this was somewhat hard-coded into thepackage:native_assets_builder
.
This seems to be unrelated to this PR? Can it be moved to a separate PR? (And then we can also discuss pros/cons on that PR.)
const DeepCollectionEquality().equals(encoding, other.encoding); | ||
} | ||
|
||
const String _typeKey = 'type'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static const inside class?
pkgs/native_assets_builder/test/build_runner/build_dependencies_test.dart
Show resolved
Hide resolved
I think I mentioned this before and you were opposed to this: Should we have |
Correct.
Most asset types have their own id namespace, some asset types may share an id namespace with other asset types, some assets will have no id concept at all. Some asset types may have a file, some may have no file, some may have several files. The reality is that different asset types are just that - different. Only The verification about id namespaces is something a bundling tool will do, it decides whether code assets go into same directory as data asset (and cause name collision) or not, ... |
Firstly, thanks for all the comments, @dcharkes ! I've addressed some comments, resolved others with explanation and left a few open for next review round. PTAL
I've filed #1628 with a suggestion how we can do this automatically without regression, but I'm not actually convinced we should do this.
This is not unrelated to this PR because Now I had actually a version where I dependency-injected this from the upper layers (bundling tool): A After thinking about this more I strongly disliked this approach. The natural way for any hook we invoke is for the hook to tell - via dependencies - the things it actually depends on (i.e. if those things change, it should rerun). This is how it works for |
Ah right, because the system doesn't know anymore about assets having I think this is rather a footgun for users, having to add the files as dependencies. You actually filed this and I addressed this last month: This would become compounded if/when we're also going to model metadata as assets: Assets having a file (or multiple files) on disk should be part of the system imho. One way to do so would be to recognize a It would be asymmetric if the hooks do do caching correctly on the asset metadata (the json serialization of the object) but not on the files the asset is a metadata on. (The idea of merging the current metadata into assets signifies this difference, moving arbitrary json from the BuildOutput/BuildConfig to the contents of an asset file.) |
One has to be careful to manage dependencies, but this is also true for build hook writers, or
This would be automatically handled, because the json encoding of a
Not sure about this:
|
Right, so now we give asset-designers the flexibility to provide automatic caching or not be either writing a file or putting everything in the JSON encoding. I could totally make a The users of custom asset types now all of a sudden need to know implementation details of such asset types. I believe custom asset types should be mostly a concern of the introducer of the custom asset type (which is likely an embedder, or a dev who also writes a linker for such custom asset type). Custom asset types having a file could be encapsulated, not visible by the users of the custom asset type (maybe it simply has a I think correct caching should be the responsibility of the custom-asset-type introducer (together with the native_assets_builder). Focusing on an optimization whether something is used/consumed creates the same question for not reading fields of the json encoding of an asset (and the current json-based implementation of the metadata feature flowing from one build hook to the next). If you want to focus on enabling users to have faster execution of their build based on whats used/consumed, then we should start recording what fields of the json encoding are read. It's very smelly to treat file contents differently than the json metadata.
The dependencies that we report in build hooks are files that are not being passed in by the native assets builder, they live somewhere in the package root. The files that the native assets builder knows about influence caching (for example the dart sources). |
That can be easily done by making this modification: diff --git a/pkgs/native_assets_cli/lib/src/code_assets/code_asset.dart b/pkgs/native_assets_cli/lib/src/code_assets/code_asset.dart
index 88576821..0c8acbc0 100644
--- a/pkgs/native_assets_cli/lib/src/code_assets/code_asset.dart
+++ b/pkgs/native_assets_cli/lib/src/code_assets/code_asset.dart
@@ -196,8 +196,13 @@ extension CodeAssetsBuildOutput on BuildOutput {
}
extension type BuildOutputCodeAssets(BuildOutput _output) {
- void add(CodeAsset asset, {String? linkInPackage}) =>
- _output.addEncodedAsset(asset.encode(), linkInPackage: linkInPackage);
+ void add(CodeAsset asset, {String? linkInPackage}) {
+ final file = asset.file;
+ if (file != null && asset.linkingMode is ...) {
+ _output.addDependency(file);
+ }
+ _output.addEncodedAsset(asset.encode(), linkInPackage: linkInPackage);
+ }
void addAll(Iterable<CodeAsset> assets, {String? linkInPackage}) {
for (final asset in assets) {
diff --git a/pkgs/native_assets_cli/lib/src/data_assets/data_asset.dart b/pkgs/native_assets_cli/lib/src/data_assets/data_asset.dart
index 98864398..e4721161 100644
--- a/pkgs/native_assets_cli/lib/src/data_assets/data_asset.dart
+++ b/pkgs/native_assets_cli/lib/src/data_assets/data_asset.dart
@@ -98,8 +98,10 @@ extension DataAssetsBuildOutput on BuildOutput {
}
extension type BuildOutputDataAssets(BuildOutput _output) {
- void add(DataAsset asset, {String? linkInPackage}) =>
- _output.addEncodedAsset(asset.encode(), linkInPackage: linkInPackage);
+ void add(DataAsset asset, {String? linkInPackage}) {
+ _output.addDependency(asset.file);
+ _output.addEncodedAsset(asset.encode(), linkInPackage: linkInPackage);
+ }
void addAll(Iterable<DataAsset> assets, {String? linkInPackage}) {
for (final asset in assets) { In fact I had that at some point, but removed it again: Firstly some tests were failing because they didn't expect those extra dependencies being reported. I'm open to making the above change in a follow-up CL if we want to do this over-approximation. But it's basically a choice of the per-asset-type code, not of the core build system.
I think this is not smelly at all. It's just a limitation in the dependency tracking system in the APIs & CLI protocol. Because of this limitation we over-approximate it and re-build more than possibly needed. It's rooted in the fact that the existing dependency tracking machinery is based on files. One could expand this by adding a more fine-grained dependency tracking that allows expressing this: output.addDependency(<file-that-was-consumed>);
output.addMetadataDependency(<description-of-what-subpart-of-metadata-was-consumed>); but this seems like unnecessary complexity at this point. |
FWIW. I just realized that this approach doens't work. Custom asset types might be introduced by packages that consume such asset types in linkers and then output a data asset in the linker.
Wait, I don't understand this, you don't want to add file of the asset you're adding as a dependency, but the file of the asset you're reading as a dependency. So, it should be rather a modification of The only challenge there is that the Adding all deps on accessing
So all the existing link hooks will be broken in the meantime? Also, the |
Also, all link hooks where the SDK uses the new implementation (not tracking dependencies in native assets builder), and hooks use the old implementation (not tracking the dependencies manually in the hook) will be broken. Should we retract every individual package version of native_assets_cli to attend users? This can all be avoided by baking in knowledge |
Sorry, yes, absolutely correct!
This sounds like the linking support is widely used and would cause huge breakage for users? In reality one can barely make linking scripts today because we don't have record-uses information that can be safely consumed yet. The format of the record use is being developed at this moment - nobody should depend on it atm. So I don't think this would cause any real issues in pratice. And even if it did break anyone: It's an experimental feature that we're trying to get towards 1.0. Now is the time to do breaking changes. The set of changes performed in these refactorings are breaking - towards getting the system to 1.0. The PR as it's now will require hook writers to be precise about dependencies, I don't see that as a problem. It's backwards-compatible to later on over-approximate those dependencies by making the code & data asset specific APIs automatically add them (if we have evidence users have trouble managing dependencies precisely) - which I'm not entirely convinced we should do. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm concerned about some of these changes, and have voiced my concerns.
I'd strongly prefer baking in knowledge about files into the json encoding for all the reasons mentioned above.
Please add a breaking change to pkgs/native_assets_cli/lib/src/model/build_output_CHANGELOG.md
about link outputs now needing to contain their file dependencies from LinkConfig.encodedAssets
(We don't have a link_config_CHANGELOG maybe rename to hook_config_CHANGELOG.) And add that we have no mitigation for people using older native_assets_cli's and link hooks because we expect no users. (Or alternatively as mitigation that we retract all older native_asstes_cli package versions...)
Also please add big screaming capslock (or not big scream capslock) documentation that people need to track their own dependencies in LinkConfig.codeAssets
and LinkConfig.dataAssets
.
Have added two concerns I'm aware of to the agenda of the upcoming interop meeting:
I can wait until merging the CL until we have discussed it in the next interop meeting. |
…ment to `LinkConfig.{codeAssets,dataAssets}.all`
Done
Done |
It seemed we reached a decision in our interop meeting to ship it like this. @dcharkes There's two comments I left open: Regarding naming |
We did, thanks @HosseinYousefi for being the tie breaker! 🙏 😄 Ship it! 🚀
I don't see any replies to those comments. If I remember correctly, you wanted to hide And the const inside the class or not, it's private so it doesn't matter. (Also I don't see any replies on that comment.) LGTM the way it is. |
Just to double check: did all the link hooks get updated in this PR to add |
I think so. Not many linkers actually need to call
Other tests/linkers just forward all assets - which doesn't require re-running if content changes. |
Thanks, @dcharkes ! |
@@ -1,6 +1,15 @@ | |||
## 1.6.0 | |||
|
|||
- No changes, but rev version due to BuildConfig change. | |||
- **Breaking change** Link hooks now have to explicitly add any file contents |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should have been renamed to hook_output_CHANGELOG.md
not config.
@@ -1,6 +1,15 @@ | |||
## 1.6.0 | |||
|
|||
- No changes, but rev version due to BuildConfig change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this line could have been removed.
This is the first PR that will increase the flexibility of the Dart Build system as well as makes it more layered. The main points being addressed in this PR are:
A bundling tool has to specify which asset types it supports. So we make
supportedAssetTypes
required everywhere.=> Across various layers in the code base we make this required
=> We remove a baked-in
[CodeAsset.type]
fallback in various places (Making this explicit increases size of LOCs in this CL due to the many tests)The core building infrastructure in
pkg/native_assets_builder/lib/*
no longer knows anything aboutCodeAsset
orDataAsset
. Instead it only knows aboutEncodedAsset
which represents the type of an asset and it's json encoding.=> The core still verifies certain protocol consistency (e.g. that a hook only emitted asset types that the config allows)
=> The higher levels pass the
supportedAssetTypes
.=> The higher levels now also pass
buildValidator
/linkValidator
that validate per-asset related things (e.g. the id of a data asset to start with the package name) for a given package.=> The higher levels now also pass a
applicationAssetValidator
that validates consistency of assets across all packages (e.g. uniqueness of dylib filenames across all dependencies)=> This centralizes the logic about per-asset-type information in the bundling tool (i.e. user of
package:native_assets_builder
).=> Bundling tools now have to expand
CodeAsset
s in dry-run to all architectures.The bundling tool (e.g.
flutter build
) doesn't have to implement validation logic itself, instead it just import the "bundling logic" (which includes validation logic) for various asset types .All the validation code now simply returns
ValidationError
which is a list of errors. If the list is empty then there were no errors.=> This removes redundancy between a
success
boolean and the list of errors (it was successsfull iff there were no errors)=> This simplifies code that aggregates errors from different sources.
Moves the
fromJson
/toJson
towards be purely read/write a json schema and are not responsible for anything else.=> The validation routines are responsible for validating semantics.
The hook writer API now changes to work on per-asset-type specific extension methods. For example what used to be
output.addAsset(DataAsset(...));
is nowoutput.dataAssets.add(DataAsset(...));
The
BuildOutput.encodedAsset
getter is (with this PR) temporarily visible to hook writers, which could use it like this:output.encodedAssets.add(DataAsset(...).encode());
but a following refactoring that increases the flexibility of the protocol more will also get rid of the user-visibleencodedAsset
getter (if we choose so). So this is only temporarily exposed.hook/link.dart
scripts have to mark any input files it's output depends on in theLinkOutput.addDependency
. If the set of inputs is the same but one input's file changed then ahook/link.dart
=> A
hook/link.dart
should - just like ahook/build.dart
add dependencies iff changing the input file contents should cause a re-run of the hook=> Before this was somewhat hard-coded into the
package:native_assets_builder
.We start separating the per-asset-type specific code:
native_assets_cli/lib/src/code_assets/*
&native_assets_cli/lib/src/data_assets/*
.=> More refactorings will come that will make the core
native_assets_cli/lib/src/*
no longer have any per-asset-type specific things in them=> Eventually the
lib/src/data_assets
andlib/src/code_assets
could become their own packages (if we choose to do so) - similar to how flutter-specific assets can live in other packages.