-
-
Notifications
You must be signed in to change notification settings - Fork 866
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
expose a list of all assets that are safe to set immutable Cache-Control
#1371
Comments
This does seem like a worthwhile idea. It is also connected to another issue that is chunkGroup configuration in optimization. Because if that was set up the empty string would be "shared", "framework" etc. If you look at next.js they use a chunkGroups config like this. When we change this it will be backwards-incompatible also, but it has to be done. I have some more big changes going on that also needs a major release. But feel free to come up with some code that solves this 😀 |
Oh cool, I'm not across if/how any other tools/frameworks approach this, do you have a link/examples?
An open razzle issue? could you point me to which one so I can get more context 😄 I do think that potential one way of solving this is to more strongly define the shape/schema of the existing |
And https://github.com/vercel/next.js/blob/canary/packages/next/build/webpack-config.ts#L378 Not sure about how they do the manifest. |
Take a look at #1377 now, added a new example :) |
@fivethreeo I haven't managed to spend any more time on this issue specifically 😅, I'll definitely try and spend some time trying out the v4 prerelease. If you think it's ready for it I'll hopefully aim to try it out over the next couple of days. I'm not sure if it's of much interest but I've made what I'm working on public now here. I'm pretty keen for v4 because it hopefully means I can remove as many of the "plugin" overrides I have to set things here, particularly for typescript. The cacheable assets stuff is here. |
All files use contenthash now. The copied files I would say is a bad practice when we have a bundler. |
I'm not sure I understand what you mean by "The copied files I would say is a bad practice when we have a bundler". Currently the behavior is that if you put any files in the top level build/
public/
robots.txt
manifest.json
package.json They get coped into the static assets when you build/
public/
robots.txt
manifest.json
static/
...
public/
robots.txt
manifest.json
package.json I was thinking it might be desireable to maintain a list of all the assets that were copied-in during build so they're specifically targettable seperately for applying cache-control to. I think the argument against it (that I can think of) would be that there might not be a need for the user to distinguish between files razzle has copied-in during build and ones that might have been manually put in there outside the |
I think public should only contain robots.txt and favicon.ico and they won’t be versioned by hashes. Anything else should be bundled by webpack. Any larger favicons should be bundled. |
Maybe, but even just if you want to maintain "plug and play" compatibility with a default I vaugely remember there being reasons why the Do any of the razzle example projects implement PWA (and/or service worker) support? |
Maybe less relevant but some other things that I've put in the |
Trying to look for examples of if/why/when webmanifests should be seperate to the bundler: There's a comment in that post that links out to w3c/manifest#446 (comment) |
Yes, downloadable files should go there. Hm, but how do we add those files to the assets.json? Any ideas? 😀Should we make webpack find them and bundle as is? Modifying the assets.json seems hackish. |
I dont think there is a PWA example. But if they need a consistent name. That needs to be handled by webpack. |
I will replace assets plugin with manifest plugin so we can adapt the output. |
Added a new assets-manifest with all files 1c6e916 any suggestions on improvements? |
Did a canary release now :) |
I see manifest plugin is not really maintained. The best would be to do our own. But I don't currently know anyone but (maybe) me or the webpack folks that can do that. |
Added to canary branch now. Sort of a hack for now. But it works and is a start that can be improved upon. |
After some consideration I will not add this to core. But here is the code I came up with: new ManifestPlugin({
fileName: path.join(paths.appBuild, 'assets.json'),
writeToFileEmit: true,
generate: (seed, files) => {
const entrypoints = new Set();
const noChunkFiles = new Set();
const longTermCacheFiles = new Set();
files.forEach(file => {
if (file.isChunk) {
const groups = (
(file.chunk || {})._groups || []
).forEach(group => entrypoints.add(group));
} else {
noChunkFiles.add(file);
}
if (!webpackOptions.fileLoaderExclude.some(re=>re.test(file.path))) {
let fileHasHash = /\[(build|content)?hash/.test(
typeof webpackOptions.fileLoaderOutputName == 'function' ?
webpackOptions.fileLoaderOutputName(file) : webpackOptions.fileLoaderOutputName);
if (fileHasHash) longTermCacheFiles.add(file);
} else if (webpackOptions.urlLoaderTest.some(re=>re.test(file.path))) {
let urlHasHash = /\[(build|content)?hash/.test(
typeof webpackOptions.urlLoaderOutputName == 'function' ?
webpackOptions.urlLoaderOutputName(file) : webpackOptions.urlLoaderOutputName);
if (urlHasHash) longTermCacheFiles.add(file);
} else if (webpackOptions.cssTest.some(re=>re.test(file.path))) {
let cssHasHash = /\[(build|content)?hash/.test(
typeof webpackOptions.cssOutputFilename == 'function' ?
webpackOptions.cssOutputFilename(file) : webpackOptions.cssOutputFilename);
if (cssHasHash) longTermCacheFiles.add(file);
} else if (webpackOptions.jsTest.some(re=>re.test(file.path))) {
let jsHasHash = /\[(build|content)?hash/.test(
typeof webpackOptions.jsOutputFilename == 'function' ?
webpackOptions.jsOutputFilename(file) : webpackOptions.jsOutputFilename);
if (jsHasHash) longTermCacheFiles.add(file);
}
});
const entries = [...entrypoints];
const entryArrayManifest = entries.reduce((acc, entry) => {
const name =
(entry.options || {}).name ||
(entry.runtimeChunk || {}).name ||
entry.id;
const allFiles = []
.concat(
...(entry.chunks || []).map(chunk =>
chunk.files.map(path => config.output.publicPath + path)
)
)
.filter(Boolean);
const filesByType = allFiles.reduce((types, file) => {
const fileType = file.slice(file.lastIndexOf('.') + 1);
types[fileType] = types[fileType] || [];
types[fileType].push(file);
return types;
}, {});
const chunkIds = [].concat(
...(entry.chunks || []).map(chunk => chunk.ids)
);
return name
? {
...acc,
[name]: { ...filesByType, chunks: chunkIds },
}
: acc;
}, seed);
entryArrayManifest['noentry'] = [...noChunkFiles]
.map(file => file.path)
.reduce((types, file) => {
const fileType = file.slice(file.lastIndexOf('.') + 1);
types[fileType] = types[fileType] || [];
types[fileType].push(file);
return types;
}, {});
entryArrayManifest['cacheable'] = [...longTermCacheFiles]
.map(file => file.path);
return entryArrayManifest;
},
}) |
But I learned lots about assets ;) |
Sorry I haven't been able to spend much time with this in a while but that's looks neat. Having a look now at upgrading my stuff to the latest stable razzle version and trying out your suggestion as a custom plugin. It looks pretty good but i'm a bit confused about this: let fileHasHash = /\[(build|content)?hash/.test(
typeof webpackOptions.fileLoaderOutputName == 'function'
? webpackOptions.fileLoaderOutputName(file)
: webpackOptions.fileLoaderOutputName);
if (fileHasHash) longTermCacheFiles.add(file); What is |
Only in razzle canary |
Neat, I've made some progress now with getting a branch in my project working on the canary branch. It's not quite working, at the moment my issues mainly seem to be to do with configuring the babel loader to recognise sibling packages. I can build but then get "cannot find module" issues when I try to run it. This probably isn't too interesting/useful but: https://github.com/bootleg-rust/sites/pull/2/files from memory I originally borrowed the config from #664
|
and set NODE_PATH=../ or something |
Maybe we should change some things here: https://github.com/jaredpalmer/razzle/blob/canary/packages/razzle/config/modules.js |
Ok so digging into it a little bit I've just realised that the issue is actually just caused The only thing I was using it for was for detecting what assets were cacheable so looks like I should be able to give the new |
OK! I've made a custom plugin in my project that seems to work well enough for my use case for the time being. The code you came up with was very helpful having that as a starting point. I've changed it a fair bit but FYI I think there's an issue with it where it thinks everything is processed by In case it's useful to share here: function modifyWebpackConfig({
env: { target, dev },
webpackConfig,
webpackObject,
options: { pluginOptions, razzleOptions, webpackOptions },
paths,
}) {
// TODO: allow passing in extra file categorizers with `pluginOptions`
const fileCategorizers = [
{
test: webpackOptions.urlLoaderTest,
outputName: webpackOptions.urlLoaderOutputName,
},
{
test: webpackOptions.cssTest,
outputName: webpackOptions.cssOutputFilename,
},
{
test: webpackOptions.jsTest,
outputName: webpackOptions.jsOutputFilename,
},
{
exclude: webpackOptions.fileLoaderExclude,
outputName: webpackOptions.fileLoaderOutputName,
},
];
const fileName = path.join(paths.appBuild, "cacheable-assets.json");
const assetPlugin = new WebpackManifestPlugin({
fileName,
writeToFileEmit: true,
generate: (seed, files) => {
const notHashedFiles = new Set();
const hashedFiles = new Set();
const setFileAs = (file, { containsHash }) => {
if (containsHash) {
hashedFiles.add(file);
} else {
notHashedFiles.add(file);
}
};
files.forEach((file) => {
if (file.name.startsWith("..")) {
// Files that start with ".." will live outside of the public/
// folder and therefore can't/shouldn't be accessed.
return;
}
const fileCategorized = fileCategorizers.some(
({ test, exclude, outputName }) => {
const passesTest =
test != null ? fileMatchesAnyRegexp(file, test) : true;
const passesExclude =
exclude != null ? !fileMatchesAnyRegexp(file, exclude) : true;
const fileMatches =
passesTest &&
passesExclude &&
fileMatchesTemplate(file.path, outputName);
if (fileMatches) {
const containsHash = webpackLoaderOutputContainsHash(
outputName,
file,
);
setFileAs(file, { containsHash });
}
return fileMatches;
},
);
if (!fileCategorized) {
// TODO: allow "strict" vs "lazy" mode here where we can only use
// regex on the filename to guess if a file contains a hash in it.
setFileAs(file, { containsHash: false });
}
});
const mutable = [...notHashedFiles].map((file) => file.path);
const immutable = [...hashedFiles].map((file) => file.path);
return {
mutable,
immutable,
};
},
});
if (target === "web") {
webpackConfig.plugins.push(assetPlugin);
}
if (target === "node") {
// NOTE: adding multiple DefinePlugin's causes issues
// so we have to find and edit the existing one.
const definePlugin = webpackConfig.plugins.find(
(p) => p.constructor.name === "DefinePlugin",
);
definePlugin.definitions[
"process.env.RAZZLE_PLUGIN_CACHEABLE_ASSETS"
] = JSON.stringify(fileName);
}
return webpackConfig;
}
const cacheableAssetsPlugin = {
modifyWebpackConfig,
}; Or can look at it here https://github.com/bootleg-rust/sites/pull/2/files#diff-59ee436c0396a1f925f067b7e7cbcdee354003236a279e0a87cf8831c7f587e3 |
Ah right yes, thanks. I'm still getting used to the new plugin hooks, I like it 🎉! I think the only main problem I'm still having that I haven't been able to resolve is that for some reason the Any ideas what it might be? or is it worth putting this on a different github issue somewhere? |
Also use modifyPaths for custom paths aswell so it can be composed. Does not work how? May be a new issue .. :) |
Nevermind, the sass loader not working wasn't anythings specific with razzle. something to do with a version mismatch or something with the a version of |
Added hooks for asset handling, closing 😀 |
I see you added a externals plugin. I still need to fix that for client/server/serverless. Got any ideas for that in canary? A bit stuck. |
The hooks you use now that is. |
I've definitely found it super convenient (mainly on the server) to default to bundle all Having said that I haven't had the need for any using/testing how it works with any native/platform-specific dependencies (I have a feeling things like imagemagick would have issues) My general thought process with the "externals" plugin I made is: const externalsPluginOptions = {
// By default the NodeJS process uses the externals function razzle has and relies on `node_modules` to still be available
// after performing a razzle build. Setting this to `true` would mean that all dependencies attempt to get bundled into
// the build/ output unless explicitly specified as an external
resetNodeExternals: false,
// This probably wouldn't actually be required because the browser runtime
// doesn't have externals by default (i'm assuming)
resetWebExternals: true,
webExternals: [
// add externals for the web runtime here
],
nodeExternals: [
// add externals for the node runtime here
],
}; To be honest before settling on a "proper" configuration API for this (particularly if it was going to be in razzle core) I'd probably have to read the webpack docs for At the moment i'm really only using it to reset externals to be empty so that I get everything bundled into an easily portable app that doesn't rely on node_modules at runtime |
🚀 Feature request
I've been investigating the best way to add a middleware to my razzle server to detect whenever it serves a file built by razzle that includes a webpack
[hash:8]
or[contenthash:8]
in the filename. I first discussed some of the problems I'm running into here #1368 (comment)I would like razzle to generate and expose the list of files/assets safe to be considered "immutable" (for the purposes of setting
Cache-Control
headers in responses) in a way that is easy to consume without extra transformation of the chunks.json and/or assets.json filesNOTE: when setting long-lived & immutable cache-control responses I want to avoid doing any kind of "approximation" on whether a file can be considered immutable (AKA regex to detect a hash in the filename) because a false positive can lead to a file being immutably cached for a long time and wouldn't be fixable by a server-side cache invalidation, which can be a very painful problem to work around.
Current Behavior
TL;DR of why trying to use the currently exposed json files is difficult:
chunks.json
andassets.json
. chunks.json includes sourcemap files and assets.json has files like png/fonts etc which chunks.json doesn't.(assets.json).client
(eg:"client": { "js": "/static/js/bundle.6fc534aa.js" }
), assets.json group all other assets under an empty string (eg:"": { "js": "/static/js/0.cb47cee9.chunk.js" }
)."client": { "css": ["filename.css"] }
), if there's only one file file present in assets.json it will instead just be the single string (eg:"client": { "css": "filename.css" }
)."json": "/../chunks.json"
which is not something that I think should be in there (i'm not sure if this is a bug or not) but I have to manually strip this out when making the list of files that can be given long lived cache-Control response headers.chunks: ["1", "2", "3"]
array to the chunks.json is somewhat annoying because it means I have to do extra work to filter out the(chunks.json).client.chunks
because it doesn't contain an array of files like(chunks.json).client.css
and(chunks.json).client.js
etc.client
chunk weren't even appearing in thechunks.json
file. I made/suggested the change to change it to using the chunk number(s) as the key because at least they then appear in the file. The downside of this is that nowchunks.json
andassets.json
diverse further in their schema when dealing with chunks that aren't the primary named chunk ("client": {/* blah */ }
).using assets.json and chunks.json
Currently using the assets.json and chunks.json this is what I've had to do roughly so far
I haven't:
"chunks": ["1", "2", "3"]
and"json": "/../chunks.json"
Desired Behavior
There are would probably be many ways to do this but the primary thing I want is just a way to load an array of all cacheable/immutable assets generated by razzle build. the result could look something like this:
Suggested Solution
I haven't fully investigated what a good solution would be but after trying to put together this list of "cacheable assets" at runtime using the
assets.json
andchunks.json
I'm pretty convinced that at a minimum the best way to accomplish this would would be at build-time with some kind of webpack plugin and bypass the inconsistencies of those two files.For my purposes I'll probably initially start looking into how to accomplish this with a plugin rather than with at runtime as i've been doing, but I think there'd be significant value to have this baked-in to razzle by default. Being able to set long-lived cache-control on hashed files is largely why they get hashed to begin with, so exposing a list of all those files seems appropriate.
Who does this impact? Who is this for?
Any users who want to set appropriate long-lived & immutable cache-control response headers for files generated & hashed by razzle.
Describe alternatives you've considered
chunks.json
andassets.json
(seems error prone and fragile).require(process.env.RAZZLE_CACHING_MANIFEST!)
. ()Additional context
I'd be willing to help/contribute towards making this change but I might need a bit of a point in the right direction (and of course whether or not this is a change that would be accepted/welcomed).
Also a thought, having something like this might make it easier to have some tests/stability around ensuring that things are using
[contenthash:8]
instead of[hash:8]
(build hash) if/when they can #1331The text was updated successfully, but these errors were encountered: