-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Potential optimizations #4
Conversation
BREAKING CHANGE: as it is a major node bump, requires a breaking change
`fs.readdirSync(p, { withFileTypes: true })` - return the Dirent object, that conveniently has isDirectory, isFile, isSymbolic link methods, which in return allow us not to do additional lstat when deciding which class to create. In some cases, File class do require stats to be calculated, which will happen as soon as needed and will be cached.
…cache, as more likely node_modules stuff shouldn't update (if we want it to be performant)
@@ -226,7 +298,7 @@ function fileStatsEqual(a, b) { | |||
// Note that stats.mtimeMs is only available as of Node 9 | |||
return ( | |||
a.ino === b.ino && | |||
a.mtime.getTime() === b.mtime.getTime() && | |||
// a.mtime.getTime() === b.mtime.getTime() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is/was this important?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NullVoxPopuli in real life it's really rare case where file is changed without it's changed size
} | ||
|
||
const now = Date.now(); | ||
const isCacheValid = (this._cacheBuildAt + CACHE_TTL) > now; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how does this affect the dev-time reload of linked packages that are maybe ctrl+s
'd rapidly?
Like, does this mean that when working on a library, we can only receive updates at quickest, every 5s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now yes, it was more of an experiment and solution to our problem.. with this - we have issues in tests for the merge-trees, so need to figure out some other solution.. but to say, it does add a big speed boost, so caching is a must.. need to think how to pass it probably from the consuming packages.
throw new Error("File has unexpected type: " + p); | ||
} | ||
} else { | ||
let stats = fs.lstatSync(p); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any difference in perf in sync vs async?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
likely yes (for SSD drives, because there is more file threads available), but this package is sync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but for stat it's not clear - nodejs/node#38006
I am trying to find ways if possible to improve performance in our monorepo, and debugging running ember s got me to the fs-updater, merge-trees and broccoli-merge-trees.
Here are some data for the context:
Show screenshot of debug and functions time
From these screenshots, couple things that are interesting:
mergeDirectories
(hence my investigation)This change that I've identified should potentially help minimize
fs.stat
calls, and on our monorepo, it does seem like it is a lot, here are some numbers:dirent
number shows how many times instead of fs.stat we were able to derive information required to determine directory/file/symbolic link from the additional info from thefs.readdir
. This is particularly important on the systems with software that is actively monitor/backup all the changes (in enterprises).I'd greatly appreciate any feedback/hints/help to figure out if there is more that I can look at to get much better results and optimizations in general.
I know there was this effort broccolijs/broccoli-merge-trees#43 which potentially can lead to quite a good improvement on bigger projects as well.
Reference to other PRs that might improve performance in general:
broccolijs/node-merge-trees#6
broccolijs/broccoli-persistent-filter#188
https://github.com/broccolijs/fs-merger