Re-design dependency resolution engine (again) #255

Southclaws · 2018-09-13T20:48:54Z

Recently, both Go and Yarn have had interesting steps in the world of dependency trees:

https://github.com/golang/go/wiki/Modules
Yarn Plug'n'Play: Getting rid of node_modules yarnpkg/rfcs#101
Both of these methods share a common idea: a central location that holds a single copy of each dependency.

sampctl's dependency system was modelled after dep, which made use of Go's vendor directory for reading dependencies. It was also an attempt to step away from a monolithic directory of .inc files from the days of Pawno.

This has some advantages and some disadvantages. On one hand, all the dependencies of a package exist in a location close to the package itself, which means the user can see the files in their IDE, open them up to read the code and even make ad-hoc changes for debugging and testing purposes. It also means there is no chance of dependencies colliding or conflicting in any way, since each project has its own isolated copies. The downside is that there are multiple copies of the same dependency all over the filesystem - especially if you write lots of libraries! Before 1.8, it also meant downloading dependencies repeatedly. The dependency engine changes in 1.8+ added a cache in ~/.samp/packages which removed a lot of this repeated downloading, but it still means making copies in each project.

Sound familiar? It's the exact same problem described in the Yarn Plug'n'Play whitepaper.

I have considered doing the exact same thing that PnP and Go 1.11 did, right back to when packages were first introduced into sampctl (in fact, I'm not sure what stopped me). As it stands, I can see two possible implementations, each optimised for a different resource: disk space or time.

Option 1: Store One Copy

All dependencies would be stored in ~/.samp/packages/ in username/repo/ directories. Each package would be a Git repository with full history depth cloned.

When an instance of sampctl runs either ensure or build, the dependency tree of the package will be walked recursively, visiting each dependency (where "dependency" refers to its repository in ~/.samp/packages/). As each dependency is visited, it is checked out to the necessary version, specified by :, # or @. If the version has no constraint, it would be pulled for the latest changes.

This method means that there is global state. After an ensure or build, the global packages directory would be left with each repo checked out to a specific commit. sampctl could optionally check all dependencies back out to their HEAD state after any operation, but this could be time consuming.

The benefit of this option is that there would only ever be a single copy of any dependency. That reduces disk usage, at the cost of extra time spent performing git checkouts.
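As a rough sketch of the walk described above (not sampctl's actual code), the shared packages directory can be modelled as a store that is checked out, pulled and queried for dependencies. The `Store` interface, `Dependency` type and `memStore` stand-in are all hypothetical names made up for illustration, and the sketch visits each repository only once, ignoring what should happen when two dependents want different versions of the same repository.

```go
package main

import "fmt"

// Dependency is a hypothetical, already-parsed dependency: a repository plus
// an optional version constraint (empty means "no constraint").
type Dependency struct {
	Repo    string
	Version string
}

// Store is an assumed abstraction over the single shared copy of each
// repository in ~/.samp/packages/.
type Store interface {
	Checkout(repo, version string) error    // check out a tag, commit or branch
	Pull(repo string) error                 // pull the latest changes
	Deps(repo string) ([]Dependency, error) // read the package definition
}

// Ensure walks the dependency tree recursively. seen guards against cycles
// and repeat visits; the first visit to a repository wins (a simplification).
func Ensure(s Store, deps []Dependency, seen map[string]bool) error {
	for _, d := range deps {
		if seen[d.Repo] {
			continue
		}
		seen[d.Repo] = true

		var err error
		if d.Version == "" {
			err = s.Pull(d.Repo)
		} else {
			err = s.Checkout(d.Repo, d.Version)
		}
		if err != nil {
			return fmt.Errorf("%s: %w", d.Repo, err)
		}

		children, err := s.Deps(d.Repo)
		if err != nil {
			return fmt.Errorf("%s: %w", d.Repo, err)
		}
		if err := Ensure(s, children, seen); err != nil {
			return err
		}
	}
	return nil
}

// memStore is an in-memory stand-in for the real directory of Git
// repositories. Its state map is the "global state" left behind after a run.
type memStore struct {
	defs  map[string][]Dependency
	state map[string]string
}

func (m *memStore) Checkout(repo, version string) error { m.state[repo] = version; return nil }
func (m *memStore) Pull(repo string) error              { m.state[repo] = "HEAD"; return nil }
func (m *memStore) Deps(repo string) ([]Dependency, error) {
	return m.defs[repo], nil
}

func main() {
	s := &memStore{
		defs: map[string][]Dependency{
			"user/a": {{Repo: "user/b", Version: "1.0.0"}},
		},
		state: map[string]string{},
	}
	if err := Ensure(s, []Dependency{{Repo: "user/a"}}, map[string]bool{}); err != nil {
		fmt.Println("ensure failed:", err)
		return
	}
	fmt.Println(s.state) // what each shared repository is left checked out at
}
```

The `state` map makes the global-state concern concrete: whatever the last run checked out is what every other project sees until the next run.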

Option 2: Store Every Copy

All dependencies would be stored in ~/.samp/packages/ in username/repo/version/ directories. Each package would be a Git repository with only a single specific point of history checked out.

When an instance of sampctl runs either ensure or build, the dependency tree of the package will be walked recursively, visiting each dependency. If the dependency exists at username/repo/version/, no further work is necessary. If it does not exist, it is cloned and checked out to the necessary version, then has its .git directory deleted.
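The "exists or fetch, then drop .git" step could look something like the sketch below. `EnsureVersion` and its `fetch` callback are assumptions of this sketch: the callback stands in for "clone and check out the necessary version into this directory", and `fakeFetch` only imitates a clone so the example runs without Git or a network.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// EnsureVersion makes sure dir (a username/repo/version/ directory) exists.
// It reports whether it had to fetch. fetch is a hypothetical callback that
// clones and checks out the wanted version into dir.
func EnsureVersion(dir string, fetch func(dir string) error) (bool, error) {
	if _, err := os.Stat(dir); err == nil {
		return false, nil // already present: no further work is necessary
	} else if !os.IsNotExist(err) {
		return false, err
	}
	if err := fetch(dir); err != nil {
		os.RemoveAll(dir) // avoid leaving a half-populated directory behind
		return false, err
	}
	// The directory is pinned to one version, so its history is not needed.
	if err := os.RemoveAll(filepath.Join(dir, ".git")); err != nil {
		return false, err
	}
	return true, nil
}

// fakeFetch imitates a clone by creating a .git directory and one source file.
func fakeFetch(dir string) error {
	if err := os.MkdirAll(filepath.Join(dir, ".git"), 0o755); err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(dir, "lib.inc"), []byte("// pawn code\n"), 0o644)
}

// demo ensures the same version twice inside a temporary packages directory
// and reports whether each call fetched and whether .git survived.
func demo() (bool, bool, bool, error) {
	root, err := os.MkdirTemp("", "packages")
	if err != nil {
		return false, false, false, err
	}
	defer os.RemoveAll(root)

	dir := filepath.Join(root, "user", "repo", "1.0.0")
	first, err := EnsureVersion(dir, fakeFetch)
	if err != nil {
		return false, false, false, err
	}
	second, err := EnsureVersion(dir, fakeFetch)
	if err != nil {
		return false, false, false, err
	}
	_, statErr := os.Stat(filepath.Join(dir, ".git"))
	return first, second, statErr == nil, nil
}

func main() {
	first, second, gitLeft, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("first call fetched:", first)
	fmt.Println("second call fetched:", second)
	fmt.Println(".git left behind:", gitLeft)
}
```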

Here, version is effectively just the text that follows the :, # or @ character in a dependency string. It would identify that exact version of the dependency by its tag, commit hash or branch. This means any number of copies of the same package could exist, each at a different version. All sampctl needs to do is pass the full path to that directory to pawncc.
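A minimal sketch of turning a dependency string into a username/repo/version/ path follows. The `Dep` type, `ParseDep` and `Dir` are hypothetical names, not sampctl's API, and the "latest" directory used when no version is given is purely an assumption for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Dep is a hypothetical parsed dependency string.
type Dep struct {
	User, Repo string
	Kind       byte   // ':', '#' or '@'; 0 when no version was given
	Version    string // the text after the separator
}

// ParseDep splits "user/repo", optionally followed by ':', '#' or '@' and a
// version, into its parts.
func ParseDep(s string) (Dep, error) {
	var d Dep
	path := s
	if i := strings.IndexAny(s, ":#@"); i >= 0 {
		path = s[:i]
		d.Kind = s[i]
		d.Version = s[i+1:]
		if d.Version == "" {
			return Dep{}, fmt.Errorf("empty version in %q", s)
		}
	}
	parts := strings.Split(path, "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return Dep{}, fmt.Errorf("expected user/repo, got %q", path)
	}
	d.User, d.Repo = parts[0], parts[1]
	return d, nil
}

// Dir returns the directory for this exact version under root. Using
// "latest" for unversioned dependencies is an assumption of this sketch.
func (d Dep) Dir(root string) string {
	v := d.Version
	if v == "" {
		v = "latest"
	}
	return filepath.Join(root, d.User, d.Repo, v)
}

func main() {
	for _, s := range []string{"user/repo:1.2.3", "user/repo@dev", "user/repo"} {
		d, err := ParseDep(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(s, "->", d.Dir("packages"))
	}
}
```

A real implementation would also need to decide how to handle version text that contains path separators, such as a branch named feature/x.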

One detail that could improve performance with this method would be the use of a cache. Similar to the existing cache, this would simply be a copy of every package, cloned with full history depth. So when a package depends on an old version of a package that is already cached, sampctl can simply clone from the local copy instead of from the internet.
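Choosing where to clone from could be as small as the sketch below, which relies on the fact that git can clone from a local path as well as a URL. `CloneSource`, the cache layout and the example remote URL are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// CloneSource returns where user/repo should be cloned from: the
// full-history copy under cacheRoot if one exists, otherwise remoteURL.
func CloneSource(cacheRoot, user, repo, remoteURL string) string {
	local := filepath.Join(cacheRoot, user, repo)
	if info, err := os.Stat(filepath.Join(local, ".git")); err == nil && info.IsDir() {
		return local
	}
	return remoteURL
}

// demo builds a temporary cache holding one repository and reports whether
// a cached and an uncached repository resolve to local and remote sources.
func demo() (bool, bool, error) {
	cache, err := os.MkdirTemp("", "cache")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(cache)

	if err := os.MkdirAll(filepath.Join(cache, "user", "cached", ".git"), 0o755); err != nil {
		return false, false, err
	}

	hitRemote := "https://example.com/user/cached"
	missRemote := "https://example.com/user/missing"
	hit := CloneSource(cache, "user", "cached", hitRemote)
	miss := CloneSource(cache, "user", "missing", missRemote)

	hitIsLocal := hit == filepath.Join(cache, "user", "cached")
	missIsRemote := miss == missRemote
	return hitIsLocal, missIsRemote, nil
}

func main() {
	hitIsLocal, missIsRemote, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println("cached repo cloned locally:", hitIsLocal)
	fmt.Println("uncached repo cloned from remote:", missIsRemote)
}
```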

The downside to this approach is the directory could grow quite large if there are lots of packages that depend on many different versions of packages.

There is a difficulty with both of these approaches that also exists in the current implementation. Consider a package that removes a dependency in a release, so its tree changes. In the current implementation, if another package depended on an older version of that package, the resolution would fail: only the latest copy of the package definition is used, and that copy is missing the dependency that the older version needs.

The reason for this is that most packages were created before sampctl existed. As a result, many older versions of these dependencies do not have package definitions. This poses a problem for packages that either 1. depend on other packages or 2. are plugins that need to tell sampctl where to find binaries.

The quick fix for this was to always use the latest package definition file, regardless of which version of the dependency a package actually uses. The upside was simpler code that shipped quicker; the downside is this problem (read: bug) that, luckily, hasn't occurred anywhere yet (as far as I am aware).

And that problem would be present in these new proposals too. The way around it would be to attempt to use a package definition file from whatever version is specified and, if it does not exist, either use the oldest one or the newest one - along with all the technical complexities of pulling that off.
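The fallback itself is simple to sketch, even if fetching definitions from arbitrary versions is not. In the sketch below, `Definition`, `Release` and `PickDefinition` are hypothetical names, and the history is assumed to be already available in memory, ordered oldest to newest, with a nil definition wherever a version has no package definition file.

```go
package main

import "fmt"

// Definition is a hypothetical, pared-down package definition.
type Definition struct {
	Dependencies []string
}

// Release pairs a version with the definition found at that version.
// Def is nil when that version has no package definition file.
type Release struct {
	Version string
	Def     *Definition
}

// PickDefinition prefers the definition at the wanted version. If that
// version has none, it falls back to the newest or the oldest definition
// that exists. history must be ordered oldest to newest. It returns the
// definition, the version it came from, and whether anything was found.
func PickDefinition(history []Release, want string, preferNewest bool) (*Definition, string, bool) {
	for _, r := range history {
		if r.Version == want && r.Def != nil {
			return r.Def, r.Version, true
		}
	}
	if preferNewest {
		for i := len(history) - 1; i >= 0; i-- {
			if history[i].Def != nil {
				return history[i].Def, history[i].Version, true
			}
		}
		return nil, "", false
	}
	for _, r := range history {
		if r.Def != nil {
			return r.Def, r.Version, true
		}
	}
	return nil, "", false
}

// sampleHistory models a package that predates sampctl at 1.0.0, gains a
// definition with one dependency at 2.0.0, and drops that dependency at 3.0.0.
func sampleHistory() []Release {
	return []Release{
		{Version: "1.0.0", Def: nil},
		{Version: "2.0.0", Def: &Definition{Dependencies: []string{"user/dep"}}},
		{Version: "3.0.0", Def: &Definition{}},
	}
}

func main() {
	h := sampleHistory()
	for _, want := range []string{"2.0.0", "1.0.0"} {
		def, from, ok := PickDefinition(h, want, true)
		if !ok {
			fmt.Println(want, "-> no definition available")
			continue
		}
		fmt.Println(want, "-> definition from", from, "with dependencies", def.Dependencies)
	}
}
```

With this history, asking for 2.0.0 keeps the dependency that always using the latest definition would lose, while 1.0.0 still has to borrow a definition from another version.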

Anyway, this probably won't get implemented for a long time - if ever (the current solution works fine) - but it might still result in some interesting discussion, so I thought I'd write it up.

ADRFranklin added the v2 label May 9, 2020
