Spec for opam multi packages repo

Thomas Gazagnaire edited this page Jun 9, 2020 · 7 revisions

It has always been very common to define multiple opam packages in the same Git repository. With dune, this is becoming the norm: dune does not distinguish between internal packages (those defined inside the same repository) and external ones. It is now very common to move packages around, to consolidate a collection of repositories into a big mono-repo, to vendor dependencies, etc.

opam has always partially supported that workflow: you could add p1.opam and p2.opam at the root of the repository to define the packages p1 and p2. However, most opam commands only see top-level opam packages, which makes it difficult to compose repositories transparently.

This feature aims to improve the management of repositories with multiple packages. This will have the following benefits:

  • users will be able to organise their repository tree as they want (without having to put all the opam files at the root); the only constraint is to have at most one opam file per package (e.g. it won't work if multiple versions of the same package exist in the repository)

  • which in turn will lead to better composability and flexibility: it won't make a difference whether a big project is split into multiple repositories or organised as a mono-repo.

  • better consistency and compatibility with dune, which doesn't distinguish between internal and external packages (as opam 2.0 does) and which allows projects to vendor their external packages to form a monorepo.

Status

  • Partial support already in 2.0
  • Full support committed for inclusion in opam 2.1.0;
  • opam 2.1.0~alpha implementation mostly focuses on the pinning use-case;
  • Refining the workflow for opam 2.1.0~beta, so the implementation is still subject to change.

Use cases

  • use: pin and install a subset of the packages from a remote repository, including through the pin-depends field
  • dev: clone the repo locally, install all the dependencies and build the local packages manually (e.g. dune build)
  • publish: publish a subset of the packages to opam-repository

Pin and install a subset of packages

Given a source repository repo:

  • U1: Users should be able to list the existing packages in repo
    • U1.1: it should be possible to get machine-readable output to ease scripting
    • U1.2: it should be possible to exclude paths from the scan (e.g. _build)
    • U1.3: (TBD) while scanning, opam should report linting errors (we probably want a quiet mode to avoid seeing these) and ignore files with errors.
  • U2: Users should be able to pin and install a package p from repo
    • U2.1: by default, opam will scan repo to find p; it should be possible to avoid scanning for specific paths in repo.
    • U2.2: it should be possible to pass an explicit path directly to avoid scanning.
  • U3: Users should be able to pin and install a package and all its dependencies which are in repo.
  • U4: Users should be able to pin and install all the (valid) packages from a repository.
  • U5: Users should always be able to explicitly select the version at which the packages are pinned.

Example of command-line workflows:

$ opam pin list <repo> # U1 (NEW COMMAND)
# package  version repo    path
  foo      1.2*    <repo>  src/foo
  foo-lwt  1.2*    <repo>  src/foo-lwt
[..]

$ opam pin list <repo> -s # U1.1 (NEW COMMAND)
foo.1.2*:repo:src/foo
foo-lwt.1.2*:repo:src/foo-lwt
[..]

$ opam pin list <repo> -s | grep foo | xargs opam pin add -n # U2 (NEW)

$ opam pin add foo-lwt.1.2 <repo> --rec # U3 (NEW: --rec SEMANTICS)
# pin foo-lwt and only those of its dependencies that are in <repo>

$ opam pin add --with-version 1.2 <repo> # U4, U5 (NEW: --with-version; and scan
                                         # the full repo for opam files)
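
The scanning behaviour described in U1, U1.2 and U2.1 can be illustrated with a minimal Python sketch. This is not opam's implementation: the function names are hypothetical, only files named <name>.opam are considered, and the short listing omits the version because extracting it would require parsing the opam file. It also shows how the "at most one opam file per package" constraint could be checked.

```python
import os
from collections import defaultdict


def scan_opam_files(repo_root, exclude=("_build", ".git")):
    """Map each package name to the directories (relative to repo_root)
    that contain a <name>.opam file. Hypothetical helper, not opam's code."""
    found = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d not in exclude)
        for filename in sorted(filenames):
            name, ext = os.path.splitext(filename)
            if ext == ".opam" and name:
                found[name].append(os.path.relpath(dirpath, repo_root))
    return dict(found)


def duplicated_packages(found):
    """Packages defined by more than one opam file, which the spec disallows."""
    return {name: paths for name, paths in found.items() if len(paths) > 1}


def short_listing(found, repo):
    """One 'name:repo:path' line per package, sorted by package name."""
    return [f"{name}:{repo}:{paths[0]}" for name, paths in sorted(found.items())]
```

Excluding a directory by name (rather than by full path) keeps the sketch short; a real implementation would presumably accept paths relative to the repository root.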

Install all the dependencies

  • D1: Developers should be able to compile a freshly cloned repository quickly
    • D1.1: (TBD) do we want to create a new local switch by default?
    • D1.2: (TBD) should with-test be the default?

Example of command-line workflows:

$ git clone <repo> && cd repo/
$ opam install --deps-only --with-test . # D1 (OK)
$ dune build
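
Conceptually, installing "all the dependencies" of a cloned repository means installing every dependency of a local package that is not itself a local package. The following Python sketch illustrates that set computation under simplifying assumptions: dependencies are given as plain package names (no version constraints or filters), and the function name is hypothetical rather than part of opam.

```python
def external_dependencies(deps, in_repo):
    """Return the sorted dependencies of the local packages that are not
    themselves defined in the repository.

    deps maps a package name to the names it depends on;
    in_repo is the set of packages defined in the repository.
    """
    external = set()
    for pkg in in_repo:
        for dep in deps.get(pkg, ()):
            if dep not in in_repo:
                external.add(dep)
    return sorted(external)
```

For instance, with foo depending on ocamlfind and foo-lwt depending on foo and lwt, only ocamlfind and lwt need to be installed before building the two local packages.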

Publish a subset of the packages

Given a repository repo and a set of packages { p_1 ... p_n } whose opam files are located in <repo>/<path_i>/p_i.opam:

  • P1: developers should be able to publish a subset of the p_i
    • P1.1 using opam-publish
    • P1.2 using dune-release
  • P2: once published, opam 2.0 will not be able to install p_i, so its users should get a proper error message.

Example of command-line workflows:

$ dune-release -p foo,foo-lwt # P1.2 (TBD)
[..]
$ opam-publish # P1.1 (TBD)
[..]

TBD: P2 means that we should bump the opam-version field.

Design Considerations

Interaction with dune vendoring

Vendoring code usually means copying the full dependency sources into the current repository. Most of the time, this copy is an over-approximation: more code (and more packages) is copied than is actually needed. Installing these extra packages will often fail because they, in turn, require new, non-vendored dependencies. It is thus important not to compile vendored sources eagerly. To avoid this, dune never runs a rule from a vendored directory unless it is explicitly needed by another (non-vendored) rule.

Since opam manages dependencies directly, opam pin <repo>, where repo has vendored packages, should work most of the time (unless the vendored code generates conflicts, which is very unlikely). It might just install many more packages than are really needed to compile the non-vendored packages.

The right way to do this is to use opam pin add <pkg> <repo> --rec. This will ensure that only the packages needed by <pkg> are installed, not all the possible ones.
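
The selection performed by --rec can be pictured as a dependency closure restricted to the packages defined in the repository. The Python sketch below is an illustration under stated assumptions, not opam's algorithm: dependencies are plain package names, the traversal stops at packages outside the repository (their own dependencies are not known from the repository's opam files), and packages_to_pin is a hypothetical name.

```python
def packages_to_pin(root, deps, in_repo):
    """Return root plus those of its transitive dependencies that are
    defined in the repository, as a sorted list.

    deps maps a package name to the names it depends on;
    in_repo is the set of packages defined in the repository.
    """
    seen = set()
    stack = [root]
    while stack:
        pkg = stack.pop()
        # Skip packages already visited and packages not defined in the repo.
        if pkg in seen or pkg not in in_repo:
            continue
        seen.add(pkg)
        stack.extend(deps.get(pkg, ()))
    return sorted(seen)
```

With a repository defining foo, foo-lwt, foo-unix and a vendored package whose dependencies are missing, asking for foo-lwt selects only foo and foo-lwt; the vendored package is never considered, so its missing dependencies cannot make the operation fail.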

Interaction with opam-lock

As discussed in the opam-lock integration spec, opam install --deps-only . might use the lockfile if it exists. See [https://github.com/ocaml/opam/wiki/Spec-for-opam-lock-integration]

opam-file format update

TBD. If we rely on scanning, we do not need to change the file format; however, a version bump is needed for the change in semantics.

If we do not rely on scanning (which is probably better, as it keeps the metadata as precise as possible), we need to add a new opam-file field inside the src field.

Differences with opam 2.1~alpha1

  • new opam pin list <repo> to list the packages available to pin in repo

  • --rec has new semantics: it installs the dependencies of the pinned packages that are defined in the given repository (e.g. as opam list --required-by <pkg> --rec)

  • opam pin add <repo> has new semantics: it scans for all the existing packages in the repo (i.e. the former --rec behaviour) and pins them all.

  • opam pin add <pkg> <repo> has new semantics: it scans for all the existing packages in the repo (i.e. the former --rec behaviour) to find <pkg> and pins it.

  • TBD: do we need to keep --subpath? What's the use-case?

  • new --opam-file argument to set the path to the pinned opam file, and a corresponding src.opam-file field in the opam file format (TBD).

  • new --with-version to set the version of the pinned packages

Example Projects

Tezos

Tezos keeps its opam files in subdirectories: with ~50 opam packages, they cannot all sit at the root. See opam-repository issue

Tests

There is a test case in opam-rt.