For design and API docs, see the generated documentation. You can generate this yourself with:
$ cargo doc --document-private-items
Note that --document-private-items is configured by default, so you can actually just use cargo doc.
Folks with access to Oxide RFDs may find RFD 48 ("Control Plane Requirements") and other control plane RFDs relevant. These are not currently publicly available.
You can format the code using cargo fmt. Make sure to run this before pushing changes. The CI checks that the code is correctly formatted.
You can run the Clippy linter using cargo clippy -- -D warnings -A clippy::style. Make sure to run this before pushing changes. The CI checks that the code is clippy-clean.
Prerequisites:
Both normal execution and the test suite expect certain binaries (described below) on your PATH.
- libpq, the PostgreSQL client library
  We use Diesel’s PostgreSQL support to connect to CockroachDB (which is wire-compatible with PostgreSQL). Diesel uses the native libpq to do this. You can get the client library with:
  - Helios: pkg install library/postgresql-13
  - Linux: sudo apt-get install libpq-dev
  - Mac: brew install postgresql
  After doing this, you should have the pg_config command on your PATH. For example, on Helios, you’d want /opt/ooce/bin on your PATH.
- pkg-config, a tool for querying installed libraries
  - Helios: pkg install pkg-config
  - Linux: sudo apt-get install pkg-config
  - Mac: brew install pkg-config
  After doing this, you should have the pkg-config command on your PATH. For example, on Helios, you’d want /usr/bin on your PATH.
- CockroachDB v21.1.10.
  The build and test suite expects to be able to start a single-node CockroachDB cluster using the cockroach executable on your PATH. On illumos, macOS, and Linux, you should be able to use the tools/ci_download_cockroachdb script to fetch the official CockroachDB binary. It will be put into ./cockroachdb/bin/cockroach. Alternatively, you can follow the official CockroachDB installation instructions for your platform.
- ClickHouse >= v21.7.1.
  The test suite expects a running instance of the ClickHouse database server. The script ./tools/ci_download_clickhouse can be used to download a pre-built binary for illumos, Linux, or macOS platforms. Once complete, you must manually add the binary (located at clickhouse/clickhouse) to your PATH. You may also install ClickHouse manually; instructions can be found here. See [_configuring_clickhouse] for details on ClickHouse’s setup and configuration files.
- Additional software requirements:
  On an illumos-based machine (Helios, OmniOS), if you want to run the real (non-simulated) Sled Agent to run actual VMs with Propolis, make sure your packages are up to date, and you have the brand/sparse package:
  pkg install brand/sparse
  pkg install pkg:/package/pkg
  pkg update
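The PATH requirement above can be checked before building. The following is a minimal sketch, not part of the repository: check_on_path is a hypothetical helper name, and the program names are the four binaries listed in the prerequisites.

```shell
#!/bin/sh
# Report which of the named programs cannot be found on PATH.
# Prints one line per missing program and returns non-zero if any is missing.
# check_on_path is a hypothetical helper, not something shipped with Omicron.
check_on_path() {
    missing=0
    for prog in "$@"; do
        if ! command -v "$prog" >/dev/null 2>&1; then
            echo "missing from PATH: $prog"
            missing=1
        fi
    done
    return "$missing"
}

# The four binaries named in the prerequisites above.
check_on_path pg_config pkg-config cockroach clickhouse \
    || echo "install the missing tools (or fix PATH) before building"
```

This only confirms that the programs can be found; it does not check that cockroach and clickhouse match the versions listed above.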
To run Omicron you need to run four programs:
- a CockroachDB cluster. For development, you can use the omicron-dev tool in this repository to start a single-node CockroachDB cluster that will delete the database when you shut it down.
- a ClickHouse server. You should use the omicron-dev tool for this as well (see below). As with CockroachDB, the database files will be deleted when you stop the program.
- nexus: the guts of the control plane
- sled-agent-sim: a simulator for the component that manages a single sled
The easiest way to start the required databases is to use the built-in omicron-dev tool. This tool assumes that the cockroach and clickhouse executables are on your PATH, and match the versions above.
- Start CockroachDB using omicron-dev db-run:
$ cargo run --bin=omicron-dev -- db-run
Finished dev [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/omicron-dev db-run`
omicron-dev: using temporary directory for database store (cleaned up on clean exit)
omicron-dev: will run this to start CockroachDB:
cockroach start-single-node --insecure --http-addr=:0 --store /var/tmp/omicron_tmp/.tmpM8KpTf/data --listen-addr 127.0.0.1:32221 --listening-url-file /var/tmp/omicron_tmp/.tmpM8KpTf/listen-url
omicron-dev: temporary directory: /var/tmp/omicron_tmp/.tmpM8KpTf
*
* WARNING: ALL SECURITY CONTROLS HAVE BEEN DISABLED!
*
* This mode is intended for non-production testing only.
*
* In this mode:
* - Your cluster is open to any client that can access 127.0.0.1.
* - Intruders with access to your machine or network can observe client-server traffic.
* - Intruders can log in without password and read or write any data in the cluster.
* - Intruders can consume all your server's resources and cause unavailability.
*
*
* INFO: To start a secure server without mandating TLS for clients,
* consider --accept-sql-without-tls instead. For other options, see:
*
* - https://go.crdb.dev/issue-v/53404/v20.2
* - https://www.cockroachlabs.com/docs/v20.2/secure-a-cluster.html
*
omicron-dev: child process: pid 3815
omicron-dev: CockroachDB listening at: postgresql://root@127.0.0.1:32221/omicron?sslmode=disable
omicron-dev: populating database
*
* INFO: Replication was disabled for this cluster.
* When/if adding nodes in the future, update zone configurations to increase the replication factor.
*
CockroachDB node starting at 2021-04-13 15:58:59.680359279 +0000 UTC (took 0.4s)
build: OSS v20.2.5 @ 2021/03/17 21:00:51 (go1.16.2)
webui: http://127.0.0.1:41618
sql: postgresql://root@127.0.0.1:32221?sslmode=disable
RPC client flags: cockroach <client cmd> --host=127.0.0.1:32221 --insecure
logs: /var/tmp/omicron_tmp/.tmpM8KpTf/data/logs
temp dir: /var/tmp/omicron_tmp/.tmpM8KpTf/data/cockroach-temp022560209
external I/O path: /var/tmp/omicron_tmp/.tmpM8KpTf/data/extern
store[0]: path=/var/tmp/omicron_tmp/.tmpM8KpTf/data
storage engine: pebble
status: initialized new cluster
clusterID: 8ab646f0-67f0-484d-8010-e4444fb86336
nodeID: 1
omicron-dev: populated database
Note that as the output indicates, this cluster will be available to anybody that can reach 127.0.0.1.
- Start the ClickHouse database server:
$ cargo run --bin omicron-dev -- ch-run
Finished dev [unoptimized + debuginfo] target(s) in 0.47s
Running `target/debug/omicron-dev ch-run`
omicron-dev: running ClickHouse (PID: 2463), full command is "clickhouse server --log-file /var/folders/67/2tlym22x1r3d2kwbh84j298w0000gn/T/.tmpJ5nhot/clickhouse-server.log --errorlog-file /var/folders/67/2tlym22x1r3d2kwbh84j298w0000gn/T/.tmpJ5nhot/clickhouse-server.errlog -- --http_port 8123 --path /var/folders/67/2tlym22x1r3d2kwbh84j298w0000gn/T/.tmpJ5nhot"
omicron-dev: using /var/folders/67/2tlym22x1r3d2kwbh84j298w0000gn/T/.tmpJ5nhot for ClickHouse data storage
- nexus requires a configuration file to run. You can use nexus/examples/config.toml to start with. Build and run it like this:
$ cargo run --bin=nexus -- nexus/examples/config.toml
...
listening: http://127.0.0.1:12220
Nexus can also serve the web console. Instructions for generating the static assets and pointing Nexus to them are here. Without console assets, Nexus will still start and run normally as an API. A few console-specific routes will 404.
- sled-agent-sim only accepts configuration on the command line. Run it with a uuid identifying itself (this would be a uuid for the sled it’s managing), an IP:port for itself, and the IP:port of nexus’s internal interface. Using default config, this might look like this:
$ cargo run --bin=sled-agent-sim -- $(uuidgen) 127.0.0.1:12345 127.0.0.1:12221
...
Jun 02 12:21:50.989 INFO listening, local_addr: 127.0.0.1:12345, component: dropshot
- oximeter is similar to nexus, requiring a configuration file. You can use oximeter/collector/config.toml, and the whole thing can be run with:
$ cargo run --bin=oximeter -- oximeter/collector/config.toml
Dec 02 18:00:01.062 INFO starting oximeter server
Dec 02 18:00:01.062 DEBG creating ClickHouse client
Dec 02 18:00:01.068 DEBG initializing ClickHouse database, component: clickhouse-client, collector_id: 1da65e5b-210c-4859-a7d7-200c1e659972, component: oximeter-agent
Dec 02 18:00:01.093 DEBG registered endpoint, path: /producers, method: POST, local_addr: [::1]:12223, component: dropshot
...
Once everything is up and running, you can use curl directly to hit either of the servers. But it’s easier to use the oxapi_demo wrapper (see below).
This repo includes a Dockerfile that builds an image containing the Nexus and sled agent. There’s a GitHub Actions workflow that builds and publishes the Docker image. This is used by the console project for prototyping, demoing, and testing. This is not the way Omicron will be deployed on production systems, but it’s a useful vehicle for working with it.
There’s a small demo tool called ./tools/oxapi_demo that provides a slightly friendlier interface than curl, with the same output format. To use the demo, the node and json programs should be installed and available in the user’s PATH:
pkg install node-12
npm i -g json
Here’s a small demo that creates an organization, a project, an instance, and a disk, and then attaches the disk to the instance:
$ ./tools/oxapi_demo
oxapi_demo: command not specified
usage: oxapi_demo [-A] [cmd] [args]
GENERAL OPTIONS
-A do not attempt to authenticate
(default behavior: use "spoof" authentication for endpoints
that require it)
ORGANIZATIONS
organizations_list
organization_create_demo ORGANIZATION_NAME
organization_delete ORGANIZATION_NAME
organization_get ORGANIZATION_NAME
PROJECTS
projects_list ORGANIZATION_NAME
project_create_demo ORGANIZATION_NAME PROJECT_NAME
project_delete ORGANIZATION_NAME PROJECT_NAME
project_get ORGANIZATION_NAME PROJECT_NAME
project_list_instances ORGANIZATION_NAME PROJECT_NAME
project_list_disks ORGANIZATION_NAME PROJECT_NAME
project_list_vpcs ORGANIZATION_NAME PROJECT_NAME
INSTANCES
instance_create_demo ORGANIZATION_NAME PROJECT_NAME INSTANCE_NAME
instance_get ORGANIZATION_NAME PROJECT_NAME INSTANCE_NAME
instance_delete ORGANIZATION_NAME PROJECT_NAME INSTANCE_NAME
instance_stop ORGANIZATION_NAME PROJECT_NAME INSTANCE_NAME
instance_start ORGANIZATION_NAME PROJECT_NAME INSTANCE_NAME
instance_reboot ORGANIZATION_NAME PROJECT_NAME INSTANCE_NAME
instance_attach_disk ORGANIZATION_NAME PROJECT_NAME INSTANCE_NAME DISK_NAME
instance_detach_disk ORGANIZATION_NAME PROJECT_NAME INSTANCE_NAME DISK_NAME
instance_list_disks ORGANIZATION_NAME PROJECT_NAME INSTANCE_NAME
instance_get_disk ORGANIZATION_NAME PROJECT_NAME INSTANCE_NAME DISK_NAME
DISKS
disk_create_demo ORGANIZATION_NAME PROJECT_NAME DISK_NAME
disk_get ORGANIZATION_NAME PROJECT_NAME DISK_NAME
disk_delete ORGANIZATION_NAME PROJECT_NAME DISK_NAME
VPCS
vpc_create_demo ORGANIZATION_NAME PROJECT_NAME VPC_NAME DNS_NAME
vpc_get ORGANIZATION_NAME PROJECT_NAME VPC_NAME
vpc_delete ORGANIZATION_NAME PROJECT_NAME VPC_NAME
HARDWARE
racks_list
rack_get RACK_ID
sleds_list
sled_get SLED_ID
$ ./tools/oxapi_demo organization_create_demo myorg
++ curl -sSi http://127.0.0.1:12220/organizations -X POST -T - -H 'oxide-authn-spoof: 001de000-05e4-0000-0000-000000004007'
++ json -ga
HTTP/1.1 100 Continue
HTTP/1.1 201 Created
content-type: application/json
x-request-id: 60ffbc0b-212a-4ffd-84f0-39aeb4b3b772
content-length: 194
date: Wed, 17 Nov 2021 01:43:52 GMT
{
"id": "dbaf046f-1c05-4b6a-a749-5111a8cd9aa1",
"name": "myorg",
"description": "an organization called myorg",
"timeCreated": "2021-11-17T01:43:52.322629Z",
"timeModified": "2021-11-17T01:43:52.322629Z"
}
$ ./tools/oxapi_demo project_create_demo myorg myproject
++ curl -sSi http://127.0.0.1:12220/organizations/myorg/projects -X POST -T -
++ json -ga
HTTP/1.1 100 Continue
HTTP/1.1 201 Created
content-type: application/json
x-request-id: bea0ba10-2916-473c-b3e4-461984b85c6b
content-length: 252
date: Wed, 17 Nov 2021 01:44:01 GMT
{
"id": "c197b9d2-285c-4e9f-9461-1815ef093c8d",
"name": "myproject",
"description": "a project called myproject",
"timeCreated": "2021-11-17T01:44:01.933615Z",
"timeModified": "2021-11-17T01:44:01.933615Z",
"organizationId": "dbaf046f-1c05-4b6a-a749-5111a8cd9aa1"
}
$ ./tools/oxapi_demo instance_create_demo myorg myproject myinstance
++ curl -sSi http://127.0.0.1:12220/organizations/myorg/projects/myproject/instances -X POST -T -
++ json -ga
HTTP/1.1 100 Continue
HTTP/1.1 201 Created
content-type: application/json
x-request-id: 3d07dfa2-efda-4085-ac9f-9e6bd3040997
content-length: 377
date: Wed, 17 Nov 2021 01:45:07 GMT
{
"id": "99ad2514-050c-4493-9cb9-d9ceba980a98",
"name": "myinstance",
"description": "an instance called myinstance",
"timeCreated": "2021-11-17T01:45:07.606749Z",
"timeModified": "2021-11-17T01:45:07.606749Z",
"projectId": "c197b9d2-285c-4e9f-9461-1815ef093c8d",
"ncpus": 1,
"memory": 268435456,
"hostname": "myproject",
"runState": "starting",
"timeRunStateUpdated": "2021-11-17T01:45:07.618960Z"
}
$ ./tools/oxapi_demo instance_get myorg myproject myinstance
++ curl -sSi http://127.0.0.1:12220/organizations/myorg/projects/myproject/instances/myinstance
++ json -ga
HTTP/1.1 200 OK
content-type: application/json
x-request-id: 5870f965-06a8-41fc-967f-9021ec640ad4
content-length: 376
date: Wed, 17 Nov 2021 01:46:41 GMT
{
"id": "99ad2514-050c-4493-9cb9-d9ceba980a98",
"name": "myinstance",
"description": "an instance called myinstance",
"timeCreated": "2021-11-17T01:45:07.606749Z",
"timeModified": "2021-11-17T01:45:07.606749Z",
"projectId": "c197b9d2-285c-4e9f-9461-1815ef093c8d",
"ncpus": 1,
"memory": 268435456,
"hostname": "myproject",
"runState": "running",
"timeRunStateUpdated": "2021-11-17T01:45:09.120652Z"
}
$ ./tools/oxapi_demo disk_create_demo myorg myproject mydisk
++ curl -sSi http://127.0.0.1:12220/organizations/myorg/projects/myproject/disks -X POST -T -
++ json -ga
HTTP/1.1 100 Continue
HTTP/1.1 201 Created
content-type: application/json
x-request-id: f6073b9d-4b07-4eba-b8cf-55d8158785eb
content-length: 324
date: Wed, 17 Nov 2021 01:47:36 GMT
{
"id": "551bbe67-3640-41c9-b968-249a136e5e31",
"name": "mydisk",
"description": "a disk called mydisk",
"timeCreated": "2021-11-17T01:47:36.524136Z",
"timeModified": "2021-11-17T01:47:36.524136Z",
"projectId": "c197b9d2-285c-4e9f-9461-1815ef093c8d",
"snapshotId": null,
"size": 1024,
"state": {
"state": "creating"
},
"devicePath": "/mnt/mydisk"
}
$ ./tools/oxapi_demo disk_get myorg myproject mydisk
++ curl -sSi http://127.0.0.1:12220/organizations/myorg/projects/myproject/disks/mydisk
++ json -ga
HTTP/1.1 200 OK
content-type: application/json
x-request-id: bd1083a8-67f2-407a-8d33-068906e9f2f1
content-length: 324
date: Wed, 17 Nov 2021 01:48:17 GMT
{
"id": "551bbe67-3640-41c9-b968-249a136e5e31",
"name": "mydisk",
"description": "a disk called mydisk",
"timeCreated": "2021-11-17T01:47:36.524136Z",
"timeModified": "2021-11-17T01:47:36.524136Z",
"projectId": "c197b9d2-285c-4e9f-9461-1815ef093c8d",
"snapshotId": null,
"size": 1024,
"state": {
"state": "detached"
},
"devicePath": "/mnt/mydisk"
}
$ ./tools/oxapi_demo instance_attach_disk myorg myproject myinstance mydisk
++ curl -sSi http://127.0.0.1:12220/organizations/myorg/projects/myproject/instances/myinstance/disks/mydisk -X PUT -T /dev/null
++ json -ga
HTTP/1.1 201 Created
content-type: application/json
x-request-id: da298038-28fc-4eb2-a283-4182859d6f33
content-length: 205
date: Wed, 17 Nov 2021 01:48:41 GMT
{
"instanceId": "99ad2514-050c-4493-9cb9-d9ceba980a98",
"diskId": "551bbe67-3640-41c9-b968-249a136e5e31",
"diskName": "mydisk",
"diskState": {
"state": "attaching",
"instance": "99ad2514-050c-4493-9cb9-d9ceba980a98"
}
}
$ ./tools/oxapi_demo instance_list_disks myorg myproject myinstance
++ curl -sSi http://127.0.0.1:12220/organizations/myorg/projects/myproject/instances/myinstance/disks
++ json -ga
HTTP/1.1 200 OK
content-type: application/json
x-request-id: 9f0490d5-e09e-4831-900c-d39f5f07d2c8
content-length: 206
date: Wed, 17 Nov 2021 01:49:10 GMT
{
"instanceId": "99ad2514-050c-4493-9cb9-d9ceba980a98",
"diskId": "551bbe67-3640-41c9-b968-249a136e5e31",
"diskName": "mydisk",
"diskState": {
"state": "attached",
"instance": "99ad2514-050c-4493-9cb9-d9ceba980a98"
}
}
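The lines prefixed with ++ in the traces above are the curl commands that oxapi_demo runs, so the same requests can be issued by hand. The following is a minimal sketch under stated assumptions: NEXUS_URL, instance_url, and instance_get are hypothetical names introduced here, the address is the one shown in the traces, and the path layout is copied from the instance_get trace.

```shell
#!/bin/sh
# Assumption: Nexus's external API listens at the address shown in the traces.
NEXUS_URL="${NEXUS_URL:-http://127.0.0.1:12220}"

# Build the URL of one instance from its organization, project, and name.
# The path layout is copied from the instance_get trace above.
instance_url() {
    printf '%s/organizations/%s/projects/%s/instances/%s\n' \
        "$NEXUS_URL" "$1" "$2" "$3"
}

# Fetch one instance with the same curl flags that oxapi_demo printed.
# This needs a running Nexus, so it is only defined here, not called.
instance_get() {
    curl -sSi "$(instance_url "$1" "$2" "$3")"
}

# Show the URL that `instance_get myorg myproject myinstance` would request.
instance_url myorg myproject myinstance
```

Unlike oxapi_demo, this sketch does not pipe the response through json, and it sends no authentication header. The organization_create_demo trace shows an oxide-authn-spoof header being passed with -H for endpoints that require it.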
Prerequisite: Have a machine already running Helios. An easy way to do this is by using a Helios VM.
The control plane repository contains a packaging tool which bundles binaries and SMF manifests. After building the expected binaries, they can be packaged in a format which lets them be transferred to a Helios machine.
This tool acts on a package-manifest.toml file which describes the packages to be bundled in the build.
$ cargo build
$ ./target/debug/omicron-package package
The aforementioned package command fills a target directory of choice (by default, out/ within the omicron repository) with tarballs ready to be unpacked as services.
To install the services on a target machine, the following command may be executed:
# Note that "sudo" is required to install SMF services; an appropriate pfexec
# profile may also be used.
$ sudo ./target/debug/omicron-package install
This command installs a bootstrap service, which itself loads other requested services. The bootstrap service is currently the only service which is "persistent" across reboots, although it will initialize other services as part of its setup sequence anyway.
# List all services:
$ svcs
# View logs for a service:
$ tail -f $(svcs -L nexus)
To uninstall all Omicron services from a machine, the following may be executed:
$ sudo ./target/debug/omicron-package uninstall
nexus requires a TOML configuration file. There’s an example in nexus/examples/config.toml.
Supported config properties include:
| Name | Example | Required? | Description |
|---|---|---|---|
| | | Yes | URL identifying the CockroachDB instance(s) to connect to. CockroachDB is used for all persistent data. |
| | | Yes | Dropshot configuration for the external server (i.e., the one that operators and developers using the Oxide rack will use). Specific properties are documented below, but see the Dropshot README for details. |
| | | Yes | Specifies that the server should bind to the given IP address and TCP port for the external API (i.e., the one that operators and developers using the Oxide rack will use). In general, servers can bind to more than one IP address and port, but this is not (yet?) supported. |
| | | Yes | Specifies the maximum request body size for the external API. |
| | | Yes | Dropshot configuration for the internal server (i.e., the one used by the sled agent). Specific properties are documented below, but see the Dropshot README for details. |
| | | Yes | Specifies that the server should bind to the given IP address and TCP port for the internal API (i.e., the one used by the sled agent). In general, servers can bind to more than one IP address and port, but this is not (yet?) supported. |
| | | Yes | Specifies the maximum request body size for the internal API. |
| | | Yes | Unique identifier for this Nexus instance |
| | | Yes | Logging configuration. Specific properties are documented below, but see the Dropshot README for details. |
| | | Yes | Controls where server logging will go. Valid modes are |
| | | Yes | Specifies what severity of log messages should be included in the log. Valid values include |
| | | Only if | If |
| | | Only if | If |
The ClickHouse binary uses several sources for its configuration. The binary expects an XML config file, usually named config.xml, to be available, or one may be specified with the -C command-line flag. The binary also includes a minimal configuration embedded within it, which will be used if no configuration file is given or present in the current directory. The server also accepts command-line flags for overriding the values of the configuration parameters.
The packages downloaded by ci_download_clickhouse include a config.xml file with them. You should probably run ClickHouse via the omicron-dev tool, but if you decide to run it manually, you can start the server with:
$ /path/to/clickhouse server --config-file /path/to/config.xml
The configuration file contains a large number of parameters, but most of them are described with comments in the included config.xml, or you may learn more about them here and here. Parameters may be updated in the config.xml, and the server will automatically reload them. You may also specify many of them on the command-line with:
$ /path/to/clickhouse server --config-file /path/to/config.xml -- --param_name param_value ...
Each service is a Dropshot server that presents an HTTP API. The description of that API is serialized as an OpenAPI document which we store in omicron/openapi and check in to this repo. In order to ensure that changes to those APIs are made intentionally, each service contains a test that validates that the current API matches. This allows us 1. to catch accidental changes as test failures and 2. to explicitly observe API changes during code review (and in the git history).
We also use these OpenAPI documents as the source for the clients we generate using Progenitor. Clients are automatically updated when the corresponding OpenAPI document is modified.
Note that Omicron contains a nominally circular dependency:
- Nexus depends on the Sled Agent client
- The Sled Agent client is derived from the OpenAPI document emitted by Sled Agent
- Sled Agent depends on the Nexus client
- The Nexus client is derived from the OpenAPI document emitted by Nexus
We effectively "break" this circular dependency by virtue of the OpenAPI documents being checked in.
In general, changes to any service API require the following set of build steps:
- Make changes to the service API
- Update the OpenAPI document by running the relevant test with overwrite set: EXPECTORATE=overwrite cargo test test_nexus_openapi_internal (changing the test name as necessary)
- This will cause the generated client to be updated which may break the build for dependent consumers
- Modify any dependent services to fix calls to the generated client
Note that if you make changes to both Nexus and Sled Agent simultaneously, you may end up in a spot where neither can build and therefore neither OpenAPI document can be generated. In this case, revert or comment out changes in one so that the OpenAPI document can be generated.
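The regeneration step above can be wrapped in a small helper so the command is easy to repeat for each affected service. This is a sketch only: regen_openapi and the RUN switch are hypothetical names introduced here, and the only test name used is the one given in the steps above.

```shell
#!/bin/sh
# Regenerate one checked-in OpenAPI document by re-running its test with
# EXPECTORATE=overwrite. By default this only prints the command (a dry run);
# set RUN=1 to execute it. regen_openapi and RUN are hypothetical names.
regen_openapi() {
    if [ "$#" -ne 1 ]; then
        echo "usage: regen_openapi TEST_NAME" >&2
        return 2
    fi
    if [ "${RUN:-0}" = "1" ]; then
        EXPECTORATE=overwrite cargo test "$1"
    else
        echo "EXPECTORATE=overwrite cargo test $1"
    fi
}

# Dry run for the test named in the steps above.
regen_openapi test_nexus_openapi_internal
```

Printing by default keeps the helper harmless to source. After a real run, the changes under omicron/openapi show up in the git diff, which is where the API change becomes visible for review.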