From 3ad79c1cbdc6fb77515bc10ce5f4a7d7c8687624 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Karen=20C=C3=A1rcamo?= Date: Tue, 13 Aug 2024 10:15:00 +1200 Subject: [PATCH] Remove outdated TODO file (#6291) --- TODO.adoc | 113 ------------------------------------------------------ 1 file changed, 113 deletions(-) delete mode 100644 TODO.adoc diff --git a/TODO.adoc b/TODO.adoc deleted file mode 100644 index 40c38e14b3..0000000000 --- a/TODO.adoc +++ /dev/null @@ -1,113 +0,0 @@ -:showtitle: -:icons: font - -= TODO - -API endpoints: - -* RFD 24: regions, AZs, etc -* (lots more) - -Work queue (see also: existing GitHub issues): - -* use CARGO_BIN_EXE for paths to binaries -https://doc.rust-lang.org/cargo/reference/environment-variables.html#environment-variables-cargo-sets-for-crates -* dropshot: allow consumers to provide error codes for dropshot errors -* general maintenance and cleanup -** replace &Arc with &T, and some instances of Arc as well -** all identifiers could be newtypes, with a prefix for the type (like AWS - "i-123" for instances) -** rethinking ApiError a bit -- should it use thiserror, or at least impl - std::error::Error? -** scope out switching to sync (see RFD 79) -** proper config + logging for sled agent -* settle on an approach for modification of resources and implement it once -* implement behavior of server restarting (e.g., sled agent starting up) -** This would help validate some of the architectural choices. Current thinking - is that this will notify OXCP of the restart, and OXCP will find instances - that are supposed to be on that server and run instance_ensure(). It will - also want to do that for the disks associated with those instances. - IMPORTANT: this process should also _remove_ any resources that are currently - on that system, so the notification to OXCP about a restart may need to - include the list of resources that the SA knows about and their current - states. -* implement audit log -* implement alerts -* implement external user authentication -* implement external user authorization mechanism -* implement throttling and load shedding described in RFD 6 -* implement hardening in RFD 10 -* implement ETag / If-Match / If-None-Match -* implement limits for all types of resources -* implement scheme for API versioning -** how to identify the requested version -- header or URI? -** translators for older versions? -** integration of supported API versions into build artifact? -** Should all the uses of serde_json disallow unrecognized fields? Should any? -* debugging/monitoring: Prometheus? -* debugging/monitoring: OpenTracing? OpenTelemetry? -* debugging/monitoring: Dynamic tracing? -* debugging/monitoring: Core files? -* Automated testing -** General API testing: there's a lot of boilerplate in hand-generated tests - for each kind of resource. Would it be reasonable / possible to have a sort - of omnibus test that's given the OpenAPI spec (or something like it), - creates a hierarchy with at least one of every possible resource, and does - things like: For each possible resource -*** attempt to (create, get, put, delete) one with an invalid name -*** attempt to (GET, DELETE, PUT) one that does not exist -*** attempt to create one with invalid JSON -*** attempt to create one with a duplicate name of the one we know about -*** exercise list operation with marker and limit (may need to create many of them) -*** for each required input property: -**** attempt to create a resource without that property -*** for each input property: attempt to create a resource with invalid values - for that property -*** list instances of that resource and expect to find the one we know about -*** GET the one instance we know about -*** DELETE the one instance we know about -*** GET the one instance we know about again and expect it to fail -*** list instances again and expect to find nothing -* We will need archivers for deleted records -- especially saga logs - -External dependencies / open questions: - -* Should we create a more first-class notion of objects in the API? -** This would be a good way to enforce built-in limits. -** This would be a good way to enforce uniformity of pagination. -** If each resource provides a way to construct ETags, we could provide - automatic implementation of If-Match, etc. -** With the right interface, we could provide automatic implementations of PUT - or PATCH with JSON Merge Patch and JSON Patch given any one of these. -* would like to require that servers have unique, immutable uuids -* TLS: -** How will we do TLS termination? -** How will we manage server certificates? -** How will we manage client certificates? -* what does bootstrapping / key management look like? -* what does internal authorization look like? - -Other activities: - -* Performance testing -* Stress testing -* Fault testing / under load -* Fuzz testing -* Security review - -Nice-to-haves: - -* API consistency checks: e.g., camel case every where - -Things we're going to want to build once: - -* metric export -* structured event reporting (e.g., audit log, alert log, fault log) -* opentracing-type reporting -* client-side circuit breakers -* service discovery -* client connection pooling -* server-side throttling -* command-line utilities - -Check out linkerd (for inspiration -- it looks K8s-specific)