Slim down the ClickHouse database #6352

Merged (2 commits) on Aug 19, 2024. Showing changes from 1 commit.
Slim down the ClickHouse database
- Add TTLs to all field tables, by using a materialized column with the
  time each record is inserted. ClickHouse will retain the latest
  timestamp, so when we stop inserting, the TTL clock will start
  counting down on those timeseries records.
- Update Dropshot dependency.
- Add operation ID to HTTP service timeseries, remove other fields.
  Expunge the old timeseries too.
- Remove unnecessary stringifying of URIs in latency tracking.
- Fixes #6328 and #6331
bnaecker committed Aug 15, 2024
commit 1ab13b4ad4d6d501e92b4547304bafda488d2961
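
The TTL behaviour described in the first bullet can be modelled in a few lines. This is only a sketch of the intended semantics, not of ClickHouse internals: `FieldRow`, `deduplicate`, and `expire` are hypothetical names, timestamps are plain seconds, and it assumes (as the commit message states) that deduplication keeps the row with the latest timestamp for each key.

```rust
use std::collections::BTreeMap;

/// TTL applied to the field tables in this PR: 30 days, in seconds.
const TTL_SECS: u64 = 30 * 24 * 60 * 60;

/// Hypothetical stand-in for one row of a field table.
#[derive(Clone, Debug, PartialEq)]
struct FieldRow {
    /// Stand-in for the sorting key
    /// (timeseries_name, field_name, field_value, timeseries_key).
    key: String,
    /// Stand-in for the `last_updated_at` materialized column, in seconds.
    last_updated_at: u64,
}

/// Model of deduplication: keep, for each key, the latest timestamp seen.
fn deduplicate(rows: &[FieldRow]) -> Vec<FieldRow> {
    let mut latest: BTreeMap<&str, u64> = BTreeMap::new();
    for row in rows {
        let ts = latest.entry(row.key.as_str()).or_insert(row.last_updated_at);
        *ts = (*ts).max(row.last_updated_at);
    }
    latest
        .into_iter()
        .map(|(key, last_updated_at)| FieldRow {
            key: key.to_string(),
            last_updated_at,
        })
        .collect()
}

/// Model of TTL expiry: drop rows whose `last_updated_at + TTL` has passed.
fn expire(rows: Vec<FieldRow>, now: u64) -> Vec<FieldRow> {
    rows.into_iter()
        .filter(|row| row.last_updated_at + TTL_SECS > now)
        .collect()
}
```

In this model a key that keeps receiving inserts always carries a recent `last_updated_at` and survives; once inserts stop, the surviving row ages out 30 days after its last update.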
7 changes: 4 additions & 3 deletions Cargo.lock


@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i64_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i64_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i64_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_uuid_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_uuid_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_uuid_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_bool_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_bool_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_bool_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_ipaddr_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_ipaddr_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_ipaddr_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_string_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_string_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_string_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i8_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i8_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i8_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u8_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u8_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u8_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i16_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i16_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i16_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u16_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u16_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u16_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i32_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i32_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i32_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u32_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u32_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u32_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u64_local ON CLUSTER oximeter_cluster ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u64_local ON CLUSTER oximeter_cluster MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u64_local ON CLUSTER oximeter_cluster MODIFY TTL last_updated_at + INTERVAL 30 DAY;
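
The one-line migration files above follow a single pattern: three statements per field table (add the materialized column, materialize it for existing rows, then set the TTL). As a rough illustration, the following Rust sketch regenerates them. `FIELD_TYPES`, `ttl_migration`, and `all_ttl_migrations` are hypothetical helpers and not part of the PR, which checks the statements in as individual SQL files. Passing `None` for the cluster yields the form used by the single-node schema later in this diff.

```rust
/// Suffixes of the field tables touched by this migration.
const FIELD_TYPES: [&str; 12] = [
    "bool", "i8", "u8", "i16", "u16", "i32", "u32", "i64", "u64", "ipaddr",
    "string", "uuid",
];

/// Build the three ALTER statements for one field table.
///
/// `cluster` is `Some(name)` for the replicated schema, which targets the
/// `_local` tables with `ON CLUSTER`, and `None` for the single-node schema.
fn ttl_migration(field_type: &str, cluster: Option<&str>) -> [String; 3] {
    let table = match cluster {
        Some(name) => format!(
            "oximeter.fields_{}_local ON CLUSTER {}",
            field_type, name
        ),
        None => format!("oximeter.fields_{}", field_type),
    };
    [
        format!(
            "ALTER TABLE {} ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();",
            table
        ),
        format!("ALTER TABLE {} MATERIALIZE COLUMN last_updated_at;", table),
        format!(
            "ALTER TABLE {} MODIFY TTL last_updated_at + INTERVAL 30 DAY;",
            table
        ),
    ]
}

/// All statements for every field table, in a fixed order.
fn all_ttl_migrations(cluster: Option<&str>) -> Vec<String> {
    FIELD_TYPES
        .iter()
        .flat_map(|ty| ttl_migration(ty, cluster))
        .collect()
}
```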
1 change: 1 addition & 0 deletions oximeter/db/schema/replicated/10/timeseries-to-delete.txt
@@ -0,0 +1 @@
http_service:request_latency_histogram
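
Each `timeseries-to-delete.txt` file lists the timeseries to expunge during the upgrade, one name per line. How the schema updater consumes this file is not shown in the diff; the sketch below only illustrates parsing such a file, assuming each line has the form `target_name:metric_name`. The function name is hypothetical.

```rust
/// Parse the contents of a `timeseries-to-delete.txt` file into
/// `(target_name, metric_name)` pairs, skipping blank lines.
///
/// Assumption: every non-blank line is a timeseries name of the form
/// `target:metric`.
fn parse_timeseries_to_delete(
    contents: &str,
) -> Result<Vec<(String, String)>, String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(|line| match line.split_once(':') {
            Some((target, metric))
                if !target.is_empty() && !metric.is_empty() =>
            {
                Ok((target.to_string(), metric.to_string()))
            }
            _ => Err(format!("invalid timeseries name: {:?}", line)),
        })
        .collect()
}
```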
12 changes: 8 additions & 4 deletions oximeter/db/schema/replicated/db-init-1.sql
@@ -78,10 +78,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_i64_local ON CLUSTER oximeter_cluster
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value Int64
+    field_value Int64,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_i64_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_i64 ON CLUSTER oximeter_cluster
 AS oximeter.fields_i64_local
@@ -93,10 +95,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_uuid_local ON CLUSTER oximeter_cluste
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UUID
+    field_value UUID,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_uuid_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_uuid ON CLUSTER oximeter_cluster
 AS oximeter.fields_uuid_local
60 changes: 40 additions & 20 deletions oximeter/db/schema/replicated/db-init-2.sql
@@ -595,10 +595,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_bool_local ON CLUSTER oximeter_cluste
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UInt8
+    field_value UInt8,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_bool_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_bool ON CLUSTER oximeter_cluster
 AS oximeter.fields_bool_local
@@ -609,10 +611,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_ipaddr_local ON CLUSTER oximeter_clus
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value IPv6
+    field_value IPv6,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_ipaddr_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_ipaddr ON CLUSTER oximeter_cluster
 AS oximeter.fields_ipaddr_local
@@ -623,10 +627,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_string_local ON CLUSTER oximeter_clus
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value String
+    field_value String,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_string_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_string ON CLUSTER oximeter_cluster
 AS oximeter.fields_string_local
@@ -637,10 +643,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_i8_local ON CLUSTER oximeter_cluster
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value Int8
+    field_value Int8,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_i8_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_i8 ON CLUSTER oximeter_cluster
 AS oximeter.fields_i8_local
@@ -651,10 +659,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_u8_local ON CLUSTER oximeter_cluster
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UInt8
+    field_value UInt8,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_u8_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_u8 ON CLUSTER oximeter_cluster
 AS oximeter.fields_u8_local
@@ -665,10 +675,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_i16_local ON CLUSTER oximeter_cluster
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value Int16
+    field_value Int16,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_i16_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_i16 ON CLUSTER oximeter_cluster
 AS oximeter.fields_i16_local
@@ -679,10 +691,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_u16_local ON CLUSTER oximeter_cluster
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UInt16
+    field_value UInt16,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_u16_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_u16 ON CLUSTER oximeter_cluster
 AS oximeter.fields_u16_local
@@ -693,10 +707,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_i32_local ON CLUSTER oximeter_cluster
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value Int32
+    field_value Int32,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_i32_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_i32 ON CLUSTER oximeter_cluster
 AS oximeter.fields_i32_local
@@ -707,10 +723,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_u32_local ON CLUSTER oximeter_cluster
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UInt32
+    field_value UInt32,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_u32_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_u32 ON CLUSTER oximeter_cluster
 AS oximeter.fields_u32_local
@@ -721,10 +739,12 @@ CREATE TABLE IF NOT EXISTS oximeter.fields_u64_local ON CLUSTER oximeter_cluster
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UInt64
+    field_value UInt64,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/fields_u64_local', '{replica}')
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_u64 ON CLUSTER oximeter_cluster
 AS oximeter.fields_u64_local
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_bool ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_bool MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_bool MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i8 ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i8 MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i8 MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u8 ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u8 MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u8 MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i16 ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i16 MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i16 MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u16 ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u16 MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u16 MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i32 ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i32 MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i32 MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u32 ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u32 MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u32 MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i64 ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i64 MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_i64 MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u64 ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u64 MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_u64 MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_ipaddr ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_ipaddr MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_ipaddr MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_string ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_string MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_string MODIFY TTL last_updated_at + INTERVAL 30 DAY;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_uuid ADD COLUMN IF NOT EXISTS last_updated_at DateTime MATERIALIZED now();
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_uuid MATERIALIZE COLUMN last_updated_at;
@@ -0,0 +1 @@
ALTER TABLE oximeter.fields_uuid MODIFY TTL last_updated_at + INTERVAL 30 DAY;
1 change: 1 addition & 0 deletions oximeter/db/schema/single-node/10/timeseries-to-delete.txt
@@ -0,0 +1 @@
http_service:request_latency_histogram
80 changes: 56 additions & 24 deletions oximeter/db/schema/single-node/db-init.sql
@@ -504,126 +504,158 @@ TTL toDateTime(timestamp) + INTERVAL 30 DAY;
  * timeseries name and then key, since it would improve lookups where one
  * already has the key. Realistically though, these tables are quite small and
  * so performance benefits will be low in absolute terms.
+ *
+ * TTL: We use a materialized column to expire old field table records. This
+ * column is generated automatically by the database whenever a new row is
+ * inserted. It cannot be inserted directly, nor is it returned in a `SELECT *`
+ * query. Since these tables are `ReplacingMergeTree`s, that means the last
+ * record will remain during a deduplication, which will have the last
+ * timestamp. ClickHouse will then expire old data for us, similar to the
+ * measurement tables.
  */
 CREATE TABLE IF NOT EXISTS oximeter.fields_bool
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UInt8
+    field_value UInt8,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_i8
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value Int8
+    field_value Int8,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_u8
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UInt8
+    field_value UInt8,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_i16
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value Int16
+    field_value Int16,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_u16
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UInt16
+    field_value UInt16,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_i32
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value Int32
+    field_value Int32,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_u32
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UInt32
+    field_value UInt32,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_i64
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value Int64
+    field_value Int64,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_u64
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UInt64
+    field_value UInt64,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_ipaddr
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value IPv6
+    field_value IPv6,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_string
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value String
+    field_value String,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 CREATE TABLE IF NOT EXISTS oximeter.fields_uuid
 (
     timeseries_name String,
     timeseries_key UInt64,
     field_name String,
-    field_value UUID
+    field_value UUID,
+    last_updated_at DateTime MATERIALIZED now()
 )
 ENGINE = ReplacingMergeTree()
-ORDER BY (timeseries_name, field_name, field_value, timeseries_key);
+ORDER BY (timeseries_name, field_name, field_value, timeseries_key)
+TTL last_updated_at + INTERVAL 30 DAY;

 /* The timeseries schema table stores the extracted schema for the samples
  * oximeter collects.
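
All twelve single-node field tables in the hunk above share one definition and differ only in the table suffix and the type of `field_value`. A sketch of that shared shape follows, using the hypothetical names `FIELD_TABLES` and `create_field_table`; the PR itself spells each table out by hand in `db-init.sql`.

```rust
/// Table suffix and ClickHouse type of `field_value`, as listed in the diff.
const FIELD_TABLES: [(&str, &str); 12] = [
    ("bool", "UInt8"),
    ("i8", "Int8"),
    ("u8", "UInt8"),
    ("i16", "Int16"),
    ("u16", "UInt16"),
    ("i32", "Int32"),
    ("u32", "UInt32"),
    ("i64", "Int64"),
    ("u64", "UInt64"),
    ("ipaddr", "IPv6"),
    ("string", "String"),
    ("uuid", "UUID"),
];

/// Render the single-node `CREATE TABLE` statement for one field table,
/// including the new materialized column and TTL clause.
fn create_field_table(suffix: &str, value_type: &str) -> String {
    [
        format!("CREATE TABLE IF NOT EXISTS oximeter.fields_{}", suffix),
        "(".to_string(),
        "    timeseries_name String,".to_string(),
        "    timeseries_key UInt64,".to_string(),
        "    field_name String,".to_string(),
        format!("    field_value {},", value_type),
        "    last_updated_at DateTime MATERIALIZED now()".to_string(),
        ")".to_string(),
        "ENGINE = ReplacingMergeTree()".to_string(),
        "ORDER BY (timeseries_name, field_name, field_value, timeseries_key)"
            .to_string(),
        "TTL last_updated_at + INTERVAL 30 DAY;".to_string(),
    ]
    .join("\n")
}
```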
2 changes: 1 addition & 1 deletion oximeter/db/src/model.rs
@@ -45,7 +45,7 @@ use uuid::Uuid;
 /// - [`crate::Client::initialize_db_with_version`]
 /// - [`crate::Client::ensure_schema`]
 /// - The `clickhouse-schema-updater` binary in this crate
-pub const OXIMETER_VERSION: u64 = 9;
+pub const OXIMETER_VERSION: u64 = 10;

 // Wrapper type to represent a boolean in the database.
 //
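
Bumping `OXIMETER_VERSION` to 10 pairs with the new `10/` directories under `oximeter/db/schema/replicated` and `oximeter/db/schema/single-node`. The actual upgrade logic lives in the functions named in the doc comment and is not part of this diff; the following is a hypothetical sketch of how a version gap might map to upgrade directories, with `upgrade_dirs` an invented helper.

```rust
/// Schema version after this PR, per `OXIMETER_VERSION` in the diff.
const OXIMETER_VERSION: u64 = 10;

/// Hypothetical helper: list the per-version upgrade directories needed to
/// move a database from version `current` to version `desired`.
///
/// Returns an empty list when the database is already at (or past) `desired`.
fn upgrade_dirs(schema_root: &str, current: u64, desired: u64) -> Vec<String> {
    ((current + 1)..=desired)
        .map(|version| format!("{}/{}", schema_root, version))
        .collect()
}
```

Under these assumptions, a database at version 9 would need only the `10` directory applied, and a database already at version 10 would need nothing.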
80 changes: 16 additions & 64 deletions oximeter/instruments/src/http.rs
@@ -6,16 +6,12 @@
 // Copyright 2024 Oxide Computer Company

-use dropshot::{
-    HttpError, HttpResponse, RequestContext, RequestInfo, ServerContext,
-};
+use dropshot::{HttpError, HttpResponse, RequestContext, ServerContext};
 use futures::Future;
 use http::StatusCode;
-use http::Uri;
 use oximeter::{
     histogram::Histogram, histogram::Record, MetricsError, Producer, Sample,
 };
-use std::borrow::Cow;
 use std::collections::BTreeMap;
 use std::sync::{Arc, Mutex};
 use std::time::{Duration, Instant};
@@ -24,28 +20,18 @@ oximeter::use_timeseries!("http-service.toml");
 pub use http_service::HttpService;
 pub use http_service::RequestLatencyHistogram;

-// Return the route portion of the request, normalized to include a single
-// leading slash and no trailing slashes.
-fn normalized_uri_path(uri: &Uri) -> Cow<'static, str> {
-    Cow::Owned(format!(
-        "/{}",
-        uri.path().trim_end_matches('/').trim_start_matches('/')
-    ))
-}
-
 impl RequestLatencyHistogram {
     /// Build a new `RequestLatencyHistogram` with a specified histogram.
     ///
     /// Latencies are expressed in seconds.
     pub fn new(
-        request: &RequestInfo,
+        operation_id: &str,
         status_code: StatusCode,
         histogram: Histogram<f64>,
     ) -> Self {
         Self {
-            route: normalized_uri_path(request.uri()),
-            method: request.method().to_string().into(),
-            status_code: status_code.as_u16().into(),
+            operation_id: operation_id.to_string().into(),
+            status_code: status_code.as_u16(),
             datum: histogram,
         }
     }
@@ -59,26 +45,17 @@ impl RequestLatencyHistogram {
     ///
     /// Latencies are expressed as seconds.
     pub fn with_latency_decades(
-        request: &RequestInfo,
+        operation_id: &str,
         status_code: StatusCode,
         start_decade: i16,
         end_decade: i16,
     ) -> Result<Self, MetricsError> {
         Ok(Self::new(
-            request,
+            operation_id,
             status_code,
             Histogram::span_decades(start_decade, end_decade)?,
         ))
     }
-
-    fn key_for(request: &RequestInfo, status_code: StatusCode) -> String {
-        format!(
-            "{}:{}:{}",
-            normalized_uri_path(request.uri()),
-            request.method(),
-            status_code.as_u16()
-        )
-    }
 }

 /// The `LatencyTracker` is an [`oximeter::Producer`] that tracks the latencies of requests for an
@@ -129,15 +106,15 @@ impl LatencyTracker {
     /// to which the other arguments belong. (One is created if it does not exist.)
     pub fn update(
         &self,
-        request: &RequestInfo,
+        operation_id: &str,
         status_code: StatusCode,
         latency: Duration,
     ) -> Result<(), MetricsError> {
-        let key = RequestLatencyHistogram::key_for(request, status_code);
+        let key = operation_id.to_string();
         let mut latencies = self.latencies.lock().unwrap();
         let entry = latencies.entry(key).or_insert_with(|| {
             RequestLatencyHistogram::new(
-                request,
+                operation_id,
                 status_code,
                 self.histogram.clone(),
             )
@@ -170,14 +147,14 @@ impl LatencyTracker {
             Ok(response) => response.status_code(),
             Err(ref e) => e.status_code,
         };
-        if let Err(e) = self.update(&context.request, status_code, latency) {
+        if let Err(e) = self.update(&context.operation_id, status_code, latency)
+        {
             slog::error!(
                 &context.log,
                 "error instrumenting dropshot handler";
                 "error" => ?e,
                 "status_code" => status_code.as_u16(),
-                "method" => %context.request.method(),
-                "uri" => %context.request.uri(),
+                "operation_id" => &context.operation_id,
                 "remote_addr" => context.request.remote_addr(),
                 "latency" => ?latency,
             );
@@ -220,41 +197,16 @@ mod tests {
             HttpService { name: "my-service".into(), id: ID.parse().unwrap() };
         let hist = Histogram::new(&[0.0, 1.0]).unwrap();
         let tracker = LatencyTracker::new(service, hist);
-        let request = http::request::Builder::new()
-            .method(http::Method::GET)
-            .uri("/some/uri")
-            .body(())
-            .unwrap();
         let status_code = StatusCode::OK;
+        let operation_id = "some_operation_id";
         tracker
-            .update(
-                &RequestInfo::new(&request, "0.0.0.0:0".parse().unwrap()),
-                status_code,
-                Duration::from_secs_f64(0.5),
-            )
+            .update(operation_id, status_code, Duration::from_secs_f64(0.5))
             .unwrap();

-        let key = "/some/uri:GET:200";
-        let actual_hist = tracker.latencies.lock().unwrap()[key].datum.clone();
+        let actual_hist =
+            tracker.latencies.lock().unwrap()[operation_id].datum.clone();
         assert_eq!(actual_hist.n_samples(), 1);
         let bins = actual_hist.iter().collect::<Vec<_>>();
         assert_eq!(bins[1].count, 1);
     }
-
-    #[test]
-    fn test_normalize_uri_path() {
-        const EXPECTED: &str = "/foo/bar";
-        const TESTS: &[&str] = &[
-            "/foo/bar",
-            "/foo/bar/",
-            "//foo/bar",
-            "//foo/bar/",
-            "/foo/bar//",
-            "////foo/bar/////",
-        ];
-        for test in TESTS.iter() {
-            println!("{test}");
-            assert_eq!(normalized_uri_path(&test.parse().unwrap()), EXPECTED);
-        }
-    }
 }
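
The net effect of the `http.rs` changes is that latencies are now bucketed by operation ID rather than by a string built from route, method, and status code. Below is a standalone sketch of that bookkeeping using only the standard library. `Tracker` and `LatencyStats` are simplified stand-ins for `LatencyTracker` and `RequestLatencyHistogram`, and a running count and total replace the real histogram.

```rust
use std::collections::BTreeMap;
use std::sync::Mutex;
use std::time::Duration;

/// Simplified stand-in for `RequestLatencyHistogram`.
#[derive(Default)]
struct LatencyStats {
    /// Status code seen when the entry was first created.
    status_code: u16,
    n_samples: u64,
    total: Duration,
}

/// Simplified stand-in for `LatencyTracker`, keyed by operation ID.
#[derive(Default)]
struct Tracker {
    latencies: Mutex<BTreeMap<String, LatencyStats>>,
}

impl Tracker {
    /// Record one request latency, creating the entry if it does not exist.
    fn update(&self, operation_id: &str, status_code: u16, latency: Duration) {
        let mut latencies = self.latencies.lock().unwrap();
        let entry = latencies
            .entry(operation_id.to_string())
            .or_insert_with(|| LatencyStats {
                status_code,
                ..Default::default()
            });
        entry.n_samples += 1;
        entry.total += latency;
    }

    /// Return `(status_code, n_samples, total_latency)` for an operation.
    fn snapshot(&self, operation_id: &str) -> Option<(u16, u64, Duration)> {
        let latencies = self.latencies.lock().unwrap();
        latencies
            .get(operation_id)
            .map(|stats| (stats.status_code, stats.n_samples, stats.total))
    }
}
```

As in this commit's `update`, the map key is the operation ID alone, so the status code is only recorded when an entry is first created.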
15 changes: 8 additions & 7 deletions oximeter/oximeter/schema/http-service.toml
@@ -14,7 +14,7 @@ description = "Duration for the server to handle a request"
 units = "seconds"
 datum_type = "histogram_f64"
 versions = [
-    { added_in = 1, fields = [ "route", "method", "status_code" ] }
+    { added_in = 1, fields = [ "operation_id", "status_code" ] }

Review comment (Contributor):
Does the added_in version need to be bumped?

Reply (Collaborator, Author):
I'm still partway through supporting actual updates to these schema. Until then, we're getting away with this because we drop the data from the old schema. When I do finish that work, then we'd need to add a new versions entry, describing the new fields.

 ]

 [fields.name]
@@ -25,14 +25,15 @@ description = "The name of the HTTP server, or program running it"
 type = "uuid"
 description = "UUID of the HTTP server"

-[fields.route]
+[fields.operation_id]
 type = "string"
-description = "HTTP route in the request"
+description = """\
+The identifier for the HTTP operation.\
-[fields.method]
-type = "string"
-description = "HTTP method in the request"
+In most cases, this is the OpenAPI `operationId` field that uniquely identifies the
+endpoint the request is targeted to and the HTTP method used.
+"""

 [fields.status_code]
-type = "i64"
+type = "u16"
 description = "HTTP status code in the server's response"
2 changes: 2 additions & 0 deletions workspace-hack/Cargo.toml
@@ -101,6 +101,7 @@ sha2 = { version = "0.10.8", features = ["oid"] }
 similar = { version = "2.5.0", features = ["bytes", "inline", "unicode"] }
 slog = { version = "2.7.0", features = ["dynamic-keys", "max_level_trace", "release_max_level_debug", "release_max_level_trace"] }
 smallvec = { version = "1.13.2", default-features = false, features = ["const_new"] }
+socket2 = { version = "0.5.7", default-features = false, features = ["all"] }
 spin = { version = "0.9.8" }
 string_cache = { version = "0.8.7" }
 subtle = { version = "2.5.0" }
@@ -208,6 +209,7 @@ sha2 = { version = "0.10.8", features = ["oid"] }
 similar = { version = "2.5.0", features = ["bytes", "inline", "unicode"] }
 slog = { version = "2.7.0", features = ["dynamic-keys", "max_level_trace", "release_max_level_debug", "release_max_level_trace"] }
 smallvec = { version = "1.13.2", default-features = false, features = ["const_new"] }
+socket2 = { version = "0.5.7", default-features = false, features = ["all"] }
 spin = { version = "0.9.8" }
 string_cache = { version = "0.8.7" }
 subtle = { version = "2.5.0" }