Commit

feat: adjust WAL purge default configurations
killme2008 committed Dec 6, 2024
1 parent 2b699e7 commit d89af20
Showing 6 changed files with 23 additions and 24 deletions.
14 changes: 7 additions & 7 deletions config/config.md
@@ -13,14 +13,14 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `mode` | String | `standalone` | The running mode of the datanode. It can be `standalone` or `distributed`. |
-| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. |
| `default_timezone` | String | Unset | The default timezone of the server. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `init_regions_parallelism` | Integer | `16` | Parallelism of initializing regions. |
| `max_concurrent_queries` | Integer | `0` | The maximum current queries allowed to be executed. Zero means unlimited. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
+| `runtime.enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. Enabled by default. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
@@ -62,8 +62,8 @@
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
| `wal.dir` | String | Unset | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `256MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
-| `wal.purge_threshold` | String | `4GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
-| `wal.purge_interval` | String | `10m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
+| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
+| `wal.purge_interval` | String | `1m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
@@ -289,13 +289,13 @@
| `store_addrs` | Array | -- | Store server address default to etcd store. |
| `selector` | String | `round_robin` | Datanode selector type.<br/>- `round_robin` (default value)<br/>- `lease_based`<br/>- `load_based`<br/>For details, please see "https://docs.greptime.com/developer-guide/metasrv/selector". |
| `use_memory_store` | Bool | `false` | Store data in memory. |
-| `enable_telemetry` | Bool | `true` | Whether to enable greptimedb telemetry. |
| `store_key_prefix` | String | `""` | If it's not empty, the metasrv will store all data with this key prefix. |
| `enable_region_failover` | Bool | `false` | Whether to enable region failover.<br/>This feature is only available on GreptimeDB running on cluster mode and<br/>- Using Remote WAL<br/>- Using shared storage (e.g., s3). |
| `backend` | String | `EtcdStore` | The datastore for meta server. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
+| `runtime.enable_telemetry` | Bool | `true` | Whether to enable greptimedb telemetry. Enabled by default. |
| `procedure` | -- | -- | Procedure storage options. |
| `procedure.max_retry_times` | Integer | `12` | Procedure max retry time. |
| `procedure.retry_delay` | String | `500ms` | Initial retry delay of procedures, increases exponentially |
@@ -357,14 +357,14 @@
| `node_id` | Integer | Unset | The datanode identifier and should be unique in the cluster. |
| `require_lease_before_startup` | Bool | `false` | Start services after regions have obtained leases.<br/>It will block the datanode start if it can't receive leases in the heartbeat from metasrv. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
-| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. |
| `init_regions_parallelism` | Integer | `16` | Parallelism of initializing regions. |
| `max_concurrent_queries` | Integer | `0` | The maximum current queries allowed to be executed. Zero means unlimited. |
| `rpc_addr` | String | Unset | Deprecated, use `grpc.addr` instead. |
| `rpc_hostname` | String | Unset | Deprecated, use `grpc.hostname` instead. |
| `rpc_runtime_size` | Integer | Unset | Deprecated, use `grpc.runtime_size` instead. |
| `rpc_max_recv_message_size` | String | Unset | Deprecated, use `grpc.rpc_max_recv_message_size` instead. |
| `rpc_max_send_message_size` | String | Unset | Deprecated, use `grpc.rpc_max_send_message_size` instead. |
+| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. Enabled by default. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
@@ -400,8 +400,8 @@
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
| `wal.dir` | String | Unset | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `256MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
-| `wal.purge_threshold` | String | `4GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
-| `wal.purge_interval` | String | `10m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
+| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
+| `wal.purge_interval` | String | `1m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
9 changes: 4 additions & 5 deletions config/datanode.example.toml
@@ -13,9 +13,6 @@ require_lease_before_startup = false
## By default, it provides services after all regions have been initialized.
init_regions_in_background = false

-## Enable telemetry to collect anonymous usage data.
-enable_telemetry = true
-
## Parallelism of initializing regions.
init_regions_parallelism = 16

@@ -42,6 +39,8 @@ rpc_max_recv_message_size = "512MB"
## @toml2docs:none-default
rpc_max_send_message_size = "512MB"

+## Enable telemetry to collect anonymous usage data. Enabled by default.
+#+ enable_telemetry = true

## The HTTP server options.
[http]
@@ -147,11 +146,11 @@ file_size = "256MB"

## The threshold of the WAL size to trigger a flush.
## **It's only used when the provider is `raft_engine`**.
-purge_threshold = "4GB"
+purge_threshold = "1GB"

## The interval to trigger a flush.
## **It's only used when the provider is `raft_engine`**.
-purge_interval = "10m"
+purge_interval = "1m"

## The read batch size.
## **It's only used when the provider is `raft_engine`**.
6 changes: 3 additions & 3 deletions config/metasrv.example.toml
@@ -20,9 +20,6 @@ selector = "round_robin"
## Store data in memory.
use_memory_store = false

-## Whether to enable greptimedb telemetry.
-enable_telemetry = true
-
## If it's not empty, the metasrv will store all data with this key prefix.
store_key_prefix = ""

@@ -42,6 +39,9 @@ backend = "EtcdStore"
## The number of threads to execute the runtime for global write operations.
#+ compact_rt_size = 4

+## Whether to enable greptimedb telemetry. Enabled by default.
+#+ enable_telemetry = true
+
## Procedure storage options.
[procedure]

10 changes: 5 additions & 5 deletions config/standalone.example.toml
@@ -1,9 +1,6 @@
## The running mode of the datanode. It can be `standalone` or `distributed`.
mode = "standalone"

-## Enable telemetry to collect anonymous usage data.
-enable_telemetry = true
-
## The default timezone of the server.
## @toml2docs:none-default
default_timezone = "UTC"
@@ -25,6 +22,9 @@ max_concurrent_queries = 0
## The number of threads to execute the runtime for global write operations.
#+ compact_rt_size = 4

+## Enable telemetry to collect anonymous usage data. Enabled by default.
+#+ enable_telemetry = true
+
## The HTTP server options.
[http]
## The address to bind the HTTP server.
@@ -151,11 +151,11 @@ file_size = "256MB"

## The threshold of the WAL size to trigger a flush.
## **It's only used when the provider is `raft_engine`**.
-purge_threshold = "4GB"
+purge_threshold = "1GB"

## The interval to trigger a flush.
## **It's only used when the provider is `raft_engine`**.
-purge_interval = "10m"
+purge_interval = "1m"

## The read batch size.
## **It's only used when the provider is `raft_engine`**.
4 changes: 2 additions & 2 deletions src/common/wal/src/config/raft_engine.rs
@@ -50,8 +50,8 @@ impl Default for RaftEngineConfig {
Self {
dir: None,
file_size: ReadableSize::mb(256),
-purge_threshold: ReadableSize::gb(4),
-purge_interval: Duration::from_secs(600),
+purge_threshold: ReadableSize::gb(1),
+purge_interval: Duration::from_secs(60),
read_batch_size: 128,
sync_write: false,
enable_log_recycle: true,
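
As a rough illustration of what the new defaults mean in practice, the self-contained Rust sketch below compares the old and new values. `WalPurgeOptions`, `old_default`, `new_default`, and `should_purge` are hypothetical names that do not exist in the repository. The rule in `should_purge` (purge once the WAL grows past the threshold) is an assumption about how the threshold is used, not the actual raft-engine logic. Sizes are treated as binary units (1 GB = 1024^3 bytes), an assumption based on the integration test in this commit printing `1GiB`.

```rust
use std::time::Duration;

/// One gibibyte in bytes (assumed meaning of "GB" in the config).
const GIB: u64 = 1 << 30;

/// Hypothetical stand-in for the two WAL options changed by this commit.
#[derive(Debug, Clone, PartialEq)]
struct WalPurgeOptions {
    /// WAL size, in bytes, above which a purge is requested.
    purge_threshold_bytes: u64,
    /// How often the purge check is assumed to run.
    purge_interval: Duration,
}

impl WalPurgeOptions {
    /// Defaults before this commit: 4GB and 10 minutes.
    fn old_default() -> Self {
        Self {
            purge_threshold_bytes: 4 * GIB,
            purge_interval: Duration::from_secs(600),
        }
    }

    /// Defaults after this commit: 1GB and 1 minute.
    fn new_default() -> Self {
        Self {
            purge_threshold_bytes: GIB,
            purge_interval: Duration::from_secs(60),
        }
    }

    /// Assumed rule: request a purge when the WAL exceeds the threshold.
    fn should_purge(&self, wal_size_bytes: u64) -> bool {
        wal_size_bytes > self.purge_threshold_bytes
    }
}
```

Under these assumptions, a WAL that has grown to 2 GiB would be purged with the new defaults but left alone with the old ones, and the check would run ten times as often.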
4 changes: 2 additions & 2 deletions tests-integration/tests/http.rs
@@ -882,8 +882,8 @@ with_metric_engine = true
[wal]
provider = "raft_engine"
file_size = "256MiB"
-purge_threshold = "4GiB"
-purge_interval = "10m"
+purge_threshold = "1GiB"
+purge_interval = "1m"
read_batch_size = 128
sync_write = false
enable_log_recycle = true
