Upgrade clickhouse from v22.8.9.24 to v23.8.7.24 #5127

Merged 11 commits on Mar 4, 2024

Contributor

@citrus-it citrus-it commented Feb 23, 2024

All tests now pass, and I've deployed this to a bench gimlet and run some basic smoke tests:

  • Single node deployment;
  • Clustered deployment;
  • Upgrade from old version to new;
  • More tests, TBD

@citrus-it citrus-it self-assigned this Feb 23, 2024
@citrus-it citrus-it force-pushed the andy/clickhousev23.8 branch from e7b5db6 to 9065a89 Compare February 23, 2024 14:52
@citrus-it citrus-it marked this pull request as draft February 23, 2024 16:32
@citrus-it citrus-it marked this pull request as ready for review February 26, 2024 18:32
Collaborator

@bnaecker bnaecker left a comment

Looks good to me, thanks for pushing on this!

oximeter/db/src/client.rs (outdated, resolved)
oximeter/db/src/configs/keeper_config.xml (outdated, resolved)
Contributor

@karencfv karencfv left a comment

Thanks for looking into this, and also for cleaning up the tests!

Contributor Author

citrus-it commented Feb 29, 2024

I booted a fresh control plane with the following changes to the omicron repo:

a/sled-agent/src/rack_setup/plan/service.rs
-const CLICKHOUSE_COUNT: usize = 1;
+const CLICKHOUSE_COUNT: usize = 2;
-const CLICKHOUSE_KEEPER_COUNT: usize = 0;
+const CLICKHOUSE_KEEPER_COUNT: usize = 3;

a/smf/clickhouse/method_script.sh
-single_node=true
+single_node=false

and all five expected zones came up:

gimlet-sn06 # zoneadm list | grep -i click
oxz_clickhouse_keeper_abcc5b68-a928-430f-8d85-5022311cb2ec
oxz_clickhouse_keeper_bb33cccc-c12f-4c69-94a2-a192d49c4799
oxz_clickhouse_15a50998-0fb9-4f51-925c-7beab0bf64a0
oxz_clickhouse_72b39137-62fe-4771-8530-1de0ddd9d453
oxz_clickhouse_keeper_6a440ca6-04b9-441b-a718-482a9e326ba6
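
For anyone repeating this check, the "five expected zones" follow from the constants above: 2 clickhouse servers plus 3 keepers. A minimal sketch of tallying them from `zoneadm list` output is below. The helper name `count_clickhouse_zones` and the `ZoneCounts` type are hypothetical and not part of omicron; the only assumption about the input is the zone name prefixes visible in the listing.

```rust
/// Counts of ClickHouse-related zones found in `zoneadm list` output.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ZoneCounts {
    pub servers: usize,
    pub keepers: usize,
}

// Zone name prefixes as they appear in the listing above.
const KEEPER_PREFIX: &str = "oxz_clickhouse_keeper_";
const SERVER_PREFIX: &str = "oxz_clickhouse_";

/// Hypothetical helper: tally server and keeper zones, one zone name per line.
pub fn count_clickhouse_zones(zoneadm_output: &str) -> ZoneCounts {
    let mut counts = ZoneCounts::default();
    for name in zoneadm_output.lines().map(str::trim) {
        // Test the keeper prefix first: keeper names also match the server prefix.
        if name.starts_with(KEEPER_PREFIX) {
            counts.keepers += 1;
        } else if name.starts_with(SERVER_PREFIX) {
            counts.servers += 1;
        }
    }
    counts
}
```

Comparing the result against `ZoneCounts { servers: 2, keepers: 3 }` would confirm that the modified `CLICKHOUSE_COUNT` and `CLICKHOUSE_KEEPER_COUNT` values took effect on a single-sled setup like this one.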

I logged into one of the keeper zones and connected to the keeper server, which looked good. I noticed that PATH was not set correctly to find clickhouse, so that's addressed in the next commit.

root@oxz_clickhouse_keeper_abcc5b68:~# /opt/oxide/clickhouse_keeper/clickhouse  keeper-client -p 9181 -h '[fd00:1122:3344:101::12]'
Connected to ZooKeeper at [fd00:1122:3344:101::12]:9181 with session_id 3
Keeper feature flag FILTERED_LIST: enabled
Keeper feature flag MULTI_READ: enabled
Keeper feature flag CHECK_NOT_EXISTS: disabled
/ :) ls
clickhouse keeper
/ :) ls clickhouse
tables task_queue

I then logged into one of the clickhouse zones and verified that the cluster existed.

oximeter_cluster node 1 :) show cluster 'oximeter_cluster' format Vertical;

SHOW CLUSTER oximeter_cluster
FORMAT Vertical

Query id: a0fc31bf-08bf-4e40-b61f-4585385278c4

Row 1:
──────
cluster:                 oximeter_cluster
shard_num:               1
shard_weight:            1
replica_num:             1
host_name:               15a50998-0fb9-4f51-925c-7beab0bf64a0.host.control-plane.oxide.internal.
host_address:            fd00:1122:3344:101::e
port:                    9000
is_local:                1
user:                    default
default_database:
errors_count:            0
slowdowns_count:         0
estimated_recovery_time: 0
database_shard_name:
database_replica_name:
is_active:               ᴺᵁᴸᴸ

Row 2:
──────
cluster:                 oximeter_cluster
shard_num:               1
shard_weight:            1
replica_num:             2
host_name:               72b39137-62fe-4771-8530-1de0ddd9d453.host.control-plane.oxide.internal.
host_address:            fd00:1122:3344:101::f
port:                    9000
is_local:                0
user:                    default
default_database:
errors_count:            0
slowdowns_count:         0
estimated_recovery_time: 0
database_shard_name:
database_replica_name:
is_active:               ᴺᵁᴸᴸ

2 rows in set. Elapsed: 0.004 sec.
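
The same check could be scripted rather than eyeballed. The sketch below parses `FORMAT Vertical` output like the above into one map per row and confirms that the expected number of replicas is present with `errors_count` of 0. Both function names are hypothetical, and the parser assumes only the layout shown here: a `Row N:` header followed by `name: value` lines.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: split `FORMAT Vertical` output into one map per row.
pub fn parse_vertical(output: &str) -> Vec<BTreeMap<String, String>> {
    let mut rows: Vec<BTreeMap<String, String>> = Vec::new();
    for line in output.lines() {
        let line = line.trim();
        // "Row N:" starts a new record.
        if line.starts_with("Row ") && line.ends_with(':') {
            rows.push(BTreeMap::new());
            continue;
        }
        // Split on the first colon only, since IPv6 addresses contain colons.
        if let (Some(row), Some((key, value))) = (rows.last_mut(), line.split_once(':')) {
            let key = key.trim();
            // Accept only identifier-like column names; this skips the
            // "2 rows in set. Elapsed: ..." footer and similar lines.
            if !key.is_empty() && key.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
                row.insert(key.to_string(), value.trim().to_string());
            }
        }
    }
    rows
}

/// Hypothetical check: the expected number of replicas, each with no errors.
pub fn all_replicas_clean(rows: &[BTreeMap<String, String>], expected: usize) -> bool {
    rows.len() == expected
        && rows
            .iter()
            .all(|r| r.get("errors_count").map(String::as_str) == Some("0"))
}
```

For the output above, `all_replicas_clean(&parse_vertical(text), 2)` would be the assertion of interest, with 2 matching the modified `CLICKHOUSE_COUNT`.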

After creating and deleting some instances, the metrics appeared to work correctly.

@citrus-it citrus-it changed the title Upgrade to clickhouse v23.8.7.24 Upgrade clickhouse from v22.8.9.24 to v23.8.7.24 Feb 29, 2024
@citrus-it citrus-it merged commit 9f51dcb into main Mar 4, 2024
20 checks passed
@citrus-it citrus-it deleted the andy/clickhousev23.8 branch March 4, 2024 18:15
david-crespo added a commit that referenced this pull request Mar 7, 2024
After #5127 (I assume, because this problem came up immediately after
that) macOS Gatekeeper is putting the `clickhouse` binary in quarantine
after download, which means we can't execute it unless we take it out of
quarantine. This is apparently a new thing, as [this
doc](https://github.com/ClickHouse/clickhouse-docs/blob/08d7a329d/knowledgebase/fix-developer-verification-error-in-macos.md)
about how to do that was only [added Jan
9](ClickHouse/clickhouse-docs#1835).

Here I am doing it in the simplest way possible: if mac, then remove
quarantine attr. If anyone has a better way, I'm all ears.
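
The referenced commit's actual change is not shown here. As a rough illustration of "if mac, then remove quarantine attr", the sketch below builds an `xattr -d com.apple.quarantine <path>` invocation only when the OS is macOS. The function name is hypothetical, and the choice of `xattr -d` with the `com.apple.quarantine` attribute is an assumption about how the attribute would be cleared, not a description of what the commit does.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical helper: build the command that clears the macOS quarantine
/// attribute from `binary`, or return `None` on any other OS.
/// `os` is expected to be `std::env::consts::OS`.
pub fn unquarantine_command(os: &str, binary: &Path) -> Option<Command> {
    if os != "macos" {
        return None;
    }
    // Assumption: quarantine is recorded in the `com.apple.quarantine`
    // extended attribute, which `xattr -d` deletes.
    let mut cmd = Command::new("xattr");
    cmd.arg("-d").arg("com.apple.quarantine").arg(binary);
    Some(cmd)
}
```

A caller would pass `std::env::consts::OS` and run the returned command with `.status()`. Note that `xattr -d` exits with an error when the attribute is absent, so a caller may want to tolerate a non-zero status rather than treat it as a failed download.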