Percona Monitoring and Management High Availability - PMM HA
This method provides the means to:
- Use a running PMM instance
- Prepare it to act as a Primary
- Install a second PMM on a different machine
- Prepare it to act as a Secondary
- Establish replication
Requirements:
- Docker 23.0.3 or higher
- Docker installation will fail on Amazon Linux or RHEL9/EL9 (unless you are on an s390x architecture machine), since the Percona Easy-Install script relies on the Get Docker script for the Docker install. You will need to install Docker on your own in those cases
- SSH access to the host servers
- sudo capabilities
- Ports 443 and 9000 accessible from outside the Primary host machine
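A quick way to sanity-check these requirements from the machine that will host the Secondary (this is a sketch; primary.example.com is a placeholder for your Primary's address, and nc is assumed to be installed):

# Sketch: verify the prerequisites before starting.
docker --version                      # expect 23.0.3 or higher
nc -vz primary.example.com 443        # PMM web/API port reachable from outside?
nc -vz primary.example.com 9000       # ClickHouse native port reachable?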
Clone the repo and run the pmm.sh script:
git clone https://github.com/nethalo/pmmha.git
cd pmmha
bash pmm.sh
You will be presented with the available options. The first one is straightforward: install PMM from scratch.
Both the Primary and the Secondary require some preparation. Follow the steps below:
Simply put, there are four main things replicated:
- VictoriaMetrics time series data
- Inventory+conf info from PostgreSQL
- ClickHouse metrics table
- SQLite info: Grafana Dashboards/Users/Roles/etc, Alerts, PMM Managed Backups (for MongoDB)
Federation is what is being used: a new scrape job is configured to gather metrics via the federate endpoint from the Primary and store them locally on the Secondary:
scrape_configs:
  - job_name: pmmha
    honor_timestamps: true
    scrape_interval: 2s
    scrape_timeout: 1s
    metrics_path: /prometheus/federate?match[]={__name__=~".*"}
    scheme: $scheme
    tls_config:
      insecure_skip_verify: true
    basic_auth:
      username: $user
      password: $pass
    static_configs:
      - targets:
          - "$host:$port"
A pg_dump of the pmm-managed schema is taken and stored in a File table inside the Primary's ClickHouse; the Secondary then reads the contents of that table via ClickHouse's REMOTE function and restores the dump.
The File table is defined as:
CREATE TABLE IF NOT EXISTS pmm.pgpmm (dump String) ENGINE = File(RawBLOB);
And the dump is taken with:
pg_dump -Upmm-managed --dbname=pmm-managed --inserts --data-only --disable-triggers > /srv/clickhouse/data/pmm/pgpmm/data.RawBLOB
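On the Secondary, the restore can then be as simple as reading that column back and piping it into psql. A minimal sketch, not the exact command the script runs, assuming the Primary's ClickHouse is reachable as $pmmserver and accepts the default user:

# Sketch: read the stored dump from the Primary's File table and replay it.
# TSVRaw emits the single String column without escaping, so the SQL survives.
clickhouse-client --format=TSVRaw \
  --query="SELECT dump FROM remote('$pmmserver', pmm.pgpmm)" \
  | psql -Upmm-managed --dbname=pmm-managed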
For QAN data, the same REMOTE functionality is used. However, to achieve data deduplication, an intermediate table is created with the ReplacingMergeTree engine, so that when a merge is forced the data is consolidated.
The remote functionality is as simple as this query:
select * from remote('$pmmserver', pmm.metrics)
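For illustration only, the intermediate deduplication step could look like the sketch below; pmm.metrics_buffer is a hypothetical name and the ORDER BY key is an assumption (the real structure is taken from the existing pmm.metrics table):

# Sketch: copy QAN rows into a deduplicating table, then force the merge.
# pmm.metrics_buffer and the ORDER BY key are assumptions for illustration.
clickhouse-client --database pmm --multiquery --query="
  CREATE TABLE IF NOT EXISTS pmm.metrics_buffer AS pmm.metrics
    ENGINE = ReplacingMergeTree ORDER BY (queryid, period_start);
  INSERT INTO pmm.metrics_buffer SELECT * FROM remote('$pmmserver', pmm.metrics);
  OPTIMIZE TABLE pmm.metrics_buffer FINAL;"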
Again, for SQLite data, the same REMOTE functionality is used.
The dump is made per table, to separate the dashboard tables (which are large) from the rest:
sqlite3 /srv/grafana/grafana.db ".dump --nosys --data-only --newlines --preserve-rowids dashboard dashboard_version" > /srv/clickhouse/data/pmm/sqlitedash/data.RawBLOB
The remote functionality is as simple as this query:
clickhouse-client --format=PrettySpaceNoEscapes --multiquery --database pmm --query="SET output_format_pretty_max_rows=10000000000000; SET output_format_pretty_max_column_pad_width=10; SET output_format_pretty_max_value_width=100000000; select * from remote('$pmmserver', pmm.sqlitedash)"
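On the Secondary, the fetched statements are then replayed into Grafana's database. A simplified sketch, using TSVRaw instead of the pretty-printed format above so the SQL can be piped straight into sqlite3:

# Sketch: pull the per-table dump from the Primary and replay it into grafana.db.
clickhouse-client --database pmm --format=TSVRaw \
  --query="SELECT * FROM remote('$pmmserver', pmm.sqlitedash)" \
  | sqlite3 /srv/grafana/grafana.db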