
Commit b4ecc5b

feat: highly available clickhouse deployment
The previous ClickHouse deployment was set up to run as a single-node cluster, preventing us from having a highly available deployment. The database schema has been adjusted to create a replicated database so that ClickHouse will automatically replicate schema changes and data between replicas. This schema change is a breaking change, so the environments will be re-created. I've also adjusted the ordering of audit logs to use the timestamp as a secondary sort column so we can maintain strict ordering of audit logs. There are still benefits to maintaining the hourly bucketing, since ClickHouse can skip over entire hours of data through indexes. The apiserver has been updated to use the new ordering. The migration script will now wait for all replicas in the cluster to come up before executing migrations.
1 parent 67ce864 commit b4ecc5b
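The ordering change described in the message can be sketched as DDL. This is a hypothetical illustration only; the actual migration, table, and column names are not part of this diff:

```sql
-- Hypothetical sketch: table and column names are illustrative,
-- not taken from this commit.
CREATE TABLE audit.events ON CLUSTER 'activity'
(
    event_id  UUID,
    timestamp DateTime64(3),
    payload   String
)
ENGINE = ReplicatedReplacingMergeTree
-- Hourly bucket first, so the primary index can skip entire hours;
-- raw timestamp second, so rows keep strict order within each hour.
ORDER BY (toStartOfHour(timestamp), timestamp);
```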

File tree

14 files changed: +543 −123 lines changed
Lines changed: 21 additions & 31 deletions
````diff
@@ -1,50 +1,40 @@
 # ClickHouse Database Component
 
-Deploys a ClickHouse cluster for storing audit events.
+Deploys a highly-available ClickHouse cluster for storing audit events.
 
-## What It Does
+## Configuration
 
-Creates a ClickHouse instance with:
-- Single shard, single replica (test/dev configuration)
-- Hot/cold storage tiering (local SSD + S3)
-- Audit database pre-configured
-- TTL policy to move old data to S3 after 90 days
+- **3 replicas, 1 shard** - Survives 1 node failure
+- **ReplicatedReplacingMergeTree** - Automatic replication with deduplication
+- **Quorum writes** (2/3) - Strong consistency guarantees
+- **Hot/cold tiering** - Local SSD → S3 after 90 days
+- **Keeper coordination** - Replaces ZooKeeper for HA
 
-## Files
+## Prerequisites
 
-- `clickhouse-installation.yaml` - ClickHouseInstallation CR
-- `kustomization.yaml` - Component definition
-
-## Storage Configuration
-
-- **Hot Storage**: Recent data on local disks for fast queries
-- **Cold Storage**: Old data on S3 for cost-effective archival
-
-The storage policy is configured via patches in each overlay (test-infra, production, etc).
+1. Deploy ClickHouse Keeper first (see `../clickhouse-keeper/`)
+2. ClickHouse Operator v0.25.6+
+3. S3-compatible storage for cold tier
 
 ## Usage
 
-Include in your overlay:
-
 ```yaml
 # config/overlays/{env}/kustomization.yaml
 components:
+  - ../../components/clickhouse-keeper
   - ../../components/clickhouse-database
 ```
 
-Add environment-specific storage patches as needed.
-
-## Access
+Override storage settings via patches in your overlay.
 
-```bash
-# Connect to ClickHouse
-kubectl exec -it clickhouse-activity-0-0-0 -n activity-system -- clickhouse-client
+## How It Works
 
-# Query audit events
-SELECT * FROM audit.events LIMIT 10;
-```
+- **Writes**: Insert on any replica → replicate via Keeper → quorum (2/3)
+  acknowledges
+- **Reads**: Query any replica with read-after-write consistency
+- **Failures**: Tolerates 1 replica down (2/3 quorum maintained)
 
-## Requirements
+**Learn more**:
 
-- ClickHouse Operator must be installed first
-- S3 storage (RustFS in test, AWS S3 in production)
+- [Data Replication](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication)
+- [ClickHouse Keeper](https://clickhouse.com/docs/en/guides/sre/keeper/clickhouse-keeper)
````
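The quorum write/read path summarized in the README can be exercised from `clickhouse-client`. A minimal sketch, assuming a hypothetical `audit.events(event_id String, timestamp DateTime)` table:

```sql
-- Write: returns success only after 2 of 3 replicas acknowledge
-- (insert_quorum = 2 is also set cluster-wide in this commit).
INSERT INTO audit.events (event_id, timestamp)
SETTINGS insert_quorum = 2
VALUES ('evt-1', now());

-- Read: with sequential consistency, a replica only answers once it has
-- all quorum-acknowledged inserts, giving read-after-write semantics.
SELECT count()
FROM audit.events
SETTINGS select_sequential_consistency = 1;
```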

config/components/clickhouse-database/clickhouse-installation.yaml

Lines changed: 50 additions & 1 deletion
```diff
@@ -5,21 +5,70 @@ metadata:
   namespace: activity-system
 spec:
   configuration:
+    # Configure the clickhouse cluster with the nodes created by the clickhouse
+    # keeper installation.
+    zookeeper:
+      nodes:
+        - host: chk-activity-keeper-activity-keeper-0-0.activity-system.svc.cluster.local
+          port: 2181
+        - host: chk-activity-keeper-activity-keeper-0-1.activity-system.svc.cluster.local
+          port: 2181
+        - host: chk-activity-keeper-activity-keeper-0-2.activity-system.svc.cluster.local
+          port: 2181
+      # Session timeout: 30 seconds (default)
+      # Keeper will expire the session if no heartbeat received within this time
+      session_timeout_ms: 30000
+      # Operation timeout: 10 seconds
+      # Maximum time to wait for a Keeper operation to complete
+      operation_timeout_ms: 10000
     clusters:
       - name: activity
         layout:
           shardsCount: 1
-          replicasCount: 1
+          replicasCount: 3
+
     users:
       default/password: "" # Empty password for test environments
       default/networks/ip:
         - "0.0.0.0/0"
     settings:
       # Default settings for ClickHouse
       default/default_database: audit
+
+      # These control how CREATE/ALTER/DROP with ON CLUSTER are executed
+      default/distributed_ddl_task_timeout: 3600
+      default/distributed_ddl_entry_format_version: 5
+
+      # Require 2 out of 3 replicas to acknowledge writes before returning success
+      # This provides strong consistency guarantees
+      default/insert_quorum: 2
+      default/insert_quorum_timeout: 60000 # 60 seconds to wait for quorum
+      default/insert_quorum_parallel: 1 # Enable parallel quorum inserts
+
+      # Ensure you can read your own writes (read-after-write consistency)
+      default/select_sequential_consistency: 1
+
+      # Maximum ratio of errors before marking replica as failed
+      default/replicated_max_ratio_of_errors_to_be_ignored: 0.5
+      # Deduplication window: keep track of recent inserts to prevent duplicates
+      default/replicated_deduplication_window: 100
+      # Deduplication time window: 7 days
+      default/replicated_deduplication_window_seconds: 604800
+
     profiles:
       default/max_memory_usage: 3000000000 # 3GB max memory usage
     files:
+      config.d/log_rotation.xml: |-
+        <clickhouse>
+          <logger>
+            <level>information</level>
+            <log>/var/log/clickhouse-server/clickhouse-server.log</log>
+            <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
+            <size>1000M</size>
+            <count>5</count>
+            <console>1</console>
+          </logger>
+        </clickhouse>
       # Storage configuration for hot/cold tiering with S3
       # Hot storage: local disk for recent data (fast queries)
       # Cold storage: S3-compatible storage for older data (cost-effective archival)
```
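The deduplication window configured above means a retried insert of a byte-identical block is dropped rather than duplicated. A sketch of the behavior, again assuming a hypothetical `audit.events(event_id String, timestamp DateTime)` table:

```sql
-- The same block inserted twice (e.g. a client retry after a timeout)
-- has the same checksum. Keeper remembers checksums for the last 100
-- blocks (replicated_deduplication_window), for up to 7 days.
INSERT INTO audit.events (event_id, timestamp)
VALUES ('evt-1', '2024-01-01 00:00:00');

INSERT INTO audit.events (event_id, timestamp)
VALUES ('evt-1', '2024-01-01 00:00:00');
-- The second, identical INSERT is detected via its block checksum and
-- skipped, so the row is stored once rather than twice.
```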
Lines changed: 33 additions & 0 deletions
````diff
@@ -0,0 +1,33 @@
+# ClickHouse Keeper Component
+
+This component deploys a 3-node ClickHouse Keeper cluster to provide
+coordination and metadata management for ClickHouse replication. ClickHouse
+Keeper is the modern replacement for ZooKeeper.
+
+Our configuration provides high availability (survives 1 node failure) with pod
+anti-affinity ensuring nodes are distributed across the cluster.
+
+## Usage
+
+Include in your overlay:
+
+```yaml
+# config/overlays/{env}/kustomization.yaml
+components:
+  - ../../components/clickhouse-keeper
+  - ../../components/clickhouse-database
+```
+
+> [!IMPORTANT]
+>
+> Deploy Keeper before ClickHouse or update ClickHouse after Keeper is ready.
+
+## Learn More
+
+For detailed information about ClickHouse Keeper, including architecture,
+configuration, monitoring, and troubleshooting:
+
+- [ClickHouse Keeper
+  Documentation](https://clickhouse.com/docs/en/guides/sre/keeper/clickhouse-keeper)
+- [Altinity Kubernetes Operator - ClickHouse
+  Keeper](https://docs.altinity.com/altinitykubernetesoperator/)
````
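Once Keeper and the database replicas are up, replication health can be checked from any ClickHouse replica using built-in system tables (standard ClickHouse tables, not specific to this commit):

```sql
-- Per-table replication status, as coordinated through Keeper:
SELECT database, table, replica_name, total_replicas, active_replicas
FROM system.replicas;

-- Raw view of the coordination metadata stored in Keeper:
SELECT name
FROM system.zookeeper
WHERE path = '/';
```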
Lines changed: 35 additions & 0 deletions
```diff
@@ -0,0 +1,35 @@
+---
+apiVersion: clickhouse-keeper.altinity.com/v1
+kind: ClickHouseKeeperInstallation
+metadata:
+  name: activity-keeper
+  namespace: activity-system
+spec:
+  configuration:
+    clusters:
+      - name: activity-keeper
+        layout:
+          # 3 replicas for HA (odd number required for Raft quorum)
+          # Survives 1 node failure while maintaining quorum
+          replicasCount: 3
+---
+# Configure a pod monitor to scrape metrics from the clickhouse keeper pods
+# using the port defined in the configuration above.
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+  name: clickhouse-keeper
+  namespace: activity-system
+  labels:
+    app: clickhouse-keeper
+    app.kubernetes.io/component: coordination
+spec:
+  selector:
+    matchLabels:
+      # Match operator-created pods for this ClickHouseKeeperInstallation
+      clickhouse-keeper.altinity.com/chk: activity-keeper
+  podMetricsEndpoints:
+    - port: metrics
+      path: /metrics
+      interval: 30s
+      scheme: http
```
Lines changed: 11 additions & 0 deletions
```diff
@@ -0,0 +1,11 @@
+apiVersion: kustomize.config.k8s.io/v1alpha1
+kind: Component
+
+resources:
+  - keeper-installation.yaml
+
+# Component metadata
+metadata:
+  name: clickhouse-keeper
+  annotations:
+    config.kubernetes.io/local-config: "true"
```
