Worker
Workers are stateless DataFusion executors. They receive plan fragments (scan tasks) from the coordinator, read Parquet files from S3, and stream Arrow results back.
Architecture
graph TB
subgraph Worker["sqe-server --mode worker"]
FS["Flight Service<br/>:50052"]
EX["Executor"]
PR["Parquet Reader"]
end
COORD["Coordinator"] -->|ScanTask ticket| FS
FS --> EX
EX --> PR
PR -->|read| S3["S3 / MinIO"]
PR -->|Arrow RecordBatches| FS
FS -->|Flight stream| COORD
Scan Task
The coordinator sends workers a ScanTask — a lightweight JSON message containing everything the worker needs:
{
"fragment_id": "frag-001",
"data_file_paths": [
"s3://warehouse/ns/table/data/00001.parquet",
"s3://warehouse/ns/table/data/00002.parquet"
],
"projected_columns": ["id", "name", "amount"],
"s3_endpoint": "http://s3:9000",
"s3_region": "us-east-1",
"s3_access_key": "...",
"s3_secret_key": "...",
"s3_path_style": true
}
Workers don’t need access to Polaris or Keycloak. The coordinator resolves table metadata, applies security, and provides the worker with direct S3 credentials and file paths.
Health Checking
The coordinator monitors workers with a background health check task:
stateDiagram-v2
[*] --> Unhealthy: Registered
Unhealthy --> Healthy: Health check OK
Healthy --> Healthy: Health check OK
Healthy --> Degraded: 1-2 consecutive failures
Degraded --> Unhealthy: 3 consecutive failures
Degraded --> Healthy: Health check OK
Unhealthy --> Healthy: Health check OK
- Health checks run every 5 seconds via Flight
Action("health_check") - A worker is marked unhealthy after 3 consecutive failures
- Unhealthy workers are excluded from query scheduling
- Recovery is automatic — a healthy response resets the failure counter
Scaling
Workers are stateless, so scaling is just changing the replica count:
# Helm
helm upgrade sqe deploy/helm/sqe/ --set worker.replicas=5
# kubectl
kubectl scale deployment sqe-worker --replicas=5
The coordinator discovers workers from the config or service discovery. No re-registration needed.