The Auth Challenge

The core design constraint that drove us to build SQE: no service account.

The Problem with Service Accounts

In a typical data platform, the query engine authenticates to the catalog and storage with a service account — a single identity with broad permissions. The engine then enforces per-user access control internally.

sequenceDiagram
    participant User
    participant Trino
    participant Polaris
    participant S3

    User->>Trino: Query (user token)
    Note over Trino: Validates user token
    Trino->>Polaris: List tables (SERVICE ACCOUNT)
    Polaris-->>Trino: Table metadata
    Trino->>S3: Read Parquet (SERVICE ACCOUNT IAM role)
    S3-->>Trino: Data
    Trino-->>User: Results

This means:

  • Polaris sees one identity for all queries: audit logs show the service account, not the actual user
  • S3 access is all-or-nothing: the service account can read everything, so security depends entirely on the engine enforcing access correctly
  • Credential rotation is a blast-radius event: rotating the service account key affects all users simultaneously
  • Compliance gap: auditors want to see that Alice read table X, not that sqe-service-account did

SQE’s Approach: Bearer Token Passthrough

SQE never stores or uses a service account for data access. Instead, the user’s Keycloak bearer token flows through the entire stack:

sequenceDiagram
    participant User
    participant SQE as SQE Coordinator
    participant KC as Keycloak
    participant Polaris
    participant S3

    User->>SQE: Handshake (username, password)
    SQE->>KC: OIDC Password Grant
    KC-->>SQE: Access token + refresh token
    SQE-->>User: Session (bearer token)

    User->>SQE: Query (bearer token)
    SQE->>Polaris: List tables (USER's bearer token)
    Polaris-->>SQE: Table metadata + S3 credentials
    Note over Polaris: Polaris vends scoped S3<br/>credentials for THIS user
    SQE->>S3: Read Parquet (user-scoped credentials)
    S3-->>SQE: Data
    SQE-->>User: Arrow Flight results
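
The passthrough step in the diagram above can be sketched as follows. This is a minimal illustration, not SQE's implementation: the helper name `build_catalog_request`, the host, and the path are hypothetical placeholders. The only point it makes is that the `Authorization` header on a catalog call carries the user's own token, and that there is no fallback identity when the token is missing.

```python
from urllib.request import Request


def build_catalog_request(base_url: str, path: str, user_token: str) -> Request:
    """Build (but do not send) a catalog request carrying the caller's token.

    Hypothetical helper: base_url and path are placeholders, not real
    SQE or Polaris endpoints.
    """
    if not user_token:
        # No service account exists to fall back on, so fail loudly.
        raise ValueError("a user bearer token is required")
    return Request(
        url=f"{base_url.rstrip('/')}/{path.lstrip('/')}",
        headers={"Authorization": f"Bearer {user_token}"},
        method="GET",
    )


# Placeholder host and path, for illustration only.
req = build_catalog_request("https://catalog.example.invalid", "/tables", "alice_jwt")
```

Because the request is built per call from the session's token, the catalog sees Alice's identity on every lookup rather than a shared one.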

Key Implications

| Property | Service Account Model | SQE Token Passthrough |
|---|---|---|
| Polaris audit trail | Service account | Actual user |
| S3 access scope | Everything | User-scoped (credential vending) |
| Credential rotation | Blast radius: all users | Per-user: transparent refresh |
| Security enforcement | Engine-internal only | Catalog + storage + engine |
| Compliance | Requires mapping logs | Native user identity |

Per-Session Catalog

Each user session gets its own SessionCatalog instance, initialized with the user’s bearer token:

graph TB
    subgraph "Session: alice"
        SC1[SessionCatalog<br/>token: alice_jwt] --> P[Polaris REST]
    end

    subgraph "Session: bob"
        SC2[SessionCatalog<br/>token: bob_jwt] --> P
    end

    P -->|alice's token| S3A[S3: alice sees<br/>tables A, B, C]
    P -->|bob's token| S3B[S3: bob sees<br/>tables A, B only]

Polaris enforces catalog-level access control based on the token. If Alice has access to tables A, B, and C but Bob has access only to A and B, Polaris applies that difference itself, so SQE doesn’t need to duplicate the logic.
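
The per-session isolation can be pictured with a small registry. This is a sketch under assumptions: only the name `SessionCatalog` comes from the text; its fields, the `SessionRegistry` class, and the session ids are hypothetical, and the real catalog client is replaced by a stand-in that only remembers its token.

```python
from dataclasses import dataclass


@dataclass
class SessionCatalog:
    """Stand-in for a catalog client bound to exactly one user's token."""
    user: str
    token: str

    def auth_header(self) -> dict:
        return {"Authorization": f"Bearer {self.token}"}


class SessionRegistry:
    """Hypothetical registry: one SessionCatalog per session, never shared."""

    def __init__(self) -> None:
        self._catalogs = {}

    def open(self, session_id: str, user: str, token: str) -> SessionCatalog:
        catalog = SessionCatalog(user=user, token=token)
        self._catalogs[session_id] = catalog
        return catalog

    def get(self, session_id: str) -> SessionCatalog:
        return self._catalogs[session_id]

    def close(self, session_id: str) -> None:
        self._catalogs.pop(session_id, None)


registry = SessionRegistry()
registry.open("s1", "alice", "alice_jwt")
registry.open("s2", "bob", "bob_jwt")
```

Each lookup through `registry.get(...)` yields a catalog that can only present its own session's token, which is what lets the catalog, rather than the engine, decide what each user sees.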

Token Lifecycle

SQE manages token refresh transparently. A background task checks all active sessions every 10 seconds and refreshes any token that is within 60 seconds of expiry:

stateDiagram-v2
    [*] --> Active: Handshake (ROPC grant)
    Active --> Refreshing: Token expiry - 60s buffer
    Refreshing --> Active: New token from Keycloak
    Refreshing --> Expired: Refresh fails
    Active --> Expired: Session timeout
    Expired --> [*]: Session removed

    note right of Active: Queries use current token
    note right of Refreshing: Background task (10s interval)
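
One pass of the refresh task can be sketched as below. The 10-second interval and 60-second buffer come from the text and diagram; everything else (the `Session` fields, `needs_refresh`, `sweep`, and the injected `refresh` callable standing in for the Keycloak call) is hypothetical. The session-timeout transition is not modeled here.

```python
from dataclasses import dataclass

CHECK_INTERVAL_SECS = 10  # how often the background task would call sweep()
REFRESH_BUFFER_SECS = 60  # refresh once a token is this close to expiry


@dataclass
class Session:
    user: str
    access_token: str
    expires_at: float  # Unix timestamp (hypothetical field)


def needs_refresh(session, now, buffer=REFRESH_BUFFER_SECS):
    """True once the token is inside the buffer window before expiry."""
    return now >= session.expires_at - buffer


def sweep(sessions, now, refresh):
    """One pass over all sessions.

    refresh(session) returns a new Session, or None if the refresh failed.
    Sessions whose refresh fails are dropped (Refreshing -> Expired).
    """
    kept = {}
    for session_id, session in sessions.items():
        if needs_refresh(session, now):
            session = refresh(session)
            if session is None:
                continue
        kept[session_id] = session
    return kept


def fake_refresh(session):
    # Stand-in for the identity provider: new token, 300s more lifetime.
    return Session(session.user, session.access_token + "_new", session.expires_at + 300)


sessions = {
    "s1": Session("alice", "tok_a", expires_at=1000.0),
    "s2": Session("bob", "tok_b", expires_at=2000.0),
}
after = sweep(sessions, now=950.0, refresh=fake_refresh)
```

At `now=950.0` only Alice's token is inside the 60-second window, so only her session is refreshed; Bob's is left untouched until a later pass.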

When a token is refreshed, SQE uses the token fingerprint (the last 8 characters of the access token) to invalidate iceberg-rust’s internal catalog session cache, so the catalog client always uses the current token.
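
The fingerprint check can be illustrated with a small cache wrapper. This is a sketch, not iceberg-rust's API: `CatalogClientCache` and `make_client` are hypothetical names, and the client is a plain dictionary. Only the fingerprint rule (last 8 characters of the access token) is taken from the text.

```python
def token_fingerprint(access_token: str) -> str:
    """Last 8 characters of the access token, as described in the text."""
    return access_token[-8:]


class CatalogClientCache:
    """Hypothetical cache that rebuilds its client when the fingerprint changes."""

    def __init__(self, make_client):
        self._make_client = make_client
        self._fingerprint = None
        self._client = None

    def client_for(self, access_token: str):
        fingerprint = token_fingerprint(access_token)
        if fingerprint != self._fingerprint:
            # Token changed: discard the cached client and build a fresh one.
            self._client = self._make_client(access_token)
            self._fingerprint = fingerprint
        return self._client


built = []  # records every token a client was built for


def make_client(token):
    built.append(token)
    return {"token": token}


cache = CatalogClientCache(make_client)
first = cache.client_for("header.payload.sigAAAA1111")
again = cache.client_for("header.payload.sigAAAA1111")
fresh = cache.client_for("header.payload.sigBBBB2222")
```

Comparing a short suffix avoids holding or comparing whole tokens. For a JWT the trailing characters fall in the signature segment, which would normally differ between an old token and its refreshed replacement.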