Rotating Database Credentials Without Restarting Pods
All examples here are generalized and use toy services. They do not disclose any employer, client, or proprietary implementation detail.
Context
A backend service connects to a database using a username and password. Security policy requires those credentials to rotate on a schedule. The naive approach — bake the password into an environment variable — means every rotation requires a redeploy or pod restart to pick up the new value.
Problem
Restarting on every rotation is expensive: it drops in-flight connections, causes brief unavailability, and couples a security operation (rotation) to a delivery operation (deploy). We want the running process to pick up new credentials without restarting.
Constraints
- No restarts on rotation.
- The app must stop using the old credential before that credential is revoked.
- Must degrade safely if the new credential is briefly unreadable.
- No secrets in environment variables or images.
Options considered
- Env vars + restart on rotation. Simple, but violates the no-restart goal.
- Fetch the secret from a vault API on every DB connect. No restart, but adds a network hop and a hard dependency on the vault being up at connect time.
- A sidecar agent writes the current secret to a shared file; the app watches that file and reloads on change. No restart, no per-connect network hop, and the app reads from local disk. This is the approach.
Final approach
A secrets agent (running as a sidecar) authenticates to the vault, renders the current credential to a file on a shared volume, and keeps that file up to date as the credential rotates. The application watches the file and, on change, swaps the credential used by its connection pool — draining old connections gracefully.
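// Sketch: watch the rendered secret file and swap the pool's credentials.
import { watch, readFile } from 'node:fs/promises';
import { dirname } from 'node:path';
async function watchCredentials(path: string, onChange: (secret: string) => void) {
  // Read once at startup so we never run with no credential.
  let current = await readFile(path, 'utf8');
  onChange(current);
  // Watch the parent directory, not the file: an atomic rename replaces the
  // file's inode, and a watcher bound to the old inode can miss later changes.
  for await (const _event of watch(dirname(path))) {
    try {
      const next = await readFile(path, 'utf8');
      if (next === '' || next === current) continue; // empty or unchanged: ignore
      current = next;
      onChange(next); // swap into the pool; drain old connections.
    } catch (err) {
      // File mid-write or briefly missing: keep the current credential and wait
      // for the next event. Never tear down the pool on a transient read error.
      console.warn('credential reload failed; keeping current credential', err);
    }
  }
}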
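What onChange does with the new secret depends on the database driver. As one possible shape, the sketch below keeps a reference to the current pool, builds a replacement with the new password, and retires the old pool in the background. The Pool interface, the SwappablePool class, and the assumption that end() waits for checked-out connections to be released are all hypothetical; check how the driver in use actually behaves.

```typescript
// Hypothetical minimal pool interface; real drivers expose more than this.
interface Pool {
  // Assumed to stop handing out connections and resolve once in-use ones are released.
  end(): Promise<void>;
}

class SwappablePool<P extends Pool> {
  private current: P;
  private readonly create: (password: string) => P;

  constructor(create: (password: string) => P, initialPassword: string) {
    this.create = create;
    this.current = create(initialPassword);
  }

  // Callers fetch the pool per unit of work so they always see the latest one.
  get pool(): P {
    return this.current;
  }

  // Build the replacement first; if that throws, the old pool stays in place.
  // Then retire the old pool without blocking callers.
  swap(password: string): Promise<void> {
    const old = this.current;
    this.current = this.create(password);
    return old.end().catch((err: unknown) => {
      console.warn('old pool did not drain cleanly', err);
    });
  }
}
```

With this in place, the callback passed to watchCredentials could be as small as (secret) => void pools.swap(secret), where pools is a SwappablePool built at startup.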
Implementation notes
- Read at startup, then watch. Never start the watcher without an initial read, or you risk a window with no usable credential.
- Swap, don’t restart the pool. New connections use the new credential; existing connections drain naturally. The old credential stays valid for an overlap window.
- Atomic writes matter. The agent should write to a temp file in the same directory and rename it over the target, so the app never reads a half-written secret. Without the rename, watch events fire on partial writes too.
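The write-then-rename step from the agent's side can be sketched as follows. This is a toy illustration, not how any particular secrets agent is implemented: writeSecretAtomically, the temp-file naming, and the 0o600 mode are assumptions made for the example.

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { dirname, join } from 'node:path';

// Write the secret to a temp file next to the target, then rename over it.
// rename() is atomic only within one filesystem, hence the same directory.
function writeSecretAtomically(target: string, secret: string): void {
  const tmp = join(dirname(target), `.${process.pid}.${Date.now()}.tmp`);
  // Assumption: agent and app run as the same user; loosen the mode otherwise.
  writeFileSync(tmp, secret, { mode: 0o600 });
  renameSync(tmp, target); // readers see either the old file or the new one
}

// Usage: rotate a toy secret in a scratch directory.
const dir = mkdtempSync(join(tmpdir(), 'secret-demo-'));
const target = join(dir, 'db-password');
writeSecretAtomically(target, 'first');
writeSecretAtomically(target, 'second');
console.log(readFileSync(target, 'utf8')); // prints "second"
```

Because the rename replaces the file rather than modifying it in place, a reader that opens the path gets one complete version or the other, never a mix.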
Failure modes
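- Watch event fires mid-write → the read can fail or return a partial secret. Guard with try/catch, keep the current credential, and rely on atomic writes to rule out partial content.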
- Old credential revoked before overlap ends → in-flight connections fail; size the overlap window to exceed your longest healthy connection lifetime.
- Watcher dies silently → add a heartbeat/last-reload metric; alert if the file’s mtime advances but the app’s last-reload timestamp doesn’t.
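The silent-watcher check can be expressed as a small comparison between the file's mtime and the app's own reload timestamp. The sketch below is one way to phrase it; isReloadStale, checkSecretFile, and the 5-second default grace period are hypothetical, and lastReloadMs is assumed to mean the last time the app successfully re-read the file.

```typescript
import { statSync } from 'node:fs';

interface ReloadCheck {
  fileMtimeMs: number;  // last modification time of the secret file
  lastReloadMs: number; // when the app last successfully re-read the file
  nowMs: number;        // current time
  graceMs: number;      // how long a reload may lag behind a write
}

// True when the file changed after the last reload and the grace period has passed.
function isReloadStale(c: ReloadCheck): boolean {
  const fileIsNewer = c.fileMtimeMs > c.lastReloadMs;
  const graceExpired = c.nowMs - c.fileMtimeMs > c.graceMs;
  return fileIsNewer && graceExpired;
}

// Convenience wrapper that reads the mtime from disk; a missing file throws.
function checkSecretFile(path: string, lastReloadMs: number, graceMs = 5_000): boolean {
  const { mtimeMs } = statSync(path);
  return isReloadStale({ fileMtimeMs: mtimeMs, lastReloadMs, nowMs: Date.now(), graceMs });
}
```

Running checkSecretFile on a timer and exporting the result as a metric gives the alert something to fire on even when the watcher loop has stopped without an error.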
Testing / validation
- Unit: simulate a file change, assert the pool receives the new credential.
- Integration: rotate the credential in a test vault, assert zero failed queries across the rotation window.
- Chaos: delete the file mid-run, assert the app keeps serving on the cached credential.
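The unit and chaos cases are easier to exercise if the reload decision is separated from the file watcher. The sketch below is one way to do that under stated assumptions: CredentialCache and its injected reader are hypothetical names, and the reader stands in for the real file read so a test can change or "delete" the file by flipping a variable.

```typescript
// A reader returns the file content, or throws if the file is missing or unreadable.
type Reader = () => string;

class CredentialCache {
  private current: string;
  private readonly read: Reader;
  private readonly onChange: (secret: string) => void;

  constructor(read: Reader, onChange: (secret: string) => void) {
    this.read = read;
    this.onChange = onChange;
    this.current = read(); // initial read: fail fast if there is no credential
    onChange(this.current);
  }

  get value(): string {
    return this.current;
  }

  // Call once per (simulated) file event. Returns true if the credential was swapped.
  handleEvent(): boolean {
    let next: string;
    try {
      next = this.read();
    } catch {
      return false; // unreadable: keep serving on the cached credential
    }
    if (next === '' || next === this.current) return false;
    this.current = next;
    this.onChange(next);
    return true;
  }
}

// Simulated run: one rotation, then the file disappears.
const seen: string[] = [];
let fileContent: string | null = 'v1';
const cache = new CredentialCache(() => {
  if (fileContent === null) throw new Error('ENOENT');
  return fileContent;
}, (secret) => seen.push(secret));

fileContent = 'v2';
cache.handleEvent(); // rotation: swaps to v2
fileContent = null;
cache.handleEvent(); // file deleted: no swap
console.log(seen, cache.value); // prints [ 'v1', 'v2' ] v2
```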
What I learned
The hard part isn’t watching a file — it’s the overlap window. Rotation is only safe if the old credential outlives every connection opened under it. Treat the window as a first-class parameter, not an afterthought.
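Treating the window as a parameter can be as simple as a startup check that refuses an unsafe configuration. The sketch below assumes the pool caps connection lifetime and adds a reload-lag term for the delay before the app picks up the new file; that term, the field names, and the example numbers are assumptions for illustration.

```typescript
interface RotationTiming {
  overlapMs: number;         // how long the old credential stays valid after rotation
  maxConnLifetimeMs: number; // longest a pooled connection is allowed to live
  reloadLagMs: number;       // worst-case delay before the app applies the new credential
}

// The last connection opened under the old credential starts at most reloadLagMs
// after rotation and lives at most maxConnLifetimeMs after that.
function requiredOverlapMs(t: RotationTiming): number {
  return t.reloadLagMs + t.maxConnLifetimeMs;
}

function isOverlapSafe(t: RotationTiming): boolean {
  return t.overlapMs > requiredOverlapMs(t);
}

// Example: 15-minute overlap, 10-minute connection cap, 30-second reload lag.
const timing: RotationTiming = {
  overlapMs: 15 * 60_000,
  maxConnLifetimeMs: 10 * 60_000,
  reloadLagMs: 30_000,
};
console.log(requiredOverlapMs(timing)); // prints 630000
console.log(isOverlapSafe(timing));     // prints true
```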