Platform engineering conversations have moved beyond standardizing infrastructure primitives. Platform teams are now responsible for developer outcomes, security posture, and AI integrations as first-class components. Signals from the community and vendors point toward Internal Developer Platforms (IDPs) that provide opinionated golden paths, platform-level governance for LLM access, and automation that makes SBOM generation and vulnerability remediation fast and reliable. This is a shift in responsibilities: platform teams must operate cross-cutting controls across identity, telemetry, and dependency supply-chain tooling.
Why IDPs must behave like product teams
The "golden path" concept is established; what's new is the expectation that platform teams operate the IDP as a measurable product with SLAs and KPIs for developer experience. Practically this shows up as three concrete shifts:
- Ownership of outcomes over components: adoption counts and template counts are necessary but insufficient. Platform teams instrument developer journeys (scaffold -> CI -> staging -> prod) with metrics mapped to business outcomes rather than only tool telemetry. Existing measures such as the DORA metrics (and Google's Four Keys project, which implements them) remain useful, but teams add developer-experience KPIs (time-to-first-successful-deploy, mean-time-to-onboard) and surface them in the IDP dashboard.
- Opinionated defaults plus documented escape hatches: golden paths supply defaults (runtime, service mesh, observability via OpenTelemetry). Product discipline requires documented escape routes, a gated exception process, and a feedback loop for iterating on the default experience.
- Cross-functional roadmaps: security (SBOMs, automated patching), observability (agent and trace sampling standards), and AI integration (managed model access and prompt telemetry) belong on the platform roadmap alongside CI/CD and developer portals (e.g., Backstage). These are implementation and governance concerns, not separate point tools.
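As a rough illustration of a developer-experience KPI, the sketch below derives time-to-first-successful-deploy per service from a list of journey events. The event shape (`service`, `stage`, `status`, `at`) and the function names are assumptions made for this example; they do not come from any particular IDP or from the DORA tooling.

```javascript
// Hypothetical event shape: { service, stage, status, at }, with `at` as an ISO-8601 timestamp.
function timeToFirstSuccessfulDeploy(events) {
  const scaffoldedAt = new Map(); // service -> earliest scaffold time (ms)
  const firstProdAt = new Map();  // service -> earliest successful prod deploy (ms)

  for (const e of events) {
    const t = Date.parse(e.at);
    if (Number.isNaN(t)) continue; // skip malformed timestamps
    if (e.stage === 'scaffold') {
      const prev = scaffoldedAt.get(e.service);
      if (prev === undefined || t < prev) scaffoldedAt.set(e.service, t);
    } else if (e.stage === 'prod' && e.status === 'success') {
      const prev = firstProdAt.get(e.service);
      if (prev === undefined || t < prev) firstProdAt.set(e.service, t);
    }
  }

  const hours = {};
  for (const [service, start] of scaffoldedAt) {
    const end = firstProdAt.get(service);
    // null means the service has not reached production yet
    hours[service] = end !== undefined && end >= start ? (end - start) / 3_600_000 : null;
  }
  return hours;
}

// Median of the non-null values; a median is less sensitive to one slow outlier than a mean.
function median(values) {
  const sorted = values.filter((v) => v !== null).sort((a, b) => a - b);
  if (sorted.length === 0) return null;
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

A dashboard could then show `median(Object.values(timeToFirstSuccessfulDeploy(events)))` per team or per golden-path template, alongside the count of services still at `null`.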
Treating LLM endpoints like infrastructure
When LLMs are a platform capability, the platform should provide:
- Centralized credentialing and quotas: short-lived credentials provisioned via the platform, per-team quotas, and centralized billing attribution; avoid ad-hoc provider keys checked into repos.
- Resiliency controls: retries with exponential backoff, circuit breakers and graceful degradation (cached responses or deterministic fallbacks) when the model provider is degraded.
- Observability and provenance: OpenTelemetry traces, prompt hashing to avoid storing raw PII, and structured event logs that link prompt usage to deployments for auditability.
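The resiliency controls in this list can be sketched roughly as follows: exponential backoff with jitter, a simple circuit breaker, and a fallback when the provider is considered degraded. The names (`backoffDelay`, `CircuitBreaker`, `callWithResilience`) and all thresholds are illustrative assumptions, not a specific library's API.

```javascript
// Delay before retry `attempt` (0-based): base * 2^attempt, capped at maxMs,
// then scaled to between 50% and 100% of that value to spread out retries.
function backoffDelay(attempt, { baseMs = 200, maxMs = 5000, jitter = Math.random } = {}) {
  const capped = Math.min(maxMs, baseMs * 2 ** attempt);
  return Math.floor(capped * (0.5 + 0.5 * jitter()));
}

class CircuitBreaker {
  constructor({ failureThreshold = 5, cooldownMs = 30_000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;          // injectable clock, useful for tests
    this.failures = 0;
    this.openedAt = null;    // null means the circuit is closed
  }
  canRequest() {
    if (this.openedAt === null) return true;
    // Once the cooldown has elapsed, let requests through again as a trial.
    return this.now() - this.openedAt >= this.cooldownMs;
  }
  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }
  recordFailure() {
    this.failures += 1;
    // Opening (or re-opening after a failed trial) restarts the cooldown.
    if (this.failures >= this.failureThreshold) this.openedAt = this.now();
  }
}

// Runs `fn` with retries; uses `fallback` (e.g. a cached response) if all attempts fail
// or the breaker is open.
async function callWithResilience(fn, breaker, { retries = 3, fallback, sleep } = {}) {
  const wait = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; attempt <= retries; attempt++) {
    if (!breaker.canRequest()) break; // provider considered degraded: go straight to fallback
    try {
      const result = await fn();
      breaker.recordSuccess();
      return result;
    } catch (err) {
      breaker.recordFailure();
      if (attempt < retries) await wait(backoffDelay(attempt));
    }
  }
  if (fallback) return fallback();
  throw new Error('upstream unavailable');
}
```

In a multi-replica proxy the breaker state would usually be per replica; sharing it is possible but adds coordination cost, so this sketch keeps it local.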
A common architecture is a platform-managed LLM proxy service that exposes an internal API (for example POST /v1/llm/completions) and enforces policy. That proxy should integrate with the platform's identity provider (OIDC/JWKS), enforce quotas (with a distributed rate limiter backed by Redis, or an Envoy rate-limit service), emit OpenTelemetry spans, and write prompt metadata (hashed or redacted only) to a secure audit sink.
Example composition (conceptual):
- API: /platform-llm-proxy/v1/completions (internal-only)
- Auth: OIDC tokens verified via JWKS (RFC 7517) and introspection where appropriate (RFC 7662)
- Policy: OPA (Rego) to limit allowed models, scopes and data-processing behaviors per team
- Observability: OpenTelemetry instrumented spans and semantic attributes following OpenTelemetry semantic conventions
These building blocks are available now: Kubernetes Deployments, Envoy/Kong, OpenTelemetry SDKs, OPA, and CI-time static checks. Enforcement should be layered: runtime (proxy) plus CI (static checks preventing secret leakage and requiring SBOMs).
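In the composition above, the per-team decision would be written in Rego and evaluated by OPA. The plain-JavaScript stand-in below only illustrates the shape of an input document and of the resulting decision; the team names, model names, and the `containsPII` flag are hypothetical.

```javascript
// Hypothetical per-team rules; with OPA these would live in policy data, not in proxy code.
const TEAM_POLICY = new Map([
  ['payments', { models: ['small-model'], allowPII: false }],
  ['search', { models: ['small-model', 'large-model'], allowPII: false }],
]);

// input: { team, model, containsPII } assembled by the proxy from token claims and the request.
function decide(input, policy = TEAM_POLICY) {
  const reasons = [];
  const rules = policy.get(input.team);
  if (!rules) {
    reasons.push(`unknown team: ${input.team}`);
  } else {
    if (!rules.models.includes(input.model)) reasons.push(`model not allowed: ${input.model}`);
    if (input.containsPII && !rules.allowPII) reasons.push('PII processing not permitted');
  }
  // Deny by default: the request is allowed only when no rule objected.
  return { allow: reasons.length === 0, reasons };
}
```

Returning the reasons alongside the boolean makes denials easier to surface to developers and to record in the audit sink.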
Secure-by-default SBOMs and fast remediation
Regulatory readiness and fast remediation rely on automation and standards. Practical steps platform teams prioritize:
- Generate SBOMs as first-class CI artifacts. Use tools such as Syft or Trivy and produce CycloneDX or SPDX output stored in an immutable artifact repository.
- Gate merges when policies fail: e.g., fail builds when a new high- or critical-severity CVE appears or when a base image digest isn't pinned. Use a policy engine (OPA, or Conftest, which runs OPA's Rego policies against files) to enforce these gates in CI.
- Automate fix PRs and platform validation: integrate Renovate or Dependabot with platform CI so auto-PRs run platform tests; surface remediation telemetry back to the IDP dashboard.
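The two gate conditions mentioned above (a new high or critical finding, an unpinned base image) can be sketched as a small check. The findings shape (`id`, `severity`), the baseline list, and the function names are assumptions for illustration; a real pipeline would feed in whatever its scanner emits.

```javascript
// A reference counts as pinned only if it ends with an @sha256:<64 hex chars> digest.
function isDigestPinned(imageRef) {
  return /@sha256:[a-f0-9]{64}$/.test(imageRef);
}

const BLOCKING = new Set(['high', 'critical']);

// findings: [{ id, severity }] from the scanner (hypothetical shape).
// baseline: ids already present on the target branch, so only *new* findings block the merge.
function evaluateGate({ baseImage, findings, baseline = [] }) {
  const known = new Set(baseline);
  const violations = [];
  if (!isDigestPinned(baseImage)) {
    violations.push(`base image not pinned by digest: ${baseImage}`);
  }
  for (const f of findings) {
    if (BLOCKING.has(String(f.severity).toLowerCase()) && !known.has(f.id)) {
      violations.push(`new ${f.severity} finding: ${f.id}`);
    }
  }
  return { pass: violations.length === 0, violations };
}
```

A CI step could print the violations and exit non-zero when `pass` is false; comparing against a baseline keeps pre-existing findings from blocking unrelated changes.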
The following GitHub Actions workflow generates a CycloneDX SBOM with Syft and runs a policy check with Conftest. The steps use container images to avoid relying on the runner's host toolchain:
name: ci-sbom-and-policy
on: [push, pull_request]
jobs:
  sbom:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Generate SBOM (CycloneDX)
        # The anchore/syft image's entrypoint is the syft binary, so only arguments follow the image name.
        # cyclonedx-json matches the .json file name; pin image tags or digests in real pipelines.
        run: |
          docker run --rm -v "$PWD":/workspace -w /workspace anchore/syft:latest scan dir:. -o cyclonedx-json > sbom.cdx.json
      - name: Validate SBOM exists
        run: |
          if [ ! -s sbom.cdx.json ]; then echo 'SBOM missing' && exit 1; fi
      - name: Policy check (Conftest)
        run: |
          docker run --rm -v "$PWD":/workspace -w /workspace openpolicyagent/conftest:latest test --policy ./policy sbom.cdx.json
Store sbom.cdx.json in an immutable artifact repository (Nexus, Artifactory, S3) and surface SBOM-related metrics in your IDP (percentage of services with SBOMs, mean time to patch high CVEs).
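The two SBOM-related metrics named here could be computed along these lines. The input shapes (`hasSbom`, `severity`, `detectedAt`, `patchedAt`) are assumptions for the sketch; real data would come from the artifact repository and the scanner.

```javascript
const DAY_MS = 86_400_000;
const PATCH_SEVERITIES = new Set(['high', 'critical']);

// services: [{ name, hasSbom }] -> percentage of services with an SBOM, to one decimal place.
function sbomCoverage(services) {
  if (services.length === 0) return null;
  const withSbom = services.filter((s) => s.hasSbom).length;
  return Math.round((withSbom / services.length) * 1000) / 10;
}

// findings: [{ severity, detectedAt, patchedAt }] with ISO-8601 timestamps.
// Only patched high/critical findings contribute; open findings need a separate "age" metric.
function meanTimeToPatchDays(findings) {
  const durations = findings
    .filter((f) => PATCH_SEVERITIES.has(f.severity) && f.patchedAt)
    .map((f) => (Date.parse(f.patchedAt) - Date.parse(f.detectedAt)) / DAY_MS);
  if (durations.length === 0) return null;
  return durations.reduce((sum, d) => sum + d, 0) / durations.length;
}
```

Because unpatched findings are excluded from the mean, it is worth displaying the count of open high and critical findings next to it so the number cannot look good simply because nothing is being fixed.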
Minimal LLM proxy example (Node.js, realistic and safer defaults)
Below is a compact, runnable Node.js (ESM) example that demonstrates key platform-managed behaviors: JWKS-based JWT verification, a token-bucket rate limiter (in-memory, replaceable by Redis), OpenTelemetry span usage, prompt hashing, and forwarding to an upstream provider. It is deliberately minimal: do not treat it as production-complete without adding configuration, secret management, observability bootstrapping, and distributed rate limiting.
// platform-llm-proxy.mjs (Node 18+, which provides a global fetch)
import express from 'express';
import jwksClient from 'jwks-rsa';
import jwt from 'jsonwebtoken';
import crypto from 'crypto';
import { trace, SpanStatusCode } from '@opentelemetry/api';

const app = express();
app.use(express.json());

// Fetches and caches the identity provider's signing keys.
const jwks = jwksClient({
  jwksUri: process.env.JWKS_URI || 'https://your-idp/.well-known/jwks.json',
  cache: true,
  rateLimit: true
});

// Key lookup for jsonwebtoken: resolves the public key matching the token's `kid`.
function getKey(header, callback) {
  jwks.getSigningKey(header.kid, (err, key) => {
    if (err) return callback(err);
    callback(null, key.getPublicKey());
  });
}

// In-memory token bucket keyed by subject (replace with Redis for multi-replica)
const buckets = new Map();
const RATE = Number(process.env.RATE_PER_MINUTE) || 60; // requests per minute
const WINDOW_MS = 60_000;

function allow(subject) {
  const now = Date.now();
  const state = buckets.get(subject) || { tokens: RATE, ts: now };
  const elapsed = now - state.ts;
  const refill = (elapsed / WINDOW_MS) * RATE; // fractional refill
  state.tokens = Math.min(RATE, state.tokens + refill);
  state.ts = now;
  const allowed = state.tokens >= 1;
  if (allowed) state.tokens -= 1;
  buckets.set(subject, state);
  return allowed;
}

function verifyMiddleware(req, res, next) {
  const auth = req.header('authorization');
  if (!auth) return res.status(401).json({ error: 'missing auth' });
  const token = auth.replace(/^Bearer\s+/i, '');
  jwt.verify(token, getKey, { algorithms: ['RS256'] }, (err, payload) => {
    if (err) return res.status(401).json({ error: 'invalid token' });
    req.subject = payload.sub || payload.client_id;
    // Without a subject, all such callers would share a single rate-limit bucket.
    if (!req.subject) return res.status(401).json({ error: 'token has no subject' });
    next();
  });
}

app.post('/v1/completions', verifyMiddleware, async (req, res) => {
  if (!allow(req.subject)) return res.status(429).json({ error: 'rate limit' });
  const body = req.body ?? {};
  const tracer = trace.getTracer('platform-llm-proxy');
  const span = tracer.startSpan('llm.proxy');
  try {
    span.setAttribute('platform.subject', req.subject);
    span.setAttribute('llm.model', body.model || 'unknown');
    // Record only a hash of the prompt, never the raw text.
    const prompt = JSON.stringify(body.messages || body.input || '');
    const promptHash = crypto.createHash('sha256').update(prompt).digest('hex');
    span.setAttribute('prompt.hash', promptHash);
    const providerUrl = process.env.PROVIDER_URL || 'https://api.openai.com/v1/chat/completions';
    const r = await fetch(providerUrl, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${process.env.PROVIDER_API_KEY}`
      },
      body: JSON.stringify(body),
      signal: AbortSignal.timeout(30_000) // fail fast if the provider hangs
    });
    const data = await r.json();
    span.setAttribute('llm.http_status', r.status);
    res.status(r.status).json(data);
  } catch (err) {
    // Mark the span as failed using the OpenTelemetry API rather than ad-hoc attributes.
    span.recordException(err);
    span.setStatus({ code: SpanStatusCode.ERROR, message: String(err) });
    res.status(502).json({ error: 'upstream error' });
  } finally {
    span.end();
  }
});

app.listen(8080, () => console.log('platform-llm-proxy listening on 8080'));
Production considerations (non-exhaustive): enforce audience and scopes on top of JWKS signature verification (or use token introspection), replace in-memory rate limiting with Redis or an Envoy rate-limit service, use structured encrypted audit logs and redact raw prompts, implement circuit breakers and canonical fallbacks, and ensure secrets are stored in a vault.
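As one example of the audience and scope enforcement mentioned above, a claim check might look like the sketch below. The audience value, the scope name, and the assumption that scopes arrive as a space-delimited `scope` claim are all illustrative; some identity providers use a different claim (for example an array-valued one), so adapt it to what your IdP issues.

```javascript
// Hypothetical values; real audience and scope names come from your identity provider.
const EXPECTED_AUDIENCE = 'platform-llm-proxy';
const REQUIRED_SCOPE = 'llm:completions';

// payload: the already signature-verified JWT claims.
function checkClaims(payload, { audience = EXPECTED_AUDIENCE, scope = REQUIRED_SCOPE } = {}) {
  // `aud` may be a single string or an array of strings.
  const aud = Array.isArray(payload.aud) ? payload.aud : [payload.aud];
  if (!aud.includes(audience)) return { ok: false, reason: 'wrong audience' };

  // Assumes a space-delimited `scope` string claim.
  const scopes = typeof payload.scope === 'string' ? payload.scope.split(' ').filter(Boolean) : [];
  if (!scopes.includes(scope)) return { ok: false, reason: 'missing scope' };

  return { ok: true };
}
```

A middleware could call `checkClaims(payload)` right after signature verification and respond with 403 when `ok` is false. The `jsonwebtoken` `verify` options also accept `audience` and `issuer`, which can cover the first check at verification time.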
What platform teams should do now
- Treat the IDP as a product: set KPIs (time-to-onboard, MTTR for security issues, prompt-compliance rate), instrument golden paths, and iterate using developer feedback and telemetry.
- Make LLMs a cataloged platform capability: provide access via a proxy with OIDC-based auth, quotas, tracing (OpenTelemetry), and policy enforcement (OPA). Avoid unmanaged provider keys in repos.
- Automate SBOM generation and gating in CI: make SBOMs immutable artifacts, run Conftest/OPA checks in PRs, and integrate automated remediation flows so fixes progress from detection through platform-run staging tests to production quickly.
- Instrument outcome metrics, not just telemetry: adopt a small set of DX and security outcome metrics and display them in your IDP dashboard (e.g., percentage of services with SBOMs, mean time to patch high CVEs, time-to-first-successful-deploy).
Platform engineering is shifting from tooling to product discipline. How you gate LLMs, standardize SBOMs, and measure developer outcomes now will determine whether your IDP becomes a bottleneck or a multiplier for engineering velocity and security.