
Amazon Bedrock: OpenAI GPT-5.5/5.4, Codex, Managed Agents — OpenSearch Serverless Rebuilt & Resilience Hub Next‑Gen

Bedrock adds OpenAI frontier models (GPT-5.5/5.4) and Codex with pay-per-token pricing. OpenSearch Serverless has been rebuilt, and the next-generation Resilience Hub and the GA of the IoT Device SDK for Swift change day-to-day operations for platform teams.

June 9, 2026 · 6 min read · AI researched · AI written · AI reviewed

AWS released a set of updates that shift where inference, orchestration, and operational control sit in cloud-native stacks. The key items: Amazon Bedrock now offers OpenAI "frontier" models (listed as GPT-5.5 and GPT-5.4 in the announcement) plus Codex on Bedrock's managed inference engine with pay-per-token pricing; Bedrock introduces managed agents as a hosted orchestration runtime; Amazon OpenSearch Serverless was rebuilt for agentic AI and dynamic workloads; AWS Resilience Hub received a next-generation application model and automated assessments; and the AWS IoT Device SDK for Swift reached GA.

These changes move responsibility away from self-hosted GPU fleets and bespoke search clusters toward managed inference, serverless search backends, and AWS-hosted agent orchestration. Platform teams should treat these releases as operational changes first: billing, governance, observability, and resilience practices must be adapted before migrating critical workloads.

Amazon Bedrock: OpenAI models, Codex, and pay-per-token

What changed

  • Bedrock now exposes OpenAI frontier models (GPT-5.5, GPT-5.4 as announced) and Codex on its managed inference engine and bills on a pay-per-token basis. This replaces or augments instance-based costs for many use cases.

Operational implications

  • Cost behavior: pay-per-token makes costs proportional to usage and can be spiky. Add per-model and per-workspace tagging, per-model budgets, and anomaly detection for token consumption.
  • Governance and data controls: use private VPC endpoints where provided, control data egress, and classify prompt and response payloads under your data-classification policies. Bedrock provides integrations for logging and governance, but you must configure them to meet compliance needs.
  • Performance: managed inference offloads fleet operations, but latency characteristics are then set by the service rather than by your own capacity planning. Benchmark cold starts, multi-region placement, and tail latency to validate SLOs for low-latency workloads.
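
The anomaly detection mentioned under cost behavior can be sketched as a trailing-window check on hourly token counts. This is a minimal illustration, not a Bedrock feature: the function name `find_token_anomalies`, the 24-hour window, and the 3-sigma threshold are all assumptions, and the input is assumed to come from whatever metering you already collect.

```python
from statistics import mean, stdev


def find_token_anomalies(hourly_tokens, window=24, threshold=3.0):
    """Return indices of hours whose token count is far above the trailing window.

    hourly_tokens: list of token counts, one per hour (assumed input format).
    window: number of preceding hours used as the baseline (must be >= 2).
    threshold: how many standard deviations above the mean counts as a spike.
    """
    anomalies = []
    for i in range(window, len(hourly_tokens)):
        history = hourly_tokens[i - window:i]
        baseline = mean(history)
        # Floor the spread so a perfectly flat history does not divide by zero
        spread = max(stdev(history), 1.0)
        if (hourly_tokens[i] - baseline) / spread > threshold:
            anomalies.append(i)
    return anomalies


# Example: a flat day of usage followed by one runaway hour
usage = [1000] * 24 + [1000, 50000, 1000]
spikes = find_token_anomalies(usage)
```

Only upward deviations are flagged, since the concern here is unexpected spend rather than idle periods. A production version would likely need seasonality handling (weekday versus weekend traffic) that this sketch ignores.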

Practical actions

  • Add token-level metering hooks to routing/proxy services and enforce throttles per model or feature flag.
  • Define per-model SLOs (p50/p95/p99 latency, error rate) and integrate them into dashboards and alerts.
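
One way to enforce a per-model throttle in a routing or proxy service is a token bucket that counts model tokens instead of requests. The sketch below is an assumption about how such a proxy could work, not part of any AWS SDK: `TokenBudget`, `ModelThrottle`, the model identifiers, and the per-minute limits are hypothetical names and values.

```python
import time


class TokenBudget:
    """Token-bucket throttle measured in model tokens rather than requests."""

    def __init__(self, tokens_per_minute, clock=time.monotonic):
        self.capacity = float(tokens_per_minute)
        self.refill_per_second = tokens_per_minute / 60.0
        self.available = self.capacity
        self.clock = clock
        self.last_refill = clock()

    def try_spend(self, tokens):
        # Refill in proportion to elapsed time, capped at the bucket capacity
        now = self.clock()
        elapsed = now - self.last_refill
        self.available = min(self.capacity, self.available + elapsed * self.refill_per_second)
        self.last_refill = now
        if tokens > self.available:
            return False
        self.available -= tokens
        return True


class ModelThrottle:
    """Holds one budget per model; unknown models are denied by default."""

    def __init__(self, limits, clock=time.monotonic):
        self.budgets = {model: TokenBudget(limit, clock) for model, limit in limits.items()}

    def allow(self, model_id, estimated_tokens):
        budget = self.budgets.get(model_id)
        return budget is not None and budget.try_spend(estimated_tokens)


# Hypothetical limits; real values would come from your budget policy
throttle = ModelThrottle({"model-a": 600, "model-b": 6000})
first_call_allowed = throttle.allow("model-a", 500)
```

The proxy would call `allow` with an estimated token count before forwarding a request and reconcile against the actual usage reported in the response. The clock is injectable so the throttle can be tested without sleeping.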

Bedrock Managed Agents: hosted orchestration for agentic apps

What it is

  • Bedrock managed agents provide an opinionated, hosted runtime for multi-step, tool-using agents. AWS manages compute, scaling, and lifecycle while offering connectors to common data sources and tools.

Technical considerations

  • Orchestration contract: treat managed agents as stateful runtimes. Define idempotency, cancellation, and retry semantics and expose abort controls through APIs.
  • Least privilege: grant scoped, ephemeral IAM credentials per agent run. Restrict tool access to necessary actions and use short-lived tokens for external integrations.
  • Observability: require structured tracing for plan execution and tool calls. Correlate traces with token billing to detect runaway consumption.
  • Runtime controls: implement allow/deny lists for external calls, content filters, and data provenance logging to enable audits of inputs/outputs.
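
The orchestration contract above (idempotency, cancellation, retries) can be illustrated with a small client-side wrapper around each tool call. This is a sketch under stated assumptions: `StepRunner`, `TransientToolError`, and `RunCancelled` are hypothetical names, the completed-step store is an in-memory dict where a real system would use durable storage, and nothing here reflects the actual managed agents API.

```python
import threading


class TransientToolError(Exception):
    """Raised by a tool wrapper for failures that are safe to retry."""


class RunCancelled(Exception):
    """Raised when the caller aborts the run before the step completes."""


class StepRunner:
    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        # Idempotency key -> stored result (in-memory only for this sketch)
        self._completed = {}

    def run(self, idempotency_key, tool, cancel_event):
        # A step that already succeeded is never executed a second time
        if idempotency_key in self._completed:
            return self._completed[idempotency_key]
        last_error = None
        for _ in range(self.max_attempts):
            # Honor an abort request before every attempt
            if cancel_event.is_set():
                raise RunCancelled(idempotency_key)
            try:
                result = tool()
            except TransientToolError as exc:
                last_error = exc
                continue
            self._completed[idempotency_key] = result
            return result
        raise RuntimeError(
            f"step {idempotency_key!r} failed after {self.max_attempts} attempts"
        ) from last_error


# Example: a tool that fails twice with a retryable error, then succeeds
attempts = {"count": 0}


def flaky_tool():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientToolError("temporary failure")
    return "done"


runner = StepRunner(max_attempts=3)
cancel = threading.Event()
result = runner.run("run-1/step-1", flaky_tool, cancel)
```

Only errors explicitly marked as transient are retried, so a permanent failure (for example a permissions error) surfaces immediately instead of burning attempts and tokens.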

Testing and CI/CD

  • Include agent tool integrations and failure modes in automated tests (rate limits, timeouts, malformed responses). Provide deterministic fallbacks for degraded tool availability.
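
A deterministic fallback and the failure modes it must cover can be exercised with simulated tools. The sketch below assumes a tool returns a dict with `status` and `data` keys; that response shape, the function `call_with_fallback`, and the fake tools are illustrative assumptions rather than any real connector's contract.

```python
def call_with_fallback(tool, fallback, required_keys=("status", "data")):
    """Call a tool and return (response, degraded).

    Timeouts, connection errors, and malformed responses all resolve to the
    same fixed fallback value, so downstream behavior stays predictable.
    """
    try:
        response = tool()
    except (TimeoutError, ConnectionError):
        return fallback, True
    if not isinstance(response, dict) or any(key not in response for key in required_keys):
        return fallback, True
    return response, False


# Simulated tools covering the failure modes listed above
def healthy_tool():
    return {"status": "ok", "data": [1, 2, 3]}


def timing_out_tool():
    raise TimeoutError("simulated timeout")


def malformed_tool():
    return "<html>502 Bad Gateway</html>"


FALLBACK = {"status": "degraded", "data": []}

healthy_result = call_with_fallback(healthy_tool, FALLBACK)
timeout_result = call_with_fallback(timing_out_tool, FALLBACK)
malformed_result = call_with_fallback(malformed_tool, FALLBACK)
```

Returning the `degraded` flag alongside the response lets the agent record that it ran on fallback data, which keeps the audit trail honest when a tool is unavailable.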

Amazon OpenSearch Serverless: rebuilt for agentic AI and dynamic workloads

What changed

  • OpenSearch Serverless was redeveloped with elasticity and AI-driven workloads in mind. AWS highlights instant autoscaling and cites potential cost savings; platform engineers should validate these claims against their workload patterns.

Architectural implications

  • Autoscaling: serverless removes shard/node management but behaves as a managed autoscaler. Test index growth, ingestion spikes, and query bursts to understand warm-up and tail-latency characteristics.
  • Compatibility: verify API and feature compatibility (clients, analyzers, plugins). Serverless offerings commonly restrict native extensions; plan workarounds for custom analyzers or plugins.
  • Vector search: confirm vector indexing formats, distance metrics, and ANN implementations. Measure retrieval accuracy and latency trade-offs for your embeddings and query patterns.
  • Cost model: shift from fixed nodes to usage-based billing. Model ingestion, storage, and query costs under steady and burst traffic.
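
The cost-model comparison can start as simple arithmetic before any real pricing is plugged in. Every price and unit below is a placeholder chosen for illustration, not an AWS rate, and the "compute unit-hour" billing dimension is an assumption; check the current pricing page for the actual dimensions.

```python
def usage_based_monthly_cost(hourly_units, unit_hour_price, storage_gb, storage_gb_price):
    """Sum compute billed per unit-hour plus storage billed per GB-month."""
    return sum(hourly_units) * unit_hour_price + storage_gb * storage_gb_price


def fixed_monthly_cost(node_count, node_hour_price, hours_per_month=720):
    """Cost of a fixed cluster that runs for the whole month."""
    return node_count * node_hour_price * hours_per_month


# Hypothetical month: 700 steady hours at 2 units, 20 burst hours at 10 units
hourly_units = [2] * 700 + [10] * 20

serverless = usage_based_monthly_cost(
    hourly_units,
    unit_hour_price=0.25,   # placeholder price
    storage_gb=100,
    storage_gb_price=0.02,  # placeholder price
)
provisioned = fixed_monthly_cost(node_count=3, node_hour_price=0.5)  # placeholder price
```

With these placeholder numbers the usage-based total comes to about 402 against 1080 for the fixed cluster, but the ratio depends entirely on how long the bursts last, which is why the burst profile should come from your own traffic data.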

Migration checklist

  • Run a representative proof-of-concept with your embedding sizes and query patterns.
  • Validate bulk indexing throughput and concurrent query SLA under agent-style workloads.
  • Integrate search observability into model evaluation pipelines to surface retrieval failures and drift.
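
Surfacing retrieval failures requires a retrieval-quality metric. A common choice is recall@k: the fraction of the exact nearest neighbors that the approximate index actually returned. The sketch below computes the exact neighbors by brute-force cosine similarity over a small in-memory corpus; the function names and the toy vectors are assumptions, and the approximate results would in practice come from your search queries.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def exact_top_k(query, corpus, k):
    """Brute-force ground truth: ids of the k most similar vectors."""
    ranked = sorted(
        corpus,
        key=lambda doc_id: cosine_similarity(query, corpus[doc_id]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids):
    """Share of the true top-k that the approximate search returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Toy corpus with 2-dimensional embeddings
corpus = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
truth = exact_top_k([1.0, 0.0], corpus, k=2)

# Pretend the approximate index returned "a" and "c"
recall = recall_at_k(["a", "c"], truth)
```

Tracking this value over a fixed evaluation set makes drift visible: a drop in recall after a re-index or an embedding-model change points at the retrieval layer rather than the generation step.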

AWS Resilience Hub Next‑Gen and AWS IoT Device SDK for Swift GA

Resilience Hub

  • The next generation introduces an application model, automated dependency discovery/assessments, generative failure-mode analysis, modular resilience policies, and org-wide reporting. Treat the generative outputs as prioritized hypotheses to validate with experiments and chaos tests.

Integration pointers

  • Export application manifests from Terraform, CloudFormation, or CDK and import them into Resilience Hub, then use automated assessments as part of release gating.
  • Use generated failure modes to prioritize chaos experiments; do not treat them as a substitute for real tests.

IoT Device SDK for Swift (GA)

  • The Swift SDK now supports MQTT 5, Device Shadow, Jobs, and fleet provisioning on Apple platforms and Linux. This enables first-class Swift clients for device fleets and backend agents built in Swift.

Operational notes

  • Security: enforce mutual TLS, automated certificate rotation, and least-privilege provisioning templates.
  • Scale: test MQTT 5 features (session expiry, shared subscriptions) under device churn and burst messaging from agent workflows.
  • Integration: ensure device telemetry and shadow state feed your ML feature store and retrieval indexes with correct labeling and retention policies.

Recommended next steps for platform teams

  1. Update FinOps: implement token- and model-level tagging, per-model budgets, and anomaly alerts on token consumption.
  2. Harden access control: create IAM boundaries for Bedrock and scoped, ephemeral credentials for agent tool access.
  3. Improve observability: correlate token billing, model calls, agent invocations, and OpenSearch Serverless queries in end-to-end traces and SLO dashboards.
  4. Benchmark and validate: measure Bedrock latency (cold/warm), tail behavior, and OpenSearch Serverless indexing/query performance with representative embeddings.
  5. Use Resilience Hub: export manifests and include resilience assessments and chaos tests as deployment gates.
  6. Strengthen agent controls: require content-level logging, allow/deny lists, and immutable audit trails for agent flows.
  7. Prototype migrations: move a subset of indexes or inference traffic to the managed services to validate compatibility, vector support, latency, and cost.
  8. Leverage IoT Swift GA where appropriate: standardize on the SDK to simplify device provisioning and shadow/jobs integration.
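
For the benchmarking step, latency samples need to be reduced to the percentiles that the SLOs are written against. The sketch below uses the nearest-rank method; the function names, the integer-percentile restriction, and the example targets are assumptions made for illustration.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not samples:
        raise ValueError("no samples")
    if not 1 <= pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(latencies_ms):
    return {f"p{pct}": percentile(latencies_ms, pct) for pct in (50, 95, 99)}


def meets_slo(summary, targets_ms):
    """True when every targeted percentile is at or below its limit."""
    return all(summary[name] <= limit for name, limit in targets_ms.items())


# Example with synthetic latencies of 1..100 ms
summary = summarize(list(range(1, 101)))
ok = meets_slo(summary, {"p95": 120, "p99": 150})  # hypothetical targets
```

Cold and warm runs should be summarized separately, since mixing them hides exactly the tail behavior the benchmark is meant to expose.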

Conclusion

These releases provide powerful managed primitives for LLM-backed applications and search-based retrieval, but they change where operational risk and control reside. Adopt pay-per-token and serverless offerings incrementally, update governance and observability first, and validate resilience and performance with representative tests before migrating critical workloads.

