# AWS AI Infrastructure — 4+1 Layer AI Infrastructure Assessment

> Mapped to the 4+1 Layer AI Infrastructure Model  
> Version: v1.0 — Draft, Editorial Review Pending · Date: May 21, 2026  
> Source: re:Invent 2025, GTC 2026, Bedrock AgentCore GA, AgentCore Policy GA (Mar 2026), SageMaker Unified Studio, AWS/NVIDIA collaboration, OpenAI/AWS partnership, analyst coverage  
> Published by: The CTO Advisor LLC · thectoadvisor.com  
> Author: Keith Townsend

[Full interactive assessment](https://layer2c.web.app/assessment/aws) · [Methodology](https://layer2c.web.app/methodology) · [What Is Layer 2C?](https://layer2c.web.app/what-is-layer-2c)

## Executive Summary

AWS is the first vendor in this assessment series that makes a credible claim across every layer of the 4+1 model — including Layer 2C. The structural difference between AWS and every on-prem vendor (Dell, HPE, VAST) is the direction of authority. On-prem vendors build upward from hardware, attempting to extend authority into orchestration and runtime layers. AWS builds downward from managed services, extending authority into custom silicon (Trainium, Inferentia, Graviton), custom networking (EFA/SRD, Nitro), and now on-prem infrastructure (AWS AI Factories).

The DAPM classification for AWS is structurally inverted compared to on-prem vendors. The enterprise architect using AWS retains less direct authority at every layer — but gains operational leverage that on-prem vendors cannot match. The question is not whether AWS has the capabilities. The question is whether the enterprise architect has made the authority delegation explicit, and whether they understand what borrowed judgment they inherit when they adopt AWS’s Reasoning Plane as their own.

AWS already has the control plane everyone else is trying to build. The problem is that customers do not always see where AWS’s control plane ends and their own authority begins. When Bedrock routes inference, SageMaker auto-scales, or Karpenter provisions nodes — those are Layer 2C functions operating invisibly inside managed services. This is the ‘DGX Realization’ that birthed the 4+1 model: the cloud operates an invisible Reasoning Plane that becomes visible only when you try to replicate it on bare metal.

The OpenAI partnership (2GW of Trainium capacity, Stateful Runtime on Bedrock) and the deepened NVIDIA collaboration (1M+ GPUs including Blackwell and Rubin) show AWS positioning itself as the substrate on which multiple AI ecosystems converge. The result is one of the broadest Layer 3 ecosystems and one of the most complex borrowed judgment landscapes.

## Layer Status

| Layer | Status | Layer Name |
|---|---|---|
| Layer 0 | ● Ceded to AWS | Compute & Network Fabric |
| Layer 1A | ● Delegated | Data Storage & Governance |
| Layer 1B | ● Delegated | Context Management & Retrieval |
| Layer 1C | ● Delegated | Data Movement & Pipelines |
| Layer 2A | ● Delegated / Retained | Infrastructure Orchestration |
| Layer 2B | ● Delegated / Retained | Application Runtime & Execution |
| Layer 2C | ◑ Intelligence 2C: Delegated; Infra 2C: Implicit | Agentic Infrastructure — The Reasoning Plane |
| Layer 3 (+1) | ● Broadest Ecosystem | AI Application Layer — The Value Plane |

## DAPM Profile

| Classification | Count | Meaning |
|---|---|---|
| Retained | 1 | Enterprise owns and controls this capability |
| Delegated | 17 | Provided by substitutable partner; enterprise retains swap authority |
| Ceded | 6 | Vendor controls this; enterprise has no governance authority |
| Absent | 0 | No capability at this layer |
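
The counts in this table can be reproduced from the `[DAPM: ...]` tags on the component entries in the layer-by-layer detail. The following is a minimal Python sketch. It assumes that components carrying two classifications (for example `Delegated / Retained`) are tracked separately and not counted toward either total. That treatment is inferred from the numbers and is not a stated rule of the methodology.

```python
from collections import Counter

# DAPM tags transcribed from the component entries in this assessment, by layer.
TAGS = {
    "0":  ["Ceded", "Ceded", "Ceded", "Ceded"],
    "1A": ["Ceded", "Delegated", "Delegated"],
    "1B": ["Delegated", "Delegated", "Delegated"],
    "1C": ["Delegated", "Delegated"],
    "2A": ["Delegated", "Ceded"],
    "2B": ["Delegated", "Delegated", "Delegated / Retained", "Retained"],
    "2C": ["Delegated", "Delegated", "Delegated", "Delegated"],
    "3":  ["Delegated / Retained", "Delegated", "Delegated", "Delegated"],
}


def tally(tags_by_layer):
    """Count single-classification components; report mixed ones separately."""
    counts, mixed = Counter(), 0
    for tags in tags_by_layer.values():
        for tag in tags:
            if "/" in tag:
                mixed += 1  # assumption: mixed tags are excluded from the totals
            else:
                counts[tag] += 1
    return counts, mixed


counts, mixed = tally(TAGS)
print(dict(counts), "mixed:", mixed)
```

Under that assumption the tally gives 1 Retained, 17 Delegated, and 6 Ceded, with 2 mixed entries left outside the totals.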

## Strongest Layers

- **Layer 0** (Compute & Network Fabric) — Ceded to AWS
- **Layer 1A** (Data Storage & Governance) — Delegated
- **Layer 1B** (Context Management & Retrieval) — Delegated
- **Layer 1C** (Data Movement & Pipelines) — Delegated
- **Layer 2A** (Infrastructure Orchestration) — Delegated / Retained
- **Layer 2B** (Application Runtime & Execution) — Delegated / Retained
- **Layer 3 (+1)** (AI Application Layer — The Value Plane) — Broadest Ecosystem

## Layer-by-Layer Detail

### ● Layer 0: Compute & Network Fabric

*Raw compute, networking, and acceleration fabric*  
**Status:** Ceded to AWS

**AWS Custom Silicon (Annapurna Labs)** [DAPM: Ceded]  
Trainium3 GA: EC2 Trn3 UltraServers, 144 chips/UltraServer, 4.4x compute vs Trn2, 4x energy efficiency, 4x memory bandwidth. Sub-10µs chip-to-chip latency. Designed for agentic AI, MoE models, large-scale RL. Trainium4 expected 2027. Inferentia2 for inference. Graviton for ARM CPU. AWS-owned silicon IP.

**NVIDIA GPUs on AWS** [DAPM: Ceded]  
Broadest NVIDIA GPU collection of any cloud. P5 (H100), P5e (H200), P6 (B200), P6e (GB200). 1M+ GPUs added 2026 including Blackwell and Rubin.

**Nitro System + EFA + SRD** [DAPM: Ceded]  
Custom hardware/firmware for I/O offload. Hardware-enforced security isolation. EFA: OS-bypass with 3,200 Gbps bandwidth. SRD: AWS custom multi-path, fault-tolerant transport. EC2 UltraClusters: petabit-scale, 20,000 GPUs, 16% latency reduction (v2.0).

**AWS AI Factories (On-Prem)** [DAPM: Ceded]  
Dedicated on-prem environments as private AWS Region. Customer provides space/power; AWS deploys and manages Trainium, NVIDIA GPUs, networking, storage, and full managed services (Bedrock, SageMaker). Inverted vs Dell/HPE: AWS operates infrastructure the customer houses.

**Gap Analysis:** The enterprise has no authority over Layer 0 hardware beyond choosing instance types. The multi-accelerator marketplace (Trainium, NVIDIA, AMD, Intel) creates a workload-to-silicon matching problem that is itself a Layer 2C function. This problem doesn’t exist in on-prem (accelerator choice made once at procurement) but recurs with every cloud workload placement decision.

AWS is the only vendor in this assessment series that owns accelerator silicon IP (Annapurna Labs). Dell and HPE brand third-party silicon. VAST has no Layer 0 silicon. AWS AI Factories invert the on-prem model: the infrastructure is Ceded even when it sits physically in the customer’s facility.

**Borrowed Judgment:** Inverted. The enterprise Cedes Layer 0 entirely — AWS makes all silicon, networking, and infrastructure decisions. The enterprise selects from AWS’s menu but does not influence underlying hardware design, networking topology, or physical infrastructure. The trade-off: loss of direct hardware authority in exchange for operational leverage (no procurement lead time, per-workload silicon selection, managed scaling).

### ● Layer 1A: Data Storage & Governance

*Durable, governed data foundation — the Governance Catalog that Layer 2C queries*  
**Status:** Delegated

**Amazon S3 + S3 Tables** [DAPM: Ceded]  
De facto object storage standard. S3 Tables (re:Invent 2025): Iceberg-native table storage. S3 Express One Zone: single-digit ms latency.

**AWS Glue Data Catalog + Lake Formation** [DAPM: Delegated]  
Centralized metadata with catalog federation to remote Iceberg catalogs. Fine-grained access control, cross-account sharing, column/row-level security. SageMaker and Bedrock inherit IAM/Lake Formation context. The primitives for 1A→2C exist; the composition is customer-built.

**SageMaker Catalog** [DAPM: Delegated]  
Discovery, subscription, governed sharing of data assets within SageMaker Unified Studio.

**Gap Analysis:** Most mature governance catalog in this assessment. Glue + Lake Formation metadata is API-accessible to higher layers. The 1A→2C connection (Reasoning Plane querying governance metadata for placement decisions) is not a product today — the primitives exist, composition is customer-built. Catalog federation to remote Iceberg catalogs is unmatched within this series. Hybrid gap: federated catalog covers S3 and Iceberg-compatible catalogs but not proprietary on-prem storage metadata.

**Borrowed Judgment:** Delegated with customer-retained policy. Lake Formation policies are customer-defined; enforcement is AWS-managed. Cleaner DAPM than Dell (MetadataIQ indexes Dell-only) or VAST (governance catalog is proprietary).

### ● Layer 1B: Context Management & Retrieval

*Low-latency retrieval for RAG — vector/hybrid search, context windows*  
**Status:** Delegated

**Amazon Bedrock Knowledge Bases** [DAPM: Delegated]  
Managed RAG: ingest → chunk → embed → index → retrieve. Supports OpenSearch, Aurora/pgvector, Pinecone, Redis. Vertically integrates 1B+1C within managed boundary.

**Amazon OpenSearch Serverless** [DAPM: Delegated]  
Vector search with HNSW/FAISS. Default Bedrock Knowledge Bases backend. Serverless scaling.

**Amazon Neptune** [DAPM: Delegated]  
Graph database for relationship-aware retrieval. Multi-hop reasoning for agentic workloads.

**Gap Analysis:** Bedrock Knowledge Bases erases the 1B/1C boundary within its managed surface. That is borrowed judgment, not a gap. AWS makes chunking, embedding, and retrieval-strategy decisions on the customer’s behalf. The enterprise should ask whether those defaults suit its domain.

Interoperability gap: no unified retrieval abstraction across Bedrock Knowledge Bases + self-hosted Weaviate + Neptune. Routing logic between backends is a Layer 2C function living in application code.
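
To make the point about routing logic concrete, here is a minimal Python sketch of the kind of router that ends up in application code. Everything in it is hypothetical: `RetrievalRouter`, `Hit`, the backend names, and the stub backends are illustrative and do not correspond to any AWS, Weaviate, or Neptune API. Real backends would wrap each service's own client.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str
    text: str
    score: float


# A backend is any callable taking (query, top_k) and returning hits.
Backend = Callable[[str, int], List[Hit]]


class RetrievalRouter:
    """Hypothetical application-level router across retrieval backends."""

    def __init__(self, backends: Dict[str, Backend],
                 rules: Dict[str, List[str]], default: List[str]):
        self.backends = backends
        self.rules = rules        # query kind -> ordered backend names
        self.default = default    # used when no rule matches

    def retrieve(self, query: str, kind: str, top_k: int = 3) -> List[Hit]:
        names = self.rules.get(kind, self.default)
        hits: List[Hit] = []
        for name in names:
            hits.extend(self.backends[name](query, top_k))
        # Simplification: raw scores from different backends are treated as
        # comparable. A real router would normalize or re-rank them.
        return sorted(hits, key=lambda h: h.score, reverse=True)[:top_k]


def stub(source: str, score: float) -> Backend:
    """Stand-in backend that returns one canned hit."""
    return lambda query, top_k: [Hit(source, f"{source} result for {query!r}", score)]


router = RetrievalRouter(
    backends={
        "knowledge_base": stub("knowledge_base", 0.8),
        "weaviate": stub("weaviate", 0.6),
        "neptune": stub("neptune", 0.9),
    },
    rules={
        "relationship": ["neptune", "knowledge_base"],
        "semantic": ["knowledge_base", "weaviate"],
    },
    default=["knowledge_base"],
)
results = router.retrieve("who approved invoice 1042?", kind="relationship")
fallback = router.retrieve("anything", kind="unknown")
```

The `rules` table is where the Layer 2C decision lives in this sketch. Because it sits in application code, nothing outside the application governs or audits it.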

**Borrowed Judgment:** Moderate. Bedrock Knowledge Bases makes retrieval quality decisions the enterprise inherits without explicit governance. Compare to VAST (InsightEngine — tighter but VAST-controlled) or Dell (Elastic — separate ISV).

### ● Layer 1C: Data Movement & Pipelines

*Move/transform data — ETL/ELT, lineage, governed data preparation*  
**Status:** Delegated

**AWS Glue + SageMaker Unified Studio** [DAPM: Delegated]  
Glue: serverless ETL with Spark 3.5.6, Iceberg 1.10. SageMaker Unified Studio: horizontal integration across 1A/1C/2B with one-click onboarding. Single governed environment collapsing organizational boundaries across data engineering, data science, ML engineering.

**Amazon MWAA + Step Functions** [DAPM: Delegated]  
Managed Airflow for complex DAGs. Step Functions for serverless multi-step workflows in ML pipeline reference architectures.

**Gap Analysis:** AWS horizontal integration vs VAST vertical integration: same functional coverage, different authority models. VAST = one authority boundary, fewer choices, fewer seams. AWS = many services sharing governance via Lake Formation/IAM, more policy control, more operational complexity. SageMaker Unified Studio addresses fragmentation but underlying services remain distinct.

**Borrowed Judgment:** Delegated with customer-retained configuration. AWS provides pipeline services; customer defines transformations and flows. Operational complexity of maintaining consistent governance across many AWS accounts is the trade-off for flexibility.

### ● Layer 2A: Infrastructure Orchestration

*GPU scheduling, capacity management, autoscaling*  
**Status:** Delegated / Retained

**Amazon EKS Auto Mode + Karpenter** [DAPM: Delegated]  
Managed K8s with GPU-aware scheduling. EKS Auto Mode automates cluster/compute management. Karpenter: open-source autoscaler provisioning exact instance types. Mixed compute (NVIDIA, Trainium, Inferentia, Graviton). Caveats: no DRA (Dynamic Resource Allocation) support, and Capacity Blocks negate scale-to-zero.

**Capacity Management** [DAPM: Ceded]  
Capacity Block Reservations, Flex Start, Savings Plans, Spot. All AWS-controlled allocation. These are capacity acquisition mechanisms, not workload placement reasoning.

**Gap Analysis:** For Dell/HPE, Layer 2A is where authority slips to NVIDIA. For AWS, Layer 2A is where AWS retains authority through managed services while integrating NVIDIA optionally. GPU scheduling primitives are AWS-controlled.

Governance choice: Retain 2A by running self-managed EKS, or Cede 2A by consuming Bedrock (no EKS, no Karpenter — AWS handles 2A invisibly). Both legitimate; the choice is a governance decision with DAPM implications.

**Borrowed Judgment:** Low to moderate depending on path. Self-managed EKS: Retained. Bedrock consumption: Ceded. NVIDIA dependency is optional, structurally different from Dell/HPE where Run:ai is the primary GPU scheduler.

### ● Layer 2B: Application Runtime & Execution

*Model serving, agent execution, inference APIs, distributed inference*  
**Status:** Delegated / Retained

**Amazon Bedrock** [DAPM: Delegated]  
Foundation model access: Anthropic Claude, Amazon Nova, Meta Llama, OpenAI (Stateful Runtime), Mistral, Cohere, NVIDIA Nemotron. Unified API. Fine-tuning including RFT.

**Bedrock AgentCore Runtime** [DAPM: Delegated]  
Serverless agent runtime. Framework-agnostic (Strands, LangChain, CrewAI). Protocol-agnostic (MCP, A2A). Model-agnostic. MicroVM session isolation. 2M+ SDK downloads in 5 months.

**SageMaker AI + Self-Hosted** [DAPM: Delegated / Retained]  
Training, fine-tuning, inference endpoints. LMI containers with vLLM. Multi-LoRA. Supports Trainium + NVIDIA. vLLM on EKS and Ray on EKS for fully self-hosted (Retained).

**Strands Agents SDK** [DAPM: Retained]  
AWS open-source agentic framework. Model-first, native AgentCore/Guardrails/OpenTelemetry integration. Multi-agent patterns with A2A.

**Gap Analysis:** AWS owns multiple 2B surfaces (Bedrock, SageMaker, AgentCore, EKS). NVIDIA dependency is optional in a way it’s not for Dell/HPE. Agent frameworks blur 2B/2C/3 boundaries — AgentCore bundles Runtime (2B) + Policy (2C) + agent logic (3). Product boundary ≠ architectural boundary.

Using Bedrock to access Anthropic Claude or Meta Llama also imports borrowed judgment: the model provider’s alignment decisions become part of the enterprise’s AI system. Guardrails constrain output, but the reasoning embedded in model weights is not customer-configurable.

**Borrowed Judgment:** Varies by path. Bedrock: Delegated + model provider borrowed judgment. SageMaker self-hosted: Retained. AgentCore: Delegated. Self-hosted EKS: fully Retained. Runtime proliferation is itself a 2C decision AWS doesn’t automate.

### ◑ Layer 2C: Agentic Infrastructure — The Reasoning Plane

*Policy-driven placement and resource coordination — the Autonomy Layer*  
**Status:** Intelligence 2C: Delegated | Infra 2C: Implicit

**Bedrock Guardrails** [DAPM: Delegated]  
Content filtering, PII protection, topic blocking. ApplyGuardrail API works with any model. Cross-account organizational safeguards (GA Apr 2026).

**AgentCore Policy (GA Mar 2026)** [DAPM: Delegated]  
Centralized governance outside agent code. Natural language → Cedar policy. Intercepts every tool call before execution. 13 AWS regions.

**AgentCore Evaluations + Memory** [DAPM: Delegated]  
Built-in evaluators for correctness, safety, adherence, consistency. Episodic Memory for stateful reasoning across sessions.

**AWS Agent Registry (Preview)** [DAPM: Delegated]  
Governed catalog for agents, tools, skills, MCP servers. Works across AWS, other cloud, on-prem.

**Gap Analysis:** Intelligence Layer 2C (partially present): AgentCore Policy + Guardrails + Evaluations govern agent behavior. Real and productized. Natural-language-to-Cedar conversion is the most accessible policy authoring in this assessment.
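
The pattern described for AgentCore Policy, governance held outside agent code and checked before every tool call, can be sketched in a few lines of Python. This is not the AgentCore API and not Cedar syntax. `Rule`, `is_allowed`, and `guarded_call` are hypothetical names, and the sketch only mirrors Cedar's default-deny, forbid-overrides-permit evaluation model.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class Rule:
    effect: str       # "permit" or "forbid"
    principal: str    # agent name, or "*" for any agent
    action: str       # tool name, or "*" for any tool
    condition: Optional[Callable[[Dict[str, Any]], bool]] = None


class PolicyDenied(Exception):
    """Raised when a tool call is blocked before execution."""


def is_allowed(rules: List[Rule], agent: str, tool: str, args: Dict[str, Any]) -> bool:
    matching = [
        r for r in rules
        if r.principal in ("*", agent)
        and r.action in ("*", tool)
        and (r.condition is None or r.condition(args))
    ]
    if any(r.effect == "forbid" for r in matching):
        return False                                      # forbid overrides permit
    return any(r.effect == "permit" for r in matching)    # otherwise default deny


def guarded_call(rules: List[Rule], agent: str, tool_name: str,
                 tool: Callable[..., Any], **args: Any) -> Any:
    """Intercept the call: evaluate policy first, run the tool only if permitted."""
    if not is_allowed(rules, agent, tool_name, args):
        raise PolicyDenied(f"{agent} may not call {tool_name} with {args}")
    return tool(**args)


# Illustrative policy set and tool; both are made up for this example.
rules = [
    Rule("permit", "refund-agent", "issue_refund", lambda a: a["amount"] <= 500),
    Rule("forbid", "*", "delete_customer"),
]


def issue_refund(amount: int) -> str:
    return f"refunded {amount}"


small = guarded_call(rules, "refund-agent", "issue_refund", issue_refund, amount=100)
try:
    guarded_call(rules, "refund-agent", "issue_refund", issue_refund, amount=900)
    blocked = False
except PolicyDenied:
    blocked = True
```

The agent code never sees the rules. That separation is what lets the policy be changed and audited without redeploying the agent.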

Infrastructure Layer 2C (not built as a customer-facing service): No service answers ‘given data residency, cost, latency, and compliance, should this run on Trainium in us-east-1 or NVIDIA in eu-west-1?’ Capacity primitives are building blocks, not a policy-driven placement engine querying 1A governance metadata.
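
As a rough illustration of what such a placement engine would decide, the Python sketch below filters candidate targets on residency, compliance, and latency, then picks the cheapest one that remains. All names (`Target`, `Workload`, `place`) and all numbers and tags are hypothetical. They are not AWS pricing, AWS APIs, or real compliance attestations, and in practice the workload constraints would come from Layer 1A governance metadata.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Target:
    accelerator: str
    region: str
    cost_per_hour: float          # illustrative number, not AWS pricing
    latency_ms: float             # illustrative number
    compliance_tags: frozenset    # illustrative tags


@dataclass(frozen=True)
class Workload:
    name: str
    allowed_regions: frozenset    # would be derived from data residency metadata
    required_tags: frozenset
    max_latency_ms: float


def place(workload: Workload, targets: List[Target]) -> Optional[Target]:
    """Apply hard constraints first, then choose the cheapest eligible target."""
    eligible = [
        t for t in targets
        if t.region in workload.allowed_regions
        and workload.required_tags <= t.compliance_tags
        and t.latency_ms <= workload.max_latency_ms
    ]
    return min(eligible, key=lambda t: t.cost_per_hour, default=None)


targets = [
    Target("Trainium", "us-east-1", cost_per_hour=10.0, latency_ms=40.0,
           compliance_tags=frozenset({"soc2"})),
    Target("NVIDIA", "eu-west-1", cost_per_hour=14.0, latency_ms=55.0,
           compliance_tags=frozenset({"soc2", "eu-residency"})),
]
eu_job = Workload("claims-summarizer", frozenset({"eu-west-1"}),
                  frozenset({"eu-residency"}), max_latency_ms=100.0)
any_job = Workload("batch-embeddings", frozenset({"us-east-1", "eu-west-1"}),
                   frozenset({"soc2"}), max_latency_ms=100.0)

eu_choice = place(eu_job, targets)
any_choice = place(any_job, targets)
```

Returning `None` when nothing is eligible is a deliberate choice in this sketch: a placement engine that silently relaxes a residency constraint would be making a governance decision on the enterprise's behalf.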

The structural insight: AWS already has the control plane everyone else is trying to build — but it’s implicit. Dell’s 2C gap is product absence. AWS’s 2C gap is visibility and authority — the capability exists but is implicit, managed, and Ceded.

Four-vendor Layer 2C comparison:

- Dell: Absent.
- HPE: Retained (IT ops) + Delegated (Kamiwaza).
- VAST: Retained/Emerging (PolicyEngine + Polaris, GA end 2026).
- AWS: Intelligence 2C Delegated (productized). Infrastructure 2C implicit (inside managed services).

The question is not ‘Does AWS have Layer 2C?’ but ‘How much can the enterprise configure, audit, and control — and how much has been Ceded without explicit classification?’

**Borrowed Judgment:** Intelligence 2C: Low — AgentCore Policy, Guardrails, Evaluations are AWS IP. Customer defines policies; AWS enforces.

Infrastructure 2C: Ceded (implicit) — placement decisions inside managed services without explicit customer policy input. When SageMaker auto-scales or Bedrock routes, those are 2C functions the enterprise has Ceded without classification.

DAPM discipline demands: for every managed service placement decision, classify as Delegated (customer sets policy) or Ceded (AWS decides).
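
One lightweight way to practice this discipline is a ledger of managed-service placement decisions that flags anything not yet classified. The Python sketch below assumes a single test, whether the customer sets the policy, which follows the Delegated/Ceded definitions given above. The class names and the sample entries are hypothetical placeholders for an enterprise's own review, not findings of this assessment.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Dapm(Enum):
    DELEGATED = "Delegated"   # customer sets policy
    CEDED = "Ceded"           # AWS decides


@dataclass
class PlacementDecision:
    service: str
    decision: str
    customer_sets_policy: Optional[bool]   # None means nobody has reviewed it yet


def classify(d: PlacementDecision) -> Optional[Dapm]:
    if d.customer_sets_policy is None:
        return None
    return Dapm.DELEGATED if d.customer_sets_policy else Dapm.CEDED


def unclassified(ledger: List[PlacementDecision]) -> List[str]:
    """Decisions that have been handed over without an explicit classification."""
    return [f"{d.service}: {d.decision}" for d in ledger if classify(d) is None]


# Placeholder entries; each enterprise would fill these in from its own review.
ledger = [
    PlacementDecision("Karpenter", "node provisioning", customer_sets_policy=True),
    PlacementDecision("Bedrock", "inference routing", customer_sets_policy=False),
    PlacementDecision("SageMaker", "endpoint auto-scaling", customer_sets_policy=None),
]
gaps = unclassified(ledger)
```

An empty `gaps` list does not mean the enterprise controls every decision. It means every decision it does not control has been acknowledged as Ceded.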

### ● Layer 3 (+1): AI Application Layer — The Value Plane

*AI-powered business capabilities — business logic, workflow automation*  
**Status:** Broadest Ecosystem

**Bedrock Agents + Strands SDK** [DAPM: Delegated / Retained]  
No-code (Bedrock Agents) and full-code (Strands) agent construction. Bedrock Agents: Delegated behavior. Strands: Retained authority. Both span 2B/2C/3.

**Amazon Q** [DAPM: Delegated]  
AWS AI assistant for business and development. Enterprise Delegates application behavior to AWS.

**Model + ISV Ecosystem** [DAPM: Delegated]  
Anthropic Claude, Amazon Nova, Meta Llama, OpenAI, Mistral, Cohere, NVIDIA Nemotron. Thousands of ISV applications. 11,000+ government agencies.

**AWS Kiro (Agentic IDE Platform)** [DAPM: Delegated]  
Spec-driven agentic development platform replacing Amazon Q Developer (new signups ended May 2026). Three surfaces: VS Code-compatible IDE, CLI, and autonomous cloud agent. Spec-driven development generates requirements.md, design.md, and tasks.md before code — specs are source-of-truth, code is build artifact. Hooks system: 17 automated quality gates (security, linting, testing, validation) firing on file save and PR events. Multi-model routing: Claude Sonnet for reasoning-heavy specs, Amazon Nova for high-throughput code generation, Bedrock as unified model plane. 50+ Powers (MCP integrations: Figma, Terraform, Stripe, Datadog). Autonomous agent executes backlog tasks and opens PRs without developer in the loop. Deep AWS context: native Powers for AWS pricing, docs, Well-Architected, cost analysis. The most opinionated developer AI surface from any cloud vendor — enforces structured development discipline rather than freeform 'vibe coding.'

**Gap Analysis:** Broadest Layer 3 in this assessment — different category than Dell ISV partnerships, HPE Unleash AI, or VAST Cosmos. Each Layer 3 application brings its own governance domain. AgentCore Policy and Guardrails provide cross-agent governance primitives; whether they compose into enterprise-wide agent governance remains an implementation question.

The Retained/Delegated boundary is not uniform. Custom Strands on self-hosted EKS: fully Retained. Bedrock Agents / Q / partner apps: substantially Delegated. Same enterprise may have both patterns simultaneously.

Kiro represents AWS's strongest Layer 3 opinion: spec-driven development enforces structured requirements before code generation. This is an opinionated development methodology embedded in tooling — the enterprise Delegates development workflow decisions to AWS's architectural opinions about how AI-assisted software should be built. The autonomous agent (cloud agent executing tasks and opening PRs without human in the loop) creates a new DAPM question: when Kiro's agent writes and ships code autonomously, who owns the judgment embedded in that code? The developer who assigned the task, or Kiro's multi-model routing logic that chose which model to apply?

Compare to Google Antigravity 2.0 (agent orchestration platform, multi-agent parallel execution, Gemini-native) and GitHub Copilot (IDE-embedded coding agent with cloud agent for autonomous PR creation). AWS, Google, and Microsoft (through GitHub) now all have agentic developer surfaces that span Layer 2B (execution) and Layer 3 (application). The competitive dynamics are shifting from 'which cloud has the best models' to 'which cloud has the most productive developer surface.'

**Borrowed Judgment:** Distributed and complex. Model providers bring training data, alignment, safety decisions as inherited borrowed judgment. AWS platform defaults shape application behavior. DAPM Action 3 applies with force: when you move off AWS, what judgment doesn’t move with you? Answer: almost everything above Layer 0.

---
*Layer2C · AI Infrastructure Decision Intelligence · The CTO Advisor LLC · thectoadvisor.com*
