# Layer2C — Complete Vendor Assessment Data > The CTO Advisor LLC · thectoadvisor.com · Updated May 23, 2026 Complete 4+1 Layer AI Infrastructure Model assessments for 12 vendors. Each layer lists all components with their DAPM classification (Retained / Delegated / Ceded / Absent), followed by gap analysis, borrowed judgment assessment, and working notes. ════════════════════════════════════════════════════════════════════════════════ # AWS AI Infrastructure Mapped to the 4+1 Layer AI Infrastructure Model **Version:** v1.0 — Draft, Editorial Review Pending **Date:** May 21, 2026 **Source:** re:Invent 2025, GTC 2026, Bedrock AgentCore GA, AgentCore Policy GA (Mar 2026), SageMaker Unified Studio, AWS/NVIDIA collaboration, OpenAI/AWS partnership, analyst coverage ## Summary Finding AWS is the first vendor in this assessment series that makes a credible claim across every layer of the 4+1 model — including Layer 2C. The structural difference between AWS and every on-prem vendor (Dell, HPE, VAST) is the direction of authority. On-prem vendors build upward from hardware, attempting to extend authority into orchestration and runtime layers. AWS builds downward from managed services, extending authority into custom silicon (Trainium, Inferentia, Graviton), custom networking (EFA/SRD, Nitro), and now on-prem infrastructure (AWS AI Factories). The DAPM classification for AWS is structurally inverted compared to on-prem vendors. The enterprise architect using AWS retains less direct authority at every layer — but gains operational leverage that on-prem vendors cannot match. The question is not whether AWS has the capabilities. The question is whether the enterprise architect has made the authority delegation explicit, and whether they understand what borrowed judgment they inherit when they adopt AWS’s Reasoning Plane as their own. AWS already has the control plane everyone else is trying to build. 
The problem is that customers do not always see where AWS’s control plane ends and their own authority begins. When Bedrock routes inference, SageMaker auto-scales, or Karpenter provisions nodes, those are Layer 2C functions operating invisibly inside managed services. This is the ‘DGX Realization’ that birthed the 4+1 model: the cloud operates an invisible Reasoning Plane that becomes visible only when you try to replicate it on bare metal.

The OpenAI partnership (2GW of Trainium capacity, Stateful Runtime on Bedrock) and the deepened NVIDIA collaboration (1M+ GPUs including Blackwell and Rubin) show AWS positioning itself as the substrate on which multiple AI ecosystems converge. The result is one of the broadest Layer 3 ecosystems and one of the most complex borrowed judgment landscapes.

## ● Layer 0: Compute & Network Fabric

*Raw compute, networking, and acceleration fabric*

**Status:** Ceded to AWS

### Vendor-Provided Components

**AWS Custom Silicon (Annapurna Labs)** [DAPM: Ceded] Trainium3 GA: EC2 Trn3 UltraServers, 144 chips/UltraServer, 4.4x compute vs Trn2, 4x energy efficiency, 4x memory bandwidth. Sub-10µs chip-to-chip latency. Designed for agentic AI, MoE models, large-scale RL. Trainium4 expected 2027. Inferentia2 for inference. Graviton for ARM CPU. AWS-owned silicon IP.

**NVIDIA GPUs on AWS** [DAPM: Ceded] Broadest NVIDIA GPU collection of any cloud. P5 (H100), P5e (H200), P6 (B200), P6e (GB200). 1M+ GPUs added in 2026, including Blackwell and Rubin.

**Nitro System + EFA + SRD** [DAPM: Ceded] Custom hardware/firmware for I/O offload. Hardware-enforced security isolation. EFA: OS-bypass with 3,200 Gbps bandwidth. SRD: AWS custom multi-path, fault-tolerant transport. EC2 UltraClusters: petabit-scale, 20,000 GPUs, 16% latency reduction (v2.0).

**AWS AI Factories (On-Prem)** [DAPM: Ceded] Dedicated on-prem environments as private AWS Region.
Customer provides space/power; AWS deploys and manages Trainium, NVIDIA GPUs, networking, storage, and full managed services (Bedrock, SageMaker). Inverted vs Dell/HPE: AWS operates infrastructure the customer houses.

### NVIDIA-Provided Components

**NVIDIA GPU Silicon + NIXL** 1M+ GPUs including Blackwell and Rubin. NIXL support with EFA for disaggregated LLM inference.

### Gap Analysis

The enterprise has no authority over Layer 0 hardware beyond choosing instance types. The multi-accelerator marketplace (Trainium, NVIDIA, AMD, Intel) creates a workload-to-silicon matching problem that is itself a Layer 2C function. This problem doesn’t exist on-prem (accelerator choice is made once, at procurement) but recurs with every cloud workload placement decision.

Unlike the on-prem vendors, AWS owns its accelerator silicon IP (Annapurna Labs). Dell and HPE brand third-party silicon. VAST has no Layer 0 silicon. AWS AI Factories invert the on-prem model: the infrastructure is Ceded even when it sits physically in the customer’s facility.

### Borrowed Judgment

Inverted. The enterprise Cedes Layer 0 entirely: AWS makes all silicon, networking, and infrastructure decisions. The enterprise selects from AWS’s menu but does not influence underlying hardware design, networking topology, or physical infrastructure. The trade-off: loss of direct hardware authority in exchange for operational leverage (no procurement lead time, per-workload silicon selection, managed scaling).

### Working Notes

The switching cost / decision frequency distinction between cloud and on-prem at Layer 0 is structural. On-prem: capital decision at procurement. Cloud: per-workload decision at runtime. That fluidity itself requires Layer 2C.

## ● Layer 1A: Data Storage & Governance

*Durable, governed data foundation: the Governance Catalog that Layer 2C queries*

**Status:** Delegated

### Vendor-Provided Components

**Amazon S3 + S3 Tables** [DAPM: Ceded] De facto object storage standard.
S3 Tables (re:Invent 2025): Iceberg-native table storage. S3 Express One Zone: single-digit ms latency. **AWS Glue Data Catalog + Lake Formation** [DAPM: Delegated] Centralized metadata with catalog federation to remote Iceberg catalogs. Fine-grained access control, cross-account sharing, column/row-level security. SageMaker and Bedrock inherit IAM/Lake Formation context. The primitives for 1A→2C exist; the composition is customer-built. **SageMaker Catalog** [DAPM: Delegated] Discovery, subscription, governed sharing of data assets within SageMaker Unified Studio. ### Gap Analysis Most mature governance catalog in this assessment. Glue + Lake Formation metadata is API-accessible to higher layers. The 1A→2C connection (Reasoning Plane querying governance metadata for placement decisions) is not a product today — the primitives exist, composition is customer-built. Catalog federation to remote Iceberg catalogs is unmatched within this series. Hybrid gap: federated catalog covers S3 and Iceberg-compatible catalogs but not proprietary on-prem storage metadata. ### Borrowed Judgment Delegated with customer-retained policy. Lake Formation policies are customer-defined; enforcement is AWS-managed. Cleaner DAPM than Dell (MetadataIQ indexes Dell-only) or VAST (governance catalog is proprietary). ### Working Notes The ‘Governance Enables Autonomy’ principle from the 4+1 model is achievable on AWS but requires the enterprise architect to build the governance-to-placement linkage. ## ● Layer 1B: Context Management & Retrieval *Low-latency retrieval for RAG — vector/hybrid search, context windows* **Status:** Delegated ### Vendor-Provided Components **Amazon Bedrock Knowledge Bases** [DAPM: Delegated] Managed RAG: ingest → chunk → embed → index → retrieve. Supports OpenSearch, Aurora/pgvector, Pinecone, Redis. Vertically integrates 1B+1C within managed boundary. **Amazon OpenSearch Serverless** [DAPM: Delegated] Vector search with HNSW/FAISS. 
Default Bedrock Knowledge Bases backend. Serverless scaling. **Amazon Neptune** [DAPM: Delegated] Graph database for relationship-aware retrieval. Multi-hop reasoning for agentic workloads. ### Gap Analysis Bedrock Knowledge Bases erases the 1B/1C boundary within its managed surface — borrowed judgment, not a gap. AWS makes chunking, embedding, retrieval strategy decisions on the customer’s behalf. The enterprise should ask whether defaults suit their domain. Interoperability gap: no unified retrieval abstraction across Bedrock Knowledge Bases + self-hosted Weaviate + Neptune. Routing logic between backends is a Layer 2C function living in application code. ### Borrowed Judgment Moderate. Bedrock Knowledge Bases makes retrieval quality decisions the enterprise inherits without explicit governance. Compare to VAST (InsightEngine — tighter but VAST-controlled) or Dell (Elastic — separate ISV). ### Working Notes Neptune graph-based retrieval is increasingly relevant for agentic workloads needing relationship-aware context — a pattern neither Dell’s Elastic nor VAST’s InsightEngine natively provides. ## ● Layer 1C: Data Movement & Pipelines *Move/transform data — ETL/ELT, lineage, governed data preparation* **Status:** Delegated ### Vendor-Provided Components **AWS Glue + SageMaker Unified Studio** [DAPM: Delegated] Glue: serverless ETL with Spark 3.5.6, Iceberg 1.10. SageMaker Unified Studio: horizontal integration across 1A/1C/2B with one-click onboarding. Single governed environment collapsing organizational boundaries across data engineering, data science, ML engineering. **Amazon MWAA + Step Functions** [DAPM: Delegated] Managed Airflow for complex DAGs. Step Functions for serverless multi-step workflows in ML pipeline reference architectures. ### Gap Analysis AWS horizontal integration vs VAST vertical integration: same functional coverage, different authority models. VAST = one authority boundary, fewer choices, fewer seams. 
AWS = many services sharing governance via Lake Formation/IAM, more policy control, more operational complexity. SageMaker Unified Studio addresses fragmentation but underlying services remain distinct. ### Borrowed Judgment Delegated with customer-retained configuration. AWS provides pipeline services; customer defines transformations and flows. Operational complexity of maintaining consistent governance across many AWS accounts is the trade-off for flexibility. ## ● Layer 2A: Infrastructure Orchestration *GPU scheduling, capacity management, autoscaling* **Status:** Delegated / Retained ### Vendor-Provided Components **Amazon EKS Auto Mode + Karpenter** [DAPM: Delegated] Managed K8s with GPU-aware scheduling. EKS Auto Mode automates cluster/compute management. Karpenter: open-source autoscaler provisioning exact instance types. Mixed compute (NVIDIA, Trainium, Inferentia, Graviton). Note: no DRA support, Capacity Blocks negate scale-to-zero. **Capacity Management** [DAPM: Ceded] Capacity Block Reservations, Flex Start, Savings Plans, Spot. All AWS-controlled allocation. These are capacity acquisition mechanisms, not workload placement reasoning. ### NVIDIA-Provided Components **NVIDIA GPU Operator (on EKS)** Available but optional. AWS controls GPU scheduling through EKS Auto Mode and Karpenter. Run:ai available but not required. ### Gap Analysis For Dell/HPE, Layer 2A is where authority slips to NVIDIA. For AWS, Layer 2A is where AWS retains authority through managed services while integrating NVIDIA optionally. GPU scheduling primitives are AWS-controlled. Governance choice: Retain 2A by running self-managed EKS, or Cede 2A by consuming Bedrock (no EKS, no Karpenter — AWS handles 2A invisibly). Both legitimate; the choice is a governance decision with DAPM implications. ### Borrowed Judgment Low to moderate depending on path. Self-managed EKS: Retained. Bedrock consumption: Ceded. 
NVIDIA dependency is optional, structurally different from Dell/HPE where Run:ai is the primary GPU scheduler. ### Working Notes Capacity Blocks sometimes described as proto-2C but cost-optimized capacity acquisition is not multi-objective placement reasoning. ## ● Layer 2B: Application Runtime & Execution *Model serving, agent execution, inference APIs, distributed inference* **Status:** Delegated / Retained ### Vendor-Provided Components **Amazon Bedrock** [DAPM: Delegated] Foundation model access: Anthropic Claude, Amazon Nova, Meta Llama, OpenAI (Stateful Runtime), Mistral, Cohere, NVIDIA Nemotron. Unified API. Fine-tuning including RFT. **Bedrock AgentCore Runtime** [DAPM: Delegated] Serverless agent runtime. Framework-agnostic (Strands, LangChain, CrewAI). Protocol-agnostic (MCP, A2A). Model-agnostic. MicroVM session isolation. 2M+ SDK downloads in 5 months. **SageMaker AI + Self-Hosted** [DAPM: Delegated / Retained] Training, fine-tuning, inference endpoints. LMI containers with vLLM. Multi-LoRA. Supports Trainium + NVIDIA. vLLM on EKS and Ray on EKS for fully self-hosted (Retained). **Strands Agents SDK** [DAPM: Retained] AWS open-source agentic framework. Model-first, native AgentCore/Guardrails/OpenTelemetry integration. Multi-agent patterns with A2A. ### NVIDIA-Provided Components **NVIDIA GPU Instances + NIM** P5/P6 instances. NVIDIA Nemotron via Bedrock. NVIDIA dependency optional — Trainium-only inference is architecturally possible. ### Gap Analysis AWS owns multiple 2B surfaces (Bedrock, SageMaker, AgentCore, EKS). NVIDIA dependency is optional in a way it’s not for Dell/HPE. Agent frameworks blur 2B/2C/3 boundaries — AgentCore bundles Runtime (2B) + Policy (2C) + agent logic (3). Product boundary ≠ architectural boundary. Borrowed judgment: using Bedrock to access Anthropic Claude or Meta Llama means the model provider’s alignment decisions become part of the enterprise’s AI system. 
Guardrails constrain output but reasoning in model weights is not customer-configurable. ### Borrowed Judgment Varies by path. Bedrock: Delegated + model provider borrowed judgment. SageMaker self-hosted: Retained. AgentCore: Delegated. Self-hosted EKS: fully Retained. Runtime proliferation is itself a 2C decision AWS doesn’t automate. ### Working Notes Product boundary (AgentCore = Runtime + Policy + Evaluations + Memory + Registry) doesn’t align with 4+1 architectural boundary (Runtime = 2B, Policy = 2C, agent logic = 3). Same cross-layer bundling seen in Google’s and VAST’s products. ## ◑ Layer 2C: Agentic Infrastructure — The Reasoning Plane *Policy-driven placement and resource coordination — the Autonomy Layer* **Status:** Intelligence 2C: Delegated | Infra 2C: Implicit ### Vendor-Provided Components **Bedrock Guardrails** [DAPM: Delegated] Content filtering, PII protection, topic blocking. ApplyGuardrail API works with any model. Cross-account organizational safeguards (GA Apr 2026). **AgentCore Policy (GA Mar 2026)** [DAPM: Delegated] Centralized governance outside agent code. Natural language → Cedar policy. Intercepts every tool call before execution. 13 AWS regions. **AgentCore Evaluations + Memory** [DAPM: Delegated] Built-in evaluators for correctness, safety, adherence, consistency. Episodic Memory for stateful reasoning across sessions. **AWS Agent Registry (Preview)** [DAPM: Delegated] Governed catalog for agents, tools, skills, MCP servers. Works across AWS, other cloud, on-prem. ### NVIDIA-Provided Components **No NVIDIA Layer 2C Dependency** All Layer 2C components are AWS IP. NVIDIA does not control governance, policy, or reasoning in the AWS stack. ### Gap Analysis Intelligence Layer 2C (partially present): AgentCore Policy + Guardrails + Evaluations govern agent behavior. Real and productized. Natural-language-to-Cedar conversion is the most accessible policy authoring in this assessment. 
Infrastructure Layer 2C (not built): No service answers ‘given data residency, cost, latency, and compliance, should this run on Trainium in us-east-1 or NVIDIA in eu-west-1?’ Capacity primitives are building blocks, not a policy-driven placement engine querying 1A governance metadata.

The structural insight: AWS already has the control plane everyone else is trying to build, but it is implicit. Dell’s 2C gap is product absence. AWS’s 2C gap is visibility and authority: the capability exists but is implicit, managed, and Ceded.

Four-vendor Layer 2C comparison:

- Dell: Absent.
- HPE: Retained (IT ops) + Delegated (Kamiwaza).
- VAST: Retained/Emerging (PolicyEngine + Polaris, GA end 2026).
- AWS: Intelligence 2C Delegated (productized). Infrastructure 2C implicit (inside managed services).

The question is not ‘Does AWS have Layer 2C?’ but ‘How much can the enterprise configure, audit, and control, and how much has been Ceded without explicit classification?’

### Borrowed Judgment

Intelligence 2C: Low. AgentCore Policy, Guardrails, and Evaluations are AWS IP. Customer defines policies; AWS enforces.

Infrastructure 2C: Ceded (implicit). Placement decisions happen inside managed services without explicit customer policy input. When SageMaker auto-scales or Bedrock routes, those are 2C functions the enterprise has Ceded without classification. DAPM discipline demands: for every managed service placement decision, classify as Delegated (customer sets policy) or Ceded (AWS decides).

### Working Notes

The re:Invent 2025 and AgentCore announcements are the strongest vendor validation of the 4+1 model’s Layer 2C thesis. The ‘invisible Reasoning Plane’ observation is the conceptual foundation of the 4+1 model itself.
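To make the missing Infrastructure 2C function concrete, the following is a minimal sketch of what a policy-driven placement decision could look like. It is not an AWS service or API: the `Target` and `Workload` types, the `place` function, and every cost and latency figure are hypothetical and chosen only for illustration. The sketch assumes the residency constraint would be sourced from Layer 1A governance metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """A candidate placement. All values are illustrative, not AWS pricing."""
    name: str
    region: str
    accelerator: str
    cost_per_hour: float
    latency_ms: float


@dataclass(frozen=True)
class Workload:
    """Constraints a customer-defined policy would supply (hypothetical shape)."""
    name: str
    allowed_regions: frozenset       # data residency, e.g. from 1A governance metadata
    allowed_accelerators: frozenset  # compliance- or compatibility-approved silicon
    max_latency_ms: float


def place(workload, targets):
    """Apply hard constraints first, then pick the cheapest eligible target.

    Returns (chosen_target_or_None, {rejected_target_name: reason}) so the
    decision is auditable rather than implicit.
    """
    rejected = {}
    eligible = []
    for t in targets:
        if t.region not in workload.allowed_regions:
            rejected[t.name] = "residency"
        elif t.accelerator not in workload.allowed_accelerators:
            rejected[t.name] = "accelerator"
        elif t.latency_ms > workload.max_latency_ms:
            rejected[t.name] = "latency"
        else:
            eligible.append(t)
    chosen = min(eligible, key=lambda t: t.cost_per_hour) if eligible else None
    return chosen, rejected


targets = [
    Target("trainium/us-east-1", "us-east-1", "trainium", 10.0, 40.0),
    Target("nvidia/eu-west-1", "eu-west-1", "nvidia", 14.0, 25.0),
]
eu_workload = Workload(
    name="claims-summarizer",
    allowed_regions=frozenset({"eu-west-1"}),
    allowed_accelerators=frozenset({"trainium", "nvidia"}),
    max_latency_ms=50.0,
)
chosen, rejected = place(eu_workload, targets)
```

In this example the cheaper Trainium target is rejected on residency grounds and the decision records why. That recorded reason is the difference between a Delegated placement (customer policy in, auditable decision out) and a Ceded one.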
## ● Layer 3 (+1): AI Application Layer — The Value Plane *AI-powered business capabilities — business logic, workflow automation* **Status:** Broadest Ecosystem ### Vendor-Provided Components **Bedrock Agents + Strands SDK** [DAPM: Delegated / Retained] No-code (Bedrock Agents) and full-code (Strands) agent construction. Bedrock Agents: Delegated behavior. Strands: Retained authority. Both span 2B/2C/3. **Amazon Q** [DAPM: Delegated] AWS AI assistant for business and development. Enterprise Delegates application behavior to AWS. **Model + ISV Ecosystem** [DAPM: Delegated] Anthropic Claude, Amazon Nova, Meta Llama, OpenAI, Mistral, Cohere, NVIDIA Nemotron. Thousands of ISV applications. 11,000+ government agencies. **AWS Kiro (Agentic IDE Platform)** [DAPM: Delegated] Spec-driven agentic development platform replacing Amazon Q Developer (new signups ended May 2026). Three surfaces: VS Code-compatible IDE, CLI, and autonomous cloud agent. Spec-driven development generates requirements.md, design.md, and tasks.md before code — specs are source-of-truth, code is build artifact. Hooks system: 17 automated quality gates (security, linting, testing, validation) firing on file save and PR events. Multi-model routing: Claude Sonnet for reasoning-heavy specs, Amazon Nova for high-throughput code generation, Bedrock as unified model plane. 50+ Powers (MCP integrations: Figma, Terraform, Stripe, Datadog). Autonomous agent executes backlog tasks and opens PRs without developer in the loop. Deep AWS context: native Powers for AWS pricing, docs, Well-Architected, cost analysis. The most opinionated developer AI surface from any cloud vendor — enforces structured development discipline rather than freeform 'vibe coding.' ### NVIDIA-Provided Components **NVIDIA NIM on Bedrock** NVIDIA models via Bedrock API alongside all other providers. ### Gap Analysis Broadest Layer 3 in this assessment — different category than Dell ISV partnerships, HPE Unleash AI, or VAST Cosmos. 
Each Layer 3 application brings its own governance domain. AgentCore Policy and Guardrails provide cross-agent governance primitives; whether they compose into enterprise-wide agent governance remains an implementation question. The Retained/Delegated boundary is not uniform. Custom Strands on self-hosted EKS: fully Retained. Bedrock Agents / Q / partner apps: substantially Delegated. Same enterprise may have both patterns simultaneously. Kiro represents AWS's strongest Layer 3 opinion: spec-driven development enforces structured requirements before code generation. This is an opinionated development methodology embedded in tooling — the enterprise Delegates development workflow decisions to AWS's architectural opinions about how AI-assisted software should be built. The autonomous agent (cloud agent executing tasks and opening PRs without human in the loop) creates a new DAPM question: when Kiro's agent writes and ships code autonomously, who owns the judgment embedded in that code? The developer who assigned the task, or Kiro's multi-model routing logic that chose which model to apply? Compare to Google Antigravity 2.0 (agent orchestration platform, multi-agent parallel execution, Gemini-native) and GitHub Copilot (IDE-embedded coding agent with cloud agent for autonomous PR creation). All three clouds now have agentic developer surfaces that span Layer 2B (execution) and Layer 3 (application). The competitive dynamics are shifting from 'which cloud has the best models' to 'which cloud has the most productive developer surface.' ### Borrowed Judgment Distributed and complex. Model providers bring training data, alignment, safety decisions as inherited borrowed judgment. AWS platform defaults shape application behavior. DAPM Action 3 applies with force: when you move off AWS, what judgment doesn’t move with you? Answer: almost everything above Layer 0. 
### Working Notes

The OpenAI partnership (Stateful Runtime on Bedrock) and the NVIDIA collaboration (1M+ GPUs) position AWS as the convergence substrate for multiple AI ecosystems: the broadest possibilities, and the most complex borrowed judgment landscape.

The Q Developer → Kiro transition (new Q Developer signups ended May 2026) is a significant strategic signal. AWS is consolidating its developer AI surface into a single opinionated platform rather than maintaining parallel tools. Kiro's spec-driven approach is the inverse of 'vibe coding': it imposes engineering discipline through AI tooling. Whether enterprise development teams accept this opinionated workflow or prefer the freeform approach of Cursor/Copilot is the adoption question.

Support for Kiro in OpenShift Dev Spaces, alongside Microsoft Copilot, Claude CLI, Cline, Continue, and Roo, was announced at Red Hat Summit 2026, meaning Kiro can run inside IBM's governed platform. This cross-vendor interoperability matters for the 4+1 model: the developer tool (Layer 3) can be decoupled from the infrastructure platform (Layer 2A). An enterprise could run Kiro on OpenShift on Dell hardware, with three vendors holding authority at three different layers.
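The per-component DAPM tags used throughout the AWS assessment above lend themselves to a simple register that an enterprise architect could keep alongside the architecture. The sketch below is one hypothetical way to do that. The `Entry` type, `REGISTER` list, and `summarize` function are illustrative names, the classified entries are a small sample copied from the assessment, and the two `None` entries stand for the managed-service behaviors the assessment describes as Ceded without explicit classification.

```python
from collections import Counter
from enum import Enum
from typing import NamedTuple, Optional


class DAPM(Enum):
    RETAINED = "Retained"
    DELEGATED = "Delegated"
    CEDED = "Ceded"
    ABSENT = "Absent"


class Entry(NamedTuple):
    layer: str
    component: str
    classification: Optional[DAPM]  # None = in use, but never explicitly classified


# Sample entries only; a real register would list every component in use.
REGISTER = [
    Entry("0", "AWS Custom Silicon", DAPM.CEDED),
    Entry("1A", "AWS Glue Data Catalog + Lake Formation", DAPM.DELEGATED),
    Entry("2B", "Strands Agents SDK", DAPM.RETAINED),
    Entry("2C", "AgentCore Policy", DAPM.DELEGATED),
    Entry("2C", "Bedrock inference routing", None),
    Entry("2C", "SageMaker auto-scaling", None),
]


def summarize(register):
    """Count explicit classifications and list components lacking one."""
    counts = Counter(
        e.classification for e in register if e.classification is not None
    )
    unclassified = [e.component for e in register if e.classification is None]
    return counts, unclassified


counts, unclassified = summarize(REGISTER)
```

The `unclassified` list is the useful output: each item is a decision that is being made on the enterprise's behalf and still needs an explicit Delegated or Ceded label.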
════════════════════════════════════════════════════════════════════════════════ # Microsoft Azure AI Infrastructure Mapped to the 4+1 Layer AI Infrastructure Model **Version:** v1.0 — Draft, Editorial Review Pending **Date:** May 22, 2026 **Source:** Build 2025, GTC 2026, KubeCon Europe 2026, FabCon/SQLCon 2026, Ignite 2024, Microsoft/OpenAI restructured agreement (April 2026), Entra Agent ID GA (April 2026), Agent Governance Toolkit (April 2026), Foundry Agent Service, analyst coverage ## Summary Finding Microsoft Azure is the most structurally complex vendor in this assessment series because it operates three distinct authority systems simultaneously: a massive cloud infrastructure platform (Azure), a deeply integrated but newly non-exclusive frontier model partnership (OpenAI), and the largest enterprise software installed base on earth (Microsoft 365, Entra ID, Purview, Fabric). No other assessed vendor straddles all three domains. Google Cloud owns its frontier model outright. AWS partners with model providers at arm's length. Dell, HPE, and VAST operate below the model layer entirely. Microsoft is the only vendor that must coordinate authority across infrastructure, model intelligence, and enterprise application identity — and the April 2026 OpenAI restructuring has made that coordination both more flexible and more visible. The DAPM classification for Azure reveals a paradox unique among the assessed vendors: Microsoft has more productized Layer 2C capability than any vendor except Google — Agent Governance Toolkit (open-source, sub-millisecond policy enforcement), Entra Agent ID (GA April 2026, agent identity as first-class Entra citizens), Microsoft Agent 365 (unified agent registry and control plane), Foundry Control Plane (centralized observability for agents across frameworks) — yet the enterprise Cedes the most judgment to consume it. The Layer 2C surface is real. The authority delegation is also real. Both statements are true simultaneously. 
The OpenAI restructuring (April 27, 2026) is the most significant borrowed judgment event in this assessment series. Azure exclusivity is gone — OpenAI models now ship on AWS Bedrock the next day. Microsoft retains a four-month first-mover window on new frontier models, IP license through 2032, ~27% equity stake, and OpenAI's commitment to $250B in Azure consumption. But the structural dependency has shifted from contractual lock-in to commercial preference. Microsoft is simultaneously scaling its own model development (MAI-1, speech/image models via Mustafa Suleyman's CoreAI division) — hedging the borrowed judgment it once embraced unconditionally. The enterprise identity story is Azure's most underappreciated differentiator. No other vendor has extended enterprise identity governance to AI agents. Entra Agent ID treats agents as identity citizens alongside humans and workloads — same Conditional Access, same lifecycle management, same risk detection. This is Layer 2C infrastructure that every other vendor will eventually need to build or integrate. Microsoft has it because it already owns the enterprise identity plane. Identity is a cross-cutting concern not fully addressed in earlier assessments in this series — the Azure assessment surfaces it as a structural dimension that should be retroactively evaluated across all vendors. Azure's structural question is not capability — the capabilities span every layer of the 4+1 model. The question is authority composition: when the enterprise adopts Azure AI Foundry + OpenAI models + Entra Agent ID + Fabric data governance + Purview compliance, how many independent judgment systems has it inherited, and has it classified each delegation explicitly? The 4+1 model exists to make that composition visible. Azure is the vendor where the composition is most complex. 
## ● Layer 0: Compute & Network Fabric *Raw compute, networking, and acceleration fabric* **Status:** Ceded to Microsoft ### Vendor-Provided Components **Azure Custom Silicon (Maia + Cobalt)** [DAPM: Ceded] Maia 100: AI accelerator on TSMC 5nm, 105B transistors. Designed for LLM training and inference. Optimized for Azure OpenAI Service and Copilot workloads. Second-generation Maia 200 ('Braga') in development — reported design revisions pushed to 2026. Cobalt 100: ARM-based CPU for general cloud workloads. Both designed in-house by Azure hardware teams. OpenAI partnership now includes rights to OpenAI's custom chip designs for integration into Maia/Cobalt roadmap. **GPU Accelerators on Azure (NVIDIA + AMD)** [DAPM: Ceded] Among the first hyperscalers to deploy Vera Rubin NVL72 (via Azure Local + Foundry Local). ND-series VMs: H100, H200, B200/B300. 1M+ NVIDIA GPUs. Fractional GPU via Azure Kubernetes Service. AMD Instinct MI300X via ND MI300X v5 VMs for inference workloads. The multi-accelerator marketplace (Maia, NVIDIA, AMD) creates a workload-to-silicon matching problem that is itself a Layer 2C function. **Azure Networking (SONiC + Accelerated Networking)** [DAPM: Ceded] Microsoft created SONiC (Software for Open Networking in the Cloud) and open-sourced it through OCP. SONiC is now the de facto open-source network OS for hyperscale, running on switches from Broadcom, NVIDIA, Intel, and others — Microsoft shaped the networking substrate and gave it away. Azure Accelerated Networking with hardware-level SR-IOV. InfiniBand interconnect for GPU clusters. RDMA for distributed training. Microsoft-designed rack architecture, power, and cooling. 80+ regions, 500+ datacenters, 800,000+ km of fiber. **Azure Local (On-Prem)** [DAPM: Delegated] Hyper-converged, customer-owned cluster running Azure services on-premises. Windows/Linux VMs, AKS containers, Azure Virtual Desktop. Connected and fully disconnected modes (Feb 2026). 
Validated OEM hardware from Dell, HPE, Lenovo, and others. Azure Arc extends management, governance, and security. Foundry Local for on-prem AI inference. $10/core/month + optional services. Sovereign Private Cloud (Azure Local + Microsoft 365 Local) for air-gapped environments. ### NVIDIA-Provided Components **NVIDIA GPU Silicon + Networking** Vera Rubin NVL72, Blackwell B200/B300, H100/H200. InfiniBand for GPU cluster interconnect. ConnectX/BlueField NICs. Microsoft manages the NVIDIA integration and instance types. **NVIDIA NIM on Azure** NVIDIA inference microservices available through Microsoft Foundry model catalog alongside OpenAI, open-source, and Microsoft models. ### Gap Analysis Azure's Layer 0 follows the same structural pattern as AWS and GCP: the enterprise Cedes all infrastructure authority in exchange for operational leverage. The multi-accelerator marketplace (Maia, NVIDIA, AMD) creates the same workload-to-silicon matching problem identified in the AWS assessment — a Layer 2C function that no hyperscaler yet automates with policy-driven placement. Azure's custom silicon is less mature than AWS's (Trainium is in production at scale; Maia 100 powers internal services but Maia 200 has slipped) and less differentiated than Google's (TPUs are architecturally distinct; Maia is NVIDIA-competitive). The OpenAI chip design rights add an interesting dimension — Azure could incorporate inference-optimized design ideas from OpenAI into future Maia generations, creating a silicon-model co-optimization loop unique to Microsoft. SONiC is an underappreciated Layer 0 authority claim. Microsoft designed the network OS that runs hyperscale data centers globally — including competitors' — and open-sourced it. This is a different networking authority model than any other vendor: Dell brands NVIDIA switches, HPE acquired Juniper ($14B), Google built Virgo (proprietary), AWS built SRD (proprietary). Microsoft built SONiC and made it public infrastructure. 
The strategic value is ecosystem shaping, not proprietary control. Azure Local inverts the on-prem model differently than AWS AI Factories: Azure Local is customer-operated on customer-owned hardware with Azure management plane (Delegated). AWS AI Factories are AWS-operated on AWS-owned hardware in customer facilities (Ceded). Azure Local also runs on multi-vendor OEM hardware (Dell, HPE, Lenovo) while AWS AI Factories run on AWS hardware only — creating cross-OEM visibility similar to VMware's hardware-agnostic model. ### Borrowed Judgment Inverted, same as AWS and GCP. The enterprise Cedes Layer 0 entirely. The trade-off: loss of direct hardware authority in exchange for operational leverage and multi-accelerator choice. The OpenAI silicon co-design relationship is a unique form of borrowed judgment: Microsoft can incorporate OpenAI's hardware ideas but inherits OpenAI's optimization priorities (inference-first, GPT-family architectures). Whether that alignment holds as Microsoft scales its own model development (MAI-1, CoreAI) is an open question. ### Working Notes Microsoft's data center capex run rate exceeds $150B annually (2026), with 1 GW of additional capacity added in Q3 FY2026 alone — among the largest infrastructure investments in corporate history. Custom server boards, racks, and cooling designed for Maia and GPU density. The Stargate project (OpenAI/SoftBank/Oracle JV) is related but distinct — Stargate involves Oracle infrastructure and SoftBank capital, a shared infrastructure authority model with DAPM implications of its own. The Maia 200 slip is worth tracking: AWS shipped Trainium3 on schedule; Google shipped TPU 8t/8i on schedule; Microsoft's second-generation AI accelerator is delayed. Microsoft's near-term answer is massive NVIDIA GPU deployment — deepening the same NVIDIA dependency that Dell and HPE face, just at cloud scale. 
Azure Local's multi-vendor hardware support makes it the only hyperscaler on-prem offering that runs across OEM boundaries. This parallels VMware's hardware-agnostic model and creates the same potential for a multi-vendor reasoning plane. The difference: Azure Local is managed by Azure Arc (Microsoft's control plane); VMware is managed by VCF (Broadcom's control plane). Both see across OEM boundaries. ## ● Layer 1A: Data Storage & Governance *Durable, governed data foundation — the Governance Catalog that Layer 2C queries* **Status:** Ceded to Microsoft ### Vendor-Provided Components **Azure Blob Storage + ADLS Gen2** [DAPM: Ceded] Object and data lake storage. Hierarchical namespace for analytics workloads. Hot/Cool/Cold/Archive tiers. Immutable blobs, versioning, lifecycle management. S3-compatible API. The storage substrate for OneLake, Fabric, and AI training data pipelines. **Microsoft Fabric / OneLake** [DAPM: Ceded] Unified SaaS data platform: data engineering, warehousing, real-time analytics, data science, Power BI — all on a single data lake (OneLake). Zero-copy access patterns — data remains in place while being reused across experiences. FabCon/SQLCon 2026 positioned Fabric as 'the operating system for enterprise data' and the central control plane for OneLake data management. EXPOSE TO FABRIC T-SQL extension virtualizes database objects in OneLake without moving data. **Microsoft Purview (Governance)** [DAPM: Ceded] Unified data governance across clouds, data types, and the full data estate. Built into Fabric. Automated sensitivity labeling on OneLake assets. Data Loss Prevention policies detect and restrict sensitive data uploads. Audit logs capture all Fabric user activities including AI interactions. Classification, lineage, compliance (GDPR, HIPAA, PCI DSS). Sensitivity labels, access policies, and compliance controls extend to data shared across tenants via OneLake data sharing. 
Purview extends beyond Azure into Microsoft 365 (SharePoint, Teams, Exchange), on-premises SQL Server, and multi-cloud environments. ### Gap Analysis Azure's Layer 1A is among the most expansive in this assessment series because Microsoft controls the cloud storage infrastructure (Blob, ADLS Gen2), the enterprise data governance plane (Purview), AND the unified data platform (Fabric/OneLake). Microsoft's structural advantage: Purview governance extends beyond Azure into Microsoft 365 (SharePoint, Teams, Exchange), on-premises SQL Server, and multi-cloud environments. The governance catalog that a Layer 2C reasoning plane would query already contains metadata about the enterprise's entire data estate — not just cloud-resident data. The Fabric evolution from analytics platform to 'operating system for enterprise data' (FabCon/SQLCon 2026) is an explicit control plane claim. The unified data catalog spanning all Fabric workloads with automatic governance inheritance is the closest thing in this assessment to a data-layer reasoning plane. The gap: Purview's governance metadata is rich but it's unclear whether it's API-accessible in a way that a Layer 2C placement engine could query programmatically for real-time decisions. Governance as compliance reporting vs. governance as runtime policy input are different functions. ### Borrowed Judgment Ceded with customer-retained policy. Purview policies are customer-defined; enforcement is Microsoft-managed. The enterprise defines what's sensitive and who can access it; Microsoft enforces across the data estate. Fabric introduces a form of borrowed judgment through embedded Copilot: AI assistance for authoring, exploration, and development across Fabric workloads respects tenant, data, and permission boundaries — but the AI assistance logic is Microsoft's. ### Working Notes The Purview-to-Layer-2C connection is the most compelling governance-to-reasoning pathway in the assessment. 
If Microsoft builds a reasoning plane that queries Purview classification metadata, sensitivity labels, and compliance policies to make placement decisions about AI workloads, it would have the richest governance input of any vendor — because Purview already sees the enterprise's data across Azure, Microsoft 365, and on-premises. The OneLake 'single logical data lake across the tenant' vision parallels VAST's DataSpace 'global namespace' — both attempt to make data location transparent. OneLake is a cloud-native abstraction within Microsoft's platform; DataSpace is an infrastructure-level abstraction across physical sites. Different layers of the stack, same architectural ambition. ## ● Layer 1B: Context Management & Retrieval *Low-latency retrieval for RAG — vector/hybrid search, context windows* **Status:** Ceded to Microsoft ### Vendor-Provided Components **Azure AI Search** [DAPM: Ceded] Enterprise retrieval engine combining full-text search (BM25/Lucene), vector search (HNSW), hybrid search with Reciprocal Rank Fusion (RRF), and transformer-based semantic ranker — all in a single managed platform. Integrated vectorization with built-in chunking and embedding. Agentic retrieval (preview): LLM-assisted query planning, multi-source access, structured responses optimized for agent consumption. **Enterprise + Web Knowledge Grounding** [DAPM: Ceded] Foundry Agent Service agents access SharePoint, Microsoft Fabric, Azure AI Search, Azure Blob Storage, and Bing as knowledge sources. The Microsoft 365 corpus (SharePoint, Teams, Exchange) provides enterprise context that is already semantically rich — documents, conversations, and emails created in business workflows carry business meaning natively. Bing adds web-scale grounding. No other vendor has both enterprise productivity data and a web search engine integrated as agent context sources. 
This is the structural contrast to Google's Knowledge Catalog approach: Google uses Gemini to derive business semantics from raw data; Microsoft retrieves from a corpus where business semantics were created by humans in the course of work. ### Gap Analysis Azure AI Search is one of the strongest Layer 1B offerings in this assessment. The hybrid search + semantic ranker + agentic retrieval combination addresses the retrieval quality problem comprehensively. The structural contrast with Google's Knowledge Catalog is the key 1B finding. Google's approach is model-integrated: Gemini enriches raw data on arrival, extracting business semantics and building a context graph that didn't exist before. Microsoft's approach is corpus-integrated: the M365 corpus already contains business context because humans created it in business workflows. A SharePoint document about Q3 revenue already carries the business semantics that Knowledge Catalog would need Gemini to extract from a raw CSV. Microsoft doesn't need a model to derive meaning because the data was born semantic. Both approaches have trade-offs. Google's model-derived semantics are consistent and exhaustive — every data asset gets enriched. Microsoft's human-created semantics are richer but inconsistent — the quality depends on how well the enterprise organizes its M365 content. Google's approach works on any data. Microsoft's advantage depends on the enterprise already having its knowledge in M365. The agentic retrieval mode (preview) blurs the boundary between retrieval (1B) and reasoning (2C) — the retrieval engine uses an LLM to decompose complex queries into sub-queries. Same cross-layer blurring observed in other vendors' products. Bing grounding introduces a unique borrowed judgment: the enterprise inherits Bing's web index, coverage, biases, and ranking decisions as part of agent context. No other vendor has this dependency because no other vendor owns a web search engine. 
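The Reciprocal Rank Fusion (RRF) step listed under Azure AI Search, which merges keyword and vector results into one hybrid ranking, can be illustrated with a short sketch. This is the generic RRF formula, where each document scores the sum of 1 / (k + rank) over the lists it appears in. It is not Microsoft's implementation: the constant k = 60, the function name, and the document ids are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of document ids into one ranking.

    k is a smoothing constant; 60 is a commonly cited default and is an
    assumption here, not a documented Azure AI Search setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document earns more credit the nearer it is to the top of a list.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists for a single query.
keyword_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. full-text (BM25) order
vector_hits = ["doc_c", "doc_a", "doc_d"]   # e.g. vector (HNSW) order

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists (`doc_a`, `doc_c`) rise above documents that appear in only one. The fusion uses rank positions only, so the keyword and vector scores never need to be on a common scale.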
### Borrowed Judgment Ceded with high integration value. Azure AI Search is Microsoft IP. The semantic ranker is a Microsoft model. Agentic retrieval uses Microsoft's LLM for query decomposition. The enterprise Cedes retrieval intelligence to Microsoft but gains the integrated M365 corpus and Bing web index as context. The M365 corpus as borrowed judgment: the enterprise's own data is the context source, but Microsoft controls how it's indexed, chunked, embedded, and served to agents. The enterprise created the content; Microsoft controls the retrieval path. ### Working Notes The 'retrieval as reasoning' evolution (agentic retrieval mode) is worth tracking as a 4+1 model observation. If the retrieval engine uses LLM reasoning to plan queries, where does Layer 1B end and Layer 2C begin? The product boundary (Azure AI Search) doesn't align with the architectural boundary (retrieval vs. reasoning). The Google Knowledge Catalog contrast deserves tracking as both approaches mature. If Google's Gemini-derived semantics prove more reliable than human-created M365 content for agent grounding, the model-integrated approach wins. If enterprise-specific context (internal jargon, organizational knowledge, relationship context) proves more valuable than machine-extracted semantics, the corpus-integrated approach wins. The answer likely varies by use case. ## ● Layer 1C: Data Movement & Pipelines *Move/transform data — ETL/ELT, lineage, cost-aware movement, KV cache tiering* **Status:** Ceded to Microsoft ### Vendor-Provided Components **Azure Data Factory + Fabric Data Pipelines** [DAPM: Ceded] Cloud-based ETL/ELT orchestration available as a standalone Azure service (Data Factory) and within the unified Fabric experience (Data Factory in Fabric). 200+ connectors in Fabric. Visual pipeline builder. Gartner Leader for Data Integration Tools (5 consecutive years). 
Fabric version adds Dataflows Gen2 for self-service data preparation with Power Query (no-code), Medallion architecture (Bronze → Silver → Gold) with Delta Lake, ACID transactions, schema enforcement, time-travel queries, Data Activator for event-driven actions, and Copilot for natural-language pipeline authoring. Scheduled and event-based triggers. **Azure Databricks (Partnership)** [DAPM: Delegated] Apache Spark-based analytics platform. Delta Lake as the transactional storage layer. Data engineering, data science, ML. Azure-optimized but Databricks-owned IP. The most commonly used advanced data pipeline tool on Azure — but it's a partner dependency, not Microsoft IP. AWS has the same Databricks dependency without listing it as a component; the inclusion here reflects how central Databricks is to Azure's enterprise data engineering story. ### Gap Analysis Azure's Layer 1C is comprehensive and mature. Data Factory's 5-year Gartner leadership position reflects enterprise-grade data movement capability. The Fabric integration collapses what were previously separate services (Data Factory, Synapse, Power BI) into a unified pipeline-to-analytics experience. Azure's Layer 1C advantage: Fabric's unified data platform means the pipeline, storage, analytics, and governance are the same system. Data moves through experiences within OneLake rather than between services. This architectural philosophy parallels vertically integrated approaches in other assessed platforms but at a different layer of the stack. The Databricks dependency is worth noting: many enterprise Azure customers use Databricks rather than native Fabric pipelines for complex data engineering. This creates a Delegated component within an otherwise Ceded layer — the enterprise's data pipeline intelligence is Databricks' IP, running on Azure's infrastructure. No KV cache tiering story is evident on Azure. 
By contrast, Dell has validated NVIDIA CMX (19x TTFT improvement), HPE has Alletra X10000 KV cache support, and VAST colocates cache and compute in CNode-X. This gap matters as inference workloads scale and KV cache management becomes a data movement problem.

### Borrowed Judgment

Ceded for native services. Delegated for Databricks. Copilot for Data Factory adds a dimension: natural-language pipeline authoring means the enterprise inherits the Microsoft LLM's understanding of data engineering patterns. Whether Copilot-authored pipelines match the quality of expert-authored ones is an open question with borrowed judgment implications.

### Working Notes

The Fabric positioning as 'operating system for enterprise data' (FabCon/SQLCon 2026) makes Layer 1C the layer where Microsoft's data platform ambitions are most visible. If Fabric succeeds as the unified data control plane, it collapses 1A (storage) + 1B (retrieval) + 1C (movement) into a single authority boundary, which simplifies DAPM classification but concentrates authority.

## ● Layer 2A: Infrastructure Orchestration

*GPU scheduling, quotas, RBAC, fair-share scheduling, utilization optimization*

**Status:** Ceded to Microsoft

### Vendor-Provided Components

**Azure Kubernetes Service (AKS)** [DAPM: Ceded]
Managed Kubernetes with GPU-aware scheduling. Dynamic Resource Allocation (DRA) GA at KubeCon Europe 2026: fine-grained GPU resource allocation that lets multiple AI workloads share GPU resources, reducing idle GPU time from the typical 30-40%. AKS-managed GPU node pools automate the NVIDIA driver, device plugin, and DCGM metrics. MIG, MPS, and time-slicing for GPU sharing. Kueue for fair queuing and priority. Cilium mTLS encryption in public preview: sidecarless pod-to-pod security that eliminates sidecar proxy overhead for AI workloads (built with Isovalent/Cisco). AI Runway (preview): unified inference API positioning Kubernetes as the AI infrastructure operating system, with cross-cloud GPU scheduling previewed.
Microsoft contributed DRA to upstream Kubernetes — these GPU scheduling primitives are available to all Kubernetes users, not just AKS. **Azure Arc (Hybrid Orchestration)** [DAPM: Delegated] Extends Azure management to any infrastructure — on-prem, edge, multi-cloud. Projects external resources into Azure Resource Manager. Unified RBAC, policy, monitoring. AKS enabled by Azure Arc: AI inference on hybrid Kubernetes clusters with centralized governance. Connected and disconnected operation modes. The only hyperscaler management plane designed to orchestrate across non-native infrastructure — AWS manages AWS; Google manages GCP and GDC; Azure Arc manages anything. ### NVIDIA-Provided Components **NVIDIA GPU Operator + DRA** NVIDIA GPU Operator manages GPU drivers and device plugins on AKS. DRA GA provides Kubernetes-native GPU scheduling primitives. Microsoft contributed DRA to upstream Kubernetes. ### Gap Analysis Azure's Layer 2A is mature and comprehensive. AKS is the most widely deployed managed Kubernetes service for AI workloads on Azure, and the KubeCon 2026 DRA GA + AI Runway announcements extend its capabilities specifically for AI scheduling. Azure Arc is the hybrid orchestration differentiator: it extends Azure's management plane to any infrastructure, including competitor hardware. An enterprise running Azure Arc on Dell, HPE, and Lenovo hardware gets a unified orchestration plane across OEM boundaries — managed by Microsoft rather than Broadcom (VMware) or the OEM itself. The AI Runway + cross-cloud GPU scheduling preview is the most aggressive multi-cloud orchestration claim in this assessment. If Azure can schedule workloads across its own GPUs, AWS GPUs, and GCP GPUs based on availability and pricing, it becomes the first cross-hyperscaler Layer 2A orchestration plane. This is aspirational — the preview was just announced — but architecturally significant. 
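To make the 'availability and pricing' criterion concrete, here is a minimal sketch of how a cross-cloud scheduler might choose a GPU pool. It is not AI Runway's API, which this assessment does not describe. `GpuPool`, `pick_pool`, and every capacity and price figure below are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class GpuPool:
    cloud: str                 # e.g. "azure", "aws", "gcp"
    gpu_type: str              # e.g. "H100"
    free_gpus: int             # GPUs currently unallocated in this pool
    price_per_gpu_hour: float  # illustrative number, not real pricing


def pick_pool(pools: List[GpuPool], gpu_type: str, gpus_needed: int) -> Optional[GpuPool]:
    """Return the cheapest pool that can satisfy the request, or None."""
    # Availability filter: right accelerator type and enough free capacity.
    candidates = [
        p for p in pools
        if p.gpu_type == gpu_type and p.free_gpus >= gpus_needed
    ]
    if not candidates:
        return None
    # Pricing criterion: lowest hourly price wins; cloud name breaks ties.
    return min(candidates, key=lambda p: (p.price_per_gpu_hour, p.cloud))


# Hypothetical inventory snapshot across three clouds.
pools = [
    GpuPool("azure", "H100", free_gpus=4, price_per_gpu_hour=3.90),
    GpuPool("aws", "H100", free_gpus=16, price_per_gpu_hour=4.10),
    GpuPool("gcp", "H100", free_gpus=32, price_per_gpu_hour=4.00),
    GpuPool("gcp", "B200", free_gpus=8, price_per_gpu_hour=6.50),
]

choice = pick_pool(pools, gpu_type="H100", gpus_needed=8)
print(choice.cloud if choice else "no capacity")
```

The sketch covers only what Layer 2A does: scheduling against capacity and price. It has no notion of data residency or compliance, which is the policy-driven judgment the 4+1 model assigns to Layer 2C.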
DRA contribution to upstream Kubernetes means these GPU scheduling primitives are available to all Kubernetes users, not just AKS. Microsoft is building the open-source foundation that competitors also benefit from — a strategic choice that prioritizes ecosystem leadership over proprietary advantage. Brendan Burns' authorship (Kubernetes co-creator, Microsoft employee) gives Azure unique credibility in shaping Kubernetes' evolution. ### Borrowed Judgment Ceded for cloud workloads. Delegated for Azure Arc-managed on-prem infrastructure. Microsoft's Kubernetes contributions (DRA, AI Runway) are open-source — the enterprise can run them on any Kubernetes. But operational maturity (managed upgrades, monitoring, scaling) is Azure-specific. The enterprise retains the code but Cedes the operations. ### Working Notes Microsoft's KubeCon 2026 framing of 'Kubernetes as the AI Infrastructure OS' is a Layer 2A statement, not a Layer 2C statement. The distinction matters: Kubernetes schedules and manages resources. A Reasoning Plane governs them with policy-driven intelligence. Same distinction applies to VMware's framing of VCF as 'the permanent abstraction layer.' ## ● Layer 2B: Application Runtime & Execution *Model serving, inference optimization, agent runtime — the Execution Plane* **Status:** Ceded to Microsoft + OpenAI ### Vendor-Provided Components **Microsoft Foundry + Azure OpenAI Service** [DAPM: Ceded] Unified platform-as-a-service for enterprise AI operations. 11,000+ models in catalog including OpenAI (GPT-5.5, o-series — four-month first-mover window per April 2026 restructuring), open-source, Microsoft (MAI-1, Phi), and NVIDIA models. Foundry Agent Service: fully managed agent runtime supporting no-code prompt agents and code-based agents (Agent Framework, LangGraph, custom). Handles hosting, scaling, identity, observability, enterprise security. 
OpenResponses, Activity, Invocations, and A2A protocols for agent distribution through M365 Copilot, Teams, and Entra Agent Registry. Foundry Control Plane centralizes observability for agents across frameworks — including agents NOT running on the platform (register custom LangGraph, A2A, or HTTP-based agents, route through AI Gateway, send OTel traces to Application Insights). Batch evaluations for third-party agents using built-in evaluators for safety, fluency, and task adherence. OpenAI models are now also available on AWS Bedrock — the model is no longer an Azure differentiator; the platform integration is. **Microsoft Agent Framework (Open-Source)** [DAPM: Retained] Production framework for building multi-agent systems. Orchestration patterns: sequential, concurrent, handoff, group chat (Magentic One). OpenAPI integration, A2A protocol, MCP support. Local development → Azure deployment with observability and compliance. Open-source (MIT license). KPMG Clara AI built on Agent Framework. ### NVIDIA-Provided Components **NVIDIA Nemotron + NIM on Foundry** NVIDIA models available through Foundry model catalog. Vera Rubin support via Azure Local + Foundry Local for on-prem inference. **NVIDIA NeMo Data Designer** Integration through Foundry for model training and fine-tuning. Same NVIDIA training dependency seen across multiple assessed vendors. ### Gap Analysis Azure's Layer 2B is one of the broadest — 11,000+ models, managed agent runtime, open-source agent framework, cross-framework observability. The breadth creates complexity: the enterprise must choose between Foundry Agent Service (Ceded, managed), Agent Framework on AKS (Retained, self-hosted), Azure OpenAI direct (Ceded, model-specific), and fully self-hosted options. The April 2026 Custom Agent Monitoring is architecturally significant: Foundry extends governance to agents it doesn't host. This is a Layer 2B/2C crossover — the runtime observability reaches beyond the runtime boundary. 
The OpenAI restructuring creates a unique Layer 2B dynamic: OpenAI models are Azure's flagship capability AND are now available on AWS Bedrock. The model is commodity; the platform around the model is the lock-in. The Agent Framework being open-source (MIT) means the enterprise Retains the code and can run it anywhere. But the operational envelope (Foundry hosting, Entra identity, Purview compliance) is Azure-specific. Code portability vs. operational portability — same distinction identified in the Google Cloud assessment with ADK. ### Borrowed Judgment Multi-layered. OpenAI models: borrowed alignment, training data, and safety decisions — the most significant model-provider borrowed judgment in this assessment because OpenAI is simultaneously a partner, a competitor (ChatGPT Enterprise vs. Microsoft 365 Copilot), and a platform (OpenAI API vs. Azure OpenAI Service). Microsoft's own models (MAI-1, CoreAI): borrowed judgment shifts to Microsoft's model team. Open-source models: community-borrowed judgment. The enterprise using Azure OpenAI Service inherits three judgment systems simultaneously: Microsoft's platform decisions (Foundry defaults, content filtering), OpenAI's model decisions (alignment, capabilities, safety), and NVIDIA's silicon decisions (GPU scheduling, driver behavior). This is the most complex borrowed judgment stack in the assessment. ### Working Notes The April 2026 restructuring eliminated Azure exclusivity for OpenAI models. GPT-5.5 appeared on AWS Bedrock the next day. This validates the 4+1 model's prediction that model-layer lock-in is transient while platform-layer lock-in (2A/2B/2C) is durable. Microsoft's response is right: invest in Foundry, Entra Agent ID, and governance infrastructure that doesn't move with the model. The CoreAI division under Mustafa Suleyman signals Microsoft is building its own model capability to reduce OpenAI dependency. The borrowed judgment composition changes as Microsoft's own models mature. 
Today it's Microsoft platform + OpenAI models. Tomorrow it could be Microsoft platform + Microsoft models — a vertically integrated model similar to Google's. ## ● Layer 2C: Agentic Infrastructure — The Reasoning Plane *Policy-driven placement and resource coordination — the Autonomy Layer* **Status:** Intelligence 2C: Productized | Infra 2C: Emerging ### Vendor-Provided Components **Agent Governance Toolkit (April 2026, Open-Source)** [DAPM: Delegated] Seven-package, multi-language (Python, TypeScript, Rust, Go, .NET) system for governing autonomous AI agents. Agent OS: stateless policy engine, sub-millisecond enforcement (<0.1ms p99). Addresses all 10 OWASP agentic AI risks. 9,500+ tests. Deterministic, not LLM-based — 0.43 seconds total overhead across 11 agents over 11 days in Microsoft's internal deployment. Intent-based authorization: Declare → Approve → Execute → Verify lifecycle. Drift detection with soft-block, hard-block, or log-only responses. Deploy as AKS sidecar, Foundry middleware, or Azure Container Apps. **Microsoft Entra Agent ID (GA April 2026)** [DAPM: Ceded] Agent identity as first-class Entra citizens. Same Conditional Access, lifecycle management, risk detection, and governance as human identities. Agent blueprints: reusable identity templates for consistent governance. Shadow AI detection: discover unsanctioned agents. Sponsor lifecycle management. Four Conditional Access policy templates for agents. ID Protection extends anomaly detection to agent identities. Part of Microsoft Agent 365. Identity is a cross-cutting concern not fully assessed at this layer for other vendors in the series. **Microsoft Agent 365 (Control Plane)** [DAPM: Ceded] Unified agent registry and distributed control plane. Single inventory of all agents — Microsoft and non-Microsoft — operating in the organization. Agent Card Manifests provide rich metadata. Collection-based policies for discovery governance. SDK for third-party agent platforms to register agents. 
The Entra Agent Registry is converging under Agent 365 for simplified management.

**Foundry AI Gateway** [DAPM: Ceded]
Secures and manages MCP tools with policies and observability. Routes agent traffic through a governed gateway. Content Safety in Foundry Control Plane provides guardrails. Cross-prompt injection attack (XPIA) protection.

### NVIDIA-Provided Components

**No NVIDIA Layer 2C Dependency**
All Layer 2C components are Microsoft IP or open-source. NVIDIA does not provide or control governance, policy, agent identity, or reasoning in the Azure stack.

### Gap Analysis

Microsoft, alongside Google, has the most productized Intelligence Layer 2C. Four distinct 2C capabilities are GA or recently shipped:

(1) Agent Governance Toolkit: Open-source, deterministic policy enforcement addressing all 10 OWASP agentic AI risks with sub-millisecond enforcement. Being open-source (MIT) means it's available to every vendor's customers: Microsoft built a governance standard others can adopt.

(2) Entra Agent ID: Microsoft is the only vendor that has extended enterprise identity governance to AI agents as first-class identity citizens. Conditional Access for agents is GA. Shadow AI detection for unsanctioned agents is uniquely valuable for enterprises that don't yet know what agents are running. Identity as a governance dimension is not fully assessed across other vendors in this series; the Azure assessment surfaces it as a structural concern that warrants cross-vendor evaluation.

(3) Agent 365: Unified agent registry covering Microsoft and non-Microsoft agents. The SDK that lets third-party platforms register agents means Microsoft is building the universal agent inventory, even for agents that don't run on Azure.

(4) Agent Governance Toolkit + Agent Framework integration: The Declare → Approve → Execute → Verify lifecycle with drift detection is the most structured agent governance protocol in this assessment.
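The Declare → Approve → Execute → Verify lifecycle described in item (4) can be sketched as a small deterministic state machine. This is an illustration of the pattern only, not the Agent Governance Toolkit's API. The class names, method names, and example policy are hypothetical, and the three drift responses are treated as labels without modelling what each one does operationally.

```python
from dataclasses import dataclass, field
from enum import Enum


class DriftResponse(Enum):
    LOG_ONLY = "log-only"
    SOFT_BLOCK = "soft-block"
    HARD_BLOCK = "hard-block"


@dataclass
class Intent:
    agent_id: str
    declared_actions: frozenset
    approved: bool = False
    executed_actions: list = field(default_factory=list)


class PolicyEngine:
    """Deterministic rule checks only; no model is consulted at any step."""

    def __init__(self, allowed, on_drift):
        self.allowed = allowed    # agent id -> set of permitted actions
        self.on_drift = on_drift  # response to report when drift is found

    def declare(self, agent_id, actions):
        # Declare: the agent states up front what it intends to do.
        return Intent(agent_id, frozenset(actions))

    def approve(self, intent):
        # Approve: every declared action must be permitted by policy.
        permitted = self.allowed.get(intent.agent_id, set())
        intent.approved = intent.declared_actions <= permitted
        return intent.approved

    def execute(self, intent, action):
        # Execute: only approved intents may act. This sketch just records
        # the action; a real enforcement point would sit here.
        if not intent.approved:
            raise PermissionError("intent was not approved")
        intent.executed_actions.append(action)

    def verify(self, intent):
        # Verify: drift is any executed action that was never declared.
        drift = [a for a in intent.executed_actions
                 if a not in intent.declared_actions]
        return (self.on_drift if drift else None), drift


engine = PolicyEngine(
    allowed={"report-bot": {"read_sales", "write_summary"}},
    on_drift=DriftResponse.SOFT_BLOCK,
)
intent = engine.declare("report-bot", ["read_sales"])
approved = engine.approve(intent)
engine.execute(intent, "read_sales")
engine.execute(intent, "write_summary")  # allowed by policy, but undeclared
response, drift = engine.verify(intent)
print(approved, response, drift)
```

In this example the second action is within the agent's policy but outside its declared intent, so approval succeeds and verification still flags drift. Being allowed to do something and having declared it are separate checks, and intent-based authorization depends on keeping them separate.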
Infrastructure Layer 2C (emerging): Same gap as AWS — no service answers 'given data residency, cost, latency, and compliance, should this workload run on Maia, NVIDIA, or AMD in which region?' The policy-driven infrastructure placement engine does not exist as a product. Azure Arc + cross-cloud GPU scheduling (previewed at KubeCon) are building blocks, not a composed reasoning plane. The key differentiator vs. Google's Layer 2C: Azure's Intelligence 2C is model-agnostic — it governs agents regardless of which model powers them. Google's is Gemini-integrated. For enterprises running multi-model strategies, Azure's approach provides governance without model lock-in. ### Borrowed Judgment Intelligence 2C: Low — Agent Governance Toolkit is open-source, Entra Agent ID is Microsoft IP, Agent 365 is Microsoft IP. The enterprise defines governance policies; Microsoft provides the enforcement infrastructure. The Entra Agent ID dependency is worth flagging: extending agent identity into Entra means agent governance is tied to Microsoft's identity plane. An enterprise running agents on AWS that are governed by Entra Agent ID has a cross-cloud identity dependency on Microsoft. This is a deliberate strategic move — Microsoft is positioning Entra as the universal agent identity standard, as Active Directory became the universal enterprise identity standard. Infrastructure 2C: Not yet built. The building blocks (Arc, DRA, cross-cloud scheduling) exist but have not been composed into a policy-driven placement engine. ### Working Notes The Agent Governance Toolkit validates the 4+1 model's Layer 2C thesis directly. The OWASP Agentic Top 10 alignment, intent-based authorization lifecycle, and deterministic enforcement model all map precisely to what the 4+1 model describes as the Reasoning Plane's governance function. 
Microsoft's internal deployment data (11 agents, 7,000+ decisions, 0.43 seconds total governance overhead over 11 days) provides the first production evidence that Layer 2C governance can operate at negligible performance cost. Microsoft's Cloud Adoption Framework for agent governance (April 2026) provides a prescriptive four-layer composition model: data governance/compliance (Purview), agent observability (Agent 365, Defender, Log Analytics), agent security (Defender AI threat protection, Content Safety, AI Red Teaming Agent, RBAC, Sentinel), and agent development (Agent Framework, Foundry SDK, MCP, A2A). This is not a product but a reference architecture showing how the productized 2C components compose into an enterprise governance posture. The strategic play: Microsoft is building Layer 2C as an identity and governance story (Entra Agent ID + AGT). Google is building Layer 2C as a model intelligence story (Gemini integrated into infrastructure). VAST is building Layer 2C as a data platform story (PolicyEngine + Polaris). Three different vectors converging on the same architectural function. The 4+1 model predicted this convergence. ## ● Layer 3 (+1): AI Application Layer — The Value Plane *AI-powered business capabilities — business logic, workflow automation* **Status:** Broadest Enterprise Ecosystem ### Vendor-Provided Components **Microsoft Foundry Model Catalog (11,000+ Models)** [DAPM: Delegated] OpenAI (GPT-5.5, o-series), Meta (Llama), Mistral, Cohere, NVIDIA Nemotron, Microsoft (MAI-1, Phi), plus thousands of open-source and industry-specific models. One of the broadest model marketplaces available. **Copilot Studio** [DAPM: Ceded] Low-code/no-code platform for building custom Copilot agents and extensions. Agents publish through Microsoft 365 Copilot and Teams. Enterprise-accessible without developer expertise. Managed MCP tool governance. **ISV + Partner Ecosystem** [DAPM: Delegated] Azure Marketplace with thousands of AI applications. 
SI partnerships (Accenture, Deloitte, KPMG, Infosys). ISV integrations across every industry. The Microsoft partner network is the largest in enterprise technology.

**GitHub Copilot (Agentic Developer Platform)** [DAPM: Ceded]
GitHub Copilot has evolved from code completion to a full agentic development platform with three surfaces: IDE agent mode (VS Code, Visual Studio 2026 with cloud agent integration), Copilot CLI (GA for all paid subscribers, with Plan mode, Autopilot mode, and dynamic agent delegation), and cloud agent (an autonomous coding agent that researches repos, creates plans, makes code changes, and opens PRs via GitHub Actions without a developer in the loop). Multi-model: Claude, GPT, and OpenAI Codex models selectable per task. GitHub Copilot SDK enables building custom agents using Copilot's orchestration runtime: planning, tool invocation, streaming, and MCP server integration. Foundry Local integration enables fully local, air-gapped agentic coding with data sovereignty. Microsoft Agent Framework supports Copilot SDK as an agent backend. Visual Studio 2026 adds a Debugger Agent that validates fixes against real runtime behavior. The most pervasive developer AI surface by installed base, integrated into the largest code hosting platform (GitHub) and the most widely used enterprise IDE ecosystem (VS Code + Visual Studio).

### NVIDIA-Provided Components

**NVIDIA NIM + Blueprints on Foundry**
NVIDIA models and application patterns available through Foundry. The same blueprints are available across other assessed vendors and are therefore non-differentiating.

### Gap Analysis

Azure's Layer 3 is structurally unique because Microsoft owns both the infrastructure platform AND the largest enterprise application estate in the market.
GitHub Copilot (developer AI), Microsoft 365 Copilot (knowledge worker AI, 3.3% paid adoption as of early 2026), Power Platform AI (business process automation), and Dynamics 365 Copilot (CRM/ERP AI) are Microsoft application products — not Azure services — but they consume Azure AI infrastructure at Layers 0–2C. This creates a dynamic no other vendor has: the enterprise's Layer 3 application decision often drives the Layer 0 infrastructure decision rather than vice versa. On-prem vendors sell infrastructure first; Azure sells applications first. This application estate context does not appear as assessed components because these are Microsoft products, not Azure services. The parallel exists in the Google Cloud assessment where Gemini's consumer properties are acknowledged as context without being assessed as GCP Layer 3 components. The GitHub Copilot vs. Microsoft 365 Copilot adoption contrast has implications beyond Azure. GitHub Copilot succeeds (high adoption, measurable productivity impact in a specific workflow). M365 Copilot struggles (3.3% conversion on the largest enterprise installed base). This suggests Layer 3 AI applications succeed when targeting specific professional workflows rather than augmenting general knowledge work — an observation relevant to every vendor's Layer 3 strategy. Copilot Studio is the Azure-native Layer 3 capability: low-code/no-code agent building with distribution through M365 and Teams. This is the bridge between the Microsoft application estate and the Azure AI platform — agents built in Copilot Studio consume Foundry models, are governed by Entra Agent ID, and are distributed through the M365 surface. GitHub Copilot's evolution from code completion to autonomous cloud agent represents the most significant Layer 3 shift in the Azure/Microsoft ecosystem. 
The cloud agent (coding autonomously in GitHub Actions, opening PRs without developer presence) creates a new category of AI-generated code flowing through enterprise repositories. The DAPM question: when Copilot's cloud agent writes production code autonomously, who owns the engineering judgment? The developer who assigned the issue, the model that generated the code, or GitHub's agent orchestration logic?

The GitHub Copilot SDK is strategically important: it exposes Copilot's production agent runtime as a programmable API, enabling enterprises to build custom agents on top of GitHub's orchestration engine. Combined with Foundry Local for air-gapped on-device inference, this creates a Retained-authority path for enterprises that need agentic development without cloud dependency. Microsoft is the only assessed cloud vendor offering a fully local agentic developer tool.

Three-cloud comparison of agentic developer surfaces:

• AWS Kiro: Spec-driven, methodology-opinionated. Enforces structured requirements → design → implementation. Replaces Q Developer. Deep AWS integration (pricing, Well-Architected, Bedrock). Most prescriptive.
• Google Antigravity 2.0: Agent orchestration platform. Multi-agent parallel execution, scheduled background tasks. Desktop + CLI + SDK. Replaces Gemini CLI. Gemini-native. Most ambitious multi-agent vision.
• GitHub Copilot: IDE-embedded + CLI + cloud agent. Multi-model (Claude, GPT, Codex). GitHub-native (repos, issues, PRs, Actions). Copilot SDK for custom agents. Foundry Local for air-gapped use. Most pervasive installed base.

All three are Ceded or Delegated: the enterprise adopts the vendor's opinions about how AI-assisted development should work. The competitive differentiation is in the development philosophy: AWS imposes discipline (specs first), Google enables parallelism (multiple agents), Microsoft/GitHub enables delegation (assign and forget).

### Borrowed Judgment

The most complex borrowed judgment landscape in this assessment.
The enterprise using Azure AI inherits judgment from: Microsoft's platform decisions (Foundry defaults, content filtering), OpenAI's model decisions (alignment, capabilities, safety), NVIDIA's silicon decisions (GPU scheduling, driver behavior), ISV application decisions, and Microsoft's enterprise application decisions (Copilot integration points, Power Platform automation patterns). The unique risk: Microsoft's Layer 3 applications are also the enterprise's productivity tools. If Copilot in Word makes a poor suggestion, it affects the document. If Copilot in Dynamics 365 makes a poor recommendation, it affects the sales pipeline. The blast radius of borrowed judgment at Layer 3 is larger for Microsoft than for other vendors because the applications are mission-critical business tools, not standalone AI applications. ### Working Notes The Microsoft 365 Copilot adoption data is the enterprise AI reality check this assessment series benefits from. At 3.3% conversion on the largest installed base in enterprise software, the question is whether the industry's Layer 3 ambitions are ahead of enterprise readiness — and whether that readiness gap affects infrastructure investment decisions at Layers 0–2. GitHub Copilot's success vs. Microsoft 365 Copilot's adoption challenge has implications for every vendor's Layer 3 strategy: AI applications may succeed faster when they target specific professional workflows (coding, design, data engineering) than when they target general knowledge work. The GitHub Copilot SDK's availability as a programmable agent runtime backend via Microsoft Agent Framework creates a developer platform play that spans Layers 2B and 3. Enterprises building custom agents on the Copilot SDK inherit GitHub's orchestration opinions — planning, tool invocation, context management — as borrowed judgment. The SDK is the distribution mechanism for Microsoft's agent architecture opinions into enterprise codebases. 
Foundry Local + Copilot SDK for air-gapped agentic development is a unique capability in this assessment. No other cloud vendor provides a fully local, data-sovereign agentic developer tool. AWS Kiro requires Bedrock connectivity. Google Antigravity requires Gemini API. GitHub Copilot with Foundry Local runs entirely on-device. This matters for defense, financial services, and government enterprises where source code cannot leave the local environment.

The agentic developer tools market is consolidating around three philosophies: structured discipline (Kiro), parallel orchestration (Antigravity), and delegation-first autonomy (Copilot). Red Hat's OpenShift Dev Spaces supporting Kiro, Copilot, Claude CLI, and others from a single governed runtime suggests that enterprises will run multiple agentic developer tools simultaneously, governed at the platform layer (2A), rather than choose a single tool at Layer 3.

════════════════════════════════════════════════════════════════════════════════

# Cisco Secure AI Factory with NVIDIA Mapped to the 4+1 Layer AI Infrastructure Model

**Version:** v1.0 - Draft, Editorial Review Pending
**Date:** May 23, 2026
**Source:** Cisco Live EMEA 2026, GTC 2026, RSA Conference 2026, Cisco Q3 FY2026 earnings, Galileo acquisition announcement (Apr 2026), AGNTCY/Linux Foundation donation (Jul 2025), DefenseClaw open source, Cisco/VAST partnership, analyst coverage, published 4+1 model

## Summary Finding

Cisco is the only vendor in this assessment series whose AI infrastructure authority is anchored by networking and security, but the compute story has matured beyond that framing.
Where Dell builds up from servers and storage, HPE from sovereign compute and owned networking, VAST from the data platform, and hyperscalers from managed services, Cisco builds outward from the network fabric and security posture, with a purpose-built GPU compute portfolio (C885A, C845A, X-Series X580p) that now competes directly with Dell and HPE at every AI workload tier. The Secure AI Factory with NVIDIA is a reference architecture, not a vertically integrated platform — and that distinction defines Cisco's entire 4+1 profile. The customer asking Cisco for an AI factory receives a complete solution: Cisco-owned networking and compute, Cisco-owned security and observability, and partner-delivered storage and data services through validated integrations. Layer 0 is Cisco's strongest position with depth across both networking and compute. Cisco owns the network fabric (Silicon One G300, Nexus 9000/8000, Nexus Hyperfabric) and has built a three-tier GPU compute portfolio (C885A dense HGX for training, C845A modular MGX for inference, X-Series X580p for composable blade AI) that competes directly with Dell PowerEdge and HPE ProLiant. The X-Series disaggregated GPU architecture — independently managing CPU and GPU lifecycles via X-Fabric — is an architectural differentiator no other blade vendor matches. Cisco does not own storage; the AI POD storage story depends on partners (VAST Data primary, plus NetApp, Pure Storage, Hitachi Vantara, Nutanix), which is Delegated authority. This makes Cisco's Layer 0 the inverse of Dell's: Dell owns compute and storage but brands NVIDIA networking silicon. Cisco owns networking silicon and has purpose-built AI compute but depends on partners for the data foundation. The security and observability layers are where Cisco makes its most differentiated claim. 
AI Defense, Duo Agentic Identity, DefenseClaw, Splunk Observability Cloud with AI Agent Monitoring, and the Galileo acquisition together represent the most comprehensive agent security and observability portfolio of any infrastructure vendor assessed. These capabilities span Layers 2A through 2C — and they constitute genuine Cisco-owned IP, not rebranded NVIDIA or partner technology. The structural question is whether security and observability are sufficient to constitute a control plane, or whether they remain constraint enforcement without placement reasoning. The AGNTCY initiative — open-sourced by Cisco's Outshift incubator, donated to the Linux Foundation with Dell, Google Cloud, Oracle, and Red Hat as formative members — contributes infrastructure primitives (discovery, identity, messaging, observability) to the broader agentic standards ecosystem. AGNTCY sits within the Agentic AI Foundation (AAIF), the fastest-growing project in Linux Foundation history, where MCP (Anthropic), A2A (Google), AGENTS.md (OpenAI) define the foundational protocols. Cisco is a Gold member of AAIF, not a Platinum founder — the ecosystem-defining standards were originated by Anthropic, Google, and OpenAI, not Cisco. AGNTCY contributes essential plumbing beneath those protocols, and Cisco's commercial products (AI Defense, Duo, Splunk) could become the enterprise implementation layer if these open standards achieve adoption. But Cisco's position is contributor, not definer. Cisco's ~$9B in projected FY2026 AI infrastructure orders, record $15.8B quarterly revenue, and hyperscaler design wins (Silicon One P200, G200) validate market traction. The Galileo acquisition and DefenseClaw open-source release signal an intent to own the AI agent trust layer. 
The 4+1 assessment reveals a vendor with genuine strength at Layer 0 (networking + compute), comprehensive security and observability across Layers 2A through 2C (all Cisco-owned IP), and partner-delivered data plane capabilities at Layers 1A through 1C that the customer receives as part of the Secure AI Factory solution. The structural gap is specific: Cisco has no proprietary data software (storage, retrieval, pipelines) and no policy-driven placement engine at Layer 2C. Cisco secures the AI Factory and contributes to the open standards that may define its governance. It does not yet govern the factory itself. ## ● Layer 0: Compute & Network Fabric *Raw compute, networking, and acceleration fabric* **Status:** Networking Strength — Asymmetric ### Vendor-Provided Components **Silicon One G300 (AI Networking Silicon)** [DAPM: Retained] 102.4 Tbps switching silicon for massive AI cluster buildouts. Intelligent Collective Networking delivers 33% increase in network utilization and 28% improvement in job completion time. Powers gigawatt-scale AI clusters for training, inference, and real-time agentic workloads. Cisco-designed silicon — not NVIDIA-branded, not repackaged. This is Cisco's most significant Layer 0 differentiator: owned networking silicon IP at the same architectural level as AWS Nitro/EFA/SRD, Google Virgo, and Microsoft SONiC. **Nexus 9000 / 8000 Systems (G300-powered)** [DAPM: Retained] 102.4 Tbps switching speeds in liquid-cooled and air-cooled designs. Nearly 70% energy efficiency improvement with liquid cooling and advanced optics. Common hardware for diverse fabric types including Nexus Hyperfabric. Designed for hyperscalers, neoclouds, sovereign clouds, service providers, and enterprises. First P200 design wins confirmed with hyperscalers; third P200 win in early Q4 2026. **Nexus Hyperfabric (Cloud-Managed AI Fabric)** [DAPM: Retained] Cloud-managed network fabric integrating networking, GPU, and storage into a unified infrastructure. 
Fabric-as-a-Service model with automated deployment and operations. NVIDIA Cloud Partner (NCP) design validation. Sharon AI selected Hyperfabric for Australia's first Cisco Secure AI Factory deployment (1,024 NVIDIA Blackwell Ultra GPUs).

**Nexus One (Unified Management Plane)** [DAPM: Retained]

Unified management across on-premises and cloud-based data center deployments. AI Job Observability provides job-aware, network-to-GPU visibility correlating network telemetry with AI workload behavior. Native Splunk Platform integration for in-place telemetry analysis without data movement, which is essential for sovereign cloud and compliance-sensitive environments. API-driven automation and customization built in.

**Cisco UCS C885A M8 (Dense GPU — HGX Platform)** [DAPM: Retained]

Cisco's first 8-way accelerated computing system. Built on the NVIDIA HGX platform with 8x NVIDIA H100 or H200 Tensor Core GPUs, or alternatively 8x AMD MI300X/MI350X OAM GPUs, giving multi-vendor GPU support in the same dense server. One ConnectX-7 NIC or BlueField-3 SuperNIC per GPU for cluster-scale training. BlueField-3 DPUs for accelerated GPU-to-data access and zero-trust security. Designed for large LLM training, fine-tuning, and large model inference. MLPerf Training benchmarks published January 2026. This is Cisco's direct competitor to Dell PowerEdge XE9680 and HPE ProLiant DL380a: a genuine leadership-class AI server, not a rebadged reference design.

**Cisco UCS C845A M8 (Modular GPU — MGX Platform)** [DAPM: Retained]

Flexible, scalable AI server based on the NVIDIA MGX modular reference design. Configurable from 2 to 8 NVIDIA or AMD PCIe GPUs, including RTX PRO 6000 and RTX PRO 4500 Blackwell Server Edition, in a 4RU chassis. Enhanced power delivery, fewer PCBs, and improved cable routing for better airflow and thermal management. GPU hot-swap for faster replacement. E1.S SSDs for increased storage density. First VAST Certified CNode-X platform in market (UCS C845A M8 with RTX PRO 6000).
'Start small and scale up' positioning for enterprises ramping AI from inference to fine-tuning. Day-1 Intersight management for unified AI and traditional workload operations. **Cisco UCS X-Series + X580p GPU Node (Modular Blade AI)** [DAPM: Retained] Disaggregated, composable AI compute within the UCS X9508 blade chassis. X580p PCIe Node adds up to 4 NVIDIA GPUs (RTX PRO 6000/4500 Blackwell, H200 NVL, L40S) to X210c/X215c compute nodes via X-Fabric Technology. X9516 X-Fabric Module provides PCIe Gen 5 switching with ultra-low-latency, NVLink Bridge support, and dynamic GPU-to-host provisioning. Independent CPU and GPU lifecycle management — upgrade GPUs without replacing compute nodes. MLPerf Inference benchmarks published April 2026. Up to 24 GPUs per 7RU chassis. This is architecturally distinctive: no other vendor offers a modular blade system with disaggregated GPU composition for AI workloads. Dell PowerEdge MX and HPE Synergy do not have equivalent GPU composability. **Cisco UCS C240 M8 / C225 M8 (Mainstream + Storage)** [DAPM: Retained] C240 M8: 2U rack server with Intel Xeon 6 processors, up to 2 double-width NVIDIA GPUs (RTX PRO 6000 Blackwell, H200 NVL). Balanced compute for distributed AI, analytics, edge AI, and vision workloads. Up to 28 NVMe/SAS/SATA drives for high-speed data access. C225 M8: designated as VAST Data Platform EBOX (storage servers) in AI POD architecture — the persistent storage foundation for Cisco AI PODs. Both managed through Cisco Intersight. **Cisco AI PODs (Pre-Validated Full-Stack Infrastructure)** [DAPM: Retained] Modular, full-stack AI infrastructure platform combining UCS C885A/C845A compute, Nexus 9000 networking (up to 800G), NVIDIA GPUs, and partner storage (VAST Data, NetApp, Pure Storage, Hitachi Vantara, Nutanix). Scalable from 32 to 128+ GPUs. Cisco Validated Designs (CVDs) and NVIDIA Enterprise Reference Architectures (ERAs) for training, fine-tuning, inference, and RAG workloads. 
Pre-validated designs reduce setup time by up to 50%. Integrated management through Cisco Intersight and Nexus Dashboard. Modular scale-unit design enables growth without full infrastructure overhaul. **Cisco Unified Edge (Edge AI Compute)** [DAPM: Retained] Edge compute platform supporting NVIDIA RTX PRO 4500 and 6000 Blackwell Server Edition GPUs for mission-critical AI at the edge. Also supports NVIDIA L4 GPUs. Zero-touch deployment with pre-validated blueprints. Centralized management via Intersight with Splunk and ThousandEyes integrations for end-to-end edge observability. Multi-layered zero-trust security with tamper-proof features, deep telemetry, drift-free configurations. Cisco AI Grid reference design extends edge AI to service providers via Cisco Mobility Services Platform — a unique go-to-market that no other assessed vendor offers. ### NVIDIA-Provided Components **NVIDIA GPU Silicon (HGX, MGX, RTX PRO Blackwell)** All GPU acceleration in Cisco UCS servers depends on NVIDIA silicon — same structural dependency as Dell and HPE. But Cisco adds genuine compute engineering above the GPU: the X-Series X580p disaggregated GPU composition, X-Fabric PCIe Gen 5 switching, MGX reference design improvements (enhanced power delivery, PCB reduction, cable routing for thermal management), and multi-vendor GPU support (AMD MI300X/MI350X on C885A alongside NVIDIA HGX). Cisco's compute differentiation is modularity and composability, not thermal engineering (Dell) or sovereign heritage (HPE). **NVIDIA Spectrum-X Switch Silicon** Cisco offers BOTH Silicon One-powered switches (Cisco-designed silicon) AND Spectrum-X-powered switches (NVIDIA silicon). This dual-silicon networking strategy is unique in the assessment series — Dell only brands NVIDIA Spectrum, HPE only uses Juniper/Aruba/Slingshot, VAST depends on OEM networking. Cisco is the only vendor that competes with NVIDIA in switching silicon while also offering NVIDIA's switching silicon. 
**NVIDIA BlueField DPUs** Cisco extends Hybrid Mesh Firewall policy enforcement to BlueField DPUs embedded in GPU servers, enabling threat mitigation at the server level before reaching sensitive data. This is Cisco adding security value on top of NVIDIA hardware — a pattern consistent across the assessment. **NVIDIA AI Enterprise + NIM** Cloud-native software tools, libraries, frameworks, dynamic GPU resource allocation, AI workload scheduling, and production-ready models. Same NVIDIA software dependency as Dell and HPE at this layer. ### Gap Analysis Layer 0 is Cisco's strongest position with genuine depth across both networking AND compute — not just networking. Networking authority is unmatched among on-prem vendors. Silicon One is proprietary switching silicon comparable in strategic significance to AWS's Nitro/EFA, Google's Virgo, or Microsoft's SONiC. No other on-prem infrastructure vendor designs their own switching silicon for AI networking. Dell brands NVIDIA Spectrum. HPE acquired Juniper for networking IP but doesn't design switching ASICs. VAST depends entirely on OEM networking. The compute portfolio is broader and more architecturally deliberate than initial assessment suggested. The three-tier GPU server strategy (C885A for dense 8-GPU HGX training, C845A for flexible 2-8 GPU MGX inference and fine-tuning, X-Series X580p for composable blade AI) covers the full spectrum of enterprise AI workloads from leadership-class training to distributed inference. The C885A competes directly with Dell PowerEdge XE9680 and HPE ProLiant DL380a as a genuine 8-GPU dense server with MLPerf benchmarks published. The X-Series X580p is an architectural differentiator that deserves specific recognition: disaggregated GPU composition via X-Fabric allows enterprises to independently manage CPU and GPU lifecycles, dynamically provision GPU resources to compute nodes, and scale GPU density within a blade chassis (up to 24 GPUs per 7RU). 
No other vendor offers modular blade-based GPU composability at this level. Dell PowerEdge MX and HPE Synergy do not have equivalent GPU disaggregation. This is Cisco applying its composable infrastructure heritage (UCS X-Series has always been about disaggregation) to the AI compute problem. Multi-vendor GPU support (AMD MI300X/MI350X alongside NVIDIA HGX on C885A) gives Cisco the same silicon optionality that HPE offers with GX5000 (NVIDIA Rubin + AMD MI430X). Dell's AI Factory is NVIDIA-only (AMD under separate branding). The storage gap remains the most significant Layer 0 structural difference from Dell and HPE. Cisco does not manufacture or sell storage. The AI POD storage story depends on partners: VAST Data (primary), plus NetApp, Pure Storage, Hitachi Vantara, and Nutanix as validated options. This multi-vendor storage flexibility is arguably a strength for the reference-architecture model — the enterprise retains storage vendor choice — but it means Cisco cannot build a vertically integrated data plane. The dual-silicon networking strategy (Silicon One + Spectrum-X) is strategically unique. Cisco gives customers a choice: Cisco-designed silicon or NVIDIA silicon, both managed through Nexus One. This preserves the NVIDIA partnership while maintaining Cisco's networking authority. If NVIDIA changes its Spectrum roadmap, Cisco's customers have an alternative that Dell's customers do not. The ~$9B in FY2026 AI infrastructure orders and hyperscaler design wins (P200, G200) validate that the networking + compute + reference architecture approach has market traction at the highest scale. ### Borrowed Judgment Moderate but structurally different from Dell's or HPE's. Cisco borrows GPU silicon judgment from NVIDIA (same as everyone) but retains genuine compute platform engineering judgment: X-Series composable architecture, X-Fabric GPU disaggregation, MGX reference design improvements, multi-vendor GPU support, and the three-tier server strategy. 
Cisco also retains networking judgment entirely: Silicon One, Nexus 9000/8000, Hyperfabric, and Nexus One are all Cisco IP. Storage judgment is borrowed from VAST Data and other partners.

The comparison:

• Dell retains compute packaging (thermal, mechanical, rack-scale) and storage judgment, borrows networking silicon from NVIDIA (Spectrum).
• HPE retains compute judgment (Cray heritage, ProLiant) and networking judgment (Juniper/Aruba/Slingshot), borrows runtime from NVIDIA.
• Cisco retains networking judgment AND compute platform engineering (UCS, X-Series composability), borrows GPU silicon (NVIDIA) and storage (partners).

Each on-prem vendor retains authority in its heritage domain. Cisco's heritage is networking, but UCS, now in its M8 generation with purpose-built AI servers, has matured from 'networking company does compute' to a genuine multi-tier AI compute platform. The X-Series disaggregated GPU architecture is Cisco's compute contribution that has no direct equivalent from Dell or HPE.

### Working Notes

The UCS X-Series composable architecture is Cisco's most underappreciated Layer 0 capability. The X580p + X-Fabric model (dynamically provisioning GPU resources to compute nodes, independently managing CPU and GPU lifecycles, NVLink Bridge support within a blade chassis) applies the same disaggregation principle that defined UCS from inception. If GPU lifecycle velocity continues to outpace CPU lifecycle velocity (which it will), the ability to upgrade GPUs without replacing the compute node is a genuine operational and CapEx advantage. No other blade vendor offers this.

The C885A M8 with AMD MI300X/MI350X support is worth tracking. If AMD Instinct gains enterprise traction, Cisco is one of two on-prem vendors (alongside HPE) positioned to offer both NVIDIA and AMD in the same dense-GPU server platform. Dell's AMD support is under separate 'Dell AI Platform with AMD' branding, which is a different SKU family, not GPU optionality within the same chassis.
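
The retained-versus-borrowed comparison in the Layer 0 Borrowed Judgment notes can be made queryable with a small data structure. The following is a minimal sketch, not part of the 4+1 model itself: the `JudgmentProfile` class, the helper names, and the short domain labels are hypothetical simplifications, and the entries encode only what the three comparison bullets state.

```python
from dataclasses import dataclass


@dataclass
class JudgmentProfile:
    vendor: str
    retained: set   # domains where the vendor keeps its own engineering judgment
    borrowed: dict  # domain -> party the judgment is borrowed from


# Entries transcribed from the Dell / HPE / Cisco comparison bullets.
# Domain labels are shortened for illustration (an assumption of this sketch).
PROFILES = [
    JudgmentProfile("Dell", {"compute packaging", "storage"},
                    {"networking silicon": "NVIDIA"}),
    JudgmentProfile("HPE", {"compute", "networking"},
                    {"runtime": "NVIDIA"}),
    JudgmentProfile("Cisco", {"networking", "compute platform engineering"},
                    {"GPU silicon": "NVIDIA", "storage": "partners"}),
]


def vendors_retaining(domain):
    """Return the vendors that retain judgment in the given domain."""
    return [p.vendor for p in PROFILES if domain in p.retained]


def borrowed_from(source):
    """Map each vendor to the domains it borrows from the given source."""
    result = {}
    for p in PROFILES:
        domains = sorted(d for d, s in p.borrowed.items() if s == source)
        if domains:
            result[p.vendor] = domains
    return result
```

Calling `vendors_retaining("networking")` returns HPE and Cisco, which restates the point that Dell is the on-prem vendor in this comparison that borrows its networking silicon.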
The Cisco AI Grid reference design for service providers (Cisco Mobility Services Platform + NVIDIA RTX PRO Blackwell GPUs) is a unique go-to-market that no other assessed vendor offers. Dell, HPE, and VAST target enterprises and neoclouds. Cisco targets enterprises, neoclouds, AND the service provider edge — leveraging telco relationships that predate the AI era. If edge inference becomes a significant workload category, Cisco's service provider footprint is a distribution advantage that pure-compute vendors cannot match. The energy efficiency story is worth noting: nearly 70% improvement with liquid cooling and advanced optics. At hyperscale, energy efficiency is a buying criterion that often outranks raw performance — and it plays to Cisco's infrastructure engineering strengths. The multi-vendor storage partner model (VAST, NetApp, Pure Storage, Hitachi Vantara, Nutanix) within AI PODs is structurally different from Dell's single-vendor storage story (PowerScale/ObjectScale/Exascale). The reference architecture model gives enterprises storage vendor choice — but it also means Cisco cannot optimize the compute-to-storage integration path the way Dell can with Exascale or VAST can with DASE. ## ◑ Layer 1A: Data Storage & Governance *Durable, governed data foundation — the Governance Catalog that Layer 2C queries* **Status:** Delegated — Partner-Delivered ### Vendor-Provided Components **VAST Data Platform on Cisco AI PODs (EBOX)** [DAPM: Delegated] VAST Data Platform running on Cisco UCS C225 M8 servers (designated EBOX). VAST DASE architecture provides shared-everything model with NVMe-over-Fabrics, global namespace, and ACID guarantees. Managed through Cisco Intersight alongside compute and networking. VAST was recognized at VAST Forward 2026 for its Cisco partnership. The customer buying a Cisco Secure AI Factory gets VAST storage as a validated, integrated component — Cisco delivers the capability even though VAST provides the IP. 
**Multi-Vendor Storage Options (AI POD Validated)** [DAPM: Delegated] Cisco AI PODs validate multiple storage partners: VAST Data (primary/deepest integration), NetApp, Pure Storage, Hitachi Vantara, and Nutanix. The enterprise retains storage vendor choice — a structural advantage of the reference architecture model over Dell's single-vendor storage story (PowerScale/ObjectScale) or VAST's vertically integrated approach. Each storage partner brings its own governance metadata model, meaning Layer 1A governance capabilities vary by storage choice. **Cisco Hypershield + Isovalent (Data Security)** [DAPM: Retained] Zero-trust, ransomware-resilient storage security and inline network security. Cisco Secure AI Factory security principles applied to the data layer. Isovalent provides eBPF-based runtime security for containerized AI workloads. Hybrid Mesh Firewall extends policy enforcement to BlueField DPUs at the storage server level. Security is Cisco-owned and layered on top of whichever storage partner the enterprise selects — consistent security posture regardless of storage choice. ### NVIDIA-Provided Components **NVIDIA AI Data Platform Reference Design** Cisco + NVIDIA + VAST validated as one of the first NVIDIA AI Data Platform reference designs. NVIDIA provides the reference architecture standard; Cisco provides networking and compute; VAST provides the data platform. ### Gap Analysis Applying the litmus test — what does the enterprise get when it asks Cisco for a 4+1 AI infrastructure? — the answer is clear: the Cisco Secure AI Factory delivers a complete data foundation. VAST Data Platform provides AI-optimized storage with DASE architecture, global namespace, and ACID guarantees. NetApp, Pure Storage, Hitachi Vantara, and Nutanix are validated alternatives. The customer gets Layer 1A capability through Cisco's go-to-market. The DAPM classification (Delegated) captures the structural reality: Cisco delivers the solution, the storage partner provides the IP. 
This is the same pattern as Dell's Trust3 AI partnership at Layer 1A — Dell delivers governance but Trust3 AI provides the capability. The customer buying Dell doesn't experience Trust3 AI as a gap. The customer buying Cisco shouldn't experience VAST storage as a gap either. Where Cisco's Layer 1A is genuinely thinner than Dell's or HPE's is governance metadata. Dell's MetadataIQ indexes billions of files across PowerScale/ObjectScale with automated classification, tagging, and metadata enrichment — Dell-owned IP. HPE's Data Fabric provides policy-based data placement with lineage tracking — HPE-owned IP. Cisco has no equivalent Cisco-owned metadata or governance capability. The governance metadata available depends entirely on which storage partner the enterprise selects. With VAST, the enterprise gets VAST Catalog. With NetApp, different governance primitives. Cisco adds consistent security across all of them (Hypershield, Isovalent) but does not add a Cisco-owned governance layer above the storage partner. The 4+1 model defines Layer 1A as the 'Governance Catalog that Layer 2C queries.' The catalog exists in the Cisco Secure AI Factory — it's provided by the storage partner. Cisco does not own that catalog, which means Cisco's future Layer 2C ambitions depend on a partner's metadata model. Dell and HPE can build from their own metadata to their own control plane. Cisco would need to build from VAST's metadata to Cisco's control plane — a cross-vendor integration that neither party has announced. The multi-vendor storage model is both a strength and a structural constraint. Strength: the enterprise retains storage vendor choice, avoiding single-vendor lock-in at the data layer. Constraint: Cisco cannot optimize the compute-to-storage-to-governance integration path the way Dell can with Exascale + MetadataIQ or VAST can with DASE + Catalog. 
Each storage partner brings its own data architecture, its own metadata model, and its own governance surface — Cisco must integrate across all of them rather than optimizing for one. ### Borrowed Judgment High for storage platform and governance metadata, low for data security. The enterprise buying a Cisco Secure AI Factory inherits the storage partner's data architecture decisions — VAST's DASE model, NetApp's ONTAP model, or Pure's Purity model depending on selection. Cisco contributes consistent security posture across all storage choices (Hypershield, Isovalent, Hybrid Mesh Firewall) but does not contribute governance metadata, data classification, or compliance tagging above the storage partner. Comparison: Dell borrows Layer 1A governance judgment from Trust3 AI (a specific partner function) but retains storage platform judgment (PowerScale, ObjectScale, MetadataIQ are Dell IP). HPE retains both storage platform (Alletra) and governance (Data Fabric) judgment. Cisco borrows the storage platform from partners but retains the security layer — structurally higher borrowed judgment than Dell or HPE at Layer 1A, but the customer still receives a complete solution. ### Working Notes The strategic question is whether Cisco intends to remain storage-agnostic (a networking company that partners with storage vendors) or will eventually acquire or build storage capability. The VAST partnership depth suggests the former — but the competitive pressure from Dell's Exascale and HPE's Alletra may eventually force the question. Cisco's absence from the storage market is also why VAST ships CNode-X through Cisco as an OEM partner. The relationship is complementary, not competitive: Cisco needs VAST for data, VAST needs Cisco for networking and enterprise distribution. Dell is notably absent from VAST's OEM partner list precisely because Dell and VAST compete at Layer 1A. 
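
The Layer 1A finding, that governance metadata varies with the storage partner while Cisco's security layer stays constant, can be illustrated with a short lookup. This is a hedged sketch: `layer_1a_profile` and the dictionary keys are hypothetical names, and only the VAST catalog is named in this assessment, so the other partners' catalogs are left as `None` rather than guessed.

```python
# Security controls the assessment describes as Cisco-owned and applied
# regardless of which storage partner is selected.
CISCO_SECURITY = ("Hypershield", "Isovalent", "Hybrid Mesh Firewall")

# Storage partners the assessment lists as validated for Cisco AI PODs.
VALIDATED_PARTNERS = ("VAST Data", "NetApp", "Pure Storage",
                      "Hitachi Vantara", "Nutanix")

# Only VAST's catalog is named in the assessment; others are not filled in.
GOVERNANCE_CATALOG = {"VAST Data": "VAST Catalog"}


def layer_1a_profile(partner):
    """Describe who owns what at Layer 1A for a chosen storage partner."""
    if partner not in VALIDATED_PARTNERS:
        raise ValueError(f"{partner!r} is not a validated AI POD storage partner")
    return {
        "storage_partner": partner,
        "storage_dapm": "Delegated",
        # None means the assessment does not name this partner's catalog.
        "governance_catalog": GOVERNANCE_CATALOG.get(partner),
        "catalog_owner": partner,  # the catalog belongs to the partner, not Cisco
        "security": CISCO_SECURITY,
        "security_dapm": "Retained",
    }
```

Comparing `layer_1a_profile("VAST Data")` with `layer_1a_profile("NetApp")` shows the same security tuple in both results but a different catalog owner, which is the cross-vendor dependency any future Cisco Layer 2C capability would have to bridge.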
## ◑ Layer 1B: Context Management & Retrieval *Low-latency retrieval for RAG — vector/hybrid search, context windows* **Status:** Delegated — Partner-Delivered ### Vendor-Provided Components **VAST InsightEngine on Cisco AI PODs** [DAPM: Delegated] VAST InsightEngine on Cisco UCS C845A M8 servers (first VAST Certified CNode-X platform). Real-time vector embedding and retrieval for RAG and agentic workflows. Integrates with NVIDIA NIM microservices for AI-native retrieval. Automates embedding, indexing, and retrieval pipelines. Marketed as reducing RAG pipeline latency from minutes to seconds. This is a validated component of the Cisco Secure AI Factory — the customer asking Cisco for RAG capability receives InsightEngine as part of the delivered solution. **Cisco AI Networking for Retrieval Performance** [DAPM: Retained] Silicon One G300 Intelligent Collective Networking directly impacts retrieval latency and throughput. The 33% network utilization improvement and 28% job completion time improvement are not just training metrics — they affect how quickly GPU compute accesses storage-side vector indexes. Cisco's networking fabric is the enabling layer that makes VAST's retrieval performance achievable at scale. Lossless, low-latency Nexus 9000 fabric with up to 800G bandwidth between compute and storage tiers. ### NVIDIA-Provided Components **NVIDIA AI Data Platform + NeMo Retriever** NVIDIA AI Data Platform reference design validates the Cisco + VAST retrieval architecture. NeMo Retriever provides embedding models and reranking. Same NVIDIA retrieval stack available to Dell and HPE — the differentiation is in the validated integration, not the NVIDIA components. ### Gap Analysis Applying the customer litmus test: the enterprise asking Cisco for RAG and retrieval capability receives VAST InsightEngine as a validated component of the Secure AI Factory. The retrieval function is delivered. 
The DAPM classification (Delegated) captures that the retrieval IP belongs to VAST. Cisco's Retained contribution at Layer 1B is the networking fabric that makes high-performance retrieval possible — lossless connectivity between GPU compute nodes and VAST storage with predictable latency. This is not the retrieval capability itself, but it is the enabling condition. The Cisco + NVIDIA + VAST data platform is marketed as 'the first enterprise architecture unifying compute, fabric, and storage into a single, validated platform to accelerate RAG' — and the validated integration is genuine, even though each component has a different owner. Where Cisco is thinner than Dell at Layer 1B: Dell has Data Analytics Engine with its own MCP Server (Feb 2026), blurring the 1A/1B boundary with search, analytics, and orchestration surfaced as a single queryable service — Dell-owned IP. HPE has Data Fabric with integrated vector search capabilities. Cisco's Layer 1B retrieval is entirely partner-provided. If the enterprise selects a storage partner other than VAST (NetApp, Pure Storage, etc.), the Layer 1B retrieval story changes entirely — each partner brings different retrieval capabilities, and Cisco provides no Cisco-owned retrieval abstraction above them. The network-to-retrieval performance correlation is an underappreciated Cisco contribution. When Nexus One's AI Job Observability shows that retrieval latency degraded because of network congestion on a specific path between compute and storage tiers, that's retrieval-relevant intelligence that no storage vendor can provide independently. Cisco sees the network between the GPU and the data — VAST sees the data, NVIDIA sees the GPU, Cisco sees the fabric connecting them. ### Borrowed Judgment High for retrieval logic, low for retrieval-enabling networking. All retrieval and context management intelligence is provided by VAST (InsightEngine) and NVIDIA (NeMo Retriever, AI Enterprise). 
Cisco provides the networking substrate that determines retrieval performance characteristics — and that substrate is Cisco-owned IP with genuine impact on retrieval latency and throughput. The structural comparison: Dell borrows retrieval acceleration from NVIDIA (cuVS, NeMo Retriever) but retains the storage platform on which retrieval operates (PowerScale). Cisco borrows both retrieval logic (VAST InsightEngine) and retrieval acceleration (NVIDIA) but retains the networking fabric that connects them. ### Working Notes The network-as-retrieval-enabler framing is worth developing further. In a disaggregated architecture where GPU compute and vector storage are on separate server tiers connected by fabric, the network IS part of the retrieval path. Cisco's ability to correlate network telemetry with retrieval latency via Nexus One and Splunk is a genuine observability advantage — one that could feed Layer 2C placement decisions about which compute-to-storage path to use for a given retrieval workload. The VAST InsightEngine as first VAST Certified CNode-X platform on Cisco UCS C845A M8 (with RTX PRO 6000 acceleration) represents the deepest Cisco-VAST integration point. This is where the partnership moves from 'VAST storage on Cisco servers' to 'VAST compute-storage fusion on Cisco infrastructure' — closer to a joint product than a validated configuration. ## ◑ Layer 1C: Data Movement & Pipelines *ETL/ELT, feature engineering, data preparation for AI workloads* **Status:** Delegated — Partner-Delivered ### Vendor-Provided Components **VAST DataEngine / SyncEngine (on Cisco AI PODs)** [DAPM: Delegated] VAST DataEngine executes serverless functions directly where data lives — 'bringing compute to the data.' SyncEngine indexes and synchronizes data from external sources, triggering enrichment pipelines automatically. Both are delivered as part of the Cisco Secure AI Factory with VAST Data. 
The customer asking Cisco for data pipeline capability receives these as validated components. **Cisco Networking Fabric (Data Movement Substrate)** [DAPM: Retained] Silicon One G300 Intelligent Collective Networking optimizes data flow across the AI cluster. The network fabric is the physical data movement layer — every byte of training data, every embedding pipeline, every model checkpoint traverses Cisco's switching fabric. Lossless, low-latency networking with up to 800G bandwidth is a prerequisite for high-throughput data pipelines. Cisco's contribution to Layer 1C is the movement infrastructure, not the pipeline logic. ### Gap Analysis Applying the customer litmus test: the enterprise asking Cisco for data pipeline capability receives VAST DataEngine and SyncEngine as validated components of the Secure AI Factory. The pipeline function is delivered. DAPM classification (Delegated) captures the authority structure. Cisco has no proprietary data pipeline, ETL, or feature engineering tools — and this is consistent with Cisco's architectural identity. Cisco has never been a data management software company. Dell's Dataloop-powered Data Orchestration Engine is notable precisely because it's Dell's first proprietary software asset in the data lifecycle. HPE's Data Fabric provides policy-based data placement. IBM has watsonx.data with Confluent streaming. Cisco's equivalent is the validated partner integration. Cisco's Retained contribution at Layer 1C is the networking fabric that data pipelines traverse. In a disaggregated AI architecture, data movement between storage tiers, GPU compute, and model serving endpoints is constrained by network bandwidth and latency. Silicon One G300's Intelligent Collective Networking — the 33% utilization improvement and 28% job completion time improvement — directly accelerates data pipeline throughput. This is an infrastructure contribution, not a software contribution, but it's a real one. 
The Splunk data ingestion and processing capabilities could theoretically extend into data pipeline territory — Splunk already handles high-volume data streams for observability. But Splunk is positioned as an observability and security platform, not an AI data pipeline tool. No Cisco signals suggest this expansion. ### Borrowed Judgment High for pipeline logic, low for data movement infrastructure. All data pipeline orchestration, serverless execution, and data synchronization judgment is provided by VAST (DataEngine, SyncEngine) or by the enterprise's own tooling. Cisco provides the networking fabric that determines data movement performance — Cisco-owned IP that directly impacts pipeline throughput. If the enterprise selects a storage partner other than VAST, the Layer 1C story changes significantly. NetApp, Pure Storage, and Hitachi Vantara each have their own data movement capabilities, none validated at the same depth as VAST within the Cisco AI POD architecture. ### Working Notes The networking-as-data-movement framing is structurally consistent across Layers 1A, 1B, and 1C. At each data layer, Cisco's Retained contribution is the networking fabric that connects and enables partner-provided data capabilities. This is not a gap in the customer experience — the customer gets a complete data plane. It is a structural characteristic of Cisco's reference architecture model: Cisco owns the connectivity, partners own the data logic. This pattern has a Layer 2C implication. If Cisco ever builds a control plane that makes placement decisions about data (where should this data live, how should it move, which pipeline should process it), that control plane would need to integrate with multiple storage partners' data APIs. Dell's hypothetical control plane only needs to integrate with Dell's own storage. Cisco's hypothetical control plane needs to be multi-vendor by design — which is harder to build but potentially more valuable in heterogeneous enterprise environments. 
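The pattern these working notes describe ("Cisco owns the connectivity, partners own the data logic") can be tabulated directly from the component entries above. The sketch below is illustrative only: the `Component` record, the `function` tag values, and the `authority_by_function` helper are assumptions introduced for this example and are not part of the 4+1 model or any vendor tooling. Only the Layer 1B and 1C components and their DAPM classifications come from this assessment; Layer 1A entries would follow the same shape.

```python
from dataclasses import dataclass
from enum import Enum


class DAPM(Enum):
    RETAINED = "Retained"
    DELEGATED = "Delegated"
    CEDED = "Ceded"
    ABSENT = "Absent"


@dataclass(frozen=True)
class Component:
    layer: str
    name: str
    owner: str      # who owns the IP, per the component entries above
    function: str   # hypothetical tag: "connectivity" or "data logic"
    dapm: DAPM


# Component names and DAPM classifications as listed for Layers 1B and 1C.
COMPONENTS = [
    Component("1B", "VAST InsightEngine on Cisco AI PODs", "VAST", "data logic", DAPM.DELEGATED),
    Component("1B", "Cisco AI Networking for Retrieval Performance", "Cisco", "connectivity", DAPM.RETAINED),
    Component("1C", "VAST DataEngine / SyncEngine", "VAST", "data logic", DAPM.DELEGATED),
    Component("1C", "Cisco Networking Fabric", "Cisco", "connectivity", DAPM.RETAINED),
]


def authority_by_function(components):
    """Group (owner, DAPM) pairs by the function each component performs."""
    summary = {}
    for c in components:
        summary.setdefault(c.function, set()).add((c.owner, c.dapm))
    return summary


summary = authority_by_function(COMPONENTS)
for function, holders in sorted(summary.items()):
    print(function, sorted((owner, dapm.value) for owner, dapm in holders))
```

Grouping by function rather than by layer is a deliberate choice here: it makes the claim that the split repeats at every data layer checkable, because each function should resolve to exactly one owner and one classification.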
## ◑ Layer 2A: Infrastructure Orchestration *Resource scheduling, GPU allocation, infrastructure lifecycle management* **Status:** Networking-Centric Orchestration ### Vendor-Provided Components **Cisco Intersight (Unified Fleet Management)** [DAPM: Retained] Cloud-based infrastructure management for UCS compute, Nexus networking, and VAST storage lifecycle. Policy-driven server profiles, firmware management, and workload optimization. Manages the chassis and networking — comparable to Dell's OpenManage Enterprise but with stronger networking integration. Does not manage GPU workload scheduling directly. **Nexus One (AI Network Operations)** [DAPM: Retained] Unified management plane for data center networking. AI Job Observability correlates network telemetry with AI workload behavior. AgenticOps-powered autonomous network operations. The AI-aware network management capability is Cisco-unique — no other vendor correlates network health with GPU job completion at this level of integration. **AgenticOps (Agent-First IT Operating Model)** [DAPM: Retained] Agent-driven IT operations across networking, security, and observability. Cross-domain telemetry from Cisco Networking, Security Cloud Control, Nexus One, Splunk, and ThousandEyes. Agentic Workflows and AI Canvas for troubleshooting and automation. Deep Network Model provides system-wide awareness. Extends from cloud to on-premises to air-gapped industrial environments. **Cisco Hybrid Mesh Firewall** [DAPM: Retained] Policy enforcement across network switches, workloads, and NVIDIA BlueField DPUs. Extends security policy to the GPU server level. This is a Layer 0/2A security function — infrastructure-level policy enforcement that operates below the application runtime. ### NVIDIA-Provided Components **NVIDIA GPU Operator + Run:ai** GPU scheduling, resource allocation, and workload orchestration. Same NVIDIA dependency as Dell and HPE at this layer. Cisco does not own GPU-aware scheduling primitives. 
**NVIDIA AI Enterprise** Enterprise AI software platform providing GPU drivers, container runtime, and validated AI frameworks. Licensed separately from Cisco infrastructure. ### Gap Analysis Cisco's Layer 2A is networking-centric: Intersight manages the infrastructure fleet, Nexus One manages the AI network, AgenticOps provides agentic IT operations. These are genuine Cisco-owned capabilities with no equivalent in Dell's or HPE's stack in terms of network-to-GPU observability correlation. But GPU-aware scheduling and workload orchestration — the core Layer 2A functions — are NVIDIA-controlled (GPU Operator, Run:ai, AI Enterprise). This is the same gap Dell and HPE face. The enterprise using Cisco AI PODs schedules GPU workloads through NVIDIA's stack, not through Cisco's. The AgenticOps framework is architecturally interesting because it applies agentic AI to IT operations itself — using AI agents to manage the infrastructure that runs AI agents. No other on-prem vendor has an equivalent agent-driven IT operations model. But AgenticOps manages infrastructure operations (networking, security, observability), not AI workload placement. It's a Layer 2A operational capability, not a Layer 2C governance capability. Nexus One's AI Job Observability deserves specific note: correlating network telemetry with AI job behavior is a genuine signal that could feed Layer 2C placement decisions. If the network can tell you that a training job's completion time degraded because of network congestion on a specific spine switch, that's placement-relevant intelligence. Whether this signal is consumed by any placement logic today is not evident. ### Borrowed Judgment Moderate. Cisco retains infrastructure management and network orchestration judgment (Intersight, Nexus One, AgenticOps). GPU workload scheduling judgment is borrowed from NVIDIA (Run:ai, GPU Operator), same as Dell and HPE. 
The AI Job Observability capability reduces borrowed judgment marginally — Cisco can see network-to-GPU correlations that NVIDIA's orchestration layer may not surface. But seeing the problem and acting on it are different functions. Cisco sees; NVIDIA schedules. ### Working Notes AgenticOps' cross-domain telemetry — ingesting signals from networking, security, and observability into a single agentic decision surface — is the closest thing to a unified operations control plane in the on-prem assessment series. HPE's GreenLake Intelligence is comparable but serves a different function (infrastructure operations vs. networking operations). The question is whether AgenticOps evolves from operational automation into policy-driven governance. ## ◑ Layer 2B: Application Runtime & Execution *Model serving, agent execution, inference APIs, distributed inference* **Status:** Security Layer — Not Runtime ### Vendor-Provided Components **Cisco AI Defense** [DAPM: Retained] Industry-first AI security solution for securing AI models, agents, applications, and infrastructure. Integrates NVIDIA NeMo Guardrails for AI application security. Secures NVIDIA OpenShell agent development platform with controls and guardrails to govern agent and claw actions. Model validation, prompt injection defense, data leakage prevention, bias detection. Spans Layer 2B (runtime security) and Layer 2C (agent governance). This is Cisco's most differentiated Layer 2B contribution — no other infrastructure vendor has an equivalent AI-specific security product. **DefenseClaw (Open Source Secure Agent Framework)** [DAPM: Retained] Open-source framework that automates security governance for agentic AI. Admission control: scans skills, MCP servers, plugins, and code before they run. Observe mode (log without blocking) and action mode (block HIGH/CRITICAL findings). Integrates with NVIDIA OpenShell as the sandbox runtime. Splunk Observability Cloud dashboards for monitoring. Apache 2.0 licensed. 
Plans to integrate with NVIDIA OpenShell as the sandbox to eliminate manual steps and accelerate secure agent deployment.

**Cisco Isovalent Runtime Security** [DAPM: Retained]
eBPF-based runtime security for containerized AI workloads. Deep kernel-level visibility into container behavior, network flows, and system calls. Part of the Secure AI Factory security stack. This is runtime enforcement, not runtime execution: it secures the container environment but does not provide the model-serving or agent execution runtime.

**Splunk AI Agent Monitoring** [DAPM: Retained]
Tracks performance, cost, quality, and behavior of LLM and agentic applications in Splunk Observability Cloud. Visualizes agent workflows. Integrating with Cisco AI Defense for risk mitigation (bias, hallucinations, data leakage, prompt injection). GA February 2026. The Galileo acquisition (expected Q4 FY2026) will add real-time guardrails, 20+ evaluation metrics including hallucination detection and context adherence, and full Agent Development Lifecycle (ADLC) coverage.

### NVIDIA-Provided Components

**NVIDIA NemoClaw / OpenShell**
Agent runtime and sandboxed execution environment. Cisco AI Defense integrates with OpenShell to add security governance. The runtime is NVIDIA's; the security layer is Cisco's.

**NVIDIA NIM + AI Enterprise**
Containerized model serving and commercial AI platform. Same dependency as Dell and HPE.

**NVIDIA Dynamo**
Distributed inference framework with KV-aware routing. Performance optimization, not policy-driven placement.

**NVIDIA NeMo Guardrails**
Runtime safety boundaries integrated into Cisco AI Defense. Cisco extends NeMo Guardrails with its own AI Defense policy enforcement.

### Gap Analysis

Cisco does not own the core agent runtime, model-serving runtime, or distributed inference framework, the same structural position as Dell and HPE at Layer 2B. The NVIDIA NemoClaw/OpenShell stack provides execution; Cisco provides security governance around it.
But Cisco's security contribution at Layer 2B is the most comprehensive of any infrastructure vendor assessed. AI Defense is not a rebranded NVIDIA capability — it's Cisco-developed AI-specific security that integrates with NVIDIA's runtime. DefenseClaw is open-source admission control for agent capabilities. Isovalent provides kernel-level container security. Splunk AI Agent Monitoring provides behavioral observability. Together, these constitute a 'trust layer' for AI execution that no other infrastructure vendor matches. The Galileo acquisition is strategically significant: it extends Splunk from infrastructure observability into AI agent evaluation, covering the full Agent Development Lifecycle (ADLC) — prompt optimization, model selection, production monitoring, and guardrail enforcement. Post-Galileo, Cisco will have the most comprehensive AI observability stack of any infrastructure vendor. The distinction the 4+1 model draws: security constrains what agents CAN'T do. Runtime governs what agents DO. Governance determines what agents SHOULD do. Cisco has the first. NVIDIA has the second. Nobody fully has the third. ### Borrowed Judgment Moderate but inverted from Dell's pattern. Dell borrows runtime judgment from NVIDIA and security judgment from partners (CrowdStrike, Fortanix, F5). Cisco borrows runtime judgment from NVIDIA but retains security judgment entirely — AI Defense, DefenseClaw, Isovalent, and Splunk AI Agent Monitoring are all Cisco IP. The enterprise inherits NVIDIA's runtime decisions but Cisco's security decisions. Post-Galileo, the observability judgment becomes even more Retained: Cisco will own infrastructure observability (Splunk), network observability (ThousandEyes, Nexus One), and AI agent observability (Galileo + AI Agent Monitoring) in a single platform. 
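The DefenseClaw behavior described under Vendor-Provided Components (scan an artifact before it runs, then either log in observe mode or block HIGH/CRITICAL findings in action mode) can be sketched as a small admission check. This is a minimal illustration under stated assumptions, not DefenseClaw's actual API: the `Finding`, `Mode`, and `admit` names, the LOW and MEDIUM severity levels, and the sample findings are all hypothetical. Only the two modes and the HIGH/CRITICAL blocking threshold come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    LOW = 1        # assumed level, not named in the source
    MEDIUM = 2     # assumed level, not named in the source
    HIGH = 3
    CRITICAL = 4


class Mode(Enum):
    OBSERVE = "observe"   # log findings without blocking
    ACTION = "action"     # block HIGH/CRITICAL findings


@dataclass(frozen=True)
class Finding:
    artifact: str         # e.g. a skill, MCP server, plugin, or code bundle
    severity: Severity
    detail: str


BLOCKING = {Severity.HIGH, Severity.CRITICAL}


def admit(findings, mode):
    """Return (admitted, log) for one artifact's scan findings."""
    log = [f"{f.severity.name}: {f.artifact}: {f.detail}" for f in findings]
    if mode is Mode.OBSERVE:
        # Observe mode records everything but never blocks.
        return True, log
    blocked = any(f.severity in BLOCKING for f in findings)
    return (not blocked), log


# Hypothetical scan result for a single artifact.
findings = [
    Finding("example-mcp-server", Severity.HIGH, "hypothetical finding A"),
    Finding("example-mcp-server", Severity.LOW, "hypothetical finding B"),
]
observe_ok, observe_log = admit(findings, Mode.OBSERVE)
action_ok, action_log = admit(findings, Mode.ACTION)
print(observe_ok, action_ok)
```

The same findings produce different admission outcomes depending on mode. This is the sense in which such a control constrains what agents cannot do without deciding what they should do.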
### Working Notes Peter Bailey (SVP/GM Cisco Security): 'We have this opportunity to be a trust layer, not just for network activity, but actually what's happening at the application layer, at the workload layer, between agents, between workloads, between data.' This is a Layer 2B/2C strategic claim — Cisco as the trust layer that wraps around other vendors' execution layers. DefenseClaw's open-source model is strategically similar to AGNTCY: Cisco open-sources the agent security framework, building community adoption and standards influence, while retaining the commercial AI Defense product. Open-source for influence, commercial for revenue. ## ◑ Layer 2C: Agentic Infrastructure — The Reasoning Plane *Policy-driven placement and resource coordination — the Autonomy Layer* **Status:** Security + Identity — Not Yet Governance ### Vendor-Provided Components **Duo Agentic Identity** [DAPM: Retained] Agent identity as first-class non-human identities. Duo Directory registers agents as distinct identity objects mapped to human owners with group-based policy enforcement. Per-action least-privilege enforcement. Lifecycle visibility for agent onboarding and decommissioning. Cisco Identity Intelligence provides continuous inventory of active AI agents — including shadow agents never registered with IdP. Architectural advantage: Cisco spans both identity and network, surfacing agents that communicate across infrastructure even without IdP registration. **AGNTCY (Linux Foundation — Infrastructure Layer)** [DAPM: Retained] Open-source infrastructure for multi-agent systems: agent discovery (Open Agent Schema Framework / DNS-like agent directory), agent identity (cryptographic verification across organizational boundaries), agent messaging (SLIM — Secure Low-Latency Interactive Messaging, quantum-safe), agent observability (end-to-end across multi-agent, multi-vendor workflows). 
Originally open-sourced by Cisco Outshift (March 2025), donated to Linux Foundation (July 2025) with Dell, Google Cloud, Oracle, Red Hat as formative members. 75+ supporting companies. Critical context: AGNTCY is one component within a larger standards convergence, not the defining standard itself. The Agentic AI Foundation (AAIF), formed December 2025, is the umbrella — analogous to CNCF for cloud-native. AAIF's founding projects are MCP (Anthropic), AGENTS.md (OpenAI), and goose (Block). Platinum members are AWS, Anthropic, Block, Bloomberg, Cloudflare, Google, Microsoft, OpenAI. Cisco is a Gold member of AAIF, not Platinum. A2A (Google) reached v1.0 and joined the broader LF agentic ecosystem. AGNTCY sits as complementary infrastructure beneath these protocols — the plumbing (discovery, identity, messaging, observability) that MCP and A2A need but don't provide themselves. AAIF is already called 'the fastest-growing project in Linux Foundation history' with 190 members by May 2026 — more than double CNCF's membership at the same stage. Microsoft explicitly draws the Kubernetes parallel: 'Just as Kubernetes needed RBAC and admission controllers to be enterprise-ready, agentic systems need governance primitives.' AGNTCY contributes some of those primitives. It does not define the ecosystem. **Cisco AI Defense (Agent Governance Functions)** [DAPM: Retained] Secures multi-agent systems with controls for agent discovery, behavioral guardrails, and policy enforcement. Integrates with NVIDIA OpenShell for sandbox governance. Combined with Duo Agentic Identity, provides: which agents exist (discovery), who they are (identity), what they can access (authorization), what they do (monitoring), and what they can't do (guardrails). This is the closest thing to an agent governance stack in Cisco's portfolio. **Splunk Observability + Galileo (Agent Evaluation)** [DAPM: Retained] AI Agent Monitoring in Splunk Observability Cloud for production agent behavior tracking. 
Galileo acquisition adds real-time guardrails, hallucination detection, context adherence, and chunk attribution: 20+ evaluation metrics across the full ADLC. The Futurum Group positioned Splunk post-Galileo as 'an AI-era control plane candidate, concentrating network, security, and AI agent behavior telemetry in a single vendor.' Whether telemetry concentration constitutes a control plane is the open question.

### NVIDIA-Provided Components

**No NVIDIA Layer 2C Dependency**
All Layer 2C components are Cisco IP or Cisco-originated open standards. NVIDIA does not control agent identity, governance policy, or observability in Cisco's stack. This is the same pattern as Google's Layer 2C: the intelligence governance layer is vendor-owned, not NVIDIA-dependent.

### Gap Analysis

Applying the 'Routing Is Not Reasoning' test from the 4+1 model:

• Duo Agentic Identity = identity management (which agents exist and what they can access)
• AI Defense = constraint enforcement (what agents cannot do)
• DefenseClaw = admission control (what agent capabilities are approved to run)
• AGNTCY = infrastructure primitives (how agents discover, authenticate, and message each other)
• Splunk + Galileo = observability and evaluation (what agents are doing and how well)

None of these provides policy-driven decisions about where compute runs relative to data, which model serves which request, or how cost/compliance/latency are arbitrated in real time. Cisco's Layer 2C is the most comprehensive SECURITY and OBSERVABILITY story of any vendor assessed, but security and observability are not the same as governance and placement.

The AGNTCY positioning requires careful calibration. The K8s analogy applies to the broader Agentic AI Foundation (AAIF), the Linux Foundation umbrella with MCP (Anthropic), AGENTS.md (OpenAI), goose (Block) as founding projects, and AWS/Google/Microsoft/Anthropic/OpenAI as Platinum members.
AAIF is 'the fastest-growing project in Linux Foundation history' with 190 members, more than double CNCF's membership at the same stage. AGNTCY is one project within this ecosystem: important infrastructure (discovery, identity, messaging, observability) that MCP and A2A need but don't provide. But Cisco is a Gold member of AAIF, not Platinum. The ecosystem-defining standards (MCP, A2A) were originated by Anthropic and Google respectively, not Cisco. AGNTCY contributes the plumbing beneath the protocols, which is valuable, but it is not the protocols themselves.

The structural comparison:

• Google has a productized Layer 2C control plane (Agent Identity + Gateway + Registry + Orchestration + Observability + Memory Bank) AND originated A2A
• IBM has cross-framework agent governance (watsonx Orchestrate) + cross-platform model governance (watsonx.governance)
• VAST has PolicyEngine + Polaris for data-plane governance
• Cisco has agent identity + agent security + agent observability + open infrastructure primitives (AGNTCY), but no agent orchestration, no model routing, no placement reasoning

Cisco is building the trust layer and contributing to the infrastructure standards layer. The question the 4+1 model poses is whether trust (can this agent be trusted to act?) plus interoperability standards (can agents discover and talk to each other?) is sufficient without governance (should this agent act here, now, with this data, on this model?). Trust and interoperability are necessary for governance. They are not governance.

However, Cisco's Layer 2C position is stronger than Dell's (Absent) and arguably stronger than HPE's (Delegated to Kamiwaza). Cisco owns genuine Layer 2C primitives; they just don't compose into a placement engine yet.

### Borrowed Judgment

Low for the functions Cisco provides. All Layer 2C components are Cisco IP or Cisco-originated open standards. No NVIDIA dependency, no partner dependency for identity, security, or observability at this layer.
But 'low borrowed judgment for partial Layer 2C' is structurally different from 'low borrowed judgment for complete Layer 2C.' VAST has low borrowed judgment for a comprehensive (if captive) Layer 2C. Cisco has low borrowed judgment for agent trust functions — but the placement, routing, and governance functions are Absent, not borrowed. The enterprise architect using Cisco for Layer 2C gets strong agent identity and security with zero borrowed judgment. They do not get agent orchestration, model routing, or policy-driven placement from anyone — Cisco or otherwise. ### Working Notes The Futurum Group's framing of post-Galileo Splunk as 'an AI-era control plane candidate' is the most explicit analyst validation of Cisco's Layer 2C potential. The path from candidate to actual control plane requires composing identity (Duo) + security (AI Defense) + observability (Splunk/Galileo) + networking intelligence (Nexus One AI Job Observability) into a single decision surface that makes placement and governance decisions. The pieces exist. The composition does not. This is the inverse of Dell's position: Dell has no pieces. Cisco has pieces without composition. VAST has composition without openness. Google has composition with captivity. The AAIF ecosystem is the right frame for understanding where agentic governance standards are heading. Microsoft's explicit Kubernetes parallel — 'Just as Kubernetes needed RBAC and admission controllers to be enterprise-ready, agentic systems need governance primitives and those primitives belong in the open' — validates the 4+1 model's Layer 2C thesis. The governance primitives Microsoft references are exactly what Layer 2C defines. AAIF is building the open-standards foundation for Layer 2C; the question is whether any vendor productizes it as a coherent control plane. 
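To make the composition gap noted earlier in these working notes concrete, the sketch below shows one hypothetical shape a "single decision surface" could take, combining an identity signal, a security signal, an evaluation signal, and a network signal into one placement decision. Everything here is an assumption for illustration: the `Signals` fields, the `decide_placement` function, the 0.8 threshold, and the site names are invented, and none of it corresponds to a Duo, AI Defense, Splunk, Galileo, or Nexus One API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    agent_registered: bool      # identity signal (hypothetical field)
    guardrail_violations: int   # security signal (hypothetical field)
    eval_score: float           # evaluation signal in [0, 1] (hypothetical field)
    path_congested: dict        # network signal: site name -> bool (hypothetical field)


def decide_placement(signals, candidate_sites, min_eval_score=0.8):
    """Return the first acceptable site, or None if any trust gate fails.

    The first three checks are trust gates (may this agent act at all?).
    Only the final loop is a placement decision (where should it act?).
    """
    if not signals.agent_registered:
        return None
    if signals.guardrail_violations > 0:
        return None
    if signals.eval_score < min_eval_score:
        return None
    for site in candidate_sites:
        # Sites with no telemetry are treated as congested, a conservative default.
        if not signals.path_congested.get(site, True):
            return site
    return None


# Hypothetical inputs: two candidate sites, one with a congested path.
signals = Signals(
    agent_registered=True,
    guardrail_violations=0,
    eval_score=0.9,
    path_congested={"site-a": True, "site-b": False},
)
chosen = decide_placement(signals, ["site-a", "site-b"])
print(chosen)
```

The trust gates map onto functions this assessment classifies as Retained, while the placement loop stands in for the function it classifies as Absent. A real placement engine would also have to arbitrate cost, compliance, and latency, which this sketch deliberately omits.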
Cisco's strategic position within AAIF: contributor of infrastructure plumbing (AGNTCY), commercial implementer of trust functions (AI Defense, Duo, Splunk), Gold-tier member. This is a meaningful position — but it's one contributor among 190 members, not the ecosystem definer. Anthropic (MCP) and Google (A2A) contributed the foundational protocols. Cisco contributed the infrastructure layer beneath them. The analogy: Google contributed Kubernetes; Cisco contributed something more like the CNI (Container Network Interface) or service mesh layer — essential infrastructure, but not the orchestration standard itself. The commercial bet: if AAIF standards mature into the enterprise agentic governance stack, Cisco's commercial products (AI Defense for security, Duo for identity, Splunk for observability, DefenseClaw for admission control) become the enterprise implementation layer on top of open standards — the Red Hat model applied to agentic infrastructure. That's a credible business model if the standards achieve adoption. Cisco survey data: 55% of organizations have agentic AI running as pilots or in production (Jan 2026). Only 4% are fully confident in full-scale deployment. 59% cite security concerns as the biggest barrier. Cisco is positioned to address the security barrier. The governance barrier — which the 4+1 model argues is structurally different — is what AAIF and AGNTCY are attempting to address at the standards level. ## ◇ Layer 3 (+1): AI Application Layer — The Value Plane *AI-powered business capabilities — business logic, workflow automation* **Status:** Partner Ecosystem ### Vendor-Provided Components **Cisco Secure AI Factory Partner Ecosystem** [DAPM: Delegated] Reference architecture that partners build on. VAST Data (storage), NVIDIA (compute/runtime), and a growing ecosystem of security partners integrated into AI Defense. 
The Secure AI Factory is a validated framework, not a walled garden — partners provide application logic, Cisco provides infrastructure and security. **Splunk Platform (Analytics + Security Applications)** [DAPM: Retained] Splunk Enterprise Security TDIR platform. Splunk Observability Cloud. These are Layer 3 security applications that run on Cisco's own infrastructure. Splunk's installed base provides a distribution channel for AI capabilities — existing Splunk customers can adopt AI Agent Monitoring without new vendor relationships. **Cisco Security Portfolio (Security Applications)** [DAPM: Retained] AI Defense, Duo, SecureX, Umbrella, Talos threat intelligence — the security product portfolio that Cisco positions as the 'trust layer' for enterprise AI. Each product addresses a specific security function; together they constitute a security application suite that sits alongside (not above) AI applications from other vendors. ### NVIDIA-Provided Components **NVIDIA NIM + NemoClaw / OpenShell Runtime** Execution surface for AI applications on Cisco infrastructure. NVIDIA provides the application runtime; Cisco provides the infrastructure substrate and security layer. ### Gap Analysis Cisco's Layer 3 is structurally different from Dell's, HPE's, or the hyperscalers'. Dell has a broad ISV ecosystem (OpenAI, Palantir, Google, ServiceNow, SpaceXAI). HPE has Unleash AI with 26+ curated ISV partners. AWS and Google have thousands of ISV applications. Cisco's Layer 3 is narrower — the Secure AI Factory is a reference architecture that partners build on, not an application marketplace. Cisco's strongest Layer 3 asset is Splunk — an established platform with deep enterprise penetration that is being extended into AI observability and security. Splunk's competitive advantage at Layer 3 is distribution: enterprises already running Splunk can adopt AI Agent Monitoring, AI Defense integration, and Galileo evaluation capabilities within their existing observability investment. 
The strategic comparison with Dell at Layer 3: Dell's ecosystem is load-bearing (ISV partners provide infrastructure-level functions the platform lacks). Cisco's ecosystem is enabling (partners build applications on Cisco's infrastructure substrate). Cisco's Layer 3 partnerships are less about filling platform gaps and more about extending the reference architecture to specific use cases. The Cisco 360 Partner Program (launched January 2026) structures partner engagement around the Secure AI Factory, with role-based training paths, dCloud demo environments, and NVIDIA compute training alongside Cisco networking and security training. This is a go-to-market capability, not a technology capability. ### Borrowed Judgment Distributed across partners, which is architecturally appropriate at Layer 3. The structural observation: Cisco's borrowed judgment at Layer 3 is concentrated in two domains — AI runtime (NVIDIA) and data platform (VAST). The security and observability applications are Retained. The enterprise application use cases are partner-provided. The Splunk ecosystem (tens of thousands of enterprise customers, extensive app marketplace, active developer community) provides a distribution advantage for AI capabilities that pure-infrastructure vendors lack. Dell and HPE must sell AI capabilities to new buyer personas. Cisco can extend AI capabilities to existing Splunk and security customers. ### Working Notes The service provider go-to-market is unique to Cisco at Layer 3. Cisco AI Grid enables telcos to offer managed AI services to their enterprise customers using Cisco infrastructure. No other assessed vendor has a comparable service provider distribution channel for AI. If edge inference becomes a significant market, Cisco's telco relationships position it differently from every other vendor in the assessment. 
The 4,000-employee reduction (roughly 5% of ~86,000) alongside record revenue signals active reallocation toward AI infrastructure and security: Cisco is restructuring its workforce to match its strategic pivot, not cutting due to weakness.

════════════════════════════════════════════════════════════════════════════════

# Dell AI Factory with NVIDIA Mapped to the 4+1 Layer AI Infrastructure Model

**Version:** v2.1 - Post-Editorial Review
**Date:** May 21, 2026
**Source:** DTW 2026, GTC 2026, Dell press releases, published 4+1 model

## Summary Finding

Dell has one of the most credible on-prem AI Factory infrastructure stacks in the market. Its credibility comes from physical infrastructure (Layer 0), storage and data lifecycle integration (Layers 1A/1B/1C, with the Dataloop acquisition giving Dell its first proprietary software asset in the data lifecycle), and ecosystem packaging (Layer 3). The Data Plane is where Dell has made its most meaningful software moves, and the Dataloop-powered Data Orchestration Engine deserves recognition as a genuine practitioner-level capability, not just a bolt-on.

But the closer the stack gets to GPU-aware scheduling, agent execution, and policy-driven placement, the more authority moves away from Dell and toward NVIDIA or ISV partners. Layer 2A's GPU-aware orchestration primitives are NVIDIA-controlled (GPU Operator, Run:ai, AI Enterprise). Dell does not appear to own the core agent runtime, model-serving runtime, guardrail framework, or distributed inference framework in the Layer 2B NVIDIA path. No productized Dell-owned Layer 2C control plane is evident that makes policy-driven placement decisions across models, data, agents, and infrastructure.

The Layer 3 ecosystem is one of the strongest on-prem AI ecosystem stories in the market (5,000+ customers, partnerships with OpenAI, Palantir, Google, ServiceNow, SpaceXAI, Hugging Face).
But each partner brings its own governance domain, creating multiple independently-governed agent populations on shared infrastructure with no cross-domain orchestration layer. Dell's security posture (Zero Trust, Intel confidential computing, CrowdStrike/Fortanix/F5) protects the platform from external threats. But security is not governance. Security constrains who can access the platform. Governance constrains what the platform does. The Dell AI Factory has security. It does not yet have governance at the infrastructure level. That does not make the AI Factory weak. It exposes where the next control-plane battle will be fought. ## ● Layer 0: Compute & Network Fabric *Raw compute, networking, and acceleration fabric* **Status:** Dell Strength ### Vendor-Provided Components **PowerRack** [DAPM: Retained] Turnkey rack-scale: compute, networking, storage integrated with thermal/power management as one unit. **PowerEdge XE9812 (Vera Rubin NVL72)** [DAPM: Retained] 10x lower cost-per-token than Blackwell for agentic inference. **Pro Max GB300 (Deskside)** [DAPM: Retained] 120B–1T parameter models. MaxCool liquid cooling. ~3 month break-even vs cloud. **PowerSwitch SN6000-series** [DAPM: Retained] NVIDIA Spectrum-6 Ethernet. 800+ Tb/sec east-west. NVIDIA silicon with Dell branding. **PowerCool CDU C7000** [DAPM: Retained] First rack-mount CDU for Vera Rubin NVL72 density. 4U, 19", up to 40°C facility water. ### NVIDIA-Provided Components **GPU/Accelerator Silicon** Blackwell, Vera Rubin — the compute engines Dell builds around. **NVLink / NVSwitch** Intra-node high-bandwidth interconnect defining memory and compute topology. **Spectrum Ethernet Silicon** Dell brands and rack-integrates NVIDIA switching silicon. ### Gap Analysis Dell retains platform packaging authority at Layer 0, but the accelerator fabric and high-performance AI networking roadmap are structurally tied to NVIDIA. 
Dell provides genuine engineering differentiators in thermal design, rack integration, and mechanical authority. The networking silicon dependency is worth tracking. ### Borrowed Judgment Structural co-dependency: Dell retains mechanical authority, NVIDIA retains silicon authority. If NVIDIA changes the Spectrum roadmap, Dell's PowerRack networking story changes with it. ### Working Notes AMD alternative exists under 'Dell AI Platform with AMD' (separate SKUs). MI350P PCIe, air-cooled, ROCm/vLLM stack. Different Layer 2B story entirely. ## ● Layer 1A: Data Storage & Governance *Durable, governed data foundation — the Governance Catalog that Layer 2C queries* **Status:** Dell Strength ### Vendor-Provided Components **PowerScale (File Engine)** [DAPM: Retained] MetadataIQ integration. NeMo Retriever connector. pNFS 25% throughput improvement. **ObjectScale (Object Storage)** [DAPM: Retained] S3-compatible. S3 over RDMA. NVIDIA Omniverse integration. Palantir Ontology deploys here. **Exascale Storage (3-in-1)** [DAPM: Retained] PowerScale + ObjectScale + Lightning FS on one platform. 10+ PB/rack, 6 TB/s reads. **MetadataIQ** [DAPM: Retained] Indexes billions of files across PowerScale/ObjectScale. Foundation of the governance catalog. **Trust3 AI Integration** [DAPM: Delegated] Storage-layer governance: sensitive data discovery, 'write once, apply everywhere' policy, AI auditing. EU AI Act/GDPR/HIPAA. **Cyber Resilience (Built-in)** [DAPM: Retained] Zero Trust, encryption, RBAC, immutable snapshots, XDR, data masking, air-gapped backup. ### NVIDIA-Provided Components **cuVS (Vector Search)** 12x faster vector indexing. Makes billion-file indexing viable. **CX-8/CX-9 SuperNICs** Storage-side RDMA for GPU-direct access. **NeMo Retriever Connector** PowerScale integration for GPU-accelerated retrieval. ### Gap Analysis Dell's strongest layer after Layer 0. Exascale 3-in-1 architecture is architecturally significant for data locality. 
Trust3 AI provides agentic-AI-aware governance. The strategic question: is MetadataIQ metadata rich enough to drive Layer 2C placement decisions? Dell's marketing says yes. The proof is whether any Layer 2C can query it programmatically. ### Borrowed Judgment Low. Dell owns storage platforms, metadata layer, and cyber resilience stack. NVIDIA provides acceleration, not governance logic. Trust3 AI is the only Delegated component. ### Working Notes Data Analytics Engine Agentic Layer + MCP Server (Feb 2026) blur 1A/1B boundary — search, analytics, and orchestration surfaced as a single queryable service. ## ◑ Layer 1B: Context Management & Retrieval *Low-latency retrieval for RAG — vector/hybrid search, context windows* **Status:** Delegated ### Vendor-Provided Components **Dell Data Search Engine (Elastic)** [DAPM: Delegated] Elasticsearch 9.4. Hybrid keyword+vector search. MetadataIQ integration. LangChain support. GA with GPU accel Q2 2026. **Incremental Indexing** [DAPM: Delegated] Only updated files re-indexed. Keeps retrieval synchronized with governance catalog. **Analytics Engine Agentic Layer + MCP Server** [DAPM: Delegated] Unifies vector stores across Iceberg, Data Search Engine, PostgreSQL+PGVector. Agent-queryable. ### NVIDIA-Provided Components **NVIDIA cuVS** GPU-accelerated hybrid search. 12x faster vector indexing. **NVIDIA STX Architecture** BlueField-4 + ConnectX-9 + Spectrum-X + DOCA. Storage-side acceleration — available to ALL storage vendors. **NeMo Retriever** PowerScale connector for GPU-accelerated retrieval. ### Gap Analysis Three-party dependency: Dell (storage + metadata), Elastic (search intelligence), NVIDIA (acceleration). STX is non-differentiating — every storage vendor has it. Dell's differentiation is MetadataIQ integration and the Elastic partnership, not NVIDIA acceleration. ### Borrowed Judgment Moderate, distributed across two partners. Search intelligence is Elastic's. Acceleration is NVIDIA's. 
Dell's durable value is the data substrate — if you swap search engines, PowerScale data doesn't move.

### Working Notes

No retrieval quality observability (recall@k, latency percentiles) that a Layer 2C could use for placement decisions.

## ◑ Layer 1C: Data Movement & Pipelines

*Move/transform data — ETL/ELT, lineage, cost-aware movement, KV cache tiering*

**Status:** Dell + Dataloop

### Vendor-Provided Components

**Data Orchestration Engine (Dataloop)** [DAPM: Retained] No-code/low-code AI data lifecycle. Dell's most meaningful software acquisition (~$120M, Dec 2025). GA Q1 CY26.

**Orchestration Engine Marketplace** [DAPM: Delegated] 200+ models, NVIDIA NIMs, Blueprints, AI-Q templates.

**KV Cache Offload to Shared Storage** [DAPM: Delegated] NVIDIA CMX support. 19x TTFT improvement, 5.3x QPS. Offloads KV cache from GPU HBM to PowerScale/ObjectScale/Lightning FS.

**Data Analytics Engine (Starburst)** [DAPM: Delegated] GPU-accelerated SQL. Agentic Layer + MCP Server for agent access.

### NVIDIA-Provided Components

**NVIDIA CMX** BlueField-4 powered context memory tier (G3.5). 5x TPS, 5x power efficiency. Dedicated KV cache tier.

**NVIDIA STX Reference Architecture** Storage-side infrastructure reference. Non-differentiating for Dell.

**Blueprints, NIMs, AI-Q Blueprint** Pre-built pipeline components through the Marketplace.

### Gap Analysis

Dell's most significant strategic move. Dataloop gives Dell proprietary orchestration logic — strongest 'Retained' software play in the stack. KV Cache offload is the most architecturally significant Layer 1C capability: it solves a data movement problem with direct impact on inference economics. 'Context Moves to Storage' inverts the 'Compute Moves to Data' principle.

### Borrowed Judgment

Low for orchestration (Dell owns Dataloop IP). Moderate for KV cache (joint Dell+NVIDIA, CMX dependency). Starburst is cleanly swappable.
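The KV cache offload pattern described in this layer can be pictured as a two-tier lookup: check the fast tier first, fall back to the offload tier, and recompute the prefill only on a full miss. The sketch below is a toy illustration under stated assumptions. The class `TieredKVCache`, its `hbm_capacity` parameter, and the dict-backed tiers are hypothetical stand-ins, not NVIDIA CMX or Dell storage APIs, and it says nothing about the TTFT or QPS figures quoted above.

```python
from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier KV cache (hypothetical, for illustration only).

    `hbm` stands in for GPU HBM (small, LRU-evicted).
    `shared` stands in for a shared-storage offload tier (unbounded here).
    """

    def __init__(self, hbm_capacity):
        self.hbm_capacity = hbm_capacity
        self.hbm = OrderedDict()  # oldest entry first
        self.shared = {}
        self.stats = {"hbm_hit": 0, "shared_hit": 0, "miss": 0}

    def put(self, prefix_id, kv_blocks):
        self.hbm[prefix_id] = kv_blocks
        self.hbm.move_to_end(prefix_id)
        # Evicted entries are offloaded to the shared tier, not discarded.
        while len(self.hbm) > self.hbm_capacity:
            old_id, old_blocks = self.hbm.popitem(last=False)
            self.shared[old_id] = old_blocks

    def get(self, prefix_id, recompute):
        if prefix_id in self.hbm:
            self.stats["hbm_hit"] += 1
            self.hbm.move_to_end(prefix_id)
            return self.hbm[prefix_id]
        if prefix_id in self.shared:
            # Offload-tier hit: promote back to the fast tier, skip recompute.
            self.stats["shared_hit"] += 1
            blocks = self.shared.pop(prefix_id)
            self.put(prefix_id, blocks)
            return blocks
        # Full miss: pay the prefill cost again.
        self.stats["miss"] += 1
        blocks = recompute(prefix_id)
        self.put(prefix_id, blocks)
        return blocks
```

The design point the sketch makes is the one in the Gap Analysis: an eviction from the fast tier becomes a data movement to storage, so a later request for the same context pays a fetch cost in place of a recompute cost.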
### Working Notes NAND Research flagged maturity concern: 4-month-old acquisition as enterprise orchestration engine vs. established Databricks/Snowflake. HyperFRAME: only 14% of orgs have AI-ready data architecture. ## ○ Layer 2A: Infrastructure Orchestration *GPU scheduling, quotas, RBAC, fair-share scheduling, utilization optimization* **Status:** Gap ### Vendor-Provided Components **Integrated Rack Controller** [DAPM: Retained] Physical rack management — power, thermal, firmware, device inventory. Operates below Layer 2A. **OpenManage Enterprise** [DAPM: Retained] Infrastructure lifecycle management. Manages the chassis, not GPU workloads. **Dell CSI Operator** [DAPM: Retained] Dell's one K8s operator — storage provisioning, not compute orchestration. ### NVIDIA-Provided Components **GPU Operator + Network Operator + NIM Operator** Three of four K8s operators in the reference architecture are NVIDIA's. **NVIDIA Run:ai** GPU scheduling, quotas, fair-share. THIS IS the Layer 2A function. NVIDIA-acquired. **NVIDIA AI Enterprise** Commercial platform wrapping the full GPU orchestration and management stack. ### Gap Analysis Dell manages the rack. NVIDIA manages the GPU-aware substrate. That distinction matters because AI Factory differentiation depends less on whether the rack can be deployed and more on how scarce accelerated capacity is scheduled, partitioned, licensed, and governed at runtime. ClearML provides floating NVAIE license management — three authorities for one optimization function. ### Borrowed Judgment High. GPU-aware orchestration primitives are NVIDIA-controlled. Dell's authority is limited to physical chassis management (Layer 0), storage provisioning (Layer 1A), and deployment automation (Day 0/1). No alternative GPU scheduler exists within the Dell AI Factory. ### Working Notes ClearML is the most interesting independent Layer 2A play. If Dell wanted proprietary 2A capability, acquiring or deep-partnering ClearML would be the most direct path. 
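To make the Layer 2A function concrete (GPU scheduling, quotas, fair-share), here is a minimal sketch of integer max-min fair allocation across teams. This is a generic textbook technique, not Run:ai's or ClearML's actual algorithm. The function name `max_min_fair_share` and the team names in the usage example are hypothetical.

```python
def max_min_fair_share(total_gpus, demands):
    """Integer max-min fair allocation of GPUs (illustrative sketch).

    demands: dict of team -> GPUs requested; the request also acts as a cap.
    Returns: dict of team -> GPUs granted.
    """
    grants = {team: 0 for team in demands}
    remaining = total_gpus
    unsatisfied = {team for team, d in demands.items() if d > 0}

    while remaining > 0 and unsatisfied:
        # Split what is left evenly among teams still below their request.
        share = max(remaining // len(unsatisfied), 1)
        for team in sorted(unsatisfied):  # sorted for deterministic output
            if remaining == 0:
                break
            give = min(share, demands[team] - grants[team], remaining)
            grants[team] += give
            remaining -= give
        unsatisfied = {t for t in unsatisfied if grants[t] < demands[t]}

    return grants
```

With 8 GPUs and requests of 2, 6, and 6, the small request is met in full and the two large requests split the rest evenly. Whoever owns this logic owns the Layer 2A decision, which is the point of the Borrowed Judgment rating above.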
## ○ Layer 2B: Application Runtime & Execution *Model serving, agent execution, inference APIs, distributed inference* **Status:** Ceded to NVIDIA ### Vendor-Provided Components **Deskside Agentic AI** [DAPM: Ceded] Dell workstations + NVIDIA NemoClaw + Dell Services. Hardware and thermal engineering are Dell's. Runtime is entirely NVIDIA's. **Agentic AI Platform (Blueprints)** [DAPM: Delegated] Cohere North, DataRobot, ClearML blueprints. Dell provides hardware substrate and services. Agent orchestration is ISV-provided. **Accelerator Services for Agentic AI** [DAPM: Retained] Dell's human-delivered value: strategy, deployment, optimization. Services, not software. ### NVIDIA-Provided Components **NemoClaw (OpenClaw Stack)** Open-source agent runtime. Single-command install. Jensen: 'the operating system for personal AI.' **OpenShell** Sandboxed agent runtime with security/privacy controls. Spans deskside to data center. **NeMo Guardrails** Runtime safety boundaries — what agents are NOT allowed to do. Constraint enforcement, not placement. **Dynamo** Distributed inference framework. KV-aware routing to cache-warm nodes. Closest thing to a placement decision in the stack — but single-variable optimization. **NIMs + AI Enterprise** Containerized model serving + commercial platform. ### Gap Analysis Dell does not appear to own the core agent runtime, model-serving runtime, guardrail framework, or distributed inference framework in the NVIDIA AI Factory path. Its value is validation, packaging, integration, services, and partner curation. Dynamo's KV-aware routing is the closest thing to placement reasoning — but optimizes for cache locality, not multi-variable policy. ### Borrowed Judgment Total for runtime. Partially mitigated at blueprint level (Cohere/DataRobot/ClearML are swappable partners). Dell's one Retained asset is Accelerator Services — human expertise, not software. 
Open-source (OpenClaw) provides theoretical optionality but practical optimization is NVIDIA's. ### Working Notes Jensen's 'OS for personal AI' is a Layer 2B claim. An OS manages execution. A control plane manages placement and policy. ## ✕ Layer 2C: Agentic Infrastructure — The Reasoning Plane *Policy-driven placement and resource coordination — the Autonomy Layer* **Status:** Not Yet Evident ### Vendor-Provided Components **No Productized Dell-Owned Layer 2C Evident** [DAPM: Absent] Dell has governance claims and security controls. What is not yet visible is a Dell-owned control plane that makes policy-driven placement decisions across models, data, agents, and infrastructure. **Dell + Intel Control Plane (Signal Only)** [DAPM: Absent] SiliconANGLE (May 2026): Dell and Intel 'actively addressing' the AI factory governance gap. No product announced. Worth tracking. ### NVIDIA-Provided Components **AI-Q 2.0 Reference Architecture** Multi-agent workflow scaffolding. Does NOT make placement decisions. **OpenShell Governance** Runtime security sandboxing. Layer 2B constraint enforcement, not 2C placement reasoning. **Dynamo KV-Aware Routing** Performance-aware routing (single variable). Not multi-variable policy optimization. ### Gap Analysis Applying the 'Routing Is Not Reasoning' test: AI-Q 2.0 = workflow scaffolding. OpenShell/NeMo Guardrails = constraint enforcement. Dynamo = performance routing. None provides policy-driven decisions about where compute runs relative to data, which model serves which request, and how cost/compliance/latency are arbitrated in real time. ECI Research: 44% of enterprise AI leaders have only moderate confidence agents can act autonomously — rational without Layer 2C. ### Borrowed Judgment Inverted: there IS no judgment to borrow. The enterprise must build custom 2C logic (6-12 months), bring a partner (Kamiwaza, potentially Palantir Ontology), or operate without it. 
Most will choose option 3 — the gap isn't visible until production agentic workloads expose it. ### Working Notes Dave Vellante (theCUBE): 'The AI factory requires a new control plane — one that governs data, models and agents in real time.' That control plane is Layer 2C. Three vendors approaching from different directions: Dell (bottom-up), Google (top-down), VAST (middle-out). ## ◇ Layer 3 (+1): AI Application Layer — The Value Plane *AI-powered business capabilities — business logic, workflow automation* **Status:** Partner Ecosystem ### Vendor-Provided Components **Dell AI Ecosystem Program** [DAPM: Delegated] Structured ISV validation path. Partners: Google, Hugging Face, OpenAI, Palantir, Reflection, ServiceNow, SpaceXAI. **Dell Enterprise Hub (Hugging Face)** [DAPM: Delegated] Curated open-weight models on PowerEdge. DeepSeek, GLM, Kimi, Gemma, Nemotron, Mistral, Arcee. **Security Stack** [DAPM: Delegated] CrowdStrike + Fortanix + F5 + Intel confidential computing. Infrastructure security, not agent governance. ### NVIDIA-Provided Components **NemoClaw / OpenClaw Runtime** Execution surface for Layer 3 applications. NVIDIA provides substrate; ISVs provide business logic. ### Gap Analysis One of the strongest on-prem AI ecosystem stories in market. Each partner maps to a coherent use case. But each brings its own governance domain — Palantir Ontology governs within Palantir's domain, ServiceNow Otto within ServiceNow's. Nobody governs ACROSS domains on shared infrastructure. Security protects the platform from threats. Governance constrains what the platform does. Both are necessary. Only security is present. ### Borrowed Judgment Distributed across partners, which is architecturally correct at Layer 3. The structural problem: no cross-domain infrastructure judgment (Layer 2C) constrains all agents regardless of which ISV built them. ### Working Notes 5,000+ AI Factory customers (up from 3,000 at GTC). 
As they move to production agentic workloads, the multi-agent governance problem becomes visible. More ISV partners = more independent agent populations = more urgent need for Layer 2C. ════════════════════════════════════════════════════════════════════════════════ # Google Cloud AI Infrastructure Mapped to the 4+1 Layer AI Infrastructure Model **Version:** v1.0 — Draft, Editorial Review Pending **Date:** May 21, 2026 **Source:** Google Cloud Next 2026 (Apr 22–24), GTC 2026, NVIDIA partnership, Forrester, SiliconANGLE, The New Stack, analyst coverage ## Summary Finding Google Cloud is the only vendor in this assessment series that owns a frontier foundation model — and that single fact restructures the entire 4+1 analysis. Google built Gemini, trains Gemini on its own TPUs, optimizes its silicon for Gemini’s training requirements, and weaves Gemini’s intelligence into every layer of its cloud platform. This creates a model-integrated stack: an architecture where the frontier model is not a component plugged into infrastructure but the intelligence that pervades the infrastructure. No other vendor assessed possesses this vertical integration. Google owns every layer of the 4+1 model with proprietary IP: custom silicon (TPUs), custom networking (Virgo), proprietary storage (Colossus/Spanner/BigQuery), its own runtime and frameworks (JAX/Pathways), its own frontier models (Gemini), and a unified orchestration surface (Gemini Enterprise Agent Platform). The DAPM implication is not merely that every layer is Ceded — it is that every layer is ceded to a unified intelligence. With AWS, authority is distributed across multiple vendors’ judgment (AWS infrastructure, Anthropic model reasoning, ISV application logic). That distribution creates complexity but also structural checks. With Google Cloud + Gemini, the enterprise concentrates authority in one vendor’s judgment across every layer — from silicon to application. 
This is the deepest expression of vertical integration in enterprise technology since the mainframe era. The enterprise gains end-to-end optimization that no multi-vendor assembly can match. But the 4+1 framework makes visible what the integration obscures: the enterprise has no fallback position at any layer. Google Distributed Cloud (GDC) addresses data sovereignty without addressing judgment sovereignty — GDC still runs Google’s software stack and Google’s models.

The structural question: does concentrating all layers of authority and all layers of model judgment in a single vendor deliver enough value to justify the governance position — and has the enterprise made that concentration explicit rather than inheriting it by default?

## ● Layer 0: Compute & Network Fabric

*Raw compute, networking, and acceleration fabric*

**Status:** Ceded to Google

### Vendor-Provided Components

**Google Custom Silicon (TPUs)** [DAPM: Ceded] TPU 8t (training): 9,600-chip superpods, ~3x processing power vs Ironwood. TPU 8i (inference): 288GB HBM, 384MB on-chip SRAM, ~80% better perf/dollar. Designed for agentic AI, MoE models, large-scale RL. Google designs TPUs to train Gemini — the enterprise benefits from silicon optimized by one of the world’s most demanding ML workloads but does not direct the optimization priorities.

**NVIDIA GPUs on Google Cloud** [DAPM: Ceded] A5X bare-metal on Vera Rubin NVL72 (among first cloud providers to deploy). A4 Ultra (NVL72 preview Q2 2026), A3/A3 Mega/A3 Ultra. Fractional GPUs (G4 VMs, industry-first RTX PRO 6000 Blackwell vGPU). Scale: 80,000 Rubin GPUs single-site, 960,000 across multisite. Third-party models (Claude, Llama, Mistral) run on NVIDIA, not TPU — the playing field is structurally uneven.

**Virgo Network Fabric** [DAPM: Ceded] Purpose-built AI-optimized DC fabric. 134,000 TPU 8t chips connected at 47 Pb/s non-blocking bisection bandwidth per DC. 4x bandwidth per accelerator, 40% lower unloaded latency vs prior gen.
Also available for A5X. Designed for Gemini’s training topology. **Google Distributed Cloud (GDC)** [DAPM: Delegated] On-prem deployment of Google Cloud services. Connected and air-gapped configs. 4 racks to hundreds. NVIDIA Blackwell GPUs + Gemini Flash models on-prem. Managed GDC Provider initiative (Clarence, Gulf Energy, T-Systems, WWT). NATO deployment. Customer provides facility; Google provides and operates HW+SW. Addresses data sovereignty but not judgment sovereignty. ### NVIDIA-Provided Components **NVIDIA GPU Silicon** Vera Rubin NVL72, Blackwell B200/B300, H100/H200. 1M+ NVIDIA GPUs. NVIDIA instances serve third-party models that can’t run on TPU. **NVIDIA NIXL + Networking** NIXL for disaggregated inference. ConnectX/BlueField for GPU networking. Google manages the NVIDIA integration layer. ### Gap Analysis No Layer 0 capability gap — Google’s portfolio is the broadest of any single cloud provider. The gap is governance: the enterprise has no authority over any Layer 0 component beyond selecting instance types. The silicon-model feedback loop is structurally unique: Google designs TPUs to train Gemini, not primarily to sell cloud compute. TPU roadmap decisions reflect Gemini’s training topology, not enterprise customer workload requirements. The enterprise inherits optimization it didn’t direct. AWS’s Trainium is designed for customer workloads. NVIDIA designs for the broadest market. Google designs for Gemini and makes TPUs available to customers. The multi-accelerator matching problem (TPU vs NVIDIA vs Axion CPU) creates a workload-to-silicon decision that recurs per-workload in cloud vs once at procurement on-prem. No productized policy engine automates that matching. Fluid Compute (Layer 2A) begins to address it but doesn’t consult governance metadata. GDC follows the same inverted operating model as AWS AI Factories: Google operates infrastructure the customer houses. 
Unlike Dell PowerRack or HPE ProLiant (enterprise-owned hardware), GDC is Google-operated even when customer-hosted. ### Borrowed Judgment The silicon-model feedback loop: Google’s TPU roadmap is driven by Gemini’s training requirements. If Google decides TPU 9 should optimize for MoE architectures because that’s where Gemini is heading, every enterprise TPU workload inherits that architectural bet. Borrowed judgment at the silicon layer — a concept with no parallel in the Dell or HPE assessments. Virgo as borrowed network judgment: the enterprise inherits Google’s network optimization decisions without visibility or control. Cannot audit bandwidth sharing across tenants or prioritization of Google’s own Gemini training traffic. ### Working Notes The dual-architecture hedge (TPU + NVIDIA) gives Google pricing leverage and architectural independence. The enterprise benefits indirectly but does not control whether NVIDIA GPU instances remain first-class citizens as Google optimizes for its own silicon. ## ● Layer 1A: Data Storage & Governance *Durable, governed data foundation — the Governance Catalog that Layer 2C queries* **Status:** Ceded — Model-Powered Governance ### Vendor-Provided Components **Cloud Storage + Rapid Tier (Colossus)** [DAPM: Ceded] Standard object storage at planetary scale. Rapid tier uses Colossus — Google’s internal distributed storage platform (previously powering Search, Gmail, YouTube, Gemini training). Sub-millisecond read/write. The enterprise gets the same storage engine that holds Google’s training data. **Smart Storage** [DAPM: Ceded] Automatically analyzes unstructured data and generates metadata/context on ingest using Gemini. Auto-tags images, PDFs. First recursive point: Smart Storage uses Gemini to enrich the data that will eventually be served to Gemini-powered agents. The model enriches the data that feeds the model. **Knowledge Catalog (Gemini-Powered)** [DAPM: Ceded] Universal business context and governance. 
Gemini-powered semantic extraction, entity relationship mapping, dynamic context graph construction. Sub-second semantic search for agent retrieval. Aggregates metadata across BigQuery, AlloyDB, Spanner, Cloud SQL, Firestore, Looker. Third-party integrations (Atlan, Collibra, Datahub). Enterprise Connectivity federates context from Salesforce, Palantir, Workday, SAP, ServiceNow. Column-level lineage (GA). **BigQuery Storage** [DAPM: Ceded] Serverless columnar storage for structured/semi-structured. Managed Iceberg tables. Separates storage and compute. BigQuery spans storage, analytics, ML, and governance in a single service. ### Gap Analysis Knowledge Catalog with Smart Storage represents the most ambitious attempt to solve the metadata-to-agent grounding problem. No other vendor has a production system that automatically extracts business semantics from raw data, builds a context graph, and serves that context to agents in real-time with governance enforcement. The recursive dependency is the central finding: Knowledge Catalog uses Gemini to perform semantic extraction and build the context graph. When Knowledge Catalog determines what context an agent receives, and that context graph was built by Gemini, the enterprise is consuming Gemini’s judgment about what its own data means — at the governance layer, before any application-level inference occurs. No other vendor has this pattern. Dell’s MetadataIQ indexes deterministically. AWS’s Glue doesn’t use Nova to enrich its catalog. HPE’s Ezmeral doesn’t use a foundation model for data semantics. VAST’s catalog is storage-native, not model-powered. The 1A→2C connection: Knowledge Catalog is not a passive registry. It makes decisions about what context agents receive, how data assets are ranked for retrieval, and which governance policies apply. The context graph determines agent grounding — an orchestration function, not a storage function. Governance gap: Knowledge Catalog is optimized for GCP. 
Third-party integrations federate INTO Knowledge Catalog, not out of it. An enterprise running PowerScale + S3 + GCS cannot use Knowledge Catalog as a federated surface across all three without making GCP the metadata authority. ### Borrowed Judgment The context graph as borrowed judgment: when Knowledge Catalog builds entity relationships and business meanings, every agent that queries the graph inherits its representation of reality. If Gemini’s semantic extraction misclassifies a data asset, every agent grounded in that context acts on the incorrect interpretation. Borrowed judgment at the governance layer — before application-level reasoning. Smart Storage as ingest-time judgment: a single Gemini classification at ingest (‘this document is about Project X’) becomes a persistent governance fact. Limited mechanisms to audit or correct model-generated metadata at scale. Comparison to AWS: AWS classifies 1A as Delegated because Lake Formation enforces customer-defined policies. Google’s 1A is Ceded because Knowledge Catalog generates governance intelligence using Gemini. The enterprise on AWS retains governance judgment. The enterprise on Google inherits it. ### Working Notes The Layer 1A / 2C boundary question: Knowledge Catalog’s context graph is architecturally Layer 1A (data catalog) but functionally Layer 2C (determines agent grounding and context routing). Model-powered governance layers that make orchestration decisions are functionally 2C even when architecturally 1A. ## ● Layer 1B: Context Management & Retrieval *Low-latency retrieval for RAG — vector/hybrid search, context windows* **Status:** Ceded — Model-Powered Prep ### Vendor-Provided Components **BigQuery ML** [DAPM: Ceded] In-database ML training and inference. Supports linear/logistic regression, K-means, time series, XGBoost, DNNs, imported TF/PyTorch models. Collapses the boundary between data preparation (1B) and AI runtime (2B) by running ML directly in the warehouse. 
**Dataflow + Dataproc** [DAPM: Ceded] Managed Apache Beam (batch+stream) and managed Spark/Hadoop for large-scale processing. Open-source frameworks, Google-managed execution. **LookML Agent (Gemini-Powered, Preview)** [DAPM: Ceded] Derives semantic models from documentation using Gemini — automates the business logic capture that traditionally requires manual data engineering. Second Gemini recursion point: Gemini interprets what data MEANS at the business logic layer, generates the semantic model, and agents query through that model. If the interpretation of ‘revenue’ is subtly wrong, every agent inherits the error. **Data Agent Kit (Open-Source)** [DAPM: Delegated] MCP-based agents packaged as tools and skills. Supports Claude Code, Gemini CLI, Codex, VS Code. Enables intent-driven development: practitioners define goals, agents handle implementation. Creates governance recursion: the agent builds the pipeline that prepares the data that feeds the agent. **Vertex AI Feature Store** [DAPM: Ceded] Managed feature serving for online/offline ML models and agents. Consistent feature serving across training and inference. 1B→2B bridge. ### Gap Analysis No meaningful capability gap. Most mature Layer 1B in the assessment series. BigQuery ML eliminates data-to-model handoff. LookML Agent automates semantic model construction. Data Agent Kit enables agent-driven pipeline development. The gap is governance over model-generated data artifacts. When LookML Agent generates a semantic model, when Data Agent Kit writes a pipeline, when BigQuery ML trains a model — who reviews the output for correctness? These are AI-generated artifact governance problems. Google provides no productized capability for governing model-generated data artifacts at scale. Data Agent Kit’s explicit support for Claude Code and non-Google tooling is strategically significant — the one point in the stack where third-party model access is genuinely equal. 
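The governance gap named above ("who reviews the output for correctness?") can be sketched as a review gate: every model-generated artifact is registered with its provenance and stays unpromotable until someone signs off. This is a hypothetical illustration of the missing control, not a Google Cloud feature or API. `GeneratedArtifact`, `ArtifactRegistry`, and all field names are assumptions introduced for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ReviewStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class GeneratedArtifact:
    name: str
    kind: str          # e.g. "semantic_model", "pipeline", "ml_model"
    generated_by: str  # which model or agent produced it (provenance)
    status: ReviewStatus = ReviewStatus.PENDING
    reviewer: Optional[str] = None


class ArtifactRegistry:
    """Hypothetical registry that blocks promotion of unreviewed artifacts."""

    def __init__(self):
        self._artifacts = {}

    def register(self, artifact):
        self._artifacts[artifact.name] = artifact

    def review(self, name, reviewer, approve):
        artifact = self._artifacts[name]
        artifact.reviewer = reviewer
        artifact.status = (
            ReviewStatus.APPROVED if approve else ReviewStatus.REJECTED
        )

    def promotable(self):
        # Only artifacts with an explicit approval may reach production.
        return sorted(
            a.name
            for a in self._artifacts.values()
            if a.status is ReviewStatus.APPROVED
        )

    def pending(self):
        return sorted(
            a.name
            for a in self._artifacts.values()
            if a.status is ReviewStatus.PENDING
        )
```

The sketch is deliberately small: the hard part the assessment points at is doing this at scale, where the volume of generated semantic models and pipelines outpaces human review.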
### Borrowed Judgment The semantic model as borrowed judgment: when LookML Agent generates definitions, every analytics query and agent interaction using those definitions inherits Gemini’s interpretation of business logic. Powerful (automates weeks of manual semantic modeling) and risky (embeds model judgment in the analytical foundation). The pipeline-building agent as borrowed judgment: Data Agent Kit agents write Dataflow jobs and BigQuery transformations. The enterprise inherits the agent’s data engineering judgment — join strategies, filter logic, null handling. Previously human expertise, now model-generated. ### Working Notes The comparison to AWS SageMaker Unified Studio: AWS provides a single governed environment across services (service-wide integration). Google achieves integration through BigQuery spanning storage, analytics, ML, and governance (service-deep integration). Google = tighter integration at cost of BigQuery lock-in. AWS = service diversity at cost of integration complexity. ## ● Layer 1C: Data Movement & Pipelines *Move/transform data — caching, streaming, cross-cloud federation* **Status:** Ceded ### Vendor-Provided Components **Cloud Storage Rapid Tier (Colossus)** [DAPM: Ceded] Sub-millisecond caching layer between persistent storage and compute. Previously internal-only, now customer-accessible. **Managed Lustre** [DAPM: Ceded] 10 TB/s bandwidth (10x YoY, claimed 20x faster than other hyperscalers), 80 PB capacity. RDMA-enabled. Training data movement layer between Cloud Storage and TPU/GPU clusters. **Cross-Cloud Lakehouse (Preview)** [DAPM: Ceded] Agentic AI workflows access data across AWS and Azure without egress — querying data in place rather than copying. Eliminates ETL and cross-platform data movement costs. Extends compute to wherever data sits. Google’s query engine becomes the universal data access layer regardless of physical data location. 
**BigLake** [DAPM: Ceded] Unified data fabric across data lake (Cloud Storage) and warehouse (BigQuery). Single schema, multiple processing engines. Row/column-level governance via BigQuery Storage API across all access paths including open-source engines. Multi-cloud via BigQuery Omni. **Smart Storage Ingest Enrichment** [DAPM: Ceded] Data entering Google Cloud is auto-enriched by Gemini with metadata and context tags. Data is not just moved — it is interpreted on arrival. No parallel in other vendors’ data movement layers. ### Gap Analysis Cross-Cloud Lakehouse is the most strategically important Layer 1C capability in the assessment series. It represents a fundamentally different approach to data gravity: extend compute to wherever data sits rather than moving data to compute. The DAPM implication: if Cross-Cloud Lakehouse delivers, the enterprise doesn’t need to move data to GCP. That reduces data lock-in at storage. But it increases lock-in at the query layer — analytical capabilities depend on Google’s query engine reaching across clouds. Data sovereignty improves. Analytical sovereignty does not. The data movement / enrichment coupling: Smart Storage’s Gemini enrichment at ingest means Layer 1C (movement) and Layer 1A (governance) are coupled through model inference. Moving files into Cloud Storage triggers model inference generating metadata that propagates into the context graph. No other vendor couples data movement with model-powered enrichment. Cross-Cloud Lakehouse is in preview. Performance, cost, and governance characteristics at enterprise scale are unproven. ### Borrowed Judgment Cross-Cloud Lakehouse as borrowed query optimization: when Google’s engine optimizes a cross-cloud query, the enterprise inherits Google’s optimization judgment. Opaque and unchallengeable — cannot tune the cross-cloud query plan or audit how Google’s engine accesses data in a competing cloud provider’s storage. 
The enrichment coupling: data engineers moving files into Cloud Storage unknowingly trigger model inference. Convenient (automatic enrichment) and opaque (the engineer may not know Gemini is interpreting their data on arrival). ### Working Notes The asymmetry between data plane federation (Cross-Cloud Lakehouse) and control plane federation (absent at 2C) is a structural finding. Google invests in making data accessible across clouds but not in making agent governance portable across clouds. Data accessibility without orchestration portability draws workloads toward GCP as the governance center. ## ● Layer 2A: Infrastructure Orchestration *GPU scheduling, capacity management, autoscaling, sovereign deployment* **Status:** Ceded / Delegated (GDC) ### Vendor-Provided Components **GKE + GKE Agent Sandbox** [DAPM: Ceded] Managed K8s with AI-era extensions. Agent Sandbox: gVisor-based secure isolation, 300 sandboxes/second/cluster with sub-second time to first instruction. Infrastructure built for the agentic era, not retrofitted. **Fluid Compute** [DAPM: Ceded] GCE + GKE dynamically shifting workloads in real-time. CPUs for branchy agent logic, secure sandboxes, RL, SLM inference, RAG. GPU/TPU for training and large-model inference. Proto-Layer 2C: routes based on workload characteristics, not business context. **GDC (Sovereignty Analysis)** [DAPM: Delegated] On-prem Google Cloud services: GKE, Agent Platform, managed storage, Gemini Flash, Blackwell GPUs. Air-gapped for sensitive workloads. Addresses data sovereignty (where computation happens) but NOT judgment sovereignty (whose model drives computation). GDC runs Gemini on-prem — same recursive dependency, inside the enterprise perimeter. **Capacity Management** [DAPM: Ceded] CUDs (1yr/3yr), on-demand, preemptible/spot, Dynamic Workload Scheduler. Same pattern as AWS — capacity acquisition (2A), not workload placement (2C). 
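The Fluid Compute entry above describes routing on workload characteristics rather than business context. A minimal sketch of that kind of routing follows, assuming the CPU versus GPU/TPU split listed in the entry. All names here (`Workload`, `ComputeClass`, `route_by_characteristics`, the kind labels) are hypothetical illustrations, not Google APIs.

```python
from dataclasses import dataclass
from enum import Enum


class ComputeClass(Enum):
    CPU = "cpu"
    ACCELERATOR = "gpu_or_tpu"


@dataclass(frozen=True)
class Workload:
    name: str
    kind: str  # hypothetical label describing the workload's shape


# Split taken from the Fluid Compute description; the labels are assumptions.
CPU_KINDS = {"agent_logic", "sandbox", "rl", "slm_inference", "rag"}
ACCELERATOR_KINDS = {"training", "large_model_inference"}


def route_by_characteristics(workload: Workload) -> ComputeClass:
    """Pick a compute class from the workload kind alone."""
    if workload.kind in CPU_KINDS:
        return ComputeClass.CPU
    if workload.kind in ACCELERATOR_KINDS:
        return ComputeClass.ACCELERATOR
    raise ValueError(f"unknown workload kind: {workload.kind}")


planner = Workload(name="planner", kind="agent_logic")
trainer = Workload(name="finetune", kind="training")
print(route_by_characteristics(planner).value)
print(route_by_characteristics(trainer).value)
```

Nothing in the function's input carries data residency, compliance tags, or cost targets, which is why this style of routing stays on the 2A side of the 2A/2C boundary.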
### NVIDIA-Provided Components **NVIDIA GPU Operator (on GKE)** Available for NVIDIA instances on GKE. Google manages the GPU integration layer. ### Gap Analysis GKE is the most mature managed K8s for AI workloads. GKE Agent Sandbox has no equivalent in Dell, HPE, or AWS portfolios — 300 sandboxes/second is built for agentic workload density. Fluid Compute sits at the 2A/2C boundary. Its dynamic workload shifting is more than capacity acquisition — runtime decisions about which compute type serves which workload. But less than full 2C — routes on workload characteristics, not business context (data residency, compliance tags, cost targets). The Fluid Compute → Knowledge Catalog connection does not exist: workload placement does not consult governance metadata. Same Infrastructure Layer 2C gap every vendor has. GDC: the full sovereignty analysis reveals that data sovereignty ≠ judgment sovereignty. Knowledge Catalog on GDC uses Gemini. Smart Storage on GDC uses Gemini. Agent Platform on GDC uses Google’s runtime. The enterprise gains physical sovereignty but retains the same judgment concentration. The ‘self-driving cloud’ narrative implies Gemini-powered infrastructure operations — autonomous root-cause analysis on infrastructure telemetry. If the Reasoning Plane is itself Gemini-powered, the operational intelligence and application intelligence are the same intelligence. ### Borrowed Judgment GKE scheduling as borrowed judgment: enterprise inherits Google’s scheduling decisions. If Google uses Gemini-driven optimization, the enterprise borrows both traditional heuristics and model reasoning, without distinction. GDC as borrowed judgment in sovereign packaging: physical control over facility, Google’s judgment in software, model, governance, and operations. Data sovereignty with judgment concentration. Fluid Compute as proto-2C borrowed judgment: when Fluid Compute routes agent work to CPU vs GPU, that routing is Google’s judgment about optimal compute matching. 
Enterprise doesn’t configure the routing policy. ### Working Notes Data sovereignty vs judgment sovereignty: the 4+1 framework should distinguish between where data resides (which GDC addresses) and whose model’s reasoning shapes the AI system (which GDC does not address). An enterprise running GDC air-gapped has data sovereignty while fully ceding judgment sovereignty. ## ● Layer 2B: Application Runtime & Execution *Model serving, agent execution, inference APIs, frameworks* **Status:** Ceded — Model-Integrated Stack ### Vendor-Provided Components **Gemini Enterprise Agent Platform** [DAPM: Ceded] Unified platform for building, scaling, governing, optimizing agents. Subsumes all Vertex AI services. Agent Studio (low-code), ADK (code-first, Python/Go/Java/TypeScript, model-agnostic, open-source), Model Garden (200+ models incl Gemini, Claude, Llama, Gemma), Agent Runtime, Agent-to-Agent Orchestration, Agent Identity (GA), Agent Gateway, Agent Observability, Agent Registry, Memory Bank, Antigravity (desktop app + CLI). **The House Model Advantage** [DAPM: Ceded] Gemini on TPU: silicon designed for this model, networking designed for its topology, distributed runtime (Pathways) built for its coordination, inference optimized for its architecture, governance (Knowledge Catalog) powered by it, orchestration defaults to it. No other vendor achieves this degree of vertical optimization. Third-party models (Claude, Llama) run on NVIDIA GPUs — supported but not co-optimized. The playing field is structurally tilted. **Frameworks (JAX, Pathways, TorchTPU, vLLM)** [DAPM: Ceded] JAX: Google’s ML framework optimized for TPU. Pathways: distributed runtime for superpod-scale training. TorchTPU: full PyTorch support on TPUs (concession to ecosystem). vLLM: optimized across GPU + TPU. llm-d: open-source K8s-native inference serving (multi-vendor project). ### NVIDIA-Provided Components **NVIDIA GPU Instances** A3/A3 Mega/A3 Ultra/A5X for third-party model inference. 
CUDA ecosystem required for non-Gemini models. ### Gap Analysis Layer 2B is the center of gravity for the model-integrated stack. The model provider, runtime provider, and infrastructure provider are the same company. When the enterprise runs Gemini on Agent Platform on TPU, it borrows Google’s judgment at the model layer, runtime layer, framework layer, and silicon layer simultaneously. A single entity’s priorities shape the entire execution path. The Agent Platform collapses Layers 2B, 2C, and 3 into a single product surface: Agent Runtime (2B infrastructure), Agent Identity/Gateway/Registry/Orchestration/Observability (2C governance), Agent Studio/ADK/Antigravity (Layer 3 development). The product boundary does not align with the architectural boundary. AWS separates these into three distinct products: Bedrock (model access), AgentCore Runtime (agent execution), and AgentCore Policy (governance). AWS’s separation preserves architectural boundaries the enterprise can independently govern. Google’s collapse optimizes integration but prevents swapping the governance layer (2C) while keeping the runtime (2B). The NVIDIA dependency at 2B is optional for Gemini (TPU-native) but required for third-party models. The enterprise using Claude on Google Cloud pays a structural performance tax: Claude runs on NVIDIA GPUs through a runtime designed for Gemini. Model Garden’s 200+ models are API-equal but not silicon-equal. ### Borrowed Judgment The model-integrated runtime: an enterprise running Gemini on TPU inherits Google’s judgment at model layer (training data, alignment, safety), runtime layer (scheduling, scaling, session management), framework layer (JAX/Pathways optimization), and silicon layer (TPU architecture). Most concentrated borrowed judgment in the assessment series. The open-source hedge: llm-d, TorchTPU, vLLM, ADK provide genuine open alternatives.
Google opens components that reduce adoption friction (frameworks, SDKs) while keeping authority-concentrating components closed (Agent Runtime infrastructure, Agent Gateway, Pathways). Production deployment pulls open tools into Google’s managed surface where authority shifts from Retained to Ceded. ### Working Notes The 2B/2C collapse prevents the enterprise from independently governing the orchestration layer. An enterprise that wants Google’s Agent Registry and Agent Identity but AWS’s Bedrock for model access and its own governance engine for policy enforcement cannot compose that architecture. The components are bundled. ## ● Layer 2C: Agentic Infrastructure — The Reasoning Plane *Policy-driven placement and resource coordination — the Autonomy Layer* **Status:** Ceded — Productized but Captive ### Vendor-Provided Components **Agent Identity (GA)** [DAPM: Ceded] Agents as identity principals with authentication, authorization, audit. Control plane function: determines which agents exist as governed entities. **Agent Gateway** [DAPM: Ceded] Protocol-level governance for MCP and A2A communications. Security partner integrations (Broadcom, Check Point, Cisco, CrowdStrike, F5, Netskope, Okta, Palo Alto, Zscaler). Spans Layer 0 (networking), 2B (runtime), and 2C (orchestration). **Agent-to-Agent Orchestration** [DAPM: Ceded] Deterministic multi-agent workflow routing. Control plane function: determines which agent handles which subtask. **Agent Registry** [DAPM: Ceded] Catalog of agents with ownership, capabilities, protocols, invocation details. Administrator-controlled discoverability. Control plane function: determines which agents are available and who can use them. **Agent Observability** [DAPM: Ceded] Monitoring, tracing, debugging across production agent populations. Feedback loop for detecting faulty reasoning and intervening. ### NVIDIA-Provided Components **No NVIDIA Layer 2C Dependency** All Layer 2C components are Google IP. 
NVIDIA does not control governance, policy, or reasoning in Google’s stack. ### Gap Analysis Google’s Intelligence Layer 2C is the most complete productized offering in the assessment series: Agent Identity + Gateway + Registry + Orchestration + Observability + Memory Bank. Together they constitute a genuine control plane for agent governance. Infrastructure Layer 2C (the autonomous placement engine) is NOT built as a customer-configurable product. The capacity primitives (Fluid Compute, CUDs, DWS) are building blocks, but they don’t compose into a policy-driven placement engine querying Knowledge Catalog governance metadata. Same gap as AWS and every other vendor. Google’s implicit Layer 2C is the most sophisticated in the assessment: managed services make autonomous placement, scaling, routing, and capacity decisions invisibly. The enterprise cannot see, configure, audit, or override these decisions. The model-integrated Reasoning Plane: if Google’s ‘self-driving cloud’ uses Gemini for infrastructure decisions, then one model powers the enterprise’s agents (Layer 3), governs agent orchestration (Intelligence 2C), and decides where agents run (Infrastructure 2C). One model’s judgment pervades every decision surface. Cross-cloud orchestration gap: Google federates the data plane (Cross-Cloud Lakehouse) but NOT the control plane. Agent Platform governs GCP agents only. An enterprise running agents across multiple clouds has no cross-platform agent governance surface unless all agents route through Google’s Agent Gateway, which cedes cross-cloud governance to Google. The captive-but-best dilemma: Google’s Intelligence 2C is evidence the control plane CAN be built as a coherent capability. The enterprise architect who wants it has one option: adopt Google Cloud. The federated alternative does not exist.
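The Gap Analysis above says the capacity primitives do not compose into a policy-driven placement engine that queries governance metadata. As a rough illustration of what such an engine would do, here is a minimal sketch under stated assumptions. `GovernanceRecord`, `Site`, and `place` are hypothetical names, and the in-memory record stands in for a catalog lookup. No vendor API is implied.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class GovernanceRecord:
    """Hypothetical governance metadata for the data a workload touches."""
    allowed_regions: frozenset
    compliance_tags: frozenset


@dataclass(frozen=True)
class Site:
    """Hypothetical candidate location for running the workload."""
    name: str
    region: str
    certifications: frozenset
    cost_per_hour: float


def place(record: GovernanceRecord,
          sites: Sequence[Site],
          max_cost_per_hour: float) -> Optional[Site]:
    """Return the cheapest site that satisfies residency, compliance, and cost."""
    eligible = [
        s for s in sites
        if s.region in record.allowed_regions
        and record.compliance_tags <= s.certifications  # all tags covered
        and s.cost_per_hour <= max_cost_per_hour
    ]
    if not eligible:
        return None  # no compliant placement; escalate to a human decision
    return min(eligible, key=lambda s: s.cost_per_hour)


record = GovernanceRecord(frozenset({"eu-west"}), frozenset({"gdpr"}))
sites = [
    Site("cheap-us", "us-east", frozenset({"gdpr"}), 1.0),
    Site("eu-a", "eu-west", frozenset({"gdpr", "iso27001"}), 3.0),
    Site("eu-b", "eu-west", frozenset(), 2.0),
]
chosen = place(record, sites, max_cost_per_hour=5.0)
print(chosen.name if chosen else "no compliant site")
```

The point of the sketch is the input, not the algorithm: placement becomes a Layer 2C function once the decision consumes governance metadata and an enterprise-owned policy rather than workload shape alone.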
### Borrowed Judgment The captive control plane: enterprise inherits Google’s orchestration model — deterministic routing, Google-managed identity, Google-governed protocols. Well-engineered but unchallengeable — cannot substitute alternative orchestration logic within the Agent Platform boundary. The model-powered control plane: if the Reasoning Plane uses Gemini for infrastructure decisions, a model judgment error at the control plane layer is invisible to the enterprise, with no fallback to human decision-making. Intelligence 2C: Low borrowed judgment in the sense that the components are productized and configurable. High borrowed judgment in the sense that the governance logic itself (Agent Gateway protocol decisions, Agent Identity authentication model, Orchestration routing patterns) is Google’s, not the enterprise’s. ### Working Notes Google’s 2C proves the Control Plane Working Notes thesis: the control plane can be built. The question is whether it can be liberated from the vendor boundary — and whether the model-integrated dimension (control plane powered by the same model it governs) is a pattern to replicate or to avoid. The asymmetry: data plane federates (Cross-Cloud Lakehouse), control plane does not. This serves Google’s strategic interest — data accessibility without orchestration portability draws workloads toward GCP as governance center. ## ● Layer 3 (+1): AI Application Layer — The Value Plane *AI-powered business capabilities — models, agents, business logic* **Status:** Open Model Layer, Captive Platform ### Vendor-Provided Components **Gemini Model Family** [DAPM: Ceded] Gemini 3.1 Pro, Gemini 3.5, Gemini Flash. The model the entire stack was designed around. Gemma open-weight models for self-hosting (the one offering where enterprise can Retain model authority). **Model Garden (200+ Models)** [DAPM: Delegated] Gemini, Claude Opus/Sonnet/Haiku, Meta Llama, Gemma, open-source models. Model Evaluation service. 
Broadest model catalog of any cloud provider. Model-agnostic claim genuine at Layer 3 — more so than any other layer. **Application Surfaces** [DAPM: Retained / Delegated] Agent Studio (low-code), Agent Designer (no-code in Gemini Enterprise app), ADK (code-first, open-source, model-agnostic). Gemini Enterprise app for agent discovery and the Deep Research agent. **Consumer-Enterprise Feedback Loop** [DAPM: Ceded] Gemini powers Google Search, Gmail, Docs, Photos, Android, Chrome. Workspace Intelligence uses Gemini for agentic work. Model improvements from billions of consumer interactions directly benefit enterprise workloads. But: consumer-driven alignment and safety tuning may not align with enterprise needs. **Google Antigravity 2.0 (Agent-First Development Platform)** [DAPM: Ceded] Announced I/O 2026. Standalone desktop app + CLI + SDK — a full developer platform built around agent orchestration. Multi-agent parallel execution: orchestrate multiple agents and execute tasks simultaneously. Dynamic subagent workflows and scheduled background automation. Antigravity CLI (Go-based, replacing Gemini CLI — deprecated June 18, 2026) for terminal-native multi-agent workflows. Antigravity SDK for building custom agents with templates in AI Studio. Powered by Gemini 3.5 Flash (co-developed using Antigravity). Native voice command support. Ecosystem integrations: Google AI Studio, Android, Firebase. Export tool for AI Studio → local development. Search integration: real-time custom UI generation within Google Search answers. AI Ultra plan ($100/month, 5x usage limits). Google's most aggressive move in the agentic coding market — positioned as the hub for multi-agent development workflow orchestration, not just code assistance. ### NVIDIA-Provided Components **NVIDIA Models via Model Garden** NVIDIA Nemotron and other NVIDIA models available alongside all other providers. ### Gap Analysis No meaningful capability gap. Broadest model catalog. 
Most portable agent development framework (ADK). Application surfaces from no-code through code-first. Google deliberately keeps Layer 3 more open than any other layer — while ensuring every Layer 3 application is gravitationally pulled toward Agent Platform (2B/2C). By keeping Layer 3 open, Google maximizes platform adoption: enterprises wanting Claude on Google Cloud still consume Agent Platform’s runtime, identity, gateway, registry, observability. The model is portable; the platform is captive. Consistent with the 4+1 model’s prediction that vendor lock-in concentrates at Layer 2B/2C, not Layer 3. Google has understood this prediction and built strategy accordingly. Code portability vs operational portability: ADK is open-source and model-agnostic — agent code CAN run on AWS or on-prem K8s. But Agent Registry, Memory Bank, Agent Identity, Agent Gateway, Agent Observability are Google Cloud services with no portable equivalents. Agent code is an asset the enterprise owns. Agent operations are an asset it rents. Antigravity 2.0 deepens the Layer 3 gravitational pull toward Google's platform. The desktop app + CLI + SDK creates a development surface that integrates directly with Agent Platform (2B/2C): agents built in Antigravity inherit Agent Platform's identity, gateway, registry, and observability. The Gemini CLI deprecation (June 18, 2026) forces migration to Antigravity CLI — consolidating Google's developer AI surface into one opinionated platform. The SDK enabling custom agent templates in AI Studio means Antigravity is not just a coding tool but an agent construction platform that feeds directly into the Gemini Enterprise Agent Platform. Compare to AWS Kiro (spec-driven, methodology-opinionated, Bedrock-native) and GitHub Copilot (IDE-embedded, multi-model, GitHub-native). Google's differentiator is multi-agent parallel orchestration — Antigravity coordinates multiple agents simultaneously rather than single-agent sequential interaction. 
This maps to the 4+1 model's Layer 2C vision: orchestrating multiple agents is a control plane function that Antigravity surfaces through a developer tool. The consumer-enterprise feedback loop extends to Antigravity: Google is using Antigravity's capabilities in consumer Search to generate real-time custom UIs as part of search answers. Developer tool innovations flow to consumer products and back — a flywheel no other vendor in the assessment possesses. ### Borrowed Judgment Gemini as borrowed Layer 3 judgment: alignment changes affect agents (Layer 3), governance enrichment (Layer 1A via Knowledge Catalog), semantic models (Layer 1B via LookML Agent), and potentially infrastructure operations (Layer 2C via self-driving cloud). A single alignment decision propagates across the entire model-integrated stack. Platform defaults: Agent Studio and Agent Designer default to Gemini. Enterprise that adopts without explicitly selecting alternatives inherits Google’s model preference as a default rather than a decision. Strategic openness as borrowed judgment about lock-in location: Google’s decision to keep Layer 3 open and concentrate lock-in at 2B/2C is itself borrowed judgment the enterprise inherits. Evaluating Google on model diversity without evaluating platform captivity accepts Google’s framing of where portability matters. ### Working Notes The consumer-enterprise feedback loop has no parallel in the assessment. Model improvements from billions of consumer interactions benefit enterprise workloads — but consumer-driven alignment may constrain enterprise use cases. If Google tightens content policies for consumer safety, enterprise agents inherit that tightening. The Gemini CLI → Antigravity CLI forced migration is a significant authority move. Over 100,000 GitHub stars on Gemini CLI — all those developers must migrate to Antigravity by June 18, 2026. This concentrates Google's developer AI surface into one platform and one billing model (AI Ultra at $100/month). 
The deprecation timeline is aggressive but consistent with Google's pattern of consolidating developer tools around Gemini. Antigravity 2.0's scheduled tasks capability (agents running automatically in the background) converts the developer tool from a single-turn interaction to a persistent automation pipeline. This blurs the boundary between Layer 3 (application) and Layer 2C (orchestration) — when Antigravity schedules background agents to perform tasks autonomously, who governs those agents? The answer is Agent Platform — reinforcing the Layer 2C gravitational pull. ════════════════════════════════════════════════════════════════════════════════ # HPE AI-Native Infrastructure Mapped to the 4+1 Layer AI Infrastructure Model **Version:** v1.0 — Draft, Editorial Review Pending **Date:** May 21, 2026 **Source:** GTC 2026, HPE GreenLake/storage May 2026 announcements, Discover 2025, Juniper acquisition, Town of Vail whitepaper, analyst coverage ## Summary Finding HPE presents the most structurally interesting comparison to Dell in the 4+1 model because it makes genuine software authority claims that Dell does not. Three capabilities differentiate HPE’s architectural position: GreenLake Intelligence (agentic AI mesh using domain-specific LLMs via MCP — HPE-owned Layer 2A/2C for IT operations), the $14B Juniper Networks acquisition (full networking IP stack from silicon to software — Retained Layer 0 networking authority that Dell entirely delegates), and the Unleash AI program with Kamiwaza as the chosen Layer 2C orchestration partner (validated in production at Town of Vail). The AI workload runtime (Layer 2B) remains structurally dependent on NVIDIA AI Enterprise, branded as ‘NVIDIA AI Computing by HPE.’ HPE co-engineers more deeply than Dell — Private Cloud AI is a jointly developed product — but the DAPM implication is the same: Layer 2B model execution authority is Ceded. 
However, HPE brackets the NVIDIA-controlled Layer 2B with HPE-owned governance above (GreenLake Intelligence at 2A/2C) and below (GreenLake platform at 2A), giving the enterprise governance authority even though it doesn’t control the runtime itself. Kamiwaza’s capabilities span multiple layers — context orchestration (1B), governed data pipelines (1C), agent execution coordination (2B), and decision authority placement (2C) — making it a multi-layer platform, not a point solution. The Town of Vail deployment serves as a by-proxy assessment of Kamiwaza’s capabilities across this full span. HPE’s DAPM classification for Kamiwaza-provided functions is Delegated — structurally superior to Dell’s Absent Layer 2C. The Cray supercomputing heritage gives HPE a sovereign AI positioning that Dell cannot match — exascale systems for Argonne, HLRS, HammerHAI (EU AI Factory). This is a differentiated Layer 0 capability with implications for sovereign data governance at Layer 1A. HPE has one of the most credible on-prem AI infrastructure stacks in the market. Its credibility comes from genuine software authority (GreenLake Intelligence, Data Fabric, OpsRamp), owned networking IP (Juniper/Aruba), sovereign compute heritage (Cray), and a structured ecosystem model (Unleash AI) that deliberately addresses Layer 2C through a chosen partner rather than leaving it absent. ## ● Layer 0: Compute & Network Fabric *Raw compute, networking, and acceleration fabric* **Status:** HPE Strength ### Vendor-Provided Components **HPE ProLiant Compute Gen12** [DAPM: Retained] Intel Xeon 6 / AMD EPYC. HPE iLO management silicon (HPE-owned). Improved perf/watt, security. Foundation for Private Cloud, Private Cloud AI, standalone. **HPE Cray EX4000/GX5000** [DAPM: Retained] Exascale-class supercomputing. GX5000 unifies AI+HPC. Cray Slingshot interconnect. Liquid-cooled blade (GX240) with up to 16 NVIDIA Vera CPUs, 640 per rack. Deployed at Argonne, HLRS, HammerHAI. 
**HPE Cray Direct Liquid Cooling** [DAPM: Retained] Proprietary DLC supporting up to 400kW per rack with warm water operation. 100% DLC across GX5000 blades. As Blackwell/Vera Rubin density increases, cooling becomes the physical constraint, and the Cray heritage is a genuine differentiator. **HPE Juniper Networking ($14B, July 2025)** [DAPM: Retained] Full IP stack: Junos OS, MX routers, QFX switches, SRX firewalls, Mist AI-native ops, Apstra intent-based DC automation. Networking revenue up 151.5% YoY to $2.7B in Q1 FY2026. DC networking revenue up 380%+. HPE now owns Layer 0 networking authority. **HPE Cray Slingshot 400 Interconnect** [DAPM: Retained] HPE-owned high-performance interconnect delivering 400 Gbps at scale with ultra-low tail latency for AI workloads. Distinct from NVIDIA InfiniBand: this is HPE networking IP for the supercomputing fabric. Complements Juniper (data center) and Aruba (campus/edge) for a three-tier HPE-owned networking portfolio. **HPE Aruba Networking** [DAPM: Retained] Campus and edge networking with AI-native Central platform. Being retooled on GreenLake Intelligence agentic mesh. Complementary to Juniper’s DC focus and Slingshot’s HPC fabric. **HPE AI Factory (At-Scale + Sovereign)** [DAPM: Retained] Full-stack AI infra: compute, GPUs, networking, liquid cooling, software, services. Blackwell (RTX PRO 6000 now) through Vera Rubin NVL72 (Dec 2026). Multi-tenancy via MIG with GPU passthrough (Spring 2026). Air-gapped configs for sovereign. NVIDIA Cloud Partner endorsed. STIG-hardened, FIPS-enabled. **Silicon Agnosticism (GX5000)** [DAPM: Retained] GX5000 supports NVIDIA AND AMD GPUs in the same rack architecture: GX440n blade (4 Vera CPUs + 8 Rubin GPUs), GX350a blade (1 AMD Venice CPU + 4 AMD MI430X GPUs), GX250 blade (8 AMD Venice CPUs, CPU-only). Per-rack density: 192 Rubin GPUs (24 GX440n blades × 8 GPUs) or 112 MI430X GPUs (which at 4 GPUs per GX350a blade implies 28 blades, not 24). Neither Dell nor VAST offers multi-GPU-vendor blades in the same platform.
**HPE Cray K3000 Storage System** [DAPM: Retained] First factory-built offering with embedded DAOS (Distributed Asynchronous Object Storage). Purpose-built I/O acceleration for AI/HPC workloads. Ships early 2026. Complements Alletra at the supercomputing tier. ### NVIDIA-Provided Components **NVIDIA GPU Silicon** RTX PRO 6000 Blackwell now. Vera Rubin NVL72 (72 Rubin GPUs, 36 Vera CPUs, NVLink, ConnectX-9, BlueField-4) Dec 2026. All AI acceleration depends on NVIDIA silicon. **NVIDIA Networking (InfiniBand, Spectrum-X)** Quantum-X800 InfiniBand for Cray GX5000 (144 ports, 800 Gb/s, 2027). ConnectX-9 SuperNICs, BlueField-4 DPUs, NVLink 6th-gen. Competes with HPE’s own Slingshot/Juniper/Aruba in AI fabric — structural tension. **NVIDIA Mission Control** AI Factory at-scale management planned for later 2026. GPU cluster operations, scheduling, resource allocation. HPE AI Factory will support Mission Control for large-scale deployments. ### Gap Analysis Layer 0 has the most Retained DAPM components of any layer in the HPE assessment. Four structural characteristics differentiate HPE’s Layer 0 position: (1) HPE owns networking end-to-end post-Juniper — silicon, OS, management. Plus Slingshot 400 for HPC fabric and Aruba for campus/edge. Three tiers of HPE-owned networking. Dell brands NVIDIA Spectrum silicon. VAST depends on OEM networking. (2) Silicon agnosticism: GX5000 supports NVIDIA Rubin AND AMD MI430X in the same rack architecture. Dell’s AI Factory is NVIDIA-only (AMD under separate ‘Dell AI Platform with AMD’ branding). VAST is NVIDIA-only. (3) Cray heritage positions HPE for sovereign AI — national labs (Argonne), EU AI Factories (HammerHAI), government deployments where the entire stack must be traceable. (4) Cray DLC supports 400kW per rack with warm water — proprietary cooling IP. Structural tension: NVIDIA InfiniBand competes with HPE’s Slingshot in the HPC/AI interconnect. 
In practice, InfiniBand dominates Ethernet-adjacent environments while Slingshot targets Cray supercomputing deployments. Juniper/Aruba handles east-west and north-south data center networking. Three fabrics, three use cases, two authorities (HPE + NVIDIA). The Cray K3000 with embedded DAOS adds a storage capability at the supercomputing tier that Dell Exascale/Lightning FS and VAST DataStore do not provide in a factory-built form factor. ### Borrowed Judgment Moderate. GPU silicon is fully Ceded to NVIDIA, as for every vendor. HPE retains authority across compute packaging (ProLiant, Cray), networking (Juniper, Aruba, Slingshot — three owned fabrics), cooling (Cray DLC), and HPC storage (K3000/DAOS). Silicon agnosticism (NVIDIA + AMD in GX5000) provides a GPU vendor hedge that Dell and VAST do not currently offer. The Juniper acquisition changes the Layer 0 DAPM comparison: Dell brands NVIDIA Spectrum switches. VAST uses OEM servers with NVIDIA NICs. HPE owns networking IP from silicon to software. This is a structural difference, not a quality judgment — the enterprise architect should evaluate whether owned networking authority matters for their specific deployment. ### Working Notes HPE Compute XD700 (OCP-inspired AI server on NVIDIA HGX Rubin NVL8, liquid-cooled, early 2027) targets neoclouds and service providers. Similar positioning to Dell’s PowerRack but with OCP design philosophy. The three-tier networking portfolio (Slingshot for HPC, Juniper for DC, Aruba for campus/edge) is unique among the vendors assessed. The integration risk is real (three platforms, three management tools) and the authority position should be evaluated against that complexity. Argonne, HLRS, HammerHAI (EU AI Factory), Hudson River Trading, and KISTI are named Cray GX5000 customers — reflecting a sovereign and hyperscale customer profile distinct from Dell’s enterprise-focused AI Factory base. 
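The per-rack GPU counts quoted in the Silicon Agnosticism entry for this layer can be sanity-checked with simple arithmetic. The sketch below uses only the blade configurations stated in that entry. The helper names are hypothetical.

```python
def gpus_per_rack(blades_per_rack: int, gpus_per_blade: int) -> int:
    """Total GPUs in a rack populated with identical blades."""
    return blades_per_rack * gpus_per_blade


def blades_for(total_gpus: int, gpus_per_blade: int) -> int:
    """Blade count implied by a quoted per-rack GPU total."""
    blades, remainder = divmod(total_gpus, gpus_per_blade)
    if remainder:
        raise ValueError("total is not a whole number of blades")
    return blades


# GX440n: 8 Rubin GPUs per blade, 24 blades per rack.
rubin_total = gpus_per_rack(24, 8)

# GX350a: 4 MI430X GPUs per blade.
mi430x_at_24_blades = gpus_per_rack(24, 4)
mi430x_blades_for_112 = blades_for(112, 4)

print(rubin_total, mi430x_at_24_blades, mi430x_blades_for_112)
```

The Rubin figure matches 24 blades per rack. The 112 MI430X figure is only consistent with 28 GX350a blades per rack, so the two GPU totals should not be read as sharing a single 24-blade limit.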
## ◑ Layer 1A: Data Storage & Governance *Durable, governed data foundation — the Governance Catalog that Layer 2C queries* **Status:** Solid ### Vendor-Provided Components **HPE Alletra Storage MP X10000** [DAPM: Retained] Disaggregated, all-flash, scale-out. Native file + object on single platform. 16 nodes, 23PB raw. 100% availability guarantee. RDMA-enabled for AI pipeline optimization across training, inference, KV cache. 2.5 PB/hr backup ingest. **HPE Alletra Storage MP B10000** [DAPM: Retained] Mission-critical block storage. 6-controller-node scaling (50% more perf vs 4-node). Dual-node fault tolerance. 5:1 data reduction guarantee. Real-time agentic support (v10.6.0): coordinated specialized AI agents for semantic understanding, adaptive reasoning, and prescriptive intelligence. Agents draw from system telemetry metadata, best practices, and accumulated product knowledge across installed base. Moves beyond signature-based predictive analytics and pattern matching. **HPE Data Fabric Software v8.1 (Ezmeral)** [DAPM: Retained] Policy-based data placement and movement (tiering) across hybrid environments. Conversational interface and agentic AI assistant for natural language access to global namespace. Enhanced metadata integration for visibility, classification, lineage. Apache Polaris catalog support for Iceberg tables — consistent governance and compliance across platforms. Real-time S3-to-S3 object movement between any S3-compatible storage systems — AI teams can ingest from external S3 sources into governed Data Fabric environment without manual batch transfers. **HPE Zerto Software** [DAPM: Retained] Continuous data protection, AI-powered assistant, Microsoft Defender integration, live VMware-to-HPE VM migration. Near-zero RPO/RTO. ### NVIDIA-Provided Components **GPU-Accelerated Storage Integration** RDMA via CX-8/CX-9 SuperNICs for GPU-direct storage access. Same acceleration Dell and VAST also use. 
### Gap Analysis HPE’s Layer 1A is a capable storage foundation with genuine HPE-owned governance intelligence. Three characteristics position it in the 4+1 model: First, the B10000’s agentic support architecture (v10.6.0) goes beyond predictive analytics into semantic understanding and adaptive reasoning — a coordinated set of specialized AI agents drawing from telemetry metadata and accumulated product knowledge. This is HPE-owned intelligence at the storage layer, architecturally aligned with GreenLake Intelligence’s domain-specific agent model. Dell’s storage management is infrastructure monitoring (CloudIQ, MetadataIQ indexing). VAST’s Element Store enriches metadata inline at write time. Three different approaches to storage intelligence. Second, Data Fabric v8.1 with Apache Polaris catalog for Iceberg tables provides cross-platform governance that participates in open-standard ecosystems. Dell’s MetadataIQ indexes within Dell storage boundaries. VAST’s Catalog indexes within the VAST namespace. HPE’s Polaris support means governance metadata is portable across platforms — a federated approach vs Dell’s and VAST’s platform-bounded approaches. Third, Data Fabric’s real-time S3-to-S3 object movement enables AI data ingestion from any S3-compatible source into the governed Data Fabric environment. This addresses the heterogeneous enterprise data ingestion problem — similar in function to VAST’s SyncEngine (which ingests from Google Drive, Jira, Confluence, S3) but operating at the storage protocol level rather than the application API level. X10000’s unified file+object on one platform reduces the number of storage engines vs Dell’s portfolio approach (PowerScale for file, ObjectScale for object, Exascale for combined). VAST’s Element Store goes further by collapsing file, object, table, and vector into a single data structure. HPE’s consolidation is at the platform level; VAST’s is at the data structure level. ### Borrowed Judgment Low to moderate. 
HPE owns storage platforms (Alletra X10000, B10000), Data Fabric software, and Zerto outright. GPU acceleration for storage I/O depends on NVIDIA networking silicon (CX-8/CX-9), but the storage intelligence — policy engine, metadata, agentic management agents — is HPE IP. Apache Polaris support is a deliberate governance strategy: by using an open standard for metadata catalog, HPE reduces governance vendor lock-in for its customers. Compare to VAST, where the governance catalog is proprietary (Ceded to VAST). The trade-off: HPE’s open-standard approach is more portable but less deeply integrated; VAST’s proprietary approach is tightly integrated but less portable. ### Working Notes Commvault and Veeam partnerships add data resilience capabilities (Delegated partners at Layer 1A). The agentic support in B10000 is distinct from GreenLake Intelligence: B10000 agents are storage-domain specialists drawing from storage telemetry and product knowledge. GreenLake Intelligence agents are cross-domain (networking + storage + compute). The two agent architectures are designed to complement each other — B10000 agents resolve storage-specific issues autonomously while GreenLake Intelligence correlates cross-domain patterns. Whether these agent systems actually interoperate via MCP or operate independently is an open question. The Data Fabric’s real-time S3 ingestion capability addresses a practical enterprise challenge: AI teams need to pull data from diverse S3-compatible sources (AWS, MinIO, other object stores) into a governed environment for AI pipeline consumption. This is not a differentiating capability on its own (any S3-compatible system can ingest from S3) but the governance integration — data lands in the Data Fabric namespace with policy-based placement and lineage tracking — is the value. 
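The governance integration described in the Working Notes above (data lands in a governed namespace with policy-based placement and lineage tracking) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `PlacementPolicy`, `GovernedNamespace`, `ingest_object`, the tier names, and the tag vocabulary are hypothetical and are not HPE Data Fabric APIs.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class PlacementPolicy:
    """Hypothetical rule: objects carrying `tag` land in `tier` within `region`."""
    tag: str
    tier: str
    region: str


@dataclass
class LineageRecord:
    """What a governed namespace might record for each ingested object."""
    source_uri: str
    target_path: str
    policy_tag: str
    ingested_at: str


@dataclass
class GovernedNamespace:
    policies: list
    default_tier: str = "capacity"
    default_region: str = "any"
    lineage: list = field(default_factory=list)

    def ingest_object(self, source_uri: str, key: str, tags: set) -> str:
        # First matching policy wins; untagged data falls back to the defaults.
        policy = next((p for p in self.policies if p.tag in tags), None)
        tier = policy.tier if policy else self.default_tier
        region = policy.region if policy else self.default_region
        target = f"/{region}/{tier}/{key}"
        # Every ingest leaves a lineage entry, whichever policy applied.
        self.lineage.append(LineageRecord(
            source_uri=source_uri,
            target_path=target,
            policy_tag=policy.tag if policy else "default",
            ingested_at=datetime.now(timezone.utc).isoformat(),
        ))
        return target


ns = GovernedNamespace(policies=[
    PlacementPolicy(tag="pii", tier="encrypted-flash", region="eu"),
    PlacementPolicy(tag="training", tier="flash", region="any"),
])
print(ns.ingest_object("s3://external-bucket/deeds/1987.pdf", "deeds/1987.pdf", {"pii"}))
print(ns.ingest_object("s3://external-bucket/logs/a.json", "logs/a.json", set()))
```

The point of the sketch is that the copy itself is the unremarkable part; the placement decision and the lineage entry written alongside it are what make the ingest governed.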
## ◑ Layer 1B: Context Management & Retrieval *Low-latency retrieval for RAG — vector/hybrid search, context windows* **Status:** Delegated ### Vendor-Provided Components **HPE Ezmeral Data Fabric (Retrieval Surface)** [DAPM: Retained] Global namespace with conversational access for AI-driven retrieval. Federates data across hybrid environments. Natural language queries against namespace. The discovery layer that retrieval pipelines query — Data Fabric knows where data is and what policies govern it. **HPE Alletra X10000 RDMA Storage** [DAPM: Retained] Low-latency file and object access for AI inference pipelines. RDMA via CX-8/CX-9 reduces retrieval latency for RAG. KV cache storage support for inference state persistence. The storage substrate that retrieval reads from. **Kamiwaza Context Orchestration (via Unleash AI)** [DAPM: Delegated] In Town of Vail: manages the full context pipeline for document-centric use cases. Identifies documents requiring processing (Section 508 compliance), ingests and extracts content (housing deeds), prepares contextual inputs for agent consumption. Determines what context each agent needs, from which sources, under what governance constraints. This is retrieval orchestration — above the storage layer, below the agent runtime. Governs cross-departmental context routing (legal, housing, admin). ### NVIDIA-Provided Components **NVIDIA NeMo Retriever** Embedding models (NV-EmbedQA-E5-v5, Mistral7B-v2, Arctic-Embed-L) and reranking in unified microservice. GPU-accelerated retrieval for RAG pipelines on Private Cloud AI. Provides the embedding intelligence that HPE’s storage does not. **NVIDIA AI-Q Blueprint** Research assistant and enterprise data agent blueprint. Connects enterprise data to AI agents via retrieval pipelines. Available on Private Cloud AI. **NVIDIA RAG Blueprint + Milvus** HPE’s reference RAG architecture uses NeMo Retriever for embedding, Milvus (open-source) for vector database, LangChain for chain serving. 
HPE does not own any retrieval intelligence component — it provides storage substrate and deployment platform.

### Gap Analysis

Layer 1B is HPE’s thinnest proprietary layer. HPE provides storage infrastructure (Data Fabric namespace, Alletra RDMA) but does not own a vector database, an embedding engine, or a retrieval framework. The retrieval intelligence stack is entirely NVIDIA (NeMo Retriever) + open source (Milvus, LangChain).

The three-vendor comparison at Layer 1B:

• Dell: storage (PowerScale/ObjectScale) + Elastic (search intelligence, Delegated ISV) + NVIDIA (cuVS acceleration). Three authorities.

• HPE: storage (Alletra/Data Fabric) + NVIDIA (NeMo Retriever, embedding) + open source (Milvus, LangChain). No proprietary retrieval intelligence. When Kamiwaza is added via Unleash AI, it provides governed retrieval orchestration above the storage and embedding layers.

• VAST: storage + embedding + vector search + retrieval pipeline all in one platform (InsightEngine, native vector search, DataBase). One authority.

HPE’s retrieval gap is structural: the company has no analog to Dell’s Elastic partnership or VAST’s native InsightEngine. This is a deliberate architectural choice — HPE provides infrastructure substrate and delegates retrieval intelligence to NVIDIA and open-source components.

When Kamiwaza enters via Unleash AI, the retrieval story changes. Kamiwaza provides governed context orchestration that neither the storage layer nor the NVIDIA retrieval components provide independently: cross-departmental context routing, authority-constrained retrieval, and document-pipeline coordination. In the Town of Vail, this means the Section 508 compliance agent receives only the documents it’s authorized to process, with retrieval governed by department boundaries.
This is a layer of retrieval intelligence that storage-native search (VAST) and embedding-accelerated search (Dell+Elastic) don’t address — the governance of who receives what context under what authority. The 4+1 model question: is governed context orchestration a Layer 1B function (retrieval) or a Layer 2C function (governance)? Kamiwaza’s context management spans both — it retrieves content (1B) according to governance policies (2C). The assessment classifies the retrieval function at 1B and the governance function at 2C. ### Borrowed Judgment Moderate to high. HPE’s own Layer 1B authority is limited to storage infrastructure. Embedding intelligence is NVIDIA (NeMo Retriever). Vector storage is open source (Milvus). Retrieval framework is open source (LangChain). Governed context orchestration is Kamiwaza (Delegated via Unleash AI). Compare to Dell: Dell delegates retrieval intelligence to Elastic (proprietary ISV partnership) and acceleration to NVIDIA. Dell’s borrowed judgment at 1B is Moderate — split between a proprietary ISV and NVIDIA. Compare to VAST: VAST’s borrowed judgment at 1B is Low — InsightEngine, vector search, and the retrieval pipeline are VAST IP. Only embedding model execution (NIM) is NVIDIA-provided, and InsightEngine is model-agnostic. HPE’s Layer 1B borrowed judgment is the highest of the three vendors because HPE owns the least retrieval IP. The mitigation: Kamiwaza’s governed orchestration adds a unique capability that pure retrieval engines don’t provide. ### Working Notes The HPE Developer Portal’s RAG reference architecture is instructive: NeMo Retriever embedding + Milvus vector DB + LangChain + Llama3-70B. This is a standard NVIDIA reference stack, not an HPE-differentiated architecture. Any NVIDIA partner (Dell, Lenovo, Supermicro) could deploy the identical stack. 
HPE’s Layer 1B differentiation comes not from the retrieval stack but from the storage substrate below it (Data Fabric governance, Alletra RDMA performance) and the orchestration layer above it (Kamiwaza context governance). The KV cache storage support in Alletra X10000 is worth noting as a Layer 1B/2B bridge: inference state persistence in storage allows agents to maintain context across sessions without holding GPU memory. Dell’s equivalent is the CMX KV cache offload (NVIDIA technology). HPE’s is storage-native. VAST’s CNode-X collocates cache and compute. ## ◑ Layer 1C: Data Movement & Pipelines *Move/transform data — policy-driven placement, lineage, cost-aware movement* **Status:** HPE + Open Source ### Vendor-Provided Components **HPE Data Fabric Software (Pipeline Orchestration)** [DAPM: Retained] Policy-based data placement considering performance, sovereignty, costs, compliance. Data lineage and compliance tagging. Agentic AI assistant for automated reporting and data placement decisions. Real-time S3-to-S3 object movement for AI data ingestion from external sources. **HPE Ezmeral Unified Analytics** [DAPM: Retained] Enterprise-hardened packaging of the full open-source ML pipeline stack: Apache Airflow (workflow orchestration), Kubeflow (ML pipelines + model serving via KServe), Ray (distributed compute), Feast (feature store), MLflow (experiment tracking), Apache Spark (data engineering), Presto SQL (federated query), Apache Superset (visualization). Connectors to Snowflake, MySQL, Delta Lake, Teradata, Oracle. Built through acquisitions: BlueData (2018), MapR (2019), Ampool (2021), Arrikto/Kubeflow team (2023). **HPE Morpheus Software** [DAPM: Retained] Hybrid and multicloud management, orchestration, migration, automation. VMware-to-HPE VM migration paths. Cloud-native workflow orchestration. 
**Kamiwaza Workflow Data Pipelines (via Unleash AI)** [DAPM: Delegated] In Town of Vail: orchestrates governed data flows across department boundaries — housing deeds from ingestion through verification to audit across legal, housing, and admin functions. Decision-driven data movement where pipeline logic is governed by authority constraints and compliance policy, not static ETL schedules.

### NVIDIA-Provided Components

**NVIDIA RAPIDS Accelerator for Apache Spark** GPU-accelerated data prep, model training, and visualization within Ezmeral Unified Analytics. Up to 29x faster development. Spark acceleration is the primary NVIDIA contribution at Layer 1C.

**NVIDIA Blueprints** Pre-built AI application patterns deployed on Private Cloud AI. Pipeline templates, not pipeline infrastructure.

### Gap Analysis

Three vendors, three distinct architectural strategies for Layer 1C:

• Dell: acquired Dataloop (proprietary orchestration, no-code/low-code). Dell’s strongest software move, but the broader pipeline layer depends on ISV partners (ClearML, DataRobot, Starburst). Multiple authority boundaries.

• HPE: packages the full open-source ML pipeline lifecycle (Airflow → Kubeflow → Ray → Feast → MLflow → Spark) under enterprise-grade guardrails. Four acquisitions (2018–2023) demonstrate deliberate investment. Value is in curation, hardening, support, integration — not proprietary technology.

• VAST: built a proprietary DataEngine (event-driven serverless execution on CNodes). Entirely VAST IP. Tightly integrated with storage and retrieval layers. One authority.

HPE’s open-source approach creates a specific DAPM trade-off: the enterprise avoids vendor lock-in (Airflow and Kubeflow are portable), but HPE’s authority is in packaging rather than core technology. If Apache Airflow’s community changes direction, HPE is affected.
This is a different risk profile than Dell’s (proprietary Dataloop, partner-dependent beyond it) or VAST’s (proprietary DataEngine, VAST-dependent entirely). Data Fabric’s policy-based movement with Apache Polaris governance connects Layer 1C to Layer 2C: data moves according to explicit policies that consider performance, data locality, sovereignty, costs, and compliance. This governance-aware data movement feeds both GreenLake Intelligence (infrastructure decisions) and Kamiwaza (AI workload decisions). Dell’s Dataloop provides orchestration without integrated governance policy. VAST’s DataEngine has Event Broker for data-event-driven movement without an explicit policy engine. When Kamiwaza is selected via Unleash AI, it adds decision-driven pipeline capability: documents move across department boundaries based on decision logic (legal review required? accessibility compliance met? authority approval needed?). This connects infrastructure-level pipeline capabilities to business-level decision flows — where Layer 1C meets Layer 2C. ### Borrowed Judgment Low for pipeline packaging and integration (HPE owns Ezmeral, Data Fabric, Morpheus). Underlying components are open-source, limiting deep technical authority but also limiting NVIDIA dependency — RAPIDS for Spark is the only NVIDIA contribution at this layer. Open-source components are substitutable by the enterprise without HPE’s permission. Compare to Dell: Dell owns Dataloop (Retained) but depends on partners for everything else at Layer 1C. Four authority boundaries. Compare to VAST: VAST owns everything at Layer 1C (Retained by VAST, Ceded by the enterprise). One authority but total vendor dependency. HPE’s Layer 1C authority model is distinct: HPE curates and supports, the enterprise can substitute, NVIDIA acceleration is additive not required. ### Working Notes The acquisition history (BlueData 2018, MapR 2019, Ampool 2021, Arrikto 2023) shows deliberate multi-year investment in the data pipeline layer. 
HPE chose to build this capability rather than delegate entirely to partners. Dell’s Dataloop acquisition is a similar strategic move but more recent (2024) and narrower in scope. The NVIDIA RAPIDS Accelerator for Spark (up to 29x faster) is meaningful but optional — Ezmeral runs without GPU acceleration. Same pattern as VAST’s DataEngine (runs on standard CNodes, CNode-X adds GPU acceleration). Data Fabric’s real-time S3-to-S3 movement bridges Layer 1A and 1C: ingest from external S3-compatible sources into the governed namespace, where policy-based placement takes over. ## ● Layer 2A: Infrastructure Orchestration *GPU scheduling, quotas, RBAC, infrastructure lifecycle management* **Status:** HPE Strength ### Vendor-Provided Components **HPE GreenLake Cloud Platform** [DAPM: Retained] Consumption-based hybrid cloud (4th gen). Unified VM + K8s management. Pay-per-use AI infrastructure. Dashboard for capacity, utilization, cost. Self-service cloud experience with full lifecycle management. **HPE GreenLake Intelligence (Agentic AI Mesh)** [DAPM: Retained] HPE-owned agentic AI framework infused across the entire hybrid stack (not a standalone product). Multiple domain-specific LLMs trained on HPE data, communicating via Model Context Protocol (MCP). Agents form an agentic mesh for inter-agent communication with secure, contextual data sharing. Domain agents for networking (Aruba/Juniper), storage (Alletra), compute (OpsRamp), orchestration, and FinOps. Cross-domain correlation: traces performance issues across application → storage → network chain. Orchestration, networking, and FinOps agents collaborate to determine workload placement across private and public clouds. Can process real-time infrastructure metrics and execute actions across multiple vendor environments (not exclusively HPE hardware). Human-in-the-loop: agents take action subject to approval. 
**HPE OpsRamp Software** [DAPM: Retained] Multi-domain agentic system coordinating compute, network, storage, virtualization, and software layers. Use cases: root-cause analysis, explainability, capacity planning. AI-driven alerts, incident management, GPU monitoring, workload observability. Operations copilot with conversational product help and agentic command center. CrowdStrike integration for security monitoring. MCP support for connecting to GreenLake Intelligence and third-party tools. **HPE Alletra X10000 MCP Servers (Native)** [DAPM: Retained] Model Context Protocol servers built natively into X10000 storage. Enables GreenLake Intelligence agents to communicate directly with storage for data management orchestration. Connects storage operations to the broader agentic mesh via GreenLake Copilot and natural language interfaces. This is HPE-owned agent-to-infrastructure communication at the storage layer. **HPE Compute Ops Management** [DAPM: Retained] Cloud-native server lifecycle management for ProLiant fleet. Compute Copilot for AI-assisted infrastructure operations. **HPE Private Cloud (4th Gen)** [DAPM: Retained] K8s management with ProLiant Gen12. Unified cloud-native + virtualized workload management. Independent scaling for cloud-native workloads. Upgrade path to Morpheus for hybrid/multicloud. ### NVIDIA-Provided Components **NVIDIA MIG** GPU fractionalization for multi-tenancy in AI Factory portfolio. **NVIDIA Mission Control** AI Factory at-scale management. Planned later 2026. GPU cluster ops, scheduling, resource allocation. **NVIDIA AI Enterprise (Runtime Management)** Lifecycle management for AI software stack. Pre-integrated with ProLiant. ### Gap Analysis Layer 2A is where HPE makes its most substantial software authority claim. GreenLake Intelligence is not a rebranded monitoring tool — it is an agentic AI framework with domain-specific LLMs communicating via MCP, designed to be infused across the entire HPE hybrid stack. 
Four characteristics define HPE’s Layer 2A position:

(1) Cross-domain agentic correlation. GreenLake Intelligence agents trace performance issues across application → storage → network chains, coordinating remediation across domains. Dell’s OpenManage and NVIDIA’s Run:ai operate within single domains (rack management and GPU scheduling respectively). VAST’s Polaris orchestrates VAST clusters but not the broader infrastructure around them.

(2) MCP as the inter-agent communication standard. GreenLake Intelligence is compliant with MCP, enabling connection to third-party agents and devices. The X10000 has native MCP servers built in. This means the agentic mesh is architecturally open — ITSM systems can collaborate with GreenLake, and third-party infrastructure can be brought under GreenLake management. HPE positions this as ‘the mesh is open to more stitches.’

(3) Multi-vendor infrastructure support. NAND Research notes that GreenLake Intelligence agents can process real-time metrics and execute actions across multiple vendor environments, not exclusively HPE hardware. This extends Layer 2A authority beyond HPE’s own equipment — a broader orchestration scope than Dell’s OpenManage (Dell hardware only) or VAST’s Polaris (VAST clusters only).

(4) FinOps agent for workload placement. Orchestration, networking, and FinOps agents collaborate to determine workload placement across private and public clouds. This is an economic placement decision — where should this workload run based on cost, performance, and policy? This function overlaps with Layer 2C territory.

Dell’s Layer 2A is split between Dell-managed rack deployment (OpenManage) and NVIDIA-managed GPU scheduling (Run:ai). HPE’s Layer 2A is unified under GreenLake with OpsRamp providing a multi-domain agentic system that coordinates compute, network, storage, virtualization, and software layers.
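Cross-domain correlation, the first characteristic above, can be made concrete with a small sketch: given one health signal per domain and the application → storage → network dependency chain, pick the deepest domain that is still anomalous as the root-cause candidate. This is an illustration under stated assumptions, not GreenLake Intelligence's algorithm; `DomainSignal`, `trace_root_cause`, and the 2x-baseline threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainSignal:
    """Latest reading reported by one domain agent (hypothetical shape)."""
    domain: str
    latency_ms: float
    baseline_ms: float

    @property
    def anomalous(self) -> bool:
        # Illustrative threshold: anything slower than 2x its baseline is flagged.
        return self.latency_ms > 2 * self.baseline_ms


def trace_root_cause(chain, signals):
    """Return the deepest anomalous domain in `chain`, or None if all are healthy.

    `chain` is ordered from where the symptom appears (application) down to
    the substrate (network). Anomalies higher in the chain are treated as
    likely consequences of the deepest one.
    """
    by_domain = {s.domain: s for s in signals}
    candidate = None
    for domain in chain:
        signal = by_domain.get(domain)
        if signal is not None and signal.anomalous:
            candidate = domain
    return candidate


chain = ["application", "storage", "network"]
signals = [
    DomainSignal("application", latency_ms=900.0, baseline_ms=200.0),
    DomainSignal("storage", latency_ms=12.0, baseline_ms=4.0),
    DomainSignal("network", latency_ms=0.3, baseline_ms=0.25),
]
print(trace_root_cause(chain, signals))  # storage
```

A single-domain tool would see only its own row of this table. The value of the cross-domain view is that the application alert and the storage alert are read as one incident.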
GreenLake’s consumption model (pay-per-use) creates a natural authority surface: HPE maintains an ongoing operational relationship with the infrastructure — metering, capacity management, utilization optimization — that traditional capex purchases don’t provide. The gap: GPU-specific scheduling is still Ceded to NVIDIA (MIG for fractionalization, Mission Control for at-scale management, planned later 2026). HPE orchestrates the infrastructure around the GPU cluster; NVIDIA orchestrates inside it. This is the same Layer 2A boundary as Dell, but HPE’s surrounding orchestration is a unified platform rather than separate point tools. ### Borrowed Judgment Low for infrastructure orchestration (GreenLake platform, GreenLake Intelligence, OpsRamp, Compute Ops Management are HPE-owned IP). Moderate for GPU-specific scheduling (MIG, Mission Control are NVIDIA-controlled). The NAND Research caveat is relevant for the DAPM assessment: tight coupling with GreenLake creates potential vendor lock-in for organizations with diverse infrastructure portfolios. However, MCP compliance and multi-vendor agent support partially mitigate this concern — the agentic mesh can extend beyond HPE hardware. Compare to Dell: Dell’s infrastructure orchestration is fragmented (OpenManage for servers, separate tools for storage and networking). GPU scheduling is fully NVIDIA-controlled (Run:ai). No agentic cross-domain correlation. Compare to VAST: Polaris provides fleet-level VAST cluster orchestration (Retained by VAST). DataEngine provides workload scheduling within the data platform. But Polaris doesn’t orchestrate non-VAST infrastructure. GreenLake Intelligence’s multi-vendor, multi-domain scope is broader. ### Working Notes GreenLake Intelligence’s cross-domain correlation and FinOps-aware placement push beyond traditional Layer 2A into Layer 2C territory for IT operations. 
The assessment classifies it as spanning 2A–2C for IT ops: infrastructure orchestration (2A) + cross-domain governance decisions and economic placement reasoning (2C). This dual classification is important — GreenLake Intelligence is both an orchestrator and a decision-maker. The MCP openness is architecturally significant: by using an open protocol for agent communication, HPE enables third-party integration without custom APIs. ITSM systems (ServiceNow, BMC) can collaborate with GreenLake agents. Third-party infrastructure can be managed. This is an open-ecosystem approach to infrastructure orchestration that Dell’s proprietary OpenManage and VAST’s proprietary Polaris don’t provide. The X10000 native MCP servers represent infrastructure-level agent communication — the storage array itself participates in the agentic mesh as a first-class agent endpoint, not just a managed resource. This is a specific implementation of the 4+1 model’s vision of infrastructure that is natively agent-aware. ## ○ Layer 2B: Application Runtime & Execution *Model serving, agent execution, inference APIs, distributed inference* **Status:** Ceded to NVIDIA ### Vendor-Provided Components **HPE Private Cloud AI** [DAPM: Delegated] Co-engineered with NVIDIA. Pre-configured HW+SW stack with four right-sized configurations. Air-gapped capable. Scales to 128 GPUs with network expansion racks. OpsRamp integration for AI workload monitoring. Supports NVIDIA AI-Q, Omniverse, NeMo Retriever blueprints. Multi-tenancy via MIG with GPU passthrough (Spring 2026). This is the delivery vehicle and deployment platform, not the runtime itself. **HPE Ezmeral Unified Analytics (ML Runtime)** [DAPM: Retained] Kubeflow for ML pipeline execution and model serving (KServe). Ray for distributed compute. MLflow for experiment tracking. Enterprise packaging of open-source ML runtime tools. 
**Agentic Framework Ecosystem on Private Cloud AI** [DAPM: Delegated] CrewAI integration enables enterprises to build multi-agent solutions on Private Cloud AI. Deloitte Zora AI for Finance deploys on Private Cloud AI as an agentic platform for dynamic executive reporting (financial statement analysis, scenario modeling, competitive analysis). NIM Agent Blueprints provide pre-built agentic workflows. This is an emerging multi-framework agentic surface — not HPE-owned runtime IP but HPE-curated deployment options. **Kamiwaza Agent Execution Coordination (via Unleash AI)** [DAPM: Delegated] In Town of Vail: coordinates multiple specialized agents — accessibility identification, alt text generation, remediation guidance formatting. Determines which agents run, in what sequence, with what inputs, under what constraints. Enforces human-in-the-loop checkpoints. Manages agent lifecycle at the execution layer. ‘Coordination of multiple specialized AI agents and human workflows to execute multi-step decisions under explicit authority, governance, and audit constraints.’ ### NVIDIA-Provided Components **NVIDIA AI Enterprise Software** The AI workload runtime: model serving (Triton), guardrails (NeMo Guardrails), distributed inference, training frameworks (NeMo). Pre-integrated on Private Cloud AI. STIG-hardened, FIPS-enabled for sovereign deployments. **NVIDIA NIM Microservices** Pre-optimized inference microservices for model deployment. Part of AI Enterprise platform. Available to all NVIDIA partners (Dell, Cisco, Lenovo) — not HPE-specific. **NVIDIA NIM Agent Blueprints** Pre-built agentic AI application patterns: Multimodal PDF Data Extraction, Digital Twins (Omniverse), AI-Q for enterprise data agents. Deployed on Private Cloud AI. Same blueprints available on Dell AI Factory, Cisco HyperFabric, Lenovo Hybrid AI. ### Gap Analysis Layer 2B reveals a more nuanced runtime architecture than Dell’s because authority is distributed across three actors rather than two. 
The three-actor model:

• NVIDIA provides model execution — Triton (model serving), NeMo Guardrails (safety), NIM (optimized inference), NeMo (training). This is the compute execution layer. Identical across all NVIDIA partners (Dell, Cisco, Lenovo deploy the same stack).

• HPE provides the deployment platform — Private Cloud AI (hardware, cooling, lifecycle), Ezmeral (ML runtime packaging), and increasingly an agentic framework surface (CrewAI, Deloitte Zora AI). This is infrastructure + curation.

• Kamiwaza provides agent execution coordination (via Unleash AI) — determines which agents run, sequences execution, manages inputs/outputs, enforces execution-time constraints (authority boundaries, audit, human-in-the-loop). This is governance-aware agent coordination above model inference but below Layer 2C policy.

This creates a layered runtime: NVIDIA executes individual model inference → Kamiwaza coordinates multi-agent workflows and enforces execution governance → Layer 2C (also Kamiwaza) makes policy decisions about what should run where. The 2B/2C boundary: 2B is execution coordination (how agents run), 2C is decision authority (why agents run, under what governance).

The structural comparison across vendors:

• Dell: NVIDIA at 2B (model execution + NemoClaw/OpenShell agent runtime). No agent coordination layer beyond NVIDIA. Dell provides packaging and services.

• HPE: NVIDIA at model execution + Kamiwaza at agent coordination + CrewAI/ISV frameworks for agent building. Three layers of runtime capability from three sources. HPE provides infrastructure + curation.

• VAST: AgentEngine provides a unified agent runtime (execution + coordination + lifecycle + observability) as VAST IP. NVIDIA provides GPU acceleration only. One authority.

HPE’s ‘NVIDIA AI Computing by HPE’ branding signals co-engineering, but the DAPM question is precise: can HPE modify, extend, or replace NVIDIA runtime components independently?
The answer appears to be no — ‘co-engineering’ means deeper integration and joint validation, not shared IP authority. NVIDIA controls the runtime; HPE controls the platform it runs on. The emerging agentic framework ecosystem (CrewAI, Deloitte Zora AI) on Private Cloud AI is worth noting: HPE is becoming a multi-framework agentic deployment surface, not locked to a single agent runtime. This is a platform strategy — provide the substrate that multiple agentic frameworks can run on — rather than a runtime strategy (build the definitive agent runtime, as VAST is attempting with AgentEngine). ### Borrowed Judgment High for AI workload runtime. NVIDIA controls model serving, inference optimization, guardrails, and training frameworks. The same NVIDIA AI Enterprise stack runs on Dell, Cisco, and Lenovo — this is not HPE-specific technology. The mitigating factor is the ‘bracketing’ architecture: HPE retains governance authority at Layer 2A (GreenLake Intelligence, HPE-owned) and delegates governance authority at Layer 2C (Kamiwaza via Unleash AI). The NVIDIA-controlled Layer 2B runtime is sandwiched between two layers where HPE has governance authority. The enterprise has governance coverage even where it doesn’t control execution. Compare to Dell: Dell has NVIDIA at 2B with no governance brackets. No Layer 2C (Absent). Layer 2A is split between Dell (OpenManage) and NVIDIA (Run:ai). The enterprise has neither governance authority above nor unified governance authority below the NVIDIA runtime. Compare to VAST: VAST owns AgentEngine (2B) and is building PolicyEngine (2C). No bracketing needed because VAST controls both the runtime and the governance layer. The enterprise Cedes both to VAST. HPE’s borrowed judgment at 2B is the highest of any layer in the HPE assessment. The bracketing architecture is the mitigation, not the solution. 
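The bracketing argument above can be stated as a simple check: a Ceded layer counts as bracketed when the layers directly below and above it are both Retained or Delegated. The sketch below encodes that rule. The `DAPM` enum and `is_bracketed` helper are hypothetical names, and collapsing each layer to a single classification is a simplification (Dell's Layer 2A, for example, is actually split between Dell and NVIDIA).

```python
from enum import Enum


class DAPM(Enum):
    RETAINED = "Retained"
    DELEGATED = "Delegated"
    CEDED = "Ceded"
    ABSENT = "Absent"


# Classifications that leave an accountable governance authority in place.
GOVERNED = {DAPM.RETAINED, DAPM.DELEGATED}


def is_bracketed(stack, layer):
    """True if `layer` is Ceded and both adjacent layers are governed.

    `stack` maps layer names to classifications, ordered bottom to top
    (dicts preserve insertion order).
    """
    order = list(stack)
    i = order.index(layer)
    if stack[layer] is not DAPM.CEDED or i == 0 or i == len(order) - 1:
        return False
    return stack[order[i - 1]] in GOVERNED and stack[order[i + 1]] in GOVERNED


# Simplified from this assessment: one classification per layer.
hpe = {"2A": DAPM.RETAINED, "2B": DAPM.CEDED, "2C": DAPM.DELEGATED}
# Dell's 2A is split in practice; RETAINED here is a generous simplification.
dell = {"2A": DAPM.RETAINED, "2B": DAPM.CEDED, "2C": DAPM.ABSENT}

print(is_bracketed(hpe, "2B"))   # True
print(is_bracketed(dell, "2B"))  # False
```

Even with the generous simplification for Dell's Layer 2A, the Absent Layer 2C is enough to break the bracket, which matches the comparison drawn above.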
### Working Notes The CrewAI and Deloitte Zora AI integrations signal that Private Cloud AI is evolving from a single-stack NVIDIA deployment platform into a multi-framework agentic surface. This is architecturally different from both Dell’s approach (NVIDIA-only runtime) and VAST’s approach (proprietary-only runtime). HPE is positioning Private Cloud AI as the substrate that multiple agent frameworks deploy on. The NIM Agent Blueprints (PDF Extraction, Digital Twins, AI-Q) are available identically on Dell AI Factory, Cisco HyperFabric, and Lenovo Hybrid AI. These do not differentiate HPE at Layer 2B. HPE’s differentiation comes from the bracketing architecture (2A and 2C governance around the NVIDIA 2B runtime) and the emerging multi-framework agent deployment model. The bracketing architecture has a structural analog in HPE’s networking story: NVIDIA InfiniBand handles GPU-to-GPU interconnect (HPE doesn’t control it), but HPE’s Juniper/Aruba/Slingshot handles everything around the GPU fabric (HPE owns it). The pattern: cede the NVIDIA-specific function, retain authority over everything surrounding it. ## ◑ Layer 2C: Agentic Infrastructure — The Reasoning Plane *Policy-driven placement and resource coordination — the Autonomy Layer* **Status:** Retained (IT Ops) + Delegated (AI Workloads) ### Vendor-Provided Components **GreenLake Intelligence (IT Operations 2C)** [DAPM: Retained] MCP-based agent communication across infrastructure domains. Domain-specific agents for networking, storage, compute, operations. Cross-domain correlation and autonomous remediation. Layer 2C for IT infrastructure operations — but not for AI workload placement and policy. **Kamiwaza (AI Workload 2C, via Unleash AI)** [DAPM: Delegated] Agentic orchestration and decision routing for AI workloads. Policy-driven placement, mission decomposition, decision authority placement, cross-agent governance. 
Distributed Data Engines process data at the source without moving it or compromising security. Cross-environment evaluation: rather than treating anomalies as isolated alerts, Kamiwaza evaluates what else is happening across the environment to determine the appropriate response. Town of Vail production agents: ARIA (accessibility auditing — independently audits websites, identifies Section 508 issues, and provides developer fixes in days, where a manual audit costs $1.5M and takes months). Deed restriction processor (reviews documents spanning 60 years, extracts key data, answers compliance questions, generates Excel/PDF reports — work that previously required weeks of manual review). Fire detection coordinator (works with Vaidio/ProHawk video AI, evaluates cross-environment context, triggers workflows, supports operators as conditions change). HPE’s chosen Layer 2C for AI workload orchestration, delivered under a single-accountable-provider model.

**HPE Data Fabric Policy Engine** [DAPM: Retained] Policy-based data placement considering performance, sovereignty, costs, compliance. Apache Polaris for cross-platform governance. Feeds governance signals into both GreenLake Intelligence (infrastructure) and Kamiwaza (AI workloads).

### NVIDIA-Provided Components

**No NVIDIA Layer 2C Dependency** GreenLake Intelligence and Kamiwaza are HPE-owned and HPE-Delegated respectively. NVIDIA does not control the governance, placement, or policy reasoning layer in the HPE stack.

### Gap Analysis

This is the most analytically interesting layer in the HPE assessment and where HPE’s approach diverges most from Dell’s. HPE has a two-part Layer 2C story:

GreenLake Intelligence provides Layer 2C for IT infrastructure operations. It correlates signals across networking, storage, and compute to diagnose and resolve infrastructure issues. Routes decisions across domains. Takes autonomous action (subject to approval). Uses MCP for agent communication. HPE-owned IP.
Kamiwaza provides Layer 2C for AI workload orchestration: agentic orchestration, decision routing, policy-driven placement, cross-agent governance. Distributed Data Engines process data at the source without data movement. Not an accidental ISV partnership: this is HPE’s deliberate architectural choice, curated, integrated, validated, and delivered under a single-accountable-provider model.

The Town of Vail as by-proxy Kamiwaza assessment, specific evidence:

• ARIA accessibility agent: independently audits municipal websites, identifies Section 508 issues, provides developer fixes. Manual equivalent: $1.5M, months of work. ARIA delivers in days. This demonstrates decision automation with governance (the agent identifies what needs fixing, recommends how, but human developers implement).
• Deed restriction processor: reviews housing documents spanning 60 years in disjointed legacy formats, extracts key data, answers compliance questions, generates reports. Previously required weeks of manual review and data entry in Excel. Single processing errors carry serious legal and financial consequences. This demonstrates governance-aware document intelligence with cross-departmental implications (legal, housing, administrative).
• Fire detection coordinator: works with Vaidio video analytics and ProHawk enhanced vision. Rather than treating a video anomaly as an isolated alert, Kamiwaza evaluates what else is happening across the environment and determines the appropriate response. Agents surface relevant information, trigger correct workflows, and support operators as conditions change. This demonstrates cross-environment evaluation, the core Layer 2C pattern of reasoning across multiple data sources and agent outputs.
• Deployment velocity: concept to first-phase production in three months. 20–30 additional use cases projected in first year.
Additional use cases compose from existing primitives (decision flows, authority boundaries, governance constraints) rather than requiring new infrastructure.
• Economic model: fixed-cost infrastructure on the town’s own solar/wind-powered data center. No cloud API token pricing. Billions of tokens without variable costs.
• RBAC → ReBAC governance: emerged from Kamiwaza’s production behavior. Traditional role-based access breaks when autonomous agents operate across department boundaries. Relationship-Based Access Control (ReBAC) constrains agent permissions based on relationship context, not just role.

Structural comparison:

• Dell: Layer 2C absent. No partner fills this role. No plan visible.
• HPE: Layer 2C Retained for IT ops (GreenLake Intelligence), Delegated for AI workloads (Kamiwaza via Unleash AI), validated in production with named agents and measurable outcomes.
• Google: Layer 2C Retained (Inference Gateway + DWS + Knowledge Catalog). Productized and shipping.
• VAST: Layer 2C Retained/Emerging (PolicyEngine + Polaris + TuningEngine). Announced at VAST Forward 2026, GA end of 2026.

### Borrowed Judgment

IT ops Layer 2C: Low. GreenLake Intelligence is HPE-owned IP. AI workload Layer 2C: Moderate. Delegated to Kamiwaza, but deliberately chosen, integrated, and delivered under HPE’s accountability. Healthier than Dell’s Absent classification because the function exists and someone is accountable. The risk is partner dependency, not capability absence.

### Working Notes

Strategic question: is Delegated Layer 2C transitional (HPE eventually builds/acquires orchestration IP) or permanent (HPE’s value is ecosystem curation, not owning every layer)? Town of Vail evidence suggests HPE is comfortable with the ecosystem model, and that it works operationally.

The RBAC → ReBAC governance shift from Town of Vail validates the 4+1 model’s claim that Layer 2C requires governance architecturally distinct from Layer 2A infrastructure RBAC.
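To make the RBAC versus relationship-based distinction concrete, here is a minimal sketch. It is not Kamiwaza's implementation: the agent, role, department, and document names are all hypothetical, and the relationship model (simple subject/relation/object tuples) is an assumption chosen for illustration. It shows why a role check alone cannot stop an agent from crossing a department boundary, while a relationship check can.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Agent:
    name: str
    role: str


# RBAC: the decision depends only on the agent's role (hypothetical role name).
ROLE_PERMISSIONS = {"document-processor": {"read_document"}}


def rbac_allows(agent: Agent, action: str) -> bool:
    return action in ROLE_PERMISSIONS.get(agent.role, set())


@dataclass
class RelationshipGraph:
    """Stores (subject, relation, object) tuples; an assumed, simplified model."""
    tuples: set = field(default_factory=set)

    def add(self, subject: str, relation: str, obj: str) -> None:
        self.tuples.add((subject, relation, obj))

    def related(self, subject: str, relation: str, obj: str) -> bool:
        return (subject, relation, obj) in self.tuples


def rebac_allows(graph: RelationshipGraph, agent: Agent, action: str, resource: str) -> bool:
    # The role still gates the action; relationships gate which resources it applies to.
    if not rbac_allows(agent, action):
        return False
    departments = {o for (s, r, o) in graph.tuples if s == agent.name and r == "acts_for"}
    return any(graph.related(resource, "owned_by", d) for d in departments)


graph = RelationshipGraph()
graph.add("deed-agent", "acts_for", "housing")
graph.add("deed-1964-017", "owned_by", "housing")
graph.add("personnel-file-88", "owned_by", "hr")

agent = Agent(name="deed-agent", role="document-processor")

print(rbac_allows(agent, "read_document"))                               # True for any document
print(rebac_allows(graph, agent, "read_document", "deed-1964-017"))      # True: same department
print(rebac_allows(graph, agent, "read_document", "personnel-file-88"))  # False: no relationship
```

The role check answers "may this kind of agent read documents?", while the relationship check answers "may this agent read this document?". The second question is the one that matters once agents operate across departments.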
## ◇ Layer 3 (+1): AI Application Layer — The Value Plane *AI-powered business capabilities — business logic, workflow automation* **Status:** Unleash AI Ecosystem ### Vendor-Provided Components **HPE Unleash AI Program (26+ ISV Members)** [DAPM: Delegated] Curated (not open) ISV partner ecosystem. HPE is ‘highly selective’ with its ISV pool. 26+ members focused on different AI use cases from vision AI to agentic analytics. Validates interoperability, provides unified deployment and support. Positions HPE as ‘one accountable provider.’ Field motion targets decision friction, not infrastructure features. Training, certifications, and enablement support for partners. India-based partners taking locally developed AI use cases into global markets. **HPE Private Cloud AI (Deployment Platform)** [DAPM: Retained] Pre-configured foundation for ISV solutions. Supports NVIDIA blueprints and partner applications. Air-gapped for regulated industries. Now includes dedicated turnkey development system for fast-tracking AI project validation. Evergreen and always current. **CrewAI (Pre-Installed on Private Cloud AI)** [DAPM: Delegated] Multi-agent automation framework pre-installed on HPE Private Cloud AI hardware. Enables enterprises to rapidly build and deploy tailored AI agents across industries: finance, healthcare, defense, retail, manufacturing, telecom, energy. On-premises deployment ensures data never leaves enterprise control. **Deloitte Zora AI for Finance** [DAPM: Delegated] Agentic AI solution reimagining executive reporting. Dynamic, on-demand, interactive experience driven by autonomous AI. Use cases: financial statement analysis, scenario modeling, competitive and market analysis. Deployed on Private Cloud AI. HPE is adopting internally first. Available worldwide. **Aible (Unleash AI Member, Discover 2025)** [DAPM: Delegated] AI agent platform for business users at enterprise scale. 
Completely autonomous specialized AI agents without requiring data science or ML engineering expertise. Auto-builds, coaches, and deploys AI agents across on-prem, hybrid cloud, and edge. ### NVIDIA-Provided Components **NVIDIA Blueprints + NIM** Pre-built AI application patterns (Multimodal PDF Extraction, Digital Twins, AI-Q) and inference microservices. Deployed on Private Cloud AI. Same blueprints available across all NVIDIA partners. ### Gap Analysis HPE correctly does not build Layer 3. The Unleash AI program is the most structured ecosystem curation approach among the infrastructure vendors assessed. Three characteristics define HPE’s Layer 3 approach: (1) Curated, not open. HPE is ‘highly selective’ with 26+ ISV members. Each partner is chosen for a specific AI use case domain and validated for interoperability. This is a deliberate contrast to Dell’s broader ISV partnership approach (more partners, less curation) and VAST’s smaller, focused ecosystem (CoreWeave, TwelveLabs, CrowdStrike). (2) Micro-focused agent model. Kamiwaza’s Luke Norris describes the approach: ‘tens if not hundreds of different agents that are micro-focused on particular jobs’ rather than one monolithic model. This is the operational philosophy behind Unleash AI — specialized agents from specialized partners, coordinated by Kamiwaza’s orchestration layer. (3) Pre-installed frameworks. CrewAI comes pre-installed on Private Cloud AI hardware. This is a different model than Dell’s (deploy NVIDIA NIM/NemoClaw as post-purchase software) or VAST’s (AgentEngine is the platform). HPE delivers the agent development framework as part of the infrastructure purchase. 
With Kamiwaza correctly positioned at Layer 2C (not Layer 3), the ecosystem layer map clarifies:

• HPE provides infrastructure authority (Layers 0–2A)
• Kamiwaza provides orchestration authority (Layer 2C, spanning 1B/1C/2B)
• NVIDIA provides model execution runtime (Layer 2B)
• ISVs provide domain applications (Layer 3): Deloitte Zora AI (finance), Aible (business users), ProHawk (video), Vaidio (vision AI), Blackshark.ai (geospatial), Gambit (citizen engagement)
• Cross-cutting partners: CrowdStrike (security), Fortanix (confidential computing), Commvault/Veeam (data resilience), Red Hat (OS/K8s), SHI (integration services)

The Town of Vail validates the ‘appliance-like operating model’: unified deployment, lifecycle management, single escalation path. The coordination overhead that typically kills multi-vendor ecosystem solutions is addressed by HPE’s single-accountable-provider model and Kamiwaza’s orchestration layer.

The SiliconANGLE analysis (May 2026) frames this as the emerging default for enterprise AI: ‘curated AI ecosystems’ where customers combine infrastructure, models, orchestration platforms, and ISV tooling without stitching every component together manually. HPE’s position is explicitly not a vertically integrated AI stack; it is a curated substrate model.

### Borrowed Judgment

Distributed across partners, architecturally correct for Layer 3. Each partner maps to specific layers with identifiable authority boundaries.

The structural comparison with Dell and VAST at Layer 3:

• Dell’s ecosystem is load-bearing: ISV partners provide infrastructure-level functions (Cohere North for agent orchestration, DataRobot for lifecycle management) that Dell’s platform lacks. Remove Cohere North and Dell loses agent workflow orchestration.
• HPE’s ecosystem is curated: ISV partners provide domain applications (Layer 3) while Kamiwaza provides orchestration (Layer 2C). Remove Deloitte Zora AI and HPE loses a finance use case, not a platform capability.
• VAST’s ecosystem is additive: the platform is architecturally self-sufficient through Layer 2C. Partners add vertical use cases (TwelveLabs for video AI). Remove TwelveLabs and VAST loses a use case, not a platform function.

HPE’s ecosystem structure is closer to VAST’s (additive) than Dell’s (load-bearing) at Layer 3, with the important distinction that HPE’s Layer 2C orchestration is Delegated to an ecosystem partner (Kamiwaza) rather than Retained as proprietary IP (VAST’s PolicyEngine).

### Working Notes

The CrewAI pre-installation model is worth tracking: HPE hardware arrives with an agentic development framework already installed. This is a different go-to-market motion than selling infrastructure and then layering software. If this becomes the standard for Private Cloud AI, HPE is bundling Layer 3 development capability into the Layer 0 purchase.

Deloitte and Aible represent enterprise-grade ISV deployments on Private Cloud AI: a global SI (Deloitte) and an AI platform vendor (Aible) choosing HPE’s infrastructure for agentic deployment. Dell’s equivalent is OpenAI Codex, SpaceXAI Grok, and ServiceNow. VAST’s equivalent is CoreWeave and TwelveLabs. Different ISV profiles reflect different customer bases.

The SiliconANGLE ‘curated AI ecosystems’ framing from May 20, 2026 positions HPE’s Unleash AI approach as the emerging industry default. Whether this framing holds or whether vertically integrated stacks (VAST) or hyperscaler-controlled ecosystems (Google) prove more durable is an open question for the 4+1 assessment series.
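The load-bearing comparison in the Borrowed Judgment notes above amounts to a removal test: take a partner away and ask whether the vendor loses a platform capability or only a use case. The sketch below encodes that test using only the partners and outcomes named in the comparison. The `fills_platform_gap` flag and the function names are hypothetical, and Kamiwaza is deliberately left out because the assessment places it at Layer 2C rather than Layer 3. Note that a single flag cannot separate "curated" from "additive"; it only isolates the load-bearing case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partner:
    name: str
    vendor: str
    provides: str
    fills_platform_gap: bool  # hypothetical flag: does the vendor's platform lack this function?


# Layer 3 ecosystem partners as characterized in the comparison above.
PARTNERS = [
    Partner("Cohere North", "Dell", "agent workflow orchestration", True),
    Partner("DataRobot", "Dell", "lifecycle management", True),
    Partner("Deloitte Zora AI", "HPE", "finance use case", False),
    Partner("TwelveLabs", "VAST", "video AI use case", False),
]


def loss_on_removal(partner: Partner) -> str:
    """Describe what the vendor loses if this partner is removed."""
    kind = "platform capability" if partner.fills_platform_gap else "use case"
    return f"{partner.vendor} loses a {kind}: {partner.provides}"


def is_load_bearing(vendor: str, partners=PARTNERS) -> bool:
    """An ecosystem is load-bearing if any partner fills a platform gap."""
    return any(p.fills_platform_gap for p in partners if p.vendor == vendor)


for p in PARTNERS:
    print(loss_on_removal(p))
print({v: is_load_bearing(v) for v in ("Dell", "HPE", "VAST")})
```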
════════════════════════════════════════════════════════════════════════════════

# IBM / Red Hat OpenShift AI Mapped to the 4+1 Layer AI Infrastructure Model

**Version:** v1.0 (Draft, Editorial Review Pending)
**Date:** May 23, 2026
**Source:** IBM Think 2026, Red Hat Summit 2026, Red Hat AI 3.4 GA, watsonx Orchestrate next-gen preview, IBM Sovereign Core GA, IBM Concert public preview, IBM Confluent acquisition, Granite 4.1 release, analyst coverage (SiliconANGLE, NAND Research, Futurum Group, ECI Research)

## Summary Finding

IBM / Red Hat is the only vendor in this assessment series attempting to build an enterprise AI operating model from middleware outward. Where Dell builds upward from hardware, VAST builds upward from storage, and Google builds downward from model intelligence, IBM builds from the platform layer (Red Hat OpenShift as the universal substrate) and extends authority in both directions: downward into infrastructure governance (Sovereign Core, Concert) and upward into agent orchestration (watsonx Orchestrate). The platform is the distribution vehicle, not the value. The value is the governance and orchestration intelligence that rides on top.

The critical analytical lens for IBM is separating defensible proprietary IP from open-source packaging. The majority of IBM's AI platform capabilities are open-source projects that run identically on VMware Tanzu, Amazon EKS, or bare Kubernetes: OpenShift (Kubernetes), vLLM (inference), KServe (model serving), Ray (distributed compute), Kubeflow (ML pipelines), MLflow (experiment tracking), Tekton (CI/CD), even InstructLab (model customization). An enterprise could replicate most of IBM's Layer 2A/2B capabilities on any CNCF-compliant Kubernetes distribution.
IBM's structural moats — capabilities that cannot be replicated without IBM — are concentrated in a narrow but strategically critical band: watsonx.governance (cross-platform AI assurance), watsonx Orchestrate (agentic control plane), Confluent integration with watsonx.data (governed real-time streaming), and Sovereign Core (runtime sovereignty). These are the components where IBM provides genuine authority above the Kubernetes baseline. IBM does not own compute silicon, does not own GPU scheduling, does not own networking fabric, does not own a high-performance AI-optimized storage platform, and does not own a frontier foundation model. Layer 0 is entirely Delegated or Absent — IBM provides no compute hardware, no server chassis, no networking switches, no cooling infrastructure, and no GPU fabric interconnect. IBM would be perfectly content for customers to run Layers 1A through 3 on a Dell AI Factory, HPE Private Cloud AI, or any OEM hardware. IBM's business model depends on someone else solving Layer 0. The consulting and services model reinforces the open-source strategy. IBM Consulting (~160,000 consultants) provides implementation expertise for the AI platform — but consulting is a competitive services market, not a platform dependency. Enterprises switch from IBM Consulting to Deloitte or Accenture for platform support the same way they switch SAP BASIS support providers: the structural moat is the platform IP (watsonx.governance, Orchestrate), not the services engagement. IBM Consulting is a competitive advantage in the services market, not a structural advantage in the platform architecture. The structural question for IBM is whether governance and orchestration authority — owning the narrow band of non-substitutable AI control plane software while everything else is open-source — is more durable than infrastructure authority (Dell, HPE), storage authority (VAST), or model authority (Google). 
The 4+1 model suggests this bet is architecturally sound — Layer 2C is where authority concentrates — but IBM must prove that watsonx Orchestrate's control plane is substantive, not just well-named, and that watsonx.governance's cross-platform assurance creates sufficient switching costs to justify the subscription when the rest of the stack is free. ## ○ Layer 0: Compute & Network Fabric *Raw compute, networking, and acceleration fabric* **Status:** Delegated to Partners ### Vendor-Provided Components **Red Hat Enterprise Linux for NVIDIA** [DAPM: Delegated] RHEL 26.01 with Day 0 Blackwell support. Co-engineering for Vera Rubin. Validated driver, firmware, and runtime integration for NVIDIA AI Factory deployments on OpenShift. Integration work, not hardware authority. **Red Hat AI Factory with NVIDIA (Software Platform)** [DAPM: Delegated] Co-engineered software stack: Red Hat AI Enterprise + NVIDIA AI Enterprise. Unified lifecycle management across RHEL AI, OpenShift AI, Red Hat AI Enterprise. Not a hardware product — a validated software integration path that runs on OEM hardware IBM does not provide. **Multi-Accelerator Platform Support (OpenShift AI 3.4)** [DAPM: Delegated] NVIDIA GPU, AMD GPU (ROCm), Intel Gaudi, IBM Spyre (Tech Preview). Each accelerator uses a different vLLM ServingRuntime variant for KServe. Broadest accelerator support of any AI platform assessed — but this is Layer 2A/2B software integration, not Layer 0 hardware authority. **No Networking Fabric IP** [DAPM: Absent] IBM provides no networking hardware, no GPU fabric interconnect, no software-defined AI networking. OpenShift SDN handles container networking but not the GPU-to-GPU fabric that determines distributed training performance. East-west bandwidth is entirely OEM-dependent. Compare to Dell (PowerSwitch/Spectrum), HPE (Juniper + Slingshot + Aruba), AWS (EFA/SRD). 
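The Layer 0 classifications listed above can be tallied mechanically. The sketch below is illustrative only: the enum and helper names are hypothetical, and it assumes that Retained is the only DAPM classification that signals vendor-held authority at a layer. Under that assumption, three Delegated components and one Absent component yield no Layer 0 authority for IBM.

```python
from collections import Counter
from enum import Enum


class DAPM(Enum):
    RETAINED = "Retained"
    DELEGATED = "Delegated"
    CEDED = "Ceded"
    ABSENT = "Absent"


# IBM / Red Hat Layer 0 components and their classifications, as listed above.
IBM_LAYER_0 = {
    "Red Hat Enterprise Linux for NVIDIA": DAPM.DELEGATED,
    "Red Hat AI Factory with NVIDIA (Software Platform)": DAPM.DELEGATED,
    "Multi-Accelerator Platform Support (OpenShift AI 3.4)": DAPM.DELEGATED,
    "No Networking Fabric IP": DAPM.ABSENT,
}


def tally(components: dict) -> Counter:
    """Count components per DAPM classification."""
    return Counter(c.value for c in components.values())


def has_vendor_authority(components: dict) -> bool:
    """Assumption: only Retained components count as vendor authority."""
    return any(c is DAPM.RETAINED for c in components.values())


print(dict(tally(IBM_LAYER_0)))           # {'Delegated': 3, 'Absent': 1}
print(has_vendor_authority(IBM_LAYER_0))  # False
```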
### NVIDIA-Provided Components **NVIDIA GPU Silicon (via OEM Partners)** Blackwell, Vera Rubin available through Dell, HPE, Lenovo, Supermicro OEM partners running OpenShift. Red Hat Enterprise Linux for NVIDIA 26.01 provides Day 0 Blackwell support with Vera Rubin co-engineering underway. **NVIDIA AI Enterprise on OpenShift** GPU Operator, NVIDIA Run:ai (included in AI Enterprise), NIM microservices, DOCA runtime protection. Validated integration path through Red Hat AI Factory with NVIDIA. ### Gap Analysis IBM does not own compute silicon, server hardware, networking switches, cooling infrastructure, or AI-optimized network fabric. This is the emptiest Layer 0 of any vendor assessed. Dell owns PowerRack/PowerEdge/PowerSwitch/PowerCool. HPE owns ProLiant/Cray/Juniper/Slingshot/Aruba. VAST co-designs CNode-X storage hardware. Google owns TPUs and custom networking. AWS owns Trainium and EFA/SRD. IBM owns none of these. IBM's Layer 0 story is entirely indirect: Red Hat OpenShift runs on any x86/ARM hardware from any OEM. This is presented as a strength (hardware agnosticism, no vendor lock-in) but in 4+1 terms it means IBM has no Layer 0 authority whatsoever. Abstraction is not authority. Layer 0 defines what physical capabilities exist — compute density, thermal envelope, east-west bandwidth, accelerator topology. IBM has no opinion on any of these because IBM provides none of them. The networking gap is particularly significant for AI workloads. GPU-to-GPU bandwidth determines distributed training performance. Dell has PowerSwitch (NVIDIA Spectrum silicon). HPE owns Juniper + Slingshot for HPC fabric. AWS built EFA with SRD custom transport. IBM has no networking IP — not hardware, not software-defined networking for GPU fabrics. OpenShift's SDN handles container networking, not the GPU fabric networking that AI training requires. 
When an enterprise runs distributed training across 64 GPUs on OpenShift, east-west bandwidth depends entirely on whatever the OEM provided. IBM contributes nothing. The Red Hat AI Factory with NVIDIA co-engineering is significant: Day 0 Blackwell support, Vera Rubin co-engineering, NVIDIA Run:ai integration. But this is validation and integration work, not silicon or fabric authority. The same NVIDIA software stack runs on Dell, HPE, and Lenovo hardware. Note: IBM Z/Power with Telum on-chip AI inference is classified at Layer 3 (AI Application Layer), not Layer 0. Telum's value is transactional AI inference co-located with enterprise ledgers (banking, insurance) — this is an application-layer advantage where AI capability is adjacent to business data, not an infrastructure fabric capability. The same logic applies to SAP HANA on dedicated hardware: the value is in the application adjacency, not the compute fabric. ### Borrowed Judgment Total at Layer 0. IBM borrows all compute, networking, cooling, and fabric judgment from OEM partners and NVIDIA. The enterprise retains hardware vendor choice — a genuine governance benefit — but IBM adds no proprietary hardware value at any sub-layer: no silicon, no thermal engineering, no network fabric, no rack integration. Compare to Dell (retains mechanical/thermal/rack authority), HPE (retains networking end-to-end post-Juniper, cooling via Cray DLC, silicon agnosticism within owned chassis), VAST (retains storage hardware co-design with CNode-X). The abstraction-as-authority argument fails the 4+1 test. OpenShift abstracting hardware is a Layer 2A capability (infrastructure orchestration), not a Layer 0 capability. Layer 0 asks: what physical capabilities does the vendor provide? IBM's answer: none. ### Working Notes IBM Spyre AI Accelerator in Technology Preview is worth tracking. 
If IBM productizes custom AI silicon for OpenShift, the Layer 0 story changes fundamentally — IBM would join Google (TPU) and AWS (Trainium) as vendors with proprietary AI acceleration. But Technology Preview is not production. The multi-accelerator support story (NVIDIA, AMD ROCm, Intel Gaudi, IBM Spyre through different vLLM ServingRuntime variants) is the broadest of any platform assessed. But this is a Layer 2A/2B capability (platform support for multiple accelerators via Kubernetes operators), not Layer 0 authority. Supporting accelerators through software is fundamentally different from providing accelerators through hardware. Gemini's assessment frames Layer 0 abstraction as 'Silicon Decoupling' — a deliberate strategic choice. This framing is accurate as strategy but misleading as architecture. The enterprise architect choosing IBM accepts that Layer 0 is someone else's problem. The 4+1 model makes that acceptance visible. ## ◑ Layer 1A: Data Storage & Governance *Durable, governed data foundation — the Governance Catalog that Layer 2C queries* **Status:** Governance Strength, Storage Gap ### Vendor-Provided Components **watsonx.governance** [DAPM: Retained] Enterprise AI governance: Governance Graph (connected map of AI assets, policies, risks, regulations), model monitoring (bias, drift, fairness), agentic monitoring and security, regulatory library (EU AI Act, HIPAA, GDPR). Cross-platform: governs IBM, OpenAI, AWS, Meta models. IDC named IBM a Leader in AI governance. **watsonx.data (Data Lakehouse)** [DAPM: Retained] Open data lakehouse with Presto and Spark engines, Iceberg table format. Shared metadata layer across clouds and on-premises. GPU-accelerated Presto (private preview). Context layer for AI-queryable metadata (private preview). watsonx.data intelligence provides data lineage, classification, quality, and Master Data Management. **IBM Confluent (Real-Time Streaming)** [DAPM: Retained] Kafka + Flink + Tableflow integrated into watsonx.data. 
Zero-copy data sharing: query live Kafka streams as Iceberg tables without ETL. $11B acquisition positions IBM as the only infrastructure vendor with owned real-time streaming substrate. **IBM Storage Ceph + IBM Storage Fusion** [DAPM: Retained] Ceph: distributed S3-compatible object storage for watsonx.data lakehouse. Fusion: storage services for OpenShift applications with data caching and acceleration. Storage Fusion HCI provides hosted platform for watsonx on-premises. Competent but not AI-optimized at competitor level. **IBM Sovereign Core (Data Sovereignty)** [DAPM: Retained] GA May 2026. Software platform enforcing data sovereignty across four pillars: operational, data, technology, AI sovereignty. Embeds policy at infrastructure runtime. Built on Red Hat OpenShift. Mistral AI as first certified model partner. Ensures data residency, model execution, and inference all operate within sovereign boundary. ### NVIDIA-Provided Components **GPU-Accelerated Presto (watsonx.data)** Private preview. Proof-of-concept with Nestlé showed 83% cost savings. GPU acceleration for analytical queries on the lakehouse. ### Gap Analysis IBM's Layer 1A is structurally split: strong governance, moderate storage. The governance stack is the strongest in this assessment. watsonx.governance provides end-to-end AI lifecycle governance — model monitoring, bias detection, drift monitoring, regulatory compliance (EU AI Act, HIPAA, GDPR), and a Governance Graph that maps relationships between AI assets, policies, risks, and regulatory requirements across platforms. Unlike Dell's Trust3 AI (partner overlay) or VAST's PolicyEngine (proprietary, platform-bounded), watsonx.governance operates across IBM and third-party platforms (OpenAI, AWS, Meta). This is the only cross-platform AI governance solution assessed. The watsonx.data lakehouse provides a governed data foundation with Presto and Spark engines, Iceberg table format, and shared metadata across cloud and on-premises. 
IBM Storage Ceph provides S3-compatible object storage. IBM Storage Fusion provides storage services for OpenShift applications. The Confluent acquisition adds real-time streaming (Kafka + Flink + Tableflow) with zero-copy data sharing — AI models can query live Kafka streams as Iceberg tables without ETL. But the storage infrastructure itself is not AI-optimized at the level of competitors. Compare: Dell Exascale provides 10+ PB/rack unified file+object+fast-file with MetadataIQ indexing billions of files. HPE Alletra X10000 provides unified file+object with Data Fabric policy-based placement. VAST Element Store collapses file, object, table, and vector into a single governed data structure. IBM Storage Ceph is competent distributed object storage but lacks the AI-specific metadata enrichment, inline embedding, or vector-native capabilities of Dell, HPE, or VAST storage platforms. The watsonx.data intelligence layer (data lineage, automated classification, Master Data Management, data quality) partially compensates by providing governance above storage — but the storage layer underneath is less performant and less AI-native than competitors' purpose-built solutions. ### Borrowed Judgment Low for governance (watsonx.governance, watsonx.data intelligence are IBM IP). Moderate for storage (IBM Storage Ceph and Fusion are IBM-owned but lack AI-specific optimizations). Low for streaming data (Confluent is now IBM-owned post-acquisition). The cross-platform governance capability is unique: watsonx.governance monitors models running on OpenAI, AWS, or Meta — not just IBM. This is the only assessed solution where governance authority is deliberately designed to extend beyond the vendor's own platform boundary. ### Working Notes The Confluent acquisition is the most strategically significant data-layer move from any vendor in this assessment. 
Real-time streaming + governed lakehouse + cross-platform governance creates a data foundation story that is architecturally distinct. Where Dell invested in Dataloop (pipeline orchestration) and VAST built DataEngine (serverless compute on data), IBM acquired the streaming substrate itself. The watsonx.data Context layer (private preview) adds contextual metadata directly into the lakehouse — making data AI-queryable without separate ETL. If this matures, it could address the metadata boundary problem identified in the Control Plane Working Notes: metadata that is both governed and real-time, not just batch-indexed. IBM's Governance Graph — mapping AI assets through policies, risks, and regulatory requirements as a connected graph — is the most sophisticated governance data model in this assessment. Whether it can serve as the queryable governance surface that a Layer 2C control plane needs is the open question. ## ◑ Layer 1B: Context Management & Retrieval *Low-latency retrieval for RAG — vector/hybrid search, context windows* **Status:** Open-Source Assembled ### Vendor-Provided Components **OpenShift AI Model Serving (KServe + vLLM)** [DAPM: Retained] KServe for model serving orchestration. vLLM as primary inference runtime with support for NVIDIA, AMD, Intel Gaudi, and IBM Spyre accelerators. Serverless (Knative) and RawDeployment modes. Autoscaling based on request concurrency. **InstructLab (Open-Source Model Customization)** [DAPM: Retained] IBM-led open-source project for enterprise model customization. Structured taxonomy-based knowledge contribution without full retraining. Reduces dependency on model provider for domain-specific retrieval quality. Community-driven knowledge curation. **Vector Database Integration (External)** [DAPM: Delegated] OpenShift AI supports deployment of Milvus, Elasticsearch, pgvector, and other vector databases as containerized workloads. No IBM-owned vector database — enterprise selects and manages. 
Integration with watsonx.data for hybrid structured+unstructured queries.

### NVIDIA-Provided Components

**NVIDIA NeMo Retriever** Embedding models and retrieval microservices available through Red Hat AI Factory with NVIDIA. Same components available across Dell and HPE deployments.

### Gap Analysis

Applying the Kubernetes-baseline test: IBM has zero defensible IP at Layer 1B. Every component, whether KServe (CNCF), vLLM (open-source), Milvus (open-source), InstructLab (Apache 2.0), or Granite Guardian (Apache 2.0), runs identically on VMware Tanzu, Amazon EKS, or bare Kubernetes without IBM involvement. A competent ML engineering team can deploy the same retrieval stack on any CNCF-compliant cluster without IBM licensing, IBM consulting, or IBM support.

This is IBM's weakest layer from a defensibility standpoint. The retrieval capability exists and works. The enterprise can build effective RAG pipelines on OpenShift AI. But nothing in the pipeline is IBM-specific. Compare:

• VAST: InsightEngine provides end-to-end embedding + vector search + retrieval pipeline as a single integrated system on the same Element Store. Embeddings trigger the moment data lands, so vectors are always current with source data. One authority, zero integration seams. This is proprietary IP that cannot be replicated outside VAST.
• Dell: Data Search Engine (Elastic-powered) + MetadataIQ + NVIDIA cuVS. Three-party dependency, but MetadataIQ is Dell-proprietary metadata integration, a defensible asset. The Elastic partnership provides search intelligence Dell doesn't own but has engineered deep integration with.
• HPE: Data Fabric namespace + NVIDIA NeMo Retriever + Milvus/LangChain. HPE has proprietary storage infrastructure underneath but no proprietary retrieval intelligence. When Kamiwaza enters via Unleash AI, it adds governed context orchestration, a defensible Layer 1B/2C capability.
• IBM: Everything open-source.
No proprietary vector database, no proprietary embedding pipeline, no proprietary search intelligence, no inline metadata enrichment at the storage layer. IBM provides the Kubernetes platform on which open-source retrieval components run. The platform is the value — but the platform is Layer 2A, not Layer 1B. The structural seam Gemini correctly identifies: context management exists as a distinct software layer on top of storage, not inline with data writes. IBM's architecture requires explicit pipeline configuration for embeddings. VAST's architecture makes embeddings structural. Dell's MetadataIQ makes metadata indexing structural. IBM has no structural retrieval integration — it's all application-layer assembly. InstructLab is IBM's most distinctive contribution — but it's a governance innovation (who controls what the model knows?) rather than a retrieval innovation. It's also Apache 2.0 and runs on any platform. The distinction matters: InstructLab is IBM-led community innovation, not IBM-owned defensible IP. watsonx.data's Context layer (private preview) may change this assessment. If contextual metadata becomes natively queryable for RAG within the governed lakehouse, IBM would have a proprietary retrieval integration point. But private preview is not production. ### Borrowed Judgment High — the highest borrowed judgment of any IBM layer. IBM borrows retrieval intelligence entirely from open-source communities (Milvus, Elasticsearch, pgvector), embedding models from NVIDIA (NeMo Retriever) or open-source (sentence-transformers), and search acceleration from NVIDIA (cuVS). IBM provides no proprietary retrieval logic, no proprietary vector indexing, and no proprietary search intelligence. This is not a criticism of the architecture — open-source retrieval works. It is a DAPM observation: the enterprise's retrieval pipeline at Layer 1B has no IBM authority in it. The judgment is borrowed entirely from open-source communities and NVIDIA. 
If Milvus changes its indexing heuristics, IBM inherits the change. If NVIDIA changes NeMo Retriever's embedding strategy, IBM inherits that too. IBM provides packaging, not judgment. Compare to VAST (low — owns InsightEngine, DataBase, Catalog, vector search) or Dell (moderate — owns MetadataIQ, delegates search to Elastic, depends on NVIDIA for acceleration). ### Working Notes InstructLab is IBM's most distinctive contribution at this layer — and it's deliberately positioned as an open-source community project rather than proprietary IP. InstructLab allows enterprises to contribute domain knowledge to model training through a structured taxonomy, reducing the enterprise's dependency on model providers for domain-specific retrieval quality. This is a governance innovation (who controls what the model knows?) rather than a retrieval innovation. The Granite Guardian models (safety and guardrails) operate at the boundary between Layer 1B and 2B — filtering retrieved content before it reaches the model. This is a retrieval governance function that Dell's Elastic and VAST's InsightEngine don't address at the retrieval layer. ## ● Layer 1C: Data Movement & Pipelines *Data lifecycle automation — ingestion, transformation, pipeline orchestration* **Status:** Confluent + Open-Source Pipeline ### Vendor-Provided Components **IBM Confluent (Kafka + Flink + Tableflow)** [DAPM: Retained] IBM-defensible IP. Real-time data streaming platform. Kafka for event streaming, Flink for stream processing, Tableflow for zero-copy integration with watsonx.data Iceberg tables. $11B acquisition. Enables AI agents to reason over live event streams without ETL. No other infrastructure vendor owns a real-time streaming substrate. **IBM DataStage (watsonx Edition)** [DAPM: Retained] IBM-defensible IP. Enterprise data integration engine with decades of maturity. Graphical and code-first pipeline building. Automated data cleansing, tokenization, formatting for LLM consumption. 
Comprehensive lineage tracking — trace which raw document fed into a specific fine-tuning or RAG dataset. Data sanitization (PII, hate speech, copyrighted content) reduces Layer 2C compliance burden.

**ML Pipeline Stack (Kubeflow + Ray + MLflow + Spark + Tekton)** [DAPM: Delegated]
Kubernetes-baseline capability. Open-source ML lifecycle on OpenShift AI. Kubeflow for pipeline orchestration, Ray for distributed training, MLflow for experiment tracking, Spark for data processing, Tekton for CI/CD. All run identically on VMware Tanzu, EKS, or bare Kubernetes. IBM provides enterprise packaging, not proprietary capability.

**watsonx.data Query Federation** [DAPM: Retained]
IBM-defensible IP. Zero-copy query federation to external data platforms: Confluent Tableflow, Databricks Unity Catalog, Snowflake Open Catalog, Salesforce Data Cloud. Presto and Spark engines query data where it resides without copying. The federation logic is IBM-specific integration.

### NVIDIA-Provided Components

**NVIDIA RAPIDS (via Spark Integration)**
GPU-accelerated Spark on watsonx.data for pipeline processing. Same RAPIDS integration available across vendor platforms.

### Gap Analysis

Applying the Kubernetes-baseline test, Layer 1C splits cleanly into IBM-defensible IP and open-source commodity — and the defensible half is genuinely strong.

IBM-defensible IP (not replicable on Tanzu or bare Kubernetes):

(1) Confluent (Kafka + Flink + Tableflow): IBM-owned post-$11B acquisition. Real-time streaming as infrastructure, not batch ETL. Tableflow zero-copy integration with watsonx.data Iceberg tables is IBM-specific. Kafka itself is open-source but the governed integration with the IBM lakehouse is proprietary. This is the most significant data-layer acquisition from any vendor in this assessment — no other infrastructure vendor owns a real-time streaming substrate.

(2) DataStage (watsonx Edition): IBM proprietary enterprise data integration engine. Decades of maturity.
Graphical and code-first pipeline building with automated data cleansing, tokenization, formatting for LLM consumption, and comprehensive lineage tracking. Enterprises can trace which raw document fed into a specific fine-tuning or RAG dataset. Compare to Dell's Dataloop (~$120M acquisition, less mature) or HPE's Airflow packaging (open-source, no proprietary lineage).

(3) watsonx.data query federation: Zero-copy federation to Confluent Tableflow, Databricks Unity Catalog, Snowflake Open Catalog, Salesforce Data Cloud. IBM-specific integration logic.

Kubernetes-baseline (replicable on any CNCF-compliant cluster):

• Kubeflow — CNCF, pipeline orchestration, runs on any Kubernetes
• Ray — open-source, distributed compute, runs anywhere
• MLflow — open-source, experiment tracking, runs anywhere
• Spark — open-source, data processing, runs anywhere
• Tekton — CNCF, CI/CD pipelines, runs on any Kubernetes

The 'Strong' classification is earned by the defensible half: Confluent + DataStage + watsonx.data federation. The open-source pipeline tools are commodity packaging — identical to HPE Ezmeral's approach, and replicable by any competent platform engineering team on any Kubernetes distribution.

The architectural gap: unlike VAST's DataEngine (where pipeline functions execute directly on storage with CRDs — compute moves to data), IBM's pipelines run on OpenShift as separate containerized workloads — data moves to compute. For large-scale AI training pipelines, this creates more data movement than VAST's architecture. For real-time inference pipelines consuming Confluent streams, the data velocity advantage compensates.

The resource overhead concern (correctly identified by Gemini's assessment): DataStage and Tekton are heavy enterprise platforms designed for complex corporate data architectures. For agile AI teams accustomed to Python scripts and LlamaIndex, IBM's data movement layer can feel over-engineered.
This is a real practitioner concern — IBM's Layer 1C is enterprise-grade but not lightweight.

### Borrowed Judgment

Low for defensible components. IBM now owns the streaming substrate (Confluent), the enterprise data integration engine (DataStage), and the lakehouse federation (watsonx.data). These are IBM IP with no external authority dependency.

Moderate for open-source components. Kubeflow, Ray, MLflow, Spark, and Tekton are community-governed. IBM packages and supports them but inherits community judgment on architecture, performance, and API design. This is the same pattern as HPE's Ezmeral — enterprise packaging of open-source pipelines.

NVIDIA provides GPU acceleration for Spark (RAPIDS) but the pipeline orchestration, streaming, data integration, and lifecycle are IBM-owned or community-governed. NVIDIA's authority at Layer 1C is limited to acceleration, not orchestration.

Compare to Dell: Dell owns Dataloop (Retained) but depends on Starburst, NVIDIA, and ISV partners for the broader pipeline. Four authority boundaries.

Compare to VAST: VAST owns everything at Layer 1C. One authority, total vendor dependency.

IBM's model is distinctive: own the streaming substrate and enterprise data integration (defensible IP), package the open-source pipeline tools (commodity), federate across external data platforms (defensible integration). The enterprise retains more substitutability than VAST offers (can swap Kubeflow for Airflow without touching Confluent) but less than pure open-source (Confluent streaming integration is IBM-specific).

### Working Notes

The Confluent acquisition creates a unique data velocity advantage. Dell, HPE, and VAST focus on data at rest (storage) and data in batch motion (pipelines). IBM now owns data in continuous motion (streaming).
For agentic AI where agents need to reason over current events, current transactions, current sensor data — not yesterday's batch export — this is an architectural differentiator that no other infrastructure vendor possesses. Whether IBM can integrate Confluent deeply enough with watsonx.data to deliver on the 'zero-copy' promise is the execution question. The technology exists; the integration maturity is early.

DataStage's data sanitization capabilities (removing PII, hate speech, copyrighted content prior to model training) are a crucial data-plane capability that directly reduces the compliance burden on Layer 2C downstream. This is the pipeline-to-governance connection that the 4+1 model identifies as critical: clean data in the pipeline means fewer governance exceptions at the reasoning plane.

## ● Layer 2A: Infrastructure Orchestration

*Lifecycle management, scheduling, multi-tenant isolation, capacity management*

**Status:** OpenShift Strength

### Vendor-Provided Components

**Red Hat OpenShift (Container Platform)** [DAPM: Retained]
Enterprise Kubernetes platform. Hybrid cloud consistency across on-prem, AWS (ROSA), Azure (ARO), GCP, IBM Cloud, edge. Lifecycle management, security (SELinux, FIPS), multi-tenant isolation. The universal substrate for IBM's AI stack.

**Red Hat OpenShift AI 3.4** [DAPM: Retained]
MLOps + GenAIOps + AgentOps on OpenShift. Models-as-a-Service with token quotas, rate limiting, API key self-service, showback dashboards. AI gateway via Connectivity Link (Envoy/Kuadrant/Istio). Multi-accelerator model serving (NVIDIA, AMD, Intel Gaudi, IBM Spyre).

**IBM Concert Platform** [DAPM: Retained]
Public preview (Think 2026). Agentic operations platform: Concert Observe (Instana), Concert Optimize (Turbonomic), Concert Protect (security/Secure Coder), Concert Operate (cross-domain incident response). Shared operational data layer across applications, infrastructure, networks, security. Graph-driven operations model.

**Red Hat Ansible Automation Platform** [DAPM: Retained]
Infrastructure automation across hybrid environments. Ansible Lightspeed with IBM watsonx for AI-assisted automation content creation. Established enterprise automation authority — the bridge between AI platform operations and existing IT operations.

### NVIDIA-Provided Components

**NVIDIA GPU Operator + Run:ai**
GPU scheduling, fractionalization (MIG), workload management on OpenShift clusters. Run:ai now included in NVIDIA AI Enterprise for OpenShift deployments.

### Gap Analysis

Layer 2A requires careful separation of Kubernetes-baseline capabilities from IBM-defensible IP.

Kubernetes-baseline (replicable on Tanzu, EKS, or bare K8s): Container orchestration, namespace isolation, multi-tenancy, RBAC, lifecycle management, GPU scheduling via NVIDIA GPU Operator, model serving via KServe, distributed compute via Ray, CI/CD via Tekton. These are CNCF-ecosystem capabilities that IBM packages with enterprise support but does not own. VMware Tanzu provides the same primitives through VCF 9.1. An enterprise could switch from OpenShift to Tanzu without losing these capabilities — the same way enterprises switch SAP BASIS support from IBM to Accenture without touching SAP.

IBM-defensible IP above the Kubernetes baseline:

• NVIDIA bracketing: IBM is the only vendor in this assessment that makes NVIDIA AI Enterprise optional at Layer 2A. OpenShift AI's Kueue integration provides Kubernetes-native multi-tenant GPU queue management, quota enforcement, and fair-share scheduling without Run:ai. Dell's Layer 2A is 'Gap — Ceded to NVIDIA Run:ai.' HPE has the same NVIDIA GPU scheduling dependency. VMware depends on NVIDIA GPU Operator. IBM's platform team can dictate exactly how GPUs are carved up, queued, and billed using native OpenShift primitives, rendering NVIDIA's commercial scheduling layer optional. The enterprise can still use Run:ai on OpenShift — but it doesn't have to.
This is a structurally significant Layer 2A differentiator.
• Concert platform (Instana + Turbonomic + security modules): Cross-domain observability spanning applications, infrastructure, networks, and security with a graph-driven operations model. No CNCF equivalent. Datadog and Dynatrace compete at the product level but Concert's six-module integrated architecture (Observe, Operate, Optimize, Protect, Secure, Resilience) is IBM-specific.
• Ansible Automation Platform: Established enterprise automation authority with Ansible Lightspeed (AI-assisted automation). Red Hat-owned IP with no Kubernetes-native equivalent.
• OpenShift AI 3.4 Models-as-a-Service: Token quotas, rate limiting, self-service API keys, showback dashboards built as Kubernetes CRDs on Envoy/Kuadrant/Istio. The underlying components are open-source, but the composition and integration is IBM-specific packaging. An SI could replicate this on Tanzu with sufficient engineering — but IBM provides it out of the box.
• Hybrid consistency: Same OpenShift control plane across on-prem, AWS (ROSA), Azure (ARO), GCP, IBM Cloud, edge. No other assessed vendor provides the same AI platform management plane across all major clouds. This is genuine differentiation — but OpenShift-specific, not a capability the enterprise retains if they move to Tanzu.

The 'Strong' classification is justified by two structural differentiators that no other on-prem vendor matches: NVIDIA bracketing (making Run:ai optional through native Kueue scheduling) and hybrid consistency (same control plane everywhere). These are competitive advantages in the Kubernetes platform market. Concert and Ansible add IBM-specific observability and automation above the Kubernetes baseline. But the core orchestration primitives remain CNCF-baseline — maturity in Kubernetes packaging is a competitive advantage, not a structural moat.
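
To make "quota enforcement and fair-share scheduling" concrete, the following is a minimal sketch of max-min fair allocation of whole GPUs across tenants, capped by a per-tenant quota. It is an illustration only: the function name, tenant names, and numbers are hypothetical, and it is not a description of how Kueue or Run:ai actually schedule.

```python
from typing import Dict


def fair_share(capacity: int, demands: Dict[str, int], quotas: Dict[str, int]) -> Dict[str, int]:
    """Max-min fair allocation of whole GPUs, capped by per-tenant quota and demand.

    Hypothetical illustration; not the algorithm of any specific scheduler.
    """
    # A tenant can never receive more than min(demand, quota).
    caps = {tenant: min(demand, quotas.get(tenant, 0)) for tenant, demand in demands.items()}
    alloc = {tenant: 0 for tenant in demands}
    remaining = capacity
    # Sorted order makes the result deterministic when capacity runs out mid-round.
    active = [tenant for tenant in sorted(caps) if caps[tenant] > 0]

    while remaining > 0 and active:
        # Each round hands one GPU to every tenant that is still below its cap.
        for tenant in list(active):
            if remaining == 0:
                break
            alloc[tenant] += 1
            remaining -= 1
            if alloc[tenant] == caps[tenant]:
                active.remove(tenant)
    return alloc


if __name__ == "__main__":
    print(fair_share(
        capacity=8,
        demands={"fraud": 6, "search": 2, "research": 4},
        quotas={"fraud": 4, "search": 4, "research": 8},
    ))
```

The point of the sketch is the authority question rather than the arithmetic: whoever sets the `quotas` table and the tie-breaking order is the party that holds the scheduling judgment.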
### Borrowed Judgment

Requires disaggregation:

For Kubernetes-baseline capabilities: the enterprise borrows Kubernetes community judgment (scheduling, networking, storage orchestration) and NVIDIA judgment (GPU Operator, Run:ai). This borrowed judgment is identical regardless of whether the enterprise runs OpenShift, Tanzu, or EKS. It is not IBM-specific.

For IBM-defensible IP: Low. Concert (Instana, Turbonomic), Ansible, and the OpenShift AI MaaS packaging are IBM-owned. The enterprise Cedes observability and automation authority to IBM when it adopts these — but can substitute with Datadog (observability) or Terraform (automation) without re-architecting the AI platform.

The structural comparison requires a new framing: Dell's 2A (OpenManage) manages Dell hardware only. HPE's 2A (GreenLake) manages HPE infrastructure. VAST's 2A (Polaris) manages VAST clusters. VMware's 2A (VCF) manages virtualized infrastructure. IBM's 2A (OpenShift) manages any hardware — but so does any Kubernetes distribution. The scope is broad; the defensibility is in the packaging maturity and hybrid consistency, not in the orchestration primitives themselves.

### Working Notes

The Models-as-a-Service architecture in OpenShift AI 3.4 is worth close attention. Token quotas, rate limiting, self-service API keys, and showback dashboards are Layer 2A functions that border on Layer 2C territory. When the platform decides which team gets how many tokens from which model — that's a placement decision. IBM positions these as 2A (resource management) rather than 2C (policy-driven placement), but the line is thin.

LLMD (referenced in Summit sessions for intelligent resource orchestration) suggests IBM is building model-aware scheduling capabilities within OpenShift AI. If LLMD makes placement decisions based on model characteristics, load patterns, and cost constraints, it's a Layer 2C signal from the platform layer.
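
The thin line between resource management and placement can be seen in a minimal sketch of token-quota admission with showback. Everything here is an assumption for illustration: the class names, the team and model names, and the prices are hypothetical and do not reflect the OpenShift AI Models-as-a-Service API.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TeamBudget:
    monthly_token_quota: int
    used: int = 0


@dataclass
class QuotaLedger:
    """Hypothetical per-team token quota with cost showback."""
    budgets: Dict[str, TeamBudget]
    price_per_1k_tokens: Dict[str, float]  # assumed price table, keyed by model name
    showback: Dict[str, float] = field(default_factory=dict)

    def admit(self, team: str, model: str, tokens: int) -> bool:
        """Admit the request only if it fits the team's remaining quota."""
        budget = self.budgets[team]
        if budget.used + tokens > budget.monthly_token_quota:
            return False  # rejected: the platform, not the team, made this call
        budget.used += tokens
        cost = tokens / 1000 * self.price_per_1k_tokens[model]
        self.showback[team] = self.showback.get(team, 0.0) + cost
        return True


if __name__ == "__main__":
    ledger = QuotaLedger(
        budgets={"risk": TeamBudget(monthly_token_quota=10_000)},
        price_per_1k_tokens={"example-model": 0.5},
    )
    print(ledger.admit("risk", "example-model", 8_000))  # fits within quota
    print(ledger.admit("risk", "example-model", 4_000))  # would exceed quota
    print(ledger.showback)
```

In this sketch the `admit` check is pure 2A resource accounting. It would drift toward 2C only if the ledger started choosing a different model or cluster for the rejected request instead of simply refusing it.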
Concert's six-module architecture (Observe, Operate, Optimize, Protect, Secure, Resilience) is the most comprehensive infrastructure operations platform in this assessment. Whether it constitutes a Layer 2C decision surface or a sophisticated Layer 2A monitoring/management system depends on whether Concert makes autonomous placement decisions or surfaces recommendations for human action.

## ◑ Layer 2B: Application Runtime & Execution

*Model serving, agent execution, inference APIs, distributed inference*

**Status:** Open-Source Runtime + NVIDIA

### Vendor-Provided Components

**Red Hat AI Inference (Model Serving)** [DAPM: Delegated]
Kubernetes-baseline capability. vLLM-based inference serving with KServe orchestration. OpenAI-compatible APIs. Models-as-a-Service architecture with enterprise authentication, token management, and showback. Supports NVIDIA, AMD, Intel Gaudi, IBM Spyre accelerators. vLLM and KServe are open-source — run identically on Tanzu or bare Kubernetes. IBM provides packaging and integration, not proprietary runtime.

**OpenShell Integration (Agent Governance)** [DAPM: Delegated]
IBM-shaped open-source. Red Hat co-engineering with NVIDIA on sandboxed agent runtime. Infrastructure-level policy enforcement for autonomous agents. Governs agent execution, tool access, and inference routing. Key Red Hat contribution to upstream open-source project. Feeds into Layer 2C vision.

**Confidential Containers + Agent Security** [DAPM: Retained]
IBM-defensible IP. Technology Preview: NVIDIA Confidential Computing within OpenShift sandboxed containers. Hardware-enforced agent isolation — protects against runtime compromise even if another agent is breached. Zero-trust architecture: SELinux + FIPS + DOCA runtime protection. Most comprehensive agent security architecture in this assessment. Not replicable on vanilla Kubernetes or Tanzu without significant engineering.

**Agent Lifecycle Management (MLflow-based)** [DAPM: Delegated]
Kubernetes-baseline capability.
LLM call tracing, tool execution tracking, reasoning step auditability. MLflow is open-source — runs on any platform. IBM provides integration with OpenShift AI and the watsonx governance stack. The auditability is enterprise-critical; the tooling is commodity.

**Granite Model Family** [DAPM: Retained]
IBM-defensible legal wrapper on open-source models. Granite 4.1 (3B, 8B, 30B dense models, Apache 2.0). ISO 42001 certified. Cryptographic model signing. Uncapped IP indemnity on watsonx.ai. Optimized for agentic workflows: tool calling, instruction following, function calling. Granite Guardian for safety guardrails. Models run anywhere (Hugging Face, Ollama, Dell Enterprise Hub, NVIDIA NIM). IP indemnity is IBM-specific — the only defensible element.

### NVIDIA-Provided Components

**NVIDIA NIM + NeMo + Triton (via AI Factory)**
Model serving, training, and inference runtime on OpenShift. Same NVIDIA runtime available across Dell, HPE, and Lenovo deployments.

**OpenShell (Sandboxed Agent Runtime)**
NVIDIA open-source project for sandboxed autonomous agent execution. Red Hat is a key contributor to upstream. Joint engineering underway to integrate with Red Hat's full-stack AI platform for infrastructure-level policy and oversight.

**NVIDIA Agent Toolkit**
Integrated into Red Hat AI Factory with NVIDIA for building autonomous agents. NemoClaw/OpenClaw agent runtime available on OpenShift.

### Gap Analysis

Applying the Kubernetes-baseline test at Layer 2B reveals the same pattern as Layer 1B: the execution runtime is entirely open-source commodity, and IBM's defensible value is the governance and security wrapper around it.

Kubernetes-baseline (replicable on Tanzu, EKS, or bare K8s):

• vLLM — open-source inference engine, runs on any Kubernetes with GPU Operator
• KServe — CNCF model serving orchestration, runs on any Kubernetes
• MLflow — open-source lifecycle management, runs anywhere
• Ray — open-source distributed compute, runs anywhere
• NVIDIA NIM — containerized model serving, runs on any Kubernetes with NVIDIA AI Enterprise

IBM-defensible IP above the baseline:

• Confidential containers + agent security: SELinux, FIPS, sandboxed containers with NVIDIA Confidential Computing (Technology Preview). Hardware-enforced agent isolation protecting against runtime compromise even if another agent is breached. Red Hat's security hardening is genuine IP that vanilla Kubernetes and Tanzu don't match out of the box. This is the most comprehensive agent security architecture in this assessment.
• OpenShell co-engineering: Red Hat contributing to upstream agent governance standards — infrastructure-level policy enforcement for autonomous agents. Not proprietary IP but IBM-shaped open-source that feeds into the 4+1 model's Layer 2C vision.
• Granite IP indemnity: The models are Apache 2.0 and run anywhere. The uncapped IP indemnity is IBM-specific legal protection available only through watsonx.ai. The model is portable; the legal wrapper is not.
• Granite Guardian: Safety guardrails, Apache 2.0. IBM-led but runs on any platform.

One observation Gemini makes correctly: IBM deliberately treats the model runtime as a portable commodity layer. By standardizing on vLLM + KServe, IBM eliminates the runtime fragmentation seen in Dell's stack (NemoClaw, OpenShell, NeMo Guardrails, Dynamo, NIMs — five NVIDIA components creating a tightly coupled runtime). IBM's runtime simplicity is the strategy: fewer components, fewer dependencies, more portability.
The trade-off is that IBM cannot achieve the extreme custom-silicon optimization of Google's Pathways/TPU integration or AWS's Trainium-optimized stack.

The 4+1 model distinction: IBM's Layer 2B borrowed judgment is in execution (how models run — entirely from open-source and NVIDIA). IBM's Layer 2B authority is in governance (how model execution is constrained, audited, and secured — from Red Hat security hardening and confidential containers). This is the inverse of Dell's profile: Dell borrows governance at 2B, IBM borrows execution at 2B.

The vendor comparison at 2B:

• Dell: NVIDIA runtime + ISV blueprints (Cohere North, DataRobot, ClearML). Dell adds services, not runtime IP. Runtime is Ceded to NVIDIA.
• HPE: NVIDIA runtime + Kamiwaza agent coordination + CrewAI/ISV frameworks. Three sources, three authorities.
• VAST: AgentEngine provides a unified proprietary agent runtime. One authority. The strongest 2B defensibility in the assessment.
• IBM: Open-source runtime (vLLM/KServe) + NVIDIA acceleration + Red Hat security wrapper + IBM governance overlay (watsonx.governance). IBM's value is not the runtime itself but the governance and security wrapper around an open-source runtime. The runtime is commodity; the wrapper is defensible.

### Borrowed Judgment

Requires disaggregation:

For execution runtime: High. vLLM, KServe, NVIDIA NIM, Triton are open-source or NVIDIA-controlled. IBM borrows all execution judgment from open-source communities and NVIDIA. This borrowed judgment is identical regardless of whether the runtime runs on OpenShift, Tanzu, or EKS.

For governance and security: Low. Confidential containers, SELinux/FIPS hardening, OpenShell co-engineering, agent lifecycle auditability, and the Granite IP indemnity are IBM/Red Hat IP or IBM-led open-source. The enterprise Cedes security architecture to Red Hat when adopting OpenShift's agent security stack — but Red Hat's security opinions are the product.

For model alignment: Variable.
Granite models carry IBM's alignment choices (ISO 42001, rigorous data filtering, enterprise-focused tuning). If the enterprise swaps Granite for Llama or Mistral, it inherits those providers' alignment choices — but the execution fabric remains under the enterprise's authority regardless. The model is a Layer 3 choice; the runtime is a Layer 2B choice. IBM correctly separates them.

### Working Notes

The Red Hat Advanced Developer Suite — trusted software factory, Trusted Libraries (SLSA Level 3), AI-driven exploit intelligence — adds a software supply chain governance layer that no other assessed vendor addresses at this depth. This is not traditional Layer 2B (model serving) but it's a critical enterprise concern: is the code that builds and deploys AI models itself trustworthy?

OpenShift Dev Spaces supporting AWS Kiro, Microsoft Copilot, Claude CLI, Cline, Continue, and Roo from a single governed runtime is an underappreciated capability. Multi-assistant coding from one governed workspace means the developer's AI tools inherit OpenShift's security and governance posture — a concrete example of infrastructure-level governance applied to AI-assisted development.

The three control points from Red Hat Summit 2026 (execution sandboxing, artifact provenance, short-lived agent identity) represent a governance-first approach to agent execution that aligns directly with the 4+1 model's Layer 2C thesis. The question: are these control points sufficient to constitute a Layer 2C function, or are they Layer 2B governance primitives that a separate Layer 2C must consume?

## ◑ Layer 2C: Agentic Infrastructure — The Reasoning Plane

*Policy-driven placement and resource coordination — the Autonomy Layer*

**Status:** Most Explicit 2C Claim

### Vendor-Provided Components

**watsonx Orchestrate (Next-Gen Agentic Control Plane)** [DAPM: Ceded]
Private preview (Think 2026). Multi-agent governance: manages agents from IBM native, LangFlow, LangGraph, and A2A protocol.
Centralized identity/credential management, policy enforcement, audit logging. Agent catalog for discovery and lifecycle. AgentOps for real-time observability and cost tracking. Over 100 domain agents and 400+ prebuilt tools.

**IBM Bob (Multi-Model Orchestration)** [DAPM: Ceded]
GA. Agentic developer assistant with multi-model routing: dynamically routes tasks to Claude, Mistral, Granite based on accuracy, latency, cost. Pass-through pricing. 80,000 internal IBM users, 45% average productivity gain. Demonstrates practical multi-variable placement decisions.

**IBM Sovereign Core (Runtime Sovereignty)** [DAPM: Retained]
GA May 2026. Four sovereignty pillars: operational, data, technology, AI. AI sovereignty enforced at runtime — governing where inference happens, who controls models, how decisions are logged/traced/reviewed. Built on OpenShift + Red Hat AI. Mistral AI first certified model partner.

**watsonx.governance (Cross-Platform AI Assurance)** [DAPM: Retained]
Governance Graph mapping AI assets through policies, risks, and regulatory requirements. Agentic monitoring and security capabilities. Cross-platform: governs IBM, OpenAI, AWS, Meta. Continuous compliance monitoring, not periodic audits. The governance query surface that a 2C control plane needs.

### NVIDIA-Provided Components

**OpenShell Policy Layer**
Governs agent execution, tool access, inference routing. 2B constraint enforcement with 2C policy potential. Same OpenShell available across NVIDIA partners.

### Gap Analysis

The Kubernetes-baseline test produces its most significant finding at Layer 2C: this is the only layer where IBM provides 100% defensible IP. There is no CNCF equivalent for cross-framework agent governance, no open-source multi-variable model routing, no community-driven runtime sovereignty enforcement, no Kubernetes-native cross-platform AI lifecycle assurance. Every component at Layer 2C is IBM proprietary. Nothing here runs on Tanzu without IBM licensing.

This finding validates IBM's entire strategic architecture. Layers 0 through 2B are progressively commodity — open-source packaging on delegated hardware. Layer 3 is consulting and models in a competitive services market. Layer 2C is the narrow band where IBM provides capabilities that cannot be replicated without IBM. The moat is here.

IBM is the only vendor in this assessment that explicitly names its agent orchestration layer a 'control plane.' watsonx Orchestrate's next generation (private preview, Think 2026) is positioned as an 'agentic control plane for scaling and governing your AI' — and the capabilities described map directly to what the 4+1 model defines as Layer 2C.

Applying the Intelligence 2C vs. Infrastructure 2C split established in the AWS assessment:

Intelligence 2C (productized and portable): watsonx.governance provides continuous model monitoring, bias detection, drift tracking, regulatory compliance enforcement, and the Governance Graph mapping AI assets through policies, risks, and regulatory requirements. watsonx Orchestrate provides cross-framework agent governance — managing agents from IBM native, LangFlow, LangGraph, and A2A protocol with centralized policy enforcement, identity/credential management, and audit logging. IBM Bob demonstrates practical multi-model routing: tasks dynamically routed to Claude, Mistral, or Granite based on accuracy, latency, and cost — a multi-variable placement decision, not single-variable optimization like NVIDIA Dynamo's KV-aware routing. Sovereign Core enforces sovereignty as a runtime requirement, governing where inference happens, who controls models, and how decisions are logged within sovereign boundaries.

IBM's Intelligence 2C is the strongest in this assessment for on-premises deployments. Four capabilities that no other on-prem vendor matches:

(1) Cross-framework agent governance (watsonx Orchestrate) — manages agents regardless of which framework built them. Dell has no equivalent.
HPE delegates to Kamiwaza. VAST's AgentEngine governs VAST-native agents only.

(2) Cross-platform AI assurance (watsonx.governance) — governs models running on IBM, OpenAI, AWS, or Meta. No other governance solution spans vendor boundaries.

(3) Multi-variable model routing (IBM Bob) — 80,000 internal users, demonstrated accuracy/latency/cost optimization. Production evidence at scale.

(4) Runtime sovereignty (Sovereign Core) — sovereignty enforced at infrastructure runtime, not as a policy checkbox. No equivalent from any assessed vendor.

Infrastructure 2C (absent/manual): watsonx Orchestrate governs agent behavior but does not autonomously calculate: 'Based on real-time token cost, data residency tags in watsonx.data, and current GPU cluster queue times, route this inference to on-prem PowerEdge versus burst to Azure.' That infrastructure placement coordination remains a manual configuration task for the platform architect. No productized engine queries Layer 1A governance metadata to make multi-variable infrastructure placement decisions in real time.

This is the same split AWS exhibits: Intelligence 2C is productized (AgentCore Policy, Guardrails), Infrastructure 2C is implicit inside managed services. IBM's Intelligence 2C is more portable than AWS's (runs on-prem, multi-cloud). IBM's Infrastructure 2C is equally absent.

Six-vendor Layer 2C comparison:

• Dell: Absent. No productized control plane.
• HPE: Retained (IT infrastructure ops via GreenLake Intelligence) + Delegated (AI workloads via Kamiwaza).
• VAST: Retained/Emerging (PolicyEngine + Polaris — middle-out from data layer, GA end 2026).
• AWS: Intelligence 2C Delegated (AgentCore/Guardrails) + Infrastructure 2C Ceded/Implicit within managed services.
• Google: Full 2C — Agent Platform with Inference Gateway + DWS. Most production-proven. Entirely Ceded to Google.
• IBM: Intelligence 2C Retained/Ceded (watsonx.governance + Orchestrate — highly portable, productized, multi-cloud) + Infrastructure 2C Absent/Manual.

The production maturity gap: watsonx Orchestrate next-gen is in private preview. The capabilities are described, the architecture is sound, Bob provides production evidence of multi-model routing at scale (80,000 users). But the full agentic control plane is not GA. Compare to Google's Agent Platform (GA, production-deployed) and AWS's Bedrock AgentCore (GA). IBM's 2C is the most explicitly named, the most framework-agnostic, and the least production-proven as a complete system.

### Borrowed Judgment

Low — the lowest borrowed judgment of any IBM layer, because every component is IBM proprietary IP. watsonx Orchestrate, watsonx.governance, Sovereign Core, and Bob are all IBM-owned. No open-source dependency, no NVIDIA dependency, no partner dependency at this layer.

The framework-agnostic approach means IBM borrows less agent-level judgment from any single framework vendor — but it also means IBM's orchestration authority depends on integration quality with frameworks it doesn't control (LangFlow, LangGraph, A2A). If LangGraph changes its execution model, IBM must update the integration. This is a different kind of dependency than NVIDIA runtime dependency — it's integration maintenance, not architectural dependency.

The DAPM classification requires the Retained/Ceded distinction established in the series:

• watsonx Orchestrate: Ceded to IBM. The enterprise consumes IBM's control plane — configures policies within it, but cannot replace it without re-architecture. Portable across clouds but not substitutable.
• watsonx.governance: Retained by the enterprise. The enterprise defines its own governance policies, ethical thresholds, and compliance constraints. IBM provides the framework; the enterprise provides the judgment. The closest to genuinely Retained authority in IBM's stack.
• Sovereign Core: Retained by the enterprise. The enterprise defines sovereignty boundaries; IBM enforces them at runtime.
• Bob: Ceded to IBM. Multi-model routing logic is IBM's — the enterprise configures preferences but IBM's software makes the placement decisions.

Compare to Google: Agent Platform provides 2C as a deeply integrated platform capability. More production-proven but less framework-agnostic. Entirely Ceded to Google — no on-prem option.

Compare to HPE/Kamiwaza: Kamiwaza provides similar agent coordination but as a partner (Delegated). IBM provides it as owned IP.

Compare to VAST: PolicyEngine provides data-layer governance with 2C ambitions. VAST builds 2C from data up; IBM builds 2C from platform out. Different architectural vectors toward the same Layer 2C function.

### Working Notes

The Kubernetes-baseline finding at Layer 2C is the structural justification for IBM's entire AI strategy. Every layer below 2C is progressively more commodity — open-source runtime, open-source pipelines, open-source retrieval, delegated hardware. IBM's bet is that the narrow band of defensible IP at Layer 2C (governance + orchestration + sovereignty) captures more strategic value than the broad commodity layers beneath it. The 4+1 model suggests this bet is architecturally sound — Layer 2C is where authority concentrates.

The ECI Research finding — two-thirds of enterprise AI leaders have already implemented multi-agent collaboration — validates the urgency. The problem watsonx Orchestrate addresses (governing hundreds of agents from different frameworks with consistent policy) is real and growing.

The A2A protocol support is strategically important. If A2A becomes the standard for agent-to-agent communication (analogous to MCP for tool use), IBM's early support positions watsonx Orchestrate as the governance layer above the protocol — the control plane that governs A2A interactions. This is the platform-layer bet: don't own the protocol, govern the protocol.
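
The difference between multi-variable routing (the accuracy, latency, and cost trade-off attributed to Bob in the Gap Analysis) and a hard policy constraint (the data residency tags mentioned under Infrastructure 2C) can be sketched in a few lines. This is a hypothetical illustration: the `Candidate` fields, the weights, the normalization, and the model names are all assumptions, and nothing here describes IBM's actual routing logic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float      # assumed benchmark score in [0, 1], higher is better
    latency_ms: float    # lower is better
    cost_per_1k: float   # lower is better
    region: str          # where inference would run


def route(candidates: Iterable[Candidate], allowed_regions: Set[str],
          w_acc: float = 0.5, w_lat: float = 0.3, w_cost: float = 0.2) -> Optional[Candidate]:
    """Pick the best-scoring candidate among those that satisfy the residency constraint."""
    # Hard constraint first: residency is a filter, never a weight to trade off.
    eligible = [c for c in candidates if c.region in allowed_regions]
    if not eligible:
        return None

    max_lat = max(c.latency_ms for c in eligible)
    max_cost = max(c.cost_per_1k for c in eligible)

    def goodness(value: float, worst: float) -> float:
        # Map "lower is better" values onto [0, 1]; the worst candidate scores 0.
        return 1.0 - value / worst if worst > 0 else 1.0

    def score(c: Candidate) -> float:
        return (w_acc * c.accuracy
                + w_lat * goodness(c.latency_ms, max_lat)
                + w_cost * goodness(c.cost_per_1k, max_cost))

    return max(eligible, key=score)


if __name__ == "__main__":
    pool = [
        Candidate("model-a", 0.90, 800, 3.0, "eu"),
        Candidate("model-b", 0.80, 200, 0.5, "eu"),
        Candidate("model-c", 0.95, 400, 2.0, "us"),
    ]
    print(route(pool, {"eu"}))
```

In DAPM terms, the party that sets the weights and the `allowed_regions` set holds the judgment; the party that ships the `score` function holds the mechanism. The Ceded classification for Bob in the text corresponds to the second being outside the enterprise's control.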
The Intelligence 2C vs. Infrastructure 2C gap is the open question for IBM's roadmap. If Concert's Turbonomic module (GPU cost optimization, workload placement recommendations) evolves from recommendations to autonomous placement decisions informed by watsonx.data governance metadata, IBM would close the Infrastructure 2C gap from the observability layer. The data to make infrastructure placement decisions exists across Concert (infrastructure telemetry), watsonx.data (data governance metadata), and Confluent (real-time operational data). The placement engine that consumes all three does not yet exist as a productized capability.

## ◇ Layer 3 (+1): AI Application Layer — The Value Plane

*AI-powered business capabilities — business logic, workflow automation*

**Status:** Consulting-Led Ecosystem

### Vendor-Provided Components

**IBM Consulting (Competitive Services Market)** [DAPM: Retained]
~160,000 consultants with AI practice. Vertical industry solutions: banking (Banco Bradesco on ARO), healthcare, government, manufacturing. Competitive advantage in implementation services — but substitutable. Enterprise retains authority to select any SI (Deloitte, Accenture, Wipro, boutique firms) for platform implementation without architectural impact. The SAP BASIS pattern: platform IP is the moat, implementation services are a market.

**Granite Model Ecosystem** [DAPM: Retained]
Granite 4.1 (3B/8B/30B, Apache 2.0), Granite Guardian (safety), Granite Code, Granite Time Series. Available on watsonx.ai, Hugging Face, Dell Enterprise Hub, NVIDIA NIM, Docker Hub, Ollama, LM Studio. Multi-platform model family with strongest governance posture: ISO 42001, cryptographic signing, uncapped IP indemnity.

**Multi-Model / Multi-Framework Support** [DAPM: Delegated]
watsonx.ai supports Granite, Llama, Mistral, GPT-OSS, Nemotron. OpenShift AI serves any model via vLLM/KServe. watsonx Orchestrate manages agents from IBM native, LangFlow, LangGraph, A2A.
The platform is model-agnostic and framework-agnostic by design.

**Red Hat Partner Ecosystem** [DAPM: Delegated]
OpenShift ISV ecosystem across industries. Microsoft (Platform Modernization Partner of the Year 2026), AWS (ROSA + Kiro integration), Salesforce (Agentforce reference architecture). Multi-cloud deployment: build on OpenShift, deploy across AWS/Azure/GCP/on-prem.

**IBM Z/Power Transactional AI (Application-Layer Advantage)** [DAPM: Retained]
IBM z16 Telum on-chip AI accelerator for real-time transactional inference co-located with enterprise ledgers — banking fraud detection, insurance claims, government transactions. This is a Layer 3 application advantage (AI adjacent to business data on proprietary hardware), not a Layer 0 infrastructure capability. Analogous to SAP HANA on dedicated hardware: the value is application adjacency, not compute fabric. watsonx Code Assistant for Z (IBM Bob Premium for Z) provides AI-assisted COBOL-to-Java modernization — 10x productivity gains in early users.

### NVIDIA-Provided Components

**NVIDIA NIM + Agent Blueprints on OpenShift**
NVIDIA NIM microservices and Agent Blueprints deployable on Red Hat AI Factory with NVIDIA. Same blueprints available across NVIDIA partners.

### Gap Analysis

IBM's Layer 3 requires the same defensible-IP-vs-substitutable-services analysis applied throughout this assessment.

IBM Consulting (~160,000 consultants) is a competitive advantage in a substitutable services market, not a structural platform dependency. Enterprises switch AI platform implementation partners the same way they switch SAP BASIS support: IBM to Accenture, Accenture to Deloitte, Deloitte to Wipro. The platform doesn't notice. The open-source components (OpenShift, vLLM, KServe, Kubeflow) run identically regardless of which SI assembled them. IBM Consulting's advantage is scale and experience, not lock-in.
This is structurally different from every other vendor's services model:
• Dell's Accelerator Services are additive — the NVIDIA runtime works without Dell's humans.
• HPE's Kamiwaza partnership is structural — remove Kamiwaza and HPE loses Layer 2C orchestration.
• VAST requires minimal consulting because the platform makes architectural decisions for you.
• IBM's consulting is the primary go-to-market motion for a platform built from open-source components. The components are free. The assembly expertise is what IBM sells. But any competent SI can provide the assembly expertise. IBM Consulting competes for the engagement; it doesn't own the engagement by virtue of platform architecture.

The Granite model family occupies a unique position: open-source (Apache 2.0), enterprise-grade, with the strongest governance posture of any model family (ISO 42001, cryptographic signing, uncapped IP indemnity). Granite is not competing with Claude or GPT-4 on raw capability — it's competing on trustworthiness, efficiency, and enterprise deployability. Critically, Granite runs on any platform — Dell Enterprise Hub, NVIDIA NIM, Hugging Face, Ollama. The model is portable; the IP indemnity is IBM-specific.

IBM Z/Power transactional AI (Telum on-chip inference for banking, insurance, government) is a genuine Layer 3 application advantage. AI inference co-located with enterprise ledgers is not replicable on Dell or HPE hardware because the value is in the application adjacency to mainframe data, not in the compute architecture. watsonx Code Assistant for Z (Bob Premium for Z) with 10x productivity for COBOL modernization reinforces this — it's an AI application that only makes sense on IBM Z hardware.

The ISV ecosystem comparison:
• Dell's ecosystem is broad and horizontal: 5,000+ customers, OpenAI, Palantir, Google, ServiceNow, SpaceXAI.
• HPE's ecosystem is curated and vertical: 26+ ISVs through Unleash AI.
• IBM's ecosystem is consulting-driven and multi-cloud: IBM Consulting partnerships with SAP, Salesforce, Adobe, ServiceNow, plus the OpenShift ISV ecosystem deployable across all major clouds.
• The multi-cloud deployment model (build on OpenShift, deploy across AWS/Azure/GCP/on-prem) is genuine differentiation at Layer 3 — ISVs building on OpenShift get portability that Dell and HPE can't match.

### Borrowed Judgment

Distributed across ISV partners and model providers, which is architecturally correct at Layer 3.

The consulting question is critical for DAPM: IBM Consulting is NOT a borrowed-judgment dependency. The enterprise retains full authority to select any SI for platform implementation, customization, and ongoing support. Switching SIs does not change the platform architecture, does not require re-engineering, and does not break running workloads. This is the SAP BASIS pattern: the platform IP (watsonx.governance, Orchestrate) is the structural dependency; the implementation services are a competitive market.

The structural comparison:
• Dell's ecosystem is load-bearing: ISV partners provide infrastructure-level functions. Remove Cohere North and Dell loses agent orchestration.
• HPE's ecosystem is curated: partners provide domain applications. Remove Deloitte Zora AI and HPE loses a finance use case, not a platform function.
• IBM's consulting is competitive: remove IBM Consulting and the enterprise hires Accenture. The platform remains intact. The adoption motion may slow but the architecture doesn't change.
• VAST's ecosystem is additive: platform is self-sufficient. Partners add vertical use cases.

### Working Notes

IBM's Layer 3 strategy is tightly focused on high-value, unglamorous enterprise utility — code modernization, automated compliance, legacy IT orchestration, mainframe fraud detection — rather than broad consumer-facing generative applications. Dell's Layer 3 partners are flashy (OpenAI, Palantir, SpaceXAI).
HPE's Unleash AI partners target emerging AI use cases (video AI, geospatial, vision). IBM's Layer 3 targets the work enterprises actually need done: converting COBOL to Java, generating Ansible playbooks, detecting fraud in real-time transaction streams, modernizing mainframe applications. Nobody puts COBOL modernization on a keynote slide, but it's where regulated enterprise budgets concentrate.

The watsonx application surfaces reinforce this enterprise utility focus: watsonx Assistant (conversational AI for customer service, HR, operations), watsonx Code Assistant / IBM Bob (AI-assisted development across the software lifecycle), watsonx Orchestrate applications (workflow automation binding agents to enterprise processes). These are not general-purpose AI platforms — they are purpose-built for specific enterprise operational domains.

The 80,000 internal IBM Bob users represent the largest internal AI deployment from any vendor in this assessment. IBM is eating its own cooking at scale — and the 45% productivity gain claim, if sustained across production workloads, validates the agentic AI thesis more concretely than any vendor keynote. The EY tax technology partnership (Bob in private beta, described as 'closer to a collaborative agent than a simple coding tool') signals enterprise validation from a major professional services firm.

Granite's Apache 2.0 licensing + uncapped IP indemnity is a distinctive governance posture. No other model family provides both open-source freedom AND vendor-backed legal protection. This addresses a specific enterprise concern: 'I want to run this model anywhere, and I don't want to worry about IP claims.' Neither OpenAI (closed-source, no indemnity) nor Meta (open-source, no indemnity) nor Google (Gemma open-weight but limited indemnity) matches this combination.

The Kubernetes-baseline test at Layer 3: applications are inherently above the platform baseline, but IBM's distribution model matters.
Granite models are Apache 2.0 — run on any platform. IBM Consulting is substitutable. The OpenShift ISV ecosystem targets any Kubernetes. IBM's defensible Layer 3 assets are narrow: Z/Power transactional AI (hardware adjacency), Granite IP indemnity (legal wrapper), watsonx Code Assistant for Z (mainframe-specific), and the watsonx application surfaces that integrate with Layer 2C governance (watsonx.governance integration creates application-level governance that doesn't port to Tanzu). The governance integration is the subtle lock-in: applications built to leverage watsonx.governance's Governance Graph inherit a dependency on IBM's governance architecture.

════════════════════════════════════════════════════════════════════════════════

# NVIDIA AI Platform

A Components Company Becoming a Platform Vendor — Mapped to the 4+1 Layer AI Infrastructure Model

**Version:** v1.0 — Draft, Editorial Review Pending
**Date:** May 22, 2026
**Source:** GTC 2025, GTC 2026, Dynamo 1.0 GA, NemoClaw/OpenShell announcement, Run:ai acquisition, DGX Cloud, NVIDIA AI Enterprise, NIM GA, analyst coverage, SEC FY2026 annual report

## Summary Finding

NVIDIA is the only vendor in this assessment series that appears inside every other vendor's assessment. Dell's Layer 2A GPU orchestration is NVIDIA Run:ai. HPE's Layer 2B runtime is NVIDIA AI Enterprise. VMware's GPU integration is NVIDIA vGPU Manager. AWS, Google Cloud, and Azure all run NVIDIA GPUs alongside their own silicon. VAST embeds NVIDIA GPUs, NICs, DPUs, and switches into its data platform. Mapping NVIDIA as a standalone vendor inverts the perspective: instead of asking 'where does Dell cede authority to NVIDIA,' the question becomes 'where does NVIDIA claim authority, and from whom?'

The 4+1 mapping reveals NVIDIA as a vendor with deep authority at Layers 0, 2A, and 2B — and emerging ambitions at Layer 2C.
NVIDIA designs the accelerator silicon that every other vendor depends on (Layer 0), provides the GPU orchestration platform that on-prem vendors brand as their own (Layer 2A via Run:ai), and controls the inference runtime and model lifecycle stack that sits between the enterprise's infrastructure and its AI applications (Layer 2B via NIM, NeMo, Dynamo, TensorRT-LLM). At Layers 1A through 1C, NVIDIA is an accelerator — it makes other vendors' storage and data pipelines faster without providing those capabilities directly.

The structural tension is between NVIDIA as a silicon supplier and NVIDIA as a platform vendor. When NVIDIA was only selling GPUs, its interests aligned with every OEM and hyperscaler: more GPU adoption meant more revenue for everyone. As NVIDIA extends into GPU orchestration (Run:ai), inference runtime (NIM/Dynamo), agent governance (OpenShell/NemoClaw), and cloud infrastructure (DGX Cloud), it competes with the same customers who buy its silicon. Dell's AI Factory runs NVIDIA software. But DGX Cloud is NVIDIA competing with Dell for the same enterprise workload.

The DAPM classification for NVIDIA is inverted from every other assessment: the enterprise consuming NVIDIA through an OEM (Dell, HPE, VMware) has already Ceded authority to the OEM, which has Ceded authority to NVIDIA. The enterprise consuming NVIDIA directly (DGX Cloud, NIM API) Cedes authority to NVIDIA without an intermediary. The enterprise self-hosting NVIDIA open-source software (Dynamo, OpenShell, Nemotron) Retains authority — but still runs on NVIDIA silicon. NVIDIA is the only vendor where every deployment path, at every layer, eventually depends on NVIDIA hardware.

More than half of NVIDIA's engineers work on software. That statistic from NVIDIA's FY2026 annual report is the key to understanding the 4+1 mapping: NVIDIA is a software company that happens to sell the hardware its software requires.
The assessment series has been documenting where NVIDIA's software authority appears inside other vendors' stacks. This assessment makes that authority explicit.

## ● Layer 0: Compute & Network Fabric

*Raw compute, networking, and acceleration fabric*

**Status:** NVIDIA Strength — Silicon Authority

### Vendor-Provided Components

**GPU Accelerator Silicon (Blackwell, Vera Rubin)** [DAPM: Ceded]
Blackwell B200/B300/GB200 (current generation). Vera Rubin NVL72 (next generation, deploying at hyperscalers). The accelerator silicon that every other vendor in this assessment depends on. Dell builds PowerEdge around it. HPE builds ProLiant and Cray around it. AWS offers it as P5/P6 instances. Azure offers it as ND-series. Google offers it alongside TPUs. VAST embeds it in CNode-X. No enterprise AI infrastructure exists without NVIDIA GPU silicon — or a deliberate decision to use an alternative (AWS Trainium, Google TPU, AMD Instinct).

**Networking Silicon + Interconnect** [DAPM: Ceded]
NVLink/NVSwitch (intra-node GPU interconnect). Spectrum-X Ethernet switches. ConnectX-7/8 SmartNICs. BlueField-3 DPUs. InfiniBand for GPU cluster fabric. NIXL for disaggregated inference data movement. Dell brands Spectrum switches as PowerSwitch. HPE integrates ConnectX into ProLiant. VAST uses ConnectX/BlueField for NVMe-over-Fabrics. The networking silicon is as structurally embedded as the GPU silicon.

**DGX Platform (On-Prem Systems)** [DAPM: Ceded]
DGX SuperPOD: leadership-class AI infrastructure for on-prem and hybrid. DGX Station: workgroup-scale AI compute. DGX Spark: desktop AI workstation. Pre-configured systems with NVIDIA software stack pre-installed. Competes directly with Dell PowerEdge, HPE ProLiant, and OEM AI server configurations — NVIDIA sells the assembled system, not just the components.

**DGX Cloud (Hosted Infrastructure)** [DAPM: Ceded]
GPU supercomputing as a service, hosted on AWS, Azure, GCP, and OCI.
Includes NVIDIA AI Enterprise software and Base Command Platform. The enterprise accesses NVIDIA infrastructure through a hyperscaler substrate — Ceding to both NVIDIA (software/GPU) and the hyperscaler (facility/network). DGX Cloud competes with the hyperscalers' own GPU instance offerings while running on their infrastructure.

### Gap Analysis

Layer 0 is NVIDIA's foundational authority. Every other vendor in this assessment depends on NVIDIA silicon at this layer — the only exceptions are AWS (Trainium/Inferentia), Google (TPU), Azure (Maia), and AMD Instinct instances on hyperscalers.

The DGX Platform creates a structural tension with OEM partners. When NVIDIA sells DGX SuperPOD directly to an enterprise, that enterprise is NOT buying Dell PowerEdge or HPE ProLiant. NVIDIA is simultaneously its OEM partners' most critical supplier and their direct competitor. Dell's 'AI Factory with NVIDIA' branding and HPE's 'NVIDIA AI Computing by HPE' branding are attempts to keep the enterprise buying through the OEM rather than going to NVIDIA directly.

DGX Cloud adds a second tension: NVIDIA competes with hyperscalers while running on their infrastructure. AWS, Azure, and GCP host DGX Cloud while simultaneously offering their own GPU instances. The enterprise choosing between Azure ND-series VMs and DGX Cloud on Azure is choosing between Microsoft-managed and NVIDIA-managed access to the same GPU hardware.

The NVIDIA dependency at Layer 0 is the one dependency shared by every on-prem vendor assessed. Dell, HPE, VAST, and VMware all depend on NVIDIA GPU silicon. The difference is the scope of that dependency: at Layer 0, NVIDIA provides silicon. At Layer 2A, NVIDIA provides orchestration. At Layer 2B, NVIDIA provides runtime. The silicon dependency is structural and shared. The software dependency is where NVIDIA's authority claims create tension.
### Borrowed Judgment

The enterprise consuming NVIDIA silicon inherits NVIDIA's GPU architecture decisions (memory bandwidth, interconnect topology, power/thermal profile), NVIDIA's driver and CUDA runtime decisions, and NVIDIA's product lifecycle and pricing decisions. This borrowed judgment is structural — it exists for every vendor in the assessment and for every enterprise running AI workloads on NVIDIA hardware.

The DGX Platform adds system-level borrowed judgment: NVIDIA's hardware integration, thermal design, and rack architecture decisions. DGX Cloud adds hyperscaler-layered borrowed judgment: NVIDIA software decisions on top of hyperscaler infrastructure decisions.

### Working Notes

NVIDIA's FY2026 annual report segments its business into Compute & Networking (data center accelerated computing, networking, AI solutions, software, automotive) and Graphics (GeForce, Quadro/RTX). The Data Center platform — GPUs, DPUs, networking, DGX, software — is the revenue engine. The company's strategic direction is to expand from silicon supplier to platform vendor without alienating the OEM and hyperscaler customers who drive GPU volume.

The NVIDIA dependency column in every other vendor's assessment can now be read as 'authority NVIDIA claims at this layer.' The standalone assessment makes that authority explicit and measurable.

## ○ Layer 1A: Data Storage & Governance

*Durable, governed data foundation — the Governance Catalog that Layer 2C queries*

**Status:** Accelerator Only

### Gap Analysis

NVIDIA provides no storage, no data governance, and no data platform. Zero components at this layer. NVIDIA accelerates other vendors' storage with GPU libraries (cuVS for vector search, RAPIDS for data processing) and networking hardware (BlueField DPUs, ConnectX NICs) — but those are acceleration functions assessed at their functional layers (1B for retrieval, 1C for data processing, Layer 0 for networking silicon).
The storage platforms, governance catalogs, and data architectures are entirely owned by other vendors: Dell (PowerScale, ObjectScale, MetadataIQ), HPE (Alletra, Data Fabric), VAST (DataStore, DataBase, Catalog), AWS (S3, Glue, Lake Formation), Google (BigQuery, Knowledge Catalog), Azure (Blob, Fabric, Purview).

The absence of NVIDIA-owned storage or governance is structurally significant for Layer 2C: a Reasoning Plane needs governance metadata — which data is sensitive, which models are approved, which compliance requirements apply. NVIDIA has no Layer 1A metadata to feed into a Layer 2C reasoning plane. Every other vendor's 2C ambition is anchored in governance metadata from 1A. NVIDIA's emerging Layer 2C (OpenShell/NemoClaw) operates without governance context because NVIDIA doesn't own the data layer.

### Borrowed Judgment

None. NVIDIA has no data layer authority to lend or borrow. The enterprise's storage and governance judgment comes entirely from the storage vendor.

### Working Notes

The STX Architecture observation from the Dell assessment applies here: STX is available to every storage vendor. It does not differentiate any OEM's storage offering — it raises the floor for all of them. NVIDIA's Layer 1A role is to make the data layer faster, not to provide or govern it.

## ◑ Layer 1B: Context Management & Retrieval

*Low-latency retrieval for RAG — vector/hybrid search, context windows*

**Status:** Acceleration + Model Enablement

### Vendor-Provided Components

**cuVS (GPU-Accelerated Vector Search)** [DAPM: Delegated]
GPU-accelerated vector similarity search library. 12x faster vector indexing. Used by Dell (MetadataIQ integration), VAST (CNode-X vector search), and storage vendors for retrieval acceleration. NVIDIA provides the search acceleration; the platform vendor provides the retrieval infrastructure and index.

**NeMo Retriever** [DAPM: Delegated]
GPU-accelerated retrieval pipeline for RAG. Embedding models, reranking, and retrieval optimization.
Integrated into Dell's Data Search Engine (PowerScale connector), HPE's retrieval stack, and VMware's AI Enterprise RAG Stack. Provides the retrieval intelligence that OEMs brand as part of their platforms.

**NIM Embedding Models** [DAPM: Delegated]
Pre-built inference microservices for text and multimodal embedding. Used by VAST's InsightEngine, Dell's retrieval pipeline, and hyperscaler RAG services. NVIDIA provides the embedding models; the platform vendor provides the retrieval infrastructure.

### Gap Analysis

NVIDIA provides retrieval acceleration and embedding models but not retrieval infrastructure. The retrieval engines — Azure AI Search, OpenSearch, Elasticsearch, VAST InsightEngine, Google Vertex AI Search — are owned by other vendors. NVIDIA makes retrieval faster and provides the embedding models that make vector search work, but the enterprise's retrieval architecture is determined by the platform vendor.

NeMo Retriever is a meaningful capability: it provides the GPU-accelerated RAG pipeline that multiple OEMs brand as part of their offerings. When Dell advertises 'GPU-accelerated hybrid search,' the GPU acceleration is NVIDIA's. The enterprise's retrieval quality depends on NVIDIA's embedding model quality — a borrowed judgment that is rarely made explicit.

### Borrowed Judgment

Moderate. NeMo Retriever embedding models determine retrieval quality. The enterprise inherits NVIDIA's embedding model training decisions, architecture choices, and optimization priorities. This borrowed judgment is invisible — the enterprise interacts with Dell's search engine or VAST's InsightEngine, not with NVIDIA's embeddings directly.

### Working Notes

The embedding model dependency is worth tracking across the assessment series. Multiple vendors (Dell, HPE, VAST, VMware) use NVIDIA NIM embedding models for their RAG pipelines.
If NVIDIA changes embedding model architecture, quality, or licensing, it affects every vendor's Layer 1B simultaneously — a shared dependency that no single vendor controls.

## ◑ Layer 1C: Data Movement & Pipelines

*Move/transform data — ETL/ELT, lineage, cost-aware movement, KV cache tiering*

**Status:** Inference Data Movement

### Vendor-Provided Components

**NVIDIA CMX (KV Cache Management)** [DAPM: Delegated]
Context Memory Extension for KV cache offload from GPU to CPU/SSD. Validated with Dell PowerScale (19x TTFT improvement). Enables inference scaling by treating KV cache as a data movement problem. Future integration expected with HPE Alletra and VAST CNode-X. The most architecturally significant Layer 1C capability NVIDIA provides — it solves a data movement problem that emerges specifically from inference workloads.

### Gap Analysis

NVIDIA does not provide data pipeline orchestration (Data Factory, Dataloop, DataEngine, Airflow). Its Layer 1C presence is a single capability: KV cache management via CMX.

CMX is architecturally significant because it addresses a data movement problem unique to AI inference — KV cache growing beyond GPU memory. This is a Layer 1C function (data movement) that directly affects Layer 2B performance (inference latency). Dell has validated it (19x TTFT improvement on PowerScale); HPE and VAST are expected integrations. The enterprise's KV cache strategy becomes a borrowed judgment from NVIDIA's CMX design decisions once adopted.

NVIDIA also provides GPU-accelerated compute libraries (RAPIDS, cuDF) used within other vendors' data pipelines, but these are computation acceleration, not data movement — they make processing faster without providing pipeline orchestration, data lineage, or movement logic. They are assessed at the layers where they functionally operate (compute acceleration at Layer 0, retrieval acceleration at Layer 1B) rather than at Layer 1C.
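
The KV cache offload idea discussed above can be pictured as a two-tier cache: blocks that no longer fit in GPU memory are demoted to a slower tier instead of being discarded, so a later request can reuse them rather than recompute the prefill. The sketch below is only an illustration under stated assumptions. The class and method names are hypothetical, the least-recently-used eviction policy is an assumption, and nothing here reflects CMX's actual design, which the sources for this assessment do not describe at this level of detail.

```python
from __future__ import annotations

from collections import OrderedDict


class TieredKVCache:
    """Toy two-tier KV cache: a small 'GPU' tier backed by an 'offload' tier.

    Hypothetical illustration; the names and the LRU policy are assumptions.
    """

    def __init__(self, gpu_capacity: int) -> None:
        if gpu_capacity < 1:
            raise ValueError("gpu_capacity must be at least 1")
        self.gpu_capacity = gpu_capacity
        self.gpu: OrderedDict[str, bytes] = OrderedDict()  # hot tier, LRU order
        self.offload: dict[str, bytes] = {}  # cold tier (CPU/SSD stand-in)

    def put(self, session_id: str, kv_block: bytes) -> None:
        """Store a block in the hot tier, demoting LRU blocks when over capacity."""
        self.offload.pop(session_id, None)
        self.gpu[session_id] = kv_block
        self.gpu.move_to_end(session_id)
        while len(self.gpu) > self.gpu_capacity:
            victim, block = self.gpu.popitem(last=False)
            self.offload[victim] = block  # demote instead of discarding

    def get(self, session_id: str) -> tuple[bytes | None, str]:
        """Return (block, tier). A cold hit is promoted back to the hot tier."""
        if session_id in self.gpu:
            self.gpu.move_to_end(session_id)
            return self.gpu[session_id], "gpu"
        if session_id in self.offload:
            block = self.offload[session_id]
            self.put(session_id, block)
            return block, "offload"
        return None, "miss"  # caller would have to recompute the prefill
```

The decisions the enterprise inherits are visible even in a toy like this: which block is evicted, when it is demoted, and which tier it lands on are all fixed by whoever wrote the cache, not by whoever runs the workload.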
### Borrowed Judgment

Moderate for CMX — an architectural decision about KV cache management that affects inference performance and is harder to substitute once adopted. The enterprise inherits NVIDIA's decisions about cache eviction policy, offload thresholds, and storage tier targeting.

### Working Notes

The KV cache tiering gap identified in the Azure assessment is relevant here: Azure has no CMX integration. Dell has validated it. HPE is expected. VAST's CNode-X architecture co-locates cache and compute, potentially eliminating the need for CMX-style offload. The KV cache management approach varies by vendor — NVIDIA's CMX is one solution, not the only one.

## ● Layer 2A: Infrastructure Orchestration

*GPU scheduling, quotas, RBAC, fair-share scheduling, utilization optimization*

**Status:** NVIDIA Authority via Run:ai

### Vendor-Provided Components

**NVIDIA Run:ai (Acquired 2024)** [DAPM: Ceded]
GPU orchestration and workload management platform. Kubernetes-native. Dynamic GPU pooling across hybrid environments. Fractional GPU sharing (no open-source equivalent). Fair-share scheduling with team-level quotas. Multi-cluster management from a unified control plane. Now part of NVIDIA AI Enterprise ($4,500/GPU/year standalone, included with DGX). Run:ai is the Layer 2A authority that Dell brands as part of AI Factory, HPE brands as part of Private Cloud AI, and VMware integrates through NVIDIA AI Enterprise. The OEM sells the relationship; NVIDIA controls the scheduling intelligence. GPU-level infrastructure (GPU Operator for driver/plugin lifecycle, MIG for hardware partitioning) provides the substrate Run:ai orchestrates — invisible plumbing, not standalone orchestration tools.

### Gap Analysis

Layer 2A is where NVIDIA's platform ambition creates the most direct tension with its OEM partners.
Run:ai is the most capable GPU-specific orchestration platform available — fractional GPU sharing, multi-cluster management, and fair-share scheduling are capabilities that no open-source alternative matches. But Run:ai is NVIDIA's product, not the OEM's.

When Dell markets 'AI Factory with NVIDIA,' the GPU scheduling intelligence is Run:ai — NVIDIA's IP, NVIDIA's roadmap, NVIDIA's pricing. Dell provides the hardware, the rack integration, and the customer relationship. NVIDIA provides the scheduling brain. If NVIDIA changes Run:ai's architecture, licensing, or feature set, Dell's AI Factory Layer 2A changes with it — without Dell's input.

The same dynamic applies to HPE (Private Cloud AI includes NVIDIA AI Enterprise with Run:ai) and VMware (VCF integrates NVIDIA AI Enterprise). Three OEMs, one scheduling authority.

The hyperscalers avoid this dependency: AWS built Karpenter, Google built GKE Autopilot + Fluid Compute, Azure contributed DRA to upstream Kubernetes. Each hyperscaler owns its GPU scheduling intelligence. On-prem vendors do not — they consume NVIDIA's.

The open-source alternatives (Kueue, KAI Scheduler, DRA) are catching up but lack Run:ai's fractional GPU sharing. The enterprise evaluating GPU orchestration choices is evaluating an NVIDIA-proprietary vs. open-source trade-off — better capability (Run:ai) vs. more authority (open-source).

### Borrowed Judgment

High. Run:ai's scheduling decisions — which team gets which GPU, how fractional sharing is allocated, when over-quota borrowing is permitted — are NVIDIA's judgment. The enterprise configures policies; NVIDIA's scheduler executes them. If Run:ai makes a scheduling decision that impacts training job completion time or inference latency, that's NVIDIA's borrowed judgment affecting business outcomes.

NVIDIA AI Enterprise licensing adds commercial borrowed judgment: the enterprise's production deployment timeline depends on NVIDIA's licensing terms, pricing changes, and certification cycles.
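
To make the "enterprise configures policies, scheduler executes them" split concrete, here is a minimal sketch of quota-based fair-share allocation with over-quota borrowing. It is a generic illustration, not Run:ai's algorithm: the function name, the whole-GPU granularity (no fractional sharing), and the tie-breaking rule are all assumptions made for this example.

```python
def fair_share(total_gpus: int, quotas: dict[str, int], demand: dict[str, int]) -> dict[str, int]:
    """Allocate whole GPUs to teams (hypothetical policy, for illustration only).

    Phase 1 gives each team min(demand, quota), its guaranteed share.
    Phase 2 lends idle GPUs, one at a time, to the waiting team that is
    currently least over its quota (ties broken by team name).
    """
    if any(q <= 0 for q in quotas.values()):
        raise ValueError("every quota must be a positive number of GPUs")
    if sum(quotas.values()) > total_gpus:
        raise ValueError("quotas exceed cluster capacity")

    # Phase 1: guaranteed share, never more than the team asked for.
    alloc = {team: min(demand.get(team, 0), quota) for team, quota in quotas.items()}
    spare = total_gpus - sum(alloc.values())

    # Phase 2: over-quota borrowing from idle capacity.
    while spare > 0:
        waiting = [t for t in quotas if demand.get(t, 0) > alloc[t]]
        if not waiting:
            break  # all demand satisfied; remaining GPUs stay idle
        team = min(waiting, key=lambda t: (alloc[t] / quotas[t], t))
        alloc[team] += 1
        spare -= 1
    return alloc


if __name__ == "__main__":
    print(fair_share(8, {"research": 4, "prod": 4}, {"research": 7, "prod": 2}))
```

The quotas are the part the enterprise sets. The borrowing order in phase 2 is the part the scheduler vendor decides, and that is where the borrowed judgment described above sits.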
### Working Notes

The Run:ai acquisition (2024) is the most significant NVIDIA software acquisition for the 4+1 model. Before Run:ai, NVIDIA provided silicon and libraries. After Run:ai, NVIDIA provides the orchestration plane that sits between the enterprise and its own GPUs. The enterprise doesn't interact with GPUs directly — it interacts through Run:ai's scheduling layer.

The open-source Kubernetes GPU scheduling landscape (DRA, Kueue, KAI Scheduler) is evolving rapidly. Microsoft contributed DRA to upstream Kubernetes at KubeCon 2026. If open-source GPU scheduling reaches feature parity with Run:ai's fractional GPU sharing, the enterprise case for Run:ai's licensing cost weakens. NVIDIA's response: integrate Run:ai deeper into AI Enterprise, making it harder to substitute.

## ● Layer 2B: Application Runtime & Execution

*Model serving, inference optimization, agent runtime — the Execution Plane*

**Status:** NVIDIA Authority — Inference + Agent Runtime

### Vendor-Provided Components

**NVIDIA NIM (Inference Microservices)** [DAPM: Ceded]
Pre-built, optimized inference containers for 100+ models. OpenAI-compatible API. Free for prototyping on DGX Cloud (build.nvidia.com). Production requires AI Enterprise license. Includes Nemotron, Llama, Mistral, and partner models. NIM is the inference runtime that multiple OEMs and hyperscalers brand as part of their platforms — AWS Bedrock offers NIM, Azure Foundry offers NIM, Dell deploys NIM on PowerEdge.

**Dynamo 1.0 (Inference Operating System)** [DAPM: Retained]
Open-source (Apache 2.0) distributed inference serving framework. GA March 2026. Disaggregated prefill and decode. KV-aware routing to GPUs with best cache match. KVBM for memory management. NIXL for GPU-to-GPU data movement. Grove for scaling. 7x performance boost on Blackwell. Adopted by AWS, Azure, GCP, OCI, CoreWeave, and dozens of inference providers. NVIDIA positions Dynamo as 'the operating system of AI factories.'
Open-source but NVIDIA-optimized — runs best on NVIDIA hardware.

**NeMo (Model Lifecycle)** [DAPM: Ceded]
End-to-end model lifecycle management: data curation, model customization and evaluation, guardrailing and observability. NeMo Guardrails for content safety. NeMo Evaluator for model assessment. NeMo Data Designer for training data preparation (integrated into VAST's TuningEngine). The model lifecycle stack that operates above inference and below applications.

**NeMo Guardrails (Runtime Content Safety)** [DAPM: Ceded]
Programmable content safety framework inline with inference. Controls model output, topic boundaries, and factual grounding during model serving. Deployed as part of NIM containers or standalone. At Layer 2B, Guardrails functions as runtime content filtering — it controls what the model says during inference. The same capability serves a Layer 2C governance function when applied as policy enforcement for agent behavior.

**NemoClaw + OpenShell (Agent Execution Runtime)** [DAPM: Retained]
NemoClaw: open-source stack (Apache 2.0) bundling OpenShell runtime with Nemotron models. At Layer 2B, NemoClaw is the agent execution environment — the runtime that agents run inside. OpenShell provides kernel-level sandboxing (deny-by-default) and privacy router for on-device vs. cloud inference routing. Early alpha. The same capability serves a Layer 2C governance function as the policy enforcement sandbox that constrains agent behavior.

### Gap Analysis

Layer 2B is NVIDIA's deepest software authority and the layer where the platform ambition is most visible. NIM, Dynamo, NeMo, NeMo Guardrails, and NemoClaw/OpenShell constitute the complete inference, model lifecycle, and agent runtime stack.

NeMo Guardrails and NemoClaw/OpenShell appear at both Layer 2B and Layer 2C because they serve dual architectural functions. At 2B they are runtime capabilities — content filtering inline with inference, agent execution environment.
At 2C they are governance capabilities — policy enforcement for agent behavior, sandbox constraints on agent access. The same code, two architectural purposes. This dual-layer presence is itself evidence of NVIDIA's platform transition: a components company's software stays within one layer; a platform company's software spans layers. The open-source strategy is deliberate: Dynamo (Apache 2.0) and NemoClaw/OpenShell (Apache 2.0) are open-source, meaning the enterprise Retains the code. But both are optimized for NVIDIA hardware and NVIDIA's CUDA ecosystem. Running Dynamo on AMD or Intel GPUs is theoretically possible but practically disadvantaged. The open-source license provides code portability; the hardware optimization provides silicon lock-in. NIM is the more significant authority claim: it's closed-source, NVIDIA-only, and requires an AI Enterprise license for production. The enterprise using NIM to serve models has Ceded inference runtime authority to NVIDIA. The alternative — vLLM, SGLang, or other open-source serving frameworks — is slower but Retained. The Dell assessment's Layer 2B finding applies directly: Dell does not appear to own the core agent runtime, model-serving runtime, guardrail framework, or distributed inference framework. Those are NVIDIA's. This assessment confirms that observation from NVIDIA's perspective. ### Borrowed Judgment High for NIM (closed-source, NVIDIA-controlled inference optimization decisions). Low for Dynamo (open-source, enterprise can fork and modify). Moderate for NeMo (model lifecycle decisions — training data curation, evaluation metrics, guardrail policies — are NVIDIA's defaults that the enterprise inherits unless explicitly overridden). The inference optimization decisions in NIM and TensorRT-LLM directly affect model output quality, latency, and cost. Quantization choices, batching strategies, and KV cache management are NVIDIA's engineering decisions that the enterprise consumes without visibility. 
If an NIM container produces different outputs than a vLLM deployment of the same model, the enterprise may not know which is 'correct.' ### Working Notes Dynamo 1.0 GA (March 2026) is NVIDIA's strongest Layer 2B play. Positioning it as 'the operating system of AI factories' is explicitly a platform claim. Combined with Run:ai at Layer 2A and NIM at Layer 2B, NVIDIA controls the infrastructure orchestration, the inference optimization, and the model serving runtime — three layers of the enterprise's AI stack that sit between the hardware (which NVIDIA also provides) and the application (which the enterprise builds). The NemoClaw/OpenShell alpha status is important context: Futurum Research noted that NemoClaw addresses 'the deployment end of the agent trust chain well' but urged enterprises 'not to treat it as a complete governance solution.' Security and accountability need to be embedded throughout the development lifecycle, not just at runtime. This is the gap between NVIDIA's runtime governance (OpenShell) and Microsoft's lifecycle governance (Entra Agent ID + Agent Governance Toolkit). ## ○ Layer 2C: Agentic Infrastructure — The Reasoning Plane *Policy-driven placement and resource coordination — the Autonomy Layer* **Status:** Runtime Governance Only — Not a Reasoning Plane ### Vendor-Provided Components **NemoClaw + OpenShell Agent Governance (Alpha)** [DAPM: Retained] NemoClaw: open-source stack (Apache 2.0) bundling OpenShell governance runtime with Nemotron models and NVIDIA Agent Toolkit. At Layer 2C, OpenShell is the policy enforcement sandbox — kernel-level deny-by-default constraints on filesystem, network, and process access via declarative YAML. Privacy router governs on-device vs. cloud inference routing as a policy decision. The same capability serves a Layer 2B function as the agent execution environment. Alpha-stage — NVIDIA is explicit about rough edges. 
**NeMo Guardrails (Agent Policy Enforcement)** [DAPM: Delegated] At Layer 2C, Guardrails functions as policy enforcement for agent behavior: it controls what agents are permitted to do, say, and access as a governance decision. Topic boundaries, factual grounding requirements, and content policies are defined declaratively and enforced at runtime. The same capability serves a Layer 2B function as inline content safety during model inference.

### Gap Analysis

Applying the 'Routing Is Not Reasoning' test from the VMware assessment: OpenShell provides runtime sandbox governance, controlling WHAT agents can access (filesystem, network, processes). NeMo Guardrails controls WHAT models can say (content filtering, topic boundaries). Neither provides policy-driven decisions about WHERE compute runs relative to data, WHICH model serves WHICH request, or HOW cost/compliance/latency are arbitrated. OpenShell is agent runtime security. NeMo Guardrails is model output safety. Neither is a Reasoning Plane.

NVIDIA's Layer 2C gap is structural: NVIDIA does not own storage (Layer 1A), data governance (the role filled elsewhere by Purview, Lake Formation, or Knowledge Catalog), or enterprise identity (the role filled elsewhere by Entra or IAM). A Reasoning Plane needs governance metadata: which data is sensitive, which models are approved, which compliance requirements apply. NVIDIA has no data governance to query because it has no data layer.

The consequence: NVIDIA's Layer 2C will always depend on another vendor's governance metadata. OpenShell can enforce sandbox policies, but it cannot make placement decisions informed by data classification, compliance status, or cost targets, because that information lives in Purview, Lake Formation, PolicyEngine, or MetadataIQ, none of which NVIDIA owns.
This is the fundamental structural limitation of NVIDIA's platform ambition: NVIDIA can build runtime governance (2B/2C boundary) but cannot build a full Reasoning Plane (2C) because it lacks the data governance foundation (1A) that a Reasoning Plane queries.

### Borrowed Judgment

Low for OpenShell (open-source, enterprise controls the policies). Low for NeMo Guardrails (configurable by the enterprise). The governance logic is transparent: the enterprise defines what agents can and cannot do.

The missing borrowed judgment is the more significant finding: NVIDIA's Layer 2C cannot borrow data governance judgment from itself because it doesn't have a data governance layer. It must borrow from Dell (MetadataIQ), HPE (Data Fabric), VAST (Catalog), AWS (Lake Formation), Google (Knowledge Catalog), or Azure (Purview). NVIDIA's governance is runtime-only; other vendors' governance is data-informed.

### Working Notes

The Futurum Research observation is the right framing: OpenShell addresses the 'deployment end of the agent trust chain' but enterprises should not treat it as a complete governance solution. Security and accountability need to be embedded throughout the development lifecycle.

Compare to other vendors' Layer 2C:

- Microsoft: identity + governance lifecycle (Entra Agent ID + AGT), the broadest governance scope
- Google: model-integrated orchestration (Agent Platform), the deepest platform integration
- AWS: policy + evaluation + registry (AgentCore), the most modular approach
- VAST: data platform governance (PolicyEngine + Polaris), the most data-informed approach
- NVIDIA: runtime sandbox (OpenShell), the narrowest scope, addressing only execution-time security

NVIDIA's Layer 2C is necessary but not sufficient. It complements other vendors' governance; it doesn't replace it.
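The distinction drawn in this layer's Gap Analysis, between governing WHAT an agent may access and deciding WHERE a workload runs, can be illustrated with a small sketch. Everything here is hypothetical: `SandboxPolicy`, `Site`, `place`, and their fields are illustrative names, not OpenShell's policy schema or any vendor's API. The sandbox check needs only its own allowlist, while the placement function cannot run without a residency requirement supplied by an external governance catalog.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical deny-by-default policy: anything not listed is refused."""
    allowed_paths: frozenset = field(default_factory=frozenset)
    allowed_hosts: frozenset = field(default_factory=frozenset)

    def permits(self, kind: str, target: str) -> bool:
        # Simplified string matching for illustration; a real sandbox
        # enforces this in the kernel and normalizes paths first.
        if kind == "file":
            return any(
                target == p or target.startswith(p.rstrip("/") + "/")
                for p in self.allowed_paths
            )
        if kind == "network":
            return target in self.allowed_hosts
        return False  # unknown access kinds (e.g. "process") are denied


@dataclass(frozen=True)
class Site:
    """Hypothetical candidate location for running a workload."""
    name: str
    region: str
    cost_per_hour: float


def place(required_region: Optional[str], sites: list) -> Optional[Site]:
    """Placement decision: pick the cheapest site that satisfies residency.

    `required_region` stands in for governance metadata (data classification,
    residency rules) that would come from a data catalog the sandbox does not own.
    """
    eligible = [
        s for s in sites
        if required_region is None or s.region == required_region
    ]
    return min(eligible, key=lambda s: s.cost_per_hour, default=None)
```

The sandbox answers its question from local policy alone. The placement function returns nothing useful until another system tells it what the data requires, which is the dependency the Gap Analysis describes.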
## ◑ Layer 3 (+1): AI Application Layer — The Value Plane *AI-powered business capabilities — business logic, workflow automation* **Status:** Model + Blueprint Enablement ### Vendor-Provided Components **Nemotron Open Models** [DAPM: Retained] Post-trained on Llama, distilled from DeepSeek-R1. Deployment-ready for AI agents. Available through NIM API (build.nvidia.com) and as downloadable containers. Nemotron models are NVIDIA's answer to the model layer — open models optimized for NVIDIA hardware. Competes with OpenAI, Anthropic, Google, and Meta at the model layer while providing the hardware those competitors run on. ### Gap Analysis NVIDIA does not build enterprise AI applications. Its Layer 3 presence is a single component: Nemotron open models. NVIDIA also provides application enablement that falls below the Layer 3 threshold: Blueprints (pre-built reference patterns for PDF extraction, digital twins, RAG pipelines, AI-Q agent task decomposition — deployed through Dell, HPE, VMware, and hyperscaler marketplaces) and NIM API endpoints (build.nvidia.com — free API access to 100+ models, 1,000 free inference credits, GPU sandbox instances). Blueprints are reference architectures, not applications — the enterprise builds from them, not on them. NIM API is a developer on-ramp and go-to-market funnel, not an application platform. The Nemotron model strategy is the interesting Layer 3 finding: NVIDIA competes with the AI model providers (OpenAI, Anthropic, Google, Meta) whose models run on NVIDIA hardware. If Nemotron achieves quality parity with proprietary models, enterprises can run inference on NVIDIA hardware with NVIDIA models — a fully vertically integrated stack from silicon to model. No other silicon vendor has this: Intel doesn't have frontier models, AMD doesn't have frontier models, AWS Trainium serves other providers' models. 
The NIM API funnel is NVIDIA's developer moat: free prototyping creates adoption → adoption creates switching cost → production deployment requires AI Enterprise license on NVIDIA hardware. The funnel is silicon-to-model-to-lock-in. ### Borrowed Judgment Moderate. Nemotron model alignment, training data, and safety decisions are NVIDIA's. The model-to-silicon borrowed judgment is unique to NVIDIA: when the enterprise uses Nemotron on NVIDIA GPUs, both the model and the hardware are NVIDIA's. The enterprise borrows NVIDIA's judgment at every layer of the inference path. No other vendor has this — even Google (Gemini on TPU) separates the model team (DeepMind) from the silicon team. ### Working Notes The NVIDIA-as-model-provider dynamic creates an unusual competitive position: NVIDIA wants enterprises to adopt Nemotron (NVIDIA model revenue) AND wants enterprises to run OpenAI/Anthropic/Meta models on NVIDIA GPUs (NVIDIA hardware revenue). Both outcomes benefit NVIDIA, but they benefit NVIDIA in different ways. If Nemotron succeeds too well, it reduces the model diversity that drives GPU demand from multiple model providers. ════════════════════════════════════════════════════════════════════════════════ # Oracle Cloud Infrastructure (OCI) AI Infrastructure Mapped to the 4+1 Layer AI Infrastructure Model **Version:** v1.0 — Draft, Editorial Review Pending **Date:** May 23, 2026 **Source:** Oracle AI World 2025, GTC 2026, OCI Enterprise AI GA (Mar 2026), Fusion Agentic Applications (Mar 2026), Oracle AI Database 26ai, Stargate/OpenAI partnership, NVIDIA/AMD partnerships, analyst coverage, Oracle Q3 FY2026 earnings ## Summary Finding OCI occupies a structurally unique position in this assessment series: it is the only hyperscaler whose AI infrastructure strategy is anchored by a database franchise. AWS builds down from managed services. Google builds out from a frontier model. 
OCI builds up from the enterprise data layer: Oracle AI Database 26ai, Autonomous AI Database, and the Fusion Applications estate that runs 97% of the Fortune 100. Every other hyperscaler treats the database as one service among many. Oracle treats the database as the gravitational center around which AI infrastructure orbits.

The infrastructure story is more aggressive than the enterprise positioning suggests. OCI Zettascale10 connects up to 800,000 NVIDIA GPUs across multi-gigawatt clusters delivering 16 zettaFLOPS, and it is the fabric underpinning the Stargate supercluster built with OpenAI in Abilene, Texas. Oracle Acceleron, a custom RoCE networking architecture with 2.5–9.1 microsecond latency, is genuine Layer 0 IP that positions OCI alongside AWS (Nitro/EFA/SRD) and Google (Virgo) as hyperscalers with proprietary networking stacks. The AMD partnership (50,000 MI450 GPUs, Q3 2026) makes OCI one of two hyperscalers, alongside AWS, with a meaningful multi-vendor GPU strategy.

The DAPM profile is heavily Ceded, structurally identical to AWS and Google Cloud in that the enterprise consumes managed services without controlling the underlying architecture. But OCI adds a distinctive wrinkle: the database layer creates a gravitational pull that concentrates not just infrastructure authority but data authority. An enterprise running Fusion Applications on Autonomous AI Database on OCI Superclusters has Ceded compute, networking, database, application runtime, AND business logic to a single vendor. This is deeper vertical integration than AWS offers (AWS doesn't own the application layer) and comparable to Google's model-integrated stack, but it is achieved through the application and data layers rather than through a frontier model.

OCI Enterprise AI (GA March 2026) is a credible but late entry to the agentic platform space.
OpenAI Responses-compatible API, managed agent hosting, vector stores, MCP support, guardrails, and observability — the capabilities parallel AWS Bedrock AgentCore and Google's Gemini Enterprise Agent Platform. The differentiator is database-native AI: Select AI for natural language to SQL, AI Vector Search inside the database engine, and the Private Agent Factory pattern that keeps agent reasoning co-located with enterprise data. Whether 'AI at the database layer' is a structural advantage or an architectural constraint is the central DAPM question for OCI. The Stargate partnership and $553B RPO validate infrastructure demand. The Fusion Agentic Applications validate the application-layer strategy. The multicloud database deployments (Oracle Database@AWS, @Azure, @Google Cloud) validate the data-gravity thesis. But the 4+1 framework asks a different question: where does authority reside, and has the enterprise made that placement explicit? For OCI, the answer is that authority concentrates in the database — and the database concentrates in Oracle. ## ● Layer 0: Compute & Network Fabric *Raw compute, networking, and acceleration fabric* **Status:** Ceded to Oracle ### Vendor-Provided Components **OCI Superclusters + Zettascale10** [DAPM: Ceded] Up to 800,000 NVIDIA GPUs across multi-gigawatt clusters. 16 zettaFLOPS peak performance. Underpins Stargate (OpenAI). Scales from 8 GPUs to 131,072 B200 GPUs, 100,000+ GB200 Superchips per cluster. Bare metal GPU instances with RDMA cluster networking. **Oracle Acceleron Networking** [DAPM: Ceded] Custom-designed RDMA over Converged Ethernet (RoCE v2). 2.5–9.1 microsecond GPU-to-GPU latency. Multiplanar network architecture with dedicated RoCE fabrics. Congestion-control-first (not PFC-dependent). Zero-Trust Packet Routing (ZPR) at the physical layer. Up to 3,200 Gb/s cluster network bandwidth. Oracle-owned networking IP. **NVIDIA GPU Fleet** [DAPM: Ceded] GB200 NVL72, B200/B300, H200, H100, A100, L40S bare metal instances. 
1M+ GPUs. NIXL support for disaggregated inference. BlueField-4 integration for next-gen Superclusters (GTC 2026). DGX Cloud hosted on OCI.

**AMD GPU Fleet (Q3 2026)** [DAPM: Ceded] 50,000 AMD Instinct MI450 Series GPUs. Helios rack architecture: 72 liquid-cooled GPUs per rack, AMD EPYC Venice CPUs, Pensando Vulcano DPUs. UALink/UALoE fabric. ROCm software stack. Up to 432 GB HBM4, 20 TB/s memory bandwidth per GPU. First hyperscaler with a publicly available AMD AI supercluster at this scale.

**OCI Dedicated Region25 + Oracle Alloy** [DAPM: Ceded] Full OCI stack (200+ services including SaaS) in the customer's data center, starting at 3 racks. 60+ Dedicated Region/Alloy regions live. EU Sovereign Cloud (Frankfurt, Madrid). Isolated Cloud Regions for classified workloads. Oracle Alloy enables partner-operated OCI. Fujitsu, SoftBank, Vodafone as anchor customers.

### NVIDIA-Provided Components

**NVIDIA GPU Silicon + Rubin Roadmap** 1M+ NVIDIA GPUs deployed. Blackwell B200/B300, H200, H100, L40S, GB200 NVL72. Rubin roadmap committed. DGX Cloud hosted on OCI. NVIDIA BlueField-4 integration announced at GTC 2026 for OCI Superclusters.

### Gap Analysis

OCI's Layer 0 is the most surprising story in this assessment series. A vendor perceived as a database company has built one of the largest GPU cloud fabrics in the world: the Stargate supercluster alone targets 800,000 GPUs. Oracle Acceleron is genuine networking IP: custom RoCE with congestion-control-first design (not PFC-dependent), multiplanar architecture, and Zero-Trust Packet Routing at the physical layer. This is not leased NVIDIA networking; it's Oracle-designed fabric.

The multi-accelerator strategy is more advanced than that of any hyperscaler except AWS, spanning NVIDIA (Blackwell, Rubin), AMD (MI450 with Helios rack architecture, 50,000 GPUs Q3 2026), and Intel Xeon 6 processors.
The AMD Helios rack — 72 liquid-cooled GPUs with Venice CPUs and Pensando Vulcano DPUs — is a fully integrated system comparable to NVIDIA DGX but from AMD's ecosystem. No other hyperscaler has committed to AMD at this scale. The Dedicated Region25 (full OCI in 3 racks, customer data center) and Oracle Alloy (partner-operated OCI) create a distributed cloud model with 60+ dedicated/Alloy regions live. This parallels AWS AI Factories, Azure Local, and Google Distributed Cloud — but Oracle's model delivers the full 200+ service stack, including SaaS, which none of the other hyperscalers match in dedicated form. The structural tension: OCI's GPU customers include OpenAI, xAI, Meta, and other frontier model trainers. These are not traditional Oracle enterprise customers. OCI is simultaneously serving the world's largest AI training workloads AND the world's most conservative enterprise database customers — two audiences with fundamentally different risk profiles and authority expectations. ### Borrowed Judgment The enterprise Cedes Layer 0 entirely to Oracle — GPU selection, networking topology, cluster design, physical infrastructure. Oracle Acceleron is Oracle IP, reducing NVIDIA networking dependency compared to hyperscalers using NVIDIA Spectrum-X. But the GPU silicon dependency on NVIDIA (and increasingly AMD) is structural and shared with every vendor in this assessment. The Stargate partnership creates a unique borrowed judgment dynamic: Oracle operates infrastructure for OpenAI's training workloads. The operational lessons from running the world's largest AI training cluster feed back into OCI's infrastructure decisions — but those decisions are made for frontier training requirements, not necessarily for enterprise inference workloads. ### Working Notes OCI's bare-metal GPU instances are architecturally distinctive. 
While AWS and Google abstract GPUs behind instance types, OCI exposes bare metal with RDMA cluster networking — giving customers direct hardware access without hypervisor overhead. This appeals to AI training customers (OpenAI, xAI) who need maximum GPU utilization. The trade-off: bare metal reduces Oracle's ability to multi-tenant and abstract, pushing more operational complexity to the customer. The $45–50B financing plan (Feb 2026) for OCI expansion and the $553B RPO (Q3 FY2026, up 325% YoY) demonstrate infrastructure investment at a scale that few anticipated from Oracle. Cloud infrastructure revenue grew 84% to $4.9B in the quarter. The CapEx intensity is comparable to AWS and Google — this is no longer a database company's side project. ## ● Layer 1A: Data Storage & Governance *Durable, governed data foundation — the Governance Catalog that Layer 2C queries* **Status:** Ceded to Oracle — Database-Anchored ### Vendor-Provided Components **Oracle Autonomous AI Database** [DAPM: Ceded] Self-managing, self-securing, self-patching database built on Oracle AI Database 26ai engine. AI Vector Search (native vector data type, HNSW/IVF indexes, SQL-based similarity search), Select AI (natural language to SQL), ONNX embedding models, Private AI Services Container integration, NVIDIA NIM container support. Autonomous AI Lakehouse with Apache Iceberg support for open multi-vendor data lakehouse. RAFT-based replication, JSON Relational Duality, quantum-resistant encryption, in-database SQL firewall. Platinum and Diamond-tier availability (Apr 2026). Anomaly detection, auto-indexing, auto-tuning. Autonomous AI Vector Database variant in Limited Availability (March 2026). Available on OCI, AWS, Azure, Google Cloud via multicloud deployments. **OCI Object Storage** [DAPM: Ceded] Standard object storage for unstructured data, embeddings, model artifacts. S3-compatible API. Regional and cross-region replication. Storage Classes for cost optimization. 
**Oracle Database@AWS / @Azure / @Google Cloud** [DAPM: Ceded] Oracle AI Database running inside other hyperscalers' infrastructure with private interconnects. Multicloud Universal Credits for cross-cloud procurement. Teams use familiar AWS/Azure/Google tools and billing while running Oracle AI Database on OCI infrastructure within the hyperscaler. Oracle-AWS Interconnect expanded April 2026. ### NVIDIA-Provided Components **NVIDIA cuVS + CAGRA (Future)** GPU-accelerated vector indexing with NVIDIA CAGRA and cuVS designed for integration with Oracle AI Database. Not yet GA — future GPU acceleration for vector workloads. ### Gap Analysis Layer 1A is where OCI's structural differentiation is sharpest. Every other hyperscaler treats storage and governance as separate services composed by the customer. Oracle treats the database AS the governance layer — AI Vector Search, Select AI, data classification, audit, encryption, and access control are database-native capabilities, not services bolted on top. Oracle AI Database 26ai is the most significant Layer 1A product in this assessment because it collapses traditionally separate functions: relational storage + vector storage + semantic search + natural language querying + governance + encryption + lakehouse (Apache Iceberg) into a single authority boundary. AWS achieves comparable breadth by composing S3 + Glue + Lake Formation + OpenSearch — four services, four governance surfaces. Oracle delivers it in one. The Autonomous AI Lakehouse extends this to open formats: Apache Iceberg read/write in object store, enabling cross-cloud analytics without data movement. Oracle Database@AWS, @Azure, and @Google Cloud place Oracle's data authority inside other hyperscalers' infrastructure — a multicloud data-gravity strategy no other vendor in this series attempts. The DAPM implication: collapsing Layer 1A into a single database authority is simultaneously Oracle's greatest strength and greatest lock-in risk. 
The enterprise gains architectural simplicity and eliminates inter-service governance gaps. But substituting away from Oracle AI Database means losing vector search, semantic querying, governance, and the lakehouse capability simultaneously — a higher switching cost than any other hyperscaler's Layer 1A. ### Borrowed Judgment Low for database governance — Oracle AI Database 26ai governance (encryption, access control, audit, classification) is Oracle IP with 40+ years of enterprise hardening. The enterprise defines policies; Oracle enforces them inside the database. Moderate for AI-specific capabilities — Select AI and AI Vector Search embed model judgment (embedding quality, chunking strategy, SQL generation accuracy) inside the database layer. When Select AI generates SQL from natural language, the enterprise inherits the model's interpretation of business logic — the same borrowed judgment pattern identified in Google's LookML Agent, but at the database layer rather than the analytics layer. ### Working Notes Oracle's positioning of AI Database 26ai as 'the best memory core for enterprise agents' is architecturally significant for the 4+1 model. If agents store memory, context, and retrieval state in the database, then the database becomes the persistence layer for agentic intelligence — a Layer 1A function that directly enables Layer 2B/2C agent capabilities. This is the 'data-gravity for agents' thesis: agents that persist state in Oracle AI Database become progressively harder to move off Oracle's platform. The quantum-resistant encryption and in-database SQL firewall in 26ai address threats that other vendors' Layer 1A offerings don't yet productize. Security-forward positioning that aligns with sovereign AI requirements. 97% of Fortune 100 running on Oracle Database creates an installed base advantage no other vendor can replicate. 
The question is whether that installed base translates to AI workload adoption or whether enterprises run AI on one platform and databases on another. ## ● Layer 1B: Context Management & Retrieval *Low-latency retrieval for RAG — vector/hybrid search, context windows* **Status:** Ceded — Database-Native ### Vendor-Provided Components **AI Vector Search (Oracle Autonomous AI Database)** [DAPM: Ceded] Native vector data type, HNSW and IVF vector indexes, SQL-based similarity search. Hybrid search combining vector similarity with relational predicates. Permission-aware retrieval through database access control. No separate vector database required. **OCI Enterprise AI — Managed Vector Store** [DAPM: Ceded] Managed vector storage with file ingestion, semantic search, and metadata filtering for RAG and NL2SQL use cases. Schema enrichment into semantic vector store for natural language queries that produce and execute SQL against customer databases with permission control. **Select AI + SQL Search (NL2SQL)** [DAPM: Ceded] Natural language to SQL generation and execution. Applications and analytics use LLMs to understand natural language questions and generate Oracle SQL. Bridges unstructured (vector) and structured (SQL) retrieval in a single query surface. **OCI Generative AI — Embeddings + Rerank** [DAPM: Ceded] Managed embedding and reranking services. Supports Cohere, NVIDIA Nemotron, and other embedding models. OpenAI-compatible APIs. Dedicated AI clusters for consistent latency. ### NVIDIA-Provided Components **NVIDIA NIM Containers for Embeddings** NVIDIA embedding models available through OCI Generative AI Model Import. cuVS acceleration planned for future vector indexing. ### Gap Analysis OCI's Layer 1B collapses into Layer 1A — and that's the architectural point. AI Vector Search lives inside Oracle AI Database, not as a separate vector database service. RAG queries execute as SQL against the same database that holds the enterprise's transactional data. 
Permission-aware retrieval inherits the database's existing access control model without a separate security overlay. This is the same architectural pattern as VAST (InsightEngine inside the data platform) but at the database level rather than the storage level. The comparison to AWS is instructive: Bedrock Knowledge Bases composes OpenSearch + S3 + embedding models across service boundaries. Oracle eliminates those boundaries by making vector search a database capability. The SQL Search (NL2SQL) capability adds a dimension no other vendor's Layer 1B provides: agents can retrieve structured enterprise data through natural language queries that generate and execute SQL. This bridges unstructured retrieval (vector search for documents) and structured retrieval (SQL for business data) in a single query surface — a genuine differentiation. The gap: Oracle's Layer 1B is database-bounded. Data outside Oracle AI Database isn't retrievable through AI Vector Search. AWS's Bedrock Knowledge Bases can index data from any S3-accessible source. Google's BigQuery can federate queries across storage boundaries. Oracle's retrieval requires data to be IN the database or accessible through database links. SyncEngine-like capabilities for ingesting external enterprise data (Google Drive, Jira, Confluence) are not evident in Oracle's published materials. ### Borrowed Judgment Low for vector search infrastructure — AI Vector Search, indexing, and SQL-based retrieval are Oracle IP inside the database engine. No external retrieval service dependency. Moderate for embedding quality — embedding models are either ONNX (customer-provided), NVIDIA NIM containers, or third-party models through OCI Generative AI. The quality of retrieval depends on embedding model choice, which the enterprise controls but doesn't build. 
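The permission-aware hybrid retrieval described in this layer's Gap Analysis (vector similarity combined with relational predicates, filtered by the same access control as the underlying rows) can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not Oracle's SQL syntax or API: `Row`, `hybrid_search`, and the `readers` field are hypothetical names, and the brute-force cosine ranking stands in for the HNSW or IVF index a database engine would use.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Hypothetical table row carrying relational columns plus an embedding."""
    doc_id: str
    department: str       # relational column used as a predicate
    readers: frozenset    # stand-in for the database's access control
    embedding: tuple      # vector column


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(rows, query_vec, user, department, k=2):
    """Apply access control and the relational predicate, then rank by similarity."""
    visible = [
        r for r in rows
        if user in r.readers and r.department == department
    ]
    ranked = sorted(
        visible,
        key=lambda r: cosine(r.embedding, query_vec),
        reverse=True,
    )
    return [r.doc_id for r in ranked[:k]]
```

Because the filter runs before ranking, a row the user cannot read never enters the candidate set. That is the property the text attributes to keeping retrieval inside the database's existing access control model rather than adding a separate security overlay.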
The NL2SQL capability introduces a specific borrowed judgment: when Select AI generates SQL from natural language, the accuracy of retrieval depends on the model's understanding of the database schema. Schema enrichment into a semantic vector store (announced in OCI Enterprise AI) mitigates this — but the enterprise inherits the model's interpretation of table relationships and business logic. ### Working Notes The Private Agent Factory pattern (announced March 2026) keeps agent reasoning co-located with enterprise data inside Oracle AI Database — agents query the database directly rather than through external RAG pipelines. This reduces network hops and latency for retrieval but deepens the database dependency. Every agent built on Private Agent Factory inherits Oracle AI Database as a non-substitutable Layer 1B dependency. The Exadata for AI announcement (vector search offloaded to intelligent storage for dramatic speedups) bridges Layer 1A and Layer 1B at the hardware level — the storage system itself accelerates retrieval. This is comparable to VAST's CNode-X collocating cache and compute, but implemented at the database storage layer rather than the file system layer. ## ◑ Layer 1C: Data Movement & Pipelines *Move/transform data — ETL/ELT, lineage, governed data preparation* **Status:** Ceded — Database-Centric, Gaps in ML Pipeline Orchestration ### Vendor-Provided Components **OCI GoldenGate** [DAPM: Ceded] Real-time data replication, transformation, and streaming across heterogeneous sources. Continuous data availability. Database migration, disaster recovery, real-time analytics. The deepest database connector ecosystem of any cloud data movement service. **OCI Data Integration + Data Flow** [DAPM: Ceded] Data Integration: managed ETL/ELT service with visual design. Data Flow: managed Apache Spark for large-scale data processing and ML data preparation. Integration with Autonomous AI Database and OCI Object Storage. 
**OCI Data Science** [DAPM: Ceded] Managed ML platform: JupyterLab notebooks, model training, model catalog, model deployment. AI Quick Actions for one-click model operations. GPU-enabled compute shapes. Separate service surface from OCI Enterprise AI. **Multicloud Database Connectivity** [DAPM: Ceded] Oracle Interconnect + AWS Interconnect for managed private high-performance connectivity (April 2026). Oracle Database@AWS, @Azure, @Google Cloud. Multicloud Universal Credits for cross-cloud procurement. Data stays in Oracle's governance model regardless of which cloud hosts the compute. ### Gap Analysis Layer 1C reveals the database-centric trade-off. Oracle's data movement capabilities are strong for database-to-database flows (GoldenGate) and analytics pipelines (OCI Data Integration, OCI Data Flow). But the ML-specific pipeline orchestration that Dell (Dataloop), HPE (Ezmeral Unified Analytics), VAST (DataEngine), and AWS (Glue + SageMaker Unified Studio) provide is less integrated. OCI Data Science provides notebooks, model training, and deployment but is a separate service from OCI Enterprise AI — creating a multi-surface problem similar to AWS's pre-Unified Studio fragmentation. There is no single governed environment that collapses data engineering, model training, and agent development the way SageMaker Unified Studio or Google's BigQuery ML attempt. GoldenGate is the most mature real-time data replication service in the assessment series — purpose-built for heterogeneous data movement across on-premises and cloud. No other hyperscaler has an equivalent with Oracle's depth of database connector support. The multicloud data movement story is strong: Oracle Database@AWS/@Azure/@Google Cloud moves the database layer to the customer's cloud of choice. But this is database replication, not general-purpose data pipeline orchestration. 
An enterprise needing Airflow-style DAG orchestration for ML workflows must deploy it on OKE — which is possible but not Oracle-managed. ### Borrowed Judgment Low to moderate. GoldenGate and OCI Data Integration are Oracle IP. Data Flow (managed Spark) and OCI Data Science (managed notebooks) are Oracle-managed wrappers around open-source technology (Apache Spark, JupyterLab). The enterprise retains pipeline logic but Cedes execution infrastructure. The pipeline gap means enterprises often bring third-party orchestration (Airflow, Kubeflow) to OCI — introducing borrowed judgment from those communities and creating governance boundaries that don't exist when using Oracle-native services. ### Working Notes The absence of a unified ML pipeline platform is the most significant gap relative to AWS and Google Cloud. Both competitors have invested heavily in collapsing the data engineering → model training → model serving → agent development pipeline into single governed surfaces. Oracle's approach is service-by-service composition — GoldenGate for replication, Data Integration for ETL, Data Flow for Spark, Data Science for ML, Enterprise AI for agents — without the horizontal integration layer. Oracle Fusion Applications data is already 'in Oracle' — the pipeline problem for Fusion customers is different than for greenfield AI. The data doesn't need to be moved; it needs to be made AI-ready. Select AI and AI Vector Search address this for Fusion customers in a way that no amount of pipeline tooling could match. The question is whether non-Fusion enterprises find the pipeline story compelling. ## ◑ Layer 2A: Infrastructure Orchestration *GPU scheduling, capacity management, autoscaling* **Status:** Ceded — OKE + NVIDIA GPU Scheduling ### Vendor-Provided Components **OCI Kubernetes Engine (OKE)** [DAPM: Ceded] Managed Kubernetes with GPU-aware node pools. Bare metal and virtual machine compute shapes. RDMA cluster networking for training workloads. 
MIG support for fractional GPU allocation. Autoscaling at pod level with GPU Device Plugin metrics. Karpenter support for node autoprovisioning. Free managed control plane. **OCI Compute Management** [DAPM: Ceded] Instance pools, cluster networks, capacity reservations, preemptible instances. GPU shape selection across NVIDIA and AMD (Q3 2026). OCI Resource Manager (Terraform-based) for infrastructure as code. **GPU Node Manager + Monitoring** [DAPM: Ceded] Kubernetes-native GPU, networking, and infrastructure monitoring for OKE clusters. NVIDIA DCGM integration. Health monitoring, utilization metrics, and alerting. Actively developing additional capabilities. ### NVIDIA-Provided Components **NVIDIA GPU Operator + Device Plugin on OKE** GPU discovery, health monitoring, scheduling within OKE clusters. MIG support for fractional GPU allocation. Node Manager for GPU/networking monitoring. NVIDIA DCGM integration. ### Gap Analysis OCI's Layer 2A follows the same pattern as AWS: Kubernetes-based GPU orchestration through a managed service (OKE) with NVIDIA GPU scheduling underneath. OKE provides managed Kubernetes with GPU-aware node pools, autoscaling, bare-metal RDMA networking, and MIG support for fractional GPU allocation. The distinction from AWS: OCI's bare-metal GPU instances give customers more direct hardware control than AWS's virtualized GPU instances. OKE autoscaling operates at the pod level using NVIDIA GPU Device Plugin metrics, and the Karpenter Provider for OCI (GA April 2026) brings flexible node autoprovisioning — matching AWS's Karpenter capability for just-in-time compute shape selection based on workload requirements. GPU scheduling authority follows the same pattern as Dell and HPE: NVIDIA controls GPU scheduling through the GPU Operator and Device Plugin. Oracle controls infrastructure orchestration through OKE. 
Policy-driven GPU scheduling (which workload gets which GPU based on cost, compliance, and performance) is not an OCI-native function — the same gap every vendor shares. The Dedicated Region model adds a Layer 2A dimension other hyperscalers don't match: OCI orchestrates infrastructure across public cloud, 60+ dedicated regions, isolated regions, and Alloy partner regions from a single control plane. The orchestration scope is broader than AWS (Outposts, AI Factories) or Google (GDC) in terms of the number and variety of deployment targets. ### Borrowed Judgment GPU scheduling: NVIDIA-controlled, same as every other vendor except AWS (Karpenter) and Google (TPU scheduling). Infrastructure orchestration: Oracle-controlled through OKE and the OCI control plane. The enterprise Cedes orchestration authority but retains Kubernetes-native configuration control. The bare-metal model creates a subtly different borrowed judgment profile: with bare metal, the enterprise has more direct GPU control (no hypervisor overhead, direct RDMA access) but also more operational responsibility. The judgment about GPU sharing, isolation, and scheduling is partially Retained by the customer — a more favorable DAPM position than fully managed GPU instances. ### Working Notes OCI's GPU monitoring through Node Manager is actively developing — the tool surfaces GPU, networking, and infrastructure metrics in a Kubernetes-native way. This is an operational foundation that could evolve toward Layer 2C if enriched with policy-driven decision-making. The multi-accelerator scheduling problem (NVIDIA vs AMD GPUs, different instance types) will become acute when AMD MI450 GPUs arrive Q3 2026. OCI will need workload-to-silicon matching capabilities — a Layer 2C function — to help customers choose between NVIDIA and AMD for each workload. No productized capability for this exists today. 
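The workload-to-silicon matching problem raised in these notes can be made concrete with a small sketch. Everything in it is hypothetical: the `Accelerator` and `Workload` types, the shape labels, the memory sizes, the hourly prices, and the "cheapest eligible shape wins" rule are illustrative assumptions, not OCI APIs or published figures. It only shows the shape of the decision a Layer 2C function would need to make.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Accelerator:
    name: str               # hypothetical label, not a real OCI shape name
    vendor: str             # "nvidia" or "amd"
    memory_gb: int          # illustrative value only
    hourly_cost: float      # illustrative value only
    frameworks: frozenset   # software stacks the shape is assumed to support


@dataclass(frozen=True)
class Workload:
    name: str
    min_memory_gb: int
    framework: str
    max_hourly_cost: float


def match_silicon(workload: Workload, pool: list) -> Optional[Accelerator]:
    """Return the cheapest accelerator that meets every hard requirement."""
    eligible = [
        a for a in pool
        if a.memory_gb >= workload.min_memory_gb
        and workload.framework in a.frameworks
        and a.hourly_cost <= workload.max_hourly_cost
    ]
    # None signals that no shape in the pool can run the workload.
    return min(eligible, key=lambda a: a.hourly_cost, default=None)


POOL = [
    Accelerator("nvidia-example", "nvidia", 192, 12.0, frozenset({"cuda", "pytorch"})),
    Accelerator("amd-example", "amd", 256, 9.0, frozenset({"rocm", "pytorch"})),
]

training = Workload("fine-tune", min_memory_gb=200, framework="pytorch", max_hourly_cost=10.0)
cuda_only = Workload("legacy-kernel", min_memory_gb=80, framework="cuda", max_hourly_cost=15.0)
too_big = Workload("oversized", min_memory_gb=512, framework="pytorch", max_hourly_cost=50.0)
```

A portable PyTorch job lands on the cheaper shape, a CUDA-only job is pinned to NVIDIA, and an oversized request returns `None`. A real matcher would also need measured performance per workload, which this sketch deliberately leaves out.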
## ● Layer 2B: Application Runtime & Execution *Model serving, agent execution, inference APIs, distributed inference* **Status:** Ceded — OCI Enterprise AI Platform ### Vendor-Provided Components **OCI Enterprise AI Platform** [DAPM: Ceded] End-to-end agentic AI platform (GA March 2026). OpenAI Responses-compatible API with multi-model routing. Enterprise AI agents with modular, composable primitives. Managed agent hosting for OSS frameworks and MCP servers. Vector stores, semantic search, NL2SQL, memory, tools. IAM integration, guardrails, observability, auditability. Dedicated AI clusters for isolated compute. **OCI Generative AI Service** [DAPM: Ceded] Foundation model access: xAI Grok (4.1 Fast, 4.3), Cohere Command A (Vision, Reasoning), NVIDIA Nemotron 3 Nano Omni, gpt-oss models. Chat, embeddings, rerank APIs. Model Import for custom models. OpenAI-compatible APIs. Sovereign AI options for data hosting. **Applications and Deployments (Hosted Runtime)** [DAPM: Ceded] Container-based hosting for custom agentic applications. Managed infrastructure, networking, storage integration, identity configuration. Public and private endpoint support. Built-in security. Supports OSS frameworks and custom runtimes. **Private Agent Factory (Oracle Autonomous AI Database)** [DAPM: Ceded] Agents run inside Oracle AI Database with direct data access. Co-locates agent reasoning with enterprise data. Eliminates external RAG pipeline latency. Private AI Services Container for on-premises inference without sending data to third-party services. ### NVIDIA-Provided Components **NVIDIA NIM + Nemotron on OCI** NVIDIA Nemotron models (including Nemotron 3 Nano Omni multimodal) available through OCI Enterprise AI. NIM containers deployable on OCI GPU instances. Model Import capability for custom NIM deployment. ### Gap Analysis OCI Enterprise AI (GA March 2026) is Oracle's unified agentic platform.
The capabilities are comprehensive: OpenAI Responses-compatible API, managed agent hosting for OSS frameworks and MCP servers, vector stores, semantic search, NL2SQL, guardrails, observability, and auditability. The OpenAI API compatibility is strategically important — it reduces migration friction from OpenAI's platform and positions OCI as a drop-in alternative. The model catalog includes xAI Grok (4.1 Fast, 4.3), Cohere Command A (Vision, Reasoning), NVIDIA Nemotron 3 Nano Omni, and custom models through Model Import. The notable absence: no Anthropic Claude and no Meta Llama in published model availability. This is a narrower model selection than AWS Bedrock or Google's Model Garden. The agentic runtime distinguishes between three patterns: (1) OCI Generative AI APIs for direct model access, (2) Enterprise AI agents with managed orchestration, tools, memory, and retrieval, and (3) Applications and Deployments for container-based hosted agentic applications with custom runtimes. This three-tier model parallels AWS's Bedrock / AgentCore / SageMaker hierarchy. Dedicated AI clusters provide isolated compute for enterprise workloads — the opposite of shared-tenant model serving. This addresses a specific enterprise concern: inference latency predictability and data isolation. The trade-off is cost — dedicated clusters have fixed cost regardless of utilization. The Fusion Agentic Applications (22 agents across HR, finance, supply chain, CX — GA March 2026) represent Layer 2B and Layer 3 simultaneously: the runtime executes agents that are pre-built for Oracle's application estate. No other hyperscaler ships pre-built enterprise application agents at this scale. ### Borrowed Judgment Model providers (xAI, Cohere, NVIDIA) bring training data, alignment, and safety decisions as borrowed judgment. Oracle's guardrails constrain output but reasoning in model weights is not customer-configurable — same pattern as every other hyperscaler. 
The Fusion Agentic Applications introduce a unique borrowed judgment dynamic: these agents execute decisions within business processes by accessing unified enterprise data, workflows, policies, approval hierarchies, and permissions. The enterprise inherits Oracle's judgment about how HR, finance, and supply chain processes should be automated. This is not model-level borrowed judgment — it's business-process-level borrowed judgment. When a Fusion Agentic Application automates a talent review or maintenance troubleshooting, the enterprise inherits Oracle's encoding of what 'good' process execution looks like. ### Working Notes The Private Agent Factory pattern (Oracle AI Database 26ai) is architecturally significant: agents run inside the database, with direct access to enterprise data without external API calls. This eliminates the retrieval latency that plagues cloud-native RAG architectures but creates total database dependency for the agent runtime. OCI Enterprise AI's MCP support and OSS framework hosting (container-based deployment with managed infrastructure, networking, storage, and identity) position Oracle to benefit from the open agentic ecosystem without building a proprietary framework. This is a pragmatic strategy: let the frameworks proliferate, provide the managed hosting. The IBM partnership (watsonx Orchestrate agents on Red Hat OpenShift on OCI, IBM Granite models via OCI Data Science) adds an enterprise-focused model and agent ecosystem that differentiates from the consumer-AI-focused model catalogs of AWS and Google. ## ◑ Layer 2C: Agentic Infrastructure — The Reasoning Plane *Policy-driven placement and resource coordination — the Autonomy Layer* **Status:** Intelligence 2C: Emerging | Infra 2C: Implicit ### Vendor-Provided Components **OCI Enterprise AI Governance** [DAPM: Ceded] IAM-based access control for AI resources. Guardrails for content safety and policy enforcement. 
Observability for agent behavior, tool usage, and data flow monitoring. Auditability with audit logs capturing all AI interactions. Sovereign AI options. Project-level isolation for agent workloads. **Autonomous Database Self-Management** [DAPM: Ceded] Self-managing, self-securing, self-repairing database operations. Auto-indexing, auto-tuning, anomaly detection, autonomous performance optimization. Infrastructure 2C at the database layer — autonomous placement and resource decisions within the database boundary. ### NVIDIA-Provided Components **No NVIDIA Layer 2C Dependency** All Layer 2C components are Oracle IP. NVIDIA does not control governance, policy, or reasoning in the OCI stack. ### Gap Analysis Intelligence Layer 2C (partially present): OCI Enterprise AI governance includes IAM-based access control, guardrails for content safety, observability for agent behavior monitoring, and auditability for compliance. Oracle's blog series on runtime governance (April–May 2026) articulates sophisticated 2C concepts: runtime budget guardrails, approval-aware execution, pre-execution veto, safe degradation, evidence-backed runtime control, and the WORM Evidence Vault for audit-grade trace preservation. But there is a gap between the conceptual architecture Oracle's engineering team has published and the productized capabilities in OCI Enterprise AI GA. The runtime governance blog describes a governed execution layer with budget guardrails, circuit breakers, and safe-mode execution. The GA product offers IAM, guardrails, and observability — necessary but not sufficient for the full 2C vision Oracle's own engineers have articulated. Infrastructure Layer 2C (not built): No OCI service answers 'given data residency, cost, latency, GPU availability across NVIDIA and AMD, and compliance requirements, should this workload run on dedicated clusters in us-ashburn-1 or on Dedicated Region infrastructure in Frankfurt?' The capacity management primitives exist. 
The policy-driven placement engine does not. The Autonomous AI Database is the closest OCI comes to Infrastructure 2C: it self-manages, self-secures, auto-indexes, auto-tunes, and detects anomalies autonomously. But this autonomy operates at the database layer, not at the infrastructure-wide placement layer that the 4+1 model defines. OCI's unique 2C opportunity: Oracle is the only hyperscaler with both the application layer (Fusion) and the data layer (AI Database) under one authority. A Layer 2C that queries Fusion application state, database governance metadata, GPU utilization, and compliance posture to make autonomous placement decisions would have richer context than any other hyperscaler's 2C — because Oracle sees from application logic through data governance to infrastructure. The data to build 2C exists. The product does not. Cross-vendor Layer 2C comparison: • Dell: Absent. • HPE: Retained (IT ops, GreenLake Intelligence) + Delegated (Kamiwaza). • VAST: Retained/Emerging (PolicyEngine + Polaris). • AWS: Intelligence 2C Delegated (AgentCore Policy). Infra 2C implicit. • Google: Most complete productized Intelligence 2C (Agent Identity + Gateway + Registry + Orchestration + Observability). • OCI: Intelligence 2C emerging (guardrails + observability + auditability). Conceptual vision published but not yet fully productized. Infra 2C implicit. ### Borrowed Judgment Intelligence 2C: Low — guardrails, observability, and auditability are Oracle IP. Customer defines IAM policies; Oracle enforces. Infrastructure 2C: Ceded (implicit) — the Autonomous AI Database makes placement, scaling, and optimization decisions autonomously. These are 2C functions at the database layer that the enterprise has Ceded without explicit classification. The parallel to AWS's implicit 2C (managed service decisions) applies, but OCI's implicit 2C is concentrated in the database rather than spread across managed services. 
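To illustrate what the missing Infrastructure 2C placement engine would have to decide, here is a minimal sketch of the question posed in the gap analysis: hard constraints (data residency, compliance, GPU availability) filter the candidate sites, then soft criteria (cost, latency) rank what remains. The `Site` and `PlacementRequest` types, the certification labels, and every number are assumptions made for illustration. No OCI service exposes this interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    jurisdiction: str           # where data physically resides
    certifications: frozenset   # hypothetical compliance labels
    free_gpus: int
    cost_per_gpu_hour: float    # illustrative, assumed positive
    latency_ms: float           # illustrative, assumed positive


@dataclass(frozen=True)
class PlacementRequest:
    allowed_jurisdictions: frozenset
    required_certifications: frozenset
    gpus_needed: int
    cost_weight: float = 0.5
    latency_weight: float = 0.5


def place(request, sites):
    """Return (chosen site or None, {site name: reason it was rejected})."""
    rejected, candidates = {}, []
    for s in sites:
        if s.jurisdiction not in request.allowed_jurisdictions:
            rejected[s.name] = "data residency"
        elif not request.required_certifications <= s.certifications:
            rejected[s.name] = "compliance"
        elif s.free_gpus < request.gpus_needed:
            rejected[s.name] = "gpu availability"
        else:
            candidates.append(s)
    if not candidates:
        return None, rejected

    # Normalize each soft criterion by its worst candidate so that
    # dollars and milliseconds are comparable before weighting.
    max_cost = max(c.cost_per_gpu_hour for c in candidates)
    max_latency = max(c.latency_ms for c in candidates)

    def score(s):
        return (request.cost_weight * s.cost_per_gpu_hour / max_cost
                + request.latency_weight * s.latency_ms / max_latency)

    return min(candidates, key=score), rejected


SITES = [
    Site("us-ashburn-1 dedicated cluster", "US", frozenset({"soc2"}), 64, 8.0, 90.0),
    Site("Frankfurt Dedicated Region", "EU", frozenset({"soc2", "gdpr"}), 16, 11.0, 15.0),
]

eu_only = PlacementRequest(frozenset({"EU"}), frozenset({"gdpr"}), gpus_needed=8)
cost_first = PlacementRequest(frozenset({"US", "EU"}), frozenset({"soc2"}), 8,
                              cost_weight=0.9, latency_weight=0.1)
eu_large = PlacementRequest(frozenset({"EU"}), frozenset({"gdpr"}), gpus_needed=32)
```

Returning the rejection reasons alongside the choice is a deliberate design decision in this sketch: an enterprise can only classify a placement decision as Retained, Delegated, or Ceded if the decision is explainable after the fact.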
### Working Notes Oracle's engineering blog series on runtime governance for agentic AI (April–May 2026) is the most sophisticated published thinking on Layer 2C from any hyperscaler's engineering team. The concepts — runtime budget guardrails as governed execution, evidence-backed observability with WORM vaults, approval-aware execution with pre-execution veto — map directly to what the 4+1 model defines as Intelligence Layer 2C. These capabilities are not yet shipping as GA products, but the engineering direction signals that Oracle understands the 2C problem and is building toward it. The question is execution velocity: can Oracle productize these concepts faster than AWS (which already has AgentCore Policy GA) or Google (which already has Agent Identity + Gateway + Registry GA)? Oracle's published vision is ahead of its shipped product — a familiar pattern for Oracle, which historically leads with database innovation and follows with cloud operationalization. The Fusion Agentic Applications contain implicit 2C: these agents make and execute decisions within business processes by accessing policies, approval hierarchies, and permissions. The agent governance is embedded in the application logic, not exposed as a configurable control plane. This is 2C — but it's Ceded 2C, where Oracle defines the governance model and the enterprise consumes it. ## ● Layer 3 (+1): AI Application Layer — The Value Plane *AI-powered business capabilities — business logic, workflow automation* **Status:** Strongest Application-Layer Authority ### Vendor-Provided Components **Oracle Fusion Agentic Applications** [DAPM: Ceded] 22 specialized AI agents across HR, Finance, Supply Chain, and CX (GA March 2026). Outcome-driven, proactive, reasoning-based. Execute decisions within business processes by accessing unified enterprise data, workflows, policies, approval hierarchies, permissions, and transactional context. Built into Oracle Fusion Cloud Applications. 
**ISV + Model Ecosystem** [DAPM: Delegated] xAI Grok, Cohere Command A, NVIDIA Nemotron, IBM Granite models. IBM watsonx Orchestrate agents on Red Hat OpenShift on OCI. SoftBank sovereign cloud with custom AI models on OCI. Oracle Analytics with AI-powered assistants. ### NVIDIA-Provided Components **NVIDIA NIM + Nemotron Models** NVIDIA models via OCI Enterprise AI alongside xAI, Cohere, and customer models. ### Gap Analysis Layer 3 is OCI's most distinctive position in the assessment series. Oracle is the ONLY vendor that owns a comprehensive enterprise application suite AND the AI infrastructure to power it. AWS provides infrastructure and some first-party applications (Q, Connect). Google provides infrastructure and productivity applications (Workspace). Neither owns ERP, HCM, SCM, or CX at Oracle's scale. Fusion Agentic Applications (22 agents, GA March 2026) demonstrate what happens when the application vendor controls the AI infrastructure: agents that can access unified enterprise data, workflows, policies, approval hierarchies, permissions, and transactional context — without integration middleware, without API gateways, without cross-vendor authentication. This is not an ISV ecosystem (Dell's model) or a curated partner program (HPE's Unleash AI). This is first-party AI applications running on first-party infrastructure accessing first-party enterprise data. Specific agent domains: workforce scheduling, payroll issue resolution (HR), financial process automation (Finance), supply chain optimization (SCM), and customer experience enhancement (CX). Each agent is pre-trained on Oracle's understanding of enterprise processes — borrowed judgment at the business logic layer. The Oracle AI Data Platform for US Federal Government bundles OCI, Autonomous AI Database, and Enterprise AI with FedRAMP High and DISA IL4/IL5 authorization — extending Layer 3 into classified and sovereignty-constrained environments. 
The US Department of War agreement (May 2026) for AI on classified networks across 10 cloud regions at DISA IL2 through Top Secret demonstrates sovereign Layer 3 that no other hyperscaler matches in classification depth with equivalent application-layer integration. Custom agent development uses the OCI Responses API (OpenAI-compatible) with managed hosting for OSS frameworks and MCP servers — the same runtime surface described at Layer 2B, consumed here as an application development platform. The IBM partnership adds watsonx Orchestrate agents on OCI for HR use cases, extending the agentic ecosystem beyond Oracle's own applications. IBM Granite models via OCI Data Science provide additional model options. The SoftBank sovereign cloud platform on OCI (May 2026) demonstrates Layer 3 for sovereign AI: SoftBank's own generative AI models running on OCI Enterprise AI with full data control within Japanese data centers. The DAPM question: Fusion Agentic Applications are the deepest expression of Ceded authority in this assessment. The enterprise Cedes business logic, process automation, and decision-making to agents that Oracle built, Oracle trained, and Oracle hosts. The efficiency gains are real. The authority concentration is total. ### Borrowed Judgment The highest borrowed judgment of any Layer 3 in this assessment. Fusion Agentic Applications encode Oracle's interpretation of enterprise processes: what constitutes a complete talent review, how maintenance troubleshooting should proceed, when payroll exceptions should escalate. The enterprise inherits decades of Oracle's process design as embedded agent behavior. Model providers (xAI, Cohere, NVIDIA) add model-level borrowed judgment. Oracle's application logic adds process-level borrowed judgment. The combination is unique: no other vendor in this assessment embeds both model judgment AND business process judgment into a single managed AI application layer. 
DAPM Action 3 applies with maximum force: when you move off Oracle, what judgment doesn't move with you? Answer: the process logic, the data governance, the application state, the agent memory, and the transactional context — essentially everything above Layer 0. ### Working Notes The Oracle AI Data Platform for US Federal Government (March 2026) combines OCI, Autonomous AI Database, and Enterprise AI into a unified offering for government agencies with FedRAMP High and DISA IL5 authorization. This is Layer 3 for regulated environments — AI agents operating within classified and sovereignty-constrained boundaries. The US Department of War agreement (May 2026) for advanced AI capabilities on classified networks, leveraging 10 cloud regions at DISA IL2 through Top Secret and Special Access Program levels, demonstrates sovereign Layer 3 that no other hyperscaler matches in classification depth with equivalent application-layer integration. The 531% growth in multicloud database revenues suggests enterprises are increasingly running Oracle's data layer inside other hyperscalers — creating a cross-cloud data authority that could enable cross-cloud Layer 3 applications. Oracle's vision may be: own the data and application layers, be agnostic about the infrastructure layer. This is the inverse of AWS's strategy (own the infrastructure, be agnostic about the application layer). 
════════════════════════════════════════════════════════════════════════════════ # Palantir AIP + Foundry + Apollo Mapped to the 4+1 Layer AI Infrastructure Model **Version:** v3.0 — Layer-by-Layer Pressure-Tested **Date:** May 23, 2026 **Source:** Palantir Architecture Center (Platforms, Ontology System, Multimodal Data Plane, Interoperability, Rubix, AIP Architecture), Apollo docs (How Apollo Works, Plans & Constraints), 'Securing Agents in Production' (Palantir blog, Jan 2026), AIPCon 9, Q1 2026 earnings (May 4, 2026), published 4+1 model ## Summary Finding Palantir is the first vendor in this series that is not an infrastructure vendor at all, and reading its own documentation makes the inversion precise. Dell, HPE, VAST, Cisco, and the clouds build upward from silicon, storage, or a data center; Palantir builds downward from the decision. Its Architecture Center is explicit that the Ontology is designed to represent 'the complex, interconnected decisions of an enterprise, not simply the data.' Where Dell is strongest at Layer 0 and absent at Layer 2C, Palantir is structurally absent at Layer 0 and strongest at Layers 1A (governance), 2B (governed agent runtime), 2C (reasoning/orchestration), and 3 (value). It is the mirror image of every infrastructure vendor assessed so far. Palantir's data and compute architecture is genuinely open, and that openness is exactly what makes the capture hard to see. The Multimodal Data Plane uses Apache Iceberg as its primary table format, registers Databricks / Snowflake / BigQuery data through Virtual Tables 'without needless data duplication,' pushes compute down to those same engines, stores data at rest in open formats (Iceberg/Parquet) reachable over REST/JDBC/S3, and supports being 'one participant in a wider' data/AI mesh — Palantir's own 'unwalled garden.' Every one of those openness claims is true. But openness at the data and access layers is not portability of the thing the enterprise actually builds. 
The test that matters is whether a vendor's opinions — its proprietary way of modeling, governing, and orchestrating — can be lifted out and operated elsewhere. Palantir's opinions are the Ontology, and they run only on Palantir. The boundary, then, is not the data — it is the opinions plus the operating model. Even when data stays in Snowflake and compute pushes down to Databricks, the Ontology (objects, links, actions, and interaction-time security), the agent runtime, and the deployment control plane remain Palantir's, deployed on Palantir's hardened Kubernetes substrate (Rubix) and operated under Palantir's connection. The four-fold integration of data, logic, action, and security is the proprietary IP; the storage underneath is deliberately commoditized and open. The capture is Oracle-shaped: open at the access layer, captive at the layer of accumulated proprietary dependence — and like Oracle, the lift to leave compounds with every use case built, because every object model, action, agent, and workflow is Palantir-specific surface that would have to be rebuilt elsewhere. Apollo is the most important find for the 4+1 model, and the documentation makes both its power and its limit exact. Apollo is a genuine constraint-solving orchestration engine: a Hub continuously evaluates every possible Plan for each Spoke, evaluates all constraints attached to each Plan (maintenance windows, product-dependency version ranges, suppression windows, artifact availability), and issues only Plans whose constraints are satisfied — across connected, disconnected, and air-gapped estates under FedRAMP High, IL5, and IL6. Its explicit 'propose-a-Plan-then-execute' paradigm, with a dependency-graph invalidation model and break-glass overrides, is the closest thing in the entire series to the multi-variable policy reasoning the working notes describe. 
The limit: Apollo's constraints govern software deployment and day-2 operations — versions, dependencies, time windows, artifact presence — not live per-inference placement of which model, which region, which cost tier at request time. Palantir has built the control-plane MECHANISM the 4+1 model wants; it is pointed at the platform lifecycle and at agent actions, adjacent to the Infrastructure-2C placement function rather than identical with it. The DAPM crux cuts as sharply as it favors. Palantir hands the enterprise a real governance surface, a real governed agent runtime, and a real constraint-based control plane — but as a Ceded dependency operated under Palantir's judgment. The platform is not self-deployable; Palantir engineers deploy and manage it, Apollo holds a persistent connection back to Palantir for updates, monitoring, and orchestration, and Forward Deployed Engineering bridges the last mile. This places Palantir in the proprietary-captive cluster with Google, Oracle, and VAST — heavily Ceded — distinguished by a decoupled capture mechanism: where VAST couples (your data must enter its namespace, so the commitment is visible), Palantir decouples (your data stays open, so the commitment is invisible until you try to leave). The inverse-of-Dell shape holds end to end. Dell owns the floor and is Absent at the reasoning plane, so it says 'bring any Layer 2C' — and that 2C arrives Delegated and substitutable, leaving authority distributed. Palantir Cedes the floor and owns the reasoning plane, so it says 'bring any Layer 0' — and everything above the floor consolidates into one Ceded-and-operated authority. Same invitation, opposite consequence: a federated answer to 'who ties the control plane together' versus a unified one. Q1 2026 (revenue +85% YoY, 1,007 commercial customers, US commercial +133%, Rule of 40 at 145%) shows a growing number of enterprises accepting the unified trade. 
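The plan-and-constraint evaluation attributed to Apollo in this summary can be sketched in a few lines. This is not Apollo's API or data model: the `Plan` and `Spoke` types, the constraint function names, and the sample versions are all hypothetical, chosen only to mirror the four constraint kinds named above (maintenance windows, dependency version ranges, suppression windows, artifact availability). The dependency-graph invalidation model and break-glass overrides are not modeled.

```python
from dataclasses import dataclass, field
from datetime import datetime, time
from typing import Optional


@dataclass
class Plan:
    product: str
    target_version: tuple                          # e.g. (2, 4, 0)
    requires: dict = field(default_factory=dict)   # dependency -> (min, max), inclusive


@dataclass
class Spoke:
    name: str
    installed: dict                      # product -> installed version tuple
    maintenance_window: tuple            # (start, end) as datetime.time, same day
    suppressed_until: Optional[datetime]
    artifacts: set                       # (product, version) pairs present locally


def in_maintenance_window(plan, spoke, now):
    start, end = spoke.maintenance_window
    return start <= now.time() <= end


def not_suppressed(plan, spoke, now):
    return spoke.suppressed_until is None or now >= spoke.suppressed_until


def dependencies_in_range(plan, spoke, now):
    for dep, (low, high) in plan.requires.items():
        version = spoke.installed.get(dep)
        if version is None or not (low <= version <= high):
            return False
    return True


def artifact_available(plan, spoke, now):
    return (plan.product, plan.target_version) in spoke.artifacts


CONSTRAINTS = [in_maintenance_window, not_suppressed,
               dependencies_in_range, artifact_available]


def issuable_plans(plans, spoke, now):
    """Return (plans with all constraints satisfied, {version: failed constraints})."""
    issuable, blocked = [], {}
    for plan in plans:
        failed = [c.__name__ for c in CONSTRAINTS if not c(plan, spoke, now)]
        if failed:
            blocked[plan.target_version] = failed
        else:
            issuable.append(plan)
    return issuable, blocked


spoke = Spoke(
    name="edge-1",
    installed={"app": (2, 3, 0), "db": (5, 1, 0)},
    maintenance_window=(time(1, 0), time(4, 0)),
    suppressed_until=None,
    artifacts={("app", (2, 4, 0)), ("app", (3, 0, 0))},
)
plans = [
    Plan("app", (2, 4, 0), {"db": ((5, 0, 0), (5, 9, 9))}),
    Plan("app", (3, 0, 0), {"db": ((6, 0, 0), (6, 9, 9))}),
    Plan("app", (2, 5, 0)),
]
```

Note what every constraint inspects: versions, time windows, and artifact presence. None of them looks at a model, a region, or a cost tier for a live request, which is the distinction the summary draws between lifecycle orchestration and per-inference placement.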
## ✕ Layer 0: Compute & Network Fabric *Raw compute, networking, and acceleration fabric* **Status:** Not Palantir's Layer (By Design) ### Vendor-Provided Components **Host-Provided Compute & Fabric** [DAPM: Ceded] Palantir deploys onto customer or cloud infrastructure (AWS, Azure, GCP, OCI, on-prem, air-gapped). No proprietary Layer 0 assets; Rubix delivers identical operational characteristics regardless of provider. **Sovereign AI OS (NVIDIA Blackwell Ultra)** [DAPM: Ceded] Turnkey Palantir-on-NVIDIA appliance for sovereignty / latency use cases. Layer 0 authority is entirely NVIDIA's; Palantir contributes the software stack from Rubix upward. ### NVIDIA-Provided Components **Sovereign AI OS (Palantir + NVIDIA)** Announced at AIPCon 9 (May 2026): a turnkey system combining NVIDIA Blackwell Ultra hardware with Palantir's full software suite, aimed at data-sovereignty and latency-sensitive deployments. This is the single place Palantir touches Layer 0 — and the silicon, interconnect, and acceleration are entirely NVIDIA's. It is the clearest market signal in the series of the control plane being assembled from both ends: the most application-down vendor (Palantir) paired with the most silicon-up vendor (NVIDIA) in one offering spanning Blackwell Ultra to the Ontology. **No Palantir Silicon, Switching, or Interconnect** Palantir designs no chips, switches, or fabric. By the Rubix documentation it deploys with identical operational characteristics across AWS, Azure, Google Cloud, Oracle Cloud, or on-premises — consuming whatever Layer 0 exists beneath it. ### Gap Analysis Layer 0 is simply not Palantir's layer, and the documentation treats this as a feature rather than a gap. Rubix (the hardened Kubernetes substrate) is explicitly designed to 'abstract away the peculiarities of different environments and providers,' giving identical operational characteristics on AWS, Azure, GCP, OCI, and on-prem. 
Palantir's value is structurally indifferent to the silicon underneath. The contrast with the infrastructure vendors is total and clean: Dell's strength is Layer 0 and its gap is Layer 2C; Palantir's gap is Layer 0 and its strength is Layer 2C. The Sovereign AI OS partnership with NVIDIA is the exception that proves the rule — when Palantir needs a Layer 0 story (sovereignty, latency, turnkey on-prem), it borrows NVIDIA's Blackwell Ultra stack wholesale rather than building its own. For the 4+1 model this is significant: because Palantir floats above Layer 0, its governance and reasoning surfaces are the only ones in the series that are not anchored to a particular infrastructure vendor's hardware. That is both the source of its federation claim (it can sit atop Dell, HPE, VAST, or a hyperscaler equally) and the reason its lock-in lives entirely in the upper layers rather than the silicon. ### Borrowed Judgment Total at Layer 0, and irrelevant to the value proposition by design. Palantir inherits all silicon, networking, and acceleration judgment from the host environment. The consequence worth tracking: the enterprise's Layer 0 choice (and its Layer 0 DAPM position) is made with a different vendor (Dell, a hyperscaler, or NVIDIA via the Sovereign AI OS), and Palantir simply rides on top — so a Palantir adoption decision does not, by itself, resolve any Layer 0 authority question. ### Working Notes The Sovereign AI OS is the entry worth tracking across the whole series, because it is the first offering that explicitly fuses the top and bottom of the 4+1 stack into one SKU. If the control plane ends up being co-built by an application-down vendor and a silicon-up vendor meeting in the middle (working notes, Pattern 4 and Open Question #2), this partnership is the prototype. 
## ● Layer 1A: Data Storage & Governance *Durable, governed data foundation — the Governance Catalog that Layer 2C queries* **Status:** Palantir Strength — Governance Authority, Open Storage ### Vendor-Provided Components **The Ontology (Four-Fold: Data + Logic + Action + Security)** [DAPM: Ceded] Models the enterprise's DECISIONS, not just its data: semantic 'nouns' (objects, properties, links) paired with kinetic 'verbs' (actions, automations) and the logic behind them (business rules, ML models, LLM-driven functions, multi-step orchestrations), all woven through a security layer. A documented 'digital twin' enabling read-write loops between humans and agents. **Interaction-Time Security (Role / Marking / Purpose)** [DAPM: Ceded] Reconciles granular role-, marking-, and purpose-based policies at the moment of interaction across data, logic, actions, and LLM calls — down to row/column level. Agents inherit security scopes from a human user or a project's permission structure, so agent governance equals employee governance. Cataloged in expressive audit logging. **Open Storage & Virtual Tables (MMDP)** [DAPM: Delegated] Iceberg as primary table format; data at rest in open formats (Iceberg/Parquet, original CSV) over REST/JDBC/S3. Virtual Tables register Databricks/Snowflake/BigQuery data without duplication. The storage layer is deliberately open and commoditized — the 'unwalled garden.' **Metadata & Semantic Interoperability** [DAPM: Ceded] Metadata services expose mandatory (security/attribution/lineage) and discretionary (tags/enrichments) metadata across datasets, ontology elements, agents, models, and pipelines for connection to existing catalogs/MDM. Ontology elements are REST/JSON-accessible with bidirectional sync to external semantic tools; Palantir MCP enables agent-driven semantic interop. 
**Ontology SDK (OSDK) — the 'Operational Bus'** [DAPM: Ceded] Turns the Ontology into a programmatically queryable API gateway / operational bus across the enterprise — the queryable, action-bearing control surface the working notes call for, as opposed to a display-only catalog. **Lineage, Versioning & Global Branching** [DAPM: Ceded] Every data query is tied to full version history and the transformation logic that produced it; object types, actions, logic, and policy rules are versioned. Global Branching applies software-engineering change management ('version control for reality') to operational data and logic for both humans and agents. ### NVIDIA-Provided Components **No NVIDIA Layer 1A Dependency** The Ontology, its security system, lineage, and the MMDP open-data architecture are Palantir IP. NVIDIA contributes nothing to the governance layer. ### Gap Analysis This layer splits cleanly into two findings: storage (open) and governance (Palantir's, and very strong). Storage is deliberately open. The Multimodal Data Plane commits to Apache Iceberg as the primary table format; data at rest is stored in original/open formats (CSV, Iceberg, Parquet) and reachable through REST, JDBC, and S3-compatible interfaces. The Virtual Tables framework registers data from Databricks, Snowflake, and BigQuery 'without needless data duplication,' and Palantir explicitly frames itself as able to be 'one participant in a wider, more heterogenous enterprise architecture' — an 'unwalled garden.' This documented openness at the data layer engages Pattern 1 (the metadata boundary problem) head-on. Palantir's Interoperability documentation describes metadata services that expose mandatory metadata (security, attribution, lineage) and discretionary metadata (tags, enrichments) across datasets, ontology elements, agents, models, and pipelines for connection to existing catalogs and MDM tools, plus bidirectional semantic synchronization with external semantic models.
Governance is where Palantir is genuinely one of the strongest in the series. The Ontology binds data, logic, action, and security into one model, and the security system reconciles role-, marking-, and purpose-based controls 'at the time of interaction, across tens of thousands of humans and agents,' down to row/column-level restrictions. Agents take security scopes that inherit from a human user or a project's permission structure — so an agent is governed exactly like the employee it acts for. This is the 'control surface,' not the 'checkbox,' that Pattern 2 asks for: governance metadata that is programmatically queryable (via the Ontology SDK / OSDK as an 'operational bus'), real-time (evaluated at interaction time), and policy-aware (purpose-based controls). The honest residual boundary: the queryable, action-bearing governance lives in the Ontology, which is Palantir's. You can keep your data in Snowflake; the decision graph that makes it operational — and the authority that governs it — is Palantir's. The five working-notes criteria are largely met (queryable, real-time, policy-aware, observable), and 'cross-platform' is met via virtual tables and metadata/semantic interop rather than only via ingestion. ### Borrowed Judgment Low for the governance logic itself — the Ontology Language/Engine/Toolchain, the interaction-time security system, and lineage are Palantir IP. The structural dependency is subtle: data need NOT enter Palantir's storage (open formats, virtual tables, pushdown), but to be GOVERNED and made operational by Palantir, it must be modeled in Palantir's Ontology and mediated by Palantir's security system. The enterprise can retain its storage substrate while ceding the decision-and-governance layer. Compared to Dell's MetadataIQ (indexes Dell storage in place, stays at Layer 1A), Palantir's governance reaches up into Layer 2C — but the authority that does the reaching is Palantir's. 
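The interaction-time check described above (role, marking, and purpose reconciled at the moment of action, with agents inheriting a human's scope) can be illustrated with a minimal sketch. This is not Palantir's API: `Principal`, `Agent`, `Resource`, and `authorize` are hypothetical names invented here, and the real system also applies row/column-level restrictions that this sketch omits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """A human user (hypothetical model, not a Palantir type)."""
    name: str
    roles: frozenset
    markings: frozenset   # markings this principal is cleared for
    purposes: frozenset   # purposes this principal may act under


@dataclass(frozen=True)
class Agent:
    """An agent holds no scope of its own; it inherits from a principal."""
    name: str
    acts_for: Principal


@dataclass(frozen=True)
class Resource:
    name: str
    required_role: str
    markings: frozenset          # caller must hold every marking
    allowed_purposes: frozenset


def authorize(actor, resource, purpose):
    """Evaluate role, marking, and purpose at the moment of interaction."""
    scope = actor.acts_for if isinstance(actor, Agent) else actor
    checks = {
        "role": resource.required_role in scope.roles,
        "marking": resource.markings <= scope.markings,
        "purpose": purpose in scope.purposes
        and purpose in resource.allowed_purposes,
    }
    allowed = all(checks.values())
    # Every decision yields an audit record, whether allowed or denied.
    audit = {
        "actor": actor.name,
        "resource": resource.name,
        "purpose": purpose,
        "allowed": allowed,
        "failed": [name for name, ok in checks.items() if not ok],
    }
    return allowed, audit
```

Because the agent's scope is resolved through `acts_for`, an agent and the employee it acts for always receive the same decision for the same resource and purpose, which is the "agent governance equals employee governance" property in miniature.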
### Working Notes The load-bearing concepts are the 'four-fold integration' (data + logic + action + security) and the Language/Engine/Toolchain decomposition: together they are why this is a governance authority rather than a catalog. The enterprise can keep its data substrate open (virtual tables, pushdown) while the decision graph that makes that data operational remains Palantir's. ## ● Layer 1B: Context Management & Retrieval *Low-latency retrieval for RAG — vector/hybrid search, context windows* **Status:** Palantir Strength — Governed Retrieval ### Vendor-Provided Components **Ontology-Object Retrieval** [DAPM: Ceded] Agents retrieve governed objects (with links, logic, and security) rather than raw chunks; context is continuously integrated into the Ontology. Retrieval inherits interaction-time security automatically. **Vector, Compute & Tool Services** [DAPM: Ceded] Integrated vectorization to produce/manage embeddings; extensible compute (multi-node Spark/Flink, single-node DuckDB/Polars, or BYO containerized engines); and a tool-services layer that functions as an evolving 'tool factory' for agents. Modular and model-agnostic. **AIP Assist / Context-Aware Surfaces** [DAPM: Ceded] Context-aware assistance across out-of-the-box applications, grounded in the Ontology to shorten time-to-value when exploring governed data. ### NVIDIA-Provided Components **No Hard NVIDIA Dependency** Vectorization and retrieval are Ontology/AIP-native services on the Rubix compute mesh. GPU acceleration is inherited from the host Layer 0 when present, not architecturally required; embeddings can be produced by any registered model. ### Gap Analysis Palantir's retrieval story is distinctive and well-documented: rather than RAG-over-raw-text producing 'a slightly better search engine,' agents retrieve and reason over Ontology objects — governed business entities carrying their links, logic, and permissions. 
The AIP architecture lists 'vector, compute, tool services' as a first-class capability: integrated vectorization to produce and manage embeddings, plus an extensible compute framework. Context is continuously integrated into the Ontology rather than assembled ad hoc at query time, which raises retrieval quality and governance simultaneously — every retrieved object carries its security markings, so retrieval respects the same interaction-time policy as everything else. The trade-off mirrors Layer 1A: retrieval is rich and governed within the Ontology's semantic frame. Compared to VAST's InsightEngine (purpose-built vector retrieval native to the data platform with permission-inheriting vector rows) or Dell's Elastic-based hybrid search, Palantir's retrieval value is semantic and governed rather than infrastructural — it is less a raw vector-DB play and more 'retrieval over a permissioned decision graph.' Because embeddings can be generated by any registered model and data can sit in virtual tables, retrieval does not force a storage migration. The score here is strong on the basis of governed retrieval — permission-aware object retrieval is what Palantir's buyers come for — not on raw vector-search performance or scale, where the purpose-built players lead. VAST and Palantir are both strong at 1B for opposite reasons: VAST on retrieval performance native to the data plane, Palantir on retrieval governance native to the Ontology. Against the working-notes criterion of retrieval-quality observability feeding placement: Palantir partially supplies it because AIP Evals suites are automatically tracked against the functions and sub-agents that consume context, giving a governed record of how retrieval-dependent behavior shifts over time — closer to the 'observable' criterion than most infrastructure vendors, though still oriented to agent quality rather than to a Layer 2C placement engine optimizing recall@k against cost. ### Borrowed Judgment Low. 
Retrieval semantics, vectorization services, and the context-integration model are Palantir's. The enterprise inherits Palantir's judgment about how context is modeled, embedded, and surfaced — which is precisely the value, but also means retrieval behavior is defined inside Palantir's framework rather than configured against an open, swappable vector store the enterprise operates independently. ### Working Notes Because retrieved units are governed objects rather than text chunks, retrieval inherits the row/column/marking/purpose security of Layer 1A automatically. This is the same structural-security property VAST achieves via permission-inheriting vector rows, reached here through the Ontology's unified security model rather than a shared Element Store. ## ◑ Layer 1C: Data Movement & Pipelines *Move/transform data — ETL/ELT, lineage, cost-aware movement, KV cache tiering* **Status:** Foundry Pipelines + Open Compute (MMDP) ### Vendor-Provided Components **Foundry Pipelines & Context Engineering** [DAPM: Ceded] Extensible multimodal connection/transformation across batch, streaming, and real-time replication (CDC) on any bundled runtime (Spark, Flink, DataFusion, Polars), with cohesive security, governance, and provenance tracking. Feeds the Ontology ('South of the Ontology'). **Open / Pushdown Compute & BYO Compute** [DAPM: Delegated] Pushdown to Databricks/Snowflake; orchestration with external inference infra, Spark clusters, and on-prem HPC; Compute Modules import any containerized runtime/model/executable, securely orchestrated by Rubix. Pipelines can run where existing compute lives. **Global Branching Change Governance** [DAPM: Ceded] Proposed changes (human or agent) live on a branch; reviewed and merged. Versioning spans object types, actions, logic, and policy rules — software-engineering governance applied to operational data and logic. 
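The branch, review, and merge flow described for Global Branching can be sketched as a toy model. Everything here is an illustrative assumption (`Proposal`, `GovernedState`, and the rule that an author cannot approve their own change are invented for the example); it shows the shape of reviewed, versioned, reversible change, not Palantir's implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Proposal:
    author: str                       # human or agent identity
    key: str                          # e.g. a policy rule or object type name
    new_value: str
    approved_by: Optional[str] = None


class GovernedState:
    """Toy 'main' state that changes only through reviewed proposals."""

    def __init__(self):
        self.main = {}
        self.history = []             # one record per merged change

    def propose(self, author, key, new_value):
        # A proposal lives on its own branch: main is untouched.
        return Proposal(author, key, new_value)

    def review(self, proposal, reviewer):
        if reviewer == proposal.author:
            raise PermissionError("author cannot approve their own change")
        proposal.approved_by = reviewer

    def merge(self, proposal):
        if proposal.approved_by is None:
            raise PermissionError("unreviewed change cannot be merged")
        record = {
            "version": len(self.history) + 1,
            "key": proposal.key,
            "old": self.main.get(proposal.key),
            "new": proposal.new_value,
            "author": proposal.author,
            "reviewer": proposal.approved_by,
        }
        self.main[proposal.key] = proposal.new_value
        self.history.append(record)
        return record["version"]

    def revert(self, version):
        # Restore whatever value the given version replaced.
        record = self.history[version - 1]
        if record["old"] is None:
            self.main.pop(record["key"], None)
        else:
            self.main[record["key"]] = record["old"]
```

The same path applies whether the author string names a human or an agent, which is the point of the component: the change-control gate does not distinguish between them.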
### NVIDIA-Provided Components **No NVIDIA Layer 1C Dependency** Pipeline authoring, transformation, lineage, and the open compute framework are Foundry/MMDP IP. Acceleration, if any, comes from the host Layer 0. ### Gap Analysis Foundry is a mature data-operations platform — connection, transformation, pipeline authoring, and lineage feed the Ontology — and the MMDP documentation makes the compute side notably open. The 'any compute' architecture supports pushdown to cloud-native runtimes like Databricks and Snowflake, 'Bring Your Own Compute' via Compute Modules (any containerized runtime securely orchestrated by Rubix), and orchestration with external inference infrastructure and on-prem HPC. Pipelines can run where the data and existing compute investment already live, not only inside Palantir. The contrast with Dell's Layer 1C is instructive. Dell's most distinctive 1C capability is KV-cache-to-storage offload — an infrastructure-physics optimization with direct inference economics. Palantir has no equivalent, because it does not operate at the storage-physics layer; its movement is logical and operational rather than infrastructural. Where Dell solves 'move bytes between PowerScale and the GPU cluster efficiently,' Palantir solves 'transform and govern data into operational objects, wherever the bytes physically sit.' The governance overlay is the differentiator: Global Branching means a pipeline or logic change — proposed by a human or an agent — lands on a branch, is reviewed, and is merged, with the change versioned across object types, actions, logic, and policy. That makes data movement auditable and reversible at the operational layer, not just the file layer. ### Borrowed Judgment Low for pipeline, transformation, lineage, and the open compute framework (Palantir IP). The dependency is that the governed terminus is the Ontology: movement and transformation, however open the runtime, exist to populate and update Palantir's governed model. 
Pushdown to Snowflake/Databricks genuinely reduces substrate lock-in; the decision layer that consumes the result does not move. ### Working Notes Palantir's interoperability posture is open in both directions — open formats at rest and governed egress to external systems — which is a meaningfully stronger stance than the on-prem storage vendors, whose openness is mostly about ingest. ## ◑ Layer 2A: Infrastructure Orchestration *GPU scheduling, quotas, RBAC, fair-share scheduling, utilization optimization* **Status:** Ceded — Platform-Scoped Orchestration ### Vendor-Provided Components **Rubix (Hardened, Autoscaling Kubernetes)** [DAPM: Ceded] Palantir's own hardened, autoscaling Kubernetes substrate that orchestrates the platform's workloads — secure-by-default networking, policy-driven node management, continuous cost optimization, FedRAMP High/IL5-6/CMMC. Real orchestration the enterprise consumes without configuring; platform-scoped, not a scheduler for the enterprise's separate fleet. **Rubix↔Apollo Execution Layer** [DAPM: Ceded] Apollo computes constraint-satisfied Plans; Rubix executes them via zero-downtime rollouts with automated rollback. Orchestration intelligence separated from execution mechanism — the bridge to the 2C control plane. **Compute Modules (Customer Workloads on Rubix)** [DAPM: Ceded] Lets a customer run a containerized workload on Rubix and inherit its placement, isolation, and cost optimization. A runtime convenience, not an exposed accelerator-fleet scheduler with quotas or fair-share the enterprise governs. ### NVIDIA-Provided Components **No NVIDIA Scheduler Dependency** Rubix is Palantir's own hardened, autoscaling Kubernetes implementation. There is no Run:ai dependency; GPU primitives, when needed, are inherited from the host environment. Rubix runs identically across AWS, Azure, GCP, OCI, and on-prem. ### Gap Analysis Layer 2A is Ceded, not absent — and the distinction from Layer 0 is the key to scoring it correctly. 
At Layer 0 Palantir genuinely provides nothing; you bring the floor. At 2A, orchestration absolutely happens: the agents and applications the enterprise buys are scheduled, autoscaled, isolated, and cost-optimized on Rubix, Palantir's hardened Kubernetes substrate. The capability exists and Palantir controls it. What the enterprise does not get is governance authority over it — no exposed GPU-scheduling, quota, or fair-share surface its infrastructure teams or AI application models can configure. Capability exists, vendor controls it, enterprise consumes without authority: that is the textbook definition of Ceded. This matches how the clouds score at 2A. AWS, Google, and Azure all orchestrate customer workloads through managed, largely invisible scheduling that the enterprise cannot configure or override — and that scores as present-and-Ceded, not absent. Rubix is architecturally the same fact: real orchestration the customer consumes without holding the keys. Scoring Palantir's managed orchestration as absent while the clouds' managed orchestration is present-and-Ceded would treat the same architectural pattern two different ways. The strength is moderate, not strong: Rubix is real and capable, but it is platform-scoped (it orchestrates Palantir's workloads, not the enterprise's separate GPU fleet) and it is not a differentiator the buyer chooses Palantir for. Compute Modules lets a customer run a containerized workload on Rubix and inherit its placement, but it is a runtime convenience, not an accelerator-fleet scheduler the enterprise operates. If the enterprise runs a separate GPU cluster for its own training/inference, that cluster still needs its own scheduler (the cloud's, Run:ai, or Kubernetes) — Palantir orchestrates what runs on Palantir, not the wider estate. Against the inverse-of-Dell spine: Dell's 2A is a gap (the capability is needed and NVIDIA's Run:ai holds it); Palantir's 2A is Ceded (the capability exists, is real, and Palantir holds it). 
Both are thin at 2A in the sense that neither hands the enterprise authority, but for opposite structural reasons. ### Borrowed Judgment Moderate and Ceded. The enterprise inherits Palantir's judgment about how its Palantir-run workloads are scheduled, scaled, isolated, and cost-optimized — node ephemerality, placement heuristics, workload distribution — without the ability to configure or override it. This is the same borrowed-judgment posture as the clouds' managed orchestration: you get the benefit of the vendor's operational opinion and you do not get to change it. It is bounded, though, because it governs only the Palantir footprint; the enterprise's separate fleet, if any, runs on its own scheduler under its own authority. ### Working Notes Rubix is genuine engineering, productized beyond Palantir's own software (offered to other software vendors for regulated-environment deployment via Palantir FedStart, and underpinning the Mission Manager government onboarding offering). The Rubix↔Apollo split is the architecturally notable detail and the bridge to 2C: Apollo computes the constraint-satisfied Plans, Rubix executes them via zero-downtime rollouts with automated rollback — orchestration intelligence separated from execution mechanism. For the 4+1 buyer, the honest framing is that this orchestration is real and Ceded: you benefit from it, you don't govern it, and it covers the Palantir footprint rather than your whole accelerator estate. ## ● Layer 2B: Application Runtime & Execution *Model serving, agent execution, inference APIs, distributed inference* **Status:** Palantir Strength — Governed Agent Runtime ### Vendor-Provided Components **AIP Agent Runtime (Stateful Loop on Rubix)** [DAPM: Ceded] Stateful control loop over a stateless reasoning core, executing tools/memory under infrastructure- and platform-level guardrails plus developer-configured controls. Per-workload isolation; operational vs.
application executions distinguished; every interaction authenticated, authorized, logged. Agents are governed identities, like employees. **Secure 'Any Model' Integration (Model Catalog)** [DAPM: Ceded] Governed access to commercial and open LLMs plus customer/fine-tuned models on a level playing field, via Palantir-managed infra with no provider retention or retraining; regional endpoints where available; token-limit governance across use-cases. Models are swappable and Evals-comparable. **Agent Lifecycle: AIP Logic, Chatbot Studio, Code Workspaces** [DAPM: Ceded] No-/low-/pro-code construction of LLM-driven functions and durable agent orchestrations on top of the Ontology, with tool access scoped by permission (data tools, logic tools, operational tools). **AIP Evals** [DAPM: Ceded] Evaluation suites operating directly against the Ontology: create test cases, debug/iterate agent definitions, compare performance across LLMs, examine execution variance. Automatically tracked against the functions and sub-agents they govern. **End-to-End Observability** [DAPM: Ceded] Monitoring of every Ontology-feeding data flow, every human/agent action, chained-execution traces, and token/resource consumption. Telemetry and log access are themselves governed by data markings. ### NVIDIA-Provided Components **Model-Agnostic 'Any Model' (MMDP)** AIP's Model Catalog offers commercial models (OpenAI, Anthropic, Google, xAI) and open models (Meta/Llama) on a level playing field with customer-registered, fine-tuned, and existing enterprise models, via Palantir-managed infrastructure that guarantees no provider data retention and no retraining on transmitted data. NVIDIA NIM can be one registered model source among many; none is privileged. Access can be governed with token limits across use-cases. ### Gap Analysis This is one of Palantir's two strongest layers and a direct contrast to Dell, which Cedes the entire runtime to NVIDIA (NemoClaw/OpenShell/Dynamo). 
The 'Securing Agents in Production' documentation defines an agent as 'a stateful control loop that repeatedly invokes a stateless reasoning core (a frontier language model), interprets its outputs, executes tools and memory options, and feeds the results back until a termination condition is met' — and then makes that loop the unit of governance. Agents run on Rubix with per-workload isolation; every interaction is authenticated, authorized, and logged; and crucially the runtime distinguishes operationally-privileged executions from application-driven executions operating under precisely governed permissions. The 'any model' philosophy makes the model a swappable component rather than the center of gravity — the Ontology is the center. This is the structural inverse of Google's model-integrated stack, where one model's (Gemini's) judgment pervades every layer; in Palantir the reasoning core is deliberately interchangeable and compared via Evals. The AIP agent lifecycle is a documented build/orchestrate/evaluate loop: no-/low-/pro-code construction (AIP Logic for low-code durable orchestrations, Code Workspaces for pro-code), with AIP Evals operating directly against the Ontology to create test cases, debug, compare performance ACROSS different LLMs, and examine variance across executions. Observability is end-to-end: every data flow into the Ontology, every action by a human or agent, the cascade of chained executions, and even token consumption are monitored — and log access is itself governed by data markings. The agent-as-governed-identity model is the decisive property: agents operate atop the same foundation as human users, abide by the same change management (Global Branching), and weave human-in-the-loop with autonomous operations. Specialized builder agents (AI FDE, AIP Analyst) can themselves construct pipelines, write logic, train models, and build ontologies — under the same governance. 
This is genuinely Palantir's IP, not a partner framework wrapped in packaging. ### Borrowed Judgment Low for the runtime, governance, isolation, and evaluation machinery — all Palantir IP. The reasoning core is explicitly borrowed but equally explicitly interchangeable (any model, no retention, compared via Evals). The enterprise inherits Palantir's judgment about HOW agents are isolated, permissioned, evaluated, observed, and audited; it retains choice over WHICH model reasons. That decoupling is the opposite of the hyperscaler model-integrated pattern and is, for many regulated buyers, the point. ### Working Notes The 2026 forward bet is 'Agentic AI Hives' — autonomous agent networks coordinating on complex problems (e.g., supply-chain disruptions) without human intervention: a shift from decision-support to decision-execution. Shared-Ontology memory plus uniform governance is what Palantir argues makes scaling from single agents to coordinated networks an engineering problem rather than an architectural rewrite. Karp's Q1 2026 framing — differentiating Palantir from model developers amid a 'thousandfold' token-cost decline — is precisely a claim that the durable value is this governed runtime/decision layer, not the model. ## ● Layer 2C: Agentic Infrastructure — The Reasoning Plane *Policy-driven placement and resource coordination — the Autonomy Layer* **Status:** Ceded — Closest Productized Mechanism, Adjacent Target ### Vendor-Provided Components **Apollo Orchestration Engine (Constraint Solver)** [DAPM: Ceded] Hub/Spoke topology; continuously evaluates possible Plans against their attached constraints (deployment safety, dependencies, timing, environment health) and issues only satisfied Plans. Automated rollback that always respects human-set holds; connected/disconnected/air-gapped under FedRAMP High/IL5/IL6. Governs platform lifecycle and day-2 ops — not live inference placement. 
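The "issue only satisfied Plans" behaviour described for Apollo can be sketched as a small constraint gate. This is a hedged illustration under stated assumptions, not Apollo's actual interface: `Plan`, `issue_plans`, the environment dictionary, and the constraint labels are all hypothetical, and a human-set hold is modeled simply as one more constraint.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Env = Dict[str, bool]                 # observed state of one Spoke (assumed shape)
Constraint = Callable[[Env], bool]


@dataclass
class Plan:
    name: str
    constraints: Dict[str, Constraint] = field(default_factory=dict)


def unsatisfied(plan: Plan, env: Env) -> List[str]:
    """Return the labels of constraints that do not hold in this environment."""
    return [label for label, check in plan.constraints.items() if not check(env)]


def issue_plans(plans: List[Plan], env: Env) -> Tuple[List[str], Dict[str, List[str]]]:
    """Issue only Plans whose constraints all hold; report why others are blocked."""
    issued: List[str] = []
    blocked: Dict[str, List[str]] = {}
    for plan in plans:
        failed = unsatisfied(plan, env)
        if failed:
            blocked[plan.name] = failed   # transparent: the reason is recorded
        else:
            issued.append(plan.name)
    return issued, blocked


upgrade = Plan(
    "upgrade",
    {
        "environment_healthy": lambda env: env["healthy"],
        "in_maintenance_window": lambda env: env["maintenance_window"],
        "no_human_hold": lambda env: not env["hold"],
    },
)
```

Returning the blocked Plans with their failed constraints, rather than silently skipping them, mirrors the transparency claim in the text: an operator can see both what would run and what is preventing the rest.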
**Interaction-Time Policy Reasoning (Ontology)** [DAPM: Ceded] Reconciles granular role/marking/purpose security across data, logic, actions, and LLM calls for tens of thousands of humans and agents at the moment of interaction — the decision plane for what agents may do. Productized and shipping. **Agent Governance & Observability** [DAPM: Ceded] Agents as governed identities; tool calls and queries versioned and tied to Evals; chained-execution tracing; telemetry and log access governed by data markings. A feedback loop for detecting and constraining faulty agent reasoning. **Persistent, Vendor-Operated Connection (Not Self-Deployable)** [DAPM: Ceded] Palantir engineers deploy and manage; Apollo holds an ongoing connection for updates, monitoring, and orchestration ('shared security model'). The reasoning plane is operated under Palantir's authority — the defining DAPM caveat for this layer. ### NVIDIA-Provided Components **No NVIDIA Layer 2C Dependency** All reasoning-plane logic — Ontology interaction-time security, Apollo's constraint solver, agent governance and observability — is Palantir IP. ### Gap Analysis Applying the working notes' 'Routing Is Not Reasoning' test, Palantir comes closer than any infrastructure vendor — and the answer has two halves plus a limit. (1) DECISION governance (Intelligence-2C): the Ontology's interaction-time security plus the AIP runtime constitute a genuine reasoning plane for WHICH agent may take WHICH action on WHICH object under WHICH policy, with role/marking/purpose controls reconciled at the moment of action across tens of thousands of humans and agents. Productized and shipping, and the most complete agent-action governance surface in the series alongside Google's. (2) ORCHESTRATION (the surprising part): Apollo is a real constraint-solving control plane. 
A Hub continuously evaluates every possible Plan for each Spoke, evaluates the constraints attached to each Plan, and issues only those whose constraints are satisfied — with automated rollback that always respects human-set holds, across connected, disconnected, and air-gapped estates under FedRAMP High / IL5 / IL6. Palantir's own framing — 'different from other control-loop systems' because it proposes a transparent Plan and executes only on satisfied constraints rather than acting silently — is almost verbatim the auditable, multi-variable policy reasoning the working notes say is missing. THE LIMIT (the crux for the 4+1 model): Apollo's constraints govern SOFTWARE DEPLOYMENT and day-2 OPERATIONS, not LIVE PER-INFERENCE PLACEMENT of which model serves which request, in which region, at which cost/latency/compliance tier. Palantir has built the control-plane MECHANISM the model wants, and the DECISION plane for agent actions — but the mechanism is pointed at the platform lifecycle and at agent governance, ADJACENT to the live inference-placement function rather than identical with it. No vendor in the series fully productizes that live-placement engine; Palantir is the closest on mechanism, Google the closest on the data→placement chain. ### Borrowed Judgment This is the heart of the assessment. Unlike Dell (no judgment to borrow — build custom 2C in 6–12 months, bring a partner, or operate without it), Palantir hands the enterprise a real reasoning plane — but as a fully Ceded, vendor-OPERATED dependency. The platform is not self-deployable: Palantir engineers deploy and manage it; Apollo maintains a persistent connection back to Palantir for updates, monitoring, and orchestration (Palantir's documented 'shared security model'); and Forward Deployed Engineering sustains it. So the judgment exercised by both the decision plane (Ontology security) and the orchestration plane (Apollo) is Palantir's, running inside Palantir's operating relationship. 
This is the same answer VAST and Google give — the vendor holds Layer 2C — but with a distinctive shape: open data substrate underneath, closed governance/decision authority above, vendor-operated throughout. Absent (Dell) is worse than Ceded (Palantir); but Ceded-AND-vendor-operated is a heavier governance position than Ceded-but-self-operated (e.g., software a customer runs themselves). ### Working Notes This layer maps directly onto the open questions in the series working notes. Open Question #1 (product or pattern?): Apollo is the series' strongest evidence that the control plane can be a PRODUCTIZED constraint engine, not merely a pattern. Open Question #2 (who builds it?): Palantir is the concrete realization of the 'governance platform / new category' candidate, and the option the Dell assessment named ('potentially Palantir Ontology'). Open Question #7 (does 2C unify Infrastructure-2C and Intelligence-2C?): Palantir unifies them organizationally — one vendor, one authority — but not functionally; agent-action governance and operational orchestration are strong, while live inference placement is the unaddressed third. The DAPM distinction the model surfaces is precise: Dell at 2C is ABSENT (no authority to cede); Palantir at 2C is CEDED-AND-OPERATED (authority exists, is productized, and is held and run by the vendor). ## ● Layer 3 (+1): AI Application Layer — The Value Plane *AI-powered business capabilities — business logic, workflow automation* **Status:** Palantir Strength — Where It Starts, Not Where It Ends ### Vendor-Provided Components **Workshop & AIP Applications** [DAPM: Ceded] Operational apps and workflows built directly on governed Ontology objects — object-oriented analytics, real-time app building, multimodal governance workflows, persona-tailored out-of-the-box applications, with AI infusion controlled and transparently assessed.
**Action-Bearing Agents (Decision Execution)** [DAPM: Ceded] Agents propose and execute real business actions on Ontology objects under human-in-the-loop branching governance — the 'decision-execution' shift toward Agentic AI Hives. **Enterprise-Automation Builder Agents (AI FDE, AIP Analyst)** [DAPM: Ceded] Specialized agents that construct pipelines, write business logic, train models, build ontologies, and develop applications — operating atop the same foundation and governance (Global Branching, interaction-time security) as human builders. **Forward Deployed Engineering** [DAPM: Ceded] Palantir's human delivery model bridges the last mile from data to operational reality. Value-accelerating, but it deepens the vendor-operated dependency rather than reducing it. ### NVIDIA-Provided Components **Model Source Only** NVIDIA's role at Layer 3 is as one possible model/runtime source (NIM) under 'any model.' Business logic, workflows, and applications are Palantir-and-customer-owned and built on the Ontology. ### Gap Analysis Layer 3 is where every infrastructure vendor is weakest (partner ecosystems) and where Palantir is native — indeed it is where Palantir STARTS and reaches down from. Because the Ontology binds data, logic, action, and security, applications and agents are built directly against governed business objects: object-oriented analytics, real-time application building (Workshop), multimodal governance workflows, and out-of-the-box applications tailored to operational users, compliance teams, engineers, and analysts, all with the 'infusion of AI carefully controlled and transparently assessed, ensuring a smooth journey from augmentation to automation.' Agents propose and execute real actions on Ontology objects (reroute shipments, trigger purchase orders) under human-in-the-loop branching governance. The contrast with Dell's Layer 3 is the most illuminating in the series. 
Dell assembles independently-governed ISV agent populations (Palantir is itself one of Dell's named ISVs) with no cross-domain governance binding them on shared infrastructure. Palantir is one such population — but INTERNALLY it has exactly the cross-domain governance the Dell stack lacks: every app and agent shares the same Ontology, the same interaction-time security, the same Evals, the same Global Branching change control. The 4+1 tension this exposes: Palantir solves the multi-agent governance problem WITHIN its boundary; it does not solve it ACROSS an enterprise running Palantir AND ServiceNow AND a homegrown stack — which is the federated reality the working notes (Pattern 3) insist on. Palantir's answer to heterogeneity is MMDP openness at the data layer, not governance federation at the agent layer. Validated traction is the strongest at this layer in the entire series: Q1 2026 revenue +85% YoY to $1.633B, 1,007 commercial customers (+31%), US commercial +133%, Rule of 40 at 145%, with named production deployments (SAP reporting >99% validation accuracy and large cloud-migration reductions; GE Aerospace; Airbus; Stellantis). Analysts characterized AIP as 'an operational system for deploying agents with governance, cost attribution, and auditability, not a model wrapper' — a Layer 3/2C claim, not a Layer 0/1 one. ### Borrowed Judgment Distributed between Palantir and the customer, which is architecturally correct at Layer 3 — but unlike a neutral application platform, the value layer is inseparable from Palantir's governance, runtime, and operating model beneath it. You do not adopt Palantir's Layer 3 without adopting the Ontology, Rubix, Apollo, and the vendor-operated relationship. The value is real and the governance is real; both are Ceded together. 
### Working Notes

Forward Deployed Engineering is the human mechanism that bridges the 'last mile' from data to operational reality and is central to Palantir's traction — but it deepens the operating dependency rather than reducing it, reinforcing the Layer 2C 'vendor-operated' finding.

The AIP architecture's 'enterprise automation' category (specialized builder agents like AI FDE and AIP Analyst constructing pipelines, logic, models, and ontologies under the same governance as humans) is the clearest expression of the augmentation→automation trajectory and of why the value layer cannot be cleanly separated from the layers beneath it.

════════════════════════════════════════════════════════════════════════════════

# VAST AI Operating System Mapped to the 4+1 Layer AI Infrastructure Model

**Version:** v1.0 — Initial Assessment

**Date:** May 21, 2026

**Source:** VAST Forward 2026, VAST AI OS white paper, analyst coverage, published 4+1 model

## Summary Finding

VAST is the most architecturally distinct vendor in this assessment series because it deliberately collapses traditionally separate infrastructure layers into a single platform. The VAST AI Operating System unifies storage (DataStore), metadata and vector database (DataBase + Catalog), global namespace (DataSpace), serverless compute (DataEngine), retrieval (InsightEngine), agent runtime (AgentEngine), governance (PolicyEngine), and model lifecycle (TuningEngine) under one authority boundary. Where Dell assembles 5–7 partner technologies to span Layers 1A through 2B, VAST provides a vertically integrated alternative with no inter-layer seams.

The Polaris control plane and PolicyEngine are the most significant Layer 2C signals from any infrastructure vendor. Polaris abstracts infrastructure location, allowing AI pipelines to operate against a single logical environment across on-prem, neocloud, and public cloud.
PolicyEngine provides inline policy enforcement governing agent access to shared memory, tools, knowledge bases, and other agents. Together, they represent the clearest ‘middle-out’ approach to the control plane problem — building Layer 2C from the data layer up. The DAPM trade-off is stark: VAST eliminates seams by collapsing authority into one vendor. The enterprise gains architectural coherence and eliminates integration risk. But the entire data plane, retrieval plane, agent runtime, AND emerging governance plane become a single Ceded dependency. If you run VAST, you run VAST for everything. There is no substitutability at any individual layer. The $30B valuation, $4B+ in bookings, $500M+ CARR, and 1,000+ enterprise deployments validate market traction. The CoreWeave anchor ($1.17B commercial agreement) validates hyperscale credibility. The CNode-X partnership with NVIDIA (GPU-accelerated servers through Cisco and Supermicro OEMs) validates compute integration. But the product-market fit question for the 4+1 model is whether enterprises will accept total data-plane vendor dependency in exchange for architectural simplicity. VAST is building the thing the 4+1 model says is missing. Whether the enterprise will buy it from a storage vendor is the open question. ## ◑ Layer 0: Compute & Network Fabric *Raw compute, networking, and acceleration fabric* **Status:** Software-Defined ### Vendor-Provided Components **DASE Architecture (Foundation)** [DAPM: Retained] Disaggregated Shared-Everything: separates stateless compute (CNodes) from persistent storage (DNodes/DBoxes) over an NVMe-over-Fabrics network. Every CNode mounts every SSD in the cluster at boot time — no node ‘owns’ any particular data. Eliminates east-west chatter between nodes, shard coordination, and per-node metadata bottlenecks. ACID transactional semantics via global Element Store. Scales linearly from TB to EB. This is VAST’s core IP and the foundation for every layer above. 
**CNodes (Stateless Compute Servers)** [DAPM: Retained] Process all user requests: file/object serving (NFS, S3), database queries, erasure coding, data reduction, vector search. Fully stateless — containerized VAST software. Add CNodes to scale performance independently from capacity. Can run on dedicated servers or as containers within EBoxes. ConnectX-7/8 network adapters for 200GbE+ RDMA connectivity. **DNodes / DBoxes (Persistent Storage)** [DAPM: Retained] NVMe-oF storage shelves connecting SCM and hyperscale flash SSDs to the NVMe fabric. DBoxes are fully redundant (no single point of failure) — redundant DNodes, NICs, fans, power. DNode containers route NVMe commands between SSDs and CNodes. Add DBoxes to scale capacity independently from performance. **EBox (Everything Box)** [DAPM: Retained] Converged form factor: runs CNode + DNode containers on the same industry-standard x86 server. Introduced in VAST 5.2. Each EBox runs three containers (1 CNode, 2 redundant DNodes). Enables deployment on hyperscaler server configs, cloud VM instances, and OEM servers (Cisco, Supermicro). The CNode in an EBox does NOT own the SSDs in that EBox — all SSDs are equally accessible by all CNodes across the cluster. **CNode-X (GPU-Accelerated, 2026)** [DAPM: Delegated] New node type adding GPU acceleration directly into VAST clusters. First time GPUs are embedded in the data platform rather than consuming it externally. Supermicro config: CloudDC AS-1116CS-TN (storage) + SYS-212GB-FNR 2U (compute) with 2x NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, AMD EPYC 9005 CPUs. Follows NVIDIA AI Data Platform reference architecture. OEM partners: Cisco, Supermicro (shipping); HPE, Lenovo (in progress). Dell notably absent. **NVMe-over-Fabrics Network** [DAPM: Delegated] High-bandwidth RDMA networking (200GbE+ via NVIDIA ConnectX-7/8) connecting all CNodes to all DNodes/SSDs. BlueField-3 DPUs + Spectrum-X switches for zero-copy data paths. 
This is the fabric that makes ‘shared everything’ possible — every compute node sees the entire storage namespace over NVMe-oF. ### NVIDIA-Provided Components **NVIDIA GPU Silicon (CNode-X)** RTX PRO 6000 Blackwell Server Edition GPUs in CNode-X configurations. GPU acceleration for vector search, data manipulation, inference, and analytics within the VAST platform. **NVIDIA ConnectX-7/8 NICs** Network adapters in every CNode providing RDMA connectivity to the NVMe fabric. This is the physical layer that enables DASE’s shared-everything model. **NVIDIA BlueField-3 DPUs** Data processing units in CNode-X for storage-side compute offload. Enable zero-copy data paths between GPU memory and NVMe storage. **NVIDIA Spectrum-X Switches** Ethernet switches for the NVMe fabric and GPU cluster networking. Same Spectrum silicon that Dell brands as PowerSwitch. **NVIDIA Libraries (cuVS, cuDF, DOCA)** Integrated directly into VAST software services on CNode-X. GPU acceleration for vector search (cuVS), data manipulation (cuDF), and networking (DOCA). ### Gap Analysis VAST’s Layer 0 is architecturally the inverse of Dell’s. Dell manufactures and sells the physical servers (PowerEdge, PowerRack) but depends on NVIDIA for the software runtime. VAST designs the software architecture (DASE) but depends on OEM partners (Cisco, Supermicro, HPE, Lenovo) for the physical servers. The DASE architecture is VAST’s genuine Layer 0 differentiator. No other vendor has a shared-everything model where every compute node can directly access every SSD over NVMe-oF with ACID guarantees. Dell’s PowerScale and ObjectScale are traditional storage architectures (even if highly performant). VAST’s DASE fundamentally changes how compute and storage relate — they share the same data structures, the same namespace, and the same transactional model. 
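
To make the shared-everything contrast concrete, here is a minimal Python sketch. It is a toy model and not VAST's implementation: the class names (`SharedNothingCluster`, `SharedEverythingCluster`), the byte-sum placement rule, and the hop counter are all assumptions introduced for illustration. It shows only one property: when every stateless compute node can address the same store, a request never has to be forwarded to an owning node.

```python
class SharedNothingCluster:
    """Toy model: each node owns one shard; non-owners must forward requests."""

    def __init__(self, num_nodes: int):
        self.num_nodes = num_nodes
        self.shards = [dict() for _ in range(num_nodes)]

    def owner(self, key: str) -> int:
        # Hypothetical placement rule: a deterministic byte-sum hash.
        return sum(key.encode("utf-8")) % self.num_nodes

    def write(self, key: str, value: bytes) -> None:
        self.shards[self.owner(key)][key] = value

    def read(self, entry_node: int, key: str):
        """Return (value, hops); a hop is one node-to-node forward."""
        owner = self.owner(key)
        hops = 0 if owner == entry_node else 1
        return self.shards[owner].get(key), hops


class SharedEverythingCluster:
    """Toy model: stateless nodes all address one shared store directly."""

    def __init__(self, num_nodes: int):
        self.num_nodes = num_nodes
        self.store = {}  # stands in for the shared storage fabric

    def write(self, key: str, value: bytes) -> None:
        self.store[key] = value

    def read(self, entry_node: int, key: str):
        # Any node can serve any key, so no forwarding is ever needed.
        return self.store.get(key), 0


def total_hops(cluster, keys) -> int:
    """Read every key from every node and count node-to-node forwards."""
    return sum(
        cluster.read(node, key)[1]
        for key in keys
        for node in range(cluster.num_nodes)
    )


# Illustrative data only; the key names are made up.
keys = ["train/0001.parquet", "train/0002.parquet", "ckpt/latest"]
shared_nothing = SharedNothingCluster(num_nodes=4)
shared_everything = SharedEverythingCluster(num_nodes=4)
for k in keys:
    shared_nothing.write(k, b"payload")
    shared_everything.write(k, b"payload")
```

Calling `total_hops` on each cluster compares the east-west traffic of the two designs for the same workload. The sketch deliberately ignores locking, caching, and failure handling, which are where a real system spends most of its complexity.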
The CNode-X evolution is architecturally significant: it dissolves the boundary between ‘storage infrastructure’ and ‘compute infrastructure.’ In Dell’s architecture, PowerEdge servers run NVIDIA’s inference runtime and PowerScale provides separate storage — data moves between them. In VAST’s CNode-X architecture, GPUs embedded in the data platform accelerate data services AND serve inference — no data movement because compute and storage are the same system. The EBox model is worth noting for the procurement story: ‘Gemini model’ pricing means certified hardware is supplied at cost from the manufacturer, with VAST software as a capacity-based subscription. VAST guarantees software compatibility with new and older hardware for up to 10 years. This is a fundamentally different commercial model than Dell’s (buy the server, license NVIDIA AI Enterprise separately, integrate yourself). The NVIDIA dependency at Layer 0 is real but different from Dell’s. Dell depends on NVIDIA for GPU silicon AND the entire software stack above it (Run:ai, NemoClaw, OpenShell, AI Enterprise). VAST depends on NVIDIA for GPU silicon and networking silicon but retains authority over the software architecture. If NVIDIA changes its GPU roadmap, both Dell and VAST are affected. But if NVIDIA changes its software roadmap (NemoClaw, AI Enterprise licensing), only Dell is affected — VAST’s software is its own. ### Borrowed Judgment Moderate but architecturally different from Dell’s. VAST borrows hardware judgment from OEM partners (Cisco/Supermicro for servers) and silicon judgment from NVIDIA (GPUs, NICs, DPUs, switches). But VAST retains software architecture judgment (DASE, containerized CNodes, NVMe fabric design) entirely. Dell borrows software judgment from NVIDIA (Run:ai, NemoClaw, OpenShell, AI Enterprise) while retaining hardware judgment (PowerEdge design, thermal engineering, rack integration). 
The DAPM distinction: Dell’s borrowed judgment at Layer 0 is silicon-level (structural, everyone shares it). VAST’s borrowed judgment at Layer 0 is hardware-manufacturing-level (Cisco/Supermicro build the boxes). Neither vendor is fully independent at Layer 0, but their dependencies are in different dimensions. ### Working Notes Dell is notably absent from VAST’s OEM partner list (Cisco, Supermicro shipping; HPE, Lenovo in progress). This is a competitive signal — Dell positions itself as VAST’s primary competitor in AI data platforms (PowerScale/ObjectScale/MetadataIQ vs. DataStore/DataBase/Catalog). Dell building CNode-X configurations would be akin to VMware selling on Hyper-V. The containerized update model is a Layer 0 operational differentiator: VAST updates CNodes by spinning up a new container version alongside the old one and switching in seconds. Traditional server updates require node reboots and downtime. This reduces the operational overhead of infrastructure management — a function that Dell handles through OpenManage Enterprise and firmware lifecycle processes. The Gemini procurement model (hardware at cost, software as subscription) means VAST’s revenue is software-driven. Dell’s revenue is hardware-driven with NVIDIA software licenses as pass-through. This affects how each vendor invests in software capabilities — VAST’s business model incentivizes software differentiation; Dell’s incentivizes hardware volume. ## ● Layer 1A: Data Storage & Governance *Durable, governed data foundation — the Governance Catalog that Layer 2C queries* **Status:** VAST Strength ### Vendor-Provided Components **VAST DataStore** [DAPM: Ceded] High-performance file, object, and block storage on all-flash NVMe. Universal Storage heritage — no separate engines for file vs. object vs. block. NFS v3/v4.1, SMB 2.1/3.1, S3 REST API. An Element is not an NFS file or an S3 object — it’s a superset of all of them. 
Write a file as an object, read it as SMB, read it over NFS — same underlying data, no copies, no translation layers. Protocol-independent storage with protocol-dependent access. **VAST DataBase** [DAPM: Ceded] Purpose-built database for AI: structured data (EDW tables), contextual metadata, vectors for similarity search, real-time streams (Kafka-compatible), catalogs, and logs. Vectors live alongside structured records in a single query path. Supports open formats like Parquet. Handles tables alongside unstructured data in the same transactional system. **VAST Catalog** [DAPM: Ceded] Indexes metadata attributes of all data on cluster — files, objects, directories. Queryable via Web UI, VMS API, CLI, or connected third-party query engines. Automated classification based on multi-protocol access patterns. The discovery surface that downstream engines (InsightEngine, PolicyEngine) consume. **VAST Element Store (Foundation)** [DAPM: Ceded] Low-level key-value store on B-tree data structure underlying all services. Organizes physical storage into a global namespace holding Elements (files/objects, tables, block volumes/LUNs). Every Element is automatically enriched with metadata, security data, and data reduction data. Provides Element-level access control, encryption, snapshots, clones, and replication. ACID transactional semantics with decentralized locks at file/object/table level. No eventual-consistency pitfalls — if an update is committed, subsequent queries see it. **Security & Compliance (Built-in)** [DAPM: Ceded] Inline encryption (at rest and in transit). Immutable snapshots (critical for ransomware recovery and audit trails). MFA. Granular access control at Element level. U.S. Government STIG-aligned security configuration. HIPAA, SOC 2, GDPR-ready compliance posture. CrowdStrike integration monitors data access, admin activity, and workload behavior continuously — security embedded at the data layer, not overlaid. 
Security governance follows data end-to-end: access controls on source files propagate automatically through embeddings to inference outputs. ### NVIDIA-Provided Components **NVIDIA cuVS (via CNode-X)** GPU-accelerated vector indexing and search. Integrated directly into VAST DataBase on CNode-X rather than as an external accelerator. Enhances performance but is not required for core storage or database operations — VAST DataStore/DataBase run without GPUs on standard CNodes/EBoxes. ### Gap Analysis This is VAST’s strongest layer and its primary 4+1 differentiator. Where Dell requires PowerScale (file) + ObjectScale (object) + Exascale (combined) + MetadataIQ (metadata) + Lightning FS (parallel) + Trust3 AI (governance partner) as separate components with integration seams and multiple authority boundaries, VAST provides a single platform where all data types, metadata, vectors, and security controls share the same Element Store with ACID guarantees. The multiprotocol Element Store is the architectural fact that changes the 4+1 mapping. In Dell’s architecture, data stored as files (PowerScale/NFS) must be accessed differently than data stored as objects (ObjectScale/S3). MetadataIQ indexes across both but the storage engines are separate systems with separate governance surfaces. In VAST, any Element can be accessed as a file, an object, or a table row — same data, same permissions, same metadata — because the Element Store is protocol-independent. For AI pipelines where data flows between ingestion (S3), preprocessing (NFS), training, and inference, this removes an entire class of integration complexity. The governance catalog question from the 4+1 model — ‘is the metadata rich enough to drive Layer 2C placement decisions?’ — has a clearer answer with VAST than with any other vendor in this assessment series. 
Because the Catalog, DataBase, PolicyEngine, and security controls are all part of the same platform operating on the same Element Store, the metadata that PolicyEngine queries for governance decisions is the same metadata that DataStore manages. No API boundary, no schema translation, no integration seam between ‘where the data lives’ and ‘where governance decisions are made.’ The security-follows-data model is significant: access controls on source files propagate automatically through embeddings to inference outputs. Dell’s Trust3 AI provides similar governance intent but as a partner overlay on separate storage engines — the propagation must cross system boundaries. VAST’s propagation is structural because storage, metadata, and security share the same data structures. The trade-off remains stark: if you choose VAST for Layer 1A, you’ve also chosen VAST for Layer 1B (InsightEngine) and Layer 1C (DataEngine). The layers are not independently substitutable. Dell’s modular approach lets you swap Elastic for Weaviate at Layer 1B without touching Layer 1A. VAST does not offer that optionality. This is the fundamental DAPM trade-off: architectural coherence vs. component substitutability. ### Borrowed Judgment Low — the lowest in either vendor assessment. VAST owns the storage engine, the database engine, the metadata catalog, the Element Store, and the security/compliance stack. NVIDIA dependency is limited to GPU acceleration (cuVS on CNode-X) which enhances but isn’t required for core storage operations. VAST’s storage platform runs without GPUs on standard CNodes/EBoxes; CNode-X adds GPU acceleration on top. Compare to Dell’s Layer 1A: Dell owns PowerScale, ObjectScale, Exascale, Lightning FS, and MetadataIQ (Retained) but depends on NVIDIA for acceleration (cuVS, SuperNICs, NeMo Retriever connector) and Trust3 AI for governance (Delegated). Dell’s authority at Layer 1A is Retained with two Delegated dependencies. 
VAST’s authority at Layer 1A is entirely self-contained — but Ceded to VAST as a vendor dependency. The DAPM distinction: Dell’s enterprise Retains Layer 1A authority. VAST’s enterprise Cedes Layer 1A authority to VAST. Both are ‘low borrowed judgment’ in different ways — Dell borrows less from partners because it built the storage; VAST borrows less from NVIDIA because its storage is GPU-independent. But the enterprise’s relationship to the vendor is different: Dell = you own it; VAST = VAST owns it, you subscribe. ### Working Notes The Element Store enrichment model is the architectural reason VAST’s governance catalog is inherently richer than Dell’s MetadataIQ: every Element is automatically enriched with metadata, security data, and data reduction data at write time — not indexed after the fact. This is structural metadata richness vs. bolt-on tagging. The practical consequence: VAST’s metadata is always current (enriched inline with writes). Dell’s MetadataIQ indexes asynchronously, which means metadata currency depends on indexing lag. The compliance story is maturing: U.S. Government STIG-aligned security configuration guide (v1.5), HIPAA compliance documentation, and the Kiteworks/Cybersecurity Insiders 2026 forecast showing 63% of organizations cannot enforce purpose limitations on AI agents. VAST’s PolicyEngine (Layer 2C) is designed to address this gap — but it depends on Layer 1A’s Element-level security model as the enforcement surface. CrowdStrike integration at the data layer (vs. Dell’s perimeter-level integration) means threat detection and automated response happen where the data is, not at the network edge. For AI workloads where data is continuously accessed by agents, data-layer monitoring is architecturally more appropriate than perimeter monitoring. 
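
The metadata-currency distinction drawn in the notes above (enrichment inline with the write versus indexing after the fact) can be illustrated with a short Python sketch. Neither class models VAST or Dell software; `InlineCatalog`, `AsyncCatalog`, `describe`, and `stale_keys` are hypothetical names chosen for this example. The sketch only shows why an asynchronous indexer has a window in which metadata lags the data it describes.

```python
import hashlib
from collections import deque


def describe(payload: bytes) -> dict:
    """Derive simple metadata from the stored bytes."""
    return {"size": len(payload), "sha256": hashlib.sha256(payload).hexdigest()}


class InlineCatalog:
    """Metadata is produced in the same call that commits the write."""

    def __init__(self):
        self.data, self.meta = {}, {}

    def write(self, key: str, payload: bytes) -> None:
        self.data[key] = payload
        self.meta[key] = describe(payload)

    def stale_keys(self) -> list:
        return [k for k, v in self.data.items() if self.meta.get(k) != describe(v)]


class AsyncCatalog:
    """Writes are queued; metadata appears only after the indexer runs."""

    def __init__(self):
        self.data, self.meta = {}, {}
        self.pending = deque()

    def write(self, key: str, payload: bytes) -> None:
        self.data[key] = payload
        self.pending.append(key)

    def run_indexer(self) -> None:
        while self.pending:
            key = self.pending.popleft()
            self.meta[key] = describe(self.data[key])

    def stale_keys(self) -> list:
        return [k for k, v in self.data.items() if self.meta.get(k) != describe(v)]
```

Any consumer that reads `meta` between `write` and `run_indexer` on the asynchronous catalog sees stale or missing metadata. The inline variant pays for its currency by doing the enrichment work on the write path.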
## ● Layer 1B: Context Management & Retrieval *Low-latency retrieval for RAG — vector/hybrid search, context windows* **Status:** VAST Strength ### Vendor-Provided Components **VAST InsightEngine** [DAPM: Ceded] Framework for building, managing, and automating real-time AI pipelines. Built on DataEngine using event-driven triggers and serverless functions. Processes data the moment it lands: automated chunking, embedding (via NVIDIA NIM), and storage in DataBase’s native vector store. Eliminates traditional batch processing delays — near-instant availability for AI retrieval and inference. Described as the first solution to securely ingest, process, and retrieve all enterprise data (files, objects, tables, streams) in real time. Model agnostic — compatible with any model using the OpenAI API spec. **VAST Vector Search (Native)** [DAPM: Ceded] Built directly into VAST DataBase on DASE architecture. No sharding, no memory-bound indexes. Scales linearly by adding stateless compute nodes — throughput grows without data reorganization. Vectors, metadata, and raw content live side-by-side in the Element Store. Single query path for vector search + SQL filters + metadata predicates + joins. A video RAG workflow can retrieve semantically relevant clips while filtering by time range, camera ID, location, or policy — in a single execution path. Scales from millions to trillions of vectors with consistent performance. **Permission-Aware Retrieval** [DAPM: Ceded] InsightEngine ties the permissions of vector database rows to the permissions of source data. RAG requests only return embeddings and chunks for data the requesting user is authorized to access. This is not a bolt-on ACL check — it’s structural because vectors inherit the security model of their source Elements. Access controls propagate from source files through embeddings to inference outputs. 
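
The structural permission inheritance described under Permission-Aware Retrieval can be sketched in a few lines of Python. This is an illustration under stated assumptions, not InsightEngine's API: `Element`, `VectorRow`, `retrieve`, the two-dimensional embeddings, and the sample principals are all hypothetical. The point it demonstrates is that vector rows carry no access list of their own; the check always resolves to the source Element, so changing the source permission changes retrieval immediately.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    element_id: str
    allowed: frozenset  # principals permitted to read the source data


@dataclass(frozen=True)
class VectorRow:
    source_id: str      # points back to the source Element; no ACL stored here
    chunk: str
    embedding: tuple


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, user, rows, elements, k=3):
    """Rank only the rows whose source Element the user may read."""
    visible = [r for r in rows if user in elements[r.source_id].allowed]
    ranked = sorted(visible, key=lambda r: cosine(query, r.embedding), reverse=True)
    return [r.chunk for r in ranked[:k]]


# Illustrative data only.
elements = {
    "hr.pdf": Element("hr.pdf", frozenset({"alice"})),
    "faq.md": Element("faq.md", frozenset({"alice", "bob"})),
}
rows = [
    VectorRow("hr.pdf", "salary bands", (1.0, 0.0)),
    VectorRow("faq.md", "vpn setup", (0.9, 0.1)),
]
query = (1.0, 0.0)
```

Filtering before ranking matters here: an unauthorized chunk never enters the candidate set, so it cannot leak through scores or result counts.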
**SyncEngine (Enterprise Data Ingestion)** [DAPM: Ceded] Ingests data from enterprise systems (Google Drive, Jira, Confluence, S3 buckets, file systems) while preserving identity and access semantics. The answer to ‘what about my non-VAST data?’ — brings external enterprise data into the VAST namespace with governance intact. Maintains durable searchable index and triggers enrichment pipelines automatically on ingestion. **VAST Catalog (Discovery Surface)** [DAPM: Ceded] Metadata indexed across all data types. Queryable via Web UI, VMS API, CLI, or third-party engines. Provides the discovery and filtering surface that InsightEngine and retrieval pipelines consume. ### NVIDIA-Provided Components **NVIDIA NIM (Embedding Models)** InsightEngine triggers NVIDIA NIM embedding agents as data is written. Embeddings stored in DataBase within milliseconds. NIM provides the embedding intelligence; VAST provides the pipeline automation and storage. Model agnostic — NIM is the default but any OpenAI-compatible model works. **NVIDIA cuVS (GPU-Accelerated Search)** GPU-accelerated vector indexing and search integrated into DataBase via CNode-X. Enhances search speed but the retrieval logic, permission model, and pipeline orchestration are all VAST’s. ### Gap Analysis Layer 1B is where the architectural difference between VAST and Dell is sharpest — and where the 4+1 model’s layer boundaries are most challenged. Dell’s Layer 1B is a three-party dependency: Dell (PowerScale/ObjectScale storage + MetadataIQ metadata), Elastic (Elasticsearch 9.4 search intelligence), and NVIDIA (cuVS acceleration). Three authority boundaries, three integration seams, three vendors to coordinate for a single retrieval pipeline. VAST’s Layer 1B is a single-authority system: InsightEngine (pipeline), Vector Search (retrieval), DataBase (vector + structured storage), and Catalog (discovery) — all VAST IP running on the same Element Store. 
NVIDIA provides embedding models (NIM) and search acceleration (cuVS) but the retrieval logic, permission propagation, and pipeline orchestration are entirely VAST’s. The ‘data decay’ concept from VAST’s engineering is relevant to the 4+1 model: the gap between what the data IS and what the system BELIEVES the data to be widens over time in batch-indexed systems. InsightEngine addresses this by triggering embedding generation the moment new data is written — embeddings are always current with source data. Dell’s incremental indexing (ingesting only updated files) addresses the same problem but across system boundaries — MetadataIQ indexes PowerScale/ObjectScale, then the Data Search Engine re-indexes based on metadata changes. VAST’s approach is structurally tighter because the trigger, the embedding, and the vector storage share the same platform. The permission-aware retrieval is the most significant security finding at Layer 1B. Vector database rows inherit permissions from source data Elements. RAG queries only return authorized content. This is structural security (same Element Store, same permission model) rather than an overlay check. Dell’s retrieval through Elastic does not natively propagate PowerScale/ObjectScale permissions into search results — that’s a separate integration concern. SyncEngine’s enterprise data ingestion (Google Drive, Jira, Confluence) with preserved identity and access semantics is VAST’s answer to the heterogeneous enterprise. But it’s also an honest constraint: enterprise data that doesn’t enter the VAST namespace isn’t retrievable through InsightEngine. Dell’s Elastic-based search can potentially index data from more sources without requiring ingestion into Dell storage. The 4+1 Layer 1B question: does VAST expose retrieval quality observability (recall@k, latency percentiles, cache hit rates) that a Layer 2C could use for placement decisions? 
VAST’s integrated architecture makes this more feasible than Dell’s (InsightEngine and PolicyEngine share the same platform), but published materials don’t detail retrieval quality metrics as a first-class observable. The pipeline automation and real-time embedding are well documented; the retrieval quality feedback loop is not. ### Borrowed Judgment Low. VAST owns the retrieval logic (InsightEngine + Vector Search), the storage substrate (DataStore + Element Store), the metadata catalog (Catalog + DataBase), and the permission model (Element-level security propagated to vector rows). NVIDIA provides embedding models (NIM) and search acceleration (cuVS) but neither is required for the core retrieval function to work — InsightEngine is model agnostic and Vector Search runs on standard CNodes without GPU acceleration. Compare to Dell’s Layer 1B: Dell’s retrieval quality depends on Elastic (search intelligence), NVIDIA (acceleration), and Dell (storage + metadata). If Elastic changes Elasticsearch licensing or features, Dell’s retrieval story changes. VAST has no equivalent third-party dependency at Layer 1B. The DAPM classification is Ceded to VAST — the enterprise doesn’t own the retrieval engine. But the authority is unified in one vendor rather than split across three. For the enterprise architect, this means one vendor relationship to manage for retrieval instead of three — but total dependency on that one vendor. ### Working Notes The single-query-path architecture (vector search + SQL + metadata predicates + joins in one execution path) remains the most significant Layer 1B capability in this assessment series. No other vendor provides this without stitching together separate systems. The model-agnostic design is worth emphasizing: InsightEngine works with NVIDIA NIM by default but supports any model compatible with the OpenAI API spec. 
This means VAST’s Layer 1B retrieval pipeline is not locked to NVIDIA’s embedding models — unlike the Layer 2B runtime where NemoClaw/OpenShell creates a stronger NVIDIA dependency in Dell’s stack. The real-time embedding trigger (process data the moment it lands) vs. Dell’s incremental indexing (index only updated files) represents two approaches to the data currency problem. VAST’s is architecturally tighter but also more compute-intensive — every write triggers an embedding pipeline. Dell’s is more conservative on compute but introduces metadata lag. For agentic AI workloads where agents make decisions based on current data, VAST’s approach is architecturally more appropriate. For cost-constrained environments where batch indexing is acceptable, Dell’s approach is more efficient. ## ● Layer 1C: Data Movement & Pipelines *Move/transform data — serverless execution, event-driven pipelines, in-situ compute* **Status:** VAST Strength ### Vendor-Provided Components **VAST DataEngine** [DAPM: Ceded] Serverless compute layer executing containerized functions, triggers, and analytical engines directly on CNodes. Three core execution modes: (1) Event Triggers via Kafka-compatible Event Broker, (2) Serverless Functions as lightweight Python in stateless containers, (3) Containerized Engines for native VAST and partner workloads. Functions, events, and pipelines defined as Kubernetes custom resources. Completely event-driven — unlike centralized orchestrators (Airflow, Slurm) that poll resources, DataEngine reacts to events at any scale without bottlenecks. Co-located with data: reduces latency, simplifies management, improves security. **VAST SyncEngine** [DAPM: Ceded] Scalable data discovery and migration service within DataEngine. Indexes and synchronizes data from external sources (S3 buckets, file systems, SaaS platforms like Google Drive, Jira, Confluence) into the DataSpace. Maintains durable searchable index. 
Triggers enrichment pipelines automatically as new data is onboarded. Preserves identity and access semantics from source systems. **VAST Event Broker** [DAPM: Ceded] Native Kafka-compatible streaming ingestion built into the platform (not an external dependency). Stores message topics as VAST Tables for global ordering and real-time SQL querying of live streams. Events are immediately accessible as tables in the DataBase. Detects S3 object creation, tagging, and deletion events and triggers downstream workflows. Decoupled design: functions are never hardwired to data sources — the broker ensures events flow to the right consumer. **DataEngine CLI + SDK** [DAPM: Ceded] Command-line interface (vastde) for managing functions, pipelines, triggers, compute clusters, container registries. Python SDK with DataEngine context object for function development. Multiple output formats (human-readable, JSON, YAML), dry-run mode, integrated monitoring with built-in access to logs and traces. Low-code interface also available for non-developers. **VAST TuningEngine (End of 2026)** [DAPM: Ceded] Manages and automates AI model tuning. Works with PolicyEngine to power automatic learning loops that remain aligned with organizational expectations. Creates closed operational computing loop: observe, reason, act, evaluate, improve. Future interactions improve newly deployed models. Slated for release by end of 2026. **Composability Architecture** [DAPM: Ceded] The layers compose into end-to-end workflows: SyncEngine discovers and onboards data → DataEngine processes via event-driven functions and Event Broker → InsightEngine contextualizes with chunking and embedding models → DataBase stores vector embeddings for querying → AgentEngine executes multi-step tasks leveraging tools and data across the ecosystem. Each stage is a DataEngine pipeline stage, not a separate system. 
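
The event-driven composition described above (onboard, process, embed, store) can be sketched as a small in-memory pipeline. This is a minimal illustration, assuming a toy `EventBroker` class, made-up topic names, and a placeholder embedding; it does not use DataEngine's SDK, Kafka, or Kubernetes custom resources. What it shows is the control flow: no scheduler polls for work, and each stage runs only because an upstream event was published.

```python
from collections import defaultdict, deque


class EventBroker:
    """Toy in-memory broker: handlers react to topics and may publish more events."""

    def __init__(self):
        self._subscribers = defaultdict(list)
        self._queue = deque()

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, payload):
        self._queue.append((topic, payload))

    def run(self) -> int:
        """Deliver queued events until none remain; return handler invocations."""
        handled = 0
        while self._queue:
            topic, payload = self._queue.popleft()
            for handler in self._subscribers[topic]:
                handler(self, payload)
                handled += 1
        return handled


def build_pipeline():
    """Wire three stages together purely through topic subscriptions."""
    broker = EventBroker()
    store = defaultdict(list)  # stands in for a vector table

    def chunk_stage(broker, doc):
        words = doc["text"].split()
        for i in range(0, len(words), 4):  # fixed four-word chunks (assumption)
            text = " ".join(words[i:i + 4])
            broker.publish("chunk.created", {"doc": doc["name"], "text": text})

    def embed_stage(broker, chunk):
        # Placeholder embedding (assumption): a real stage would call a model.
        text = chunk["text"]
        vector = (len(text), len(text.split()))
        broker.publish("embedding.created", {**chunk, "vector": vector})

    def store_stage(broker, row):
        store[row["doc"]].append(row)

    broker.subscribe("data.onboarded", chunk_stage)
    broker.subscribe("chunk.created", embed_stage)
    broker.subscribe("embedding.created", store_stage)
    return broker, store
```

To exercise it, call `build_pipeline()`, publish a `data.onboarded` event with `name` and `text` fields, and call `run()`. Because stages know only topic names and not each other, a stage can be replaced or added without touching the rest, which is the decoupling property the Event Broker description emphasizes.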
### NVIDIA-Provided Components **NVIDIA cuDF (via CNode-X)** GPU-accelerated data manipulation for pipeline processing. Integrated into DataEngine on CNode-X configurations. Accelerates the transformation and enrichment stages of data pipelines. **NVIDIA CUDA Libraries** CNode-X integrates NVIDIA libraries directly into DataEngine services. 44% faster queries, 80% lower costs per VAST claims. GPU acceleration is additive — DataEngine runs on standard CNodes without GPUs; CNode-X adds performance. **NVIDIA NIM (via InsightEngine)** Embedding model execution triggered by DataEngine events. NIM provides the embedding intelligence for the InsightEngine pipeline stage. Model agnostic — any OpenAI API-compatible model works. ### Gap Analysis VAST’s DataEngine is the most architecturally integrated Layer 1C in this assessment series. The gap analysis requires comparing three dimensions: orchestration model, data movement, and lifecycle completeness. Orchestration model: DataEngine is completely event-driven using Kubernetes custom resources. Dell’s Dataloop is a no-code/low-code engine acquired from a startup. Apache Airflow and Slurm (common alternatives) use centralized orchestrators that poll resources. DataEngine’s event-driven model avoids the scaling bottlenecks of centralized orchestration — it reacts to events rather than scheduling them. This is a meaningful architectural advantage for continuous AI pipelines where data flows never stop. Data movement: DataEngine executes co-located with data on CNodes. Dell’s Dataloop orchestrates across API boundaries between separate storage (PowerScale), search (Elastic), and compute (NVIDIA) systems. The practical difference: VAST’s pipelines don’t move data between systems because storage and compute share the same platform. Dell’s pipelines necessarily move data between PowerScale/ObjectScale and the GPU cluster. 
Dell’s KV Cache offload (NVIDIA CMX, 19x TTFT improvement) is a sophisticated answer to this data movement problem — but it’s solving a problem that VAST’s architecture doesn’t have. Lifecycle completeness: DataEngine + SyncEngine + Event Broker + InsightEngine + AgentEngine + TuningEngine spans from data ingestion through embedding through agent execution through model improvement. Dell spans from data orchestration (Dataloop) through search (Elastic) through analytics (Starburst) — but the agent runtime (NemoClaw) and model lifecycle are separate NVIDIA-owned layers. VAST’s lifecycle is more complete but less proven. Dell’s lifecycle has more mature individual components but more seams. The composability architecture is worth noting for the 4+1 model: VAST’s layers (SyncEngine → DataEngine → InsightEngine → DataBase → AgentEngine) compose as pipeline stages within a single platform. The 4+1 model defines these as separate layers (1A, 1B, 1C, 2B). VAST’s architecture challenges the layer separation by making the boundaries internal to one platform rather than external between systems. The layers still exist functionally, but the authority boundaries don’t — VAST owns all of them. ### Borrowed Judgment Low. DataEngine, SyncEngine, Event Broker, composability architecture, and the DataEngine CLI/SDK are all VAST IP. TuningEngine (end of 2026) will be VAST IP. NVIDIA provides GPU acceleration (cuDF, CUDA, NIM) but the orchestration logic, event processing, pipeline management, and lifecycle coordination are entirely VAST’s. Compare to Dell’s Layer 1C: Dell owns the Dataloop orchestration engine (Retained — its strongest software move) but depends on NVIDIA for acceleration (cuDF, CMX for KV cache), Starburst for analytics, and NVIDIA Blueprints/NIMs for pipeline templates. Dell’s Layer 1C has four authority boundaries: Dell (Dataloop), NVIDIA (acceleration + templates), Starburst (analytics), and the customer (pipeline configuration). VAST’s Layer 1C has one: VAST. 
The Kubernetes foundation is a shared dependency: both Dell and VAST depend on K8s. But VAST embeds K8s custom resources into its platform (functions, events, pipelines as CRDs), while Dell depends on external K8s distributions (Red Hat OpenShift AI, Canonical Ubuntu). VAST’s K8s dependency is internal; Dell’s is external. ### Working Notes The DataEngine CLI (vastde) and Python SDK represent a developer experience that Dell’s Dataloop doesn’t yet match in published documentation. The CLI manages functions, pipelines, triggers, compute clusters, and container registries with dry-run mode and integrated observability. Dell’s Dataloop is positioned as no-code/low-code — targeting data engineers. VAST’s DataEngine targets both low-code users and developers/MLOps engineers with full Python SDK access. Different audience emphasis. The built-in observability (logs, traces within the same UI, without external Grafana or tracing infrastructure) is an operational differentiator. Dell’s observability for data pipelines depends on partner tools and external monitoring stacks. The TuningEngine + PolicyEngine combination represents VAST’s ‘thinking machine’ vision: systems that observe, reason, act, evaluate, and improve automatically. This is the most ambitious lifecycle claim from any vendor in this assessment series. Dell’s equivalent would require assembling Dataloop (orchestration) + NVIDIA NeMo (model training) + NemoClaw (agent execution) + manual feedback loops — four systems from three vendors with no automated improvement loop. ## ◑ Layer 2A: Infrastructure Orchestration *Provision, schedule, and govern compute environments* **Status:** Partial ### Vendor-Provided Components **Polaris Control Plane** [DAPM: Ceded] Global control plane purpose-built for AI data infrastructure spanning public cloud, neocloud, and on-prem. Kubernetes-based architecture with lightweight agent on every VAST node. 
Intent-driven: administrators define desired state, Polaris coordinates cloud-native services to achieve and maintain it. Automates provisioning, cloud marketplace integration (subscription/entitlement), centralized upgrade orchestration, expansion, and node replacement. Multi-cluster: converts distributed infrastructure into a single operational platform. Available as VAST-managed, partner-managed, or customer-managed. **Polaris Enterprise Controls** [DAPM: Ceded] Enterprise identity integration, role-based access control, and audit logging. Cloud-style operational consistency across hybrid and multicloud environments. Supports sovereign deployments. Multi-tenant by design. **DataEngine Built-in Scheduler** [DAPM: Ceded] Built-in scheduler and cost-optimizer within DataEngine. Deploys serverless functions on CPU, GPU, and DPU architectures. Manages function lifecycle, container orchestration, and resource allocation. Event-driven scheduling rather than centralized job queuing. **Polaris + DataSpace Coordination** [DAPM: Ceded] Polaris abstracts INFRASTRUCTURE location (where clusters are deployed and maintained). DataSpace abstracts DATA location (how data is presented across locations via global namespace). Together they provide the ‘where should this run relative to this data’ coordination that is a prerequisite for Layer 2C. ### NVIDIA-Provided Components **NVIDIA GPU Operator (on CNode-X)** GPU lifecycle management for containerized VAST services running on CNode-X. Manages driver installation, device plugins, monitoring for the GPUs embedded in the data platform. **GPU Scheduling Scope Distinction** CNode-X GPUs accelerate DATA PLATFORM services (vectorization, SQL via Sirius/cuDF, vector search via cuVS, inference within InsightEngine). They are NOT general-purpose GPU compute for external training/inference workloads. 
VAST does not provide Run:ai-equivalent fair-share GPU scheduling because the use case is different — GPU resources serve the platform’s own services, not multi-tenant external workloads competing for GPU time. ### Gap Analysis Layer 2A is where the architectural models diverge most sharply between Dell and VAST, and where the 4+1 model’s layer definitions need careful application. Dell’s Layer 2A problem: GPU-aware workload scheduling for multi-tenant inference and training. Multiple teams compete for scarce GPU resources. Run:ai provides fair-share scheduling, quotas, and RBAC. Dell has no proprietary capability here. VAST’s Layer 2A reality: GPU resources in CNode-X serve the data platform’s own services (vector search, SQL acceleration, embedding generation, data pipeline processing). These are not multi-tenant external workloads competing for GPU time — they are platform services running on platform-embedded GPUs. The scheduling question is different: VAST’s DataEngine scheduler allocates serverless function execution across CNodes, not GPU time-slices across competing users. Polaris provides genuine Layer 2A capability that Dell lacks entirely: intent-driven, multi-cluster, multi-cloud infrastructure provisioning and lifecycle management. Dell’s OpenManage manages a single rack. Polaris manages a fleet of VAST deployments across geographies, clouds, and on-prem sites as one system. The three management modes (VAST-managed, partner-managed, customer-managed) provide operational flexibility that has no Dell equivalent. However, if the enterprise deploys separate GPU clusters for training/inference (not embedded in VAST), those clusters still need GPU-aware scheduling. VAST doesn’t provide this — the customer would need NVIDIA Run:ai or equivalent for the GPU compute layer that sits alongside (not within) the VAST data platform. 
In a typical deployment, VAST handles the data plane (storage, retrieval, orchestration), and a separate GPU cluster (Dell PowerEdge, HPE ProLiant, etc. with Run:ai) handles training/inference. Polaris manages the VAST fleet; something else manages the GPU cluster. The Polaris + DataSpace coordination is the most interesting Layer 2A finding: Polaris abstracts infrastructure location while DataSpace abstracts data location. This separation of concerns — managing where infrastructure runs vs. managing where data lives — is exactly the architectural prerequisite for Layer 2C placement reasoning. No other vendor in this assessment separates these abstractions as cleanly. ### Borrowed Judgment Low for infrastructure orchestration (Polaris is VAST IP). Moderate for GPU-specific scheduling (depends on NVIDIA GPU Operator for CNode-X, and the customer still needs Run:ai or equivalent for separate GPU clusters). Compare to Dell: Dell’s Layer 2A has HIGH borrowed judgment because all GPU-aware orchestration is NVIDIA-controlled (GPU Operator, Run:ai, AI Enterprise, MIG/MPS). VAST’s Layer 2A has MODERATE borrowed judgment because Polaris handles infrastructure orchestration (VAST IP) while GPU scheduling is a narrower dependency limited to CNode-X management. The key distinction: Dell NEEDS Run:ai because GPU scheduling is the core Layer 2A function in Dell’s architecture. VAST’s Polaris handles the broader infrastructure orchestration function, and GPU scheduling is a specific sub-function for CNode-X hardware management, not the defining Layer 2A capability. Polaris is included in VAST AI OS at no additional charge. Dell’s customers pay separately for NVIDIA Run:ai and AI Enterprise licenses. This pricing distinction reflects the authority distinction: VAST bundles infrastructure orchestration because it owns it; Dell passes through NVIDIA licensing because it doesn’t. 
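The intent-driven model attributed to Polaris in this layer (administrators declare desired state and the control plane converges on it) follows the general desired-state reconciliation pattern. A minimal sketch of that pattern, assuming a hypothetical `ClusterSpec` record and a hypothetical `plan_actions` function; the cluster names and version strings are invented for illustration and do not reflect Polaris's actual API or data model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical desired/observed state for one cluster."""
    nodes: int
    version: str


def plan_actions(desired: Dict[str, ClusterSpec],
                 observed: Dict[str, ClusterSpec]) -> List[str]:
    """Compare declared intent with observed state and list convergence steps."""
    actions: List[str] = []
    for name, want in sorted(desired.items()):
        have = observed.get(name)
        if have is None:
            actions.append(f"provision {name} nodes={want.nodes} version={want.version}")
            continue
        if have.nodes < want.nodes:
            actions.append(f"expand {name} by {want.nodes - have.nodes}")
        if have.version != want.version:
            actions.append(f"upgrade {name} {have.version}->{want.version}")
    # Clusters that exist but are no longer declared are flagged, not deleted.
    for name in sorted(set(observed) - set(desired)):
        actions.append(f"review {name} (not in desired state)")
    return actions


# Illustrative fleet: names and versions are made up.
desired = {
    "onprem-fra": ClusterSpec(nodes=8, version="5.4"),
    "cloud-us": ClusterSpec(nodes=4, version="5.4"),
}
observed = {
    "onprem-fra": ClusterSpec(nodes=6, version="5.3"),
    "legacy-lab": ClusterSpec(nodes=2, version="5.1"),
}
actions = plan_actions(desired, observed)
```

Running the planner repeatedly until it returns an empty list is the core of a reconcile loop. The DAPM question raised in this section is who operates that loop (VAST, a partner, or the customer), not whether the loop exists.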
### Working Notes

The three management modes (VAST-managed, partner-managed, customer-managed) have DAPM implications:

• VAST-managed: fully Ceded; VAST operates the infrastructure
• Partner-managed: Delegated to the partner (CSP, MSP)
• Customer-managed: the enterprise operates Polaris but the software is still VAST’s

Even in customer-managed mode, the control plane software is VAST IP. The enterprise operates it but doesn’t own it. This is analogous to running VMware vCenter on your own hardware: you operate it, but the software authority belongs to the vendor.

Polaris is available now as part of VAST cloud deployments, with expanded multi-cluster orchestration planned in future releases. The ‘expanded multi-cluster orchestration’ roadmap item is worth tracking: this is where Polaris evolves from infrastructure management (Layer 2A) into placement reasoning (Layer 2C).

89% of APAC enterprises deploy workloads across multiple public clouds; 72% operate hybrid cloud models (cited by VAST). This validates Polaris’s multi-cloud orchestration thesis.

## ◑ Layer 2B: Application Runtime & Execution

*Agent execution, model serving, workflow orchestration*

**Status:** VAST AgentEngine

### Vendor-Provided Components

**VAST AgentEngine** [DAPM: Ceded]
AI agent deployment and orchestration system running natively within the DataEngine. Described as ‘the application management layer of the VAST AI OS, designed specifically for the Agentic AI Era’ and ‘the final piece of the puzzle, rounding out the core services needed to run agentic applications on AI hardware.’ Low-code runtime that simplifies programming and coordinates multi-agent workflows, model invocation, and tool usage. Brings agents to life directly within DataEngine; reasoning occurs where the data lives.

**Agent Runtime Capabilities** [DAPM: Ceded]
Robust runtime for long-running containerized services. Lifecycle management for agent deployment, scaling, and versioning.
Persistent state via execution-scoped or persistent scratch space spanning files, objects, or tables. Secure service discovery for agents to find and communicate with other agents and tools. Support for agents with multiple personas and security credentials. Fault-tolerant backbone with Kafka-backed queuing for durability and ordering. **MCP Toolbox** [DAPM: Ceded] Model Context Protocol toolbox enabling agents to orchestrate multiple tools and services together for higher-order workflows. Composability lets teams build complex agentic applications without stitching fragile systems. Aligns with the open MCP standard (donated to Linux Foundation Dec 2025). **Permission-Aware Execution** [DAPM: Ceded] Each agent call is defined with the user’s identities and permissions. Tools and data are accessed responsibly under the same permission model that governs the underlying Element Store. Multi-tenant operation with strong governance controls. This is structural — agent permissions inherit from the data platform’s security model, not from a bolt-on ACL layer. **Observability & Compliance** [DAPM: Ceded] Logs and traces every agent action, tool call, and data access. End-to-end observability creates transparency for compliance, stakeholder trust, and performance refinement. Built into the same platform — no external Grafana, tracing systems, or observability infrastructure required. **Data Event Integration** [DAPM: Ceded] Every add, move, change, or delete in storage emits an event that feeds into agent workflows via DataEngine. Ties data operations directly to agentic AI actions in real time. Agents can react to data changes as they happen — no polling, no batch triggers. ### NVIDIA-Provided Components **NVIDIA NIM Containers (on CNode-X)** VAST can serve NVIDIA NIM inference containers on CNode-X infrastructure, providing access to NVIDIA’s model ecosystem. 
NIM and AgentEngine are not mutually exclusive: NIM provides optimized model serving, while AgentEngine provides the orchestration and lifecycle management around it.

**NVIDIA CUDA (Inference Acceleration)**
Model inference on CNode-X is GPU-accelerated via CUDA. VAST provides the orchestration and governance; NVIDIA provides the compute acceleration. This separation is cleaner than Dell’s stack where NVIDIA owns both orchestration (NemoClaw) and acceleration.

### Gap Analysis

VAST AgentEngine is the most significant Layer 2B finding in the comparative analysis and the starkest architectural contrast with Dell’s approach.

Dell’s Layer 2B: Dell does not appear to own the core agent runtime. NemoClaw provides the agent execution stack (NVIDIA). OpenShell provides the sandboxed runtime (NVIDIA). NeMo Guardrails provide safety constraints (NVIDIA). Cohere North provides agent workflow orchestration (ISV partner). DataRobot provides agent lifecycle management (ISV partner). Dell provides hardware, packaging, and professional services. Five authorities for one layer.

VAST’s Layer 2B: AgentEngine provides agent deployment, orchestration, runtime, lifecycle management, persistent state, MCP toolbox, permission-aware execution, observability, and data event integration, all within a single platform. One authority for the entire layer.

The architectural differentiators:

1. Agents execute where the data lives. In VAST, there is no data movement between storage and inference runtime because they are the same platform. In Dell’s architecture, data flows from PowerScale/ObjectScale through NVIDIA’s inference runtime, crossing authority and system boundaries.
2. Permission-aware by structure. Agent permissions inherit from the Element Store’s security model. Dell’s agent permissions depend on NeMo Guardrails (runtime constraint) and the ISV partner’s governance logic (Cohere North, DataRobot).
3. Real-time data event triggers. Every storage operation emits events that feed into agent workflows. Dell’s agents consume data through retrieval (Elastic search) and pipeline (Dataloop) stages, not through direct data event feeds.
4. Built-in observability. AgentEngine logs and traces every agent action without external monitoring infrastructure. Dell’s observability depends on partner tools.
5. MCP Toolbox. Native MCP support for agent-to-tool orchestration, aligned with open standards. Dell’s agentic platform uses ISV-specific tool orchestration (Cohere North, DataRobot SDKs).

The maturity question is real. AgentEngine is newer and less proven than NVIDIA’s NemoClaw/OpenShell stack. NVIDIA’s open-source foundation (OpenClaw), broad model support (Nemotron, NIM), and ecosystem validation (Dell, HPE, Lenovo, CSPs) represent a larger installed base. Jensen Huang’s ‘operating system for personal AI’ framing signals NVIDIA’s long-term commitment to the runtime layer. VAST’s AgentEngine must prove equivalent reliability, security, and model compatibility at enterprise scale.

The commoditization signal from Augment Code’s 2026 multi-agent orchestration analysis: ‘The runtime layer (tool registries, state management, retry logic) is being commoditized by open standards and hyperscaler investment. Teams building custom implementations will find platform solutions commoditizing that work within 12–18 months.’ This applies to both VAST AgentEngine and NVIDIA NemoClaw; the question is whether the runtime will differentiate or whether the governance layer above it (Layer 2C) becomes the battleground.

### Borrowed Judgment

Low for agent orchestration, lifecycle, and governance (AgentEngine, MCP Toolbox, permission model, observability are all VAST IP). Moderate for model inference (depends on NVIDIA CUDA acceleration on CNode-X). NIM containers can run on the platform but are not required; AgentEngine is model-agnostic.
Compare to Dell: Dell’s Layer 2B borrowed judgment is TOTAL for the runtime (NemoClaw/OpenShell/NIM/Dynamo/AI Enterprise are all NVIDIA) and DELEGATED for orchestration (Cohere North/DataRobot/ClearML). Dell’s one Retained asset is professional services. VAST’s Layer 2B authority structure: VAST owns the runtime, the orchestration, the lifecycle management, the permission model, and the observability. NVIDIA provides inference acceleration. The enterprise Cedes to one vendor (VAST) rather than to three+ vendors (NVIDIA + Cohere + DataRobot + ClearML in Dell’s case). The DAPM summary: both Dell and VAST require the enterprise to Cede Layer 2B authority. The difference is whether you Cede to a fragmented set of authorities (Dell’s model) or to a unified authority (VAST’s model). The 4+1 model doesn’t prescribe which is better — it requires the enterprise architect to make the choice explicitly. ### Working Notes The positioning contrast is telling: NVIDIA says NemoClaw is ‘the operating system for personal AI.’ VAST says AgentEngine is ‘the application management layer of the AI Operating System.’ NVIDIA positions the agent runtime as an OS — the platform everything else runs on. VAST positions it as a layer within a larger OS that includes storage, retrieval, governance, and infrastructure orchestration. The difference: NVIDIA’s vision is runtime-centric (the agent runtime IS the platform). VAST’s vision is data-centric (the data platform includes the agent runtime as one of its services). For the Enterprise AI Control Plane working document: AgentEngine + PolicyEngine (Layer 2C) + the Element Store permission model (Layer 1A) form a coherent governance chain within VAST’s stack. In Dell’s stack, the equivalent governance chain would be: NeMo Guardrails (NVIDIA) + ???(no Layer 2C) + Trust3 AI (partner) + MetadataIQ (Dell) — four authorities, one missing layer. 
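The permission-aware execution described in this layer (each agent call carries the user’s identity, and tools are checked against the same permission model that governs the data) can be sketched as follows. This is an illustration of the idea only, not AgentEngine or Element Store code. `Identity`, `DATA_ACL`, `can_read`, and `agent_read_tool` are hypothetical names, and the ACL table is a stand-in for whatever the platform actually stores.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Identity:
    """Hypothetical caller identity carried on every agent call."""
    user: str
    groups: FrozenSet[str]


# Hypothetical ACL table standing in for the data platform's permission model.
DATA_ACL: Dict[str, FrozenSet[str]] = {
    "finance/ledger.parquet": frozenset({"finance"}),
    "public/handbook.pdf": frozenset({"finance", "engineering"}),
}


def can_read(identity: Identity, path: str) -> bool:
    # Unknown paths are denied by default.
    allowed = DATA_ACL.get(path, frozenset())
    return bool(identity.groups & allowed)


def agent_read_tool(identity: Identity, path: str) -> str:
    """A tool call made by an agent on behalf of a user.

    The check reuses the ACL that governs direct data access, so the agent
    can never read more than the user it acts for.
    """
    if not can_read(identity, path):
        raise PermissionError(f"{identity.user} may not read {path}")
    return f"contents of {path}"


alice = Identity("alice", frozenset({"engineering"}))
allowed_result = agent_read_tool(alice, "public/handbook.pdf")
try:
    agent_read_tool(alice, "finance/ledger.parquet")
    denied = False
except PermissionError:
    denied = True
```

The design point is that there is one permission table, consulted by both direct access and agent tools. A bolt-on model would keep a second table for agents, which is where drift between the two typically begins.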
## ◑ Layer 2C: Agentic Infrastructure — The Reasoning Plane *Policy-driven placement and resource coordination — the Autonomy Layer* **Status:** Emerging ### Vendor-Provided Components **VAST PolicyEngine (End of 2026)** [DAPM: Ceded] Inline policy enforcement point across the AI OS. Governs agent access to shared memory, external tools, knowledge bases, external data products, and other agents. Applies BOTH explicit permissions AND AI-derived context BEFORE actions execute. Mediates every type of input and output — enables redaction or transformation of sensitive data before exposure to models or agents. Tamper-proof traces and logs for replay, explainability, and regulatory compliance. Zero-trust operating posture: all AI activity remains observable, explainable, and auditable. CrowdStrike integration for security monitoring of data access, admin activity, and workload behavior. **VAST TuningEngine (End of 2026)** [DAPM: Ceded] Manages continuous model optimization. Supports LoRA fine-tuning, supervised fine-tuning, and reinforcement learning. Leverages outcomes from agentic pipelines and curated feedback. Candidate models evaluated, benchmarked, and deployed manually or automatically. Integrates with NVIDIA NeMo Data Designer for training and fine-tuning Nemotron models. Works with PolicyEngine to power automatic learning loops — the closed operational computing loop: observe, reason, act, evaluate, improve. **Polaris Control Plane (Placement)** [DAPM: Ceded] Abstracts infrastructure location across public cloud, neocloud, and on-prem. Intent-driven: administrators define desired state, Polaris coordinates to achieve and maintain it. Complements DataSpace (which abstracts data location). Together they provide the ‘where should this workload run relative to this data’ coordination. Can align compute placement with GPU availability and compliance requirements without altering application behavior. 
**VAST DataSpace (Global Namespace)** [DAPM: Ceded] Globally distributed data computing layer. Synchronizes metadata, presents data through remote caches, provides file/object/table-level consistency management across sites. Decentralized consistency: each site can assume temporary responsibility at granular namespace levels. The data-location abstraction that Polaris’s infrastructure-location abstraction complements. **The Closed Loop (PolicyEngine + TuningEngine)** [DAPM: Ceded] Together these create what VAST calls a ‘thinking machine’: a consolidated stack that merges data, compute, policy enforcement, and model training into one infrastructure fabric. PolicyEngine governs every interaction. TuningEngine learns from every outcome. The loop: observe (data events) → reason (agent processing) → act (tool execution) → evaluate (outcome assessment) → improve (model tuning) → govern (policy enforcement on the improved model). No other vendor in this assessment has a closed governance+learning loop. ### NVIDIA-Provided Components **NVIDIA NeMo Data Designer (via TuningEngine)** TuningEngine integrates with NeMo Data Designer for training and fine-tuning Nemotron models. This is NVIDIA’s model training framework accessed through VAST’s tuning pipeline — NVIDIA provides the training technology, VAST provides the orchestration and governance around it. **No NVIDIA Layer 2C Governance Dependency** PolicyEngine, Polaris, and DataSpace are VAST IP. NVIDIA does not provide or control the governance, placement, or policy reasoning layer. The NeMo Data Designer integration is a training tool dependency, not a governance dependency. This is structurally different from Dell, where the closest Layer 2C functions (Dynamo routing, NeMo Guardrails) are NVIDIA-owned. ### Gap Analysis This is the most significant finding in the comparative assessment. 
Layer 2C is where the 4+1 model’s thesis is tested most directly: does any vendor provide policy-driven placement decisions across models, data, agents, and infrastructure? Applying the ‘Routing Is Not Reasoning’ test to VAST’s stack: PolicyEngine goes BEYOND constraint enforcement (what NeMo Guardrails does at Layer 2B). Guardrails say ‘the agent cannot do X.’ PolicyEngine says ‘the agent can do X only if conditions Y and Z are met, as determined by AI-derived context, and the action will be logged with tamper-proof traceability.’ This is active policy reasoning, not static constraint enforcement. The pre-execution enforcement model means governance decisions happen BEFORE actions execute, not after — a fundamentally different posture than post-hoc audit. Polaris goes BEYOND infrastructure provisioning (what Kubernetes does at Layer 2A). Polaris doesn’t just deploy clusters — it abstracts infrastructure location so that workloads can be placed based on GPU availability and compliance requirements without changing application behavior. Combined with DataSpace abstracting data location, Polaris + DataSpace provide the multi-variable placement reasoning that the 4+1 model defines as Layer 2C. TuningEngine goes BEYOND model serving (what NemoClaw does at Layer 2B). It manages the continuous improvement of models based on production outcomes, governed by PolicyEngine. This closes the loop from execution through evaluation through improvement through re-governance — no other vendor has this cycle in a single platform. The multi-variable test from the 4+1 model: can the system make placement decisions based on cost + compliance + latency + data residency + model capability simultaneously? 
• Cost: DataEngine’s built-in cost-optimizer can deploy on CPU/GPU/DPU based on cost
• Compliance: PolicyEngine enforces permissions and audit requirements before execution
• Latency: DataSpace provides remote data caches for geo-distributed access
• Data residency: DataSpace’s per-site consistency management respects data locality
• Model capability: TuningEngine manages model versions and fitness for purpose

VAST is the only vendor in this series that addresses all five variables, even if the integration between them is not yet fully productized (PolicyEngine and TuningEngine are slated to ship by end of 2026).

Caveats remain critical:

• PolicyEngine and TuningEngine are announced, not shipped. GA end of 2026.
• Polaris is available, but multi-cluster placement orchestration is still listed under ‘expanded capabilities planned in future releases.’
• The ‘AI-derived context’ in PolicyEngine is described but the AI decision-making model is not detailed; how the AI determines permissions from context is a black box.
• The closed loop (observe-reason-act-evaluate-improve) is architecturally described but not production-validated at enterprise scale.

Compare to Google Cloud: Inference Gateway + DWS + Knowledge Catalog is productized and shipping. Google’s Layer 2C is narrower (inference placement and scheduling) but real. VAST’s is broader (governance + placement + learning) but pre-GA.

Compare to Dell: No productized Dell-owned Layer 2C is evident. Dell has the Dell + Intel ‘control plane’ signal but no product. VAST is building what Dell hasn’t started.

Compare to Kamiwaza: Policy-driven Inference Mesh + Distributed Data Engine + ReBAC is the most comparable Layer 2C approach: explicit multi-variable policy optimization for inference placement. Kamiwaza and VAST are approaching from different directions (Kamiwaza from inference, VAST from data) toward the same Layer 2C function.
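The multi-variable test in this section (cost, compliance, latency, data residency, and model capability considered together) can be made concrete with a small sketch. This is one possible formulation, assumed for illustration: treat compliance, residency, model availability, and a latency ceiling as hard constraints, then minimize cost. It does not describe how PolicyEngine, Polaris, or DataSpace actually decide. `Site`, `Request`, `place`, and all site data are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Site:
    name: str
    region: str
    cost_per_hour: float           # illustrative units
    latency_ms: float              # latency to the data the workload needs
    compliant: bool                # passes the policy check for this workload
    model_versions: FrozenSet[str]  # model versions deployable at this site


@dataclass(frozen=True)
class Request:
    allowed_regions: FrozenSet[str]  # data residency constraint
    required_model: str
    max_latency_ms: float


def place(request: Request, sites: List[Site]) -> Optional[Site]:
    """Filter on the hard constraints, then pick the cheapest remaining site."""
    eligible = [
        s for s in sites
        if s.compliant
        and s.region in request.allowed_regions
        and request.required_model in s.model_versions
        and s.latency_ms <= request.max_latency_ms
    ]
    if not eligible:
        return None
    # Cost is the objective; latency breaks ties.
    return min(eligible, key=lambda s: (s.cost_per_hour, s.latency_ms))


# Made-up sites for illustration.
sites = [
    Site("cloud-us", "us", 2.0, 40.0, True, frozenset({"m-v2"})),
    Site("onprem-fra", "eu", 5.0, 5.0, True, frozenset({"m-v1", "m-v2"})),
    Site("neocloud-ams", "eu", 3.0, 20.0, True, frozenset({"m-v2"})),
    Site("cloud-eu-cheap", "eu", 1.0, 15.0, False, frozenset({"m-v2"})),
]
chosen = place(Request(frozenset({"eu"}), "m-v2", 25.0), sites)
```

In this example the cheapest site overall is rejected on compliance and the cheapest compliant site is rejected on residency, so the answer depends on evaluating all variables at once. Optimizing any single variable in isolation would pick a different site.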
### Borrowed Judgment

Low: the lowest Layer 2C borrowed judgment score in the assessment series because VAST is the only vendor building a proprietary Layer 2C. PolicyEngine, Polaris, DataSpace, and TuningEngine are all VAST IP. The NVIDIA dependency is limited to NeMo Data Designer for model training within TuningEngine, which is a training tool, not a governance dependency.

The DAPM classification is Ceded to VAST. The enterprise doesn’t own the control plane; VAST does. But the 4+1 model’s DAPM framework reveals an important distinction:

• Dell at Layer 2C: ABSENT. No authority exists to Cede or Retain. The enterprise operates without governance.
• VAST at Layer 2C: CEDED. Authority exists and is Ceded to VAST. The enterprise has governance but doesn’t own it.
• Google at Layer 2C: CEDED. Authority exists and is Ceded to Google. Productized and shipping.

Absent is worse than Ceded. Having governance you don’t own is better than having no governance at all. The enterprise architect’s decision is not ‘should we have Layer 2C?’ (the 4+1 model says yes). It’s ‘who should hold the Layer 2C authority?’

VAST’s answer: VAST holds it, as part of a vertically integrated platform where governance, execution, data, and infrastructure are one system.

Dell’s non-answer: nobody holds it. Build it yourself or operate without it.

Google’s answer: Google holds it, as part of a cloud platform where the enterprise Cedes everything.
### Working Notes

Analyst and media framing validates the Layer 2C classification:

• Blocks and Files: ‘VAST broadens AI platform push with control plane’ (explicitly using control plane language)
• Moor Insights: Polaris positions VAST as ‘the persistent operational layer for AI’
• theCUBE Research (Strechay): ‘VAST looks at it as going up the stack’
• TipRanks: PolicyEngine + TuningEngine transform the AI OS into a ‘consolidated stack merging data, compute, policy enforcement, and model training into one infrastructure fabric’
• SiliconANGLE: Polaris is ‘complementary to DataSpace — DataSpace focuses on how data is presented across locations; Polaris focuses on how clusters are deployed and maintained’

This validates the Enterprise AI Control Plane working document’s Pattern 4: Storage Vendors Reaching Up. VAST is the most aggressive example.

The three-vector convergence:

• Dell: bottom-up (infrastructure OEM, hasn’t started Layer 2C, Dell+Intel signal only)
• Google: top-down (cloud provider, Inference Gateway + DWS + Knowledge Catalog, productized)
• VAST: middle-out (data platform, PolicyEngine + Polaris + DataSpace + TuningEngine, emerging)

The Microsoft Agent Governance Toolkit (April 2026) and the AgentGuardian research paper validate that agent governance is becoming a recognized engineering discipline, not just a VAST-specific product category. The 4+1 model’s Layer 2C aligns with this emerging discipline.

## ◑ Layer 3 (+1): AI Application Layer — The Value Plane

*AI-powered business capabilities — business logic, workflow automation*

**Status:** Focused Ecosystem

### Vendor-Provided Components

**VAST Cosmos Community (Unified Partner Program)** [DAPM: Delegated]
Global partner ecosystem formalized at VAST Forward 2026 with distinct partner tracks: Software Partners (ISVs) build integrations and validated solutions on the AI OS. Hardware/Platform Partners (Cisco, Supermicro, HPE, Lenovo) deliver infrastructure foundations.
Cloud Partners (CSPs, neoclouds, hyperscalers). Channel Partners (resellers, SIs, advisory). Developer Community with technical resources, learning pathways, hands-on labs, and contribution opportunities for integrations and blueprints. **TwelveLabs Partnership (Video AI)** [DAPM: Delegated] First customer-managed deployment path for TwelveLabs’ video foundation models (Marengo for embeddings/search, Pegasus for deep video understanding). Extends video intelligence beyond public cloud into on-prem and sovereign environments. Target: media companies, financial services (surveillance-based fraud detection), government agencies (data sovereignty). TwelveLabs gains on-prem deployment; VAST gains a compelling vertical anchor use case. **CrowdStrike Integration (AI Lifecycle Security)** [DAPM: Delegated] Strategic partnership connecting VAST AI OS telemetry to CrowdStrike Falcon platform. Coordinated detection across data ingestion, model training, and runtime inference environments. Deeper than perimeter security — integrated into PolicyEngine with event-driven automation for threat detection and response. Unified policy management, agent governance, encryption, and regulatory reporting. **CoreWeave (Hyperscale Anchor)** [DAPM: Delegated] $1.17B commercial agreement. CoreWeave’s EVP of Product and Engineering presented at VAST Forward 2026 on operating at thousands-of-GPU scale. CoreWeave validates that the VAST platform handles the data coordination requirements of hyperscale AI: predictable data movement, reuse/staging without latency spikes, and consistent flow that enables schedulers to make reliable decisions. The constraint at CoreWeave scale: ‘adding more GPUs doesn’t recover performance if data flow breaks — it amplifies the inefficiency.’ **NVIDIA NIM + Open Models (on VAST)** [DAPM: Delegated] VAST can serve NVIDIA NIM containers and open models on its platform via AgentEngine and CNode-X. 
Provides access to NVIDIA’s model ecosystem without requiring NemoClaw/OpenShell. AgentEngine is the primary runtime surface; NIM is an optional model delivery mechanism.

### NVIDIA-Provided Components

**NVIDIA Model Ecosystem Access**
CNode-X and AgentEngine provide the substrate for running NVIDIA NIM containers, Nemotron models, and NVIDIA Blueprints. VAST’s model-agnostic architecture means NVIDIA models are one option among many, not the mandatory runtime path.

### Gap Analysis

VAST’s Layer 3 ecosystem is structurally different from Dell’s, not just smaller.

Dell’s ecosystem is broad and horizontal: OpenAI (dev productivity), Palantir (operational AI), Google (sovereign compute), ServiceNow (workflow automation), SpaceXAI (enterprise assistant), Hugging Face (model hub), Mistral, Reflection, Poolside, UneeQ, and Fogsphere, with 5,000+ deployment customers. Dell’s ecosystem breadth compensates for infrastructure gaps: ISV partners provide Layer 2B functions (Cohere North, DataRobot, ClearML) that Dell’s platform lacks.

VAST’s ecosystem is focused and vertical: CoreWeave (hyperscale validation), TwelveLabs (video AI), CrowdStrike (AI lifecycle security), plus the Cosmos Community structure for SIs, CSPs, and ISVs. VAST’s ecosystem depth reflects platform self-sufficiency: ISV partners provide application use cases, not infrastructure functions, because the platform handles Layers 1A through 2C internally.

The Dell comparison at Layer 3 reveals the DAPM trade-off between the two architectures:

• Dell’s Layer 3 partners provide both application logic AND infrastructure-level functions (agent orchestration, governance, GPU scheduling). The partner ecosystem is load-bearing: remove Cohere North and Dell loses agent workflow orchestration.
• VAST’s Layer 3 partners provide application logic ONLY. The partner ecosystem is additive: remove TwelveLabs and VAST loses a vertical use case but not a platform capability.
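
The additive versus load-bearing distinction in the two bullets above can be expressed as a simple set check. The sketch below is illustrative only: the `Partner` type, the `classify` function, and the layer sets assigned to each platform and partner are assumptions made for the example (the Dell set in particular is a placeholder that merely omits Layer 2B, as the gap analysis describes), not data from the assessment.

```python
from dataclasses import dataclass

# Infrastructure layers of the 4+1 model; "3" is the application (+1) layer.
INFRA_LAYERS = frozenset({"0", "1A", "1B", "1C", "2A", "2B", "2C"})


@dataclass(frozen=True)
class Partner:
    """Hypothetical record: a partner and the layers where it supplies function."""
    name: str
    layers: frozenset


def classify(partner: Partner, platform_layers: frozenset) -> str:
    """Return 'load-bearing' if the partner fills an infrastructure layer
    the platform does not cover itself, otherwise 'additive'."""
    gaps_filled = (partner.layers & INFRA_LAYERS) - platform_layers
    return "load-bearing" if gaps_filled else "additive"


# Assumed layer coverage, for illustration only.
vast_platform = frozenset({"1A", "1B", "1C", "2A", "2B", "2C"})
dell_platform = frozenset({"0", "1A", "1B", "1C", "2A"})  # placeholder; omits 2B

cohere_north = Partner("Cohere North", frozenset({"2B", "3"}))
twelvelabs = Partner("TwelveLabs", frozenset({"3"}))

dell_result = classify(cohere_north, dell_platform)
vast_result = classify(twelvelabs, vast_platform)
print(dell_result, vast_result)
```

Under these assumed inputs, removing a partner classified as load-bearing removes a platform function, while removing an additive partner removes only a use case.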
This is the cleanest DAPM distinction in the comparative assessment: Dell’s ecosystem is structurally necessary. VAST’s ecosystem is strategically valuable. The ecosystem risk is real: enterprise buyers evaluate ecosystem breadth as a proxy for platform maturity. Dell’s partner roster with OpenAI, Google, Palantir, and ServiceNow signals broad market acceptance and Fortune 500 validation. VAST’s smaller roster signals focused applicability — strong in hyperscale and data-intensive verticals (media, financial services, government), less proven in general enterprise workflows. The CoreWeave validation deserves specific attention: at thousands-of-GPU scale, CoreWeave’s constraint is not compute but data coordination. ‘Adding more GPUs doesn’t recover performance if data flow breaks — it amplifies the inefficiency.’ This is a direct validation of the 4+1 model’s thesis that the data plane (Layers 1A/1B/1C) is the binding constraint for AI at scale, not the compute plane (Layer 0). ### Borrowed Judgment Distributed across partners at Layer 3, which is architecturally correct. VAST’s structural advantage: because the platform provides its own Layers 1A through 2C, Layer 3 partners bring only application logic and vertical use cases. They don’t need to bring infrastructure capabilities, and they don’t need to fill platform gaps. Compare to Dell: Dell’s Layer 3 borrowed judgment is distributed across partners who provide BOTH application logic AND infrastructure-level functions. The distinction is whether partners are additive (VAST) or load-bearing (Dell). The CrowdStrike integration is worth classifying separately: it operates at both Layer 3 (application-level security) and Layer 1A (data-layer monitoring). The integration connects AI OS telemetry to CrowdStrike Falcon for coordinated detection across the full AI lifecycle. This crosses layer boundaries — appropriate because security is a cross-cutting concern, not a single-layer function. 
### Working Notes The $30B valuation, $4B+ bookings, and $500M+ CARR signal investor confidence. The CoreWeave $1.17B agreement anchors the customer base at hyperscale. But the enterprise deployment base is smaller than Dell’s 5,000+ and skews toward neoclouds and data-intensive verticals rather than traditional enterprise IT. The developer community track within Cosmos (hands-on labs, learning pathways, blueprint contributions) parallels Dell’s Enterprise Hub on Hugging Face but with a different emphasis: Dell’s Hub is about model access; VAST’s Cosmos is about building on the platform. The TwelveLabs partnership demonstrates the ‘customer-managed deployment’ model that is becoming the pattern for AI model companies moving from cloud-only to hybrid: the model runs on customer infrastructure (VAST AI OS) rather than the model company’s cloud. Dell’s equivalent is OpenAI Codex connecting to the Dell AI Data Platform and SpaceXAI Grok on-premises. The pattern is the same; the deployment substrate is different. ════════════════════════════════════════════════════════════════════════════════ # VMware Private AI Foundation with NVIDIA Mapped to the 4+1 Layer AI Infrastructure Model **Version:** v1.0 — Initial Assessment **Date:** May 22, 2026 **Source:** VMware Explore 2025, VCF 9.0/9.1 announcements, Broadcom press releases, VCF Private AI blog series, published 4+1 model ## Summary Finding VMware Private AI Foundation with NVIDIA occupies a structurally unique position in this assessment series: it is neither an infrastructure OEM (Dell, HPE), a hyperscaler (AWS, Google Cloud), nor a data platform vendor (VAST). It is a virtualization and private cloud platform — the abstraction layer that sits between physical infrastructure and workloads. Broadcom’s strategic thesis is that VCF is ‘the permanent abstraction layer between AI software and physical chips,’ and the Private AI Foundation extends that thesis into AI workloads specifically. 
The 4+1 model reveals both the power and the limits of this position. VCF’s strength is Layer 2A — infrastructure orchestration is VMware’s heritage and its deepest IP. VCF Automation, vSphere Supervisor, VKS, vSAN, NSX/vDefend, and VCF Operations collectively provide the most mature unified orchestration surface for mixed workloads (VMs, containers, AI) of any on-prem vendor assessed. No other vendor in this series manages GPU-accelerated AI workloads, Kubernetes clusters, and traditional VMs from a single control plane with equivalent operational maturity. At Layer 0, VMware’s multi-accelerator management (AMD, NVIDIA, Intel) requires careful contextualization. GPU vendor choice is not unique to VMware — HPE’s GX5000 supports NVIDIA and AMD blades in the same rack, and hyperscalers fully abstract accelerators at the service layer (a developer calling Vertex AI or Bedrock never sees which silicon powers the response). VMware’s actual differentiator is the level of architectural control: operators manage GPU placement, isolation, and scheduling through familiar vSphere primitives (vGPU profiles, vmclasses, DRS, resource pools). The control plane is borrowed judgment — it is VMware’s opinionated virtualization model applied to acceleration — but that opinion provides stronger knobs that appeal to operators already comfortable with virtualization management. Where hyperscalers abstract the accelerator away from the architect, VMware puts the architect in the driver’s seat through a familiar console. But the closer the stack gets to AI-specific functions — model serving, retrieval, agent execution, governance — the more authority shifts to NVIDIA (Layer 2B runtime via NVIDIA AI Enterprise), to open-source components (pgvector, Elasticsearch), or to capabilities that are emerging but not yet at the depth of purpose-built alternatives. 
Private AI Services (Model Runtime, Agent Builder, Data Indexing/Retrieval, Vector Database, Model Store) are genuine platform capabilities delivered as part of the VCF subscription, but they are foundational AI services, not the deep data lifecycle or agent orchestration that Dell (Dataloop), HPE (Ezmeral/Kamiwaza), or VAST (DataEngine/AgentEngine) provide. Layer 1C (data pipelines) is absent. Layer 2C (reasoning plane) has building blocks — MCP Server Governance, GPU/Model Metrics, Intelligent Assist — but none passes the ‘Routing Is Not Reasoning’ test. The installed base is the strategic moat: nine of the top ten Fortune 500 companies have committed to VCF, with 100M+ cores licensed worldwide. For the enormous VMware installed base, Private AI Foundation is the lowest-friction path to on-prem AI — no new infrastructure vendor, no new management plane, no new operational model, and no incremental cost beyond GPU hardware. The 4+1 question is whether lowest-friction adoption translates to sufficient architectural depth when agentic AI workloads demand governance, policy-driven placement, and cross-agent orchestration that VCF does not yet provide. VMware Private AI Foundation is the enterprise’s most natural on-ramp to private AI. Hock Tan’s ‘permanent abstraction layer’ framing is a Layer 2A statement, not a Layer 2C statement — the abstraction layer manages resources; the reasoning plane governs them. VMware has the former; it does not have the latter. Whether it becomes the enterprise’s durable AI platform depends on whether Broadcom invests in the Layer 1A governance depth, Layer 1C pipeline capability, and Layer 2C reasoning plane that the 4+1 model identifies as structurally necessary — or whether the Broadcom acquisition thesis (cash generation from the installed base) constrains that investment. 
VMware’s unique structural advantage is that VCF sees everything from the hypervisor up across all OEM hardware — the data to build a multi-vendor reasoning plane exists. The engineering commitment does not yet. ## ◑ Layer 0: Compute & Network Fabric *Raw compute, networking, and acceleration fabric* **Status:** Hardware-Agnostic Abstraction ### Vendor-Provided Components **VMware vSphere 8/9 (Hypervisor)** [DAPM: Retained] Industry-standard virtualization layer. GPU passthrough and vGPU support via NVIDIA AI Enterprise integration. vSphere Supervisor manages both VMs and Kubernetes workloads from a single control plane. vMotion for live migration of AI workloads. DRS for automated load balancing. The hypervisor is Broadcom’s foundational IP — the abstraction layer between physical infrastructure and all workloads above. **Multi-Vendor Hardware Support** [DAPM: Retained] VCF runs on Dell PowerEdge, HPE ProLiant, Lenovo ThinkSystem, Cisco UCS, Supermicro, NEC, Fujitsu, and others. Hardware-agnostic by design — VCF does not manufacture or specify compute. This is the fundamental architectural difference from Dell AI Factory or HPE Private Cloud AI: VMware abstracts hardware, OEMs provide it. The enterprise retains hardware vendor choice. **NVIDIA GPU Integration (vGPU + Passthrough)** [DAPM: Delegated] VCF 9.1 supports NVIDIA Blackwell architecture, NVSwitch on HGX platform, GPUDirect RDMA over InfiniBand for distributed LLM inference across multiple HGX servers. Enhanced DirectPath I/O for ConnectX-7 NICs and BlueField-3 DPUs. vGPU profiles mapped to vSphere Namespaces as vmclasses for multi-tenant GPU isolation — memory strictly partitioned per tenant with zero side-channel risk across GPU framebuffer. **NSX / vDefend Networking** [DAPM: Retained] Software-defined networking with micro-segmentation, Zero Trust enforcement via Distributed Firewall (Antrea CNI for Kubernetes), and in-memory malware defense. 
Avi Load Balancer provides virtualized load balancing for AI inference endpoints and agentic applications — eliminates hardware appliance requirements. Post-quantum cryptography support. **Multi-Accelerator Management (AMD + NVIDIA + Intel)** [DAPM: Retained] VCF 9.1 manages AMD, NVIDIA, and Intel accelerators through the same virtualization control plane — vGPU profiles, vmclasses, DRS policies, resource pools. The architect retains granular control over GPU placement, isolation, and scheduling using familiar vSphere primitives. This is not unique as a multi-vendor GPU capability (HPE GX5000 supports NVIDIA Rubin + AMD MI430X in the same rack; hyperscalers fully abstract accelerators at the service layer). VMware’s differentiator is the level of architectural control: operators already comfortable with virtualization management get GPU scheduling knobs they know how to turn. The control plane is opinionated — it applies VMware’s virtualization model to acceleration — but that opinion is the value for VMware-native shops. ### NVIDIA-Provided Components **NVIDIA AI Enterprise (NVAIE)** Enterprise AI software suite providing vGPU drivers, GPU Operator for Kubernetes, and validated AI frameworks. NVAIE licenses purchased separately from VCF. Deeply integrated but independently licensed — the same NVIDIA dependency Dell and HPE share. **NVIDIA GPU Silicon + Networking** Blackwell, H100/H200, ConnectX-7/8, BlueField-3 DPUs, NVSwitch, InfiniBand. VMware validates and integrates but does not manufacture or specify GPU silicon. **NVIDIA NIM Microservices** Pre-built inference microservices deployable on Private AI Foundation. Nemotron models and community models available through Model Store. ### Gap Analysis VMware’s Layer 0 position is fundamentally different from every other vendor in this assessment: VMware provides the abstraction layer, not the physical infrastructure. Dell, HPE, and VAST own or specify hardware. Google and AWS own data centers. 
VMware sits above all of them. This creates a unique DAPM profile: the enterprise Retains hardware vendor choice (can switch from Dell to HPE to Lenovo without changing the management plane) but Delegates GPU runtime to NVIDIA and inherits whatever GPU integration VMware has validated. The abstraction is VMware’s value proposition and its architectural constraint — VMware can only support GPU features that the hypervisor can virtualize or pass through. Multi-accelerator support requires nuanced comparison across the assessed vendors. GPU vendor choice is NOT unique to VMware: • Hyperscalers (Google, AWS) fully abstract accelerators at the service layer — developers call Vertex AI or Bedrock and never see whether TPUs, Trainium, or NVIDIA GPUs power the response. Accelerator choice is Ceded to the cloud provider. Simplest developer experience, least architectural control. • HPE GX5000 supports NVIDIA Rubin and AMD MI430X GPU blades in the same rack architecture. Multi-vendor at the hardware level. • Dell AI Factory is NVIDIA-only. ‘Dell AI Platform with AMD’ is a separate branding, separate software stack — not a unified runtime. • VAST CNode-X is NVIDIA-only. VMware’s actual differentiator is the level of architectural control over acceleration: VCF manages AMD, NVIDIA, and Intel accelerators through familiar virtualization primitives (vGPU profiles mapped to vmclasses, DRS for GPU workload balancing, vMotion for live migration, resource pools for multi-tenant isolation). The enterprise architect retains granular control over GPU placement, scheduling, and isolation using tools they already operate. The control plane is borrowed judgment — it is VMware’s opinionated virtualization model applied to GPU resources — but that opinion provides stronger knobs that appeal specifically to operators already comfortable with vSphere management. 
Where hyperscalers abstract the accelerator away from the architect, VMware puts the architect in the driver’s seat through a familiar console. The vDefend security story is architecturally significant for AI: micro-segmentation at the packet level between every AI component (model server, vector database, embedding service, API gateway) with Terraform-codified firewall rules. This is infrastructure-layer Zero Trust for AI workloads — a capability that Dell and HPE don’t provide at equivalent depth from the platform layer. VAST’s CrowdStrike integration operates at a different level (application/data-layer security vs. network-layer micro-segmentation). ### Borrowed Judgment Multi-directional: VMware borrows GPU silicon judgment from NVIDIA (same as everyone) and hardware engineering judgment from OEM partners (Dell, HPE, Lenovo build the servers). But VMware retains the abstraction layer — the hypervisor, the networking, the security model, the orchestration. This is the inverse of Dell’s position: Dell retains hardware judgment and borrows software judgment from NVIDIA. VMware retains software judgment and borrows hardware judgment from OEMs. Critically, the VMware control plane is itself borrowed judgment for the enterprise: the architect gains granular GPU management through vSphere primitives, but those primitives encode VMware’s opinions about how acceleration should be virtualized, scheduled, and isolated. The enterprise borrows VMware’s virtualization worldview in exchange for operational familiarity. This is a different trade-off than the hyperscalers (where the enterprise cedes acceleration decisions entirely) or bare-metal (where the enterprise retains full control but builds everything). VMware occupies the middle: more control than cloud, less effort than bare-metal, but through an opinionated lens. The NVIDIA AI Enterprise dependency is real but no deeper than Dell’s or HPE’s: all three require NVAIE for GPU virtualization and AI framework support. 
VMware’s co-engineering relationship with NVIDIA on VCF integration is comparable to HPE’s Private Cloud AI co-engineering. ### Working Notes The Broadcom acquisition context is impossible to ignore at Layer 0: Gartner projects VMware’s virtualization market share will fall from 70% (2024) to 40% (2029) due to pricing changes. Nutanix CEO has publicly targeted 165,000 of VMware’s approximately 300,000 customers. Broadcom has converted 90%+ of the top 10,000 VMware customers to VCF subscriptions with 200-500% price increases reported. This creates a unique installed-base dynamic: Private AI Foundation’s market opportunity is less about winning new customers than about retaining existing ones by making VCF indispensable for AI workloads. If the enterprise is already paying for VCF, Private AI Services come at no additional cost — a fundamentally different go-to-market than Dell (buy new PowerEdge + NVIDIA), HPE (buy new Private Cloud AI), or VAST (deploy new AI OS). The air-gapped deployment support is significant for regulated industries and government — same capability Dell and HPE emphasize, delivered through the existing VCF automation framework rather than a purpose-built AI appliance. ## ◑ Layer 1A: Data Storage & Governance *Durable, governed data foundation — the Governance Catalog that Layer 2C queries* **Status:** Platform Storage, Not AI-Native ### Vendor-Provided Components **vSAN (HCI Storage)** [DAPM: Retained] Hyper-converged storage integrated into VCF. Block and file storage natively. Native Object Storage (S3-compatible) in tech preview with VCF 9.1.x — brings S3 interface natively into the platform without third-party licensing. vSAN deduplication and compression for cost reduction. Unified storage policies and multi-tenant self-service access. **vSAN for Recovery + Ransomware Recovery** [DAPM: Retained] Sovereign, in-place ransomware recovery using native snapshot capabilities. Deep snapshot chains and integrated replication workflows. 
On-prem recovery without external dependencies. **Model Store (Private AI Services)** [DAPM: Retained] Curated LLM repository with integrated RBAC access control. MLOps teams and data scientists can securely manage and provide LLMs with governance and security for enterprise data and IP. NVIDIA models, Nemotron, and community models available. **External Storage Integration** [DAPM: Delegated] VCF supports Dell PowerScale, Dell ObjectScale, NetApp ONTAP, Pure Storage, HPE Alletra, and other enterprise storage via vSphere APIs. The storage layer is not limited to vSAN — enterprises can bring existing storage investments. This is a heterogeneous storage approach vs. Dell’s vertically integrated storage (PowerScale/ObjectScale/Exascale) or VAST’s collapsed storage (Element Store). ### NVIDIA-Provided Components **No Direct NVIDIA Layer 1A Dependency** NVIDIA does not provide storage or governance components in the VMware Private AI stack. Storage is VMware-owned (vSAN) or enterprise-chosen (external arrays). ### Gap Analysis VMware’s Layer 1A is fundamentally different from every other assessed vendor because VMware is not a storage company. vSAN provides competent hyper-converged storage with the VCF 9.1 addition of native S3 object storage, but this is general-purpose platform storage — not AI-optimized data infrastructure. Compare to Dell: MetadataIQ indexes billions of files with AI-specific metadata enrichment. Exascale provides 10+ PB/rack unified file+object+fast-file. Trust3 AI provides storage-layer governance for sensitive data discovery. Compare to HPE: Data Fabric v8.1 provides policy-based data placement with Apache Polaris catalog for Iceberg tables. Alletra B10000 provides real-time agentic storage support with semantic understanding. Compare to VAST: Element Store collapses file, object, table, and vector into a single governed data structure with inline metadata enrichment. 
VMware’s storage story is ‘bring your existing storage’, which is pragmatic for the installed base but means Layer 1A governance (metadata richness, data lineage, policy-based placement) depends entirely on whichever external storage vendor the enterprise has deployed. VMware itself provides no AI-specific governance catalog, no metadata enrichment, no data lineage tracking.

The Model Store capability is a Layer 1A function worth noting: an RBAC-governed model repository is a governance primitive that Dell’s AI Factory lacks as a platform-native capability. But Model Store governs models, not data. It does not address the broader question of which data feeds which model under what compliance constraints.

### Borrowed Judgment

Low for vSAN (VMware-owned). High for AI-specific storage governance, which is entirely dependent on whichever external storage vendor the enterprise deploys. If the enterprise runs Dell storage, it inherits Dell’s governance capabilities (MetadataIQ). If it runs NetApp, it inherits NetApp’s. VMware provides no abstraction or unification of storage governance across heterogeneous backends.

This is the inverse of VMware’s Layer 0 strength: at Layer 0, VMware abstracts heterogeneous hardware into a unified management plane. At Layer 1A, VMware does NOT abstract heterogeneous storage governance into a unified governance plane. The storage abstraction stops at provisioning and capacity management; it does not extend to metadata, lineage, or policy.

### Working Notes

The native S3 Object Storage in VCF 9.1.x (tech preview) is a strategic move: S3 compatibility is the lingua franca of AI data pipelines. Every vendor in this assessment provides S3 access (Dell ObjectScale, HPE Alletra X10000, VAST DataStore, AWS S3, Google Cloud Storage). VMware adding native S3 to vSAN reduces the dependency on external object storage for AI workloads.
The Tanzu Marketplace integration provides a curated path to certified middleware and data services — this is VMware’s approach to ecosystem curation at the data layer, comparable in intent (not depth) to HPE’s Unleash AI program or VAST’s Cosmos Community. SQL Server DBaaS as a first-class VCF citizen is a pragmatic enterprise play — most enterprises have SQL Server deployments, and making it a platform service reduces the friction of data access for AI workloads. ## ◑ Layer 1B: Context Management & Retrieval *Low-latency retrieval for RAG — vector/hybrid search, context windows* **Status:** Foundational RAG Services ### Vendor-Provided Components **Data Indexing & Retrieval (Private AI Services)** [DAPM: Retained] Index and maintain multiple data sources, making them readily available for consumption by AI applications. Integrated with Model Runtime for RAG workflows. Keeps indexed data current as sources change. **Vector Database (Private AI Services)** [DAPM: Retained] pgvector on PostgreSQL delivered via Data Services Manager with VMware enterprise-level support. Enables domain-specific, up-to-date context for AI models. PostgreSQL 16.8 with pgvector 0.8.0 extension in Private AI Services 2.1. **RAG Pipeline Integration** [DAPM: Delegated] NVIDIA NIM RAG Blueprint v2.5.0 validated on VCF — production-grade, multi-model RAG pipeline. Pre-built catalog items in VCF Automation for deploying complete RAG workflows. Elasticsearch supported as external vector database for advanced retrieval scenarios. ### NVIDIA-Provided Components **NVIDIA NIM + NeMo Retriever** Inference microservices and retrieval-augmented generation components. RAG Blueprints provide pre-built retrieval patterns. Same capabilities available on Dell and HPE platforms. **NVIDIA AI Enterprise RAG Stack** Validated software stack for RAG workflows on VMware Private AI Foundation. GPU-accelerated embedding generation and retrieval. 
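
To make the vector retrieval step concrete, here is a minimal pure-Python sketch of top-k retrieval by cosine distance, the ordering that pgvector exposes through its `<=>` operator. It is not VMware’s or NVIDIA’s implementation; the document IDs, the three-dimensional embeddings, and the function names are invented for illustration, and a real deployment would run the equivalent query inside PostgreSQL over model-generated embeddings.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, rows, k):
    """Return the k (doc_id, embedding) rows closest to the query.

    Roughly what a pgvector query of the form
    `ORDER BY embedding <=> :query LIMIT :k` computes, minus indexing.
    """
    return sorted(rows, key=lambda row: cosine_distance(query, row[1]))[:k]


# Hypothetical indexed documents with toy 3-dimensional embeddings.
docs = [
    ("gpu-sizing", [1.0, 0.0, 0.0]),
    ("backup-policy", [0.0, 1.0, 0.0]),
    ("vgpu-profiles", [0.9, 0.1, 0.0]),
]

hits = top_k([1.0, 0.0, 0.0], docs, k=2)
hit_ids = [doc_id for doc_id, _ in hits]
print(hit_ids)
```

The retrieved rows are what a RAG workflow would pass to the model runtime as context; the sketch does a full scan, whereas a production vector database would use an approximate index.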
### Gap Analysis

VMware’s Layer 1B provides functional RAG capabilities through Private AI Services: Data Indexing/Retrieval and Vector Database are genuine platform services, not just partner integrations. But the retrieval stack is foundational, not differentiated.

pgvector on PostgreSQL is a competent vector database for moderate-scale use cases but lacks the performance characteristics of purpose-built alternatives. Compare to Dell’s Data Search Engine (Elasticsearch 9.4 with GPU-accelerated hybrid search, MetadataIQ integration). Compare to VAST’s InsightEngine (native to the data platform, no data movement for retrieval). Compare to HPE’s Alletra X10000 with KV cache storage support for inference state persistence.

The Data Indexing & Retrieval service addresses the core RAG requirement (keeping context current as data sources change) but without the metadata richness or governance integration that Dell (MetadataIQ + Elastic), HPE (Data Fabric + Kamiwaza), or VAST (Catalog + InsightEngine) provide. No retrieval quality observability (recall@k, latency percentiles) is evident, the same gap identified in the Dell assessment. A Layer 2C placement engine would need retrieval quality metrics to make informed routing decisions.

### Borrowed Judgment

Moderate. VMware owns the Data Indexing/Retrieval and Vector Database services. RAG pipeline patterns depend on NVIDIA NIM/Blueprints (same dependency as Dell and HPE). Elasticsearch as an external vector database option introduces the same Elastic dependency Dell has: search intelligence is Elastic’s, not VMware’s.

The pgvector choice is notable: PostgreSQL is among the most widely deployed enterprise databases. By building on pgvector, VMware reduces adoption friction (most enterprises already have PostgreSQL expertise) at the cost of a lower retrieval performance ceiling. Dell chose Elasticsearch (higher performance, more complex). VAST built its own (highest integration, most proprietary). VMware chose the most pragmatic option.
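
The retrieval quality metrics named in the gap analysis (recall@k, latency percentiles) are simple to compute once query results and timings are logged. The sketch below shows one common definition of each, under assumptions: the function names and sample data are hypothetical, and the percentile uses the nearest-rank method, which is one of several accepted conventions.

```python
import math


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant items that appear in the top-k retrieved list."""
    if not relevant:
        raise ValueError("relevant set must be non-empty")
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list, for 0 < p <= 100."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical evaluation data: one query's ranked results and ground truth.
retrieved_ids = ["a", "b", "c", "d"]
relevant_ids = {"a", "d", "e"}
recall = recall_at_k(retrieved_ids, relevant_ids, k=3)

# Hypothetical per-query retrieval latencies in milliseconds.
latencies_ms = [12, 15, 11, 90, 14, 13, 16, 18, 17, 200]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
print(recall, p50, p95)
```

Metrics of this kind are what a Layer 2C placement engine would need as input before it could route retrieval traffic on quality rather than availability alone.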
### Working Notes The OpenWebUI integration with Private AI Services RAG demonstrates VMware’s approach to Layer 1B: provide the retrieval infrastructure, let the enterprise choose the user-facing application layer. This is consistent with VMware’s platform philosophy — VMware provides infrastructure services, not applications. The RAG Blueprint validation on VCF (multi-model, production-grade, 8x NVIDIA H100 80GB GPUs) provides a concrete reference architecture that enterprises can deploy from VCF Automation catalog items. This is operationally simpler than assembling equivalent RAG infrastructure on bare-metal Dell or HPE hardware. ## ○ Layer 1C: Data Movement & Pipelines *Move/transform data — policy-driven placement, lineage, cost-aware movement* **Status:** Gap ### Vendor-Provided Components **VCF Automation Catalog Items** [DAPM: Retained] Pre-built self-service catalog items for AI workload deployment: Deep Learning VMs (PyTorch, TensorFlow pre-installed), AI Kubernetes clusters with GPU worker nodes, Triton Inference Servers. Quickstart templates eliminate weeks of manual setup. These are deployment pipelines, not data pipelines. **Tanzu / VKS Pipeline Support** [DAPM: Retained] VKS 3.6 with GitOps-based infrastructure and application management. Tanzu Platform provides container runtime and application deployment. Pipeline orchestration tools (Airflow, Kubeflow, MLflow) can run on VKS but are not VMware-provided — the enterprise must bring its own ML pipeline stack. **Enhanced NVMe Memory Tiering (VCF 9.1)** [DAPM: Retained] Intelligently tiers DRAM and NVMe for memory-bound AI and database workloads. Topology Aware Scheduling places workloads with NUMA and accelerator locality. These are infrastructure-level data movement optimizations, not AI data pipeline orchestration. ### NVIDIA-Provided Components **NVIDIA Blueprints** Pre-built AI application patterns deployable through VCF Automation. 
Pipeline templates, not pipeline infrastructure (same as Dell and HPE).

**NVIDIA CMX (Future)**
KV cache management for context memory offload. If integrated with VCF, it could provide the same KV cache tiering Dell has validated (19x TTFT improvement).

### Gap Analysis

Layer 1C is VMware’s most significant gap relative to other assessed vendors. VMware provides no equivalent to:

• Dell’s Data Orchestration Engine (Dataloop): No-code/low-code AI data lifecycle management, Dell’s most meaningful software acquisition.
• HPE’s Ezmeral Unified Analytics: Enterprise-hardened ML pipeline stack (Airflow, Kubeflow, Ray, Feast, MLflow, Spark).
• HPE’s Data Fabric: Policy-based data placement with compliance tagging and data lineage.
• VAST’s DataEngine: Serverless data transformation with CLI/SDK, built-in observability, triggers, and automated pipelines.

VCF Automation provides deployment pipelines (standing up AI infrastructure) but not data pipelines (moving, transforming, and governing data through ML workflows). The enterprise running VMware Private AI Foundation must bring its own data pipeline orchestration (Airflow, Kubeflow, or a commercial alternative) and deploy it on VKS.

This is architecturally consistent with VMware’s platform philosophy: VCF provides infrastructure services, not application-layer data engineering tools. But it leaves a functional gap that competitors have filled. An enterprise choosing VMware for AI inherits a Layer 1C assembly problem that Dell (Dataloop), HPE (Ezmeral), or VAST (DataEngine) partially or fully solve.

### Borrowed Judgment

High. The enterprise must borrow data pipeline judgment from whatever tools it deploys on VKS: the Apache Airflow community, the Kubeflow community, or a commercial vendor (Dataloop, Databricks, etc.). VMware provides no opinion on data pipeline architecture, no integration between pipeline metadata and infrastructure governance, and no data lineage capability.
Compare to Dell: Dell acquired Dataloop specifically to address Layer 1C. Compare to HPE: HPE assembled Ezmeral through four acquisitions (BlueData, MapR, Ampool, Arrikto). Compare to VAST: VAST built DataEngine as a native platform capability. VMware has made no equivalent investment in data pipeline IP. ### Working Notes The gap is real but may be strategic: VMware has historically succeeded by providing infrastructure primitives that partner ecosystems build on, rather than by building application-layer tooling. The question is whether AI data pipelines are infrastructure (VMware should own them) or applications (VMware should enable them). The Tanzu Marketplace could address this gap through curated data pipeline services — certified Airflow, MLflow, or Kubeflow deployments validated for VCF. This would be a Delegated approach (partner provides the capability, VMware validates the deployment) rather than a Retained approach (VMware builds the capability). Architecturally similar to HPE’s Unleash AI ecosystem model. The KV cache story is notably absent: Dell has validated NVIDIA CMX with 19x TTFT improvement on PowerScale. HPE has native KV cache storage support in Alletra X10000. VAST collocates cache and compute in CNode-X. VMware has not yet announced equivalent KV cache tiering capabilities. ## ● Layer 2A: Infrastructure Orchestration *Lifecycle management, resource scheduling, policy enforcement, unified ops* **Status:** VMware Heritage Strength ### Vendor-Provided Components **VCF Automation (formerly vRealize/Aria Automation)** [DAPM: Retained] Self-service catalog with pre-built AI workload templates. Quickstart deployment for Private AI Foundation. Infrastructure-as-code with Terraform integration. Multi-tenant resource provisioning with RBAC. Live Application Stack Blueprints for versioned, redeployable application topologies. Day 2 operations for AI Blueprints. 

**vSphere Supervisor + VKS** [DAPM: Retained] Unified management of VMs, containers, and AI workloads from a single control plane. VKS (vSphere Kubernetes Service) 3.6 supports up to 500 Kubernetes clusters per Supervisor. Simplified Container-as-a-Service for application teams. VM Fast-Deploy for accelerated provisioning. vSphere Elastic Provisioning for zero-touch fleet expansion. GitOps-based infrastructure management.

**VCF Operations (formerly vRealize/Aria Operations)** [DAPM: Retained] Private AI Model and GPU Metrics: utilization, memory pressure, and model-level visibility on the same console as the rest of the estate. Real-Time Operational Observability turns telemetry into action. Customizable dashboards for AI model and agent performance. Capacity management and compliance monitoring for AI workloads.

**SDDC Manager** [DAPM: Retained] Full-stack lifecycle management for VCF. Automated deployment, patching, and upgrades across vSphere, vSAN, NSX, and VKS. Single-pane fleet management. Expanded fleet size and upgrade scale in 9.1. This is the operational backbone, the equivalent of HPE’s GreenLake or Dell’s APEX management, but with 20+ years of enterprise maturity.

**Advanced Cyber Compliance (ACC)** [DAPM: Retained] Continuous compliance enforcement with automated drift detection and remediation. Hardened infrastructure images. Integrated with vDefend for security posture management. Disaster recovery via vSAN for Recovery.

**MCP Server Governance (VCF 9.1)** [DAPM: Retained] IT operations can centrally manage and control access to MCP tools and associated servers across their environment. Ensures user groups can only access approved MCP tools. Security guardrails for MCP servers via vDefend and Avi Load Balancer. This is a Layer 2A governance function with 2C implications: controlling which agents can access which tools.

### NVIDIA-Provided Components

**NVIDIA GPU Operator** Kubernetes operator for GPU lifecycle management.
Manages GPU drivers, container runtime, device plugins. Standard across all NVIDIA-integrated platforms.

**NVIDIA vGPU Manager** GPU virtualization profiles and multi-tenant GPU allocation. Memory partitioning and time-sliced compute scheduling. Managed through vSphere Supervisor.

### Gap Analysis

Layer 2A is VMware’s strongest layer, arguably the strongest Layer 2A of any vendor in this assessment series. The reason is operational maturity: VCF has been managing enterprise infrastructure for two decades. No other vendor assessed has equivalent depth in lifecycle management, multi-tenant orchestration, compliance automation, and unified VM/container/AI workload management.

Specific 2A differentiators vs. other assessed vendors:

• Unified workload management: Dell manages AI workloads separately from traditional workloads (OpenManage for servers, Run:ai for GPUs, separate tools for each). HPE manages AI through GreenLake Intelligence + OpsRamp + Private Cloud AI (three systems). VMware manages AI, containers, and VMs from ONE control plane (vSphere Supervisor). VAST manages only VAST workloads.
• Operational maturity: VCF’s Day 2 operations (patching, upgrades, compliance, capacity planning) for AI workloads inherit the same proven processes used for the enterprise’s existing VM fleet. No new operational model is required. Dell and HPE AI stacks require new operational processes. VAST requires an entirely new operational discipline.
• MCP Server Governance is a notable 2A/2C bridge: centrally controlling which user groups can access which MCP tools is an infrastructure-level governance function that no other on-prem vendor provides as a platform-native capability. Google’s Agent Gateway provides an equivalent capability in the cloud.

The GPU scheduling gap remains: NVIDIA GPU Operator and vGPU Manager handle GPU allocation, but policy-driven GPU scheduling (which workload gets which GPU based on cost, compliance, and performance constraints) is not a VCF-native function.
This is the same gap Dell has with Run:ai; the scheduling intelligence is NVIDIA’s, not the platform vendor’s.

### Borrowed Judgment

Low, the lowest of any layer in the VMware assessment. VCF Automation, vSphere, VKS, vSAN, NSX, VCF Operations, and SDDC Manager are all Broadcom/VMware IP. GPU scheduling is the primary borrowed judgment (NVIDIA GPU Operator + vGPU Manager), but this is the same dependency every on-prem vendor shares.

Compare to Dell Layer 2A: Dell splits 2A between OpenManage (Dell-owned) and Run:ai (NVIDIA-owned, acquired). VMware retains more 2A authority than Dell.

Compare to HPE Layer 2A: HPE’s GreenLake Intelligence is HPE-owned 2A with MCP-based agent communication. VMware’s VCF Operations is VMware-owned 2A with emerging MCP support. Both retain 2A authority; different architectural approaches (HPE: agentic mesh; VMware: traditional orchestration evolving toward agentic).

### Working Notes

The Intelligent Assist for VCF (tech preview) signals VMware’s evolution toward agentic infrastructure management. An AI-driven support assistant that diagnoses and resolves issues by consulting Broadcom’s knowledge base is functionally similar to HPE’s GreenLake Intelligence domain agents or Dell’s CloudIQ, but at an earlier stage of development.

The installed base of 100M+ licensed cores gives VMware an operational data advantage no other on-prem vendor can match: patterns learned from managing the world’s largest virtualization fleet can inform AI workload optimization in ways that newer platforms cannot. Whether Broadcom invests in leveraging this data advantage for AI-specific intelligence is an open question.

## ◑ Layer 2B: Application Runtime & Execution

*Model serving, agent execution, inference APIs, distributed inference*

**Status:** Platform-Native + NVIDIA-Dependent

### Vendor-Provided Components

**Model Runtime (Private AI Services)** [DAPM: Retained] Run inference and embedding models as a service across the organization.
API Gateway allows users and AI applications to interact with models directly via API. Multi-accelerator support: same model deployment on AMD and NVIDIA GPUs without refactoring. Model endpoints configurable through VCF Automation UI. Multi-tenant Models-as-a-Service enables secure model sharing across business units to lower costs and reduce power consumption.

**Agent Builder (Private AI Services)** [DAPM: Retained] Build AI agents in a user-friendly playground, leveraging models and knowledge bases created using other Private AI services. Integrated with Model Runtime and Data Indexing/Retrieval for end-to-end agent development. This is a platform-native agent construction surface: not as deep as VAST’s AgentEngine or Google’s Agent Studio/ADK, but integrated into the VCF operational model.

**Deep Learning VMs** [DAPM: Delegated] Pre-configured virtual machines with validated AI/ML software stacks: PyTorch, TensorFlow, Miniconda. Software stack validated in advance on NVIDIA GPUs, so data scientists start developing immediately without compatibility validation. Provisioned through VCF Automation self-service catalog.

**VKS AI Clusters** [DAPM: Retained] GPU-capable Kubernetes worker nodes for cloud-native AI/ML workloads. Triton Inference Server deployable from catalog. Distributed LLM inference with GPUDirect RDMA over InfiniBand for models that exceed single-server capacity (DeepSeek-R1, Llama 3.1-405B). This is infrastructure-layer runtime support, not an opinionated agent execution framework.

**Tanzu Platform (Application Runtime)** [DAPM: Retained] PaaS-layer application runtime. ‘You provide code, we put it into production.’ Governed agentic coding with Tanzu. Developers can self-publish AI agents and MCP servers, sharing AI applications and tools across the enterprise. MCP server publishing makes Tanzu a distribution surface for enterprise agent tooling.
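
The publishing model described for Tanzu, read together with the MCP Server Governance entry under Layer 2A, can be pictured as a catalog in which developers publish MCP servers and IT decides which user groups may reach them. The sketch below is illustrative only: `McpRegistry`, `publish`, `approve`, and `can_access` are hypothetical names rather than Tanzu or VCF APIs, and the default-deny behaviour is an assumption, not documented product behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class McpRegistry:
    """Hypothetical in-memory model of a governed MCP server catalog."""

    # server name -> developer or team that published it
    published: dict[str, str] = field(default_factory=dict)
    # server name -> user groups that IT has approved
    approvals: dict[str, set[str]] = field(default_factory=dict)

    def publish(self, server: str, developer: str) -> None:
        """A developer self-publishes a server; no group can reach it yet."""
        self.published[server] = developer
        self.approvals.setdefault(server, set())

    def approve(self, server: str, group: str) -> None:
        """IT grants one user group access to an already published server."""
        if server not in self.published:
            raise KeyError(f"{server} has not been published")
        self.approvals[server].add(group)

    def can_access(self, server: str, group: str) -> bool:
        """Assumed default deny: only explicitly approved groups get access."""
        return group in self.approvals.get(server, set())


registry = McpRegistry()
registry.publish("ticket-search", developer="dev-team-a")
registry.approve("ticket-search", group="support-agents")

print(registry.can_access("ticket-search", "support-agents"))  # True
print(registry.can_access("ticket-search", "finance-agents"))  # False
```

The sketch covers access control only. It says nothing about where an agent runs or which model serves a request, which is the distinction the Layer 2C section of this assessment draws between governance and placement.
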

### NVIDIA-Provided Components

**NVIDIA AI Enterprise Runtime** NIM inference microservices, model optimization, GPU-accelerated frameworks. The core AI runtime dependency: VMware’s Model Runtime wraps NVIDIA inference capabilities in a platform-managed service.

**NVIDIA NIM Agent Blueprints** Pre-built agentic workflows (RAG, PDF extraction, digital twins). Same blueprints available on Dell, HPE, Cisco, Lenovo. Non-differentiating for VMware at 2B.

**NVIDIA Triton Inference Server** Multi-framework model serving. Deployable as VCF Automation catalog item on GPU-capable VKS clusters.

### Gap Analysis

VMware’s Layer 2B is the most architecturally interesting in this assessment because it combines platform-native AI services (Model Runtime, Agent Builder) with NVIDIA runtime dependency, a hybrid Retained/Delegated model.

The Model Runtime is a genuine platform capability: model serving as a managed VCF service with API Gateway, multi-tenant isolation, and multi-accelerator support. This is structurally different from Dell’s 2B (entirely NVIDIA-dependent: NemoClaw/OpenShell) and closer to HPE’s 2B (HPE provides the deployment platform, NVIDIA provides the execution runtime, with HPE-owned bracketing governance above and below).

The Agent Builder is notable as a platform-native agent construction surface. Compare to alternatives:

• Dell: No platform-native agent builder. Relies on NVIDIA NIM/NemoClaw post-deployment.
• HPE: CrewAI pre-installed (partner framework). Deloitte Zora AI (partner application).
• VAST: AgentEngine, a deeply integrated, proprietary agent runtime.
• Google: Agent Studio (no-code), ADK (code-first), Agent Designer.
• AWS: Bedrock Agents (no-code), Strands SDK (code-first).

VMware’s Agent Builder is simpler than the hyperscaler offerings but it’s integrated into the VCF operational model: agents built here inherit VCF’s security (vDefend microsegmentation), governance (MCP server controls), and observability (GPU/model metrics).
That operational integration is VMware’s differentiator.

The Tanzu Platform MCP server publishing capability is a Layer 2B/2C bridge worth tracking: enabling developers to self-publish MCP servers creates an enterprise-internal agent tool marketplace governed by IT. This is a distributed model for agent capability deployment that differs from Google’s centralized Agent Registry or HPE’s curated Unleash AI ecosystem.

### Borrowed Judgment

Moderate. VMware owns Model Runtime, Agent Builder, and the Tanzu application runtime. But inference execution depends on NVIDIA AI Enterprise (same structural dependency as Dell and HPE). The multi-accelerator support (AMD + NVIDIA) provides a runtime alternative that Dell AI Factory customers don’t have (Dell’s AMD track is a separate stack), though HPE’s GX5000 also supports multi-vendor GPU blades and hyperscalers abstract accelerators entirely. VMware’s value is that the architect controls which accelerator serves which workload through familiar virtualization primitives; the control plane is opinionated (VMware’s virtualization model applied to GPUs) but provides operational knobs that vSphere-native teams already understand.

The NVIDIA dependency at 2B is real but partially mitigated by VMware’s abstraction: Model Runtime provides a VMware-managed API surface. If NVIDIA changes its NIM/NemoClaw architecture, VMware absorbs the integration change; the enterprise’s API doesn’t change. This is the same ‘bracketing’ architecture HPE uses (GreenLake governance above and below the NVIDIA runtime), expressed differently (VMware API abstraction wrapping the NVIDIA runtime).

### Working Notes

The multi-accelerator Model Runtime is significant but requires context: running the same AI model on AMD and NVIDIA GPUs without refactoring is a runtime-level abstraction that VMware provides through familiar virtualization management tools.
HPE’s GX5000 supports multi-vendor GPU blades in the same rack, and hyperscalers abstract accelerators entirely at the service layer (Vertex AI, Bedrock). VMware’s differentiator is not multi-accelerator support per se but the level of architectural control: the operator manages GPU placement and scheduling through vSphere primitives they already know, with stronger knobs than cloud providers offer. Dell’s AMD track (Dell AI Platform with AMD) remains separately branded, with a separate software stack, not a unified runtime.

The Tanzu-mediated MCP server publishing is an emerging capability that could become significant for agentic AI: if every enterprise developer can publish MCP servers through Tanzu, and IT governs access through VCF 9.1’s MCP server governance, VMware creates a platform for enterprise agent tooling that is neither centralized (Google) nor delegated to partners (HPE Unleash AI) but distributed-and-governed. Whether this pattern scales depends on enterprise developer adoption of Tanzu.

## ○ Layer 2C: Agentic Infrastructure — The Reasoning Plane

*Policy-driven placement and resource coordination — the Autonomy Layer*

**Status:** Emerging Signals Only

### Vendor-Provided Components

**MCP Server Governance (VCF 9.1)** [DAPM: Retained] Central management and access control for MCP tools and servers. User groups restricted to approved tools. Security guardrails via vDefend and Avi. This is an access control function: it determines WHICH agents can use WHICH tools. It is NOT a placement function: it does not determine WHERE agents run, WHICH model serves WHICH request, or HOW cost/compliance/latency are arbitrated.

**VCF Intelligent Assist (Tech Preview)** [DAPM: Retained] AI-driven support assistant for VCF operations. Diagnoses and resolves infrastructure issues by consulting Broadcom’s knowledge base. Supports on-premises and cloud-hosted language models.
This is an agentic infrastructure management tool, not a general-purpose agent orchestration layer.

**GPU/Model Metrics Observability** [DAPM: Retained] Private AI Model and GPU Metrics in VCF Operations: utilization, memory pressure, model-level visibility. Customizable dashboards for AI model and agent performance. These metrics could FEED a Layer 2C placement engine, but no such engine exists to consume them.

### NVIDIA-Provided Components

**No NVIDIA Layer 2C on VMware** NVIDIA provides no agent governance, policy-driven placement, or reasoning plane components in the VMware stack. Same gap as Dell: NVIDIA’s AI-Q is workflow scaffolding, OpenShell is constraint enforcement, Dynamo is performance routing. None is Layer 2C.

### Gap Analysis

Applying the ‘Routing Is Not Reasoning’ test: MCP Server Governance = access control. Intelligent Assist = IT operations automation. GPU/Model Metrics = observability telemetry. None provides policy-driven decisions about where compute runs relative to data, which model serves which request, and how cost/compliance/latency are arbitrated in real time.

VMware’s Layer 2C position is comparable to Dell’s: not yet evident as a productized capability. The signals (MCP governance, metrics observability, Intelligent Assist) suggest the building blocks exist but have not been composed into a reasoning plane.

Compare to other vendors’ Layer 2C status:

• Dell: Absent. Dell+Intel ‘actively addressing’ but no product announced.
• HPE: Delegated to Kamiwaza (multi-layer orchestration partner via Unleash AI). GreenLake Intelligence provides IT ops 2C.
• VAST: PolicyEngine + Polaris (the most aggressive middle-out Layer 2C build).
• Google: Agent Identity + Gateway + Registry + Orchestration + Observability (the most complete productized 2C).
• AWS: AgentCore with implicit 2C through managed service placement decisions.

VMware’s unique 2C opportunity: VCF manages the entire infrastructure estate.
If Broadcom builds a Layer 2C that queries vSAN governance metadata, GPU utilization telemetry, vDefend security posture, and Model Runtime performance metrics to make autonomous placement decisions, it would have the broadest infrastructure visibility of any on-prem 2C, because VCF sees everything from the hypervisor up.

The data to build 2C exists in VCF Operations. The governance primitives exist in MCP Server Governance and ACC. The placement engine does not.

### Borrowed Judgment

Inverted: there IS no judgment to borrow because no Layer 2C exists. Same structural position as Dell. The enterprise must build custom 2C logic, bring a partner (Kamiwaza, potentially), or operate without it. Most will choose the third option and operate without it.

The MCP Server Governance capability is an interesting partial answer: it provides governance over agent-tool interactions without providing placement intelligence. This is access-control-as-governance: necessary but not sufficient for a reasoning plane.

### Working Notes

VMware has a structural advantage in building Layer 2C that no other on-prem vendor possesses: VCF is the control plane for the enterprise’s entire virtualized estate. Dell manages Dell hardware. HPE manages HPE hardware. VAST manages VAST storage. VMware manages EVERYTHING virtualized, across Dell, HPE, Lenovo, Cisco, and any other OEM’s hardware.

A VMware Layer 2C would be the first multi-vendor infrastructure reasoning plane, making placement decisions across heterogeneous hardware from a single governance surface. No other vendor can build this because no other vendor has the cross-vendor infrastructure visibility.

Whether Broadcom invests in this opportunity is an open question. The Broadcom acquisition thesis prioritizes cash generation from the installed base, not R&D investment in new platform capabilities. Layer 2C is a significant engineering investment.
The $30B annual infrastructure software segment gives Broadcom the resources; the question is whether the strategic priority exists.

Hock Tan’s framing of VCF as ‘the permanent abstraction layer between AI software and physical chips’ is a Layer 2A statement, not a Layer 2C statement. The abstraction layer manages resources. The reasoning plane governs them. VMware has the former; it does not have the latter.

## ◑ Layer 3 (+1): AI Application Layer — The Value Plane

*AI-powered business capabilities — business logic, workflow automation*

**Status:** Platform-Enabled, Not Platform-Provided

### Vendor-Provided Components

**Private AI Services (Integrated)** [DAPM: Retained] Model Runtime + Agent Builder + Data Indexing/Retrieval + Vector Database + Model Store + GPU Monitoring, all included in VCF subscription at no additional cost. This is not a Layer 3 application stack; it is an integrated set of AI platform services that enables Layer 3 development. The distinction matters: VMware provides the tools to BUILD AI applications, not the applications themselves.

**NVIDIA Blueprints + NIM on VCF** [DAPM: Delegated] Pre-built AI application patterns deployable through VCF Automation catalog. Multimodal PDF Extraction, Digital Twins, RAG pipelines. Same blueprints available on Dell, HPE, Cisco, Lenovo; non-differentiating for VMware.

**Tanzu Platform (Agent Distribution)** [DAPM: Retained] Developers self-publish AI agents and MCP servers to the enterprise via Tanzu. IT maintains governance and oversight. Tanzu Marketplace provides a curated path to certified middleware, data services, and AI tooling. This is an enterprise app store model for AI capabilities.

**ISV + OEM Ecosystem** [DAPM: Delegated] VMware Private AI Foundation validated on Dell, HPE, Lenovo, Cisco, Supermicro, NEC, Fujitsu hardware. ISV ecosystem spans the entire VMware partner network: thousands of validated applications across every industry.
AI-specific ISV validation is emerging but not yet at the curation depth of HPE’s Unleash AI (26+ selected ISV partners) or Dell’s AI Ecosystem Program (OpenAI, Palantir, Google, ServiceNow).

**OpenWebUI Integration** [DAPM: Delegated] Open-source AI user interface integrated with VCF Private AI Services RAG. Provides a ChatGPT-like interface for enterprise users to interact with privately-hosted models. Demonstrates the ‘platform enables applications’ model.

### NVIDIA-Provided Components

**NVIDIA Model Ecosystem** Nemotron models, community models, NVIDIA NIM containers available through Model Store. NVIDIA provides the model layer; VMware provides the serving and governance layer.

### Gap Analysis

VMware’s Layer 3 is structurally different from every other assessed vendor because VMware is explicitly a platform, not an application provider. VMware provides the tools to build and deploy AI applications (Private AI Services) but does not build the applications themselves.

This is the correct architectural position for an infrastructure platform vendor, and it’s the same position Dell occupies (Dell doesn’t build AI applications; it partners with OpenAI, Palantir, ServiceNow). The difference is ecosystem depth:

• Dell’s AI ecosystem: OpenAI, Palantir, Google, ServiceNow, SpaceXAI, Hugging Face, 5,000+ deployment customers. Explicitly curated for AI.
• HPE’s Unleash AI: 26+ selected ISV partners with validated interoperability. Kamiwaza orchestration. CrewAI pre-installed. Purpose-built for AI.
• VAST’s Cosmos Community: CoreWeave, TwelveLabs, CrowdStrike with distinct partner tracks. Focused and vertical.
• VMware’s AI ecosystem: Inherits the broader VMware partner ecosystem (thousands of ISVs) but without AI-specific curation depth. Private AI Foundation validation is available on major OEM hardware, but AI-specific ISV partnerships are not yet at the maturity of Dell or HPE programs.

The Tanzu-mediated MCP server publishing could evolve into VMware’s distinctive Layer 3 model: instead of curating an external ISV ecosystem (HPE’s approach) or partnering with AI application vendors (Dell’s approach), VMware enables the enterprise’s own developers to build and distribute AI agents internally. This is an internally-generated Layer 3 rather than an externally-sourced one.

The VCF installed base is the Layer 3 enabler: 100M+ cores means Private AI Services reach more enterprise infrastructure than any competitor’s AI platform. The AI applications built on VMware will be built by the enterprise’s own developers, using VMware’s tools, on VMware’s platform. Whether that bottom-up, developer-driven approach generates Layer 3 applications as quickly as Dell’s top-down partnerships (OpenAI, Palantir) or HPE’s curated ecosystem (Unleash AI) is the open question.

### Borrowed Judgment

Distributed across the enterprise’s own development teams and chosen partners. VMware provides the platform; the enterprise provides the application logic. This is the most explicit Retained model for Layer 3 in this assessment: the enterprise builds its own AI applications rather than consuming a vendor’s or partner’s.

The trade-off: maximum control (Retained), maximum effort (the enterprise must build everything above the platform services layer). Dell and HPE offer Delegated shortcuts (partner applications). VMware offers Retained responsibility.

### Working Notes

The Private AI Foundation at no additional cost for VCF subscribers is a strategic masterstroke for customer retention: every VCF customer already has access to Model Runtime, Agent Builder, Vector Database, Data Indexing/Retrieval, and Model Store. The marginal cost of trying Private AI is zero (beyond GPU hardware). This is the lowest-barrier entry to on-prem AI of any vendor assessed.

The 9/10 Fortune 500 commitment to VCF means Private AI Foundation has the largest potential enterprise deployment footprint of any on-prem AI platform. Whether that potential converts to actual AI workload deployment depends on whether enterprises find Private AI Services sufficient for production AI or whether they choose purpose-built alternatives (Dell AI Factory, HPE Private Cloud AI, VAST AI OS) for deeper capabilities.

The competitive dynamic is unusual: VMware doesn’t compete with Dell or HPE at Layer 0 (VMware runs ON their hardware). VMware competes with them at Layers 1-3 (management, orchestration, AI services). An enterprise could run Dell hardware + VMware VCF + VMware Private AI Services, getting Dell’s Layer 0 with VMware’s Layers 2A/2B. Or Dell hardware + Dell AI Factory, getting Dell’s Layer 0 with Dell/NVIDIA’s Layers 2A/2B. The choice is between VMware’s operational maturity and Dell/HPE’s AI-specific depth.

---

*Layer2C · AI Infrastructure Decision Intelligence · The CTO Advisor LLC · thectoadvisor.com*