# NVIDIA AI Platform — 4+1 Layer AI Infrastructure Assessment

> A Components Company Becoming a Platform Vendor — Mapped to the 4+1 Layer AI Infrastructure Model  
> Version: v1.0 — Draft, Editorial Review Pending · Date: May 22, 2026  
> Source: GTC 2025, GTC 2026, Dynamo 1.0 GA, NemoClaw/OpenShell announcement, Run:ai acquisition, DGX Cloud, NVIDIA AI Enterprise, NIM GA, analyst coverage, SEC FY2026 annual report  
> Published by: The CTO Advisor LLC · thectoadvisor.com  
> Author: Keith Townsend

[Full interactive assessment](https://layer2c.web.app/assessment/nvidia) · [Methodology](https://layer2c.web.app/methodology) · [What Is Layer 2C?](https://layer2c.web.app/what-is-layer-2c)

## Executive Summary

NVIDIA is the only vendor in this assessment series that appears inside every other vendor's assessment. Dell's Layer 2A GPU orchestration is NVIDIA Run:ai. HPE's Layer 2B runtime is NVIDIA AI Enterprise. VMware's GPU integration is NVIDIA vGPU Manager. AWS, Google Cloud, and Azure all run NVIDIA GPUs alongside their own silicon. VAST embeds NVIDIA GPUs, NICs, DPUs, and switches into its data platform. Mapping NVIDIA as a standalone vendor inverts the perspective: instead of asking 'where does Dell cede authority to NVIDIA,' the question becomes 'where does NVIDIA claim authority, and from whom?'

The 4+1 mapping reveals NVIDIA as a vendor with deep authority at Layers 0, 2A, and 2B — and emerging ambitions at Layer 2C. NVIDIA designs the accelerator silicon that every other vendor depends on (Layer 0), provides the GPU orchestration platform that on-prem vendors brand as their own (Layer 2A via Run:ai), and controls the inference runtime and model lifecycle stack that sits between the enterprise's infrastructure and its AI applications (Layer 2B via NIM, NeMo, Dynamo, TensorRT-LLM). At Layers 1A through 1C, NVIDIA is an accelerator — it makes other vendors' storage and data pipelines faster without providing those capabilities directly.

The structural tension is between NVIDIA as a silicon supplier and NVIDIA as a platform vendor. When NVIDIA was only selling GPUs, its interests aligned with every OEM and hyperscaler: more GPU adoption meant more revenue for everyone. As NVIDIA extends into GPU orchestration (Run:ai), inference runtime (NIM/Dynamo), agent governance (OpenShell/NemoClaw), and cloud infrastructure (DGX Cloud), it competes with the same customers who buy its silicon. Dell's AI Factory runs NVIDIA software. But DGX Cloud is NVIDIA competing with Dell for the same enterprise workload.

The DAPM classification for NVIDIA is inverted from every other assessment: the enterprise consuming NVIDIA through an OEM (Dell, HPE, VMware) has already Ceded authority to the OEM, which has Ceded authority to NVIDIA. The enterprise consuming NVIDIA directly (DGX Cloud, NIM API) Cedes authority to NVIDIA without an intermediary. The enterprise self-hosting NVIDIA open-source software (Dynamo, OpenShell, Nemotron) Retains authority — but still runs on NVIDIA silicon. NVIDIA is the only vendor where every deployment path, at every layer, eventually depends on NVIDIA hardware.

More than half of NVIDIA's engineers work on software. That statistic from NVIDIA's FY2026 annual report is the key to understanding the 4+1 mapping: NVIDIA is a software company that happens to sell the hardware its software requires. The assessment series has been documenting where NVIDIA's software authority appears inside other vendors' stacks. This assessment makes that authority explicit.

## Layer Status

| Layer | Status | Classification |
|---|---|---|
| Layer 0 | ● NVIDIA Strength — Silicon Authority | Compute & Network Fabric |
| Layer 1A | ○ Accelerator Only | Data Storage & Governance |
| Layer 1B | ◑ Acceleration + Model Enablement | Context Management & Retrieval |
| Layer 1C | ◑ Inference Data Movement | Data Movement & Pipelines |
| Layer 2A | ● NVIDIA Authority via Run:ai | Infrastructure Orchestration |
| Layer 2B | ● NVIDIA Authority — Inference + Agent Runtime | Application Runtime & Execution |
| Layer 2C | ○ Runtime Governance Only — Not a Reasoning Plane | Agentic Infrastructure — The Reasoning Plane |
| Layer 3 (+1) | ◑ Model + Blueprint Enablement | AI Application Layer — The Value Plane |

## DAPM Profile

| Classification | Count | Meaning |
|---|---|---|
| Retained | 4 | Enterprise owns and controls this capability |
| Delegated | 5 | Provided by substitutable partner; enterprise retains swap authority |
| Ceded | 8 | Vendor controls this; enterprise has no governance authority |
| Absent | 0 | No capability at this layer |

## Strongest Layers

- **Layer 0** (Compute & Network Fabric) — NVIDIA Strength — Silicon Authority
- **Layer 2A** (Infrastructure Orchestration) — NVIDIA Authority via Run:ai
- **Layer 2B** (Application Runtime & Execution) — NVIDIA Authority — Inference + Agent Runtime

## Gap Areas

- **Layer 1A** (Data Storage & Governance) — Accelerator Only
- **Layer 2C** (Agentic Infrastructure — The Reasoning Plane) — Runtime Governance Only — Not a Reasoning Plane

## Layer-by-Layer Detail

### ● Layer 0: Compute & Network Fabric

*Raw compute, networking, and acceleration fabric*  
**Status:** NVIDIA Strength — Silicon Authority

**GPU Accelerator Silicon (Blackwell, Vera Rubin)** [DAPM: Ceded]  
Blackwell B200/B300/GB200 (current generation). Vera Rubin NVL72 (next generation, deploying at hyperscalers). The accelerator silicon that every other vendor in this assessment depends on. Dell builds PowerEdge around it. HPE builds ProLiant and Cray around it. AWS offers it as P5/P6 instances. Azure offers it as ND-series. Google offers it alongside TPUs. VAST embeds it in CNode-X. No enterprise AI infrastructure exists without NVIDIA GPU silicon — or a deliberate decision to use an alternative (AWS Trainium, Google TPU, AMD Instinct).

**Networking Silicon + Interconnect** [DAPM: Ceded]  
NVLink/NVSwitch (intra-node GPU interconnect). Spectrum-X Ethernet switches. ConnectX-7/8 SmartNICs. BlueField-3 DPUs. InfiniBand for GPU cluster fabric. NIXL for disaggregated inference data movement. Dell brands Spectrum switches as PowerSwitch. HPE integrates ConnectX into ProLiant. VAST uses ConnectX/BlueField for NVMe-over-Fabrics. The networking silicon is as structurally embedded as the GPU silicon.

**DGX Platform (On-Prem Systems)** [DAPM: Ceded]  
DGX SuperPOD: leadership-class AI infrastructure for on-prem and hybrid. DGX Station: workgroup-scale AI compute. DGX Spark: desktop AI workstation. Pre-configured systems with NVIDIA software stack pre-installed. Competes directly with Dell PowerEdge, HPE ProLiant, and OEM AI server configurations — NVIDIA sells the assembled system, not just the components.

**DGX Cloud (Hosted Infrastructure)** [DAPM: Ceded]  
GPU supercomputing as a service, hosted on AWS, Azure, GCP, and OCI. Includes NVIDIA AI Enterprise software and Base Command Platform. The enterprise accesses NVIDIA infrastructure through a hyperscaler substrate — Ceding to both NVIDIA (software/GPU) and the hyperscaler (facility/network). DGX Cloud competes with the hyperscalers' own GPU instance offerings while running on their infrastructure.

**Gap Analysis:** Layer 0 is NVIDIA's foundational authority. Every other vendor in this assessment depends on NVIDIA silicon at this layer — the only exceptions are AWS (Trainium/Inferentia), Google (TPU), Azure (Maia), and AMD Instinct instances on hyperscalers.

The DGX Platform creates a structural tension with OEM partners. When NVIDIA sells DGX SuperPOD directly to an enterprise, that enterprise is NOT buying Dell PowerEdge or HPE ProLiant. NVIDIA is simultaneously its OEM partners' most critical supplier and their direct competitor. Dell's 'AI Factory with NVIDIA' branding and HPE's 'NVIDIA AI Computing by HPE' branding are attempts to keep the enterprise buying through the OEM rather than going to NVIDIA directly.

DGX Cloud adds a second tension: NVIDIA competes with hyperscalers while running on their infrastructure. AWS, Azure, and GCP host DGX Cloud while simultaneously offering their own GPU instances. The enterprise choosing between Azure ND-series VMs and DGX Cloud on Azure is choosing between Microsoft-managed and NVIDIA-managed access to the same GPU hardware.

The NVIDIA dependency at Layer 0 is the one dependency shared by every on-prem vendor assessed. Dell, HPE, VAST, and VMware all depend on NVIDIA GPU silicon. The difference is the scope of that dependency: at Layer 0, NVIDIA provides silicon. At Layer 2A, NVIDIA provides orchestration. At Layer 2B, NVIDIA provides runtime. The silicon dependency is structural and shared. The software dependency is where NVIDIA's authority claims create tension.

**Borrowed Judgment:** The enterprise consuming NVIDIA silicon inherits NVIDIA's GPU architecture decisions (memory bandwidth, interconnect topology, power/thermal profile), NVIDIA's driver and CUDA runtime decisions, and NVIDIA's product lifecycle and pricing decisions. This borrowed judgment is structural — it exists for every vendor in the assessment and for every enterprise running AI workloads on NVIDIA hardware.

The DGX Platform adds system-level borrowed judgment: NVIDIA's hardware integration, thermal design, and rack architecture decisions. The DGX Cloud adds hyperscaler-layered borrowed judgment: NVIDIA software decisions on top of hyperscaler infrastructure decisions.

### ○ Layer 1A: Data Storage & Governance

*Durable, governed data foundation — the Governance Catalog that Layer 2C queries*  
**Status:** Accelerator Only

**Gap Analysis:** NVIDIA provides no storage, no data governance, and no data platform. Zero components at this layer.

NVIDIA accelerates other vendors' storage with GPU libraries (cuVS for vector search, RAPIDS for data processing) and networking hardware (BlueField DPUs, ConnectX NICs) — but those are acceleration functions assessed at their functional layers (1B for retrieval, 1C for data processing, Layer 0 for networking silicon). The storage platforms, governance catalogs, and data architectures are entirely owned by other vendors: Dell (PowerScale, ObjectScale, MetadataIQ), HPE (Alletra, Data Fabric), VAST (DataStore, DataBase, Catalog), AWS (S3, Glue, Lake Formation), Google (BigQuery, Knowledge Catalog), Azure (Blob, Fabric, Purview).

The absence of NVIDIA-owned storage or governance is structurally significant for Layer 2C: a Reasoning Plane needs governance metadata — which data is sensitive, which models are approved, which compliance requirements apply. NVIDIA has no Layer 1A metadata to feed into a Layer 2C reasoning plane. Every other vendor's 2C ambition is anchored in governance metadata from 1A. NVIDIA's emerging Layer 2C (OpenShell/NemoClaw) operates without governance context because NVIDIA doesn't own the data layer.

**Borrowed Judgment:** None. NVIDIA has no data layer authority to lend or borrow. The enterprise's storage and governance judgment comes entirely from the storage vendor.

### ◑ Layer 1B: Context Management & Retrieval

*Low-latency retrieval for RAG — vector/hybrid search, context windows*  
**Status:** Acceleration + Model Enablement

**cuVS (GPU-Accelerated Vector Search)** [DAPM: Delegated]  
GPU-accelerated vector similarity search library. 12x faster vector indexing. Used by Dell (MetadataIQ integration), VAST (CNode-X vector search), and storage vendors for retrieval acceleration. NVIDIA provides the search acceleration; the platform vendor provides the retrieval infrastructure and index.

**NeMo Retriever** [DAPM: Delegated]  
GPU-accelerated retrieval pipeline for RAG. Embedding models, reranking, and retrieval optimization. Integrated into Dell's Data Search Engine (PowerScale connector), HPE's retrieval stack, and VMware's AI Enterprise RAG Stack. Provides the retrieval intelligence that OEMs brand as part of their platforms.

**NIM Embedding Models** [DAPM: Delegated]  
Pre-built inference microservices for text and multimodal embedding. Used by VAST's InsightEngine, Dell's retrieval pipeline, and hyperscaler RAG services. NVIDIA provides the embedding models; the platform vendor provides the retrieval infrastructure.

**Gap Analysis:** NVIDIA provides retrieval acceleration and embedding models but not retrieval infrastructure. The retrieval engines — Azure AI Search, OpenSearch, Elasticsearch, VAST InsightEngine, Google Vertex AI Search — are owned by other vendors. NVIDIA makes retrieval faster and provides the embedding models that make vector search work, but the enterprise's retrieval architecture is determined by the platform vendor.

NeMo Retriever is a meaningful capability: it provides the GPU-accelerated RAG pipeline that multiple OEMs brand as part of their offerings. When Dell advertises 'GPU-accelerated hybrid search,' the GPU acceleration is NVIDIA's. The enterprise's retrieval quality depends on NVIDIA's embedding model quality — a borrowed judgment that is rarely made explicit.

**Borrowed Judgment:** Moderate. NeMo Retriever embedding models determine retrieval quality. The enterprise inherits NVIDIA's embedding model training decisions, architecture choices, and optimization priorities. This borrowed judgment is invisible — the enterprise interacts with Dell's search engine or VAST's InsightEngine, not with NVIDIA's embeddings directly.

### ◑ Layer 1C: Data Movement & Pipelines

*Move/transform data — ETL/ELT, lineage, cost-aware movement, KV cache tiering*  
**Status:** Inference Data Movement

**NVIDIA CMX (KV Cache Management)** [DAPM: Delegated]  
Context Memory Extension for KV cache offload from GPU to CPU/SSD. Validated with Dell PowerScale (19x TTFT improvement). Enables inference scaling by treating KV cache as a data movement problem. Future integration expected with HPE Alletra and VAST CNode-X. The most architecturally significant Layer 1C capability NVIDIA provides — it solves a data movement problem that emerges specifically from inference workloads.

**Gap Analysis:** NVIDIA does not provide data pipeline orchestration (Data Factory, Dataloop, DataEngine, Airflow). Its Layer 1C presence is a single capability: KV cache management via CMX.

CMX is architecturally significant because it addresses a data movement problem unique to AI inference — KV cache growing beyond GPU memory. This is a Layer 1C function (data movement) that directly affects Layer 2B performance (inference latency). Dell has validated it (19x TTFT improvement on PowerScale); HPE and VAST are expected integrations. The enterprise's KV cache strategy becomes a borrowed judgment from NVIDIA's CMX design decisions once adopted.

NVIDIA also provides GPU-accelerated compute libraries (RAPIDS, cuDF) used within other vendors' data pipelines, but these are computation acceleration, not data movement — they make processing faster without providing pipeline orchestration, data lineage, or movement logic. They are assessed at the layers where they functionally operate (compute acceleration at Layer 0, retrieval acceleration at Layer 1B) rather than at Layer 1C.

**Borrowed Judgment:** Moderate for CMX — an architectural decision about KV cache management that affects inference performance and is harder to substitute once adopted. The enterprise inherits NVIDIA's decisions about cache eviction policy, offload thresholds, and storage tier targeting.

### ● Layer 2A: Infrastructure Orchestration

*GPU scheduling, quotas, RBAC, fair-share scheduling, utilization optimization*  
**Status:** NVIDIA Authority via Run:ai

**NVIDIA Run:ai (Acquired 2024)** [DAPM: Ceded]  
GPU orchestration and workload management platform. Kubernetes-native. Dynamic GPU pooling across hybrid environments. Fractional GPU sharing (no open-source equivalent). Fair-share scheduling with team-level quotas. Multi-cluster management from a unified control plane. Now part of NVIDIA AI Enterprise ($4,500/GPU/year standalone, included with DGX). Run:ai is the Layer 2A authority that Dell brands as part of AI Factory, HPE brands as part of Private Cloud AI, and VMware integrates through NVIDIA AI Enterprise. The OEM sells the relationship; NVIDIA controls the scheduling intelligence. GPU-level infrastructure (GPU Operator for driver/plugin lifecycle, MIG for hardware partitioning) provides the substrate Run:ai orchestrates — invisible plumbing, not standalone orchestration tools.

**Gap Analysis:** Layer 2A is where NVIDIA's platform ambition creates the most direct tension with its OEM partners. Run:ai is the most capable GPU-specific orchestration platform available — fractional GPU sharing, multi-cluster management, and fair-share scheduling are capabilities that no open-source alternative matches.

But Run:ai is NVIDIA's product, not the OEM's. When Dell markets 'AI Factory with NVIDIA,' the GPU scheduling intelligence is Run:ai — NVIDIA's IP, NVIDIA's roadmap, NVIDIA's pricing. Dell provides the hardware, the rack integration, and the customer relationship. NVIDIA provides the scheduling brain. If NVIDIA changes Run:ai's architecture, licensing, or feature set, Dell's AI Factory Layer 2A changes with it — without Dell's input.

The same dynamic applies to HPE (Private Cloud AI includes NVIDIA AI Enterprise with Run:ai) and VMware (VCF integrates NVIDIA AI Enterprise). Three OEMs, one scheduling authority.

The hyperscalers avoid this dependency: AWS built Karpenter, Google built GKE Autopilot + Fluid Compute, Azure contributed DRA to upstream Kubernetes. Each hyperscaler owns its GPU scheduling intelligence. On-prem vendors do not — they consume NVIDIA's.

The open-source alternatives (Kueue, KAI Scheduler, DRA) are catching up but lack Run:ai's fractional GPU sharing. The enterprise evaluating GPU orchestration choices is evaluating a NVIDIA proprietary vs. open-source trade-off — better capability (Run:ai) vs. more authority (open-source).

**Borrowed Judgment:** High. Run:ai's scheduling decisions — which team gets which GPU, how fractional sharing is allocated, when over-quota borrowing is permitted — are NVIDIA's judgment. The enterprise configures policies; NVIDIA's scheduler executes them. If Run:ai makes a scheduling decision that impacts training job completion time or inference latency, that's NVIDIA's borrowed judgment affecting business outcomes.

NVIDIA AI Enterprise licensing adds commercial borrowed judgment: the enterprise's production deployment timeline depends on NVIDIA's licensing terms, pricing changes, and certification cycles.

### ● Layer 2B: Application Runtime & Execution

*Model serving, inference optimization, agent runtime — the Execution Plane*  
**Status:** NVIDIA Authority — Inference + Agent Runtime

**NVIDIA NIM (Inference Microservices)** [DAPM: Ceded]  
Pre-built, optimized inference containers for 100+ models. OpenAI-compatible API. Free for prototyping on DGX Cloud (build.nvidia.com). Production requires AI Enterprise license. Includes Nemotron, Llama, Mistral, and partner models. NIM is the inference runtime that multiple OEMs and hyperscalers brand as part of their platforms — AWS Bedrock offers NIM, Azure Foundry offers NIM, Dell deploys NIM on PowerEdge.

**Dynamo 1.0 (Inference Operating System)** [DAPM: Retained]  
Open-source (Apache 2.0) distributed inference serving framework. GA March 2026. Disaggregated prefill and decode. KV-aware routing to GPUs with best cache match. KVBM for memory management. NIXL for GPU-to-GPU data movement. Grove for scaling. 7x performance boost on Blackwell. Adopted by AWS, Azure, GCP, OCI, CoreWeave, and dozens of inference providers. NVIDIA positions Dynamo as 'the operating system of AI factories.' Open-source but NVIDIA-optimized — runs best on NVIDIA hardware.

**NeMo (Model Lifecycle)** [DAPM: Ceded]  
End-to-end model lifecycle management: data curation, model customization and evaluation, guardrailing and observability. NeMo Guardrails for content safety. NeMo Evaluator for model assessment. NeMo Data Designer for training data preparation (integrated into VAST's TuningEngine). The model lifecycle stack that operates above inference and below applications.

**NeMo Guardrails (Runtime Content Safety)** [DAPM: Ceded]  
Programmable content safety framework inline with inference. Controls model output, topic boundaries, and factual grounding during model serving. Deployed as part of NIM containers or standalone. At Layer 2B, Guardrails functions as runtime content filtering — it controls what the model says during inference. The same capability serves a Layer 2C governance function when applied as policy enforcement for agent behavior.

**NemoClaw + OpenShell (Agent Execution Runtime)** [DAPM: Retained]  
NemoClaw: open-source stack (Apache 2.0) bundling OpenShell runtime with Nemotron models. At Layer 2B, NemoClaw is the agent execution environment — the runtime that agents run inside. OpenShell provides kernel-level sandboxing (deny-by-default) and privacy router for on-device vs. cloud inference routing. Early alpha. The same capability serves a Layer 2C governance function as the policy enforcement sandbox that constrains agent behavior.

**Gap Analysis:** Layer 2B is NVIDIA's deepest software authority and the layer where the platform ambition is most visible. NIM, Dynamo, NeMo, NeMo Guardrails, and NemoClaw/OpenShell constitute the complete inference, model lifecycle, and agent runtime stack.

NeMo Guardrails and NemoClaw/OpenShell appear at both Layer 2B and Layer 2C because they serve dual architectural functions. At 2B they are runtime capabilities — content filtering inline with inference, agent execution environment. At 2C they are governance capabilities — policy enforcement for agent behavior, sandbox constraints on agent access. The same code, two architectural purposes. This dual-layer presence is itself evidence of NVIDIA's platform transition: a components company's software stays within one layer; a platform company's software spans layers.

The open-source strategy is deliberate: Dynamo (Apache 2.0) and NemoClaw/OpenShell (Apache 2.0) are open-source, meaning the enterprise Retains the code. But both are optimized for NVIDIA hardware and NVIDIA's CUDA ecosystem. Running Dynamo on AMD or Intel GPUs is theoretically possible but practically disadvantaged. The open-source license provides code portability; the hardware optimization provides silicon lock-in.

NIM is the more significant authority claim: it's closed-source, NVIDIA-only, and requires an AI Enterprise license for production. The enterprise using NIM to serve models has Ceded inference runtime authority to NVIDIA. The alternative — vLLM, SGLang, or other open-source serving frameworks — is slower but Retained.

The Dell assessment's Layer 2B finding applies directly: Dell does not appear to own the core agent runtime, model-serving runtime, guardrail framework, or distributed inference framework. Those are NVIDIA's. This assessment confirms that observation from NVIDIA's perspective.

**Borrowed Judgment:** High for NIM (closed-source, NVIDIA-controlled inference optimization decisions). Low for Dynamo (open-source, enterprise can fork and modify). Moderate for NeMo (model lifecycle decisions — training data curation, evaluation metrics, guardrail policies — are NVIDIA's defaults that the enterprise inherits unless explicitly overridden).

The inference optimization decisions in NIM and TensorRT-LLM directly affect model output quality, latency, and cost. Quantization choices, batching strategies, and KV cache management are NVIDIA's engineering decisions that the enterprise consumes without visibility. If an NIM container produces different outputs than a vLLM deployment of the same model, the enterprise may not know which is 'correct.'

### ○ Layer 2C: Agentic Infrastructure — The Reasoning Plane

*Policy-driven placement and resource coordination — the Autonomy Layer*  
**Status:** Runtime Governance Only — Not a Reasoning Plane

**NemoClaw + OpenShell Agent Governance (Alpha)** [DAPM: Retained]  
NemoClaw: open-source stack (Apache 2.0) bundling OpenShell governance runtime with Nemotron models and NVIDIA Agent Toolkit. At Layer 2C, OpenShell is the policy enforcement sandbox — kernel-level deny-by-default constraints on filesystem, network, and process access via declarative YAML. Privacy router governs on-device vs. cloud inference routing as a policy decision. The same capability serves a Layer 2B function as the agent execution environment. Alpha-stage — NVIDIA is explicit about rough edges.

**NeMo Guardrails (Agent Policy Enforcement)** [DAPM: Delegated]  
At Layer 2C, Guardrails functions as policy enforcement for agent behavior — controlling what agents are permitted to do, say, and access as a governance decision. Topic boundaries, factual grounding requirements, and content policies are defined declaratively and enforced at runtime. The same capability serves a Layer 2B function as inline content safety during model inference.

**Gap Analysis:** Applying the 'Routing Is Not Reasoning' test from the VMware assessment: OpenShell provides runtime sandbox governance — it controls WHAT agents can access (filesystem, network, processes). NeMo Guardrails control WHAT models can say (content filtering, topic boundaries). Neither provides policy-driven decisions about WHERE compute runs relative to data, WHICH model serves WHICH request, or HOW cost/compliance/latency are arbitrated.

OpenShell is agent runtime security. NeMo Guardrails is model output safety. Neither is a Reasoning Plane.

NVIDIA's Layer 2C gap is structural: NVIDIA does not own storage (Layer 1A), data governance (Purview, Lake Formation, Knowledge Catalog), or enterprise identity (Entra, IAM). A Reasoning Plane needs governance metadata — which data is sensitive, which models are approved, which compliance requirements apply. NVIDIA has no data governance to query because it has no data layer.

The consequence: NVIDIA's Layer 2C will always depend on another vendor's governance metadata. OpenShell can enforce sandbox policies, but it cannot make placement decisions informed by data classification, compliance status, or cost targets — because that information lives in Purview, Lake Formation, PolicyEngine, or MetadataIQ, none of which NVIDIA owns.

This is the fundamental structural limitation of NVIDIA's platform ambition: NVIDIA can build runtime governance (2B/2C boundary) but cannot build a full Reasoning Plane (2C) because it lacks the data governance foundation (1A) that a Reasoning Plane queries.

**Borrowed Judgment:** Low for OpenShell (open-source, enterprise controls the policies). Low for NeMo Guardrails (configurable by the enterprise). The governance logic is transparent — the enterprise defines what agents can and cannot do.

The missing borrowed judgment is the more significant finding: NVIDIA's Layer 2C cannot borrow data governance judgment from itself because it doesn't have a data governance layer. It must borrow from Dell (MetadataIQ), HPE (Data Fabric), VAST (Catalog), AWS (Lake Formation), Google (Knowledge Catalog), or Azure (Purview). NVIDIA's governance is runtime-only; other vendors' governance is data-informed.

### ◑ Layer 3 (+1): AI Application Layer — The Value Plane

*AI-powered business capabilities — business logic, workflow automation*  
**Status:** Model + Blueprint Enablement

**Nemotron Open Models** [DAPM: Retained]  
Post-trained on Llama, distilled from DeepSeek-R1. Deployment-ready for AI agents. Available through NIM API (build.nvidia.com) and as downloadable containers. Nemotron models are NVIDIA's answer to the model layer — open models optimized for NVIDIA hardware. Competes with OpenAI, Anthropic, Google, and Meta at the model layer while providing the hardware those competitors run on.

**Gap Analysis:** NVIDIA does not build enterprise AI applications. Its Layer 3 presence is a single component: Nemotron open models.

NVIDIA also provides application enablement that falls below the Layer 3 threshold: Blueprints (pre-built reference patterns for PDF extraction, digital twins, RAG pipelines, AI-Q agent task decomposition — deployed through Dell, HPE, VMware, and hyperscaler marketplaces) and NIM API endpoints (build.nvidia.com — free API access to 100+ models, 1,000 free inference credits, GPU sandbox instances). Blueprints are reference architectures, not applications — the enterprise builds from them, not on them. NIM API is a developer on-ramp and go-to-market funnel, not an application platform.

The Nemotron model strategy is the interesting Layer 3 finding: NVIDIA competes with the AI model providers (OpenAI, Anthropic, Google, Meta) whose models run on NVIDIA hardware. If Nemotron achieves quality parity with proprietary models, enterprises can run inference on NVIDIA hardware with NVIDIA models — a fully vertically integrated stack from silicon to model. No other silicon vendor has this: Intel doesn't have frontier models, AMD doesn't have frontier models, AWS Trainium serves other providers' models.

The NIM API funnel is NVIDIA's developer moat: free prototyping creates adoption → adoption creates switching cost → production deployment requires AI Enterprise license on NVIDIA hardware. The funnel is silicon-to-model-to-lock-in.

**Borrowed Judgment:** Moderate. Nemotron model alignment, training data, and safety decisions are NVIDIA's. The model-to-silicon borrowed judgment is unique to NVIDIA: when the enterprise uses Nemotron on NVIDIA GPUs, both the model and the hardware are NVIDIA's. The enterprise borrows NVIDIA's judgment at every layer of the inference path. No other vendor has this — even Google (Gemini on TPU) separates the model team (DeepMind) from the silicon team.

---
*Layer2C · AI Infrastructure Decision Intelligence · The CTO Advisor LLC · thectoadvisor.com*
