Executive Summary: Google Cloud AI Infrastructure

Google Cloud is the only vendor in this assessment series that owns a frontier foundation model — and that single fact restructures the entire 4+1 analysis. Google built Gemini, trains Gemini on its own TPUs, optimizes its silicon for Gemini’s training requirements, and weaves Gemini’s intelligence into every layer of its cloud platform. This creates a model-integrated stack: an architecture where the frontier model is not a component plugged into infrastructure but the intelligence that pervades the infrastructure.

No other vendor assessed possesses this vertical integration. Google owns every layer of the 4+1 model with proprietary IP: custom silicon (TPUs), custom networking (Virgo), proprietary storage (Colossus/Spanner/BigQuery), its own runtime and frameworks (JAX/Pathways), its own frontier models (Gemini), and a unified orchestration surface (Gemini Enterprise Agent Platform).

The DAPM implication is not merely that every layer is Ceded — it is that every layer is ceded to a unified intelligence. With AWS, authority is distributed across multiple vendors’ judgment (AWS infrastructure, Anthropic model reasoning, ISV application logic). That distribution creates complexity but also structural checks. With Google Cloud + Gemini, the enterprise concentrates authority in one vendor’s judgment across every layer — from silicon to application.

This is the deepest expression of vertical integration in enterprise technology since the mainframe era. The enterprise gains end-to-end optimization that no multi-vendor assembly can match. But the 4+1 framework makes visible what the integration obscures: the enterprise has no fallback position at any layer. Google Distributed Cloud (GDC) addresses data sovereignty without addressing judgment sovereignty — GDC still runs Google’s software stack and Google’s models.

The structural question: does concentrating all layers of authority and all layers of model judgment in a single vendor deliver enough value to justify the governance position — and has the enterprise made that concentration explicit rather than inheriting it by default?

Layer-by-layer status: Layer 0 (Ceded to Google), Layer 1A (Ceded — Model-Powered Governance), Layer 1B (Ceded — Model-Powered Prep), Layer 1C (Ceded), Layer 2A (Ceded / Delegated (GDC)), Layer 2B (Ceded — Model-Integrated Stack), Layer 2C (Ceded — Productized but Captive), Layer 3 (+1) (Open Model Layer, Captive Platform).

Assessment framework: 4+1 Layer AI Infrastructure Model. Scoring model: Decision Authority Placement Model (DAPM) — Retained, Delegated, Ceded, or Absent. Published by The CTO Advisor LLC. Author: Keith Townsend. Date assessed: May 21, 2026. Version: v1.0 — Draft, Editorial Review Pending.

Google Cloud AI Infrastructure

Mapped to the 4+1 Layer AI Infrastructure Model

v1.0 — Draft, Editorial Review Pending·Assessed May 21, 2026·Source: Google Cloud Next 2026 (Apr 22–24), GTC 2026, NVIDIA partnership, Forrester, SiliconANGLE, The New Stack, analyst coverage

ACTIVE ASSESSMENT

Strength

Delegated

Gap

Absent

Partner

Layer 0Compute & Network FabricCeded to Google▼

Raw compute, networking, and acceleration fabric

Vendor-Provided

Google Custom Silicon (TPUs)Ceded

TPU 8t (training): 9,600-chip superpods, ~3x processing power vs Ironwood. TPU 8i (inference): 288GB HBM, 384MB on-chip SRAM, ~80% better perf/dollar. Designed for agentic AI, MoE models, large-scale RL. Google designs TPUs to train Gemini — the enterprise benefits from silicon optimized by one of the world’s most demanding ML workloads but does not direct the optimization priorities.

NVIDIA GPUs on Google CloudCeded

A5X bare-metal on Vera Rubin NVL72 (among first cloud providers to deploy). A4 Ultra (NVL72 preview Q2 2026), A3/A3 Mega/A3 Ultra. Fractional GPUs (G4 VMs, industry-first RTX PRO 6000 Blackwell vGPU). Scale: 80,000 Rubin GPUs single-site, 960,000 across multisite. Third-party models (Claude, Llama, Mistral) run on NVIDIA, not TPU — the playing field is structurally uneven.

Virgo Network FabricCeded

Purpose-built AI-optimized DC fabric. 134,000 TPU 8t chips connected at 47 Pb/s non-blocking bi-section bandwidth per DC. 4x bandwidth per accelerator, 40% lower unloaded latency vs prior gen. Also available for A5X. Designed for Gemini’s training topology.

Google Distributed Cloud (GDC)Delegated

On-prem deployment of Google Cloud services. Connected and air-gapped configs. 4 racks to hundreds. NVIDIA Blackwell GPUs + Gemini Flash models on-prem. Managed GDC Provider initiative (Clarence, Gulf Energy, T-Systems, WWT). NATO deployment. Customer provides facility; Google provides and operates HW+SW. Addresses data sovereignty but not judgment sovereignty.

NVIDIA-Provided

NVIDIA GPU Silicon

Vera Rubin NVL72, Blackwell B200/B300, H100/H200. 1M+ NVIDIA GPUs. NVIDIA instances serve third-party models that can’t run on TPU.

NVIDIA NIXL + Networking

NIXL for disaggregated inference. ConnectX/BlueField for GPU networking. Google manages the NVIDIA integration layer.

◆ Gap Analysis

No Layer 0 capability gap — Google’s portfolio is the broadest of any single cloud provider. The gap is governance: the enterprise has no authority over any Layer 0 component beyond selecting instance types. The silicon-model feedback loop is structurally unique: Google designs TPUs to train Gemini, not primarily to sell cloud compute. TPU roadmap decisions reflect Gemini’s training topology, not enterprise customer workload requirements. The enterprise inherits optimization it didn’t direct. AWS’s Trainium is designed for customer workloads. NVIDIA designs for the broadest market. Google designs for Gemini and makes TPUs available to customers. The multi-accelerator matching problem (TPU vs NVIDIA vs Axion CPU) creates a workload-to-silicon decision that recurs per-workload in cloud vs once at procurement on-prem. No productized policy engine automates that matching. Fluid Compute (Layer 2A) begins to address it but doesn’t consult governance metadata. GDC follows the same inverted operating model as AWS AI Factories: Google operates infrastructure the customer houses. Unlike Dell PowerRack or HPE ProLiant (enterprise-owned hardware), GDC is Google-operated even when customer-hosted.

◆ Borrowed Judgment

The silicon-model feedback loop: Google’s TPU roadmap is driven by Gemini’s training requirements. If Google decides TPU 9 should optimize for MoE architectures because that’s where Gemini is heading, every enterprise TPU workload inherits that architectural bet. Borrowed judgment at the silicon layer — a concept with no parallel in the Dell or HPE assessments. Virgo as borrowed network judgment: the enterprise inherits Google’s network optimization decisions without visibility or control. Cannot audit bandwidth sharing across tenants or prioritization of Google’s own Gemini training traffic.

◆ Working Notes

The dual-architecture hedge (TPU + NVIDIA) gives Google pricing leverage and architectural independence. The enterprise benefits indirectly but does not control whether NVIDIA GPU instances remain first-class citizens as Google optimizes for its own silicon.

Layer 1AData Storage & GovernanceCeded — Model-Powered Governance▼

Durable, governed data foundation — the Governance Catalog that Layer 2C queries

Vendor-Provided

Cloud Storage + Rapid Tier (Colossus)Ceded

Standard object storage at planetary scale. Rapid tier uses Colossus — Google’s internal distributed storage platform (previously powering Search, Gmail, YouTube, Gemini training). Sub-millisecond read/write. The enterprise gets the same storage engine that holds Google’s training data.

Smart StorageCeded

Automatically analyzes unstructured data and generates metadata/context on ingest using Gemini. Auto-tags images, PDFs. First recursive point: Smart Storage uses Gemini to enrich the data that will eventually be served to Gemini-powered agents. The model enriches the data that feeds the model.

Knowledge Catalog (Gemini-Powered)Ceded

Universal business context and governance. Gemini-powered semantic extraction, entity relationship mapping, dynamic context graph construction. Sub-second semantic search for agent retrieval. Aggregates metadata across BigQuery, AlloyDB, Spanner, Cloud SQL, Firestore, Looker. Third-party integrations (Atlan, Collibra, Datahub). Enterprise Connectivity federates context from Salesforce, Palantir, Workday, SAP, ServiceNow. Column-level lineage (GA).

BigQuery StorageCeded

Serverless columnar storage for structured/semi-structured. Managed Iceberg tables. Separates storage and compute. BigQuery spans storage, analytics, ML, and governance in a single service.

NVIDIA-Provided

Assessment pending

◆ Gap Analysis

Knowledge Catalog with Smart Storage represents the most ambitious attempt to solve the metadata-to-agent grounding problem. No other vendor has a production system that automatically extracts business semantics from raw data, builds a context graph, and serves that context to agents in real-time with governance enforcement. The recursive dependency is the central finding: Knowledge Catalog uses Gemini to perform semantic extraction and build the context graph. When Knowledge Catalog determines what context an agent receives, and that context graph was built by Gemini, the enterprise is consuming Gemini’s judgment about what its own data means — at the governance layer, before any application-level inference occurs. No other vendor has this pattern. Dell’s MetadataIQ indexes deterministically. AWS’s Glue doesn’t use Nova to enrich its catalog. HPE’s Ezmeral doesn’t use a foundation model for data semantics. VAST’s catalog is storage-native, not model-powered. The 1A→2C connection: Knowledge Catalog is not a passive registry. It makes decisions about what context agents receive, how data assets are ranked for retrieval, and which governance policies apply. The context graph determines agent grounding — an orchestration function, not a storage function. Governance gap: Knowledge Catalog is optimized for GCP. Third-party integrations federate INTO Knowledge Catalog, not out of it. An enterprise running PowerScale + S3 + GCS cannot use Knowledge Catalog as a federated surface across all three without making GCP the metadata authority.

◆ Borrowed Judgment

The context graph as borrowed judgment: when Knowledge Catalog builds entity relationships and business meanings, every agent that queries the graph inherits its representation of reality. If Gemini’s semantic extraction misclassifies a data asset, every agent grounded in that context acts on the incorrect interpretation. Borrowed judgment at the governance layer — before application-level reasoning. Smart Storage as ingest-time judgment: a single Gemini classification at ingest (‘this document is about Project X’) becomes a persistent governance fact. Limited mechanisms to audit or correct model-generated metadata at scale. Comparison to AWS: AWS classifies 1A as Delegated because Lake Formation enforces customer-defined policies. Google’s 1A is Ceded because Knowledge Catalog generates governance intelligence using Gemini. The enterprise on AWS retains governance judgment. The enterprise on Google inherits it.

◆ Working Notes

The Layer 1A / 2C boundary question: Knowledge Catalog’s context graph is architecturally Layer 1A (data catalog) but functionally Layer 2C (determines agent grounding and context routing). Model-powered governance layers that make orchestration decisions are functionally 2C even when architecturally 1A.

Layer 1BContext Management & RetrievalCeded — Model-Powered Prep▼

Low-latency retrieval for RAG — vector/hybrid search, context windows

Vendor-Provided

BigQuery MLCeded

In-database ML training and inference. Supports linear/logistic regression, K-means, time series, XGBoost, DNNs, imported TF/PyTorch models. Collapses the boundary between data preparation (1B) and AI runtime (2B) by running ML directly in the warehouse.

Dataflow + DataprocCeded

Managed Apache Beam (batch+stream) and managed Spark/Hadoop for large-scale processing. Open-source frameworks, Google-managed execution.

LookML Agent (Gemini-Powered, Preview)Ceded

Derives semantic models from documentation using Gemini — automates the business logic capture that traditionally requires manual data engineering. Second Gemini recursion point: Gemini interprets what data MEANS at the business logic layer, generates the semantic model, and agents query through that model. If the interpretation of ‘revenue’ is subtly wrong, every agent inherits the error.

Data Agent Kit (Open-Source)Delegated

MCP-based agents packaged as tools and skills. Supports Claude Code, Gemini CLI, Codex, VS Code. Enables intent-driven development: practitioners define goals, agents handle implementation. Creates governance recursion: the agent builds the pipeline that prepares the data that feeds the agent.

Vertex AI Feature StoreCeded

Managed feature serving for online/offline ML models and agents. Consistent feature serving across training and inference. 1B→2B bridge.

NVIDIA-Provided

Assessment pending

◆ Gap Analysis

No meaningful capability gap. Most mature Layer 1B in the assessment series. BigQuery ML eliminates data-to-model handoff. LookML Agent automates semantic model construction. Data Agent Kit enables agent-driven pipeline development. The gap is governance over model-generated data artifacts. When LookML Agent generates a semantic model, when Data Agent Kit writes a pipeline, when BigQuery ML trains a model — who reviews the output for correctness? These are AI-generated artifact governance problems. Google provides no productized capability for governing model-generated data artifacts at scale. Data Agent Kit’s explicit support for Claude Code and non-Google tooling is strategically significant — the one point in the stack where third-party model access is genuinely equal.

◆ Borrowed Judgment

The semantic model as borrowed judgment: when LookML Agent generates definitions, every analytics query and agent interaction using those definitions inherits Gemini’s interpretation of business logic. Powerful (automates weeks of manual semantic modeling) and risky (embeds model judgment in the analytical foundation). The pipeline-building agent as borrowed judgment: Data Agent Kit agents write Dataflow jobs and BigQuery transformations. The enterprise inherits the agent’s data engineering judgment — join strategies, filter logic, null handling. Previously human expertise, now model-generated.

◆ Working Notes

The comparison to AWS SageMaker Unified Studio: AWS provides a single governed environment across services (service-wide integration). Google achieves integration through BigQuery spanning storage, analytics, ML, and governance (service-deep integration). Google = tighter integration at cost of BigQuery lock-in. AWS = service diversity at cost of integration complexity.

Layer 1CData Movement & PipelinesCeded▼

Move/transform data — caching, streaming, cross-cloud federation

Vendor-Provided

Cloud Storage Rapid Tier (Colossus)Ceded

Sub-millisecond caching layer between persistent storage and compute. Previously internal-only, now customer-accessible.

Managed LustreCeded

10 TB/s bandwidth (10x YoY, claimed 20x faster than other hyperscalers), 80 PB capacity. RDMA-enabled. Training data movement layer between Cloud Storage and TPU/GPU clusters.

Cross-Cloud Lakehouse (Preview)Ceded

Agentic AI workflows access data across AWS and Azure without egress — querying data in place rather than copying. Eliminates ETL and cross-platform data movement costs. Extends compute to wherever data sits. Google’s query engine becomes the universal data access layer regardless of physical data location.

BigLakeCeded

Unified data fabric across data lake (Cloud Storage) and warehouse (BigQuery). Single schema, multiple processing engines. Row/column-level governance via BigQuery Storage API across all access paths including open-source engines. Multi-cloud via BigQuery Omni.

Smart Storage Ingest EnrichmentCeded

Data entering Google Cloud is auto-enriched by Gemini with metadata and context tags. Data is not just moved — it is interpreted on arrival. No parallel in other vendors’ data movement layers.

NVIDIA-Provided

Assessment pending

◆ Gap Analysis

Cross-Cloud Lakehouse is the most strategically important Layer 1C capability in the assessment series. It represents a fundamentally different approach to data gravity: extend compute to wherever data sits rather than moving data to compute. The DAPM implication: if Cross-Cloud Lakehouse delivers, the enterprise doesn’t need to move data to GCP. That reduces data lock-in at storage. But it increases lock-in at the query layer — analytical capabilities depend on Google’s query engine reaching across clouds. Data sovereignty improves. Analytical sovereignty does not. The data movement / enrichment coupling: Smart Storage’s Gemini enrichment at ingest means Layer 1C (movement) and Layer 1A (governance) are coupled through model inference. Moving files into Cloud Storage triggers model inference generating metadata that propagates into the context graph. No other vendor couples data movement with model-powered enrichment. Cross-Cloud Lakehouse is in preview. Performance, cost, and governance characteristics at enterprise scale are unproven.

◆ Borrowed Judgment

Cross-Cloud Lakehouse as borrowed query optimization: when Google’s engine optimizes a cross-cloud query, the enterprise inherits Google’s optimization judgment. Opaque and unchallengeable — cannot tune the cross-cloud query plan or audit how Google’s engine accesses data in a competing cloud provider’s storage. The enrichment coupling: data engineers moving files into Cloud Storage unknowingly trigger model inference. Convenient (automatic enrichment) and opaque (the engineer may not know Gemini is interpreting their data on arrival).

◆ Working Notes

The asymmetry between data plane federation (Cross-Cloud Lakehouse) and control plane federation (absent at 2C) is a structural finding. Google invests in making data accessible across clouds but not in making agent governance portable across clouds. Data accessibility without orchestration portability draws workloads toward GCP as the governance center.

Layer 2AInfrastructure OrchestrationCeded / Delegated (GDC)▼

GPU scheduling, capacity management, autoscaling, sovereign deployment

Vendor-Provided

GKE + GKE Agent SandboxCeded

Managed K8s with AI-era extensions. Agent Sandbox: gVisor-based secure isolation, 300 sandboxes/second/cluster with sub-second time to first instruction. Infrastructure built for the agentic era, not retrofitted.

Fluid ComputeCeded

GCE + GKE dynamically shifting workloads in real-time. CPUs for branchy agent logic, secure sandboxes, RL, SLM inference, RAG. GPU/TPU for training and large-model inference. Proto-Layer 2C: routes based on workload characteristics, not business context.

GDC (Sovereignty Analysis)Delegated

On-prem Google Cloud services: GKE, Agent Platform, managed storage, Gemini Flash, Blackwell GPUs. Air-gapped for sensitive workloads. Addresses data sovereignty (where computation happens) but NOT judgment sovereignty (whose model drives computation). GDC runs Gemini on-prem — same recursive dependency, inside the enterprise perimeter.

Capacity ManagementCeded

CUDs (1yr/3yr), on-demand, preemptible/spot, Dynamic Workload Scheduler. Same pattern as AWS — capacity acquisition (2A), not workload placement (2C).

NVIDIA-Provided

NVIDIA GPU Operator (on GKE)

Available for NVIDIA instances on GKE. Google manages the GPU integration layer.

◆ Gap Analysis

GKE is the most mature managed K8s for AI workloads. GKE Agent Sandbox has no equivalent in Dell, HPE, or AWS portfolios — 300 sandboxes/second is built for agentic workload density. Fluid Compute sits at the 2A/2C boundary. Its dynamic workload shifting is more than capacity acquisition — runtime decisions about which compute type serves which workload. But less than full 2C — routes on workload characteristics, not business context (data residency, compliance tags, cost targets). The Fluid Compute → Knowledge Catalog connection does not exist: workload placement does not consult governance metadata. Same Infrastructure Layer 2C gap every vendor has. GDC: the full sovereignty analysis reveals that data sovereignty ≠ judgment sovereignty. Knowledge Catalog on GDC uses Gemini. Smart Storage on GDC uses Gemini. Agent Platform on GDC uses Google’s runtime. The enterprise gains physical sovereignty but retains the same judgment concentration. The ‘self-driving cloud’ narrative implies Gemini-powered infrastructure operations — autonomous root-cause analysis on infrastructure telemetry. If the Reasoning Plane is itself Gemini-powered, the operational intelligence and application intelligence are the same intelligence.

◆ Borrowed Judgment

GKE scheduling as borrowed judgment: enterprise inherits Google’s scheduling decisions. If Google uses Gemini-driven optimization, the enterprise borrows both traditional heuristics and model reasoning, without distinction. GDC as borrowed judgment in sovereign packaging: physical control over facility, Google’s judgment in software, model, governance, and operations. Data sovereignty with judgment concentration. Fluid Compute as proto-2C borrowed judgment: when Fluid Compute routes agent work to CPU vs GPU, that routing is Google’s judgment about optimal compute matching. Enterprise doesn’t configure the routing policy.

◆ Working Notes

Data sovereignty vs judgment sovereignty: the 4+1 framework should distinguish between where data resides (which GDC addresses) and whose model’s reasoning shapes the AI system (which GDC does not address). An enterprise running GDC air-gapped has data sovereignty while fully ceding judgment sovereignty.

Layer 2BApplication Runtime & ExecutionCeded — Model-Integrated Stack▼

Model serving, agent execution, inference APIs, frameworks

Vendor-Provided

Gemini Enterprise Agent PlatformCeded

Unified platform for building, scaling, governing, optimizing agents. Subsumes all Vertex AI services. Agent Studio (low-code), ADK (code-first, Python/Go/Java/TypeScript, model-agnostic, open-source), Model Garden (200+ models incl Gemini, Claude, Llama, Gemma), Agent Runtime, Agent-to-Agent Orchestration, Agent Identity (GA), Agent Gateway, Agent Observability, Agent Registry, Memory Bank, Antigravity (desktop app + CLI).

The House Model AdvantageCeded

Gemini on TPU: silicon designed for this model, networking designed for its topology, distributed runtime (Pathways) built for its coordination, inference optimized for its architecture, governance (Knowledge Catalog) powered by it, orchestration defaults to it. No other vendor achieves this degree of vertical optimization. Third-party models (Claude, Llama) run on NVIDIA GPUs — supported but not co-optimized. The playing field is structurally tilted.

Frameworks (JAX, Pathways, TorchTPU, vLLM)Ceded

JAX: Google’s ML framework optimized for TPU. Pathways: distributed runtime for superpod-scale training. TorchTPU: full PyTorch support on TPUs (concession to ecosystem). vLLM: optimized across GPU + TPU. llm-d: open-source K8s-native inference serving (multi-vendor project).

NVIDIA-Provided

NVIDIA GPU Instances

A3/A3 Mega/A3 Ultra/A5X for third-party model inference. CUDA ecosystem required for non-Gemini models.

◆ Gap Analysis

Layer 2B is the center of gravity for the model-integrated stack. The model provider, runtime provider, and infrastructure provider are the same company. When the enterprise runs Gemini on Agent Platform on TPU, it borrows Google’s judgment at the model layer, runtime layer, framework layer, and silicon layer simultaneously. A single entity’s priorities shape the entire execution path. The Agent Platform collapses Layers 2B, 2C, and 3 into a single product surface: Agent Runtime (2B infrastructure), Agent Identity/Gateway/Registry/Orchestration/Observability (2C governance), Agent Studio/ADK/Antigravity (Layer 3 development). The product boundary does not align with the architectural boundary. AWS separates these: Bedrock (model access) is distinct from AgentCore Runtime (agent execution) is distinct from AgentCore Policy (governance). AWS’s separation preserves architectural boundaries the enterprise can independently govern. Google’s collapse optimizes integration but prevents swapping the governance layer (2C) while keeping the runtime (2B). The NVIDIA dependency at 2B is optional for Gemini (TPU-native) but required for third-party models. The enterprise using Claude on Google Cloud pays a structural performance tax — Claude runs on NVIDIA GPUs through a runtime designed for Gemini. Model Garden’s 200+ models are API-equal but not silicon-equal.

◆ Borrowed Judgment

The model-integrated runtime: Gemini on TPU inherits Google’s judgment at model layer (training data, alignment, safety), runtime layer (scheduling, scaling, session management), framework layer (JAX/Pathways optimization), and silicon layer (TPU architecture). Most concentrated borrowed judgment in the assessment series. The open-source hedge: llm-d, TorchTPU, vLLM, ADK provide genuine open alternatives. Google opens components that reduce adoption friction (frameworks, SDKs) while keeping authority-concentrating components closed (Agent Runtime infrastructure, Agent Gateway, Pathways). Production deployment pulls open tools into Google’s managed surface where authority shifts from Retained to Ceded.

◆ Working Notes

The 2B/2C collapse prevents the enterprise from independently governing the orchestration layer. An enterprise that wants Google’s Agent Registry and Agent Identity but AWS’s Bedrock for model access and its own governance engine for policy enforcement cannot compose that architecture. The components are bundled.

Layer 2CAgentic Infrastructure — The Reasoning PlaneCeded — Productized but Captive▼

Policy-driven placement and resource coordination — the Autonomy Layer

Vendor-Provided

Agent Identity (GA)Ceded

Agents as identity principals with authentication, authorization, audit. Control plane function: determines which agents exist as governed entities.

Agent GatewayCeded

Protocol-level governance for MCP and A2A communications. Security partner integrations (Broadcom, Check Point, Cisco, CrowdStrike, F5, Netskope, Okta, Palo Alto, Zscaler). Spans Layer 0 (networking), 2B (runtime), and 2C (orchestration).

Agent-to-Agent OrchestrationCeded

Deterministic multi-agent workflow routing. Control plane function: determines which agent handles which subtask.

Agent RegistryCeded

Catalog of agents with ownership, capabilities, protocols, invocation details. Administrator-controlled discoverability. Control plane function: determines which agents are available and who can use them.

Agent ObservabilityCeded

Monitoring, tracing, debugging across production agent populations. Feedback loop for detecting faulty reasoning and intervening.

NVIDIA-Provided

No NVIDIA Layer 2C Dependency

All Layer 2C components are Google IP. NVIDIA does not control governance, policy, or reasoning in Google’s stack.

◆ Gap Analysis

Google’s Intelligence Layer 2C is the most complete productized offering in the assessment series: Agent Identity + Gateway + Registry + Orchestration + Observability + Memory Bank. Together they constitute a genuine control plane for agent governance. Infrastructure Layer 2C — the autonomous placement engine — is NOT built as a customer-configurable product. The capacity primitives (Fluid Compute, CUDs, DWS) are building blocks, but they don’t compose into a policy-driven placement engine querying Knowledge Catalog governance metadata. Same gap as AWS and every other vendor. Google’s implicit Layer 2C is the most sophisticated in the assessment: managed services make autonomous placement, scaling, routing, and capacity decisions invisibly. The enterprise cannot see, configure, audit, or override these decisions. The model-integrated Reasoning Plane: if Google’s ‘self-driving cloud’ uses Gemini for infrastructure decisions, then the model powering the enterprise’s agents (Layer 3) is the same model governing agent orchestration (Intelligence 2C) is the same model deciding where agents run (Infrastructure 2C). One model’s judgment pervades every decision surface. Cross-cloud orchestration gap: Google federates the data plane (Cross-Cloud Lakehouse) but NOT the control plane. Agent Platform governs GCP agents only. Enterprise running agents across multiple clouds has no cross-platform agent governance surface — unless all agents route through Google’s Agent Gateway, which cedes cross-cloud governance to Google. The captive-but-best dilemma: this is evidence the control plane CAN be built as a coherent capability. The enterprise architect who wants it has one option: adopt Google Cloud. The federated alternative does not exist.

◆ Borrowed Judgment

The captive control plane: enterprise inherits Google’s orchestration model — deterministic routing, Google-managed identity, Google-governed protocols. Well-engineered but unchallengeable — cannot substitute alternative orchestration logic within the Agent Platform boundary. The model-powered control plane: if the Reasoning Plane uses Gemini for infrastructure decisions, a model judgment error at the control plane layer is invisible to the enterprise, with no fallback to human decision-making. Intelligence 2C: Low borrowed judgment in the sense that the components are productized and configurable. High borrowed judgment in the sense that the governance logic itself (Agent Gateway protocol decisions, Agent Identity authentication model, Orchestration routing patterns) is Google’s, not the enterprise’s.

◆ Working Notes

Google’s 2C proves the Control Plane Working Notes thesis: the control plane can be built. The question is whether it can be liberated from the vendor boundary — and whether the model-integrated dimension (control plane powered by the same model it governs) is a pattern to replicate or to avoid. The asymmetry: data plane federates (Cross-Cloud Lakehouse), control plane does not. This serves Google’s strategic interest — data accessibility without orchestration portability draws workloads toward GCP as governance center.

Layer 3 (+1)AI Application Layer — The Value PlaneOpen Model Layer, Captive Platform▼

AI-powered business capabilities — models, agents, business logic

Vendor-Provided

Gemini Model FamilyCeded

Gemini 3.1 Pro, Gemini 3.5, Gemini Flash. The model the entire stack was designed around. Gemma open-weight models for self-hosting (the one offering where enterprise can Retain model authority).

Model Garden (200+ Models)Delegated

Gemini, Claude Opus/Sonnet/Haiku, Meta Llama, Gemma, open-source models. Model Evaluation service. Broadest model catalog of any cloud provider. Model-agnostic claim genuine at Layer 3 — more so than any other layer.

Application SurfacesRetained / Delegated

Agent Studio (low-code), Agent Designer (no-code in Gemini Enterprise app), ADK (code-first, open-source, model-agnostic). Gemini Enterprise app for agent discovery and the Deep Research agent.

Consumer-Enterprise Feedback LoopCeded

Gemini powers Google Search, Gmail, Docs, Photos, Android, Chrome. Workspace Intelligence uses Gemini for agentic work. Model improvements from billions of consumer interactions directly benefit enterprise workloads. But: consumer-driven alignment and safety tuning may not align with enterprise needs.

Google Antigravity 2.0 (Agent-First Development Platform)Ceded

Announced I/O 2026. Standalone desktop app + CLI + SDK — a full developer platform built around agent orchestration. Multi-agent parallel execution: orchestrate multiple agents and execute tasks simultaneously. Dynamic subagent workflows and scheduled background automation. Antigravity CLI (Go-based, replacing Gemini CLI — deprecated June 18, 2026) for terminal-native multi-agent workflows. Antigravity SDK for building custom agents with templates in AI Studio. Powered by Gemini 3.5 Flash (co-developed using Antigravity). Native voice command support. Ecosystem integrations: Google AI Studio, Android, Firebase. Export tool for AI Studio → local development. Search integration: real-time custom UI generation within Google Search answers. AI Ultra plan ($100/month, 5x usage limits). Google's most aggressive move in the agentic coding market — positioned as the hub for multi-agent development workflow orchestration, not just code assistance.

NVIDIA-Provided

NVIDIA Models via Model Garden

NVIDIA Nemotron and other NVIDIA models available alongside all other providers.

◆ Gap Analysis

No meaningful capability gap. Broadest model catalog. Most portable agent development framework (ADK). Application surfaces from no-code through code-first. Google deliberately keeps Layer 3 more open than any other layer — while ensuring every Layer 3 application is gravitationally pulled toward Agent Platform (2B/2C). By keeping Layer 3 open, Google maximizes platform adoption: enterprises wanting Claude on Google Cloud still consume Agent Platform’s runtime, identity, gateway, registry, observability. The model is portable; the platform is captive. Consistent with the 4+1 model’s prediction that vendor lock-in concentrates at Layer 2B/2C, not Layer 3. Google has understood this prediction and built strategy accordingly. Code portability vs operational portability: ADK is open-source and model-agnostic — agent code CAN run on AWS or on-prem K8s. But Agent Registry, Memory Bank, Agent Identity, Agent Gateway, Agent Observability are Google Cloud services with no portable equivalents. Agent code is an asset the enterprise owns. Agent operations are an asset it rents. Antigravity 2.0 deepens the Layer 3 gravitational pull toward Google's platform. The desktop app + CLI + SDK creates a development surface that integrates directly with Agent Platform (2B/2C): agents built in Antigravity inherit Agent Platform's identity, gateway, registry, and observability. The Gemini CLI deprecation (June 18, 2026) forces migration to Antigravity CLI — consolidating Google's developer AI surface into one opinionated platform. The SDK enabling custom agent templates in AI Studio means Antigravity is not just a coding tool but an agent construction platform that feeds directly into the Gemini Enterprise Agent Platform. Compare to AWS Kiro (spec-driven, methodology-opinionated, Bedrock-native) and GitHub Copilot (IDE-embedded, multi-model, GitHub-native). Google's differentiator is multi-agent parallel orchestration — Antigravity coordinates multiple agents simultaneously rather than single-agent sequential interaction. This maps to the 4+1 model's Layer 2C vision: orchestrating multiple agents is a control plane function that Antigravity surfaces through a developer tool. The consumer-enterprise feedback loop extends to Antigravity: Google is using Antigravity's capabilities in consumer Search to generate real-time custom UIs as part of search answers. Developer tool innovations flow to consumer products and back — a flywheel no other vendor in the assessment possesses.

◆ Borrowed Judgment

Gemini as borrowed Layer 3 judgment: alignment changes affect agents (Layer 3), governance enrichment (Layer 1A via Knowledge Catalog), semantic models (Layer 1B via LookML Agent), and potentially infrastructure operations (Layer 2C via self-driving cloud). A single alignment decision propagates across the entire model-integrated stack. Platform defaults: Agent Studio and Agent Designer default to Gemini. Enterprise that adopts without explicitly selecting alternatives inherits Google’s model preference as a default rather than a decision. Strategic openness as borrowed judgment about lock-in location: Google’s decision to keep Layer 3 open and concentrate lock-in at 2B/2C is itself borrowed judgment the enterprise inherits. Evaluating Google on model diversity without evaluating platform captivity accepts Google’s framing of where portability matters.

◆ Working Notes

The consumer-enterprise feedback loop has no parallel in the assessment. Model improvements from billions of consumer interactions benefit enterprise workloads — but consumer-driven alignment may constrain enterprise use cases. If Google tightens content policies for consumer safety, enterprise agents inherit that tightening. The Gemini CLI → Antigravity CLI forced migration is a significant authority move. Over 100,000 GitHub stars on Gemini CLI — all those developers must migrate to Antigravity by June 18, 2026. This concentrates Google's developer AI surface into one platform and one billing model (AI Ultra at $100/month). The deprecation timeline is aggressive but consistent with Google's pattern of consolidating developer tools around Gemini. Antigravity 2.0's scheduled tasks capability (agents running automatically in the background) converts the developer tool from a single-turn interaction to a persistent automation pipeline. This blurs the boundary between Layer 3 (application) and Layer 2C (orchestration) — when Antigravity schedules background agents to perform tasks autonomously, who governs those agents? The answer is Agent Platform — reinforcing the Layer 2C gravitational pull.

✦ Summary Finding

4+1 Layer AI Infrastructure Model · Vendor Assessment Series · The CTO Advisor LLC · thectoadvisor.com