
Integrating WebMCP: Architectural Patterns for Browser-Native Agent Support

Plotono Team

Introduction

We integrated WebMCP across the entire Plotono platform. 85 tools, every major surface. This post breaks down the architectural patterns that made it possible, written for engineers building agent-compatible applications or evaluating WebMCP for their own platforms.

Plotono is a visual data pipeline and BI platform that compiles to SQL. It spans pipeline building, dashboards, visualizations, data lakes, connectors, data dictionary management, workflows, workboards, data quality assertions, and multi-tenant workspace administration. Every one of those surfaces now exposes structured tools to browser-resident agents through the WebMCP Imperative API.

The key finding: the scope of integration wasn’t determined by WebMCP’s complexity (the API is small). It was determined by whether the existing platform had the right structural properties. Clean API boundaries, a typed authorization model, per-page state isolation, and comprehensive test coverage made 85 tools tractable.

WebMCP: Spec Status and Browser Surface Area

WebMCP is a draft specification from the W3C Web Machine Learning Community Group, co-authored by Google and Microsoft. It’s still in early incubation, not yet on the formal W3C Standards Track. The first implementation shipped in Chrome Canary 146+ behind a feature flag. No stable-channel browser supports it today.

The specification defines two API surfaces:

  1. Declarative API: a markup-based approach where tools are declared in HTML attributes. Think of it as the <form> analog. The browser parses the declarations and exposes them to agents without JavaScript.

  2. Imperative API: a JavaScript API rooted at navigator.modelContext that lets applications programmatically register and unregister tools at runtime. This is what Plotono uses, because our tool surface is dynamic. The available tools depend on which page the user is on, what state the editors are in, and what permissions the current user holds.

The Imperative API surface is small. The core operations:

navigator.modelContext.registerTool(toolDefinition)
navigator.modelContext.unregisterTool(toolName)

Each tool definition includes a name, a natural-language description, a JSON Schema for parameters, and a handler function. When an agent invokes a tool, the browser calls the handler with the parsed parameters and returns the result to the agent.
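To make the registration-and-invocation flow concrete, here is a minimal sketch. Since no stable browser ships `navigator.modelContext` yet, the `modelContext` stub below (and the `get_pipeline_name` tool) are illustrative stand-ins, not part of the draft spec:

```javascript
// Minimal stub standing in for the browser-provided navigator.modelContext,
// so the flow can run anywhere. The stub's invoke() simulates what the
// browser does on an agent call: look up the tool and run its handler.
const modelContext = {
  tools: new Map(),
  registerTool(def) { this.tools.set(def.name, def); },
  unregisterTool(name) { this.tools.delete(name); },
  async invoke(name, params) {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`unknown tool: ${name}`);
    return tool.handler(params);
  },
};

// A trivial read-only tool: name, description, schema, handler.
modelContext.registerTool({
  name: "get_pipeline_name",
  description: "Returns the name of the currently open pipeline.",
  parameters: { type: "object", properties: {} },
  handler: async () => ({ name: "orders_daily_rollup" }),
});

// Simulated agent invocation.
modelContext.invoke("get_pipeline_name", {}).then((result) => {
  console.log(result.name); // "orders_daily_rollup"
});
```

The essential point is that the handler receives parsed, structured parameters and returns a structured result; there is no DOM in the loop.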

The agent never touches the DOM. It never interprets screenshots. It calls typed functions with validated inputs and receives structured outputs. That’s the fundamental difference between WebMCP and the screen-scraping approaches that current browser agents rely on.

The Integration Architecture

Per-Page Lifecycle Scoping

The most important design decision was scoping tool registration to page lifecycle. When a user navigates to a page, that page’s tools register. When they navigate away, those tools unregister. There’s no global tool registry that persists across navigation.

This matters for two reasons:

  1. Relevance. An agent operating on the dashboard builder should see dashboard tools, not pipeline node tools. Exposing the full 85-tool surface on every page would overwhelm agent context windows and increase the probability of incorrect tool selection.

  2. State validity. Many tools interact with page-local state: the current editor contents, the selected node, the active filter configuration. If tools remained registered after the user navigated away, the handler functions would reference stale or destroyed state. Per-page scoping eliminates this class of bugs by construction.

The implementation is straightforward. Tools register during page mount and unregister during page unmount. Each page module declares its own tool set, and the registration/unregistration is tied to the component lifecycle of that page. No coordination layer needed. Each page is self-contained.
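A framework-agnostic sketch of that lifecycle tie-in (the `registerPageTools` helper and the tool names are assumptions for illustration; in the browser, `ctx` would be `navigator.modelContext`):

```javascript
// Register a page's tools on mount; the returned cleanup function is
// called on unmount so no tool outlives the page that owns it.
function registerPageTools(ctx, tools) {
  for (const tool of tools) ctx.registerTool(tool);
  return function unregister() {
    for (const tool of tools) ctx.unregisterTool(tool.name);
  };
}

// Stub context for illustration.
const ctx = {
  tools: new Map(),
  registerTool(def) { this.tools.set(def.name, def); },
  unregisterTool(name) { this.tools.delete(name); },
};

// On page mount:
const cleanup = registerPageTools(ctx, [
  { name: "add_dashboard_item", handler: async () => ({ ok: true }) },
  { name: "remove_dashboard_item", handler: async () => ({ ok: true }) },
]);
console.log(ctx.tools.size); // 2

// On page unmount:
cleanup();
console.log(ctx.tools.size); // 0
```

In a component framework, the mount/unmount pair maps directly onto the page component's lifecycle hooks.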

Two Tool Patterns

Across 85 tools and 10+ platform surfaces, two distinct architectural patterns emerged. Understanding when to use each is probably the most useful takeaway from this integration.

Pattern 1: Ref-Based State Bridges

Complex editors (the visual pipeline builder, the dashboard layout editor, the visualization configuration panel) maintain rich, mutable state that drives the UI render cycle. The tool handlers need to read and mutate this state, but they run outside the render cycle. They’re callback functions invoked by the browser runtime, not event handlers triggered by user interaction.

The naive approach, having tool handlers directly call state setters, creates race conditions and stale closure bugs. The state captured in the handler closure at registration time diverges from the actual state as the user (or agent) makes changes.

The solution is a ref-based state bridge. A stable reference object is created at the page level and passed to both the UI layer and the tool registration layer. The ref always points to the current state. Tool handlers read from and write through the ref, bypassing the closure staleness problem entirely.

[Agent invokes tool]
       |
       v
[Browser calls handler]
       |
       v
[Handler reads current state via ref]
       |
       v
[Handler mutates state via ref]
       |
       v
[UI observes state change and re-renders]

This pattern is necessary anywhere the tool needs to interact with a stateful editor. In Plotono, it covers pipeline node manipulation, dashboard item positioning, visualization column mapping, and filter configuration.

The ref-based bridge has a nice property: it decouples tool handlers from the UI framework’s reactivity model. The handlers are plain functions that read and write through a stable pointer. They don’t need to understand rendering, diffing, or update batching. That made them straightforward to test in isolation. We could verify tool behavior without rendering the editor UI at all.
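A sketch of the bridge (all names here are illustrative). The ref is a stable object whose `.current` always points at the latest editor state, so a handler registered once at mount never reads a stale closure:

```javascript
// Stable ref created at the page level and shared with the UI layer.
const pipelineRef = {
  current: { nodes: [], nextId: 1 },
};

// Handler defined at registration time; it captures the ref, not the state.
async function addNodeHandler({ node_type }) {
  const state = pipelineRef.current;              // always the live state
  const node = { id: state.nextId, type: node_type };
  pipelineRef.current = {
    nodes: [...state.nodes, node],                // immutable update so the
    nextId: state.nextId + 1,                     // UI can observe the change
  };
  return { node_id: node.id, node_count: pipelineRef.current.nodes.length };
}

// The UI writes through the same ref; simulate a user edit, then an
// agent invocation. The handler sees the user's edit, not the state
// that existed when it was registered.
pipelineRef.current = { nodes: [{ id: 1, type: "source" }], nextId: 2 };

addNodeHandler({ node_type: "filter" }).then((result) => {
  console.log(result.node_id, result.node_count); // 2 2
});
```

This is also why the handlers are easy to test in isolation: the test sets up a ref, calls the handler, and inspects the ref, with no rendering involved.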

Pattern 2: Direct API Calls

Management pages (workspace administration, connector setup, data lake configuration, data dictionary browsing) are fundamentally CRUD interfaces over the platform’s backend API. The page state is a cache of server state, not a rich local model.

For these pages, the tool handlers bypass local state entirely and call the platform’s API layer directly. The handler receives parameters from the agent, constructs an API request, executes it, and returns the response. If the page needs to reflect the change, it invalidates its cache and re-fetches.

[Agent invokes tool]
       |
       v
[Browser calls handler]
       |
       v
[Handler constructs API request]
       |
       v
[API call executes with full auth context]
       |
       v
[Handler returns structured response to agent]
       |
       v
[Page cache invalidated; UI re-fetches]

This pattern is simpler and faster to implement. By tool count, the split between the two patterns is close to even, and which one a given tool uses depends entirely on whether the page maintains rich local state or is a thin cache over server state.

A lightweight context object is shared across all tools on a given page, carrying the current tenant and workspace. This context feeds into every API call, ensuring the authorization model is enforced identically whether the action originates from a human click or an agent tool invocation.
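A sketch of the direct-API pattern with that context threaded through (the endpoint path, header names, `apiFetch` stub, and response shape are all assumptions for illustration, not Plotono's actual API):

```javascript
// Page-level context shared by every tool on the page.
const pageContext = { tenantId: "t_acme", workspaceId: "ws_main" };

// Stand-in for the platform's HTTP client; a real implementation would
// call fetch() with the session's credentials.
async function apiFetch(path, { method, headers, body }) {
  return { status: 201, json: { id: "conn_42", ...JSON.parse(body) } };
}

async function createConnectorHandler(params) {
  // The handler ignores local page state entirely: agent params in,
  // API request out, with the tenant context attached so authorization
  // runs exactly as it would for a UI-initiated request.
  const response = await apiFetch("/api/connectors", {
    method: "POST",
    headers: {
      "X-Tenant-Id": pageContext.tenantId,
      "X-Workspace-Id": pageContext.workspaceId,
    },
    body: JSON.stringify({ name: params.name, kind: params.kind }),
  });
  // Structured result back to the agent; the page would now invalidate
  // its cache and re-fetch.
  return { connector_id: response.json.id, status: response.status };
}

createConnectorHandler({ name: "prod-postgres", kind: "postgres" })
  .then((r) => console.log(r.connector_id, r.status)); // conn_42 201
```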

Tool Schema Design

Every tool definition includes a JSON Schema for its parameters and a natural-language description. These two pieces of metadata make the tool set self-documenting. An agent can read the tool list, understand what each tool does, and construct valid invocations without any fine-tuning, training data, or out-of-band documentation.

A representative tool definition (pseudocode, not actual implementation):

name: "add_pipeline_node"
description: "Adds a new node to the current pipeline graph.
  Supported node types: source, filter, join, aggregate,
  select, extend, order_by, limit, sql.
  Returns the new node ID and updated graph metadata."
parameters: {
  type: "object",
  properties: {
    node_type: {
      type: "string",
      enum: ["source", "filter", "join", "aggregate",
             "select", "extend", "order_by", "limit", "sql"]
    },
    position: {
      type: "object",
      properties: {
        x: { type: "number" },
        y: { type: "number" }
      }
    },
    config: {
      type: "object",
      description: "Node-type-specific configuration.
        For source: { table, schema }.
        For filter: { expression }.
        For join: { join_type, on_expression }."
    }
  },
  required: ["node_type"]
}

The descriptions are dense and operational. They tell the agent not just what the tool does, but what the valid inputs are, what the output shape looks like, and what constraints apply. It’s interface documentation embedded in the tool definition, not marketing copy.

When we add a new node type, chart type, or assertion kind to the platform, the corresponding tool schema updates in the same change. There’s no separate integration layer to maintain. The tool definitions live next to the feature implementation, and they call the same code paths the UI uses.

Human-in-the-Loop: Scoping Destructive Actions

Not all 85 tools are equal in consequence. Reading a data lake schema is side-effect-free. Deleting a workspace is not. The tool surface includes a clear distinction between read operations, non-destructive writes, and destructive actions.

The highest-consequence operations, like saving a pipeline, surface a confirmation step: the tool handler prepares the action and returns a prompt for the agent to present to the user. Only after the user confirms does the action execute. For read operations and non-destructive edits (adding a node, configuring a chart, rearranging dashboard items), the agent operates freely because those changes only exist in the current editing session until explicitly saved.

This mirrors the platform’s existing UX: the save confirmation a human sees when clicking Save is the same gate an agent encounters. The WebMCP tools inherit guardrails from the underlying operations, not from a separate policy layer.

The practical effect: an agent can autonomously explore data, build pipelines, configure visualizations, and draft dashboards. But persisting those changes requires the user to approve. The agent does the work. The human retains control over consequences.
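One way to sketch that confirmation gate (the prepare/confirm split, token scheme, and all names below are illustrative, not Plotono's actual mechanism): the destructive tool never executes directly, it parks the action and returns a prompt for the agent to relay to the user.

```javascript
// Pending destructive actions, keyed by a one-time confirmation token.
const pendingActions = new Map();
let nextToken = 1;

// The "destructive" tool: prepares the action, returns a user-facing prompt.
async function savePipelineHandler({ pipeline_id }) {
  const token = `confirm_${nextToken++}`;
  pendingActions.set(token, { kind: "save_pipeline", pipeline_id });
  return {
    requires_confirmation: true,
    confirmation_token: token,
    prompt: `Save pipeline ${pipeline_id}? This overwrites the stored version.`,
  };
}

// Executes only after the user's answer comes back through the agent.
async function confirmActionHandler({ confirmation_token, approved }) {
  const action = pendingActions.get(confirmation_token);
  pendingActions.delete(confirmation_token);   // tokens are single-use
  if (!action) return { status: "error", reason: "unknown token" };
  if (!approved) return { status: "cancelled" };
  // ...execute the real save through the platform API here...
  return { status: "executed", kind: action.kind };
}

savePipelineHandler({ pipeline_id: "p_7" })
  .then((r) => confirmActionHandler({
    confirmation_token: r.confirmation_token,
    approved: true,
  }))
  .then((r) => console.log(r.status)); // executed
```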

Feature Detection and Graceful Degradation

The entire tool registration layer is gated behind a single feature detection check:

if (navigator.modelContext) {
  // register tools
}

If the browser doesn’t support WebMCP (which is every browser except Chrome Canary 146+ with the flag enabled, as of this writing), nothing happens. No polyfills. No fallback behavior. No error logging. The tool registration code is dead code on unsupported browsers, eliminated at the check.

That’s the right approach for a pre-Standards-Track API. The spec may change. The browser surface may evolve. Vendor prefixes may appear and disappear. By gating everything behind feature detection, the integration is zero-cost on unsupported browsers and forward-compatible with spec changes.

The platform’s behavior is identical with or without WebMCP support. Every action an agent can take through a tool, a human can take through the UI. The tools are a parallel interface, not a replacement. If WebMCP gets removed from Chrome tomorrow, no user workflow breaks.

Why Stable Platforms Integrate Faster

If you’re evaluating whether to invest in WebMCP (or any emerging browser API), our experience boils down to this: the speed of integration is determined by your existing architecture, not by the complexity of the new API.

WebMCP’s API surface is trivial. registerTool, unregisterTool, a JSON Schema, a handler function. Any competent engineer can wire up a single tool in an afternoon. The real question is whether you can wire up 85 without introducing regressions, security gaps, or state management bugs.

We could because of properties the platform already had:

Typed API contracts. Every backend operation has a defined request schema, response schema, and error type. Tool handlers construct API requests from agent-provided parameters, and the type system catches mismatches at build time, not at runtime when an agent sends an unexpected value.

Per-tenant authorization. Every API call carries a tenant context and is authorized against a policy engine. Tool handlers pass through the same authorization path as UI-initiated requests. We didn’t need to build a separate permission model for agent actions. The existing model already covers them.

Comprehensive test coverage. The pipeline compiler, dashboard system, and authorization layer have hundreds of tests each. When we wired tool handlers to these systems, we already knew the underlying operations were correct. The tests we wrote for the WebMCP layer cover tool registration, parameter validation, handler behavior, and lifecycle management, not the platform operations themselves. Those were already tested.

Page-scoped state management. Each page owns its state. There’s no global mutable state that multiple pages fight over. Per-page tool scoping was a natural fit: each page’s tools interact only with that page’s state, with no cross-page side effects to worry about.

If your platform lacks these properties, WebMCP integration becomes a forcing function to add them. That’s not necessarily bad. Typed APIs, proper authorization, and test coverage are worth having regardless. But the integration timeline will be proportional to your architectural debt, not to the complexity of WebMCP.

Rollout Strategy

We rolled out incrementally, starting with the most complex surface (the visual pipeline editor) to validate the ref-based state bridge pattern early. Subsequent surfaces alternated between bridge-based editors and API-call management pages, which let us stress-test both patterns against progressively different use cases.

Each surface shipped independently. The platform was usable by agents as soon as the first set of tools landed, with increasing capability as more surfaces came online. No big-bang integration.

What This Means for Agent Builders

If you’re building browser agents and evaluating WebMCP support, a few things stood out:

  1. Tool discovery is the API. Agents don’t need training data, documentation crawls, or custom integrations for WebMCP-enabled applications. The tool definitions include everything the agent needs: names, descriptions, parameter schemas, and return types. The protocol is the documentation.

  2. Tool count matters less than tool design. 85 tools sounds like a lot, but agents handle it well because the tools are scoped per-page. On any given page, an agent sees a focused subset (typically 8-22 tools depending on the surface), not the full 85. Context window pressure stays manageable.

  3. Structured invocation eliminates an entire class of agent failures. No screenshot interpretation. No coordinate calculation. No brittle CSS selector chains. The agent calls a typed function and gets a typed response. The failure modes are semantic (wrong tool choice, incorrect parameter values), not mechanical (missed click, stale DOM reference).

  4. The confirmation pattern is essential. Agents will make mistakes. Confirmation gates on destructive actions mean those mistakes get caught before they have consequences. Design your tool surface with clear read/write/destructive tiers from the start.

Conclusion

WebMCP is early. The spec is a draft. Browser support is a single vendor behind a flag. The W3C process is long, and the API surface will likely change before standardization.

But that doesn’t matter for the architectural argument. The pattern of exposing your platform’s capabilities as typed, discoverable tools that browser-resident agents can invoke is sound regardless of which specific API carries it to standardization. If WebMCP doesn’t become the standard, something structurally similar will. The investment in clean API boundaries, typed tool schemas, per-page lifecycle scoping, and human-in-the-loop confirmation flows will transfer to whatever emerges.

The deeper point is about the relationship between stability and adaptability. We could integrate 85 tools across the entire platform because the foundation was solid. The compiler works. The authorization model is airtight. The API contracts are typed. The tests pass. That foundation made it possible to adopt an experimental browser API without risk, because the new layer is thin, additive, and gated behind feature detection.

Stability isn’t the opposite of moving fast on emerging technology. It’s the prerequisite.


If you’re running Chrome Canary 146+ with WebMCP enabled, any compatible agent will discover the full tool surface when you open Plotono.

If you’re building agents and want a real-world WebMCP integration to test against, Plotono exposes one of the most comprehensive tool surfaces available today.
