
Your AI Agent Can See What Your Users See

Persona supports local tool calling, letting your AI agent read the DOM, call browser APIs, and interact with what's on screen. No server round-trips required.

Runtype Team

Here's something that bothers us about most web AI experiences: the agent has no idea what's on the screen.

A user is looking at a product page. They ask the chat widget a question about the product. The agent has to guess what they're looking at, or ask them to describe it, or rely entirely on whatever context was manually fed into the prompt.

Local tool calling fixes this.

What local tools are

Runtype's local tool calling lets you register tools that run directly in the browser. When the agent decides to use one of these tools, the execution happens on the client side. No server round-trip. No extra latency. The tool has access to everything the browser has access to.

That includes the DOM.

DOM awareness in practice

Register a tool that reads the current page's content, and your agent can understand exactly what the user is looking at.

Register a DOM-reading tool
// Register a tool that reads the current page content
const tools = [
  {
    name: 'read_page_content',
    description: 'Read the main content of the current page',
    parameters: {},
    execute: async () => {
      const main = document.querySelector('main') || document.body
      return { content: main.innerText, url: window.location.href }
    }
  }
]

When a user asks "what's the return policy for this product?", the agent can call read_page_content, see the product details on screen, and give a specific answer. Not a generic one.

This creates an experience that feels like the agent is actually browsing the site together with the user.

Beyond the DOM: browser APIs and existing frontend APIs

Local tools are not limited to reading the page. You can register tools that:

  • Call your existing frontend APIs: If your site already has a search endpoint that your frontend calls, register it as a local tool. The agent searches your catalog through the same endpoint your UI already uses, with no second integration path to your backend (see the search sketch further down).
  • Access browser APIs: localStorage, sessionStorage, clipboard, geolocation. Anything the browser exposes (sketched just below).
  • Trigger UI actions: Open a modal, navigate to a page, add an item to a cart. The agent becomes an active participant in the user experience, not just a text responder (also sketched below).
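To make the last two patterns concrete, here are two sketches in the same shape as read_page_content above. The 'cart' localStorage key and the openSizeGuide function are stand-ins for whatever state and functions your frontend actually exposes.

Browser-API and UI-action tools (sketch)
// Sketches only: 'cart' and window.openSizeGuide are hypothetical;
// substitute the state and functions your frontend already has.
const moreTools = [
  {
    name: 'read_cart',
    description: 'Read the items currently in the shopping cart',
    parameters: {},
    execute: async () => {
      // Browser API: read client-side state your UI already maintains
      const raw = localStorage.getItem('cart')
      return { items: raw ? JSON.parse(raw) : [] }
    }
  },
  {
    name: 'open_size_guide',
    description: 'Open the size guide modal for the current product',
    parameters: {},
    execute: async () => {
      // UI action: call a function your frontend already defines (hypothetical)
      window.openSizeGuide()
      return { opened: true }
    }
  }
]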

Real examples

Shopping assistant

The agent reads the current product page, understands what the user is browsing, and can answer questions about sizing, availability, or compatibility without any server-side context injection.
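One way to wire that up: read the JSON-LD structured data most commerce pages already embed. This is a sketch; it assumes the page includes a schema.org product block, which not every site does.

Read product data from JSON-LD (sketch)
{
  name: 'read_product_data',
  description: 'Read structured product data from the current page',
  parameters: {},
  execute: async () => {
    // Many commerce pages embed schema.org product data as JSON-LD
    const script = document.querySelector('script[type="application/ld+json"]')
    if (!script) return { product: null }
    try {
      return { product: JSON.parse(script.textContent) }
    } catch (err) {
      return { product: null }
    }
  }
}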

Brand agent with local search

Instead of the AI calling a backend search API (which adds latency and requires auth), register your frontend search function as a local tool. The agent calls it directly, gets results instantly, and incorporates them into its response.
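A sketch of that pattern, assuming a hypothetical /api/search endpoint; the parameters shape here is illustrative, so follow whatever schema format Runtype expects.

Register frontend search as a local tool (sketch)
{
  name: 'search_catalog',
  description: 'Search the product catalog',
  parameters: {
    query: { type: 'string', description: 'Search terms' }
  },
  execute: async ({ query }) => {
    // Reuse the same endpoint the site's own search box calls
    // ('/api/search' is a placeholder for your real endpoint)
    const res = await fetch(`/api/search?q=${encodeURIComponent(query)}`)
    return { results: await res.json() }
  }
}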

Support agent with page context

When a user reports a problem, the agent can read error messages on screen, check the current URL, and look at form state. It knows what the user sees before the user has to explain it.
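A context-gathering tool for that scenario might look like the sketch below. The [role="alert"] selector is an assumption, so match it to however your UI renders errors; note that it deliberately skips password fields.

Gather support context (sketch)
{
  name: 'read_error_context',
  description: 'Read visible error messages, the current URL, and form state',
  parameters: {},
  execute: async () => {
    // '[role="alert"]' is an assumption; match your own error markup
    const errors = [...document.querySelectorAll('[role="alert"]')]
      .map(el => el.innerText)
    // Skip password fields so credentials never reach the agent
    const fields = [...document.querySelectorAll('input, select, textarea')]
      .filter(el => el.type !== 'password')
      .map(el => ({ name: el.name, value: el.value }))
    return { url: window.location.href, errors, fields }
  }
}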

Why this matters

Most AI chat widgets operate in a vacuum. They know what's in the system prompt and what the user typed. That's it.

Local tool calling breaks that boundary. The agent sees the page. It can interact with the page. It can use the same APIs your frontend already uses.

This is a fundamentally different kind of AI experience. And it's available today.

Register a read_page_content tool, embed the widget, and your agent starts every conversation already knowing what the user is looking at.
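For illustration only, here's the whole flow in one place. Persona.init is a hypothetical stand-in, not Runtype's documented embed API; check the actual docs for the real call.

Putting it together (hypothetical embed)
// 'Persona.init' is a hypothetical stand-in for the real embed call;
// consult Runtype's docs for the actual widget API.
Persona.init({
  agentId: 'your-agent-id',
  tools: [
    {
      name: 'read_page_content',
      description: 'Read the main content of the current page',
      parameters: {},
      execute: async () => {
        const main = document.querySelector('main') || document.body
        return { content: main.innerText, url: window.location.href }
      }
    }
  ]
})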