Open Source · MIT Licensed

Reality Web
Intelligence

Browser-native AI for the web

A browser-mediated architecture that exposes on-device LLM capabilities to websites through an explicit, origin-scoped permission model and a shared model runtime. Websites don't run AI models — the browser does. Websites request intelligence — the user grants access.

On-Device Inference · Origin-Scoped Permissions · Zero Cloud · Shared Runtime · Metal Accelerated

A few lines. That's it.

Add on-device AI to any website — no model bundling, no API keys, no cloud costs.

your-website.js
// Check availability
const available = await rwi.isAvailable();
// Request permission
const { granted } = await rwi.requestPermission();
// Generate with streaming
const result = await rwi.generate({
  prompt: "What's RWI?",
  onToken: (token) => updateUI(token)
});
The Problem

Web AI is broken by design

Today, most AI integration in web applications is server-dependent: user data is transmitted off-device for inference, developers pay per-request API costs, and latency depends on external services.

Meanwhile, modern consumer devices — especially Apple Silicon Macs — can run mid-scale language models locally. What if the intelligence a website needs were already present on the user's device?

No existing system simultaneously provides a shared model, origin-scoped permissions, zero cloud dependency, and a deployment path independent of browser vendors.

Cloud APIs

Privacy risk, recurring costs, latency, fragmented implementations

In-Browser (WebLLM)

Per-origin model downloads — same model cached 5× for 5 sites

Chrome Prompt API

Single-vendor lock-in, model instability under vendor control

Local Runtimes (Ollama)

No web-origin-scoped permission surface for untrusted websites

Platform Intelligence

Not programmable as a web API — system features, not web primitives

How It Works

From request to response

The browser mediates every step — websites never touch the model directly.

01

SDK Injection

The browser injects window.rwi at document start — available before application scripts run.

02

Permission Request

The website requests access; the user sees a browser-native prompt and grants per-origin consent.

03

On-Device Inference

The request is routed to llama.cpp running locally. Tokens are generated via Metal-accelerated compute.

04

Streaming Response

Tokens stream back to the page in real time. The website updates its UI token-by-token.
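Taken together, steps 01–04 look like this from the website's side. The sketch below is illustrative, not the shipped SDK: the `fakeRWI` stub stands in for the browser-injected `window.rwi` so the flow can run anywhere, and only `isAvailable`, `requestPermission`, and `generate` are taken from the API shown earlier.

```javascript
// Stub standing in for the browser-injected window.rwi (illustration only).
const fakeRWI = {
  async isAvailable() { return true; },
  async requestPermission() { return { granted: true }; },
  async generate({ prompt, onToken }) {
    // Simulate token-by-token streaming from the local model.
    const tokens = ["RWI ", "is ", "browser-native ", "AI."];
    for (const t of tokens) onToken?.(t);
    return { text: tokens.join("") };
  },
};

// Website-side flow: detect the SDK, ask for consent, stream tokens.
async function askModel(rwi, prompt, updateUI) {
  if (!(await rwi.isAvailable())) return null;        // Step 01: SDK present?
  const { granted } = await rwi.requestPermission();  // Step 02: user consent
  if (!granted) return null;
  const result = await rwi.generate({                 // Steps 03–04: inference + streaming
    prompt,
    onToken: (token) => updateUI(token),
  });
  return result.text;
}
```

In a real page you would pass `window.rwi` instead of the stub, e.g. `askModel(window.rwi, "What's RWI?", appendToChat)`.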

Implementations

Two paths, one API

Both implementations expose the same window.rwi API. Websites work identically on either.

Rewebin

Native Browser

A custom macOS browser built on WebKit that integrates RWI natively. The browser is the trusted runtime — inference runs in-process via a native Swift service layer with llama.cpp XCFramework integration.

Note: Rewebin is a prototype built solely to demonstrate RWI's vision of AI built into future browsers. It has not been developed with the security hardening required for production deployment.

In-process inference (no IPC)
SwiftUI + WKWebView
Includes RWI Analyzer

Safari Extension

Extension-Mediated

A macOS container app hosts the on-device LLM runtime, while a Safari Web Extension injects the SDK and mediates per-origin requests — proving RWI can work without requiring a new browser.

Works in existing Safari
Manifest V3 extension
App hosts LLM runtime
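One plausible shape for the extension's mediation layer is an origin gate in the background script: the injected SDK posts a request, the extension checks the page's origin against persisted per-origin grants, and only granted requests are forwarded to the container app hosting the LLM runtime. The function and message shapes below are hypothetical, not the project's actual protocol.

```javascript
// Illustrative origin gate for the extension's background script.
// grantedOrigins models persisted per-origin permission state.
function routeRequest(grantedOrigins, origin, request) {
  if (!grantedOrigins.has(origin)) {
    // Untrusted or not-yet-granted origins never reach the runtime.
    return { ok: false, error: "permission-not-granted", origin };
  }
  // In the real extension this payload would be relayed to the container
  // app that hosts the on-device LLM (e.g. via native messaging).
  return { ok: true, forwarded: { origin, ...request } };
}
```

The point of the design is that the website never talks to the model process directly; every request carries a browser-verified origin, so consent stays scoped to the granting site.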

Open Source

RWI is MIT-licensed and welcomes contributions. Explore the code, file issues, or submit pull requests — help shape the future of browser-native AI.