Open Source · MIT Licensed

Reality Web
Intelligence

Browser-native AI for the web

A browser-mediated architecture that exposes on-device LLM capabilities to websites through an explicit, origin-scoped permission model and a shared model runtime. Websites don't run AI models — the browser does. Websites request intelligence — the user grants access.

On-Device Inference · Origin-Scoped Permissions · Zero Cloud · Shared Runtime · Metal Accelerated

A few lines. That's it.

Add on-device AI to any website — no model bundling, no API keys, no cloud costs.

your-website.js
// Check availability
const available = await rwi.isAvailable();
// Request permission
const { granted } = await rwi.requestPermission();
// Generate with streaming
const result = await rwi.generate({
  prompt: "What's RWI?",
  onToken: (token) => updateUI(token)
});
The Problem

Web AI is broken by design

Today, most AI integration in web applications is server-dependent: user data is transmitted off-device for inference, developers pay per-request API costs, and latency depends on external services.

Meanwhile, modern consumer devices — especially Apple Silicon Macs — can run mid-scale language models locally. What if the intelligence a website needs were already present on the user's device?

No existing system simultaneously provides a shared model, origin-scoped permissions, zero cloud dependency, and a deployment path independent of browser vendors.

Cloud APIs

Privacy risk, recurring costs, latency, fragmented implementations

In-Browser (WebLLM)

Per-origin model downloads — same model cached 5× for 5 sites

Chrome Prompt API

Single-vendor lock-in, model instability under vendor control

Local Runtimes (Ollama)

No web-origin-scoped permission surface for untrusted websites

Platform Intelligence

Not programmable as a web API — system features, not web primitives

How It Works

From request to response

The browser mediates every step — websites never touch the model directly.

01

SDK Injection

The browser injects window.rwi at document start — available before application scripts run.

02

Permission Request

The website requests access; the user sees a browser-native prompt and grants per-origin consent.

03

On-Device Inference

The request is routed to llama.cpp running locally. Tokens are generated via Metal-accelerated compute.

04

Streaming Response

Tokens stream back to the page in real time. The website updates its UI token-by-token.
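Taken together, steps 01–04 look like this from the website's side. The sketch below is illustrative, not the shipped SDK: the `fakeRWI` stub stands in for the browser-injected `window.rwi` so the flow can run anywhere, and only `isAvailable`, `requestPermission`, and `generate` are taken from the API shown earlier.

```javascript
// Stub standing in for the browser-injected window.rwi (illustration only).
const fakeRWI = {
  async isAvailable() { return true; },
  async requestPermission() { return { granted: true }; },
  async generate({ prompt, onToken }) {
    // Simulate token-by-token streaming from the local model.
    const tokens = ["RWI ", "is ", "browser-native ", "AI."];
    for (const t of tokens) onToken?.(t);
    return { text: tokens.join("") };
  },
};

// Website-side flow: detect the SDK, ask for consent, stream tokens.
async function askModel(rwi, prompt, updateUI) {
  if (!(await rwi.isAvailable())) return null;        // Step 01: SDK present?
  const { granted } = await rwi.requestPermission();  // Step 02: user consent
  if (!granted) return null;
  const result = await rwi.generate({                 // Steps 03–04: inference + streaming
    prompt,
    onToken: (token) => updateUI(token),
  });
  return result.text;
}
```

In a real page you would pass `window.rwi` instead of the stub, e.g. `askModel(window.rwi, "What's RWI?", appendToChat)`.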

Implementations

Two paths, one API

Both implementations expose the same window.rwi API. Websites work identically on either.

Rewebin

Native Browser

A custom macOS browser built on WebKit that integrates RWI natively. The browser is the trusted runtime — inference runs in-process via a native Swift service layer with llama.cpp XCFramework integration.

Note: Rewebin is a prototype built solely to demonstrate RWI's vision of AI built into future browsers. It has not been developed with the security hardening required for production deployment.

In-process inference (no IPC)
SwiftUI + WKWebView
Includes RWI Analyzer

Safari Extension

Extension-Mediated

A macOS container app hosts the on-device LLM runtime, while a Safari Web Extension injects the SDK and mediates per-origin requests — proving RWI can work without requiring a new browser.

Works in existing Safari
Manifest V3 extension
App hosts LLM runtime
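One plausible shape for the extension's mediation layer is an origin gate in the background script: the injected SDK posts a request, the extension checks the page's origin against persisted per-origin grants, and only granted requests are forwarded to the container app hosting the LLM runtime. The function and message shapes below are hypothetical, not the project's actual protocol.

```javascript
// Illustrative origin gate for the extension's background script.
// grantedOrigins models persisted per-origin permission state.
function routeRequest(grantedOrigins, origin, request) {
  if (!grantedOrigins.has(origin)) {
    // Untrusted or not-yet-granted origins never reach the runtime.
    return { ok: false, error: "permission-not-granted", origin };
  }
  // In the real extension this payload would be relayed to the container
  // app that hosts the on-device LLM (e.g. via native messaging).
  return { ok: true, forwarded: { origin, ...request } };
}
```

The point of the design is that the website never talks to the model process directly; every request carries a browser-verified origin, so consent stays scoped to the granting site.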

Open Source

RWI is MIT-licensed and welcomes contributions. Explore the code, file issues, or submit pull requests — help shape the future of browser-native AI.