Reality Web Intelligence
Browser-native AI for the web
A browser-mediated architecture that exposes on-device LLM capabilities to websites through an explicit, origin-scoped permission model and a shared model runtime. Websites don't run AI models — the browser does. Websites request intelligence — the user grants access.
A few lines. That's it.
Add on-device AI to any website — no model bundling, no API keys, no cloud costs.
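Those "few lines" might look roughly like this. The method names (`requestAccess`, `generate`) are illustrative assumptions — only `window.rwi` itself is named here — and a stub stands in for the browser-injected object so the sketch runs anywhere:

```javascript
// Hypothetical sketch: `requestAccess` and `generate` are assumed names,
// not a documented window.rwi surface.
async function summarize(rwi, article) {
  const granted = await rwi.requestAccess(); // browser shows per-origin prompt
  if (!granted) return null;                 // user declined; degrade gracefully
  return rwi.generate({ prompt: `Summarize:\n${article}` });
}

// Minimal stub standing in for the browser-injected window.rwi,
// so the sketch can run outside a supporting browser.
const stubRwi = {
  requestAccess: async () => true,
  generate: async ({ prompt }) => `summary of ${prompt.length} chars`,
};

summarize(stubRwi, "On-device AI for the web").then(console.log);
```

No model download, no API key: the page only asks the browser for intelligence it already hosts.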
Web AI is broken by design
Today, most AI integration in web applications is server-dependent: user data is transmitted off-device for inference, developers pay per-request API costs, and latency depends on external services.
Meanwhile, modern consumer devices — especially Apple Silicon Macs — can run mid-scale language models locally. What if the intelligence a website needs were already present on the user's device?
No existing system simultaneously provides a shared model, origin-scoped permissions, zero cloud dependency, and a deployment path independent of browser vendors.
Cloud APIs
Privacy risk, recurring costs, latency, fragmented implementations
In-Browser (WebLLM)
Per-origin model downloads — same model cached 5× for 5 sites
Chrome Prompt API
Single-vendor lock-in, model instability under vendor control
Local Runtimes (Ollama)
No web-origin-scoped permission surface for untrusted websites
Platform Intelligence
Not programmable as a web API — system features, not web primitives
From request to response
The browser mediates every step — websites never touch the model directly.
SDK Injection
The browser injects window.rwi at document start — available before application scripts run.
Permission Request
The website requests access; the user sees a browser-native prompt and grants per-origin consent.
On-Device Inference
The request is routed to llama.cpp running locally. Tokens are generated via Metal-accelerated compute.
Streaming Response
Tokens stream back to the page in real time. The website updates its UI token-by-token.
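The four steps above could surface to page code roughly as follows. The async-iterable `generateStream` shape is an assumption for illustration, with a stub in place of the real runtime:

```javascript
// Sketch of token streaming; the streaming API shape (an async iterable
// returned by a hypothetical rwi.generateStream) is an assumption.
async function streamToSink(rwi, prompt, onToken) {
  for await (const token of rwi.generateStream(prompt)) {
    onToken(token); // e.g. append each token to a DOM node as it arrives
  }
}

// Stub runtime yielding a fixed token sequence.
const stubRwi = {
  async *generateStream(prompt) {
    for (const t of ["On", "-device", " inference"]) yield t;
  },
};

let text = "";
streamToSink(stubRwi, "demo", t => { text += t; })
  .then(() => console.log(text)); // "On-device inference"
```

An async iterable keeps backpressure and cancellation in the page's hands while the browser owns the model.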
Two paths, one API
Both implementations expose the same window.rwi API. Websites work identically on either.
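Because both paths inject the same object, a page can feature-detect once and degrade gracefully. A minimal sketch, assuming only that `window.rwi` is present when the SDK is injected:

```javascript
// Progressive enhancement: detect the injected SDK rather than assuming it.
// `window.rwi` is the only name taken from the document; the helper is
// illustrative.
function hasRwi(globalObj) {
  return Boolean(globalObj && globalObj.rwi);
}

// In a page, hasRwi(window) is true under either implementation and
// false everywhere else, so AI features can be optional.
console.log(hasRwi({ rwi: {} })); // true
console.log(hasRwi({}));          // false
```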
Rewebin
Native Browser
A custom macOS browser built on WebKit that integrates RWI natively. The browser is the trusted runtime — inference runs in-process via a native Swift service layer with llama.cpp XCFramework integration.
Note: Rewebin is a prototype built solely to demonstrate RWI's vision of AI built into future browsers. It lacks the security hardening required for production deployment.
Safari Extension
Extension-Mediated
A macOS container app hosts the on-device LLM runtime, while a Safari Web Extension injects the SDK and mediates per-origin requests — proving RWI can work without requiring a new browser.
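One way the extension-side relay could be structured, as a sketch: the message shape and the `sendToNativeHost` transport are illustrative assumptions, with a stub standing in for the Safari native-messaging bridge:

```javascript
// Illustrative sketch: a content script forwards a page's request to the
// macOS container app (which runs inference) and relays the reply back.
// Message fields and the transport function are assumptions.
function createRelay(sendToNativeHost) {
  return async function handlePageRequest(request) {
    const reply = await sendToNativeHost({
      type: "rwi.generate",
      origin: request.origin, // permission checks are enforced per-origin
      prompt: request.prompt,
    });
    return reply.text;
  };
}

// Stub transport in place of the real native-messaging bridge.
const relay = createRelay(async msg => ({ text: `echo:${msg.prompt}` }));
relay({ origin: "https://example.com", prompt: "hi" }).then(console.log);
```

Keeping the transport injected means the same relay logic works whether the host is the container app or, in the native-browser path, an in-process service.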
Open Source
RWI is MIT-licensed and welcomes contributions. Explore the code, file issues, or submit pull requests — help shape the future of browser-native AI.