WebGPU inference for DeepSeek-R1 that runs entirely in your browser: nothing to install and no inference server. Open source, made in America, served from American servers.
Later extended to run Qwen3 0.6B locally via WebGPU with thinking enabled, including experimental mobile support. Under the hood it uses Transformers.js with models converted to ONNX.
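
Both DeepSeek-R1 and Qwen3 in thinking mode wrap their reasoning trace in `<think>…</think>` tags ahead of the final answer, so a browser UI usually needs to separate the two before rendering. The following is a minimal sketch of that step, not this project's actual code: `splitThinking` and `ParsedOutput` are hypothetical names, and it assumes the raw generated text is already available as a string. It also tolerates a missing opening tag (some chat templates put it in the prompt) and a missing closing tag (output still streaming).

```typescript
// Hypothetical helper, not part of Transformers.js: separates the model's
// reasoning trace from its final answer in raw generated text.
interface ParsedOutput {
  thinking: string;          // text inside <think>…</think>, trimmed
  answer: string;            // text after </think>, trimmed
  thinkingComplete: boolean; // false while the closing tag has not arrived yet
}

const OPEN_TAG = "<think>";
const CLOSE_TAG = "</think>";

function splitThinking(raw: string): ParsedOutput {
  const openIdx = raw.indexOf(OPEN_TAG);
  const closeIdx = raw.indexOf(CLOSE_TAG);

  // No tags at all: treat the whole text as the answer.
  if (openIdx === -1 && closeIdx === -1) {
    return { thinking: "", answer: raw.trim(), thinkingComplete: true };
  }

  // If the opening tag is absent, assume the reasoning starts at position 0.
  // Any text before an opening tag is discarded.
  const start = openIdx === -1 ? 0 : openIdx + OPEN_TAG.length;

  // Opening tag seen but not closed yet (for example, mid-stream).
  if (closeIdx === -1) {
    return { thinking: raw.slice(start).trim(), answer: "", thinkingComplete: false };
  }

  return {
    thinking: raw.slice(start, closeIdx).trim(),
    answer: raw.slice(closeIdx + CLOSE_TAG.length).trim(),
    thinkingComplete: true,
  };
}
```

When streaming, calling `splitThinking` on the accumulated text after each new token lets the UI show the reasoning in a collapsible panel while `thinkingComplete` is false, then switch to the answer once the closing tag arrives.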