This is not X.com or Twitter.com. This is Backscroll, a 1-to-1 replica of Twitter's UI in which an autoresearch-loop agent periodically reads my actual Twitter home timeline and serves a For You feed in my clone containing everything I actually want to see.
Backscroll reads my Twitter timeline and compiles a feed of what I actually want to read. It scrapes about 2,600 tweets a day from my timeline, plus my likes and bookmarks as a taste signal, packs the lot into a 250k-token prompt, and has a model rank what's worth surfacing. Mostly this saves me time. Instead of opening X and hoping the algorithm shows me the good parts, I get a slower, compiled feed I can scroll through.
It is part of a longer experiment in AI-generated feeds and interfaces. Sudo generated whole websites in real time from a prompt. WikWok took that toward feeds, where each short-form item is a looped canvas explaining something. Backlist is the same instinct applied to a Hacker News-style ranked list. Backscroll is another take on it: a Twitter-style feed where the ranking is a function I can read and re-run.
How it works
A RoamBrowser Durable Object owns the scraper, the compiler, and the serving layer, so every part of the system reads from one consistent piece of state. The compiler runs in five steps: scrape, rank, evaluate, publish, iterate.
Step two is where most of the design choices live. Instead of having the model chain a dozen single tool calls, the model writes one bounded JavaScript program that runs inside an isolated Cloudflare Dynamic Worker with the network blocked, and that program calls typed feed.* tools as if they were a local SDK. Intermediate state lives in JS variables instead of being smeared across the model's context window, the worker boundary is the only place where errors and untrusted code matter, and the entire ranking experiment happens in one model generation rather than a dozen round-trips.
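To make the shape of that concrete, here is a minimal sketch of what a typed feed.* surface and a bounded program against it could look like. The post only names feed.candidates, feed.evaluate, and feed.publish; the option fields, return shapes, the in-memory makeFakeFeed stand-in, and the score-based rank helper are all assumptions for illustration, not the actual implementation.

```typescript
// Hypothetical tool surface. Only the three tool names come from the post;
// every field and return shape below is assumed for illustration.
interface Candidate {
  id: string;
  author: string;
  score: number; // assumed precomputed relevance score
}

interface FeedTools {
  candidates(opts: { mode: string; limit: number; unseenOnly: boolean }): Promise<{ candidates: Candidate[] }>;
  evaluate(opts: { ids: string[] }): Promise<{ ok: boolean; problems: string[] }>;
  publish(opts: { ids: string[] }): Promise<{ published: string[] }>;
}

// In-memory stand-in for the host side of the worker boundary.
function makeFakeFeed(pool: Candidate[], seenIds: string[]): FeedTools {
  return {
    async candidates(opts) {
      const fresh = opts.unseenOnly
        ? pool.filter((c) => seenIds.indexOf(c.id) === -1)
        : pool.slice();
      return { candidates: fresh.slice(0, opts.limit) };
    },
    async evaluate(opts) {
      // Toy diagnostics: reject empty feeds and duplicate ids.
      const problems: string[] = [];
      if (opts.ids.length === 0) problems.push("empty feed");
      if (opts.ids.some((id, i) => opts.ids.indexOf(id) !== i)) problems.push("duplicate ids");
      return { ok: problems.length === 0, problems };
    },
    async publish(opts) {
      return { published: opts.ids.slice() };
    },
  };
}

// Placeholder ranking: highest score first, without mutating the input.
function rank(candidates: Candidate[]): string[] {
  return candidates.slice().sort((a, b) => b.score - a.score).map((c) => c.id);
}

// The kind of bounded program the model might emit: one pass, state in local variables.
async function program(feed: FeedTools, topN: number): Promise<{ published: string[] }> {
  const pool = await feed.candidates({ mode: "for_you", limit: 600, unseenOnly: true });
  const ids = rank(pool.candidates).slice(0, topN);
  const report = await feed.evaluate({ ids });
  if (!report.ok) throw new Error("diagnostics failed: " + report.problems.join(", "));
  return feed.publish({ ids });
}
```

The point of the sketch is that the candidate pool and the ranked ids never leave local variables, and the only things crossing the boundary are the three typed calls.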
Prompt: Make me a feed about RL environments. Prefer primary sources, skip repeat threads, and publish only if diagnostics pass.

Program:

    const pool = await feed.candidates({ mode: "for_you", limit: 600, unseenOnly: true });
    const ids = rank(pool.candidates).slice(0, 80);
    await feed.evaluate({ ids });
    return feed.publish({ ids });

Published feed:

- Andrej Karpathy @karpathy · 2h
  RL environments are the new datasets. The bottleneck shifted from tokens to tasks with dense, well-defined reward.
- Nat Friedman @natfriedman · 5h
  Procgen still the cleanest test of generalization. Fixed levels just measure memorization at this point.
- Rob Knight @ada_rob · 9h
  Minimal RL gym wrapper - 200 LOC, no Mujoco. Plugs into a Worker for headless rollouts.
The viz shows the curated path. Two feed modes share this pipeline. For You is the fast one: the ranking prompt is feature-only and never sees tweet text, which keeps runs cheap. Curated is the prompt-driven version, where I can say "make me a feed about RL environments" and the program inspects content directly. If a published feed is wrong, the iterate step forks a child session that excludes everything I already saw and re-ranks against the same rubric, so my feedback feeds back in instead of getting lost.
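The iterate step can be sketched roughly as follows. This is a guess at the mechanics, assuming a session records the ids it already showed and a reference to its rubric; the Session shape, the forkSession name, and the childId parameter are hypothetical and not taken from the Backscroll source.

```typescript
// Hypothetical session shape, assumed for illustration.
interface Item {
  id: string;
  score: number; // score under the session's rubric
}

interface Session {
  id: string;
  parentId: string | null;
  rubricId: string;   // the child reuses the parent's rubric
  excluded: string[]; // ids that must not be shown again
  items: Item[];
}

// Fork a child session: exclude everything the parent excluded or showed,
// then re-rank what is left of the pool under the same rubric.
function forkSession(parent: Session, pool: Item[], childId: string, limit: number): Session {
  const excluded = parent.excluded.concat(parent.items.map((i) => i.id));
  const items = pool
    .filter((c) => excluded.indexOf(c.id) === -1) // filter returns a copy, so the pool is untouched
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
  return { id: childId, parentId: parent.id, rubricId: parent.rubricId, excluded, items };
}
```

Because the child carries the accumulated exclusion list forward, forking again from the child keeps excluding everything shown in earlier rounds.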
Frontends
Four different surfaces, all reading from the same underlying feed sessions. The compiler doesn't know about any of them.
The native Backscroll UI is the primary one (shown above). The header is just backscroll · 116,145 (today's token budget), and the tab bar runs For you / All / Bookmarks / Likes / Archive / Chat. Each ranked card has a "why" button next to the action row that opens the score breakdown, and the rubric feedback buttons (👍 / 👎) sit alongside the standard like and retweet. The composer at the bottom isn't for tweeting; it asks "What feed do you want?" and feeds the input straight into Code Mode as a directive.
The Twitter-web frontend takes the same sessions and renders them in an X-shaped layout for when I want the familiar muscle memory: For You / Following tabs, inline media, action row at the bottom of each tweet. The composer bar at the top of this view doubles as a feed-directive input ("more from @kohjingyu", "less politics") so I can steer the feed mid-session without dropping into a settings page.
The Hacker News-style frontend takes the same sessions and renders them as a ranked link aggregator, which I find better for reading densely. It also has a past page that lets me browse any historical day's feed with the raw stats attached: 2,660 tweets that day, 254,497 tokens, reasoning mode high.
The Telegram bot is push delivery for when I'm not at a computer. It sends curated thread summaries to a private channel with "More like this" and "Less of this" buttons inline, and that feedback flows into the rubric the same way clicks on the web version do.
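One way to picture feedback from any surface landing in the same rubric is a single update function that every frontend calls. The sketch below is an assumption about how that could work: the Rubric fields, the additive step, and the clamp to the range 0 to 2 are invented for illustration.

```typescript
// Hypothetical feedback event, emitted by a Telegram button or a web click alike.
type Feedback = { author: string; topic: string; signal: "more" | "less" };

// Hypothetical rubric slice: per-source and per-topic weights, defaulting to 1.
interface Rubric {
  sourceWeights: Record<string, number>;
  topicWeights: Record<string, number>;
}

// Return a new rubric with the relevant weights nudged up or down.
// The step size and the [0, 2] clamp are arbitrary choices for the sketch.
function applyFeedback(rubric: Rubric, fb: Feedback, step = 0.1): Rubric {
  const delta = fb.signal === "more" ? step : -step;
  const bump = (weights: Record<string, number>, key: string): Record<string, number> => {
    const next = Math.min(2, Math.max(0, (weights[key] ?? 1) + delta));
    return { ...weights, [key]: next };
  };
  return {
    sourceWeights: bump(rubric.sourceWeights, fb.author),
    topicWeights: bump(rubric.topicWeights, fb.topic),
  };
}
```

Keeping the update pure (it returns a new rubric instead of mutating) means each feedback event can be stored and replayed against an older rubric if a change turns out to be wrong.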
Architecture
The harness is borrowed from Cursor: durable state lives outside the prompt, tools are narrow and typed, every run produces an artifact, and the system evaluates before it publishes. None of those rules are individually new. Together they make the system small enough to reason about.
The pieces the diagram doesn't show are mostly about how the model expresses preferences. The rubric is what to value (quality signals, topic preferences, source weights) and the strategy is how to rank, which in practice means LLM-generated SQL queries against the candidate pool that the program inside Code Mode iterates on. Every compilation persists as a durable session with items, scores, and metadata attached, and the UI always serves from the latest session rather than waiting on a live model call, which is why the feed feels instant even when the compiler is slow. The implementation lives in two files: feed-compiler.ts is the lifecycle API (feedStats, feedStartRun, feedCandidates, feedEval, feedPublish), and feed-ranker.ts is the deterministic scoring half (buildCandidatePool, compileForYouFeed, evaluateFeed, diagnoseBadFeed).
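The "serve from the latest session" idea can be sketched as a store that the compiler appends to and the frontends only read from. Everything here (FeedSession, SessionStore, serveFeed) is a hypothetical shape chosen for the sketch; the real system keeps this state in a Durable Object, which is not modeled.

```typescript
// Hypothetical persisted session: items, scores, and metadata, as the post describes.
interface FeedSession {
  id: string;
  createdAt: number; // epoch milliseconds
  items: { id: string; score: number }[];
  metadata: Record<string, string>;
}

class SessionStore {
  private sessions: FeedSession[] = [];

  // Called by the compiler when a run finishes.
  publish(session: FeedSession): void {
    this.sessions.push(session);
  }

  // Most recent session by creation time, or undefined if nothing was published yet.
  latest(): FeedSession | undefined {
    const sorted = this.sessions.slice().sort((a, b) => a.createdAt - b.createdAt);
    return sorted.length > 0 ? sorted[sorted.length - 1] : undefined;
  }
}

// Read path used by every frontend: no model call, just the last published result.
function serveFeed(store: SessionStore): string[] {
  const session = store.latest();
  return session ? session.items.map((i) => i.id) : [];
}
```

Since the read path never touches the model, a slow or failed compile only delays the next session; it does not block what is already being served.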
The hosted demo at backscroll.sdan.io is read-only and serves the latest cached session, so it cannot scrape, re-rank, or spend model tokens. It is meant as a way to see the architecture and the harness, not as a feed you can use yourself.