A minimal Claude Code that learns from your corrections in real time, built on Tinker and inspired by SDPO. The agent improves continuously as you use it, turning each correction into training signal.
Benchmarked on LiveCodeBench with Qwen3-4B and Qwen3-30B. Both show an early jump in pass@1, then plateau; the interesting question is how far continual learning can push coding agents. Install with `pip install continualcode`.
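
For readers unfamiliar with the metric, here is a minimal sketch of how pass@k is commonly computed, using the standard unbiased estimator from the code-generation literature. It is an illustration only: the function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical, and nothing here is taken from this project's evaluation code, which may compute pass@1 differently.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed all tests, k: budget.
    Computes 1 - C(n - c, k) / C(n, k); for k = 1 this reduces to c / n.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    if not results:
        raise ValueError("results must be non-empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Made-up sample counts for three problems, purely for illustration.
    print(benchmark_pass_at_k([(10, 3), (10, 0), (10, 10)], k=1))
```

With k = 1 the score is simply the fraction of sampled solutions that pass, averaged over problems, which is why a plateau in pass@1 means additional training steps stopped converting failures into passes.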