
Qwen3-VL-JAX

Qwen3-VL implementation in JAX

Author: Surya Dantuluri

A from-scratch JAX implementation of Qwen3-VL. mRoPE, DeepStack, the KV cache, and the ViT are all reimplemented by hand, with the goal of understanding the architecture in depth. It runs Qwen3-VL 2B locally with the model's full chain-of-thought reasoning visible.

Built as the foundation for vlm-gym and the geo-guessing RL pipeline. The motivation was a lean implementation with no Hugging Face dependency that can run directly on TPUs.
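
Of the components listed above, the KV cache is the easiest to show in isolation. The following is a minimal sketch of a fixed-size cache in JAX, not the code from this project: the names `KVCache`, `init_cache`, `append`, and `attend` are hypothetical, it handles a single sequence with no batching, and it omits mRoPE and grouped-query attention. It assumes a preallocated buffer so that array shapes stay static.

```python
from typing import NamedTuple

import jax
import jax.numpy as jnp


class KVCache(NamedTuple):
    """Hypothetical single-sequence cache; not the project's actual layout."""
    k: jax.Array       # [max_len, n_heads, head_dim]
    v: jax.Array       # [max_len, n_heads, head_dim]
    length: jax.Array  # scalar int32: number of slots filled so far


def init_cache(max_len, n_heads, head_dim, dtype=jnp.float32):
    shape = (max_len, n_heads, head_dim)
    return KVCache(jnp.zeros(shape, dtype), jnp.zeros(shape, dtype),
                   jnp.zeros((), jnp.int32))


def append(cache, k_new, v_new):
    """Write new keys/values of shape [t, n_heads, head_dim] at the current end.

    dynamic_update_slice clamps out-of-range starts, so the caller must
    make sure length + t <= max_len.
    """
    start = (cache.length, 0, 0)
    k = jax.lax.dynamic_update_slice(cache.k, k_new.astype(cache.k.dtype), start)
    v = jax.lax.dynamic_update_slice(cache.v, v_new.astype(cache.v.dtype), start)
    return KVCache(k, v, cache.length + k_new.shape[0])


def attend(q, cache):
    """Causal attention for queries q ([t, n_heads, head_dim]).

    The queries are assumed to belong to the last t positions already
    written to the cache.
    """
    t, _, head_dim = q.shape
    max_len = cache.k.shape[0]
    scores = jnp.einsum("qhd,khd->hqk", q, cache.k) / jnp.sqrt(head_dim)
    q_pos = cache.length - t + jnp.arange(t)  # absolute query positions
    k_pos = jnp.arange(max_len)
    # A key is visible if it is at or before the query position; this
    # also hides the unused, zero-filled slots at the end of the buffer.
    mask = k_pos[None, :] <= q_pos[:, None]
    scores = jnp.where(mask[None], scores, -jnp.inf)
    weights = jax.nn.softmax(scores, axis=-1)
    return jnp.einsum("hqk,khd->qhd", weights, cache.v)
```

In this sketch, prefill is one `append` with the whole prompt followed by `attend`, and each decode step appends a single position and attends with a single query. Because the buffer shape never changes, `append` and `attend` should trace once under `jax.jit` for a given `t` rather than recompiling as the sequence grows.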