I scraped 200,000 tweets, around 10,000 hours of TikTok videos, and hundreds of hours of stand-up sketches to post-train a 1-trillion-parameter model.
I used rubric-based RL: 8 judges scored rollouts from Kimi K2 against a rubric, and I upsampled the jokes that satisfied it best.
Unexpectedly, the model's output also typically scores 0% on Pangram, one of the more popular AI detectors.
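
For anyone curious how the judge-and-upsample step could look in code, here is a minimal sketch. It is not the actual pipeline: the `Judge` interface, the mean aggregation across judges, the `threshold` cutoff, and the linear score-to-copies rule are all assumptions made for illustration, and the RL policy update itself is left out entirely.

```python
from statistics import mean
from typing import Callable, List, Sequence

# Hypothetical judge interface: takes a rollout and the rubric items,
# returns a score in [0, 1]. In practice each judge would be a model call.
Judge = Callable[[str, Sequence[str]], float]


def score_rollout(rollout: str, rubric: Sequence[str], judges: Sequence[Judge]) -> float:
    """Average the rubric scores from all judges (assumed aggregation rule)."""
    return mean([judge(rollout, rubric) for judge in judges])


def upsample_best(
    rollouts: Sequence[str],
    rubric: Sequence[str],
    judges: Sequence[Judge],
    threshold: float = 0.7,   # assumed cutoff, not from the original post
    max_copies: int = 4,      # assumed cap on duplication
) -> List[str]:
    """Drop rollouts scoring below `threshold`; repeat the rest 1..max_copies times."""
    if not 0.0 <= threshold < 1.0:
        raise ValueError("threshold must be in [0, 1)")
    dataset: List[str] = []
    for rollout in rollouts:
        score = score_rollout(rollout, rubric, judges)
        if score < threshold:
            continue
        # Linear map from [threshold, 1] to [1, max_copies], rounded down.
        copies = 1 + int((max_copies - 1) * (score - threshold) / (1.0 - threshold))
        dataset.extend([rollout] * copies)
    return dataset


def keyword_judge(rollout: str, rubric: Sequence[str]) -> float:
    """Toy stand-in judge: fraction of rubric keywords present in the rollout."""
    text = rollout.lower()
    return sum(item in text for item in rubric) / len(rubric)
```

Calling `upsample_best(rollouts, rubric, [keyword_judge] * 8)` with 8 copies of the toy judge mirrors the 8-judge setup. Higher-scoring rollouts appear more often in the returned list, which is one simple way to weight them more heavily in the next training round.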
[more to come]
