NotRealNews.net

GPT-2 generated news sites

Author: Surya Dantuluri
Published
Views: 3.1M from San Francisco, New York City, Los Angeles

I fine-tuned GPT-2 on hundreds of thousands of news articles to generate new ones. GPT-2 124M inference ran on a cluster of Google Colab notebooks spread across 8 Google accounts, which sent the generations to a cluster of 4 Raspberry Pis running Ghost. This was one of the first projects that really made me think about how to scale and deploy models efficiently and cost-effectively.
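The fan-out from many generation workers to a small set of blog hosts can be illustrated with a minimal sketch. This is not the original code: the `Article` type, the `assign_round_robin` function, and the host names are all hypothetical, and round-robin is just one simple way to spread load evenly across 4 hosts. Nothing here touches the network or the Ghost API.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, Iterable, List


@dataclass
class Article:
    """Hypothetical container for one generated article."""
    title: str
    body: str


def assign_round_robin(
    articles: Iterable[Article], hosts: List[str]
) -> Dict[str, List[Article]]:
    """Assign each article to a host in turn so load stays balanced.

    `hosts` is an assumed list of host labels (e.g. one per Raspberry Pi).
    """
    if not hosts:
        raise ValueError("at least one host is required")
    assignment: Dict[str, List[Article]] = {host: [] for host in hosts}
    # cycle() repeats the host list forever; zip() stops when articles run out.
    for article, host in zip(articles, cycle(hosts)):
        assignment[host].append(article)
    return assignment


# Illustrative usage with placeholder data (host names are made up).
hosts = ["pi-0", "pi-1", "pi-2", "pi-3"]
articles = [Article(title=f"placeholder {i}", body="...") for i in range(8)]
plan = assign_round_robin(articles, hosts)
```

With 8 placeholder articles and 4 hosts, each host ends up with 2 articles. A real deployment would also need retries and a queue between the workers and the hosts, which this sketch leaves out.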

The idea was to think about AI safety in a realistic scenario: could a small LLM spread fake news cheaply and easily? Over the course of 7 months the blogs in the network started backlinking to each other and eventually got 12 million impressions across Facebook, Reddit, and Hacker News.

From interviewing many users (and almost signing an LOI with the Washington Post in 2019), it was clear to me that AI safety, governance, and red-teaming were important for understanding the implications of models, even small ones. In this scenario many users simply reposted and shared articles based purely on the headline and the OG image they saw, to continue narratives they already believed in. For instance, some of the articles BigBird (the whole AI/serving stack I made) generated were about local politics. These were shared more widely in Facebook groups, and despite getting many likes and shares, very few readers (if any) actually read to the bottom of the page. Even in 2019, 124M-parameter models were barely coherent enough to pass as human-written. Shortly after, due to AI safety concerns, I made the effort to de-index all NotRealNews-related pages from the internet.