CARP

Contrastive learning for story critiques.

GPT-Neo

An implementation of model- and data-parallel autoregressive language models with Mesh TensorFlow for distributed TPUs.

Mesh Transformer JAX

An implementation of model- and data-parallel autoregressive language models with JAX and Haiku for distributed TPUs.

OpenWebText2

An enhanced version of OpenWebTextCorpus.

The Pile

A large, diverse, open-source language modeling dataset.