Eval Harness

GPT-Neo

An implementation of model- and data-parallel autoregressive language models with Mesh TensorFlow for distributed TPUs.

GPT-NeoX

An implementation of 3D-parallel autoregressive language models for distributed GPUs.

Mesh Transformer JAX

An implementation of model- and data-parallel autoregressive language models with JAX and Haiku for distributed TPUs.

OpenWebText2

An enhanced version of the OpenWebTextCorpus dataset.

The Pile

A large, diverse, open-source language modeling dataset.