
Bessie
ox's Repositories
🐱 Cats vs Dogs 🐶 Which is better? Contribute your cat or dog to make the world's largest cat and dog repository.
Combining instruct data with SQuAD data to see if we can get a more generic model.
This is an aggregate dataset composed of Dolly HHRLHF (derived from the Databricks Dolly-15k and the Anthropic Helpful and Harmless (HH-RLHF) datasets), combined with Competition Math, DuoRC, CoT GSM8k, QASPER, QuALITY, SummScreenFD and Spider. The intention was to create a permissively licensed instruction-following dataset with a large number of long-form samples.
This is an automated nightly crawl of dad jokes from the r/dadjokes subreddit.
The PIQA dataset introduces the task of physical commonsense reasoning and a corresponding benchmark dataset, Physical Interaction: Question Answering (PIQA).