Bessie
ox
User account
ox's Repositories
Displaying Page 3 of 12 (120 total Repositories)

Combining instruction data with SQuAD data to see if we can get a more general-purpose model

179.7 mb
1
Updated: 7 months ago

This is an aggregate dataset composed of Dolly HHRLHF (derived from the Databricks Dolly-15k and the Anthropic Helpful and Harmless (HH-RLHF) datasets), combined with Competition Math, DuoRC, CoT GSM8K, QASPER, QuALITY, SummScreenFD, and Spider. The intention was to create a permissively licensed instruction-following dataset with a large number of long-form samples.

233.5 mb
2
Updated: 7 months ago

239.9 mb
3312100K
Updated: 9 months ago
Public
16

88.9 mb
5K21
Updated: 9 months ago

This is a daily crawl of the top Hacker News posts

7.4 mb
1
Updated: 9 months ago

This is an automated nightly crawl of dad jokes from the r/dadjokes subreddit

7.5 mb
11
Updated: 9 months ago

5.9 gb
132
Updated: 9 months ago

Creating a dataset to fine-tune Mamba 🐍

302.8 mb
121
Updated: 10 months ago
Public
0

The PIQA dataset introduces the task of physical commonsense reasoning and a corresponding benchmark dataset, Physical Interaction: Question Answering (PIQA).

5.1 mb
21
Updated: 10 months ago

2.3 gb
14.9K2
Updated: 10 months ago