Featured Datasets
Public
1
Pipeline to answer questions about papers from Arxiv Dives
452.7 mb
402096319
Public
0
Pipeline to answer questions about papers from Arxiv Dives
450.4 mb
189634020
Public
0
Pipeline to answer questions about papers from Arxiv Dives
450.4 mb
963182040
195.7 mb
222K
Public
0
4.3 gb
1258K
25 gb
1152K1
139.8 mb
21
Public
0
590.2 mb
22
Featured Collections
Some of the Oxen team's favorite collections.
Visual LLMs
Datasets for image understanding with large language models.
a collection by datasets
LLM-Feedback
Datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO.
a collection by ox
Multimodal
Datasets that span modalities, combining text, image, audio, video, and more.
a collection by ox