LLM-Feedback
Datasets with human or AI feedback. Useful for training reward models or applying techniques like DPO.
This dataset contains 33K cleaned conversations with pairwise human preferences. It is collected from 13K unique IP addresses on the Chatbot Arena from April to June 2023. Each sample includes a question ID, two model names, their full conversation text in OpenAI API JSON format, the user vote, the anonymized user ID, the detected language tag, the OpenAI moderation API tag, the additional toxic tag, and the timestamp.
41.6 mb
12
public
0
19.1 mb
12
public
0
519.5 mb
12
public
0
19.5 mb
22
public
0
24.2 mb
22
public
0
321.7 mb
12