Evaluations/0e9d5ad5-fea6-4fb4-876a-6a6d574cbfbb
main
train.jsonl
text → text
OpenAI/GPT 4o mini
prediction
You are an expert programmer tasked with evaluating a response to a programming question. Answer "good" if the answer is correct and "bad" if incorrect. Only answer with the one word
Aug 2, 2025, 1:59 AM UTC
5 row sample
5 rows processed, 282 tokens used ($0.0001)
Estimated cost for all 3000 rows: $0.0421
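The full-run estimate can be cross-checked against the sample figures with simple linear scaling. The sketch below is an illustration, not the platform's actual pricing logic: it assumes cost scales linearly with row count and that every row uses about as many tokens as the sampled rows. The helper name `extrapolate` and the constant names are hypothetical. The implied per-token rate is derived from the displayed numbers, not taken from a price list.

```python
# Assumption: token use and cost scale linearly with the number of rows.
# All names here are illustrative; only the four numbers come from the page.

def extrapolate(sample_value: float, sample_rows: int, total_rows: int) -> float:
    """Scale a value measured on a sample up to the full dataset."""
    return sample_value * total_rows / sample_rows

SAMPLE_ROWS = 5                # rows in the sample run
SAMPLE_TOKENS = 282            # tokens used by the sample run
TOTAL_ROWS = 3000              # rows in train.jsonl
ESTIMATED_TOTAL_COST = 0.0421  # USD, as displayed for all 3000 rows

# Tokens expected for the full run if the sample is representative.
estimated_total_tokens = extrapolate(SAMPLE_TOKENS, SAMPLE_ROWS, TOTAL_ROWS)

# Work backwards from the full-run estimate to the unrounded sample cost.
implied_sample_cost = ESTIMATED_TOTAL_COST * SAMPLE_ROWS / TOTAL_ROWS

# Blended USD rate per million tokens implied by the displayed figures.
implied_rate_per_million = ESTIMATED_TOTAL_COST / estimated_total_tokens * 1_000_000

print(f"Estimated tokens for full run: {estimated_total_tokens:,.0f}")
print(f"Implied sample cost: ${implied_sample_cost:.7f}")
print(f"Implied blended rate: ${implied_rate_per_million:.3f} per 1M tokens")
```

Under these assumptions the implied sample cost is about $0.00007, which rounds to the displayed $0.0001. That also explains why multiplying the rounded sample cost by 600 does not reproduce $0.0421.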
Sample Results completed
4 columns, 1-5 of 3000 rows