Evaluations/bce3fc55-1dac-4bb5-bf38-6aa93b688f3d
main
train.jsonl
text → text
OpenAI/GPT 4o mini
prediction
You are an expert programmer tasked with evaluating a response to a programming question. Answer "good" if the answer is correct and "bad" if incorrect. Only answer with the one word.
Aug 2, 2025, 1:59 AM UTC
Aug 2, 2025, 1:59 AM UTC
5 row sample
270 tokens, $0.0001
5 rows processed, 270 tokens used ($0.0001)
Estimated cost for all 3000 rows: $0.0378
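The estimate above can be roughly reproduced by scaling the sample linearly. The sketch below is only an illustration: it assumes every row uses about as many tokens as the 5 sampled rows, and it derives a blended per-token rate from the displayed figures rather than from any published price list. The function names are hypothetical, and the tool's actual estimation method is not shown on this page.

```python
def extrapolate_tokens(sample_tokens: int, sample_rows: int, total_rows: int) -> int:
    """Scale the sample's token count linearly to the full dataset (assumption)."""
    return round(sample_tokens / sample_rows * total_rows)


def implied_rate_per_million(total_cost_usd: float, total_tokens: int) -> float:
    """Blended USD price per 1M tokens implied by a cost and a token count."""
    return total_cost_usd / total_tokens * 1_000_000


# Figures taken from the page: 5 sampled rows, 270 tokens, 3000 rows, $0.0378 estimate.
projected_tokens = extrapolate_tokens(sample_tokens=270, sample_rows=5, total_rows=3000)
rate = implied_rate_per_million(total_cost_usd=0.0378, total_tokens=projected_tokens)

# Cost of the 5-row sample at the implied rate, before rounding for display.
sample_cost = 270 * rate / 1_000_000

print(f"Projected tokens for 3000 rows: {projected_tokens}")
print(f"Implied blended rate: ${rate:.4f} per 1M tokens")
print(f"Sample cost at that rate: ${sample_cost:.6f}")
```

Under these assumptions the sample cost comes out well below a hundredth of a cent, which is consistent with the page showing it rounded to $0.0001.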
Sample Results completed
4 columns, 1-5 of 3000 rows