Evaluations/Rewritten LLM as a Judge
rewritten
12_sample_rewritten.parquet
text → text
OpenAI/GPT 4o
is_correct
Compare the two answers and respond with true if the reasoning and answers are the same and false if not. Respond with a single word lower case.

Answer 1:
{reasoning}
{answer}

Answer 2:
{prediction}
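
As a rough illustration of how a template like the one above might be applied row by row, here is a minimal Python sketch. The placeholder names (reasoning, answer, prediction) and the output column is_correct come from this page; everything else is an assumption made for illustration: the helper names build_prompt, parse_verdict, and judge_rows are hypothetical, and ask_model stands in for whatever client actually sends the prompt to the model. It is not the evaluation tool's own implementation.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Template text copied from the prompt above. The placeholders are the
# dataset columns that the prompt references.
JUDGE_TEMPLATE = (
    "Compare the two answers and respond with true if the reasoning and "
    "answers are the same and false if not. "
    "Respond with a single word lower case.\n"
    "\n"
    "Answer 1:\n"
    "{reasoning}\n"
    "{answer}\n"
    "\n"
    "Answer 2:\n"
    "{prediction}"
)


def build_prompt(row: Dict[str, str]) -> str:
    """Fill the judge template with one row's columns (hypothetical helper)."""
    return JUDGE_TEMPLATE.format(
        reasoning=row["reasoning"],
        answer=row["answer"],
        prediction=row["prediction"],
    )


def parse_verdict(reply: str) -> Optional[bool]:
    """Map the judge's one-word reply to a boolean, or None if it is unusable."""
    word = reply.strip().strip(".!\"'").lower()
    if word == "true":
        return True
    if word == "false":
        return False
    return None


def judge_rows(
    rows: Iterable[Dict[str, str]],
    ask_model: Callable[[str], str],
) -> List[Dict[str, object]]:
    """Run the judge over each row and store the verdict in 'is_correct'.

    ask_model is an assumed callable that takes a prompt string and returns
    the model's text reply; plug in a real client here.
    """
    results: List[Dict[str, object]] = []
    for row in rows:
        reply = ask_model(build_prompt(row))
        results.append({**row, "is_correct": parse_verdict(reply)})
    return results


if __name__ == "__main__":
    # Stub model so the sketch runs without network access.
    sample = [{"reasoning": "2 + 2 = 4", "answer": "4", "prediction": "4"}]
    print(judge_rows(sample, lambda prompt: "true"))
```

Returning None for anything other than true or false keeps malformed replies visible instead of silently counting them as incorrect.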
Oct 4, 2024, 4:16 PM UTC
10 rows processed, 3463 tokens used
Sample Results completed
6 columns, 1-10 of 12 rows