Evaluations/LLM as a Judge
no_CoT
sample.parquet
text
OpenAI/GPT 4o
is_correct
Compare the two answers and respond with true if the reasoning and answers are the same and false if not. Respond with a single word lower case.

Answer 1:
{response}

Answer 2:
{prediction}
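As a rough illustration of how a judge prompt like the one above might be applied row by row, here is a minimal Python sketch. It assumes that {response} and {prediction} are filled from columns of the same names and that the model's one-word reply is stored in the is_correct column. The helper names (build_prompt, parse_verdict, judge_rows) and the call_model argument are hypothetical stand-ins, not part of any real tool's API.

```python
# Judge prompt copied from the evaluation configuration above.
JUDGE_TEMPLATE = (
    "Compare the two answers and respond with true if the reasoning and "
    "answers are the same and false if not. "
    "Respond with a single word lower case.\n"
    "\n"
    "Answer 1:\n"
    "{response}\n"
    "\n"
    "Answer 2:\n"
    "{prediction}"
)


def build_prompt(row):
    """Fill the template from one row (assumed keys: 'response', 'prediction')."""
    return JUDGE_TEMPLATE.format(
        response=row["response"], prediction=row["prediction"]
    )


def parse_verdict(reply):
    """Map the judge's reply to True/False, or None if it is not a clean verdict."""
    word = reply.strip().strip(".").lower()
    if word == "true":
        return True
    if word == "false":
        return False
    return None


def judge_rows(rows, call_model):
    """Add an 'is_correct' value to each row.

    call_model is a hypothetical callable (prompt string in, reply string out)
    standing in for whatever client actually sends the prompt to the model.
    """
    return [
        {**row, "is_correct": parse_verdict(call_model(build_prompt(row)))}
        for row in rows
    ]
```

Returning None for anything other than a clean true or false keeps malformed replies visible instead of silently counting them as incorrect.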
Oct 4, 2024, 3:38 PM UTC
10-row sample: 10 rows processed, 3602 tokens used
Sample Results completed
5 columns, 1-10 of 100 rows