Iteration history: Base_model_eval
Total running cost: $0.2741
Prompt | Rows | Type | Model | Target | Status | Runtime | Run | By | Tokens | Cost
Run
You are an expert programmer and are given the task of evaluating the quality of answers for programming questions. You will be given the question and answer and will evaluate it with only these responses: "Incorrect" "too long" "no example" "perfect" Do not use any other words as an answer, only these options. If the answer is incorrect, in any way always use "incorrect". If the answer is correct but repetitive and too long, always give "too long". If the answer is correct but without an example, always give "no example". If the answer is correct and includes an example, give "perfect". Here is your question: {prompt} Here is your answer: {response} Remember, only respond with either "incorrect", "too long", "no example", or "perfect" and no other words.
Rows: 3000 | Type: text → text | Model: OpenAI, OpenAI/GPT 4o mini | Target: 93687b7dd0f61fd7fbdf0e02fd640a3d | Status: completed | Runtime: 00:29:26 | Run: 2 weeks ago | By: mathi | Tokens: 1795290 | Cost: $0.2715
Sample
You are an expert programmer and are given the task of evaluating the quality of answers for programming questions. You will be given the question and answer and will evaluate it with only these responses: "Incorrect" "too long" "no example" "perfect" Do not use any other words as an answer, only these options. If the answer is incorrect, in any way always use "incorrect". If the answer is correct but repetitive and too long, always give "too long". If the answer is correct but without an example, always give "no example". If the answer is correct and includes an example, give "perfect". Here is your question: {prompt} Here is your answer: {response} Remember, only respond with either "incorrect", "too long", "no example", or "perfect" and no other words.
Rows: 10 | Type: text → text | Model: OpenAI, OpenAI/GPT 4o mini | Target: Sample - N/A | Status: completed | Runtime: 00:00:07 | Run: 2 weeks ago | By: mathi | Tokens: 6368 | Cost: $0.0010
Sample
You are an expert programmer and are given the task of evaluating the quality of answers for programming questions. You will be given the question and answer and will evaluate it with only these responses: "Incorrect" "too long" "no example" "perfect" Do not use any other words as an answer, only these options. If the answer is incorrect, in any way always use "incorrect". If the answer is correct but repetitive and too long, always give "too long". If the answer is correct but without an example, always give "no example". If the answer is correct and includes an example, give "perfect". Here is your question: {prompt} Here is your answer: {response} Remember, only respond with either "incorrect", "too long", "no example", or "perfect" and no other words.
Rows: 5 | Type: text → text | Model: OpenAI, OpenAI/GPT 4o mini | Target: Sample - N/A | Status: completed | Runtime: 00:00:03 | Run: 2 weeks ago | By: mathi | Tokens: 3731 | Cost: $0.0006
Sample
You are an expert programmer and are given the task of evaluating the quality of answers for programming questions. You will be given the question and answer and will give either: "great" "good" "bad" as you response. Do not use any other words as an answer, only the three options. If the answer is incorrect, in any way always use "bad" even if you like the style of the answer. If the answer is correct but without an example, always give "good". If the answer is correct and includes and example, give "great". Here is your question: {prompt} Here is your answer: {response} Remember, only respond with either "great", "good", or "bad", no other words.
Rows: 5 | Type: text → text | Model: OpenAI, OpenAI/GPT 4o mini | Target: Sample - N/A | Status: completed | Runtime: 00:00:04 | Run: 2 weeks ago | By: mathi | Tokens: 3643 | Cost: $0.0005
Sample
You are an expert programmer and are given the task of evaluating the quality of answers for programming questions. You will be given the question and answer and will give either: "great" "good" "bad" as you response. Do not use any other words as an answer, only the three options. If the answer is incorrect, in any way always use "bad" even if you like the style of the answer. If the answer is correct but without an example, always give "good". If the answer is correct and includes and example, give "great". Here is your question: {prompt} Here is your answer: {response} Remember, only respond with either "great", "good", or "bad", no other words.
Rows: 5 | Type: text → text | Model: OpenAI, OpenAI/GPT 4o mini | Target: Sample - N/A | Status: completed | Runtime: 00:00:03 | Run: 2 weeks ago | By: mathi | Tokens: 3643 | Cost: $0.0005
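The four-label prompt used in the full run relies on the judge replying with exactly one of the allowed labels, yet the prompt itself spells the first option as both "Incorrect" and "incorrect". The following is a minimal sketch of how the {prompt} and {response} placeholders could be filled and how a reply could be checked against the allowed labels. It is not the evaluation tool's own code: the names JUDGE_TEMPLATE, ALLOWED_LABELS, build_judge_prompt and normalize_label are hypothetical, and the template is shortened to the part of the prompt that carries the placeholders.

```python
# Hypothetical helpers for illustration; not part of the evaluation tool.

# Labels the four-option prompt allows, stored in lowercase.
ALLOWED_LABELS = ("incorrect", "too long", "no example", "perfect")

# Shortened template: only the tail of the logged prompt, which holds the
# {prompt} and {response} placeholders.
JUDGE_TEMPLATE = (
    "Here is your question: {prompt} "
    "Here is your answer: {response} "
    'Remember, only respond with either "incorrect", "too long", '
    '"no example", or "perfect" and no other words.'
)


def build_judge_prompt(question, answer):
    """Fill the template with one question/answer pair."""
    # Only the template is parsed for placeholders, so braces inside the
    # question or answer text are inserted unchanged.
    return JUDGE_TEMPLATE.format(prompt=question, response=answer)


def normalize_label(raw, allowed=ALLOWED_LABELS):
    """Return the matching allowed label, or None if the reply is off-list."""
    # Drop surrounding whitespace, quotes and a trailing period, then
    # compare case-insensitively.
    cleaned = raw.strip().strip("\"'").rstrip(".").strip().lower()
    return cleaned if cleaned in allowed else None
```

Replies that come back as None could be counted separately, which would show how often the judge strays from the allowed labels before a prompt revision is run on all 3000 rows.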