Evaluations
Run models against your data
Evaluations let you test and compare a selection of AI models against your datasets.
Whether you're fine-tuning models or measuring performance, Oxen Evaluations lets you run a prompt through every row of a dataset quickly.
Once you're happy with the results, output the resulting dataset to a new file, another branch, or directly as a new commit.
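The prompts listed below contain a placeholder such as `{text}` or `{Sentence}`, which names a column in the dataset. As a rough illustration of what running a prompt through a dataset means, here is a minimal Python sketch. It does not use the Oxen API: `render_prompt`, `run_evaluation`, `stub_model`, and the `prediction` output column are hypothetical names chosen for this example, and the stub stands in for a real model call.

```python
from typing import Callable, Dict, Iterable, List


def render_prompt(template: str, row: Dict[str, str]) -> str:
    """Fill each {column} placeholder in the template with the row's value."""
    return template.format(**row)


def run_evaluation(
    template: str,
    rows: Iterable[Dict[str, str]],
    model: Callable[[str], str],
    output_column: str = "prediction",  # hypothetical column name
) -> List[Dict[str, str]]:
    """Render the prompt for every row and store the model's answer in a new column."""
    results = []
    for row in rows:
        prompt = render_prompt(template, row)
        results.append({**row, output_column: model(prompt)})
    return results


def stub_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; always answers "neutral".
    return "neutral"


rows = [
    {"text": "Operating profit improved by 39.9%."},
    {"text": "Net sales have been eaten by the weak US dollar."},
]
template = (
    "Classify the text into positive, negative or neutral sentiment. "
    "One word all lowercase. {text}"
)
results = run_evaluation(template, rows, stub_model)
```

Each entry in `results` keeps the original columns and adds the model output, which mirrors the idea of writing the resulting dataset to a new file, branch, or commit.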
62e264e3-afd9-4366-a3ae-ce87dcf52c19
5 row sample completed
Bessie
1 week ago
Prompt: What is the sentiment of the following text, limit to one word, positive, negative or neutral {text}
2 iterations, 1550 tokens, $0.0151
text → text, OpenAI/o1 mini
Source:
Sentiment Analysis
7c2aa00c-b9d8-4e38-918d-a6e9ec0960ba
4000 rows completed
Bessie
3 weeks ago
Prompt: Classify the following text into positive, negative or neutral, based on the sentiment of the text. All lowercase, one word. {text}
3 iterations, 286738 tokens, $0.2581
text → text, Fireworks AI/Llama v3.1 70B Instruct
Source:
Target:
evaluating_sentiment_llama_70B
27fb870b-9302-4616-af19-3590a7aff15f
5 row sample completed
Bessie
3 weeks ago
Prompt: Classify the text into positive, negative or neutral sentiment. One word all lowercase. {text}
2 iterations, 266 tokens, $0.0000
text → text, Google/Gemini 1.5 Flash - 8B
Source:
Mistral Large Take 2
57993a40-2bb5-46a6-b9e6-a619d457132c
1000 rows completed
Bessie
1 month ago
Prompt: Compute the sentiment of the text based on how well the companies are performing in the market. Return only one of three options: positive, negative, or neutral. Respond with one word, all lowercase. Text: {text}
2 iterations, 54065 tokens, $0.1106
text → text, Mistral AI/Mistral Large
Source:
sentiment-analysis-mistral-large
Target:
sentiment-analysis-mistral-large-take-2
GPT-4o Sentiment Analysis
fb24aca8-7598-4208-bd15-3dd8630285f7
1000 rows completed
Bessie
1 month ago
Prompt: Compute the sentiment of the text based on how well the companies are performing in the market. Return only one of three options: positive, negative, or neutral. Respond with one word, all lowercase. Text: {text}
1 iteration, 77875 tokens, $0.2022
text → text, OpenAI/GPT-4o
Source:
Target:
sentiment-analysis-gpt-4o
Mistral Large Sentiment Analysis
fc6c8632-4ef2-4b80-9c7f-8753b7921f5b
1000 rows completed
Bessie
1 month ago
Prompt: Compute the sentiment of the text based on how well the companies are performing in the market. Return only one of three options: positive, negative, or neutral. Respond with one word, all lowercase. Text: {text}
2 iterations, 50440 tokens, $0.1032
text → text, Mistral AI/Mistral Large
Source:
Target:
sentiment-analysis-mistral-large
Ministral 8B Sentiment Analysis
889f014a-d37e-4ab7-969a-f31ab77f9c58
1000 rows completed
Bessie
1 month ago
Prompt: Compute the sentiment of the text based on how well the companies are performing in the market. Return only one of three options: positive, negative, or neutral. Respond with one word, all lowercase. Text: {text}
2 iterations, 38411 tokens, $0.0038
text → text, Mistral AI/Ministral 8B
Source:
Target:
sentiment-analysis-ministral-8b
Ministral 3B Sentiment Analysis
0c7bfd16-c0c2-4327-b915-2a9e8e7cc2b7
1000 rows completed
Bessie
1 month ago
Prompt: Compute the sentiment of the text based on how well the companies are performing in the market. Return only one of three options: positive, negative, or neutral. Respond with one word, all lowercase. Text: {text}
2 iterations, 38280 tokens, $0.0015
text → text, Mistral AI/Ministral 3B
Source:
Target:
sentiment-analysis-ministral-3b
Clean valid
c8d6b4ac-ae59-4001-b4a4-920e82c64b78
842 rows completed
Bessie
1 month ago
Prompt: Fix the punctuation in the following text {Sentence}
1 iteration, 58439 tokens, $0.0192
text → text, OpenAI/GPT-4o mini
Source:
Target:
cleaned_valid
Fix punctuation train
7a82f117-0432-4b39-981b-6bcb152c91e9
4000 rows completed
Bessie
1 month ago
Prompt: Fix the punctuation in the following text {Sentence}
2 iterations, 283425 tokens, $0.0933
text → text, OpenAI/GPT-4o mini
Source:
Target:
clean_train
Fix Punctuation
e98a97a2-ddbc-4e2c-9aea-c33bcbe509f3
1000 rows completed
Bessie
1 month ago
Prompt: Fix the punctuation in the following text {text}
2 iterations, 70157 tokens, $0.0231
text → text, OpenAI/GPT-4o mini
Source:
Target:
Computing sentiment
d6795b42-8c63-42d6-a140-1e51a4da7f3f
1999 rows completed
Bessie
1 month ago
Prompt: What is the sentiment of the following text. Please respond with positive, negative or neutral. All one word. All lowercase. {text}
3 iterations, 122466 tokens, $0.0193
text → text, OpenAI/GPT-4o mini
Source:
Target:
predictions
cbd282ff-723b-474c-bfbb-6b7caaadec94
5 row sample completed
Bessie
1 month ago
Prompt: Compute sentiment for the following text. One word, all lowercase. Positive, negative or neutral {text}
2 iterations, 256 tokens, $0.0000
text → text, OpenAI/GPT-4o mini
Source:
Llama Sentiment
e5b81e95-a028-43c7-b3c4-b34cb96fcd04
100 rows completed
Bessie
1 month ago
Prompt: You are a financial analyst and you want to find if the companies mentioned are mentioned in a positive, negative or neutral light. Respond all lowercase, one word. {text}
3 iterations, 10262 tokens, $0.0018
text → text, Together.ai/Meta Llama 3.1 8B Instruct Turbo
Source:
Target:
sentiment
Financial Sentiment Analysis
e9ba9a18-c973-450c-9514-049e85f3f20d
100 rows completed
Bessie
2 months ago
Prompt: Compute the sentiment of the text based on the how well the companies are performing in the market. Return only one of three options: positive, negative or neutral. Respond with one word all lowercase. Text: {text}
2 iterations, 11104 tokens, $0.0020
text → text, Together.ai/Meta Llama 3.1 8B Instruct Turbo
Source:
Target:
llama-3.1-8b-sentiment
Sentiment Ministral
abd94a83-3768-4115-bd51-22adbf8266de
100 rows completed
Bessie
2 months ago
Prompt: compute the sentiment of the text, positive, negative or neutral, one word all lowercase {text}
2 iterations, 5293 tokens, $0.0002
text → text, Mistral AI/Ministral 3B
Source:
Target:
ministral-3b-sentiment
Sentiment Analysis
e0bbf280-af43-4d71-a5d2-a2a834f93cae
5 row sample completed
Bessie
2 months ago
Prompt: compute the sentiment of the text, positive, negative or neutral, one word all lowercase {text}
2 iterations, 291 tokens
text → text, OpenAI/GPT-4o
Source:
Sentiment
850b4c99-ac01-4210-81f6-5dda2ba3285a
100 rows completed
Bessie
2 months ago
Prompt: Classify the text into positive negave or neutral sentiment. respond with one word all lowercase. {text}
3 iterations, 5654 tokens
text → text, OpenAI/GPT-4o
Source:
Target:
sentiment
fcb0b14c-5565-4c4b-8ce3-8726ad23174e
5 row sample completed
Bessie
2 months ago
Prompt: Find the sentiment of the following text, respond with just one word: positive, negative or neutral all lowercase {text}
3 iterations, 311 tokens
text → text, OpenAI/GPT-4o
Source:
Sentiment Analysis
34042ba6-86bb-48eb-89c2-f00b7afcf401
5 row sample completed
Bessie
2 months ago
Prompt: Classify the text into positive, negative or neutral sentiment {text}
1 iteration, 228 tokens
text → text, OpenAI/GPT-4o
Source:
Translate to french
79aa0000-3297-4ac8-9866-7d20c0ff203c
5 row sample completed
Bessie
2 months ago
Prompt: Translate this from English to French {text}
1 iteration, 358 tokens
text → text, OpenAI/GPT-4o
Source:
Translate to french
c27f903b-5b0e-4be8-afc9-0522997cdee0
5 row sample completed
Bessie
2 months ago
Prompt: Translate this from English to French {text}
1 iteration, 357 tokens
text → text, OpenAI/GPT-4o
Source:
c880bd08-4b52-4c9b-9c06-39700d140d5f
5 row sample completed
Bessie
2 months ago
Prompt: Classify the sentiment, respond in one word {text}
1 iteration, 210 tokens
text → text, OpenAI/GPT-4o
Source:
Testing Sentiment
15e20ce8-ca37-4786-b13c-b42b350ee1e5
100 rows completed
Bessie
2 months ago
Prompt: Classify this into positive, negative, neutral, respond with only one word. {text}
3 iterations, 5289 tokens
text → text, OpenAI/GPT-4o
Source:
d97a086a-d502-40d4-9413-7a62da39ee95
10 row sample completed
Bessie
2 months ago
Prompt: Classify this into positive, negative, or neutral. Respond with one word, all lowercase. {text}
1 iteration, 585 tokens
text → text, OpenAI/GPT-4o
Source:
78f86efd-7bc7-44d8-a06a-29a8470c29e7
10 row sample completed
Adam Singer
2 months ago
Prompt: You are a financial analyst who sees bad market outcomes as good and good market outcomes as bad because you are trying to short companies. Answer with a number between -10 and 10 with 10 meaning extremely positive opportunity to short, and -10 meaning an extremely negative opportinity to short response ONLY with the number {text}
7 iterations, 1019 tokens
text → text, OpenAI/GPT-4o
Source:
3 Shot Prompting
58126199-3ec5-42a2-935a-9312f4d892f3
100 rows completed
Bessie
2 months ago
Prompt: Text: Operating profit improved by 39.9% to EUR 18.0 mn from EUR12.8 mn. Sentiment: negative Text: Net sales have been eaten by the weak US dollar. Sentiment: positive Text: Includes company and brand share data by category, as well as distribution channel data. Sentiment: neutral Text: {text} Sentiment:
3 iterations, 11470 tokens
text → text, OpenAI/GPT-4o
Source:
Target:
Emotion Prompting
9cc8fbb8-1087-4eac-be53-f79269984524
100 rows completed
Bessie
2 months ago
Prompt: classify the text into positive, negative, or neutral sentiment. Respond with a single word, all lowercase. Think really hard about if it is positive or negative. My career depends on you getting it right! {text}
2 iterations, 7954 tokens
text → text, OpenAI/GPT-4o
Source:
Target:
Style Prompting
e1a47544-2405-4e9a-b02e-5e3a65b943b9
100 rows completed
Bessie
2 months ago
Prompt: classify the text into positive, negative, or neutral sentiment. Respond with a single word, all lowercase. {text}
2 iterations, 5954 tokens
text → text, OpenAI/GPT-4o
Source:
Target:
style_prompting