Repository evaluations - datasets/Bitext-Customer-Support-Intent

Evaluations

Label datasets and evaluate model performance

Oxen.ai allows you to run models row by row over your datasets. This allows you to label data, or evaluate how well a model is performing. Once the model has run over your dataset, you can save the output to a new file or branch, comparing it to the original dataset.

Classify Intents

4380b6a6-c437-4ad6-897b-35091eb59d24

OpenAI/GPT 4o minitext → text

greg

11 months ago

Prompt

Classify the following query into one of the following categories:

[recover_password, place_order, payment_issue, get_refund, get_invoice, contact_human_agent]

Return the category that best fits the text. Respond with a single word lowercase.

{query}

main

test.jsonl

N/A

test.jsonl

error no case clause matching: {:error, "resource_not_found", 0, 0} 229 rows17048 tokens$ 0.0028 3 iterations

Classify Intents

30275ab5-602e-47d5-8322-f369b3e9e3b6

OpenAI/GPT 4o minitext → text

greg

11 months ago

Prompt

Classify the following query into one of the following categories:

[recover_password, place_order, payment_issue, get_refund, get_invoice, contact_human_agent]

Return the category that best fits the text. Respond with a single word lowercase.

{query}

main

test.jsonl

completed 5 row sample347 tokens$ 0.0001 1 iteration

Classify Intents

265dd3a0-dd1d-48e5-9bd7-f5bfe6a425f3

OpenAI/GPT 4o minitext → text

greg

11 months ago

Prompt

Classify the following query into one of the following categories:

[recover_password, place_order, payment_issue, get_refund, get_invoice, contact_human_agent]

Return the category that best fits the text. Respond with a single word lowercase.

{instruction}

main

test.jsonl

completed 5 row sample351 tokens$ 0.0001 2 iterations

Loading evaluations...