main
ultrachat_200k_test_sft.parquet
text → text
prompt_prediction
You are an expert in NLP and prompt analysis. Your task is to evaluate a **single user prompt** (plain text input) based on predefined categories and return structured JSON data for easier post-processing.
---
### **Input Format**
You will receive a **single user prompt as a string** in the following format:
"User's prompt goes here."
The prompt consists of **normal text**, where the user asks the assistant to perform a task.
Analyze the given **prompt only** (do not consider any future conversation) and classify it according to the categories below.
---
### **1. Top 3 Topics**
Select **up to 3** topics that are relevant to the prompt from the following list:
["Healthcare", "Finance", "Education", "Technology", "Science", "Politics", "Environment", "Ethics", "Entertainment", "History", "Philosophy", "Psychology", "Sports", "Legal", "Business", "Travel", "Food", "Art", "Literature", "Personal Development"]
- The **first topic** should be the **most dominant** in the prompt.
- The **second and third topics** should reflect **other significant themes** in the prompt.
- If a prompt **only has one or two clear topics**, leave the remaining slots **empty**.
---
### **2. Language Style**
Choose **one** label that best describes the tone of the prompt.
- **"Formal"**
- **"Informal"**
- **"Mixed"**
---
### **3. Grammar & Slang in User Input**
- **"Perfect"** (No mistakes, professional style)
- **"Minor Errors"** (Small grammar/spelling mistakes, but understandable)
- **"Major Errors"** (Frequent grammar mistakes, difficult to read)
- **"Contains Slang"** (Uses informal slang expressions)
---
### **4. Type of Instruction Given to Assistant**
Choose **one** category that best describes what the user is asking the assistant to do.
- **Content Generation** → User asks for creative content, including writing, design ideas, or brainstorming responses.
  - Example: `"Create a t-shirt design about animal rights."`
  - Example: `"Write a short sci-fi story."`
  - Example: `"Generate ideas for a marketing slogan."`
- **Factual Inquiry** → User requests objective facts, statistics, or comparisons with clear, verifiable answers.
  - Example: `"What are the top 5 largest animal rights organizations?"`
  - Example: `"Give me statistics on deforestation and animal extinction."`
  - Example: `"Compare the environmental impact of cotton vs. synthetic fabrics."`
- **Opinion-Seeking** → User explicitly asks for subjective input, recommendations, or an evaluative stance.
  - Example: `"What’s your opinion on using synthetic leather?"`
  - Example: `"Do you think my t-shirt design idea is effective?"`
  - Example: `"What’s the best way to convince people to care about animal rights?"`
- **Task-Oriented** → User asks for structured assistance, edits, refinements, or summarization of existing content.
  - Example: `"Summarize the key points from this discussion."`
  - Example: `"Improve my t-shirt design by making it more dynamic."`
  - Example: `"Make my speech more persuasive."`
- **Conversational Engagement** → User initiates casual, open-ended dialogue with no clear task or goal.
  - Example: `"What do you think about animal welfare?"`
  - Example: `"Tell me something interesting about t-shirts!"`
  - Example: `"Let’s chat about animal rights history."`
---
### **Output Format**
Return structured **JSON output** in this format:
```json
{
  "topics": ["Art", "Science", "Healthcare"],
  "language_style": "Formal",
  "grammar_slang": "Perfect",
  "instruction_type": "Content Generation"
}
```
---
### **Instructions**
- Analyze only the provided prompt (do not infer from missing context).
- Ensure the response contains at least one topic (empty output is invalid).
- Select up to 3 most relevant topics, ordered by prominence.
- Use only the predefined options for consistency.
- Do not add explanations; return only JSON.
Now, analyze the following prompt (plain text input):
{{prompt}}

Mar 16, 2025, 12:00 PM UTC
5 rows processed, 4819 tokens used ($0.0010)
Estimated cost for all 23110 rows: $4.45
4 columns, 1-5 of 23110 rows
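
Because the prompt asks for JSON restricted to predefined options, each model response can be checked mechanically before post-processing. The following is a minimal sketch in Python, not part of the original pipeline: the function name `validate_response` and the constant names are hypothetical, and treating an "empty" topic slot as `""` or `null` is an assumption, since the prompt does not say how empty slots are encoded.

```python
import json

# Allowed values, copied from the category lists in the prompt.
TOPICS = (
    "Healthcare", "Finance", "Education", "Technology", "Science",
    "Politics", "Environment", "Ethics", "Entertainment", "History",
    "Philosophy", "Psychology", "Sports", "Legal", "Business",
    "Travel", "Food", "Art", "Literature", "Personal Development",
)
SINGLE_CHOICE = {
    "language_style": ("Formal", "Informal", "Mixed"),
    "grammar_slang": ("Perfect", "Minor Errors", "Major Errors", "Contains Slang"),
    "instruction_type": (
        "Content Generation", "Factual Inquiry", "Opinion-Seeking",
        "Task-Oriented", "Conversational Engagement",
    ),
}


def validate_response(raw):
    """Return a list of problems found in a response (empty list means valid)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    topics = data.get("topics")
    if not isinstance(topics, list):
        problems.append("topics must be a list")
    else:
        # Assumption: empty slots may arrive as "" or null; ignore them.
        named = [t for t in topics if t not in ("", None)]
        if not 1 <= len(named) <= 3:
            problems.append("topics must contain between 1 and 3 entries")
        unknown = [t for t in named if t not in TOPICS]
        if unknown:
            problems.append(f"topics outside the predefined list: {unknown}")

    # Each remaining field must hold exactly one predefined label.
    for field, allowed in SINGLE_CHOICE.items():
        value = data.get(field)
        if value not in allowed:
            problems.append(f"{field}: unexpected value {value!r}")
    return problems


example = """
{
  "topics": ["Art", "Science", "Healthcare"],
  "language_style": "Formal",
  "grammar_slang": "Perfect",
  "instruction_type": "Content Generation"
}
"""
print(validate_response(example))  # prints []
```

Returning a list of problems rather than raising on the first one makes it easier to log every defect in a row when scanning a large batch of responses.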