Evaluations/5e7d827d-d4ab-4452-97b2-2971e39ffecf
main
gouache_images_768.parquet
imagetext
OpenAI/GPT 4o mini
captions
You are creating single-line captions to train a style LoRA for WAN 2.1 T2V.
Output one comma-separated line (no line breaks) per image. Do not include character names, brand/IP names, or camera/lens models.
Goal: describe what’s visible while emphasizing visual style over identity. Keep it generic (no personal/unique names), and suitable for both images and as base prompts for T2V.
Length: 25–60 words.
Order & fields (use as phrases, separated by commas):
Base class & subject (generic): e.g., “illustration of a person”, “cartoon mermaid”, “fantasy landscape”, “product render”, “animal portrait”.
Style & medium keywords: “cartoon animation aesthetic”, “digital illustration”, “vector-like shading”, “painterly brush texture”, “3D toon render”, etc.
Palette & materials: key colors or materials/patterns seen (e.g., “bright cyan and gold palette”, “scales texture”, “glossy highlights”).
Lighting & mood: “soft studio lighting”, “bright high-key”, “warm rim light”, “cheerful mood”.
Composition & shot type: “close portrait”, “medium shot”, “full body”, “three-quarter view”, “centered composition”, “white background”.
Notable visible details (generic): gestures, props, background simplicity, facial expression (“left eye closed winking”, “holding golden trident”, “hair flowing”, “clean background”).
Quality/layout helpers (optional): “clean edges, smooth gradients, minimal noise”.
Rules:
Be factual (describe only what’s visible).
Use generic nouns (“person”, “figure”, “character”) instead of names or brands.
If text appears in the image, say “a sign with text” (don’t transcribe).
Only mention gender/age/ethnicity if visually clear and relevant.
For faces, you may note expression/eye state (“smiling”, “eyes closed”, “left eye closed winking”).
Avoid negatives, camera metadata, and subjective adjectives like “beautiful”, “masterpiece”.
No trigger words; this is style training, not character training.
Output format example:
cartoon mermaid, digital illustration, cartoon animation aesthetic, bright blue and gold palette with scale textures, soft high-key lighting, full body three-quarter view, hair flowing, holding a golden trident, cheerful mood, clean white background, smooth gradients, minimal noise
More examples:
person portrait, vector-style digital art, saturated warm palette with teal accents, soft studio lighting, close headshot, left eye closed winking and smiling, clean background, smooth shading, crisp linework
fantasy character, painterly toon rendering, cyan and magenta palette, rim-lit, medium shot seated, gesturing with one hand, simple gradient background, subtle film grain, soft brush texture
product render, flat graphic design aesthetic, pastel palette, even lighting, centered composition on white, icon-like silhouette, clean edges, minimal shadows

caption this image {file_path}
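
The mechanical rules in the prompt above (one line, comma-separated phrases, 25–60 words, no subjective adjectives such as "beautiful" or "masterpiece") can be checked automatically on the generated captions. The following is a minimal sketch, not part of the evaluation itself: the function name `check_caption` and the banned-word set are hypothetical, and rules that need the image (for example "describe only what's visible") are out of scope.

```python
MIN_WORDS, MAX_WORDS = 25, 60
# Hypothetical list: only the two adjectives the prompt names explicitly.
BANNED_ADJECTIVES = {"beautiful", "masterpiece"}


def check_caption(caption: str) -> list[str]:
    """Return a list of rule violations; an empty list means the caption passes."""
    problems = []
    text = caption.strip()

    # Rule: one line per image, no line breaks.
    if "\n" in text:
        problems.append("contains a line break")

    # Rule: phrases are separated by commas.
    if "," not in text:
        problems.append("not comma-separated")

    # Rule: length of 25-60 words (whitespace-separated tokens).
    words = text.split()
    if not MIN_WORDS <= len(words) <= MAX_WORDS:
        problems.append(f"word count {len(words)} outside {MIN_WORDS}-{MAX_WORDS}")

    # Rule: avoid subjective adjectives.
    normalized = {w.strip(",.").lower() for w in words}
    for adjective in sorted(BANNED_ADJECTIVES & normalized):
        problems.append(f"subjective adjective: {adjective}")

    return problems


example = (
    "cartoon mermaid, digital illustration, cartoon animation aesthetic, "
    "bright blue and gold palette with scale textures, soft high-key lighting, "
    "full body three-quarter view, hair flowing, holding a golden trident, "
    "cheerful mood, clean white background, smooth gradients, minimal noise"
)
print(check_caption(example))
```

Running a check like this over the output column before training would flag captions that drift from the requested format, so they can be regenerated or fixed by hand.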
Oct 7, 2025, 4:32 PM UTC
Oct 7, 2025, 4:33 PM UTC
5 row sample
5 rows processed, 74195 tokens used ($0.0112)
Estimated cost for all 57 rows: $0.1281
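
The full-run estimate appears to be a linear extrapolation from the sample. The sketch below reproduces that arithmetic under the assumption that cost scales proportionally with row count. The helper name `estimate_total_cost` is hypothetical. Using the rounded sample cost shown above gives roughly $0.1277 rather than the reported $0.1281, and the small gap is presumably because the platform extrapolates from an unrounded sample cost.

```python
def estimate_total_cost(sample_cost: float, sample_rows: int, total_rows: int) -> float:
    """Scale the cost of a sample linearly to the full dataset."""
    if sample_rows <= 0:
        raise ValueError("sample_rows must be positive")
    cost_per_row = sample_cost / sample_rows
    return cost_per_row * total_rows


# Figures from the sample run: 5 rows cost $0.0112; the dataset has 57 rows.
estimate = estimate_total_cost(sample_cost=0.0112, sample_rows=5, total_rows=57)
print(f"${estimate:.4f}")
```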
Sample Results completed
2 columns, 1-5 of 57 rows