History
Total running cost: $0.1481
| Prompt | Rows | Type | Model | Target | Status | Runtime | Run | By | Tokens | Cost | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Run | You are creating single-line captions to train a style LoRA for WAN 2.1 T2V.
Output one comma-separated line (no line breaks) per image. Do not include character names, brand/IP names, or camera/lens models.
Goal: describe what’s visible while emphasizing visual style over identity. Keep it generic (no personal/unique names), and suitable for both images and as base prompts for T2V.
Length: 25–60 words.
Order & fields (use as phrases, separated by commas):
Base class & subject (generic): e.g., “illustration of a person”, “cartoon mermaid”, “fantasy landscape”, “product render”, “animal portrait”.
Style & medium keywords: “cartoon animation aesthetic”, “digital illustration”, “vector-like shading”, “painterly brush texture”, “3D toon render”, etc.
Palette & materials: key colors or materials/patterns seen (e.g., “bright cyan and gold palette”, “scales texture”, “glossy highlights”).
Lighting & mood: “soft studio lighting”, “bright high-key”, “warm rim light”, “cheerful mood”.
Composition & shot type: “close portrait”, “medium shot”, “full body”, “three-quarter view”, “centered composition”, “white background”.
Notable visible details (generic): gestures, props, background simplicity, facial expression (“left eye closed winking”, “holding golden trident”, “hair flowing”, “clean background”).
Quality/layout helpers (optional): “clean edges, smooth gradients, minimal noise”.
Rules:
Be factual (describe only what’s visible).
Use generic nouns (“person”, “figure”, “character”) instead of names or brands.
If text appears in the image, say “a sign with text” (don’t transcribe).
Only mention gender/age/ethnicity if visually clear and relevant.
For faces, you may note expression/eye state (“smiling”, “eyes closed”, “left eye closed winking”).
Avoid negatives, camera metadata, and subjective adjectives like “beautiful”, “masterpiece”.
No trigger words; this is style training, not character training.
Output format example:
cartoon mermaid, digital illustration, cartoon animation aesthetic, bright blue and gold palette with scale textures, soft high-key lighting, full body three-quarter view, hair flowing, holding a golden trident, cheerful mood, clean white background, smooth gradients, minimal noise
More examples:
person portrait, vector-style digital art, saturated warm palette with teal accents, soft studio lighting, close headshot, left eye closed winking and smiling, clean background, smooth shading, crisp linework
fantasy character, painterly toon rendering, cyan and magenta palette, rim-lit, medium shot seated, gesturing with one hand, simple gradient background, subtle film grain, soft brush texture
product render, flat graphic design aesthetic, pastel palette, even lighting, centered composition on white, icon-like silhouette, clean edges, minimal shadows
caption this image {file_path} | 57 | image → text | 4caa69e0b558fc48badfa2408f1facd1 | completed | 00:03:01 | 2 months ago | shadowworks | 831066 tokens | $ 0.1259 | |
| Sample | You are creating single-line captions to train a style LoRA for WAN 2.1 T2V.
Output one comma-separated line (no line breaks) per image. Do not include character names, brand/IP names, or camera/lens models.
Goal: describe what’s visible while emphasizing visual style over identity. Keep it generic (no personal/unique names), and suitable for both images and as base prompts for T2V.
Length: 25–60 words.
Order & fields (use as phrases, separated by commas):
Base class & subject (generic): e.g., “illustration of a person”, “cartoon mermaid”, “fantasy landscape”, “product render”, “animal portrait”.
Style & medium keywords: “cartoon animation aesthetic”, “digital illustration”, “vector-like shading”, “painterly brush texture”, “3D toon render”, etc.
Palette & materials: key colors or materials/patterns seen (e.g., “bright cyan and gold palette”, “scales texture”, “glossy highlights”).
Lighting & mood: “soft studio lighting”, “bright high-key”, “warm rim light”, “cheerful mood”.
Composition & shot type: “close portrait”, “medium shot”, “full body”, “three-quarter view”, “centered composition”, “white background”.
Notable visible details (generic): gestures, props, background simplicity, facial expression (“left eye closed winking”, “holding golden trident”, “hair flowing”, “clean background”).
Quality/layout helpers (optional): “clean edges, smooth gradients, minimal noise”.
Rules:
Be factual (describe only what’s visible).
Use generic nouns (“person”, “figure”, “character”) instead of names or brands.
If text appears in the image, say “a sign with text” (don’t transcribe).
Only mention gender/age/ethnicity if visually clear and relevant.
For faces, you may note expression/eye state (“smiling”, “eyes closed”, “left eye closed winking”).
Avoid negatives, camera metadata, and subjective adjectives like “beautiful”, “masterpiece”.
No trigger words; this is style training, not character training.
Output format example:
cartoon mermaid, digital illustration, cartoon animation aesthetic, bright blue and gold palette with scale textures, soft high-key lighting, full body three-quarter view, hair flowing, holding a golden trident, cheerful mood, clean white background, smooth gradients, minimal noise
More examples:
person portrait, vector-style digital art, saturated warm palette with teal accents, soft studio lighting, close headshot, left eye closed winking and smiling, clean background, smooth shading, crisp linework
fantasy character, painterly toon rendering, cyan and magenta palette, rim-lit, medium shot seated, gesturing with one hand, simple gradient background, subtle film grain, soft brush texture
product render, flat graphic design aesthetic, pastel palette, even lighting, centered composition on white, icon-like silhouette, clean edges, minimal shadows
caption this image {file_path} | 5 | image → text | Sample - N/A | completed | 00:00:23 | 2 months ago | shadowworks | 74195 tokens | $ 0.0112 | |
| Sample | caption this image {file_path} to be used in building a training dataset for a style LoRA to be used with WAN 2.1 14B T2V. Focus on the image's aesthetic and overall style techniques. | 5 | image → text | Sample - N/A | completed | 00:00:31 | 2 months ago | shadowworks | 71557 tokens | $ 0.0109 | |
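
The format rules stated in the long captioning prompt above (one comma-separated line, 25–60 words, no subjective adjectives such as “beautiful” or “masterpiece”) can be spot-checked mechanically before captions go into a training set. The following is a minimal Python sketch under stated assumptions: `check_caption` and its constants are hypothetical names, not part of any tool shown in this history, and counting words by splitting on whitespace and commas is an assumed convention, since the prompt does not define how words are counted.

```python
import re

# Limits taken from the prompt's "Length: 25–60 words" rule.
MIN_WORDS, MAX_WORDS = 25, 60
# Only the two adjectives the prompt names; extend as needed (assumption).
BANNED_ADJECTIVES = {"beautiful", "masterpiece"}


def check_caption(caption: str) -> list[str]:
    """Return a list of rule violations; an empty list means no problems found."""
    problems = []

    # Rule: one line per image, no line breaks.
    if "\n" in caption.strip():
        problems.append("contains a line break")

    # Assumed word-count convention: tokens separated by whitespace or commas.
    words = re.findall(r"[^\s,]+", caption)
    if not MIN_WORDS <= len(words) <= MAX_WORDS:
        problems.append(
            f"word count {len(words)} outside {MIN_WORDS}-{MAX_WORDS}"
        )

    # Rule: phrases separated by commas, with no empty phrases.
    phrases = [p.strip() for p in caption.split(",")]
    if len(phrases) < 2 or any(not p for p in phrases):
        problems.append("not a clean comma-separated list of phrases")

    # Rule: avoid subjective adjectives.
    hits = sorted({w.lower() for w in words} & BANNED_ADJECTIVES)
    if hits:
        problems.append("subjective adjectives: " + ", ".join(hits))

    return problems


# The prompt's own "Output format example", used here as a sample input.
EXAMPLE = (
    "cartoon mermaid, digital illustration, cartoon animation aesthetic, "
    "bright blue and gold palette with scale textures, soft high-key lighting, "
    "full body three-quarter view, hair flowing, holding a golden trident, "
    "cheerful mood, clean white background, smooth gradients, minimal noise"
)

if __name__ == "__main__":
    print(check_caption(EXAMPLE))
```

The check covers only the mechanical rules. Whether a caption is factual, generic, and free of names or transcribed text still needs a human or model review.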