Prompt Engineering
Introduction
Creating an image that matches a specific expectation is harder than it looks. The solution space of an AI model is enormous, so a well-engineered prompt is needed to steer the model in the right direction. This practice is called prompt engineering.
Prompt guides
- https://stable-diffusion-art.com/how-to-come-up-with-good-prompts-for-ai-image-generation/#Some_good_keywords_for_you
- https://anakin.ai/blog/stable-diffusion-prompt-guide/
- https://cheatsheet.md/stable-diffusion/stable-diffusion-prompts-guide.en
- https://cheatsheet.md/stable-diffusion/stable-diffusion-webui-styles
- https://stable-diffusion-art.com/
- https://supagruen.github.io/StableDiffusion-CheatSheet/
Resource references
- https://civitai.com/
- https://www.mage.space/explore
- https://tensor.art/
- https://majinai.art/
- https://openmodeldb.info/
Prompt perfecters
Using ChatGPT to optimize a prompt can be efficient, because it is not always clear how the AI will respond to a certain prompt. Templates are available to help kickstart prompt refinement with ChatGPT.
- https://www.feedough.com/stable-diffusion-prompt-generator/
- https://huggingface.co/spaces/Gustavosta/MagicPrompt-Stable-Diffusion
- https://sd-prompt-generator.netlify.app/
SD-Web-UI Forge workflow
To get as close as possible to the imagined concept, as efficiently as possible, the following workflow is proposed:
1. Database selection:
   - Checkpoint selection: each model is trained for a specific kind of output, so use the right one. (Multiple checkpoints can be combined.)
   - Add LoRA or LyCORIS files to improve specific features (download at: https://civitai.com/models).
2. Prompt creation:
   - Use a prompt perfecter to arrive at a good descriptive prompt.
   - Use CLIP Interrogator to retrieve a prompt from an example image; this can supply detailed prompts for specific features.
   - Use a fixed seed in order to get the same results for the same prompt.
3. Choose settings:
   - Choose a proper sampling method. Each method has its own benefits and downsides.
   - Use a relatively low resolution, but at the desired aspect ratio. The exact resolution depends on the database models used (e.g. 512x512, 768x768, 1024x1024 px).
4. Image generation:
   - Iterate on the prompt until the concept is mainly displayed.
5. Finalize the details:
   - Move to the img2img tab and use inpainting to improve certain aspects.
6. Upscale image:
   - Move to the Extras tab and set a higher resolution and iterations. Fine-tune the prompt until all details are correct.
7. Save the image, and save the prompt and settings.
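The settings and bookkeeping steps of this workflow can be sketched in Python. This is only an illustration and not part of SD-Web-UI Forge: the helper names `draft_resolution` and `save_settings`, the keys of the settings dictionary, the checkpoint file name, and the rounding to multiples of 8 pixels are all assumptions. The sketch picks a low draft resolution at the desired aspect ratio and writes the prompt, seed and settings to a JSON file so that a result can be reproduced later.

```python
import json
import math
from pathlib import Path


def draft_resolution(aspect_w, aspect_h, base=512, multiple=8):
    """Return (width, height) with roughly base*base pixels at the given aspect ratio.

    `base` should match the resolution the checkpoint was trained on
    (e.g. 512, 768 or 1024). Rounding to a multiple of 8 is an assumption.
    """
    ratio = aspect_w / aspect_h
    width = math.sqrt(base * base * ratio)
    height = width / ratio

    def snap(value):
        # Round to the nearest allowed step, never below one step.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)


def save_settings(path, prompt, seed, sampler, width, height, checkpoint):
    """Write prompt and settings to a JSON file and return them as a dict."""
    settings = {
        "prompt": prompt,
        "seed": seed,  # fixed seed: same prompt and settings give the same image
        "sampler": sampler,
        "width": width,
        "height": height,
        "checkpoint": checkpoint,
    }
    Path(path).write_text(json.dumps(settings, indent=2), encoding="utf-8")
    return settings


if __name__ == "__main__":
    width, height = draft_resolution(16, 9, base=512)  # (680, 384)
    save_settings(
        "knight_settings.json",            # hypothetical output file
        prompt="a knight running, oil painting",
        seed=1234,
        sampler="Euler a",
        width=width,
        height=height,
        checkpoint="example-checkpoint.safetensors",  # hypothetical name
    )
```

Keeping the JSON file next to the saved image makes it possible to return to the same seed and settings when fine-tuning the prompt in a later session.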
Prompt format
A good prompt consists of most of the following elements:
- Subject(s)
- Background
- Medium
- Artistic style
- Lighting
- Resolution
Subject(s)
- Description (What kind of subject is it?)
  - Professions, like: 'artist, magician, knight'
  - Clothes or accessories
  - Body features, like: 'brown hair, blue eyes'
- Action (What is the subject doing?)
  - Running, jumping, dancing, etc.
- Pose (Which pose should the subject have? Only relevant for creatures!)
  - Extract pose from built-in ControlNet and OpenPose Editor.
- Location (Where in the picture should the subject be displayed?)
  - Regional prompting can be used to indicate a sub region in the picture.
- Orientation (From which direction should the subject be displayed?)
  - Use camera angle descriptions, like: 'low angle'
- Distance (From which distance should the subject be displayed?)
  - Use camera shot descriptions, like: 'medium shot'
  - Use length units, like: 'viewed from 100 meters'
Background
The background provides the overall setting of the picture.
- Seasons (Winter, Spring, Summer, Fall)
- Nature (mountains, beach, hills, grass, desert)
- Human-made environments (hotel, restaurant, road, city)
Medium
The medium is the type of picture, such as illustration, oil painting, 3D rendering, digital art, or photography.
Artistic style
The style refers to the artistic style of the image, such as impressionist, surrealist, or pop art. The names of famous artists can help. Use the SDXL Styles Editor to quickly select a preset.
Lighting
The type of lighting is important because it has a large impact on the mood of a picture. Examples: studio lighting, warm light, crepuscular rays, rim lighting.
Resolution
Resolution describes how sharp and detailed the image should be, for example: highly detailed, sharp focus, 8K, HD.
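The prompt format described above can be sketched as a small Python helper. This is a minimal sketch under stated assumptions: the class name `PromptParts`, its field names, and the choice to join the parts with commas in the order listed in this section are illustrative and are not prescribed by any tool. The keywords in the usage example are taken from the examples in this section.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """One field per prompt element; only the subject is required."""
    subject: str
    background: str = ""
    medium: str = ""
    style: str = ""
    lighting: str = ""
    resolution: str = ""

    def build(self):
        # Keep the order used in this section and skip empty elements.
        parts = [
            self.subject,
            self.background,
            self.medium,
            self.style,
            self.lighting,
            self.resolution,
        ]
        return ", ".join(p.strip() for p in parts if p.strip())


if __name__ == "__main__":
    prompt = PromptParts(
        subject="knight with brown hair and blue eyes, running, low angle, medium shot",
        background="mountains in winter",
        medium="oil painting",
        style="impressionist",
        lighting="warm light",
        resolution="highly detailed, sharp focus",
    ).build()
    print(prompt)
```

Keeping the elements in separate fields makes it easier to change one aspect at a time, for example swapping only the lighting, while iterating on a prompt.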