Prompt Engineering
Introduction
Creating an image that matches a certain expectation is not as trivial as one might expect. The solution space of the AI model is extremely large. To narrow it down, a well-engineered prompt can be used to steer the AI in the right direction. This practice is called Prompt Engineering.
Prompt guides
- https://stable-diffusion-art.com/how-to-come-up-with-good-prompts-for-ai-image-generation/#Some_good_keywords_for_you
- https://anakin.ai/blog/stable-diffusion-prompt-guide/
- https://cheatsheet.md/stable-diffusion/stable-diffusion-prompts-guide.en
- https://cheatsheet.md/stable-diffusion/stable-diffusion-webui-styles
- https://stable-diffusion-art.com/
- https://supagruen.github.io/StableDiffusion-CheatSheet/
Resource references
- https://civitai.com/
- https://www.mage.space/explore
- https://tensor.art/
- https://majinai.art/
- https://openmodeldb.info/
Prompt perfecters
Using ChatGPT to optimize the prompt is efficient, because it is not always clear how the AI will respond to a certain prompt. Templates are available to help kickstart prompt refinement via ChatGPT.
- https://www.feedough.com/stable-diffusion-prompt-generator/
- https://huggingface.co/spaces/Gustavosta/MagicPrompt-Stable-Diffusion
- https://sd-prompt-generator.netlify.app/
SD-Web-UI Forge workflow
To get as close as possible to the imagined concept, as efficiently as possible, the following workflow is proposed:
- Database selection:
- Checkpoint selection: Each model is trained for a specific kind of output, so use the right one. (Multiple checkpoints can be combined.)
- Add additional LoRA or LyCORIS files to improve specific features (download at: https://civitai.com/models)
- Prompt creation:
- Use a prompt perfecter to arrive at a good descriptive prompt
- Use CLIP Interrogator to retrieve a prompt from an example image; this can yield detailed prompts for specific features
- Use a fixed seed in order to get the same result for the same prompt.
- Choose settings:
- Choose a proper sampling method. Each method has its own benefits and downsides.
- Use a relatively low resolution, but at the desired aspect ratio. The exact resolution depends on the model used (e.g. 512x512, 768x768, 1024x1024 px).
- Image generation:
- Iterate on the prompt until the image largely matches the concept.
- Finalize the details:
- Move to the img2img tab and use inpainting to improve specific aspects
- Upscale image:
- Move to the Extras tab and set a higher resolution and more iterations. Fine-tune the prompt until all details are correct.
- Save image and save prompt & settings.
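The settings steps above (fixed seed, sampling method, low resolution at the desired aspect ratio) can be sketched as a small script. This is only an illustration under assumptions: the helper names `low_res_size` and `txt2img_payload` are hypothetical, rounding to multiples of 8 is an assumption based on how Stable Diffusion UIs usually constrain image sizes, and the payload keys follow the txt2img request format of the AUTOMATIC1111-style web API that Forge is assumed to expose when the API is enabled. The script only builds and prints the request body; it does not contact a server.

```python
import json


def low_res_size(aspect_w, aspect_h, base=512, multiple=8):
    """Return (width, height) whose short side is `base` and whose long side
    follows the aspect ratio, both rounded to the nearest `multiple`.
    Rounding to multiples of 8 is an assumption, not taken from the text."""
    def snap(value):
        return max(multiple, round(value / multiple) * multiple)

    if aspect_w >= aspect_h:
        return snap(base * aspect_w / aspect_h), snap(base)
    return snap(base), snap(base * aspect_h / aspect_w)


def txt2img_payload(prompt, seed, width, height,
                    sampler_name="Euler a", steps=20):
    """Collect the generation settings in one dictionary.
    The key names are assumed to match the web UI's txt2img API."""
    return {
        "prompt": prompt,
        "seed": seed,                  # fixed seed: same prompt, same result
        "sampler_name": sampler_name,  # the chosen sampling method
        "steps": steps,
        "width": width,
        "height": height,
    }


# Portrait image (2:3) for a model with a 512 px base resolution.
width, height = low_res_size(2, 3, base=512)
payload = txt2img_payload("knight, oil painting, warm light",
                          seed=1234, width=width, height=height)
print(json.dumps(payload, indent=2))
```

Keeping the seed and size fixed while only the prompt text changes makes it easier to see what each prompt change actually does. Saving the printed settings next to the image covers the last step of the workflow.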
Prompt format
A good prompt consists of most of the following elements:
- Subject(s)
- Background
- Medium
- Artistic style
- Lighting
- Resolution
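As a minimal sketch of how these elements can be combined, the snippet below joins them into one comma-separated prompt in the order listed above. The `PromptParts` class and `build_prompt` function are hypothetical names used for illustration only; comma-separated keywords are a common convention, not a requirement of any particular model.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """One field per prompt element; empty fields are skipped."""
    subject: str = ""
    background: str = ""
    medium: str = ""
    style: str = ""
    lighting: str = ""
    resolution: str = ""


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty elements into one comma-separated prompt."""
    ordered = [
        parts.subject,
        parts.background,
        parts.medium,
        parts.style,
        parts.lighting,
        parts.resolution,
    ]
    return ", ".join(p.strip() for p in ordered if p.strip())


example = PromptParts(
    subject="knight, brown hair, blue eyes, running, low angle, medium shot",
    background="mountains, winter",
    medium="oil painting",
    style="impressionist",
    lighting="warm light",
    resolution="highly detailed, sharp focus",
)
print(build_prompt(example))
```

Keeping each element in its own field makes it easy to change one aspect, for example the lighting, while leaving the rest of the prompt untouched between iterations.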
Subject(s)
- Description (What kind of subject is it?)
- Professions, like: 'artist, magician, knight'
- Clothes or accessories
- Body features, like: 'brown hair, blue eyes'
- Action (what is the subject doing?)
- Running, jumping, dancing, etc.
- Pose (Which pose should the subject have? Only relevant for creatures!)
- Extract the pose using the built-in ControlNet and OpenPose Editor.
- Location (Where in the picture should the subject be displayed?)
- Regional prompting can be used to indicate a subregion of the picture.
- Orientation (From which direction should the subject be displayed?)
- Use camera angle descriptions, like: 'low angle'
- Distance (From which distance should the subject be displayed?)
- Use camera shot descriptions, like: 'medium shot'
- Use length units, like: 'viewed from 100 meters'
Background
The background provides the overall setting in which the picture is presented.
- Seasons (Winter, Spring, Summer, Fall)
- Nature (mountains, beach, hills, grass, desert)
- Human-made environments (hotel, restaurant, road, city)
Medium
The medium is the type of picture, like: illustration, oil painting, 3D rendering, digital art and photography.
Artistic style
The style refers to the artistic style of the image, like: impressionist, surrealist, pop art. Use the SDXL Styles Editor to quickly select a preset.
Lighting
The type of lighting is important, because it has a large impact on the emotional value of a picture, like: studio lighting, warm light.
Resolution
Resolution keywords describe how sharp and detailed the image should be, like: highly detailed, sharp focus, 8K, HD.