This guide shows how to install Stable Diffusion locally and run it with a web UI.
Time: 30 minutes
What is Stable Diffusion
Stable Diffusion is a deep learning text-to-image model released by the startup Stability AI in 2022. It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and image-to-image translation guided by a text prompt. (Source: https://en.wikipedia.org/wiki/Stable_Diffusion)
Prerequisites to run Stable Diffusion
Stable Diffusion can run on a CPU or a GPU. A powerful GPU is highly recommended (an Nvidia RTX 2060 in my case), since generating an image on a CPU can take about 10 times as long.
A Python installation is also required. At the time of writing, the web UI used in this guide calls for Python 3.10.
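Before installing, it can help to confirm which Python version is on your machine. The following is a minimal sketch of such a check. The helper name `python_version_ok` is made up for this example, and the strict match on 3.10 simply mirrors the version named above.

```python
import sys

# Major and minor version named in this guide.
REQUIRED = (3, 10)


def python_version_ok(version_info=sys.version_info, required=REQUIRED):
    """Return True if the major.minor version matches the required pair."""
    return tuple(version_info[:2]) == tuple(required)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if python_version_ok():
        print(f"Python {found} matches 3.10")
    else:
        print(f"Python {found} found, but this guide assumes 3.10")
```

Run it with the same interpreter you plan to use for the web UI, since several Python versions can be installed side by side.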
Install Stable Diffusion
- Create a folder where you want to install Stable Diffusion.
- Open a command prompt and navigate to this folder.
- Clone the Stable Diffusion Repository into your folder.
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
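- Visit huggingface.co and download the latest Stable Diffusion “original” weights. At the time of writing, this is version 1.4.
- Move the file to the “models” directory in your Stable Diffusion folder. In current versions of the web UI, the weights belong in the “models/Stable-diffusion” subfolder.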
- Run webui-user.bat and wait for the installation to complete. This will take a while.
- Once the script has finished you will see the IP and port you need to connect to with your web browser (by default http://127.0.0.1:7860).
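If the web UI starts but cannot find a model, the weight file is probably in the wrong place. Here is a small sketch that lists the weight files below the “models” folder of an install directory. The function name `find_weights` and the list of file extensions are assumptions for illustration and are not part of the web UI.

```python
from pathlib import Path

# File extensions commonly used for model weights (an assumption for this sketch).
WEIGHT_SUFFIXES = {".ckpt", ".safetensors"}


def find_weights(install_dir):
    """Return weight files found anywhere below <install_dir>/models, sorted by path."""
    models_dir = Path(install_dir) / "models"
    if not models_dir.is_dir():
        return []
    return sorted(
        p for p in models_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in WEIGHT_SUFFIXES
    )


if __name__ == "__main__":
    # "stable-diffusion-webui" is the folder name that git clone creates by default.
    for path in find_weights("stable-diffusion-webui"):
        print(path)
```

An empty result means no weights were found, so the download or the move step needs another look.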
UPDATE July ’23: You can now find great models on civitai.com. I recommend you check these out.
Use Stable Diffusion
To generate an image from text, enter a description into the “prompt” field at the top and click “generate”. Depending on your hardware, this might take a while. There are a lot of settings to play around with. I will try to explain them below, but I cannot guarantee that every explanation is accurate.
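- Sampling Steps: How many denoising steps are run. Higher numbers generally give better images, but everything above 50 seems to change only very minor things.
- Sampling Method: The sampling algorithm used to generate the image. Different methods give somewhat different results for the same prompt.
- Width / Height: The width and height of the final image. Everything above 512×512 seems to generate weird images for me, most likely because the version 1 models were trained on 512×512 images.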
- Restore faces: As the name suggests, tries to restore faces in images.
- Tiling: Produces an image that can be tiled seamlessly.
- Highres. fix: Generates the image at a lower resolution first and then tries to upscale it.
- Batch Count: How many batches of images to create, one after another.
- Batch Size: How many images are created in each batch. The total number of images is batch count times batch size.
- CFG scale: How much the image should conform to the text input. Lower values can create more creative images.
- Seed: The number used to initialize the random generator. If you use the same seed with the same prompt and settings, you should get the same result. “-1” seems to mean “pick a random seed”.
- Script: Some scripts for additional functionality. One of them can import a file with multiple prompts.
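To make the batch and seed settings more concrete, here is a small stand-alone sketch. It does not talk to the web UI at all. The class `GenerationSettings`, its default values, and the `fake_noise` helper are invented for illustration and only mimic how a fixed seed makes random output repeatable.

```python
import random
from dataclasses import dataclass


@dataclass
class GenerationSettings:
    """Illustrative container for some of the settings listed above."""
    prompt: str
    steps: int = 20          # example default, not taken from the web UI
    width: int = 512
    height: int = 512
    cfg_scale: float = 7.0   # example default, not taken from the web UI
    seed: int = -1           # -1 means "pick a random seed"
    batch_count: int = 1
    batch_size: int = 1

    def total_images(self):
        """Total images produced: number of batches times images per batch."""
        return self.batch_count * self.batch_size

    def resolved_seed(self):
        """Return the seed itself, or a fresh random one when it is -1."""
        if self.seed == -1:
            return random.randrange(2**32)
        return self.seed


def fake_noise(seed, n=4):
    """Stand-in for the starting noise: the same seed gives the same numbers."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


if __name__ == "__main__":
    settings = GenerationSettings("a castle at sunset", batch_count=3, batch_size=4)
    print(settings.total_images())          # 12
    print(fake_noise(42) == fake_noise(42))  # True
```

The real generator is far more complex, but the idea is the same: the seed fixes the random starting point, so identical settings with an identical seed should reproduce the image.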
All images that you create can be found in the “outputs” folder.
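After a few sessions the “outputs” folder fills up quickly. The following sketch lists the most recently created images. The function name `newest_images` is made up for this example, and it assumes that the images are saved as PNG files somewhere below the folder you pass in.

```python
from pathlib import Path


def newest_images(outputs_dir, limit=5):
    """Return up to `limit` PNG files below outputs_dir, newest first."""
    root = Path(outputs_dir)
    if not root.is_dir():
        return []
    images = [p for p in root.rglob("*.png") if p.is_file()]
    # Sort by modification time so the latest results come first.
    images.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return images[:limit]


if __name__ == "__main__":
    # Adjust the path to wherever your Stable Diffusion folder lives.
    for image in newest_images("stable-diffusion-webui/outputs"):
        print(image)
```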
Stable Diffusion is a lot of fun to play around with. It is awesome to see what AI is capable of today.