Install and run Stable Diffusion locally with Web UI on Windows

This guide shows how to install Stable Diffusion locally on Windows and run it through a web UI.

Difficulty: Easy
Time: 30 minutes

What is Stable Diffusion

Stable Diffusion is a deep-learning text-to-image model released by the startup Stability AI in 2022. It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and image-to-image translation guided by a text prompt.
Source: https://en.wikipedia.org/wiki/Stable_Diffusion

Prerequisites to run Stable Diffusion

Stable Diffusion can run on a CPU or a GPU. A powerful GPU is strongly recommended (I use an Nvidia RTX 2060), since generating images on a CPU can take about 10 times as long.
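If you are not sure whether your machine has a usable Nvidia GPU, a quick check along the following lines can help. This is only a sketch: it assumes the Nvidia driver is installed and that its nvidia-smi tool is on your PATH, and the helper name nvidia_gpu_names is made up for this example. An empty result does not prove there is no GPU, only that nvidia-smi could not report one.

```python
import shutil
import subprocess


def nvidia_gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] if none are reported."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # driver tools not found on PATH
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    # One GPU name per output line
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


names = nvidia_gpu_names()
print("NVIDIA GPUs:", names if names else "none detected")
```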

Python must be installed to run this tool. At the time of writing, the web UI calls for Python 3.10, so install that version rather than simply the newest one. Git is also needed for the clone step below.
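To confirm which Python version you actually have, you can run a small script like this one with the interpreter you plan to use. It is a minimal sketch: the REQUIRED constant and the python_matches helper are names invented here, and the version is the one named in this guide, so adjust it if the requirements have changed.

```python
import sys

REQUIRED = (3, 10)  # version named in this guide; an assumption if requirements change


def python_matches(version_info=sys.version_info, required=REQUIRED):
    """True if the major.minor part of the version equals the required one."""
    return tuple(version_info[:2]) == required


print("Running Python", ".".join(str(part) for part in sys.version_info[:3]))
print("Matches 3.10:", python_matches())
```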

Install Stable Diffusion

Create a folder where you want to install Stable Diffusion.
Open a command prompt and navigate to this folder.
Clone the Stable Diffusion web UI repository into your folder:

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git

Visit HuggingFace.co and download the latest Stable Diffusion “original” weights. At the time of writing, this is version 1.4.
Move the file to the “models” directory in your Stable Diffusion folder (current versions of the web UI look for checkpoints in the models\Stable-diffusion subfolder).
Run webui-user.bat and wait for the installation to complete. This will take a while.
Once the script has finished, the console shows the address (IP and port) to open in your web browser. By default this is usually http://127.0.0.1:7860.
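If the page does not open, it helps to check the two things the steps above depend on: that a weights file really sits somewhere below the “models” folder, and that something is listening on the port. The following sketch does both. The folder name, the file extensions, and the address 127.0.0.1:7860 are assumptions based on a default setup, and both helper names are made up for this example.

```python
import socket
from pathlib import Path

CHECKPOINT_SUFFIXES = {".ckpt", ".safetensors"}  # common weight file extensions


def find_checkpoints(install_dir):
    """List weight files anywhere below <install_dir>/models."""
    models = Path(install_dir) / "models"
    if not models.is_dir():
        return []
    return sorted(
        p for p in models.rglob("*")
        if p.is_file() and p.suffix.lower() in CHECKPOINT_SUFFIXES
    )


def ui_is_listening(host="127.0.0.1", port=7860, timeout=1.0):
    """True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Folder created by the git clone step; adjust to your own path.
install_dir = Path("stable-diffusion-webui")
print("Checkpoints found:", [p.name for p in find_checkpoints(install_dir)])
print("Web UI reachable:", ui_is_listening())
```

A reachable port only tells you that some program is listening there, not that it is the web UI, so treat it as a hint rather than proof.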

UPDATE July ’23: You can now find great models on civitai.com. I recommend you check these out.

Use Stable Diffusion

To generate an image from text, enter a description in the “prompt” field at the top and click “generate”. Depending on your hardware, this might take a while. There are many settings to play with. I explain them below as best I can, with no guarantee of accuracy.
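The same prompt-to-image step can also be scripted instead of clicked. The sketch below rests on several assumptions that are not part of this guide: that the web UI was started with its --api option, that it listens on http://127.0.0.1:7860, and that the /sdapi/v1/txt2img endpoint accepts a JSON body with “prompt” and “steps” and returns base64-encoded images, which is how I understand the API to work. The function names and the default of 20 steps are mine.

```python
import base64
import json
import urllib.request

API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"  # assumed default address


def build_payload(prompt, steps=20):
    """Request body with the prompt and the number of sampling steps."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {"prompt": prompt, "steps": steps}


def generate(prompt, out_path="output.png", url=API_URL):
    """Send the prompt to a running web UI and save the first returned image."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        images = json.load(response)["images"]  # assumed: list of base64 strings
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(images[0]))
    return out_path
```

With the web UI running, calling generate("a lighthouse at sunset") should write output.png into the current folder. Nothing is sent until you call generate yourself.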

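Sampling Steps: The number of denoising steps. More steps generally give a more refined image but take longer. Everything above 50 seems to change only very minor things.
Sampling Method: The sampler (algorithm) used to turn noise into the image. Different samplers can give different results for the same prompt.
Width / Height: The width and height of the final image in pixels. Everything above 512×512 seems to generate weird images for me, probably because the 1.x models were trained on 512×512 images.
Restore faces: As the name suggests, tries to repair faces in the generated image.
Tiling: Produces an image that can be tiled seamlessly.
Highres. fix: Generates the image at a lower resolution first, then upscales it and refines the result.
Batch Count: How many batches of images to create, one after another.
Batch Size: How many images are generated at once in each batch. The total number of images is batch count × batch size.
CFG scale: How strongly the image should conform to the text input. Lower values can create more creative images.
Seed: The number that initializes the random noise the image starts from. The same seed with the same prompt and settings should give the same image. “-1” picks a random seed each time.
Script: Some scripts for additional functionality. One of them can import a file with multiple prompts.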
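Three of these settings are easier to grasp with numbers. The toy example below does not call Stable Diffusion at all. It uses Python's own random generator as a stand-in to show why a fixed seed is reproducible, how batch count and batch size multiply, and the formula commonly given for classifier-free guidance (the “CFG” in CFG scale). All function names are invented for this illustration, and the guidance formula is the textbook form, not something checked against the web UI's code.

```python
import random


def noise_from_seed(seed, n=4):
    """Stand-in for the starting noise: n pseudo-random numbers derived from the seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def total_images(batch_count, batch_size):
    """Images produced by batch_count batches of batch_size images each."""
    return batch_count * batch_size


def apply_cfg(uncond, cond, cfg_scale):
    """Move from the prediction without a prompt towards the prediction with it."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Same seed gives the same noise, so the same image can be reproduced.
print(noise_from_seed(1234) == noise_from_seed(1234))
# A different seed gives different noise.
print(noise_from_seed(1234) == noise_from_seed(1235))
# 3 batches of 4 images each
print(total_images(batch_count=3, batch_size=4))  # 12
# A scale of 1.0 returns the prompt-conditioned prediction unchanged.
print(apply_cfg([0.0, 1.0], [1.0, 3.0], cfg_scale=1.0))  # [1.0, 3.0]
```

In this picture, a higher CFG scale pushes the result further in the direction the prompt suggests, which matches the observation that lower values leave more room for “creative” output.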