Deploy Stable Diffusion for AI Image Generation

Harry Wang
7 min read · Aug 23, 2022


  • Updated on 8/10/2023: added a ComfyUI Mac M1 setup note
  • Updated on 3/4/2023: added an Ubuntu WebUI setup note and a link to WebUI Colab
  • Updated on 2/4/2023: added InvokeAI instructions
  • Updated on 1/30/2023: switched to Stable Diffusion WebUI given that DiffusionBee has issues loading custom models; check out my setup notes below, which may save you lots of time and trouble!
  • Updated on 1/16/2023: started using the awesome offline Stable Diffusion app DiffusionBee
  • Updated on 12/6/2022: added M1 deployment notes

Stable Diffusion WebUI is my current go-to UI; it's worth going over its feature showcase page.

Setup ComfyUI

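A minimal sketch of the basic setup, assuming the standard ComfyUI repo (on Apple silicon, ComfyUI's README also recommends installing the PyTorch nightly build first):

```bash
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
pip install -r requirements.txt
# after placing checkpoints (next step), launch the server;
# it listens on 127.0.0.1:8188 by default
python main.py
```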
Put your SD checkpoints (the huge ckpt/safetensors files) in: models/checkpoints

Then, visit http://127.0.0.1:8188

Setup Stable Diffusion WebUI

Ubuntu Server WebUI Setup

Here are my notes for setting up A1111 on my server:

Install extensions (I use Batchlinks Downloader and the tunnels plugin, both covered below):

Change startup script in webui-user.sh and add:
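A plausible sketch using standard A1111 flags; the --cloudflared flag comes from the sd-webui-tunnels extension and is an assumption here, inferred from the Cloudflare login link in the next step:

```bash
# webui-user.sh (illustrative flags, not the exact originals)
export COMMANDLINE_ARGS="--xformers --cloudflared --gradio-auth username:password"
```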

Start A1111 with ./webui.sh and find the Cloudflare link to log in.

Use Batchlinks Downloader to download models:
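In the extension's tab, you paste a hashtag category followed by the model URLs. The syntax below is an assumption (check the extension's README), and the URL is illustrative:

```
#checkpoint
https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors
```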

Mac WebUI Setup

I ran into many issues trying to set it up on my MacBook Pro M1 and finally made it work (the Ubuntu setup is actually much easier; see below).

The most important lesson learned: the Python version matters!

I figured this out from the inline comment here after running into many issues with Python 3.8.0 and 3.9.7; it would be helpful if the author highlighted this in the README file.

I use pyenv to manage my Python versions and use the following commands to install Python 3.10.6 first.
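A sketch of those steps; the pyenv commands are standard, and the WebUI steps assume the usual A1111 setup for Apple silicon:

```bash
# install and select Python 3.10.6
pyenv install 3.10.6
pyenv global 3.10.6   # or `pyenv local 3.10.6` inside the repo folder

# then the WebUI setup itself
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
./webui.sh   # creates a venv on first run and serves at http://127.0.0.1:7860
```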

It should be as simple as the few steps above if the Python version is correct.

“Stable Diffusion 2.0 and 2.1 require both a model and a configuration file, and image width & height will need to be set to 768 or higher when generating images”

To use Stable Diffusion v2.0, follow the instructions to download the checkpoint and yaml files.
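For example (a sketch: the checkpoint URL is from the stabilityai/stable-diffusion-2 HuggingFace repo, the config from the Stability-AI/stablediffusion GitHub repo, and the yaml is renamed to match the checkpoint):

```bash
cd stable-diffusion-webui/models/Stable-diffusion
wget https://huggingface.co/stabilityai/stable-diffusion-2/resolve/main/768-v-ema.ckpt
wget -O 768-v-ema.yaml \
  https://raw.githubusercontent.com/Stability-AI/stablediffusion/main/configs/stable-diffusion/v2-inference-v.yaml
```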

My /stable-diffusion-webui/models/Stable-diffusion/ folder looks like the following:
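An illustrative layout, assuming the v2.0 files above plus a v1.5 checkpoint:

```
stable-diffusion-webui/models/Stable-diffusion/
├── 768-v-ema.ckpt
├── 768-v-ema.yaml
└── v1-5-pruned-emaonly.safetensors
```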

Note: for v2.0, you may need to run ./webui.sh --no-half or restart to make it work.

For the 768-v-ema.ckpt SD v2.0 model, you have to use 768x768 or higher, e.g., 768x1024, to generate images; otherwise you get garbage images like the ones shown below:

Ubuntu WebUI Setup

Tested on Ubuntu 20.04.5 LTS, it’s as simple as the following two lines:
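A minimal sketch of those two lines, assuming the standard A1111 repo:

```bash
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui && ./webui.sh   # first run installs everything into a venv
```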

Install xformers by editing webui-user.sh (see the discussions), then start WebUI using ./webui.sh and xformers will be installed (for my new 4090 this did not work until I added --xformers):
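A sketch of the edit, assuming the installation argument was --reinstall-xformers (both flags are standard A1111 launch options):

```bash
# webui-user.sh: trigger the one-time xformers install on the next launch
export COMMANDLINE_ARGS="--reinstall-xformers --xformers"
```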

Next, change webui-user.sh to remove the installation argument above (you only need to install it once) and enable xformers:
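After the install, the sketch above reduces to just the runtime flag:

```bash
# webui-user.sh: keep only the runtime flag after the one-time install
export COMMANDLINE_ARGS="--xformers"
```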

To enable a public Gradio link with authentication, change webui-user.sh with arguments:
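For example (username:password is a placeholder):

```bash
# webui-user.sh: --share creates a public gradio.live link,
# --gradio-auth protects it with basic credentials
export COMMANDLINE_ARGS="--xformers --share --gradio-auth username:password"
```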

To enable extension installation with --share, change webui-user.sh with arguments (otherwise this error):
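WebUI refuses to install extensions under --share unless you explicitly opt in:

```bash
# webui-user.sh: allow extension installs on a shared/public instance
export COMMANDLINE_ARGS="--xformers --share --gradio-auth username:password --enable-insecure-extension-access"
```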

If you have a server with a fixed IP address, say x.x.x.x, then you can use --listen to run WebUI at x.x.x.x:7860 (note that --share instead creates a temporary gradio.live URL).

You can also install the webui tunnels plugin to get a Cloudflare URL by running:
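With the sd-webui-tunnels extension installed, the flag below (per that extension's README) prints a trycloudflare.com URL on startup:

```bash
./webui.sh --cloudflared
```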

To download a model:
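For example (the URL is illustrative; any direct checkpoint link works):

```bash
wget -P models/Stable-diffusion \
  https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors
```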

Install Dreambooth Extension

Notes for installing https://github.com/d8ahazard/sd_dreambooth_extension:

NOTE: “Once installed, you must restart the Stable-Diffusion WebUI completely. Reloading the UI will not install the necessary requirements.”

Install https://github.com/kohya-ss/sd-webui-additional-networks to use LoRA weights.
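Both extensions can be installed from the Extensions tab ("Install from URL") or cloned directly; a sketch assuming the WebUI checkout lives in stable-diffusion-webui/:

```bash
cd stable-diffusion-webui/extensions
git clone https://github.com/d8ahazard/sd_dreambooth_extension
git clone https://github.com/kohya-ss/sd-webui-additional-networks
# then restart WebUI completely so Dreambooth's requirements are installed
```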

Check out my tutorial on how to use this extension.

WebUI Colab

You can use https://github.com/camenduru/stable-diffusion-webui-colab if you just want to use WebUI via Colab. Run the Colab notebook for the model you want to use and you will get a URL to access WebUI; the speed is OK.

Configure WebUI Server Behind Firewall

Then start WebUI using the following command:
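A plausible sketch; the reverse proxy or tunnel that maps webui.takin.ai to this port is assumed and not shown:

```bash
# bind to all interfaces so the proxy/tunnel can reach WebUI
./webui.sh --listen --port 7860
```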

Then webui.takin.ai can be used as the backend.

Setup InvokeAI

These are my notes on installing InvokeAI on a MacBook Pro M1, tested with Python 3.10.6.

  • pip install . installs the packages declared in pyproject.toml.
  • invokeai-configure prompts you to download the SD models.
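A minimal sketch of the full sequence, assuming a source checkout of InvokeAI:

```bash
git clone https://github.com/invoke-ai/InvokeAI
cd InvokeAI
pip install .        # installs the packages declared in pyproject.toml
invokeai-configure   # interactive: downloads the SD models
invokeai --web       # serves the web UI, on port 9090 by default
```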

Visit http://localhost:9090 to use the UI.

InvokeAI seems to take more resources than the AUTOMATIC1111 Stable Diffusion WebUI covered above.

M1 Stable Diffusion Deployment

I just followed the instructions here (Apple's ml-stable-diffusion repo).

Tested on my 2020 MacBook Pro M1 with 16GB of RAM and Torch 1.13.0.

Run the following to generate the models in the coreml-sd folder:
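A sketch of the conversion step, with flags per the ml-stable-diffusion README (the default model is CompVis/stable-diffusion-v1-4):

```bash
python -m python_coreml_stable_diffusion.torch2coreml \
  --convert-unet --convert-text-encoder --convert-vae-decoder \
  --convert-safety-checker -o coreml-sd
```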

Generate an image with Python, writing the output to the image-outputs folder:
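For example, using the repo's sample prompt:

```bash
# slow: the Core ML models are loaded on every invocation
python -m python_coreml_stable_diffusion.pipeline \
  --prompt "a photo of an astronaut riding a horse on mars" \
  -i coreml-sd -o image-outputs \
  --compute-unit ALL --seed 93
```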

The method above loads the model every time, which is quite slow (2–3 minutes). Use Swift to speed up model loading by setting up the Resources:
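Per the repo README, re-running the converter with --bundle-resources-for-swift-cli compiles the models to .mlmodelc and collects them, plus the tokenizer files, into a Resources folder:

```bash
python -m python_coreml_stable_diffusion.torch2coreml \
  --convert-unet --convert-text-encoder --convert-vae-decoder \
  --convert-safety-checker --bundle-resources-for-swift-cli \
  -o coreml-sd
```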

Then, generate an image with Swift, writing the output to the image-outputs folder:
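A sketch, run from a checkout of apple/ml-stable-diffusion; loading the compiled models is much faster than the Python pipeline:

```bash
swift run StableDiffusionSample \
  "a photo of an astronaut riding a horse on mars" \
  --resource-path coreml-sd/Resources --seed 93 \
  --output-path image-outputs
```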

Ubuntu Deployment

In the past few months, I tried almost all of the popular text-to-image AI generation models/products, such as DALL-E 2, Midjourney, Disco Diffusion, and Stable Diffusion. The Stable Diffusion checkpoint was released just a few days ago. I deployed it on my old GPU server and recorded my notes here for people who may also want to try it. Machine creativity is quite an interesting research area for IS scholars, and I jotted down some potential research topics at the end of this post as well.

I first spent a few hours trying to set up Stable Diffusion on a Mac M1 and failed; I could not install the packages properly (version not found, dependency issues, etc.). I found some successful attempts here but have not had time to try them yet.

I ended up setting up Stable Diffusion on my old GPU server running Ubuntu and here are my notes.

  • Use the optimized fork (git clone https://github.com/basujindal/stable-diffusion), which uses less VRAM than the original at the cost of inference speed. I did this because I ran into CUDA out-of-memory errors with the original repo.
  • Get the checkpoint file from the HuggingFace repo and download/upload it to the server. I use links to browse websites and download files from the terminal, as sketched below:
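For example (the HuggingFace repo hosting the original v1.4 checkpoint):

```bash
# text-mode browser: log in to HuggingFace and download the
# checkpoint from a headless server
links https://huggingface.co/CompVis/stable-diffusion-v-1-4-original
```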

Rename the checkpoint file to model.ckpt and put it in the following folder (create it if it does not exist):
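A sketch, assuming the downloaded file is named sd-v1-4.ckpt and you are in the repo root (this is the folder path the repo expects):

```bash
mkdir -p models/ldm/stable-diffusion-v1
mv sd-v1-4.ckpt models/ldm/stable-diffusion-v1/model.ckpt
```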

A side note on the estimated training cost, based on the reported GPU usage and the related AWS prices I found:

  • Hardware Type: A100 PCIe 40GB
  • Hours used: 150,000 (about 17.1 GPU-years)
  • Cloud Provider: AWS

Price of a p4d.24xlarge instance with 8 A100 GPUs (40GB VRAM each):

  • 32.77 USD/hour (on-demand)
  • 19.22 USD/hour (1-year reserved)
  • 11.57 USD/hour (3-year reserved)

That is 150,000 GPU-hours / 8 GPUs per instance = 18,750 instance-hours, so the training would cost roughly between 217,000 USD (3-year reserved rate) and 614,000 USD (on-demand).

Now Stable Diffusion is ready to go; let's see what the AI will create from the following prompt:

A car in 2050 designed by Antoni Gaudi
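A sketch of the generation command, using the optimized fork's entry point; the flag values are illustrative, with 4 iterations x 5 samples matching the 20 images mentioned below:

```bash
python optimizedSD/optimized_txt2img.py \
  --prompt "A car in 2050 designed by Antoni Gaudi" \
  --H 512 --W 512 --n_iter 4 --n_samples 5 --ddim_steps 50
```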

This whole area is relatively new and there are many potential interesting research topics, e.g.,

  • how humans work with AI creativity tools like Stable Diffusion (workflow shift? efficiency improvement? creativity boosting?…)
  • how to use AI to pick top n results generated from the same prompt, e.g., preference on cleaner background, less color, simpler composition, aesthetics scoring, etc.
  • how to do systematic/serendipitous prompt engineering to improve art creation novelty, efficiency, and quality by leveraging ideas from areas such as AutoML, recommender systems, and reinforcement learning.

Anyway, out of the 20 generated images from the prompt above, the following are my top 3:

PS. The featured image for this post is generated using Stable Diffusion, whose full parameters with model link can be found at Takin.AI.

Originally published at https://harrywang.me on August 23, 2022.
