ComfyUI is a node-based graphical interface for Stable Diffusion and other AI image models that gives you granular control over every step of the image generation process. Unlike AUTOMATIC1111’s linear interface, ComfyUI represents workflows as connected nodes: each node handles one part of the pipeline (load model, encode prompt, sample, decode), and you connect them with wires. This makes it highly composable for advanced workflows while remaining surprisingly learnable.

Why ComfyUI Over AUTOMATIC1111?

Both tools run locally on your hardware, but they appeal to different users:

Feature	ComfyUI	AUTOMATIC1111
Interface	Node graph	Form-based UI
Learning curve	Higher	Lower
Workflow flexibility	Arbitrary node graphs	Plugin-dependent
Performance	Often faster (re-runs only the nodes that changed)	Often slightly slower
Workflow sharing	JSON files	Screenshots/instructions
Advanced features	Built-in via nodes	Extension ecosystem

Choose AUTOMATIC1111 if you want a quick start with a traditional UI. Choose ComfyUI if you want to build complex multi-stage workflows, need exact control over the pipeline, or want to automate image generation programmatically.

Hardware Requirements

Same as AUTOMATIC1111: an NVIDIA GPU with 6+ GB of VRAM can run SDXL, though 8 GB or more leaves more headroom. ComfyUI also runs on AMD (ROCm on Linux) and Apple Silicon (Metal).
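
Once PyTorch is installed, a quick check shows which of these backends it can actually see. This is a minimal sketch and not part of ComfyUI: `detect_backend` is a hypothetical helper name, and it simply reports `cpu` if PyTorch is missing. ROCm builds of PyTorch expose AMD cards through the CUDA API, so they also show up as `cuda`.

```python
def detect_backend() -> str:
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch can see."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed yet, so there is nothing to probe
    if torch.cuda.is_available():  # NVIDIA, and AMD through ROCm builds
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # Apple Silicon (Metal)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(detect_backend())
```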

Installing ComfyUI

Windows

# Clone the repository
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI

# Install PyTorch with CUDA support (for NVIDIA GPU)
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121

# Install ComfyUI requirements
pip install -r requirements.txt

# Launch
python main.py

Access at http://127.0.0.1:8188 in your browser.
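
If the page does not load, it can help to check from a script whether the server is answering at all. The sketch below assumes the default address shown above and queries the `/system_stats` route that the ComfyUI server provides; `server_stats` is a hypothetical helper name, and the exact fields in the response may differ between versions.

```python
import json
import urllib.request


def server_stats(base_url="http://127.0.0.1:8188", timeout=3.0):
    """Return the parsed /system_stats response, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/system_stats", timeout=timeout) as resp:
            return json.loads(resp.read())
    except (OSError, ValueError):
        # OSError covers refused connections, HTTP errors, and timeouts
        # (urllib's URLError is a subclass); ValueError covers invalid JSON.
        return None


stats = server_stats()
print("ComfyUI is up" if stats is not None else "ComfyUI is not reachable")
```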

ComfyUI Manager (Strongly Recommended)

Install the ComfyUI Manager extension immediately — it provides an in-browser package manager for custom nodes:

cd ComfyUI/custom_nodes
git clone https://github.com/ltdrdata/ComfyUI-Manager

Restart ComfyUI. A Manager button appears in the interface — use it to install custom nodes without touching the command line.

Understanding the Node Interface

ComfyUI’s canvas starts with a default workflow containing these nodes:

Load Checkpoint: loads the model checkpoint (a .safetensors file, for example Stable Diffusion 1.5 or SDXL)
CLIP Text Encode (Positive): encodes your positive prompt
CLIP Text Encode (Negative): encodes your negative prompt
Empty Latent Image: defines the resolution
KSampler: runs the denoising/sampling loop
VAE Decode: decodes the latent image into pixels
Save Image: saves the output

Each node has colored input/output ports. Connect them by dragging from output to input. Port colors indicate data types — you can only connect compatible types.
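
The same graph can be written down as plain data, which makes the wiring explicit. The sketch below mirrors the default workflow in the API format that ComfyUI can export; the node ids, checkpoint file name, and prompts are placeholders, and `dangling_links` is a hypothetical helper added only to show how links are resolved. Each wire is stored on the receiving node as `[source_node_id, output_index]`.

```python
# Default text-to-image graph in ComfyUI's API format (a sketch; ids and
# file names are placeholders). A link is [source_node_id, output_index].
workflow = {
    "4": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "model.safetensors"}},  # outputs: MODEL, CLIP, VAE
    "6": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a lighthouse at dusk", "clip": ["4", 1]}},
    "7": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "blurry, low quality", "clip": ["4", 1]}},
    "5": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "3": {"class_type": "KSampler",
          "inputs": {"model": ["4", 0], "positive": ["6", 0], "negative": ["7", 0],
                     "latent_image": ["5", 0], "seed": 42, "steps": 20, "cfg": 7.5,
                     "sampler_name": "dpmpp_2m", "scheduler": "karras", "denoise": 1.0}},
    "8": {"class_type": "VAEDecode",
          "inputs": {"samples": ["3", 0], "vae": ["4", 2]}},
    "9": {"class_type": "SaveImage",
          "inputs": {"images": ["8", 0], "filename_prefix": "ComfyUI"}},
}


def dangling_links(graph):
    """Return (node_id, input_name) pairs whose link points at a missing node."""
    bad = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            is_link = isinstance(value, list) and len(value) == 2
            if is_link and value[0] not in graph:
                bad.append((node_id, name))
    return bad


print(dangling_links(workflow))  # an empty list means every wire has a source
```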

Running Your First Generation

In the Load Checkpoint node, click the dropdown and select a model (place .safetensors files in ComfyUI/models/checkpoints/)
Click on the positive CLIP Text Encode node and type your prompt
Click on the negative CLIP Text Encode node and add negative terms
In Empty Latent Image, set width and height (512×512 for SD 1.5, 1024×1024 for SDXL)
In KSampler, set:
- steps: 20–25
- cfg: 7.5
- sampler_name: dpmpp_2m
- scheduler: karras
- seed: any non-negative integer (set control_after_generate to randomize for a new seed on each run)
Click Queue Prompt (top right)

The image appears in the Save Image node and is saved to ComfyUI/output/.
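
For reference, the settings from the steps above can be collected in one place. This is only a sketch: `first_run_settings` and `NATIVE_RESOLUTION` are hypothetical names, and the values are the ones suggested in this guide rather than fixed requirements. When settings are supplied from a script, the seed is a concrete non-negative integer, so the helper draws one at random if none is given.

```python
import random

# Resolutions from the steps above, keyed by model family.
NATIVE_RESOLUTION = {"sd15": (512, 512), "sdxl": (1024, 1024)}


def first_run_settings(family, seed=None):
    """Build latent-size and KSampler settings for a first generation."""
    width, height = NATIVE_RESOLUTION[family]
    if seed is None:
        seed = random.randrange(2**32)  # pick a concrete random seed
    return {
        "latent": {"width": width, "height": height, "batch_size": 1},
        "sampler": {"steps": 20, "cfg": 7.5, "sampler_name": "dpmpp_2m",
                    "scheduler": "karras", "seed": seed},
    }


print(first_run_settings("sdxl", seed=42))
```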

Building an SDXL Workflow with Refiner

SDXL works best with a two-stage pipeline: base model + refiner. In ComfyUI, this is built as a connected workflow:

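Add a second Load Checkpoint node, load the SDXL Refiner model, and encode your prompts again with its CLIP output
Use KSampler (Advanced) nodes for both passes, and connect the base sampler’s output latent to the Refiner sampler’s latent input
Set the base sampler to steps: 25, end_at_step: 20 with return_with_leftover_noise enabled; set the Refiner sampler to start_at_step: 20, end_at_step: 25 (the last 5 steps) with add_noise disabled
Add a VAE Decode and Save Image after the Refiner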

This two-pass approach typically produces sharper, more detailed results than the base model alone.
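
The step split is easy to get wrong by one, so it can help to compute it. The sketch below assumes both passes use the KSampler (Advanced) node, which is where `start_at_step`, `end_at_step`, `add_noise`, and `return_with_leftover_noise` live; `split_steps` is a hypothetical helper name, and the returned dicts contain only the step-related inputs.

```python
def split_steps(total_steps, refiner_steps):
    """Return KSampler (Advanced) step inputs for the base and refiner passes."""
    if not 0 < refiner_steps < total_steps:
        raise ValueError("refiner_steps must be between 1 and total_steps - 1")
    handoff = total_steps - refiner_steps
    # The base pass adds the initial noise and stops early, keeping leftover noise.
    base = {"steps": total_steps, "start_at_step": 0, "end_at_step": handoff,
            "add_noise": "enable", "return_with_leftover_noise": "enable"}
    # The refiner continues from the handoff step without adding new noise.
    refiner = {"steps": total_steps, "start_at_step": handoff,
               "end_at_step": total_steps,
               "add_noise": "disable", "return_with_leftover_noise": "disable"}
    return base, refiner


base, refiner = split_steps(25, 5)
print(base["end_at_step"], refiner["start_at_step"])  # 20 20
```

Both passes share the same total step count so that they agree on the noise schedule; only the start and end positions differ.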

ControlNet in ComfyUI

ControlNet nodes give you structural control over the generation. Common use:

Install a ControlNet preprocessor node pack via ComfyUI Manager (search “ControlNet”)
In your workflow, add:
- Load ControlNet Model node (download ControlNet models to ComfyUI/models/controlnet/)
- Apply ControlNet node: sits between the CLIP Text Encode output and the KSampler
- Load Image node for your reference image
- A preprocessor node (Canny Edge, OpenPose Estimator, Depth Estimator)

Chain: Load Image → Preprocessor → Apply ControlNet → KSampler

Available preprocessors let you extract edges (Canny), poses (OpenPose), depth maps, and normal maps from reference images.
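
In an API-format workflow, the chain above amounts to adding four nodes and rerouting the sampler's positive conditioning through the ControlNet. The sketch below is an illustration under assumptions: the `class_type` and input names (`LoadImage`, `Canny`, `ControlNetLoader`, `ControlNetApply`) reflect my understanding of the built-in nodes and should be checked against a workflow exported from your own install, the file names are placeholders, and `add_canny_controlnet` is a hypothetical helper that expects numeric node ids.

```python
import copy


def add_canny_controlnet(graph, sampler_id, image_name, controlnet_name, strength=1.0):
    """Return a copy of an API-format graph with a Canny ControlNet spliced in
    between the positive prompt and the sampler."""
    g = copy.deepcopy(graph)  # leave the caller's graph untouched
    positive_link = g[sampler_id]["inputs"]["positive"]  # e.g. ["6", 0]

    # Allocate four fresh ids after the highest existing numeric id.
    next_id = max(int(k) for k in g) + 1
    image_id, canny_id, model_id, apply_id = (str(next_id + i) for i in range(4))

    g[image_id] = {"class_type": "LoadImage", "inputs": {"image": image_name}}
    g[canny_id] = {"class_type": "Canny",
                   "inputs": {"image": [image_id, 0],
                              "low_threshold": 0.4, "high_threshold": 0.8}}
    g[model_id] = {"class_type": "ControlNetLoader",
                   "inputs": {"control_net_name": controlnet_name}}
    g[apply_id] = {"class_type": "ControlNetApply",
                   "inputs": {"conditioning": positive_link,
                              "control_net": [model_id, 0],
                              "image": [canny_id, 0],
                              "strength": strength}}
    # The sampler now reads its positive conditioning from the ControlNet node.
    g[sampler_id]["inputs"]["positive"] = [apply_id, 0]
    return g


graph = {"6": {"class_type": "CLIPTextEncode", "inputs": {"text": "a castle"}},
         "3": {"class_type": "KSampler", "inputs": {"positive": ["6", 0]}}}
patched = add_canny_controlnet(graph, "3", "reference.png", "control_canny.safetensors")
print(patched["3"]["inputs"]["positive"])
```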

IP-Adapter: Style and Character Transfer

IP-Adapter is a powerful tool for transferring a face, style, or composition from a reference image:

Install ComfyUI_IPAdapter_plus via Manager
Load an IP-Adapter model file (available from the IPAdapter repository on Hugging Face)
Connect: Load Image → IPAdapter node, and route the checkpoint’s MODEL output through the IPAdapter node into the KSampler

Use cases:

Keep a character’s face consistent across multiple images
Apply an artist’s style from a reference image
Compose images using object placement from a reference

Sharing Workflows

ComfyUI workflows are JSON files. Save one via Save in the top menu, and load one by dragging the JSON file onto the canvas. The ComfyUI community shares workflows on sites like OpenArt.ai and Civitai.

Loading a shared workflow may require installing the custom nodes it uses; ComfyUI Manager can find and install the missing ones for you.
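
Before loading someone else's workflow, it can be useful to see which node types it references. The sketch below assumes the two JSON layouts I am aware of: the regular saved workflow, with a top-level `nodes` list whose entries carry a `type`, and the API export, a dict of node id to `class_type`. `node_types` is a hypothetical helper, and `SomeCustomNode` is a made-up name for illustration.

```python
def node_types(workflow):
    """Return the set of node type names used by a workflow.

    Handles both layouts: the UI export (a top-level "nodes" list whose items
    carry a "type") and the API export (a dict of id -> {"class_type": ...}).
    """
    if isinstance(workflow.get("nodes"), list):
        return {n["type"] for n in workflow["nodes"] if "type" in n}
    return {n["class_type"] for n in workflow.values()
            if isinstance(n, dict) and "class_type" in n}


ui_export = {"nodes": [{"id": 1, "type": "KSampler"},
                       {"id": 2, "type": "SomeCustomNode"}]}
api_export = {"3": {"class_type": "KSampler", "inputs": {}}}
print(sorted(node_types(ui_export)))   # ['KSampler', 'SomeCustomNode']
print(sorted(node_types(api_export)))  # ['KSampler']
```

For a real file, read it with `json.load` first and pass the resulting dict to `node_types`.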

Automating ComfyUI via API

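ComfyUI exposes an HTTP API for programmatic image generation. The script below expects a workflow exported in API format (a dict keyed by node id), not the regular saved workflow JSON: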

import json
import urllib.request

def queue_prompt(workflow_json):
    # POST the workflow to the /prompt endpoint; the server queues it and
    # replies with JSON describing the queued job.
    data = json.dumps({"prompt": workflow_json}).encode('utf-8')
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as response:
        return json.loads(response.read())

# Load your workflow JSON (the API-format export, keyed by node id)
with open('my_workflow.json', 'r') as f:
    workflow = json.load(f)

# Modify the seed in the KSampler node (id '3' in the default workflow;
# check the id in your own export)
workflow['3']['inputs']['seed'] = 42

result = queue_prompt(workflow)
print(result)

This enables batch generation, dynamic prompt variation, and integration with other systems.
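
As a small illustration of batch generation, the sketch below produces one workflow copy per seed without touching the template. `seed_variants` is a hypothetical helper, and the one-node template stands in for a full API-format workflow.

```python
import copy


def seed_variants(workflow, sampler_id, seeds):
    """Yield one independent copy of the workflow per seed."""
    for seed in seeds:
        variant = copy.deepcopy(workflow)  # never mutate the template
        variant[sampler_id]["inputs"]["seed"] = seed
        yield variant


template = {"3": {"class_type": "KSampler", "inputs": {"seed": 0, "steps": 20}}}
batch = list(seed_variants(template, "3", [101, 102, 103]))
print([w["3"]["inputs"]["seed"] for w in batch])  # [101, 102, 103]
```

Each variant can be passed to `queue_prompt` from the script above. As far as I know, the response includes a `prompt_id`, and the server's `/history/<prompt_id>` route reports the outputs once the job has finished.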

ComfyUI rewards investment — the more you understand its node system, the more powerful your workflows become. The active community shares complex pipelines (face detailing, hires fix, regional prompting) as drop-in workflow files, making advanced techniques accessible even without building them from scratch.