Stable Diffusion


Discuss matters related to our favourite AI Art generation technology

MEGATHREAD (lemmy.dbzer0.com)
submitted 2 years ago by db0 to c/stable_diffusion
 
 

This is a copy of the /r/StableDiffusion wiki to help people who need access to that information.


Howdy and welcome to r/stablediffusion! I'm u/Sandcheeze and I have collected these resources and links to help enjoy Stable Diffusion whether you are here for the first time or looking to add more customization to your image generations.

If you'd like to show support, feel free to send us kind words or check out our Discord. Donations are appreciated, but not necessary as you being a great part of the community is all we ask for.

Note: The community resources provided here are not endorsed, vetted, or provided by Stability AI.

# Stable Diffusion

Local Installation

Active Community Repos/Forks to install on your PC and keep it local.

Online Websites

Websites with usable Stable Diffusion right in your browser. No need to install anything.

Mobile Apps

Stable Diffusion on your mobile device.

Tutorials

Learn how to improve your skills with Stable Diffusion, whether you're a beginner or an expert.

DreamBooth

How to train a custom model, plus resources for doing so.

Models

Specially trained towards certain subjects and/or styles.

Embeddings

Tokens trained on specific subjects and/or styles.

Bots

Bots you can self-host, or bots you can use directly on various websites and services such as Discord, Reddit, etc.

3rd Party Plugins

SD plugins for programs such as Discord, Photoshop, Krita, Blender, Gimp, etc.

Other useful tools

# Community

Games

  • PictionAIry: (Video | 2-6 Players) - The image-guessing game where AI does the drawing!

Podcasts

Databases or Lists

Still updating this with more links as I collect them all here.

FAQ

How do I use Stable Diffusion?

  • Check out our guides section above!

Will it run on my machine?

  • Stable Diffusion needs a GPU with at least 4 GB of VRAM to run locally, and much beefier graphics cards (10-, 20-, or 30-series Nvidia cards) are necessary to generate high-resolution or high-step images. Alternatively, anyone can run it online through DreamStudio or by hosting it on their own GPU compute cloud server. (A quick way to check your own GPU is sketched below this list.)
  • Only Nvidia cards are officially supported.
  • Unofficial AMD support is available here.
  • Unofficial Apple M1 chip support is available here.
  • Intel-based Macs currently do not work with Stable Diffusion.
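
For readers who want to check the first point before installing anything, here is a minimal sketch that reads the local GPU's VRAM with PyTorch. The thresholds are rough rules of thumb, not official requirements.

```python
# Minimal sketch: check whether a local Nvidia GPU meets the ~4 GB VRAM guideline.
# Requires PyTorch (pip install torch); the thresholds below are rough rules of thumb.
import torch

if not torch.cuda.is_available():
    print("No CUDA-capable GPU detected; consider an online service such as DreamStudio.")
else:
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    print(f"GPU: {props.name}, VRAM: {vram_gb:.1f} GB")
    if vram_gb < 4:
        print("Below the ~4 GB minimum; local generation will likely fail or be very slow.")
    elif vram_gb < 8:
        print("Enough for basic 512x512 generation; higher resolutions may need optimizations.")
    else:
        print("Comfortable headroom for higher resolutions and step counts.")
```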

How do I get a website or resource added here?

If you have a suggestion for a website or a project to add to our list, or if you would like to contribute to the wiki, please don't hesitate to reach out to us via modmail or message me.


Abstract

Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of content and style, a long-standing theme in style-driven research. To this end, we present USO, a Unified Style-Subject Optimized customization model. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives, style-alignment training and content-style disentanglement training. Third, we incorporate a style reward-learning paradigm denoted as SRL to further enhance the model's performance. Finally, we release USO-Bench, the first benchmark that jointly evaluates style similarity and subject fidelity across multiple metrics. Extensive experiments demonstrate that USO achieves state-of-the-art performance among open-source models along both dimensions of subject consistency and style similarity. Code and model: this https URL

Technical Report: https://arxiv.org/abs/2508.18966

Code: https://github.com/bytedance/USO

USO in ComfyUI tutorial: https://docs.comfy.org/tutorials/flux/flux-1-uso

Project Page: https://bytedance.github.io/USO/
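
As a rough illustration of the triplet data described in the abstract (a content image, a style image, and the corresponding stylized result), a training sample could be organized like the sketch below. This is not USO's actual training code; the class names, file layout, and transforms are assumptions.

```python
# Illustrative sketch of a content/style/stylized triplet dataset, as described in the
# USO abstract. Not the authors' code; paths and transforms are assumptions.
from dataclasses import dataclass
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


@dataclass
class Triplet:
    content_path: Path   # subject/content reference
    style_path: Path     # style reference
    stylized_path: Path  # content rendered in the reference style


class TripletDataset(Dataset):
    def __init__(self, triplets, size=512):
        self.triplets = triplets
        self.tf = transforms.Compose([
            transforms.Resize((size, size)),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.triplets)

    def __getitem__(self, idx):
        t = self.triplets[idx]
        return {
            "content": self.tf(Image.open(t.content_path).convert("RGB")),
            "style": self.tf(Image.open(t.style_path).convert("RGB")),
            "target": self.tf(Image.open(t.stylized_path).convert("RGB")),
        }
```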


Abstract

Text-to-image (T2I) diffusion models excel at generating photorealistic images but often fail to render accurate spatial relationships. We identify two core issues underlying this common failure: 1) the ambiguous nature of data concerning spatial relationships in existing datasets, and 2) the inability of current text encoders to accurately interpret the spatial semantics of input descriptions. We propose CoMPaSS, a versatile framework that enhances spatial understanding in T2I models. It first addresses data ambiguity with the Spatial Constraints-Oriented Pairing (SCOP) data engine, which curates spatially-accurate training data via principled constraints. To leverage these priors, CoMPaSS also introduces the Token ENcoding ORdering (TENOR) module, which preserves crucial token ordering information lost by text encoders, thereby reinforcing the prompt's linguistic structure. Extensive experiments on four popular T2I models (UNet and MMDiT-based) show CoMPaSS sets a new state of the art on key spatial benchmarks, with substantial relative gains on VISOR (+98%), T2I-CompBench Spatial (+67%), and GenEval Position (+131%). Code is available at this https URL.

Paper: https://arxiv.org/abs/2412.13195

Code: https://github.com/blurgyy/CoMPaSS

Project Page: https://compass.blurgy.xyz/
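
To make the SCOP idea of "principled constraints" concrete, here is a toy sketch of one such check: accepting an "A to the left of B" sample only when the two objects' bounding boxes are unambiguously separated. It is an illustration only, not the paper's implementation, and the margin threshold is a made-up parameter.

```python
# Toy illustration of a spatial-constraint check in the spirit of SCOP (not the paper's code).
# Boxes are (x_min, y_min, x_max, y_max) in normalized image coordinates; the margin is an assumption.

def center(box):
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)

def is_left_of(box_a, box_b, margin=0.1, image_width=1.0):
    """Accept the pair only if A's center is clearly to the left of B's center."""
    ax, _ = center(box_a)
    bx, _ = center(box_b)
    return (bx - ax) > margin * image_width

# Example: keep a "cat to the left of a dog" sample only if the layout is unambiguous.
cat_box, dog_box = (0.05, 0.3, 0.35, 0.7), (0.6, 0.25, 0.95, 0.75)
print(is_left_of(cat_box, dog_box))  # True -> keep this training pair
```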


QwenEdit InStyle is a LoRA fine-tune for QwenEdit that significantly improves its ability to generate images based on a style reference. While the base model has style transfer capabilities, it often misses the nuances of styles and can transplant unwanted details from the input image. This LoRA addresses these limitations to provide more accurate style-based image generation.
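
If the LoRA is distributed as standard diffusers-compatible weights, loading it on top of a Qwen image-editing pipeline could look roughly like the sketch below. The pipeline class name, repository id, and weight filename are assumptions (the repository id is purely hypothetical), so follow the model card's instructions for the real setup.

```python
# Rough sketch of loading a style-reference LoRA on top of a Qwen image-editing pipeline
# with diffusers. The pipeline class, repo id, and weight filename are assumptions;
# check the LoRA's model card for the exact loading instructions.
import torch
from diffusers import QwenImageEditPipeline  # assumed class name
from PIL import Image

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
).to("cuda")

# Attach the InStyle LoRA (hypothetical repo id and filename).
pipe.load_lora_weights("some-user/qwenedit-instyle-lora", weight_name="instyle.safetensors")

style_ref = Image.open("style_reference.png").convert("RGB")
result = pipe(
    image=style_ref,
    prompt="Redraw a lighthouse at dusk in this style",
    num_inference_steps=30,
).images[0]
result.save("styled_output.png")
```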


Major Updates


Chroma1-Base: 512x512 model

Chroma1-HD: 1024x1024 model

Chroma1-Flash: A fine-tuned Chroma1-Base experimental model

Chroma1-Radiance [WIP]: Chroma1-Base pixel space model

submitted 3 weeks ago* (last edited 3 weeks ago) by Even_Adder to c/stable_diffusion
 
 

Without paywall: https://archive.is/4oEi2

Qwen Image Edit (qianwen-res.oss-cn-beijing.aliyuncs.com)
submitted 4 weeks ago* (last edited 4 weeks ago) by Even_Adder to c/stable_diffusion
 
 

Introduction

We are excited to introduce Qwen-Image-Edit, the image editing version of Qwen-Image. Built upon our 20B Qwen-Image model, Qwen-Image-Edit successfully extends Qwen-Image’s unique text rendering capabilities to image editing tasks, enabling precise text editing. Furthermore, Qwen-Image-Edit simultaneously feeds the input image into Qwen2.5-VL (for visual semantic control) and the VAE Encoder (for visual appearance control), achieving capabilities in both semantic and appearance editing. To experience the latest model, visit Qwen Chat and select the "Image Editing" feature.

Technical Report: https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen-Image/Qwen_Image.pdf

Code: https://github.com/QwenLM/Qwen-Image

Hugging Face: https://huggingface.co/Qwen/Qwen-Image-Edit

GGUFs: https://huggingface.co/QuantStack/Qwen-Image-Edit-GGUF
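
Below is a minimal usage sketch with diffusers, focused on the text-editing capability highlighted above. The pipeline class name and call arguments are assumptions based on typical diffusers conventions; defer to the linked repo and Hugging Face page for the documented usage.

```python
# Minimal sketch of prompting Qwen-Image-Edit for a text edit via diffusers.
# The pipeline class name and arguments are assumptions; see the linked repo for
# the officially documented usage.
import torch
from diffusers import QwenImageEditPipeline  # assumed class name
from PIL import Image

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
).to("cuda")

source = Image.open("storefront.png").convert("RGB")
edited = pipe(
    image=source,
    prompt='Change the sign text to read "GRAND OPENING" while keeping the font and lighting',
    num_inference_steps=50,
).images[0]
edited.save("storefront_edited.png")
```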


SD.Next Release 2025-08-15

A new release two weeks after the last one, and it's a big one with over 150 commits!

  • Several new models: Qwen-Image (plus Lightning variant) and FLUX.1-Krea-Dev
  • Several updated models: Chroma, SkyReels-V2, Wan-VACE, HunyuanDiT
  • Plus continued major UI work: a new embedded Docs/Wiki search, redesigned real-time hints, a wildcards UI selector, a built-in GPU monitor, CivitAI integration, and more!


An open-source implementation by FlyMy.AI for training LoRA (Low-Rank Adaptation) layers for Qwen/Qwen-Image models.
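
For anyone unfamiliar with what training LoRA layers involves, the sketch below attaches low-rank adapters to a toy module with the PEFT library. It is not FlyMy.AI's trainer; the rank, alpha, and target module names are illustrative assumptions that vary by model.

```python
# Generic sketch of what "training LoRA layers" means, using the PEFT library on a toy
# attention-like module. Not FlyMy.AI's trainer; rank, alpha, and target module names
# are illustrative assumptions.
import torch.nn as nn
from peft import LoraConfig, get_peft_model


class ToyBlock(nn.Module):
    """Stand-in for a transformer block whose projections we want to adapt."""
    def __init__(self, dim=64):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)

    def forward(self, x):
        return self.to_q(x) + self.to_k(x) + self.to_v(x)


config = LoraConfig(
    r=16,                 # low-rank dimension
    lora_alpha=16,        # scaling factor
    lora_dropout=0.05,
    target_modules=["to_q", "to_k", "to_v"],  # which layers get adapters
)
model = get_peft_model(ToyBlock(), config)
model.print_trainable_parameters()  # only the small LoRA matrices are trainable
```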

submitted 1 month ago* (last edited 1 month ago) by Even_Adder to c/stable_diffusion
 
 

Abstract

Scalable Vector Graphics (SVG) is an important image format widely adopted in graphic design because of its resolution independence and editability. The study of generating high-quality SVG has continuously drawn attention from both designers and researchers in the AIGC community. However, existing methods either produce unstructured outputs with huge computational cost or are limited to generating monochrome icons of over-simplified structures. To produce high-quality and complex SVG, we propose OmniSVG, a unified framework that leverages pre-trained Vision-Language Models (VLMs) for end-to-end multimodal SVG generation. By parameterizing SVG commands and coordinates into discrete tokens, OmniSVG decouples structural logic from low-level geometry for efficient training while maintaining the expressiveness of complex SVG structure. To further advance the development of SVG synthesis, we introduce MMSVG-2M, a multimodal dataset with two million richly annotated SVG assets, along with a standardized evaluation protocol for conditional SVG generation tasks. Extensive experiments show that OmniSVG outperforms existing methods and demonstrates its potential for integration into professional SVG design workflows.

Paper: https://arxiv.org/abs/2504.06263

Code: https://github.com/OmniSVG/OmniSVG/

Weights: https://huggingface.co/OmniSVG/OmniSVG

Project Page: https://omnisvg.github.io/

Demo: https://huggingface.co/spaces/OmniSVG/OmniSVG-3B
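
The abstract's central idea of parameterizing SVG commands and coordinates into discrete tokens can be pictured with a toy tokenizer like the one below. The command vocabulary and coordinate binning are assumptions for illustration, not OmniSVG's actual scheme.

```python
# Toy illustration of turning SVG path commands and coordinates into discrete tokens,
# in the spirit of the OmniSVG abstract. The vocabulary and binning are assumptions,
# not the paper's actual parameterization.
CMD_TOKENS = {"M": 0, "L": 1, "C": 2, "Z": 3}   # move, line, cubic curve, close path
NUM_BINS = 256                                   # coordinates quantized into 256 bins
COORD_OFFSET = len(CMD_TOKENS)                   # coordinate tokens follow command tokens


def quantize(value, lo=0.0, hi=200.0):
    """Map a coordinate in [lo, hi] to a discrete bin token."""
    bin_id = int((value - lo) / (hi - lo) * (NUM_BINS - 1))
    return COORD_OFFSET + max(0, min(NUM_BINS - 1, bin_id))


def tokenize_path(commands):
    """commands: list of (letter, [coords]) tuples parsed from an SVG <path> d attribute."""
    tokens = []
    for letter, coords in commands:
        tokens.append(CMD_TOKENS[letter])
        tokens.extend(quantize(c) for c in coords)
    return tokens


# "M 10 20 L 150 80 Z" -> one command token plus binned coordinate tokens per command.
print(tokenize_path([("M", [10, 20]), ("L", [150, 80]), ("Z", [])]))
```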
