Custom face detection + segmentation models with dedicated ComfyUI nodes
https://redd.it/1rrlh4o
@rStableDiffusion
Last week in Image & Video Generation
I curate a weekly multimodal AI roundup, here are the open-source image & video highlights from last week:
LTX-2.3 — Lightricks
Better prompt following, native portrait mode up to 1080x1920. Community moved incredibly fast on this one — see below.
Model | HuggingFace
https://reddit.com/link/1rr9iwd/video/8quo4o9mxhog1/player
Helios — PKU-YuanGroup
14B video model running in real time on a single GPU. Supports t2v, i2v, and v2v up to a minute long. Worth testing yourself.
HuggingFace | GitHub
https://reddit.com/link/1rr9iwd/video/ciw3y2vmxhog1/player
Kiwi-Edit
Video editing from text or image prompts, with temporal consistency. Style swaps, object removal, background changes.
HuggingFace | Project | Demo
https://preview.redd.it/dx8lm1uoxhog1.png?width=1456&format=png&auto=webp&s=25d8c82bac43d01f4e425179cd725be8ac542938
CubeComposer — TencentARC
Converts regular video to 4K 360° seamlessly. Output quality is genuinely surprising.
Project | HuggingFace
https://preview.redd.it/rqds7zvpxhog1.png?width=1456&format=png&auto=webp&s=24de8610bc84023c30ac5574cbaf7b06040c29a0
HY-WU — Tencent
Training-free personalized image edits. Face swaps and style transfer on the fly, no fine-tuning needed.
Project | HuggingFace
https://preview.redd.it/l9p8ahrqxhog1.png?width=1456&format=png&auto=webp&s=63f78ee94170afcca6390a35c50539a8e40d025b
Spectrum
3–5x diffusion speedup via Chebyshev polynomial step prediction. No retraining required; it plugs into existing image and video pipelines.
GitHub
https://preview.redd.it/htdch9trxhog1.png?width=1456&format=png&auto=webp&s=41100093cedbeba7843e90cd36ce62e08841aabc
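The one-liner above compresses the idea quite a bit. As a rough illustration only (this is not Spectrum's actual code), polynomial step prediction amounts to fitting a low-degree Chebyshev polynomial to the recent, smoothly varying denoising trajectory and extrapolating the next step instead of paying for another solver call. A toy scalar sketch with numpy, using a smooth stand-in trajectory:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Stand-in for a denoising trajectory: each solver step yields a latent
# quantity that evolves smoothly over the timestep schedule.
timesteps = np.linspace(0.0, 1.0, 9)   # 9 solver steps already taken
latents = np.exp(-2.0 * timesteps)     # hypothetical smooth trajectory

# Fit a low-degree Chebyshev polynomial to the last few steps...
coeffs = C.chebfit(timesteps[-5:], latents[-5:], deg=3)

# ...and extrapolate the next step instead of running the solver for it.
t_next = 1.125
predicted = C.chebval(t_next, coeffs)
actual = np.exp(-2.0 * t_next)         # what the "real" step would give
print(abs(predicted - actual))         # small extrapolation error
```

In a real pipeline the extrapolated latent would replace the model forward pass for some fraction of steps, which is where the claimed speedup comes from; the smoother the trajectory, the further you can extrapolate safely.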
LTX Desktop — Community
Free local video editor built on LTX-2.3. Just works out of the box.
Reddit
LTX Desktop Linux Port — Community
Someone ported LTX Desktop to Linux. Didn't take long.
Reddit
LTX-2.3 Workflows — Community
12GB GGUF workflows covering i2v, t2v, v2v and more.
Reddit
https://reddit.com/link/1rr9iwd/video/westyyf3yhog1/player
LTX-2.3 Prompting Guide — Community
Community-written guide that gets into the specifics of prompting LTX-2.3 well.
Reddit
Check out the full roundup for more demos, papers, and resources.
https://redd.it/1rr9iwd
@rStableDiffusion
Anima Preview 2 posted on Hugging Face
https://huggingface.co/circlestone-labs/Anima/tree/main/splitfiles/diffusionmodels
https://redd.it/1rqy92r
@rStableDiffusion
New Image Edit model? HY-WU
Why is there no mention of HY-WU here? https://huggingface.co/tencent/HY-WU
Has anyone actually used it?
https://redd.it/1rrdpya
@rStableDiffusion
So... turns out Z-Image Base is really good at inpainting realism. Workflow + info in the comments!
https://redd.it/1rrqrpf
@rStableDiffusion
Anima-Preview2-8-Step-Turbo-Lora
https://preview.redd.it/g15ojf2bgmog1.png?width=1024&format=png&auto=webp&s=e3e102e7f73329c100f48632e56fd8caa1e48c05
I'm happy to share with you my **Anima-Preview2-8-Step-Turbo-LoRA**.
You can download the model and find example workflows in the gallery/files sections here:
* [https://civitai.com/models/2460007?modelVersionId=2766518](https://civitai.com/models/2460007?modelVersionId=2766518)
* [https://huggingface.co/Einhorn/Anima-Preview2-Turbo-LoRA](https://huggingface.co/Einhorn/Anima-Preview2-Turbo-LoRA)
Recommended Settings
* **Steps:** 6–8
* **CFG Scale:** 1
* **Samplers:** `dpmpp_sde`, `dpmpp_2m_sde`, or `dpmpp_multistep`
This LoRA was trained using renewable energy.
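For readers wiring this up by hand, the recommended settings map onto a ComfyUI API-format workflow roughly as below. This is a hypothetical fragment, not from the post: the node IDs, the scheduler choice, and the upstream connections are placeholders, and the LoRA itself would be applied via a LoraLoader node feeding the `model` input.

```python
# Hypothetical ComfyUI API-format KSampler node using the recommended
# settings; IDs and connections are placeholders for illustration.
ksampler_node = {
    "3": {
        "class_type": "KSampler",
        "inputs": {
            "seed": 42,
            "steps": 8,                   # recommended range: 6-8
            "cfg": 1.0,                   # CFG scale 1
            "sampler_name": "dpmpp_sde",  # or dpmpp_2m_sde
            "scheduler": "normal",
            "denoise": 1.0,
            "model": ["10", 0],           # e.g. output of a LoraLoader node
            "positive": ["6", 0],
            "negative": ["7", 0],
            "latent_image": ["5", 0],
        },
    }
}
print(ksampler_node["3"]["inputs"]["steps"])  # 8
```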
https://redd.it/1rrs5u0
@rStableDiffusion
LTX-2.3: 30-second clips in 6.5 minutes with 16 GB VRAM. The settings work for all kinds of clips: no janky animation and high detail throughout. Try out the workflow.
https://redd.it/1rrq33f
@rStableDiffusion
I built a free local video captioner specifically tuned for LTX-2.3 training
https://redd.it/1rrsd9i
@rStableDiffusion
New FLUX.2 Klein 9b models have been released.
https://huggingface.co/black-forest-labs/FLUX.2-klein-9b-kv-fp8
https://redd.it/1rrw4lx
@rStableDiffusion
Flux 2 Klein 9B is now up to 2× faster with multiple reference images (new model)
https://x.com/bfl_ml/status/2032110512381837735
https://redd.it/1rrvnu2
@rStableDiffusion
Down to 32 s generation time for 10 seconds of video + audio using DeepBeepMeep's UI. LTX-2.3 on a 4090 (24 GB).
https://redd.it/1rrre4d
@rStableDiffusion