Simple Anima SEGS tiled upscale workflow (works with most models)
https://redd.it/1rye0p1
@rStableDiffusion
Ubisoft Chord PBR Material Estimation
I hadn't seen this mentioned anywhere, but Ubisoft has an open-source model that estimates a PBR material from any single image. It seems pretty amazing and is already integrated into ComfyUI!
I found it when this video came up in my YouTube feed:
https://www.youtube.com/watch?v=rE1M8_FaXtk
The model repo: https://github.com/ubisoft/ubisoft-laforge-chord
https://github.com/ubisoft/ComfyUI-Chord?tab=readme-ov-file
https://redd.it/1ryvqpj
@rStableDiffusion
Inpainting in 3 commands: remove objects or add accessories with any base model, no dedicated inpaint model needed
https://redd.it/1ryvv5p
@rStableDiffusion
PSA: Use the official LTX 2.3 workflow, not the one included with ComfyUI. It's significantly better.
Most of the time I rely on the default ComfyUI workflows. They produce results just as good as 90% of the overly complicated workflows I see floating around online. So I was fighting with the default Comfy LTX 2.3 template for a while and just not getting anything good. Then I saw someone mention the official LTX workflows and figured I'd give them a try.
Yeah, huge difference. It easily makes LTX blow past WAN 2.2 into SOTA territory for me. So something's up with the default Comfy workflow.
If you're having issues with weird LTX 2 or LTX 2.3 generations, use the official workflow instead:
https://github.com/Lightricks/ComfyUI-LTXVideo/blob/master/example_workflows/2.3/LTX-2.3_T2V_I2V_Single_Stage_Distilled_Full.json
This workflow runs the distilled and non-distilled models at the same time. I find they trade blows pretty evenly on giving me what I'm looking for, so I just left it generating both.
https://redd.it/1rz1u3j
@rStableDiffusion
ComfyUI Nodes for Filmmaking (LTX 2.3 Shot Sequencing, Keyframing, First Frame/Last Frame)
https://redd.it/1rz355d
@rStableDiffusion
NVIDIA SANA-Video 2B
https://www.youtube.com/watch?list=TLGG-iNIhzqJ0OgyMDAzMjAyNg&v=7eNfDzA4yBs
Efficient-Large-Model/SANA-Video_2B_720p · Hugging Face
SANA-Video is a small, ultra-efficient diffusion model designed for rapid generation of high-quality, minute-long videos at resolutions up to 720×1280.
Key innovations and efficiency drivers include:
(1) Linear DiT: Leverages linear attention as the core operation, offering significantly more efficiency than vanilla attention when processing the massive number of tokens required for video generation.
(2) Constant-Memory KV Cache for Block Linear Attention: Implements a block-wise autoregressive approach that uses the cumulative properties of linear attention to maintain global context at a fixed memory cost, eliminating the traditional KV cache bottleneck and enabling efficient, minute-long video synthesis.
SANA-Video achieves exceptional efficiency and cost savings: its training cost is only 1% of MovieGen's (12 days on 64 H100 GPUs). Compared to modern state-of-the-art small diffusion models (e.g., Wan 2.1 and SkyReel-V2), SANA-Video maintains competitive performance while being 16× faster in measured latency. SANA-Video is deployable on RTX 5090 GPUs, accelerating the inference speed for a 5-second 720p video from 71s down to 29s (2.4× speedup), setting a new standard for low-cost, high-quality video generation.
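To make the constant-memory idea in point (2) concrete, here's a minimal sketch of causal linear attention with a running state. This is a generic illustration of the technique using a simple elu+1 feature map, not SANA-Video's actual implementation:

```python
import numpy as np

def linear_attention_stream(queries, keys, values):
    """Causal linear attention with constant-size state.

    Instead of caching every past key/value (the usual KV-cache
    bottleneck), keep two running sums: S = sum(phi(k) v^T) and
    z = sum(phi(k)). Memory stays fixed no matter how long the
    sequence grows, which is what enables minute-long generation.
    """
    def phi(x):
        # elu(x) + 1: a simple positive feature map
        return np.where(x > 0, x + 1.0, np.exp(x))

    d_k, d_v = keys.shape[-1], values.shape[-1]
    S = np.zeros((d_k, d_v))  # running outer-product sum
    z = np.zeros(d_k)         # running key-feature sum
    outputs = []
    for q, k, v in zip(queries, keys, values):
        fk = phi(k)
        S += np.outer(fk, v)
        z += fk
        fq = phi(q)
        outputs.append((fq @ S) / (fq @ z + 1e-6))
    return np.stack(outputs)

T, d = 64, 16
q, k, v = np.random.default_rng(0).standard_normal((3, T, d))
print(linear_attention_stream(q, k, v).shape)  # (64, 16)
```

Per-step cost and memory here are O(d²) regardless of sequence length, versus O(T·d) memory for a standard KV cache.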
More comparison samples here: SANA Video
https://redd.it/1rz153l
@rStableDiffusion
Have you tried Fish Audio S2Pro?
What is your experience with it? Do you think it can compete with ElevenLabs?
I have tried it, and it's about 80% as good as ElevenLabs.
https://redd.it/1rz7wjh
@rStableDiffusion
I built a tool that creates LoRAs from images without any training — no gradient descent, no loss curves, no hyperparameters. Dataset in, LoRA out, 1-5 minutes.
I've been building an AI video production pipeline on 4×RTX 4090s and got frustrated with how long LoRA training takes. So I built NeuralGraft, which includes a new feature called LoRA Forge that constructs LoRAs from a folder of images using pure linear algebra — no training loop at all.
**How it works in 30 seconds:**
You give it a folder of images (10-100) and a base model checkpoint. It:
1. Extracts a "concept signature" from your images (81 visual features: color palette, texture, spatial frequency, contrast, structure)
2. Projects your images through each transformer block's weights
3. Discovers which activation directions encode your concept via closed-form regression
4. Constructs standard LoRA matrices (B @ A) from those directions via SVD
5. Outputs a standard .safetensors LoRA you can use in ComfyUI, diffusers, A1111 — anywhere
**CLI is one command:**
neuralgraft forge \
  --base model.safetensors \
  --images ./my_cinematic_shots/ \
  --output cinematic-lora.safetensors \
  --rank 16 \
  --trigger-word "cinematic"
**What it's actually good for:**
- Art style transfer (give it 20 frames from a film → get its visual style as a LoRA)
- Color grading (reference color-graded images → color-grading LoRA)
- Texture/material quality (skin texture, fabric, surfaces)
- Lighting mood (warm sunset, cold blue, neon)
- Camera characteristics (specific lens look, DoF style)
**What it honestly struggles with (not trying to oversell):**
- Specific face identity — faces are highly non-linear; use DreamBooth for that
- Very fine character details (specific clothing patterns, logos)
- Concepts the base model has never seen at all
**The math (for the curious):**
LoRA training discovers weight modification directions via gradient descent over thousands of steps. NeuralGraft discovers the same directions via closed-form linear regression on SVD-decomposed weights. Same result, different path — seconds of math instead of hours of training.
LoRA training: ΔW = B @ A (rank-r, learned over thousands of steps)
NeuralGraft: ΔW = U @ diag(d) @ V^T (rank-k, computed in one SVD)
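As a rough illustration of that last factorization step (a generic sketch of SVD-based LoRA construction, not NeuralGraft's actual code; `delta_to_lora` is a made-up name):

```python
import numpy as np

def delta_to_lora(delta_w, rank=16):
    """Factor a desired weight delta so that B @ A approximates delta_w."""
    U, s, Vt = np.linalg.svd(delta_w, full_matrices=False)
    # Keep the top-`rank` singular directions and split the singular
    # values evenly across both factors.
    B = U[:, :rank] * np.sqrt(s[:rank])          # (out_dim, rank)
    A = np.sqrt(s[:rank])[:, None] * Vt[:rank]   # (rank, in_dim)
    return B, A

delta = np.random.randn(512, 512)  # stand-in for a discovered weight delta
B, A = delta_to_lora(delta, rank=16)
err = np.linalg.norm(delta - B @ A) / np.linalg.norm(delta)
# High for random noise; the approach presumes real concept deltas
# are close to low-rank, which is also LoRA training's assumption.
print(f"relative reconstruction error at rank 16: {err:.3f}")
```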
**Other things NeuralGraft can do:**
- Permanently bake LoRAs into model weights for zero runtime overhead (see the sketch below)
- Graft capabilities from one model architecture into another (e.g., WAN 2.2 motion quality → LTX 2.3)
- Spectral amplification (boost LoRA-improved directions in base weights)
Works with any DiT-based model: LTX Video, FLUX, SD3, HunyuanVideo, WAN, PixArt.
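Baking, at least in the general LoRA formulation, is just folding the low-rank product back into the dense weight; a minimal sketch (generic, not NeuralGraft's API):

```python
import numpy as np

def bake_lora(W, B, A, alpha=1.0):
    """Merge a LoRA into base weights: W' = W + alpha * (B @ A).

    Inference then uses the single dense matrix W', so the merged
    LoRA costs nothing at runtime.
    """
    return W + alpha * (B @ A)
```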
**Repo:** https://github.com/alokickstudios-coder/neuralgraft
**License:** Apache 2.0 (fully open source)
Built this primarily for video generation (LTX 2.3) but it works for image models too. Happy to answer questions about the approach or limitations.
https://redd.it/1rza04z
@rStableDiffusion
SAMA 14B - Video Editing Model based on Wan 2.1 (Apache 2.0)
https://github.com/Cynthiazxy123/SAMA
https://huggingface.co/syxbb/SAMA-14B
https://redd.it/1rzauw4
@rStableDiffusion
GPU Temps for Local Gen
What sort of temps are acceptable for local image generation? I generate images at 832x1216 and upscale by 1.5x, and I'm seeing hot-spot temps on my RTX 4080 peak at 103°C.
Is it time to replace the thermal paste on my GPU, or are these temps expected? I'm worried they'll cause damage and lead to a costly replacement.
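For anyone wanting to watch temps during a generation run, a minimal polling sketch with the nvidia-ml-py (pynvml) bindings. Note that NVML's standard sensor reports the core/edge temperature; the hot-spot (junction) value shown by tools like HWiNFO comes from a separate sensor, typically runs 10-20°C hotter, and is not exposed through this standard API:

```python
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
try:
    for _ in range(30):  # sample once a second for 30 s
        temp = pynvml.nvmlDeviceGetTemperature(
            handle, pynvml.NVML_TEMPERATURE_GPU)
        watts = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0  # mW -> W
        print(f"core temp: {temp} C, power: {watts:.0f} W")
        time.sleep(1)
finally:
    pynvml.nvmlShutdown()
```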
https://redd.it/1rz9je1
@rStableDiffusion
What's the best pipeline to uniformize and upscale a large collection of old book cover scans?
https://redd.it/1rzbpeg
@rStableDiffusion