LTX 2.2 was nice but just not good enough. But I really think LTX 2.3 has finally gotten me to where I've basically stopped using WAN 2.2

For a long time, I considered LTX to be the worst of all the models. I've tried each release they've come out with. Some of the earlier ones were downright horrible, especially for their time.

But my God have they turned things around.

LTX 2.3 is by no means better than WAN 2.2 in every single way. But in my humble opinion, when you consider all factors, it is now the best overall video model that can be run locally, and it has reduced the need to fall back on WAN in a way that LTX 2.2 could not, especially since I2V (image-to-video) in 2.2 was an absolute nightmare to work with.

Things WAN 2.2 still has over LTX 2.3:

*Slightly better prompt comprehension and prompt following (against LTX 2.2, the gap was WAY bigger)

*Moderately better picture/video quality

*A LoRA advantage due to its age


On the flip side, having used LTX 2.3 a great deal since its release, it's painful to go back to WAN now.

*WAN is ideally good for only about 5 seconds before the output starts to break apart.

*WAN is dramatically slower than distilled LTX 2.3 or LTX 2.3 with the distill LoRA.

*WAN (the 14B version) cannot do sound on its own.

*WAN is therefore more useful now as a base building block that passes its output along to something else.

When you're making 15-second videos with highly convincing audio in one minute, it really highlights how far WAN is falling behind, especially since WAN 2.5 and 2.6 will likely never be local.


TL;DR

WAN might still hold some advantage for T2V, but for I2V it's basically obsolete now compared to LTX 2.3, and even on T2V, LTX 2.3 has made many gains. Since LTX is likely all we're going to get, as open source seems to be drying up, it's good that the company behind it has gotten over a lot of its growing pains and is now putting out some seriously amazing tech.

https://redd.it/1rzjel2
@rStableDiffusion
Built a local AI creative suite for Windows, thought you might find it useful

Hey all, I spent the last 6 weeks (and around 550 hours between Claude Code and various OOMs) building something that started as a portfolio piece but evolved into a single desktop app that covers the full creative pipeline: locally, no cloud, no subscriptions. It definitely runs on an RTX 4080 with 32GB of RAM (and luckily with no OOMs in the last 7 days of continuous daily usage).

https://preview.redd.it/qhvafyragdqg1.png?width=2670&format=png&auto=webp&s=a687d9c65e7ea7173bccdda426c22f590e8c2044

It runs image gen (Z-Image Turbo, Klein 9B) with 90+ style LoRAs and a built-in CivitAI browser, and LTX 2.3 for video across a few different workflow modes, plus video retexturing with LoRA presets and depth conditioning. There's a full image editor with AI inpainting, face swap (InsightFace + FaceFusion), background removal, SAM smart select, and LUT grading. Enhancement and frame interpolation use SeedVR2, Real-ESRGAN, and RIFE. For audio, there's ACE-Step for music, Qwen3-TTS for voiceover (28 preset voices plus clone and design modes), and HunyuanVideo-Foley for SFX. It also has a 12-stage storyboard pipeline and a persistent character library with multi-angle reference generation, so you can create characters and reuse them across both storyboard mode and image generation.

https://preview.redd.it/ys308jnegdqg1.png?width=2669&format=png&auto=webp&s=b1b1ef23814b193ac4e95b2cac4d869d53c5bd8e

https://preview.redd.it/c4nx2gtggdqg1.png?width=2757&format=png&auto=webp&s=ea7388165fd4424acc79e5c139584e3d92a611a5

There's a chance it will OOM (I counted 78 OOMs in the last 3 weeks alone), but I tried to build as many VRAM safeguards as possible and stress-tested it to the nth degree.

I'm still working on it, and a few things are already lined up for the next release (multilingual UI, support for Characters in videos, a mobile companion, Session mode, and a few other things).

I figured someone might find it useful. It's completely free, I don't collect any data, and you'll only need an internet connection to retrieve additional styles/LoRAs.

https://preview.redd.it/4o8k2uhjgdqg1.png?width=2893&format=png&auto=webp&s=0d8957bdd382b1b942ea727884c036b8a5b004ee

https://preview.redd.it/sbxd77bqgdqg1.png?width=2760&format=png&auto=webp&s=f65a29e2d7624f3a3eb420ad64506676202ac88d

The installer is ~4MB, but the total footprint will bring you close to 200GB.

You can download it from here: https://huggingface.co/atMrMattV/Visione

https://preview.redd.it/qkce1kqsgdqg1.png?width=2898&format=png&auto=webp&s=95838223b023a8eb80ad42608de7fba26da84e30




https://redd.it/1rznto9
@rStableDiffusion