Custom face detection + segmentation models with dedicated ComfyUI nodes
https://redd.it/1rrlh4o
@rStableDiffusion
Last week in Image & Video Generation

I curate a weekly multimodal AI roundup. Here are the open-source image & video highlights from last week:

LTX-2.3 - Lightricks

Better prompt following and native portrait mode up to 1080x1920. The community moved incredibly fast on this one; see the community entries below.
Model | HuggingFace

https://reddit.com/link/1rr9iwd/video/8quo4o9mxhog1/player

Helios - PKU-YuanGroup

14B video model that runs in real time on a single GPU. Supports t2v, i2v, and v2v up to a minute long. Worth testing yourself.
HuggingFace | GitHub

https://reddit.com/link/1rr9iwd/video/ciw3y2vmxhog1/player

Kiwi-Edit

Text- or image-prompted video editing with temporal consistency. Handles style swaps, object removal, and background changes.
HuggingFace | Project | Demo

https://preview.redd.it/dx8lm1uoxhog1.png?width=1456&format=png&auto=webp&s=25d8c82bac43d01f4e425179cd725be8ac542938

CubeComposer - TencentARC

Converts regular video to 4K 360° seamlessly. Output quality is genuinely surprising.
Project | HuggingFace

https://preview.redd.it/rqds7zvpxhog1.png?width=1456&format=png&auto=webp&s=24de8610bc84023c30ac5574cbaf7b06040c29a0

HY-WU - Tencent

No-training personalized image edits. Face swaps and style transfer on the fly without fine-tuning.
Project | HuggingFace

https://preview.redd.it/l9p8ahrqxhog1.png?width=1456&format=png&auto=webp&s=63f78ee94170afcca6390a35c50539a8e40d025b

Spectrum

3–5x diffusion speedup via Chebyshev polynomial step prediction. No retraining required; it plugs into existing image and video pipelines.
GitHub

https://preview.redd.it/htdch9trxhog1.png?width=1456&format=png&auto=webp&s=41100093cedbeba7843e90cd36ce62e08841aabc
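
For anyone wondering what "polynomial step prediction" could look like, here is a toy sketch. It is not Spectrum's code and makes no claim about how the project actually works: it only fits a Chebyshev series to a value observed at a few computed steps, then extrapolates to a step that was skipped. The function names (`cheb_basis`, `fit_chebyshev`, `predict`), the scalar "feature", and the cubic toy trajectory are all assumptions made for illustration.

```python
# Toy illustration only: NOT Spectrum's implementation.
# Idea: observe a quantity at a few "computed" steps, fit a Chebyshev
# series to it, and extrapolate to a step we did not compute.

def cheb_basis(x, degree):
    """Return [T_0(x), ..., T_degree(x)] using T_{k+1} = 2x*T_k - T_{k-1}."""
    values = [1.0, x]
    for _ in range(2, degree + 1):
        values.append(2.0 * x * values[-1] - values[-2])
    return values[: degree + 1]

def solve(a, b):
    """Solve the square system a @ x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_chebyshev(ts, ys, degree):
    """Least-squares Chebyshev coefficients for samples (ts, ys), ts in [-1, 1]."""
    rows = [cheb_basis(t, degree) for t in ts]
    n = degree + 1
    # Normal equations: (A^T A) c = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve(ata, aty)

def predict(coeffs, t):
    """Evaluate the fitted Chebyshev series at t."""
    basis = cheb_basis(t, len(coeffs) - 1)
    return sum(c * b for c, b in zip(coeffs, basis))

# Hypothetical scalar "feature" that evolves smoothly over the sampling
# trajectory. Step positions are rescaled to [-1, 1].
def toy_feature(t):
    return 2.0 * t**3 - t + 0.5

ts = [-1.0, -0.6, -0.2, 0.2, 0.6]    # steps we actually computed
ys = [toy_feature(t) for t in ts]    # values observed at those steps

coeffs = fit_chebyshev(ts, ys, degree=3)
pred = predict(coeffs, 1.0)          # step we skipped
print("predicted:", pred, "actual:", toy_feature(1.0))
```

The toy trajectory is an exact cubic, so a degree-3 fit recovers it up to floating-point error. Real diffusion features are high-dimensional and not exactly polynomial, so a real method would have to manage the approximation error somehow.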

LTX Desktop - Community

Free local video editor built on LTX-2.3. Just works out of the box.
Reddit

LTX Desktop Linux Port - Community

Someone ported LTX Desktop to Linux. Didn't take long.
Reddit

LTX-2.3 Workflows - Community

12GB GGUF workflows covering i2v, t2v, v2v and more.
Reddit

https://reddit.com/link/1rr9iwd/video/westyyf3yhog1/player

LTX-2.3 Prompting Guide - Community

Community-written guide that gets into the specifics of prompting LTX-2.3 well.
Reddit


Check out the full roundup for more demos, papers, and resources.



https://redd.it/1rr9iwd
@rStableDiffusion