GANs are making their way into production
Adobe has rolled out a super-resolution feature for Photoshop. Now one can upscale an image 2× on each side.
💎 For the curious, here are several links to SOTA super-resolution methods:
1. Structure-Preserving Super Resolution with Gradient Guidance (SPSR), CVPR2020
2. Learned Image Downscaling for Upscaling using Content Adaptive Resampler (CAR), ECCV2020
3. Single Image Super-Resolution via a Holistic Attention Network (HAN), ECCV2020
4. ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic, CVPR2021
—
Let me know in the comments if there is a better super-res paper.
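Most learned super-resolution networks (including several of the papers above) produce the 2× output with a sub-pixel "pixel shuffle" layer: the network predicts r² feature channels per output color channel, and those channels are rearranged into an image r times larger on each side. A minimal NumPy sketch of just that rearrangement step (the network itself is omitted; shapes are illustrative):

```python
import numpy as np

def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange a (C*r^2, H, W) tensor into (C, H*r, W*r).

    This is the depth-to-space step used by sub-pixel
    super-resolution layers (ESPCN-style upsampling).
    """
    c_r2, h, w = x.shape
    assert c_r2 % (r * r) == 0, "channel count must be divisible by r^2"
    c = c_r2 // (r * r)
    # split channels into (C, r, r), then interleave them into the spatial dims
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # -> (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)

# a 2x upscale: 12 predicted channels -> 3-channel image, twice as large per side
features = np.random.rand(12, 8, 8)
upscaled = pixel_shuffle(features, r=2)
print(upscaled.shape)  # (3, 16, 16)
```

Each group of r² channels contributes the r×r sub-pixel grid for one output pixel block, which is why a 2× model only needs to predict 4 channels per color channel.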
Summary of Recent Generative Models
Nice blog post giving a brief overview of several recent generative models, including VAEs, GANs, and diffusion models.
🌐 Read it here
Aran Komatsuzaki
State-of-the-Art Image Generative Models
I have aggregated some of the SotA image generative models released recently, with short summaries, visualizations and comments. The overall development is summarized, and the future trends are spe…
Facebook AI has built TimeSformer, a new architecture for video understanding. It’s the first based exclusively on the self-attention mechanism used in Transformers. It outperforms the state of the art while being more efficient than 3D ConvNets for video.
❓Why it matters
To train video-understanding models, the best 3D CNNs today can only use video segments that are a few seconds long. With TimeSformer, we are able to train on far longer video clips — up to several minutes long. This may dramatically advance research to teach machines to understand complex long-form actions in videos, which is an important step for many AI applications geared toward human behavior understanding (e.g., an AI assistant).
Furthermore, the low inference cost of TimeSformer is an important step toward supporting future real-time video processing applications, such as AR/VR, or intelligent assistants that provide services based on video taken from wearable cameras.
🌐 FAIR Blog
📝 Paper
The well-known implementation-freak lucidrains has already released ⚙️ TimeSformer code.
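The key trick behind TimeSformer's efficiency is "divided space-time attention": each patch token first attends across frames at its own spatial location (temporal attention), then across all patches within its own frame (spatial attention), instead of jointly over every (frame, patch) pair. A rough NumPy sketch of that factorization, assuming a single head with no learned projections, residuals, or normalization (all shapes illustrative, not the paper's actual configuration):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, k, v):
    # standard scaled dot-product attention over the second-to-last axis
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def divided_space_time_attention(x):
    """x: (T, N, D) patch tokens -- T frames, N patches per frame, D dims.

    Temporal step: each patch location attends across frames (~T^2 * N pairs).
    Spatial step:  each frame attends across its own patches (~N^2 * T pairs).
    Joint space-time attention would cost ~T^2 * N^2 pairs instead.
    """
    # 1) temporal: group tokens by spatial location -> (N, T, D), attend over T
    xt = x.transpose(1, 0, 2)
    xt = attend(xt, xt, xt).transpose(1, 0, 2)  # back to (T, N, D)
    # 2) spatial: attend over the N patches within each frame
    return attend(xt, xt, xt)

tokens = np.random.rand(8, 196, 64)  # 8 frames, 14x14 patches, 64-dim tokens
out = divided_space_time_attention(tokens)
print(out.shape)  # (8, 196, 64)
```

The factorization is what lets the model scale to clips that are minutes long: the temporal attention cost grows with T² per patch location rather than with (T·N)² over all tokens.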