🔥 T5X is a modular, composable, research-friendly framework for high-performance, configurable, self-service training, evaluation, and inference of sequence models.
Github: https://github.com/google-research/t5x
Paper: https://arxiv.org/abs/2203.17189v1
@ArtificialIntelligencedl
💻 TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing (CVPR 2022)
Recent advances like StyleGAN have promoted the growth of controllable facial editing.
Github: https://github.com/billyxyb/transeditor
Paper: https://arxiv.org/abs/2203.17266v1
Exploiting Explainable Metrics for Augmented SGD
New explainability metrics that measure the redundant information in a network's layers and exploit it to augment Stochastic Gradient Descent (SGD)
Code: https://github.com/mahdihosseini/rmsgd
Paper: https://arxiv.org/pdf/2203.16723v1.pdf
Dataset: https://paperswithcode.com/dataset/mhist
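The general idea can be sketched in a few lines of numpy. This is an illustration only, not the paper's RMSGD: the redundancy metric here (fraction of near-zero singular values in a weight matrix) and the learning-rate scaling rule are crude stand-ins for the paper's metrics.

```python
import numpy as np

def redundancy_metric(W, tol=0.01):
    """Fraction of near-zero singular values of a weight matrix:
    a crude proxy for redundant (low-rank) information in a layer."""
    s = np.linalg.svd(W, compute_uv=False)
    return float(np.mean(s / s.max() < tol))

def layer_lr(base_lr, W, boost=2.0):
    """Scale the learning rate up for layers judged redundant,
    nudging them to explore new directions (illustrative rule only)."""
    return base_lr * (1.0 + boost * redundancy_metric(W))

rng = np.random.default_rng(0)
full_rank = rng.standard_normal((64, 64))
low_rank = rng.standard_normal((64, 2)) @ rng.standard_normal((2, 64))

print(layer_lr(0.1, full_rank))  # stays close to the base rate
print(layer_lr(0.1, low_rank))   # boosted: most singular values vanish
```

A rank-2 layer gets a noticeably larger step size than a full-rank one, which is the flavour of "exploiting redundancy to augment SGD".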
Efficient Non-Autoregressive GAN Voice Conversion using VQWav2vec Features and Dynamic Convolution
Dynamic-GAN-VC (DYGAN-VC) uses a non-autoregressive structure and vector-quantised embeddings obtained from a VQWav2vec model
Code: https://github.com/mingjiechen/dyganvc
Paper: https://arxiv.org/abs/2203.17172v1
Dataset: https://github.com/nii-yamagishilab/VCC2020-database
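Vector quantisation, the core of the VQWav2vec features used above, is just a nearest-neighbour lookup into a learned codebook. A minimal numpy sketch (codebook size, dimensions, and frame counts are made up for illustration):

```python
import numpy as np

def vector_quantize(z, codebook):
    """Map each continuous frame embedding to its nearest codebook
    entry (L2 distance); returns indices and the quantised vectors."""
    d = np.linalg.norm(z[:, None, :] - codebook[None, :, :], axis=-1)
    idx = d.argmin(axis=1)
    return idx, codebook[idx]

rng = np.random.default_rng(0)
codebook = rng.standard_normal((320, 16))  # 320 learned codes, 16-dim
frames = rng.standard_normal((100, 16))    # 100 speech-frame embeddings
idx, zq = vector_quantize(frames, codebook)
print(idx.shape, zq.shape)  # (100,) (100, 16)
```

The discrete indices give the conversion model a compact, speaker-independent representation of content.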
Rethinking Portrait Matting with Privacy Preserving
Code: https://github.com/vitae-transformer/vitae-transformer-matting
Paper: https://arxiv.org/abs/2203.16828v1
Dataset: https://github.com/vitae-transformer/vitae-transformer-matting#ppt-setting-and-p3m-10k-dataset
MultiMAE: Multi-modal Multi-task Masked Autoencoders
An efficient and effective pre-training strategy for Vision Transformers
Project: https://multimae.epfl.ch/
Code: https://github.com/EPFL-VILAB/MultiMAE
Paper: https://arxiv.org/abs/2204.01678
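The pre-training strategy boils down to masking most patch tokens across modalities and encoding only the visible remainder. A minimal numpy sketch of that masking step (token counts, dimensions, and the 25% keep ratio are illustrative, not MultiMAE's exact configuration):

```python
import numpy as np

def random_mask(tokens, keep_ratio=0.25, rng=None):
    """Keep a random subset of patch tokens; a decoder would be
    trained to reconstruct the masked-out remainder."""
    rng = rng or np.random.default_rng()
    n = tokens.shape[0]
    keep = np.sort(rng.permutation(n)[: int(n * keep_ratio)])
    return tokens[keep], keep

rng = np.random.default_rng(0)
rgb = rng.standard_normal((196, 768))    # 14x14 RGB patch embeddings
depth = rng.standard_normal((196, 768))  # patch embeddings of a second modality
# Concatenate modalities and mask jointly, so the encoder sees only a
# small visible subset drawn from all modalities at once.
tokens = np.concatenate([rgb, depth])
visible, kept = random_mask(tokens, keep_ratio=0.25, rng=rng)
print(visible.shape)  # (98, 768)
```

Encoding only the visible quarter of the tokens is what makes MAE-style pre-training cheap relative to processing every patch.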
TESTR: Text Spotting Transformers
TExt Spotting TRansformers (TESTR), a generic end-to-end text spotting framework using Transformers for text detection and recognition in the wild
Code: https://github.com/mlpc-ucsd/testr
Paper: https://arxiv.org/abs/2204.01918
Dataset: https://ucsdcloud-my.sharepoint.com/:u:/g/personal/xiz102_ucsd_edu/EWgEM5BSRjBEua4B_qLrGR0BaombUL8K3d23ldXOb7wUNA?e=7VzH34
MIMDet
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Code: https://github.com/hustvl/mimdet
Paper: https://arxiv.org/abs/2204.02964v1
Dataset: https://paperswithcode.com/dataset/coco
Pretrained Model: https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_base_full.pth
FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment
A new fine-grained dataset, FineDiving, built from diverse diving events with detailed annotations of action procedures
Code: https://github.com/xujinglin/finediving
Paper: https://arxiv.org/abs/2204.03646v1
Dataset: https://pan.baidu.com/s/1v85-np2FbS0J4UfAEiI4mg
Context-Sensitive Temporal Feature Learning for Gait Recognition
Code: https://github.com/oliverhxh/cstl
Paper: https://arxiv.org/abs/2204.03270v1
DaViT: Dual Attention Vision Transformer
Code: https://github.com/dingmyu/davit
Paper: https://arxiv.org/abs/2204.03645v1
Dataset: https://paperswithcode.com/dataset/ade20k
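DaViT's "dual attention" alternates attention over spatial tokens with attention over channels. The numpy sketch below shows only the core idea: it drops the Q/K/V projections, multi-head grouping, and window partitioning of the actual model, so shapes and scaling are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(x):
    """Self-attention across spatial tokens; x is (tokens, channels)."""
    return softmax(x @ x.T / np.sqrt(x.shape[1])) @ x

def channel_attention(x):
    """Self-attention across channels: transpose so each channel
    becomes a token attending over the others."""
    xt = x.T
    return (softmax(xt @ xt.T / np.sqrt(xt.shape[1])) @ xt).T

rng = np.random.default_rng(0)
x = rng.standard_normal((49, 64))  # 7x7 spatial tokens, 64 channels
y = channel_attention(spatial_attention(x))
print(y.shape)  # (49, 64)
```

Spatial attention mixes information across locations; channel attention mixes it across feature dimensions, giving each block a global receptive field along both axes.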
Vision Transformers for Single Image Dehazing
Code: https://github.com/IDKiro/DehazeFormer
Paper: https://arxiv.org/abs/2204.03883v1
Dataset: https://paperswithcode.com/dataset/rs-haze
@ArtificialIntelligencedl
Code: https://github.com/IDKiro/DehazeFormer
Paper: https://arxiv.org/abs/2204.03883v1
Dataset: https://paperswithcode.com/dataset/rs-haze
@ArtificialIntelligencedl
SuperGAT
A self-supervised graph attention network (SuperGAT), an improved graph attention model for noisy graphs
Code: https://github.com/dongkwan-kim/SuperGAT
Paper: https://arxiv.org/abs/2204.04879v1
Dataset: https://paperswithcode.com/dataset/ogb
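SuperGAT's self-supervision reuses the attention logits as an edge predictor, so that attention learns which links are reliable in a noisy graph. A simplified numpy sketch (single head, no neighbourhood masking; graph sizes and the loss details are illustrative, not the paper's exact formulation):

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def edge_scores(H, W, a):
    """GAT-style pairwise attention logits
    e_ij = LeakyReLU(a^T [W h_i ; W h_j])."""
    Z = H @ W
    d = Z.shape[1]
    return leaky_relu((Z @ a[:d])[:, None] + (Z @ a[d:])[None, :])

def link_prediction_loss(e, A):
    """Binary cross-entropy between edge probabilities sigmoid(e_ij)
    and the (possibly noisy) adjacency matrix A."""
    p = sigmoid(e)
    return float(-np.mean(A * np.log(p + 1e-9) + (1 - A) * np.log(1 - p + 1e-9)))

rng = np.random.default_rng(0)
H = rng.standard_normal((10, 8))    # 10 nodes, 8 input features
W = rng.standard_normal((8, 4))     # shared projection
a = rng.standard_normal(8)          # concatenated [a_src ; a_dst]
A = (rng.random((10, 10)) < 0.2).astype(float)
e = edge_scores(H, W, a)
print(e.shape)  # (10, 10)
```

Minimising the link-prediction loss alongside the main task pushes the attention weights toward edges that actually exist, which is the source of robustness on noisy graphs.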
@ArtificialIntelligencedl
A self-supervised graph attention network (SuperGAT), an improved graph attention model for noisy graph
Code: https://github.com/dongkwan-kim/SuperGAT
Paper: https://arxiv.org/abs/2204.04879v1
Dataset: https://paperswithcode.com/dataset/ogb
@ArtificialIntelligencedl
π8π1
FederatedScope-GNN: Towards a Unified, Comprehensive and Efficient Package for Federated Graph Learning
Code: https://github.com/alibaba/federatedscope
Paper: https://arxiv.org/abs/2204.05562v1
🔥 DALL·E 2
DALL·E 2 is a new AI system that can create realistic images and art from a description in natural language.
Openai: https://openai.com/dall-e-2/
Paper: https://cdn.openai.com/papers/dall-e-2.pdf
Video: https://vimeo.com/692375454
Understanding Engagement from Video Screengrabs
Code: https://github.com/wanghewei16/video-engagement-analysis
Paper: https://arxiv.org/abs/2204.06454v1
Data source: https://github.com/e-drishti/wacv2016
📦 YOLOV5-ti-lite Object Detection Models
Code: https://github.com/texasinstruments/edgeai-yolov5
Paper: https://arxiv.org/abs/2204.06806v1
Dataset: https://paperswithcode.com/dataset/coco
@ArtificialIntelligencedl
Code: https://github.com/texasinstruments/edgeai-yolov5
Paper: https://arxiv.org/abs/2204.06806v1
Dataset: https://paperswithcode.com/dataset/coco
@ArtificialIntelligencedl
β€6
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Github: https://github.com/xiaoyuan1996/AMFMN
Paper: https://arxiv.org/abs/2204.09868v1
Dataset: https://paperswithcode.com/dataset/kitti
ResT V2: Simpler, Faster and Stronger
Code: https://github.com/wofmanaf/ResT
Paper: https://arxiv.org/abs/2204.07366v1
Dataset: https://drive.google.com/drive/folders/1H6QUZsKYbU6LECtxzGHKqEeGbx1E8uQ9
Temporally Efficient Vision Transformer for Video Instance Segmentation
Code: https://github.com/hustvl/tevit
Paper: https://arxiv.org/abs/2204.08412v1
Dataset: https://paperswithcode.com/dataset/youtubevis
@ArtificialIntelligencedl
Code: https://github.com/hustvl/tevit
Paper: https://arxiv.org/abs/2204.08412v1
Dataset: https://paperswithcode.com/dataset/youtubevis
@ArtificialIntelligencedl
π3π₯3π1