StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation
📖The recent advancements in image-text diffusion models have stimulated research interest in large-scale 3D generative models.
https://github.com/icoz69/styleavatar3d
Official repo for StyleAvatar3D.
Humans in 4D: Reconstructing and Tracking Humans with Transformers
📖To analyze video, we use 3D reconstructions from HMR 2.0 as input to a tracking system that operates in 3D.
https://github.com/shubham-goel/4D-Humans
📝DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
https://github.com/rikorose/deepfilternet
Noise suppression using deep filtering.
📝SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
https://github.com/vahe1994/spqr
📝Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding (EMNLP 2023 Demo)
https://github.com/damo-nlp-sg/video-llama
Simple and Controllable Music Generation
📝We tackle the task of conditional music generation.
https://github.com/facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable...
AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities
📝In this work, we present a conceptually simple and effective method to train a strong bilingual/multilingual multimodal representation model.
https://github.com/flagai-open/flagai
FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models.
📝Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation.
https://github.com/mbzuai-oryx/video-chatgpt
🦦 📝Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
https://github.com/luodian/otter
📝Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance
https://github.com/franxyao/chain-of-thought-hub
Benchmarking large language models' complex reasoning ability with chain-of-thought prompting.
MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing
📝https://github.com/osu-nlp-group/magicbrush
Dataset, code and models for the paper "MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing".
ChemCrow: Augmenting large-language models with chemistry tools
📝https://github.com/ur-whitelab/chemcrow-public
WizardCoder: Empowering Code Large Language Models with Evol-Instruct
📝https://github.com/nlpxucan/wizardlm
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath.
Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
📝https://github.com/facebookresearch/ijepa
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predic...
Full Parameter Fine-tuning for Large Language Models with Limited Resources
📝https://github.com/openlmlab/lomo
LOMO: LOw-Memory Optimization.
WizMap: Scalable Interactive Visualization for Exploring Large Machine Learning Embeddings
📝https://github.com/poloclub/wizmap
Explore and interpret large embeddings in your browser with interactive visualization! 📍