Tree of Thoughts: Deliberate Problem Solving with Large Language Models
📝https://github.com/kyegomez/tree-of-thoughts
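The core idea behind Tree of Thoughts, a tree search over intermediate "thoughts" with a value function pruning the frontier, can be sketched without any model calls. Here `expand` and `score` stand in for the LLM-driven proposal and evaluation steps; both callables and the toy task are assumptions of this sketch, not the paper's setup:

```python
import heapq

def tree_of_thoughts(root, expand, score, beam=3, depth=4):
    """Generic ToT-style search: expand each candidate 'thought',
    keep the `beam` highest-scoring ones per level, and recurse.
    In the paper, expand/score would be LLM calls; here they are
    plain Python callables."""
    frontier = [root]
    for _ in range(depth):
        candidates = [c for t in frontier for c in expand(t)]
        if not candidates:
            break
        frontier = heapq.nlargest(beam, candidates, key=score)
    return max(frontier, key=score)

# Toy task: build a 4-digit sequence whose digits sum to 15.
expand = lambda t: [t + [d] for d in range(10)]
score = lambda t: -abs(15 - sum(t))
best = tree_of_thoughts([], expand, score, beam=3, depth=4)
```

The beam width and depth control the explore/exploit trade-off, which is the knob the paper tunes per task.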
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models (NeurIPS 2023)
📝https://github.com/shihaozhaozsh/uni-controlnet
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
📝https://github.com/Liuhong99/Sophia
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models (CVPR 2024)
📝https://github.com/shi-labs/prompt-free-diffusion
ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation (NeurIPS 2023 Spotlight)
📝https://github.com/thu-ml/prolificdreamer
Large Language Models as Tool Makers
📝Recent research shows the potential of enhancing the problem-solving ability of large language models (LLMs) through the use of external tools. However, prior work along this line depends on the availability of existing tools. In this work, we take an initial step towards removing this dependency by proposing a closed-loop framework, referred to as LLMs As Tool Makers (LATM), where LLMs create their own reusable tools for problem-solving.
https://github.com/ctlllll/llm-toolmaker
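The closed loop described above, a "tool maker" model writes a reusable Python function once and a "tool user" then calls it on many instances, can be sketched with a stubbed `llm` function standing in for the real model call. The prompt wording, the function name `solve`, and the canned stub are all assumptions of this sketch:

```python
def llm(prompt: str) -> str:
    """Stub for the tool-maker model call; a real LATM setup would
    query an LLM here. Returns a canned tool for the demo task."""
    return "def solve(xs):\n    return sorted(set(xs))\n"

def make_tool(task: str):
    """Tool-maker phase: request a reusable function once, exec it,
    and hand the resulting callable to the cheaper tool-user phase."""
    source = llm(f"Write a Python function `solve` that can {task}.")
    namespace = {}
    exec(source, namespace)  # in practice: sandbox and unit-test the tool first
    return namespace["solve"]

# Tool-user phase: the cached tool is reused across problem instances.
dedupe_and_sort = make_tool("deduplicate and sort a list")
```

The point of the loop is amortization: the expensive tool-making call happens once per task type, not once per instance.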
Fine-Tuning Language Models with Just Forward Passes (NeurIPS 2023)
📝Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but as LMs grow in size, backpropagation requires a prohibitively large amount of memory.
https://github.com/princeton-nlp/MeZO
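MeZO builds on a classic zeroth-order (SPSA-style) gradient estimator: two forward passes with a shared random perturbation yield a descent direction, so no backward pass (and no activation memory) is needed. A toy version on a quadratic loss; the hyperparameters and objective here are illustrative, not the paper's:

```python
import numpy as np

def mezo_step(params, loss_fn, lr, eps=1e-3, seed=0):
    """One zeroth-order update: perturb by a random direction z,
    difference two forward-pass losses to get the gradient projected
    onto z, and step along z accordingly."""
    z = np.random.default_rng(seed).standard_normal(params.shape)
    g = (loss_fn(params + eps * z) - loss_fn(params - eps * z)) / (2 * eps)
    return params - lr * g * z  # g is the directional derivative along z

# Toy objective: ||w - 3||^2, minimized at w = 3.
loss = lambda p: float(np.sum((p - 3.0) ** 2))
w = np.zeros(4)
for step in range(1500):
    w = mezo_step(w, loss, lr=0.02, seed=step)
```

Seeding the perturbation per step (rather than storing `z`) is the trick that keeps memory at inference level: the same direction can be regenerated from the seed when applying the update.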
NeRO: Neural Geometry and BRDF Reconstruction of Reflective Objects from Multiview Images (SIGGRAPH 2023)
📝We present a neural rendering-based method called NeRO for reconstructing the geometry and the BRDF of reflective objects from multiview images captured in an unknown environment.
https://github.com/liuyuan-pal/NeRO
SoundStorm: Efficient Parallel Audio Generation (Google; unofficial PyTorch implementation)
📝We present SoundStorm, a model for efficient, non-autoregressive audio generation.
https://github.com/rishikksh20/SoundStorm-pytorch
CodeTF: One-stop Transformer Library for State-of-the-art Code LLM
📖In this paper, we present CodeTF, an open-source Transformer-based library for state-of-the-art Code LLMs and code intelligence.
https://github.com/salesforce/codetf
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration (MLSys 2024 Best Paper)
📖Large language models (LLMs) have shown excellent performance on various tasks, but the astronomical model size raises the hardware barrier for serving (memory size) and slows down token generation (memory bandwidth).
https://github.com/mit-han-lab/llm-awq
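AWQ's central move, scaling weight channels by activation magnitude before rounding so that salient channels keep more precision, can be illustrated with a fake-quantization sketch. The square-root scaling rule and the per-row quantization grid below are illustrative simplifications, not AWQ's searched per-channel scales or group-wise scheme:

```python
import numpy as np

def awq_style_quantize(W, act_scale, bits=4):
    """Activation-aware fake quantization sketch: fold a per-input-channel
    scale into W before rounding to a (2^bits)-level grid, then fold it
    back out, so channels seeing large activations get finer steps."""
    s = np.sqrt(act_scale)                                # per input-channel scale
    qmax = 2 ** (bits - 1) - 1                            # e.g. 7 for 4-bit
    Ws = W * s[None, :]                                   # scale weight columns
    step = np.abs(Ws).max(axis=1, keepdims=True) / qmax   # per-row quant step
    Wq = np.round(Ws / step) * step                       # round to the int grid
    return Wq / s[None, :]                                # unfold the scale

W = np.random.default_rng(0).uniform(-1.0, 1.0, (8, 8))
W_deq = awq_style_quantize(W, act_scale=np.linspace(1.0, 4.0, 8))
```

At inference the fold/unfold would be absorbed into the neighboring layer's activations, so only the integer weights need storing.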
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
📖Modern hierarchical vision transformers have added several vision-specific components in the pursuit of supervised classification performance.
https://github.com/facebookresearch/hiera
StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation
📖The recent advancements in image-text diffusion models have stimulated research interest in large-scale 3D generative models.
https://github.com/icoz69/styleavatar3d
Humans in 4D: Reconstructing and Tracking Humans with Transformers
📖To analyze video, we use 3D reconstructions from HMR 2.0 as input to a tracking system that operates in 3D.
https://github.com/shubham-goel/4D-Humans
📝DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
https://github.com/rikorose/deepfilternet
📝SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
https://github.com/vahe1994/spqr
📝Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding (EMNLP 2023 Demo)
https://github.com/damo-nlp-sg/video-llama