Sparks of Artificial General Intelligence: Early experiments with GPT-4
📝 https://github.com/microsoft/guidance
A guidance language for controlling large language models.
FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention [IJCV]
📝 https://github.com/mit-han-lab/fastcomposer
VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks
📝 https://github.com/opengvlab/visionllm
Official PyTorch implementation of "ControlVideo: Training-free Controllable Text-to-Video Generation"
📝 https://github.com/ybybzhang/controlvideo
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
📝 https://github.com/kyegomez/tree-of-thoughts
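The core Tree-of-Thoughts loop can be sketched in a few lines: propose several candidate "thoughts" from each state, score them with an evaluator, keep the best few, and descend. This is a conceptual sketch, not this repo's API; the `propose` and `evaluate` functions are toy stand-ins for LLM calls, here targeting digit sequences that sum to 12.

```python
TARGET = 12

def propose(state):
    # An LLM would generate candidate continuations; here: append a digit.
    return [state + [d] for d in range(10)]

def evaluate(state):
    # An LLM would rate partial progress; here: closeness to the target sum.
    return -abs(TARGET - sum(state))

def tree_of_thoughts(depth=3, breadth=2):
    frontier = [[]]                      # root: empty reasoning state
    for _ in range(depth):
        candidates = [s for st in frontier for s in propose(st)]
        candidates.sort(key=evaluate, reverse=True)
        frontier = candidates[:breadth]  # breadth-first search with beam pruning
    return frontier[0]

best = tree_of_thoughts()
print(best, sum(best))  # a 3-step path summing to 12
```

The point of the tree over plain chain-of-thought is exactly this explicit frontier: weak partial thoughts are pruned instead of being committed to.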
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models [NeurIPS 2023]
📝 https://github.com/shihaozhaozsh/uni-controlnet
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
📝 https://github.com/Liuhong99/Sophia
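The Sophia update combines an EMA of gradients, an EMA of a diagonal Hessian estimate refreshed only every k steps, and a per-coordinate clipped preconditioned step, roughly theta ← theta − lr · clip(m / max(γ·h, ε), 1). The sketch below is one reading of that rule on a least-squares toy, not the repo's API; the exact diagonal Hessian stands in for the paper's stochastic estimator.

```python
import numpy as np

def sophia_demo(X, y, steps=200, lr=0.05, b1=0.9, b2=0.99,
                gamma=0.05, eps=1e-12, k=10):
    n, d = X.shape
    theta = np.zeros(d)
    m = np.zeros(d)
    hess_diag = 2.0 * np.sum(X * X, axis=0) / n   # exact diag of 2 X^T X / n
    h = hess_diag.copy()                          # seed the Hessian EMA
    for t in range(steps):
        g = 2.0 * X.T @ (X @ theta - y) / n       # full-batch gradient
        m = b1 * m + (1 - b1) * g                 # gradient EMA
        if t % k == 0:                            # cheap, infrequent refresh
            h = b2 * h + (1 - b2) * hess_diag
        step = np.clip(m / np.maximum(gamma * h, eps), -1.0, 1.0)
        theta = theta - lr * step                 # clipped preconditioned step
    return theta

rng = np.random.default_rng(0)
X = rng.standard_normal((128, 8))
true_theta = rng.standard_normal(8)
y = X @ true_theta
theta = sophia_demo(X, y)
print(np.max(np.abs(theta - true_theta)))         # small residual
```

The element-wise clip is what makes the second-order information cheap to use safely: coordinates with unreliable or tiny curvature estimates fall back to a bounded, sign-like step.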
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models [CVPR 2024]
📝 https://github.com/shi-labs/prompt-free-diffusion
ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation [NeurIPS 2023 Spotlight]
📝 https://github.com/thu-ml/prolificdreamer
Large Language Models as Tool Makers
📝 Recent research shows the potential of enhancing the problem-solving ability of large language models (LLMs) through the use of external tools. However, prior work along this line depends on the availability of existing tools. In this work, we take an initial step towards removing this dependency by proposing a closed-loop framework, referred to as LLMs As Tool Makers (LATM), in which LLMs create their own reusable tools for problem-solving.
https://github.com/ctlllll/llm-toolmaker
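The closed loop described in the abstract has three stages: a strong model writes a tool, the tool is verified on examples, and a (cheaper) model reuses it on new instances. A minimal sketch, with `make_tool` as a hand-written stub standing in for the LLM call and all names hypothetical:

```python
def make_tool(task_description: str) -> str:
    """Tool-maker step: would prompt a strong LLM to write a reusable
    Python function for the task. Stubbed with fixed code here."""
    return (
        "def solve(items):\n"
        "    # toy tool: return values in descending order\n"
        "    return sorted(items, reverse=True)\n"
    )

def verify_tool(tool_code: str, examples) -> bool:
    """Verification step: run the generated tool on held-out
    input/output examples before caching it for reuse."""
    scope = {}
    exec(tool_code, scope)
    return all(scope["solve"](x) == y for x, y in examples)

def use_tool(tool_code: str, new_input):
    """Tool-user step: a cheaper model would call the cached tool;
    here we just execute it directly."""
    scope = {}
    exec(tool_code, scope)
    return scope["solve"](new_input)

examples = [([3, 1, 2], [3, 2, 1])]
code = make_tool("sort job values descending")
assert verify_tool(code, examples)   # tool passes verification; cache it
print(use_tool(code, [5, 4, 9]))     # reuse on a new instance → [9, 5, 4]
```

The economics come from amortization: the expensive tool-making call happens once per task type, while every later instance only pays for the cheap tool-user call.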
Fine-Tuning Language Models with Just Forward Passes [NeurIPS 2023]
📝 Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but as LMs grow in size, backpropagation requires a prohibitively large amount of memory.
https://github.com/princeton-nlp/MeZO
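MeZO sidesteps backpropagation with a zeroth-order (SPSA-style) gradient estimate: perturb the weights along a random direction z, take two forward passes, and move along z by the measured loss difference. A simplified toy sketch (the paper additionally regenerates z from a seed and perturbs in place so memory stays at inference level):

```python
import numpy as np

def loss(theta, X, y):
    # simple least-squares loss; stands in for a forward pass
    return float(np.mean((X @ theta - y) ** 2))

def mezo_step(theta, X, y, eps=1e-3, lr=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(theta.shape)
    # two forward passes, no backprop: directional derivative along z
    g = (loss(theta + eps * z, X, y) - loss(theta - eps * z, X, y)) / (2 * eps)
    return theta - lr * g * z  # SPSA-style update along the probe direction

rng = np.random.default_rng(1)
X = rng.standard_normal((32, 4))
true_theta = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_theta
theta = np.zeros(4)
for t in range(500):
    theta = mezo_step(theta, X, y, seed=t)
print(loss(theta, X, y))  # much smaller than the initial loss
```

Each step costs exactly two forward passes and one random vector, which is why the method's memory footprint matches inference rather than training.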
NeRO: Neural Geometry and BRDF Reconstruction of Reflective Objects from Multiview Images [SIGGRAPH 2023]
📝 We present a neural rendering-based method called NeRO for reconstructing the geometry and the BRDF of reflective objects from multiview images captured in an unknown environment.
https://github.com/liuyuan-pal/NeRO
SoundStorm: Efficient Parallel Audio Generation
📝 We present SoundStorm, a model for efficient, non-autoregressive audio generation.
https://github.com/rishikksh20/SoundStorm-pytorch
CodeTF: One-stop Transformer Library for State-of-the-art Code LLM
📖 In this paper, we present CodeTF, an open-source Transformer-based library for state-of-the-art Code LLMs and code intelligence.
https://github.com/salesforce/codetf
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration [MLSys 2024 Best Paper Award]
📖 Large language models (LLMs) have shown excellent performance on various tasks, but the astronomical model size raises the hardware barrier for serving (memory size) and slows down token generation (memory bandwidth).
https://github.com/mit-han-lab/llm-awq
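The activation-aware idea behind AWQ is that a small fraction of weight channels matter far more because they see large activations, so those channels are scaled up before round-to-nearest quantization and the inverse scale is folded into the activations. A toy sketch of that idea, not the repo's API; AWQ searches for the scales, while this uses a simple activation-magnitude heuristic:

```python
import numpy as np

def quantize_rtn(W, n_bits=4):
    """Per-output-channel symmetric round-to-nearest quantization."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(W).max(axis=1, keepdims=True) / qmax
    return np.round(W / scale) * scale

def quantize_awq_style(W, act_scale, n_bits=4, alpha=0.5):
    """Scale input channels by s = act_scale**alpha, quantize W*s;
    at runtime the activations are divided by s to compensate."""
    s = act_scale ** alpha
    Wq = quantize_rtn(W * s[None, :], n_bits)
    return Wq, s

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)) * 0.1
X = rng.standard_normal((256, 64))
X[:, :4] *= 50.0                      # a few channels carry large activations
act_scale = np.abs(X).mean(axis=0)

Y = X @ W.T                           # full-precision reference output
err_rtn = np.mean((X @ quantize_rtn(W).T - Y) ** 2)
Wq, s = quantize_awq_style(W, act_scale)
err_awq = np.mean(((X / s[None, :]) @ Wq.T - Y) ** 2)
print(err_rtn, err_awq)               # scaled quantization hurts outputs less
```

Because (X/s) @ (W·diag(s)).T equals X @ W.T exactly, the rescaling is mathematically free; only the rounding error distribution changes, shifting precision toward the salient channels.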
Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles
📖 Modern hierarchical vision transformers have added several vision-specific components in the pursuit of supervised classification performance.
https://github.com/facebookresearch/hiera