Github LLMs
747 subscribers
39 photos
3 videos
4 files
54 links
LLM projects
@Raminmousa
Download Telegram
Channel created
graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system
                                                                   
Creator: Microsoft
Stars ⭐️: 13.7k
Forked By: 1.2k
GitHub Repo:
https://github.com/microsoft/graphrag

       
Join @deep_learning_proj
👍1
firecrawl

Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
                                                                   
Creator: Mendable
Stars ⭐️: 12.3k
Forked By: 861
GitHub Repo:
https://github.com/mendableai/firecrawl

https://t.iss.one/deep_learning_proj
Please open Telegram to view this post
VIEW IN TELEGRAM
👍2
🖥 Awesome LLM Strawberry (OpenAI o1)



Github

https://t.iss.one/deep_learning_proj
Please open Telegram to view this post
VIEW IN TELEGRAM
👍2
MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
                                                                   
Creator: OpenBMB
Stars ⭐️: 11.4k
Forked By: 798
GitHub Repo:
https://github.com/OpenBMB/MiniCPM-V

       
Join https://t.iss.one/deep_learning_proj
Please open Telegram to view this post
VIEW IN TELEGRAM
🌟 GRIN MoE: Mixture-of-Experts от Microsoft.


🟢total parameters: 16x3.8B;
🟢active parameters: 6.6B;
🟢context length: 4096;
🟢number of embeddings 4096;
🟢number of layers: 32;
https://t.iss.one/deep_learning_proj


🟡Arxiv
🟡Demo
🖥Github
Please open Telegram to view this post
VIEW IN TELEGRAM
🔥 NVIDIA silently release a Llama 3.1 70B fine-tune that outperforms
GPT-4o and Claude Sonnet 3.5


Llama 3.1 Nemotron 70B Instruct a further RLHFed model on
huggingface


https://huggingface.co/collections/nvidia/llama-31-nemotron-70b-670e93cd366feea16abc13d8
https://t.iss.one/deep_learning_proj
Please open Telegram to view this post
VIEW IN TELEGRAM
🌟 Zamba2-Instruct

В семействе 2 модели:

🟢Zamba2-1.2B-instruct;
🟠Zamba2-2.7B-instruct.



# Clone repo
git clone https://github.com/Zyphra/transformers_zamba2.git
cd transformers_zamba2

# Install the repository & accelerate:
pip install -e .
pip install accelerate

# Inference:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("Zyphra/Zamba2-2.7B-instruct")
model = AutoModelForCausalLM.from_pretrained("Zyphra/Zamba2-2.7B-instruct", device_map="cuda", torch_dtype=torch.bfloat16)

user_turn_1 = "user_prompt1."
assistant_turn_1 = "assistant_prompt."
user_turn_2 = "user_prompt2."
sample = [{'role': 'user', 'content': user_turn_1}, {'role': 'assistant', 'content': assistant_turn_1}, {'role': 'user', 'content': user_turn_2}]
chat_sample = tokenizer.apply_chat_template(sample, tokenize=False)

input_ids = tokenizer(chat_sample, return_tensors='pt', add_special_tokens=False).to("cuda")
outputs = model.generate(**input_ids, max_new_tokens=150, return_dict_in_generate=False, output_scores=False, use_cache=True, num_beams=1, do_sample=False)
print((tokenizer.decode(outputs[0])))





🖥GitHub

https://t.iss.one/deep_learning_proj
Please open Telegram to view this post
VIEW IN TELEGRAM
👍2
📖 LLM-Agent-Paper-List is a repository of papers on the topic of agents based on large language models (LLM)! The papers are divided into categories such as LLM agent architectures, autonomous LLM agents, reinforcement learning (RL), natural language processing methods, multimodal approaches and tools for developing LLM agents, and more.

🖥 Github

https://t.iss.one/deep_learning_proj
Please open Telegram to view this post
VIEW IN TELEGRAM
👍3
Welcome to Ollama's Prompt Engineering Interactive Tutorial

🔗 Github

https://t.iss.one/deep_learning_proj
👍3
⚡️ MobileLLM


🟢MobileLLM-125M. 30 Layers, 9 Attention Heads, 3 KV Heads. 576 Token Dimension;

🟢MobileLLM-350M. 32 Layers, 15 Attention Heads, 5 KV Heads. 960 Token Dimension;

🟢MobileLLM-600M. 40 Layers, 18 Attention Heads, 6 KV Heads. 1152 Token Dimension;

🟢MobileLLM-1B. 54 Layers, 20 Attention Heads, 5 KV Heads. 1280 Token Dimension;


🟡Arxiv
🖥GitHub


@Machine_learn
Please open Telegram to view this post
VIEW IN TELEGRAM
🌟 BioNeMo: A Framework for Developing AI Models for Drug Design.

NVIDIA BioNeMo2 Framework is a set of tools, libraries, and models for computational drug discovery and design.



▶️ Pre-trained models:

🟢 ESM-2 is a pre-trained bidirectional encoder (BERT-like) for amino acid sequences. BioNeMo2 includes checkpoints with parameters 650M and 3B;

🟢 Geneformer is a tabular scoring model that generates a dense representation of a cell's scRNA by examining co-expression patterns in individual cells.


▶️ Datasets:

🟠 CELLxGENE is a collection of publicly available single-cell datasets collected by the CZI (Chan Zuckerberg Initiative) with a total volume of 24 million cells;


🟠 UniProt is a database of clustered sets of protein sequences from UniProtKB, created on the basis of translated genomic data.



🟡 Project page
🟡 Documentation
🖥 GitHub

@Machine_learn
Please open Telegram to view this post
VIEW IN TELEGRAM
👍2
Please open Telegram to view this post
VIEW IN TELEGRAM
👍1
Large Language Models Course: Learn by Doing LLM Projects

🖥 Github: https://github.com/peremartra/Large-Language-Model-Notebooks-Course

📕 Paper: https://doi.org/10.31219/osf.io/qgxea

@Machine_learn
Please open Telegram to view this post
VIEW IN TELEGRAM
Foundations of Large Language Models (1).pdf
1.9 MB
Foundations of Large Language Models

📝 Table of Contents:
● Pre-training
● Generative Models
● Prompting
● Alignment

Tong Xiao and Jingbo Zhu
January 17, 2025

📃 Download from arXiv.

@Machine_learn
👍1