firecrawl
Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
Creator: Mendable
Stars: 12.3k
Forks: 861
GitHub Repo:
https://github.com/mendableai/firecrawl
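A minimal sketch of calling Firecrawl's hosted scrape endpoint from Python. The endpoint path, payload fields, and response shape are assumptions based on the v1 REST API and may differ between releases; you need your own API key for a real call.

```python
# Hedged sketch: endpoint and field names are assumed from Firecrawl's
# hosted v1 REST API and may not match every version.
import json
import urllib.request

API_URL = "https://api.firecrawl.dev/v1/scrape"  # assumed endpoint

def build_scrape_request(url, api_key, formats=("markdown",)):
    """Assemble the POST request asking for LLM-ready markdown."""
    body = json.dumps({"url": url, "formats": list(formats)}).encode()
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
        method="POST",
    )

def scrape(url, api_key):
    """Send the request; the markdown is expected under the 'data' key."""
    with urllib.request.urlopen(build_scrape_request(url, api_key),
                                timeout=30) as r:
        return json.loads(r.read())
```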
Join: https://t.iss.one/deep_learning_proj
MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Creator: OpenBMB
Stars: 11.4k
Forks: 798
GitHub Repo:
https://github.com/OpenBMB/MiniCPM-V
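A hedged sketch of chatting with MiniCPM-V 2.6 through transformers. The model's custom `.chat()` interface and message format are assumptions based on the repo's README, and actually running it requires `trust_remote_code=True` and a GPU; the `build_msgs` helper is illustrative only.

```python
# Sketch only: MiniCPM-V's .chat() signature and message layout are
# assumptions from the repo README, not a verified API.
def build_msgs(question, image=None):
    """MiniCPM-V expects a chat-style list; images go in the content list."""
    content = [image, question] if image is not None else [question]
    return [{"role": "user", "content": content}]

def main():
    import torch
    from PIL import Image
    from transformers import AutoModel, AutoTokenizer

    model = AutoModel.from_pretrained(
        "openbmb/MiniCPM-V-2_6", trust_remote_code=True,
        torch_dtype=torch.bfloat16).eval().cuda()
    tokenizer = AutoTokenizer.from_pretrained(
        "openbmb/MiniCPM-V-2_6", trust_remote_code=True)
    image = Image.open("photo.jpg").convert("RGB")
    answer = model.chat(image=None,
                        msgs=build_msgs("Describe this image.", image),
                        tokenizer=tokenizer)
    print(answer)

if __name__ == "__main__":
    main()
```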
LLM based Multi-Agent methods
Github: https://github.com/AgnostiqHQ/multi-agent-llm
Paper: https://arxiv.org/abs/2409.12618v1
Dataset: https://paperswithcode.com/dataset/hotpotqa
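A toy sketch of the Iteration of Thought (IoT) loop described in the paper: an Inner Dialogue Agent refines the prompt between calls to the LLM Agent until the answer stabilizes. Both agents are stubbed here; the actual repo wires them to real LLM calls, so treat this as a shape, not the implementation.

```python
# Toy IoT loop: the stub agents below are illustrative stand-ins for
# real LLM calls in the AgnostiqHQ/multi-agent-llm repo.
def iterate_of_thought(question, llm_agent, inner_dialogue_agent, max_iters=5):
    prompt, answer = question, None
    for _ in range(max_iters):
        new_answer = llm_agent(prompt)
        if new_answer == answer:      # answer stopped changing: converged
            break
        answer = new_answer
        prompt = inner_dialogue_agent(question, answer)  # refine the query
    return answer

# Stub agents for illustration only.
def stub_llma(prompt):
    return "42" if "refined" in prompt else "unsure"

def stub_ida(question, answer):
    return f"refined: {question} (previous answer: {answer})"
```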
llama-stack
Model components of the Llama Stack APIs
Creator: Meta Llama
Stars: 1.5k
Forks: 137
https://github.com/meta-llama/llama-stack
Crawl4AI
Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper
Creator: UncleCode
Stars: 8.6k
Forks: 627
https://github.com/unclecode/crawl4ai
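A minimal usage sketch for Crawl4AI's async crawler. The class and method names (`AsyncWebCrawler.arun`, `result.markdown`) follow the repo's README but may shift between releases, and running it needs the library plus a browser backend installed; only the URL pre-filter below is guaranteed to run anywhere.

```python
# Sketch: crawl4ai's API is assumed from its README and may change.
import asyncio
from urllib.parse import urlparse

def is_crawlable(url):
    """Cheap pre-filter: only fetch http(s) URLs with a hostname."""
    p = urlparse(url)
    return p.scheme in ("http", "https") and bool(p.netloc)

async def to_markdown(url):
    from crawl4ai import AsyncWebCrawler  # assumed import path
    if not is_crawlable(url):
        raise ValueError(f"refusing to crawl {url!r}")
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url)
        return result.markdown  # LLM-ready markdown for the page

if __name__ == "__main__":
    print(asyncio.run(to_markdown("https://example.com")))
```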
NVIDIA silently released a Llama 3.1 70B fine-tune that outperforms GPT-4o and Claude 3.5 Sonnet: Llama 3.1 Nemotron 70B Instruct, a further RLHF-tuned model, now on Hugging Face:
https://huggingface.co/collections/nvidia/llama-31-nemotron-70b-670e93cd366feea16abc13d8
The Zamba2 family includes two models:
# Clone the repo
git clone https://github.com/Zyphra/transformers_zamba2.git
cd transformers_zamba2

# Install the repository & accelerate:
pip install -e .
pip install accelerate

# Inference:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("Zyphra/Zamba2-2.7B-instruct")
model = AutoModelForCausalLM.from_pretrained("Zyphra/Zamba2-2.7B-instruct", device_map="cuda", torch_dtype=torch.bfloat16)

# Build a multi-turn chat and apply the model's chat template
user_turn_1 = "user_prompt1."
assistant_turn_1 = "assistant_prompt."
user_turn_2 = "user_prompt2."
sample = [{'role': 'user', 'content': user_turn_1}, {'role': 'assistant', 'content': assistant_turn_1}, {'role': 'user', 'content': user_turn_2}]
chat_sample = tokenizer.apply_chat_template(sample, tokenize=False)

# Tokenize and generate greedily
input_ids = tokenizer(chat_sample, return_tensors='pt', add_special_tokens=False).to("cuda")
outputs = model.generate(**input_ids, max_new_tokens=150, return_dict_in_generate=False, output_scores=False, use_cache=True, num_beams=1, do_sample=False)
print(tokenizer.decode(outputs[0]))
LLM-based agents for Software Engineering
"Large Language Model-Based Agents for Software Engineering: A Survey".
https://github.com/FudanSELab/Agent4SE-Paper-List
Welcome to Ollama's Prompt Engineering Interactive Tutorial
Github
Forwarded from Machine learning books and papers
NVIDIA BioNeMo2 Framework is a set of tools, libraries, and models for computational drug discovery and design.
@Machine_learn
Forwarded from Machine learning books and papers
Large Language Models Course: Learn by Doing LLM Projects
Github: https://github.com/peremartra/Large-Language-Model-Notebooks-Course
Paper: https://doi.org/10.31219/osf.io/qgxea
Forwarded from Machine learning books and papers
Foundations of Large Language Models (1).pdf
1.9 MB
Foundations of Large Language Models
Table of Contents:
• Pre-training
• Generative Models
• Prompting
• Alignment
Tong Xiao and Jingbo Zhu
January 17, 2025
Download from arXiv.
Evolutionary Computation in the Era of Large Language Model: Survey and Roadmap
Large language models (LLMs) have not only revolutionized natural language processing but also extended their prowess to various domains, marking a significant stride towards artificial general intelligence. Despite differing in objectives and methodologies, LLMs and evolutionary algorithms (EAs) share a common pursuit of applicability to complex problems. Meanwhile, EAs can provide an optimization framework for further enhancing LLMs under black-box settings, empowering LLMs with flexible global search capacities. On the other hand, the abundant domain knowledge inherent in LLMs could enable EAs to conduct more intelligent searches. Furthermore, the text-processing and generative capabilities of LLMs would aid in deploying EAs across a wide range of tasks. Based on these complementary advantages, this paper provides a thorough review and a forward-looking roadmap, categorizing the reciprocal inspiration into two main avenues: LLM-enhanced EA and EA-enhanced #LLM. Some integrated synergy methods are further introduced to exemplify the complementarity between LLMs and EAs in diverse scenarios, including code generation, software engineering, neural architecture search, and various generation tasks. As the first comprehensive review focused on EA research in the era of #LLMs, this paper provides a foundational stepping stone for understanding the collaborative potential of LLMs and EAs. The identified challenges and future directions offer guidance for researchers and practitioners seeking to unlock the full potential of this innovative collaboration in propelling advancements in optimization and artificial intelligence.
Paper: https://arxiv.org/pdf/2401.10034v3.pdf
Code: https://github.com/wuxingyu-ai/llm4ec
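The "LLM-enhanced EA" avenue can be sketched as an ordinary (mu + lambda) evolutionary loop whose mutation operator is a pluggable function, so an LLM could propose candidate edits. The mutator and fitness below are deterministic stubs for illustration, not anything from the surveyed repo.

```python
# Toy LLM-enhanced EA: the "LLM" mutation operator is stubbed with a
# single random bit-flip; a real system would have an LLM propose edits.
import random

def evolve(init_pop, fitness, llm_mutate, generations=20, seed=0):
    rng = random.Random(seed)          # fixed seed for reproducibility
    pop = list(init_pop)
    for _ in range(generations):
        parent = max(pop, key=fitness)                 # select the best
        children = [llm_mutate(parent, rng) for _ in range(4)]
        # keep the top-k of parents plus children (mu + lambda survival)
        pop = sorted(pop + children, key=fitness, reverse=True)[:len(init_pop)]
    return max(pop, key=fitness)

# Stub "LLM" mutator: flip one bit of a bitstring.
def stub_mutate(s, rng):
    i = rng.randrange(len(s))
    return s[:i] + ("1" if s[i] == "0" else "0") + s[i + 1:]

# Maximize the number of ones, starting from mostly-zero strings.
best = evolve(["0000", "0001"], fitness=lambda s: s.count("1"),
              llm_mutate=stub_mutate)
```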
Forwarded from Machine learning books and papers
ChatGPT Cheat Sheet for Business (2025).pdf
8 MB