vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python
Total stars: 268
Stars trend:
20 Jun 2023
12pm ▎ +2
1pm ▍ +3
2pm ▏ +1
3pm +0
4pm +0
5pm +0
6pm +0
7pm ▊ +6
8pm ███▉ +31
9pm ████████ +64
10pm ███████▍ +59
11pm ██████▎ +50
#python
#gpt, #inference, #llm, #llmserving, #llmops, #mlops, #modelserving, #pytorch, #transformer
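For a sense of the API behind this early spike: vLLM's offline batch interface is a few lines of Python. A minimal sketch (the LLM/SamplingParams calls are vLLM's documented API; the model choice is arbitrary):

    from vllm import LLM, SamplingParams

    # Load a model; vLLM manages KV-cache memory with PagedAttention
    # and batches requests continuously for throughput.
    llm = LLM(model="facebook/opt-125m")
    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

    # generate() accepts a list of prompts and returns one result per prompt.
    outputs = llm.generate(["The capital of France is"], params)
    print(outputs[0].outputs[0].text)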
microsoft/aici
AICI: Prompts as (Wasm) Programs
Language: Rust
Total stars: 213
Stars trend:
11 Mar 2024
6pm ██▏ +17
7pm ███▎ +26
8pm ████▊ +38
#rust
#ai, #inference, #languagemodel, #llm, #llmframework, #llminference, #llmserving, #llmops, #modelserving, #rust, #transformer, #wasm, #wasmtime
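AICI's idea is that prompt logic itself runs as a Wasm "controller" inside the server's token loop. A rough sketch in its Python controller flavor (pyctrl); the module and helper names here (pyaici.server, FixedTokens, gen_text, start) are recalled from the repo's samples and should be treated as assumptions, not a confirmed API:

    import pyaici.server as aici  # module name as best recalled; an assumption

    async def main():
        # Controller code executes alongside decoding, compiled to Wasm.
        # Emit fixed tokens, then hand control back to the model.
        await aici.FixedTokens("The word 'hello' in French is")
        # Constrain the continuation with a regex (helper name assumed).
        await aici.gen_text(regex=r' "[^"]+"', max_tokens=5)

    aici.start(main())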
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python
Total stars: 34142
Stars trend:
20 Jan 2025
8pm ▎ +2
9pm ▎ +2
10pm █▏ +9
11pm ▍ +3
21 Jan 2025
12am ▎ +2
1am █ +8
2am ▉ +7
3am ▉ +7
4am █▌ +12
5am ▊ +6
6am █▎ +10
7am █▍ +11
#python
#amd, #cuda, #gpt, #hpu, #inference, #inferentia, #llama, #llm, #llmserving, #llmops, #mlops, #modelserving, #pytorch, #rocm, #tpu, #trainium, #transformer, #xpu
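At this scale vLLM is mostly run as an OpenAI-compatible server rather than embedded in a script. A minimal sketch using the standard openai client (model name and port are illustrative choices):

    # Start the server first (shell):
    #   vllm serve meta-llama/Llama-3.1-8B-Instruct --port 8000
    from openai import OpenAI

    # Point the stock OpenAI client at the local vLLM endpoint.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
    resp = client.chat.completions.create(
        model="meta-llama/Llama-3.1-8B-Instruct",
        messages=[{"role": "user", "content": "Summarize PagedAttention in one line."}],
    )
    print(resp.choices[0].message.content)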
vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python
Total stars: 60724
Stars trend:
22 Oct 2025
2am ▎ +2
3am ▌ +4
4am ▏ +1
5am ▍ +3
6am ▎ +2
7am ▊ +6
8am ▋ +5
9am ▍ +3
10am +0
11am ▍ +3
12pm ▏ +1
1pm ▎ +2
#python
#amd, #blackwell, #cuda, #deepseek, #deepseekv3, #gpt, #gptoss, #inference, #kimi, #llama, #llm, #llmserving, #modelserving, #moe, #openai, #pytorch, #qwen, #qwen3, #tpu, #transformer
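The newer tags (#moe, #deepseekv3, #qwen3) point at models too large for a single GPU; vLLM handles these with tensor parallelism via the tensor_parallel_size constructor argument. A short sketch (real vLLM parameter; model choice is illustrative and assumes 4 GPUs are available):

    from vllm import LLM, SamplingParams

    # Shard the model's weights across 4 GPUs with tensor parallelism.
    llm = LLM(model="Qwen/Qwen3-32B", tensor_parallel_size=4)
    out = llm.generate(["Hello"], SamplingParams(max_tokens=32))
    print(out[0].outputs[0].text)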
vllm-project/vllm-omni
A framework for efficient model inference with omni-modality models
Language: Python
Total stars: 225
Stars trend:
1 Dec 2025
5pm ▏ +1
6pm +0
7pm ▎ +2
8pm ▍ +3
9pm ▎ +2
10pm ▎ +2
11pm +0
2 Dec 2025
12am ▎ +2
1am +0
2am ▋ +5
3am █▏ +9
#python
#audiogeneration, #diffusion, #imagegeneration, #inference, #modelserving, #multimodal, #pytorch, #transformer, #videogeneration
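vllm-omni's own API isn't shown in this digest. For flavor, here is the parent vLLM project's documented multimodal input format, which an omni-modality offshoot presumably builds on; this is a vLLM pattern, not confirmed vllm-omni code:

    from PIL import Image
    from vllm import LLM

    # vLLM's multimodal input: a prompt plus multi_modal_data per request.
    llm = LLM(model="llava-hf/llava-1.5-7b-hf")
    out = llm.generate({
        "prompt": "USER: <image>\nWhat is in this picture? ASSISTANT:",
        "multi_modal_data": {"image": Image.open("photo.jpg")},
    })
    print(out[0].outputs[0].text)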