Committers

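This document lists the current committers of the vLLM project and the core areas they maintain. Committers have write access to the vLLM repository and are responsible for reviewing and merging PRs. You can also refer to the CODEOWNERS file for file-level ownership and reviewer details. Both this document and the CODEOWNERS file are living documents, and they complement each other.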

Active Committers

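We try to summarize each committer's role in vLLM in a few words. In general, vLLM committers cover a wide range of areas and help each other with maintenance. For exact component ownership details, see the "Area Owners" section later in this document. Sorted alphabetically by GitHub handle: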

@22quinn: RL API
@aarnphm: Structured output
@alexm-redhat: Performance
@ApostaC: Connectors, offloading
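@benchislett: Engine core and spec decode
@bigPYJ1151: Intel CPU/XPU integration
@chaunceyjiang: Tool use and reasoning parser
@DarkLight1337: Multimodality, API server
@esmeetu: Developer marketing, community
@gshtras: AMD integration
@heheda12345: Hybrid memory allocator
@hmellor: Hugging Face integration, documentation
@houseroad: Engine core and Llama models
@Isotr0py: Multimodality, new model support
@jeejeelee: LoRA, new model support
@jikunshang: Intel CPU/XPU integration
@khluu: CI infrastructure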
@KuntaiDu: KV Connector
@LucasWilkinson: Kernels and performance
@luccafong: Llama models, speculative decoding, distributed
@markmc: Observability
@mgoin: Quantization and performance
@NickLucche: KV connector
@njhill: Distributed, API server, engine core
@noooop: Pooling models
@patrickvonplaten: Mistral models, new model support
@pavanimajety: NVIDIA GPU integration
@ProExpertProg: Compilation, startup UX
@robertgshaw2-redhat: Core, distributed, disagg
@ruisearch42: Pipeline parallelism, Ray Support
@russellb: Structured output, engine core, security
@sighingnow: Qwen models, new model support
@simon-mo: Project lead, API entrypoint, community
@tdoublep: State space models
@tjtanaa: AMD GPU integration
@tlrmchlsmth: Kernels and performance, distributed, disagg
@WoosukKwon: Project lead, engine core
@yaochengji: TPU integration
@yeqcharlotte: Benchmark, Llama models
@yewentao256: Kernels and performance
@Yikun: Pluggable hardware interface
@youkaichao: Project lead, distributed, compilation, community
@ywang96: Multimodality, benchmark
@zhuohan123: Project lead, RL integration, numerics
@zou3519: Compilation

Emeritus Committers

Committers who contributed significantly to vLLM in the past (thank you!) but are no longer active:

@andoorve: Pipeline parallelism
@cadedaniel: Speculative decoding
@comaniac: KV cache management, pipeline parallelism
@LiuXiaoxuanPKU: Speculative decoding
@pcmoritz: MoE
@rkooo567: Chunked prefill
@sroy745: Speculative decoding
@Yard1: kernels and performance
@zhisbug: Arctic models, distributed

Area Owners

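This section breaks down the active committers by vLLM component and lists the area owners. If you have a PR that touches one of these areas, feel free to ping the area owners for review.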

Engine Core

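Scheduler: The core vLLM engine loop, which schedules requests into the next batch
- @WoosukKwon, @robertgshaw2-redhat, @njhill, @heheda12345
KV Cache Manager: The memory management layer within the scheduler, which maintains the KV cache logical block data
- @heheda12345, @WoosukKwon
AsyncLLM: The zmq-based protocol that hosts the engine core and makes it accessible to entrypoints
- @robertgshaw2-redhat, @njhill, @russellb
ModelRunner, Executor, Worker: The engine abstractions that wrap the model implementation
- @WoosukKwon, @tlrmchlsmth, @heheda12345, @LucasWilkinson, @ProExpertProg
KV Connector: The connector interface and implementations for KV cache offloading and transfer
- @robertgshaw2-redhat, @njhill, @KuntaiDu, @NickLucche, @ApostaC
Distributed, Parallelism, Process Management: The process launchers that manage each worker and assign it to the right DP/TP/PP/EP rank
- @youkaichao, @njhill, @WoosukKwon, @ruisearch42
Collectives: The usage of nccl and other communication libraries/kernels
- @tlrmchlsmth, @youkaichao
Multimodality engine and memory management: Core scheduling and memory management for vision, audio, and video inputs
- @ywang96, @DarkLight1337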
- @ywang96, @DarkLight1337

Model Implementations

Model Interface: The nn.Module interface and implementations for various models
- @zhuohan123, @mgoin, @simon-mo, @houseroad, @ywang96 (multimodality), @jeejeelee (lora)
Logits Processors / Sampler: The provided sampler class and pluggable logits processors
- @njhill, @houseroad, @22quinn
Custom Layers: Utility layers in vLLM such as rotary embedding and rms norms
- @ProExpertProg
Attention: The attention interface for paged attention
- @WoosukKwon, @LucasWilkinson, @heheda12345
FusedMoE: FusedMoE kernel, Modular kernel framework, EPLB
- @tlrmchlsmth
Quantization: Various quantization configs, weight loading, and kernels
- @mgoin, @Isotr0py, @yewentao256
Custom quantized GEMM kernels (cutlass_scaled_mm, marlin, machete)
- @tlrmchlsmth, @LucasWilkinson
Multi-modal Input Processing: Components that load and process image/video/audio data into feature tensors
- @DarkLight1337, @ywang96, @Isotr0py
torch compile: The torch.compile integration in vLLM, including custom passes and transformations
- @ProExpertProg, @zou3519, @youkaichao
State space models: The state space model implementations in vLLM
- @tdoublep, @tlrmchlsmth
Reasoning and tool calling parsers
- @chaunceyjiang, @aarnphm

Entrypoints

LLM Class: The LLM class for offline inference
- @DarkLight1337
API Server: The OpenAI-compatible API server
- @DarkLight1337, @njhill, @aarnphm, @simon-mo, @heheda12345 (Responses API)
Batch Runner: The OpenAI-compatible batch runner
- @simon-mo

Features

Spec Decode: Covers the model definition, attention, sampler, and scheduler work related to n-grams, EAGLE, and MTP
- @WoosukKwon, @benchislett, @luccafong
Structured Output: The structured output implementation
- @russellb, @aarnphm
RL: RL-related features such as collective rpc and sleep mode
- @youkaichao, @zhuohan123, @22quinn
LoRA: @jeejeelee
Observability: Metrics and Logging
- @markmc, @robertgshaw2-redhat, @simon-mo

Code Base

Config: Configuration registration and parsing
- @hmellor
Documentation: @hmellor, @DarkLight1337, @simon-mo
Benchmarks: @ywang96, @simon-mo
CI, Build, Release Process: @khluu, @njhill, @simon-mo
Security: @russellb

External Kernels Integration

FlashAttention: @LucasWilkinson
FlashInfer: @LucasWilkinson, @mgoin, @WoosukKwon
Blackwell Kernels: @mgoin, @yewentao256
DeepEP/DeepGEMM/pplx: @mgoin, @yewentao256

Integrations

Hugging Face: @hmellor, @Isotr0py
Ray: @ruisearch42
NIXL: @robertgshaw2-redhat, @NickLucche

Collaboration with Model Vendors

gpt-oss: @heheda12345, @simon-mo, @zhuohan123
Llama: @luccafong
Qwen: @sighingnow
Mistral: @patrickvonplaten

Hardware

Plugin Interface: @youkaichao, @Yikun
NVIDIA GPU: @pavanimajety
AMD GPU: @gshtras, @tjtanaa
Intel CPU/GPU: @jikunshang, @bigPYJ1151
Google TPU: @yaochengji

Ecosystem Projects

Ascend NPU: @wangxiyuan, and more details
Intel Gaudi HPU: @xuechendi and @kzawora-intel
Semantic Router: @xunzhuo, @rootfs, and more details