Haystack

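Haystack is an end-to-end LLM framework for building applications powered by LLMs, Transformer models, vector search, and more. Whether you want to perform retrieval-augmented generation (RAG), document search, question answering, or answer generation, Haystack can orchestrate state-of-the-art embedding models and LLMs into pipelines to build end-to-end NLP applications for your use case.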

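With Haystack, you can use a large language model (LLM) server deployed with vLLM as the backend, since the vLLM server exposes OpenAI-compatible endpoints.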

Prerequisites

Set up the vLLM and Haystack environment:

pip install vllm haystack-ai

Deployment

Start a vLLM server with a supported chat completion model, for example:

vllm serve mistralai/Mistral-7B-Instruct-v0.1
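Before wiring up Haystack, it can help to confirm that the server is reachable and serving the expected model. The following is a minimal sketch, not part of the original guide: the helper names are hypothetical, and it assumes the server listens on vLLM's default port 8000 on localhost and answers the OpenAI-compatible `/v1/models` endpoint.

```python
import json
import urllib.request


def models_url(host: str, port: int) -> str:
    """Build the URL of the OpenAI-compatible model listing endpoint."""
    return f"http://{host}:{port}/v1/models"


def parse_model_ids(payload: str) -> list[str]:
    """Extract the model ids from a /v1/models JSON response body."""
    return [entry["id"] for entry in json.loads(payload).get("data", [])]


def list_served_models(host: str = "localhost", port: int = 8000) -> list[str]:
    """Query a running server (assumed host/port); raises URLError if unreachable."""
    with urllib.request.urlopen(models_url(host, port), timeout=5) as resp:
        return parse_model_ids(resp.read().decode("utf-8"))
```

Once the server has finished loading, calling `list_served_models()` should return a list containing the model name passed to `vllm serve`.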

Use the OpenAIGenerator and OpenAIChatGenerator components in Haystack to query the vLLM server.
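Of the two components, OpenAIGenerator takes a plain prompt string rather than a list of chat messages. The sketch below shows one way it might be pointed at a vLLM server; `vllm_generator_kwargs` and `build_generator` are hypothetical helper names, and the host, port, and placeholder key are assumptions to adapt to your deployment.

```python
def vllm_generator_kwargs(host: str, port: int, model: str, max_tokens: int = 512) -> dict:
    """Collect the constructor arguments that point a generator at a vLLM server."""
    return {
        "model": model,
        "api_base_url": f"http://{host}:{port}/v1",
        "generation_kwargs": {"max_tokens": max_tokens},
    }


def build_generator(host: str, port: int, model: str):
    """Create an OpenAIGenerator that targets the given vLLM server."""
    # Imported lazily so the helper above can be used without Haystack installed.
    from haystack.components.generators import OpenAIGenerator
    from haystack.utils import Secret

    return OpenAIGenerator(
        # Placeholder key, as in the chat example; assumed to be sufficient here.
        api_key=Secret.from_token("VLLM-PLACEHOLDER-API-KEY"),
        **vllm_generator_kwargs(host, port, model),
    )
```

With a server running, `build_generator("localhost", 8000, "mistralai/Mistral-7B-Instruct-v0.1").run(prompt="...")` should return a dictionary whose `replies` entry is a list of strings.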

Code

from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    # for compatibility with the OpenAI API, a placeholder api_key is needed
    api_key=Secret.from_token("VLLM-PLACEHOLDER-API-KEY"),
    model="mistralai/Mistral-7B-Instruct-v0.1",
    api_base_url="http://{your-vLLM-host-ip}:{your-vLLM-host-port}/v1",
    generation_kwargs={"max_tokens": 512},
)

response = generator.run(
    messages=[ChatMessage.from_user("Hi. Can you help me plan my next trip to Italy?")]
)

print("-"*30)
print(response)
print("-"*30)

------------------------------
{'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=' Of course! Where in Italy would you like to go and what type of trip are you looking to plan?')], _name=None, _meta={'model': 'mistralai/Mistral-7B-Instruct-v0.1', 'index': 0, 'finish_reason': 'stop', 'usage': {'completion_tokens': 23, 'prompt_tokens': 21, 'total_tokens': 44, 'completion_tokens_details': None, 'prompt_tokens_details': None}})]}
------------------------------
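The printed dictionary above is verbose; in practice you usually want only the reply text and perhaps the token usage. The following sketch assumes the `text` and `meta` accessors that ChatMessage provides in recent Haystack 2.x releases. `summarize_response` is a hypothetical helper, and a stand-in object carrying values from the output above replaces a real ChatMessage so the snippet runs without a server.

```python
from types import SimpleNamespace


def summarize_response(response: dict) -> tuple[str, int]:
    """Return the first reply's text and its total token count (0 if unreported)."""
    first = response["replies"][0]
    usage = first.meta.get("usage") or {}
    return (first.text or "").strip(), usage.get("total_tokens", 0)


# Stand-in mimicking the two ChatMessage attributes used above (an assumption,
# not a real ChatMessage); the values are taken from the sample output.
sample_reply = SimpleNamespace(
    text=" Of course! Where in Italy would you like to go and what type of trip are you looking to plan?",
    meta={"usage": {"completion_tokens": 23, "prompt_tokens": 21, "total_tokens": 44}},
)

text, total_tokens = summarize_response({"replies": [sample_reply]})
print(text)
print(total_tokens)
```

With a live server, the same helper could be called as `summarize_response(response)` on the result of `generator.run(...)`.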

For more details, see the tutorial Using vLLM in Haystack.