LlamaEdge

LlamaEdge 允许您与 GGUF 格式的大语言模型进行交互，无论是本地还是通过聊天服务。

LlamaEdgeChatService 为开发者提供了一个与 OpenAI API 兼容的服务，可以通过 HTTP 请求与大语言模型进行交互。
LlamaEdgeChatLocal 使开发者能够在本地与大语言模型进行交互（即将推出）。

LlamaEdgeChatService 和 LlamaEdgeChatLocal 都运行在由 WasmEdge 运行时驱动的基础架构上，该运行时为大语言模型推理任务提供了一个轻量级且可移植的 WebAssembly 容器环境。

通过 API 服务进行聊天

LlamaEdgeChatService 在 llama-api-server 上运行。按照 llama-api-server 快速入门中的步骤，您可以托管自己的 API 服务，以便您可以与任何您喜欢的模型在任何设备上进行交互，只要网络可用即可。

from langchain_community.chat_models.llama_edge import LlamaEdgeChatService
from langchain_core.messages import HumanMessage, SystemMessage

API 参考：LlamaEdgeChatService | HumanMessage | SystemMessage

以非流式模式与大语言模型进行聊天

# service url
service_url = "https://b008-54-186-154-209.ngrok-free.app"

# create wasm-chat service instance
chat = LlamaEdgeChatService(service_url=service_url)

# create message sequence
system_message = SystemMessage(content="You are an AI assistant")
user_message = HumanMessage(content="What is the capital of France?")
messages = [system_message, user_message]

# chat with wasm-chat service
response = chat.invoke(messages)

print(f"[Bot] {response.content}")

[Bot] Hello! The capital of France is Paris.

以流式模式与大语言模型进行聊天

# service url
service_url = "https://b008-54-186-154-209.ngrok-free.app"

# create wasm-chat service instance
chat = LlamaEdgeChatService(service_url=service_url, streaming=True)

# create message sequence
system_message = SystemMessage(content="You are an AI assistant")
user_message = HumanMessage(content="What is the capital of Norway?")
messages = [
    system_message,
    user_message,
]

output = ""
for chunk in chat.stream(messages):
    # print(chunk.content, end="", flush=True)
    output += chunk.content

print(f"[Bot] {output}")

[Bot]   Hello! I'm happy to help you with your question. The capital of Norway is Oslo.

聊天模型概念指南
聊天模型操作指南

LlamaEdge

通过 API 服务进行聊天

以非流式模式与大语言模型进行聊天

以流式模式与大语言模型进行聊天

此页面是否有帮助？

您还可以留下详细的反馈在 GitHub 上.

LlamaEdge

通过 API 服务进行聊天​

以非流式模式与大语言模型进行聊天​

以流式模式与大语言模型进行聊天​

相关​

此页面是否有帮助？

您还可以留下详细的反馈 在 GitHub 上.

通过 API 服务进行聊天

以非流式模式与大语言模型进行聊天

以流式模式与大语言模型进行聊天

相关

您还可以留下详细的反馈在 GitHub 上.