How to add chat history
This guide previously used the RunnableWithMessageHistory abstraction. You can access that version of the documentation in the v0.2 docs.
As of the LangChain v0.3 release, we recommend that LangChain users take advantage of LangGraph persistence to incorporate memory into new LangChain applications.
If your code already relies on RunnableWithMessageHistory or BaseChatMessageHistory, you do not need to make any changes. We do not plan to deprecate this functionality in the near future: it works well for simple chat applications, and any code that uses RunnableWithMessageHistory will continue to work as expected.
Please see How to migrate to LangGraph memory for more details.
In many Q&A applications we want to let the user have a back-and-forth conversation, which means the application needs some form of "memory" of past questions and answers, along with logic for incorporating them into its current thinking.
In this guide we focus on adding logic for incorporating historical messages.
This is largely a condensed version of the Conversational RAG tutorial.
We will cover two approaches:
- Chains, in which we always execute a retrieval step;
- Agents, in which we give the LLM discretion over whether and how to execute a retrieval step (or multiple steps).
For the external knowledge source, we will use the same Lilian Weng "LLM Powered Autonomous Agents" blog post from the RAG tutorial.
Both approaches leverage LangGraph as an orchestration framework. LangGraph implements a built-in persistence layer, making it ideal for chat applications that support multiple conversational turns.
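Concretely, this is what the persistence layer buys us: compile a graph with a checkpointer and pass a thread_id when invoking it, and the message state accumulates per conversation. Below is a minimal, self-contained sketch in which a toy node stands in for the LLM calls used later in this guide (the node and thread ID are illustrative, not part of the original tutorial):
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, MessagesState, StateGraph

# Toy node that reports how many messages the thread has accumulated.
# (Illustrative only; the real graphs below call an LLM here.)
def echo(state: MessagesState):
    return {
        "messages": [
            {"role": "ai", "content": f"{len(state['messages'])} message(s) so far"}
        ]
    }

builder = StateGraph(MessagesState)
builder.add_node("echo", echo)
builder.set_entry_point("echo")
builder.add_edge("echo", END)

# Compiling with a checkpointer persists message state per thread_id,
# so each call within the same thread sees the earlier messages.
graph = builder.compile(checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "demo-thread"}}

graph.invoke({"messages": [{"role": "user", "content": "Hi"}]}, config=config)
result = graph.invoke({"messages": [{"role": "user", "content": "Hi again"}]}, config=config)
print(result["messages"][-1].content)  # -> "3 message(s) so far": the first turn was remembered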
Setup
Dependencies
We will use OpenAI embeddings and an in-memory vector store in this walkthrough, but everything shown here works with any Embeddings, VectorStore, or Retriever.
We will use the following packages:
%%capture --no-stderr
%pip install --upgrade --quiet langgraph langchain-community beautifulsoup4
LangSmith
Many of the applications you build with LangChain will contain multiple steps with multiple invocations of LLM calls. As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent. The best way to do this is with LangSmith.
Note that LangSmith is not needed, but it is helpful. If you do want to use LangSmith, make sure to set your environment variables to start logging traces after you sign up at the link above:
import getpass
import os

os.environ["LANGSMITH_TRACING"] = "true"
if not os.environ.get("LANGSMITH_API_KEY"):
    os.environ["LANGSMITH_API_KEY"] = getpass.getpass()
Components
We will need to select three components from LangChain's suite of integrations:
A chat model
pip install -qU "langchain[google-genai]"
import getpass
import os
if not os.environ.get("GOOGLE_API_KEY"):
os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter API key for Google Gemini: ")
from langchain.chat_models import init_chat_model
llm = init_chat_model("gemini-2.0-flash", model_provider="google_genai")
An embedding model
pip install -qU langchain-openai
import getpass
import os
if not os.environ.get("OPENAI_API_KEY"):
os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
And a vector store
pip install -qU langchain-core
from langchain_core.vectorstores import InMemoryVectorStore
vector_store = InMemoryVectorStore(embeddings)
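Because everything downstream only relies on the VectorStore interface, you could swap in a different store here. As one hedged example (assuming the langchain-chroma package; the collection name and directory are arbitrary):
# pip install -qU langchain-chroma
from langchain_chroma import Chroma

# Persisted, on-disk alternative to InMemoryVectorStore.
vector_store = Chroma(
    collection_name="rag_chat_history_demo",
    embedding_function=embeddings,
    persist_directory="./chroma_db",  # optional: keeps the index between runs
)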
Chains
The RAG tutorial indexed Lilian Weng's "LLM Powered Autonomous Agents" blog post. We repeat that here. Below we load the contents of the page, split it into sub-documents, and embed the documents into our vector store:
import bs4
from langchain import hub
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from typing_extensions import List, TypedDict
# Load and chunk contents of the blog
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
all_splits = text_splitter.split_documents(docs)
# Index chunks
_ = vector_store.add_documents(documents=all_splits)
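As an optional sanity check (not part of the original tutorial), you can query the index directly before building the graph:
# Verify that relevant chunks come back from the index
results = vector_store.similarity_search("What is task decomposition?", k=2)
for doc in results:
    print(doc.metadata, doc.page_content[:100])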
As detailed in Part 2 of the RAG tutorial, we can naturally support a conversational experience by representing the flow of the RAG application as a sequence of messages:
- user input as a HumanMessage;
- vector store queries as an AIMessage with tool calls;
- retrieved documents as a ToolMessage;
- the final response as an AIMessage.
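For illustration, a single turn in this representation could look like the following hand-written sketch (the IDs and contents are made up; in practice these messages are produced by the model and the graph rather than constructed manually):
from langchain_core.messages import AIMessage, HumanMessage, ToolMessage

example_turn = [
    HumanMessage("What is Task Decomposition?"),
    # The model decides to query the vector store via a tool call
    AIMessage(
        "",
        tool_calls=[
            {"name": "retrieve", "args": {"query": "Task Decomposition"}, "id": "call_1"}
        ],
    ),
    # The retrieved documents come back as a ToolMessage
    ToolMessage("Task decomposition can be done by ...", tool_call_id="call_1"),
    # The model then answers using the retrieved context
    AIMessage("Task decomposition is the process of breaking a task into smaller steps."),
]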
We will use tool-calling to facilitate this, which additionally allows the retrieval query to be generated by the LLM. We can build a tool to execute the retrieval step:
from langchain_core.tools import tool
@tool(response_format="content_and_artifact")
def retrieve(query: str):
    """Retrieve information related to a query."""
    retrieved_docs = vector_store.similarity_search(query, k=2)
    serialized = "\n\n".join(
        (f"Source: {doc.metadata}\nContent: {doc.page_content}")
        for doc in retrieved_docs
    )
    return serialized, retrieved_docs
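If you like, you can invoke the tool on its own to see what it returns (the query string here is just an example). With response_format="content_and_artifact", invoking the tool with plain arguments like this should return the serialized content string; the raw Document objects (the "artifact") are surfaced when the tool is executed as part of a tool call in the graph:
# Direct invocation: returns the serialized content string
print(retrieve.invoke({"query": "What is task decomposition?"}))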
We can now build our LangGraph application.
Note that we compile it with a checkpointer to support a back-and-forth conversation. LangGraph comes with a simple in-memory checkpointer, which we use below. See its documentation for more detail, including how to use different persistence backends (e.g., SQLite or Postgres); a brief SQLite sketch also follows the graph below.
from langchain_core.messages import SystemMessage
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import END, MessagesState, StateGraph
from langgraph.prebuilt import ToolNode, tools_condition
# Step 1: Generate an AIMessage that may include a tool-call to be sent.
def query_or_respond(state: MessagesState):
    """Generate tool call for retrieval or respond."""
    llm_with_tools = llm.bind_tools([retrieve])
    response = llm_with_tools.invoke(state["messages"])
    # MessagesState appends messages to state instead of overwriting
    return {"messages": [response]}
# Step 2: Execute the retrieval.
tools = ToolNode([retrieve])
# Step 3: Generate a response using the retrieved content.
def generate(state: MessagesState):
    """Generate answer."""
    # Get generated ToolMessages
    recent_tool_messages = []
    for message in reversed(state["messages"]):
        if message.type == "tool":
            recent_tool_messages.append(message)
        else:
            break
    tool_messages = recent_tool_messages[::-1]

    # Format into prompt
    docs_content = "\n\n".join(doc.content for doc in tool_messages)
    system_message_content = (
        "You are an assistant for question-answering tasks. "
        "Use the following pieces of retrieved context to answer "
        "the question. If you don't know the answer, say that you "
        "don't know. Use three sentences maximum and keep the "
        "answer concise."
        "\n\n"
        f"{docs_content}"
    )
    conversation_messages = [
        message
        for message in state["messages"]
        if message.type in ("human", "system")
        or (message.type == "ai" and not message.tool_calls)
    ]
    prompt = [SystemMessage(system_message_content)] + conversation_messages

    # Run
    response = llm.invoke(prompt)
    return {"messages": [response]}
# Build graph
graph_builder = StateGraph(MessagesState)
graph_builder.add_node(query_or_respond)
graph_builder.add_node(tools)
graph_builder.add_node(generate)
graph_builder.set_entry_point("query_or_respond")
graph_builder.add_conditional_edges(
    "query_or_respond",
    tools_condition,
    {END: END, "tools": "tools"},
)
graph_builder.add_edge("tools", "generate")
graph_builder.add_edge("generate", END)
memory = MemorySaver()
graph = graph_builder.compile(checkpointer=memory)
from IPython.display import Image, display
display(Image(graph.get_graph().draw_mermaid_png()))
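As mentioned above, MemorySaver keeps checkpoints only in process memory. To persist conversations across restarts, you could swap in a different checkpointer at compile time. A hedged sketch using the SQLite backend (assumes the langgraph-checkpoint-sqlite package; the database filename is arbitrary):
# pip install -qU langgraph-checkpoint-sqlite
import sqlite3

from langgraph.checkpoint.sqlite import SqliteSaver

# SqliteSaver persists checkpoints to disk, so conversation threads
# survive process restarts (unlike the in-memory MemorySaver).
conn = sqlite3.connect("checkpoints.sqlite", check_same_thread=False)
graph = graph_builder.compile(checkpointer=SqliteSaver(conn))
The rest of this guide continues with the in-memory checkpointer compiled above.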
Let's test our application.
Note that it responds appropriately to messages that do not require an additional retrieval step:
# Specify an ID for the thread
config = {"configurable": {"thread_id": "abc123"}}
input_message = "Hello"
for step in graph.stream(
    {"messages": [{"role": "user", "content": input_message}]},
    stream_mode="values",
    config=config,
):
    step["messages"][-1].pretty_print()
================================ Human Message =================================
Hello
================================== Ai Message ==================================
Hello! How can I assist you today?
When a search is executed, we can stream the steps to observe the query generation, retrieval, and answer generation:
input_message = "What is Task Decomposition?"
for step in graph.stream(
    {"messages": [{"role": "user", "content": input_message}]},
    stream_mode="values",
    config=config,
):
    step["messages"][-1].pretty_print()
================================ Human Message =================================
What is Task Decomposition?
================================== Ai Message ==================================
Tool Calls:
  retrieve (call_RntwX5GMt531biEE9MqSbgLV)
 Call ID: call_RntwX5GMt531biEE9MqSbgLV
  Args:
    query: Task Decomposition
================================= Tool Message =================================
Name: retrieve
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
================================== Ai Message ==================================
Task Decomposition is the process of breaking down a complicated task into smaller, more manageable steps. It often involves techniques like Chain of Thought (CoT), where the model is prompted to "think step by step," allowing for better handling of complex tasks. This approach enhances model performance and provides insight into the model's reasoning process.
Finally, because we compiled our application with a checkpointer, historical messages are maintained in the state. This allows the model to contextualize user queries:
input_message = "Can you look up some common ways of doing it?"
for step in graph.stream(
    {"messages": [{"role": "user", "content": input_message}]},
    stream_mode="values",
    config=config,
):
    step["messages"][-1].pretty_print()
================================ Human Message =================================
Can you look up some common ways of doing it?
================================== Ai Message ==================================
Tool Calls:
  retrieve (call_kwO5rYPyJ0MftYKoKRFjKpZM)
 Call ID: call_kwO5rYPyJ0MftYKoKRFjKpZM
  Args:
    query: common methods for task decomposition
================================= Tool Message =================================
Name: retrieve
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
================================== Ai Message ==================================
Common ways of Task Decomposition include: (1) using large language models (LLMs) with simple prompts like "Steps for XYZ" or "What are the subgoals for achieving XYZ?"; (2) utilizing task-specific instructions, such as "Write a story outline" for creative tasks; and (3) incorporating human inputs to guide the decomposition process.
Note that we can observe the full sequence of messages sent to the chat model, including tool calls and retrieved context, in the LangSmith trace.
The conversation history can also be inspected via the state of the application:
chat_history = graph.get_state(config).values["messages"]
for message in chat_history:
    message.pretty_print()
================================ Human Message =================================
Hello
================================== Ai Message ==================================
Hello! How can I assist you today?
================================ Human Message =================================
What is Task Decomposition?
================================== Ai Message ==================================
Tool Calls:
  retrieve (call_RntwX5GMt531biEE9MqSbgLV)
 Call ID: call_RntwX5GMt531biEE9MqSbgLV
  Args:
    query: Task Decomposition
================================= Tool Message =================================
Name: retrieve
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
================================== Ai Message ==================================
Task Decomposition is the process of breaking down a complicated task into smaller, more manageable steps. It often involves techniques like Chain of Thought (CoT), where the model is prompted to "think step by step," allowing for better handling of complex tasks. This approach enhances model performance and provides insight into the model's reasoning process.
================================ Human Message =================================
Can you look up some common ways of doing it?
================================== Ai Message ==================================
Tool Calls:
  retrieve (call_kwO5rYPyJ0MftYKoKRFjKpZM)
 Call ID: call_kwO5rYPyJ0MftYKoKRFjKpZM
  Args:
    query: common methods for task decomposition
================================= Tool Message =================================
Name: retrieve
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
================================== Ai Message ==================================
Common ways of Task Decomposition include: (1) using large language models (LLMs) with simple prompts like "Steps for XYZ" or "What are the subgoals for achieving XYZ?"; (2) utilizing task-specific instructions, such as "Write a story outline" for creative tasks; and (3) incorporating human inputs to guide the decomposition process.
Agents
Agents leverage the reasoning capabilities of LLMs to make decisions during execution. Using agents allows you to offload additional discretion over the retrieval process. Although their behavior is less predictable than the chain above, they are able to execute multiple retrieval steps in service of a query, or iterate on a single search.
Below we assemble a minimal RAG agent. Using LangGraph's prebuilt ReAct agent constructor, we can do this in one line.
Check out LangGraph's Agentic RAG tutorial for more advanced formulations.
from langgraph.prebuilt import create_react_agent
agent_executor = create_react_agent(llm, [retrieve], checkpointer=memory)
Let's inspect the graph:
display(Image(agent_executor.get_graph().draw_mermaid_png()))
The key difference from our implementation above is that instead of a final generation step that ends the run, here the tool invocation loops back to the original LLM call. The model can then either answer the question using the retrieved context, or generate another tool call to obtain more information.
Let's test this out. We construct a question that would typically require an iterative sequence of retrieval steps to answer:
config = {"configurable": {"thread_id": "def234"}}
input_message = (
    "What is the standard method for Task Decomposition?\n\n"
    "Once you get the answer, look up common extensions of that method."
)
for event in agent_executor.stream(
    {"messages": [{"role": "user", "content": input_message}]},
    stream_mode="values",
    config=config,
):
    event["messages"][-1].pretty_print()
================================ Human Message =================================
What is the standard method for Task Decomposition?
Once you get the answer, look up common extensions of that method.
================================== Ai Message ==================================
Tool Calls:
  retrieve (call_rxBqio7dxthnMuzjr4AIquSZ)
 Call ID: call_rxBqio7dxthnMuzjr4AIquSZ
  Args:
    query: standard method for Task Decomposition
================================= Tool Message =================================
Name: retrieve
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
================================== Ai Message ==================================
Tool Calls:
  retrieve (call_kmQMRWCKeBdtXdlJi8yZD9CO)
 Call ID: call_kmQMRWCKeBdtXdlJi8yZD9CO
  Args:
    query: common extensions of Task Decomposition methods
================================= Tool Message =================================
Name: retrieve
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote.
Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs.
Source: {'source': 'https://lilianweng.github.io/posts/2023-06-23-agent/'}
Content: Fig. 1. Overview of a LLM-powered autonomous agent system.
Component One: Planning#
A complicated task usually involves many steps. An agent needs to know what they are and plan ahead.
Task Decomposition#
Chain of thought (CoT; Wei et al. 2022) has become a standard prompting technique for enhancing model performance on complex tasks. The model is instructed to “think step by step” to utilize more test-time computation to decompose hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process.
================================== Ai Message ==================================
The standard method for Task Decomposition involves breaking down complex tasks into smaller, manageable steps. Here are the main techniques:
1. **Chain of Thought (CoT)**: This prompting technique encourages a model to "think step by step," allowing it to utilize more computational resources during testing to decompose challenging tasks into simpler parts. CoT not only simplifies tasks but also provides insights into the model's reasoning process.
2. **Simple Prompting**: This can involve straightforward queries like "Steps for XYZ" or "What are the subgoals for achieving XYZ?" to guide the model in identifying the necessary steps.
3. **Task-specific Instructions**: Using specific prompts tailored to the task at hand, such as "Write a story outline" for creative writing, allows for more directed decomposition.
4. **Human Inputs**: Involving human expertise can also aid in breaking down tasks effectively.
### Common Extensions of Task Decomposition Methods
1. **Tree of Thoughts**: This method extends CoT by exploring multiple reasoning possibilities at each step. It decomposes the problem into various thought steps and generates multiple thoughts per step, forming a tree structure. This can utilize search processes like breadth-first search (BFS) or depth-first search (DFS) to evaluate states through classifiers or majority voting.
These extensions build on the basic principles of task decomposition, enhancing the depth and breadth of reasoning applied to complex tasks.
Note that the agent:
- generates a query to search for a standard method of task decomposition;
- upon receiving the answer, generates a second query to search for common extensions of it;
- having received all necessary context, answers the question.
We can inspect the full sequence of steps, along with latency and other metadata, in the LangSmith trace.
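Because the agent was compiled with the same checkpointer (memory), its conversation history is likewise scoped to the thread_id in the config. A small sketch (the new thread ID is made up) for inspecting or isolating that state:
# Inspect the messages stored for the agent's thread ("def234" above)
agent_history = agent_executor.get_state(config).values["messages"]
print(f"{len(agent_history)} messages stored for this thread")

# Using a different thread_id starts a fresh conversation with no history
fresh_config = {"configurable": {"thread_id": "xyz789"}}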
Next steps
We've covered the steps to build a basic conversational Q&A application:
- we used chains to build a predictable application that generates search queries for each user input;
- we used agents to build an application that "decides" when and how to generate search queries.
To explore different types of retrievers and retrieval strategies, visit the retrievers section of the how-to guides.
For a detailed walkthrough of LangChain's conversation memory abstractions, visit the How to add message history (memory) LCEL page.
To learn more about agents, head to the Agents Modules.