Migrating from StuffDocumentsChain

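StuffDocumentsChain combines documents by concatenating them into a single context window. It is a straightforward and effective way to combine documents for question answering, summarization, and other purposes.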
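As a rough illustration of what "stuffing" means, the sketch below joins document texts into one string and places it in a single prompt. It is plain Python rather than LangChain code; the `stuff` helper and the blank-line separator are assumptions made for illustration.

```python
# Hypothetical helper illustrating the idea of "stuffing"; not LangChain code.
def stuff(texts: list[str], separator: str = "\n\n") -> str:
    """Join every document's text into one context string."""
    return separator.join(texts)


texts = ["Apples are red", "Blueberries are blue"]
# A single prompt then carries all of the documents at once.
prompt = "Summarize this content: " + stuff(texts)
print(prompt)
```

Because everything goes into one prompt, this approach only works when the combined text fits in the model's context window.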

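create_stuff_documents_chain is the recommended alternative. It functions the same as StuffDocumentsChain, with better support for streaming and batch functionality. Because it is a simple combination of LCEL primitives, it is also easier to extend and incorporate into other LangChain applications.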

Below we walk through both StuffDocumentsChain and create_stuff_documents_chain on a simple example.

Let's first load a chat model:

pip install -qU "langchain[google-genai]"

import getpass
import os

if not os.environ.get("GOOGLE_API_KEY"):
    os.environ["GOOGLE_API_KEY"] = getpass.getpass("Enter API key for Google Gemini: ")

from langchain.chat_models import init_chat_model

llm = init_chat_model("gemini-2.0-flash", model_provider="google_genai")

Example

Let's go through an example where we analyze a set of documents. We first generate some simple documents for illustrative purposes:

from langchain_core.documents import Document

documents = [
    Document(page_content="Apples are red", metadata={"title": "apple_book"}),
    Document(page_content="Blueberries are blue", metadata={"title": "blueberry_book"}),
    Document(page_content="Bananas are yelow", metadata={"title": "banana_book"}),
]

API Reference: Document

Legacy

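Below we show an implementation with StuffDocumentsChain. We define a prompt template for a summarization task and instantiate an LLMChain object with it. We also define how documents are formatted into the prompt and make sure the keys are consistent across the prompts.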

from langchain.chains import LLMChain, StuffDocumentsChain
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate

# This controls how each document will be formatted. Specifically,
# it will be passed to `format_document` - see that function for more
# details.
document_prompt = PromptTemplate(
    input_variables=["page_content"], template="{page_content}"
)
document_variable_name = "context"
# The prompt here should take as an input variable the
# `document_variable_name`
prompt = ChatPromptTemplate.from_template("Summarize this content: {context}")

llm_chain = LLMChain(llm=llm, prompt=prompt)
chain = StuffDocumentsChain(
    llm_chain=llm_chain,
    document_prompt=document_prompt,
    document_variable_name=document_variable_name,
)

API Reference: LLMChain | StuffDocumentsChain | ChatPromptTemplate | PromptTemplate

We can now invoke our chain:

result = chain.invoke(documents)
result["output_text"]

'This content describes the colors of different fruits: apples are red, blueberries are blue, and bananas are yellow.'

for chunk in chain.stream(documents):
    print(chunk)

{'input_documents': [Document(metadata={'title': 'apple_book'}, page_content='Apples are red'), Document(metadata={'title': 'blueberry_book'}, page_content='Blueberries are blue'), Document(metadata={'title': 'banana_book'}, page_content='Bananas are yelow')], 'output_text': 'This content describes the colors of different fruits: apples are red, blueberries are blue, and bananas are yellow.'}
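The roles of `document_prompt` and `document_variable_name` can be sketched in plain Python. This is an illustration under assumptions, not the library's code: `SimpleDoc`, `format_doc`, and `build_prompt` are hypothetical names, and the template that includes the title is an invented variation on the `{page_content}` template used above.

```python
from dataclasses import dataclass, field


@dataclass
class SimpleDoc:
    """Hypothetical stand-in for a document with text and metadata."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_doc(doc: SimpleDoc, document_template: str) -> str:
    # Each document is rendered on its own; metadata keys are available too.
    return document_template.format(page_content=doc.page_content, **doc.metadata)


def build_prompt(docs, document_template, prompt_template, variable_name):
    # The variable name must match the placeholder in the final prompt.
    if "{" + variable_name + "}" not in prompt_template:
        raise ValueError(f"prompt has no {{{variable_name}}} placeholder")
    context = "\n\n".join(format_doc(d, document_template) for d in docs)
    return prompt_template.format(**{variable_name: context})


docs = [
    SimpleDoc("Apples are red", {"title": "apple_book"}),
    SimpleDoc("Blueberries are blue", {"title": "blueberry_book"}),
]
final_prompt = build_prompt(
    docs,
    document_template="[{title}] {page_content}",
    prompt_template="Summarize this content: {context}",
    variable_name="context",
)
print(final_prompt)
```

The explicit check mirrors the requirement that the keys stay consistent: if the variable name and the prompt placeholder disagree, the formatted documents have nowhere to go.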

LCEL

Below we show an implementation using create_stuff_documents_chain:

from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("Summarize this content: {context}")
chain = create_stuff_documents_chain(llm, prompt)

API Reference: create_stuff_documents_chain | ChatPromptTemplate

Invoking the chain, we obtain a result similar to the one before:

result = chain.invoke({"context": documents})
result

'This content describes the colors of different fruits: apples are red, blueberries are blue, and bananas are yellow.'

Note that this implementation supports streaming of output tokens:

for chunk in chain.stream({"context": documents}):
    print(chunk, end=" | ")

 | This |  content |  describes |  the |  colors |  of |  different |  fruits | : |  apples |  are |  red | , |  blue | berries |  are |  blue | , |  and |  bananas |  are |  yellow | . |  |
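When consuming a stream, the chunks are often accumulated as well as displayed. The sketch below uses a hypothetical `fake_stream` generator that replays the chunks printed above in place of `chain.stream(...)`, so it runs without a model; only the accumulation pattern is the point.

```python
from typing import Iterator

# Chunks copied from the streamed output shown above.
CHUNKS = [
    "", "This", " content", " describes", " the", " colors", " of",
    " different", " fruits", ":", " apples", " are", " red", ",", " blue",
    "berries", " are", " blue", ",", " and", " bananas", " are", " yellow",
    ".", "",
]


def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for chain.stream({"context": documents})."""
    yield from CHUNKS


pieces = []
for chunk in fake_stream():
    pieces.append(chunk)  # a UI could render each chunk as it arrives

# Joining the chunks reconstructs the complete response.
full_text = "".join(pieces)
print(full_text)
```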

Next steps

Check out the LCEL conceptual docs for more background information.

See these how-to guides for more on question-answering tasks with RAG.

See this tutorial for more LLM-based summarization strategies.
