ChatOpenAI
This notebook provides a quick overview for getting started with OpenAI chat models. For detailed documentation of all ChatOpenAI features and configurations, head to the API reference.
OpenAI has several chat models. You can find information about their latest models and their costs, context windows, and supported input types in the OpenAI docs.
Note that certain OpenAI models can also be accessed via the Microsoft Azure platform. To use the Azure OpenAI service, use the AzureChatOpenAI integration.
Overview
Integration details
Class | Package | Local | Serializable | JS support |
---|---|---|---|---|
ChatOpenAI | langchain-openai | ❌ | beta | ✅ |
Model features
Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Native async | Token usage | Logprobs |
---|---|---|---|---|---|---|---|---|---|
✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ |
Setup
To access OpenAI models you'll need to create an OpenAI account, get an API key, and install the `langchain-openai` integration package.
Credentials
Head to https://platform.openai.com to sign up for OpenAI and generate an API key. Once you've done this, set the OPENAI_API_KEY environment variable:
import getpass
import os
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")
If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting below:
# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"
Installation
The LangChain OpenAI integration lives in the `langchain-openai` package:
%pip install -qU langchain-openai
Instantiation
Now we can instantiate our model object and generate chat completions:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
    # api_key="...",  # if you prefer to pass api key in directly instead of using env vars
    # base_url="...",
    # organization="...",
    # other params...
)
Invocation
messages = [
    (
        "system",
        "You are a helpful assistant that translates English to French. Translate the user sentence.",
    ),
    ("human", "I love programming."),
]
ai_msg = llm.invoke(messages)
ai_msg
AIMessage(content="J'adore la programmation.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 31, 'total_tokens': 36}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'stop', 'logprobs': None}, id='run-63219b22-03e3-4561-8cc4-78b7c7c3a3ca-0', usage_metadata={'input_tokens': 31, 'output_tokens': 5, 'total_tokens': 36})
print(ai_msg.content)
J'adore la programmation.
Chaining
We can chain our model with a prompt template like so:
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant that translates {input_language} to {output_language}.",
        ),
        ("human", "{input}"),
    ]
)

chain = prompt | llm
chain.invoke(
    {
        "input_language": "English",
        "output_language": "German",
        "input": "I love programming.",
    }
)
AIMessage(content='Ich liebe das Programmieren.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 26, 'total_tokens': 32}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'stop', 'logprobs': None}, id='run-350585e1-16ca-4dad-9460-3d9e7e49aaf1-0', usage_metadata={'input_tokens': 26, 'output_tokens': 6, 'total_tokens': 32})
Tool calling
OpenAI has a tool calling API (we use "tool calling" and "function calling" interchangeably here) that lets you describe tools and their arguments, and have the model return a JSON object with a tool to invoke and the inputs to that tool. Tool calling is extremely useful for building tool-using chains and agents, and for getting structured outputs from models more generally.
ChatOpenAI.bind_tools()
With `ChatOpenAI.bind_tools`, we can easily pass in Pydantic classes, dict schemas, LangChain tools, or even functions as tools to the model. Under the hood these are converted to an OpenAI tool schema, which looks like:
{
    "name": "...",
    "description": "...",
    "parameters": {...}  # JSONSchema
}
and passed in every model invocation.
from pydantic import BaseModel, Field
class GetWeather(BaseModel):
    """Get the current weather in a given location"""

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")


llm_with_tools = llm.bind_tools([GetWeather])

ai_msg = llm_with_tools.invoke(
    "what is the weather like in San Francisco",
)
ai_msg
AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_o9udf3EVOWiV4Iupktpbpofk', 'function': {'arguments': '{"location":"San Francisco, CA"}', 'name': 'GetWeather'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 68, 'total_tokens': 85}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-1617c9b2-dda5-4120-996b-0333ed5992e2-0', tool_calls=[{'name': 'GetWeather', 'args': {'location': 'San Francisco, CA'}, 'id': 'call_o9udf3EVOWiV4Iupktpbpofk', 'type': 'tool_call'}], usage_metadata={'input_tokens': 68, 'output_tokens': 17, 'total_tokens': 85})
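Dict schemas, LangChain tools, and plain functions can be bound the same way. For instance, a function whose signature and Google-style docstring define the schema (a sketch that mirrors GetWeather above):
def get_weather(location: str) -> str:
    """Get the current weather in a given location.

    Args:
        location: The city and state, e.g. San Francisco, CA
    """
    return "It's sunny."


llm_with_tools = llm.bind_tools([get_weather])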
strict=True
Requires langchain-openai>=0.1.21
As of Aug 6, 2024, OpenAI supports a `strict` argument when calling tools, which will enforce that the model respects the tool argument schema. See more here: https://platform.openai.com/docs/guides/function-calling
Note: If `strict=True`, the tool definition will also be validated, and only a subset of JSON schema is accepted. Crucially, the schema cannot have optional arguments (those with default values). Read the full docs on which schema types are supported here: https://platform.openai.com/docs/guides/structured-outputs/supported-schemas.
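For illustration, a hypothetical schema like the one below would fail strict validation, because the default value makes `unit` optional:
from pydantic import BaseModel, Field


class GetWeatherLoose(BaseModel):
    """Get the current weather in a given location"""

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")
    unit: str = "fahrenheit"  # optional arg (has a default) -- not allowed when strict=True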
llm_with_tools = llm.bind_tools([GetWeather], strict=True)
ai_msg = llm_with_tools.invoke(
    "what is the weather like in San Francisco",
)
ai_msg
AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_jUqhd8wzAIzInTJl72Rla8ht', 'function': {'arguments': '{"location":"San Francisco, CA"}', 'name': 'GetWeather'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 68, 'total_tokens': 85}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_3aa7262c27', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-5e3356a9-132d-4623-8e73-dd5a898cf4a6-0', tool_calls=[{'name': 'GetWeather', 'args': {'location': 'San Francisco, CA'}, 'id': 'call_jUqhd8wzAIzInTJl72Rla8ht', 'type': 'tool_call'}], usage_metadata={'input_tokens': 68, 'output_tokens': 17, 'total_tokens': 85})
AIMessage.tool_calls
Notice that the AIMessage has a `tool_calls` attribute. This contains tool calls in a standardized ToolCall format that is model-provider agnostic.
ai_msg.tool_calls
[{'name': 'GetWeather',
'args': {'location': 'San Francisco, CA'},
'id': 'call_jUqhd8wzAIzInTJl72Rla8ht',
'type': 'tool_call'}]
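To complete the loop, you can run the tool yourself and pass its result back as a `ToolMessage` (a minimal sketch; the weather string is illustrative):
from langchain_core.messages import HumanMessage, ToolMessage

tool_call = ai_msg.tool_calls[0]
messages = [
    HumanMessage("what is the weather like in San Francisco"),
    ai_msg,  # the AIMessage carrying the GetWeather tool call
    ToolMessage("It's sunny.", tool_call_id=tool_call["id"]),
]
final_response = llm_with_tools.invoke(messages)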
For more on binding tools and tool call outputs, head to the tool calling docs.
Structured output and tool calls
OpenAI's structured output feature can be used simultaneously with tool calling. The model will either generate tool calls or a response adhering to the desired schema. See the example below:
from langchain_openai import ChatOpenAI
from pydantic import BaseModel
def get_weather(location: str) -> str:
    """Get weather at a location."""
    return "It's sunny."


class OutputSchema(BaseModel):
    """Schema for response."""

    answer: str
    justification: str


llm = ChatOpenAI(model="gpt-4.1")

structured_llm = llm.bind_tools(
    [get_weather],
    response_format=OutputSchema,
    strict=True,
)

# Response contains tool calls:
tool_call_response = structured_llm.invoke("What is the weather in SF?")

# structured_response.additional_kwargs["parsed"] contains parsed output
structured_response = structured_llm.invoke(
    "What weighs more, a pound of feathers or a pound of gold?"
)
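As the comment above notes, the parsed output lives in `additional_kwargs`; a sketch of reading it, assuming it is returned as an `OutputSchema` instance:
parsed = structured_response.additional_kwargs["parsed"]
print(parsed.answer)
print(parsed.justification)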
Responses API
Requires langchain-openai>=0.3.9
OpenAI supports a Responses API that is oriented toward building agentic applications. It includes a suite of built-in tools, such as web and file search. It also supports management of conversation state, allowing you to continue a conversational thread without explicitly passing in previous messages, as well as output from reasoning processes.
`ChatOpenAI` will route to the Responses API if one of these features is used. You can also specify `use_responses_api=True` when instantiating `ChatOpenAI`.
`langchain-openai >= 0.3.26` allows users to opt in to an updated AIMessage format when using the Responses API. Setting
llm = ChatOpenAI(model="...", output_version="responses/v1")
will format output from reasoning summaries, built-in tool invocations, and other response items into the message's `content` field, rather than `additional_kwargs`. We recommend this format for new applications.
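You can also opt in to the Responses API explicitly when instantiating the model (a minimal sketch; the model name is illustrative):
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4.1-mini",
    use_responses_api=True,
    output_version="responses/v1",
)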
Web search
To trigger a web search, pass `{"type": "web_search_preview"}` to the model as you would another tool.
You can also pass built-in tools as invocation params:
llm.invoke("...", tools=[{"type": "web_search_preview"}])
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4.1-mini", output_version="responses/v1")
tool = {"type": "web_search_preview"}
llm_with_tools = llm.bind_tools([tool])
response = llm_with_tools.invoke("What was a positive news story from today?")
Note that the response includes structured content blocks containing both the text of the response and OpenAI annotations citing its sources. The output message will also contain information from any tool invocations.
response.content
[{'id': 'ws_685d997c1838819e8a2cbf66059ddd5c0f6f330a19127ac1',
'action': {'query': 'positive news stories today', 'type': 'search'},
'status': 'completed',
'type': 'web_search_call'},
{'type': 'text',
'text': "On June 25, 2025, the James Webb Space Telescope made a groundbreaking discovery by directly imaging a previously unknown exoplanet. This young gas giant, approximately the size of Saturn, orbits a star smaller than our Sun, located about 110 light-years away in the constellation Antlia. This achievement marks the first time Webb has identified an exoplanet not previously known, expanding our understanding of distant worlds. ([straitstimes.com](https://www.straitstimes.com/world/while-you-were-sleeping-5-stories-you-might-have-missed-june-26-2025?utm_source=openai))\n\nAdditionally, in the realm of conservation, a significant milestone was achieved with the successful translocation of seventy southern white rhinos from South Africa to Rwanda's Akagera National Park. This initiative represents the first international translocation from Platinum Rhino, a major captive breeding operation, and is seen as a substantial opportunity to safeguard the future of the white rhino species. ([conservationoptimism.org](https://conservationoptimism.org/7-stories-of-optimism-this-week-17-06-25-23-06-25/?utm_source=openai))\n\nThese developments highlight positive strides in both scientific exploration and wildlife conservation efforts. ",
'annotations': [{'end_index': 572,
'start_index': 429,
'title': 'While You Were Sleeping: 5 stories you might have missed, June 26, 2025 | The Straits Times',
'type': 'url_citation',
'url': 'https://www.straitstimes.com/world/while-you-were-sleeping-5-stories-you-might-have-missed-june-26-2025?utm_source=openai'},
{'end_index': 1121,
'start_index': 990,
'title': '7 stories of optimism this week (17.06.25-23.06.25) - Conservation Optimism',
'type': 'url_citation',
'url': 'https://conservationoptimism.org/7-stories-of-optimism-this-week-17-06-25-23-06-25/?utm_source=openai'}],
'id': 'msg_685d997f6b94819e8d981a2b441470420f6f330a19127ac1'}]
You can recover just the text content of the response as a string by using `response.text()`. For example, to stream response text:
for token in llm_with_tools.stream("..."):
    print(token.text(), end="|")
See the streaming guide for more detail.
Image generation
Requires langchain-openai>=0.3.19
To trigger an image generation, pass `{"type": "image_generation"}` to the model as you would another tool.
You can also pass built-in tools as invocation params:
llm.invoke("...", tools=[{"type": "image_generation"}])
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4.1-mini", output_version="responses/v1")
tool = {"type": "image_generation", "quality": "low"}
llm_with_tools = llm.bind_tools([tool])
ai_message = llm_with_tools.invoke(
    "Draw a picture of a cute fuzzy cat with an umbrella"
)

import base64

from IPython.display import Image

image = next(
    item for item in ai_message.content if item["type"] == "image_generation_call"
)
Image(base64.b64decode(image["result"]), width=200)
File search
To trigger a file search, pass a file search tool to the model as you would another tool. You will need to populate an OpenAI-managed vector store and include the vector store ID in the tool definition. See OpenAI documentation for more detail.
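If you don't yet have a vector store, one way to create and populate one is with the official `openai` SDK (a sketch; the store name and file path are illustrative, and on older SDK versions these methods live under `client.beta.vector_stores`):
from openai import OpenAI

client = OpenAI()

# Create an OpenAI-managed vector store and upload one file to it
vector_store = client.vector_stores.create(name="example-store")
with open("/path/to/my-file.pdf", "rb") as f:
    client.vector_stores.files.upload_and_poll(
        vector_store_id=vector_store.id, file=f
    )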
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4.1-mini", output_version="responses/v1")
openai_vector_store_ids = [
    "vs_...",  # your IDs here
]
tool = {
    "type": "file_search",
    "vector_store_ids": openai_vector_store_ids,
}
llm_with_tools = llm.bind_tools([tool])
response = llm_with_tools.invoke("What is deep research by OpenAI?")
print(response.text())
Deep Research by OpenAI is a newly launched agentic capability within ChatGPT designed to conduct complex, multi-step research tasks on the internet autonomously. It synthesizes large amounts of online information into comprehensive, research analyst-level reports, accomplishing in tens of minutes what would typically take a human many hours. This capability is powered by an upcoming OpenAI o3 model that is optimized for web browsing and data analysis, allowing it to search, interpret, and analyze massive amounts of text, images, and PDFs from the internet, while dynamically adjusting its approach based on the information it finds.
Key features of Deep Research include:
- Independent discovery, reasoning, and consolidation of insights from across the web.
- Ability to use browser and Python programming tools for data analysis and graph plotting.
- Full documentation of outputs with clear citations and a summary of its reasoning process, making it easy to verify and reference.
- Designed to provide thorough, precise, and reliable research especially useful for knowledge-intensive domains such as finance, science, policy, and engineering. It is also valuable for individuals seeking personalized and detailed product research.
It uses reinforcement learning techniques to plan and execute multi-step information-gathering tasks, reacting to real-time information by backtracking or pivoting its search when necessary. Deep Research can browse the open web and user-uploaded files, integrates visual data such as images and graphs into its reports, and cites specific source passages to support its conclusions.
The goal behind Deep Research is to enhance knowledge synthesis, which is essential for creating new knowledge, marking a significant step toward the development of Artificial General Intelligence (AGI) capable of producing novel scientific research.
Users can access Deep Research via ChatGPT by selecting the "deep research" option in the message composer, entering their query, and optionally attaching files or spreadsheets. The research process can take from 5 to 30 minutes, during which users can continue with other tasks. The final output is delivered as a richly detailed and well-documented report within the chat interface.
Currently, Deep Research is available to Pro users with plans to expand access further to Plus, Team, and Enterprise users. It currently supports research using open web sources and uploaded files, with future plans to connect to specialized subscription or internal data sources for even more robust research outputs.
Though powerful, Deep Research has limitations such as occasional hallucinations, difficulty distinguishing authoritative information from rumors, and some formatting or citation issues at launch, which are expected to improve with usage and time.
In summary, Deep Research is a highly advanced AI research assistant capable of automating extensive, in-depth knowledge work by synthesizing vast amounts of online data into comprehensive, credible reports, designed to save users significant time and effort on complex research tasks.
As with web search, the response will include content blocks with citations:
[block["type"] for block in response.content]
['file_search_call', 'text']
text_block = next(block for block in response.content if block["type"] == "text")
text_block["annotations"][:2]
[{'file_id': 'file-3UzgX7jcC8Dt9ZAFzywg5k',
'filename': 'deep_research_blog.pdf',
'index': 3121,
'type': 'file_citation'},
{'file_id': 'file-3UzgX7jcC8Dt9ZAFzywg5k',
'filename': 'deep_research_blog.pdf',
'index': 3121,
'type': 'file_citation'}]
It will also include information from the built-in tool invocation:
response.content[0]
{'id': 'fs_685d9e7d48408191b9e34ad359069ede019138cfaaf3cea8',
'queries': ['deep research by OpenAI'],
'status': 'completed',
'type': 'file_search_call'}
Computer use
`ChatOpenAI` supports the `"computer-use-preview"` model, a specialized model for the built-in computer use tool. To enable, pass a computer use tool as you would pass another tool.
Currently, tool outputs for computer use are present in a message's `content` field. To reply to a computer use tool call, construct a `ToolMessage` with `{"type": "computer_call_output"}` in its `additional_kwargs`. The content of the message will be a screenshot. Below, we demonstrate a simple example.
First, load two screenshots:
import base64
def load_png_as_base64(file_path):
    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read())
    return encoded_string.decode("utf-8")


screenshot_1_base64 = load_png_as_base64(
    "/path/to/screenshot_1.png"
)  # perhaps a screenshot of an application
screenshot_2_base64 = load_png_as_base64(
    "/path/to/screenshot_2.png"
)  # perhaps a screenshot of the Desktop
from langchain_openai import ChatOpenAI
# Initialize model
llm = ChatOpenAI(
    model="computer-use-preview",
    truncation="auto",
    output_version="responses/v1",
)

# Bind computer-use tool
tool = {
    "type": "computer_use_preview",
    "display_width": 1024,
    "display_height": 768,
    "environment": "browser",
}
llm_with_tools = llm.bind_tools([tool])

# Construct input message
input_message = {
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": (
                "Click the red X to close and reveal my Desktop. "
                "Proceed, no confirmation needed."
            ),
        },
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_1_base64}",
        },
    ],
}

# Invoke model
response = llm_with_tools.invoke(
    [input_message],
    reasoning={
        "generate_summary": "concise",
    },
)
The response will include a call to the computer use tool in its `content`:
response.content
[{'id': 'rs_685da051742c81a1bb35ce46a9f3f53406b50b8696b0f590',
'summary': [{'text': "Clicking red 'X' to show desktop",
'type': 'summary_text'}],
'type': 'reasoning'},
{'id': 'cu_685da054302481a1b2cc43b56e0b381706b50b8696b0f590',
'action': {'button': 'left', 'type': 'click', 'x': 14, 'y': 38},
'call_id': 'call_zmQerFBh4PbBE8mQoQHkfkwy',
'pending_safety_checks': [],
'status': 'completed',
'type': 'computer_call'}]
We next construct a ToolMessage that:
- Has a `tool_call_id` matching the `call_id` from the computer call.
- Includes `{"type": "computer_call_output"}` in its `additional_kwargs`.
- Has content that is either an `image_url` or an `input_image` output block (see OpenAI docs for formatting).
from langchain_core.messages import ToolMessage
tool_call_id = next(
    item["call_id"] for item in response.content if item["type"] == "computer_call"
)

tool_message = ToolMessage(
    content=[
        {
            "type": "input_image",
            "image_url": f"data:image/png;base64,{screenshot_2_base64}",
        }
    ],
    # content=f"data:image/png;base64,{screenshot_2_base64}",  # <-- also acceptable
    tool_call_id=tool_call_id,
    additional_kwargs={"type": "computer_call_output"},
)
We can now invoke the model again using the message history:
messages = [
    input_message,
    response,
    tool_message,
]

response_2 = llm_with_tools.invoke(
    messages,
    reasoning={
        "generate_summary": "concise",
    },
)
response_2.text()
'VS Code has been closed, and the desktop is now visible.'
In addition to passing back the entire sequence, we can also use the previous_response_id:
previous_response_id = response.response_metadata["id"]

response_2 = llm_with_tools.invoke(
    [tool_message],
    previous_response_id=previous_response_id,
    reasoning={
        "generate_summary": "concise",
    },
)
response_2.text()
'The VS Code window is closed, and the desktop is now visible. Let me know if you need any further assistance.'
Code interpreter
OpenAI implements a code interpreter tool to support the sandboxed generation and execution of code.
Example use:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="o4-mini", output_version="responses/v1")
llm_with_tools = llm.bind_tools(
    [
        {
            "type": "code_interpreter",
            # Create a new container
            "container": {"type": "auto"},
        }
    ]
)
response = llm_with_tools.invoke(
    "Write and run code to answer the question: what is 3^3?"
)
Note that the above command created a new container. We can also specify an existing container ID:
code_interpreter_calls = [
    item for item in response.content if item["type"] == "code_interpreter_call"
]
assert len(code_interpreter_calls) == 1
container_id = code_interpreter_calls[0]["container_id"]

llm_with_tools = llm.bind_tools(
    [
        {
            "type": "code_interpreter",
            # Use an existing container
            "container": container_id,
        }
    ]
)
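The `code_interpreter_call` blocks also carry the generated code, so you can inspect what was executed alongside the final answer (a sketch; field names follow the Responses API item format, so availability may vary by version):
# The code the model wrote and ran in the container
print(code_interpreter_calls[0]["code"])
# The model's final text answer
print(response.text())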
Remote MCP
OpenAI implements a remote MCP tool that allows for model-generated calls to MCP servers.
Example use:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="o4-mini", output_version="responses/v1")
llm_with_tools = llm.bind_tools(
    [
        {
            "type": "mcp",
            "server_label": "deepwiki",
            "server_url": "https://mcp.deepwiki.com/mcp",
            "require_approval": "never",
        }
    ]
)
response = llm_with_tools.invoke(
    "What transport protocols does the 2025-03-26 version of the MCP "
    "spec (modelcontextprotocol/modelcontextprotocol) support?"
)
MCP approvals
OpenAI will at times request approval before sharing data with a remote MCP server.
In the above command, we instructed the model to never require approval. We can also configure the model to always request approval, or to do so only for specific tools:
llm_with_tools = llm.bind_tools(
    [
        {
            "type": "mcp",
            "server_label": "deepwiki",
            "server_url": "https://mcp.deepwiki.com/mcp",
            "require_approval": {
                "always": {
                    "tool_names": ["read_wiki_structure"],
                }
            },
        }
    ]
)
response = llm_with_tools.invoke(
    "What transport protocols does the 2025-03-26 version of the MCP "
    "spec (modelcontextprotocol/modelcontextprotocol) support?"
)
Responses may then include blocks with type `"mcp_approval_request"`.
To submit approvals for an approval request, structure it into a content block in an input message:
approval_message = {
    "role": "user",
    "content": [
        {
            "type": "mcp_approval_response",
            "approve": True,
            "approval_request_id": block["id"],
        }
        for block in response.content
        if block["type"] == "mcp_approval_request"
    ],
}

next_response = llm_with_tools.invoke(
    [approval_message],
    # continue existing thread
    previous_response_id=response.response_metadata["id"],
)
Managing conversation state
The Responses API supports management of conversation state.
Manually manage state
You can manage state manually or use LangGraph, as with other chat models:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4.1-mini", output_version="responses/v1")
tool = {"type": "web_search_preview"}
llm_with_tools = llm.bind_tools([tool])
first_query = "What was a positive news story from today?"
messages = [{"role": "user", "content": first_query}]
response = llm_with_tools.invoke(messages)
response_text = response.text()
print(f"{response_text[:100]}... {response_text[-100:]}")
On June 25, 2025, the James Webb Space Telescope made a groundbreaking discovery by directly imaging... exploration and environmental conservation, reflecting positive developments in science and nature.
second_query = (
    "Repeat my question back to me, as well as the last sentence of your answer."
)
messages.extend(
    [
        response,
        {"role": "user", "content": second_query},
    ]
)
second_response = llm_with_tools.invoke(messages)
print(second_response.text())
Your question was: "What was a positive news story from today?"
The last sentence of my answer was: "These stories highlight significant advancements in both space exploration and environmental conservation, reflecting positive developments in science and nature."
Passing previous_response_id
When using the Responses API, LangChain messages will include an `"id"` field in their metadata. Passing this ID to subsequent invocations will continue the conversation. Note that this is equivalent to manually passing in messages from a billing perspective.
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="gpt-4.1-mini",
    output_version="responses/v1",
)
response = llm.invoke("Hi, I'm Bob.")
print(response.text())
Hi Bob! How can I assist you today?
second_response = llm.invoke(
    "What is my name?",
    previous_response_id=response.response_metadata["id"],
)
print(second_response.text())
You mentioned that your name is Bob. How can I help you today, Bob?
ChatOpenAI can also automatically specify `previous_response_id` using the most recent response in a message sequence:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="gpt-4.1-mini",
    output_version="responses/v1",
    use_previous_response_id=True,
)
If we set `use_previous_response_id=True`, input messages up to the most recent response will be dropped from the request payload, and `previous_response_id` will be set using the ID of the most recent response.
That is,
from langchain_core.messages import AIMessage, HumanMessage

llm.invoke(
    [
        HumanMessage("Hello"),
        AIMessage("Hi there!", response_metadata={"id": "resp_123"}),
        HumanMessage("How are you?"),
    ]
)
is equivalent to:
llm.invoke([HumanMessage("How are you?")], previous_response_id="resp_123")
Reasoning output
Some OpenAI models will generate separate text content illustrating their reasoning process. See OpenAI's reasoning documentation for details.
OpenAI can return a summary of the model's reasoning (although it does not expose the raw reasoning tokens). To configure `ChatOpenAI` to return this summary, specify the `reasoning` parameter. `ChatOpenAI` will automatically route to the Responses API if this parameter is set.
from langchain_openai import ChatOpenAI
reasoning = {
    "effort": "medium",  # 'low', 'medium', or 'high'
    "summary": "auto",  # 'detailed', 'auto', or None
}
llm = ChatOpenAI(model="o4-mini", reasoning=reasoning, output_version="responses/v1")
response = llm.invoke("What is 3^3?")
# Output
response.text()
'3³ = 3 × 3 × 3 = 27.'
# Reasoning
for block in response.content:
    if block["type"] == "reasoning":
        for summary in block["summary"]:
            print(summary["text"])
**Calculating the power of three**
The user is asking about 3 raised to the power of 3. That's a pretty simple calculation! I know that 3^3 equals 27, so I can say, "3 to the power of 3 equals 27." I might also include a quick explanation that it's 3 multiplied by itself three times: 3 × 3 × 3 = 27. So, the answer is definitely 27.
Fine-tuning
You can call fine-tuned OpenAI models by passing in your corresponding `model_name` parameter.
This generally takes the form of `ft:{OPENAI_MODEL_NAME}:{ORG_NAME}::{MODEL_ID}`. For example:
fine_tuned_model = ChatOpenAI(
    temperature=0, model_name="ft:gpt-3.5-turbo-0613:langchain::7qTVM5AR"
)
fine_tuned_model.invoke(messages)
AIMessage(content="J'adore la programmation.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 8, 'prompt_tokens': 31, 'total_tokens': 39}, 'model_name': 'ft:gpt-3.5-turbo-0613:langchain::7qTVM5AR', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-0f39b30e-c56e-4f3b-af99-5c948c984146-0', usage_metadata={'input_tokens': 31, 'output_tokens': 8, 'total_tokens': 39})
Multimodal inputs (images, PDFs, audio)
OpenAI has models that support multimodal inputs. You can pass in images, PDFs, or audio to these models. For more information on how to do this in LangChain, head to the multimodal inputs docs.
You can see the list of models that support different modalities in OpenAI's documentation.
For all modalities, LangChain supports both its cross-provider standard as well as OpenAI's native content-block format.
To pass multimodal data into `ChatOpenAI`, create a content block containing the data and incorporate it into a message, e.g., as below:
message = {
    "role": "user",
    "content": [
        {
            "type": "text",
            # Update prompt as desired
            "text": "Describe the (image / PDF / audio...)",
        },
        content_block,
    ],
}
Example content blocks are shown below.
Images
See examples in the how-to guide here.
URL
# LangChain format
content_block = {
    "type": "image",
    "source_type": "url",
    "url": url_string,
}

# OpenAI Chat Completions format
content_block = {
    "type": "image_url",
    "image_url": {"url": url_string},
}
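Putting the URL block to work end to end (a minimal sketch; the model name and URL are illustrative):
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe the image."},
        {
            "type": "image",
            "source_type": "url",
            "url": "https://example.com/cat.png",
        },
    ],
}
response = llm.invoke([message])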
Inline base64 data
# LangChain format
content_block = {
    "type": "image",
    "source_type": "base64",
    "data": base64_string,
    "mime_type": "image/jpeg",
}

# OpenAI Chat Completions format
content_block = {
    "type": "image_url",
    "image_url": {
        "url": f"data:image/jpeg;base64,{base64_string}",
    },
}
PDF files
Note: OpenAI requires file names be specified for PDF inputs. When using LangChain's format, include the `filename` key.
Read more here.
See examples in the how-to guide here.
Inline base64 data
# LangChain format
content_block = {
    "type": "file",
    "source_type": "base64",
    "data": base64_string,
    "mime_type": "application/pdf",
    "filename": "my-file.pdf",
}

# OpenAI Chat Completions format
content_block = {
    "type": "file",
    "file": {
        "filename": "my-file.pdf",
        "file_data": f"data:application/pdf;base64,{base64_string}",
    },
}
Audio
See supported models, e.g. `"gpt-4o-audio-preview"`.
See examples in the how-to guide here.
Inline base64 data
# LangChain format
content_block = {
    "type": "audio",
    "source_type": "base64",
    "mime_type": "audio/wav",  # or appropriate mime-type
    "data": base64_string,
}

# OpenAI Chat Completions format
content_block = {
    "type": "input_audio",
    "input_audio": {"data": base64_string, "format": "wav"},
}
Predicted output
Requires langchain-openai>=0.2.6
Some OpenAI models (such as their gpt-4o and gpt-4o-mini series) support Predicted Outputs, which allow you to pass in a known portion of the LLM's expected output ahead of time to reduce latency. This is useful for cases such as editing text or code, where only a small part of the model's output will change.
Here's an example:
code = """
/// <summary>
/// Represents a user with a first name, last name, and username.
/// </summary>
public class User
{
/// <summary>
/// Gets or sets the user's first name.
/// </summary>
public string FirstName { get; set; }
/// <summary>
/// Gets or sets the user's last name.
/// </summary>
public string LastName { get; set; }
/// <summary>
/// Gets or sets the user's username.
/// </summary>
public string Username { get; set; }
}
"""
llm = ChatOpenAI(model="gpt-4o")
query = (
    "Replace the Username property with an Email property. "
    "Respond only with code, and with no markdown formatting."
)
response = llm.invoke(
    [{"role": "user", "content": query}, {"role": "user", "content": code}],
    prediction={"type": "content", "content": code},
)
print(response.content)
print(response.response_metadata)
/// <summary>
/// Represents a user with a first name, last name, and email.
/// </summary>
public class User
{
    /// <summary>
    /// Gets or sets the user's first name.
    /// </summary>
    public string FirstName { get; set; }

    /// <summary>
    /// Gets or sets the user's last name.
    /// </summary>
    public string LastName { get; set; }

    /// <summary>
    /// Gets or sets the user's email.
    /// </summary>
    public string Email { get; set; }
}
{'token_usage': {'completion_tokens': 226, 'prompt_tokens': 166, 'total_tokens': 392, 'completion_tokens_details': {'accepted_prediction_tokens': 49, 'audio_tokens': None, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 107}, 'prompt_tokens_details': {'audio_tokens': None, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_45cf54deae', 'finish_reason': 'stop', 'logprobs': None}
Note that currently predictions are billed as additional tokens and may increase your usage and costs in exchange for this reduced latency.
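You can check how many prediction tokens were accepted or rejected via the usage metadata shown above:
details = response.response_metadata["token_usage"]["completion_tokens_details"]
print(details["accepted_prediction_tokens"], details["rejected_prediction_tokens"])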
Audio generation (preview)
Requires langchain-openai>=0.2.3
OpenAI has a new audio generation feature that allows you to use audio inputs and outputs with the `gpt-4o-audio-preview` model.
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
    model="gpt-4o-audio-preview",
    temperature=0,
    model_kwargs={
        "modalities": ["text", "audio"],
        "audio": {"voice": "alloy", "format": "wav"},
    },
)

output_message = llm.invoke(
    [
        ("human", "Are you made by OpenAI? Just answer yes or no"),
    ]
)
`output_message.additional_kwargs['audio']` will contain a dictionary like:
{
    'data': '<audio data b64-encoded>',
    'expires_at': 1729268602,
    'id': 'audio_67127d6a44348190af62c1530ef0955a',
    'transcript': 'Yes.'
}
The format will be what is passed in `model_kwargs['audio']['format']`.
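For example, you can decode the payload and write it to disk (a minimal sketch, assuming the `"wav"` format configured above):
import base64

audio_b64 = output_message.additional_kwargs["audio"]["data"]
with open("output.wav", "wb") as f:
    f.write(base64.b64decode(audio_b64))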
We can also pass this message with audio data back to the model as part of a message history, before OpenAI's `expires_at` is reached.
Output audio is stored under the `audio` key in `AIMessage.additional_kwargs`, but input content blocks are typed with an `input_audio` type and key in `HumanMessage.content` lists.
For more information, see OpenAI's audio docs.
history = [
    ("human", "Are you made by OpenAI? Just answer yes or no"),
    output_message,
    ("human", "And what is your name? Just give your name."),
]
second_output_message = llm.invoke(history)
Flex processing
OpenAI offers a variety of service tiers. The "flex" tier offers cheaper pricing for requests, with the trade-off that responses may take longer and resources might not always be available. This approach is best suited for non-critical tasks, such as model testing, data enhancement, or jobs that can be run asynchronously.
To use it, initialize the model with `service_tier="flex"`:
llm = ChatOpenAI(model="o4-mini", service_tier="flex")
Note that this is a beta feature that is only available for a subset of models. See the OpenAI docs for more detail.
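Because flex requests may sit in a queue, it can help to pair the tier with a generous timeout and retries (a sketch; the values are illustrative):
llm = ChatOpenAI(
    model="o4-mini",
    service_tier="flex",
    timeout=900.0,  # allow slow responses
    max_retries=2,
)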
API reference
For detailed documentation of all ChatOpenAI features and configurations, head to the API reference.