GreenNodeRetriever

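GreenNode is a global AI solutions provider and an NVIDIA Preferred Partner, delivering full-stack AI capabilities, from infrastructure to application, to enterprises across the US, MENA, and APAC regions. Operating on world-class infrastructure (LEED Gold, TIA‑942, Uptime Tier III), GreenNode offers enterprises, startups, and researchers a comprehensive suite of AI services.

This notebook walks through getting started with the `GreenNodeRerank` retriever. It lets you run document search using built-in connectors or your own data sources, and uses GreenNode's reranking capabilities to improve relevance.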

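Integration details

  • Provider: GreenNode Serverless AI
  • Model types: Reranking models
  • Primary use case: Reranking search results based on semantic relevance
  • Available models: Includes BAAI/bge-reranker-v2-m3 and other high-performance reranking models
  • Scoring: Returns relevance scores used to reorder document candidates according to how well they match the query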

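Setup

To access GreenNode models, you need to create a GreenNode account, get an API key, and install the `langchain-greennode` integration package.

Credentials

Head to this page to sign up for the GreenNode AI Platform and generate an API key. Once you have done this, set the `GREENNODE_API_KEY` environment variable: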

import getpass
import os

if not os.getenv("GREENNODE_API_KEY"):
    os.environ["GREENNODE_API_KEY"] = getpass.getpass("Enter your GreenNode API key: ")
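
In non-interactive settings such as scripts or CI jobs, `getpass` cannot prompt for input, so it can help to fail early when the key is missing. The helper below is a hypothetical sketch and not part of `langchain-greennode`; only the `GREENNODE_API_KEY` variable name comes from this guide.

```python
import os


def require_env(name: str = "GREENNODE_API_KEY") -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running.")
    return value
```

Calling `require_env()` at the top of a script surfaces a missing key immediately, before any request is made.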

If you want automated tracing of individual queries, you can also set your LangSmith API key by uncommenting the lines below:

# os.environ["LANGSMITH_API_KEY"] = getpass.getpass("Enter your LangSmith API key: ")
# os.environ["LANGSMITH_TRACING"] = "true"

Installation

This retriever lives in the `langchain-greennode` package:

%pip install -qU langchain-greennode
Note: you may need to restart the kernel to use updated packages.

Instantiation

The `GreenNodeRerank` class can be instantiated with optional parameters for the API key and model name:

from langchain_greennode import GreenNodeRerank

# Initialize the reranker
reranker = GreenNodeRerank(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="BAAI/bge-reranker-v2-m3",  # The default reranking model
    top_n=3,  # Number of top-ranked documents to return
)

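Usage

Reranking search results

Reranking models improve retrieval-augmented generation (RAG) workflows by reordering the initial search results according to semantic relevance. The example below shows how to combine `GreenNodeRerank` with a base retriever to improve the quality of the retrieved documents.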

from langchain.retrievers.contextual_compression import ContextualCompressionRetriever
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain_greennode import GreenNodeEmbeddings

# Initialize the embeddings model
embeddings = GreenNodeEmbeddings(
    # api_key="YOUR_API_KEY",  # You can pass the API key directly
    model="BAAI/bge-m3"  # The default embedding model
)

# Prepare documents (finance/economics domain)
docs = [
    Document(
        page_content="Inflation represents the rate at which the general level of prices for goods and services rises"
    ),
    Document(
        page_content="Central banks use interest rates to control inflation and stabilize the economy"
    ),
    Document(
        page_content="Cryptocurrencies like Bitcoin operate on decentralized blockchain networks"
    ),
    Document(
        page_content="Stock markets are influenced by corporate earnings, investor sentiment, and economic indicators"
    ),
]

# Create a vector store and a base retriever
vector_store = FAISS.from_documents(docs, embeddings)
base_retriever = vector_store.as_retriever(search_kwargs={"k": 4})


# Wrap the base retriever so its results are reranked by GreenNodeRerank
rerank_retriever = ContextualCompressionRetriever(
    base_compressor=reranker, base_retriever=base_retriever
)

# Perform retrieval with reranking
# (`invoke` replaces the deprecated `get_relevant_documents`)
query = "How do central banks fight rising prices?"
results = rerank_retriever.invoke(query)

results
[Document(metadata={'relevance_score': 0.125}, page_content='Central banks use interest rates to control inflation and stabilize the economy'),
Document(metadata={'relevance_score': 0.004913330078125}, page_content='Inflation represents the rate at which the general level of prices for goods and services rises'),
Document(metadata={'relevance_score': 1.6689300537109375e-05}, page_content='Cryptocurrencies like Bitcoin operate on decentralized blockchain networks')]
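
Each reranked document carries its score in `metadata['relevance_score']`, and in the output above the third hit scores several orders of magnitude below the first. One possible follow-up is to drop hits below a cutoff before passing them to an LLM. The sketch below uses plain dictionaries in place of `Document` objects so that it runs without any API call; the helper name `filter_by_score` and the `0.001` cutoff are illustrative assumptions, not part of `langchain-greennode`.

```python
# Stand-ins for the reranked Document objects shown above:
# each dict mirrors Document.metadata and Document.page_content.
reranked = [
    {
        "metadata": {"relevance_score": 0.125},
        "page_content": "Central banks use interest rates to control inflation and stabilize the economy",
    },
    {
        "metadata": {"relevance_score": 0.004913330078125},
        "page_content": "Inflation represents the rate at which the general level of prices for goods and services rises",
    },
    {
        "metadata": {"relevance_score": 1.6689300537109375e-05},
        "page_content": "Cryptocurrencies like Bitcoin operate on decentralized blockchain networks",
    },
]


def filter_by_score(docs, min_score):
    """Keep only documents whose relevance_score is at least min_score."""
    return [d for d in docs if d["metadata"]["relevance_score"] >= min_score]


kept = filter_by_score(reranked, min_score=0.001)  # hypothetical cutoff
for d in kept:
    print(f"{d['metadata']['relevance_score']:.4f}  {d['page_content']}")
```

With real `Document` objects the same check would read `doc.metadata["relevance_score"]`. Scores are model-specific, so a suitable cutoff has to be chosen empirically.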

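Direct usage

The `GreenNodeRerank` class can also be used on its own to rerank retrieved documents by relevance score. This is particularly useful when an initial retrieval step (for example, keyword or vector search) returns a broad set of candidates and a second model is needed to refine the results with deeper semantic understanding. The class accepts a query and a list of candidate documents and returns a list reordered by predicted relevance.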

test_documents = [
    Document(
        page_content="Carson City is the capital city of the American state of Nevada."
    ),
    Document(
        page_content="Washington, D.C. (also known as simply Washington or D.C.) is the capital of the United States."
    ),
    Document(
        page_content="Capital punishment has existed in the United States since before the United States was a country."
    ),
    Document(
        page_content="The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan."
    ),
]

test_query = "What is the capital of the United States?"
results = reranker.rerank(test_documents, test_query)
results
[{'index': 1, 'relevance_score': 1.0},
{'index': 0, 'relevance_score': 0.01165771484375},
{'index': 3, 'relevance_score': 0.0012054443359375}]
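
As the output shows, `rerank` returns indices into the input list rather than the documents themselves, so the results need to be mapped back to the original candidates. A minimal sketch of that step follows. It hard-codes the output shown above and uses plain strings instead of `Document` objects so that it runs offline; the variable names are illustrative.

```python
# The same candidate texts as in test_documents, in the same order.
candidates = [
    "Carson City is the capital city of the American state of Nevada.",
    "Washington, D.C. (also known as simply Washington or D.C.) is the capital of the United States.",
    "Capital punishment has existed in the United States since before the United States was a country.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
]

# Output copied from the rerank call above (top_n=3).
rerank_output = [
    {"index": 1, "relevance_score": 1.0},
    {"index": 0, "relevance_score": 0.01165771484375},
    {"index": 3, "relevance_score": 0.0012054443359375},
]

# Pair each returned index with its original text and score.
ranked = [(candidates[r["index"]], r["relevance_score"]) for r in rerank_output]
for text, score in ranked:
    print(f"{score:.4f}  {text}")
```

With `Document` objects, `test_documents[r["index"]]` would take the place of `candidates[r["index"]]`.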

Use within a chain

`GreenNodeRerank` fits into LangChain RAG pipelines. The following example builds a simple RAG chain on top of the reranking retriever.

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_greennode import ChatGreenNode

# Initialize LLM
llm = ChatGreenNode(model="deepseek-ai/DeepSeek-R1-Distill-Qwen-32B")

# Create a prompt template
prompt = ChatPromptTemplate.from_template(
    """
Answer the question based only on the following context:

Context:
{context}

Question: {question}
"""
)


# Join the page content of the retrieved documents into one context string
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


# Create RAG chain
rag_chain = (
    {"context": rerank_retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Run the chain
answer = rag_chain.invoke("How do central banks fight rising prices?")
answer
'\n\nCentral banks combat rising prices, or inflation, by adjusting interest rates. By raising interest rates, they increase the cost of borrowing, which discourages spending and investment. This reduction in demand helps slow down the rate of price increases, thereby controlling inflation and contributing to economic stability.'

API reference

For more details about the GreenNode Serverless AI API, visit the GreenNode Serverless AI documentation.