Rememberizer

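Rememberizer is a knowledge enhancement service for AI applications, created by SkyDeck AI Inc.

This notebook shows how to retrieve documents from Rememberizer and convert them into the Document format used downstream.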

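Preparation

You will need an API key: you can get one after creating a common knowledge at https://rememberizer.ai. Once you have the API key, either set it as the environment variable REMEMBERIZER_API_KEY or pass it as rememberizer_api_key when initializing RememberizerRetriever.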

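RememberizerRetriever takes the following arguments:

top_k_results (optional): default=10. Limits the number of documents returned.
rememberizer_api_key (optional): required only if the environment variable REMEMBERIZER_API_KEY is not set.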
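
The key lookup described above can be sketched with the standard library alone. This illustrates the documented behavior (an explicit argument, or the REMEMBERIZER_API_KEY environment variable); it is not the library's implementation. The helper name resolve_api_key, and the rule that an explicit argument wins over the environment, are assumptions made for this example.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "REMEMBERIZER_API_KEY"  # environment variable named in the text above


def resolve_api_key(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the API key from the explicit argument or the environment.

    Hypothetical helper: assumes an explicit argument takes precedence.
    """
    env = os.environ if env is None else env
    key = explicit or env.get(ENV_VAR)
    if not key:
        raise ValueError(
            f"Set {ENV_VAR} or pass rememberizer_api_key when creating the retriever."
        )
    return key
```

The resolved value could then be passed as rememberizer_api_key when initializing RememberizerRetriever.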

get_relevant_documents() takes one argument, query: free text used to find documents in the common knowledge of Rememberizer.ai.
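
In recent LangChain releases, get_relevant_documents() is deprecated in favor of invoke(), which accepts the same free-text query. A minimal sketch of a call helper that works with either method follows. The names fetch_documents and StubRetriever are hypothetical and not part of LangChain; the stub exists only so the helper can be tried offline.

```python
from typing import Any, List


def fetch_documents(retriever: Any, query: str) -> List[Any]:
    """Run a free-text query, preferring invoke() when the retriever has it."""
    if hasattr(retriever, "invoke"):
        return retriever.invoke(query)
    return retriever.get_relevant_documents(query=query)


class StubRetriever:
    """Hypothetical stand-in so the helper can be tried without network access."""

    def get_relevant_documents(self, query: str) -> List[str]:
        return [f"stub result for: {query}"]


print(fetch_documents(StubRetriever(), "What is RAG?"))
```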

Examples

Basic usage

# Set up the API key
import os
from getpass import getpass

from langchain_community.retrievers import RememberizerRetriever

# Prompt for the key so it is not hard-coded in the notebook
REMEMBERIZER_API_KEY = getpass()
os.environ["REMEMBERIZER_API_KEY"] = REMEMBERIZER_API_KEY

# Limit each query to at most 5 returned documents
retriever = RememberizerRetriever(top_k_results=5)

API Reference: RememberizerRetriever

docs = retriever.get_relevant_documents(query="How does Large Language Models works?")

docs[0].metadata  # meta-information of the Document

{'id': 13646493,
 'document_id': '17s3LlMbpkTk0ikvGwV0iLMCj-MNubIaP',
 'name': 'What is a large language model (LLM)_ _ Cloudflare.pdf',
 'type': 'application/pdf',
 'path': '/langchain/What is a large language model (LLM)_ _ Cloudflare.pdf',
 'url': 'https://drive.google.com/file/d/17s3LlMbpkTk0ikvGwV0iLMCj-MNubIaP/view',
 'size': 337089,
 'created_time': '',
 'modified_time': '',
 'indexed_on': '2024-04-04T03:36:28.886170Z',
 'integration': {'id': 347, 'integration_type': 'google_drive'}}
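
The metadata dictionary can be condensed into a one-line source label, for example when listing where an answer came from. This is a small sketch that uses only keys visible in the output above. The helper name describe_source is hypothetical, and since other documents may not carry every key, missing values fall back to placeholders.

```python
from typing import Any, Dict


def describe_source(metadata: Dict[str, Any]) -> str:
    """Build a short label from Rememberizer document metadata."""
    name = metadata.get("name") or "<unnamed>"
    mime = metadata.get("type") or "unknown type"
    size_kib = (metadata.get("size") or 0) / 1024
    integration = (metadata.get("integration") or {}).get(
        "integration_type", "unknown"
    )
    return f"{name} ({mime}, {size_kib:.1f} KiB, via {integration})"


# Subset of the metadata shown above
sample = {
    "name": "What is a large language model (LLM)_ _ Cloudflare.pdf",
    "type": "application/pdf",
    "size": 337089,
    "integration": {"id": 347, "integration_type": "google_drive"},
}
print(describe_source(sample))
```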

print(docs[0].page_content[:400])  # first 400 characters of the Document's content

before, or contextualized in new ways. on some level they " understand " semantics in that they can associate words and concepts by their meaning, having seen them grouped together in that way millions or billions of times. how developers can quickly start building their own llms to build llm applications, developers need easy access to multiple data sets, and they need places for those data sets

Usage in a chain

OPENAI_API_KEY = getpass()

os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

from langchain.chains import ConversationalRetrievalChain
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model_name="gpt-3.5-turbo")
qa = ConversationalRetrievalChain.from_llm(model, retriever=retriever)

API Reference: ConversationalRetrievalChain | ChatOpenAI

questions = [
    "What is RAG?",
    "How does Large Language Models works?",
]
chat_history = []

for question in questions:
    result = qa.invoke({"question": question, "chat_history": chat_history})
    chat_history.append((question, result["answer"]))
    print(f"-> **Question**: {question} \n")
    print(f"**Answer**: {result['answer']} \n")

-> **Question**: What is RAG? 

**Answer**: RAG stands for Retrieval-Augmented Generation. It is an AI framework that retrieves facts from an external knowledge base to enhance the responses generated by Large Language Models (LLMs) by providing up-to-date and accurate information. This framework helps users understand the generative process of LLMs and ensures that the model has access to reliable information sources. 

-> **Question**: How does Large Language Models works? 

**Answer**: Large Language Models (LLMs) work by analyzing massive data sets of language to comprehend and generate human language text. They are built on machine learning, specifically deep learning, which involves training a program to recognize features of data without human intervention. LLMs use neural networks, specifically transformer models, to understand context in human language, making them better at interpreting language even in vague or new contexts. Developers can quickly start building their own LLMs by accessing multiple data sets and using services like Cloudflare's Vectorize and Cloudflare Workers AI platform.
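
The question loop above can be wrapped in a function so that the history handling is easy to reuse and to dry-run. This sketch assumes only what the loop already relies on: a chain whose invoke() accepts question and chat_history and returns a dict with an answer key. The names ask_all and EchoChain are hypothetical; EchoChain is a stand-in so the example runs without any API keys.

```python
from typing import Any, Dict, List, Tuple

ChatHistory = List[Tuple[str, str]]


def ask_all(chain: Any, questions: List[str]) -> ChatHistory:
    """Ask each question in order, feeding earlier turns back as chat history."""
    chat_history: ChatHistory = []
    for question in questions:
        result = chain.invoke({"question": question, "chat_history": chat_history})
        chat_history.append((question, result["answer"]))
    return chat_history


class EchoChain:
    """Hypothetical stand-in for the ConversationalRetrievalChain."""

    def invoke(self, inputs: Dict[str, Any]) -> Dict[str, str]:
        turns = len(inputs["chat_history"])
        return {"answer": f"answer to {inputs['question']!r} after {turns} turn(s)"}


for question, answer in ask_all(EchoChain(), ["What is RAG?", "How do LLMs work?"]):
    print(f"-> {question}\n   {answer}")
```

Passing the qa chain built earlier in place of EchoChain should give the same flow as the original loop, with the accumulated history returned to the caller.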

Related

Retriever conceptual guide
Retriever how-to guides
