vlite

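VLite is a simple, fast vector database that lets you store and retrieve data semantically using embeddings. Built on numpy, VLite is a lightweight, batteries-included database for adding RAG, similarity search, and embeddings to your projects.

You need to install `langchain-community` (`pip install -qU langchain-community`) to use this integration.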

Installation

To use VLite in LangChain, you need to install the `vlite` package:

!pip install vlite

Importing VLite

from langchain_community.vectorstores import VLite

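API Reference: VLite

Basic Example

In this basic example, we load a text document and store it in the VLite vector database. We then run a similarity search to retrieve the documents relevant to a query.

VLite handles chunking and embedding of the text for you. You can change these parameters by pre-chunking the text and/or embedding those chunks yourself before adding them to the VLite database.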

from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter  # only needed if you pre-chunk the text yourself

# Load the document (VLite chunks and embeds the text for you)
loader = TextLoader("path/to/document.txt")
documents = loader.load()

# Create a VLite instance
vlite = VLite(collection="my_collection")

# Add documents to the VLite vector database
vlite.add_documents(documents)

# Perform a similarity search
query = "What is the main topic of the document?"
docs = vlite.similarity_search(query)

# Print the most relevant document
print(docs[0].page_content)

API Reference: TextLoader | CharacterTextSplitter
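The pre-chunking option mentioned above can be illustrated with a small sketch. This is not VLite's internal chunker: `chunk_text`, `chunk_size`, and `overlap` are hypothetical names chosen for illustration, and the fixed-size character window is only one of several reasonable chunking strategies.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into fixed-size character windows that overlap slightly.

    Hypothetical helper for illustration; not part of VLite or LangChain.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghij" * 5  # 50 characters
chunks = chunk_text(sample, chunk_size=20, overlap=5)
print(len(chunks), [len(c) for c in chunks])
```

Chunks produced this way could then be passed to `add_texts`, which is described in the next section.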

Adding Texts and Documents

You can add texts or documents to the VLite vector database with the `add_texts` and `add_documents` methods, respectively.

# Add texts to the VLite vector database
texts = ["This is the first text.", "This is the second text."]
vlite.add_texts(texts)

# Add documents to the VLite vector database
from langchain_core.documents import Document

documents = [Document(page_content="This is a document.", metadata={"source": "example.txt"})]
vlite.add_documents(documents)

Similarity Search

VLite provides methods for running similarity searches over the stored documents.

# Perform a similarity search
query = "What is the main topic of the document?"
docs = vlite.similarity_search(query, k=3)

# Perform a similarity search with scores
docs_with_scores = vlite.similarity_search_with_score(query, k=3)
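To make the idea of a scored similarity search concrete, here is a minimal, library-free sketch that ranks toy vectors by cosine similarity and returns `(text, score)` pairs. It assumes cosine similarity as the metric, which the text above does not confirm for VLite; `top_k` and `toy_store` are hypothetical names, and the three-dimensional vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=3):
    """Return the k (text, score) pairs most similar to query_vec."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in store]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest score first
    return scored[:k]


# Toy "embeddings"; a real store would hold model-generated vectors.
toy_store = [
    ("cats and kittens", [0.9, 0.1, 0.0]),
    ("dogs and puppies", [0.7, 0.3, 0.1]),
    ("stock market news", [0.0, 0.2, 0.9]),
]
results = top_k([1.0, 0.0, 0.0], toy_store, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

In this sketch a higher score means a closer match. Some vector stores return a distance instead, where lower is better, so it is worth checking which convention `similarity_search_with_score` follows before thresholding on the score.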

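Max Marginal Relevance Search

VLite also supports Max Marginal Relevance (MMR) search, which optimizes for both similarity to the query and diversity among the retrieved documents.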

# Perform an MMR search
docs = vlite.max_marginal_relevance_search(query, k=3)
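The trade-off that MMR makes can be sketched with a greedy selection loop over toy vectors. This is a generic illustration of the MMR idea, not VLite's implementation; the `mmr` function and its `lambda_mult` weight are assumptions made for the example (1.0 means pure relevance, lower values favor diversity).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr(query_vec, candidates, k=3, lambda_mult=0.5):
    """Greedily pick up to k candidate indices, balancing relevance and diversity."""
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in remaining:
            relevance = cosine(query_vec, candidates[i])
            # Redundancy: similarity to the closest already-selected item.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


vectors = [
    [1.0, 0.0],    # 0: matches the query exactly
    [0.99, 0.05],  # 1: near-duplicate of item 0
    [0.6, 0.8],    # 2: less relevant, but different from item 0
]
diverse = mmr([1.0, 0.0], vectors, k=2, lambda_mult=0.3)
relevant_only = mmr([1.0, 0.0], vectors, k=2, lambda_mult=1.0)
print(diverse, relevant_only)
```

With the diversity-leaning weight, the near-duplicate is skipped in favor of the more distinct item; with `lambda_mult=1.0` the result collapses to a plain similarity ranking.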

Updating and Deleting Documents

You can update or delete documents in the VLite vector database with the `update_document` and `delete` methods.

# Update a document
document_id = "doc_id_1"
updated_document = Document(page_content="Updated content", metadata={"source": "updated.txt"})
vlite.update_document(document_id, updated_document)

# Delete documents
document_ids = ["doc_id_1", "doc_id_2"]
vlite.delete(document_ids)

Retrieving Documents

You can retrieve documents from the VLite vector database by document ID or by metadata using the `get` method.

# Retrieve documents by IDs
document_ids = ["doc_id_1", "doc_id_2"]
docs = vlite.get(ids=document_ids)

# Retrieve documents by metadata
metadata_filter = {"source": "example.txt"}
docs = vlite.get(where=metadata_filter)
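As a rough mental model of metadata filtering, the sketch below keeps only records whose metadata contains every key/value pair in the filter. Exact-match semantics is an assumption here, since the text above does not specify how VLite interprets `where`; `matches` and `records` are hypothetical names used only for this illustration.

```python
def matches(metadata: dict, where: dict) -> bool:
    """True if every key/value pair in `where` appears unchanged in `metadata`."""
    return all(metadata.get(key) == value for key, value in where.items())


# Toy records standing in for stored documents.
records = [
    {"id": "doc_id_1", "text": "first document", "metadata": {"source": "example.txt"}},
    {"id": "doc_id_2", "text": "second document", "metadata": {"source": "other.txt"}},
    {"id": "doc_id_3", "text": "third document", "metadata": {"source": "example.txt", "page": 2}},
]

metadata_filter = {"source": "example.txt"}
hits = [r["id"] for r in records if matches(r["metadata"], metadata_filter)]
print(hits)
```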

Creating VLite Instances

You can create a VLite instance in several ways:

# Create a VLite instance from texts
vlite = VLite.from_texts(texts)

# Create a VLite instance from documents
vlite = VLite.from_documents(documents)

# Create a VLite instance from an existing index
vlite = VLite.from_existing_index(collection="existing_collection")

Additional Features

VLite provides additional features for managing the vector database:

from langchain_community.vectorstores import VLite
vlite = VLite(collection="my_collection")

# Get the number of items in the collection
count = vlite.count()

# Save the collection
vlite.save()

# Clear the collection
vlite.clear()

# Get collection information
vlite.info()

# Dump the collection data
data = vlite.dump()

Related

Vector store conceptual guide
Vector store how-to guides
