LangChain 的 Redis 缓存

本教程演示了如何使用 langchain-redis 包中的 RedisCache 和 RedisSemanticCache 类来实现 LLM 响应的缓存。

设置

首先，让我们安装所需的依赖项并确保 Redis 实例正在运行。

%pip install -U langchain-core langchain-redis langchain-openai redis

确保您有一个 Redis 服务器正在运行。您可以使用 Docker 启动一个，命令如下：

docker run -d -p 6379:6379 redis:latest

或者根据您操作系统的说明，在本地安装并运行 Redis。

import os

# Use the environment variable if set, otherwise default to localhost
REDIS_URL = os.getenv("REDIS_URL", "redis://:6379")
print(f"Connecting to Redis at: {REDIS_URL}")

Connecting to Redis at: redis://redis:6379

导入所需库

import time

from langchain.globals import set_llm_cache
from langchain.schema import Generation
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_redis import RedisCache, RedisSemanticCache

import langchain_core
import langchain_openai
import openai
import redis

设置 OpenAI API 密钥

from getpass import getpass

# Check if OPENAI_API_KEY is already set in the environment
openai_api_key = os.getenv("OPENAI_API_KEY")

if not openai_api_key:
    print("OpenAI API key not found in environment variables.")
    openai_api_key = getpass("Please enter your OpenAI API key: ")

    # Set the API key for the current session
    os.environ["OPENAI_API_KEY"] = openai_api_key
    print("OpenAI API key has been set for this session.")
else:
    print("OpenAI API key found in environment variables.")

OpenAI API key not found in environment variables.
``````output
Please enter your OpenAI API key:  ········
``````output
OpenAI API key has been set for this session.

使用 RedisCache

# Initialize RedisCache
redis_cache = RedisCache(redis_url=REDIS_URL)

# Set the cache for LangChain to use
set_llm_cache(redis_cache)

# Initialize the language model
llm = OpenAI(temperature=0)


# Function to measure execution time
def timed_completion(prompt):
    start_time = time.time()
    result = llm.invoke(prompt)
    end_time = time.time()
    return result, end_time - start_time


# First call (not cached)
prompt = "Explain the concept of caching in three sentences."
result1, time1 = timed_completion(prompt)
print(f"First call (not cached):\nResult: {result1}\nTime: {time1:.2f} seconds\n")

# Second call (should be cached)
result2, time2 = timed_completion(prompt)
print(f"Second call (cached):\nResult: {result2}\nTime: {time2:.2f} seconds\n")

print(f"Speed improvement: {time1 / time2:.2f}x faster")

# Clear the cache
redis_cache.clear()
print("Cache cleared")

First call (not cached):
Result: 

Caching is the process of storing frequently accessed data in a temporary storage location for faster retrieval. This helps to reduce the time and resources needed to access the data from its original source. Caching is commonly used in computer systems, web browsers, and databases to improve performance and efficiency.
Time: 1.16 seconds

Second call (cached):
Result: 

Caching is the process of storing frequently accessed data in a temporary storage location for faster retrieval. This helps to reduce the time and resources needed to access the data from its original source. Caching is commonly used in computer systems, web browsers, and databases to improve performance and efficiency.
Time: 0.05 seconds

Speed improvement: 25.40x faster
Cache cleared

使用 RedisSemanticCache

# Initialize RedisSemanticCache
embeddings = OpenAIEmbeddings()
semantic_cache = RedisSemanticCache(
    redis_url=REDIS_URL, embeddings=embeddings, distance_threshold=0.2
)

# Set the cache for LangChain to use
set_llm_cache(semantic_cache)


# Function to test semantic cache
def test_semantic_cache(prompt):
    start_time = time.time()
    result = llm.invoke(prompt)
    end_time = time.time()
    return result, end_time - start_time


# Original query
original_prompt = "What is the capital of France?"
result1, time1 = test_semantic_cache(original_prompt)
print(
    f"Original query:\nPrompt: {original_prompt}\nResult: {result1}\nTime: {time1:.2f} seconds\n"
)

# Semantically similar query
similar_prompt = "Can you tell me the capital city of France?"
result2, time2 = test_semantic_cache(similar_prompt)
print(
    f"Similar query:\nPrompt: {similar_prompt}\nResult: {result2}\nTime: {time2:.2f} seconds\n"
)

print(f"Speed improvement: {time1 / time2:.2f}x faster")

# Clear the semantic cache
semantic_cache.clear()
print("Semantic cache cleared")

Original query:
Prompt: What is the capital of France?
Result: 

The capital of France is Paris.
Time: 1.52 seconds

Similar query:
Prompt: Can you tell me the capital city of France?
Result: 

The capital of France is Paris.
Time: 0.29 seconds

Speed improvement: 5.22x faster
Semantic cache cleared

高级用法

自定义 TTL（生存时间）

# Initialize RedisCache with custom TTL
ttl_cache = RedisCache(redis_url=REDIS_URL, ttl=5)  # 60 seconds TTL

# Update a cache entry
ttl_cache.update("test_prompt", "test_llm", [Generation(text="Cached response")])

# Retrieve the cached entry
cached_result = ttl_cache.lookup("test_prompt", "test_llm")
print(f"Cached result: {cached_result[0].text if cached_result else 'Not found'}")

# Wait for TTL to expire
print("Waiting for TTL to expire...")
time.sleep(6)

# Try to retrieve the expired entry
expired_result = ttl_cache.lookup("test_prompt", "test_llm")
print(
    f"Result after TTL: {expired_result[0].text if expired_result else 'Not found (expired)'}"
)

Cached result: Cached response
Waiting for TTL to expire...
Result after TTL: Not found (expired)

自定义 RedisSemanticCache

# Initialize RedisSemanticCache with custom settings
custom_semantic_cache = RedisSemanticCache(
    redis_url=REDIS_URL,
    embeddings=embeddings,
    distance_threshold=0.1,  # Stricter similarity threshold
    ttl=3600,  # 1 hour TTL
    name="custom_cache",  # Custom cache name
)

# Test the custom semantic cache
set_llm_cache(custom_semantic_cache)

test_prompt = "What's the largest planet in our solar system?"
result, _ = test_semantic_cache(test_prompt)
print(f"Original result: {result}")

# Try a slightly different query
similar_test_prompt = "Which planet is the biggest in the solar system?"
similar_result, _ = test_semantic_cache(similar_test_prompt)
print(f"Similar query result: {similar_result}")

# Clean up
custom_semantic_cache.clear()

Original result: 

The largest planet in our solar system is Jupiter.
Similar query result: 

The largest planet in our solar system is Jupiter.

总结

本教程演示了 langchain-redis 包中 RedisCache 和 RedisSemanticCache 的用法。这些缓存机制可以通过减少冗余的 API 调用和利用语义相似性进行智能缓存，显著提高基于 LLM 的应用程序的性能。基于 Redis 的实现为分布式系统中的缓存提供了一个快速、可扩展和灵活的解决方案。

设置​

导入所需库​

设置 OpenAI API 密钥​

使用 RedisCache​

使用 RedisSemanticCache​

高级用法​

自定义 TTL（生存时间）​

自定义 RedisSemanticCache​

总结​

设置