跳到主要内容
Open In ColabOpen on GitHub

HugeGraph

HugeGraph 是一个方便、高效、可扩展的图数据库,它兼容 Apache TinkerPop3 框架和 Gremlin 查询语言。

Gremlin 是由 Apache 软件基金会下属的 Apache TinkerPop 开发的一种图遍历语言和虚拟机。

本笔记将展示如何使用大型语言模型(LLM)为 HugeGraph 数据库提供自然语言接口。

设置

您需要有一个正在运行的 HugeGraph 实例。您可以通过运行以下脚本来启动一个本地 Docker 容器。

docker run \
--name=graph \
-itd \
-p 8080:8080 \
hugegraph/hugegraph

如果要在应用程序中连接 HugeGraph,我们需要安装 Python SDK。

pip3 install hugegraph-python

如果您正在使用 Docker 容器,您需要等待几秒钟让数据库启动,然后我们需要为数据库创建模式并写入图数据。

from hugegraph.connection import PyHugeGraph

client = PyHugeGraph("localhost", "8080", user="admin", pwd="admin", graph="hugegraph")

首先,我们为一个简单的电影数据库创建模式。

"""schema"""
schema = client.schema()
schema.propertyKey("name").asText().ifNotExist().create()
schema.propertyKey("birthDate").asText().ifNotExist().create()
schema.vertexLabel("Person").properties(
"name", "birthDate"
).usePrimaryKeyId().primaryKeys("name").ifNotExist().create()
schema.vertexLabel("Movie").properties("name").usePrimaryKeyId().primaryKeys(
"name"
).ifNotExist().create()
schema.edgeLabel("ActedIn").sourceLabel("Person").targetLabel(
"Movie"
).ifNotExist().create()
'create EdgeLabel success, Detail: "b\'{"id":1,"name":"ActedIn","source_label":"Person","target_label":"Movie","frequency":"SINGLE","sort_keys":[],"nullable_keys":[],"index_labels":[],"properties":[],"status":"CREATED","ttl":0,"enable_label_index":true,"user_data":{"~create_time":"2023-07-04 10:48:47.908"}}\'"'

然后我们可以插入一些数据。

"""graph"""
g = client.graph()
g.addVertex("Person", {"name": "Al Pacino", "birthDate": "1940-04-25"})
g.addVertex("Person", {"name": "Robert De Niro", "birthDate": "1943-08-17"})
g.addVertex("Movie", {"name": "The Godfather"})
g.addVertex("Movie", {"name": "The Godfather Part II"})
g.addVertex("Movie", {"name": "The Godfather Coda The Death of Michael Corleone"})

g.addEdge("ActedIn", "1:Al Pacino", "2:The Godfather", {})
g.addEdge("ActedIn", "1:Al Pacino", "2:The Godfather Part II", {})
g.addEdge(
"ActedIn", "1:Al Pacino", "2:The Godfather Coda The Death of Michael Corleone", {}
)
g.addEdge("ActedIn", "1:Robert De Niro", "2:The Godfather Part II", {})
1:Robert De Niro--ActedIn-->2:The Godfather Part II

创建 HugeGraphQAChain

我们现在可以创建 HugeGraphHugeGraphQAChain。要创建 HugeGraph,我们只需将数据库对象传递给 HugeGraph 构造函数即可。

from langchain.chains import HugeGraphQAChain
from langchain_community.graphs import HugeGraph
from langchain_openai import ChatOpenAI
graph = HugeGraph(
username="admin",
password="admin",
address="localhost",
port=8080,
graph="hugegraph",
)

刷新图模式信息

如果数据库模式发生变化,您可以刷新生成 Gremlin 语句所需的模式信息。

# graph.refresh_schema()
print(graph.get_schema)
Node properties: [name: Person, primary_keys: ['name'], properties: ['name', 'birthDate'], name: Movie, primary_keys: ['name'], properties: ['name']]
Edge properties: [name: ActedIn, properties: []]
Relationships: ['Person--ActedIn-->Movie']

查询图

我们现在可以使用图 Gremlin 问答链来查询图数据。

chain = HugeGraphQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph, verbose=True)
chain.run("Who played in The Godfather?")


> Entering new chain...
Generated gremlin:
g.V().has('Movie', 'name', 'The Godfather').in('ActedIn').valueMap(true)
Full Context:
[{'id': '1:Al Pacino', 'label': 'Person', 'name': ['Al Pacino'], 'birthDate': ['1940-04-25']}]

> Finished chain.
'Al Pacino played in The Godfather.'