Tensorlake

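Tensorlake is an AI data cloud that reliably transforms unstructured data into ingestion-ready formats for AI applications.

The langchain-tensorlake package provides seamless integration between Tensorlake and LangChain, letting you build sophisticated document-processing agents with enhanced parsing features such as signature detection.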

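Tensorlake feature overview

Tensorlake gives you the following tools:

Extraction: schema-driven structured data extraction for pulling specific fields out of documents.
Parsing: converts documents to Markdown for building RAG and knowledge graph systems.
Orchestration: build programmable workflows that ingest and enrich documents, text, audio, video, and more at scale.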

Installation

pip install -U langchain-tensorlake
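
To confirm that the package landed in the active environment, a small check like the one below may help. It is a minimal sketch that uses only the standard library's importlib.metadata; the helper name installed_version is hypothetical and not part of langchain-tensorlake.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Look up the distribution name used in the pip command above.
version = installed_version("langchain-tensorlake")
print(version or "langchain-tensorlake is not installed")
```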

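Examples

Follow the full tutorial to learn how to detect signatures in unstructured documents with the langchain-tensorlake tool.

Or check out this Colab notebook to get started quickly.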

Quick start

1. Set up your environment

Configure your Tensorlake and OpenAI credentials by setting the following environment variables:

export TENSORLAKE_API_KEY="your-tensorlake-api-key"
export OPENAI_API_KEY="your-openai-api-key"

Get your Tensorlake API key from the Tensorlake Cloud console. New users receive 100 free credits.
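
Because a missing key tends to surface only once the agent makes its first request, it can be useful to check both variables up front. The following is a minimal sketch; the helper missing_credentials and the REQUIRED_KEYS tuple are hypothetical names introduced for illustration, and only the two variable names come from this guide.

```python
import os

# The two variables this guide asks you to export.
REQUIRED_KEYS = ("TENSORLAKE_API_KEY", "OPENAI_API_KEY")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

Calling missing_credentials() with no arguments inspects the current process environment; an empty list means both keys are present.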

2. Import the necessary packages

from langchain_tensorlake import document_markdown_tool
from langgraph.prebuilt import create_react_agent
import asyncio
import os

API reference: create_react_agent

3. Build a signature detection agent

async def main(question):
    # Create the agent with the Tensorlake tool
    agent = create_react_agent(
        model="openai:gpt-4o-mini",
        tools=[document_markdown_tool],
        prompt=(
            """
            I have a document that needs to be parsed. \n\nPlease parse this document and answer the question about it.
            """
        ),
        name="real-estate-agent",
    )
    
    # Run the agent
    result = await agent.ainvoke({"messages": [{"role": "user", "content": question}]})

    # Print the result
    print(result["messages"][-1].content)

Note: We strongly recommend using openai as the agent model so that the agent sets the correct parsing parameters.

4. Example usage

# Define the path to the document to be parsed
path = "path/to/your/document.pdf"

# Define the question to be asked and create the agent
question = f"What contextual information can you extract about the signatures in my document found at {path}?"

if __name__ == "__main__":
    asyncio.run(main(question))
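
A mistyped path only shows up after the agent has already started working, so it may be worth validating it before building the question. The sketch below assumes the document is a local file; build_question is a hypothetical helper, not part of langchain-tensorlake, and the check should be dropped if you pass something other than a local path.

```python
from pathlib import Path


def build_question(path):
    """Return the signature question for a local document, failing early if it is missing."""
    document = Path(path).expanduser()
    if not document.is_file():
        # Assumption: the path points at a file on the local filesystem.
        raise FileNotFoundError(f"Document not found: {document}")
    return (
        "What contextual information can you extract about the signatures "
        f"in my document found at {document}?"
    )
```

The returned string can be passed to main in place of the hand-written question above.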

Need help?

Reach out to us directly on Slack or through the package repository on GitHub.
