Google Cloud Storage File

Google Cloud Storage is a managed service for storing unstructured data.

This page covers how to load document objects from a Google Cloud Storage (GCS) file object (blob).

%pip install --upgrade --quiet  langchain-google-community[gcs]
from langchain_google_community import GCSFileLoader
loader = GCSFileLoader(project_name="aist", bucket="testing-hwc", blob="fake.docx")
loader.load()
/Users/harrisonchase/workplace/langchain/.venv/lib/python3.10/site-packages/google/auth/_default.py:83: UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK without a quota project. You might receive a "quota exceeded" or "API not enabled" error. We recommend you rerun `gcloud auth application-default login` and make sure a quota project is added. Or you can use service accounts instead. For more information about service accounts, see https://cloud.google.com/docs/authentication/
warnings.warn(_CLOUD_SDK_CREDENTIALS_WARNING)
[Document(page_content='Lorem ipsum dolor sit amet.', lookup_str='', metadata={'source': '/var/folders/y6/8_bzdg295ld6s1_97_12m4lr0000gn/T/tmp3srlf8n8/fake.docx'}, lookup_index=0)]

If you want to use a different loader, you can provide a custom function, for example:

from langchain_community.document_loaders import PyPDFLoader


def load_pdf(file_path):
    return PyPDFLoader(file_path)


loader = GCSFileLoader(
    project_name="aist", bucket="testing-hwc", blob="fake.pdf", loader_func=load_pdf
)
API Reference: PyPDFLoader
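
The load_pdf example suggests the shape a custom function needs: it takes the local path of the downloaded blob and returns a loader object that has not been loaded yet. The following is a minimal sketch of that shape using only the standard library, on the assumption that GCSFileLoader calls loader_func(file_path) and then .load() on the result. SimpleDoc, PlainTextLoader, and load_text are hypothetical names introduced for illustration and are not part of LangChain; a real loader would return LangChain Document objects.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SimpleDoc:
    """Hypothetical stand-in for LangChain's Document (text plus metadata)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


class PlainTextLoader:
    """Hypothetical loader that reads one UTF-8 text file into a single document."""

    def __init__(self, file_path: str):
        self.file_path = file_path

    def load(self) -> list:
        text = Path(self.file_path).read_text(encoding="utf-8")
        return [SimpleDoc(page_content=text, metadata={"source": self.file_path})]


def load_text(file_path: str) -> PlainTextLoader:
    # Same shape as load_pdf: accept a local path, return an unloaded loader.
    return PlainTextLoader(file_path)


# Local check that mimics the assumed call sequence without touching GCS.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = str(Path(tmp_dir) / "sample.txt")
    Path(sample_path).write_text("Lorem ipsum dolor sit amet.", encoding="utf-8")
    docs = load_text(sample_path).load()

print(docs[0].page_content)
```

If that assumption holds, a function like load_text could be passed as loader_func in the same way as load_pdf, provided its loader returns real Document objects.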
