
JSONFormer

JSONFormer is a library that wraps local Hugging Face pipeline models for structured decoding of a subset of JSON Schema.

It works by filling in the structure tokens itself and then sampling only the content tokens from the model.
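To make the idea concrete, here is a minimal sketch of schema-guided filling. It is not JSONFormer's actual implementation: `fill_schema` and `toy_sampler` are hypothetical names, and the sampler is a hard-coded stand-in for a real language model. The point is only that keys and nesting come from the schema, while the "model" is asked for leaf values alone.

```python
import json
from typing import Any, Callable, Dict


def toy_sampler(path: str, value_type: str) -> Any:
    """Hypothetical stand-in for a model: return a fixed value per JSON type."""
    if value_type == "string":
        return "hello"
    if value_type == "number":
        return 0.5
    if value_type == "boolean":
        return True
    raise ValueError(f"unsupported type at {path}: {value_type}")


def fill_schema(
    schema: Dict[str, Any], sample: Callable[[str, str], Any], path: str = ""
) -> Any:
    """Walk the schema; emit structure ourselves and sample only leaf values."""
    if schema["type"] == "object":
        # Keys and nesting are copied from the schema, never generated.
        return {
            key: fill_schema(sub_schema, sample, f"{path}.{key}")
            for key, sub_schema in schema["properties"].items()
        }
    # Leaf value: this is the only place the "model" is consulted.
    return sample(path, schema["type"])


schema = {
    "type": "object",
    "properties": {
        "action": {"type": "string"},
        "action_input": {
            "type": "object",
            "properties": {
                "query": {"type": "string"},
                "temperature": {"type": "number"},
            },
        },
    },
}

filled = fill_schema(schema, toy_sampler)
text = json.dumps(filled)
print(text)
```

Because the structure is built in code and serialized with `json.dumps`, the result is well-formed JSON no matter what the sampler returns for each leaf.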

Warning: this module is still experimental.

%pip install --upgrade --quiet  jsonformer > /dev/null

Hugging Face Baseline

First, let's establish a qualitative baseline by checking the model's output without structured decoding.

import logging

logging.basicConfig(level=logging.ERROR)
import json
import os

import requests
from langchain_core.tools import tool

HF_TOKEN = os.environ.get("HUGGINGFACE_API_KEY")


@tool
def ask_star_coder(query: str, temperature: float = 1.0, max_new_tokens: float = 250):
    """Query the BigCode StarCoder model about coding questions."""
    url = "https://api-inference.huggingface.co/models/bigcode/starcoder"
    headers = {
        "Authorization": f"Bearer {HF_TOKEN}",
        "content-type": "application/json",
    }
    payload = {
        "inputs": f"{query}\n\nAnswer:",
        "temperature": temperature,
        "max_new_tokens": int(max_new_tokens),
    }
    response = requests.post(url, headers=headers, data=json.dumps(payload))
    response.raise_for_status()
    return json.loads(response.content.decode("utf-8"))
API Reference: tool
prompt = """You must respond using JSON format, with a single action and single action input.
You may 'ask_star_coder' for help on coding problems.

{arg_schema}

EXAMPLES
----
Human: "So what's all this about a GIL?"
AI Assistant:{{
"action": "ask_star_coder",
"action_input": {{"query": "What is a GIL?", "temperature": 0.0, "max_new_tokens": 100}}
}}
Observation: "The GIL is python's Global Interpreter Lock"
Human: "Could you please write a calculator program in LISP?"
AI Assistant:{{
"action": "ask_star_coder",
"action_input": {{"query": "Write a calculator program in LISP", "temperature": 0.0, "max_new_tokens": 250}}
}}
Observation: "(defun add (x y) (+ x y))\n(defun sub (x y) (- x y ))"
Human: "What's the difference between SGD and an SVM?"
AI Assistant:{{
"action": "ask_star_coder",
"action_input": {{"query": "What's the difference between SGD and an SVM?", "temperature": 1.0, "max_new_tokens": 250}}
}}
Observation: "SGD stands for stochastic gradient descent, while an SVM is a Support Vector Machine."

BEGIN! Answer the Human's question as best as you are able.
------
Human: 'What's the difference between an iterator and an iterable?'
AI Assistant:""".format(arg_schema=ask_star_coder.args)
from langchain_huggingface import HuggingFacePipeline
from transformers import pipeline

hf_model = pipeline(
    "text-generation", model="cerebras/Cerebras-GPT-590M", max_new_tokens=200
)

original_model = HuggingFacePipeline(pipeline=hf_model)

generated = original_model.predict(prompt, stop=["Observation:", "Human:"])
print(generated)
API Reference: HuggingFacePipeline
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
'What's the difference between an iterator and an iterable?'

That's not very impressive, is it? It didn't follow the JSON format at all! Let's try with the structured decoder.
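One way to make the failure concrete is to run the baseline output through a JSON parser. The sketch below uses only the standard library; `is_valid_json_object` is a hypothetical helper name, and `baseline_output` is the text printed above, copied in by hand.

```python
import json


def is_valid_json_object(text: str) -> bool:
    """Return True only if the text parses as a JSON object."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict)


# The unconstrained model simply echoed the question back.
baseline_output = "'What's the difference between an iterator and an iterable?'"

print(is_valid_json_object(baseline_output))
```

A check like this is a cheap guard to place in front of any code that expects to dispatch on an `action` field.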

JSONFormer LLM Wrapper

Let's try that again, this time providing the model with the JSON Schema of the Action input.

decoder_schema = {
    "title": "Decoding Schema",
    "type": "object",
    "properties": {
        "action": {"type": "string", "default": ask_star_coder.name},
        "action_input": {
            "type": "object",
            "properties": ask_star_coder.args,
        },
    },
}
from langchain_experimental.llms import JsonFormer

json_former = JsonFormer(json_schema=decoder_schema, pipeline=hf_model)
API Reference: JsonFormer
results = json_former.predict(prompt, stop=["Observation:", "Human:"])
print(results)
{"action": "ask_star_coder", "action_input": {"query": "What's the difference between an iterator and an iter", "temperature": 0.0, "max_new_tokens": 50.0}}

Voilà! Free of parsing errors.
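As a quick check, the structured output can be loaded with the standard `json` module and its fields read directly. The sketch below pastes in the string printed above rather than calling the model, so it runs without any model download. Note that `max_new_tokens` comes back as `50.0`, which is consistent with the tool declaring it as a `float`, so a caller would likely cast it to `int` before use. The query text is also cut short, presumably by a token limit, yet the JSON itself remains well-formed.

```python
import json

# The string printed by json_former.predict(...) above, copied verbatim.
results = """{"action": "ask_star_coder", "action_input": {"query": "What's the difference between an iterator and an iter", "temperature": 0.0, "max_new_tokens": 50.0}}"""

parsed = json.loads(results)  # raises json.JSONDecodeError on malformed input
action = parsed["action"]
action_input = parsed["action_input"]

# JSON numbers arrive as floats here; the tool itself casts to int as well.
action_input["max_new_tokens"] = int(action_input["max_new_tokens"])

print(action)
print(action_input)
```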

