Shiv Gupta

Structured outputs from LLMs

Watching the "Pydantic is all you need" talk was eye-opening for me. In the past, I have struggled to write LLM prompts that return exactly the JSON schema I need – say, from OpenAI's gpt-3.5-turbo model. For simple JSON schemas with a handful of keys, this is not hard to do. Describing a schema in a prompt becomes cumbersome, however, for more complex schemas, especially when type safety matters. Not to mention, the LLM may return more than you asked for. For example, if the LLM API's response to a prompt is "Sure, here's your JSON: {"foo": "bar"}", it cannot be deserialized because it is not valid JSON.
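
That failure mode is easy to reproduce without calling an API at all – json.loads rejects any reply that isn't pure JSON, so the conversational preamble alone breaks deserialization:

```python
import json

# A reply where the model wrapped the JSON in chatty prose
reply = 'Sure, here\'s your JSON: {"foo": "bar"}'

try:
    json.loads(reply)
except json.JSONDecodeError as e:
    # Fails immediately: the string does not start with a JSON value
    print(f"Deserialization failed: {e}")
```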

OpenAI’s gpt-* family of models will return free-form text unless instructed otherwise – that is the kind of output most LLMs are optimized for. With the right prompt, however, they will return exactly the object you’re interested in. When building applications that interface programmatically with LLMs, obtaining structured output is important: it makes working with the object throughout its lifecycle in an application much simpler.

Here’s an example. Say we’ve got customers leaving reviews on our site, and we’re interested in building a database to analyze these reviews at scale. Getting an LLM to generate free-form text doesn’t help much here: what we really want is machine-readable data that can be aggregated. If we decide what we’d like to know from each review – say a category, a severity (0-10), and perhaps a sentiment – we can have the LLM API return a JSON object with these fields for each review we pass in. Because the output is now structured (a dict or list in Python, and analogous types in other languages), it’s easier to work with and meets our requirement of pushing the data into a SQL database.
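
Once each review is reduced to plain fields, loading and aggregating them is ordinary database work. A minimal sqlite3 sketch – the table name and row values here are illustrative stand-ins for what the LLM would return:

```python
import sqlite3

# Structured rows as they might come back from the LLM (illustrative values)
rows = [
    {"category": "Wireless Earbuds", "severity": 0, "sentiment": 1},
    {"category": "Coffee Maker", "severity": 5, "sentiment": 0},
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (category TEXT, severity INTEGER, sentiment INTEGER)"
)
conn.executemany(
    "INSERT INTO reviews VALUES (:category, :severity, :sentiment)",
    rows,
)

# Aggregate at scale, e.g. average severity per product category
for category, avg_severity in conn.execute(
    "SELECT category, AVG(severity) FROM reviews GROUP BY category"
):
    print(category, avg_severity)
```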

Instructor is a fantastic library to accomplish this if you’re working on LLM use-cases in Python or TypeScript. Instructor allows you to prompt the LLM and get exactly the structured object you define.

import instructor
from openai import OpenAI
from pydantic import BaseModel, Field


class Review(BaseModel):
    category: str = Field(
        description="The product family that the review is associated with"
    )
    severity: int = Field(
        description="The severity of the review, 0 being no action needed, to 10 being immediate action needed"
    )
    sentiment: int = Field(
        description="The sentiment with -1 being extremely negative and 1 being extremely positive, and 0 for neutral"
    )


# Patch the OpenAI client
client = instructor.from_openai(OpenAI())

reviews = [
    """
    These wireless earbuds are fantastic for the price! The sound quality is clear and the battery life is impressive. Perfect for daily use.
    """,
    """
    The coffee maker does the job, but the brewing time is longer than expected. It’s reliable, but I wish it had a faster brew option.
    """,
    """
    I’m disappointed with this fitness tracker. It frequently loses sync with my phone and the step counter seems inaccurate. I might have to return it.
    """,
]

for review in reviews:
    structured_review = client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=Review,
        messages=[{"role": "user", "content": review}],
    )
    print(structured_review)

# Each response is now a Pydantic object that you can work with or extend to fit your use case!
# category='Wireless Earbuds' severity=0 sentiment=1
# category='Coffee Maker' severity=5 sentiment=0
# category='Fitness Tracker' severity=6 sentiment=-1

Under the hood, Instructor takes care of passing your desired schema, along with its field descriptions, to the OpenAI API, and converts the output into a Review object that you can work with. Because Review is a Pydantic BaseModel, you can call model_dump_json() on it to get your structured JSON, or extend the class to support your use case.
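
For instance, constructing a Review by hand (using values from the sample output above, so no API call is needed) and serializing it looks like this:

```python
from pydantic import BaseModel, Field


class Review(BaseModel):
    category: str
    severity: int
    sentiment: int


# Stand-in for an object returned by the Instructor-patched client
review = Review(category="Wireless Earbuds", severity=0, sentiment=1)

# Pydantic v2 serializes to compact JSON
print(review.model_dump_json())
# {"category":"Wireless Earbuds","severity":0,"sentiment":1}
```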

Using Instructor to generate structured outputs from OpenAI models has been game-changing for me. I highly recommend checking out LangChain’s Structured Outputs as well.

Thanks for reading!