Validating LLM Outputs with Pydantic: A Technical Guide

The Complete Guide to Using Pydantic for Validating LLM Outputs

Pydantic validates LLM outputs against a declared schema, catching malformed JSON and incorrect data types before they surface as runtime errors downstream. The article shows how Pydantic models enforce schema compliance, which reduces integration failures.

Why This Matters

LLMs generate free-form text rather than structured data, so parsing a response as JSON can fail at runtime. Pydantic enforces a schema on the parsed data, coerces compatible types, and reports errors at the point of parsing. Without validation, inconsistent field names, missing required fields, and wrong data types surface later in the pipeline, where they are harder to debug. Industry reports estimate that unvalidated LLM outputs cause 75% of integration issues in AI systems.
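The failure modes listed above can be sketched in a few lines. This is a minimal illustration, assuming Pydantic v2; the Ticket model, its fields, and the validate_ticket helper are hypothetical names invented for this example and do not come from the original article.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Ticket(BaseModel):
    """Hypothetical schema, used only for this illustration."""
    title: str
    priority: int
    assignee: Optional[str] = None


def validate_ticket(raw: str):
    """Return (ticket, errors); errors is an empty list on success."""
    try:
        # Parses the JSON text and validates it against the schema in one step.
        return Ticket.model_validate_json(raw), []
    except ValidationError as exc:
        # Each error records the field path ("loc") and a machine-readable type.
        return None, [
            f"{'.'.join(str(p) for p in e['loc'])}: {e['type']}"
            for e in exc.errors()
        ]


# The string "2" is coerced to the integer 2 (default, non-strict mode).
ticket, errors = validate_ticket('{"title": "Login fails", "priority": "2"}')
print(ticket, errors)

# A misspelled field name and a non-numeric priority are both reported.
ticket, errors = validate_ticket('{"titel": "Login fails", "priority": "high"}')
print(errors)
```

Returning the errors as data rather than letting the exception propagate is one possible design choice; it makes it straightforward to log the failures or feed them back into a retry prompt.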

Key Insights

ContactInfo model: validates email with EmailStr and normalizes phone numbers with a custom validator (2025).
Product model: demonstrates nested validation (2025).
LangChain’s PydanticOutputParser: used with OpenAI to parse model responses (2025).

Working Example

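import json
from typing import Optional

# EmailStr requires the optional email-validator package
# (pip install "pydantic[email]").
from pydantic import BaseModel, EmailStr, field_validator


class ContactInfo(BaseModel):
    name: str
    email: EmailStr
    phone: Optional[str] = None
    company: Optional[str] = None

    @field_validator('phone')
    @classmethod
    def validate_phone(cls, v):
        # phone is optional, so None passes through unchanged
        if v is None:
            return v
        # Keep digits only, dropping spaces, dashes, and parentheses
        cleaned = ''.join(filter(str.isdigit, v))
        if len(cleaned) < 10:
            raise ValueError('Phone number must have at least 10 digits')
        return cleaned


# Simulated LLM response. The email address is a placeholder: the original
# value was obscured in the source page.
llm_response = '''{
    "name": "Sarah Johnson",
    "email": "sarah.johnson@example.com",
    "phone": "(555) 123-4567",
    "company": "TechCorp Industries"
}'''

data = json.loads(llm_response)
contact = ContactInfo(**data)
print(contact.model_dump())  # phone is normalized to digits only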
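The example above only exercises the success path. The following sketch shows what a caller might see when the phone validator rejects a value, assuming Pydantic v2. ContactSketch and try_parse_contact are hypothetical names, and the email field is a plain str here (an assumption made so the snippet runs without the email-validator package).

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, field_validator


class ContactSketch(BaseModel):
    """Simplified, hypothetical variant of ContactInfo for illustration."""
    name: str
    email: str  # plain str instead of EmailStr (assumption, see lead-in)
    phone: Optional[str] = None

    @field_validator("phone")
    @classmethod
    def validate_phone(cls, v):
        if v is None:
            return v
        cleaned = "".join(filter(str.isdigit, v))
        if len(cleaned) < 10:
            raise ValueError("Phone number must have at least 10 digits")
        return cleaned


def try_parse_contact(raw: str):
    """Return (contact, None) on success or (None, message) on failure."""
    try:
        return ContactSketch.model_validate_json(raw), None
    except ValidationError as exc:
        # Pydantic wraps the ValueError raised inside the validator;
        # its message is available under the "msg" key of each error.
        return None, exc.errors()[0]["msg"]


contact, message = try_parse_contact(
    '{"name": "Sarah Johnson", "email": "sarah@example.com", "phone": "555-1234"}'
)
print(contact, message)
```

Because the ValueError is converted into a ValidationError, callers need to catch only one exception type regardless of whether a built-in check or a custom validator failed.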

Practical Applications

Use Case: A customer support system can use the ContactInfo model to parse contact details that an LLM extracts from user messages.
Pitfall: Skipping validation of nested objects can leave inconsistent data in product catalogs.
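A rough sketch of nested validation for the product catalog pitfall follows, assuming Pydantic v2. The Dimensions, Product, and Catalog models, their fields, and the catalog_error_paths helper are all hypothetical; the original article's Product model is not reproduced here.

```python
from pydantic import BaseModel, Field, ValidationError


class Dimensions(BaseModel):
    """Hypothetical nested model."""
    width_cm: float = Field(gt=0)
    height_cm: float = Field(gt=0)


class Product(BaseModel):
    """Hypothetical product schema with a nested Dimensions object."""
    name: str
    price: float = Field(gt=0)
    dimensions: Dimensions


class Catalog(BaseModel):
    products: list[Product]


def catalog_error_paths(data: dict) -> list[str]:
    """Return dotted paths of every field that failed validation."""
    try:
        Catalog.model_validate(data)
    except ValidationError as exc:
        # "loc" walks from the top-level field down to the failing leaf,
        # including list indexes for items inside "products".
        return [".".join(str(p) for p in e["loc"]) for e in exc.errors()]
    return []


catalog = {
    "products": [
        {"name": "Desk", "price": 199.0,
         "dimensions": {"width_cm": 120, "height_cm": 75}},
        {"name": "Lamp", "price": -5,
         "dimensions": {"width_cm": 10}},
    ]
}
print(catalog_error_paths(catalog))
```

Declaring the nested objects as models rather than plain dicts means a single validation call checks every level, and each error path identifies the exact item and field at fault.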

References:

https://machinelearningmastery.com/the-complete-guide-to-using-pydantic-for-validating-llm-outputs/
