Python type safety in action: A beginner's guide to Pydantic

Pydantically Perfect blog series

Welcome to Pydantically Perfect, the blog series where we explore how to solve data-related problems in Python using Pydantic, a feature-rich data validation library whose core validation logic is written in Rust. Whether you're a seasoned developer or just starting out, we're hoping to give you actionable insights you can start applying right now to make your code more robust and reliable with stronger typing.

Where you are in the Pydantic for Python blog series:

You are here: A beginner's guide to Pydantic for Python type safety
Seamlessly handle non-pythonic naming conventions
Normalize legacy data in Python
Field report in progress: Declare rich validation rules
Field report in progress: Build shareable domain types
Field report in progress: Add your custom logic
Field report in progress: Apply alternative validation rules
Field report in progress: Validate your app configuration
Field report in progress: Put it all together with a FHIR example

Why Pydantic when working in Python?

At a previous client, we encountered growing pains in our different Python services as we built more of them and they kept accumulating complexity. Switching between services and keeping track of the intricacies of their data models became increasingly difficult. That pain was exacerbated when we rotated engineers between teams, as they had to rebuild their understanding from what little context they could salvage in our loosely typed code.

The direction we took to solve this was to move towards stricter typing with the intent of shifting knowledge of the business domain from our heads, documentation and ticketing system to the codebases. We planned to do this via Python's type hints and mypy, a static code analyzer.

Kyle suggested adding Pydantic to the mix, and we didn't get it at first. Eventually we had a lightbulb moment: `mypy` checks types during static analysis, before the code runs, while Pydantic checks the actual data at runtime. Combining the two covers both the code we write and the data we receive. By the end, Pydantic was a huge part of shifting our Python codebases to stricter typing, and we believe you could leverage it too.
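
To make the difference between static and runtime checks concrete, here is a minimal sketch. The class names `PlainPerson` and `ValidatedPerson` are hypothetical and only exist for this illustration; `BaseModel` is covered in more detail later in this post.

```python
from dataclasses import dataclass

from pydantic import BaseModel, ValidationError


@dataclass
class PlainPerson:
    # Hypothetical class: type hints on a dataclass are not enforced at runtime.
    first_name: str


class ValidatedPerson(BaseModel):
    # Hypothetical class: the same hint on a Pydantic model is checked at runtime.
    first_name: str


# A static analyzer such as mypy would flag this call,
# but Python itself stores the integer without complaint.
plain = PlainPerson(first_name=123)  # type: ignore[arg-type]
plain_value_type = type(plain.first_name).__name__

# Pydantic rejects the same input while the program is running.
try:
    ValidatedPerson(first_name=123)
    rejected_at_runtime = False
except ValidationError:
    rejected_at_runtime = True

print(plain_value_type, rejected_at_runtime)  # int True
```

The static check only helps with values the analyzer can see in the code. The runtime check also covers data that arrives from outside the program, such as an API payload or a file.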

What's interesting in Pydantic?

Pydantic is a data validation library, with its core validation logic written in Rust, that integrates really well with Python's type system, especially now that recent Python releases have enriched the type system's capabilities. It offers rich data validation features that ensure any Pydantic model instance matches its model definition. It's also extensible with custom code if the out-of-the-box options aren't enough for your specific needs.

The best part? Its validation rules are declarative, so there's little data validation code to write, and it's much more readable than a bunch of nested if-else statements.

Sales pitch aside, what makes it interesting for us is that, once we've validated data into Pydantic models, we have a guarantee that the data inside those models matches their definitions thanks to the Pydantic validation engine. That means far fewer headaches in our business rules around whether an object has the right type or a specific attribute, because Pydantic has already dealt with all that for us. Then, once we're ready to interact with the external, unvalidated world again, Pydantic models can easily be dumped into plain Python dictionaries or JSON.

Dumping and validating are Pydantic concepts similar to serialization and deserialization in the larger software world. If you're looking for more context, there's an explanation of why Pydantic chose the validation keyword at the top of their models page.

Pydantic features: Making the model

Let's say we want to model a person in Python. We could start with this plain Python class:

class Person:
    first_name: str
    last_name: str

Now, if we want that class to become a Pydantic model, we just have to make it inherit from the Pydantic BaseModel class:

from pydantic import BaseModel

class Person(BaseModel):
    first_name: str
    last_name: str

That was a small change! What did we gain just by doing that?

Pydantic features: Validating and dumping a model

We now have different options for creating a Person model:

# Using the constructor with keyword arguments
# We recommend using this option when we control the data like in tests or business logic
Person(first_name="John", last_name="Smith")

# Validate an arbitrary Python object
arbitrary_input = { "first_name": "John", "last_name": "Smith" }
Person.model_validate(arbitrary_input)

# Validate JSON
Person.model_validate_json('{"first_name":"John","last_name":"Smith"}')

We can also take an existing model and dump it back into either a Python dictionary or JSON:

person = Person(first_name="John", last_name="Smith")

# Dumping to a Python dictionary
person.model_dump()
# {'first_name': 'John', 'last_name': 'Smith'}

# Dumping to JSON-encoded data
person.model_dump_json()
# '{"first_name":"John","last_name":"Smith"}'
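
Since dumping and validating mirror each other, a dumped model can be validated back into an equal model. The following is a small sketch of that round trip using the same `Person` model as above; the variable names `from_dict` and `from_json` are ours.

```python
from pydantic import BaseModel


class Person(BaseModel):
    first_name: str
    last_name: str


person = Person(first_name="John", last_name="Smith")

# Dump to a dictionary, then validate it back into a model.
from_dict = Person.model_validate(person.model_dump())

# Dump to a JSON string, then validate it back into a model.
from_json = Person.model_validate_json(person.model_dump_json())

# Pydantic models compare equal when their type and field values match.
print(from_dict == person, from_json == person)  # True True
```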

Pydantic features: Data validation

We automatically receive the benefits of Pydantic's validation engine. Attempts to create an invalid Pydantic model will fail with a raised ValidationError. Here are the core rules we have to be aware of:

All fields are required unless a default value is specified.
Data types are validated. This includes None, so a field isn't nullable unless explicitly specified.

Below we have a few examples of trying to create an invalid Person with the raised exceptions. Take a moment to look at the out-of-the-box error messages and how extraordinarily helpful they are: they point directly to the problematic fields and provide all validation errors at once rather than just the first one to occur.

# The last_name field is missing
Person(first_name="John")
# pydantic_core._pydantic_core.ValidationError: 1 validation error for Person
# last_name
#   Field required [type=missing, input_value={'first_name': 'John'}, input_type=dict]
#     For further information visit https://errors.pydantic.dev/2.11/v/missing

# The first_name and last_name fields have a value of None
Person(first_name=None, last_name=None)
# pydantic_core._pydantic_core.ValidationError: 2 validation errors for Person
# first_name
#   Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
#     For further information visit https://errors.pydantic.dev/2.11/v/string_type
# last_name
#   Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
#     For further information visit https://errors.pydantic.dev/2.11/v/string_type

# The first_name and last_name fields have values of wrong data types
Person(first_name=123, last_name={"a dict": "not a string"})
# pydantic_core._pydantic_core.ValidationError: 2 validation errors for Person
# first_name
#   Input should be a valid string [type=string_type, input_value=123, input_type=int]
#     For further information visit https://errors.pydantic.dev/2.11/v/string_type
# last_name
#   Input should be a valid string [type=string_type, input_value={'a dict': 'not a string'}, input_type=dict]
#     For further information visit https://errors.pydantic.dev/2.11/v/string_type
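
Beyond the printed message, the raised `ValidationError` can also be inspected in code, which may be handy when turning errors into an API response or a log entry. Here is a minimal sketch, assuming we only care about the field location and the error type; the `problems` and `count` variables are ours.

```python
from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    first_name: str
    last_name: str


problems = []
count = 0

try:
    # first_name has the wrong type and last_name is missing
    Person.model_validate({"first_name": 123})
except ValidationError as exc:
    # errors() returns one dictionary per problem, with keys such as
    # "loc" (where the problem is) and "type" (what kind of problem it is).
    problems = [(err["loc"], err["type"]) for err in exc.errors()]
    count = exc.error_count()

print(count)
for location, error_type in problems:
    print(location, error_type)
```

Both problems are reported in a single exception, one entry per field, which matches the "all validation errors at once" behavior described above.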

Pydantic features: Nullable and/or optional fields

Now, what if we were to have an optional field? For most optional fields, we would need to do two different things:

Mark the field type as nullable with a union type like str | None.
Mark the field as optional by providing a default value. Adding metadata like this to a field is done via the Field function.

It's easy to confuse a field being nullable and a field being optional, especially since we often combine these two options. They can be applied both separately and together.

Let's add an optional and nullable middle_name field:

from pydantic import BaseModel, Field

class Person(BaseModel):
    first_name: str
    middle_name: str | None = Field(default=None)
    last_name: str

Let's see different valid ways we can create a Person now:

# Specifying a middle_name
Person(first_name="John", middle_name="Bob", last_name="Smith")
# Works with specified middle_name of "Bob"

# Omitting the middle_name
Person(first_name="John", last_name="Smith")
# Works with default middle_name of `None` because of the `Field(default=None)` metadata

# Specifying a middle_name of `None`
Person(first_name="John", middle_name=None, last_name="Smith")
# Works with specified middle_name of `None` because of the `str | None` union type specified

Pydantic features: Field constraints

The Field metadata function also allows us to define field constraints. A variety of them are included out of the box, such as string length constraints, string regular expression constraints, numerical constraints, list size constraints and a lot more. We'll make sure to provide more examples down the road, but we wanted to provide at least one example in our first post.

Here's an unexpected way to create a Person model that doesn't raise any validation errors:

Person(first_name="", middle_name="", last_name="")

It's fair to say that it's not a valid representation of a person. Let's move towards fixing that with a min_length string constraint:

from pydantic import BaseModel, Field

class Person(BaseModel):
    first_name: str = Field(min_length=1)
    middle_name: str | None = Field(default=None, min_length=1)
    last_name: str = Field(min_length=1)

# With this, we now get the following behavior:
Person(first_name="", middle_name="", last_name="")
# pydantic_core._pydantic_core.ValidationError: 3 validation errors for Person
# first_name
#   String should have at least 1 character [type=string_too_short, input_value='', input_type=str]
#     For further information visit https://errors.pydantic.dev/2.11/v/string_too_short
# middle_name
#   String should have at least 1 character [type=string_too_short, input_value='', input_type=str]
#     For further information visit https://errors.pydantic.dev/2.11/v/string_too_short
# last_name
#   String should have at least 1 character [type=string_too_short, input_value='', input_type=str]
#     For further information visit https://errors.pydantic.dev/2.11/v/string_too_short
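
For a taste of the other constraint families mentioned earlier, here is a brief sketch combining a regular expression, numerical bounds and a list size limit. The `Profile` model and its fields are hypothetical and not part of the `Person` example.

```python
from pydantic import BaseModel, Field, ValidationError


class Profile(BaseModel):
    # Regular expression constraint: exactly two uppercase letters.
    country_code: str = Field(pattern=r"^[A-Z]{2}$")
    # Numerical constraint: inclusive lower and upper bounds.
    age: int = Field(ge=0, le=150)
    # List size constraint: at most three entries.
    nicknames: list[str] = Field(default_factory=list, max_length=3)


# Valid input passes all three constraints.
Profile(country_code="CA", age=42, nicknames=["Johnny"])

error_types = {}
try:
    Profile(country_code="canada", age=-1, nicknames=["a", "b", "c", "d"])
except ValidationError as exc:
    # Map each failing field name to the type of error it produced.
    error_types = {err["loc"][0]: err["type"] for err in exc.errors()}

print(error_types)
```

Each field reports its own error type, so the caller can tell a pattern mismatch from an out-of-range number or an oversized list.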

Pydantic features: JSON schema generation

Another really powerful feature is Pydantic's ability to generate a matching JSON Schema definition for a model. This will be especially useful if a framework or library leveraging the JSON Schema specification is in the picture because now the Pydantic models can be the single source of truth by dynamically generating the schemas we need.

Comparing the Pydantic model to the generated schema, we can see that everything has been ported over, from the nullable and optional middle_name field to the minimum length we specified for all string fields.

Not all advanced Pydantic features can be ported, but we're typically safe here as long as we're not writing custom Python validation code.

from pydantic import BaseModel, Field

class Person(BaseModel):
    first_name: str = Field(min_length=1)
    middle_name: str | None = Field(default=None, min_length=1)
    last_name: str = Field(min_length=1)

Person.model_json_schema()
# {
#     "properties": {
#         "first_name": {"minLength": 1, "title": "First Name", "type": "string"},
#         "middle_name": {
#             "anyOf": [{"minLength": 1, "type": "string"}, {"type": "null"}],
#             "default": None,
#             "title": "Middle Name",
#         },
#         "last_name": {"minLength": 1, "title": "Last Name", "type": "string"},
#     },
#     "required": ["first_name", "last_name"],
#     "title": "Person",
#     "type": "object",
# }

Conclusion

We've walked together through most of the benefits you get just by switching to Pydantic models and how to add simple validation rules to your models. Of course that's just the surface, and there are a lot more features to cover that we haven't even mentioned yet like aliases, annotated types or custom validators. We're planning to work our way to progressively more advanced features in future installments.

Our goal isn't to go through all of Pydantic's features, but rather to provide a curated list of Pydantic features we found helpful when adopting it. If you're looking for a larger overview or want to know more without waiting for future posts, we encourage you to take a look at the official Pydantic documentation.

Resources

Gabriel Côté-Carrier is a senior software consultant at Test Double, and has experience in full-stack development, leading teams and teaching others.

Kyle Adams is a staff software consultant at Test Double who lives for that light bulb moment when a solution falls perfectly in place or an idea takes root.
