At a previous client, we encountered growing pains in our different Python services as we built more of them and they kept accumulating complexity. Switching between services and keeping track of the intricacies of their data models became increasingly difficult. That pain was exacerbated when we rotated engineers between teams, as they had to rebuild their understanding from what little context they could salvage in our loosely typed code.
The direction we took to solve this was to move towards stricter typing with the intent of shifting knowledge of the business domain from our heads, documentation and ticketing system to the codebases. We planned to do this via Python's type hints and mypy, a static code analyzer.
Kyle suggested adding Pydantic to the mix, and we didn't get it at first. Eventually we had a lightbulb moment: `mypy` provides type safety at static analysis time, but Pydantic provides type safety at runtime. Combining the two would cover both sides. By the end, Pydantic was a huge part of shifting our Python codebases to stricter typing, and we believe you could leverage it too.
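To make the static versus runtime distinction concrete, here is a minimal sketch. The `full_name` helper and the payload are hypothetical, and the `Person` model is the one introduced properly further down in this post.

```python
import json

from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    first_name: str
    last_name: str


def full_name(person: Person) -> str:
    return f"{person.first_name} {person.last_name}"


# A static checker such as mypy can reject a call like the one below without
# running anything, because the argument's type is visible in the source:
#     full_name({"first_name": "John"})

# Data that only exists at runtime is opaque to static analysis:
# json.loads is typed as returning Any, so mypy has nothing to check.
payload = json.loads('{"first_name": "John", "last_name": 42}')

error_count = 0
try:
    Person.model_validate(payload)
except ValidationError as error:
    # Pydantic inspects the actual value and rejects the integer last_name.
    error_count = error.error_count()

print(error_count)  # 1
```

The two tools cover different moments: `mypy` checks the code we wrote, while Pydantic checks the data we receive.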
What's interesting in Pydantic?
Pydantic is a Python data validation library, with its core validation logic written in Rust, that integrates really well with Python's type system, especially now that recent Python releases have enriched the type system's capabilities. It offers rich data validation features, ensuring that any Pydantic model instance matches its model definition. It's also extensible with custom code if the out-of-the-box options aren't enough for your specific needs.
The best part? Its validation rules are declarative, so there's little data validation code to write, and it's much more readable than a bunch of nested if-else statements.
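As a rough illustration of that difference, the sketch below puts a hand-written validator next to a declarative model. The `validate_person_by_hand` function is a hypothetical stand-in for the kind of code we used to write, not something Pydantic provides.

```python
from pydantic import BaseModel, ValidationError


def validate_person_by_hand(data: object) -> dict[str, str]:
    """Hand-rolled checks: every rule is another branch to write and maintain."""
    if not isinstance(data, dict):
        raise TypeError("expected a dict")
    for key in ("first_name", "last_name"):
        if key not in data:
            raise ValueError(f"{key} is required")
        if not isinstance(data[key], str):
            raise TypeError(f"{key} must be a string")
    return {"first_name": data["first_name"], "last_name": data["last_name"]}


class Person(BaseModel):
    """Declarative equivalent: the annotations are the validation rules."""

    first_name: str
    last_name: str


good = {"first_name": "John", "last_name": "Smith"}
# Both approaches accept the same valid input and produce the same data.
assert validate_person_by_hand(good) == Person.model_validate(good).model_dump()
```

Note that the hand-written version stops at the first problem it finds, while the model reports every failing field at once.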
Sales pitch aside, what makes it interesting for us is that, once we've validated data into Pydantic models, we have a guarantee that the data inside those models matches their definitions, thanks to the Pydantic validation engine. That means far fewer headaches in our business rules around whether an object has the right type or a specific attribute, because Pydantic has already dealt with all that for us. Then, once we're ready to interact with the external, unvalidated world again, Pydantic models can easily be dumped into Python dictionaries or JSON.
Dumping and validating are Pydantic concepts similar to serialization and deserialization in the larger software world. If you're looking for more context, there's an explanation of why Pydantic chose the validation keyword at the top of their models page.
Pydantic features: Making the model
Let's say we want to model a person in Python. We could start with this plain Python class:
class Person:
    first_name: str
    last_name: str
Now, if we want that class to become a Pydantic model, we just have to make it inherit from the Pydantic `BaseModel` class:
from pydantic import BaseModel
class Person(BaseModel):
    first_name: str
    last_name: str
That was a small change! What did we gain just by doing that?
Pydantic features: Validating and dumping a model
We now have different options for creating a `Person` model:
# Using the constructor with keyword arguments
# We recommend using this option when we control the data like in tests or business logic
Person(first_name="John", last_name="Smith")
# Validate an arbitrary Python object
arbitrary_input = { "first_name": "John", "last_name": "Smith" }
Person.model_validate(arbitrary_input)
# Validate JSON
Person.model_validate_json('{"first_name":"John","last_name":"Smith"}')
We can also take an existing model and dump it back into either a Python dictionary or JSON:
person = Person(first_name="John", last_name="Smith")
# Dumping to a Python dictionary
person.model_dump()
# {'first_name': 'John', 'last_name': 'Smith'}
# Dumping to JSON-encoded data
person.model_dump_json()
# '{"first_name":"John","last_name":"Smith"}'
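Because validating and dumping mirror each other, a dumped model can be validated straight back into an equal model. The round trip below is a small sketch of that idea using the same `Person` model.

```python
from pydantic import BaseModel


class Person(BaseModel):
    first_name: str
    last_name: str


person = Person(first_name="John", last_name="Smith")

# Dump to each format, then validate the result back into a model.
from_dict = Person.model_validate(person.model_dump())
from_json = Person.model_validate_json(person.model_dump_json())

# Pydantic models compare equal when their type and field values match.
print(from_dict == person)  # True
print(from_json == person)  # True
```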
Pydantic features: Data validation
We automatically receive the benefits of Pydantic's validation engine. Attempts to create an invalid Pydantic model will fail with a raised ValidationError.
Here are the core rules we have to be aware of:
- All fields are required unless a default value is specified.
- Data types are validated. This includes `None`, so a field isn't nullable unless explicitly specified.
Below we have a few examples of trying to create an invalid `Person`, with the raised exceptions. Take a moment to look at the out-of-the-box error messages and how extraordinarily helpful they are: they point directly to the problematic fields and provide all validation errors at once rather than just the first one to occur.
# The last_name field is missing
Person(first_name="John")
# pydantic_core._pydantic_core.ValidationError: 1 validation error for Person
# last_name
# Field required [type=missing, input_value={'first_name': 'John'}, input_type=dict]
# For further information visit https://errors.pydantic.dev/2.11/v/missing
# The first_name and last_name fields have a value of None
Person(first_name=None, last_name=None)
# pydantic_core._pydantic_core.ValidationError: 2 validation errors for Person
# first_name
# Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
# For further information visit https://errors.pydantic.dev/2.11/v/string_type
# last_name
# Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
# For further information visit https://errors.pydantic.dev/2.11/v/string_type
# The first_name and last_name fields have values of wrong data types
Person(first_name=123, last_name={"a dict": "not a string"})
# pydantic_core._pydantic_core.ValidationError: 2 validation errors for Person
# first_name
# Input should be a valid string [type=string_type, input_value=123, input_type=int]
# For further information visit https://errors.pydantic.dev/2.11/v/string_type
# last_name
# Input should be a valid string [type=string_type, input_value={'a dict': 'not a string'}, input_type=dict]
# For further information visit https://errors.pydantic.dev/2.11/v/string_type
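The same information is also available programmatically, which helps when errors need to be logged or turned into an API response. The sketch below assumes a hypothetical `summarize_errors` helper built on `ValidationError.errors()`.

```python
from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    first_name: str
    last_name: str


def summarize_errors(data: dict) -> list[tuple[str, str]]:
    """Return (field, error type) pairs, or an empty list when data is valid."""
    try:
        Person.model_validate(data)
    except ValidationError as error:
        # errors() returns one dict per failure; "loc" is a tuple path to the field.
        return [(str(item["loc"][0]), item["type"]) for item in error.errors()]
    return []


print(summarize_errors({"first_name": 123}))
# [('first_name', 'string_type'), ('last_name', 'missing')]
```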
Pydantic features: Nullable and/or optional fields
Now, what if we were to have an optional field? For most optional fields, we would need to do two different things:
- Mark the field type as nullable with a union type like `str | None`.
- Mark the field as optional by providing a default value. Adding metadata like this to a field is done via the `Field` function.
It's easy to confuse a field being nullable and a field being optional, especially since we often combine these two options. They can be applied both separately and together.
Let's add an optional and nullable `middle_name` field:
from pydantic import BaseModel, Field
class Person(BaseModel):
    first_name: str
    middle_name: str | None = Field(default=None)
    last_name: str
Let's see the different valid ways we can create a `Person` now:
# Specifying a middle_name
Person(first_name="John", middle_name="Bob", last_name="Smith")
# Works with specified middle_name of "Bob"
# Omitting the middle_name
Person(first_name="John", last_name="Smith")
# Works with default middle_name of `None` because of the `Field(default=None)` metadata
# Specifying a middle_name of `None`
Person(first_name="John", middle_name=None, last_name="Smith")
# Works with specified middle_name of `None` because of the `str | None` union type specified
Pydantic features: Field constraints
The `Field` metadata function also allows us to define field constraints. A variety of them are included out of the box, such as string length constraints, string regular expression constraints, numerical constraints, list size constraints and a lot more. We'll make sure to provide more examples down the road, but we wanted to provide at least one example in our first post.
Here's an unexpected way to create a `Person` model that doesn't raise any validation errors:
Person(first_name="", middle_name="", last_name="")
It's fair to say that it's not a valid representation of a person. Let's move towards fixing that with a `min_length` string constraint:
from pydantic import BaseModel, Field
class Person(BaseModel):
    first_name: str = Field(min_length=1)
    middle_name: str | None = Field(default=None, min_length=1)
    last_name: str = Field(min_length=1)
# With this, we now get the following behavior:
Person(first_name="", middle_name="", last_name="")
# pydantic_core._pydantic_core.ValidationError: 3 validation errors for Person
# first_name
# String should have at least 1 character [type=string_too_short, input_value='', input_type=str]
# For further information visit https://errors.pydantic.dev/2.11/v/string_too_short
# middle_name
# String should have at least 1 character [type=string_too_short, input_value='', input_type=str]
# For further information visit https://errors.pydantic.dev/2.11/v/string_too_short
# last_name
# String should have at least 1 character [type=string_too_short, input_value='', input_type=str]
# For further information visit https://errors.pydantic.dev/2.11/v/string_too_short
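The other constraint families mentioned earlier (regular expressions, numerical bounds, list sizes) follow the same pattern. The sketch below uses a hypothetical `Employee` model; the field names and limits are illustrative assumptions, not recommendations.

```python
from pydantic import BaseModel, Field, ValidationError


class Employee(BaseModel):
    """Hypothetical model; field names and limits are illustrative only."""

    # Regular expression constraint: two uppercase letters followed by four digits.
    badge: str = Field(pattern=r"^[A-Z]{2}\d{4}$")
    # Numerical constraints: ge is "greater than or equal", le is "less than or equal".
    age: int = Field(ge=18, le=120)
    # Size constraint on a list: at most three entries.
    nicknames: list[str] = Field(default_factory=list, max_length=3)


valid = Employee(badge="AB1234", age=30, nicknames=["Johnny"])

failed_types: list[str] = []
try:
    Employee(badge="ab12", age=17, nicknames=["a", "b", "c", "d"])
except ValidationError as error:
    failed_types = [item["type"] for item in error.errors()]

print(failed_types)
# ['string_pattern_mismatch', 'greater_than_equal', 'too_long']
```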
Pydantic features: JSON schema generation
Another really powerful feature is Pydantic's ability to generate a matching JSON Schema definition for a model. This will be especially useful if a framework or library leveraging the JSON Schema specification is in the picture because now the Pydantic models can be the single source of truth by dynamically generating the schemas we need.
Comparing the Pydantic model to the generated schema, we can notice that everything has been ported over, from the nullable and optional `middle_name` field to the minimum length for all string fields we specified.
Not all advanced Pydantic features can be ported, but we're typically safe here as long as we're not writing custom Python validation code.
from pydantic import BaseModel, Field
class Person(BaseModel):
    first_name: str = Field(min_length=1)
    middle_name: str | None = Field(default=None, min_length=1)
    last_name: str = Field(min_length=1)
Person.model_json_schema()
# {
# "properties": {
# "first_name": {"minLength": 1, "title": "First Name", "type": "string"},
# "middle_name": {
# "anyOf": [{"minLength": 1, "type": "string"}, {"type": "null"}],
# "default": None,
# "title": "Middle Name",
# },
# "last_name": {"minLength": 1, "title": "Last Name", "type": "string"},
# },
# "required": ["first_name", "last_name"],
# "title": "Person",
# "type": "object",
# }
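To illustrate the caveat about custom Python validation code, the sketch below adds a hypothetical capitalization rule with `field_validator`. The rule is enforced when validating, but it leaves no trace in the generated schema, whereas the declarative `min_length` does.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class Person(BaseModel):
    # Declarative constraint: exported as "minLength" in the schema.
    first_name: str = Field(min_length=1)

    # Custom Python rule (hypothetical): enforced at validation time only.
    @field_validator("first_name")
    @classmethod
    def must_be_capitalized(cls, value: str) -> str:
        if not value[0].isupper():
            raise ValueError("first_name must start with an uppercase letter")
        return value


schema = Person.model_json_schema()
print(schema["properties"]["first_name"])
# {'minLength': 1, 'title': 'First Name', 'type': 'string'}

rejected = False
try:
    Person(first_name="john")
except ValidationError:
    # The custom rule still applies at runtime, even though the schema omits it.
    rejected = True
```

Anything consuming only the schema would accept `"john"`, so rules like this one need to be communicated to schema consumers some other way.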
Conclusion
We've walked together through most of the benefits you get just by switching to Pydantic models and how to add simple validation rules to your models. Of course, that only scratches the surface: there are a lot more features we haven't covered yet, like aliases, annotated types or custom validators. We're planning to work our way through progressively more advanced features in future installments.
Our goal isn't to go through all of Pydantic's features, but rather to provide a curated list of Pydantic features we found helpful when adopting it. If you're looking for a larger overview or want to know more without waiting for future posts, we encourage you to take a look at the official Pydantic documentation.
Resources
- Official Pydantic documentation
- Python's typing documentation
- mypy's homepage
- An overview of the JSON Schema specification