Pydantically perfect: Declare rich validation rules

Welcome to Pydantically Perfect, the blog series where we explore how to solve data-related problems in Python using Pydantic, a feature-rich data validation library whose core is written in Rust. Whether you're a seasoned developer or just starting out, we hope to give you actionable insights you can apply right now to make your code more robust and reliable with stronger typing.

If you're a newcomer here, we encourage you to take a look at our first installment: Pydantically perfect: A beginner’s guide to Pydantic for Python type safety.

Where you are in the Pydantic for Python blog series:

A beginner's guide to Pydantic for Python type safety
Seamlessly handle non-pythonic naming conventions
Normalize legacy data
You are here: Declare rich validation rules
Field report in progress: Build shareable domain types
Field report in progress: Add your custom logic
Field report in progress: Apply alternative validation rules
Field report in progress: Validate your app configuration
Field report in progress: Put it all together with a FHIR example

The problem

We have data types, like UUIDs, URLs, and phone numbers, that are more complex than the primitives (int, bool, str) provided by Python. Some data types also carry additional business logic. For example, a particular ID string might need to be two leading alphabetical characters, a hyphen, and then eight digits.

from pydantic import BaseModel

class User(BaseModel):
    name: str | None = None
    id: str | None = None       # needs to be a UUID
    password: str | None = None # needs to stay secret
    email: str | None = None    # needs to be a valid email
    phone: str | None = None    # needs to be a valid phone
    code: str | None = None     # needs to follow AA-NNNNNNNN format

How do we validate these more complex types, particularly when there are custom business rules? Fortunately, Pydantic provides us with a rich library of complex types as well as the building blocks for creating our own types.

What are our goals here?

1. Keep the validation logic in our models, both in terms of code location and in Pydantic’s validation flow. The rest of our application should be able to count on email being a valid email address, with zero validation leakage.

2. Utilize everything that Pydantic gives us, out of the box, to minimize the custom code we have to write. Code is complexity and complexity is an expense.

Standard library types

Pydantic supports many of the data types included in Python’s standard library. While int or str may come immediately to mind, the supported types include others that may not be readily apparent. For example, the uuid module:

import uuid
from pydantic import BaseModel

class User(BaseModel):
    name: str | None = None
    id: uuid.UUID | None = None       # ✅ needs to be a UUID
    password: str | None = None # needs to stay secret
    email: str | None = None    # needs to be a valid email
    phone: str | None = None    # needs to be a valid phone
    code: str | None = None     # needs to follow AA-NNNNNNNN format

Here we’ve switched the type on id from str to uuid.UUID, which means this code is valid:

>>> User.model_validate({"id": "b3181580-4563-4936-9933-57b137217ce0"})

While this code, which would have been valid with our initial model, will raise a ValidationError:

>>> User.model_validate({"id": "test-id"})

Some of the other useful types from Pydantic’s standard library support include:

Literals are a really nice way to handle enumerated values without the overhead of a full-blown enum. For example, if our User had a status field where only certain values (say, “active” and “inactive”) were valid, we could do:

from typing import Literal
from pydantic import BaseModel

class User(BaseModel):
    status: Literal["active", "inactive"]
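As a quick check of how a Literal field behaves, the sketch below validates one allowed value and one value outside the list. The "archived" value is a made-up input for illustration.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class User(BaseModel):
    status: Literal["active", "inactive"]


# A value from the allowed set passes through unchanged.
ok = User.model_validate({"status": "active"})

# Anything else is rejected; "archived" is a hypothetical bad input.
try:
    User.model_validate({"status": "archived"})
except ValidationError as exc:
    literal_error_type = exc.errors()[0]["type"]
    print(literal_error_type)
```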

Pydantic types

Pydantic also has additional types defined that build on Python’s standard library, often by applying constraints to the standard types. For example, PositiveInt will ensure that the value of a particular field will always be greater than 0. Other useful examples include:

There are other types that go beyond simple constraints. SecretStr wraps a string value; when the model is printed or converted to a string via repr() or str(), it displays '**********' instead of the underlying value. This behavior helps keep passwords out of log files. We can update our model to use SecretStr:

import uuid
from pydantic import BaseModel, SecretStr

class User(BaseModel):
    name: str | None = None
    id: uuid.UUID | None = None       # ✅ needs to be a UUID
    password: SecretStr | None = None # ✅ needs to stay secret
    email: str | None = None    # needs to be a valid email
    phone: str | None = None    # needs to be a valid phone
    code: str | None = None     # needs to follow AA-NNNNNNNN format

Now our passwords won’t leak into logs:

>>> model = User.model_validate({"password": "SuperSecret"})
>>> print(model)
name=None id=None password=SecretStr('**********') email=None phone=None code=None

And our application can still access the secret value:

>>> model.password.get_secret_value()
'SuperSecret'
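The two ideas from this section, constraint types such as PositiveInt and masking types such as SecretStr, can be tried together in a few lines. This is a minimal sketch: the Account model and its login_count field are hypothetical names invented for the example.

```python
from pydantic import BaseModel, PositiveInt, SecretStr, ValidationError


class Account(BaseModel):  # hypothetical model, for illustration only
    login_count: PositiveInt
    password: SecretStr


account = Account.model_validate({"login_count": 3, "password": "SuperSecret"})

# The secret is masked in repr() and in JSON output...
print(repr(account.password))
print(account.model_dump_json())

# ...but is still available when we ask for it explicitly.
secret = account.password.get_secret_value()

# PositiveInt rejects zero and negative numbers.
try:
    Account.model_validate({"login_count": 0, "password": "x"})
except ValidationError as exc:
    positive_error_type = exc.errors()[0]["type"]
    print(positive_error_type)
```

The explicit get_secret_value() call is the design point: reading the secret becomes a deliberate act that is easy to spot in code review.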

Network types

Network types are in the same category as the types above; however, they’re extensive enough to warrant their own subcategory. Email validation is one of those problems that may seem simple at first, but is actually very complex. Additionally, mistakes in the validation could allow security attacks. Consequently, it’s a good idea to lean on proven email validators, such as the email-validator package. Pydantic has an EmailStr type that uses email-validator to securely validate our users’ email addresses. We’ll need to install email-validator first:

# via uv
$ uv add email-validator

# via pip
$ pip install email-validator

Adding the EmailStr type to the User model is a cinch:

import uuid
from pydantic import BaseModel, EmailStr, SecretStr

class User(BaseModel):
    name: str | None = None
    id: uuid.UUID | None = None       # ✅ needs to be a UUID
    password: SecretStr | None = None # ✅ needs to stay secret
    email: EmailStr | None = None     # ✅ needs to be a valid email
    phone: str | None = None    # needs to be a valid phone
    code: str | None = None     # needs to follow AA-NNNNNNNN format

Now we can rest easy, knowing that malformed or suspicious-looking email addresses are rejected during validation:

>>> User.model_validate({"email": '"><script>alert(1);</script>"@example.org'})
Traceback (most recent call last):
...
pydantic_core._pydantic_core.ValidationError: 1 validation error for User
email
  value is not a valid email address: Quoting the part before the @-sign is not allowed here....

Other network types of interest:

Pydantic extra types

We’ve looked at types based on Python’s standard library as well as types that build on the standard library, but what about information that falls outside of the standard library? That’s where Pydantic’s extra types come in: these types are often taken from external standards bodies (ISO, W3C, etc.) and sometimes have dependencies on other libraries.

The phone field in our User model is a good example: Pydantic Extra Types has a PhoneNumber type that depends on the phonenumbers package. Before we can use it, we’ll need to install the pydantic-extra-types package along with its optional dependency on phonenumbers:

# via uv
$ uv add "pydantic-extra-types[phonenumbers]"

# via pip
$ pip install -U "pydantic-extra-types[phonenumbers]"

Now we can use the PhoneNumber type in our model; note the different package on the import statement:

import uuid
from pydantic import BaseModel, EmailStr, SecretStr
from pydantic_extra_types.phone_numbers import PhoneNumber

class User(BaseModel):
    name: str | None = None
    id: uuid.UUID | None = None       # ✅ needs to be a UUID
    password: SecretStr | None = None # ✅ needs to stay secret
    email: EmailStr | None = None     # ✅ needs to be a valid email
    phone: PhoneNumber | None = None  # ✅ needs to be a valid phone
    code: str | None = None     # needs to follow AA-NNNNNNNN format

Some types, like PaymentCardNumber, are currently in the process of moving from the core pydantic package to pydantic-extra-types; be sure to use the pydantic-extra-types version.

Custom validators

What if we have business domains or types that need custom validation? Pydantic provides extension points so that we can write our own types. Pydantic’s custom validators support validation at two levels:

Field validators: Focus on validating just the target field, without needing to reference other fields.
Model validators: Validate across an entire model. Useful if, for example, you need to cross-reference another field to determine whether the target field is valid.
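Since the rest of this post concentrates on field validators, here is a minimal sketch of the model-level alternative for comparison. The Signup model and its two password fields are hypothetical; the point is only that an after model validator receives the whole validated instance and can compare fields.

```python
from pydantic import BaseModel, ValidationError, model_validator


class Signup(BaseModel):  # hypothetical model, for illustration only
    password: str
    password_confirm: str

    @model_validator(mode="after")
    def passwords_match(self) -> "Signup":
        # Runs once both fields have been validated individually.
        if self.password != self.password_confirm:
            raise ValueError("passwords do not match")
        return self


matching = Signup(password="hunter2", password_confirm="hunter2")

try:
    Signup(password="hunter2", password_confirm="hunter3")
except ValidationError as exc:
    mismatch_message = exc.errors()[0]["msg"]
    print(mismatch_message)
```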

Additionally, validators can occur at different points in Pydantic’s validation workflow:

Field validators: before/plain/after/wrap.
Model validators: before/after/wrap.
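To illustrate the difference between two of these points in the workflow, the sketch below attaches one before validator and one after validator to the same field. The Article model, the tags field, and both helper functions are assumptions made up for the example. The before validator sees the raw input, while the after validator sees the value once Pydantic has parsed it into a list of strings.

```python
from typing import Annotated

from pydantic import AfterValidator, BaseModel, BeforeValidator, ValidationError


def split_commas(value: object) -> object:
    # Before: the raw input can be anything, so we check its type ourselves.
    if isinstance(value, str):
        return [part.strip() for part in value.split(",")]
    return value


def no_duplicates(value: list[str]) -> list[str]:
    # After: Pydantic has already guaranteed this is a list of strings.
    if len(set(value)) != len(value):
        raise ValueError("tags must be unique")
    return value


class Article(BaseModel):  # hypothetical model, for illustration only
    tags: Annotated[
        list[str], BeforeValidator(split_commas), AfterValidator(no_duplicates)
    ]


article = Article.model_validate({"tags": "python, pydantic"})
print(article.tags)

try:
    Article.model_validate({"tags": "python, python"})
except ValidationError:
    duplicates_rejected = True
else:
    duplicates_rejected = False
```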

Finally, there are two patterns that can be used to apply field validators:

Annotated: uses the Annotated type to connect a function to the field it validates.
Decorator: uses @field_validator to connect a class method to the fields it validates.
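The two patterns can express the same rule. In this minimal sketch, the slug field and the must_be_lowercase function are hypothetical; both models apply the same after validation, once through Annotated and once through the @field_validator decorator.

```python
from typing import Annotated

from pydantic import AfterValidator, BaseModel, ValidationError, field_validator


def must_be_lowercase(value: str) -> str:
    if value != value.lower():
        raise ValueError("must be lowercase")
    return value


class AnnotatedStyle(BaseModel):
    # Annotated pattern: the validator travels with the type annotation.
    slug: Annotated[str, AfterValidator(must_be_lowercase)]


class DecoratorStyle(BaseModel):
    slug: str

    # Decorator pattern: the validator is a class method naming its fields.
    @field_validator("slug", mode="after")
    @classmethod
    def check_slug(cls, value: str) -> str:
        return must_be_lowercase(value)


def rejects(model_cls: type[BaseModel], slug: str) -> bool:
    """Return True if the model refuses the given slug."""
    try:
        model_cls(slug=slug)
    except ValidationError:
        return True
    return False


print(rejects(AnnotatedStyle, "Hello"), rejects(DecoratorStyle, "Hello"))
```

One practical difference: the Annotated form can be assigned to a name and reused across models, while the decorator form is tied to the class that defines it.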

Given there’s quite a bit of complexity here, we’ll focus on the custom validators that we ended up using most often: after field validators via the Annotated pattern. If you need more information on the other types of custom validators, Pydantic’s Validators Concept doc is a good starting point.

Returning to our User model, the final field we need to validate is code, a business-specific field that needs to adhere to an AA-NNNNNNNN format. That is:

✅ Starts with two alpha characters
✅ Uses a hyphen delimiter
✅ Ends with eight numeric characters

Since this field is business-specific, it’s a good candidate for a custom validator. Going further, the format doesn’t depend on other fields to determine its validity, so we can use a field validator instead of a model validator. Our last decision is where in the validation workflow the check should run; to quote Pydantic’s docs on After validators:

They are generally more type safe and thus easier to implement.

In our case, an After validator will get the job done and we won’t have to worry about the additional complexities that pop up at the other points in the workflow. Since we want to match a particular pattern, a regular expression is the perfect tool for this job:

import re

def is_valid_code(value: str) -> str:
    if not re.match(r"^[A-Za-z]{2}-\d{8}$", value):
        raise ValueError(f"Code does not match AA-NNNNNNNN: {value}")
    return value

Now we can wire that function up to our code field using an AfterValidator, like this:

import uuid
from typing import Annotated
from pydantic import AfterValidator, BaseModel, EmailStr, SecretStr
from pydantic_extra_types.phone_numbers import PhoneNumber

class User(BaseModel):
    name: str | None = None
    id: uuid.UUID | None = None       # ✅ needs to be a UUID
    password: SecretStr | None = None # ✅ needs to stay secret
    email: EmailStr | None = None     # ✅ needs to be a valid email
    phone: PhoneNumber | None = None  # ✅ needs to be a valid phone
    code: Annotated[str, AfterValidator(is_valid_code)] | None = (
        None  # ✅ needs to follow AA-NNNNNNNN format
    )

The full code for our fully-validated model:

import re
import uuid
from typing import Annotated
from pydantic import AfterValidator, BaseModel, EmailStr, SecretStr
from pydantic_extra_types.phone_numbers import PhoneNumber

def is_valid_code(value: str) -> str:
    if not re.match(r"^[A-Za-z]{2}-\d{8}$", value):
        raise ValueError(f"Code does not match AA-NNNNNNNN: {value}")
    return value

class User(BaseModel):
    name: str | None = None
    id: uuid.UUID | None = None       # ✅ needs to be a UUID
    password: SecretStr | None = None # ✅ needs to stay secret
    email: EmailStr | None = None     # ✅ needs to be a valid email
    phone: PhoneNumber | None = None  # ✅ needs to be a valid phone
    code: Annotated[str, AfterValidator(is_valid_code)] | None = (
        None  # ✅ needs to follow AA-NNNNNNNN format
    )

Conclusion: What's next for the Pydantically Perfect series?

Past posts looked at using aliases to normalize and extract legacy data. We are now equipped to wield the full power of Pydantic’s validation to verify that the extracted data is sparkling clean. These posts form the foundation for our next step: how do we aggregate fields and models into powerful domain types? To use a LEGO analogy: we’ve reviewed the various useful parts available to us, so now we’re going to start building things with those parts. The fun is just getting started!

Our goal isn't to go through all of Pydantic's features, but rather to provide a curated list of Pydantic features we found helpful when adopting it.

If you're looking for a larger overview or want to know more without waiting for future posts, we encourage you to take a look at the official Pydantic documentation.

Gabriel Côté-Carrier is a senior software consultant at Test Double, and has experience in full-stack development, leading teams, and teaching others.

Kyle Adams is a staff software consultant at Test Double who lives for that light bulb moment when a solution falls perfectly in place or an idea takes root.
