Intro to seamlessly handle non-Pythonic naming conventions

Pydantically Perfect blog series

Welcome to Pydantically Perfect, the blog series where we explore how to solve data-related problems in Python using Pydantic, a feature-rich data validation library whose core validation logic is written in Rust. Whether you're a seasoned developer or just starting out, we hope to give you actionable insights you can apply right away to make your code more robust and reliable with stronger typing.

If you're a newcomer here, we encourage you to take a look at our first installment: Pydantically perfect: A beginner’s guide to Pydantic for Python type safety

Where you are in the Pydantic for Python blog series:

A beginner's guide to Pydantic for Python type safety
You are here: Seamlessly handle non-Pythonic naming conventions
Normalize legacy data in Python
Field report in progress: Declare rich validation rules
Field report in progress: Build shareable domain types
Field report in progress: Add your custom logic
Field report in progress: Apply alternative validation rules
Field report in progress: Validate your app configuration
Field report in progress: Put it all together with a FHIR example

The validation problem: receiving data in a different naming convention

The generally accepted field naming convention in the Python ecosystem is snake_case. Our Pydantic models should reflect this to be consistent with the rest of our code base.

What if we get data from other APIs that use camelCase instead? This will require some tweaks on our Pydantic models.

Let's imagine a basic Person model with a few fields using the standard snake_case naming convention. Validating camelCase data against it raises an error because the validation engine can't match the incoming keys to our field names:

from pydantic import BaseModel

class Person(BaseModel):
    first_name: str
    last_name: str

data = {
    "firstName": "John",
    "lastName": "Smith",
}

Person.model_validate(data)
# pydantic_core._pydantic_core.ValidationError: 2 validation errors for Person
# first_name
#   Field required [type=missing, input_value={'firstName': 'John', 'lastName': 'Smith'}, input_type=dict]
#     For further information visit https://errors.pydantic.dev/2.11/v/missing
# last_name
#   Field required [type=missing, input_value={'firstName': 'John', 'lastName': 'Smith'}, input_type=dict]
#     For further information visit https://errors.pydantic.dev/2.11/v/missing
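
When debugging a mismatch like this, it can help to inspect the error programmatically rather than reading the traceback. A minimal sketch, reusing the same Person model and data, that lists the fields Pydantic reports as missing (the missing_fields name is only for illustration):

```python
from pydantic import BaseModel, ValidationError

class Person(BaseModel):
    first_name: str
    last_name: str

data = {
    "firstName": "John",
    "lastName": "Smith",
}

try:
    Person.model_validate(data)
except ValidationError as error:
    # Each entry of errors() is a dict; "loc" is a tuple locating the field
    missing_fields = [
        details["loc"][0]
        for details in error.errors()
        if details["type"] == "missing"
    ]

print(missing_fields)
# ['first_name', 'last_name']
```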

How do we solve this? We'll walk through different solutions and rank them from worst to best.

Worst solution: break our naming convention

The quickest way to get to green with what we know so far would be to break our naming convention and rename our model's fields to camelCase:

from pydantic import BaseModel

class Person(BaseModel):
    firstName: str
    lastName: str

data = {
    "firstName": "John",
    "lastName": "Smith",
}

Person.model_validate(data)
# Person(firstName='John', lastName='Smith')

This isn't the way to go because it allows this external complexity to creep into our code everywhere that we use a Person object.

Instead, let's lean on Pydantic features to handle the different naming convention for us.

Better solution: use field aliases

The Pydantic Field metadata enables us to set aliases. These aliases are alternative names available to Pydantic when validating and serializing.

For example, here we could define aliases for our two fields like so:

from pydantic import BaseModel, Field

class Person(BaseModel):
    first_name: str = Field(alias="firstName")
    last_name: str = Field(alias="lastName")

data = {
    "firstName": "John",
    "lastName": "Smith",
}

Person.model_validate(data)
# Person(first_name='John', last_name='Smith')

Now, the rest of our code outside of Pydantic doesn't have to care about the external camelCase naming convention and we can stay true to our snake_case naming convention. That external complexity is contained solely within our Pydantic layer.

That said, the drawback of this method is that it requires an alias to be defined for every field. This small example doesn't look like much, but aliasing 20 different models with 10 fields each could quickly become a bother. If only there were a way to automate this…

Best solution: use alias generators

Of course, Pydantic has a way to automate this: alias generators.

Every Pydantic model has a model_config attribute that enables us to set configuration for the whole model by assigning it a ConfigDict. One of these configuration options is to pass in an alias_generator function that will automatically build aliases for all the fields in the model. Pydantic also offers a few out-of-the-box functions for our convenience, including a to_camel function that transforms the names into camelCase.

Let's see what it might look like:

from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel

class Person(BaseModel):
    model_config = ConfigDict(alias_generator=to_camel)

    first_name: str
    last_name: str

data = {
    "firstName": "John",
    "lastName": "Smith",
}

Person.model_validate(data)
# Person(first_name='John', last_name='Smith')

This works well for us. There's no need now to specify aliases individually and all of our fields are converted to be validated from camelCase with just a single line of code.

What if one of the automatically generated aliases doesn't match the incoming data? We can override a generated alias by defining a field alias like in the previous solution; by default, an explicitly defined alias takes precedence over the generated one. This lets us handle edge cases without losing the automatic alias generation for the other fields.
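
As a rough sketch of that edge case, assume a hypothetical home_url field whose incoming key is "homeURL" rather than the "homeUrl" that to_camel would generate. The field name and sample data are made up for illustration:

```python
from pydantic import BaseModel, ConfigDict, Field
from pydantic.alias_generators import to_camel

class Person(BaseModel):
    model_config = ConfigDict(alias_generator=to_camel)

    # Alias generated by to_camel: "firstName"
    first_name: str
    # to_camel would generate "homeUrl"; the explicit alias is used instead
    home_url: str = Field(alias="homeURL")

data = {
    "firstName": "John",
    "homeURL": "https://example.com",
}

person = Person.model_validate(data)
# Person(first_name='John', home_url='https://example.com')
```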

Taking it further: having a shared model configuration

What if we had 20 different models with that same problem? We would still need one line of code per model to apply that configuration. How could we simplify that if we consider it to be too much?

Pydantic enables us to share model configuration by defining our own parent class. In practice, we could create an OurBaseModel class that inherits from Pydantic's own BaseModel and sets the alias_generator configuration. Then, any new model inheriting from OurBaseModel would have that alias_generator by default.

It would look like this:

from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel

class OurBaseModel(BaseModel):
    model_config = ConfigDict(alias_generator=to_camel)

class Person(OurBaseModel):
    first_name: str
    last_name: str

data = {
    "firstName": "John",
    "lastName": "Smith",
}

Person.model_validate(data)
# Person(first_name='John', last_name='Smith')

Now, we could create our 20 models by inheriting directly from our OurBaseModel class, and they would all correctly handle data in camelCase.
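
A model that needs extra settings can still declare its own model_config, because Pydantic merges it with the parent's configuration instead of replacing it. A small sketch, using the str_strip_whitespace option purely as an illustrative extra setting:

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel

class OurBaseModel(BaseModel):
    model_config = ConfigDict(alias_generator=to_camel)

class Person(OurBaseModel):
    # Merged with the parent's configuration, so alias_generator still applies
    model_config = ConfigDict(str_strip_whitespace=True)

    first_name: str
    last_name: str

data = {
    "firstName": "  John ",
    "lastName": "Smith",
}

person = Person.model_validate(data)
# Person(first_name='John', last_name='Smith')
```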

Outside the box: custom alias generators

Pydantic offers the following three alias generators out-of-the-box:

to_pascal, for PascalCase
to_camel, for camelCase
to_snake, for snake_case
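
These generators are plain functions from string to string, so a quick way to get a feel for them is to call them directly:

```python
from pydantic.alias_generators import to_camel, to_pascal, to_snake

pascal_name = to_pascal("first_name")
# 'FirstName'

camel_name = to_camel("first_name")
# 'firstName'

snake_name = to_snake("firstName")
# 'first_name'
```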

What happens if we encounter a naming convention that isn't covered, like kebab-case or Pascal_Snake_Case? We can just make our own!

The alias_generator configuration accepts a function, so we can write a custom name transformation function and pass it in. Here's an example for kebab-case:

from pydantic import BaseModel, ConfigDict

def to_kebab(name: str) -> str:
    return name.replace("_", "-")

class Person(BaseModel):
    model_config = ConfigDict(alias_generator=to_kebab)

    first_name: str
    last_name: str

data = {
    "first-name": "John",
    "last-name": "Smith",
}

Person.model_validate(data)
# Person(first_name='John', last_name='Smith')
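
The same approach should work for Pascal_Snake_Case. Here is a minimal sketch with a hypothetical to_pascal_snake helper, assuming the field names are plain lowercase snake_case:

```python
from pydantic import BaseModel, ConfigDict

def to_pascal_snake(name: str) -> str:
    # Capitalize each underscore-separated word: first_name -> First_Name
    return "_".join(word.capitalize() for word in name.split("_"))

class Person(BaseModel):
    model_config = ConfigDict(alias_generator=to_pascal_snake)

    first_name: str
    last_name: str

data = {
    "First_Name": "John",
    "Last_Name": "Smith",
}

person = Person.model_validate(data)
# Person(first_name='John', last_name='Smith')
```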

The serialization problem: sending data in another naming convention

We've explored aliases and alias generation in the context of accepting incoming data in another naming convention, but another likely context would be serializing to another naming convention.

For example, our REST APIs are expected company-wide to use camelCase. In that case, we'd want to serialize to that standard but still use snake_case in our Python code. This will require a bit more explanation because of Pydantic's default behaviors, so bear with me while we take a little detour.

Pydantic's intent is to try to provide us with useful default behaviors, but still empower us to configure it differently if our use case is a bad match. Being aware of these configuration options and when we should use them will be very helpful.

When dealing with aliases, Pydantic's default behavior isn't consistent across validation and serialization:

Validation: Use the field alias
Serialization: Use the field name

For our use case of serializing to another naming convention, we will need to override Pydantic's default behavior somewhere.

We can override these default behaviors at two different levels:

The function call, by passing by_alias or by_name arguments to model_validate, or a by_alias argument to model_dump.
The model configuration, by setting the validate_by_alias, validate_by_name, or serialize_by_alias boolean flags in model_config. For better or worse, these flags have a larger impact radius because they affect every use of that model.

If we wanted to serialize with field aliases at the function call level, it would look like this:

from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel

class Person(BaseModel):
    model_config = ConfigDict(alias_generator=to_camel)

    first_name: str
    last_name: str

data = {
    "firstName": "John",
    "lastName": "Smith",
}

person = Person.model_validate(data)

# Default behavior
person.model_dump()
# {'first_name': 'John', 'last_name': 'Smith'}

# Explicitly asking to use aliases
person.model_dump(by_alias=True)
# {'firstName': 'John', 'lastName': 'Smith'}

If we wanted to serialize with field aliases at the model configuration level, it would look like this:

from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel

class Person(BaseModel):
    model_config = ConfigDict(
        alias_generator=to_camel,
        serialize_by_alias=True,  # Overridden default behavior here
    )

    first_name: str
    last_name: str

data = {
    "firstName": "John",
    "lastName": "Smith",
}

person = Person.model_validate(data)

# Overridden default behavior
person.model_dump()
# {'firstName': 'John', 'lastName': 'Smith'}

# We can still explicitly serialize by name
person.model_dump(by_alias=False)
# {'first_name': 'John', 'last_name': 'Smith'}
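
For a REST API response we usually want a JSON string rather than a dictionary. The same by_alias argument is available on model_dump_json, as in this short sketch (the payload variable name is only for illustration):

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel

class Person(BaseModel):
    model_config = ConfigDict(alias_generator=to_camel)

    first_name: str
    last_name: str

data = {
    "firstName": "John",
    "lastName": "Smith",
}

person = Person.model_validate(data)

# Serialize straight to a JSON string using the aliases
payload = person.model_dump_json(by_alias=True)
# '{"firstName":"John","lastName":"Smith"}'
```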

A gotcha: Pydantic aliases and Python constructors

One of the gotchas of the default validation behavior is that having aliases will break the model constructor. Passing in the field names in our Python code while the validation process looks for field aliases will raise a validation error:

from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel

class Person(BaseModel):
    model_config = ConfigDict(alias_generator=to_camel)

    first_name: str
    last_name: str

Person(first_name="John", last_name="Smith")
# pydantic_core._pydantic_core.ValidationError: 2 validation errors for Person
# firstName
#   Field required [type=missing, input_value={'first_name': 'John', 'last_name': 'Smith'}, input_type=dict]
#     For further information visit https://errors.pydantic.dev/2.11/v/missing
# lastName
#   Field required [type=missing, input_value={'first_name': 'John', 'last_name': 'Smith'}, input_type=dict]
#     For further information visit https://errors.pydantic.dev/2.11/v/missing
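
Conversely, under the default behavior the constructor does accept the aliases as keyword arguments, which is rarely what we want to write in our Python code. A short sketch:

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel

class Person(BaseModel):
    model_config = ConfigDict(alias_generator=to_camel)

    first_name: str
    last_name: str

# With the default behavior, the constructor takes the aliases as keyword arguments
person = Person(firstName="John", lastName="Smith")
# Person(first_name='John', last_name='Smith')
```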

To solve this, we have essentially two options:

Switch the validation behavior to use field names by default rather than aliases, mirroring the serialization behavior. The tradeoff is that we'll need to explicitly validate by alias when needed.
Configure the validation behavior to look for both field names and aliases. Pydantic would look at both for each field and take the first one that exists and contains a valid value. That option is great if we're comfortable with the tradeoff of validation being less strict in all cases.

Here's how we can configure the model to validate by field name instead of field aliases:

from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel

class Person(BaseModel):
    model_config = ConfigDict(
        alias_generator=to_camel,
        # Overridden default behavior: validate by field name, not by alias
        validate_by_name=True,
        validate_by_alias=False,
    )

    first_name: str
    last_name: str

# Using the constructor works!
Person(first_name="John", last_name="Smith")
# Person(first_name='John', last_name='Smith')

# Using data matching the aliases also works
# as long as we specify to use the alias!
data = {
    "firstName": "John",
    "lastName": "Smith",
}

person = Person.model_validate(data, by_alias=True)
# Person(first_name='John', last_name='Smith')

Here's how we can configure the model to validate with both field names and field aliases:

from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel

class Person(BaseModel):
    model_config = ConfigDict(
        alias_generator=to_camel,

        # Validate with field names AND aliases
        validate_by_name=True,
        validate_by_alias=True,
    )

    first_name: str
    last_name: str

# Using the constructor works!
Person(first_name="John", last_name="Smith")
# Person(first_name='John', last_name='Smith')

# Using data matching the aliases also works!
data = {
    "firstName": "John",
    "lastName": "Smith",
}

person = Person.model_validate(data)
# Person(first_name='John', last_name='Smith')

Conclusion: what's next for the Pydantically Perfect series?

Now that we've covered using aliases to manage different naming conventions, it'll be worthwhile to deepen our coverage of features related to aliases. The next problem we'll be walking through will be leveraging aliases to normalize inconsistent data. This issue can typically happen when we receive data from different systems or when data formats change over time.

Our goal isn't to go through all of Pydantic's features, but rather to provide a curated list of Pydantic features we found helpful when adopting it. If you're looking for a larger overview or want to know more without waiting for future posts, we encourage you to take a look at the Official Pydantic documentation.

Gabriel Côté-Carrier is a senior software consultant at Test Double, and has experience in full-stack development, leading teams and teaching others.

Kyle Adams is a staff software consultant at Test Double who lives for that light bulb moment when a solution falls perfectly in place or an idea takes root.