Hidden powers of pydantic
Discover some key features of pydantic you may not be aware of.
For those who don’t know pydantic, it is a library for data validation leveraging the type annotations feature added in python3.5. Here is an example taken from the documentation.
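As a minimal sketch of what a pydantic model looks like (the User model and its fields are illustrative and not copied verbatim from the documentation):

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    id: int
    name: str = "John Doe"  # optional field with a default value


# The string "123" is converted to the integer 123 during validation.
user = User(id="123")
print(user.id, user.name)

try:
    User(id="not a number")
except ValidationError as exc:
    # The error message lists each invalid field and the reason.
    print(exc)
```

The model declares the expected types once, and pydantic takes care of both conversion and error reporting.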
If you don’t use it yet, you should definitely give it a try: it is an amazing library!
Here I want to point out three features that I think many people don’t use.
Validation decorator
Pydantic provides a decorator, validate_arguments, which validates the arguments passed to a function based on its type annotations. The feature is still in beta, but having used it in a recent project, I think it is worth adopting. Let’s look at an example where we compute the distance between two points after checking the points passed as input.
Without pydantic, you probably have written something equivalent to this:
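A minimal sketch of such a hand-written version follows, assuming a Point dataclass with x and y fields and a Euclidean distance (the error messages are illustrative):

```python
import math
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


def compute_distance(p1: Point, p2: Point) -> float:
    # Boilerplate: make sure both arguments are really Point instances.
    if not isinstance(p1, Point):
        raise TypeError(f"p1 must be a Point, got {type(p1).__name__}")
    if not isinstance(p2, Point):
        raise TypeError(f"p2 must be a Point, got {type(p2).__name__}")
    # The actual work: Euclidean distance between the two points.
    return math.hypot(p2.x - p1.x, p2.y - p1.y)


print(compute_distance(Point(0, 0), Point(3, 4)))  # 5.0
```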
Notes:
- This example will work fine starting with python3.7 where dataclasses appeared first.
- Inside the function compute_distance, the first lines of the body only check that the arguments given are really Point instances.
The same example can be written with the validate_arguments decorator.
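A minimal sketch of the decorated version, under the same assumptions as before (a Point dataclass with x and y fields). Newer pydantic releases (v2) offer validate_call as the successor to validate_arguments; the sketch keeps the name used in this article.

```python
import math
from dataclasses import dataclass

from pydantic import ValidationError, validate_arguments


@dataclass
class Point:
    x: float
    y: float


@validate_arguments
def compute_distance(p1: Point, p2: Point) -> float:
    # No manual isinstance checks: pydantic validates p1 and p2 for us.
    return math.hypot(p2.x - p1.x, p2.y - p1.y)


try:
    print(compute_distance(Point(0, 0), Point(3, 4)))
    # A plain string cannot be turned into a Point, so this call fails.
    compute_distance(Point(0, 0), "not a point")
except ValidationError as exc:
    print(exc)
```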
Notes:
- The body of compute_distance is now simplified. We no longer have boilerplate code to check the arguments. Instead, we can focus on the main purpose of this function making it simpler and more readable :)
- The only place to check for errors is where we use the function. We need to try/except pydantic ValidationError which is raised if an argument is invalid. Try to pass an invalid value and see how the error is displayed. Pydantic makes it easier to understand what is wrong with an intuitive error message.
- A side effect of using the pydantic decorator is that it works just like the pydantic BaseModel, i.e. it will try to convert the given data if it doesn’t have the expected type. In concrete terms, a tuple value like (1, 2) or a dictionary like {'x': 3, 'y': 5} will not raise an error because they can be converted to a Point dataclass instance. You should be aware of that. 😉
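The conversion described in the last note can be sketched as follows. Only the dictionary case is shown, on the assumption that it is the most portable one; how tuples are handled may depend on the pydantic version you have installed.

```python
import math
from dataclasses import dataclass

from pydantic import validate_arguments


@dataclass
class Point:
    x: float
    y: float


@validate_arguments
def compute_distance(p1: Point, p2: Point) -> float:
    return math.hypot(p2.x - p1.x, p2.y - p1.y)


# Dictionaries with matching keys are converted to Point instances
# before the function body runs, so no error is raised here.
distance = compute_distance({"x": 0, "y": 0}, {"x": 3, "y": 4})
print(distance)  # 5.0
```

If silent conversion is not what you want, keep this behaviour in mind when deciding where to apply the decorator.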
Ok, let’s take the same example with another approach. Now we want to compute the hypotenuse length of a right triangle using the Pythagorean theorem. The function will take as input the length of the two other sides. Here is an example implementation without the pydantic decorator.
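A minimal sketch of such an implementation, assuming a function named hypotenuse with arguments a and b (the names and error messages are illustrative):

```python
import math
from typing import Union


def hypotenuse(a: Union[int, float], b: Union[int, float]) -> float:
    # Type checks: only int and float make sense for this operation.
    if not isinstance(a, (int, float)):
        raise TypeError("a must be an int or a float")
    if not isinstance(b, (int, float)):
        raise TypeError("b must be an int or a float")

    # Value checks: a side length must be strictly positive.
    if a <= 0:
        raise ValueError("a must be greater than 0")
    if b <= 0:
        raise ValueError("b must be greater than 0")

    # The only line that does real work.
    return math.sqrt(a**2 + b**2)


print(hypotenuse(3, 4))  # 5.0
```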
Notes:
- First, we check that the arguments are int or float, the only valid types for our operation.
- Next, we check that the values are valid (a side length must be greater than 0).
- Only the last line performs the operation and returns the value.
So much validation effort for just one effective line… we can do better!
Here is an enhanced version using the decorator validate_arguments.
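A minimal sketch of that version, again assuming the illustrative names hypotenuse, a and b:

```python
import math

from pydantic import Field, ValidationError, validate_arguments


@validate_arguments
def hypotenuse(
    a: float = Field(..., gt=0),  # ... marks the argument as mandatory
    b: float = Field(..., gt=0),  # gt=0 requires a strictly positive value
) -> float:
    return math.sqrt(a**2 + b**2)


try:
    print(hypotenuse(3, 4))
    # A negative length violates the gt=0 constraint.
    hypotenuse(-1, 4)
except ValidationError as exc:
    print(exc)
```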
Notes:
- Notice that in the definition of the function, we no longer use a Union annotation because pydantic will convert integers to float values.
- We add metadata to the arguments by assigning them the return value of a special pydantic function named Field (yes, it is a function and not a class, even though it starts with a capital letter). The first value the function takes is a default value for the argument. In our case, we don’t want to assign a default value, so we pass the ellipsis value to tell pydantic that this argument is mandatory. For more information, you can look at this section of the official documentation.
- We also specified in the Field function that the value must be greater than 0 using the gt argument. A detailed list of Field arguments is described on this page.
- Look at the body of the function, only one line of code! This is really amazing, we save a lot of energy for the things that matter.💃🕺
- Again, we need to check for errors when using our function. In our case, we wrap the call in a try/except that catches the pydantic ValidationError.
Before continuing to the next section, there is a little caveat with our implementation. Assigning a value as we did will confuse many linters and static type checkers, because the arguments are mandatory and yet appear to have a default value.
There is a way to avoid it using the Annotated type introduced in python3.9. If you are using a previous version of python, pydantic provides a backport you can use. So our final implementation will be the following:
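A minimal sketch of the Annotated variant. It assumes Python 3.9+ and imports Annotated from the standard library; on older interpreters the typing_extensions package provides the same name.

```python
import math
from typing import Annotated  # Python 3.9+

from pydantic import Field, ValidationError, validate_arguments


@validate_arguments
def hypotenuse(
    a: Annotated[float, Field(gt=0)],  # real type first, then the metadata
    b: Annotated[float, Field(gt=0)],
) -> float:
    return math.sqrt(a**2 + b**2)


try:
    print(hypotenuse(3, 4))
    hypotenuse(0, 4)  # 0 is not strictly greater than 0
except ValidationError as exc:
    print(exc)
```

Since no value is assigned to the arguments any more, static type checkers see two plain mandatory float parameters.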
Notes:
- We import Annotated from pydantic. But if you are using python3.9+, you can directly import it from the typing standard library.
- In the function signature, we no longer assign a value to the arguments. Instead, we use the Annotated type, passing the real type as its first argument (float in our case), followed by pydantic metadata used for validation.
- Note that in the Field definition, we don’t pass the ellipsis value because we are no longer assigning the return value of the Field function to the argument; we just define a constraint on the minimum value allowed.
Pydantic dataclasses
Looking at our first example, more specifically the Point dataclass, what do you think will happen if you define an object like this: Point('foo', 2)? The answer is that python will not complain at all because the dataclass decorator does not perform argument validation. What if we could bring the power of pydantic BaseModel into dataclasses?
Well, pydantic has its own version of dataclass that adds validation to arguments!
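A minimal sketch, reusing the Point class from the first example:

```python
from pydantic.dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float


# Valid input: the integers are accepted and converted to floats.
print(Point(1, 2))
```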
We just need to change the dataclass import from dataclasses to pydantic.dataclasses and we’re done! Try replacing the last line with a bad example like Point('foo', 2) and see what happens.
Sometimes the dataclasses you work with are defined in code that is out of your control. If you still want to validate their arguments, pydantic allows you to do that by simply wrapping the built-in dataclass in a pydantic dataclass.
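A minimal sketch of the wrapping approach. LegacyPoint is a hypothetical class standing in for a dataclass defined in code you cannot modify, and ValidatedPoint is an illustrative name for the wrapped version.

```python
import dataclasses

import pydantic.dataclasses
from pydantic import ValidationError


# Hypothetical stand-in for a dataclass that lives in third-party code.
@dataclasses.dataclass
class LegacyPoint:
    x: float
    y: float


# Wrap the built-in dataclass to get a validating version of it.
ValidatedPoint = pydantic.dataclasses.dataclass(LegacyPoint)

print(ValidatedPoint(1, 2))

try:
    ValidatedPoint("foo", 2)
except ValidationError as exc:
    print(exc)
```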
Beware that pydantic dataclasses are not exactly BaseModel classes; there are some subtle differences between the two. More information can be found in this issue on GitHub.
Secret variables
This is a bonus section. Often when we log actions from our system (web application, cli, etc…) we don’t want to leak sensitive information like passwords, credit cards, etc… Did you know that pydantic has some types made for that?
You have SecretStr and SecretBytes to mask strings and bytes you consider sensitive, like passwords.
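A minimal sketch with SecretStr (the Account model and its values are illustrative; SecretBytes works along the same lines for bytes):

```python
from pydantic import BaseModel, SecretStr


class Account(BaseModel):
    username: str
    password: SecretStr


account = Account(username="alice", password="s3cr3t")

# Printing or logging the model does not reveal the password.
print(account)
print(account.password)  # **********

# The real value is only available through an explicit call.
print(account.password.get_secret_value())  # s3cr3t
```

Having to call get_secret_value() makes every access to the real value explicit and easy to spot in a code review.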
For the special case of the payment card number, you also have a convenient class PaymentCardNumber.
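A minimal sketch, assuming PaymentCardNumber can be imported from pydantic.types as in the pydantic v1 series (later releases deprecate it in favour of the separate pydantic-extra-types package). The Payment model is illustrative and the card number is a well-known test value, not a real card.

```python
from pydantic import BaseModel
from pydantic.types import PaymentCardNumber


class Payment(BaseModel):
    card_number: PaymentCardNumber


payment = Payment(card_number="4000000000000002")

# The masked property keeps only the first six and last four digits.
print(payment.card_number.masked)  # 400000******0002
print(payment.card_number.last4)   # 0002
```

Note that the masking is opt-in here: log the masked property rather than the field itself.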
This is all for this tutorial, hope you enjoyed it. Take care of yourself and see you next time! 😁