A brief history of data classes in Python
How dataclasses came into the Python landscape
In this tutorial, we will review the different ways to create data classes in Python, from the oldest to the newest. Hopefully, by the end, you will be convinced to use pydantic dataclasses as your default way to create data classes.
Default Python class creation
The oldest way to define a class in Python is through the special __init__ method. For our example, we will work with a Point class that takes x and y coordinates as input.
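As a minimal sketch of what such a class can look like (the type annotations are an assumption on my part; Python does not enforce them at runtime):

```python
class Point:
    def __init__(self, x: int, y: int) -> None:
        # Every argument is repeated as an attribute assignment.
        self.x = x
        self.y = y


p = Point(1, 2)
print(p.x, p.y)  # 1 2
print(p)  # something like <__main__.Point object at 0x...>
```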
As you can see, there is some boilerplate code, because the same variables are defined both as method arguments and as instance attributes (x and y). When we print the created object to see what it looks like, we get the unhelpful default representation provided by Python.
Even worse, we can instantiate an object with Point(1, 'foo') and Python will not complain at all.
Sure, you could say that tools like mypy or pyright will help you catch the bug, but not everybody wants to use them, so we have to find another way.
Also, the default class implementation does not define any comparison methods, so p == p2 returns False even when the two objects hold the same attribute values, because the default comparison falls back to object identity. 🥲
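The last two problems can be reproduced with a short sketch, assuming the same plain Point class as before (the names p, p2 and bad are only for illustration):

```python
class Point:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y


# The annotation says int, but nothing checks it at runtime.
bad = Point(1, "foo")
print(bad.y)  # foo

# Same attribute values, yet the objects are not considered equal.
p = Point(1, 2)
p2 = Point(1, 2)
print(p == p2)  # False
```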
The following code shows how we can fix these three issues.
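One possible handwritten version, as a sketch: the error message and the choice of TypeError are my own assumptions, not the only way to do it.

```python
class Point:
    def __init__(self, x: int, y: int) -> None:
        # Reject non-integer coordinates at construction time.
        for name, value in (("x", x), ("y", y)):
            if not isinstance(value, int):
                raise TypeError(f"{name} must be an int, got {type(value).__name__}")
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        # Readable representation instead of the default one.
        return f"Point(x={self.x}, y={self.y})"

    def __eq__(self, other: object) -> bool:
        # Only compare with other Point instances.
        # Note: defining __eq__ without __hash__ makes instances unhashable.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = Point(1, 2)
p2 = Point(1, 2)
print(p)        # Point(x=1, y=2)
print(p == p2)  # True

try:
    Point(1, "foo")
except TypeError as exc:
    print(exc)  # y must be an int, got str
```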
Now we have a working class with a pretty representation, but look at the amount of code we have to write... And we haven't even implemented all the comparison methods.
I will make a small digression on namedtuples, because one can argue that they are a quick way to declare a class. That is true, but not without some caveats…
Consider the following example:
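A small sketch of the idea, assuming collections.namedtuple (typing.NamedTuple behaves the same way for the points discussed here):

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])

p = Point(1, 2)
print(p)  # Point(x=1, y=2)

# No type verification: any value is accepted.
bad = Point(1, "foo")
print(bad)  # Point(x=1, y='foo')

# A namedtuple is a tuple, so it compares equal to a plain tuple.
print(p == (1, 2))  # True
```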
OK, with namedtuples we get a pretty representation by default, but we suffer from the same lack of type verification as with the default way of creating classes. And since a namedtuple inherits from the tuple class, the comparison with a plain tuple in the last instruction evaluates to True, which is not always what we want.
Taking into account the class creation problems described above, a well-known Pythonista decided to provide a solution with a library called attrs. Let's see how we can rewrite our Point class with it.
OK, we clearly see a difference with the handwritten class above. We have:
- A pretty default representation
- Type verification using the field function and validators.
- A default comparison implementation that takes into account the type of the compared objects. This is why the test with a tuple returns False.
It is a well-thought-out library, and it can be customized in different ways, such as defining slots, frozen classes, keyword-only arguments and more.
Since Python 3.7, the standard library ships its own answer: the dataclasses module. Let's see what our famous Point class looks like with it:
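A minimal sketch with the standard library dataclasses module (the variable names are only for illustration):

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


p = Point(1, 2)
print(p)                 # Point(x=1, y=2)
print(p == Point(1, 2))  # True
print(p == (1, 2))       # False

# The annotations are not enforced: this is accepted silently.
bad = Point(1, "foo")
print(bad)  # Point(x=1, y='foo')
```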
We have almost the same advantages as with the attrs definition, except that argument types are not verified at initialization time.
To get this verification with the standard library alone, we have to write it ourselves, typically in the __post_init__ method:
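A sketch of manual validation, assuming we hook into __post_init__, which the generated __init__ calls after assigning the fields (the error message is my own choice):

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int

    def __post_init__(self) -> None:
        # Called automatically at the end of the generated __init__.
        for name in ("x", "y"):
            value = getattr(self, name)
            if not isinstance(value, int):
                raise TypeError(f"{name} must be an int, got {type(value).__name__}")


print(Point(1, 2))  # Point(x=1, y=2)

try:
    Point(1, "foo")
except TypeError as exc:
    print(exc)  # y must be an int, got str
```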
Yeah, it sucks a little, but it was a deliberate choice by the CPython maintainers to ship a simplified version of attrs, without validation and other goodies.
Finally, we will talk about pydantic, a data validation library made famous by a relatively young web framework, FastAPI. If you don’t know it, I highly recommend checking out its API; it is another well-thought-out piece of software. I also wrote a blog post presenting some of its advantages.
The feature that interests us in this article is its dataclasses. Again, we will look at our Point class implementation. 😁
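A sketch with pydantic.dataclasses, assuming pydantic is installed; the exact text of the validation error depends on the pydantic version, so it is not reproduced here:

```python
from pydantic import ValidationError
from pydantic.dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


p = Point(1, 2)
print(p)                 # Point(x=1, y=2)
print(p == Point(1, 2))  # True
print(p == (1, 2))       # False

try:
    Point(1, "foo")
except ValidationError as exc:
    # The message lists the invalid field; its format varies by version.
    print(exc)
```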
We have all the advantages of the attrs implementation, and the code is even shorter!
pydantic leverages type annotations to validate data, and we can still use the API provided by the standard library dataclasses module, such as field, asdict, etc., because pydantic.dataclasses is just a wrapper around the standard one. As proof, look at the following example:
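A sketch of the wrapping, assuming a reasonably recent pydantic release; ValidatedPoint and the default value for y are hypothetical names and choices made for this illustration:

```python
import dataclasses

from pydantic import ValidationError
from pydantic.dataclasses import dataclass as pydantic_dataclass


# A plain standard library dataclass, using the stdlib field helper.
@dataclasses.dataclass
class Point:
    x: int
    y: int = dataclasses.field(default=0)


# Wrap it to get a validating version.
ValidatedPoint = pydantic_dataclass(Point)

p = ValidatedPoint(1, 2)
print(dataclasses.asdict(p))  # {'x': 1, 'y': 2}

try:
    ValidatedPoint(1, "foo")
except ValidationError:
    print("validation failed")  # validation failed
```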
Here we wrap a normal dataclass in a pydantic one, and we get the same verifications and features!
Pydantic is a fantastic library (yes, I am a little biased), and I can only recommend that you check out its documentation.
That is all for this tutorial; I hope you enjoyed it. Take care of yourself and see you next time! 😁