The Beauty of DTOs

What is a Data Transfer Object?

The basic idea of using DTOs is to separate data from business logic and app-specific functionality. This is especially interesting in cases where you need to pass data to a different layer (e.g. from the Model layer to the View layer) or receive data from a different service (external API, microservice, ...).

In the end, a DTO is just a simple data structure that holds various properties of built-in data types (or other DTOs). The advantage of enforcing this restriction is that the data stays interchangeable between systems and easy to serialize.

Here is a small sample of what a DTO could look like:

class Article:
    def __init__(self, **kwargs):
        # Required properties: a missing key raises KeyError
        self.title = kwargs['title']
        self.content = kwargs['content']
        self.author = kwargs['author']  # Author DTO
        self.published_at = kwargs['published_at']
        # Optional property with a fallback value
        self.tags = kwargs.get('tags', [])

class Author:
    def __init__(self, **kwargs):
        # Required properties
        self.name = kwargs['name']
        self.email = kwargs['email']
        # Optional properties, None when not provided
        self.website = kwargs.get('website')
        self.twitter = kwargs.get('twitter')

In this sample of a Python DTO, we can mark properties as required by reading them with [] (a missing key raises a KeyError as soon as the DTO is created) or make them optional by using .get(). For the latter we can also define a fallback value, e.g. in case the property represents a boolean that should have a default value.
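To see the required/optional distinction in action, here is a minimal sketch. It uses a self-contained copy of the Author class (with self in the constructor signature so that it runs); the sample name and email are made up for illustration.

```python
class Author:
    def __init__(self, **kwargs):
        self.name = kwargs['name']            # required
        self.email = kwargs['email']          # required
        self.website = kwargs.get('website')  # optional, defaults to None
        self.twitter = kwargs.get('twitter')  # optional, defaults to None


# All required keys are present, so the optional ones fall back to None.
author = Author(name='Jane Doe', email='jane@example.com')
print(author.website)  # None

# A missing required key fails right at construction time.
try:
    Author(name='Jane Doe')
    missing = None
except KeyError as exc:
    missing = exc.args[0]
print(missing)  # email
```

The nice part is that the error shows up where the DTO is built, not somewhere later when the data is finally used.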

When to use a DTO?

As mentioned earlier, DTOs should be used every time data moves between layers and systems (as seen in "What data crosses the boundaries" in Uncle Bob's article about "The Clean Architecture").

Obviously it can be overkill to define a DTO that only stores one property (even though there are cases where doing so makes the code easier to understand). For my personal taste, everything that is more than a (key, value) pair and moves between layers should be a DTO. Don't be afraid of having too many DTOs. Most of the time you probably don't have enough :D

Why not use ...?

A pythonic dict would give us mostly the same benefits as a well-defined class. But for me, the main benefit of using a well-defined DTO is reliability: there will always be exactly the same properties, so you don't need to worry about a KeyError when you try to access data.
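A small sketch of that difference, assuming a hypothetical payload from some external source that happens to lack the website key (the Author class here is a trimmed copy, written so the snippet runs on its own):

```python
class Author:
    def __init__(self, **kwargs):
        self.name = kwargs['name']
        self.email = kwargs['email']
        self.website = kwargs.get('website')


# Hypothetical payload, e.g. from an external API; 'website' is absent.
payload = {'name': 'Jane Doe', 'email': 'jane@example.com'}

# Plain dict: every consumer has to guard against missing keys itself.
try:
    payload['website']
    dict_lookup_failed = False
except KeyError:
    dict_lookup_failed = True

# DTO: the attribute always exists, so consumers can rely on it.
author = Author(**payload)
has_website_attr = hasattr(author, 'website')
print(dict_lookup_failed, has_website_attr)  # True True
```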

Another common solution to generate a DTO would be to use the Django REST Framework's Serializer class. This way you can easily transform a Django model into a serialized format (e.g. a dict). But this solution only works with Django and is heavily tied to models, which adds an unnecessary layer for simple data storage.

In Python 3.7 the dataclass decorator was introduced, which makes defining DTOs a lot simpler. As sweet as it sounds, for me the main issue is that it relies on type annotations (the variable annotation syntax added in Python 3.6), which look really weird if you have an untyped codebase.
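For comparison, this is roughly how the Article and Author DTOs could look as dataclasses. Treat it as a sketch: the field types are my assumptions (for instance, published_at is modeled as a plain string here), and the sample values are made up.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class Author:
    name: str                       # required (no default)
    email: str                      # required
    website: Optional[str] = None   # optional
    twitter: Optional[str] = None   # optional


@dataclass
class Article:
    title: str
    content: str
    author: Author
    published_at: str  # assumption: a date string
    # Mutable defaults need default_factory instead of a literal [].
    tags: List[str] = field(default_factory=list)


article = Article(
    title='The Beauty of DTOs',
    content='...',
    author=Author(name='Jane Doe', email='jane@example.com'),
    published_at='2019-01-01',
)

# asdict() recursively converts nested dataclasses into plain dicts.
data = asdict(article)
print(data['author']['name'])  # Jane Doe
print(data['tags'])            # []
```

Note that the annotations are not enforced at runtime; they only tell the decorator which fields exist and in which order.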