DTOs and why you should be using them

If you’ve worked on any modern, decent-sized application, you know that the de facto standard is a layered design, where operations are grouped into layers according to their function. A Data Access Layer, for example, is nothing more than an implementation of your repository using NHibernate, Entity Framework, etc. That is a very good idea for most scenarios, but it brings a problem with it: you need to pass a lot of calls between layers, and sometimes that doesn’t mean calling a DLL inside your solution, it means calling a service hosted somewhere over the network.

The problem

If your app calls services and receives data from them, you might find something like this in your service:
public Person AddPerson(string name, string lastName, string email)
Now, let’s first look at the parameters and why this is probably not a very good definition.
This method takes three arguments: name, lastName and email. What happens if somebody needs a telephone number? Well, we just add another argument! Dead easy! Yeah, no. To make it more interesting, suppose we have Workers and Customers, both inheriting from Person. We would then have something like this:
public Person AddWorker(string name, string lastName, string email)
public Person AddCustomer(string name, string lastName, string email)
If you now need that telephone number and go for the extra parameter, you have to change code in two places. And what happens when we touch more code? Simple: we introduce more bugs.

The Good

Now, what happens if you have this?
public Worker AddWorker(Worker worker)
public Customer AddCustomer(Customer customer)
DTO stands for Data Transfer Object, and that is precisely what these classes are: we use them to transfer data to and from our services. For one, the code is much simpler to read now! But there is another benefit. If Worker and Customer inherit from Person, as they should considering they are both a Person, then we can safely add that telephone number to Person without changing the signature of either service method. Yes, the service now receives an extra piece of data, but the method signatures in the code stay the same; only the DTO they receive changes.
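As a minimal sketch of that idea, the classes below show where the new property would go. The `PersonDTO`, `WorkerDTO` and `CustomerDTO` names, the `Telephone` property and the in-memory `PersonService` are all assumptions made for illustration, not the original service:

```csharp
using System.Collections.Generic;

// Hypothetical base DTO: fields shared by every kind of person.
public class PersonDTO
{
    public string Name { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }

    // Added later: one change here, no service signature changes.
    public string Telephone { get; set; }
}

public class WorkerDTO : PersonDTO { }

public class CustomerDTO : PersonDTO { }

// Hypothetical service that simply keeps what it receives in memory.
public class PersonService
{
    private readonly List<PersonDTO> _people = new List<PersonDTO>();

    public int Count { get { return _people.Count; } }

    public WorkerDTO AddWorker(WorkerDTO worker)
    {
        _people.Add(worker);
        return worker;
    }

    public CustomerDTO AddCustomer(CustomerDTO customer)
    {
        _people.Add(customer);
        return customer;
    }
}
```

A caller could then write `service.AddWorker(new WorkerDTO { Name = "Ada", Telephone = "555-0100" })`, while existing callers that never set `Telephone` keep compiling unchanged.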
Now, more on the common use of DTOs. As Martin Fowler puts it, a DTO is:
An object that carries data between processes in order to reduce the number of method calls.
It’s fairly obvious that using DTOs for input arguments is good, but what about return values? It’s a similar story, with a small twist. Many people today use ORMs to access the database, so it’s very likely that you already have Worker, Customer and Person classes, either because they are part of your domain model or because they were generated by LINQ to SQL (not a huge fan, but many people still use it). So, should you return those entities from your services? It’s not a very good idea, and I have some reasons for saying so.

One very simple reason is that the objects produced by these frameworks are usually not serialization friendly: they sit on top of proxy classes, which are a pain to serialize to JSON or XML. Another potential problem arises when your entity doesn’t quite fit the response you want to give. What happens if your service has something like this?

public Salary CalculateWorkerSalary(Worker worker)

You could have a very simple method that just returns a double, but let’s think of a more convoluted solution to illustrate the point. Imagine Salary looks like this:

public class Salary
{
     public double FinalSalary {get;}
     public double TaxDeducted {get;}
     public double Overtime {get;}
}

So, this is our class, and Overtime ties it to a specific worker, because not everybody does the same amount of overtime. What happens now if we also need the tax code for that salary? Or the overtime rate used in the calculation? (That is assuming these are not stored in the salary table.) More importantly, what happens if we don’t want whoever is calling the API to see how much overtime the Worker is doing? In all of these cases the entity is not fit for purpose, and we need a DTO that carries exactly the data we want to expose, simple as that.
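A response DTO for this case might look like the sketch below. The `SalaryDTO` name, the `TaxCode` and `OvertimeRate` properties and the `FromSalary` helper are assumptions made for illustration; the point is only that the DTO can add what the caller needs and leave `Overtime` out.

```csharp
// Stand-in for the Salary entity above, with a constructor added
// so the sketch is self-contained.
public class Salary
{
    public Salary(double finalSalary, double taxDeducted, double overtime)
    {
        FinalSalary = finalSalary;
        TaxDeducted = taxDeducted;
        Overtime = overtime;
    }

    public double FinalSalary { get; }
    public double TaxDeducted { get; }
    public double Overtime { get; }
}

// Hypothetical response DTO: exposes extra calculation details
// and deliberately has no Overtime property.
public class SalaryDTO
{
    public double FinalSalary { get; set; }
    public double TaxDeducted { get; set; }
    public string TaxCode { get; set; }
    public double OvertimeRate { get; set; }

    // Builds the DTO from the entity plus values that live elsewhere.
    public static SalaryDTO FromSalary(Salary salary, string taxCode, double overtimeRate)
    {
        return new SalaryDTO
        {
            FinalSalary = salary.FinalSalary,
            TaxDeducted = salary.TaxDeducted,
            TaxCode = taxCode,
            OvertimeRate = overtimeRate
        };
    }
}
```

The service method would then return `SalaryDTO` instead of `Salary`, so the entity can change without changing what API callers see.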

The Bad

However, DTOs are not all glory. One problem is that they bloat your application, especially if it is large and has many entities. In that case, it’s up to you to decide when a DTO is worth it and when it’s not. Like many things in software design, there is no rule of thumb, and it’s very easy to get it wrong. But for most cases where you pass complex data around, you should be using DTOs.

The Ugly

There is another problem with DTOs, which is that you end up with a lot of code like this:

var query = _workerRepository.GetAll();
var workers = query.Select(ConvertWorkerDTO).ToList();
return workers;

Where ConvertWorkerDTO is just a method looking pretty much like this:

public WorkerDTO ConvertWorkerDTO(Worker worker)
{
    return new WorkerDTO() {
        Name = worker.Name,
        LastName = worker.LastName,
        Email = worker.Email
    };
}

Wouldn’t it be cool if you could do the same thing without a hand-written mapping method, like this?

var query = _workerRepository.GetAll();
var workers = query.Select(x => BaseDTO.BuildFromEntity<Worker, WorkerDTO>(x)).ToList();
return workers;

Happily, there is a simple way to achieve a result like this one, by combining two very powerful tools: inheritance and reflection. Just have a BaseDTO class that all of your DTOs inherit from, and give it a method like that one, which handles the conversion by mapping property to property. A fairly simple, yet fully working, version could be this:

public static TDTO BuildFromEntity<TEntity, TDTO>(TEntity entity)
{
    var dto = Activator.CreateInstance<TDTO>();
    var dtoProperties = typeof (TDTO).GetProperties();
    var entityProperties = typeof (TEntity).GetProperties();

    foreach (var property in dtoProperties)
    {
        // Skip read-only DTO properties
        if (!property.CanWrite)
            continue;

        var entityProp =
            entityProperties.FirstOrDefault(x => x.Name == property.Name && x.PropertyType == property.PropertyType);

        // Skip DTO properties that have no counterpart on the entity
        if (entityProp == null)
            continue;

        // Skip properties whose types are not compatible
        if (!property.PropertyType.IsAssignableFrom(entityProp.PropertyType))
            continue;

        var propertyValue = entityProp.GetValue(entity, new object[] {});
        property.SetValue(dto, propertyValue, new object[]{});
    }

    return dto;
}
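
To see how such a mapper behaves, here is a self-contained sketch. The `Worker` entity, its `Id` property and the shape of `WorkerDTO` are assumptions made for illustration; the mapper is a condensed variant of the method above, hosted on a hypothetical `BaseDTO` and using a `new()` constraint instead of `Activator`.

```csharp
using System.Linq;

// Hypothetical entity; Id exists only on the entity and is never copied.
public class Worker
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}

public abstract class BaseDTO
{
    // Copies every writable DTO property that has a readable property
    // with the same name and type on the entity.
    public static TDTO BuildFromEntity<TEntity, TDTO>(TEntity entity)
        where TDTO : BaseDTO, new()
    {
        var dto = new TDTO();
        var entityProperties = typeof(TEntity).GetProperties();

        foreach (var property in typeof(TDTO).GetProperties().Where(p => p.CanWrite))
        {
            var entityProp = entityProperties.FirstOrDefault(
                x => x.Name == property.Name
                     && x.PropertyType == property.PropertyType
                     && x.CanRead);

            // No matching property on the entity: leave the default value.
            if (entityProp == null)
                continue;

            property.SetValue(dto, entityProp.GetValue(entity, null), null);
        }

        return dto;
    }
}

public class WorkerDTO : BaseDTO
{
    public string Name { get; set; }
    public string LastName { get; set; }
    public string Email { get; set; }
}
```

Calling `BaseDTO.BuildFromEntity<Worker, WorkerDTO>(worker)` returns a `WorkerDTO` with `Name`, `LastName` and `Email` copied across. Reflection on every call is slower than a hand-written mapping, so on a hot path it may be worth caching the property lists.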

And Finally…

The bottom line is that, like everything, DTOs can be overdone: you can over-engineer your way into adding far too many of them to your system. Ignoring them is not a very good solution either, and adding one or two to a project with more than 15 entities just to feel that you’re using them is about as good as using a single interface and claiming you build decoupled systems.

What’s your view on this? Do you agree? Disagree? Share what you think in the comments!

EDIT: As a side note, it’s worth checking this article that talks a lot about the subject.

Applying design patterns to web development

Part 1: The strategy pattern

One of my biggest efforts as a developer is to apply design patterns to my software designs and implementations. Working with PHP is where I most often feel the lack of an OO approach. PHP started as a procedural language, with a few functions serving a limited number of tasks, but like anything in this world, it has evolved into a more mature language. The problem is that many tutorials around the web are aimed at beginner developers and tend to leave object orientation out of the picture, which is a trend that I find awful.

Sadly, there are still many PHP developers writing procedural code and calling it object oriented because they use the word class. Writing OO code is a bit more complex than that. We need to think about how to express our problem as objects and how to make them communicate, rather than creating monolithic systems with strongly coupled components, which end up as software that is impossible to maintain, document, test and, most importantly, reuse and scale.

Well, back to the main idea. In this blog post I intend to use the strategy design pattern, which is one of the most commonly implemented design patterns in web systems. The definition of the strategy pattern is:

Define a family of algorithms, encapsulate each one, and make them interchangeable. Strategy lets the algorithm vary independently from clients that use it.

The classic scenario is a situation where we can employ different variants depending on the environment. For instance, session management is an important task in every web application. If we are going to create a small application, then the standard PHP session implementation is good enough. But it does pose two main problems:

  1. If we are on shared hosting, anybody on the same server can work around the server’s security and copy the session files, thus gaining access to data we don’t want them to have.
  2. If we want to scale our application, for example by adding extra servers or moving the application to a cluster, sharing the session files between machines is a nightmare.

So, we have two main scenarios: one where we can use the standard session driver, and one where we have to use a more advanced session management system, for instance a database-backed approach or integration with Memcached or another powerful solution.

The initial approach is to write a PHP file called sessions.php and there do something like this:

$driver = $configuration["session"]->getFile();
include $driver . ".php";

Here the driver would be a file that calls the session_set_save_handler function, which replaces the functions used for managing sessions, and that also contains the implementation of those functions.

An approach like that one does the job, but we would have no objects, just a bunch of files. The real object-oriented version starts by defining the contract for how we want to interact with the drivers. In our case we have a session driver, which holds a few operations. Based on the arguments of the session_set_save_handler function, we need 6 methods in our interface: Open, Close, Read, Write, Destroy and GarbageCollect. We would have something like this:

interface ISessionDriver
{
   public function Open();
   public function Close();
   public function Read();
   public function Write();
   public function Destroy();
   public function GarbageCollect();
}

Now we have an interface, and we can register an implementation of it as the session provider. There are again two ways to do that. The first is to load from the configuration the class we want to use; the second uses Dependency Injection, and we’ll see it soon.

The idea is very much like the first, procedural approach: include the driver file. But now we know that we’ll have a class implementing the ISessionDriver interface, a class that respects a contract we established beforehand. The code is quite similar, but now we check the type:

$driver = $configuration["session"]->getFile();
include $driver . ".php";
$driverInstance = new $driver();

if ($driverInstance instanceof ISessionDriver) {
   // register the session driver
} else {
   throw new SystemException("Expecting an ISessionDriver here!");
}

With that we finish the first part on how to decouple our classes into more reusable components by applying the strategy design pattern. We will get deeper into it soon, using dependency injection to make this design even more decoupled.