Why repository pattern?

It’s really a sin not to use a repository, but I can be forgiven if I don’t use one in an example. Consider software that deals with customers who, internally, get different features based on when they signed up and on their status. If a customer registered less than 5 days ago, you should display a big red button saying ‘buy this feature’, and in the back office you should prioritize the tickets these customers create. Let’s call these customers ‘leads’.

You will most likely have at least two pages, one for the customer and one for the back office. Assuming at least MVC, the customer controller will execute something like: statement.executeQuery("select id, name, status, signUpDate, DATEDIFF(now(), signUpDate) < 5 and status = 'ACTIVE' as isLead from customers where id = 12345");

Then you prepare the same statement again in the back office controller, execute it and prioritize support tickets. Business is happy. Now that you did this so fast, you can probably add it to a cron job that sends promotional emails to these leads every day at 10am. And since you did it so fast once, they expect you to finish the feature by yesterday morning. You copy-paste the statement, create the cron job, send the email, and you still have some time to spare.
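The copy-paste above leaves the 'lead' rule living in three places. A minimal sketch of the alternative, assuming hypothetical names (`Customer`, `CustomerRepository`, `InMemoryCustomerRepository`) and an in-memory map standing in for the real database, keeps the rule in one method and the lookup behind one interface. It uses Java records, so it assumes Java 16 or newer.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain object: the "lead" rule is written here, once.
record Customer(long id, String name, String status, LocalDate signUpDate) {
    boolean isLead(LocalDate today) {
        long daysSinceSignUp = ChronoUnit.DAYS.between(signUpDate, today);
        return daysSinceSignUp < 5 && "ACTIVE".equals(status);
    }
}

// What both controllers and the cron job would depend on.
interface CustomerRepository {
    Optional<Customer> findById(long id);
}

// Stand-in for a JDBC-backed implementation, so the sketch runs without a database.
class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<Long, Customer> customers = new HashMap<>();

    void save(Customer customer) {
        customers.put(customer.id(), customer);
    }

    @Override
    public Optional<Customer> findById(long id) {
        return Optional.ofNullable(customers.get(id));
    }
}
```

With something like this in place, the customer page, the back office and the cron job would each call `findById(...)` and `isLead(...)` instead of carrying their own copy of the SQL.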



Domain model is equal to database schema

Since the domain model IS the schema, you are bound by the restrictions of the database in question. Your domain model knows what the database is, how it stores the data, what the underlying relations are, and so on. This ties the core of your software, the domain model and domain logic, to a specific database. If you use JPA, it’s exactly the same thing, except that you are tied to the relational model instead of a specific database.

I consider this an implementation detail. The domain model must not care about it at all. But should domain logic care about repositories, and should we always use this notion of SomethingSomethingRepository? I’d say: not always. Sometimes the domain procedures will name the repository for you, for example: a customer walks into the store, sits and takes the product catalogue to browse. Here the product catalogue is the repository.

However, when breaking up with your bad past of using the database schema as the domain object, ORMs can cause you a lot of pain, especially when one aggregate references another, creating a large graph. Usually, the first step in refactoring is to break these strong links by replacing them with identity fields, and then to map database objects to domain objects.
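As a rough illustration of that first refactoring step, the sketch below replaces an object reference between two aggregates with an identity field and maps a database-shaped object to a domain object. All names (`CustomerId`, `OrderRow`, `Order`, `OrderMapper`) and the column layout are hypothetical; the row record only stands in for whatever an ORM entity would look like. Records assume Java 16 or newer.

```java
import java.math.BigDecimal;
import java.time.LocalDate;

// Identity field: an Order points at its customer by id, not by object reference.
record CustomerId(long value) {}

// Database-shaped object: mirrors the columns of a hypothetical "orders" table.
record OrderRow(long id, long customerId, String orderDate, String totalAmount) {}

// Domain object: typed values, no knowledge of columns or storage format.
record Order(long id, CustomerId customerId, LocalDate orderDate, BigDecimal total) {}

// The only place that knows both shapes.
final class OrderMapper {
    private OrderMapper() {}

    static Order toDomain(OrderRow row) {
        return new Order(row.id(), new CustomerId(row.customerId()),
                LocalDate.parse(row.orderDate()), new BigDecimal(row.totalAmount()));
    }

    static OrderRow toRow(Order order) {
        return new OrderRow(order.id(), order.customerId().value(),
                order.orderDate().toString(), order.total().toPlainString());
    }
}
```

Because `Order` holds only a `CustomerId`, loading an order no longer drags the whole customer graph along with it.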



One repository per table or collection

I have yet to live to see the moment when the product owner comes to me and says: so, the customer walks into the store, buys something from the product table, then we join it with the customer information table, which we join with their payment history. If they have 20 rows in the payment history table that have the status ‘paid’ and that happened less than 10 days ago, excluding today, then we update the checkout table with a discount of 20%, only if paid in cash.

By doing one repository per table, as in the case described above, the code will be procedural. The service that has to encapsulate this protocol will have too many dependencies. And it’s error prone. And nobody but you will understand it. And you will have to maintain that code. Instead, doing a repository per aggregate root allows us to interact with the domain model in a manner familiar to the product owner. In fact, in most cases, the P.O. could read the code and tell if it’s correct.
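To make the contrast concrete, here is a minimal sketch of the discount rule above written against an aggregate root instead of tables. The names (`StoreCustomer`, `Payment`, `StoreCustomerRepository`) are hypothetical, and reading "20 rows" as "at least 20" is an assumption on my part. Records assume Java 16 or newer.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Optional;

record Payment(LocalDate date, String status) {}

// Aggregate root: owns its payment history and the discount rule.
record StoreCustomer(long id, List<Payment> paymentHistory) {

    // Discount in percent to apply at checkout.
    int checkoutDiscountPercent(LocalDate today, boolean paidInCash) {
        long recentPaidPayments = paymentHistory.stream()
                .filter(p -> "paid".equals(p.status()))
                .filter(p -> p.date().isBefore(today))               // excluding today
                .filter(p -> p.date().isAfter(today.minusDays(10)))  // less than 10 days ago
                .count();
        // Assumption: "20 rows" means at least 20 qualifying payments.
        return (recentPaidPayments >= 20 && paidInCash) ? 20 : 0;
    }
}

// One repository for the whole aggregate, not one per table.
interface StoreCustomerRepository {
    Optional<StoreCustomer> findById(long id);
}
```

A service using this needs a single dependency, and `checkoutDiscountPercent` reads close to the sentence a product owner would actually say.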



Technology specific repositories

I’m not sure about other languages, but in Java we have Spring, so this might only be my rant against using Spring Data repositories as domain repositories. Why? Programmers, as a species, are the laziest species known to humanity. If there is a shortcut, be assured that over 90% of programmers have considered taking it, while well over 50% take it (instead of doing actual research, I took a shortcut and made these numbers up).

Back to the topic. There are a couple of downsides to using Spring Data repositories as domain repositories. Firstly, your domain layer suffers from immobility. It’s tied to the Spring Data implementation of the repository.
For example,
public interface CustomerRepository extends MongoRepository
Or
public interface CustomerRepository extends CrudRepository

Just recently, we tried to integrate with a library, for which we had to connect our database to their repository. We used JPA 1, they used JPA 2.

Secondly, Spring Data repositories (as domain repositories) encourage using the domain model as the database schema. They make it damn hard to separate out a layer of objects that describe what the schema looks like, and so to keep the domain layer clean of database objects and of their transformations from and to domain objects.

And lastly, which is just personal taste: naming. I find that there is absolutely no excuse, no excuse whatsoever, for writing code like customerRepository.findByOrderByOrderDateAsc() while describing a business process. That’s something that will never come out of the mouth of a product owner or business analyst.

In all, use an adapter for Spring Data repositories. Spring Data is a great tool, but in the domain layer it fails miserably in layering, portability and, for me at least, naming.
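One way such an adapter could look is sketched below. So that it compiles without Spring on the classpath, `SpringDataPurchaseStore` is only a stand-in interface; in a real project it would be the interface that extends `MongoRepository` or `CrudRepository`. Every other name (`Purchase`, `PurchaseDocument`, `PurchaseHistory`, `SpringDataPurchaseHistory`) is hypothetical as well. Records assume Java 16 or newer.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Domain side: no Spring, no database vocabulary.
record Purchase(long id, LocalDate orderDate) {}

interface PurchaseHistory {
    List<Purchase> oldestFirst();
}

// Infrastructure side: a database-shaped object and a stand-in for the
// Spring Data interface (not the real thing, just its shape).
record PurchaseDocument(long id, String orderDate) {}

interface SpringDataPurchaseStore {
    List<PurchaseDocument> findByOrderByOrderDateAsc();
}

// Adapter: implements the domain interface by delegating and mapping.
class SpringDataPurchaseHistory implements PurchaseHistory {
    private final SpringDataPurchaseStore store;

    SpringDataPurchaseHistory(SpringDataPurchaseStore store) {
        this.store = store;
    }

    @Override
    public List<Purchase> oldestFirst() {
        List<Purchase> purchases = new ArrayList<>();
        for (PurchaseDocument doc : store.findByOrderByOrderDateAsc()) {
            purchases.add(new Purchase(doc.id(), LocalDate.parse(doc.orderDate())));
        }
        return purchases;
    }
}
```

The domain code only ever sees `PurchaseHistory.oldestFirst()`; the generated method name and the document shape stay on the infrastructure side of the adapter.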

Remote service

In microservice environments, aggregates are separated by REST calls, or at least should be. What I’ve seen is people accessing these aggregates from ‘services’. Once upon a time, services were called managers. Follow that logic and you will see that such a ‘service’ does not manage the data. That’s the responsibility of the other microservice. What it does is access it. Now this sounds like a repository… Why do we keep on calling them services?

But what about SPAs? Should services in, let’s say, Angular communicate with REST endpoints, or should they provide some business rules and delegate communication with REST services to repositories? I have exactly zero arguments for why we are not doing that right now. Yet there are some benefits to doing that. For example, an Angular app can be both a web app and a progressive app. When the internet connection is unavailable, the repository (should) become local. Or provide some caching of the data. Like a true data access layer.
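The idea is not specific to Angular, so here is a minimal sketch of it in Java, to stay in one language with the other examples. The remote call is modelled as a plain function that throws when the connection is down; that function, and the names `Profile`, `ProfileRepository` and `CachingRemoteProfileRepository`, are assumptions made for illustration, not a real client API. Records assume Java 16 or newer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.LongFunction;

record Profile(long id, String name) {}

// What the business rules depend on; they never see REST or the cache.
interface ProfileRepository {
    Optional<Profile> findById(long id);
}

// Repository over a remote call that falls back to a local copy when offline.
class CachingRemoteProfileRepository implements ProfileRepository {
    // Hypothetical transport: returns null when not found, throws when unreachable.
    private final LongFunction<Profile> remoteCall;
    private final Map<Long, Profile> localCopy = new HashMap<>();

    CachingRemoteProfileRepository(LongFunction<Profile> remoteCall) {
        this.remoteCall = remoteCall;
    }

    @Override
    public Optional<Profile> findById(long id) {
        try {
            Profile fresh = remoteCall.apply(id);
            if (fresh == null) {
                localCopy.remove(id);      // gone remotely, forget the local copy
            } else {
                localCopy.put(id, fresh);  // remember the latest answer
            }
            return Optional.ofNullable(fresh);
        } catch (RuntimeException unreachable) {
            // Offline: answer from whatever was seen last, if anything.
            return Optional.ofNullable(localCopy.get(id));
        }
    }
}
```

Callers keep asking the repository for a `Profile` the same way whether the answer came from the network or from the local copy.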

In all, repositories are a powerful pattern to encapsulate data access. The domain layer becomes oblivious to the actual storage engine, whether it is a relational database, a document database or a microservice. While it’s unlikely that you will just switch databases, you get the benefit of a clear and clean domain, good separation of concerns and, most importantly, understandable and maintainable code.