Thursday, October 16, 2008

Reflecting on how we implement aggregates

How we implement aggregates

Since the beginning of my current project we have tried to structure our business logic into aggregates. We started implementing the aggregates using the following guidelines:

  • define an aggregate root and implement it as a public class
  • define the internal classes of an aggregate and implement them as public classes
  • implement a repository that can query and return aggregate roots of this class
  • try to avoid accessing the internals (like traversing from aggregate root to an internal class) from outside the aggregate
  • put as much as possible of the business-logic into the aggregate root

By following these rules we had some guidance on where to locate the business logic and when to implement a repository. I think it was worthwhile to have the aggregates. They were a valuable concept when we had to change our application from a 2-tier to a 3-tier architecture.

Nevertheless, we always had the following problems with aggregates:

  • Most of the time we had code in the presentation layer that accessed the internals of an aggregate. This often appeared where there was a one-to-one data binding, for instance in a master-detail view where the master was the aggregate root and the detail an internal entity of the aggregate (e.g. editing an invoice and its invoice positions).
  • There were relations in the data model that did not comply with the aggregate rules. In such cases we often had a clean definition of an aggregate, but sometimes with a nasty dependency: an internal entity depended on an entity in another aggregate. We agreed that this is not bad and that it can be abstracted to a dependency between the two aggregates involved.

And what happened:

  • Since the internals could be accessed, the business logic began to spread out into the presentation layer. It is impossible to always have the discipline to comply with the aggregate rules, and all of us violated them from time to time.
  • We didn't centralize the creational logic. The test code is often a mess, and it deals with setting up the aggregates as well as the internals of an aggregate.
  • An NDepend analysis revealed that our business-logic assembly will be a serious maintenance problem ('zone of pain').

Why is our business-logic in the zone of pain?

We generally get closer to the 'zone of pain' (see link for details), when a high number of types in an assembly:

  1. are more concrete than abstract (see Abstractness in the formula)
  2. are used a lot by other assemblies (see Ca in the formula)
  3. do not depend on other types (see Ce in the formula)

Since I can't see any value in making the business logic abstract (1.), and our business logic already depends on only a small number of types in other assemblies (3.), the only remaining variable is to reduce the number of types that are used by other assemblies (2.).
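
For reference, the three points correspond to the usual package metrics. The definitions below are the standard ones from Robert C. Martin's design-quality metrics; I am assuming that the linked formula uses the same normalized form.

```latex
I = \frac{C_e}{C_a + C_e}, \qquad
A = \frac{\text{number of abstract types}}{\text{number of all types}}, \qquad
D = \lvert A + I - 1 \rvert

% With almost no abstract types (A \approx 0):
D \approx \lvert I - 1 \rvert = 1 - \frac{C_e}{C_a + C_e} = \frac{C_a}{C_a + C_e}
```

So for a concrete assembly with a fixed Ce, lowering Ca is what lowers D.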

How could we do it differently?

We could try to encapsulate the internals of an aggregate better and thereby avoid the problems described above. That means not allowing any type other than the aggregate root to access the internals. This increases the afferent coupling of the aggregate root but eliminates the afferent coupling of the internals. The afferent coupling of the business-logic assembly itself would then decrease.

Therefore I transformed one of our aggregates from something like this

// File Invoice.cs
public class Invoice
{
    public EntitySet<InvoicePosition> Positions { get; set; }
}
public class InvoicePosition
{
}

to

// File Invoice.cs
public partial class Invoice
{
    private EntitySet<Invoice.InvoicePosition> Positions { get; set; }
}
// File InvoicePosition.cs
public partial class Invoice
{
    private class InvoicePosition
    {
    }
}

After completing the refactoring and changing some logic in the presentation layer I got the following code metrics:

  1. The assembly containing the business logic moved slightly away from the zone of pain (D went from 0.577 to 0.575). This is still bad, but the refactoring had a positive effect.
  2. The type rank of the refactored aggregate root increased from 10.7 to 12, and its Ca (afferent coupling) increased from 62 to 66. That means it became a more important part of the business logic.
  3. The CC (cyclomatic complexity) of the aggregate root increased from 38 to 77, while CC generally decreased in the presentation layer (e.g. in one case from 26 to 16). That means we moved complexity into the business logic. This is a good thing, since complexity is generally easier to test, and therefore to maintain, in the business logic. It was also possible to eliminate duplicated business logic found in the presentation layer.
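
To illustrate what moving complexity into the aggregate root can look like, here is a minimal sketch. It is not our production code: the members AddPosition, PositionCount and Total and the validation rules are hypothetical, and a plain List<T> stands in for the EntitySet<T> so that the example is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical aggregate root: the only type that callers can see.
public class Invoice
{
    // List<T> stands in for the EntitySet<T> used in the real code (assumption).
    private readonly List<InvoicePosition> positions = new List<InvoicePosition>();

    // Illustrative business rule: it lives in the root instead of the presentation layer.
    public void AddPosition(string article, int quantity, decimal unitPrice)
    {
        if (quantity <= 0) throw new ArgumentOutOfRangeException("quantity");
        if (unitPrice < 0m) throw new ArgumentOutOfRangeException("unitPrice");
        positions.Add(new InvoicePosition(article, quantity, unitPrice));
    }

    public int PositionCount
    {
        get { return positions.Count; }
    }

    // Callers ask the root for the result and never traverse the positions.
    public decimal Total
    {
        get { return positions.Sum(p => p.Amount); }
    }

    // Internal entity: private and nested, so it is invisible outside Invoice.
    private class InvoicePosition
    {
        private readonly string article;
        private readonly int quantity;
        private readonly decimal unitPrice;

        public InvoicePosition(string article, int quantity, decimal unitPrice)
        {
            this.article = article;
            this.quantity = quantity;
            this.unitPrice = unitPrice;
        }

        public string Article { get { return article; } }

        public decimal Amount { get { return quantity * unitPrice; } }
    }
}
```

A view would then call invoice.AddPosition(...) and bind to invoice.Total. It can no longer reach into a position, so every additional rule adds to the CC of the root rather than to that of the presentation layer.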

Testability and unit tests

As the internals are now completely hidden from the outside, I test the refactored aggregate with state-based black-box tests. I think that's a reasonable way to do it, since the test setup no longer depends on the implementation details of an aggregate. I'm not sure whether this should be a general testing strategy for aggregates. I think there are more complex cases where white-box tests would still be needed. The following picture shows the test code before the refactoring on the left and the test code after it on the right:
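
As a rough illustration of the idea (this is not the test code from the pictures), a state-based black-box test could look like the sketch below. The Invoice class is a simplified, hypothetical stand-in, and the test uses a plain exception instead of a test framework so that it runs on its own.

```csharp
using System;
using System.Collections.Generic;

// Simplified, hypothetical aggregate root with hidden internals.
public class Invoice
{
    private readonly List<Position> positions = new List<Position>();

    public void AddPosition(int quantity, decimal unitPrice)
    {
        positions.Add(new Position { Quantity = quantity, UnitPrice = unitPrice });
    }

    public decimal Total
    {
        get
        {
            decimal sum = 0m;
            foreach (Position p in positions) sum += p.Quantity * p.UnitPrice;
            return sum;
        }
    }

    // Not reachable from the test: it can only observe the root's state.
    private class Position
    {
        public int Quantity;
        public decimal UnitPrice;
    }
}

public static class InvoiceTests
{
    // State-based black-box test: setup and assertion use the public API only.
    public static void TotalSumsAllPositions()
    {
        var invoice = new Invoice();
        invoice.AddPosition(3, 2.00m);
        invoice.AddPosition(1, 5.50m);

        if (invoice.Total != 11.50m)
            throw new Exception("Expected 11.50 but got " + invoice.Total);
    }
}
```

Because the test never constructs a position itself, the internal class can be renamed or restructured without touching the test setup.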