Wednesday, May 14, 2008 #

How to map a tree in NHibernate

Introduction

Lately there have been many questions regarding how to best map a tree structure in NHibernate. In this post I'll try to show you various techniques how one can deal with a tree structure.

The code for this sample you can get from here.

Domain Model

Let's define a very simple domain model for this sample. We have a single class called Equipment that has a parent of type Equipment and 0 to many children of type Equipment. The tree has a single root node. The root node is specific by the fact that it's property Parent is equal to null. On the other hand a tree has leaf nodes. A leaf node is defined to have no children.

You could as well replace the class Equipment by e.g. a Person class. The principles shown here would not change.

image

Where is my Aggregate?

A Aggregate in DDD is defined as "A cluster of associated objects that are treated as a unit for the purpose of data changes. External references are restricted to one member of the Aggregate, designated as the root. A set of consistency rules applies within the Aggregate's boundaries."

Depending on our Use Cases or User Stories we can define our Aggregate either as

image

that is, the aggregate includes the equipment (which is the root) and its children.

Or we could define the Aggregate as follows

image

here only the equipment object is part of the aggregate (it is at the same time the root of the aggregate).

In the following I choose the former aggregate (equipment and children). This corresponds to the use case where I want to edit an equipment and its children in a single business transaction.

Note: Normally one defines a Repository for each Aggregate in the domain model.

Mapping

Let's have a look at the mapping for the Equipment class.

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   assembly="TreeStructure"
                   namespace="TreeStructure">
  <class name="Equipment">
    <id name="Id">
      <generator class="guid"/>
    </id>
    <property name="Name" length="50" not-null="true" unique="true"/>
    
    <many-to-one name="Parent" class="Equipment" column="ParentId" />
    
    <set name="Children" cascade="all-delete-orphan" >
      <key column="ParentId" foreign-key="fk_Equipment_ParentEquipment"/>
      <one-to-many class="Equipment"/>
    </set>
  </class>
</hibernate-mapping>

The name property of the equipment must be defined and unique. The name can be at most 50 characters long.

The parent property is not mandatory (this is the default, i.e. not_null=false) since the root node has Parent equal to null.

I have defined the children collection as a Set, that is an unordered collection of unique objects.  I have chosen cascade="all-delete-orphan" to instruct NHibernate that he automatically saves, updates or deletes the children of an equipment too when I save, update or delete the equipment. I also have defined that the foreign key constraint that will be defined on the database between the equipment and its parent has a friendly name (fk_Equipment_EquipmentParent).

Now I can write a unit test to check whether NHibernate can create the correct schema for me. In this sample I use a SQL Server 2005 database (the express edition is ok too) since later on I want to show some specific features only available in SQL Server. The code of the test is

[Test]
public void Can_generate_schema()
{
    var configuration = new Configuration();
    configuration.Configure();
    configuration.AddAssembly(typeof (Equipment).Assembly);
 
    // show the schema for debugging purposes
    new SchemaExport(configuration).Execute(true, false, false, false);
 
    // generate the schema in the database
    new SchemaExport(configuration).Execute(false, true, false, false);
}

This test runs successfully and generates the following output (SQL script)

if exists (select 1 from sys.objects 
           where object_id = OBJECT_ID(N'[fk_Equipment_ParentEquipment]') 
           AND parent_object_id = OBJECT_ID('Equipment'))
  alter table Equipment  drop constraint fk_Equipment_ParentEquipment
 
if exists (select * from dbo.sysobjects 
           where id = object_id(N'Equipment') 
           and OBJECTPROPERTY(id, N'IsUserTable') = 1) 
  drop table Equipment
 
create table Equipment (
  Id UNIQUEIDENTIFIER not null, 
  Name NVARCHAR(50) not null unique, 
  ParentId UNIQUEIDENTIFIER null, 
  primary key (Id)
)
 
alter table Equipment 
add constraint fk_Equipment_ParentEquipment 
foreign key (ParentId) references Equipment
The first part of the script drops the database objects if they are already there and in the second part of the script the table Equipment is generated as well as the foreign key constraint between the equipment and its parent. The create script for the table looks exactly like expected.

The Repository

A Repository is defined as "a mechanism for encapsulating storage, retrieval, and search behavior which emulates a collection of objects."

As said above, for each Aggregate in the domain model one defines a Repository. Since we only have one aggregate in our simple domain we also will have only one repository. Let us call it EquipmentRepository.

Now I want to be able to eager load the complete aggregate given the id of the aggregate root. I write the following unit test

[Test]
public void Can_load_aggregate_by_id()
{
    var equipmentId = ...;    // some id
    var equipment = _repository.GetAggregateById(equipmentId);
    Assert.IsNotNull(equipment);
    Assert.IsTrue(NHibernateUtil.IsInitialized(equipment.Parent));
    Assert.IsTrue(NHibernateUtil.IsInitialized(equipment.Children));
    Assert.AreEqual(3, equipment.Children.Count);
}

With this test I assert that a) I get an aggregate and b) the aggregate is fully populated. To check whether the aggregate is fully populated I use the NHibernateUtil utility class. The test assumes the existence of a equipment tree fragment in the database. Please refer to the project (here) for any details how I create this tree fragment in the SetUp of the test fixture.

A possibility to eager load an aggregate is by using a HQL query as below

public Equipment GetAggregateById(Guid equipmentId)
{
    using (var session = SessionManager.OpenSession())
    {
        var sql = "from Equipment e" +
                  " left join fetch e.Parent p" +
                  " left join fetch e.Children c" +
                  " where e.Id = :id";
        var equipment = session.CreateQuery(sql)
            .SetGuid("id", equipmentId)
            .UniqueResult<Equipment>();
        return equipment;
    }
}

I use left joins since either Parent can be null or the equipment has no children. The SessionManager class is a helper class which I use to provide me session objects. Please have a look at the project which you can find here for any details.

Finding the root aggregate is easy. We just have to look for an equipment whose parent is equal to null

public Equipment GetRootAggregate()
{
    using (var session = SessionManager.OpenSession())
    {
        var sql = "from Equipment e" +
                  " left join fetch e.Children c" +
                  " where e.Parent is null";
        var equipment = session.CreateQuery(sql)
            .UniqueResult<Equipment>();
        return equipment;
    }
}

Now some times we want to find all leaf nodes of the tree. Remember that a leaf node has by definition no children. We could use the following code for this

public ICollection<Equipment> GetAllLeafNodes()
{
    using (var session = SessionManager.OpenSession())
    {
        var sql = "from Equipment e" +
                  " left join fetch e.Parent p" +
                  " where e.Children.size = 0";
        var leafs = session.CreateQuery(sql)
            .List<Equipment>();
        return leafs;
    }
}

here we use the handy size function for collections provided by HQL. Note that this method delivers all leaf nodes of the tree. To only select the leaf nodes of a sub-tree starting at a given equipment node is trivial now and I leave for you as an exercise.

Eager load ancestors and descendants

To eager load all ancestors or descendants of a given equipment node we have to use native TSQL since HQL (as well as the criteria API) does not provide a generic way of doing it. Fortunately SQL server 2005 provides us the means to achieve the desired result. We have to construct a hierarchical query as below

public ICollection<Equipment> GetAllDescendantsOfEquipment(Guid equipmentId)
{
    using (var session = SessionManager.OpenSession())
    {
        var sql = "with Hierachy(Id, Name, ParentId, Level)" +
                  " as" +
                  " (" +
                  "   select Id, Name, ParentId, 0 as Level" +
                  "   from Equipment e" +
                  "   where e.Id = :id" +
                  "  union all" +
                  "   select e.Id, e.Name, e.ParentId, eh.Level + 1" +
                  "   from Equipment e" +
                  "   inner join Hierachy eh" +
                  "      on e.ParentId = eh.Id" +
                  " )" +
                  " select Id, Name, ParentId" +
                  " from Hierachy" +
                  " where Level > 0";
        var list = session.CreateSQLQuery(sql)
            .AddEntity(typeof(Equipment))
            .SetGuid("id", equipmentId)
            .List<Equipment>();
        return list;
    }
}

The last where statement in the SQL assures that the given equipment is not included in the list of descendants.

Analogous we can get the list of ancestors of a given equipment node

public ICollection<Equipment> GetAllAncestorsOfEquipment(Guid equipmentId)
{
    using (var session = SessionManager.OpenSession())
    {
        var sql = "with Hierachy(Id, Name, ParentId, Level)" +
                  " as" +
                  " (" +
                  "   select Id, Name, ParentId, 0 as Level" +
                  "   from Equipment e" +
                  "   where e.Id = :id" +
                  "  union all" +
                  "   select e.Id, e.Name, e.ParentId, eh.Level + 1" +
                  "   from Equipment e" +
                  "   inner join Hierachy eh" +
                  "      on e.Id = eh.ParentId" +
                  " )" +
                  " select Id, Name, ParentId" +
                  " from Hierachy" +
                  " where Level > 0";
        var list = session.CreateSQLQuery(sql)
            .AddEntity(typeof(Equipment))
            .SetGuid("id", equipmentId)
            .List<Equipment>();
        return list;
    }
}

Note that we had just to adjust the inner join between the Equipment table and the Hierarchy (temp) table. Again the last where statement in the SQL assures that the given equipment is not included in the list of ancestors.

An alternative approach

We can also choose another approach for the tree problem. The solution I want to show works for any database even if it does not support hierarchical queries. This time our simple domain model looks as follows

image

Here we have a node object which can have a parent and a collection of children. Additionally we have defined the two collections descendants and ancestors which contain all descendants (children, children of the children and so on) and all parents of the given node.

The code for the Node class is given by

public class Node
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public Node Parent { get; set; }
    public ISet<Node> Children { get; set; }
 
    public ISet<Node> Ancestors { get; set; }
    public ISet<Node> Descendants { get; set; }
}

The mapping for the node is defined as

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   assembly="TreeStructure"
                   namespace="TreeStructure">
  <class name="Node">
    <id name="Id">
      <generator class="guid"/>
    </id>
    <property name="Name" length="50" not-null="true" unique="true"/>
 
    <many-to-one name="Parent" class="Node" column="ParentId" />
 
    <set name="Children" inverse="true" cascade="all" >
      <key column="ParentId" foreign-key="fk_Node_ParentNode"/>
      <one-to-many class="Node"/>
    </set>
 
    <set name="Descendants" table="NodeHierarchy" inverse="true" >
      <key column="ParentId" foreign-key="fk_HierarchyNode_ParentNode"/>
      <many-to-many class="Node" column="ChildId" foreign-key="fk_HierarchyNode_ChildNode" />
    </set>
 
    <set name="Ancestors" table="NodeHierarchy" >
      <key column="ChildId" /> 
      <many-to-many class="Node" column="ParentId" /> 
    </set>
  </class>
</hibernate-mapping>

Note that we define the descendants and ancestors of a node as a many-to-many relation in the database. So the SQL script created by the SchemaExport class of NHibernate is

if exists (select 1 from sys.objects 
           where object_id = OBJECT_ID(N'[fk_Node_ParentNode]') 
           AND parent_object_id = OBJECT_ID('Node'))
  alter table Node  drop constraint fk_Node_ParentNode
 
if exists (select 1 from sys.objects 
           where object_id = OBJECT_ID(N'[fk_HierarchyNode_ChildNode]') 
           AND parent_object_id = OBJECT_ID('NodeHierarchy'))
  alter table NodeHierarchy  drop constraint fk_HierarchyNode_ChildNode
 
if exists (select 1 from sys.objects 
           where object_id = OBJECT_ID(N'[fk_HierarchyNode_ParentNode]') 
           AND parent_object_id = OBJECT_ID('NodeHierarchy'))
  alter table NodeHierarchy  drop constraint fk_HierarchyNode_ParentNode
 
if exists (select * from dbo.sysobjects 
           where id = object_id(N'Node') 
           and OBJECTPROPERTY(id, N'IsUserTable') = 1) 
  drop table Node
 
if exists (select * from dbo.sysobjects 
           where id = object_id(N'NodeHierarchy') 
           and OBJECTPROPERTY(id, N'IsUserTable') = 1) 
  drop table NodeHierarchy
 
create table Node 
(
  Id UNIQUEIDENTIFIER not null, 
  Name NVARCHAR(50) not null unique, 
  ParentId UNIQUEIDENTIFIER null, 
  primary key (Id))
 
create table NodeHierarchy 
(
  ParentId UNIQUEIDENTIFIER not null, 
  ChildId UNIQUEIDENTIFIER not null, 
  primary key (ParentId, ChildId)
)
 
alter table Node add constraint fk_Node_ParentNode 
foreign key (ParentId) references Node
 
alter table NodeHierarchy add constraint fk_HierarchyNode_ChildNode 
foreign key (ChildId) references Node
 
alter table NodeHierarchy add constraint fk_HierarchyNode_ParentNode