Introduction
Lately there have been many questions regarding how to best map a tree structure in NHibernate. In this post I'll try to show you various techniques how one can deal with a tree structure.
The code for this sample you can get from here.
Domain Model
Let's define a very simple domain model for this sample. We have a single class called Equipment that has a parent of type Equipment and 0 to many children of type Equipment. The tree has a single root node. The root node is specific by the fact that it's property Parent is equal to null. On the other hand a tree has leaf nodes. A leaf node is defined to have no children.
You could as well replace the class Equipment by e.g. a Person class. The principles shown here would not change.
Where is my Aggregate?
A Aggregate in DDD is defined as "A cluster of associated objects that are treated as a unit for the purpose of data changes. External references are restricted to one member of the Aggregate, designated as the root. A set of consistency rules applies within the Aggregate's boundaries."
Depending on our Use Cases or User Stories we can define our Aggregate either as
that is, the aggregate includes the equipment (which is the root) and its children.
Or we could define the Aggregate as follows
here only the equipment object is part of the aggregate (it is at the same time the root of the aggregate).
In the following I choose the former aggregate (equipment and children). This corresponds to the use case where I want to edit an equipment and its children in a single business transaction.
Note: Normally one defines a Repository for each Aggregate in the domain model.
Mapping
Let's have a look at the mapping for the Equipment class.
<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
assembly="TreeStructure"
namespace="TreeStructure">
<class name="Equipment">
<id name="Id">
<generator class="guid"/>
</id>
<property name="Name" length="50" not-null="true" unique="true"/>
<many-to-one name="Parent" class="Equipment" column="ParentId" />
<set name="Children" cascade="all-delete-orphan" >
<key column="ParentId" foreign-key="fk_Equipment_ParentEquipment"/>
<one-to-many class="Equipment"/>
</set>
</class>
</hibernate-mapping>
The name property of the equipment must be defined and unique. The name can be at most 50 characters long.
The parent property is not mandatory (this is the default, i.e. not_null=false) since the root node has Parent equal to null.
I have defined the children collection as a Set, that is an unordered collection of unique objects. I have chosen cascade="all-delete-orphan" to instruct NHibernate that he automatically saves, updates or deletes the children of an equipment too when I save, update or delete the equipment. I also have defined that the foreign key constraint that will be defined on the database between the equipment and its parent has a friendly name (fk_Equipment_EquipmentParent).
Now I can write a unit test to check whether NHibernate can create the correct schema for me. In this sample I use a SQL Server 2005 database (the express edition is ok too) since later on I want to show some specific features only available in SQL Server. The code of the test is
[Test]
public void Can_generate_schema()
{
var configuration = new Configuration();
configuration.Configure();
configuration.AddAssembly(typeof (Equipment).Assembly);
// show the schema for debugging purposes
new SchemaExport(configuration).Execute(true, false, false, false);
// generate the schema in the database
new SchemaExport(configuration).Execute(false, true, false, false);
}
This test runs successfully and generates the following output (SQL script)
if exists (select 1 from sys.objects
where object_id = OBJECT_ID(N'[fk_Equipment_ParentEquipment]')
AND parent_object_id = OBJECT_ID('Equipment'))
alter table Equipment drop constraint fk_Equipment_ParentEquipment
if exists (select * from dbo.sysobjects
where id = object_id(N'Equipment')
and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table Equipment
create table Equipment (
Id UNIQUEIDENTIFIER not null,
Name NVARCHAR(50) not null unique,
ParentId UNIQUEIDENTIFIER null,
primary key (Id)
)
alter table Equipment
add constraint fk_Equipment_ParentEquipment
foreign key (ParentId) references Equipment
The first part of the script drops the database objects if they are already there and in the second part of the script the table
Equipment is generated as well as the foreign key constraint between the equipment and its parent. The create script for the table looks exactly like expected.
The Repository
A Repository is defined as "a mechanism for encapsulating storage, retrieval, and search behavior which emulates a collection of objects."
As said above, for each Aggregate in the domain model one defines a Repository. Since we only have one aggregate in our simple domain we also will have only one repository. Let us call it EquipmentRepository.
Now I want to be able to eager load the complete aggregate given the id of the aggregate root. I write the following unit test
[Test]
public void Can_load_aggregate_by_id()
{
var equipmentId = ...; // some id
var equipment = _repository.GetAggregateById(equipmentId);
Assert.IsNotNull(equipment);
Assert.IsTrue(NHibernateUtil.IsInitialized(equipment.Parent));
Assert.IsTrue(NHibernateUtil.IsInitialized(equipment.Children));
Assert.AreEqual(3, equipment.Children.Count);
}
With this test I assert that a) I get an aggregate and b) the aggregate is fully populated. To check whether the aggregate is fully populated I use the NHibernateUtil utility class. The test assumes the existence of a equipment tree fragment in the database. Please refer to the project (here) for any details how I create this tree fragment in the SetUp of the test fixture.
A possibility to eager load an aggregate is by using a HQL query as below
public Equipment GetAggregateById(Guid equipmentId)
{
using (var session = SessionManager.OpenSession())
{
var sql = "from Equipment e" +
" left join fetch e.Parent p" +
" left join fetch e.Children c" +
" where e.Id = :id";
var equipment = session.CreateQuery(sql)
.SetGuid("id", equipmentId)
.UniqueResult<Equipment>();
return equipment;
}
}
I use left joins since either Parent can be null or the equipment has no children. The SessionManager class is a helper class which I use to provide me session objects. Please have a look at the project which you can find here for any details.
Finding the root aggregate is easy. We just have to look for an equipment whose parent is equal to null
public Equipment GetRootAggregate()
{
using (var session = SessionManager.OpenSession())
{
var sql = "from Equipment e" +
" left join fetch e.Children c" +
" where e.Parent is null";
var equipment = session.CreateQuery(sql)
.UniqueResult<Equipment>();
return equipment;
}
}
Now some times we want to find all leaf nodes of the tree. Remember that a leaf node has by definition no children. We could use the following code for this
public ICollection<Equipment> GetAllLeafNodes()
{
using (var session = SessionManager.OpenSession())
{
var sql = "from Equipment e" +
" left join fetch e.Parent p" +
" where e.Children.size = 0";
var leafs = session.CreateQuery(sql)
.List<Equipment>();
return leafs;
}
}
here we use the handy size function for collections provided by HQL. Note that this method delivers all leaf nodes of the tree. To only select the leaf nodes of a sub-tree starting at a given equipment node is trivial now and I leave for you as an exercise.
Eager load ancestors and descendants
To eager load all ancestors or descendants of a given equipment node we have to use native TSQL since HQL (as well as the criteria API) does not provide a generic way of doing it. Fortunately SQL server 2005 provides us the means to achieve the desired result. We have to construct a hierarchical query as below
public ICollection<Equipment> GetAllDescendantsOfEquipment(Guid equipmentId)
{
using (var session = SessionManager.OpenSession())
{
var sql = "with Hierachy(Id, Name, ParentId, Level)" +
" as" +
" (" +
" select Id, Name, ParentId, 0 as Level" +
" from Equipment e" +
" where e.Id = :id" +
" union all" +
" select e.Id, e.Name, e.ParentId, eh.Level + 1" +
" from Equipment e" +
" inner join Hierachy eh" +
" on e.ParentId = eh.Id" +
" )" +
" select Id, Name, ParentId" +
" from Hierachy" +
" where Level > 0";
var list = session.CreateSQLQuery(sql)
.AddEntity(typeof(Equipment))
.SetGuid("id", equipmentId)
.List<Equipment>();
return list;
}
}
The last where statement in the SQL assures that the given equipment is not included in the list of descendants.
Analogous we can get the list of ancestors of a given equipment node
public ICollection<Equipment> GetAllAncestorsOfEquipment(Guid equipmentId)
{
using (var session = SessionManager.OpenSession())
{
var sql = "with Hierachy(Id, Name, ParentId, Level)" +
" as" +
" (" +
" select Id, Name, ParentId, 0 as Level" +
" from Equipment e" +
" where e.Id = :id" +
" union all" +
" select e.Id, e.Name, e.ParentId, eh.Level + 1" +
" from Equipment e" +
" inner join Hierachy eh" +
" on e.Id = eh.ParentId" +
" )" +
" select Id, Name, ParentId" +
" from Hierachy" +
" where Level > 0";
var list = session.CreateSQLQuery(sql)
.AddEntity(typeof(Equipment))
.SetGuid("id", equipmentId)
.List<Equipment>();
return list;
}
}
Note that we had just to adjust the inner join between the Equipment table and the Hierarchy (temp) table. Again the last where statement in the SQL assures that the given equipment is not included in the list of ancestors.
An alternative approach
We can also choose another approach for the tree problem. The solution I want to show works for any database even if it does not support hierarchical queries. This time our simple domain model looks as follows
Here we have a node object which can have a parent and a collection of children. Additionally we have defined the two collections descendants and ancestors which contain all descendants (children, children of the children and so on) and all parents of the given node.
The code for the Node class is given by
public class Node
{
public Guid Id { get; set; }
public string Name { get; set; }
public Node Parent { get; set; }
public ISet<Node> Children { get; set; }
public ISet<Node> Ancestors { get; set; }
public ISet<Node> Descendants { get; set; }
}
The mapping for the node is defined as
<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
assembly="TreeStructure"
namespace="TreeStructure">
<class name="Node">
<id name="Id">
<generator class="guid"/>
</id>
<property name="Name" length="50" not-null="true" unique="true"/>
<many-to-one name="Parent" class="Node" column="ParentId" />
<set name="Children" inverse="true" cascade="all" >
<key column="ParentId" foreign-key="fk_Node_ParentNode"/>
<one-to-many class="Node"/>
</set>
<set name="Descendants" table="NodeHierarchy" inverse="true" >
<key column="ParentId" foreign-key="fk_HierarchyNode_ParentNode"/>
<many-to-many class="Node" column="ChildId" foreign-key="fk_HierarchyNode_ChildNode" />
</set>
<set name="Ancestors" table="NodeHierarchy" >
<key column="ChildId" />
<many-to-many class="Node" column="ParentId" />
</set>
</class>
</hibernate-mapping>
Note that we define the descendants and ancestors of a node as a many-to-many relation in the database. So the SQL script created by the SchemaExport class of NHibernate is
if exists (select 1 from sys.objects
where object_id = OBJECT_ID(N'[fk_Node_ParentNode]')
AND parent_object_id = OBJECT_ID('Node'))
alter table Node drop constraint fk_Node_ParentNode
if exists (select 1 from sys.objects
where object_id = OBJECT_ID(N'[fk_HierarchyNode_ChildNode]')
AND parent_object_id = OBJECT_ID('NodeHierarchy'))
alter table NodeHierarchy drop constraint fk_HierarchyNode_ChildNode
if exists (select 1 from sys.objects
where object_id = OBJECT_ID(N'[fk_HierarchyNode_ParentNode]')
AND parent_object_id = OBJECT_ID('NodeHierarchy'))
alter table NodeHierarchy drop constraint fk_HierarchyNode_ParentNode
if exists (select * from dbo.sysobjects
where id = object_id(N'Node')
and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table Node
if exists (select * from dbo.sysobjects
where id = object_id(N'NodeHierarchy')
and OBJECTPROPERTY(id, N'IsUserTable') = 1)
drop table NodeHierarchy
create table Node
(
Id UNIQUEIDENTIFIER not null,
Name NVARCHAR(50) not null unique,
ParentId UNIQUEIDENTIFIER null,
primary key (Id))
create table NodeHierarchy
(
ParentId UNIQUEIDENTIFIER not null,
ChildId UNIQUEIDENTIFIER not null,
primary key (ParentId, ChildId)
)
alter table Node add constraint fk_Node_ParentNode
foreign key (ParentId) references Node
alter table NodeHierarchy add constraint fk_HierarchyNode_ChildNode
foreign key (ChildId) references Node
alter table NodeHierarchy add constraint fk_HierarchyNode_ParentNod