Eager loading aggregate with many child collections

In a comment to my post on lazy loading versus eager loading which you can find here I have been asked whether NHibernate 2.0 can now eagerly load an aggregate where the root has more than one child collection. The answer is yes but... Let's discuss it with a sample.

Imagine having the following (academic) domain model.

image

I have identified a Blog-aggregate. This aggregate has the Blog entity as root and has two child collections Posts and Readers. Additionally it has an Author child. If I want to eager load this aggregate with NHibernate I have several possibilities to do so. (If you are new to NHibernate please refer to this post for a detailed introduction.)

Preparing the initial data

For my tests I need some data in my sample database. I generate a single blog with one author, 3 readers and two posts as follows

private void CreateInitialData()
{
    var author = new Person {Name = "Gabriel Schenker"};
    _blog = new Blog {Name = "NHibernate FAQ", Author = author};
    var readers = new[]
                      {
                          new Person {Name = "John Doe"},
                          new Person {Name = "Sue Foo"},
                          new Person {Name = "Ann Li"},
                      };
    var posts = new[]
                    {
                        new Post {Title = "Lazy Loading - eager loading"},
                        new Post {Title = "MultiCriteria queries"},
                    };
    _blog.Readers.AddAll(readers);
    _blog.Posts.AddAll(posts);
 
    using (ISession session = SessionFactory.OpenSession())
    using (ITransaction transaction = session.BeginTransaction())
    {
        session.Save(author);
        session.Save(_blog);
        transaction.Commit();
    }
}

Not recommended solution

Let's start with the "bad" solution for the problem. I have defined the following test method

[Test]
public void Eager_load_blog_aggregate_in_NOT_recommended_way()
{
    Blog fromDb;
    using (ISession session = SessionFactory.OpenSession())
    {
        string sql = "from Blog b" +
                     " inner join fetch b.Author" +
                     " left join fetch b.Readers" +
                     " left join fetch b.Posts" +
                     " where b.Id=:id";
        fromDb = session.CreateQuery(sql)
            .SetGuid("id", _blog.Id)
            .UniqueResult<Blog>();
    }
 
    Assert.IsTrue(NHibernateUtil.IsInitialized(fromDb.Author));
    Assert.IsTrue(NHibernateUtil.IsInitialized(fromDb.Readers));
    Assert.IsTrue(NHibernateUtil.IsInitialized(fromDb.Posts));
    Assert.AreEqual(3, fromDb.Readers.Count);
    Assert.AreEqual(2, fromDb.Posts.Count);
}

This works, that is my test is green. Work done, I can go home - or can't I?

No, the fact that the test passes does not mean, that this is a GOOD solution! Let's analyze what SQL NHibernate generates. It is similar to that:

select          blog.Id as Id1_0_, 
                author.Id as Id2_1_, 
                readers.Id as Id2_2_, 
                posts.Id as Id3_3_, 
                blog.Name as Name1_0_, 
                blog.Author as Author1_0_, 
                author.Name as Name2_1_, 
                readers.Name as Name2_2_, 
                posts.Title as Title3_3_, 
                readers.BlogId as BlogId0__, 
                readers.Id as Id0__, 
                posts.BlogId as BlogId1__, 
                posts.Id as Id1__ 
from            Blog blog 
inner join      Person author on blog.Author=author.Id 
left outer join Person readers on blog.Id=readers.BlogId 
left outer join Post posts on blog.Id=posts.BlogId 
where           blog.Id='090de8b6-4c36-4f73-9d63-12dbd6b332b2'

Please notice that you have three joins, an inner and two outer joins. NHibernate returns a single set of records containing the cross product of all three joins. That is: if a blog has one author, 10 posts and 100 readers the result set will have 1*10*100 = 1000 records. If you have 100 posts and 1000 readers you will get a set with 100'000 records!

Now imagine the blog having three child collections instead of 2... the situation would be catastrophic and your DBA would insist that you get fired.

Is there a better way? Of course! NHibernate 2.0 introduces the IMultiQuery and IMultiCriteria interfaces. With these interfaces you can send several select statements to the database in one go (Note: not all databases support this, check your provider)

Better solution

HQL Multi Query

Let's start with the MultiQuery class. I define a test method as follows

[Test]
public void Eager_load_blog_aggregate_with_multi_query()
{
    Blog fromDb;
    using (ISession session = SessionFactory.OpenSession())
    {
        var sql1 = "from Blog b inner join fetch b.Author where b.Id=:id";
        var sql2 = "from Blog b inner join fetch b.Readers where b.Id=:id";
        var sql3 = "from Blog b inner join fetch b.Posts where b.Id=:id";
 
        var result = session.CreateMultiQuery()
                        .Add(session.CreateQuery(sql1))
                        .Add(session.CreateQuery(sql2))
                        .Add(session.CreateQuery(sql3))
                        .SetParameter("id", _blog.Id)
                        .List();
        IList list = (IList) result[0];
        fromDb = list[0] as Blog;
    }
 
    Assert.AreNotSame(_blog, fromDb);
    Assert.IsTrue(NHibernateUtil.IsInitialized(fromDb.Author));
    Assert.IsTrue(NHibernateUtil.IsInitialized(fromDb.Readers));
    Assert.IsTrue(NHibernateUtil.IsInitialized(fromDb.Posts));
    Assert.AreEqual(3, fromDb.Readers.Count);
    Assert.AreEqual(2, fromDb.Posts.Count);
}

NHibernate now sends 3 select statements in a single call to the database. I'm using an SQL Server 2005 database and the SQL sent by NHibernate is this

select  blog0_.Id as Id1_0_, 
        person1_.Id as Id2_1_, 
        blog0_.Name as Name1_0_, 
        blog0_.Author as Author1_0_, 
        person1_.Name as Name2_1_ 
from    Blog blog0_ 
inner join Person person1_ on blog0_.Author=person1_.Id 
where (blog0_.Id=@p0 );
 
select  blog0_.Id as Id1_0_, 
        readers1_.Id as Id2_1_, 
        blog0_.Name as Name1_0_, 
        blog0_.Author as Author1_0_, 
        readers1_.Name as Name2_1_, 
        readers1_.BlogId as BlogId0__, 
        readers1_.Id as Id0__ 
from    Blog blog0_ 
inner join Person readers1_ on blog0_.Id=readers1_.BlogId 
where (blog0_.Id=@p1 );
 
select  blog0_.Id as Id1_0_, 
        posts1_.Id as Id3_1_, 
        blog0_.Name as Name1_0_, 
        blog0_.Author as Author1_0_, 
        posts1_.Title as Title3_1_, 
        posts1_.BlogId as BlogId0__, 
        posts1_.Id as Id0__ 
from    Blog blog0_ 
inner join Post posts1_ on blog0_.Id=posts1_.BlogId 
where (blog0_.Id=@p2 );
 
; @p0 = '8558ccf4-449f-497a-8619-12458a0aa402', 
@p1 = '8558ccf4-449f-497a-8619-12458a0aa402', 
@p2 = '8558ccf4-449f-497a-8619-12458a0aa402'

SQL server will return 3 sets of records having 1, 3 and 2 records each (in our case). If I now assume I have a blog with 1 author, 10 posts and 100 readers SQL server will return 1+10+100=111 records. That's significantly less than in the bad sample from above where the database returns 1000 records. The situation is even better for the situation where I have 100 posts and 1000 reader. Here the database will return only 1+100+1000 = 1101 records. Compare this to the 100'000 records of the bad sample!

Multi Criteria Query

You can get the same result when using Criteria Queries. Take a look at the following test method

[Test]
public void Eager_load_blog_aggregate_with_multi_criteria()
{
    Blog fromDb;
    using (ISession session = SessionFactory.OpenSession())
    {
        var criteria1 = DetachedCriteria.For<Blog>()
            .Add(Restrictions.Eq("Id", _blog.Id))
            .SetFetchMode("Author", FetchMode.Eager)
            .CreateCriteria("Readers", JoinType.LeftOuterJoin);
        var criteria2 = DetachedCriteria.For<Blog>()
            .Add(Restrictions.Eq("Id", _blog.Id))
            .CreateCriteria("Posts", JoinType.LeftOuterJoin);
 
        var result = session.CreateMultiCriteria()
            .Add(criteria1)
            .Add(criteria2)
            .List();
        IList list = (IList) result[0];
        fromDb = list[0] as Blog;
    }
 
    Assert.AreNotSame(_blog, fromDb);
    Assert.IsTrue(NHibernateUtil.IsInitialized(fromDb.Author));
    Assert.IsTrue(NHibernateUtil.IsInitialized(fromDb.Readers));
    Assert.IsTrue(NHibernateUtil.IsInitialized(fromDb.Posts));
    Assert.AreEqual(3, fromDb.Readers.Count);
    Assert.AreEqual(2, fromDb.Posts.Count);
}

Here I only have defined 2 criteria since the blog has exactly one author I condensed the join with author and the join with readers. The resulting SQL is for my SQL server 2005

SELECT this_.Id as Id1_2_, 
this_.Name as Name1_2_, 
this_.Author as Author1_2_, 
person3_.Id as Id2_0_, 
person3_.Name as Name2_0_, 
person1_.BlogId as BlogId4_, 
person1_.Id as Id4_, 
person1_.Id as Id2_1_, 
person1_.Name as Name2_1_ 
FROM Blog this_ 
left outer join Person person3_ on this_.Author=person3_.Id 
left outer join Person person1_ on this_.Id=person1_.BlogId 
WHERE this_.Id = @p0;
 
SELECT this_.Id as Id1_1_, this_.Name as Name1_1_, 
this_.Author as Author1_1_, 
post1_.BlogId as BlogId3_, 
post1_.Id as Id3_, 
post1_.Id as Id3_0_, 
post1_.Title as Title3_0_ 
FROM Blog this_ 
left outer join Post post1_ on this_.Id=post1_.BlogId 
WHERE this_.Id = @p1;
 
; @p0 = 'dc29a877-3b28-4c3a-b8dd-4e015d77851d', 
@p1 = 'dc29a877-3b28-4c3a-b8dd-4e015d77851d'

Summary

I have shown you several ways how to eagerly load an aggregate where the root is having more than one child collection. I also have shown the pitfalls of a possible (bad) solution.

Blog Signature Gabriel .

Print | posted on Sunday, April 06, 2008 10:11 AM

Comments on this post

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
Keep the good work up guys. This is great stuff. I have been dying for a blog with good examples about scenarios about nhibernate queries. The samples are well documented as well.

For a future posting, could you do one on complex mappings that people may run into? This is one area where I am still very weak. An example would be inheritance mapping options and perhaps one-to-one associations with nhibernate or how you would approach it. Great post!
Left by Sean Chambers on Apr 06, 2008 11:43 AM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
Wow. This is powerful. Do you mind sharing the source code for this and your other posts? Each post you have is really interesting.
Left by Chua Wen Ching on Apr 06, 2008 6:29 PM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
Very impressive work!

I'll be watching the progress. This will be one of my daily readings!

Left by Benny on Apr 06, 2008 7:43 PM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
@Chua: We will setup a source code repository on google soon. Please be patient.
Left by Gabriel Schenker on Apr 07, 2008 6:32 AM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
this post series is awesome, by the way, I already implemented the sources myself. I can share this if you can't wait the repository to be setup.
Left by jack on Apr 07, 2008 4:44 PM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
Thanks Gabriel :)
Left by Chua Wen Ching on Apr 08, 2008 2:07 PM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
"Not recommended solution" : Indeed super bad
"HQL Multi Query" : Still fetches Blog-entry data Author + Readers + Posts times, where just one is enough.
"Multi Criteria Query" : Not only fetches blog-entry data Author + Readers + Posts times, but also Author data.
Is there a way to get the optimal:
select * from blog inner join author on ... where blog.id = n
select * from posts where blog_id = n
select * from readers where blog_id = n

Also, how is it possible to optimal queries when the original is a list?
Left by Dennis on Apr 10, 2008 3:43 AM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
@Dennis: sorry, don't fully understand your comment...
I am talking of eagerly loading a complete aggregate in one go (one call to the database).
Can you specify more precisely where you see a problem?
Left by Gabriel Schenker on Apr 10, 2008 4:13 AM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
@Gabriel : Two different issues here.
1. The queries issued in your examples are suboptimal.
My simple test in db:
"Not recommended solution":
Sent data: 1574 bytes
Received data: 1480 bytes
"HQL Multi Query":
Sent data: 2028 bytes
Received data: 1054 bytes
"Multi Criteria Query":
Sent data: 1438 bytes
Received data: 1005 bytes
"Optimal":
Sent data: 1224 bytes
Received data: 694 bytes

select blog0_.Id as Id1_0_,
person1_.Id as Id2_1_,
blog0_.Name as Name1_0_,
blog0_.Author_id as Author1_0_,
person1_.Name as Name2_1_
from Blog blog0_
inner join Person person1_ on blog0_.Author_Id=person1_.Id
where (blog0_.Id=1 );

SELECT post1_.Blog_Id as BlogId3_,
post1_.Id as Id3_,
post1_.Id as Id3_0_,
post1_.Title as Title3_0_ FROM post post1_ where post1_.blog_id = 1

select reader_.Blog_Id as BlogId4_,
reader_.Id as Id4_,
reader_.Id as Id2_1_,
reader_.Name as Name2_1_
from person reader_ where reader_.blog_id = 1
GO


2. Grandchildren and children of collections.
benchmarks from llblgen vs. LINQ to SQL eager loading
Left by Dennis on Apr 10, 2008 2:20 PM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
Jack, Can you send me the code on duhoux.patrice@gmail.com ?

Thanks in advance
Left by pduh on Apr 10, 2008 6:48 PM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
Really great tutorial, Gabriel, and easy to follow. However, when I run the multi-query examples, I get a SQL query parsing error on the second SELECT. Did I miss a SQL setup parameter to allow multi queries somewhere? Thanks a lot.
Left by Eivind on Nov 22, 2008 1:44 AM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
Sorry, my fault. SQL Compact does not support multi queries. With SQL Express it works.
Left by Eivind on Nov 22, 2008 6:30 AM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
These articles rock! Thank you for all the time and effort you've put into them.

I wanted to ask you if there is any way to get nHibernate to use eager loading when querying with Linq?
Left by Tomas Matuschek on Mar 02, 2009 10:32 PM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
Hey, this is a solution i've been searching for for more than 2 weeks. No-one seems to know about this except you!!
The first solution I was actually searching for is the other way around. I wanted to keep the lazy loading always false (all is loaded) and only when loading a list i want to place the lazy to true for some collections. Is that also possible?

Left by Peter Derwa on Apr 21, 2009 1:21 AM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
I've got 4 one to many bag properties and 10 many to one properties. the many to one properties are not lazy. with the multicriteria that results to loading those 10 many to one properties 4 times, which is taking too long.
Is there another solution except lazy load the many to one properties?
Left by Peter Derwa on Apr 21, 2009 2:22 AM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
how do i make sure the lazy loading on a collection of one of the collections is false?

greets
Left by Peter Derwa on May 03, 2009 6:34 PM

# re: Eager loading aggregate with many child collections

Requesting Gravatar...
Sorry for this question, but if you get in the first query 100,000 records, but only 1001 in the multiquery, what is the excedent data? trash?

Regards
Left by Cesar Sanz on May 20, 2009 10:44 AM

Your comment:

 (will show your gravatar)
 
Please add 7 and 1 and type the answer here: