Welcome All

The aim of this blog is to discuss the frequently asked questions regarding NHibernate. We'd like to provide possible solutions to the questions and provide meaningful samples if valuable.

The home page of NHibernate can be found here.

A discussion group for talking about usage scenarios and asking questions about how to do things with NHibernate can be found here.

Valuable documentation about NHibernate can be found here.

Blog Signature Gabriel

Latest Posts

Dictionary Adapter 1.1 Released

It can be downloaded from SourceForge.

posted @ 11/22/2009 3:42 PM by Craig

Article series on NHibernate and Fluent NHibernate – Part 3

Today the third part of my series on NHibernate and Fluent NHibernate went live. It took a little bit longer due to work overload. You can read it here.

Summary

In part 3 of this article series about NHibernate and Fluent NHibernate I have discussed how to let Fluent NHibernate (FNH) automatically map a domain model to a data model. We have seen that FNH provides a reasonable mapping out of the box by using default conventions. I have shown how one can implement user-defined conventions which influence how the mapping is defined at a very fine-grained level. I have also shown that if we use a base class in our domain which implements common functionality, we can instruct FNH to ignore this class and just map the “real” entities.

FNH with its auto mapping feature reduces the task of mapping a complex domain to an underlying data model to just a few keystrokes. And as is always true: less code results in fewer bugs and less maintenance overhead.

posted @ 9/1/2009 4:51 PM by NHibernate's Answers

Article series on NHibernate and Fluent NHibernate – Part 2

Today the second part of my series on NHibernate and Fluent NHibernate went live. You can read it here.

Summary

In part 2 of the article series I have continued to implement the remaining part of the domain which I had introduced in the first part. I discuss the mapping of various forms of relations between different entities. For all mappings I have presented the code needed to verify the mappings.

In part 3 of the article series I’ll show how one can further refactor and improve the mapping of the domain model. I’ll then discuss the usage of conventions and finally show how one can use the auto-mapping capabilities of Fluent NHibernate to completely avoid the explicit mapping of entities.

posted @ 4/16/2009 12:30 AM by NHibernate's Answers

From a data centric to a domain driven design

Introduction

Is NHibernate, as an ORM tool or framework, better suited for a data centric application or for an application developed by applying domain driven design? Well, that should NOT be the question here. NHibernate can serve both approaches equally well. But which one would you choose?

A data centric design

The modeling of the data structure is the center piece of this kind of design. The database is the most important part of our architecture. The DBA plays a very important role.

Data Layer

That's OK for simple problems like managing a collection of customer addresses and nothing more. But it fails miserably as soon as we get into more complicated business. Let's assume, for example, that all of a sudden our address management tool needs to be extended to a full-fledged CRM solution.

Possible Problems

  • Management nightmare with possibly hundreds of stored procedures
  • No clear structure in the data. It seems that each entity (or table) has more or less the same importance
  • We concentrate too much on the data and not on the business processes and/or functionality

Our principal questions are focused around

  • Do we have all the data that we need (tables and the columns therein)?
  • Did we normalize the data sufficiently?

Instead of

  • What do we want to automate and why do we want to do so?
  • How can we measure that we have achieved what we want (criteria!)?

User Interface

On the other hand we concentrate too much on e.g. data binding on the GUI and how to best present (too much) data on the screen for the user.

A sample taken from a real application

This is a sample from a real application (although simplified). At the beginning there was a data model (ERD) which was elaborated by two people in a "heroic" effort lasting several man-months. Let's show an extract of this data model:

image

After staring at the ERD for a while it might become clear that we want to manage employees and their assigned tasks. It's also clear that an employee is a person and has an address and two different titles. Furthermore, each employee can have an associated photo. Finally we see that the data regarding an employee is split into two different tables, [Employee] and [EmployeeDetails], obviously for scalability reasons. Since photos can become very large (compared to the rest of the data) they are kept in a separate table [EmployeePhoto], which possibly resides in a separate database file on another disk.

Stored Procedures

Traditionally Microsoft has suggested that we write stored procedures to access and manipulate the data in the various tables. So usually one would write the following kind of stored procedures for each table (here we use the Address table as an example)

  • usp_AddressInsert(...)
  • usp_AddressUpdate(...)
  • usp_AddressDelete(...)
  • usp_AddressSelectById(...)

If we look at a table like [Employee] we would probably have many more stored procedures representing various querying scenarios, e.g.

  • usp_EmployeeSelectAll()
  • usp_EmployeeSelectActive()
  • usp_EmployeeSelectByDepartment(...)
  • etc.

Data access objects (DAO)

Now, having the stored procedures in place, we write a DAO for each table to produce an object-oriented wrapper around the database (the business object should not know anything about the database). We thus end up with a class like the following for, e.g., the Address table

public class AddressDAO
{
    public void Insert(...) {...}
    public void Update(...) {...}
    public void Delete(...) {...}
    public Address GetById(...) {...}
}

A domain driven design

The modeling of the business processes is the center piece of this kind of design. The so called domain model is the center of our architecture. The domain expert (a representative of the customer with deep knowledge of the domain for which the application is planned) plays a very important role during all the development phases (analysis, design and implementation). The development often is more evolutionary and goes in "cycles". It's often a so called agile application development.

Distilling the ubiquitous language

Together with the domain expert we want to find and use the ubiquitous language which best describes the business we are talking about. In our simple case this ubiquitous language is "hidden" in the following fragments of a discussion

"... So we want to keep track of our employees. Especially we want to know where the employee lives and how we can contact him at home as well as in the office. To better identify an employee (and for other usage as well, e.g. to apply for a business travel visa) we want to have a photo of him in the system. ..."

"... yeah, each employee has a business card of course where his title and job description are marked... thus I guess we have to store this information - we call it title - in the system as well since the relevant text for a new business card shall be generated automatically... By the way, the titles should be available in German and in English. ..."

"... each employee has a list of tasks to carry out during his daily work. This list should be maintained and updated by the system. Tasks can be produced automatically by the fact that some event occurred, e.g. there is a redemption of a bond. Other tasks are a result of a contact with a customer. And last but not least the group manager can define some tasks for an employee or a group of employees. ..."

"... as an employee I want to see a list of my open tasks. I also want to see if tasks are overdue and which tasks will occur in the near future. ... I also want to see the priority of each task and what its current status is. ..."

I have marked in bold the key terms for our simplified domain. These nouns will be good candidates for entities or value objects.

A first model of our domain

We can immediately locate two hot candidates for entities. These are employee and task. We also immediately grasp that an employee might have a list of associated tasks. Everything else is just detail for the moment. So let's draw a nice diagram

image

Refining the model

Up to now our model does not contain much structure or business logic. Let's start with the question of how we could better model the employee entity. At the moment the employee contains just a bunch of properties and no clear internal structure is visible. Finding out, for example, whether an instance of employee is in a valid state is rather cumbersome. The value object comes to the rescue! What about this?

image

In the above model I have tried to group properties which belong together. Let's take as an example the address. A valid address is represented by an address line, a postal code and a city (in the real world we would also have a link to a country there). An address line alone does not really make sense. Again: we can validate an address only if we have the whole group of relevant properties at hand. On the other hand it does not really make sense to put the address validation logic into the employee entity itself. Thus the extraction of a new class - the address class - is certainly an improvement. In the terminology of DDD the address is now a value object. It has no identity by itself and belongs to and characterizes an entity (the employee). A value object can be recreated any time by just copying the content of its properties. Further on a value object is immutable. Once created it cannot be changed. Two instances of a value object are equal if the content of their fields match. A value object can never be in an invalid state.

Let's have a look at the code of the address class then

public class Address : IEquatable<Address>
{
    public Address(string addressLine1, string addressLine2, string postalCode, string city)
    {
        if(addressLine1==null) 
            throw new ArgumentException("Address line 1 cannot be undefined.");
        if(postalCode==null) 
            throw new ArgumentException("Postal code cannot be undefined.");
        if(city==null) 
            throw new ArgumentException("City cannot be undefined.");
 
        AddressLine1 = addressLine1;
        AddressLine2 = addressLine2;
        PostalCode = postalCode;
        City = city;
    }
 
    public string AddressLine1 { get; private set; }
    public string AddressLine2 { get; private set; }
    public string PostalCode { get; private set; }
    public string City { get; private set; }
 
    public bool Equals(Address other)
    {
        if(other==null) return false;
        return AddressLine1 == other.AddressLine1 &&
               ((AddressLine2==null && other.AddressLine2==null) || 
               (AddressLine2 != null && AddressLine2 == other.AddressLine2)) &&
               PostalCode == other.PostalCode &&
               City == other.City;
    }
 
    public override bool Equals(object obj)
    {
        return Equals(obj as Address);
    }
 
    public override int GetHashCode()
    {
        return string.Format("{0}|{1}|{2}|{3}", 
            AddressLine1, AddressLine2, PostalCode, City).GetHashCode();
    }
}

Note that all properties are read-only from the outside (their setters are private) to account for the fact that a value object is immutable. Thus an address can only be populated through its constructor. Note also that I have implemented the interface IEquatable<T> and overridden the two methods Equals and GetHashCode to be able to compare two instances of type Address for equality. Pay attention to the special treatment of the AddressLine2 property in the Equals method. It's the only property that can be null. All other properties cannot be null. This requirement is enforced in the constructor where I check each non-nullable parameter and throw an exception if the requirement is not met.

The above code is a typical sample of how one would implement a value object.
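To make the equality semantics concrete, here is a small sketch that exercises a condensed copy of the Address class. The sample values and the ValueObjectDemo class are made up for illustration; the point is only that two separately constructed instances with the same content compare as equal and collapse to one entry in a hash-based collection.

```csharp
using System;
using System.Collections.Generic;

// Condensed copy of the Address value object from the post.
public sealed class Address : IEquatable<Address>
{
    public Address(string addressLine1, string addressLine2, string postalCode, string city)
    {
        if (addressLine1 == null) throw new ArgumentException("Address line 1 cannot be undefined.");
        if (postalCode == null) throw new ArgumentException("Postal code cannot be undefined.");
        if (city == null) throw new ArgumentException("City cannot be undefined.");

        AddressLine1 = addressLine1;
        AddressLine2 = addressLine2;
        PostalCode = postalCode;
        City = city;
    }

    public string AddressLine1 { get; private set; }
    public string AddressLine2 { get; private set; }
    public string PostalCode { get; private set; }
    public string City { get; private set; }

    public bool Equals(Address other)
    {
        if (ReferenceEquals(other, null)) return false;
        return AddressLine1 == other.AddressLine1 &&
               ((AddressLine2 == null && other.AddressLine2 == null) ||
                (AddressLine2 != null && AddressLine2 == other.AddressLine2)) &&
               PostalCode == other.PostalCode &&
               City == other.City;
    }

    public override bool Equals(object obj) { return Equals(obj as Address); }

    public override int GetHashCode()
    {
        return string.Format("{0}|{1}|{2}|{3}",
            AddressLine1, AddressLine2, PostalCode, City).GetHashCode();
    }
}

// Hypothetical demo class, not part of the original post.
public static class ValueObjectDemo
{
    public static void Main()
    {
        var a = new Address("Main Street 1", null, "8000", "Zurich");
        var b = new Address("Main Street 1", null, "8000", "Zurich");
        var c = new Address("Main Street 1", "2nd floor", "8000", "Zurich");

        Console.WriteLine(ReferenceEquals(a, b)); // False: two distinct instances
        Console.WriteLine(a.Equals(b));           // True: same content
        Console.WriteLine(a.Equals(c));           // False: AddressLine2 differs

        // Equal value objects share a hash code, so the set keeps only one of a and b.
        var set = new HashSet<Address> { a, b, c };
        Console.WriteLine(set.Count);             // 2
    }
}
```

Because Equals and GetHashCode agree with each other, such a value object can safely be used as a dictionary key or set member.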

I think it makes sense that the Employee entity, too, is always in a valid state. One way to achieve that is to make all properties of the entity read-only as well and force any modification to go through dedicated methods. I would also provide a constructor which expects all mandatory values, e.g.

public Employee(Name name, Title title1, Address homeAddress, Address officeAddress, 
    Contact homeContact, Contact officeContact)
{
    if (name == null)
        throw new ArgumentException("Name of employee cannot be undefined.");
    if (title1 == null)
        throw new ArgumentException("Title 1 of employee cannot be undefined.");
    if (homeAddress == null)
        throw new ArgumentException("Home address of employee cannot be undefined.");
    if (officeAddress == null)
        throw new ArgumentException("Office address of employee cannot be undefined.");
    if (homeContact == null)
        throw new ArgumentException("Home contact of employee cannot be undefined.");
    if (officeContact == null)
        throw new ArgumentException("Office contact of employee cannot be undefined.");
 
    Name = name;
    Title1 = title1;
    HomeAddress = homeAddress;
    OfficeAddress = officeAddress;
    HomeContact = homeContact;
    OfficeContact = officeContact;
}

Once an employee exists in the system and I want to change, for example, its home address, I use a dedicated method of the Employee class

public void ChangeHomeAddress(Address newHomeAddress)
{
    if (newHomeAddress == null)
        throw new ArgumentException("Cannot change home address of employee to undefined.");
    HomeAddress = newHomeAddress;
}

This method again ensures that after the update the employee instance is still in a valid state.
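The following sketch shows the effect of such a guarded method. It is deliberately trimmed: the Address here is a hypothetical one-field stand-in and the Employee only carries a home address, so it is not the full model from the post. It merely illustrates that a rejected change leaves the entity in its last valid state.

```csharp
using System;

// Hypothetical one-field stand-in for the Address value object (assumption, to keep the sketch short).
public sealed class Address
{
    public Address(string city)
    {
        if (city == null) throw new ArgumentException("City cannot be undefined.");
        City = city;
    }

    public string City { get; private set; }
}

// Trimmed Employee: only the home address is modelled here.
public class Employee
{
    public Employee(Address homeAddress)
    {
        if (homeAddress == null)
            throw new ArgumentException("Home address of employee cannot be undefined.");
        HomeAddress = homeAddress;
    }

    // Private setter: callers must go through ChangeHomeAddress.
    public Address HomeAddress { get; private set; }

    public void ChangeHomeAddress(Address newHomeAddress)
    {
        if (newHomeAddress == null)
            throw new ArgumentException("Cannot change home address of employee to undefined.");
        HomeAddress = newHomeAddress;
    }
}

// Hypothetical demo class, not part of the original post.
public static class EmployeeDemo
{
    public static void Main()
    {
        var employee = new Employee(new Address("Zurich"));
        employee.ChangeHomeAddress(new Address("Bern"));

        try
        {
            employee.ChangeHomeAddress(null); // invalid change is rejected
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine("Rejected: " + ex.Message);
        }

        // The entity still holds the last valid address.
        Console.WriteLine(employee.HomeAddress.City); // Bern
    }
}
```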

Summary

For me it's obvious that the domain driven approach is definitely the better way to go when implementing an application which models complex business processes. A more data centric approach may still make sense when dealing with a forms-over-data type application which does not involve complex business processes. But even when implementing a data centric application I would certainly never ever use stored procedures to access the data. I would instead use an ORM tool like NHibernate to access the database. The only exception I can see for justifying the use of stored procedures is in reporting scenarios where some massive data mapping, filtering and/or aggregation might be needed which is best done directly on the database server (in-process for maximum speed).

Enjoy

Blog Signature Gabriel

posted @ 2/3/2009 4:08 AM by NHibernate's Answers

I'm a "Los Techies" member now

I just wanted to inform you that I have been invited to be a member of Los Techies which is a big honor for me.

LosTechies Logo

If you like my posts then you might as well have a look at my blog at LosTechies.  There I'll publish posts that do not necessarily have anything to do with NHibernate. To start with I have posted an article about the single responsibility principle (SRP) which I personally consider to be one of the most important principles for my daily development!

Don't worry, I'll continue to publish NHibernate related articles in this blog. So stay tuned.

Blog Signature Gabriel

posted @ 1/21/2009 1:30 AM by NHibernate's Answers

Type analyzer - Interesting findings

When reading this post from Ayende I got curious about what this site would say about my blogging style here. The result amuses me a lot.

INTJ - The Scientists

The long-range thinking and individualistic type. They are especially good at looking at almost anything and figuring out a way of improving it - often with a highly creative and imaginative touch. They are intellectually curious and daring, but might be physically hesitant to try new things.
The Scientists enjoy theoretical work that allows them to use their strong minds and bold creativity. Since they tend to be so abstract and theoretical in their communication they often have a problem communicating their visions to other people and need to learn patience and use concrete examples. Since they are extremely good at concentrating they often have no trouble working alone.

I would be very pleased if you could provide some feedback about what you think of these findings... Especially interesting is the question whether

  • I am too abstract or theoretical
  • I can communicate my visions to you
  • My examples are concrete enough

posted @ 11/30/2008 9:39 PM by NHibernate's Answers

Linq to NHibernate

The popularity of NHibernate is steadily increasing. At the same time people are getting used to LINQ. A LINQ to NHibernate provider has existed for quite some time now. It's not a complete implementation of a LINQ provider but it is still quite useful. Most of the day-to-day problems we face when developing typical business applications can be solved by using this provider. And if there is a query that cannot be executed against the provider we still have the option of falling back to the Hibernate Query Language (HQL).

In this post I'll give you an introduction on how you can use NHibernate in conjunction with LINQ. Let's start with the definition of a domain model we are going to use when querying the database by using LINQ expressions or LINQ query operators.

The domain model

I have the following simple domain model. A person can have a set of assigned tasks.

model

The entities

I want to keep the entities as simple as possible for this example. So their implementation is as follows

public class Person
{
    public Person()
    {
        Tasks = new List<Task>();
    }
 
    public virtual long Id { get; set; }
    public virtual string Lastname { get; set; }
    public virtual string Firstname { get; set; }
    public virtual IList<Task> Tasks { get; set; }
}
 
public class Task
{
    public virtual long Id { get; set; }
    public virtual string TaskName { get; set; }
    public virtual DateTime DueDate { get; set; }
}

One thing worth noting is the fact that I instantiate a new empty Tasks collection in the constructor of the Person class.

The mapping

For the mapping of my domain model I am going to use the Fluent NHibernate framework (please refer to my previous posts for a detailed introduction to this framework: part 1, part 2, part 3, part 4).

The Task is very easy to map

public class TaskMap : ClassMap<Task>
{
    public TaskMap()
    {
        Id(x => x.Id);
        Map(x => x.TaskName);
        Map(x => x.DueDate);
    }
}

And the Person is not much more complicated

public class PersonMap : ClassMap<Person>
{
    public PersonMap()
    {
        Id(x => x.Id);
        Map(x => x.Firstname);
        Map(x => x.Lastname);
        HasMany<Task>(x => x.Tasks)
            .LazyLoad()
            .Cascade.All();
    }
}

Let's just have a look at the mapping of the Tasks collection. I want the tasks of a person to be lazy loaded, that is, not fetched together with the person entity but only when they are first accessed (LazyLoad). And I want NHibernate to automatically save or update any tasks associated with a given person whenever the person is updated (Cascade.All).

Test the mapping

One quick test I always want to do is to check whether my mappings work as expected. Since I use Fluent NHibernate this is very easy and straightforward. A lot of infrastructure code is available for me to leverage, which simplifies my unit tests regarding the mapping of the domain entities.

The base test fixture

First I want to present my base class I use for all of my tests. Every test fixture used in this post will inherit from this base class.

public class FixtureBase<TModel> where TModel : PersistenceModel, new()
{
    protected SessionSource SessionSource { get; set; }
    protected ISession Session { get; private set; }
 
    [SetUp]
    public void SetupContext()
    {
        var cfg = new SQLiteConfiguration()
                        .InMemory()
                        .ShowSql();
        SessionSource = new SessionSource(cfg.ToProperties(), new TModel());
        Session = SessionSource.CreateSession();
        SessionSource.BuildSchema(Session);
        
        Context();
        Because();
 
        Session.Flush();
        Session.Clear();
    }
 
    [TearDown]
    public void TearDown()
    {
        TearDownContext();
 
        Session.Close();
        Session.Dispose();
    }
 
    protected virtual void Context()
    {
    }
 
    protected virtual void Because()
    {
    }
 
    protected virtual void TearDownContext()
    {
    }
}

First of all my base fixture has a generic parameter TModel. TModel represents the persistence model used during the tests. The persistence model must inherit from the PersistenceModel (provided by Fluent NHibernate) and it must have a default constructor.

In the SetupContext method (which is executed before each test)

  • I define the configuration I want to use. Most of the time I use the SQLite database in in-memory mode for my database related tests. Fluent NHibernate provides me with such a configuration object. For debugging purposes I declare that I want to see the SQL statements generated by NHibernate (ShowSql).
  • Second I create a session source and pass the above configuration as well as an instance of the model used to the constructor of the session source.
  • Then I create a new session object. Note that since I am using SQLite in in-memory mode I have to use the same session object during the whole test (including setup and tear down) since the database schema (and the data) is destroyed when the session is closed.
  • I now use the Session object created above to generate the database schema. The schema is generated from the information provided by the model (which consists of all mappings).
  • Now I call the virtual Context and Because methods. The reason is that I want my unit tests to be more BDD-like.
  • Finally I flush all pending operations from the session to the database and then clear the session.
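The order of the steps above can be sketched without NUnit or NHibernate. In the following sketch SpecBase is a hypothetical, framework-free stand-in for FixtureBase: instead of opening a session it only records which step ran, so the arrange (Context) and act (Because) hooks and their position in the setup sequence become visible. All names except Context, Because and SetupContext are assumptions made for the illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for FixtureBase: it records the call order instead of touching a database.
public abstract class SpecBase
{
    public readonly List<string> Log = new List<string>();

    // Plays the role of the [SetUp] method in the post.
    public void SetupContext()
    {
        Log.Add("create session and schema"); // stands in for the SessionSource calls
        Context();                            // arrange: subclasses prepare their data
        Because();                            // act: subclasses run the behaviour under test
        Log.Add("flush and clear");           // stands in for Session.Flush() and Session.Clear()
    }

    protected virtual void Context() { }
    protected virtual void Because() { }
}

// A concrete specification only overrides the hooks it needs.
public class when_adding_two_numbers : SpecBase
{
    private int a, b;
    public int Result { get; private set; }

    protected override void Context()
    {
        a = 2;
        b = 3;
        Log.Add("context");
    }

    protected override void Because()
    {
        Result = a + b;
        Log.Add("because");
    }
}

// Hypothetical demo class, not part of the original post.
public static class SpecDemo
{
    public static void Main()
    {
        var spec = new when_adding_two_numbers();
        spec.SetupContext();

        // Prints: create session and schema -> context -> because -> flush and clear
        Console.WriteLine(string.Join(" -> ", spec.Log));
        Console.WriteLine(spec.Result); // 5
    }
}
```

With this shape each test method only has to assert on the outcome, because the base class has already run the arrange and act steps.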

The persistence model

The persistence model I am using is very straightforward. I just include all class mappings that are in the same assembly as the PersonMap.

public class TestModel : PersistenceModel
{
    public TestModel()
    {
        addMappingsFromAssembly(typeof(PersonMap).Assembly);
    }
}

I could also add the mappings explicitly if I wanted to

public class TestModel : PersistenceModel
{
    public TestModel()
    {
        addMapping(new PersonMap());
        addMapping(new TaskMap());
    }
}

The test

Writing a test for the mapping is really straightforward with the aid of the PersistenceSpecification class provided by Fluent NHibernate.

[Test]
public void can_add_person_without_tasks()
{
    new PersistenceSpecification<Person>(Session)
        .CheckProperty(x => x.Firstname, "Gabriel")
        .CheckProperty(x => x.Lastname, "Schenker")
        .VerifyTheMappings();
}

Or a test which also tries to add tasks

[Test]
public void can_add_person_with_tasks()
{
    var tasks = new[]
                    {
                        new Task {TaskName = "Task 1", DueDate = DateTime.Today.AddDays(5)},
                        new Task {TaskName = "Task 2", DueDate = DateTime.Today.AddDays(6)},
                        new Task {TaskName = "Task 3", DueDate = DateTime.Today.AddDays(3)},
                    };
    new PersistenceSpecification<Person>(Session)
        .CheckProperty(x => x.Firstname, "Gabriel")
        .CheckProperty(x => x.Lastname, "Schenker")
        .CheckList(x=>x.Tasks, tasks)
        .VerifyTheMappings();
}

Do I need to tell you that the tests pass successfully? It's really that easy... I have not needed a single line of XML so far. And no "magic strings" are involved. For me it's really a joy to work like this.

And now, finally, we bring LINQ into play

The context

In LINQ the notion of a context is very important. It represents a kind of facade to the database. LINQ to NHibernate provides us with a base class from which we can derive when we define our own context. In our simplified model we have only two entities and thus our context needs just two members to give access via LINQ to either the persons or the tasks.

public class SampleContext : NHibernateContext
{
    public SampleContext(ISession session)
        : base(session)
    { }
 
    public IOrderedQueryable<Person> Persons
    {
        get { return Session.Linq<Person>(); }
    }
 
    public IOrderedQueryable<Task> Tasks
    {
        get { return Session.Linq<Task>(); }
    }
}

Testing

Now I want to implement a first test to check whether LINQ to NHibernate does indeed work as expected. First I have to set up a context where I have some person objects with tasks in the database

public class a_repository_with_persons_having_tasks : Person_Fixture
{
    protected Person[] persons;
    private IList<Task> tasks1, tasks2, tasks3;
 
    protected override void Context()
    {
        base.Context();
        tasks1 = new[]
                     {
                         new Task {TaskName = "Task 1.1", DueDate = DateTime.Today.AddDays(5)},
                         new Task {TaskName = "Task 1.2", DueDate = DateTime.Today.AddDays(6)},
                         new Task {TaskName = "Task 1.3", DueDate = DateTime.Today.AddDays(3)},
                     };
        tasks2 = new[]
                     {
                         new Task {TaskName = "Task 2.1", DueDate = DateTime.Today.AddDays(5)},
                         new Task {TaskName = "Task 2.2", DueDate = DateTime.Today.AddDays(6)},
                         new Task {TaskName = "Task 2.3", DueDate = DateTime.Today.AddDays(3)},
                     };
        tasks3 = new[]
                     {
                         new Task {TaskName = "Task 3.1", DueDate = DateTime.Today.AddDays(5)},
                         new Task {TaskName = "Task 3.2", DueDate = DateTime.Today.AddDays(6)},
                         new Task {TaskName = "Task 3.3", DueDate = DateTime.Today.AddDays(2)},
                     };
        persons = new[]
                      {
                          new Person {Firstname = "Gabriel", Lastname = "Schenker", Tasks = tasks1},
                          new Person {Firstname = "John", Lastname = "Doe", Tasks = tasks2},
                          new Person {Firstname = "Ann", Lastname = "Moe", Tasks = tasks3},
                      };
        foreach (var person in persons)
            Session.Save(person);
    }
}

Now I can write a test for the case where I want to retrieve all persons from the database

[TestFixture]
public class when_querying_all_persons : a_repository_with_persons_having_tasks
{
    private IEnumerable<Person> list;
 
    protected override void Because()
    {
        list = from p in db.Persons
               select p;
    }
 
    [Test]
    public void should_return_all_persons()
    {
        list.Count().ShouldEqual(persons.Length);
    }
}

The query generated by the above test is

SELECT  this_.Id as Id0_0_, 
        this_.Lastname as Lastname0_0_, 
        this_.Firstname as Firstname0_0_ 
FROM [Person] this_

That's exactly what we were expecting!

Using Where to get a filtered list

Now let's do some more interesting stuff. I want to have a filtered list of persons

[TestFixture]
public class when_retrieving_filtered_list_of_persons : a_repository_with_persons_having_tasks
{
    [Test]
    public void can_filter_by_LastName()
    {
        var list = db.Persons.Where(x=>x.Lastname=="Doe");
        list.Count().ShouldEqual(1);
    }
 
    [Test]
    public void can_filter_by_task()
    {
        var list = from p in db.Persons
                   from t in p.Tasks
                   where t.DueDate == DateTime.Today.AddDays(3)
                   select p;
        list.Count().ShouldEqual(2);
    }
}

In the first test I use the Where extension method defined by LINQ to filter my collection of persons. The extension methods defined by LINQ are also called query operators. The select statement generated by Linq2NH is

SELECT count(*) as y0_ 
FROM [Person] this_ 
WHERE this_.Lastname = @p0; 
@p0 = 'Doe'

In the second test I use the new language extension introduced for C# (also called a query expression) to get a filtered set of persons. This time we get the following select statement sent to the database

SELECT          count(*) as y0_ 
FROM            [Person] this_ 
left outer join [Task] t1_ on this_.Id=t1_.Person_id 
WHERE t1_.DueDate = @p0; 
@p0 = '29.11.2008 00:00:00'

Obviously, joins between tables also work as expected.
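The two query styles can also be tried without a database. The sketch below runs the same two filters with LINQ to Objects against in-memory copies of the sample data, so it does not involve the Linq2NH provider and generates no SQL; it only illustrates that the query operator form and the query expression form are two notations for the same kind of query. The LinqToObjectsDemo class and its helper methods are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Task
{
    public string TaskName { get; set; }
    public DateTime DueDate { get; set; }
}

public class Person
{
    public Person() { Tasks = new List<Task>(); }
    public string Firstname { get; set; }
    public string Lastname { get; set; }
    public IList<Task> Tasks { get; set; }
}

// Hypothetical demo class, not part of the original post.
public static class LinqToObjectsDemo
{
    // Builds tasks that are due the given number of days after 'today'.
    private static IList<Task> MakeTasks(string prefix, DateTime today, params int[] dueInDays)
    {
        return dueInDays
            .Select((days, i) => new Task { TaskName = prefix + "." + (i + 1), DueDate = today.AddDays(days) })
            .ToList();
    }

    // Same sample data as in the test context of the post, but held in memory.
    public static List<Person> BuildPersons(DateTime today)
    {
        return new List<Person>
        {
            new Person { Firstname = "Gabriel", Lastname = "Schenker", Tasks = MakeTasks("Task 1", today, 5, 6, 3) },
            new Person { Firstname = "John", Lastname = "Doe", Tasks = MakeTasks("Task 2", today, 5, 6, 3) },
            new Person { Firstname = "Ann", Lastname = "Moe", Tasks = MakeTasks("Task 3", today, 5, 6, 2) },
        };
    }

    // Query operator (extension method) form.
    public static int CountByLastname(IEnumerable<Person> persons, string lastname)
    {
        return persons.Where(x => x.Lastname == lastname).Count();
    }

    // Query expression form; the two 'from' clauses pair each person with each of its tasks.
    public static int CountByDueDate(IEnumerable<Person> persons, DateTime dueDate)
    {
        var list = from p in persons
                   from t in p.Tasks
                   where t.DueDate == dueDate
                   select p;
        return list.Count();
    }

    public static void Main()
    {
        var today = DateTime.Today;
        var persons = BuildPersons(today);

        Console.WriteLine(CountByLastname(persons, "Doe"));           // 1
        Console.WriteLine(CountByDueDate(persons, today.AddDays(3))); // 2
    }
}
```

The counts match the expectations in the two tests: one person is named Doe, and two persons have a task due in three days.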

Getting ordered lists

Often we need an ordered list of items, in this case persons ordered by their last name.

[Test]
public void can_order_by_LastName()
{
    var list = db.Persons.OrderBy(x => x.Lastname);
    list.First().Lastname.ShouldEqual("Doe");
}

The query generated is

SELECT  this_.Id as Id0_0_, 
        this_.Lastname as Lastname0_0_, 
        this_.Firstname as Firstname0_0_ 
FROM [Person] this_ 
ORDER BY this_.Lastname asc

Code

The code accompanying this post can be found here.

Summary

It is straightforward to use LINQ to NHibernate. Although the Linq2NH provider is not fully implemented it is more than sufficient for most scenarios we encounter in typical projects. And if we ever encounter a query that cannot be executed through the Linq2NH provider we can still implement the query by using HQL instead.

Enjoy

Blog Signature Gabriel

posted @ 11/26/2008 5:13 AM by NHibernate's Answers

Legacy DB and one-to-one relations

When dealing with a legacy database one often encounters the situation that the database schema defines one-to-one relations between two entities. A typical example might be the following schema fragment

erd

with a one-to-one relation between person and address, that is: each person can have an address and an address can only belong to a single person. To guarantee that a single person can only have zero or one address the foreign key column PersonId in the Address table is set to be unique.

Such a construct is not straightforward to map in NHibernate! I want to show you one possible solution. The solution works but is not free from certain hiccups! Let's first model the domain

The domain model

model

The code for the above model is straightforward. Nothing magic

public class Person
{
    public virtual int Id { get; set; }
    public virtual string FirstName { get; set; }
    public virtual string LastName { get; set; }
    public virtual Address Address { get; private set; }
 
    public virtual void AssignAddress(Address address)
    {
        Address = address;
        address.Owner = this;
    }
}
 
public class Address
{
    public virtual int Id { get; set; }
    public virtual string AddressLine1 { get; set; }
    public virtual string AddressLine2 { get; set; }
    public virtual string PostalCode { get; set; }
    public virtual string City { get; set; }
    public virtual Person Owner { get; set; }
}

Note that I have defined the setter of the Address property in the Person entity as private so that the consumer of the code is forced to use the AssignAddress method, which sets up the bi-directional relation.
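A trimmed, NHibernate-free sketch of this idea is shown below. The entities carry only one data property each and are not virtual, since no proxies are involved here; the null check in AssignAddress is an extra guard added for the illustration and is not part of the model above.

```csharp
using System;

public class Address
{
    public string City { get; set; }
    public Person Owner { get; set; }
}

public class Person
{
    public string LastName { get; set; }

    // Private setter: the only way to attach an address is AssignAddress.
    public Address Address { get; private set; }

    public void AssignAddress(Address address)
    {
        // Extra guard (assumption, not in the original model).
        if (address == null) throw new ArgumentNullException("address");

        // Both ends of the relation are set in one place, so they cannot drift apart.
        Address = address;
        address.Owner = this;
    }
}

// Hypothetical demo class, not part of the original post.
public static class OneToOneDemo
{
    public static void Main()
    {
        var person = new Person { LastName = "Schenker" };
        var address = new Address { City = "Zurich" };

        person.AssignAddress(address);

        Console.WriteLine(ReferenceEquals(person.Address, address)); // True
        Console.WriteLine(ReferenceEquals(address.Owner, person));   // True
    }
}
```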

The mapping

And now we try to find a working mapping. I use the Fluent NHibernate framework for the mapping (please refer to my previous posts for an introduction to this framework: part 1, part 2, part 3, part 4)

public class PersonMapper : ClassMap<Person>
{
    public PersonMapper()
    {
        LazyLoad();
 
        Id(x => x.Id);
        Map(x => x.FirstName);
        Map(x => x.LastName);
        HasOne(x => x.Address)
            .PropertyRef(p => p.Owner)
            .Cascade.All()
            .FetchType.Join();
    }
}
 
public class AddressMapper : ClassMap<Address>
{
    public AddressMapper()
    {
        LazyLoad();
 
        Id(x => x.Id);
        Map(x => x.AddressLine1);
        Map(x => x.AddressLine2);
        Map(x => x.PostalCode);
        Map(x => x.City);
        References(x => x.Owner)
            .WithUniqueConstraint()
            .TheColumnNameIs("PersonId")
            .LazyLoad()
            .Cascade.None();
    }
}

Unit Tests

What database schema does the above mapping produce? Let's implement a test. First I have to define my model (that is: where my mappings are to be found)

public class TestModel : PersistenceModel
{
    public TestModel()
    {
        addMappingsFromAssembly(typeof(Person).Assembly);
    }
}

I then use the following code to generate the schema

[TestFixture]
public class when_creating_the_schema 
{
    [SetUp]
    protected void Context()
    {
        var model = new TestModel();
        var config = new Configuration();
        config.Configure();
        model.Configure(config);
        var factory = config.BuildSessionFactory();
        var session = factory.OpenSession();
        new SchemaExport(config).Execute(true, false, false, false, session.Connection, null);
    }
 
    [Test]
    public void smoke_test()
    {
        true.ShouldBeTrue();
    }
}

When using an SQLite database (in-memory mode) the output generated by the smoke test is

[Image: test output showing the generated schema script]

Let me reformat the create table scripts a little bit

create table [Person] (
  Id  integer, 
  LastName TEXT, 
  FirstName TEXT, 
  primary key (Id))
  
create table [Address] (
  Id  integer, 
  AddressLine1 TEXT, 
  AddressLine2 TEXT, 
  PostalCode TEXT, 
  City TEXT, 
  PersonId INTEGER unique, 
  primary key (Id))

We can clearly see that the schema created from the model is indeed equal to the schema in the legacy database. Note especially that the PersonId foreign key in the Address table is declared unique. Now that is a good start...!

Let's see how the creation of a new person with an associated address works (if at all). I have implemented the following unit test

[TestFixture]
public class when_adding_a_new_person_with_an_address : Person_Fixture
{
    private Address address;
    private Person person;
 
    protected override void Context()
    {
        base.Context();
        address = new Address
                      {
                          AddressLine1 = "Some Street 1",
                          PostalCode = "8000",
                          City = "Zurich"
                      };
        person = new Person {FirstName = "Gabriel", LastName = "Schenker"};
        person.AssignAddress(address);
        Session.Save(person);
        Session.Flush();
        Session.Clear();
    }
 
    [Test]
    public void smoke_test()
    {
        true.ShouldBeTrue();
    }
}

In the context I create a new person entity and assign it a new address. Then I save the person, flush and clear the session. The smoke test just verifies that the context can be set up without throwing an exception. Let's run the test and analyze the output

[Image: test output showing the insert statements]

And indeed this works as expected. First a person record is created and then the associated address is created.

Now let's have a look at what happens when we query for this person. I add this test to the above test class

[Test]
public void should_add_person_to_database()
{
    var fromDb = Session.Get<Person>(person.Id);
    fromDb.ShouldNotBeNull();
    fromDb.ShouldNotBeTheSameAs(person);
    fromDb.LastName.ShouldEqual(person.LastName);
    fromDb.FirstName.ShouldEqual(person.FirstName);
}

In the first line I load the previously created person entity from the database. I then assert that the entity does indeed exist and has the expected properties. The following test output is generated

[Image: test output showing the select statements, with the relevant part outlined in red]

The interesting part is the one outlined by the red rectangle. When I load a person, two select statements are generated: one to load the person record and a second one to load its associated address.

Problems with this implementation

When loading a single person entity this is no problem but it starts to be a problem when I try to load a list of persons, as I do in the following test

[Test]
public void can_load_all_persons()
{
    var list = Session.CreateQuery("from Person").List<Person>();
}

The output generated is

[Image: test output for loading the list of persons]

In this test I have 3 persons (each having an address) in the database. The problem is that although only one select statement is generated for the Person table, there are 3 additional select statements for loading the corresponding addresses. We have a typical select n+1 problem: one query for the list plus one query per person, which gets worse with every person added to the list.

A possible solution to avoid the select n+1 problem

A possible solution to this problem is shown below

[Test]
public void can_load_all_persons_revisited()
{
    var list = Session.CreateQuery("select p.Id, p.LastName from Person p").List();
}

Here I explicitly select the fields I want to retrieve from the database. And indeed the result of the test is only one select statement

[Image: test output showing a single select statement]

We can use this method when we need a lookup list with all or some persons in the database. Note that such a query returns rows of scalar values and not Person entities.
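Because the projection query returns untyped rows (each row an object[] holding the selected columns in order), the caller usually converts them into something more convenient. The following is only a sketch of how that could look; the PersonLookupItem and PersonLookup types are hypothetical and not part of the accompanying code.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical DTO for a lookup list; not part of the original sample code.
public class PersonLookupItem
{
    public int Id { get; set; }
    public string LastName { get; set; }
}

public static class PersonLookup
{
    // Converts the untyped result of a two-column projection into DTOs.
    // Each element of 'rows' is expected to be an object[] { Id, LastName }.
    public static List<PersonLookupItem> FromRows(IList rows)
    {
        var result = new List<PersonLookupItem>(rows.Count);
        foreach (object[] row in rows)
        {
            result.Add(new PersonLookupItem
            {
                Id = Convert.ToInt32(row[0]),
                LastName = (string)row[1]   // stays null if the column is null
            });
        }
        return result;
    }
}
```

Inside the test this could be used as PersonLookup.FromRows(Session.CreateQuery("select p.Id, p.LastName from Person p").List()). Since no Person entity is materialized, no address is ever requested.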

Code

The code accompanying this post can be found here.

Summary

I have shown one possible way to map a one-to-one relation imposed by a pre-existing legacy database with NHibernate. For the mapping I have used the Fluent NHibernate framework. This implementation suffers from a select n+1 problem, but the problem can be avoided by using customized queries.

Enjoy.


posted @ 11/18/2008 10:17 PM by NHibernate's Answers

Lazy loading BLOBS and the like in NHibernate

[Updated, 2008-11-20 --> see end of post]

One of the questions asked again and again on the NHibernate user mailing list is whether NHibernate supports lazy loading of individual properties. The answer is NO - at least for the time being. Why is this a reasonable question? Well, often we have entities in our domain that contain fields with large amounts of data. Some examples are

  • a large binary object (BLOB, e.g. an image, a Word document, a PDF, etc.),
  • a large text object (CLOB, or nvarchar(max)),
  • a cluster of rarely used extra fields.

The problem is that we do not always need all this information when loading an entity. Thus we can massively improve the performance of our queries if those fields are only loaded on demand.

The Model

Let's have a look at the following simplified domain model.

[Image: class diagram of the Person and PersonPhoto entities]

Here the person entity has an associated photo. The photo has been extracted from the person entity into an entity of its own because NHibernate does not support lazy loading of specific properties (as mentioned above). Otherwise, each time we load a person entity we would also load its photo, which might be huge (e.g. several MB).

The code for the person entity is simple

public class Person
{
    public virtual Guid Id { get; private set; }
    public virtual string LastName { get; private set; }
    public virtual string FirstName { get; private set; }
    public virtual PersonPhoto Photo { get; private set; }
 
    // to satisfy NHibernate only!
    protected Person() { }
 
    public Person(string lastName, string firstName, PersonPhoto personPhoto)
    {
        LastName = lastName;
        FirstName = firstName;
        AssignPhoto(personPhoto);
    }
 
    public virtual void AssignPhoto(PersonPhoto photo)
    {
        Photo = photo;
        photo.Owner = this;
    }
}

Please note that I have defined a method to assign a photo to the person. This method takes care of the fact that the relation between the person and the photo entity is bi-directional (via the assignment photo.Owner = this). I have omitted any validation code for brevity.

Let's also have a look at the implementation of the PersonPhoto class which I kept even simpler

public class PersonPhoto
{
    public virtual Guid Id { get; set; }
    public virtual Person Owner { get; set; }
    public virtual byte[] Image { get; set; }
}

Let's now define the mapping for my simple model.

Mapping with XML

The question is now: how can I map this model to achieve the desired result, that is

  • whenever I load a person, its photo is not loaded automatically but only on request (lazy load),
  • the person photo is treated as a dependent entity of the person, that is the photo is created, updated and deleted together with its parent - the person.

In NHibernate it is not possible to define a (bi-directional) one-to-one relation between Person and PersonPhoto and at the same time have the PersonPhoto lazy loaded.

A possible solution is to declare the relation from Person to PersonPhoto as many-to-one and the reverse relation from PersonPhoto to Person as one-to-one.

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   assembly="Blobs"
                   namespace="Blobs">
  <class name="Person">
    <id name="Id">
      <generator class="guid"/>
    </id>
    <property name="LastName"/>
    <property name="FirstName"/>
    <many-to-one name="Photo" class="PersonPhoto" unique="true"
                 column="PersonPhotoId" cascade="all-delete-orphan"/>
  </class>
  
  <class name="PersonPhoto">
    <id name="Id">
      <generator class="guid"/>
    </id>
    <property name="Image" type="BinaryBlob"/>
    <one-to-one name="Owner" property-ref="Photo" constrained="true"/>
  </class>
</hibernate-mapping>

It is important to note that the many-to-one relation defined in the Person mapping is declared as unique to prevent a single photo from being assigned to more than one person. I also define that each insert, update and delete action applied to a person entity is cascaded to the associated PersonPhoto entity.

Unit testing the XML mapping

Let's now write some unit tests that verify that my requirements are indeed satisfied.

[Updated]

First I want to analyze what database schema is created. Remember: whenever I have the possibility to start a so-called green field project I always start by defining the domain and then let the database be auto-generated from the domain (that is: the database is only an implementation detail). Things might be different if you have to use a pre-existing database though...

So my unit test to verify that the schema can indeed be created is

[TestFixture]
public class when_creating_the_schema : Person_Fixture
{
    protected override void Context()
    {
        base.Context();
        new SchemaExport(Configuration).Execute(true, false, false, false, Session.Connection, null);
    }
 
    [Test]
    public void smoke_test()
    {
        true.ShouldBeTrue();
    }
}

In the last line of the Context method I use the SchemaExport class of NHibernate to generate the database schema from the model and the mapping information. The first (and only) test I write is a so-called smoke test, that is a dummy test whose only purpose is to verify that setting up the context doesn't throw an exception. And indeed the test succeeds and the following output is produced

[Image: test output showing the generated schema script]

The script is generated for the SQLite database I use. For a different type of database server the script would look slightly different. The two tables created are

create table Person (
  Id UNIQUEIDENTIFIER not null, 
  LastName TEXT, 
  FirstName TEXT, 
  PersonPhotoId UNIQUEIDENTIFIER unique, 
  primary key (Id)
)
 
create table PersonPhoto (
  Id UNIQUEIDENTIFIER not null, 
  Image BLOB, 
  primary key (Id)
)

Second I want to test whether I can create a new person object having a photo and save this aggregate to the database.

[TestFixture]
public class when_adding_a_new_person_with_a_photo : Person_Fixture
{
    private PersonPhoto photo;
    private Person person;
 
    protected override void Context()
    {
        base.Context();
        photo = new PersonPhoto {Image = Encoding.Default.GetBytes("This is a placeholder for a photo...")};
        person = new Person("Schenker", "Gabriel", photo);
        Session.Save(person);
 
        // clean up
        Session.Flush();
        Session.Clear();
    }
}

In the above code I have set up the context, that is I add a new person with a photo to the database (please have a look at the source code regarding the base class Person_Fixture which I use).

The first test I write is again a smoke test, which only verifies that setting up the context doesn't throw an exception.

// Smoke test
[Test]
public void should_execute_without_an_error()
{
    true.ShouldBeTrue();
}

And indeed, if you run this test it is green. This tells me that the method Context is running without causing an exception.

Remarks: All the ShouldXXX methods you'll see in the presented code fragments are extension methods I defined and use for better readability of the code. You can find them here.

The second test method checks whether a person record indeed exists in the database

[Test]
public void person_should_exist_in_the_database()
{
    var fromDb = Session.Get<Person>(person.Id);
    fromDb.ShouldNotBeNull();
    fromDb.ShouldNotBeTheSameAs(person);
    fromDb.LastName.ShouldEqual(person.LastName);
    fromDb.FirstName.ShouldEqual(person.FirstName);
}

In the above code I first check whether a non-null object is loaded from the database and then whether it is a different instance, that is, it has not simply been served from the first level cache of NHibernate (please see also this post) but has really been loaded from the database. Finally I check some of the properties for equality. As you might have expected: this test is green when run.

Now I want to also check whether the photo has really been stored in the database or not. The following test should confirm that

[Test]
public void person_photo_should_exist_in_the_database()
{
    var fromDb = Session.Get<Person>(person.Id);
 
    fromDb.Photo.ShouldNotBeNull();
    fromDb.Photo.ShouldNotBeTheSameAs(person.Photo);
    fromDb.Photo.Image.ShouldEqual(person.Photo.Image);
}

Well, green again when run - so no problem!

One important thing to check is that the very same photo cannot be assigned to two different person instances. Each photo is uniquely assigned to a certain person. The following test verifies that behavior.

[Test]
public void adding_another_person_with_same_photo_should_not_be_possible()
{
    var otherPerson = new Person("Doe", "John", photo);
    Session.Save(otherPerson);
    try
    {
        Session.Flush();
        Assert.Fail("Expected exception!");
    }
    catch(HibernateException)
    {
        Session.Clear();
    }
}

This code needs some further explanation. I expect NHibernate to throw an exception when I try to add a new person with the same photo as an already existing person. The exception is not thrown when calling the Save method of the session but only when the session is flushed (that is: when the insert command is executed on the database). I clear the session in the catch block since otherwise the exception would be raised again by my test tear-down method (in the base fixture class), which also flushes and disposes the session object.

The above test also passes and thus we are left with only one additional test. Does the person photo indeed lazy load? Let's write a test which verifies this

[TestFixture]
public class when_loading_an_existing_person_from_database : Person_Fixture
{
    private PersonPhoto photo;
    private Person person;
 
    protected override void Context()
    {
        base.Context();
        photo = new PersonPhoto { Image = Encoding.Default.GetBytes("This is a placeholder for a photo...") };
        person = new Person("Schenker", "Gabriel", photo);
        Session.Save(person);
 
        // clean up
        Session.Flush();
        Session.Clear();
    }
 
    [Test]
    public void Person_photo_should_be_lazy_loaded()
    {
        var fromDb = Session.Load<Person>(person.Id);
 
        NHibernateUtil.IsInitialized(fromDb.Photo).ShouldBeFalse();
 
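        // accessing a property of the photo forces the proxy to be loaded
        var image = fromDb.Photo.Image;
 
        // check the photo proxy itself; a byte[] would always count as initialized
        NHibernateUtil.IsInitialized(fromDb.Photo).ShouldBeTrue();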
    }
}

Again I first set up the context for the test, that is I add a person with a photo to the database. Then in the test method I load the previously saved person from the database and, with the aid of a utility class of NHibernate, check whether the Photo property of the person entity is uninitialized (i.e. the entity behind the property is not loaded). Then I access a property of the photo and finally use the utility class again to test whether the photo has now been (lazy) loaded.

Wow - the test passes! We have found a mapping which satisfies all our needs!

Mapping with Fluent NHibernate

As described in earlier posts (part 1, part 2, part 3 and part 4) we have the possibility to use Fluent NHibernate to map our entities. Mapping this way has many advantages (which I already discussed in the posts just mentioned). Let's have a look at the mapping needed for the Person entity

public class PersonMapper : ClassMap<Person>
{
    public PersonMapper()
    {
        LazyLoad();
        
        Id(x => x.Id);
        Map(x => x.FirstName);
        Map(x => x.LastName);
        References(x => x.Photo)
            .FetchType.Select()
            .Cascade.All()
            .TheColumnNameIs("PersonPhotoId")
            .WithUniqueConstraint();
    }
}

I define a mapper class which inherits from the generic ClassMap<T> base class provided by the Fluent NHibernate framework. In the constructor of this class I define the mapping. The first line defines that I want my person entity to be lazy loaded (by default in Fluent NHibernate all entities are NOT lazy loaded).

The many-to-one relation between person and person photo is mapped with the aid of the References method.

The PersonPhoto entity is mapped as follows

public class PersonPhotoMapper : ClassMap<PersonPhoto>
{
    public PersonPhotoMapper()
    {
        LazyLoad();
        Id(x => x.Id);
        Map(x => x.Image);
        HasOne(x => x.Owner)
            .PropertyRef(x=>x.Photo)
            .Constrained();
    }
}

Here the part that interests us most (the reverse relation from photo to person) is mapped with the aid of the HasOne method.

Unit testing the Fluent NHibernate mapping

There is not much to say about this. The unit tests are nearly the same, with only one difference: I use a different base class from which I derive all my test classes. The definition of the base class used can be found here.

Code

You can find the code accompanying this post here.

Summary

I have shown you how to structure your domain model and map your entities so that "extra" information of a given entity is lazy loaded. I have explained how to map the domain by using standard XML mapping files as well as by using the Fluent NHibernate framework. By applying this technique you can massively improve the performance of queries and reduce the bandwidth needed to transfer data from the database to the consuming client.

[Update] Uni-directional link between Person and PersonPhoto

In a comment I was asked why I implemented the relation between the person and the person photo entity as bi-directional and whether it would not be possible to implement only a uni-directional relation between person and person photo.

The answer is: there is NO special reason for having a bi-directional relation (Eric Evans in his DDD book even suggests keeping relations uni-directional whenever possible). And yes, it is possible to implement the sample with a uni-directional relation. I'll show the details below (this time I only show the mapping in Fluent NHibernate, but the XML mapping is straightforward).

Here is the model with only a uni-directional relation

[Image: class diagram of the uni-directional model]

and the code

public class Person
{
    public virtual Guid Id { get; private set; }
    public virtual string LastName { get; private set; }
    public virtual string FirstName { get; private set; }
    public virtual PersonPhoto Photo { get; private set; }
 
    // to satisfy NHibernate only!
    public Person() { }
 
    public Person(string lastName, string firstName, PersonPhoto personPhoto)
    {
        LastName = lastName;
        FirstName = firstName;
        AssignPhoto(personPhoto);
    }
 
    public virtual void AssignPhoto(PersonPhoto photo)
    {
        Photo = photo;
    }
}
 
public class PersonPhoto
{
    public virtual Guid Id { get; set; }
    public virtual byte[] Image { get; set; }
}

Now let me show the mapping for this model (Fluent NHibernate)

public class PersonMapper : ClassMap<Person>
{
    public PersonMapper()
    {
        LazyLoad();
        
        Id(x => x.Id);
        Map(x => x.FirstName);
        Map(x => x.LastName);
        References(x => x.Photo)
            .FetchType.Select()
            .Cascade.All()
            .TheColumnNameIs("PersonPhotoId")
            .WithUniqueConstraint();
    }
}
 
public class PersonPhotoMapper : ClassMap<PersonPhoto>
{
    public PersonPhotoMapper()
    {
        LazyLoad();
 
        Id(x => x.Id);
        Map(x => x.Image);
    }
}

If I generate the schema from this model I get the very same create table scripts as in the bi-directional sample. And all the other unit tests I have shown above run successfully.

The code has been updated to contain both samples: the uni-directional and the bi-directional relation.

Enjoy


posted @ 11/17/2008 12:23 AM by NHibernate's Answers

First and Second Level caching in NHibernate

I'll try to dive deep into the caching of NHibernate in this article. This post has been inspired by the talk given by Oren Eini (aka Ayende) at the Kaizen Conference in Austin TX.

Caching is a topic that is IMHO only superficially described so far, especially regarding the second level cache. Most of the time one finds a lot of information about how to configure a specific cache provider, but the actual usage (how and when) is not really described. I hope to be able to provide some of the missing pieces with this post.

The full source code used for this post can be found here.

First Level Cache

When using NHibernate the first level cache is automatically enabled as long as one uses the standard session object. We can avoid using a cache altogether by using the stateless session provided by NHibernate. The stateless session is especially useful for reporting scenarios or for batch processing. When NHibernate loads an entity by its unique id from the database, the entity is automatically put into the so-called identity map. This identity map represents the first level cache.
The lifetime of the first level cache is coupled to the current session. As soon as the current session is closed, the content of the respective first level cache is cleared. Once an entity is in the first level cache, a subsequent operation that wants to load the very same entity inside the current session retrieves this entity from the cache and no roundtrip to the database is needed.
One of the main reasons behind this identity map is to avoid the situation where two different instances in memory represent the same database record (or entity). The NHibernate session object provides us with two ways to retrieve an entity by its unique id from the database. There are subtle but important differences between them.
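The identity map idea described above can be illustrated with a few lines of plain C#. This is only a toy model and not how NHibernate is implemented; the Item and ToySession types as well as the loadFromDatabase delegate are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity used only for this illustration.
public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Toy session: owns exactly one identity map, which dies with the session.
public class ToySession
{
    private readonly Dictionary<int, Item> identityMap = new Dictionary<int, Item>();
    private readonly Func<int, Item> loadFromDatabase;

    // Counts the simulated roundtrips to the database.
    public int DatabaseHits { get; private set; }

    public ToySession(Func<int, Item> loadFromDatabase)
    {
        this.loadFromDatabase = loadFromDatabase;
    }

    public Item Get(int id)
    {
        Item cached;
        if (identityMap.TryGetValue(id, out cached))
            return cached;               // served from the identity map

        DatabaseHits++;                  // simulated roundtrip
        var loaded = loadFromDatabase(id);
        if (loaded != null)
            identityMap[id] = loaded;    // remember the instance for this session
        return loaded;
    }
}
```

Asking the same ToySession twice for the same id returns the very same instance and causes only one simulated roundtrip, while a second ToySession starts with an empty map and creates its own instance.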

Let's implement an Account class for our samples

public class Account
{
    public virtual int Id { get; private set; }
    public virtual string Name { get; private set; }
    public virtual decimal Balance { get; private set; }
 
    protected Account()
    {
    }
 
    public Account(string name, decimal balance)
    {
        Name = name;
        Balance = balance;
    }
 
    public virtual void Credit(decimal amount)
    {
        Balance += amount;
    }
}

The corresponding XML mapping file is

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   namespace="Caching"
                   assembly="Caching">
  <class name="Account">
    <id name="Id">
      <generator class="hilo"/>
    </id>
    <property name="Name"/>
    <property name="Balance"/>
  </class>
</hibernate-mapping>

Get an entity from database

With the session.Get(id) method we can retrieve an entity from the database. If no record with the given id is found in the database then null is returned.
On the other hand, if a record with the given unique id exists in the database then NHibernate loads this record, instantiates a fully populated entity in memory and immediately puts this entity into the identity map (or first level cache).
Assuming that a specific record has already been loaded from the database inside the current session, a subsequent Get(id) operation will return the cached entity to the caller. We can see this in the following output produced by the unit test. There is only ONE select statement produced by NHibernate

[Image: test output showing a single select statement for two Get calls]

The above output was produced by this unit test

[Test]
public void trying_to_get_the_same_account_a_second_time_should_get_the_account_from_1st_level_cache()
{
    Console.WriteLine("------ now getting entity for the first time");
    var acc1 = Session.Get<Account>(account.Id);
    Console.WriteLine("------ now getting entity for the second time");
    var acc2 = Session.Get<Account>(account.Id);
 
    acc1.ShouldBeTheSameAs(acc2);
}

Load an entity from database

When using the session.Load(id) method NHibernate only instantiates a proxy for the given entity. As long as we only access the id of the entity, the entity itself is not loaded from the database. Only when we access one of the other properties of the entity does NHibernate load it from the database. We can see this clearly in the output produced by the following unit test. I have added some comments to the output to make it easier to verify the result.

Here is the unit test

[Test]
public void loading_an_existing_entity_should_defer_the_select_until_a_property_is_accessed()
{
    var acc1 = Session.Load<Account>(account.Id);
    acc1.ShouldNotBeNull();
    Console.WriteLine("------ now accessing the id of the entity");
    Console.WriteLine("The id is equal to {0}", acc1.Id);
    acc1.Id.ShouldEqual(account.Id);
    Console.WriteLine("------ now accessing a property (other than the ID) of the entity");
    Console.WriteLine("The name of the account is: {0}", acc1.Name);
    acc1.Name.ShouldEqual(account.Name);
}

and the output produced by the above code

[Image: test output showing that the select statement is only issued when the Name property is accessed]

Using Load to optimize data access

The behavior mentioned above is especially useful when creating or updating complex entities which have relations to other entities. Assume that the account entity references a customer entity and I want to create a new account for a customer of whom I only know the unique id. Then my code might look as follows

var newAccount = new Account("EUR Account 1", 1250m);
newAccount.Customer = Session.Load<Customer>(customerId);
Session.Save(newAccount);

Note that I use the Load method to get the (existing) customer entity. NHibernate will not physically load the customer, since the account table in the database only needs the id of the customer (as a foreign key). And as I mentioned above, as long as you only use the id of an entity retrieved with the Load method, the corresponding entity is not physically loaded from the database.

Second level cache

The lifetime of the second level cache is tied to the session factory and not to an individual session. Once an entity is loaded by its unique id and the second level cache is active, the entity is available to all other sessions (of the same session factory). Thus once the entity is in the second level cache NHibernate won't load the entity from the database again until it is removed from the cache.
To enable the second level cache we have to adjust our configuration file. We have to define which cache provider we want to use. There exist various implementations of a second level cache. For our sample we use a Hashtable based cache which is included in the core NHibernate assembly. Please note that you should never use this cache provider for production code but only for testing. Please refer to the section "Second Level Cache Providers" below to decide which implementation best fits your needs. You won't have to change your code if you change the cache provider though.

We have to add the following line to the configuration file

    <property name="cache.provider_class">NHibernate.Cache.HashtableCacheProvider</property>

This will instruct NHibernate to use the previously mentioned Hashtable based cache provider as the provider for the second level cache.
Now let's have a look at the following unit test.

[Test]
public void trying_to_load_an_existing_item_twice_in_different_sessions_should_use_2nd_level_cache()
{
    using(var session = SessionFactory.OpenSession())
    {
        var acc = session.Get<Account>(account.Id);
        acc.ShouldNotBeNull();
    }
 
    using(var session = SessionFactory.OpenSession())
    {
        var acc = session.Get<Account>(account.Id);
        acc.ShouldNotBeNull();
    }
}

In the above test we open a first session and load an existing entity from the database. Then we open a second session and try to load the very same entity from the database again. Without a second level cache we would expect NHibernate to load the entity twice from the database, since we are using 2 different sessions and thus the first level cache cannot be used to avoid a roundtrip to the database. So let's have a look at the result produced.

[Image: test output showing two select statements]

Wait a moment - we clearly see two select statements instead of only one. What did we do wrong? This is not an error; it's a feature. NHibernate does not enable the second level cache by default, since that would have too many undesired implications. One has to explicitly enable the second level cache. If we add the following statement to the configuration file

    <property name="cache.use_second_level_cache">true</property>

we activate the second level cache. But that is still not enough. We also have to enable our entity to be cached in the second level cache.
This can be done by adding the following element to the entity's mapping file, as the first child of the class element (before the id element)

    <cache usage="read-write"/>

If we now run the unit test again we obtain the expected result. The entity is loaded only once from the database. The second time it is loaded from the second level cache.
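For completeness, the two configuration properties can presumably also be set from code instead of the configuration file. The following is a minimal sketch under that assumption; the CachingSetup class and its method are hypothetical names, and the cache element in the entity's mapping file is still required.

```csharp
using NHibernate;
using NHibernate.Cfg;

// Hypothetical helper; not part of the accompanying code.
public static class CachingSetup
{
    public static ISessionFactory BuildSessionFactory()
    {
        var config = new Configuration();
        config.Configure(); // reads the settings from the configuration file as before

        // The same two settings as shown for the configuration file, set from code.
        config.SetProperty("cache.provider_class",
            typeof(NHibernate.Cache.HashtableCacheProvider).AssemblyQualifiedName);
        config.SetProperty("cache.use_second_level_cache", "true");

        return config.BuildSessionFactory();
    }
}
```

Setting the properties in code can be handy in test fixtures where the cache should only be active for some tests; for everything else the configuration file remains the more visible place.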
Now if we try to update an entity which is already in the second level cache then this entity should also be automatically updated in the second level cache. The following unit test should prove this behavior.

[Test]
public void when_updating_the_entity_then_2nd_level_cache_should_also_be_updated()
{
    using(var session = SessionFactory.OpenSession())
    using (var tx = session.BeginTransaction())
    {
        var acc = session.Get<Account>(account.Id);
        acc.Credit(200m);
        tx.Commit();
    }
 
    using(var session = SessionFactory.OpenSession())
    {
        var acc = session.Get<Account>(account.Id);
        acc.Balance.ShouldEqual(1200m);
    }
}

And indeed it does, as we can see in the test output. Again the entity is loaded from the cache the second time it is requested although it was updated (there is no select statement after the update statement). The last line in the test code verifies that the entity was indeed updated.

[Image: test output showing the update statement without a subsequent select statement]
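The behavior observed in the two tests above can be modelled with a toy example in plain C#. This is a deliberately simplified sketch and not NHibernate's implementation: all type names are made up, there is no expiration, locking or transaction handling, and the cache is simply written through on update. It illustrates that the shared cache belongs to the factory, that each session still creates its own entity instance, and that an update refreshes the shared cache entry.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity used only for this illustration.
public class CachedItem
{
    public int Id { get; set; }
    public decimal Balance { get; set; }
}

// Toy session factory: owns the shared (second level) cache.
public class MiniSessionFactory
{
    // The cache stores entity state (here just the balance), not instances.
    public readonly Dictionary<int, decimal> SecondLevelCache = new Dictionary<int, decimal>();
    // Stand-in for the database table.
    public readonly Dictionary<int, decimal> Database = new Dictionary<int, decimal>();

    public int DatabaseReads { get; internal set; }

    public MiniSession OpenSession() { return new MiniSession(this); }
}

// Toy session: has its own first level cache and falls back to the factory.
public class MiniSession
{
    private readonly MiniSessionFactory factory;
    private readonly Dictionary<int, CachedItem> firstLevelCache = new Dictionary<int, CachedItem>();

    internal MiniSession(MiniSessionFactory factory) { this.factory = factory; }

    public CachedItem Get(int id)
    {
        CachedItem entity;
        if (firstLevelCache.TryGetValue(id, out entity)) return entity;

        decimal balance;
        if (!factory.SecondLevelCache.TryGetValue(id, out balance))
        {
            if (!factory.Database.TryGetValue(id, out balance)) return null;
            factory.DatabaseReads++;                 // simulated select
            factory.SecondLevelCache[id] = balance;  // share the state with other sessions
        }

        entity = new CachedItem { Id = id, Balance = balance };
        firstLevelCache[id] = entity;
        return entity;
    }

    // Writes to the "database" and refreshes the shared cache entry.
    public void Update(CachedItem item)
    {
        factory.Database[item.Id] = item.Balance;
        factory.SecondLevelCache[item.Id] = item.Balance;
    }
}
```

With one row in the Database dictionary, two sessions opened from the same factory cause only one simulated select, and a session opened after an Update sees the new balance without another select.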

Second Level Cache Providers

All second level cache providers are part of the NHibernate Contrib project. The following list gives a short description of each provider.

  • Velocity: uses Microsoft Velocity which is a highly scalable in-memory application cache for all kinds of data.
  • Prevalence: uses Bamboo.Prevalence as the cache provider. Bamboo.Prevalence is a .NET implementation of the object prevalence concept brought to life by Klaus Wuestefeld in Prevayler. Bamboo.Prevalence provides transparent object persistence to deterministic systems targeting the CLR. It offers persistent caching for smart client applications.
  • SysCache: uses System.Web.Caching.Cache as the cache provider. This means that you can rely on the ASP.NET caching feature to understand how it works.
  • SysCache2: similar to NHibernate.Caches.SysCache, uses the ASP.NET cache. This provider also supports SQL dependency-based expiration, meaning that it is possible to configure certain cache regions to automatically expire when the relevant data in the database changes.
  • MemCache: uses memcached; memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load. Basically a distributed hash table.
  • SharedCache: high-performance, distributed and replicated memory object caching system. See here and here for more info

Saving a transient entity

Lately the following question came up in the NHibernate user list: "When I create a transient object and then save it to my session and commit to my database, should it be added to my second level cache as well?" The answer is of course YES. Let's write a unit test

[TestFixture]
public class when_saving_a_transient_account : FixtureBase
{
    private Account newAccount;
 
    protected override void Context()
    {
        base.Context();
        using (var session = SessionFactory.OpenSession())
        using (var tx = session.BeginTransaction())
        {
            newAccount = new Account("CHF Account", 5500m);
            session.Save(newAccount);
            tx.Commit();
        }
    }
 
    [Test]
    public void account_should_be_in_second_level_cache()
    {
        using (var session = SessionFactory.OpenSession())
        {
            Console.WriteLine("--> Now loading account");
            var acc = session.Get<Account>(newAccount.Id);
            acc.ShouldNotBeNull();
            acc.Name.ShouldEqual(newAccount.Name);
        }
    }
}

In the method Context I create and save a new account entity. In the test method I open a new session and try to load the previously created entity. I expect NHibernate to take the entity out of the second level cache instead of going to the database. And indeed it does. This is the resulting output

SecondLevelCache3

As we can see, no select statement is sent to the database when the entity is loaded from a new session after it has been created and saved beforehand.

Inside the second level cache

An important point is that the second-level cache does not cache instances of the object type being cached; instead it caches the individual values of the properties of that object. This provides two benefits. One, NHibernate doesn't have to worry that your client code will manipulate the objects in a way that will disrupt the cache. Two, the relationships and associations do not become stale, and are easy to keep up-to-date because they are simply identifiers. The cache is not a tree of objects but rather a map of arrays.
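
To make the "map of arrays" idea more tangible, here is a minimal sketch. It is not NHibernate's implementation; the classes CachedAccount and DisassembledCache are invented for this illustration and only mimic the principle that property values, not instances, are stored.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a mapped entity (not the Account class from the tests).
public class CachedAccount
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Balance { get; set; }
}

// Toy model of an entity cache that stores property values, never instances.
public class DisassembledCache
{
    private readonly Dictionary<int, object[]> entries = new Dictionary<int, object[]>();

    // "Disassemble": copy the property values into a plain array.
    public void Put(CachedAccount account)
    {
        entries[account.Id] = new object[] { account.Name, account.Balance };
    }

    // "Assemble": build a fresh instance from the stored values on every read.
    public CachedAccount Get(int id)
    {
        object[] state;
        if (!entries.TryGetValue(id, out state)) return null;
        return new CachedAccount { Id = id, Name = (string)state[0], Balance = (decimal)state[1] };
    }
}
```

Because Get builds a new object on every call, client code that changes the Balance of a returned instance leaves the stored array untouched, and a later Get still returns the cached values.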

If you are interested in some more details about the inner workings of the second level cache then the following text (taken from Ayende and only slightly edited by me) will be of interest to you:

NHibernate is designed as an enterprise OR/M product, and as such, it has very good support for running in web farm scenarios. This support includes running alongside distributed caches, including immediate farm wide updates.  NHibernate goes to great lengths to ensure cache consistency in these scenarios...

The way it works, NHibernate keeps three caches.

  • The entities cache - the entity data is disassembled and then put in the cache, ready to be assembled to entities again.
  • The queries cache - the identifiers of entities returned from queries, but not the data itself (since this is in the entities cache).
  • The update timestamp cache - the last time a table was written to.

The last cache is very important, since it ensures that the cache will not serve stale results.

Now, when we come to actually using the cache, we have the following semantics.

  • Each session is associated with a timestamp on creation.
  • Every time we put query results in the cache, the timestamp of the executing session is recorded.
  • The timestamp cache is updated whenever a table is written to, but in a tricky sort of way:
    • When we perform the actual writing, we write a value that is somewhere in the future to the cache. So all queries that hit the cache now will not find it, and then hit the DB to get the new data. Since we are in the middle of a transaction, they would wait until we finish the transaction. If we are using a low isolation level, and another thread / machine attempts to put the old results back in the cache, it wouldn't hold, because the update timestamp is in the future.
    • When we perform the commit on the transaction, we update the timestamp cache with the current value.

Now, let us think about the meaning of this, shall we?

If a session has performed an update to a table, committed the transaction and then executed a cached query, the result is not valid for the cache. That is because the timestamp written to the update cache is the transaction commit timestamp, while the query timestamp is the session's timestamp, which obviously comes earlier.

The update timestamp cache is not updated until you commit the transaction! This is to ensure that you will not read "uncommitted values" from the cache.

Please note that if you open a session with your own connection, it will not be able to put anything in the cache (all its cached queries will have an invalid timestamp!)

In general, those are not things that you need to concern yourself with, but I spent some time today just trying to get tests for the second level caching working, and it took me time to realize that in the tests I didn't use transactions and I used the same session for querying as for performing the updates.
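
The timestamp rules described above can be modeled in a few lines. This is only a sketch under simplifying assumptions: timestamps are plain counters, and the class UpdateTimestampModel, its method names and the FutureMargin constant are invented for illustration rather than taken from NHibernate.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the update timestamp cache; timestamps are plain counters here.
public class UpdateTimestampModel
{
    private readonly Dictionary<string, long> lastUpdate = new Dictionary<string, long>();

    // Hypothetical margin that pushes the value "somewhere in the future".
    public const long FutureMargin = 1000;

    // Called when the write is performed, before the transaction commits.
    public void PreInvalidate(string table, long now)
    {
        lastUpdate[table] = now + FutureMargin;
    }

    // Called when the transaction commits.
    public void Invalidate(string table, long now)
    {
        lastUpdate[table] = now;
    }

    // A cached query result is usable only if it is newer than the last write.
    public bool IsUpToDate(string table, long queryTimestamp)
    {
        long updated;
        if (!lastUpdate.TryGetValue(table, out updated)) return true;
        return updated < queryTimestamp;
    }
}
```

Take a session created at time 10 that writes to the Blog table at time 20 and commits at time 30. During the transaction the stored value is 1020, so nothing stamped earlier counts as up to date. After the commit the stored value is 30, so a result stamped with the session's own timestamp (10) is still rejected, while a session opened at time 40 can use the cache.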

Collections and the second level cache

Let's assume the following scenario: we have a blog which can have many posts. See the diagram below.
Blog diagram
The corresponding code to define the entities is as follows

public class Blog
{
    public virtual int Id { get; set; }
    public virtual string Author { get; set; }
    public virtual string Name { get; set; }
    public virtual IList<Post> Posts { get; set; }
 
    public Blog()
    {
        Posts = new List<Post>();
    }
 
}
 
public class Post
{
    public virtual int Id { get; private set; }
    public virtual string Title { get; set; }
    public virtual string Body { get; set; }
}

and we can define the mapping of the blog and the post entity like this

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2"
                   namespace="Caching"
                   assembly="Caching">
  <class name="Blog">
    <cache usage="read-write"/>
    <id name="Id">
      <generator class="hilo"/>
    </id>
    <property name="Author"/>
    <property name="Name"/>
    <bag name="Posts" cascade="all" lazy="true">
      <cache usage="read-write"/>
      <key column="BlogId"/>
      <one-to-many class="Post"/>
    </bag>
  </class>
  
  <class name="Post">
    <cache usage="read-write"/>
    <id name="Id">
      <generator class="hilo"/>
    </id>
    <property name="Title"/>
    <property name="Body"/>
  </class>
</hibernate-mapping>

Note that I have added a <cache> element to the mapping for the Blog entity. This is enough to cache all simple Blog property values (e.g. Id, Name and Author) but not the state of associated entities or collections. Collections require their own <cache> element. In our case I added a <cache> element to the Posts collection (which is mapped as a bag). This cache will be used when enumerating the collection blog.Posts, for example. Please be aware that a collection cache only holds the identifiers of the associated post instances. That is, if we have a blog with three posts having the ids 1, 2 and 3 respectively, then the second level cache will contain the values of the simple properties of the blog and in addition an array with the ids {1,2,3}. If we require the post instances themselves to be cached, then we must enable caching of the Post class by adding a <cache> element to its mapping.

Let me summarize: by adding a <cache> element to the Blog, the Posts collection and the Post itself I have declared that I want NHibernate to cache not only my blog entities but also the associated Post collection in full detail.
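
The following sketch shows why the Post class needs its own <cache> element. It is a toy model, not NHibernate code: the class CollectionCacheModel and its members are made up for illustration, and LoadPostFromDatabase merely counts what would be a select statement.

```csharp
using System;
using System.Collections.Generic;

// Toy model: the collection cache maps a blog id to post ids only;
// post data lives in a separate entity cache, if Post is cached at all.
public class CollectionCacheModel
{
    public readonly Dictionary<int, int[]> PostIdsByBlog = new Dictionary<int, int[]>();
    public readonly Dictionary<int, object[]> PostState = new Dictionary<int, object[]>();
    public int DatabaseHits { get; private set; }

    // Hypothetical stand-in for a SELECT against the Post table.
    private object[] LoadPostFromDatabase(int postId)
    {
        DatabaseHits++;
        return new object[] { "Title " + postId, "Body " + postId };
    }

    // Resolve every id of the cached collection to the post's property values.
    public List<object[]> GetPosts(int blogId)
    {
        var result = new List<object[]>();
        foreach (int postId in PostIdsByBlog[blogId])
        {
            object[] state;
            if (!PostState.TryGetValue(postId, out state))
                state = LoadPostFromDatabase(postId);
            result.Add(state);
        }
        return result;
    }
}
```

If PostIdsByBlog holds {1,2,3} for a blog but PostState is empty, every post still costs one database hit even though the collection itself came from the cache. Filling PostState corresponds to adding the <cache> element to the Post class.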

Attention: dragons ahead

A common error (it happened to me as well!) is to omit the transaction, or to forget to commit it, when adding an entity/aggregate to the database or changing it. If we then access the entity/aggregate from another session, the 2nd level cache will not be able to provide us with the cached instances and NHibernate makes an (unexpected) round trip to the database. The reason for this is described in the section "Inside the second level cache" above.

What does this mean? Let's have a look at the code I use to set up the context for our unit tests regarding the Blog-->Posts problem.

blog = new Blog{ Author = "Gabriel", Name = "Keep on running"};
blog.Posts.Add(new Post{Title = "First post", Body = "Some text"});
blog.Posts.Add(new Post { Title = "Second post", Body = "Some other text" });
blog.Posts.Add(new Post { Title = "Third post", Body = "Third post text" });
using (var session = SessionFactory.OpenSession())
using(var tx = session.BeginTransaction())
{
    session.Save(blog);
    tx.Commit();        // important otherwise caching does NOT work!
}

In the above code I create a new blog having three assigned posts. The blog instance is then saved to the database inside a transaction. If I omitted the transaction or forgot to commit it, the above samples would not work as expected and the 2nd level cache would not be used as desired.

Caching queries in the second level cache

We can cache not only entities loaded by their respective unique id but also any query. For this we have to define the query as cacheable and set the desired cache mode. Let's have a look at a typical sample

[Test]
public void trying_to_cache_a_query()
{
    using (var session = SessionFactory.OpenSession())
    {
        Console.WriteLine("---> using query first time");
        var query = session
            .CreateQuery("from Blog b where b.Author = :author")
            .SetString("author", "Gabriel")
            .SetCacheable(true);
        var list = query.List<Blog>();
    }
    using (var session = SessionFactory.OpenSession())
    {
        Console.WriteLine("---> using query second time");
        var query2 = session
            .CreateQuery("from Blog b where b.Author = :author")
            .SetString("author", "Gabriel")
            .SetCacheable(true);
        var list2 = query2.List<Blog>();
    }
}

In the above sample I use the same query from different sessions. Please note that I have set cacheable to true for the query. In this case the query result is cached in the second level cache the first time the query is executed. Any subsequent call using the very same query will not hit the database. It is important to note, however, that if I change the value of the parameter(s) then the query is executed against the database again. So it is the query together with the set of parameter values that defines the key under which the result is stored in the 2nd level cache.
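
A small sketch may help to picture this key. It assumes that the query cache has been switched on in the configuration (the cache.use_query_cache property). The class QueryCacheModel, its MakeKey method and the runQuery callback are invented for illustration; NHibernate builds its real key differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model of a query cache keyed by query text plus parameter values.
public class QueryCacheModel
{
    // The cached value is the list of entity ids, not the entities themselves.
    private readonly Dictionary<string, List<int>> results = new Dictionary<string, List<int>>();
    public int DatabaseHits { get; private set; }

    // Build one key from the HQL string and the sorted name=value pairs.
    public static string MakeKey(string hql, IDictionary<string, object> parameters)
    {
        var pairs = parameters.OrderBy(p => p.Key, StringComparer.Ordinal)
                              .Select(p => p.Key + "=" + p.Value);
        return hql + "|" + string.Join(";", pairs);
    }

    // runQuery is a hypothetical callback standing in for the database round trip.
    public List<int> List(string hql, IDictionary<string, object> parameters, Func<List<int>> runQuery)
    {
        string key = MakeKey(hql, parameters);
        List<int> ids;
        if (!results.TryGetValue(key, out ids))
        {
            DatabaseHits++;
            ids = runQuery();
            results[key] = ids;
        }
        return ids;
    }
}
```

Calling List twice with the same HQL text and the author "Gabriel" runs the callback only once. A call with a different author produces a different key and therefore a second database hit.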

The result of the above test looks as follows

 CachQuery1 

Of course I can also use named queries and cache them. A named query is defined inside a mapping file, e.g.

<query cacheable="true" cache-mode="normal" name="query1">
  <![CDATA[from Blog b where b.Name like :name]]>
  <query-param name="name" type="String"/>
</query>

The above query is called "query1" and has a single parameter called "name". The cache mode for this query is set to "normal". I can use such a query as follows

[Test]
public void trying_named_query()
{
    using (var session = SessionFactory.OpenSession())
    {
        Console.WriteLine("---> using named query first time");
        var list = session.GetNamedQuery("query1")
            .SetString("name", "Keep%")
            .List<Blog>();
    }
    using (var session = SessionFactory.OpenSession())
    {
        Console.WriteLine("---> using named query second time");
        var list2 = session.GetNamedQuery("query1")
            .SetString("name", "Keep%")
            .List<Blog>();
    }
}

The session object has a method GetNamedQuery to retrieve the query. The output produced by the above test is then

CachQuery2
If the content of a table on which the cached query is based changes, the cached result is invalidated, and the next time the query is executed it has to be run against the database again.

Cache Regions

If we don't use cache regions, cached queries can only be cleared as a whole. If you need to clear only part of them, use regions. Regions are distinguished by their name. One can put any number of different queries into a named cache region, for example by calling SetCacheRegion on the query. The command to clear a cache region is as follows
    SessionFactory.EvictQueries("My Region");
where SessionFactory is the session factory instance currently used and "My Region" is the name of the cache region.

Source Code

The full source code used for this post can be found here.

Summary

NHibernate provides two types of caches: the first level cache and the second level cache. The first level cache is also called the identity map. It is used not only to reduce the number of round trips to the database and thus improve the speed of an application, but also to guarantee that there are never two distinct instances of an object with the very same id. NHibernate provides two methods to load an entity by its unique id from the database. The Get method returns null if an entity with the given id does not exist; otherwise it returns the fully loaded entity to the caller. The Load method on the other hand returns a proxy to the caller and only loads the entity from the database when a property other than the identifier is accessed. One can also call this a deferred load.

The second level cache is not used by default and should be used with caution. It can provide a huge scalability gain if used wisely, but it can also reduce the performance of the overall system and introduce unnecessary complexity if used incorrectly.

The second level cache is tied to the session factory, that is, all session instances of a given session factory use the same 2nd level cache. This differs from the behavior of the 1st level cache, which is tied to an individual session instance. One can cache individual entities or whole aggregates in the 2nd level cache. But one can also cache (complex and/or time consuming) queries in the 2nd level cache. The 2nd level cache can be divided into regions for more fine grained control.

Enjoy.