Hibernating Rhinos

Zero friction databases

Stress testing RavenDB

The following is cross-posted from Mark Rodseth’s blog (he also posted a follow-up post with more details).

Mark is a .NET Technical Architect at a digital agency named Fortune Cookie in London. I would like to take the opportunity to thank Mark both for the grand experiment you are about to read about and for the permission to post this on the company blog.

Please note: Neither Mark nor Fortune Cookie is affiliated with Hibernating Rhinos or RavenDB in any way.

When a colleague mentioned RavenDB to me I had a poke around and discovered that it was one of the more popular open source NoSQL technologies on the market. Not only that, but it was bundled with Lucene.Net search, making it a document database coupled with Lucene search capabilities. With an interest in NoSQL technology and a grudge match that hadn’t been settled with Lucene.Net, I set myself the challenge to swap out our SQL search implementation with RavenDB and then do a like-for-like load test against the two search technologies.
These are my findings from both a programmatic and performance perspective.


Installing RavenDB
There isn't much to installing Raven: it's pretty much a case of downloading the latest build and running the Server application.
The server comes with a nice Silverlight management interface which allows you to manage all aspects of RavenDB, from databases to data to indexes. All tasks have a programmatic equivalent but a decent GUI is an essential tool for noobs like myself.

Storing the Data
My first development task was to write an import routine which parsed the property data in SQL and then added it to a Raven database. This was fairly easy: all I needed to do was create a POCO, populate it with data from SQL and save it using the C# Raven API. The POCO was serialised to JSON and saved as a new document in RavenDB.

The main challenge here was changing my thinking from relational modelling to domain-driven modelling - a paradigm shift required when moving to NoSQL - which includes concepts like aggregate roots, entities and value types. Journeying into this did get a bit metaphysical at times but here is my understanding of this newfangled schism.
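
To make the POCO-to-document step concrete, here is a minimal sketch of a class with a nested value type being serialised to JSON. The class shape is an assumption (only some property names are borrowed from the index shown later in this post), and it uses System.Text.Json as a stand-in rather than the serialiser built into the RavenDB client, so the output only approximates what a stored document would look like.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical value type: it has no identity of its own and is stored inside its parent.
public class Sector
{
    public string Name { get; set; } = "";
}

// Hypothetical entity: one instance becomes one JSON document.
public class AssetDetailPoco
{
    public int AssetId { get; set; }
    public decimal AskingPrice { get; set; }
    public int NumberOfBedrooms { get; set; }
    public List<Sector> Sectors { get; set; } = new List<Sector>();
}

public static class PocoToJson
{
    // Serialises the whole object graph, nested value types included, into one JSON string.
    public static string ToJson(AssetDetailPoco asset) => JsonSerializer.Serialize(asset);

    public static void Main()
    {
        // Sample values are made up for illustration.
        var asset = new AssetDetailPoco
        {
            AssetId = 1,
            AskingPrice = 250000m,
            NumberOfBedrooms = 2,
            Sectors = { new Sector { Name = "Residential" } }
        };
        Console.WriteLine(ToJson(asset));
    }
}
```

The nested Sectors list ends up inside the same document as the property itself, which is the practical difference from a relational design where it would live in a separate table.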

Entity - An entity is something that has a unique identity and meaning in both the business and system context. In the property web site example, a flat or a bungalow or an office match these criteria.

Value Type - Part of the entity which does not require its own identity and has no domain or system relevance on its own. For example, a bedroom or a toilet.

Aggregate Root - A master entity with special rules and access permissions that relate to a grouping of similar entities. For example, a property is an aggregate of flats, bungalows and offices. This is the best description of these terms I found.

Hibernating Rhinos note: With RavenDB, we usually consider the Entity and Aggregate Root to be synonymous with a Document. There isn’t a distinction in RavenDB between the two, and they both map to a RavenDB document.

In this example, I created one Aggregate Root Entity to store all property types.

C# Property POCO

Indexing the Data
Once the data was stored it needed to be indexed for fast search. To achieve this I had to get to grips with map reduce functions, which I had seen around but avoided like the sad and lonely looking bloke** at a FUN party.
The documentation is pretty spartan on the RavenDB web site but after hacking away I finally created an index that worked on documents with nested types and allowed for spatial queries.
RavenDB allows you to create indexes using Map Reduce functions in LINQ. What this allows you to do is create a Lucene index from a large, tree-like structure of data. Map reduce functions give you the same capability as SQL using joins and group by statements. To create a spatial index which allowed me to search properties by type and sector (nested value types) I created an index using the following Map Reduce function.

Index created using the RavenDB Admin GUI

Hibernating Rhinos note: a more efficient index would likely be something like:

from r in docs.AssetDetailPocos
select new
{
  sectorname = r.Sectors,
  prnlname = r.AddressPnrls,

  r.AssetId,
  r.AskingPrice,
  r.NumberOfBedrooms,
  r.NumberOfBathRooms,
  
  
  _ = SpatialIndex.Generate(r.AssetLatitude, r.AssetLongitude)
}

This would reduce the number of index entries and make the index smaller and faster to generate.
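
The effect on entry count can be seen in memory with plain LINQ to Objects. This is only a sketch: the Doc type and the sample data are hypothetical, and it does not use RavenDB or Lucene. It assumes the first map emits one entry per nested sector, while the second keeps the collection in a single entry, as the index above does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document with a nested collection.
public record Doc(int AssetId, List<string> Sectors);

public static class MapSketch
{
    // One entry per (document, sector) pair: the entry count grows with the nested collection.
    public static int FlattenedEntryCount(IEnumerable<Doc> docs) =>
        (from d in docs
         from s in d.Sectors
         select new { d.AssetId, sectorname = s }).Count();

    // One entry per document: the whole collection is carried in a single field.
    public static int PerDocumentEntryCount(IEnumerable<Doc> docs) =>
        (from d in docs
         select new { d.AssetId, sectorname = d.Sectors }).Count();

    public static void Main()
    {
        var docs = new List<Doc>
        {
            new Doc(1, new List<string> { "Office", "Retail" }),
            new Doc(2, new List<string> { "Residential" })
        };
        Console.WriteLine(FlattenedEntryCount(docs));   // 3
        Console.WriteLine(PerDocumentEntryCount(docs)); // 2
    }
}
```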

Querying the data

Now that I had data that was indexed, the final development challenge was querying it. RavenDB has a basic search API and a Lucene Query API for more complex queries. Both allow you to write queries in LINQ. For the kind of complex queries you would require in a property search web site, the API was a bit lacking. To work around this I had to construct my own native Lucene queries. Fortunately the API allowed me to do so.
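
As an illustration of what building a native Lucene query by hand involves, here is a minimal sketch that assembles a query string from field/value pairs. The helper names and sample values are hypothetical and the field names are borrowed from the index earlier in this post; it only produces a string in standard Lucene query syntax and does not call the RavenDB API.

```csharp
using System;

public static class LuceneQuerySketch
{
    // Builds field:"value", escaping backslashes and double quotes inside the phrase.
    public static string Term(string field, string value)
    {
        var escaped = value.Replace("\\", "\\\\").Replace("\"", "\\\"");
        return $"{field}:\"{escaped}\"";
    }

    // Joins clauses so that every one of them must match.
    public static string And(params string[] clauses) => string.Join(" AND ", clauses);

    // Groups alternatives in parentheses so that any one of them may match.
    public static string Or(params string[] clauses) => "(" + string.Join(" OR ", clauses) + ")";

    public static void Main()
    {
        var query = And(
            Or(Term("sectorname", "Office"), Term("sectorname", "Retail")),
            Term("prnlname", "London"));
        Console.WriteLine(query);
        // (sectorname:"Office" OR sectorname:"Retail") AND prnlname:"London"
    }
}
```

Keeping the escaping in one place matters here, because user-supplied keywords go straight into the query text.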

Performance Testing

All the pawns were now in place for my load test.

  • The entire property SQL database was mirrored to RavenDB.
  • The Search Interface now had both a SQL and a RavenDB implementation.
  • I created a crude Web Page which allowed switching the search from SQL to RavenDB via query string parameters and output the results using paging. To ensure maximum thrashing the load tests passed in random geo locations for proximity search and keywords for attribute search.
  • A VM was set up and ready to face the wrath of BrowserMob.

I created a ramp test scaling from 0 to 1000 concurrent users firing a single GET request with no think time at the Web Page and ran it in isolation against the SQL implementation and then in isolation against the RavenDB implementation. The test ran for 30 minutes.
And for those of you on the edge of your seat, the results were a resounding victory for RavenDB. Some details of the load test are below, but the headline is that SQL choked at 250 concurrent users, whereas with RavenDB, even with 1000 concurrent users, the response time was below 12 seconds.

SQL Load Test

Transactions: 111,014 (Transaction = Single Get Request)
Failures: 110,286 (Any 500 or timeout)

SQL Data Throughput - Flatlines at around 250 concurrent users.

RavenDB Load Test

Transactions: 145,554 (Transaction = Single Get Request)
Failures: 0 (Any 500 or timeout)

RavenDB Data Throughput - What the graph should look like
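
Putting the two result sets side by side, the failure rates can be worked out directly from the figures above. A small sketch of the arithmetic (the helper name is hypothetical):

```csharp
using System;

public static class LoadTestMath
{
    // Share of transactions that failed, as a percentage.
    public static double FailurePercent(long failures, long transactions) =>
        100.0 * failures / transactions;

    public static void Main()
    {
        // Figures taken from the SQL and RavenDB load test summaries.
        Console.WriteLine($"SQL:     {FailurePercent(110_286, 111_014):F2}% failed, {111_014 - 110_286} succeeded");
        Console.WriteLine($"RavenDB: {FailurePercent(0, 145_554):F2}% failed, {145_554 - 0} succeeded");
    }
}
```

In other words, only 728 of the SQL requests completed successfully (a failure rate of roughly 99.3%), against all 145,554 for RavenDB.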

Final thoughts

RavenDB is a great document database with fairly powerful search capabilities. It has a lot of pluses and a few negatives, which are listed for your viewing pleasure below.

Positives

  • The documentation, although spartan, does cover the fundamentals, making it easy to get started. On some occasions I did have to sniff through the source code to fathom how some things worked, but that is the beauty of open source I guess.
  • The Silverlight Admin interface is pretty sweet.
  • The Raven community (a Google group) is very active and the couple of queries I posted were responded to almost immediately.
  • Although the API did present some challenges, it allowed you both to bypass its limitations and even to contribute to the project yourself.
  • The commercial licence for RavenDB is pretty cheap at a $600 one-off payment.

Negatives

  • The web site documentation and content could do with a facelift. (Saying that, I just checked the web site and it seems to have been revamped.)
  • I came across a bug in Lucene.Net related to granular spatial queries which has yet to be resolved. Not RavenDB's fault, but a dependence on third party libraries can cause issues.
  • I struggled to find really impressive commercial reference sites. There are some testimonials but they give little information away. 
  • Sharding scares me.

I look forward to following the progress of RavenDB and hopefully one day using it in a commercial project. I'm not at the comfort level yet for proposing it but with some more investigation and perhaps some good reference sites this could change very quickly.


* Starry Eyed groupies sadly didn't exist, nor have they ever.
** Not me.

http://ravendb.net

Posted By: Ayende Rahien
