Wednesday, April 8, 2009

Querying Java Objects stored in Terracotta's NAM

This post is inspired by http://forums.terracotta.org/forums/posts/list/1965.page. Nothing new, just another voice in the blogosphere.

Terracotta is a great clustering solution; in fact it is a platform-level service, so it has a large number of uses. One of them is using it as a database. Terracotta can never replace a database, but it can play the role of a data storage medium very well. One major disadvantage is the lack of data querying. The only way you can query data is through a Map. A Map is like a single index, so if you want a list of objects satisfying some criteria you have to iterate through the entire collection. There are already APIs written for querying Java collections, so you can use them with Terracotta NAM.
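To illustrate the "single index" point, here is a minimal sketch in plain Java. The Person class and its id and city fields are hypothetical, made up for this example, and nothing here is Terracotta-specific. A lookup by key goes straight to the entry, while a query on any other property has to visit every value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapVsScan {
    // Hypothetical domain class, not from the original test code.
    static final class Person {
        final String id;
        final String city;

        Person(String id, String city) {
            this.id = id;
            this.city = city;
        }
    }

    // Lookup by key: the map itself is the only index available.
    static Person findById(Map<String, Person> people, String id) {
        return people.get(id);
    }

    // Query on a non-key property: every value has to be visited.
    static List<Person> findByCity(Map<String, Person> people, String city) {
        List<Person> result = new ArrayList<>();
        for (Person p : people.values()) {
            if (p.city.equals(city)) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Person> people = new HashMap<>();
        people.put("1", new Person("1", "Pune"));
        people.put("2", new Person("2", "Mumbai"));
        people.put("3", new Person("3", "Pune"));

        System.out.println(findById(people, "2").city);        // prints "Mumbai"
        System.out.println(findByCity(people, "Pune").size()); // prints 2
    }
}
```

The cost of findByCity grows with the size of the map, which is why scans over very large collections get slow.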

When you think about querying, there are a lot of factors: the query language, its performance (optimizers), and the operations supported (select, update, delete, joins).

JoSQL


JoSQL is a good API for querying Java collections, with good documentation. I did a small test with 1 million objects, and a random query took around 800 ms, which is way too much. Again, it is a simple iteration through the collection, due to the lack of indexes and a query execution plan. The problem with indexes is that object graphs can change, and at every change you have to recompute the index, which would be difficult to do: as complex as Terracotta's bytecode instrumentation. You can find test code here

Query language: moderately good. Performance: not good for large collections. Operations: select and projection queries only; no update, delete, or joins.

Quaere


Quaere is a very flexible DSL that lets you perform a wide range of queries against any data structure that is an array or implements java.lang.Iterable. It is a sort of port of LINQ from the .NET world. I think LINQ is the next-generation data querying tool, and a cleaner one. Quaere is not a query language but a query API, much like Hibernate Criteria queries but more elegant. I really liked Quaere: it is really powerful, and it supports join operations. You can read this post for details: Solving Puzzles with Quaere. It is still at beta level and not released. One of Quaere's sister projects is its JPA integration. Imagine you could write a standard JPA application with Quaere as the query language and Terracotta as the persistent store. No need for a database. But as with JoSQL, Quaere is also slow, I mean slower than an RDBMS. I did a small test with 1 million objects, similar to the JoSQL test, and the response time was similar to JoSQL's. You can download test code here

This post also discusses jmap's OQL implementation. It uses the Rhino JavaScript engine behind the scenes, with hashtables. I did consider porting it to Terracotta, but it is custom written for object heap dumps. JXPath is another tool with which you can query Java collections using XPath expressions. I did not evaluate JXPath since I felt it would be along similar lines to JoSQL and Quaere, only a different flavor. If you have used XPath before, it is much easier to use.

Glazed Lists is an event-driven list API specially designed for Swing applications displaying table and list data. But if you consider a list of objects as a table (each object is a row and its properties are columns), a proper in-memory index can be maintained for querying. But this applies only at the root object level. What if an inner object in an object graph stored in your container changes? You may then need to update the container whenever an object changes. So I guess maintaining an in-memory index for Java objects is a pretty difficult thing to do.
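To make the stale-index problem concrete, here is a minimal sketch in plain Java. The Employee, Address and EmployeeStore classes are hypothetical, invented for this example; this is not how Glazed Lists or Terracotta work internally. The store keeps a secondary index on address.city, and a change to the inner Address object goes unnoticed until the index is rebuilt.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StaleIndexDemo {
    // Hypothetical inner object whose state can change after insertion.
    static final class Address {
        String city;

        Address(String city) {
            this.city = city;
        }
    }

    // Hypothetical root object stored in the container.
    static final class Employee {
        final String name;
        final Address address;

        Employee(String name, Address address) {
            this.name = name;
            this.address = address;
        }
    }

    // Container with a secondary index keyed by address.city.
    static final class EmployeeStore {
        private final List<Employee> all = new ArrayList<>();
        private final Map<String, List<Employee>> byCity = new HashMap<>();

        void add(Employee e) {
            all.add(e);
            index(e);
        }

        private void index(Employee e) {
            List<Employee> bucket = byCity.get(e.address.city);
            if (bucket == null) {
                bucket = new ArrayList<>();
                byCity.put(e.address.city, bucket);
            }
            bucket.add(e);
        }

        // Indexed lookup: no scan, but only as fresh as the index.
        List<Employee> findByCity(String city) {
            List<Employee> bucket = byCity.get(city);
            return bucket == null ? new ArrayList<Employee>() : new ArrayList<>(bucket);
        }

        // Full rebuild: the brute-force answer to "something changed".
        void reindex() {
            byCity.clear();
            for (Employee e : all) {
                index(e);
            }
        }
    }

    public static void main(String[] args) {
        EmployeeStore store = new EmployeeStore();
        Employee emp = new Employee("Asha", new Address("Pune"));
        store.add(emp);

        // The inner object changes; the store is not told about it.
        emp.address.city = "Mumbai";

        System.out.println(store.findByCity("Mumbai").size()); // prints 0 (stale index)
        System.out.println(store.findByCity("Pune").size());   // prints 1 (stale index)

        store.reindex();
        System.out.println(store.findByCity("Mumbai").size()); // prints 1
        System.out.println(store.findByCity("Pune").size());   // prints 0
    }
}
```

Rebuilding the whole index on every change is too expensive for large collections, and updating it incrementally means the container has to be notified of every change deep in the object graph, which is the hard part.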

With such tools, I think you can easily query moderately sized Java collections stored in Terracotta's durable memory with acceptable response times.

Tushar

8 comments:

Anonymous said...

What about Jofti?

Jofti is a pluggable object indexing and searching solution for objects in a Caching layer.

The framework provides transparent addition, removal and updating of object properties in its index as well as simple to use query capabilities for searching.

http://sourceforge.net/projects/jofti/

Rob Juurlink

Tushar Khairnar said...

Hi Robb,

Thanks for the link.
But there are no releases of the project.
Where can I get the source code so that I can add Jofti to my test program?

I was also thinking of writing a small object indexing framework. Great to know something already exists.

Regards,
Tushar

Anonymous said...

Ok, fortunately I still have these files on my own server. I've made it downloadable, it is GPL afaik.

jofti-src-1.2-rc4.zip
jofti.jar

Rob Juurlink

Anonymous said...

The Jofti website exists at: http://prism-index.com/info.html

Taharqa said...

Hi Tushar,

What about JXPath (http://commons.apache.org/jxpath), which allows you to do things like this:

public class Author {
    public Book[] getBooks() {
        ...
    }
}

Author auth = new Author();
...

JXPathContext context = JXPathContext.newContext(auth);
Iterator threeBooks = context.iterate("books[position() < 4]");

or stuff like this:

public class Employee {
    public Address getAddress() {
        ...
    }

    public void setAddress(Address address) {
        ...
    }
}

Employee emp = new Employee();
Address addr = new Address();
...

JXPathContext context = JXPathContext.newContext(emp);
context.setValue("address", addr);
context.setValue("address/zipCode", "90190");

Taharqa

Tushar Khairnar said...

Hi all,

Thanks for the comments and new options.

@kozmoz: I will check Jofti. It looks like more than a simple caching solution. Interesting.

@taharqa
I did mention JXPath but did not test it.

Regards,
Tushar

Kamran Ali Khan said...

You can also check out JFilter: http://code.google.com/p/jfilter. It is high performance and simple to use, with queries similar to MongoDB queries.