Tuesday, October 19, 2010

lucene-bytebuffer

What is lucene-bytebuffer?

lucene-bytebuffer is a Lucene Directory implementation that uses direct ByteBuffer. In Lucene, a Directory is the backing storage for an index: Lucene reads and writes all index contents through it. Existing implementations include RAMDirectory, FSDirectory, MMapDirectory and NIOFSDirectory, each with different trade-offs. lucene-bytebuffer allows an in-memory index to grow to several gigabytes without incurring garbage collection cost.

Most indexes are read-heavy: roughly 90 to 95% reads and 2 to 5% writes, i.e. the index hardly changes. If the index is huge and lives on the heap, it costs a lot of garbage collection CPU cycles. RAMDirectory holds its data in arrays of size 1024, so a 1GB index means about 1 million array objects. As the index grows, in-memory performance degrades due to garbage collection.
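To make the arithmetic concrete, here is a small sketch that counts how many 1024-byte arrays a heap-based index of a given size needs. The class and method names (BufferCount, heapArrays) are made up for illustration; only the 1024 buffer size comes from the text above.

```java
// Hypothetical helper, not part of Lucene or lucene-bytebuffer.
class BufferCount {
    // Buffer size quoted above for RAMDirectory.
    static final int RAM_BUFFER_SIZE = 1024;

    // Number of fixed-size arrays needed to hold indexBytes (rounded up).
    static long heapArrays(long indexBytes) {
        return (indexBytes + RAM_BUFFER_SIZE - 1) / RAM_BUFFER_SIZE;
    }

    public static void main(String[] args) {
        long oneGb = 1L << 30;
        // Prints 1048576, i.e. about 1 million array objects for 1GB.
        System.out.println(heapArrays(oneGb));
    }
}
```

Every one of those arrays is a separate heap object that the collector has to track, which is where the cost comes from.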

What if you want to index, say, 5GB of data? Use an off-heap, ByteBuffer-backed directory.
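As a rough sketch of the off-heap idea (not the actual lucene-bytebuffer code), the class below keeps bytes in a handful of direct ByteBuffers addressed by a long position. The name OffHeapStore, its methods and the chunkSize parameter are assumptions made for this example. A single direct buffer cannot exceed Integer.MAX_VALUE bytes, so something like 5GB has to be split across several chunks.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; not the lucene-bytebuffer implementation.
class OffHeapStore {
    private final ByteBuffer[] chunks;
    private final int chunkSize;

    // Allocates enough direct (off-heap) chunks to cover 'capacity' bytes.
    OffHeapStore(long capacity, int chunkSize) {
        this.chunkSize = chunkSize;
        int n = (int) ((capacity + chunkSize - 1) / chunkSize);
        this.chunks = new ByteBuffer[n];
        for (int i = 0; i < n; i++) {
            chunks[i] = ByteBuffer.allocateDirect(chunkSize);
        }
    }

    // Absolute write: pick the chunk, then the offset inside it.
    void writeByte(long pos, byte b) {
        chunks[(int) (pos / chunkSize)].put((int) (pos % chunkSize), b);
    }

    // Absolute read using the same chunk/offset mapping.
    byte readByte(long pos) {
        return chunks[(int) (pos / chunkSize)].get((int) (pos % chunkSize));
    }

    // Number of ByteBuffer objects the garbage collector sees.
    int chunkCount() {
        return chunks.length;
    }
}
```

With a chunk size of 1GB, a 5GB store is just 5 ByteBuffer objects on the heap; the bytes themselves live outside it. The amount of direct memory the JVM will hand out is bounded by the -XX:MaxDirectMemorySize setting.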

Another question is why you would want to use Lucene for in-memory indexing at all. Maybe as a cache that can be queried on more than one property of the indexed objects?
