Apache Lucene Directory update

Posted by: Billy Newport on 2010-07-14 17:00:00.0

I've checked in what looks like a pretty good Directory implementation for now. The current version lets you configure both the block size and whether block compression is used.

I've changed the plugin to use WXS dynamic maps. I defined two map templates, one for file metadata and another for chunks (the file data). The plugin now creates a new pair of maps for each directory: the file metadata map is named "FileMetaData."+dirName and the chunk map is named "ChunkMap."+dirName. The reason for this change is to enable better metrics. Previously, all the chunks and file metadata lived in one map, which made it hard to see per-directory metrics such as map sizes. Separating the chunks and metadata into per-directory maps makes this easier to analyze.
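As a rough illustration of the naming scheme, here is a minimal sketch in Java. The class and method names are hypothetical and are not taken from the plugin; only the two prefixes come from the description above. It just builds the map names. The real plugin would then ask WXS for maps with those names so that the matching templates create them.

```java
// Hypothetical helper (not the plugin's actual code): derives the
// per-directory map names from the two prefixes mentioned in the post.
final class DirectoryMapNames {
    private static final String META_PREFIX = "FileMetaData.";
    private static final String CHUNK_PREFIX = "ChunkMap.";

    private DirectoryMapNames() {
    }

    // Name of the map holding file metadata for one directory.
    static String fileMetaDataMap(String dirName) {
        return META_PREFIX + requireName(dirName);
    }

    // Name of the map holding the file chunks for one directory.
    static String chunkMap(String dirName) {
        return CHUNK_PREFIX + requireName(dirName);
    }

    // Reject null or empty names so two directories never share a map.
    private static String requireName(String dirName) {
        if (dirName == null || dirName.isEmpty()) {
            throw new IllegalArgumentException("dirName must not be empty");
        }
        return dirName;
    }
}
```

Because each directory gets its own pair of names, map-level statistics line up one-to-one with directories.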

Compression gives substantial memory and network bandwidth savings. I've seen savings as high as 90% on the indexes I have available to me, but it does burn more CPU. It's the classic trade-off.
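To make the trade-off concrete, here is a minimal sketch of compressing and restoring a single block. It assumes DEFLATE via java.util.zip, which is my choice for illustration; the post does not say which algorithm the Directory uses, and the BlockCodec name is hypothetical. The CPU cost is paid in compress() on write and decompress() on read, in exchange for a smaller value stored in the grid and sent over the network.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical block codec, assuming DEFLATE; not the plugin's actual code.
final class BlockCodec {
    private BlockCodec() {
    }

    // Compress one block; BEST_SPEED favors CPU time over compression ratio.
    static byte[] compress(byte[] block) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(block);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream(block.length / 2 + 16);
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib memory
        }
    }

    // Restore a block; the caller supplies the known uncompressed length.
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            byte[] out = new byte[originalLength];
            int off = 0;
            while (off < originalLength && !inflater.finished()) {
                int n = inflater.inflate(out, off, originalLength - off);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid block");
                }
                off += n;
            }
            if (off != originalLength) {
                throw new IllegalArgumentException("unexpected block length");
            }
            return out;
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("corrupt block", e);
        } finally {
            inflater.end();
        }
    }
}
```

The ratio you get depends entirely on the data, so measure it on your own indexes before deciding whether the extra CPU is worth it.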


Billy Newport's complete blog can be found at: http://www.devwebsphere.com/devwebsphere

About Billy Newport

Billy is a Distinguished Engineer at IBM. He's been at IBM since 2001. Billy was the lead on the WorkManager/Scheduler APIs, which were later standardized by IBM and BEA and are now the subject of JSR 236 and JSR 237. Billy led the design of the WebSphere 6.0 non-blocking IO framework (channel framework) and the WebSphere 6.0 high availability/clustering (HAManager). Billy currently works on WebSphere XD and ObjectGrid. He's also the lead persistence architect and runtime availability/scaling architect for the base application server.

Before IBM, Billy worked as an independent consultant at investment banks, telcos, publishing companies and travel reservation companies. He wrote video games in C and assembler on the ZX Spectrum, Atari ST and Commodore Amiga as a teenager. He started programming on an Apple IIe when he was eleven; his first programming language was 6502 assembler.

Billy's current interests are lightweight non-invasive middleware, complex event processing systems and grid-based OLTP frameworks.
