[email protected]
[Top] [All Lists]

Splitting the index

Subject: Splitting the index
From: Rob Young
Date: Wed, 27 Sep 2006 17:51:11 +0100

I'm using Lucene to search a product database (CDs, DVDs, games and now 
books). Recently that index has increased in size to over a million items 
(added books). I have been performance testing our search server and the 
throughput of requests has dropped significantly, profiling the server it all 
seems to be in the Lucene searching.

So, now that I've narrowed it down to the searching itself rather than the 
rest of the application. What can I do about it? I am running a TermQuery, 
falling back to a FuzzyQuery when no results are found (each combined in a 
boolean queries with the product type restrictions). 

One solution I had in mind was to split the index down into four, would this 
provide any gains? It will require a lot of re-factoring so I don't want to 
commit myself if there's no chance it will help.

Another solution along the same train of thought was to use a caching filter 
search to cut the index into parts. How would this compare to the previous 

Does anyone have any other ideas / suggestions?


To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

<Prev in Thread] Current Thread [Next in Thread>