Re: Document Order in IndexWriter.addIndexes

Subject: Re: Document Order in IndexWriter.addIndexes
From: Andrzej Bialecki
Date: Thu, 01 Jul 2010 07:54:15 +0200
On 2010-06-30 22:16, Apoorv Sharma wrote:
> This implies there is no way to merge two parallel indexes (based on
> ParallelReader) to get a new parallel index. Correct me if I am wrong.

Do you mean sequentially, using IW.addIndexes(new
IndexReader[]{index1,index2}), so that the fields from documents in one
index become attached to the fields in the other index under the same
doc IDs? No, it absolutely doesn't work this way: that call appends the
documents of index2 after those of index1, and each document keeps only
its own fields.

Or do you mean to merge the two indexes in parallel? Then just open them
with ParallelReader ;) and pass this parallel reader to
IW.addIndexes(). ParallelReader doesn't expose its sub-readers to
SegmentMerger: it returns null from getSequentialSubReaders(), so
SegmentMerger has to use the documents as they are returned from
ParallelReader.document(int). The net effect is that fields from the
sub-readers are merged under the same doc ID, and are recorded that way
in the output index.
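
To make the difference concrete, here is a toy model in plain Java. It
does not use Lucene at all: an "index" is just a list whose position
stands in for the doc ID, and a "document" is a map of field names to
values. The class and method names (ParallelMergeSketch,
sequentialMerge, parallelMerge, doc) are made up for illustration, and
the sketch assumes the two indexes have disjoint field names. Treat it
as a picture of the doc ID semantics only, not of how SegmentMerger is
implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: list position = doc ID, document = field name -> value.
final class ParallelMergeSketch {

    // Models passing two separate readers to addIndexes:
    // documents of b are appended after those of a, nothing is joined.
    static List<Map<String, String>> sequentialMerge(
            List<Map<String, String>> a, List<Map<String, String>> b) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> d : a) out.add(new LinkedHashMap<>(d));
        for (Map<String, String> d : b) out.add(new LinkedHashMap<>(d));
        return out;
    }

    // Models passing one parallel view of both indexes:
    // fields found under the same doc ID end up in one output document.
    static List<Map<String, String>> parallelMerge(
            List<Map<String, String>> a, List<Map<String, String>> b) {
        if (a.size() != b.size()) {
            // A parallel view only makes sense if doc IDs line up one to one.
            throw new IllegalArgumentException("indexes must have the same size");
        }
        List<Map<String, String>> out = new ArrayList<>();
        for (int docId = 0; docId < a.size(); docId++) {
            Map<String, String> merged = new LinkedHashMap<>(a.get(docId));
            merged.putAll(b.get(docId)); // field names assumed disjoint
            out.add(merged);
        }
        return out;
    }

    // Hypothetical helper: builds a single-field document.
    static Map<String, String> doc(String field, String value) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put(field, value);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, String>> titles =
                Arrays.asList(doc("title", "A"), doc("title", "B"));
        List<Map<String, String>> bodies =
                Arrays.asList(doc("body", "x"), doc("body", "y"));

        // prints [{title=A}, {title=B}, {body=x}, {body=y}]
        System.out.println(sequentialMerge(titles, bodies));
        // prints [{title=A, body=x}, {title=B, body=y}]
        System.out.println(parallelMerge(titles, bodies));
    }
}
```

The sequential case yields four documents with one field each; the
parallel case yields two documents with both fields, which is the
outcome described above for a ParallelReader handed to addIndexes().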

Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
