--- Paul Elschot <[email protected]> wrote:
> On Tuesday 01 November 2005 08:51, Otis Gospodnetic wrote:
> > Hello,
> > I spent most of today talking to some people about Lucene, and one of
> > them said how they would really like to have an "instantaneous
> > merge", and how he is thinking he could achieve that by simply editing
> > the segments file of one index, and adding segment names of the other
> > index/indices, plus adjusting the segment size (SegSize in
> > fileformats.html), thus creating a single (but unoptimized) index.
> > Any reactions to that?
> > I imagine this isn't quite that simple to implement, as one would have
> > to renumber all documents, in order to avoid having multiple documents
> > with the same document id.
> > Can anyone think of any other problems with this approach, or
> > offer ideas for possible document renumbering?
> Document numbers within segments are determined dynamically in the
> index reader, so these should not be a problem. Each segment simply
> numbers its documents from zero.
Uh, and I always thought they were stored in the index. Aren't they
stored in the .fdx and .fdt files? And shouldn't they also be linked
from some place? I see a mention of document numbers in the information
about the .frq file.
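
To make the "numbered dynamically in the reader" idea concrete, here is a
minimal sketch of how a reader over several segments could turn
segment-local document numbers (each starting at zero) into index-wide
ones by adding a per-segment base offset. This is not Lucene code; the
function names and the document counts are made up for illustration.

```python
from bisect import bisect_right
from itertools import accumulate


def segment_bases(doc_counts):
    """Return the index-wide doc number at which each segment starts.

    doc_counts holds the number of documents per segment, in reader order.
    """
    return [0] + list(accumulate(doc_counts))[:-1]


def to_global(segment_index, local_doc, bases):
    """Map a segment-local doc number to an index-wide one."""
    return bases[segment_index] + local_doc


def locate(global_doc, bases):
    """Map an index-wide doc number back to (segment index, local doc number)."""
    i = bisect_right(bases, global_doc) - 1
    return i, global_doc - bases[i]


if __name__ == "__main__":
    bases = segment_bases([3, 5, 2])   # three segments with 3, 5 and 2 docs
    print(bases)                       # [0, 3, 8]
    print(to_global(1, 1, bases))      # 4
    print(locate(8, bases))            # (2, 0)
```

Under this scheme nothing stored inside a segment has to change when the
segment is listed in a different index. Only the bases, which depend on
segment order and sizes, come out differently.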
> Iirc the segment names determine the order
> of the segments for an index reader.
> I think creating a new index by adding segments from an existing one should
> be fairly straightforward. Some care will be needed to avoid
> clashes in the segment names.
You mean ensuring that segment _x from index A doesn't clash with _x
from index B? Segment names are written only in the segments file, I
believe, so I think if I detect that _x is already taken, I could
simply rename it to something (e.g. _foo) that hasn't been taken yet,
and remember to use that segment name when writing the segments file.
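
A rough sketch of that renaming step, under some assumptions: segment
names are taken to look like "_" followed by a base-36 counter, and
plan_renames is a hypothetical helper, not part of Lucene. It only
computes the old-to-new name mapping. Since each segment's files carry
the segment name as their prefix (_x.fdt, _x.fdx, ...), the files would
have to be renamed to match, and the name counter kept in the segments
file would presumably need to be advanced past any new name.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base36(n):
    """Render a non-negative integer in base 36."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, rem = divmod(n, 36)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


def plan_renames(names_a, names_b):
    """Return {old_name: new_name} for segments of B that clash with A.

    New names are chosen so that they collide neither with A's names nor
    with any name already used in B.
    """
    in_a = set(names_a)
    taken = in_a | set(names_b)
    renames = {}
    counter = 0
    for name in names_b:
        if name not in in_a:
            continue  # no clash, keep the name as it is
        while True:
            candidate = "_" + to_base36(counter)
            counter += 1
            if candidate not in taken:
                break
        taken.add(candidate)
        renames[name] = candidate
    return renames


if __name__ == "__main__":
    print(plan_renames(["_0", "_1"], ["_1", "_2"]))  # {'_1': '_3'}
```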
> Also what should happen with
> the index from which the segments are taken? Should the shared
> segments be copied between indexes?
I can simply destroy the original index once I've created the fake
merged one. I'm not sure what you mean by shared segments. If I have
two indices, A and B, then each of them will have its own set of
segments with no segments in common.
> It's possible to share segments between indexes when the file system
> allows files to be present in multiple directories.
Oh, are you saying that I could just leave segments where they are and
use something like symlinks to point to them from a new index?
A: <index files for A>
B: <index files for B>
C: <symlinks to index files for A>
   <symlinks to index files for B>
   <segments file with segment names for A and B>
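
The layout above could be set up roughly as follows. This is only a
sketch: link_index_files is a hypothetical helper, it assumes a file
system that supports symlinks, and it deliberately does not write the
combined segments file for C, which would have to be produced separately
in Lucene's own format. It refuses to continue on a file name clash, so
any clashing segments would need renaming first.

```python
import os


def link_index_files(source_dirs, target_dir, skip=("segments",)):
    """Symlink every file from each source index directory into target_dir.

    Files named in `skip` (by default each index's own segments file) are
    left out. Raises FileExistsError if two sources contain the same name.
    Returns the list of file names that were linked.
    """
    os.makedirs(target_dir, exist_ok=True)
    linked = []
    for src in source_dirs:
        for fname in sorted(os.listdir(src)):
            if fname in skip:
                continue
            dest = os.path.join(target_dir, fname)
            if os.path.lexists(dest):
                raise FileExistsError(
                    "segment file name clash: %s (from %s)" % (fname, src)
                )
            # Link to the absolute path so C works from any working directory.
            os.symlink(os.path.abspath(os.path.join(src, fname)), dest)
            linked.append(fname)
    return linked
```

Called as link_index_files(["A", "B"], "C"), this leaves A and B
untouched, so either original index can still be opened on its own.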