java-user@lucene.apache.org
[Top] [All Lists]

Re: incremental update of index

Subject: Re: incremental update of index
From: "Erick Erickson"
Date: Mon, 10 Nov 2008 14:52:08 -0500
You have to have indexed something that uniquely identifies the
document in order to know what the old one is. Really, this is
the same question as updating, isn't it? If you could update
a document in place, you'd have to know what document
that was. If you know that information, you know which
document to delete.

Note that lucene has no built-in document recognition. If I
add the same document to the index twice, Lucene will
happily consider them two *separate* documents. You have
to code your own notion of document meta-id (as distinct
from the Lucene doc id). It could be the URL, the file path
on disk, a document ID from your organization... the
possibilities are endless. Which is why Lucene can't do that
for you.

Best
Erick

On Mon, Nov 10, 2008 at 2:22 PM, ChadDavis <chadmichaeldavis@xxxxxxxxx>wrote:

> In the FAQ's it says that you have to do a manual incremental update:
>
> How do I update a document or a set of documents that are already indexed?
> >
> > There is no direct update procedure in Lucene. To update an index
> > incrementally you must first *delete* the documents that were updated,
> and
> > *then re-add* them to the index.
> >
>
> How do I determine the existing document that matches the document I am
> updating?
>
<Prev in Thread] Current Thread [Next in Thread>