java-user@lucene.apache.org

Subject: Re: IO exception during merge/optimize
From: Michael McCandless
Date: Wed, 28 Oct 2009 11:29:28 -0400
On Wed, Oct 28, 2009 at 10:58 AM, Peter Keegan <peterlkeegan@xxxxxxxxx> wrote:
> The only change I made to the source code was the patch for PayloadNearQuery
> (LUCENE-1986).

That patch certainly shouldn't lead to this.

> It's possible that our content contains U+FFFF. I will run in debugger and
> see.

OK, may as well check, just so we cover all possibilities.
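
One way to check a piece of content for U+FFFF without stepping through a debugger is sketched below. This is only an illustration: the class and method names (FfffScan, firstFfff) are made up for this example and are not part of Lucene, and how you get at your field text is up to your indexing code.

```java
class FfffScan {
    // Returns the index of the first U+FFFF character in text, or -1 if none.
    // (Hypothetical helper for illustration; not a Lucene API.)
    static int firstFfff(String text) {
        return text.indexOf(0xFFFF);
    }

    public static void main(String[] args) {
        // Build the sample with an explicit cast so the character is unambiguous.
        String clean = "plain text";
        String suspect = "ab" + (char) 0xFFFF + "cd";

        System.out.println("clean:   " + firstFfff(clean));
        System.out.println("suspect: " + firstFfff(suspect));
    }
}
```

Calling firstFfff on each field value just before it is added to a document would show whether the character ever reaches the indexer.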

> The data is 'sensitive', so I may not be able to provide a bad segment,
> unfortunately.

OK, maybe we can modify your CheckIndex instead.  Let's start with
this patch, which prints a warning whenever the docFreq differs but
otherwise continues (instead of throwing a RuntimeException).  I'm
curious how many terms show this, and whether the TermEnum keeps
working after a term whose docFreq differs:

Index: src/java/org/apache/lucene/index/CheckIndex.java
===================================================================
--- src/java/org/apache/lucene/index/CheckIndex.java    (revision 829889)
+++ src/java/org/apache/lucene/index/CheckIndex.java    (working copy)
@@ -672,8 +672,8 @@
         }

         if (freq0 + delCount != docFreq) {
-          throw new RuntimeException("term " + term + " docFreq=" +
-                                     docFreq + " != num docs seen " + freq0 + " + num docs deleted " + delCount);
+          System.out.println("WARNING: term  " + term + " docFreq=" +
+                             docFreq + " != num docs seen " + freq0 + " + num docs deleted " + delCount);
         }
       }
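
To make the warn-and-continue idea concrete outside of Lucene, here is a minimal standalone sketch that applies the same comparison (docs seen + docs deleted vs. docFreq) to a list of terms and collects a warning per mismatch instead of throwing on the first one. Everything here (DocFreqTally, TermCounts, findMismatches, the sample terms and numbers) is hypothetical and only mirrors the shape of the check in the patch; it does not read a real index.

```java
import java.util.ArrayList;
import java.util.List;

class DocFreqTally {
    // Hypothetical holder for the numbers the check compares for one term.
    static final class TermCounts {
        final String term;
        final int docFreq;   // docFreq reported by the term dictionary
        final int docsSeen;  // docs actually enumerated for the term
        final int delCount;  // deleted docs skipped while enumerating

        TermCounts(String term, int docFreq, int docsSeen, int delCount) {
            this.term = term;
            this.docFreq = docFreq;
            this.docsSeen = docsSeen;
            this.delCount = delCount;
        }
    }

    // Returns one warning line per inconsistent term; never throws,
    // so every term is examined even after a mismatch.
    static List<String> findMismatches(List<TermCounts> terms) {
        List<String> warnings = new ArrayList<>();
        for (TermCounts t : terms) {
            if (t.docsSeen + t.delCount != t.docFreq) {
                warnings.add("WARNING: term " + t.term + " docFreq=" + t.docFreq
                        + " != num docs seen " + t.docsSeen
                        + " + num docs deleted " + t.delCount);
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        List<TermCounts> terms = new ArrayList<>();
        terms.add(new TermCounts("body:apple", 3, 2, 1)); // consistent: 2 + 1 == 3
        terms.add(new TermCounts("body:pear", 5, 2, 1));  // inconsistent: 2 + 1 != 5

        List<String> warnings = findMismatches(terms);
        for (String w : warnings) {
            System.out.println(w);
        }
        System.out.println(warnings.size() + " of " + terms.size() + " terms mismatched");
    }
}
```

Collecting the warnings rather than stopping at the first one is what answers the "how many terms show this" question in a single pass.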

Mike
