java-user@lucene.apache.org

Re: OutOfMemoryError using IndexWriter

Subject: Re: OutOfMemoryError using IndexWriter
From: Michael McCandless
Date: Thu, 25 Jun 2009 07:13:06 -0400
Can you post your test code?  If you can make it a standalone test,
then I can repro and dig down faster.

Can you try calling IndexWriter.setMaxBufferedDeleteTerms (eg, 1000)
and see if that prevents the OOM?
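
To illustrate why capping the number of buffered delete terms should bound
memory, here is a toy model of the buffering described in the quoted message
below. It is only a sketch and not Lucene's actual implementation: the class
BufferedDeletesSketch, its methods, and the String-keyed map are all
hypothetical names made up for this example. In real code the corresponding
knob is IndexWriter.setMaxBufferedDeleteTerms.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of buffered deletes. Hypothetical; NOT Lucene's implementation. */
class BufferedDeletesSketch {
    // Delete term -> number of docs added at the time the delete was buffered.
    private final Map<String, Integer> bufferedTerms = new HashMap<>();
    private final int maxBufferedDeleteTerms;
    private int docCount = 0;
    private int flushCount = 0;

    BufferedDeletesSketch(int maxBufferedDeleteTerms) {
        this.maxBufferedDeleteTerms = maxBufferedDeleteTerms;
    }

    /** Buffers a delete term; flushes once the buffer reaches the cap. */
    void deleteDocuments(String term) {
        bufferedTerms.put(term, docCount);
        if (bufferedTerms.size() >= maxBufferedDeleteTerms) {
            flush();
        }
    }

    void addDocument() {
        docCount++;
    }

    /** An update is modeled as a buffered delete followed by an add. */
    void updateDocument(String term) {
        deleteDocuments(term);
        addDocument();
    }

    /** Stands in for materializing the buffered terms into deleted doc IDs. */
    private void flush() {
        bufferedTerms.clear();
        flushCount++;
    }

    int bufferedCount() {
        return bufferedTerms.size();
    }

    int flushCount() {
        return flushCount;
    }

    public static void main(String[] args) {
        BufferedDeletesSketch writer = new BufferedDeletesSketch(1000);
        for (int i = 0; i < 2500; i++) {
            writer.updateDocument("id:" + i);
        }
        // Prints "500 buffered, 2 flushes"
        System.out.println(writer.bufferedCount() + " buffered, "
                + writer.flushCount() + " flushes");
    }
}
```

Without the cap, the map in this model grows by one entry per distinct
delete term for as long as nothing triggers a flush, which is the kind of
growth I suspect you are seeing.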

Mike

On Thu, Jun 25, 2009 at 7:10 AM, stefan<stefan@xxxxxxxxxxxxxxx> wrote:
>
> Hi Mike,
>
> I just changed my test code to run in an infinite loop over the database to
> index everything. I set the JVM heap size to 120 MB, with all other parameters
> as before.
> I got an OutOfMemoryError just as before, so I would say there is a leak
> somewhere.
>
> Here is the histogram.
>
> Heap Histogram
>
> All Classes (excluding platform)
> Class                                                           Instance Count  Total Size
> class [B                                                               1809102    41992326
> class [C                                                                200610    26877068
> class [[B                                                                46117     9473872
> class java.lang.String                                                  198629     3178064
> class org.apache.lucene.index.FreqProxTermsWriter$PostingList           100927     2825956
> class [Ljava.util.HashMap$Entry;                                         11329     2494312
> class java.util.HashMap$Entry                                           132578     2121248
> class [I                                                                  5186     2097300
>
> So far I have had no success pinpointing where those byte arrays come from; I
> will need some more time for this.
>
> Stefan
>
> -----Original Message-----
> From: Michael McCandless [mailto:lucene@xxxxxxxxxxxxxxxxxx]
> Sent: Wed 24.06.2009 17:50
> To: java-user@xxxxxxxxxxxxxxxxx
> Subject: Re: OutOfMemoryError using IndexWriter
>
> On Wed, Jun 24, 2009 at 10:18 AM, stefan<stefan@xxxxxxxxxxxxxxx> wrote:
>>
>> Hi,
>>
>>
>>>OK so this means it's not a leak, and instead it's just that stuff is
>>>consuming more RAM than expected.
>> Or that my test db is smaller than the production db which is indeed the 
>> case.
>
> But a "leak" would keep leaking over time, right?  Ie even a 1 GB heap
> on your test db should eventually throw OOME if there's really a leak.
>
>> Please explain those buffered deletes in a bit more detail.
>
> Are you calling updateDocument (which deletes then adds)?
>
> Deletes (the Term or Query you pass to updateDocument or
> deleteDocuments) are buffered in a HashMap and then that buffer is
> materialized into actual deleted doc IDs when IndexWriter decides to
> do so.  I think IndexWriter isn't properly flushing the deletes when
> they use too much RAM.
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@xxxxxxxxxxxxxxxxx
> For additional commands, e-mail: java-user-help@xxxxxxxxxxxxxxxxx
