[email protected]
[Top] [All Lists]

Re: [Haskell-cafe] Boxed Mutable Arrays

Subject: Re: [Haskell-cafe] Boxed Mutable Arrays
From: Edward Kmett
Date: Wed, 16 Dec 2009 13:00:12 -0500

On Tue, Dec 15, 2009 at 12:00 PM, Gregory Collins <[email protected]> wrote:
Bulat Ziganshin <[email protected]> writes:

> now i see what you mean. no, i mean trivial transformation. #650 says
> about slow GC. why it's slow? because once you made any update to the
> array, the entire array is marked as updated and scanned on next minor
> GC (which occurs after every 512 kbytes allocated, afaik). let's
> replace big array (say, of 10,000 elements) with array of 100 arrays
> of 100 elements each. now, between minor GCs only some of arrays will
> be changed and entire amount of memory to be scanned will become less

I actually tried this, and modified Data.HashTable to use a two-tiered
chunked dynamic array as you suggest. In some limited testing it was
only about 5% faster, so I gave up on it -- you get some GC time back
but you also pay a significant indirection penalty by adding an extra
cache line miss for every operation.

This will depend very much on usage patterns and array size. As that array size approaches the size of available memory the difference becomes enormous. Then any single entry mutation forces a linear scan of all of memory. With a two-tier design you only scan a square root of that. I ran benchmarks on a similar hack a year or two ago. Since the difference is asymptotic, you can make it as large of a percentage as you want just by using a bigger hash table/array. I would hazard that your test case wasn't large enough to see the problem at its worst, or your costs were dominated by other factors than GC.

-Edward Kmett
Haskell-Cafe mailing list
[email protected]
<Prev in Thread] Current Thread [Next in Thread>