Jed Brown <jed <at> 59A2.org> writes:
> Uh, ByteString is Unicode-agnostic. ByteString.Char8 is not. So why not do
> IO with lazy ByteString and parse into your own representation (which might
> look a lot like StorableVector)?
One problem you might run into doing it this way is a wide character being
split between two different arrays. In that case you have to do some
post-processing to put the pieces back together. It would be more efficient,
I think, if you could force a given alignment when reading in the lazy
bytestring. But there's no way to do that, is there?
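
To make the post-processing step concrete, here is a minimal sketch, not code from this thread. It assumes fixed-width elements of n bytes (a fixed-width encoding, or numeric data), so it does not handle variable-width encodings such as UTF-8. The name realign is hypothetical; only toChunks, append, splitAt and friends come from the bytestring package. It carries the trailing partial element of each chunk over to the next one, so that no element straddles a chunk boundary.

```haskell
import qualified Data.ByteString      as B
import qualified Data.ByteString.Lazy as L

-- | Hypothetical helper: re-chunk a lazy ByteString so that every chunk,
-- except possibly the last, has a length that is a multiple of @n@ bytes.
-- Assumes @n > 0@.
realign :: Int -> L.ByteString -> [B.ByteString]
realign n = go B.empty . L.toChunks
  where
    -- End of input: emit whatever is left over (a trailing partial element).
    go carry []
      | B.null carry = []
      | otherwise    = [carry]
    -- Prepend the bytes carried over from the previous chunk, then cut the
    -- buffer at the last multiple of n and carry the remainder forward.
    go carry (c:cs)
      | B.null whole = go rest cs
      | otherwise    = whole : go rest cs
      where
        buf           = carry `B.append` c
        keep          = B.length buf - B.length buf `mod` n
        (whole, rest) = B.splitAt keep buf
```

For example, with n = 4, chunks of 5 and 4 bytes come out as chunks of 4, 4 and 1 bytes. Note that B.append copies, so every input chunk is copied once; that is the cost a forced alignment at read time would avoid.
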
I hope this makes sense. It's the problem I ran into when I tried once to use
lazy bytestrings instead of a storable vector, reasoning that the more recent
fusion work in bytestring would give a speed boost. But then I was doing
numerical stuff, and I don't know much about Unicode.
Chad
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@xxxxxxxxxxx
http://www.haskell.org/mailman/listinfo/haskell-cafe