On Thu, Sep 11, 2008 David Laight wrote:
That sucks big-time.
It makes me think even more that UTF-8 is completely inappropriate
for a system-wide locale on any unix system.
Clearly some documents and strings can be in UTF-8, but that has to
be a known property of the string. It isn't appropriate that
any string a program obtains can be assumed to be UTF-8.
But at least, we could make the UTF-8 encoding explicit by including
the BOM (byte order mark) at the beginning of such a file. It is the
byte sequence 0xEF 0xBB 0xBF.
Vim has support for automatically handling it, see e.g.
http://www.nabble.com/utf8-BOM-td16427974.html
IMO UTF-8 should not be the default encoding (in the absence
of an explicit marker); we had better stay with latin1.
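
To make that policy concrete, here is a rough sketch in Python. It treats
a leading 0xEF 0xBB 0xBF as the explicit UTF-8 marker and falls back to
latin1 otherwise. The function name decode_with_bom_policy is made up for
illustration, and this is only one possible reading of the rule, not how
Vim or any particular tool implements it.

```python
import codecs
from typing import Tuple


def decode_with_bom_policy(data: bytes) -> Tuple[str, str]:
    """Decode raw bytes, returning (text, encoding_name).

    Hypothetical helper: UTF-8 is used only when the data starts with
    the UTF-8 BOM; everything else is treated as latin1.
    """
    bom = codecs.BOM_UTF8  # the byte sequence 0xEF 0xBB 0xBF
    if data.startswith(bom):
        # Explicit marker present: strip it and decode the rest as UTF-8.
        return data[len(bom):].decode("utf-8"), "utf-8"
    # No marker: assume latin1, which maps every byte to a character
    # and therefore never raises a decoding error.
    return data.decode("latin-1"), "latin-1"
```

Note that the latin1 branch never fails, so an unmarked UTF-8 file would
be decoded silently into the wrong characters rather than rejected.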
Joachim