On Tue, 29 Jul 2008 08:59:06 +0200, "Peter Krefting"
<peter@xxxxxxxxxxxxxxxx> wrote:
> self <me@xxxxxxxxxxx>:
>
> > But the inability to continue to have my cache archived and indexed by
> > Copernic or Google is a real problem.
>
> Well, in my opinion, if the indexing software relies on the file extension
> to identify the file format, then it is the indexing software that needs
> to be changed. An HTML file is an HTML file, no matter whether it has a
> ".html" extension or not.

But the application shouldn't have to rely on content-sniffing to
work out the type of the data. There's a damned good reason why many
Internet protocols include a content-type header or some variant
thereof: they need an out-of-band way of indicating what the data
represents. A file can't have a content-type header, so the file
extension is used instead.
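
As a rough illustration (Python's standard mimetypes module; the file
names and the helper are made up for the example), the extension does
the job of the content-type header, and once it is gone there is nothing
left to go on short of opening the file:

```python
import mimetypes


def type_from_name(name):
    """Return the MIME type implied by the file name alone, or None."""
    content_type, _encoding = mimetypes.guess_type(name)
    return content_type


# Hypothetical cache file names: one keeps its extension, one does not.
for name in ("opr00A1.html", "opr00A1"):
    print(name, "->", type_from_name(name))
```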

Some binary data formats have in-band content type information, but
for text files that just doesn't work. Content-sniffing is far too
unreliable, and MIME, HTTP and many other standards avoid it because
it's a horrendously stupid and broken idea. It's
also incredibly hypocritical of Opera, having spent so much time in
the past (rightly) berating IE for content-sniffing, to recommend it
now in an attempt to deride objections to an unpopular change.
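
To show the sort of thing that goes wrong, here is a deliberately naive
sniffer. It is only a sketch and is not the algorithm used by any real
browser or indexer; the function name and the sample data are invented
for the example. It gets both inputs wrong:

```python
def naive_sniff(data: bytes) -> str:
    """Deliberately naive sniffer: guess a type from the first bytes only."""
    head = data[:512].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return "text/html"
    return "text/plain"


# A plain-text note that happens to begin with the markup it describes.
note = b"<html> is the root element of every HTML document.\n"
# An HTML fragment with no doctype and no <html> tag.
fragment = b"<p>Hello, <b>world</b></p>\n"

print(naive_sniff(note))      # text/html, although it was meant as plain text
print(naive_sniff(fragment))  # text/plain, although it is HTML
```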

> But the cache index file is there, and it has had the same format since
> Opera 4.0. That means that the authors of the indexing software you are
> using have had eight years to implement support for reading the cache
> index file, and not having to rely on the actual naming of the files in
> the cache (I would assume that it already does that for the Internet
> Explorer cache).

I tried that, and found that when Opera is running the cache index
doesn't always seem to be up to date. I wanted to trap some cached
data from a site I was visiting, and although the cache files were
present there was no mention of them in the index. Infrequent writes
to the cache index give better performance than syncing on every
change, but it does impair the use of the index for accessing the
cache.
--
Matthew Winn
[If replying by mail remove the "r" from "urk"]