
Re: blocksizes

Subject: Re: blocksizes
From: Michael van Elst
Date: Mon, 25 Jan 2010 14:04:01 UTC
Newsgroups: fa.netbsd.tech.kern

tsutsui@xxxxxxxxxxxxxxx (Izumi Tsutsui) writes:

>> cd(4) uses DEV_BSIZE units to address blocks.

>I meant size parameters in struct disklabel, like
>d_secperunit and d_partitions[].p_size etc.

Yes, the disklabel uses physical sector sizes.


>> >Isn't it one example of inconsistent hacks?
>> 
>> No, this is a place where this is used consistently. Maybe
>> it becomes clear when treating the disk address not like
>> a way to access hardware but a way to specify the data
>> layout.
>> While these are related, it is not a 1:1 relationship.

>Consistent per what?

It consistently uses the same units to address the medium.


>I don't think current implementation was written per
>consistent design. It might look consistent, but
>there are many possible botches so we have to check
>all use of DEV_BSIZE (and lp->d_secsize, fsbtodb etc.)
>whether they should be hardware blocksize or logical one.

Actually, using DEV_BSIZE units in the drivers avoids
having to look at these details in higher layers. The higher
layers become almost agnostic of these details. That
is the major advantage of having DEV_BSIZE (or byte)
units to address the disk.
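
To illustrate the point, here is a minimal sketch of the translation a driver would do internally when it is addressed in DEV_BSIZE units. The names SKETCH_DEV_BSIZE, struct native_addr and devblk_to_native() are made up for this example and are not taken from the NetBSD sources; the sketch also assumes the physical sector size is a multiple of 512 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 512 bytes, mirroring DEV_BSIZE; redefined so the sketch
 * stands alone. */
#define SKETCH_DEV_BSIZE 512u

/* Hypothetical result type: native sector plus byte offset inside it. */
struct native_addr {
	uint64_t sector;	/* sector number in physical-sector units */
	uint32_t offset;	/* byte offset within that sector */
};

/*
 * Hypothetical helper: map a block number given in DEV_BSIZE units to
 * the native sector that contains it. Only the driver needs to know
 * secsize; callers keep using DEV_BSIZE block numbers.
 */
static struct native_addr
devblk_to_native(uint64_t blkno, uint32_t secsize)
{
	struct native_addr a;
	uint64_t byteoff;

	/* Sketch-only restriction: sector size is a multiple of 512. */
	assert(secsize >= SKETCH_DEV_BSIZE && secsize % SKETCH_DEV_BSIZE == 0);

	byteoff = blkno * SKETCH_DEV_BSIZE;
	a.sector = byteoff / secsize;
	a.offset = (uint32_t)(byteoff % secsize);
	return a;
}
```

With a 2048-byte sector, block 17 lands in native sector 4 at byte offset 512; with a 512-byte sector the block number and the sector number are the same. The caller's view does not change in either case.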



>All ufs code uses d_secsize to read superblock, for example.

It does not. This is a bug that hasn't been fixed yet.
There are more such problems in the userland tools that
originate from changing how disks are addressed 16 years
ago.



>> >There is no "right" solution. We can fix the hack with hacks,
>> >or we can also redesign it. Someone[tm] should make a decision.
>> >That's all.
>> 
>> Yesterday's redesign is today's hack. By choosing words you
>> already say what you consider a "wrong" solution.
>> 
>> So far I have heard about three models.
>> 
>> - use DEV_BSIZE addressing (that's what we have now, no changes)

>s/no changes/less diffs/
>In this case, we still have to use physical block size in some more
>drivers, and we always have to consider which logical or physical one
>should be used.

There is no change to the model compared to now. Higher layers do not
need to distinguish between logical and physical blocks because there
are only logical blocks. This also helps when you deal with images
that have no physical block size.

We only have to fix those places that were not adapted
years ago.



>> - use byte addressing (almost the same, just a few bits more, requires
>>   minor changes to device drivers and other code).
>> - use native block addressing (what we left in 1994, requires significant
>>   changes to device drivers and minor changes to other code).

>In the last case, we could have a common API to get physical block size
>for optimized min xfer size as side effect?

There are multiple common APIs to get physical block size. Mainly:

- disklabel (supported by all drivers)
- wedges and "struct disk" (the dk intermediate driver)

And yes, it could really use some cleanup.



>> To me, none of this has significant advantages.

>(BTW, did you take a look at PRs 3790-3792?)

I did.


>I still think there is some tradeoff among them
>(less diffs, or no logical vs physical confusion).

There is obviously a difference in how much work and
testing each requires. I don't see any advantage in
the result. For example, using native block sizes
fits better with the old FFS code; however, dealing
with FFS images still requires an extra transformation,
and you cannot just copy FFS images between devices
of different block sizes.
Using DEV_BSIZE units, on the other hand, simplifies
higher layers, and accessing images makes no difference.
In return, you may need some extra code to
handle compatibility.

Lots of tiny pro and con arguments, but nothing
that would strongly favor one of the solutions.


>Probably we should ask Core.

That's what core is meant for :)



>FYI, here is a current result of 2KB/sector MO disk:

I have some local changes to address all of this, in particular
for a GPT (non-disklabel) disk. The GPT code also has a related
bug of its own, so I verified this against MacOS.

So far I had one remaining question: the fsbtodb shift value.

With a DEV_BSIZE-addressed device you need a shift value
that is unrelated to the physical sector size and does not
change whether you talk to a device or access an image file.

However, this makes the superblock incompatible with, say, FreeBSD,
which addresses the disk natively.
If we keep fsbtodb compatible then we need to adapt the
fsbtodb translation.

This is really the only place where native addressing
helps FFS, but only FFS. However, storing the physical
sector size in the filesystem (instead of using information
from the driver) is what I consider questionable.
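
The fsbtodb question can be sketched as follows. This is only an illustration under stated assumptions, not the actual FFS code: the helper names are hypothetical, sizes are assumed to be powers of two, and the fragment size is assumed to be at least the sector size. It shows the shift under each addressing model, and one possible adapted translation that keeps a native-sector shift in the superblock while still handing DEV_BSIZE block numbers to the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 512 bytes, mirroring DEV_BSIZE. */
#define SKETCH_DEV_BSIZE 512u

/* log2 of a power of two (hypothetical helper for this sketch). */
static unsigned
sketch_log2(uint32_t v)
{
	unsigned n = 0;

	assert(v != 0 && (v & (v - 1)) == 0);
	while (v >>= 1)
		n++;
	return n;
}

/* Shift if the stored value counts DEV_BSIZE units: independent of
 * the device's sector size. */
static unsigned
shift_devbsize(uint32_t fsize)
{
	return sketch_log2(fsize / SKETCH_DEV_BSIZE);
}

/* Shift if the stored value counts native sectors: depends on secsize. */
static unsigned
shift_native(uint32_t fsize, uint32_t secsize)
{
	return sketch_log2(fsize / secsize);
}

/*
 * Adapted translation: the superblock keeps the native shift, and the
 * sector size obtained from the driver is applied on top to produce a
 * DEV_BSIZE block number.
 */
static uint64_t
fsb_to_devblk(uint64_t fsb, unsigned native_shift, uint32_t secsize)
{
	return (fsb << native_shift) << sketch_log2(secsize / SKETCH_DEV_BSIZE);
}
```

For a 2 KB fragment on a 2 KB/sector disk the native shift is 0, while the DEV_BSIZE shift is 2. The adapted translation yields the same DEV_BSIZE block number as shifting by the DEV_BSIZE value directly, but it needs the sector size from the driver each time.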

-- 
                                Michael van Elst
Internet: mlelstv@xxxxxxxxxx
                                "A potential Snark may lurk in every tree."
