zfs-discuss@opensolaris.org
[Top] [All Lists]

Re[3]: [zfs-discuss] 120473-05

Subject: Re[3]: [zfs-discuss] 120473-05
From: Robert Milkowski
Date: Mon, 23 Apr 2007 01:41:47 +0200
Hello Robert,

Friday, April 13, 2007, 2:07:11 AM, you wrote:

RM> Hello Enda,

RM> Thursday, April 12, 2007, 2:36:39 PM, you wrote:

EOCSMSI>> Robert Milkowski wrote:
>>> Hello Enda,
>>> 
>>> Wednesday, April 11, 2007, 4:21:35 PM, you wrote:
>>> 
>>> EOCSMSI> Robert Milkowski wrote:
>>>>> Hello zfs-discuss,
>>>>>
>>>>>   In order to get IDR126199-01 I need to install 120473-05 first.
>>>>>   I can get 120473-07 but everything more than -05 is marked as
>>>>>   incompatible with IDR126199-01 so I do not want to force it.
>>>>>
>>>>>   Local Sun's support has problems with getting 120473-05 also so I'm
>>>>>   stuck for now and I would really like to get that IDR running.
>>>>>
>>>>>   Can someone help?
>>>>>
>>> 
>>> EOCSMSI> Hi
>>> EOCSMSI> This patch will be on SunSolve possibly later today, tomorrow at 
>>> latest
>>> EOCSMSI> I suspect as it has only justed being pushed out from testing.
>>> EOCSMSI> I have sent the patch in another mail for now.
>>> 
>>> Thank you patch - it worked (installed) along with IDR properly.
>>> 
>>> However it seems like the problem is not solved by IDR :(
>>> 
EOCSMSI>> Hi Robert
EOCSMSI>> So this IDR has two bugs as fixed
EOCSMSI>> 6458218 assertion failed: ss == NULL
EOCSMSI>> 6495013 Loops and recursion in metaslab_ff_alloc can kill performance,
EOCSMSI>> even on a pool with lots of free data

EOCSMSI>> I have add'ed the IDR's requestors as they can comment, which one of 
the
EOCSMSI>> above fixes was not solved via this IDR in your testing.



RM> I'm talking about 6495013 Loops and recursion in metaslab_ff_alloc can
RM> kill performance.

RM> I've got to test again with dtrace during peak hours if recursion
RM> still happens (on that scale) or maybe it's not. The problem is that
RM> during peak hours I still (again) can see performance problems. At
RM> first glance it looks like this time CPU is probably NOT a problem, so
RM> it would seem as 6495013 is indeed fixed and the problem is somewhere
RM> else. As I send|recv all file systems to another server and than back
RM> to the same server (after re-creating pool) before applying IDR it
RM> solved problem with a performance for some time. Not it again is
RM> getting worse day by day, and the IDR seems not to help. So perhaps
RM> there is a problem with zfs data fragmentation - that's why zfs
RM> sedn|recv helped and 6495013 made it only worse.

RM> ??

RM> Do you have any specific dtrace script in mind to test if 6495013 is
RM> solved or should I use the same as in the case?


bash-3.00# uname -a
SunOS nfs-10-1.srv 5.10 Generic_125100-04 sun4u sparc SUNW,Sun-Fire-V440
bash-3.00# showrev -p|grep IDR
Patch: IDR126199-01 Obsoletes:  Requires: 120473-05 Incompatibles: 120473-06 
Packages: SUNWzfskr
bash-3.00#



Right now this is during off hours and while I see no problems on
clients it seems much slower then similar configs on UFS
(zil_disable=1). There's no much writes.

I'll send later output during peak hours.

For example:

bash-3.00# ptime sync

real     1:29.199
user        0.001
sys         0.032
bash-3.00#


bash-3.00# dtrace -n profile-97'{@[stack()]=count();}' -n 
END'{trunc(@,10);printa(@);}'
dtrace: description 'profile-97' matched 1 probe
dtrace: description 'END' matched 1 probe
^C
CPU     ID                    FUNCTION:NAME
  3      2                             :END

              unix`i_ddi_splhigh+0x14
              unix`disp_getwork+0x38
              unix`idle+0xd4
              unix`thread_start+0x4
              116

              unix`disp_getwork+0x1a8
              unix`idle+0xd4
              unix`thread_start+0x4
              117

              unix`idle+0x12c
              unix`thread_start+0x4
              124

              unix`i_ddi_splx
              unix`disp_getwork+0x160
              unix`idle+0xd4
              unix`thread_start+0x4
              127

              unix`disp_getwork+0x10c
              unix`idle+0xd4
              unix`thread_start+0x4
              128

              zfs`space_map_seg_compare+0x4
              genunix`avl_find+0x2c
              zfs`space_map_add+0x12c
              zfs`space_map_load+0x218
              zfs`metaslab_activate+0x3c
              zfs`metaslab_group_alloc+0x1d4
              zfs`metaslab_alloc_dva+0x114
              zfs`metaslab_alloc+0x2c
              zfs`zio_dva_allocate+0x4c
              zfs`zio_write_compress+0x1ec
              zfs`arc_write+0xe4
              zfs`dbuf_sync+0x6c0
              zfs`dnode_sync+0x35c
              zfs`dmu_objset_sync_dnodes+0x6c
              zfs`dmu_objset_sync+0x54
              zfs`dsl_dataset_sync+0xc
              zfs`dsl_pool_sync+0x64
              zfs`spa_sync+0x1b0
              zfs`txg_sync_thread+0x134
              unix`thread_start+0x4
              128

              unix`disp_getwork+0x90
              unix`idle+0xd4
              unix`thread_start+0x4
              132

              unix`disp_getwork+0x7c
              unix`idle+0xd4
              unix`thread_start+0x4
              139

              zfs`space_map_seg_compare
              genunix`avl_find+0x2c
              zfs`space_map_add+0x12c
              zfs`space_map_load+0x218
              zfs`metaslab_activate+0x3c
              zfs`metaslab_group_alloc+0x1d4
              zfs`metaslab_alloc_dva+0x114
              zfs`metaslab_alloc+0x2c
              zfs`zio_dva_allocate+0x4c
              zfs`zio_write_compress+0x1ec
              zfs`arc_write+0xe4
              zfs`dbuf_sync+0x6c0
              zfs`dnode_sync+0x35c
              zfs`dmu_objset_sync_dnodes+0x6c
              zfs`dmu_objset_sync+0x54
              zfs`dsl_dataset_sync+0xc
              zfs`dsl_pool_sync+0x64
              zfs`spa_sync+0x1b0
              zfs`txg_sync_thread+0x134
              unix`thread_start+0x4
              140

              unix`i_ddi_splx+0x1c
              unix`disp_getwork+0x160
              unix`idle+0xd4
              unix`thread_start+0x4
              318


bash-3.00#


bash-3.00# ./avl_impact_on_zfs.d
^C

  Loops in metaslab_ff_alloc
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@                  2249
               1 |@@@@@                                    485
               2 |@@@                                      334
               4 |@@                                       241
               8 |@@                                       170
              16 |@                                        125
              32 |@                                        76
              64 |@                                        61
             128 |                                         36
             256 |                                         19
             512 |                                         15
            1024 |                                         8
            2048 |                                         8
            4096 |                                         5
            8192 |                                         6
           16384 |                                         9
           32768 |                                         3
           65536 |                                         5
          131072 |                                         5
          262144 |                                         6
          524288 |                                         0

  Time per avl_walk in ns
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |                                         138
               1 |                                         0
               2 |                                         0
               4 |                                         0
               8 |                                         0
              16 |                                         0
              32 |                                         0
              64 |@@@@@@@@@@@@@@@@                         1523563
             128 |@@@@@                                    527718
             256 |@@@@@@@@@                                860982
             512 |@@@@@@@                                  696676
            1024 |@@@                                      254440
            2048 |                                         25446
            4096 |                                         643
            8192 |                                         603
           16384 |                                         1368
           32768 |                                         292
           65536 |                                         23
          131072 |                                         0


  Loops in avl_find
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |                                         86
               1 |                                         188
               2 |                                         1100
               4 |                                         2178
               8 |@                                        5261
              16 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  288941
              32 |                                         0


  Time per avl_find in ns
           value  ------------- Distribution ------------- count
              64 |                                         0
             128 |                                         27
             256 |                                         104
             512 |                                         591
            1024 |@                                        5094
            2048 |@@@@@@@@@@@@@@@@@@@                      142584
            4096 |@@@@@@@@@@@@@@@@@@@@                     146899
            8192 |                                         1277
           16384 |                                         685
           32768 |                                         480
           65536 |                                         12
          131072 |                                         0
          262144 |                                         0
          524288 |                                         1
         1048576 |                                         0


  Time per avl_insert in ns
           value  ------------- Distribution ------------- count
              32 |                                         0
              64 |@@@@@@@@@@@@@                            727720
             128 |@@@@@@@@@@@@@@@@@                        906701
             256 |@@@@@@@@@@                               556764
             512 |                                         1213
            1024 |                                         122
            2048 |                                         122
            4096 |                                         26
            8192 |                                         40
           16384 |                                         257
           32768 |                                         73
           65536 |                                         5
          131072 |                                         0


  Loops in metaslab_ff_alloc
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@                  2249
               1 |@@@@@                                    485
               2 |@@@                                      334
               4 |@@                                       241
               8 |@@                                       170
              16 |@                                        125
              32 |@                                        76
              64 |@                                        61
             128 |                                         36
             256 |                                         19
             512 |                                         15
            1024 |                                         8
            2048 |                                         8
            4096 |                                         5
            8192 |                                         6
           16384 |                                         9
           32768 |                                         3
           65536 |                                         5
          131072 |                                         5
          262144 |                                         6
          524288 |                                         0

  Time per avl_walk in ns
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |                                         138
               1 |                                         0
               2 |                                         0
               4 |                                         0
               8 |                                         0
              16 |                                         0
              32 |                                         0
              64 |@@@@@@@@@@@@@@@@                         1523563
             128 |@@@@@                                    527718
             256 |@@@@@@@@@                                860982
             512 |@@@@@@@                                  696676
            1024 |@@@                                      254440
            2048 |                                         25446
            4096 |                                         643
            8192 |                                         603
           16384 |                                         1368
           32768 |                                         292
           65536 |                                         23
          131072 |                                         0

        Time spent in avl_find          : 1284633600 ns
        Time spent in avl_insert        : 445301000 ns
        Time spent in avl_walk          : 1620582500 ns
        Elapsed time                    : 65060492600 ns









#### avl script

bash-3.00# cat avl_impact_on_zfs.d
#!/usr/sbin/dtrace -s

#pragma D option quiet

BEGIN
{
        self->in_metaslab = 0;
        self->in_space_map_add = 0;
        self->in_space_map_remove = 0;
        Start = timestamp;
        TotalTimeIn_avl_walk = 0;
        TotalTimeIn_avl_find = 0;
        TotalTimeIn_avl_insert = 0;
}
fbt::metaslab_ff_alloc:entry
/self->in_metaslab == 0/
{
        self->in_metaslab = 1;
        self->loopcount = 0;
        self->metaslab_alloc_start = timestamp;
}
fbt::space_map_add:entry
/self->in_space_map_add == 0/
{
        self->in_space_map_add = 1;
}

fbt::avl_walk:entry
/self->in_metaslab/
{
        self->loopcount++;
        self->avl_walk_start_time = vtimestamp;
}

fbt::avl_walk:return
/self->in_metaslab/
{
        @avl_walk_time["Time per avl_walk in ns"] = quantize(vtimestamp - 
self->avl_walk_start_time);
        TotalTimeIn_avl_walk += vtimestamp - self->avl_walk_start_time;
}

fbt::avl_insert:entry
/self->in_space_map_remove/
{
        self->avl_insert_start_time = vtimestamp;
}

fbt::avl_insert:return
/self->in_space_map_remove/
{
        @avl_insert_time["Time per avl_insert in ns"] = quantize(vtimestamp - 
self->avl_insert_start_time);
        TotalTimeIn_avl_insert += vtimestamp - self->avl_insert_start_time;
}


fbt::avl_find:entry
/self->in_space_map_add/
{
        self->loops_in_avl_find = 0;
        self->avl_find_start_time = vtimestamp;
}
fbt::space_map_seg_compare:entry
/self->in_space_map_add/
{
        self->loops_in_avl_find++;
}

fbt::avl_find:return
/self->in_space_map_add/
{
        @avl_find_loops["Loops in avl_find"] = 
quantize(self->loops_in_avl_find);
        @avl_find_time["Time per avl_find in ns"] = quantize(vtimestamp - 
self->avl_find_start_time);
        TotalTimeIn_avl_find += vtimestamp - self->avl_find_start_time;
}
fbt::space_map_add:return
/self->in_space_map_add/
{
        self->in_space_map_add = 0;
}

fbt::metaslab_ff_alloc:return
/self->in_metaslab/
{
        self->in_metaslab = 0;
        @loops["Loops in metaslab_ff_alloc"] = quantize(self->loopcount);
        self->loopcount = 0;
}

fbt::space_map_remove:entry
/self->in_space_map_remove == 0/
{
        self->in_space_map_remove = 1;
}

fbt::space_map_remove:return
/self->in_space_map_remove/
{
        self->in_space_map_remove = 0;
}

END {
        printa(@loops);
        printa(@avl_find_loops);
        printa(@avl_find_time);
        printa(@avl_insert_time);
        printa(@avl_walk_time);
        printf ("\tTime spent in avl_find\t\t: %d ns\n", TotalTimeIn_avl_find);
        printf ("\tTime spent in avl_insert\t: %d ns\n", 
TotalTimeIn_avl_insert);
        printf ("\tTime spent in avl_walk\t\t: %d ns\n", TotalTimeIn_avl_walk);
        printf ("\tElapsed time\t\t\t: %d ns\n",timestamp - Start);
}


-- 
Best regards,
 Robert                            mailto:rmilkowski@xxxxxxxxxxx
                                       http://milek.blogspot.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@xxxxxxxxxxxxxxx
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

<Prev in Thread] Current Thread [Next in Thread>