On Tue, 06 May 2008 23:37:52 -0700, Thomas Rasmussen wrote:
> Hi
>
> I have a cfengine server on SuSE Linux Enterprise Server 10.1 but
> whenever I start up cfenvd it immediately crashes. I tried to start it
> with strace and option -F, and it produced the following (I've snipped
> the first couple of hundred lines of output).
>
> Any hints on what could be wrong? It is cfengine 2.2.1, and the other
> daemons work fine and cfengine seems to work, except this daemon.
>
> Regards
> Thomas
>
> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7b5b000
> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7b5a000
> set_thread_area(0xffd7e758) = 0
> mprotect(0xf7c9a000, 8192, PROT_READ) = 0
> mprotect(0xf7f27000, 4096, PROT_READ) = 0
> munmap(0xf7f2b000, 84005) = 0
> set_tid_address(0xf7b5a6f8) = 29843
> rt_sigaction(SIGRTMIN, {0x4f7ccb760, [], 0}, NULL, 8) = 0
> rt_sigaction(SIGRT_1, {0x10000004f7ccb670, [], 0}, NULL, 8) = 0
> rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
> getrlimit(RLIMIT_STACK, {rlim_cur=-4286578688, rlim_max=73014444031}) = 0
> _sysctl({0x2f7cd5624, -2627504, (nil), (nil), (nil), 18435459034116886516}) = 0
> umask(077 <unfinished ...>
> --- SIGCHLD (Child exited) @ 0 (0) ---
> <... umask resumed> ) = -1 ENOSYS (Function not implemented)
> --- SIGCHLD (Child exited) @ 0 (0) ---
> rt_sigaction(SIGCHLD, {0x1000000000000000, [], 0}, <unfinished ...>
> --- SIGCHLD (Child exited) @ 0 (0) ---
> <... rt_sigaction resumed> {0x1000000000000000, [TRAP ABRT FPE USR1 SEGV ALRM STKFLT CHLD STOP TTIN TTOU XCPU XFSZ VTALRM PROF IO PWR SYS RTMIN], 0}, 8) = -1 ENOSYS (Function not implemented)
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> fstat64(0x1, 0xffd799d8cfenvd: Received signal 11 (SIGSEGV) while doing [lock.db.localhost.cfenvd.daemon_2743]
> cfenvd: Logical start time Wed May 7 08:34:11 2008
> cfenvd: This sub-task started really at Wed May 7
> 08:34:11 2008
Either delete or rename the lock.db file, or use the Berkeley DB utilities
to repair it. I find that tracing only stat and open calls is useful, as it
eliminates a lot of output:
strace -e open,stat ...
This will show which files are being opened. Chances are the db is
corrupt, which results in the seg fault. If you can get a core file, you
can use gdb to get a backtrace, which will show where in the code cfenvd
was when it crashed.
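
As a rough sketch of the two suggestions above (moving the lock database aside, and narrowing strace output to file-related calls), something like the following could be used. The function names move_aside and failed_file_calls are made up for illustration and are not part of cfengine; the path to lock.db is left as an argument because its location depends on the installation, and the second helper assumes the strace output was saved to a file first.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of cfengine.

# Rename a suspect lock database instead of deleting it, so the old file
# is kept (with a timestamp suffix) in case it is needed for inspection.
# Prints the new name on success.
move_aside() {
    db=$1
    [ -f "$db" ] || { echo "no such file: $db" >&2; return 1; }
    backup="$db.corrupt.$(date +%Y%m%d%H%M%S)"
    mv -- "$db" "$backup" && echo "$backup"
}

# From a saved strace log, print only the open/stat/stat64 calls that
# failed (return value -1), which narrows the output further.
failed_file_calls() {
    grep -E '^(open|stat|stat64)\(.*\) += +-1 ' "$1"
}
```

With cfenvd stopped, move_aside would be called with the full path to lock.db on that host. For the second helper, strace's -o option can write the trace to a file, which can then be passed to failed_file_calls.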
regards,
Frank Ranner