gnu.cfengine.help

Re: cfenvd crashes on startup

Subject: Re: cfenvd crashes on startup
From: Frank Ranner <franner@xxxxxxxxxxxxxxxxxxxx>
Date: 31 May 2008 03:26:25 GMT
Newsgroups: gnu.cfengine.help

On Tue, 06 May 2008 23:37:52 -0700, Thomas Rasmussen wrote:

> Hi
> 
> I have a cfengine server on SuSE Linux Enterprise Server 10.1, but
> whenever I start cfenvd it crashes immediately. I tried starting it
> with strace and option -F, and it produced the output below (I've
> snipped the first couple of hundred lines). Any hints on what could be
> wrong? This is cfengine 2.2.1; the other daemons work fine and
> cfengine seems to work, except for this daemon.
>
> Regards, Thomas
> 
> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7b5b000
> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7b5a000
> set_thread_area(0xffd7e758)             = 0
> mprotect(0xf7c9a000, 8192, PROT_READ)   = 0
> mprotect(0xf7f27000, 4096, PROT_READ)   = 0
> munmap(0xf7f2b000, 84005)               = 0
> set_tid_address(0xf7b5a6f8)             = 29843
> rt_sigaction(SIGRTMIN, {0x4f7ccb760, [], 0}, NULL, 8) = 0
> rt_sigaction(SIGRT_1, {0x10000004f7ccb670, [], 0}, NULL, 8) = 0
> rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
> getrlimit(RLIMIT_STACK, {rlim_cur=-4286578688, rlim_max=73014444031}) = 0
> _sysctl({0x2f7cd5624, -2627504, (nil), (nil), (nil), 18435459034116886516}) = 0
> umask(077 <unfinished ...>
> --- SIGCHLD (Child exited) @ 0 (0) ---
> <... umask resumed> )                   = -1 ENOSYS (Function not implemented)
> --- SIGCHLD (Child exited) @ 0 (0) ---
> rt_sigaction(SIGCHLD, {0x1000000000000000, [], 0},  <unfinished ...>
> --- SIGCHLD (Child exited) @ 0 (0) ---
> <... rt_sigaction resumed> {0x1000000000000000, [TRAP ABRT FPE USR1 SEGV ALRM STKFLT CHLD STOP TTIN TTOU XCPU XFSZ VTALRM PROF IO PWR SYS RTMIN], 0}, 8) = -1 ENOSYS (Function not implemented)
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> fstat64(0x1, 0xffd799d8cfenvd: Received signal 11 (SIGSEGV) while doing [lock.db.localhost.cfenvd.daemon_2743]
> cfenvd: Logical start time Wed May  7 08:34:11 2008
> cfenvd: This sub-task started really at Wed May  7 08:34:11 2008

Either delete or rename the lock.db file, or use the Berkeley DB 
utilities to repair it. I find it useful to trace only the open and stat 
calls, as that eliminates a lot of output:

strace -e open,stat ...

This will show which files are being opened. Chances are the db is 
corrupt, which results in the seg fault. If you can get a core file, you 
can use gdb to get a backtrace, which will show what area cfenvd was in 
when it crashed.
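
As a rough sketch of the steps above: the function names here are just 
made up for illustration, and the paths are whatever applies on your 
system. db_verify is the Berkeley DB verification utility; I am assuming 
it is installed under that name, but some distributions ship it under a 
versioned name. Renaming rather than deleting keeps the suspect file 
around for later inspection.

```shell
#!/bin/sh
# Hypothetical helper functions; nothing runs until you call them.

# Move a suspect database aside (rather than deleting it) and print
# the new name. Fails if the file does not exist.
set_aside_db() {
    db=$1
    [ -f "$db" ] || { echo "no such file: $db" >&2; return 1; }
    backup="$db.corrupt.$(date +%Y%m%d%H%M%S)"
    mv -- "$db" "$backup" && echo "$backup"
}

# Check a Berkeley DB file with db_verify, if that tool is installed.
# Returns 2 when the tool is missing (assumption: it is named db_verify).
check_db() {
    db=$1
    if command -v db_verify >/dev/null 2>&1; then
        db_verify "$db"
    else
        echo "db_verify not found; skipping check" >&2
        return 2
    fi
}

# Print a backtrace from a core file using gdb in batch mode.
# $1 = path to the binary that crashed, $2 = path to the core file.
backtrace_from_core() {
    gdb -batch -ex bt "$1" "$2"
}
```

To get a core file in the first place, core dumps usually have to be 
enabled in the shell that starts the daemon (ulimit -c unlimited) before 
reproducing the crash. After that, something like 
backtrace_from_core /path/to/cfenvd /path/to/core should print the stack.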

regards,
Frank Ranner
