Troubleshooting a SICK Tasker
The vovserver marks a vovtasker SICK when it has not received the vovtasker's heartbeat message for three consecutive update cycles.
Possible causes include:
- The machine has crashed
- The machine was disconnected from the network
- The top-level vovtaskerroot process has crashed or was killed
Since there is no single solution to this problem, here is a short debugging guide.
- Is the vovtasker SICK?
  If so, stop it and then start it again:
  % nc cmd vovtaskermgr stop name-of-SICK-vovtasker
  % nc cmd vovtaskermgr start name-of-SICK-vovtasker
  Run the stop command first; otherwise, the vovtasker will not start.
- Is the machine running?
  - No: the machine is down or cut off from the network. Contact IT.
  - Yes: continue
- Is the vovtasker or vovtaskerroot process stuck?
  - No: continue
  - Yes: the output of strace and pstack often helps diagnose the problem (for example, a bad NFS mount or an unresponsive LDAP server).
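The checks in this guide can be sketched as a few shell helpers. This is only an illustration, not a supported tool: the function names, the DRY_RUN variable, the 5-second trace window, and the assumption that the processes are literally named vovtasker and vovtaskerroot are all hypothetical. Only the two nc cmd vovtaskermgr commands come from the guide itself.

```shell
#!/bin/sh
# Illustrative helpers for the checklist; all function names are hypothetical.
# "nc" here is assumed to be the product's command, not netcat.

# Step 1: stop, then start, a SICK vovtasker.
# With DRY_RUN=1 the commands are printed instead of executed.
restart_tasker() {
    tasker_name=$1
    for action in stop start; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "nc cmd vovtaskermgr $action $tasker_name"
        else
            nc cmd vovtaskermgr "$action" "$tasker_name"
        fi
    done
}

# Step 2: rough reachability check. A ping reply only shows that the host
# answers on the network, not that it is otherwise healthy.
host_is_up() {
    ping -c 1 "$1" >/dev/null 2>&1
}

# Step 3: run on the tasker host. Prints a stack dump and a short system-call
# trace for each matching process. Assumes pstack, strace and timeout are
# installed and that you have permission to attach to the processes.
inspect_tasker_processes() {
    for pid in $(pgrep -x vovtasker; pgrep -x vovtaskerroot); do
        echo "== PID $pid =="
        pstack "$pid"
        # strace -p attaches until interrupted, so limit it to 5 seconds.
        timeout 5 strace -p "$pid"
    done
}
```

For example, after sourcing the file, `DRY_RUN=1 restart_tasker name-of-SICK-vovtasker` prints the two commands without running them. A process that sits in the same system call for the whole trace window (for instance a read on an NFS path) is a hint about what it is waiting on.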
If you cannot determine what is holding up the vovtasker, submit a support request at Altair Community for assistance.