Failover Server Candidates

If a server crashes suddenly, VOV can start a replacement server on a pre-selected host. This requires that the pre-selected host be configured as a failover server.

The configuration instructions follow.
Note: The vovserverdir command only works from a VOV-enabled shell when the project server is running.
  1. Edit or create the file servercandidates.tcl in the server configuration directory. Use the vovserverdir command with the -p option to find the pathname to this file.
    % vovserverdir -p servercandidates.tcl
    /home/john/vov/myProject.swd/servercandidates.tcl
    The servercandidates.tcl file should set the Tcl variable ServerCandidates to a list of possible failover hosts. This list may include the original host on which the server was started.
    set ServerCandidates {
        host1
        host2
        host3
    }
  2. Install the autostart/failover.csh script as follows:
    % cd `vovserverdir -p .`
    % mkdir autostart
    % cp $VOVDIR/etc/autostart/failover.csh autostart/failover.csh
    % chmod a+x autostart/failover.csh
  3. Activate the failover facility by running vovautostart.
    % vovautostart
    Listing the taskers afterwards shows the result. In this example, the last entry is the failover tasker:
    % vovtaskermgr show -taskergroups
    ID         taskername   hostname  taskergroup
    000404374  localhost-2  titanus   g1
    000404375  localhost-1  titanus   g1
    000404376  localhost-5  titanus   g1
    000404377  localhost-3  titanus   g1
    000404378  localhost-4  titanus   g1
    000404391  failover     titanus   failover
    Note: Each machine listed as a server candidate must be a vovtasker machine; the vovtasker running on that machine acts as its agent in selecting a new server host. Taskers can be configured as dedicated failover candidates, which are not allowed to run jobs, by using the -failover option in the tasker definition.

    Preventing jobs from running on the candidate machine eliminates the risk that demanding jobs affect the machine's stability. The -failover option also enables some failover configuration validation checks. Finally, failover taskers are started before the regular queue taskers, which helps ensure that a failover tasker is available as soon as possible for future failover events.

    Refer to the tasker definition documentation for details on the -failover option.
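
Steps 1 and 2 above can be sketched as two small shell helpers. This is a minimal sketch, not part of VOV: the function names list_candidates and install_failover_script are hypothetical, the code is written for a POSIX sh rather than the csh used in the transcripts above, and list_candidates assumes the simple file layout shown in step 1 (opening brace on the "set ServerCandidates" line, closing brace on its own line). It does not evaluate Tcl, so a file that builds the list in any other way is not handled.

```shell
#!/bin/sh
# Hypothetical helpers illustrating steps 1 and 2; they are not VOV commands.

# Print the host names found in a servercandidates.tcl file, one per line.
# Assumption: the file uses the layout shown in step 1. Lines starting
# with '#' inside the braces are treated as comments and skipped.
list_candidates() {
    awk '
        /^[ \t]*set[ \t]+ServerCandidates[ \t]*[{]/ { inside = 1; next }
        inside && /^[ \t]*[}]/                      { inside = 0 }
        inside && NF > 0 && $1 !~ /^#/ {
            for (i = 1; i <= NF; i++) print $i
        }
    ' "$1"
}

# Copy failover.csh into the autostart directory and make it executable.
#   $1: directory holding the stock script (for example $VOVDIR/etc/autostart)
#   $2: server configuration directory (for example the output of
#       "vovserverdir -p .")
# Uses "mkdir -p" so that rerunning the function is harmless.
install_failover_script() {
    src_dir=$1
    config_dir=$2
    mkdir -p "$config_dir/autostart" || return 1
    cp "$src_dir/failover.csh" "$config_dir/autostart/failover.csh" || return 1
    chmod a+x "$config_dir/autostart/failover.csh"
}
```

From a VOV-enabled sh-compatible shell, with the project server running, the helpers might be called as install_failover_script "$VOVDIR/etc/autostart" "$(vovserverdir -p .)" and list_candidates "$(vovserverdir -p servercandidates.tcl)". Passing the directories as arguments keeps the functions independent of VOV, so they can be tried on scratch directories first.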