Multiple Queues

Queue Name and Host Considerations

When setting up multiple Accelerator queues, there are factors to consider:
  • Of course, each host that runs a vovserver to manage an Accelerator queue needs access to the Altair Accelerator software. In most cases, this is done by automounting the software from a file server, but a host-local install may also be used.
  • The name of the queue must begin with the letters 'vnc'. This indicates that it is an Accelerator vovserver, so that it will check out the correct license feature 'server_nc'.

    It is helpful if the remainder of the queue name encodes something that identifies the queue. For example, 'vncsj' could represent a server running in San Jose, CA.

    Queue names must also be unique among the queues running on the same, or a replicated, Altair Accelerator software hierarchy. Two queues that run on the same Altair Accelerator software installation may not have the same name, even if you try to start them on different hosts.

    An Accelerator queue is a specialized case of a VOV 'project', and the ncmgr command calls the vovproject command to start the vovserver that manages the queue. The latter command uses the Altair Accelerator registry to store data about all the known projects.

  • The host and TCP/IP port combination must be unique for all queues. The default is to compute the port number by hashing the queue name into the range 6200-6455. You may specify the port number using the -port option when first starting the queue.
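
The exact hash Accelerator applies is internal; the following is only an illustrative sketch of the idea, showing how a queue name can map deterministically into a 256-port range such as 6200-6455. The function name queue_port is hypothetical.

```shell
# Illustrative only: the real hash used by the Accelerator software is internal.
# Sketch: accumulate the character codes of the queue name into a simple
# polynomial hash, then map it into the 256-port range 6200-6455.
queue_port() {
    name=$1
    sum=0
    for ch in $(printf '%s' "$name" | fold -w1); do
        # "'$ch" asks printf for the numeric character code (POSIX).
        sum=$(( sum * 31 + $(printf '%d' "'$ch") ))
    done
    echo $(( 6200 + sum % 256 ))
}

queue_port vnc      # same name always yields the same port
queue_port vnctest  # a different name usually yields a different port
```

Because the mapping is deterministic, every client that knows the queue name can compute the same port, which is why an explicit -port is only needed to resolve collisions or meet site policy.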

Create a New Queue

To create a queue with a given name, use the -queue option of the ncmgr start command.

For example, if you have already started the default queue 'vnc' and have it in production, and want to start a separate queue to test a new Altair Accelerator version, you might create a new queue called 'vnctest'. Example:
% ncmgr start -q vnctest

To configure the new vnctest queue, you could copy the configuration files from vnc.swd, except setup.tcl. That file should not be copied, because it usually records a different port number, and the port must differ if the two queues run on the same host.
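The copy step above can be sketched as follows. The directory names vnc.swd and vnctest.swd are the assumed server working directories, and the file names other than setup.tcl are placeholders; adjust them to your installation.

```shell
# Demo with assumed directory and file names; in practice vnc.swd and
# vnctest.swd would be the existing server working directories.
mkdir -p vnc.swd vnctest.swd
touch vnc.swd/setup.tcl vnc.swd/taskers.tcl vnc.swd/policy.tcl

# Copy every Tcl configuration file except setup.tcl, which must stay
# queue-specific because it records the (unique) port number.
for f in vnc.swd/*.tcl; do
    base=$(basename "$f")
    [ "$base" = "setup.tcl" ] && continue
    cp "$f" "vnctest.swd/$base"
done

ls vnctest.swd
```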

You would also want to edit the taskers.tcl file for the vnctest queue, so that it contains only a few hosts for testing.

How the Default Queue is Determined

When you have multiple queues, the Accelerator commands act on the default queue when no other is specified. The Accelerator administrator can control the default using files in the NC_CONFIG_DIR directory (usually $VOVDIR/local/vncConfig).

The files in that directory are in Tcl format, and set the environment variables used to determine the vovserver to which your Accelerator command sends RPC requests. Whatever is set by the file named vnc.tcl determines the default. The file for each queue has the form <queue-name>.tcl, and is created by the ncmgr command when the queue is first started. A useful trick is to make vnc.tcl a symbolic link to the queue-specific file, permitting the Accelerator admin to quickly and easily change the default.
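The symbolic-link trick can be sketched as below. The directory name vncConfig stands in for your actual NC_CONFIG_DIR (usually $VOVDIR/local/vncConfig), and the queue names vncsj and vncma are hypothetical.

```shell
# Assumed stand-in for NC_CONFIG_DIR, with hypothetical per-queue files.
mkdir -p vncConfig
touch vncConfig/vncsj.tcl vncConfig/vncma.tcl

# Make vnc.tcl a symlink to a queue-specific file: that queue becomes
# the site default.
ln -sf vncsj.tcl vncConfig/vnc.tcl

# Switching the default later is a one-line change:
ln -sf vncma.tcl vncConfig/vnc.tcl
```

Because only the link target changes, no per-user setup needs to be touched when the admin switches the default queue.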

Working with Multiple Queues

There are two ways to specify a non-default queue in Accelerator:
  • The -queue command line option
  • The NC_QUEUE environment variable

The Accelerator commands accept a -queue option to specify the queue to act on. This permits the queue to be selected on a command-by-command basis, but adds extra typing. You can abbreviate this option to -q.

If you will be working primarily with a queue other than the default 'vnc', it is better to set the environment variable NC_QUEUE to the name of the queue.
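Setting the variable looks like this, using the hypothetical queue name 'vncma'; the syntax depends on your shell family.

```shell
# Bourne-family shells (sh, bash, zsh):
export NC_QUEUE=vncma

# C-shell family (csh, tcsh) would instead use:
#   setenv NC_QUEUE vncma

# Subsequent nc commands in this session now target vncma by default.
echo "$NC_QUEUE"
```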

For example, suppose you have two sites, in San Jose, CA and Andover, MA, and the queues are named 'vncsj' and 'vncma' respectively. The Accelerator admin would set the default queue in San Jose to 'vncsj', and in Andover to 'vncma'.

A user in San Jose who wants to see the jobs of user 'carl' in Andover could do as shown below:
% nc -q vncma list -u carl
Instead of relying on the default, you can name the local queue explicitly:
% nc -q vncsj list -u carl

Trade-offs of Separating Farm Hosts

There are trade-offs to dividing farm hosts into separate queues. Some of them are:
  • When you divide your compute farm hosts into separate queues, you limit the number of job slots available to users who do not specify a non-default queue.
  • More importantly, once a job is submitted to a queue, it stays there. Jobs may therefore wait longer in a loaded queue even while slots are open in a different queue.
  • Separate queues permit maintenance shutdowns without completely stopping batch queue service. Since the addition of the -freeze option to ncmgr stop, you can even replace the vovserver binary without needing to stop running jobs, making this less of a concern.
