Altair Accelerator Administrator Guide

This manual is written for the Accelerator system administrator who needs to configure and manage the use of this Altair Accelerator product after it is installed. This guide describes basic tasks, including submitting jobs, tracking job information, and analyzing and solving common problems.

The administrator is expected to understand UNIX system processes, the dynamics of UNIX interactive shells, shell scripting techniques, and general troubleshooting concepts. Because configuration is part of the Accelerator administrator's role, knowledge of schedulers is also expected.

For details about the usage and capabilities of Accelerator, refer to the Altair Accelerator User Guide and Altair Accelerator User Tutorials.

Note:

The terminology in this release has changed from the previous one.

The Accelerator products are built on a platform called vov, which uses a client-server architecture with remote procedure calls (RPC). The server software module is called vovserver. It communicates with clients using the vov protocol. A vovserver can also be configured to respond to HTTP requests; the REST API is implemented on top of HTTP. There are several client types. Clients that make requests to the vovserver are typically implemented using vovsh (the vov shell, a Tcl interpreter). Clients that respond to vovserver requests to run jobs or tasks are called taskers, and the software they run is called vovtasker. The vovtasker can run on the same host as the vovserver or on a separate host; these hosts are typically referred to as compute nodes, compute hosts, or execution hosts.

The architecture allows for multiple vovservers to communicate with each other via a vovagent. Examples of vovagents include vovwxd, indirect taskers and vovlad.

In the 2021.1.0 release, the term slave has been deprecated and replaced with the term tasker. The web user interface, the online documentation, and the majority of the code base have been updated to reflect this change. Subsequent releases will complete the transition.

Accelerator Features

Accelerator is a high-performance, enterprise-grade job scheduler designed for distributed High Performance Computing (HPC) environments. Accelerator provides a cost-effective, highly adaptable solution capable of managing compute infrastructures from small dedicated server farms to complex distributed HPC environments.

A full-featured scheduler, Accelerator is equipped with a comprehensive set of policy management features including FairShare, Preemption, and Reservations, which can be customized per organizational requirements to maximize resource utilization and throughput.

The services provided by Accelerator include job prioritization, automatic job queuing, license management, resource management, and reporting on the status of jobs and on the usage and availability of resources.

Accelerator is used in hardware and software engineering, electronic design automation (EDA), and other fields that run calculations on a compute farm.

Accessing Accelerator

Accelerator can be accessed through the following interfaces:
  • Web UI. The web user interface is used to configure Accelerator properties and to view job status, configurations, available resources, and more.
  • GUI. A graphical user interface, independent of the web, provides graphical views of current job and resource status.
  • CLI. Commands are available for configuration and for viewing the status of jobs and resources. The GUI and the web UI can also be invoked through CLI commands.

Theory of Operation

During the initial setup, the Accelerator host server, vovserver, establishes a main port for communication and additional ports for web access and read-only access. Afterwards, the vovserver waits for and responds to incoming connection requests from clients.

Clients consist of regular clients that request a particular service, taskers (server farms) that provide resources, and notify clients that listen for events. In addition to tasker-based resources, some clients provide central resources, which are stored in and accounted for by the vovserver.

Regular clients can define jobs, or query data about jobs or system status. When a job is defined, it is normally placed in a scheduled state. Scheduled jobs are sorted into buckets. Jobs that have the same characteristics go in the same bucket. Buckets are placed in prioritized order for dispatching. This prioritization is based on FairShare, an allocation system. The top priority job in each bucket is dispatched when each of the defined resources (requests) for that job is available. The job requests can be fulfilled from the central pool as well as from the tasker resources. When a tasker is found that satisfies the job's resource request, the job is dispatched to that tasker and the job status changes to running.
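
The dispatch flow described above can be pictured with a small, self-contained Python model. This is only an illustrative sketch and not the vovserver implementation: the Job class, the function names, the resource names (cores, sim_license), and the rank table that stands in for the FairShare ordering are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    group: str        # hypothetical FairShare group the job belongs to
    requests: dict    # resource name -> quantity the job asks for
    status: str = "scheduled"


def build_buckets(jobs):
    """Group jobs so that jobs with the same characteristics share a bucket."""
    buckets = defaultdict(list)
    for job in jobs:
        key = (job.group, tuple(sorted(job.requests.items())))
        buckets[key].append(job)
    return buckets


def try_allocate(requests, tasker_free, central_free):
    """Reserve every request from the tasker or the central pool, or nothing."""
    plan = []
    for resource, amount in requests.items():
        if tasker_free.get(resource, 0) >= amount:
            plan.append((tasker_free, resource, amount))
        elif central_free.get(resource, 0) >= amount:
            plan.append((central_free, resource, amount))
        else:
            return False  # one request cannot be met, so reserve nothing
    for pool, resource, amount in plan:
        pool[resource] -= amount
    return True


def dispatch(buckets, rank, taskers, central_free):
    """Visit buckets in rank order; dispatch the top job of each if it fits.

    rank maps a group to a number (lower goes first). It is a stand-in
    for whatever ordering FairShare would actually produce.
    """
    dispatched = []
    for key in sorted(buckets, key=lambda k: rank[k[0]]):
        queue = buckets[key]
        if not queue:
            continue
        job = queue[0]
        for tasker_name, tasker_free in taskers.items():
            if try_allocate(job.requests, tasker_free, central_free):
                queue.pop(0)
                job.status = "running"
                dispatched.append((job.name, tasker_name))
                break
    return dispatched


jobs = [
    Job("sim1", "hw", {"cores": 2, "sim_license": 1}),
    Job("sim2", "hw", {"cores": 2, "sim_license": 1}),
    Job("build1", "sw", {"cores": 1}),
]
taskers = {"node1": {"cores": 3}}   # tasker-based resources
central = {"sim_license": 1}        # central resources
buckets = build_buckets(jobs)
print(dispatch(buckets, {"hw": 0, "sw": 1}, taskers, central))
```

In this model, sim1 and sim2 land in the same bucket because they have the same group and the same requests. Only the top job of that bucket is dispatched in one pass; sim2 stays scheduled until the cores and the license are returned.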

When the job has completed, the tasker notifies the vovserver. The resources, both tasker-based and central, are recovered, which allows subsequent jobs (queued in the buckets) to be dispatched. When completed, the job status is normally updated to either valid or failed.
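
As a rough illustration of the completion step, the following Python sketch returns a finished job's resources to the tasker and to the central pool, then records the final status. The function name, the dictionary layout, and the resource names are hypothetical, and mapping a zero exit code to valid is an assumption made for the example rather than a statement of how Accelerator decides the status.

```python
def finish_job(job, exit_code, tasker_free, central_free):
    """Recover the job's resources and record its final status.

    Assumption for this sketch: exit code 0 means "valid",
    any other exit code means "failed".
    """
    # Give tasker-based resources back to the tasker that ran the job.
    for resource, amount in job["tasker_resources"].items():
        tasker_free[resource] = tasker_free.get(resource, 0) + amount
    # Give central resources back to the pool tracked by the server.
    for resource, amount in job["central_resources"].items():
        central_free[resource] = central_free.get(resource, 0) + amount
    job["status"] = "valid" if exit_code == 0 else "failed"
    return job["status"]


job = {
    "name": "sim1",
    "status": "running",
    "tasker_resources": {"cores": 2},
    "central_resources": {"sim_license": 1},
}
node_free = {"cores": 1}
central_free = {"sim_license": 0}
print(finish_job(job, 0, node_free, central_free), node_free, central_free)
```

Once the counters are restored, a queued job that was waiting on the same cores or license can be considered for dispatch again.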

As previously stated, in addition to dispatching jobs and processing their statuses, the vovserver responds to queries about system and job requests, publishes events to notify clients, and continues to process incoming job requests.

Known Limitations

In the Windows environment, PowerShell is not supported; it is strongly recommended to avoid using PowerShell.

Related Documents

The following documents provide additional information that is related to using and configuring Accelerator:
  • Altair Accelerator User Guide
  • Altair Accelerator Training Guide
  • Altair Accelerator Installation Guide
  • Altair Monitor User Guide
  • VOV Subsystem Reference Guide

Also in This Guide