Automate Cloud Scaling for PBS Professional
Create rules to automatically scale the deployment of cloud nodes up and down. Use node configuration to customize a node associated with a PBS cluster.
- Log in to NavOps.
- Click Automation.
- Click Add Automation.
- Provide a name for the automation.
- Enter a description for the automation.
- Select a PBS cluster.
- In the IF condition menu:
  - Select graphql to filter for dynamic scale-up automations.
  - Select none to trigger actions without conditions, for example to scale up instances at 8 AM.
  - Select node-filter to filter on the state of the node in the inventory, for example to scale down instances once they are idle.
- For simulation-driven scaling, in the If form field, enter a GraphQL statement to filter a set of PBS Pro jobs.
Tip: Click (?) to get more information about the supported schema. Here is an example filter for jobs queued in "workq":
orderBy: J_PRIORITY_DESC, filter: {queue: "workq", states: [ 0 ], withSubJobs: true}
- The Trigger type is defined as calendar.
- In the When section, select values from the drop-down menus to build the required cron expression, which is displayed in the text box below the menus. You can also enter a valid cron expression directly in the text box. For example, * * * * * runs the automation every minute and is the recommended default.
- In the Then menu, select Simulation Driven Scaling and define the parameters.
- Click Save.
The new automation is displayed in the automations table.
- Enable the Automation Engine and enable the automation.
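The When step above builds a cron expression. As a minimal sketch, assuming NavOps uses the standard five-field cron layout (the every-minute example * * * * * is consistent with that, but the exact dialect is not stated here), the following Python helper labels each field so a schedule can be checked before it is entered. The function name label_cron_fields is hypothetical and is not part of NavOps.

```python
# Illustrative only: labels the five fields of a standard cron expression.
# Assumption: NavOps accepts the conventional minute/hour/day/month/weekday layout.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def label_cron_fields(expression: str) -> dict:
    """Split a cron expression on whitespace and pair each part with its field name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


# The recommended default from the steps above: run every minute.
every_minute = label_cron_fields("* * * * *")

# A schedule matching the "scale up at 8 AM" example: minute 0 of hour 8, every day.
daily_8am = label_cron_fields("0 8 * * *")

for name, value in daily_8am.items():
    print(f"{name:>12}: {value}")
```

Reading the labeled fields back is a quick way to catch a swapped minute and hour before saving the automation.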
Create Scale Down Automations
- Log in to NavOps.
- Click Automation.
- Click Add Automation.
- Provide a name for the automation.
- Enter a description for the automation.
- Select a PBS cluster.
- In the IF condition menu, select node-filter.
- In the Query field, click the icon to open the advanced condition editor.
- Select a template to configure a condition. For example, Remove nodes with pbspro-mom service idle.
The configuration fields are populated based on the template. You can modify them to suit your requirements. The generated query is displayed.
Note: Typically, you may want to adjust the idle time for unburst, which is identified in this table as agent-services.pbspro-mom.timestamp (using its metadata label). In certain situations, a lower restriction may be desirable to reduce expenses. The default value is -120s, which removes nodes that have been idle for longer than two minutes. Here, values of -10s or lower are advised.
- Click Save.
- The Trigger type is defined as calendar.
- In the When section, select values from the drop-down menus to build the required cron expression, which is displayed in the text box below the menus. You can also enter a valid cron expression directly in the text box.
- In the Then menu, select Scale down (PBS Pro).
- Click Save.
The new automation is displayed in the automations table.
- Enable the Automation Engine and enable the automation.
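The idle-time note above describes -120s as removing nodes that have been idle for longer than two minutes. The sketch below illustrates that comparison in Python. It is not NavOps code: the helper names are hypothetical, and it assumes the timestamp marks the moment the pbspro-mom service became idle and that the offset is relative to the current time.

```python
from datetime import datetime, timedelta, timezone


def parse_offset(value: str) -> timedelta:
    """Parse a relative offset such as '-120s' (seconds only in this sketch)."""
    if not value.endswith("s"):
        raise ValueError("only second-based offsets are handled in this sketch")
    return timedelta(seconds=int(value[:-1]))


def is_idle_long_enough(idle_since: datetime, now: datetime, offset: str = "-120s") -> bool:
    """Return True when the node became idle before the cutoff (now + offset)."""
    cutoff = now + parse_offset(offset)  # offset is negative, so the cutoff lies in the past
    return idle_since < cutoff


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
three_minutes_idle = now - timedelta(minutes=3)
one_minute_idle = now - timedelta(minutes=1)

print(is_idle_long_enough(three_minutes_idle, now))  # idle longer than the 120 s default
print(is_idle_long_enough(one_minute_idle, now))     # still inside the 120 s window
```

Under these assumptions, an offset closer to zero lets nodes qualify for removal sooner after they become idle.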
Configure Snapshots of Automations for PBS Professional
To enable snapshots for automations:
- Edit /opt/navops/etc/navops-actions-pbspro.yaml.
actions:
  ...
  - type: scale-up-pbspro
    ...
    config:
      pbs_sim_snapshot_retention: 5
pbs_sim_snapshot_retention is an integer value that indicates how many snapshots are saved for each automation. Older snapshots are removed. If the value of this setting is 0 (or less), snapshots are not saved.
Snapshots are saved in $SNAPSHOT_WORK_DIR/<automation-uid>/ directories (one directory for each automation). For example, for an automation with a UID of 7bad2a0d-6f65-48ab-b6fc-80a412e07478, the snapshots for that automation are found in $SNAPSHOT_WORK_DIR/7bad2a0d-6f65-48ab-b6fc-80a412e07478/.
If you enable snapshot retention for a period of time, and then disable it, the previously saved snapshots and directories will not be removed. Additionally, if you add/remove automations, the snapshot directories for removed automations will not be automatically removed. It is important to pay attention to available disk space when enabling this feature.
- Restart the agent using:
systemctl restart navops-agent
or the PBS Professional actions service using:
systemctl restart navops-actions-pbspro
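Because snapshot directories for disabled or removed automations are not cleaned up automatically, it can help to check how much space each one uses. The following Python sketch totals file sizes per automation directory. The function name snapshot_usage is hypothetical, and the snapshot work directory is passed in explicitly, since how $SNAPSHOT_WORK_DIR is defined on a given installation is not covered here.

```python
from pathlib import Path


def snapshot_usage(work_dir: str) -> dict:
    """Return total bytes used by each per-automation directory under work_dir."""
    usage = {}
    for automation_dir in Path(work_dir).iterdir():
        if not automation_dir.is_dir():
            continue  # only per-automation directories are of interest
        usage[automation_dir.name] = sum(
            f.stat().st_size for f in automation_dir.rglob("*") if f.is_file()
        )
    return usage


def report(work_dir: str) -> None:
    """Print directories from largest to smallest so stale ones stand out."""
    ranked = sorted(snapshot_usage(work_dir).items(), key=lambda kv: kv[1], reverse=True)
    for uid, size in ranked:
        print(f"{size:>12} bytes  {uid}")
```

Calling report() with the path that $SNAPSHOT_WORK_DIR resolves to lists each automation UID with its disk usage. Directories whose UID no longer matches an existing automation are candidates for manual removal.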