Unified InsightPro Database Schema

View the InsightPro database schema that combines the accounting logs from PBS Pro, AGE, and Accelerator.

The following table provides the InsightPro database schema mapping across WLMs.
Field Name | Data Type | Field Description | PBS Pro | AGE | Accelerator
--- | --- | --- | --- | --- | ---
account | String | Account info | account | account | N/A
arrayIndex | UInt64 | Array job task identifier | Extract the task number from the job ID | task_number | |
cgroupsFailcnt | UInt64 | Number of memory limit hits (cgroups). | N/A | failcnt | N/A |
cgroupsMaxMemory | Float64 | Maximum of the cgroups_memory load value. cgroups_memory is calculated from values in the cgroups file memory.stat | N/A | max_cgroups_memory | N/A |
cgroupsMaxUsageInBytes | Float64 | Maximum recorded memory usage (cgroups). | N/A | max_usage_in_bytes | N/A |
cgroupsMemswFailcnt | UInt64 | Number of memory and swap limit hits (cgroups). | N/A | memsw.failcnt | N/A
cgroupsMemswMaxUsageInBytes | Float64 | Maximum recorded memory and swap usage (cgroups). | N/A | memsw.max_usage_in_bytes | N/A |
cluster_id | UUID | Unique cluster ID used to distinguish different clusters | N/A | N/A | N/A
cluster_type | String | PBS, AGE, or ACCELERATOR | N/A | N/A | N/A
command | String | Application name | Resource_List.software | job_name | TOOL |
cpuTimeUsed | UInt64 | CPU usage in seconds calculated as usr+sys time (or taken from the cgroup cpuacct subsystem if it is active for the job in PBS) | resources_used.cput | cpu | CPUTIME |
cwd | String | The working directory the job ran in, as specified with the qsub(1)/qalter(1) switches -cwd and -wd. Because the delimiter used by the accounting file (colon ":") can be part of the working directory, all colons in the working directory are replaced by ASCII code 255. | N/A | cwd | DIR
deadline | UInt64 | Unix timestamp by which the job must complete | N/A | N/A | deadline
department | String | Department name | N/A | department | N/A |
dependOnJobs | String | The job IDs on which this job depended | depend | hold_jid (65) | N/A
dpcount | UInt64 | Distributed parallel jobs, e.g., 200 (cores/slots) | N/A | N/A | dpcount
dphosts | String | List of hosts on which the job ran | N/A | N/A | dphosts
dptopid | UInt64 | Top ID ("master job"), a numerical ID | N/A | N/A | dptopid
earliestStartTime | UInt64 | The initially requested execution time when the job was submitted | etime | orig_exec_time | N/A |
endTime | UInt64 | Job end time. Value is an epoch timestamp in microseconds. | end | end_time | ENDTIME
env | String | The environment which the job should use (like "modules", VOV has a named-environment management facility). | N/A | N/A | env |
exclusive | Bool | Whether the job requested exclusive use of a node | Resource_List.place=excl | N/A | N/A
exitStatus | Int64 | Exit status of the job | Exit_status | exit_status | EXIT |
failed | Int64 | Indicates the problem which occurred in case a job could not be started on the execution host | Exit_status | failed | FAILCODE |
fsgroup | String | FairShare group specified for job. | N/A | N/A | fsgroup |
fstokens | UInt64 | Number of FairShare tokens specified for job. | N/A | N/A | fstokens |
gpusRequested | UInt64 | Number of GPU slots | Resource_List.ngpus | Configurable variable name in the agent configuration. The usage is fetched from hard_resources -> <variable name>. | N/A
gpuUsage | UInt64 | GPU usage | N/A | N/A | N/A |
grabbed | String | Of resources requested, those that have been confirmed as being in-use. | N/A | N/A | grabbed |
group | String | Linux user group name | group | group | OSGROUP |
instanceType | String | "on-prem" default value for all the WLM. "on-cloud" value for PBS Navops cloud setup. | Resource_List.cloud_eligible=True | N/A | N/A |
io | Float64 | The amount of data transferred in GB. On Linux, this is the sum of all bytes read and written by the job through the read(), pread(), write(), and pwrite() system calls. On Windows, this is the sum of all bytes transferred by the job through write, read, and other operations. | N/A | io | N/A
ioops | UInt64 | The number of I/O operations. | N/A | ioops | N/A
iow | UInt64 | I/O wait time in seconds | N/A | iow | N/A
jobClass | String | Name of the Job Class the job uses. "NONE" for jobs not using a JC. | N/A | job_class | JOBCLASS |
jobExitStatus | String | "failed" for a non-zero exit status, "succeed" for a zero exit status | N/A | N/A | N/A
jobId | String | job identifier | index[2] | job_number | ID |
licenses | String | subset of RESOURCES string | N/A | N/A | licenses |
memoryRequested | UInt64 | Memory requested when submitting the job (KB) | Resource_List.mem | Order of priority - the first one found is taken: 1. hard_resources→m_mem_free 2. hard_resources→h_rss 3. hard_resources→h_vmem 4. hard_resources→mem_free | RAM value from RESOURCE field
name | String | Job name provided by the end user | jobname | job_name | JOBNAME |
owner | String | Linux user name which is used to submit job | user | user | USER |
peName | String | The parallel environment which was selected for that job. | N/A | granted_pe | N/A |
physicalMemoryUsed | UInt64 | RSS memory used as reported by the WLM end record (KB) | resources_used.mem | maxrss | MAXRAM
project | String | Project name | project | project | PROJECT |
qdelInfo | String | If the job (the array task) has been deleted via qdel, <username>@<hostname>, else NONE. If qdel was called multiple times, every invocation is recorded in a comma separated list. | N/A | qdel_info | N/A |
queued | UInt64 | Amount of time the job spent in one of the scheduling buckets, in seconds | N/A | N/A | queue
queueName | String | Queue in which job was run | queue | qname | N/A |
queueTime | UInt64 | Job queue time (when the job was submitted). Value is an epoch timestamp in microseconds. | qtime | submission_time | SUBMITTIME
resvID | String | Reservation ID | resvID | arid | N/A |
resvName | String | The name of the Advance Reservation of the job. | resvname | ar_name | N/A |
slots | UInt64 | Number of slots | Resource_List.ncpus | slots | CORES value from RESOURCE field |
spriority | UInt64 | Submission priority. | N/A | N/A | spriority |
startTime | UInt64 | Time the job started running on the execution host. Value is an epoch timestamp in microseconds. | start | start_time | STARTTIME
status | String | How the job finished: DONE or FAILED | N/A | N/A | status
submitHost | String | Hostname where job was submitted | N/A | submit_host | SUBMITHOST |
submitTime | UInt64 | Job submission time. Value is an epoch timestamp in microseconds. | ctime | submission_time | SUBMITTIME
susptime | UInt64 | Total time the job was suspended, in seconds | N/A | N/A | susptime
virtualMemoryUsed | UInt64 | Virtual memory used as reported by the WLM end record (KB) | resources_used.vmem | maxvmem | MAXVM
wallClockTimeRequested | UInt64 | Wallclock time requested, in seconds, when submitting the job | Resource_List.walltime | hard_resources | N/A
wallclockTimeUsed | UInt64 | Wallclock time used by the job | resources_used.walltime | wallclock | ENDTIME-STARTTIME-SUSPTIME
xpriority | UInt64 | Execution priority (nice). | N/A | N/A | xpriority |
exec_host_list | Nested | list of hosts where job was running | exec_vnode | exec_host_list | EXEHOST |
cloud_cost | Float64 | If cloud job what is the price | resources_used.cloud_cost_list | N/A | N/A |
cloud_node_class | String | If cloud job what is the node class | resources_used.cloud_instance_list | N/A | N/A |
cloud_node_time | UInt64 | If cloud job how much time it used the node | resources_used.cloud_time_list | N/A | N/A |
cloud_node_type | String | If cloud job what is the node type | resources_used.cloud_type_list | N/A | N/A |
frac_node_cpu_used | Float64 | Fraction of the available CPU used on the node | |||
frac_node_mem_used | Float64 | Fraction of the available memory used on the node | |||
mem | UInt64 | Memory used on the node | exec_vnode | ||
ncpus | UInt64 | Number of CPUs used on the node | exec_vnode | exec_host_list | CORES value from RESOURCE field
node | String | Execution node name | exec_vnode | exec_host_list | EXEHOST
resourcesGranted | Nested | Based on the custom resource definition, dynamic fields are created with the defined custom resource data type | resources_assigned.<resource_name>=<resource_value> | granted_req. (73) | RESOURCES parsed from RESOURCE field
datatype | Enum('integer', 'float', 'string', 'boolean') | Custom resource data type | |||
resource_name | String | Custom resource name | |||
valueBoolean | Bool | Custom resource value | |||
valueFloat | Float64 | Custom resource value | |||
valueInt | Int64 | Custom resource value | |||
valueStr | String | Custom resource value | |||
other_resources | Nested | All the non-mapped job information. | |||
resource_name | String | ||||
resource_value | String | ||||
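Because queueTime, startTime, endTime, and submitTime are stored as epoch timestamps in microseconds, they must be divided by 1,000,000 before converting to ClickHouse DateTime values. A minimal sketch of this conversion (the selected columns and aliases are illustrative; JOB_RECORDS is the job table used in the sample queries in this document):

```sql
-- Timestamps are epoch microseconds; scale to seconds before conversion.
SELECT
    jobId,
    toDateTime(intDiv(startTime, 1000000)) AS started,
    toDateTime(intDiv(endTime, 1000000))   AS ended,
    (endTime - startTime) / 1000000        AS wallclock_seconds
FROM JOB_RECORDS
WHERE endTime > 0
LIMIT 10;
```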
exec_host_list: Contains the list of nodes on which the job ran, along with the resources the job consumed on each node. Sample SQL queries:

Query 1: Find all distinct node names:
select DISTINCT(arrayJoin(exec_host_list.node)) from JOB_RECORDS;

Query 2: Find the number of jobs that ran on each node:
select exec.node, count() from JOB_RECORDS ARRAY JOIN exec_host_list AS exec group by exec.node;

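Since exec_host_list also stores per-node consumption (ncpus, mem), the same ARRAY JOIN pattern can aggregate resource usage per node. A sketch along the lines of the queries above (the output column aliases are illustrative):

```sql
-- Total CPUs and memory (KB) consumed per execution node.
SELECT
    exec.node,
    sum(exec.ncpus) AS total_cpus,
    sum(exec.mem)   AS total_mem_kb
FROM JOB_RECORDS
ARRAY JOIN exec_host_list AS exec
GROUP BY exec.node
ORDER BY total_cpus DESC;
```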
resourcesGranted: Based on the custom resource definition, dynamic fields are stored with their defined data types. Different WLMs use different terms for this: in PBS they are Custom Resources, in AGE they are Complex Attributes, and in Accelerator they are Resources. One job can have more than one custom resource, each with its own data type (Boolean, Float, Integer, String). resourcesGranted stores the value as a list of custom resources: resourcesGranted.resource_name contains the custom resource name, resourcesGranted.datatype stores the data type of that resource, and resourcesGranted.value<DataType> holds its value. Sample SQL query:

Query 1: Find the number of jobs that requested the "ndesktops" resource:
SELECT count() from JOB_RECORDS WHERE has(resourcesGranted.resource_name, 'ndesktops');

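To read the value of a specific custom resource, combine indexOf() on resourcesGranted.resource_name with the value<DataType> column that matches its datatype. A sketch using the "ndesktops" resource, assuming it was stored with an integer datatype:

```sql
-- Fetch the integer value of the 'ndesktops' custom resource per job
-- (use valueFloat/valueStr/valueBoolean for other datatypes).
SELECT
    jobId,
    resourcesGranted.valueInt[indexOf(resourcesGranted.resource_name, 'ndesktops')] AS ndesktops
FROM JOB_RECORDS
WHERE has(resourcesGranted.resource_name, 'ndesktops');
```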
other_resources: All the non-mapped job information that is not part of the default parsing fields or resourcesGranted is stored in other_resources. All values of such fields are stored as strings. For example, in PBS the Resource_List.nodect attribute of the job accounting log is not mapped to any of the database fields or resourcesGranted, so it is stored in other_resources. Sample SQL query to fetch data from other_resources:

Query 1: Find the number of jobs whose Resource_List.nodect value is 1:
SELECT count() from JOB_RECORDS WHERE has(other_resources.resource_name, 'Resource_List.nodect') and other_resources.resource_value[indexOf(other_resources.resource_name, 'Resource_List.nodect')]='1';

Nested Data Structure: A nested data structure is like a table inside a cell. The column names and types of a nested data structure are specified in the same way as in a CREATE TABLE query. Each table row can correspond to any number of rows in a nested data structure. For more information on how to work with nested data structures, refer to the ClickHouse database documentation.
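As an illustration, a minimal sketch of how a table with a Nested column might be declared in ClickHouse. The table name, column subset, and engine settings here are hypothetical, not the actual InsightPro DDL:

```sql
-- Hypothetical, simplified declaration; the real job table
-- contains many more columns than shown here.
CREATE TABLE JOB_RECORDS_EXAMPLE
(
    jobId String,
    -- Nested column: behaves like a table inside each row
    exec_host_list Nested
    (
        node  String,
        ncpus UInt64,
        mem   UInt64
    )
)
ENGINE = MergeTree
ORDER BY jobId;
```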