Unified InsightPro Database Schema
View the InsightPro database schema that combines the accounting logs from PBS Pro, AGE, and Accelerator.
The following table provides the InsightPro database schema and its mapping across the WLMs:
| Field Name | Data Type | Field Description | PBS Pro | AGE | Accelerator |
|---|---|---|---|---|---|
| account | String | Account information | account | account | N/A |
| arrayIndex | UInt64 | Array job task identifier | Task number extracted from the job ID | task_number | N/A |
| cgroupsFailcnt | UInt64 | Number of memory limit hits (cgroups) | N/A | failcnt | N/A |
| cgroupsMaxMemory | Float64 | Maximum of the cgroups_memory load value; cgroups_memory is calculated from values in the cgroups file memory.stat | N/A | max_cgroups_memory | N/A |
| cgroupsMaxUsageInBytes | Float64 | Maximum recorded memory usage (cgroups) | N/A | max_usage_in_bytes | N/A |
| cgroupsMemswFailcnt | UInt64 | Number of memory and swap limit hits (cgroups) | N/A | memsw.failcnt | N/A |
| cgroupsMemswMaxUsageInBytes | Float64 | Maximum recorded memory and swap usage (cgroups) | N/A | memsw.max_usage_in_bytes | N/A |
| cluster_id | UUID | Unique cluster ID used to distinguish clusters | N/A | N/A | N/A |
| cluster_type | String | PBS, AGE, or ACCELERATOR | N/A | N/A | N/A |
| command | String | Application name | Resource_List.software | job_name | TOOL |
| cpuTimeUsed | UInt64 | CPU usage in seconds, calculated as user + system time (or taken from the cgroups cpuacct subsystem if it is active for the job in PBS) | resources_used.cput | cpu | CPUTIME |
| cwd | String | The working directory the job ran in, as specified with the qsub(1)/qalter(1) switches -cwd and -wd. Because the delimiter used by the accounting file (colon ":") can be part of the working directory, all colons in the working directory are replaced by ASCII code 255 | N/A | cwd | DIR |
| deadline | UInt64 | Unix timestamp by which the job needs to be done | N/A | N/A | deadline |
| department | String | Department name | N/A | department | N/A |
| dependOnJobs | String | The job IDs this job depended on to run | depend | hold_jid (65) | N/A |
| dpcount | UInt64 | Distributed parallel jobs, e.g. 200 (cores/slots) | N/A | N/A | dpcount |
| dphosts | String | List of hosts where the job was running | N/A | N/A | dphosts |
| dptopid | UInt64 | Top ID ("master job"), a numerical ID | N/A | N/A | dptopid |
| earliestStartTime | UInt64 | The execution time initially requested when the job was submitted | etime | orig_exec_time | N/A |
| endTime | UInt64 | Job end time; value is an epoch timestamp in microseconds | end | end_time | ENDTIME |
| env | String | The environment the job should use (like "modules"; VOV has a named-environment management facility) | N/A | N/A | env |
| exclusive | Bool | Whether the job requested exclusive use of the node | Resource_List.place=excl | N/A | N/A |
| exitStatus | Int64 | Exit status of the job | Exit_status | exit_status | EXIT |
| failed | Int64 | Indicates the problem that occurred when a job could not be started on the execution host | Exit_status | failed | FAILCODE |
| fsgroup | String | FairShare group specified for the job | N/A | N/A | fsgroup |
| fstokens | UInt64 | Number of FairShare tokens specified for the job | N/A | N/A | fstokens |
| gpusRequested | UInt64 | Number of GPU slots | Resource_List.ngpus | Configurable variable name in the agent configuration; the usage is fetched from hard_resources -> <variable name> | N/A |
| gpuUsage | UInt64 | GPU usage | N/A | N/A | N/A |
| grabbed | String | Of the resources requested, those that have been confirmed as being in use | N/A | N/A | grabbed |
| group | String | Linux user group name | group | group | OSGROUP |
| instanceType | String | "on-prem" is the default value for all WLMs; "on-cloud" is used for a PBS Navops cloud setup | Resource_List.cloud_eligible=True | N/A | N/A |
| io | Float64 | The amount of data transferred in GB. On Linux, data transferred means all bytes read and written by the job through the read(), pread(), write() and pwrite() system calls. On Windows, this is the sum of all bytes transferred by the job by doing write, read, and other operations | N/A | io | N/A |
| ioops | UInt64 | The number of I/O operations | N/A | ioops | N/A |
| iow | UInt64 | I/O wait time in seconds | N/A | iow | N/A |
| jobClass | String | Name of the Job Class the job uses; "NONE" for jobs not using a JC | N/A | job_class | JOBCLASS |
| jobExitStatus | String | "failed" for a non-zero exit status, "succeed" for a zero exit status | N/A | N/A | N/A |
| jobId | String | Job identifier | index[2] | job_number | ID |
| licenses | String | Subset of the RESOURCES string | N/A | N/A | licenses |
| memoryRequested | UInt64 | Memory requested when submitting the job (KB) | Resource_List.mem | In order of priority, the first one found is taken: 1. hard_resources->m_mem_free, 2. hard_resources->h_rss, 3. hard_resources->h_vmem, 4. hard_resources->mem_free | RAM value from the RESOURCE field |
| name | String | Job name provided by the end user | jobname | job_name | JOBNAME |
| owner | String | Linux user name used to submit the job | user | user | USER |
| peName | String | The parallel environment selected for the job | N/A | granted_pe | N/A |
| physicalMemoryUsed | UInt64 | RSS memory used, reported by the WLM end record (KB) | resources_used.mem | maxrss | MAXRAM |
| project | String | Project name | project | project | PROJECT |
| qdelInfo | String | If the job (the array task) has been deleted via qdel, <username>@<hostname>; else NONE. If qdel was called multiple times, every invocation is recorded in a comma-separated list | N/A | qdel_info | N/A |
| queued | UInt64 | Amount of time the job spent in one of the scheduling buckets, in seconds | N/A | N/A | queue |
| queueName | String | Queue in which the job ran | queue | qname | N/A |
| queueTime | UInt64 | Job queue time (when the job was submitted); value is an epoch timestamp in microseconds | qtime | submission_time | SUBMITTIME |
| resvID | String | Reservation ID | resvID | arid | N/A |
| resvName | String | The name of the Advance Reservation of the job | resvname | ar_name | N/A |
| slots | UInt64 | Number of slots | Resource_List.ncpus | slots | CORES value from the RESOURCE field |
| spriority | UInt64 | Submission priority | N/A | N/A | spriority |
| startTime | UInt64 | Time the job started running on the execution host; value is an epoch timestamp in microseconds | start | start_time | STARTTIME |
| status | String | How the job finished: DONE or FAILED | N/A | N/A | status |
| submitHost | String | Hostname from which the job was submitted | N/A | submit_host | SUBMITHOST |
| submitTime | UInt64 | Job submission time; value is an epoch timestamp in microseconds | ctime | submission_time | SUBMITTIME |
| susptime | UInt64 | Total time the job was suspended, in seconds | N/A | N/A | susptime |
| virtualMemoryUsed | UInt64 | Virtual memory used, reported by the WLM end record (KB) | resources_used.vmem | maxvmem | MAXVM |
| wallClockTimeRequested | UInt64 | Wallclock time in seconds requested when submitting the job | Resource_List.walltime | hard_resources | N/A |
| wallclockTimeUsed | UInt64 | Wallclock time used by the job | resources_used.walltime | wallclock | ENDTIME-STARTTIME-SUSPTIME |
| xpriority | UInt64 | Execution priority (nice) | N/A | N/A | xpriority |
| exec_host_list | Nested | List of hosts where the job was running | exec_vnode | exec_host_list | EXEHOST |
| exec_host_list.cloud_cost | Float64 | For a cloud job, the price | resources_used.cloud_cost_list | N/A | N/A |
| exec_host_list.cloud_node_class | String | For a cloud job, the node class | resources_used.cloud_instance_list | N/A | N/A |
| exec_host_list.cloud_node_time | UInt64 | For a cloud job, how much time it used the node | resources_used.cloud_time_list | N/A | N/A |
| exec_host_list.cloud_node_type | String | For a cloud job, the node type | resources_used.cloud_type_list | N/A | N/A |
| exec_host_list.frac_node_cpu_used | Float64 | Fraction of the available CPU used | | | |
| exec_host_list.frac_node_mem_used | Float64 | Fraction of the available memory used | | | |
| exec_host_list.mem | UInt64 | Memory used on the node | exec_vnode | | |
| exec_host_list.ncpus | UInt64 | Number of CPUs used on the node | exec_vnode | exec_host_list | CORES value from the RESOURCE field |
| exec_host_list.node | String | Execution node name | exec_vnode | exec_host_list | EXEHOST |
| resourcesGranted | Nested | Based on the custom resource definition, dynamic fields are created with the defined custom resource data type | resources_assigned.<resource_name>=<resource_value> | granted_req. (73) | RESOURCES parsed from the RESOURCE field |
| resourcesGranted.datatype | Enum('integer', 'float', 'string', 'boolean') | Custom resource data type | | | |
| resourcesGranted.resource_name | String | Custom resource name | | | |
| resourcesGranted.valueBoolean | Bool | Custom resource value | | | |
| resourcesGranted.valueFloat | Float64 | Custom resource value | | | |
| resourcesGranted.valueInt | Int64 | Custom resource value | | | |
| resourcesGranted.valueStr | String | Custom resource value | | | |
| other_resources | Nested | All the non-mapped job information | | | |
| other_resources.resource_name | String | Name of the non-mapped attribute | | | |
| other_resources.resource_value | String | Value of the non-mapped attribute, stored as a string | | | |
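Because the same schema backs all three WLMs, a single query can report across clusters. An illustrative query using only fields from the table above, counting failed jobs per cluster type and owner:

SELECT cluster_type, owner, count() AS failed_jobs FROM JOB_RECORDS WHERE jobExitStatus = 'failed' GROUP BY cluster_type, owner ORDER BY failed_jobs DESC LIMIT 10;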
exec_host_list:

Contains the list of nodes on which the job ran, along with the resources the job consumed on each node. Sample SQL queries:

Query 1:
Find all the distinct node names:
SELECT DISTINCT arrayJoin(exec_host_list.node) FROM JOB_RECORDS;


Query 2:
Find the number of jobs that ran on each node:
SELECT exec.node, count() FROM JOB_RECORDS ARRAY JOIN exec_host_list AS exec GROUP BY exec.node;
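A further illustrative query, using the same ARRAY JOIN pattern as Query 2, sums the per-node CPU counts:

Query 3:
Find the total number of CPUs consumed on each node across all jobs:
SELECT exec.node, sum(exec.ncpus) AS total_ncpus FROM JOB_RECORDS ARRAY JOIN exec_host_list AS exec GROUP BY exec.node ORDER BY total_ncpus DESC;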
resourcesGranted:

Based on the custom resource definition, dynamic fields are stored with their defined data type. Different WLMs use different terms for this: in PBS Pro they are Custom Resources, in AGE they are Complex Attributes, and in Accelerator they are Resources.

One job can have more than one custom resource, each with its own data type (Boolean, Float, Integer, or String).

resourcesGranted stores the value as a list of custom resources; resourcesGranted.resource_name contains the custom resource name.

resourcesGranted.datatype stores the data type of that resource, and the corresponding resourcesGranted.value<DataType> column holds its value.

Sample SQL queries are:

Query 1: Find the number of jobs that have requested the "ndesktops" resource:
SELECT count() FROM JOB_RECORDS WHERE has(resourcesGranted.resource_name, 'ndesktops');
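Since the nested columns are accessed as parallel arrays, the value of a given resource can be fetched with indexOf. An illustrative query, assuming "ndesktops" is defined with datatype 'integer' so that its value lives in valueInt:

Query 2: Fetch the granted "ndesktops" value for each job that requested it:
SELECT jobId, resourcesGranted.valueInt[indexOf(resourcesGranted.resource_name, 'ndesktops')] AS ndesktops FROM JOB_RECORDS WHERE has(resourcesGranted.resource_name, 'ndesktops');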
other_resources:

All non-mapped job information, i.e. attributes that are not part of the default parsed fields or of resourcesGranted, is stored in other_resources. The values of such fields are always stored as strings.

For example, in PBS the Resource_List.nodect attribute of the job accounting log is not mapped to any of the database fields or to resourcesGranted, so it is stored in other_resources.

Sample SQL query to fetch data from other_resources:

Query 1: Find the number of jobs whose Resource_List.nodect value is 1:

SELECT count() FROM JOB_RECORDS WHERE has(other_resources.resource_name, 'Resource_List.nodect') AND other_resources.resource_value[indexOf(other_resources.resource_name, 'Resource_List.nodect')] = '1';
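A further illustrative query lists the distinct non-mapped attribute names present in the database, which is useful for discovering what ends up in other_resources:

Query 2: List all the distinct non-mapped attribute names:
SELECT DISTINCT arrayJoin(other_resources.resource_name) FROM JOB_RECORDS;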
Nested Data Structure:

A nested data structure is like a table inside a cell. The column names and types of a nested data structure are specified in the same way as in a CREATE TABLE query. Each table row can correspond to any number of rows in the nested data structure.

For more information on how to work with nested data structures, refer to the ClickHouse database documentation.
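For illustration, a Nested column such as other_resources is declared inside CREATE TABLE like a small embedded table. The sketch below is simplified and is not the actual InsightPro DDL; the table name and engine are assumptions:

CREATE TABLE JOB_RECORDS_SKETCH
(
    jobId String,
    -- Nested column: stored as parallel arrays, one element per non-mapped attribute
    other_resources Nested
    (
        resource_name String,
        resource_value String
    )
)
ENGINE = MergeTree
ORDER BY jobId;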