speleolinux

Not getting resources_used from array jobs


Hi all


 


When a job runs to completion users can get the memory used from qstat:



$ qstat -fx 326958.hpcnode1 | grep resources_used
    resources_used.cpupercent = 0
    resources_used.cput = 00:00:29
    resources_used.mem = 10620kb

However, for array jobs no resources_used.* attributes appear to be shown.


For example, none of the following commands reports them:



$ qstat -fx 326959[].hpcnode1   <-- I don't expect to see resources_used.* here
$ qstat -fx 326959[1].hpcnode1 <-- but for the actual subjobs it should be available
$ qstat -fx 326959[2].hpcnode1 <-- ditto

I can get the info from tracejob for each job in the array:



$ tracejob 326959[1].hpcnode1
... blah ...
07/28/2015 15:53:02  S    Exit_status=0 resources_used.cpupercent=0
resources_used.cput=00:00:00 resources_used.mem=3772kb resources_used.ncpus=1
resources_used.vmem=432572kb resources_used.walltime=00:00:01
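
For anyone who wants to collect these values in a script, here is a minimal sketch that pulls the resources_used.* pairs out of tracejob-style text. It assumes the output has already been captured as a string and that no value contains whitespace; the function name resources_from_tracejob is hypothetical, not part of PBS.

```python
import re

# Matches "resources_used.<name>=<value>" where the value has no whitespace
# (an assumption that holds for the output shown in this thread).
RESOURCE_RE = re.compile(r"resources_used\.(\w+)=(\S+)")


def resources_from_tracejob(text):
    """Return a dict of resources_used.* values found in tracejob output.

    If the text contains several records, later values overwrite earlier ones.
    """
    return {name: value for name, value in RESOURCE_RE.findall(text)}


# The sample is the tracejob excerpt quoted above, including its line wrapping.
sample = """07/28/2015 15:53:02  S    Exit_status=0 resources_used.cpupercent=0
resources_used.cput=00:00:00 resources_used.mem=3772kb resources_used.ncpus=1
resources_used.vmem=432572kb resources_used.walltime=00:00:01"""

if __name__ == "__main__":
    for name, value in resources_from_tracejob(sample).items():
        print(name, value)
```

Because the pattern is applied to the whole text, the wrapped lines in the tracejob output do not need to be rejoined first.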

So why doesn't qstat show resources_used for array jobs?


 


Mike


 



Hi Mike, job array subjobs are designed to be "lighter weight" than individual batch jobs, and as such not all job attributes are retained in the server's memory or even written to the internal PBS database (FYI, this is the reason that subjobs are restarted upon server restart).  The resources_used values for the subjobs are recorded in the accounting log E record, though.


 


I hope this helps!


 


-Scott


Hi

 

OK, now I understand. I can see the E record in the accounting file (for example, server_priv/accounting/20150730).

The PBSPro manual says "The tracejob command can read both event logs and accounting logs.", so that's why I could get the resources_used values from tracejob, which reads that file, but not from qstat.
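
In case it is useful to others, here is a rough sketch of reading the E record for one subjob straight from an accounting file. It assumes the usual accounting layout of four semicolon-separated fields (timestamp, record type, job ID, then space-separated key=value pairs) and that no value contains a space. The helper names are hypothetical, and the sample line is constructed from the tracejob values earlier in this thread rather than copied from a real log.

```python
PREFIX = "resources_used."


def parse_accounting_line(line):
    """Split one accounting line into (timestamp, type, job_id, fields).

    Returns None when the line does not have the four expected parts.
    """
    parts = line.rstrip("\n").split(";", 3)
    if len(parts) != 4:
        return None
    timestamp, record_type, job_id, message = parts
    fields = {}
    for token in message.split():
        # Split on the first "=" only, so values such as "1:ncpus=1" survive.
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return timestamp, record_type, job_id, fields


def subjob_resources(lines, job_id):
    """Return the resources_used.* values from the first E record for job_id."""
    for line in lines:
        parsed = parse_accounting_line(line)
        if parsed is None:
            continue
        _, record_type, record_job, fields = parsed
        if record_type == "E" and record_job == job_id:
            return {k[len(PREFIX):]: v
                    for k, v in fields.items() if k.startswith(PREFIX)}
    return None


# Assumed example line, built from the tracejob output quoted in this thread.
sample_log = [
    "07/28/2015 15:53:02;E;326959[1].hpcnode1;Exit_status=0 "
    "resources_used.cpupercent=0 resources_used.cput=00:00:00 "
    "resources_used.mem=3772kb resources_used.ncpus=1 "
    "resources_used.vmem=432572kb resources_used.walltime=00:00:01",
]

if __name__ == "__main__":
    print(subjob_resources(sample_log, "326959[1].hpcnode1"))
```

To run it against a real file, pass an open file object instead of sample_log, for example the day's file under server_priv/accounting. Reading that directory normally needs elevated privileges on the server host.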
 

Thanks Scott

