  • Announcements

    • admin

      PBS Forum Has Closed   06/12/17

      The PBS Works Support Forum is no longer active.  For PBS community-oriented questions and support, please join the discussion at http://community.pbspro.org.  Any new security advisories related to commercially-licensed products will be posted in the PBS User Area (https://secure.altair.com/UserArea/). 
jerrytown

Exclude (or include) specific nodes in PBS Pro


I'm working on a cluster with 8 nodes; 4 nodes have python and 4 don't. How can I ensure that my python jobs only go to the nodes with python?

  • I do not have admin rights on the cluster
  • PBS Pro 13.1
  • RedHat 5.11

A subset of the many attempts that have not worked for me:

  • qsub -l host=!bad_node1
  • qsub -l select=1:host=!bad_node1
  • qsub -l host=!bad_node1&!bad_node2
  • qsub -l nodes=good_node1+good_node2

This question has been asked before, but those solutions are not working for me:


Hi Jerry,

PBS Professional does not currently support using logical operators in resource specifiers. This is something we're working on for a future release.

Since you don't have administrative privileges on the machine, you can't configure PBS to identify vnodes that have Python on them. That's too bad, because then you could just request a vnode with Python and PBS would schedule you there automatically. As it is, you're going to have to request a specific vnode or host for each job, e.g.:

qsub -l select=1:ncpus=1:host=good_node1

 

Steve
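
For anyone landing here later: one way to make the per-job host request less manual is to pick a free host from a hand-maintained list before submitting. This is only a minimal sketch. It assumes `pbsnodes -a` is usable by ordinary users and prints each node name unindented, followed by indented `state = ...` lines. The host names, `GOOD_NODES`, `pick_free_node`, and `submit_on_good_node` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical allow-list: replace with the hosts that actually have Python.
GOOD_NODES="good_node1 good_node2 good_node3 good_node4"

# Print the first allow-listed host (in pbsnodes order) reported as "free".
# Assumes `pbsnodes -a` prints each node name unindented, followed by
# indented "key = value" lines that include "state = ...".
pick_free_node() {
    pbsnodes -a | awk -v good="$GOOD_NODES" '
        BEGIN { n = split(good, names, " "); for (i = 1; i <= n; i++) ok[names[i]] = 1 }
        /^[^ \t]/ { node = $1; next }    # unindented line: start of a node record
        $1 == "state" && $3 == "free" && (node in ok) { print node; exit }
    '
}

# Submit to one free allow-listed host; all arguments are passed to qsub.
submit_on_good_node() {
    host=$(pick_free_node)
    if [ -z "$host" ]; then
        echo "no allow-listed node is free" >&2
        return 1
    fi
    qsub -l "select=1:ncpus=1:host=$host" "$@"
}
```

Usage would be something like `submit_on_good_node myjob.sh`. The names in the list have to match what pbsnodes prints (short name versus fully qualified). When nothing is free the function gives up instead of queueing, so falling back to a fixed `host=` request is still needed in that case.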


Thanks for the response, Steve. At least I know to stop trying that strategy. Now to figure out a homebrew solution. Initial idea: submit the identical job to all four good nodes, then run a script on the login node that watches for whichever of the four jobs starts first and kills the other three. Not elegant, but a possible workaround. Other strategies?
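
A rough sketch of that idea, in case it helps someone else. It assumes `qsub` prints the new job ID on stdout and that `qstat -f <jobid>` prints a `job_state = X` line. The host names, `GOOD_NODES`, `POLL_SECONDS`, `job_state`, and `race_submit` are my own placeholders, not anything PBS provides.

```shell
#!/bin/sh
# Hypothetical allow-list of hosts that have Python.
GOOD_NODES="good_node1 good_node2 good_node3 good_node4"
POLL_SECONDS=${POLL_SECONDS:-30}

# Print the job's state letter (Q, R, ...). Prints nothing if qstat fails,
# e.g. because the job has already finished and left the queue.
job_state() {
    qstat -f "$1" 2>/dev/null | awk '$1 == "job_state" { print $3; exit }'
}

# Submit one copy per host, wait until one copy is no longer waiting,
# delete the others, and print the winning job ID. Arguments go to qsub.
race_submit() {
    ids=""
    for host in $GOOD_NODES; do
        id=$(qsub -l "select=1:ncpus=1:host=$host" "$@") || continue
        ids="$ids $id"
    done
    [ -n "$ids" ] || return 1

    winner=""
    while [ -z "$winner" ]; do
        for id in $ids; do
            state=$(job_state "$id")
            # Queued/held/waiting/transit means "not started yet". Anything
            # else, including an empty state, is treated as started.
            case "$state" in
                Q|H|W|T) ;;
                *) winner=$id; break ;;
            esac
        done
        [ -n "$winner" ] || sleep "$POLL_SECONDS"
    done

    for id in $ids; do
        [ "$id" = "$winner" ] || qdel "$id"
    done
    echo "$winner"
}
```

Two caveats with this sketch: two copies can start in the same scheduling cycle, so the job has to tolerate being started twice and then killed, and a transient qstat failure is read as "started", which would delete the other copies early.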

10 hours ago, jerrytown said:

Thanks for the response, Steve. At least I know to stop trying that strategy. Now to figure out a homebrew solution. Initial idea: submit the identical job to all four good nodes, then run a script on the login node that watches for whichever of the four jobs starts first and kills the other three. Not elegant, but a possible workaround. Other strategies?

Well, the ideal solution would be to get your friendly neighborhood system admin to set the built-in "software" resource on the nodes that have Python on them (because that's what that resource is there for) and configure the scheduler to take that resource into consideration for scheduling purposes.  Then, all you'd have to do is something like this:

qsub -l select=1:software=python ....

and the scheduler would take care of the rest for you. This takes exactly one qmgr command and a one-line change to the scheduler configuration. But, as you said, you don't have the requisite permission to do that.
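
If it helps when asking the admin, here is a guess at what that setup might look like. The post above does not spell out the exact command, so treat this as an assumption: it uses qmgr's `set node <name> resources_available.<resource> = <value>` form, printed once per node for clarity, plus a reminder about the scheduler's `resources:` line. The function name `print_software_setup` and the node names are hypothetical, and the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Print (do not run) admin-side commands for tagging hosts that have Python.
# Node names are passed as arguments; running the output needs manager rights.
print_software_setup() {
    for node in "$@"; do
        printf 'qmgr -c "set node %s resources_available.software = python"\n' "$node"
    done
    # Assumption: the scheduler only considers resources named on the
    # "resources:" line of its sched_config file.
    echo '# then add "software" to the resources: line in sched_config and have the scheduler re-read it'
}
```

Calling `print_software_setup good_node1 good_node2` gives the admin something concrete to review before changing anything.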
