vechinsd

How to specify a subset of nodes from a queue

Recommended Posts

We have a cluster with 5 nodes (4 cores per node) and have PBS running. One of the users has a tendency to run jobs interactively on one or two of the nodes. If I use qsub to submit many jobs, they get submitted to all the nodes, including the "busy" ones running the other user's interactive jobs.

Besides the obvious answer of making the other user use qsub, is there a way in my qsub scripts to restrict which nodes the jobs are submitted to? For instance, say that nodes 3 and 4 are running the interactive jobs. I'd like to tell my jobs that they can use nodes 1, 2, or 5. Note: I may be submitting 30 to 100 jobs or more at a time, so I don't want to manually tell each job what specific node to use; I just want to say "use whatever is currently available from node 1, 2, or 5." I know there is the -l nodes= resource that can be used to specify nodes. However, this seems to use an "and" type of logic. For instance, if I use

#PBS -l nodes=node001+node002+node005

then jobs are only executed on node001 (while, I assume, locking up node002 and node005), because this says that each job requires a process on each of the three nodes.

(At the moment it's more trouble than it's worth to try to get the other user to use PBS. I'm just trying to work around it when it's an issue.)


You didn't mention whether you are using PBS Professional or OpenPBS. What you can do in each case is different.

PBS Pro uses a much better qsub selection directive. While the "-l nodes" version is still available for compatibility's sake, the easiest thing to do in this case would be to have the user who is running interactive jobs simply ask for exclusive access to his nodes:

qsub -I -lselect=2:ncpus=2 -lplace=excl

would request 2 chunks of 2 CPUs, so PBS could fulfill that with one node or two. Adding "excl" tells PBS not to let the remaining cores on the selected node(s) be used by a subsequent job.

qsub -I -lselect=2:ncpus=4

would allow you to select two whole nodes. In either case, your next jobs would not be allowed to fall onto the nodes that PBS selected for the interactive job.

There are some other things you could also do, like setting the nodes' sharing attribute to "force_excl", which would always allocate whole nodes to each job.
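For reference, the sharing attribute isn't set through qmgr; it normally goes in a Version 2 vnode configuration file that you load into the MoM on each execution host (double-check the Administrator's Guide for your release). A minimal sketch, with node003 standing in for whichever host you want made exclusive:

# Run on the execution host itself (sketch only; node003 is a placeholder name).
cat > /tmp/force_excl.cfg <<'EOF'
$configversion 2
node003: sharing = force_excl
EOF

pbs_mom -s insert force_excl /tmp/force_excl.cfg   # register the config file with MoM
# then restart (or HUP) pbs_mom so the new sharing value takes effect

With force_excl in place, any job that lands on that node gets the node to itself, so other jobs simply won't be scheduled there while it runs.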

If you are using OpenPBS, your options are more limited. You could create a qsub wrapper script that forces interactive jobs into a certain queue, and then assign specific nodes to specific queues. This would make all interactive jobs fall into a queue that can only place jobs on those nodes. It's a little less flexible, but it would probably work to keep the two workloads separate.
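Roughly, the queue side of that would look something like the following. This is a sketch only: the queue name "interactive" and the node property "inter" are made up here, and the neednodes trick varies a bit between OpenPBS derivatives, so check your own qmgr and nodes-file documentation.

# 1. Tag the nodes that should take interactive work with a property in the
#    server's nodes file ($PBS_HOME/server_priv/nodes), e.g.:
#       node003 np=4 inter
#       node004 np=4 inter
# 2. Create an execution queue tied to that property:
qmgr -c "create queue interactive queue_type=execution"
qmgr -c "set queue interactive resources_default.neednodes = inter"
qmgr -c "set queue interactive enabled = true"
qmgr -c "set queue interactive started = true"
# 3. The wrapper script then only has to add "-q interactive" whenever it sees "-I".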

You could also create a different qsub wrapper script that adds a resource component to the selection directive of non-interactive jobs, asking for nodes that have a certain resource defined. This would force non-interactive jobs to be placed only on the set of nodes that fulfills that resource request. Do you know how to create resources?
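To give a rough picture of the idea (the property name "batch" is just an example, and the exact steps depend on your OpenPBS version): tag the nodes you want your own work confined to, then ask for that property in your submissions.

# In the server's nodes file ($PBS_HOME/server_priv/nodes), add a property to the
# nodes your batch jobs may use, e.g.:
#       node001 np=4 batch
#       node002 np=4 batch
#       node005 np=4 batch
# (restart pbs_server so it rereads the file)

# A submission that asks for that property will then never land on node003/node004:
qsub -l nodes=1:batch myjob.sh

In your case that would let you steer your own jobs away from the interactive nodes without touching the other user's workflow at all.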


I'm pretty certain that we are not using PBS Professional.

Unfortunately, the other user's jobs are being kicked off from the interface of a third-party piece of software that I don't believe we are able to modify. The user is not very Unix literate and would much rather push a button to make it go than edit a file specifying requested resources and submit several jobs. If it starts becoming a major problem, then that will have to be the case, but for right now I'm just trying to work around it myself.

I thought about making different queues that contain only the nodes I want to use, but that seemed like more effort than I wanted to deal with at the moment, since I'd have to make a new queue each time it was an issue; the nodes the other jobs are on could change each time.

Your last point about adding resources to nodes sounds like a possibility, though it seems somewhat similar to the previous option of making a new queue: the nodes with the new resource would have to be redefined each time I wanted to run something, since the nodes busy with the interactive jobs could change from day to day. It also sounds like it's limited to someone with the privileges to set up PBS queues and resources. No, at the moment I don't know how to add resources to nodes; I'll have to research it some.


Well, my first suggestion is that you get PBS Professional. You can download a 30-day free trial directly from our website.

I think setting the nodes to be used exclusively for interactive jobs would probably be the easiest way. This would automatically force subsequent jobs onto nodes other than the ones being used for interactive work. It wouldn't require any knowledge on the end user's part, and over time job placement would flow naturally.

In PBS Pro you could simply set the node sharing attribute to force_excl for all nodes, or create a job submission hook that looks for interactive jobs and sets only those jobs to use their nodes exclusively by appending "-lplace=excl" to the resource request. This in effect blocks any new jobs from landing on interactively used nodes.

In OpenPBS you could create a script that wraps qsub, checks whether a user submits a job with "-I", and then changes the resource request to take a whole node.
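The wrapper itself can stay very small. A sketch follows; the path to the real qsub and the 4-core count are assumptions, and you'd want to decide how to handle interactive jobs that already carry their own -l nodes request:

#!/bin/sh
# Sketch of a qsub wrapper: the real qsub is assumed to have been renamed to
# qsub.real, with this script installed in its place on the users' PATH.
REAL_QSUB=/usr/local/pbs/bin/qsub.real

for arg in "$@"; do
    if [ "$arg" = "-I" ]; then
        # Interactive job: request a whole node (4 cores here) so no other
        # job can be scheduled alongside it.
        exec "$REAL_QSUB" -l nodes=1:ppn=4 "$@"
    fi
done

exec "$REAL_QSUB" "$@"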

In both cases this would block subsequent jobs from landing on nodes where an interactive job was requested, because all the resources for that node would be in use. This would not require any changes to the third-party software (or creation of queues or resources), provided it is using qsub to submit jobs at some point. Typically the "integration" is some kind of script that launches a job via qsub rather than true API integration. What software is it?

