speleolinux

How to prevent new jobs from starting on specific nodes

Hi

I have some nodes that need a reboot. I want to stop new jobs from starting on those specific nodes, so that once the currently running jobs have finished the nodes will be idle and I can reboot them. I don't want existing jobs to be ended, just new jobs not started. The manual for qhold suggests that might be what I should use, but it's not clear.

Also what's the best way to stop ALL new jobs from starting but allow current jobs to continue? Setting dedicated_time to a short time in the future still allows new jobs to start.

Thanks

Mike


Have you looked at setting the node offline?

pbsnodes -o <nodename>

or you can use

qmgr -c "set node <nodename> state = offline"


If I put a node offline, what will happen to existing jobs? I thought that would stop existing jobs when they tried to communicate back to the head node. Also, by keeping the node online I can add a comment such as "Note: not accepting new jobs." so that users with jobs on that node won't worry about them, since they won't see the node as down.


Job(s) running on the "offline" node will continue to execute. Setting a node to "offline" will NOT stop the executing job(s) on the node.

From the reference guide:

pbsnodes -o <nodename>

Marks listed hosts as OFFLINE even if currently in use. This is different from being marked DOWN. A host that is marked OFFLINE will continue to execute the jobs already on it, but will be removed from the scheduling pool (no more jobs will be scheduled on it.)

For hosts with multiple vnodes, pbsnodes operates on a host and all of its vnodes, where the hostname is resources_available.host, which is the name of the natural vnode. To offline a single vnode in a multi-vnoded system, use:

qmgr -c "set node <nodename> state=offline"

If you want to set the node comment at the same time as offlining the node, you can run:

pbsnodes -C "Note: not accepting new jobs" -o <nodename> [<nodename2> ...]

With qmgr, you would set the comment with:

qmgr -c "set node <nodename> comment = 'Note: not accepting new jobs'"
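
The steps above could be wrapped in a small shell helper. This is only a rough sketch, not a tested procedure. The function names (run, drain_node, has_jobs, pause_scheduling) and the PBS_DRY_RUN variable are made up for illustration. has_jobs assumes that pbsnodes <nodename> prints a "jobs = ..." line while jobs are on the node, so check that against the output on your own system. pause_scheduling is not something discussed in this thread. It assumes that turning off the server's scheduling attribute is an acceptable way to stop ALL new jobs from starting while running jobs carry on.

```shell
#!/bin/sh
# Sketch only: helper names and PBS_DRY_RUN are hypothetical, not part of PBS.

# Run a command, or only print it when PBS_DRY_RUN=1.
# (The printout does not preserve the original quoting.)
run() {
    if [ "${PBS_DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Offline a node and attach a comment in one step.
# $1 = node name, $2 = optional comment text.
drain_node() {
    run pbsnodes -C "${2:-Note: not accepting new jobs}" -o "$1"
}

# Read `pbsnodes <nodename>` output on stdin and succeed if a
# "jobs = ..." line is present (assumed output format).
has_jobs() {
    grep -Eq '^[[:space:]]*jobs[[:space:]]*='
}

# Assumption, not from the thread: stop the scheduler from starting
# any new jobs cluster-wide; running jobs are left alone.
pause_scheduling() {
    run qmgr -c "set server scheduling = False"
}
```

For example, run drain_node node01, then later check pbsnodes node01 | has_jobs || echo "node01 is idle" before rebooting. Setting PBS_DRY_RUN=1 first prints the commands instead of running them.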


Thank you Scott. I have now done that for the node. It was better that I checked here before doing something wrong and ending users' jobs :-)

Thanks
