I have some nodes that need a reboot. I want to stop new jobs from starting on those specific nodes, so that once the currently running jobs have finished, nothing is running on those nodes and I can reboot them. I don't want existing jobs to be ended, just new jobs not started. The manual for qhold suggests that might be what I should use, but it's not clear.

Also, what's the best way to stop ALL new jobs from starting but allow current jobs to continue? Setting dedicated_time to a short time in the future still allows new jobs to start.



If I put a node offline, what will happen to existing jobs? I thought that would stop existing jobs when they tried to communicate back to the head node. Also, by keeping the node online I can add a comment "Note not accepting new jobs." and users with jobs on that node won't worry about their jobs, as they won't see that the node is down.

Jobs running on an "offline" node will continue to execute. Setting a node to "offline" will NOT stop the jobs already executing on that node.

From the reference guide:

pbsnodes -o <nodename>

Marks listed hosts as OFFLINE even if currently in use. This is different from being marked DOWN. A host that is marked OFFLINE will continue to execute the jobs already on it, but will be removed from the scheduling pool (no more jobs will be scheduled on it.)

For hosts with multiple vnodes, pbsnodes operates on a host and all of its vnodes, where the hostname is resources_available.host, which is the name of the natural vnode. To offline a single vnode in a multi-vnoded system, use:

qmgr -c "set node <nodename> state=offline"

If you want to set the node comment at the same time as offlining the node, you can run:

pbsnodes -C "Note: not accepting new jobs" -o <nodename> [<nodename2> ...]

With qmgr, you would set the comment with:

qmgr -c "set node <nodename> comment = 'Note: not accepting new jobs'"
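
If you have several nodes to drain, the pbsnodes command above could be wrapped in a small helper. This is only a rough sketch: the function name drain_nodes and the DRY_RUN switch are made up for illustration and are not part of PBS. By default it just prints the command it would run, so you can check it before touching the cluster.

```shell
#!/bin/sh
# Sketch: mark nodes OFFLINE with a comment so running jobs finish
# but no new jobs are scheduled on them.
# drain_nodes and DRY_RUN are hypothetical names, not PBS features.

DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command, 0 = actually run it

drain_nodes() {
    comment="Note: not accepting new jobs"

    # Refuse to run without at least one node name.
    if [ "$#" -eq 0 ]; then
        echo "usage: drain_nodes <nodename> [<nodename2> ...]" >&2
        return 1
    fi

    if [ "$DRY_RUN" = "1" ]; then
        # Show what would be executed without changing any node state.
        printf '%s\n' "pbsnodes -C \"$comment\" -o $*"
    else
        # Same command as quoted from the reference guide above.
        pbsnodes -C "$comment" -o "$@"
    fi
}
```

For example, "drain_nodes node01 node02" prints the pbsnodes line for those two (placeholder) hostnames, and setting DRY_RUN=0 beforehand runs it for real. Once a node no longer shows any running jobs, it should be safe to reboot.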

