Torque PBS on a workstation fails to start


I am trying to install the Torque PBS system on my workstation (2x E5-2680 v3, 4x GTX TITAN X). I installed it as follows:


(1) ./configure --without-tcl --with-nvidia-gpus --enable-nvidia-gpus --prefix=/soft/torque-5.1.1 --with-nvml-include=/usr/local/cuda/gpukit/usr/include/nvidia/gdk --with-nvml-lib=/usr/local/cuda/lib64

(2) make && make install

(3) set path=(/soft/torque-5.1.1/bin $path)
set path=(/soft/torque-5.1.1/sbin $path)

(4) # vi /etc/hosts, edited as follows:
localhost cudaC
xx.xx.xx.xx torqueserver

(5) # ./torque.setup albert torqueserver

(6) # vi /var/spool/torque/server_priv/nodes
cudaC np=4

(7) # cd /soft/torque-5.1.1/sbin
# ./pbs_mom


(8) # pbsnodes, to check the status:
state = down
power_state = Running
np = 12
ntype = cluster
mom_service_port = 15002
mom_manager_port = 15003

The node is reported as down, so it seems the service does not start.
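
A small filter makes the node state easier to spot in longer pbsnodes listings. This is only a minimal sketch, assuming pbsnodes prints each node name on an unindented line followed by indented "key = value" attributes. `node_states` is a hypothetical helper name, not part of Torque.

```shell
# Hypothetical helper: read pbsnodes-style output on stdin and
# print "<node>: <state>" for every node found.
node_states() {
    awk '
        # An unindented line is assumed to be a node name.
        /^[^ \t]/ { node = $1 }
        # An attribute line of the form "state = <value>".
        $1 == "state" && $2 == "=" { print node ": " $3 }
    '
}
```

It would be used as `pbsnodes -a | node_states`. A node shown as down usually means pbs_server is not receiving status updates from pbs_mom on that host, so a reasonable next step is to check that both daemons are running and to look at the MOM log (assumed here to be under /var/spool/torque/mom_logs/).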


If I configure /var/spool/torque/server_priv/nodes as follows:


node01.cudaC np=12
node02.cudaC np=12
node03.cudaC np=12
node04.cudaC np=12

and then run # pbsnodes, it fails with the message:


pbsnodes: Server has no node list MSG=none of the nodes in the 'server_priv/nodes' file resolves to a valid address
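
The message says that none of the names in the nodes file resolves to an address, and this can be checked directly. The following is a minimal sketch, assuming a Linux system with `getent` and a nodes file that has one node per line with the host name as the first field. `node_names` and `check_nodes` are hypothetical helper names, not Torque commands.

```shell
# Hypothetical helper: print the first field (the host name) of every
# line in the given nodes file, skipping blank lines and lines whose
# first field starts with "#".
node_names() {
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# Hypothetical helper: report whether each node name resolves through
# the system resolver (which consults /etc/hosts and DNS as configured).
check_nodes() {
    for name in $(node_names "$1"); do
        if getent hosts "$name" >/dev/null 2>&1; then
            echo "$name: resolves"
        else
            echo "$name: does NOT resolve"
        fi
    done
}
```

Running `check_nodes /var/spool/torque/server_priv/nodes` would then list which names need an entry in /etc/hosts or DNS. Names such as node01.cudaC are expected to fail unless they have been added there.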

Does anybody have an idea what the problem is?

Thanks a lot.

