albumns

  1. pbs failed in workstation.

    I am trying to install the Torque PBS system on my SUSE LINUX 13.2 workstation (2x E5-2680 v3, 4x GTX TITAN X). I installed it as follows:

    (1) ./configure --without-tcl --with-nvidia-gpus --enable-nvidia-gpus --prefix=/soft/torque-5.1.1 --with-nvml-include=/usr/local/cuda/gpukit/usr/include/nvidia/gdk --with-nvml-lib=/usr/local/cuda/lib64

    (2) make && make install

    (3) set path=(/soft/torque-5.1.1/bin $path)
        set path=(/soft/torque-5.1.1/sbin $path)

    (4) # vi /etc/hosts, with the following contents:
        127.0.0.1 localhost cudaC
        xx.xx.xx.xx torqueserver

    (5) # ./torque.setup albert torqueserver

    (6) # vi /var/spool/torque/server_priv/nodes, with the following contents:
        cudaC np=4

    (7) # cd /soft/torque-5.1.1/sbin
        # ./pbs_server
        # ./pbs_sched
        # ./pbs_mom

    (8) # pbsnodes, to check the status:
        cudaC
            state = down
            power_state = Running
            np = 12
            ntype = cluster
            mom_service_port = 15002
            mom_manager_port = 15003

    The node is reported as down, so it seems that the service doesn't start.

    If I instead configure /var/spool/torque/server_priv/nodes as follows:

        node01.cudaC np=12
        node02.cudaC np=12
        node03.cudaC np=12
        node04.cudaC np=12

    and then run # pbsnodes, it fails with the message:

        pbsnodes: Server has no node list MSG=none of the nodes in the 'server_priv/nodes' file resolves to a valid address

    Does anybody have any idea what the problem is? Thanks a lot.
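The second error message says that none of the names in server_priv/nodes resolves to an address, so one way to narrow this down might be to test each name in that file against the system resolver. The following is only a minimal sketch, not part of Torque: the function names node_names, resolve_host, and check_nodes are hypothetical, and it assumes that each node entry starts with the hostname and that lines starting with # are comments. It relies on the standard getent hosts lookup, which consults /etc/hosts and DNS through NSS.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with Torque) for checking a nodes file.

# Print the hostname (first field) of every non-blank, non-comment line.
node_names() {
    awk 'NF && $1 !~ /^#/ { print $1 }' "$1"
}

# Succeed if the name resolves through the system resolver (/etc/hosts, DNS).
resolve_host() {
    getent hosts "$1" >/dev/null 2>&1
}

# Report OK or FAIL per node; return non-zero if any name fails to resolve.
check_nodes() {
    status=0
    for name in $(node_names "$1"); do
        if resolve_host "$name"; then
            echo "OK   $name"
        else
            echo "FAIL $name"
            status=1
        fi
    done
    return $status
}
```

After sourcing the file, it could be run as check_nodes /var/spool/torque/server_priv/nodes. With the /etc/hosts shown in step (4), names such as node01.cudaC have no entry there, so they would be expected to show up as FAIL unless DNS knows them.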