  • Announcements

    • admin

      PBS Forum Has Closed (06/12/17)

      The PBS Works Support Forum is no longer active. For PBS community-oriented questions and support, please join the discussion at http://community.pbspro.org. Any new security advisories related to commercially-licensed products will be posted in the PBS User Area (https://secure.altair.com/UserArea/).

vjm

Members
  • Content count: 2

  1. release the reserved resources

    Hi Jeff,

    Currently the short answer to your question is no: reservations won't automatically end early when the jobs they contain finish early (as they often do in practice). The rationale for this behavior lies in the fact that some of our users use reservations to set aside nodes for service tasks (patching nodes and the like), and ending the reservation earlier than intended would be a problem for them. The good news is that work is under way to enhance the product, and you should see improvements over the next release or two. The internal ticket number for this enhancement is 268750; if you follow up with us in the future, feel free to use that number to find out more about its status.

    Starting in 12.2 there is an unsupported set of PBS Python library bindings, available in the unsupported directory, that makes scripting these kinds of workarounds much easier than in other scripting languages. Below is an example script that you could run periodically through cron (see the crontab example after this post) to do the trick:

        import sys
        from ptl.lib.pbs_testlib import *

        s = Server()

        # Collect all reservations that are currently running.
        resvs = s.status(RESV, {'reserve_state': (MATCH_RE, 'RESV_RUNNING|5')})
        if not resvs:
            sys.exit(0)
        rids = [r['id'] for r in resvs]

        # Keep every reservation that still has at least one job in it.
        jobs = s.status(JOB, 'reserve_ID')
        for job in jobs:
            if job.get('reserve_ID') in rids:
                rids.remove(job['reserve_ID'])

        # What remains are running reservations with no jobs: delete them.
        for rid in rids:
            s.delete(rid)

    I hope this helps - Vincent
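    To run the script unattended as suggested, a crontab entry along the following lines would do; the ten-minute interval and the script path are illustrative placeholders only:

        */10 * * * * /usr/bin/python /opt/site/bin/release_empty_resvs.py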
  2. PBS in the Cloud

    Hi there,

    Setting up a cloud-provisioned environment is definitely achievable with PBS Pro; I have done it with AWS instances. The approach is a bit heavy-handed on the configuration side as opposed to a turnkey solution, primarily because PBS Pro won't allow adding nodes that aren't resolvable, nor reflect the state of a node that may be available but that we would want to offload work to only under specific conditions. There are also some security considerations that aren't addressed in PBS Pro's most common environment, traditional HPC installations.

    In a nutshell, the steps call for:

    1) writing scripts that will bring remote nodes online (e.g., AWS instances)
    2) configuring end-to-end open networking, i.e., resolving any firewall traversal that would block communication on the PBS ports (defaults 1500[1-10]) and on scp for file staging
    3) configuring a flat uid/user namespace across the local and remote servers

    Then, based on policy dictating when a workload should trigger cloud provisioning (which could be monitored through a cron script):

    4) invoke the provisioning mechanism
    5a) associate the provisioned nodes with a queue (e.g., cloudq), -OR-
    5b) create a reservation on the nodes in that queue to control the provisioning lease expiration
    6) move the workload into that queue (see the sketch after this post for steps 4-6)

    Tear-down would consist of removing the nodes and optionally deleting the queue. The traffic between the local server and the cloud MoM is only as secure as the link between those two endpoints, so unless an encrypted channel is configured, the data sent may be compromised.

    Cheers, Vincent
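    As a concrete illustration of steps 4-6 above, here is a minimal sketch that drives the standard qmgr and qmove commands from Python. The provisioning script, node name, queue name, and job ID are hypothetical placeholders to adapt to your site:

        # Minimal sketch of steps 4-6: provision a cloud node, attach it to a
        # dedicated queue, and move a waiting job there. All names below are
        # site-specific placeholders, not fixed values.
        import subprocess

        def run(cmd):
            # Run a command, raising CalledProcessError on failure.
            subprocess.check_call(cmd, shell=True)

        # 4) invoke the provisioning mechanism (site-specific, e.g. an AWS script)
        run('/opt/site/bin/provision_cloud_node.sh ip-10-0-0-12')

        # 5a) associate the provisioned node with a dedicated queue
        run('qmgr -c "create queue cloudq queue_type=execution"')
        run('qmgr -c "set queue cloudq enabled=true"')
        run('qmgr -c "set queue cloudq started=true"')
        run('qmgr -c "create node ip-10-0-0-12"')
        run('qmgr -c "set node ip-10-0-0-12 queue=cloudq"')

        # 6) move a queued job into the cloud queue
        run('qmove cloudq 1234.headnode')

    Shelling out to the PBS command-line tools keeps the sketch independent of the unsupported PTL bindings shown in the previous post; the same steps could equally be driven through those bindings.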