jerryleo

group name issue in the job log


Hi,

We are running PBSPro_12.1.1.131502. For one particular user, the job log records '-default-' as the group name instead of the user's actual group, like this:

20170110:01/10/2017 21:52:26;E;553720.sdb;user=songgt group=-default- project=_pbs_project_default accounting_id="0x600026632" jobname=roms queue=workq ctime=1484083146 qtime=1484083146 etime=1484083146 start=1484083146 exec_host=login1/0+login1/1+login1/2+login1/3+login1/4+login1/5+login1/6+login1/7+login1/8+login1/9+login1/10+login1/11+login1/12+login1/13+login1/14+login1/15 exec_vnode=(clogin86_8_1:ncpus=1)+(clogin86_8_1:ncpus=1)+(clogin86_8_1:ncpus=1)+(clogin86_8_1:ncpus=1)+(clogin86_8_1:ncpus=1)+(clogin86_8_1:ncpus=1)+(clogin86_8_1:ncpus=1)+(clogin86_8_1:ncpus=1)+(clogin86_8_0:ncpus=1)+(clogin86_8_0:ncpus=1)+(clogin86_8_0:ncpus=1)+(clogin86_8_0:ncpus=1)+(clogin86_8_0:ncpus=1)+(clogin86_8_0:ncpus=1)+(clogin86_8_0:ncpus=1)+(clogin86_8_0:ncpus=1) Resource_List.arch=XT Resource_List.mppwidth=16 Resource_List.ncpus=16 Resource_List.nodect=16 Resource_List.place=free Resource_List.select=16:vntype=cray_compute Resource_List.walltime=02:00:00 session=5842 alt_id=439506 end=1484085146 Exit_status=0 resources_used.cpupercent=7 resources_used.cput=00:00:19 resources_used.mem=7036kb resources_used.ncpus=16 resources_used.vmem=135980kb resources_used.walltime=00:33:17 run_count=1

 

I have confirmed that the group exists and that the user is a member of it.

No other user has this problem; it affects only this one user.

I have no idea what's wrong. Any ideas?

Thanks for your time

Regards

 


The pbs_server sets group=-default- for a job when the system function getpwnam(userid) returns NULL. Was this user created after pbs_server was last started, or added to the group since then? Something may also be caching old information (if nscd is in use, try restarting it). You can test getpwnam() with the following simple program:

 

#include <errno.h>
#include <pwd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

int main(int argc, char *argv[])
{
    const char *name;
    char command[256];
    struct passwd *pwdp;

    if (argc != 2) {
        fprintf(stderr, "usage: %s username\n", argv[0]);
        return 2;
    }
    name = argv[1];

    /* getpwnam() sets errno only on a real error, so clear it first. */
    errno = 0;
    pwdp = getpwnam(name);

    if (pwdp == NULL) {
        /* errno == 0 here means "no such user", not a lookup failure. */
        if (errno != 0)
            printf("getpwnam() error: %s\n", strerror(errno));
        printf("No Password Entry for User %s\n", name);
    } else {
        printf("Password Entry for User FOUND for %s (gid %ld)\n",
               name, (long)pwdp->pw_gid);
    }

    /* Cross-check against what getent reports for the same name. */
    snprintf(command, sizeof(command), "getent passwd %s", name);
    system(command);

    return pwdp == NULL ? 1 : 0;
}
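
To make the fallback concrete, here is a minimal sketch of how a lookup of this kind could end up with "-default-". This is not the PBS source code: lookup_group_name() is a hypothetical helper, and the numeric-gid fallback is my own assumption. The only part taken from the explanation above is that a NULL from getpwnam() leads to the "-default-" group.

```c
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not PBS code): return the primary group name for
 * `user`, or "-default-" when getpwnam() cannot find the user on this host.
 */
const char *lookup_group_name(const char *user, char *buf, size_t buflen)
{
    struct passwd *pw = getpwnam(user);
    if (pw == NULL)
        return "-default-";      /* user unknown on this node */

    struct group *gr = getgrgid(pw->pw_gid);
    if (gr == NULL) {
        /* Assumption: fall back to the numeric gid if the group is unknown. */
        snprintf(buf, buflen, "%ld", (long)pw->pw_gid);
        return buf;
    }

    snprintf(buf, buflen, "%s", gr->gr_name);
    return buf;
}
```

Called on the node where pbs_server runs, with a small char buffer, this shows which group name that node can resolve for the user. The result depends on that node's own passwd and group sources, not on the node where the job was submitted.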

 


Hi,

Thanks for your kind input.

I found the root cause. The /etc/passwd files on the login node and the PBS server node are not consistent: the user's account was missing from /etc/passwd on the PBS server node.
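
For anyone who wants to check for this kind of mismatch, a rough sketch follows. It tests whether a user name has an entry in a passwd-format file, so you could run it against copies of /etc/passwd taken from each node. passwd_file_has_user() is a hypothetical helper name, and it only reads the file it is given, so it will not see accounts that come from NIS, LDAP or other name services.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `user` has an entry in the
 * passwd-format file at `path`, 0 if not, -1 if the file cannot be opened.
 */
int passwd_file_has_user(const char *path, const char *user)
{
    char line[1024];
    size_t ulen = strlen(user);
    int at_line_start = 1;   /* guards against matching inside a long line */
    int found = 0;
    FILE *fp;

    if (ulen == 0)
        return 0;

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* An entry starts with "name:"; compare the whole name field. */
        if (at_line_start &&
            strncmp(line, user, ulen) == 0 && line[ulen] == ':') {
            found = 1;
            break;
        }
        /* The next chunk starts a new entry only if this one ended in '\n'. */
        at_line_start = (strchr(line, '\n') != NULL);
    }

    fclose(fp);
    return found;
}
```

Running it for the same user against the file from the login node and the file from the PBS server node should return 1 and 0 respectively in a situation like the one described here.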

I am wondering why the user could still submit and run jobs without any problem even though the account was missing from /etc/passwd on the PBS server node. Is something wrong with the system settings or configuration?

Any further comments would be appreciated.

Regards

Jerry
