Error: no HOME environment variable

Greetings,

In my OnDemand installation I keep getting this error. I tried sourcing ~/.bashrc in the script wrapper under batch_connect, but no luck. Where else is the HOME environment variable set in the setup that I could be missing?

Setting VNC password…
Error: no HOME environment variable
Starting VNC server…
vncserver: The HOME environment variable is not set.
(repeated 20 times)
Cleaning up…
vncserver: The HOME environment variable is not set.

Hi and welcome!

I wonder if you can just force HOME to be set. Does your ~/.bashrc load /etc/profile? You may need to source the global files, which then source your own; HOME may be getting set in those global /etc files.
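For what it's worth, the lookup a login shell would do to recover HOME can also be done directly from the passwd database. A minimal Ruby sketch of that fallback (OnDemand itself is Ruby; this is an illustration, not something OnDemand ships):

```ruby
require 'etc'

# Fall back to the passwd entry's home directory when the scheduler
# didn't export HOME into the job environment.
ENV['HOME'] ||= Etc.getpwuid(Process.uid).dir
puts ENV['HOME']
```

The shell equivalent would be a `getent passwd` lookup in the script wrapper itself.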

Greetings,

I have a separate issue now that's making it hard for me to test. Every time I change my config in clusters.d, the change doesn't show up in the app, even after restarting the service. Is there a way to explicitly reload the config I updated in clusters.d?

No, we cache all those files at startup, so you'll have to bounce your PUN every time you make a change.

OK, gotcha. It seems I have another, separate issue, and I can't see where this environment variable is being set. It's not in /etc/ood/config/hook.env, but I get this when I restart the PUN for my username:

environment name contains a equal : {:OOD_JOB_NAME_ILLEGAL_CHARS=>"/"}

Is there a system cache for it I need to delete? I deleted all the tmp files for the PUN.

Can you send a screenshot? My suspicion is that your scheduler is complaining about environment variables when we submit.

/etc/ood/config/hook.env only applies to the environment of the hook itself. A PUN on the OnDemand instance is started with very little in the way of an environment, but that's beside the point anyhow. Your issue seems to be on the compute node during the job's execution. This has nothing to do with the environment in the PUN; your scheduler is responsible for setting up the job's execution environment.

What scheduler are you using?

Sure, here's what I get when I log in. It happens right at login, before I try grabbing a compute node.

We're using Slurm.

Screen Shot 2021-12-01 at 7.13.39 PM

Um… OK. Not sure how that happened. Let's do a spot check on your configuration; something must be upsetting it.

Did you set that env variable OOD_JOB_NAME_ILLEGAL_CHARS? I think I can replicate it:

2.7.1 :016 > env = {'LASDF=' => 'sdflsdfn'}
2.7.1 :017 > Open3.capture3(env, 'ls')

It gives this error:

ArgumentError (environment name contains a equal : LASDF=)
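Outside IRB, the same failure can be reproduced with a short standalone script (the variable names here are arbitrary placeholders):

```ruby
require 'open3'

# Keys in the env hash must be plain variable names; a key that itself
# contains '=' makes Process.spawn raise before the command even runs.
begin
  Open3.capture3({ 'LASDF=' => 'sdflsdfn' }, 'true')
rescue ArgumentError => e
  puts e.message # e.g. "environment name contains a equal : LASDF="
end

# A well-formed name works and is visible to the child process:
out, _err, _status = Open3.capture3({ 'GOODNAME' => 'ok' }, 'sh', '-c', 'echo "$GOODNAME"')
puts out
```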

There could be some bug here that you've uncovered, though I don't quite know how this is happening. I'll have to see what you're trying to configure.

Let me know about any /etc/ood environment modifications you’re trying.

Very early on when I was setting this up, I put OOD_JOB_NAME_ILLEGAL_CHARS="/" in the /etc/ood/config/hook.env file, but removed it when I realized it didn't need to be there. Now the only place that value is set is /etc/ood/config/nginx_stage.yml, so I can't see where it's coming from when it generates the PUN config, /var/lib/ondemand-nginx/config/puns/user.conf.

It reappears in the file every time I bounce the PUN; not sure how it keeps getting there:

env USER;
env LOGNAME;
env ONDEMAND_VERSION;
env ONDEMAND_PORTAL;
env ONDEMAND_TITLE;
env SECRET_KEY_BASE;
env NGINX_FILE_UPLOAD_MAX;
env OOD_DASHBOARD_TITLE;
env OOD_PORTAL;
env OOD_DEV_APPS_ROOT;
env OOD_FILES_URL;
env OOD_EDITOR_URL;
env RAILS_LOG_TO_STDOUT;
env {:OOD_JOB_NAME_ILLEGAL_CHARS=>"/"};
env PATH;
env LD_LIBRARY_PATH;
env X_SCLS;
env MANPATH;
env PCP_DIR;
env PERL5LIB;
env PKG_CONFIG_PATH;
env PYTHONPATH;
env XDG_DATA_DIRS;
env SCLS;
env RUBYLIB;

If it's set in nginx_stage.yml, then it's coming from nginx_stage.yml. It should be set like this — can you share how you're setting it?

pun_custom_env:
  OOD_JOB_NAME_ILLEGAL_CHARS: "/"

Yes, it's in there like this:

pun_custom_env:
  - OOD_JOB_NAME_ILLEGAL_CHARS: "/"

Edit: yep, looks like that dash I must have added was doing it. Removed it and I can log in again.
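For anyone hitting the same thing: the stray dash changes the YAML type. With the dash, pun_custom_env parses as an Array of one-entry Hashes rather than a Hash of name/value pairs, so code that expects each env name to be a String instead gets a whole Hash, and stringifying that Hash is exactly the {:KEY=>"/"} text that ended up in user.conf (the keys are evidently symbolized when the config is loaded, hence the :SYMBOL form). A quick check:

```ruby
require 'yaml'

# Incorrect form: the dash makes a block sequence (Array)
with_dash = YAML.safe_load(<<~YML)
  pun_custom_env:
    - OOD_JOB_NAME_ILLEGAL_CHARS: "/"
YML

# Correct form: a plain mapping (Hash) of name => value
without_dash = YAML.safe_load(<<~YML)
  pun_custom_env:
    OOD_JOB_NAME_ILLEGAL_CHARS: "/"
YML

puts with_dash['pun_custom_env'].class     # Array
puts without_dash['pun_custom_env'].class  # Hash

# Stringifying the array element reproduces the bogus env "name":
puts with_dash['pun_custom_env'].first.to_s
```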

Well, progress. It turned out $HOME wasn't set in the regular /etc/profile, and since we did a lot of custom modifications for our environment, I had to make a second file, /etc/profile-portal, that gets sourced for batch_connect in the script wrapper:

  batch_connect:
    basic:
      script_wrapper: |
        #!/bin/bash
        set +o posix
        source /etc/profile-portal
        module purge
        %s
      set_host: "host=$(hostname -A | awk '{print $1}')"
    vnc:
      script_wrapper: |
        #!/bin/bash
        set +o posix
        source /etc/profile-portal
        module purge
        export PATH="/opt/TurboVNC/bin:$PATH"
        export WEBSOCKIFY_CMD="/apps/websockify/0.10.0/run"
        %s
      set_host: "host=$(hostname -A | awk '{print $1}')"

Now I'm getting this, so if there's a separate topic that addresses it, I can look at that, or I'll start a new topic.

Could be! I can't quite tell what's going on here, but yeah, I'd say opening a new topic is the best bet.
