Can submit job from OOD host directly, but not from Job Composer


I am working with a new OOD 2.0 deployment on RHEL 8.4 and am running into issues with Job Composer being unable to submit jobs. Here is the error message that shows up in the browser:

An error occurred when submitting jobs for simulation 3: pbsconf error: pbs conf variables not found: PBS_HOME
No such file or directory
qsub: cannot connect to server  (errno=0)

I’m not sure how to track this error down. I can see the error message in the PUN logs, but it doesn’t add much information (actual full paths omitted):

App 37849 output: [2021-09-29 13:11:27 -0500 ]  INFO "execve = [{\"PBS_DEFAULT\"=>\"headnode\", \"PBS_EXEC\"=>\"/path/to/openpbs/20.0.1\"}, \"/path/to/openpbs/20.0.1/bin/qsub\", \"-j\", \"oe\"]"
App 37849 output: [2021-09-29 13:11:27 -0500 ] ERROR "An error occurred when submitting jobs for simulation 3: pbsconf error: pbs conf variables not found: PBS_HOME\nNo such file or directory\nqsub: cannot connect to server  (errno=0)"

The weird thing is that I can submit this same job from the OOD host, with the same user account, via the terminal. Furthermore, when I launch a terminal session on the OOD host, PBS_HOME, PBS_EXEC, and PBS_CONF_FILE are all set appropriately. I also have the correct paths set in /etc/ood/config/clusters.d/cluster.yml:

    adapter: "pbspro"
    host: "headnode"
    exec: "/path/to/openpbs/20.0.1"

The last two lines match the contents of PBS_CONF_FILE, which works on this and other nodes via the terminal.

Any ideas what could be wrong?

I’d check your configs on the deployment that does work.

The PUN starts up with a very limited environment, not one you’d see in a bash shell. Which is to say, when the PUN starts up, it doesn’t source things from /etc/profile.d like bash or other shells do.

If the values are static, you can add the missing variables to nginx_stage.yml under pun_custom_env:

    # Custom environment variables to set for the PUN environment
    # Below is an example of the use for setting env vars.
    # pun_custom_env:
    #   OOD_DASHBOARD_TITLE: "Open OnDemand"
    #   OOD_BRAND_BG_COLOR: "#53565a"
Thanks, Jeff!

Manually defining PBS_HOME in nginx_stage.yml did the trick. Now I have to figure out why that wasn’t necessary on our other instance…