I have been trying to get linux_host working in OnDemand and have had to make changes to files in clusters.d. It seems that if you use an existing file as a template, such as a working Slurm cluster file, and then change it, OnDemand continues to think it's a Slurm cluster and ignores the changes. Some filenames also do not seem to work, and the cluster file can be ignored completely. It's a really frustrating issue, since it is hard to get things working when I cannot be sure whether it is reading the latest version.
Is there some mechanism that would cause this? I got one cluster file working fine with Slurm, but trying to add another one to get linux_host working for VNC has really confused me. When I configure the Desktop app through /etc/ood/config/apps/bc_desktop/test.yml, it sometimes shows up in the dashboard, but if I give it another name and sync that name with the filename in /etc/ood/config/clusters.d, the Interactive Apps menu option can disappear from the dashboard. Any advice? Is there a way of resetting the information it remembers?
Tom, below is our cluster config for the Linux Host Adapter. Note that if you have multiple Linux hosts, OOD currently assumes IP round robin across them, as is often done for cluster interactive nodes, so our workaround for now is to have a separate cluster config for each host. I think I started from the example in the OOD docs, not from an existing Slurm cluster file.
---
v2:
  metadata:
    title: "frisco1"
    url: "https://www.chpc.utah.edu/documentation/guides/frisco-nodes.php"
    hidden: false
  login:
    host: "frisco1.chpc.utah.edu"
  job:
    adapter: "linux_host"
    submit_host: "frisco1.chpc.utah.edu" # This is the head for a login round robin
    ssh_hosts: # These are the actual login nodes, need to have full host name for the regex to work
      - frisco1.chpc.utah.edu
    site_timeout: 7200
    debug: true
    singularity_bin: /uufs/chpc.utah.edu/sys/installdir/singularity3/std/bin/singularity
    singularity_bindpath: /etc,/mnt,/media,/opt,/run,/srv,/usr,/var,/uufs,/scratch
    # singularity_image: /opt/ood/linuxhost_adapter/centos7_lmod.sif
    singularity_image: /uufs/chpc.utah.edu/sys/installdir/ood/centos7_lmod.sif
    # Enabling strict host checking may cause the adapter to fail if the user's known_hosts does not have all the roundrobin hosts
    strict_host_checking: false
    tmux_bin: /usr/bin/tmux
  batch_connect:
    basic:
      script_wrapper: |
        #!/bin/bash
        set -x
        if [ -z "$LMOD_VERSION" ]; then
          source /etc/profile.d/chpc.sh
        fi
        export XDG_RUNTIME_DIR=$(mktemp -d)
        %s
      set_host: "host=$(hostname -s).chpc.utah.edu"
    vnc:
      script_wrapper: |
        #!/bin/bash
        set -x
        export PATH="/uufs/chpc.utah.edu/sys/installdir/turbovnc/std/opt/TurboVNC/bin:$PATH"
        export WEBSOCKIFY_CMD="/uufs/chpc.utah.edu/sys/installdir/websockify/0.8.0/bin/websockify"
        export XDG_RUNTIME_DIR=$(mktemp -d)
        %s
      set_host: "host=$(hostname -s).chpc.utah.edu"
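For the Desktop app side, a matching file under /etc/ood/config/apps/bc_desktop/ is needed as well. Here is a minimal sketch, assuming the cluster config above is saved as /etc/ood/config/clusters.d/frisco1.yml. As far as I understand, OOD derives the cluster ID from that filename, so the `cluster:` value has to match it. The app filename and title below are made up for illustration, and any other site-specific settings are left out.

```yaml
# /etc/ood/config/apps/bc_desktop/frisco1.yml (hypothetical filename)
---
# Label shown in the Interactive Apps menu (illustrative value)
title: "Frisco1 Desktop"
# Assumed to match clusters.d/frisco1.yml without the .yml extension
cluster: "frisco1"
```

If the `cluster:` value does not match any cluster file, the app may not be listed at all, which could be why the Interactive Apps menu disappears after a rename.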
Thanks, that is very useful. I have had some success using private browser windows to clear the cache, and maybe cookies, between changes, so perhaps the browser is caching information and causing the confusing behaviour when the name of the cluster file changes. I might take a step back from trying to access our VNC xinetd server, which listens on a single port, and first get it working by spawning a VNC server for each user on a single host, as described in the documentation.