Strange behaviour when modifying second cluster file in clusters.d

I have been trying to get linux_host working in OnDemand and have had to make changes to files in clusters.d. It seems that if you use an existing file as a template, such as a working Slurm cluster file, and then change it, OnDemand continues to think it's a Slurm cluster and ignores the changes. It also seems that some filenames do not work, and the cluster file is ignored completely. It's a really frustrating issue, since it is hard to get things working when I cannot be sure it is reading the latest version.

Is there some mechanism that would cause this? I got one cluster file working fine with Slurm, but trying to add another one to get linux_host working for VNC has really confused me. The Desktop app, configured through /etc/ood/config/apps/bc_desktop/test.yml, sometimes displays in the dashboard, but if I use another name for it and make that name match the filename in /etc/ood/config/clusters.d, the Interactive Apps menu option can disappear from the dashboard. Any advice? Is there a way of resetting the information it remembers?

Tom, below is our cluster config for the Linux Host Adapter. Note that if you have multiple Linux hosts, OOD currently assumes they sit behind an IP round robin, as is frequently done for cluster interactive nodes, so our workaround for now is to have a separate cluster config for each host. I think I started from the example in the OOD docs, not from an existing Slurm cluster file.

---
v2:
  metadata:
    title: "frisco1"
    url: "https://www.chpc.utah.edu/documentation/guides/frisco-nodes.php"
    hidden: false
  login:
    host: "frisco1.chpc.utah.edu"
  job:
    adapter: "linux_host"
    submit_host: "frisco1.chpc.utah.edu"  # This is the head for a login round robin
    ssh_hosts: # These are the actual login nodes, need to have full host name for the regex to work
      - frisco1.chpc.utah.edu
    site_timeout: 7200
    debug: true
    singularity_bin: /uufs/chpc.utah.edu/sys/installdir/singularity3/std/bin/singularity
    singularity_bindpath: /etc,/mnt,/media,/opt,/run,/srv,/usr,/var,/uufs,/scratch
#    singularity_image: /opt/ood/linuxhost_adapter/centos7_lmod.sif
    singularity_image: /uufs/chpc.utah.edu/sys/installdir/ood/centos7_lmod.sif
    # Enabling strict host checking may cause the adapter to fail if the user's known_hosts does not have all the roundrobin hosts
    strict_host_checking: false
    tmux_bin: /usr/bin/tmux
  batch_connect:
    basic:
      script_wrapper: |
        #!/bin/bash
        set -x
        if [ -z "$LMOD_VERSION" ]; then
          source /etc/profile.d/chpc.sh
        fi
        export XDG_RUNTIME_DIR=$(mktemp -d)
        %s
      set_host: "host=$(hostname -s).chpc.utah.edu"
    vnc:
      script_wrapper: |
        #!/bin/bash
        set -x
        export PATH="/uufs/chpc.utah.edu/sys/installdir/turbovnc/std/opt/TurboVNC/bin:$PATH"
        export WEBSOCKIFY_CMD="/uufs/chpc.utah.edu/sys/installdir/websockify/0.8.0/bin/websockify"
        export XDG_RUNTIME_DIR=$(mktemp -d)
        %s
      set_host: "host=$(hostname -s).chpc.utah.edu"
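
One thing that may explain the disappearing menu: as far as I know, OOD takes the cluster ID from the file name in clusters.d without the .yml extension, and the `cluster` value in the bc_desktop config has to match that ID. A minimal sketch, assuming the config above is saved as /etc/ood/config/clusters.d/frisco1.yml; the bc_desktop file name and the title below are made up for illustration:

```yaml
# /etc/ood/config/apps/bc_desktop/frisco1.yml  (hypothetical file name)
---
title: "Frisco1 Desktop"   # label shown in the dashboard (illustrative)
cluster: "frisco1"         # assumed to match clusters.d/frisco1.yml, without the extension
```

If the `cluster` value points at an ID that does not correspond to any file in clusters.d, the app may simply not be listed, which would fit what you are seeing when you rename things.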

Thanks, that is very useful. I have had some success using private browser windows to clear the cache (and maybe cookies) between changes, so maybe the browser is caching information and causing the confusing behaviour when the name of the cluster file changes. I might take a step back from trying to access our Xinetd VNC server, which is on a single port, and instead get it working by spawning a VNC server on a single host for each user, as described in the documentation.
