Multiple clusters from a single app

I’ll blame Matthew and Brandon from Idaho National Lab for this but am asking the OOD developers for help :wink:

The issue is that INL has successfully modified their Desktop and Jupyter apps to offer multiple clusters in a single app by adding an extra attribute to the app’s form.yml (or the desktop’s cluster.yml), like
pbs_cluster:
  widget: select
  label: "Cluster"
  options:
    - ["Cluster1", "cluster1"]
    - ["Cluster2", "cluster2"]

and then feeding this pbs_cluster variable to the submit.yml.erb:
- "-q"
- "<%= pbs_queue %>@<%= pbs_cluster %>"

We use SLURM, so this PBS solution does not work for us, but we do use a single slurmdbd for all our clusters, so we can cross-submit jobs with the -M flag (sbatch -M cluster1 …).

So, I added a generic cluster to our setup that uses SLURM binaries that work across all our clusters, and I use -M in the submit.yml.erb to direct the job to a specific cluster:
- "-M"
- "<%= slurm_cluster %>"
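
As a rough illustration of how the form value ends up in the scheduler arguments, the sketch below renders a submit.yml.erb-style template with Ruby’s standard ERB library and parses the result. The script/native nesting is assumed from the usual OOD submit.yml.erb layout (the snippet above shows only the two list items), and result_with_hash is only a stand-in for however OOD supplies the form attributes to the template.

```ruby
require "erb"
require "yaml"

# Assumed template layout; only the two list items come from the post above.
TEMPLATE = <<~'ERB_TEMPLATE'
  ---
  script:
    native:
      - "-M"
      - "<%= slurm_cluster %>"
ERB_TEMPLATE

# Render the template with a form value, then parse the YAML it produces.
def native_args(slurm_cluster)
  rendered = ERB.new(TEMPLATE).result_with_hash(slurm_cluster: slurm_cluster)
  YAML.safe_load(rendered).dig("script", "native")
end

p native_args("cluster1")
```

The list that comes back is what gets appended to the sbatch command line, so the cluster picked in the form reaches sbatch but nothing else.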

In the process, I discovered that OOD hard-codes the -M flag at line 279 of gems/ood_core-0.11.3/lib/ood_core/job/adapters/slurm.rb:
args += ["-M", cluster] if cluster

I tried commenting out that line (since I feed the flag in through the submit.yml.erb). The job is then submitted and the app (desktop) starts on the compute node correctly, but OOD’s Interactive Sessions page does not know about the job. I suspect this is because OOD queries SLURM behind the scenes for the job status, and since I removed the -M from the SLURM adapter, commands like squeue no longer get the appropriate cluster name.
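
A minimal sketch of the suspected mismatch follows. These two helpers are hypothetical stand-ins, not ood_core’s actual code; only the `if cluster` line mirrors the one quoted above. The point is that flags from submit.yml.erb are part of the submit call only, while a status query can only use the cluster the adapter itself knows about.

```ruby
# Hypothetical stand-in for how an adapter might assemble the submit command.
def sbatch_args(adapter_cluster, native_args)
  args = []
  args += ["-M", adapter_cluster] if adapter_cluster # mirrors the quoted line
  args + native_args                                 # flags from submit.yml.erb
end

# Hypothetical stand-in for a status query; it never sees the native flags.
def squeue_args(adapter_cluster, job_id)
  args = ["-j", job_id]
  args += ["-M", adapter_cluster] if adapter_cluster
  args
end

# With no cluster set in the adapter, only the submit call names cluster1.
p sbatch_args(nil, ["-M", "cluster1"])
p squeue_args(nil, "12345")
```

Under these assumptions the job lands on cluster1, but the later squeue call carries no -M at all and so asks the default cluster about a job id it has never seen.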

Perhaps there is a simple fix for this that we could apply ourselves (rather than waiting for a future OOD release that should allow this), which is why I am asking for feedback.

I guess the simplest way would be to set the cluster variable in slurm.rb to the slurm_cluster variable that I define in the submit.yml.erb, but I don’t know the complexities of how all these things interact behind the scenes.

I appreciate any thoughts on this.


I’m actually surprised this works for PBS. Maybe it works for PBS only because one binary can submit to and query multiple clusters? So the cluster_id in their database file (~/ondemand/data/sys/dashboard/batch_connect/db/) is incorrect, but qstat continues to search for that job id on other clusters, finds it, and returns the data to OOD? (I’m only guessing as to why it works for them.)

So, if you’ve configured your application as cluster: vulcan but were actually able to submit to the cluster romulus, you would need to somehow enable the binary configured in /etc/ood/config/clusters.d/vulcan.yml to query both the vulcan and romulus clusters, because OOD will use that cluster config to run squeue.

But when would it query both clusters? It seems like you could create a wrapper script that can interact with both clusters and pass an environment variable that tells it which cluster to interact with (or which binary to use). In this wrapper script you could catch the -M option, modify it as you see fit, and use the appropriate binary.
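
A minimal sketch of such a wrapper’s routing logic, under stated assumptions: the binary directories, the TARGET_CLUSTER variable name, and the default cluster are all hypothetical, and only the separated `-M name` form is handled (not `--clusters=name`). Letting the environment variable win over the flag is a design choice here, not something OOD prescribes.

```ruby
# Hypothetical map from cluster name to the directory holding its binaries.
BIN_DIRS = {
  "vulcan"  => "/opt/slurm-vulcan/bin",
  "romulus" => "/opt/slurm-romulus/bin"
}.freeze

# Return the command line to run: the chosen cluster's binary plus the
# original arguments with any "-M <name>" pair removed.
def route(command, argv, env, default_cluster: "vulcan")
  args = argv.dup
  cluster = env["TARGET_CLUSTER"] # hypothetical variable set by the caller
  if (i = args.index("-M"))
    flag_cluster = args[i + 1]
    args.slice!(i, 2)             # drop the flag and its value
    cluster ||= flag_cluster
  end
  cluster ||= default_cluster
  [File.join(BIN_DIRS.fetch(cluster), command), *args]
end

p route("squeue", ["-M", "vulcan", "-j", "42"], { "TARGET_CLUSTER" => "romulus" })
```

A real wrapper would finish with something like `exec(*route("squeue", ARGV, ENV))`, and the bin overrides in the cluster config would point at one such wrapper per command.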

These are my initial thoughts on it. I’m not sure how easy this endeavor would be, but I do know we’re adding this functionality to the next release, so it isn’t too far away.

We’re supporting multiple clusters with our apps but our use case is different. Our main
cluster is very busy, with many queues and a complex usage policy. A scheduler run
typically takes 30-60 seconds and occasionally much longer (say, when someone
submits 10,000 jobs that fail immediately). We have provisioned dedicated resources to
support a certain class of interactive ood jobs but if they went through the main
scheduler they would experience unpleasantly long startup times.

We created a second cluster to support this class of jobs. But we don’t want the users to
have to think about it. They should just be able to request whatever resources they want
and their job should be sent to the appropriate place automatically. So the app forms
just reference the main scc cluster. The redirection to the ood cluster happens in the
wrapper scripts. We use SGE. The SGE qsub command has an option to query if a job
request can be started immediately on a cluster. So the qsub wrapper asks the ood
cluster if it can run the job and submits it there if the answer is yes, otherwise it submits
to the scc cluster. In order for OnDemand to track the jobs in the ood cluster the qstat
wrapper checks to see if a job with the right job_id and user exists in the ood cluster. If
so, the qstat request goes there, otherwise it goes to the scc cluster. The qdel wrapper
is similar.
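
The routing described above might be sketched like this. It is not the actual wrapper code: the two callables are hypothetical stand-ins for the real SGE calls (the qsub query for immediate start and the qstat lookup), so that the decision logic can be shown without a scheduler.

```ruby
# can_start_now: callable(cluster, request) -> true/false.
# Stands in for asking the ood cluster whether the job can start immediately.
def choose_submit_cluster(request, can_start_now)
  can_start_now.call("ood", request) ? "ood" : "scc"
end

# job_exists: callable(cluster, job_id, user) -> true/false.
# Stands in for checking whether this user's job id exists in the ood cluster.
def choose_status_cluster(job_id, user, job_exists)
  job_exists.call("ood", job_id, user) ? "ood" : "scc"
end

# Toy stand-ins: the ood cluster accepts requests of up to 4 cores and
# currently holds one job, 1001, owned by alice.
fits     = ->(_cluster, request) { request[:cores] <= 4 }
ood_jobs = ->(_cluster, job_id, user) { [job_id, user] == [1001, "alice"] }

p choose_submit_cluster({ cores: 2 }, fits)
p choose_status_cluster(2002, "alice", ood_jobs)
```

The qdel wrapper would reuse the same lookup as the qstat one before forwarding the delete request.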

This has been working without issue since last September but in a few months the job
ids in the scc cluster will roll over and then catch up with the job ids in the ood cluster.
So it will be possible for the same user to have jobs with the same job id in both clusters
and OnDemand won’t be able to track the one in the scc cluster. I think the probability of
this happening is low but we’ll see. I hope the future multi-cluster support will allow me
to eliminate this possibility. I would just need to be able to modify the cluster specified in
the form after the form is submitted but before the actual job submission occurs.
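
To make the collision concrete, here is a small sketch with made-up job data. The lookup mirrors the order described above (check the ood cluster first); the Job struct, job ids, and user names are hypothetical.

```ruby
Job = Struct.new(:id, :user)

# Same order as the qstat wrapper: if the ood cluster has a matching
# job id and user, the request goes there; otherwise it goes to scc.
def status_cluster(jobs_by_cluster, job_id, user)
  in_ood = jobs_by_cluster.fetch("ood").any? { |j| j.id == job_id && j.user == user }
  in_ood ? "ood" : "scc"
end

# After the rollover, the same user could hold the same id on both clusters.
jobs = {
  "ood" => [Job.new(1001, "alice")],
  "scc" => [Job.new(1001, "alice"), Job.new(1002, "bob")]
}

p status_cluster(jobs, 1001, "alice")
p status_cluster(jobs, 1002, "bob")
```

With this data, every query for alice’s job 1001 resolves to the ood cluster, so her scc job with the same id can never be reached. If the cluster chosen at submit time were recorded alongside the job id, the lookup would not have to guess.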

Thanks Jeff and Mike,

I think we’re stuck with SLURM because of the hard-coded “-M cluster” option. Since multi-cluster support is not too far out, I’ll wait until it’s officially supported, and I’m looking forward to that.