How To See The Sbatch Command That Gets Run

I’m trying to request a GPU with Slurm using native attributes. The job runs, but the GPUs aren’t being reserved as expected. How can I see the actual sbatch command that is being run, so I can tell what is going on? My submit.yml.erb file is below:

batch_connect:
  template: "basic"
<%-
  slurm_args = if gpu_switch == 1
                 ["--gpus-per-node", "1", "--gres", "gpu:1" ]
               else
                 []
               end
-%>

script:
  native:
  <%- slurm_args.each do |arg| %>
    - "<%= arg %>"
  <%- end %>

My guess is it’s a type issue where gpu_switch is a string not an integer.

Either convert gpu_switch to an integer with gpu_switch.to_i, or compare it against the string "1", as in gpu_switch == "1".
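
To make the difference concrete, here is a minimal plain-Ruby sketch of the three comparisons. It assumes the form hands gpu_switch to the template as the string "1"; the variable names broken, as_int, and as_text are made up for illustration.

```ruby
# Assumption: the form value arrives in the template as a String.
gpu_switch = "1"

broken  = (gpu_switch == 1)       # String compared to Integer: never equal
as_int  = (gpu_switch.to_i == 1)  # convert to Integer first, then compare
as_text = (gpu_switch == "1")     # or compare String to String

puts "gpu_switch == 1      -> #{broken}"
puts "gpu_switch.to_i == 1 -> #{as_int}"
puts "gpu_switch == \"1\"    -> #{as_text}"
```

One thing to keep in mind with to_i: an empty or non-numeric string converts to 0, so an unset field falls into the else branch rather than raising.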

Also, as a style aside, you may want to put the ERB <%- %> block at the top of the file for readability.

<%- 
   # do all the computations up here
%>
---
batch_connect:
  # and so on

Yep, that was it… thanks!

Is there a way to see the sbatch command that is actually getting run? And any other tips for debugging these kinds of issues?

Yeah, you can see the command being run in /var/log/ondemand-nginx/USER/error.log. For this error, I just knew from experience that form values are always strings; I’d run into the same issue once or twice. In a case like this, gpu_switch isn’t going to appear in the sbatch command, but you can see the values these templates were rendered with in user_defined_context.json in the job’s session directory (the link in the card). That will tell you whether the values are being passed correctly from the form to submit.yml.
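
If you want to check the types rather than eyeball the file, a few lines of Ruby will do it. This is only a sketch: the sample JSON below is a made-up stand-in for user_defined_context.json (I’m assuming a flat object of form attribute names to values), so in practice you would read the real file from the session directory instead.

```ruby
require "json"

# Hypothetical contents standing in for user_defined_context.json.
sample = '{"gpu_switch": "1", "bc_num_hours": "2"}'

context = JSON.parse(sample)

# Print each attribute with its value and Ruby class, so a String "1"
# is easy to tell apart from an Integer 1.
context.each do |name, value|
  puts "#{name} = #{value.inspect} (#{value.class})"
end
```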

I’m not seeing much useful in error.log. The user_defined_context.json is definitely helpful.
If I drop print statements into my ERB block in submit.yml.erb, I cannot find the output anywhere. Is there a way to print out info like this?

No, but if you’re in a development environment - that is, in your home directory where only you have access to this app - I have raised errors with debug messages like:

<%-
  raise StandardError.new("the value of the things I'm looking for is: #{the_thing.inspect}")
-%>

Not sure how useful this is, but:

% cat submit.yml.erb
# Job submission configuration file
#
---

<%-
slurm_args = if gpu_switch.to_i == 1
                 #["--nodes", "#{nodes}", "--ntasks-per-node", "#{ppn}", "--gpus-per-node", "1", "--gres", "vis" ]
                 ["--gpus-per-node", "1", "--gres", "gpu:1" ]
               else
                 []
               end
-%>

batch_connect:
  template: "basic"

script:
  native:
  <%- slurm_args.each do |arg| %>
    - "<%= arg %>"
  <%- end %>
   - "--exclusive"

Manually process the template using CLI erb, providing params:

% erb -T - gpu_switch="1" submit.yml.erb

# Job submission configuration file
#
---


batch_connect:
  template: "basic"

script:
  native:

    - "--gpus-per-node"

    - "1"

    - "--gres"

    - "gpu:1"

   - "--exclusive"

Sure is! If you end the each and end lines with -%> instead of %>, you’ll get rid of the blank lines between the array items.
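
The same check can be done from Ruby’s standard erb library instead of the CLI. The sketch below uses a cut-down copy of the template from this thread with -%> on the loop lines, renders it with trim mode "-" (the same mode as erb -T -), and parses the result as YAML. The gpu_switch value is passed in by hand here, which is an assumption about what the form would send.

```ruby
require "erb"
require "yaml"

# Cut-down template from the thread, with -%> closing the each and end lines.
template = <<~'TEMPLATE'
  <%-
    slurm_args = gpu_switch.to_i == 1 ? ["--gpus-per-node", "1", "--gres", "gpu:1"] : []
  -%>
  script:
    native:
    <%- slurm_args.each do |arg| -%>
      - "<%= arg %>"
    <%- end -%>
      - "--exclusive"
TEMPLATE

# trim_mode "-" makes <%- drop leading indentation and -%> drop the newline.
rendered = ERB.new(template, trim_mode: "-").result_with_hash(gpu_switch: "1")
puts rendered

# Parsing the output catches indentation mistakes that are easy to miss by eye.
native = YAML.safe_load(rendered)["script"]["native"]
p native
```

Parsing the rendered text is the useful part: a list item that is indented differently from its siblings tends to show up as a YAML error here rather than at job submission time.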

Also, the last - "--exclusive" line seems to be off by one space: it is indented three spaces while the other list items are indented four.