SGE Configuration Test

Just installed Open OnDemand and am able to get the test configuration at

https://osc.github.io/ood-documentation/master/installation/resource-manager/test.html

to work with Slurm but am having issues with UGE. (I am running both schedulers on the cluster, on different nodes, for testing.)

The uge_genius.yml file is

v2:
  metadata:
    title: "Genius Cluster (UGE)"
  login:
    host: "genius.hpcc.ttu.edu"
  job:
    adapter: "sge"
    cluster: "genius"
    bin: "/export/uge/bin/lx-amd64"
    # conf: ""
    sge_root: "/export/uge"
    libdrmaa_path: "/export/uge/lib/lx-amd64/libdrmaa.so"
    # bin_overrides:
    #   sbatch: "/usr/local/bin/sbatch"
    #   squeue: ""
    #   scontrol: ""
    #   scancel: ""

From my account I execute

sudo su $USER -c 'scl enable ondemand -- bin/rake test:jobs:uge_genius RAILS_ENV=production --trace'

and receive the following output with the TypeError.

** Invoke test:jobs:uge_genius (first_time)
** Invoke environment (first_time)
** Execute environment
Rails Error: Unable to access log file. Please ensure that /var/www/ood/apps/sys/dashboard/log/production.log exists and is writable (ie, make it writable for user and group: chmod 0664 /var/www/ood/apps/sys/dashboard/log/production.log). The log level has been raised to WARN and the output directed to STDERR until the problem is fixed.
** Invoke /home/thomasbr/test_jobs (first_time, not_needed)
** Execute test:jobs:uge_genius
Testing cluster ‘uge_genius’…
Submitting job…
rake aborted!
TypeError: no implicit conversion of nil into String
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/job/adapters/sge/batch.rb:36:in `initialize'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/job/adapters/sge/batch.rb:36:in `new'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/job/adapters/sge/batch.rb:36:in `initialize'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/job/adapters/sge.rb:19:in `new'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/job/adapters/sge.rb:19:in `build_sge'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/job/factory.rb:36:in `build'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/cluster.rb:78:in `job_adapter'
/var/www/ood/apps/sys/dashboard/lib/tasks/test.rake:29:in `block (4 levels) in <top (required)>'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:273:in `block in execute'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:273:in `each'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:273:in `execute'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:214:in `block in invoke_with_call_chain'
/opt/rh/rh-ruby24/root/usr/share/ruby/monitor.rb:214:in `mon_synchronize'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:194:in `invoke_with_call_chain'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:183:in `invoke'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:160:in `invoke_task'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:116:in `block (2 levels) in top_level'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:116:in `each'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:116:in `block in top_level'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:125:in `run_with_threads'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:110:in `top_level'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:83:in `block in run'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:186:in `standard_exception_handling'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:80:in `run'
bin/rake:4:in `<main>'
Tasks: TOP => test:jobs:uge_genius

The UI shows the cluster but displays the same TypeError. Thanks.

Line 36 of batch.rb (the top frame of the trace) is where your config file is read (the conf keyword). It looks like you can't leave that commented out. You can try setting it to "" (an empty string), or create an empty file (like /tmp/test_sge) and point conf at it.

When I look at the documentation it says that the conf value is optional (maybe the documentation is wrong?). I created an empty conf file, /tmp/test_sge.conf, and modified the yml file to point to it. It now fails with the following error:

working directory /home/thomasbr/test_jobs
** Invoke test:jobs:uge_genius (first_time)
** Invoke environment (first_time)
** Execute environment
Rails Error: Unable to access log file. Please ensure that /var/www/ood/apps/sys/dashboard/log/production.log exists and is writable (ie, make it writable for user and group: chmod 0664 /var/www/ood/apps/sys/dashboard/log/production.log). The log level has been raised to WARN and the output directed to STDERR until the problem is fixed.
** Invoke /home/thomasbr/test_jobs (first_time, not_needed)
** Execute test:jobs:uge_genius
Testing cluster ‘uge_genius’…
Submitting job…
[2019-12-03 07:51:50 -0600 ] INFO "execve = [{}, "/export/uge/bin/lx-amd64/qsub", "-wd", "/home/thomasbr/test_jobs", "-N", "test_jobs_uge_genius", "-o", "/home/thomasbr/test_jobs/output_uge_genius_2019-12-03T07:51:50-06:00.log", "-l", "h_rt=00:01:00"]"
rake aborted!
OodCore::JobAdapterError: Unable to run job: can’t resolve hostname “/home/thomasbr/test_jobs/output_uge_genius_2019-12-03T07”.
Exiting.
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/job/adapters/sge.rb:88:in `rescue in submit'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/job/adapters/sge.rb:82:in `submit'
/var/www/ood/apps/sys/dashboard/lib/tasks/test.rake:30:in `block (4 levels) in <top (required)>'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:273:in `block in execute'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:273:in `each'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:273:in `execute'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:214:in `block in invoke_with_call_chain'
/opt/rh/rh-ruby24/root/usr/share/ruby/monitor.rb:214:in `mon_synchronize'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:194:in `invoke_with_call_chain'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:183:in `invoke'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:160:in `invoke_task'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:116:in `block (2 levels) in top_level'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:116:in `each'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:116:in `block in top_level'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:125:in `run_with_threads'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:110:in `top_level'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:83:in `block in run'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:186:in `standard_exception_handling'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:80:in `run'
bin/rake:4:in `<main>'

Caused by:
OodCore::Job::Adapters::Sge::Batch::Error: Unable to run job: can’t resolve hostname “/home/thomasbr/test_jobs/output_uge_genius_2019-12-03T07”.
Exiting.
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/job/adapters/sge/batch.rb:175:in `call'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/job/adapters/sge/batch.rb:164:in `submit'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/job/adapters/sge.rb:86:in `submit'
/var/www/ood/apps/sys/dashboard/lib/tasks/test.rake:30:in `block (4 levels) in <top (required)>'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:273:in `block in execute'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:273:in `each'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:273:in `execute'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:214:in `block in invoke_with_call_chain'
/opt/rh/rh-ruby24/root/usr/share/ruby/monitor.rb:214:in `mon_synchronize'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:194:in `invoke_with_call_chain'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/task.rb:183:in `invoke'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:160:in `invoke_task'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:116:in `block (2 levels) in top_level'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:116:in `each'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:116:in `block in top_level'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:125:in `run_with_threads'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:110:in `top_level'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:83:in `block in run'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:186:in `standard_exception_handling'
/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/rake-12.3.3/lib/rake/application.rb:80:in `run'
bin/rake:4:in `<main>'
Tasks: TOP => test:jobs:uge_genius

Yeah, the docs must be wrong given this error, sorry about that! From the qsub docs, -o (likely the output path) takes the form -o [[hostname]:]path,..., though I can't tell if the hostname there is optional or not.
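One thing that stands out: the hostname in the error message ends exactly at the first colon of the timestamp in the log file name, which suggests qsub treats everything before that colon as the hostname. Here is a rough sketch of that reading. split_output_spec is a hypothetical helper that mimics the assumed split; it is not UGE's actual parser:

```ruby
# Hypothetical helper: split a `[[hostname]:]path` spec at the FIRST colon,
# which is the behavior the error message seems to imply (an assumption).
def split_output_spec(spec)
  host, sep, path = spec.partition(":")
  return [nil, spec] if sep.empty?   # no colon at all: plain path
  [host.empty? ? nil : host, path]   # empty hostname counts as "not given"
end

# The -o value taken from the execve log line in the thread.
spec = "/home/thomasbr/test_jobs/output_uge_genius_2019-12-03T07:51:50-06:00.log"

host, path = split_output_spec(spec)
puts "hostname: #{host.inspect}"
puts "path:     #{path.inspect}"

# Same spec with a leading colon, i.e. an explicitly empty hostname.
host, path = split_output_spec(":" + spec)
puts "hostname: #{host.inspect}"
puts "path:     #{path.inspect}"
```

With the original spec, the sketch yields the same "hostname" string that qsub complained about. Under the same assumption a leading colon keeps the whole path intact, but whether UGE accepts that form was not checked in this thread; a log file name without colons would sidestep the question.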

Also looking at the docs again, this file may exist already just by having sge already installed and configured (or a good version exists on your login hosts where folks currently submit jobs to sge). You should probably find that real file and point to it because it’s likely to have all sorts of configurations you need like what queues to use or where to submit the job among all sorts of other likely necessary configurations.
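Grid Engine installations normally keep a cluster-wide default request file at <sge_root>/<cell>/common/sge_request, so that is a reasonable first place to look. Below is a small sketch for listing candidates under the sge_root from the yml above. candidate_conf_files is a hypothetical helper, and whether this is the file the conf keyword expects on your site is an assumption you would need to verify:

```ruby
# Hypothetical helper: list sge_request files found under any cell directory
# of the given sge_root (cell names vary by site, hence the wildcard).
def candidate_conf_files(sge_root)
  Dir.glob(File.join(sge_root, "*", "common", "sge_request")).sort
end

# sge_root value taken from the cluster config in this thread.
# Prints nothing if no such file exists on the machine running the script.
puts candidate_conf_files("/export/uge")
```

Run it on a login host where jobs are already submitted to UGE, since that is where a working configuration is most likely to live.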

I’m not super familiar with SGE’s configs, but in the docs it looks like you can specify fs_stdout_host and fs_stderr_host (probably to localhost?).