(Verifying a few details about) RStudio app setup/configuration

My first question is more of an observation and comment for those installing the app in the future…

#1 – Is the “Build/install the updated Apache configuration file” step missing something?
Prior to deploying RStudio/other interactive apps, one needs to “check the boxes” in the “Setup Interactive Apps” section of the documentation, including enabling the reverse proxy using the update_ood_portal script. When you modify ood_portal.yml as recommended, and execute that script, it’s very easy to miss this bit…

Generating Apache config using YAML config: '/etc/ood/config/ood_portal.yml'
Generating Apache config checksum file: '/etc/ood/config/ood_portal.sha256sum'
WARNING: Checksum of /opt/rh/httpd24/root/etc/httpd/conf.d/ood-portal.conf does not match previous value, not replacing.
Generating new Apache config at: '/opt/rh/httpd24/root/etc/httpd/conf.d/ood-portal.conf.new'

…which, unless I’m mistaken, means you haven’t actually updated Apache’s config after all. Right? Some searching turned up this probably unrelated post, which scared me away from proceeding for quite a while but has some valuable context suggesting (if I may paraphrase) that the check is there to prevent future OOD updates from overwriting site-specific config modifications. Examining the contents of that script, I found and used the -f flag to force the update. Should a note about this be added to the tutorial?
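For future readers, the behaviour that warning describes can be sketched roughly as below. This is only an illustration of a checksum guard, not the actual update_ood_portal code: the helper names `install_config` and `demo_install`, the temporary file names, and the `force:` keyword (standing in for the -f flag) are all assumptions made for the example.

```ruby
require "digest"
require "tmpdir"

# Hypothetical helper: install new_text at conf_path only if the file on disk
# still matches the checksum recorded at the last install, or if forced.
def install_config(conf_path, sum_path, new_text, force: false)
  current  = File.exist?(conf_path) ? Digest::SHA256.file(conf_path).hexdigest : nil
  recorded = File.exist?(sum_path) ? File.read(sum_path).split.first : nil

  if current && recorded && current != recorded && !force
    # The live file was edited by hand: leave it alone and write a .new file.
    File.write("#{conf_path}.new", new_text)
    return :written_as_new
  end

  File.write(conf_path, new_text)
  File.write(sum_path, "#{Digest::SHA256.hexdigest(new_text)}  #{conf_path}\n")
  :replaced
end

# Walk through a fresh install, a local edit, a refused update, and a forced one.
def demo_install
  Dir.mktmpdir do |dir|
    conf = File.join(dir, "ood-portal.conf")
    sum  = File.join(dir, "ood_portal.sha256sum")

    first  = install_config(conf, sum, "v1\n")              # fresh install
    File.write(conf, "v1 plus a local tweak\n")              # simulate a site edit
    second = install_config(conf, sum, "v2\n")               # refuses to overwrite
    third  = install_config(conf, sum, "v2\n", force: true)  # analogous to -f
    [first, second, third, File.read(conf)]
  end
end

p demo_install
```

The point of the sketch is the middle step: once the installed file no longer matches the recorded checksum, a plain run only produces a `.new` file, and the live config changes only when the update is forced.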

#2 – What’s the ideal way to modify the submit-form for use with Slurm?
The modifications we needed to make were:

  • replacing the bc_num_slots field with one that requests cores instead of nodes (-c, --cpus-per-task=<ncpus>)
  • including a field to set memory allocation (--mem=<size[units]>)
  • invisibly setting Slurm’s qos parameter (-q, --qos=<qos>)
  • setting a custom jobname (vs the default that uses the app path)

The final config is quoted below, but there were sufficiently many stumbling blocks I wanted to document them for the future as well as ask if there isn’t a better way.

The search result that looked the most promising actually turned out to be quite confusing in the long run: @jeff.ohrstrom, am I missing something, or is that suggested native leaf not actually syntactically correct for Slurm? It seemed to do the job for neranjan, but every variation I tried on it yielded a Ruby error. The fix was in this thread, which mentions that for “any other adapter but Torque you should convert the value of the “native” key in the submit.yml.erb to an array”. (Also, neranjan used an id leaf under his custom field for cores, but I didn’t see that documented anywhere else. Is that just ignored?)

Along with the form.yml documentation, that about covered everything. I initially used the native leaf to set the job name too, before I found the pointer to “generic fields”, which I mention here for future forum searchers.

## /var/www/ood/apps/sys/RStudio/manifest.yml ##
---
name: RStudio Server
category: Interactive Apps
subcategory: Servers
role: batch_connect
description: |
  This app will launch an RStudio server.


## /var/www/ood/apps/sys/RStudio/form.yml ##
---
cluster: "monsoon"
form:
  - bc_num_hours
  - num_mem
  - num_cores
  - bc_email_on_started
attributes:
  num_mem:
    widget: "number_field"
    label: "Memory (in megabytes)"
    value: 500
    min: 1
    id: 'num_mem'
  num_cores:
    widget: "number_field"
    label: "Number of cores"
    value: 1
    required: true
    min: 1
    id: 'num_cores'


## /var/www/ood/apps/sys/RStudio/submit.yml.erb ##
---
batch_connect:
  template: "basic"
script:
  job_name: "od_rstudio"
  native: [ "--mem=<%= num_mem.to_i %>", "-c <%= num_cores.to_i %>", "--qos=ondemand" ]

The above has been working great, but I’d love if someone could verify I’m not accidentally doing something in a brittle way.
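One way to sanity-check a template like this outside of OnDemand is to render it with plain ERB and parse the result as YAML. The sketch below assumes the form values arrive as strings and passes them in with `result_with_hash`, which is a simplification of however OnDemand actually supplies them; `render_submit` is a hypothetical helper name.

```ruby
require "erb"
require "yaml"

SUBMIT_TEMPLATE = <<~'EOS'
  ---
  batch_connect:
    template: "basic"
  script:
    job_name: "od_rstudio"
    native: [ "--mem=<%= num_mem.to_i %>", "-c <%= num_cores.to_i %>", "--qos=ondemand" ]
EOS

# Render the template with the given form values, then parse the YAML.
def render_submit(num_mem:, num_cores:)
  yaml = ERB.new(SUBMIT_TEMPLATE).result_with_hash(num_mem: num_mem, num_cores: num_cores)
  YAML.safe_load(yaml)
end

# Ordinary input.
p render_submit(num_mem: "500", num_cores: "2")["script"]["native"]

# Input with trailing junk: to_i keeps only the leading integer.
p render_submit(num_mem: "500", num_cores: "2; touch /tmp/x")["script"]["native"]
```

Both calls yield the same three-element array, which suggests the `.to_i` calls are doing useful work: anything after the leading digits never reaches the scheduler arguments.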

#3 – Singularity image

(Again, this is mostly for future forum searchers…) We’re still running CentOS 6 here, so as per the docs we built our own barebones Singularity container, which we did by simply changing the example’s 7 to 6

…plus one more necessary mod: I’m not sure if this was a site-specific thing or a Singularity container thing, but we needed to explicitly bind in the path to libuuid.so.1 (in /var/www/ood/apps/sys/RStudio/template/script.sh.erb). Failing to do this, one would find the following in the output.log following a launch attempt:

Script starting... 
Waiting for RStudio Server to open port 24314...
+ echo 'Starting up rserver...'
Starting up rserver...
+ singularity run -B /tmp/jason/27026039/tmp.8xzKaqrtF2:/tmp /common/contrib/containers/rserver-launcher-centos6-custom.simg --www-port 24314 --auth-none 0 --auth-pam-helper-path /home/jason/ondemand/data/sys/dashboard/batch_connect/sys/RStudio/output/05eb9bfa-da61-4ec1-a5cc-2fb662454028/bin/auth --auth-encrypt-password 0 --rsession-path /home/jason/ondemand/data/sys/dashboard/batch_connect/sys/RStudio/output/05eb9bfa-da61-4ec1-a5cc-2fb662454028/rsession.sh
rserver: error while loading shared libraries: libuuid.so.1: cannot open shared object file: No such file or directory

Apologies for this huge post – I hadn’t foreseen it being nearly so long, but I wanted to be as verbose as possible in case it ended up helping anyone.

–Jason Buechler / NAU Monsoon

#1 - Thanks for bringing that back up. I just created this ticket for us so that we can come up with a way to deal with it.

On #2:

Thanks for commenting. I did find out that what I’d given him was syntactically incorrect, which I’ve just now fixed, thanks to your prompt. I suppose Neranjan took the correct bits (to use -N and -c as you have) and left out the incorrect bit (to specify it as a string instead of as an array). So I think your confusion is warranted and I’ve just now updated it. I’m sorry for the minefield you’ve been in. In short: native is scheduler specific (get it, native to the scheduler), so when I give an example like 1:ppn=4 that’s specific to PBS Pro and the Slurm translation would be -N 1 -c 4.

To answer #2 - There’s nothing there that’s way out of bounds. But if I had to pick one thing, I’d say you should probably supply maximums as well. These app configs get cached, so if a user wants an entire node (say all 28 cores)
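Alongside a maximum on the form field itself, one could also bound the value at render time, since submit.yml.erb is ordinary ERB. The sketch below is only an illustration of that idea: `bounded` is a hypothetical helper, 28 is the core count mentioned above, and the memory ceiling is invented for the example.

```ruby
# Hypothetical per-node limits; 28 cores is the figure mentioned above,
# the memory ceiling is made up for illustration.
MAX_CORES  = 28
MAX_MEM_MB = 128_000

# Coerce a raw form value to an integer and keep it within [min, max].
def bounded(raw, min, max)
  raw.to_i.clamp(min, max)
end

puts bounded("4", 1, MAX_CORES)          # within range, unchanged
puts bounded("400", 1, MAX_CORES)        # capped at the maximum
puts bounded("", 1, MAX_CORES)           # empty input falls back to the minimum
puts bounded("999999", 1, MAX_MEM_MB)    # memory request capped as well
```

Inside the template, the same expression would look something like `<%= num_cores.to_i.clamp(1, 28) %>`, so an out-of-range or cached value can never exceed what a node offers.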

# lastly I just think
array:
- this yml array syntax
- seems a little more readable
- but that's just me.

No problem at all! In fact, thanks for all the information!

Awesome, thank you for trudging through all that – I appreciate all the notes!

I don’t author yaml much so that array syntax was not obvious to me, but I agree!
I was afraid the strings’ leading-dashes would be parsed weirdly, but a yaml validator says this is indeed valid :slight_smile: Not sure if quoting items under native is necessary, but they work for me.

---
batch_connect:
  template: "basic"
script:
  job_name: "od_rstudio"
  native: 
    - "--mem=<%= num_mem.to_i %>"
    - "-c <%= num_cores.to_i %>"
    - "--qos=ondemand"
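On the quoting question, a quick check with Ruby's bundled YAML parser (Psych) suggests the flow style, the block style, and an unquoted block style all load as the same list of strings. The values below are the already-rendered equivalents of the template above; `parse_native` is a hypothetical helper, and other YAML parsers may of course differ.

```ruby
require "yaml"

FLOW_STYLE = <<~EOS
  native: [ "--mem=500", "-c 2", "--qos=ondemand" ]
EOS

BLOCK_STYLE = <<~EOS
  native:
    - "--mem=500"
    - "-c 2"
    - "--qos=ondemand"
EOS

UNQUOTED_BLOCK_STYLE = <<~EOS
  native:
    - --mem=500
    - -c 2
    - --qos=ondemand
EOS

# Parse a YAML snippet and return the value of its "native" key.
def parse_native(text)
  YAML.safe_load(text)["native"]
end

p parse_native(FLOW_STYLE)
p parse_native(BLOCK_STYLE) == parse_native(FLOW_STYLE)
p parse_native(UNQUOTED_BLOCK_STYLE) == parse_native(FLOW_STYLE)
```

Keeping the quotes still seems like the safer habit, since a rendered value that happens to look like a number or boolean would otherwise change type.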

PS: if anyone knows if/how one can set fenced code blocks to use yaml syntax highlighting, please lemme know!

Quick followup: is there a way to force form.yml fields to not cache/present cached values when a default value is set? (other than doing something dynamically via a form.yml.erb?)

Not built in, AFAIK. What would you recommend? An extra key-value pair in the form.yml to say not to cache the value for that specific form element? Or is it a feature you want to turn off for a single app or for all of OnDemand?

@jasonbuechler would this work for you: https://github.com/OSC/ondemand/issues/386

Disabling the caching behavior would be separate from setting a default value, so you would need to do both for a particular form field.

That does indeed look ideal!
FWIW, we were specifically concerned with novice users not conscientiously policing their allocation requests, when the system defaults usually suffice.
Thank you!

I would like to extend the discussion to ask about implementing R version selection into the app configuration. I’m taking the approach of the launcher container, with locally installed rstudio/rserver and R packages. I’d like to provide researchers with a method to load the R version of their choice, acknowledging that the current app supports a statically defined rstudio/rserver.

For general use in our cluster, we have a default R version included in the Lmod module for rstudio. So running the RStudio server with another R version is a straightforward module substitution:

module swap intel gcc
module load rstudio/1.1.455 # default R/3.5.0
module load R/4.0.2
rstudio

I had thought to implement module selection through the form.yml options:
attributes:
  modules:
    label: "List modules to load"
    widget: "text_field"
    value: "R/3.6.2"
form:
  - modules

This has so far proven ineffective. However, the following is effective from template/script.sh.erb:
setup_env () {
  # Additional environment has been moved into module ood-rstudio-launcher
  module load gcc openmpi
  module load singularity
  module load ood-rstudio-launcher/0.0.1
  module load rstudio/1.1.455
  module load R/4.0.2
  export SINGULARITYENV_LD_LIBRARY_PATH="$LD_LIBRARY_PATH"
  echo "$LD_LIBRARY_PATH"
  echo "$SINGULARITYENV_LD_LIBRARY_PATH"
}
setup_env

Setting the environment through script.sh.erb is what I term ‘static’. Any help to improve my conceptual understanding (and pragmatic implementation of dynamic R version specification) is appreciated.

~ Em

Emily,

We use a template.sh.erb, and in it we use

module load <%= context.version %>

where the version is taken from

custom:
  rVersions:
    - label: "1.2.1335"
      module: "R/4.0.2 rstudio/1.2.1335"

and we set the version in the form.yml.erb with

attributes:
  version:
    widget: select
    label: "R version"
    help: ""
    options:
      <%- rVersions.each do |p| -%>
      - [ "<%= p['label'] -%>", "<%= p['module'] -%>" ]
      <%- end -%>

Is that of any use to you?

– bennet

Emily,

Sorry, I left out context that is important. This

custom:
  rVersions:
    - label: "1.2.1335"
      module: "R/4.0.2 rstudio/1.2.1335"

comes from /etc/ood/config/clusters.d/custom_vars.yml

We’ve made a change to how we get variables, so we have to put this at
the top of the form.yml.erb file, as well,

<%
    rVersions =  OodAppkit.clusters[:custom_vars].custom_config[:rVersions]
-%>

Sorry, I should have included that in the first message.
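To see what that loop produces, the form template can be rendered standalone with ERB and parsed as YAML. This is a rough sketch under a few assumptions: the version list is hard-coded with string keys rather than read through OodAppkit, `render_form` is a hypothetical helper, and ERB's `-` trim mode is enabled so the `<%- ... -%>` tags leave no blank lines behind.

```ruby
require "erb"
require "yaml"

# Stand-in for the list read from custom_vars.yml (string keys assumed here).
R_VERSIONS = [
  { "label" => "1.2.1335", "module" => "R/4.0.2 rstudio/1.2.1335" }
].freeze

FORM_TEMPLATE = <<~'EOS'
  attributes:
    version:
      widget: select
      label: "R version"
      options:
  <%- rVersions.each do |p| -%>
        - [ "<%= p['label'] %>", "<%= p['module'] %>" ]
  <%- end -%>
  form:
    - version
EOS

# Render the form template for a list of versions and parse the resulting YAML.
def render_form(versions)
  yaml = ERB.new(FORM_TEMPLATE, trim_mode: "-").result_with_hash(rVersions: versions)
  YAML.safe_load(yaml)
end

form = render_form(R_VERSIONS)
p form["attributes"]["version"]["options"]
p form["form"]
```

Each entry in the version list becomes one two-element option, so adding a version to the shared file is enough to make it appear in the select field.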

Thanks @bennet! Yes, a select field is likely what you should use.

An important feature of select fields that Bennet’s solution illustrates is that you can show the user one thing through the label and actually pass another thing to the script.sh.erb. The user sees custom.rVersions.label, but it actually loads whatever’s in custom.rVersions.module, which can be several things (they’re loading 2 modules in that example).

We do a similar thing right in our form.yml where we show a user the R version number (the very first string in an option, here it’s 4.0.2) but what we pass to the script.sh.erb is actually the string app_rstudio_server/4.0.2 which is a module that does all the dependency module loading.

Both of these approaches have the same effect. In one, you set the options in the clusters.d file and update it when you need updates; in the other, you set them in the app directly, so newer versions of the app have newer options.
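The label/value split can be sketched in a few lines: the first element of each option is what the user picks, and the second is what ends up in the rendered script. The option pair and the `module load <%= context.version %>` line are taken from the posts above; the `Context` struct and `script_line_for` are hypothetical stand-ins for what OnDemand provides.

```ruby
require "erb"

# Option pairs as they would appear under `options:` in form.yml:
# the first element is shown to the user, the second is submitted with the form.
OPTIONS = [
  ["4.0.2", "app_rstudio_server/4.0.2"]
].freeze

SCRIPT_LINE = "module load <%= context.version %>"

# Minimal stand-in for the object the script template sees as `context`.
Context = Struct.new(:version)

# Look up the value behind a displayed label and render the script line with it.
def script_line_for(label, options)
  _, value = options.find { |shown, _| shown == label }
  raise ArgumentError, "unknown option: #{label}" unless value

  ERB.new(SCRIPT_LINE).result_with_hash(context: Context.new(value))
end

puts script_line_for("4.0.2", OPTIONS)
```

The user only ever sees `4.0.2`, while the script loads the wrapper module named in the second element.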

There’s a caveat here: we’ve had to add JavaScript functions to show different options for different clusters. The form.yml is only rendered once, so if you have R version X that’s only available on cluster Y and not cluster Z, there’s a little bit of fancy footwork you have to do to make sure only valid options are shown when a given cluster is chosen.

Thanks Bennet, and Jeff –

Now we’re getting somewhere ( :

So there are up to four files that require content:
template/template.sh.erb
template/script.sh.erb
form.yml
/etc/ood/config/clusters.d/custom_vars.yml

Specifically, in my case, I just want to offer alternate R versions for one rstudio on one cluster. To aid my understanding, what would be the simplest implementation?

More generally, I’ve seen for other apps that ‘module’ itself is a field for the form.
form.yml:
attributes:
  modules:
    label: "List modules to load"
    widget: "text_field"
    value: "R/3.6.2"
form:
  - modules

Under what circumstances can that field be used directly? I’m assuming that the implementation of the rserver launcher container leads to the more elaborate structure that you are patiently describing to me.

Thanks very much – I look forward to picking up again tomorrow.

It’s not a question of when can it be used but when should it. We use it a lot for project codes. We simply cannot list out all the project codes available (actually, now that I say that we can, so maybe we should turn project codes into a select field as well).

My only point is - we should make the right choice the easy choice. It’s about balancing flexibility for an advanced user and ease of use for beginner and intermediate users.

text_fields allow users to do all sorts of crazy things that may be unexpected, like load the wrong modules or mistype values.

But that may be totally fine for you. The example you’ve given is absolutely fine. It’s simple and easy for you to manage, but puts the onus on the user to get it right so it may generate some support tickets.

So yeah, something like this is totally fine. It just gives the user a chance to do the wrong thing, which is something you should at least be aware of.

attributes:
  modules:
    label: "List modules to load"
    widget: "text_field"
    value: "R/3.6.2"
form:
  - modules
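If a free-text field is kept, one way to limit the damage is to check the input before it reaches `module load`. The sketch below is an assumption-laden illustration, not anything OnDemand does for you: `MODULE_TOKEN` and `safe_module_list` are hypothetical names, and the allowed character set is a guess at what typical Lmod module names look like.

```ruby
# Hypothetical allow-pattern: a "name" or "name/version" token that starts
# with a letter or digit and contains only letters, digits, and _ . + -
MODULE_TOKEN = %r{\A[A-Za-z0-9][A-Za-z0-9_.+-]*(/[A-Za-z0-9_.+-]+)?\z}

# Return the whitespace-normalised module list, or nil if any token looks wrong.
def safe_module_list(raw)
  tokens = raw.to_s.split
  return nil if tokens.empty?

  tokens.all? { |t| t.match?(MODULE_TOKEN) } ? tokens.join(" ") : nil
end

p safe_module_list("R/3.6.2")
p safe_module_list("R/4.0.2   rstudio/1.2.1335")
p safe_module_list("R/3.6.2; rm -rf ~")
p safe_module_list("")
```

A check like this catches shell metacharacters and empty input, but it cannot tell whether a module actually exists, so mistyped names would still fail at load time.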

Or if you want to be more in control of what they can load and limit their mistakes, you could do something like this. Though it limits what advanced users can do so we ended up adding the ability to load modules inside RStudio itself for folks who need a little more.

attributes:
  version:
    label: "R version to load"
    widget: "select"
    options:
      -  [ "3.6.2",  "R/3.6.2 some_other_module" ]
      -  [ "4.0.2",  "R/4.0.2 yet_another_module" ]
form:
  - version

Hope that helps!

@bennet I saw you typing a moment ago. Don’t let my answer discourage you; I’m happy to hear what your thoughts are too!

I was going to say much what you said, but not as clearly! :wink: