Launching OnDemand when home directory does not exist

Some sites create home directories automatically on first SSH login, for example via pam_mkhomedir.so. This causes a problem if a user’s first access to the system is through OnDemand, which expects the home directory to already exist.

How can we configure OnDemand to handle this edge case?

I’m going to try this experiment:

  1. Add an /etc/ood/profile to an install that, if the home directory doesn’t exist, sets $HOME to a temporary directory, i.e. export EVENTUAL_HOME=$HOME; export HOME=$(mktemp -d)
  2. Add an /etc/ood/config/announcements.d/announcement.yml with erb:
    1. Only show the announcement if ENV['EVENTUAL_HOME'] && (ENV['HOME'] != ENV['EVENTUAL_HOME'])
    2. If ! File.directory?(ENV['EVENTUAL_HOME']), explain that the home directory needs to be created, with a link to launch the shell app
    3. If File.directory?(ENV['EVENTUAL_HOME']), explain that the home directory now exists but the OnDemand web server (PUN) needs to be restarted, with a link to restart the web server. Alternatively, just display all of these steps in the message.

The theory is that with $HOME set to a valid directory that the user owns, OnDemand will not crash, and the persistent warning/error message at the top of the dashboard will guide the user through the steps necessary to complete the setup process.
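For reference, step 1 of the experiment could be sketched as the shell fragment below. This is only an illustration of the idea: the function name use_temp_home_if_missing is hypothetical, and EVENTUAL_HOME is the variable name proposed above, not something OnDemand itself defines.

```shell
# Sketch of a profile fragment for step 1 (an assumption, not a tested
# OnDemand configuration). If the real home directory is missing, remember
# its path in EVENTUAL_HOME and point HOME at a temporary directory that
# the user owns.
use_temp_home_if_missing() {
  if [ ! -d "$HOME" ]; then
    export EVENTUAL_HOME="$HOME"
    HOME="$(mktemp -d)"
    export HOME
  fi
}

use_temp_home_if_missing
```

When the home directory already exists, the function leaves HOME untouched and never sets EVENTUAL_HOME, so the announcement condition in step 2 stays false.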

That experiment failed. However, there appears to be another solution: modify nginx_stage/lib/nginx_stage/views/pun_config_view.rb and nginx_stage/templates/pun.conf.erb to show a warning page if the home directory is not found, and disable the check for the existence of the home directory in nginx_stage/lib/nginx_stage/generator_helpers.rb.

The result is this workflow when launching OnDemand with an account that does not have a home directory:

  1. Access OnDemand with a user that has no home directory:

  2. Click Open Shell to create home directory which opens the shell app:

  3. Click Restart Web Server:

Since this is not a dynamic page like the “init app page”, we could look into generating a copy of this on the fly and storing it with the pun config, which could provide sites more control over customization.

I will share the code modifications required in a separate comment (or update this one).

One problem with this solution is that if users ever try to access OnDemand while the home directory file systems are unavailable, they may see this error page!

I think this is a very good solution from our perspective! If home directories are not available, all users will see that message and we’ll have a bigger problem both in OnDemand and with users logging in directly with SSH. I think this is a good workaround.

See PRs https://github.com/OSC/ondemand/pull/7 and https://github.com/OSC/ondemand/pull/8

If you want to modify your existing installation to include this fix, you can apply the changes from one of these PRs to /opt/ood/nginx_stage/.

The raw files for PR 7 are:

  1. https://raw.githubusercontent.com/OSC/ondemand/02fc2d273f171880b57ae2bee07ae0db83b946f3/nginx_stage/lib/nginx_stage/generator_helpers.rb
  2. https://raw.githubusercontent.com/OSC/ondemand/02fc2d273f171880b57ae2bee07ae0db83b946f3/nginx_stage/lib/nginx_stage/views/pun_config_view.rb
  3. https://raw.githubusercontent.com/OSC/ondemand/02fc2d273f171880b57ae2bee07ae0db83b946f3/nginx_stage/templates/pun.conf.erb
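A rough sketch of how those three files could be dropped into an existing install, keeping a backup of each original. The function names, the OOD root argument, and the .orig backup suffix are all assumptions for illustration. Overwriting files like this only makes sense if your installed nginx_stage matches the version the PR was based on, so compare against the backups before restarting anything.

```shell
# Sketch: fetch the three PR 7 files into an nginx_stage tree, keeping backups.
# The commit hash and paths come from the URLs listed above; everything else
# (function names, .orig suffix) is hypothetical.
COMMIT=02fc2d273f171880b57ae2bee07ae0db83b946f3
RAW_BASE="https://raw.githubusercontent.com/OSC/ondemand/${COMMIT}"
FILES="nginx_stage/lib/nginx_stage/generator_helpers.rb
nginx_stage/lib/nginx_stage/views/pun_config_view.rb
nginx_stage/templates/pun.conf.erb"

# Map a repository path (nginx_stage/...) to its location under the install root.
dest_path() {
  # $1 = install root, e.g. /opt/ood ; $2 = path inside the repository
  printf '%s/%s\n' "${1%/}" "$2"
}

# Download each file over the installed copy, saving the original as *.orig.
apply_pr7_files() {
  root="${1:-/opt/ood}"
  for f in $FILES; do
    dest="$(dest_path "$root" "$f")"
    if [ -f "$dest" ]; then
      cp -p "$dest" "${dest}.orig" || return 1
    fi
    curl -fsSL "${RAW_BASE}/${f}" -o "$dest" || return 1
  done
}
```

Nothing is downloaded until apply_pr7_files is called, for example as apply_pr7_files /opt/ood from a shell that has write access to that directory.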

And then if you want the configurable option you would update two files with extra changes:

  1. https://raw.githubusercontent.com/OSC/ondemand/50eed7aa1da4187c79f2a3dbceb75800cf4453bc/nginx_stage/lib/nginx_stage.rb
  2. https://raw.githubusercontent.com/OSC/ondemand/50eed7aa1da4187c79f2a3dbceb75800cf4453bc/nginx_stage/lib/nginx_stage/views/pun_config_view.rb

As for merging these into the production version for 1.4, I need to let this sit for a while and revisit it after updating OnDemand to Passenger 5. I’m not sure if this is the optimal solution.

An update: a proper fix for this is introduced in this PR https://github.com/OSC/ondemand/pull/18 and will be included in the 1.4 release of OnDemand.

If the test for the user’s home directory fails, instead of aborting the PUN startup process, we start the PUN with this NGINX rewrite directive for all requests to the dashboard:

rewrite ^/pun/sys/dashboard(/.*|$) /pun/custom_html/missing_home_directory.html;

Two new locations have been added to the PUN: one that serves a custom HTML page, if it exists, from /etc/ood/config/pun/html/missing_home_directory.html, and one that serves the default page from the nginx_stage root directory, i.e. /opt/ood/nginx_stage/html/missing_home_directory.html.

Since these locations will always be accessible you can see the default error page by going directly to http://ondemand.host.edu/pun/html/missing_home_directory.html and the custom error page by going directly to http://ondemand.host.edu/pun/custom_html/missing_home_directory.html. If there is no custom error page it will serve up the default one of course.

An example of a custom error page has been provided at /opt/ood/nginx_stage/html/missing_home_directory.html.example.pam_mkhomedir and can be copied to /etc/ood/config/pun/html/.
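The copy step might look like the sketch below. The function name install_custom_page is hypothetical; the default paths are the ones given in this post, and the target file name missing_home_directory.html is the one the custom location is described as serving.

```shell
# Sketch: install the example page as the site's custom error page.
# $1 = source page (defaults to the shipped pam_mkhomedir example)
# $2 = destination directory (defaults to the custom html directory)
install_custom_page() {
  src="${1:-/opt/ood/nginx_stage/html/missing_home_directory.html.example.pam_mkhomedir}"
  dest_dir="${2:-/etc/ood/config/pun/html}"
  mkdir -p "$dest_dir" && cp "$src" "$dest_dir/missing_home_directory.html"
}
```

Calling install_custom_page with no arguments uses the default paths and needs write access to /etc/ood/config.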

Since the location directives are always in the PUN, you can do this, then edit the file, reload, and test changes to it without having to restart the web server. Both the custom and default error page NGINX locations set the Cache-Control header to no-store, so that neither a reload nor a click of the button to restart the web server results in the browser serving a cached copy of the page.
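One way to confirm the caching behavior from the command line is to inspect the response headers. The helper below is a small sketch (the name has_no_store is hypothetical) that checks a set of HTTP headers on stdin for Cache-Control: no-store. How you authenticate the request depends on your site, so the curl invocation is shown only as a comment.

```shell
# Succeeds if the HTTP response headers on stdin contain a Cache-Control
# header that includes no-store (case-insensitive match).
has_no_store() {
  grep -qi '^cache-control:.*no-store'
}

# Example use against the custom page URL from this post; add whatever
# authentication options your portal requires:
#   curl -sI http://ondemand.host.edu/pun/custom_html/missing_home_directory.html | has_no_store && echo "no-store is set"
```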

The default error page looks like this:

The example custom error page looks like this:

@dsajdak FYI this feature is now supported in the 1.4 release.

Sadly, this does not work for us. Our users’ only way to access the cluster is through OOD, with no username/password, so all they get is this message:

Could not create directory ‘/users/bob/.ssh’.
The authenticity of host ‘ip-10-1-42-99 (10.1.42.99)’ can’t be established.
ECDSA key fingerprint is SHA256:amystery.
ECDSA key fingerprint is MD5:a.m.y.s.t.e.r.y.
Are you sure you want to continue connecting (yes/no)? yes
Failed to add the host to the list of known hosts (/users/bob/.ssh/known_hosts).
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).

Your connection to the remote server has been terminated.

So I’m wondering about alternatives. Is there a way to just open a terminal on the OOD server itself? It has the same NFS user home directories, so that would work for us.

Hello,

I previously implemented the solution at Customization — Open OnDemand 1.8.12 documentation, and it seemed to work great in OOD 1.8.

After upgrading to OOD 2.0 (using 2.0.23 right now), clicking “Restart Web Server” usually doesn’t work. The page just refreshes back to the “no home directory” page. If you wait 60 seconds (the actual time needed is probably less) and click the “Restart Web Server” link again, it seems to work.

Any suggestions on getting the system to respond like it did previously? Is it a timing issue with how long the nginx process takes to restart?

Thanks in advance for any assistance or suggestions.