Clicking on "Connect to Jupyter" leads to wrong URL

hi all,

i successfully launched a jupyter notebook on slurm from OOD.
however, when i click on the “Connect to Jupyter” button in
/pun/sys/dashboard/batch_connect/sessions
this leads me to
http:///node/worker001/44559/login
instead of
http://:44559/node/worker001/44559/login
where the notebook is actually running.

which options do i need to change for this url to lead to the correct host + port?

thanks in advance.

any suggestions on how to resolve this?

Sorry for the delay on responding.

The link you click is created by this file:

If you notice, the URL is absolute - the assumption is that OnDemand or the /node or /rnode proxies do not have a suburi. So in OSC OnDemand’s case, we would have https://ondemand.osc.edu and then this /URL would be appended to that base domain.
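To illustrate how that path-only link gets resolved, here is a minimal sketch using the host, node and port mentioned in this thread. The variable names are made up for the example and are not OnDemand settings; the point is only that the browser keeps the scheme and host of the page you are on and swaps in the new path.

```bash
#!/usr/bin/env bash
# Sketch: how a path-only (absolute) link resolves against the page origin.
# Variable names are illustrative; they are not OnDemand settings.
origin="https://ondemand.osc.edu"   # scheme + host of the dashboard page
host="worker001"                    # compute node running Jupyter
port="44559"                        # port Jupyter listens on

link="/node/${host}/${port}/login"  # path the Connect button points at
resolved="${origin}${link}"         # origin is kept, only the path changes
echo "${resolved}"
```

So the compute node never appears as the host part of the URL; it only appears inside the path that the /node proxy interprets.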

Now we run OnDemand on port 80 and have not done testing to see what problems we might run into running OnDemand on a different port.

Also, I’m confused by the link example, where the host is missing. Are you omitting the host in the URL you are sharing? i.e. http://myhost/node/worker001/44559/login or http://myhost:44559/com/node/worker001/44559/login

yes, i forgot the host in the urls i shared. i meant that at:
http:///pun/sys/dashboard/batch_connect/sessions
i clicked on “Connect to Jupyter”
and was led to
http:///node/worker001/27910/login
instead of
http://<myjupytersessionhost>/node/worker001/27910/login

where myjupytersessionhost=slurmworker

shouldn’t ood automatically lead me to the url in which the host is the, in my case, slurm worker, on which the jupyter session was scheduled to run?

Have you solved the problem you were having configuring this?

I’m having a similar problem.

On my server I see the following

http://node1.rs.gsu.edu:31263/node/node1.rs.gsu.edu/31263/

Here node1 is the server on which Slurm runs compute jobs.

if I manually type

https://head.rs.gsu.edu//node/node1.rs.gsu.edu/31263/

it takes me to the correct Jupyter instance and asks for a password, which I do not know.

This is the output of such attempt,

Script starting…
Waiting for Jupyter Notebook server to open port 31263…
TIMING - Starting wait at: Fri Sep 27 11:20:29 EDT 2019
TIMING - Starting main script at: Fri Sep 27 11:20:29 EDT 2019
Currently Loaded Modulefiles:
  1) /Compilers/intel 3) /Compilers/Cudalib
  2) /Compilers/mkl_2015 4) /Compilers/Python3.6
TIMING - Starting jupyter at: Fri Sep 27 11:20:29 EDT 2019
+ jupyter notebook --config=/home/users/neranjan/ondemand/data/sys/dashboard/batch_connect/dev/jupyter/output/e0dc01e9-d850-4fc0-8f78-d856e1e0758e/config.py
[W 11:20:34.326 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[I 11:20:34.360 NotebookApp] Serving notebooks from local directory: /home/users/neranjan
[I 11:20:34.360 NotebookApp] The Jupyter Notebook is running at:
[I 11:20:34.360 NotebookApp] http://node1.rs.gsu.edu:31263/node/node1.rs.gsu.edu/31263
[I 11:20:34.360 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
Timed out waiting for Jupyter Notebook server to open port 31263!
TIMING - Wait ended at: Fri Sep 27 11:25:27 EDT 2019
[C 11:25:27.350 NotebookApp] received signal 15, stopping
Cleaning up…
[I 11:25:27.362 NotebookApp] Shutting down 0 kernels
/var/spool/slurmd/blue/job01029/slurm_script: line 20: 458659 Terminated "/home/users/neranjan/ondemand/data/sys/dashboard/batch_connect/dev/jupyter/output/e0dc01e9-d850-4fc0-8f78-d856e1e0758e/script.sh"

How can I make sure the server does the proxying correctly?

I have interactive app working correctly, btw. I can launch VNC sessions without any problems.

Are you sure that your view.html.erb is the same as our example?

In it, as Eric says, it holds this line which is actually what you’re redirecting to.

<form action="/node/<%= host %>/<%= port %>/login" method="post" target="_blank">

Which just gave me this image below. You can see the action does not have a host or port in it, it’s simply the path bit (the host:port being the one I’m currently connected to).

[image: rendered form element]

If you confirm your view.html.erb is the same as our example, can you show us what your form looks like (the html div that I’ve given in the image above).

@jeff.ohrstrom . This is my view.html.erb. I do not think I changed anything in this file.

This is the output when I start a new session

As I mentioned above when I manually navigate to the correct web page it produces following web page

However, the OOD main application does not show or link to the above URL. It just hangs and times out with the following message in the output file

Please let me know if I’m still not clear. My main question is how to make sure the main OOD app correctly navigates a user to the Jupyter instance once it is created. Right now a user has to manually get the URL from the output file and change the URL to the correct one. Even with this method, I do not see a way to get the password. Please help.

OK I see what’s going on, I apologize for not taking a better look at the logs before.

We’re disconnecting you in this file because we think that that port is not yet open. Here’s the file that’s giving you trouble. That function wait_until_port_used is here in this library. It’s basically just netcat on the host and port.

As you have shown, this should work, as you’re able to connect to it. My guess is, however, that there’s some problem connecting to the local port from the machine itself. Some firewall or ip table rule could be blocking you. Note how it times out, not immediately fails with something like ‘connection refused’.

If you’re able to ssh into your host (like cder15.rs.gsu.edu from that logfile) during this time, see if you can’t run the same nc command nc -w 2 $HOST $PORT < /dev/null &> /dev/null (replacing host and port here appropriately).

If this fails, try with just localhost instead of the hostname. Maybe there’s some problem with routing, but you can connect to it if you just call it localhost or 127.0.0.1. If you can connect to it through localhost, just replace the ${host} bit in the after.sh shown above with localhost. This way that shell script will use localhost instead of the network name.

If that doesn’t work, if you still can’t connect to it through localhost, you’ll have to reach out to your administrators to see what networking rules are blocking you. Obviously you can connect from the outside but you also need to connect to that port locally.
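The manual check described above could be scripted roughly like this, to be run on the compute node while the job is still alive. This is only a sketch: check_port is a hypothetical helper (not part of OnDemand), and the default port is simply the one from the log earlier in the thread.

```bash
#!/usr/bin/env bash
# Sketch: run the same nc probe against the node's own name and against localhost.
# check_port is a hypothetical helper, not part of OnDemand.
check_port() {
  local host="$1" port="$2"
  if nc -w 2 "${host}" "${port}" < /dev/null &> /dev/null; then
    echo "${host}:${port} reachable"
  else
    # $? here is the exit status of the nc command above
    echo "${host}:${port} NOT reachable (exit status $?)"
  fi
}

PORT="${PORT:-31263}"   # port reported in the job's output log
for h in "${HOSTNAME}" localhost 127.0.0.1; do
  check_port "${h}" "${PORT}"
done
```

If the network name fails but localhost succeeds, that points at routing or firewall rules on the node rather than at Jupyter itself.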

@neranjan it is possible that nc is not available or fails to exit with a 0 status when the port is in use. We don’t use nc to check for the port in use when launching TurboVNC, so that is likely why you see things working for VNC and not for Jupyter. The fact that we do not handle this case is a bug and is captured in https://github.com/OSC/ood_core/issues/153. This mailing list thread from Oct 2018 covers the problem in greater detail: https://listsprd.osu.edu/pipermail/ood-users/2018-October/000269.html.

If this is the case, one solution is to modify the Jupyter OnDemand plugin template/before.sh to override the definition of the port_used function. You can see an example of how to do that at https://github.com/OSC/ood_core/issues/153#issuecomment-536619617

Note: that example provided for overriding port_used uses lsof instead. If your bash is a new enough version you may also be able to use the pseudo-device /dev/tcp to determine if the port is in use (mentioned in the body of the GitHub issue)
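For the /dev/tcp variant, an override might look something like the sketch below. This is not the code from the GitHub issue; it assumes a bash built with /dev/tcp support and the coreutils timeout command, and the handling of a bare port (defaulting to localhost) is my own assumption.

```bash
# Sketch: port_used override that uses bash's /dev/tcp instead of nc.
# Accepts "host:port"; a bare "port" is assumed to mean localhost.
port_used() {
  local port="${1##*:}"
  local host="localhost"
  if [[ "${1}" == *:* ]]; then
    host="${1%:*}"
  fi
  # Try to open a TCP connection in a child bash and give up after 2 seconds.
  # Exit status 0 means something accepted the connection on that port.
  timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "${host}" "${port}" &> /dev/null
}
```

Placed in template/before.sh, a definition like this would replace the nc-based one for that app only, so other interactive apps would need the same override.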

@efranz what is the location of this file ood_core/lib/ood_core/batch_connect/template.rb when installed using rpm?

Just to make sure, I need to edit the file that belongs to the dashboard, right?

I found this location.

/var/www/ood/apps/sys/dashboard/vendor/bundle/ruby/2.4.0/gems/ood_core-0.9.3/lib/ood_core/batch_connect/template.rb

Also note that in the above output file it reports the server is running on

http://node1.rs.gsu.edu:31263/node/node1.rs.gsu.edu/31263/

should this be

https://head.rs.gsu.edu//node/node1.rs.gsu.edu/31263/

nc is installed on nodes and it does work. However for me the

[I 11:20:34.360 NotebookApp] The Jupyter Notebook is running at:
[I 11:20:34.360 NotebookApp] http://node1.rs.gsu.edu:31263/node/node1.rs.gsu.edu/31263

seems wrong.

It boots on that node, so logs indicating that it boots on that node are appropriate. When you hit the Apache process on head.rs.gsu.edu it essentially redirects to node1.rs.gsu.edu. All of that is well and good, because Jupyter doesn’t know about the redirection.

Does it work when you do nc -w 2 node1.rs.gsu.edu 31263, or only when you try with localhost? If it works with localhost you can modify this line of your after.sh to use "localhost" instead of "${host}".

OR if you want to use lsof instead,

Where you want to add the override is in your before.sh.erb about in the same place as I’ve marked. No need to modify the ood_core library (the safer way would be to add an initializer in the dashboard in /etc/ood/config/apps/dashboard/initializers/template_override.rb and override it there).

I wouldn’t edit the dashboard gem file directly. If you add or edit the Jupyter plugin file template/before.sh to override the port_used that would be preferred…though it would seem you would need that fix for every interactive app that is not VNC.

Though you said nc is installed on the server… I think the issue is that this function is returning a non-zero value even when the Jupyter notebook server’s port is in use:

This function accepts an argument like: “host:port” i.e. “localhost:1234”. So either the command is not correct for your version of nc or the host and port values are not set correctly when passed into this function.

The fact that JupyterNotebook server is running on http://node1.rs.gsu.edu:31263/node/node1.rs.gsu.edu/31263 is correct. The idea is:

  1. Find an open port for the Jupyter Notebook server to listen on. In this case 31263.
  2. When starting the Jupyter Notebook server, configure its base URL to be /node/HOST/PORT so that the URLs it generates go through OnDemand’s “node” proxy. You can see the difference between the OnDemand node and rnode proxies here: https://osc.github.io/ood-documentation/master/app-development/interactive/view.html#reverse-proxy.

This is necessary so when Jupyter Notebook server creates hyperlinks in its response HTML, those links are not broken links, but when followed, the requests make it back to the Jupyter Notebook server.
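As a rough sketch of step 2, a launch script could write a notebook config along these lines before starting Jupyter. This is not the actual file shipped with the Jupyter app; the host and port are the ones from the log in this thread, and the temporary file location is only for illustration.

```bash
#!/usr/bin/env bash
# Sketch: write a Jupyter config whose base_url matches the /node proxy path.
host="node1.rs.gsu.edu"   # compute node (from the log in this thread)
port="31263"              # port chosen for this session
config_file="$(mktemp)"   # illustrative location; the real app writes its own config.py

cat > "${config_file}" << EOF
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.port = ${port}
c.NotebookApp.base_url = '/node/${host}/${port}/'
c.NotebookApp.open_browser = False
EOF

cat "${config_file}"
```

With base_url set this way, every link Jupyter emits starts with /node/HOST/PORT/, which is why the log line shows that path after the host and port.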

I’m sorry, but I’m not getting anywhere. nc only succeeds against localhost and times out against cder15.rs.gsu.edu.

So I changed the host to localhost.

now I see following in the out file

However it still does not return correctly

it only shows this

What else should I change?

Thank you very much for your help so far.

I would try undoing that change. You want to see in the log something like before: “Jupyter notebook is running at: http://cder15.rs.gsu.edu:40184/node/cder15.rs.gsu.edu/40184” not “http://cder15.rs.gsu.edu:40184/node/localhost/40184”.

Instead, if using “localhost” works with the nc command, you could try modifying your template/after.sh from:

if wait_until_port_used "${host}:${port}" 60; then

to

if wait_until_port_used "localhost:${port}" 60; then

I think I finally solved this problem. It seems you need to disable SELinux on the compute nodes as well. I had SELinux disabled on the OnDemand node, but the cluster had SELinux enabled. Once I disabled SELinux on the node it worked. Thanks for all your help.

@neranjan other sites have SELinux enabled on the compute nodes and have no problems. You shouldn’t need to disable SELinux on the compute nodes. Perhaps the fact that doing this fixes the problem is a clue to what the problem was. @tdockendorf any ideas?
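One way to follow up on that clue, assuming the standard SELinux utilities (getenforce, ausearch) are installed on the compute node, would be to check the current mode and look for recent denials instead of leaving SELinux off. This is a sketch only; selinux_mode is a hypothetical helper and reading the audit log usually needs root.

```bash
#!/usr/bin/env bash
# Sketch: report the SELinux mode on a compute node and look for recent denials.
# selinux_mode is a hypothetical helper, not part of OnDemand.
selinux_mode() {
  if command -v getenforce > /dev/null 2>&1; then
    getenforce          # prints Enforcing, Permissive, or Disabled
  else
    echo "unknown"      # SELinux tools not installed on this host
  fi
}

echo "SELinux mode: $(selinux_mode)"

# AVC denials recorded by auditd, if ausearch is available (usually needs root).
if command -v ausearch > /dev/null 2>&1; then
  ausearch -m avc -ts recent || echo "no recent AVC denials found (or no permission)"
fi
```

Any denial that mentions the Jupyter port or the nc process while the job is waiting would narrow down which rule was in the way.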