CentOS 7.x, python3, websockify, numpy and "Failed to connect to server"

This is complex and annoying.

We get occasional reports of “Failed to connect to server” issues, which are often solved by just starting a new job. This isn’t a new problem in OOD but it’s thorny.

One of our users just did some nice triage but made it thornier.

Every time she got that error, she found a traceback with the following error:

pkg_resources.DistributionNotFound: The 'numpy' distribution was not found and is required by websockify

Note that the websockify installation docs say numpy is optional:

Download one of the releases or the latest development version, extract it 
and run python3 setup.py install as root in the directory where you 
extracted the files. Normally, this will also install numpy for better 
performance, if you don't have it installed already. However, numpy is 
optional. If you don't want to install numpy or if you can't compile it, 
you can edit setup.py and remove the install_requires=['numpy'], line 
before running python3 setup.py install.

Here’s where it gets messy. We run CentOS 7.x, which has python3.6 and python-websockify 0.6.0.

OnDemand requires websockify >= 0.8.0.

websockify installed from source has install_requires=['numpy'] (see websockify/setup.py on master in novnc/websockify on GitHub), which installs the latest numpy, currently version 1.20.x.

numpy 1.20.x drops support for python 3.6, and now we have a problem.

So on machines built since December last year, the numpy installation during our websockify installation has been failing, but without preventing the websockify installation itself. It looks like it’s working until it’s not.
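Since the install appears to succeed, a post-install check can catch the problem before users do. The following is a minimal sketch, not something from our build process: it asks pkg_resources to resolve websockify together with its declared requirements, which is the same check that raises DistributionNotFound in the traceback above. The function names check_dist and report_dist are hypothetical.

```shell
# check_dist PYTHON DIST
# Returns 0 only if DIST and everything it declares in install_requires
# can be resolved by pkg_resources under the given interpreter.
check_dist() {
    "$1" -c 'import sys, pkg_resources; pkg_resources.require(sys.argv[1])' "$2"
}

# report_dist PYTHON DIST
# Prints a one-line verdict suitable for a build log and returns
# non-zero when the distribution or one of its requirements is missing.
report_dist() {
    if check_dist "$1" "$2" 2>/dev/null; then
        echo "OK: $2 and its requirements resolve under $1"
    else
        echo "BROKEN: $2 or one of its requirements is missing under $1"
        return 1
    fi
}
```

Running `report_dist python3 websockify` at the end of a node build would turn the silent numpy failure into a visible one, assuming python3 on the build PATH is the interpreter websockify will actually run under.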

The probable short term solution is installing the CentOS numpy package (python36-numpy via EPEL, version 1.12.1) on all machines.
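For sites that would rather take numpy from PyPI than from EPEL, another possibility is to install a numpy that still supports the interpreter before installing websockify, so that the install_requires=['numpy'] entry is already satisfied. This is an untested sketch; the helper name numpy_spec_for is hypothetical, and it only covers Python 3, relying on the fact that numpy 1.20 requires Python 3.7 or newer.

```shell
# numpy_spec_for MAJOR.MINOR
# Prints a pip requirement specifier for numpy that the given
# Python 3 version can still install. Only Python 3 is handled here.
numpy_spec_for() {
    major="${1%%.*}"   # text before the first dot, e.g. 3
    minor="${1#*.}"    # text after the first dot, e.g. 6
    if [ "$major" -eq 3 ] && [ "$minor" -lt 7 ]; then
        # numpy 1.20 dropped Python 3.6, so stay on the 1.19.x series or older
        echo 'numpy<1.20'
    else
        echo 'numpy'
    fi
}
```

Something like `python3 -m pip install "$(numpy_spec_for 3.6)"` run before `python3 setup.py install` is the intended use.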

But longer term… well, moving to CentOS 8 has been made difficult. We can’t ask websockify to pin numpy to 1.19.x, and numpy wouldn’t and shouldn’t bring back python3.6 support.

Anyway, that’s the rant - and if others come looking for a solution to the “Failed to connect to server” error - this might help?

If you don’t want to install numpy or if you can’t compile it, you can edit setup.py and remove the install_requires=['numpy'], line before running python3 setup.py install.

That is from the bottom of the README of novnc/websockify on GitHub, so perhaps that is the step to take when installing websockify without numpy. I think without numpy the proxy will be slower, but I do not know what the impact is. Both the numpy import and the place where it is used are in websockify/websocket.py (at commit 33710b397230e239a202c650ceaa8148a1a45c01).

Another alternative might be to use a different implementation. novnc/websockify-other on GitHub (assorted ports of websockify code to other languages) has a C version.

If the CentOS 7.x system python is still available instead of a newer python3.6, a third option would be to configure the websockify command to be a wrapper script that first sets up the environment to use the system python, instead of a newer default python, and then executes websockify using that.

Do you have any thoughts on how we could improve the error reporting to make diagnosing the problem faster?

Also at OSC we do this. In our cluster config we have at the bottom:

  batch_connect:
      basic:
        script_wrapper: "module restore\n%s"
      vnc:
        script_wrapper: "module restore\nmodule load ondemand-vnc\n%s"

Notice the script_wrapper for vnc does a module load ondemand-vnc, which does

load("turbovnc/2.1.90")
setenv("WEBSOCKIFY_CMD","/usr/local/novnc/utils/websockify/run")

So in our case we use the default python 2.7:

efranz@owens-login01:~$ python --version
Python 2.7.5
efranz@owens-login01:~$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.7 (Maipo)

One thought is that in the script_wrapper you could set WEBSOCKIFY_CMD to a wrapper script that first sets the environment for a different python, one that plays well with websockify, and then execs /usr/local/novnc/utils/websockify/run (or wherever the associated websockify is located).
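A rough sketch of such a wrapper is below. It is not what we run at OSC; it assumes the python that works with websockify is the system one in /usr/bin, and it assumes clearing PYTHONHOME and PYTHONPATH is enough to undo whatever the newer python set up. The function names and the WEBSOCKIFY_RUN variable are made up for illustration.

```shell
#!/bin/bash
# Location of the real websockify launcher. The default is the path
# from the OSC modulefile; override WEBSOCKIFY_RUN for other layouts.
WEBSOCKIFY_RUN="${WEBSOCKIFY_RUN:-/usr/local/novnc/utils/websockify/run}"

# clean_python_env
# Drops settings that would point at a non-system python and puts the
# system directories first on PATH (assumption: system python is in /usr/bin).
clean_python_env() {
    unset PYTHONHOME PYTHONPATH
    PATH="/usr/bin:/bin:${PATH}"
    export PATH
}

# run_websockify ARGS...
# Replaces this shell with the real launcher, passing all arguments through.
run_websockify() {
    clean_python_env
    exec "$WEBSOCKIFY_RUN" "$@"
}
```

The wrapper file would end with `run_websockify "$@"`, and the vnc script_wrapper (or a modulefile like ondemand-vnc) would then set WEBSOCKIFY_CMD to wherever that file is installed.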