Failed to establish a websocket connection - USC

New installation of Open OnDemand for a first time admin. I did a basic OOD configuration and was able to log into the OOD dashboard. When I tried to access the web shell for our login node, I get an error page. Note the url looks correct - https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu:
Failed to establish a websocket connection. Be sure you are using a browser that supports websocket connections.

I have read the other topic on here but they didnt help. I’ve double checked configurations and logs… nothing looks wrong. Any hints on how to debug further?

Detailed information about my setup:

  • Shared server, ports 80, 8080, 443 were already used. I had to configure 9443 & fronted by a load balancer.

  • Use Shibboleth SSO for authentication

  • Note that our log node should prompt for 2FA. This might be an issue, but I thought I would see error alluding to that. Like so:
    $ ssh jkhong@discovery.usc.edu
    Duo two-factor login for jkhong

    Enter a passcode or select one of the following options:

    1. Duo Push to XXX-XXX-2455
    2. Phone call to XXX-XXX-2455
    3. Phone call to XXX-XXX-5858
    4. SMS passcodes to XXX-XXX-2455 (next code starts with: 3)

    Passcode or option (1-4): 1
    Success. Logging you in…

Logs when I access https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu:
/opt/rh/httpd24/root/etc/httpd/logs/access_log
128.125.253.76 - jkhong [13/Oct/2020:14:43:36 -0700] “GET /pun/sys/shell/ssh/discovery.usc.edu HTTP/1.1” 200 1330 “https://carcaccount.usc.edu/pun/sys/dashboard” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36”
128.125.253.76 - jkhong [13/Oct/2020:14:43:36 -0700] “GET /pun/sys/shell/stylesheets/style.css HTTP/1.1” 200 333 “https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36”
128.125.253.76 - jkhong [13/Oct/2020:14:43:36 -0700] “GET /pun/sys/shell/javascripts/ood_shell.1.js HTTP/1.1” 200 3168 “https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36”
128.125.253.76 - jkhong [13/Oct/2020:14:43:36 -0700] “GET /pun/sys/shell/javascripts/hterm_all_1.85.js HTTP/1.1” 200 656388 “https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36”
128.125.253.76 - - [13/Oct/2020:14:43:37 -0700] “GET /favicon.ico HTTP/1.1” 404 196 “https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36”

/opt/rh/httpd24/root/etc/httpd/logs/error_log
[Tue Oct 13 14:43:36.690213 2020] [lua:info] [pid 31270] [client 128.125.253.76:9515] req_protocol=“HTTP/1.1” req_handler=“proxy-server” req_method=“GET” req_accept=“text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,/;q=0.8,application/signed-exchange;v=b3;q=0.9” req_user_agent=“Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36” res_content_length=“1330” req_content_type="" res_content_encoding="" req_status=“200” req_origin="" time_user_map=“125.619” local_user=“jkhong” req_referer=“https://carcaccount.usc.edu/pun/sys/dashboard” res_content_language="" req_port=“443” log_time=“2020-10-13T21:43:36.690042Z” req_server_name=“carcaccount.usc.edu” log_hook=“ood” req_accept_charset="" req_hostname=“carcaccount.usc.edu” res_content_location="" res_content_disp="" req_is_websocket=“false” remote_user=“jkhong” res_location="" req_user_ip=“128.125.253.76” req_is_https=“true” req_filename=“proxy:http://localhost/pun/sys/shell/ssh/discovery.usc.edu” req_uri="/pun/sys/shell/ssh/discovery.usc.edu" time_proxy=“4.925” res_content_type=“text/html; charset=utf-8” req_accept_language=“en-us,en;q=0.9,lb;q=0.8” req_cache_control=“no-cache” req_accept_encoding=“gzip, deflate, br”, referer: https://carcaccount.usc.edu/pun/sys/dashboard
[Tue Oct 13 14:43:36.888310 2020] [lua:info] [pid 31270] [client 128.125.253.76:9515] req_protocol=“HTTP/1.1” req_handler=“proxy-server” req_method=“GET” req_accept=“text/css,/;q=0.1” req_user_agent=“Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36” res_content_length=“333” req_content_type="" res_content_encoding="" req_status=“200” req_origin="" time_user_map=“127.045” local_user=“jkhong” req_referer=“https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu” res_content_language="" req_port=“443” log_time=“2020-10-13T21:43:36.888164Z” req_server_name=“carcaccount.usc.edu” log_hook=“ood” req_accept_charset="" req_hostname=“carcaccount.usc.edu” res_content_location="" res_content_disp="" req_is_websocket=“false” remote_user=“jkhong” res_location="" req_user_ip=“128.125.253.76” req_is_https=“true” req_filename=“proxy:http://localhost/pun/sys/shell/stylesheets/style.css” req_uri="/pun/sys/shell/stylesheets/style.css" time_proxy=“1.233” res_content_type=“text/css” req_accept_language=“en-us,en;q=0.9,lb;q=0.8” req_cache_control=“no-cache” req_accept_encoding=“gzip, deflate, br”, referer: https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu
[Tue Oct 13 14:43:36.902130 2020] [lua:info] [pid 31377] [client 128.125.253.76:43093] req_protocol=“HTTP/1.1” req_handler=“proxy-server” req_method=“GET” req_accept="/" req_user_agent=“Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36” res_content_length=“3168” req_content_type="" res_content_encoding="" req_status=“200” req_origin="" time_user_map=“127.557” local_user=“jkhong” req_referer=“https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu” res_content_language="" req_port=“443” log_time=“2020-10-13T21:43:36.901986Z” req_server_name=“carcaccount.usc.edu” log_hook=“ood” req_accept_charset="" req_hostname=“carcaccount.usc.edu” res_content_location="" res_content_disp="" req_is_websocket=“false” remote_user=“jkhong” res_location="" req_user_ip=“128.125.253.76” req_is_https=“true” req_filename=“proxy:http://localhost/pun/sys/shell/javascripts/ood_shell.1.js” req_uri="/pun/sys/shell/javascripts/ood_shell.1.js" time_proxy=“1.253” res_content_type=“application/javascript” req_accept_language=“en-us,en;q=0.9,lb;q=0.8” req_cache_control=“no-cache” req_accept_encoding=“gzip, deflate, br”, referer: https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu
[Tue Oct 13 14:43:36.968652 2020] [lua:info] [pid 31226] [client 128.125.253.76:29334] req_protocol=“HTTP/1.1” req_handler=“proxy-server” req_method=“GET” req_accept="/" req_user_agent=“Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36” res_content_length=“656388” req_content_type="" res_content_encoding="" req_status=“200” req_origin="" time_user_map=“128.229” local_user=“jkhong” req_referer=“https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu” res_content_language="" req_port=“443” log_time=“2020-10-13T21:43:36.968492Z” req_server_name=“carcaccount.usc.edu” log_hook=“ood” req_accept_charset="" req_hostname=“carcaccount.usc.edu” res_content_location="" res_content_disp="" req_is_websocket=“false” remote_user=“jkhong” res_location="" req_user_ip=“128.125.253.76” req_is_https=“true” req_filename=“proxy:http://localhost/pun/sys/shell/javascripts/hterm_all_1.85.js” req_uri="/pun/sys/shell/javascripts/hterm_all_1.85.js" time_proxy=“68.766” res_content_type=“application/javascript” req_accept_language=“en-us,en;q=0.9,lb;q=0.8” req_cache_control=“no-cache” req_accept_encoding=“gzip, deflate, br”, referer: https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu

/var/log/ondemand-nginx/jkhong/access.log
unix: - - [13/Oct/2020:14:43:36 -0700] “GET /pun/sys/shell/ssh/discovery.usc.edu HTTP/1.1” 200 1330 “https://carcaccount.usc.edu/pun/sys/dashboard” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36” “10.2.203.102, 128.125.253.76”
unix: - - [13/Oct/2020:14:43:36 -0700] “GET /pun/sys/shell/stylesheets/style.css HTTP/1.1” 200 333 “https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36” “10.2.203.102, 128.125.253.76”
unix: - - [13/Oct/2020:14:43:36 -0700] “GET /pun/sys/shell/javascripts/ood_shell.1.js HTTP/1.1” 200 3168 “https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36” “10.2.203.102, 128.125.253.76”
unix: - - [13/Oct/2020:14:43:36 -0700] “GET /pun/sys/shell/javascripts/hterm_all_1.85.js HTTP/1.1” 200 656388 “https://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Safari/537.36” “10.2.203.102, 128.125.253.76”

/var/log/ondemand-nginx/jkhong/error.log
App 28598 output: Listening on 3000

Browser web developer console error:
ood_shell.1.js:12 WebSocket connection to 'wss://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu?csrf=REKBEv2L-N8xwpyH7bOQhktqwb52YtDYK6yc' failed: Connection closed before receiving a handshake response

Hi yea, that message isn’t totally helpful. I’d suggest making sure discovery.usc.edu is a part of the allowlist. You can see docs for setting the allowlist here.

More than that, I think in the developer tools - in the network tab seeing what’s there may provide some insight. What the response code and the response headers are for that upgrade request.

In this case, i get 401 unauthorized and there’s an X-OOD-Failure-Reason header saying the host is not in the allowlist.
image

Jeff -
Thanks for your response. I did a lot of research into this issue. I did set OOD_SSHHOST_ALLOWLIST just in case. I printed it out in app.js just to verify, from errors.log:
App 28680 output: host_allowlist: discovery.usc.edu:discovery1.usc.edu:discovery2.usc.edu:*.usc.edu

WRT to the request - wss://carcaccount.usc.edu/pun/sys/shell/ssh/discovery.usc.edu?csrf= - in web developer the request is sent, but no response is received (connection is closed).

At this point I suspect the shell code is not working on my server. I found the nodejs code - /var/www/ood/apps/sys/shell/apps.js - and added some log messages in there. The logs are sent to - /var/log/ondemand-nginx/jkhong/error.log - The only shell message I see is the “Listening on 3000”. I expect to see “Connection established” but never do. I also never see the upgrade event triggered - server.on(‘upgrade’,… - which should initiate the connection. This is as far as I have gotten in my debugging.

OK gotcha, sorry I jumped right to something before I read your initial message fully.

You’re behind a load balancer. You may need to setup rules for wss and ws protocol. This is my next guess as to what is going on, your LB may be disconnecting the request if it’s working on L7 protocols. Your origin logs don’t show any request_is_websocket="true" entries, and at some point it should.

It was our load balancer (Citrix Netscaler). It was not configured to allow websockets (ws and wss).

This is what I did to debug. I looked at developer tools request headers when it made the wss request. I then simulated the request using curl on the command-line. Here is an example curl wss request with it rejecting the wss (openssl errno 104):

 curl -vk -i -N  -H "Host: carcaccount.usc.edu" -H "Origin: https://carcaccount.usc.edu" -H "Connection: Upgrade"  -H "Upgrade: websocket" 'https://carcaccount.usc.edu/pun/sys/shell/discovery.usc.edu?csrf=Y6GDrzuL-dv6WrczQXBsax7mwOCasZ1Yp8e8'
> GET /pun/sys/shell/ssh/discovery.usc.edu?csrf=Y6GDrzuL-dv6WrczQXBsax7mwOCasZ1Yp8e8 HTTP/1.1
> Host: carcaccount.usc.edu
> User-Agent: curl/7.66.0
> Accept: */*
> Connection: Upgrade
> Upgrade: websocket
> Origin: https://carcaccount.usc.edu
>
* STATE: DO => DO_DONE handle 0x800083108; line 1756 (connection #0)
* STATE: DO_DONE => PERFORM handle 0x800083108; line 1877 (connection #0)
* OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 104
* Marked for [closure]: Transfer returned error
* multi_done
* Closing connection 0
* The cache now contains 0 members
* Expire cleared (transfer 0x800083108)
curl: (56) OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 104

I noticed the wss request never made it to apache httpd. At that point I did suspect the LB configuration. I found out that I had to make special configuration changes on our Citrix Netscaler LB to support ws and wss protocols. After that the wss curl request worked! And of course OOD shell worked too!

@jeff.ohrstrom - Thank you for your assistance and guess as the LB as the problem!