Driverless-AI OOD Setup

Dear All,

I am working on installing Driverless-AI on our system. The “dai” service does not have a launch command like “jupyter notebook”; users access its interface at “127.0.0.1:12345”. How can I set up an interactive session for this case? Currently, since there is no launch command to wait on, the launch script ends “successfully” right away.

Best,

Dawei

@superdavidxp I’m so sorry for the delay.

You can absolutely do this, and I think Jupyter is the best working example of how to do it. You’ll need the start command, whatever it may be. You say it has no start command, but how does the process that serves 127.0.0.1:12345 come to be?

Another thing you’ll want to consider is authentication. If this process boots and is accessible to anyone who happens upon it, that’s probably bad. Security through obscurity will only get you so far.
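One common way to handle this is to generate a fresh random secret for each session and have the application require it. Below is a minimal sketch in Python; the function name is made up for illustration, and how Driverless-AI would be configured to use such a password is not covered in this thread, so treat that part as an open assumption.

```python
import secrets


def make_session_password(nbytes=16):
    """Return a random, URL-safe password for a single session.

    Hypothetical helper: the name and the default length are illustrative.
    """
    # token_urlsafe draws from the OS's cryptographic random source.
    return secrets.token_urlsafe(nbytes)


if __name__ == "__main__":
    # Generate one password per launched session and never reuse it.
    print(make_session_password())
```

The generated value would then need to be written into whatever configuration the application reads at startup and shown only to the user who owns the session.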

What all these interactive script.sh.erb files do is essentially this:

  1. Set up and prepare the environment. This is typically module loading, X11 prep if it’s a VNC application, and maybe writing configuration files.
  2. Launch the application.
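The two steps above can be sketched roughly as follows. This is written in Python rather than the shell of an actual script.sh.erb, purely to show the shape of the logic. The command and the `APP_PORT` variable are placeholders, since the real start command for Driverless-AI is not given in this thread. The point it illustrates is that the application runs in the foreground, so the script does not return until the application exits.

```python
import os
import subprocess
import sys


def run_session(command, extra_env=None):
    """Prepare the environment, then run `command` in the foreground.

    Hypothetical helper: returns the application's exit code once it stops.
    """
    # Step 1: set up the environment (a stand-in for module loading
    # and writing configuration files).
    env = dict(os.environ)
    env.update(extra_env or {})

    # Step 2: launch the application and block until it exits.
    # If the command daemonizes and returns at once, the session ends at once.
    completed = subprocess.run(command, env=env)
    return completed.returncode


if __name__ == "__main__":
    # Placeholder command: substitute whatever actually starts the service.
    code = run_session(
        [sys.executable, "-c", "print('application running')"],
        extra_env={"APP_PORT": "12345"},
    )
    print("application exited with", code)
```

If the service can only be started as a background daemon, the script would need some other way to stay alive for as long as the service is running.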

Hi Jeff,

Thank you for your reply.

I have a follow-up question. Recently, we found that our Jupyter notebook apps work fine on the GPU partition but not on the CPU partition. The error messages are as follows. Would you please give me some guidance?

Thank you in advance,

Dawei

Resetting modules to system default. Reseting $MODULEPATH back to system default. All extra directories will be removed from $MODULEPATH.
Script starting…
Waiting for Jupyter Notebook server to open port 40363…
TIMING - Starting wait at: Wed Mar 4 12:02:51 CST 2020
TIMING - Starting main script at: Wed Mar 4 12:02:51 CST 2020

Currently Loaded Modules:

  1. wmlce/1.7.0-py3.7

TIMING - Starting jupyter at: Wed Mar 4 12:02:51 CST 2020

  • jupyter-lab --config=/home/dmu/ondemand/data/sys/dashboard/batch_connect/dev/jupyter-lab/output/e0ec0589-9ed4-4d30-8643-cd3ce9e424df/config.py
    [W 12:02:52.574 LabApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
    [I 12:02:52.583 LabApp] JupyterLab extension loaded from /opt/apps/anaconda3/lib/python3.7/site-packages/jupyterlab
    [I 12:02:52.583 LabApp] JupyterLab application directory is /opt/apps/anaconda3/share/jupyter/lab
    [I 12:02:52.587 LabApp] Serving notebooks from local directory: /home/dmu
    [I 12:02:52.587 LabApp] The Jupyter Notebook is running at:
    [I 12:02:52.587 LabApp] http://hal16:40363/node/hal16.hal.ncsa.illinois.edu/40363/
    [I 12:02:52.587 LabApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
    Timed out waiting for Jupyter Notebook server to open port 40363!
    TIMING - Wait ended at: Wed Mar 4 12:03:52 CST 2020
    Cleaning up…
    [C 12:03:52.162 LabApp] received signal 15, stopping
    [I 12:03:52.164 LabApp] Shutting down 0 kernels
    /home/dmu/ondemand/data/sys/dashboard/batch_connect/dev/jupyter-lab/output/e0ec0589-9ed4-4d30-8643-cd3ce9e424df/script.sh: line 27: 130079 Terminated jupyter-lab --config="${CONFIG_FILE}"

My guess is that it comes down to configuring the module you’ve loaded, wmlce (which I’m guessing is IBM’s Watson Machine Learning?). A quick Google search suggests you have to specify CPU support? It’s not clear to me, but I’m sure it’s that module. Maybe you could make a separate CPU-only module (wmlce-cpu)?
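To help narrow it down, it may be worth checking by hand whether the port is reachable on the CPU node while the job is still running. The log above shows Jupyter reporting that it is running while the wait for the port still times out. Here is a minimal Python sketch of such a check; the function name and defaults are illustrative, and this is not the code Open OnDemand itself uses for its wait.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection succeeds before `timeout` seconds pass.
    Hypothetical helper for manual debugging.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the job in the log, that would mean calling `wait_for_port("hal16", 40363)` from the compute node, and perhaps with `"localhost"` as the host too, to see whether the result differs between the two names.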