I work at ArianeGroup on HPC and CAE topics. Since 2004 we have developed a home-made HPC web portal based on the e-batch alpha release from BULL. After six months of study in 2019, we decided to use OOD for our new HPC systems.
We are used to offering our users this type of GUI:
Would it be possible for the next major release of OOD to support a GUI with this kind of presentation, with a mix of vertical and horizontal widgets, more widget types, and dynamic screens that are simpler to implement?
For me, the reference in terms of functionality and ergonomics is the Nice EnginFrame solution, which has a GUI maker with a complete set of components for building GUIs by drag and drop.
Our portal is a complete HPC solution, with file transfer, job submission through GUIs, TurboVNC + VirtualGL + noVNC for interactive jobs, SGE tools to manage jobs, monitoring with Ganglia and Xymon, etc. The data are available through the HPC tools included in the portal or directly through the Windows share of the NetApp filer.
We have decided to stop maintaining this home-made solution and move to OOD.
I would like to join this conversation. I have been setting up OnDemand at our center. I successfully set up interactive apps and then moved on to customizing the Job Composer app. However, I was quite puzzled to find that there are no options for submitting jobs like those available for every interactive app.
At first, I thought that one of the goals of OnDemand was to simplify job submission, but the Job Composer doesn’t look simpler to me. I expected a customizable pair of form.erb + submit.erb. That would actually be the biggest advantage of the whole tool.
Therefore, we would like to extend the functionality of the OnDemand Job Composer for PBS. We would like to create a form that lets users enter the desired parameters without thinking about the exact form of the qsub command.
This form could then be translated into a few lines at the beginning of the script (the ones starting with # ), and the rest of the job submission could stay as it is.
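To make the idea concrete, here is a minimal sketch of how form values might be turned into directive lines. It is not OnDemand code: the function name, the form field names (`job_name`, `queue`, `nodes`, `ncpus`, `mem`, `walltime`) and the use of the PBS Pro `select` syntax are all assumptions made for illustration.

```python
def render_pbs_directives(form):
    """Turn form values into '#PBS' directive lines.

    Assumption: PBS Pro style resource requests ('-l select=...').
    The field names are hypothetical, not an OnDemand API.
    """
    lines = []
    if form.get("job_name"):
        lines.append(f"#PBS -N {form['job_name']}")
    if form.get("queue"):
        lines.append(f"#PBS -q {form['queue']}")
    # One resource chunk: number of nodes, cores per node, memory per node.
    chunk = f"select={form.get('nodes', 1)}:ncpus={form.get('ncpus', 1)}"
    if form.get("mem"):
        chunk += f":mem={form['mem']}"
    lines.append(f"#PBS -l {chunk}")
    if form.get("walltime"):
        lines.append(f"#PBS -l walltime={form['walltime']}")
    return lines


if __name__ == "__main__":
    example = {
        "job_name": "cfd_run",
        "queue": "short",
        "nodes": 2,
        "ncpus": 16,
        "mem": "32gb",
        "walltime": "02:00:00",
    }
    print("\n".join(render_pbs_directives(example)))
```

A site using Torque-style `nodes=N:ppn=M` requests would need a different rendering of the resource line.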
I would like to ask whether there is an easy way to accomplish this. As far as I can tell from the code, the form you see when you click on “Job Options” is hardcoded, and there is no option for customizing this part.
So, if there is a way, what is it, please?
If there isn’t, would you be interested in including the code we would create as a native part of OOD? Or as an extension or plug-in?
Our discussions around solving these problems have produced several approaches we are considering:
Create a new “batch app” plugin to complement the “interactive app” plugins OnDemand currently has
Build a new files app that includes job management functionality, so a kind of merging of files app, job composer, and
Try to fix some of the problems of the Job Composer app itself
Getting the design right on the first two is difficult but something we want to tackle this year. Two things we would want to do to properly support this:
Extend the job adapters to properly support abstractions for requesting nodes, procs, GPUs, features, and other things the abstraction does not currently support, or change the architecture of the abstraction to more easily support scheduler-specific deviations.
Beyond that work, fixing the Job Composer itself to support some of these goals, even with sub-optimal solutions, seems like the easiest path.
For that, you suggest being able to customize the Job Composer with a form.erb and submit.erb that would modify or prepend directive lines at the beginning of the job script. Several questions:
Currently, we don’t have an easy way to reliably parse and manage those directives, which can also be modified by hand by users. Our workaround is that the Job Options in the Job Composer use the ood_core Ruby library to specify command line arguments to qsub. This does not touch the job script at all, so user-modified script directives are preserved. If something in Job Options duplicates something specified in a job script directive, the scheduler’s own precedence between command line argument and script directive is enforced (in PBS Torque the command line argument takes precedence). The easiest approach technically is to continue having the Job Options set command line arguments. However, the more options are controlled from this interface, the more likely conflicts between command line arguments and script directives become, which could confuse users when they modify a script directive and it has no effect. This is actually why setting the “job name” in Job Options in the Job Composer currently doesn’t pass this string to the batch scheduler when submitting the job: it would override whatever job name was set in the script, and all our templates currently set a job name in the script. So if we add these options, their effect would need to be communicated properly to users.
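One way to communicate such conflicts would be to warn before submission when a Job Options flag also appears as a script directive. The following is only a sketch under simplifying assumptions: it recognises single-letter `#PBS` flags, stops scanning at the first executable line, and treats any repeated flag as a conflict (which is too strict for `-l`, where different resources can legitimately be requested on separate lines). The function names are hypothetical and not part of ood_core.

```python
import re

# Matches lines such as '#PBS -N myjob' and captures the flag ('-N').
DIRECTIVE = re.compile(r"^#PBS\s+(-\w)")


def script_directive_flags(script):
    """Collect the flags used in '#PBS' directives at the top of a script."""
    flags = set()
    for line in script.splitlines():
        stripped = line.strip()
        match = DIRECTIVE.match(stripped)
        if match:
            flags.add(match.group(1))
        elif stripped and not stripped.startswith("#"):
            # First executable line: stop scanning for directives here.
            break
    return flags


def find_conflicts(script, cli_args):
    """Return flags present both on the command line and in the script."""
    cli_flags = {arg for arg in cli_args if re.fullmatch(r"-\w", arg)}
    return sorted(cli_flags & script_directive_flags(script))


if __name__ == "__main__":
    job_script = "#!/bin/bash\n#PBS -N from_script\n#PBS -l walltime=01:00:00\necho hi\n"
    for flag in find_conflicts(job_script, ["-N", "from_form", "-q", "short"]):
        print(f"warning: {flag} is set both in Job Options and in the script")
```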
I assume you would want to be able to have a separate form.erb/submit.erb for each cluster? OnDemand can be configured for multiple clusters with different schedulers, so the Job Composer would need to be updated to handle this.
I wonder how the batch schedulers handle duplicate directives. If duplicates do not cause a problem and the later duplicate overrides the earlier, then we could switch the Job Composer to inserting the command line arguments as directives at the top of the script below the shebang, leaving any directives defined in the script by the user untouched and with the highest precedence.
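A minimal sketch of that insertion step, assuming the generated directives arrive as a list of strings. Whether the user's later directives actually win is exactly the open question above, so this only shows the placement, not the scheduler behaviour. The function name is hypothetical.

```python
def insert_below_shebang(script, directives):
    """Place generated directive lines directly below the shebang.

    Existing lines, including any user-written directives, are kept
    unchanged and end up after the generated ones.
    """
    lines = script.splitlines()
    block = list(directives)
    if lines and lines[0].startswith("#!"):
        merged = [lines[0]] + block + lines[1:]
    else:
        # No shebang: the generated directives go at the very top.
        merged = block + lines
    return "\n".join(merged) + "\n"


if __name__ == "__main__":
    user_script = "#!/bin/bash\n#PBS -N user_name\necho hi\n"
    print(insert_below_shebang(user_script, ["#PBS -N form_name", "#PBS -q short"]))
```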
For our beginning users we supply a template that has most of what they need in the template. We also have a mandatory training session that all of our HPC users must attend. We go over the template and how to use it and our beginners seem to be able to get up and working faster. We also tell everyone that the most power will be gained by using the command line interface.
Actually, I don’t think the Job Composer has either of the two problems you stated.
I assumed that the first problem (too difficult for beginners) could be solved by documentation. I was prepared to write a couple of lines (or a lot of lines, depending on how I feel about it) about how to work with PBS, what each option means, and which ones are used most. I think this would be enough for beginners.
For advanced users, I would suggest a text field where they could write any other directives (starting with #), which would be prepended to the script. Or they could just add these lines to the script directly and save it.
I dare say our users know what they are doing, so I can’t really call them beginners, and they would understand what they can achieve.
I like this solution, although I am not sure I understand it correctly. I also thought about adding something similar to interactive apps. I would be very interested in hearing what you mean by this solution.
This seems a bit confusing to me.
I’m sorry, but I can’t understand this whole block of text. Are you suggesting that one of the approaches could be adding Job Options to the Job Composer where the user could specify the command line arguments they want to use? Or are you saying that something like that is already available? I think the former is correct, because when I look at Job Options in the Job Composer, there is nothing like “Insert your command line arguments here”.
I think the solution might be that each institute could specify which command line arguments are mandatory, and if the user does not fill in a custom value, the default one is used, just as in the forms for interactive apps. This leads me directly to the second part.
Yes, this might be needed, although it is not a problem for us because we have one node that decides where each submitted job runs based on the given arguments or the specified queue.
We would need only one form for that one node. Everything else is done by the PBS scheduler.
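The per-institute defaults suggested above could be sketched roughly as follows. The option names and the idea of passing defaults and mandatory names as plain Python values are assumptions for illustration; in OnDemand these would more likely live in a site configuration file.

```python
def resolve_options(site_defaults, mandatory, user_input):
    """Merge user input over site defaults and check mandatory options.

    Blank form fields (empty string or None) count as 'not filled in',
    so the site default is used for them.
    """
    provided = {k: v for k, v in user_input.items() if v not in ("", None)}
    options = {**site_defaults, **provided}
    missing = [name for name in mandatory if name not in options]
    if missing:
        raise ValueError("missing mandatory options: " + ", ".join(missing))
    return options


if __name__ == "__main__":
    defaults = {"queue": "default", "walltime": "01:00:00"}
    form = {"queue": "", "walltime": "04:00:00"}
    print(resolve_options(defaults, ["queue", "walltime"], form))
```

A mandatory option with no site default forces the user to fill it in, which mirrors a required field in an interactive app form.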
The PBS scheduler does not handle duplicate directives uniformly: some of them can be given twice and some cannot. So there is no straightforward solution. With an incorrect command, job submission would simply throw an error and the user would have to fix the command.
One last thing. To start with, I thought about creating a very simple interactive app in which all user input is transformed into directives, which the user then just copies and pastes at the beginning of the script. We already have something like that available; it looks like this:
You enter all the parameters, and the code checks whether your requirements make sense and whether some node can fulfill them. As a result, you get the whole qsub command (but that could be converted into directives).
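Converting such a qsub command into directives could look roughly like this. It is a sketch, not our tool's code: it assumes every option takes exactly one value except the flags listed in `NO_VALUE_FLAGS` (`-V`, `-z` and `-h` here, which would need extending for a real site), and it treats the first token that is not an option as the script path.

```python
import shlex

# Assumption: the only value-less options we expect to see.
NO_VALUE_FLAGS = {"-V", "-z", "-h"}


def qsub_to_directives(command):
    """Rewrite the options of a qsub command line as '#PBS' lines."""
    tokens = shlex.split(command)
    if tokens and tokens[0] == "qsub":
        tokens = tokens[1:]
    directives = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if not token.startswith("-"):
            break  # first non-option token is taken to be the script path
        if token in NO_VALUE_FLAGS or i + 1 >= len(tokens):
            directives.append(f"#PBS {token}")
            i += 1
        else:
            directives.append(f"#PBS {token} {tokens[i + 1]}")
            i += 2
    return directives


if __name__ == "__main__":
    cmd = "qsub -N cfd_run -q short -l select=2:ncpus=16 -l walltime=02:00:00 -V job.sh"
    print("\n".join(qsub_to_directives(cmd)))
```

Values that were quoted on the command line because they contain spaces lose their quoting here and would need extra handling.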