03. Executing Grid workflows using Serpens

1. Requirements for the tutorial

Backing up Kepler home directory
Before you proceed with the installation of the Kepler application, be sure to make a backup of your Kepler home directories:
mv ~/.kepler ~/.kepler~
mv ~/kepler ~/kepler~

1.1 Installing Kepler with Serpens add-on

Linux installation

The installation steps in this demo assume that you are working in a Linux/Unix environment.

In order to install Kepler and the Serpens-related workflows, do the following:

cd ~
wget http://scilla.man.poznan.pl/euforia/install/serpens-all.tar.gz -O serpens-all.tar.gz
tar zxvf serpens-all.tar.gz
cd Kepler-1.0.0
./kepler.sh

You should see Kepler loading.

Starting Kepler

No matter which way you used to install Kepler, make sure to set the following environment variables before you start Kepler again (the commands below use csh syntax):

setenv JAVA_HOME <your_java_location>
setenv KEPLER ~/Kepler-1.0.0
cd $KEPLER
./kepler.sh
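If you use bash or another Bourne-style shell instead of csh, the equivalent setup would be the following sketch (the JAVA_HOME path is a placeholder; point it at your own Java installation):

```shell
# Bourne-shell (bash) equivalents of the csh setenv lines above.
# The JAVA_HOME value is a placeholder - substitute your own JDK path.
export JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/default-java}"
export KEPLER="$HOME/Kepler-1.0.0"
# Then start Kepler as before:
# cd "$KEPLER" && ./kepler.sh
echo "Kepler will be started from: $KEPLER"
```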

2 Executing Grid workflows (using gLite middleware)

In this section we will take a closer look at the grid-related components developed by the EUFORIA project to work within Kepler. To use grid resources, the user must also be a member of a Virtual Organisation (VO).

Useful Information
To acquire certificates, contact your country's Certification Authority (CA).
For this tutorial you will be provided with a tutorial certificate. You can find it within the demo directory - you will be given the number of the certificate you should use for the tutorial:
$HOME/serpens/core/cert/
Obtaining tutorial certificates

For the purpose of this session you will be provided with temporary certificates.

cd ~
tar xvf ~owsiak/public/certificates.tar

In order to use these workflows you need a proxy. Creating one will be the first step to getting grid jobs running.

If you have your own Grid certificate but are not part of the EUFORIA virtual organisation a detailed description of how to register with the VO can be found at the following location: VOMS.

2.1 Introduction

The basic Kepler workflows that were discussed and investigated in the [basic training] all involve running a simulation or computation using Kepler on the same computational resource you are using to develop the workflow and run Kepler. Kepler is a Java program, and as such a Kepler workflow executes in Java alongside the Kepler program itself. Whilst Kepler can generate Java threads and execute workflows in an efficient, even parallel, manner if necessary, it is still restricted to running the workflow on the same machine Kepler is running on, which may not have enough computational resources for a significant workflow.

It is possible to run workflows from the command line and submit Kepler as a program to a batch system to run on a high performance computer or similar resource. However, this may not be ideal for a number of reasons: firstly, Java (and particularly the version of Java required for Kepler) is not always available on high performance computing resources; secondly, the batch systems that are used to run jobs on these machines may not handle the Java threads created by Kepler properly; thirdly, it removes the ability to visualise, analyse, and monitor a running workflow (as the workflow must run as a command-line job); and finally, it does not cater for the running of external codes from a workflow (again, this is generally difficult or impossible through a batch system on an HPC system). This last point is significant, as one of the main use cases for Kepler, particularly in the fusion community, is to orchestrate and support the running of simulation codes.

These simulation codes are generally very large, complex, multi-user codes which cannot feasibly be re-developed as Kepler workflows but do benefit from the data processing and visualisation/computational steering functionality that Kepler can provide (if you are interested in how to add an existing code to a Kepler workflow, please see the Adding Codes tutorial). Furthermore, as full tokamak modelling generally requires the use of multiple codes, there is a requirement for a system that can orchestrate multiple simulation codes (i.e. run a workflow). Therefore, we can see that there is a requirement for running simulation codes from Kepler, but these codes often have much larger computational requirements than can be satisfied on a standard computer, so users need to be able to launch simulations or jobs from a Kepler workflow onto high performance computational resources.

The EUFORIA project has worked to produce Kepler functionality to allow users to run jobs on both Grid and HPC systems from a Kepler workflow. EUFORIA has a Grid infrastructure which incorporates a large amount of computational resources using the gLite/EGEE software to provide Grid functionality. We also have access to a number of HPC (parallel) computers, including the DEISA network of supercomputers and the HPC-FF machine. The EUFORIA Grid has the capability to run both parallel (MPI) and serial jobs, whereas the HPC targets are designed exclusively for parallel programs (generally MPI).

However, it should be noted that we only have the functionality to submit jobs to these resources, upload and download data, etc. The workflow will not automatically port your code onto the targeted resource; it can only run codes which have been pre-compiled for the Grid/HPC resource and are either already in place on that resource or are available ready to run (i.e. as a pre-compiled executable). One part of this tutorial will be to show examples of preparing such executables.

This tutorial outlines the steps required to prepare and execute workflows which execute programs on Grid resources.

2.2 Generating VOMS proxy

After installing the Java API 4 HPC/grid version of Kepler, you should be able to locate the vomsproxy.xml workflow at the following location:

$HOME/serpens/demo-ITM-09.2010/workflow/grid/common/vomsproxy.xml
VOMS proxy workflow

The VOMS proxy workflow sets up a proxy certificate for Kepler to use for Grid job submissions. Generally the proxy lasts for 3 days, so it only needs to be created occasionally, but it is required before any Grid workflows can be executed (as the grid tools rely on a proxy certificate being available). A proxy is a short-term certificate generated from a user's actual grid certificate. It can be safely used by the grid tools because it has a limited lifespan: if it is compromised or obtained by other individuals in the course of using the grid tools and submitting jobs, there is very limited scope for abusing the certificate. Actual grid certificates generally have a lifetime of one year, and if one is compromised it has to be revoked by the certificate authority and a new one issued; proxy certificates are used to mitigate these problems.

The exercise below outlines using the workflow that creates the VOMS proxy.

After this exercise you will:
  • know how to create a VOMS proxy from your credentials
Exercise no. 1 (approx. 10 min)

Film available: http://www.youtube.com/watch?v=recsgpchKpE

In this exercise you will execute the workflow that creates a VOMS proxy. In order to do this, follow the instructions:

  1. Start Kepler application by issuing:
    cd $KEPLER
    ant run
    
  2. Open VOMS proxy workflow by issuing: File -> Open and navigate to:
    $HOME/serpens/demo-ITM-09.2010/core/workflow/grid/common/vomsproxy.xml
    

    The workflow is extensively commented, providing information on how to use it, what parameters need to be changed, and what can be left unmodified. We summarise this information below, but you can simply follow the documentation in the workflow.

  3. After the workflow is opened you have to modify some of the parameters:
    1. Set the location of the certificate's private key. Modify the key parameter by double-clicking on it.
    2. Set the location of the certificate's public key. Modify the cert parameter by double-clicking on it.

      If you have your own certificate, the above two files are stored in the .globus directory in your $HOME and are called userkey.pem and usercert.pem. If you do not have a certificate, do not know where your certificate is, or it is not in a form that the workflow recognises, please contact EUFORIA support for further details.

  4. After setting up the parameters, press the "Play" button.
  5. The workflow will ask you for your password in a dialog box that will pop up while the workflow is running.
  6. Once you've entered your password, the workflow should generate some output within the MultiDisplay actor and should save the VOMS proxy to the specified location. The proxy can then be used in further grid/HPC-related workflows.

2.3 Submission of a predefined grid job

In order to submit a grid job you have to open the template.xml workflow – it can be found at the following location:

$HOME/serpens/demo-ITM-09.2010/workflow/grid/glite/template.xml
Workflow for job submission into grid infrastructure.
After this exercise you will:
  • know how to submit a simple grid application
Simple Application Submission Exercise (approx. 20 minutes)

Film available: http://www.youtube.com/watch?v=bXIMjkyqGj0

In this exercise you will submit a simple grid application which will execute the /bin/ls -l command on a grid node. The Kepler workflow will be responsible for job submission and for downloading the output/error streams.

In order to run the workflow follow these instructions:

  1. Start Kepler application by issuing:
    cd $KEPLER
    ant run
    
  2. Open template workflow by issuing: File -> Open and navigate to:
    $HOME/serpens/demo-ITM-09.2010/core/workflow/grid/glite/template.xml
    
  3. Press the "Play" button.
  4. In a moment, a MultiDisplay window will appear with the ID of the newly submitted job.
  5. Wait until the job finishes. When it is ready, the workflow will present you with the path where your output files were downloaded. You can now read the StdOutput file, which will contain a listing of a directory on a grid node.
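Under the hood, gLite jobs are described in the Job Description Language (JDL). A minimal JDL sketch of the kind of job this template submits might look as follows (this is an illustration using standard JDL attributes, not the literal contents of template.xml):

```
Executable    = "/bin/ls";
Arguments     = "-l";
StdOutput     = "StdOutput";
StdError      = "StdError";
OutputSandbox = {"StdOutput", "StdError"};
```

The OutputSandbox attribute names the files that are brought back from the grid node, which is why the workflow can hand you a local StdOutput file once the job finishes.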

2.4 Submission of a custom grid job with input files

After this exercise you will:
  • know how to specify input files for a grid job
  • know how to upload files into remote grid storage
Modifying Simple Application Submission Exercise (approx. 20 minutes)

Film available: http://www.youtube.com/watch?v=Iv-d1gnuBTU

In this exercise you will prepare your own input data and write a simple script to process it.

In order to run the workflow follow these instructions:

  1. Prepare input data, for example:
    echo "My own input data" > ~/tmp/input.txt
    
  2. Prepare a script/program to process this data, for example:
    ~/tmp/tutorial.sh
    #! /bin/bash
    cat input.txt
    

    When files are uploaded into grid storage, a job receives them in its own working directory. This means that the location of the files on your local computer is irrelevant, and it is why the above script tutorial.sh will be able to access input.txt.

  3. Start Kepler application by issuing:
    cd $KEPLER
    ant run
    
  4. Open template workflow by issuing: File -> Open and navigate to:
    $HOME/serpens/demo-ITM-09.2010/core/workflow/grid/glite/template.xml
    
  5. Modify the inputFiles parameter by adding "$HOME/tmp/input.txt" and "$HOME/tmp/tutorial.sh".
  6. Press the "Play" button.
  7. In a moment, a MultiDisplay window will appear with the ID of the newly submitted job.
  8. Wait until the job finishes. When it is ready, the workflow will present you with the path where your output files were downloaded. You can now read the StdOutput file, which will contain the output of your script. If you followed this exercise exactly and used the same script, you will see the contents of your input file.
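Before submitting, the job from this exercise can be dry-run locally. The sketch below simply reproduces, in a scratch directory, the assumption that the grid job sees both uploaded files side by side in its working directory:

```shell
# Local dry run of the exercise: recreate the job's working directory
# and run the script exactly as the grid node would.
workdir=$(mktemp -d)
cd "$workdir"
echo "My own input data" > input.txt
printf '#! /bin/bash\ncat input.txt\n' > tutorial.sh
chmod +x tutorial.sh
./tutorial.sh   # prints "My own input data"
```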

2.5 Submitting a parametric grid job

After this exercise you will:
  • know how to submit many jobs at once
  • know how to specify parametric jobs
Parametric Job Submission Exercise (approx. 20 minutes)

Film available: http://www.youtube.com/watch?v=WXBjlHLiU3w

In this exercise you will submit a parametric job. You will set the names of the parameters that will be passed to each executed subjob.

In order to run the workflow follow these instructions:

  1. Start Kepler application by issuing:
    cd $KEPLER
    ant run
    
  2. Open template workflow by issuing: File -> Open and navigate to:
    $HOME/serpens/demo-ITM-09.2010/core/workflow/grid/glite/template.xml
    
  3. Set the jobType parameter to the value parametric.
  4. Set the parametricType parameter to the value list.
  5. Set the commandLine parameter to a command that will allow each job to present its parameter's value, for example: /bin/echo _PARAM_

    When you submit a parametric job, every occurrence of the string _PARAM_ is substituted with the value of the current job's parameter. This special keyword _PARAM_ may occur as an argument on the command line or in an input/output file name. So if you submit 5 jobs with parameters {"a", "b", "c", "d", "e"}, and each job writes to the file test-_PARAM_, then you will get 5 output files: {"test-a", "test-b", "test-c", "test-d", "test-e"}.

  6. Press the "Play" button.
  7. In a moment, a MultiDisplay window will appear with the ID of the newly submitted job. This is the ID of the master job, which is a metaname for the 5 subjobs you just submitted.
  8. Each job may finish at a different time. As each one finishes, its output files will be downloaded while the remaining jobs continue to be processed.
  9. The workflow finishes when all subjobs are done.
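The _PARAM_ substitution can be illustrated with a purely local sketch (this only mimics what the middleware does for each subjob; none of it runs on the grid):

```shell
# Local illustration of _PARAM_ substitution: for each parameter value,
# the placeholder is replaced both in the command line and in file names.
for p in a b c d e; do
  /bin/echo "parameter: $p" > "test-$p"   # test-_PARAM_ as an output file
done
ls test-a test-b test-c test-d test-e     # five files, one per subjob
```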

3. Executing Grid workflows (using UNICORE and RAS-Vine middleware)


3.1 Template workflow for serial applications

After this exercise you will:
  • understand the basic concepts of an HPC-related workflow
  • understand the concepts related to uploading and downloading files to/from UNICORE storage
  • understand the concepts of job submission and checking a job's status
Exercise no. 3.1 (approx. 15 min)

In this exercise we will take a look at the workflow used for HPC job submission. This workflow can be used as a basis for HPC job submission (as a composite actor).

A. Start Kepler if you haven't done so already

cd ~/kepler
ant run-dev

B. After Kepler is loaded, open the template workflow for submitting HPC jobs

$HOME/serpens/demo-ITM-09.2010/workflows/grid/unicore/template-unicore.xml

C. You should see the following workflow on the screen

D. Right-click the HPC actor and choose Open - you should see the composite actor expand

3.2 Executing simple application at HPC (preinstalled)

After this exercise you will:
  • know how to modify settings for the submission of serial HPC jobs
  • know how to retrieve the Standard Output and Standard Error of the application
Exercise no. 3.2 (approx. 15 min)

Film available: http://www.youtube.com/watch?v=MGT625QbM34

In this exercise we will submit a simple, serial application (e.g. /bin/ls). We will take a look at the parameters, the executable, and the inputs and outputs.

A. Start Kepler if you haven't done so already

cd ~/kepler
ant run-dev

B. After Kepler is loaded, open the template workflow for submitting HPC jobs

$HOME/serpens/demo-ITM-09.2010/workflows/grid/unicore/template-unicore-ls.xml

C. Right-click the HPC actor and choose Open - you should see the composite actor expand

We will pay attention to the following parameters (marked in green)

D. After the workflow is loaded, you can execute it

E. You can play with the "Executable" and "Arguments" parameters (e.g. you can use /bin/echo as the Executable and "Hello world" as the Arguments)
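What the HPC job executes in this exercise is trivial to reproduce locally. The sketch below mirrors the suggested Executable/Arguments pair and shows how the standard streams end up in separate files (the file names here are illustrative, not the names UNICORE itself uses):

```shell
# Mirror of the suggested job: /bin/echo as the Executable, "Hello world"
# as the Arguments, with stdout and stderr captured to separate files.
/bin/echo "Hello world" > stdout.txt 2> stderr.txt
cat stdout.txt   # prints "Hello world"
```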

3.3 Executing simple application at HPC (submitting application)

After this exercise you will:
  • know how to modify settings for the submission of serial HPC jobs
  • know how to retrieve the Standard Output and Standard Error of the application
  • know how to submit an application as input files
Exercise no. 3.3 (approx. 15 min)

Film available: http://www.youtube.com/watch?v=ioe7r2ZLMcs

In this exercise you will further modify the serial example. This time we will submit a startup script together with the application and take a closer look at the possibilities this solution offers.

A. Start Kepler if you haven't done so already

cd ~/kepler
ant run-dev

B. After Kepler is loaded, open the template workflow for submitting HPC jobs

$HOME/serpens/demo-ITM-09.2010/workflows/grid/unicore/template-unicore-script.xml

C. Right-click the HPC actor and choose Open - you should see the composite actor expand

We will pay attention to the following parameters (marked in green)

D. After the workflow is loaded, you can execute it
