Stuart Martin
Draft January 08, 2003
This document gives an overview of the
Grid Resource Allocation and Management (GRAM) implementation in Globus Toolkit
version 3 (GT3). GRAM is the lowest level of the Globus resource
management architecture. It allows you to run jobs remotely, using a
set of WSDL/OGSI client interfaces for
submitting, monitoring, and terminating a job.
Architectural Walkthrough
For the above diagram, we walk through a typical GT3 GRAM service invocation from the point of view of the resource and the user.
1. The Master configures the Redirector to direct createService calls sent to it through the Starter UHE.
2. The Master publishes its handle to a remote registry.
3. A client submits a createService request, which is received by the Redirector.
4. The Redirector calls the Starter UHE class, which authorizes the request via the grid-mapfile, determines the local username and port to be used, and constructs a target URL.
5. The Redirector attempts to forward the call to that target URL. If it is unable to forward the call because the UHE is not up, the Launch UHE module is invoked.
6. The Launch UHE creates a new UHE process under the authenticated user's local uid.
7. The Starter UHE waits for the UHE to start up (ping loop) and returns the MJFS URL to the Redirector.
8. The Redirector forwards the createService call to the MJFS unmodified, and mutual authentication/authorization can take place.
9. The MJFS creates a new Managed Job Service (MJS).
10. The MJS submits the job to a back-end scheduling system.
11. Subsequent calls to the MJS from the client are redirected through the Redirector.
12. RIPS provides data to the MJS instances and the Master. It gathers data from the local scheduling system, file system, host information, and so on.
13. FindServiceData requests to the Master either return an SDE (populated by the Service Data Aggregator) or are redirected to the MJFS of the requestor's UHE.
14. In order to stream stdout/stderr back to the client, the MJS creates two File Stream Factory Services (FSFS), one for stdout and one for stderr.
15. The MJS then creates the File Stream Service (FSS) instances specified in the job request.
16. The GRIM handler is run in the UHE to create a user host certificate. The user host certificate is used for mutual authentication between the MJS and the client.
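Steps 5 through 8 of the walkthrough (forward if the UHE is up, otherwise launch it first and then forward) can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class name `RedirectorSketch`, the method `handleCreateService`, the `trace` list, and the URL shape are all hypothetical and are not taken from the GT3 code base. The launch and ping loop are simulated by adding the port to a set.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of walkthrough steps 5-8; none of these names come from GT3. */
class RedirectorSketch {
    /** Ports on which a (simulated) UHE is currently running. */
    private final Set<Integer> runningUhes = new HashSet<>();

    /** Human-readable record of the actions taken, for inspection. */
    final List<String> trace = new ArrayList<>();

    /**
     * Forwards a createService call to the user's UHE, launching the UHE
     * first when nothing is running on the resolved port.
     */
    String handleCreateService(String localUser, int port) {
        // Assumed URL shape; the real target URL format is not given in the text.
        String targetUrl = "http://localhost:" + port + "/mjfs";
        if (!runningUhes.contains(port)) {
            // Stands in for the Launch UHE module plus the Starter UHE ping loop.
            trace.add("launch UHE for " + localUser + " on port " + port);
            runningUhes.add(port);
        }
        trace.add("forward createService to " + targetUrl);
        return targetUrl;
    }
}
```

In this model, the first request for a given user records a launch followed by a forward, while later requests for the same port record only the forward.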
Virtual Host Environment Redirector
The Redirector accepts all incoming SOAP messages and redirects them to the appropriate User Hosting Environment (UHE). This component is part of Core; see the Core documentation for details.
Starter UHE
This Java class is used by the Redirector to resolve
incoming calls to a user hosting environment (UHE). The grid-mapfile is used to obtain
the username corresponding to a particular subject DN, and one UHE is run per
user on a machine.
The mapping from a username to the port number of that user's UHE
is maintained in a configuration file.
When a request to resolve a URL comes in and an entry is found in the
configuration file, the target URL is constructed and returned to the
Redirector. If the UHE on that port number is not up, the setuid/launch
module is used to launch a UHE as the user.
If no entry exists in the configuration file, a free port number is
chosen and the setuid/launch module is used to start a UHE on that
port as the user. Once the UHE is confirmed to be running, the local URL
is returned to the Redirector, and the configuration file is updated with
the new entry.
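The resolution logic described above can be sketched as follows. This is a minimal sketch, not the GT3 implementation: `StarterUheSketch` and its methods are hypothetical names, the port mapping is kept in memory rather than in a configuration file, "choosing a free port" is reduced to incrementing a counter, and the returned URL shape is assumed. The only format taken from real deployments is the grid-mapfile line layout, a quoted subject DN followed by a local username.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of DN-to-user and user-to-port resolution. */
class StarterUheSketch {
    /** Subject DN -> local username, as read from grid-mapfile lines. */
    private final Map<String, String> gridMap = new HashMap<>();

    /** Local username -> UHE port (stands in for the configuration file). */
    private final Map<String, Integer> userPorts = new LinkedHashMap<>();

    /** Assumption: free ports are handed out sequentially from this value. */
    private int nextFreePort;

    StarterUheSketch(int firstFreePort) {
        this.nextFreePort = firstFreePort;
    }

    /** Parses one grid-mapfile style line: "<subject DN>" <username>. */
    void addGridMapLine(String line) {
        int close = line.lastIndexOf('"');
        if (!line.startsWith("\"") || close <= 0) {
            throw new IllegalArgumentException("expected a quoted DN: " + line);
        }
        String dn = line.substring(1, close);
        String user = line.substring(close + 1).trim();
        gridMap.put(dn, user);
    }

    /** Returns the target URL for the caller's UHE, assigning a port if needed. */
    String resolve(String subjectDn) {
        String user = gridMap.get(subjectDn);
        if (user == null) {
            throw new SecurityException("no grid-mapfile entry for " + subjectDn);
        }
        Integer port = userPorts.get(user);
        if (port == null) {
            port = nextFreePort++;       // new user: pick a port and remember it
            userPorts.put(user, port);
        }
        return "http://localhost:" + port + "/"; // assumed URL shape
    }
}
```

Repeated calls for the same subject DN return the same URL, which reflects the one-UHE-per-user rule; an unmapped DN is rejected before any port is assigned.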
Launch UHE
A simple Java class that calls a C program to start a new hosting environment under the user's account. The C program, which must be setuid root, does an su/fork/exec of a shell script that starts the UHE. The path to the launchScript.sh script is fixed when the C program is compiled, which limits the root exposure to starting a new hosting environment as a user.
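A Java class of this kind might prepare the call to the setuid helper roughly as shown below. Everything specific here is an assumption made for illustration: the helper path `/opt/gram/bin/setuid-launcher`, the argument order (username, then port), and the validation rules are hypothetical. The sketch only builds the `ProcessBuilder` and does not start a process.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a Java wrapper around a setuid launch helper. */
class LaunchUheSketch {
    /** Assumed install path of the setuid-root C program. */
    static final String SETUID_HELPER = "/opt/gram/bin/setuid-launcher";

    /**
     * Builds (but does not start) the helper invocation. The script path is
     * deliberately not an argument, since the text says it is compiled into
     * the C program.
     */
    static ProcessBuilder buildLaunch(String localUser, int port) {
        // Reject anything that does not look like a plain local account name.
        if (!localUser.matches("[a-z_][a-z0-9_-]*")) {
            throw new IllegalArgumentException("suspicious username: " + localUser);
        }
        if (port < 1024 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        List<String> command =
                Arrays.asList(SETUID_HELPER, localUser, Integer.toString(port));
        // Arguments are passed as a list, so no shell parses them.
        return new ProcessBuilder(command);
    }
}
```

Calling `start()` on the returned builder would run the helper. Passing the arguments as a list rather than a single shell string keeps the Java side from adding to the root exposure that the compiled-in script path is meant to limit.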
Master
The Master Managed Job Factory Service is responsible for exposing the virtual GRAM service to the outside world. It configures the Redirector to direct createService calls sent to it through the Starter UHE and Launch UHE, so that they eventually arrive unmodified at the MJFS. The Redirector is instructed to redirect subsequent createService calls sent to it to a user's hosting environment.
The Master uses the Service Data Aggregator to collect and populate local Service Data Elements (SDEs), which represent local scheduler data (e.g. freenodes, totalnodes) and general host information (e.g. host CPU type, host OS). If a FindServiceData (FSD) request is for a known MJFS SDE, it is redirected to the MJFS of the UHE. All other FSD queries are handled locally.
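The routing rule for FindServiceData requests can be sketched as a lookup against a set of known MJFS SDE names. This is an illustrative model only: `MasterFsdSketch`, the string results, and the SDE name `activeJobHandles` are hypothetical, and the local SDE values are plain strings rather than real service data.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of how the Master could route FindServiceData requests. */
class MasterFsdSketch {
    /** Names of SDEs that belong to the MJFS (assumed to be known up front). */
    private final Set<String> mjfsSdeNames = new HashSet<>();

    /** Local SDEs, as populated by the Service Data Aggregator. */
    private final Map<String, String> localSdes = new HashMap<>();

    void registerMjfsSde(String name) {
        mjfsSdeNames.add(name);
    }

    void setLocalSde(String name, String value) {
        localSdes.put(name, value);
    }

    /** Redirects MJFS SDE queries to the requestor's MJFS; answers the rest locally. */
    String findServiceData(String sdeName, String requestorMjfsUrl) {
        if (mjfsSdeNames.contains(sdeName)) {
            return "REDIRECT " + requestorMjfsUrl;
        }
        String value = localSdes.get(sdeName);
        return value == null ? "NOT_FOUND" : "LOCAL " + value;
    }
}
```

With `freenodes` set locally and `activeJobHandles` registered as an MJFS SDE, a query for the former is answered by the Master itself and a query for the latter is redirected to the requestor's UHE.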
Managed Job Factory Service (MJFS)
The Managed Job Factory Service is responsible for instantiating a new MJS when it receives a CreateService request. It also exposes a single Service Data Element which is an array of Grid Service Handles of all active MJS instances created by this factory. The MJFS stays up for the life of the UHE.
Managed Job Service (MJS)
An OGSA service that, given a job request specification, can submit a job to a local scheduler, monitor its status, and send notifications. The MJS starts two File Stream Factory Services (FSFS), one for the job's stdout and one for the job's stderr, and then starts the initial set of FSS instances specified in the job specification. The Grid Service Handles (GSHs) of the two FSFSs are available in the SDE of the MJS, which enables the client to start additional FSS instances for stdout/stderr or terminate existing ones. The MJS destroys the stdout and stderr File Stream Factories during its preDestroy operation.
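The ownership relationship between an MJS and its two stream factories can be sketched as below. The names `MjsSketch`, `StreamFactory`, and `createStream` are hypothetical; a real MJS and FSFS are OGSA services with Grid Service Handles, which this in-memory model replaces with plain objects and destination strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an MJS that owns one stream factory per output file. */
class MjsSketch {
    /** Stand-in for a File Stream Factory Service bound to one local file. */
    static class StreamFactory {
        final String localFile;
        /** One entry per stream instance created by this factory. */
        final List<String> destinations = new ArrayList<>();
        boolean destroyed = false;

        StreamFactory(String localFile) {
            this.localFile = localFile;
        }

        void createStream(String destinationUrl) {
            if (destroyed) {
                throw new IllegalStateException("factory destroyed");
            }
            destinations.add(destinationUrl);
        }
    }

    final StreamFactory stdoutFactory;
    final StreamFactory stderrFactory;

    /** Creates both factories, then the initial streams named in the job request. */
    MjsSketch(String stdoutPath, String stderrPath,
              List<String> stdoutDests, List<String> stderrDests) {
        stdoutFactory = new StreamFactory(stdoutPath);
        stderrFactory = new StreamFactory(stderrPath);
        for (String d : stdoutDests) stdoutFactory.createStream(d);
        for (String d : stderrDests) stderrFactory.createStream(d);
    }

    /** Mirrors the preDestroy step: both factories go away with the MJS. */
    void preDestroy() {
        stdoutFactory.destroyed = true;
        stderrFactory.destroyed = true;
    }
}
```

Because the factories stay reachable for the life of the job, a client can add further streams after submission; once `preDestroy` has run, any further `createStream` call fails.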
File Stream Factory Service (FSFS)
The File Stream Factory Service is responsible for instantiating a new File Stream Service instance when it receives a CreateService request. It exposes a single SDE, which is an array of Grid Service Handles of all FSS instances created by this factory.
File Stream Service (FSS)
An OGSA service that, given a destination URL, streams
the local file its factory was created for (stdout or stderr) to that
destination URL.
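At its core, the streaming step is a buffered copy from the local file to the destination. The sketch below shows that copy using plain `java.io` streams. The class and method names are hypothetical, and opening the local file and connecting to the destination URL are left to the caller, since the text does not say which transfer protocol is used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical sketch of the copy loop inside a file stream service. */
class FileStreamSketch {
    /**
     * Copies everything currently readable from source to destination and
     * returns the number of bytes copied.
     */
    static long stream(InputStream source, OutputStream destination) {
        byte[] buffer = new byte[4096];
        long total = 0;
        try {
            int n;
            while ((n = source.read(buffer)) != -1) {
                destination.write(buffer, 0, n);
                total += n;
            }
            destination.flush();
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch's signature simple.
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

This version stops at end of file. A service that follows a job's output while the job is still running would need to keep polling the file for new data, which is not shown here.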
RIPS
RIPS is a specialized notification service that provides raw data about a resource's scheduling system, file system, host system, etc. Some of the data may be privileged. The MJS instances subscribe to RIPS for notification of job state changes. The Master subscribes for data about the local scheduler (e.g. free/total nodes), the file system, and the host system.
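The subscription model described here resembles a topic-based publish/subscribe pattern, sketched below. All names are hypothetical (`RipsSketch`, `subscribe`, `publish`, and the topic strings such as `job:42`), data is reduced to plain strings, and the handling of privileged data is not modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch of a topic-based notification service. */
class RipsSketch {
    /** Topic name -> listeners interested in that topic. */
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    /** Registers a listener for one topic, e.g. a job id or "scheduler". */
    void subscribe(String topic, Consumer<String> listener) {
        subscribers.computeIfAbsent(topic, k -> new ArrayList<>()).add(listener);
    }

    /** Delivers data to every listener of the topic; other topics see nothing. */
    void publish(String topic, String data) {
        List<Consumer<String>> listeners = subscribers.get(topic);
        if (listeners == null) {
            return; // nobody subscribed to this topic
        }
        for (Consumer<String> listener : listeners) {
            listener.accept(data);
        }
    }
}
```

In this model an MJS would subscribe to the topic for its own job and the Master to a scheduler topic, so a job state change reaches only the MJS that manages that job.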