GT3 Index Service
User’s Guide
Introduction
This document presents the capabilities of the Globus Toolkit® Version 3.0
(GT3) Index Service and describes how to configure and use those
capabilities for generating, aggregating, indexing, and querying Grid
services and service data.
GT3 Prerequisite Documents
User’s Guide Core Framework
Admin Guide
GT3 Index Service Overview
Audience
This document is intended primarily for developers and system
administrators who want to set up and use the Index Service to generate,
aggregate, index, and query service data; create service data
notifications and subscriptions; and create an Index Service hierarchy. A
knowledge of OGSA and Grid concepts, particularly as described in
The Physiology of the Grid, is presumed, as is a working knowledge
of Java.
Assumptions
This document assumes that the Globus Toolkit 3.0 has been installed and
configured for your particular computing environment.
Organization
This document contains the following sections:
Related Documents
Each Grid service instance has a particular set of service data associated
with it. The essence of the Index Service is to provide an interface for
operations accessing, aggregating, generating, and querying this service
data.
The Index Service uses an extensible framework for managing static and
dynamic data for Grids built using GT3. Note that the Index Service does
not itself define specific data types. The types of data available from the Index
Service for queries and other operations instead depend on how the Service
is configured; that is, what sorts of Service Data Providers it uses to
aggregate data and what sorts of other services it indexes.
The Index Service provides the following key capabilities:
- The Index Service provides a standard mechanism for dynamic generation and
  management of service data via external programs. These external provider
  programs can be the core providers that are part of GT3 or user-created,
  custom providers.
- Service data coming in from multiple Service Data Provider programs and
  other Grid services can be aggregated in different ways and indexed to
  provide efficient query processing. Various command-line tools and GUIs
  can be used as clients to the aggregate data views. The Index Service uses
  standard OGSA notification mechanisms for subscription, notification, and
  updating of service data.
- A set of available Grid services is maintained in a Registry. A Registry
  allows for soft-state registration of Grid services, in that a set of
  services can be registered and periodically updated as required. A
  Registry then can be used to support query or other operations on a given
  service. A Registry is also known as a Service Group.
- The Index Service combines ServiceDataProviderExecution components with
  ServiceDataAggregator and ServiceGroup components to create a dynamic
  data-generating and indexing node, similar in concept to an MDS2
  hierarchical GIIS. This capability can be used to create hierarchical
  Index Services.
This GT3 Beta version of the Index Service supports multithreaded,
recurring execution of providers via Java TimerTask and asynchronous
notification of changed Service Data to listening OGSA Notification
Sinks. It is also possible to create server-side subscriptions to the
Service Data of other OGSA services; the notification messages from these
server-side subscriptions are cached in the local Service Data container.
This is known as Service Data Aggregation.
This GT3 Beta version of the Index Service also supports transparent
Service Data persistence through the use of the Apache Xindice XML-native
database platform. This option is disabled by default and can be enabled
via a configuration switch. JDK 1.4 or greater is required on Linux
distributions in order to use Xindice.
Prerequisites
- Build or install GT3 Beta.
- If you are planning to use the OGSA Service Browser GUI, in order to have
  the required GUI control panels for the Index Service, you must add the
  following lines to the “client-gui-config.xml” file in your GT3 Beta
  installation directory:
<panel portType="ServiceDataProviderExecutionPortType" class="org.globus.ogsa.gui.ProviderExecutionPortTypePanel"/>
<panel portType="ServiceDataAggregatorPortType" class="org.globus.ogsa.gui.AggregatorPortTypePanel"/>
Using Service Data Providers
By default, the following Service Data Providers are available in the
Index Service:
SimpleSystemInformation:
This is a Java native system probe. This provider enumerates the following
data: CPU count, memory statistics, OS type, and logical disk volumes.
HostScriptProvider:
This is a Linux-specific set of shell scripts that monitor system-specific
host data. This package must be installed and deployed to your GT3
installation base.
These providers are executed when the Index Service activates, based on
the execution parameters specified in the “index-service-config.xml” file
found in the “etc” directory of your GT3 installation root.
Procedure for Executing a Service Data Provider via the OGSA Service Browser GUI
1. Run "ant startContainer" to launch the standalone OGSA Service Container.
2. Run "ant gui" to launch the Service Browser GUI.
3. Locate the IndexService in the browser list and double-click to activate
   the service.
4. Select a Provider from the ServiceData Providers list and adjust any of
   the default provider execution parameters.
5. Name the new service data that will be created from the output of the
   selected provider. If you do not specify a name, the name of the root
   element of the XML output of the provider will be used.
6. Click "Create" to create the new service data from the output of the
   provider. If you specify ‘0’ for the refresh frequency, the provider will
   run once and terminate. If you specify a positive refresh frequency, the
   provider will execute again each time that number of seconds has elapsed.
   In addition, subscribers to this data will now be notified whenever the
   ServiceData changes as a result of provider execution.
7. Issue a Query by Service Data Name on the name with which you associated
   the output of the provider. If the provider has run successfully, you
   should see the results displayed in the XML tree view in the Grid Service
   panel of the GUI. You should now be able to subscribe to this ServiceData
   from another service.
8. Optionally, issue an XPath query against the data created. See the OGSA
   XPath Documentation (included in the distribution) for samples.
Using ServiceData Aggregation
1. Create another Index Service in order to test notification caching,
   otherwise known as ServiceData Aggregation. The Index Service is a
   persistent service, which means that to add another instance you must add
   an additional service descriptor entry to the server-config.wsdd file in
   your GT3 root directory. For this simple test, make a copy of the
   existing Index Service descriptor and change the service “name” attribute
   to something other than “base/index/IndexService”. Also, change the
   “name” parameter in order to change the display name of the service. Your
   result should look something like this:
<service name="base/index/IndexService2" provider="Handler" style="wrapped">
<parameter name="name" value="Index Service 2"/>
... (no additional changes needed)
</service>
2. Run "ant startContainer" to launch the standalone OGSA Service Container.
3. Run "ant gui" to launch the Service Browser GUI.
4. Use the Service Browser GUI to activate both Index Service instances.
5. Use the controls in the "Manage Subscriptions" group box of the secondary
   Index Service instance to create a subscription to the Service Data
   element on the primary Index Service instance. Do this by specifying the
   URL of the primary Index Service instance in the "Source" field, followed
   by the ServiceData QName of the SDE to which you are subscribing in the
   "New Service Data Name" and “New Service Data Namespace” fields. Then,
   enter the URL of the secondary Index Service instance in the "Sink" field
   and enter a positive lifetime in seconds for the new sink in the "Sink
   Lifetime" field. If all goes well, you should see a message box
   indicating that the subscription was successful, followed by subsequent
   notifications.
6. Run a query by name on the second instance for the SDE for which you
   created the subscription. If a notification has occurred from the first
   service instance to the second, the Service Data Element from the first
   instance should be returned by the second instance, where it is being
   cached in that service's local Service Data Container.
To use the capabilities of the Index Service for your particular operational
needs and environment, you need to configure it for the Grid services and
Service Data Providers you intend to use, the kind of service data you
want to aggregate, and the service data notifications and subscriptions
you desire. You perform this configuration with the Index Service
configuration file, index-service-config.xml.
The index-service-config.xml file is included in your GT3 installation and
resides by default in <gt3-install-location>/etc. The default contents of
this file are described in Index Service Configuration File Contents below.
In relation to your GT3 installation and configuration, each persistent Index
Service instance you create requires the following:
- One service descriptor entry in the server configuration file,
  server-config.wsdd.
- One Index Service configuration file, index-service-config.xml.
Server Configuration File
The server-config.wsdd file is also included in the GT3 installation and
resides by default in <gt3-install-location>. This file contains all of
the deployment descriptors for every service in the hosting environment.
By default, there is a single descriptor present for the Index Service.
If you want additional persistent Index Service instances, you just add
more descriptors, but with different service names.
The default Index Service descriptor in server-config.wsdd appears as
follows:
<service name="base/index/IndexService" provider="Handler" style="wrapped">
  <parameter name="name" value="Index Service"/>
  <parameter name="schemaPath" value="schema/base/index/index_service.wsdl"/>
  <parameter name="className" value="org.globus.ogsa.base.index.IndexService"/>
  <parameter name="baseClassName" value="org.globus.ogsa.impl.base.index.IndexServiceImpl"/>
  <parameter name="instance-schemaPath" value="schema/ogsi/ogsi_service_group_entry_service.wsdl"/>
  <parameter name="instance-baseClassName" value="org.globus.ogsa.impl.ogsi.ServiceGroupEntryImpl"/>
  <parameter name="factoryCallback" value="org.globus.ogsa.impl.ogsi.DynamicFactoryCallbackImpl"/>
  <parameter name="operationProviders" value="org.globus.ogsa.impl.ogsi.NotificationSourceProvider"/>
  <parameter name="handlerClass" value="org.globus.ogsa.handlers.RPCURIProvider"/>
  <parameter name="persistent" value="true"/>
  <parameter name="allowedMethods" value="*"/>
  <parameter name="sweepServiceData" value="false"/>
  <parameter name="entryInstanceCreation" value="true"/>
  <parameter name="disableFactoryRegistry" value="true"/>
  <parameter name="serviceConfig" value="etc/index-service-config.xml"/>
  <parameter name="xindiceEnabled" value="false"/>
  <parameter name="xindiceURI" value="xindice-embed:///db"/>
</service>
The parameter names above are described in more detail in the
User’s Guide Core Framework and the
Java Programmer’s Guide Core Framework. The parameters of
particular interest for the Index Service are as follows:
<parameter name="serviceConfig" value="etc/index-service-config.xml"/>
The serviceConfig parameter is the path to the Index Service configuration
file. It is a required parameter. Although the default
index-service-config.xml file is provided in the GT3 installation, this file
can be copied and modified, and then this descriptor entry can be updated to
reflect the location and name of the new configuration file to use.
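As a minimal sketch of the above, assume the default file has been copied to a hypothetical etc/index-service-config-2.xml and edited. The descriptor for a second instance would then differ from the default descriptor only in its service name, display name, and serviceConfig value. The service name base/index/IndexService2 and the file name are illustrative and are not part of the distribution:

```xml
<service name="base/index/IndexService2" provider="Handler" style="wrapped">
  <parameter name="name" value="Index Service 2"/>
  <!-- All remaining parameters are copied unchanged from the default
       Index Service descriptor; only serviceConfig is shown here. -->
  <!-- Hypothetical file name: a modified copy of etc/index-service-config.xml -->
  <parameter name="serviceConfig" value="etc/index-service-config-2.xml"/>
</service>
```

Keeping one configuration file per instance lets each Index Service run its own set of providers and subscriptions.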
<parameter name="xindiceEnabled" value="true"/>
<parameter name="xindiceURI" value="xindice-embed:///db"/>
These optional parameters indicate, respectively, whether or not the
Xindice XML database package should be used to persist service data, and
if Xindice is to be used, the URI of the Xindice database root.
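For example, to turn on persistence while keeping the default database root, only the xindiceEnabled value should need to change. This is a sketch based on the default descriptor; it assumes that "true" is the value that enables the switch, by analogy with the other boolean parameters in the descriptor:

```xml
<!-- Assumption: "true" enables Xindice persistence for this service -->
<parameter name="xindiceEnabled" value="true"/>
<!-- Default database root, unchanged -->
<parameter name="xindiceURI" value="xindice-embed:///db"/>
```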
Index Service Configuration File Functions
The functions of the Index Service configuration file are as follows:
- Specifies the Service Data Providers (core, custom, and ported from GT2)
  to be enabled for each service referencing this configuration file.
- Specifies which (if any) of the enabled providers are to be executed at
  startup and/or when the configuration file is reread, along with any
  parameters relevant to the provider’s execution.
- Specifies notification and subscription of service data to other service
  instances, which allows for aggregation of service data from multiple
  services.
The following section shows and describes the default Index Service
configuration file.
Index Service Configuration File Contents
The following shows the default contents of the Index Service
configuration file, index-service-config.xml. The sections of this file
are described in detail following the contents.
<?xml version="1.0" encoding="UTF-8" ?>
<serviceConfiguration xmlns:ogsi="http://www.gridforum.org/namespaces/2003/03/OGSI"
    xmlns:aggregator="http://www.globus.org/namespaces/2003/04/service_data_aggregator"
    xmlns:provider-exec="http://www.globus.org/namespaces/2003/04/service_data_provider_execution"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <installedProviders>
    <providerEntry class="org.globus.ogsa.impl.base.providers.servicedata.impl.SimpleSystemInformationProvider" />
    <providerEntry class="org.globus.ogsa.impl.base.providers.servicedata.impl.AsyncDocumentProvider" />
    <providerEntry class="org.globus.ogsa.impl.base.providers.servicedata.impl.ScriptExecutionProvider" />
    <providerEntry class="org.globus.ogsa.impl.base.providers.servicedata.impl.HostScriptProvider" />
  </installedProviders>
  <executedProviders>
    <provider-exec:ServiceDataProviderExecution>
      <provider-exec:serviceDataProviderName>SystemInformation</provider-exec:serviceDataProviderName>
      <provider-exec:serviceDataProviderImpl>org.globus.ogsa.impl.base.providers.servicedata.impl.SimpleSystemInformation</provider-exec:serviceDataProviderImpl>
      <provider-exec:serviceDataProviderArgs> </provider-exec:serviceDataProviderArgs>
      <provider-exec:serviceDataName>SystemInformation</provider-exec:serviceDataName>
      <provider-exec:refreshFrequency>60</provider-exec:refreshFrequency>
      <provider-exec:async>true</provider-exec:async>
    </provider-exec:ServiceDataProviderExecution>
  </executedProviders>
  <!-- Sample Subscription Configuration -->
  <aggregatedSubscriptions>
    <!--<aggregator:AggregatorSubscription>
      <aggregator:serviceDataName>SystemInformation</aggregator:serviceDataName>
      <ogsi:source>http://127.0.0.1:8080/ogsa/services/base/gram/ResourceInformationProviderService</ogsi:source>
      <aggregator:lifetime>1200</aggregator:lifetime>
    </aggregator:AggregatorSubscription>-->
  </aggregatedSubscriptions>
</serviceConfiguration>
serviceConfiguration
The serviceConfiguration section in index-service-config.xml specifies
Grid service, service data aggregator, and Service Data Provider
namespaces. This section in the default file is as follows:
<serviceConfiguration
xmlns:ogsi="http://www.gridforum.org/namespaces/2003/03/OGSI"
xmlns:aggregator="http://www.globus.org/namespaces/2003/04/service_data_aggregator"
xmlns:provider-exec="http://www.globus.org/namespaces/2003/04/service_data_provider_execution"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
Note that namespaces (identified by URLs) are specified above for OGSI,
Service Data Aggregator, Service Data Provider execution, and XML schema.
These namespaces are required so that the Apache Axis SOAP engine can
properly render the XML entries in the file into runtime objects.
installedProviders
The installedProviders section in the default index-service-config.xml
file is as follows:
<installedProviders>
  <providerEntry class="org.globus.ogsa.impl.base.providers.servicedata.impl.SimpleSystemInformationProvider" />
  <providerEntry class="org.globus.ogsa.impl.base.providers.servicedata.impl.AsyncDocumentProvider" />
  <providerEntry class="org.globus.ogsa.impl.base.providers.servicedata.impl.ScriptExecutionProvider" />
  <providerEntry class="org.globus.ogsa.impl.base.providers.servicedata.impl.HostScriptProvider" />
</installedProviders>
This section specifies the core Service Data Providers as shown above, as
well as any Java custom or ported Service Data Providers. Every Java
Service Data Provider specified in the executedProviders section
(described below) must also be listed here in the installedProviders
section; MDS2-style Unix information providers are excepted.
The only required attribute in the installedProviders section is class,
under providerEntry. The class attribute contains the name of a Java class
that implements a ServiceDataProvider interface.
An optional handler parameter can be used to indicate a user-supplied custom
callback method where the resulting data will be sent for any
post-processing.
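As an illustration, a user-created provider would be registered by adding one more providerEntry alongside the core providers. The class name org.example.providers.MyCustomProvider is hypothetical and stands in for any class that implements a ServiceDataProvider interface. The handler parameter is left out because its exact syntax is not shown in this document:

```xml
<installedProviders>
  <!-- Core provider shipped with GT3 -->
  <providerEntry class="org.globus.ogsa.impl.base.providers.servicedata.impl.SimpleSystemInformationProvider" />
  <!-- Hypothetical custom provider; replace with your own class name -->
  <providerEntry class="org.example.providers.MyCustomProvider" />
</installedProviders>
```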
executedProviders
The executedProviders section in the default index-service-config.xml file
is as follows:
<executedProviders>
  <provider-exec:ServiceDataProviderExecution>
    <provider-exec:serviceDataProviderName>SystemInformation</provider-exec:serviceDataProviderName>
    <provider-exec:serviceDataProviderImpl>org.globus.ogsa.impl.base.providers.servicedata.impl.SimpleSystemInformation</provider-exec:serviceDataProviderImpl>
    <provider-exec:serviceDataProviderArgs> </provider-exec:serviceDataProviderArgs>
    <provider-exec:serviceDataName>SystemInformation</provider-exec:serviceDataName>
    <provider-exec:refreshFrequency>60</provider-exec:refreshFrequency>
    <provider-exec:async>true</provider-exec:async>
  </provider-exec:ServiceDataProviderExecution>
</executedProviders>
Each execution of a Service Data Provider specified in this section
produces one or more pieces of service data.
The same provider from the installedProviders section, such as the
ScriptExecutionProvider, may be executed by more than one entry in this
section.
The serviceDataProviderArgs parameter supplies arguments to the Service Data
Provider specified in this entry.
The serviceDataName parameter specifies the name of the new service data to
create. If no name is specified, the name will be created from the tag name
of the root element of the XML document that is the result of the provider
execution. The name may be qualified with a namespace, using standard XML
QName syntax. For example:
<provider-exec:serviceDataName xmlns:ns="http://www.globus.org/example">ns:SystemInformation</provider-exec:serviceDataName>
The refreshFrequency parameter indicates how often (in seconds) to run this
Service Data Provider.
The async parameter indicates that the Service Data Provider specified
should attempt to run asynchronously if it is capable of doing so.
Every Java Service Data Provider specified in this section must also be
specified in the installedProviders section (described above) of the
configuration file; MDS2-style Unix information providers are excepted.
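The parameters above can be combined to run one installed provider more than once. The following sketch keeps the default entry and adds a second execution of the same provider under a namespace-qualified name with a longer refresh interval. The second entry, its name ns:SystemInformationHourly, the namespace URI, and the 3600-second interval are illustrative assumptions and do not appear in the default file:

```xml
<executedProviders>
  <!-- Default entry: refreshed every 60 seconds -->
  <provider-exec:ServiceDataProviderExecution>
    <provider-exec:serviceDataProviderName>SystemInformation</provider-exec:serviceDataProviderName>
    <provider-exec:serviceDataProviderImpl>org.globus.ogsa.impl.base.providers.servicedata.impl.SimpleSystemInformation</provider-exec:serviceDataProviderImpl>
    <provider-exec:serviceDataProviderArgs> </provider-exec:serviceDataProviderArgs>
    <provider-exec:serviceDataName>SystemInformation</provider-exec:serviceDataName>
    <provider-exec:refreshFrequency>60</provider-exec:refreshFrequency>
    <provider-exec:async>true</provider-exec:async>
  </provider-exec:ServiceDataProviderExecution>
  <!-- Illustrative second entry: same provider, hypothetical QName,
       refreshed once per hour -->
  <provider-exec:ServiceDataProviderExecution>
    <provider-exec:serviceDataProviderName>SystemInformation</provider-exec:serviceDataProviderName>
    <provider-exec:serviceDataProviderImpl>org.globus.ogsa.impl.base.providers.servicedata.impl.SimpleSystemInformation</provider-exec:serviceDataProviderImpl>
    <provider-exec:serviceDataProviderArgs> </provider-exec:serviceDataProviderArgs>
    <provider-exec:serviceDataName xmlns:ns="http://www.globus.org/example">ns:SystemInformationHourly</provider-exec:serviceDataName>
    <provider-exec:refreshFrequency>3600</provider-exec:refreshFrequency>
    <provider-exec:async>true</provider-exec:async>
  </provider-exec:ServiceDataProviderExecution>
</executedProviders>
```

Giving each execution its own serviceDataName keeps the two results as separate service data elements that can be queried and subscribed to independently.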
aggregatedSubscriptions
The aggregatedSubscriptions section in the default index-service-config.xml
file is in the Sample Subscription Configuration section and appears as
follows:
  <!-- Sample Subscription Configuration -->
  <aggregatedSubscriptions>
    <!--<aggregator:AggregatorSubscription>
      <aggregator:serviceDataName xmlns:ce="http://glue.base.ogsa.globus.org/ce/1.1">ce:Cluster</aggregator:serviceDataName>
      <ogsi:source>http://127.0.0.1:8080/ogsa/services/base/gram/ResourceInformationProviderService</ogsi:source>
      <aggregator:lifetime>1200</aggregator:lifetime>
    </aggregator:AggregatorSubscription>-->
  </aggregatedSubscriptions>
</serviceConfiguration>
This section specifies the Grid services to be indexed by the Index
Service. Aggregation is configured in this section.
The serviceDataName parameter specifies the QName of the Service Data to
which the Index Service will subscribe.
The source parameter specifies the service on which the subscription will be
made.
The lifetime parameter specifies the lifetime of the subscription, in
seconds.
The above example shows the Index Service subscribing to the GRAM RIPS
service in order to aggregate information about the Cluster resources
available.
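Because the sample subscription is commented out in the default file, no aggregation takes place until the comment markers are removed. As a sketch of an active entry, the following subscribes a second Index Service instance to the SystemInformation service data of the default instance. The source URL is an assumption, formed by analogy with the RIPS URL above and the default service name base/index/IndexService; adjust the host, port, and path for your container:

```xml
<aggregatedSubscriptions>
  <aggregator:AggregatorSubscription>
    <!-- Service data produced by the default executedProviders entry -->
    <aggregator:serviceDataName>SystemInformation</aggregator:serviceDataName>
    <!-- Assumed URL of the primary Index Service instance -->
    <ogsi:source>http://127.0.0.1:8080/ogsa/services/base/index/IndexService</ogsi:source>
    <!-- Subscription lifetime in seconds -->
    <aggregator:lifetime>1200</aggregator:lifetime>
  </aggregator:AggregatorSubscription>
</aggregatedSubscriptions>
```

This is the configuration-file counterpart of the subscription created interactively in the Using ServiceData Aggregation procedure.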