JCLReferenceUsingANTAdvanced

stockiNail edited this page Oct 12, 2015 · 3 revisions

ANT advanced utilities

JEM provides a set of utilities to manage internal entities in batch. The main benefit is that many statements can be executed together, so massive changes are handled in a single step.

These utilities are custom ANT tasks (so ANT JCL must be used) and they can manage:

  • GDG
  • Nodes
  • Roles
  • Common resources
  • Statistics
  • Archiving of jobs from the output queue

To be executed, all ANT utilities need one of the following authorizations:

  • Internal service permission
  • Administrator role
  • Grantor role

GDG utility

A generation data group (GDG) is a collection of historically related sequential data sets that are arranged in chronological order. That is, each data set is historically related to the others in the group. They have sequentially ordered absolute and relative names that represent their age. Older data sets have smaller absolute numbers.

The relative name is a signed integer that refers to a generation by its age: 0 is the latest generation, -1 is the one before it, and so forth.
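
The relative naming described above can be sketched in a few lines of Java. This is only an illustration of the idea, not JEM's implementation: the class name GdgRelativeName and the method resolve are hypothetical, and the list of absolute generation numbers is assumed to be already sorted from oldest to newest.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of JEM: maps a relative generation to an absolute one.
final class GdgRelativeName {

    /**
     * Resolves a relative generation (0 = latest, -1 = the one before, ...)
     * against absolute generation numbers sorted from oldest to newest.
     */
    static int resolve(List<Integer> oldestFirst, int relative) {
        if (relative > 0) {
            throw new IllegalArgumentException("only existing generations (<= 0) are resolved here");
        }
        int index = oldestFirst.size() - 1 + relative;
        if (index < 0) {
            throw new IllegalArgumentException("no generation at relative position " + relative);
        }
        return oldestFirst.get(index);
    }

    public static void main(String[] args) {
        List<Integer> generations = Arrays.asList(1, 2, 3, 4);
        System.out.println("latest:   " + resolve(generations, 0));
        System.out.println("previous: " + resolve(generations, -1));
    }
}
```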

The group of data sets (usually called root) is described by a properties file that contains the list of generations and the files related to them. To guarantee the consistency of this file, JEM asks GRS to lock the root file.

The utility is able:

  • to create a new GDG, initializing root.properties and optionally creating a first empty generation 0
  • to rebuild root.properties if it contains inconsistent information, aligning the properties file and the content of the folder
  • to clean up the folder by removing the oldest files
  • to delete and rename specific generations

The task parses a list of commands (separated by semicolons ;) found inside a data description called COMMAND (be careful: the name is case sensitive).
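
As a rough illustration of the semicolon-separated format, the following Java sketch splits the text of a COMMAND data set into single commands. It is an assumption about how such parsing could work, not the parser used by JEM; CommandSplitter is a hypothetical name, and the naive split does not handle semicolons inside quoted values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JEM: splits COMMAND content into commands.
final class CommandSplitter {

    /** Splits on ';', trims each command and collapses inner whitespace. */
    static List<String> split(String content) {
        List<String> commands = new ArrayList<>();
        for (String part : content.split(";")) {
            String command = part.trim().replaceAll("\\s+", " ");
            if (!command.isEmpty()) {
                commands.add(command);
            }
        }
        return commands;
    }

    public static void main(String[] args) {
        String content = "\n   DEFINE GDG GDG1;\n   DEFINE GDG GDG2 NOEMPTY;\n";
        for (String command : split(content)) {
            System.out.println("[" + command + "]");
        }
    }
}
```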

Defining GDG

Before using a GDG, the folder and the root must exist, otherwise an exception occurs. The command syntax is the following:

DEFINE GDG  [gdg-ddname]          ; 
                          NOEMPTY

gdg-ddname is the name of the data description where the GDG name is indicated (with disposition NEW); the GDG is created as a folder. With the NOEMPTY argument, an empty generation is created as well.

A complete sample follows:

<project name="CREATEGDG" default="step1" basedir=".">
   <description>
   CREATEGDG
   </description>
   
   <property name="jem.job.name" value="CREATEGDG"></property>
   <property name="jem.job.environment" value="ENV1"></property>
   
   <taskdef name="gdg" classname="org.pepstock.jem.ant.GDGTask"/>

   <target name="step1">
      <gdg>
         <dataDescription name="GDG1" disposition="NEW">
            <dataSet  name="gdg/test1"></dataSet>
         </dataDescription>
         <dataDescription name="GDG2" disposition="NEW">
            <dataSet  name="gdg/test2"></dataSet>
         </dataDescription>
         <dataDescription name="COMMAND" disposition="SHR">
            <dataSet >
               DEFINE GDG GDG1;
               DEFINE GDG GDG2 NOEMPTY;
            </dataSet>
         </dataDescription>
      </gdg>
   </target>
</project>

Rebuilding GDGs

It may happen that the root.properties file contains information that does not match the file system. The command syntax is the following:

REBUILD GDG [gdg-ddname]                      ;
                          MASTER(ROOT)
                          MASTER(GENERATIONS)

The REBUILD command aligns the data according to the master data source you choose. The master data source can be:

  • ROOT: the information inside root.properties leads the alignment, creating nonexistent files or removing files not in root (this is the default)
  • GENERATIONS: the content of the GDG folder leads the alignment, removing wrong keys from root.properties and adding new ones
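
The two alignment directions can be pictured as a set difference between the master source and the other source. The Java sketch below is a simplified model under that assumption, not the actual JEM logic; GdgRebuildPlan and align are hypothetical names, and generations are represented as plain strings.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model, not part of JEM: what a rebuild would add and remove.
final class GdgRebuildPlan {
    final Set<String> toAdd;     // entries known to the master but missing in the other source
    final Set<String> toRemove;  // entries present in the other source but unknown to the master

    private GdgRebuildPlan(Set<String> toAdd, Set<String> toRemove) {
        this.toAdd = toAdd;
        this.toRemove = toRemove;
    }

    /**
     * MASTER(ROOT): master = keys of root.properties, other = files in the folder.
     * MASTER(GENERATIONS): master = files in the folder, other = keys of root.properties.
     */
    static GdgRebuildPlan align(Set<String> master, Set<String> other) {
        Set<String> toAdd = new TreeSet<>(master);
        toAdd.removeAll(other);
        Set<String> toRemove = new TreeSet<>(other);
        toRemove.removeAll(master);
        return new GdgRebuildPlan(toAdd, toRemove);
    }

    public static void main(String[] args) {
        Set<String> root = new TreeSet<>(Arrays.asList("00001", "00002"));
        Set<String> folder = new TreeSet<>(Arrays.asList("00002", "00003"));
        GdgRebuildPlan plan = align(root, folder); // ROOT leads the alignment
        System.out.println("files to create: " + plan.toAdd);
        System.out.println("files to remove: " + plan.toRemove);
    }
}
```

Swapping the two arguments models MASTER(GENERATIONS): the result then lists the keys to add to and remove from root.properties.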

gdg-ddname is the name of the data description where the GDG name is indicated (with disposition OLD).

A complete sample follows:

<project name="REBUILDGDG" default="step1" basedir=".">
   <description>
   REBUILDGDG
   </description>
   
   <property name="jem.job.name" value="REBUILDGDG"></property>
   <property name="jem.job.environment" value="ENV1"></property>

   <taskdef name="gdg" classname="org.pepstock.jem.ant.GDGTask"/>

   <target name="step1">
      <gdg>
         <dataDescription name="GDG1" disposition="OLD">
            <dataSet  name="gdg/test1"></dataSet>
         </dataDescription>
         <dataDescription name="GDG2" disposition="OLD">
            <dataSet  name="gdg/test2"></dataSet>
         </dataDescription>
         <dataDescription name="GDG3" disposition="OLD">
            <dataSet  name="gdg/test3"></dataSet>
         </dataDescription>
         <dataDescription name="COMMAND" disposition="SHR">
            <dataSet >
               REBUILD GDG GDG1;
               REBUILD GDG GDG2 MASTER(ROOT);
               REBUILD GDG GDG3 MASTER(GENERATIONS);
            </dataSet>
         </dataDescription>
      </gdg>
   </target>
</project>

Cleaning GDGs

This command deletes the oldest files of a GDG. The command syntax is the following:

CLEANUP GDG [gdg-ddname] VERSIONS [number-versions] ;

number-versions is the number of generations that you want to keep in your folder.

For example, with the number 2 it removes the oldest generations, leaving only the 2 most recent files. gdg-ddname is the name of the data description where the GDG name is indicated (with disposition OLD).

A complete sample follows:
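
The effect of VERSIONS can be sketched as follows. This is a minimal model of the selection only, assuming the generations are listed from oldest to newest; GdgCleanupSketch and generationsToRemove are hypothetical names and nothing here touches the file system or root.properties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model, not part of JEM: which generations a cleanup would remove.
final class GdgCleanupSketch {

    /** Returns the oldest generations that exceed the number of versions to keep. */
    static List<String> generationsToRemove(List<String> oldestFirst, int versionsToKeep) {
        if (versionsToKeep < 0) {
            throw new IllegalArgumentException("versionsToKeep must not be negative");
        }
        int removeCount = Math.max(0, oldestFirst.size() - versionsToKeep);
        return new ArrayList<>(oldestFirst.subList(0, removeCount));
    }

    public static void main(String[] args) {
        List<String> generations = Arrays.asList("00001", "00002", "00003", "00004");
        // Models CLEANUP GDG GDG1 VERSIONS 2
        System.out.println("to remove: " + generationsToRemove(generations, 2));
    }
}
```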

<project name="CLEANUPGDG" default="step1" basedir=".">
   <description>
      CLEANUPGDG
   </description>
   
   <property name="jem.job.name" value="CLEANUPGDG"></property>
   <property name="jem.job.environment" value="ENV1"></property>

   <taskdef name="gdg" classname="org.pepstock.jem.ant.GDGTask"/>

   <target name="step1">
      <gdg>
         <dataDescription name="GDG1" disposition="OLD">
            <dataSet  name="gdg/test1"></dataSet>
         </dataDescription>
         <dataDescription name="COMMAND" disposition="SHR">
            <dataSet >
               CLEANUP GDG GDG1 VERSIONS 2;
            </dataSet>
         </dataDescription>
      </gdg>
   </target>
</project>

Deleting GDGs

This command deletes specific versions of a GDG. The command syntax is the following:

DELETE GDG [gdg-ddname] [generation] ;

generation is the generation that you want to delete from your folder.

For example, with the generation 00002 it removes generation 00002, both from the file system and from the root properties. gdg-ddname is the name of the data description where the GDG name is indicated (with disposition OLD).

A complete sample follows:

<project name="DELETEGDG" default="step1" basedir=".">
   <description>
      DELETEGDG
   </description>
   
   <property name="jem.job.name" value="DELETEGDG"></property>
   <property name="jem.job.environment" value="ENV1"></property>

   <taskdef name="gdg" classname="org.pepstock.jem.ant.GDGTask"/>

   <target name="step1">
      <gdg>
         <dataDescription name="GDG1" disposition="OLD">
            <dataSet  name="gdg/test1"></dataSet>
         </dataDescription>
         <dataDescription name="COMMAND" disposition="SHR">
            <dataSet >
               DELETE GDG GDG1 00002;
               DELETE GDG GDG1 00003;
            </dataSet>
         </dataDescription>
      </gdg>
   </target>
</project>

Renaming GDGs

This command renames the file related to a specific version of a GDG. The command syntax is the following:

RENAME GDG [gdg-ddname] [generation] TO [new-file-name] ;

generation is the generation whose file you want to rename in your folder.

new-file-name is the relative name of the file related to the generation.

For example, with the generation 00002 and the new file name 00002.V1 it renames the file currently assigned to generation 00002 to 00002.V1, both in the file system and in the root properties. gdg-ddname is the name of the data description where the GDG name is indicated (with disposition OLD).

A complete sample follows:

<project name="GDG-RENAME" default="step1" basedir=".">
   <description>
      RENAME GDGs
   </description>

   <property name="jem.job.name" value="GDG-RENAME"/>
   <property name="jem.job.environment" value="ENV1"></property>

   <taskdef name="gdg" classname="org.pepstock.jem.ant.GDGTask"/>

   <target name="step1">
      <gdg>
         <dataDescription name="GDG1" disposition="OLD">
            <dataSet  name="gdg/test1"></dataSet>
         </dataDescription>
         <dataDescription name="COMMAND" disposition="SHR">
            <dataSet >
               RENAME GDG GDG1 00002 TO 00002.V1;
            </dataSet>
         </dataDescription>
      </gdg>
   </target>
</project>

Nodes Utility

With the nodes utility you can start and drain nodes in batch, using filter attributes to act on a group of nodes.

The command syntax is the following:

START  [node-ip-address]                                  ;
       name:[node-ip-address]
       hostname:[node-hostname]
       domain:[node-domain-name]
       staticaffinities:[node-static-affinities-tags]
       dynamicaffinities:[node-dynamic-affinities-tags]
       status:[node-status]
       os:[node-os]

DRAIN  [node-ip-address]                                  ;
       name:[node-ip-address]
       hostname:[node-hostname]
       domain:[node-domain-name]
       staticaffinities:[node-static-affinities-tags]
       dynamicaffinities:[node-dynamic-affinities-tags]
       status:[node-status]
       os:[node-os]

All filter arguments can be used together, separated by blanks. The task parses a list of commands (separated by semicolons ;) found inside a data description called COMMAND (be careful: the name is case sensitive).
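
To make the blank-separated filter format concrete, here is a small Java sketch that turns the filter part of a command into key/value pairs. It is not the JEM parser: NodeFilterSketch is a hypothetical name, and treating a token without a key (for instance *) as a name filter is an assumption based on the syntax above, where both forms take the node IP address.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of JEM: parses "key:value" filter tokens.
final class NodeFilterSketch {

    /** Parses blank-separated tokens; a token without ':' is stored under "name" (assumption). */
    static Map<String, String> parse(String filterText) {
        Map<String, String> filters = new LinkedHashMap<>();
        for (String token : filterText.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            int colon = token.indexOf(':');
            if (colon < 0) {
                filters.put("name", token);
            } else {
                filters.put(token.substring(0, colon), token.substring(colon + 1));
            }
        }
        return filters;
    }

    public static void main(String[] args) {
        System.out.println(parse("name:10. domain:testDomain"));
        System.out.println(parse("*"));
    }
}
```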

Here is a JCL sample:

<project name="NODES" default="step1" basedir=".">
   <description>
   NODES action
   </description>

   <property name="jem.job.name" value="NODES"/>
   <property name="jem.job.environment" value="ENV-1"/>

   <taskdef name="nodes" classname="org.pepstock.jem.ant.NodesTask" />

   <target name="step1">
      <nodes>
         <dataDescription name="COMMAND" disposition="SHR">
            <dataSet>
               DRAIN *;
               START *;
               DRAIN name:10.;
               DRAIN name:10. domain:testDomain ;
            </dataSet>
         </dataDescription>
      </nodes>
   </target>
</project>

Roles Utility

This utility manages all security entities (roles, permissions and users) in batch, and it's very helpful when massive definitions must be done.

In detail, the utility is able:

  • to create and delete roles
  • to grant and revoke permissions to roles
  • to add users to roles and delete users from roles

The task parses a list of commands (separated by semicolons ;) found inside a data description called COMMAND (be careful: the name is case sensitive).

Create and Remove roles

It's possible to create and remove roles by their names (comma-separated). The command syntax is the following:

CREATE [role-name] ;
REMOVE [role-name] ;

If you try to create an existing role or to remove a nonexistent one, an exception occurs.

Grant and Revoke permissions to roles

It's possible to grant and revoke permissions, by their name or wildcard (comma-separated), to roles, by their name (comma-separated). The command syntax is the following:

GRANT  [permission-name] TO [role-name] ;
REVOKE [permission-name] TO [role-name] ;

If you try to grant or revoke an invalid permission or if roles don't exist, an exception occurs.

Add and Delete users to roles

It's possible to add users to roles and delete users from roles. Users are indicated by their name or wildcard (comma-separated), roles by their name (comma-separated). The command syntax is the following:

ADD    [user-name] TO   [role-name] ;
DELETE [user-name] FROM [role-name] ;

If you try to add a user to or delete a user from a nonexistent role, an exception occurs.

Here is the complete JCL sample:

<project name="ROLES" default="step1" basedir=".">
   <property name="jem.job.name" value="ROLES"/>
   <property name="jem.job.environment" value="ENV-1"/>

   <taskdef name="roles" classname="org.pepstock.jem.ant.RolesTask" />
   
   <target name="step1">
      <roles>
         <dataDescription name="COMMAND" disposition="SHR">
            <dataSet>
               CREATE Role;
               GRANT jobs:* TO Role;
               REVOKE jobs:* TO Role;
               ADD usr1, usr2 TO Role;
               DELETE usr2 FROM Role;
               REMOVE Role;
            </dataSet>
         </dataDescription>
      </roles>
   </target>
</project>

Common-resource Utility

This utility manages the common resources which are used as data sources in JCL for Java programs. Common resources are defined and reported in XML, as follows:

<resources>
   <resource name="FTPlocalhost" type="ftp">
      <property name="url" visible="true" override="false">
         ftp://localhost:2121
      </property>
      <property name="username" visible="true" override="false">
         admin
      </property>
      <property name="password" visible="false" override="false" hash="8c6976e5b5">
         51898c5cb75e3
      </property>
   </resource>
</resources>

The <resources> element is the optional root of the common resource definitions. It can contain 1 or more <resource> elements. If it is missing, only one resource is defined or reported and the root element is simply <resource>. The <resource> element has 2 mandatory attributes:

  • name: name of the common resource, used inside the JCL to reference the data source
  • type: type of resource; a JNDI reference and a resource factory must be implemented for each type

The <resource> element can have 1 or more <property> elements, needed to create the data source at run-time. The <property> element can have 4 attributes:

  • name: mandatory name of the property.
  • visible: optional boolean attribute which defines whether the value is readable or must be encrypted and decrypted. For a non-visible property, you need to access the web application to obtain the encrypted value and the hash used to check that the value is correct. Default is true.
  • override: optional boolean attribute which defines whether the property can be overridden at runtime inside the JCL (for instance the FTP binary property). Default is true.
  • hash: optional attribute, used when you set or get a property with the visible attribute set to false. Both the encrypted value and the hash must be taken from the web application.

In detail, the utility is able:

  • to get one or more common resource definitions, saving them to a data set
  • to remove a common resource by name
  • to add and update a common resource

The task parses a list of commands (separated by semicolons ;) found inside a data description called COMMAND (be careful: the name is case sensitive).

Getting common-resources

It's possible to get a common resource definition by its name (an exception occurs if the common resource doesn't exist). The command syntax is the following:

GET [resource-name]                                ;
                     FILE [ddname]
                                    NOENCRYPTION

By default (if the FILE argument is missing), the common resource definition is written to the OUTPUT data description. If FILE is present, the definition is written to the data description with that name. If the NOENCRYPTION argument is present, the properties defined with visible=false are written without any encryption.

Here is a JCL sample:

<project name="RES" default="step1" basedir=".">
   <property name="jem.job.name" value="RES"/>
   <property name="jem.job.environment" value="ENV-1"/>

   <taskdef name="resources" classname="org.pepstock.jem.ant.CommonResourcesTask"/>

   <target name="step1">
      <resources>
         <dataDescription name="OUTPUT" sysout="true" disposition="MOD"/>
         <dataDescription name="my" sysout="true" disposition="MOD"/>
         <dataDescription name="COMMAND" disposition="SHR">
            <dataSet>
               GET FTPlocalhost;
               GET FTPlocalhost FILE OUTPUT;
               GET FTPlocalhost FILE OUTPUT NOENCRYPTION;
               GET FTPlocalhost FILE my;
               GET FTPlocalhost FILE my NOENCRYPTION;
            </dataSet>
         </dataDescription>
      </resources>
   </target>
</project>

It's possible to get all resource definitions, or a subset of them, filtering by resource name and type. The command syntax is the following:

GETLIST [resource-name]                                           ;
        name:[resource-name]
        type:[resource-type]
                                  FILE [ddname]
                                                 NOENCRYPTION

All filter arguments can be used together, separated by blanks. By default (if the FILE argument is missing), the common resource definitions are written to the OUTPUT data description. If FILE is present, the definitions are written to the data description with that name. If the NOENCRYPTION argument is present, the properties defined with visible=false are written without any encryption.

Here is a JCL sample:

<project name="RES" default="step1" basedir=".">
   <property name="jem.job.name" value="RES"/>
   <property name="jem.job.environment" value="ENV-1"/>

   <taskdef name="resources" classname="org.pepstock.jem.ant.CommonResourcesTask"/>

   <target name="step1">
      <resources>
         <dataDescription name="OUTPUT" sysout="true" disposition="MOD"/>
         <dataDescription name="my" sysout="true" disposition="MOD"/>
         <dataDescription name="COMMAND" disposition="SHR">
            <dataSet>
               GETLIST *;
               GETLIST * FILE OUTPUT;
               GETLIST * FILE OUTPUT NOENCRYPTION;
               GETLIST * FILE my;
               GETLIST * FILE my NOENCRYPTION;
               GETLIST name:F;
               GETLIST name:F FILE OUTPUT;
               GETLIST name:F FILE OUTPUT NOENCRYPTION;
               GETLIST name:F FILE my;
               GETLIST name:F FILE my NOENCRYPTION;
               GETLIST type:ftp name:F;
               GETLIST type:ftp name:F FILE OUTPUT;
               GETLIST type:ftp name:F FILE OUTPUT NOENCRYPTION;
               GETLIST type:ftp name:F FILE my;
               GETLIST type:ftp name:F FILE my NOENCRYPTION;
               GETLIST type:ftp;
               GETLIST type:ftp FILE OUTPUT;
               GETLIST type:ftp FILE OUTPUT NOENCRYPTION;
               GETLIST type:ftp FILE my;
               GETLIST type:ftp FILE my NOENCRYPTION;
            </dataSet>
         </dataDescription>
      </resources>
   </target>
</project>

Remove common-resource

It's possible to remove a common resource definition by its name (an exception occurs if the common resource doesn't exist). The command syntax is the following:

REMOVE [resource-name] ;

Here is a JCL sample:

<project name="RES" default="step1" basedir=".">
   <property name="jem.job.name" value="RES"/>
   <property name="jem.job.environment" value="ENV-1"/>

   <taskdef name="resources" classname="org.pepstock.jem.ant.CommonResourcesTask"/>

   <target name="step1">
      <resources>
         <dataDescription name="COMMAND" disposition="SHR">
            <dataSet>
               REMOVE FTPlocalhost;
            </dataSet>
         </dataDescription>
      </resources>
   </target>
</project>

Setting common-resources

It's possible to add common resource definitions (if a common resource already exists, it is updated). The command syntax is the following:

SET                                 ;
    SOURCE [ddname]
                    NOENCRYPTION

By default (if the SOURCE argument is missing), the common resource definitions must be present in the INPUT data description. If SOURCE is present, the definitions must be located in the data description with that name. If the NOENCRYPTION argument is present, the properties defined with visible=false are read without any decryption.

Here is a JCL sample:

<project name="RES" default="step1" basedir=".">
   <property name="jem.job.name" value="RES"/>
   <property name="jem.job.environment" value="ENV-1"/>

   <taskdef name="resources" classname="org.pepstock.jem.ant.CommonResourcesTask"/>

   <target name="step1">
      <resources>
         <dataDescription name="myNoEnc" disposition="SHR">
            <dataSet >
               <![CDATA[
               <resources>
                  <resource name="FTPlocalhost" type="ftp">
                     <property name="url" visible="true" override="false">
                        ftp://localhost:2121
                     </property>
                     <property name="username" visible="true" override="false">
                        admin
                     </property>
                     <property name="password" visible="false" override="false">
                        admin
                     </property>
                  </resource>
               </resources>
               ]]>
            </dataSet>
         </dataDescription>

         <dataDescription name="my" disposition="SHR">
            <dataSet >
               <![CDATA[
               <resources>
                  <resource name="FTPlocalhost" type="ftp">
                     <property name="url" visible="true" override="false">
                        ftp://localhost:2121
                     </property>
                     <property name="username" visible="true" override="false">
                        admin
                     </property>
                     <property name="password" visible="false" override="false" hash="8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918">
                        51898c5cb75e3d81a9736567637d9b6ef6942e492987e7d5801c9893b0e40c8a
                     </property>
                  </resource>
               </resources>
               ]]>
            </dataSet>
         </dataDescription>
         <dataDescription name="INPUT" disposition="SHR">
            <dataSet >
               <![CDATA[
               <resources>
                  <resource name="FTPlocalhost" type="ftp">
                     <property name="url" visible="true" override="false">
                        ftp://localhost:2121
                     </property>
                     <property name="username" visible="true" override="false">
                        admin
                     </property>
                     <property name="password" visible="false" override="false" hash="8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918">
                        51898c5cb75e3d81a9736567637d9b6ef6942e492987e7d5801c9893b0e40c8a
                     </property>
                  </resource>
               </resources>
               ]]>
            </dataSet>
         </dataDescription>
         <dataDescription name="COMMAND" disposition="SHR">
            <dataSet>
               SET ;
               SET SOURCE my;
               SET SOURCE myNoEnc NOENCRYPTION;
            </dataSet>
         </dataDescription>
      </resources>
   </target>
</project>
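
A side note on the hash attribute used in the sample above: the value 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918 is the SHA-256 digest of the string admin, which is the clear password used in the myNoEnc data description. This suggests, without confirming it, that the hash is a SHA-256 digest of the clear value. The Java sketch below only reproduces that digest with the standard library; Sha256Hex is a hypothetical class and the values to use in a real definition should still be taken from the web application.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of JEM: hex-encoded SHA-256 of a string.
final class Sha256Hex {

    static String sha256Hex(String clearText) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(clearText.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                hex.append(String.format("%02x", b)); // two lowercase hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is not expected
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("admin"));
    }
}
```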

Statistics Utility

This utility extracts the statistics that JEM stores in the file system (a new file every day), so that they can be stored in a data warehouse infrastructure for data analysis and for performance and capacity management purposes.

JEM is not designed to maintain this data or to do this kind of work, so an external infrastructure must be in place.

The purpose of this utility is to read the XML files (not well-formed, because there is no root element), create Java objects and call a transformer and loader class, passing each object to it. The transformer and loader contains all the logic to manage the data. The Java object read by the utility is org.pepstock.jem.node.stats.Sample; using its public methods, you can access the following data:

| Field | Type | Scope | Description |
| --- | --- | --- | --- |
| Key | String | General | Unique key of sample, built on timestamp. A sample is taken EVERY minute. Format is "yyyy-MM-dd HH:mm" |
| Date | String | General | Date when sample has been created. Format is "yyyy-MM-dd". |
| Time | String | General | Time when sample has been created. Format is "HH:mm". |
| Environment | String | General | JEM environment where the node is in |
| Member-Key | String | Node | Unique key for JEM node. This is a UUID representation. |
| Member-Label | String | Node | Name of JEM node, composed by ipaddress and listening port. |
| Pid (process id) | Long | Node | Process ID (based on system) of JEM node |
| Number-Of-JCL-Check | Long | Node | Number of JCL taken by JEM node for JCL checking, during last minute |
| Number-Of-JOB-Submitted | Long | Node | Number of JOB submitted and executed by JEM node, during last minute |
| Total-Number-Of-JCLCheck | Long | Node | Total amount of JCL taken by JEM node for JCL checking, since JEM node start |
| Total-Number-Of-JOBSubmitted | Long | Node | Total amount of JOB submitted and executed by JEM node, since JEM node start |
| CPU User | Long | Machine | Total amount of milliseconds of CPU used for USER purposes since machine starting |
| CPU System | Long | Machine | Total amount of milliseconds of CPU used for SYSTEM purposes since machine starting |
| Idle | Long | Machine | Total amount of milliseconds spent by machine in IDLE since machine starting |
| CPU Total | Long | Machine | Total amount of milliseconds of CPU used since machine starting |
| Memory Available | Long | Machine | Total amount of memory (in byte) available |
| Memory Used | Long | Machine | Total amount of memory (in byte) used |
| Memory Free | Long | Machine | Total amount of memory (in byte) free |
| CPU User | Long | Node | Total amount of milliseconds of CPU used for USER purposes by JEM node since node starting |
| CPU System | Long | Node | Total amount of milliseconds of CPU used for SYSTEM purposes by JEM node since node starting |
| CPU Total | Long | Node | Total amount of milliseconds of CPU used since machine starting |
| CPU % Total | Double | Node | Percentage of CPU used by JEM node since last minute |
| Memory Used | Long | Node | Total amount of memory (in byte) used |
| Name | String | HazelcastMap | Name of Hazelcast map |
| Owned Entry Count | Long | HazelcastMap | Number of map entries owned by JEM node |
| Backup Entry Count | Long | HazelcastMap | Number of map entries of backup hold by JEM node |
| Owned Entry Memory Cost | Long | HazelcastMap | Amount of memory cost (number of bytes) of owned entries in JEM node |
| Backup Entry Memory Cost | Long | HazelcastMap | Amount of memory cost (number of bytes) of backup entries in JEM node |
| Locked Entry Count | Long | HazelcastMap | Number of currently locked locally owned keys |
| Lock Wait Count | Long | HazelcastMap | Number of cluster-wide threads waiting to acquire locks for the locally owned keys |
| Dirty Entry Count | Long | HazelcastMap | Number of entries that JEM node owns and are dirty (updated but not persisted yet) |
| Hits | Long | HazelcastMap | Number of hits (reads) of the locally owned entries |
| Number Of Gets | Long | HazelcastMap | Number of get operations of JEM node |
| Number Of Puts | Long | HazelcastMap | Number of put operations of JEM node |
| Number Of Removes | Long | HazelcastMap | Number of remove operations of JEM node |
| Number Of Events | Long | HazelcastMap | Number of events received in JEM node |
| Number Of Other Operations | Long | HazelcastMap | Total number of other operations in JEM node |
| Total Get Latency | Long | HazelcastMap | Total latency (in milliseconds) of get operations in last minute |
| Total Put Latency | Long | HazelcastMap | Total latency (in milliseconds) of put operations in last minute |
| Total Remove Latency | Long | HazelcastMap | Total latency (in milliseconds) of remove operations in last minute |

When the utility has read a sample object, it calls a class which must implement org.pepstock.jem.node.stats.TransformAndLoader. This interface is called when a statistics file is opened and when it is closed, when a sample is created correctly, and when an error occurs while reading a record to create a sample.

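/**
 * Called when a statistics file is opened.
 */
public void fileStarted(File file) throws Exception;
/**
 * Called when a sample has been created correctly.
 */
public void loadSuccess(Sample sample) throws Exception;
/**
 * Called when an error occurs reading a record to create a sample.
 */
public void loadFailed(String record, int line, Exception exception) throws Exception;
/**
 * Called when a statistics file is closed.
 */
public void fileEnded(File file) throws Exception;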
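
As an illustration of how these four callbacks fit together, here is a minimal Java sketch of a loader that only counts samples and failures per file. To keep it self-contained, Sample and TransformAndLoader are declared as local stand-ins that mirror the signatures above; they are not the real org.pepstock.jem.node.stats types, and CountingLoader is a hypothetical name. A real implementation would implement the JEM interface and send each sample to the data warehouse instead of counting.

```java
import java.io.File;

// Self-contained sketch: the nested types are stand-ins, not the real JEM classes.
final class CountingLoaderSketch {

    /** Stand-in for org.pepstock.jem.node.stats.Sample (no fields modeled). */
    static final class Sample { }

    /** Stand-in mirroring the four callbacks listed above. */
    interface TransformAndLoader {
        void fileStarted(File file) throws Exception;
        void loadSuccess(Sample sample) throws Exception;
        void loadFailed(String record, int line, Exception exception) throws Exception;
        void fileEnded(File file) throws Exception;
    }

    /** Counts good and bad records of each statistics file. */
    static final class CountingLoader implements TransformAndLoader {
        private int loaded;
        private int failed;

        @Override
        public void fileStarted(File file) {
            loaded = 0;
            failed = 0;
        }

        @Override
        public void loadSuccess(Sample sample) {
            loaded++; // a real loader would transform and store the sample here
        }

        @Override
        public void loadFailed(String record, int line, Exception exception) {
            failed++;
        }

        @Override
        public void fileEnded(File file) {
            System.out.println(file.getName() + ": " + loaded + " samples, " + failed + " failures");
        }

        int getLoaded() { return loaded; }
        int getFailed() { return failed; }
    }

    public static void main(String[] args) {
        CountingLoader loader = new CountingLoader();
        File file = new File("stats-sample.xml"); // hypothetical file name
        loader.fileStarted(file);
        loader.loadSuccess(new Sample());
        loader.loadFailed("<broken", 2, new IllegalStateException("unparsable record"));
        loader.fileEnded(file);
    }
}
```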

The utility accepts 3 arguments, using the arg element:

  • -days n: because JEM writes a file per day, this argument selects the files created on the date obtained by applying the offset n, in days, to the current date.
    • Example: -days -1 is yesterday
  • -date yyyyMMdd: because JEM writes a file per day, this argument selects the files created on the date passed as argument. -days and -date can't be used together.
    • Example: -date 20030101 is January 1st, 2003.
  • -class [full-class-name]: to manage the stored samples, it's necessary to implement the transformer and loader interface and put the name of the class in this argument. It's a mandatory argument, so an exception occurs if it is missing.
    • Example: -class org.pepstock.jem.node.stats.DefaultTransformAndLoader is the default, out-of-the-box loader class, which displays all samples and removes each file it has read when the file is closed, to clean up the folder where JEM nodes store statistics.
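
The way -days and -date identify a day can be sketched with java.time. This is an illustration of the arithmetic only, not the code of the task; StatsDateArgs and its methods are hypothetical names, and the current date is passed in explicitly so the example does not depend on the day it runs.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of JEM: resolves the day selected by -days or -date.
final class StatsDateArgs {

    /** -days n: applies the (usually negative) offset to the current date. */
    static LocalDate fromDays(LocalDate today, int days) {
        return today.plusDays(days);
    }

    /** -date yyyyMMdd: parses the date passed as argument. */
    static LocalDate fromDate(String value) {
        return LocalDate.parse(value, DateTimeFormatter.BASIC_ISO_DATE);
    }

    public static void main(String[] args) {
        System.out.println("-days -1       -> " + fromDays(LocalDate.now(), -1));
        System.out.println("-date 20030101 -> " + fromDate("20030101"));
    }
}
```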

Here is a JCL sample:

<project name="STATS" default="step1" basedir=".">
   <property name="jem.job.name" value="STATS"/>
   <property name="jem.job.environment" value="ENV-1"/>

   <taskdef name="stats" classname="org.pepstock.jem.ant.StatsCollectTask" />

   <target name="step1">
      <stats>
         <arg value="-days"/>
         <arg value="-1"/>
         <arg value="-class"/>
         <arg value="org.pepstock.jem.node.stats.DefaultTransformAndLoader"/>
      </stats>
   </target>
</project>

Job Archive Utility

Archiving means removing jobs from the output queue and passing them to another class that contains the logic to store them somewhere. There are 2 ways to do it:

  1. By submission of a JCL
  2. By Job lifecycle listener implementation

JCL job archive

This utility removes jobs from the output queue, compresses (in zip format) all the output files they produced and passes the result to an interface implementation.

JEM is not designed to maintain this data or to do this kind of work, so an external infrastructure must be in place.

When the utility has the collection of jobs, it calls a class which must implement org.pepstock.jem.node.archive.JobOutputArchive. This interface is called for every job object extracted from the output queue.

public interface JobOutputArchive {
	
	/**
	 * Called for every job ready to be archived
	 * @param job job instance
	 * @param zipOutputContent array of byte 
	 * @return true if the job must be remove from output queue
	 * @throws Exception if any error occurs
	 */
	public boolean archive(Job job, byte[] zipOutputContent) throws Exception;

}
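
A possible implementation of this contract is sketched below: it writes the zipped output to a directory and returns true so that the job is removed from the output queue. Job and JobOutputArchive are declared here as local stand-ins so the example compiles on its own; the getId accessor, the FileSystemArchive class and its constructor are assumptions for illustration, not the real JEM API. A class loaded by name through the -class argument would most likely need a no-argument constructor.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Self-contained sketch: the nested types are stand-ins, not the real JEM classes.
final class FileSystemArchiveSketch {

    /** Stand-in for the JEM job object; only a hypothetical id accessor is modeled. */
    static final class Job {
        private final String id;

        Job(String id) {
            this.id = id;
        }

        String getId() {
            return id;
        }
    }

    /** Stand-in mirroring the interface shown above. */
    interface JobOutputArchive {
        boolean archive(Job job, byte[] zipOutputContent) throws Exception;
    }

    /** Stores every zipped job output as <id>.zip inside a target directory. */
    static final class FileSystemArchive implements JobOutputArchive {
        private final Path targetDir;

        FileSystemArchive(Path targetDir) {
            this.targetDir = targetDir;
        }

        @Override
        public boolean archive(Job job, byte[] zipOutputContent) throws Exception {
            Files.createDirectories(targetDir);
            Files.write(targetDir.resolve(job.getId() + ".zip"), zipOutputContent);
            return true; // the job can be removed from the output queue
        }
    }
}
```

Returning false instead would leave the job in the output queue, which can be useful when the storage step fails in a recoverable way.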

The utility accepts one argument, using the arg element:

  • -class [full-class-name]: to manage the jobs to be archived, it's necessary to put the name of the class in this argument. It's a mandatory argument, so an exception occurs if it is missing.
    • Example: -class org.pepstock.jem.node.archive.DefaultJobOutputArchive is the default, out-of-the-box archive class, which does nothing.

The utility needs a COMMAND data description with one or more SQL where-condition statements that filter the jobs to extract from the output queue. The query language is explained in the Hazelcast Map query documentation.

Here is the list of fields that you can use to filter:

| Field | Type | Description |
| --- | --- | --- |
| id | String | Unique identification ID of job |
| name | String | Job name |
| user | String | User who submitted the job |
| orgUnit | String | Organizational unit of user who submitted the job |
| submittedTime | Date | Timestamp when the job has been submitted |
| startedTime | Date | Timestamp when the job is started |
| endedTime | Date | Timestamp when the job is ended |
| jcl.type | String | JCL type: "ant" or "sb" |
| jcl.content | String | JCL content |
| jcl.jobName | String | Job name |
| jcl.environment | String | JEM environment where job has been executed |
| jcl.domain | String | JEM domain where job has been executed |
| jcl.affinity | String | Affinities used to execute job |
| jcl.memory | int | Amount of memory required by JCL |
| jcl.priority | int | Priority on queues required by JCL |
| jcl.hold | boolean | if job is in HOLD status |
| jcl.emailNotificationAddresses | String | List of mail addresses used to notify end of job |
| jcl.user | String | User indicated in JCL, surrogated on job user |
| memberId | String | Unique ID of JEM node |
| memberLabel | String | IP address and port of JEM node |
| processId | String | Process ID of job, assigned from OS during the execution |
| result.returnCode | int | Maximum return code of job |
| result.exceptionMessage | String | Exceptions in case of error |

Here is a JCL sample:

<project name="JobOutputArchive" default="step1" basedir=".">
    <description>
        Clean up JOB
    </description>

    <property name="jem.job.name" value="JobOutputArchive"/>
    <property name="jem.job.environment" value="ENV-1"/>
    <property name="jem.job.affinity" value="linux"/>
    
    <taskdef name="archive" classname="org.pepstock.jem.ant.JobOutputArchiveTask" />

    <target name="step1">
        <archive>
            <arg value="-class"/>
            <arg value="org.pepstock.jem.node.archive.DefaultJobOutputArchive"/>

            <dataDescription name="COMMAND" disposition="SHR">
                <dataSet>
                    <![CDATA[

                    name LIKE 'ICEGENE%' ;
                    name LIKE 'J%' and user = 'root' ;

                    ]]>
                </dataSet>
            </dataDescription>
        </archive>
    </target>

</project>

JOB life cycle listener

A job life cycle listener can listen to the status changes of a job. org.pepstock.jem.listeners.JobOutputArchiveListener implements the method called when the job has ended. It must be configured in the JEM node XML configuration file, as follows:

    <listeners>
        <listener className="org.pepstock.jem.listeners.JobOutputArchiveListener">
            <properties>
                <property name="class" value="org.pepstock.jem.node.archive.DefaultJobOutputArchive" />
            </properties>
        </listener>
    </listeners>

Only the class property must be set, indicating the full name of the class that manages the jobs to be archived. It is a mandatory property: the listener does nothing if it is missing.
