Control Scripts
The control scripts provide the glue between the CSD and the underlying service binaries, whether distributed via parcels or via some other mechanism. Control scripts must be placed in the `scripts/` directory. Cloudera Manager can execute control scripts when:
- launching a role process. See Start Runner.
- executing a role command. See Commands.
- deploying client configuration. See Client Configuration.
When a command is issued in Cloudera Manager, the configuration data is zipped and sent down to the agent as part of the heartbeat. The agent then creates a new process directory for that command under `/var/run/cloudera-scm-agent/process/` and unpacks the configuration data there.
The configuration data will contain:
- the entire `scripts/` directory. See scripts.
- the entire `aux/` directory. See aux.
- any configuration files that were generated by Config Writers.
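Putting this together, the unpacked process directory might look like the following (a hypothetical layout; the exact file names depend on the CSD):

```
/var/run/cloudera-scm-agent/process/<id>-<service>-<ROLE>/
├── scripts/     # copied from the CSD
├── aux/         # copied from the CSD
├── logs/        # stdout/stderr captured by the agent
└── *.conf       # files emitted by Config Writers
```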
At this point the agent executes the script and captures its stderr and stdout in two log files, `logs/stderr.log` and `logs/stdout.log` respectively.
It is very important that the last line of the control script be a call to exec and that the process does not run in the background. The agent uses supervisord to control its processes, and the service binary must be rooted under supervisord's process tree. The process also cannot be a daemon, i.e. it must not fork and run in the background; it should stay in the foreground so supervisord can manage it.
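The foreground requirement comes down to process identity: exec replaces the shell rather than forking a child, so the service keeps the PID that supervisord is watching. A quick standalone sketch (plain sh, nothing Cloudera-specific) shows the PID surviving exec:

```shell
# exec replaces the current shell process instead of spawning a child,
# so the exec'd program inherits the shell's PID.
# The two PIDs printed by this pipeline are identical.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
first=$(echo "$pids" | head -n 1)
second=$(echo "$pids" | tail -n 1)
if [ "$first" = "$second" ]; then
  echo "exec kept PID $first"
else
  echo "PIDs differ: $first vs $second"
fi
```

A backgrounded command (`my-service &`) would instead create a child with a new PID, leaving supervisord supervising a shell that exits immediately.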
The ECHO_WEBSERVER startRunner:
{
"startRunner" : {
"program" : "scripts/control.sh",
"args" : [ "start" ],
"environmentVariables" : {
"WEBSERVER_PORT" : "9797"
}
}
}
The control script:
CMD=$1
case $CMD in
(start)
echo "Starting Server on port $WEBSERVER_PORT"
exec python -m SimpleHTTPServer $WEBSERVER_PORT
;;
(*)
echo "Don't understand [$CMD]"
;;
esac
When the role is started on the agent the following directory is created: /var/run/cloudera-scm-agent/process/121-echo-ECHO_WEBSERVER/
(121 is just a unique CM process id). The agent will then execute:
WEBSERVER_PORT=9797 scripts/control.sh start
In addition to the environment variables provided by the start runner, the Cloudera Manager agent also sets some special variables.
| Variable | Description | Example |
|---|---|---|
| CONF_DIR | The current agent process directory. | /var/run/cloudera-scm-agent/process/121-echo-ECHO_WEBSERVER/ |
| JAVA_HOME | The location of the Java binaries. | /usr/java/jdk1.7.0_25-cloudera |
| CDH_VERSION | The version of CDH used in the cluster. | 5 |
| COMMON_SCRIPT | Script containing functions that are useful while running commands or starting processes. | /usr/lib64/cmf/service/common/cloudera-config.sh |
| ZK_PRINCIPAL_NAME | The Kerberos principal name (i.e., short name) for ZooKeeper. Only defined when a non-default principal is configured for ZooKeeper. | zk_custom_principal |
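A control script can read these variables directly. The sketch below uses the names from the table; the fallback defaults are illustrative only, since in a real run the agent exports these before invoking the script:

```shell
# Sketch: consuming agent-provided variables inside a control script.
# CONF_DIR and CDH_VERSION are set by the agent; the :- fallbacks here
# exist only so the sketch runs standalone.
CONF_DIR=${CONF_DIR:-$(pwd)}     # process directory with the unpacked config
CDH_VERSION=${CDH_VERSION:-5}    # major CDH version of the cluster

echo "process dir: $CONF_DIR"
echo "targeting CDH $CDH_VERSION"
```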
For most services, a parcel will also exist along with the CSD. The parcel will contain the actual service bits. Control scripts will need to know where the relevant parcel files and binaries are to start the service or run commands. CSDs and parcels are linked via the provides tags. The CSD parcel structure declares what tags the service is interested in. Parcels declare what tags they provide.
When the agent executes a control script, it will first look to see what tags the associated service declared are of interest. The agent will then source the environment scripts of all the parcels that have matching tags. After the environment scripts have been sourced, the control script is run. The defined environment variables from the parcels can be used by the control scripts to discover the location of the service bits.
Note: there must be exactly one parcel that provides the "required" tag - otherwise the command will fail. The "optional" tags can exist in any number of parcels - this is how plugins are supported.
Let's take the Spark service as an example. Only the relevant sections of the files are shown for simplicity.
The meta/parcel.json:
{
"scripts": {
"defines": "spark_env.sh"
},
"provides" : [
"spark"
]
}
The meta/spark_env.sh:
SPARK_DIRNAME=${PARCEL_DIRNAME:-"SPARK-0.9.0-1.cdh4.6.0.p0.47"}
export CDH_SPARK_HOME=$PARCELS_ROOT/$SPARK_DIRNAME/lib/spark
The Spark descriptor/service.sdl:
{
"parcel" : {
"requiredTags" : [ "spark" ],
"optionalTags" : [ "spark-plugin" ]
}
}
The Spark scripts/control.sh:
DEFAULT_SPARK_HOME=/usr/lib/spark
export SPARK_HOME=${SPARK_HOME:-$CDH_SPARK_HOME}
export SPARK_HOME=${SPARK_HOME:-$DEFAULT_SPARK_HOME}
...
exec "$SPARK_HOME/bin/spark-class" "${ARGS[@]}"
...
The following steps occur when the control script is executed by the agent:
- The agent reads the parcel tags from the Spark CSD.
- "spark" is a required tag.
- "spark-plugin" is an optional tag.
- The agent scans all the activated parcels and finds the Spark parcel is providing the "spark" tag.
- No other parcels are providing the "spark" tag or the "spark-plugin" tag.
- The meta/spark_env.sh script is sourced by the agent.
- This sets CDH_SPARK_HOME=/opt/cloudera/parcels/SPARK-0.9.0-1.cdh4.6.0.p0.47/lib/spark.
- The agent executes scripts/control.sh.
- The control script checks whether CDH_SPARK_HOME is set. If so, it sets SPARK_HOME to it.
- CDH_SPARK_HOME might not be set if the Spark packages are used instead of parcels. In that case SPARK_HOME is set to the default package location of /usr/lib/spark.
- The control script then uses SPARK_HOME to exec the spark-class binary.
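The resolution chain in these steps can be reproduced standalone. This is a sketch, not agent code; the parcel directory name comes from the spark_env.sh example and /usr/lib/spark is the package default from control.sh:

```shell
# Standalone sketch of the SPARK_HOME resolution order described above.
unset SPARK_HOME

# 1. The agent sources meta/spark_env.sh for the matching parcel:
PARCELS_ROOT=${PARCELS_ROOT:-/opt/cloudera/parcels}
SPARK_DIRNAME=${PARCEL_DIRNAME:-"SPARK-0.9.0-1.cdh4.6.0.p0.47"}
export CDH_SPARK_HOME=$PARCELS_ROOT/$SPARK_DIRNAME/lib/spark

# 2. The control script prefers the parcel location, then the package default:
DEFAULT_SPARK_HOME=/usr/lib/spark
export SPARK_HOME=${SPARK_HOME:-$CDH_SPARK_HOME}
export SPARK_HOME=${SPARK_HOME:-$DEFAULT_SPARK_HOME}

echo "SPARK_HOME=$SPARK_HOME"
```

With no parcel activated, CDH_SPARK_HOME would be empty and the second expansion would fall through to /usr/lib/spark.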
On clusters using Kerberos for authentication, it is necessary to acquire a Kerberos ticket before running any commands that require interaction with a secure server. If a role specifies Kerberos principals in its descriptor, those principals are added to the role's keytab file and to the configuration directory. Every principal is also added to the role's environment as a variable whose name is the name of the Kerberos principal and whose value is the principal itself. To use these principals, source $COMMON_SCRIPT in your script and call the function acquire_kerberos_tgt. Below is an example:
# Source the common script to use acquire_kerberos_tgt
. "$COMMON_SCRIPT"
# acquire_kerberos_tgt expects that the principal to be kinited is referred to by
# SCM_KERBEROS_PRINCIPAL environment variable
export SCM_KERBEROS_PRINCIPAL=$MY_KERBEROS_PRINCIPAL
# acquire_kerberos_tgt expects that the argument passed to it refers to the keytab file
acquire_kerberos_tgt myservice.keytab