Service Descriptor Language Reference

Endre Major edited this page May 4, 2021 · 155 revisions

The Service Descriptor Language (SDL) is used in the service.sdl file to describe the CSD service to Cloudera Manager. It is worth first looking at the CSD primer to get an idea of how all the CSD components fit together.

Below is a description of each section of the SDL in more detail. There are some structures that are shared between sections. For example, the parameters under service are the same structures as the parameters under roles. Those structures will be described at the end.

Service

This is the top level section that describes the service as a whole.

Example

{
  "name" : "ECHO",
  "label" : "Echo",
  "description" : "The echo service",
  "version" : "1",
  "runAs" : {
    "user" : "echoservice",
    "group" : "echoservice"
   },
  "maxInstances" : 1,
  "svgIcon" : "images/icon.svg",
  "compatibility" : {
    "generation" : 2,
    "cdhVersion" : {
      "min" : 4,
      "max" : 5
    }
  },
  "parcel" : {
    "repoUrl" : "http://mywebsite.com",
    "requiredTags" : [ "echo" ],
    "optionalTags" : [ "echo-plugin" ]
  },
  "serviceDependencies" : [
    {
     "name" : "ZOOKEEPER"
    }, 
    {
      "name" : "HDFS", 
      "required" : "true"
    }
  ],
  "serviceInit" : {
    "preStartSteps" : [
      { 
        "commandName" : "CreateHomeDirCommand"
      }
    ],
    "postStartSteps" : [
      { 
        "commandName" : "CreateParamDirCommand", 
        "failureAllowed" : true
      }
    ]
  },
  "stopRunner" : {
    "relevantRoleTypes" : ["ECHO_WEBSERVER"],
    "runner" : {
      "program" : "scripts/control.sh",
      "args" : ["admin", "stopAll"]
    }, 
    "timeout" : 180000,
    "masterRole" : "ECHO_MASTER"
  }, 
  "inExpressWizard" : true,
  "rolesWithExternalLinks" : ["ECHO_MASTER_SERVER"],
  "hdfsDirs" : [ 
     {   
       "name" : "CreateUserHomeDirCommand",
       "label" : "Create Echo User Home Dir",
       "description" : "Creates the Echo user directory in HDFS",
       "directoryDescription" : "Echo HDFS user directory",
       "path" : "/user/${user}",
       "permissions" : "0750"
     }
  ],
  "commands" : [
    {
      "name" : "service_cmd1",
      "label" : "Service Cmd1 Label",
      "description" : "Service Cmd1 Help",
      "roleName" : "ECHO_WEBSERVER",
      "roleCommand" : "role_cmd1",
      "runMode" : "all",
      "internal" : "true"
    }
  ],
  "parameters" : "..."
}

name

This is the logical name of the service. In CM it is the service type. Cloudera Manager validates that this name is globally unique.

Required?: Yes

{
  "name" : "ECHO"
}

Note: must be uppercase and only contain letters, numbers and underscores
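
The naming rule above can be checked before packaging a CSD. The following is a minimal sketch; the function name is hypothetical, and the pattern is only a literal reading of the note (Cloudera Manager's own validation may be stricter).

```python
import re

# Assumption: "uppercase letters, numbers and underscores" maps to this pattern.
# Cloudera Manager's actual validation may apply further rules.
_SDL_NAME = re.compile(r"[A-Z0-9_]+")


def is_valid_sdl_name(name: str) -> bool:
    """Return True if name uses only uppercase letters, digits and underscores."""
    return _SDL_NAME.fullmatch(name) is not None
```

The same check applies to role names, which follow the same rule.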

label

The text that is user facing. e.g. in the "Add Service" wizard.

Required?: Yes

{  
  "label" : "Echo"
}

description

The description of the service - shown in the wizards.

Required?: Yes

{  
  "description" : "The echo service"
}

version

The version of the CSD.

  • Updated in Cloudera Manager 5.14: the CSD versioning logic was improved. The version of a CSD is now checked at CSD load time (during Cloudera Manager startup): only the latest version for a CSD service type is loaded, and earlier CSD versions are not. The version number takes precedence over the CDH compatibility range. Note that a newer CSD version might not be compatible with an older cluster; in that case the older, compatible CSD version is still not loaded, even if it is present in the CSD repository. Version numbers are recommended to be in the format "major.minor.micro".

Required?: Yes

{  
  "version" : "1"
}

runAs

The default user and group to run all commands as. If the user/group does not exist, the script will fail to run. The user/group should have been created by Cloudera Manager, the system packages or parcels. The CSD system will not create users and groups. The runAs can be overridden by a CM administrator in the configuration page.

The optional principal field specifies the Kerberos user that the daemon and all of its processes run as. More precisely, this field sets the default value of a parameter that is added for configuring the Kerberos principal; the value of that parameter replaces ${principal} in configurations. On non-secure clusters, ${principal} has the same value as ${user}.

Required?: Yes

{  
  "runAs" : {
    "user" : "echoservice",
    "group" : "echoservice",
    "principal" : "echoservice" // this is optional
   }
}

maxInstances

The number of service instances this service type can have. For example, CM only allows one instance of HDFS to exist per cluster.

Required?: No, defaults to no limit

{
  "maxInstances" : 1
}

icon

The path and file name of the icon that shows up beside the service.

Required?: No, a default service icon exists in Cloudera Manager

For CDP Data Center (Cloudera Manager 7.0+), the icon can be a 24x24 SVG image.

{
 "svgIcon": "images/icon.svg"
}

The SVG must have this in the root element:

<svg xmlns="http://www.w3.org/2000/svg" width="24px" height="24px" viewBox="0 0 24 24">

OR

<svg xmlns="http://www.w3.org/2000/svg" width="1em" height="1em" viewBox="0 0 24 24">

It can also be a PNG image for backward compatibility reasons, but this image may look blurry and aliased when displayed in the title area of the service as a 28x28 image, or inside a service list as a 16x16 image.

{
 "icon" : "images/icon.png"
}

compatibility

Describes compatibility requirements for this CSD. It is further discussed in detail here.

Required?: No, by default compatibility checks are not enforced.

{
  "compatibility" : {
    "generation" : 2,
    "cdhVersion" : {
      "min" : "4",
      "max" : "5"
    }
  }
}

| Key | Description | Required? |
| --- | --- | --- |
| generation | An integer used to communicate compatibility between different CSDs of the same name. Compatibility is intentionally independent of CSD versions. | yes |
| cdhVersion | The range of CDH cluster versions compatible with this CSD. Defaults to all versions. min is inclusive, but max is exclusive. If only the major version is specified for max, it is interpreted as everything up to but not including the next major version. Therefore the example above means everything starting from 4.0.0 inclusive, up to but not including 6.0.0. | no |

  • Changed in Cloudera Manager 5.4.0: supports "major.minor.micro" syntax, e.g. "min" : "5.3.0"
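
The range semantics of cdhVersion (inclusive min, exclusive max, major-only max widened to the next major version) can be illustrated with a small check. This is a sketch under stated assumptions, not Cloudera Manager's implementation: the function names are hypothetical, and padding a two-part version such as "5.3" with zeros is an assumption.

```python
def parse_version(text):
    """Split "5", "5.3" or "5.3.0" into a list of integers."""
    return [int(part) for part in str(text).split(".")]


def _pad(parts):
    # Assumption: missing minor/micro components are treated as 0.
    return parts + [0] * max(0, 3 - len(parts))


def cdh_version_in_range(cluster, min_version=None, max_version=None):
    """Return True if the cluster version falls inside [min, max)."""
    cluster_v = _pad(parse_version(cluster))
    if min_version is not None and cluster_v < _pad(parse_version(min_version)):
        return False  # min is inclusive
    if max_version is not None:
        upper = parse_version(max_version)
        if len(upper) == 1:
            # A major-only max means "up to but not including the next major".
            upper = [upper[0] + 1, 0, 0]
        if cluster_v >= _pad(upper):
            return False  # max is exclusive
    return True
```

With the example above ("min" : "4", "max" : "5"), a 5.9.9 cluster is accepted and a 6.0.0 cluster is rejected.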

parcel

A structure that describes the interaction of the CSD with parcels.

Required?: No

{
 "parcel" : {
    "repoUrl" : "http://mywebsite.com",
    "additionalRepoUrls": [ "http://anotherparcelsource.com" ],
    "requiredTags" : [ "echo" ],
    "optionalTags" : [ "echo-plugin" ]
 }
}

| Key | Description | Required? |
| --- | --- | --- |
| repoUrl | Automatically adds the parcel repository url to the list in Cloudera Manager. This makes deploying the associated parcel easier since the user doesn't have to manually add a parcel url. | no |
| additionalRepoUrls | A list of URLs which is added to the parcel repository list in Cloudera Manager, the same as 'repoUrl'. This is combined with 'repoUrl', and all unique URL strings will be added to the repository list. (Since: Cloudera Manager 6.0.0) | no |
| requiredTags | The tags that must exist in the active parcels on the cluster. | no |
| optionalTags | Any parcel that has this optional tag will get their environment scripts sourced when commands are run. | no |
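
How repoUrl and additionalRepoUrls combine into a set of unique URLs can be sketched as follows. The function name is hypothetical, and preserving the declaration order is an assumption; the reference only states that unique URL strings are added.

```python
def parcel_repo_urls(parcel):
    """Collect the unique repository URLs declared in a "parcel" structure."""
    candidates = [parcel.get("repoUrl"), *parcel.get("additionalRepoUrls", [])]
    urls = []
    for url in candidates:
        # Skip a missing repoUrl and any URL string already collected.
        if url and url not in urls:
            urls.append(url)
    return urls
```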

For more info see The Parcel provides tags and interaction with Parcels.

serviceDependencies

A list of service types that this CSD depends on. An instance of each of these services needs to exist on the cluster before a service of this CSD can be added. Declaring a dependency on a service means that the client configs for all dependencies will also get deployed to the process directory.

Required?: No

{
  "serviceDependencies" : [
    {
     "name" : "ZOOKEEPER"
    }, 
    {
      "name" : "HDFS", 
      "required" : "true"
    }
  ]
}

| Key | Description | Required? |
| --- | --- | --- |
| name | The service type. | yes |
| required | Whether an error should surface if dependency is not met. Defaults to false. | no |

Note: If you add a dependency on the ZooKeeper service, then any process in that service (e.g. role daemon process, command process, client config deployment process) will get the ZooKeeper quorum in the ZK_QUORUM environment variable. This can be used in the control script to add configuration properties for the ZooKeeper quorum.
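
A control script could turn ZK_QUORUM into a configuration property along these lines. This is a minimal sketch: the function name and the property key echo.zookeeper.quorum are hypothetical, and only the ZK_QUORUM variable name comes from this reference.

```python
import os


def zookeeper_property_line(environ=os.environ, key="echo.zookeeper.quorum"):
    """Build a "key=value" line from ZK_QUORUM, or None if it is not set.

    The property key is a made-up example; use whatever your service expects.
    """
    quorum = environ.get("ZK_QUORUM")
    if not quorum:
        return None
    return f"{key}={quorum}"
```

The returned line can then be appended to a properties file in the process directory before the daemon starts.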

serviceInit

In the wizard, when the service is added, it might be necessary to run some service commands before starting the service and right after starting the service. For example, creating a directory in HDFS.

Required?: No, only the start service commands are run

{
 "serviceInit" : {
   "preStartSteps" : [
     { 
       "commandName" : "CreateHomeDirCommand"
     }
   ],
   "postStartSteps" : [
     { 
       "commandName" : "SomeCommand", 
       "failureAllowed" : true
     }
   ]
 }
}

| Key | Description | Required? |
| --- | --- | --- |
| preStartSteps | A list of service commands to execute before the service starts. | no |
| postStartSteps | A list of service commands to execute after the service starts. | no |
| failureAllowed | For each service command in preStartSteps or postStartSteps, indicates if it is ok if the commands fail. Defaults to no. | no |

stopRunner

Defines a custom runner to gracefully bring down the service. After the runner completes, any roles still running will be abruptly stopped. We use something similar for HBase and Accumulo that notifies the master to shut down the service, ensuring that the worker roles shut down cleanly.

Required?: No, when the service is stopped, each role gets a SIGTERM

{
 "stopRunner" : {
  "relevantRoleTypes" : ["ECHO_WEBSERVER"],
  "runner" : {
   "program" : "scripts/control.sh",
   "args" : ["admin", "stopAll"]
  }, 
  "timeout" : 180000,
  "masterRole" : "ECHO_MASTER"
 }
} 

| Key | Description | Required? |
| --- | --- | --- |
| relevantRoleTypes | The types of roles to be stopped by the runner. These roles are set as stopped after the runner runs. If not specified, all roles are expected to be stopped. | no |
| masterRole | The master role name. This is used to determine when the stop runner is completed. When one of the master roles is stopped then the runner is considered done. If defined, the runner will only expect the script to bring down one running master, among possibly multiple masters. | no |
| runner | The script to gracefully bring down the roles. See Script Runner. | yes |
| timeout | The duration to wait for the runner to complete. If the runner does not complete by then, the runner will be aborted and the processes will be exited abruptly. Default is no timeout. | no |

inExpressWizard

Set to true if this service should show up in the Express and Add Cluster wizards. Only services that can be fully configured from inside the wizards should be added. Otherwise the initial cluster setup will fail.

Required?: No, defaults to false

{
  "inExpressWizard" : true
}

rolesWithExternalLinks

Specifies a list of roles that have external links that should be shown on the service main page.

Required?: No

{
  "rolesWithExternalLinks" : ["ECHO_MASTER_SERVER"]
}

hdfsDirs

A common requirement of services is that directories exist in HDFS with specific permissions. In order for these directories to exist, the user running the command needs to have elevated HDFS privileges. To aid with this, Cloudera Manager can create the needed directories and set the proper permissions.

Required?: No

{
  "hdfsDirs" : [ 
     {   
       "name" : "CreateUserHomeDirCommand",
       "label" : "Create Echo User Home Dir",
       "description" : "Creates the Echo user directory in HDFS",
       "directoryDescription" : "Echo HDFS user directory",
       "path" : "/user/${user}",
       "permissions" : "0750"
     }
  ] 
}

| Key | Description | Required? |
| --- | --- | --- |
| name | The name of the command. | yes |
| label | The user facing name in the "Actions" dropdown list. | yes |
| description | The help string for the command. | yes |
| directoryDescription | The description for this directory. | yes |
| path | The path in HDFS. This can have standard substitutions. | yes |
| permissions | Permissions for the directory. | yes |

commands

A list of service commands. A command needs to execute on a host. Since a service is an abstract entity made up of roles it does not live on a specific host. The way service commands work is they point to a role command on a specific role type. runMode can be used to specify how this command should be run.

Required?: No

{
 "commands" : [
  {
   "name" : "service_cmd1",
   "label" : "Service Cmd1 Label",
   "description" : "Service Cmd1 Help",
   "roleCommand" : "role_cmd1",
   "roleName" : "ECHO_WEBSERVER",
   "runMode" : "all",
   "internal" : "true"
  }
 ]
}

| Key | Description | Required? |
| --- | --- | --- |
| name | The name of the command. | yes |
| label | The user facing name in the "Actions" dropdown list. | yes |
| description | The help string for the command. | yes |
| roleCommand | The role command the service command will execute. | yes |
| roleName | The role type the role command lives on. | yes |
| runMode | Enum of "all", "single", or "target". See runMode. | yes |
| internal | Whether the command should be marked as internal and hidden in the CM UI. Defaults to false. | no |

runMode

all

  • The service command will run on all the roles in the service with the required role state (specified in the role command description). The command will wait until all the roles have completed before completing the service command.

single

  • The service command will pick a single random role that is in the required role state (specified in the role command description) to run the role command. The service command will wait until the role has completed the command. This type of command is often needed if the command should only be run once, for example an "Initialization" type of command.

target (Since: Cloudera Manager 7.1.0)

  • The service command will run on all targeted roles in the service with the required role state (specified in the role command description). Roles may be targeted by selecting them through the CM UI. The service command will wait until all targeted roles have completed the command.
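
The three run modes differ only in which roles receive the role command. The sketch below models that selection; the role dictionaries with "name" and "state" keys and the function name are hypothetical stand-ins, not Cloudera Manager data structures.

```python
import random


def select_roles(roles, run_mode, required_state, targeted=()):
    """Pick the roles a service command would run its role command on."""
    # Only roles in the required role state are eligible in every mode.
    eligible = [role for role in roles if role["state"] == required_state]
    if run_mode == "all":
        return eligible
    if run_mode == "single":
        # One random eligible role, e.g. for an "Initialization" command.
        return [random.choice(eligible)] if eligible else []
    if run_mode == "target":
        # Roles explicitly selected by the user, restricted to eligible ones.
        wanted = set(targeted)
        return [role for role in eligible if role["name"] in wanted]
    raise ValueError(f"unknown runMode: {run_mode!r}")
```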

parameters

A list of parameters used to configure the service. See Parameters.

Required?: No

{
  "parameters" : "..."
}

externalKerberosPrincipals

List of external kerberos principals used by the service. Cloudera Manager will not manage these principals, but this can be used to refer to any external principals in configuration.

kerberos

Whether kerberos authentication is used. A common practice is to reference a boolean parameter which is used to enable or disable Kerberos for this service. Kerberos authentication is required if any of the following criteria is met:

  • This field has a variable reference that will evaluate to "true" or "kerberos" (case-insensitive)
  • This field has a value of "true" or "kerberos" (case-insensitive)
  • Any dependency of the service requires kerberos authentication (regardless of the above)
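
The three criteria above can be combined into a single decision. The following sketch assumes a simplified model: the function name is hypothetical, a variable reference is a whole-string ${name} looked up in a plain dictionary, and dependency results are passed in as booleans.

```python
def kerberos_required(field_value, params=None, dependency_requires=()):
    """Decide whether Kerberos authentication is required for a service."""
    params = params or {}
    value = field_value or ""
    # Resolve a "${name}" variable reference against the given parameters.
    if value.startswith("${") and value.endswith("}"):
        value = str(params.get(value[2:-1], ""))
    # "true" or "kerberos", compared case-insensitively.
    own_setting = value.strip().lower() in ("true", "kerberos")
    # A dependency that requires Kerberos wins regardless of the field value.
    return own_setting or any(dependency_requires)
```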

providesConnectors (Since: Cloudera Manager 7.0.0)

"providesConnectors" : [
  {
    "connectorType": "knownType"
    // ...
  }
  // ...
]

Required?: No

providesConnectors is a list of elements, where the CSD can enable the integration for known connector types.

Service connectors in Cloudera Manager are entities that allow decoupling a particular behavioral interface from the service type that provides a specific implementation for that behavior. This is useful in scenarios where Cloudera Manager needs to integrate with a specific kind of behavior, but the behavior can be implemented by different solutions. Using a programming language analogy, imagine an interface, e.g. 'Employee', and classes implementing it: 'SalesEmployee', 'EngineeringEmployee'. In Cloudera Manager a service type is analogous to the concrete class, while the interface would be the service connector.

The supported connector types are the following.

KmsConnector

HDFS Encryption needs to talk to a Key Management Server. If your CSD is implementing a Hadoop Key Management Server interface, then add the following to declare this connector.

  {
    "connectorType" : "KmsConnector",
    "roleName" : "MY_KMS_ROLE",
    "insecureUrl" : "http://${host}:${kms_port}/kms",
    "secureUrl" : "https://${host}:${kms_ssl_port}/kms",
    "loadBalancerUrl" : "${kms_load_balancer}"
  }

| Key | Description | Required? |
| --- | --- | --- |
| roleName | Name of the role that provides the KMS interface. | Usually. Can omit if your CSD always configures a load balancer. |
| insecureUrl | URL of KMS when SSL is not enabled and the load balancer is not in use. | Usually. Can omit if your CSD always configures a load balancer. |
| secureUrl | URL of KMS when SSL is enabled and the load balancer is not in use. | Usually. Can omit if your CSD always configures a load balancer, or if it never supports SSL. |
| loadBalancerUrl | URL of load balancer. Must be specified when there are multiple relevant roles. Note that you can only reference service-level parameters in substitutions for loadBalancerUrl. | If your CSD can be configured with multiple roles of the relevant type (see topology), then this must be present. |

In addition, if roleName is provided and the matching role defines a configGenerator for “core-site.xml”, then core-site.xml will automatically get everything in the core-site.xml from HDFS client configuration. If parameters registered with this configGenerator have keys that conflict with contents in HDFS core-site.xml, the values from HDFS will be overridden.

Note that SSL enablement is determined by sslServer for the relevant role.

The KEYTRUSTEE CSD serves as a complete example of how to create a custom KMS that HDFS can use for encryption.

providesKms (Since: Cloudera Manager 5.2.0)

Deprecated since 7.0.0 in favor of providesConnectors. Still allowed for backward compatibility.

{
  "providesKms" : {
    "roleName" : "MY_KMS_ROLE",
    "insecureUrl" : "http://${host}:${kms_port}/kms",
    "secureUrl" : "https://${host}:${kms_ssl_port}/kms",
    "loadBalancerUrl" : "${kms_load_balancer}"
  }
}

For more details see the new equivalent KmsConnector under providesConnectors.

rollingRestart (Since: Cloudera Manager 5.5.0)

If the service supports rolling restart, the steps can be specified here. See Rolling Restart for more details.

{
  "rollingRestart" : {
    "nonWorkerSteps" : [{
      "roleName" : "NON_WORKER_ROLE",
      "bringDownCommands" : [ "Stop" ],
      "bringUpCommands" : [ "Start", "role_cmd" ]
    }],
    "workerSteps" : {
      "roleName" : "WORKER_ROLE",
      "bringDownCommands" : [ "service_cmd", "Stop" ],
      "bringUpCommands" : [ "Start" ]
    }
  }
}

Roles

A service can have a list of role types. Each role of a specific role type is associated with a process on a host.

Example

{
 "roles" : [
  {
   "name" : "ECHO_WEBSERVER",
   "label" : "Web Server",
   "pluralLabel" : "Web Servers",
   "startRunner" : {
     "program" : "scripts/control.sh",
     "args" : [ "start" ],
     "environmentVariables" : {
       "WEBSERVER_PORT" : "${port_num}"
      }
    },
    "stopRunner" : {
      "timeout" : "90000",
      "runner" : {
        "program" : "scripts/stop_echows.sh",
        "args" : ["cmdlineArg1"],
        "environmentVariables" : {
          "FOO_VAR" : "bar"
        }
      }
    },
    "externalLink" : {
      "name" : "webserver_web_ui",
      "label" : "Web Server Web UI",
      "url" : "http://${host}:${webserver_webui_port}"
    },
    "additionalExternalLinks" : [
      {
        "name" : "webserver_web_ui2",
        "label" : "Web Server WebUI2",
        "url" : "http://${host}:${webserver_webui_port}/ui2"
       }
    ],
    "topology" : {
      "minInstances" : "2",
      "maxInstances" : "10",
      "softMinInstances" : "3",
      "softMaxInstances" : "6",
      "requiresOddNumberOfInstances" : "false"
    },
    "logging" : {
      "dir" : "/var/log/echo",
      "filename" : "webserver.log",
      "modifiable" : true,
      "configName" : "log.dir",
      "loggingType" : "log4j"
    },
    "commands" : [
      {
       "name" : "role_cmd1",
       "label" : "Role Cmd1 Label",
       "description" : "Role Cmd1 Help",
       "expectedExitCodes" : [0, 1, 2],
       "requiredRoleState" : "running",
       "commandRunner" : {
         "program" : "scripts/control.sh",
         "args" : ["cmd1"]
       },
       "internal" : "true"
      }
     ],
     "configWriter" : "....",
     "parameters" : "....",
     "cgroup" : "...."
    }
  ]
}

name

This is the logical name of the role. In CM it is the role type. A current limitation is that the role type needs to be globally unique. Because of this, it is suggested that the service type be prepended to the role type to make it scoped to this service.

Required?: Yes

{
  "name" : "ECHO_WEBSERVER"
}

Note: must be uppercase and only contain letters, numbers and underscores

label

The name that is user facing. e.g. in the "Add Service" wizard.

Required?: Yes

{  
  "label" : "Web Server"
}

pluralLabel

The plural name that is user facing. Shows up in the home screen.

Required?: Yes

{  
  "pluralLabel" : "Web Servers"
}
  • New in Cloudera Manager 5.3.0

startRunner

The script to run that starts this role.

Required?: Yes

{
 "startRunner" : {
   "program" : "scripts/control.sh",
   "args" : [ "start" ],
   "environmentVariables" : {
     "WEBSERVER_PORT" : "${port_num}"
   }
 }
}

The startRunner structure is of type Script Runner

stopRunner

This section provides two capabilities: custom stop behavior and graceful role shutdown. When this descriptor is not specified, the standard behavior applies: the process will receive a SIGTERM and will be allowed to complete its shutdown in a certain amount of time (currently hardcoded). If the process cannot terminate in time or report the expected exit code it will be forcefully killed (group SIGKILL). This is mutually exclusive with using a stopRunner on the service-level descriptor.

Required?: No

"stopRunner" : {
  "timeout" : "90000",
  "runner" : {
    "program" : "scripts/stop_echows.sh",
    "args" : ["cmdlineArg1"],
    "environmentVariables" : {
      "FOO_VAR" : "bar"
    }
  }
}
  1. timeout: the default timeout in milliseconds. Any positive, non-zero value enables the graceful role shutdown feature. The value specified here is the default; the user can change it in Cloudera Manager. A value of zero (0) means wait indefinitely for the process to terminate.
  2. runner: when a runner (of type Script Runner) is specified, a custom script performs the stop operation for the role. The pid of the currently running process is passed in the Cloudera Manager-provided environment variable PID_TO_STOP. When no runner is specified, the standard behavior applies, but the graceful timeout is respected before a forced kill.
  • New in Cloudera Manager 5.11.1
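
A custom stop script receives the target process through PID_TO_STOP. The sketch below shows the smallest possible handler, written as a Python function for illustration; the function name is hypothetical, and a real stop script would usually trigger an application-specific shutdown rather than only sending a signal.

```python
import os
import signal


def stop_role(environ=os.environ, kill=os.kill):
    """Send SIGTERM to the process named by PID_TO_STOP and return its pid.

    The environ and kill arguments exist only so the function can be
    exercised without signalling a real process.
    """
    pid = int(environ["PID_TO_STOP"])
    kill(pid, signal.SIGTERM)
    return pid
```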

externalLink

Specifies an external link that shows up in the status page of the role. This is often a Web UI for the role. If this role is present in the service rolesWithExternalLinks, it would be this external link that is used in the service status page.

Required?: No

{
  "externalLink" : {
    "name" : "webserver_web_ui",
    "label" : "Web Server Web UI",
    "url" : "http://${host}:${webserver_webui_port}",
    "secureUrl" : "https://${host}:${webserver_webui_ssl_port}"
  }
}

| Key | Description | Required? |
| --- | --- | --- |
| name | The internal identifier for the link. | yes |
| label | The user facing name. Shows up on the Status Page. | yes |
| url | The url to the status page. This can have standard substitutions. | yes |
| secureUrl | The url to the status page when SSL is enabled. This can have standard substitutions. If secureUrl is not specified, then url is used instead. SSL enablement is determined by sslServer in the same role. | no |
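
The fallback from secureUrl to url, followed by substitution, can be sketched with Python's string.Template, which happens to use the same ${name} syntax. This only illustrates the selection rule; the function name is hypothetical and Cloudera Manager's own substitution engine is not shown here.

```python
from string import Template


def resolve_external_link(link, ssl_enabled, substitutions):
    """Return the URL to display for an externalLink structure."""
    raw = link["url"]
    # secureUrl is used only when SSL is enabled and the key is present.
    if ssl_enabled and link.get("secureUrl"):
        raw = link["secureUrl"]
    return Template(raw).substitute(substitutions)
```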

additionalExternalLinks

Specifies additional external links. These links will show up alongside the externalLink on the role process page. These links will not show up in the main status page of the role. Ideally, an externalLink should be specified before any additional links are added.

Required?: No

{
  "additionalExternalLinks" : [
      {
        "name" : "webserver_web_ui2",
        "label" : "Web Server WebUI2",
        "url" : "http://${host}:${webserver_webui_port}/ui2",
        "secureUrl" : "https://${host}:${webserver_webui_ssl_port}"
       }
    ]
}

| Key | Description | Required? |
| --- | --- | --- |
| name | The internal identifier for the link. | yes |
| label | The user facing name. | yes |
| url | The url to the status page. This can have standard substitutions. | yes |
| secureUrl | The url to the status page when SSL is enabled. This can have standard substitutions. If secureUrl is not specified, then url is used instead. SSL enablement is determined by sslServer in the same role. | no |

topology

Provides restrictions on the number of instances of this role type that should exist on the cluster. For example, a singleton type role like a master can have minInstances = 1 and maxInstances = 1.

Required?: No

{
  "topology" : {
    "minInstances" : "2",
    "maxInstances" : "10",
    "softMinInstances" : "3",
    "softMaxInstances" : "6",
    "requiresOddNumberOfInstances" : "false",
    "placementRules" : [ ]
  }
}

| Key | Description | Required? |
| --- | --- | --- |
| minInstances | The minimum number of instances. Defaults to 1. | no |
| maxInstances | The maximum number of instances. Default is no limit. | no |
| softMinInstances | The recommended minimum number of instances. A warning will be displayed on the service's Instances page in Cloudera Manager when fewer than this number of roles are configured. By default, there is no recommended minimum. | no |
| softMaxInstances | The recommended maximum number of instances. A warning will be displayed on the service's Instances page in Cloudera Manager when more than this number of roles are configured. By default, there is no recommended maximum. | no |
| requiresOddNumberOfInstances | This should be set to true in cases where only an odd number of instances is allowed for the role type. This might be important for some HA architectures. Defaults to false. | no |
| placementRules | List of special placementRules limiting where a role may be placed. Default is no special rules. | no |

  • New in Cloudera Manager 5.5.0, added placementRules
  • New in Cloudera Manager 5.8.0, added softMinInstances and softMaxInstances
  • New in Cloudera Manager 5.14.0, added requiresOddNumberOfInstances
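
The hard and soft limits in topology can be read as a simple check on the number of configured instances. The following sketch (hypothetical function name, not Cloudera Manager's validator) treats the hard limits and the odd-number rule as errors and the soft limits as warnings, using the defaults listed in the table.

```python
def check_topology(count, topology):
    """Return (errors, warnings) for `count` instances of a role type."""
    errors, warnings = [], []

    # minInstances defaults to 1; every other limit is unset by default.
    min_instances = int(topology.get("minInstances", 1))
    if count < min_instances:
        errors.append(f"at least {min_instances} instance(s) required, found {count}")

    max_instances = topology.get("maxInstances")
    if max_instances is not None and count > int(max_instances):
        errors.append(f"at most {max_instances} instance(s) allowed, found {count}")

    soft_min = topology.get("softMinInstances")
    if soft_min is not None and count < int(soft_min):
        warnings.append(f"fewer than the recommended {soft_min} instance(s)")

    soft_max = topology.get("softMaxInstances")
    if soft_max is not None and count > int(soft_max):
        warnings.append(f"more than the recommended {soft_max} instance(s)")

    # The examples in this reference write booleans as strings, so accept both.
    odd = str(topology.get("requiresOddNumberOfInstances", "false")).lower() == "true"
    if odd and count % 2 == 0:
        errors.append("an odd number of instances is required")

    return errors, warnings
```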

placementRules

Placement Rules allow special restrictions for where a role may be placed.

alwaysWith

{
  "type" : "alwaysWith",
  "roleType" : "PRIMARY_ROLE_NAME"
}

When this rule is present, the current role must always be placed on the same host as wherever the role with name "PRIMARY_ROLE_NAME" is placed. The current role will no longer appear in the wizard when adding this service. Instead, one instance of this role will automatically be placed on any host that has the primary role. If the user ever assigns roles in a way that violates this placement rule, the service will have a configuration error and fail to start.

alwaysWithAny

{
 "type" : "alwaysWithAny",
 "roleTypes" : [ "PRIMARY_ROLE_NAME_1", "PRIMARY_ROLE_NAME_2"]
}

When this rule is present, the current role must always be placed on the same host as wherever the roles with the name "PRIMARY_ROLE_NAME_1", "PRIMARY_ROLE_NAME_2" are placed. The current role will no longer appear in the wizard when adding this service. Instead, one instance of this role will automatically be placed on any host that has at least one of the primary roles. If more than one of the primary roles are themselves placed in the same host, then only one instance of this role will be automatically placed on that host.

At least two unique primary roles should be defined in an alwaysWithAny rule. The alwaysWithAny rule is mutually exclusive with the alwaysWith placement rule, so the two should not be defined together for the same role. If the user ever assigns roles in a way that violates this placement rule, the service will have a configuration error and fail to start.

neverWith

{
 "type" : "neverWith",
 "roleTypes" : [ "INCOMPATIBLE_ROLE_NAME_1", "INCOMPATIBLE_ROLE_NAME_2" ]
}

When this rule is present, the current role must never be placed on the same host as wherever the listed incompatible roles are placed. If the user ever assigns roles in a way that violates this placement rule, the service will have a configuration error and fail to start.

Starting with Cloudera Manager 5.13.0, all the placement rules for a role should be of a unique “type”. That is, a role can have a maximum of one placement rule per type.

  • Placement Rules (alwaysWith and neverWith) new in Cloudera Manager 5.5.0
  • Placement Rule (alwaysWithAny) new in Cloudera Manager 5.13.0
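
The three rule types can be expressed as per-host checks. The sketch below is one reading of the descriptions above, not Cloudera Manager's validator: the function name and the host-to-role-types mapping are hypothetical, and alwaysWith/alwaysWithAny are assumed to hold in both directions (the role is present exactly on the hosts that have a primary role).

```python
def placement_violations(role_type, rules, hosts):
    """Return (host, rule type) pairs where a placement rule is violated.

    hosts maps a host name to the set of role types placed on that host.
    """
    violations = []
    for host, placed in hosts.items():
        has_role = role_type in placed
        for rule in rules:
            kind = rule["type"]
            if kind == "alwaysWith":
                broken = has_role != (rule["roleType"] in placed)
            elif kind == "alwaysWithAny":
                has_primary = any(r in placed for r in rule["roleTypes"])
                broken = has_role != has_primary
            elif kind == "neverWith":
                broken = has_role and any(r in placed for r in rule["roleTypes"])
            else:
                raise ValueError(f"unknown placement rule type: {kind!r}")
            if broken:
                violations.append((host, kind))
    return violations
```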

metricsSource

Required?: No

The metrics source section describes the metric format and the URLs of a role.

  "metricsSource": {
    "metricsFormat": "SIMPLE",
    "httpSource": {
      "httpUrl": "http://${host}:${echo_server_http_port}/api/echo/admin/metrics",
      "httpsUrl": "https://${host}:${echo_server_https_port}/api/echo/admin/metrics",
      "sslEnabled": "${ssl_enabled}",
      "kerberosEnabled": "${kerberos_enabled}",
      "kerberosPrincipal": "${principal}"
    }
  },

| Key | Description | Required? |
| --- | --- | --- |
| metricsFormat | Can be SIMPLE or JMX. | yes |
| httpUrl | URL using http. This can have standard substitutions. | no |
| httpsUrl | URL using https. This can have standard substitutions. | no |
| sslEnabled | Indicates if SSL is enabled. Can be true or false. This can have standard substitutions. | no |
| kerberosEnabled | Indicates if Kerberos authentication is required. Parameter substitutions are available. See kerberos for values. | no |
| kerberosPrincipal | Kerberos principal primary to use. Parameter and principal substitutions are available. | no |

The SIMPLE format supports fetching metrics from a nested set of dictionaries, as described in the context field of the metric. The JMX format supports fetching metrics from Hadoop endpoints that serve JSON-serialized JMX beans. It iterates over the list of beans and tries to match the regular expression against each bean name. If a match is found, it then tries to extract the metrics from the bean.

Kerberos settings are available since Cloudera Manager 7.4.3. If not provided, defaults are backwards-compatible: kerberosEnabled is false, and kerberosPrincipal is a built-in default, currently "HTTP". This default is always available for metrics collection if present in the keytab, regardless of kerberosEnabled.

A suggested (but not required) approach to setting Kerberos options is to reference the same parameter in kerberosEnabled and kerberos, and set kerberosPrincipal to "${principal}", as in the example above.

serviceMonitorClient

Required?: No

This is a role-level parameter. It indicates that the role needs the Service Monitor host and port information in order to send in metric data directly. This can be used for metrics that cannot be obtained in the supported way, or for canary implementations.

"serviceMonitorClient" : "true",

healthTests

Required?: No

Both services and roles can have health tests defined. Health tests periodically act on collected metrics, as defined in MDL, and raise health alerts when the criteria defined by the test are satisfied. Note that health tests depend on the presence of the metrics and are not responsible for collecting them. Each health test is typed, and depending on the type, additional fields may be required. If a health test is defined at the service level, it acts on metric values belonging to the service. Similarly, if defined at the role level, it acts on metric values for the role.

healthTests list

The healthTests section contains a list of health tests.

"healthTests" : []

A single health test example:

{
  "type" : "metric",
  "name" : "PENDING_JOBS",
  "label" : "Pending Jobs",
  "description" : "Some Useful Description",
  "metric" : "pending_jobs",
  "timeWindowSec" : 60,
  "comparisonOperator" : "gt",
  "aggregationFunction" : "avg",
  "greenMessage" : "There have been a reasonable number of jobs pending : ${metric.value}",
  "yellowThreshold" : 10,
  "yellowMessage" : "There have been ${metric.value} jobs pending.",
  "redThreshold" : 20,
  "redMessage" : "There have been ${metric.value} jobs pending. Bad!",
  "missingDataColor" : "yellow"
}

A health test

Common fields

Key Description Required?
name The unique identifier for the health test within this CSD. This name may be persisted as a reference, so changing it equates to changing the health test's identity. The convention is to use only upper-case letters separated by underscores. yes
label The short user-friendly name. yes
description The description for the health test itself. yes
advice Optional structure that provides detailed advice on how to deal with health issue. Highly recommended. no
type The type of the health test. Currently only the "metric" type is supported; there is also an experimental "status" type, which is not covered here. yes

Advice

Key Description Required?
message Descriptive message on how to address health problem raised by this test. yes
parameters Relevant parameters for addressing health alert or tuning this test. no
commands Relevant commands that may help with addressing alerts by this test. no

Metric health test fields

Currently "metric" is the only supported health test type. For this type of health test, the criterion is essentially a numerical evaluation: it is met when the actual value compared to the threshold evaluates to true. Expanded, the evaluation is comparisonOperator(aggregationFunction(metric values within timeWindowSec), threshold).

Key Description Required?
metric The metric this test acts on. Must be either gauge or counter metric. yes
divisorMetric Optional metric serving as denominator while the main metric is numerator. no
timeWindowSec Time window in seconds to look back. yes
comparisonOperator How to compare the actual value against the threshold. See Comparison Operators. yes
aggregationFunction How to aggregate all values within the time window into one value: avg, sum, max, min, last yes
redThreshold Threshold to trigger bad health. no
redMessage Message for bad health. The metric value can be referenced in the message using the ${metric.value} expression. no
yellowThreshold Threshold to trigger concerning health. no
yellowMessage Message for concerning health. The metric value can be referenced in the message using the ${metric.value} expression. no
greenMessage Message when the health is good, i.e. neither red nor yellow threshold met. The metric value can be referenced in the message using the ${metric.value} expression. yes
missingDataColor Determines how missing data is handled; it enables the health test to give a warning or error in that case. Valid values: not_avail, green, yellow, red. Omitting missingDataColor is equivalent to missingDataColor=not_avail. no

Comparison Operators

Name Meaning
lt <
lte <=
gt >
gte >=
eq ==
neq !=
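
The evaluation described under "Metric health test fields" can be sketched in Python as follows. This is only an illustration, not Cloudera Manager's implementation: the function name is hypothetical, divisorMetric is left out, and checking the red threshold before the yellow one is an assumption.

```python
import operator
from statistics import mean

# Comparison operators from the table above.
COMPARISON_OPERATORS = {
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
    "eq": operator.eq,
    "neq": operator.ne,
}

# Aggregation functions listed for aggregationFunction.
AGGREGATION_FUNCTIONS = {
    "avg": mean,
    "sum": sum,
    "max": max,
    "min": min,
    "last": lambda values: values[-1],
}


def evaluate_health_test(test, values):
    """Return the health color for the metric values seen in the time window.

    `test` is a dict shaped like the health test example above; `values` is
    the list of metric values within timeWindowSec, oldest first.
    Assumption: the red threshold is checked before the yellow one.
    """
    if not values:
        # Omitting missingDataColor is equivalent to not_avail.
        return test.get("missingDataColor", "not_avail")

    compare = COMPARISON_OPERATORS[test["comparisonOperator"]]
    actual = AGGREGATION_FUNCTIONS[test["aggregationFunction"]](values)

    # Thresholds are optional, so skip any that are not set.
    for color in ("red", "yellow"):
        threshold = test.get(color + "Threshold")
        if threshold is not None and compare(actual, threshold):
            return color
    return "green"


# The thresholds and operators of the PENDING_JOBS example above.
pending_jobs_test = {
    "comparisonOperator": "gt",
    "aggregationFunction": "avg",
    "yellowThreshold": 10,
    "redThreshold": 20,
    "missingDataColor": "yellow",
}
```

With the PENDING_JOBS settings, a window whose average exceeds 20 evaluates to red, one whose average lies above 10 but not above 20 evaluates to yellow, and an empty window falls back to missingDataColor.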

healthAggregation

Required?: No

Health Aggregation allows reporting health of the service based on the health of one or more role(s). A role can define one of the following health aggregation types, depending on the topology of the role, but not both of them.

singleton

A type of health aggregation where the service health is reported based on the health of the singleton role. The service health has a 1:1 mapping to the health of the singleton role. For example, if the role is "Good" (i.e. GREEN), then the service-level health check reports "Good". A “singleton” health aggregation type should only be used for a singleton role. A role is considered to be a singleton role if its topology has “maxInstances” = 1.

{
   "healthAggregation" : { 
      "type" : "singleton" 
    }
}

nonSingleton

A type of health aggregation where the service health is reported based on the health of all the roles of this type. A “nonSingleton” health aggregation type should be only used with a non-singleton role. A role is considered to be a non-singleton role if its topology has “maxInstances” > 1.

{
  "healthAggregation" : { 
    "type" : "nonSingleton", 
    "percentGreenForGreen" : 95.0, 
    "percentYellowGreenForYellow" : 90.0 
    }
}

percentGreenForGreen: A double value. The associated service health will report "Good" (i.e. GREEN) if the total percentage of "Good" roles is strictly greater than this value. If this condition evaluates to true, then "percentYellowGreenForYellow" is not evaluated.

percentYellowGreenForYellow: A double value. The associated service health will report "Concerning" (i.e. YELLOW) if the total percentage of "Good" and "Concerning" roles is strictly greater than this value. If this condition evaluates to false, then the service health is reported as "Bad" (i.e. RED). Note that these rules imply that the health check will never return “Bad” (i.e. RED) unless at least one role is “Bad”.

  • Health Aggregation new in Cloudera Manager 5.13.0.
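
The nonSingleton rules above can be sketched in Python. The function name and the GREEN/YELLOW/RED strings are illustrative assumptions, and the behaviour with zero roles is not described in this reference, so the sketch simply rejects that case.

```python
def aggregate_non_singleton_health(role_healths, percent_green_for_green,
                                   percent_yellow_green_for_yellow):
    """Derive service health from the health of all roles of one type.

    `role_healths` is a non-empty list of "GREEN", "YELLOW" or "RED" strings.
    """
    if not role_healths:
        # Assumption: the zero-role case is not covered by the rules above.
        raise ValueError("at least one role is required")

    total = len(role_healths)
    green = role_healths.count("GREEN")
    yellow = role_healths.count("YELLOW")
    percent_green = 100.0 * green / total
    percent_yellow_green = 100.0 * (green + yellow) / total

    # Both comparisons are strict ("strictly greater than").
    if percent_green > percent_green_for_green:
        return "GREEN"
    if percent_yellow_green > percent_yellow_green_for_yellow:
        return "YELLOW"
    return "RED"
```

With the thresholds from the example (95.0 and 90.0), 96 "Good" roles out of 100 yield GREEN, while exactly 95 "Good" and 5 "Concerning" roles yield YELLOW because the first comparison is strict.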

logging

Instructs Cloudera Manager where to look for the role log file. By specifying this structure, the CSD can also participate in log collection when a support bundle is requested. In addition, if the loggingType of log4j is set, then a log4j.properties file is generated from user provided configuration and sent to the agent's process directory. The start script can then place the log4j.properties file in the appropriate location where the role process can read it.

Since Cloudera Manager 5.4.0, it is also possible to specify logback as the loggingType. By default, a logback.xml file is generated from user-provided configuration when this logging type is chosen, with support for appending XML snippets to the generated logback configuration file.

Since Cloudera Manager 5.5.0, it is possible to specify glog as the loggingType. This is intended for services that use glog for logging. Environment variables for glog (prefixed with "GLOG_") will be emitted into the role's environment.

Required?: No

Log4j example

{
  "logging" : {
    "dir" : "/var/log/echo",
    "filename" : "webserver.log",
    "modifiable" : true,
    "configName" : "log.dir",
    "loggingType" : "log4j",
    "additionalConfigs" : [
      {
        "key" : "additional.log.key",
        "value" : "additional.log.value"
      }
    ]
  }
}

Logback example

{
  "logging" : {
    "dir" : "/var/log/echo",
    "filename" : "webserver.log",
    "modifiable" : true,
    "configName" : "log.dir",
    "loggingType" : "logback",
    "additionalConfigs" : [
      {
        "key" : "extraLoggerConfig",
        "value" : "<logger name=\"org.apache.commons.beanutils\" level=\"ERROR\"/>"
      }
    ]
  }
}

glog example

{
  "logging" : {
    "dir" : "/var/log/echo",
    "filename" : "webserver.INFO",
    "modifiable" : true,
    "loggingType" : "glog"
  }
}
Key Description Required?
dir The location on disk where the logs are written. This directory gets created automatically by the CM agent. yes
filename The log file name. This can have standard substitutions. yes
configFilename The name of the configuration file. Defaults to log4j.properties (log4j) or logback.xml (logback) no
modifiable Whether the directory should be exposed in CM configuration UI for modification. Default is false. no
configName The name of the configuration key to output when being written to a config file. Default is "log_dir". no
loggingType Enum of "log4j", "logback", "glog", or "other". Defaults to "other". See loggingType. no
additionalConfigs List of ConfigEntry to add near the end of the file. Only works with log4j or logback files. In the case of log4j, these additional configs will go after all regular parameters, but before any Advanced Configuration Snippets. In the case of logback, the "value" field of each of the additional configs will be added to the generated configuration XML. (Not allowed for Gateway roles until CM 5.3.2) no
kerberosPrincipals List of Kerberos Principal Config Entries to add to config file. no
  • New in Cloudera Manager 5.2.0, added additionalConfigs for any non-gateway role. Gateway support added in 5.3.2

loggingType

log4j

  • Cloudera Manager auto-generates the following parameters for the role:
      • Log Threshold
      • Max file size
      • Max backup index size
      • Log4j safety valve
  • A log4j.properties file is generated and deployed to the process directory.

logback (Since: Cloudera Manager 5.4.0)

  • Cloudera Manager auto-generates the following parameters for the role:
      • Log Threshold
      • Max file size
      • Max backup index size
      • Logback XML override
  • A logback.xml file is generated and deployed to the process directory.

glog (Since: Cloudera Manager 5.5.0)

  • Cloudera Manager auto-generates some parameters for a role. Each one will automatically be emitted into the environment. The parameters are listed below with their corresponding environment variable names in parentheses:
      • Log directory (GLOG_log_dir)
      • Minimum log level (GLOG_minloglevel)
      • Maximum log level to buffer (GLOG_logbuflevel)
      • Minimum log verbosity (GLOG_v)
      • Maximum log size (GLOG_max_log_size)
  • When using glog-based logging, the filename field must end in ".INFO".
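
As a small illustration of the glog rules above, a role's start-up code could inspect the emitted variables like this. Both helper names are hypothetical and not part of any Cloudera Manager API; the sketch only filters the environment by the "GLOG_" prefix and checks the filename suffix.

```python
import os


def glog_settings(environ=os.environ):
    """Collect the GLOG_* variables emitted into the role's environment."""
    return {name: value for name, value in environ.items()
            if name.startswith("GLOG_")}


def is_valid_glog_filename(filename):
    """glog-based logging requires the log file name to end in ".INFO"."""
    return filename.endswith(".INFO")
```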

other

  • Cloudera Manager doesn't do anything special for this logging type.

commands

A list of commands that can be run on the role. A command has a few pieces of metadata, but essentially it is a script that gets executed on the host of the role. To be considered successful, the command script must return one of a predefined list of exit codes. Role commands are the building blocks for service commands.

Required?: No

{
  "commands" : [
    {
      "name" : "role_cmd1",
      "label" : "Role Cmd1 Label",
      "description" : "Role Cmd1 Help",
      "expectedExitCodes" : [0, 1, 2],
      "requiredRoleState" : "running",
      "commandRunner" : {
        "program" : "scripts/control.sh",
        "args" : ["cmd1"]
      },
      "internal" : "true"
    }
  ]
}
Key Description Required?
name The internal name of the role command. yes
label The user facing name in the "Actions" dropdown list. yes
description The help text for the command. yes
expectedExitCodes The exit codes from the command that are expected. If the command returns an exit code that is not in this list, the command will be flagged as failed. yes
requiredRoleState The state the role needs to be in for the command to be available. Enum of "stopped" and "running". no
commandRunner The script to execute for this command. The structure is of type Script Runner. yes
internal Whether the command should be marked as internal and hidden in the CM UI. Defaults to false. no

configWriter

Specifies what configuration files should be written out to the process directory when commands get run for this role. This includes the start command and additional role commands.

Required?: No

{
  "configWriter" : "..."
}

The configWriter structure is of type Configuration Writer.

parameters

A list of parameters used to configure the role. See Parameters. Roles inherit all the parameters that are specified in the service.

Required?: No

{
  "parameters" : "..."
}

sslServer

Use when this role acts as an SSL Server. This will automatically generate relevant role parameters and help CM know when SSL is relevant, such as when it should use regular or secure urls.

Required?: No

Java Keystore Format (JKS)

{
 "sslServer" : {
   "keyIdentifier" : "echo_master",
   "enabledConfigName" : "echo.ssl.enabled",
   "keystoreLocationConfigName" : "echo.ssl.keystore.location",
   "keystorePasswordConfigName" : "echo.ssl.keystore.password.script",
   "keystorePasswordCredentialProviderCompatible" : false,
   "keystorePasswordScriptBased" : true,
   "keyPasswordOptionality" : "required",
   "keystoreKeyPasswordConfigName" : "echo.ssl.keystore.key.password.script",
   "keystoreKeyPasswordCredentialProviderCompatible" : false,
   "keystoreKeyPasswordScriptBased" : true
 }
}
Key Description Required?
keystoreFormat The format of the ssl server, either "jks" or "pem". If not specified, defaults to "jks". The rest of this section assumes the JKS format is configured. no
keyIdentifier The alias / identifier for this key in the keystore yes
enabledConfigName Config name to emit when ssl_enabled is used in a config file. If null, ssl_enabled will not be emitted into config files, and can only be used in substitutions like ${ssl_enabled}. no
keystoreLocationConfigName Config name to emit when ssl_server_keystore_location is used in a config file. If null, ssl_server_keystore_location will not be emitted into config files, and can only be used in substitutions like ${ssl_server_keystore_location}. no
keystorePasswordConfigName Config name to emit when ssl_server_keystore_password is used in a config file. If null, ssl_server_keystore_password will not be emitted into config files, and can only be used in substitutions like ${ssl_server_keystore_password}. You must set this in order to use keystorePasswordCredentialProviderCompatible or keystorePasswordScriptBased. no
keystorePasswordCredentialProviderCompatible Defaults to false. Whether ssl_server_keystore_password can use the Credential Provider, a Hadoop mechanism that allows for the encrypting of sensitive items in an encrypted store. Has no effect on substitutions like ${ssl_server_keystore_password}. Requires keystorePasswordConfigName to be set. Mutually exclusive with keystorePasswordScriptBased. no
keystorePasswordScriptBased Defaults to false. If true, the following things happen when used in a configFile (not through substitutions like ${ssl_server_keystore_password}): 1) The regular password for the keystore is no longer emitted. 2) In its place, CM will emit the full path to a script, and that script will echo the value of this desired password to stdout. Requires keystorePasswordConfigName to be set. Mutually exclusive with keystorePasswordCredentialProviderCompatible. Has no effect on substitutions like ${ssl_server_keystore_password}. For this functionality to be useful, your code must run the script in the parameter to get the real password. no
keyPasswordOptionality Whether to expose the ssl_server_keystore_keypassword parameter, and whether it is required or optional. When not specified, no parameter is emitted for ssl_server_keystore_keypassword. See Parameter Optionality no
keystoreKeyPasswordConfigName Config name to emit when ssl_server_keystore_keypassword is used in a config file. If null, ssl_server_keystore_keypassword will not be emitted into config files, and can only be used in substitutions like ${ssl_server_keystore_keypassword}. You must set this in order to use keystoreKeyPasswordCredentialProviderCompatible or keystoreKeyPasswordScriptBased. no
keystoreKeyPasswordCredentialProviderCompatible Defaults to false. Whether ssl_server_keystore_keypassword can use the Credential Provider, a Hadoop mechanism that allows for the encrypting of sensitive items in an encrypted store. Has no effect on substitutions like ${ssl_server_keystore_keypassword}. Requires keystoreKeyPasswordConfigName to be set. Mutually exclusive with keystoreKeyPasswordScriptBased. no
keystoreKeyPasswordScriptBased Defaults to false. If true, the following things happen when used in a configFile (not through substitutions like ${ssl_server_keystore_keypassword}): 1) The regular password for the keystore is no longer emitted. 2) In its place, CM will emit the full path to a script, and that script will echo the value of this desired password to stdout. Requires keystoreKeyPasswordConfigName to be set. Mutually exclusive with keystoreKeyPasswordCredentialProviderCompatible. Has no effect on substitutions like ${ssl_server_keystore_keypassword}. For this functionality to be useful, your code must run the script in the parameter to get the real password. no
  • New in Cloudera Manager 5.2.0, initial SSL Server support
  • New in Cloudera Manager 5.5.0, format type, custom config names, and script-based passwords for generated parameters.
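
When keystorePasswordScriptBased (or keystoreKeyPasswordScriptBased) is true, the role's own code has to run the emitted script to obtain the password. A minimal Python sketch of that step follows; the function name is hypothetical, and treating a single trailing newline as not part of the password is an assumption.

```python
import subprocess


def read_password_from_script(script_path):
    """Run the password script and return what it writes to stdout.

    `script_path` is the value found under the configured config name, for
    example "echo.ssl.keystore.password.script" in the example above.
    Assumption: a single trailing newline added by the script is not part
    of the password.
    """
    result = subprocess.run([script_path], capture_output=True, text=True,
                            check=True)
    output = result.stdout
    return output[:-1] if output.endswith("\n") else output
```

Using check=True makes a failing script raise an exception instead of silently yielding an empty password.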

Automatically generated parameters can be used like any other role parameters, commonly as substitutions in config files via additionalConfigs or in environment variables. If using script-based passwords, then these parameters are commonly emitted directly into a config file.

Users will see these parameters in the normal Configuration page as well as when adding this service.

Parameter name Type Description
ssl_enabled boolean Enable SSL Requests to this role.
ssl_server_keystore_location path Path to the keystore file containing the server certificate and private key used for SSL. Used when this role is acting as an SSL server. Keystore must be in JKS format.
ssl_server_keystore_password password Password for the JKS keystore used when this role is acting as an SSL server.
ssl_server_keystore_keypassword password Password that protects the private key contained in the JKS keystore used when this role is acting as an SSL server.
  • New in Cloudera Manager 5.2.0, initial SSL Server support
  • New in Cloudera Manager 5.5.0, SSL configs appear in wizards.

PEM Certificate Format

{
 "sslServer" : {
   "keystoreFormat" : "pem",
   "enabledConfigName" : "echo.ssl.enabled",
   "privateKeyLocationConfigName" : "echo.ssl.keystore.location",
   "privateKeyPasswordConfigName" : "echo.ssl.keystore.password.script",
   "privateKeyPasswordCredentialProviderCompatible" : false,
   "privateKeyPasswordScriptBased" : true
 }
}
Key Description Required?
keystoreFormat The format of the ssl server, either "jks" or "pem". If not specified, defaults to "jks". The rest of this section assumes this was configured to "pem" no (yes for PEM format)
enabledConfigName Config name to emit when ssl_enabled is used in a config file. If null, ssl_enabled will not be emitted into config files, and can only be used in substitutions like ${ssl_enabled}. no
privateKeyLocationConfigName Config name to emit when ssl_server_privatekey_location is used in a config file. If null, ssl_server_privatekey_location will not be emitted into config files, and can only be used in substitutions like ${ssl_server_privatekey_location}. no
privateKeyPasswordConfigName Config name to emit when ssl_server_privatekey_password is used in a config file. If null, ssl_server_privatekey_password will not be emitted into config files, and can only be used in substitutions like ${ssl_server_privatekey_password}. You must set this in order to use privateKeyPasswordCredentialProviderCompatible or privateKeyPasswordScriptBased. no
privateKeyPasswordCredentialProviderCompatible Defaults to false. Whether ssl_server_privatekey_password can use the Credential Provider, a Hadoop mechanism that allows for the encrypting of sensitive items in an encrypted store. Has no effect on substitutions like ${ssl_server_privatekey_password}. Requires privateKeyPasswordConfigName to be set. Mutually exclusive with privateKeyPasswordScriptBased. no
privateKeyPasswordScriptBased Defaults to false. If true, the following things happen when used in a configFile (not through substitutions like ${ssl_server_privatekey_password}): 1) The regular password for the private key is no longer emitted. 2) In its place, CM will emit the full path to a script, and that script will echo the value of this desired password to stdout. Requires privateKeyPasswordConfigName to be set. Mutually exclusive with privateKeyPasswordCredentialProviderCompatible. Has no effect on substitutions like ${ssl_server_privatekey_password}. For this functionality to be useful, your code must run the script in the parameter to get the real password. no
certificateLocationConfigName Optional. Config name to emit when ssl_server_certificate_location is used in a config file. If null, ssl_server_certificate_location will not be emitted into config files, and can only be used in substitutions like ${ssl_server_certificate_location}. no
certificateLocationDefault Optional. Default value for ssl_server_certificate_location. no
caCertificateLocationConfigName Optional. Config name to emit when ssl_server_ca_certificate_location is used in a config file. If null, ssl_server_ca_certificate_location will not be emitted into config files, and can only be used in substitutions like ${ssl_server_ca_certificate_location}. no
caCertificateLocationDefault Optional. Default value for ssl_server_ca_certificate_location. no
  • New in Cloudera Manager 5.5.0, PEM format SSL Server support

Automatically generated parameters can be used like any other role parameters, commonly as substitutions in config files via additionalConfigs or in environment variables. If using script-based passwords, then these parameters are commonly emitted directly into a config file.

Users will see these parameters in the normal Configuration page as well as when adding this service.

Parameter name Type Description
ssl_enabled boolean Enable SSL Requests to this role.
ssl_server_keystore_location path The path to the TLS/SSL file containing the server certificate and private key used for TLS/SSL. Used when this role is acting as a TLS/SSL server. The certificate file must be in PEM format. This file can be created by concatenating the certificate.pem file with the private key.pem file.
ssl_server_keystore_password password The password for the private key in this role's TLS/SSL Server Certificate and Private Key file. If left blank, this indicates that the private key is not protected by a password.
  • New in Cloudera Manager 5.5.0, PEM format SSL Server support

sslClient

If this role acts as an SSL Client of some SSL Server, use sslClient. This will automatically generate role parameters related to SSL client configuration.

Required?: No

Java Truststore Format (JKS)

{
  "sslClient" : {
    "truststoreLocationConfigName" : "echo.ssl.truststore.location",
    "truststorePasswordConfigName" : "echo.ssl.truststore.password.script",
    "truststorePasswordCredentialProviderCompatible" : false,
    "truststorePasswordScriptBased" : true 
  }
}
Key Description Required?
truststoreFormat The format of the ssl truststore, either "jks" or "pem". If not specified, defaults to "jks". no
truststoreLocationConfigName Config name to emit when ssl_client_truststore_location is used in a config file. If null, ssl_client_truststore_location will not be emitted into config files, and can only be used in substitutions like ${ssl_client_truststore_location}. no
truststorePasswordConfigName Config name to emit when ssl_client_truststore_password is used in a config file. If null, ssl_client_truststore_password will not be emitted into config files, and can only be used in substitutions like ${ssl_client_truststore_password}. You must set this in order to use truststorePasswordCredentialProviderCompatible or truststorePasswordScriptBased. no
truststorePasswordCredentialProviderCompatible Defaults to false. Whether ssl_client_truststore_password can use the Credential Provider, a Hadoop mechanism that allows for the encrypting of sensitive items in an encrypted store. Has no effect on substitutions like ${ssl_client_truststore_password}. Requires truststorePasswordConfigName to be set. Mutually exclusive with truststorePasswordScriptBased. no
truststorePasswordScriptBased Defaults to false. If true, the following things happen when used in a configFile (not through substitutions like ${ssl_client_truststore_password}): 1) The regular password for the truststore is no longer emitted. 2) In its place, CM will emit the full path to a script, and that script will echo the value of this desired password to stdout. Requires truststorePasswordConfigName to be set. Mutually exclusive with truststorePasswordCredentialProviderCompatible. Has no effect on substitutions like ${ssl_client_truststore_password}. For this functionality to be useful, your code must run the script in the parameter to get the real password. no
  • New in Cloudera Manager 5.2.0, basic SSL Client support
  • New in Cloudera Manager 5.5.0, custom config names and script-based truststore password

Automatically generated parameters can be used like any other role parameters, commonly as substitutions in config files via additionalConfigs or in environment variables. If using script-based passwords, then these parameters are commonly emitted directly into a config file.

Users will see these parameters in the normal Configuration page as well as when adding this service.

Parameter name Type Description
ssl_client_truststore_location path Path to the client truststore file used for SSL. Used when this role is acting as an SSL client. Truststore must be in JKS format.
ssl_client_truststore_password password Password for the JKS truststore file used when this role is acting as an SSL client.
  • New in Cloudera Manager 5.2.0, initial SSL Client support
  • New in Cloudera Manager 5.5.0, SSL configs appear in wizards.

PEM Certificate(s) format

{
  "sslClient" : {
    "truststoreFormat" : "pem",
    "truststoreLocationConfigName" : "echo.ssl.truststore.location"
  }
}
Key Description Required?
truststoreFormat The format of the ssl truststore, either "jks" or "pem". If not specified, defaults to "jks". The rest of this section assumes "pem" was configured. no (yes for PEM format)
truststoreLocationConfigName Config name to emit when ssl_client_truststore_location is used in a config file. If null, ssl_client_truststore_location will not be emitted into config files, and can only be used in substitutions like ${ssl_client_truststore_location}. no
  • New in Cloudera Manager 5.5.0, PEM format for sslClient

Automatically generated parameters can be used like any other role parameters, commonly as substitutions in config files via additionalConfigs or in environment variables. If using script-based passwords, then these parameters are commonly emitted directly into a config file.

Users will see these parameters in the normal Configuration page as well as when adding this service.

Parameter name Type Description
ssl_client_truststore_location path The location on disk of the trust store, in .pem format, used to confirm the authenticity of TLS/SSL servers that this role might connect to. This is used when this role is the client in a TLS/SSL connection. This trust store must contain the certificate(s) used to sign the service(s) being connected to. If this parameter is not provided, the default list of well-known certificate authorities is used instead.
  • New in Cloudera Manager 5.5.0, PEM format for sslClient

uniqueIdParameters

Can be used to indicate which String parameters are unique identifiers for the role. If specified, Cloudera Manager will initialize the parameters to a unique value at role creation.

  • New in Cloudera Manager 5.2.0

kerberosPrincipals

List of Kerberos principals used by the role. If this is specified, a keytab file containing all the principals will be added to the role's configuration whenever the role is started or a command is run on the role. The name of the keytab file is serviceType.toLowerCase() + ".keytab". Kerberos principals are also added to the environment of the role's processes, where the variable name is the name of the Kerberos principal and the value is the principal itself.
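
The naming rules above can be restated as a short Python sketch. The helper names, the ECHO service type and the ECHO_PRINCIPAL variable used in the usage note are hypothetical and serve only as illustration.

```python
def keytab_filename(service_type):
    """Name of the keytab file added to the role's configuration."""
    return service_type.lower() + ".keytab"


def principal_from_environment(environ, principal_name):
    """Look up a principal by the name given in kerberosPrincipals.

    The environment variable name is the name of the Kerberos principal and
    its value is the principal itself.
    """
    return environ[principal_name]
```

For a service of type ECHO, keytab_filename("ECHO") gives "echo.keytab". A principal declared under the hypothetical name ECHO_PRINCIPAL would be read from the process environment with principal_from_environment(os.environ, "ECHO_PRINCIPAL").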

cgroup

Describes if and how cgroup parameters belonging to roles of this type should be automatically configured during static service pool configuration. In general, this section is only relevant to roles that:

  1. Are present in large numbers across many hosts in the cluster, and
  2. Consume a non-trivial amount of resources.

For example, the Datanode is the only role within HDFS with automatic configuration for cgroups, because it's the only resource-consuming, multi-instance role.

More information about Linux cgroups, how they're used to implement static service pools, and how Cloudera Manager configures them can be found here.

Required?: No

{
  "cgroup" : {
    "cpu" : {
      "autoConfigured" : true
    },
    "memory" : {
      "autoConfigured" : true,
      "autoConfiguredMin" : 1073741824
    },
    "blkio" : {
      "autoConfigured" : true
    }
  }
}
Key Description Required?
cpu The cpu cgroup subsystem. Includes the cpu.shares resource control. no
memory The memory cgroup subsystem. Includes the memory.limit_in_bytes resource control. no
blkio The blkio cgroup subsystem. Includes the blkio.weight resource control. no
autoConfigured Whether the resource controls of this cgroup subsystem should be automatically configured by CM when setting up static service pools. If false, these controls will be left at their defaults. yes
autoConfiguredMin For memory.limit_in_bytes, the absolute minimum memory amount (in bytes) that a role can be given. If unset, defaults to 0. no

jvmBased

Indicates whether the process associated with a role is a Java or other JVM-based process.

Required?: No. Defaults to false if omitted.

{
  "roles" : [
    {
      "name" : "MY_JAVA_ROLE",
      "jvmBased" : "true"
    }
  ]
}

Setting jvmBased to true enables a number of features for the role in Cloudera Manager:

Out of memory handling

  • This feature allows the user to configure the JVM's behavior when an OutOfMemoryError occurs.
  • When jvmBased is true, Cloudera Manager auto-generates the following parameters for the role:
      • Dump Heap When Out of Memory
      • Heap Dump Directory
      • Kill When Out of Memory
  • When the Dump Heap When Out of Memory parameter is checked, Cloudera Manager monitors the amount of free space available on the filesystem hosting the heap dump directory, and incorporates this information into the health checks performed for the role.

Periodic Stacks collection

  • Periodic stacks collection allows the user to enable and configure the periodic collection of thread stack traces in Cloudera Manager.
  • When jvmBased is true, Cloudera Manager auto-generates the following parameters for the role:
      • Stacks Collection Enabled
      • Stacks Collection Directory
      • Stacks Collection Frequency
      • Stacks Collection Data Retention
      • Stacks Collection Method

JVM-related role commands

  • Cloudera Manager defines the following commands for the role when jvmBased is true:
      • Collect Stack Traces (jstack)
      • Heap Dump (jmap)
      • Heap Histogram (jmap -histo)
  • These commands are accessible via the Actions menu on the role instance page, as well as through the Cloudera Manager API.

Changes required to CSD control script

When jvmBased is true, a new environment variable, CSD_JAVA_OPTS, is defined in the environment of the role's process. This variable contains options that must be passed when starting up the JVM for the role. In order for the features described above to work, the CSD control script must be modified to include the value of CSD_JAVA_OPTS on the command line that launches the JVM.

Typically, you can add CSD_JAVA_OPTS to an existing variable that defines JVM options. For example, the Spark CSD control script defines a variable called SPARK_DAEMON_JAVA_OPTS. The control script includes the following code to add CSD_JAVA_OPTS to these options:

  export SPARK_DAEMON_JAVA_OPTS="$CSD_JAVA_OPTS $SPARK_DAEMON_JAVA_OPTS"
  • jvmBased new in Cloudera Manager 5.7.0
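
If the role is launched from Python rather than from a shell control script, the same requirement applies: the contents of CSD_JAVA_OPTS must reach the JVM command line. The following is a sketch under that assumption; the function name and the main class used in the usage note are hypothetical.

```python
import os
import shlex


def build_jvm_command(main_class, environ=os.environ, java="java"):
    """Build the argument list that launches the role's JVM.

    CSD_JAVA_OPTS is split with shell-like rules and placed before the main
    class so that the JVM treats its contents as options.
    """
    csd_java_opts = shlex.split(environ.get("CSD_JAVA_OPTS", ""))
    return [java, *csd_java_opts, main_class]
```

A launcher could then pass the result of build_jvm_command("com.example.EchoServer") to subprocess.run or os.execvp. Splitting with shlex keeps quoted option values together, as a shell would.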

Gateway

The gateway structure is used to describe the client configuration of the service. Client configuration can be deployed from the service "Actions" menu. Once the "Deploy Client Configuration" command is run, the following steps occur:

  1. Cloudera Manager sends the configuration files specified in the gateway config writer to each gateway role host.
  2. If a scriptRunner exists, it is executed. This gives the CSD a hook to modify the client configuration before it is deployed.
  3. The agent expects client configurations to exist in a subdirectory of the process directory with the same name as the alternatives name. For example: /var/run/cloudera-scm-agent/process/111-deploy-client-config/echo-conf.
  4. After the scriptRunner is run (or not if there isn't one), the agent will copy the client config subdirectory to the alternatives linkRoot and create the system alternatives.
{
  "gateway" : {
    "alternatives" : {
      "name" : "echo-conf",
      "linkRoot" : "/etc/echo",
      "priority" : 50
    },
    "scriptRunner" : {
      "program" : "scripts/cc.sh",
      "args" : ["deploy"]
    },
    "sslClient" : "...",
    "parameters" : "...",
    "configWriter" : "...",
    "logging" : "..."
  }
}

alternatives

Describes how to install the deployed files into alternatives for the client configuration.

Required?: Yes

{
  "alternatives" : {
    "name" : "echo-conf",
    "linkRoot" : "/etc/echo",
    "priority" : 50
  }
}
| Key | Description | Required? |
| --- | --- | --- |
| name | The logical name for the link group in alternatives. It will also serve as the subdirectory name within the process directory for all the generated configuration files. | yes |
| linkRoot | The symbolic link to be used by clients that internally points to the alternatives managed locations. The files will be deployed to a subdirectory called conf. For example, if link root is /etc/service, the complete link would be /etc/service/conf. | yes |
| priority | Default priority when installed into alternatives. The value can be changed later via the CM configuration UI. Default is 0. | no |

scriptRunner

A script to run under the agent's process directory alongside the files generated by configWriter. The generated files will be in the script's current working directory. The environment variable $CONF will be available for the root directory of the deploy process. If environment variables are defined for the script, it is highly recommended that they be namespaced with a unique prefix to avoid conflict with the other environment variables that CM injects.

Required?: No

{
  "scriptRunner" : {
    "program" : "scripts/cc.sh",
     "args" : ["deploy"]
  }
}

The scriptRunner structure is of type Script Runner.

sslClient (Since Cloudera Manager 7.0.0)

Required?: No

This has the same functionality as the sslClient section in a role.

configWriter

Specifies what configuration files should be written out to the process directory when the "Deploy Client Configuration" command is run.

Required?: yes

{
  "configWriter" : "..."
}

The configWriter structure is of type Configuration Writer.

parameters

A list of parameters used to configure the client configuration. See Parameters.

Required?: No

{
  "parameters" : "..."
}

logging (Since: Cloudera Manager 5.4.0)

Instructs Cloudera Manager where to look for the gateway role log file. If logging is configured for a gateway, Cloudera Manager generates a log4j properties file or a logback XML configuration file, as it does for regular roles.

Generated gateway logging configuration files currently are limited to logging to console, but this behavior can be overridden using the supported Log4J safety valve or the Logback XML override configuration.

Required?: No

Log4j example

{
  "logging" : {
    "configFilename" : "log4j-1.properties",
    "loggingType" : "log4j",
    "additionalConfigs" : [
      {
        "key" : "additional.log.key",
        "value" : "additional.log.value"
      }
    ]
  }
}

Logback example

{
  "logging" : {
    "configFilename" : "logback-test.xml",
    "loggingType" : "logback",
    "additionalConfigs" : [
      {
        "key" : "extraLoggerConfig",
        "value" : "<logger name=\"org.apache.commons.beanutils\" level=\"ERROR\"/>"
      }
    ]
  }
}
| Key | Description | Required? |
| --- | --- | --- |
| configFilename | The name of the configuration file. Defaults to log4j.properties (log4j) or logback.xml (logback). | no |
| loggingType | Enum of "log4j", "logback" or "other". Defaults to "other". See loggingType. | no |
| additionalConfigs | List of ConfigEntry to add near the end of the file. Only works with log4j or logback files. In the case of log4j, these additional configs will go after all regular parameters, but before any Advanced Configuration Snippets. In the case of logback, the "value" field of each of the additional configs will be added to the generated configuration XML. | no |

loggingType

log4j

  • Cloudera Manager auto-generates the following parameters for the role:
      • Log Threshold
      • Log4j safety valve
  • A log4j.properties file is generated and deployed to the process directory.

logback

  • Cloudera Manager auto-generates the following parameters for the role:
      • Log Threshold
      • Logback XML override
  • A logback.xml file is generated and deployed to the process directory.

other

  • Cloudera Manager doesn't do anything special for this logging type.

Configuration Writers

Configuration writers provide a way to create custom configuration files for processes/commands controlled by the service. Configuration writers can be associated with a role or a gateway. Cloudera Manager bundles all the configuration files together and sends them to the agent via the heartbeat; the agent then writes them to the process directory.

{
  "configWriter" : {
    "generators" : [
      {
        "filename" : "sample_xml_file.xml",
        "configFormat" : "hadoop_xml",
        "excludedParams" : ["service_var1", "role_var3"],
        "includedParams" : ["service_var1", "role_var3"]
      }
    ],
    "peerConfigGenerators" : [
       {
         "filename" : "sample_role_peer_file.properties",
         "params" : ["service_var1", "role_var3"],
         "roleName" : "ECHO_MASTER_SERVER"
        }
     ],
     "auxConfigGenerators" : [
       {
         "filename" : "sample_aux_file.properties",
         "sourceFilename" : "aux/some_aux_file.properties"
       }
     ]
  }
}

There are three types of configWriters: generators, peerConfigGenerators, and auxConfigGenerators.

Generators

A generator allows the CSD author to write out all or a subset of the parameters in a file automatically. The advantage is that when a new parameter is added to the service descriptor, it will automatically be written out in the format specified. The disadvantage is that there is a limited number of supported formats: hadoop_xml, java properties, gflags, and jinja. Generated configuration might also be useful for passing a large number of parameters to the command scripts. In addition, every generator will have a safety valve that shows up in the CM configuration UI.

{
  "generators" : [
    {
      "filename" : "sample_xml_file.xml",
      "refreshable" : false,
      "configFormat" : "hadoop_xml",
      "excludedParams" : ["service_var1", "role_var3"],
      "includedParams" : ["service_var1", "role_var3"],
      "additionalConfigs" : [
        {
          "key" : "additional.config.key",
          "value" : "additional.config.value"
        }
      ]
    }
  ]
}
| Key | Description | Required? |
| --- | --- | --- |
| filename | The configuration filename. | yes |
| refreshable | Whether this file can be "refreshed", which means it can be replaced without stopping the role. No control script is run during this refresh. In order for this to be useful, the underlying role must be able to re-read the file automatically. Defaults to false. | no |
| configFormat | The format of the configuration file. Enum of "hadoop_xml", "properties", "gflags", or "jinja". See configFormat. | yes |
| template | The jinja template file, which must be placed within the aux directory. Used only with the "jinja" configFormat. | yes (when configFormat is "jinja") |
| includedParams | A list of all the parameters, by name, to include in the configuration file. By default all parameters are included. | no |
| excludedParams | A list of all the parameters, by name, to exclude. This list takes precedence over the include list. By default no parameters are excluded. | no |
| additionalConfigs | List of ConfigEntry to add near the end of the file. Only works with log4j files. These additional configs will go after all regular parameters, but before any Advanced Configuration Snippets. | no |
  • New in Cloudera Manager 5.2.0, added additionalConfigs for any non-gateway role. Gateway support added in 5.3.2
  • New in Cloudera Manager 5.5.0, added refreshable and support for gflags configFormat

configFormat

Below are examples of the format. If a service needs to do some complex munging of configuration variables or needs a different configuration format, that work can be done in the command script.

hadoop_xml

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://nightly-1.ent.cloudera.com:8020</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>simple</value>
  </property>
</configuration>

properties

fs.default.name=hdfs://nightly-1.ent.cloudera.com:8020
hadoop.security.authentication=simple

gflags

--enable_foo
--num_roles=3
--service_name=my_service

jinja

The result can be anything, depending on the jinja2 template. The globals variable (a map/dict) contains the config parameters. For example, the value of a parameter whose "configName" is config.key can be referenced as globals['config.key'].

The following template would generate key value pairs in properties format.

aux/templates/properties.j2 template content:

{% for key, value in globals | dictsort() -%}
{{ key }} = {{ value }}
{% endfor -%}

The generator config for it:

{
  "generators" : [
    {
      "filename" : "all.properties",
      "configFormat" : "jinja",
      "template" : "aux/templates/properties.j2"
    }
  ]
}

Note: nested interpretation is disabled. For example, variables whose values contain jinja expressions are not interpreted again; they are emitted as is.

Peer Configuration Generators

A peer configuration generator is used to distribute a properties file that contains a list of all the hostnames that share the same roletype plus any parameters needed from each host. This is a general solution for distributing a hostlist and port number for each role of the same roletype. The format generated is a java properties file of the following format: <hostname>:<parameter config name>=<value>

{
  "peerConfigGenerators" : [
    {
      "filename" : "hostlist.properties",
      "refreshable" : false,
      "params" : ["dataNodeDataDir", "webPortAddress"],
      "roleName" : "ECHO_MASTER_SERVER"
     }
  ]
}
| Key | Description | Required? |
| --- | --- | --- |
| filename | The configuration filename. | yes |
| refreshable | Whether this file can be "refreshed", which means it can be replaced without stopping the role. No control script is run during this refresh. In order for this to be useful, the underlying role must be able to re-read the file automatically. Defaults to false. | no |
| params | A list of parameters to include for each hostname. | yes |
| roleName | If specified, the role type given by roleName is used instead of the current role type. This allows, for example, a Datanode to get parameters from a Namenode. Only role types defined within the service can be used here. For example, the HDFS service cannot reference the Regionserver role type. By default, the current role is used. | no |
  • New in Cloudera Manager 5.5.0, added refreshable

Example of hostlist.properties

hostname1.mycompany.com:dfs.datanode.data.dir=/foo/bar
hostname1.mycompany.com:webPortAddress=6060
hostname2.mycompany.com:dfs.datanode.data.dir=/foo/bar/foo
hostname2.mycompany.com:webPortAddress=6060
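
As a rough sketch of how a control script might consume such a file, the function below collects one parameter across all hosts into a comma-separated list. It assumes the file looks exactly like the example above (unescaped colons, one entry per line); the function name peer_values is hypothetical and not part of the CSD framework.

```shell
#!/bin/bash
# Sketch: build a comma-separated "host:value" list for one parameter from a
# peer config file whose lines look like <hostname>:<config name>=<value>.
# $1: path to the peer file, $2: parameter config name to extract
peer_values() {
  awk -v key="$2" '
    {
      sep = index($0, ":")              # first colon ends the hostname
      if (sep == 0) next
      host = substr($0, 1, sep - 1)
      rest = substr($0, sep + 1)
      eq = index(rest, "=")             # first "=" ends the config name
      if (eq == 0) next
      if (substr(rest, 1, eq - 1) == key) {
        if (out != "") out = out ","
        out = out host ":" substr(rest, eq + 1)
      }
    }
    END { print out }
  ' "$1"
}
```

A control script could then call, for example, peer_values hostlist.properties webPortAddress to obtain a host:port list for all peers.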

Aux Config Generators

An aux config generator allows the CSD author to copy a static configuration file to a different location in the process directory. In addition, a safety valve will be created for it in Cloudera Manager. This generator is useful if the service has a config file that is in a format not natively supported by the configGenerators - for example, a config.yml YAML config file or a service_env.sh environment script. One possible way to deal with this is:

  1. Add the config.yml file template to the aux/ directory. When the service starts, all of the contents of the aux directory are copied to the agent process directory.
  2. Create an auxConfigGenerators structure for the config.yml and specify where the file should be copied in the process directory.
  3. The start script can perform the necessary variable substitutions before starting the service.

The benefit is that a safety valve will show up in Cloudera Manager for the config.yml file. An administrator can use this safety valve to add additional configuration that may not have been modeled in the service.sdl. This behavior exists for all of the first party services in Cloudera Manager. The contents, if any, of the safety valve will simply be appended to the configuration file.

{
  "auxConfigGenerators" : [
    {
      "filename" : "sample_aux_file.properties",
      "sourceFilename" : "aux/some_aux_file.properties"
    }
  ]
}
| Key | Description | Required? |
| --- | --- | --- |
| filename | The configuration filename. | yes |
| sourceFilename | The configuration file to copy over to filename. By default, if there is no sourceFilename specified, an empty file is created. The contents of the safety valves will be appended to the configuration file. | no |
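
The variable substitution mentioned in step 3 above could be sketched as follows. This is only an illustration: the {{LISTEN_PORT}} placeholder token and the function name render_config are hypothetical, and the value is assumed to contain no characters that are special to sed (a port number, for instance).

```shell
#!/bin/bash
# Sketch: fill a placeholder in an aux-generated file before the service starts.
# The {{LISTEN_PORT}} token and the function name are hypothetical.
# $1: path to the config file in the process directory, $2: value to insert
render_config() {
  file="$1"
  value="$2"
  tmp="$file.tmp"
  # Write to a temporary file first so a failed sed never truncates the config.
  sed "s/{{LISTEN_PORT}}/$value/g" "$file" > "$tmp" && mv "$tmp" "$file"
}
```

Lines appended from the safety valve are left untouched unless they contain the same placeholder.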

CsdConfigEntryType

The type of ConfigEntry. The possible values are:

simple

In this case, the config value is evaluated using standard substitutions.

auth_to_local

In this case, the config value is computed from the Auth-to-Local rules configuration properties of the DFS service that is a dependency/dependent of the service. The value field is ignored in this case.

  • New in Cloudera Manager 5.5.0

ConfigEntry

A ConfigEntry describes custom entries in a config file. Both key and value support standard substitutions.

{
  "key" : "additional.config.key",
  "value" : "additional.config.value",
  "type" : "config.entry.type"
}

key

Property name to be used in the config file.

value

Value to be emitted in the config file for the given property. This field is ignored if type is auth_to_local.

type

Type of the config entry. Default is simple if none is specified.

Required?: No

  • type is new in Cloudera Manager 5.5.0

Kerberos Principal Config Entry

A Kerberos principal config entry describes an entry in a config file that references a kerberos principal.

{
  "external" : "false",
  "peerRoleType" : "PEER_ROLE_TYPE",
  "principalName" : "kerberos_principal_name",
  "propertyName" : "kerberos.principal.property.name",
  "instanceWildcard" : "_HOST"
}

external

Should be set to true if the principal refers to an external principal of the service.

peerRoleType

If the principal belongs to a peer role type, then this field should be used to specify that role type. If this is set, the principal from an arbitrary role of that role type is used. If both external and peerRoleType are specified, external takes precedence.

principalName

Name of the principal (as specified in principalName) to be emitted in the config file.

propertyName

Property name to be used while emitting the principal in config file.

instanceWildcard

Optional wildcard string that will be used to replace the instance part of the principal while emitting it in a config file. E.g. hdfs/${host}@REALM will be emitted as hdfs/_HOST@REALM if instance wildcard is _HOST.

  • New in Cloudera Manager 5.2.0

Parameter Optionality

Determines whether a particular parameter should be exposed and/or required:

  • NOT_EXPOSED - do not expose this parameter
  • OPTIONAL - expose this parameter as an optional parameter
  • REQUIRED - expose this parameter as a required parameter

Script Runner

The script runner structure provides all the information needed to run a script. When a command is run, Cloudera Manager deploys the entire scripts directory present in the CSD to the agent process directory.

Note: CSDs won't install third-party dependencies for scripts. If a script requires a third-party dependency (for example a python module), the CSD author needs to verify that the dependency is included as part of the system package or parcel.

{
  "program" : "scripts/control.sh",
  "args" : [ "cmd" ],
  "environmentVariables" : {
    "ENV_VAR1" : "${parameter_name}",
    "ENV_VAR2" : "23"
  }
}
| Key | Description | Required? |
| --- | --- | --- |
| program | The path to the script to run, relative to the process directory home. | yes |
| args | A list of arguments to pass to the program. These can have standard substitutions. | no |
| environmentVariables | A map of environment variables set before the script is run. These can have standard substitutions. | no |
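
For illustration, a scripts/control.sh matching the example above might dispatch on its first argument roughly as sketched below. The function name control_main is hypothetical, and the echo line merely stands in for launching the real service binary.

```shell
#!/bin/bash
# Sketch of scripts/control.sh for "args" : [ "cmd" ] in the example above.
# The echo output is a placeholder for starting the real service binary.
control_main() {
  cmd="$1"
  case "$cmd" in
    cmd)
      # ENV_VAR1 and ENV_VAR2 come from "environmentVariables" in the SDL.
      echo "running cmd with ENV_VAR1=${ENV_VAR1:-unset} ENV_VAR2=${ENV_VAR2:-unset}"
      ;;
    *)
      echo "Don't understand [$cmd]" >&2
      return 127
      ;;
  esac
}

# A real script would end with: control_main "$@"
```

Keeping the dispatch in one function makes it easy to add further commands later without changing how the SDL invokes the script.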

Scripts vs. Binaries

The scripts that will be packaged with the CSD are just glue between the agent and the real program binaries/scripts that ship as part of the standard program deliverable, whether that is packages, parcels, or tarballs. Although the service.sdl runner can just call the shipped binary/script directly using arguments and environment variables, most likely some adapting is necessary. It is the responsibility of the script bundled with the CSD to do this work before the real service binary can run.


Parameters

Both the service and roles have parameters defined. These parameters show up in the "Configuration" pages of Cloudera Manager. They can be consumed via configWriters or used in substitutions. Each parameter is typed and depending on the type there might be more required fields to be set. See parameter types.

{
  "name" : "server_timeout",
  "label" : "Server Timeout",
  "description" : "The Server Timeout",
  "configName" : "server.timeout",
  "required" : true,
  "configurableInWizard" : true,
  "default" : 20,
  "invalidValues": [0],
  "type" : "long", 
  "unit" : "seconds"
}
| Key | Description | Required? |
| --- | --- | --- |
| name | The logical name for the parameter. This is the name that will be referenced either during variable substitution or config generation. Must be unique within a service. The convention is to use only lowercase letters separated by underscores. | yes |
| label | The user friendly name. | yes |
| description | The description shown in the "Configuration" page. | yes |
| configName | The name of the key that will be written to a config file. By default the name of the parameter is used. | no |
| required | True if this parameter is required to be set. Defaults to false. | no |
| default | The default value of the parameter. | no |
| configurableInWizard | True if the configuration option should be configurable in the wizard before the service is started. This should be used sparingly: only for parameters whose values cannot be known a priori and are very difficult to change after the fact. Defaults to false. | no |
| sensitive | True if this parameter holds sensitive information. Defaults to false. | no |
| type | The type of the parameter. See parameter types. | yes |
  • New in Cloudera Manager 5.1.0, added "sensitive" field

type

Some types may require additional keys in the parameter structure.

boolean

double

| Key | Description | Required? |
| --- | --- | --- |
| softMin | Recommended minimum double value. By default there is no recommended minimum. | no |
| softMax | Recommended maximum double value. By default there is no recommended maximum. | no |
| min | Absolute minimum double value. By default there is no minimum. | no |
| max | Absolute maximum double value. By default there is no maximum. | no |
| unit | Unit of the value. See units. | no |
| invalidValues | A list of invalid values. If this parameter is set to one of these invalid values, it will cause a validation error. | no |
  • New in Cloudera Manager 6.0.0, added "invalidValues" field

long

| Key | Description | Required? |
| --- | --- | --- |
| softMin | Recommended minimum long value. By default there is no recommended minimum. | no |
| softMax | Recommended maximum long value. By default there is no recommended maximum. | no |
| min | Absolute minimum long value. By default there is no minimum. | no |
| max | Absolute maximum long value. By default there is no maximum. | no |
| unit | Unit of the value. See units. | no |
| invalidValues | A list of invalid values. If this parameter is set to one of these invalid values, it will cause a validation error. | no |
  • New in Cloudera Manager 6.0.0, added "invalidValues" field

memory

| Key | Description | Required? |
| --- | --- | --- |
| softMin | Recommended minimum memory value. By default there is no recommended minimum. | no |
| softMax | Recommended maximum memory value. By default there is no recommended maximum. | no |
| min | Absolute minimum long value. By default there is no minimum. | no |
| max | Absolute maximum long value. By default there is no maximum. | no |
| unit | Unit of the value. See units. Must be a byte quantity. | yes |
| scaleFactor | Factor used in memory consumption calculation to account for any inherent overhead in the memory quantity. Defaults to 1.0. See the resource management page (Resource-management-support-for-csds#cooperative-memory-limits) for more details. | no |
| autoConfigShare | Dictates the percentage of the role's overall memory allotment that should be set aside for this memory quantity during autoconfiguration for resource management. If unset, parameter is not autoconfigured for RM. See the resource management page (Resource-management-support-for-csds#cooperative-memory-limits) for more details. | no |
| invalidValues | A list of invalid values. If this parameter is set to one of these invalid values, it will cause a validation error. | no |
  • New in Cloudera Manager 6.0.0, added "invalidValues" field

port

| Key | Description | Required? |
| --- | --- | --- |
| softMin | Recommended minimum port value. By default there is no recommended minimum. | no |
| softMax | Recommended maximum port value. By default there is no recommended maximum. | no |
| min | Absolute minimum port value. By default there is no minimum. | no |
| max | Absolute maximum port value. By default there is no maximum. | no |
| outbound | True if the port is outbound. By default is false. | no |
| zeroAllowed | True if 0 can be specified. By default is false. | no |
| negativeOneAllowed | True if -1 can be specified. By default is false. | no |
| invalidValues | A list of invalid values. If this parameter is set to one of these invalid values, it will cause a validation error. | no |
  • New in Cloudera Manager 6.0.0, added "invalidValues" field

string_enum

| Key | Description | Required? |
| --- | --- | --- |
| validValues | An array of valid strings. | yes |

string

| Key | Description | Required? |
| --- | --- | --- |
| conformRegex | A regular expression the string needs to conform to. By default, all strings are valid. | no |
| initType | An Enum of "randomBase64". Initializes the parameter on creation of the owning entity. | no |
  • New in Cloudera Manager 5.4.0, added initType

password

A string type used for passwords so that they are masked upon entry in the Cloudera Manager UI. This type is considered sensitive by default. Note that this does not prevent the password from being displayed in the "Process" UI, the "Config diff" UI, or on disk on the host where the service is running. In order to encrypt the passwords in those locations, please see "credentialProviderCompatible" or "alternateScriptParameterName" below.

| Key | Description | Required? |
| --- | --- | --- |
| conformRegex | A regular expression the string needs to conform to. By default, all strings are valid. | no |
| initType | An Enum of "randomBase64". Initializes the parameter on creation of the owning entity. | no |
| credentialProviderCompatible | Whether this parameter can use the Credential Provider, a Hadoop mechanism that allows for the encrypting of sensitive items in an encrypted store. This is mutually exclusive with alternateScriptParameterName. Has no effect on substitutions like ${parameter_name}, which will always get the raw password. | no |
| alternateScriptParameterName | If specified, the following things happen: 1) The "configName" of this parameter is no longer emitted. 2) In its place, a parameter with the name specified by the "alternateScriptParameterName" is emitted. This parameter contains the full path to a script, and that script will echo the value of the desired password to stdout. For this functionality to be useful, your code must accept the parameter specified in "alternateScriptParameterName" as a parameter that replaces the "configName" and is known to point to the full path of a script that will print the desired password to stdout. This is mutually exclusive with credentialProviderCompatible. Has no effect on substitutions like ${parameter_name}, which will always get the raw password. | no |
  • New in Cloudera Manager 5.1.0
  • New in Cloudera Manager 5.4.0, added initType
  • New in Cloudera Manager 5.5.0, added credentialProviderCompatible and alternateScriptParameterName
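
On the consuming side, reading a password through alternateScriptParameterName amounts to executing the emitted script path and capturing its stdout. A minimal sketch, assuming the path has already been read from the generated configuration; the function name read_password is hypothetical:

```shell
#!/bin/bash
# Sketch: obtain a password by running the script whose path was emitted under
# the alternateScriptParameterName key. The function name is hypothetical.
# $1: full path to the password script
read_password() {
  script="$1"
  if [ ! -x "$script" ]; then
    echo "password script not executable: $script" >&2
    return 1
  fi
  # Run the script; callers capture stdout with $(...), which drops the
  # trailing newline.
  "$script"
}
```

A caller would typically write something like password=$(read_password "$path") and avoid logging the result.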

string_array

| Key | Description | Required? |
| --- | --- | --- |
| separator | When the array is serialized to a string, what separator should be used. By default comma is used. | no |
| minLength | The minimum length the array can be. By default there is no lower bound. | no |
| maxLength | The maximum length the array can be. By default there is no upper bound. | no |

path

| Key | Description | Required? |
| --- | --- | --- |
| conformRegex | A regular expression the path needs to conform to. By default, all paths are valid. | no |
| pathType | An Enum of "localDataDir", "localDataFile" or "serviceSpecific". For "localDataDir", CM creates the path with the mode specified. For "serviceSpecific" and "localDataFile" the path will not be created. | yes |
| mode | The mode of the path. By default it is 0755. | yes |

path_array

| Key | Description | Required? |
| --- | --- | --- |
| separator | When the array is serialized to a string, what separator should be used. By default comma is used. | no |
| minLength | The minimum length the array can be. By default there is no lower bound. | no |
| maxLength | The maximum length the array can be. By default there is no upper bound. | no |
| conformRegex | A regular expression the path needs to conform to. By default, all paths are valid. | no |
| pathType | An Enum of "localDataDir", "localDataFile" or "serviceSpecific". For both "localDataDir" and "localDataFile", CM creates the path with the mode specified. For "serviceSpecific" the path will not be created. | yes |
| mode | The mode of the path. By default it is 0755. | yes |

uri

| Key | Description | Required? |
| --- | --- | --- |
| conformRegex | A regular expression the uri needs to conform to. By default, all uris are valid. | no |
| opaque | True if the uri is opaque. By default it is false. | no |
| allowedSchemas | A list of allowed schemas for this uri. By default all schemas are allowed. | no |

uri_array

| Key | Description | Required? |
| --- | --- | --- |
| separator | When the array is serialized to a string, what separator should be used. By default comma is used. | no |
| minLength | The minimum length the array can be. By default there is no lower bound. | no |
| maxLength | The maximum length the array can be. By default there is no upper bound. | no |
| conformRegex | A regular expression the uri needs to conform to. By default, all uris are valid. | no |
| opaque | True if the uri is opaque. By default it is false. | no |
| allowedSchemas | A list of allowed schemas for this uri. By default all schemas are allowed. | no |

units

The units can be one of:

  • milliseconds
  • seconds
  • minutes
  • hours
  • bytes
  • kilobytes
  • megabytes
  • gigabytes
  • percent
  • pages
  • times
  • lines

Substitutions

For some strings in the service.sdl it is necessary to add substitutions. For example, script runner arguments and environment variables are more useful if instead of passing hardcoded values we can pass in the materialized values of parameters or the host the role is running on. To facilitate this, the SDL supports ant style placeholders: ${<variable_name>}. There are various types of substitutions, each supporting a specific set of variables:

parameter

Any parameter available to the context -- role parameters if applicable and inherited service parameters. This is used in the form: ${<parameter_name>} where the parameter_name is the name (not the configName) of the parameter.

user

The user - ${user}. Note that in single user mode, this evaluates to the user running all the processes in the cluster.

group

The group - ${group}

host

The host - ${host} if available.

principal

The kerberos principal - ${principal}. This evaluates to the kerberos principal of the role in a secure cluster, otherwise has the same value as ${user}.

The value of the placeholder is generated after CM has processed the configuration hierarchy for the role and has produced a flat map of configuration keys and values. Because of this, we do not require scoping for role parameters to account for overrides or config groups. This is also a reason why we need unique parameter names throughout the SDL file.

  • New in Cloudera Manager 5.4.0, added ${principal} placeholder

Versioning and Compatibility

It is important to separate the concept of a CSD's version and its compatibility.

{
 "version" : "1.23",
 "compatibility" : {
   "generation" : 2,
    "cdhVersion" : {
      "min" : 4,
      "max" : 5
    }
  }
}

When we refer to a CSD's version, it is a string that lives in both the service descriptor as version 1.23 and in the filename SPARK-1.23.jar. Apart from requiring the version to match the CSD file name, the infrastructure does not derive any more semantics from the string. The compatibility structure, on the other hand, is used by the CSD infrastructure to verify preconditions before installing the CSD.

Generations

The compatibility section has a generation number used to communicate compatibility between different CSD versions. The generation is scoped to the CSD name and should increase monotonically when there is a breaking change between service descriptors. When Cloudera Manager installs a CSD, it looks at the generation number and considers three situations:

  • The previous and current generation numbers match: CM can safely upgrade the installed CSD.
  • There is no previous generation number: CM can install the CSD since this is the first time it has seen this CSD type.
  • The previous and current generation numbers don't match: CM surfaces an error and doesn't allow the user to upgrade the CSD.

When the previous and current generation numbers don't match, the CSD author is communicating that there are breaking changes to the CSD and the user needs to follow additional upgrade steps provided by the CSD author. This is the most heavyweight way of upgrading a CSD, and will require the user to uninstall the CSD and re-install the new CSD.
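
The three cases above can be summarized in a small decision function. This is purely illustrative and is not Cloudera Manager's actual implementation; the function name generation_action and the returned labels are made up for the sketch.

```shell
#!/bin/bash
# Illustrative only: mirrors the three cases above, not Cloudera Manager's code.
# $1: previously installed generation (empty if none), $2: new generation
generation_action() {
  prev="$1"
  cur="$2"
  if [ -z "$prev" ]; then
    echo "install"    # first time this CSD type is seen
  elif [ "$prev" -eq "$cur" ]; then
    echo "upgrade"    # same generation: safe to upgrade in place
  else
    echo "error"      # breaking change: uninstall, then re-install
  fi
}
```

For a CSD author, the practical consequence is that the generation should be bumped only when an in-place upgrade is genuinely unsafe.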

cdhVersion

cdhVersion is used to restrict the installation of a CSD to a specific range of CDH versions. When the CSD is installed, the cluster CDH version is checked, and if it does not fall within the CSD's compatibility range, the service type is not available for the cluster.

Kerberos Principals

Kerberos principals are used by services for enabling security using kerberos authentication. A kerberos principal is of the form primary/instance@REALM. It can be described as follows:

{
 "name" : "kerberos_principal_name",
 "primary" : "principal_primary",
 "instance" : "principal_instance"
}

name

The name of the principal. This name can be used to refer to the kerberos principal in configuration files or process environment for roles.

primary

First part of the principal. This is a required field.

instance

Optional second part of the principal. If omitted, then the principal will just be primary@REALM. This can refer to a parameter, and in case of a role, to its host. If this is a URI, Cloudera Manager will extract the host name and use that as the instance name.

  • New in Cloudera Manager 5.2.0

RollingRestart

Specifies the workflow for a rolling restart of a service. If a service supports rolling restart, this workflow must restart every daemon role within the service, so all roles of the service must be specified in either worker or non-worker steps. Roles can be restarted either one-by-one if they are non-worker roles, or in batches if they are worker roles.

Both non-worker and worker step descriptors have the following sections:

roleName

Role for which the steps are applied during rolling restart.

bringDownCommands

List of commands to run while bringing the role down during rolling restart. If "Stop" is specified as one of the commands, the regular role stop command is called. If this is not provided, role is simply stopped to bring it down.

bringUpCommands

List of commands to run while bringing the role up during rolling restart. If "Start" is specified as one of the commands, the regular role start command is called. If this is not provided, role is simply started to bring it up.

Note: The commands used for non-worker roles must be role level commands and the commands used for worker roles must be service level commands.

  • New in Cloudera Manager 5.5.0
