CSD Overview
Cloudera Manager (CM) 4.5 introduced parcels, a mechanism for distributing software to a managed cluster. Parcels only distribute software across the cluster; they do not manage processes. Cloudera Manager 5 adds the ability to define your own managed service through Custom Service Descriptors (CSDs). A third-party service that uses a CSD can leverage Cloudera Manager features such as monitoring, resource management, configuration, distribution, and life-cycle management. The service shows up in Cloudera Manager just like any other service, such as HDFS or HBase.
Note: This documentation assumes you are familiar with the basic operating principles of Cloudera Manager.
CSDs are designed with the following goals:
- CSDs can be written by non-programmers using documentation and developer tooling.
- The service descriptor language (SDL) and service monitoring descriptor language (MDL) should be declarative and not require a specialized programming language.
- In Cloudera Manager, a service backed by a CSD should look and feel like a first-party service such as HDFS.
- A baseline of functionality, such as process-level monitoring, is provided to CSDs for free.
- CSDs should work well with parcels but not require them.
  - If you have your own way of laying down bits, you can still use a CSD for configuration and process life-cycle management.
A CSD is tied to one service type in Cloudera Manager and is packaged and distributed as a jar file. The jar is self-contained: it holds all the description and logic needed to manage that service type in CM. For example, the Spark CSD has the following layout:
$ jar -tf SPARK-1.0.jar
descriptor/service.sdl
scripts/control.sh
images/icon.png
More examples including the Spark CSD are available in our git repo.
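Because a jar is a zip archive, the layout shown in the listing can also be inspected programmatically. The sketch below is illustrative only: it builds a throwaway in-memory archive with the same three entries as the Spark listing and checks that the descriptor is present. The helper names (`list_csd_entries`, `has_descriptor`, `build_demo_jar`) are hypothetical and not part of any Cloudera tooling, and treating `descriptor/service.sdl` as the entry to check for is an assumption drawn from this page.

```python
import io
import zipfile

# Assumption: the descriptor path shown in the Spark listing is the entry to check for.
DESCRIPTOR_ENTRY = "descriptor/service.sdl"


def list_csd_entries(jar_bytes):
    """Return the entry names stored in a CSD jar (a jar is a zip archive)."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return jar.namelist()


def has_descriptor(entries):
    """Report whether the listing includes the service descriptor."""
    return DESCRIPTOR_ENTRY in entries


def build_demo_jar():
    """Build an in-memory archive with placeholder content, for demonstration only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        jar.writestr("descriptor/service.sdl", "{}")
        jar.writestr("scripts/control.sh", "#!/bin/bash\n")
        jar.writestr("images/icon.png", b"")
    return buf.getvalue()


if __name__ == "__main__":
    entries = list_csd_entries(build_demo_jar())
    for name in entries:
        print(name)
    print("descriptor present:", has_descriptor(entries))
```

To inspect a real CSD, read the jar file's bytes from disk and pass them to `list_csd_entries` instead of using the demo archive.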
The descriptor/service.sdl file is a JSON file that declaratively describes the service type in Cloudera Manager. CSDs also have a scripts/ directory containing the executables that control how the service is started. See The Structure of a CSD for more details.
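To give a feel for what a declarative descriptor might contain, here is a minimal sketch that assembles one as a Python dictionary and serializes it to JSON. The service name `DEMO`, the role name `DEMO_SERVER`, and the function `build_descriptor` are hypothetical. The field names (`name`, `label`, `roles`, `startRunner`, and so on) are assumptions modeled on published CSD examples and should be checked against the SDL reference before use.

```python
import json


def build_descriptor(service_name, role_name):
    """Build an illustrative service descriptor as a plain dict.

    Field names are assumptions for illustration, not a verified SDL schema.
    """
    return {
        "name": service_name,
        "label": service_name.title(),
        "description": "Illustrative service managed through a CSD",
        "version": "1.0",
        "roles": [
            {
                "name": role_name,
                "label": "Server",
                "pluralLabel": "Servers",
                # Points at the control script shipped in the scripts/ directory.
                "startRunner": {
                    "program": "scripts/control.sh",
                    "args": ["start"],
                },
            }
        ],
    }


if __name__ == "__main__":
    # The serialized output is what would be saved as descriptor/service.sdl.
    print(json.dumps(build_descriptor("DEMO", "DEMO_SERVER"), indent=2))
```

The point of the sketch is that the descriptor is data, not code: it names the service, its roles, and which script in scripts/ to run, and leaves the actual process handling to Cloudera Manager.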
Both CSDs and parcels extend Cloudera Manager, but in different ways. Parcels help distribute software on the cluster. A parcel is essentially a tarball with added metadata, so when it is distributed the Cloudera Manager agent simply unpacks it on the host; there is no mechanism to manage or configure processes. Using only a parcel is a valid choice when there is no configuration or process to manage, such as when distributing a library to the cluster. The LZO plugin for Hadoop is an example, since it only needs to modify the HADOOP_CLASSPATH.
CSDs pick up where parcels leave off. Once the bits are distributed to the cluster, Cloudera Manager uses the CSD to know how to manage the deployed software: starting and stopping it, configuration, resource management, and so on. A CSD is what allows a partner's service to show up in the wizard and status pages.
The most turnkey way to integrate with Cloudera Manager is to use a parcel to distribute the software and a CSD to manage it. In some cases you may prefer an out-of-band software deployment system to lay down the bits on the cluster. A CSD can then still be used, without a parcel, to manage the software.
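The choice between a parcel, a CSD, or both can be summarized as a small decision rule. The function below is only a sketch of the reasoning in this section; `integration_pieces` and its parameter names are hypothetical and do not correspond to anything in Cloudera Manager.

```python
def integration_pieces(needs_process_management, cm_distributes_software):
    """Return which Cloudera Manager extension pieces a product would use.

    needs_process_management: the software has processes or configuration
        that Cloudera Manager should manage (start/stop, config, monitoring).
    cm_distributes_software: Cloudera Manager should lay down the bits,
        as opposed to an out-of-band deployment system.
    """
    pieces = []
    if cm_distributes_software:
        pieces.append("parcel")
    if needs_process_management:
        pieces.append("CSD")
    return pieces


if __name__ == "__main__":
    # A library such as the LZO plugin: distribution only.
    print(integration_pieces(False, True))
    # The turnkey case: distribution plus management.
    print(integration_pieces(True, True))
    # Out-of-band deployment with management through Cloudera Manager.
    print(integration_pieces(True, False))
```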