Galaxy is a lightweight software deployment and management tool. We use it at Ning to manage the Java cores and Apache httpd instances that make up the Ning platform (www.ning.com).
Galaxy has four components:
-
Agent
-
Console
-
Command-line client
-
Gepo
The Galaxy agent, galaxy-agent, is a Ruby process responsible for managing deployments, starting and stopping processes, and registering with the console. One galaxy-agent process runs on each host (physical machine, Solaris zone, Virtual Machine) managed by Galaxy.
By default, it runs as the xncore user.
Each agent can manage a single software deployment at a time. For this reason, Galaxy is commonly used with OS virtualization (e.g., Solaris zones), wherein new OS instances, and thus Galaxy agent instances, can be instantiated with low overhead.
The Galaxy console, galaxy-console, is a Ruby process responsible for receiving announcements from agents, logging state changes, and providing an agent list to the command-line tool.
Typically, each environment (e.g. production, QA, development) has its own console.
The galaxy command-line client is the primary user interface to the Galaxy system. It retrieves a list of agents from the console, and contacts individual agents to do its work. Cores can be assigned (deployed), cleared (undeployed), updated, started, and stopped from the command-line tool.
The user must specify which console to use, either through a command-line argument (-c) or an environment variable (GALAXY_CONSOLE).
Gepo is the Galaxy software depot. It is a web-accessible repository that contains applications binaries and configuration properties.
Typically, the configuration files are stored in a SCM repository (subversion, git, …), and the binary packages are stored in a single flat directory on the file system.
Gepo is not part of the Galaxy code base but rather an external webserver that needs to be setup.
The Gepo SCM server should have two top-level directories:
-
config/: The root of a hierarchical directory structure defining a “config path” that contains configuration files
-
binaries/: A flat directory containing application binary packages of the form <type>-<version>-<build>.tar.gz.
Binaries are stored in the binary branch in Gepo.
Core configuration properties are stored in the config branch in Gepo, usually broken down per environment (each environment should have its own Gonsole):
-
gepo.company.com/config/trunk/prod => Production
-
gepo.company.com/config/trunk/dev => Development
-
etc…
Subdirectories under the config branch are known as “config path”. Each host managed by Galaxy can be assigned to one config path, which specifies which binaries and properties to download.
Here is an example of a config path, which has three components:
<site>/<ver>/<type>
where:
-
<site> specifies the site (datacenter)
-
<ver> specifies the version tag associated with the release
-
<type> specifies the core type abbreviation
The config path may include optional sub-types:
<site>/<ver>/<type>/<sub-type>/...
Configuration properties are built by scanning each node of the config path for properties files and merging them to create one file. Properties defined at deeper levels override properties of the same name defined at higher levels.
For example, consider the following files:
uk_datacenter/xncore.properties:
greatest.band="Led Zeppelin" greatest.drummer="John Bonham" greatest.guitarist="Jimmy Page"
uk_datacenter/6.5.15/xncore.properties:
greatest.band="Primus" greatest.bassist="Les Claypool"
uk_datacenter/6.5.15/apache/xncore.properties:
greatest.band="Dave Matthews Band" greatest.drummer="Carter Beauford"
The configuration file generated from merging these files would contain:
greatest.guitarist="Jimmy Page" greatest.bassist="Les Claypool" greatest.band="Dave Matthews Band" greatest.drummer="Carter Beauford"
Note that this property inheritance/overriding only works on files with names ending in “.properties”. With other files, such as jvm.config or log4j.xml, the file at the deepest level is used as-is.
When a host is assigned a config path, updated to a different config path, updated to the same config path (to force a properties refresh), or cleared, the Galaxy agent begins a new “deployment”.
A deployment is a set of files and directories used to download, manage, and run software releases.
A deployment consists of:
-
A sequence number. All Galaxy agents begin with deployment number 1 and increment for each new deployment. This sequence number is stored in the “deployment” file in the agent’s data directory (typically ~xncore/data).
-
A data file describing the deployment core type, build number, and version. The data file is stored in the agent’s data directory (typically ~xncore/data); the file name is the same as the deployment sequence number (e.g., ~xncore/data/1).
-
A deployment directory in the agent’s deploy directory (typically ~xncore/deploy) that contains the contents of the binary archive downloaded from Gepo.
-
A symbolic link called “current” that points to the current deployment directory.
An example might help illustrate these points. Consider the following directory listings (under ~xncore) for a host that has been assigned to 5 different config paths over its lifetime:
xncore@prod1:~: pwd /home/xncore xncore@prod1:~: ls -l . data deploy .: total 14 drwxr-xr-x 2 xncore xncore 8 Oct 24 04:08 data drwxr-xr-x 7 xncore xncore 8 Oct 24 04:08 deploy data: total 12 -rw-rw-rw- 1 xncore xncore 162 Oct 3 18:16 1 -rw-rw-rw- 1 xncore xncore 162 Oct 12 23:45 2 -rw-rw-rw- 1 xncore xncore 162 Oct 13 00:08 3 -rw-rw-rw- 1 xncore xncore 162 Oct 17 04:58 4 -rw-rw-rw- 1 xncore xncore 164 Oct 24 04:08 5 -rw-rw-rw- 1 xncore xncore 1 Oct 24 04:08 deployment deploy: total 16 drwxr-xr-x 8 xncore xncore 9 Oct 12 13:26 1 drwxr-xr-x 7 xncore xncore 7 Oct 11 19:02 2 drwxr-xr-x 7 xncore xncore 8 Oct 17 04:58 3 drwxr-xr-x 7 xncore xncore 8 Oct 24 04:08 4 drwxr-xr-x 7 xncore xncore 9 Oct 24 04:09 5 lrwxrwxrwx 1 xncore xncore 27 Oct 24 04:08 current -> /home/xncore/deploy/5 xncore@prod1:~:cat data/deployment ; echo # the deployment file does not contain a linefeed 5 xncore@prod1:~: cat data/5 --- !ruby/object:OpenStruct table: :build: 6.5.10-9378 :core_base: /home/xncore/deploy/5 :config_path: /uk_datacenter/6.5.10/apache :core_type: resolver xncore@prod1:~: ls -l deploy/current/ total 767612 drwxr-xr-x 2 xncore xncore 5 Oct 23 22:08 bin drwxr-xr-x 2 xncore xncore 6 Oct 24 04:08 etc -rw-rw-rw- 1 xncore xncore 392679750 Dec 1 09:06 launcher.log -rw-rw-rw- 1 xncore xncore 5 Oct 24 04:09 launcher.pid drwxr-xr-x 4 xncore xncore 4 Oct 23 22:07 lib drwxr-xr-x 15 xncore xncore 15 Oct 16 16:56 static-resources drwxr-xr-x 3 xncore xncore 5 Oct 23 22:07 test
Old deployment directories are not removed when new deployments are created. Thus, a core which has been assigned, cleared, or updated several times will contain several old versions of binaries and logs. It is safe for users to remove all but the current deployment directory if needed.
Launcher is a script responsible for starting and stopping cores. It is distributed with each core type, and may vary as needed depending on the type of core it maintains.
Galaxy calls launcher to start and stop cores, and to check core status.
Note that you need to write your own launcher scripts as they are deployments specific. See the section “How to Galaxify an application” below.
Launcher can be invoked directly from the command line, allowing cores to be controlled without the galaxy command-line tool. This is useful for debugging:
xncore@prod1# su - xncore xncore@prod1% cd deploy/current xncore@prod1% bin/launcher start xncore@prod1# su - xncore xncore@prod1% cd deploy/current xncore@prod1% bin/launcher stop
Mongrel gem.
You can run a Gonsole and an agent locally. In two different terminals, run:
rake run:gonsole rake run:gagent
You can now use the command line client to interact with theses:
galaxy > ruby -Ilib bin/galaxy show-agent -c localhost localhost online 3.0.0 galaxy > ruby -Ilib bin/galaxy show-console -c localhost druby://localhost:4440 http://Pierre-Alexandre-Meyers-MacBook-Pro.local:4442 localhost - 10
Running the galaxy command with -help will display on-line help.
xncore@prod1:~: galaxy -help galaxy [options] <command> [args] Options: -h, --help Display a help message and exit -c, --console CONSOLE Galaxy console host (overrides GALAXY_CONSOLE) -C, --config FILE Configuration file (overrides GALAXY_CONFIG) -p, --parallel-count THREADS Maximum number of threads to use, default 25 -r, --relaxed-versioning Allow updates to the currently assigned version -V Display the Galaxy version number and exit -y, --yes Avoid confirmation prompts by automatically confirming all actions Filters: -i, --host HOST Select a specific agent by hostname -I, --ip IP Select a specific agent by IP address -m, --machine MACHINE Select agents by physical machine -M, --cohabitants HOST Select agents that share a physical machine with the specified host -s, --set SET Select 'e{mpty}', 't{aken}' or 'a{ll}' hosts -S, --state STATE Select 'r{unning}' or 's{topped}' hosts -A, --agent-state STATE Select 'online' or 'offline' agents -e, --env ENV Select agents in the given environment -t, --type TYPE Select agents with a given software type -v, --version VERSION Select agents with a given software version Notes: - Filters are evaluated as: set | host | (env & version & type) - The HOST, MACHINE, and TYPE arguments are regular expressions (not globs) - The default filter selects all hosts Commands: assign cleanup clear perform reap restart rollback show show-agent show-console show-core ssh start stop update update-config
Galaxy requires Ruby 1.8.7 or Ruby 1.9 and Ruby Gems. See github.com/ning/galaxy/downloads.
The galaxy command-line client is used to interact with the deployments via the Gonsole. For convenience, the GALAXY_CONSOLE environment variable can be set prior to running the galaxy command, e.g.:
export GALAXY_CONSOLE=gonsole.prod.company.com
To deploy one core to one machine, you need at least an Agent and a Gonsole running. The Agent needs to run on the same machine where the code is being deployed. The command-line client and the Gonsole can be anywhere (same machine, or not).
Core status is displayed using the ‘show’ command.
xncore@prod1:~: galaxy show -help galaxy [options] <command> [args] Usage for 'show': show Show software deployments on the selected hosts Examples: - Show all hosts: galaxy show - Show unassigned hosts: galaxy -s empty show - Show assigned hosts: galaxy -s taken show - Show a specific host: galaxy -i foo.bar.com show - Show all widgets: galaxy -t widget show
- options
-
defines a search filter, and is some combination of:
-
-i <host> => Selects the single host specified by the supplied host name
-
-s <site> => Selects the hosts in the specified site <site>, corresponding to the first node of the config path (e.g., “uk_datacenter”)
-
-v <version> => Selects the hosts running the specified version <version>, corresponding to the second node of the config path (e.g., “6.6”)
-
-t <type> => Selects the hosts running the specified core type <type>, corresponding to the third node (and any subsequent nodes) of the config path (e.g., “apache”, “apache/ssl”, …).
-
-s <set> => Selects the hosts in the specified set, which is one of:
-
a => Selects all hosts running a Galaxy agent
-
t => Selects all hosts that have been assigned a role
-
e => Selects all hosts that have NOT been assigned a role
Other search filters are available; see the Galaxy command-line help for more information.
By default, core sub-types (e.g., “apache/ssl”, “apache/passenger”, etc.) are NOT included in searches for the core super-type (e.g., “apache”). To retrieve cores with a sub-type, append a .* to the core type. Note that some shells require the wildcard to be escaped; bash does not (as long as the current working directory does not contain files matching the glob pattern).
Two files are required to Galaxify an application:
-
A build.properties file uploaded to the desired config path in the Gepo config branch. This file must contain at least one property, type, which defines the _type> portion of the tarball name described below. It may also contain a version property, which defines the <version> portion of the tarball name, which is an arbitrary string that may include a build number, patch date, or other useful information.
A typical build.properties file looks like:
type=apache build=2.2
Note that if the version property is not specified, it will be inherited from a build.properties file located higher up in the Gepo config path.
-
An appropriately named binary archive uploaded to the binaries branch in Gepo. The file name should be: <type>-<version>.tar.gz.
The archive must contain at least one file, bin/xndeploy, that must return a 0 exit status. This script can do whatever is needed to deploy the application being installed. Note that most xndeploy scripts require the Galaxy library to be included in the archive, under lib/galaxy.
If Galaxy is to start and stop processes, a bin/launcher script is also required. This script takes one of “start”, “stop”, “restart”, or “status”, and returns with exit status 0 (stopped), 1 (running), or 2 (unknown).
The archive can contain any other files the application requires, commonly placed under the lib directory. The archive will be extracted to the deployment directory (~xncore/deploy/<sequence number>) and linked to ~xncore/deploy/current for convenience.
Running the galaxy-agent command with no arguments will display on-line help.
xncore@prod1:~: galaxy-agent -help Usage: galaxy-agent <command> [options] Commands, use just one of these -s, --start Start the agent -k, --stop Stop the agent Options for Start -C, --config FILE Configuration file (overrides GALAXY_CONFIG) -i, --host HOST[:PORT] Hostname this agent manages (default localhost) -m, --machine MACHINE Physical machine where the agent lives (overrides -M) -M, --machine-file FILE Filename containing the physical machine name -c ['http://'|'druby://']HOST[:PORT] --console Hostname where the console is listening -r, --repository URL Base URL for the repository -b, --binaries URL Base URL for the binary archive -d, --deploy-to DIR Directory where to make deployments -x, --data DIR Directory for the agent's database -f, --fore, --foreground Run agent in the foreground -a, --announce-interval INTERVAL How frequently (in seconds) the agent should announce General Options -z, --event_listener URL Which listener to use -l, --log LOG STDOUT | STDERR | SYSLOG | /path/to/file.log -L, --log-level LEVEL DEBUG | INFO | WARN | ERROR. Default=INFO -u, --user USER User to run as -g, --agent-log FILE File agent should rediect stdout and stderr to -t, --test Test, displays as -v without doing anything -v, --verbose Verbose output -V, --version Print the galaxy version and exit -h, --help Show this help
For example:
galaxy-agent -s -i prod1.company.com -c gonsole.prod.company.com -r http://gepo.company.com/config/trunk/prod -b http://gepo.company.com/binaries -d ~xncore/deploy -x ~xncore/data
To see the agent debug log on the terminal, append the following to the end of the above command:
-f -l STDOUT -L DEBUG -v
Same steps as the agent:
xncore@prod1:~: galaxy-console -help Usage: galaxy-console <command> [options] Commands, use just one of these -s, --start Start the console -P, --start-proxy Start the proxy console -k, --stop Stop the console Options for Start -C, --config FILE Configuration file (overrides GALAXY_CONFIG) -i, --host HOST Hostname this console runs on -a HOST[:PORT] Port for Http post announcements --announcement-url -p, --ping-interval INTERVAL How many seconds an agent can be silent before being marked dead -f, --fore, --foreground Run console in the foreground -Q, --console-proxied-url URL Gonsole to proxy General Options -z, --event_listener URL Which listener to use -l, --log LOG STDOUT | STDERR | SYSLOG | /path/to/file.log -L, --log-level LEVEL DEBUG | INFO | WARN | ERROR. Default=INFO -g, --console-log FILE File agent should rediect stdout and stderr to -u, --user USER User to run as -t, --test Test, displays as -v without doing anything -v, --verbose Verbose output -V, --version Print the galaxy version and exit -h, --help Show this help
The preferred method of configuring Galaxy is to write an /etc/galaxy.conf file.
See github.com/ning/galaxy/blob/master/lib/galaxy/config.rb for up-to-date configuration options.
mvn clean package
The Rakefile has a few rules to build gems, RPM and Solaris packages. See:
rake -T
for more information.
Copyright 2010 Ning
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.