-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathapppot.txt
483 lines (399 loc) · 19.9 KB
/
apppot.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
This document describes how to set up and use an AppPot system image.
In particular, it embeds all the relevant scripts using [[http://orgmode.org/worg/org-contrib/babel/][Emacs'
Org-babel mode]]. You can still read and edit this document using
any regular text editor, but you will need to copy+paste the
scripts out of it yourself.
* Introduction
AppPot is a way to implement generic application deployment on a
computational Grid, and especially to enable users to provide their
own software to the computing cluster.
AppPot uses [[http://user-mode-linux.sf.net][User Mode Linux]] to provide users with a development and
running environment /they/ have full control over. User Mode Linux
(/UMLx/ for short) provides a virtualized environment in which users
can build and run their own computational applications. AppPot
augments this with scripts that can take a copy of the UML system
image and submit it as a regular computational job using the ARC
middleware.
** What is User Mode Linux
[[http://user-mode-linux.sf.net][User-Mode Linux]] allows running a Linux kernel as a userspace process
within a running Linux system; the guest kernel can see a file in the
host filesystem as if it was a disk device, but it can also mount
(portions of) the host filesystem into its own filesystem
hierarchy. Additional helper programs enable use of the host network
from within the system running in the UML.
From the [[http://user-mode-linux.sf.net][UMLx website]]:
User-Mode Linux gives you a virtual machine that may have more
hardware and software virtual resources than your actual, physical
computer. Disk storage for the virtual machine is entirely contained
inside a single file on your physical machine. You can assign your
virtual machine only the hardware access you want it to have. With
properly limited access, nothing you do on the virtual machine can
change or damage your real computer, or its software.
These are all features that are now familiar in virtualization
products; the main difference of UML with respect to other
virtualization solutions is that:
- User-Mode Linux can only run Linux inside Linux.
- [[http://user-mode-linux.sf.net][*User Mode Linux]] runs entirely in userspace* (including networking),
so there is no need for support by the systems administrator, nor
is any coordinated deployment or setup needed.
- UMLx's use of system resources is more easily controlled: a running
UMLx system is just a process in the host Linux system, so all
regular process controls apply.
** How is UML useful for running computational jobs on a Grid
Users receive a "AppPot system image", which is a raw Linux disk image
containing a working installation of a Linux distribution. The
reference AppPot image is based on Debian (see below): apart from
being optimized for running a single-user task in a virtualized
environment (i.e., multi-user support, hardware-related software and
most daemons have been turned off), this is a regular Debian install.
Users modify the AppPot system image by installing new software, e.g.,
their own version of a computational code. They can then submit a
job to the Grid middleware that just runs UMLx with the provided modified image.
A script is provided to create the correct incantation to run the
AppPot system remotely as a Grid computational job, and have it
execute a copy of the user's code with the specified parameters.
In order to reduce the network traffic, if the AppPot image is already
installed at the execution site, the script can just send a tarball
with the differences between the standard AppPot and the user-modified one.
This solves the issue of compiling and deploying each application
software on each platform: users compile and run tests on
their own desktop. There's no worry about multiple architecture/OSes
because all jobs run in the same Linux environment: same environment
for development and execution.
The AppPot system image is accompanied by the helper programs =linux=
and =slirp=; these are just statically-linked versions of the binaries
provided by Debian and can be used by anyone for running a UMLx system
in their Linux-based desktop (static link guarantees wider
compatibility).
Note that, since the AppPot system image is a raw disk image file, it
can be run through any virtualization software that supports the /raw/
disk format (e.g., KVM or Xen): this provides a way to use the AppPot
system in a more convenient way for users.
** Parts of AppPot
A working AppPot system consists of the following parts:
- an /AppPot system image/, i.e., a complete Linux system installed in
a partition image file (in [[http://en.wikibooks.org/wiki/QEMU/Images#Image_types][raw format]])
- a UMLx =linux= executable for running the Linux system in the
AppPot system image, and a =slirp= program for providing network
connectivity.
- a set of /AppPot scripts/ providing support for the features that
AppPot exposes to users: running a computational job upon startup,
taking a snapshot of a modified AppPot system, etc.
* Creation of the AppPot system image
Although any Linux system can run with a UMLx kernel, most Linux
distributions ship with configuration that are oriented towards
Desktop/graphical or server/multi-user usage. Those systems are
definitely overkill for a UMLx system whose sole purpose is to run a
single instance of a computational job.
On the other hand, the AppPot image needs to be usable by end-users
for compiling, installing and testing applications, so a good choice
of development tools must be available on it.
The ideal choice for the AppPot system image seems to be Debian
distribution:
- A *bare bones* system setup is supported by the installation
script (in the minimal setup, only three processes are running
after system boot, including =/sbin/init=; the total size of
installed packages is ~200MB) -- this can be even reduced by
carefully disabling some features that are not needed in the
AppPot environment.
- It has a *vast selection of pre-compiled packages*, that users can
easily install with =apt-get=/=aptitude=. The choice of available
development tools is probably the largest among Linux
distributions.
- It comes with excellent UMLx support, including:
* pre-packaged UMLx kernel, matching the distribution's default
kernel version;
* tools to bootstrap a UMLx disk image;
* utilities for UMLx networking.
Although the reference implementation is based on Debian Linux, the
AppPot scripts strive to be generic enough to run on any LSB-compliant
Linux distribution; in particular, they should run on any recent UMLx
image distributed on the [[http://user-mode-linux.sf.net][UMLx website]]
** Installation of UML in the host computer
UMLx runs a modified Linux kernel as a regular process. A sample
kernel (Fedora's linux-2.6.24-x86_64) and root image can be downloaded
from the [[http://user-mode-linux.sf.net][UMLx website]].
Debian/Ubuntu provide a pre-compiled UMLx kernel in package
`user-mode-linux`. In addition, Debian (hence, Ubuntu) provides
packages for creating a UML disk image ([[http://linuxappfinder.com/package/rootstrap][=rootstrap=]]) and for userspace
UMLx networking ([[http://en.wikipedia.org/wiki/Slirp][=slirp=]], [[http://vde.sourceforge.net/][=vde=]]).
The steps given here are intended to be performed on a Debian/Ubuntu
system; the following software should be installed as a prerequisite:
- a User-Mode Linux kernel;
- the =slirp= program (compiled with =-DFULL_BOLT= to remove speed restrictions);
- the =rootstrap= program to install Debian images.
The following command installs prerequisite software on a
Debian/Ubuntu system:
#+begin_src sh :exports code :file install-prereq.sh :shebang "#!/bin/sh"
sudo aptitude install \
user-mode-linux \
slirp \
rootstrap
#+end_src
** The AppPot system image
The AppPot system image is a regular Debian GNU/Linux installation,
with the following features:
- The same image is used by users for code development and testing
and for running a production job remotely.
- It provides a single-user environment, where multi-user features are
disabled to save space and execution time.
In particular, this translates into the following requirements:
- There is just one unprivileged user in =/etc/passwd=, and this
user is entitled to run `root` commands via =sudo= (Grid users
must be able to install additional packages if they are needed
for code development and/or run time). In particular, no strong
security system (like [[SELinux]]) is needed, and all supporting
code can be removed from the system and its kernel.
- Network access is only /outbound/: users have access to the
system console, and jobs are not usually able to receive
incoming network traffic at execution sites. Network daemons
can thus be removed to save space.
- In order to run jobs, it should be possible to specify a program
to run (with arguments) on the kernel command line; the UMLx
machine should execute the specified program non-interactively
instead of the shell prompt.
- Upon booting, if no command is specified, the user is dropped
directly to a shell prompt; there is no need for =/bin/login=.
There is just one terminal, multiplexed with [[GNU Screen]].
*** Creation of the Debian-based AppPot system image
The reference AppPot system image is based on [[http://www.debian.org/][Debian GNU/Linux]].
Debian's program [[rootstrap]] will be used for creating the AppPot system
image. The =rootstrap= command reads its configuration from the
system-wide file =/etc/rootstrap/rootstrap.conf= and merges it with a
=rootstrap.conf= file in the current directory (if it exists). The
format of the configuration file is documented in the [[http://manpages.ubuntu.com/1/rootstrap][rootstrap(1)]] man
page.
Based on the previous items, AppPot's =rootstrap.conf= builds a Debian
system image as follows:
- 1GB disk space: this should be enough for hosting all the software
one can possibly want to install for development; the actual
working/scratch space will always be mounted from the host system.
- Use the /ext2/ filesystem: journaling features provide no advantage
here, as the whole disk is just a file in the host filesystem,
which already provides journaling features.
#+begin_src sh :tangle rootstrap.conf
[global]
# Initial size of the filesystem image (in MBs). It will be created
# sparsely, so additional space is not allocated until it is used.
initialsize=1024
# the filesystem to create
fstype=ext2
# add the `appot` module to the std list to perform local customization
modules=network mkfs mount debian uml apppot umount
# rootstraps' defaults are too low for installing Debian squeeze
umlargs= mem=256M
#+end_src
Network settings in rootstrap only apply to the initial image setup;
however, having production settings here saves us from changing them
later on. We use the =slirp-fullbolt= program to provide access through
the user's networking capabilities on the host; /note that/ we assume that
=slirp='s default network 10.0.2.x does not conflict with local IP
installations.
#+begin_src sh :tangle rootstrap.conf
[network]
# name of the network interface inside the UML
interface=eth0
transport=slirp
# note: the following settings were copied verbatim from
# /etc/rootstrap/rootstrap.conf; they're slirp's defaults, tho
host=
uml=10.0.2.15
nameserver=10.0.2.3
gateway=10.0.2.2
netmask=255.255.0.0
slirp=slirp-fullbolt
#+end_src
Currently we build the AppPot system image based on Debian's `squeeze`
distribution, which was recently released as `stable`.
Alternatively, Debian's `testing` distribution provides a good
compromise of stability and up-to-date packages.
#+begin_src sh :tangle rootstrap.conf
[debian]
dist=stable
#mirror=http://mirror.switch.ch/ftp/mirror/debian/
mirror=http://debian.ethz.ch/debian/
# Sources for target's sources list. If no value is given the default
# is to use the main section of $mirror
# NOTE: you can provide multiple lines if each new line is indented
# with blank spaces
sources=
deb http://security.debian.org/ stable/updates main contrib non-free
deb http://debian.ethz.ch/debian stable main contrib non-free
#deb http://mirror.switch.ch/ftp/mirror/debian/ stable main contrib non-free
#+end_src
The default system-wide =/etc/rootstrap/rootstrap.conf= excludes
packages `pcmcia-cs` and `setserial` from the start, but this makes
the installation fail. Let us override this here: we'll remove unused
packages later on.
#+begin_src sh :tangle rootstrap.conf
# Packages which should not be installed in the first place (be sure
# you know what you're doing)
exclude=
#+end_src
After the base system is installed, =rootstrap= invokes =apt-get= to
remove and then install packages as specified in the following
configuration lines. Note that we just install the package `apppot`:
its dependencies control what really gets installed (but this allows
us to override some configuration files as well).
#+begin_src sh :tangle rootstrap.conf
# Packages which should be purged after the initial install
purge=dmidecode isc-dhcp-client isc-dhcp-common tasksel tasksel-data
# Extra packages to install via apt after initial debootstrap install
install=makedev bash screen libterm-readline-perl-perl
#+end_src
*** AppPot image customization
Image customization is performed by a custom =apppot= module, which is
run from =rootstrap=. According to [[man:rootstrap][rootstrap(1)]]:
When a module is invoked, the filesystem being created is mounted on
$TARGET, =/dev=, =/etc= and =/tmp= are tmpfs filesystems internal to
the UMLx system, while the root filesystem is a hostfs mount of the
system where rootstrap is running, to have access to the above
shadowed directories the full host filesystem is available on
$HOST. =/lib/modules= is tmpfs too and bind mounted to the host's
=/usr/lib/uml/modules= to let scripts load necessary kernel modules.
This means that most software on the host system should be available
and work as expected. The working directory where rootstrap is run
is available as $WORKDIR. The environment is generated from the
configuration file as described above.
Local modules should be placed in the =modules/= directory and
implement the following interface
The following actions are performed by the =apppot= module:
**** Reset the `root` password to a known standard value
Since there are no security concerns, we can just use a passwordless
root account:
#+begin_src sh :tangle modules/apppot :shebang #!/bin/sh
## remove root password
chroot $TARGET passwd -d root
#+end_src
**** Install additional packages
/TODO:/ move this into dependencies of the `apppot` Debian package?
This is made trivial with `apt-get` and `aptitude`:
#+begin_src sh :export code :file modules/apppot :tangle modules/apppot
## install additional packages
# install compatibility stuff
chroot $TARGET apt-get -y install \
lsb-base lsb-release \
bash jed less screen sudo \
;
# install basic development packages
chroot $TARGET apt-get -y install \
build-essential \
autoconf automake libtool make \
gcc g++ gfortran libc6-dev libstdc++6-4.4-dev \
subversion git-core \
;
#+end_src
**** Disable standard daemons
The minimal Debian install still leaves two daemons installed and
working: =rsyslogd= and =cron=. We keep =rsyslogd= running, as log
files can help diagnose problems, but we disable cron -- users can
re-enable it via =sudo update-rc.d cron enable= if they want.
#+begin_src sh
## disable standard daemons
chroot $TARGET /usr/sbin/update-rc.d cron disable
#+end_src
**** Modify `/etc/inittab` to disable multi-user consoles
#+begin_src sh :tangle modules/apppot
## comment out the console getty instances
sed -i -e 's|^\([1-6]\):|#\1:|' $TARGET/etc/inittab
## spawn a root shell on the system console
cat >> $TARGET/etc/inittab <<__EOF__
# spawn a root shell on the system console...
0:a:respawn:/usr/bin/screen /bin/bash
__EOF__
#+end_src
**** Install AppPot support scripts
/TODO:/ Replace with proper script packaging.
#+begin_src sh :export code :file modules/apppot :tangle modules/apppot
## copy AppPot support scripts to the created filesystem
# install the init script as '/etc/rc.local'; this is always run as
# the last step of the boot process; on Debian, by default 'rc.local'
# does nothing, so we can safely overwrite it.
install -o 0 -g 0 -m 0755 $WORKDIR/apppot-init.sh $TARGET/etc/rc.local
# setup the "snapshot" script
install -o 0 -g 0 -m 0755 $WORKDIR/apppot-snap.sh $TARGET/usr/bin/apppot-snap
install -d -o 0 -g 0 -m 0755 $TARGET/var/lib/apppot
install -o 0 -g 0 -m 0755 $WORKDIR/apppot-snap.exclude $TARGET/var/lib/apppot/apppot-snap.exclude
#+end_src
**** Record a snapshot of the system
As the very last step, we record a snapshot of the system, i.e., we
make this into a "base system image". This allows users to send only
the changes from the "base system image" when they want to run a job
on a remote AppPot.
#+begin_src sh :export code :file modules/apppot :tangle modules/apppot
## tag this as a "base image"
chroot $TARGET /usr/bin/apppot-snap base
#+end_src
** The =apppot= Debian package
Customization of the AppPot system image is done automatically by
installing a single Debian package =apppot=. Since the package is
versioned, one has a simple way to check what exact features are
provided by a certain AppPot image by looking at the version number.
In addition, existing images can be upgraded by upgrading the =apppot=
package.
The =apppot= package is a regular Debian package; it depends on all
extra packages (i.e., not in the Debian base system) that AppPot
includes, and replaces some configuration files with customized
versions.
The =apppot= package is described by the single file
`debian/packages`, which follows the format required by the [[http://yada.alioth.debian.org/][Yada]] tool.
#+begin_src sh :tangle debian/packages
## This is adapted from the example packages file for YADA; read
## /usr/share/doc/yada-doc/yada.txt.gz (from the yada-doc package) to
## find out how to customise it to your needs.
Source: apppot
Section: unknown
Priority: extra
Maintainer: Riccardo Murri <riccardo.murri@gmail.com>
Packaged-For: Grid Computing Competence Center,
University of Zurich
Standards-Version: 3.7.2
Upstream-Source: http://apppot.googlecode.com/svn/trunk/
Homepage: http://apppot.googlecode.com/
Description: AppPot customizations
This package installs the customizations that make a standard Debian
GNU/Linux image into an AppPot image.
Copyright: GPL
Copyright 2011 University of Zurich. All rights reserved.
Build: sh
# nothing to build, this is a metapackage
true
Clean: sh
# likewise, nothing to remove
true
Package: apppot
Architecture: any
Depends: autoconf, automake, bash, g++-4.4, gcc-4.4, git-core,
jed, less, libc6-dev, libstdc++6-4.4-dev, libtool, lsb-base (>= 3.0-6),
make, screen, subversion, sudo
Description: AppPot customizations
This package installs the customizations that make a standard Debian
GNU/Linux image into an AppPot image.
Install: sh
# files that will be installed by this package on the AppPot system
# see http://yada.alioth.debian.org/ for the syntax of the `yada` cmd
#yada install -script debian/apppot-make-job-package
Init: sh
defaults 00
# any commands placed here will placed into an /etc/init.d/apppot file
#+end_src
** TODO Creating a minimal UMLx kernel
** TODO Creating a statically-linked =slirp= binary.
* TODO Set-up for running computational jobs
** Command-line usage of UML
** Generating the job tarball
** Wrapper script for job execution
** End-user submission script
* TODO Example submission: the "Rheinfall" gaussian elimination code.
** TODO Installation of sources in the AppPot image
** TODO Compilation & local testing
** TODO Executing a test run
* Copyright and license
Copyright (c) 2011 GC3, University of Zurich. All rights reserved.
This document and the code in it are available under the GNU General
Public License version 3, or (at your option) any later version.
# (For Emacs only)
# Local variables:
# mode: org
# End:
# LocalWords: AppPot UMLx virtualization userspace middleware filesystem linux virtualized journaling