======
CARE
======
-------------------------------------------------
Comprehensive Archiver for Reproducible Execution
-------------------------------------------------
:Date: 2014-09-15
:Version: 2.2
:Manual section: 1
Synopsis
========
**care** [*option*] ... *command*
Description
===========
CARE monitors the execution of the specified command to create an
*archive* that contains all the material required to *re-execute* it
in the same context. That way, the command is reproducible
everywhere, even on Linux systems that are supposedly incompatible
with the original one. CARE is typically useful for reliable bug
reports, demonstrations, `artifact evaluation`_, tutorials, portable
applications, minimal rootfs, file-system coverage, ...
By design, CARE does not record events at all. Instead, it archives
environment variables and accessed file-system components -- before
modification -- during the so-called *initial* execution. Then, to
reproduce this execution, the ``re-execute.sh`` script embedded into
the archive restores the environment variables and relaunches the
command confined into the saved file-system. That way, both *initial*
and *reproduced* executions should produce the same results as they
use the same context, assuming they do not rely on external events --
like key strokes or network packets -- or that these external events
are replayed manually or automatically, using umockdev_ for instance.
This also means that a reproduced execution can be altered
deliberately, either by changing the content of the saved file-system
or by replaying different external events.
.. _umockdev: https://github.com/martinpitt/umockdev/
.. _artifact evaluation: http://www.artifact-eval.org/
Privacy
-------
To keep sensitive files from leaking into the archive, CARE
recursively *conceals* the content of ``$HOME`` and ``/tmp``, that
is, they appear empty during the original execution. For consistency,
however, the content of ``$PWD`` is *revealed* even if it is nested in
one of these two paths.
As a consequence, a program executed under CARE may behave
unexpectedly because a required path is not accessible anymore. In
this case, such a path has to be revealed explicitly. For details,
see the options ``--concealed-path`` and ``--revealed-path``, and the
file ``concealed-accesses.txt`` as well.
It is advised to inspect the archived content before sharing it.
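
As a starting point for such an inspection, the sketch below lists the files of an extracted archive whose names match a few patterns often associated with credentials. The function name ``list_suspicious`` and the pattern list are illustrative assumptions, not part of CARE, and an empty result does not prove that the archive is free of sensitive data.

```shell
# list_suspicious DIR: print the regular files under DIR whose names
# match a few patterns that often denote secrets.
# The pattern list is an assumption made for this example; adapt it.
list_suspicious() {
    find "$1" -type f \( \
        -name 'id_rsa*' -o -name 'id_ed25519*' -o -name '*.pem' \
        -o -name '*.key' -o -name '.netrc' -o -name '.pgpass' \
    \) | sort
}
```

It could be run as ``list_suspicious ./care-130213072430/rootfs``, where the directory name is the one used in the Example section.
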
Options
=======
The command-line interface is composed of two parts: first CARE's
options, then the command to launch. This section describes the
options supported by CARE, that is, the first part of its command-line
interface.
-o path, --output=path
Archive in *path*; its suffix specifies the format.
The suffix of *path* selects the archive format. It can be one of
the following:

=========  ========================================================
suffix     comment
=========  ========================================================
/          don't archive, copy into the specified directory instead
.tar       most common archive format
.cpio      most portable archive format, it can archive sockets too
?.gz       most common compression format, but slow
?.lzo      fast compression format, but uncommon
?.bin      see ``Self-extracting format`` section
?.?.bin    see ``Self-extracting format`` section
.bin       see ``Self-extracting format`` section
.raw       recommended archive format, use ``care -x`` to extract
=========  ========================================================

where "?" means the suffix must be combined with another one. For
example: ".tar.lzo", ".cpio.gz", ".tar.bin", ".cpio.lzo.bin", ...
If this option is not specified, the default output path is
``care-<DATE>.bin`` or ``care-<DATE>.raw``, depending on whether
CARE was built with self-extracting format support or not.
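
The suffix rules above can be read as a simple pattern match. The helper below is a hypothetical illustration of that reading (``care_output_kind`` is not a CARE command and does not call ``care``); it only covers the combinations given as examples, and any other suffix is reported as not listed rather than guessed.

```shell
# care_output_kind PATH: describe PATH according to the suffix table.
# Hypothetical helper written for this example only.
care_output_kind() {
    case "$1" in
        */)
            echo "directory copy, no archive" ;;
        *.raw)
            echo "raw archive" ;;
        *.tar.gz.bin|*.tar.lzo.bin|*.cpio.gz.bin|*.cpio.lzo.bin)
            echo "self-extracting, compressed" ;;
        *.tar.bin|*.cpio.bin|*.bin)
            echo "self-extracting" ;;
        *.tar.gz|*.tar.lzo|*.cpio.gz|*.cpio.lzo)
            echo "compressed archive" ;;
        *.tar|*.cpio)
            echo "plain archive" ;;
        *)
            echo "suffix not listed" ;;
    esac
}
```

For instance, ``care_output_kind foo.tar.gz.bin`` falls into the "self-extracting, compressed" branch, matching the ``?.?.bin`` row of the table.
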
-c path, --concealed-path=path
Make *path* content appear empty during the original execution.
Some paths may contain sensitive data that should never be
archived. This is typically the case for most of the files in:
* $HOME
* /tmp
That's why these directories are recursively *concealed* from the
original execution, unless the ``-d`` option is specified.
Concealed paths appear empty during the original execution; as a
consequence, their original content can be neither accessed nor
archived.
-r path, --revealed-path=path
Make *path* content accessible when nested in a concealed path.
Concealed paths might make the original execution with CARE behave
differently from an execution without CARE. For example, a lot of
``No such file or directory`` errors might appear. The solution
is to recursively *reveal* any required path that is nested in a
*concealed* path. Note that ``$PWD`` is *revealed*, unless the
``-d`` option is specified.
-p path, --volatile-path=path
Don't archive *path* content, reuse actual *path* instead.
Some paths contain only communication means with programs that
can't be monitored by CARE, like the kernel or a remote server.
Such paths are said to be *volatile*: they shouldn't be archived;
instead, they must be accessed from the *actual* rootfs during the
re-execution. This is typically the case for the following pseudo
file-systems, sockets, and authority files:
* /dev
* /proc
* /sys
* /run/shm
* /tmp/.X11-unix
* /tmp/.ICE-unix
* $XAUTHORITY
* $ICEAUTHORITY
* /var/run/dbus/system_bus_socket
* /var/tmp/kdecache-$LOGNAME
This is also typically the case for any other fifos or sockets.
These paths are considered *volatile*, unless the ``-d`` option is
specified.
-e name, --volatile-env=name
Don't archive *name* env. variable, reuse actual value instead.
Some environment variables are used to communicate with programs
that can't be monitored by CARE, like remote servers. Such
environment variables are said to be *volatile*: they shouldn't be
archived; instead, they must be read from the *actual* environment
during the re-execution. This is typically the case for the
following ones:
* DISPLAY
* http_proxy
* https_proxy
* ftp_proxy
* all_proxy
* HTTP_PROXY
* HTTPS_PROXY
* FTP_PROXY
* ALL_PROXY
* DBUS_SESSION_BUS_ADDRESS
* SESSION_MANAGER
* XDG_SESSION_COOKIE
These environment variables are considered *volatile*, unless the
``-d`` option is specified.
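
Because volatile variables are taken from the actual environment, it can help to check what a re-execution would see on the current system. The sketch below prints the variables from the default list above that are currently set; ``VOLATILE_ENV`` and ``show_volatile_env`` are names chosen for this example and are not provided by CARE.

```shell
# Names copied from the default list of volatile variables above.
VOLATILE_ENV="DISPLAY http_proxy https_proxy ftp_proxy all_proxy
HTTP_PROXY HTTPS_PROXY FTP_PROXY ALL_PROXY
DBUS_SESSION_BUS_ADDRESS SESSION_MANAGER XDG_SESSION_COOKIE"

# show_volatile_env: print NAME=VALUE for each listed variable that is
# set in the current shell.
show_volatile_env() {
    for name in $VOLATILE_ENV; do
        # Indirect lookup in POSIX sh; eval is safe here because the
        # names come from the fixed list above.
        eval "is_set=\${$name+yes}; value=\${$name-}"
        if [ "$is_set" = yes ]; then
            printf '%s=%s\n' "$name" "$value"
        fi
    done
}
```

Variables that are not printed are unset, so a re-executed program that relies on them would not find them.
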
-m value, --max-archivable-size=value
Set the maximum size of archivable files to *value* megabytes.
To keep the CPU time and the disk space used by the archiver
reasonable, files whose size exceeds *value* megabytes are
truncated down to 0 bytes. The default is 1GB, unless the ``-d``
option is specified. A negative *value* means no limit.
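
To anticipate which files a given limit would truncate, the files above that size can be listed beforehand. This is a minimal sketch: ``list_oversized`` is a hypothetical helper, and it assumes that one megabyte means 1024 * 1024 bytes, which may differ from CARE's own accounting.

```shell
# list_oversized DIR LIMIT_MB: print the regular files under DIR that
# are strictly larger than LIMIT_MB megabytes.
# Assumption: 1 megabyte = 1024 * 1024 bytes.
list_oversized() {
    limit_bytes=$(( $2 * 1024 * 1024 ))
    # With the "c" suffix, find compares sizes in bytes; "+N" means
    # strictly greater than N.
    find "$1" -type f -size +"${limit_bytes}c" | sort
}
```

For example, ``list_oversized /usr/share 1024`` would list the files under ``/usr/share`` that exceed the default limit of 1GB.
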
-d, --ignore-default-config
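Don't use the default options, that is, the default concealed,
revealed, and volatile settings and the default size limit
described in this section.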
-x file, --extract=file
Extract content of the archive *file*, then exit.
It is recommended to use this option to extract archives created
by CARE because most extracting tools -- that are not based on
libarchive -- are too limited to extract them correctly.
-v value, --verbose=value
Set the level of debug information to *value*.
The higher the integer *value* is, the more detailed debug
information is printed to the standard error stream. A negative
*value* makes CARE quiet except on fatal errors.
-V, --version, --about
Print version, copyright, license and contact, then exit.
-h, --help, --usage
Print the user manual, then exit.
Exit Status
===========
If an internal error occurs, ``care`` returns a non-zero exit status,
otherwise it returns the exit status of the last terminated program.
When an error has occurred, the only way to know if it comes from the
last terminated program or from ``care`` itself is to have a look at
the error message.
Files
=====
The output archive contains the following files:
``re-execute.sh``
start the re-execution of the initial command as originally
specified. It is also possible to specify an alternate command.
For example, assuming ``gcc`` was archived, it can be re-invoked
differently::

    $ ./re-execute.sh gcc --version
    gcc (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2

    $ echo 'int main(void) { return puts("OK"); }' > rootfs/foo.c
    $ ./re-execute.sh gcc -Wall /foo.c
    foo.c: In function "main":
    foo.c:1:1: warning: implicit declaration of function "puts"

``rootfs/``
directory where all the files used during the original execution
were archived, they will be required for the reproduced execution.
``proot``
virtualization tool invoked by re-execute.sh to confine the
reproduced execution into the rootfs. It also emulates the
missing kernel features if needed.
``concealed-accesses.txt``
list of accessed paths that were concealed during the original
execution. Its main purpose is to show which paths should be
revealed if the original execution didn't go as expected. It is
not used by the reproduced execution.
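
When many paths were concealed, the list can be turned into ``-r`` options for a second run. The sketch below assumes that ``concealed-accesses.txt`` holds one path per line, which is not stated in this manual; ``reveal_args`` is a hypothetical helper name.

```shell
# reveal_args FILE: print one "-r PATH" line per distinct path in FILE.
# Assumption: FILE contains one path per line.
reveal_args() {
    sort -u "$1" | while IFS= read -r path; do
        if [ -n "$path" ]; then
            printf '%s %s\n' -r "$path"
        fi
    done
}
```

The output is meant to be reviewed by hand before being passed to ``care``: revealing a path also makes its content archivable, and paths containing whitespace need quoting.
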
Limitations
===========
It's not possible to use GDB, strace, or any program based on
*ptrace* under CARE yet. CARE itself relies on this syscall, and the
Linux kernel allows only one *ptracer* per process. This will be
fixed in a future version of CARE thanks to a ptrace emulator.
Example
=======
In this example, Alice wants to report to Bob that the compilation of
PRoot v2.4 raises an unexpected warning::

    alice$ make -C PRoot-2.4/src/
    make: Entering directory `PRoot-2.4/src'
    [...]
    CC path/proc.o
    ./path/proc.c: In function 'readlink_proc':
    ./path/proc.c:132:3: warning: ignoring return value of 'strtol'
    [...]

Technically, Alice uses Ubuntu 11.04 for x86, whereas Bob uses
Slackware 13.37 on x86_64. Both distros are supposed to ship
GCC 4.5.2; however, Bob is not able to reproduce this issue on his
system::

    bob$ make -C PRoot-2.4/src/
    make: Entering directory `PRoot-2.4/src'
    [...]
    CC path/proc.o
    [...]

Since they don't have much time to investigate this issue by going
back and forth, they decide to use CARE. First, Alice prepends
``care`` to her command::

    alice$ care make -C PRoot-2.4/src/
    care info: concealed path: $HOME
    care info: concealed path: /tmp
    care info: revealed path: $PWD
    care info: ----------------------------------------------------------------------
    make: Entering directory `PRoot-2.4/src'
    [...]
    CC path/proc.o
    ./path/proc.c: In function 'readlink_proc':
    ./path/proc.c:132:3: warning: ignoring return value of 'strtol'
    [...]
    care info: ----------------------------------------------------------------------
    care info: Hints:
    care info: - search for "conceal" in `care -h` if the execution didn't go as expected.
    care info: - use `./care-130213072430.bin` to extract the output archive.

Then she sends the ``care-130213072430.bin`` file to Bob. Now, he
should be able to reproduce her issue on his system::

    bob$ ./care-130213072430.bin
    [...]
    bob$ ./care-130213072430/re-execute.sh
    make: Entering directory `PRoot-2.4/src'
    [...]
    CC path/proc.o
    ./path/proc.c: In function 'readlink_proc':
    ./path/proc.c:132:3: warning: ignoring return value of 'strtol'
    [...]

So far so good! This compiler warning doesn't make sense to Bob since
``strtol`` is used there to check a string format; the return value is
useless, only the ``errno`` value matters. Further investigation is
required, so Bob re-executes Alice's GCC differently to get more
details::

    bob$ ./care-130213072430/re-execute.sh gcc --version
    gcc (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2
    Copyright (C) 2010 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

The same invocation on his system returns something slightly
different::

    bob$ gcc --version
    gcc (GCC) 4.5.2
    Copyright (C) 2010 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

This confirms that both GCC versions are the same; however, Alice's
seems to have been modified by Ubuntu. Yet, according to the web
page related to this Ubuntu package [#]_, no changes regarding
``strtol`` were made. So Bob decides to search the files coming
from Alice's system, that is, the ``rootfs`` directory in the
archive::

    bob$ grep -wIrl strtol ./care-130213072430/rootfs
    care-130213072430/rootfs/usr/include/inttypes.h
    care-130213072430/rootfs/usr/include/stdlib.h
    [...]

Here, the file ``usr/include/stdlib.h`` contains a declaration of
``strtol`` with the "warn unused result" attribute. On Ubuntu, this
file belongs to the EGLIBC package, and its related web page [#]_
shows that this attribute was actually wrongly introduced by the
official EGLIBC developers. Ultimately Bob should notify them in this
regard.
Thanks to CARE, Bob was able to reproduce the issue reported by Alice
without effort. For investigation purposes, he was able to re-execute
programs differently and to search the relevant files.
.. [#] https://launchpad.net/ubuntu/oneiric/+source/gcc-4.5/4.5.2-8ubuntu4
.. [#] https://launchpad.net/ubuntu/+source/eglibc/2.13-0ubuntu13.2
Self-extracting format
======================
The self-extracting format used by CARE starts with an extracting
program, followed by a regular archive, and it ends with a special
footer. The footer contains the signature "I_LOVE_PIZZA" followed by
the size of the embedded archive::

    +------------------------+
    |   extracting program   |
    +------------------------+
    |                        |
    |    embedded archive    |
    |                        |
    +------------------------+
    | uint8_t signature[13]  |
    | uint64_t archive_size  | # big-endian
    +------------------------+

The command ``care -x`` can be used against a self-extracting archive,
even if they were not built for the same architecture. For instance,
a self-extracting archive produced for ARM can be extracted with a
``care`` program built for x86_64, and vice versa. It is also
possible to use external tools to extract the embedded archive, for
example::

    $ care -o foo.tar.gz.bin /usr/bin/echo OK
    [...]
    OK
    [...]

    $ hexdump -C foo.tar.gz.bin | tail -3
    0015b5b0  00 b0 2e 00 49 5f 4c 4f  56 45 5f 50 49 5a 5a 41  |....I_LOVE_PIZZA|
    0015b5c0  00 00 00 00 00 00 12 b4  13                       |.........|
    0015b5c9

    $ file_size=`stat -c %s foo.tar.gz.bin`
    $ archive_size=$((16#12b413))
    $ footer_size=21
    $ skip=$(($file_size - $archive_size - $footer_size))

    $ dd if=foo.tar.gz.bin of=foo.tar.gz bs=1 skip=$skip count=$archive_size
    1225747+0 records in
    1225747+0 records out
    1225747 bytes (1.2 MB) copied, 2.99546 s, 409 kB/s

    $ file foo.tar.gz
    foo.tar.gz: gzip compressed data, from Unix

    $ tar -tzf foo.tar.gz
    foo/rootfs/usr/
    [...]
    foo/re-execute.sh
    foo/README.txt
    foo/proot

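
The manual steps above can be wrapped into a small function that reads the archive size from the footer instead of copying it from the ``hexdump`` output. This is only a sketch of the layout described in this section: ``extract_embedded`` is a hypothetical name, it relies on ``head -c`` (widely available, but not required by POSIX), and it assumes the size fits in the shell's signed 64-bit arithmetic.

```shell
FOOTER_SIZE=21   # 13-byte signature field + 8-byte big-endian size

# extract_embedded SRC DST: copy the archive embedded in SRC to DST.
extract_embedded() {
    src=$1
    dst=$2
    # The footer starts with the 12 characters of the signature.
    sig=$(tail -c "$FOOTER_SIZE" "$src" | head -c 12)
    if [ "$sig" != "I_LOVE_PIZZA" ]; then
        echo "no CARE footer found in $src" >&2
        return 1
    fi
    # The last 8 bytes hold the archive size, most significant byte
    # first; od prints them as hex pairs, joined here into one number.
    hex=$(tail -c 8 "$src" | od -An -tx1 | tr -d ' \n')
    archive_size=$((0x$hex))
    # The archive sits immediately before the footer.
    tail -c "$((archive_size + FOOTER_SIZE))" "$src" \
        | head -c "$archive_size" > "$dst"
}
```

With the file from the example, ``extract_embedded foo.tar.gz.bin foo.tar.gz`` should produce the same ``foo.tar.gz`` as the ``dd`` command, without copying one byte at a time.
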
Downloads
=========
CARE is heavily based on PRoot_, which is why both are hosted in
the same repository: http://github.proot.me. Since CARE is supposed
to work on any Linux system, it is recommended to use the following
highly compatible static binaries:
* for x86_64: http://static.reproducible.io/care-x86_64
* for x86: http://static.reproducible.io/care-x86
* for ARM: http://static.reproducible.io/care-arm
* other architectures: on demand.
.. _PRoot: http://proot.me
Colophon
========
Visit http://reproducible.io for help, bug reports, suggestions, patches, ...
Copyright (C) 2014 STMicroelectronics, licensed under GPL v2 or later.
::

      _____ ____ _____ ____
     /   __/ __ |  __ \  __|
    /   /_/     |     /  __|
    \_____|__|__|__|__\____|