Command a add (Add or append files to archive)

Franco Corbelli edited this page Sep 2, 2023 · 1 revision

Append changes in files to archive, or create archive if it does not exist.

Files is a list of file and directory names separated by spaces. If a name is a directory, then it recursively includes all files and subdirectories within. In Windows, files may contain wildcards * and ? in the last component of the path (after the last slash). * matches any string and ? matches any character. In Unix/Linux, wildcards are expanded by the shell, which has the same effect.

A change is an addition, update, or deletion of any file or directory in files or any of its subdirectories to any depth. A file or directory is considered changed if its size or last-modified date (with 1 second resolution), or Windows attributes or Unix/Linux permissions (if saved) differ between the internal and external versions. File contents are not compared. If the attributes but not the date has changed, then the attributes are updated in the archive with the assumption that the file contents have not changed.
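
The change rule above can be sketched in Python. This is only an illustration under stated assumptions, not zpaqfranz code: `FileMeta`, `classify` and the three result labels are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    size: int      # size in bytes
    mtime: float   # last-modified time in seconds, may carry a fraction
    attrs: int     # Windows attributes or Unix/Linux permission bits


def classify(stored: FileMeta, current: FileMeta) -> str:
    """Compare the internal (stored) and external (current) versions of a file."""
    same_size = stored.size == current.size
    # Dates are compared with 1 second resolution, so fractions are dropped.
    same_date = int(stored.mtime) == int(current.mtime)
    if same_size and same_date:
        # Contents are never compared: only the attributes can still differ.
        return "unchanged" if stored.attrs == current.attrs else "attributes-only"
    return "changed"
```

In this sketch an "attributes-only" result corresponds to updating the attributes in the archive without re-reading the file contents.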

Files are added by splitting them into fragments along content-dependent boundaries, computing their SHA-1 hashes, and comparing with hashes already stored in the archive. If the hash matches, it is assumed that the fragments are identical and only a pointer to the previous compressed fragment is saved.
zpaqfranz, by default, double-checks against SHA-1 collisions with CRC-32: in that case a blocking error (GURU) is thrown (with -verify).
Unmatched fragments are packed into blocks, compressed, and appended to the archive.
This is extremely important: only the final portion of the .zpaq archive is modified, while the initial part remains the same. This allows mechanisms such as rsync --append to transfer only the modified portion of the data, instead of the entire file.
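
As a rough illustration of the deduplication described above, the following Python sketch stores fragments keyed by their SHA-1 digest and only ever appends to its store. It is a toy model, not zpaqfranz code: `split_fragments` cuts after every newline byte as a stand-in for the real content-dependent boundary rule, nothing is compressed, and `ToyArchive` is a hypothetical name.

```python
import hashlib


def split_fragments(data: bytes) -> list[bytes]:
    """Toy content-dependent splitter: cut after every newline byte (assumption)."""
    fragments, start = [], 0
    for i, byte in enumerate(data):
        if byte == 0x0A:
            fragments.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        fragments.append(data[start:])
    return fragments


class ToyArchive:
    def __init__(self) -> None:
        self.fragments: list[bytes] = []   # append-only, earlier entries never change
        self.index: dict[bytes, int] = {}  # SHA-1 digest -> position in self.fragments

    def add(self, data: bytes) -> tuple[list[int], int]:
        """Return (pointers describing the file, number of newly stored fragments)."""
        pointers, new = [], 0
        for frag in split_fragments(data):
            digest = hashlib.sha1(frag).digest()
            if digest not in self.index:
                # Unmatched fragment: append it to the end of the store.
                self.index[digest] = len(self.fragments)
                self.fragments.append(frag)
                new += 1
            # Matched or not, the file is described by a pointer to the fragment.
            pointers.append(self.index[digest])
        return pointers, new

    def extract(self, pointers: list[int]) -> bytes:
        return b"".join(self.fragments[p] for p in pointers)
```

Adding a second version of a file with one inserted line stores a single new fragment at the end, which mirrors why only the tail of a real archive changes.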

For each added or updated file or directory, the following information is saved in the archive:

  • the compressed contents;
  • fragment SHA-1 hashes;
  • the file or directory name as it appears in files plus any trailing path;
  • the last-modified date with 1 second resolution;
  • the Unix/Linux permissions or Windows attributes;
  • CRC-32 and the selected hash (default xxhash64) of the entire file.

Default behavior:

  • other metadata such as owner, group, ACLs, last access time, etc. are not saved;
  • symbolic links are not saved or followed;
  • hard links are followed as if they were ordinary files;
  • special file types such as devices, named pipes, and named sockets are not saved;
  • the Windows version will NOT save alternate data streams (ADS) unless forced to (-forcewindows or -715);
  • by default, files with .zfs in the path are ignored (unless -forcezfs or -715);
  • .xls/.ppt files are by default always "hard" appended to the file (unless -xls).

Multipart

If the archive is multi-part, zpaqfranz will create a new part using the next available part number. For example:

    zpaqfranz a "arc??" files   (creates arc01.zpaq)
    zpaqfranz a "arc??" files   (creates arc02.zpaq)
    zpaqfranz a "arc??" files   (creates arc03.zpaq)
    zpaqfranz x "arc??"         (extracts all parts)

Note the double quotes around the archive name: do NOT forget them, otherwise the shell may expand the ? wildcards itself.
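
The part numbering shown above can be sketched in Python as follows. This is a minimal sketch, assuming the pattern holds a single contiguous run of `?` and that parts start at 1; `part_name` and `next_part` are hypothetical helpers, not zpaqfranz functions.

```python
def part_name(pattern: str, part: int) -> str:
    """Replace the run of '?' with the zero-padded part number and add .zpaq."""
    width = pattern.count("?")
    if width == 0 or "?" * width not in pattern:
        raise ValueError("expected one contiguous run of '?' in the pattern")
    number = str(part).zfill(width)
    if len(number) > width:
        raise ValueError("part number does not fit in the pattern")
    return pattern.replace("?" * width, number, 1) + ".zpaq"


def next_part(pattern: str, existing: set[str]) -> str:
    """Return the name of the first part that does not exist yet."""
    part = 1
    while part_name(pattern, part) in existing:
        part += 1
    return part_name(pattern, part)
```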

Comment

The switch -comment sometext marks the current version of the archive with "sometext", which makes it easier to search or extract later. Warning: the text should not contain spaces or non-ASCII characters and, above all, must be unique. There is no duplication check on version comments: if you add the same comment more than once, you will not be able to use it later to extract the data. Extraction remains possible through the normal use of -until.

"Strange" things

  • If any file cannot be read (e.g. permission denied), then it is skipped and a warning is reported. However, other files are still added and the update is still valid.
  • If a file is readable but its size is not the expected one (for example because it changed during archiving), a WARN is written to the output (38385); the file IS added to the archive, and the global result becomes ERROR.
  • If a file is readable but the size read is zero (e.g. corrupted file, lost connection...), an ERROR is written to the output (38383); the file IS added to the archive, and the global result becomes ERROR. The hash of the file (stored by default by zpaqfranz) becomes !ERROR!, and the file will NOT be listed or extracted, unless -force is used.

The file list is generated during the initial scan, which (for large jobs) can take place even hours before the actual reading from the media. ZFS snapshots are the suggested way to avoid this kind of inconsistency, which is otherwise essentially unavoidable. In some cases VSS can be used to mitigate the risk on Windows.

It is possible to quickly find such problematic files with the i (info) command, or l (list).

zpaqfranz i c:\nz\ugo3.zpaq
zpaqfranz l c:\nz\ugo3.zpaq
zpaqfranz l c:\nz\ugo3.zpaq -force -summary -checksum

If archive is "" (a quoted empty string), then zpaqfranz compresses files as if creating a new archive, but discards the output without writing to disk (benchmark or debug).

Updates are transacted. If zpaq is interrupted before completing the update, then the partially appended data is ignored and overwritten on the next update. This is accomplished by first appending a temporary update header, appending the compressed data and index, then updating the header as the last step.
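
The commit order described above (temporary header, then data, then header update as the last step) can be illustrated with a toy in-memory format. This Python sketch is not the zpaq archive layout: the 5-byte header, the `T`/`C` tags and both function names are assumptions made for the example.

```python
import io
import struct

HEADER = struct.Struct("<cI")  # toy layout: 1-byte state tag + 4-byte payload length


def read_committed(buf: io.BytesIO) -> tuple[list[bytes], int]:
    """Return the committed payloads and the offset where the next update starts."""
    data = buf.getvalue()
    payloads, offset = [], 0
    while offset + HEADER.size <= len(data):
        tag, length = HEADER.unpack_from(data, offset)
        if tag != b"C":  # temporary header: interrupted update, ignore the rest
            break
        start = offset + HEADER.size
        payloads.append(data[start:start + length])
        offset = start + length
    return payloads, offset


def append_update(buf: io.BytesIO, payload: bytes, interrupt: bool = False) -> None:
    _, end = read_committed(buf)
    buf.seek(end)
    buf.truncate()                              # overwrite any partially appended data
    buf.write(HEADER.pack(b"T", len(payload)))  # 1. temporary header
    buf.write(payload)                          # 2. data
    if interrupt:
        return                                  # simulate a crash before the commit
    buf.seek(end)
    buf.write(b"C")                             # 3. update the header as the last step
```

In this model an interrupted update leaves a `T` record at the tail, which readers skip and the next `append_update` overwrites.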

Verbosity

As the archive is updated, the program reports the percent complete, the estimated time remaining, and other information. This behavior can be changed with -noeta, -verbose, -pakka, -debug, -summary.

Scripting

It is possible to run a script after the execution, using one of 3 different switches:

  • -exec_ok
  • -exec_warn
  • -exec_err

For example:

zpaqfranz a (...) -exec_err "d:\script\my very own script for errors.bat"

As a general rule for zpaqfranz, only the first of multiple switches is considered: -exec_ok pippo.sh -exec_ok pluto.sh -exec_ok paperino.sh is equivalent to -exec_ok pippo.sh
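
The "first one wins" rule can be sketched in Python. This is an illustration of the rule only, not the actual zpaqfranz parser; `first_wins` is a hypothetical helper and it assumes every listed switch takes exactly one value.

```python
def first_wins(argv: list[str], switches_with_value: set[str]) -> dict[str, str]:
    """Keep only the first value seen for each repeated switch."""
    chosen: dict[str, str] = {}
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in switches_with_value and i + 1 < len(argv):
            chosen.setdefault(arg, argv[i + 1])  # later repeats are ignored
            i += 2
        else:
            i += 1
    return chosen
```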

Specific switches

-test

Run a thorough check of all added files, re-reading them from the filesystem. This gives ample guarantees, but slows down the procedure, especially when operating over a network.

-f

-force

With add, attempt to add files even if the last-modified date has not changed. Files are added only if they really are different, based on comparing the computed and stored SHA-1 hashes

-index indexfile

With add, create archive.zpaq as a suffix to append to a remote archive which is assumed to be identical to indexfile except that indexfile contains no compressed file contents (D blocks). Then update indexfile by appending a copy of archive.zpaq without the D blocks. With extract, specify the index to create for archive.zpaq and do not extract any files. The purpose is to maintain a backup offsite without using much local disk space. The normal usage is to append the suffix at the remote site and delete it locally, keeping only the much smaller index. For example:

zpaqfranz a part files -index index.zpaq
cat part.zpaq >> remote.zpaq
rm part.zpaq

indexfile has no default extension. However, with a .zpaq extension it can be listed to show the contents of the remote archive or compare with local files. It cannot be extracted or updated as a regular archive. Thus, the following should produce identical output:

zpaqfranz l remote.zpaq
zpaqfranz l index.zpaq

If archive is multi-part (contains * or ?), then zpaq will substitute a part number equal to 1 plus the number of previous updates. The parts may then be accessed as a multi-part archive without appending or renaming.

With add, it is an error if the archive to be created already exists, or if indexfile is a regular archive. -index cannot be used with -until or a streaming archive -method s.... With extract, it is an error if indexfile exists and -force is not used to overwrite.

-noattributes

With add, do not save Windows attributes or Unix/Linux permissions to the archive.

-not [file]...

Do not add files that match any file by name. file may contain wildcards * and ? that match any string or character respectively, including /. A match to a directory also matches all of its contents. In Windows, matches are not case sensitive, and \ matches /. In Unix/Linux, arguments with wildcards must be quoted to protect them from the shell.

-only file...

Do not add any files unless they match at least one argument. The rules for matching wildcards are the same as -not. The default is * which matches everything.

If a file matches an argument to both -only and -not, then -not takes precedence.
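
The matching rules for -only and -not can be approximated with the Python sketch below. It is an approximation under stated assumptions, not the zpaqfranz matcher: `wildcard_to_regex`, `matches` and `selected` are hypothetical names, and the `windows` flag models the case-insensitive matching and the `\` versus `/` equivalence described above.

```python
import re


def wildcard_to_regex(pattern: str, windows: bool = False) -> re.Pattern:
    if windows:
        pattern = pattern.replace("\\", "/")  # on Windows, \ matches /
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any string, including /
        elif ch == "?":
            parts.append(".")    # any single character, including /
        else:
            parts.append(re.escape(ch))
    # A match to a directory also matches all of its contents.
    body = "".join(parts) + "(/.*)?"
    flags = re.DOTALL | (re.IGNORECASE if windows else 0)
    return re.compile(body, flags)


def matches(name: str, pattern: str, windows: bool = False) -> bool:
    if windows:
        name = name.replace("\\", "/")
    return wildcard_to_regex(pattern, windows).fullmatch(name) is not None


def selected(name, only=("*",), not_=(), windows=False) -> bool:
    """-not takes precedence over -only; the default -only is *."""
    if any(matches(name, p, windows) for p in not_):
        return False
    return any(matches(name, p, windows) for p in only)
```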

-tN

-threads N

Please note that there is no space in -tN, while there is one in -threads N. Add at most N blocks in parallel. The default is 0, which uses the number of processor cores, except not more than 2 when zpaqfranz is compiled to 32-bit code. Selecting fewer threads will reduce memory usage but run slower. Selecting more threads than cores does not help.
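
The default thread count described above can be written as a small Python sketch. `effective_threads` is a hypothetical helper; the `is_32bit` and `cores` parameters are assumptions added so the rule can be shown without inspecting the real build.

```python
import os


def effective_threads(requested: int = 0, is_32bit: bool = False, cores=None) -> int:
    """Resolve -threads N: 0 means 'use the cores', capped at 2 on 32-bit builds."""
    if requested > 0:
        return requested
    if cores is None:
        cores = os.cpu_count() or 1
    return min(cores, 2) if is_32bit else cores
```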

-to name...

With add, rename external files to the respective internal names. When files is empty, prefix the extracted files with the first name in names, inserting / if needed and removing : from drive letters. For example:

zpaq add archive dir -to newdir

will save dir/file as newdir/file, and so on.

The -only and -not options apply prior to renaming.
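
A minimal Python sketch of the renaming, assuming each entry in files is paired with the entry at the same position in the -to list. `rename_for_archive` is a hypothetical helper and only models the prefix substitution shown in the example.

```python
def rename_for_archive(path: str, files: list[str], to: list[str]) -> str:
    """Map an external path to its internal name by swapping the matching prefix."""
    for src, dst in zip(files, to):
        prefix = src.rstrip("/")
        if path == src or path.startswith(prefix + "/"):
            return dst.rstrip("/") + path[len(prefix):]
    return path  # no -to entry applies: the name is stored unchanged
```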

-until date | [-]version

Ignore any part of the archive updated after date or after version updates or -versions from the end if negative. Additionally, add will truncate the archive at this point before appending the next update. When a date is specified, the update will be timestamped with date rather than the current date.

A date is specified as a 4 digit year (1900 to 2999), 2 digit month (01 to 12), 2 digit day (01 to 31), optional 2 digit hour (00 to 23, default 23), optional 2 digit minute (00 to 59, default 59), and optional 2 digit seconds (00 to 59, default 59). Dates and times are always universal time zone (UT), not local time. Numbers up to 9999999 are interpreted as version numbers rather than dates. Dates may contain spaces and punctuation characters for readability; these are ignored. For example:

zpaq add backup files -until 2014/04/30 11:30

truncates any data added after April 30, 2014 at 11:30:59 universal time, then appends the update as if this were the current time. (It does not matter if any files are dated in the future).
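
The date and version rules for -until can be sketched in Python as follows. This is an illustrative parser written from the description above, not the zpaqfranz one; `parse_until` and the `("version", n)` / `("date", stamp)` return shape are assumptions.

```python
from datetime import datetime, timezone

DEFAULT_TAIL = "235959"  # default hour, minute and second when omitted


def parse_until(text: str):
    """Return ("version", n) or ("date", datetime in universal time)."""
    negative = text.strip().startswith("-")
    # Spaces and punctuation are ignored: only the digits matter.
    digits = "".join(ch for ch in text if ch in "0123456789")
    if not digits:
        raise ValueError("no digits in -until argument")
    if int(digits) <= 9999999:
        number = int(digits)
        return ("version", -number if negative else number)
    if len(digits) not in (8, 10, 12, 14):
        raise ValueError("expected YYYYMMDD[HH[MM[SS]]]")
    digits += DEFAULT_TAIL[len(digits) - 8:]  # fill in the missing time fields
    stamp = datetime.strptime(digits, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    if not 1900 <= stamp.year <= 2999:
        raise ValueError("year out of range")
    return ("date", stamp)
```

With this sketch, "2014/04/30 11:30" resolves to 11:30:59 universal time, which agrees with the example above.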

zpaq add backup files -until 0

deletes backup.zpaq and creates a new archive.

add -until is an error on multi-part archives or with an index. A multi-part archive can be rolled back by deleting the highest numbered parts.

Truncating and appending an encrypted archive with add -until (even -until 0) does not change the salt or keystream. Thus, it is possible for an attacker with the old and new versions to obtain the XOR of the trailing plaintexts without a password.

-until is rather risky with add. It is maintained for backward compatibility, but use the new -timestamp instead.

-timestamp X

Set the version datetime to X (e.g. 2021-12-30_01:03:04), useful to freeze zfs snapshots. Timestamps must be monotonically increasing (v[i+1].date > v[i].date).
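
The monotonic requirement can be checked with a short Python sketch. The format string is inferred from the single example above, and `parse_timestamp` and `is_monotonic` are hypothetical helpers rather than zpaqfranz functions.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d_%H:%M:%S"  # inferred from 2021-12-30_01:03:04


def parse_timestamp(text: str) -> datetime:
    return datetime.strptime(text, TIMESTAMP_FORMAT)


def is_monotonic(previous: list[str], new: str) -> bool:
    """True if the new version date is strictly later than the last stored one."""
    if not previous:
        return True
    return parse_timestamp(new) > parse_timestamp(previous[-1])
```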

-find pippo

-replace pluto

Find and replace strings in filenames, from "pippo" to "pluto". This is useful to "sterilize" the headers of filenames, for easy checking, comparing and extracting on different operating systems (e.g. Linux vs Windows).
As a general rule for zpaqfranz, only the first of multiple switches is considered: -find pippo -find pluto -find paperino is equivalent to -find pippo

-xls

Do NOT force adding of .XLS/.PPT (default: NO)

-715

Runs just about like 7.15

-nochecksum

Do NOT get checksum, like 7.15 (fastest)

-crc32

Store CRC-32 for every file

-xxhash

Store CRC-32 AND XXHASH64 for every file (DEFAULT)

-xxh3

Store CRC-32 AND XXH3 (128 bit) for every file (very fast)

-checksum -sha1

Store CRC-32 AND SHA-1 of each file (slow).

-sha256

Store CRC-32 AND SHA-256 of each file (slowest, most reliable).

-blake3

Store CRC-32 AND BLAKE3 of each file (faster than SHA-256 on Intel CPUs, very reliable. HW accelerated on Win 64).

-md5

Store CRC-32 AND MD5 of each file (widespread, backward compatibility).

-sha3

Store CRC-32 AND SHA-3 (256 bit) of each file (newer NIST standard), very, VERY reliable.

-verify

Do an early check for SHA-1 collisions during add(), fail if detected (slows ~10%)

-test

Do a post-add test (doveryay, no proveryay).

-vss

Use Volume Shadow Copies (Windows, with admin rights) to back up files from %users%.

-comment foo

Add a version with ASCII text 'foo'

-noeta

Do not show ETA (reduce script redirection clutter)

-verbose

Verbose output

-pakka

New-style output (by chunks)

-debug

Show LOTS of information

-summary

Be brief

-filelist

Add the list of files to be added in a VFILE

-forcezfs

Force to NOT ignore .zfs (typically FreeBSD snapshots)

-noqnap

Force to NOT ignore "@Recently-Snapshot" and "@Recycle"

-forcewindows

Force to NOT ignore "System Volume Information" and "$RECYCLE.BIN" and the ADS (alternate data stream)

-nopath

Do not store the full path when adding

-nosort

Do not sort files before adding

-freeze outputfolder -maxsize something

Backups of virtual disks (e.g. vmdk) can become quite big over the medium term (months). The classic approach is to simply rename (or move) the .zpaq file "somewhere" (typically a NAS) so that, at the next run, the backup starts over from scratch.
Usually the old archive is not deleted immediately (it includes the backups of yesterday, the previous day, etc.) but is "parked" awaiting deletion, say after months or years. The new switch does this automatically and, with the n (purge) command, archive rotation is quite easy.

-copy somefolder

Make a 2nd copy of the written data into another folder (ex. two copies on two different NAS)

-sfx nameofsfx.exe

On Windows 32/64 it is possible to make SFX archives

zpaqfranz a z:\1.zpaq *.cpp -sfx z:\2.exe

with some options that become part of the extraction SFX module

  • -sfxto folder Set -to into the SFX module
  • -sfxforce Set -force into the SFX module
  • -sfxnot Like -not for SFX
  • -sfxonly Like -only for SFX
  • -sfxuntil Like -until for SFX