
Releases: fcorbelli/zpaqfranz

Windows 32/64 binary, HW accelerated, Linux, FreeBSD

21 Apr 11:47
d2f9fca

This is a brand new branch, full of bugs, ehm "features" :)

HW accelerated SHA1/SHA2

Up to version 57, hardware acceleration was only available in the Windows version (zpaqfranzhw.exe)
From version 58 (obviously still to be tested) it can also be activated on other systems (newer Linux/BSD-based AMD/Intel machines), via the compilation switch -DHWSHA2

zpaqfranz (should) then autodetect the availability of those CPU extensions; nothing is needed from the user
It is possible to force it with the -hw switch
To see more "things" use b -debug
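The real detection is done in C via CPUID; as a rough illustration of the same idea (a hypothetical, Linux-flavoured sketch, not zpaqfranz code), hardware SHA support means the ssse3, sse4_1 and sha_ni feature flags are all present, as they would appear in /proc/cpuinfo:

```python
# Hypothetical sketch of HW-SHA detection: all three flags must be present,
# mirroring the SSSE3/SSE41/SHA checks shown by "zpaqfranz b -debug".
REQUIRED_FLAGS = {"ssse3", "sse4_1", "sha_ni"}

def supports_hw_sha(cpuinfo_text: str) -> bool:
    """True if the first 'flags' line contains every required feature."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            return REQUIRED_FLAGS <= flags
    return False
```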

TRANSLATION

If you compile with -DHWSHA2 you will get something like this

zpaqfranz v58.1e-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-03-21)

In this example this is an Intel (JIT) executable, with a (kind of) GUI (on Windows), HW BLAKE3 acceleration, SHA1/2 HW acceleration, and the Win SFX64-bit module (build 55.1)

So far, so good

Then run

zpaqfranz b -debug

If you are lucky you will get something like

(...)
zpaqfranz v58.1e-JIT-GUI-L,HW BLAKE3,SHA1/2,SFX64 v55.1,(2023-03-21)
FULL exename <<C:/zpaqfranz/release/58_1/zpaqfranz.exe>>
42993: The chosen algo 3 SHA-1
1838: new ecx 2130194955
1843: new ebx 563910569
SSSE3 :OK
SSE41 :OK
SHA   :OK
DETECTED SHA1/2 HW INSTRUCTIONS
(...)

zpaqfranz will "automagically" run HW acceleration, because your CPU has the SSSE3, SSE4.1 and SHA extensions
Of course, if you get a "NO"... bye bye

These CPUs should be the AMD Zen family (Ryzen, Threadripper, etc.), Intel mobile 10th generation and later, and Intel desktop 11th generation and later

BTW the old zpaqfranzhw.exe (Win64) is

zpaqfranz v58.1e-JIT-GUI-L,HW BLAKE3,SHA1,SFX64 v55.1,(2023-03-21)

Beware: this is SHA1 acceleration, NOT SHA1/2. Therefore you will need to enter the -hw switch manually (to enable it)

RECAP

  • With -DHWSHA2 enabled, zpaqfranz will detect and use the HW acceleration, if it thinks your CPU supports it
  • If, for some reason, you want to force its use, even on CPUs that do not officially have these extensions, use the switch -hw; usually you will get a segmentation fault or something like that (depending on the operating system), not my fault
  • If you want to know if zpaqfranz "thinks" that your CPU is enabled, use zpaqfranz b -debug and look at the output
  • Will you get a huge improvement in compression times? No, not really. You will have the biggest difference if you use SHA256 hashing functions, which benefit so much from the acceleration. SHA1 much less (the software version is already very fast)
  • Is -DHWSHA2 faster than -DHWSHA1 ? In fact, no. SHA1 is "just a tiny bit" faster. Why? Too long to explain.
  • Why does my relatively modern Intel CPU not seem to support it? Who knows; the short version: not my fault. Intel has not equipped even relatively recent CPUs with these extensions
  • Does it work on SPARC-ARM-PowerPC-whatever-strange-thing? Of course NO
  • Is it production-safe? Of course NOT. As this is the very first release, some nasty things can happen

Luke, remember. The more feedback, the more bug-fixing. Luke, report bugs, use the Force...

And don't forget the GitHub star and SourceForge review! (I am becoming like a YouTuber who invites people to subscribe to channels LOL)

Other news

Some refactoring, to become more "Mac-friendly" (here the risk of introducing bugs is considerable; sorry, I will correct them as I go along)

Using MD5 instead of XXH3 in checktxt (supporting Hetzner storagebox, there is still work to be done)

Some "GUI" improvement (In perspective, I am preparing the possibility of selecting some files to extract, but it still needs development)

No more dd embedded (smaller source size)

Download zpaqfranz

Windows 32/64 binary, 64 bit-HW accelerated

29 Mar 12:56
d61c814

Changed help

Rationalisation of help

zpaqfranz
zpaqfranz h
zpaqfranz h h
zpaqfranz h full

Multioperation (with wildcards)

In commands t and x (test, extract)

zpaqfranz t *.zpaq ...
zpaqfranz x pippo*.zpaq...

Initial (kind of) text based GUI (Windows)

The new gui command opens a (rudimentary) ncurses-based GUI for listing, sorting, selecting and extracting files
Yes, I know, the vim-style syntax is not exactly user-friendly; there will be future improvements

Under Windows, compiling with the -DGUI switch, you can do something like

zpaqfranz gui 1.zpaq

The vim-like commands are
f F / => find substring
Cursor arrows up-down left-right => page up, page down, line up, line down
+ - => move line up/down
: => goto line
m M => set minsize / Maxsize
d D => set datefrom / Dateto
q Q ESC => exit
F1 sort name, F2 sort size, F3 sort date, F4 sort ext, F5 sort hash
F6 show size, F7 show date, F8 show hash, F9 show stdout
t => change -to
s => searchfrom
r => replaceto
x => extract visible rows

In this example we want to extract all the .cpp files as .bak from the 1.zpaq archive. This is something you typically cannot do with other archivers such as tar, 7z, rar, etc.

With a "sort of" WYSIWYG 'composer'

First the f key (find), entering .cpp
Then s (search) for every .cpp substring
Then r (replace) with .bak
Then t (to) for the z:\example folder
Finally x to run the extraction
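The composer steps above boil down to a filter plus a search/replace over the visible rows. A minimal sketch of that logic (hypothetical function and names, not zpaqfranz internals):

```python
# Sketch of the f/s/r/t/x composer: filter rows by a substring, then rewrite
# each matching path into a (source, destination) pair for extraction.
def compose(paths, find, search, replace, to):
    visible = [p for p in paths if find in p]                       # f => find
    return [(p, to + p.replace(search, replace)) for p in visible]  # s, r, t

pairs = compose(["src/main.cpp", "src/readme.md"], ".cpp", ".cpp", ".bak", "z:/example/")
# pairs == [("src/main.cpp", "z:/example/src/main.bak")]  -> x would extract these
```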

gui.mp4

In the medium term, in addition to bug fixes, box filters etc., there will be a PAKKA-style sorted list, or time machine style, with versions of individual files

Download zpaqfranz

Windows 32/64 binary, HW accelerated, Linux, FreeBSD

12 Mar 16:44
d12ae32

New command: 1on1

Deduplicate a folder against another one, by filename and checksum, or by checksum only

Julius Erving and Larry Bird Go One on One

This file-level deduplication function identifies files inside folders that have been 'manually' duplicated, e.g. by copy-paste

I did not find portable and, especially, fast programs for this: they often use very... naive approaches (NxM comparisons), with quite heavy slowdowns.
By using the -ssd switch it is possible to activate multithreading, which allows, in the real world, GB/s-level performance

To make things clear: the files inside the -deleteinto folder are the ones that will (possibly) be deleted
Dry run (no -kill), same hash, same filename, multithreaded

zpaqfranz 1on1 c:\dropbox -deleteinto z:\pippero2 -ssd

Real run (because of -kill), including zero-length files

zpaqfranz 1on1 c:\dropbox -deleteinto z:\pippero2 -zero -kill

Real run, with XXH3, with everything (even files inside .zfs). This will delete files with a DIFFERENT name BUT the same content

zpaqfranz 1on1 c:\dropbox -deleteinto z:\pippero2 -xxh3 -kill -forcezfs

Updated zfs-something commands

zfsadd

Now supports almost every zpaqfranz switch, getting the timestamp from the snapshot itself, not from the snapshot name

Suppose you have something like this

tank/pippo@franco00000001
tank/pippo@franco00000002
tank/pippo@franco00000003
(...)
tank/pippo@franco00001025

You want to purge those snapshots, while retaining the data, getting everything inside consolidated.zpaq

zpaqfranz zfsadd /tmp/consolidated.zpaq "tank/pippo" "franco" -force

You can also archive just a single folder; read the help!

Then you can purge with

zpaqfranz zfspurge "tank/pippo" "franco" -script launchme.sh

This method is certainly slow, because it requires an exorbitant amount of processing. However, the result is to obtain a single archive that keeps the data in a highly compressed format, which can eventually be extracted at the level of a single version-snapshot

In short, long-term archiving for anti-ransomware policy
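A sketch of what a purge pass like the one above amounts to (hypothetical helper, not the real zfspurge implementation): given the snapshot list, emit one zfs destroy per snapshot whose name matches the given mark, into a script to review before running:

```python
# Hypothetical sketch: build a "launchme.sh"-style script that destroys every
# snapshot of `dataset` whose snapshot name starts with `mark`.
def purge_script(snapshots, dataset, mark):
    lines = ["#!/bin/sh"]
    prefix = f"{dataset}@{mark}"
    lines += [f"zfs destroy {s}" for s in snapshots if s.startswith(prefix)]
    return "\n".join(lines)

print(purge_script(
    ["tank/pippo@franco00000001", "tank/pippo@manual"], "tank/pippo", "franco"))
```

Writing a script instead of destroying directly keeps the destructive step reviewable, in the same spirit as -script launchme.sh.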

Improved zfsreceive

This VERY long-term archiving of zfs snapshots is now tested for 1000+ snapshots on 300GB+ datasets; it should be fine

Example: "unpack" all zfs snapshots (made by the zpaqfranz zfsbackup command) from ordinato.zpaq into the new dataset rpool/restored

zpaqfranz zfsreceive /tmp/ordinato.zpaq rpool/restored -script myscript.sh

Then run the myscript.sh

Download zpaqfranz

Windows 32/64 binary, HW accelerated, ESXi, Linux, Free/Open BSD

15 Feb 11:44
d12ae32

Initial support for proxmox backup/restore, on zfs

Proxmox is a Debian-based virtualiser that I like; it has a similar style to what I would have done myself.
It has a particular backup mechanism (with an external product, Proxmox Backup) that is very interesting (it looks like a free Nakivo, for those involved in virtualisation).
The 'internal' backups are done by vzdump, also good, BUT it operates with 'normal' compression (zstd for example), without deduplication AND WITHOUT ENCRYPTION (this is bad, very bad)

Proxmox also supports zfs storage, but in a way I do not like much, namely zvols

For those not used to zfs, these are 'volumes' written in blocks, so they are not accessible as files (yep, sometimes not 'everything' is a file)
The aim is to have more performance by removing the 'intermediate' layer of virtual-disks-on-files.
However, this makes backups a real nightmare, as there is no 'easy' way to make them (there are no .vmdk or .raw to copy back and forth)

It is possible to force this behaviour (i.e. save to file instead of zvol). I will not go into details.
In this hypothesis (i.e. that the virtual machines are 'real' files, normally present in the /var/lib/vz folder, itself on a zfs storage), I have implemented two new functions for zpaqfranz to make zfs-based-snapshotted proxmox-backups

To reiterate: proxmox supports a thousand different types of storage, I chose the one I like, maybe in the future I will make zpaqfranz more 'smart'. For now I basically use it for my proxmox server backups, together with proxmox backup (better two technologies than one)

Proxmox, although open source, is commercially supported by a German company and, therefore, they are understandably not very keen on alternative tools to those offered by them.
(they deleted a thread on their forum without any explanation :)

I tried to contact the developer of proxmox backup systems by e-mail, with no response.
So I share - with anyone who is interested - my little experience

zfsproxbackup

Archiving proxmox backups (from zfs local storage), getting VM disks from /var/lib/vz

  • -force Destroy temporary snapshot (before backup)
  • -kill Remove snapshot (after backup)
  • -all Get all VM
  • -not Do not backup (exclude) VMs
  • -snapshot kj Set the snapshot name to kj (default: francoproxmox)
Backup/encrypt w/key 'pippo' VM 200  zfsproxbackup /bak/200.zpaq 200 -force -kill -key pippo
Backup 2 VMs: 200 and 300            zfsproxbackup /bak/200_300.zpaq 200 300
Backup ALL VMs                       zfsproxbackup /bak/all.zpaq  -all -force -kill
Backup all EXCEPT 200 and 300        zfsproxbackup /bak/part.zpaq -all -not 200 -not 300 -force -kill

This is a "real world" example of backing up a zfs-based FreeBSD mailserver
Please note the size taken


/usr/local/bin/zpaqfranz zfsproxbackup /backup/200_posta_G6.zpaq 200 -force -kill -key pippo

zpaqfranz v57.3f-JIT-L, (14 Feb 2023)
franz:-key                               (hidden)
franz:-force -kill
zfsproxmox-backup VERY EXPERIMENTAL!
Works only on /var/lib/vz VM disk(s)
and        on /etc/pve/qemu-server/ config(s)
37720: running        Searching vz from zfs list...
38072: Founded pool   <<zp0/zd0>>
53549: Archive        /backup/200_posta_G6.zpaq
53550: Pool           zp0/zd0
53550: Purged Pool    zp0_zd0
53552: Mark           francoproxmox
38135: VM Path        000  /var/lib/vz/.zfs/snapshot/francoproxmox/images/200
37720: running        Destroy snapshot (if any)
38162: x_one          zfs destroy zp0/zd0@francoproxmox
37720: running        Taking snapshot
38162: x_one          zfs snapshot zp0/zd0@francoproxmox
/backup/200_posta_G6.zpaq:
15 versions, 18 files, 693.419 frags, 3.215 blks, 7.764.776.789 bytes (7.23 GB)
Updating /backup/200_posta_G6.zpaq at offset 7.764.776.789 + 0
Adding 42.954.981.615 (40.00 GB) in 2 files (1 dirs), 8 threads @ 2023-02-14 15:45:30
(001%)   1.00% 00:08:55 ( 409.63 MB)->(    0.00 B) of (  40.00 GB)   81.93 MB/se
(002%)   2.00% 00:07:07 ( 819.29 MB)->(    0.00 B) of (  40.00 GB)  102.41 MB/se
(...)
(099%)  99.00% 00:00:03 (  39.60 GB)->(  52.12 MB) of (  40.00 GB)  107.57 MB/se
(100%) 100.00% 00:00:00 (  40.00 GB)->(  52.12 MB) of (  40.00 GB)  107.52 MB/se
1 +added, 0 -removed.

7.764.776.789 + (42.954.981.615 -> 692.206.304 -> 57.182.470) = 7.821.959.259 @ 107.42 MB/s
37720: running        Destroy snapshot (if any)
38162: x_one          zfs destroy zp0/zd0@francoproxmox

381.635 seconds (000:06:21) (all OK)


zpaqfranz v57.3f-JIT-L, (14 Feb 2023)
franz:-key                               (hidden)
200_posta_G6.zpaq:
17 versions, 20 files, 702.583 frags, 3.263 blks, 7.826.948.327 bytes (7.29 GB)
-------------------------------------------------------------------------
< Ver  > <  date  > < time >  < added > <removed>    <    bytes added   >
-------------------------------------------------------------------------
00000001 2023-02-09 17:55:38  +00000003 -00000000 ->        6.654.630.534
00000002 2023-02-09 18:47:44  +00000001 -00000000 ->           17.394.282
00000003 2023-02-11 13:40:51  +00000001 -00000000 ->          252.707.222
00000004 2023-02-11 17:09:04  +00000001 -00000000 ->           74.337.419
00000005 2023-02-11 17:43:25  +00000001 -00000000 ->           15.669.831
00000006 2023-02-11 19:04:01  +00000001 -00000000 ->           17.670.788
00000007 2023-02-12 00:00:01  +00000001 -00000000 ->           72.951.575
00000008 2023-02-12 08:00:01  +00000001 -00000000 ->           94.408.432
00000009 2023-02-12 16:00:01  +00000001 -00000000 ->           89.868.811
00000010 2023-02-13 00:00:01  +00000001 -00000000 ->           89.430.987
00000011 2023-02-13 08:00:01  +00000001 -00000000 ->           84.165.485
00000012 2023-02-13 16:00:01  +00000001 -00000000 ->           91.936.236
00000013 2023-02-13 17:30:36  +00000002 -00000000 ->           16.994.079
00000014 2023-02-14 00:00:01  +00000001 -00000000 ->           92.889.622
00000015 2023-02-14 08:00:01  +00000001 -00000000 ->           99.721.454
00000016 2023-02-14 15:45:30  +00000001 -00000000 ->           57.182.470
00000017 2023-02-14 16:00:01  +00000001 -00000000 ->            4.989.068

Today's update: at every run gets only some MBs of space (e-mails are generally small, deduplicable and compressible)

(100%) 100.00% 00:00:00 (  40.00 GB)->(  19.43 MB) of (  40.00 GB)   93.10 MB/se1 +added, 0 -removed.

7.951.716.338 + (42.954.981.615 -> 254.326.790 -> 22.662.906) = 7.974.379.244 @ 92.96 MB/s
37720: running        Destroy snapshot (if any)
38162: x_one          zfs destroy zp0/zd0@francoproxmox

441.390 seconds (000:07:21) (all OK)

zfsproxrestore

The corresponding restore command, of course with the same "expectations"

Restore proxmox backups (on local storage) into /var/lib/vz and /etc/pve/qemu-server
Without file selection it restores everything; files can be a sequence of VMIDs (e.g. 200 300)

  • -kill Remove snapshot (after restore)
  • -not Do not restore (exclude)
Restore all VMs                      zfsproxrestore /backup/allvm.zpaq
Restore 2 VMs: 200 and 300           zfsproxrestore /backup/allvm.zpaq 200 300
Restore VM 200, release snapshot     zfsproxrestore /backup/allvm.zpaq 200 -kill
Restore all VMs, except 200          zfsproxrestore /backup/allvm.zpaq -not 200 -kill

pre-compiled binaries

In this release I have put binaries for various platforms; I am checking in particular Synology Intel-based NAS
zpaqfranz_linux "should" run just about everywhere (on 64-bit Intel systems); zpaqfranz_qnap_intel on (just about) every 32-bit Intel-based Linux-like system

Download zpaqfranz

Windows 32/64 binary, 64 bit-HW accelerated

28 Jan 15:45
4f6999b

Please consider 57.2 "to be thoroughly tested"


News:

Extraction with wildcards of multiple zpaqs

zpaqfranz x c:\zpaq\the*.zpaq -to z:\allin

-replace works with "" (requested by a user to "erase" pieces of paths)

Many "features" introduced!

Just kidding: various fixing due to internal refactoring

Download zpaqfranz

Windows 32/64 binary, HW accelerated, ESXi, Linux, Free/Open BSD

25 Jan 11:44
1f3acfa

This is the first release (to be tested) of the new 57 series

Like any first release it has no bugs, but features :)

Remember: the more feedback (even negative) I receive, the greater the likelihood of improving zpaqfranz. And do not forget, please, the star on GitHub or a review on SourceForge :)

The main difference is the internal refactoring (which can cause subtle problems in parameter/switch recognition), and especially the inclusion of a new metadata storage "package"
The new V3 (testable for now with whirlpool and highway) stores (or rather, will store) additional useful information and, in the future, also some sort of POSIX-style data
In short - to summarize - it will facilitate restoring symlinks, ownership etc. on *nix, similar to tar

So externally the changes look modest, but internally they are numerous

New hashers that can be used inside archive: whirlpool, highway 64/128/256

In addition to the control hashes already supported within archives (XXHASH|SHA-1|SHA-2|SHA-3|MD5|XXH3|BLAKE3), you can now choose

  • whirlpool. It is a "slow" hash that creates very large footprints, but it is based on a completely different technology than the others. I like it very much
  • highway. It is a hash developed by two very good programmers on GitHub, which is actually not designed for use with large amounts of data (like zpaqfranz), but rather for (relatively) "small" packet indexing. In zpaqfranz there are 3 different "versions" (actually it is the same hash, so there is no difference in speed) for 64, 128 and 256 bits. So it is (as you can understand) most useful for quick debugging with different-length hashes. The implementation is "straight" C (no AVX2 acceleration etc.), and is not tested (at present) on BIG ENDIAN, SPARC or "strange" systems. In fact, a debug tool
zpaqfranz a z:\1.zpaq c:\nz -whirlpool
zpaqfranz a z:\1.zpaq c:\nz -highway64
zpaqfranz a z:\1.zpaq c:\nz -highway128
zpaqfranz a z:\1.zpaq c:\nz -highway256
zpaqfranz l z:\1.zpaq -checksum

The dir command is now better than... dir

As is well known, or maybe not, if the zpaqfranz executable is called "dir" it works, roughly, like the Windows dir
I use it so much on Linux and FreeBSD where the ls command doesn't look anything like what is needed for a storage manager (you need numerous other commands, hard to remember concatenations etc)

zpaqfranz dir c:\*.cpp /s /os
dir c:\*.cpp /s /os -n 100

will show all the "*.cpp" in c:\ (with recursion /s), ordered by size (/os) and limited to 100 entries (-n 100)

zfsbackup a bit evolved

During use with simple datasets, so far, zfsbackup seems to work better than expected. Clearly this refers to systems with zfs (Linux+OpenZFS, FreeBSD, Solaris)
The -kill switch will delete temporary files (otherwise you need to manually "purge" the /tmp folder)

sparc64: -DALIGNMALLOC

For sparc64, an experimental switch that tries to align malloc(); it must be used at compile time

Haiku OS https://www.haiku-os.org/

Yep, zpaq 7.15 is already in Haiku OS :)
zpaqfranz, not very tested, can be compiled on Haiku R1/beta4, 64 bit (gcc 11.2.0), hrev56721 (maybe -pthread is redundant, but not a big deal)

g++ -O3 -Dunix zpaqfranz.cpp -o zpaqfranz  -pthread -static

TrueNAS

It is an appliance based on FreeBSD 13.x which, however, lacks a compiler.
zpaqfranz can run inside a GUI-made jail, or outside it (i.e. in the normal /usr/local/bin). The second case, of course, enables every function (including zfs backups), but the binary has to be "injected" manually (with an SSH session, for example). Maybe I'll do a little HOW-TO

Comment on the source

The curious will see that the (partial) refactoring is "strange," as it does not use very convenient features (e.g., RTTI) that would make it more compact and elegant. This is because of the inability to get "modern" compilers everywhere; in short, backward compatibility. They will also notice that the handling of Boolean flags is peculiar. The reason is performance within highly CPU-bound loops, as in zpaqfranz. They will note that sometimes maps are used where unordered_map would be more efficient. But I can't, because they simply don't exist on certain systems (!). In short, it is the best tradeoff I have found (so far) between conciseness, maintainability, and breadth of supported platforms. Sometimes even to_string or atoi does not exist :)

The binaries

In this very first release there are

  • zpaqfranz.exe (64bit Windows)
  • zpaqfranz32.exe (32bit Windows)
  • zpaqfranzhw.exe (64bit Windows w/HW SHA-1 acceleration via -hw switch, usually for AMD)
  • zpaqfranz_esxi (32bit vSphere "maybe-will-run")
  • zpaqfranz_freebsd (64bit statically linked)
  • zpaqfranz_linux (64bit statically linked)
  • zpaqfranz_openbsd (64bit statically linked)
  • zpaqfranz_qnap (QNAP NAS TS-431P3 Annapurna AL314)

Of course the "right" way, which I recommend, is to download the source and compile directly from scratch.

I attach the binaries because they are convenient for me for quick tests on as many systems as I can get

Download zpaqfranz

Windows 32/64 binary, FreeBSD statically linked

30 Dec 19:18
af8bc17

First (public) release with zfsbackup-zfsreceive-zfsrestore


NOTICE. There are virtually no tests on input parameters, so use caution. If you need help, just ask. In the next release I will include stringent checks


And now... THE SPIEGONE!

This release contains numerous features, both commonly used and specific to zfs

The first function is versum, something similar to a "smarter" hashdeep

Basically: it verifies the hashes of the files in the filesystem against a list in a text file
We want to verify that backup-restore with zfs works well, without blind trust
It can be fed two types of files: those created by zpaqfranz itself, and those of hashdeep.

The former are written by the sum command with the appropriate switches (e.g. zpaqfranz sum *.txt -xxh3). BTW, zpaqfranz can both write and read (-hashdeep switch) this file format.

In the following examples we will operate on the tank/d dataset with SSD/NVMe, working on fc (yep, francocorbelli snapshot)

You can use all of zpaqfranz's hash types; in this example it will be xxhash64
Incidentally, -forcezfs is used to have the example folder examined (it contains .zfs, being a snapshot); otherwise zpaqfranz would ignore it

zpaqfranz sum /tank/d/.zfs/snapshot/fc -forcezfs -ssd -xxhash -noeta -silent -out /tmp/hash_xx64.txt

A possible alternative, to have third-party control (i.e. software other than zpaqfranz), is to use hashdeep, usually found in the md5deep package
The essential difference of hashdeep from md5deep is the use of multithreading: it reads files from disk in parallel, so suppose we are operating with solid-state disks (or ... wait longer :) )

Various hashes can be selected, but since they are basically used as checksums and not as cryptographic signatures, md5 is more than fine (it is the fastest), at least for me

hashdeep -c md5 -r /tank/d/.zfs/snapshot/fc >/tmp/hashdeep.txt

BTW, hashdeep does not have a find-replace function; awk or sed is commonly used. Uncomfortable, to say the least

To check the contents of the filesystem we have three options

  1. the zpaqfranz hash list
     In this example a multithreaded operation (-ssd) will be adopted, with a renaming (-find/-replace) to convert the paths in the source file to the target ones

zpaqfranz versum z:\uno\_tmp\hash_xx64.txt -ssd -find /tank/d/.zfs/snapshot/fc -replace z:\uno\_tank\d

  2. the hashdeep list
     zpaqfranz is able to 'understand' the original format of hashdeep (look at the -hashdeep switch)

zpaqfranz versum z:\uno\_tmp\hashdeep.txt -hashdeep -ssd -find /tank/d/.zfs/snapshot/fc -replace z:\uno\_tank\d

  3. small-scale test, without reading from the filesystem
     If the hash function used to create the .zpaq file is the same as that of the .txt control file, you can operate as follows

zpaqfranz versum z:\uno\_tmp\hash_xx64.txt -to thebak.zpaq -find /tank/d/.zfs/snapshot/fc -replace /tank/d

It should be remembered that the default hash of zpaqfranz is xxhash64, so if you want to use other hashes (e.g. xxh3, sha256 or sha3 etc.) you must add the relevant switch (e.g. -xxh3, -sha3, -blake3 etc.) when creating the .zpaq file (the a command)
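A minimal sketch of the versum idea under stated assumptions: the hash-list format here (one "hash path" pair per line) and the read_bytes callback are hypothetical simplifications, and hashlib.md5 stands in for the selectable hashes. The -find/-replace path rewrite is the key step when verifying on a different system:

```python
import hashlib

def versum(hashlist_lines, find, replace, read_bytes):
    """Return the rewritten paths whose recomputed md5 does not match the list."""
    bad = []
    for line in hashlist_lines:
        want, path = line.split(None, 1)
        path = path.strip().replace(find, replace)   # -find / -replace rewrite
        if hashlib.md5(read_bytes(path)).hexdigest() != want:
            bad.append(path)
    return bad
```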

Recap

Complete example of creating an archive (on FreeBSD with zfs) to be then extracted on Windows with independent control

The source will be tank/d using -ssd for multithread

Take the snapshot fc of tank/d

zfs snapshot tank/d@fc

Get the hash list with xxhash64 into the file /tmp/hash_xx64.txt

zpaqfranz sum /tank/d/.zfs/snapshot/fc -forcezfs -ssd -xxhash -noeta -silent -out /tmp/hash_xx64.txt

Create hashdeep.txt with md5 into /tmp/hashdeep.txt, using md5 because it is very fast
WARNING: /sbin/zfs set snapdir=hidden tank/d may be required to "hide" .zfs folders from hashdeep. There is no easy way to exclude folders in hashdeep

hashdeep -c md5 -r /tank/d/.zfs/snapshot/fc >/tmp/hashdeep.txt

Now make the backup (fixing the path with -to)
In this case the default hash function is used (xxhash64), matching hash_xx64.txt
We "inject" the two hash lists, /tmp/hash_xx64.txt and /tmp/hashdeep.txt, to keep them with the archive

zpaqfranz a /tmp/thebak.zpaq /tank/d/.zfs/snapshot/fc /tmp/hash_xx64.txt /tmp/hashdeep.txt -to /tank/d

Destroy the snapshot

zfs destroy tank/d@fc

Now transfer thebak.zpaq to Windows somehow (usually with rsync)

Extracting everything to z:\uno (look at -longpath)

zpaqfranz x thebak.zpaq -to z:\uno -longpath

Verify files by zpaqfranz's hash list
Note the -find and -replace to fix source (on FreeBSD) and destination (on Windows) paths

zpaqfranz versum z:\uno\_tmp\hash_xx64.txt -ssd -find /tank/d/.zfs/snapshot/fc -replace z:\uno\_tank\d

Now paranoid double-check with hashdeep.
Please note the -hashdeep

zpaqfranz versum z:\uno\_tmp\hashdeep.txt -hashdeep -ssd -find /tank/d/.zfs/snapshot/fc -replace z:\uno\_tank\d

Finally compare the hashes into the txt with the .zpaq

zpaqfranz versum z:\uno\_tmp\hash_xx64.txt -to thebak.zpaq -find /tank/d/.zfs/snapshot/fc -replace /tank/d

Short version: this is an example of how to perform on a completely different system (Windows) the verification of a copy made from a .zfs snapshot with zpaqfranz. We will see how, in reality, it is designed for "real" zfs backup-restore

New advanced option: the -stdout

If the files are stored in order into the .zpaq, it is possible to use -stdout

WHAT?

Files stored within a .zpaq are divided into fragments (let's say 'chunks') which, in general, are not sorted.
This happens for various reasons (I will not elaborate), preventing the extraction of files in stream form, i.e. as a sequence of bytes, as required by -stdout

This is not normally a serious problem for zpaq 7.15, which simply does not support mixing streamed and journaled files in an archive

Translation of the translation (!)

zpaq started out as a stream compressor (actually no, there would be a further very long explanation here that I will spare you)
It processes any long sequence of bytes, one byte at a time, and writes a sequence of bytes in output: this is the so-called streamed format

It was present in older versions of zpaq, something analogous to gz just to give a known example.

Subsequently, the developer of zpaq (Matt Mahoney) implemented the so-called 'journaled' storage format, where each file carries its various versions.
This is the 'normal' format, while the 'streamed' one has practically disappeared (vestiges remain in the source).

For a whole series of technical problems that I won't go into here, Mahoney decided not to allow the mixing of the two types:

  • archives WITH VERSIONS (aka: modern)

XOR

  • with streamed files (aka: OK for stdout)

The ability to write to stdout does not have much reason to exist, unless coupled with the ability to read from stdin, and zpaq 7.15 does not allow this, essentially operating by reading files from the filesystem "the usual way".

As you may have noticed (?) for some time now, I have instead evolved zpaqfranz to allow the processing of input streams (with -stdin)

The concrete reason is twofold

The first is to archive mysql dumps, whose tools (mysqldump and various similar ones) output precisely a text file.
This way, you can use zpaqfranz to archive them versioned (which as far as I know is superior to practically any other system, by a WIDE margin).

The second is to make sector-level copies of Windows drives, in particular the C disk:

As you may have noticed (?) zpaqfranz is now able to back up (from within Windows) an entire system, either 'file-based' (with a VSS) or 'dd-style'

Obviously the 'dd' method will take up more space (it is good to use the f command to fill the free space with zeros) and will also be slower

BUT
it allows you to mount (with other software) / extract (e.g. with 7z) almost everything

If you are really paranoid (like me), what could be better than a backup of every sector of the disk?

Let us now return to why it is so important (and actually not trivial) to obtain archives in journaled format but with the possibility of orderly (=streamed) extraction

It is about speed

Streamed .zpaq archives exist, BUT listing (i.e. enumerating the list of files therein) is very slow, requiring a scan of the entire file (which can be hundreds of GB = minutes)

They are also extremely slow to create (by the zpaqd 7.15 utility), being essentially single-threaded (~10MB/s)

Instead, by having journaled (i.e. 'normal' zpaq format) but ORDERED archives, I can obtain all the benefits

Various versions of the same file, listing speed, and even creation speed (maintaining multithreading), at the cost of a (not excessive) slowdown in output, due to the use of -stdout instead of the more efficient filesystem
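The "ordered" condition described above can be stated very simply (a conceptual sketch, not zpaqfranz's real data structures): a file can be emitted to stdout as a byte stream only if its fragments sit at monotonically increasing offsets inside the archive, so no seeking backwards is ever needed:

```python
# Conceptual check: a fragment sequence is stream-extractable only if it is
# already in archive order (strictly increasing offsets).
def is_streamable(fragment_offsets):
    return all(a < b for a, b in zip(fragment_offsets, fragment_offsets[1:]))

# ordered fragments: fine for -stdout; out-of-order ones would require seeking
```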

Why all this?

For zfs backups, of course, and especially restorations

There are now three new commands (actually very crude, to be developed, but that's the "main thing")

  1. zfsbackup
  2. zfsrestore
  3. zfsreceive

One normally uses .zpaq to make differential zfs backups, i.e. with one base file and N differential files, which are stored as different versions. This is good, it works well, and it is not fragile (differential means that two files are enough to restore). The "normal" method for "older" zpaqfranz.

BUT

it takes up space: as the differential snapshots get bigger and bigger, it is a normal problem for any differential system

On the other hand, using incremental zfs snapshots has always been very risky and fragile, because it only takes smal...


Windows 32/64 binary

06 Dec 16:00
351a707

Fixes for -longpath

Issue 42

Download zpaqfranz

Windows 32/64 binary

01 Dec 18:50
d13bf95

Windows imaging (first release)

Sector-level imaging of Windows partitions (admin rights required) by the internal imager or... dd (!)
Yes, there is now a (GNU coreutils) dd embedded in the Windows executable (used to test the buffered -stdin)
Suggestion: fill unused space with zeros before imaging (saves space)

zpaqfranz f c:\ -zero

How to extract the zpaq archive after formatting the C partition?

Simply, you can't :)
Imaging of partitions (of course C too) is NOT (yet) something like Acronis or Macrium
It is (or should be) a full-backup that, in case of emergency, you need to

  • Restore (extract the .img from the .zpaq)
  • Mount it with something else (e.g. OSFMount https://www.osforensics.com/tools/mount-disk-images.html), then copy-and-paste your files
  • OR open the .img with 7-Zip (supposing NTFS format), then extract with 7-Zip
  • OR write the image back with "something" (e.g. dd) into a virtual machine, or even to "real" HW (booting from a USB key, for example)

work in progress...

Switch -dd (Windows)

Make an image with dd, via a script (beware of antivirus)

zpaqfranz a z:\2.zpaq c: -dd

Two additional parameters: -minsize bs and -maxsize count (just like dd)
No other add() parameters can be used, except -key (for encryption)
Next releases: dd-over-VSS

Switch -image (Windows)

Use the internal imager to back up a partition. It is possible to use almost all the "normal" add switches PLUS the new -buffer, useful for SSDs

zpaqfranz a z:\2.zpaq e: -image
zpaqfranz a z:\2.zpaq c: -image -buffer 1MB -key pippo

-buffer X switch (in add)

Use a larger input buffer (zpaq's default is 4KB), typically 64KB or 1MB

-image switch in extraction (Windows)

Restore huge images with smart progress, shown by default every 20MB (or every -minsize X).
print_progress() cannot handle huge files (e.g. vmdks) due to seek-and-write on
filesystems without "smartness".
Example: a seek @ 300GB followed by a 1KB write means the
FS must write 300GB of zeros, then 1KB;
with a slow spinning drive this can seem a "freeze".
=> When extracting this kind of file use -image, and possibly -minsize something;
if something is 1, all writes (and seeks) will be shown.

zpaqfranz x copia.zpaq -to z:\prova\ -image
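The seek-and-write effect can be reproduced in miniature (a Python sketch, with 1 MiB standing in for the 300 GB of the example): one small write far past end-of-file makes the logical size jump, and a filesystem without sparse-file support must materialize all the intervening zeros.

```python
import os
import tempfile

offset = 1 << 20  # 1 MiB stand-in for the 300 GB of the example

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.seek(offset)        # jump far past the current end of file
    f.write(b"x" * 1024)  # then write just 1 KB

# Logical size is offset + 1 KB, even though only 1 KB was written;
# a non-sparse filesystem has to physically write all the zeros in between.
print(os.path.getsize(path))  # 1049600
os.remove(path)
```

On a sparse-aware filesystem the zeros cost nothing; on a "dumb" one (or a slow spinning drive) this is exactly the apparent freeze described above.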

buffered -stdin

Much faster; good for archiving a mysqldump piped into zpaqfranz
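The idea behind the buffering is simply to read stdin in large chunks instead of tiny ones, cutting syscall overhead. A minimal Python sketch (64 KB chunks, with an in-memory stream standing in for a real pipe):

```python
import hashlib
import io

def consume(stream, bufsize=64 * 1024):
    """Read a binary stream in bufsize chunks, hashing as we go."""
    h = hashlib.sha256()
    total = 0
    while True:
        chunk = stream.read(bufsize)
        if not chunk:  # empty read = end of stream
            break
        h.update(chunk)
        total += len(chunk)
    return total, h.hexdigest()

# Stand-in for `mysqldump | zpaqfranz a ... -stdin`:
data = b"fake-dump-" * 100_000
total, digest = consume(io.BytesIO(data))
print(total)  # 1000000
```

With a 4 KB buffer the same megabyte needs 16x more read calls than with 64 KB, which is where the speedup comes from.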

Minor fixes

  • On -NOJIT (e.g. Apple M1) show more info on a failed allocx
  • Extended Windows error decoder

References:
stdin
Apple M1
Imaging

If you want please leave a review on Sourceforge

and put a star on github (if you haven't already).

Any comment or suggestion is welcome. Thanks for collaboration


Windows 32/64 binary

15 Nov 17:26
a268fe4
Windows 32/64 binary (Pre-release)

First public release with (a bit of) support for the cloudpaq

The next "big thing" is zpaq-over-ssh, or (better) zpaqfranz-over-TCP-socket.
Some more details:

zpaqfranz can send an (encrypted) copy of the archive via "internet" (TCP) to a remote server, running the new (soon-to-be-released) cloudpaq

The server will make a (lot) of checks before updating its local version of the archive
Of course all this "mess" is for a ransomware-resilient archiver

It is rather hard to implement: currently it works with an entire file, but sending subsequent updates still needs to be implemented

The security methods used are definitely overkill: in practice I expect to use it over an ssh tunnel, so the authentication and encryption issues are actually redundant. But after all it is a hobby.

However, in my spare time maybe I will complete it :)

News against 55.x

-flagbig

Shows a BIG ASCII-text verdict in the result. Useful for results sent by crontabbed e-mail

C:\zpaqfranz>zpaqfranz t z:\1.zpaq -big
zpaqfranz v56.1j-JIT-L (HW BLAKE3), SFX64 v55.1, (15 Nov 2022)
franz:-big
Archive seems encrypted (or corrupted)
Enter password :***
z:/1.zpaq: zpaqfranz error:password incorrect
23013: zpaqfranz error: password incorrect

1.453 seconds (000:00:01) (with errors)


####### ######  ######  ####### ######  ###
#       #     # #     # #     # #     # ###
#       #     # #     # #     # #     # ###
#####   ######  ######  #     # ######   #
#       #   #   #   #   #     # #   #
#       #    #  #    #  #     # #    #  ###
####### #     # #     # ####### #     # ###

-checktxt

After an "add", creates a file with a full XXH3 hash of the archive (note: without a parameter it takes the same name as the archive, with a .txt extension)

zpaqfranz a z:\knb.zpaq c:\nz\ -checktxt z:\pippo.txt
(...)
zpaqfranz v56.1j-JIT-L (HW BLAKE3), SFX64 v55.1, (15 Nov 2022)
Creating XXH3 check txt on z:/pippo.txt

44202: final XXH3: hash 6C68C17F625AF11AA734E8D122241789

Now you can send the .txt file (with rsync, in the future with cloudpaq) to a remote server, then do a quick check

C:\zpaqfranz>zpaqfranz sum z:\knb.zpaq -checktxt z:\pippo.txt -big
zpaqfranz v56.1j-JIT-L (HW BLAKE3), SFX64 v55.1, (15 Nov 2022)
franz:-big
franz:checktxt   <<z:/pippo.txt>>
Checking XXH3 (because of -checktxt) on z:/pippo.txt
Hash from checktxt |6C68C17F625AF11AA734E8D122241789|

0.063 seconds (00:00:00) (all OK)


####### #    #
#     # #   #
#     # #  #
#     # ###
#     # #  #
#     # #   #
####### #    #

Now - on the remote server - something like

/usr/local/bin/zpaqfranz dir "/home/pizza/copie/" -noeta >/tmp/checkpizza.txt
/usr/local/bin/zpaqfranz sum /home/pizza/copie/$1 -big -noeta -checktxt /home/pizza/copie/$2 >>/tmp/checkpizza.txt

if [ -f /tmp/checkpizza.txt ]; then
    /usr/local/bin/smtp-cli --missing-modules-ok  (...) -cc $3 -subject  "CHECK-pizza-backup" -body-plain=/tmp/checkpizza.txt
fi

Short version: quickly compare a local archive with an rsync-ed remote one, taking the remote server's free space too, by e-mail
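The whole checktxt idea fits in a few lines: hash the archive, store the hex digest in a .txt, re-hash on the remote side and compare. A Python sketch of the concept (hashlib.sha256 here as a stand-in for XXH3, which is not in the standard library; function names are hypothetical):

```python
import hashlib

def file_hash(path):
    """Full-file hash, streamed in 64 KB chunks (stand-in for XXH3)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(64 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()

def write_checktxt(archive, txt):
    # Sending side: store the archive's digest next to it.
    with open(txt, "w") as f:
        f.write(file_hash(archive))

def verify_checktxt(archive, txt):
    # Remote side: re-hash and compare (the `sum -checktxt` role).
    with open(txt) as f:
        return f.read().strip() == file_hash(archive)
```

Any corruption or truncation during the rsync transfer changes the recomputed digest, so the remote check fails loudly instead of silently.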

-stdin

During add it is now possible to read from stdin. Typically to capture mysqldump's output, or a full image of a live Windows (!) via dd

zpaqfranz a z:\1.zpaq mydump.sql -stdin

Or...

c:\nz\dd if="\\\\.\\c:" bs=1048576 count=100000000000 |c:\zpaqfranz\zpaqfranz a j:\image\prova cimage.img -stdin

-windate (on Win)

Add/restore the file's creation date (if any). Please note: this will force the xxhash64 hash [the default]

C:\zpaqfranz>zpaqfranz a z:\2 *.cpp -windate
zpaqfranz v56.1j-JIT-L (HW BLAKE3), SFX64 v55.1, (15 Nov 2022)
franz:winhash64 (-windate)
Creating z:/2.zpaq at offset 0 + 0
(...)
0.250 seconds (00:00:00) (all OK)

C:\zpaqfranz>zpaqfranz x z:\2.zpaq -to z:\kajo -windate

-all -comment (in x, extract)

Mark extracted versions with ASCII comment, if present

zpaqfranz x copia.zpaq -to z:\prova\ -all -comment

range in extraction

Extract the files added from version X to Y, or 1 to X, or X until end. Something like github (!)

In this example I want to get all the different versions of the comp.pas source code, from version 100 to 1000

zpaqfranz x copia.zpaq -only *comp.pas -to z:\allcomp -all -range 100-1000

in utf command -fix255

To quickly find over-long file names in a folder

zpaqfranz utf c:\vm -fix255

-isopen on a (add)

Quickly abort if the file is already open (on Windows). For backing up multiple virtual machines into the same archive

fixes

  • On Windows with -longpath, paths are now sanitized better
  • Better (but not perfect) handling for the OneDrive folder #37
  • Skip .zfs by default in dir mode
  • Extended -debug output (trying to catch some really weird NTFS "something")
  • Fixed output
