--stats strange output for "All archives" #5055

Closed
stream1972 opened this issue Mar 20, 2020 · 9 comments

@stream1972

Have you checked borgbackup docs, FAQ, and open Github issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

Question, but possibly an issue

System information. For client/server mode post info for both machines.

Your borg version (borg -V).

1.1.11

Operating system (distribution) and version.

Client: Linux 4.9.0-8-amd64 #1 SMP Debian 4.9.130-2 (2018-10-27) x86_64 GNU/Linux
Server: Ubuntu 14.04.6 LTS (GNU/Linux 3.13.0-139-generic x86_64)

Hardware / network configuration, and filesystems used.

Backup over ssh, ext4 -> ext4

How much data is handled by borg?

15-20G source, repositories up to 10G

Full borg commandline that lead to the problem (leave away excludes and passwords)

export BORG_RSH='ssh -oBatchMode=yes'
export BORG_REPO='ssh://user@xxxxx.yyyyyy.zz:12345/./dbbackup'
export BORG_PASSPHRASE=sometext

borg create -s -v --show-rc --chunker-params=13,23,18,4095 --compression lzma ::dbbackup_{utcnow:%Y-%m-%d_%H:%M:%S} data_1.sql data_2.sql data_3.sql data_4.sql data_5.sql data_6.sql

Describe the problem you're observing.

This is output of --stats after first backup to the fresh, just created repository:

Creating archive at "ssh://user@xxxxx.yyyy.zz:12345/./dbbackup::dbbackup_2020-03-15_08:03:16"
------------------------------------------------------------------------------
Archive name: dbbackup_2020-03-15_08:03:16
Archive fingerprint: 70cee52792dfc966ec8d3783a4e81f2e0f9412be780cf61820e7dfeb3bbf028d
Time (start): Sun, 2020-03-15 08:03:18
Time (end):   Sun, 2020-03-15 08:55:29
Duration: 52 minutes 10.41 seconds
Number of files: 6
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               10.50 GB              1.39 GB              1.39 GB
All archives:               20.75 GB              2.74 GB              1.39 GB

                       Unique chunks         Total chunks
Chunk index:                   36559                72326
------------------------------------------------------------------------------
terminating with success status, rc 0

Note the numbers for "All archives". There is only one archive in this repository, but the numbers are almost doubled.

After second backup, "All archives" continues to grow:

------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               10.66 GB              1.40 GB            479.56 MB
All archives:               31.42 GB              4.14 GB              1.87 GB

But after two backups the total should be (10.50 + 10.66) GB, not 31.

The backup contains a few database dumps: one huge file (~10 GB) and a few small ones.

$borg list ::dbbackup_2020-03-15_08:03:16
-rw-r--r-- user user     1767 Sun, 2020-03-15 08:00:01 dbbackup_slave_2020-03-15_0800.txt
-rw-r--r-- user user 10254073908 Sun, 2020-03-15 08:03:11 dbbackup_1_2020-03-15_0800.sql
-rw-r--r-- user user 96474541 Sun, 2020-03-15 08:03:12 dbbackup_2_2020-03-15_0800.sql
-rw-r--r-- user user 55927632 Sun, 2020-03-15 08:03:13 dbbackup_3_2020-03-15_0800.sql
-rw-r--r-- user user 42570884 Sun, 2020-03-15 08:03:14 dbbackup_4_2020-03-15_0800.sql
-rw-r--r-- user user 44977907 Sun, 2020-03-15 08:03:15 dbbackup_5_2020-03-15_0800.sql

It looks like this huge file was counted twice. Is this working as intended? If so, it's misleading.
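
A quick arithmetic check supports the double-counting hypothesis. The sketch below uses only the sizes reported above; it assumes that borg's "GB" here means 10^9 bytes, and the variable names are purely illustrative.

```python
# Sizes copied from the --stats and `borg list` output above.
# Assumption: borg reports decimal units here (1 GB = 10**9 bytes).
GB = 10**9

this_archive_gb = 10.50          # "This archive", original size
all_archives_gb = 20.75          # "All archives", original size
huge_file_bytes = 10254073908    # dbbackup_1_2020-03-15_0800.sql

# How much "All archives" exceeds the only archive in the repository.
excess_gb = all_archives_gb - this_archive_gb
huge_file_gb = huge_file_bytes / GB

print(f"excess in 'All archives': {excess_gb:.2f} GB")
print(f"size of the huge file:    {huge_file_gb:.2f} GB")
```

Both values round to the same figure, which is consistent with the large dump being included twice in the "All archives" total.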

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

Yes, I made another new repository (this time a backup of a user home directory: different files, some big archives, some text files of varying length) and again, after the first backup, the numbers in "All archives" are larger than in "This archive", although "this" is the only archive in the repository. The difference is not as big, but it still exists.

Creating archive at "ssh://user@xxxxxx.yyyyy.zz:12345/./data::data_backup_2020-03-20_13:03:01"
------------------------------------------------------------------------------
Archive name: data_backup_2020-03-20_13:03:01
Archive fingerprint: 072547b1d2db50e2303efc507cba1343677e9f98fde2dae6df01231849d5c0df
Time (start): Fri, 2020-03-20 13:03:07
Time (end):   Fri, 2020-03-20 14:33:35
Duration: 1 hours 30 minutes 28.61 seconds
Number of files: 5740
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               17.57 GB              7.83 GB              7.83 GB
All archives:               18.21 GB              8.46 GB              7.83 GB

                       Unique chunks         Total chunks
Chunk index:                   70316                72651
------------------------------------------------------------------------------
terminating with success status, rc 0

@ThomasWaldmann
Member

@infectormp not sure what you mean.

@ThomasWaldmann
Member

@stream1972 what do these commands show:

borg info REPO
and
borg info --consider-part-files REPO

@stream1972
Author

@ThomasWaldmann
No difference, same output from both. The repository currently has two backups, so the total should be 20-21 GB, but 31 GB is reported.

$ borg list
dbbackup_2020-03-15_08:03:16         Sun, 2020-03-15 08:03:18 [70cee52792dfc966ec8d3783a4e81f2e0f9412be780cf61820e7dfeb3bbf028d]
dbbackup_2020-03-17_08:03:20         Tue, 2020-03-17 08:03:23 [231182e90c390f3463a861698181f64f7fc9ab7d5bf2a14f027310a3cb706369]
$ borg info
Repository ID: fa55f02bec842600b2dec72c19680dc2d8c0fd25fcb3e8197d127b58443dcd46
Location: ssh://user@xxxx.yyyyy.zz:12345/./dbbackup
Encrypted: Yes (repokey)
Cache: /home/user/.cache/borg/fa55f02bec842600b2dec72c19680dc2d8c0fd25fcb3e8197d127b58443dcd46
Security dir: /home/user/.config/borg/security/fa55f02bec842600b2dec72c19680dc2d8c0fd25fcb3e8197d127b58443dcd46
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
All archives:               31.42 GB              4.14 GB              1.87 GB

                       Unique chunks         Total chunks
Chunk index:                   52184               109430
$ borg --consider-part-files info
Repository ID: fa55f02bec842600b2dec72c19680dc2d8c0fd25fcb3e8197d127b58443dcd46
Location: ssh://user@xxxx.yyyyy.zz:12345/./dbbackup
Encrypted: Yes (repokey)
Cache: /home/user/.cache/borg/fa55f02bec842600b2dec72c19680dc2d8c0fd25fcb3e8197d127b58443dcd46
Security dir: /home/user/.config/borg/security/fa55f02bec842600b2dec72c19680dc2d8c0fd25fcb3e8197d127b58443dcd46
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
All archives:               31.42 GB              4.14 GB              1.87 GB

                       Unique chunks         Total chunks
Chunk index:                   52184               109430

@stream1972
Author

Server command line (just in case):

command="cd /store1/repos && borg serve --restrict-to-path /store1/repos",no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding,no-user-rc <ssh key goes here>

@ThomasWaldmann
Member

Additionally, please run these:

borg info REPO::ARCHIVE
and
borg info --consider-part-files REPO::ARCHIVE

@stream1972
Author

Yep, they're here. I see there are many issues (mostly closed) about similar problems.

$ borg --consider-part-files info ::dbbackup_2020-03-15_08:03:16
Archive name: dbbackup_2020-03-15_08:03:16
Archive fingerprint: 70cee52792dfc966ec8d3783a4e81f2e0f9412be780cf61820e7dfeb3bbf028d
Comment:
Hostname: host
Username: user
Time (start): Sun, 2020-03-15 08:03:18
Time (end): Sun, 2020-03-15 08:55:29
Duration: 52 minutes 10.41 seconds
Number of files: 8
Command line: borg create -s -v --show-rc --chunker-params=13,23,18,4095 --compression lzma '::dbbackup_{utcnow:%Y-%m-%d_%H:%M:%S}' dbbackup_slave_2020-03-15_0800.txt dbbackup_1_2020-03-15_0800.sql dbbackup_2_2020-03-15_0800.sql dbbackup_3_2020-03-15_0800.sql dbbackup_4_2020-03-15_0800.sql dbbackup_5_2020-03-15_0800.sql
Utilization of maximum supported archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               20.75 GB              2.74 GB            470.21 MB
All archives:               31.42 GB              4.14 GB              1.87 GB

                       Unique chunks         Total chunks
Chunk index:                   52184               109430
$ borg --consider-part-files list ::dbbackup_2020-03-15_08:03:16
-rw-r--r-- user user     1767 Sun, 2020-03-15 08:00:01 dbbackup_slave_2020-03-15_0800.txt
-rw-r--r-- user user 5241145216 Sun, 2020-03-15 08:03:11 dbbackup_1_2020-03-15_0800.sql.borg_part_1
-rw-r--r-- user user 5012928692 Sun, 2020-03-15 08:03:11 dbbackup_1_2020-03-15_0800.sql.borg_part_2
-rw-r--r-- user user 10254073908 Sun, 2020-03-15 08:03:11 dbbackup_1_2020-03-15_0800.sql
-rw-r--r-- user user 96474541 Sun, 2020-03-15 08:03:12 dbbackup_2_2020-03-15_0800.sql
-rw-r--r-- user user 55927632 Sun, 2020-03-15 08:03:13 dbbackup_3_2020-03-15_0800.sql
-rw-r--r-- user user 42570884 Sun, 2020-03-15 08:03:14 dbbackup_4_2020-03-15_0800.sql
-rw-r--r-- user user 44977907 Sun, 2020-03-15 08:03:15 dbbackup_5_2020-03-15_0800.sql
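
The listing above is enough to reproduce both "Original size" figures. The following is a rough check, assuming decimal gigabytes (10^9 bytes) and that the original size is approximately the sum of the listed file sizes; the list names are illustrative.

```python
# File sizes in bytes, copied from the `borg --consider-part-files list` output.
# Assumption: 1 GB = 10**9 bytes.
GB = 10**9

regular_files = [
    1767,          # dbbackup_slave_2020-03-15_0800.txt
    10254073908,   # dbbackup_1_2020-03-15_0800.sql
    96474541,      # dbbackup_2_2020-03-15_0800.sql
    55927632,      # dbbackup_3_2020-03-15_0800.sql
    42570884,      # dbbackup_4_2020-03-15_0800.sql
    44977907,      # dbbackup_5_2020-03-15_0800.sql
]
part_files = [
    5241145216,    # dbbackup_1_2020-03-15_0800.sql.borg_part_1
    5012928692,    # dbbackup_1_2020-03-15_0800.sql.borg_part_2
]

without_parts = sum(regular_files)
with_parts = without_parts + sum(part_files)

# The two parts add up to exactly the size of the complete file.
print(sum(part_files) == regular_files[1])
print(f"without part files: {without_parts / GB:.2f} GB")
print(f"with part files:    {with_parts / GB:.2f} GB")
```

The total without part files comes out close to the 10.50 GB originally shown for "This archive", and the total with part files matches the 20.75 GB shown here, so the part files account for the doubling.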

@traktofon

Hello,

I'm seeing the same or a similar issue with borg 1.1.13. Fresh repo, first backup, and the "All archives" stats differ from "This archive". The backup was done over the network, with the server also running 1.1.13.

These are the steps I took:

1/ On the server, create a user account borg with an authorized_keys file containing

command="borg serve --restrict-to-path /path/to/repos",restrict ssh-rsa ...

The directory /path/to/repos was freshly created and chown'd to user borg.

2/ On the client:

# export BORG_REPO='borg@server:/path/to/repos/machine1'
# borg init -e none
# borg create --progress --stats ::2020-06-22 /path/to/files
-----------------------------------------------------------------------------
Archive name: 2020-06-22
Archive fingerprint: 8ef29bdf314348193e7d281cd1546f943a909cd8c3910730b432e2b1301d5f91
Time (start): Mon, 2020-06-22 14:07:53
Time (end):   Tue, 2020-06-23 01:00:03
Duration: 10 hours 52 minutes 10.60 seconds
Number of files: 453751
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:              513.03 GB            335.77 GB            262.63 GB
All archives:              564.39 GB            387.12 GB            262.63 GB

                       Unique chunks         Total chunks
Chunk index:                  351950               619171
------------------------------------------------------------------------------

For the record, the "This archive" original size matches the du output.
Checking there's only one archive:

# borg list
2020-06-22                           Mon, 2020-06-22 14:07:53 [8ef29bdf314348193e7d281cd1546f943a909cd8c3910730b432e2b1301d5f91]

And indeed with --consider-part-files the stats match more closely:

# borg --consider-part-files info ::2020-06-22
Archive name: 2020-06-22
Archive fingerprint: 8ef29bdf314348193e7d281cd1546f943a909cd8c3910730b432e2b1301d5f91
Comment: 
Hostname: client
Username: root
Time (start): Mon, 2020-06-22 14:07:53
Time (end): Tue, 2020-06-23 01:00:03
Duration: 10 hours 52 minutes 10.60 seconds
Number of files: 453788
Command line: borg create --progress --stats ::2020-06-22 /path/to/files
Utilization of maximum supported archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:              564.23 GB            387.08 GB            262.63 GB
All archives:              564.39 GB            387.12 GB            262.63 GB

                       Unique chunks         Total chunks
Chunk index:                  351950               619171
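
To put numbers on "match more closely", the differences between the two reports can be worked out directly. This is only arithmetic on the figures quoted above; the variable names are illustrative.

```python
# Figures copied from the two reports above (original sizes in GB).
files_without_parts = 453751   # from `borg create --stats`
files_with_parts = 453788      # from `borg --consider-part-files info`
this_without_gb = 513.03
this_with_gb = 564.23
all_archives_gb = 564.39

part_file_count = files_with_parts - files_without_parts
part_data_gb = round(this_with_gb - this_without_gb, 2)
remaining_gap_gb = round(all_archives_gb - this_with_gb, 2)

print(f"part files in the archive:  {part_file_count}")
print(f"data counted in part files: {part_data_gb:.2f} GB")
print(f"gap still unexplained:      {remaining_gap_gb:.2f} GB")
```

Part files account for nearly all of the original 51.36 GB discrepancy; only a small remainder is left between "This archive" and "All archives".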

I've tried to prune the repo, but no data would be deleted.

Is there any explanation of where those "part files" come from? Would it be safe to continue using this repo?

@ThomasWaldmann
Member

The part files are a feature of borg and are created when doing a checkpoint in the middle of a (usually bigger) file. So having them is expected behaviour.

You can see them with borg list --consider-part-files repo::archive.
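
As a rough illustration of why the long-running backups in this thread ended up with part files, the sketch below counts how many checkpoint intervals fit into each reported duration. It assumes the default `--checkpoint-interval` of 1800 seconds was in effect (none of the command lines above override it); the helper name is hypothetical, and whether a given checkpoint actually falls in the middle of a file depends on timing.

```python
# Assumption: default checkpoint interval of `borg create` (1800 seconds).
CHECKPOINT_INTERVAL_S = 1800


def elapsed_checkpoint_intervals(hours, minutes, seconds,
                                 interval=CHECKPOINT_INTERVAL_S):
    """Return how many full checkpoint intervals fit into a backup duration."""
    total_seconds = hours * 3600 + minutes * 60 + seconds
    return int(total_seconds // interval)


# Durations copied from the --stats output quoted in this thread.
durations = {
    "dbbackup_2020-03-15_08:03:16": (0, 52, 10.41),
    "data_backup_2020-03-20_13:03:01": (1, 30, 28.61),
    "2020-06-22": (10, 52, 10.60),
}

for archive, duration in durations.items():
    print(archive, elapsed_checkpoint_intervals(*duration))
```

Under that assumption the 52-minute backup crosses one interval, which fits the single large dump being listed with `borg_part_1` and `borg_part_2` entries.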

What's strange is that you see inconsistencies between "this" and "all", but maybe that comes from how these values are computed. I'll have a look at whether we can improve that.

@ThomasWaldmann
Member

See that comment and the one after it:

#5408 (comment)

Besides showing the impact of a sparse file, it also shows how borg deals with part files (and --consider-part-files) and how that affects the "this archive" and "all archives" sizes. It also shows what's correct, what's not, and when the issues were fixed.
