Chat Backups are inflated in size #11509

Open
4 tasks done
SebiderSushi opened this issue Aug 1, 2021 · 16 comments

@SebiderSushi

SebiderSushi commented Aug 1, 2021


Bug description

When creating backups using the "Chat backups" feature the resulting backup files don't appear to be deduplicated. They can be vastly bigger than the actual data used by the Signal app. Forwarding messages inflates backup size while Signal App Data Size does not change.

Steps to reproduce

  • Send a large-ish file or media (a couple megabytes)
  • Create a backup and note down: Backup size & Signal app data usage
  • Forward the file one or more times
  • Create another backup, then compare backup sizes & Signal app data usage

Actual result: The backup file size increases by the size of the forwarded file while Signal app data usage does not. For example, on another person's Signal install I observed 400 MB of app data usage while the backup file was about 1 GB.
Expected result: If Signal deduplicates forwarded messages internally, they should also be deduplicated in backup files. The backup file should not be considerably larger than Signal's app data.

Device info

Device: Xiaomi Redmi 4X
Android version: LineageOS 15.1 based on AOSP 8.1.0
Signal version: 5.18.5

Link to debug log

https://debuglogs.org/bb268c5fb8102c6c3430235813b64721c100dec8c3ac8dfe71885073a2133fc1

@SebiderSushi
Author

Replying to a message of a large file/media has the same effect.

@stale

stale bot commented Jan 26, 2022

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the wontfix label Jan 26, 2022
@SebiderSushi
Author

SebiderSushi commented Feb 1, 2022

This is reproducible with the current Signal version 5.29.7.
This issue is blocked by the fact that no one considered it when implementing backups, and even now there is no one available to review it among the 1.3k open issues.
There is nothing any user can do to help it move forward.

Thanks for trying though, stale bot. /s

@stale stale bot removed the wontfix label Feb 1, 2022
@stale

stale bot commented Apr 2, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Apr 2, 2022
@SebiderSushi
Author

This is reproducible with the current Signal version 5.34.5.

@stale

stale bot commented Apr 9, 2022

This issue has been closed due to inactivity.

@stale stale bot closed this as completed Apr 9, 2022
@SebiderSushi
Author

i guess it could be argued that this should not be discussed as a github issue as per this rule from #7598

If you disagree with the way that a feature is working (but it's working!) that's not a bug.

i would disagree because issues like this are part of the reason signal backups are much larger than necessary. as a result, i do not create any signal backups because i don't have the storage on my phone to store them. in my experience, this is equivalent to a feature that is not working; i am unable to create any signal backups at all even though i have enough storage on my PC. Signal does not offer any way to create useful backups on external / cloud storage unless you have lots of excess storage on your phone and copy huge monolithic backups every day.

i have no hope that this set of issues will ever be resolved (given its history of only making baby steps every now and then over multiple years). here's some relevant forum threads that i could find for anyone else reading this issue:

@cody-signal cody-signal reopened this Apr 11, 2022
@stale stale bot removed the wontfix label Apr 11, 2022
@cody-signal
Contributor

Hey there, we have an idea as to why this is happening, and we are tracking a fix internally.

@SebiderSushi
Author

Thanks a lot for acknowledging the issue!

@greyson-signal
Contributor

My comment from #12106:

So I have a theory that I'm relatively sure is correct. When you forward messages to multiple chats or just send the same exact attachment to multiple people in general, we will de-duplicate those attachments on disk. Unfortunately, this de-duplication is lost during the backup process because our current backup format doesn't really support it.

So if you send the same media to multiple chats somewhat often (or with large items, like videos), then I could definitely see it happening where the backup is much larger than your normal signal install (and thus your restore will be larger too).

We have some ideas to help with this, but they're larger projects that will require some thought.
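To make the theory above concrete, here is a minimal sketch of how a store that de-duplicates on disk can still produce an inflated backup. This is a toy model, not Signal's actual code: the `AttachmentStore` class, its method names, and the use of SHA-256 as the de-duplication key are all assumptions made for illustration.

```python
import hashlib


class AttachmentStore:
    """Toy content-addressed store (hypothetical, for illustration only)."""

    def __init__(self):
        self.blobs = {}       # digest -> bytes, one copy per unique file
        self.references = []  # (chat, digest), one entry per sent attachment

    def add(self, chat, data):
        # Identical bytes hash to the same digest, so they are stored once.
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        self.references.append((chat, digest))

    def disk_usage(self):
        # On-disk size counts each unique file once.
        return sum(len(blob) for blob in self.blobs.values())

    def naive_backup_size(self):
        # A backup format that cannot express "same file as before"
        # writes the full bytes again for every reference.
        return sum(len(self.blobs[digest]) for _, digest in self.references)


store = AttachmentStore()
video = b"\x00" * 5_000_000  # stand-in for a 5 MB video

# The same video is sent to three chats.
store.add("alice", video)
store.add("bob", video)
store.add("family group", video)

print(store.disk_usage())         # 5000000
print(store.naive_backup_size())  # 15000000
```

In this model the install stays at 5 MB while the backup grows to 15 MB, which matches the pattern reported in this issue.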

@skickar

skickar commented Mar 8, 2023

Screenshot_20230307-184305.png

Just confirming this is not fixed in any update 11 months later.

@radeklat

radeklat commented Mar 8, 2023

As a workaround, I have to go to Settings / Data and storage / Manage storage and delete large duplicate files manually about once a month (I forward a lot of photos to a lot of people). An expiry on large files would also help with this issue. Currently, I can only expire messages, which take almost no space when they have no attachments.

@spring-trees-ufo

Replying to a message of a large file/media has the same effect.

I read somewhere that when you reply to a message with media, the media gets duplicated and gets included in the backup. Even after deleting the original media, the backup size remains largely unaffected.

I am guessing the duplicated media doesn't get deleted and I don't see any other way to get rid of that media since it doesn't show up under Storage usage. My backup size currently stands at twice the storage usage shown by Signal.

I have been consciously avoiding replying to any media, but it is a huge inconvenience. I would be happy if this gets looked into. At the very least, it would be convenient to be able to see the media and delete it manually.

@Rafee-M

Rafee-M commented Aug 1, 2024

I read somewhere that when you reply to a message with media, the media gets duplicated and gets included in the backup. Even after deleting the original media, the backup size remains largely unaffected.

I can confirm this. When I was cleaning up my storage after it reached 1.5 GB, I had to manually scroll through chats and delete replied messages. That freed about 900 MB, while the actual media was taking up ~450 MB.

So this seems like a big issue, since replies in chats take more space than the actual media, especially in group chats.

I am guessing the duplicated media doesn't get deleted and I don't see any other way to get rid of that media since it doesn't show up under Storage usage.

You can, by deleting the specific reply, but it can be a pain to find.

@dcjkfgdjhd

dcjkfgdjhd commented Sep 9, 2024

On my phone, Signal says it occupies 9.7 GB, Android says it occupies 10.2 GB. So that is more or less consistent.

But the backup takes 26.3 GB. 🤬

@cody-signal
Contributor

cody-signal commented Sep 9, 2024

Hey folks, if you've looked at our recent commit history you can see we've been building out a new backup system. Part of that work addresses the inflated size by only writing out one local copy of the file.
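As a rough illustration of the "one local copy" idea mentioned above, the sketch below writes each unique attachment once and records per-message references to it. It is not taken from Signal's new backup system: `plan_backup`, `restore`, the manifest layout, and SHA-256 as the identity key are hypothetical choices for this example.

```python
import hashlib


def plan_backup(attachments):
    """Split (message_id, data) pairs into unique blobs plus a manifest.

    Hypothetical layout: `blobs` holds each distinct payload once, keyed by
    its SHA-256 digest; `manifest` records which blob each message uses.
    """
    blobs = {}
    manifest = []
    for message_id, data in attachments:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in blobs:
            blobs[digest] = data  # first time this content is seen
        manifest.append((message_id, digest))
    return blobs, manifest


def restore(blobs, manifest):
    """Rebuild the original (message_id, data) pairs from the backup."""
    return [(message_id, blobs[digest]) for message_id, digest in manifest]


photo = b"p" * 2_000_000  # stand-in for a 2 MB photo
attachments = [(1, photo), (2, photo), (3, b"note")]

blobs, manifest = plan_backup(attachments)
payload_bytes = sum(len(blob) for blob in blobs.values())

print(payload_bytes)  # 2000004, not 4000004
assert restore(blobs, manifest) == attachments
```

The forwarded photo costs only a small manifest entry the second time, and the restore still reproduces every message's attachment.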


8 participants