Unable to import processed backup file #17
Hi! Sorry you are having problems; at first glance things do not look good. I will try to help you, but I might have about a million questions. Here are the first couple: Do you have any idea what happened to the backup file? Do you still have the original Signal installation that created this backup (can you repeat the process)? Do you still have multiple copies of this backup (for example on the phone, on an SD card or USB stick), and can you verify that they are all the same (I usually take an md5sum of backup files before and after transfer)? It might seem stupid, but the most common cause of corrupted backups these days is just faulty storage or network problems while transferring the file from the old phone to the new one. The 'attachment data not found' messages are somewhat expected; they happen in a lot of backups. I don't remember exactly what causes them. I think it may be that if you delete an attachment (or a message with one), the entry in the 'part' table is not actually removed, but I'm not sure. If I have some time I will try to figure it out, but I wouldn't worry about that part too much.
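For anyone following along, the md5sum check mentioned above can also be scripted. This is only a minimal sketch in Python and is not part of signalbackup-tools; the function names `file_digest` and `copies_match` are made up for illustration.

```python
import hashlib

def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def copies_match(paths, algorithm="md5"):
    """Return (all_equal, {path: digest}) for a list of backup copies."""
    digests = {str(p): file_digest(p, algorithm) for p in paths}
    return len(set(digests.values())) == 1, digests
```

Calling `copies_match` on the copy on the phone, the SD card and the computer would show whether any of them differs. A matching digest only says the copies are identical to each other, not that the backup was written correctly in the first place.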
As far as I know, you are the first and only person to ever reach this part of the code. I thought it was a statistical impossibility to get to this point, but either that's not the case, or there is a bug in my code (which is very possible). I have just uploaded a new version which adds a (work-in-progress) 'verbose' option. Also, instead of just bailing out when encountering a verified but invalid frame, it now tries to skip it and continue. Could you please rerun the tool with the 'verbose' option? Let me know if you are using the Windows version and you need me to compile it for you; right now I have only updated the source code. I am often quite busy, but I'll try to put in some time this evening and tomorrow (I have a day off). |
It's from a phone that initially did not have enough space to perform a backup, but at some point it seemed that enough space was available and the backup finished correctly. I tested the file initially with
No, I can't repeat the process as the phone is back to factory settings.
I got multiple copies of the file and did verify via
Not sure if I should feel honored... ;)
Sure, will do that in the next couple of minutes and will provide the output.
No need to compile for me, I'm using the container via buildah / podman.
Brilliant, if you're interested in a faster feedback loop let me know what might work for you (IRC, discord etc.) |
@bepaald I got the resulting output file; it is 12 MB. Is it safe to attach it here, or would I be giving parts of my data away? |
I don't think there is anything in there, but just to be safe maybe you can email it to me. |
Ok, thanks. I've really been thinking hard on this, it's a difficult problem. (The following may not make much sense to you, but I'm writing it partly to remind myself what I think the situation is.) The first thing that goes wrong is that, after the first corrupted data appears, the frame-sync is lost and the program tries to find a valid frame again. The one it thinks is good ( After this, it looks to me like all the following frames are decrypted with incorrect key material, leading to random data (and invalid frames), which suggests to me the counter is not correct. Probably either the program tried to read an attachment when there wasn't one, or one that should have been read was skipped (the counter is increased when an attachment follows a backupframe). Now, I've tried to deal with the first problem: frame validation is a lot stricter now, so that might already make a difference. The second problem I just can't seem to wrap my head around. At the moment I have not been able to figure out by looking at the code logic how this can happen. I've tried reproducing this in many ways now, and while I do sometimes get a few invalid frames, the program always corrects itself and successfully decrypts most of the backup file. Could you please try again? Same as yesterday? I'm not sure if the changes I've made so far have greatly increased the chances of success or actually decreased them, but in either case, the output might give me some more insight. Thanks! |
Thanks for the update, I'm currently building the newest version and will send you the log file again. |
The resulting log file is 162 MB; compressed, it comes down to 16 MB. Do you still want it via mail? |
Ouch, that doesn't sound promising... But send it anyway, I think I can grep through it all right... |
Well, I'm afraid I'm going to need to give this problem a bit more time in my head. Looking at what's happening I'm really starting to think there must be some flaw in the program's logic, but I've read over it many times today and I can't find it, and I can't reproduce the things I'm seeing in your log by deliberately breaking my own backups. At the same time, I also cannot prove with any certainty that there is a bug after the program incorrectly finds a 'good frame' after having lost sync. Incorrectly validating a bad frame like this can theoretically always happen, since the program at that point is dealing with essentially random data. I've made frame validation even stricter now. I don't think it will fix the problems you are having this time, but if you are not sick of it yet, I wouldn't mind getting the new output. I'll have a little less time the coming days, but I'll surely be thinking about this problem often and will try to spend a bit of time on it. Hopefully next weekend (if not before) I'll have an idea and more time to implement something, though you might have to run the program another few times and pass along the results. Thank you for your patience. |
So I already rebuilt your tool last night and processed the backup file. A fixed version was built and I was able to "import" it, or at least Signal thought it was okay. I ended up with no messages in the app. I was able to catch some of the output via adb logcat, which you'll find in the attached file. I can also provide the log file from the signalbackup-tools verbose output via mail if needed; it is 51 MB, or 1.3 MB compressed. |
Thanks. I think I can pretty much guess what happened this time, but if it's not too much trouble I wouldn't mind looking at the log. From what is happening I think I see only two possibilities: a bug in my program, or the signal app has written bad data to the backup file. I do think I have a way to check which, but I won't be able to implement a test until this weekend at the earliest, and then you would have to run it. You should probably prepare for the possibility that the last ~1/3 of the file is simply random data where no information exists to be recovered. From the log posted in your first message (and I'm guessing that the last log from yesterday ends the same way), at least all your (text) message content is still there and probably about 2/3 of your attachments. The problem is, the messages belong in threads, but the thread database is at the end of the backup file, which is missing:
And this subsequently makes it appear there are no messages in the app after restoring. Now, my program has in the past been able to create a working thread database for people who had incomplete (truncated) backup files, using the undocumented You wouldn't happen to have some other, older, working backup? It would probably be possible to transfer the thread and recipient tables from that, which would simplify matters if it gets to that. |
You should have the log file in the mailbox
No worries - I'm able to retest it on the weekend. But I guess it is not really possible to use signal in the meantime, right? Or is there a way to merge a newer and an older backup?
It's not such a big issue if the attachments are missing. Just out of curiosity - is it not possible to find the attachments in the data through magic tests?
Nope - there is a desktop app which was hooked up with that account till the end (but not from the start) |
Yes there is: https://github.com/bepaald/signalbackup-tools#merge! Note about the note: that is slightly outdated, I've had multiple reports of success (both here [1], [2] and via email), just haven't updated the readme yet.
Well, no not really, but I think it is difficult to explain. File magic would only work on decrypted data, but it is impossible to decrypt the data once framesync is lost (as happens in your file after that first 'bad mac' warning). This is because each frame in the backup file is encrypted with different parameters (the framecount being the important one here), and after losing framesync, even if a correct frame boundary is found, it is still not possible to continue decrypting without also determining how many bad frames were skipped (because otherwise the framecount would be incorrect). So, I guess in summary, you can't do file magic without decrypting, but if you can decrypt you wouldn't need file magic. (Warning, I think I used the following paragraph to sort my own thoughts again, it may be hard to follow) I was looking back at what you said earlier:
And was sort of hoping
That is probably also good. To generate a working thread table, I used to be able to read the phone numbers from the 'sms' and 'mms' tables, as they were used as identifiers. However, these days recipients are primarily identified by a UUID, which is stored in the 'recipient' table, and that table is also empty. The messages and threads only refer to a recipient, which (I think) needs this unique identifier. Luckily these identifiers are also in the desktop app's database, which my program can also read. Anyway, no news to report right now. I'll try to work on generating a valid thread table and doing a final check on your backup file this weekend. I just wanted to let you know you could use Signal in the meantime if you trust the merging ability of my program. In fact, it may be relatively easy to import messages from the incomplete backup into the threads of your new Signal installation. |
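To illustrate the earlier point about file magic and the frame counter, here is a toy sketch in Python. It is emphatically not Signal's backup encryption: the keystream construction, the key and the names `keystream`, `crypt` and `looks_like_jpeg` are all assumptions made up for this example. It only shows the general idea that when each frame's keystream depends on a counter, the magic bytes become visible only when decrypting with the right counter.

```python
import hashlib

JPEG_MAGIC = b"\xff\xd8\xff"  # the usual first bytes of a JPEG file

def keystream(key: bytes, counter: int, length: int) -> bytes:
    """Toy keystream derived from the key, the frame counter and a block index."""
    out = b""
    block = 0
    while len(out) < length:
        out += hashlib.sha256(
            key + counter.to_bytes(4, "big") + block.to_bytes(4, "big")
        ).digest()
        block += 1
    return out[:length]

def crypt(key: bytes, counter: int, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    stream = keystream(key, counter, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

def looks_like_jpeg(data: bytes) -> bool:
    return data.startswith(JPEG_MAGIC)

key = b"toy key, not a real backup key"
attachment = JPEG_MAGIC + b"\xe0 rest of a pretend image"
encrypted = crypt(key, 7, attachment)   # pretend this is frame number 7

right = crypt(key, 7, encrypted)        # correct counter: plaintext is recovered
wrong = crypt(key, 8, encrypted)        # counter off by one: unrelated bytes

print(looks_like_jpeg(right), right == attachment)
print(looks_like_jpeg(encrypted), looks_like_jpeg(wrong))
```

Being off by a single frame is already enough to turn the data into noise, which is why finding a frame boundary alone does not help once the number of skipped frames is unknown.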
Ok, I added a temporary function to check one last time whether all those invalid frames found are really junk data. It will seek to a position in the file where such a frame was found, and then decode it with the decoding parameters of the first million possible frames. Just run like this: I'll have more time the next 2 days, I'll try to start on generating a functional backup from the truncated version you have. Do you happen to know whether all the conversations in your backup exist in your desktop install? Or had some not had any activity since linking the desktop? Thanks! |
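The check described above (try every frame number from zero upward on one suspicious frame) can be sketched roughly like this in Python. This is not the code in signalbackup-tools; the toy cipher, the `FRAME:` marker used as a validity test and the names `toy_xor`, `toy_decode` and `find_frame_number` are assumptions for illustration only.

```python
import hashlib
from typing import Callable, Optional, Tuple

MARKER = b"FRAME:"  # hypothetical stand-in for "this decoded to a valid frame"

def toy_xor(data: bytes, counter: int) -> bytes:
    """Toy cipher: XOR with a pad that depends on the frame counter."""
    pad = hashlib.sha256(b"toy key" + counter.to_bytes(4, "big")).digest()
    return bytes(b ^ pad[i % len(pad)] for i, b in enumerate(data))

def toy_decode(blob: bytes, counter: int) -> Optional[bytes]:
    """Return the plaintext if it looks valid for this counter, else None."""
    plain = toy_xor(blob, counter)
    return plain if plain.startswith(MARKER) else None

def find_frame_number(
    blob: bytes,
    try_decode: Callable[[bytes, int], Optional[bytes]],
    max_frames: int = 1_000_000,
) -> Optional[Tuple[int, bytes]]:
    """Try counters 0..max_frames-1 and return the first one that decodes."""
    for counter in range(max_frames):
        plain = try_decode(blob, counter)
        if plain is not None:
            return counter, plain
    return None

# A frame that was really written as frame 42, found at an unexpected place.
blob = toy_xor(MARKER + b"insert into sms ...", 42)
result = find_frame_number(blob, toy_decode, max_frames=1000)
print(result)
```

Starting the search at zero rather than at the current counter is the whole point of the check: it can reveal that a frame belongs to a much earlier position than where it was found.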
@bepaald just ran the new version with the parameter you provided. I'll send you the output via mail but it seemed not that "interesting". |
Thanks! That was unexpected but actually very interesting. It explains everything I'm seeing. As I already mentioned, the order of the backup is always set. The first thing that always happens is filling the 'sms' table, so when I saw that the frame would decode to an 'insert into sms' statement when decoded with a very low frame number, everything fell into place. Somehow, the last 1/3 of your backup, probably right from where the corrupted attachment is in frame 26198, is a duplicate of the first part of the backup. So, after seeing that last log you sent me, I started looking at the invalid frames found after the corrupted attachment and comparing them to the start of the file: The invalid frames:
And the beginning (which decode properly):
Obviously, the last (undecryptable) part of the backup is just a duplicate of the beginning. And it is not that Signal started re-exporting the earlier frames during the backup process, because then the counter would have incremented; it really is a copy. My program does not find these frames because the counter can only ever increase, never decrease, so it only searches forward for the correct frame number. How or why your backup ended up like this I don't know, but at least I'm as certain as I'm going to be (without having the backup here) that:
So thanks for sending me that! I'll get to work on generating a thread table for your backup tomorrow. |
Hi! Sorry for the slow progress lately. I ran into some difficulties while investigating the Signal Desktop database, then work kept me busy. The good news is I've figured out the desktop database format (I hope; maybe more difficulties will come up later), and I hope to get something implemented in the next few days. In the meantime, there are some possible issues when filling in the missing data from the Signal Desktop database. Some of them could be fixed with more coding, but if I don't have to write code for that, I'd rather not. I believe normal (unsecured) SMS messages are not synced to the Desktop? Do you use Signal as your SMS app, and do you have any non-Signal threads in the backup? Do you think there are any conversations in the backup that do not exist (at all) on the Desktop, for example conversations that have had no activity since linking the desktop? I have a work-in-progress function that matches the threads in the backup to specific recipients by comparing them to the desktop database. Maybe you could try it? It will print out a list of threads, and then for each thread, it will either print the contact (matched from Signal Desktop) or it will print the last ten messages so you could maybe figure out what thread it is. I don't need the full output, but maybe you could tell me how many threads are not matched and whether you think they exist in the desktop database or not. And, if you have already started using a fresh Signal install on your phone, do the unmatched threads exist there (or could they)? example:
The parameter after the |
Just thought of something important still missing. I still need to translate the group members from the desktop db to the android database. So this is still a work in progress... Sorry. |
Hey! I also have problems restoring the Signal backup and stumbled over this issue. I /think/ I was hit by signalapp/Signal-Android#11076 and/or signalapp/Signal-Android#8355. So I tried your signalbackup-tools, ended up with this:
Transferred the So this is what happened:
That is why I ended up here. Any idea what to do next? Thank you very much! (I have access to the first phone and can create a backup any time.) |
Ok, so I think it did not detect the backup because you had left the '-fixed' in the filename? Signal requires backups to be named 'signal-YYYY-mm-dd-HH-MM-SS.backup'; nothing else will work. And if multiple files matching this pattern are found, the one with the newest time-string is used. The actual timestamp of the file (e.g. 1970-01-01) should not matter. I think the incorrect timestamp is a docker thing by the way, but it shouldn't be important. Is it possible that you were restoring the broken backup when Signal crashed? Or possibly the new one you just created (which should have had a later date)? Also, I don't see the wrong passphrase event in your list; when did this happen? Again, is it possible Signal was picking up a different backup file than the one whose passphrase you entered?
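The naming rule described above can be checked before copying files to the phone. The following is a minimal Python sketch of that rule as stated in this comment (exact 'signal-YYYY-mm-dd-HH-MM-SS.backup' pattern, newest time-string wins); it is not taken from Signal's source code, and `pick_backup` is a made-up helper name.

```python
import re
from datetime import datetime

# Only names of exactly this shape are considered; '-fixed' suffixes do not match.
BACKUP_NAME = re.compile(r"^signal-(\d{4}-\d{2}-\d{2}-\d{2}-\d{2}-\d{2})\.backup$")

def pick_backup(filenames):
    """Return the matching filename with the newest time-string, or None."""
    candidates = []
    for name in filenames:
        match = BACKUP_NAME.match(name)
        if not match:
            continue
        try:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d-%H-%M-%S")
        except ValueError:
            continue  # digits in the right places but not a real date or time
        candidates.append((stamp, name))
    return max(candidates)[1] if candidates else None

names = [
    "signal-2021-03-01-10-00-00.backup",
    "signal-2021-03-02-09-30-00-fixed.backup",  # ignored because of '-fixed'
    "signal-2021-02-28-23-59-59.backup",
]
print(pick_backup(names))
```

With these example names the '-fixed' file is skipped even though it carries the latest date, so the file from 2021-03-01 would be the one selected under this rule.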
That last part is very good! I think you're going to need it :). From the output of my program I can see the backup is indeed broken and no data can be recovered after the break. Fortunately it looks like the break happens very late in the file, so all of your messages and most (if not all) attachments are still there. The bad news is, the signal backup file has the 'thread'-table at the end and that one is missing completely ( What I would do next:
Let me know if it works. Good luck! |
Hi,
I tried your tool today to see if I could fix a somewhat broken Signal backup file. The original file is 3.2 G and the resulting file is 2.1 G, but I'm still unable to import it. I'll attach the output of the run. It seems the "105/2018 entries...Warning: attachment data not found " problems were not fixed, as they still show up if I rerun your tool against the fixed backup file.