[kemono.party]Corrupted PNGs when downloading using version 1.24.3 #3519

Closed
FriedGenera opened this issue Jan 11, 2023 · 12 comments

Comments

@FriedGenera

The PNGs downloaded after upgrading to 1.24.3 all seem to be corrupted. I downgraded to the previous version and it works fine.

@kattjevfel
Contributor

473bd38 seems to be at fault; I have a build of the commit prior to it and it works fine.

@mikf
Owner

mikf commented Jan 11, 2023

return (response.headers["content-length"] != "9" and
        response.content != b"not found")

This "and" should have been an "or", which would have avoided almost all of the bugs this is now causing. Accessing response.content here for basically every file causes the first 16 bytes to be duplicated:

00000000: 8950 4e47 0d0a 1a0a 0000 000d 4948 4452  .PNG........IHDR
00000010: 8950 4e47 0d0a 1a0a 0000 000d 4948 4452  .PNG........IHDR
00000020: 0000 0b4e 0000 0ffd 0806 0000 00c7 9db6  ...N............

I ran all extractor result tests and even added a new one. They all passed so I thought it was OK like this. Oh well. Sorry. Guess I'll be pushing another release this evening.
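
To illustrate why the operator matters, here is a minimal sketch. FakeResponse is a hypothetical stand-in (not part of gallery-dl or requests) that only counts how often its body is read; validate_and mirrors the snippet quoted above and validate_or is the variant with "or". It does not reproduce the byte duplication itself, only the extra read of response.content.

```python
class FakeResponse:
    """Hypothetical stand-in for an HTTP response; counts body reads."""

    def __init__(self, body):
        self.headers = {"content-length": str(len(body))}
        self._body = body
        self.content_reads = 0

    @property
    def content(self):
        # Every access is counted so the short-circuit behavior is visible.
        self.content_reads += 1
        return self._body


def validate_and(response):
    # Same shape as the quoted check: for any length other than 9 the
    # first operand is True, so Python goes on to evaluate .content.
    return (response.headers["content-length"] != "9" and
            response.content != b"not found")


def validate_or(response):
    # With "or", a length other than 9 short-circuits to True and the
    # body is only read for 9-byte responses.
    return (response.headers["content-length"] != "9" or
            response.content != b"not found")


if __name__ == "__main__":
    image = FakeResponse(b"\x89PNG\r\n\x1a\n" + bytes(100))
    validate_and(image)
    print("reads with 'and':", image.content_reads)

    image = FakeResponse(b"\x89PNG\r\n\x1a\n" + bytes(100))
    validate_or(image)
    print("reads with 'or':", image.content_reads)
```

With "or", a 9-byte body that is not "not found" is also accepted, whereas the "and" version rejects every 9-byte response.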

mikf added a commit that referenced this issue Jan 11, 2023
- do not access 'response.content' unless necessary
- only validate responses if filename extensions differ
@mikf
Owner

mikf commented Jan 11, 2023

Fixed in commit 85bd1cb.
New release at https://github.com/mikf/gallery-dl/releases/tag/v1.24.4.

@mikf mikf closed this as completed Jan 11, 2023
@reyaz006

reyaz006 commented Jan 11, 2023

What's the safe way to fix the issues with broken files if I use an archive (I don't want to re-download something I removed earlier)? Can't I just wipe the latest data inside the .sqlite3 file?

@Hrxn
Contributor

Hrxn commented Jan 12, 2023

What's the safe way to fix the issues with broken files if I use an archive (I don't want to re-download something I removed earlier)? Can't I just wipe the latest data inside the .sqlite3 file?

Yes, you can quite easily actually: https://sqlitebrowser.org/

@reyaz006

I mean there is no column for the date. How do I sort by date, or select entries created on a specific date and remove them?

@Hrxn
Contributor

Hrxn commented Jan 12, 2023

Uhh.. not sure.
By checking the first bytes of the local files on the filesystem then? Aren't they identical for every corrupted file?

@reyaz006

They are identical only within each individual file, and this affects every binary file, not just PNGs. I found them all by creation date, but I'll need to write a script that properly finds such duplicated bytes and removes them.

@Hrxn
Contributor

Hrxn commented Jan 12, 2023

Well, if you also save the ID inside the filename, and not just in archive-format, we'd know the relevant IDs for the archive file.

@reyaz006

Yes, there are ~4000 files with IDs in their names. I guess I could list them, find them inside the .sqlite3 file, and remove them. But that seems like too much work compared to a script.
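
The archive cleanup could also be scripted. The following is only a rough sketch: it assumes the archive is a SQLite database with a table named archive and a text column named entry, and that each entry contains the ID that also appears in the filename. Both the schema and the entry layout are assumptions here, so check them against your own file (for example with the DB browser linked above) and work on a copy. remove_archive_entries is a hypothetical helper, not part of gallery-dl.

```python
import sqlite3


def remove_archive_entries(conn, ids, table="archive", column="entry"):
    """Delete rows whose entry text contains one of the given IDs.

    The table and column names are assumptions; adjust them to match
    the schema of your archive file. Returns the number of deleted rows.
    """
    deleted = 0
    for file_id in ids:
        # instr() > 0 is a plain substring match: ID "123" would also
        # match an entry containing "1234", so prefer long, unique IDs.
        cur = conn.execute(
            f'DELETE FROM "{table}" WHERE instr("{column}", ?) > 0',
            (str(file_id),),
        )
        deleted += cur.rowcount
    conn.commit()
    return deleted


if __name__ == "__main__":
    # Demonstration on an in-memory database with made-up entries.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE archive (entry TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO archive VALUES (?)",
        [("svc_111_a",), ("svc_222_b",), ("svc_333_c",)],
    )
    print(remove_archive_entries(conn, ["111", "333"]))
```

The list of IDs would come from the filenames of the files identified as broken; how to parse them depends on your filename format.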

@kattjevfel
Contributor

If you can identify all the bad files, say with file (it should show up as PNG image data, -1991225785 x 218765834, 0-bit grayscale), you can then move them somewhere and remove the first 16 bytes, I'd do something like this:

for f in ./*.png; do dd if="$f" of="$f"_fixed.png ibs=16 skip=1; done

This will remove the first 16 bytes of all pngs in the folder and output them to filename.png_fixed.png
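
A more cautious variant of the loop above, sketched in Python: it strips the first 16 bytes only when they are identical to the following 16 bytes, so intact files are skipped. The function names are made up for this example, and the check relies on the symptom described in this issue (the first 16 bytes duplicated once at the start of the file).

```python
from pathlib import Path

BLOCK = 16  # number of duplicated leading bytes reported in this issue


def has_duplicated_header(path, block=BLOCK):
    """Return True if the first `block` bytes are immediately repeated."""
    with open(path, "rb") as fh:
        head = fh.read(2 * block)
    return len(head) == 2 * block and head[:block] == head[block:]


def write_fixed_copy(path, block=BLOCK):
    """Write a copy without the first `block` bytes; return its path."""
    src = Path(path)
    # Same naming as the dd loop: file.png -> file.png_fixed.png
    dst = src.with_name(src.name + "_fixed" + src.suffix)
    dst.write_bytes(src.read_bytes()[block:])
    return dst


def fix_directory(directory, pattern="*"):
    """Create fixed copies for every affected file matching `pattern`."""
    fixed = []
    # sorted() collects the matches first, so files created during the
    # loop are not picked up by the same run.
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file() and has_duplicated_header(path):
            fixed.append(write_fixed_copy(path))
    return fixed


if __name__ == "__main__":
    for new_file in fix_directory(".", "*.png"):
        print("wrote", new_file)
```

The originals are left untouched. A file that legitimately starts with two identical 16-byte blocks (for example one beginning with 32 zero bytes) would be flagged as well, so it is worth spot-checking the results.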

@reyaz006

Solved the problem using sfk.exe and a batch script:

@echo off
move %1 tempfile
sfk hexdump -raw -offlen 0 16 tempfile >1.txt
sfk hexdump -raw -offlen 16 16 tempfile >2.txt
sfk md5 1.txt 2.txt
if %ERRORLEVEL% equ 0 goto :process
echo ABORT
goto :skip

:process
move tempfile !._16broken
sfk partcopy !._16broken -allfrom 16 !._16fixed -yes
move !._16fixed %1
echo DONE
goto :skip2

:skip
move tempfile %1

:skip2

Usage for each file: dothething.bat "D:\path to broken file\file.zip"


5 participants