Image corruption / distortion / artifacting #29

Closed
jtara1 opened this issue Jul 26, 2017 · 6 comments

@jtara1

jtara1 commented Jul 26, 2017

edit: This could be considered an issue that the user is responsible for managing.

This issue may be unrelated to this module / program. After downloading a comic, I run a script that uses the zipfile module to pack the images of each chapter into a separate zip file, one per chapter.

The possible issue with gallery-dl is:

  • I may send a KeyboardInterrupt or kill the process running the script to stop the download of a comic, then resume the same comic later.
  • I think I may have turned my VPN on or off mid-download, which would be like disabling my internet connection mid-download.

I've had similar issues with other image downloading scripts.

Examples of image corruption:

edit: to confirm it's not the source that is corrupt (http://www.mangareader.net/feng-shen-ji/26/13)

[image: feng shen ji_c026_013]
[image: feng shen ji_c022_027]

@Hrxn
Contributor

Hrxn commented Jul 27, 2017

How/when do you run your script exactly? Why is it possible to interrupt gallery-dl in the first place?

@Bfgeshka

It happens when gallery-dl stops suddenly or loses its connection. I guess it would be better if @mikf could add logic to delete incompletely downloaded files on exit.

@mikf
Owner

mikf commented Jul 27, 2017

> I may send a KeyboardInterrupt or kill the process running the script to stop the download of a comic, then resume the same comic later.

A KeyboardInterrupt or any other exception raised during a download causes a partially downloaded file to be deleted, but killing the process obviously leaves the file in place. Installing some signal handlers could help against anything that isn't a SIGKILL, but then there is the question of how this would work on Windows.

> I think I may have turned on or off my VPN mid download which would be like disabling my internet connection mid download.

I don't know how Requests/urllib3 handles client-side network outages mid-download (does it raise an exception, or just silently close/abort the connection?), but a remote server closing the connection prematurely gets silently ignored and could also lead to partially downloaded files, although that has never happened to me as far as I can tell.

I think there are three things that should be done here:

  • Use .part files during the download and rename them upon completion
  • Implement continuation of partial downloads
  • Check if the file size matches the Content-Length header
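
The first and third points above could look roughly like this. It is a sketch under assumptions, not the implementation that was later committed: `save_via_part_file` and `IncompleteDownload` are hypothetical names, `chunks` stands in for whatever iterable of bytes the HTTP library yields, and `expected_size` stands in for the value of the Content-Length header when the server sends one.

```python
import os


class IncompleteDownload(Exception):
    """The number of bytes written does not match the expected size."""


def save_via_part_file(path, chunks, expected_size=None):
    """Write to path + '.part' and rename only after a complete download."""
    part_path = path + ".part"
    written = 0
    with open(part_path, "wb") as fp:
        for chunk in chunks:
            fp.write(chunk)
            written += len(chunk)

    if expected_size is not None and written != expected_size:
        # Leave the .part file in place so a later run could continue it.
        raise IncompleteDownload(f"got {written} of {expected_size} bytes")

    # os.replace overwrites an existing target and is atomic on POSIX
    # when both paths are on the same filesystem.
    os.replace(part_path, path)
    return written
```

With this layout the final filename only ever refers to a complete file, so a killed process leaves behind at most a `.part` file. One caveat: if the response body is compressed and the library decodes it transparently, the decoded byte count can differ from Content-Length, so the comparison has to be made on the raw bytes.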

@Bfgeshka

By the way, @mikf, please add the ability to set the path for these .part files, so they can be stored in /tmp, on a tmpfs mount, or wherever the user wants.

@mikf mikf added this to the 1.0.0 milestone Oct 1, 2017
mikf added a commit that referenced this issue Oct 24, 2017
- use '.part' files during file-download
- implement continuation of incomplete downloads
- check if file size matches the one reported by server
@mikf
Owner

mikf commented Oct 24, 2017

I've finally managed to rewrite the downloader modules to at least implement those three features I listed above, which should hopefully solve any issues regarding incomplete downloads and image corruption.

I tested this quite thoroughly on my own machine and a Windows 7 VM, but it would be nice if someone could test this themselves and report any errors/problems and/or possible improvements.
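
For anyone testing the continuation feature, the general HTTP mechanism behind resuming can be sketched as below. This is an illustration of a standard Range request, not the code from the commit; `resume_request_headers` and `open_mode_for_response` are hypothetical helper names.

```python
import os


def resume_request_headers(part_path):
    """Return (offset, headers) for continuing a partial download."""
    try:
        offset = os.path.getsize(part_path)
    except OSError:
        # No .part file yet: start from the beginning.
        offset = 0
    headers = {"Range": f"bytes={offset}-"} if offset else {}
    return offset, headers


def open_mode_for_response(status_code, offset):
    """Choose how to open the .part file based on the server's answer."""
    # 206 Partial Content: the server honoured the Range header, so append.
    if offset and status_code == 206:
        return "ab"
    # 200 OK (or no offset): the full body is being sent, so start over.
    return "wb"
```

The important case is a server that ignores the Range header and answers 200 with the whole file. Appending in that situation would produce exactly the kind of corrupted image this issue describes, so the file has to be truncated instead.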

@Bfgeshka

This little thing was not critical, but having to routinely check the last file of an interrupted download session was annoying.
Good one.

mikf added a commit that referenced this issue Oct 24, 2017
- '--no-part' command line option to disable them
- 'downloader.http.part' and 'downloader.text.part' config options

Disabling .part files restores the behaviour of the old downloader
implementation.
mikf added a commit that referenced this issue Oct 25, 2017
Note: The path set for 'downloader.*.part-directory' needs to point to an
already existing directory.
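
One practical consequence of a separate part directory is that the finished file may have to cross a filesystem boundary, for example from a tmpfs mount to a disk. A plain rename fails in that case, so the final step needs a move that can fall back to copying. The sketch below illustrates this with hypothetical helpers (`part_path_for`, `finalize`); it does not show how gallery-dl itself handles it.

```python
import os
import shutil


def part_path_for(part_dir, filename):
    """Build the .part path; the directory is expected to exist already."""
    if not os.path.isdir(part_dir):
        raise FileNotFoundError(f"part directory does not exist: {part_dir}")
    return os.path.join(part_dir, filename + ".part")


def finalize(part_path, dest_path):
    """Move a completed .part file to its final location."""
    # shutil.move renames when possible and otherwise copies the file
    # and removes the source, which covers moves between filesystems.
    shutil.move(part_path, dest_path)
```

Because the cross-filesystem case is a copy followed by a delete, it is not atomic the way a same-filesystem rename is.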
@mikf mikf closed this as completed Oct 27, 2017