
Increase download buffer size and improve tarball import logging #11171

Merged
4 commits merged into NixOS:master on Jul 29, 2024

Conversation

edolstra (Member)

Motivation

Downloads of tarball and GitHub flakes could appear quite slow. This is because we pipe curl downloads into `unpackTarfileToSink()`, but the latter is typically slower than the former if you're on a fast connection, so the curl download gets put to sleep. (There is even a risk that if the Git import is really slow for whatever reason, the TCP connection could time out.)

So let's make the download buffer bigger by default: 64 MiB is big enough for the Nixpkgs tarball. Perhaps in the future we could have an unlimited buffer that spills data to disk beyond a certain threshold, but that's probably overkill.

This also adds a warning if the download buffer is full, and shows when we are unpacking an archive into the Git cache.
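
If you do hit that new warning, the setting it mentions can be raised in `nix.conf`. A minimal sketch (illustrative only; the value is given in bytes, and 64 MiB is the new default set by this PR):

```
# nix.conf (illustrative): raise the in-memory download buffer
# from the 64 MiB default to 256 MiB. The value is in bytes.
download-buffer-size = 268435456
```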


The unpacking into the Git cache happens in parallel with the download (it starts later), so you only see that message when the download has finished but the import hasn't.
@github-actions bot added the label `fetching` (Networking with the outside (non-Nix) world, input locking) on Jul 24, 2024
@edolstra added the label `backport 2.23-maintenance` (Automatically creates a PR against the branch) on Jul 24, 2024
@cole-h (Member) commented Jul 24, 2024

@edolstra added the labels `backport 2.21-maintenance` and `backport 2.22-maintenance` (Automatically creates a PR against the branch) on Jul 24, 2024
@roberth (Member) left a comment


Instead of the buffer being at capacity, could the cause be an ever-increasing chunk size of the data transferred to the consuming thread?
That way the intervals between consumptions grow longer and longer, creating a spiking download rate at best and dropped connections at worst. A larger buffer would only allow those intervals of blocking on consumer processing to grow further, because a larger buffer fits larger chunks.

debug("download buffer is full; going to sleep");
static bool haveWarned = false;
warnOnce(haveWarned, "download buffer is full; consider increasing the 'download-buffer-size' setting");
state.wait_for(state->request, std::chrono::seconds(10));
Member


10 seconds seems a little excessive, or is this a placeholder for infinity because we're getting interrupted by a signal?

Member Author


Usually the download thread will be woken up long before the 10 second limit by the FileTransfer::download main loop.
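
To illustrate the point, a generic sketch of the standard wait/notify pattern (not the Nix code itself, which goes through its own synchronisation wrapper as seen in the snippet above): the 10-second timeout is only a backstop, because a notify from the other side wakes the waiter immediately.

```cpp
// Generic illustration (not Nix code): wait_for's timeout is a backstop;
// a notify_all() from the other thread wakes the waiter right away.
#include <chrono>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>

std::mutex m;
std::condition_variable cv;
bool bufferHasRoom = false;

void downloadThread()
{
    std::unique_lock<std::mutex> lock(m);
    // Sleeps at most 10 seconds, but normally returns as soon as the
    // consumer signals that it has drained the buffer.
    cv.wait_for(lock, std::chrono::seconds(10), [] { return bufferHasRoom; });
    std::puts("woken up, resuming the transfer");
}

int main()
{
    std::thread t(downloadThread);
    {
        std::lock_guard<std::mutex> lock(m);
        bufferHasRoom = true;   // the consumer made room
    }
    cv.notify_all();            // wakes the waiter well before the timeout
    t.join();
}
```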

Member


An std::string is a contiguous region of memory, so resizing it to fit an ever larger amount of data could cause O(n²) copying to new regions.

@edolstra (Member Author) Jul 25, 2024


Pretty sure it's O(n lg n) amortized.

Member


Right, I guess it isn't quite that, but it's still a lot of unnecessary copying.
I forgot that the increments are proportional.
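
For anyone curious, the growth pattern is easy to observe with a standalone snippet (not part of this PR): typical implementations grow the capacity geometrically, so the total bytes copied across all reallocations stay proportional to the final size rather than quadratic.

```cpp
// Standalone illustration (not from this PR): print how std::string's
// capacity jumps as data is appended. The geometric jumps are what keep
// total copying roughly linear in the final size rather than O(n^2).
#include <cstdio>
#include <string>

int main()
{
    std::string buf;
    std::size_t lastCapacity = buf.capacity();
    // Append 64 MiB in 64-byte chunks, reporting each reallocation.
    for (std::size_t i = 0; i < (64 * 1024 * 1024) / 64; ++i) {
        buf.append(64, 'x');
        if (buf.capacity() != lastCapacity) {
            lastCapacity = buf.capacity();
            std::printf("size = %zu, capacity -> %zu\n", buf.size(), lastCapacity);
        }
    }
}
```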

@edolstra (Member Author)

> Instead of the buffer being at capacity, could the cause be an ever-increasing chunk size of the data transferred to the consuming thread?

No, it's the buffer being full, as you can see by running it with -vvvv.

Not sure how the chunk size increases? Also, the client thread moves the current contents of the buffer, so that's O(1).
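
Roughly the hand-off being described, as a self-contained sketch (invented names and structure, not the actual FileTransfer code): the download side appends chunks to a shared string and sleeps once a size cap is reached, while the consumer moves the whole accumulated buffer out in O(1) and processes it outside the lock.

```cpp
// Simplified sketch of the producer/consumer hand-off under discussion;
// names and structure are invented, not the actual FileTransfer code.
#include <condition_variable>
#include <cstdio>
#include <functional>
#include <mutex>
#include <string>
#include <thread>

struct Buffer {
    std::mutex m;
    std::condition_variable roomAvailable, dataAvailable;
    std::string data;
    bool done = false;
    std::size_t maxSize = 64 * 1024 * 1024; // cap, analogous to download-buffer-size
};

// The download side: appends chunks, sleeps while the buffer is at capacity.
void producer(Buffer & buf)
{
    for (int i = 0; i < 1024; ++i) {
        std::unique_lock<std::mutex> lock(buf.m);
        buf.roomAvailable.wait(lock, [&] { return buf.data.size() < buf.maxSize; });
        buf.data.append(64 * 1024, 'x');    // the next 64 KiB chunk arrives
        buf.dataAvailable.notify_one();
    }
    std::lock_guard<std::mutex> lock(buf.m);
    buf.done = true;
    buf.dataAvailable.notify_one();
}

// The consuming side (think unpackTarfileToSink()): moves the *whole*
// accumulated buffer out, which is O(1) no matter how much piled up,
// then does the slow work outside the lock.
void consumer(Buffer & buf)
{
    while (true) {
        std::string chunk;
        {
            std::unique_lock<std::mutex> lock(buf.m);
            buf.dataAvailable.wait(lock, [&] { return !buf.data.empty() || buf.done; });
            if (buf.data.empty() && buf.done) return;
            chunk = std::move(buf.data);
            buf.data.clear();               // leave a valid, empty buffer behind
            buf.roomAvailable.notify_one(); // wake the producer if it went to sleep
        }
        // ... slow processing of `chunk` happens here, outside the lock ...
    }
}

int main()
{
    Buffer buf;
    std::thread p(producer, std::ref(buf));
    std::thread c(consumer, std::ref(buf));
    p.join();
    c.join();
    std::puts("transferred and consumed 64 MiB");
}
```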

@roberth (Member) commented Jul 25, 2024

> No, it's the buffer being full, as you can see by running it with -vvvv.

Ohh, you're trying to download faster for no other reason than the progress bar?
I thought this was about reliability, because I couldn't fathom performance theater as a goal. (No disrespect to theater.)
My comments are about downloading at a steady pace, to make the backpressure work without massive waits that could make it unreliable.
It's also jittery for other network users if you do manage to max out the internet connection during this burst. (One big burst after the limit increase.)

> Not sure how the chunk size increases?

It increases because more data gets appended to `data` while the download thread is doing a burst and the consumer thread is slowly eating away at the previous chunk of `data` that was indeed moved out.
This causes the consumer thread to work on an even larger piece of data the next time, causing more delay in the consumer, causing the download thread's `data` to grow larger again, and so on.

@edolstra (Member Author)

> Ohh, you're trying to download faster for no other reason than the progress bar?

It's mostly cosmetic, but it's possible that if the download thread "starves", TCP connections start dropping or get throttled.

@roberth merged commit 6e3bba5 into NixOS:master on Jul 29, 2024
15 checks passed

Successfully created backport PR for 2.21-maintenance:


Successfully created backport PR for 2.22-maintenance:


Successfully created backport PR for 2.23-maintenance:

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-07-29-nix-team-meeting-minutes-165/49970/1
