Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch Async HTTP-client to httpx (HTTP/2-support) #770

Merged
merged 49 commits into from
Aug 13, 2024

Conversation

thomasht86
Copy link
Collaborator

@thomasht86 thomasht86 commented May 12, 2024

I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.

  • Removed @staticmethod-decorator of callback_docv1 (See comment on bug below)
  • Switched to httpx - Mainly to add HTTP/2 support, which presumably could improve feed performance by a lot.
  • Unit tests passing ✅ - BUT, working to quantify performance gain.
⚠️ Encountered this bug when tenacity retry is applied

self = <AsyncRetrying object at 0x10468fca0 (stop=<tenacity.stop.stop_after_attempt object at 0x10468fb20>, wait=<tenacity.wa...bject at 0x10468fac0>, before=<function before_nothing at 0x104463f70>, after=<function after_nothing at 0x10447b1f0>)>
retry_state = <RetryCallState 4369142976: attempt #3; slept for 3.0; last result: failed (TypeError object dict can't be used in 'await' expression)>

    def iter(self, retry_state: "RetryCallState") -> t.Union[DoAttempt, DoSleep, t.Any]:  # noqa
        fut = retry_state.outcome
        if fut is None:
            if self.before is not None:
                self.before(retry_state)
            return DoAttempt()
    
        is_explicit_retry = fut.failed and isinstance(fut.exception(), TryAgain)
        if not (is_explicit_retry or self.retry(retry_state)):
            return fut.result()
    
        if self.after is not None:
            self.after(retry_state)
    
        self.statistics["delay_since_first_attempt"] = retry_state.seconds_since_start
        if self.stop(retry_state):
            if self.retry_error_callback:
>               return self.retry_error_callback(retry_state)
E               TypeError: 'staticmethod' object is not callable

notebooks/lib/python3.9/site-packages/tenacity/__init__.py:322: TypeError

We should make unit tests for error handling. Maybe this is part of your fix?

@thomasht86 thomasht86 changed the title Switch Async HTTP-client to httpx Switch Async HTTP-client to httpx (HTTP/2-support) May 13, 2024
@jobergum
Copy link

Last time I tried httpx with http/2 I ran into issues with that it did not handle the case when the server would shutdown the connection (we do in Vespa cloud to disallow extremely long connections)

@thomasht86
Copy link
Collaborator Author

Last time I tried httpx with http/2 I ran into issues with that it did not handle the case when the server would shutdown the connection (we do in Vespa cloud to disallow extremely long connections)

That's very useful info. I will try to mock a test for that to check!

@thomasht86
Copy link
Collaborator Author

Closing for now, as httpx did not improve feeding performance notably, even with HTTP/2.
Need to investigate further to justify the change.

@thomasht86 thomasht86 closed this Jun 14, 2024
@bratseth bratseth reopened this Jul 29, 2024
@thomasht86
Copy link
Collaborator Author

Adding these for context:
encode/httpx#2112
encode/httpx#3215

Got a prototype working. Get Error: <ConnectionTerminated error_code:ErrorCodes.NO_ERROR, last_stream_id:2147483647, additional_data:677261636566756c> for the last ~3k when running 10k POST-requests with vectors "concurrently".
1k requests is approx 2x faster than feed_iterable, which is promising.
Need to implement queue, retries and semaphore to benchmark properly.

@thomasht86
Copy link
Collaborator Author

Ok, think this should be good now!

  • VespaAsync now uses httpxwith http/2.
  • Changed default max_connections-parameter to VespaAsync.__init__() to 1.
  • Introduced keepalive_expiry=10 to avoid connections being terminated by the server.
  • Added feed_async_iterable()-method. ~2-5x faster than feed_iterable() for the test case in the notebooks, better with increasing number of docs.
  • Added feed_performance_cloud.ipynb that compares and demonstrates different ways of feeding to Vespa Cloud.

feed-cloud

Some things that were explored:

  • Using asyncio.Queue, following the pattern from feed_iterable(), but using a Semaphore and only simple batching for restricting memory usage of tasks was 2-3x faster (with the case from the notebook in the PR). Possibly due to overhead of switching between putting/getting from the queue.
  • Different max_queue_size does not impact performance significantly.
  • Increasing max_connections for the async client impacts the performance negatively, as expected with HTTP2.

Should still be possible to get better performance by parallelizing, but leaving this for later.

Tests

  • Added unit test.
  • Added integration test in docker.
  • Added notebook which also acts as test.
  • ⚠️ Unit tests failing on linkcheck, will be fixed in different PR.
  • ⚠️ Integration-cloud test cancelled due to conflicting with other PR runs.

@thomasht86 thomasht86 marked this pull request as ready for review August 9, 2024 09:33
@thomasht86 thomasht86 requested a review from glebashnik August 9, 2024 09:33
namespace="retryapplication",
)

def test_retries_async(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not obvious from code that it tests retires.
Is feeding 10 docs will trigger retries?
Maybe a brief comment explaining that uses special application that sleeps on indexing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
I will add comments to make this more obvious.

vespa/application.py Show resolved Hide resolved
vespa/application.py Show resolved Hide resolved
vespa/application.py Show resolved Hide resolved
@@ -582,6 +581,157 @@ def _handle_result_callback(
queue.join()
consumer_thread.join()

def feed_async_iterable(
self,
iter: Iterable[Dict],
Copy link
Contributor

@glebashnik glebashnik Aug 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better name for iter arg, e.g. documents or documents_iter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, iteris a reserved keyword, and shadowing it isn`t ideal, but think it is more important to follow existing pattern rather than to introduce divergent naming.
Agree?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of reserved words, I think it is worth the change.
Confusing for devs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, since this is a public API, we would need to add deprecation-warning for a while first if we want to rename the keyword arg from feed_iterable. That should nevertheless be a separate PR. So, then the question is if this new method should have both the renamed arg, but also iter in the meanwhile, to avoid divergent naming.

Maybe @jobergum could enlighten us a little on best practice here? 🙏

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, backwards compatibility for public APIs.
Separate PR makes sense.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a new method with a new signature we don't need to make the same mistake, it this works well, we can deprecate the sync version.

Copy link
Contributor

@glebashnik glebashnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comments.

vespa/application.py Show resolved Hide resolved
@thomasht86 thomasht86 merged commit a8af934 into master Aug 13, 2024
34 of 45 checks passed
@thomasht86 thomasht86 deleted the thomasht86/switch-asynchttpclient-to-httpx branch August 13, 2024 11:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants