Skip to content
This repository has been archived by the owner on May 21, 2024. It is now read-only.

ostree_http_repo: Reuse curl handle. #1643

Merged
merged 1 commit into from
Apr 16, 2020

Conversation

pattivacek
Copy link
Collaborator

@pattivacek pattivacek commented Apr 15, 2020

Reuse provides a huge performance boost because certs don't have to be re-read and fewer objects get destroyed and recreated.

@pattivacek pattivacek force-pushed the feat/OTA-4165/reuse-curl-handle branch from f39b7df to 3608a47 Compare April 15, 2020 14:43
@lbonn
Copy link
Contributor

lbonn commented Apr 15, 2020

Wow good catch. How huge is the performance boost roughly?

@pattivacek
Copy link
Collaborator Author

Wow good catch. How huge is the performance boost roughly?

Here are the results of a naïve test run without this change:
Upload to Treehub complete after 902 HEAD requests and 356 PUT requests.
Total size of uploaded objects: 42063914 bytes.
Total runtime: 367.729 seconds.

With this new change:
Upload to Treehub complete after 902 HEAD requests and 355 PUT requests.
Total size of uploaded objects: 39016714 bytes.
Total runtime: 136.462 seconds.

Working on a longer/bigger test now.

@pattivacek
Copy link
Collaborator Author

I'm trying to understand why making the handle static would cause the user_agent_ in Utils not to get destroyed properly:

2020-04-15T14:52:18.0134386Z ==2234== 31 bytes in 1 blocks are definitely lost in loss record 7 of 13
2020-04-15T14:52:18.0134667Z ==2234==    at 0x4C3017F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
2020-04-15T14:52:18.0135454Z ==2234==    by 0x5E36B9A: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_mutate(unsigned long, unsigned long, char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
2020-04-15T14:52:18.0142558Z ==2234==    by 0x5E38142: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_append(char const*, unsigned long) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25)
2020-04-15T14:52:18.0142709Z ==2234==    by 0x1650E7: append (basic_string.h:1258)
2020-04-15T14:52:18.0142895Z ==2234==    by 0x1650E7: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&, char const*) (basic_string.h:5991)
2020-04-15T14:52:18.0143268Z ==2234==    by 0x4EF5D2C: Utils::getUserAgent() (utils.cc:784)
2020-04-15T14:52:18.0143381Z ==2234==    by 0x4EF5E1E: CurlEasyWrapper::CurlEasyWrapper() (utils.cc:922)
2020-04-15T14:52:18.0143506Z ==2234==    by 0x4EBA4D2: __static_initialization_and_destruction_0 (ostree_http_repo.cc:11)
2020-04-15T14:52:18.0143626Z ==2234==    by 0x4EBA4D2: _GLOBAL__sub_I_ostree_http_repo.cc (ostree_http_repo.cc:74)
2020-04-15T14:52:18.0143915Z ==2234==    by 0x4010732: call_init (dl-init.c:72)
2020-04-15T14:52:18.0144130Z ==2234==    by 0x4010732: _dl_init (dl-init.c:119)
2020-04-15T14:52:18.0144354Z ==2234==    by 0x40010C9: ??? (in /lib/x86_64-linux-gnu/ld-2.27.so)
2020-04-15T14:52:18.0144452Z ==2234==    by 0x2: ???
2020-04-15T14:52:18.0144628Z ==2234==    by 0x1FFF000ACE: ???
2020-04-15T14:52:18.0144727Z ==2234==    by 0x1FFF000B25: ???
2020-04-15T14:52:18.0144817Z ==2234==    by 0x1FFF000B67: ???

@lbonn
Copy link
Contributor

lbonn commented Apr 15, 2020

Actually, does it really need to be static? It looks like it can just be local to a OstreeHTTPRepo object, no?

diff --git a/src/sota_tools/ostree_http_repo.cc b/src/sota_tools/ostree_http_repo.cc
index 8c2bac715..0016d414e 100644
--- a/src/sota_tools/ostree_http_repo.cc
+++ b/src/sota_tools/ostree_http_repo.cc
@@ -8,8 +8,6 @@
 
 namespace pt = boost::property_tree;
 
-CurlEasyWrapper OSTreeHttpRepo::easy_handle_;
-
 bool OSTreeHttpRepo::LooksValid() const {
   if (FetchObject("config")) {
     pt::ptree config;
diff --git a/src/sota_tools/ostree_http_repo.h b/src/sota_tools/ostree_http_repo.h
index af3612b4b..511b422e6 100644
--- a/src/sota_tools/ostree_http_repo.h
+++ b/src/sota_tools/ostree_http_repo.h
@@ -37,7 +37,7 @@ class OSTreeHttpRepo : public OSTreeRepo {
   // garage-deploy is single-threaded. Reuse provides a huge performance boost
   // because certs don't have to be re-read and fewer objects get destroyed and
   // recreated.
-  static CurlEasyWrapper easy_handle_;
+  mutable CurlEasyWrapper easy_handle_;
 };
 
 // vim: set tabstop=2 shiftwidth=2 expandtab:

I think this solves the leak issue as well.

@pattivacek
Copy link
Collaborator Author

  • mutable CurlEasyWrapper easy_handle_;

Yep, that's exactly what I'm testing locally right now! I'm always scared of mutable but I think it's fine here.

@pattivacek pattivacek force-pushed the feat/OTA-4165/reuse-curl-handle branch from 3608a47 to 0f878d2 Compare April 15, 2020 15:33
@pattivacek
Copy link
Collaborator Author

pattivacek commented Apr 15, 2020

Bigger test (AGL RPi image) without this change:
Upload to Treehub complete after 6682 HEAD requests and 5893 PUT requests.
Total size of uploaded objects: 93707765 bytes.
Total runtime: 2323.02 seconds.

With this new change:
Upload to Treehub complete after 6682 HEAD requests and 5893 PUT requests.
Total size of uploaded objects: 93707765 bytes.
Total runtime: 765.403 seconds.

Not bad! ~12 minutes for ~100 MB isn't exactly awesome performance, but it's 1/3 the time it used to be.

Reuse provides a huge performance boost because certs don't have to be
re-read and fewer objects get destroyed and recreated.

Signed-off-by: Patrick Vacek <patrickvacek@gmail.com>
@mike-sul
Copy link
Collaborator

Not bad! ~12 minutes for ~100 MB isn't exactly awesome performance, but it's 1/3 the time it used to be.

It looks like a quite improvement.

curlEasySetoptWrapper(easy_handle.get(), CURLOPT_VERBOSE, get_curlopt_verbose());
server_->InjectIntoCurl(path.string(), easy_handle.get());
curlEasySetoptWrapper(easy_handle.get(), CURLOPT_WRITEFUNCTION, &OSTreeHttpRepo::curl_handle_write);
server_->InjectIntoCurl(path.string(), easy_handle_.get());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first glance, it looks like the auth creds/certs are set for each request while it can be set once in the curl context and then just CURLOPT_URL needs to be set for each request for an object so TCC/TLS connection is reused.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the goal here is to make sure that TCP/TLS connection is reused for each request for both the downloading and pushing objects to Treehub. I suppose libcurl should do it by default as long as "_Then reusing the same easy handle will ensure it can reuse its connections.", but I am not sure if it works if we setup the creds in the "easy handle" for each request.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did wonder if I should refactor that function, and there is room for some minor optimization there, but it seems like the connection is getting reused without messing with it, so I didn't change it.

Copy link
Collaborator

@mike-sul mike-sul Apr 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@mike-sul
Copy link
Collaborator

I think further steps for the performance improvement are:

  • making sure that TCP/TLS connection is reused for each request;
  • archiving/tarring all objects and delivering data within a single HTTP request (both PUT/POST and GET);

@pattivacek
Copy link
Collaborator Author

archiving/tarring all objects

This seems possible but requires backend changes as well.

delivering data within a single HTTP request (both PUT/POST and GET);

This isn't possible for the fetching because we don't know what we need to fetch until we walk the object hierarchy. For pushing, it's an option but I'm not sure it's worth it; it also requires backend changes.

I'll add these suggestions to the ticket!

@pattivacek pattivacek merged commit a4d411d into master Apr 16, 2020
@pattivacek pattivacek deleted the feat/OTA-4165/reuse-curl-handle branch April 16, 2020 08:52
@mike-sul
Copy link
Collaborator

requires backend changes as well

Yes, it requires changes in BE for sure.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants