Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add lock around all API calls #1021

Merged
merged 1 commit into from
Nov 21, 2022
Merged

add lock around all API calls #1021

merged 1 commit into from
Nov 21, 2022

Conversation

simonbyrne
Copy link
Collaborator

@simonbyrne simonbyrne commented Oct 27, 2022

As a potential fix to #835 and #990: this adds a simple lock around all API calls: I was initially worried this would cause a performance bottleneck, but some simple examples didn't show any real difference.

@mkitti
Copy link
Member

mkitti commented Oct 27, 2022

Would it make sense to provide this in two layers

  1. A raw API module without any locks
  2. An Thread-safe API module with locks like you've done here.

Basically we run the generator twice and produce two separate modules.

For the higher level API we may want to engage the locks for a series of low-level API calls rather than locking and unlocking for each API call.

@mkitti
Copy link
Member

mkitti commented Oct 27, 2022

We could even use a Preferences.jl mechanism to enable locks everywhere or not by default.

@simonbyrne
Copy link
Collaborator Author

That's what I was originally thinking, but it seems like the overhead is small enough that it might not be worth it?

@simonbyrne
Copy link
Collaborator Author

I've tried some more simple test cases (e.g. reading/writing single values in a loop) and I don't see any observable overhead from the lock.

Copy link
Member

@mkitti mkitti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we create a toggle to make this easier to test?

Another option would be to copy api.jl and make a rawapi.jl.

Then api.jl has use_lock() = true and rawapi.jl has use_lock() = false.

Comment on lines +244 to +250
jlfuncbody = Expr(
:block, __source__, :(lock(liblock)), :($statsym = try
$ccallexpr
finally
unlock(liblock)
end)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
jlfuncbody = Expr(
:block, __source__, :(lock(liblock)), :($statsym = try
$ccallexpr
finally
unlock(liblock)
end)
)
jlfuncbody = Expr(
:block, __source__, :(use_lock() && lock(liblock)), :($statsym = try
$ccallexpr
finally
use_lock && unlock(liblock)
end)
)

Copy link
Collaborator Author

@simonbyrne simonbyrne Nov 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds two runtime lookups at each call

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If use_lock is constant, (e.g. use_lock() = false), it gets inlined. The generated code does not actually call use_lock at runtime.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I saw that's what you did in #1022. That's an option, but I would argue that using the lock should be the default.

@@ -13,6 +13,8 @@ else
)
end

const liblock = ReentrantLock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const liblock = ReentrantLock()
const liblock = ReentrantLock()
const _USE_LOCK = Ref(true)
use_lock() = _USE_LOCK[]

@mkitti
Copy link
Member

mkitti commented Nov 8, 2022

@simonbyrne What's your reading of JuliaLang/julia#35689 ? It sounds to me like there may be an issue if finalizers acquire locks at the moment, with that issue still open.

@simonbyrne
Copy link
Collaborator Author

@simonbyrne What's your reading of JuliaLang/julia#35689 ? It sounds to me like there may be an issue if finalizers acquire locks at the moment, with that issue still open.

I think that is addressed by JuliaLang/julia#38487, which prevents finalizers from running on the current thread until the lock is released.

@mkitti
Copy link
Member

mkitti commented Nov 8, 2022

That's strange that a pull request with a lower number resolves an issue with a higher number.

@musm
Copy link
Member

musm commented Nov 9, 2022

@mkitti I haven't fully investigated the Preferences.jl solution, but this seems a lot simpler and since there's no tradeoff with these calls, I don't see a strong reason to make it configurable. Are there any use cases for users to want to turn it off other than devs wanting to experiment?

@mkitti
Copy link
Member

mkitti commented Nov 9, 2022

The statement that "there is no tradeoff" still needs to be clearly established. Anecdotal testing does suggest that the effect is small, but we do not have a way of comprehensively establishing this for all applications. Thus, we need to provide a way for the user to do the experiment with their specific applications.

The result "there is no tradeoff" does seem surprising. A single HDF5.jl function usually makes several C API calls. Locking and unlocking several times within a high level function call does seem inefficient. Is there really no cost to this at all? If there were no cost, then why not enable thread safety within the C API by default?

@mkitti
Copy link
Member

mkitti commented Nov 10, 2022

With lock:

Test Summary:                      | Pass  Broken  Total     Time
HDF5.jl                            | 1085       5   1090  1m55.2s
  plain                            |  151       1    152    23.1s
  complex                          |   13             13     0.8s
  undefined and null               |    4              4     0.0s
  abstract arrays                  |    2              2     0.6s
  empty and 0-size arrays          |   39             39     0.6s
  generic read of native types     |   17             17     0.6s
  show                             |   51             51     2.8s
  split1                           |   13             13     0.3s
  haskey                           |   18             18     0.5s
  AbstractString                   |   51             51     4.3s
  opaque data                      |    7              7     1.2s
  FixedStrings and FixedArrays     |   18             18     1.0s
  Object Exists                    |    6              6     0.0s
  HDF5 existance                   |    4              4     0.0s
  bounds                           |    2              2     0.2s
  h5a_iterate                      |    8              8     0.2s
  h5l_iterate                      |    8              8     0.2s
  compound                         |   10             10     3.6s
  custom                           |    6              6     0.7s
  reference                        |    6              6     0.2s
  Dataspaces                       |   91             91     0.4s
  Datatypes                        |   15             15     0.1s
  BlockRange                       |   35             35     0.3s
  hyperslab                        |    5              5     1.2s
  read 0-length arrays: issue #859 |                None     0.2s
  attrs interface                  |   92             92     1.3s
  readremote                       |   23             23     1.6s
  extend                           |   25             25     1.7s
  gc                               |  101            101     3.8s
  external                         |    6              6     0.1s
  swmr                             |    4              4     8.2s
  mmap                             |    9              9     1.0s
  properties                       |   42       2     44     2.0s
  filter                           |   48       1     49    14.0s
  Raw Chunk I/O                    |   52             52     4.6s
  fileio                           |    6              6     1.8s
  track order                      |   18             18     2.4s
  h5f_get_dset_no_attrs_hint       |    6              6     0.4s
  non-allocating methods           |   11       1     12     1.2s
  Compression Filter Unit Tests    |    6              6     0.2s
  Object API                       |   38             38     0.8s
  virtual dataset                  |    5              5     0.9s
  API Lock Preference              |    5              5     0.7s
     Testing HDF5 tests passed 

Without lock:

Test Summary:                      | Pass  Broken  Total     Time
HDF5.jl                            | 1085       5   1090  1m40.0s
  plain                            |  151       1    152    21.9s
  complex                          |   13             13     0.8s
  undefined and null               |    4              4     0.0s
  abstract arrays                  |    2              2     0.6s
  empty and 0-size arrays          |   39             39     0.6s
  generic read of native types     |   17             17     0.6s
  show                             |   51             51     2.8s
  split1                           |   13             13     0.3s
  haskey                           |   18             18     0.4s
  AbstractString                   |   51             51     4.0s
  opaque data                      |    7              7     1.2s
  FixedStrings and FixedArrays     |   18             18     0.9s
  Object Exists                    |    6              6     0.0s
  HDF5 existance                   |    4              4     0.0s
  bounds                           |    2              2     0.2s
  h5a_iterate                      |    8              8     0.2s
  h5l_iterate                      |    8              8     0.2s
  compound                         |   10             10     3.5s
  custom                           |    6              6     0.7s
  reference                        |    6              6     0.2s
  Dataspaces                       |   91             91     0.4s
  Datatypes                        |   15             15     0.0s
  BlockRange                       |   35             35     0.3s
  hyperslab                        |    5              5     1.2s
  read 0-length arrays: issue #859 |                None     0.2s
  attrs interface                  |   92             92     1.3s
  readremote                       |   23             23     1.6s
  extend                           |   25             25     1.7s
  gc                               |  101            101     3.5s
  external                         |    6              6     0.0s
  swmr                             |    4              4     7.7s
  mmap                             |    9              9     1.0s
  properties                       |   42       2     44     1.7s
  filter                           |   48       1     49    13.7s
  Raw Chunk I/O                    |   52             52     4.4s
  fileio                           |    6              6     1.7s
  track order                      |   18             18     2.4s
  h5f_get_dset_no_attrs_hint       |    6              6     0.3s
  non-allocating methods           |   11       1     12     1.2s
  Compression Filter Unit Tests    |    6              6     0.2s
  Object API                       |   38             38     0.7s
  virtual dataset                  |    5              5     0.9s
  API Lock Preference              |    5              5     0.6s
     Testing HDF5 tests passed 

That suggests the tests with the lock in place could take 15% longer than without the lock in place. I think that's at the high end. Over a number of tests it looks like 5% with a range of 2% to 16%.

@mkitti mkitti mentioned this pull request Nov 18, 2022
@mkitti
Copy link
Member

mkitti commented Nov 18, 2022

Should we bump to 0.17 if we merge this? It does not look directly breaking to me, but it is a significant change that it could be.

@musm
Copy link
Member

musm commented Nov 21, 2022

I think we can probably keep things at 0.16, this PR should be a strict improvement (sans minor speed regression)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants