Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib/pull: Wait for pending ops to complete on error #1185

Closed
wants to merge 2 commits into from

Conversation

cgwalters
Copy link
Member

I saw in a stack trace that the main thread was calling exit() even while
worker threads were alive and doing sha256/write/fsync etc. for objects.

The stack trace was a SEGV as the main thread was calling into library
atexit() handlers and we were a liblz4 destructor:

 #0  0x00007f2db790f8d4 _fini (liblz4.so.1)
 #1  0x00007f2dbbae1c68 __run_exit_handlers (libc.so.6)

(Why that library has a destructor I don't know offhand, can't find
it in the source in a quick look)

Anyways, global library destructors and worker threads continuing simply don't
mix. Let's wait for our outstanding operations before we exit. This is also a
good idea for projects using libostree as a shared library, as we don't want
worker threads outliving operations.

Our existing pull corruption tests exercise coverage here.

I added a new caught-error status boolean to the progress API, and use it the
commandline to tell the user that we're waiting for outstanding ops.

I saw in a stack trace that the main thread was calling `exit()` even while
worker threads were alive and doing sha256/write/fsync etc. for objects.

The stack trace was a SEGV as the main thread was calling into library
`atexit()` handlers and we were a liblz4 destructor:

```
 #0  0x00007f2db790f8d4 _fini (liblz4.so.1)
 #1  0x00007f2dbbae1c68 __run_exit_handlers (libc.so.6)
```

(Why that library has a destructor I don't know offhand, can't find
 it in the source in a quick look)

Anyways, global library destructors and worker threads continuing simply don't
mix. Let's wait for our outstanding operations before we exit. This is also a
good idea for projects using libostree as a shared library, as we don't want
worker threads outliving operations.

Our existing pull corruption tests exercise coverage here.

I added a new `caught-error` status boolean to the progress API, and use it the
commandline to tell the user that we're waiting for outstanding ops.
g_queue_clear (&pull_data->scan_object_queue);
g_hash_table_remove_all (pull_data->pending_fetch_metadata);
g_hash_table_remove_all (pull_data->pending_fetch_deltaparts);
g_hash_table_remove_all (pull_data->pending_fetch_deltaparts);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/deltaparts/content/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hooray for code review! Fixup ⬇️

@jlebon
Copy link
Member

jlebon commented Sep 19, 2017

@rh-atomic-bot r+ 1b6c8e5

@rh-atomic-bot
Copy link

⌛ Testing commit 1b6c8e5 with merge 5c4f26b...

@rh-atomic-bot
Copy link

☀️ Test successful - status-atomicjenkins
Approved by: jlebon
Pushing 5c4f26b to master...

cgwalters added a commit to cgwalters/ostree that referenced this pull request Sep 27, 2017
I now believe the flatpak issue we were hitting was
ostreedev#1185
but let's add these tests anyways for more coverage.
rh-atomic-bot pushed a commit that referenced this pull request Sep 28, 2017
I now believe the flatpak issue we were hitting was
#1185
but let's add these tests anyways for more coverage.

Closes: #888
Approved by: jlebon
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants