Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

determine thread stack size with pthread API #6970

Closed
wants to merge 1 commit into from

Conversation

damaxwell
Copy link
Contributor

Moves thread stack size computation out of Fiber and into Thread.

Switches to pthread APIs for computing stack size.

@ysbaddaden
Copy link
Contributor

Thanks a lot!

The only bit I'm wary of, is whether the stack addr/size/guard is reported correctly for the main thread, since there isn't a standard and portable pthread_getattr. I investigated and wrote my results below. Only darwin seems to behave incorrectly for the main thread.

Copy link
Contributor

@ysbaddaden ysbaddaden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. it seems pthread_get_stacksize_np can't be trusted for the main thread on Darwin;
  2. maybe harmonize code style (pedantic) to improve readability?
  3. some implementations (e.g. linux) report a guard size; could it be useful?

src/thread.cr Outdated
stack_size = LibC.pthread_get_stacksize_np(@th);
top = Pointer(Void).new(bottom.address - stack_size)
{% elsif flag?(:freebsd) %}
err = LibC.pthread_attr_init(out attr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Pedantic: can we harmonize the implementation within thread? I use ret (instead of err) and inline the check+raise.
  • Leak: if any further call to pthread fails, the pthread_attr_t object is never destroyed (this applies to linux as well).

For example:

ret = LibC.pthread_attr_init(out attr)
raise Errno.new("pthread_attr_init", ret) unless ret == 0

begin
  ret = LibC.pthread_attr_get_np(@th, pointerof(attr))
  raise Errno.new("pthread_attr_get_np", ret) unless ret == 0

  # ...
ensure
  LibC.pthread_attr_destroy(pointerof(attr))
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True enough about that leak. I had seen the same leak also in #6944 in the changes to ConditionVariable#initialize and had assumed that you hadn't used full care to release the resources because if these calls are failing the house is on fire, and maybe the leaky faucet isn't so worrisome. I'll fix this here and submit a separate PR for that other one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I totally missed it! I only overlooked some possible errnos that are very unlikely (per the manpages) and not worth raising.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the requested changes:

  1. On Darwin, on the main thread, we fall back to getrlimit.
  2. Switched to ret not err and tried to match nearby code style (throw ... unless ret == 0)
  3. This was the biggest change. For platforms that do not have an equivalent of pthread_getattr_np we need to cache the thread attributes at the time of creation to access the guard size. The stack is returned as a separate Thread::Stack structure that maintains a pointer, a size, and an optional guard size, which is nil if is unknown (i.e. on the main process thread). If you think this is pedantic, we can simply report a 0 instead. My previous use of Slice(Void) to represent a stack turned out to be a problem because it uses an Int32 to represent its count, and (rarely, but in principle) a stack can have more bytes than that.

The leak was addressed as well. For consistency in the code base, probably ConditionVariable also ought to get fixed; it doesn't look like the errno's that could occur there are much less likely than the errno's addressed here.

src/fiber.cr Outdated
if @stack != Pointer(Void).null
return Slice(Void).new(@stack, @stack_bottom.address - @stack.address)
end
# Otherwise, assume for now that there is only one thread,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use a more detailed explanation:

the fiber represents a thread's main stack and we assume it's the currently running thread; this should be fine since a thread should never resume the main fiber of another thread (undefined behavior? chaos?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not pass the thread stack through the Fiber.new constructor. Then we simply have a fiber constructor for creating a fiber with a new stack and a fiber constructor for creating a fiber with an existing stack, and Fiber never has to worry about Thread.current.stack at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think it's better to properly initialize @stack already in the constructor.

Copy link
Contributor Author

@damaxwell damaxwell Oct 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can’t cache @stack in Fiber because it is dynamic in time. What is is at the time of Fiber construction need not be what it is when a segfault occurs later in the run. The stack bottom stays put but the stack size adjusts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did too much low level programming, I guess. It sounded safe enough, but it's better to be really safe.

Is the stack size really changing? We never tweak it, and I'm not sure we ever do.

If so, maybe linking a Fiber to its Thread when it represents a thread stack would be a better idea? We could then have something like @thread.try(&.stack) || @stack.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On x86_64-linux-musl at least, the stack sizes reported from pthread_attr_get_np & pthread_attr_getstack on the main thread reflect the current stack allocation, which can be much smaller than the ultimate limit, so it will grow in time. And although the Crystal code base never changes the limits on the main stack, there is nothing to stop a user from calling setrlimit, which will change the main stack size.

Saving the associated thread if the Fiber represents a thread's stack (main thread or otherwise), is the right thing to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it turned out that it having the Thread know about its Fiber and the Fiber knowing about its Thread didn't mix well with the fact that Thread has a finalizer. Numerous GC Warning: Finalization cycle warnings were generated running the spec suite.

So instead, a Fiber returns an optional Thread::Stack which is Nil if the Fiber is not responsible for maintaining its stack (i.e. the main Fiber for a Thread). A Thread always knows about its own stack and can return it. If you want the current stack, talk to the Crystal::Scheduler, which knows both the current Thread and the current Fiber and can figure out what is the right stack to report.

src/signal.cr Outdated Show resolved Hide resolved
src/thread.cr Outdated Show resolved Hide resolved
src/fiber.cr Outdated Show resolved Hide resolved
@damaxwell damaxwell force-pushed the pthread-stack branch 2 times, most recently from 65119f2 to aff2888 Compare October 25, 2018 18:29
src/thread.cr Outdated Show resolved Hide resolved
src/thread.cr Outdated Show resolved Hide resolved
src/thread.cr Outdated Show resolved Hide resolved
src/thread/stack.cr Outdated Show resolved Hide resolved
src/thread.cr Outdated Show resolved Hide resolved
@RX14
Copy link
Contributor

RX14 commented Oct 26, 2018

This is basically fine

@bcardiff bcardiff added this to the 0.27.1 milestone Nov 2, 2018
Copy link
Member

@bcardiff bcardiff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. My only input would be the same as #6970 (comment) but can be done later.

@ysbaddaden are you satisfied with the current behaviour?

@damaxwell would you mind rebasing over master?

@damaxwell
Copy link
Contributor Author

@bcardiff: Rebased.

@RX14 RX14 requested a review from ysbaddaden November 8, 2018 21:43
@bcardiff bcardiff modified the milestones: 0.27.1, 0.28.0 Jan 30, 2019
@straight-shoota straight-shoota added kind:feature topic:stdlib:concurrency pr:needs-work A PR requires modifications by the author. labels Feb 15, 2019
@bcardiff bcardiff mentioned this pull request Mar 14, 2019
@bcardiff bcardiff removed this from the 0.28.0 milestone Mar 27, 2019
@damaxwell damaxwell closed this Jun 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:feature pr:needs-work A PR requires modifications by the author. topic:stdlib:concurrency
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants