Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iroh-net: netcheck actor dropped stun response channel #2647

Open
dignifiedquire opened this issue Aug 19, 2024 · 5 comments
Open

iroh-net: netcheck actor dropped stun response channel #2647

dignifiedquire opened this issue Aug 19, 2024 · 5 comments
Assignees
Labels
bug Something isn't working c-iroh-net
Milestone

Comments

@dignifiedquire
Copy link
Contributor

Have some reports of this happening quite often, inside docker (iroh@0.22)

ERROR iroh_net::netcheck::reportgen::hairpin: Hairpin actor failed: netcheck actor dropped stun response channel
at <..>/.cargo/registry/src/index.crates.io-6f17d22bba15001f/iroh-net-0.22.0/src/netcheck/reportgen/hairpin.rs:97
@dignifiedquire dignifiedquire added c-iroh-net bug Something isn't working labels Aug 19, 2024
@flub flub added this to the v0.24.0 milestone Aug 20, 2024
@flub flub self-assigned this Aug 26, 2024
@ramfox ramfox modified the milestones: v0.24.0, v0.25.0 Aug 30, 2024
@flub
Copy link
Contributor

flub commented Sep 6, 2024

Do we have any more details about this? E.g. is this at shutdown? Is this while it is running? Is there a full log?

@flub
Copy link
Contributor

flub commented Sep 6, 2024

I should summarise what I know from having looked at this:

  • This happens when the netcheck actor is no longer running but the hairpin actor is not running anymore.
  • This itself should not happen.
  • Maybe it could happen at shutdown
    • All the actors involved (netcheck -> reportgen -> hairpin) are all supposed to abort on drop.
    • I think drop order might be fine but an not entirely sure. If say the hairpin actor is dropped while it happens to be running what happens? It first waits until it is pending again before it is dropped?
  • If this happens during normal runtime that means the netcheck actor died.

flub added a commit that referenced this issue Sep 6, 2024
netcheck::Client owns the actor task and when dropped it will abort
the actor task.  Making a struct owning a task Clone means it is easy
to lose track of who should be owning a task like this.  I now believe
each task should have a clear supervisor/owner in charge of it.

This cleans up the multiple-ownership of the netcheck::Client, which
is just a small step into this direction.  Later on, for e.g. #2647,
more supervision will be added.  But small changes are good.
github-merge-queue bot pushed a commit that referenced this issue Sep 9, 2024
## Description

netcheck::Client owns the actor task and when dropped it will abort
the actor task.  Making a struct owning a task Clone means it is easy
to lose track of who should be owning a task like this.  I now believe
each task should have a clear supervisor/owner in charge of it.

This cleans up the multiple-ownership of the netcheck::Client, which
is just a small step into this direction.  Later on, for e.g. #2647,
more supervision will be added.  But small changes are good.

## Breaking Changes

- `iroh_net::netcheck::Client::receive_stun_packet` is no longer
available.
- `iroh_net::netcheck::Client` is not longer `Clone`.

## Notes & open questions

I don't want to make `Addr` public, in fact I'd like all of netcheck to
be
private. So I made some docs not links to avoid the warnings resulting
to
linking to private items from public docs.

Removing `Client::receive_stun_packet` is a bit harsh. I'd like to make
all
of netcheck private, but the cli uses it for the doctor so I'm a bit
stuck.
In any case, the way that uses it does not need `receive_stun_packet`.

## Change checklist

- [x] Self-review.
- [x] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- ~~[ ] Tests if relevant.~~
- ~~[ ] All breaking changes documented.~~
@dignifiedquire
Copy link
Contributor Author

@flub can this be closed?

@flub
Copy link
Contributor

flub commented Sep 16, 2024

Does it still happen to the original reporter? (I assume you are a proxy)

@dignifiedquire
Copy link
Contributor Author

I don’t know, will try to find out

@dignifiedquire dignifiedquire modified the milestones: v0.25.0, v0.26.0 Sep 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working c-iroh-net
Projects
Status: No status
Development

No branches or pull requests

3 participants