Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the utility of the swarm stats command for determining which limits are being hit #9474

Closed
Tracked by #9442 ...
BigLep opened this issue Dec 7, 2022 · 6 comments
Labels
kind/enhancement A net-new feature or improvement to an existing feature

Comments

@BigLep
Copy link
Contributor

BigLep commented Dec 7, 2022

Below are some ideas to improve the utility of swarm stats after having used it for debugging resource management limit concerns.

I've listed these in priority order.

When --min-used-limit-perc is used, only include the specific limits that exceed min-used-limit-perc rather than the whole scope

Looking at my current limits:

./ipfs swarm limit system                         
{
  "Conns": 4611686018427388000,
  "ConnsInbound": 254,
  "ConnsOutbound": 4611686018427388000,
  "FD": 46080,
  "Memory": 6399205376,
  "Streams": 4611686018427388000,
  "StreamsInbound": 4075,
  "StreamsOutbound": 4611686018427388000
}

When I do ipfs swarm stats --min-used-limit-perc=10 system, I would want to just see "ConnsInbound" since that is what is using more than 10%. Instead I see the whole scope:

ipfs swarm stats --min-used-limit-perc=10 system
{
  "System": {
    "Conns": 239,
    "ConnsInbound": 129,
    "ConnsOutbound": 110,
    "FD": 43,
    "Memory": 9883648,
    "Streams": 248,
    "StreamsInbound": 144,
    "StreamsOutbound": 104
  }
}

(Note there is a bug with ipfs swarm stats --minu-sed-limit-perc when the "all" scope is used: #9473 )

List scopes from big picture downwards when "all" scope is used

The current dump of swarm stats all doesn't put the most important (big picture) stuff at the top:

ipfs swarm stats all | jq 'keys'
[
  "Peers",
  "Protocols",
  "Services",
  "System",
  "Transient"
]

This is exasperated by how there is lots underneath peers.

The order that I think is more intuitive is:

  1. System
  2. Transient
  3. Protocols
  4. Services
  5. Peers

With that I can do ipfs swarm stats all and read from top to bottom to get a sense of what's happening.

(The same is true for ipfs swarm limit all)

(Minor) List limits within a scope in importance order

When outputting a scope, we are listing chronologically:

ipfs swarm stats system
{
  "System": {
    "Conns": 382,
    "ConnsInbound": 212,
    "ConnsOutbound": 170,
    "FD": 71,
    "Memory": 19255296,
    "Streams": 418,
    "StreamsInbound": 252,
    "StreamsOutbound": 166
  }
}

I think it would be more useful to to put in an order like:

  "Memory",
  "FD",
  "Conns",
  "ConnsInbound",
  "ConnsOutbound",
  "Streams",
  "StreamsInbound",
  "StreamsOutbound",

(The same is true for ipfs swarm limit)

@BigLep BigLep added the kind/enhancement A net-new feature or improvement to an existing feature label Dec 7, 2022
@BigLep
Copy link
Contributor Author

BigLep commented Dec 7, 2022

Per 2022-12-07 maintainer conversation: the top ask of "When --min-used-limit-perc is used, only include the specific limits that exceed min-used-limit-perc rather than the whole scope" will require more work because of the structs we're relying on from go-libp2p.

@BigLep BigLep mentioned this issue Dec 7, 2022
@ajnavarro
Copy link
Member

When --min-used-limit-perc is used, only include the specific limits that exceed min-used-limit-perc rather than the whole scope

We cannot output specific fields on the limit struct because we are not using pointers. We can fill the limits that are above the threshold, but all other fields will appear as zeroes. Example:

ipfs swarm stats --min-used-limit-perc=10 system
{
  "System": {
    "Conns": 239,
    "ConnsInbound": 0,
    "ConnsOutbound": 0,
    "FD": 0,
    "Memory": 0,
    "Streams": 0,
    "StreamsInbound": 0,
    "StreamsOutbound": 0
  }
}

I think that can be misleading to the user. To be able to just output the specific field ("Conns" in this case) we need to copy all the libp2p Limit structs and use pointers on fields instead.

List scopes from big picture downwards when "all" scope is used

Go JSON library is ordering fields in lexicographic order. If we want to change that we should do some research about other libraries that allow changing the ordering or see if there is any way to specify an order when encoding using the standard library.

(Minor) List limits within a scope in importance order

The same as the previous point, fields are ordered in lexicographic order.

@BigLep
Copy link
Contributor Author

BigLep commented Jan 3, 2023

Thanks for the info @ajnavarro . Some thoughts...

When --min-used-limit-perc is used, only include the specific limits that exceed min-used-limit-perc rather than the whole scope

Are we sure there isn't a way to either:

  1. only copy "BaseLimit" https://github.com/libp2p/go-libp2p/blob/master/p2p/host/resource-manager/limit.go#L89 (duplicating that doesn't seem like the end of the world)
  2. Contruct a map not using the structs. It sounds like you are able to "fill the limits that are above the threshold". If so can we start with an empty map and then add the key/value pairs that are above the threshold. I think it's fine if the output is even not copy/pastable into one's config. The important thing is to be clear about the problem. Even outputting something like:
{
  "key" : "System.ConnsInbound",
   "currentValue" : 129,
   "limit" : 254
}

would be great.

List scopes from big picture downwards when "all" scope is used

rather than constructing one big struct that we print, can we do multiple print statements, one for each of these:

  1. System
  2. Transient
  3. Protocols
  4. Services
  5. Peers

Again, I think it's fine if this isn't directly copy/pastable. Clarity of information for debugging is more important.

(Minor) List limits within a scope in importance order

I'm ok to give up on this one, but if copying "BaseLimit" struct gets us there, than I'm interested in it.

@ajnavarro
Copy link
Member

AFAIU, output from swarm limit and swarm stats must be compatible with the actual configuration. If it is something that we are happy to change, I can prepare a PR with the requested changes.

@BigLep
Copy link
Contributor Author

BigLep commented Jan 4, 2023

@ajnavarro : the intent/desire to have the commands compatible with the actual configuration makes sense. That said, as I think we have found while working with users to debug their issues in #9432 that the commands in their current output form aren't as useful as they could be as they aren't highlighting the key information and bringing it to the top. The resource manager is complex and information overload or disorganization doesn't help. As a result, I personally think it's more important to provide precise information than hold to the tenet of exporting "actual configuration". To help with this, maybe we add prefixing note to the output like:

# Note: this is not complete/correct configuration

I think the way forward here is either:

  1. Async comment from @lidel OR
  2. Maintainer decision during Thursday 2023-01-05 standup

@BigLep
Copy link
Contributor Author

BigLep commented Feb 16, 2023

No more work on this specifically is planned given going to change/simplify the commands to focus on more useful debug information per #9621

@BigLep BigLep closed this as not planned Won't fix, can't repro, duplicate, stale Feb 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement A net-new feature or improvement to an existing feature
Projects
No open projects
Archived in project
Development

No branches or pull requests

2 participants