Skip to content
This repository has been archived by the owner on Aug 19, 2022. It is now read-only.

rewrite limits to allow auto-scaling #48

Merged
merged 7 commits into from
Jul 2, 2022
Merged

Conversation

marten-seemann
Copy link
Contributor

@marten-seemann marten-seemann commented Jun 10, 2022

We want to be able to auto-scale our limits so that 99% of node operators don't need to worry about them.

To do so, this PR removes dynamic limits, which, to the best of our knowledge, have not been used. Using them would be challenging anyway due to complicated interactions between resource manager and connection manager.

The ScalingLimitConfig is what makes this auto-scaling possible (and configurable). It defines both the minimum resource requirements for a libp2p node (assuming 1 GB of system memory, of which 128 MB are reserved for libp2p), e.g. in SystemBaseLimit. In SystemLimitIncrease one then specifies how many connections / streams / memory is added _for every additional GB of memory.

As a convenience function, the ScalingLimitConfig defines a AutoScale function that queries the system for available memory / FDs. Light nodes (IPFS desktop for example) can use the Scale function and pass in lower values of memory / FDs, to obtain a more lightweight resource manager.

@marten-seemann
Copy link
Contributor Author

@vyzo This should be ready for review now.

Copy link
Collaborator

@vyzo vyzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see the comment.

limit.go Show resolved Hide resolved
@BigLep
Copy link
Contributor

BigLep commented Jun 18, 2022

@marten-seemann : what are the next steps here? Would it help to have @MarcoPolo review?

@MarcoPolo MarcoPolo mentioned this pull request Jun 19, 2022
7 tasks
Copy link
Contributor

@MarcoPolo MarcoPolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay. Just some minor nits. I feel like this is really only useful for helping set up good defaults, but I think users will probably use the fixed limits when tuning this themselves.

}

func (l *BasicLimiter) GetServicePeerLimits(svc string) Limit {
pl, ok := l.ServicePeerLimits[svc]
func (l *fixedLimiter) GetServiceLimits(svc string) Limit {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR, but I would like a method for Limiter that iterates over all limits so that we can expose this information to the use (via opencensus or something else).

Maybe something like GetAllLimits(forEach func(scopeName string, Limit))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be good to log at the start as well since it can be a bit confusing what the final values are at the end of the day.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like this in to help debugging :)

limit_defaults.go Show resolved Hide resolved
limit_defaults.go Show resolved Hide resolved
StreamMemory int64
// Scale scales up a limit configuration.
// memory is the amount of memory that the stack is allowed to consume,
// for a full it's recommended to use 1/8 of the installed system memory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// for a full it's recommended to use 1/8 of the installed system memory.
// for a node dedicated to this process it's recommended to use 1/8 of the installed system memory.

@marten-seemann
Copy link
Contributor Author

@vyzo Could you please review this again? I'll be merging this PR tomorrow (Thu) by EOD.

Copy link
Collaborator

@vyzo vyzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still dont see how the user can do full manual specification, which is a feature we need to retain support for.
Auto* stuff have a tendency to do unpredictable things.

@@ -20,19 +23,6 @@ type Limit interface {
GetConnTotalLimit() int
// GetFDLimit returns the file descriptor limit.
GetFDLimit() int

// WithMemoryLimit creates a copy of this limit object, with memory limit adjusted to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we please keep the prototype constructors for limit in some form?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are not constructors. They’re methods on an interface, where they don’t belong.

Copy link
Collaborator

@vyzo vyzo Jun 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they are prototype OO constructors hich allow you to easily adjust parts of the limit without knowing the concrete types.

They belong just fine, please provide them or a suitable replacement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't found any use for them. They conflate scaling and limit configuration. Disentangling those made this PR so invasive, but ultimately resulted in a cleaner structure.

@marten-seemann
Copy link
Contributor Author

I still dont see how the user can do full manual specification, which is a feature we need to retain support for.
Auto* stuff have a tendency to do unpredictable things.

NewFixedLimiter takes a LimitConf struct. The LimitConf struct contains all the limits exactly as they'll be applied.
How you arrive at the LimitConf is your decision:

  1. You can define it yourself, based on whatever algorithm you like (or read it from a config file)
  2. You can define a ScalingLimitConf, then call Scale() on it, which produces a LimitConf.

@vyzo
Copy link
Collaborator

vyzo commented Jun 29, 2022

I can't approve this as is. This is too much of a breaking change without providing replacement for anything it is removing.

I want autoscaling too, but this is flat out throwing everything else away, including the baby with the bathwater.

@marten-seemann
Copy link
Contributor Author

I don't understand what you're missing here. This builds with go-ipfs, proving that there's nothing left in the API.

@BigLep
Copy link
Contributor

BigLep commented Jul 1, 2022

2022-07-01 conversation:

We need a README update that discusses:

  1. What happens when someone passes the DefaultConfig.
  2. How someone sets static limits
  3. What happens when uses Scaled limits
  4. How someone changes the Scaled limits
  5. How someone overrides the computed scaled limits

We're also going to log the limits that the resource manager was configured with.

@vyzo
Copy link
Collaborator

vyzo commented Jul 1, 2022

and i must insist on having prototype adjustment in the limit interface.

Also, it must be made abundantly clear how we can do full manual control, the functionality that was previously provided and seems gone now.

As it is, it adds a desirable feature but on the same time regresses on fuma control.
I would like to exercise more care in not breaking existing code or removing existing functionality without replacement.

Copy link
Contributor

@MarcoPolo MarcoPolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in. I don't think it'll break existing use cases because:

  1. Users can still pass in limits from a json config.
  2. Users can define a custom limit config and merge it with the autoscale defaults.
  3. Users can modify a limit by modifying the LimitConfig instead of the prototype methods.

And we get some new functionality around scaling. And net less code (nice work!).

I would like the log of what the limits are, but it's really not a blocker for this PR so feel free to do in a separate PR along with Steve's suggestions above.

@marten-seemann
Copy link
Contributor Author

Rebased on top of master.

@marten-seemann
Copy link
Contributor Author

Merging this PR. There'll be some follow up PRs for documentation and FD counting on Windows.

@marten-seemann marten-seemann merged commit d0a1694 into master Jul 2, 2022
@BigLep
Copy link
Contributor

BigLep commented Jul 5, 2022

@marten-seemann : are there issues for the followup work? I'm interested in the doc items for showing what the user experience is going to be like for anyone configuring the resource manager.

@marten-seemann
Copy link
Contributor Author

@BigLep It's already done: #58 and #59. Merged and released.

@ajnavarro ajnavarro mentioned this pull request Aug 24, 2022
72 tasks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants