Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up global variables in safekeeper code #8200

Open
3 of 4 tasks
petuhovskiy opened this issue Jun 28, 2024 · 2 comments
Open
3 of 4 tasks

Clean up global variables in safekeeper code #8200

petuhovskiy opened this issue Jun 28, 2024 · 2 comments
Labels
a/tech_debt Area: related to tech debt c/storage/safekeeper Component: storage: safekeeper m/good_first_issue Moment: when doing your first Neon contributions

Comments

@petuhovskiy
Copy link
Member

petuhovskiy commented Jun 28, 2024

We have at least several places in the code where global variables are used:

  • some places call GlobalTimelines::<funcname> to lookup timelines
  • static TIMELINES_STATE stores all timelines as a global variable
  • static REMOTE_STORAGE stores config for remote storage (e.g. S3)

To resolve the issue, we should replace GlobalTimelines:: calls with Arc<GlobalTimelines> struct. It will contain the hashmap of all timelines (the same way TIMELINES_STATE is working now). It should be passed around to the code that needs access to all timelines, instead of everything relying on global access through GlobalTimelines.

  • remove all GlobalTimelines:: calls
  • remove static TIMELINES_STATE

REMOTE_STORAGE too should be passed around as a Arc<RemoteStorage>. Each task that requires access to remote storage should get Arc<RemoteStorage> as a dependency during initialization.

  • remove static REMOTE_STORAGE

If some background task requires access to all global state (remote storage, timeline, config), we can bundle them together inside Arc<SafekeeperApp>, which is easier to carry around.

Also, currently struct GlobalTimelinesState contains two fields that are part of the global state:

conf: Option<SafeKeeperConf>,
broker_active_set: Arc<TimelinesSet>,

Currently conf is required in many places and usually it just copied from global state on demand. The config is loaded once on startup and never changes. This is wasteful, we should replace it with Arc<SafeKeeperConf> and always carry Arc<..> instead of SafeKeeperConf.

  • replace SafeKeeperConf usages with Arc<SafeKeeperConf> everywhere

Why?

To improve testing, to allow to run more than one safekeeper instance in a single process.
To simplify creating new dependencies without introducing global variables and special getters.

@petuhovskiy petuhovskiy added c/storage/safekeeper Component: storage: safekeeper a/tech_debt Area: related to tech debt m/good_first_issue Moment: when doing your first Neon contributions labels Jun 28, 2024
@fowlerlee
Copy link

@petuhovskiy can I take up this issue ?

@petuhovskiy
Copy link
Member Author

@petuhovskiy can I take up this issue ?

@fowlerlee
Sure, no one is working on this right now. You can start and open a PR, thanks!

github-merge-queue bot pushed a commit that referenced this issue Dec 9, 2024
Hello! I was interested in potentially making some contributions to Neon
and looking through the issue backlog I found
[8200](#8200) which seemed
like a good first issue to attempt to tackle. I see it was assigned a
while ago so apologies if I'm stepping on any toes with this PR. I also
apologize for the size of this PR. I'm not sure if there is a simple way
to reduce it given the footprint of the components being changed.

## Problem
This PR is attempting to address part of the problem outlined in issue
[8200](#8200). Namely to
remove global static usage of timeline state in favour of
`Arc<GlobalTimelines>` and to replace wasteful clones of
`SafeKeeperConf` with `Arc<SafeKeeperConf>`. I did not opt to tackle
`RemoteStorage` in this PR to minimize the amount of changes as this PR
is already quite large. I also did not opt to introduce an
`SafekeeperApp` wrapper struct to similarly minimize changes but I can
tackle either or both of these omissions in this PR if folks would like.

## Summary of changes
- Remove static usage of `GlobalTimelines` in favour of
`Arc<GlobalTimelines>`
- Wrap `SafeKeeperConf` in `Arc` to avoid wasteful clones of the
underlying struct

## Some additional thoughts
- We seem to currently store `SafeKeeperConf` in `GlobalTimelines` and
then expose it through a public`get_global_config` function which
requires locking. This seems needlessly wasteful and based on observed
usage we could remove this public accessor and force consumers to
acquire `SafeKeeperConf` through the new Arc reference.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a/tech_debt Area: related to tech debt c/storage/safekeeper Component: storage: safekeeper m/good_first_issue Moment: when doing your first Neon contributions
Projects
None yet
Development

No branches or pull requests

2 participants