-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Conversation
as part of an effort to understand the notes better (and, more generally how git refs are passed around) I elected to create an opaque type to be used anywhere git refs are passed around. I'd say that (among other things like the sheer amount of change happening here being way too high) this is not something that should be a part of this project. That said, it is somewhat instructive to see how/where refs are used. |
You need to fetch them first |
I don't think you need to replace these -- they get written to accompany commits, and in read-only mode, fluxd shouldn't be making any commits. |
@squaremo if I don't need to replace git notes... then what are they there for ultimately? Before you answer that please understand that (with this as some evidence) I've tried to get to the precise bottom of that question and haven't been able to fully understand that yet. It appears that, at least in part, they're there for some of the APIs? All the same, I'll continue, then, on short-circuiting calls to notes when in read-only mode (or ConfigMap state mode, rather). @stefanprodan the following seems to suggest to me that we're going to be hitting more than 1 megabyte in storage relatively easily:
Given that.. I'll keep it shimmed out in an in-memory object until we figure out a next course of action. |
It's a read-only mode. You are by definition removing the ability to write to the repo, and not replacing it. |
I think you're misunderstanding what I'm specifically saying. I understand that git notes need to be pushed to work. I'm talking about the underlying information stored in the notes. That information (today encapsulated in a git note but perhaps tomorrow encapsulated elsewhere in the context of a hypothetical read-only mode), I am assuming, is necessary for something to function (otherwise, why would it be done in the first place?). Whatever facility within flux it is that needs the information (today stored in a note) will presumably need that information in the future. That means that I will need to store that information elsewhere for it to be made available for consumption. |
I'm not. I'll elaborate on what I said earlier: git notes are used to annotate commits with fluxd-specific metadata. If you're not making commits or push them, you won't make notes or push them. You will still need to read notes, because they might be created by some other process (like another fluxd that has write access). Can you give an example of where you find you need to put the data that goes into a note, somewhere else? |
6daa14a
to
34a205b
Compare
@squaremo as an update: everything seems to be working. The tests all pass and I have smoke-tested it with readonly mode both on and off. Please take a closer look at the code, now, and let me know what you think about the general direction. There are two areas in particular I'd like you to look at (in addition to reading the rest):
Probably not a good time to really critique things line by line (I want to write a lot more tests next), but it seems to be straightforward enough to prove out the concept. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, the ergonomics of this for users will be pretty good.
Before moving out of draft/WIP state, I think these general points (with specifics in the comments) will need to be addressed:
- there's quite a few changes that are not related to what this PR sets out to do
- there's a few gratuitous changes, a small number harmful, most just unnecessary
- the abstraction for storing the sync high water mark is in the wrong place
@squaremo thank you for taking a look. I'll be going through all of these individually asap. regarding some of the changes, when I said:
I was trying to save you the frustration of not having context for some of the changes. Some of them look like random renamings, but only because the context is missing. If there are 8 places in the code that say Same goes for |
ok, I've gone through all the notes and resolved the small/easy ones and left comments on the others. To say it again here: yes, I am aware that there's an architectural problem that needs to be solved, to quote my first message on this topic from before you took a look:
I'll take a look next at extracting the "sync marker" logic to another package and see where we get from there, but it's not clear to me (yet) how that solves the issue of |
19a7b8e
to
bc3c0ed
Compare
I have tested the latest push in both case RepoCloned:
if !r.IsReadOnly() {
ctx, cancel := context.WithTimeout(bg, r.timeout)
err := checkPush(ctx, dir, url)
cancel()
if err != nil {
r.setUnready(RepoCloned, err)
return false
}
}
r.setReady()
// Treat every transition to ready as a refresh, so
// that any listeners can respond in the same way.
r.refreshed()
return true The issue is that this code is actually faulty in its reasoning. It's taken from a switch statement in The This PR actually creates a perfect place for this functionality, the
To me, that's sync functionality that goes beyond the git package as I currently understand the intended architectural boundaries of the
Previously these boiled down to the same question - "can I push a tag", but today they are separate questions. There seem to be other implications of being in a state where we want flux to know it can, indeed, push commits for releases and the like, but that it should not use git for storing the high water mark state. I went ahead and implemented what I'm talking about with making I'm still doing some testing, but aside from this problem we're certainly getting somewhere closer to what resembles a finish line. Once @squaremo you agree with the direction I'm going to turn my attention to refactoring the tests (and writing new ones). |
I don't think it is. A repo is either read-only, or read/write. If you expect to be pushing tags*, make a read/write repo. *In gitlab at least, there are grades of permission, such that it's possible to be able to push commits to the repo, but not to force-push a tag. If we want to account for that, then the way to do it would be to represent the grades of interest as requirements of the repo, replacing |
27b86d4
to
58fdc6d
Compare
I have reviewed every line of the diff many times now and successfully tested it and believe the implementation to be very near completion (if not already complete). From a high level here are the next steps as I see them:
Please respond to each of the following in addition to above request:
|
one of @2opremio / @hiddeco / @stefanprodan do you think you could take a look at the above ? I would appreciate that greatly if so. You don't have to go full-bore on it line-by-line (unless you want to, which would be great in my book), but an overall "looks good but maybe change x to be like y" might save some time when @squaremo has time to get to it. I want to keep it moving if at all possible. |
#1791 is going to give you conflicts once it has been merged. I have taken a look at this PR and compared it to changes I made in the mentioned PR and these are the things that stand out to me:
|
I don't see why the SyncMarkerAction exists, to be quite honest. All the SyncProvider cares about is moving the high water mark. It doesn't need a message or a signing key to do that. Those things are specific to moving a tag, which is a git operation, not a SyncProvider operation.
I think that's a reasonable design, yes. The secret is mandatory, at least at present; we know its name because it's used for the keyring; and, it's related to the git repo.
The Kubernetes API does have optimistic locking, if you're concerned about that. But setting it to an empty string, and treating an empty string as equivalent to a missing value, is fine I would think.
OK.
It'd be fine if the Repo, or otherwise the git package, had methods for dealing with tags. That would mean you can avoid putting the SyncProvider implementation in |
to be clear, I'm waiting on the first step of the above to be completed:
|
My guidance remains as above:
In addition, take Hidde's notes into consideration (you may want to look at the ratchet mechanism in #1791), since it's likely you'll need to rebase on that at some point. |
Any progress on this? Would be exciting to use :) |
678d008
to
dd260b7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Tested with stefanprodan/flux:readonlygit-fbc3cc98
and a readonly deploy key.
Works great! Thanks Michael 🥇
api/v6/api.go
Outdated
@@ -28,6 +28,7 @@ const ( | |||
ReadOnlySystem ReadOnlyReason = "System" | |||
ReadOnlyNoRepo ReadOnlyReason = "NoRepo" | |||
ReadOnlyNotReady ReadOnlyReason = "NotReady" | |||
ReadOnlyROMode ReadOnlyReason = "ReadonlyMode" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick:
ReadOnlyROMode ReadOnlyReason = "ReadonlyMode" | |
ReadOnlyROMode ReadOnlyReason = "ReadOnlyMode" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went back and forth on this many times. There are examples of both in the wild. Both are used in the Go codebase about evenly. Both variants will feel weird to different people at for different reasons (if not simultaneously).
@@ -47,7 +50,14 @@ func (d *Daemon) Loop(stop chan struct{}, wg *sync.WaitGroup, logger log.Logger) | |||
syncHead := "" | |||
|
|||
// In-memory sync tag state | |||
lastKnownSyncTag := &lastKnownSyncTag{logger: logger, syncTag: d.GitConfig.SyncTag} | |||
ratchet := &lastKnownSyncState{logger: logger, state: d.SyncState} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for naming
e4b9326
to
71985ca
Compare
This commit factors out the bookkeeping (maintaining a high water mark) for sync, and provides two implementations: - push a git tag to the upstream (the status quo) - make an annotation on the Kubernetes secret used for thhe deploy key The latter is useful if you want to maintain the high water mark without needing to write to the upstream repo. Aside from the state implementations, the main body of changes is daemon (loop) code that refers to the sync tag with code that refers to the sync.State abstraction.
This does the minimum to enable a read-only mode for git, that is: - add the flag to fluxd - pass the flag value along to the repo, so it knows when the repo is read-only (the main effect of this is that it won't try to test that it can write to the upstream after cloning) - do some checks for interaction between the sync state and read-only -- basically, you can't use a git tag for sync state if the repo is read-only.
To ease the way for distinguishing between cases where we want a read-only view of the files, and those where we need a clone to prepare commits in, this commit lines git.Export and git.Checkout up, by embedding the former in the latter. This removes a long-running duplication of a couple of methods. There's also a modest move towards letting manifests be generated from an Export rather than needing a checkout. In theory, all you need are the files, but to get there in practice we have to break down some assumptions (e.g., instead of requiring a git.Checkout, require an interface from which you can get the directories of interest).
Operations that need to write back to the git repo will fail eventually if they try to push changes upstream; but this can look a bit mysterious to the end user, and ought to just be a safety net. This commit adds some mechanisms for failing such operations earlier and more gracefully, when running with readonly mode: - distinguish between clones needed for reads, and clones needed for writes, and for the latter case, - make working (writable) clones of a read-only repo fail (again) - mark all workloads as read-only in ListServices* responses, when the repo is read-only - skip automated image updates, when the repo is read-only
In principle it is only necessary to have a read-only clone of the git repo in order to do a sync. Most of the needed methods are on clone, though, by historical accident. So actually making this the case takes a bit of shuffling.
closes #1139
I am opening this draft PR just to provide some early context for the direction I'm going with enabling a read-only mode for Flux. I have a lot more to say to fill out the discussion of this post, so I'll be updating this description as more findings arise. I am not only interested in talking about high-level architecture of this feature at this stage so please reserve your comments on anything else until later.
Use Cases
I know of a few but haven't gotten them distilled into text quite yet. If you have some in mind please comment below and I will add them here.
Storing Flux's application state elsewhere
The main challenge with this mode is that Flux uses git for storing Flux's application state. These fall down into two main pieces of data: (1) the "sync-tag" and (2) the usage of git notes.
Replacing the sync-tag
This one, at surface level, looks simple enough. The data held by the sync-tag boils down to a string holding the git reference for the commit that flux has currently synced to. This string can be stored in native kubernetes storage such as:
Replacing the git notes
This one is a little trickier. All of the above listed possibilities for storing the sync-tag are potentially options here as well. We can store this in a kubernetes resource so long as it doesn't ever have potential to go over 1mb, the limit imposed by kubernetes versions that Flux currently supports. Currently I am unable to determine what the maximum size that the notes might reasonably reach because when I run
git notes list
on a repository that is using Flux on production, I don't get any results back. if someone can point me in the right direction on this one I would appreciate it.So far as I can tell the notes are used for webhooks and APIs for returning information about the activity of the repository. It is not possible to release this readonly feature without retaining other primary features, in my opinion.
Currently there are errors associated with git tags that I am still working out in the codebase.
Notes
Yes, I did change some variable names and function signatures in the public API that would constitute a breaking change of some sort. Please ignore the implications any such changes until we get a working strategy for handling the architectural problems described above.
I have left some notes to self under
READONLY-NOTE:
-prepended comments in the codebase. These will all be removed before merging.