Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pure SSH LFS support #31516

Open
wants to merge 7 commits into
base: main
Choose a base branch
from
Open

Conversation

ConcurrentCrab
Copy link

@ConcurrentCrab ConcurrentCrab commented Jun 28, 2024

Fixes #17554
/claim #17554

To test, run pushes like: GIT_TRACE=1 git push. The trace output should mention "pure SSH connection".

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Jun 28, 2024
@pull-request-size pull-request-size bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jun 28, 2024
@github-actions github-actions bot added modifies/go Pull requests that update Go code modifies/cli PR changes something on the CLI, i.e. gitea doctor or gitea admin modifies/docs modifies/dependencies labels Jun 28, 2024
@lunny lunny added this to the 1.23.0 milestone Jun 28, 2024
@@ -0,0 +1,8 @@
# LFS SSH transfer protocol server
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just reference but copy?

Copy link
Author

@ConcurrentCrab ConcurrentCrab Jun 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lunny It's ~1000 lines of code from a well-known & robust implementation of the protocol, under an MIT license. I was told in the issue discussion that integrating this external dependency would be fine, so it seemed the obvious choice to make.

Or if you actually meant to ask why I vendored it instead of just importing the module: see commit 8492908, these changes were needed but couldn't have been made upstream. also the last tag on upstream was from ~2 years ago so we'd get an older version of this code without logging etc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regarding copyright headers: acknowledged, will add them in a bit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ConcurrentCrab I've cloned the upstream to https://gitea.com/gitea/git-lfs-transfer. Could you send a PR to it for the changes to the library instead (if they don't get merged upstream)?

We could also tag that repo on a more recent commit so that a go get will give us the expected version.

Thanks again for your work on this <3

Copy link
Author

@ConcurrentCrab ConcurrentCrab Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@techknowlogick the PR I made to charmbracelet (charmbracelet/git-lfs-transfer#47) doesn't need to be merged for this PR. I vendored the relevant parts of the library (namely, the transfer protocol) with the necessary modifications, so we don't have to care whether upstream chooses to merge it or not. I made that PR just because I thought they could use some of those improvements too (and also so our vendored version doesn't diverge too much).

Or did you mean that it would be more appropriate to have a fork of the library, instead of vendoring it? In that case, sure, I'll make the PR there, drop the vendoring commit here, and add our fork as a dependency.

Just wanted to make sure there's no miscommunication regarding this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct, the later of having a fork instead of vendoring it. Thanks for being so understanding :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ConcurrentCrab
Copy link
Author

@lunny added copyright headers.

@ConcurrentCrab
Copy link
Author

Improved error handling.

@ConcurrentCrab
Copy link
Author

Accidentally resolved the conversation about DB connections above, but reiterating:

Pushes and clones work... but there's sometimes (by which I mean decently reproducible) bug where after transferring and verification and everything... the connection just hangs for almost exactly 120 seconds. After which it just ends gracefully and git proceeds with its push.

It's particularly nasty, I have no leads on it yet. The best I can figure out is that it looks like the git-lfs client seems to not send the quit command that it should at the end of the protocol, or that it isn't coming through.

The code here checks if the repo being requested doesn't exist.
If it doesn't, then a write operation might create it. But a read
operation doesn't make any sense, and should error out. So simply
check the access mode.

I assume this was the intent here, but only checked for one "verb"
instead, while there exist other read-only verbs as well. And ofc
more can be introduced in the future ;)

Possibly some write verbs don't make sense as well (presumably
those that only add stuff incrementally to existing repos)?
Coalesce access mode detection into one place.
Yes, "upload" really has opposite semantics for
git commands vs. git-lfs commands. Wow.

This commit makes no functional changes.
this makes no functional changes
Big changes are inevitable due to the difference in Gitea's approach
to storing LFS files. And upstream hasn't tagged for a while anyway.
Make a few changes to make the transfer backend suit Gitea better.

Merge Upload into one method:
  Makes things simpler since out content store verifies size & hashes
  itself

Change Download return into io.ReadCloser:
  Removes dependency on filesystem

Add size: int64 arguments wherever appropriate
  In arguments for Upload and Verify, and in return for Download
@ConcurrentCrab
Copy link
Author

Added locking API support.

Well at least I figured out what's causing the hang. Setting lfs.ssh.automultiplex to false on the client side:
https://github.com/git-lfs/git-lfs/blob/8e36a03d85bf05bbca004dd7e8b55b147809b3e0/docs/man/git-lfs-config.adoc?plain=1#L90
gets rid of the hang. The bad news is that it's true by default.

I think this is feature interacting badly with the SSH server?

I traced the traffic, and indeed, something weird is happening on the client side. Git LFS spawns several workers each making its own connection. In a multiplexed connection, only the "first" spawn of the our commands gets a quit message. The rest of them have to wait 120 seconds for an EOF. I'm almost tempted to believe this is a bug on the git-lfs client side.

Can you check and see if this is reproducible in your environments?

@ConcurrentCrab
Copy link
Author

Yeah, I'm not going crazy. This commit:
git-lfs/git-lfs@44b8801
breaks pure SSH LFS sessions.

And this has history, it isn't even the first time lol:
git-lfs/git-lfs#5537

@ConcurrentCrab
Copy link
Author

just made: git-lfs/git-lfs#5816

@ConcurrentCrab ConcurrentCrab force-pushed the lfs-ssh branch 2 times, most recently from 5fadb5d to de9a3cf Compare July 1, 2024 20:15
Copy link
Member

@lafriks lafriks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As pointed in quite some places already but all should be fixed

modules/lfstransfer/backend/backend.go Outdated Show resolved Hide resolved
modules/lfstransfer/backend/backend.go Outdated Show resolved Hide resolved
modules/lfstransfer/backend/lock.go Outdated Show resolved Hide resolved
modules/lfstransfer/backend/lock.go Outdated Show resolved Hide resolved
@ConcurrentCrab
Copy link
Author

@lafriks adressed the url joins.

Also add handler in runServ()

The protocol lib supports locking but the backend does not,
as neither does Gitea. Support can be added later and the
capability advertised.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. modifies/cli PR changes something on the CLI, i.e. gitea doctor or gitea admin modifies/dependencies modifies/docs modifies/go Pull requests that update Go code size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support LFS purely over SSH protocol
5 participants