Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pluggable blob storage for pkg/registry #1209

Merged
merged 7 commits into from
Dec 23, 2021

Conversation

imjasonh
Copy link
Collaborator

@imjasonh imjasonh commented Dec 16, 2021

Features:

  • supports redirects if the blob handler hosts blobs elsewhere.
  • supports automatically verifying digests/sizes if possible, transparent to the blob handler (if they want it to be).
  • repo is passed to blobHandler methods as string and not name.Repository, but they should be easily parsed by implementations if they want that.

This is still hidden in unexported interfaces and always uses memHandler for now, while we iterate on the shape we want these things to take.

Longer term, we should also extend this to manifest and upload storage, using similar interfaces.

Features:
- supports redirects if the blob handler hosts blobs elsewhere.
- supports automatically verifying digests/sizes if possible,
  transparent to the blob handler (if they want it to be).
- repo strings passed to blobHandler methods aren't name.Repository, but
  should be easily parsed by implementations if they want that.

This is still hidden in unexported interfaces and always uses memHandler
for now, while we iterate on the shape we want these things to take.

Longer term, we should also extend this to manifest and upload storage,
using similar interfaces.
@codecov-commenter
Copy link

codecov-commenter commented Dec 16, 2021

Codecov Report

Merging #1209 (7d7f0c4) into main (d1271fe) will decrease coverage by 0.78%.
The diff coverage is 48.12%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1209      +/-   ##
==========================================
- Coverage   74.55%   73.76%   -0.79%     
==========================================
  Files         111      111              
  Lines        8048     8181     +133     
==========================================
+ Hits         6000     6035      +35     
- Misses       1464     1549      +85     
- Partials      584      597      +13     
Impacted Files Coverage Δ
pkg/registry/error.go 75.00% <0.00%> (-25.00%) ⬇️
pkg/registry/blobs.go 63.81% <47.61%> (-24.71%) ⬇️
internal/verify/verify.go 88.46% <72.72%> (-5.02%) ⬇️
pkg/registry/registry.go 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d1271fe...7d7f0c4. Read the comment docs.

pkg/registry/blobs.go Outdated Show resolved Hide resolved
pkg/registry/blobs.go Outdated Show resolved Hide resolved
pkg/registry/blobs.go Outdated Show resolved Hide resolved
Comment on lines +71 to +74
// redirectError represents a signal that the blob handler doesn't have the blob
// contents, but that those contents are at another location which registry
// clients should redirect to.
type redirectError struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kinda regret letting Scott use errors for non-error cases like this in genreconciler, but that ship sailed there. Perhaps we can have the methods return a Result type or something, which can be an error, but don't have to be an error.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I can justify this as a valid error scenario: "oops, I don't have it, but I know who does"

I could collapse this with errNotFound, so the same error type expresses "oops not found[, but I know who does]" if Location is set.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4xx and 5xx are clear error cases, but 3xx feels like a reach.

I'm not gonna die on this hill, it just feels like a funny API.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree, it's weird. Luckily everything's still internal so we can change it to something we like better if we find it.

Comment on lines 51 to 55
// blobHandler represents a blob storage backend.
//
// For all methods, repo is a string that can be passed to
// pkg/name.NewRepository to validate the repository.
type blobHandler interface {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this must be public to be useful.

I think one of the things I realized in my fork was that I wanted the default implementation to be public as well, so that I could compose my own BlobHandler with the default to layer functionality on.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what I have in my fork:

// BlobHandler is the interface for the storage layer underneath this registry.
type BlobHandler interface {
	// Stat returns the size of the blob whose hash is specified,
	// if it exists. If not, it returns (0, error).
	Stat(repo name.Repository, h v1.Hash) (int64, error)

	// Get returns true and a reader for consuming the blob specified with the hash,
	// if it exists.  It now, it returns (nil, error).
	Get(repo name.Repository, h v1.Hash) (io.ReadCloser, error)

	// Store stores the stream of content with the given hash, or returns the error
	// encountered doing so.
	Store(repo name.Repository, h v1.Hash, content io.ReadCloser) error
}

I think plumbing context.Context through makes sense, and am on board with that.

For completeness, this is what I have for manifests:

// ManifestHandler is the interface for the metadata storage layer underneath
// this registry.
type ManifestHandler interface {
	// ListRepositories enumerates up to some count of repositories starting at
	// a specified offset.  This is used to implement the "catalog" handler.
	ListRepositories(offset, count int) ([]name.Repository, error)

	// GetRepository fetches a handler for interacting with a collection of
	// manifests rooted under a particular repository.
	GetRepository(repo name.Repository) (RepositoryHandler, error)
}

// RepositoryHandler is the interface for accessing manifest data under a
// particular repository.
type RepositoryHandler interface {
	// GetDigest fetches the raw bytes of the manifest, and its media type,
	// or returns an error indicating why it cannot.
	GetDigest(v1.Hash) ([]byte, types.MediaType, error)
	// GetTag fetches the raw bytes of the manifest currently labeled by the
	// provided tag, and its media type, or returns an error indicating why
	// it cannot.
	GetTag(string) ([]byte, types.MediaType, error)

	// DeleteDigest removes the manifest with the given hash, it returns an
	// error when the digest does not exist.
	DeleteDigest(v1.Hash) error
	// DeleteTag removes the tag with the given name, the digest is left untouched,
	// it returns an error when the digest does not exist.
	DeleteTag(string) error

	// PutDigest adds the provided manifest content, with the associated media type
	// to the store under the provided hash name.  It returns an error if this cannot
	// be completed.
	PutDigest(v1.Hash, []byte, types.MediaType) error
	// PutTag applies the provided tag to the given hash.  It returns an error if
	// this is not possible.
	PutTag(string, v1.Hash) error

	// ListTags enumerates up to some count of tags starting at
	// a specified offset.  This is used to implement the "tags" handler.
	ListTags(offset, count int) ([]string, error)
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the intention is that we add support for manifests and uploads and then export it when we agree we like the shape. This gives us maximum flexibility to change things willy nilly.

pkg/registry/blobs.go Outdated Show resolved Hide resolved
  - if the provided handler supports Stat, we use it; otherwise fallback to Get
  - if the provided handler doesn't support Put, the registry is read-only
- better registry error handling
  - reused vars (less verbose)
  - renamed errTODO(string) -> regErrInternal(err)
- added response body check to test cases
- checking verify.ReadCloser errors, verification is broken somehow (working on it)
@imjasonh
Copy link
Collaborator Author

imjasonh commented Dec 21, 2021

Friendly ping, I have upload handling in the next PR which I'll send out after this one merges. Sneak preview 👀 (and what's after that 👀 )

pkg/registry/blobs.go Outdated Show resolved Hide resolved
@imjasonh imjasonh merged commit 2874338 into google:main Dec 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants