Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: add 'Has' feature. #276

Merged
merged 2 commits into from
Oct 25, 2021
Merged

storage: add 'Has' feature. #276

merged 2 commits into from
Oct 25, 2021

Conversation

warpfork
Copy link
Collaborator

It's what you'd expect. A method that returns a bool.

I debated for a while if there's any way to avoid doing interfaces that refer to other interfaces here, because I usually find that's a smell, but any idea I can think of to try to avoid it seems worse (e.g. would threaten to spawn a 'Has(R)' and a 'Has2(W)' function on the package scope, due to golang's lack of polymorphic parameters, etc). If anyone has any better ideas about how to arrange this, I'm all ears.

I debated for a while if there's any way to avoid doing interfaces that
refer to other interfaces here, because I usually find that's a smell,
but any idea I can think of to try to avoid it seems worse (e.g.
would threaten to spawn a 'Has(R)' *and* a 'Has2(W)' function on the
package scope, due to golang's lack of polymorphic parameters, etc).
@@ -7,11 +7,17 @@ import (

// --- basics --->

type Storage interface {
Has(ctx context.Context, key string) (bool, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for putting Has here instead of ReadableStorage? Write-only storages will have to figure out how to satisfy this method.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly the fear of ending up with a Has2 function on the package scope. Avoiding that problem seems to mandate we have some interface on the bottom that's in common to both the base read and the base write interface.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Has2 function you posit seems to require a readable storage anyway. I don't really see a strong need for a predefined Storage interfaces. It seems more useful for the consumer of a storage to define what methods it needs satisfied by a provider that is passed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to agree with Ian. I don't really understand why we want to require WritableStorage to implement Has, but not Get.

Copy link
Collaborator Author

@warpfork warpfork Oct 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recap: The holistic design goal I'm keeping aim on is: to consistently tell both users and intermediate API designers that they should be referring to either ReadableStorage or WritableStorage. And nothing else -- because then they should use the package-scope functions -- not the methods -- in order to do all their work (and let those functions take care of all the feature detection for them).

Now, Has is an operation that a user might want to ask for on either one of them.

(And surprisingly, I'd say one is almost more likely to want to ask Has on the writable side. To ask Has on the reader might be slightly cheaper than opening the read stream for some implementations; but to ask Has before writing can be significantly cheaper than opening a write stream for some implementations. But I digress; let's suffice to say we want the operation available on either direction.)

Now if I want to make a package-scope function for Has, just to keep 100% consistent with the messaging to the API user about "use the package-scope functions for all operations"... it turns out to be the one thing so far that's legitimately in common to both the read and the write directions.

If golang had parametric polymorphism, I'd write two functions:

func Has(Context, ReadableStorage, Key) (bool, error) {...}
func Has(Context, WritableStorage, Key) (bool, error) {...}

... but we do not have parametric polymorphism, so this is not an option.

Then our other options are:

  • Give up on having a package-scope Has function, and document the inconsistency of the "use the package-scope functions for all operations" rule for users.
  • Probably still have Has as methods on both anyway?
  • Or if not, and just putting Has on ReadableStorage only, then... we make people feature-detect Has on the writer side, when that's wanted?
  • Or we throw in the towel on separating reader and writer directions entirely? But I don't care for this at all -- this split has felt really good so far. It's like a soft form of "capabilities"-style APIs that keeps you from making mistakes in wiring, and it's also really great to be able to implement a read-only system and express that clearly at compile time. So surely we don't want to give all that up.

All of those choices sound worse to me than introducing an interface that's at the bottom of the type graph here.

Introducing an interface type that's at the bottom of the type graph gives us what we need to make the package API consistent and low-friction. Coincidentally, Has fits really nicely there.

(And in the future: if we find more features that might work on both the read or the write direction (and presumably, are optional extensions), we make those package-scope functions of the style func Foo(Context, Storage, ...) (...) too, reusing the same bottom type and doing feature-detection internally from there, to minimize burden to the caller.)

Copy link
Collaborator Author

@warpfork warpfork Oct 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This is sort of like the inverse of #277 . It makes the most sense if you look at how many casts show up when you actually try to use the API.)

@iand
Copy link
Contributor

iand commented Oct 24, 2021

Have you considered func Has(Context, interface{Has(Key) bool}, Key) (bool, error) {...} so it can be used with anything that has a Has method?

@warpfork
Copy link
Collaborator Author

warpfork commented Oct 24, 2021

That's functionally the same but harder to read.

Again, the key consideration here is how this is called.

Because it's hard to see in the textual diff of this PR itself, I made a playground snippet to help demonstrate this: https://play.golang.org/p/hppG2taSAlQ

The end result of someone following the documentation about "APIs should be designed around accepting ReadableStorage or WritableStorage", if we don't have a base interface that both of them include, is (quote from the playground output, which has the complete setup):

cannot use r (type ReadableStorage) as type interface { Has(string) bool } in argument to Has:
	ReadableStorage does not implement interface { Has(string) bool } (missing Has method)
cannot use w (type WritableStorage) as type interface { Has(string) bool } in argument to Has:
	WritableStorage does not implement interface { Has(string) bool } (missing Has method)

So this is all about avoiding cast friction, at the same time as giving clear direction to implementers, on how to fit things together.

We need a "bottom" type in the family to avoid this. And it just so happens that Has is a reasonable method to put in it.

@warpfork
Copy link
Collaborator Author

warpfork commented Oct 24, 2021

In case this helps clarify where I'm coming from: If I could shift Has back up to something less required and more feature-detectable, I'd probably consider that desirable, as a baseline, fwiw.

It's just that: that doesn't really work, because we'd still need the "bottom" type in the family, in order to make the feature-detection-for-you function at package scope be definable without cast friction that's shifted onto the caller. And then... what would that type be, if it didn't contain Has? We'd have... an empty interface for it, which is generally a significant footgun.

Sometimes I use non-functional marker interfaces for situations like this, e.g. type Storage interface { YesHelloImStorageIPromise() }. But here, that seems... strange. It would would work; it would compile; but it feels like overkill, I feel like it would make implementers look at me funny, and it doesn't feel like it's buying much gain.

So: yeah, I thought about a lot of things to try to make Has less special, and I appreciate the reviews that are expressing the same desire. It's just that all other approaches I can think of seem worse :)

@warpfork
Copy link
Collaborator Author

Also, though I did just confess an initial desire to make Has optional and feature-detectable, that's also held up less and less the longer I've thought about it. The ROI on that is just... awful.

  • Has is sufficient frequently wanted that making it less consistently present isn't making any user's lives better; and is possibly making them worse. (In ROI terms: low or negative "R".)
  • Has is sufficiently simple to build in almost all cases I can imagine, that making it optional has very low cost reduction impact for implementors. (In ROI terms: low "R".)
  • The amount of strange twists we'd do as a library and API to make Has optional is... per above posts, let's just call it "high I", in ROI terms.
  • Probably most critically, Has cannot be backfilled from both of the other two base interfaces. And definitely not cheaply. (In ROI terms: drives "R" extremely to the negative.)
    • For ReadableStorage, you can backfill Has behavior without a Has method by just doing the full read, and returning whether or not the error was a "not found" error.
      • ... but that's made you pay the full read cost, already. That's a potentially sizable "ouch" if you didn't already mean to -- so the feature-detection approach may be harmful here: feature-detection shouldn't generally try to paper over a categorically different performance cost or it leads to ecosystemic nastiness. So it's an option, technically: but it's a bad one.
      • (We'll leave aside for a moment that this would also requires standardizing error codes in a way that golang is notoriously bad at offering tooling to enforce, and thus this would also probably make for a rather more fragile interface.)
    • For WritableStorage, you could try to backfill Has behavior without a Has method by... what, exactly?
      • The package-scope Has function wouldn't take a body parameter, because that would be clearly silly, so we can't do a dummy write and check for an "already exists" error.
      • ... or we do a dummy write with an empty body, just to check for the "already exists" error? No, that's clearly wrong, and deeply into no-go territory.
      • There's nothing. There's no recourse here.
      • So then we'd have a package scope Has method that... sometimes returns a "feature not supported" error? Nope. I'm unwilling to accept that. One of the joys of splitting the read and the write interfaces is getting away from rampant "feature not supported" errors. I'd like to keep to that. Especially for something so very basic.

tl;dr turns out life gets a lot worse if Has is not consistently available.

@iand
Copy link
Contributor

iand commented Oct 24, 2021

Your playground example was incomplete since it didn't include concrete implementations. It seems to work just fine when added: https://play.golang.org/p/i1wY0CkwSTG

@warpfork
Copy link
Collaborator Author

warpfork commented Oct 24, 2021

... I mean, yeah, sure, it works fine if you use concrete implementations, and not interfaces. But not solving anything; that's just avoiding engaging with the interfaces entirely. We can't actually write much useful code that way. We start using the interfaces almost immediately, even one step away, within go-ipld-prime; and I intend to use them in other parts of go-ipldtool immediately, as well, which definitely won't be able to accomplish anything with concrete types and type switches.

Making the interfaces work is the name of the game.

Maybe it's not clear yet from the amount of code pushed implementing it in this single repo so far, but I aim for there to be a lot more than one concrete implementation of this, nor just one for read and one for write. Think like... four, next week, minimum. (They're sitting in a stack in my working tree, waiting on this, because I hate chaining PRs :))

@warpfork
Copy link
Collaborator Author

I take this discussion as a big cue that more docs were needed in the diff itself, so I've now increased the volume of those substantially. :)

@warpfork
Copy link
Collaborator Author

I'm going to tentatively carry on with this. There's a lot of demos I'd like to finish soon, and having some solution to this blocks a lot of things. I've also just encountered another third-party downstream repo that will immediately need something about Has in order to be able to use go-ipld-prime, and I want to pursue that without delay as well.

We can revisit this API area again in the future if we accumulate experiences that indicate something went askew here. (Especially if we end up loosening constraints -- that's always an easy migration, so we can not sweat it in advance.)

@warpfork warpfork merged commit b870be9 into master Oct 25, 2021
@warpfork warpfork deleted the storage-api-has branch October 25, 2021 17:58
@aschmahmann aschmahmann mentioned this pull request Dec 1, 2021
80 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants