Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

patch: first draft. #350

Merged
merged 14 commits into from
Jun 7, 2022
Merged

patch: first draft. #350

merged 14 commits into from
Jun 7, 2022

Conversation

warpfork
Copy link
Collaborator

@warpfork warpfork commented Feb 4, 2022

(Work in progress.)

The JSON Patch specification is pretty decent; this roughly follows it.

Most of the heavy lifting is done by shelling out to the existing
traversal.FocusedTransform functions. We're just gluing that to a more
declarative API, which is more ready to be used via a serial API.

The performance of this system in the face of multiple patch operations
on the same target is probably abysmal. This implementation is an MVP.
It's likely that a smarter implementation could combine more than one
operation application onto the same data build process, and thus
induce a lot fewer copies. This also gets much trickier to implement
(maybe you need some patricia trees for ordering operations? or more
mutable nodes for the duration of the computation, then ideally a way
to freeze them again at the end so we don't break other consistency?),
so it's "future work". The goal of this changeset is probably going to
be limited in scope to getting the declarative API part hammered out.

As of this first commit, there's still several TODOs, regarding validation of
the patch instructions during their parse, generalizing the API to
support more codecs, and testing around upsets and moves and etc.

@BigLep BigLep linked an issue Feb 7, 2022 that may be closed by this pull request
15 tasks
@warpfork
Copy link
Collaborator Author

warpfork commented Mar 6, 2022

This now appears to work with all six operations: add, replace, remove, copy, move, and test. Fixtures exist in the ipld/ipld repo for each.

@warpfork
Copy link
Collaborator Author

warpfork commented Mar 6, 2022

Probable future work:

  • Even more fixtures! Especially those that probe error paths.
  • I'd like to standardize the error codes and how they're serialized.
  • Tests still needed for patches that apply across links. (I'm fairly sure traversal.FocusedTransform, which I built this feature on, already supports this -- but it's not tested at this level.)
    • ... it might be nice to build some more helper features around go-testmark to help manage fixtures that are stored by content addressed hunk names (currently you can of course do this, but manually it would be a bit ishy to write such fixtures).
  • There's still a variety of TODOs scattered in the implementation about better validation and error handling paths in general.
  • The API for taking patch instruction messages (which have a CID to apply against, etc) and operating end-to-end returning a patch result message, etc -- those aren't included here yet. (There's some schema for them, but it's not all wired up.)

I'm not sure if these are worth doing immediately, or if this PR is worth landing as-is and iterating more in future PRs. (Leaning towards the latter, especially because having long-running branches in the ipld/ipld specs&meta repo becomes a bit frictional.)

@warpfork warpfork marked this pull request as ready for review March 6, 2022 10:28
This was referenced Mar 6, 2022
@BigLep BigLep requested a review from mvdan March 8, 2022 15:28
@warpfork
Copy link
Collaborator Author

warpfork commented Mar 8, 2022

Status of this: I think I call this "alpha". The current behavior appears to be correct so far. The tests we have cover key features. I'd still consider it relatively experimental and the odds of bugs relatively high -- "use at your own risk". (I want many more test fixtures probing corner cases, in the long run, before promising complete stability.)

I think we may be able to merge this, with those caveats.

smrz2001 added a commit to smrz2001/go-ipld-prime that referenced this pull request Mar 13, 2022
traversal/patch/patch_test.go Outdated Show resolved Hide resolved
traversal/patch/parse.go Outdated Show resolved Hide resolved
@BigLep
Copy link

BigLep commented Mar 21, 2022

What are the next steps here? Can this be merged after feedback is incorporated?

@RangerMauve
Copy link
Contributor

TODO post merge: Update ipld.io/docs/intro/hello-world.md

@smrz2001 smrz2001 mentioned this pull request May 8, 2022
19 tasks
@RangerMauve RangerMauve self-assigned this May 9, 2022
@RangerMauve
Copy link
Contributor

RangerMauve commented May 12, 2022

Two questions:

  • I'm not seeing the errors for the operations when I try to run the tests locally on my machine. What can I do to debug this? Is it a golang version? I'm on go version go1.17.9 linux/amd64.
go test ./traversal/patch
ok  	github.com/ipld/go-ipld-prime/traversal/patch	(cached)
  • What can I do about the following errors I'm getting when running go test ./...?
--- FAIL: TestRoundtripSchemaSchema (0.00s)
    roundtrip_test.go:23: 
        error:
          got non-nil error
        got:
          e"open ../../.ipld/specs/schemas/schema-schema.ipldsch.json: no such file or directory"
        stack:
          /home/mauve/programming/go-ipld-prime/schema/dmt/roundtrip_test.go:23
            qt.Assert(t, err, qt.IsNil)
        
FAIL
FAIL	github.com/ipld/go-ipld-prime/schema/dmt	0.002s
--- FAIL: TestParseSchemaSchema (0.00s)
    parse_test.go:30: 
        error:
          got non-nil error
        got:
          e"open ../../.ipld/specs/schemas/schema-schema.ipldsch: no such file or directory"
        stack:
          /home/mauve/programming/go-ipld-prime/schema/dsl/parse_test.go:30
            qt.Assert(t, err, qt.IsNil)
        
--- FAIL: TestParse (0.00s)
    parse_test.go:63: 
        error:
          unexpected success
        len(got):
          int(0)
        got:
          []string(nil)
        want length:
          <same as "len(got)">
        stack:
          /home/mauve/programming/go-ipld-prime/schema/dsl/parse_test.go:63
            qt.Assert(t, matches, qt.Not(qt.HasLen), 0)
        
FAIL
FAIL	github.com/ipld/go-ipld-prime/schema/dsl	0.003s

@rvagg
Copy link
Member

rvagg commented May 16, 2022

@RangerMauve we have a git submodule sync problem here; both ipld/ipld and master have moved on from the last commit here.

If you run git submodule update --init --recursive you should reset back to the submodule commit this branch has and you should get the errors CI is showing here.

I think what you probably should do is rebase both ipld/ipld#187 and then here against current master versions and get them both in sync.

Then, if you replace your enum in the schema with:

		type Op enum {
			| add
			| remove
			| replace
			| move
			| copy
			| test
		}

it should work OK, bindnode is using the strings "Op_Add" etc. and they're just getting converted straight to Ops as Op("Op_Add") which is failing in the switch in eval.go.

Also as an aside, now I look at this conversion code in parse.go, I reckon I can make operationRaw go away with #414, since the main problem seems to be the conversion of string to datamodel.Path. So it's nice to see another use-case for doing that to avoid intermediate types.

@rvagg
Copy link
Member

rvagg commented May 16, 2022

Also, the embed trick might be worthwhile here, see: https://github.com/ipfs/go-graphsync/tree/main/message/ipldbind

import _ "embed" then you can //go:embed schema.ipldsch a file into []byte so you can keep the schema file separate and it gets loaded automatically.

The JSON Patch specification is pretty decent; this roughly follows it.

Most of the heavy lifting is done by shelling out to the existing
traversal.FocusedTransform functions.  We're just gluing that to a more
declarative API, which is more ready to be used via a serial API.

The performance of this system in the face of multiple patch operations
on the same target is probably abysmal.  This implementation is an MVP.
It's likely that a smarter implementation could combine more than one
operation application onto the same data build process, and thus
induce a *lot* fewer copies.  This also gets much trickier to implement
(maybe you need some patricia trees for ordering operations?  or more
mutable nodes for the duration of the computation, then ideally a way
to freeze them again at the end so we don't break other consistency?),
so it's "future work".  The goal of this changeset is probably going to
be limited in scope to getting the declarative API part hammered out.

As of this commit, there's still several TODOs, regarding validation of
the patch instructions during their parse, generalizing the API to
support more codecs, and testing around upsets and moves and etc.
I'm thinking of trying to loosely (loosely!) standardize the idea of
some phases for features like this one: parse, compile, eval, report;
these seem to be somewhat recurrent.

(Conventionalizing function names could actually save headaches for
the package author, because it removes some decisions that must be made
but are not interesting.  It can also help users, for obvious reasons.)
@rvagg
Copy link
Member

rvagg commented May 25, 2022

I've cleaned this up a bit:

  • rebased to master
  • fixed the enum
  • extracted the schema to a separate file
  • cleaned up bindnode use a bit
  • added an "EXPERIMENTAL" notice at the top of the package
  • updated Patch spec ipld#187 (rebased to ipld/ipld master and updated submodule commit here); we have a green.

I reckon this might be mergeable along with ipld/ipld#187, what do you reckon @RangerMauve?

@RangerMauve
Copy link
Contributor

Thanks @rvagg! I was away last week so I'm just getting into this now.

I was thinking some link traversal tests would be useful for patch before folks start building on it, but I guess that could be done in a separate PR so we can get this out the door. 🤔

@RangerMauve
Copy link
Contributor

Also, I'm seeing a bunch of warnings saying that there's lines not covered by tests. Is that something we should address or is it expected?

@rvagg
Copy link
Member

rvagg commented Jun 1, 2022

Also, I'm seeing a bunch of warnings saying that there's lines not covered by tests. Is that something we should address or is it expected?

The codecov is new, comes with the rebase to master where we now have the unified-ci workflow! So we get to feel that little bit more guilty about poor test coverage.

The uncovered bits are all for error cases, which are pretty annoying to cover in their entirety; and for a first pass at an experimental feature I don't think it's worth chasing those down here - at least that's not been the past practice in this repo where coverage hasn't been a super high priority. The one case where it's not for errors is the "-" path segment case where there's two paths where that would invoke if it were tested. It's in the comments:

There's also a special case for "-", which means "append to the end of the list".

But no tests for it. We maybe should add one for that since we have two codepaths that are relevant.

@RangerMauve
Copy link
Contributor

Tests pass, merging.

@RangerMauve RangerMauve merged commit a79fa65 into master Jun 7, 2022
@RangerMauve RangerMauve deleted the patch-feature branch June 7, 2022 00:35
@smrz2001 smrz2001 mentioned this pull request Jun 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🎉 Done
Development

Successfully merging this pull request may close these issues.

Patch MVP
5 participants