Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for hidden branches #8375

Merged
merged 5 commits into from
Nov 22, 2024
Merged

Conversation

N-o-Z
Copy link
Member

@N-o-Z N-o-Z commented Nov 15, 2024

Closes #8355

Change Description

Background

This came as a discussion regarding the use of ephemeral branches. For example for the use of transactions (Python SDK)

New Feature

Add support for creating hidden branches

Hidden branches by default not shown on branch listing
Added a new flag for branch listing to show all branches (disabled by default)
Hidden branches can still be fetched using the GetBranch API

Testing Details

Add unit testing for graveler and controller

Breaking Change?

No

@N-o-Z N-o-Z added the include-changelog PR description should be included in next release changelog label Nov 15, 2024
@N-o-Z N-o-Z requested review from arielshaqed and guy-har November 15, 2024 21:43
@N-o-Z N-o-Z self-assigned this Nov 15, 2024
Copy link

github-actions bot commented Nov 15, 2024

♻️ PR Preview 4a48b43 has been successfully destroyed since this PR has been closed.

🤖 By surge-preview

Copy link

E2E Test Results - DynamoDB Local - Local Block Adapter

13 passed

Copy link

E2E Test Results - Quickstart

11 passed

Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thanks! This make isolation on lakeFS so much better!

Small things:

  • I would prefer to mark this "experimental". While I am quite sure that we will want to keep it, I can imagine the API changing.
  • Check annotation in catalog_test.go suggests we need to run gomock or something.
  • Not sure about controller tests.

Larger issue: I would like us to re-think how we handle "special-purpose" metadata.

I imagine we will end up with user metadata for branches. Indeed I can think of several purposes for this already. I think I would prefer to have general metadata for branches -- key-value, essentially. If we had then, we could declare that we continue to own the ::lakefs::* namespace (same as we do on objects) -- and then a branch is hidden if it has ::lakefs::hidden set.

I do like the new interface for ListBranches. If (as an alternative) we implemented ListBranches as "list branches with this general filter language for branch metadata", then hidden branches would not really be hidden unless you asked for that. So I do not want to do that!

pkg/api/controller_test.go Show resolved Hide resolved
Comment on lines 1516 to 1520
if len(results) != 2 {
t.Fatalf("expected 2 branches to return, got %d", len(results))
}
require.Equal(t, results[0].Id, "main2")
require.Equal(t, results[1].Id, "main3")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to compare outputs directly, it makes debugging easier when there are any failures. So in this case I would extract .Ids to a slice, and then use a contents comparison such as deep.Equal (or perhaps testify has a usable slice equality test).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry - but this feels like a complete overkill.
We're comparing strings, why would we need deep equals? Also I don't understand how putting the expected and received in a slice and comparing slices makes it easier to debug?
These tests run in ms, so putting a breakpoint where you need and looking and the variables takes less than a second.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debuggers are the antithesis of testing.
The advantage of comparing slices of strings correctly is that you see entire context.

  • For instance, suppose you receive ["main3"]. You get the useless error message "expected 2 branches to return", and not even the branches you did get.
  • For instance, suppose you receive ["main3", "main2"] (maybe some KV weirdness). You get the useless error message "got main3 expected main2" which still doesn't say what happened.
  • For instance, suppose you receive ["main1", "main2"] (some filtering is broken). You get the useless error message "got main1 expected main2" which still doesn't say what happened.

Tests should fail with a useful error message, if only to help understand how much work is required to fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modified

api/swagger.yml Outdated Show resolved Hide resolved
Co-authored-by: Ariel Shaqed (Scolnicov) <ariels@treeverse.io>
@N-o-Z
Copy link
Member Author

N-o-Z commented Nov 18, 2024

Wow, thanks! This make isolation on lakeFS so much better!

Small things:

  • I would prefer to mark this "experimental". While I am quite sure that we will want to keep it, I can imagine the API changing.
  • Check annotation in catalog_test.go suggests we need to run gomock or something.
  • Not sure about controller tests.

Larger issue: I would like us to re-think how we handle "special-purpose" metadata.

I imagine we will end up with user metadata for branches. Indeed I can think of several purposes for this already. I think I would prefer to have general metadata for branches -- key-value, essentially. If we had then, we could declare that we continue to own the ::lakefs::* namespace (same as we do on objects) -- and then a branch is hidden if it has ::lakefs::hidden set.

I do like the new interface for ListBranches. If (as an alternative) we implemented ListBranches as "list branches with this general filter language for branch metadata", then hidden branches would not really be hidden unless you asked for that. So I do not want to do that!

Thanks for the review.
Not so sure about the metadata approach. User metadata is one thing, but defining branch properties as internal metadata is something I think we should be careful with - it's much more prone to errors IMO.
We might want to also refrain from coupling features which don't necessarily have close relation and especially if they don't exist yet (nor on the roadmap).

@N-o-Z N-o-Z requested a review from arielshaqed November 18, 2024 17:55
Copy link
Contributor

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Asking for better tests but not requiring it for approval.

pkg/api/controller_test.go Show resolved Hide resolved
Comment on lines 1516 to 1520
if len(results) != 2 {
t.Fatalf("expected 2 branches to return, got %d", len(results))
}
require.Equal(t, results[0].Id, "main2")
require.Equal(t, results[1].Id, "main3")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debuggers are the antithesis of testing.
The advantage of comparing slices of strings correctly is that you see entire context.

  • For instance, suppose you receive ["main3"]. You get the useless error message "expected 2 branches to return", and not even the branches you did get.
  • For instance, suppose you receive ["main3", "main2"] (maybe some KV weirdness). You get the useless error message "got main3 expected main2" which still doesn't say what happened.
  • For instance, suppose you receive ["main1", "main2"] (some filtering is broken). You get the useless error message "got main1 expected main2" which still doesn't say what happened.

Tests should fail with a useful error message, if only to help understand how much work is required to fix.

Copy link
Contributor

@itaiad200 itaiad200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting for product feedback

@N-o-Z N-o-Z requested a review from itaiad200 November 22, 2024 03:46
Copy link
Contributor

@itaiad200 itaiad200 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

api/swagger.yml Outdated
schema:
type: boolean
default: false
description: When set - list all branches including hidden branches
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: When set - list all branches including hidden branches
description: When set - list all branches including hidden branches. *EXPERIMENTAL*

@N-o-Z N-o-Z merged commit fb14ffb into master Nov 22, 2024
40 checks passed
@N-o-Z N-o-Z deleted the task/add-hidden-branches-support-8355 branch November 22, 2024 18:38
guy-har added a commit that referenced this pull request Nov 25, 2024
* Add support for hidden branches (#8375)

* Add support for hidden branches

* Update api/swagger.yml

Co-authored-by: Ariel Shaqed (Scolnicov) <ariels@treeverse.io>

* CR Fixes

* CR Fixes 2

* CR Fixes 3

---------

Co-authored-by: Ariel Shaqed (Scolnicov) <ariels@treeverse.io>

* Create infra for plugins

* Create usage example

---------

Co-authored-by: N-o-Z <ozery.nir@gmail.com>
Co-authored-by: Ariel Shaqed (Scolnicov) <ariels@treeverse.io>
Co-authored-by: Itai Gilo <itai.gilo@treeverse.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
include-changelog PR description should be included in next release changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support hidden branches feature
3 participants