Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

schema_registry: Add support for DELETE /config/{subject} #13557

Merged
merged 4 commits into from
Oct 23, 2023

Conversation

oleiman
Copy link
Member

@oleiman oleiman commented Sep 20, 2023

Introduces DELETE /config/{subject} endpoint for Schema Registry.

This makes use of (mostly) existing primitives on the sequenced write path to avoid duplicating critical code. Key observation is that a config write message with k: {subject}, v: std::nullopt results in durably (I think) clearing the compatibility config for that subject.

Fixes #13229

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.2.x
  • v23.1.x
  • v22.3.x

Release Notes

Features

  • Schema Registry: Add DELETE /config/{subject} endpoint

@oleiman oleiman marked this pull request as draft September 20, 2023 21:47
@oleiman oleiman force-pushed the delete-subject-compat branch 3 times, most recently from 4a4e9cd to d10c047 Compare September 21, 2023 06:11
src/v/pandaproxy/schema_registry/handlers.cc Show resolved Hide resolved
src/v/pandaproxy/schema_registry/handlers.cc Show resolved Hide resolved
@@ -118,6 +118,10 @@ class sharded_store {
ss::future<compatibility_level>
get_compatibility(subject sub, default_to_global fallback);

///\brief Check the compatibility level for a subject w/o fallback.
/// If none exists, the underlying store returns subject_not_found
ss::future<compatibility_level> check_compatibility(subject sub);
Copy link
Member

@BenPope BenPope Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this name because it feels a lot like check_compatible. Naming is hard!.

Perhaps the conversion can be done in the handler?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's what I had in my first draft. Less code better code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, perhaps the check could be done within seq_writer::do_delete_config and that method can return a result<optional<bool>>?

Copy link
Contributor

@NyaliaLui NyaliaLui Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, perhaps the check could be done within seq_writer::do_delete_config and that method can return a result<optional>?

Specially since it seems that check_compatible is only used for the delete path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the extraneous function. Since the seq_writer implementations need to return bool, we'll still need to call get_compatibility from the handler (I think) in order to conform to the API. Am I reading that right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see delete_subject_permanent_inner does something similar to what you're describing @NyaliaLui. Makes sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Played around with this a bit, and I think performing the check up in the handler is more or less in line with surrounding code (and less of a pain). Happy to revisit if that interpretation sounds wrong.

}

auto key = config_key{.seq{write_at}, .node{seq._node_id}, .sub{sub}};
auto batch = as_record_batch(key, std::nullopt);
Copy link
Member

@BenPope BenPope Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to build tombstones so that the original config can be compacted, you may take some inspiration from delete_subject_permanent_inner

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. Will put this in a follow on commit. Need a bit of time to grok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would squash that commit with this one

@oleiman oleiman force-pushed the delete-subject-compat branch from d10c047 to a365944 Compare September 21, 2023 17:58
// ensure we see latest writes
co_await rq.service().writer().read_sync();

compatibility_level lvl;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very nice. Normally I'd put a lambda or something around the try/catch, but I don't think that works here because coro. Helper function?

@oleiman oleiman force-pushed the delete-subject-compat branch from a365944 to d972155 Compare September 22, 2023 05:24
@oleiman oleiman marked this pull request as ready for review September 22, 2023 05:26
@oleiman oleiman marked this pull request as draft September 24, 2023 23:44
@oleiman oleiman marked this pull request as ready for review September 25, 2023 15:46
@oleiman oleiman self-assigned this Sep 26, 2023
@oleiman oleiman changed the title Delete subject compat schema_registry: Add support for DELETE /config/{subject} Sep 26, 2023
Copy link
Contributor

@michael-redpanda michael-redpanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, just a couple of comments

src/v/pandaproxy/schema_registry/seq_writer.cc Outdated Show resolved Hide resolved
tests/rptest/tests/schema_registry_test.py Outdated Show resolved Hide resolved
}

auto key = config_key{.seq{write_at}, .node{seq._node_id}, .sub{sub}};
auto batch = as_record_batch(key, std::nullopt);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would squash that commit with this one

@vbotbuildovich
Copy link
Collaborator

Comment on lines 289 to 291
// co_return true;

// TODO(oren): seems like this might not be necessary?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chore: remove commented out code and is this a TODO where we need to file a ticket?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, meant to transform this into a PR comment. This is an open question from my perspective, maybe for @BenPope? After we've tombstoned the history of this subject's config, do we still need to finally write an empty record at the front?

Copy link
Contributor

@michael-redpanda michael-redpanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

NyaliaLui
NyaliaLui previously approved these changes Sep 29, 2023
Copy link
Contributor

@NyaliaLui NyaliaLui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@oleiman
Copy link
Member Author

oleiman commented Sep 29, 2023

/ci-repeat 1

Copy link
Member

@BenPope BenPope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. I'm seeing correct tombstoning now.

Comment on lines 506 to 507
return sm.key_type == marker.key_type && sm.seq && marker.seq
&& *sm.seq == *marker.seq;
Copy link
Member

@BenPope BenPope Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this?

        std::erase_if(sub_it->second.written_at, [marker](auto sm) {
            return sm.key_type == marker.key_type && sm.seq == marker.seq;
        });

It's possible to have read a "compatible" config on the topic - the compat format has no seq or node, e.g.:

{
  "topic": "_schemas",
  "key": "{\"keytype\":\"CONFIG\",\"subject\":\"foo\",\"magic\":0}",
  "value": "{\"compatibilityLevel\":\"BACKWARD\"}",
  "timestamp": 1697654666281,
  "partition": 0,
  "offset": 5
}

In which case the config_key::seq will be nullopt, which we want to match.

But why not match on node as well?

        std::erase_if(sub_it->second.written_at, [marker](auto sm) {
            return sm.key_type == marker.key_type && sm.seq == marker.seq
                   && sm.node == marker.node;
        });

But of course, I'm not opposed to adding seq_marker::operator== (with a comment that matching nullopts are intentional due to the "compatible" records not having seq or node)

model::record_batch_type::raft_data, model::offset{0}};

std::vector<config_key> keys;
for (auto s : sequences) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I see what you mean. This change (current branch HEAD) seems to fix it if somewhat naively. Look reasonable to you?

It's gone, but if it's the last commit, it looks good.

@BenPope
Copy link
Member

BenPope commented Oct 18, 2023

since this is a feature, why are we backporting it?

Last time I added one of these the backport justification was "so console can have uniform access to it right away". That was a GET endpoint though with a clean cherry pick. Worth discussing in this case for sure.

This is also a request from Console team. It's a cross between a feature and a bugfix, since deleting a config is a necessary operation. I often backport this type of fix, since the change almost never touches existing code, so the chance of a regression is minimal.

@oleiman oleiman force-pushed the delete-subject-compat branch from 82f4010 to 6b8f645 Compare October 19, 2023 02:40
@oleiman oleiman requested a review from BenPope October 19, 2023 02:43
Comment on lines 318 to 321
bool operator==(const seq_marker& other) const {
return key_type == other.key_type && seq == other.seq
&& node == other.node;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the general case for operator== should include the version

Suggested change
bool operator==(const seq_marker& other) const {
return key_type == other.key_type && seq == other.seq
&& node == other.node;
}
friend bool operator==(const seq_marker&, const seq_marker&) = default;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😍

Comment on lines 501 to 503
std::erase_if(sub_it->second.written_at, [marker](const auto& sm) {
return marker == sm;
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::erase_if(sub_it->second.written_at, [marker](const auto& sm) {
return marker == sm;
});
std::erase(sub_it->second.written_at, marker);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fancy 👍

@oleiman oleiman force-pushed the delete-subject-compat branch 2 times, most recently from 55451af to bebdc51 Compare October 23, 2023 06:03
@oleiman
Copy link
Member Author

oleiman commented Oct 23, 2023

force push (add a comment to seq_marker::operator==

Adds `seq_writer::delete_config`, which deletes the stored compatibility
level for a given subject by producing an empty-valued config write.

Also adds `sharded_store::get_subject_config_written_at` which returns
the sequence number history of a subject's config. This differs from
`get_subject_written_at` in that the sequence markers returned cannot
be used to perm delete the subject itself, only the config. Therefore,
there is no need to soft-delete a subject before fetching this info.
Adds a handler for the delete endpoint.

To match the behavior of the reference implementation, this endpoint:

- Returns the compatibility level of the subject before deletion
- OR subject_not_found if no compatibility level was previously set
@oleiman oleiman force-pushed the delete-subject-compat branch from bebdc51 to df1a49d Compare October 23, 2023 16:15
@vbotbuildovich
Copy link
Collaborator

new failures detected in https://buildkite.com/redpanda/redpanda/builds/39608#018b5d95-0d4e-4cf6-b1f6-dc9d8f8299c4: "rptest.tests.follower_fetching_test.FollowerFetchingTest.test_basic_follower_fetching.read_from_object_store=False"

@oleiman oleiman requested a review from BenPope October 23, 2023 19:34
@oleiman
Copy link
Member Author

oleiman commented Oct 23, 2023

CI Failures:

Copy link
Member

@BenPope BenPope left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth backporting this

@oleiman oleiman merged commit 4600a7a into redpanda-data:dev Oct 23, 2023
14 checks passed
@vbotbuildovich
Copy link
Collaborator

/backport v23.2.x

@vbotbuildovich
Copy link
Collaborator

/backport v23.1.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for deleting subject compatibility in schema registry
6 participants