SOLR-16390: v2 Cluster Property APIs. #2788

Merged

@cugarte cugarte commented Oct 21, 2024

https://issues.apache.org/jira/browse/SOLR-16390

Description

v2 (JAX-RS) endpoints for Cluster Property operations.

Solution

This commit adds the following v2 JAX-RS Cluster Property APIs:

| Colloquial Name | v2 JAX-RS API | Notes |
| --- | --- | --- |
| List ClusterProps | GET /api/cluster/properties | New, no v1 or v2 counterparts |
| Create/Update Single ClusterProp | PUT /api/cluster/properties/propName {"value": "propVal"} | |
| Create/Update Nested ClusterProp | PUT /api/cluster/properties {...} | |
| Fetch Single ClusterProp | GET /api/cluster/properties/propName | New, no v1 or v2 counterparts |
| Delete ClusterProp | DELETE /api/cluster/properties/propName | |
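
As a rough illustration of the endpoints listed above, the following sketch builds (but does not send) one request per endpoint using the JDK's java.net.http API. The base URL http://localhost:8983, the class name ClusterPropRequests, and its helper method names are assumptions made for this example; only the paths, HTTP methods, and the {"value": ...} body shape come from the PR description.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.net.http.HttpRequest.BodyPublishers;

// Hypothetical helper class; not part of Solr or SolrJ.
class ClusterPropRequests {
  // Assumed address of a locally running Solr node.
  static final String BASE = "http://localhost:8983/api/cluster/properties";

  // List ClusterProps
  static HttpRequest list() {
    return HttpRequest.newBuilder(URI.create(BASE)).GET().build();
  }

  // Fetch Single ClusterProp
  static HttpRequest fetch(String name) {
    return HttpRequest.newBuilder(URI.create(BASE + "/" + name)).GET().build();
  }

  // Create/Update Single ClusterProp; jsonValue must already be valid JSON, e.g. "\"https\"".
  static HttpRequest setSingle(String name, String jsonValue) {
    return HttpRequest.newBuilder(URI.create(BASE + "/" + name))
        .header("Content-Type", "application/json")
        .PUT(BodyPublishers.ofString("{\"value\": " + jsonValue + "}"))
        .build();
  }

  // Create/Update Nested ClusterProp; jsonBody is the full JSON object to send.
  static HttpRequest setNested(String jsonBody) {
    return HttpRequest.newBuilder(URI.create(BASE))
        .header("Content-Type", "application/json")
        .PUT(BodyPublishers.ofString(jsonBody))
        .build();
  }

  // Delete ClusterProp
  static HttpRequest delete(String name) {
    return HttpRequest.newBuilder(URI.create(BASE + "/" + name)).DELETE().build();
  }

  public static void main(String[] args) {
    // Print each request's method and URI; nothing is sent over the network.
    for (HttpRequest r : new HttpRequest[] {
        list(), fetch("urlScheme"), setSingle("urlScheme", "\"https\""), delete("urlScheme")}) {
      System.out.println(r.method() + " " + r.uri());
    }
  }
}
```

The requests could be sent with java.net.http.HttpClient; property names containing characters that are not URL-safe would need encoding first, which this sketch does not do.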

Tests

The existing Cluster Property tests cover the "Create/Update Single ClusterProp", "Create/Update Nested ClusterProp" and "Delete ClusterProp" APIs. I added unit tests for the new APIs (ClusterPropsAPITest).

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

cugarte commented Oct 21, 2024

List ClusterProps sample response:

{
  "responseHeader": {
    "status": 0,
    "QTime": 14
  },
  "clusterProperties": [
    "urlScheme",
    "defaults"
  ]
}

Fetch Single ClusterProp sample responses:

{
  "responseHeader": {
    "status": 0,
    "QTime": 5
  },
  "clusterProperty": {
    "name": "urlScheme",
    "value": "v1-value"
  }
}
{
  "responseHeader": {
    "status": 0,
    "QTime": 5
  },
  "clusterProperty": {
    "name": "defaults",
    "value": {
      "collection": {
        "numShards": 2,
        "nrtReplicas": 1,
        "tlogReplicas": 1,
        "pullReplicas": 1
      }
    }
  }
}
{
  "responseHeader": {
    "status": 400,
    "QTime": 26
  },
  "error": {
    "metadata": {
      "error-class": "org.apache.solr.common.SolrException",
      "root-error-class": "org.apache.solr.common.SolrException"
    },
    "msg": "No such cluster property [doesNotExist]",
    "code": 400
  }
}

Wasn't sure how to model the response. Different APIs use different error codes for "does not exist" responses: /api/collections/collectionName returns a 400, /api/aliases/specificalias returns a 405, /solr/collectionName/schema/fields/fieldName returns a 404.

cugarte commented Oct 21, 2024

  1. Leaving this PR in Draft mode to get feedback on APIs/responses and so I can provide the missing unit tests and update documentation.
  2. Each API consists of two separate classes in solr/api/.../endpoint and solr/core/.../handler/admin/api (with some supporting POJO classes in solr/api/.../model). I wasn't sure if that would be preferred over grouping them all together as was done with AliasPropertyApis/AliasProperty.
  3. The new v2 JAX-RS Bulk Update ClusterProp API requires providing a body that looks like {"properties":{"actualPropertyToBeUpdated":...}} because I didn't know how to map an unknown top-level value to data in the model (SetNestedClusterPropertyRequestBody).
  4. The Bulk Update ClusterProp APIs (old and new) allow setting "invalid" properties (i.e. ones that cannot be deleted by the old APIs). Should this be allowed? If so, perhaps delete should be changed so that it does not check that these names are on the allowed list.

Known TODOs:

  1. Add test code for List and Fetch ClusterProps. Done
  2. Update documentation and CHANGES.txt.

@github-actions github-actions bot added the tests label Oct 22, 2024
@cugarte cugarte marked this pull request as ready for review October 23, 2024 13:29
@gerlowskija

Still going through the individual files on this PR, but wanted to respond to some of the high-level comments first:

Wasn't sure how to model the response. Different APIs use different error codes for "does not exist" responses: /api/collections/collectionName returns a 400, /api/aliases/specificalias returns a 405, /solr/collectionName/schema/fields/fieldName returns a 404

I'm personally a fan of 404 in this case as it seems a little more actionable for users than the more generic '400', so that'd be my preference. But I don't have any strong feelings on that point, and would be open to something else if you do have preferences? When we decide, we should document the decision in dev-docs/v2-api-conventions.adoc so there's a "standard" we can align on.

(I suspect the 405 returned by GET /api/aliases/nonexistentAlias is a bug, FWIW. Will have to file a ticket for that if I can reproduce...)
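
To illustrate why a 404 can be more actionable for callers than a generic 400, here is a minimal client-side sketch, assuming the 404 convention is the one adopted. The class NotFoundHandling and the method interpretFetch are hypothetical names for this example and are not part of Solr or SolrJ.

```java
import java.util.Optional;

// Hypothetical client-side helper; not part of Solr or SolrJ.
class NotFoundHandling {
  /**
   * Maps an HTTP status and raw response body to an optional payload.
   * A 404 is treated as "the property does not exist" rather than as a failure.
   */
  static Optional<String> interpretFetch(int status, String body) {
    if (status == 404) {
      return Optional.empty(); // missing property: a normal, recoverable outcome
    }
    if (status >= 400) {
      // Any other 4xx/5xx is treated as a genuine failure.
      throw new IllegalStateException("Request failed with HTTP " + status + ": " + body);
    }
    return Optional.of(body);
  }

  public static void main(String[] args) {
    System.out.println(interpretFetch(200, "{\"clusterProperty\":{}}").isPresent());
    System.out.println(interpretFetch(404, "{}").isPresent());
  }
}
```

With a 400 for missing properties, the same helper could not tell "no such property" apart from a malformed request without inspecting the error message.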

I wasn't sure if that would be preferred over grouping them all together as was done with AliasPropertyApis/AliasProperty

I prefer grouping related APIs into a single file, at least on the 'api' side. IMO it cuts down on boilerplate, and makes reviewing and browsing easier by keeping a bunch of related definitions together. But again, it's a very slight preference on my end if you happen to prefer the alternative.

The new v2 JAX-RS Bulk Update ClusterProp API requires providing a body that looks like {"properties":{"actualPropertyToBeUpdated":...}} because I didn't know how to map an unknown top-level value

Hmm - I think you should be able to nuke SetNestedClusterPropertyRequestBody altogether, and replace it in the method signature with Map<String, Object>? e.g.

  @PUT
  @Operation(
      summary = "Set nested cluster properties in this Solr cluster",
      tags = {"cluster-properties"})
  SolrJerseyResponse createOrUpdateNestedClusterProperty(
      @RequestBody(description = "Property/ies to be set", required = true)
          Map<String, Object> propertyValuesByName)

Or does that break something or other that I've forgotten about?
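
To make the suggestion concrete: with a Map<String, Object> parameter, the top-level keys of the request body are the property names themselves, so no {"properties": ...} wrapper is needed. The sketch below builds such a body for the "defaults" property from the sample response earlier in this thread and renders it with a deliberately minimal JSON writer. The class NestedClusterPropBody and its methods are hypothetical; a real client would use a proper JSON library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; not part of Solr or SolrJ.
class NestedClusterPropBody {
  // Builds a body whose top-level key ("defaults") is the property name itself.
  static Map<String, Object> defaultsBody() {
    Map<String, Object> collection = new LinkedHashMap<>();
    collection.put("numShards", 2);
    collection.put("nrtReplicas", 1);
    collection.put("tlogReplicas", 1);
    collection.put("pullReplicas", 1);

    Map<String, Object> defaults = new LinkedHashMap<>();
    defaults.put("collection", collection);

    Map<String, Object> body = new LinkedHashMap<>();
    body.put("defaults", defaults);
    return body;
  }

  // Minimal JSON rendering for nested maps, strings, and numbers only.
  // No string escaping is performed; this is for illustration, not production use.
  static String toJson(Object value) {
    if (value instanceof Map) {
      StringBuilder sb = new StringBuilder("{");
      boolean first = true;
      for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
        if (!first) {
          sb.append(",");
        }
        first = false;
        sb.append('"').append(e.getKey()).append("\":").append(toJson(e.getValue()));
      }
      return sb.append("}").toString();
    }
    if (value instanceof String) {
      return "\"" + value + "\"";
    }
    return String.valueOf(value);
  }

  public static void main(String[] args) {
    System.out.println(toJson(defaultsBody()));
  }
}
```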

@gerlowskija gerlowskija left a comment

Thanks for the PR @cugarte!

I left a few inline comments - mostly small fixes like missing @PermissionName annotations. (I've also replied to a few of your questions in a preceding comment, in case you miss it.)

But overall this looks great. Should be pretty close to being merge-ready! Lmk if you have any questions about the feedback!

  @Operation(
      summary = "Set nested cluster properties in this Solr cluster",
      tags = {"cluster-properties"})
  SolrJerseyResponse createOrUpdateNestedClusterProperty(

[Q] It's interesting that this API is both the only way to set "nested"/complex cluster properties, and the only way to set multiple properties simultaneously.

I guess that's fine, since it mirrors what's supported in v1? I don't really have a question or suggestion here, mostly just making a note of it...

@@ -907,6 +912,11 @@ private void loadInternal() {
ClusterAPI clusterAPI = new ClusterAPI(collectionsHandler, configSetsHandler);
registerV2ApiIfEnabled(clusterAPI);
registerV2ApiIfEnabled(clusterAPI.commands);
registerV2ApiIfEnabled(ListClusterProperties.class);

[0] Not a huge deal either way, but it's prob a bit cleaner to register these APIs from within CollectionsHandler, where the v1 logic (such as it is) lives. (See CollectionsHandler.getJerseyResources)

CoreContainer is already pretty bloated so the more API-registration we can defer to individual request handlers, the better IMO.

} catch (Exception e) {
throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Error in API", e);
}
SetNestedClusterPropertyRequestBody setNestedClusterPropertyRequestBody =

[-1] SetNestedClusterPropertyApi/SetNestedClusterProperty provide an improved v2 API for this functionality, so IMO we should delete this method outright, rather than keeping the older v2 API around.

(v2 APIs are "experimental", so there are no backcompat concerns with removing/changing them as necessary)

super(coreContainer.getZkController().getZkClient());
}

@Override

[-1] We want all APIs to have an @PermissionName annotation, so that they can be mapped to the predefined permissions that Solr's RuleBasedAuth plugin knows about. In terms of which permission to use, COLL_READ seems like the best fit at a glance to me. Wdyt?

(Same comment applies to GetClusterProperty, but I think all the others are good!)

try {
  clusterProperties.setClusterProperties(requestBody.properties);
} catch (Exception e) {
  throw new SolrException(SolrException.ErrorCode.SERVER_ERROR, "Error in API", e);

[0] Not the most helpful error message, but I can see it came straight from the pre-existing "set-obj-property" API.

Commenting here as a reminder to myself to fix this and a few other places in a subsequent PR.

}

@Test
public void testClusterPropertyOpsAllGood() throws Exception {

[+1] Great tests - afaict we have very little coverage around cluster-props so this is a huge improvement!

[-0] These tests seem like a good candidate for using the SolrJ classes generated by your API definitions (i.e. o.a.s.client.solrj.request.ClusterPropertiesApi).

Doing so feels a bit circular (i.e. any bugs in the JAX-RS annotations would get propagated to the generated code and might not be caught)...but it'd help the tests to scan a lot nicer IMO, and it'd get us coverage for the SolrJ classes to boot, so it seems like a worthwhile tradeoff IMO. Wdyt?

…erPropertyApi.java

Co-authored-by: Jason Gerlowski <gerlowskija@apache.org>

cugarte commented Nov 1, 2024

Thank you for the detailed feedback, @gerlowskija! I've updated the code to reflect most of the changes you suggested. Specifically,

  • Merged all of the previously new classes in o.a.s.client.api.endpoint into a new class ClusterPropertyApis and all of the previously new classes in o.a.s.handler.admin.api into a new class ClusterProperty (updated to extend/use AdminAPIBase),
  • Removed all of the previously new classes that were merged above,
  • Updated the v1 code in CollectionsHandler to use the new ClusterProperty APIs,
  • Removed the old v2 code in ClusterAPI: both setObjProperty, as you suggested, and setProperty, which I had originally left unchanged (it used the v1 API code path), since both were experimental v2 ClusterProps APIs,
  • Moved registration of the new code from CoreContainer to CollectionsHandler,
  • Updated the v2 JAX-RS Fetch Single ClusterProp API (getClusterProperty) to return a 404 instead of a 400, and
  • Added the missing @PermissionName annotations (now in ClusterProperty).

A couple of code suggestions are outstanding:

[-0] These tests seem like a good candidate for using the SolrJ classes generated by your API definitions (i.e. o.a.s.client.solrj.request.ClusterPropertiesApi).

Doing so feels a bit circular (i.e. any bugs in the JAX-RS annotations would get propagated to the generated code and might not be caught)...but it'd help the tests to scan a lot nicer IMO, and it'd get us coverage for the SolrJ classes to boot, so it seems like a worthwhile tradeoff IMO. Wdyt?

I hadn't thought of this earlier, but think it's a good idea as it would let us test the SolrJ classes. I don't know enough about the code generation or how annotation bugs might lead to masking errors - perhaps it would be helpful to have a specific set of tests that checks an underlying API in multiple ways (using the SolrJ APIs and HttpClient calls or something similar?) but generally let test code just use the SolrJ APIs? In any event, I'm running into a couple of problems here that I think are related to code generation.

Problem 1: I added one new unit test (testClusterPropertyFetchNonExistentPropertySolrJ, left the others unchanged) using a generated SolrJ API. I didn't see any other unit tests that used the other generated classes from o.a.s.client.solrj.request so am guessing as to the proper way to use generated client code. This particular test is meant to return an error, and I believe it triggers an error when the request is handled server-side, but the invocation of the client API (new ClusterPropertiesApi.GetClusterProperty("ext.clusterPropThatDoesNotExist").process(client)) does not throw an exception as I had expected, so the unit test fails. Not sure how to check for errors otherwise.

Hmm - I think you should be able to nuke SetNestedClusterPropertyRequestBody altogether, and replace it in the method signature with Map<String, Object>

Problem 2: when I did this, the generated code below (in o.a.s.client.solrj.request.ClusterPropertiesApi) could not be compiled:

this.requestBody = new Map<String, Object>();

as Map is an interface and cannot be instantiated directly. Replacing SetNestedClusterPropertyRequestBody with HashMap<String, Object> yields the exact same issue (the generated code tries to instantiate a Map). I left this part unchanged.
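
The compile failure described above comes from instantiating the Map type directly. The sketch below shows the general Java pattern of declaring a field against the interface while constructing a concrete implementation. The class GeneratedRequestSketch is hypothetical and is not the actual generated SolrJ code, nor necessarily what a template fix would produce.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a generated request class; not actual SolrJ output.
class GeneratedRequestSketch {
  // Declared against the interface, as a Map<String, Object> parameter would imply.
  private final Map<String, Object> requestBody;

  GeneratedRequestSketch() {
    // `new Map<String, Object>()` does not compile because Map is an interface.
    // A concrete implementation has to be chosen instead.
    this.requestBody = new HashMap<>();
  }

  Map<String, Object> body() {
    return requestBody;
  }

  public static void main(String[] args) {
    GeneratedRequestSketch request = new GeneratedRequestSketch();
    request.body().put("urlScheme", "https");
    System.out.println(request.body());
  }
}
```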

Question 1: while I also prefer having related calls be in the same class, this has resulted in two similarly named classes that could be confusing: o.a.s.handler.admin.api.ClusterProperty and org.apache.solr.common.cloud.ClusterProperties. The latter implements the interaction with ZooKeeper. Is this ok, or is there a convention I should follow?

TODOs:

  • Updating dev-docs/v2-api-conventions.adoc with the convention on how v2 APIs should return not found errors.
  • Updating documentation on the v2 APIs.

cugarte commented Nov 1, 2024

Just saw your changes to solr/solrj/src/resources/java-template/api.mustache in https://github.com/apache/solr/pull/1993/files#diff-d053f95e7416908019cf4f7f5e8ede80cc24e3ed8eed2a72d078b75406251c8c . I applied the same change, which fixed Problem 2 I noted above. The most recent commit makes that change, updates a unit test accordingly and nukes the unnecessary model.

I gather the correct changes to api.mustache would make the generated code throw an exception in the right case (Problem 1 above). Will see if I can come up with something that works.

@gerlowskija

I added one new unit test [...] using a generated SolrJ API [...] but the invocation of the client API (new ClusterPropertiesApi.GetClusterProperty("ext.clusterPropThatDoesNotExist").process(client)) does not throw an exception as I had expected, so the unit test fails. [...] I gather the correct changes to api.mustache would make the generated code throw an exception in the right case.

*sigh* No, it's not a template problem - I think you've uncovered a design implication I hadn't realized up until now.

In short - SolrClients typically parse the response into a "NamedList" that is then immediately inspected to see whether there were any server errors that should result in a client-side exception. But for these generated v2 classes, the response-parsing happens much much later - nothing gets parsed until the caller has already received a SolrResponse instance and invokes a special getParsed() method. This unintentionally side-steps SolrClient's response-inspection and exception-throwing logic entirely, without giving it any sort of replacement.
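
The difference between the two flows can be modelled in a few lines of plain Java. This is a simplified sketch, not SolrJ code: LazyParsingSketch, toyParse, eagerRequest, lazyRequest, LazyResponse, and ServerException are all hypothetical names, and the "parser" is a toy that only looks for an "error" key.

```java
import java.util.Map;

// Simplified model of eager vs. deferred response parsing; not actual SolrJ code.
class LazyParsingSketch {
  static class ServerException extends RuntimeException {
    ServerException(String message) {
      super(message);
    }
  }

  // Toy stand-in for real response parsing: flags any body mentioning an "error" key.
  static Map<String, Object> toyParse(String raw) {
    if (raw.contains("\"error\"")) {
      return Map.of("error", raw);
    }
    return Map.of("status", 0);
  }

  // Eager flow: parse right away, then inspect the result and throw on server errors.
  static Map<String, Object> eagerRequest(String raw) {
    Map<String, Object> parsed = toyParse(raw);
    if (parsed.containsKey("error")) {
      throw new ServerException("Server reported an error: " + parsed.get("error"));
    }
    return parsed;
  }

  // Deferred flow: the raw body is only wrapped; nothing is parsed or inspected yet.
  static class LazyResponse {
    private final String raw;

    LazyResponse(String raw) {
      this.raw = raw;
    }

    // Parsing happens here, after the caller already holds the response object,
    // so the error check in eagerRequest never gets a chance to run.
    Map<String, Object> getParsed() {
      return toyParse(raw);
    }
  }

  static LazyResponse lazyRequest(String raw) {
    return new LazyResponse(raw);
  }

  public static void main(String[] args) {
    String errorBody = "{\"error\":{\"code\":404}}";
    // The deferred flow returns normally even though the body describes an error.
    System.out.println(lazyRequest(errorBody).getParsed().containsKey("error"));
    try {
      eagerRequest(errorBody);
    } catch (ServerException e) {
      System.out.println("eager flow threw: " + e.getMessage());
    }
  }
}
```

In this model, a test that expects an exception from the deferred flow fails in the same way as the test described above, because no code path ever inspects the parsed result for errors.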

It's fixable, but definitely beyond the scope of what we'd want to bring into this PR. I've created a separate JIRA ticket to discuss more: https://issues.apache.org/jira/browse/SOLR-17549

In the meantime - forget I said anything about having tests that use the generated SolrJ objects. I'd love to backfill those tests later once we've worked out some of the kinks in how the generated classes need to work, but that shouldn't block this PR.

@gerlowskija

The discussion here has gotten a bit long, but to summarize my understanding: "Problem 1" and "Problem 2" are both resolved, the first by virtue of being punted to SOLR-17549 and the second by a minor 'api.mustache' template fix. I think that only leaves the two more minor "TODOs" Carlos mentioned above:

  • Updating dev-docs/v2-api-conventions.adoc with the convention on how v2 APIs should return not found errors.
  • Updating documentation on the v2 APIs.

Lmk if that's right!

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Nov 13, 2024

cugarte commented Nov 13, 2024

Thanks @gerlowskija ! The punting of "Problem 1" does resolve all of the outstanding issues, and the commit I just pushed added the two missing bits of documentation that you noted.

@gerlowskija

Alright, appreciate the latest round of docs, and with those latest changes this now LGTM. I'll leave a few days for other feedback to trickle in but will aim to merge towards the end of the week otherwise. Thanks Carlos!

@gerlowskija

Alright, gearing up to merge this now. I've added a CHANGES.txt entry for this - ended up being pretty chunky since this PR both makes a few existing APIs easier to grok, as well as adding a few new ones entirely. Very cool!

One final note - a few files (ClusterPropertyApis.java and ClusterProperty.java) were missing license headers that the ASF requires in every file, so I've added those in.

Thanks again Carlos!

@gerlowskija gerlowskija merged commit af26a5d into apache:main Nov 21, 2024
4 checks passed
gerlowskija added a commit that referenced this pull request Nov 21, 2024
This commit changes several v2 "clusterprop" APIs to be
more in line with the REST-ful design we're targeting for Solr's
v2 APIs.

It also adds new v2 clusterprop APIs for listing-all and fetching-single clusterprops.

---------

Co-authored-by: Jason Gerlowski <gerlowskija@apache.org>

cugarte commented Nov 21, 2024

Thanks, Jason! And apologies for the broken tests that you had to fix.

@cugarte cugarte deleted the SOLR-16390-v2-deleteclusterprop-ClusterAPI branch November 21, 2024 21:29
@iamsanjay

Test Failing!

./gradlew :solr:core:test --tests "org.apache.solr.handler.admin.api.ClusterPropsAPITest.testClusterPropertyOpsAllGood" -Ptests.jvms=96 "-Ptests.jvmargs=-XX:TieredStopAtLevel=1 -XX:+UseParallelGC -XX:ActiveProcessorCount=1 -XX:ReservedCodeCacheSize=120m" -Ptests.seed=B00D9B01404FDAFB -Ptests.timeoutSuite=600000! -Ptests.file.encoding=US-ASCII

Labels
cat:api cat:cloud client:solrj documentation Improvements or additions to documentation tests

3 participants