Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-11149. Have a generic version Validator for validating Requests #6932

Draft
wants to merge 34 commits into
base: master
Choose a base branch
from

Conversation

swamirishi
Copy link
Contributor

@swamirishi swamirishi commented Jul 12, 2024

What changes were proposed in this pull request?

Currently RequestFeatureValidator only has a validator which checks if a certain client version for the OM request made by an older client or if the OM needs to be finalized.
There is a need for a more generic validator which should have a capability to validate the request from the client based on the server's layout version & client's version.
The patch introduces a new annotation RegisterValidator which allows to register a new feature validator for any request. The annotation enforces the validator to have a RequestType, maxVersion, ProcessingPhase in it's definition.
ValidationRegistry should be able to pull the validators registered and inturn should be able to get all the underlying validator methods based on the usage of the registered feature validator. The patch segregates the ClientVersionValidation and FINALIZATION validation into 2 validators named OMClientVersionValidator & OMLayoutVersionValidator respectively which would be responsible for checking the request client version & server side metadata layout version.
The validator has been written in such a way that the same code can be reused for other components (SCM & DN) as well to do their request validations as well.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11149

How was this patch tested?

Existing unit test and additional unit tests

…or to validate clients older than the specified version

Change-Id: Ifcbb239994e7357032934d104ab18063c8d64251
@swamirishi swamirishi requested review from fapifta and errose28 July 12, 2024 04:16
@swamirishi swamirishi marked this pull request as draft July 12, 2024 04:17
@swamirishi
Copy link
Contributor Author

Creating this patch as permanent fix in favour of #6922

@swamirishi
Copy link
Contributor Author

If this approach looks good, I can also go about adding an integration test which tests all Om operations for different client versions.

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @swamirishi for making the long-term fix quickly. :)

Only one comment so far (but it applies to many files).

Comment on lines 463 to 466
conditions = {},
processingPhase = RequestProcessingPhase.PRE_PROCESS,
requestType = Type.CreateBucket
requestType = Type.CreateBucket,
maxClientVersion = ClientVersion.ERASURE_CODING_SUPPORT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add {} as default value here:

and tweak changes in @RequestFeatureValidator usage:

-      conditions = ValidationCondition.OLDER_CLIENT_REQUESTS,
+      maxClientVersion = ClientVersion.ERASURE_CODING_SUPPORT,

to reduce the patch. This changes only 1 line per item, instead of 3 lines + an unchanged line in the middle.

@sadanand48
Copy link
Contributor

Looks like I commented without refreshing the PR, this seems to follow same approach as what is suggested as long term solution. Thanks @swamirishi for the quick fix.

Entry<? extends ExecutableElement, ? extends AnnotationValue> entry) {
if (isPropertyNamedAs(entry, ANNOTATION_MAX_CLIENT_VERSION_PROPERTY_NAME)) {
Name maxClientVersion = visit(entry, new MaxClientVersionVisitor());
return !maxClientVersion.contentEquals(MAX_CLIENT_VERSION_FUTURE_VERSION);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also check for sanity of maxClientVersion if not equal to MAX_CLIENT_VERSION_FUTURE_VERSION i.e whether it lies in the same range as ClientVersion.values()

@swamirishi
Copy link
Contributor Author

Looks like I commented without refreshing the PR, this seems to follow same approach as what is suggested as long term solution. Thanks @swamirishi for the quick fix.

Oh we can't do that since this is part of the static code. The annotation processor also doesn't have any access to the enums. So if we add this check, everytime we change something in the enum,. We would have to make a change. Thus bringing us back to the same problem.

@fapifta
Copy link
Contributor

fapifta commented Jul 12, 2024

Please give me some time to get back to this one next week... I gonna need to digest this a bit more and also I would like to dig out some of my notes on earlier plans about how to make this better. I would not rush to commit this, so if it is required to fix this, let's push #6922 if we have time to work out this solution then please wait for me a bit. Thank you!

Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @swamirishi. While this PR is focused on the issues with the OLDER_CLIENT_REQUEST condition, the CLUSTER_NEEDS_FINALIZATION condition has the same problem. I suggest this:

  • Adding two new keys in the annotation: one for client version and one for layout version. They are exclusive. Only one is allowed to be specified per annotation.
  • Removing the general ValidationCondition from the annotation.
  • Two 3-nested maps in the registry: one for layout feature validators and one for client version validators since the version enums have different types.

Currently there is some iteration required for every request, even if they don't have a validator. With the general concept of validation conditions removed we can do a simple map lookup which will be more efficient in the common case, especially if we order the keys accordingly.

In this layout I think request version -> request type -> request phase -> validator is the most efficient layout of the map. Most clients will probably be newer versions and can fail out of the check in the first level. After that, most requests will not have compat concerns for a given version, so even more requests exit early. The phase is hardcoded depending on which method is called so there is no option to exit early here, hence it should be last.

This is less general than the validation condition support we have now, but IMO the general implementation is causing code complexity and inefficiencies on the write path for something we do not use and may never need. This also makes it consistent with other upgrade related annotations like DisallowedUntilLayoutVersion which only take a version and act on it with a fixed condition.

EDIT: I just remembered client version is the only thing that is not an exact match, so using it as a key directly will not work. The idea of baking the conditions into the framework to simplify the implementation and make it more consistent with the layout feature finalization framework still seems like a good approach though.

processingPhase = RequestProcessingPhase.PRE_PROCESS,
requestType = Type.CreateBucket
requestType = Type.CreateBucket,
maxClientVersion = ClientVersion.ERASURE_CODING_SUPPORT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make this so that ClientVersion.BUCKET_LAYOUT_SUPPORT is provided in the annotation for validators that deal with bucket layout. That would make this an exclusive upper bound on the client version that needs to be processed instead of an inclusive one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we can remove the if statements in the body now right?

@@ -81,7 +82,7 @@
* Runtime conditions in which a validator should run.
* @return a list of conditions when the validator should be applied
*/
ValidationCondition[] conditions();
ValidationCondition[] conditions() default {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need this field?

* conditions though.
* @returns the max required client version for which the validator runs for older clients.
*/
ClientVersion maxClientVersion() default ClientVersion.FUTURE_VERSION;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we remove the conditions field then this probably shouldn't have a default value. Technically the request validators can be used for things other than client versioning but I don't think we have a use case for that right now. Narrowing the scope of conditions indicates what their intended use is.

Comment on lines 55 to 57
private final Map<Pair<Type, RequestProcessingPhase>, Pair<List<Method>, TreeMap<Integer, Integer>>>
maxAllowedVersionValidatorMap = new HashMap<>(Type.values().length * RequestProcessingPhase.values().length,
1.0f);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get rid ValidationCondition then we just have 3 keys to identify the validator(s) that need to run:

  • Request type
  • Request phase
  • Client version

A triple nested map of these keys will be much easier to follow than the wild data structure created here.

@swamirishi
Copy link
Contributor Author

swamirishi commented Jul 12, 2024

Thanks for working on this @swamirishi. While this PR is focused on the issues with the OLDER_CLIENT_REQUEST condition, the CLUSTER_NEEDS_FINALIZATION condition has the same problem. I suggest this:

  • Adding two new keys in the annotation: one for client version and one for layout version. They are exclusive. Only one is allowed to be specified per annotation.
  • Removing the general ValidationCondition from the annotation.
  • Two 3-nested maps in the registry: one for layout feature validators and one for client version validators since the version enums have different types.

Currently there is some iteration required for every request, even if they don't have a validator. With the general concept of validation conditions removed we can do a simple map lookup which will be more efficient in the common case, especially if we order the keys accordingly.

In this layout I think request version -> request type -> request phase -> validator is the most efficient layout of the map. Most clients will probably be newer versions and can fail out of the check in the first level. After that, most requests will not have compat concerns for a given version, so even more requests exit early. The phase is hardcoded depending on which method is called so there is no option to exit early here, hence it should be last.

This is less general than the validation condition support we have now, but IMO the general implementation is causing code complexity and inefficiencies on the write path for something we do not use and may never need. This also makes it consistent with other upgrade related annotations like DisallowedUntilLayoutVersion which only take a version and act on it with a fixed condition.

Thanks for working on this @swamirishi. While this PR is focused on the issues with the OLDER_CLIENT_REQUEST condition, the CLUSTER_NEEDS_FINALIZATION condition has the same problem. I suggest this:

  • Adding two new keys in the annotation: one for client version and one for layout version. They are exclusive. Only one is allowed to be specified per annotation.
  • Removing the general ValidationCondition from the annotation.
  • Two 3-nested maps in the registry: one for layout feature validators and one for client version validators since the version enums have different types.

Currently there is some iteration required for every request, even if they don't have a validator. With the general concept of validation conditions removed we can do a simple map lookup which will be more efficient in the common case, especially if we order the keys accordingly.

In this layout I think request version -> request type -> request phase -> validator is the most efficient layout of the map. Most clients will probably be newer versions and can fail out of the check in the first level. After that, most requests will not have compat concerns for a given version, so even more requests exit early. The phase is hardcoded depending on which method is called so there is no option to exit early here, hence it should be last.

Version shouldn't be part of the map key? We need the validator running for all clients older than a certain condition.
I believe we need the same for Layout version. E.g. We know if a server hasn't currently finalized ERASURE_CODED_STORAGE_SUPPORT then we can be sure that it hasn't started preparing for HSYNC as well. Thus if a client comes in and tries to use HSYNC the validator should block that request as well.
Moreover we don't need a nested map since all the keys would be used together. If we have Triple or Tuple map it is the same hash code wise. I believe this would be much better to tuple map since you hash map size would be much smaller and only on the elements added. EnumMaps would create a map for all the values. It would be wasteful in the case we don't have validators blocking for all the enum values there are.

This is less general than the validation condition support we have now, but IMO the general implementation is causing code complexity and inefficiencies on the write path for something we do not use and may never need. This also makes it consistent with other upgrade related annotations like DisallowedUntilLayoutVersion which only take a version and act on it with a fixed condition.

Change-Id: Ib1742e6a78f908598021ed0bb80f71d3c1527735
Change-Id: Ic084d8bc456468d66cb26d8fc0cfdd33920fb8ed

# Conflicts:
#	hadoop-hdds/common/src/main/resources/META-INF/services/javax.annotation.processing.Processor
Change-Id: Iab5e022b6a1401998d905c68d491fe2a2a7413b7
Change-Id: Iba9e1caa6a66b033d39c95b20f78ebc8e23fba4d
@swamirishi swamirishi changed the title HDDS-11149. Add a maximum client version in the RequestFeatureValidator to validate clients older than the specified version HDDS-11149. Have a generic version Validator for validating Requests Jul 15, 2024
Change-Id: I05265cad5f26082351c6926c24866b20a50b26d2
@fapifta
Copy link
Contributor

fapifta commented Jul 17, 2024

Hi @swamirishi,

thank you for continued work on this one, the idea presented in the PR seems really promising, and for the first sight, it is the generalisation in the proper direction in order to make it possible to extend the current request validations to other service components, which I really like. To be honest I really envy you to get to this, I never forgot how badly I wanted to push things towards this direction back at the time when the original feature was ready but we could not spare time for this... Anyway... let's go back to the review itself...

Please note that the single most crucial point of this whole system is the validationsFor() methods in ValidatorRegistry, every other bit is secondary compared to the operational effectiveness of how we get the validations, as that is in the way of all request processing. This means optimally the system can figure out at the very first step the cheapest possible way that the there are no validations to apply to a request, as in almost all request will fall into this bucket. Besides that the data structure should be designed in a way that it does not hold back the request processing more than it is absolutely necessary when there are version differences between the client and the server.
From this sense, we possibly should check the operational complexity of querying for a case where the client and the server is at the same version, and the operational complexity of querying for a case where the client and the server has different versions. The different version scenario has two sub cases, where the client is newer, this falls into the bucket of client is the same version as server, as in this case the client side deals with compatibility, and where the server is newer that is the case where multiple validations may be applied.
This note I put here just to reflect to your argument in the HashMap vs EnumMap questions, as this is not just a matter of taste, as a suboptimal solution for the same client server version case for sure affects the whole performance of any service where this runs, so we need to be careful and decide based on this property. (And yes in this case even a few CPU cycles count at scale.) We can neglect the costs of building up the in-memory data structure, as that happens once, and we can also neglect the memory efficiency as it does not matter if this structure at the end of the day retains 1MB or 100MB for the runtime of the process as usually these processes have tens or hundreds of GB RAM available for the processes.
In light of this... I would argue that the array based enum map provides probably a better performance for getting the values at the price of retaining more memory, but that is more feasible for us than using a hashTable and calculating hashCode at every get.
This of course is only an argument if we do not have other means to optimize the same client same server version case, but we need an optimal older client newer server case resolution also, as after an upgrade there might be a bunch of clients to be upgraded and during that happens performance degradation can be problematic. (This might not just happen at rolling upgrade which we don't support yet, but that is the most obvious time when this can become a problem.)

I still did not have the time to digest fully the changes proposed, I wanted to give feedback as early as possible even if it is partial. You can expect some more review comments from me after I have the time to understand things further, most likely just early next week unfortunately.

On the other hand, there is a problem which we need to address.
Please take a look at the solution added in #6796, this PR adds the annotation processors explicitly to those modules where we use the annotation, and makes it impossible to use the annotations outside those modules, as importing the annotations became a violation in the enforcer plugin.
Renaming an annotation or an annotation processor, effectively turns these enforcements off if the names are not updated in the pom files, so with a rename we will need to go through the pom files and fix them to keep the prior enforcements in effect.

A less burning issue, however it is our best interest to deal with this:
I noticed is that due to renames and class deletes and method signature changes affects API documentation, from which the automation removed class references but this way the API doc is sometimes makes no sense, and most of the times deviates from the current implementation of the documented function.

Change-Id: I22ff08e5f4c43cbdda7a7db5f4d83417b55d481e
Change-Id: I551cb26f648bfedb7bc5bfee63b34731b56b4d9b
@fapifta
Copy link
Contributor

fapifta commented Oct 19, 2024

@sumitagrawl, Yes, it will affect startup performance, and also has some memory footprint.

Let say we have 1000 validators, we may use 10MB tops for this data structure but I would guess less than that... however this is just a wild guess... even if it is 100MB we probably are fine, as we are able to directly address all validators we need within the mappings.

Computational performance? It may or may not suffer, it depends, how many old client requests are coming in, how many new features and layout changes were introduced with the last release and so on. The decision on this is quick, and once done, it is clear which validators we need if we need any.

These annotated methods potentially have some retention time, so the number is not monotonically growing, as if we announce dropping compatibility support for example for anything before 1.5 in 2.0 and we require all clusters to be upgraded to 1.5 before it can be upgraded to 2.0, we can simply remove anything that belongs to a a version that is prior to 1.5, except the actual client and server side validations, with that decreasing the number of validators over time.

@adoroszlai
Copy link
Contributor

Should we implement this in steps to avoid too big changes and difficult to review patches?

@swamirishi
Copy link
Contributor Author

Should we implement this in steps to avoid too big changes and difficult to review patches?

Yeah let me see if I can break it into smaller patches.

@swamirishi
Copy link
Contributor Author

swamirishi commented Dec 17, 2024

Avoid server version validator if the server has already been finalized & also can explore skipping registering validators if the server starts up in a finalized state.

Change-Id: I2d19e9052abe0702f6fe391d32737a53601852b1
Change-Id: I39af5db40fa96fee27ef53ab7ef5f44002736e4e

# Conflicts:
#	hadoop-hdds/tools/pom.xml
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyCommitRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyCreateRequest.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/protocolPB/OzoneManagerRequestHandler.java
#	hadoop-ozone/tools/pom.xml
…0983 (apache#7348)"

This reverts commit e7bf154

Change-Id: Ie8d5b1ec85844b890039eb0026fad90fa4d45601
Change-Id: I9f6854136551102dfe94086a8afaf1c820331533
Change-Id: Ia0d97078d6c9fc61f2f79bc322f966e11ab4a3c4
Change-Id: I63b013066618c0568e13c2323b8403d1ca6d0426
Change-Id: I7701ad25c3fb4cde9f78daef3053c822b41fe4a0
Change-Id: I855a19c67256a27504b7e3d134b2f500a94bf330
Change-Id: I6a5e84b8053a9401ce4210849495af8254561f09
Change-Id: I13bc9e28c5db3b706265abb8cecaf71c5201151c
Change-Id: I8302abcfcaa70b07a85ef9505c7dabdc312955d4
Change-Id: I9b30d241a216733266754086b2add6fd623f27dd
Change-Id: I08b43ff6a366ad4744110512ccd0109e16701d6b
…ce to accomodate in the request validator framework

Change-Id: I4c1844a1dfb6127a73b46a92933db33b09c6b06e
Change-Id: I63cddb955ea6178cd489c2a83922576e8d2ea5eb
Change-Id: I11bff280f429d2eb64f8828887f7349e169e938f
Change-Id: I9ebc447eb9b60caa69e140b06fe899c58dd712a3
Change-Id: Iae51b610a170ede5a565cd83ce70764e8eec9a82

# Conflicts:
#	hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/ComponentVersion.java
#	hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/upgrade/LayoutFeature.java
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/validation/VersionExtractor.java
Change-Id: I55280a1b2fdeb7b6ed202098cc0ca412ee7761cf
Change-Id: I33d392ec1f55f26336186551dd4bb81ee1617cff
Change-Id: Iaf2b997ba214db59f897fa430328356f0e8f1f47

# Conflicts:
#	hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/validation/VersionExtractor.java
#	hadoop-ozone/ozone-manager/src/test/java/org/apache/hadoop/ozone/om/request/validation/TestVersionExtractor.java
Change-Id: I866f87f947db6306e0c145b83180ece7e458ee07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants