-
Notifications
You must be signed in to change notification settings - Fork 140
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Rework GenesisNetworkInfo
to use the genesis address book
#16759
base: develop
Are you sure you want to change the base?
Conversation
Signed-off-by: Matt Hess <matt.hess@swirldslabs.com>
GenesisNetworkInfo
to use the genesis address bookGenesisNetworkInfo
to use the genesis address book
We're determining if this work is still relevant |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #16759 +/- ##
=============================================
+ Coverage 63.45% 63.82% +0.36%
- Complexity 20319 20558 +239
=============================================
Files 2545 2540 -5
Lines 94605 95131 +526
Branches 9878 9941 +63
=============================================
+ Hits 60035 60714 +679
+ Misses 30954 30800 -154
- Partials 3616 3617 +1
|
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferencesCodacy stopped sending the deprecated coverage status on June 5th, 2024. Learn more |
...rc/main/java/com/hedera/node/app/service/addressbook/impl/schemas/V053AddressBookSchema.java
Show resolved
Hide resolved
Signed-off-by: Matt Hess <matt.hess@swirldslabs.com>
Signed-off-by: Matt Hess <matt.hess@swirldslabs.com>
hedera-node/hedera-app/src/main/java/com/hedera/node/app/info/GenesisNetworkInfo.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @mhess-swl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
...wirlds-platform-core/src/main/java/com/swirlds/platform/system/address/AddressBookUtils.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Matt Hess <matt.hess@swirldslabs.com>
92cf64e
Signed-off-by: Matt Hess <matt.hess@swirldslabs.com>
Signed-off-by: Matt Hess <matt.hess@swirldslabs.com>
@@ -219,4 +237,9 @@ private static Roster retrieveInternal(@NonNull final State state, @Nullable fin | |||
} | |||
return null; | |||
} | |||
|
|||
private static boolean isValidServiceEndpoint(@NonNull final ServiceEndpoint endpoint) { | |||
return endpoint.ipAddressV4().length() > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would hope this kind of input validation is happening in the DAB transactions and that it is unnecessary here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was my attempt to remember the following (current) code:
.gossipEndpoint(List.of(
Pair.of(address.getHostnameExternal(), address.getPortExternal()),
Pair.of(address.getHostnameInternal(), address.getPortInternal()))
.stream()
.filter(pair -> pair.left() != null && !pair.left().isBlank() && pair.right() != 0) // <----- this
.distinct()
So the code I wrote here is slightly different. I can replace this with the original filtering if needed.
final List<ServiceEndpoint> rosterOrdered = endpoints.size() > 1 ? endpoints.reversed() : endpoints; | ||
// Filter out any invalid endpoints | ||
final List<ServiceEndpoint> filtered = endpoints.stream() | ||
.filter(e -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any risk of filtering this down to 0 valid endpoints?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably. What is the expected behavior in that case? I didn't see any code to handle that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, there is no code to handle that. Can't connect to anyone that has 0 gossip endpoints. Its fundamentally bad input to be able to reach this state. Validation and error throwing should be happening at time of input, first time the data hits our system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anthony-swirldslabs wrote the original. If we have proper input validation, this filter should never catch anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the validation should happen upstream somewhere. At the same time, I think that re-validating here shouldn't hurt.
However, ending up with zero endpoints is no good because such a RosterEntry would be unusable by the platform. In fact, it's likely to start throwing exceptions because the Roster specification states that there must be at least a single entry. And platform rightfully assumes so.
This code needs to throw an exception and fail hard if it ends up with zero endpoints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code needs to throw an exception and fail hard if it ends up with zero endpoints.
Again, if this is true, then why isn't the current code throwing a clear, explicit exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely, "we just missed it back then". Less likely, but not impossible, some unit tests might use bogus AddressBooks w/o any endpoints at all, in which case this code would fail to support them. If the latter is true, well, then we're good. If not though, I think it wouldn't hurt adding the exception while we're refactoring this code anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks @mhess-swl
} catch (CertificateEncodingException e) { | ||
throw new InvalidAddressBookException(e); | ||
} | ||
|
||
final List<ServiceEndpoint> endpoints = AddressBookUtils.endpointsFor(address); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This endpointsFor
method returns endpoints in an order that is opposite to what the Roster expects. In the Roster, the external endpoint goes at index 0, followed by an optional internal endpoint. The internal endpoint may be skipped if it's identical to the external one.
The entire platform code relies on this particular order of endpoints in the Roster and will be broken if it's reversed. This is because we're going to deprecate internal endpoints from the Roster eventually and replace them with the node configuration. So we choose to use the entry at index 0 from the start to refer to an external endpoint that the platform normally uses.
It would be best to retain the existing conversion logic in the RosterRetriever class as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be best to retain the existing conversion logic in the RosterRetriever class as is.
The existing conversion logic is why we have had problems and confusion in functional testing to begin with. To be more specific, not having explicit and clear safeguards to maintain said ordering in all the platform code has led to this ordering logic leaking into various places. Now this leak has to be accounted for in disparate locations sprawled across both platform and services code (hence the functional testing issues).
I attempted to standardize/unify all the existing platform and services use cases while maintaining current functionality. Maybe I didn't do a great job, but these APIs and their usages should be consistent and cohesive across the entire codebase, instead of being evaluated solely from the viewpoint of platform alone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fully support the intent of this change per your description. And if this conversion logic must be re-written, that's okay, too.
Speaking of this particular line of code though, since this call occurs in a Roster-related class and we know that the AddressBookUtils
would return the endpoints in an order that is opposite to the roster specification, it would be a lot more appropriate to flip the entries right here, instead of burying the flipping logic down the call stack in a completely different method. This would help ensure that the rest of the code in this class doesn't need to deal with flipping at all.
However, per a comment below in this PR, it does seem that there's an extra place which requires the flipping logic - in the RosterUtils. It would be ideal to find a way to have this flipping implemented literally once in a single place somehow, to follow the spirit of the intent of this fix.
final List<ServiceEndpoint> rosterOrdered = endpoints.size() > 1 ? endpoints.reversed() : endpoints; | ||
// Filter out any invalid endpoints | ||
final List<ServiceEndpoint> filtered = endpoints.stream() | ||
.filter(e -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the validation should happen upstream somewhere. At the same time, I think that re-validating here shouldn't hurt.
However, ending up with zero endpoints is no good because such a RosterEntry would be unusable by the platform. In fact, it's likely to start throwing exceptions because the Roster specification states that there must be at least a single entry. And platform rightfully assumes so.
This code needs to throw an exception and fail hard if it ends up with zero endpoints.
// Order the endpoints in the expected roster order | ||
final List<ServiceEndpoint> rosterOrdered = filtered.size() > 1 ? filtered.reversed() : filtered; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just saw this after posting my original comment. I understand how you work around the issue. However, I don't find this particularly elegant or easy to maintain. I'd still prefer to retain the original code because the logic there is quite straightforward.
Also, this workaround can in fact still end up filtering out the external endpoint which the platform really needs, and only keeping the internal endpoint. This is an error condition from the platform perspective.
Again, I'd recommend keeping the existing conversion logic in this class as it was originally written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just saw this after posting my original comment. I understand how you work around the issue. However, I don't find this particularly elegant or easy to maintain. I'd still prefer to retain the original code because the logic there is quite straightforward.
I have to disagree. As a newcomer to this code, reading the current implementation–or changing it–were not straightforward. Converting into Pair
objects and doing inline validation–essentially duplicating existing validation code elsewhere, I might add–is not only clunky, but error-prone. Furthermore, the roster was designed inside the boundaries of the platform code, and yet the roster itself isn't even consistent with platform's own address book ordering. How is unwieldy, error-prone, and inconsistent code straightforward (or elegant)?
Also, this workaround can in fact still end up filtering out the external endpoint which the platform really needs, and only keeping the internal endpoint. This is an error condition from the platform perspective.
If this is true, then why doesn't the current conversion code account for these potential errors with a clear, explicit exception either?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As suggested above, it may make sense to move the flipping logic to where we obtain the list of endpoints in the first place, so that they're always in the correct order everywhere else in this roster-specific class.
As for the exception, that's a good question to which there's not a good answer other than "we missed it when we wrote it originally." Would adding the exception break any unit tests? If not, is there a good reason not to include it with this conversion logic overhaul while we're at it?
// If we're reading metadata, the endpoint order SHOULD match the address book, which always has the | ||
// internal endpoint at index 0 | ||
node.serviceEndpoint()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then you cannot put it into the roster w/o flipping. A RosterEntry should always have the external endpoint at index 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, this needs to be flipped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This only proves my earlier point: passing around two service endpoints in a list, with no designation of internal vs. external, is at best difficult to account for. I recommend we change these APIs from an arbitrary list of ServiceEndpoint
to something more explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't disagree with the sentiment fully. However, as of today, the plan is to use the endpoints in the RosterEntry to represent various external endpoints, such as IP-address based, FQDN-based, and in the future, perhaps IPv6-based. While currently it's unclear how this array of endpoints is going to be actually used, it is certain that we don't have plans to include any internal endpoints in this list because they're generally not useful in the roster as a public data structure.
But as long as we do use this list, we need to have the external endpoint at index zero, so the quoted code needs to flip the endpoints here.
In addition to replacing the roster with the genesis address book in
GenesisNetworkInfo
, this PR also makes a further attempt to standardize how the code creates/retrieves rosters.Closes #16643