-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add http server port in the worker net address #18499
Conversation
Hi @lucyge2022 @jja725 @dbw9580 @ssyssy , could you take a review of this etcd registration change? |
Verified the backward compatibility by |
@com.google.gson.annotations.SerializedName("HttpServerPort") | ||
private int mHttpServerPort; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This port is for the restful API server on the worker, right? There is already a "web port" which runs the web UI. Since both the UI and the restful API runs over HTTP, and possibly more services that run over HTTP will be added, can we be more specific that this port is used for the restful API exclusively?
@com.google.gson.annotations.SerializedName("HttpServerPort") | |
private int mHttpServerPort; | |
@com.google.gson.annotations.SerializedName("RestApiPort") | |
private int mRestApiPort; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The HttpServerPort is used in the PropertyKey
as the config for the restful API. The name is WORKER_HTTP_SERVER_PORT
. If we change the mHttpServerPort
to mRestApiPort
, it will cause naming inconsistencies since other WorkerNetAddress variables (e.g. mWebPort, mDataPort) follow the same names as configs in the PropertyKey
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest keeping the name for now. If we want to change the name in the future, we'll need to change both this one and the property key (and probably any other places using HTTP server port...).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u also add some comment here, saying this is a "optional"-like field which will not be included in equalness comparison
// Skip the comparison of mHttpServerPort for backward compatibility | ||
// && mHttpServerPort == that.mHttpServerPort; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's incompatible with comparing the restful api port for equality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also update the hashCode
implementation, or leave a comment explaining why mHttpServerPort
is excluded from computing the hashcode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incompatibility.
As @lucyge2022 mentioned, the previous etcd-registered worker information doesn't have HTTP server ports. 1) When we do the comparison, the lack of HTTP ports will cause an issue. 2) It will also influence worker registration if we just add an HTTP port for an old worker (I added a new test to cover this scenario).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hashCode.
Updated with a comment on the exclusion of mHttpServerPort
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we do the comparison, the lack of HTTP ports will cause an issue
When checking equality, WorkerServiceEntity
explicitly ignores all ephemeral information about a worker such as network addresses, and only compares the worker identity. This is why WorkerIdentity
exists. Therefore, a different HTTP port won't cause issue, as any other existing network addresses like hostname and grpc port.
+ "clean local worker identity settings to continue.", | ||
existingEntity.get().getWorkerNetAddress().toString(), | ||
existingEntity.get().getIdentity())); | ||
if (existingEntity.get().equals(entity)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add some comment below, like
// same entity, update with new information.
WorkerInfo wkr2 = new WorkerInfo() | ||
.setIdentity(workerIdentity1) | ||
.setAddress(new WorkerNetAddress() | ||
.setHost("worker2").setContainerHost("containerhostname2") | ||
.setRpcPort(2000).setDataPort(2001).setWebPort(2011) | ||
.setDomainSocketPath("/var/lib/domain.sock")); | ||
.setDomainSocketPath("/var/lib/domain.sock").setHttpServerPort(2021)); | ||
membershipManager.join(wkr1); | ||
// bring wrk1 down and join wrk2 with a same worker identity. | ||
membershipManager.stopHeartBeat(wkr1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u add a test to have a worker join without http port set and afterwards join with http port set , it should be allowed to join and the list of workers should be unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a test testWorkerHttpServerPorts
. Could you take a review?
@@ -233,6 +254,8 @@ public boolean equals(Object o) { | |||
&& mDataPort == that.mDataPort | |||
&& mWebPort == that.mWebPort | |||
&& mDomainSocketPath.equals(that.mDomainSocketPath); | |||
// Skip the comparison of mHttpServerPort for backward compatibility | |||
// && mHttpServerPort == that.mHttpServerPort; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u remove line 258?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
cool perfect thanks! |
// only the service name and the identity are the factors that define | ||
// what a WorkerServiceEntity is. Any other is either ephemeral or supplementary data. | ||
return mIdentity.equals(anotherO.mIdentity) | ||
&& getServiceEntityName().equals(anotherO.getServiceEntityName()); | ||
&& getServiceEntityName().equals(anotherO.getServiceEntityName()) | ||
&& mAddress.equals(anotherO.getWorkerNetAddress()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment explicitly states that WorkerServiceEntity
ignores worker address for equality comparison. If a worker with the same ID restarts and the address changes, it will leave two different entries on etcd and causes the clients to think there are multiple workers, but there are actually only one worker.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bowen, actually when collision happens, in etcd comparison it directly compares the value bytes (EtcdMembershipManager.java line 134) and under that situation we evaluate if there's a need to update, and then we use equals here, if equals we know its the same worker and then update with new info, this is to introduce backward compatibility after adding a field "httpport" into the info to persist on etcd. Can u check if this equals is used somewhere else apart from etcdmembership join that will cause other problems?
// Skip the comparison of mHttpServerPort for backward compatibility | ||
// && mHttpServerPort == that.mHttpServerPort; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we do the comparison, the lack of HTTP ports will cause an issue
When checking equality, WorkerServiceEntity
explicitly ignores all ephemeral information about a worker such as network addresses, and only compares the worker identity. This is why WorkerIdentity
exists. Therefore, a different HTTP port won't cause issue, as any other existing network addresses like hostname and grpc port.
public void testWorkerHttpServerPorts() throws Exception { | ||
final MembershipManager membershipManager = getHealthyEtcdMemberMgr(); | ||
Configuration.set(PropertyKey.WORKER_MEMBERSHIP_MANAGER_TYPE, MembershipType.ETCD); | ||
Configuration.set(PropertyKey.ETCD_ENDPOINTS, getClientEndpoints()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
u dont need to set the config here anymore if getHealthyEtcdMemberMgr is used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
workerNetAddress.setHttpServerPort(1021); | ||
membershipManager.join(wkr); | ||
// check if the worker is rejoined and information updated | ||
curWorkerInfo = membershipManager.getLiveMembers().getWorkerById(workerIdentity); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also assert the getallmembers only have one worker -> meaning that we are not creating two entries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point! I added that assertion.
@@ -401,4 +401,44 @@ public void testSameWorkerIdentityConflict() throws Exception { | |||
Assert.assertTrue(curWorkerInfo.isPresent()); | |||
Assert.assertEquals(wkr2.getAddress(), curWorkerInfo.get().getAddress()); | |||
} | |||
|
|||
@Test | |||
public void testWorkerHttpServerPorts() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u say testOptionalFieldChangeInWorkerAddress
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Updated the function name to testOptionalHttpPortChangeInWorkerAddress
.
@lucyge2022 @dbw9580 I updated the PR according to our discussion. Could you take a look again? |
public boolean customizedEquals(Object o) { | ||
if (this == o) { | ||
return true; | ||
} | ||
if (!(o instanceof WorkerNetAddress)) { | ||
return false; | ||
} | ||
WorkerNetAddress that = (WorkerNetAddress) o; | ||
return mHost.equals(that.mHost) | ||
&& mContainerHost.equals(that.mContainerHost) | ||
&& mSecureRpcPort == that.mSecureRpcPort | ||
&& mRpcPort == that.mRpcPort | ||
&& mDataPort == that.mDataPort | ||
&& mWebPort == that.mWebPort | ||
&& mDomainSocketPath.equals(that.mDomainSocketPath); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this does not have to be put in WorkerNetAddress
because the costumized comparison can be different depending on the caller. You can remove this and move code to WorkerServiceEntity
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I moved the comparison logic to WorkerServiceEntity
.
WorkerServiceEntity anotherO = (WorkerServiceEntity) o; | ||
return mIdentity.equals(anotherO.mIdentity) | ||
&& getServiceEntityName().equals(anotherO.getServiceEntityName()) | ||
&& mAddress.customizedEquals(anotherO.getWorkerNetAddress()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move the comparison here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use name as consideredEquivalent ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, or be explicit about what is considered equivalent, e.g. equalsIgnoringOptionalFields
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the function name to equalsIgnoringOptionalFields
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
@lucyge2022 @dbw9580 @ssyssy @jja725 Thanks for your review! |
alluxio-bot, merge this please |
merge failed: |
alluxio-bot, merge this please |
What changes are proposed in this pull request?
This PR enables registering the worker's HTTP server's port in the etcd. This helps to find worker's restful APIs from the Python client.
Why are the changes needed?
Alluxio Python client (e.g. in ML use cases) needs to connect to the worker's REST APIs. But as the http server port isn't included in the worker's information in the etcd, the client fails to find the API endpoint.
Does this PR introduce any user facing changes?
No.