Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add -noReconnectAfter option #738

Merged
merged 12 commits into from
May 15, 2024
Merged

Conversation

Vlatombe
Copy link
Member

@Vlatombe Vlatombe commented May 7, 2024

This option allows to abort retrying after a given timeout expressed in simple form (e.g. 1d, 2h, 10m, 60s, ...)

Downstream work to include this option in the docker image for agents: jenkinsci/docker-agent#802

Testing done

Interactive testing

  • Inbound TCP

    • On initial connection ✅
    • On disconnect ✅
  • Websocket

    • Initial connection ❌: -noReconnect not implemented
    • On disconnect ✅

Submitter checklist

Preview Give feedback

This option allows to abort retrying after a given timeout in seconds.

Interactive testing

* Inbound TCP
** On initial connection ✅
** On disconnect ✅

* Websocket
** Initial connection ❌: `-noReconnect` not implemented
** On disconnect ✅
@Vlatombe Vlatombe requested a review from jglick May 7, 2024 11:51
@jglick
Copy link
Member

jglick commented May 7, 2024

[on WebSocket] -noReconnect not implemented

Right, this is a known bug that could be addressed separately.

@jglick
Copy link
Member

jglick commented May 7, 2024

Does this need to be an option, or can we just hard-code a timeout of (say) one hour? I guess that could break some use case involving a static agent connected to a controller which is down for maintenance over a long weekend?

@Vlatombe
Copy link
Member Author

Vlatombe commented May 7, 2024

I could give the option a sensible default value, but I think one hour may be too short, considering some maintenance followed by a startup sequence on slow hardware for the controller. Maybe one day?

@jglick
Copy link
Member

jglick commented May 7, 2024

Not sure. I guess making the timeout opt-in makes sense in that it can be set by all cloud plugins, as a complement for example to jenkinsci/kubernetes-plugin#1543; and anyone using static agents can decide for themselves if it is helpful.

Copy link
Member

@jglick jglick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far.

src/main/java/hudson/remoting/Engine.java Outdated Show resolved Hide resolved
@@ -740,9 +750,14 @@ public void closeRead() throws IOException {
if (noReconnect) {
return;
}
firstAttempt = Instant.now();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(WS reconnect)

@@ -795,14 +810,18 @@ private void innerRun(IOHub hub, SSLContext context, ExecutorService service) {

try {
boolean first = true;
firstAttempt = Instant.now();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TCP initial connection?

@@ -915,7 +934,7 @@ private void innerRun(IOHub hub, SSLContext context, ExecutorService service) {
}
if(noReconnect)
return; // exit

firstAttempt = Instant.now();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TCP reconnect?

src/main/java/hudson/remoting/Launcher.java Outdated Show resolved Hide resolved
@@ -682,6 +690,7 @@ private List<String> parseJnlpArguments() throws ParserConfigurationException, S
throw new IOException("-jnlpCredentials and -secret are mutually exclusive");
}
}
Instant firstAttempt = Instant.now();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine though this mode is deprecated so we do not really need to add enhancements to it.

@@ -401,8 +407,13 @@ public void waitForReady() throws InterruptedException {
String oldName = t.getName();
try {
int retries = 0;
Instant firstAttempt = Instant.now();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(TCP discovery via HTTP)

@jglick
Copy link
Member

jglick commented May 7, 2024

(complements #676 I think)

@Vlatombe Vlatombe marked this pull request as ready for review May 7, 2024 16:18
public final class DurationFormatter {
private DurationFormatter(){}

public static String format(Duration d) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


public static Unit fromSuffix(String suffix) {
for (Unit candidate : values()) {
if (candidate.suffix.equals(suffix.toLowerCase(Locale.ROOT))) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://find-sec-bugs.github.io/bugs.htm#IMPROPER_UNICODE seems to be triggering improperly, since use of Locale.ROOT should be safe AFAIK.

@Vlatombe Vlatombe requested a review from basil May 13, 2024 13:30
@Vlatombe Vlatombe enabled auto-merge May 15, 2024 06:50
@Vlatombe Vlatombe added the enhancement For changelog: An enhancement providing new capability. label May 15, 2024
@Vlatombe Vlatombe merged commit 4280e21 into jenkinsci:master May 15, 2024
13 checks passed
@Vlatombe Vlatombe deleted the noReconnectAfter branch May 15, 2024 07:37
@jglick
Copy link
Member

jglick commented May 15, 2024

Is there a corresponding PR to kubernetes-plugin to activate this mode? I guess there is no simple way to consume jenkinsci/docker-agent#809 prior to merge & release?

@Vlatombe
Copy link
Member Author

Unfortunately no, there isn't.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement For changelog: An enhancement providing new capability.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants