Envvars nodeprops #1

Merged
merged 48 commits into from
Nov 7, 2024

Conversation

rpashkoff
Owner

No description provided.

romankulikov and others added 30 commits October 2, 2015 17:40
The user can choose the post-build behavior of the VM:
0. Suspend VM
1. Stop VM
2. Keep VM running
3. Return to previous state
Instead of a fixed "stopVm", the connector slave now gets the final command from the VM.
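
As an illustration only, the four modes listed above could be modelled as an enum. The type and constant names here are hypothetical and are not taken from the plugin source; only the numeric codes and their meanings come from the commit message.

```java
/** Hypothetical model of the post-build modes; codes mirror the list above. */
enum PostBuildBehavior {
    SUSPEND(0),
    STOP(1),
    KEEP_RUNNING(2),
    RETURN_TO_PREVIOUS_STATE(3);

    private final int code;

    PostBuildBehavior(int code) {
        this.code = code;
    }

    int code() {
        return code;
    }

    /** Maps a numeric code back to a mode, rejecting unknown values. */
    static PostBuildBehavior fromCode(int code) {
        for (PostBuildBehavior b : values()) {
            if (b.code == code) {
                return b;
            }
        }
        throw new IllegalArgumentException("Unknown post-build behavior: " + code);
    }
}
```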
…r-VM-stopping-mode

Provide configuration for vm stopping mode
This is the first public plugin release on official Jenkins project
hosting.
Fixed typo and changed repo URL.
As required for maven release.
The plugin is now hosted on jenkins-ci.org, so part of the custom
installation process is no longer needed.

Also, the issue with the incorrect computer launcher representation on
the web form is fixed.
If the connector slave was not used as a build node, the "Demand"
strategy was set for it. As a result, the connector slave could be taken
offline after some time of inactivity, and then the command to stop the
virtual machine was not executed on the connector.
We need to block the restart until all VMs are stopped, suspended, or
paused, as required by their configuration.

Bug #JENKINS-40628
Just don't create node object during provisioning if VM start failed or
IP address was not obtained. Let Jenkins find better executor for this
job.

Bug #JENKINS-40685
It seems a large VM on a Mac mini may take 2-3 minutes to resume.

Bug #JENKINS-40685
If a VM slave node fails to start for some reason, Jenkins leaves it
offline, because in that case we just skip the check in the retention
strategy for an unclear reason.

Bug #JENKINS-40685
The check for VM existence was dysfunctional in the initial commit.
Let's reimplement it.

Bug #JENKINS-40685
The foundation of this procedure is proper error handling. The previous
attempt did not in fact fix anything. If, for example, Jenkins fails to
connect to the VM through SSH, the corresponding node stalls in offline
mode with the job hanging on it.

Bug #JENKINS-40685
To fix the bug with limiting the number of VMs running on one host, we
need to rework the VM start-up procedure so that an actual virtual
machine start is attempted only if the VM is not running already (if the
VM is already running on the host, no additional limit can be applied to
it).

Bug #JENKINS-40685
We need to check resource limits (CPU and RAM for now, as the main
resources) before VM start to avoid host overload. Otherwise, in case of
CPU or memory overcommit, host performance may degrade significantly,
even to the point of unresponsiveness. Yes, Parallels Desktop internally
checks some limits before VM start, but:
0. these checks are not strict (whether this is a bug or not -- I don't know);
1. for the plugin, a start blocked by these checks looks like a common VM
start failure.
So it's better to distinguish resource check failures from other
failures in order to handle such errors better.

Actual implementation goes in the next commit.

Bug #JENKINS-40685
Here's the general algorithm, applied to CPU count and RAM separately:
0. take the available CPU cores and physical RAM on the host;
1. for each running VM, sum its virtual CPUs and RAM (main memory and
video RAM);
2. add 500 MB to each VM's RAM as virtualization overhead;
3. subtract 1 GB from the host's available RAM -- consider this to be
taken by the host OS, apps, Jenkins' java etc.;
4. so the available resources are: host CPUs minus the sum of running
virtual CPUs, host RAM minus all VMs' RAM;
5. compare the available resources with the resources required by the VM
to start; if CPU or RAM exceeds the available values -- refuse to start
the VM.
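
The steps above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the class, the `VmSpec` type, and the method names are hypothetical, and it assumes the 500 MB overhead is also charged to the VM that is about to start, which the commit message does not state.

```java
import java.util.List;

/** Hypothetical sketch of the CPU/RAM admission check described above. */
final class ResourceCheck {
    static final long VM_OVERHEAD_MB = 500;     // step 2: per-VM virtualization overhead
    static final long HOST_RESERVED_MB = 1024;  // step 3: reserved for host OS, apps, Jenkins

    /** Hypothetical value type describing one VM's configured resources. */
    static final class VmSpec {
        final int cpus;
        final long ramMb;
        final long videoRamMb;

        VmSpec(int cpus, long ramMb, long videoRamMb) {
            this.cpus = cpus;
            this.ramMb = ramMb;
            this.videoRamMb = videoRamMb;
        }

        /** Main memory plus video RAM plus the fixed overhead. */
        long totalRamMb() {
            return ramMb + videoRamMb + VM_OVERHEAD_MB;
        }
    }

    /** Returns true if the candidate fits into what the running VMs leave free. */
    static boolean canStart(int hostCpus, long hostRamMb,
                            List<VmSpec> running, VmSpec candidate) {
        int usedCpus = 0;
        long usedRamMb = 0;
        for (VmSpec vm : running) {          // step 1: sum over running VMs
            usedCpus += vm.cpus;
            usedRamMb += vm.totalRamMb();
        }
        int freeCpus = hostCpus - usedCpus;  // step 4: what is left on the host
        long freeRamMb = hostRamMb - HOST_RESERVED_MB - usedRamMb;
        // step 5: refuse if either resource would be exceeded
        return candidate.cpus <= freeCpus && candidate.totalRamMb() <= freeRamMb;
    }
}
```

For example, on a host with 8 cores and 16384 MB of RAM running one VM with 4 vCPUs, 8192 MB of RAM and 256 MB of video RAM, 4 cores and 6412 MB remain, so a second 4-vCPU VM with 4096 MB of RAM would be admitted while a 6-vCPU one would be refused.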

Bug #JENKINS-40685
Until now we didn't know whether the prlctl command finished
successfully or not; we just parsed its output. To improve error
handling, particularly for the case of exceeded resources, let's throw
and catch an appropriate exception if the prlctl return code is not zero.
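
A minimal sketch of the idea, assuming Java 9+ for `InputStream.readAllBytes()`. The `CommandRunner` and `CommandFailedException` names are hypothetical, and the plugin itself may launch processes through Jenkins' own facilities rather than `ProcessBuilder`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

/** Hypothetical helper: run a command and fail loudly on a non-zero exit code. */
final class CommandRunner {

    /** Carries the exit code and captured output of the failed command. */
    static final class CommandFailedException extends Exception {
        final int exitCode;
        final String output;

        CommandFailedException(int exitCode, String output) {
            super("Command exited with code " + exitCode);
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    static String run(List<String> command)
            throws IOException, InterruptedException, CommandFailedException {
        // Merge stderr into stdout so one stream holds everything the tool printed.
        Process process = new ProcessBuilder(command).redirectErrorStream(true).start();
        String output = new String(process.getInputStream().readAllBytes(),
                                   StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            // The caller can now tell a failed command from one with unexpected output.
            throw new CommandFailedException(exitCode, output);
        }
        return output;
    }
}
```

A caller would wrap something like `CommandRunner.run(Arrays.asList("prlctl", "list", ...))` in a try/catch and treat `CommandFailedException` differently from a parsing problem.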

Bug #JENKINS-40685
The check() method of the computer's retention strategy may be called
from different threads. So parallel execution is possible, and we need
to add locks to the method code.
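
One way to serialize such a method is sketched below with a `ReentrantLock`. The `GuardedCheck` class is hypothetical and stands in for the retention strategy; the returned value follows the Jenkins convention that check() reports the number of minutes until the next check. Whether the plugin uses an explicit lock or `synchronized` is not stated in the commit message.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a retention strategy whose check() must not run in parallel. */
final class GuardedCheck {
    private final ReentrantLock checkLock = new ReentrantLock();

    /**
     * Runs the check body under a lock so concurrent callers are serialized.
     * Returns the number of minutes until the next check.
     */
    long check(Runnable body) {
        checkLock.lock();
        try {
            body.run();
            return 1;
        } finally {
            // Always release, even if the body throws.
            checkLock.unlock();
        }
    }
}
```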
Since maven tests warn against using non-portable implementation from
Oracle run-time.
The check() method in the retention strategy may be called after VM
start but before the slave is launched on it. A plain isIdle() check
returns true -- and we started disconnecting and removing the node
_before_ the initial connection! So the computer start fails and the job
is not executed.

To fix this, let's introduce an idle timeout that must elapse before we
start computer termination, as is done in the retention strategies of
other plugins. The 2-minute value is arbitrary ("taken from the ceiling").
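
The timeout rule can be sketched as a pure function. The `IdleTimeoutPolicy` name is hypothetical; in a real strategy the inputs would presumably come from `Computer.isIdle()` and `Computer.getIdleStartMilliseconds()`, which is an assumption about how the plugin wires it up.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: terminate only after the node has been idle long enough. */
final class IdleTimeoutPolicy {
    // The 2-minute value is the arbitrary one mentioned in the commit message.
    static final long IDLE_TIMEOUT_MS = TimeUnit.MINUTES.toMillis(2);

    /**
     * @param idle        whether the computer currently reports itself idle
     * @param idleStartMs timestamp (ms) at which the computer became idle
     * @param nowMs       current timestamp (ms)
     */
    static boolean shouldTerminate(boolean idle, long idleStartMs, long nowMs) {
        return idle && (nowMs - idleStartMs) >= IDLE_TIMEOUT_MS;
    }
}
```

With this rule a freshly started VM that is idle only because its agent has not connected yet is left alone until the timeout has elapsed.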
The original implementation didn't work with multiple clouds configured
in the system. The restart listener worked with only one instance of the
cloud class. So let's rework the restart listener so that it asks every
PDfM host computer whether restart is allowed.

Also, VMs need to be counted only _after_ they are started.

Bug #JENKINS-40628
romankulikov and others added 18 commits January 23, 2017 18:17
A stupid error was made in the initial implementation: the host's CPU
count and RAM size were calculated on the master side, while they are
expected to be taken from the Cloud host slave. So we need to add a
MasterToSlaveCallable for proper determination of host resources.

Bug #JENKINS-40685, improvement #JENKINS-40691
As suggested in #INFRA-588.
Previously we aborted the VM computer launch if we were unable to get
its IP address via `prlctl list -f`. This could happen because Parallels
Tools are not installed in that VM or just because of a bug in Parallels
Desktop. But we can use a fallback: just try the address or hostname
specified in the VM configuration form.
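
The fallback can be sketched as a small selection function. The `AddressResolver` name and its parameters are hypothetical, and treating `-` as "no address reported" is an assumption about how the prlctl output is parsed.

```java
/** Hypothetical helper: prefer the reported IP, fall back to the configured host. */
final class AddressResolver {

    /**
     * @param reportedIp     IP parsed from the prlctl output, may be null or empty
     * @param configuredHost address or hostname from the VM configuration form
     * @return the address to connect to, or null if neither source is usable
     */
    static String chooseAddress(String reportedIp, String configuredHost) {
        if (isUsable(reportedIp)) {
            return reportedIp.trim();
        }
        if (isUsable(configuredHost)) {
            return configuredHost.trim();
        }
        return null; // the caller decides whether to abort the launch
    }

    private static boolean isUsable(String value) {
        if (value == null) {
            return false;
        }
        String trimmed = value.trim();
        // Assumption: "-" is a placeholder for a missing value.
        return !trimmed.isEmpty() && !"-".equals(trimmed);
    }
}
```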

Bug #JENKINS-41431
At the moment I don't have a valid environment to test and fix IPv6
support. So let's set up the plugin to work with IPv4 only for now.
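
One possible way to restrict resolution to IPv4 is to filter the resolved addresses by type, as sketched below. The `Ipv4Only` helper is hypothetical; the commit message does not say how the plugin enforces the restriction.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper: pick the first IPv4 address for a host, ignoring IPv6. */
final class Ipv4Only {

    /** Returns the first IPv4 address of the host, or null if it has none. */
    static InetAddress firstIpv4(String host) throws UnknownHostException {
        for (InetAddress address : InetAddress.getAllByName(host)) {
            if (address instanceof Inet4Address) {
                return address;
            }
        }
        return null;
    }
}
```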

Bug #JENKINS-41431
Do not provision VM slaves when the host slave is offline. Update
all VM slaves online/offline state in accordance with the host slave.
Prohibit managing VM slaves connection from the GUI.
Implement 'Mark this node temporarily offline' for host slave
@rpashkoff rpashkoff merged commit a325907 into rpashkoff:master Nov 7, 2024

4 participants