-
Notifications
You must be signed in to change notification settings - Fork 371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extension sequencing scenario #2933
Conversation
@@ -91,3 +91,6 @@ ENV/ | |||
# pyenv | |||
.python-version | |||
.vscode/ | |||
|
|||
# resource group txt created for ext_sequencing scenario in e2e pipeline | |||
resource_groups_to_delete.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The combinator creates a resource group and VMSS for the extension sequencing scenario and writes the resource group name to this file. Later, a task in the pipeline runs to cleanup the resources listed in this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But do we need to add the file to .gitignore? Probably you were running the tests with cwd within the source code directory. I usually set cwd to a temporary directory outside the source code.
@@ -275,138 +278,147 @@ def _clean_up(self) -> None: | |||
Cleans up any leftovers from the test suite run. Currently just an empty placeholder for future use. | |||
""" | |||
|
|||
def _setup_node(self, install_test_agent: bool) -> None: | |||
def _setup_node(self, install_test_agent: bool, nodes: List[Dict[str, Any]]) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_setup_node, _collect_node_logs, and _check_agent_log were all updated to take in a list of nodes which indicate which nodes the setup/log actions should be taken.
In the case of extension sequencing scenario, the nodes list represents all the instances of the VMSS. Setup and log collection/checking will happen on each instance. In the case of any other extension, the nodes list will only contain one VM.
Codecov Report
@@ Coverage Diff @@
## develop #2933 +/- ##
========================================
Coverage 72.02% 72.02%
========================================
Files 106 106
Lines 16063 16063
Branches 2304 2304
========================================
Hits 11569 11569
Misses 3962 3962
Partials 532 532 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The combinator was updated to create a VMSS for the extension sequencing scenario. We choose one instance of the VMSS to run the extension sequencing scenario on. We report a test failure if VMSS creation fails
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Template to create virtual machine scale set
"orchestrationMode": "Flexible", | ||
"virtualMachineProfile": { | ||
"extensionProfile": { | ||
"extensions": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This template includes extensions and dependencies which will be added to the scale set at creation.
This is different from the current DCR scenario. The current DCR scenario creates the VMSS, and then repeatedly updates the extensions/dependencies on the scale set in the test to validate the order in which the extensions were enabled. This can be done because the DCR scenario creates a VMSS with 'Uniform' orchestration mode. With Uniform orchestration mode you can update the scale set model and each instance in the scale set will be updated with the new model (including extensions).
We cannot use 'Uniform' orchestration mode for the VMSS in this scenario. Lisa requires that we use 'Flexible' orchestration mode, because while setting up the test environment, LISA looks for the VM in the resource group. In flexible orchestration mode, a VM for each instance is created in the resource group, but for uniform mode there are no VM resources created for the instances.
As a result, this test will look a little different from DCR. We must add the extensions/dependencies at VMSS creation so that each instance will get the extensions/dependencies. We can only validate once, because we cannot update the VM instance extension dependencies after VMSS creation.
Pavan mentioned Flexible VMSS will support UpgradePolicy in the next 1-2 semesters. So I added a TODO to improve this scenario when that happens
tests_e2e/pipeline/pipeline.yml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added job in pipeline to cleanup resources created by the combinator
tests_e2e/tests/lib/identifiers.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added clients to do operations on ResourceGroups and VirtualMachineScaleSetClients in azure python sdk
try: | ||
vmss.delete_extension(ext_to_remove) | ||
except HttpResponseError as e: | ||
if "(BadRequest)" in e.message and "provisionAfterExtensions" in e.message: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you try to remove an extension with dependencies CRP reports an error which we catch here
{
"innererror": {
"internalErrorCode": "MissingDependentExtensionInArgumentDependentExtensionExistsInVMSSInstance"
},
"code": "BadRequest",
"message": "On resource 'lisawalinuxagentextseq20230923164508e0', extension 'RunCommand' specifies 'VmAccess' in its provisionAfterExtensions property, but the extension 'VmAccess' will no longer exist. First, remove the extension 'RunCommand' or remove 'VmAccess' from the provisionAfterExtensions property of 'RunCommand'."
}
name: "ExtSequencing" | ||
tests: | ||
- "ext_sequencing/ext_sequencing.py" | ||
images: "random(endorsed)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Original DCR scenario runs on all endorsed images.
I'm choosing to run the scenario on one random endorsed image because VMSS creation is time consuming and test execution cannot start until the combinator is done creating the VMSS resources and environments.
@@ -91,3 +91,6 @@ ENV/ | |||
# pyenv | |||
.python-version | |||
.vscode/ | |||
|
|||
# resource group txt created for ext_sequencing scenario in e2e pipeline | |||
resource_groups_to_delete.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But do we need to add the file to .gitignore? Probably you were running the tests with cwd within the source code directory. I usually set cwd to a temporary directory outside the source code.
@@ -142,6 +161,7 @@ def create_environment_list(self) -> List[Dict[str, Any]]: | |||
Note that if the runbook provides an 'image', 'location', or 'vm_size', those values override any values provided in the | |||
configuration of the test suites. | |||
""" | |||
log: logging.Logger = logging.getLogger("lisa") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create self._log and initialize in init() instead of doing this on every method that needs to log?
} | ||
|
||
def create_environment_ext_sequencing(c_env_name: str) -> Dict[str, Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the scale set is just generic and it does not have anything specific to extension sequencing? Let's remove all the references to "ext_sequencing" and use "scale_set" instead. Maybe "create_scale_set_environment"?
@@ -76,13 +95,13 @@ def __init__(self, runbook: AgentTestSuitesCombinatorSchema) -> None: | |||
if self.runbook.vm_name != '' and (self.runbook.image != '' or self.runbook.vm_size != ''): | |||
raise Exception("Invalid runbook parameters: When 'vm_name' is specified, 'image' and 'vm_size' should not be specified.") | |||
|
|||
self._created_rg_count = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's add type annotations to locals as well
env_name = f"{image_name}-{suite_info.name}" | ||
# The ExtSequencing scenario needs to be run on an instance of a scale set, which is not currently | ||
# supported by Lisa. Creation of test resources for this scenario will be handled by the combinator. | ||
if suite_info.name == "ExtSequencing": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of hardcoding this to a specific test suite, let's add a new property to the test suite yml to indicate it runs on a scale set.
# | ||
sudo find "$BUILD_SOURCESDIRECTORY" -exec chown "$USER" {} \; | ||
|
||
az login --service-principal --username "$AZURE_CLIENT_ID" --password "$AZURE_CLIENT_SECRET" --tenant "$AZURE_TENANT_ID" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to login? you are already passing the authentication info in the environment
|
||
az login --service-principal --username "$AZURE_CLIENT_ID" --password "$AZURE_CLIENT_SECRET" --tenant "$AZURE_TENANT_ID" | ||
az account set --subscription "$SUBSCRIPTION_ID" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to check KEEP_ENVIRONMENT before deleting the scale set
def create_environment_ext_sequencing(c_env_name: str) -> Dict[str, Any]: | ||
log.info("Creating VMSS for ExtSequencing scenario") | ||
curr_datetime = datetime.datetime.now().strftime("%Y%m%d-%H%M%S") | ||
rg_name = f"lisa-WALinuxAgent-extseq-{curr_datetime}-e{self._created_rg_count}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should use the 'name' variable from the runbook instead of hardcoding "WALinuxAgent"
@@ -0,0 +1,20 @@ | |||
#!/usr/bin/env bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will need to move the cleanup code to somewhere in LISA, otherwise developer runs are not cleaned up
VMSS support was added in this PR: #2954 Extension sequencing scenario will be added in a different PR |
Description
This PR adds the extension sequencing scenario which requires VMSS. It includes changes to the combinator and agent test suite to support running tests on scale set instances.
Issue #
PR information
Quality of Code and Contribution Guidelines