Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementing extension sequencing in azure linux agent #1298

Merged
merged 14 commits into from
Oct 17, 2018

Conversation

sirajudeens
Copy link
Contributor

@sirajudeens sirajudeens commented Aug 13, 2018

Adding support for extension sequencing in the azure linux agent

Issue #
VMSS extensions can be installed and enabled in a sequence by specifying dependsOn property values in the ARM template. CRP processes the dependency data, builds dependency graph and generates dependency level for each extension. The dependency level information is passed on to the VM agent. The VM agent needs to parse the dependency data and install the extensions based on the dependency level.


PR information

  • The title of the PR is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For information on cleaning up the commits in your pull request, see this page.
  • Except for special cases involving multiple contributors, the PR is started from a fork of the main repository, not a branch.
  • If applicable, the PR references the bug/issue that it fixes in the description.
  • New Unit tests were added for the changes made and Travis.CI is passing.

Quality of Code and Contribution Guidelines

Copy link
Member

@hglkrijger hglkrijger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My general feedback is that an already complex section of the code has become harder to understand. I would prefer that when code is touched, it gets simpler. Let me know whether you think this is feasible.

Also some comments and questions inline.

@@ -67,6 +67,7 @@

TELEMETRY_MESSAGE_MAX_LEN = 3200

DEFAULT_EXT_TIMEOUT_MINUTES = 90
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is a per-extension timeout, won't CRP timeout well before this for multiple extensions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the timeout value per dependency level.
CRP has an independent time out value. I will start an email thread including the CRP-side owner, on when/how to update the CRP timeout value.

if dep_level != ext_handler.sort_key():
dep_level = ext_handler.sort_key()
deps_wait_until = handlers_wait_until
handlers_wait_until += datetime.timedelta(seconds=(DEFAULT_EXT_TIMEOUT_MINUTES * 60))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - doesn't timedelta have a min property?

dependencies = sum([e.dependencies for e in ext_handler.properties.extensions], [])
for dependency in dependencies:
handler_i = ExtHandlerInstance(dependency.handler, self.protocol)
if dependency.timeout is not None and dependency.timeout < DEFAULT_EXT_TIMEOUT_MINUTES:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not keep converting between seconds and minutes, it's an easy thing to miss.

wait_until -= datetime.timedelta(seconds=( (DEFAULT_EXT_TIMEOUT_MINUTES - dependency.timeout) * 60))
for ext in dependency.exts:
success_status, status = handler_i.is_ext_status_success(ext)
while not success_status:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a strange construct - a while not around an if - please clean up.

azurelinuxagent/ga/exthandlers.py Outdated Show resolved Hide resolved
def is_ext_status_success(self, ext):
status = self.collect_ext_status(ext)
if status is None:
return (True, status)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems odd, why is this True?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some extensions do not report any status. We should not wait for such extensions to report status. That's why we return True here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But by returning True here we are saying that a extension that has not reported status yet has succeeded... correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I think we can return False in this case. @uiri do you have any comment on this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

collect_ext_status returns None if the extension does not report any status. Not all extensions report status.

If you return False in this case, the agent will hang until it times out on such extensions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Base on offline discussion with @uiri , an extension can install and enable but may not create any status file. In this case, we don't want to wait for it until it is timed out. So we return True in this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBD: What should we return if there is no status file?

for substatus in status.substatusList:
if substatus.status != "success":
return (False, status)
return (True, status)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is your default, you are not really testing for success, but failure.

dependency.handler.name, ext.name,
ext_handler.name, status))
else:
time.sleep(10)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a long wait.

dep_level = ext_handler.sort_key()
deps_wait_until = handlers_wait_until
handlers_wait_until += datetime.timedelta(seconds=(DEFAULT_EXT_TIMEOUT_MINUTES * 60))
self.wait_on_ext_handler_dependencies(ext_handler, deps_wait_until)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not sure I agree with the way this is done, which is the feedback I gave @uiri previously. If the dependency chain is simple, this should be just an ordering problem. The way this is built allows for a more complex scenario (A depends on B and C), which was not my expectation. Please clarify.

@hglkrijger
Copy link
Member

@sirajudeens please address the conflicts as well.

@vrdmr, @narrieta please review.

Copy link
Member

@uiri uiri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments on 289f2fe

else:
time.sleep(10)
success_status, status = handler_i.is_ext_status_success(ext)
success_status, status = handler_i.is_ext_status_success(dependency.ext)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing None check for dependency.ext.

time.sleep(5)
success_status, status = handler_i.is_ext_status_success(dependency.ext)

# In case of timeout, we log it and continue with the rest of the extensions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the correct behaviour.

The extension should not be run if its dependency timed out. The extension runs once this function returns, unless this function throws an exception.

dependencies = sum([e.dependencies for e in ext_handler.properties.extensions], [])
for dependency in dependencies:
handler_i = ExtHandlerInstance(dependency.handler, self.protocol)
if dependency.timeout is not None and dependency.timeout < DEFAULT_EXT_TIMEOUT_MINUTES:
wait_until -= datetime.timedelta(seconds=( (DEFAULT_EXT_TIMEOUT_MINUTES - dependency.timeout) * 60))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why has this logic been removed entirely?

Further divergence from the Windows agent is not desirable.

@sirajudeens sirajudeens force-pushed the extension-sequencing branch 4 times, most recently from c3bdb97 to 7663c56 Compare August 21, 2018 22:54
uiri and others added 9 commits August 30, 2018 14:10
Move dependencyLevel parsing from plugin inside plugins to dependsOn
Use tuples of handler name and extension name to represent dependencies
An extension's dependencies must report a success status before
the agent should install and enable an extension.
The platform gives up to 90 minutes of processing for each dependency level.

The first level should have no dependencies, so any waiting should
return a status right away.

The dependencies of each subsequent level are allocated an additional 90
minutes of processing time to produce a valid status.

If the agent has been processing extensions for more than
(90 minutes) * (number of levels processed so far) then it
should timeout.
@sirajudeens sirajudeens force-pushed the extension-sequencing branch 3 times, most recently from d261306 to 5a9a81e Compare August 30, 2018 15:00
self.ext_handlers.extHandlers.sort(key=operator.methodcaller('sort_key'))
for ext_handler in self.ext_handlers.extHandlers:
# TODO: handle install in sequence, enable in parallel
# Use updated time limit for each dependency level
if dep_level != ext_handler.sort_key():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not simply move through the extHandlers sorted by dependency level, and remove all the additional wait code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We not only make sure the extensions are installed sequentially, we need to make sure that the previous extension in the sequence is installed successfully before proceeding with the next extension in the sequence. Hence this wait code with different timeout value for each level.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider something like this:

  for ext_handler in extHandlers:
    handle_ext_handler(ext_handler)
    wait_if_necessary(ext_handler)

Seems a little bit simpler than tracking full dependencies per se. We really just care if the last handler is done, if there is a stated dependency.

Regarding the timeouts, I thought we had decided on a total 90?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @hglkrijger that the for loop can be simplified to not make reference to the dependencies, but just ext_handler. Are the dependencies used only to increase the timeout 90 min per dependency level?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even now, the for loop does not make any reference to the dependency. It just references what was the previous extension installed in order to wait on that.

I think the one @hglkrijger suggests is that 'install an extension, wait for it'.This has a side effect of waiting for the highest level extensions as well, which is not required to be waited for.

Another way based on his suggestion is 'install an extension, wait for it if it is not the highest dependency level'. This requires calculating the highest dependency level in advance.
Do you think we need to change the logic to match this?

Copy link
Member

@narrieta narrieta Sep 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per offline discussion we will got for a flat 90 min (instead of 90 per dependency level)

if not success_status:
msg = "Timeout waiting for success status " \
"from dependency {0}/{1} for {2}" \
"status was: {3}".format(prev_handler.name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

status is an instance of ExtensionStatus, so format will produce something like "<...ExtensionStatus object at 0x000001BF0B6B7DA0>"

if status.status != "success":
return (False, status)
for substatus in status.substatusList:
if substatus.status != "success":
Copy link
Member

@narrieta narrieta Sep 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hglkrijger - what is the interpretation of status/substatus/status?... are extensions allowed to report "success" in status and, say, "error" in one of the status/substatus/status? is this considered an extension error?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per offline discussion @sirajudeens will double-check the validation of substatus in CRP to be sure this matches.

try:
depends_on_level = int(getattrib(depends_on_node, "dependencyLevel"))
except (ValueError, TypeError):
depends_on_level = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to catch these errors? this bumps the priority to 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We expect string value corresponding to an integer value, which is parsed to get the integer value.
In order to guard against any corrupted value, we catch this exception. The dependency level is set to default in in that case. We assume that that kind of extension does not depend on anything so it gets its priority bumped up. Do you have any better suggestion in this case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I double-checked the previous code and it was doing the same (setting the priority to 0 on error). I think we should log a warning in case bumping it up to 0 affects any dependencies and causes an extension failure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this warning is going to create a lot of noise in the logs if the platform is only sending down dependencyLevel for the cases where dependencies exist.

@sirajudeens are we sending down dependencyLevel all the time regardless or only when the customer specifies dependencies? What about for single VMs?

@@ -871,6 +917,34 @@ def collect_ext_status(self, ext):

return ext_status

def is_ext_handling_complete(self, ext):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be more useful for debugging and logging to define a __str__ method on ExtensionStatus and return the full status.


if dep_level != ext_handler.sort_key():
return True

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would cleanup this loop a little, maybe something like

first = self.ext_handlers.extHandlers[0].sort_key() if len(extHandlers) > 0 else None
for i in range(1, len(self.ext_handlers.extHandlers)):
    if self.ext_handlers.extHandlers[i].sort_key() != first:
        return True
return False

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole method seems unnecessary. The agent should not wait for the leaf nodes. In the case where there are no dependencies, every node is a leaf node.

self.ext_handlers.extHandlers.sort(key=operator.methodcaller('sort_key'))
for ext_handler in self.ext_handlers.extHandlers:
# TODO: handle install in sequence, enable in parallel
if ext_seq_required:
Copy link
Member

@narrieta narrieta Sep 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest switching the order of the calls to handle_ext_handler and wait_for_prev_handler_completion (call handle first and wait second) and just add a check to avoid waiting for the last extension. I think this would express the intention of the code a little more clearly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'd suggest something similar to

for i in range(len(self.ext_handlers.extHandlers)):
    self.handle_ext_handler(self.ext_handlers.extHandlers[i], etag)

    if ext_seq_required:
        dep_level = ext_handler.sort_key()
        if dep_level >= 0:
            if i < len(self.ext_handlers.extHandlers) - 1 and !self.wait_for_handler_successful_completion(self.ext_handlers.extHandlers[i], wait_until):
                break

plus updating the comments)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use for i in range(L) in a Python 2 or Python 2/3 codebase such as this one. It creates a list of integers unnecessarily.

For this particular case, keeping track of i is unnecessary. The second to last ext_handler may also be a leaf node so the test needs to be against the maximum dependency level, not against its position in the list. if dep_level >= 0 and dep_level < max_dep_level: ought to be a sufficient test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

# If handled successfully, proceed with the current handler.
# Otherwise, skip the rest of them.
dep_level = ext_handler.sort_key()
if dep_level >= 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a comment to explain the >=0 (we do this only for install/enable and not for uninstall/disable)


return False

def wait_for_prev_handler_completion(self, prev_handler, wait_until):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove 'prev_' from the name of this function and its parameters. The function just waits for a handler, by itself it doesn't have a concept of "previous handler"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest renaming the function to wait_for_handler_successful_completion or similar

# No need to wait on anything for the very first extension
if prev_handler == None:
return True

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if implementing the changes suggested for the handle/wait loop on line 320, then there is no need to keep track of the previous handler and this check for null is not needed

# In case of timeout or terminal error state, we log it and return false
# so that the extensions waiting on this one can be skipped processing
if not ext_completed or status != EXTENSION_STATUS_SUCCESS:
msg = "Extension {0} was timedout or status was: {1}".format(ext.name, status)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest rephrasing this message to something similar to "Extension {0} did not reach a terminal state within the allowed timeout. Last status was {1}"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disagree. If the extension completes within the allotted time but returns an error, you still wind up emitting this message. Your suggestion makes that error sound like a timeout. The right thing to do is to cleanly separate the "returned an error" condition from the "did not complete within the allotted time limit" condition, emitting a clear (and distinct) message for each of those two cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasonzio , is your concern about the error message or the functionality itself when there is a failure?
As per the last discussion with the VM agent folks, if an extension timedout/failed, we should stop processing the rest of the extensions, when the extension sequencing is required.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasonzio agree that 2 separate messages would be better

if self.wait_for_prev_handler_completion(prev_handler, wait_until):
prev_handler = ext_handler
else:
break
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a message saying we stopped processing other extensions

# Missing status file is considered a non-terminal state here
# so that extension sequencing can wait until it becomes existing
if not os.path.exists(ext_status_file):
ext_status = ExtensionStatus(seq_no=seq_no)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there no need to return an ExtensionStatus object. I think only the 'status' field is used by the code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think only the 'status' field is used by the code

It will be easier to debug if we have the entire status object and not just the fields used by the code. Also, it is impossible to tell whether it is a status or a substatus otherwise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No more substatus per @narrieta 's suggestion.

ext_status = ExtensionStatus(seq_no=seq_no)
ext_status.message = u"Failed to get status file for {0}".format(ext.name)
ext_status.code = -1
ext_status.status = "warning"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"no status file" or similar may be more appropriate

def is_ext_handling_complete(self, ext):
status = self.get_ext_handling_status(ext)

# when there is no status, the handling is complete and return None status
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to update the comment... this condition would be reached when seq < 0 (i.e. no new user settings) rather than when there is no status file

@@ -310,11 +311,63 @@ def handle_ext_handlers(self, etag=None):
logger.info("Extension handling is on hold")
return

wait_until = datetime.datetime.utcnow() + datetime.timedelta(minutes=DEFAULT_EXT_TIMEOUT_MINUTES)
prev_handler = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this variable is not referenced anymore

# We wait only for the installation. Not for the uninstallation.
# If handled successfully, proceed with the current handler.
# Otherwise, skip the rest of the extension installation.
dep_level = ext_handler.sort_key()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is done for install and enable; let's update the comment to reflect that

dep_level = ext_handler.sort_key()
if dep_level >= 0 and dep_level < max_dep_level:
if not self.wait_for_handler_successful_completion(ext_handler, wait_until):
logger.warn("Skipping processing extensions")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add the extension name and rephrase the message to read something similar to "Extension {0} failed or timed out, will skip processing the rest of the extensions"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be too much logging. We already log the extension name in the wait function. This is to just log that the rest of the extensions will be skipped.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can leave the extension name out if you prefer so. Let's be explicit about why we are skipping the rest of the extensions, though. Maybe something like "An extension failed or timed out, will skip processing the rest of the extensions". Hopefully this will be more helpful for CSS or a customer reading the log.

return False

if status != EXTENSION_STATUS_SUCCESS:
msg = "Extension {0} did not succeed. Last status was {1}".format(ext.name, status)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: "Status was" instead of "Last status was"

@sirajudeens sirajudeens force-pushed the extension-sequencing branch 2 times, most recently from 058e721 to fcffb8d Compare September 16, 2018 17:59
extension = Extension(name=handler_name)
exthandler.properties.extensions.append(extension)

# Override the timout value to minimize the test duration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "timout"

ExtHandlerInstance.get_ext_handling_status = MagicMock(return_value=None)
self.assertTrue(exthandlers_handler.wait_for_handler_successful_completion(handler, datetime.datetime.utcnow()))

def _test_wait_for_handler_successful_completion(self, exthandlers_handler):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's remove "test" from the name of this helper function. this threw me off... gave me the impression that it was a test

# Missing status file is considered a non-terminal state here
# so that extension sequencing can wait until it becomes existing
if not os.path.exists(ext_status_file):
status = "warning"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should change this status... this gives the impression that the extension is reporting a warning when in actuality there is no status file. I would suggest return something like "status file not found" or similar

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value is verified against a set of valid values. I think we should adhere to those values. Also it is clearly commented above that condition in which it can occur.

Copy link
Member

@narrieta narrieta Sep 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is that if we reach this path, then the status in this message:

"Extension {0} did not reach a terminal state within the allowed timeout. Last status was {1}".

will be "warning". When looking at the log, it'll seem like the extension actually reported a warning, when in actuality it didn't report any status. That may be confusing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to pass up the entire status, not just the status field, to give a more complete log message and add a __str__ to ExtStatus so that it logs something sensible

return exthandlers_handler.wait_for_handler_successful_completion(exthandler, wait_until)

def test_wait_for_handler_successful_completion_none(self, *args):
test_data = WireProtocolData(DATA_FILE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add a comment about what each test is verifying or, better, a more descriptive test name... for this one maybe "test_wait_for_handler_successful_completion_returns_false_when_there_is_no_status_file"

test_data = WireProtocolData(DATA_FILE)
exthandlers_handler, protocol = self._create_mock(test_data, *args)

ExtHandlerInstance.get_ext_handling_status = MagicMock(return_value=None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this test verifying? that wait_for_handler_successful_completion returns false when there is no status file? If that is the case then I would instead mock get_status_file_path to return None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When get_ext_handling_status() is mocked, it is unnecessary to mock the get_status_file_path(). It will be a no-op in that case.
The case which you are referring is covered in test_get_ext_handling_status()

ExtHandlerInstance.get_ext_handling_status = MagicMock(return_value=status)
self.assertFalse(self._test_wait_for_handler_successful_completion(exthandlers_handler))

def test_wait_for_handler_successful_completion_timeout(self, *args):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test needs to be re-written... the way it reads now it looks like it is testing that wait_for_handler_successful_completion returns false when the status is "warning" (the test name seems to imply that this should be testing when the extension times out before reporting a terminal status)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"warning" is a non-terminal state which will eventually timeout the wait function and return false. This is what is tested here. We have set the timeout value to be 5 seconds to minimize the duration.
Do you mean to rename this function? If it needs to be rewritten, can you please explain what is missing here?

exthandler.properties.extensions.append(extension)

# list of [seq_no, ext_status_file, is_status_file_exist, expected_result]
test_cases = [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add comments about what each test case is verifying? e.g. the first 2 seems to be checking the result when there are no new test settings, the 3rd one seems to be verifying the case when the extension doesn't create a status file, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is already commented in line# 765.



# Wait for the extension installation until it is handled.
# This is done for the install and enable. Not for the uninstallation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this comment is entirely accurate.

"install and enable" refer to transition steps that handle_ext_handler performs to bring an extension from the uninstalled state to the disabled/installed state and from the disabled/installed state to the enabled state. handle_ext_handler takes care of these steps. sort_key is based on the goal state for the extension ("enabled" or not "enabled" which is either "disabled" or "uninstall"). If an extension is uninstalled and the goal state is "disabled", then handle_ext_handler will install but not enable the extension.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial ask for a comment similar to this was to clarify the intention of the "dep_level >= 0" check... we wait only when the extension was installed or enabled. As it is currently phrased it seems like it is not helping so feel free to remove it or change it accordingly.

Copy link
Member

@narrieta narrieta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm; automation was successful

@narrieta
Copy link
Member

narrieta commented Oct 8, 2018

BVTs OK

@vrdmr vrdmr added this to the Release v2.2.33 milestone Oct 15, 2018
@narrieta narrieta merged commit 2422787 into Azure:master Oct 17, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants