Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge ExtensionsGoalState into GoalState #2490

Merged
merged 15 commits into from
Feb 7, 2022

Conversation

narrieta
Copy link
Member

@narrieta narrieta commented Feb 2, 2022

This change makes the ExtensionsGoalState a property of GoalState, instead of a separate object.

I also changed the way we save the history to account for the fact that the Fabric and Fast Track goal states are updated independently from each other.

I also did some refactoring on the protocol classes to start separating the current goal state from the protocol. The changes in this PR prepare the code for more refactoring that I will do on separate PRs.

@@ -184,12 +184,6 @@ class ProtocolNotFoundError(ProtocolError):
"""


class IncompleteGoalStateError(ProtocolError):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not see the need to have a separate exception for this error so I changed the code to use ProtocolError.

This requires a change in DCR, which looks for IncompleteGoalStateError. I am preparing that change and will submit the PR on the tux repo.

@codecov
Copy link

codecov bot commented Feb 2, 2022

Codecov Report

Merging #2490 (d1c8949) into fast-track (951908d) will decrease coverage by 0.03%.
The diff coverage is 92.85%.

Impacted file tree graph

@@              Coverage Diff               @@
##           fast-track    #2490      +/-   ##
==============================================
- Coverage       71.81%   71.78%   -0.04%     
==============================================
  Files             101      101              
  Lines           14990    14987       -3     
  Branches         2389     2375      -14     
==============================================
- Hits            10765    10758       -7     
- Misses           3740     3748       +8     
+ Partials          485      481       -4     
Impacted Files Coverage Δ
azurelinuxagent/common/exception.py 99.00% <ø> (-0.01%) ⬇️
azurelinuxagent/common/logcollector.py 88.54% <ø> (ø)
azurelinuxagent/common/logcollector_manifests.py 100.00% <ø> (ø)
...protocol/extensions_goal_state_from_vm_settings.py 79.60% <ø> (-0.41%) ⬇️
azurelinuxagent/common/utils/archive.py 83.84% <78.43%> (-0.51%) ⬇️
azurelinuxagent/common/protocol/wire.py 78.82% <90.47%> (-0.34%) ⬇️
azurelinuxagent/common/protocol/goal_state.py 95.27% <97.03%> (+1.79%) ⬆️
azurelinuxagent/common/protocol/hostplugin.py 88.78% <100.00%> (-0.07%) ⬇️
azurelinuxagent/ga/exthandlers.py 85.90% <100.00%> (ø)
...inuxagent/common/protocol/extensions_goal_state.py 72.34% <0.00%> (-10.64%) ⬇️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 951908d...d1c8949. Read the comment docs.

request to the HostGAPlugin for the vmSettings, which determines the goal state for extensions when using the Fast Track pipeline.

To reduce the number of requests, when possible, create a single instance of GoalState and use the update() method to keep it up to date.
"""
Copy link
Member Author

@narrieta narrieta Feb 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now instantiating GoalState fetches the entire goal state (minus the agent/extension manifests). Looking at the usages of the goal state, there is no need to separate methods to request the goal state and then the extensions config. etc. The fetch_full_goal_state() is now gone.

The only usage for invoking only the goalstate API (and not fetch the rest of the goal state) was to update the headers used by the hostgaplugin. I created a separate method for that (update_host_plugin_headers)

if vm_settings is not None:
self._extensions_goal_state = vm_settings

def save_to_history(self, data, file_name):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a temporary method while I do more refactoring on the methods to fetch the manifests


return xml_text, xml_doc

def _fetch_vm_settings(self, force_update=False):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extensions goal state state is now a property of GoalState so we fetch it here

def save_to_history(self, data, file_name):
self._history.save(data, file_name)

def _initialize_basic_properties(self, xml_doc):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There weren't any changes on how we parse the goal state, so the code in this method and in _fetch_extended_goal_state was just moved here from elsewhere

self.role_config_name = role_config_name
self.container_id = None
self.deployment_id = None
self.role_config_name = None
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now these properties need to be explicitly initialized by calling GoalState.update_host_plugin_headers().

This is just a temporary change; I need to do some refactoring in the initialization of GoalState/WireProtocol/HostGAPluginProtocol/Telemetry and will improve on this.

def format_message(msg):
return "GET vmSettings [correlation ID: {0} eTag: {1}]: {2}".format(correlation_id, etag, msg)

def get_vm_settings():
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function was part of the retry logic in a previous iteration, but now the retry is done by the caller so I unfolded this function into the containing code

REMOTE_ACCESS_FILE_NAME = "RemoteAccess.{0}.xml"
EXT_CONF_FILE_NAME = "ExtensionsConfig.{0}.xml"
MANIFEST_FILE_NAME = "{0}.{1}.manifest.xml"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the history is maintained by GoalState, instead of the WireClient.


def update_goal_state(self, force_update=False):
"""
Updates the goal state if the incarnation or etag changed or if 'force_update' is True
"""
try:
#
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating the goal state, retrieving the vmSettings and keeping the goal state history is now done by GoalState

def get_ext_manifest(self, ext_handler):
if self._goal_state is None:
raise ProtocolError("Trying to fetch Extension Manifest before initialization!")

try:
xml_text = self.fetch_manifest(ext_handler.manifest_uris)
self._save_cache(xml_text, MANIFEST_FILE_NAME.format(ext_handler.name, self.get_goal_state().incarnation))
self._goal_state.save_to_history(xml_text, _MANIFEST_FILE_NAME.format(ext_handler.name))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the get/fetch*manifests functions are still in WireClient; I'll move them into GoalState in a separate refacting round

# 2018-04-06T08:21:37.142697_incarnation_N
# 2018-04-06T08:21:37.142697_incarnation_N.zip
_ARCHIVE_PATTERNS_DIRECTORY = re.compile(r"^\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}\.\d+(_incarnation_(\d+))?$$")
_ARCHIVE_PATTERNS_ZIP = re.compile(r"^\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}\.\d+(_incarnation_(\d+))?\.zip$")


Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now GoalState saves the goal state directly to the history folder, so there is not need to flush it

try:
logger.info('Fetching goal state [incarnation {0}]', self._incarnation)

self._history.save_goal_state(xml_text)
Copy link
Member Author

@narrieta narrieta Feb 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now we save the goal state to the history folder as we fetch it, instead of saving to /var/lib/waagent and then flushing it to the history folder when the incarnation changes

this requires an update in DCR, where the tests for the status blob is looking for ExtensionsConfig.*.xml in the agent's dir. I'll post the corresponding PR in the tux repo.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sample history directory:

Length Name
------ ----
       2022-02-03T00_43_06.228674_1    <=== WS goal state
       2022-02-03T00_43_07.544229     <=== status
       2022-02-03T00_44_13.783542_1310808759485798297   <=== HostGAPlugin goal state
       2022-02-03T00_44_13.783997_2
       2022-02-03T00_44_18.097053
       2022-02-03T00_44_48.209362_5216182003277442798
       2022-02-03T00_44_48.209790_3
       2022-02-03T00_44_50.330195
       2022-02-03T00_45_20.442170_4495420063731861706
       2022-02-03T00_45_20.442600_4
       2022-02-03T00_45_24.596102
       2022-02-03T00_45_48.671177_14329981429087073333
       2022-02-03T00_45_48.671655_5
       2022-02-03T00_45_52.837152
       2022-02-03T00_46_22.944478_9343934122256085228
       2022-02-03T00_46_22.944892_6
       2022-02-03T00_46_27.116754
  2993 2022-02-03T00_43_04.659319_1.zip
   785 2022-02-03T00_43_04.724605_12843543198617625598.zip
  2993 2022-02-03T00_43_06.228674_1.zip
   785 2022-02-03T00_43_06.235001_12843543198617625598.zip

@@ -59,7 +59,7 @@
{
"handlerSettings": {
"protectedSettingsCertThumbprint": "4C4F304667711036E64AF4894B76EB208A863BD4",
"protectedSettings": "MIIBsAYJKoZIhvcNAQcDoIIBoTCCAZ0CAQAxggFpMIIBZQIBADBNMDkxNzA1BgoJkiaJk/IsZAEZFidXaW5kb3dzIEF6dXJlIENSUCBDZXJ0aWZpY2F0ZSBHZW5lcmF0b3ICEFpB/HKM/7evRk+DBz754wUwDQYJKoZIhvcNAQEBBQAEggEADPJwniDeIUXzxNrZCloitFdscQ59Bz1dj9DLBREAiM8jmxM0LLicTJDUv272Qm/4ZQgdqpFYBFjGab/9MX+Ih2x47FkVY1woBkckMaC/QOFv84gbboeQCmJYZC/rZJdh8rCMS+CEPq3uH1PVrvtSdZ9uxnaJ+E4exTPPviIiLIPtqWafNlzdbBt8HZjYaVw+SSe+CGzD2pAQeNttq3Rt/6NjCzrjG8ufKwvRoqnrInMs4x6nnN5/xvobKIBSv4/726usfk8Ug+9Q6Benvfpmre2+1M5PnGTfq78cO3o6mI3cPoBUjp5M0iJjAMGeMt81tyHkimZrEZm6pLa4NQMOEjArBgkqhkiG9w0BBwEwFAYIKoZIhvcNAwcECC5nVaiJaWt+gAhgeYvxUOYHXw==",
"protectedSettings": "MIIBsAYJKoZIhvcNAQcDoIIBoTCCAZ0CAQAxggFpMIIBZQIBADBNMDkxNzA1BgoJkiaJk/Microsoft.Azure.Monitor.AzureMonitorLinuxAgent==",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding the extension name and reducing the length makes debugging easier

@@ -135,7 +135,7 @@
{
"handlerSettings": {
"protectedSettingsCertThumbprint": "59A10F50FFE2A0408D3F03FE336C8FD5716CF25C",
"protectedSettings": "*** REDACTED ***"
"protectedSettings": "MIIBsAYJKoZIhvcNAQcDoIIBoTCCAZ0CAQAxggFpddesZQewdDBgegkxNzA1BgoJkgergres/Microsoft.OSTCExtensions.VMAccessForLinux=="
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not have been "*** REDACTED ***" in the test data

</PluginSettings>
<StatusUploadBlob statusBlobType="BlockBlob">http://foo</StatusUploadBlob>
</Extensions>

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests for the status blob type were reaching too much into internals and were a pain to maintain each time I change code in the goal state stuff so finally I got tired of them and change them to use the mock protocol. These are the test data files for the mock

@@ -3425,49 +3426,6 @@ def mock_http_put(url, *args, **_):

self.assertEqual(expected_status, actual_status_json)

def test_it_should_zip_waagent_status_when_incarnation_changes(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we save directly to the history folder so this behavior does not exist anymore

@@ -25,24 +25,3 @@ def test_create_from_vm_settings_should_assume_block_when_blob_type_is_not_valid
extensions_goal_state = ExtensionsGoalStateFactory.create_from_vm_settings(1234567890, load_data("hostgaplugin/vm_settings-invalid_blob_type.json"))
self.assertEqual("BlockBlob", extensions_goal_state.status_upload_blob_type, 'Expected BlockBob for an invalid statusBlobType')

def test_extension_goal_state_should_parse_requested_version_properly(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests have also been giving me pain. I moved them to tests for parsing extensionsConfig/vmSettings

self.assertEqual(AgentGlobals.GUID_ZERO, extensions_goal_state.activity_id, "Incorrect activity Id")
self.assertEqual(AgentGlobals.GUID_ZERO, extensions_goal_state.correlation_id, "Incorrect correlation Id")
self.assertEqual('1900-01-01T00:00:00.000000Z', extensions_goal_state.created_on_timestamp, "Incorrect GS Creation time")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved from test_extensions_goal_state.py

@@ -47,6 +48,20 @@ def assert_property(name, value):
# dependency level (multi-config)
self.assertEqual(1, vm_settings.extensions[3].settings[1].dependencyLevel, "Incorrect dependency level (multi-config)")

def test_extension_goal_state_should_parse_requested_version_properly(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved from test_extensions_goal_state.py

@@ -53,41 +53,6 @@ def _parse_archive_name(name):
incarnation_no_ext = os.path.splitext(incarnation_ext)[0]
return timestamp_str, incarnation_no_ext

def test_archive00(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The StateFlush is now gone, the test is no longer needed

@@ -1112,107 +1097,6 @@ def test_forced_update_should_update_the_goal_state_and_the_host_plugin_when_the
self.assertEqual(protocol.client.get_host_plugin().container_id, new_container_id)
self.assertEqual(protocol.client.get_host_plugin().role_config_name, new_role_config_name)

def test_update_goal_state_should_archive_last_goal_state(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these 2 tests were moved to test_fetch_full_goal_state_should_save_goal_state_to_history_directory (test_extensions_goal_state.py)

self.assertEqual(_NUM_GS_FETCH_RETRIES, mock_sleep.call_count, "Unexpected number of retries")
self.assertEqual(_GET_GOAL_STATE_MAX_ATTEMPTS, mock_sleep.call_count, "Unexpected number of retries")

def test_fetch_full_goal_state_should_save_goal_state_to_history_directory(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This replaces the former test_update_goal_state_should_archive_last_goal_state and test_update_goal_state_should_not_persist_the_protected_settings that used to be in test_wire.py

fileutil.write_file(status_path, json_text)

if incarnation_changed:
GoalStateHistory(datetime.datetime.utcnow().isoformat()).save_status(json_text)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The status is saved in a different directory with no suffix

Copy link
Member Author

@narrieta narrieta Feb 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually it'd be more useful to, on incarnation change, preserve the previous status file instead of overwriting it (i.e. keep a history of the last status we reported for the previous goal state, rather than the first status of the new goal state).

will submit a change for this

@@ -38,110 +38,39 @@

ARCHIVE_DIRECTORY_NAME = 'history'

_MAX_ARCHIVED_STATES = 50
_MAX_ARCHIVED_STATES = 100
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multiplying by 2, since now we also have an archive for the agent status per goal state

from test_agent_basics import check_agent_processes, check_sudoers

if __name__ == '__main__':
admin_username = get_vm_data_from_env().admin_username
tests = [
TestFuncObj("check agent processes", check_agent_processes),
TestFuncObj("check agent log", check_waagent_log_for_errors),
TestFuncObj("Verify status blob", lambda: show_blob_content('Status', 'StatusUploadBlob')),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are just parsing /var/lib/waagent/ExtensionsConfig.*.xml and dumping the status and artifacts profile blob. apparently for debugging purposes.

That file does not exist anymore, but the equivalent is in the history folder, which is already captured by automation.

Once we enable Fast Track this "test" would need significant changes to make it aware of vmsettings. Since we already have the debug info in the history, it is not worth to update it.

@narrieta narrieta assigned nagworld9 and unassigned larohra Feb 3, 2022
if status_blob_text is None:
status_blob_text = ""

debug_info = ExtHandlersHandler._get_status_debug_info(vm_status)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while working on the history i noticed that we were modifying the status we report to crp before saving it to the status file.

i changed this to preserve the status exactly as we save it to the blob

def _fetch_extended_goal_state(self, xml_text, xml_doc, vm_settings):
"""
Issues HTTP requests (WireServer) for each of the URIs in the goal state (ExtensionsConfig, Certificate, Remote Access users, etc)
and populates the corresponding properties. If the give 'vm_settings' are not None they are used for the extensions goal state,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and populates the corresponding properties. If the give 'vm_settings' are not None they are used for the extensions goal state,

Suggested change
and populates the corresponding properties. If the give 'vm_settings' are not None they are used for the extensions goal state,
and populates the corresponding properties. If the given 'vm_settings' are not None they are used for the extensions goal state,

extensions_config = ExtensionsGoalStateFactory.create_from_extensions_config(self._incarnation, xml_text, self._wire_client)
self._history.save_extensions_config(extensions_config.get_redacted_text())

if vm_settings is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't you hit the 'not defined' error

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I don't get your comment

vm_settings is a parameter... did you mean this variable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NM, I missed to look. I thought it's not defined in the function but it's function parameter.

@@ -123,7 +113,7 @@ def get_incarnation(self):

def get_vmagent_manifests(self):
goal_state = self.client.get_goal_state()
ext_conf = self.client.get_extensions_goal_state()
ext_conf = self.client.get_goal_state().extensions_goal_state
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we use goal_state object?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks; probably a global search-and-replace; will update


# The goal state for extensions can come from vmSettings when using FastTrack or from extensionsConfig otherwise, self._fetch_extended_goal_state
# populates the '_extensions' property.
self._extensions = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is this being used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good catch! This was meant to be '_extensions_goal_state', but I did not remove the old variable ('_extensions')

@narrieta narrieta merged commit ff8b46e into Azure:fast-track Feb 7, 2022
@narrieta narrieta deleted the goal-state branch February 7, 2022 21:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants