Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reorganize the history directory #2520

Merged
merged 6 commits into from
Mar 1, 2022
Merged

Conversation

narrieta
Copy link
Member

@narrieta narrieta commented Feb 25, 2022

The changes I did to the history directory did not capture the goal state the way I intended to. I wanted to capture all the HTTP requests for needed to process a single goal state within the same directory. Some goal states were producing more than 1 directory, plus I was capturing the waagent_status.json in a separate directory. All of that made the history hard to read.

This PR addresses those issues. There is a single directory per goal state, be it Fabric or FastTrack, and the last status for a given goal state is captured in the directory for that goal state.

This is a sample directory, and the contents of 2 sample subdirectories (the "_*" suffix indicates the incarnation for a Fabric goal state, or the eTag for a FastTrack goal state.

root@edp90bdcp2:/var/lib/waagent# ls -l history/
total 128
-rw-r--r-- 1 root root 3777 Feb 25 05:21 2022-02-25T05:21:51.351437_1.zip
-rw-r--r-- 1 root root 4214 Feb 25 05:22 2022-02-25T05:21:52.941740_1.zip
-rw-r--r-- 1 root root 5671 Feb 25 05:23 2022-02-25T05:22:48.440602_2.zip
-rw-r--r-- 1 root root 5859 Feb 25 05:24 2022-02-25T05:23:22.879699_3.zip
-rw-r--r-- 1 root root 5676 Feb 25 05:24 2022-02-25T05:24:01.136465_4.zip
-rw-r--r-- 1 root root 5953 Feb 25 05:25 2022-02-25T05:24:35.446556_5.zip
-rw-r--r-- 1 root root 2605 Feb 25 05:25 2022-02-25T05:25:03.847029_15811112770847678291.zip
-rw-r--r-- 1 root root 2527 Feb 25 05:26 2022-02-25T05:25:44.087642_18000139041792514682.zip
-rw-r--r-- 1 root root 2407 Feb 25 05:26 2022-02-25T05:26:18.309032_4750493307514613305.zip
-rw-r--r-- 1 root root 2553 Feb 25 05:27 2022-02-25T05:26:52.771520_3125115514822901175.zip
-rw-r--r-- 1 root root 2644 Feb 25 05:28 2022-02-25T05:27:27.017290_11215055166409093376.zip
-rw-r--r-- 1 root root 3519 Feb 25 05:28 2022-02-25T05:28:03.292724_16004112586371732150.zip
-rw-r--r-- 1 root root 3628 Feb 25 05:29 2022-02-25T05:28:37.963574_4568237672134507183.zip
-rw-r--r-- 1 root root 3660 Feb 25 05:30 2022-02-25T05:29:16.265539_7975497350458587842.zip
-rw-r--r-- 1 root root 3715 Feb 25 05:30 2022-02-25T05:29:50.576691_2740615665526774226.zip
-rw-r--r-- 1 root root 3730 Feb 25 05:31 2022-02-25T05:30:24.879073_3193126257072759267.zip
-rw-r--r-- 1 root root 3598 Feb 25 05:31 2022-02-25T05:30:59.204375_6507518737698568822.zip
-rw-r--r-- 1 root root 2709 Feb 25 05:32 2022-02-25T05:31:27.550604_13599842578989748385.zip
-rw-r--r-- 1 root root 2629 Feb 25 05:32 2022-02-25T05:31:57.770524_14353420009637935747.zip
-rw-r--r-- 1 root root 2523 Feb 25 05:33 2022-02-25T05:32:31.989915_558826791117675680.zip
-rw-r--r-- 1 root root 2249 Feb 25 05:33 2022-02-25T05:33:00.218760_2856318285684983312.zip
-rw-r--r-- 1 root root 2555 Feb 25 05:34 2022-02-25T05:33:34.758289_2370205063594795044.zip
-rw-r--r-- 1 root root 2619 Feb 25 05:34 2022-02-25T05:34:15.040888_6966483418219710538.zip
-rw-r--r-- 1 root root 2522 Feb 25 05:35 2022-02-25T05:34:43.235101_8780014925603559582.zip
-rw-r--r-- 1 root root 2371 Feb 25 05:35 2022-02-25T05:35:17.449951_16011384681609892725.zip
-rw-r--r-- 1 root root 2502 Feb 25 05:57 2022-02-25T05:35:51.971207_9901895646845909273.zip
drwx------ 2 root root 4096 Feb 25 05:57 2022-02-25T05:57:13.108043_7729999217724852232

root@edp90bdcp2:/var/lib/waagent# unzip -l history/2022-02-25T05:24:35.446556_5.zip
Archive:  history/2022-02-25T05:24:35.446556_5.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     3044  2022-02-25 05:24   ExtensionsConfig.xml
     1721  2022-02-25 05:24   HostingEnvironmentConfig.xml
     2299  2022-02-25 05:24   VmSettings.json
    13020  2022-02-25 05:24   Microsoft.OSTCExtensions.VMAccessForLinux.manifest.xml
     2099  2022-02-25 05:24   GoalState.xml
      899  2022-02-25 05:24   SharedConfig.xml
     1020  2022-02-25 05:24   waagent_status.json
---------                     -------
    24102                     7 files

root@edp90bdcp2:/var/lib/waagent# unzip -l history/2022-02-25T05:25:03.847029_15811112770847678291.zip
Archive:  history/2022-02-25T05:25:03.847029_15811112770847678291.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     2173  2022-02-25 05:25   VmSettings.json
    13020  2022-02-25 05:25   Microsoft.OSTCExtensions.VMAccessForLinux.manifest.xml
     1632  2022-02-25 05:24   waagent_status.json
---------                     -------
    16825                     3 files

@@ -38,7 +38,7 @@

ARCHIVE_DIRECTORY_NAME = 'history'

_MAX_ARCHIVED_STATES = 100
_MAX_ARCHIVED_STATES = 50
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously I increased this value from 50 to 100 because I was saving waagent_status.json to a separate directory. Since now I am saving it to the same directory as the goal state, I am reverting the value to the original 50

@codecov
Copy link

codecov bot commented Feb 25, 2022

Codecov Report

Merging #2520 (68275b9) into develop (bb9a337) will decrease coverage by 0.03%.
The diff coverage is 90.47%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2520      +/-   ##
===========================================
- Coverage    71.94%   71.91%   -0.04%     
===========================================
  Files          101      102       +1     
  Lines        15147    15135      -12     
  Branches      2401     2400       -1     
===========================================
- Hits         10898    10884      -14     
+ Misses        3771     3769       -2     
- Partials       478      482       +4     
Impacted Files Coverage Δ
azurelinuxagent/common/utils/archive.py 87.96% <75.00%> (-0.77%) ⬇️
azurelinuxagent/ga/update.py 88.39% <76.47%> (-0.16%) ⬇️
azurelinuxagent/ga/exthandlers.py 85.71% <88.88%> (-0.20%) ⬇️
...protocol/extensions_goal_state_from_vm_settings.py 80.91% <100.00%> (+0.22%) ⬆️
azurelinuxagent/common/protocol/goal_state.py 96.78% <100.00%> (+1.42%) ⬆️
azurelinuxagent/common/utils/timeutil.py 100.00% <100.00%> (ø)
azurelinuxagent/ga/collect_telemetry_events.py 90.41% <0.00%> (-1.03%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bb9a337...68275b9. Read the comment docs.

timestamp = create_timestamp()
xml_text, xml_doc, incarnation = GoalState._fetch_goal_state(self._wire_client)
self._history = GoalStateHistory(timestamp, incarnation)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main change here is intantiating the history during init; the rest of the changes are just moving the code around to ensure all properties are defined before invoking any method

raise
except ResourceGoneError:
# retry after refreshing the HostGAPlugin
GoalState.update_host_plugin_headers(self._wire_client)
vm_settings, vm_settings_updated = self._wire_client.get_host_plugin().fetch_vm_settings(force_update=force_update)

if vm_settings_updated:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now the vmSettings are added to the history by the caller

}}
'''.format(status_blob_text, debug_info)

fileutil.write_file(status_file, status_file_text)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now the file is created by the caller, and this code is exposed as get_ext_handlers_status_debug_info

# else just ensure the extensions are using the latest vm_settings
if vm_settings is not None:
# else ensure the extensions are using the latest vm_settings
timestamp = create_timestamp()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need different timestamp? can't we use L#135

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L#136 calls _fetch_goal_state, which does a network call, which can have delays. we need a new timestamp here


vm_settings = self._fetch_vm_settings(force_update=force_update)
timestamp = create_timestamp()
xml_text, xml_doc, incarnation = GoalState._fetch_goal_state(self._wire_client)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where/when is this full goal state saved to history folder?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's be saved as part of _fetch_extended_goal_state (called below @line 142) or after retrieving the vmsettings (below @line 150)

larohra
larohra previously approved these changes Mar 1, 2022
Copy link
Contributor

@larohra larohra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some very minor comments, else LGTM


def _fetch_extended_goal_state(self, xml_text, xml_doc, vm_settings):
def _fetch_extended_goal_state(self, xml_text, xml_doc, force_vm_settings_update=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: maybe rename to fetch_complete_goal_state? Extended seemed a bit confusing to me personally

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in init, I split the member variables in 2 blocks, which I called "basic" and "extended", then initialize_basic_properties initializes the former and fetch_extended_goal_state the latter (I did not name fetch_extended_goal_state "initialize_extended_properties" because there is a fetch operation in it and I want that to be explicit)

@@ -175,16 +170,17 @@ def _fetch_goal_state(wire_client):

# In some environments a few goal state requests return a missing RoleInstance; these retries are used to work around that issue
# TODO: Consider retrying on 410 (ResourceGone) as well
incarnation = "unknown"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this ever be "unknown"? It should always get updated right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the linter believes it can be used before initialization at line #184. that would happen, i believe, only if _GET_GOAL_STATE_MAX_ATTEMPTS is 0, but I decided to go ahead an initialize it anyways

except Exception as e:
if not self._errors: # report only 1 error per directory
self._errors = True
logger.warn("Failed to save goal state file {0}: {1} [no additional errors saving the goal state will be reported]".format(target_name, e))
logger.warn("Failed to save goal state file {0}: {1} [no additional errors saving the goal state will be reported]".format(file_name, e))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this error message needs updating? Failed to save {file_name} for incarnation {id} [no additional errors saving the goal state will be reported]: {error}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see... by "goal state file" i did not mean GoalState.xml, but a file related to the goal state... i'll update this to "Failed to save {0} to the goal state history: {1}" (the incarnation or etag are already logged 1 or 2 lines above this message)


return json.dumps(debug_info)
return '''{{
"__comment__": "The __status__ property is the actual status reported to CRP",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: Indent missing for better visuals

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not want extra indentation in the actual file, so I added only 4 leading spaces

self._report_extensions_summary(vm_status)
if self._goal_state is not None:
agent_status = exthandlers_handler.get_ext_handlers_status_debug_info(vm_status)
self._goal_state.save_to_history(agent_status, AGENT_STATUS_FILE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe create an abstraction for this too? i.e. self._goal_state.save_reported_status() or something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently i'm using save_to_history to save the manifests as well, i'm planning on getting rid of it in another round of refactoring

@narrieta narrieta merged commit 6445aaa into Azure:develop Mar 1, 2022
@narrieta narrieta deleted the fast-track branch March 1, 2022 19:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants