experimental client: Remove mirrors #1373

jku · 2021-05-04T10:12:05Z

(This is a continuation of and replaces #1368.)

Drop "mirrors" support

The new Updater no longer supports downloading each metadata/target file from a list of user-provided mirrors. All downloads are performed from a single URL: default URL prefixes for both metadata and targets can be set in Updater(), and individual target downloads can override the default.
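To illustrate the "default prefix with per-target override" idea, here is a minimal sketch. It is not the Updater code from this PR: the class `UrlPrefixes`, its methods and the example.com URLs are all hypothetical, and it assumes both prefixes already end in a slash.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UrlPrefixes:
    """Hypothetical holder for the two default URL prefixes."""

    metadata_base_url: str  # assumed to end in "/"
    target_base_url: str  # assumed to end in "/"

    def metadata_url(self, filename: str) -> str:
        # Metadata always comes from the single metadata prefix.
        return self.metadata_base_url + filename

    def target_url(self, target_path: str, target_base_url: Optional[str] = None) -> str:
        # A caller-supplied prefix replaces the default for this one download.
        base = self.target_base_url if target_base_url is None else target_base_url
        return base + target_path


prefixes = UrlPrefixes(
    "https://example.com/repo/metadata/",
    "https://example.com/repo/targets/",
)
default_url = prefixes.target_url("files/app.tar.gz")
override_url = prefixes.target_url("files/app.tar.gz", "https://cdn.example.com/targets/")
print(default_url)
print(override_url)
```

With no mirror list to iterate over, each download resolves to exactly one URL, which is what lets the retry-per-mirror loop go away.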

I believe this addresses the comments in #1368 in one way or another. One specific response:

_build_full_url() / try:, download(), verify() loop is repeated a few times. Might it generalise to a helper function?

I've removed a lot of cruft from these functions (including every manual file object close) but have not made a helper except for a more usable download function: I think there will be no need since the functions will be quite tight once all the verification moves to another component that handles the metadata consistency.

Another noteworthy thing is removing some url quoting/checking:

* We definitely have to go through the fs path <-> url conversions with a fine-tooth comb, but this does not seem like the place: the "relative paths" in this case come from metadata and our code, so there should be no need to quote here (as long as the get_one_valid_targetinfo() argument is valid)?
* The current client has this weird interaction where get_list_of_mirrors() quotes parts of the url... and _download_file() then unquotes the whole url -- I'm not quite sure what the objective is.
* Possibly we just want to document clearly that e.g. the get_one_valid_targetinfo() argument must be a valid "path-relative-url" https://url.spec.whatwg.org/#path-relative-url-string (and then test what actually works): I don't want to handle filesystem and OS specific paths in the updater in the places where it can be avoided.
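The quote-then-unquote interaction described above can be reproduced with the standard library alone. This is only a sketch of the effect, not the actual code of get_list_of_mirrors() or _download_file(); the base URL and the path are made up for illustration.

```python
from urllib.parse import quote, unquote

# Hypothetical inputs: a base URL and a relative path containing a space.
base_url = "https://example.com/targets/"
relative_path = "dir/file name.txt"

# Step 1: quote only the relative part ("/" is left alone by default).
quoted_url = base_url + quote(relative_path)
print(quoted_url)  # https://example.com/targets/dir/file%20name.txt

# Step 2: unquote the whole URL, which undoes step 1.
round_tripped = unquote(quoted_url)
print(round_tripped)  # https://example.com/targets/dir/file name.txt
```

In this sketch the two steps cancel out, which is consistent with the question of what the quoting was meant to achieve.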

If this is controversial I can leave this commit out

Removing mirrors means we no longer need to do file object handling manually. Note that this means we're now exposing the Updater caller to all kinds of new exceptions (as NoWorkingMirrorError is no longer an excuse we can use). Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>

tuf/client_rework/updater_rework.py

* Make sure all base urls (prefixes) end in a slash
* Add documentation to get_one_valid_targetinfo(): That is the one place where the API accepts ill-defined "paths" from the caller
* Remove checks from download url handling: we control both the base url and the relative path so there should be no surprises here.

Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
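Why the trailing slash matters can be seen with urllib.parse.urljoin(), which resolves a relative path against the base's last "directory". The sketch below is an assumption about one way to normalise prefixes, not the code in this commit; `ensure_trailing_slash()` and `join_url()` are hypothetical helpers.

```python
from urllib.parse import urljoin


def ensure_trailing_slash(base_url: str) -> str:
    """Hypothetical helper: normalise a base URL (prefix) to end in "/"."""
    return base_url if base_url.endswith("/") else base_url + "/"


def join_url(base_url: str, relative_path: str) -> str:
    """Hypothetical helper: join a normalised prefix and a relative path."""
    return urljoin(ensure_trailing_slash(base_url), relative_path)


# Without the slash, urljoin() replaces the last path segment of the base.
print(urljoin("https://example.com/repo/metadata", "root.json"))
# https://example.com/repo/root.json

# With the slash enforced, the segment is kept.
print(join_url("https://example.com/repo/metadata", "root.json"))
# https://example.com/repo/metadata/root.json
```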

jku · 2021-05-06T13:18:26Z

Note that this means we're now exposing the Updater caller to all kinds
of new exceptions (as NoWorkingMirrorError is no longer an excuse we can
use).

This should be called out as "requires future work"

joshuagl

As expected, this simplifies the code a lot. Nice.
I have a question inline on the way we are configuring repositories: I propose we match the semantics of the outgoing mirrors configuration by having a base url to the repository, with default relative paths to the targets and metadata directories. This seems to work with download_target() being able to replace the entire base url + targets directory join with an optional target_base_url. Curious to hear what you think?

We have some issues filed on the fs path <-> URL conversion already (#1018 and #1077). Let's make sure this work is part of the client refactor milestone please, either as a new issue or one (or both) of those issues being moved into the milestone.

There are quite a few TODO docstrings, can we file an issue to complete those and make it part of the appropriate milestone? (Is that 'Experimental client' or 'Client Refactor'?)

All that said, nice clean up. Good work @sechkova and @jku!

tuf/client_rework/updater_rework.py

joshuagl · 2021-05-10T22:08:42Z

There are some issues to file (and move into appropriate milestones), then this should be good to merge. We can address the minor items (docstrings, variable names, public vs. private API) here or file issues to tackle them before experimental-client is merged with develop.

jku · 2021-05-12T07:00:48Z

I've marked issues resolved if I think they are now resolved in this PR.
I've filed #1385 for documenting the module, and updated #1312 for exceptions.
All of the other TODOs in updater_rework.py will be handled by #1355.

(The CI failure is sslib master build since we're a bit behind current develop)

joshuagl

LGTM, with a few minor comments/questions/suggestions in-line.

I'd like to see some more testing around this; the seek(0) calls especially feel like they need some reassurance that they match user expectations.
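For context, the seek(0) concern is that a file object handed back after a length check has its position at the end, so a caller that reads it straight away gets nothing. A minimal sketch of the pattern and of the kind of test being asked for follows; `store_and_check_length()` is a hypothetical function, not the download code in updater_rework.py.

```python
import tempfile
from typing import IO


def store_and_check_length(data: bytes, expected_length: int) -> IO[bytes]:
    """Hypothetical stand-in for a download step that verifies length."""
    temp_file = tempfile.TemporaryFile()
    temp_file.write(data)
    # After writing, the file position is at the end, so tell() is the length.
    if temp_file.tell() != expected_length:
        temp_file.close()
        raise ValueError("downloaded length does not match expected length")
    # Rewind so the caller reads from the first byte.
    temp_file.seek(0)
    return temp_file


with store_and_check_length(b"hello", 5) as checked_file:
    content = checked_file.read()
print(content)
```

A test along these lines would assert that the first read() after the check returns the full content rather than an empty bytes object.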

joshuagl · 2021-05-12T09:05:19Z

tests/test_updater_rework.py

+    metadata_url = os.path.join(url_prefix, 'metadata/')
+    targets_url = os.path.join(url_prefix, 'targets/')


Nit: should this be urllib.parse.urljoin()?

Yeah, I'll change that.

Fact is that the test needs to be rewritten: it's copied from test_updater.py and is horribly complex and full of issues like this for (I assume) Historical Reasons... but I'm trying to limit scope, so I'm not fixing anything else here.
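The reason os.path.join() is a poor fit for URLs is that it uses the path separator of the host OS. The sketch below uses ntpath and posixpath (the Windows and POSIX implementations behind os.path) so the difference shows on any platform; the `url_prefix` value is a made-up example, not the one used in the test.

```python
import ntpath
import posixpath
from urllib.parse import urljoin

# Hypothetical prefix, ending in a slash so urljoin() keeps it whole.
url_prefix = "http://localhost:8001/"

# os.path.join() is ntpath.join() on Windows and posixpath.join() elsewhere.
posix_result = posixpath.join("http://localhost:8001", "metadata/")
windows_result = ntpath.join("http://localhost:8001", "metadata/")
url_result = urljoin(url_prefix, "metadata/")

print(posix_result)    # looks right only because "/" is the POSIX separator
print(windows_result)  # contains a backslash, which is not a valid URL path separator
print(url_result)      # platform independent
```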

joshuagl · 2021-05-12T09:06:45Z

tuf/client_rework/updater_rework.py

-        fetcher:
-
-        consistent_snapshot:
+    TODO


Is this covered by #1385?

Yeah, that was the idea: the public API will change a bit as we start using the MetadataBundle, so let's document after that.

joshuagl · 2021-05-12T09:09:54Z

tuf/client_rework/updater_rework.py

+            repository_name: directory name (within a local directory
+                defined by 'tuf.settings.repositories_directory')


This took me a couple of attempts to parse, I'm not sure my suggestion is any better though.

Suggested change

-            repository_name: directory name (within a local directory
-                defined by 'tuf.settings.repositories_directory')
+            repository_name: name of the directory used for storing local copies of working files
+                for this repository. Created as a child of 'tuf.settings.repositories_directory'.

This is better but I'm not changing it, as I intend to:

* not use tuf.settings for anything at all in the near future
* use local_repo_dir as the argument directly -- there's no need for a system-wide repositories_directory (at least not one managed by Updater)

This will happen after we have a MetadataBundle (#1355) we can use, and it should be much easier to understand and document.

The test has issues like this already but let's not add more... Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>

joshuagl · 2021-05-12T13:26:59Z

Thanks for responding to feedback, LGTM for a merge to experimental-client.

sechkova and others added 3 commits April 29, 2021 09:28

Download metadata from a single mirror

80ff532

Keep the current API and mirrors configuration but use only the first mirror from the list for metadata download. Target files download remains unchanged. Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>

Drop mirrors support

576d055

Updater now uses only a single url for metadata download. Target files download use either a default url or an optional one for each file passed by the caller. Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>

new updater: Seek to beginning of file after length check

45259cf

Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>

jku mentioned this pull request May 4, 2021

client refactor: remove mirrors #1368

Closed

3 tasks

jku marked this pull request as draft May 4, 2021 10:26

Jussi Kukkonen added 4 commits May 4, 2021 13:28

new updater: Rename _get_target_hash()

9fae500

The function actually hashes the target filepath. Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>

new updater: Remove misleading comment

888f022

Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>

Rename url prefixes so they are consistent

3a02583

Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>

jku force-pushed the remove-mirrors branch from 50aee6c to 0e0c77c Compare May 4, 2021 10:34

jku marked this pull request as ready for review May 4, 2021 11:29

jku commented May 5, 2021

tuf/client_rework/updater_rework.py (outdated)

jku force-pushed the remove-mirrors branch from 0e0c77c to ab210b4 Compare May 5, 2021 13:35

joshuagl reviewed May 6, 2021

joshuagl added the experimental-client label (Items related to the development of a new client; see milestone/8 and the experimental-client branch) May 6, 2021

new updater: Improve docstrings

9605e19

Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>

joshuagl approved these changes May 12, 2021

tests: Don't use os.path.join() for URLS

ec4c5ce

The test has issues like this already but let's not add more... Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>

jku merged commit 622a54b into theupdateframework:experimental-client May 12, 2021

This was referenced May 18, 2021

experimental client: download functions review #1339

Closed

Split Updater into logical components: redesign mirrors.py and download.py #1307

Closed
