Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

experimental client: Remove mirrors #1373

Merged
merged 10 commits into from
May 12, 2021

Conversation

jku
Copy link
Member

@jku jku commented May 4, 2021

(This is a continuation of and replaces #1368.)

Drop "mirrors" support

New Updater no longer supports downloading each metadata/target file from a list of user-provided mirrors. All downloads are performed from a single url: Default URL prefixes for both metadata and targets can be set in Updater(), and individual target downloads can override the default.

I believe this addresses the comments in #1368 in one way or another. One specific response:

_build_full_url() / try:, download(), verify() loop is repeated a few times. Might it generalise to a helper function?

I've removed a lot of cruft from these functions (including every manual file object close) but have not made a helper except for a more usable download function: I think there will be no need since the functions will be quite tight once all the verification moves to another component that handles the metadata consistency.

Another noteworthy thing is removing some url quoting/checking:

  • We definitely have to go through the fs path <-> url conversions with a fine comb but this does not seem like the place: the "relative paths" in this case come from metadata and our code so there should be no need to quote here (as long as get_one_valid_targetinfo() argument is valid)?
  • The current client has this weird interaction where get_list_of_mirrors() quotes parts of the url... and _download_file() then unquotes the whole url -- I'm not quite sure what the objective is
  • Possibly we just want to document clearly that e.g. get_one_valid_targetinfo() argument must be a valid "path-relative-url" https://url.spec.whatwg.org/#path-relative-url-string (and then test what actually works): I don't want to handle filesystem and OS specific paths in the updater in the places where it can be avoided

If this is controversial I can leave this commit out

sechkova and others added 3 commits April 29, 2021 09:28
Keep the current API and mirrors configuration but
use only the first mirror from the list for metadata
download.

Target files download remains unchanged.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Updater now uses only a single url for metadata download.
Target files download use either a default url or an
optional one for each file passed by the caller.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
@jku jku mentioned this pull request May 4, 2021
3 tasks
@jku jku marked this pull request as draft May 4, 2021 10:26
Jussi Kukkonen added 4 commits May 4, 2021 13:28
Removing mirrors means we no longer need to do file object handling
manually.

Note that this means we're now exposing the Updater caller to all kinds
of new exceptions (as NoWorkingMirrorError is no longer an excuse we can
use).

Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
The function actually hashes the target filepath.

Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
@jku jku marked this pull request as ready for review May 4, 2021 11:29
* Make sure all base urls (prefixes) end in a slash
* Add documentation to get_one_valid_targetinfo(): That is the one place
  where the API accepts ill-defined "paths" from the caller
* Remove checks from download url handling: we control both the base url
  and the relative path so there should be no surprises here.

Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
@jku
Copy link
Member Author

jku commented May 6, 2021

Note that this means we're now exposing the Updater caller to all kinds
of new exceptions (as NoWorkingMirrorError is no longer an excuse we can
use).

This should be called out as "requires future work"

Copy link
Member

@joshuagl joshuagl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As expected, this simplifies the code a lot. Nice.
I have a question inline on the way we are configuring repositories, proposing we match the semantics of the outgoing mirrors configuration by having a base url to the repository with default relative paths to targets and metadata directories. This seems to work with download_target() being able to replace the entire base url + targets directory join with an optional target_base_url – curious to hear what you think?

We have some issues filed on the fs path <-> URL conversion already (#1018 and #1077). Let's make sure this work is part of the client refactor milestone please, either as a new issue or one (or both) of those issues being moved into the milestone.

There are quite a few TODO docstrings, can we file an issue to complete those and make it part of the appropriate milestone? (Is that 'Experimental client' or 'Client Refactor'?)

All that said, nice clean up. Good work @sechkova and @jku!

tuf/client_rework/updater_rework.py Show resolved Hide resolved
tuf/client_rework/updater_rework.py Outdated Show resolved Hide resolved
tuf/client_rework/updater_rework.py Show resolved Hide resolved
tuf/client_rework/updater_rework.py Outdated Show resolved Hide resolved
tuf/client_rework/updater_rework.py Show resolved Hide resolved
@joshuagl joshuagl added the experimental-client Items related to the development of a new client (see milestone/8 and theexperimental-client branch) label May 6, 2021
Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
@joshuagl
Copy link
Member

There's some issues to file (and move into appropriate milestones), then this should be good to merge. We can address the minor items (docstrings, variable names, public vs. private API) here or file issues to tackle them before experimental-client is merged with develop.

@jku
Copy link
Member Author

jku commented May 12, 2021

I've marked issues resolved if I think they are now resolved in this PR.
I've filed #1385 for documenting the module, and updated #1312 for exceptions
All of the other TODOs in updater_rework.py will be handled by #1355.

(The CI failure is sslib master build since we're a bit behind current develop)

Copy link
Member

@joshuagl joshuagl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with a few minor comments/questions/suggestions in-line.

I'd like to see some more testing around this, the seek(0) calls especially feel like they require some reassurance that they are matching user expectations.

Comment on lines 126 to 127
metadata_url = os.path.join(url_prefix, 'metadata/')
targets_url = os.path.join(url_prefix, 'targets/')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: should this be urllib.parse.urljoin()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I'll change that

Fact is that the test needs to be rewritten, it's copied from test_updater.py and is horribly complex and full of issues like this for (I assume) Historical Reasons... but I'm trying to limit scope so not fixing anything else here

fetcher:

consistent_snapshot:
TODO
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this covered by #1385?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that was the idea: The public API will change a bit as we start using the MetadataBundle so let's document after that

Comment on lines +51 to +52
repository_name: directory name (within a local directory
defined by 'tuf.settings.repositories_directory')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This took me a couple of attempts to parse, I'm not sure my suggestion is any better though.

Suggested change
repository_name: directory name (within a local directory
defined by 'tuf.settings.repositories_directory')
repository_name: name of the directory used for storing local copies of working files
for this repository. Created as a child of 'tuf.settings.repositories_directory'.

Copy link
Member Author

@jku jku May 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better but I'm not changing it as I intend to

  • not use tuf.settings for anything at all in near future
  • use local_repo_dir as the argument directly -- there's no need for a system wide repositories_directory (at least not one managed by Updater)

this will happen after we have a MetadataBundle #1355 we can use and should be much easier to understand and document

The test has issues like this alsready but let's not add more...

Signed-off-by: Jussi Kukkonen <jkukkonen@vmware.com>
@joshuagl
Copy link
Member

Thanks for responding to feedback, LGTM for a merge to experimental-client.

@jku jku merged commit 622a54b into theupdateframework:experimental-client May 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
experimental-client Items related to the development of a new client (see milestone/8 and theexperimental-client branch)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants