More explicit logic for CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS #394

Merged
merged 6 commits into from
Dec 5, 2023

Conversation

jaimergp
Contributor

@jaimergp jaimergp commented Nov 21, 2023

Description

Inspired by #391. Refactoring a bit before we delve into the question of "how many attempts should we try before giving up?".

cc @mbargull

Checklist - did you ...

  • Add a file to the news directory (using the template) for the next release's release notes?
  • Add / update necessary tests?
  • Add / update outdated documentation?

@conda-bot conda-bot added the cla-signed [bot] added once the contributor has signed the CLA label Nov 21, 2023
return max_attempts_from_env
if in_state.update_modifier.FREEZE_INSTALLED:
# this is the default, but can be overridden with --update-specs
# TODO: should we cap this at a reasonable number? some base envs have 100s of pkgs
Contributor Author

This is the main point of this PR :) Thoughts?

Member

I haven't messed with libmamba solver at all, so absolutely no idea/feeling about what ranges would be sensible.

If you don't want a hard limit, you could use some function that just increases slower for higher inputs.
If you want a hard limit you could either do min(n_installed, limit) or something that only approaches the limit gradually like math.ceil(n_installed / (1 + (n_installed / limit)**2)**.5) or something more/less sophisticated.
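
As a rough illustration of the two capping options above, here is a minimal sketch. The function names `hard_cap` and `soft_cap` and the limit of 10 are hypothetical and not part of conda-libmamba-solver; only the two formulas come from the comment.

```python
import math


def hard_cap(n_installed: int, limit: int) -> int:
    """Clamp the attempt count at `limit` (the min(...) option)."""
    return min(n_installed, limit)


def soft_cap(n_installed: int, limit: int) -> int:
    """Approach `limit` gradually instead of clamping abruptly."""
    return math.ceil(n_installed / (1 + (n_installed / limit) ** 2) ** 0.5)


if __name__ == "__main__":
    # Compare both options for small and large environments (limit = 10).
    for n in (1, 5, 10, 100, 1000):
        print(n, hard_cap(n, 10), soft_cap(n, 10))
```

Both variants stay at or below the limit. The soft variant yields fewer attempts around the limit itself (8 instead of 10 when `n_installed == limit == 10`) and only reaches the limit for much larger environments.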

Contributor Author

Each solve is usually sub-second, but some complex ones that need extra backtracking (like the one reported in your issue) take 2-4 s. If you end up with 100 installed packages, you then need to wait a few minutes for the solver to give up.

By "giving up" I mean that we stop trying to unlock installed packages and we let the solver modify them as needed, just as mamba does. So maybe we can get by with 10 attempts. In other words, let the solver unfreeze up to 5-10 conflicting installed packages one by one until we just let everything float.

Alternatively, we could make the retry loop be based on time spent and not iterations, but that might be more complex than necessary.
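
To make the "unfreeze one by one, then let everything float" idea concrete, here is a toy sketch of an iteration-based retry loop. Everything in it is an assumption for illustration: `solve_with_retries`, `make_toy_solver`, and the `try_solve` callback are hypothetical names, and the real solver does not expose this interface.

```python
from typing import Callable, Iterable, Optional, Set, Tuple

# Hypothetical callback: takes the currently frozen package names and returns
# the name of one conflicting frozen package, or None if the solve succeeded.
TrySolve = Callable[[Set[str]], Optional[str]]


def solve_with_retries(
    installed: Iterable[str], try_solve: TrySolve, max_attempts: int = 10
) -> Tuple[int, Set[str]]:
    """Return (number of solves run, packages still frozen at the end)."""
    frozen = set(installed)
    for attempt in range(1, max_attempts + 1):
        conflict = try_solve(frozen)
        if conflict is None:
            return attempt, frozen
        # Unfreeze only the package reported as conflicting, then retry.
        frozen.discard(conflict)
    # Give up: stop freezing anything and let the solver change what it needs.
    # This toy assumes a final, fully unfrozen solve is always run.
    try_solve(set())
    return max_attempts + 1, set()


def make_toy_solver(blockers: Set[str]) -> TrySolve:
    """Toy stand-in: the solve fails while any blocker is still frozen."""
    def try_solve(frozen: Set[str]) -> Optional[str]:
        still_frozen = sorted(blockers & frozen)
        return still_frozen[0] if still_frozen else None
    return try_solve
```

With five installed packages and two blockers, the toy loop succeeds on the third solve and keeps the other three packages frozen. With `max_attempts=1` it falls through to the fully unfrozen solve right after the first failure.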

Contributor Author

I'm going to try with 10 attempts and see if that is enough to pass all tests. Then we adjust as necessary.

Member

Makes sense.
Could you add that as a constant variable (DEFAULT_LIBMAMBA_SOLVER_MAX_ATTEMPTS or something) to better track that?
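
A minimal sketch of how such a constant could be combined with the environment variable. The standalone function `max_attempts`, its signature, the value 10, and the precedence (environment variable first, then the capped default when installed packages are frozen, otherwise a single attempt) are all assumptions for illustration, not the merged implementation.

```python
import os
from typing import Mapping, Optional

# Hypothetical default, following the "try with 10 attempts" idea above.
DEFAULT_LIBMAMBA_SOLVER_MAX_ATTEMPTS = 10


def max_attempts(
    n_installed: int,
    freeze_installed: bool,
    environ: Optional[Mapping[str, str]] = None,
) -> int:
    """Resolve the number of solve attempts (illustrative only)."""
    environ = os.environ if environ is None else environ
    from_env = environ.get("CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS")
    if from_env:
        value = int(from_env)  # raises ValueError on non-integer input
        if value < 1:
            raise ValueError("CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS must be >= 1")
        return value
    if freeze_installed and n_installed:
        # Cap the default so large environments do not retry hundreds of times.
        return min(n_installed, DEFAULT_LIBMAMBA_SOLVER_MAX_ATTEMPTS)
    return 1
```

Passing `environ` explicitly keeps the sketch easy to test without touching the real process environment.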


> Alternatively, we could make the retry loop be based on time spent and not iterations, but that might be more complex than necessary.

Yes, complexity, but also determinism ;).

@jaimergp jaimergp marked this pull request as ready for review November 22, 2023 09:28
@jaimergp
Contributor Author

Errors are due to conda/conda#13360

@jaimergp jaimergp closed this Nov 27, 2023
@jaimergp jaimergp reopened this Nov 27, 2023
Member

@jezdez jezdez left a comment

Just a few suggestions, nothing major

conda_libmamba_solver/solver.py
- int(os.environ.get("CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS", len(in_state.installed))) + 1,
- )
- for attempt in range(1, max_attempts):
+ for attempt in range(1, self._max_attempts(in_state) + 1):
Member

Would it be useful to prepopulate the max attempts at the start of the solve as an instance attribute, instead of inline, so it's easier to debug?

Contributor Author

I'm not sure if this might break some tests. It shouldn't, but by saving it as an instance attribute we are assuming we are going to have the same number of attempts for every call to solve_for_*() (for the lifetime of the same instance). In some cases, that number might be tied to len(in_state.installed).

So I'd rather keep it as it is, because it's technically more correct and does not assume "one instantiation, one solve".

Member

Sounds good, thank you!

@jezdez
Member

jezdez commented Dec 4, 2023

@mbargull To clarify, does this actually fix #391 for you or is this just a nice-to-have?

@mbargull
Member

mbargull commented Dec 5, 2023

> @mbargull To clarify, does this actually fix #391 for you or is this just a nice-to-have?

No, gh-391 actually failed on the first attempt for me.
The example from gh-391 was just one of those for which @jaimergp noticed long run times when it didn't segfault (since in that example it has to downgrade some packages).


gh-381 incidentally fixes the example from gh-391, but it is unclear if other cases might still fail depending on the ordering; see the discussion from #391 (comment) onward.

Co-authored-by: Jannis Leidel <jannis@leidel.info>
@jezdez
Member

jezdez commented Dec 5, 2023

> > @mbargull To clarify, does this actually fix #391 for you or is this just a nice-to-have?
>
> No, gh-391 actually failed on the first attempt for me. The example from gh-391 was just one of those for which @jaimergp noticed long run times when it didn't segfault (since in that example it has to downgrade some packages).
>
> gh-381 incidentally fixes the example from gh-391, but it is unclear if other cases might still fail depending on the ordering; see the discussion from #391 (comment) onward.

Ah, thanks for the pointer, much appreciated!

@jezdez jezdez changed the title more explicit logic for CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS More explicit logic for CONDA_LIBMAMBA_SOLVER_MAX_ATTEMPTS Dec 5, 2023
@jezdez jezdez merged commit 96c59f3 into main Dec 5, 2023
71 checks passed
@jezdez jezdez deleted the limit-max-attempts branch December 5, 2023 13:54
@github-actions github-actions bot added the locked [bot] locked due to inactivity label Dec 5, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 5, 2024
Labels
cla-signed [bot] added once the contributor has signed the CLA locked [bot] locked due to inactivity
4 participants