Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT-#7141: Add an ability to use config variables with a context manager #7142

Merged
merged 9 commits into from
Apr 4, 2024

Conversation

dchigarev
Copy link
Collaborator

@dchigarev dchigarev commented Apr 2, 2024

What do these changes do?

This PR enables for this behavior to be possible:

import modin.config as cfg

print(cfg.RangePartitioning.get()) # False

with cfg.update_config(RangePartitioning, True):
    print(cfg.RangePartitioning.get()) # True

print(cfg.RangePartitioning.get()) # False
  • first commit message and PR title follow format outlined here

    NOTE: If you edit the PR title to match this format, you need to add another commit (even if it's empty) or amend your last commit for the CI job that checks the PR title to pick up the new PR title.

  • passes flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
  • passes black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
  • signed commit with git commit -s
  • Resolves Add an ability to modify config variables using context manager #7141
  • tests added and passing
  • module layout described at docs/development/architecture.rst is up-to-date

…a context manager

Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
modin/config/envvars.py Outdated Show resolved Hide resolved
Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
@dchigarev dchigarev marked this pull request as ready for review April 2, 2024 15:15
Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
True
"""
if value is None:
assert isinstance(config, dict)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asserts may be disabled in user code; it is better to throw ValueError exception.

anmyachev
anmyachev previously approved these changes Apr 3, 2024
Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
True
"""
if value is not None:
config = {cast(Parameter, config): value}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there (or could be in the future) any parameters that support None value?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, it's better to use some nodefault placeholder instead

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can check isinstance(config, dict) first. The config parameter could be either dict or Parameter.

Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
@AndreyPavlenko
Copy link
Collaborator

AndreyPavlenko commented Apr 3, 2024

I've 2 minor suggestions:

  1. The function name update() seems not intuitive to me. From this name I would think that it's intended for the config update, but it's rather a config context switch than update. What about context() or override()?
  2. With the current approach we have different syntax for a single and multiple values:
with cfg.update(cfg.RangePartitioning, True):
    ...
with cfg.update({cfg.RangePartitioning: True, cfg.NPartitions: 1, cfg.MinPartitionSize: 64}):
    ...

What about something like this?

with cfg.context(RangePartitioning=True):
    ...
with cfg.context(RangePartitioning=True, NPartitions=1, MinPartitionSize=64):
    ...

It seems more consistent and compact.
A simple implementation:

# modin/config/__init__.py
modin_config = sys.modules[__name__]

@contextlib.contextmanager
def context(**configs):
    for k, v in configs.items():
        config = getattr(modin_config, k)
        configs[k], _ = config.get(), config.put(v)
    try:
        yield
    finally:
        for k, v in configs.items():
            getattr(modin_config, k).put(v)

Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
@dchigarev
Copy link
Collaborator Author

The function name update() seems not intuitive to me.

context and override also don't seem intuitive to me :D Renamed to update_config

What about something like this?

with cfg.context(RangePartitioning=True):
    ...
with cfg.context(RangePartitioning=True, NPartitions=1, MinPartitionSize=64):
    ...

It looks good, but it limits even more on how users can apply this. For example, one must either use a string name to refer to a config or import it separately, they can't refer to a config using cfg.Variable:

with cfg.context(cfg.RangePartitioning=True): # doesn't work
# File "<stdin>", line 1
#    cfg.context(cfg.RangePartitioning=True)
#                ^
# SyntaxError: expression cannot contain assignment, perhaps you meant "=="?

# one should either do:
with cfg.context("RangePartitioning"=True):
    ...

# or import the config variable separately
from modin.config import RangePartitioning
with cfg.context(RangePartitioning=True):
    ...

I don't like this type of limitation

@AndreyPavlenko
Copy link
Collaborator

or import it separately

No, you don't need any additional imports. The code I provided works as is, the only import is required - import modin.config as cfg

@AndreyPavlenko
Copy link
Collaborator

Renamed to update_config

update_config seems unintuitively redundant :)

import modin.config as config

with config.update_config(RangePartitioning=True):
    ...

vs

import modin.config as config

with config.context(RangePartitioning=True):
    ...

vs

import modin.config as config

with config.override(RangePartitioning=True):
    ...

@dchigarev
Copy link
Collaborator Author

or import it separately

No, you don't need any additional imports. The code I provided works as is, the only import is required - import modin.config as cfg

Agree, but it still doesn't allow you to write with context(cfg.RangePartitioning=...)

@AndreyPavlenko
Copy link
Collaborator

Agree, but it still doesn't allow you to write with context(cfg.RangePartitioning=...)

It doesn't, but do you really need it? with context(RangePartitioning=...) seems more convenient.

Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
modin/config/pubsub.py Outdated Show resolved Hide resolved
>>> AsyncReadMode.get()
False
"""
import modin.config as cfg
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense moving this function to the modin.config module?

Copy link
Collaborator Author

@dchigarev dchigarev Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean to modin/config/__ini__.py? Idk whether we should define something without a strong reason in an __init__.py file.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it's up to you. If you think the current location is better, I don't mind.
LGTM!

Signed-off-by: Dmitry Chigarev <dmitry.chigarev@intel.com>
Copy link
Collaborator

@anmyachev anmyachev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! @AndreyPavlenko any more comments?

@anmyachev anmyachev merged commit 0dfd88d into modin-project:master Apr 4, 2024
37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add an ability to modify config variables using context manager
3 participants