Does shared_model_manager.manager.lora.download_default_loras() potentially lead to a hang up? #142

Closed
tazlin opened this issue Dec 29, 2023 · 2 comments · Fixed by #147
Labels: question (Further information is requested)

tazlin commented Dec 29, 2023

The prior round of testing used this fixture for the lora inference tests:

    @pytest.fixture(autouse=True, scope="class")
    def setup_and_teardown(self, shared_model_manager: type[SharedModelManager]):
        assert shared_model_manager.manager.lora
        shared_model_manager.manager.lora.download_default_loras()
        shared_model_manager.manager.lora.wait_for_downloads()
        yield
        shared_model_manager.manager.lora.stop_all()

When run on my machine, and particularly on the CI runner, this fixture would hang here indefinitely during full CI runs, but not when the lora inference tests ran first or on their own.

Does running the lora model manager, or some other test(s), lead to this hang in certain circumstances?

tazlin added the question label on Dec 29, 2023

tazlin commented Dec 31, 2023

This is happening again on a064615. I suspect that something occurs during CI that does not occur in production, and that this is the cause. This is the second PR round in which it times out specifically during wait_for_downloads, even though the file is already on disk.
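
One way to make a hang like this fail fast in CI, instead of stalling the whole run, is to bound the blocking call with a deadline. The following is only a minimal sketch: `call_with_timeout` is a hypothetical helper, not part of hordelib, and it makes no assumptions about the arguments `wait_for_downloads` accepts. It does not fix the underlying cause; it only turns an indefinite hang into a test failure.

```python
import threading
from typing import Callable


def call_with_timeout(blocking_call: Callable[[], object], timeout_seconds: float) -> None:
    """Run `blocking_call` in a daemon thread and raise TimeoutError if it overruns.

    Hypothetical helper for illustration only; not a hordelib API.
    """
    errors: list[Exception] = []

    def runner() -> None:
        try:
            blocking_call()
        except Exception as exc:
            # Keep the exception so it can be re-raised in the calling thread.
            errors.append(exc)

    # A daemon thread does not keep the interpreter alive if the call never returns.
    worker = threading.Thread(target=runner, daemon=True)
    worker.start()
    worker.join(timeout_seconds)

    if worker.is_alive():
        raise TimeoutError(f"call did not finish within {timeout_seconds} seconds")
    if errors:
        raise errors[0]
```

In the fixture above, the direct call could be replaced with something like `call_with_timeout(shared_model_manager.manager.lora.wait_for_downloads, 600)`, where the 600 second limit is an arbitrary choice. Note that the worker thread is abandoned rather than stopped when the deadline passes, so this is suitable for surfacing the hang in a test, not for recovering from it.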

tazlin commented Dec 31, 2023
