
When is initialize method called? #2801

Closed
InakiRaba91 opened this issue Nov 20, 2023 · 4 comments · Fixed by #2809

Comments

@InakiRaba91 (Contributor)

📚 The doc issue

I've created a custom handler with the following initialize method:

import time

class CustomHandler(VisionHandler):
    def initialize(self, context):
        print("Got here 000!")
        time.sleep(20)
        print("Got here 111!")
        super(VisionHandler, self).__init__()

I spin up the server with a single worker by running torchserve --start --ncs --ts-config model-store/config.properties, where config.properties looks like:

inference_address=http://127.0.0.1:8080
management_address=http://127.0.0.1:8081
metrics_address=http://127.0.0.1:8082
model_store=/home/inaki/code/animal_classifier/model-store
load_models=animal.mar
min_workers=1
max_workers=1
default_workers_per_model=1
model_snapshot={"name":"startup.cfg", "modelCount":1, "models":{"animal":{"1.0":{"defaultVersion":true, "marName":"animal.mar", "minWorkers":1, "maxWorkers":1, "batchSize":2, "maxBatchDelay":2000, "responseTimeout":30000}}}}

I notice the "Got here" logs don't show up during the initial startup phase, when I assumed the model was loaded. Instead, they show up when I submit the first request to the server (curl -X POST http://localhost:8080/predictions/animal -T ./data/cats_and_dogs/frames/2.png), but not for subsequent requests. And there's no sleep time between the two prints.

My assumption is that the log output is somehow buffered or cached? I'd like to know if there's a diagram to better understand the flow.

I also noticed that in model_service_worker there seem to be two routes for handling incoming requests based on this branching. Can somebody explain the distinction between cmd == b"I" and cmd == b"L"?

Suggest a potential alternative/fix

Including a diagram/explanation with the spin-up flow in the documentation

@lxning (Collaborator) commented Nov 22, 2023

@InakiRaba91

  1. initialize is called during model loading. Please try the following code. The backend logs (i.e., the logs from the custom handler) will be sent to the frontend for log processing.
class CustomHandler(VisionHandler):
    def initialize(self, context):
        print("Got here 000!")
        time.sleep(20)
        print("Got here 111!")
        super().initialize(context)
  2. cmd == b"I" => model loading request from frontend
  3. cmd == b"L" => model inference request from frontend
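As an aside on the handler fix above: the difference between the original super(VisionHandler, self).__init__() and the corrected super().initialize(context) can be demonstrated with plain classes. This is a minimal sketch (Base and Child are hypothetical names, not TorchServe classes):

```python
# Record which parent methods actually get invoked.
calls = []

class Base:
    def initialize(self, context):
        calls.append(("Base.initialize", context))

class Child(Base):
    def broken_initialize(self, context):
        # Resolves to object.__init__ via the MRO, so Base.initialize
        # never runs -- mirroring super(VisionHandler, self).__init__().
        super(Child, self).__init__()

    def fixed_initialize(self, context):
        # Delegates to the parent's initialize, as intended.
        super().initialize(context)

c = Child()
c.broken_initialize("ctx")
assert calls == []  # the parent's initialize never ran
c.fixed_initialize("ctx")
assert calls == [("Base.initialize", "ctx")]
```

In short, the original snippet overrides initialize but then chains into __init__, so the parent class's initialization logic is silently skipped.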

@irabanillo91
Thanks. I think 2-3 might be the other way around? Like

  2. cmd == b"I" => model Inference request from frontend
  3. cmd == b"L" => model Loading request from frontend
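Under that corrected mapping, the branching in model_service_worker can be sketched roughly as follows. This is a hypothetical illustration of the one-byte command protocol, not the actual TorchServe implementation (handle_message and its return values are made up for this sketch):

```python
def handle_message(cmd: bytes, payload: object):
    """Hypothetical dispatch on the frontend's one-byte command code."""
    if cmd == b"L":
        # b"L": load request -- this is the path on which the handler's
        # initialize() runs, hence the one-off delay on first use.
        return ("load_model", payload)
    elif cmd == b"I":
        # b"I": inference request against an already-loaded model.
        return ("run_inference", payload)
    raise ValueError(f"unknown command code: {cmd!r}")

# The first message for a model is a load; subsequent ones are inferences.
assert handle_message(b"L", "animal.mar") == ("load_model", "animal.mar")
assert handle_message(b"I", "2.png") == ("run_inference", "2.png")
```

This also matches the observation in the original question: initialize (and its 20-second sleep) runs on the load path, not on every inference.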

@lxning (Collaborator) commented Nov 23, 2023

@InakiRaba91 Sorry for my typo, you are right about 2-3.

@InakiRaba91 (Contributor, Author)

Then I'd suggest adding a comment in the code to explain this, since it's not trivial to infer without prior knowledge. I've submitted this tiny PR to add it.
