
Fix the early termination issue in httpRequestHandlerChain #1175

Merged: 2 commits into pytorch:master on Jul 29, 2021

Conversation

Silas606

Description


Fixes #1174

Type of change

Minor code change.

  • Bug fix (non-breaking change which fixes an issue)

Feature/Issue validation/testing

Tested with

curl localhost:8080/ping
curl localhost:8080/models
curl localhost:8080/workflow

Checklist:

  • Have you added tests that prove your fix is effective or that this feature works?
  • Do new and existing unit tests pass locally with these changes?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

@sagemaker-neo-ci-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: torch-serve-build-cpu
  • Commit ID: 54e1be6
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@sagemaker-neo-ci-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: torch-serve-build-gpu
  • Commit ID: 54e1be6
  • Result: FAILED
  • Build Logs (available for 30 days)


@sagemaker-neo-ci-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: torch-serve-build-win
  • Commit ID: 54e1be6
  • Result: FAILED
  • Build Logs (available for 30 days)


@sagemaker-neo-ci-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: torch-serve-build-cpu
  • Commit ID: 15936ff
  • Result: FAILED
  • Build Logs (available for 30 days)


@sagemaker-neo-ci-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: torch-serve-build-gpu
  • Commit ID: 15936ff
  • Result: FAILED
  • Build Logs (available for 30 days)


@lxning lxning requested a review from maaquib July 27, 2021 16:44
@maaquib
Collaborator

maaquib commented Jul 27, 2021

@Silas606 Thanks for this. Do you mind also adding the output of the sanity suite run as well? (doc)

@Silas606
Author

Hi @maaquib, here is the output of torchserve_sanity.py. The doc linked in your reply points to a Team meeting that I don't have access to, so I have attached the log here. Thanks.
out.log

@chauhang
Contributor

@lxning Can you please clarify why the same port is being used for the Management and Inference APIs? And how is this fix addressing the issue?

@Silas606
Author

Sorry, I think I made a mistake here, as mentioned in #1174 (comment).

No changes are needed in ManagementRequestHandler.java, as it does not block any subsequent handler. The issue is only with WorkflowInferenceRequestHandler.java.

@chauhang The fix in WorkflowInferenceRequestHandler.java simply follows the rules in InferenceRequestHandler.java: it throws ResourceNotFoundException only when the request should clearly be handled here, by moving the segment-length check after confirming that segments[1] equals wfpredict. Other requests, such as /models, are handed to subsequent handlers.

Currently in 0.4.0, WorkflowInferenceRequestHandler throws ResourceNotFoundException for any API request with segments.length < 3, without considering subsequent handlers. With separate API ports, that is acceptable for now, since WorkflowInferenceRequestHandler is itself the last handler in the inference chain; the logic still seems a bit odd for a handler in a chain, even though it works. However, when the inference and management APIs share the same port, it terminates the single chain early, and localhost:8080/models falls into exactly this category, hence the 404 errors.
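The early-termination pattern described above can be sketched as a tiny chain-of-responsibility example. This is a simplified illustration, not the actual TorchServe code: the Handler interface and the string return values here are hypothetical (the real handlers extend Netty channel handlers), while the path checks (segments.length, segments[1] equals wfpredict) mirror the discussion above.

```java
// Minimal sketch of the bug and the fix, modeled as a chain of responsibility.
public class HandlerChainSketch {

    public interface Handler {
        String handle(String[] segments, Handler next);
    }

    // 0.4.0 behavior: any path with fewer than three segments is rejected
    // immediately, so /models never reaches the management handler.
    public static final Handler BUGGY_WORKFLOW_HANDLER = (segments, next) -> {
        if (segments.length < 3) {
            return "404";
        }
        if ("wfpredict".equals(segments[1])) {
            return "workflow-inference";
        }
        return next == null ? "404" : next.handle(segments, null);
    };

    // The fix: confirm ownership (segments[1] equals "wfpredict") before
    // validating the segment count; everything else is delegated.
    public static final Handler FIXED_WORKFLOW_HANDLER = (segments, next) -> {
        if (segments.length > 1 && "wfpredict".equals(segments[1])) {
            // A malformed workflow request is this handler's to reject.
            return segments.length < 3 ? "404" : "workflow-inference";
        }
        return next == null ? "404" : next.handle(segments, null);
    };

    // Stand-in for a management handler answering /models.
    public static final Handler MANAGEMENT_HANDLER = (segments, next) ->
            segments.length > 1 && "models".equals(segments[1]) ? "models-list" : "404";

    public static void main(String[] args) {
        String[] models = "/models".split("/"); // ["", "models"]
        // Early termination: the buggy handler swallows the request.
        System.out.println(BUGGY_WORKFLOW_HANDLER.handle(models, MANAGEMENT_HANDLER)); // 404
        // After the fix, /models falls through to the management handler.
        System.out.println(FIXED_WORKFLOW_HANDLER.handle(models, MANAGEMENT_HANDLER)); // models-list
    }
}
```

Checking ownership before length is what lets the chain keep working when both APIs share one port: the workflow handler only answers requests it actually owns.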

Port sharing works fine in 0.3.1, so it seems reasonable to expect sharing the API port to work in 0.4.0 and later versions as well. Do you have plans to deprecate this feature, or are there any concerns with sharing the API port? I really appreciate your help.

@sagemaker-neo-ci-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: torch-serve-build-cpu
  • Commit ID: 3ef7b7f
  • Result: FAILED
  • Build Logs (available for 30 days)


@Silas606
Author

I have updated the commit to drop the changes in ManagementRequestHandler.java; here is the sanity output: out_3ef7b7f.log

@sagemaker-neo-ci-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: torch-serve-build-gpu
  • Commit ID: 3ef7b7f
  • Result: FAILED
  • Build Logs (available for 30 days)


@lxning lxning merged commit 092ecf0 into pytorch:master Jul 29, 2021
@sagemaker-neo-ci-bot
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: torch-serve-build-win
  • Commit ID: 926216d
  • Result: FAILED
  • Build Logs (available for 30 days)


@lxning lxning added the bug Something isn't working label Nov 3, 2021
@lxning lxning added this to the v0.4.3 milestone Nov 3, 2021
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Torchserve 0.4.0 model & workflow management api unreachable
5 participants