Allow model to specify padding value instead of using first element #1279
Comments
I've located the problem in batching_util.cc:
Looks like we pad a tensor with the first value of the flattened tensor (or?) and not with zero. That probably works for image processing, but it's not good for sequence batching. Please take a look.
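A minimal sketch of the behavior being described, in NumPy rather than the actual batching_util.cc C++ (names here are illustrative):

```python
import numpy as np

def pad_like_serving(tensors):
    """Pad 1-D arrays to the longest length using each array's own first
    element as the pad value -- the behavior reported above."""
    max_len = max(len(t) for t in tensors)
    padded = []
    for t in tensors:
        pad_value = t.flatten()[0]  # first value of the flattened tensor
        padded.append(np.pad(t, (0, max_len - len(t)),
                             constant_values=pad_value))
    return np.stack(padded)

print(pad_like_serving([np.array([5, 8, 2]), np.array([7])]))
# [[5 8 2]
#  [7 7 7]]   <- the short sequence is padded with 7, not 0
```

For images a repeated value is usually harmless, but for token sequences the pad value is a real token id, which is exactly the problem.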
Thank you for your feedback. Could you elaborate on why padding with the first value (and not zero) doesn't work with sequence batching? We chose to use a real element instead of zero precisely to avoid assigning a specific meaning to zero.
@oleg-yaroshevskiy Closing this issue as it has been in "awaiting response" status for more than a month. Please add additional comments and we can reopen the issue. Thanks!
Is there any way to specify the padding value in the serving parameters? Padding with zero is quite common.
That might be related to the tensor2tensor transformer implementation, as I'm not sure whether they do any masking in the encoder. So imagine how a batch-size-2 input will look when the shorter sequence is padded with a real token. In Hugging Face transformers there is an optional masking argument for exactly this, as sketched below.
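For contrast, a sketch of the Hugging Face pattern mentioned here (the model name is illustrative, not from this thread):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# A batch of 2 inputs with different lengths; padding=True pads the short
# one and builds the matching attention_mask automatically.
batch = tok(["a longer input sentence here", "short"],
            padding=True, return_tensors="pt")
with torch.no_grad():
    out = model(**batch)  # forwards input_ids and attention_mask
print(batch["attention_mask"])  # 1 = real token, 0 = padding to be ignored
```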
Reopening and marking as a feature request. If anyone wants to contribute to the project to fix this, please feel free to!
Are you still looking for a resolution? We are planning to prioritise issues based on community interest. Please let us know if this issue still persists with the latest TF Serving 2.12.1 release so that we can work on fixing it.
This issue has been marked stale because it has had no activity for 7 days. It will be closed if no further activity occurs. Thank you.
This issue was closed due to lack of activity after being marked stale for the past 7 days.
I think the correct way to work around this issue is to encode the var-length features in RaggedTensor.from_tensor(tensor, lengths) format, i.e. with two input tensors: the dense (possibly padded) values and their true lengths.
The original variable-length RaggedTensor can then be recovered by calling tf.RaggedTensor.from_tensor(tensor, lengths=lengths) inside the model. This is safe against whatever padding algorithm TensorFlow Serving uses, because the lengths tensor determines exactly which values in each row are real.
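A minimal sketch of this workaround, assuming a model exported with two inputs (the names token_ids and lengths are hypothetical):

```python
import tensorflow as tf

@tf.function(input_signature=[
    tf.TensorSpec(shape=[None, None], dtype=tf.int32, name="token_ids"),
    tf.TensorSpec(shape=[None], dtype=tf.int64, name="lengths"),
])
def serve(token_ids, lengths):
    # Recover the true variable-length rows; whatever values TF Serving
    # used to pad token_ids beyond `lengths` are simply discarded here.
    ragged = tf.RaggedTensor.from_tensor(token_ids, lengths=lengths)
    # ... run the model on `ragged` ...
    return {"row_lengths": ragged.row_lengths()}
```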
Hi,
I know there's zero tolerance for questions here, but after some testing I believe this is a bug. It can't be reproduced without the
pad_variable_length_inputs: true
flag. I will close the corresponding t2t issue asap.

For a long time I've been trying to solve a TF Serving problem for transformer NMT model inference. So let's assume I create a simple signature:
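The original snippet is not shown here, but a sketch of such a signature in TF 1.x style (to match the environment below; tensor names are hypothetical) might be:

```python
import tensorflow as tf

inputs_ph = tf.placeholder(tf.int32, shape=[None, None], name="inputs")
# outputs = transformer(inputs_ph)   # the NMT inference graph goes here
outputs = tf.identity(inputs_ph, name="outputs")  # stand-in for the model

signature = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={"inputs": inputs_ph},
    outputs={"outputs": outputs},
)
```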
and everything works well.
Then I want to enable batching, and due to the different sequence lengths I set
pad_variable_length_inputs: true
in the batching.config file, sketched below.
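A batching parameters file of the shape being described (the padding flag is the one under discussion; the other values are illustrative):

```
max_batch_size { value: 32 }
batch_timeout_micros { value: 1000 }
num_batch_threads { value: 4 }
pad_variable_length_inputs: true
```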
That's where the problem starts: Serving returns garbage for short sequences. Comparing the examples after decoding, the longer sequence in the batch was predicted well, while the short one got messed up with repeated tokens. This can't be reproduced for single-element inference or outside the TF Serving environment.
Any ideas? Is this related to TF Serving padding, or is the input signature not the best choice?
Give me a clue,
thanks.
Environment information
tf serving 1.13.0
tensor2tensor==1.6.6