AOT C Codegen Type Issue #8062
cc @Mousius @manupa-arm I think Graph Memory Planning is sharing the output buffer (here named `output_0`). Thanks @gromero for helping to find this issue!
@mehrdadh, can you confirm how you know the output type is `int8`? I see symptoms of a different problem here, because there is a fused cast at the last operator. For argument's sake, can you use the last operator's arg type (post-cast, i.e. in `fused_divide_add_round_cast_clip_cast`) and see whether the issue persists? If my thinking is correct, the compiled model might expect the output type to be `int16` and not `int8`. cc @giuseros: I think there might be an `int16` cast at the very end causing this.
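One way to run that experiment, as a sketch only: the element count and the runner call below are assumptions (the real signature comes from the generated library), so the call is left commented out.

```c
#include <stdint.h>
#include <string.h>

#define OUTPUT_ELEMS 1001  /* hypothetical output length for this model */

int main(void) {
    /* Size the placeholder for the widest dtype under suspicion (int16)
     * even though the model nominally returns int8; if the corruption
     * stops with this buffer, the compiled model really produces int16
     * at the tail. */
    static int16_t output_data0[OUTPUT_ELEMS];
    memset(output_data0, 0, sizeof(output_data0));

    /* tvm__run_func(input_data0, output_data0); */  /* signature assumed */
    return 0;
}
```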
@manupa-arm It is a quantized network and the final output is expected to be `int8`. The full AOT-generated library is attached.
Ack, yes, it's not what I thought.
In this PR we are decoupling AOT from the Graph Memory Planner. Since AOT has the runner expressed in TIR, we can get rid of the GMP in Relay and use the StorageRewrite pass to do memory planning on the runner function. This also sorts out the issue mentioned in apache#8062. Change-Id: I6e33fadbf0462edf0366ee37e84ffde26123d3cb
Hi all. Why is this a problem? Because the GMP can expand temporaries so that they can be shared. Thinking the output is a temporary, it expands the output as well, but the output is not expandable in AOT (since it is provided by the user). We were able to reproduce this with quantized MobileNet. The fix we came up with in #8096 is to remove the dependency on the GMP and use the TIR StorageRewrite pass on the runner function, which does not suffer from this problem. The memory footprint is the same, so we basically killed two birds with one stone.
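To make that rule concrete, here is a minimal plain-C sketch (not TVM-generated code; all names and sizes are invented): temporaries may share one arena sized for their largest user, while the caller-provided output buffer stays out of the pool and is only ever written with its declared dtype.

```c
#include <stdint.h>
#include <stdlib.h>

#define N 1001  /* hypothetical output element count */

void run_model(int8_t *output) {       /* caller-owned: exactly N int8s */
    /* One arena for all temporaries, sized for the largest of them.
     * malloc'd storage may legally be reused under different types. */
    uint8_t *arena = malloc(N * sizeof(float));
    if (arena == NULL)
        return;

    float *t0 = (float *)arena;        /* op 1: float scratch */
    for (size_t i = 0; i < N; ++i)
        t0[i] = (float)i * 0.25f;

    int32_t *t1 = (int32_t *)arena;    /* op 2 reuses the same storage */
    for (size_t i = 0; i < N; ++i)
        t1[i] = (int32_t)i - 128;      /* (would consume op 1's result) */

    for (size_t i = 0; i < N; ++i)     /* the final cast writes `output` */
        output[i] = (int8_t)t1[i];     /* with its declared dtype only   */

    free(arena);
}

int main(void) {
    static int8_t output[N];
    run_model(output);
    return 0;
}
```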
Thanks @giuseros!
Will be solved by this PR: #8096.
Will close this when the PR or another solution merges.
@mehrdadh yes, this issue is resolved. |
I have a model that I tried to deploy using the AOT runtime. The model's final output has type `int8`, and based on that I allocated a placeholder for the output sized for `int8`. However, when I look at the C code generated for the AOT runtime library, `output_0`, the placeholder for the final output (`output_data0`) that we passed to the function `tvm__run_func`, has `int8` type, yet `output_0` is also used by other intermediate functions and assigned other types like `float`. For example, in the function `fused_nn_contrib_dense_pack_add_fixed_point_multiply_add_clip_cast_cast_subtract_14669711146056581479_`, `T_multiply` is `output_0` interpreted as a `float` type, and this causes it to overwrite the memory of other variables. One quick fix is to allocate the final output at the largest size used in the graph (float32/float64) to avoid this problem, but we need a better way to fix it.
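A stripped-down illustration of the mismatch (hand-written C, not actual TVM output; the names and the length `N` are assumptions): an intermediate op reinterprets the `int8` output pointer as `float`, just as `T_multiply` does in the generated code, and so writes four times as many bytes as an `int8` caller would reserve.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define N 8  /* small hypothetical output length */

/* Shape of the bug: an intermediate op receives the output buffer and
 * reinterprets it as float. */
static void fused_intermediate_op(void *out) {
    float *T_multiply = (float *)out;
    for (int i = 0; i < N; ++i)
        T_multiply[i] = 1.5f * (float)i;    /* writes N * 4 bytes */
}

int main(void) {
    /* In the real failure the caller reserves only N bytes (int8);
     * here we over-allocate so the demo itself stays well-defined. */
    int8_t *output_0 = malloc(N * sizeof(float));
    if (output_0 == NULL)
        return 1;

    fused_intermediate_op(output_0);
    printf("an int8 caller would reserve %d bytes; the op wrote %zu\n",
           N, N * sizeof(float));

    free(output_0);
    return 0;
}
```

In the failing library the buffer really is only `N` bytes, so the extra writes land on whatever lives after `output_0`, which is exactly the memory overwriting described above.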