Two problems encountered #3

INDEX108 · 2024-05-16T15:56:46Z

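1. When building the SuperPoint (sp) engine, TensorRT reports an error. The cause is that, when the maximum is capped at 512 keypoints, the values returned by the network's TopK operation have mixed types. For now I work around this by not applying the 512-keypoint cap, and the engine then builds normally.
2. The project does not seem to run properly; the nodes die immediately: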
[ WARN] [1715874831.991182397]: /home/index108/VINS/VINS_FUSION/src/VINS-Fusion-master/config/euroc/euroc_stereo_imu_config.yaml
[ WARN] [1715874831.991823929]: fix extrinsic param
[vins_estimator-3] process has died [pid 168234, exit code -11, cmd /home/index108/VINS/D_VINS/devel/lib/vins/vins_node /home/index108/VINS/VINS_FUSION/src/VINS-Fusion-master/config/euroc/euroc_stereo_imu_config.yaml __name:=vins_estimator __log:=/home/index108/.ros/log/7da11d82-139c-11ef-a8f9-e1cdc8ddab3b/vins_estimator-3.log].
log file: /home/index108/.ros/log/7da11d82-139c-11ef-a8f9-e1cdc8ddab3b/vins_estimator-3*.log
[loop_fusion-4] process has died [pid 168239, exit code -11, cmd /home/index108/VINS/D_VINS/devel/lib/loop_fusion/loop_fusion_node /home/index108/VINS/VINS_FUSION/src/VINS-Fusion-master/config/euroc/euroc_stereo_imu_config.yaml __name:=loop_fusion __log:=/home/index108/.ros/log/7da11d82-139c-11ef-a8f9-e1cdc8ddab3b/loop_fusion-4.log].
log file: /home/index108/.ros/log/7da11d82-139c-11ef-a8f9-e1cdc8ddab3b/loop_fusion-4*.log

kajo-kurisu · 2024-05-17T01:19:34Z

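1. Please make sure the SuperPoint ONNX model and export method you use match mine.
2. The loop node may be dying because you have not set the correct engine file in deep_net.h. For the vins node, please provide more detailed error output and your configuration.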

INDEX108 · 2024-05-17T18:17:16Z

This is my SuperPoint ONNX export method. Environment: torch 2.1.0+cu121, onnx 1.13.1, onnxruntime 1.17.3, onnxsim 0.4.36.
trtexec command used: trtexec --onnx='/home/index108/VINS/D_VINS/LightGlue-ONNX/weights/superpoint_v1.onnx' --fp16 --minShapes=image:1x1x480x752 --optShapes=image:1x1x480x752 --maxShapes=image:1x1x480x752 --saveEngine=/home/index108/VINS/D_VINS/model/superpoint_752x480_512.engine --warmUp=500 --duration=10
Error:
[05/18/2024-02:14:09] [W] [TRT] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[05/18/2024-02:14:09] [W] [TRT] onnx2trt_utils.cpp:400: One or more weights outside the range of INT32 was clamped
[05/18/2024-02:14:09] [E] Error[3]: If_401_OutputLayer: Input types should be equivalent. Types are Int32 and Float.
[05/18/2024-02:14:09] [E] [TRT] ModelImporter.cpp:771: While parsing node number 401 [If -> "onnx::Slice_479"]:
[05/18/2024-02:14:09] [E] [TRT] ModelImporter.cpp:772: --- Begin node ---
[05/18/2024-02:14:09] [E] [TRT] ModelImporter.cpp:773: input: "onnx::If_478"
output: "onnx::Slice_479"
output: "onnx::Unsqueeze_480"
name: "If_401"
op_type: "If"
attribute {
name: "then_branch"
g {
node {
input: "/Concat_11_output_0"
output: "keypoints.2"
name: "Identity_402"
op_type: "Identity"
}
node {
input: "/Reshape_13_output_0"
output: "scores.7"
name: "Identity_403"
op_type: "Identity"
}
name: "sub_graph"
output {
name: "keypoints.2"
type {
tensor_type {
elem_type: 7
shape {
dim {
dim_param: "Identitykeypoints.2_dim_0"
}
dim {
dim_value: 2
}
}
}
}
}
output {
name: "scores.7"
type {
tensor_type {
elem_type: 1
shape {
dim {
dim_param: "Identityscores.7_dim_0"
}
}
}
}
}
}
type: GRAPH
}
attribute {
name: "else_branch"
g {
node {
output: "483"
name: "Constant_404"
op_type: "Constant"
attribute {
name: "value"
t {
dims: 1
data_type: 7
raw_data: "\001\000\000\000\000\000\000\000"
}
type: TENSOR
}
}
node {
input: "/Constant_126_output_0"
input: "483"
output: "484"
name: "Reshape_405"
op_type: "Reshape"
attribute {
name: "allowzero"
i: 0
type: INT
}
}
node {
input: "/Reshape_13_output_0"
input: "484"
output: "scores.11"
output: "indices"
name: "TopK_406"
op_type: "TopK"
attribute {
name: "axis"
i: 0
type: INT
}
attribute {
name: "largest"
i: 1
type: INT
}
attribute {
name: "sorted"
i: 1
type: INT
}
}
node {
input: "/Concat_11_output_0"
input: "indices"
output: "487"
name: "Gather_407"
op_type: "Gather"
attribute {
name: "axis"
i: 0
type: INT
}
}
name: "sub_graph1"
output {
name: "487"
type {
tensor_type {
elem_type: 7
shape {
dim {
dim_param: "Gather487_dim_0"
}
dim {
dim_value: 2
}
}
}
}
}
output {
name: "scores.11"
type {
tensor_type {
elem_type: 1
shape {
dim {
dim_param: "TopKscores.11_dim_0"
}
}
}
}
}
}
type: GRAPH
}

[05/18/2024-02:14:09] [E] [TRT] ModelImporter.cpp:774: --- End node ---
[05/18/2024-02:14:09] [E] [TRT] ModelImporter.cpp:777: ERROR: ModelImporter.cpp:195 In function parseGraph:
[6] Invalid Node - If_401
If_401_OutputLayer: Input types should be equivalent. Types are Int32 and Float.
[05/18/2024-02:14:09] [E] Failed to parse onnx file
[05/18/2024-02:14:09] [I] Finished parsing network model. Parse time: 0.0201627
[05/18/2024-02:14:09] [E] Parsing model failed
[05/18/2024-02:14:09] [E] Failed to create engine from model or file.
[05/18/2024-02:14:09] [E] Engine set up failed
Using TensorRT 8.6.1.6 + CUDA 12.1 + cuDNN 8.9.5.
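
For readers decoding the node dump above: the `elem_type` and `data_type` integers are ONNX `TensorProto.DataType` codes (1 is FLOAT, 6 is INT32, 7 is INT64), and `raw_data` holds little-endian bytes. The following is a minimal sketch using only the Python standard library; the helper names are illustrative and do not belong to any library.

```python
import struct

# Subset of ONNX TensorProto.DataType codes relevant to the dump above.
ONNX_DTYPES = {1: "FLOAT", 6: "INT32", 7: "INT64", 9: "BOOL", 10: "FLOAT16"}


def dtype_name(code):
    """Return a readable name for an ONNX elem_type / data_type code."""
    return ONNX_DTYPES.get(code, f"UNKNOWN({code})")


def decode_int64_raw(raw):
    """Decode little-endian INT64 raw_data bytes into a list of ints."""
    count = len(raw) // 8
    return list(struct.unpack(f"<{count}q", raw))


# Declared output types of the two If branches, copied from the dump.
then_branch = {"keypoints.2": 7, "scores.7": 1}
else_branch = {"487": 7, "scores.11": 1}

for branch_name, outputs in (("then", then_branch), ("else", else_branch)):
    for tensor, code in outputs.items():
        print(branch_name, tensor, dtype_name(code))

# raw_data of Constant_404 as printed in the dump.
print(decode_int64_raw(b"\x01\x00\x00\x00\x00\x00\x00\x00"))
```

Both branches declare (INT64, FLOAT) outputs in the same order, which suggests the `Int32 and Float` mismatch arises from how the importer resolves the branch outputs rather than from the types declared in the ONNX file.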

INDEX108 · 2024-05-17T19:03:44Z

Hello, the vins node issue is solved. Now loop_fusion dies while running. Current error:
[ WARN] [1715972313.674659166]: /home/index108/VINS/D_VINS/src/D_VINS/config/euroc/euroc_mono_imu_config.yaml
[ WARN] [1715972313.675163396]: fix extrinsic param
[ WARN] [1715972313.683998136]: waiting for image and imu...
[ WARN] [1715972351.778812641]: gyroscope bias initial calibration -0.00362679 0.0231133 0.0789586
[2024-05-18 02:59:13][error][preprocess_kernel.cu:514]:launch failed: invalid configuration argument
[2024-05-18 02:59:13][error][trt_infer.cpp:22]:NVInfer: 3: [executionContext.cpp::enqueueInternal::795] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/executionContext.cpp::enqueueInternal::795, condition: bindings[x] || nullBindingOK
)
[2024-05-18 02:59:13][fatal][trt_infer.cpp:281]:Enqueue failed, code 9[cudaErrorInvalidConfiguration], message invalid configuration argument
[loop_fusion-3] process has died [pid 16520, exit code -6, cmd /home/index108/VINS/D_VINS/devel/lib/loop_fusion/loop_fusion_node /home/index108/VINS/D_VINS/src/D_VINS/config/euroc/euroc_mono_imu_config.yaml __name:=loop_fusion __log:=/home/index108/.ros/log/17d716b0-147c-11ef-af6a-1f3345969fd2/loop_fusion-3.log].
log file: /home/index108/.ros/log/17d716b0-147c-11ef-af6a-1f3345969fd2/loop_fusion-3*.log

INDEX108 · 2024-05-17T19:07:12Z

Could this later error be caused by my not capping the maximum number of keypoints at 512 when building from the ONNX model?

kajo-kurisu · 2024-05-18T02:16:07Z

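1. Your command looks fine. Try exporting the ONNX model with torch 1.13, or building the engine with CUDA 11.7; different versions may handle individual operators differently.
2. Around line 210 of trt_infer.cpp the model's input and output dimensions are bound. Check whether the names in your exported model differ from the ones written there.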

INDEX108 · 2024-05-18T02:26:59Z

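Hello, thank you for the patient reply. I tried exporting the model with torch 1.13 (cu117) + onnx 1.16, but 1.13 does not seem to support aten::scaled_dot_product_attention. I don't understand how you exported it with that version.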

kajo-kurisu · 2024-05-18T03:08:59Z

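You are right. I just tried again and found that the sp and lg export methods as written have some problems. If you want to export lg (LightGlue) yourself, use torch 2.1; the engine can then be exported normally. sp can be exported with torch 2.0, as in LightGlue-ONNX. For convenience, I recommend building the engines directly from superpoint_lightglue.onnx and superpoint_512.onnx in release V0.1.3 at https://github.com/fabio-sim/LightGlue-ONNX/releases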

INDEX108 · 2024-05-18T03:29:34Z

Hello. With CUDA 12.1 and PyTorch 2.1.0+cu121, I downloaded the two ONNX files from v0.1.3 as you suggested and built the engines. Now the node dies before I even play the rosbag:
[2024-05-18 11:29:19][fatal][trt_tensor.cpp:359]:Assert failed, ndims == shape_.size()
[loop_fusion-4] process has died [pid 16078, exit code -6, cmd /home/index108/VINS/D_VINS/devel/lib/loop_fusion/loop_fusion_node /home/index108/VINS/D_VINS/src/D_VINS/config/euroc/euroc_mono_imu_config.yaml __name:=loop_fusion __log:=/home/index108/.ros/log/cf22d440-14c6-11ef-b67e-abc4e968c450/loop_fusion-4.log].
log file: /home/index108/.ros/log/cf22d440-14c6-11ef-b67e-abc4e968c450/loop_fusion-4*.log

kajo-kurisu · 2024-05-18T03:32:32Z

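Around line 210 of trt_infer.cpp the model's input and output dimensions are bound. Check whether the input and output names of your exported model differ from the ones written there.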

INDEX108 · 2024-05-18T03:37:58Z

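OK, I'll print them and take a look. I just found that:
1. The engine built from the v0.1.3 superpoint512.onnx, combined with the engine built from the v0.1.3 lightglue.onnx, fails immediately.
2. The engine built from the v0.1.3 superpoint512.onnx, combined with a lightglue model I exported myself under torch 2.1.0, does not fail immediately, but fails once the rosbag is playing, with the same error as before.
3. A superpoint.onnx I exported myself under torch 2.1.0 (without the max-keypoint cap), combined with my own 2.1.0 lightglue, does not fail immediately, but fails once the rosbag is playing, with the same error as before.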

INDEX108 · 2024-05-18T03:44:24Z

Hello, I tried printing bindingName at the location shown in the screenshot, but nothing is printed before the error occurs.

kajo-kurisu · 2024-05-18T03:53:42Z

Use Netron to inspect the dimension names in the ONNX model. If the problem is not there, please debug to locate where the error is raised.
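
As a scriptable complement to Netron, the graph inputs and outputs of an ONNX file can be listed with the `onnx` Python package and compared against the names the C++ side expects. This is a minimal sketch, assuming `onnx` is installed. The function names are illustrative, and the example names in the demo are hypothetical, so take the real list from `trt_infer.cpp`.

```python
def onnx_io_names(model_path):
    """Return (input_names, output_names) of an ONNX model file."""
    import onnx  # imported lazily so diff_names() works without the package

    graph = onnx.load(model_path).graph
    # Some exporters also list initializers under graph.input; skip those.
    initializers = {init.name for init in graph.initializer}
    inputs = [i.name for i in graph.input if i.name not in initializers]
    outputs = [o.name for o in graph.output]
    return inputs, outputs


def diff_names(expected, actual):
    """Return (missing, unexpected): names expected but absent, and vice versa."""
    missing = sorted(set(expected) - set(actual))
    unexpected = sorted(set(actual) - set(expected))
    return missing, unexpected


# Hypothetical demo: the code expects "keypoints_r" but the model exposes "keypoints".
missing, unexpected = diff_names(["keypoints_r"], ["keypoints"])
print("missing:", missing, "unexpected:", unexpected)
```

In practice one would call `onnx_io_names("model.onnx")` on each exported file and pass its result to `diff_names` together with the names hard-coded on the C++ side.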

kajo-kurisu · 2024-05-18T04:15:21Z

The exported spre model has a problem; please check it.

INDEX108 · 2024-05-18T04:16:08Z

It seems so. By debugging I found the exact location of the error.

INDEX108 · 2024-05-18T04:23:21Z

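The strange thing is that I hit no problems when exporting spre, which made me think the issue was in sp. How should I check it? I have no idea where to start.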

kajo-kurisu · 2024-05-18T04:37:06Z

First check the names and dimensions in the ONNX model, then check your trtexec export settings. The tools were mentioned above.
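
One way to double-check trtexec shape settings before building is to parse the `--minShapes`, `--optShapes` and `--maxShapes` values and confirm that each input has the same rank in all three and that min <= opt <= max holds in every dimension. This is a minimal standard-library sketch; the function names are illustrative, and the example values are the ones used for the `keypoints_r` input elsewhere in this thread.

```python
def parse_shapes(spec):
    """Parse 'name:1x20x2,other:1x3' into {'name': (1, 20, 2), 'other': (1, 3)}."""
    shapes = {}
    for item in spec.split(","):
        # rpartition keeps tensor names that themselves contain a colon intact.
        name, _, dims = item.rpartition(":")
        shapes[name] = tuple(int(d) for d in dims.split("x"))
    return shapes


def check_profile(min_spec, opt_spec, max_spec):
    """Return a list of problems found in a min/opt/max shape profile."""
    lo, opt, hi = (parse_shapes(s) for s in (min_spec, opt_spec, max_spec))
    problems = []
    for name in sorted(set(lo) | set(opt) | set(hi)):
        if not (name in lo and name in opt and name in hi):
            problems.append(f"{name}: missing from min, opt or max")
            continue
        a, b, c = lo[name], opt[name], hi[name]
        if not (len(a) == len(b) == len(c)):
            problems.append(f"{name}: rank differs across min/opt/max")
        elif any(not (x <= y <= z) for x, y, z in zip(a, b, c)):
            problems.append(f"{name}: expected min <= opt <= max per dimension")
    return problems


print(check_profile("keypoints_r:1x20x2", "keypoints_r:1x150x2", "keypoints_r:1x512x2"))
```

This only validates the command line. It does not verify that the tensor rank matches what the C++ code expects, which is what an assertion such as `ndims == shape_.size()` appears to be about.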

INDEX108 · 2024-05-18T04:40:43Z

Thank you for your patience. I checked, and my trtexec export command matches yours:
trtexec --onnx='/home/index108/VINS/D_VINS/superpoint_recover_des_480x752.onnx' --fp16 --saveEngine='/home/index108/VINS/D_VINS/model/superpoint_recover_des_480x752.engine' --warmUp=500 --duration=10 --minShapes=keypoints_r:1x20x2 --optShapes=keypoints_r:1x150x2 --maxShapes=keypoints_r:1x512x2
The ONNX visualisation looks like this:

kajo-kurisu · 2024-05-18T05:17:09Z

I went through the sp_re workflow under torch 1.13 and had no problems. You could try that.

INDEX108 · 2024-05-18T05:19:10Z

Do you mean using torch 1.13 + cu117 for sp_re only? My main environment is PyTorch 2.1.0 + CUDA 12.1; will that matter?

INDEX108 · 2024-05-18T05:23:46Z

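I just tried converting to ONNX under PyTorch 1.13 + cu117, and building with TensorRT still gives the same error. However, the CUDA installed on my machine is 12.1, with TensorRT 8.6.1.6.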

kajo-kurisu · 2024-05-18T05:23:46Z

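Yes, export sp_re alone with that version. Once the model has been exported correctly, running it depends only on the CUDA and TensorRT versions, not on PyTorch, so there should be no problem. Switching between multiple CUDA and TensorRT versions is also very convenient.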

INDEX108 · 2024-05-18T05:48:39Z

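OK, thanks! However, converting to ONNX under PyTorch 1.13 + cu117 and then building with TensorRT still gives the same error. My machine has CUDA 12.1 and TensorRT 8.6.1.6 installed. Next I will try building the engine with TensorRT under CUDA 11.7 and running catkin_make on DVINS.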

INDEX108 · 2024-05-18T08:53:12Z

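I just switched CUDA to 11.7 as well, and it still does not work. What exactly is the Python environment you tested sp_re in? Is it the one described in the project?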

kajo-kurisu · 2024-05-18T10:06:40Z

CUDA 11.7, torch 1.13.1. Confirmed again: no problems.

INDEX108 · 2024-05-18T10:20:02Z

I'm out of ideas for now. Can you tell from this what is causing the problem?

kajo-kurisu · 2024-05-18T10:44:15Z

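Most likely the input/output settings of your model are wrong, so the inference results are copied incorrectly. Please check carefully.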

INDEX108 · 2024-05-18T13:56:05Z

By settings, do you mean the settings used when exporting to ONNX or the TensorRT settings? I could not find anything that looks wrong.

kajo-kurisu · 2024-05-20T02:40:26Z

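Final confirmation: there is no problem at all. Please check whether you changed any parameters in the config. If you need to change them, write your own parameter file following the format of euroc_stereo_imu_config.yaml.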

INDEX108 · 2024-05-20T12:43:26Z

I did not change any config.

INDEX108 · 2024-05-20T19:44:39Z

Very sorry, it was my mistake: I was using the wrong config... Does this only work with stereo?

chenjack9017 · 2024-10-22T04:56:11Z

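You are right. I just tried again and found that the sp and lg export methods as written have some problems. If you want to export lg (LightGlue) yourself, use torch 2.1; the engine can then be exported normally. sp can be exported with torch 2.0, as in LightGlue-ONNX. For convenience, I recommend building the engines directly from superpoint_lightglue.onnx and superpoint_512.onnx in release V0.1.3 at https://github.com/fabio-sim/LightGlue-ONNX/releases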

Hi, hello, do you still use these two onnx files now?

kajo-kurisu · 2024-10-22T12:26:46Z

yes

kajo-kurisu closed this as completed May 20, 2024

This was referenced Aug 5, 2024

Segmentation fault #9

Closed

Monocular operation #10

Closed
