feat(ground_segmentation): add time_keeper #8585

a-maumau · 2024-08-22T08:46:54Z

Description

This PR will add time keeper feature for ground_segmentation.

The time tracking is set only for relatively long processing time functions/methods.

You need to modified the parameter file ground_segmentation.param.yaml in autoware_launch to run in autoware.

feat(ground_segmentation): add option for time keeper autoware_launch#1134

Sample output (updated from original comment)

# total time

- scan_ground_filter
100.00% faster_filter: total 3706.36 [ms], avg. 18.53 [ms], run count: 200
    ├── 67.09% convertPointcloudGridScan: total 2486.72 [ms], avg. 12.43 [ms], run count: 200
    │   ├── 50.49% scan: total 1871.41 [ms], avg. 9.36 [ms], run count: 200
    │   ├── 16.53% sort: total 612.49 [ms], avg. 3.06 [ms], run count: 200
    │   └── 0.08% rest: 2.82 [ms]
    ├── 28.57% classifyPointCloudGridScan: total 1059.02 [ms], avg. 5.30 [ms], run count: 200
    │   ├── 28.56% classify_loop: total 1058.42 [ms], avg. 5.29 [ms], run count: 200
    │   │   ├── 26.91% radial_loop: total 997.52 [ms], avg. 0.01 [ms], run count: 71999
    │   │   └── 1.64% rest: 60.90 [ms]
    │   └── 0.02% rest: 0.60 [ms]
    ├── 3.50% extractObjectPoints: total 129.54 [ms], avg. 0.65 [ms], run count: 200
    └── 0.84% rest: 31.07 [ms]

- ray_ground_filter
  100.00% filter: total 10171.14 [ms], avg. 36.46 [ms], run count: 279
    ├── 70.96% ConvertXYZIToRTZColor: total 7217.59 [ms], avg. 25.87 [ms], run count: 279
    ├── 13.44% ClassifyPointCloud: total 1366.64 [ms], avg. 4.90 [ms], run count: 279
    ├── 6.86% ExtractPointsIndices: total 698.06 [ms], avg. 2.50 [ms], run count: 279
    │   ├── 4.16% initializePointCloud2: total 423.15 [ms], avg. 0.76 [ms], run count: 558
    │   └── 2.70% rest: 274.91 [ms]
    └── 8.74% rest: 888.85 [ms]

- ransac_ground_filter
  100.00% filter: total 130212.22 [ms], avg. 1943.47 [ms], run count: 67
    ├── 0.07% transformPointCloud: total 89.05 [ms], avg. 0.66 [ms], run count: 134
    ├── 99.45% applyRANSAC: total 129490.73 [ms], avg. 1932.70 [ms], run count: 67
    ├── 0.07% extractPointsIndices: total 93.24 [ms], avg. 1.39 [ms], run count: 67
    ├── 0.00% getPlaneAffine: total 0.39 [ms], avg. 0.01 [ms], run count: 67
    └── 0.41% rest: 538.80 [ms]

How was this PR tested?

Notes for reviewers

None.

Interface changes

None.

Effects on system behavior

None.

github-actions · 2024-08-22T08:47:14Z

Thank you for contributing to the Autoware project!

🚧 If your pull request is in progress, switch it to draft mode.

Please ensure:

You've checked our contribution guidelines.
Your PR follows our pull request guidelines.
All required CI checks pass before marking the PR ready for review.

badai-nguyen · 2024-08-27T02:31:30Z

@a-maumau Thank you for your PR,
I think there are no time keeper for below functions

    │   ├── 0.02% checkBreakGndGrid: total 0.14 [ms], avg. 0.00 [ms], run count: 427548
    │   ├── 0.01% recheckGroundCluster: total 0.09 [ms], avg. 0.00 [ms], run count: 164117
    │   ├── 0.01% checkContinuousGndGrid: total 0.04 [ms], avg. 0.00 [ms], run count: 269943
    │   ├── 0.02% checkDiscontinuousGndGrid: total 0.12 [ms], avg. 0.00 [ms], run count: 354528
    │   └── 81.57% rest: 512.40 [ms]

Could you tell me how did you extract these result?

technolojin · 2024-08-27T05:28:00Z

similar to #8597 (comment)

a-maumau · 2024-08-29T02:05:50Z

@badai-nguyen Sorry for my poor explanations and making some confusions.

The time tracking is set only for relatively long processing time functions/methods.

This meant "I remove some timekeeper from the code".

The reason I removed the timekeeper from some methods is that its processing time is relatively small compared to the other parts.
It was not intended in this PR, but I found that adding the timekeeper for some method make the entire process slow later.

For example, following two images are output from debug tools.

The first image shows the visualization of processing_time_ms of the current code (no timekeeper). Second image is showing the case when the timekeeper added to the all methods.

In the second image, the processing time of scan_ground_filter (brown plot) ranged in like 160~240 ms. I think this is caused by too many call of some methods with timekeeper, I guess.

Checking the autoware_debug_tools's processing_time_visualizer,

100.00% faster_filter: total 238.61 [ms], avg. 238.61 [ms], run count: 1
    ├── 35.13% convertPointcloudGridScan: total 83.82 [ms], avg. 83.82 [ms], run count: 1
    │   ├── 0.52% get_point_from_global_offset: total 1.23 [ms], avg. 0.00 [ms], run count: 138102
    │   └── 34.61% rest: 82.59 [ms]
    ├── 64.21% classifyPointCloudGridScan: total 153.22 [ms], avg. 153.22 [ms], run count: 1
    │   ├── 0.03% get_point_from_global_offset: total 0.07 [ms], avg. 0.00 [ms], run count: 138102
    │   ├── 0.00% initializeFirstGndGrids: total 0.00 [ms], avg. 0.00 [ms], run count: 358
    │   ├── 0.03% calcGridSize: total 0.06 [ms], avg. 0.00 [ms], run count: 88036
    │   ├── 0.03% checkBreakGndGrid: total 0.07 [ms], avg. 0.00 [ms], run count: 35212
    │   ├── 0.04% recheckGroundCluster: total 0.08 [ms], avg. 0.00 [ms], run count: 14935
    │   ├── 0.02% checkContinuousGndGrid: total 0.04 [ms], avg. 0.00 [ms], run count: 24707
    │   ├── 0.01% checkDiscontinuousGndGrid: total 0.02 [ms], avg. 0.00 [ms], run count: 28117
    │   └── 64.07% rest: 152.88 [ms]
    ├── 0.44% extractObjectPoints: total 1.06 [ms], avg. 1.06 [ms], run count: 1
    └── 0.22% rest: 0.51 [ms]

100.00% faster_filter: total 24.13 [ms], avg. 24.13 [ms], run count: 1
    ├── 72.51% convertPointcloudGridScan: total 17.50 [ms], avg. 17.50 [ms], run count: 1
    ├── 24.45% classifyPointCloudGridScan: total 5.90 [ms], avg. 5.90 [ms], run count: 1
    ├── 2.50% extractObjectPoints: total 0.60 [ms], avg. 0.60 [ms], run count: 1
    └── 0.53% rest: 0.13 [ms]

This result is not from the same point in the rosbag file, so it is hard compare the result, but the ratio of processing times of convertPointcloudGridScan and classifyPointCloudGridScan got largely changed.
At least, we can say the overall processing time has reduced if we remove the timekeeper from some methods.

a-maumau · 2024-08-29T02:20:51Z

@technolojin It seems it is taking time for the for-loop which depends on the input point cloud number, and this might indicate the sample result of rest in the top PR comment comes from too many timekeeper.

For example in the case of adding timekeeper only for few methods,

100.00% faster_filter: total 24.29 [ms], avg. 24.29 [ms], run count: 1
    ├── 64.59% convertPointcloudGridScan: total 15.69 [ms], avg. 15.69 [ms], run count: 1
    ├── 31.13% classifyPointCloudGridScan: total 7.56 [ms], avg. 7.56 [ms], run count: 1
    ├── 3.76% extractObjectPoints: total 0.91 [ms], avg. 0.91 [ms], run count: 1
    └── 0.52% rest: 0.13 [ms]

convertPointcloudGridScan

autoware.universe/perception/autoware_ground_segmentation/src/scan_ground_filter/node.cpp

Lines 133 to 158 in 81934e4

    
           for (size_t global_offset = 0; global_offset + in_cloud_point_step <= in_cloud_data_size; 
        
                global_offset += in_cloud_point_step) { 
        
             get_point_from_global_offset(in_cloud, input_point, global_offset); 
        
             auto x{input_point.x - x_shift};  // base on front wheel center 
        
             auto radius{static_cast<float>(std::hypot(x, input_point.y))}; 
        
             auto theta{normalizeRadian(std::atan2(x, input_point.y), 0.0)}; 
        
             // divide by vertical angle 
        
             auto radial_div{static_cast<size_t>(std::floor(theta * inv_radial_divider_angle_rad))}; 
        
             uint16_t grid_id = 0; 
        
             if (radius <= grid_mode_switch_radius_) { 
        
               grid_id = static_cast<uint16_t>(radius * inv_grid_size_m); 
        
             } else { 
        
               auto gamma{normalizeRadian(std::atan2(radius, virtual_lidar_z_), 0.0f)}; 
        
               grid_id = grid_id_offset + gamma * inv_grid_size_rad; 
        
             } 
        
             current_point.grid_id = grid_id; 
        
             current_point.radius = radius; 
        
             current_point.point_state = PointLabel::INIT; 
        
             current_point.orig_index = point_index; 
        
             // radial divisions 
        
             out_radial_ordered_points[radial_div].emplace_back(current_point); 
        
             ++point_index; 
        
           }

: 8.962 ms

autoware.universe/perception/autoware_ground_segmentation/src/scan_ground_filter/node.cpp

Lines 161 to 165 in 81934e4

    
           for (size_t i = 0; i < radial_dividers_num_; ++i) { 
        
             std::sort( 
        
               out_radial_ordered_points[i].begin(), out_radial_ordered_points[i].end(), 
        
               [](const PointData & a, const PointData & b) { return a.radius < b.radius; }); 
        
           }

: 4.043 ms

8.962 + 4.043 ~ 13 ms (total is 15.69ms)

classifyPointCloudGridScan
- the top loop take: 7.541 ms (total is 7.56 ms)

badai-nguyen · 2024-08-29T02:22:01Z

@badai-nguyen Sorry for my poor explanations and making some confusions.

The time tracking is set only for relatively long processing time functions/methods.

This meant "I remove some timekeeper from the code".

The reason I removed the timekeeper from some methods is that its processing time is relatively small compared to the other parts. It was not intended in this PR, but I found that adding the timekeeper for some method make the entire process slow later.

For example, following two images are output from debug tools.

The first image shows the visualization of processing_time_ms of the current code (no timekeeper). Second image is showing the case when the timekeeper added to the all methods.

In the second image, the processing time of scan_ground_filter ranged in like 160~240 ms. I think this is caused by too many call of some methods with timekeeper, I guess.

Checking the autoware_debug_tools's processing_time_visualizer,
100.00% faster_filter: total 238.61 [ms], avg. 238.61 [ms], run count: 1
    ├── 35.13% convertPointcloudGridScan: total 83.82 [ms], avg. 83.82 [ms], run count: 1
    │   ├── 0.52% get_point_from_global_offset: total 1.23 [ms], avg. 0.00 [ms], run count: 138102
    │   └── 34.61% rest: 82.59 [ms]
    ├── 64.21% classifyPointCloudGridScan: total 153.22 [ms], avg. 153.22 [ms], run count: 1
    │   ├── 0.03% get_point_from_global_offset: total 0.07 [ms], avg. 0.00 [ms], run count: 138102
    │   ├── 0.00% initializeFirstGndGrids: total 0.00 [ms], avg. 0.00 [ms], run count: 358
    │   ├── 0.03% calcGridSize: total 0.06 [ms], avg. 0.00 [ms], run count: 88036
    │   ├── 0.03% checkBreakGndGrid: total 0.07 [ms], avg. 0.00 [ms], run count: 35212
    │   ├── 0.04% recheckGroundCluster: total 0.08 [ms], avg. 0.00 [ms], run count: 14935
    │   ├── 0.02% checkContinuousGndGrid: total 0.04 [ms], avg. 0.00 [ms], run count: 24707
    │   ├── 0.01% checkDiscontinuousGndGrid: total 0.02 [ms], avg. 0.00 [ms], run count: 28117
    │   └── 64.07% rest: 152.88 [ms]
    ├── 0.44% extractObjectPoints: total 1.06 [ms], avg. 1.06 [ms], run count: 1
    └── 0.22% rest: 0.51 [ms]
100.00% faster_filter: total 24.13 [ms], avg. 24.13 [ms], run count: 1
    ├── 72.51% convertPointcloudGridScan: total 17.50 [ms], avg. 17.50 [ms], run count: 1
    ├── 24.45% classifyPointCloudGridScan: total 5.90 [ms], avg. 5.90 [ms], run count: 1
    ├── 2.50% extractObjectPoints: total 0.60 [ms], avg. 0.60 [ms], run count: 1
    └── 0.53% rest: 0.13 [ms]
This result is not from the same point in the rosbag file, so it is hard compare the result, but the ratio of processing times of convertPointcloudGridScan and classifyPointCloudGridScan got largely changed. At least, we can say the overall processing has reduced if we remove the timekeeper from some methods.

@a-maumau thank you for your clarification.

It makes sense to me know since I also got quite the same result at my side
It's also a good point to known that timekeeper quite takes time to process. ❤️

a-maumau · 2024-08-29T07:12:17Z

I added more timekeepers to make the processing time clearer for scan_groud_filter in f70830b .

Sample output:

100.00% faster_filter: total 3706.36 [ms], avg. 18.53 [ms], run count: 200
    ├── 67.09% convertPointcloudGridScan: total 2486.72 [ms], avg. 12.43 [ms], run count: 200
    │   ├── 50.49% scan: total 1871.41 [ms], avg. 9.36 [ms], run count: 200
    │   ├── 16.53% sort: total 612.49 [ms], avg. 3.06 [ms], run count: 200
    │   └── 0.08% rest: 2.82 [ms]
    ├── 28.57% classifyPointCloudGridScan: total 1059.02 [ms], avg. 5.30 [ms], run count: 200
    │   ├── 28.56% classify_loop: total 1058.42 [ms], avg. 5.29 [ms], run count: 200
    │   │   ├── 26.91% radial_loop: total 997.52 [ms], avg. 0.01 [ms], run count: 71999
    │   │   └── 1.64% rest: 60.90 [ms]
    │   └── 0.02% rest: 0.60 [ms]
    ├── 3.50% extractObjectPoints: total 129.54 [ms], avg. 0.65 [ms], run count: 200
    └── 0.84% rest: 31.07 [ms]

technolojin

I reviewed focusing on the timekeeper implementation.

perception/autoware_ground_segmentation/src/scan_ground_filter/node.cpp

a-maumau · 2024-08-30T06:24:16Z

Output change at 45de8e7 :

100.00% faster_filter: total 2810.32 [ms], avg. 18.25 [ms], run count: 154
    ├── 69.23% convertPointcloudGridScan: total 1945.50 [ms], avg. 12.63 [ms], run count: 154
    │   ├── 51.90% horizontal_angle_grouping: total 1458.67 [ms], avg. 9.47 [ms], run count: 154
    │   ├── 17.24% sort: total 484.54 [ms], avg. 3.15 [ms], run count: 154
    │   └── 0.08% rest: 2.29 [ms]
    ├── 26.85% classifyPointCloudGridScan: total 754.67 [ms], avg. 4.90 [ms], run count: 154
    ├── 3.12% extractObjectPoints: total 87.62 [ms], avg. 0.57 [ms], run count: 154
    └── 0.80% rest: 22.52 [ms]

technolojin

The comments are more on details.

Other then those, LGTM.

perception/autoware_ground_segmentation/src/scan_ground_filter/node.cpp

codecov · 2024-08-30T08:55:01Z

Codecov Report

Attention: Patch coverage is 41.02564% with 46 lines in your changes missing coverage. Please review.

Project coverage is 25.23%. Comparing base (67265bb) to head (a19b5e5).
Report is 18 commits behind head on main.

Files with missing lines	Patch %	Lines
...round_segmentation/src/scan_ground_filter/node.cpp	42.85%	24 Missing and 8 partials ⚠️
...ground_segmentation/src/ray_ground_filter/node.cpp	38.46%	3 Missing and 5 partials ⚠️
...und_segmentation/src/ransac_ground_filter/node.cpp	33.33%	3 Missing and 3 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #8585      +/-   ##
==========================================
- Coverage   25.23%   25.23%   -0.01%     
==========================================
  Files        1325     1328       +3     
  Lines       98572    98729     +157     
  Branches    38088    38163      +75     
==========================================
+ Hits        24874    24910      +36     
- Misses      71156    71227      +71     
- Partials     2542     2592      +50

Flag	Coverage Δ		*Carryforward flag
differential	`11.14% <41.02%> (?)`
total	`25.22% <ø> (-0.01%)`	⬇️	Carriedforward from 67265bb

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

a-maumau · 2024-09-02T01:21:26Z

@technolojin The reason for adding the new block (scope) in those lines might not be that clear. Would it be alright to remove that explanation?

technolojin · 2024-09-02T02:41:39Z

@technolojin The reason for adding the new block (scope) in those lines might not be that clear. Would it be alright to remove that explanation?

I do not think the new blocks are not clear.
It is possible to separate those by methods, name it by its functionality(, and measure the time).

This process (separate a long code to blocks) is meaningful and useful to maintain a complex code readable and manageable.

YoshiRi

LGTM

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

Shin-kyoto

@a-maumau

Can you resolve conflicts?

a-maumau · 2024-09-09T05:32:26Z

@Shin-kyoto I resolved the conflict.

technolojin · 2024-09-09T10:13:57Z

@Shin-kyoto can you review again please?

* add time_keeper Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com> * add timekeeper option Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com> * add autoware_universe_utils Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com> * fix topic name Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com> * add scope and timekeeper Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com> * remove debug code Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com> * remove some timekeeper and mod block comment Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com> --------- Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com> Signed-off-by: emuemuJP <k.matsumoto.0807@gmail.com>

a-maumau requested review from amc-nu, badai-nguyen, miursh, YoshiRi and yukkysaito as code owners August 22, 2024 08:46

github-actions bot added component:perception Advanced sensor data processing and environment understanding. (auto-assigned) tag:require-cuda-build-and-test labels Aug 22, 2024

a-maumau marked this pull request as draft August 22, 2024 09:38

a-maumau marked this pull request as ready for review August 23, 2024 01:57

a-maumau mentioned this pull request Aug 23, 2024

feat(ground_segmentation): add option for time keeper autowarefoundation/autoware_launch#1134

Merged

4 tasks

badai-nguyen added the run:build-and-test-differential Mark to enable build-and-test-differential workflow. (used-by-ci) label Aug 27, 2024

vividf assigned a-maumau Aug 27, 2024

technolojin requested changes Aug 30, 2024

View reviewed changes

technolojin approved these changes Aug 30, 2024

View reviewed changes

a-maumau force-pushed the mau/feat/perception/ground_segmentation branch from 18e4ab2 to 2c68ffb Compare September 2, 2024 03:08

a-maumau requested review from kosuke55, kyoichi-sugahara, mkuri and zusizusi as code owners September 2, 2024 03:08

YamatoAndo removed request for ktro2828, taikitanaka3, maxime-clem, Motsu-san, zulfaqar-azmi-t4, dan-dnn, TaikiYamada4, yuki-takagi-66, vividf and mkquda September 2, 2024 07:09

technolojin self-assigned this Sep 2, 2024

YoshiRi approved these changes Sep 9, 2024

View reviewed changes

technolojin enabled auto-merge (squash) September 9, 2024 04:34

technolojin disabled auto-merge September 9, 2024 04:34

a-maumau added 7 commits September 9, 2024 13:37

add time_keeper

baa6655

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

add timekeeper option

645c52c

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

add autoware_universe_utils

27e6f08

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

fix topic name

838f555

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

add scope and timekeeper

1778386

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

remove debug code

a8c7b35

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

remove some timekeeper and mod block comment

a19b5e5

Signed-off-by: a-maumau <maumaumaumaumaumaumaumaumaumau@gmail.com>

Shin-kyoto requested changes Sep 9, 2024

View reviewed changes

technolojin force-pushed the mau/feat/perception/ground_segmentation branch from d8fd91a to a19b5e5 Compare September 9, 2024 05:41

technolojin requested a review from Shin-kyoto September 9, 2024 05:41

technolojin enabled auto-merge (squash) September 9, 2024 05:43

Shin-kyoto approved these changes Sep 10, 2024

View reviewed changes

technolojin merged commit dabeaa6 into autowarefoundation:main Sep 10, 2024
30 of 32 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(ground_segmentation): add time_keeper #8585

feat(ground_segmentation): add time_keeper #8585

a-maumau commented Aug 22, 2024 •

edited

Loading

github-actions bot commented Aug 22, 2024 •

edited

Loading

badai-nguyen commented Aug 27, 2024 •

edited

Loading

technolojin commented Aug 27, 2024

a-maumau commented Aug 29, 2024 •

edited

Loading

a-maumau commented Aug 29, 2024

badai-nguyen commented Aug 29, 2024

a-maumau commented Aug 29, 2024

technolojin left a comment

a-maumau commented Aug 30, 2024

technolojin left a comment •

edited

Loading

codecov bot commented Aug 30, 2024 •

edited

Loading

a-maumau commented Sep 2, 2024

technolojin commented Sep 2, 2024

YoshiRi left a comment

Shin-kyoto left a comment

a-maumau commented Sep 9, 2024

technolojin commented Sep 9, 2024

feat(ground_segmentation): add time_keeper #8585

feat(ground_segmentation): add time_keeper #8585

Conversation

a-maumau commented Aug 22, 2024 • edited Loading

Description

Related links

How was this PR tested?

Notes for reviewers

Interface changes

Effects on system behavior

github-actions bot commented Aug 22, 2024 • edited Loading

badai-nguyen commented Aug 27, 2024 • edited Loading

technolojin commented Aug 27, 2024

a-maumau commented Aug 29, 2024 • edited Loading

a-maumau commented Aug 29, 2024

badai-nguyen commented Aug 29, 2024

a-maumau commented Aug 29, 2024

technolojin left a comment

Choose a reason for hiding this comment

a-maumau commented Aug 30, 2024

technolojin left a comment • edited Loading

Choose a reason for hiding this comment

codecov bot commented Aug 30, 2024 • edited Loading

Codecov Report

a-maumau commented Sep 2, 2024

technolojin commented Sep 2, 2024

YoshiRi left a comment

Choose a reason for hiding this comment

Shin-kyoto left a comment

Choose a reason for hiding this comment

a-maumau commented Sep 9, 2024

technolojin commented Sep 9, 2024

a-maumau commented Aug 22, 2024 •

edited

Loading

github-actions bot commented Aug 22, 2024 •

edited

Loading

badai-nguyen commented Aug 27, 2024 •

edited

Loading

a-maumau commented Aug 29, 2024 •

edited

Loading

technolojin left a comment •

edited

Loading

codecov bot commented Aug 30, 2024 •

edited

Loading