Occasional crash in fastrtps #507

Closed
BaiCai1990 opened this issue Jan 29, 2021 · 7 comments
Labels
bug (Something isn't working), more-information-needed (Further information is required)

Comments

@BaiCai1990

BaiCai1990 commented Jan 29, 2021

Bug report

Required Info:

  • Operating System:
    Debian 10 buster
  • Installation type:
    from source
  • Version or commit hash:
  • DDS implementation:
    Fast-RTPS
  • Client library (if applicable):
    rclcpp

Steps to reproduce issue

backtrace:
#0  0x00007ffff2159704 in eprosima::fastrtps::rtps::StatefulReader::send_acknack(eprosima::fastrtps::rtps::WriterProxy const*, eprosima::fastrtps::rtps::RTPSMessageSenderInterface const&, bool) () from /opt/ros2_foxy/install/fastrtps/lib/libfastrtps.so.2
#1  0x00007ffff2156c7c in std::_Function_handler<bool (), eprosima::fastrtps::rtps::WriterProxy::WriterProxy(eprosima::fastrtps::rtps::StatefulReader*, eprosima::fastrtps::rtps::RemoteLocatorsAllocationAttributes const&, eprosima::fastrtps::ResourceLimitedContainerConfig const&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /opt/ros2_foxy/install/fastrtps/lib/libfastrtps.so.2
#2  0x00007ffff212a080 in eprosima::fastrtps::rtps::TimedEventImpl::trigger(std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >, std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1000000000l> > >) () from /opt/ros2_foxy/install/fastrtps/lib/libfastrtps.so.2
#3  0x00007ffff2128bbc in eprosima::fastrtps::rtps::ResourceEvent::do_timer_actions() ()
   from /opt/ros2_foxy/install/fastrtps/lib/libfastrtps.so.2
#4  0x00007ffff21290bb in eprosima::fastrtps::rtps::ResourceEvent::event_service() ()
   from /opt/ros2_foxy/install/fastrtps/lib/libfastrtps.so.2
#5  0x00007ffff67c5b2f in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff7e6ffa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#7  0x00007ffff64a34cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Expected behavior

The node runs normally without crashing.

Actual behavior

The node occasionally crashes.

Additional information

  // controller.cpp (key code)
  auto publish_interval_ms =
      std::chrono::milliseconds(static_cast<uint64_t>(1000.0 / 100));
  control_timer_ = ros_node_hdr_->create_wall_timer(publish_interval_ms, [=](){
    RunMain();
  });

void ControllerBridge::RunMain()
{
  // Get the current state
  GlobalMotionState current_state = motion_state_mechine_hdr_->GetCurrentMotionState();

  // Run the controller for the current state
  switch(current_state){
  case GlobalMotionState::IDLE:{
    idle_controller_hdr_->Init();
    idle_controller_hdr_->RunOnce();

    // switch to WALK status
    mutex_.lock();
    if(super_controller_hdr_->GetRemainRouteNum() > 0) {
      motion_state_mechine_hdr_->SignalWalk();
    }
    mutex_.unlock();

    data_sender_ptr_->PublishTaskStatus();
    break;
  }
  case GlobalMotionState::WALK:{
    // core control program
    mutex_.lock();
    LOG_INFO("walk begin");
    super_controller_hdr_->SetInput(msg_localizer_, msg_obs_info_, msg_axis_info_);
    if(super_controller_hdr_->RunOnce() && super_controller_hdr_->GetRemainRouteNum() <= 0){
      motion_state_mechine_hdr_->SignalWalkComplete();
      super_controller_hdr_->Init();                         
      LOG_INFO("task complete.");
    }
    mutex_.unlock();

    data_sender_ptr_->PublishTaskStatus();
    break;
  }
  case GlobalMotionState::MANUAL:{
    manual_controller_hdr_->RunOnce();

    // clear route list
    mutex_.lock();
    super_controller_hdr_->ClearRouteList();
    super_controller_hdr_->Init();
    mutex_.unlock();

    // reset task status
    TaskStateMotion::Instance()->ResetTaskStatus();

    data_sender_ptr_->PublishTaskStatus();
    break;
  }
  case GlobalMotionState::SEMI_AUTO:{
    semi_auto_controller_hdr_->RunOnce();
    // clear route list
    mutex_.lock();
    super_controller_hdr_->ClearRouteList();
    super_controller_hdr_->Init();
    mutex_.unlock();

    data_sender_ptr_->PublishTaskStatus();
    break;
  }
  case GlobalMotionState::ERROR:{
    error_controller_hdr_->RunOnce();
    data_sender_ptr_->PublishTaskStatus();
    break;
  }
  default:{
    LOG_ERROR("Invalid state.");
  }
  }
}


// main.cpp
int main(int argc, char** argv) {
  rclcpp::init(argc, argv);
  std::shared_ptr<rclcpp::Node> node = std::make_shared<rclcpp::Node>("controller");

   ......

   ControllerHdr controller_hdr = std::make_shared<Controller>(node);

  controller_hdr->RunMain();
  rclcpp::shutdown();
  return 0;
}
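A side note on the locking in RunMain() above, independent of the Fast-RTPS backtrace: the paired mutex_.lock() / mutex_.unlock() calls leave the mutex held if anything in between throws or returns early. The following is only a minimal sketch of the scoped-lock alternative using std::lock_guard. The RouteList class and its methods are hypothetical stand-ins, not part of the reported code, and nothing here is claimed to be the cause of the crash.

```cpp
#include <cassert>
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for the route bookkeeping guarded by mutex_ in the
// reported code; the names here are illustrative only.
class RouteList {
public:
  // Adds n routes. The guard releases the mutex when it goes out of scope.
  void Add(int n) {
    assert(n >= 0);
    std::lock_guard<std::mutex> lock(mutex_);
    remaining_ += n;
  }

  // Consumes one route. Throws if none are left; the guard still releases
  // the mutex on that path, which a manual unlock() would not do.
  void Consume() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (remaining_ <= 0) {
      throw std::runtime_error("no routes left");
    }
    --remaining_;
  }

  // Returns the number of routes still pending.
  int Remaining() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return remaining_;
  }

private:
  mutable std::mutex mutex_;  // mutable so that const readers can lock it
  int remaining_ = 0;
};
```

With this shape, a caller can catch the exception from Consume() and keep using the object, because the mutex was released when the guard was destroyed.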




@clalancette
Contributor

@BaiCai1990 Can you provide a reproducing example here? That is, how do you get into this situation?

@MiguelCompany @EduPonz This seems to be somewhere in the Fast-DDS stack. Could you take a look?

@BaiCai1990
Author

@clalancette I switched to Cyclone DDS, and the crash now occurs much less often.

@BaiCai1990
Author

@clalancette I have already provided the key code above.

@hidmic
Contributor

hidmic commented Feb 8, 2021

@BaiCai1990 I believe @clalancette is asking for an example we can actually compile and execute to reproduce. It's quite difficult to track down the issue otherwise.

hidmic added the bug and more-information-needed labels Feb 8, 2021
@hidmic
Contributor

hidmic commented Apr 8, 2021

@BaiCai1990 friendly ping

@BaiCai1990
Author

@hidmic I cannot provide the complete code because it is proprietary to my company, but I will provide a simplified version soon. Thanks for your attention.

@clalancette
Contributor

Since we cannot reproduce this and there is no example we can run, I'm going to close this out. Please feel free to reopen if you are still having problems and can provide an example showing the problem.
