Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

overdue a lot after using 2 d435 for a long time on Android #4066

Closed
guokiddo1 opened this issue May 24, 2019 · 12 comments
Closed

overdue a lot after using 2 d435 for a long time on Android #4066

guokiddo1 opened this issue May 24, 2019 · 12 comments
Assignees
Labels

Comments

@guokiddo1
Copy link

guokiddo1 commented May 24, 2019

  • Before opening a new issue, we wanted to provide you with some useful suggestions (Click "Preview" above for a better view):

  • All users are welcomed to report bugs, ask questions, suggest or request enhancements and generally feel free to open new issue, even if they haven't followed any of the suggestions above :)


Required Info
Camera Model D435
Firmware Version 05.11.01.100
Operating System & Version Android
Kernel Version (Linux Only)
Platform RK3399
SDK Version 2.18.1
Language c++
Segment Robot

Issue Description###

<Describe your issue / question / feature request / etc..>
Hi. Our team using 2 d435 on RK3399.

1.We build librealsense2.so from Android Studios.
2.And use it in a binary(name: RGBD_Server) which only get the streams from the 2 RGBD and send out through socket.
3.adb connect to the RK3399. Push the binary and the libs to it. And run the binary through cmd.
./RGBD_Server

Configs of 2 d435:
W x H = 848 x 480
fps = 15
preset = HIGH_ACCURACY
streams = depth & left IR


TEST:
At first 2 RGBD works fine. But we notice that after about 4 hours(more or less), one of the RGBD become very unstable.
So we change the log level to DEBUG and open the Android kernal log test it again.

RESULT:
kernal log: libuvc show no ERROR
librealsense log:

  1. From the first frame, every frame will come out a " overdue " log.
  2. Repeat receiving frame 0 of every stream. And will come with a overdue log.
    the frequency will be very high!
    3.After about 6 hours, one of the RGBD becomes unstable. And soon(maybe about 10 mins later) even lsusb cant find it.

the following logs contains the 3 phenomenon above.

23/05 11:49:53,527 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:516) fps 15 Infrared 109482 3012241.778000 Depth 109482 3012241.778000
23/05 11:49:53,527 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:529) (TS: )Infrared 109482 3012241.778000 Depth 109482 3012241.778000 fps 15 gap 66.666664 next_expected: 3012308.444664
23/05 11:49:53,527 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Infrared,0,DispatchedAt,1558612193527.291016
23/05 11:49:53,527 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Infrared#0] overdue. (Duration: 1558612193527.291016ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,527 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\proc\syncer-processing-block.cpp:28) SYNCED: Infrared 109482, 3012241.778000 Depth 109482, 3012241.778000
23/05 11:49:53,527 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:502) CallbackStarted,Infrared,109482,DispatchedAt,1558612193527.592041
23/05 11:49:53,527 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Infrared,109481,DispatchedAt,1558612193527.679932
23/05 11:49:53,527 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Infrared#109481] overdue. (Duration: 77.159912ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,527 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,109481,DispatchedAt,1558612193527.866943
23/05 11:49:53,527 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Depth#109481] overdue. (Duration: 69.266846ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,528 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:502) CallbackStarted,Depth,109482,DispatchedAt,1558612193528.048096
23/05 11:49:53,528 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,0,DispatchedAt,1558612193528.102051
23/05 11:49:53,529 DEBUG [516574672112] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Infrared,0,DispatchedAt,1558612193528.159912
23/05 11:49:53,533 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sensor.cpp:453) FrameAccepted,Depth,Counter,0,Index,0,BackEndTS,0.000000,SystemTime,1558612193533.135010 ,diff_ts[Sys-BE],1558612193533.135010,TS,3021743.526000,TS_Domain,Hardware Clock,last_frame_number,0,last_timestamp,0.000000
23/05 11:49:53,536 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:502) CallbackStarted,Depth,0,DispatchedAt,1558612193536.345947
23/05 11:49:53,537 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:143) DISPATCH (TS: )--> Depth 0 3021743.526000

23/05 11:49:53,537 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:551) clean inactive stream in (TS: )Depth Infrared Infrared Color
23/05 11:49:53,537 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:551) clean inactive stream in (TS: )Depth Infrared Infrared Color
23/05 11:49:53,537 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:551) clean inactive stream in (TS: )Depth Infrared Infrared Color
23/05 11:49:53,537 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:551) clean inactive stream in (TS: )Depth Infrared Infrared Color
23/05 11:49:53,538 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:143) DISPATCH (TS: (FN: I Depth I Infrared I Infrared ) I Color )--> Depth 0 3021743.526000

23/05 11:49:53,538 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:551) clean inactive stream in (TS: (FN: I Depth I Infrared I Infrared ) I Color )Depth Infrared Infrared
23/05 11:49:53,538 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:551) clean inactive stream in (TS: (FN: I Depth I Infrared I Infrared ) I Color )Depth Infrared Infrared
23/05 11:49:53,538 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:551) clean inactive stream in (TS: (FN: I Depth I Infrared I Infrared ) I Color )Depth Infrared Infrared
23/05 11:49:53,538 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:143) DISPATCH (FN: I Depth I Infrared I Infrared )--> Depth 0 3021743.526000

23/05 11:49:53,538 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:422) clean inactive stream in (FN: I Depth I Infrared I Infrared )Depth
23/05 11:49:53,538 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:422) clean inactive stream in (FN: I Depth I Infrared I Infrared )Infrared
23/05 11:49:53,539 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,109575,DispatchedAt,1558612193539.012939
23/05 11:49:53,539 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Depth#109575] overdue. (Duration: 2885.302002ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,539 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,109575,DispatchedAt,1558612193539.459961
23/05 11:49:53,539 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Depth#109575] overdue. (Duration: 2819.269043ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,539 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,109575,DispatchedAt,1558612193539.768066
23/05 11:49:53,539 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Depth#109575] overdue. (Duration: 2752.756104ms, FPS: 15, Max Duration: 62ms)

23/05 11:49:53,540 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,109575,DispatchedAt,1558612193540.008057
23/05 11:49:53,540 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Depth#109575] overdue. (Duration: 2686.362061ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,540 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,109575,DispatchedAt,1558612193540.247070
23/05 11:49:53,540 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Depth#109575] overdue. (Duration: 2618.258057ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,540 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,109575,DispatchedAt,1558612193540.547119
23/05 11:49:53,540 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Depth#109575] overdue. (Duration: 2552.906006ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,541 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,109575,DispatchedAt,1558612193540.950928
23/05 11:49:53,541 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Depth#109575] overdue. (Duration: 2486.604004ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,541 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,109575,DispatchedAt,1558612193541.164062
23/05 11:49:53,541 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Depth#109575] overdue. (Duration: 2420.248047ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,541 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,109575,DispatchedAt,1558612193541.375000
23/05 11:49:53,541 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Depth#109575] overdue. (Duration: 2353.334961ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,541 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:288) CallbackFinished,Depth,109575,DispatchedAt,1558612193541.733887
23/05 11:49:53,541 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\archive.cpp:295) Frame Callback [Depth#109575] overdue. (Duration: 2286.659912ms, FPS: 15, Max Duration: 62ms)
23/05 11:49:53,542 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:99) I Depth--> Depth 0, 3021743.526000

23/05 11:49:53,542 DEBUG [516695270640] (F:\Code\third_part\realsense-app\app\src\main\cpp\librealsense\src\sync.cpp:258) SYNC (FN: I Depth I Infrared I Infrared )--> Depth 0 3021743.526000


Please let me know if there's anything I can try to make this work.

Thanks in advance.

@guokiddo1
Copy link
Author

we update the librealsense sdk to 2.22.0.
It really increase the rate of success of opening the RGBD at first time.
But the unstable phenomenon is still showing.

1.Still can't see much different from the kernal log.
2.One strange thing is that after the freq of the frame become unstable.
The ls /dev/video only show video0 & video1
which the right result shoule be 6 videos.

Than we kill the process with out pulling out USB.
And we restart the programe. The programe can still run ! All stream are working fine.
The ls /dev/video still only show video0 & video1

@matkatz
Copy link
Contributor

matkatz commented May 28, 2019

Hi @guokiddo1
Are you using the android solution for unrooted devices
or the previous solution?
I have a possible solution for the frequency issue that seems to improve it- #4032 (more tests are required)
As for the USB disconnect, I also see this issue on my RK3399 setup and I steel don't have a root-cause/ solution for that.

@guokiddo1
Copy link
Author

guokiddo1 commented Jun 20, 2019

We do a lot of work to fix this.
Here are some result.

1.we open the devices not from Java(Android) but from JNI. We use libuvc to open the D435.
So I afaird the #4032 cant solve our problem.
2. Now time, the biggest problem is that the D435 may suddenly send no data randomly after 1~5hours.
First we dont know whether this is the hardware problem or software problem or sdk problem.
We print logs from Apk, SDK, BUS, kernel and we locate the bug.

We use tcpdump to get all the USB BUS data when running the process.
And when the bug shows, the devices return NULL!
the right len should be 814376 & 407336 (2 stream from on rgbd)

A41BFBF5-32D0-4c9b-87BE-BCEC77C2F7B5

We are pretty sure the hardware is still connected.
the lsusb result is correct and we use API: dev.hardware_reset() the devices can re-open.

But the problem is that if everytime we use hardware_reset(), it will cost about 30s to open 2 RGBD.
It could be fix only by reset one of the error endpoints. And could cost less time.(Just like#4032 do)

Now we are trying to fix the problem by logging the transfer->status from libuvc/stream.cpp first.
Check if libuvc could recognize the error-endpoint.
Than we will try to only reset the error endpoints through libuvc.

Do you have any more advices for this ?

@guokiddo1
Copy link
Author

guokiddo1 commented Jun 20, 2019

By the way, we are using sdk 2.23.0.
Use the default CMake not the android.cmake

@guokiddo1
Copy link
Author

We add logs (print transfer->status) in src/libuvc/stream.cpp when get _uvc_stream_callback

When one rgbd return nothing
the log shows:
21/06 07:20:12,779 DEBUG [535557055728] (stream.cpp:836) uvc_stream_callback; status: 1

status:1 refer LIBUSB_TRANSFER_ERROR
so we check src/libuvc/stream.cpp _uvc_stream_callback

the origion code like this:
case LIBUSB_TRANSFER_ERROR:
UVC_DEBUG("retrying transfer, status = %d", transfer->status);
break; }
when get LIBUSB_TRANSFER_ERROR; the origin code will do notiong.

We will try libusb_clear_halt to reset the end-point
case LIBUSB_TRANSFER_ERROR:
libusb_clear_halt
UVC_DEBUG("retrying transfer, status = %d", transfer->status);
break; }

@matkatz
Copy link
Contributor

matkatz commented Jun 23, 2019

Reset the endpoint (clear halt) might do the work.
Currently, on the usbhost backend, I'm resting the end point in two cases, overflow error and protocol error.
I didn't see transfer error yet but I'm not sure if libusb and usbhost are completely aligned.

@RealSenseCustomerSupport
Copy link
Collaborator


@guokiddo1 Any luck with libusb_clear_halt?

@guokiddo1
Copy link
Author

guokiddo1 commented Jul 2, 2019

for our new binary version.

  1. we add "libusb_clear_halt" when recieve LIBUSB_TRANSFER_ERROR

  2. for SDK build config:

    BUILD_WITH_OPENMP = false
    CMAKE_BUILD_TYPE = Release
    ENABLE_ZERO_COPY = true

    to reduce the CPU utilization


And some very strange things happends.
After open D435 3~5 hours. The USB 3.0 host of RK3399 is halt !!
All the devices connect to USB 3.0 disappear.( we see from lsusb)
We found this on several RK3399.

We are not sure wheather it is a bug of D435 or SDK or our RK3399.
But we guess maybe it belongs to SDK because the bug shows very high frequency
only after we update the SDK .
But it's also hard to imagine how a USB devices would crash the USB host.


So we are now very conflusing about our next step and
could only try some solution to locate the bug more accurately.

@matkatz
Copy link
Contributor

matkatz commented Jul 4, 2019

I also see this issue on my FIreFly boards (3399 and 3328) with the solution for un-rooted devices.
This issue was reproduced also on earlier versions of librealsense.

Configure more streams at higher resolutions accelerates the reproduction.
For example with [depth 1280x720x30 | color 1280x720x30] I was able to run more than 10 hours with no issues (until I needed to stop the test), while configuring [depth 1280x720x30 | color 1920x1080x30 | IR1 1280x720x30 | IR2 1280x720x30] ran for less than a minute before the device was disconnected.

When checking the dmesg log this is what is get:
[ 116.651896] xhci-hcd xhci-hcd.12.auto: xHCI host not responding to stop endpoint command.
[ 116.651921] xhci-hcd xhci-hcd.12.auto: Assuming host is dying, halting host.
[ 116.665990] usb 4-1: usbfs: USBDEVFS_CONTROL failed cmd ealsense.camera rqt 33 rq 1 len 1024 ret -110
[ 116.666922] xhci-hcd xhci-hcd.12.auto: HC died; cleaning up
[ 116.670018] usb 4-1: USB disconnect, device number 2

I was not able to reproduce it on pixel 2 or Samsung S8.
Will update once I have any progress with this problem.

@guokiddo1
Copy link
Author

We run it as root devices
and use SDK 2.23.0

@RealSenseCustomerSupport
Copy link
Collaborator


Tracked by DSO-13154. No progress yet.

@RealSenseCustomerSupport
Copy link
Collaborator


@guokiddo1 With your confirmation we will close this issue. Thanks for the feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants