Grab timed out at random #752

Open
kesaroid opened this issue May 13, 2024 · 7 comments
@kesaroid

kesaroid commented May 13, 2024

Describe what you want to implement and what the issue & the steps to reproduce it are:

We have set up 24 a2A1920-51gcPRO GigE cameras. Increasing the file descriptor limit made the pylon Viewer more stable, and the cameras no longer disappear from Devices as often. All of the cameras together take up less than 5GBps, so bandwidth does not seem to be the issue.
However, a few hours after the cameras have been initialized and are running successfully, the following error occurs.

Initialization:

from pypylon import pylon

# camera_ip is the IP address of the target camera, as a string
tlf = pylon.TlFactory.GetInstance()
tlf.EnumerateDevices()
cam_info = pylon.DeviceInfo()
cam_info.SetIpAddress(camera_ip)
camera = pylon.InstantCamera(tlf.CreateDevice(cam_info))

Frame capture:

pylon_grab_exception = pylon.TimeoutHandling_ThrowException
try:
    grab_result_front = camera_front.RetrieveResult(1000, pylon_grab_exception)
except Exception as e:
    print(e)

Error:

_genicam.TimeoutException: Grab timed out. Possible reasons are: The image transport from the camera device is not working properly, e.g., all GigE network packets for streaming are dropped; The camera uses explicit triggering (see TriggerSelector for more information) and has not been triggered; Single frame acquisition mode is used and one frame has already been acquired; The acquisition has not been started or has been stopped. : TimeoutException thrown (file 'InstantCameraImpl.h', line 1036)

The error happens at random, and every time it does, we just reinitialize the cameras and it works fine. We want to stop the error from occurring in the first place.

Is your camera operational in Basler pylon viewer on your platform

Yes

Hardware setup & camera model(s) used

CPU architecture: X86_64
Operating System: Ubuntu 22.04 Update 3
RAM: 128gb

Interfaces used to connect the cameras:
NIC: Broadcom 57504 quad NIC 10GBE
Switch: Netgear M4300-28G-POE+
Cable types/lengths: Combination of CAT6a, fiber/25 meters

Runtime information:

python: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
platform: linux/x86_64/5.15.0-102-generic
pypylon: 3.0.1 / 7.2.1

@bobby-burns

I don't know if you are still having this issue. But correct me if I am wrong, it sounds from your post like the connection to the cameras isn't completely stable. If you drop the connection for more than a second, RetrieveResult will time out and throw an error. You can try increasing the timeout to compensate.

@cpt-wojtech

Hi kesaroid,
if you have a long exposure time or a frame interval close to the 1000 ms timeout, I would also recommend increasing the timeout.
In general, you seem to have connection problems (disappearing cameras), so I would first configure your network adapters by following this guide:
https://docs.baslerweb.com/network-configuration-(gige-cameras)#changing-the-network-adapter-properties-linux

or let the pylon GigE Configurator do the work for you:
https://docs.baslerweb.com/overview-of-the-pylon-gige-configurator

With this many cameras connected to one switch, please use the Bandwidth Manager in the pylon Viewer to set up the right frame transmission delay and inter-packet delay.

If the issue then still persists, could you tell how the camera behaves once this timeout happens? Does the camera stop the streaming and you get the timeout all the time? Why do you need to reinitialize the cameras?

@kesaroid
Author

I don't know if you are still having this issue. But correct me if I am wrong, it sounds from your post like the connection to the cameras isn't completely stable. If you drop the connection for more than a second, RetrieveResult will time out and throw an error. You can try increasing the timeout to compensate.

I understand your suggestion to increase the RetrieveResult timeout, but since we want to capture every frame at a fixed frame rate, isn't it better to reinitialize the camera? A longer timeout would only delay the moment at which we obtain a given frame. It may reduce the need for re-initialization, but the packets will be lost anyway.
The cameras are unstable, and ideally we want to fix the underlying issue, but we are not sure where it is coming from.

@kesaroid
Author

kesaroid commented May 21, 2024

@cpt-wojtech

if you have a long exposure time or a frame interval close to the 1000 ms timeout, I would also recommend increasing the timeout.

For the application we are developing, the frame rate is required to be 24 fps. So 1/24 s would be the ideal RetrieveResult timeout to make sure no frame is dropped.
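
For reference, the arithmetic behind that figure can be sketched as follows. RetrieveResult takes its timeout in whole milliseconds, and the timeout is only an upper bound on how long the call waits, so a value slightly above the frame period leaves room for network jitter. The helper names and the `margin` factor are hypothetical, chosen only for illustration.

```python
import math


def frame_period_ms(fps: float) -> float:
    """Time between two consecutive frames, in milliseconds."""
    return 1000.0 / fps


def retrieve_timeout_ms(fps: float, margin: float = 2.0) -> int:
    """Frame period scaled by a safety margin, rounded up to a whole millisecond.

    `margin` is an assumed tuning knob, not a value recommended by Basler.
    """
    return math.ceil(frame_period_ms(fps) * margin)


print(frame_period_ms(24.0))      # about 41.67 ms between frames at 24 fps
print(retrieve_timeout_ms(24.0))  # integer timeout that could be passed to RetrieveResult
```

With `margin=1.0` the timeout equals the frame period, so a frame that arrives even slightly late would already be reported as a timeout.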

With this many cameras connected to one switch, please use the Bandwidth Manager in the pylon Viewer to set up the right frame transmission delay and inter-packet delay.

We have identified the best transport layer settings to be the following

  • Packet Size: 1500
  • Inter-packet delay: 6912
  • Frame transmission delay: Increments of 1518 for each subsequent camera
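
A minimal sketch of the staggered frame transmission delay described in the list above, assuming the first camera starts at a delay of 0 and every following camera adds one step of 1518. The function name is hypothetical, and how the values are written to each camera is left out because the thread does not show it.

```python
def frame_transmission_delays(num_cameras: int, step: int = 1518) -> list[int]:
    """Per-camera frame transmission delay, staggered by `step` per camera.

    Assumption: camera 0 gets delay 0, camera 1 gets `step`, and so on.
    """
    return [index * step for index in range(num_cameras)]


delays = frame_transmission_delays(24)
print(delays[:3])   # delays for the first three cameras
print(delays[-1])   # delay for the 24th camera
```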

If the issue then still persists, could you tell how the camera behaves once this timeout happens?

Once a frame is not retrieved, the camera basically disappears, so there is no point in trying to get the frame again. We directly reinitialize the camera and wait until it is back online. The frames are retrieved with the following snippet; IsGrabbing() returns True, yet RetrieveResult fails:

while camera_front.IsGrabbing() and camera_side.IsGrabbing():
    acquisition_date_time = datetime.now()

    # Both results are released automatically when the with-blocks exit.
    with camera_front.RetrieveResult(frame_timeout, pylon_grab_exception) as grab_result_front:
        with camera_side.RetrieveResult(frame_timeout, pylon_grab_exception) as grab_result_side:
            front_grab_succeeded = grab_result_front.GrabSucceeded()
            side_grab_succeeded = grab_result_side.GrabSucceeded()

            if front_grab_succeeded and side_grab_succeeded:
                ...  # frame processing was not included in the post

Does the camera stop the streaming and you get the timeout all the time?

No, it happens randomly. It happens once every hour or every couple of hours. It either happens during the time of initializing the camera, in which case we keep retrying, or happens randomly after a few hours, in which case we close the camera, and reinitialize.
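
The "keep retrying" step described above could be sketched as a small retry helper with exponential backoff. This is only an illustration under assumptions: `retry` and its parameters are hypothetical, and `operation` stands for whatever callable closes and reinitializes the camera in the real application.

```python
import time


def retry(operation, attempts=5, delay_s=1.0, sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` is exhausted.

    Waits delay_s, 2*delay_s, 4*delay_s, ... between failed attempts and
    re-raises the last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as error:  # the thread also catches plain Exception
            last_error = error
            if attempt < attempts - 1:
                sleep(delay_s * (2 ** attempt))
    raise last_error
```

Logging each failure together with a timestamp and the camera's IP address would also make it easier to correlate the dropouts with switch or NIC events.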

Why do you need to reinitialize the cameras?

Otherwise the streaming won't work. Even though IsGrabbing returns True, every RetrieveResult call fails.

@bobby-burns

bobby-burns commented May 21, 2024

If you are using multiple cameras, definitely look into using the pylon.InstantCameraArray instead of trying to grab individual cameras. You can find out which camera took the picture in the array with grab_result.GetCameraContext().

If you would like your program not to crash on a RetrieveResult timeout, you can change the timeout handling to pylon.TimeoutHandling_Return instead of pylon.TimeoutHandling_ThrowException and handle the timeout some other way.

Otherwise, this seems like a hardware/networking problem, not a pypylon problem. This is unfortunately out of my expertise.

Also, the RetrieveResult timeout does not control the frame rate. That is set on the camera:

camera.AcquisitionFrameRateEnable.SetValue(True)
camera.AcquisitionFrameRate.SetValue(24.0)

The camera acquisition speed is independent of the RetrieveResult timeout.
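
A rough sketch of the pylon.TimeoutHandling_Return approach mentioned above. It assumes that on timeout RetrieveResult returns a result whose IsValid() is False instead of raising. `try_retrieve` is a hypothetical helper, not part of pypylon, and the import is guarded only so the sketch can be read and exercised without the library installed.

```python
try:
    from pypylon import pylon  # needed only when talking to real cameras
    TIMEOUT_RETURN = pylon.TimeoutHandling_Return
except ImportError:
    TIMEOUT_RETURN = None  # stand-in so the sketch runs without pypylon


def try_retrieve(camera, timeout_ms, timeout_handling=TIMEOUT_RETURN):
    """Return a successful grab result, or None on timeout or failed grab.

    Hypothetical helper. Assumes a timed-out result reports IsValid() == False.
    """
    result = camera.RetrieveResult(timeout_ms, timeout_handling)
    if not result.IsValid():
        return None  # timed out: nothing was delivered
    if not result.GrabSucceeded():
        result.Release()  # hand the buffer back before giving up on this frame
        return None
    return result  # the caller releases it when done
```

The caller can then count consecutive None returns per camera and decide when to reinitialize, rather than reacting to the first exception.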

@stbnps

stbnps commented May 23, 2024

If you are using multiple cameras, definitely look into using the pylon.InstantCameraArray instead of trying to grab individual cameras. You can find out which camera took the picture in the array with grab_result.GetCameraContext().

I think InstantCameraArray won't be very useful for our use case. If I understood the documentation right, when we have 2 cameras in the array, and one of them is not streaming frames (let's say it's because of networking issues), then InstantCameraArray.RetrieveResult would keep returning frames from just one of the two cameras right?

In our use case we want to retrieve frames from those two cameras continuously. If we acquire one frame from one of the two cameras, and we don't manage to retrieve a frame from the other camera within 1/frame rate seconds, then the frame from the first camera is of no use.
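
The pairing requirement described here can be sketched independently of pypylon. The example assumes each frame carries a timestamp on a clock shared by both cameras (for instance host arrival time in milliseconds), which the thread does not confirm. `pair_frames` and `max_skew` are hypothetical names.

```python
def pair_frames(front, side, max_skew):
    """Pair two sorted timestamp lists; frames without a partner are dropped.

    Two frames form a pair when their timestamps differ by at most `max_skew`.
    """
    pairs = []
    i = j = 0
    while i < len(front) and j < len(side):
        delta = front[i] - side[j]
        if abs(delta) <= max_skew:
            pairs.append((front[i], side[j]))
            i += 1
            j += 1
        elif delta < 0:
            i += 1  # front frame is too old to match any remaining side frame
        else:
            j += 1  # side frame is too old to match any remaining front frame
    return pairs


# The side camera missed the frame near t=42 ms, so that front frame is discarded.
print(pair_frames([0, 42, 84, 126], [1, 85, 127], max_skew=5))
```

Choosing `max_skew` well below one frame period (about 41.7 ms at 24 fps) keeps frames from adjacent frame slots from being paired by mistake.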

We may definitely be seeing a networking issue here, but we have already followed the docs to change the MTU and other network parameters without luck. We also tried 2 different switches.

By the way, when looking at CameraQuickTester results, the streaming test passes even if the PC didn't retrieve a few frames. Does that mean that, when streaming video, we may lose a frame now and then? If so, how often could this happen?

@cpt-wojtech

Because this is a network-related question and you are dealing with 24 cameras, I would recommend taking this issue to your local Basler support team:
https://www.baslerweb.com/en/support/contact/

5 participants