Grabber returns only 0 counts from ROIs #1369

Closed
JKiethe opened this issue Oct 8, 2019 · 23 comments

Comments

@JKiethe

JKiethe commented Oct 8, 2019

Bug Report

One-Line Summary

The grabber only detects 0 counts in a region of interest, even though the camera image shows non-zero counts.

Issue Details

Steps to Reproduce

  1. Connecting CameraLink output of Andor iXonUltra 897 to Grabber Base input
  2. Running the following experiment:
from artiq.experiment import *
from artiq.language import ns, us, ms, MHz

class FrameGrabberExample(EnvExperiment):

    def build(self):
        self.setattr_device("core")
        self.setattr_device("grabber0")
        self.setattr_device("ttl4")

    @kernel
    def run(self):
        rois = [[227, 237, 237, 247], [247, 237, 257, 247]]
        mask = 0
        self.core.reset()
        for i in range(len(rois)):
            x0 = rois[i][0]
            y0 = rois[i][1]
            x1 = rois[i][2]
            y1 = rois[i][3]
            mask |= 1 << i
            self.grabber0.setup_roi(i, x0, y0, x1, y1)
        n = [0] * len(rois)

        self.ttl4.pulse(10 * us)  # camera trigger
        delay(20 * ms)
        self.grabber0.gate_roi(mask)
        self.ttl4.pulse(10 * us)  # camera trigger

        self.grabber0.input_mu(n)

        self.core.break_realtime()
        self.grabber0.gate_roi(0)

        print("ROI sums:", n)
        print("ROI mask:", mask)

Expected Behavior

print("ROI sums:", n) displays the correct counts, i.e. the counts of the camera image, which are read out after the experiment. For the given ROIs and camera settings around 50000 counts per ROI are observed.
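To have a reference value for the comparison described above, the ROI sums can be computed on the host from an image read out via USB. The following is a minimal sketch, not part of ARTIQ: `roi_sum` is a hypothetical helper, the half-open bounds (x0 <= x < x1, y0 <= y < y1) are an assumption about the Grabber ROI convention, and the frame is synthetic data chosen only to make the arithmetic easy to follow.

```python
def roi_sum(image, x0, y0, x1, y1):
    """Sum pixel values with x0 <= x < x1 and y0 <= y < y1.

    `image` is a list of rows, indexed as image[y][x]. The half-open
    bounds are an assumption about the Grabber ROI convention. Parts of
    the ROI that lie outside the image contribute nothing.
    """
    total = 0
    for y in range(max(y0, 0), min(y1, len(image))):
        row = image[y]
        for x in range(max(x0, 0), min(x1, len(row))):
            total += row[x]
    return total


# Synthetic 512 x 512 frame with a constant 500 counts per pixel (made-up data).
frame = [[500] * 512 for _ in range(512)]

rois = [[227, 237, 237, 247], [247, 237, 257, 247]]
expected = [roi_sum(frame, *roi) for roi in rois]
print("expected ROI sums:", expected)

# A ROI entirely outside the image sums to zero.
outside = roi_sum(frame, 1, 5000, 10, 5010)
print("ROI outside the image:", outside)
```

With a real image in place of `frame`, the values in `expected` are what `input_mu` should return if the bounds convention assumed here is right.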

Actual (undesired) Behavior

Print out is ROI sums: [0, 0].

Your System

  • Operating System: Windows 10 Pro
  • ARTIQ version: 5.6938.4e77be05.beta
  • Version of the gateware and runtime loaded in the core device: 5.0.dev+567.g99e490f9
  • Hardware involved: Kasli v1.1, Grabber v1.1, Andor iXon Ultra897

Additional information:

The method grabber0.input_mu(n) actually writes 0s into the list n. This was tested by initializing n with non-zero values. The result printed was still 0.

This behavior is seemingly independent of the ROI coordinates. The same happens for the following ROIs as well:

  • rois = [[1, 1, 10, 10 ], [503, 503, 512, 512]]
  • rois = [[503, 503, 512, 512], [1, 5000, 10, 5010]]

I am not surprised the last one has only 0s, as the ROI edges are outside the actual image (512px X 512px). But the other ROIs should give some non-zero values, as far as I understand.

This result does not depend on external triggering of the camera, as done in the above example. Even using the video mode / internal triggering of the camera produces the same zero result.

Connecting to the Medium link input, instead of Base, blocks execution. This is probably because the Grabber does not get any input during the ROI gate time. (The same thing happens if the CameraLink option of the camera is off.)
This also indicates that grabber.input_mu(n) actually registers inputs (or at least an image), but for some reason the counts are 0.

Camera Link cable length is 5m.

@jordens
Member

jordens commented Oct 8, 2019

  • Check your camera firmware. There were bugs in versions prior to "10.181". We had to jump through some hoops to get that fixed and get the upgrade. (Maybe check with Christian @chanlists)
  • What's on the serial console? There should be some messages from Grabber.
  • What's the total sum of pixels (roi [0, 0, 4095, 4095] or something like that)?

@JKiethe
Author

JKiethe commented Nov 19, 2019

Sorry that I took a while to answer. I wanted to wait for the answer from Andor about the newest camera firmware before replying.

Here are the answers to your questions:

  • camera firmware: 6.17. This is also the newest version for the Andor iXon Ultra 897, according to Andor. For the Ultra 888 used by @chanlists, the correct firmware version is, as you stated, 10.181.
  • No errors are printed in the console by the grabber. When the experiment is run on debug logging level, the output is:
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm:connected to 10.0.16.141:1381
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:sending message: type=<Request.SystemInfo: 3>
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:receiving message: type=<Reply.SystemInfo: 2>
WARNING:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:Mismatch between gateware (5.0.dev+567.g99e490f9) and software (5.6938.4e77be05.beta) versions
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:sending message: type=<Request.LoadKernel: 5>
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:receiving message: type=<Reply.LoadCompleted: 5>
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:sending message: type=<Request.RunKernel: 6>
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:running kernel
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:receiving message: type=<Reply.RPCRequest: 10>
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:rpc service: [2]<built-in function print> ['ROI sums:', [0, 0]] {} -> b'n'
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:rpc service: 2 ['ROI sums:', [0, 0]] {} = None
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:sending message: type=<Request.RPCReply: 7>
INFO:worker(134,frame_grabber_example.py):print:ROI sums: [0, 0]
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:receiving message: type=<Reply.RPCRequest: 10>
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:rpc service: [2]<built-in function print> ['ROI mask:', 3] {} -> b'n'
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:rpc service: 2 ['ROI mask:', 3] {} = None
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:sending message: type=<Request.RPCReply: 7>
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:receiving message: type=<Reply.RPCRequest: 10>
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:rpc service: [1]<function Core.run.<locals>.set_result at 0x000001538AFF6D08> (async) [None] {} -> b'n'
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:receiving message: type=<Reply.KernelFinished: 7>
INFO:worker(134,frame_grabber_example.py):print:ROI mask: 3
DEBUG:worker(134,frame_grabber_example.py):artiq.coredevice.comm_kernel:disconnected

It also does not contain any messages from the grabber, AFAICS.

  • Total sum of all pixels ( roi [1, 1, 512, 512] for this camera type) is also 0. Corresponding RTIO log (channel 24 is the grabber base channel, channel 4 is a TTL used as a trigger):
Log channel: 50
DDS one-hot: True
OutputMessage(channel=24, timestamp=74455600713944, rtio_counter=74455600593720, address=0, data=1)
OutputMessage(channel=24, timestamp=74455600713952, rtio_counter=74455600595960, address=1, data=1)
OutputMessage(channel=24, timestamp=74455600713960, rtio_counter=74455600596616, address=2, data=512)
OutputMessage(channel=24, timestamp=74455600713968, rtio_counter=74455600597456, address=3, data=512)
OutputMessage(channel=4, timestamp=74455600713976, rtio_counter=74455600600272, address=0, data=1)
OutputMessage(channel=4, timestamp=74455600723976, rtio_counter=74455600601240, address=0, data=0)
OutputMessage(channel=25, timestamp=74455620723976, rtio_counter=74455600601984, address=0, data=1)
OutputMessage(channel=4, timestamp=74455620723976, rtio_counter=74455600602392, address=0, data=1)
OutputMessage(channel=4, timestamp=74455620733976, rtio_counter=74455600603008, address=0, data=0)
InputMessage(channel=25, timestamp=0, rtio_counter=74455621191040, data=2147483648)
InputMessage(channel=25, timestamp=0, rtio_counter=74455621195128, data=0)
OutputMessage(channel=25, timestamp=74455621323368, rtio_counter=74455621200888, address=0, data=0)
StoppedMessage(rtio_counter=74462987151392)

The first input message for channel 25 is 2147483648 (2^31, one more than the maximum of a signed 32-bit integer), while the second input contains the 0 counts returned at the end. The actual image, as read out via USB, only shows around 125,000,000 counts over the whole image.
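The log can be cross-checked against the experiment with a few lines of host-side Python. This is a sketch under stated assumptions: the parser is a hypothetical helper (not an ARTIQ tool), `LOG` holds a subset of lines copied from the log above, and reading the timestamp differences as 10 us and 20 ms assumes that one machine unit is 1 ns on this system.

```python
import re

# Subset of lines copied from the RTIO log above.
LOG = """
OutputMessage(channel=24, timestamp=74455600713944, rtio_counter=74455600593720, address=0, data=1)
OutputMessage(channel=24, timestamp=74455600713952, rtio_counter=74455600595960, address=1, data=1)
OutputMessage(channel=24, timestamp=74455600713960, rtio_counter=74455600596616, address=2, data=512)
OutputMessage(channel=24, timestamp=74455600713968, rtio_counter=74455600597456, address=3, data=512)
OutputMessage(channel=4, timestamp=74455600713976, rtio_counter=74455600600272, address=0, data=1)
OutputMessage(channel=4, timestamp=74455600723976, rtio_counter=74455600601240, address=0, data=0)
OutputMessage(channel=4, timestamp=74455620723976, rtio_counter=74455600602392, address=0, data=1)
InputMessage(channel=25, timestamp=0, rtio_counter=74455621191040, data=2147483648)
InputMessage(channel=25, timestamp=0, rtio_counter=74455621195128, data=0)
"""


def parse(line):
    """Split 'OutputMessage(a=1, b=2)' into ('Output', {'a': 1, 'b': 2})."""
    kind, body = re.fullmatch(r"(\w+)Message\((.*)\)", line).groups()
    fields = {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", body)}
    return kind, fields


events = [parse(line) for line in LOG.strip().splitlines()]

# ROI registers written on channel 24, ordered by register address.
regs = {f["address"]: f["data"] for kind, f in events
        if kind == "Output" and f["channel"] == 24}
roi = [regs[address] for address in sorted(regs)]

# Trigger edges on channel 4: rising, falling, rising.
ttl = [f["timestamp"] for kind, f in events
       if kind == "Output" and f["channel"] == 4]
pulse_mu = ttl[1] - ttl[0]   # length of the first trigger pulse
gap_mu = ttl[2] - ttl[1]     # delay before the second trigger

# Values received on channel 25.
inputs = [f["data"] for kind, f in events if kind == "Input"]


def as_int32(value):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - (1 << 32) if value >= (1 << 31) else value


print("ROI registers:", roi)
print("pulse / gap in machine units:", pulse_mu, gap_mu)
print("inputs:", inputs, "first as int32:", as_int32(inputs[0]))
```

The register writes and the trigger timing match what the experiment requested, and the first input value is exactly 2^31, so it wraps to a negative number when stored in a signed 32-bit integer.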

Does anyone know how to check what the camera sends over the link output, other than by using a different frame grabber, e.g. from Bitflow?

@sbourdeauducq
Member

Does anyone know how to check, what the camera sends over the link output?

With Migen microscope. This is how it was initially developed:
https://github.com/quartiq/grabber

@jordens
Member

jordens commented Nov 19, 2019

Or just a scope on the clock and one of the data lanes. Either on grabber or directly on the cable. There are LVAL/FVAL/DVAL which will toggle on RX2. But if you look on any of the other three data pairs you should see bits. Your observation would mean flat zeros. The signal standard is LVDS.

[image: Camera Link serialization diagram]

https://en.wikipedia.org/wiki/File:Camera_link_serialization.jpg

@jordens
Member

jordens commented Nov 19, 2019

Also there should be grabber messages about the recovered pixel clock frequency during boot (of Kasli or the camera). Check whether those make sense. You may need to look at the actual serial console (the third USB interface on the Kasli USB connection is the serial port, 115200-8n1).

@jordens
Member

jordens commented Feb 19, 2020

ping @JKiethe
Is this resolved? If yes, how?

@JKiethe
Author

JKiethe commented Feb 24, 2020

Sorry for not answering for such a long time. It is not yet resolved, as I did not have time to investigate further.

I will work on it this week and get back to you in a few days.

@jkellerPTB

I've had a look at the LVDS signals on an analog scope. The camera was running in video mode, ca 120ms exposure, with only a sub-image read out. The shutter is closed, i.e. the pixel data are just noise.

There are three consecutive bits on X2 which I assume to be the L/F/DVAL data for now *

I see three different kinds of data, some with just DVAL high, some with DVAL and FVAL high and some with DVAL, FVAL and LVAL high. The latter two have random high values of the bits on X0:

RefCurve_2020-03-10_0_105831 Wfm
RefCurve_2020-03-10_1_110014 Wfm
RefCurve_2020-03-10_2_110247 Wfm

There are no clock cycles at all in which DVAL is low.

On the serial console, I see

[ 74894.749921s]  INFO(board_artiq::grabber): grabber0 locked: 39MHz
[ 74894.955919s]  INFO(board_artiq::grabber): grabber0 alignment success

when the cameralink is activated.

*They appear a bit early with respect to the clock, but it's hard to be sure with the limited timing resolution. I'm waiting to borrow an analog scope with 4x the sampling rate, as I don't have a quick way to convert LVDS to single-ended right now. If there is an easy way to check for such a misalignment on the FPGA, that would also help.
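To put the timing concern into numbers, the reported lock frequency can be pulled out of the console lines and converted to a bit period. This is a rough sketch: the regular expressions are written against the two lines quoted above only, and the factor 7 is the number of bits serialized per clock period on each Camera Link data pair.

```python
import re

# Console lines copied from above.
CONSOLE = """\
[ 74894.749921s]  INFO(board_artiq::grabber): grabber0 locked: 39MHz
[ 74894.955919s]  INFO(board_artiq::grabber): grabber0 alignment success
"""

LINE = re.compile(
    r"\[\s*(?P<t>[\d.]+)s\]\s+(?P<level>\w+)\((?P<module>[\w:]+)\):\s+(?P<msg>.*)"
)


def grabber_messages(text):
    """Return (time in s, message) for every line logged by the grabber module."""
    found = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match and match["module"].endswith("grabber"):
            found.append((float(match["t"]), match["msg"]))
    return found


messages = grabber_messages(CONSOLE)

# Extract the recovered pixel clock from the "locked" message.
lock_matches = (re.search(r"locked: (\d+)MHz", msg) for _, msg in messages)
clock_mhz = next(int(m.group(1)) for m in lock_matches if m)

BITS_PER_CLOCK = 7
bit_rate_mbps = clock_mhz * BITS_PER_CLOCK
bit_period_ns = 1e3 / bit_rate_mbps

print(f"pixel clock: {clock_mhz} MHz")
print(f"bit rate per pair: {bit_rate_mbps} Mbit/s")
print(f"bit period: {bit_period_ns:.2f} ns")
```

A bit period of a few nanoseconds means that a one-bit shift is a small offset on the scope, which fits the difficulty of judging it with limited timing resolution.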

@jordens
Member

jordens commented Mar 10, 2020

There isn't really an automatic way to fix clock-data-skew by entire bits. The data lanes are supposed to be aligned to the clock.
But thanks a lot for the data and the analysis! This is already helpful.

  • If the one bit shift is real, it would explain it. The question is whether it is and where it comes from. It is probably not that urgent to go for the more expensive scope. You might be limited by ringing and probe bandwidth. The DVAL bit nicely aligns with the inferred slopes that you have added. Just one bit early.
  • The DVAL always high also matches what I remember from the 888.
  • Does the 39 MHz (or maybe 40 due to rounding) match your ADC (horizontal, pixel) frequency?
  • There should also be a message on the console about grabber0 frame size: .. x .. whenever there is a frame with a new (including the first) frame size. Does that match? If not then the LVAL/FVAL synchronization doesn't work, maybe for the bit shift reason above.
  • unsafe { (csr::GRABBER[g].clk_sampled_read)() == 0b1100011 }
    contains the desired clock pattern. A shot in the blue would be to try 0b1000111 (or 0b1110001 if I messed up the LSB/MSB order deriving this) as the target alignment and see whether this changes anything. Only firmware compilation and flashing required, no gateware (there is even a way to load the firmware without flashing but I don't remember the commands). If this indeed fixes it, I would talk to Andor.
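The two alternative patterns are the nominal one rotated by a single bit in either direction, which is why they are the natural candidates for a one-bit skew. A short sketch to list all rotations of the 7-bit clock word follows; `rotl` is a helper written for this illustration, and which rotation direction corresponds to "early" on the wire depends on the bit order in the register, which is not settled here.

```python
WIDTH = 7  # bits per Camera Link clock period


def rotl(word, n, width=WIDTH):
    """Rotate a `width`-bit word left by n bits."""
    n %= width
    mask = (1 << width) - 1
    return ((word << n) | (word >> (width - n))) & mask


nominal = 0b1100011
rotations = [rotl(nominal, shift) for shift in range(WIDTH)]

for shift, word in enumerate(rotations):
    print(shift, format(word, "07b"))
```

A left rotation by one bit gives 0b1000111 and a left rotation by six bits (equivalently, right by one) gives 0b1110001. All seven rotations are distinct, so the sampled clock word identifies the shift unambiguously.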

@jkellerPTB

  • Does the 39 MHz (or maybe 40 due to rounding) match your ADC (horizontal, pixel) frequency?

The camera link clock rate is independent of any CCD readout settings. The signals on Xclk and the DVAL bit on X2 keep running even without any exposures (for completeness: the horizontal shift rate was 17MHz, the line shift time 0.5us).

  • There should also be a message on the console about grabber0 frame size: .. x .. whenever there is a frame with a new (including the first) frame size. Does that match? If not then the LVAL/FVAL synchronization doesn't work, maybe for the bit shift reason above.

I don't ever see that message. The only ones from the grabber are the two quoted above when the camera link is activated, and "lock lost" when it's turned off.

  • unsafe { (csr::GRABBER[g].clk_sampled_read)() == 0b1100011 }

    contains the desired clock pattern. A shot in the blue would be to try 0b1000111 (or 0b1110001 if I messed up the LSB/MSB order deriving this) as the target alignment and see whether this changes anything. Only firmware compilation and flashing required, no gateware (there is even a way to load the firmware without flashing but I don't remember the commands). If this indeed fixes it, I would talk to Andor.

This sounds like the kind of test I had in mind - using the already available data on the FPGA rather than setting up new hardware to analyze the LVDS signals. We currently have no environment set up for compiling things ourselves though, but I'll look into it in the next couple of days.

@jkellerPTB

Stretching the definition of "couple of days" a bit, but I have some good news:

Changing the expected clock pattern to 0b1000111 did it. The grabber now detects the correct frame size (and reports changes on the console) and outputs nonzero values for the ROI counts.

I will get in touch with Andor about this and keep you posted here.

@jkellerPTB

I'm afraid the success was rather short-lived. Today we discovered that it only works intermittently. I'll keep looking into it and share more details once I have a clearer description.

@jkellerPTB

Ok, this is awkward: The problem was solved by ordering a new MDR cable (and using the unmodified firmware, i.e. the originally expected clock pattern 1100011).

For completeness: The new cable is 3m long (vs. 5m for the one we had the issues with), but my guess would be that there was an issue with that specific cable / connectors rather than one with cable length.

I think this issue can be closed now.

@jordens
Member

jordens commented Aug 7, 2020

Thanks for hunting this down!
Do you have a part number for the working and broken cables?
I seem to remember that years ago the cables that came with these cameras looked "home built", i.e. had shrink tubing at the connector ends indicating cable-connector size mismatches and mechanical fragility.

@jordens jordens closed this as completed Aug 7, 2020
@jkellerPTB

Cable quality doesn't seem to be the issue; it was a professionally made cable (no heat shrink etc.).
After digging out the model number of the "old" cable (3M 14526-EZ8B-500-07C), I think the issue is with the wiring. While the pinout is correct, the individual LVDS pairs aren't on twisted pairs. XClk+ is on an individual wire, for example.

While looking into this, I realized that the working cable has a (differently) wrong wiring as well. I wish I had looked into this a bit earlier; seems like we got lucky. Since I would get a different one next time, I don't think stating the part number here makes sense.

For reference, the correct wiring can be found at http://www.volkerschatz.com/hardware/clink.html:
image

@dhslichter
Contributor

XClk+ is on an individual wire, for example.

Good grief! Wow, a frustrating situation but glad that the solution in the end turns out to be relatively simple. It is definitely terrifying to see the variety of "creative" cable manufacture that exists...

@chanlists

chanlists commented Aug 16, 2020 via email

@jbqubit
Contributor

jbqubit commented Aug 16, 2020

Thanks for tracking this down @jkellerPTB. I updated wiki with cable advice. https://github.com/sinara-hw/Grabber/wiki#overview

@JKiethe
Author

JKiethe commented Aug 24, 2021

I am not sure where the following information is best placed. It is relevant for users of the Andor iXon Ultra 897 and the Sinara Grabber, which is why I added it to this issue.

@jordens & @sbourdeauducq: If I should move/copy it somewhere so that it reaches the ARTIQ community better, please tell me.

Bug in FPGA of Andor Ixon Ultra 897

There is a bug in the CameraLink implementation of the Andor iXon Ultra 897 (i.e. the 512px x 512px variant), which occurs due to some error in the FPGA of that camera model. This leads to the following behaviour on the ARTIQ side:

  • Readout of the full image OR of a square sub-image leads to correct count values in the ROIs read by the Grabber. (A sub-image physically reads out less of the chip to increase the frame rate.)
  • BUT readout of a rectangular sub-image leads to some pixels in the sub-image being corrupted. These are read by the Grabber as having 0 counts. As far as we and Andor have seen, this concerns part of the lowest row in the sub-image, but there is no guarantee that this is consistent for each setup.

Andor is aware of this behavior and can reproduce it with a different frame grabber card. They suspect an error on the camera FPGA and are currently working on fixing this, but it might take some months to supply a patch.
For anyone working with the specific combination iXon Ultra 897 and Grabber, keep in mind that readout of non-square sub-images can lead to corrupted pixels on the Grabber side. To circumvent it, either

  • read the full chip every time (which for most experiments takes too long)
  • use a square sub-image (which can still take long, depending on what you want to image)
  • increase any rectangular sub-image by a few rows/columns to shift the corrupted pixels outside the camera chip region you are interested in. This needs to be tested for each sub-image size and position in my experience.
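For the second workaround (a square, or "quadratic", sub-image), the smallest enclosing square can be computed from the rectangle of interest. The helper below is a hypothetical sketch and not part of any camera SDK: the 1-based inclusive coordinates and the 512 pixel chip size are assumptions about the camera's sub-image convention, and the resulting readout should still be checked against the Grabber as described above.

```python
def square_subimage(hstart, hend, vstart, vend, chip=512):
    """Grow a rectangular sub-image to the smallest enclosing square.

    Coordinates are 1-based and inclusive (an assumption). The square is
    centred on the rectangle where possible and shifted inward when it
    would otherwise extend past the chip edge.
    """
    width = hend - hstart + 1
    height = vend - vstart + 1
    side = max(width, height)
    if side > chip:
        raise ValueError("sub-image is larger than the chip")

    def grow(start, size):
        extra = side - size
        new_start = start - extra // 2
        # Clamp so that the whole square stays on the chip.
        new_start = max(1, min(new_start, chip - side + 1))
        return new_start, new_start + side - 1

    h0, h1 = grow(hstart, width)
    v0, v1 = grow(vstart, height)
    return h0, h1, v0, v1


# A 100 x 20 pixel rectangle in the middle of the chip.
print(square_subimage(200, 299, 240, 259))
# The same width near the top and bottom edges of the chip.
print(square_subimage(1, 100, 1, 20))
print(square_subimage(413, 512, 500, 512))
```

The original rectangle is always contained in the returned square, so ROIs defined relative to the chip only need to be re-expressed relative to the new sub-image origin.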

Remark: For the iXon Ultra 888, i.e. the 1024px x 1024px model, this bug does not occur. The readout over the CameraLink works correctly independent of the sub-image dimensions.

@jordens
Member

jordens commented Aug 24, 2021

@JKiethe Thanks for all that valuable info!
I'm unsure where the best place for it is. Putting it here is certainly not a bad choice.

@airwoodix
Contributor

@JKiethe thanks for the summary on that bug! Did Andor release a fix by now?

@JKiethe
Author

JKiethe commented May 3, 2024

@airwoodix Yes, they did fix it. Sorry for not updating this thread when they informed us.

In the firmware with FPGA version 6.30 (and, I would assume, later versions) the bug has been fixed for the iXon Ultra 897.
Andor sent us the necessary files to update our camera firmware directly. On their website I have not seen any indication of a firmware update. It is probably best to contact them or your local distributor for this update.

@airwoodix
Contributor

Awesome! Thanks for the quick feedback.


8 participants