Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gateware/eem: Force IOB=TRUE on Urukul SYNC output #1382

Merged
merged 1 commit into from
Nov 5, 2019

Conversation

dnadlinger
Copy link
Collaborator

Without this, the final register in the SYNC signal TTLClockGen
isn't (always) placed in the I/O tile, leading to more jitter
than necessary, and causing "double window" artefacts. See
sinara-hw/Urukul#16 for more details.

(Patch based on work by Weida Zhang, testing by various members
of the community in Oxford and elsewhere.)


@jordens: Over on the Urukul issue, you mentioned:

We generally expect the toolchain to pack these registers. I'd be surprised if it didn't here. If it didn't we should find out and check all similar cases (e.g. TTL I/O or SPI clocks) where we assume packing.

Given that this definitely changes behaviour on the Urukul SYNC clock, are there any other signals we should pin like this right away?

@sbourdeauducq
Copy link
Member

You can use this feature of Python to avoid all the , () in the code:

>>> a, b, *c = (1, 2, 3)
>>> a
1
>>> b
2
>>> c
[3]
>>> a, b, *c = (1, 2)
>>> a
1
>>> b
2
>>> c
[]

Without this, the final register in the SYNC signal TTLClockGen
isn't (always) placed in the I/O tile, leading to more jitter
than necessary, and causing "double window" artefacts. See
sinara-hw/Urukul#16 for more details.

(Patch based on work by Weida Zhang, testing by various members
of the community in Oxford and elsewhere.)
@dnadlinger
Copy link
Collaborator Author

Ah, didn't realise it worked for empty lists – such is polyglot life. Thanks!

@jordens
Copy link
Member

jordens commented Oct 30, 2019

As I said you should check whether the assumption is flawed. If it is then we probably need this everywhere: ttls, frequency generators, spi2.

@hartytp
Copy link
Collaborator

hartytp commented Oct 30, 2019

I believe that what @WeiDaZhang looked at the floor plan for the current artiq master and confirmed that the clock gen was not in the IOB without this pr.

I’m not aware of any Xilinx literature saying that the final FF for a registered output will be in the IOB unless one asks for it, so I don’t think this should be a surprise.

@hartytp
Copy link
Collaborator

hartytp commented Oct 30, 2019

@dnadlinger why not apply the IOB attribute directly to the phy?

@jordens
Copy link
Member

jordens commented Oct 30, 2019

@hartytp That's a misnegation. Assuming you want to say the opposite, that there is no evidence Xilinx packs registers into IOBs by default:

That's incorrect. By default ISE does try to pack registers into IOBs and that is well specified in the literature (IOB=AUTO) and is typically enforced further by defaulting to IOB=TRUE globally via options. And about Vivado the only difference to ISE is claimed to be the lack of IOB=FORCE (which is irrelevant here). Also the general experience is that Vivado typically packs registers.

Failing here is indeed a surprise and it is crucial to figure out why it didn't pack here because it makes a lot of other code (TTLs, other frequency generators, spi2, suservo) questionable.
That is one big open issue and the other is why this wasn't seen when measuring the jitter.
It also doesn't make sense to apply this exclusively to the Urukul instance of the frequency generator.

@whitequark
Copy link
Contributor

@jordens In nMigen an instance of FDCE was not packed into the IOB even with (*IOB=TRUE*) attribute set unless (*DONT_TOUCH=TRUE*) was also set, according to @jfng who discovered this. That could be related.

@jordens
Copy link
Member

jordens commented Oct 30, 2019

@whitequark I think that may have been due to the attribute being either on the wrong entity (it only works on cells, not nets/pins IIRC) or lost when nets/ports/pins are renamed/merged (which is worked around with DONT_TOUCH/KEEP). Vivado treats failure to honor IOB=TRUE as an error according to the documentation.

@whitequark
Copy link
Contributor

@jordens Makes sense, thanks. (I've opened an issue).

@hartytp
Copy link
Collaborator

hartytp commented Oct 30, 2019

@jordens let's check we're on the same page. As I understand it, the data we have are:

  1. @WeiDaZhang has looked at the floor plan for the current artiq master and verified that the sync generator does not always compile with an IOB FF.
  2. One generally expects significantly higher jitter levels for registered outputs which don't use an IOB FF, than for those that do pack to the IOB.
  3. Looking at our Kasli/Urukul setup, @WeiDaZhang has found varying levels of jitter on the SYNC eye scans for different Urukuls. Permuting cables and EEM slots, the jitter tracks the EEM slot on Kasli, not the Urukul or the cabling.
  4. Forcing the SYNC gen FF into the IOB (either using an ODDR or 'IOB=TRUE') reliably gives good SYNC eye scans on all slots with all the hardware we tested on
  5. With the current ARTIQ master we see a high level of SYNC jitter when using the FPGA as the SYNC source. All 4 DDSs show low SYNC jitter when the SYNC source is DDS0, while they show high jitter when the source is the FPGA. This is reproducible when the only change is flipping the mux control line on Urukul. These results don't seem to depend on the length of the ribbon cable, etc which suggests it's not some ground bounce issue or something that depends on the phase of the SYNC source relative to the local DDS clocks
  6. We have some scope data that suggests there is no difference in the jitter of the FPGA sync source and the DDS0 sync source.

Observations:

  • I find it hard to construct a physically plausible model that reconciles (5) with (6). Adding @WeiDaZhang's new data, it seems overwhelmingly likely to me that there was an issue with the measurements in (6) -- we're talking about relatively low absolute levels of jitter, which are quite easy to miss. It would be nice to re-take that data to dot the is and cross the ts, but I don't think there is anyone with the kit to take that data who will have time to take the data any time soon.
  • I believe that we all agree that (1) is a bug that needs to be fixed regardless of the previous point. If you're sceptical about the observation, it should be quick to independently confirm it.
  • As far as we can tell, any fix to (1) also fixes the Urukul SYNC jitter issue. The most plausible explanation I can see for this is that the SYNC jitter issue is just FPGA jitter. It would be useful to have independent confirmation of this from someone else to give us more statistics for that claim.

Assuming we agree on the above, there is a separate issue as to why Vivado's actual behaviour does not match your expectations/the documentation.

@jordens
Copy link
Member

jordens commented Oct 30, 2019

None of this is in question. Urukul SYNC must be a IOB register. But additionally and regarding the PR:

  • Is this a reliable way to fix it? An attribute on a Migen net/register tends not to be reliable (gateware/eem: Force IOB=TRUE on Urukul SYNC output #1382 (comment)).
  • Is it a sufficiently complete and consistent way to address the problem or does it address one symptom? Does the same problem exist elsewhere? It seems plausible that this is a significant source of (simple) TTL jitter (isolation jumpers sinara-hw/DIO_BNC#7 and references) for example. And fixing it comprehensively may well make spi phys or suservo more reliable over longer cables.

I don't think Vivado's behavior necessarily violates the documentation. Not all registers can be packed into IOBs (automatically or forced). The typical cases where they can't are known.
I'm just reevaluating the assumptions we've made in the past to avoid being surprised by this again.

@dnadlinger
Copy link
Collaborator Author

Regarding this PR, I don't (want to) know enough about the internal quirks of Migen to know what the most appropriate place to put the attribute would be. Both this and explicitly inserting a ODDR special work, as tested by @WeiDaZhang.

As for what assumptions were made in other parts of the code (simple TTL outputs, SPI, …), I was hoping that this would be more obvious to the people who originally wrote that code. It does seem like there could be extra jitter in unexpected places right now, but then, the other signals typically also aren't sampled by a clock 8 times the frequency.

Xiliinx AR# 66668 does suggest that IOB=TRUE is a hard constraint, by the way, and briefly mentions DONT_TOUCH: https://www.xilinx.com/support/answers/66668.html

@dnadlinger
Copy link
Collaborator Author

dnadlinger commented Oct 30, 2019

I briefly checked the build log for a gateware that does exhibit the extra jitter, and didn't see any obvious warnings mentioning "IOB", "packed" or the name of the sync signal in the Verilog source.

(This, of course, might or might not imply that Vivado considers not packing the register into the I/O tile a normal occurrence.)

@jordens
Copy link
Member

jordens commented Oct 31, 2019

I don't see anybody interested in digging deeper and address the issues mentioned. Go ahead and merge it.

@hartytp
Copy link
Collaborator

hartytp commented Oct 31, 2019

I’ll see if Weida has time to look into the issues you mentioned when he’s back from holiday. Personally I don’t think that should stop us from merging this now.

@sbourdeauducq
Copy link
Member

I don’t think that should stop us from merging this now.

The code for a proper solution may be very different.

@dnadlinger
Copy link
Collaborator Author

dnadlinger commented Nov 1, 2019

I don't see anybody interested in digging deeper and address the issues mentioned.

Regarding Migen, if there are reliability issues with putting the attribute on the pin signal using Misc, I'd hope that it would be trivial for its primary developer to suggest a solution that is more robust (or at least far more productive than me stabbing in the dark).

The code for a proper solution may be very different.

In what way? Do you mean if we end up globally applying IOB=TRUE instead? Instantiating an ODDR special explicitly might be more "proper" if there are Migen codegen issues, but only works on 7-series targets.

@sbourdeauducq
Copy link
Member

I'd hope that it would be trivial for its primary developer to suggest a solution

It is not trivial.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants