Zynq Kasli #1

Closed
dtcallcock opened this issue Dec 2, 2018 · 74 comments

@dtcallcock
Member

A new Kasli based on the forthcoming v1.2 design, but with a Zynq-7000 series FPGA (2x ARM CPU) instead of an Artix-7 (no hard CPU).

This will allow the proposed Zynq version of ARTIQ to utilise the EEM hardware.

The proposed FPGA is the XC7Z030-3FFG676I. The reasons for choosing this FPGA (as laid out by @hartytp) are:

  • Very similar (same Kintex-7 fabric) to the XC7Z045 FFG900 -2 used on the Xilinx ZC706 Evaluation Kit that will likely be used for gateware development.
  • The CPUs on the 7Z030 go up to the maximum 1 GHz clock rate available on 7000-series Zynqs.
  • Fastest speed grade to ensure we can clock the AXI bus as fast as possible. The thought is that cost-sensitive users can stick with the Artix Kasli, and the main user base for Zynq Kasli will be people who don’t mind paying an extra $50 to get the fastest FPGA.
  • The design costs + risks will be much lower if we use a Zynq model that @gkasprow has used in previous designs. He's used this FPGA before.
  • It has enough pins to drive all 12 EEM connectors + WR oscillators

SDRAM would be routed to the ARM subsystem only. The gateware will still have DMA access based on tests carried out by @cjbe, where he found that:

I have tested DMA from PL to PS-SDRAM via the HP ports, and indeed it works without any problems, and one gets the advertised bandwidth.

I am working on getting this board funded and would appreciate any input.

Also, is 'Zynq Kasli' a confusing name? Do we need to dust off our Russian maps and come up with something more unique?

@gkasprow
Member

gkasprow commented Dec 3, 2018

Maybe this time a name that says something? Something like ZynTroller or ZynQer

@sbourdeauducq
Member

"Kazli" is probably too confusing. Maybe "Fastli"?

@gkasprow
Member

gkasprow commented Dec 3, 2018

or "Fastler" like fast controller

@hartytp

hartytp commented Dec 3, 2018

Personally, I think that names like "fastino" and "fastli" make an already somewhat confusing naming situation somewhat worse. Maybe let's try to come up with something that's not just a mutilation of "Kasli"?

@sbourdeauducq
Member

Then we can call it Kutashi, after a lake near Kasli :)

@gkasprow
Member

gkasprow commented Dec 3, 2018

It does not sound professional in Polish. Kutas = dick :)

@sbourdeauducq
Member

There is also Lake Kirety

@hartytp

hartytp commented Dec 3, 2018

My current priority list re naming:

  1. Don't get drawn into a lengthy argument about naming
  2. Don't build more hardware with Russian names many people find hard to pronounce and remember

@gkasprow
Member

gkasprow commented Dec 3, 2018

A few ideas from acronym creator
SECON Soc Eem CONTROLLER
SECCo Soc Eem COntroller
SCOT Soc COnTroller
SERAC Soc EuRocArd Carrier
SEACOT Soc EurocArd COnTroller
ZECCA Zynq EuroCard CArrier
ZECI Zynq Eem Carrier

@marmeladapk
Member

I would vote for Kasli Zynq (KasliZ for short). It's clear what this board is and it's similar naming scheme to AFC/AFCK/AFCZ.

@gkasprow
Member

gkasprow commented Dec 3, 2018

Let's talk about specification of the module.

  • we won't need RGMII and PHY, and will use MGT for Ethernet access as it is now in Kasli.

  • same quad USB will be used, one UART to PS, another to PL, JTAG and I2C

  • no USB host

  • no SD Card

  • power supply based on an Exar chip or on individual converters, though the latter take more space. (I have a working ZynQ design with Exar.)

  • If we don't use Exar, then current monitoring will be done using the ZynQ ADC. I also have a working project for this, but it's quite complex because it uses an external controller-sequencer. Only a few ZynQ ADC pins could be used because most of them are reserved for LVDS.

  • We can add EEM protection against damage caused by too high a voltage on the EEM connectors, but it will add some cost. Every LVDS pin needs a high-power, low-capacitance diode. Since the SoC is quite expensive, we can consider it. We would need 12x16 BAS52-02V diodes (0.11 EUR at 100 pcs). That is about 21 EUR just for the diodes and another 2..3 EUR for the protection circuit. The diodes have 5 pF (at 10 V) and an If of 750 mA. At 1 V the capacitance will be much larger. This should not interfere with usual LVDS signalling but may cause issues with Grabber.
    We can use similar diodes with lower If current and much lower capacitance.
    [image]

  • WR clocking using I2C oscillators only, no DACs + VCXOs

  • can we use PS GPIOs for SFP control? They can be driven only by software. What else can we use these pins for?

  • some banks use 1.8 V, some use 2.5 V. We have only 2 HP banks and 3 HR (11,13) banks. To have correct LVDS termination, we need to supply 2.5 V to HR and 1.8 V to HP.

  • we have 4 MGTs: 3 go to SFPs, one to SATA

  • one MGT bank clock goes to the clock distribution as in Kasli, one goes to a 125 MHz oscillator. The WR clock oscillator is connected to the input of the clock distribution network.

  • we will unify the way SFPs are controlled in Kasli and Kasli Zynq. I want to use I2C extenders with pull-ups/pull-downs on the control pins that make the SFPs work by default. Maybe some pins need to be controlled by PS GPIO, but I think it's more convenient to also have access from PL. In this way the Kasli core could be used in Kasli ZynQ without issues.
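
The diode budget quoted in the EEM-protection bullet above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the numbers from that bullet (12 EEM connectors, 16 protected pins each, 0.11 EUR per diode at 100 pcs); the variable names are made up for illustration.

```python
# Back-of-the-envelope check of the EEM protection diode cost.
# All numbers come from the bullet above; names are illustrative only.
EEM_CONNECTORS = 12
PINS_PER_EEM = 16          # one diode per LVDS pin, as proposed above
DIODE_PRICE_EUR = 0.11     # quoted price at 100 pcs

diode_count = EEM_CONNECTORS * PINS_PER_EEM
diode_cost_eur = diode_count * DIODE_PRICE_EUR

print(f"{diode_count} diodes, {diode_cost_eur:.2f} EUR")
```

This reproduces the roughly 21 EUR per board quoted for the diodes alone, before the extra 2..3 EUR for the protection circuit.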

@sbourdeauducq
Member

* we won't need RGMII and PHY, and will use MGT for Ethernet access as it is now in Kasli.

@hartytp wanted to use PS Ethernet for performance reasons (it is possible to have the same performance out of a fabric Ethernet core since Ethernet frames are long and things can be pipelined, but the interfacing with the ARM core is most likely annoying). And the ZC706 will use the PS Ethernet already.

@sbourdeauducq
Member

* can we use PS GPIOs for SFP control?

I think so. Same for low-speed I2C that controls LEDs and such.

@gkasprow
Member

gkasprow commented Dec 3, 2018

@sbourdeauducq I wouldn't use PS I2C. It has so many bugs that it is useless. We usually instantiate one in PL.
It does not matter whether you use MIO RGMII or EMIO GMII + PCS/PMA; the GEM is the same. But the second case requires the Xilinx PCS/PMA core, which instantiates an MGT. So an external PHY can actually make life far easier from the SW point of view.

@sbourdeauducq
Member

sbourdeauducq commented Dec 3, 2018

I wouldn't use PS I2C. It has so many bugs that it is useless.

Yeah sure, this is standard fare with Xilinx cores and I had no intention to use that. We can do bit-banged I2C on GPIO, unless that core is buggy too.
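
For readers unfamiliar with the term, bit-banged I2C just means toggling the SCL and SDA lines from software. Below is a minimal sketch of a single-byte write. It assumes a hypothetical `FakeGpio` class that only records pin levels; it does not touch the real Zynq PS GPIO registers, and it leaves out timing delays, clock stretching and reading back the ACK.

```python
# Hypothetical sketch of a bit-banged I2C write over two GPIO lines.
# FakeGpio is a stand-in for real pin access; it only records levels.

class FakeGpio:
    """Records every (scl, sda) state so the waveform can be inspected."""
    def __init__(self):
        self.scl = 1
        self.sda = 1
        self.trace = [(1, 1)]  # bus idles with both lines high

    def set(self, scl=None, sda=None):
        if scl is not None:
            self.scl = scl
        if sda is not None:
            self.sda = sda
        self.trace.append((self.scl, self.sda))


def i2c_start(gpio):
    # START: SDA falls while SCL is high, then SCL goes low.
    gpio.set(sda=0)
    gpio.set(scl=0)


def i2c_stop(gpio):
    # STOP: SDA rises while SCL is high.
    gpio.set(sda=0)
    gpio.set(scl=1)
    gpio.set(sda=1)


def i2c_write_byte(gpio, byte):
    # Shift out 8 bits, MSB first; SDA only changes while SCL is low.
    for bit in range(7, -1, -1):
        gpio.set(sda=(byte >> bit) & 1)
        gpio.set(scl=1)
        gpio.set(scl=0)
    # Ninth clock: release SDA so the target can pull it low to ACK.
    gpio.set(sda=1)
    gpio.set(scl=1)
    # A real driver would read SDA here to check for the ACK.
    gpio.set(scl=0)


def sampled_bits(trace):
    """Return the SDA value seen at each rising edge of SCL."""
    bits = []
    for (prev_scl, _), (scl, sda) in zip(trace, trace[1:]):
        if prev_scl == 0 and scl == 1:
            bits.append(sda)
    return bits


gpio = FakeGpio()
i2c_start(gpio)
i2c_write_byte(gpio, 0xA0)        # arbitrary example byte
bits = sampled_bits(gpio.trace)   # 8 data bits followed by the ACK slot
i2c_stop(gpio)
print(bits[:8], "ack slot:", bits[8])
```

With no target on the simulated bus, the ACK slot reads high (a NACK). On real hardware the `set` calls would become GPIO writes with open-drain behaviour and suitable delays.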

Are you saying that the PS can use a GT transceiver directly for Ethernet, without the Ethernet logic in the fabric?

@gkasprow
Member

gkasprow commented Dec 3, 2018

That is true for UltraScale+ ZynQ, where the GTR can be used that way. In the 7-series ZynQ you have to instantiate PCS/PMA logic between the EMIO GMII and the MGT.

@gkasprow
Member

gkasprow commented Dec 3, 2018

It should fit easily
[image]

@gkasprow
Member

gkasprow commented Dec 3, 2018

If we want to make life easier for @sbourdeauducq (no logic between GEM EMIO GMII and MGT) and use an external PHY chip that talks to the SFP directly, then this SFP would have to be used for Ethernet only, without DRTIO. We could also use a 1000Base-T connector and save a few bucks on the SFP cage and SFP-RJ45 transceiver.

@sbourdeauducq
Member

If the Xilinx stuff can give a GMII interface to the fabric without bugs and quirks, then using the GT solution is fine (except, of course, for the usual problems associated with the poor design of Xilinx transceivers, and in particular the GTP/GTX incompatibility - so it's not "free").

* We can add EEM protection against damage caused by too high voltage on EEM. But it will add some cost.

In addition to the signal degradation and component cost, this will also make the board layout and assembly more complex. Kasli systems are normally assembled in a controlled environment where ESD can be avoided, and then the metallic enclosure should protect against ESD. So I would not add those diodes.

@gkasprow
Member

gkasprow commented Dec 3, 2018

That's why I developed simple adapters that provide such protection during tests only. They are meant to avoid FPGA death in case an LVDS line is shorted to 3.3V. 2 Kasli died during tests.
And Tom asked if it makes sense to add it to all boards; in my opinion it doesn't.

@gkasprow
Member

gkasprow commented Dec 3, 2018

edit - LVDS shorted to 3.3V

@jordens
Member

jordens commented Dec 3, 2018

IMO we should consider the first steps in the transition towards the cPCI-serial backplane here and now. I.e. move all the parts into the 160 mm towards the panel and move the EEM connectors towards the back already now. Then going to the cPCI-serial backplane would amount to shortening the board and using different connectors.

@hartytp

hartytp commented Dec 3, 2018

And Tom asked if it makes sense to add to all boards, in my opinion it doesn't.

Don't bother with it.

@hartytp

hartytp commented Dec 3, 2018

IMO we should consider the first steps in the transition towards the cPCI-serial backplane here and now.

@jordens how would you want to use one of those backplanes? We've talked about this a few times, and I don't see any particularly good way of hooking Kasli/EEMs up to a BP. (The issue being how heterogeneous the EEMs are in terms of size, number of EEM connectors required etc). AFAICT, while a BP might work okay in some circumstances, in general the issues with a BP are at least as bad as the issues with using ribbon cable. Although, maybe I'm missing something?

@jordens
Member

jordens commented Dec 3, 2018

You have to bite the bullet and
a) make all cards the same length, but I am pretty sure that PCB area is not that costly and this is minor
b) figure out fixed panel widths (when doing custom backplanes) or live with 4 (?) HP (which would only affect DIO_BNC and Sampler right now). That might not be so bad and adding BNC multipin pigtails or moving to SMA could be done.
c) figure out how to deal with the double-EEM connector EEMs. Right now the double-EEMs would need to go to the two 8 lane PCIx slots and the single EEMs would go to the six 4 lane PCIx slots when using existing backplanes. But we should also consider redesigning these EEMs to single-EEM connector.

I don't think any of those are particularly bad or insurmountable constraints compared with the constraints of ribbon and coax cables. But this would likely be an all-or-nothing change (unless we figure out a way to build some adapter from the backplane to IDCs).
In any case, moving the components to the front and the IDC connectors to the back (even in two rows) sounds like a no brainer to me.

@hartytp

hartytp commented Dec 3, 2018

@jordens yes. FWIW I really like the current flexibility in the EEM system re board sizes/shapes and IO requirements. Losing some of that flexibility is likely to be one of the costs of moving to a fixed BP.

It's been a while since I thought about this, but IIRC the PCIx BP connectivity might not be ideal for us.

But, yes, none of the objections is insurmountable. All I'm saying is that -- as someone who was originally arguing that we should try to implement an EEM/Kasli BP -- having thought about this a bit, the tradeoffs/costs involved in the BP seem to me to outweigh the benefits. Having put together quite a few systems here, I actually don't think the ribbon cables are that bad. Others may disagree.

Anyway, let's not get sucked into a long discussion about this now. Before we have that discussion, someone would need to present a fully fleshed out proposal and I don't think there is the interest or time to do that at the moment.

In any case, moving the components to the front and the IDC connectors to the back (even in two rows) sounds like a no brainer to me.

Sure, if this doesn't cause issues with the routing/mechanics/SI/PI/thermal management/etc and if it doesn't consume an unreasonable amount of @marmeladapk's time then why not.

@hartytp

hartytp commented Dec 3, 2018

The only other thing I'd add is that if we really want a BP then we should at least consider moving some of these designs to AMCs. The racks, power supplies and cooling are good and relatively inexpensive for what they are.

I know that the experiences with Sayma have been a bit depressing so far, but I do not think that's a fundamental issue with uTCA, so much as a result of various aspects of the approach taken in that project.

@jordens
Member

jordens commented Dec 3, 2018

Ultimately the ribbon cable collection and the coax cables are a hack. A very pragmatic and convenient one in the beginning. And I think we chose wisely at that time.

Yes. I do like uTCA for what it achieves. But the racks, power supplies, and cooling are also good for cPCI. To break even with the uTCA complexity, much more powerful EEMs would be needed.

@gkasprow
Member

gkasprow commented Dec 3, 2018

cPCI-serial seems to fit our needs ideally.
The only drawback is the card count: the controller supports only 8 slots.
The controller slot has:

  • 8 x PCI Express
      • 6 x4 links
      • 2 x8 links (fat pipes)
      • 2 dedicated I²C high-speed buses for the fat pipes
  • Optional serial RapidIO
  • 8 x SATA/SAS
      • Supported by SGPIO bus (SFF-8485 specification) for hot swapping
  • 8 x USB 2.0
  • 8 x USB 3.0
  • 8 x Ethernet 10GBASE-T

which gives us another 8 LVDS links

so potentially we have 16 LVDS links between the controller and every module with a standard backplane. There is also a full-mesh 4x diff connection between every module.
The front panels are spaced by 4HP, so the mechanics would be the same.
The only problem is how we solve the issue with single and double EEM and 4 and 8 HP panels.
8 slots should not be a problem because in most cases there are boards that consume 2 EEMs.
Panel size should not be a big issue once we switch to SMA/SMB/SMC instead of BNC.
I see a few options:

  • assign dedicated slots with double EEM connectivity. In an 8-slot chassis, 4 slots would have 16 LVDS and 4 slots would have 8 LVDS each. This gives 12x8 LVDS, which is already supported by Kasli.
  • make own backplane where we can space slots by 4 and 8 HP and fit to 19" rack
  • add a few LVDS bus muxes to the controller so 4x8LVDS can be assigned to any slots that require double bandwidth.
  • combine all 3 together

@gkasprow
Member

gkasprow commented Dec 4, 2018

What we can do is to assign dual LVDS ports to slots 2, 4, 6, 8 by default.
Slots 1, 3 and 5, 7 would be connected with bus crosspoint switch to 4 FPGA EEM ports.
Such bus crosspoint switch can assign 2 EEMs to slot 1 or one EEM to slot 1 and one EEM to slot 3.
In this way the user can insert an 8 HP dual-EEM board into slot 4 without losing slot 3, because its signals will be redirected to port 1.
so slot assignment could look like this:
slot 1: single EEM + optional signals from slot 3
slot 2: dual EEM
slot 3: optional EEM
slot 4: dual EEM
slot 5: single EEM + optional signals from slot 7
slot 6: dual EEM
slot 7: optional EEM.

In this way we minimise the number of unused EEM slots. We can e.g. pack 6 Urukuls, or 2 Samplers and 4 Urukuls, and we don't lose a single EEM port.
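
The slot scheme above can be modelled in a few lines to check that the EEM port count always adds up to the 12 ports Kasli already supports. This is only an illustrative sketch of the proposal: the function and constant names are invented here, and the crosspoint switch is reduced to a choice between "split" (one EEM each) and "merged" (both EEMs routed to the lower-numbered slot).

```python
# Sketch of the slot/EEM assignment described above (names are illustrative).
DUAL_SLOTS = (2, 4, 6, 8)              # fixed dual-EEM slots
CROSSPOINT_PAIRS = ((1, 3), (5, 7))    # (primary, optional) slot pairs


def eem_ports_per_slot(merged_pairs=()):
    """Return {slot: number of EEM ports}.

    merged_pairs lists the primary slots (1 and/or 5) whose crosspoint
    switch routes both EEM ports to the primary slot, leaving the
    optional slot (3 or 7) without an EEM connection.
    """
    ports = {slot: 2 for slot in DUAL_SLOTS}
    for primary, optional in CROSSPOINT_PAIRS:
        if primary in merged_pairs:
            ports[primary], ports[optional] = 2, 0
        else:
            ports[primary], ports[optional] = 1, 1
    return dict(sorted(ports.items()))


default = eem_ports_per_slot()
six_dual = eem_ports_per_slot(merged_pairs=(1, 5))
print(default)
print(six_dual)
```

In the merged configuration six slots carry two EEM ports each, which corresponds to the "6 Urukuls" case; in both configurations the total stays at 12 EEM ports.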

Octal LVDS bus switches are cheap.

@dhslichter
Member

LVDS has still the same voltage levels in 1.8, 2.5 and 3.3V systems.

Yes, but the common-mode voltage will be different, and some devices (not all) are not happy to receive a common-mode voltage other than 1.25V, for example. The point is just that it's not necessarily straightforward.

@sbourdeauducq
Member

the common-mode voltage will be different

Really? AFAIK a LVDS transmitter is always supposed to use 1.2V as common-mode voltage.

@dhslichter
Member

LVDS is (in theory, in the standard) supposed to allow for variation in the common-mode voltage. Depending on what you are transmitting to/from, different ways of implementing the common-mode voltage are used, and most of them are mid-supply basically. Looking at the Zynq US+ dc characteristics, it seems they do use a 1.25 V nominal common mode for both, so that should be manageable.

@gkasprow
Member

True, some devices may have other common-mode voltage, but usually on their inputs. On the outputs it must be 1.25V, otherwise, it's not LVDS. But some LVDS drivers may exhibit non-standard behavior when both inputs are held low - both outputs get tied to 3.3V. But in such case, 2.5V supply won't help either.
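
To make the common-mode discussion concrete, here is a small worked calculation. The 1.25 V common mode is the figure quoted above; the 350 mV differential swing is a typical LVDS value assumed here for illustration, not taken from any datasheet in this thread.

```python
# Illustrative LVDS level calculation; the nominal values are assumptions
# based on typical LVDS figures, not on a specific device datasheet.
def lvds_levels(v_cm=1.25, v_od=0.35):
    """Return (v_high, v_low) of the two wires for a given
    common-mode voltage v_cm and differential swing v_od, in volts."""
    return v_cm + v_od / 2, v_cm - v_od / 2


def common_mode(v_p, v_n):
    """Common-mode voltage is the average of the two wire voltages."""
    return (v_p + v_n) / 2


v_high, v_low = lvds_levels()
print(f"{v_high:.3f} V / {v_low:.3f} V, Vcm = {common_mode(v_high, v_low):.3f} V")
```

With these nominal numbers both wires stay between roughly 1.0 V and 1.5 V. A receiver that expects a different common mode would see the same swing centred on the wrong voltage, which is the concern raised above.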

@gkasprow
Member

The specification of the controller is on the way.

@hartytp

hartytp commented Aug 12, 2019

@gkasprow are you suggesting that we should use that board as the main Zynq Kasli for Sinara? If so, can you remind me the reasons it would be better for us to use that board than to make a new design that's tailored to our specifications?

There are a few things that seem non-ideal to me about that:

  • Ultrascale FPGA will make it harder to port the (NIST-funded) ARTIQ-Zynq prototype to Kasli-Zynq
  • I think we'd want roughly the same FP as Kasli v2.0 (inc 4 SFPs etc). (Do we need an extra SFP for the internal Ethernet connection? Or, does that just go via one of the SFPs?). Would you be relying on putting all of that on the FMC? Seems unnecessarily cumbersome
  • Where would the EEM connectors go? Or, would this be CPCIs only?
  • Not sure about the clock tree. Also, note that the WR main TCXO is quite a bit worse than the Si549. Why not just scrap it and use the Si549 like Kasli v2.0?

@gkasprow
Member

@hartytp the board is funded and will be designed anyway for CERN. I'm doing my best to make it fully compatible with EEM modules in CPCIs chassis. Just in case you want to use it.
We will use it for our other projects @creotech with other EEM boards, especially for EGSE systems for satellites. We have plans to provide ARTIQ support.
I know that it will be some work to port it to ARTIQ.
Additional 4 SFPs can be easily plugged as FMC-SFP board. There is already one SFP on board.
It would be CPCIs only.
We can make it possible to use both types of oscillators.
If you like it, I can, later on, make a version with 5 SFPs.
We will get the SoC at a very low price, a small fraction of Digikey price.

@dhslichter
Member

dhslichter commented Aug 16, 2019

@gkasprow I still think we should just make a proper Zynq Kasli with the Zynq-7000 series, EEMs, our desired clocking, our desired FP configuration, etc, and not worry about trying to port this CERN board to ARTIQ. The savings, such as they might be, are really not worth the compromises IMHO.

@gkasprow
Member

gkasprow commented Dec 2, 2019

What is the status of the ARTIQ port for ZynQ?
I think we can design this board relatively quickly. I can assign one experienced engineer for 2 months. I can also assign one SW engineer who knows ARTIQ, but he would need some guidance.
We will keep compatibility with Kasli in terms of connectors placement to be able to use the same CPCIS adapter.
Any conclusion about the name? I'd like to move it to the dedicated repo.

@hartytp

hartytp commented Dec 2, 2019

I'm happy with Kasli-Zynq or something like that to minimize confusion (basically, bill it as more of a design variant than a new design).

My vote would be to keep this as close to Kasli v2.0 as possible.

  • add an RJ45 to the FP for ethernet and copy ethernet from the zc706
  • maybe put USB under ethernet as in Stabilizer to make room for the RJ45
  • RAM to PS only
  • mid-range Zynq FPGA

@sbourdeauducq
Member

What is the status of the ARTIQ port for ZynQ?

The key parts are demonstrated on ZC706 or Zedboard so it's looking quite good, just a lot of work to put all the software/gateware pieces together.
Please stay as close as possible to the ZC706. Things like Zynq-Ultrascale in particular are dangerous.

@gkasprow
Member

gkasprow commented Dec 2, 2019

Actually, the only things that we connect outside the SoC are the Ethernet chip, SDRAM, USB, UART and config memory. ARTIQ does not care about the power supply. So, if we connect the peripherals like on the ZC706, there should be no issues.

@dhslichter
Member

dhslichter commented Dec 2, 2019

Please stay as close as possible to the ZC706. Things like Zynq-Ultrascale in particular are dangerous.

I want to re-emphasize this -- there is NO reason from our standpoint to work with Zynq US at this stage. Some tens of dollars price differential per board is NOT worth the cost of the extra development/debugging time.

I think Kasli-Zynq or Kasli-ARM would be reasonable names. We might not want to use someone's copyrighted name as part of our board name though? Could do Kasli-HC (hard core) or Kasli-Z or Kasli-HP (high performance) to avoid this....

I agree that we should try to think of this board as a drop-in replacement, basically a design variant with some improved internals, for the standard Kasli. Assuming the port goes well, I imagine that many groups will want to switch over, based on discussions with others on the topic.

@gkasprow
Member

gkasprow commented Dec 2, 2019

I'm talking about building a series 7 ZynQ board. UltraScale version is also under development, but it is covered by the CERN contract.

gkasprow transferred this issue from sinara-hw/meta Dec 2, 2019

@gkasprow
Member

gkasprow commented Dec 8, 2019

My student will take care of this board very soon. He will post new issues so don't be surprised.

@hartytp

hartytp commented Dec 29, 2019

XC7Z030-3FFG676 sounds like a good choice for this.

As mentioned before, I'd ideally like to go for 4 SFP + 1 RJ45 on the front panel, copying ethernet from the zc706.

RAM to the PS only.

Other than that, stick as close to the current Kasli 2.0 design as possible.

@gkasprow
Member

gkasprow commented Jan 6, 2020

The schematics are close to completion.
Btw, @sbourdeauducq did you see this

@sbourdeauducq
Member

sbourdeauducq commented May 5, 2020

* no SD Card

We want an SD card if possible. From experience on ZC706, it might be the least troublesome way to get non-volatile storage on Zynq, and there's the easy "escape hatch" of extracting the SD card and putting it into a $5 USB adapter in case Zynq won't write to it. The PS NOR flash controller is messed up; the only thing that seems to work semi-reliably is a read-only mode where the flash contents are mapped into the CPU memory. We didn't manage to write to the flash at all - even with the official Xilinx flasher on one of the two boards we have. There may be some workarounds in some cases like re-routing the flash to GPIO and bit-banging, but it's extra trouble and development time.

@filipswit
Collaborator

I'll try to fit it. It looks like there is enough space above the RAM, but only for this kind of socket: https://www.molex.com/molex/products/part-detail/memory_card_socket/0473092651

The white line is the outline of microSD
[image: 2020-05-07_21h54_13]
[image: 2020-05-07_21h54_59]

@gkasprow
Member

gkasprow commented May 7, 2020

I used a slightly higher version of this Molex connector and managed to fit the BGA chip under the SD card. You can use it as well and rotate the connector if that helps.

@sbourdeauducq
Member

How well does this connector hold the SD card?
We'll want to ship systems with the SD card pre-installed, and carriers sometimes handle packages roughly - while they have teams whose job is to find excuses to deny claims. And just normal transportation vibrations tend to dislodge things. So our hardware should be pretty sturdy.

@gkasprow
Member

gkasprow commented May 9, 2020

We used these connectors in the production of 8k pieces of LTE routers. There were several issues with these boards, but the SD card never slipped out.
Here are some photos
[image: 2020-05-09 19 37 56]
[image: 2020-05-09 19 35 20]
[image: 2020-05-09 19 35 11]

gkasprow reopened this May 9, 2020

@jordens
Member

jordens commented May 9, 2020

They have the standard spring-loaded ratchet locking mechanism, right? There is no way they'd fall out. Get proper packaging, tie down cables properly, and make sure the fastener designs are right and tightened properly.

@filipswit
Collaborator

The version with the SD card is in the repo.

9 participants