============================================================
SVT note N.83
GLUE3 = AMSGLUE
============================================================

Rough draft. April 6 1997
Updated July 6 1997 for new logic.


1. INTRODUCTION
---------------

This note describes the GLUE chip that is located on the AMS board. It is
therefore called AMSGLUE. It was formerly called GLUE3, since it is level 3
in the glue tree.

The AMSGLUE is implemented in VHDL for the Cypress WARP compiler. It is
targeted at the Cypress CY7C384A-2AC FPGA chip, now sold as QuickLogic
QL12x16-2PQ100 (-2 indicates the speed grade; -0 will probably be enough).
This chip has a TQFP100 package.

A large part of this note is made of pieces (mostly comments) cut and
pasted from the VHDL code. The VHDL code is in:

   suncdf1.pi.infn.it:~belforte/warp/amsglue/*.vhd


2. THE GLUE TREE
----------------

The glue chips are organised as a tree with the AMS at its root and the AM
chips as leaves. Usually this tree is drawn "upside down", with the AMS at
the top and the AM chips at the bottom; this picture gives the orientation
for the top/up/down names of the glue sides.

At the bottom of the tree, the GLUE0 talks to the AM chips at the lowest
level of the AM board; one GLUE0 connects eight AM chips. The AM board then
contains four GLUE1 (each connects four GLUE0) and one GLUE2 (or VMEGLUE)
that connects the four GLUE1, incorporates the VME interface for the AM
board and talks to the AMbus on P3.

Therefore AMSGLUE communicates with GLUE2 on one side and with the other
parts of the AMS board on the other. One AMSGLUE can deal with up to four
GLUE2, i.e. four AM boards. Only two are planned so far.


3. AMSGLUE TASK
---------------

Like every glue, the AMSGLUE has 2 sides: TOP and DOWN. The DOWN side talks
to P3 and the GLUE2, the TOP side talks to the AMS logic.
N.B. in the other glues the two sides are called up and down instead; the
reason for using TOP instead of UP will become clear later.

The task of AMSGLUE is different in the two AMS operating phases:
  1. during INPUT (i.e. whenever the AMS microsequencer sends an OPCODE
     which is different from OUTPUT);
  2. during OUTPUT (i.e. when the OPCODE is OUTPUT).

1: in this case AMSGLUE acts as a simple one-level pipeline from the TOP to
   the DOWN side, i.e. it is a synchronous register which latches OPCODE,
   LAYER and DATA at each clock rising edge and sends them to the DOWN
   side.

2: now the AMSGLUE works in the other direction, i.e. from the DOWN to the
   TOP side. Its task is to transfer the AM addresses to the TOP side,
   where the AMS will send them out as roads. This operation is
   synchronised via the DONE_/DV_/SEL_ protocol in the same way as for the
   other glues. Since the AMS has only one glue, though, the TOP SEL_
   signal is probably useless; it has been kept for history and safety,
   and it is always asserted.

TEST MODE: when the AMS is in test mode the AMSGLUE replaces its TOP DV_
output with the registered value of the input VMEDV_ bit, so that its value
can be obtained from the internal VME address bus and used to address the
AMS microsequencer RAM for read/write. The TOP DV_ is the only signal that
goes directly from the AMSGLUE to the AMS microsequencer. This task is
very simple, and the logic that implements it is omitted in the following
to keep the description simple.


4. AMSGLUE PINS
---------------

The AMSGLUE uses almost all pins of the CY7C384 TQFP100:
   one of the two reserved clock pins is used for the chip CLOCK
      (sometimes called CLK, or CLK1, in the VHDL; the reason is
      historical and irrelevant)
   one high-drive input pin (out of 6) is used for the input SEL_:
      SEL_TOP_
   all the I/O pads are used.

The pin list is rather obvious and is not reported here; there are a few
important remarks, though:

1. ALL CONTROL FLAGS ARE ACTIVE LOW AT THE CHIP PINS, therefore they have
   a trailing underscore (_).
   All these signals are inverted in the I/O buffers, and the internal
   logic design always uses active-high signals.

2. THERE IS AN UNFORTUNATE CONFUSION WITH NAMES: some signals to/from the
   AMS have been called _UP in the past, so the AMS schematics still have
   DV_UP_, SEL_UP_ etc. This is now changed, and in this document all
   signals to/from the AMS logic are consistently called name_TOP.
   Eventually all documents will be consistent.

************************************************************************
** Therefore xxx_UP SIGNALS ARE ONLY INTERNAL, ALL AMSGLUE PINS THAT  **
** IN THE PAST HAD THE _UP NAME SHOULD BE READ AS xxx_TOP !!          **
************************************************************************


5. AMSGLUE ARCHITECTURE
-----------------------

The glue has two parts that work independently in parallel: the INPUT CELL
and the OUTPUT ENGINE. They are synchronised on the same common CLOCK.

TOP SIDE:   layer  data   opcode       address  sel_   dv_   done_
            3bit   12bit  4bit         16bit    1bit   1bit  1bit
              |      |      |            /|\      |    /|\     |
             \|/    \|/    \|/            |      \|/    |     \|/
           ---------------------      -----------------------------
           |                   |      |                           |
           |       INPUT       |      |          OUTPUT           |
 clock --> |       CELL        |      |          ENGINE           | <-- clock
           |                   |      |                           |
           ---------------------      -----------------------------
              |      |      |            /|\      |    /|\     |
             \|/    \|/    \|/            |      \|/    |     \|/
DOWN SIDE:  layer  data   opcode       address  sel_   dv_   done_
            3bit   12bit  4bit         14bit    4bit   4bit  1bit

N.B. the TOP opcode is also seen by the OUTPUT ENGINE, which uses it for
its reset logic (see section 7).

The INPUT CELL is a simple collection of registers. They are always active
and transmit the layer (3 bits), data (12 bits) and opcode (4 bits) from
the TOP to the DOWN side at each clock rising edge, at all times, no matter
what the other signals are.

The OUTPUT ENGINE is more complex. It is modeled on the architecture of the
other glues. The differences are:

1. there is no MODE. Using the terminology of the lower level glues,
   AMSGLUE is always in mode PIPE.

2. Unlike GLUE0, and like GLUE1 and GLUE2, the down interface is simpler
   since it talks with other glues: only one clock is involved and the
   DOWN FiniteStateMachine (see later) has only 3 states.

3. The major difference, which adds something new, is the following:

********************************************************************
** Unlike all the other glues, the AMSGLUE has an additional      **
** layer between its UP side and the OUTPUT. This is the TOP      **
** logic, which adds one more address register and one more       **
** finite state machine. These are called ADDRESS_UP and          **
** TOP_FSM. The output pins are called ADDRESS_TOP and are        **
** driven by the ADDRESS_UP register, just as in the other        **
** glues the output pins are called ADDRESS_UP. The task of       **
** the TOP FSM is to make the TOP side of the AMSGLUE talk the    **
** AMS protocol (a data driven pipeline) rather than the GLUE     **
** protocol. In practice the TOP FSM and the ADDRESS_UP           **
** register guarantee that the output DV_TOP_ signal has indeed   **
** the meaning of a DataValid signal for the ADDRESS_TOP pins:    **
** it has the same timing, and when it is low there are valid     **
** data in output. It can thus be used directly as a latch        **
** enable for the top address, just as is done in the AMS         **
** OUT_CTR chip.                                                  **
********************************************************************


6. INPUT CELL
-------------

There is little to say. Here is the description from the VHDL code that
implements it. It should be totally self-explanatory. This is extracted
from toppkg3.vhd.

----------------------------------------------------------------------
-- INPUT DATA CELL
-- simple one stage pipeline, clocks data from UP to DOWN on each
-- clock rising edge. Always active.
----------------------------------------------------------------------
-- INPUT LAYER CELL
-- simple one stage pipeline, clocks layer from UP to DOWN on each
-- clock rising edge. Always active.
----------------------------------------------------------------------
-- INPUT OPCODE CELL
-- simple one stage pipeline, clocks opcode from UP to DOWN on each
-- clock rising edge. Always active.


7. OUTPUT ENGINE
----------------

Communication with the AMS is via the DV_TOP_ and DONE_TOP_ signals. The
protocol is simple:

- if OPCODE_UP is different from OUTPUT, nothing happens: DV_ is 1 and
  DONE_ is 1 (all false). This is the input phase.
- in the output phase, the AMS must keep OPCODE=OUTPUT at all times.
- if OPCODE_UP=OUTPUT, the AMSGLUE will assert DV_=0 as soon as it has
  loaded valid address data into its internal register. Whenever DV_=0
  there are valid data on the ADDRESS_TOP output pins. The internal
  logic that generates DV_TOP_ and ADDRESS_UP is exactly the same, so they
  have the same timing up to pin-to-pin skew. This skew is a few nsec, and
  the typical delay from clock edge to DV_TOP_ valid is 10 nsec for the -1
  speed grade. These delays change slightly with re-fitting of the chip
  and change more for different speed grades. Check with simulation for
  precise information.
  The first DV_TOP_=0 will appear a fixed number of clock cycles after the
  first OUTPUT opcode is sent, of the order of 10. The precise value is
  not fixed yet; it will depend on details still to be defined in the P3
  communications and inside GLUE2.
- Once the AMSGLUE has asserted DV_TOP_ the first time, it will wait for a
  DONE_TOP_=0 signal from the AMS before going to the next address.
  DONE_TOP_ is internally sampled on the clock rising edge. DV_TOP_ is
  removed as soon as DONE_TOP_ is detected and will not be reasserted for
  at least one clock cycle. The next DV_TOP_=0 may come in 2 or 3 clock
  cycles depending on the availability of data on the down side.
- If there is no DV_TOP_=0 for 3 clock cycles, it means that there are no
  more data to send and the AMS should go to the end-event procedure.
- At any time the AMS can suspend the output by simply delaying DONE_TOP_
  assertion for as many cycles as it wishes.
- The AMSGLUE is fully reset and all its internal data cleared whenever
  the OPCODE is changed to anything other than OUTPUT.

Communication with the GLUE2 is via the DV_DOWN_, SEL_DOWN_ and DONE_DOWN_
signals, and it is identical to what happens between the other glue levels.
There is no guarantee from the lower level glues on the timing of the
address with respect to the DV_, but the chips are built to the
specification that everything should fit in 3 clock cycles. At the
beginning of the first cycle the various GLUE2 assert their DV_ signals;
they are all received at the same time as DV_DOWN_ by the AMSGLUE. In the
AMSGLUE there is a priority encoder that is completely asynchronous and
continuously monitors DV_DOWN_. The priority encoder decides which GLUE2
to read first and asserts the corresponding SEL_DOWN_ signal (only one of
the SEL_DOWN_ is true at any given time). The GLUE2 uses SEL_ to
asynchronously enable the output of its internal address register onto
the P3 bus. At the 3rd clock cycle, the AMSGLUE assumes that the address
bits are now stable on the P3 bus, and latches them internally with the
3rd clock rising edge that follows the DV_DOWN_=0 transition. At the same
time the AMSGLUE asserts the DONE_DOWN_ signal to tell GLUE2 to send out
its next address. The priority encoder also codes the number (0-3) of the
chosen GLUE2 in two bits (Address_X) that are added to the left of the
address_down word to form the address_up.

The architecture of the OUTPUT ENGINE implements these protocols via the
following logic blocks:

-One reset logic, that generates the reset for all FSMs.
-Two 14-bit registers for the address_down (ADDRESS_A and ADDRESS_B),
 one address multiplexer and the ADDRESS_UP register
-One priority encoder and address encoder
-Three finite state machines:
   DOWN_FSM  talks with GLUE2 and UP_FSM
   UP_FSM    talks with DOWN_FSM, keeps track of ADDRESS_A and ADDRESS_B
             registers, talks with TOP_FSM
   TOP_FSM   talks with UP_FSM and the AMS, keeps track of ADDRESS_UP
             register

The block diagram of the OUTPUT ENGINE is:

   OPCODE_UP          DV_TOP  DONE_TOP                ADDRESS_TOP
     4bit              1bit    1bit                      16bit
       |               /|\      |                         /|\
       |                |      \|/                         |
       |             ---------------    AUP_CE      --------------
       |             |     TOP     | -------------> | ADDRESS_UP |
       |       |---> |     FSM     |                |  register  |
       |       |     ---------------                --------------
       |       |       /|\      |                         /|\
       |       | DV_UP  |       | DONE_UP                  | Address_C
      \|/      |        |      \|/                         |
   ---------   |     ---------------   EN_A, EN_B     -----------
   | RESET |---|     |     UP      | ---------------> |  MUX C  |
   ---------   |---> |     FSM     |                  -----------
         RESET |     |             |---|               /|\   /|\
               |     ---------------   |      Address_A |     | Address_B
               |       /|\      |      |                |     |
               | LATCH  |       | FULL |  LA, LB  -------------------
               |        |      \|/     |--------> | ADDRESS_A and B |
               |     ---------------              |    registers    |
               |---> |    DOWN     |              -------------------
                     |     FSM     |                  /|\      /|\
                     ---------------       Address_X   |        |
                        |      /|\           2bit      |        |
                        |       | DVIN    -------------------   |
                        |       |---------|  PRIORITY/ADDR  |   |
                        |                 |     ENCODER     |   |
                        |                 -------------------   |
                        |                     |       /|\       |
                       \|/                   \|/       |        |
                    DONE_DOWN             SEL_DOWN  DV_DOWN  ADDRESS_DOWN
                      1bit                  4bit     4bit     14bit

Here is the description of the various components of the OUTPUT ENGINE,
from comments to the VHDL code (with a few additions). A few pieces of
VHDL source have also been pasted here to describe e.g. some FSM outputs
that are not in the comments; hopefully they are clear enough by
themselves.

From l2_pkg3.vhd:

--ADDRESS REGISTERS
-- address registers A and B for standard output engine.
-- Address_C is the multiplexer that sends the chosen address register
-- value to the UP address register.
-- output engine address registers A and B
-- LA stores into A, LB stores into B
-- the most significant bits come from the DV_DW encoding in ADDRESS_X
-- Now the multiplexer: A and B into C according to EN_A and EN_B.
-- Since the mux must always end up in an assignment (otherwise a flip-flop
-- would be synthesised), the easiest way is simply to make A -> C at all
-- times except when EN_B is true, in which case B -> C. The UP FSM logic
-- guarantees that EN_A and EN_B are never true at the same time.
-- (the other way, exchanging A and B, would be fine as well, of course).
-- It is up to the TOP FSM to make sure that Address_UP is never written
-- unless a valid A or B is present.
-- Now the ADDRESS_UP register. Under control of the TOP fsm via
-- AddressUP_ClockEnable (AUP_CE), latch the current valid address into
-- ADDRESS_UP.

-- PRIORITY ENCODER
-- priority encoder: only one of the SEL_ is true (0) at any time, and
-- corresponds to the first of the DV_ which is true (0), where DV_DW(0)
-- has the highest priority, and DV_DW(3) the lowest.
-- The priority encoder also sends the DVIN signal to the DOWN_FSM.
-- Functionally DVIN is the logical OR of the DV_DOWN_ bits: DVIN is true
-- whenever at least one of the four DV_DOWN_ is true. But for maximum
-- speed, and to avoid long setup times for the DOWN FSM state flip-flops,
-- the DV_DOWN are synchronised first and DVIN is the OR of the DV_DW_SYNC
-- signals.

From fsmpkg3.vhd:

--RESET
--Generates the FSM_RESET signal to the three FSMs.
-- Reset generation for FSM's
-- All FSM's are synchronously reset whenever OPCODE_UP is different from
-- OUTPUT.
-- When the AMS sets the opcode to OUTPUT this glue will start acting
-- at the next clock cycle, while the output opcode is still propagating
-- down the glue tree. All the downstream glues will definitely be reset
-- by then, as well as the bottom AM chips, which are reset at each event
-- by the ResetHandshake - ClearCounter - ClearHitRegister sequence.

-- TOP FINITE STATE MACHINE (PECULIAR to AMSGLUE alias GLUE3)
-- Two-state FSM to dialogue with the AMS.
-- The two states are FULL and EMPTY; they indicate the status of
-- the ADDRESS_UP register. This register is handled as a data driven
-- pipe with the DV_TOP flag signalling the presence of valid new data in
-- ADDRESS_UP. DV_TOP has the same timing as the data bits in the register
-- (i.e. it is the Q of a flip-flop whose D is the Address_UP clock enable)
-- and is the active-high inverse of the output signal DV_TOPn.
--
-- State diagram:
--               ------------
--               |  EMPTY   |
--               ------------
--                  |    ^
--                  |   /|\
--            DV_UP |    | DONE_TOP
--                 \|/   |
--                  v    |
--               ------------
--               |   FULL   |
--               ------------
--
-- The FSM outputs are DONE_UP, DV_TOP and AUP_CE. When the Address_Up
-- register is empty, the FSM is ready to load it and DONE_UP is asserted
-- (so DONE_UP = EMPTY), and when the register is full DV_TOP is
-- asserted (DV_TOP = FULL = NOT(EMPTY)). AUP_CE (AddressUP ClockEnable) is
-- the clock enable for the address_up register; it is asserted at the same
-- time as the FSM makes the transition from empty to full, and it lasts
-- only one clock cycle to make sure only the valid data flagged by DV_UP
-- are latched.
--   AUP_CE <= '1' when (StateTop=empty and DV_UP='1') else '0';
--
-- One status bit (1=empty, 0=full) is enough, but we will use
-- 2 bits with one-hot-one encoding for maximum speed; this also
-- allows the two status bits to be identified with the outputs, so as to
-- have them "in parallel output registers".
-- Note that ONE-HOT-ONE is needed to force one flip-flop for each
-- state; one-hot-zero would result in only one flip-flop being
-- synthesised.
--
--
-- UPSTREAM FINITE STATE MACHINE
-- This fsm keeps registers A and B full with good data from the
-- down side and sends the proper one to the address_up register via the
-- address multiplexer. The multiplexer is implemented by enabling
-- whichever of A and B was loaded first, in order to preserve the order
-- in which data have been read from the down side.
-- The priority between A and B is fixed: when both are empty, register A
-- is always filled first.
-- Writing into the A and B registers is controlled via the two clock
-- enables LA and LB. The decision of which register's data go to the
-- up side is via the two output enables EN_A and EN_B.
-- The up fsm is controlled by the inputs LATCH from the down fsm and
-- DONE_UP from the top fsm. The down fsm asserts LATCH when there is a new
-- valid address on the address_down bus. The top fsm asserts DONE_UP when
-- it is ready for the next address. As in all GLUE protocols, valid data
-- are always registered and presented to the next stage as soon as they
-- appear, without waiting for a first DONE.
-- In general in the GLUE tree the DONE signal must be validated by
-- the SEL_, since DONE is common to all the GLUEs at the same level.
-- Here this may not be necessary (see following comments in the code).
-- The up fsm answers to the top and down fsm's via the two signals DV_UP
-- and FULL: DV_UP tells the top fsm that there are valid data in output
-- to be latched in the address_up register, FULL is a halt flag to the
-- down fsm to signal that both A and B registers are full and no more
-- data can be accepted.
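
The registers and multiplexer steered by LA, LB, EN_B and AUP_CE can be
sketched as below. This is only an illustration written for this note, not
the code in l2_pkg3.vhd: the entity name addr_path_sketch is hypothetical,
and the choice of storing the two Address_X bits together with the 14
address bits (so that A and B are 16 bits wide here) is an assumption.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: names and register widths are assumptions.
entity addr_path_sketch is
  port (
    CLK          : in  std_logic;
    ADDRESS_DOWN : in  std_logic_vector(13 downto 0);  -- from the P3 bus
    ADDRESS_X    : in  std_logic_vector(1 downto 0);   -- chosen GLUE2 (0-3)
    LA, LB       : in  std_logic;   -- clock enables of A and B (UP FSM)
    EN_B         : in  std_logic;   -- multiplexer select (UP FSM)
    AUP_CE       : in  std_logic;   -- clock enable of ADDRESS_UP (TOP FSM)
    ADDRESS_TOP  : out std_logic_vector(15 downto 0)
  );
end addr_path_sketch;

architecture rtl of addr_path_sketch is
  signal ADDRESS_A, ADDRESS_B  : std_logic_vector(15 downto 0);
  signal ADDRESS_C, ADDRESS_UP : std_logic_vector(15 downto 0);
begin
  -- multiplexer: A -> C at all times, except when EN_B is true
  ADDRESS_C <= ADDRESS_B when EN_B = '1' else ADDRESS_A;

  regs : process (CLK)
  begin
    if (CLK'event and CLK = '1') then
      if LA = '1' then   -- Address_X goes to the left of address_down
        ADDRESS_A <= ADDRESS_X & ADDRESS_DOWN;
      end if;
      if LB = '1' then
        ADDRESS_B <= ADDRESS_X & ADDRESS_DOWN;
      end if;
      if AUP_CE = '1' then   -- latch the currently selected address
        ADDRESS_UP <= ADDRESS_C;
      end if;
    end if;
  end process regs;

  -- the ADDRESS_TOP pins are driven by the ADDRESS_UP register
  ADDRESS_TOP <= ADDRESS_UP;
end rtl;
```

EN_A does not appear in the sketch because, as the comments from
l2_pkg3.vhd explain, A is sent to C whenever EN_B is not true.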
--
-- The up fsm has 5 states that correspond to the various possible
-- combinations of "fullness" of the address registers A and B:
--   empty: both registers empty, no data have been loaded yet
--   FA:    regA is full, regB is empty
--   FB:    regB is full, regA is empty
--   FAB:   both registers full, A was loaded first
--   FBA:   both registers full, B was loaded first
--
-- The state diagram is (LATCH fills one register, DONE_UP frees one)
--
--      ---------                       ---------
--      |       |                       |       |
--      |  FAB  |                       |  FBA  |
--      |       |                       |       |
--      ---------                       ---------
--        ^    \  DONE_UP       DONE_UP  /    ^
--       /|\    \                       /    /|\
--        |      \---------\ /---------/      |
--        |                 X                 |
--        |                / \                | LATCH
--        |   /------------   ------------\   |
--        |   |                           |   |
-- LATCH  |  \|/                         \|/  |
--      ---------                       ---------
--      |       |    LATCH & DONE_UP    |       |
--      |  FA   |<--------------------->|  FB   |
--      |       |<----------\           |       |
--      ---------           |           ---------
--          |               |               |
-- DONE_UP  |               | LATCH         | DONE_UP
--          |           ---------           |
--          |           |       |           |
--          \---------->| empty |<----------/
--                      |       |
--                      ---------
--
-- One-hot-one encoding is chosen again for maximum speed. Outputs are
-- placed in parallel registers for speed and fanout whenever possible.
--
-- DONE_UP must be validated by SEL_UP to make sure that this glue
-- is really the one the level above wants to talk to. Indeed in the
-- AMS there is only one GLUE, so SEL_UP could probably be removed
-- everywhere; we keep it temporarily in case it turns out to be needed
-- for the protocol. One change at a time.
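
The next-state logic of the up fsm is not among the pieces pasted in this
note. A minimal sketch of it, read off the state diagram above, could look
as follows. The entity name up_fsm_next_sketch is hypothetical; the state
encoding is left to the compiler rather than forced to one-hot-one;
DONE_UP is used without the SEL_UP validation; and DV_UP and FULL are
decoded from the state register instead of being kept in parallel
registers (the timing is the same, since both are functions of the state
only). Treat it as an illustration under these assumptions, not as the
content of fsmpkg3.vhd.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity; the real fsmpkg3.vhd may differ in names and encoding.
entity up_fsm_next_sketch is
  port (
    CLK       : in  std_logic;
    FSM_RESET : in  std_logic;  -- true whenever OPCODE_UP is not OUTPUT
    LATCH     : in  std_logic;  -- from the down fsm: new address available
    DONE_UP   : in  std_logic;  -- from the top fsm: ready for next address
    DV_UP     : out std_logic;
    FULL      : out std_logic
  );
end up_fsm_next_sketch;

architecture rtl of up_fsm_next_sketch is
  type up_state is (empty, FA, FB, FAB, FBA);
  signal StateUp, NextState : up_state := empty;
begin
  next_state : process (StateUp, FSM_RESET, LATCH, DONE_UP)
  begin
    NextState <= StateUp;               -- default: stay in the same state
    if FSM_RESET = '1' then
      NextState <= empty;               -- synchronous reset
    else
      case StateUp is
        when empty =>
          if LATCH = '1' then NextState <= FA; end if;
        when FA =>
          if LATCH = '1' and DONE_UP = '1' then NextState <= FB;  -- A out, B in
          elsif LATCH = '1' then NextState <= FAB;
          elsif DONE_UP = '1' then NextState <= empty;
          end if;
        when FB =>
          if LATCH = '1' and DONE_UP = '1' then NextState <= FA;  -- B out, A in
          elsif LATCH = '1' then NextState <= FBA;
          elsif DONE_UP = '1' then NextState <= empty;
          end if;
        when FAB =>                     -- A was loaded first, so A leaves first
          if DONE_UP = '1' then NextState <= FB; end if;
        when FBA =>                     -- B was loaded first, so B leaves first
          if DONE_UP = '1' then NextState <= FA; end if;
      end case;
    end if;
  end process next_state;

  sync : process (CLK)
  begin
    if (CLK'event and CLK = '1') then
      StateUp <= NextState;
    end if;
  end process sync;

  DV_UP <= '0' when StateUp = empty else '1';
  FULL  <= '1' when (StateUp = FAB or StateUp = FBA) else '0';
end rtl;
```

LATCH is ignored in FAB and FBA: FULL is meant to halt the down fsm there,
and neither LA nor LB is asserted in those states.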
-- now the registered outputs
outputs: process (NextState)
begin
  if (NextState /= empty) then
    NextDV_UP <= '1';
  else
    NextDV_UP <= '0';
  end if;
  if (NextState=FAB or NextState=FBA) then
    NextFULL <= '1';
  else
    NextFULL <= '0';
  end if;
  if (NextState=FA or NextState=FAB) then
    NextEN_A <= '1';
  else
    NextEN_A <= '0';
  end if;
  if (NextState=FB or NextState=FBA) then
    NextEN_B <= '1';
  else
    NextEN_B <= '0';
  end if;
end process outputs;

-- here is the register instantiation
sync: process (CLK)
begin
  if (CLK'event and CLK='1') then
    StateUp <= NextState;
    DV_UP   <= NextDV_UP;
    FULL    <= NextFULL;
    EN_A    <= NextEN_A;
    EN_B    <= NextEN_B;
  end if;
end process sync;

-- now the combinatorial outputs, these are functions of both state and
-- inputs, so can not be registered
LA <= '1' when ((StateUp=empty or StateUp=FB) and LATCH='1') else '0';
LB <= '1' when ((StateUp=FA) and LATCH='1') else '0';


-- DOWNSTREAM FINITE STATE MACHINE
-- No mode: the machine goes into the HALT state whenever the general
-- FSM_RESET signal is valid, and once it is removed the fsm stays there
-- until there are valid data on the down side (DVIN true) and the UP FSM
-- is ready for data (FULL false).
-- There are only 3 states in this machine, which are cycled through one
-- at each clock cycle in a regular way. Once the FSM is started by DVIN
-- it goes through the full state sequence automatically:
--  State_0: is the HALT state, stay here until DVIN is true and FULL is
--           false, then go to State_1 at the first clock edge. Note that
--           DVIN is delayed by one clock cycle with respect to DV_DWN_ (it
--           is built from the synchronised version of DV_DWN_), so the
--           priority encoder is already done when DVIN is detected and
--           the address will be ready for latching right away.
--  State_1: now receives the address from the down side: assert DONE_ and
--           LATCH; at the next clock edge the address_down will be latched
--           in address_A or address_B by the UP FSM and the address_down
--           will be advanced to the next address by the GLUE2. Also at
--           the next clock edge moves automatically to State_2 to let
--           GLUE2 acknowledge DONE.
--  State_2: it is just "to wait" for the downstream GLUE to receive DONE
--           and for the corresponding new state of DV_DWN_ to make it
--           through the synchronising logic, so as to get DVIN up to
--           date. At the next clock cycle moves to State_0 and checks
--           again for DVIN.
--
-- State diagram:
--
--    -----    DVIN &      -----      -----
--    | 0 | -------------> | 1 | ---> | 2 | -->|
--    -----    notFULL     -----      -----    |
--      ^                                      |
--     /|\                                     |
--      |--------------------------------------|
--
-- Output controls are DONE and LATCH: they are true only in State_1
--
-- One-hot-one encoding is chosen again for maximum speed. Outputs are
-- placed in parallel registers for speed and fanout.

outputs: process (NextState)
begin
  if (NextState=s1) then
    NextDONE  <= '1';
    NextLATCH <= '1';
  else
    NextDONE  <= '0';
    NextLATCH <= '0';
  end if;
end process outputs;

sync: process (CLK)
begin
  if (CLK'event and CLK='1') then
    StateDown <= NextState;
    DONE      <= NextDONE;
    LATCH     <= NextLATCH;
  end if;
end process sync;
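
As for the up fsm, the next-state logic of the down fsm is not pasted
here. A minimal sketch consistent with the three-state description above
is given below. The entity name down_fsm_next_sketch and the state names
s0 and s2 are hypothetical (only s1 appears in the pasted code), the
encoding is left to the compiler, and DONE and LATCH are decoded from the
state register rather than kept in parallel registers. It is meant as an
illustration, not as the actual content of fsmpkg3.vhd.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity; s0 and s2 are assumed names for State_0 and State_2.
entity down_fsm_next_sketch is
  port (
    CLK       : in  std_logic;
    FSM_RESET : in  std_logic;  -- true whenever OPCODE_UP is not OUTPUT
    DVIN      : in  std_logic;  -- OR of the synchronised DV_DOWN bits
    FULL      : in  std_logic;  -- halt flag from the up fsm
    DONE      : out std_logic;  -- active high; inverted at the DONE_DOWN_ pin
    LATCH     : out std_logic
  );
end down_fsm_next_sketch;

architecture rtl of down_fsm_next_sketch is
  type down_state is (s0, s1, s2);
  signal StateDown, NextState : down_state := s0;
begin
  next_state : process (StateDown, FSM_RESET, DVIN, FULL)
  begin
    if FSM_RESET = '1' then
      NextState <= s0;                  -- HALT state while reset is valid
    else
      case StateDown is
        when s0 =>                      -- wait for data and for room in A/B
          if DVIN = '1' and FULL = '0' then
            NextState <= s1;
          else
            NextState <= s0;
          end if;
        when s1 =>                      -- DONE and LATCH asserted here
          NextState <= s2;
        when s2 =>                      -- let DVIN catch up, then check again
          NextState <= s0;
      end case;
    end if;
  end process next_state;

  sync : process (CLK)
  begin
    if (CLK'event and CLK = '1') then
      StateDown <= NextState;
    end if;
  end process sync;

  DONE  <= '1' when StateDown = s1 else '0';
  LATCH <= '1' when StateDown = s1 else '0';
end rtl;
```

With DVIN held true and FULL false, this sketch asserts DONE and LATCH for
one clock cycle out of every three, which matches the statement that
everything should fit in 3 clock cycles.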