 _______________________________________________________________________
|                                                                       |
| XILINX APPLICATIONS XAPP004V:  CXSD16-V2.00               BN-3-21-94  |
|_______________________________________________________________________|


README file for the XC3000A Counter CXSD16:
===========================================

Note: A more detailed description of this application can be found in
Section 8 of the Xilinx Data Book.


CXSD16
------

This counter is a 16-bit Loadable Down Counter with a Count Enable (CE),
Parallel Load (PE), and Clock (CLK).  All signals are active-High.  The
operation of the counter is based on 2-bit cells.  The count outputs are
parallel decoded for all ones and then ANDed in the carry CLBs to form
the carries that advance each set of 2 count bits.  See the application note
on loadable 16- and 32-bit binary counters.  
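
The behavior described above can be sketched as a small software model.
This is an illustration only, not part of the design files.  The function
name next_count and the priority of load over count are assumptions; the
priority is inferred from the CE1 signal (count enable ORed with parallel
enable) described later in this file.

```python
MASK = 0xFFFF  # 16-bit count width


def next_count(q, d, ce, pe):
    """Return the counter value after one rising clock edge.

    q  : current count (Q15..Q0)
    d  : parallel load data
    ce : count enable, active-High
    pe : parallel load enable, active-High

    Assumption: PE takes priority over CE when both are High.
    """
    if pe:
        return d & MASK          # parallel load
    if ce:
        return (q - 1) & MASK    # count down, wrapping 0x0000 -> 0xFFFF
    return q & MASK              # neither enable High: hold


if __name__ == "__main__":
    q = next_count(0, 0x0002, ce=0, pe=1)   # load the value 2
    for _ in range(4):
        print(f"{q:04X}")
        q = next_count(q, 0, ce=1, pe=0)    # count down once per clock
```

The model only describes the externally visible behavior; it does not
reproduce the 2-bit cell structure or the carry decoding.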

The speed of the counter is determined by the speed of the carry chain.
Speeds of approximately 37 MHz in the XC3000A-6 and 54 MHz in the
XC3100A-3 can be achieved.

Design files included in directory CXSD16:

  README          This README file.
  SCH\CXSD16H.1   Top-level Viewlogic V4.1.3a schematic
  SCH\CXSD16.1    16 Bit Loadable Down Counter              (Sheet 1)
  SCH\CXSD16.2    CLBMAPs for the counter                   (Sheet 2)
  SYM\*.1         Viewlogic Symbol for Counter
  WIR\*.1         Viewlogic Wire files

  XNF\            Xilinx Netlists for High Level Schematic
  CXSD16H.LCA     Placed and Routed LCA file
  CXSD16H.CST     Constraints file for the CLB placement.
  CXSD16H.XRP     Xdelay Timing Report using XC3000A-6

Software Versions used:

  DS390 Version 4.1.3a Viewlogic and Interface

Recommended Layout, Routing:

   Simple floorplanning will significantly improve the performance of any 
design.  The recommended layout is listed in the constraints file CXSD16H.CST.
The placement is shown in the application note and is meant to minimize the 
delays of the Q0-Q15 bits through their parallel decoding through the C6, C8, 
C10, C12, and C16 CLBs. The placement is arranged in a rectangle and generally 
in columns of function from left to right.   

The first 8 count bits Q0 through Q7 are in 2 columns and alternating 
left-right up the 2 columns.  The next 2 columns contain the parallel 
decoding for these bits with the CLBs CXSTA/(P0-3, P4-7, P8-11, and 
P12-15) and the carries in CLBs CXSTA/(C6/16, C8/10, and C12/14).  They 
are placed to be in close proximity to the Q0-7 bits in the left 2 columns 
and the Q8-15 bits in the right 2 columns.  The total is a block 4 high 
and 6 wide.  

The placement is listed below:

place block CXSTA/C6 : DC;
place block CXSTA/C8 : AD;
place block CXSTA/C12 : DD;
place block CXSTA/P0-1 : BC;
place block CXSTA/P4-5 : CC;
place block CXSTA/P8-9 : BD;
place block CXSTA/P12-13 : CD;
place block Q0 : AA;
place block Q1 : AB;
place block Q2 : BA;
place block Q3 : BB;
place block Q4 : CA;
place block Q5 : CB;
place block Q6 : DA;
place block Q7 : DB;
place block Q8 : AE;
place block Q9 : AF;
place block Q10 : BE;
place block Q11 : BF;
place block Q12 : CE;
place block Q13 : CF;
place block Q14 : DE;
place block Q15 : DF;
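
To check the placement list visually, it can be parsed and drawn as a grid.
The sketch below is a hypothetical helper, not a Xilinx tool.  It assumes
that the first letter of a CLB location is the row and the second letter is
the column, and it handles only the "place block NAME : LOC;" form shown
above.

```python
import re

CST = """
place block CXSTA/C6 : DC;
place block CXSTA/C8 : AD;
place block CXSTA/C12 : DD;
place block CXSTA/P0-1 : BC;
place block CXSTA/P4-5 : CC;
place block CXSTA/P8-9 : BD;
place block CXSTA/P12-13 : CD;
place block Q0 : AA;
place block Q1 : AB;
place block Q2 : BA;
place block Q3 : BB;
place block Q4 : CA;
place block Q5 : CB;
place block Q6 : DA;
place block Q7 : DB;
place block Q8 : AE;
place block Q9 : AF;
place block Q10 : BE;
place block Q11 : BF;
place block Q12 : CE;
place block Q13 : CF;
place block Q14 : DE;
place block Q15 : DF;
"""

PLACE_RE = re.compile(r"place block (\S+) : ([A-Z])([A-Z]);")


def parse_placements(text):
    """Map block name -> (row, column), both zero-based."""
    placements = {}
    for name, row, col in PLACE_RE.findall(text):
        # Assumption: first letter is the row, second is the column.
        placements[name] = (ord(row) - ord("A"), ord(col) - ord("A"))
    return placements


def render(placements):
    """Return the placement as text, one line per CLB row."""
    rows = max(r for r, _ in placements.values()) + 1
    cols = max(c for _, c in placements.values()) + 1
    grid = [["-"] * cols for _ in range(rows)]
    for name, (r, c) in placements.items():
        grid[r][c] = name.split("/")[-1]   # drop the CXSTA/ prefix
    return "\n".join(" ".join(f"{cell:>6}" for cell in row) for row in grid)


if __name__ == "__main__":
    print(render(parse_placements(CST)))
```

Under that assumption the grid comes out 4 rows by 6 columns, which matches
the block size stated above.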

   For maximum performance, some hand routing may be required, although 
APR will do a very good job on longline assignment and the use of zero delay
routing resources.
 
   The recommended routing is now described.  The longest carry chain is
from the least significant 16 bits of the counter Q0-Q15 through their
respective 4 bit decodes and through the C6/16, C8/10, or C12/14 carry logic
and into the CLBs for bits Q6 to Q15.  This is 2 CLB logic delays and 3 Net
delays along with the clock to output and setup time. The CXSTA/CE1
(count enable ORed with parallel enable) line can be routed to the .EC pins
of the Q0-Q15 CLBs on the vertical long line to keep it out of the way. 
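
The critical-path composition given above (clock to output, 2 CLB logic
delays, 3 net delays, and setup time) can be written as a simple timing
budget.  The function and parameter names below are hypothetical, and no
delay values are supplied here; the actual figures should be taken from the
.XRP report or the data book.

```python
def period_ns(f_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / f_mhz


def path_delay_ns(t_cko, t_clb, t_net, t_setup, clb_levels=2, nets=3):
    """Sum of the delays on the longest path, in nanoseconds.

    t_cko   : clock-to-output delay of the source flip-flop
    t_clb   : combinatorial delay through one CLB (assumed equal per level)
    t_net   : delay of one net (assumed equal per net)
    t_setup : setup time at the destination flip-flop
    """
    return t_cko + clb_levels * t_clb + nets * t_net + t_setup


def meets_timing(f_mhz, **delays):
    """True if the summed path delay fits within one clock period."""
    return path_delay_ns(**delays) <= period_ns(f_mhz)


if __name__ == "__main__":
    # Period budgets for the two speeds quoted in this README.
    for part, f_mhz in (("XC3000A-6", 37.0), ("XC3100A-3", 54.0)):
        print(f"{part}: {period_ns(f_mhz):.1f} ns per clock")
```

Treating every CLB level and every net as having the same delay is a
simplification; the XDELAY report lists each delay individually.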

The remaining critical nets are the Q0 through Q15 counter outputs, the 
parallel decode signals P0-3, P4-7, and P8-11, and the C6 through C14 carries. 
The other parallel decodes (P0-1, P4-5, P8-9, and P12-13) are also critical, 
but because each of them goes to only one place, they are routed after the 
above nets have been completed.  

The CE1 signal is not critical, but it is shown in the schematic as long and 
critical as a reminder that for convenience it is to be routed on long lines 
to save resources that other signals could use. It is recommended that the 
RoutePin (rp) command in XDE be used to route the CXSTA/(P0-3, P4-7, P8-11, 
and P12-15) to the CXSTA/(C6/16, C8/10, and C12/14) CLBs. The Q0-Q15 nets 
should be routed with the RoutePin command to the CXSTA/(P0-3, P4-7, P8-11, 
and P12-15) CLBs since these routings are critical to the overall performance.
Next the CXSTA/(C6, C8, C10, C12, C14, and TC) signals can be routed.  

Performance:

XDELAY was used to report all clock-to-set-up paths. See the .XRP file.

