One issue with running simulations unbounded (that is, not bound to the wire rate) is that it can hide errors, because there is no concept of device overflows or underflows. Further, the design can become unbalanced; for example, an efficient receiver can race ahead of the rest of the design, hogging shared system resources and potentially penalizing another part of the system.
Another approach is to simulate bounded, but to bind to a wire rate that is faster than the actual wire rate. The disadvantage of this technique is that it is iterative: to discover the maximum performance, one raises the wire rate until the design fails to keep up, and then lowers it until the design runs correctly without any overflows or underflows.
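As a rough illustration of this search, the sketch below automates the raise-then-lower loop. The run_at_rate() helper is hypothetical; it stands in for whatever mechanism the simulation environment provides to run a bounded simulation at a given wire rate and report whether any overflow or underflow occurred.

    def run_at_rate(wire_rate_mbps):
        """Hypothetical hook: run a bounded simulation at wire_rate_mbps and
        return True if it completes with no device overflow or underflow."""
        raise NotImplementedError  # supplied by the simulation environment

    def find_max_wire_rate(actual_rate_mbps, step_mbps=25.0):
        """Raise the bound until the design fails to keep up, then lower it
        until the design again runs cleanly, per the procedure above."""
        rate = actual_rate_mbps
        while run_at_rate(rate):        # raise until the design falls behind
            rate += step_mbps
        while not run_at_rate(rate):    # back off until it runs cleanly
            rate -= step_mbps
        return rate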
SIMULATION MEASUREMENT PROCEDURE AND RESULTS
In the simulation environment, 29-, 40-, and 1500-byte packets are measured using the Developer’s Workbench IX Bus Device Simulator’s streams facility. The workloads are homogeneous, in that packets of the same size are sent into both the Ethernet and ATM ports.
To measure the performance of the design, the simulation is run with the Ethernet ports bounded to 100 Mbps, and the ATM ports bounded to 155 or 622 Mbps, as appropriate. The simulator is set to stop if it detects a device overflow or underflow.
Full-bandwidth input streams of the specified packet size are simultaneously applied to all ATM and Ethernet ports present in the configuration for at least 1M cycles.
Upon completion of the simulation run, the line rates in the IX Bus Device Status window are observed. The Ethernet ports should each be receiving at 100 Mbps. The ATM port(s) should be receiving at 622 or 155 Mbps each, as appropriate. For 29-byte packets, the Ethernet side should transmit at wire rate and discard excess ATM input. For 40- and 1500-byte packet workloads, the ATM side should transmit at wire rate and discard excess Ethernet input.
No device overflows or underflows were detected during the simulation.
Simulated 29-byte packet performance¹
For the OC-12 and 4xOC-3 configurations running the 1-cell/PDU workload, the simulation stops with a watch-point when the MSGQ from the ATM Receive Microengine to the IP Route Microengine fills to capacity. This means that the IP Route (IPR) Microengine is not able to keep up with the 1-lookup/cell workload (1.4M lookups/sec). Upon disabling the watch-point and completing the 1M-cycle simulation, the number of PDUs dropped due to ATM_RX_IPR_FULLQ is compared to the total number of cells received via ATM. This comparison shows that the IPR Microengine drops 19-22% of the cells received via ATM; conversely, it routes 78-81% of the 1.4M cells/sec input, or about 1.1M routes/second.
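As a sanity check on these figures, the arithmetic below recomputes the offered cell rate from the ATM cell size and nominal SONET payload rates (about 149.76 Mbps per OC-3 and 599.04 Mbps for OC-12 after framing overhead); these payload figures are standard values assumed here, not taken from the measurements.

    ATM_CELL_BITS = 53 * 8             # 424 bits per 53-byte ATM cell

    OC3_PAYLOAD_BPS  = 149.76e6        # assumed SONET payload rates
    OC12_PAYLOAD_BPS = 599.04e6

    oc12_cells  = OC12_PAYLOAD_BPS / ATM_CELL_BITS       # ~1.41M cells/sec
    oc3x4_cells = 4 * OC3_PAYLOAD_BPS / ATM_CELL_BITS    # ~1.41M cells/sec

    # With 1 cell per PDU, each cell requires one lookup, so both
    # configurations offer roughly 1.4M lookups/sec.  Routing 78-81%
    # of that input corresponds to roughly 1.10-1.14M routes/sec.
    routes_low, routes_high = 0.78 * oc12_cells, 0.81 * oc12_cells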
While these drop statistics show that the IP Route Microengine does not keep up with this workload, they also show that for a workload with 2-cell PDUs, which requires only one lookup per two cells (1.4M/2 = 700K lookups/sec), the IPR Microengine has roughly 1.1M - 0.7M = 400K routes/second of headroom beyond the maximum 700K routes/second required.
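Continuing the same back-of-the-envelope arithmetic (same assumed payload rates as above), the 2-cell/PDU headroom works out as follows.

    lookups_needed_2cell = oc12_cells / 2    # one lookup per 2-cell PDU, ~0.7M/sec
    ipr_routes_per_sec   = 1.1e6             # measured routing rate from above
    headroom = ipr_routes_per_sec - lookups_needed_2cell   # ~400K routes/sec of margin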
¹ Simulations for 29-byte, 40-byte, and 1500-byte packet loads were run using 133 MHz memory (-75).