Intel IXP12xx Hardware 29-byte packet performance, Single-cell/PDU Performance using 133MHZ Dram

Page 10

Version 1.0, 4/10/02

the number of times the PHY was not fed a cell in time to keep the wire busy, and thus had to manufacture an idle cell. The number reported here is from the second query when two “_VolgaGetChanCounters” commands are issued back-to-back at the VxWorks prompt (“_VolgaGetChanCounters” prints the delta between its previous invocation and the present one). IXF6012 Overflows are measured the same way; they are generally the result of the StrongARM* core overhead of running the “_VolgaGetChanCounters” command itself. “Ethernet Transmit KFrame/s” captures the lowest and highest rates reported by the SmartBits 600 across the 8 Ethernet ports.
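The delta-reporting behavior described above can be sketched as follows. This is an illustrative model only, with hypothetical names, not the actual IXF6012 driver API: the point is that the second of two back-to-back queries isolates just the counts accrued between the queries.

```python
# Illustrative model of a delta-reporting counter query (hypothetical names,
# not the real "_VolgaGetChanCounters" implementation).

class DeltaCounterQuery:
    def __init__(self, read_hw):
        self.read_hw = read_hw          # callable returning current absolute counts
        self.prev = self.read_hw()      # baseline for the first delta

    def query(self):
        """Return counts accrued since the previous invocation."""
        now = self.read_hw()
        delta = {k: now[k] - self.prev[k] for k in now}
        self.prev = now
        return delta

# Simulated hardware: the overflow counter jumps each time it is read,
# mimicking the overhead of running the query command itself.
hw = {"tx_idle_cells": 0, "overflows": 0}

def read_hw():
    hw["overflows"] += 1000             # cost of the query, as described in the text
    return dict(hw)

q = DeltaCounterQuery(read_hw)
hw["tx_idle_cells"] += 500              # traffic accrued since the baseline read
first = q.query()                       # mixes earlier history with query overhead
second = q.query()                      # clean: only what the query itself caused
print(first["tx_idle_cells"], second["tx_idle_cells"])   # 500 0
```

This is why two invocations are issued on the same line: the first flushes the accumulated history, and the second reports only the interval between the two commands.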

The test measurements are repeated with a variable number of full-bandwidth Ethernet ports driving the design. The test with “0” Ethernet input ports shows the maximum possible ATM-to-Ethernet performance, that is, when there is no Ethernet-to-ATM traffic to load down the system. This is effectively a half-duplex ATM-to-Ethernet forwarding measurement. More Ethernet input ports are added to show how the system handles the increase in load, even though for the 40- and 1500-byte packet measurements, 6–8 Ethernet ports over-subscribe the available ATM transmit bandwidth.
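The oversubscription claim can be checked with rough arithmetic. This sketch is my own calculation, not taken from the report, and assumes LLC/SNAP encapsulation (8 bytes) plus the 8-byte AAL5 trailer, padded out to whole 48-byte cell payloads, with an approximate OC-12 payload rate of 1.41M cells/s:

```python
# Rough oversubscription check (my arithmetic; encapsulation assumptions
# stated in the lead-in, not taken verbatim from the report).
import math

def cells_per_packet(ip_bytes, encap=8, trailer=8):
    # AAL5 PDU = IP packet + LLC/SNAP + trailer, padded to 48-byte cells.
    return math.ceil((ip_bytes + encap + trailer) / 48)

ETH_WIRE_BITS = (64 + 8 + 12) * 8       # minimum frame + preamble + gap
pps_per_port = 100e6 / ETH_WIRE_BITS    # ~148,809 min-size frames/s per port

atm_cell_rate = 1.41e6                  # approx. OC-12 payload cell rate
need = 6 * pps_per_port * cells_per_packet(40)   # cells/s for 6 ports of 40-byte packets
print(need > atm_cell_rate)             # True: 6 ports already oversubscribe ATM
```

A 40-byte packet needs 2 cells and a 1500-byte packet needs 32, so even 6 Ethernet input ports offer more cells per second than the ATM transmitter can send.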

Hardware 29-byte packet performance

| Ethernet Input Ports | ATM Transmit Rate [%] | IXF6012 Transmit Idle | ATM Receive Ports | IXF6012 Overflows | Ethernet Transmit KFrame/s | Ethernet Transmit [MB/s] |
|----------------------|-----------------------|-----------------------|-------------------|-------------------|----------------------------|--------------------------|
| 8                    | 84                    | N/A                   | 1                 | 4000              | 132 – 138                  | 8 – 9                    |
| 7                    | 73                    | N/A                   | 1                 | 1000              | 127 – 147                  | 8.5 – 9.5                |
| 6                    | 63                    | N/A                   | 1                 | 0                 | 133 – 148                  | 8.5 – 9.5                |
| 0                    | 0                     | N/A                   | 1                 | 0                 | 148.8                      | 9.5                      |

Figure 5: Single-cell/PDU Performance using 133 MHz DRAM

The bottom entry in the table, with 0 Ethernet input ports, shows half-duplex performance, i.e., what the design does when it is only forwarding this workload from ATM to Ethernet. The result is wire-rate ATM Receive and Ethernet Transmit performance, and the StrongARM core can run “_VolgaGetChanCounters” without disturbing the data plane at all. As discussed above, this workload attempts to transmit 949 Mbps out of the 800 Mbps of aggregate Ethernet port bandwidth. Indeed, 8 Ethernet ports × 148,808 frames/sec = 1.19M packets/sec, while the ATM Receive packet rate is 1.4M packets/sec. Looking at the microengine counters, the ratio between packets dropped due to full Ethernet Transmit queues and packets dropped due to a full IP Router input MSGQ shows that about 37% of the dropped packets are due to Ethernet Transmit queues being full; the remaining 63% are due to the IP Router microengine not being able to route 1.4M packets/sec. This is consistent with the simulation result for the same workload, which showed the IP router could not keep up with 1.4M routes/sec.
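The arithmetic above can be verified directly. This is my own back-of-the-envelope check, not taken from the report: each 29-byte IP packet leaves as a minimum-size Ethernet frame, which occupies 84 bytes of wire time (64-byte frame + 8-byte preamble + 12-byte inter-frame gap):

```python
# Back-of-the-envelope check of the half-duplex numbers above
# (my arithmetic, not from the report).

WIRE_BITS = (64 + 8 + 12) * 8            # 672 bits per minimum-size frame
frames_per_port = 100e6 / WIRE_BITS      # ~148,809 frames/s at 100 Mbps (the 148.8 KFrame/s row)
eth_tx = 8 * frames_per_port             # ~1.19M frames/s across 8 ports
atm_rx = 1.4e6                           # packets/s arriving from ATM (from the text)
dropped = atm_rx - eth_tx                # ~0.21M packets/s cannot fit on Ethernet
print(int(frames_per_port), round(dropped / atm_rx, 2))
```

So roughly 15% of the received packets must be dropped even in the best case, which is why the microengine drop counters are non-zero in this half-duplex measurement.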

Transmitting from SmartBits on 6 full-bandwidth Ethernet ports impacts Ethernet Transmit performance, but only on a couple of ports. This is not enough Ethernet input to saturate ATM Transmit.

Increasing the Ethernet workload to 7 ports, and then 8 ports, increases the ATM Transmit performance, but with the ratio of 949Mbps Ethernet to 622Mbps ATM, this is still not enough Ethernet input to saturate the ATM Transmitter. Also, Ethernet Transmit performance starts to

