> If you consider routing between two 1Gb ethernet networks connected to two 1Gb ethernet ports (for example wan and lan0 in the EspressoBin case), maximum throughput is 2Gbps, just because each port can receive 1Gbps at maximum (2Gbps in sum) and there is no way to get more data into the router. How did you calculate the value 4Gbps?
By 4Gbps I meant "bidirectional", which equals 2Gbps full duplex. If you look at the setup again, it should become clear:
Smartbits Sender   (port 1, max 1Gbps tx) -> lan0 --RGMII--> CPU --RGMII--> wan -> Smartbits Receiver (port 2, max 1Gbps rx)
Smartbits Receiver (port 1, max 1Gbps rx) <- lan0 <--RGMII-- CPU <--RGMII-- wan <- Smartbits Sender   (port 2, max 1Gbps tx)
Each gigabit port can manage 1Gbps full duplex, i.e. send 1Gbps and simultaneously receive 1Gbps. Since the CPU handles each packet for routing, every packet has to cross the RGMII interface twice: once from the switch into the CPU, and once from the CPU back out. This adds up to a total of 4Gbps of bidirectional traffic, or 2Gbps full duplex, passing the RGMII bus.
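Spelled out with the figures from the diagram above:

  1Gbps (lan0 -> wan flow) x 2 RGMII crossings
+ 1Gbps (wan -> lan0 flow) x 2 RGMII crossings
= 4Gbps aggregate on the RGMII link, i.e. 2Gbps full duplex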
The frame rate improvement comes from the dedicated interrupt of the PCIe interface: with the smp_affinity settings I can now split the workload between the two cores. IRQ 32 (PCIe) is handled by core 1, IRQ 9 (RGMII) by core 0, and together the two cores can manage more frames.
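For reference, a minimal sketch of how these affinity masks can be set programmatically, using the standard Linux /proc/irq/<n>/smp_affinity hex bitmask interface (needs root; equivalent to echoing the masks from a shell):

# Pin IRQ 32 (PCIe) to core 1 and IRQ 9 (RGMII) to core 0.
# /proc/irq/<n>/smp_affinity takes a hex CPU bitmask.
AFFINITY = {
    32: 0x2,  # PCIe interrupt  -> core 1 (bit 1)
    9:  0x1,  # RGMII interrupt -> core 0 (bit 0)
}

for irq, mask in AFFINITY.items():
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(f"{mask:x}")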
In my first test there was only the RGMII interrupt (IRQ 9) signalling all network traffic. It was handled exclusively by one core while the other one was idle, which bottlenecked the frame rate.
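This imbalance is easy to see in /proc/interrupts, where the kernel exposes per-CPU interrupt counters. A small sketch that samples the counters for the two IRQs over five seconds (the file's layout is standard: a header row of CPUs, then one row per IRQ with one count per CPU):

import time

def irq_counts(irqs=("9", "32")):
    # Parse /proc/interrupts: each data row starts with "<irq>:"
    # followed by one interrupt counter per CPU.
    counts = {}
    with open("/proc/interrupts") as f:
        ncpus = len(f.readline().split())  # header row: CPU0 CPU1 ...
        for line in f:
            fields = line.split()
            irq = fields[0].rstrip(":")
            if irq in irqs:
                counts[irq] = [int(n) for n in fields[1:1 + ncpus]]
    return counts

before = irq_counts()
time.sleep(5)
after = irq_counts()
for irq, now in after.items():
    delta = [n - b for b, n in zip(before[irq], now)]
    print(f"IRQ {irq}: interrupts per core over 5s: {delta}")

With everything on IRQ 9 (as in my first test) one column grows and the other stays flat; with the split affinity both cores show activity.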