
Speed issues with bridge firewall



I've built a bridging firewall for our compute cluster, and I've run across a few issues that I'm hoping someone can help me with. First, let me explain my setup.

The firewall box is a SuperMicro 1U box with a ServerWorks GC-LE chipset, dual 1.8 GHz Xeons, 1 GB RAM, a 40 GB hard drive, and two gigabit NICs (one Intel, the other NatSemi 83820). OpenBSD doesn't support SMP, so only one of the processors is being used.

The test machines were three of our gateway machines.  All of them have
gigabit ports using the tg3 chipset.  They are connected using a NetGear
"dumb" gigabit switch.

The firewall will be filtering a gigabit connection to our cluster, and
includes some basic Quality of Service functions to ensure that SSH traffic has precedence over, say, BBFTP.


I tested throughput using iperf [http://dast.nlanr.net/Projects/Iperf].
I made 5 runs in each configuration, dropped the highest and lowest, and
averaged the remaining three scores.  I ran iperf with the following
options: "-t 1200 -P 30" (which means "run the test for 1200 seconds,
running 30 parallel threads").
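
The averaging step described above (five runs, drop the highest and the lowest, average the remaining three) can be sketched roughly as follows. The function name and the sample values are placeholders for illustration only; they are not the individual run results, which aren't listed here.

```python
def trimmed_mean(scores):
    """Drop one highest and one lowest score, then average the rest."""
    if len(scores) < 3:
        raise ValueError("need at least three scores to trim both ends")
    ordered = sorted(scores)
    kept = ordered[1:-1]  # discard the single lowest and single highest
    return sum(kept) / len(kept)


# Placeholder numbers, NOT measured iperf results.
example_runs = [10.0, 20.0, 30.0, 40.0, 50.0]
print(trimmed_mean(example_runs))
```

Dropping the extremes keeps a single unusually good or bad run from skewing the reported figure.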

Results:

    No firewall:    939 Mbits/sec throughput
    Firewall:       785 Mbits/sec throughput

So our bridging firewall achieves ~84% of the throughput measured without it (785 of 939 Mbits/sec). However, during testing the firewall had a load level of 4.3. There doesn't appear to be any packet loss, but I'm not sure whether the load is affecting latency. Does anyone know a good way of testing that? The firewall console is completely frozen when it's under that stress.
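
For reference, the ~84% figure is relative to the no-firewall measurement, not to the nominal gigabit rate. A quick check of the arithmetic, where the 1000 Mbits/sec nominal rate is an assumption added for comparison rather than something measured:

```python
no_firewall_mbits = 939   # measured without the firewall in the path
firewall_mbits = 785      # measured through the bridging firewall
nominal_mbits = 1000      # assumed nominal gigabit rate, not a measurement


def percent_of(measured, baseline):
    """Return measured as a percentage of baseline."""
    return 100.0 * measured / baseline


vs_baseline = percent_of(firewall_mbits, no_firewall_mbits)
vs_nominal = percent_of(firewall_mbits, nominal_mbits)
print(f"{vs_baseline:.1f}% of the no-firewall throughput")
print(f"{vs_nominal:.1f}% of the nominal gigabit rate")
```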

After seeing the load level, I tried to reduce it by adjusting the ruleset. The first thing I did was measure the "null" case: I disabled filtering and measured the load level when the box was simply functioning as a bridge.

I was a bit shocked when the firewall, doing nothing but bridging, had a
load level of 3.7 under the same test. "Load level" is about as precise a metric as bogomips, but squinting at the numbers (3.7 for bridging alone versus only another 0.6 with filtering enabled), bridging seems to be taking roughly 6 times as much CPU time as filtering.
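
The "roughly 6 times" estimate comes from treating the difference between the two load levels as the cost of filtering. A back-of-the-envelope sketch, assuming load level scales more or less linearly with CPU work (a loose assumption at best):

```python
load_bridge_only = 3.7        # load level with filtering disabled
load_bridge_and_filter = 4.3  # load level with filtering enabled

# Assumption: the extra load with filtering on is attributable to filtering.
filtering_share = load_bridge_and_filter - load_bridge_only
ratio = load_bridge_only / filtering_share

print(f"filtering adds about {filtering_share:.1f} to the load level")
print(f"bridging costs about {ratio:.1f}x as much as filtering")
```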


Does OpenBSD 3.3 not support zero-copy? Or is there something trivial I'm missing here? I wouldn't have expected bridging to put that kind of load on the CPU.

Any ideas are appreciated.


Mat Binkley [email protected]