Re: pf vs ASIC firewalls
On Mar 14, 2005, at 2:26 PM, Mike Frantzen wrote:
Could someone please tell me the advantages of PF against firewalls
using ASIC technology, in terms of security and performance?
Many (most? all?) vendors shipping what they call ASIC firewalls are
actually running software on a network processor (NPU). The benefit is
that most NPUs will process packets in real-time so if they claim to
support X gigabit per second then they can probably sustain that even
with minimum-sized 64-byte ethernet frames.
You think? I've been a bit curious about this, especially in the
low-end ("cheap") consumer-grade hardware. Just because a device
"supports" 10/100 ethernet, that doesn't mean that it can saturate it
across all ports simultaneously. Assuming nothing is being funneled
(inbound traffic on 2 ports destined for the same output port), which
is "legitimate" lossage, can one of those sub-$100 10-16 port 10/100
switches really saturate half their ports at 100 Mbit/s? That is,
you might get full throughput at 100 Mbit/s from the first port to the
second, but there has to be some central IC that keeps track of minimal
forwarding tables and locks & releases access to the transceivers. Can something
like that actually pass 100Mbit simultaneously from 0->1, 2->3, 4->5,
etc? I may be completely wrong, but I'd bet the design specs for those
things basically say that the entire "device" can move 100Mbit total.
After all, that's more than sufficient for 99% of the people that would
buy them. Hell, half the people that buy them probably wouldn't notice
the difference between 10Mbit and 100Mbit. Just a curiosity.
The down side to NPUs is that they have to service every packet in a
fixed amount of time so they can't do much. They need to have fixed
sized state and fragment reassembly tables. They also aren't allowed to
do much work per packet. You will also be able to surf Moore's law
better with a normal x86 processor than with an NPU.
Well, the only difference between that and the requirements for a
PF-style setup is that there's more room for buffering. That is, you
don't strictly have to finish servicing a packet by the time the next
one arrives (or drain a hardware buffer before it fills again), but if
you want to saturate the link, your "service time" per packet has to be
less than the time it takes that packet to transmit. (If the service
time is below the transmit time, you get 100% saturation. If it takes
5-6us to transmit a 64-byte ethernet frame at 100 Mbit/s, and you take
50-60us to process it and hand it off to the outgoing transceiver (if
necessary), then you only get about 10% saturation.)
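To put that arithmetic in one place, here's a back-of-the-envelope
sketch in Python (the 55us service time is just an assumed figure in
the 50-60us range above, and it ignores preamble and inter-frame gap):

    # Fraction of line rate you can sustain if per-packet service time
    # exceeds the wire time of a minimum-sized frame. Illustrative only.
    link_bps    = 100e6    # 100 Mbit/s link
    frame_bytes = 64       # minimum ethernet frame
    service_us  = 55.0     # assumed software service time per packet

    wire_us = frame_bytes * 8 / link_bps * 1e6     # ~5.1 us on the wire
    saturation = min(1.0, wire_us / service_us)    # fraction of line rate
    print("wire time: %.1f us, saturation: %.0f%%"
          % (wire_us, saturation * 100))           # roughly 9-10%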
The strict time restrictions are what limit functionality, of course.
The more complex your ruleset, the longer it takes to process. Rules,
like PF's, that can look inside protocol data for additional blocking
are even more processor-intensive. Things like keeping statistics add
some minimal overhead, too.
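Purely to illustrate how a fixed per-packet budget caps ruleset size
(every number below is made up, not measured from pf):

    # Hypothetical per-packet time budget vs. per-rule evaluation cost.
    budget_ns   = 5000   # per-packet budget at 100 Mbit/s line rate (ns)
    fixed_ns    = 1000   # assumed fixed cost: lookups, checksums, stats
    per_rule_ns = 50     # assumed cost to evaluate one rule

    max_rules = (budget_ns - fixed_ns) // per_rule_ns
    print("rules that fit in the budget:", max_rules)   # 80 with these numbers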
And of course, buffering really only makes things more complex: every
system bus is added overhead and a potential bottleneck. Is that packet
being stored in an on-chip cache, or did the OS copy it to off-chip
memory? Etc. Add to that the fact that one processor has to do this for
all interfaces: moving from 2 interfaces (one pair) to 6 (three pairs)
triples the work, so each pair gets, what, perhaps 33% of the
theoretical bandwidth (not even counting the extra routing table
overhead, or added rule complexity), because every interface in every
pair may be saturated with traffic that needs to be serviced to meet
the theoretical maximum capacity.
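Rough arithmetic for that claim, assuming the single CPU (not the
links or buses) is the bottleneck; the aggregate packet rate it can
sustain is an invented figure:

    # One CPU's aggregate capacity divided among interface pairs.
    cpu_pps = 300_000        # assumed packets/sec the CPU can service
    for pairs in (1, 3):     # 2 interfaces vs 6 interfaces
        print("%d pair(s): %d pps per pair (%.0f%% of the 1-pair case)"
              % (pairs, cpu_pps // pairs, 100.0 / pairs))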
That doesn't even get into things like logging, or restrictions from
the OS design. If you want to be able to saturate AND log, you need to
know what you're logging to. Writing to disk, or sending messages to a
loghost, even on a dedicated interface, adds extra system latency or
traffic. In theory, extreme logging on all 64-byte ethernet frames for
some rule or another could generate MORE traffic than the traffic you
are logging.
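A crude illustration of that last point (the log record size is a
guess; real pflog or syslog records vary):

    # If each logged minimum-sized frame produces a log message larger
    # than the frame itself, logging traffic exceeds the logged traffic.
    frame_bytes  = 64
    record_bytes = 150    # assumed bytes per log message incl. headers
    print("log bytes per logged byte: %.1fx"
          % (record_bytes / frame_bytes))   # > 1x => more log than traffic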
I have no idea if the BSDs--or any general-purpose host OS, for that
matter--will, for example, prevent logging to disk during bursts of
traffic. Perhaps there's a kernel option for prioritizing packet
servicing over the log writes.
A less strictly designed PF-style system might give you added functionality by
allowing traffic to be buffered, and still handle a moderate-load
network without dropping frames, if it can "catch up" when traffic
bursts fall off. And that might be sufficient for most purposes,
having more features than a hardware-only solution, but only a
strictly-designed real-time system can make any guarantees. And those
can only do so by imposing limits on the size of the "rules" applied.
All of that said, I wonder if there isn't some way to implement
something vaguely PF-ish in an FPGA that would allow more control over
the rulesets than an off-the-shelf ASIC.