
Re: pps or other unknown upper bound?



On Fri, Nov 18, 2005 at 12:49:48AM +0100, Daniel Hartmeier wrote:
> On Thu, Nov 17, 2005 at 04:52:40PM -0500, Jon Hart wrote:
> 
> > Bingo.  There are entries in the logs when this condition happens, but
> > it is not entirely clear what the problem is, aside from the fact that
> > it is a "BAD STATE":
> > 
> > Nov 17 21:44:48 fw-1 /bsd: pf: BAD state: TCP 10.7.0.112:12345
> >    10.7.0.112:12345 10.8.0.112:59635 [lo=3722728956 high=3722735388
> >    win=6432 modulator=4006337120 wscale=0] [lo=3737716700
> >    high=3737723132 win=6432 modulator=3433376110 wscale=0] 9:9
> >    S seq=3723083242 ack=3737716700 len=0 ackskew=0 pkts=5:5 dir=in,fwd
> > Nov 17 21:44:48 fw-1 /bsd: pf: State failure on: 1       | 5  
> 
> The address/port pairs are probably clear (there are three pairs; two
> are equal unless the state involves translation).
> 
> What you see is the existing state pf has and the packet that was
> associated with it (based on the source/destination addresses/ports),
> but failed the sequence number checks.
> 
> The square brackets [] contain the sequence number windows the state
> allows. The digits 1 and 5 in the last part indicate which window rules
> were violated: the packet's seq=3723083242 is higher than the upper
> limit high=3722735388. That's why the packet is blocked.
> 
> Now, the theory is that the client is reusing the source port 59635
> before the time-wait of the previous connection (which the state we see
> represents) is over.
>
> The 'S' part means the blocked packet was a SYN.
> 
> The '9:9' part means the FINs were exchanged and ACKed in both
> directions, so the connection was closed normally (and no RSTs were
> sent).
> 
> 'pkts=5:5' means that the prior connection consisted of only 5 packets
> in each direction.
Thank you for such a detailed explanation.  I'm not sure whether this is
covered anywhere in the docs, but if it isn't, it's a great thing to have
in the archives, and hopefully others will benefit from it.
> This all makes sense. Assuming you're fetching a tiny document from the
> web server in a fast loop, the client will run out of random source
> ports. It's probably honouring 2MSL up to the point where it simply has
> no choice (other than stalling further connect(2) calls), until ports
> free up.
The funny thing is, in my tests, despite having ~31000 source ports to
choose from, the client is unlucky enough most of the time and very
quickly manages to reuse a port.  It depends on what else the client is
doing, but I saw a case earlier today where, after only about 300
connections, a source port was reused.
> I think the real solution in this case is to re-think the application
> protocol. If the application re-connects to the server at this rate
> (like 32,000 connections per minute), it's wasting a lot of network
> bandwidth (for connection establishment and tear-down) and accumulating
> a lot of latency. It would be much smarter to use one persistent
> connection and pass multiple transactions over that. Maybe SOAP supports
> that (if not, is it authenticating 32,000 times per minute, too? ;)
This is the way I've been starting to lean.  The restrictions are there
for a good reason, and it just so happens that the clients are doing
things they should not be doing.  As much as I'd like to "fix" the
client's stack, that is a knob-turning expedition I'm not too keen on at
this stage in the game.  We have already started down the road of
rethinking the client code to cache, use keep-alives, or take any of a
number of other approaches.
> If you want to adjust pf so it will expire the states earlier, you can
> lower the tcp.closed timeout value (from the default 90s to 1s). Expired
> states are only removed in intervals (default is 10s, adjustable), so if
> you lower a timeout to < 10s, you probably also want to lower the
> interval accordingly (that may increase CPU load if you have many states).
Yup, this is something that I've considered.  I've also seen your next
message and I'm pondering doing per-rule timeouts, but I have not found
a setting that is optimal.  Yet.
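For the archives, here is roughly what those knobs look like in pf.conf.
This is only a sketch with illustrative values, not settings I have
actually settled on for this workload:

    # expire fully closed TCP states after 1s instead of the default 90s
    set timeout tcp.closed 1
    # purge expired states every 5s instead of the default 10s
    set timeout interval 5

As Daniel notes, the shorter purge interval gets those states out of the
table sooner, at the cost of a little extra CPU when the state table is
large.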
> The reason we keep a state in FIN_WAIT or TIME_WAIT is that there might
> be spurious packets arriving late (like packets that travelled through
> slower alternative paths across the network). By keeping the state
> entry, those are associated with the state and don't cause pflog logging
> (they'd usually not have a SYN flag and would get blocked and logged
> according to your policy). So you'll likely not break anything by
> lowering the timeout values, but you might be getting some more packets
> logged as ordinary blocks.
This has been my approach for much of today -- I know of many ways to
fix or alleviate this problem on the client, the firewall, and/or the
server, but most either have undesirable side effects or unknown
consequences.  In my opinion, the consequences of rethinking the client
are pretty clear, and it seems like the best approach at this point.
That's not to say that I or others won't run into this problem again, so
your suggestions will be very helpful.  Do per-rule timeouts where the
traffic is known to be difficult, like this case was, or, if worst comes
to worst, use Karl's suggestion of not keeping state.  But, as he and
others said, ew!
Thanks,
-jon