
Re: Problems with stalling sessions

On Tue, Nov 08, 2005 at 01:39:21AM +0100, Per-Olov Sjöholm wrote:
> Hi
> I have a redundant firewall with CARP. 3.6 STABLE plus all patches from CVS 
> for stable (updated last week). The firewalls have 7 nic ports each. 
> External, internal, pfsync and 4 dmz interfaces. The servers are firewalls, 
> DNS, mailrelay, antivirus, spamkillern ntp and dhcp for internal hosts.
> Everything works perfect! Except for the facts that sessions are stalling 
> during transfers of big files. I have tried to remove "aggressive timeouts", 
> "adaptive timeouts" and "scrub" without success. It doesn't matter if the 
> transfer goes over NAT from Lan to internet or from a real IP on dmz2 to the 
> internet. We have tried many different protocols such as SSH, amanda and more.
> Turning on -x loud give ALOT of the below (maybe irrelevant??)
> --snip--
> Nov  8 00:49:53 san /bsd: pfsync: ignoring stale update (3) id: 
> 4367413c000b4c76 creatorid: e31b4f22
> Nov  8 00:49:53 san /bsd: pfsync: ignoring stale update (3) id: 
> 4367413c000b4c75 creatorid: e31b4f22
> Nov  8 00:49:53 san /bsd: pfsync: ignoring stale update (3) id: 
> --snip--
Do you get these all the time, or only when the system is under load?
For some reason your primary carp host is getting old updates from
someone else, presumably the other carp machine.  Something seems out of
whack here.
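
One way to answer the "all the time or only under load" question is to
tally the stale-update messages per minute and per creatorid.  The
following is only a rough sketch: it assumes the syslog lines look
exactly like the ones quoted above (with the wrapped "id:" line rejoined
onto one line), and the function name is made up for illustration.

```python
import re
from collections import Counter

# Pattern for the pfsync message quoted above, one message per line.
# The timestamp is captured down to the minute only.
STALE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+/bsd:\s+"
    r"pfsync: ignoring stale update \(\d+\) id:\s*(?P<id>[0-9a-f]+)\s+"
    r"creatorid:\s*(?P<creator>[0-9a-f]+)"
)

def stale_updates_per_minute(log_text):
    """Count stale-update messages per (minute, creatorid) pair."""
    counts = Counter()
    for line in log_text.splitlines():
        m = STALE.match(line)
        if m:
            # Collapse the double space syslog puts before one-digit days.
            stamp = " ".join(m.group("stamp").split())
            counts[(stamp, m.group("creator"))] += 1
    return counts
```

Feeding it the log from an idle period and from a period with a large
transfer running should show whether the message rate follows the load,
and whether all of the updates come from a single creatorid.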
> Nothing comes up as blocked in the firewall log when a session is stalling.
> I have Intel 10/100 (fxp nics) and Soekris lan1641 quad boards (sis nics)
When I read 'sis' I immediately suspected those cards, as I know others
on the list have had problems with them under load in the past.  I
believe this may have been fixed in more recent releases, but don't
quote me on it.
> Don't look to close to the queuing stuff as it's not complete.
> The rows from Firewall-1 pf.conf (primary) on the link below.
> http://www.incedo.org/~sjoholmp/pf/pf.conf
> (secondary FW have exactly the same pf.conf)
The only comment I have about that ruleset that may be relevant is the
max states.  Even though you've got it commented out, the limit will
still default to 10k states unless you say otherwise.  This may not even
be relevant, because a large transfer should not necessarily drive the
number of states through the roof.  It depends on the method used to
download, of course.
> Any suggestions?
In no particular order...

Figure out why you are getting stale updates from pfsync.  Do a simple
test.  Your two carp hosts, ONE other client machine.  From the client,
initiate a connection outbound and ensure that the two pfsync hosts have
similar (if not identical) state tables.

When downloading, keep an eye on carp and see if the two hosts are
flopping between master and slave.  If you don't feel like doing this
manually, use ifstated (may not have been available in 3.6 though).

Use systat/vmstat to see how the system is acting under load.  Looks
like we are dealing with a 2M pipe so it shouldn't be an issue but worth
looking at anyway.

Take the second host out of the picture entirely and see if your
problems persist.
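
For the state table comparison in the first suggestion, something along
these lines might save some eyeballing.  It is a minimal sketch that
assumes you have saved the output of "pfctl -s state" from each host to
a text file and that matching states print as identical lines; the
function names are made up, and in practice fields such as the
connection state may legitimately differ for a moment between the hosts.

```python
def normalize(dump):
    """Return the set of non-empty state lines, whitespace collapsed."""
    return {" ".join(line.split()) for line in dump.splitlines() if line.strip()}

def diff_state_tables(master_dump, backup_dump):
    """Return (states only on the master, states only on the backup)."""
    master = normalize(master_dump)
    backup = normalize(backup_dump)
    return sorted(master - backup), sorted(backup - master)
```

With only one client generating traffic, both lists should be empty or
close to it.  Entries that stay in one list across repeated runs would
point at states that are not being synced.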