
Re: stalled connections between pf servers

Steve Witucke wrote:
I am new to using PF, long time user of IPFilter. I switched to OpenBSD/PF last
week to set up a system to provide me with redundancy for my outbound
connections. The setup consists of 2 machines, each connected to a different
internet connection, and serving two internal subnets. (See ASCII diagram below)

Well if nothing else, that is the best ascii diagram I've ever seen on this list. It's really good :)

One test I was able to do was to take one machine offline, and test my resultant
connectivity between the two subnets. I took one machine at a time offline, and
found running with 1 server (regardless of which) I was able to transfer a large
file with no problem at all between subnets. So I think I can rule out hardware
as my issue.

Just to be clear, you're removing machines from the 30 network one at a time? And with 1 left everything is stellar? Does it matter which server is left running?

I have one running theory at this time, but I'm not exactly certain how I could
go about testing it out. (viewing the ASCII diagram for this will help). Traffic
entering on fxp0 on HOBBES (from and destined for has two possible routes. It could be routed to CALVIN ( the
master for ), or simply leave out fxp1 which is on the 30.0
network. Further, I have noticed through some testing that if I ping from the 20.0 network, HOBBES responds, even though it is the
backup for that carp interface. So packets never make it to the other server I

On Hobbes, traffic to the 30 network only has one route and that will be thru fxp1. You can test this with "route -n get" followed by an address on the 30 network. Your routing is asymmetric. Traffic goes thru Hobbes in one direction and thru Calvin in the other.

Now, the return traffic on the 30.0 network thinks (and rightly so) that its
default gateway is CALVIN, and if my above test is true, then instead of passing
the packet over to HOBBES for routing to the 20.0 network, the packet leaves
CALVIN on the x12 interface and comes back on the 20.0 network from a different

Exactly, same as traffic from 20 to 30.
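
To make the asymmetry concrete, here is a toy model in Python of how a stateful filter can lose track of a connection when the two directions cross different firewalls. It is only a sketch under stated assumptions: the Firewall class, the handshake helper and the host names are hypothetical, and real pf state matching (sequence tracking, pfsync update timing) is far more involved than a set lookup.

```python
# Toy model only: hypothetical classes, not pf's real state-matching logic.

class Firewall:
    def __init__(self, name, states=None):
        self.name = name
        # Connection keys this firewall holds state for.
        self.states = set() if states is None else states

    def forward(self, src, dst, syn=False):
        """Return True if the packet is passed, False if it is dropped."""
        key = frozenset((src, dst))   # same key for both directions
        if syn:
            self.states.add(key)      # the first packet creates state
            return True
        return key in self.states     # later packets need existing state


def handshake(out_fw, back_fw):
    """The SYN leaves via out_fw; the reply comes back via back_fw."""
    a, b = "host-on-20", "host-on-30"  # placeholder host names
    return out_fw.forward(a, b, syn=True) and back_fw.forward(b, a)


# Independent state tables: the reply reaches a firewall with no state.
print(handshake(Firewall("HOBBES"), Firewall("CALVIN")))    # False

# One shared table, standing in for fully synchronized state.
shared = set()
print(handshake(Firewall("HOBBES", shared), Firewall("CALVIN", shared)))  # True
```

In this simplified picture the reply only gets through when both boxes see the same state, which is why the state synchronization question below matters.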

What did tcpdump tell you about traffic stalling?
Are the two switches directly connected?
Are you seeing carp transitions at all? That is, are you seeing either machine flip between master and backup?
Are states being synchronized properly between the firewalls?

Your ruleset looks ok to me.