Re: stalled connections between pf servers



>Well if nothing else, that is the best ascii diagram I've ever seen on 
>this list. It's really good :)
Thanks :)
>Just to be clear, you're removing machines from the 30 network one at a 
>time? And with 1 left everything is stellar? Does it matter which server 
>is left running?
Not quite. Let me be a bit more specific. If I shut down HOBBES and let all the
traffic go through CALVIN, the traffic is perfect. The same is true if I shut
down CALVIN and let all the traffic go through HOBBES. My problem comes when
both are running, with one as master and the other as backup for each subnet
respectively: traffic between the subnets stalls.
>On Hobbes, traffic to the 30 network only has one route and that will be 
>thru fxp1. You can test this with "route -n get 192.168.30.0". Your 
>routing is asymmetric. Traffic goes thru Hobbes in one direction and 
>thru Calvin in the other.
Well, there's the part I am not so certain about. Looking back at my diagram,
you will see that each switch/subnet has two cables, one from each server. If
not for CARP, I would agree with you as theoretically, HOBBES is the default
gateway for 20.0 traffic, and CALVIN is the default gateway for 30.0 traffic. 
However, I do not think that is the case. Here's my experiment:
If I am sitting on 192.168.20.20, for example, and I ping 192.168.20.1, tcpdump
running on HOBBES shows that it responded to the ping. It's the master for
192.168.20.1, so it should. The really odd part is when I ping 192.168.30.1. I
found that HOBBES still responds, not CALVIN; I do not see any traffic with
tcpdump running on CALVIN, even though HOBBES is the backup for the 192.168.30.1
address.
Further, if I do an "ifconfig carp0 down" on HOBBES, then I DO see a response
from CALVIN for 30.1. I can guarantee that HOBBES says "BACKUP" before I try
this. 
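One way to make the ping experiment above less ambiguous might be to have
tcpdump print the Ethernet header with -e, so each ICMP packet shows the MAC
addresses it was sent from and to. This is only a rough sketch: fxp0 is assumed
to be the 192.168.20.0/24 interface on HOBBES, and mac_kind is a hypothetical
helper (not an existing tool) that just tells the CARP virtual MAC range apart
from any other address.

```sh
# Run as root on each firewall while pinging from 192.168.20.20.
# -n skips name lookups, -e prints the Ethernet (link-level) header.
#   tcpdump -n -e -i fxp0 icmp      # on HOBBES; interface name is an assumption

# Hypothetical helper: classify a MAC address copied from the tcpdump output.
# CARP virtual MACs have the form 00:00:5e:00:01:<vhid>.
mac_kind() {
    case $(printf '%s' "$1" | tr 'A-F' 'a-f') in
        00:00:5e:00:01:*) echo carp ;;
        *)                echo physical ;;
    esac
}
```

For example, "mac_kind 00:00:5e:00:01:01" prints "carp", while any other
address prints "physical".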
>What did tcpdump tell you about traffic stalling?
Honestly, I think I need some help with this. I'm not too sure I am seeing what
I should. I see a connection starting, and then tcpdump just stops displaying
for a given interface. I *think* I've seen that traffic pick up on the
same-subnet interface of the other server. Though that test was rather late at
night and I might have been seeing things...
However, if I am not, then I think that is my problem. Can you think of a
route-to I could do, or something else, that would ensure that traffic entering
on fxp0 and destined for 30.0/24 would be routed back to HOBBES and back out
fxp0, and, for example, never be routed back out xl1 on CALVIN?
>Are the two switches directly connected?
It's actually one switch (Dell 3024) with two VLANs. There is no connectivity
between the two VLANs except through the OpenBSD boxes, so it just made sense to
me to draw them as separate switches, since that's essentially what they are.
>Are you seeing carp transitions at all? i.e, are you seeing either 
>machine flip between master and backup?
I do not believe that I am seeing that. I can run ifconfig over and over and
always see backup/master on one machine and master/backup on the other, as they
should be.
>Are states being synchronized properly between the firewalls?
I think that they are. When I do a tcpdump on pfsync0, I see messages for
inserts and deletes to the state table etc. Is there a way I can check for sure
that the table is actually being updated? I do notice that if I just yank a
cable during a ping -t, I lose 1 reply, and then the other machine takes
over. Could that be an indication of the state not being there? I've not tried a
longer scp request yet.
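One possible way to check this would be to dump the state table on both
firewalls with "pfctl -s states" and compare the two dumps. The sketch below is
untested against this setup; the file names are made up, compare_states is a
hypothetical helper, and it assumes the dumps are comparable line by line, which
may not hold exactly (for instance if the interface names differ between the
machines).

```sh
# As root on each firewall, dump the current state table to a file:
#   pfctl -s states > /tmp/states.hobbes     # on HOBBES (file name assumed)
#   pfctl -s states > /tmp/states.calvin     # on CALVIN (file name assumed)
# Then copy both files to one machine and compare them.

# Hypothetical helper: print the state lines that appear in only one dump.
# Lines unique to the second file are indented with a tab (comm -3 format).
compare_states() {
    a=$(mktemp) b=$(mktemp)
    sort "$1" > "$a"
    sort "$2" > "$b"
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage:
#   compare_states /tmp/states.hobbes /tmp/states.calvin
```

If states are being synchronized, the output should be close to empty, apart
from entries created or expired between the two dumps.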
>Your ruleset looks ok to me.
Yea :) And thank you very much for your help.