Fwd: PF Load balancing plans?
dormando <[email protected]>
Mon, 15 Nov 2004 23:52:07 -0500
Sigh. I thought I had [email protected] cc'ed, but I didn't. Sorry :(
Thanks for taking the moment to respond, I appreciate it.
> I'm not aware of any specific plans or ongoing work in that area. Maybe
> start with evaluating the features pf has right now, and give us an idea
> of what is missing for your setup.
Okay. Below I describe what would be nice to have, keeping the
implementation side vague. I will try to avoid specifying more than
the actual needs for now.
> I can't promise that anyone will commit to a list of features, but if
> cost is not an issue and you want to donate, there are always
> opportunities, like
It would make me happy to be able to send at least one person to the
hackathon. However, the company I work for is more "typical", so I
cannot promise anything myself. I buy CDs and t-shirts when I can
because I like and use the system, but the company might only want to
pay once it has been given something. I won't hold anyone to that.
Either these things exist or they don't. If we don't go the OpenBSD
route, perhaps the reasons why could help make this side of PF more
attractive for larger companies like mine in the future.
To put it shortly: we want the load balancing feature set of something
like a Big/IP, although I'm not fully sure what types of load balancing
they offer these days. I don't mean its GUIs, config file formats, or
implementation, simply some base features.
A mechanism for health checking servers and marking them alive/dead
based on criteria. Things like SLBD start this, but we'd need
something more fully fleshed out and updated. This is probably the
A way to use pluggable load balancing algorithms, along with attaching
metadata such as weights to server definitions.
Round robin is nice, but so are weighted round robins, "least total
connection" balancing, slow start balancing, etc. I always liked
mod_backhand (http://www.backhand.org/) due to it being easy to write
my own load balancing algorithms. It suffered from implementation
issues such as having available information updated only once per
second (at 180+ hits per second, implementation details like that
become important problems), and it was HTTP only. This is one of the
biggest reasons why we can't make OpenBSD work for us right now. We
can hack in the above, but we need more load balancing options.
Complete flexibility would be fantastic.
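
To make the pluggable-algorithm idea concrete, here is a rough sketch in Python. None of this is PF or mod_backhand code; the Server class, the ALGORITHMS table, and the function names are made up for illustration. It shows a smooth weighted round robin and a weighted "least connections" picker sitting behind one common interface, so that adding another algorithm means adding one function.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    weight: int = 1
    active: int = 0    # connections currently open to this server
    current: int = 0   # running score used by weighted round robin


def weighted_round_robin(pool):
    """Smooth weighted round robin: every server gains its weight on each
    pick, and the winner pays back the total weight of the pool."""
    total = sum(s.weight for s in pool)
    for s in pool:
        s.current += s.weight
    best = max(pool, key=lambda s: s.current)
    best.current -= total
    return best


def least_connections(pool):
    """Pick the server with the fewest open connections per unit of weight."""
    return min(pool, key=lambda s: s.active / s.weight)


# Hypothetical registry: an algorithm is any function taking a pool
# (a list of Server objects) and returning one Server.
ALGORITHMS = {
    "wrr": weighted_round_robin,
    "leastconn": least_connections,
}


def pick(pool, algorithm="wrr"):
    """Choose a server with the named algorithm and count the connection."""
    server = ALGORITHMS[algorithm](pool)
    server.active += 1
    return server
```

With a pool such as `[Server("a", weight=3), Server("b", weight=1)]`, repeated calls to `pick(pool)` send three of every four connections to "a". Slow start could be written the same way, as one more function that scales a server's weight by how long it has been up.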
Matching on HTTP content can be a great help with figuring out
application load balancing. HAProxy
(http://w.ods.org/tools/haproxy/) does this very well at a regex and
header level. Big/IP's also typically have TCL-based filtering which
can be applied to each HTTP connection. Write a little script and
direct what resultant pool the connection will go to. This allows for
sites with a single hostname to load balance to many different
clusters for different reasons. Simple HTTP filtering can do this as
well, just to a lesser degree. We could also outright filter credit
card numbers going out in the clear "just in case", and stop IIS worms
from crawling through the load balancer. We're not talking about an
IDS, but an HTTP header filter for junk both simple and complicated.
It feels like overstepping some bounds to go there, but perhaps this
could still be handled in some way.
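
As a rough illustration of the simple regex and header matching described above, here is a Python sketch. It is not HAProxy or Big/IP syntax; the rule table, the pool names, and the choose_pool function are assumptions made up for the example. Each rule matches either the request path or one header, and the first match decides the pool. A small deny list drops request paths of the kind IIS worms probe for.

```python
import re

# Hypothetical rules: (what to match, header name, pattern, target pool).
RULES = [
    ("header", "Host", re.compile(r"^img\.", re.I), "static"),
    ("path", None, re.compile(r"^/chat/"), "chat"),
    ("path", None, re.compile(r"\.(zip|iso)$"), "downloads"),
]

# Request paths to refuse outright, e.g. probes for IIS executables.
DENY = [re.compile(r"/(cmd\.exe|root\.exe|default\.ida)", re.I)]


def choose_pool(path, headers, default="app"):
    """Return the pool name for a request, or None if it should be dropped."""
    if any(pattern.search(path) for pattern in DENY):
        return None
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    for kind, name, pattern, pool in RULES:
        subject = path if kind == "path" else lowered.get(name.lower(), "")
        if pattern.search(subject):
            return pool
    return default
```

A single hostname can then fan out to several clusters: `choose_pool("/chat/room1", {"Host": "www.example.com"})` goes to the chat pool, while an unmatched request falls through to the default application pool.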
Given that I brought up HAProxy, I would like to note that it does a
lot of things right in my mind. The configuration (not the format in
particular, but what it represents) allows a lot of connection
control, detail, and for easily configuring a large setup (we might
have over 200 small clusters). I can easily have multiple ports on
multiple IPs go to the same pool (our arrowpoint cannot do that right
now), run a backup maintenance page server if no other servers are
available, and fine-tune timeouts on connections. A pool of
chat servers should have a normal timeout range. A pool of application
webservers should cut the connection after 30 seconds. A pool of large
file download servers should cut the connection after an hour, etc.
This is all very important for us, as I believe we have all of these.
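
A sketch of what that configuration represents, written in Python rather than in any real config format. The Pool class, the pool and server names, and the addresses are all invented for illustration. It shows several IP and port pairs pointing at one pool, a backup maintenance server used only when nothing else is alive, and a per-pool timeout (30 seconds for application servers, an hour for downloads). The chat pool is left out because I have not put a number on its timeout.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Pool:
    servers: Dict[str, bool]        # server name -> currently alive?
    timeout: float                  # seconds before a connection is cut
    backup: Optional[str] = None    # maintenance page server, if any


# Hypothetical pools and listeners.
POOLS: Dict[str, Pool] = {
    "app": Pool({"app1": True, "app2": True}, timeout=30, backup="sorry1"),
    "downloads": Pool({"dl1": True}, timeout=3600),
}

# Multiple ports on multiple IPs can map to the same pool.
LISTENERS: Dict[Tuple[str, int], str] = {
    ("192.0.2.10", 80): "app",
    ("192.0.2.10", 8080): "app",
    ("192.0.2.11", 443): "app",
    ("192.0.2.12", 80): "downloads",
}


def targets(ip: str, port: int) -> Tuple[List[str], float]:
    """Return the servers eligible for a new connection and the pool timeout.

    Falls back to the backup server only when no regular server is alive.
    """
    pool = POOLS[LISTENERS[(ip, port)]]
    alive = [name for name, up in pool.servers.items() if up]
    if not alive and pool.backup:
        alive = [pool.backup]
    return alive, pool.timeout
```

Keeping the listener table separate from the pool definitions is what makes a setup with a couple hundred small clusters manageable: adding another address to a pool is one line.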
Then there's the optional but nice stuff, which I will try not to
ramble about as much:
Max connections per cluster, dropping ones that come in after that.
That way a DoS on one cluster won't exhaust all resources trying to
shove more connections down a pipe that can only handle 90 at once.
Max connections per server, dropping if all are overloaded. Now we can
specify that our application servers shouldn't try to handle more than
30 connections at once. If one big guy can handle 90, then he should
handle up to that much.
We can already dynamically add and remove servers, but can we
gracefully add and remove servers? Stop whole pools? We like being
able to stop new connections from coming in, then after everything is
done processing take the server down.
Timing out connections where we don't hear from the client after so
much time, or we don't hear from the server after so much time.
Syncing this data between load balancers? Failover and CARP load
balancing of the main servers' connections become hard with some of
these extensions (in my eyes at least).
Statistics. I absolutely love the amount of numbers PF leaves
available to me. I would like every number and statistic for a cluster
I could get my hands on.
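
The connection caps and graceful removal items above can be sketched together in a few lines of Python. This is only an illustration under my own assumptions; the Backend and Cluster classes and their method names are hypothetical, not anything PF provides. A new connection is dropped when the cluster is at its cap or when every server is full or draining, and a draining server is safe to take down once its open connections reach zero.

```python
class Backend:
    def __init__(self, name, max_conns):
        self.name = name
        self.max_conns = max_conns
        self.active = 0          # connections currently open
        self.draining = False    # True: finish existing work, take no more

    def can_accept(self):
        return not self.draining and self.active < self.max_conns

    def safe_to_stop(self):
        return self.draining and self.active == 0


class Cluster:
    def __init__(self, backends, max_conns):
        self.backends = backends
        self.max_conns = max_conns

    def accept(self):
        """Return the backend for a new connection, or None to drop it."""
        if sum(b.active for b in self.backends) >= self.max_conns:
            return None          # cluster-wide cap reached
        candidates = [b for b in self.backends if b.can_accept()]
        if not candidates:
            return None          # every server is full or draining
        chosen = min(candidates, key=lambda b: b.active)
        chosen.active += 1
        return chosen

    def release(self, backend):
        """Call when a connection to the backend has finished."""
        backend.active -= 1
```

To take a server down gracefully, set `draining = True` on it, keep calling `release` as its connections finish, and stop it once `safe_to_stop()` returns true. Stopping a whole pool is the same thing applied to every backend in it.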
There are a lot of things that PF does inherently that I don't believe
I need to list. One fault with using something like HAProxy is that it
is a TCP proxy; we lose the client's original IP address because the
connection is being proxied through a userspace utility. It uses a lot of file
descriptors. It's based on select and not kqueue or epoll. Other
At least the first bunch of stuff would put OpenBSD in the running as
a serious contender (if its speed is good and I can get it all
working). The full gamut would make it competitive with any *real*
feature a commercial product could have, IMHO. A lot of them offer IDS
and weird filtering and blah de blah blah, but I have already listed
the stuff that actually matters and makes for a halfway-decent feature
set. Flexible load balancing can go above and beyond that.
I'm getting tired, but I've tried to proofread this. If something is
not clear, I am fully willing to clarify. I don't want these to look
like demands; they are what we would need and like to have to be able
to use OpenBSD in our large setup, instead of some weird commercial
product that costs a couple kidneys. If any of this is taken seriously
and possibly implemented, I would be thrilled.