redundancy/failover reliability/clusters

matthewventura · Sep 24, 2004

What hardware setups are people using with DA to get a hosting system without a single point of failure?

existenz · Sep 24, 2004

RAID 0, 1 on two servers using round-robin DNS on different subnets; both are updated via a set of shell scripts. They both back up all important data to two off-site backups each night, and both make DVD copies of folders every 6 hours.

It's crazy, honestly: no matter how good a setup you have, you will ALWAYS lose some data during a major failure.

matthewventura · Sep 24, 2004

What's a major failure?

jmstacey · Sep 24, 2004

I would say if one server went down completely, since there is no easy way that I know of to keep things like MySQL databases perfectly in sync constantly (although there is that MySQL Cluster out now. Hmm, anyone tried it?)

Either way, there's always the possibility of there being new data on one server that wasn't synchronized with the other server before it crashed, which would result in lost data.

At least that's how I think of a major system crash.
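To put a rough number on that gap, here is a minimal sketch. It assumes (my assumption, not anything from the setups described in this thread) that the two servers sync on a fixed schedule starting at time 0, and that anything written after the last completed sync is gone if the primary dies. The function name and the timestamps are made up for illustration.

```python
def lost_writes(write_times, sync_interval, crash_time):
    """Return the writes that never reached the second server.

    Assumes syncs complete at t = 0, sync_interval, 2 * sync_interval, ...
    and that a sync captures every write made up to that moment.
    """
    # Time of the last sync that finished before (or exactly at) the crash.
    last_sync = (crash_time // sync_interval) * sync_interval
    # Everything written after that sync existed only on the crashed server.
    return [t for t in write_times if last_sync < t <= crash_time]


# Writes at minutes 5, 50, 61, 75 and 89; sync every 60 minutes; crash at minute 90.
print(lost_writes([5, 50, 61, 75, 89], 60, 90))
```

A shorter sync interval shrinks the window, but under this model it only reaches zero if every write is copied to the second server as it happens.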

nobaloney · Sep 25, 2004

You can avoid a single point of failure for static websites, but not for email, and not for databases and/or database driven sites, as they cannot be easily replicated in realtime.

To avoid a single point of failure for static websites:

Run two servers on geographically diverse networks.

Have each server run its own DNS, pointing websites to one or more IP#s entirely on the same server.

Point to one of the machines as ns1 and the other as ns2.
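A toy model of why this works, as a sketch only: each machine's name server hands out just that machine's own IP, so when a machine dies, its DNS dies with it and the surviving name server answers with the surviving web server. The names, the documentation-range IPs, and the fixed ns1-then-ns2 order are all assumptions for illustration; real resolvers choose among name servers in their own way.

```python
from typing import Optional

# Hypothetical setup: each machine runs its own name server, and each
# name server only hands out the IP of the machine it runs on.
SERVERS = {
    "ns1": {"ip": "192.0.2.10", "up": True},
    "ns2": {"ip": "198.51.100.10", "up": True},
}


def resolve(servers: dict, order=("ns1", "ns2")) -> Optional[str]:
    """Ask each name server in turn; a machine that is down cannot answer."""
    for name in order:
        server = servers[name]
        if server["up"]:
            # The answer always points at the machine that gave it.
            return server["ip"]
    return None  # both machines are down


print(resolve(SERVERS))            # answered by ns1
SERVERS["ns1"]["up"] = False       # first machine fails completely
print(resolve(SERVERS))            # answered by ns2, pointing at the second machine
```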

However there's still a single point of failure you have no control over, and that's the way DNS works:

Once an IP# is in the DNS server your user uses, it won't clear until the TTL has expired.

Additionally, once you've run your browser on your desktop, it won't clear its DNS cache until it's shut down and restarted. If you're using your local router as your DNS source (and most small networks do so) it probably isn't honoring TTL either, but perhaps will have to be restarted to clear its cache.

And many ISPs (among them AOL, though I'd hardly call them an ISP) don't even honor the short TTLs you'd normally use to avoid at least some of the problem.
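Here is a small sketch of the caching behavior described above, so the effect is easy to see. It is a toy cache with a hand-fed clock, not how any particular resolver, browser, or ISP is implemented; the class name, the `min_ttl` knob (standing in for a cache that ignores short TTLs), and the IPs are all assumptions.

```python
class CachingResolver:
    """Toy DNS cache: keeps serving an answer until its TTL runs out."""

    def __init__(self, lookup, min_ttl=0):
        self.lookup = lookup      # callable: name -> (ip, ttl_seconds)
        self.min_ttl = min_ttl    # models a cache that ignores short TTLs
        self.cache = {}           # name -> (ip, expires_at)

    def resolve(self, name, now):
        entry = self.cache.get(name)
        if entry and now < entry[1]:
            return entry[0]       # still cached, even if the server has died
        ip, ttl = self.lookup(name)
        # A cache that does not honor short TTLs stretches them to min_ttl.
        self.cache[name] = (ip, now + max(ttl, self.min_ttl))
        return ip


records = {"example.com": ("192.0.2.10", 300)}   # 5 minute TTL
honest = CachingResolver(lambda name: records[name])
stubborn = CachingResolver(lambda name: records[name], min_ttl=3600)

honest.resolve("example.com", now=0)
stubborn.resolve("example.com", now=0)

# The first server fails and DNS is repointed to the second one.
records["example.com"] = ("198.51.100.10", 300)

print(honest.resolve("example.com", now=100))    # still the old IP
print(honest.resolve("example.com", now=300))    # TTL expired, new IP
print(stubborn.resolve("example.com", now=300))  # old IP until the hour is up
```

So even with a 5 minute TTL, users behind the second kind of cache keep hitting the dead IP for as long as that cache chooses to hold the record.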

We're currently building some redundancy into our network at Level3 in Tustin, California, by having multiple ingress/egress points to the Internet through multiple routers and switches to our servers. But even so, there will be single points of failure.

Jeff

whoppe · Sep 29, 2004

Here,

Use this:

http://www.foundrynetworks.com/
http://www.netapp.com/products/filer/clustered.html

with servers from a leading manufacturer.
I'm assuming you can get this all done without busting the bank.
(Less than $100 000)
