How to prevent server downtime?

giant

Verified User
Joined
Jun 15, 2004
Messages
13
Hi

I want to make sure that my servers do not go down.

For example, I have two servers. What can I do to ensure that if server 1 goes down, server 2 takes over?

Is this done by having ns1/ns2 on server 1 and ns3/ns4 on server 2?

I welcome any suggestions on how I can prevent downtime!

thanks a million!

giant
 
It's a simple question with perhaps dozens of complex answers, and lots of ideas.

Generally you can't solve the problem without throwing a lot of money at it, and even then you might not (even Google and Yahoo have had outages).

If all the sites on the server are static sites (no databases, etc.) then you could have two servers configured the same, with the websites on both servers.

Then have one server answer only as ns1 and the other answer only as ns2.

Then let the DNS on each server point to that same server's own IP#s.

And change your TTL for all your zones on both servers to 600 seconds.

That way as long as both servers are up they'll share the load but if one goes down the other will get all the load.
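As a sketch, the zone on server 1 might look like this (example.com and the IP#s are placeholders, not from this thread); server 2's copy would be identical except that the A records would carry server 2's own IP#:

```
$TTL 600        ; short TTL so stale answers age out quickly
@    IN  SOA  ns1.example.com. hostmaster.example.com. (
              2004061501     ; serial
              3600 900 604800 600 )
@    IN  NS   ns1.example.com.
@    IN  NS   ns2.example.com.
@    IN  A    192.0.2.10     ; this server's own IP#
www  IN  A    192.0.2.10
```

Because each server hands out only its own address, a resolver that can't reach the dead server's DNS will fall through to the surviving nameserver and get the surviving web server's IP#.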

However, if someone is looking at your site when the server they're on goes down, they probably won't be able to see the site until that server comes back up, or until they or their ISPs reset their system(s).

Jeff
 
Hi Jeff

Q1. Assuming we adopt the above ns1 and ns2 on different servers, will it work for the email services?

Q2. Is RAID required to mirror servers 1 and 2?

Q3. If databases are involved, any suggestions on how the web databases should be treated?

thanks

giant
 
giant said:
Q1. Assuming we adopt the above ns1 and ns2 on different servers, will it work for the email services?
Yes, but probably not the way you expect it to work.

What will happen is that there will be two separate email accounts, and some mail will hit one and some the other. Your users will have to set up two accounts for receiving email, with the same information except for the "POP" server, where each user will have to use the IP#s rather than a server name, so s/he downloads email from both servers.

You could set up backup MX instead, but that's quite a bit more complex, and would result in the users not being able to get email if/when the main server is down.
Q2. Is RAID required to mirror servers 1 and 2?
Not required, and not doable, since RAID works only within a single server. Of course you could make each of your servers a lot more reliable by using RAID in it; our hosting servers use RAID. If both your servers are in the same datacenter, perhaps you could use a single NFS file store. But of course then your NFS server would be a single point of failure.

Additionally, we don't recommend that both servers be in the same data center, because if they are, then you're protecting against server outages but not against network outages.
Q3. If databases are involved, any suggestions on how the web databases should be treated?
It's possible to replicate MySQL across multiple servers, but it can be complex. Google for both commercial and open source solutions.
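As a rough sketch of the usual MySQL master/slave approach (the server names, replication user, and password below are placeholders, not anything from this thread):

```ini
# my.cnf on the master: enable the binary log, pick a unique server-id
[mysqld]
log-bin   = mysql-bin
server-id = 1

# my.cnf on the slave: just a different server-id
[mysqld]
server-id = 2
```

Then, in the mysql client on the slave, point it at the master and start replication: CHANGE MASTER TO MASTER_HOST='master.example.com', MASTER_USER='repl', MASTER_PASSWORD='secret'; then START SLAVE;. Note this is one-way: writes must still go to the master, so it helps with failover reads but doesn't by itself give you a full two-server write setup.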

Jeff
 
I'm not sure if any of this is even possible, just a wild idea.

Set up the second server to precisely mirror the first one, offline (not normally accessible), updating every 5 minutes, or however often you can do it without taxing system resources and assuming your script is capable of making a live copy.

Then possibly a script that has the backup server monitor the primary server; the moment the primary goes down for some duration of time (say 30 seconds or so, in case it was just a restart), have the backup server activate itself and forcefully bind and lock itself to the IP addresses so that all traffic now goes to the backup server.
Along with maybe a fancy notification system to warn you about it via email.

Then you can come in, check things out, repair the primary server, and set it up as the backup, changing the previous backup server into the primary. Something like that, to minimize downtime in case of a system failure.
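The monitoring idea can be sketched as a small shell watchdog run repeatedly (e.g. from cron) on the backup server. Everything here is an assumption for illustration: the primary's address, the three-strike threshold, and the takeover step, which on a real system would bind the IP (root required) and send the warning email.

```shell
#!/bin/sh
# Hypothetical watchdog for the backup server: count consecutive
# failed health checks, and take over after three in a row.
PRIMARY=${PRIMARY:-192.168.0.10}        # assumed primary address
STATE=${STATE:-/var/run/watchdog.fails} # consecutive-failure counter
CHECK=${CHECK:-ping -c1 -W2}            # health check; override to test

check_once() {
    fails=$(cat "$STATE" 2>/dev/null || echo 0)
    if $CHECK "$PRIMARY" >/dev/null 2>&1; then
        echo 0 > "$STATE"               # primary is up; reset counter
        return 0
    fi
    fails=$((fails + 1))
    echo "$fails" > "$STATE"
    if [ "$fails" -ge 3 ]; then         # ~30s down if run every 10s
        takeover
        return 1
    fi
    return 0
}

takeover() {
    # Placeholder: a real version would bind the primary's IP, e.g.
    # "ip addr add $PRIMARY/24 dev eth0" on Linux, and mail a warning.
    echo "primary down, taking over $PRIMARY"
}
```

The counter lives in a file rather than a variable so each cron invocation sees the history of the previous ones.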
 
A couple of Ideas..

Load Balancer: $$$$, *very* resilient, does everything
Basically the load balancer has your public IP address with multiple servers sitting 'behind' it with private IPs. When a request comes in to your public address it puts it through to the server with the least traffic.

Metaconfig might look like this:
"forward all 169.x.x.10:80 requests to $servergroup1"
"$servergroup1 = 192.168.0.10, 192.168.0.11, 192.168.0.12"

With advanced setups you can have load balancer/servers in multiple physical locations all answering the same IPs. This way if you lose FarmA then FarmB takes all the traffic. With an extensive setup your worst case is that some SSL sessions have to restart.

DNS round robin + NAS: $$, simple, not as flexible
A cheap/simple way to do it is set your 2 servers to share their filesystems off of an NFS server. Set your DNS record to have the IPs of both servers: www.domain.com = 169.x.x.10, 169.x.x.11
Then change the TTLs on your DNS records to something like 5 minutes. When a server goes down just pull its A record and traffic will start going to the remaining server.
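The shared-filesystem piece might look like this, with placeholder IPs (192.168.0.5 as the NFS server, .10/.11 as the two web servers):

```
# /etc/exports on the NFS server: export the web root to both servers
/var/www    192.168.0.10(rw,sync) 192.168.0.11(rw,sync)

# /etc/fstab entry on each web server: mount it at the same path
192.168.0.5:/var/www    /var/www    nfs    rw,hard,intr    0 0
```

Keep in mind the NFS server itself then becomes a single point of failure, and file locking over NFS needs testing with whatever the sites actually run.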

Clustering: $$$, high performance, limited support
Some services, such as MySQL, also have built-in 'cluster' services. In this case you might have one 'master' server and two 'front' servers. All traffic would go to a 'front' server: if it's a data read then the 'front' responds; if it's a data write then the 'front' passes the write request back to the 'master' server.
 
jmstacey said:
Set up the second server to precisely mirror the first one offline (not normally accessable), updating every 5 minutes or however often you can do it without taxing system resources and if your script was capable of making a live copy
<SNIP>

Totally possible, but tricky. Rsync would be used for copying your data over. Your main problems would be file-locking issues. If you had a managed switch you could even have your watchdog turn off the dead server's port and enable the live server's port. All the pieces to do this are available; you just need some sort of watchdog daemon to tie it all together.
 
jmstacey said:
I'm not sure if any of this is even possible, just a wild idea.
It's possible. It requires Portable IP#s and BGP routing (which can be an expensive solution, and is not ordinarily available to a colocated server) at each location, to change routing information for IP#s in real time.

Or...

Both servers must be in the same data center, which gives you no redundancy if the connection or the entire datacenter goes down.

(A few months ago a major Northern California data center had an electrical problem; their UPS system created surges which threw most of their hosted servers offline and damaged some beyond repair.)

We offer similar service to the one I explained, on a custom basis, in multiple data centers, or in one data center with redundant connectivity. For most people it's not cost effective.

Jeff
 
Re: A couple of Ideas..

donavan said:
Metaconfig might look like this:
"forward all 169.x.x.10:80 requests to $servergroup1"
"$servergroup1 = 192.168.0.10, 192.168.0.11, 192.168.0.12"
Won't work with DA because DA requires public IP#s.

And it has other problems as well, such as everything being in the same datacenter... and you've just moved the single point of failure from the server to that expensive load-balancing hardware.
With advanced setups you can have load balancer/servers in multiple physical locations all answering the same IPs. This way if you lose FarmA then FarmB takes all the traffic. With an extensive setup your worst case is that some SSL sessions have to restart.
Better, but you still have to have your DA servers on routable IP#s.
DNS round robin + NAS: $$, simple, not as flexible
A cheap/simple way to do it is set your 2 servers to share their filesystems off of an NFS server. Set your DNS record to have the IPs of both servers: www.domain.com = 169.x.x.10, 169.x.x.11
Then change the TTLs on your DNS records to something like 5 minutes. When a server goes down just pull its A record and traffic will start going to the remaining server.
Then you're in the same data center, with the same reliability issues I've mentioned previously. And presuming all your services will work in an NFS environment, your NFS server is now your single point of failure.

Jeff
 
I agree...

Uptime costs money. There's no way around it. If someone *really* wants that mythical five-nines uptime then they should be prepared to drop LARGE coin on it. There's a whole industry (Alteon, F5, Foundry, etc.) built around this type of service.

I think for the original poster's budget the multi-site NS approach would work fine, but any dynamic content could put a crimp in it. I wonder how well a master/slave MySQL setup would work across remote locations?

Diverse NS/MX servers are simple, easy, and a smart plan. Big, physically diverse farms are a great idea, but at that point you're looking at six-digit costs just for hardware.
 
Multiple diverse NS servers yes.

MX servers? I'm not sure anymore, since you'd need a method for accepting email only for actual users while the main MX is down.

Most email experts are no longer recommending multiple diverse MX servers for just this reason. If you don't have an up-to-date list of actual users available to the secondary MX servers, you'll be collecting a lot of undeliverable and unreturnable spam, viruses, etc.

Jeff
 
jlasman said:
Multiple diverse NS servers yes.

MX servers? I'm not sure anymore, since you'd need a method for accepting email only for actual users while the main MX is down.

Most email experts are no longer recommending multiple diverse MX servers for just this reason. If you don't have an up-to-date list of actual users available to the secondary MX servers, you'll be collecting a lot of undeliverable and unreturnable spam, viruses, etc.

Jeff

Isn't something like this possible ?

Backup server
It is linked to the primary server with BIND zone transfers, and there are MX records for this server.
If the primary server goes down, the backup server should take over all email functions for the domains listed in the /var/named dir.
So it needs to accept all mail to those domains and forward it to the primary server when it comes back online.

Furthermore, it would be a nice touch if the server changed all DNS records during HTTP downtime to point to a page with a maintenance message, so visitors wouldn't get a browser error about the server being down.

I believe the first part would be realizable, although I don't know how to do something like that in Exim (if someone can tell me how to do it, I would like that very much). For the latter part, I'm not so sure, as the DNS changes would need to be almost dynamic; I don't know if BIND or Apache could do that...
 
Backup MX

Well, that's the standard backup MX. It accepts *all* mail addressed to your domains and then forwards it on to your primary site once it's back up. I've only done it in Qmail, but it's very simple.
I think what jlasman meant was that you're going to get shedloads of junk for invalid users. You'll basically be acting as a catchall, so when Spammy McSpamson sends to [email protected], you're going to have a lot of crap pointed at your primary once it's back up.
 
Yep.

And this creates so many problems in real life that most experienced professional admins are abandoning backup MX servers in favor of building servers less likely to fail and putting them on networks less likely to fail.

Jeff
 
jlasman said:
Yep.

And this creates so many problems in real life that most experienced professional admins are abandoning backup MX servers in favor of building servers less likely to fail and putting them on networks less likely to fail.

Jeff

Possibly true, but the main problem is that everything can go wrong (and it will one day...).
I am looking for a way to let a server do what I described before, with Exim. Since our new backup server (I don't know if that's the right word, as it's our first backup server) is a Celeron 2.4 GHz, and all it has to do is catch all email and send it through to the correct user on the other server, I believe it should also be able to check some RBL servers and possibly do some virus scanning. This won't filter all 'bad' email, but it will filter about 75%. It will also help the primary server once it comes back up, as receiving a few thousand emails and processing them through a virus scanner and SpamAssassin can, well, push the server load a bit high...

All i'm looking for at the moment is a way to accomplish this with Exim...


I think what jlasman meant was that you're going to get shedloads of junk for invalid users. You'll basically be acting as a catchall, so when Spammy McSpamson sends to [email protected], you're going to have a lot of crap pointed at your primary once it's back up.
If you check the SPF record and do some filtering before delivering, I guess the load of crap would decrease a bit.
And you don't have to catch everything, just *@<domains listed in /var/named>, as that list would be up to date.
 
Once you've decided to do it, it's easily enough done, but it will require a lot of changes to exim.conf. See the extensive Exim documentation online, or get the Exim book published by UIT Cambridge; you can find it here.
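As a rough sketch of what those exim.conf changes might look like (the file path is DirectAdmin's domain list; the ACL name and the callout timeout are assumptions): accept mail only for the domains you back up, and use a recipient callout so invalid addresses are rejected at SMTP time rather than queued.

```
# exim.conf fragment (sketch, not a drop-in config)
domainlist relay_to_domains = lsearch;/etc/virtual/domains

# in the RCPT ACL:
acl_check_rcpt:
  accept  domains = +relay_to_domains
          endpass
          # ask the primary whether this recipient exists before
          # accepting, so we don't collect mail for invalid users
          verify  = recipient/callout=30s

  deny    message = relay not permitted
```

One caveat: while the primary is down the callout can't complete, so Exim will temporarily defer those recipients (legitimate senders retry); keeping a locally maintained list of valid addresses on the backup avoids that.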

Jeff
 
email problem

Hey everyone,
I have a big problem on my server.
How can I repair my email boxes?
How can I fix this error:

The message could not be sent because one of the recipients was rejected by the server. The rejected e-mail address was '[email protected]'. Subject 'hey Franco', Account: 'mail.tnm.com.sa', Server: 'mail.tnm.com.sa', Protocol: SMTP, Server Response: '550 authentication required', Port: 25, Secure(SSL): No, Server Error: 550, Error Number: 0x800CCC79

And suddenly I had a problem with some of the email boxes on my server; they can't send or receive.

Could you tell me how to fix it through DirectAdmin and SSH?


thanks
 
We don't have nearly enough information to help you.

I'm presuming you got this error when sending a message.

Do you host the yijia-heater.com domain? If not, then the error has nothing to do with mailboxes on your server.

The error indicates that you tried to send email through the server that gave you the error, to a domain NOT on that server, without first authorizing yourself.

Presuming the server that gave you the message hosts your accounts, the way to authenticate yourself is to try to receive email (Send and Receive if you're using Outlook or Outlook Express) before you try to send.

As far as problems with your mailboxes, we don't know what problem, or what the symptoms are, so we'd be guessing.

Jeff
 