How to prevent server downtime?

giant

Verified User
Joined
Jun 15, 2004
Messages
13
Hi

I want to make sure that my servers do not go down.

For example, I have two servers. What can I do to ensure that if server 1 goes down, server 2 takes over?

Is this done by having ns1/ns2 on server 1 and ns3/ns4 on server 2?

I welcome any suggestions on how I can prevent downtime!

thanks a million!

giant
 
It's a simple question with perhaps dozens of complex answers, and lots of ideas.

Generally you can't solve the problem without throwing a lot of money at it, and even then you might not (even Google and Yahoo have had outages).

If all the sites on the server are static sites (no databases, etc.) then you could have two servers configured the same, with the websites on both servers.

Then have one server answer only as ns1 and the other answer only as ns2.

Then let the DNS on each server point to that same server's own IP#s.

And change your TTL for all your zones on both servers to 600 seconds.

That way as long as both servers are up they'll share the load but if one goes down the other will get all the load.
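As a sketch, the zone on server 1 might look like this (example.com and the IP#s are placeholders, not from this thread); server 2's copy would be identical except that the A records would carry server 2's own IP#:

```
$TTL 600        ; short TTL so stale answers age out quickly
@    IN  SOA  ns1.example.com. hostmaster.example.com. (
              2004061501     ; serial
              3600 900 604800 600 )
@    IN  NS   ns1.example.com.
@    IN  NS   ns2.example.com.
@    IN  A    192.0.2.10     ; this server's own IP#
www  IN  A    192.0.2.10
```

Because each server hands out only its own address, a resolver that can't reach the dead server's DNS will fall through to the surviving nameserver and get the surviving web server's IP#.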

However, if someone is looking at your site when the server they're on goes down, they probably won't be able to see the site until that server comes back up, or until they or their ISPs reset their system(s).

Jeff
 
Hi Jeff

Q1. Assuming we adopt the above ns1 and ns2 on different servers, will it work for the email services?

Q2. Is RAID required to mirror servers 1 and 2?

Q3. If databases are involved, any suggestions on how the web databases should be treated?

thanks

giant
 
giant said:
Q1. Assuming we adopt the above ns1 and ns2 on different servers, will it work for the email services?
Yes, but probably not the way you expect it to work.

What will happen is that there will be two separate email accounts, and some mail will hit one and some the other. Your users will have to set up two accounts for receiving email, with the same information except for the "POP" server, where each user will have to use the IP#s rather than a server name, so s/he downloads email from both servers.

You could set up backup MX instead, but that's quite a bit more complex, and would result in the users not being able to get email if/when the main server is down.
Q2. Is RAID required to mirror servers 1 and 2?
Not required, and not doable, since RAID works only within a single server. Of course you could make each of your servers a lot more reliable by using RAID in it; our hosting servers use RAID. If both your servers are in the same datacenter, perhaps you could use a single NFS file store. But of course then your NFS server would be a single point of failure.

Additionally, we don't recommend that both servers be in the same data center, because if they are, then you're protecting against server outages but not against network outages.
Q3. If databases are involved, any suggestions on how the web databases should be treated?
It's possible to replicate MySQL across multiple servers, but it can be complex. Google for both commercial and open source solutions.
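As a rough sketch of the usual MySQL master/slave approach (the server names, replication user, and password below are placeholders, not anything from this thread):

```ini
# my.cnf on the master: enable the binary log, pick a unique server-id
[mysqld]
log-bin   = mysql-bin
server-id = 1

# my.cnf on the slave: just a different server-id
[mysqld]
server-id = 2
```

Then, in the mysql client on the slave, point it at the master and start replication: CHANGE MASTER TO MASTER_HOST='master.example.com', MASTER_USER='repl', MASTER_PASSWORD='secret'; then START SLAVE;. Note this is one-way: writes must still go to the master, so it helps with failover reads but doesn't by itself give you a full two-server write setup.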

Jeff
 
I'm not sure if any of this is even possible, just a wild idea.

Set up the second server to precisely mirror the first one, offline (not normally accessible), updating every 5 minutes, or however often you can do it without taxing system resources and assuming your script is capable of making a live copy.

Then possibly a script that has the backup server monitor the primary server; the moment the primary goes down for some duration of time (say 30 seconds or so, in case it was just a restart), have the backup server activate itself and forcefully bind and lock itself to the IP addresses so that all traffic now goes to the backup server.
Along with maybe a fancy notification system to warn you about it via email.

Then you can come in, check things out, repair the primary server, and set it up as the backup, changing the previous backup server into the primary. Something like that, to minimize downtime in case of a system failure.
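The monitoring idea can be sketched as a small shell watchdog run repeatedly (e.g. from cron) on the backup server. Everything here is an assumption for illustration: the primary's address, the three-strike threshold, and the takeover step, which on a real system would bind the IP (root required) and send the warning email.

```shell
#!/bin/sh
# Hypothetical watchdog for the backup server: count consecutive
# failed health checks, and take over after three in a row.
PRIMARY=${PRIMARY:-192.168.0.10}        # assumed primary address
STATE=${STATE:-/var/run/watchdog.fails} # consecutive-failure counter
CHECK=${CHECK:-ping -c1 -W2}            # health check; override to test

check_once() {
    fails=$(cat "$STATE" 2>/dev/null || echo 0)
    if $CHECK "$PRIMARY" >/dev/null 2>&1; then
        echo 0 > "$STATE"               # primary is up; reset counter
        return 0
    fi
    fails=$((fails + 1))
    echo "$fails" > "$STATE"
    if [ "$fails" -ge 3 ]; then         # ~30s down if run every 10s
        takeover
        return 1
    fi
    return 0
}

takeover() {
    # Placeholder: a real version would bind the primary's IP, e.g.
    # "ip addr add $PRIMARY/24 dev eth0" on Linux, and mail a warning.
    echo "primary down, taking over $PRIMARY"
}
```

The counter lives in a file rather than a variable so each cron invocation sees the history of the previous ones.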
 
A couple of Ideas..

Load Balancer: $$$$, *very* resilient, does everything
Basically the load balancer has your public IP address with multiple servers sitting 'behind' it with private IPs. When a request comes in to your public address it puts it through to the server with the least traffic.

Metaconfig might look like this:
"forward all 169.x.x.10:80 requests to $servergroup1"
"$servergroup1 = 192.168.0.10, 192.168.0.11, 192.168.0.12"

With advanced setups you can have load balancer/servers in multiple physical locations all answering the same IPs. This way if you lose FarmA then FarmB takes all the traffic. With an extensive setup your worst case is that some SSL sessions have to restart.

DNS round robin + NAS: $$, simple, not as flexible
A cheap/simple way to do it is set your 2 servers to share their filesystems off of an NFS server. Set your DNS record to have the IPs of both servers: www.domain.com = 169.x.x.10, 169.x.x.11
Then change the TTLs on your DNS records to something like 5 minutes. When a server goes down just pull its A record and traffic will start going to the remaining server.
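The shared-filesystem piece might look like this, with placeholder IPs (192.168.0.5 as the NFS server, .10/.11 as the two web servers):

```
# /etc/exports on the NFS server: export the web root to both servers
/var/www    192.168.0.10(rw,sync) 192.168.0.11(rw,sync)

# /etc/fstab entry on each web server: mount it at the same path
192.168.0.5:/var/www    /var/www    nfs    rw,hard,intr    0 0
```

Keep in mind the NFS server itself then becomes a single point of failure, and file locking over NFS needs testing with whatever the sites actually run.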

Clustering: $$$, high performance, limited support
Some services, such as MySQL, also have built-in 'cluster' services. In this case you might have one 'master' server and two 'front' servers. All traffic would go to a 'front' server: if it's a data read then the 'front' responds; if it's a data write then the 'front' passes the write request back to the 'master' server.
 
jmstacey said:
Set up the second server to precisely mirror the first one offline (not normally accessable), updating every 5 minutes or however often you can do it without taxing system resources and if your script was capable of making a live copy
<SNIP>

Totally possible, but tricky. Rsync would be used for copying your data over. Your main problems would be file-locking issues. If you had a managed switch you could even have your watchdog turn off the dead server's port and enable the live server's port. All the pieces to do this are available; you just need some sort of watchdog daemon to tie it all together.
 
jmstacey said:
I'm not sure if any of this is even possible, just a wild idea.
It's possible. It requires Portable IP#s and BGP routing (which can be an expensive solution, and is not ordinarily available to a colocated server) at each location, to change routing information for IP#s in real time.

Or...

Both servers must be in the same data center, which gives you no redundancy if the connection or the entire datacenter goes down.

(A few months ago a major Northern California data center had an electrical problem; their UPS system created surges which threw most of their hosted servers offline and damaged some beyond repair.)

We offer similar service to the one I explained, on a custom basis, in multiple data centers, or in one data center with redundant connectivity. For most people it's not cost effective.

Jeff
 
Re: A couple of Ideas..

donavan said:
Metaconfig might look like this:
"forward all 169.x.x.10:80 requests to $servergroup1"
"$servergroup1 = 192.168.0.10, 192.168.0.11, 192.168.0.12"
Won't work with DA because DA requires public IP#s.

And it has other problems as well, such as everything being in the same datacenter... and you've just moved the single point of failure from the server to that expensive load-balancing hardware.
With advanced setups you can have load balancer/servers in multiple physical locations all answering the same IPs. This way if you lose FarmA then FarmB takes all the traffic. With an extensive setup your worst case is that some SSL sessions have to restart.
Better, but you still have to have your DA servers on routable IP#s.
DNS round robin + NAS: $$, simple, not as flexible
A cheap/simple way to do it is set your 2 servers to share their filesystems off of an NFS server. Set your DNS record to have the IPs of both servers: www.domain.com = 169.x.x.10, 169.x.x.11
Then change the TTLs on your DNS records to something like 5 minutes. When a server goes down just pull its A record and traffic will start going to the remaining server.
Then you're in the same data center, with the same reliability issues I've mentioned previously. And presuming all your services will work in an NFS environment, your NFS server is now your single point of failure.

Jeff
 
I agree...

Uptime costs money. There's no way around it. If someone *really* wants that mythical five-nines uptime then they should be prepared to drop LARGE coin on it. There's a whole industry (Alteon, F5, Foundry, etc.) built around this type of service.

I think for the original poster's budget the multi-site NS approach would work fine, but any dynamic content could put a crimp in it. I wonder how well a master/slave MySQL setup would work across remote locations?

Diverse NS/MX servers are simple, easy, and a smart plan. Big, physically diverse farms are a great idea, but at that point you're looking at six-digit costs just for hardware.
 
Multiple diverse NS servers yes.

MX servers? I'm not sure anymore, since you'd need a method for accepting email only for actual users while the main MX is down.

Most email experts are no longer recommending multiple diverse MX servers for just this reason. If you don't have an up-to-date list of actual users available to the secondary MX servers, you'll be collecting a lot of undeliverable and unreturnable spam, viruses, etc.

Jeff
 
jlasman said:
Multiple diverse NS servers yes.

MX servers? I'm not sure anymore, since you'd need a method for accepting email only for actual users while the main MX is down.

Most email experts are no longer recommending multiple diverse MX servers for just this reason. If you don't have an up-to-date list of actual users available to the secondary MX servers, you'll be collecting a lot of undeliverable and unreturnable spam, viruses, etc.

Jeff

Isn't something like this possible ?

Backup server
It is linked to the primary server with BIND zone transfers, and there are MX records for this server.
If the primary server goes down, the backup server should take over all email functions for the domains listed in the /var/named dir.
So it needs to accept all mail to those domains and forward it to the primary server when it comes back online.

Furthermore, it would be a nice touch if the server changed all DNS records during HTTP downtime to point to a page with a maintenance message, so visitors wouldn't get a browser error about the server being down.

I believe the first part would be realizable, although I don't know how to do something like that in Exim (if someone can tell me how to do it, I would like that very much). For the latter part, I'm not so sure, as the DNS changes would need to be almost dynamic; I don't know if BIND or Apache could do that...
 
Backup MX

Well, that's the standard backup MX. It accepts *all* mail addressed to your domains and then forwards it on to your primary site once it's back up. I've only done it in Qmail, but it's very simple.
I think what jlasman meant was that you're going to get shedloads of junk for invalid users. You'll basically be acting as a catchall, so when Spammy McSpamson sends to [email protected], you're going to have a lot of crap pointed at your primary once it's back up.
 
Yep.

And this creates so many problems in real life that most experienced professional admins are abandoning backup MX servers in favor of building servers less likely to fail and putting them on networks less likely to fail.

Jeff
 
jlasman said:
Yep.

And this creates so many problems in real life that most experienced professional admins are abandoning backup MX servers in favor of building servers less likely to fail and putting them on networks less likely to fail.

Jeff

Possibly true, but the main problem is that everything can go wrong (and it will one day...).
I am looking for a way to let a server do what I described before, with Exim. Since our new backup server (I don't know if that's the right word, as it's our first backup server) is a Celeron 2.4 GHz, and all it has to do is catch all email and send it through to the correct user on the other server, I believe it should also be able to check some RBL servers and possibly do some virus scanning. This won't filter all 'bad' email, but it will filter about 75%. It will also help the primary server once it comes back up, as receiving a few thousand emails and processing them through a virus scanner and SpamAssassin can, well, push the server load a bit high...

All i'm looking for at the moment is a way to accomplish this with Exim...


I think what jlasman meant was that you're going to get shedloads of junk for invalid users. You'll basically be acting as a catchall, so when Spammy McSpamson sends to [email protected], you're going to have a lot of crap pointed at your primary once it's back up.
If you check the SPF record and do some filtering before delivering, I guess the load of crap would decrease a bit.
And you don't have to catch everything, just *@<domains listed in /var/named>, as that list would be up to date.
 
Once you've decided to do it, it's easily enough done, but it will require a lot of changes to exim.conf. See the extensive Exim documentation online, or get the Exim book published by UIT Cambridge; you can find it here.
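As a rough sketch of what those exim.conf changes might look like (the file path is DirectAdmin's domain list; the ACL name and the callout timeout are assumptions): accept mail only for the domains you back up, and use a recipient callout so invalid addresses are rejected at SMTP time rather than queued.

```
# exim.conf fragment (sketch, not a drop-in config)
domainlist relay_to_domains = lsearch;/etc/virtual/domains

# in the RCPT ACL:
acl_check_rcpt:
  accept  domains = +relay_to_domains
          endpass
          # ask the primary whether this recipient exists before
          # accepting, so we don't collect mail for invalid users
          verify  = recipient/callout=30s

  deny    message = relay not permitted
```

One caveat: while the primary is down the callout can't complete, so Exim will temporarily defer those recipients (legitimate senders retry); keeping a locally maintained list of valid addresses on the backup avoids that.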

Jeff
 
email problem

Hey everyone,
I have a big problem on my server.
How can I repair my email boxes?
How can I fix this error:

The message could not be sent because one of the recipients was rejected by the server. The rejected e-mail address was '[email protected]'. Subject 'hey Franco', Account: 'mail.tnm.com.sa', Server: 'mail.tnm.com.sa', Protocol: SMTP, Server Response: '550 authentication required', Port: 25, Secure(SSL): No, Server Error: 550, Error Number: 0x800CCC79

And suddenly I had a problem with some of the email boxes on my server; they can't send or receive.

Could you tell me how to fix it through DirectAdmin and SSH?


thanks
 
We don't have nearly enough information to help you.

I'm presuming you got this error when sending a message.

Do you host the yijia-heater.com domain? If not, then the error has nothing to do with mailboxes on your server.

The error indicates that you tried to send email through the server that gave you the error, to a domain NOT on that server, without first authorizing yourself.

Presuming the server that gave you the message hosts your accounts, the way to authenticate yourself is to try to receive email (Send and Receive if you're using Outlook or Outlook Express) before you try to send.

As far as problems with your mailboxes, we don't know what problem, or what the symptoms are, so we'd be guessing.

Jeff
 