Exim not retrying after T=remote_smtp defer (-44)

wattie · Feb 14, 2017

All of my e-mails sent to the crappy (but popular in Bulgaria) e-mail provider abv.bg are being held by a graylist filter:

2017-02-14 08:02:34 H=pmx.abv.bg [194.153.145.93] SMTP error from remote mail server after RCPT TO:<[email protected]>: 450 4.7.1 <[email protected]>: Recipient address rejected: Service is temporarily unavailable. Please try again later.
2017-02-14 08:02:34 H=pmx.abv.bg [194.153.145.27] SMTP error from remote mail server after RCPT TO:<[email protected]>: 450 4.7.1 <[email protected]>: Recipient address rejected: Service is temporarily unavailable. Please try again later.
2017-02-14 08:02:34 H=pmx.abv.bg [194.153.145.77] SMTP error from remote mail server after RCPT TO:<[email protected]>: 450 4.7.1 <[email protected]>: Recipient address rejected: Service is temporarily unavailable. Please try again later.
2017-02-14 08:02:34 H=pmx.abv.bg [194.153.145.18] SMTP error from remote mail server after RCPT TO:<[email protected]>: 450 4.7.1 <[email protected]>: Recipient address rejected: Service is temporarily unavailable. Please try again later.
2017-02-14 08:02:34 H=pmx.abv.bg [194.153.145.78] SMTP error from remote mail server after RCPT TO:<[email protected]>: 450 4.7.1 <[email protected]>: Recipient address rejected: Service is temporarily unavailable. Please try again later.
2017-02-14 08:02:34 [email protected] R=lookuphost T=remote_smtp defer (-44) H=pmx.abv.bg [194.153.145.78]: SMTP error from remote mail server after RCPT TO:<[email protected]>: 450 4.7.1 <[email protected]>: Recipient address rejected: Service is temporarily unavailable. Please try again later.

If I wait 4-5 minutes and hit the "retry" button in the mail queue, everything will be fine - the message will be sent successfully.

My problem is that exim is NOT retrying automatically on these messages. I can send and receive fine to/from everywhere, but retrying is not automatic.

I am using the following exim.conf with the latest version of Exim:

# SpamBlockerTechnology* powered exim.conf, Version 4.5.2
# Dec 13, 2016

Here is the retry part of the config (the default one - it should retry in 15 mins):

#EDIT#65:
# Domain Error Retries
# ------ ----- -------
begin retry
* quota
* * F,2h,15m; G,16h,1h,1.5; F,4d,8h
# End of Exim 4 configuration

Any help will be appreciated.

wattie · Feb 14, 2017

There is something strange which may be related.

There is an error on exim restart:

root@srv2:/var/log/exim # /usr/local/etc/rc.d/exim restart
Shutting down exim: [ OK ]
kill: 31011: No such process
Starting exim: [ OK ]
root@srv2:/var/log/exim #

It appears to be working after that restart:

root@srv2:/var/log/exim # /usr/local/etc/rc.d/exim status
exim (pid 57447 59867 59870 60050 ) is running...

Stop or restart gives an error with exactly the same PID (31011). E-mails are going in and out with no problems (except the retry issue described in the first post).

I did recompile (./build update, ./build exim) but it did not help.

wattie · Feb 14, 2017

I figured out what 31011 is: it is the PID stored in /var/run/spamd.pid

There is no such process:

# kill -9 31011
31011: No such process

The spamd appears to be up and running:

# ps -aux | grep spam
root 20165 0.0 0.3 223836 103304 - I 14:05 0:44.19 spamd child (perl)
root 60690 0.0 0.4 244316 124196 - I 09:01 1:56.06 spamd child (perl)
root 95015 0.0 0.2 182876 53560 - Ss Fri10 0:45.16 /usr/bin/spamd -d -c -m 15 (perl)
root 61998 0.0 0.0 14796 2592 0 S+ 15:30 0:00.00 grep spam

wattie · Feb 14, 2017

I found what the issue with the pid file is. When doing ./build spamassassin, it starts spamd like this:

/usr/bin/spamd -d -c -m 15

It should be:

/usr/bin/spamd -d -c -m 15 --pidfile=/var/run/spamd.pid

(this is what "/usr/local/etc/rc.d/exim restart" is doing).

I manually started spamd with the --pidfile option and now restarting exim gives no problem.
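
A quick way to tell a stale pid file from a live one is sketched below. It is only an illustration: "pidfile_status" is a made-up helper name, not something shipped with Exim, SpamAssassin or DirectAdmin, and it assumes it is run as root so that "kill -0" is allowed to probe the process.

```shell
#!/bin/sh
# Report whether a pid file points at a live process.
# pidfile_status is a hypothetical helper, not part of the exim rc script.

pidfile_status() {
    if [ ! -f "$1" ]; then
        echo "missing"
        return
    fi
    pid=$(cat "$1")
    # kill -0 sends no signal; it only checks that the process exists
    # and that the caller is permitted to signal it (hence: run as root).
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}

pidfile_status /var/run/spamd.pid
```

For the state described above (the pid file holding 31011 with no such process), this would print "stale" even though spamd itself was running under another PID.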

Still, on the main topic: Exim is NOT retrying my e-mails...

Richard G · Feb 14, 2017

wattie said:
(this is what "/usr/local/etc/rc.d/exim restart" is doing).

Is it? Where did you find this?
Because I've got spamassassin running but there is no spamd.pid file present. It's not called either.
I see this in my process list:

Code:

/usr/bin/spamd -d -c -4 -m 15

So I got the impression you created a solution which is not default.

As to your original issue.

wattie said:
My problem is that exim is NOT retrying automatically on these messages

How long did you wait? Exim does not send them again every few minutes. We're talking hours here before the first retry, depending on the retry time.

wattie · Feb 14, 2017

Richard G said:
Is it? Where did you find this?

I get it from here:

/usr/local/etc/rc.d/exim

if [ -e /usr/bin/spamd ]; then /usr/bin/spamd -d -c -m 15 --ipv4 --pidfile=$SPAM_PID 1>/dev/null 2>/dev/null; fi

I am using the default configuration on FreeBSD (I should have mentioned that, heh).

Richard G said:
How long did you wait? Exim does not send them again every few minutes. We're talking hours here before the first retry, depending on the retry time.

I waited long enough (hours). According to the configuration, it should retry every 15 minutes for the first 2 hours. But it isn't retrying at all. See this from the exim.conf:

* * F,2h,15m; G,16h,1h,1.5; F,4d,8h

F - fixed interval: for the first 2 hours, retry every 15 minutes; then G - geometric progression, ...
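
To make the rule quoted above concrete, here is a rough sketch that expands it into a timeline. It is an idealised picture, not Exim's actual code: it assumes a queue runner picks the message up at the exact moment each retry becomes due, it treats the cutoffs as simple "less than" boundaries, and "retry_schedule" is a hypothetical helper name. On a real server the retry rule only sets the earliest time for the next attempt; the attempt itself happens when a queue runner (or a forced run) processes the message.

```shell
#!/bin/sh
# Idealised expansion of: * * F,2h,15m; G,16h,1h,1.5; F,4d,8h
# Assumption: a queue runner fires exactly when each retry becomes due.
# Times are minutes since the first failure.

retry_schedule() {
    # $1 = number of retries to print (hypothetical helper, not part of Exim)
    awk -v n="$1" 'BEGIN {
        t = 0; gap = 0
        for (i = 1; i <= n; i++) {
            if (t < 120)       gap = 15                           # F,2h,15m: fixed 15 min
            else if (t < 960)  gap = (gap < 60) ? 60 : gap * 1.5  # G,16h,1h,1.5: 1h, then x1.5
            else if (t < 5760) gap = 480                          # F,4d,8h: fixed 8 h
            else break                                            # past the final 4d cutoff
            t += gap
            printf "retry %d at +%d min\n", i, t
        }
    }'
}

retry_schedule 12
```

Under these assumptions the first eight retries fall at 15-minute steps up to +120 min, and the ninth at +180 min. With a daemon started as -q1h, however, a deferred message is only looked at on the hourly queue runs, so the 15-minute steps would not be observed in practice.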

I did not write that - it's from the DirectAdmin exim.conf version 4.5.2

I read somewhere that it may be a load issue (the deliver_queue_load_max variable which defaults to 10). My load is far below 10 - it's "load averages: 0.42, 0.67, 0.74"... so it shouldn't be a load issue.

wattie · Feb 14, 2017

Updated exim.conf to 4.5.3 and exim.pl to 22 (from 21) - still not retrying...

Richard G · Feb 14, 2017

Ah, it might be different between CentOS and FreeBSD, then.
I've got:

Code:

if [ -e /usr/bin/spamd ]; then /usr/bin/spamd -d -c -4 -m 15 1>/dev/null 2>/dev/null; fi

The -4 is for IPv4, but there is no spamd pidfile, as you can see. That's why I wondered.

Since the new exim.conf and exim.pl are still not retrying, I don't know.
Are you on DA 1.50.0? This one has a log bug; maybe it's also causing this retry issue on FreeBSD? I'll keep following this topic (without commenting further), as I'm curious why the retry is not working.

wattie · Feb 15, 2017

I deleted the previous two messages.

This morning the mail queue held more than 400 e-mails. Retrying them manually worked fine. So it is clear that exim is not resending correctly. It's not only abv.bg, but also Yahoo and a few others.

I set up a custom (hopefully temporary) workaround: a cron job running exim -qf every 15 minutes.

P.S. I am on the latest everything, incl. DA.

wattie · Feb 17, 2017

This is still driving me nuts. The cron job workaround does its job, but it is far from how it should be.

Richard G · Feb 17, 2017

The cronjob is a workaround, not a solution. Have you already contacted DA support? You could mail them and point them to the topic to ask their opinion, they will respond here. Or maybe send in a ticket.

wattie · Feb 18, 2017

That will be the next move. Thanks.

DirectAdmin Support · Feb 18, 2017

Hi guys,

Reading over the exim docs, there should be a queue-runner process periodically triggered by the exim daemon.

Delivery is said to be deferred when the message remains on the queue for a subsequent delivery attempt after a temporary failure. Such messages get processed again by queue-runner processes that are periodically started, either by an Exim daemon or via cron or by hand.

and in the man pages

Usually the -bd option is combined with the -q<time> option, to specify that the daemon should also initiate periodic queue runs.

and

-q<qflags><time> When a time value is present, the -q option causes Exim to run as a daemon, starting a queue runner process at intervals specified by the given time value. This form of the -q option is commonly combined with the -bd option, in which case a single daemon process handles both functions. A common way of starting up a combined daemon at system boot time is to use a command such as

/usr/exim/bin/exim -bd -q30m

Such a daemon listens for incoming SMTP calls, and also starts a queue runner process every 30 minutes.

So, the first thing would be to check how exim was started, to ensure you see the "-bd -q1h" flags (or similar).

Our FreeBSD 11 build box looks like this:

Code:

root@freebsd11-64:~ # ps ax |grep exim
57200  -  Is         0:07.54 /usr/sbin/exim -bd -q1h -oP /var/run/exim.pid
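
The same check can be scripted. The sketch below pulls the queue-run interval out of a daemon command line like the one shown; "queue_interval" is a hypothetical helper name, and it only handles the plain -q<time> form, not variants with extra flags such as -qf or -qq.

```shell
#!/bin/sh
# Extract the queue-run interval from an Exim daemon command line.
# queue_interval is a hypothetical helper name, not an Exim tool.

queue_interval() {
    # Print the <time> part of the first -q<time> argument.
    # Returns non-zero if no such argument is present.
    for arg in $1; do
        case "$arg" in
            -q[0-9]*) echo "${arg#-q}"; return 0 ;;
        esac
    done
    return 1
}

# Command line as reported in this thread:
queue_interval "/usr/sbin/exim -bd -q1h -oP /var/run/exim.pid"   # prints 1h
```

If nothing is printed, the daemon was presumably started without periodic queue runs, and deferred mail would sit in the queue until something else (cron or a manual run) processes it.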

John

DirectAdmin Support · Feb 18, 2017

If it's not FreeBSD 11, let me know exactly which FreeBSD you've got, so we can check over the boot script's options.

wattie · Feb 19, 2017

It is FreeBSD 11. It is started like this:

Code:

root@srv2:~ # ps -aux | grep exim
mail       15909    0.0  0.0   31720    9172  -  Ss   04:02       0:00.91 /usr/sbin/exim -bd -q1h -oP /var/run/exim.pid

In the daemon code there was:

Code:

# Source exim configureation.
if [ -f /etc/sysconfig/exim ] ; then
        . /etc/sysconfig/exim
else
        DAEMON=yes
        QUEUE=1h
fi

Since there is no /etc/sysconfig/exim file, it's going to the else part.

I changed QUEUE to 15m and removed the cron job workaround. I'll report back soon on whether that fixes my issue (but I guess it counts as a workaround too, eh?).
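
An alternative to editing the rc script would be to supply the override file it looks for. The sketch below mirrors the quoted fragment with the path made a parameter, so the branch logic can be tried without touching /etc/sysconfig/exim. "load_exim_settings" is a hypothetical name, and the sketch assumes the rest of the rc script builds the -q flag from QUEUE, which the thread suggests but does not show.

```shell
#!/bin/sh
# Mirrors the rc script fragment quoted above, with the path as a parameter.
# load_exim_settings is a hypothetical name used only for this sketch.

load_exim_settings() {
    if [ -f "$1" ]; then
        . "$1"          # file exists: it is expected to set DAEMON and QUEUE
    else
        DAEMON=yes      # defaults from the quoted script
        QUEUE=1h
    fi
}

# No override file: falls through to the 1h default.
load_exim_settings /nonexistent/exim
echo "queue interval: $QUEUE"

# With an override file that sets a shorter interval.
conf=$(mktemp)
printf 'DAEMON=yes\nQUEUE=15m\n' > "$conf"
load_exim_settings "$conf"
echo "queue interval: $QUEUE"
rm -f "$conf"
```

Note that when the override file exists, the defaults are skipped entirely, so the file would need to set both DAEMON and QUEUE.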

On my older server (FreeBSD 9.1), it is running as "/usr/sbin/exim -bd -q1h -oP /var/run/exim.pid" and there were (and are) no issues... But it has an older exim.conf (4.2.2), and this could matter...

wattie · Feb 20, 2017

Maybe it was not the perfect solution, but it is OK now - it has been stable all of Monday.
