nginx_apache SSL problem: peer closed connection in SSL handshake (104: Connection reset by peer) while SSL handshaking to upstream

activate

Verified User
Joined
May 30, 2017
Messages
38
Location
Terneuzen, Netherlands
On the 15th of this month I finally upgraded our 3 servers to the latest OpenSSL. I recompiled everything that needed it and things seemed OK.

Last week I got a message from a user on one of the servers. The site was intermittently failing to load styling, images and JavaScript.
It turned out that more than one client was affected.

Cross checked the other servers and did some rebuilds and rewrite_confs... only to end up with 2 out of 3 servers generating the same error, also intermittently.

When checking a site on one of the 2 affected servers the response varies between '502 Bad Gateway' and loading the site as it should. However, for all of the affected users the https://webmail.domain.tld is still working. (Although initially it might give a 502 as well)

So now I'm trying to figure out why and hoping someone here can help me in my thinking process :)
 

activate

Thanks for your response Brent.
Just gave it a try, but unfortunately it made no difference. It was worth a shot indeed, but that would not be consistent with the webmail not giving the errors.

It's just that in the last month I finally implemented IPv6 and upgraded OpenSSL, so loads of changes happened and now I'm just trying to see if I can come up with all the things that need to be checked.

So far I'm still stumped by the webmail still working correctly, since it uses the same setup as the website for the most part.
I'll see if I can figure out something that way.

Edit: The things that I built were nginx_apache, exim, dovecot, clamav, curl, php, pureftpd and I bet I'm forgetting one now but I can't think of anything critical I left out

Edit2: These are the messages showing up in the nginx log for one domain

2020/06/26 15:43:30 [error] 21244#0: *159044 peer closed connection in SSL handshake (104: Connection reset by peer) while SSL handshaking to upstream, client: x.x.x.x, server: domain.tld, request: "GET /wp-content/uploads/cache/fvm/1592881641/out/footer-dcc5723453ddef17f2e069161defd9e4dd650251.min.js HTTP/2.0", upstream: "https://x.x.x.x:8081/wp-content/upl...c5723453ddef17f2e069161defd9e4dd650251.min.js", host: "domain.tld", referrer: "https://domain.tld/"

It seems to be just CSS, image and JS files, so perhaps it's something in how static files are served. But then again, the webmail is not affected.
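To check whether the failures really are limited to static files, the nginx error log can be tallied by the extension of the requested file. The following is only a minimal sketch: the function name `count_handshake_failures` is made up for illustration, and it assumes every relevant log line has the same shape as the one quoted above (the "while SSL handshaking to upstream" text plus a `request: "..."` field).

```python
import re
from collections import Counter
from os.path import splitext
from urllib.parse import urlsplit

# Text that marks the upstream handshake failures quoted in this thread.
MARKER = "while SSL handshaking to upstream"

# Pulls the request target out of: request: "GET /some/path HTTP/2.0"
REQUEST_RE = re.compile(r'request: "[A-Z]+ (\S+) HTTP/[\d.]+"')


def count_handshake_failures(lines):
    """Count upstream SSL handshake errors per requested file extension.

    `lines` is any iterable of nginx error log lines (hypothetical helper,
    not part of nginx or DirectAdmin).
    """
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue  # some other kind of error, ignore it
        match = REQUEST_RE.search(line)
        if not match:
            continue  # no request field on this line
        path = urlsplit(match.group(1)).path  # drop any query string
        extension = splitext(path)[1].lower() or "(none)"
        counts[extension] += 1
    return counts
```

Passing it an open file object for the domain's nginx error log and printing `counts.most_common()` should show at a glance whether anything other than `.css`, `.js` and image extensions turns up.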

It's gonna be a long weekend...
 

activate

Ok,

So I found this in /var/log/httpd/error_log

Code:
[Fri Jun 26 15:54:04.894311 2020] [core:notice] [pid 2371:tid 140203966118016] AH00052: child pid 12549 exit signal Aborted (6)
*** Error in `/usr/sbin/httpd': double free or corruption (!prev): 0x00000000029467d0 ***

This shows up on both servers experiencing the problem.
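To get a feel for how often the Apache children are dying, the AH00052 notices in the error log can be counted per signal. This is a rough sketch only: `count_child_exits` is a hypothetical helper name, and the pattern is based solely on the single AH00052 line shown above.

```python
import re
from collections import Counter

# Matches e.g.: AH00052: child pid 12549 exit signal Aborted (6)
CHILD_EXIT_RE = re.compile(r"AH00052: child pid (\d+) exit signal (.+?) \((\d+)\)")


def count_child_exits(lines):
    """Count Apache child exits per (signal name, signal number).

    `lines` is any iterable of lines from the httpd error_log.
    """
    counts = Counter()
    for line in lines:
        match = CHILD_EXIT_RE.search(line)
        if match:
            signal_name = match.group(2)
            signal_number = int(match.group(3))
            counts[(signal_name, signal_number)] += 1
    return counts
```

Running this over /var/log/httpd/error_log on each server would make it easy to compare the affected machines with the unaffected one.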

There are no updates when doing yum update and AFAIK the rest is done by custombuild and up to date.

One of them is still CentOS 6 and the other is CentOS 7. Our other CentOS 7 server is not having these issues.

Would it help to build nginx_apache again? I believe I have done so more than once over the last few days, seemingly to no avail.
 

bdacus01

Verified User
Joined
Jul 22, 2017
Messages
1,304
Location
Murfreesboro

activate

I managed to rectify my OpenSSL related mess.

When checking the httpd binaries with ldd on each server, I noticed a difference on the one server that was still operating correctly.
Apparently I had mixed up some locations: I installed OpenSSL in /usr on the 2 affected servers and in /usr/local/ssl on the one unaffected server.
I have since installed OpenSSL in /usr/local/ssl on the other two machines as well, and now all is well again.
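For anyone wanting to run the same comparison, here is a minimal sketch that lists which libssl and libcrypto files a binary resolves to according to ldd. The helper names `run_ldd` and `openssl_libs` are made up for illustration, and the parsing assumes the usual `name => path (address)` layout of ldd output on Linux.

```python
import re
import subprocess

# Matches ldd lines such as: libssl.so.X => /some/dir/libssl.so.X (0x00007f...)
# and also the "libssl.so.X => not found" form.
LDD_LINE_RE = re.compile(
    r"^\s*(lib(?:ssl|crypto)\S*)\s+=>\s+(.+?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$"
)


def run_ldd(binary):
    """Return the raw text output of `ldd <binary>`."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    )
    return result.stdout


def openssl_libs(ldd_output):
    """Map each libssl/libcrypto name in ldd output to its resolved path."""
    libs = {}
    for line in ldd_output.splitlines():
        match = LDD_LINE_RE.match(line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs
```

Calling `openssl_libs(run_ldd("/usr/sbin/httpd"))` on each server and comparing the results should show whether httpd picks up OpenSSL from /usr or from /usr/local/ssl.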

On all machines I did have to recompile pycurl to keep yum working.
 