Solved API (CMD_API_DOMAIN) is super slow with 1.62

kam · Jun 11, 2021

============================
Updated on 2021-06-15: Problem solved

I fixed it by changing the "SSL Certificate" setting from "Use best match certificate" to anything else.
With the old version, I did not encounter this problem when I chose "Use best match certificate" with no certificate.
Maybe version 1.62 changed something in the script:
it keeps looping to execute openssl for domains that have the "Use best match certificate" SSL option selected but have no certificates.

Most users may not encounter this issue.
But it affects those who already have many domains in DA and use "Use best match certificate" with no certificates as their default option.
After upgrading to 1.62, they will find that adding / modifying / deleting domains takes much longer than usual, while it worked well in the old version with the same settings.

==============================

Hello,

I have around 1200 domains.
80 of them are added as top-level domains, and the rest are added as aliases (DomainPointer).

After upgrading to version 1.62, the API interface took 56 seconds to handle a modify request.
I remember that it only took around 1~2 seconds before the upgrade. Can anyone please look into it?

Thanks

Query

CMD_API_DOMAIN?action=modify&domain=anywhere.in&bandwidth=3333&uquota=shared&ssl=&cgi=&php=

Response

factor · Jun 11, 2021

There has been a noted change to the API; did you test out the updates? I don't see your exact call listed, but maybe it will lead you to an answer.

Version 1.62.0 | Directadmin Docs

DirectAdmin Knowledge Base

directadmin.com

DirectAdmin 1.62.0 has been released

Hello, We're excited to announce the full release of DirectAdmin 1.62.0. The full list of changes are listed here: https://directadmin.com/versions.php?version=1.620000 Some of the new features: HTTP/2 support on port 2222 via the new Go wrapper Automatic SSL Certificate management CGroup...

forum.directadmin.com

Behavior Changes
json=yes calls to the user=fred GET version of these calls have output changes

CMD_API_SHOW_USER_DOMAINS
CMD_SHOW_USER
CMD_API_SHOW_USER_USAGE
CMD_API_SHOW_USER_CONFIG

Please test your API calls if you're using these, as the "quota" value was not in the correct format before. It's now an array.

Behavior change: CMD_API_SHOW_USER_DOMAINS, CMD_SHOW_USER, CMD_API_SHOW_USER_USAGE, CMD_API_SHOW_USER_CONFIG: quota

www.directadmin.com
There is a GET option which lets you revert to the old value, as a workaround (but please update your scripts to use the corrected array method)

kam · Jun 11, 2021

I checked it manually with Postman. It's a straightforward update request via the API, and I can confirm that it's very slow.

Also, while waiting for the API to respond, the server CPU usage goes up to 100%. I think this behavior is not normal.
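
For anyone who wants to reproduce the timing outside Postman, the request can be timed from a shell. This is only a minimal sketch, assuming curl is installed, the API is reachable over HTTPS on port 2222, and HTTP basic auth is accepted; `DA_HOST`, `DA_USER` and `DA_PASS` are placeholder variables, not values from this thread.

```shell
#!/bin/sh
# Placeholders (assumptions): set these for your own server before use.
DA_HOST="${DA_HOST:-server.example.com}"
DA_USER="${DA_USER:-admin}"
DA_PASS="${DA_PASS:-changeme}"

# Build the request URL for a CMD_API_DOMAIN modify call.
# $1 = domain, $2 = bandwidth value; the other fields mirror the query in the first post.
build_modify_url() {
    printf 'https://%s:2222/CMD_API_DOMAIN?action=modify&domain=%s&bandwidth=%s&uquota=shared&ssl=&cgi=&php=\n' \
        "$DA_HOST" "$1" "$2"
}

# Print the total request time in seconds, discarding the response body.
time_modify_call() {
    curl -s -o /dev/null -w '%{time_total}\n' \
        -u "$DA_USER:$DA_PASS" "$(build_modify_url "$1" "$2")"
}

# Usage (not run automatically): time_modify_call anywhere.in 3333
```

Running the call a few times before and after a settings change gives a rough before/after comparison.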

factor · Jun 11, 2021

kam said:
I think this behavior is not normal.

for sure not.

DirectAdmin Support · Jun 11, 2021

kam said:
After upgrading to version 1.62, the API interface took 56 seconds to handle a modify request.

Hi @kam,

My first guess would be a custom hook.. and the fact it's hovering around 1 minute seems like some standard timeout.
With that, it still needs to be debugged, so I'd recommend running DA in debug mode 3000, triggering the modification, and seeing what it does.

https://help.directadmin.com/item.php?id=293&in1=3000

It will likely stop showing output at some point while it's "doing something", and that, plus the few pages+ before would be relevant for debugging what it's actually doing. Level 3000 is going to spit out a lot, but if it's a hook, we'd be looking for output starting with:

Code:

System::executeIfExists: running:

to spot which script it is.

You can also check for hooks in:
/usr/local/directadmin/scripts/custom/*
/usr/local/directadmin/plugins/*/hooks/*

(sh files, or hook-named folders, with any .sh files inside)

Either, all_pre.sh/all_post.sh, or something related to modifying the domain.

Again, that's only if it's a hook causing it. The debug might show endless output (stuck in a loop perhaps) in which case, let us know if you spot repeated info, or just nothing.. in which case let us know the info above it to try and sort out what it's doing.
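
Spotting repeated info in thousands of debug lines is easier with a frequency count. The following is a small sketch, assuming the debug output was first saved to a file; `top_repeats` and the log path are hypothetical names used only for illustration.

```shell
#!/bin/sh
# top_repeats LOGFILE [N]: print the N most frequent lines (default 10),
# each prefixed with its occurrence count, most frequent first.
top_repeats() {
    sort "$1" | uniq -c | sort -rn | head -n "${2:-10}"
}

# Usage (assumes the debug output was saved to a file first), e.g.:
#   ./directadmin b3000 2>&1 | tee /tmp/da-debug.log
#   top_repeats /tmp/da-debug.log 5
```

If the debug lines carry a changing prefix such as a timestamp, identical calls will not group together, and the prefix would need to be stripped before counting.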

One other trick (might be simpler than the debug output) is to throw a USR1 signal at the process.
This will tell that process to dump its last logged process location to the /var/log/directadmin/error.log.
This might be useful, or it might be too specific to know what's going on, but worth a shot, eg:

Code:

killall -USR1 directadmin; tail -f /var/log/directadmin/error.log

Also check if it does the same thing from the GUI (Evolution skin), as it's the same code.. so should do the same thing.
If not, then that might point to something with the login keys, if you're using one.

If you paste anything here, be sure not to include any sensitive information.

John

kam · Jun 12, 2021

Hello,

I received the errors below while waiting for the API to respond. [Debug mode enabled]

[root@server]# ./directadmin b3000 | grep string

Nearly 15K lines of errors were generated.
I can't find any file named "httpd_tokens" on my server.

I have not set anything for custom httpd.
But somehow, after the upgrade to 1.62, the script tries to read all the custom httpd settings, which do not exist, and generates errors.

DirectAdmin Support · Jun 12, 2021

At the higher debug level, that output might be normal. It's just saying they don't exist, and would likely move on.
What we're interested in would be "non stop" output that's taking too long.. or something in a loop.
Does the above output go for 10+ seconds or is the output changing to something else?
Or.. does the output just stop and you're waiting for the browser response, but nothing is being output in the console?
This might require a ticket to debug; we may need to log in to see what's going on (I could probably track it down fairly quickly if logged in).

kam · Jun 12, 2021

Thanks for your help. I think I made a mistake by running it with `| grep string`, which is why I was unable to get the big picture.

This time, I ran it again with

./directadmin b3000

Now I have found which process gets stuck and consumes most of the time while looping.

execute('/usr/bin/openssl x509 -text -certopt no_header,no_version,no_serial,no_signame,no_pubkey,no_sigdump,no_aux -in /etc/httpd/conf/ssl.crt/server.crt', maxsize=145, fd=1, env=0)

When CMD_API_DOMAIN is called, whether for add / modify / delete, the script loops over all domains and executes the openssl process for each one. Even for an alias (domain pointer), it still loops over the same parent domain again and again for the `openssl` execution.

Most of the time is spent waiting for that openssl process to complete.

I decided to try deleting a domain through the DirectAdmin web interface, and I can confirm that it has the same issue.
It loops for the `openssl` execution. I have to wait more than 55 seconds to delete a domain.

kam · Jun 15, 2021

Problem solved

I fixed it by changing the "SSL Certificate" setting from "Use best match certificate" to anything else.
With the old version, I did not encounter this problem when I chose "Use best match certificate" with no certificates.
Maybe version 1.62 changed something in the script:
it keeps looping to execute openssl for domains that have the "Use best match certificate" SSL option selected but have no certificates.

Most users may not encounter this issue. But it affects those who already have many domains in DA and use "Use best match certificate" with no certificates as their default option.
After upgrading to 1.62, they will find that adding / modifying / deleting domains takes much longer than usual, while it worked well in the old version with the same settings.

DirectAdmin Support · Jun 15, 2021

Hi @kam,

Thanks for the feedback. Can you confirm which files are "big"? As that would be my best guess as to why it's slow.
The files that could be in play are (with the commands to generate a line count):

Code:

cat /etc/virtual/domains | wc -l
cat /etc/virtual/domainowners | wc -l
cat /etc/virtual/snidomains | wc -l

so we can narrow down what might be slow and why.
I might be able to hunt for optimizations for the given files (I'll check now anyway)
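
The three counts can also be collected in one pass. This is a minimal sketch; `count_virtual_lines` is a hypothetical helper name, and the optional directory argument exists only so it can be tried against a copy of the files.

```shell
#!/bin/sh
# count_virtual_lines [DIR]: print "<file> <line count>" for the three files
# named above; DIR defaults to /etc/virtual. Missing files are reported, not fatal.
count_virtual_lines() {
    dir="${1:-/etc/virtual}"
    for name in domains domainowners snidomains; do
        if [ -f "$dir/$name" ]; then
            printf '%s %s\n' "$name" "$(wc -l < "$dir/$name" | tr -d ' ')"
        else
            printf '%s missing\n' "$name"
        fi
    done
}

# Usage: count_virtual_lines
```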

I'll also look into caching the server.crt info, so it's not called over and over

John

kam · Jun 15, 2021

DirectAdmin Support said:
Thanks for the feedback. Can you confirm which files are "big"? As that would be my best guess as to why it's slow.

root@kam:/usr/local/directadmin/custombuild# cat /etc/virtual/domains | wc -l
955
root@kam:/usr/local/directadmin/custombuild# cat /etc/virtual/domainowners | wc -l
954
root@kam:/usr/local/directadmin/custombuild# cat /etc/virtual/snidomains | wc -l
3

------------------------
I fixed the slowness above by choosing "self signed certificate" instead of "Use best match certificate".

But then I encountered a 100% CPU problem.

I found that dataskq keeps looping over all the alias domains and attempts to obtain a Let's Encrypt cert for each.
But I had already chosen to use the "self signed certificate" for the parent domain that all the alias domains point to. In general this should not happen. However, it somehow fails to detect the self-signed certificate setting and keeps looping over all alias domains for a Let's Encrypt cert.

In this case, I had no choice but to disable the Automatic SSL Certificate management to get rid of this 100% load problem.

/usr/local/directadmin/directadmin set admin_ssl_check_retries 0
service directadmin restart

DirectAdmin Support · Jun 15, 2021

Thanks, I've added extra caching (plus stat() check to cache the previous cases that want to ensure they're not using stale info):

Version 1.62.1 | Directadmin Docs

DirectAdmin Knowledge Base

directadmin.com

but I've not yet pushed it.. until the above is resolved.

For those multiple dataskq calls, can you confirm if the domain aliases already have a request created?
I assume that when moving away from the auto ssl mode, the requests for the pointers (likely subdomains too) are still there?
For the domain that had the retry disabled (changed away from "use best match" to anything else), check its pointers/subdomains to see if there are retry files still present here:

Code:

/usr/local/directadmin/data/users/USER/domains/*.ssl.next_retry

where that file being present implies the retries will continue on (slowing over time).
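
To check every user at once, the retry files can be counted per user. This is a sketch under the assumption that each user directory follows the path layout shown above; `count_retry_files` is a hypothetical helper name.

```shell
#!/bin/sh
# count_retry_files [USERS_DIR]: for each user, print "<user> <count>" where
# count is the number of *.ssl.next_retry files in that user's domains directory.
# USERS_DIR defaults to /usr/local/directadmin/data/users.
count_retry_files() {
    users_dir="${1:-/usr/local/directadmin/data/users}"
    for dir in "$users_dir"/*/domains; do
        [ -d "$dir" ] || continue
        user=$(basename "$(dirname "$dir")")
        n=0
        for f in "$dir"/*.ssl.next_retry; do
            # An unmatched glob stays literal, so confirm the file exists.
            if [ -e "$f" ]; then
                n=$((n + 1))
            fi
        done
        printf '%s %s\n' "$user" "$n"
    done
}

# Usage: count_retry_files
```

A user with a count of zero has no pending retries under this assumption.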

For now, I'll fix the code on that assumption, but please confirm anyway to ensure I'm fixing the correct thing

John

DirectAdmin Support · Jun 15, 2021

Update: The pre-release binaries are now available with the above change, but I've also added a change such that, if you pick anything other than "Best Match" (aka: old "Shared Server Cert"), it will now clear the .ssl and .ssl.next_retry files for this domain, plus all subdomains/pointers under this domain, so they don't retry.

If you'd like the binaries now, use the pre-release guide:

https://help.directadmin.com/item.php?id=408

If that doesn't resolve the issue, let me know and we can dig further (I might need more info though. If you can create a ticket, that would speed up the process).

John

kam · Jun 15, 2021

DirectAdmin Support said:
For those multiple dataskq calls, can you confirm if the domain aliases already have a request created?

I have two parent domains for the user account named "kam".
The parent domain (ending with .cc) has 950 alias domains pointing to it, and it's using a self-signed certificate.
The other parent domain (ending with .tv) is using a Let's Encrypt certificate.

I can confirm that .ssl.next_retry files exist for many alias domains.

root@kam:/usr/local/directadmin/data/users/kam/domains# ls *.ssl.next_retry | wc -l
580

==================================================

If you want to dig further, you can replicate the problem by creating two parent domains, say domain1.com and domain2.com.
Set domain1.com to use a self-signed certificate.
Set domain2.com to use a Let's Encrypt certificate.
Then create domain3.com through domain999.com as alias domains and point them to domain1.com.
I think that will be the best way to investigate this problem.
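
The list of alias names for such a test setup can be generated with a short loop. This sketch only prints the names; `alias_names` is a hypothetical helper, and how the aliases are then created (GUI or API) is left open.

```shell
#!/bin/sh
# alias_names FIRST LAST: print domainFIRST.com ... domainLAST.com, one per line.
alias_names() {
    i="$1"
    while [ "$i" -le "$2" ]; do
        printf 'domain%d.com\n' "$i"
        i=$((i + 1))
    done
}

# Usage: alias_names 3 999 > aliases.txt
```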

Kam

DirectAdmin Support · Jun 16, 2021

Thanks for the info. I've added a few areas for improvement:

Version 1.62.1 | Directadmin Docs

DirectAdmin Knowledge Base

www.directadmin.com

Pre-release binaries should be done uploading in about 2 minutes.

I'll be pushing 1.62.1 today with this and other fixes, so let me know ASAP if the issue has not been resolved.
If not, please clarify the exact command being used (I now know the "state" to duplicate, thank you)

John
