systemd cgroups CPU, RAM, IO, (TASKs?) made easy

Yeah, my DA installation has been borked all week. Support are looking at it; I paid for LAN support and troubleshooting, so it should get somewhere. It just means the backup plug-in is on for next week, which is when I planned to do this. No bother, it just might be a bit delayed.

Keep the useful info coming 👍
 
I use a very simple version where I use custom package options to set cpu, mem, tasks and bandwidth limits.

In user_create_post.sh I write the slice info to /etc/systemd/system/user-NNNN.slice.
In user_destroy_pre.sh I obviously remove the file again.
In user_modify_post.sh I update the user-NNNN.slice file and run a few "systemctl set-property user-${userID}.slice CPUQuota=${resCPU}%" commands to activate the changes directly.
After each script I check if the daemon needs to be reloaded and user.slice needs to restart.
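
For readers who want to try the same approach, here is a minimal sketch of what the slice-writing part of user_create_post.sh could look like. The function names, the SLICE_DIR variable and the argument order are assumptions for illustration, not the actual script described above. CPUQuota=, MemoryMax= and TasksMax= are standard systemd resource-control settings (MemoryMax= applies on the unified cgroup hierarchy).

```shell
#!/bin/sh
# Sketch of a slice-file writer for a user_create_post.sh hook.
# SLICE_DIR and the argument order are assumptions for illustration.
SLICE_DIR="${SLICE_DIR:-/etc/systemd/system}"

render_slice() {
    # $1 = CPU quota in percent, $2 = memory limit (e.g. 1G), $3 = max tasks
    printf '[Slice]\nCPUQuota=%s%%\nMemoryMax=%s\nTasksMax=%s\n' "$1" "$2" "$3"
}

write_slice() {
    # $1 = numeric uid, remaining arguments are passed on to render_slice
    uid="$1"; shift
    render_slice "$@" > "$SLICE_DIR/user-$uid.slice"
}

# Example call from the hook, run as root (the "username" variable is assumed
# to be provided by the control panel's hook environment):
#   write_slice "$(id -u "$username")" 100 1G 200 && systemctl daemon-reload
```

Keeping the rendering separate from the writing makes it easy to inspect the generated unit before touching /etc/systemd/system.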

Pretty straightforward stuff.

What would be nice is some way to present usage graphs so it's easier to see if a user needs more resources.
 
Actually, I use MRTG now to show each user's load against their CPU limit.

One thing that kinda bothers me is the reseller. If, for example, a reseller has limits like 100% cpu ( = one core ) and 1G ram, then it would be silly if he could assign 2G ram to one of his users. The same goes for assigning 512M to maybe ten of his users, totalling 5G.
Scaling each user down to a max of 1G / number of users would give each user about 100M... not good resource management either.

Any ideas on how to handle this would be welcome ;)
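
To put numbers on the problem, the arithmetic can be sketched as an "allocated percentage" of the reseller's limit. This is only an illustration of the calculation, not a proposed policy from the thread; the function name and the use of megabytes are assumptions.

```shell
#!/bin/sh
# Print how much of a reseller's memory limit has been handed out, in percent.
# $1 = reseller limit in MB, remaining arguments = per-user limits in MB
allocated_pct() {
    limit="$1"; shift
    total=0
    for mb in "$@"; do
        total=$(( total + mb ))
    done
    echo $(( total * 100 / limit ))
}

# Ten users at 512 MB each under a 1 GB reseller:
#   allocated_pct 1024 512 512 512 512 512 512 512 512 512 512
```

For the example in the post (ten users at 512M under a 1G reseller) this prints 500, i.e. the reseller's memory is handed out five times over. A hook could compare that figure against whatever overcommit ceiling is considered acceptable.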
 
Would you like to share how you did this?
 
I got a chance to look into this a little bit more.

Unfortunately, this doesn't work with php-fpm. The user-slice information isn't read into the php-fpm pool fork per user. The user-slice information is only applied to processes that originally start out from that user. Since the majority of process usage in the webhosting arena is via web and via PHP, this would appear to be a non-starter for me.

Unless I'm missing something from the systemd aspect that will allow the user-slice information to be read when the php-fpm process forks out to individual user pools.

Ideally the user-slices would be fully templateable. That's what cgroup templates used to do. Any process run by a user fell under the jurisdiction of that user's cgroup template. This does not appear to be the case with user-slices.
 
Well, not with the default user slice, but that one only applies to users logging in via ssh, for example.
For php-fpm you'll need to do something else, but that's not that hard either. Systemd uses templates like '[email protected]'. You'll need to use this to start the 'root' php-fpm process for each user. This service can run a post script like:

[Service]
ExecStartPost=/blabla/cg_user.sh %i 73

This cg_user.sh script gets the username and php version as parameters. You can use those to look up the cgroup settings from the user's package and set them in /sys/fs/cgroup/user.slice/<userid>/cpu.max etc.
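
As a rough sketch of the cpu.max part of such a script: on cgroup v2, cpu.max takes a quota and a period in microseconds, so 50% of one core is "50000 100000". The function names and the CG_ROOT variable below are hypothetical, and the package lookup is left out entirely.

```shell
#!/bin/sh
# Hypothetical helper functions for a cg_user.sh style script.
# CG_ROOT is an assumption; it is overridable so the logic can be tried
# against a scratch directory instead of the real cgroup tree.
CG_ROOT="${CG_ROOT:-/sys/fs/cgroup/user.slice}"

cpu_max_value() {
    # $1 = CPU percentage (100 = one full core)
    period=100000
    echo "$(( $1 * period / 100 )) $period"
}

apply_cpu_limit() {
    # $1 = cgroup directory name under CG_ROOT, $2 = CPU percentage
    cg_dir="$CG_ROOT/$1"
    [ -d "$cg_dir" ] || { echo "no such cgroup: $cg_dir" >&2; return 1; }
    cpu_max_value "$2" > "$cg_dir/cpu.max"
}

# Example: apply_cpu_limit 1005 50   (writes "50000 100000" to cpu.max)
```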
 
It's been discussed in https://github.com/php/php-src/pull/2440 and https://bugs.php.net/bug.php?id=70605. There is little activity there, so.. maybe these need to be 'revived' ? :)
 
Well, the way I've done it works well enough for me because I don't have to change the php code. (Once in a while I still have nightmares about the apache itk mod.) Moving the root php-fpm process (a single pid, only used when the fpm service is started) into the cgroup means all subsequent user fpm processes automatically run in that cgroup as well. The only 'issue' is that I have a php-fpm root process per user.
If cgroup support for pools became mainstream I'd adopt it asap. But for now I'm happy with having it as part of a DA package like:

[Attached screenshot: Schermafbeelding 2020-12-19 om 20.32.37.png]
 
I just took a quick look at the php source code and I'm wondering whether a simple change to fpm_children_make() in fpm_children.c would work: where a child is forked, after a successful fork, simply write the child pid to the required cgroup.procs. That would remove the need for a master process per user. Something like:

child->pid = pid;
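/* Added: Move pid to cgroup. Needs "char str[PATH_MAX]; FILE *destFile;"
   declared at the top of the function (PATH_MAX comes from <limits.h>). */
snprintf(str, sizeof(str), "/sys/fs/cgroup/user.slice/%s/cgroup.procs", wp->config->name);
destFile = fopen(str, "w");
if (destFile) {
	/* cgroup.procs expects the pid as decimal text */
	fprintf(destFile, "%d\n", (int) pid);
	fclose(destFile);
}
/* EOA: Move pid to cgroup */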
fpm_clock_get(&child->started);
fpm_parent_resources_use(child);

zlog(is_debug ? ZLOG_DEBUG : ZLOG_NOTICE, "[pool %s] child %d started", wp->config->name, (int) pid);
 
@DanielP

Hello. It has taken me until now to come back to this, mainly because CentOS 8 died, so needed to know what system to aim for. Hurrah, AlmaLinux is here!

So I am running AlmaLinux 8.3 (CentOS 8.3 variant) and, unfortunately, I cannot reproduce your results :(

I added the systemd.unified_cgroup_hierarchy=1 config option to /etc/default/grub and did grub2-mkconfig -o /boot/grub2/grub.cfg - rebooted and so on - all to no avail. I tried the dd command with top in another shell and it reported 99% - unlucky for me!

Please can you elaborate on what exactly it is you did and how you replicated this behaviour? I had to install grub2 for instance, modify for grub2 commands etc etc. Just a bit more background is needed I think.

Here is my use case:
Code:
[root@staging1 ~]# dnf install grub2*
...
[root@staging1 ~]# vi /etc/default/grub # ..
[root@staging1 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
done
[root@staging1 ~]# shutdown -r now
[root@staging1 ~]# adduser test1
[root@staging1 ~]# id -u test1
1005
[root@staging1 ~]# systemctl status user-1005.slice
● user-1005.slice - User Slice of UID 1005
   Loaded: loaded (/etc/systemd/system/user-1005.slice; static; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/user-.slice.d
           └─10-defaults.conf
   Active: inactive (dead)

Thanks in advance! :wow:
 
First, log in or su to your test1 user and start a dd if=/dev/random of=/dev/null.
This will use between 1 and 2 CPU cores as much as possible (dd and rngd).

In another root shell type:

systemctl set-property user-1005.slice CPUQuota=5%

You should see the CPU usage drop to 5% for both processes combined. Tested on CentOS Linux release 8.3.2011.
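
If the quota seems to have no effect, one thing worth checking is whether the dd process actually sits inside user-1005.slice. A process started via su from a root shell may stay in the root session's cgroup instead, depending on the PAM setup. A small sketch for reading the slice out of /proc/<pid>/cgroup follows; the function names are made up for illustration.

```shell
#!/bin/sh
# Extract the user-NNNN.slice component from a cgroup path, if there is one.
slice_from_cgroup_path() {
    printf '%s\n' "$1" | tr '/' '\n' | grep -E '^user-[0-9]+\.slice$' | head -n 1
}

# Report the user slice a pid runs in (prints nothing if it is not in one).
slice_of_pid() {
    slice_from_cgroup_path "$(cat "/proc/$1/cgroup")"
}

# Example: slice_of_pid "$(pgrep -n -u test1 -x dd)"
```

If this prints a slice other than user-1005.slice (or nothing at all), the CPUQuota set on user-1005.slice does not apply to that process.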
 

I don't see the results. What steps did you take to get cgroups functional?
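
A quick way to see which cgroup layout is actually active after the reboot is to look for cgroup.controllers at the root of the cgroup mount, which only exists on the unified (v2) hierarchy. The sketch below assumes the usual /sys/fs/cgroup mount point; CGROUP_MOUNT and the function names are illustrative.

```shell
#!/bin/sh
# CGROUP_MOUNT is overridable so the checks can be tried on a scratch directory.
CGROUP_MOUNT="${CGROUP_MOUNT:-/sys/fs/cgroup}"

cgroup_mode() {
    # The unified (v2) hierarchy exposes cgroup.controllers at its root.
    if [ -f "$CGROUP_MOUNT/cgroup.controllers" ]; then
        echo unified
    else
        echo legacy-or-hybrid
    fi
}

cpu_controller_available() {
    # True if "cpu" is listed among the controllers available at the root.
    grep -qw cpu "$CGROUP_MOUNT/cgroup.controllers" 2>/dev/null
}

# Example: cgroup_mode; cpu_controller_available && echo "cpu controller present"
```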
 
Excellent, thank you for the useful contribution.
There is one thing I want to ask because I haven't tried it yet.
This can certainly be done in KVM, but can this cgroups feature also be used in OpenVZ?
 
I don't think so, though in my opinion you really should not be using OpenVZ in the first place.
 
OK, I must need to try again then. Will let you know how it goes
So I've opted for a clean install. Installing now, then I will try to follow the instructions more closely. I have to admit I did take a second look at this yesterday and it still wasn't working (I could see changes on the hard disk, so maybe it's not grub, ugh), but in light of the fact that it could be something else, I am doing a full reset of my server. Thanks @sysdev, I'll post more later on.
 
So, AlmaLinux release 8.3 (Purple Manul) appears not to work with this. Downloading CentOS 8 now, will try with that too.
 
Hmpff. Am stuck! It must be something I'm doing wrong. I've done it 4/5 times now on fresh installs to no avail.

Here is what I'm doing

Code:
vi /etc/default/grub
grub2-mkconfig -o /boot/grub2/grub.cfg
shutdown -r now

And then...
Code:
useradd test1
id -u test1
systemctl set-property user-1000.slice CPUQuota=5%

Only for top to show out of control CPU stats. Any advice appreciated at this stage. Thank you
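
Two things may be worth ruling out here, offered as guesses rather than a diagnosis: whether the kernel argument really made it onto the running kernel's command line, and whether test1 really has uid 1000 (the slice name has to match the output of id -u). A small sketch for the first check; the function name is made up.

```shell
#!/bin/sh
# True if the given kernel command line enables the unified cgroup hierarchy.
has_unified_flag() {
    # $1 = path to a kernel command line file, defaults to /proc/cmdline
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -qxF 'systemd.unified_cgroup_hierarchy=1'
}

# Example:
#   has_unified_flag && echo "flag is active" || echo "flag is missing"
```

If the flag is missing after the reboot, the grub.cfg that was regenerated may not be the one the machine boots from (UEFI installs keep it elsewhere), and on EL8-family systems grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1" is another way to add the argument.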
 