
systemd issue scheduling many jobs #529

@MarcWort

Hi, first of all I want to thank you for this awesome project :).

I run a probably unusual setup: resticprofile and rest-server on a central backup server, with ~600 repos at the moment. Each repo has an automatically generated profile with a forget schedule:

  forget:
    schedule:
      at: '10:00'
      systemd-drop-in-files:
      - "/etc/restic-server/random-delay-12h.dropin.conf"
      - "/etc/restic-server/mount.dropin.conf"
    retry-lock: 5m
    group-by: tags,host
    path: false
    repack-small: true
    prune: true
    keep-last: 5
    keep-within-daily: 14d
    keep-daily: '30'
    keep-weekly: 12
    keep-monthly: 12

Running resticprofile unschedule --all && resticprofile schedule --all, to unschedule removed repos and schedule newly added ones, takes 5 minutes, which is not great but acceptable.

The real problem is that systemd can't keep up with all these actions. It seems to run out of buffer space (or something similar) and can't even create new SSH user sessions during that period.

Jul 23 09:44:53 backupserver systemd[1]: Started resticprofile-forget@profile-server1.timer - forget timer for profile server1 in /etc/restic/profiles.yaml
Jul 23 09:44:53 backupserver systemd[1]: Reloading.
Jul 23 09:44:54 backupserver systemd[1]: Reloading.
Jul 23 09:44:54 backupserver systemd[1]: Started resticprofile-forget@profile-server2.timer - forget timer for profile server2 in /etc/restic/profiles.yaml.
Jul 23 09:44:54 backupserver systemd[1]: Reloading.
Jul 23 09:44:54 backupserver systemd[1]: Reloading.
Jul 23 09:44:55 backupserver systemd[1]: Started resticprofile-forget@profile-server3.timer - forget timer for profile server3 in /etc/restic/profiles.yaml.
Jul 23 09:44:55 backupserver systemd[1]: Reloading.
Jul 23 09:44:55 backupserver systemd[1]: Reloading.
Jul 23 09:44:56 backupserver systemd[1]: Started resticprofile-forget@profile-server4.timer - forget timer for profile server4 in /etc/restic/profiles.yaml.
Jul 23 09:44:56 backupserver systemd[1]: Reloading.
Jul 23 09:44:56 backupserver systemd[1]: Reloading.
Jul 23 09:44:56 backupserver sshd[3320588]: Accepted publickey for monitoring from [...]
Jul 23 09:44:56 backupserver sshd[3320588]: pam_unix(sshd:session): session opened for user monitoring(uid=1026) by (uid=0)
Jul 23 09:44:56 backupserver sshd[3312180]: pam_systemd(sshd:session): Failed to create session: Connection timed out
Jul 23 09:44:56 backupserver systemd[1]: Started resticprofile-forget@profile-server5.timer - forget timer for profile server5 in /etc/restic/profiles.yaml.
Jul 23 09:44:56 backupserver sshd[3312180]: pam_unix(sshd:session): session closed for user monitoring
Jul 23 09:44:56 backupserver systemd[1]: session-51438.scope: Deactivated successfully.
Jul 23 09:44:56 backupserver systemd[1]: Reloading.
Jul 23 09:44:57 backupserver systemd[1]: Reloading.
Jul 23 09:44:57 backupserver sshd[3314665]: pam_systemd(sshd:session): Failed to create session: No buffer space available
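
In the excerpt above, each started timer is accompanied by two "Reloading." lines. To check whether that ratio holds for the whole run, a small filter like the following could be used. This is only a sketch: the function name summarize_reloads is made up for illustration, and the patterns assume the journal format shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads journal lines on stdin and counts how often
# systemd (PID 1) reloaded and how many timer units it reported as started.
summarize_reloads() {
    awk '
        /systemd\[1\]: Reloading\./        { reloads++ }
        /systemd\[1\]: Started .*\.timer/  { timers++ }
        END { printf "reloads=%d timers_started=%d\n", reloads, timers }
    '
}
```

It could be fed with something like journalctl _PID=1 --since today | summarize_reloads, adjusting the time window to the scheduling run.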

Maybe it is related to systemd/systemd#13674. Quoting from that issue:

"these aren't limits you are supposed to ever hit, and if you do anyway, then other bad things have happened much earlier. They are safety nets, nothing more."

So my questions:

Would it be possible to reduce the number of systemd reloads?

https://github.com/creativeprojects/resticprofile/blame/63dee21a1beaa3d72df3666365363b43b40c26a4/schedule/handler_systemd.go#L176

Here, for example, it should be possible to run systemctl enable --now $timer, right?

When running schedule with --all, would it be possible to do a single reload after placing all the unit files, and then enable them?

Or maybe even enable them by creating the symlinks directly and reload once after that?
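
To make the idea concrete, here is a rough sketch of the batched flow as a manual shell workaround. It is not how resticprofile works today, and I have not tested it against 600 units. It assumes the unit files are already in place, and it relies on systemctl's --no-reload option, which suppresses the implicit reload that enable otherwise performs. The function names and the DRY_RUN variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: reload systemd once, then enable and start many timers
# without triggering another reload per unit.
# DRY_RUN=1 (the default here) only prints the commands.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

enable_timers_batched() {
    # Nothing to do without at least one timer name.
    [ "$#" -gt 0 ] || return 0
    # One reload so systemd picks up all freshly written unit files.
    run systemctl daemon-reload
    # Enable and start all timers in a single call; --no-reload skips
    # the implicit reload that "enable" would otherwise trigger.
    run systemctl enable --now --no-reload "$@"
}
```

Called as enable_timers_batched resticprofile-forget@profile-server1.timer resticprofile-forget@profile-server2.timer, this prints one daemon-reload and one enable command; with DRY_RUN=0 it would actually execute them.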

I am running resticprofile version 0.31.0 commit 81a6e45f50e3c2196ff4e62e85bfb61e0e5816a0 with

systemd 252 (252.38-1~deb12u1)
+PAM +AUDIT +SELINUX +APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT +QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified

on Debian 12.
