Enable systemd hardening options for named

Ludovic Gasc gmludo at gmail.com
Tue Jan 16 13:11:07 UTC 2018


2018-01-16 13:52 GMT+01:00 Daniel Stirnimann <daniel.stirnimann at switch.ch>:

> Hello all,
>
> Just wondering, if one is already using selinux in enforcing mode, does
> systemd hardening provide any additional benefit?
>

Very good question; I'm not sure at all.
To my understanding, the two might be complementary. At the very least, a
systemd unit file can apply an SELinux context to a service:
https://www.freedesktop.org/software/systemd/man/systemd.exec.html#Mandatory%20Access%20Control

It's quite possible that every option we have put in the systemd config
file has an equivalent in SELinux/AppArmor/SMACK, and that those frameworks
offer more security options than systemd does. But I have only basic
knowledge of these technologies, so I'd be interested in feedback from
someone who has actually written rules for them.

The main advantage I see in working on this systemd config file is that it
raises the default security configuration in major distributions without
extra dependencies: 99% of people don't customize the default daemon
configuration, and SELinux/AppArmor/SMACK aren't always used.
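
To make the "default configuration" point a bit more concrete, here is a
rough sketch (my own illustration, not an existing tool) that reports which
of the hardening directives from the merged unit file quoted below are
absent from a given unit. The function name missing_hardening and the
sample text are hypothetical, and the parsing is deliberately naive: it
ignores sections, drop-ins, and backslash line continuations.

```python
# Directives taken from the [Service] section of the merged unit file
# quoted later in this message.
HARDENING_DIRECTIVES = (
    "NoNewPrivileges",
    "PrivateDevices",
    "PrivateTmp",
    "ProtectHome",
    "ProtectSystem",
    "ProtectKernelModules",
    "ProtectKernelTunables",
    "ProtectControlGroups",
    "CapabilityBoundingSet",
    "SystemCallFilter",
)


def missing_hardening(unit_text):
    """Return the directives from HARDENING_DIRECTIVES never set in unit_text."""
    present = set()
    for line in unit_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and section headers such as [Service].
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        present.add(line.split("=", 1)[0].strip())
    return [name for name in HARDENING_DIRECTIVES if name not in present]


if __name__ == "__main__":
    # Hypothetical, minimal unit text used only for demonstration.
    sample = "[Service]\nUser=bind\nPrivateTmp=true\nProtectSystem=strict\n"
    for name in missing_hardening(sample):
        print("not set:", name)
```

A directive being present says nothing about whether its value is strict
enough, so this is only a first-pass check.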

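As a side note on the CapEff/CapPrm masks in Robert's message quoted
below, the decoding he describes can be reproduced with a few lines of
Python. This is a sketch under one assumption: the bit numbers are the ones
I know from the kernel's linux/capability.h, and I list only a subset, so
any other bit is printed by number. The name decode_caps is mine.

```python
# Capability bit numbers as defined in linux/capability.h (subset only).
CAP_NAMES = {
    0: "CAP_CHOWN",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    8: "CAP_SETPCAP",
    10: "CAP_NET_BIND_SERVICE",
    18: "CAP_SYS_CHROOT",
    24: "CAP_SYS_RESOURCE",
}


def decode_caps(hex_mask):
    """Turn a hex mask from /proc/<pid>/status into a list of capability names."""
    mask = int(hex_mask, 16)
    names = []
    bit = 0
    # Walk the mask from the lowest bit upwards until no set bits remain.
    while mask >> bit:
        if mask & (1 << bit):
            names.append(CAP_NAMES.get(bit, "bit %d" % bit))
        bit += 1
    return names


if __name__ == "__main__":
    # CapEff/CapPrm value from the quoted output.
    print(decode_caps("0000000001000400"))
    # CapBnd value from the quoted output: count how many bits are set.
    print(len(decode_caps("0000003fffffffff")))
```

Where libcap's capsh is installed, "capsh --decode=<mask>" should give the
same answer without any code.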

>
> Daniel
>
> On 16.01.18 12:21, Ludovic Gasc wrote:
> > Hi,
> >
> > I have merged the config files from Tony, Robert, and me.
> > I have tried to keep it as generic as possible; the result is below.
> >
> > It seems to work here without regressions, except for one warning:
> > managed-keys-zone: Unable to fetch DNSKEY set '.': operation canceled
> >
> > It appears only at first boot, though; I no longer see the message when
> > I restart the daemon.
> > Any clue?
> >
> > Thanks for your feedback.
> >
> > [Unit]
> > After=network-online.target
> >
> > [Service]
> > Type=simple
> > TimeoutSec=25
> > Restart=always
> > RestartSec=1
> > User=bind
> > Group=bind
> > CapabilityBoundingSet=CAP_NET_BIND_SERVICE
> > AmbientCapabilities=CAP_NET_BIND_SERVICE
> > SystemCallFilter=~@mount @debug acct modify_ldt add_key adjtimex \
> >     clock_adjtime delete_module fanotify_init finit_module get_mempolicy \
> >     init_module io_destroy io_getevents iopl ioperm io_setup io_submit \
> >     io_cancel kcmp kexec_load keyctl lookup_dcookie migrate_pages move_pages \
> >     open_by_handle_at perf_event_open process_vm_readv process_vm_writev \
> >     ptrace remap_file_pages request_key set_mempolicy swapoff swapon uselib \
> >     vmsplice
> >
> > NoNewPrivileges=true
> > PrivateDevices=true
> > PrivateTmp=true
> > ProtectHome=true
> > ProtectSystem=strict
> > ProtectKernelModules=true
> > ProtectKernelTunables=true
> > ProtectControlGroups=true
> > InaccessiblePaths=/home
> > InaccessiblePaths=/opt
> > InaccessiblePaths=/root
> > ReadWritePaths=/run/named
> > ReadWritePaths=/var/cache/bind
> > ReadWritePaths=/var/lib/bind
> >
> >
> > --
> > Ludovic Gasc (GMLudo)
> >
> > 2018-01-15 21:14 GMT+01:00 Robert Edmonds <edmonds at mycre.ws>:
> >
> >     Tony Finch wrote:
> >     > Ludovic Gasc <gmludo at gmail.com> wrote:
> >     > >
> >     > > 1. The list of minimal capabilities needed for bind to run correctly:
> >     > > http://man7.org/linux/man-pages/man7/capabilities.7.html
> >     >
> >     > named already drops capabilities - have a look at the code around here:
> >     > https://source.isc.org/cgi-bin/gitweb.cgi?p=bind9.git;a=blob;f=bin/named/unix/os.c;hb=v9_11_2#l234
> >     >
> >     > Note that it's a bit clever - the privileges are dropped in two stages,
> >     > right at the start, and after the server has been configured.
> >
> >     I checked just now to see what that code actually ends up doing, and on
> >     my system I ended up with:
> >
> >         $ grep -h ^Cap /proc/$(pidof named)/**/status | sort | uniq -c
> >               6 CapAmb:     0000000000000000
> >               6 CapBnd:     0000003fffffffff
> >               6 CapEff:     0000000001000400
> >               6 CapInh:     0000000000000000
> >               6 CapPrm:     0000000001000400
> >         $
> >
> >     That decodes to:
> >
> >      - The effective and permitted capabilities sets were reduced to
> >        CAP_NET_BIND_SERVICE and CAP_SYS_RESOURCE.
> >
> >      - The ambient and inheritable capabilities sets were cleared.
> >
> >      - The capability bounding set was left completely open-ended.
> >
> >     It's not clear why CAP_SYS_RESOURCE needs to be retained past startup:
> >
> >             /*
> >              * XXX  We might want to add CAP_SYS_RESOURCE, though it's not
> >              *      clear it would work right given the way linuxthreads work.
> >              * XXXDCL But since we need to be able to set the maximum number
> >              * of files, the stack size, data size, and core dump size to
> >              * support named.conf options, this is now being added to test.
> >              */
> >             SET_CAP(CAP_SYS_RESOURCE);
> >
> >     See commits 5e4b7294d88ab58371d8c98e05ea80086dcb67cd,
> >     108490a7f8529aff50a0ac7897580b59a73d9845. "[T]o test"?
> >
> >     CAP_SYS_RESOURCE is documented as permitting:
> >
> >        CAP_SYS_RESOURCE
> >               * Use reserved space on ext2 filesystems;
> >               * make ioctl(2) calls controlling ext3 journaling;
> >               * override disk quota limits;
> >               * increase resource limits (see setrlimit(2));
> >               * override RLIMIT_NPROC resource limit;
> >               * override maximum number of consoles on console allocation;
> >               * override maximum number of keymaps;
> >               * allow more than 64hz interrupts from the real-time clock;
> >               * raise msg_qbytes limit for a System V message queue above the
> >                 limit in /proc/sys/kernel/msgmnb (see msgop(2) and msgctl(2));
> >               * allow the RLIMIT_NOFILE resource limit on the number of
> >                 "in-flight" file descriptors to be bypassed when passing file
> >                 descriptors to another process via a UNIX domain socket (see
> >                 unix(7));
> >               * override the /proc/sys/fs/pipe-size-max limit when setting the
> >                 capacity of a pipe using the F_SETPIPE_SZ fcntl(2) command.
> >               * use F_SETPIPE_SZ to increase the capacity of a pipe above the
> >                 limit specified by /proc/sys/fs/pipe-max-size;
> >               * override /proc/sys/fs/mqueue/queues_max limit when creating
> >                 POSIX message queues (see mq_overview(7));
> >               * employ the prctl(2) PR_SET_MM operation;
> >               * set /proc/[pid]/oom_score_adj to a value lower than the value
> >                 last set by a process with CAP_SYS_RESOURCE.
> >
> >     I would guess that retaining CAP_NET_BIND_SERVICE and CAP_SYS_RESOURCE
> >     during the process runtime permits open-ended reloading of the config
> >     at runtime (e.g., binding to a new IP address on port 53 without
> >     needing to restart the daemon). So even though BIND drops some
> >     capabilities, it's still running with elevated privileges compared to
> >     a traditional non-root user.
> >
> >     systemd permits a nice pattern for network daemons that want to run as
> >     an unprivileged user, but bind to a privileged port (and without using
> >     socket activation), without starting the process as root. Basically,
> >     you put something like this in the unit file:
> >
> >         [Service]
> >         User=…
> >         Group=…
> >         CapabilityBoundingSet=CAP_NET_BIND_SERVICE CAP_SYS_CHROOT CAP_SETPCAP
> >         AmbientCapabilities=CAP_NET_BIND_SERVICE CAP_SYS_CHROOT CAP_SETPCAP
> >         …
> >
> >     Any needed filesystem directories and permissions need to be set up
> >     correctly beforehand. The service is started by the init system as the
> >     unprivileged User/Group specified in the unit file, so there's no need
> >     to change UID/GID. CAP_NET_BIND_SERVICE is then used to bind to a
> >     privileged port, CAP_SYS_CHROOT is used to perform the chroot, and
> >     CAP_SETPCAP is used to drop all remaining capabilities from the
> >     capability sets and the capability bounding set, so you end up with a
> >     completely unprivileged process at runtime. (Alternatively you could
> >     keep CAP_NET_BIND_SERVICE and drop CAP_SYS_CHROOT and CAP_SETPCAP, if
> >     you wanted to retain the capability to perform privileged binds at
> >     runtime. Or you could eliminate CAP_SYS_CHROOT and use other systemd
> >     functionality to make parts of the filesystem inaccessible, etc.) This
> >     pattern might be a bit hard to retrofit into BIND at this point,
> >     though, other than by adding more knobs.
> >
> >     --
> >     Robert Edmonds
> >     _______________________________________________
> >     Please visit https://lists.isc.org/mailman/listinfo/bind-users
> >     <https://lists.isc.org/mailman/listinfo/bind-users> to unsubscribe
> >     from this list
> >
> >     bind-users mailing list
> >     bind-users at lists.isc.org <mailto:bind-users at lists.isc.org>
> >     https://lists.isc.org/mailman/listinfo/bind-users
> >     <https://lists.isc.org/mailman/listinfo/bind-users>
> >