Enable systemd hardening options for named

Wed Jan 31 14:18:17 UTC 2018

Hi,

As a Fedora maintainer of the BIND package, I can only say that SELinux
in enforcing mode provides better hardening than most of the suggested
changes. That does not mean they are not useful, but most of them are
redundant when SELinux is enforcing. We want all Fedora users to run in
enforcing mode, especially on servers.

Restricting path access in particular does not make much sense alongside
SELinux, which is much more powerful and is already in use.
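
Whether a given host really runs in enforcing mode can be checked by
reading the SELinux status file. The following is only a minimal Python
sketch: it assumes selinuxfs is mounted at /sys/fs/selinux, and the
function name selinux_mode is made up for illustration.

```python
from pathlib import Path


def selinux_mode(enforce_file="/sys/fs/selinux/enforce"):
    """Return 'enforcing', 'permissive', or 'disabled'.

    Assumption: the file holds '1' when enforcing and '0' when
    permissive; a missing or unreadable file is reported as 'disabled'.
    """
    try:
        value = Path(enforce_file).read_text().strip()
    except OSError:
        return "disabled"
    return "enforcing" if value == "1" else "permissive"


if __name__ == "__main__":
    print(selinux_mode())
```

On a host in permissive or disabled mode, the systemd options discussed
below are not covered by SELinux policy.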

On 16.1.2018 at 13:52, Daniel Stirnimann wrote:
> Hello all,
> 
> Just wondering: if one is already using SELinux in enforcing mode, does
> systemd hardening provide any additional benefit?
> 
> Daniel
> 
> On 16.01.18 12:21, Ludovic Gasc wrote:
>> Hi,
>>
>> I have merged the config files from Tony, Robert, and me.
>> I have tried to keep it as generic as possible; the result is below.
>>
>> It seems to work here without regression, except a warning:
>> managed-keys-zone: Unable to fetch DNSKEY set '.': operation canceled
>>
>> It only appears on the first boot; I no longer see the message when I
>> restart the daemon.
>> Any clue?
>>
>> Thanks for your feedback.
>>
>> [Unit]
>> After=network-online.target
>>
>> [Service]
>> Type=simple
>> TimeoutSec=25
>> Restart=always
>> RestartSec=1
>> User=bind
>> Group=bind
>> CapabilityBoundingSet=CAP_NET_BIND_SERVICE
>> AmbientCapabilities=CAP_NET_BIND_SERVICE
>> SystemCallFilter=~@mount @debug acct modify_ldt add_key adjtimex \
>>     clock_adjtime delete_module fanotify_init finit_module get_mempolicy \
>>     init_module io_destroy io_getevents iopl ioperm io_setup io_submit \
>>     io_cancel kcmp kexec_load keyctl lookup_dcookie migrate_pages move_pages \
>>     open_by_handle_at perf_event_open process_vm_readv process_vm_writev \
>>     ptrace remap_file_pages request_key set_mempolicy swapoff swapon uselib \
>>     vmsplice
>>
>> NoNewPrivileges=true
>> PrivateDevices=true
>> PrivateTmp=true
>> ProtectHome=true
>> ProtectSystem=strict
>> ProtectKernelModules=true
>> ProtectKernelTunables=true
>> ProtectControlGroups=true
>> InaccessiblePaths=/home
>> InaccessiblePaths=/opt
>> InaccessiblePaths=/root
>> ReadWritePaths=/run/named
>> ReadWritePaths=/var/cache/bind
>> ReadWritePaths=/var/lib/bind
>>
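The SystemCallFilter= value in the unit above uses systemd's list
syntax: a leading "~" turns the list into a deny list, and names
starting with "@" refer to predefined system call groups. A rough Python
sketch of how such a value splits apart follows; parse_syscall_filter is
a hypothetical helper (not part of systemd), and the sample value is
shortened.

```python
def parse_syscall_filter(value):
    """Split a SystemCallFilter= value into (is_deny_list, groups, syscalls)."""
    value = value.strip()
    # A leading "~" inverts the meaning: the listed calls are denied.
    is_deny_list = value.startswith("~")
    if is_deny_list:
        value = value[1:]
    tokens = value.split()
    # "@name" tokens are predefined groups; the rest are single syscalls.
    groups = [t for t in tokens if t.startswith("@")]
    syscalls = [t for t in tokens if not t.startswith("@")]
    return is_deny_list, groups, syscalls


if __name__ == "__main__":
    deny, groups, syscalls = parse_syscall_filter("~@mount @debug acct ptrace")
    print("deny list:", deny)
    print("groups:", groups)
    print("syscalls:", syscalls)
```

Without the leading "~", the same list would be an allow list, and every
system call not named in it would be refused.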
>>
>> --
>> Ludovic Gasc (GMLudo)
>>
>> 2018-01-15 21:14 GMT+01:00 Robert Edmonds <edmonds at mycre.ws>:
>>
>>     Tony Finch wrote:
>>     > Ludovic Gasc <gmludo at gmail.com> wrote:
>>     > >
>>     > > 1. The list of minimal capabilities needed for bind to run correctly:
>>     > > http://man7.org/linux/man-pages/man7/capabilities.7.html
>>     >
>>     > named already drops capabilities - have a look at the code around here:
>>     > https://source.isc.org/cgi-bin/gitweb.cgi?p=bind9.git;a=blob;f=bin/named/unix/os.c;hb=v9_11_2#l234
>>     >
>>     > Note that it's a bit clever - the privileges are dropped in two stages,
>>     > right at the start, and after the server has been configured.
>>
>>     I checked just now to see what that code actually ends up doing, and on
>>     my system I ended up with:
>>
>>         $ grep -h ^Cap /proc/$(pidof named)/**/status | sort | uniq -c
>>               6 CapAmb:     0000000000000000
>>               6 CapBnd:     0000003fffffffff
>>               6 CapEff:     0000000001000400
>>               6 CapInh:     0000000000000000
>>               6 CapPrm:     0000000001000400
>>         $
>>
>>     That decodes to:
>>
>>      - The effective and permitted capabilities sets were reduced to
>>        CAP_NET_BIND_SERVICE and CAP_SYS_RESOURCE.
>>
>>      - The ambient and inheritable capabilities sets were cleared.
>>
>>      - The capability bounding set was left completely open-ended.
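
The decoding of those masks can be reproduced with a few lines of
Python. This is only a sketch: the CAP_BITS table is a deliberately
small subset of the bit numbers in <linux/capability.h>, and decode_caps
is a name chosen for illustration.

```python
# Subset of capability bit numbers from <linux/capability.h>.
CAP_BITS = {
    8: "CAP_SETPCAP",
    10: "CAP_NET_BIND_SERVICE",
    18: "CAP_SYS_CHROOT",
    24: "CAP_SYS_RESOURCE",
}


def decode_caps(mask_hex):
    """Return the names of the bits set in a /proc/<pid>/status Cap* mask."""
    mask = int(mask_hex, 16)
    names = []
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            # Bits outside the small table are reported by number.
            names.append(CAP_BITS.get(bit, f"bit {bit}"))
        bit += 1
    return names


if __name__ == "__main__":
    print(decode_caps("0000000001000400"))
    # prints ['CAP_NET_BIND_SERVICE', 'CAP_SYS_RESOURCE']
```

The CapBnd value 0000003fffffffff has its lowest 38 bits set, which is
why the bounding set is described as open-ended.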
>>
>>     It's not clear why CAP_SYS_RESOURCE needs to be retained past startup:
>>
>>             /*
>>              * XXX  We might want to add CAP_SYS_RESOURCE, though it's not
>>              *      clear it would work right given the way linuxthreads work.
>>              * XXXDCL But since we need to be able to set the maximum number
>>              * of files, the stack size, data size, and core dump size to
>>              * support named.conf options, this is now being added to test.
>>              */
>>             SET_CAP(CAP_SYS_RESOURCE);
>>
>>     See commits 5e4b7294d88ab58371d8c98e05ea80086dcb67cd,
>>     108490a7f8529aff50a0ac7897580b59a73d9845. "[T]o test"?
>>
>>     CAP_SYS_RESOURCE is documented as permitting:
>>
>>        CAP_SYS_RESOURCE
>>               * Use reserved space on ext2 filesystems;
>>               * make ioctl(2) calls controlling ext3 journaling;
>>               * override disk quota limits;
>>               * increase resource limits (see setrlimit(2));
>>               * override RLIMIT_NPROC resource limit;
>>               * override maximum number of consoles on console allocation;
>>               * override maximum number of keymaps;
>>               * allow more than 64hz interrupts from the real-time clock;
>>               * raise msg_qbytes limit for a System V message queue above
>>                 the limit in /proc/sys/kernel/msgmnb (see msgop(2) and
>>                 msgctl(2));
>>               * allow the RLIMIT_NOFILE resource limit on the number of
>>                 "in-flight" file descriptors to be bypassed when passing
>>                 file descriptors to another process via a UNIX domain
>>                 socket (see unix(7));
>>               * override the /proc/sys/fs/pipe-size-max limit when setting
>>                 the capacity of a pipe using the F_SETPIPE_SZ fcntl(2)
>>                 command;
>>               * use F_SETPIPE_SZ to increase the capacity of a pipe above
>>                 the limit specified by /proc/sys/fs/pipe-max-size;
>>               * override /proc/sys/fs/mqueue/queues_max limit when creating
>>                 POSIX message queues (see mq_overview(7));
>>               * employ the prctl(2) PR_SET_MM operation;
>>               * set /proc/[pid]/oom_score_adj to a value lower than the
>>                 value last set by a process with CAP_SYS_RESOURCE.
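
For the "increase resource limits" item, the rule in setrlimit(2) is
that an unprivileged process may move its soft limit anywhere up to the
hard limit, while raising the hard limit itself needs CAP_SYS_RESOURCE.
A small Python sketch of that check follows; the helper name
fits_under_hard_limit is hypothetical.

```python
import resource


def fits_under_hard_limit(requested, hard):
    """True if `requested` can be set without raising the hard limit."""
    if hard == resource.RLIM_INFINITY:
        # An unlimited hard limit never needs to be raised.
        return True
    if requested == resource.RLIM_INFINITY:
        # Asking for "unlimited" under a finite hard limit needs privilege.
        return False
    return requested <= hard


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("RLIMIT_NOFILE soft/hard:", soft, hard)
    for wanted in (1024, 1048576):
        print(wanted, "fits under hard limit:",
              fits_under_hard_limit(wanted, hard))
```

A named.conf setting that asks for more open files than the hard limit
allows is the case where the retained capability would matter.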
>>
>>     I would guess that retaining CAP_NET_BIND_SERVICE and CAP_SYS_RESOURCE
>>     during the process runtime permits open-ended reloading of the config at
>>     runtime (e.g., binding to a new IP address on port 53 without needing to
>>     restart the daemon). So even though BIND drops some capabilities, it's
>>     still running with elevated privileges compared to a traditional
>>     non-root user.
>>
>>     systemd permits a nice pattern for network daemons that want to run as
>>     an unprivileged user, but bind to a privileged port (and without using
>>     socket activation), without starting the process as root. Basically, you
>>     put something like this in the unit file:
>>
>>         [Service]
>>         User=…
>>         Group=…
>>         CapabilityBoundingSet=CAP_NET_BIND_SERVICE CAP_SYS_CHROOT CAP_SETPCAP
>>         AmbientCapabilities=CAP_NET_BIND_SERVICE CAP_SYS_CHROOT CAP_SETPCAP
>>         …
>>
>>     Any needed filesystem directories and permissions must be set up
>>     correctly beforehand. The service is started by the init system as the
>>     unprivileged User/Group specified in the unit file, so there's no need
>>     to change UID/GID. CAP_NET_BIND_SERVICE is then used to bind to a
>>     privileged port, CAP_SYS_CHROOT is used to perform the chroot, and
>>     CAP_SETPCAP is used to drop all remaining capabilities from the
>>     capability sets and the capability bounding set, so you end up with a
>>     completely unprivileged process at runtime. (Alternatively you could
>>     keep CAP_NET_BIND_SERVICE and drop CAP_SYS_CHROOT and CAP_SETPCAP, if
>>     you wanted to retain the capability to perform privileged binds at
>>     runtime. Or you could eliminate CAP_SYS_CHROOT and use other systemd
>>     functionality to make parts of the filesystem inaccessible, etc.) This
>>     pattern might be a bit hard to retrofit into BIND at this point, though,
>>     other than by adding more knobs.
>>
>>     --
>>     Robert Edmonds
>>     _______________________________________________
>>     Please visit https://lists.isc.org/mailman/listinfo/bind-users to
>>     unsubscribe from this list
>>
>>     bind-users mailing list
>>     bind-users at lists.isc.org
>>     https://lists.isc.org/mailman/listinfo/bind-users
>>
> 

-- 
Petr Menšík
Software Engineer
Red Hat, http://www.redhat.com/
email: pemensik at redhat.com  PGP: 65C6C973