Enable systemd hardening options for named

Petr Menšík pemensik at redhat.com
Wed Jan 31 18:28:35 UTC 2018



On 31.1.2018 at 15:37, Reindl Harald wrote:
> 
> On 31.01.2018 at 15:18, Petr Menšík wrote:
>> as a Fedora maintainer of the BIND package, I can only say that SELinux in
>> enforcing mode will provide better hardening than most of the suggested
>> changes. That does not mean they are not useful, but most of them are
>> irrelevant with SELinux in enforcing mode. We want all Fedora users to
>> run in enforcing mode, especially on servers.
>>
>> Especially restricting path access does not make sense with SELinux. It
>> is much more powerful and is already used.
> 
> it is completely irrelevant because when you switch SELinux to
> "permissive" in case you need to debug something it's gone and hence
> layered-security is always the way to go
That depends on your needs. Remember, we never prevent you from modifying
the default configuration to suit your needs. You can increase the
security of your installation in any way you wish.

My point is, SELinux is an additional layer on top of basic security. By
basic security I mean correctly configured permissions on all files, an
up-to-date system with all security updates installed, and correctly
configured ACLs.

Additional layers usually provide more security at the cost of more
checking and restrictions, and they may consume more resources for the
extra protection. The resources spent on additional layers are not
free. That is why I do not want to add as many layers as possible: it
is quite possible they will slow down the server without bringing any
real benefit. From what I understand, each PrivateXY= option in systemd
requires an additional mount. That does not consume much, but it is
still something. I was once able to bring the kernel to its knees with
a single bug related to mounting.

CapabilityBoundingSet sounds good, but file access restrictions are
already solved by SELinux. Also, with its good reporting layer, failures
are visible and more or less understandable. SystemCallFilter sounds
promising. Starting as a non-privileged user from systemd looks nice too.
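
For illustration, those three options could be tried on their own with a
drop-in along these lines. This is only a sketch, not a tested Fedora
configuration. The drop-in path and the "named" user and group names are
assumptions, and the capability and syscall filter values are copied from
Ludovic's merged file quoted further down. Since named itself tries to
keep CAP_SYS_RESOURCE, limiting the bounding set like this may change its
behaviour, and the existing ExecStart options may need adjusting, so
check the log after a restart.

```ini
# Hypothetical drop-in: /etc/systemd/system/named.service.d/hardening.conf
[Service]
# Start unprivileged directly from systemd (account names are assumed)
User=named
Group=named
# Keep only the capability needed to bind port 53
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE
# Deny mount-related and debugging system calls
SystemCallFilter=~@mount @debug
```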
> 
> the same for service-configuration even if you have iptables running - i
> had a case some years ago when i tried to enable SELinux on my personal
> machine where i found failed logins in samba *because* SELinux, for
> whatever reason, caused iptables to fail at boot
I am really sorry about that. If you have trouble with Samba or iptables,
please file a bug at http://bugzilla.redhat.com. The point is, that can
happen, and you might be forced to switch to permissive for a while.
Still, your BIND is not at risk just because of that. Just solve your
problem and turn enforcing back on, please. Use semanage to create local
exceptions if required.
> 
>> On 16.1.2018 at 13:52, Daniel Stirnimann wrote:
>>> Hello all,
>>>
>>> Just wondering, if one is already using selinux in enforcing mode, does
>>> systemd hardening provide any additional benefit?
>>>
>>> Daniel
>>>
>>> On 16.01.18 12:21, Ludovic Gasc wrote:
>>>> Hi,
>>>>
>>>> I have merged config files from Tony, Robert, and me.
>>>> I have tried to be the most generic, the result below.
>>>>
>>>> It seems to work here without regression, except a warning:
>>>> managed-keys-zone: Unable to fetch DNSKEY set '.': operation canceled
>>>>
>>>> But only at the first boot, I don't see the message anymore when I
>>>> restart the daemon.
>>>> Any clue?
>>>>
>>>> Thanks for your feedback.
>>>>
>>>> [Unit]
>>>> After=network-online.target
>>>>
>>>> [Service]
>>>> Type=simple
>>>> TimeoutSec=25
>>>> Restart=always
>>>> RestartSec=1
>>>> User=bind
>>>> Group=bind
>>>> CapabilityBoundingSet=CAP_NET_BIND_SERVICE
>>>> AmbientCapabilities=CAP_NET_BIND_SERVICE
>>>> SystemCallFilter=~@mount @debug acct modify_ldt add_key adjtimex \
>>>>     clock_adjtime delete_module fanotify_init finit_module get_mempolicy \
>>>>     init_module io_destroy io_getevents iopl ioperm io_setup io_submit \
>>>>     io_cancel kcmp kexec_load keyctl lookup_dcookie migrate_pages \
>>>>     move_pages open_by_handle_at perf_event_open process_vm_readv \
>>>>     process_vm_writev ptrace remap_file_pages request_key set_mempolicy \
>>>>     swapoff swapon uselib vmsplice
>>>>
>>>> NoNewPrivileges=true
>>>> PrivateDevices=true
>>>> PrivateTmp=true
>>>> ProtectHome=true
>>>> ProtectSystem=strict
>>>> ProtectKernelModules=true
>>>> ProtectKernelTunables=true
>>>> ProtectControlGroups=true
>>>> InaccessiblePaths=/home
>>>> InaccessiblePaths=/opt
>>>> InaccessiblePaths=/root
>>>> ReadWritePaths=/run/named
>>>> ReadWritePaths=/var/cache/bind
>>>> ReadWritePaths=/var/lib/bind
>>>>
>>>>
>>>> -- 
>>>> Ludovic Gasc (GMLudo)
>>>>
>>>> 2018-01-15 21:14 GMT+01:00 Robert Edmonds <edmonds at mycre.ws>:
>>>>
>>>>      Tony Finch wrote:
>>>>      > Ludovic Gasc <gmludo at gmail.com> wrote:
>>>>      > >
>>>>      > > 1. The list of minimal capabilities needed for bind to run correctly:
>>>>      > > http://man7.org/linux/man-pages/man7/capabilities.7.html
>>>>      >
>>>>      > named already drops capabilities - have a look at the code around here:
>>>>      >
>>>>      > https://source.isc.org/cgi-bin/gitweb.cgi?p=bind9.git;a=blob;f=bin/named/unix/os.c;hb=v9_11_2#l234
>>>>      >
>>>>      > Note that it's a bit clever - the privileges are dropped in two stages,
>>>>      > right at the start, and after the server has been configured.
>>>>
>>>>      I checked just now to see what that code actually ends up doing, and on
>>>>      my system I ended up with:
>>>>
>>>>          $ grep -h ^Cap /proc/$(pidof named)/**/status | sort | uniq -c
>>>>                6 CapAmb:     0000000000000000
>>>>                6 CapBnd:     0000003fffffffff
>>>>                6 CapEff:     0000000001000400
>>>>                6 CapInh:     0000000000000000
>>>>                6 CapPrm:     0000000001000400
>>>>          $
>>>>
>>>>      That decodes to:
>>>>
>>>>       - The effective and permitted capabilities sets were reduced to
>>>>         CAP_NET_BIND_SERVICE and CAP_SYS_RESOURCE.
>>>>
>>>>       - The ambient and inheritable capabilities sets were cleared.
>>>>
>>>>       - The capability bounding set was left completely open-ended.
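
As a cross-check of the decoding above, here is a minimal Python sketch
that turns the hex masks from /proc/PID/status into capability names. The
bit numbers 10 (CAP_NET_BIND_SERVICE) and 24 (CAP_SYS_RESOURCE) come from
<linux/capability.h>. The function name decode_caps and the partial
lookup table are illustrative only, and bits not in the table are printed
by number.

```python
# Capability bit numbers from <linux/capability.h>; only the two seen here.
CAP_NAMES = {
    10: "CAP_NET_BIND_SERVICE",
    24: "CAP_SYS_RESOURCE",
}


def decode_caps(mask_hex):
    """Return the names (or bit numbers) of all bits set in a hex mask."""
    mask = int(mask_hex, 16)
    names = []
    bit = 0
    # Walk the mask from the lowest bit up until no set bits remain.
    while mask >> bit:
        if (mask >> bit) & 1:
            names.append(CAP_NAMES.get(bit, "bit %d" % bit))
        bit += 1
    return names


# CapEff / CapPrm value from the output above
print(decode_caps("0000000001000400"))
# Number of capabilities still allowed by the CapBnd value above
print(len(decode_caps("0000003fffffffff")))
```

The first call reports CAP_NET_BIND_SERVICE and CAP_SYS_RESOURCE, and the
second shows that all 38 low bits of the bounding set are still set. If
libcap is installed, "capsh --decode=0000000001000400" gives the same
decoding.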
>>>>
>>>>      It's not clear why CAP_SYS_RESOURCE needs to be retained past startup:
>>>>
>>>>              /*
>>>>               * XXX  We might want to add CAP_SYS_RESOURCE, though it's not
>>>>               *      clear it would work right given the way linuxthreads work.
>>>>               * XXXDCL But since we need to be able to set the maximum number
>>>>               * of files, the stack size, data size, and core dump size to
>>>>               * support named.conf options, this is now being added to test.
>>>>               */
>>>>              SET_CAP(CAP_SYS_RESOURCE);
>>>>
>>>>      See commits 5e4b7294d88ab58371d8c98e05ea80086dcb67cd,
>>>>      108490a7f8529aff50a0ac7897580b59a73d9845. "[T]o test"?
>>>>
>>>>      CAP_SYS_RESOURCE is documented as permitting:
>>>>
>>>>         CAP_SYS_RESOURCE
>>>>                * Use reserved space on ext2 filesystems;
>>>>                * make ioctl(2) calls controlling ext3 journaling;
>>>>                * override disk quota limits;
>>>>                * increase resource limits (see setrlimit(2));
>>>>                * override RLIMIT_NPROC resource limit;
>>>>                * override maximum number of consoles on console allocation;
>>>>                * override maximum number of keymaps;
>>>>                * allow more than 64hz interrupts from the real-time clock;
>>>>                * raise msg_qbytes limit for a System V message queue above the
>>>>                  limit in /proc/sys/kernel/msgmnb (see msgop(2) and msgctl(2));
>>>>                * allow the RLIMIT_NOFILE resource limit on the number of
>>>>                  "in-flight" file descriptors to be bypassed when passing file
>>>>                  descriptors to another process via a UNIX domain socket (see
>>>>                  unix(7));
>>>>                * override the /proc/sys/fs/pipe-size-max limit when setting the
>>>>                  capacity of a pipe using the F_SETPIPE_SZ fcntl(2) command.
>>>>                * use F_SETPIPE_SZ to increase the capacity of a pipe above the
>>>>                  limit specified by /proc/sys/fs/pipe-max-size;
>>>>                * override /proc/sys/fs/mqueue/queues_max limit when creating
>>>>                  POSIX message queues (see mq_overview(7));
>>>>                * employ the prctl(2) PR_SET_MM operation;
>>>>                * set /proc/[pid]/oom_score_adj to a value lower than the value
>>>>                  last set by a process with CAP_SYS_RESOURCE.
>>>>
>>>>      I would guess that retaining CAP_NET_BIND_SERVICE and CAP_SYS_RESOURCE
>>>>      during the process runtime permits open-ended reloading of the config at
>>>>      runtime (e.g., binding to a new IP address on port 53 without needing to
>>>>      restart the daemon). So even though BIND drops some capabilities, it's
>>>>      still running with elevated privileges compared to a traditional
>>>>      non-root user.
>>>>
>>>>      systemd permits a nice pattern for network daemons that want to run as
>>>>      an unprivileged user, but bind to a privileged port (and without using
>>>>      socket activation), without starting the process as root. Basically, you
>>>>      put something like this in the unit file:
>>>>
>>>>          [Service]
>>>>          User=…
>>>>          Group=…
>>>>          CapabilityBoundingSet=CAP_NET_BIND_SERVICE CAP_SYS_CHROOT CAP_SETPCAP
>>>>          AmbientCapabilities=CAP_NET_BIND_SERVICE CAP_SYS_CHROOT CAP_SETPCAP
>>>>          …
>>>>
>>>>      Any needed filesystem directories and permissions need to be set up
>>>>      correctly beforehand. The service is started by the init system as the
>>>>      unprivileged User/Group specified in the unit file, so there's no need
>>>>      to change UID/GID. CAP_NET_BIND_SERVICE is then used to bind to a
>>>>      privileged port, CAP_SYS_CHROOT is used to perform the chroot, and
>>>>      CAP_SETPCAP is used to drop all remaining capabilities from the
>>>>      capability sets and the capability bounding set, so you end up with a
>>>>      completely unprivileged process at runtime. (Alternatively you could
>>>>      keep CAP_NET_BIND_SERVICE and drop CAP_SYS_CHROOT and CAP_SETPCAP, if
>>>>      you wanted to retain the capability to perform privileged binds at
>>>>      runtime. Or you could eliminate CAP_SYS_CHROOT and use other systemd
>>>>      functionality to make parts of the filesystem inaccessible, etc.) This
>>>>      pattern might be a bit hard to retrofit into BIND at this point, though,
>>>>      other than by adding more knobs.

-- 
Petr Menšík
Software Engineer
Red Hat, http://www.redhat.com/
email: pemensik at redhat.com  PGP: 65C6C973

