Monthly Archives: April 2005

SELinux kernel changes in 2.6.12-rc3

A few more SELinux kernel patches have been merged since 2.6.12-rc2 and are now available in -rc3:

  • Explicit support has been added for the KOBJECT_UEVENT Netlink family. This allows SELinux permissions to be applied specifically to these types of sockets, rather than the default, which was to treat them as generic Netlink sockets. KOBJECT_UEVENT messages are sent by the kernel to userspace to provide notification of changes to kobjects. You’ll most likely need to update to the latest policy packages if you install this kernel, as haldaemon makes use of these types of Netlink sockets, for which older policy will not have any permissions. This is what you’ll see without an updated policy:
    avc:  denied  { create } for  scontext=system_u:system_r:hald_t
                                  tcontext=system_u:system_r:hald_t
                                  tclass=netlink_kobject_uevent_socket
    
  • A bug was fixed in the detection of NETLINK_IP6_FW messages (as used in the upstream kernel by ip6_queue), where such messages would instead be detected as generic Netlink messages.
  • Stephen Smalley fixed an audit-related deadlock in SELinux, which was discovered by IBM testing. His patch moves more SELinux logging to the audit subsystem, cleaning up the SELinux code and allowing the logged information to be more complete and reliable. I would thus now suggest always running SELinux with audit enabled (which is now the default in Fedora rawhide). With audit enabled and auditd running, AVC messages will go to wherever auditd is configured to send them, as specified by /etc/auditd.conf. This is /var/log/audit/audit.log on my system, which takes a bit of getting used to after years of having AVC messages splattered across the console.
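
With AVC messages landing in a log file rather than on the console, it can be handy to pick them apart with a script. What follows is only a rough sketch in Python: the parse_avc helper and its regular expression are my own illustration rather than part of any SELinux tool, and it assumes the simple key=value layout of the denial shown above (values containing spaces would need more careful handling).

```python
import re

# Matches the leading part of an AVC message: result, permission set, then the fields.
AVC_RE = re.compile(
    r"avc:\s+(?P<result>denied|granted)\s+\{(?P<perms>[^}]*)\}\s+for\s+(?P<rest>.*)",
    re.S,
)

def parse_avc(message):
    """Return a dict describing one AVC message, or None if it is not one."""
    m = AVC_RE.search(message)
    if m is None:
        return None
    # The remaining text is a whitespace-separated run of key=value pairs.
    fields = dict(
        pair.split("=", 1) for pair in m.group("rest").split() if "=" in pair
    )
    return {
        "result": m.group("result"),
        "perms": m.group("perms").split(),
        "scontext": fields.get("scontext"),
        "tcontext": fields.get("tcontext"),
        "tclass": fields.get("tclass"),
    }

# The denial quoted above, wrapped across lines as it appears in the text.
sample = """avc:  denied  { create } for  scontext=system_u:system_r:hald_t
                                  tcontext=system_u:system_r:hald_t
                                  tclass=netlink_kobject_uevent_socket"""

info = parse_avc(sample)
print(info["result"], info["perms"], info["tclass"])
```

Filtering on the tclass field is a quick way to spot denials for the new netlink_kobject_uevent_socket class after a kernel update.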

New SELinux features for 2.6.12: name_connect, working reiserfs xattrs, checkreqprot etc.

With Linus’ kernel at -rc2 stage, we now have a reasonable idea of what 2.6.12 will look like. Here’s an overview of SELinux kernel features which have been added since 2.6.11.

name_connect
This is a new permission for the TCP socket class, specified by Dan Walsh and implemented by Stephen Smalley. It extends the way SELinux handles the connect(2) syscall for TCP sockets. The existing SELinux connect permission is applied to all classes of sockets, simply determining whether a domain is allowed to use the connect(2) syscall on the specified socket class. With name_connect, SELinux makes an additional check as to whether the domain can connect to a specific set of port types. Here are some examples from current policy sources:

allow mount_t portmap_port_t:tcp_socket name_connect;
allow squid_t { http_port_t http_cache_port_t }:tcp_socket name_connect;
allow unconfined_t port_type:tcp_socket name_connect;

The first is the simplest case, where a domain needs to connect to one type of port. In this case, the mount_t domain is allowed to initiate TCP connections to ports of portmap_port_t type. The latter is defined in policy as port 111 via the portcon directive. This allows the mount_t domain to perform portmap queries over TCP when mounting NFS partitions.

Importantly, as SELinux denies everything by default, allowing only what’s explicitly specified, the mount_t domain can’t connect to any other types of port.

In the second example, we see how the policy language supports multiple port types: the squid_t domain is allowed to initiate connections to ports of type http_port_t and http_cache_port_t.

A catch-all entry is shown last, for the unconfined_t domain. In this case, the port_type attribute is used to signify all port types. This is useful if you don’t know what ports a domain will need, or haven’t yet analyzed the needs of the domain — assuming you prefer it to work correctly rather than not at all.
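
To make the default-deny behaviour of the three rules concrete, here is a toy model in Python. It is not how the kernel evaluates policy; it only mirrors the logic described above. Port 111 for portmap_port_t comes from the text, while the other port numbers and the port_t fallback type for unlabeled ports are assumptions made for illustration.

```python
# Port labels, standing in for portcon entries in policy.
PORT_TYPES = {
    111: "portmap_port_t",      # from the text
    80: "http_port_t",          # assumed port number
    3128: "http_cache_port_t",  # assumed port number
}
DEFAULT_PORT_TYPE = "port_t"    # assumed label for ports with no portcon entry

# Stand-in for the port_type attribute: every port type known to this model.
ALL_PORT_TYPES = set(PORT_TYPES.values()) | {DEFAULT_PORT_TYPE}

# The three allow rules quoted above, keyed by source domain.
ALLOW_NAME_CONNECT = {
    "mount_t": {"portmap_port_t"},
    "squid_t": {"http_port_t", "http_cache_port_t"},
    "unconfined_t": ALL_PORT_TYPES,
}

def may_name_connect(domain, port):
    """True if the model allows `domain` to initiate a TCP connection to `port`."""
    port_type = PORT_TYPES.get(port, DEFAULT_PORT_TYPE)
    # Anything without an explicit allow rule is denied.
    return port_type in ALLOW_NAME_CONNECT.get(domain, set())

print(may_name_connect("mount_t", 111))      # allowed by the first rule
print(may_name_connect("mount_t", 80))       # no rule, so denied
print(may_name_connect("unconfined_t", 25))  # catch-all rule
```

A domain with no rule at all, such as a hypothetical httpd_t entry missing from the table, is denied for every port, which is the point of the default-deny model.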

In terms of the implementation, some thought was given to extending the existing packet-level controls, recv_msg and send_msg, to detect whether a TCP packet has the SYN bit set. This approach would have added significant complexity and per-packet processing overhead to the SELinux code, would potentially have been difficult to support extensibly with the existing policy language, and would have replicated existing iptables functionality. As we’re only concerned with outgoing stream connections and the associated local process, it turned out better to place the hook at the socket level. This was done by adding some TCP-specific code to the existing selinux_socket_connect hook, which is called during connect(2) via the LSM security_socket_connect hook. No core kernel code was modified.

The name_connect permission should help lock down a lot of the networking policy. For example, domains which need networking also tend to need to perform DNS lookups. The new permission will allow domains to be locked down to only outgoing DNS (in the TCP case) and whatever else they actually need. This kind of fine-grained control should help prevent some common network security problems, such as compromised systems being used to send spam. If, say, Apache is compromised via a remote hole, SELinux can still lock down the network resources used by the application at the kernel level. In this case, with name_connect, it’s quite simple: domains which need to initiate TCP connections to port 25 are provided with the permission, and all others are denied by default.

reiserfs
Previously, SELinux file labeling for reiserfs via xattrs has been seriously broken, as described in this mailing list post. The problem was that reiserfs maintains internal inodes which the VFS, and thus SELinux, could not distinguish from ordinary ones, leading to problems such as a deadlock between reiserfs and SELinux when reiserfs creates internal inodes for storing xattr data. This issue was solved by Jeff Mahoney of SUSE, who added an S_PRIVATE flag to mark inodes as internal to the filesystem. This flag signals to the LSM framework (and its applications, such as SELinux) that access control for the inode will be handled internally by the low-level filesystem code.

It appears to be working well now, although I’m not sure how much testing coverage it’s had. Currently, standard SELinux policy disables the use of xattrs for reiserfs, so if you want to try it out, you’ll need to edit policy sources:

  1. Edit genfs_contexts and remove the genfscon line for reiserfs:
    # reiserfs - until xattr security support works properly
    #genfscon reiserfs      /      system_u:object_r:nfs_t
    
  2. Edit fs_use and add:
    fs_use_xattr reiserfs system_u:object_r:fs_t;

Build, then load the policy (and install it if you wish to keep using it). When mounting any reiserfs partitions after this, you should see the following type of kernel log message:

SELinux: initialized (dev hda3, type reiserfs), uses xattr

The “uses xattr” clause shows that xattr labeling is active on the partition (which will need labeling if it’s never been labeled before). If you intend to keep using reiserfs with SELinux xattrs before your distro fully supports it, you’ll also need to update the policy Makefile so that reiserfs is included in the FILESYSTEMS variable, and also make similar changes to the fixfiles script.
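
If you have several partitions to check, the kernel log line can be matched with a short script. This is a minimal sketch in Python which assumes the message format is exactly as shown above; the helper names are made up for illustration.

```python
import re

# Matches the mount-time message quoted above.
INIT_RE = re.compile(
    r"SELinux: initialized \(dev (?P<dev>[^,]+), type (?P<fstype>[^)]+)\), "
    r"(?P<behavior>.+)"
)

def labeling_behavior(line):
    """Return (device, filesystem type, labeling behavior), or None if no match."""
    m = INIT_RE.search(line)
    if m is None:
        return None
    return m.group("dev"), m.group("fstype"), m.group("behavior").strip()

def uses_xattr(line, fstype="reiserfs"):
    """True if the line reports xattr labeling for the given filesystem type."""
    parsed = labeling_behavior(line)
    return parsed is not None and parsed[1] == fstype and parsed[2] == "uses xattr"

line = "SELinux: initialized (dev hda3, type reiserfs), uses xattr"
print(labeling_behavior(line))
print(uses_xattr(line))
```

Feeding it each line of the kernel log and collecting the devices for which uses_xattr returns False gives a list of reiserfs partitions still using the old labeling behavior.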

Enhanced MLS
Multi-level security (MLS) is typically used by military and government folk who need to manage information at different security levels, and handle users with varying security clearances. SELinux has always had rudimentary and experimental MLS support, although I don’t believe it’s ever had much use. A company called Trusted Computer Solutions (TCS) has been working on a more flexible and fully featured MLS implementation for SELinux, and an initial patch of theirs is now in the upstream kernel. It updates the core MLS implementation, replacing hard-coded logic with a more flexible and expressive system based on policy language constraints. The patch also allows MLS to be enabled at boot-time (like SELinux itself), and supports a single policy binary format for MLS and non-MLS systems.

Chad Hanson gave a talk on their work at the recent SELinux symposium, slides of which may be downloaded here. MLS under SELinux is still incomplete, lacking features such as polyinstantiated directories, and somewhat experimental. If you’re feeling brave, you can try setting up an MLS system with the new TCS code by following the README.MLS in the latest selinux-doc package. It’s probably best to try this on a system which you don’t mind reinstalling from scratch if something goes wrong.

Memory protection checking for legacy binaries & libraries
In recent kernels, when a process maps a memory region (or changes its protection), the requested protection flags may be modified by the kernel before access control checks are performed. This currently happens if a binary or shared library is marked as needing an executable stack, and PROT_READ is requested. In this case, the kernel will also add PROT_EXEC (“read-implies-exec”). With this behavior, SELinux policy for some applications had to be loosened to allow execution privileges when they may not have really been required (e.g. an application isn’t marked at all and the kernel assumes it needs an executable stack).

To address this, Stephen Smalley implemented a way to control whether SELinux uses the protection that the kernel will apply, or the protection originally requested by the application. An selinuxfs node has been added, checkreqprot, which allows selection of the required SELinux behavior by writing one of the following values to it:

0 – SELinux uses the protection value to be applied by the kernel.
1 – SELinux uses the protection value originally requested by the application.

The current value may be viewed simply with:

# cat /selinux/checkreqprot 
1

The default value may be set during kernel compilation via the SECURITY_SELINUX_CHECKREQPROT_VALUE parameter.
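
The difference between the two settings can be sketched as a small model in Python. This is not kernel code: the function names are made up, the read_implies_exec argument stands in for the kernel's decision that a task needs an executable stack, and the PROT_* constants are taken from Python's mmap module on Unix-like systems.

```python
import mmap

def applied_prot(requested, read_implies_exec):
    """Protection the kernel will apply: PROT_READ gains PROT_EXEC
    when the task is treated as read-implies-exec."""
    if read_implies_exec and requested & mmap.PROT_READ:
        return requested | mmap.PROT_EXEC
    return requested

def prot_checked_by_selinux(requested, read_implies_exec, checkreqprot):
    """Protection value SELinux checks, per the checkreqprot setting."""
    if checkreqprot == 1:
        # 1: use what the application originally asked for.
        return requested
    # 0: use what the kernel will actually apply.
    return applied_prot(requested, read_implies_exec)

def read_checkreqprot(path="/selinux/checkreqprot"):
    """Read the current setting from selinuxfs (path as shown in the text)."""
    with open(path) as node:
        return int(node.read().strip())

rw = mmap.PROT_READ | mmap.PROT_WRITE
legacy = True  # a binary assumed to need an executable stack

for setting in (0, 1):
    checked = prot_checked_by_selinux(rw, legacy, setting)
    print(setting, bool(checked & mmap.PROT_EXEC))
```

With a setting of 1, a legacy binary that only asks for read and write access is checked for read and write access, so policy does not need to grant it execute permission just because of read-implies-exec.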

Miscellaneous
SELinux now logs the details of any Netlink messages it doesn’t understand, to handle the case where new message types are added and the SELinux code is not updated. Such messages are now also allowed through if in permissive mode.

The SELinux boot options have been documented in the kernel source tree, as requested by Andrew Morton.

Code which handles writes to the /proc/<pid>/attr nodes has been modified to treat a newline character as a null value. The normal way to clear these nodes of any value is to do an empty write, but POSIX says that an empty write can be ignored, so we can’t rely on that behavior. So, thanks to POSIX, you can now clear the value of a procattr node by writing a newline to it. This also allows easier modification of the nodes from shell scripts, but direct manipulation of these nodes should not normally be performed: use libselinux and helper utilities instead.

Note that these changes are now available in Fedora rawhide kernels (and likely other distributions’ development pools tracking Linus’ kernel). If you update the kernel, make sure you also update your SELinux policy, as older policies will not have any allow rules for the name_connect permission and you could find your system somewhat more secure than you intended.