2.3 New system calls


The new system calls, like the existing chroot() system call, share one common feature: their effect cannot be reversed. Once a process has executed one of these system calls (chroot, new_s_context, set_ipv4root), there is no way back. The change affects the current process and all of its child processes. The parent process is not affected.

  • new_s_context (int ctx)

    This system call sets a new security context for the current process. It will be inherited by all child processes. The security context is just an id, but the system call makes sure a new unused one is allocated.

    A process can only see other processes sharing the same security context. When the system boots, the original security context is 0, but this context is not privileged in any way. Processes that are members of security context 0 can only see and interact with other members of context 0.

    This system call isolates the process space.

  • Setting the capabilities ceiling

    This is handled by the new_s_context system call as well. It lowers the capability ceiling of the current process. Even a setuid sub-process cannot acquire more capabilities than the ceiling allows. The capability system, available since Linux 2.2, is explained later in this document.

  • set_ipv4root(unsigned long ip)

    This system call locks the process (and its children) into using a single IP address, both when they communicate and when they install a service. The call is one-shot: once a process has set its IPv4 (Internet Protocol version 4) address to something other than 0.0.0.0, it cannot change it anymore. Its children cannot change it either.

    If a process tries to bind to a specific IP address, the bind succeeds only if that address matches the ipv4root (when the ipv4root is different from 0.0.0.0). If the process binds to any address (0.0.0.0), it gets the ipv4root instead.

    In short, once a process is locked to a given ipv4root, it is forced to use this IP address to establish services and to communicate. The restriction on services is handy: most services (web servers, SQL servers) bind to address 0.0.0.0. With the ipv4root set to a given IP address, two virtual servers can run the exact same general/vanilla configuration for a given service without any conflict.

    This system call isolates the IP network space.
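
The ipv4root rules described above can be sketched as a small user-space model. This is only an illustration of the decision logic, not the kernel implementation: the function names `set_ipv4root` and `effective_bind_address`, the use of `PermissionError`, and the sample addresses are all assumptions made for this example.

```python
ANY = "0.0.0.0"  # wildcard address; also means "no ipv4root set" in this model


def set_ipv4root(current_root, new_root):
    """Model of the one-shot rule: a root other than 0.0.0.0 can never be changed."""
    if current_root != ANY:
        raise PermissionError("ipv4root already set to " + current_root)
    return new_root


def effective_bind_address(ipv4root, requested):
    """Return the address a bind would really use (hypothetical helper)."""
    if ipv4root == ANY:
        return requested      # process is not locked: bind as requested
    if requested == ANY:
        return ipv4root       # wildcard bind is narrowed to the ipv4root
    if requested == ipv4root:
        return requested      # explicit bind to the process's own address
    raise PermissionError("cannot bind " + requested + " under ipv4root " + ipv4root)


# Two "virtual servers" with the same vanilla configuration (bind to 0.0.0.0)
# end up on different addresses, so they do not conflict.
root_a = set_ipv4root(ANY, "192.168.1.10")
root_b = set_ipv4root(ANY, "192.168.1.11")
addr_a = effective_bind_address(root_a, ANY)
addr_b = effective_bind_address(root_b, ANY)
```

In this model the two services request the same wildcard address but end up bound to `192.168.1.10` and `192.168.1.11` respectively, which is why identical configurations can coexist.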

Those system calls are not privileged. Any user may issue them.
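
The security context and capability ceiling rules can be modeled in the same spirit. The sketch below is a toy model, assuming a hypothetical `Proc` class and a `remove_caps` argument that the real new_s_context system call does not necessarily have. It only shows that a fresh context is allocated, that visibility is limited to the same context, that children inherit both settings, and that the ceiling caps what a setuid program can obtain.

```python
import itertools
from dataclasses import dataclass

# Context 0 is the boot context; new contexts are allocated from 1 upward (assumption).
_unused_ctx = itertools.count(1)

# A small sample of real Linux capability names, used here as plain strings.
ALL_CAPS = frozenset({"CAP_CHOWN", "CAP_NET_ADMIN", "CAP_SYS_ADMIN"})


@dataclass
class Proc:
    ctx: int = 0
    cap_ceiling: frozenset = ALL_CAPS

    def fork(self):
        """Children inherit the security context and the capability ceiling."""
        return Proc(self.ctx, self.cap_ceiling)

    def new_s_context(self, remove_caps=frozenset()):
        """Move to a new, unused context and optionally lower the ceiling.

        There is no method to go back: the model offers no way to rejoin an
        old context or to add capabilities to the ceiling.
        """
        self.ctx = next(_unused_ctx)
        self.cap_ceiling = self.cap_ceiling - frozenset(remove_caps)

    def can_see(self, other):
        """A process only sees processes in the same security context."""
        return self.ctx == other.ctx

    def setuid_caps(self, wanted):
        """Capabilities a setuid sub-process would really get under the ceiling."""
        return frozenset(wanted) & self.cap_ceiling


boot = Proc()                      # context 0
server = boot.fork()
server.new_s_context(remove_caps={"CAP_SYS_ADMIN"})
worker = server.fork()             # inherits the new context and the lowered ceiling
```

Here `server` and `worker` can see each other but not `boot`, and `boot` cannot see them either, which matches the statement that context 0 is not privileged.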
