/etc/rc.d/init.d/httpd stop
/usr/sbin/chbind --ip eth0 /etc/rc.d/init.d/httpd start
This is annoying, especially because we want to automate the boot process. To solve this, the vserver package supplies 3 new services in the main server: v_sshd, v_xinetd and v_httpd. They are off by default. These services are simple wrappers that start the original service using the chbind command above. So if you need to run the same network service in the main server and some virtual server, turn off the service (linuxconf/control/control service activity) and enable the corresponding v_ service.
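As a rough illustration of what such a wrapper does, the sketch below builds the command line a v_ service would run. The function name v_service_cmd and the default device eth0 are assumptions made for the example; the real v_httpd, v_sshd and v_xinetd scripts may be written differently.

```shell
# Hypothetical helper: print the command a v_ wrapper service would run.
# Only "start" is wrapped with chbind, as in the manual stop/start sequence.
v_service_cmd() {
    service=$1
    action=$2
    dev=${3:-eth0}   # network device whose IP the service is bound to
    if [ "$action" = "start" ]; then
        echo "/usr/sbin/chbind --ip $dev /etc/rc.d/init.d/$service $action"
    else
        echo "/etc/rc.d/init.d/$service $action"
    fi
}

v_service_cmd httpd start
v_service_cmd httpd stop
```

The helper only prints the command, so it can be tried on a machine without the vserver package installed.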
/usr/sbin/vserver SERVER restart
But it can't be executed from inside the virtual server. To solve this, the rebootmgr (reboot manager) was created. It is a service which installs a unix domain socket in each virtual server (/dev/reboot). The /sbin/vreboot utility is used to request a reboot from any virtual server. The reboot manager knows which vserver is sending the request and executes the command above.
The /sbin/vhalt utility is supplied as well; it simply ends a virtual server's activity.
The rebootmgr is a service and is off by default. You must enable it.
vserver is also a little more verbose when starting and stopping a vserver. This makes the /var/log/boot.log file a little more explicit.
The sysv init script was missing the tags used by Linuxconf so it knows when rebootmgr must be restarted.
It is possible to share files using hard links. vunify was created to merge common packages using hard links so they use the same disk space. It works like this:
/usr/lib/vserver/vunify refserver serv1 serv2 serv3 -- pkg1 pkg2 pkg3 ...
refserver is the reference server. serv1 to serv3 are other vservers, and all vservers have a copy of the same RPM packages pkg1, pkg2. The utility will extract the list of files owned by the various packages. Configuration files are omitted from the list. Then it will walk the vservers, erase the files and set up hard links pointing to the files in refserver. Once done, it will turn on the files' immutable bit. This bit prevents the vservers from modifying the files. Even root can't do it. You will end up saving a lot of disk space. You may gain some performance as well, since the various shared objects will be loaded only once in memory for all vservers. This could be noticeable if you have many vservers.
Experiments have shown that unifying the packages bash, glibc, perl and binutils saves 60 megabytes per vserver. Given that those packages seldom change, this is a nice saving.
vunify also features a --undo option to un-unify some packages. It turns off the immutable bit and places a copy of the files in each vserver. This can be useful if you intend to update one package.
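The hard-link step can be tried in a scratch directory. This is only a sketch: unify_file and inode_of are hypothetical helpers, the content comparison with cmp is an assumption added for safety, and the immutable bit is left out because setting it needs root.

```shell
# Hypothetical helper: replace a vserver's copy of a file with a hard link
# to the reference copy. The real vunify also sets the immutable bit.
unify_file() {
    ref=$1
    copy=$2
    # Assumption: only unify when both files have identical content.
    cmp -s "$ref" "$copy" || return 1
    ln -f "$ref" "$copy"
}

# Print the inode number of a file.
inode_of() {
    ls -i "$1" | awk '{print $1}'
}

demo=$(mktemp -d)
mkdir -p "$demo/refserver/bin" "$demo/serv1/bin"
echo "same content" > "$demo/refserver/bin/tool"
echo "same content" > "$demo/serv1/bin/tool"
unify_file "$demo/refserver/bin/tool" "$demo/serv1/bin/tool"
# Both paths now share one inode, so the data is stored only once.
[ "$(inode_of "$demo/refserver/bin/tool")" = "$(inode_of "$demo/serv1/bin/tool")" ] && echo "unified"
rm -rf "$demo"
```

Hard links only work within one file system, so the reference server and the other vservers must live on the same volume for this to succeed.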
Once set, a security context is locked. It can't request another security context.
This changes the way priority is calculated for processes in a vserver. The priority is kind of agglomerated. A vserver running 50 active processes will have the same impact on the server as if it were running a single process (roughly).
By default, all new vservers are created with S_FLAGS="lock".
There was a drawback with this. The unified vservers were locked somewhat. A vserver administrator could not perform package update for example.
The new IMMUTABLE-LINKAGE-INVERT flag solves this. It modifies the way an immutable file behaves. With this flag on, the file may be unlinked (removed), allowing normal package updates. But the original data can't be modified.
The default for vunify and vbuild is to set both the IMMUTABLE-FILE and IMMUTABLE-LINKAGE-INVERT bits on linked files. This gives you robustness (one vserver can't modify the linked file shared by other vservers) and flexibility (one vserver may evolve independently).
You absolutely need vserver 0.6 to use this kernel. You can find more information about the new immutable-linkage-invert flag at http://sam.vilain.net/immutable. You will find there a modified ext2fsprog package to use those flags. The vunify and vbuild utilities do not need this package to operate though.
Here is the command line usage:
vbuild [ options ] reference-server new-vservers
By default, the immutable_file and immutable_link flags are set on the files. So if you want no immutable flags, you must use --noflags. If you want a single flag, you must use --noflags first, then the --immutable or --immutable-mayunlink flag.
vunify [ options ] reference-server vservers ... -- packages
By default, the immutable_file and immutable_link flags are set on the files. So if you want no immutable flags, you must use --noflags. If you want a single flag, you must use --noflags first, then the --immutable or --immutable-mayunlink flag.
If packages is ALL, then all packages in common with the reference server will be unified. The new vunify makes sure the package version is the same before unifying.
If you already have some vservers running and want to upgrade to the new kernel, here is the update sequence:
# Stop all vservers
/etc/rc.d/init.d/vservers stop
# Disable the vservers service
/sbin/chkconfig vservers off
# Install the new kernel in LILO
# reboot
# Update to the new vserver package
rpm -Uvh vserver-0.6-1.i386.rpm
# Enable the vservers package
/sbin/chkconfig vservers on
# Start the vservers
/etc/rc.d/init.d/vservers start
This setting defines ulimit settings passed to the vserver when it is started.
This contains the set of capabilities available to the vserver. For example, if you want a vserver to be able to do some pings, put the CAP_NET_RAW capability there.
When starting a vserver, the /var/run directory was not cleared. In some situations, the various startup scripts were failing because a bogus PID file was left there from a previous run.
The private flag is a little weird. Once a security context has this flag set, it is not possible to join it. Even root in the root server with all capabilities is not allowed. This makes the virtual server fairly private. Security context 1 can still see which processes are executing in the vserver, but can't interfere.
Since ext3 is now part of 2.4.16, it has been modified to support the IMMUTABLE_LINKAGE feature.
This utility works both in graphical and text mode. You must install the packages linuxconf-lib and linuxconf-util from http://www.solucorp.qc.ca/linuxconf/download.hc .
I have separated the vserver package in two: vserver and vserver-admin. newvserver is part of the latter.
/usr/sbin/vrpm ALL -- -Uvh /tmp/*.rpm
/usr/sbin/vrpm server1 server2 -- -Uvh this-rpm.rpm
It takes care of the --root command line option.
This always unmounts /proc and /dev/pts, even if the vserver was not running. The "enter" command leaves /proc and /dev/pts mounted and this was causing problems at shutdown time.
This is a little helper to control a service inside a vserver. This allows you to enter, act on a sysv service and leave.
This lets you execute a command in the context of a vserver and exits. The "service" and "enter" commands are implemented on top of that.
To support this feature, the /usr/sbin/vserver script had to be reworked a bit since entering a vserver context involves using chroot. So we had to kind of enter the context, then kill CAP_SYS_CHROOT
chmod 000 /vservers
Setting these permission bits (well, turning them all off) makes the directory inaccessible to any user other than root. The change in the ctx-6 kernel makes such a directory unusable even by root in a different security context (not 0).
The /usr/sbin/vserver script will create /vservers appropriately. If the directory exists, it will check the permissions and warn the admin if they are not 000.
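A permission check of this kind could look like the sketch below. The helper name perms_are_000 is hypothetical and this is not the code used by /usr/sbin/vserver; the demo works on a scratch directory rather than the real /vservers.

```shell
# Hypothetical check: succeed only if the directory has mode 000.
perms_are_000() {
    # ls -ld prints the mode string first, e.g. "d---------"
    mode=$(ls -ld "$1" | cut -c1-10)
    [ "$mode" = "d---------" ]
}

demo=$(mktemp -d)
mkdir "$demo/vservers"
chmod 000 "$demo/vservers"
if perms_are_000 "$demo/vservers"; then
    echo "permissions ok"
else
    echo "warning: $demo/vservers is not mode 000"
fi
rmdir "$demo/vservers" "$demo"
```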
The features are:
You can get the patch and binaries as usual from ftp://ftp.solucorp.qc.ca/pub/vserver . The pub/vserver/patches also contains a relative patch from ctx-5 to ctx-6. You can review what was done this way.
This kernel probably plugs most security issues. There are still too many things visible in /proc as seen from a vserver. A new file system called vproc will be written to provide a limited view.
While this kernel should prevent a vserver administrator from gaining access to the vserver, there are still ways to produce some DOS by exhausting all resources. The nproc feature works correctly and controls the number of processes used by a vserver. Some more work is needed to address all the other resource limits (files, memory, ...).
There was no way to tell that you did not want a NIS domain name in a vserver when there was one set in the root server. You can now enter "none" as the S_DOMAINNAME value to achieve this.
Here is what fakeinit does in the kernel:
This marks the current process so it works like process number 1. Using this trick, a normal /sbin/init may be run in a vserver. The /usr/sbin/vserver command will use /sbin/init to start and stop a vserver. A properly configured /etc/inittab is needed though.
One nice thing about this feature is that /usr/sbin/vserver is somewhat distribution independent. It simply runs /sbin/init to start a vserver and then "/sbin/init 6" to stop it (and then kills the remaining processes). There are some drawbacks (for now) though, and input is welcome.
First, the vserver start-up is no longer synchronous. /usr/sbin/vserver used to run "/etc/rc.d/rc 3" and wait until it ended. Now, it runs /sbin/init, but /sbin/init won't end until the vserver ends. So /usr/sbin/vserver has to leave /sbin/init running in the background. This is a little annoying.
When a vserver is started like this, we don't see all the services start as before. Without fakeinit, we see each service getting started and an OK/FAIL message for each. Now, it goes completely silent. I have not investigated this behavior. I suspect /sbin/init is opening a new tty (console) and runs the start-up scripts using that newly opened console.
Since /sbin/init runs all the start-up code, we don't know when it is done, so we can't run the post-start section of the /etc/vservers/xx.sh script properly.
Note that both start-up strategies still work: fakeinit and the original. So your current vserver installation will work as before without any fiddling. Once we have ironed out the fakeinit drawbacks, this will become the default way of doing things.
and it will locate the vserver owning that process, enter its security context and issue the kill.
Only files are erased from /var/run at vserver build and start-up time. Sub-directories are left. Also, /var/run/utmp is created empty at start-up time.
It is created empty at vserver build time. It is ignored after that.
When entering a running vserver, the S_CAPS setting was not enabled for the shell. So if you had given the vserver some capabilities, they were not available when using "enter".
The ulimit resources for a user used to be shared across vservers. This was plain wrong since user ID N in a vserver is unrelated to user ID N in another vserver.
Contributed by Patrick Schaaf <email@example.com>
Note, this is unrelated to the multi-IP-per-vserver concept. A vserver normally uses a single IP to listen and talk. In general, this is not a problem. But it breaks the usual semantics a little. Most services out there simply do a bind on IP 0.0.0.0. This way, they expect to grab any incoming traffic. They also expect that talking to 127.0.0.1 is a good (configuration-free) way to talk to themselves. Some services are using localhost (which is redirected to the ipv4root of the vserver) and some are using 127.0.0.1 directly.
The ctx-8 kernel now maps 127.0.0.1 to the ipv4root of the vserver on the fly. This solves some issues with samba and should also (not tested) solve the issue with PostgreSQL.
The output of netstat is now filtered by vserver. This includes /proc/net/tcp. This is not done per ipv4root but using the security context. This was contributed by Martin Josefsson <firstname.lastname@example.org>
2.4.18 introduces new system calls (reserved at least), so we had to move our own at the end. If you have vserver-0.12, it does not matter, as it adapts to the kernel on the fly. You can use the same binary to run a 2.4.17ctx-any kernel or the new 2.4.18ctx-8.
vserver-stat was changed to use the new dynamic system call feature
This was done because the two system calls are not official (only reserved in the official kernel) and probably won't be until we have covered more ground...
vserver-0.12 uncovered a flaw where the file /proc/self/status was not properly parsed. But there was another gotcha. When used with an older kernel (older than 2.4.17ctx-8), the utilities were using the values in /usr/include/asm/unistd.h. Unfortunately, those values depend on the kernel currently installed on your computer. If it is a 2.4.18 kernel, the system calls have different numbers than on an older 2.4.17 kernel.
To make the story short, the vserver-0.13 utilities do not rely on kernel headers for their defaults, so they work with older kernels as well as the new 2.4.17ctx-8 or 2.4.18ctx-8. They have been tested on 2.4.17ctx-6. Please upgrade.
If you only specify the --ip option with a device, the broadcast address of the device is used. This is used namely in the /etc/init.d/v_xxxx services.
/usr/sbin/chbind --ip eth0 /bin/sh
The new chbind works on older kernels. The broadcast address is simply ignored.
No configuration change is needed to take advantage of that. You need the new kernel and vserver-0.14. Software like samba (which was already working in most cases) now works completely. Even dhcpd works inside a vserver (see the FAQ though).
So set_ipv4root was changed, but the kernel sports a syscall versioning system and vserver-0.14 supports it. So vserver-0.14 works with any "ctx" kernel. The new kernel also works with older vserver utilities, except that the vserver broadcast address will be improperly assigned.
/usr/sbin/vfiles reference-server server
Using the output of this command, one may archive only the relevant part of a vserver. You can use this to move a vserver from one server to another while moving only a few megabytes. On the target server, use vunify to fill in the missing files. You must have an identical reference server on the target server though.
Anyway, this is general purpose. Life may tell us if this is really useful :-)
The extra configuration file is optional.
I would be interested in other scripts like this to install SuSE, Mandrake and Debian from scratch. At some point, the newvserver front-end will offer those in the pop-up list. So you will be able to install either from the root server, from another vserver or from any distribution CD-ROM.
Contribution welcome :-)
When not using the fakeinit (per-vserver private init process) facility, the vserver script used to start in runlevel 3. It now uses the default runlevel (initdefault) found in the vserver's /etc/inittab file. This is one step closer to making this script distribution independent. Please test it and send me other fixes as needed.
The script uses /etc/init.d or /etc/rc.d/init.d on the fly.
When stopping a vserver, the IP alias is removed even if the vserver was not running. When you "enter" a vserver, the IP alias is put in place. If you stop it, it is removed. This is especially useful when you fiddle with two copies of a vserver (on different physical servers).
When stopping a vserver, the vserver script /etc/vservers/xx.sh is always called with the post-stop argument. So doing a "vserver xx stop" cleans everything.
When doing "vserver xx exec ..." or "vserver xx enter" and the vserver xx is not running, the /etc/vservers/xx.sh script is called with the pre-start option, making sure the vserver is entered with the proper environment.
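The steps mentioned here (pre-start, post-start, post-stop) arrive as the first argument of the companion script. A minimal sketch of the dispatch is shown below; the function wrapper vserver_hook and the messages are placeholders so the example can be run anywhere, while a real /etc/vservers/xx.sh would dispatch on "$1" directly and run site-specific commands.

```shell
# Sketch of the dispatch inside a companion script such as /etc/vservers/xx.sh.
# The actions are placeholders (assumptions), not what any real script does.
vserver_hook() {
    case "$1" in
        pre-start)  echo "prepare environment" ;;   # also run before exec/enter
        post-start) echo "vserver is up" ;;
        post-stop)  echo "clean up" ;;              # always run on stop
        *)          echo "ignored: $1" ;;
    esac
}

vserver_hook pre-start
vserver_hook post-stop
```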
When doing "vserver xx enter", bash is started with the option --login. This ensures proper environment settings.
Now that 2.4.18ctx-10 works (it should be as reliable as 2.4.18ctx-8), it is time to test the ctx-9 enhancements, notably the ability to support UDP broadcast in vservers. Samba now works completely out of the box in a vserver. Please test it out.
The build process properly configures /etc/sysconfig/network to help some packages operate properly. netatalk, for one, grabs the host name from /etc/sysconfig/network.
Do not forget the quotes!
The vserver utility will create the necessary IP aliases. The first one is created using the vserver name (eth0:name) and the others add a number as a suffix (eth0:name1, eth0:name2, ...).
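The naming rule can be written down in a few lines. This is a sketch only: alias_names is a hypothetical helper, the addresses are documentation examples, and the real script also configures the interfaces, which is omitted here.

```shell
# Hypothetical helper: print "alias ip" pairs following the naming rule above.
# Usage: alias_names DEVICE VSERVER_NAME IP...
alias_names() {
    dev=$1
    name=$2
    shift 2
    n=0
    for ip in "$@"; do
        # First alias has no suffix; later ones get 1, 2, ...
        if [ "$n" -eq 0 ]; then suffix=""; else suffix=$n; fi
        echo "$dev:$name$suffix $ip"
        n=$((n + 1))
    done
}

alias_names eth0 web 192.0.2.1 192.0.2.2 192.0.2.3
```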
The multi-IP support keeps the original semantics of the vserver in some ways. A service doing a bind ANY (bind to 0.0.0.0) will set up its IP service on the first IP number of the vserver. If you want to listen on several IPs, you will need to configure your service for each IP number explicitly. For example, for apache, you will need multiple listen statements. By default, apache has a "listen 80" statement (a bind any), which translates in a vserver to a listen on first-ip-of-the-vserver:80. So you must simply add listen statements for the remaining IPs. For example, for the above IPROOT statement:
This departs from the normal behavior of a Unix/Linux OS. When you do a bind any, you end up listening to every IP configured on the box. It was not possible to achieve that easily in the kernel while keeping performance high (100%) and yet controlling which IPs may be used by a vserver. So this is a compromise. Time will tell how usable it is.
There are still a few things to do to completely support Debian, notably the unification of DEB packages. We are getting there.
IPROOT="eth0:188.8.131.52 eth1:184.108.40.206 192.168.1.2"
IPROOTDEV=eth2
In the above example, IP 220.127.116.11 will be setup on device eth0, 18.104.22.168 will be installed on eth1 and 192.168.1.2 will go on eth2 (IPROOTDEV is used by default).
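The device selection rule can be sketched as follows. The function resolve_iproot is hypothetical and the addresses are documentation examples; it only shows how an optional device prefix and the IPROOTDEV default might be combined, not how the vserver script is actually written.

```shell
# Hypothetical helper: print "device ip" for each IPROOT entry.
# Usage: resolve_iproot DEFAULT_DEVICE ENTRY...
resolve_iproot() {
    default_dev=$1
    shift
    for entry in "$@"; do
        case "$entry" in
            *:*) echo "${entry%%:*} ${entry#*:}" ;;   # explicit device:ip
            *)   echo "$default_dev $entry" ;;        # fall back to IPROOTDEV
        esac
    done
}

IPROOT="eth0:192.0.2.1 eth1:198.51.100.1 192.168.1.2"
IPROOTDEV=eth2
# IPROOT is left unquoted on purpose so it splits into one entry per word.
resolve_iproot "$IPROOTDEV" $IPROOT
```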
This way, the package could be moved to /usr/local/sbin and /usr/local/lib/vserver if needed.
vserver NAME suexec user command args ...
The "status" command was added. It reports some information about a given vserver. Here is an example:
$ /usr/sbin/vserver smb001 status
Server smb001 is running
18 processes running
Vserver uptime: 12:01
The --silent general option was added. It kills all informative messages. It is generally used with the exec or running command. For example:
# Counting the processes
vserver --silent XXX exec ps ax | wc -l
# File /etc/vservices/xxx.conf
IP="22.214.171.124 126.96.36.199 192.168.1.2"
vrpm --unify serv1 serv2 -- -Uvh some packages ...
Currently, there is no Linux distribution (no OS in fact) which can answer those questions. Once a server has been abused, the intruder may have changed quite a lot and may have covered his tracks. When you execute a command on such a machine, you can't really trust the output.
Now, on a Linux server running vservers and no network service in the root server, you have one part of the solution. The root server and the kernel can't be tampered with. So you can always trust the various commands you are running.
Now, proving that the root server has never been exposed to a crack attempt during its entire life is a difficult project. All I say is that a root server can't be modified from the vservers or anywhere else if it has no network service.
This is nevertheless one goal of the vserver project: create a robust and trusted root server in which you can run all kinds of more flexible virtual servers.
Back to our normal schedule...
So if you trust the root server and you trust another (reference) vserver (one which is never running), you can use the vcheck utility to perform an rpm verify command, but using the RPM database in the reference vserver. The corresponding packages will be checked.
vcheck --verify refvserver vserver1 vserver2 ...
vcheck has another option, --diffpkgs, to compare the package lists of two vservers. You can see how the two vservers evolved.
/usr/sbin/vserver server exec command "argument with space"
And the command will receive a single argument.
The other section is called NIS/Ldap. It lets you enter the NIS domain, NIS server, LDAP base dn and LDAP server.
Both sections are normally found at the end of the installation of a Linux distribution.
This information is enabled in the vserver using the authconfig command. Not all distributions carry this command. We will have to figure out how to enable this on all distributions. If /usr/sbin/authconfig is missing in the vserver, the information is not applied. So newvserver works anyway.
/usr/lib/vserver/install-rh8.0 redhat full
Now, if you run this with the Redhat supplied kernel (2.4.18-14), it works. If you use a 2.4.19 kernel (2.4.19ctx-14 for example), rpm installs a few packages and then waits forever, trapped in a pause() system call.
I have not yet explained this behavior.
The --nodev option tells vserver not to skip this step.
/usr/sbin/vserver --nodev server enter
An IP alias will be set on this virtual device after configuring it. It uses the loopback number as the default IP to configure the vlan device.
To perform the getpwnam() call, glibc uses NSS (Name Service Switch) plug-ins to access the user information. These plug-ins are taken from the vserver environment and are not always compatible with the root server glibc.
To avoid this problem, we really need two utilities: one running in the root server, switching root and then calling another (/bin/id ?) in the vserver to learn about the user. This way, each utility will be compatible with its own world. Remember that a root server may be some Linux distribution/release and the vserver may be running a totally different distribution/release.
For now, I have fixed the problem somewhat, but it is not perfect. Before switching root, I perform a getpwnam("root"), so the plug-ins are loaded. When I perform the real getpwnam, after the switch, the plug-ins are already in memory so they work. Further, if the target user is root, I do not need to perform any of this and simply use UID 0.
Note that this capchroot feature is needed by the suexec sub-command of the vserver command.
This cheat kind of works. It works for most people. Now, if your vserver is running NIS and your root server is not, for example, then the NSS plug-ins loaded are not the ones needed in the vserver. In this case it does not work.
We will need a better solution. For now, what we have will work for pretty much everybody.
This is normal. Unless two vservers are sharing some IPs, they are allowed to do a bind(0.0.0.0) on the same port and it will show up this way. So this looks a little strange in the root server, but is perfectly ok in the vserver.
This optimization may go away when we tackle the per-vserver private network loopback. Depending on the solution selected, the common case may become two IPs per vserver (the loopback and the main IP).
I have not done any benchmark with the new bind(any) stuff. It might be a little slower. Potentially not visible. Comments welcome.
There is a TAB in the form to enter up to 4 directories to exclude.
This avoids having all your vservers end up with the same sshd private keys...
where stuff in square bracket is optional.
Now even if you are not using fakeinit, /var/run/utmp is properly initialized with the proper runlevel (as found in /etc/inittab).
This problem was specific to /var/log (and everything under) as far as I can see, so this is hard-coded in the vrpm script.
There are two ways to select a profile:
Once you have started a vserver with a given profile, it is stored in the /var/run/vserver/XX.ctx file, so you can enter and stop the vserver using the active profile, even if you have changed the profile value in the configuration file.
The newvserver tool has been modified so you can immediately enter the second profile value. By default, one profile is called prod and the other is called backup.
vservers may be used as a fail-over strategy where whole servers may be switched on and off on the fly. One may use some synchronization tool (rsync ?) to make sure the backup is up to date. Sometimes this is not enough and you wish to keep the backup in sync with the production vserver in real time, or almost. To do that, you need to enable the backup server, but you can't unless you provide different network settings (to avoid having two vservers running with the same IP). So the profile concept was introduced.
When starting a vserver using a given profile, the environment variable PROFILE is defined, so you can perform various actions such as exchanging key configuration files, starting services differently and so on.
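As one possible use of PROFILE, a companion script could pick a different configuration file per profile. The sketch below assumes the default profile names prod and backup; the helper profile_conf and the /etc/myservice paths are invented for the example.

```shell
# Hypothetical helper: choose a configuration file based on PROFILE.
# The /etc/myservice paths are made up for illustration.
profile_conf() {
    case "${PROFILE:-}" in
        prod)   echo "/etc/myservice/prod.conf" ;;
        backup) echo "/etc/myservice/backup.conf" ;;
        *)      echo "/etc/myservice/default.conf" ;;   # PROFILE unset or unknown
    esac
}

PROFILE=backup
profile_conf
```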
Linuxconf users may want to enable the switchprofile pseudo service (recently made available) to switch between different sets of configuration files.
When you enter the IP addresses of the vserver, you may specify the netmask associated with this address. The syntax is
IPV4ROOT=[device:]ip/mask ...
The /mask is new. If you omit the mask, the mask of the device on which the IP alias will be set is used.
Debian users have complained for some time. Now is your chance. Patches for distrib-info are welcome.
A man page has been written to explain what it does.
Previously, all those services were bound to eth0 only. Now by default (unless overridden in /etc/vservices/xxx.conf) the services are bound to 127.0.0.1 and eth0.
This solves for one the problem with ssh X11 port forwarding since ssh assumes connections are done on localhost (127.0.0.1) and not eth0.
This utility reads one vserver configuration file and prints the relevant variables, one per line, without any comment. This utility should be used by any script operating on a vserver. C++ applications should use the vutil_readconf() function, which uses printconf.sh.
The utility is generally used like this in the various scripts:
eval `/usr/lib/vserver/printconf.sh --quote vserver-name`
The --quote option puts double quotes around the values to make the output usable by scripts. C++ applications do not use it. They simply assume that everything past the equal sign is the value, up to the end of the line.
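To see why the quotes matter for eval, the sketch below uses a stand-in function in place of printconf.sh. The function fake_printconf and the values it prints are assumptions; the exact variables printed by the real utility may differ.

```shell
# Stand-in for "printconf.sh --quote NAME"; the values are made up.
fake_printconf() {
    printf '%s\n' 'IPROOT="192.0.2.10 192.0.2.11"' 'S_DOMAINNAME="none"'
}

# With the quotes, the whole value is assigned. Without them, the shell would
# try to run "192.0.2.11" as a command with IPROOT set only in its environment.
eval "$(fake_printconf)"
echo "IPROOT=$IPROOT"
echo "S_DOMAINNAME=$S_DOMAINNAME"
```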
All utilities in the vserver package are now complying with this strategy.
This tells the vserver utility to generate a new /etc/mtab in a vserver every time it is entered or started. This is done after the pre-start step of the vserver companion script (/etc/vservers/*.sh).
Originally, the file /etc/mtab was produced only at vserver creation time and this was fine for most cases. Sometimes a vserver is mapped over multiple volumes and /etc/mtab must be adjusted. Now this is done auto-magically.
Every mount visible in the vserver is included. Dummy devices (/dev/hdvN) are generated on the fly. Network mounts are shown as is (the origin of the mount).
This feature may be turned off by entering
GENERATEMTAB=no
in the vserver configuration file.
This controls the base directory used to set up vservers. This variable is normally written in /etc/vservers.conf and shared by several vservers. The actual root of the vserver is $VSERVERS_ROOT/name.
A vserver may override this variable. The newvserver utility will write a VSERVERS_ROOT= line in the vserver configuration file if a different value was selected (compared to the one in /etc/vservers.conf).
While a vserver may redefine VSERVERS_ROOT, it is not that convenient. A vserver may simply define VSERVERDIR to point wherever it fits. This variable is optional, but the printconf.sh utility makes sure it is defined: if a vserver configuration file does not define VSERVERDIR, then it is set to $VSERVERS_ROOT/name.
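In shell terms, the defaulting rule amounts to a single parameter expansion. The helper resolve_vserverdir below is hypothetical and only illustrates the rule; it is not taken from printconf.sh.

```shell
# Hypothetical helper: VSERVERDIR wins if set, else $VSERVERS_ROOT/name.
resolve_vserverdir() {
    name=$1
    echo "${VSERVERDIR:-$VSERVERS_ROOT/$name}"
}

VSERVERS_ROOT=/vservers
unset VSERVERDIR
resolve_vserverdir web1          # falls back to $VSERVERS_ROOT/web1

VSERVERDIR=/mnt/big/web1         # example override, path is made up
resolve_vserverdir web1
```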
Applications dealing with vserver files should rely on VSERVERDIR (and use printconf.sh).
newvserver uses /etc/vservers.conf to extract the default value for VSERVERS_ROOT. It also checks /etc/vservers/newvserver.defaults to extract the value of the newvroot variable, if available (so /etc/vservers/newvserver.defaults overrides /etc/vservers.conf).
The --vroot command line option was also added to setup the default value.
# Description: Some vserver
source /etc/vservers.conf
...
Using this strategy, sites are free to implement whatever logic they want to manage vservers. For example, sites may decide to move the S_CAPS or S_FLAGS to /etc/vservers.conf to minimize repetition in vservers.
All the utilities have been modified to obey these rules. Utilities, for example, must source a vserver's configuration file to learn its VSERVERS_ROOT directory (more on this in this change log).
You do not have to change anything to use vserver 0.29. The printconf.sh utility sources /etc/vservers.conf before sourcing the vserver configuration file. But newvserver and "vserver build" produce configuration files with the proper source command at the top.
It turns out to be more practical to bind services to 127.0.0.1 and eth0. X11 forwarding in sshd is working better like this. So now, all v_xxx services are bound to 127.0.0.1 and eth0, unless overridden by the /etc/vservices/xxx.conf
I am preparing a full presentation of this. For now, this is just to explain why the vserver command has the new sub-commands "assemble" and "remove".
A synthetic vserver is a vserver created out of another vserver by merging together resource files (packages), data and configurations. Once a vserver is assembled, you may use it as usual. Once you are done with it, you can remove it.
The interest of this strategy is that you usually end up with many vservers for different projects, all based on the same package set. They are often cloned from the same reference vserver. As the number of vservers grows, you end up with more and more admin tasks duplicated among all those vservers.
With synthetic vservers, you always have a clear separation between the admin tasks touching reference vservers and the configuration tasks associated with specific projects.
Anyway, as I said before, a full paper will be written shortly. To conclude, a synthetic vserver reduces the amount of admin tasks needed and, at the same time, provides an exact description of a project.
To enable this, just enter
in the configuration file. Note that this also works even if the root server (the workstation or notebook) is also using DHCP to get an IP address. If this is the case, you must edit the file /etc/dhclient.conf (generally missing) and place the following line.
send dhcp-client-identifier "root server";
This allows the DHCP server to tell apart the requests coming from vservers from the requests coming from the root server, since they are all sharing the same MAC address.