VPS Management


Common Problems

Login to any machine without a password

This is possible via the use of ssh keys. The process is thus:

1. Place the public key for your user (root@mail) in the /root/.ssh/authorized_keys file on the server you wish to log in to:

cat /root/.ssh/id_dsa.pub

(paste that into authorized_keys on the target server). If the file doesn't exist, create it.
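If you want to do the whole thing in one step, something like this should also work (a sketch; it assumes the target still allows root password logins and that your key is in id_dsa.pub):

cat /root/.ssh/id_dsa.pub | ssh root@<target> 'mkdir -p /root/.ssh && cat >> /root/.ssh/authorized_keys'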

2. Enable root login (usually only applies to FreeBSD). Edit /etc/ssh/sshd_config on the target server and change #PermitRootLogin no to PermitRootLogin yes
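One way to make (and verify) that change without opening an editor, as a sketch - double-check the result, since this assumes the line reads exactly "#PermitRootLogin no":

sed -i.bak 's/^#*PermitRootLogin no/PermitRootLogin yes/' /etc/ssh/sshd_config
grep PermitRootLogin /etc/ssh/sshd_config

(The same approach, reversed, works when you disable root logins again afterwards.)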

3. Restart the sshd on the target machine. First, find the sshd process:

jailps <hostname> | grep sshd 

or

vp <VEID> | grep sshd

Look for the process resembling:

root     17296  0.0  0.0  5280 1036 ?        Ss    2011   4:27 /usr/sbin/sshd 

(this is the sshd)

Not:

root      6270  0.5  0.0  6808 2536 ?        Ss   14:33   0:00 sshd: root [priv]

(this is an sshd child- someone already ssh'd in as root)

Restart the sshd:

kill -1 <PID>

Ex:

kill -1 17296

You may now ssh in.

Once you're done, IF you enabled root login, you should repeat steps 2 and 3 to disable root logins.

Letting someone in who has locked themselves out (killed sshd, lost pwd)

There are two ways people frequently lock themselves out - either they forget a password, or they kill off sshd somehow.

These are actually both fairly easy to solve. First, let's say someone kills off their sshd, or somehow mangles /etc/ssh/sshd_config such that it no longer lets them in.

Their email may be very short, or it may have all sorts of details about how you should fix sshd_config to let them in ... just ignore all of this. They can fix their own mangled sshd. Fixing this is very simple. First, edit the /etc/inetd.conf on their system and uncomment the telnet line:

telnet stream  tcp     nowait  root    /usr/libexec/telnetd    telnetd
#telnet stream  tcp6    nowait  root    /usr/libexec/telnetd    telnetd

(just leave the tcp6 version of telnet commented)

Then, use jailps to list the processes on their system, and find their inetd process. Then simply:

kill -HUP (pid)

where (pid) is the PID of their inetd process. Now they have telnet running on their system and they can log in and do whatever they need to do.

The only complications that could occur are:

a) their firewall config on our firewall has port 23 blocked, in which case you will need to open that - will be covered in a different lesson.

b) they are not running inetd, so you can't HUP it. If this happens, edit their /etc/rc.conf, add the inetd_enable="YES" line, and then kill their jail with /tmp/jailkill.pl - then restart their jail with the jail line from their quad/safe file. Easy.
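A rough sketch of that sequence, reusing the jail from the quad2 example later in this article (the jailkill.pl argument here is an assumption - check what the script actually expects):

echo 'inetd_enable="YES"' >> /mnt/data2/69.55.228.7-col00820-DIR/etc/rc.conf
/tmp/jailkill.pl mail1.phimail.com
jail /mnt/data2/69.55.228.7-col00820-DIR mail1.phimail.com 69.55.228.7 /bin/sh /etc/rc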

If they have forgotten a password,

On 6.x+ you can reset their password with:

jexec <jailID from jls> passwd root

Note: the default password for 6.x jails is 8ico2987, for 4.x it is p455agfa

On 4.x, you need to cd to their etc directory ... for instance:

cd /mnt/data2/198.78.65.136-col00261-DIR/etc

and run:

vipw -d .

Then paste in these two lines (there's a paste with these):

root:$1$krszPxhk$xkCepSnz3mIikT3vCtJCt0:0:0::0:0:Charlie &:/root:/bin/csh
user:$1$Mx9p5Npk$QdMU6c8YQqp2FW2M3irEh/:1001:1001::0:0:User &:/home/user:/bin/sh

overwriting the lines they already have for "user" and "root" - then just tell them that both user and root have been reset to the default password of p455agfa.

For Linux, just run passwd inside the shell, or:

vzctl set <veid> --userpasswd root:p455agfa --save

Starting in 2009 we began assigning randomized default passwords for FreeBSD and Linux. That password is stored with each system in Mgmt. If a reset is needed, look up that stored password, reset the account to it, and refer the customer to the password in their welcome email - this way we don't have to send the password again via email (in clear text).


sendmail can't be contacted from external IP (only locally)

By default Red Hat puts this line in sendmail.mc:

DAEMON_OPTIONS(`Port=smtp,Addr=127.0.0.1, Name=MTA')

which makes it only answer on localhost. Comment it out like:

dnl DAEMON_OPTIONS(`Port=smtp,Addr=127.0.0.1, Name=MTA')

and then rebuild sendmail.cf with:

m4 /etc/mail/sendmail.mc > /etc/sendmail.cf
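The rebuilt .cf only takes effect once sendmail is restarted inside the VPS, e.g. on Red Hat:

service sendmail restart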

virt doesn’t properly let go of ve’s ip(s) when moved to another system

On virtuozzo 2.6 systems, it's been observed that when moving IPs from one virt to another, sometimes the routing table will not get updated to reflect the removal of the IP addresses.

A recent example was a customer that was moving to a new VE on a new virt, and the IP addresses were traded between the two VEs. After the trade the two systems were not able to talk to each other. When looking at the routing table for the old system, all the IP addresses were still in the routing table as being local, like so:

netstat -rn | grep 69.55.225.149
69.55.225.149   0.0.0.0         255.255.255.255 UH       40 0          0 venet0

This was preventing traffic to the other system from being routed properly. The solution is to manually delete the route:

route delete 69.55.225.149 gw 0.0.0.0

Supposedly, this was fixed in 2.6.1

sshd on FreeBSD 6.2 segfaults

First try to reinstall ssh

cd /usr/src/secure
cd lib/libssh
make depend && make all install
cd ../../usr.sbin/sshd
make depend && make all install
cd ../../usr.bin/ssh
make depend && make all install

Failing that, find the library that’s messed up:

ldd /usr/sbin/sshd
         libssh.so.3 => /usr/lib/libssh.so.3 (0x280a3000) 
         libutil.so.5 => /lib/libutil.so.5 (0x280d8000) 
         libz.so.3 => /lib/libz.so.3 (0x280e4000) 
         libwrap.so.4 => /usr/lib/libwrap.so.4 (0x280f5000) 
         libpam.so.3 => /usr/lib/libpam.so.3 (0x280fc000) 
         libbsm.so.1 => /usr/lib/libbsm.so.1 (0x28103000) 
         libgssapi.so.8 => /usr/lib/libgssapi.so.8 (0x28112000) 
         libkrb5.so.8 => /usr/lib/libkrb5.so.8 (0x28120000) 
         libasn1.so.8 => /usr/lib/libasn1.so.8 (0x28154000) 
         libcom_err.so.3 => /usr/lib/libcom_err.so.3 (0x28175000) 
         libroken.so.8 => /usr/lib/libroken.so.8 (0x28177000) 
         libcrypto.so.4 => /lib/libcrypto.so.4 (0x28183000) 
         libcrypt.so.3 => /lib/libcrypt.so.3 (0x28276000) 
         libc.so.6 => /lib/libc.so.6 (0x2828e000) 
         libmd.so.3 => /lib/libmd.so.3 (0x28373000)

md5 them and compare against the copies on other jail hosts, or in jails running on this host.
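For example, to compare a suspect library against another copy (the jail path here is just an illustration):

md5 /usr/lib/libssh.so.3
md5 /mnt/data1/69.55.230.46-col01213-DIR/usr/lib/libssh.so.3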

for libcrypto reinstall:

cd /usr/src/crypto
make depend && make all install

FreeBSD VPS

Starting jails: Quad/Safe Files

FreeBSD customer systems do not start up automatically at boot time. When one of our FreeBSD machines boots up, it boots up and does nothing else. To start jails, we put the commands to start each jail into a shell script(s) and run the script(s). Jail startup is something that needs to be actively monitored, which is why we don't just run the script automatically. More on monitoring later.

NOTE: on >=7.x we have moved to 1 quad file: quad1. Startups are not done by running each quad, but rather with startalljails, which relies on the contents of quad1. The specifics of this are lower in this article. What follows here applies to pre-7.x systems.

There are eight files in /usr/local/jail/rc.d:

jail3# ls /usr/local/jail/rc.d/
quad1   quad2   quad3   quad4   safe1   safe2   safe3   safe4
jail3#

four quad files and four safe files.

Each file contains a roughly equal number of system startup blocks (the total number of jails divided by 4).

The reason for this is that if we make one large script to start up all the systems at boot time, it will take too long - the first system in the script will start up right after system boot, which is great, but the last system may not start for another 20 minutes.

Since there is no way to parallelize this during the startup procedure, we simply open four terminals (in screen window 9) and run each script, one in each terminal. This way they all run simultaneously, and the very last system in each startup script gets started in 1/4th the time it would take if there were one large file.

The files are generally organized so that quad/safe 1&2 have only jails from disk 1, and quad/safe 3&4 have jails from disk 2. This helps ensure that only 2 fscks on any disk are going on at once. Further, they are balanced so that all quad/safe’s finish executing around the same time. We do this by making sure each quad/safe has a similar number of jails and represents a similar number of inodes (see js).

The other, very important reason we do it this way, and this is the reason there are quad files and safe files, is that in the event of a system crash, every single vn-backed filesystem that was mounted at the time of system crash needs to be fsck'd. However, fsck'ing takes time, so if we shut the system down gracefully, we don't want to fsck.

Therefore, we have two sets of scripts - the four quad scripts are identical to the four safe scripts except for the fact that the quad scripts contain fsck commands for each filesystem.

So, if you shut a system down gracefully, start four terminals and run safe1 in window one, and safe2 in window 2, and so on.

If you crash, start four terminals (or go to screen window 9) and run quad1 in window one, and quad2 in window 2, and so on.

Here is a snip of (a 4.x version) quad2 from jail17:

vnconfig /dev/vn16 /mnt/data2/69.55.228.7-col00820
fsck -y /dev/vn16
mount /dev/vn16c /mnt/data2/69.55.228.7-col00820-DIR
chmod 0666 /mnt/data2/69.55.228.7-col00820-DIR/dev/null
jail /mnt/data2/69.55.228.7-col00820-DIR mail1.phimail.com 69.55.228.7 /bin/sh /etc/rc

# moved to data2 col00368
#vnconfig /dev/vn28 /mnt/data2/69.55.236.132-col00368
#fsck -y /dev/vn28
#mount /dev/vn28c /mnt/data2/69.55.236.132-col00368-DIR
#chmod 0666 /mnt/data2/69.55.236.132-col00368-DIR/dev/null
#jail /mnt/data2/69.55.236.132-col00368-DIR limehouse.org 69.55.236.132 /bin/sh /etc/rc

echo '### NOTE ### ^C @ Local package initialization: pgsqlmesg: /dev/ttyp1: Operation not permitted'
vnconfig /dev/vn22 /mnt/data2/69.55.228.13-col01063
fsck -y /dev/vn22
mount /dev/vn22c /mnt/data2/69.55.228.13-col01063-DIR
chmod 0666 /mnt/data2/69.55.228.13-col01063-DIR/dev/null
jail /mnt/data2/69.55.228.13-col01063-DIR www.widestream.com.au 69.55.228.13 /bin/sh /etc/rc

# cancelled col00106
#vnconfig /dev/vn15 /mnt/data2/69.55.238.5-col00106
#fsck -y /dev/vn15
#mount /dev/vn15c /mnt/data2/69.55.238.5-col00106-DIR
#chmod 0666 /mnt/data2/69.55.238.5-col00106-DIR/dev/null
#jail /mnt/data2/69.55.238.5-col00106-DIR mail.azebu.net 69.55.238.5 /bin/sh /etc/rc

As you can see, two of the systems specified are commented out - presumably those customers cancelled, or were moved to new servers.

Note that the vnconfig line is the short form of the command, not the longer one that was used when the filesystem was first configured. All that is done is: vnconfig the filesystem, fsck it, then mount it. The fourth command is the `jail` command used to start the system - that will be covered later.

Here is the safe2 file from jail17:

vnconfig /dev/vn16 /mnt/data2/69.55.228.7-col00820
mount /dev/vn16c /mnt/data2/69.55.228.7-col00820-DIR
chmod 0666 /mnt/data2/69.55.228.7-col00820-DIR/dev/null
jail /mnt/data2/69.55.228.7-col00820-DIR mail1.phimail.com 69.55.228.7 /bin/sh /etc/rc

# moved to data2 col00368
#vnconfig /dev/vn28 /mnt/data2/69.55.236.132-col00368
#mount /dev/vn28c /mnt/data2/69.55.236.132-col00368-DIR
#chmod 0666 /mnt/data2/69.55.236.132-col00368-DIR/dev/null
#jail /mnt/data2/69.55.236.132-col00368-DIR limehouse.org 69.55.236.132 /bin/sh /etc/rc

echo '### NOTE ### ^C @ Local package initialization: pgsqlmesg: /dev/ttyp1: Operation not permitted'
vnconfig /dev/vn22 /mnt/data2/69.55.228.13-col01063
mount /dev/vn22c /mnt/data2/69.55.228.13-col01063-DIR
chmod 0666 /mnt/data2/69.55.228.13-col01063-DIR/dev/null
jail /mnt/data2/69.55.228.13-col01063-DIR www.widestream.com.au 69.55.228.13 /bin/sh /etc/rc

# cancelled col00106
#vnconfig /dev/vn15 /mnt/data2/69.55.238.5-col00106
#mount /dev/vn15c /mnt/data2/69.55.238.5-col00106-DIR
#chmod 0666 /mnt/data2/69.55.238.5-col00106-DIR/dev/null
#jail /mnt/data2/69.55.238.5-col00106-DIR mail.azebu.net 69.55.238.5 /bin/sh /etc/rc

As you can see, it is exactly the same, but it does not have the fsck lines.

Take a look at the last entry - note that the file is named:

/mnt/data2/69.55.238.5-col00106

and the mount point is named:

/mnt/data2/69.55.238.5-col00106-DIR

This is the general format on all the FreeBSD systems. The file is always named:

IP-custnumber

and the directory is named:

IP-custnumber-DIR

If you run a safe file when a fsck is needed, the mount will fail and the jail command will fail:

# mount /dev/vn1c /mnt/data2/jails/65.248.2.131-ns1.kozubik.com-DIR
mount: /dev/vn1c: Operation not permitted

No reboot is needed; just run the quad script.

Starting with 6.x jails, we added block delimiters to the quad/safe files; a block looks like:

echo '## begin ##: nuie.solaris.mu'
fsck -y /dev/concat/v30v31a
mount /dev/concat/v30v31a /mnt/data1/69.55.228.218-col01441-DIR
mount_devfs devfs /mnt/data1/69.55.228.218-col01441-DIR/dev
devfs -m /mnt/data1/69.55.228.218-col01441-DIR/dev rule -s 3 applyset
jail /mnt/data1/69.55.228.218-col01441-DIR nuie.solaris.mu 69.55.228.218 /bin/sh /etc/rc
echo '## end ##: nuie.solaris.mu'

These are more than just informative: when running quad/safes, the echo lines MUST be present for certain tools to work properly. So it's important that any update to the hostname is also made on the 2 echo lines. For example, if you try to startjail a jail whose hostname is on the jail line but not on the echo lines, the command will return host not found.
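One way to update all occurrences at once (a sketch - the hostnames are placeholders, and pick the quad file the jail actually lives in; it's worth grepping the result before starting the jail):

sed -i.bak 's/old.hostname.com/new.hostname.com/g' /usr/local/jail/rc.d/quad1
grep new.hostname.com /usr/local/jail/rc.d/quad1

On pre-7.x systems, remember the safe files as well (or re-run buildsafe).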

FreeBSD 7.x+ notes

Starting with the release of FreeBSD 7.x, we are doing jail startups in a slightly different way. First, there is only 1 file: /usr/local/jail/rc.d/quad1. There are no other quads or corresponding safe files. The reason for this is twofold: 1. we can pass -C to fsck, which tells it to skip the fsck if the filesystem is clean (no more need for safe files); 2. we have a new startup script which can be launched multiple times, running in parallel to start jails, where quad1 is the master jail file. quad1 could still be run as a shell script, but it would take a very long time to run completely, so that's not advisable; if you must, break it down into smaller chunks (like quad1, quad2, quad3, etc.).

Here is a snip of (a 7.x version) quad1 from jail2:

echo '## begin ##: projects.tw.com'
mdconfig -a -t vnode -f /mnt/data1/69.55.230.46-col01213 -u 50
fsck -Cy /dev/md50c
mount /dev/md50c /mnt/data1/69.55.230.46-col01213-DIR
mount -t devfs devfs /mnt/data1/69.55.230.46-col01213-DIR/dev
devfs -m /mnt/data1/69.55.230.46-col01213-DIR/dev rule -s 3 applyset
jail /mnt/data1/69.55.230.46-col01213-DIR projects.tw.com 69.55.230.46 /bin/sh /etc/rc
echo '## end ##: projects.tw.com'

Cancelled jails are no longer commented out and stored in quad1; rather, they're moved to /usr/local/jail/rc.d/deprecated.

To start these jails, open the 4 ssh sessions as you would for a normal crash, but instead of running quad1-4, run startalljails in each window. IMPORTANT: before running startalljails, make sure you have run preboot once, as it will clear out all the lockfiles and allow startalljails to work properly.

Problems with the quad/safe files

When you run the quad/safe files, there are two problems that can occur - either a particular system will hang during initialization, OR a system will spit out output to the screen, impeding your ability to do anything. Or both.

First off, when you start a jail, you see output like this:

Skipping disk checks ...
adjkerntz[25285]: sysctl(put_wallclock): Operation not permitted
Doing initial network setup:.
ifconfig: ioctl (SIOCDIFADDR): permission denied
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
Additional routing options: TCP keepalive=YESsysctl:
net.inet.tcp.always_keepalive: Operation not permitted.
Routing daemons:.
Additional daemons: syslogd.
Doing additional network setup:.
Starting final network daemons:.
ELF ldconfig path: /usr/lib /usr/lib/compat /usr/X11R6/lib /usr/local/lib
a.out ldconfig path: /usr/lib/aout /usr/lib/compat/aout /usr/X11R6/lib/aout
Starting standard daemons: inetd cron sshd sendmail sendmail-clientmqueue.
Initial rc.i386 initialization:.
Configuring syscons: blanktime.
Additional ABI support:.
Local package initialization:.
Additional TCP options:.

Now, let's look at this line, near the end:

Local package initialization:.

This is where a list of daemons that are set to start at boot time will show up. You might see something like:

Local package initialization: mysqld apache sendmail sendmail-clientmqueue

Or something like this:

Local package initialization: postgres postfix apache

The problem is that many systems (about 4-5 per machine) will hang on that line. Basically it will get partway through the list of daemons to be started:

Local package initialization: mysqld apache

and will just sit there. Forever.

Fortunately, pressing ctrl-c will break out of it. Not only will it break out of it, but it will also continue on that same line and start the other daemons:

Local package initialization: mysqld apache ^c sendmail-clientmqueue

and then continue on to finish the startup, and then move to the next system to be started.

So what does this mean? It means that if a machine crashes, and you start four screen windows to run four quads or four safes, you need to periodically cycle between them and see if any systems are stuck at that point, causing their quad/safe file to hang. A good rule of thumb is: if you see a system at that point in the startup, give it another 100 seconds - if it is still at the exact same spot, hit ctrl-c. It's also a good idea to go back into the quad file (just before the first command in the jail startup block) and note that this jail tends to need a ctrl-c or more time, as follows:

echo '### NOTE ### slow sendmail'
echo '### NOTE ###: ^C @ Starting sendmail.'

NEVER hit ctrl-c repeatedly if you don't get an immediate response - that will cause the following jail’s startup commands to be aborted.

A second problem that can occur is that a jail - maybe the first one in that particular quad/safe, maybe the last one, or maybe one in the middle, will start spitting out status or error messages from one of its init scripts. This is not a problem - basically, hit enter a few times and see if you get a prompt - if you do get a prompt, that means that the quad/safe script has already completed. Therefore it is safe to log out (and log out of the user that you su'd from) and then log back in (if necessary).

The tricky thing is, if a system in the middle starts flooding with messages, and you hit enter a few times and don't get a prompt. Are you not getting a prompt because some subsequent system is hanging at initialization, as we discussed above? Or are you not getting a prompt because that quad file is currently running an fsck? Usually you can tell by scrolling back in screen's history to see what it was doing before you started getting the messages.

If you don't get clues from the history, you have to use your judgement - instead of giving it 100 seconds to respond, perhaps give it 2-3 mins ... if you still get no response (no prompt) when you hit enter, hit ctrl-c. However, be aware that you might still be hitting ctrl-c in the middle of an fsck. This means you will get an error like "filesystem still marked dirty", the vnconfig for it will fail and so will the jail command, and the next system in the quad file will then begin starting up.

If this happens, just wait until all the quad files have finished, and start that system manually.

If things really get weird, like a screen flooded with errors, and you can't get a prompt, and ctrl-c does nothing, then eventually (give it ten mins or so) just kill that window with ctrl-p, then k, log in again, manually check which systems are now running and which aren't, and manually start up any that are not.

Don't EVER risk running a particular quad/safe file a second time. If the quad/safe script gets executed twice, reboot the machine immediately.

So, for all the above reasons, anytime a machine crashes and you run all the quads or all the safes, always check every jail afterwards to make sure it is running - even if you have no hangs or complications at all. Run this command:

jailpsall

Note: postboot also populates ipfw counts, so it should not be run multiple times; use jailpsall for subsequent extensive ps'ing.

And make sure they all show as running. If one does not show as running, check its /etc/rc.conf file first to see if it is using a different hostname before starting it manually.

One thing we have implemented to alleviate these startup hangs and noisy jails is to put jail start blocks that are slow or prone to hanging at the bottom of the safe/quad file. Further, for each problem jail we add a note in each quad/safe just before the start block, something like:

echo '### NOTE ### ^C @ Local package initialization: pgsqlmesg: /dev/ttyp1: Operation not permitted'

That way we'll be prepared to ^C when we see that message appear during the quad/safe startup process. If you observe a new, undocumented hang: after the quad/safe has finished, place a line similar to the above in the quad file, move the jail start block to the end of the file, then run buildsafe.


Recovering from a crash (FreeBSD)

Diagnose whether you have a crash

The most important thing is to get the machine and all jails back up as soon as possible. Note the time; you'll need to create a crash log entry (Mgmt. -> Reference -> CrashLog). The first thing to do is head over to the serial console screen and see if there are any kernel error messages. Try to copy any messages (or just a sample of repeating messages) you see into the notes section of the crash log. If there are no messages, the machine may just be really busy - wait a bit (5-10 min) to see if it comes back. If it's still pinging, odds are it's very busy. Note, if you see messages about swap space exhausted, the server is obviously out of memory; however, it may recover briefly enough for you to get a jtop in to see who's launched a ton of procs (most likely) and then issue a quick jailkill to get it back under control.

If it doesn't come back, or the messages indicate a fatal error, you will need to proceed with a power cycle (ctrl+alt+del will not work).

Power cycle the server

If this machine is not a Dell 2950 with a DRAC card - i.e. if you can't ssh into the DRAC card (as root, using the standard root pass) and issue

racadm serveraction hardreset

then you will need someone at the data center to power the machine off, wait 30 sec, then turn it back on. Make sure to re-attach via console:

tip jailX

immediately after power down.

(Re)attach to the console

Stay on the console the entire time during boot. As the BIOS posts- look out for the RAID card output- does everything look healthy? The output may be scrambled, look for "DEGRADED" or "FAILED". Once the OS starts booting you will be disconnected (dropped back to the shell on the console server) a couple times during the boot up. The reason you want to quickly re-attach is two-fold: 1. If you don’t reattach quickly then you won’t get any console output, 2. you want to be attached before the server potentially starts (an extensive) fsck. If you attach after the fsck begins, you’ll have seen no indication it started an fsck and the server will appear frozen during startup- no output, no response.

IMPORTANT NOTE: on some older FreeBSD systems, there will be no output to the video (KVM) console as the machine boots up. The console output is redirected to the serial port ... so if a jail host crashes and you attach a KVM, the output during the bootup procedure will not be shown on the screen. However, when the bootup is done, you will get a login prompt on the screen and will be able to log in as normal. /boot/loader.conf is where the serial console redirect lives, so comment that out if you want to catch output on the KVM. On newer systems most output is sent to both locations.
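The line in /boot/loader.conf that does the redirect typically looks like the following; comment it out if you want boot output on the KVM instead:

console="comconsole"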

Assess the health of the server

Once the server boots up fully, you should be able to ssh in. Look around- make sure all the mounts are there and reporting the correct size/usage (i.e. /mnt/data1 /mnt/data2 /mnt/data3 - look in /etc/fstab to determine which mount points should be there), check to see if RAID mirrors are healthy. See megacli, aaccheck
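A few quick checks along those lines (which RAID check script applies depends on the controller; see the tools section later in this article):

df -h
grep /mnt /etc/fstab
aaccheck.sh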

Before you start the jails, you need to run preboot. This will do some assurance checks to make sure things are prepped to start the jails. Any issues that come out of preboot need to be addressed before starting jails.

Start jails

More on starting jails: Customer jails (the VPSs) do not start up automatically at boot time. When a FreeBSD machine boots up, it boots up and does nothing else. To start jails, we put the commands to start each jail into a shell script(s) and run the script(s). Jail startup is something that needs to be actively monitored, which is why we don't just run the script automatically.

In order to start jails, we run the quad files: quad1 quad2 quad3 and quad4 (on new systems there is only quad1). If the machine was cleanly rebooted (which wouldn't be the case if this was a crash), you may run the safe files (safe1 safe2 safe3 safe4) in lieu of the quads.

Open up 4 logins to the server (use the windows in a9). In each of the 4 windows you will do one of the following:

If there is a startalljails script (and only quad1), run that command in each of the 4 windows. It will parse through the quad1 file and start each jail. Follow the instructions here for monitoring startup. Note that you can be a little more lenient with jails that take a while to start - startalljails will work around the slow jails and start the rest. As long as there aren't 4 jails which are "hung" during startup, the rest will get started eventually.

-or-

If there is no startalljails script, there will be multiple quad files. In each of the 4 windows, start one of the quads, i.e. start quad1 in window 1, quad2 in window 2, and so on. DO NOT start any quad twice. It will crash the server. If you accidentally do this, just jailkill all the jails in that quad and run the quad again. Follow the instructions here for monitoring quad startup.

Note the time the last jail boots- this is what you will enter in the crash log.

Save the crash log.

Check to make sure all jails have started

There's a simple script which will make sure all jails have started and enter the ipfw counter rules: postboot. Run postboot, which will do a jailps on each jail it finds (excluding commented-out jails) in the quad file(s). We're looking for 2 things:

  1. systems spawning out of control or too many procs
  2. jails which haven't started

On 7.x and newer systems it will print out the problems (which jails haven't started) at the conclusion of postboot. On older systems you will need to watch closely to see if/when there's a problem, namely:

[hostname] doesnt exist on this server

When you get this message, it means one of 2 things: 1. the jail really didn't start: When a jail doesn't start it usually boils down to a problem in the quad file. Perhaps the path name is wrong (data1 vs data2) or the name of the vn/mdfile is wrong. Once this is corrected, you will need to run the commands from the quad file manually, or you may use startjail <hostname>

2. the customer has changed their hostname (and not told us), so their jail is running, just under a different hostname: On systems with jls, this is easy to rectify. First, get the customer info: g <hostname> Then look for the customer in jls: jls | grep <col0XXXX> From there you will see their new hostname - you should update that hostname in the quad file: don't forget to edit it on the ## begin ## and ## end ## lines, and in mgmt. On older systems without jls, this will be harder; you will need to look further to find their hostname - perhaps it's in their /etc/rc.conf.


Once all jails are started, do some spot checks- try to ssh or browse to some customers, just to make sure things are really ok.

Adding disk to a 7.x/8.x jail

or

Moving customer to a different drive (md)

NOTE: this doesn't apply to mx2, which uses gvinum - use the same procedure as 6.x.
NOTE: if you unmount before mdconfig, you'll need to re-mdconfig (attach), then unmount, then mdconfig -u again.


(parts to change/customize are red)


If someone wants more disk space, there’s a paste for it, send it to them (explains about downtime, etc).

1. Figure out the space avail from js. Ideally, you want to put the customer's new space on a different partition (and create the new md on the new partition).


2. make a mental note of how much space they're currently using


3. jailkill <hostname>


4. Umount it (including their devfs) but leave the md config’d (so if you use stopjail, you will have to re-mdconfig it)


5. g <customerID> to get the info (IP/cust#) needed to feed the new mdfile and mount name, and to see the current md device.


6a. When there's enough room to place the new system on an alternate, or the same, drive: USE CAUTION not to overwrite (touch, mdconfig) an existing md!!
touch /mnt/data3/69.55.234.66-col01334
mdconfig -a -t vnode -s 10g -f /mnt/data3/69.55.234.66-col01334 -u 97
newfs /dev/md97


Optional- if new space is on a different drive, move the mount point directory AND use that directory in the mount and cd commands below:
mv /mnt/data1/69.55.234.66-col01334-DIR /mnt/data3/69.55.234.66-col01334-DIR

mount /dev/md97 /mnt/data3/69.55.234.66-col01334-DIR
cd /mnt/data3/69.55.234.66-col01334-DIR


confirm you are mounted to /dev/md97 and space is correct:
df .


do the dump and pipe directly to restore:
dump -0a -f - /dev/md1 | restore -r -f -
rm restoresymtable


when dump/restore completes successfully, use df to confirm the restored data size matches the original usage figure


md-unconfig old system:
mdconfig -d -u 1


archive old mdfile.
mv /mnt/data1/69.55.237.26-col00241 /mnt/data1/old-col00241-mdfile-noarchive-20091211


edit quad (vq1) to point to new (/mnt/data3) location AND new md number (md97)


restart the jail:
startjail <hostname>


6b. When there's not enough room on an alternate partition, or on the same drive...but there is enough room if you were to remove the existing customer's space:


mount backup nfs mounts:
mbm
(run df to confirm backup mounts are mounted)


dump the customer to backup2 or backup1
dump -0a -f /backup4/col00241.20120329.noarchive.dump /dev/md1
(when complete WITHOUT errors, du the dump file to confirm it matches size, roughly, with usage)


unconfigure and remove old mdfile
mdconfig -d -u 1
rm /mnt/data1/69.55.237.26-col00241

(there should now be enough space to recreate your bigger system. If not, run sync a couple times)


create the new system (ok to reuse old mdfile and md#):
touch /mnt/data1/69.55.234.66-col01334
mdconfig -a -t vnode -s 10g -f /mnt/data1/69.55.234.66-col01334 -u 1
newfs /dev/md1
mount /dev/md1 /mnt/data1/69.55.234.66-col01334-DIR
cd /mnt/data1/69.55.234.66-col01334-DIR


confirm you are mounted to /dev/md1 and space is correct:
df .


do the restore from the dumpfile on the backup server:
restore -r -f /backup4/col00241.20120329.noarchive.dump .
rm restoresymtable


when dump/restore completes successfully, use df to confirm the restored data size matches the original usage figure


umount nfs:
mbu


If md# changed (or mount point), edit quad (vq1) to point to new (/mnt/data3) location AND new md number (md1)


restart the jail:
startjail <hostname>


7. update disk (and dir if applicable) in mgmt screen


8. update backup list AND move backups, if applicable

Ex: mvbackups 69.55.237.26-col00241 jail9 data3


9. Optional: archive old mdfile

mbm
gzip -c old-col01588-mdfile-noarchive-20120329 > /deprecated/old-col01588-mdfile-noarchive-20120329.gz
mbu
rm old-col01588-mdfile-noarchive-20120329

Adding disk to a 6.x jail (gvinum/gconcat)

or

Moving customer to a different drive (gvinum/gconcat)

(parts to change are highlighted)


If someone wants more disk space, there’s a paste for it, send it to them (explains about downtime, etc).


1. Figure out the space avail from js. Ideally, you want to put the customer's new space on a different partition (and create the new md on the new partition).


2. make a mental note of how much space they're currently using


3. stopjail <hostname>


4. g <customerID> to get the info (IP/cust#) needed to feed the new mount name and existing volume/device.


5a. When there's enough room to place the new system on an alternate, or the same, drive (using only UNUSED gvinum volumes - volumes currently in use, even by the system in question, don't count as unused):


Configure the new device:
A. for a 2G system (single gvinum volume):
bsdlabel -r -w /dev/gvinum/v123
newfs /dev/gvinum/v123a

-or-


B. for a >2G system (create a gconcat volume):
try to grab a contiguous block of gvinum volumes. gconcat volumes MAY NOT span drives (i.e. you cannot use a gvinum volume from data3 and a volume from data2 in the same gconcat volume).

gconcat label v82-v84 /dev/gvinum/v82 /dev/gvinum/v83 /dev/gvinum/v84
bsdlabel -r -w /dev/concat/v82-v84
newfs /dev/concat/v82-v84a


Other valid gconcat examples:
gconcat label v82-v84v109v112 /dev/gvinum/v82 /dev/gvinum/v83 /dev/gvinum/v84 /dev/gvinum/v109 /dev/gvinum/v112
gconcat label v82v83 /dev/gvinum/v82 /dev/gvinum/v83

Note, long names will truncate: v144v145v148-v115 will truncate to v144v145v148-v1 (so you will refer to it as v144v145v148-v1 thereafter)


Optional- if new volume is on a different drive, move the mount point directory (get the drive from js output) AND use that directory in the mount and cd commands below:
mv /mnt/data1/69.55.234.66-col01334-DIR /mnt/data3/69.55.234.66-col01334-DIR


confirm you are mounted to the device (/dev/gvinum/v123a OR /dev/concat/v82-v84) and space is correct:
A. mount /dev/gvinum/v123a /mnt/data3/69.55.234.66-col01334-DIR
-or-
B. mount /dev/concat/v82-v84a /mnt/data3/69.55.234.66-col01334-DIR

cd /mnt/data3/69.55.234.66-col01334-DIR
df .


do the dump and pipe directly to restore:
dump -0a -f - /dev/gvinum/v1 | restore -r -f -
rm restoresymtable


when dump/restore completes successfully, use df to confirm the restored data size matches the original usage figure


edit quad (vq1) to point to new (/mnt/data3) location AND new volume (/dev/gvinum/v123a or /dev/concat/v82-v84a), run buildsafe


restart the jail:
startjail <hostname>


5b. When there's not enough room on an alternate partition, or on the same drive...but there is enough room if you were to remove the existing customer's space (i.e. if you want/need to reuse the existing gvinum volumes and add on more):


mount backup nfs mounts:
mbm
(run df to confirm backup mounts are mounted)


dump the customer to backup2 or backup1
dump -0a -f /backup4/col00241.20120329.noarchive.dump /dev/gconcat/v106-v107
(when complete WITHOUT errors, du the dump file to confirm it matches size, roughly, with usage)


unconfigure the old gconcat volume
list member gvinum volumes:

gconcat list v106v107


Output will resemble:

Geom name: v106v107
State: UP
Status: Total=2, Online=2
Type: AUTOMATIC
ID: 3530663882
Providers:
1. Name: concat/v106v107
   Mediasize: 4294966272 (4.0G)
   Sectorsize: 512
   Mode: r1w1e2
Consumers:
1. Name: gvinum/sd/v106.p0.s0
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Mode: r1w1e3
   Start: 0
   End: 2147483136
2. Name: gvinum/sd/v107.p0.s0
   Mediasize: 2147483648 (2.0G)
   Sectorsize: 512
   Mode: r1w1e3
   Start: 2147483136
   End: 4294966272

stop volume and clear members
gconcat stop v106v107
gconcat clear gvinum/sd/v106.p0.s0 gvinum/sd/v107.p0.s0


create new device- and its ok to reuse old/former members
try to grab a contiguous block of gvinum volumes. gconcat volumes MAY NOT span drives (i.e. you cannot use a gvinum volume from data3 and a volume from data2 in the same gconcat volume).
gconcat label v82-v84v106v107 /dev/gvinum/v82 /dev/gvinum/v83 /dev/gvinum/v84 /dev/gvinum/v106 /dev/gvinum/v107
bsdlabel -r -w /dev/concat/v82-v84v106v107
newfs /dev/concat/v82-v84v106v107a


Optional- if new volume is on a different drive, move the mount point directory (get the drive from js output) AND use that directory in the mount and cd commands below:
mv /mnt/data1/69.55.234.66-col01334-DIR /mnt/data3/69.55.234.66-col01334-DIR

confirm you are mounted to the device (/dev/concat/v82-v84v106v107) and space is correct:
mount /dev/concat/v82-v84v106v107a /mnt/data3/69.55.234.66-col01334-DIR

cd /mnt/data3/69.55.234.66-col01334-DIR
df .

do the restore from the dumpfile on the backup server:
restore -r -f /backup4/col00241.20120329.noarchive.dump .
rm restoresymtable

when dump/restore completes successfully, use df to confirm the restored data size matches the original usage figure


edit quad (vq1) to point to new (/mnt/data3) location AND new volume (/dev/concat/v82-v84v106v107a), run buildsafe


restart the jail:
startjail <hostname>


TODO: clean up/clear old gvin/gconcat vol


6. update disk (and dir if applicable) in mgmt screen


7. update backup list AND move backups, if applicable


Ex: mvbackups 69.55.237.26-col00241 jail9 data3


DEPRECATED - steps to tack a new gvinum volume onto an existing gconcat - leads to a corrupted fs

bsdlabel -e /dev/concat/v82-v84

To figure out the new size of the c partition, multiply 4194304 by the # of 2G gvinum volumes and subtract the # of 2G volumes:

10G: 4194304 * 5 - 5 = 20971515
8G:  4194304 * 4 - 4 = 16777212
6G:  4194304 * 3 - 3 = 12582909
4G:  4194304 * 2 - 2 = 8388606

To figure out the new size of the a partition, subtract 16 from the c partition size:

10G: 20971515 - 16 = 20971499
8G:  16777212 - 16 = 16777196
6G:  12582909 - 16 = 12582893
4G:  8388606 - 16 = 8388590
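If you ever need one of these figures for a size that isn't listed, the arithmetic is easy to redo in the shell, e.g. for a 10G (five 2G volume) system:

echo $(( 4194304 * 5 - 5 ))        # c partition: 20971515
echo $(( 4194304 * 5 - 5 - 16 ))   # a partition: 20971499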

Orig: 8 partitions:

#        size   offset    fstype   [fsize bsize bps/cpg]
 a:  8388590       16    4.2BSD     2048 16384 28552
 c:  8388606        0    unused        0     0         # "raw" part, don't edit

New: 8 partitions:

#        size   offset    fstype   [fsize bsize bps/cpg]
 a: 12582893       16    4.2BSD     2048 16384 28552
 c: 12582909        0    unused        0     0         # "raw" part, don't edit

sync; sync

growfs /dev/concat/v82-v84a

fsck -fy /dev/concat/v82-v84a

sync

fsck -fy /dev/concat/v82-v84a

(keep running fsck’s till NO errors)

Problems un-mounting - and with mount_null’s

If you cannot unmount a filesystem because it says the filesystem is busy, it is usually because of one of the following:

a) the jail is still running

b) you are actually in that directory, even though the jail is stopped

c) there are still dev, null_mount or linprocfs mount points mounted inside that directory.

d) when trying to umount null_mounts with really long paths, you may get an error like "No such file or directory"; it's an OS bug where the dir name gets truncated. No known fix.

e) there are still files open somewhere inside the dir. Use fstat | grep <cid> to find the process that has files open

f) Starting with 6.x, the jail mechanism does a poor job of keeping track of processes running in a jail and if it thinks there are still procs running, it will refuse to umount the disk. If this is happening you should see a low number in the #REF column when you run jls. In this case you can safely umount -f the mount.

Please note - if you forcibly unmount a (4.x) filesystem that has null_mounts still mounted in it, the system will crash within 10-15 mins.

Misc jail Items

We are overselling hard drive space on jail2, jail8, jail9, a couple jails on jail17, jail4, jail12 and jail18. Even though the vn file shows a 4G size, it doesn't actually occupy that amount of space on the disk. So be careful not to fill up drives where we're overselling - use oversellcheck to confirm you're not oversold by more than 10G. There are other truncated jails; they are generally noted in the file /root/truncated on the root system.

The act of moving a truncated vn to another system undoes the truncation - the truncated vn gets filled with 0's and occupies the full physical disk space for which it's configured. So, you should use dumpremote to preserve the truncation.
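To see whether a given vnfile is actually truncated (sparse), compare its apparent size with what it really occupies on disk, e.g. (the path is just an example):

ls -lh /mnt/data1/69.55.237.26-col00241
du -h /mnt/data1/69.55.237.26-col00241

If du reports much less than ls, the file is truncated.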

FreeBSD VPS Management Tools

These files are located in /usr/local/jail/rc.d and /usr/local/jail/bin

jailmake

Applies to 7.x+. On older systems the syntax differs; run jailmake once to see it.

Note: this procedure differs on mx2 which is 7.x but still uses gvinum

  1. run js to figure out which md’s are in use, which disk has enough space, IP to put it on
  2. use col00xxx for both hostnames if they don’t give you a hostname
  3. copy over dir, ip and password to pending customer screen

Usage: jailmake IP[,IP] CID disk[1|2|3] md# hostname shorthost ipfw# email [size in GB]

Ex:

Jail2# jailmake 69.55.234.66 col01334 3 97 vps.bsd.it vps 1334 fb@bsd.it

jailps

jailps [hostname]

DEPRECATED FOR jps: displays processes belonging to/running inside a jail. The command takes one (optional) argument – the hostname of the jail you wish to query. If you don’t supply an argument, all processes on the machine are listed and grouped by jail.

jps

jps [hostname]

displays processes belonging to/running inside a jail. The command takes one (optional) argument – the hostname or ID of the jail you wish to query.

jailkill

jailkill <hostname>

stops all processes running in a jail.

You can also run:

jailkill <JID>

problems

Occasionally you will hit an issue where a jail will not kill off:

jail9# jailkill www.domain.com
www.domain.com .. killed: none
jail9#

Because no processes are running under that hostname. You cannot use jailps.pl either:

jail9# jailps www.domain.com
www.domain.com doesn’t exist on this server
jail9#

The reasons for this are usually:

  • the jail is no longer running
  • the jail's hostname has changed

In this case,

>=6.x: run a jls|grep <jail's IP> to find the correct hostname, then update the quad file, then kill the jail.

<6.x: the first step is to cat their /etc/rc.conf file to see if you can tell what they set the new hostname to. This very often works. For example:

cat /mnt/data2/198.78.65.136-col00261-DIR/etc/rc.conf

But maybe they set the hostname with the hostname command, and the original hostname is still in /etc/rc.conf.

The welcome email clearly states that they should tell us if they change their hostname, so there is no problem in just emailing them and asking them what they set the new hostname to.

Once you know the new hostname OR if a customer simply emails to inform you that they have set the hostname to something different, you need to edit the quad and safe files that their system is in to input the new hostname.

However, if push comes to shove and you cannot find out the hostname from them or from their system, then you need to start doing some detective work.

The easiest thing to do is run jailps looking for a hostname similar to their original hostname. Or you could get into the /bin/sh shell by running:

/bin/sh

and then looking at every hostname of every process:

for f in `ls /proc` ; do cat /proc/$f/status ; done

and scanning for a hostname that is either similar to their original hostname, or that you don't see in any of the quad/safe files.

This is very brute force though, and it is possible that catting every file in /proc is dangerous - I don't recommend it. A better thing would be to identify any processes that you know belong to this system – perhaps the reason you are trying to find this system is because they are running something bad - and just catting the status from only that PID.

Somewhere there's a jail host where there may be 2 systems named www. Look at each /etc/rc.conf and make sure they're both really www. If they are, jailkill www, then jailps www to make sure it's not running. Then immediately restart the other one, using its fqdn (as found from a reverse nslookup).

  • on >=6.x the hostname may not yet be hashed:
jail9 /# jls
 JID Hostname                    Path                                  IP Address(es)
   1 bitnet.dgate.org            /mnt/data1/69.55.232.50-col02094-DIR  69.55.232.50
   2 ns3.hctc.net                /mnt/data1/69.55.234.52-col01925-DIR  69.55.234.52
   3 bsd1                        /mnt/data1/69.55.232.44-col00155-DIR  69.55.232.44
   4 let2.bbag.org               /mnt/data1/69.55.230.92-col00202-DIR  69.55.230.92
   5 post.org                    /mnt/data2/69.55.232.51-col02095-DIR  69.55.232.51 ...
   6 ns2                         /mnt/data1/69.55.232.47-col01506-DIR  69.55.232.47 ...
   7 arlen.server.net            /mnt/data1/69.55.232.52-col01171-DIR  69.55.232.52
   8 deskfood.com                /mnt/data1/69.55.232.71-col00419-DIR  69.55.232.71
   9 mirage.confluentforms.com   /mnt/data1/69.55.232.54-col02105-DIR  69.55.232.54 ...
  10 beachmember.com             /mnt/data1/69.55.232.59-col02107-DIR  69.55.232.59
  11 www.agottem.com             /mnt/data1/69.55.232.60-col02109-DIR  69.55.232.60
  12 sdhobbit.myglance.org       /mnt/data1/69.55.236.82-col01708-DIR  69.55.236.82
  13 ns1.jnielsen.net            /mnt/data1/69.55.234.48-col00204-DIR  69.55.234.48 ...
  14 ymt.rollingegg.net          /mnt/data2/69.55.236.71-col01678-DIR  69.55.236.71
  15 verse.unixlore.net          /mnt/data1/69.55.232.58-col02131-DIR  69.55.232.58
  16 smcc-mail.org               /mnt/data2/69.55.232.68-col02144-DIR  69.55.232.68
  17 kasoutsuki.w4jdh.net        /mnt/data2/69.55.232.46-col02147-DIR  69.55.232.46
  18 dili.thium.net              /mnt/data2/69.55.232.80-col01901-DIR  69.55.232.80
  20 www.tekmarsis.com           /mnt/data2/69.55.232.66-col02155-DIR  69.55.232.66
  21 vps.yoxel.net               /mnt/data2/69.55.236.67-col01673-DIR  69.55.236.67
  22 smitty.twitalertz.com       /mnt/data2/69.55.232.84-col02153-DIR  69.55.232.84
  23 deliver4.klatha.com         /mnt/data2/69.55.232.67-col02160-DIR  69.55.232.67
  24 nideffer.com                /mnt/data2/69.55.232.65-col00412-DIR  69.55.232.65
  25 usa.hanyuan.com             /mnt/data2/69.55.232.57-col02163-DIR  69.55.232.57
  26 daifuku.ppbh.com            /mnt/data2/69.55.236.91-col01720-DIR  69.55.236.91
  27 collins.greencape.net       /mnt/data2/69.55.232.83-col01294-DIR  69.55.232.83
  28 ragebox.com                 /mnt/data2/69.55.230.104-col01278-DIR 69.55.230.104
  29 outside.mt.net              /mnt/data2/69.55.232.72-col02166-DIR  69.55.232.72
  30 vps.payneful.ca             /mnt/data2/69.55.234.98-col01999-DIR  69.55.234.98
  31 higgins                     /mnt/data2/69.55.232.87-col02165-DIR  69.55.232.87 ...
  32 ozymandius                  /mnt/data2/69.55.228.96-col01233-DIR  69.55.228.96
  33 trusted.realtors.org        /mnt/data2/69.55.238.72-col02170-DIR  69.55.238.72
  34 jc1.flanderous.com          /mnt/data2/69.55.239.22-col01504-DIR  69.55.239.22
  36 guppylog.com                /mnt/data2/69.55.238.73-col00036-DIR  69.55.238.73
  40 haliohost.com               /mnt/data2/69.55.234.41-col01916-DIR  69.55.234.41 ...
  41 satyr.jorge.cc              /mnt/data1/69.55.232.70-col01963-DIR  69.55.232.70
jail9 /# jailkill satyr.jorge.cc
ERROR: jail_: jail "satyr,jorge,cc" not found

Note how it's saying satyr,jorge,cc is not found, and not satyr.jorge.cc.

The jail tools track things using a mapping of comma-delimited hostnames. That mapping is regenerated every few hours by cron:

jail9 /# crontab -l
0 0,6,12,18 * * * /usr/local/jail/bin/sync_jail_names

So if we run this manually:

jail9 /# /usr/local/jail/bin/sync_jail_names

Then kill the jail:

jail9 /# jailkill satyr.jorge.cc

successfully killed: satyr,jorge,cc

It worked.


If you ever see this when trying to kill a jail:

# jailkill e-scribe.com
killing JID: 6 hostname: e-scribe.com
3 procs running
3 procs running
3 procs running
3 procs running
...

jailkill probably got lost trying to kill off the jail. Just ctrl-c the jailkill process, then run a jailps on the hostname, and kill -9 any process which is still running. Keep running jailps and kill -9 till all processes are gone.

jailpsall

jailpsall

will run a jailps on all jails configured in the quad files (this is different from jailps with no arguments as it won’t help you find a “hidden” system)

jailpsw

jailpsw

will run a jailps with an extra -w to provide wider output

jt (>=7.x)

jt

displays the top 20 processes on the server (the top 20 processes from top) and which jail owns them. This is very helpful for determining who is doing what when the server is very busy.

jtop (>=7.x)

jtop

a wrapper for top displaying processes on the server and which jail owns them. Constantly updates, like top.

jtop (<7.x)

jtop

displays the top 20 processes on the server (the top 20 processes from top) and which jail owns them. This is very helpful for determining who is doing what when the server is very busy.

stopjail

stopjail <hostname> [1]

this will jailkill, umount and vnconfig -u a jail. If passed an optional 2nd argument, it will not exit before umounting and un-vnconfig'ing in the event jailkill returns no processes killed. This is useful if you just want to umount and vnconfig -u a jail you've already killed. It is intelligent in that it won't try to umount or vnconfig -u if it's not necessary.

startjail

startjail <hostname>

this will vnconfig, mount (including linprocfs and null-mounts), and start a jail. Essentially, it reads the jail's relevant block from the right quad file and executes it. It is intelligent in that it won't try to mount or vnconfig if it's not necessary.

jpid

jpid <pid>

displays information about a process – including which jail owns it. It’s the equivalent of running cat /proc/<pid>/status

canceljail

canceljail <hostname> [1]

this will stop a jail (the equivalent of stopjail), check for backups (offering to remove them from the backup server and the backup.config), rename the vnfile, remove the dir, and edit the quad/safe. If passed an optional 2nd argument, it will not exit upon failing to kill any processes owned by the jail. This is useful if you just want to cancel a jail which is already stopped.

jls

jls [-v]

Lists all jails running:

JID #REF IP Address      Hostname                     Path
 101  135 69.55.224.148   mail.pc9.org                 /mnt/data2/69.55.224.148-col01034-DIR
(#REF is the number of references or procs(?) running)

Running with -v will give you all IPs assigned to each jail (7.2 up)

JID #REF Hostname                     Path                                  IP Address(es)
 101  139 mail.pc9.org                 /mnt/data2/69.55.224.148-col01034-DIR 69.55.224.148 69.55.234.85

startalljails

startalljails

7.2+ only. This will parse through quad1 and start all jails. It utilizes lockfiles so it won’t try to start a jail more than once- therefore multiple instances can be running in parallel without fear of starting a jail twice. If a jail startup gets stuck, you can ^C without fear of killing the script. IMPORTANT- before running startalljails you should make sure you ran preboot once as it will clear out all the lockfiles and enable startalljails to work properly.

aaccheck.sh

aaccheck.sh

displays the output of container list and task list from aaccli

backup

backup

backup script called nightly to update jail scripts and do customer backups

buildsafe

buildsafe

creates safe files based on quads (automatically removing the fsck’s). This will destructively overwrite safe files

checkload.pl

checkload.pl

this was intended to be set up as a cronjob to watch processes on a jail when the load rises above a certain level. Not currently in use.

checkprio.pl

checkprio.pl

will look for any process (other than the current shell’s csh, sh, sshd procs) with a non-normal priority and normalize it

diskusagemon

diskusagemon <mount point> <1k blocks>

watches a mount point's disk use; when it reaches the level specified in the 2nd argument, it exits. This is useful when doing a restore and you want to be paged as it's nearing completion. Best used as: diskusagemon /asd/asd 1234; pagexxx

dumprestore

dumprestore <dumpfile>

this is a perl expect script which automatically enters ‘1’ and ‘y’. It seems to cause restore to fail to set owner permissions on large restores.

g

g <search>

greps the quad/safe files for the given search parameter

gather.pl

gather.pl

gathers up data about jails configured and writes to a file. Used for audits against the db

gb

gb <search>

greps backup.config for the given search parameter

gbg

gbg <search>

greps backup.config for the given search parameter and presents just the directories (for clean pasting)

ipfwbackup

ipfwbackup

writes ipfw traffic count data to a logfile

ipfwreset

ipfwreset

writes ipfw traffic count data to a logfile and resets counters to 0

js

js

output varies by OS version, but generally provides information about the base jail:

  • which vn's are in use
  • disk usage
  • info about the contents of the quads - the # of inodes represented by the jails in each group (133.2 in the example below) and how many jails per data mount, as well as subtotals
  • IPs bound to the base machine but not in use by a jail
  • free gvinum volumes, or unused vn's or used md's

/usr/local/jail/rc.d/quad1:
        /mnt/data1 133.2 (1)
        /mnt/data2 1040.5 (7)
        total 1173.7 (8)
/usr/local/jail/rc.d/quad2:
        /mnt/data1 983.4 (6)
        total 983.4 (6)
/usr/local/jail/rc.d/quad3:
        /mnt/data1 693.4 (4)
        /mnt/data2 371.6 (3)
        total 1065 (7)
/usr/local/jail/rc.d/quad4:
        /mnt/data1 466.6 (3)
        /mnt/data2 882.2 (5)
        total 1348.8 (8)
/mnt/data1: 2276.6 (14)
/mnt/data2: 2294.3 (15)

Available IPs:
69.55.230.11 69.55.230.13 69.55.228.200

Available volumes:
v78 /mnt/data2 2G
v79 /mnt/data2 2G
v80 /mnt/data2 2G

load.pl

load.pl

feeds info to load mrtg - executed by inetd.

makevirginjail

makevirginjail

Only on some systems, makes an empty jail (doesn't do restore step)

mb

mb <mount|umount>

(nfs) mounts and umounts dirs to backup2. Shortcuts are mbm and mbu to mount and unmount.

notify.sh

notify.sh

emails reboot@johncompanies.com – intended to be called at boot time to alert us to a machine which panics and reboots and isn’t caught by bb or castle.

orphanedbackupwatch

orphanedbackupwatch

looks for directories on backup2 which aren’t configured in backup.config and offers to delete them

postboot

postboot

to be run after a machine reboot and quad/safe’s are done executing. It will:

  • do chmod 666 on each jail’s /dev/null
  • add ipfw counts
  • run jailpsall (so you can see if a configured jail isn’t running)

preboot

preboot

to be run before running quad/safe – checks for misconfigurations:

  • a jail configured in a quad but not a safe
  • a jail is listed more than once in a quad
  • the ip assigned to a jail isn’t configured on the machine
  • alias numbering skips in the rc.conf (resulting in the above)
  • orphaned vnfile's that aren't mentioned in a quad/safe
  • ip mismatches between dir/vnfile name and the jail’s ip
  • dir/vnfiles's in quad/safe that don’t exist

quadanalyze.pl

quadanalyze.pl

called by js, produces the info (seen above with js explanation) about the contents of quad (inode count, # of jails, etc.)

rsync.backup

rsync.backup

does customer backups (relies on backup.config)

taskdone

taskdone

when called will email support@johncompanies.com with the hostname of the machine from which it was executed as the subject

topten

topten

summarizes the top 10 traffic users (called by ipfwreset)

trafficgather.pl

trafficgather.pl [yy-mm]

sends a traffic usage summary by jail to support@johncompanies.com and payments@johncompanies.com. Optional arguments are year and month (must be in the past). If not passed, it assumes last month. Relies on the traffic logs created by ipfwreset and ipfwbackup.

trafficwatch.pl

trafficwatch.pl

checks traffic usage and emails support@johncompanies.com when a jail reaches the warning level (35G) and the limit (40G). We really aren't using this anymore now that we have netflow.

trafstats

trafstats

writes ipfw traffic usage info by jail to a file called jc_traffic_dump in each jail’s / dir

truncate_jailmake

truncate_jailmake

a version of jailmake which creates truncated vnfiles.

vb

vb

the equivalent of: vi /usr/local/jail/bin/backup.config

vs (freebsd)

vs<n>

the equivalent of: vi /usr/local/jail/rc.d/safe<n>

vq<n>

the equivalent of: vi /usr/local/jail/rc.d/quad<n>

dumpremote

dumpremote <user@machine> </remote/location/file-dump> <vnX>

ex: dumpremote user@10.1.4.117 /mnt/data3/remote.echoditto.com-dump 7

this will dump a vn filesystem to a remote machine and location

oversellcheck

oversellcheck

displays how much a disk is oversold or undersold taking into account truncated vn files. Only for use on 4.x systems

mvbackups (freebsd)

mvbackups <dir> (1.1.1.1-col00001-DIR) <target_machine> (jail1) <target_dir> (data1)

moves backups from one location to another on the backup server, and provides you with option to remove entries from current backup.config, and simple paste command to add the config to backup.config on the target server

jailnice

jailnice <hostname>

applies renice 19 [PID] and rtprio 31 -[PID] to each process in the given jail

dumpremoterestore

dumpremoterestore <device> <ip of target machine> <dir on target machine>

ex: dumpremoterestore /dev/vn51 10.1.4.118 /mnt/data2/69.55.239.45-col00688-DIR

dumps a device and restores it to a directory on a remote machine. Requires that you enable root ssh on the remote machine.

psj

psj

shows just the procs running on the base system – a ps auxw but without jail’d procs present

perc5iraidchk

perc5iraidchk

checks for degraded arrays on Dell 2950 systems with Perc5/6 controllers

perc4eraidchk

perc4eraidchk

checks for degraded arrays on Dell 2850 systems with Perc4e/Di controllers

Virtuozzo VPS

Note: We use VEID (Virtual Environment ID) and CTID (Container ID) interchangeably. Similarly, VE and CT. They mean the same thing. VZPP = VirtuoZzo Power Panel (the control panel for each CT)

All linux systems exist in /vz, /vz1 or /vz2 - since each linux machine holds roughly 60-90 customers, there will be roughly 30-45 in each partition.

The actual filesystem of the system in question is in:

/vz(1-2)/private/(VEID)

Where VEID is the identifier for that system - an all-numeric string larger than 100.

The actual mounted and running systems are in the corresponding:

/vz(1-2)/root/(VEID)

But we rarely interact with any system from this mount point.

You should never need to touch the root portion of their system. However, you can traverse their filesystem by going to /vz(1-2)/private/(VEID)/root (/vz(1-2)/private/(VEID)/fs/root on 4.x systems) - the root of their filesystem is in that directory, and their entire system is underneath it.
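
For example, to poke around the filesystem of VE 636 (the sample VE whose conf file is shown below) on a pre-4.x layout:

cd /vz1/private/636/root
ls etc/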

Every VE has a startup script in /etc/sysconfig/vz-scripts (which is symlinked as /vzconf on all systems) - the VE startup script is simply named (VEID).conf - it contains all the system parameters for that VE:

# Configuration file generated by vzsplit for 60 VE
# on HN with total amount of physical mem 2011 Mb

VERSION="2"
CLASSID="2"

ONBOOT="yes"

KMEMSIZE="8100000:8200000"
LOCKEDPAGES="322:322"
PRIVVMPAGES="610000:615000"
SHMPAGES="33000:34500"
NUMPROC="410:415"
PHYSPAGES="0:2147483647"
VMGUARPAGES="13019:2147483647"
OOMGUARPAGES="13019:2147483647"
NUMTCPSOCK="1210:1215"
NUMFLOCK="107:117"
NUMPTY="19:19"
NUMSIGINFO="274:274"
TCPSNDBUF="1800000:1900000"
TCPRCVBUF="1800000:1900000"
OTHERSOCKBUF="900000:950000"
DGRAMRCVBUF="200000:200000"
NUMOTHERSOCK="650:660"
DCACHE="786432:818029"
NUMFILE="7500:7600"
AVNUMPROC="51:51"
IPTENTRIES="155:155"
DISKSPACE="4194304:4613734"
DISKINODES="400000:420000"
CPUUNITS="1412"
QUOTAUGIDLIMIT="2000"
VE_ROOT="/vz1/root/636"
VE_PRIVATE="/vz1/private/636"
NAMESERVER="69.55.225.225 69.55.230.3"
OSTEMPLATE="vzredhat-7.3/20030305"
VE_TYPE="regular"
IP_ADDRESS="69.55.225.229"
HOSTNAME="textengine.net"


As you can see, the hostname is set here, the disk space is set here, the number of inodes, the number of files that can be open, the number of tcp sockets, etc. - all are set here.

In fact, everything that can be set on this customer system is set in this conf file.


All interaction with the customer system is done with the VEID. You start the system by running:

vzctl start 999

You stop it by running:

vzctl stop 999

You execute commands in it by running:

vzctl exec 999 df -k

You enter into it, via a root-shell backdoor with:

vzctl enter 999

and you set parameters for the system, while it is still running, with:

vzctl set 999 --diskspace 6100000:6200000 --save

vzctl is the most commonly used command - we have aliased v to vzctl since we use it so often. We’ll continue to use vzctl in our examples, but feel free to use just v.

Let's say the user wants more diskspace. You can cat their conf file and see:

DISKSPACE="4194304:4613734"

So right now they have 4gigs of space. You can then change it to 6 with:

vzctl set 999 --diskspace 6100000:6200000 --save

IMPORTANT: all issuances of the vzctl set command need to end with --save - if they don't, the setting will be set, but it will not be saved to the conf file, and they will not have those settings next time they boot.

All of the tunables in the conf file can be set with the vzctl set command. Note that in the conf file, and on the vzctl set command line, we always issue two numbers separated by a colon - that is because we are setting the soft and hard limits. Always set the hard limit slightly above the soft limit, as you can see is done in the conf file for all those settings.
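
For example, to raise the process limit on VE 999 (the numbers here are illustrative - just keep the hard number slightly above the soft one, and remember --save):

vzctl set 999 --numproc 500:510 --save
grep NUMPROC /vzconf/999.conf

(the grep just confirms the new values made it into the conf file)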

There are also things you can set with `vzctl set` that are not in the conf file as settings, per se. For instance, you can add IPs:

vzctl set 999 --ipadd 10.10.10.10 --save

or multiple IPs:

vzctl set 999 --ipadd 10.10.10.10 --ipadd 10.10.20.30 --save

or change the hostname:

vzctl set 999 --hostname www.example.com --save

You can even set the nameservers:

vzctl set 999 --nameserver 198.78.66.4 --nameserver 198.78.70.180 --save

Although you probably will never do that.

You can disable a VPS from being started (by VZPP or reboot) (4.x):

vzctl set 999 --disabled yes --save 

You can disable a VPS from being started (by VZPP or reboot) (<=3.x):

vzctl set 999 --onboot=no --save 

You can disable a VPS from using its control panel:

vzctl set 999 --offline_management=no --save 

You can suspend a VPS, so it can be resumed in the same state it was in when it was stopped (4.x):

vzctl suspend 999

and to resume it:

vzctl resume 999

to see who owns a process:

vzpid <PID>

to mount up an unmounted ve:

vzctl mount 827


To see network stats for CT's:

# vznetstat
VEID    Net.Class  Output(bytes)   Input(bytes)
-----------------------------------------------
24218     1            484M             39M
24245     1            463M            143M
2451      1           2224M            265M
2454      1           2616M            385M
4149      1            125M             68M
418       1           1560M             34M
472       1           1219M            315M
726       1            628M            317M
763       1            223M             82M
771       1           4234M            437M
-----------------------------------------------
Total:               13780M           2090M


One thing that sometimes comes up on older systems that we created with smaller defaults is that the system would run out of inodes. The user will email and say they cannot create any more files or grow any files larger, but they will also say that they are not out of diskspace ... they are running:

df -k

and seeing how much space is free - and they are not out of space. They are most likely out of inodes - which they would see by running:

df -i

So, the first thing you should do is enter their system with:

vzctl enter 999

and run: df -i

to confirm your theory. Then exit their system. Then simply cat their conf file and see what their inodes are set to (probably 200000:200000, since that was the old default on the older systems) and run:

vzctl set 999 --diskinodes 400000:400000 --save

If they are not out of inodes, then a good possibility is that they have maxed out their numfile configuration variable, which controls how many files they can have open in their system. The current default is 7500 (which nobody has ever hit), but the old default was as low as 2000, so you would run something like:

vzctl set 999 --numfile 7500:7500 --save

You cannot start or stop a VE if your pwd is its private (/vz/private/999) or root (/vz/root/999) directory, or anywhere below them.


Recovering from a crash (linux)

Diagnose whether you have a crash

The most important thing is to get the machine and all ve's back up as soon as possible. Note the time, you'll need to create a crash log entry (Mgmt. -> Reference -> CrashLog). The first thing to do is head over to the serial console screen and see if there are any kernel error messages output. Try to copy any messages (or just a sample of repeating messages) you see into the notes section of the crash log - these will also likely need to be sent to virtuozzo for interpretation. If the messages are spewing too fast, hit ^O + H to start a screen log dump which you can observe after the machine is rebooted. Additionally, if the machine is responsive, you can get a trace to send to virtuozzo by hooking up a kvm and entering these 3 sequences:

alt+print screen+m
alt+print screen+p
alt+print screen+t

If there are no messages, the machine may just be really busy - wait a bit (5-10min) to see if it comes back. If it's still pinging, odds are it's very busy. If it doesn't come back, or the messages indicate a fatal error, you will need to proceed with a power cycle (ctrl+alt+del will not work).

Power cycle the server

If this machine is not a Dell 2950 with a DRAC card (i.e. if you can't ssh into the DRAC card and issue racadm serveraction hardreset), then you will need someone at the data center to power the machine off, wait 30 sec, then turn it back on. Make sure to re-attach via console (tip virtxx) immediately after power down.
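
If the machine does have a DRAC, the reset looks roughly like this (the DRAC's address below is a placeholder - the real addresses and credentials are in Mgmt.):

ssh root@<drac-ip>
racadm serveraction hardreset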

(Re)attach to the console

Stay on the console the entire time during boot. As the BIOS posts, look out for the RAID card output - does everything look healthy? The output may be scrambled; look for "DEGRADED" or "FAILED". Once the OS starts booting you will be disconnected (dropped back to the shell on the console server) a couple times during the boot up. The reason you want to quickly re-attach is two-fold: 1. if you don't reattach quickly then you won't get any console output, 2. you want to be attached before the server potentially starts (an extensive) fsck. If you attach after the fsck begins, you'll have seen no indication it started an fsck and the server will appear frozen during startup - no output, no response.

Start containers/VE's/VPSs

When the machine begins to start VE's, it's safe to leave the console and login via ssh. All virts should be set to auto start all the VEs after a crash. Further, most (newer) virts are set to "fastboot" their VE's (to find out, do:

grep -i fast /etc/sysconfig/vz 

and look for VZFASTBOOT=yes). If this was set prior to the machine's crash (setting it after the machine boots will not have any effect until the vz service is restarted), it will start each ve as fast as possible, in serial, then go thru each VE (serially), shutting it down, running a vzquota (disk usage) check, then bringing it back up. The benefit is that all VE's are brought up quickly (within 15min or so depending on the #); the downside is a customer watching closely will notice 2 outages - 1st the machine crash, 2nd their quota check (which will be a much shorter downtime - on the order of a few minutes).

Where "fastboot" is not set to yes (i.e. on quar1), vz will start them consecutively, checking the quotas one at a time, and the 60th VE may not start until an hour or two later - this is not acceptable.

The good news is, if you run vzctl start for a VE that is already started, you will simply get an error: VE is already started. Further, if you attempt to vzctl start a VE that is in the process of being started, you will simply get an error: unable to lock VE. So, there is no danger in simply running scripts to start smaller sets of VEs. If the system is not autostarting, then there is no issue, and even if it does, when it conflicts, one process (yours or the autostart) will lose, and just move on to the next one.
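
For example, to kick off a handful of VEs by hand (the VEIDs here are placeholders) - any VE the autostart already got will just error and the loop moves on:

for ve in 418 472 726 763; do vzctl start $ve; done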

A script has been written to assist with ve starts: startvirt.pl, which will start 6 ve's at a time until there are none left. If startvirt.pl is used on a system where "fastboot" was on, it will circumvent the fastboot for ve's started by startvirt.pl - they will go through the complete quota check before starting - therefore this is not advisable when a system has crashed. When a system is booted cleanly, and there's no need for vzquota checks, then startvirt.pl is safe and advisable to run.

Make sure all containers are running

You can quickly get a feel for how many ve’s are started by running:

[root@virt4 log]# vs
VEID 16066 exist mounted running
VEID 16067 exist mounted running
VEID 4102 exist mounted running
VEID 4112 exist mounted running
VEID 4116 exist mounted running
VEID 4122 exist mounted running
VEID 4123 exist mounted running
VEID 4124 exist mounted running
VEID 4132 exist mounted running
VEID 4148 exist mounted running
VEID 4151 exist mounted running
VEID 4155 exist mounted running
VEID 42 exist mounted running
VEID 432 exist mounted running
VEID 434 exist mounted running
VEID 442 exist mounted running
VEID 450 exist mounted running
VEID 452 exist mounted running
VEID 453 exist mounted running
VEID 454 exist mounted running
VEID 462 exist mounted running
VEID 463 exist mounted running
VEID 464 exist mounted running
VEID 465 exist mounted running
VEID 477 exist mounted running
VEID 484 exist mounted running
VEID 486 exist mounted running
VEID 490 exist mounted running

So to see how many ve’s have started:

[root@virt11 root]# vs | grep running | wc -l
     39

And to see how many haven’t:

[root@virt11 root]# vs | grep down | wc -l
     0

And how many we should have running:

[root@virt11 root]# vs | wc -l
     39

Another tool you can use to see which ve's have started, among other things, is vzstat. It will give you CPU, memory, and other stats on each ve and the overall system. It's a good thing to watch as ve's are starting (note the VENum parameter; it will tell you how many have started):

4:37pm, up 3 days,  5:31,  1 user, load average: 1.57, 1.68, 1.79
VENum 40, procs 1705: running 2, sleeping 1694, unint 0, zombie 9, stopped 0
CPU [ OK ]: VEs  57%, VE0   0%, user   8%, sys   7%, idle  85%, lat(ms) 412/2
Mem [ OK ]: total 6057MB, free 9MB/54MB (low/high), lat(ms) 0/0
Swap [ OK ]: tot 6142MB, free 4953MB, in 0.000MB/s, out 0.000MB/s
Net [ OK ]: tot: in  0.043MB/s  402pkt/s, out  0.382MB/s 4116pkt/s
Disks [ OK ]: in 0.002MB/s, out 0.000MB/s

  VEID ST    %VM     %KM         PROC    CPU     SOCK FCNT MLAT IP
     1 OK 1.0/17  0.0/0.4    0/32/256 0.0/0.5 39/1256    0    9 69.55.227.152
    21 OK 1.3/39  0.1/0.2    0/46/410 0.2/2.8 23/1860    0    6 69.55.239.60
   133 OK 3.1/39  0.1/0.3    1/34/410 6.3/2.8 98/1860    0    0 69.55.227.147
   263 OK 2.3/39  0.1/0.2    0/56/410 0.3/2.8 34/1860    0    1 69.55.237.74
   456 OK  17/39  0.1/0.2   0/100/410 0.1/2.8 48/1860    0   11 69.55.236.65
   476 OK 0.6/39  0.0/0.2    0/33/410 0.1/2.8 96/1860    0   10 69.55.227.151
   524 OK 1.8/39  0.1/0.2    0/33/410 0.0/2.8 28/1860    0    0 69.55.227.153
   594 OK 3.1/39  0.1/0.2    0/45/410 0.0/2.8 87/1860    0    1 69.55.239.40
   670 OK 7.7/39  0.2/0.3    0/98/410 0.0/2.8 64/1860    0  216 69.55.225.136
   691 OK 2.0/39  0.1/0.2    0/31/410 0.0/0.7 25/1860    0    1 69.55.234.96
   744 OK 0.1/17  0.0/0.5    0/10/410 0.0/0.7  7/1860    0    6 69.55.224.253
   755 OK 1.1/39  0.0/0.2    0/27/410 0.0/2.8 33/1860    0    0 192.168.1.4
   835 OK 1.1/39  0.0/0.2    0/19/410 0.0/2.8  5/1860    0    0 69.55.227.134
   856 OK 0.3/39  0.0/0.2    0/13/410 0.0/2.8 16/1860    0    0 69.55.227.137
   936 OK 3.2/52  0.2/0.4    0/75/410 0.2/0.7 69/1910    0    8 69.55.224.181
  1020 OK 3.9/39  0.1/0.2    0/60/410 0.1/0.7 55/1860    0    8 69.55.227.52
  1027 OK 0.3/39  0.0/0.2    0/14/410 0.0/2.8 17/1860    0    0 69.55.227.83
  1029 OK 1.9/39  0.1/0.2    0/48/410 0.2/2.8 25/1860    0    5 69.55.227.85
  1032 OK  12/39  0.1/0.4    0/80/410 0.0/2.8 41/1860    0    8 69.55.227.90

When you are all done, you will want to make sure that all the VEs really did get started; run vs one more time.

Note the time all ve's are back up, enter that into the crash log entry, and save it.

Occasionally, a ve will not start automatically. The most common reason for a ve not to come up normally is the ve was at its disk limit before the crash, and will not start since they're over the limit. To overcome this, set the disk space to the current usage level (the system will give this to you when it fails to start), start the ve, then re-set the disk space back to the prior level. Lastly, contact the customer to let them know they're out of disk (or allocate more disk if they're entitled to more).
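
A sketch of that sequence, assuming VE 999 failed to start and the error showed usage of roughly 4.5G against a 4G limit (numbers are illustrative, in 1K blocks):

vzctl set 999 --diskspace 4700000:4800000 --save
vzctl start 999
vzctl set 999 --diskspace 4194304:4613734 --save

(the last command puts back whatever values were in their conf before you raised them)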

Hitting performance barriers and fixing them

There are multiple modes virtuozzo offers to allocate resources to a ve. We utilize 2: SLM and UBC parameters. On our 4.x systems, we use all SLM - it's simpler to manage and understand. There are a few systems on virt19/18 that may also use SLM. Everything else uses UBC. You can tell an SLM ve by:

SLMMODE="all"

in their conf file.

TODO: detail SLM modes and parameters.

If someone is in SLM mode and they hit memory resource limits, they simply need to upgrade to more memory.

The following applies to everyone else (UBC).

Customers will often email and say that they are getting out of memory errors - a common one is "cannot fork" ... basically, anytime you see something odd like this, it means they are hitting one of their limits that is in place in their conf file.

The conf file, however, simply shows their limits - how do we know what they are currently at?

The answer is /proc/user_beancounters - this file contains the current status (and peaks) of their performance settings, and also counts how many times they have hit the barrier. The output of the file looks like this:

764: kmemsize         384113     898185    8100000    8200000          0
     lockedpages           0          0        322        322          0
     privvmpages        1292       7108     610000     615000          0
     shmpages            270        528      33000      34500          0
     dummy                 0          0          0          0          0
     numproc               8         23        410        415          0
     physpages            48       5624          0 2147483647          0
     vmguarpages           0          0      13019 2147483647          0
     oomguarpages        641       6389      13019 2147483647          0
     numtcpsock            3         21       1210       1215          0
     numflock              1          3        107        117          0
     numpty                0          2         19         19          0
     numsiginfo            0          4        274        274          0
     tcpsndbuf             0      80928    1800000    1900000          0 
     tcprcvbuf             0     108976    1800000    1900000          0
     othersockbuf       2224      37568     900000     950000          0
     dgramrcvbuf           0       4272     200000     200000          0
     numothersock          3          9        650        660          0
     dcachesize        53922     100320     786432     818029          0
     numfile             161        382       7500       7600          0
     dummy                 0          0          0          0          0
     dummy                 0          0          0          0          0
     dummy                 0          0          0          0          0
     numiptent             4          4        155        155          0

The first column is the name of the counter in question - the same names we saw in the system's conf file. The second column is the _current_ value of that counter, the third column is the max that counter has ever risen to, the fourth column is the soft limit, and the fifth column is the hard limit (which is the same as the numbers in that system's conf file).

The sixth number is the failcount - how many times the current usage has risen to hit the barrier. It will increase as soon as the current usage hits the soft limit.

The problem with /proc/user_beancounters is that it actually contains that set of data for every running VE - so you can't just cat /proc/user_beancounters - it is too long and you get info for every other running system.

You can vzctl enter the system and run:

vzctl enter 9999
cat /proc/user_beancounters

inside their system, and you will just see the stats for their particular system, but entering their system every time you want to see it is cumbersome.

So, I wrote a simple script called "vzs" which simply greps for the VEID, and spits out the next 20 or so lines (however many lines there are in the output, I forget) after it. For instance:

# vzs 765:
765: kmemsize        2007936    2562780    8100000    8200000          0
     lockedpages           0          8        322        322          0
     privvmpages       26925      71126     610000     615000          0
     shmpages          16654      16750      33000      34500          0
     dummy                 0          0          0          0          0
     numproc              41         57        410        415          0
     physpages          1794      49160          0 2147483647          0
     vmguarpages           0          0      13019 2147483647          0
     oomguarpages       4780      51270      13019 2147483647          0
     numtcpsock           23         37       1210       1215          0
     numflock             17         39        107        117          0
     numpty                1          3         19         19          0
     numsiginfo            0          6        274        274          0
     tcpsndbuf         22240     333600    1800000    1900000          0
     tcprcvbuf             0     222656    1800000    1900000          0
     othersockbuf     104528     414944     900000     950000          0
     dgramrcvbuf           0       4448     200000     200000          0
     numothersock         73        105        650        660          0
     dcachesize       247038     309111     786432     818029          0
     numfile             904       1231       7500       7600          0
     dummy                 0          0          0          0          0
     dummy                 0          0          0          0          0
     dummy                 0          0          0          0          0
     numiptent             4          4        155        155          0

That showed us just the portion of /proc/user_beancounters for system 765.

When you run the vzs command, always add a : after the VEID.
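
The installed script may differ, but a minimal sketch of the same idea - grep /proc/user_beancounters for the "765:" header line and print the block that follows - looks something like this:

#!/bin/sh
# sketch only, not necessarily the vzs actually installed on the virts
# usage: vzs 765:   (the trailing colon anchors on the VEID header line
# rather than stray numbers elsewhere in the file)
grep -A 23 "$1" /proc/user_beancounters
# -A 23 assumes 23 resource lines follow the kmemsize header line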

So, if a customer complains about some out of memory errors, or no more files, or no more ptys, or just has an unspecific complaint about processes dying, etc., the very first thing you need to do is check their beancounters with vzs. Usually you will spot an item that has a high failcount and needs to be upped.

At that point you could simply up the counter with `vzctl set`. Generally pick a number 10-20% higher than the old one, and make the hard limit slightly larger than the soft limit. However our systems now come in several levels and those levels have more/different memory allocations. If someone is complaining about something other than a memory limit (pty, numiptent, numflock), it's generally safe to increase it, at least to the same level as what's in the /vzconf/4unlimited file on the newest virt. If someone is hitting a memory limit, first make sure they are given what they deserve:

(refer to mgmt -> payments -> packages)

To set those levels, you use the setmem command.

The alternate (DEPRECATED) method would be to use one of these commands: 256 <veid>, 300 <veid>, 384 <veid>, 512 <veid>

If the levels were not right (you'd run vzs <veid> before and after to see the effect), tell the customer they've been adjusted and be done with it. If the levels were right, tell the customer they must upgrade to a higher package, tell them how to see their level (control panel), and that they can reboot their system to escape this lockup condition.

Customers can also complain that their site is totally unreachable, or complain that it is down ... if the underlying machine is up, and all seems well, you may notice in the beancounters that network-specific counters are failing - such as numtcpsock, tcpsndbuf or tcprcvbuf. This will keep them from talking on the network and make it seem like their system is down. Again, just up the limits and things should be fine.
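
For example, if numtcpsock and the tcp buffers are the ones failing, something like this frees them up (values illustrative - as always, hard slightly above soft, and --save):

vzctl set 999 --numtcpsock 1500:1510 --save
vzctl set 999 --tcpsndbuf 2500000:2600000 --save
vzctl set 999 --tcprcvbuf 2500000:2600000 --save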

On virts 1-4, you should first look at the default settings for that item on a later virt, such as virt 8 - we have increased the defaults a lot since the early machines. So, if you are going to up a counter on virt2, instead of upping it by 10-20%, instead up it to the new default that you see on virt8.

Traffic accounting on linux

DEPRECATED - all tracking is done via bwdb now. This is how we used to track traffic.

TODO: update for diff versions of vz

Unlike FreeBSD, where we have to add firewall count rules to the system to count the traffic, virtuozzo counts the traffic for us. You can see the current traffic stats by running `vznetstat`:

# vznetstat
VEID    Net.Class  Output(bytes)   Input(bytes)
-----------------------------------------------
24218     1            484M             39M
24245     1            463M            143M
2451      1           2224M            265M
2454      1           2616M            385M
4149      1            125M             68M
418       1           1560M             34M
472       1           1219M            315M
726       1            628M            317M
763       1            223M             82M
771       1           4234M            437M
-----------------------------------------------
Total:               13780M           2090M

As you can see the VEID is on a line with the in and out bytes. So, we simply run a cron job:

4,9,14,19,24,29,34,39,44,49,55,59 * * * * /root/vztrafdump.sh

Just like we do on FreeBSD - this one goes through all the VEs in /vz/private and greps the line from vznetstat that matches them and dumps it in /jc_traffic_dump on their system. Then it does it again for all the VEs in /vz1/private. It is important to note that vznetstat runs only once, and the grepping is done from a temporary file that contains that output - we do this because running vznetstat once for each VE that we read out of /vz/private and /vz1/private would take way too long and be too intensive.
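
Roughly, the cron job does something like the sketch below (this is not the actual vztrafdump.sh, just the shape of it - on these older systems each ve's / lives under private/(VEID)/root):

#!/bin/sh
# run vznetstat once, then grep the snapshot for each VE
vznetstat > /tmp/vznetstat.$$
for dir in /vz/private/* /vz1/private/*; do
    veid=`basename $dir`
    grep "^$veid " /tmp/vznetstat.$$ > $dir/root/jc_traffic_dump
done
rm -f /tmp/vznetstat.$$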

You do not need to do anything to facilitate this other than make sure that that cron job is running - the vznetstat counters are always running, and any new VEs that are added to the system will be accounted for automatically.

Traffic resetting no longer works with vz 2.6, so we disable the vztrafdump.sh on those virts.

Watchdog script

On some of the older virts, we have a watchdog running that kills procs that are deemed bad per the following excerpt (the surrounding loop that sets $line and $pid from the process listing is not shown here):

/root/watchdog from quar1

if echo $line | awk '{print $(NF-3)}' | grep [5-9]...
  then
# 50-90%
      if echo $line | awk '{print $(NF-1)}' | grep "...:.."
      then
# running for > 99min
        echo $line >> /root/wd.log
        /usr/sbin/vzpid $pid | tail -1 >> /root/wd.log
        kill -9 $pid

      fi
      if echo $line | awk '{print $(NF-1)}' | grep "....m"
      then
# running for > 1000min
        echo $line >> /root/wd.log
        /usr/sbin/vzpid $pid | tail -1 >> /root/wd.log
        kill -9 $pid

      fi
  fi

  if echo $line | awk '{print $(NF-3)}' | grep [1-9]...
  then
# running for 10-90 percent
    if echo $line | awk '{print $NF}' | egrep 'cfusion|counter|vchkpw'
    then

      if echo $line | awk '{print $(NF-1)}' | grep "[2-9]:.."
      then
# between 2-9min
        echo $line >> /root/wd.log
        /usr/sbin/vzpid $pid | tail -1 >> /root/wd.log
        kill -9 $pid

      elif echo $line | awk '{print $(NF-1)}' | grep "[0-9][0-9]:.."
      then
# up to 99min
        echo $line >> /root/wd.log
        /usr/sbin/vzpid $pid | tail -1 >> /root/wd.log
        kill -9 $pid

      fi
    fi
  fi


Misc Linux Items

We are overselling hard drive space ... when you configure a linux system with a certain amount of disk space (the default is 4gigs) you do not actually use up 4gigs of space on the system. The diskspace setting for a user is simply a cap, and they only use up as much space on the actual disk drive as they are actually using.

When you create a new linux system, even though there are some 300 RPMs or so installed, if you run `df -k` you will see that the entire 4gig partition is empty - no space is being used. This is because the files in their system are "magic symlinks" to the template for their OS that is in /vz/template - however, any changes to any of those files will "disconnect" them and they will immediately begin using space in their system. Further, any new files uploaded (even if those new files overwrite existing files) will take up space on the partition.


if you see this:

[root@virt8 root]# vzctl stop 160 ; vzctl start 160
VE is not running
Starting VE ...
VE is unmounted
VE is mounted
Adding IP address(es): 69.55.226.83 69.55.226.84
bash ERROR: Can't change file /etc/sysconfig/network
Deleting IP address(es): 69.55.226.83 69.55.226.84
VE is unmounted
[root@virt8 root]#

it probably means they no longer have /bin/bash - copy one in for them
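
A hedged example, assuming VE 160 on a pre-4.x layout and that another VE running the same distro (the donor VEID below is a placeholder) can spare the binary:

cp /vz/private/<donor-veid>/root/bin/bash /vz/private/160/root/bin/bash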

ALSO: another possibility is that they have removed the `ed` RPM from their system - it needs to be reinstalled into their system. But since their system is down, this is tricky ...

The VE startup scripts used by 'vzctl' want the 'ed' package to be available inside the VE. So if the 'ed' package is enabled in the OS template config and the OS template the VE (#827 in this example) is based on, this error should be fixed.

Yes, it is possible to add an RPM to a VE while it is not running. Try the following:

# cd /vz/template/<OS_template_with_ed_package>/
# vzctl mount 827
# rpm -Uvh --root /vz/root/827 --veid 827 ed-0.2-25.i386.vz.rpm

Usually there's an error, but it's ok

Note: replace 'ed-0.2-25.i386.vz.rpm' in last command with actual version of 'ed' package you have.


So how do I know what template the user has? Cat their conf file and it is listed in there. For example, if the conf file has:

[root@virt12 sbin]# vc 1103
…snip…
OSTEMPLATE="debian-3.0/20030822"
TEMPLATES="mod_perl-deb30/20030707 mod_ssl-deb30/20030703 mysql-deb30/20030707 proftpd-deb30/20030703 webmin-deb30/20030823 "
…

then they are on debian 3.0, all of their system RPMs are in /vz/template/debian-3.0, and they are using version 20030822 of that debian 3.0 template. They've also got additional packages installed (mod_perl, mod_ssl, etc). Those are also found under /vz/template


Edits needed to run java:

When we first created the VEs, the default setting for privvmpages was 93000:94000 ... which was high enough that most people never had problems ... however, you can't run java or jdk or tomcat or anything java related with that setting. We have found that by setting privvmpages to 610000:615000 that java runs just fine. That is now the default setting. It is exceedingly rare that anyone needs it higher than that, although we have seen it once or twice.

Any problems with java at all - the first thing you need to do is see if the failcnt has risen for privvmpages.
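
If it has, the usual fix on an older VE still carrying a small default is to bring it up to the current default:

vzs 999: | grep privvmpages
vzctl set 999 --privvmpages 610000:615000 --save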

# vzctl start 160
Starting VE ...
vzquota : (error) Quota on syscall for 160: Device or resource busy
Running vzquota on failed for VE 160 [3]

This is because my pwd is _in_ their private directory - you can't start it until you move out

People seem to have trouble with php if they are clueless newbies. Here are two common problems/solutions:

no... but i figured it out myself. problem was the php.ini file that came vanilla with the account was not configured to work with apache (the ENGINE directive was set to off).

everything else seems fine now.


the problem was in the php.ini file. I noticed that is wasnt showing the code when it was in an html file so I looked at the php.ini file and had to change it so it recognized <? tags aswell as <?php tags.

Also, make sure this is added to httpd.conf:

   AddType application/x-httpd-php .php

You can set the time zone:

You can change the timezone by doing this:

ln -sf /usr/share/zoneinfo/<zone> /etc/localtime

where <zone> is the zone you want in the /usr/share/zoneinfo/ directory.


Failing shm_open calls:

first, please check if /dev/shm is mounted inside the VE. The 'cat /proc/mounts' command should show something like this:

tmpfs /dev/shm tmpfs rw 0 0

If /dev/shm is not mounted, you have 2 ways to solve the issue: 1. execute the following command inside the VE (doesn't require a VE reboot):

mount -t tmpfs none /dev/shm

2. add the following line to /etc/fstab inside the VE and reboot it:

tmpfs         /dev/shm        tmpfs           defaults        0 0

You can have a mounted but not running ve. Just:

vzctl mount <veid>

When a debian sys can’t get on the network, and you try:

vzctl set 1046 --ipadd 69.55.227.117
Adding IP address(es): 69.55.227.117
Failed to bring up lo.
Failed to bring up venet0.

They probably removed the iproute package, which must be the one from swsoft. To restore:

# dpkg -i --veid=1046 --admindir=/vz1/private/1046/root/var/lib/dpkg --instdir=/vz1/private/1046/root/ /vz/template/debian-3.0/iproute_20010824-8_i386.vz.deb
(Reading database ... 16007 files and directories currently installed.)
Preparing to replace iproute 20010824-8 (using .../iproute_20010824-8_i386.vz.deb) ...
Unpacking replacement iproute ...
Setting up iproute (20010824-8) ...

Then restart their ve


in a ve i do:

cd /
du -h .

and get: 483M .

i do:

bash-2.05a# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/vzfs             4.0G  2.3G  1.7G  56% /

how can this be?

Is it possible that quota file was corrupted somehow? Please try to:

vzctl stop <VEID>
vzquota drop <VEID>
vzquota init <VEID>
vzctl start <VEID>

How to stop vz from starting after reboot:

VIRTUOZZO=no 

in

/etc/sysconfig/vz

To start:

service vz start

(after setting VIRTUOZZO=yes in /etc/sysconfig/vz)

service vz restart will do some kind of 'soft reboot' -- restart all VPSes and reload modules without rebooting the node

if you need to shut down all VPSes really really fast, run killall -9 init


Postfix tip:

You may want to tweak settings: default_process_limit=10
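
One way to set that inside the customer's ve (assuming a stock postfix with postconf available):

vzctl enter 999
postconf -e "default_process_limit = 10"
postfix reload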

Virtuozzo VPS Management Tools

vm

TODO

cancelve

cancelve <veid>

this will:

  • stop a ve
  • check for backups (offer to remove them from the backup server and the backup.config)
  • rename the private dir
  • check for PTR, provide the commands to reset to default
  • rename the ve's config
  • remind you to remove firewall rules
  • remind you to remove DNS entries

ipadd

ipadd  <veid> <ip> [ip] [ip]

adds ip(s) to a ve

ipdel

ipdel <veid> <ip> [ip] [ip]

removes ip(s) from a ve

vc

vc <veid>

the equivalent of: cat /vzconf/<veid>.conf

vl

vl

displays a list of ve #'s, 1 per line (ostensibly to use in a for loop - see the example below)
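
For example, to get an uptime from every configured ve:

for ve in `vl`; do vzctl exec $ve uptime; done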

vp

vp <veid>

the equivalent of: vzps auxww -E <veid>

vpe

DEPRECATED

vpe <veid> 

this will allow you to do a vp when a ve is running out of control, the equivalent of (deprecated since vp operates outside the VPS):

vzctl set <veid> --kmemsize 2100000:2200000
vzctl exec <veid> ps auxw
vzctl set <veid> --kmemsize (ve’s orig lvalue):(ve’s orig hvalue)

vt

vt <veid>

the equivalent of: vztop -E <veid>

vr

vr <veid>

the equivalent of: vzctl stop <veid>; vzctl start <veid>. You can run this even if the ve is down - the stop command will just fail.

vs

vs [veid]

displays status (output of vzctl status <veid>) on each ve configured on the system (as determined by /vzconf/*.conf). If passed an argument, gives the status for just that ve. A running system looks like: VEID 16066 exist mounted running

A ve which is not running (but does exist) looks like: VEID 9990 exist unmounted down

A ve which is not running and doesn’t exist looks like: VEID 421 deleted unmounted down

vs2

vs2 [veid]

this is similar to vs in that it displays status (output of vzctl status <veid>) on each ve, but the difference is its list comes from doing an ls on the data dirs. This was meant to catch the rare case where a ve exists on disk but isn't configured.

vw

vw [veid]

displays the output of 'w' (the equivalent of vzctl exec <veid> w) for each configured ve (as determined by /vzconf/*.conf). Useful for determining which ve is contributing to a heavily-loaded system. If passed an argument, gives 'w' output for just that ve. Ex:

[root@virt2 etc]# vw
134
 10:52pm  up 79 days,  6:14,  0 users,  load average: 0.02, 0.02, 0.00
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
16027
  2:52pm  up 7 days, 19:54,  0 users,  load average: 0.00, 0.00, 0.00
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
16055
  2:52pm  up 79 days,  6:38,  0 users,  load average: 0.00, 0.04, 0.07
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT

vwe

vwe [constraint]

just like vw, but takes a constraint as an argument, and only shows ve's with loads >= the constraint provided. If no constraint is provided, 1 is used by default

vzs

vzs [veid]

displays the beancounter status for all ve’s, or a particular ve if an argument is passed

ve

ve <veid>

the equivalent of: vzctl enter <veid>

vx

vx <veid> <command>

the equivalent of: /usr/sbin/vzctl exec <veid> <command>

dprocs

dprocs [count]

a script which outputs a continuous report (or a certain number of reports if an option is passed) of processes stuck in the D state and which VPS’s those procs belong to.

setmem

setmem <VEID> <256|512|768|1024>

adjusts the memory resources for the VE

afacheck.sh

afacheck.sh

displays the health/status of containers and mirrors on an adaptec card (currently quar1, tempvirt1-2, virt9, virt10) - all others are LSI

backup

backup

backup script called nightly to update virt scripts and do customer backups

checkload.pl

checkload.pl

this was intended to be set up as a cronjob to watch processes on a virt when the load rises above a certain level. Not currently in use.

findbackuppigs.pl

findbackuppigs.pl

looks for files larger than 50MB which customers have asked us to back up. Emails matches to linux@johncompanies.com

gatherlinux.pl

gatherlinux.pl

gathers up data about ve’s configured and writes to a file. Used for audits against the db

gb

gb <search>

greps backup.config for the given search parameter

gbg

gbg <search>

greps backup.config for the given search parameter and presents just the directories (for clean pasting)

linuxtrafficgather.pl

linuxtrafficgather.pl [yy-mm]

sends a traffic usage summary by ve to support@johncompanies.com and payments@johncompanies.com. Optional arguments are year and month (must be in the past). If not passed, assumes last month. Relies on traffic logs created by netstatreset and netstatbackup

linuxtrafficwatch.pl

linuxtrafficwatch.pl

checks traffic usage and emails support@johncompanies.com when a ve reaches the warning level (35G) and the limit (40G). Works on virtuozzo versions <= 2.6.x. We really aren't using this anymore now that we have netflow.

linuxtrafficwatch2.pl

linuxtrafficwatch2.pl

checks traffic usage and emails support@johncompanies.com when a ve reaches the warning level (35G) and the limit (40G). Works on virtuozzo version 2.6.x. We really aren't using this anymore now that we have netflow.

load.pl

load.pl

feeds info to load mrtg - executed by inetd.

mb (linux)

mb <mount|umount>

(nfs) mounts and umounts dirs to backup2. Shortcuts are mbm and mbu to mount and unmount.

migrate

migrate <ip of node migrating to> <veid> <target_veid> [target dir: vz | vz1 | vz2]

this is basically a wrapper for vzmigrate - vzmigrate is a util to seamlessly move a ve from one host to another. This wrapper was written because virtuozzo version 2.6 had a bug where the ve's ip(s) on the src system were not properly removed from arp/route tables. This script mitigates that. Since it makes multiple ssh connections to the target host, it's a good idea to put the pub key for the src system in the authorized_keys file on the target host. In addition, it emails ve owners when their migration starts and stops (if they place email addresses in a file on their system: /migrate_notify). To move everyone off a system, you'd do:

for f in `vl`; do migrate <ip> $f; done

migrateonline

migrateonline <ip of node migrating to> <veid> <target_veid> [target dir: vz | vz1 | vz2]

this is the same as migrate but will migrate a ve in --online mode, which means it won't be shut down at the end of the migration. This only works when migrating ve's between 2 machines running a 2.6 kernel (currently tempvirt1-2, virt16-19, virt12). If you get an error that the machine you're trying to migrate to has a different CPU or features, etc., then you have to edit the file and add the -f switch to the vzmigrate line - you can basically ignore this kind of warning (but never ignore a warning about missing templates on the destination node). NOTE: This edit (if made to migrateonline) will be overwritten by the base script during each night's backup.

netstatbackup

DEPRECATED

netstatbackup 

writes traffic count data to a logfile. Works on virtuozzo versions 2.5.x.

netstatbackup2

DEPRECATED

netstatbackup2

writes traffic count data to a logfile. Works on virtuozzo versions 2.6.x.

netstatreset

DEPRECATED

netstatreset

writes traffic count data to a logfile and resets counters to 0. Works on virtuozzo versions 2.5.x

netstatreset2

DEPRECATED

netstatreset2

writes traffic count data to a logfile. Works on virtuozzo versions 2.6.x.

orphanedbackupwatchlinux

orphanedbackupwatchlinux 

looks for directories on backup2 which aren’t configured in backup.config and offers to delete them

rsync.backup (linux)

rsync.backup

does customer backups (relies on backup.config)

startvirt.pl

startvirt.pl

forks off start ve commands – keeps 6 running at a time. This is not to be used on systems where fastboot is enabled as it circumvents the benefit of the fastboot. The script will occasionally not exit gracefully and will continue to use up CPU, so it should be watched. Also, don’t exit from the script till you’re sure all ve’s are started – if you do you need to start them manually and may have to free up locks. On some systems, the startvirt script doesn’t exit cleanly and you have to ^C out of it. Be careful though- doing so can leave some VE’s in an odd bootup state and you may need to ‘vr’ them manually. You should check to see which ve’s aren’t running and/or confirm all have started when ^C’ing out of startvirt.

taskdone (linux)

taskdone

when called will email support@johncompanies.com with the hostname of the machine from which it was executed as the subject

vb (linux)

vb

the equivalent of: vi /usr/local/sbin/backup.config

vemakeXX

DEPRECATED

vemakerh9 

ve create script for RH9 (see vemake)

vemakedebian30 

ve create script for debian 3.0 (Woody) (see vemake)

vemakedebian31 

ve create script for debian 3.1 (Sarge) (see vemake)

vemakedebian40 

ve create script for debian 4.0 (Etch) (see vemake)

vemakefedora, vemakefedora2, vemakefedora4, vemakefedora5, vemakefedora6, vemakefedora7

ve create script for fedora core 1, 2, 4, 5, 6, 7 respectively (see vemake)

vemakecentos3, vemakecentos4

ve create script for centos 3, 4 respectively (see vemake)

vemakesuse, vemakesuse93, vemakesuse100

ve create script for suse 9.2, 9.3, 10.0 respectively (see vemake)

vemakeubuntu5, vemakeubuntu606, vemakeubuntu610, vemakeubuntu704

ve create script for ubuntu 5.10, 6.06, 6.10, 7.04 respectively (see vemake)

vemove

DEPRECATED

vemove <veid> <target_ip> </vz/private/123>

this script simplifies the old way of moving ve's from one system to another - in short, moving a ve to or from a virt running virtuozzo < 2.6.x. It's the equivalent of: tar cfpP - <veid> --ignore-failed-read | (ssh -2 -c arcfour <target_ip> "split - -b 1024m </vz/private/123>.tar"). This should only be used if migrate/vzmigrate can't be used.

vim.watchdog

vim.watchdog 

looks for and kills procs matching “vi|vim|nano|pine|elm” that are running for a long time & consuming lots of cpu. Works on virtuozzo versions 2.5.x

vim.watchdog2

vim.watchdog2

looks for and kills procs matching “vi|vim|nano|pine|elm” that are running for a long time & consuming lots of cpu. Works on virtuozzo versions 2.6.x.

vzmigrate

vzmigrate <target_ip> -r no <veid>:[dst veid]:[dst /vzX/private/veid]:[dst /vzX/root/veid]

(this is the raw command “wrapped” by migrate/migrateonline) this will seamlessly move a ve from one host to another. The ve will run for the duration of the migration till the very end when it’s shut down, ip moved and started up on the target system. The filesystem on the src will remain. This should be watched – occasionally the move will timeout and leave the system shut down. If target private and root aren’t specified it just puts it in /vz. Only works when both systems are running virtuozzo 2.6.x

vztrafdump.sh

DEPRECATED

vztrafdump.sh

writes traffic usage info by ve to a file called jc_traffic_dump in each ve’s / dir. Works on virtuozzo versions <= 2.5.x.

vztrafdump2.sh

DEPRECATED

vztrafdump2.sh

writes traffic usage info by ve to a file called jc_traffic_dump in each ve’s / dir. Works on virtuozzo versions 2.6.x.

addtun

addtun <veid>

Adds a tun device to a ve.

bwcap

bwcap <veid> <kbps>

Ex: bwcap 1234 512

Caps a VE's bandwidth to the amount given

setdisk

setdisk <veid> <diskspace in GB>

Ex: setdisk 1234 5

Gives a VE a given amount of disk space

vdf

vdf <veid> 

the equivalent of: vzctl exec <veid> df -h

vdff

vdff

runs a (condensed) vdf for all ve’s in your pwd (must be run from /vz/privateN)

mvbackups

mvbackups <veid> <target_machine> (virt1) <target_dir> (vz1)

moves backups from one location to another on the backup server, and provides you with option to remove entries from current backup.config, and simple paste command to add the config to backup.config on the target server

checkquota

checkquota

for all the ve’s in the cwd (run from /vz/private, /vz1/private, etc) reports what vz quota says they’re using and what the actual usage is (as reported by du)

clearquota

clearquota <veid>

Recalculates a ve’s quota, prints out the usage before and after. The equivalent of: vdf <veid>; v stop <veid>; vzquota drop <veid>; v start <veid>; vdf <veid>

dprocs

dprocs

Sometimes the servers have a large number of processes get stuck in the D state - this script shows (every 3 secs) which VE's have D procs, which procs are stuck, and a running average of the top "offenders"

vzstat

vzstat

sort of like top for VZ. Sort VEs by CPU usage by pressing the 'o' and then 'c' keys.