VPS Management
== Problems with the quad/safe files ==

When you run the quad/safe files, two problems can occur: either a particular system will hang during initialization, or a system will spit output to the screen, impeding your ability to do anything. Or both.

First off, when you start a jail, you see output like this:

<pre>Skipping disk checks ...
adjkerntz[25285]: sysctl(put_wallclock): Operation not permitted
Doing initial network setup:.
ifconfig: ioctl (SIOCDIFADDR): permission denied
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
Additional routing options: TCP keepalive=YESsysctl: net.inet.tcp.always_keepalive: Operation not permitted.
Routing daemons:.
Additional daemons: syslogd.
Doing additional network setup:.
Starting final network daemons:.
ELF ldconfig path: /usr/lib /usr/lib/compat /usr/X11R6/lib /usr/local/lib
a.out ldconfig path: /usr/lib/aout /usr/lib/compat/aout /usr/X11R6/lib/aout
Starting standard daemons: inetd cron sshd sendmail sendmail-clientmqueue.
Initial rc.i386 initialization:.
Configuring syscons: blanktime.
Additional ABI support:.
Local package initialization:.
Additional TCP options:.</pre>

Now, look at this line near the end:

<tt>Local package initialization:.</tt>

This is where the list of daemons that are set to start at boot time will show up. You might see something like:

<tt>Local package initialization: mysqld apache sendmail sendmail-clientmqueue</tt>

Or something like this:

<tt>Local package initialization: postgres postfix apache</tt>

The problem is that many systems (about 4-5 per machine) will hang on that line. The startup gets part of the way through the daemons to be started:

<tt>Local package initialization: mysqld apache</tt>

and then just sits there. Forever. Fortunately, pressing ctrl-c will break out of it.
Not only will it break out of it, but it will also continue on that same line and start the other daemons:

<tt>Local package initialization: mysqld apache ^C sendmail-clientmqueue</tt>

and then go on to finish the startup and move to the next system to be started.

So what does this mean? It means that if a machine crashes, and you start four screen windows to run four quads or four safes, you need to periodically cycle between them and see if any systems are stuck at that point, causing their quad/safe file to hang.

A good rule of thumb: if you see a system at that point in the startup, give it another 100 seconds. If it is still at the exact same spot, hit ctrl-c. It's also a good idea to go back into the quad file (just before the first command in the jail startup block) and note that this jail tends to need a ctrl-c or more time, as follows:

<pre>echo '### NOTE ### slow sendmail'
echo '### NOTE ###: ^C @ Starting sendmail.'</pre>

'''NEVER''' hit ctrl-c repeatedly if you don't get an immediate response - that will cause the following jail's startup commands to be aborted.

A second problem that can occur is that a jail (maybe the first one in that particular quad/safe, maybe the last one, or maybe one in the middle) will start spitting out status or error messages from one of its init scripts. This in itself is not a problem. Hit enter a few times and see if you get a prompt. If you do get a prompt, the quad/safe script has already completed, so it is safe to log out (and log out of the user that you su'd from) and then log back in (if necessary).

The tricky case is when a system in the middle starts flooding the screen with messages, and you hit enter a few times and don't get a prompt. Are you not getting a prompt because some subsequent system is hanging at initialization, as discussed above? Or are you not getting a prompt because that quad file is currently running an fsck?
Usually you can tell by scrolling back in screen's history to see what it was doing before you started getting the messages. If you don't get clues from the history, you have to use your judgement: instead of giving it 100 seconds to respond, perhaps give it 2-3 minutes. If you still get no response (no prompt) when you hit enter, hit ctrl-c.

However, be aware that you might still be hitting ctrl-c in the middle of an fsck. If so, you will get an error like "filesystem still marked dirty", the vnconfig for that system will fail, so will its jail command, and the next system in the quad file will then begin starting up. If this happens, wait until all the quad files have finished, and then start that system manually.

If things really get weird (a screen flooded with errors, no prompt, and ctrl-c does nothing), give it ten minutes or so, then kill that window with ctrl-p, then k. Log in again, manually check which systems are now running and which aren't, and manually start up any that are not.

Don't EVER risk running a particular quad/safe file a second time. If a quad/safe script gets executed twice, reboot the machine immediately.

So, for all the above reasons, any time a machine crashes and you run all the quads or all the safes, '''always''' check every jail afterwards to make sure it is running - even if you had no hangs or complications at all. Run this command:

<tt>[[#jailpsall|jailpsall]]</tt>

and make sure they all show as running.

Note: [[#postboot|postboot]] also populates ipfw counts, so it '''should not be run multiple times'''. Use <tt>jailpsall</tt> for any subsequent extensive ps'ing.

If a jail does not show as running, first check its /etc/rc.conf file to see if it is using a different hostname before starting it manually.
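
The "check every jail afterwards" step can be made less error-prone by comparing two lists instead of eyeballing the output. The following is only a sketch under stated assumptions: <tt>missing_jails</tt> and both input files are hypothetical, and the format of <tt>jailpsall</tt> output is not shown on this page, so producing the list of running hostnames is left to you. Each file is assumed to hold one jail hostname per line.

```shell
#!/bin/sh
# missing_jails EXPECTED RUNNING   (hypothetical helper, not an existing tool)
# EXPECTED: file listing the hostnames that should be running, one per line.
# RUNNING:  file listing the hostnames that are running, one per line.
# Prints every hostname found in EXPECTED but not in RUNNING.
missing_jails() {
    expected_sorted=$(mktemp) || return 1
    running_sorted=$(mktemp) || { rm -f "$expected_sorted"; return 1; }

    # comm requires sorted input; -u also drops duplicate hostnames
    sort -u "$1" > "$expected_sorted"
    sort -u "$2" > "$running_sorted"

    # -2 hides lines unique to RUNNING, -3 hides lines common to both,
    # leaving only the lines unique to EXPECTED
    comm -23 "$expected_sorted" "$running_sorted"

    rm -f "$expected_sorted" "$running_sorted"
}
```

Any hostname this prints is a candidate for a manual start, after the /etc/rc.conf hostname check described above. A jail that runs under a different hostname than expected will show up here as missing, which is exactly the case that check is meant to catch.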
One thing we have implemented to alleviate these startup hangs and noisy jails is to put jail start blocks that are slow or hangy at the bottom of the safe/quad file. Further, for each bad jail we add a note in each quad/safe, just before the start block, something like:

<tt>echo '### NOTE ### ^C @ Local package initialization: pgsqlmesg: /dev/ttyp1: Operation not permitted'</tt>

That way we'll be prepared to ^C when we see that message appear during the quad/safe startup process.

If you observe a new, undocumented hang, then '''after''' the quad/safe has finished, place a line similar to the above in the quad file, move the jail start block to the end of the file, and run [[#buildsafe|buildsafe]].
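
Before kicking off a quad/safe, it can help to review which hangs are already documented in it. A minimal sketch, assuming the notes all contain the literal marker <tt>### NOTE ###</tt> as in the examples above; the function name <tt>list_notes</tt> and the file path are hypothetical:

```shell
#!/bin/sh
# list_notes QUADFILE   (hypothetical helper, not an existing tool)
# Prints every line of QUADFILE containing the "### NOTE ###" marker.
# -n prefixes each match with its line number in the file.
# -F treats the marker as a fixed string, not a regular expression.
list_notes() {
    grep -nF '### NOTE ###' "$1"
}

# Usage (path is a placeholder):
#   list_notes /path/to/quadfile
```

The line numbers make it easy to see whether the noted jails actually sit near the end of the file, where the slow or hangy start blocks are supposed to live.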