10 Ways to Improve LAMP

I wrote this around 2002. It's a bit outdated, but some of the points are still relevant.

  1. The Linux kernel should mention in /dev/kmsg when it kills an errant process. Most processes don't expect to die unexpectedly. If the kernel takes it upon itself to send SIGSEGV to a process, and the process doesn't handle the signal, the process probably won't report it either. From there the kernel will terminate the process, but won't say a thing. It would be nice if, when you get a page at 3AM that something isn't running anymore, you could run dmesg and see that the kernel killed something you needed because it misbehaved. This eliminates a good amount of guesswork (did it exit on its own, did someone else shut it down, wtf?).
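
Only the kernel can provide the report asked for above. From userspace, one partial workaround is to start the daemon from a small supervisor that records how it ended. What follows is a minimal Python sketch under that assumption; describe_exit and run_and_report are hypothetical names, not part of any existing tool.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a human-readable cause of death."""
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as a negative number.
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number without a symbolic name on this platform.
            name = "signal %d" % signum
        return "killed by %s" % name
    return "exited on its own with status %d" % returncode


def run_and_report(argv):
    """Run argv to completion and log why it stopped (hypothetical helper)."""
    result = subprocess.run(argv)
    message = "%s %s" % (argv[0], describe_exit(result.returncode))
    print(message, file=sys.stderr)
    return message
```

This only covers processes started through the wrapper, and it cannot say who sent the signal, so it narrows the guesswork rather than eliminating it.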

  2. Three easy changes that would improve select() functionality under glibc.

    • Perhaps document in the man page select(2) that you cannot watch more than FD_SETSIZE file descriptors? Just mentioning that the symbol exists would save a lot of time. It's set to 1024 by default, which means that either the limit won't ever be hit, or that it will be hit at 4:30AM on a Saturday.
    • Under Linux/glibc, select() will allow you to add more than FD_SETSIZE file descriptors to a descriptor set. If you do this, you basically overrun a buffer, but none of the macros tell you this. Only when the call is taken by the kernel does the kernel realize you exceeded the boundary and return -EINVAL. This causes select() to return an error condition, but the man page is again misleading because it would have you believe that if you're receiving EINVAL in errno, that you passed an argument that wasn't a file descriptor (which, from an API point of view, you didn't do).
    • Allow programs to redefine FD_SETSIZE! This way, when we go through the agony of learning the first two points by ourselves, we can correct the problem without having to modify libc headers. Yeah, we should have been using poll(2), but select(2)'s interface is easier to use, and that's no consolation when you've written an application around select and have to deal with this issue in an emergency. And it's not like the man page said to use poll because select was crippled.
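
The poll(2) route mentioned above can be sketched in a few lines. This is an illustration in Python rather than C, using the standard select.poll wrapper; wait_readable is a hypothetical helper name, and the one-second default timeout is an arbitrary choice.

```python
import os
import select


def wait_readable(fds, timeout_ms=1000):
    """Return the descriptors in fds that have data to read, using poll(2).

    poll() takes a list of descriptors instead of a fixed-size bitmap,
    so descriptor numbers at or above FD_SETSIZE are not a problem.
    """
    poller = select.poll()
    for fd in fds:
        poller.register(fd, select.POLLIN)
    # poll() returns (fd, event mask) pairs for descriptors with activity.
    events = poller.poll(timeout_ms)
    return [fd for fd, mask in events if mask & select.POLLIN]


if __name__ == "__main__":
    # Small demonstration with a pipe: write a byte, then poll the read end.
    read_end, write_end = os.pipe()
    os.write(write_end, b"x")
    print(wait_readable([read_end], timeout_ms=0))
```

The interface is admittedly clumsier than select()'s three sets, which is the trade-off the point above complains about.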

  3. BIND sure could use a configtest. Anyone who runs an ISP could tell you how many times they've modified a zone and restarted BIND only to find out later that the zone file has a syntax error and that BIND either refused to start altogether (such as BIND up to version 8), or just started and ignored the zone (version 9 and later), and noted this in the log file somewhere. Now djbdns, that's clever. The zone information is compiled into a binary database. The program that compiles the database does sanity checking. If there's a problem, the old database is left intact and an error message is displayed on your console. Forgiving the fact that BIND still uses an in-core database scheme, it would be nice to have a configtest command so that if you have goofed, you don't have to frantically search for an error while live requests are hitting your broken nameservers.

  4. BIND zone file evaluation sure could use a $MODIFIED variable. The $MODIFIED variable would be substituted with the modification timestamp of the zone file it appears in. The purpose would be to put this into the serial section of the zone file so that you don't have to keep remembering to increment it every time you make a damn change. djbdns does this too.
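
BIND has no such variable, but the idea can be approximated by preprocessing a zone template before handing it to the server. A minimal Python sketch, assuming a $MODIFIED token in the template (the token is the hypothetical one proposed above, and serial_from_mtime and expand_modified are made-up helper names):

```python
import os
import tempfile


def serial_from_mtime(zone_path):
    """Return the zone file's modification time as an SOA serial candidate.

    SOA serials are unsigned 32-bit integers, so the Unix timestamp is
    reduced modulo 2**32 to stay in range.
    """
    mtime = int(os.stat(zone_path).st_mtime)
    return mtime % (2 ** 32)


def expand_modified(zone_text, zone_path):
    """Replace the hypothetical $MODIFIED token with the computed serial."""
    return zone_text.replace("$MODIFIED", str(serial_from_mtime(zone_path)))


if __name__ == "__main__":
    # Demonstration with a throwaway template; example.com names are placeholders.
    with tempfile.NamedTemporaryFile("w", suffix=".zone", delete=False) as handle:
        handle.write("@ IN SOA ns1.example.com. admin.example.com. "
                     "( $MODIFIED 3600 900 604800 86400 )\n")
    with open(handle.name) as zone:
        print(expand_modified(zone.read(), handle.name))
    os.unlink(handle.name)
```

Two edits saved within the same second would get the same serial, so this is a convenience, not a guarantee.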

  5. Apache graceful shutdown. Apache has a graceful restart command. This is effectively an internal stop and start, but done in such a way that clients in the middle of a transaction get to finish it. There is, however, no graceful shutdown. If you have to take an Apache server down for any reason, you risk killing a client's transaction in mid-flight. It sounds trivial at first, but there are plenty of circumstances where telling a client "Connection Refused" is better than saying "We took your data, and we started doing something with it, but we stopped mysteriously without telling you what happened. Deal."

  6. MySQL sure could use a live backup feature. And InnoDB hot backup is way too unwieldy. Being able to do a snapshot dump while still serving requests would be grand. The framework for this feature is all there, so what's the freakin' holdup? Why is this needed? Let's say you want to take a live MySQL server and add a replication slave database. To synchronize the slave, you need to stop all processing on the master, take a database dump, and then restart processing. It means downtime, sometimes a lot of downtime, and downtime sucks.

  7. The filesystem sure could use a live backup feature. Unless you want to impose downtime, there's no way to take a consistent snapshot of the filesystem. As a result, you can get stuck with backup images where the beginning is hours older than the end.

  8. rsync has trouble replicating frequently changing files. Workaround: repeat rsyncs in a loop until they succeed. An ugly hack -- it's possible that it could never succeed. Per-file multi-versioning support would fix this. I want to believe patching rsync to do private mmap()s is the way to get this support, but I bet private mmaps are implemented the easy way instead of the correct way.
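
The retry-loop workaround can be written so that it at least gives up eventually. A minimal Python sketch, assuming a non-zero exit status means the transfer should be retried; run_until_success is a hypothetical helper, and the attempt cap and delay are arbitrary defaults.

```python
import subprocess
import sys
import time


def run_until_success(argv, max_attempts=5, delay_seconds=1.0):
    """Re-run argv until it exits 0 or max_attempts is exhausted.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if subprocess.run(argv).returncode == 0:
            return attempt
        print("attempt %d of %d failed" % (attempt, max_attempts), file=sys.stderr)
        if attempt < max_attempts:
            # Give the changing files a moment to settle before retrying.
            time.sleep(delay_seconds)
    return None
```

It would be called with the full rsync command line as a list, for example run_until_success(["rsync", "-a", SRC, DEST]) with your own source and destination. The cap only bounds the ugliness; it does nothing about files that never stop changing.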

  9. Apache doesn't always clean up after itself. Presume Apache shuts down ungracefully. Restarting Apache may fail because Apache can't create new shm segments since old ones are lying around. Instead of recognizing its predecessor's remains and handling them, it just quits, paralyzed, waiting for someone to figure out that they need to be manually deleted with commands that they've never used for anything else ever (ipcs, ipcrm). Workaround: wrap the Apache startup script with shm segment killers.

  10. Email notification on boot. A server should send root an email when it boots. This can raise a sys admin's awareness of a whole wealth of issues before they become real problems. Workaround: easy enough to stick this in a startup script, which is apparently how the FreeBSD crew feels about it because they do it for you.
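
The startup-script workaround might look like the following Python sketch. It assumes a local MTA accepting mail on localhost port 25 and a deliverable local "root" address; build_boot_notice and send_boot_notice are hypothetical names.

```python
import smtplib
import socket
import time
from email.message import EmailMessage


def build_boot_notice(hostname, booted_at, recipient="root"):
    """Compose the notification message without sending it."""
    msg = EmailMessage()
    msg["From"] = "root@%s" % hostname
    msg["To"] = recipient
    msg["Subject"] = "%s booted" % hostname
    msg.set_content("%s finished booting at %s.\n" % (hostname, booted_at))
    return msg


def send_boot_notice(smtp_host="localhost"):
    """Send the notice through a local MTA (assumed to listen on port 25)."""
    msg = build_boot_notice(socket.gethostname(), time.ctime())
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```

Calling send_boot_notice() late in the boot sequence, after the mail system is up, is the only wiring needed.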

A lot of these have to be learned from experience. While it makes for a good sadistic rite of passage between sys admins ("Isn't it a kicker when syslog and BIND deadlock? It took me years of rebooting a dead production server before we figured that business out! I look back on it all and laugh! Hah hah!"), I'm of the opinion that as time goes on, things should get better.


Return to Michael Bacarella's homepage