Cultured Perl: Managing Linux Configuration Files

CVS backs up, distributes, and simplifies your configuration files. In this article, Teodor Zlatanov discusses how to save time, energy and frustration when working with Linux configuration files by using your CVS tree. (This introductory-level article was first published by IBM developerWorks, June 10, 2004, at http://www.ibm.com/developerWorks).

The average developer spends more time navigating, learning, and debugging configuration files than you’d expect. But you can save that time — and loads of energy and frustration — with one of the tools you probably use every day: your CVS tree. Take these tips on backing up, distributing, and making portable your peskiest Linux™ (and UNIX®) config files.

Working with configuration files can be a bewildering part of using Linux and computers in general. No standards exist, though several have been proposed. For example, Samba and rsync use INI-style configurations; passwd is in a decades-old colon-separated format that doesn’t allow colons in any field; sudo comes with a visudo program to keep people from entering wrong information in the sudoers file; Emacs uses Lisp for configuration files. And the list goes on…

Now, I’m not complaining about the variety of configuration files. I understand the historical and practical reasons for this Configuration Tower of Babel. Changing the Samba configuration format, for instance, would annoy thousands upon thousands of administrators. In another example, Emacs’ internal language is Lisp, a powerful high-level language, so using anything else for Emacs configuration files would be ridiculous.

No, my point is the effect all this variety has on the Linux user: a large portion of a Linux user’s computer time is spent learning, writing, and debugging configuration files. Thus, it is useful to have a system in which these configuration files (1) are backed up automatically, (2) are distributed automatically, and (3) work on multiple flavors of UNIX and distributions of Linux. This article explains how to achieve the first two goals, and gets you started on the road to achieving the third one.

The Plan

We’ll use CVS to hold the configuration files. Feel free to use any other versioning system. Subversion is gaining popularity quickly. The FSF has GNU tla (GNU arch), another nice versioning system. The essential features you need are provided by all those and many others, including the non-free ones like Rational® ClearCase®.

In my configuration scheme, each configuration file is in a single directory or in one of its subdirectories. The configuration files are named uniquely, and the directories denote machines or platforms rather than location. Thus, the file name maps uniquely to a location in the filesystem. For example, passwd will always be used for /etc/passwd, while cshrc will be used for /home/tzz/.cshrc for user tzz.

For a few programs I use daily, I’ll show how I handle multiple platforms with the help of my configuration system and changing the configuration files themselves.

All the examples I show use the C shell to set environment variables. Modifying them to use GNU bash or something else should not be terribly difficult.

IBM developerWorksVisit developerWorks for thousands of developer articles, tutorials, and resources related to open standard technologies, IBM products, and more. See developerWorks.

Setting up CVS

You probably already have CVS installed on your machine. If not, get it (see the Resources section) and install it. If you are using another versioning system, try to set up something similar to what I show below.

First of all, you need to create a CVS repository. I’ll assume you have access to a machine that can be used as a CVS server through OpenSSH or Pserver CVS access (Pserver is the communication protocol for CVS; see Resources for more information). Then, you need to create a module called config, which I will use to hold the sample configuration files. Finally, you need to arrange a way to use your CVS repository remotely non-interactively, through OpenSSH, Pserver, or whatever is appropriate. This last point is highly dependent on your particular system administration skills, level of paranoia, and environment, so I can only point you to some information in the Resources. I will assume you have configured non-interactive (ssh-agent) logins through OpenSSH for the rest of this article.

Listing 1. Set up the CVS repository on a machine

# assume that /cvsroot is your repository’s home
> setenv CVSROOT /cvsroot
# this will use $CVSROOT if no -d option is specified
> cvs init
# check that it worked
> ls /cvsroot
# you should see one directory called CVSROOT
CVSROOT

Now that the repository is set up, you can continue using it remotely (you can do the steps below on the CVS server, too — just leave CVSROOT as in Listing 1).

Listing 2. Remotely add the config module to CVS

# user tzz, machine home.com, directory /cvsroot is the CVSROOT
> setenv CVSROOT tzz@home.com:/cvsroot
# use SSH as the transport
> setenv CVS_RSH ssh
# use a temporary directory for the module creation
> cd /tmp
> mkdir config
> cd config

# tzz is the "vendor name" and initial is the "release tag", they can
# be anything; the -m flag tells CVS not to ask us for a message
# if this fails due to SSH problems, see the Resources
> cvs import -m '' config tzz initial
No conflicts created by this import
# now let’s do a test checkout
> cd ~
> rm -rf /tmp/config
> cvs co config
cvs checkout: Updating config
# check everything is correct
> ls config
CVS

Now you have a copy of the config CVS module checked out in your home directory; we’ll use that as our starting point. I’ll use my user name tzz and home directory /home/tzz in this article, but, of course, you should use your own user name and directory as appropriate.
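
For readers on a Bourne-style shell such as bash, the environment setup from Listing 2 might look like the sketch below. It assumes the same example user, host, and repository path used in this article; only the syntax changes, with csh's setenv NAME value becoming export NAME=value.

```shell
# Bourne-shell (sh/bash) equivalent of the csh setenv lines in Listing 2.
# The values are this article's examples; substitute your own.
export CVSROOT=tzz@home.com:/cvsroot
# use SSH as the transport
export CVS_RSH=ssh
```

The cvs commands themselves are the same in either shell.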

Let’s create a single file. The CVS options file, cvsrc, seems appropriate since we’ll be using CVS a lot more.

Listing 3. Create and add the cvsrc file

> cd ~/config
> echo "cvs -z3" > cvsrc
> echo "update -P -d" >> cvsrc
> cvs add cvsrc
# you really don't need log messages here
> cvs commit -m ''
> ln -s ~/config/cvsrc ~/.cvsrc

From this point on, all your CVS options will live in ~/config/cvsrc, and you will update that file instead of ~/.cvsrc. The specific options you added tell CVS to compress network traffic (-z3), to retrieve directories when they don’t exist (update -d), and to prune empty directories (update -P). This is usually what users want. For the remaining machines you want to set up this way, you need to check out the config module again and make the link again.

Listing 4. Check out the config module and make the cvsrc link

> cd ~
# set the following two for remote access
> setenv CVSROOT …
> setenv CVS_RSH …
# now check out “config” — this will get all the files
> cvs checkout config
> cd ~/config
> ln -s ~/config/cvsrc ~/.cvsrc

You may also know that Linux allows for hard links in addition to the symbolic ones you just created. Because of the limitations of hard links, they are not suitable for this scheme. For instance, say you create a hard link, ~/.cvsrc, to ~/config/cvsrc and later you remove ~/config/cvsrc (there are many ways this could happen). The ~/.cvsrc file would still hold the old contents of what used to be ~/config/cvsrc. Now, you check out ~/config/cvsrc again. The ~/.cvsrc file, however, will not be updated, because it still refers to the old file’s data rather than to the name ~/config/cvsrc. That’s why symbolic links are better in this situation.
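
The difference is easy to reproduce in a scratch directory. The following is a minimal Bourne-shell sketch, not part of the author's setup; the file names hard_cvsrc and soft_cvsrc are made up for the demonstration, and removing and re-creating cvsrc stands in for losing the file and checking it out again.

```shell
# Show that a symbolic link follows a re-created file while a hard link
# does not. Everything happens in a throwaway directory.
demo=$(mktemp -d)

echo "old contents" > "$demo/cvsrc"
ln "$demo/cvsrc" "$demo/hard_cvsrc"       # hard link: second name for the same data
ln -s "$demo/cvsrc" "$demo/soft_cvsrc"    # symbolic link: pointer to the name cvsrc

# Simulate removing the file and checking it out again.
rm "$demo/cvsrc"
echo "new contents" > "$demo/cvsrc"

hard_now=$(cat "$demo/hard_cvsrc")   # still the data of the removed file
soft_now=$(cat "$demo/soft_cvsrc")   # resolves the name again, sees the new file
echo "hard link: $hard_now"
echo "symlink:   $soft_now"
```

The hard link keeps showing the old contents, while the symbolic link picks up the new file.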

Let’s say you change cvsrc to add one more option:

Listing 5. Modify and commit cvsrc

> cd ~/config
# append (>>) so the existing options are kept
> echo "checkout -P" >> cvsrc
> cvs commit -m ''

Now, to update ~/.cvsrc on every other machine you use, just do the following:

Listing 6. Update the config module on another machine

> cd ~/config
> cvs update

This is nice and easy. What’s even nicer is that the CVS update shown above will update every file in ~/config, so all the files you keep under this CVS scheme will be up-to-date at once with one command. This is the essence of the configuration scheme shown here; the rest is just window dressing.

Note that once you’ve checked out a module, there’s a directory in it called “CVS.” The CVS directory has enough information about the CVS module that you can do update, commit, and other CVS operations without specifying the CVSROOT variable.
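
To see what CVS recorded, you can read the plain-text files Root and Repository inside that CVS directory. The helper below is a small Bourne-shell sketch; the function name cvs_origin is hypothetical and exists only for this illustration.

```shell
# cvs_origin DIR: print the repository location and module path that CVS
# recorded when DIR was checked out. CVS/Root and CVS/Repository are
# plain-text files kept inside every checked-out directory.
cvs_origin() {
    dir=$1
    if [ -f "$dir/CVS/Root" ] && [ -f "$dir/CVS/Repository" ]; then
        echo "root:       $(cat "$dir/CVS/Root")"
        echo "repository: $(cat "$dir/CVS/Repository")"
    else
        echo "$dir is not a CVS working directory" >&2
        return 1
    fi
}
```

Running cvs_origin ~/config on a checked-out copy should print the CVSROOT that was in effect at checkout time, which is why later updates and commits work without the variable.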

Automatic updates and commits

For automatic updates and commits, I have written a very simple Perl program, maintain.pl. The longest part of the program is the help text, so you can imagine it’s not full of complex code. I will go through it regardless, but keep in mind that a shell script could do the same job if needed.

The only thing maintain.pl does not do is make the symbolic links. Since that has to be done just once, and on some systems you do not want the links wholesale, the complexity of the task compared to the simplicity of doing it manually was simply too much. I know because I wrote the symbolic link code and got rid of it later.

I had to write and maintain yet another configuration file that mapped out many filenames. There were many exceptions; for example, two Linux and Solaris systems I use have radically different setups. There were just too many things to worry about, and I found that manually installing the links was much easier. Of course, your experience may vary — I encourage you to try to find the most appropriate approach for your own environment.

The maintain.pl script begins with the usual definition of configuration options, loading of command-line arguments, and help text.

Listing 7. Preliminaries in the maintain.pl script

#!/usr/local/bin/perl -w

# {{{ modules and constants
use strict;
use AppConfig qw/:expand :argcount/;
# }}}

$| = 1;       # autoflush the output

my $config = AppConfig->new();
$config->define(
 'HELP'     =>
 { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => 0, ALIAS => 'H'},
# update level, higher checks out more
 'LEVEL'    =>
 { ARGCOUNT => ARGCOUNT_ONE,  DEFAULT => 5 },
 'CONFFILE' =>
 { ARGCOUNT => ARGCOUNT_ONE,  ALIAS => 'F',
   DEFAULT => glob("~/config/maintain.conf") },
 'CVS'      =>
 { ARGCOUNT => ARGCOUNT_ONE,  DEFAULT => 'cvs' },
 'CVS_RSH'  =>
 { ARGCOUNT => ARGCOUNT_ONE,  DEFAULT => 'ssh' },
 'UPDATE'   =>
 { ARGCOUNT => ARGCOUNT_HASH },
 'DRYRUN'   =>
 { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => 0, ALIAS => 'N' },
 'COMMIT'   =>
 { ARGCOUNT => ARGCOUNT_NONE, DEFAULT => 0, ALIAS => 'C' },
);

$config->args();
if (-r $config->CONFFILE() && -f $config->CONFFILE())
{
 $config->file($config->CONFFILE());
}
else
{
 print "The file " . $config->CONFFILE() .
       " was not readable, skipping\n";
}

if ($config->HELP())
{
 print <<EOHIPPUS;

$0

Run $0 without any arguments to load
@{[$config->CONFFILE()]}
and update everything in it at level
@{[$config->LEVEL()]} or less.

Switches:
 -level (default @{[$config->LEVEL()]}) :
   check out everything at this level or less

 -help (-h) : print this help

 -conffile (-f, default @{[$config->CONFFILE()]}) :
   load this configuration

 -cvs (default @{[$config->CVS()]}) :
   where to find the cvs program

 -cvs_rsh (default @{[$config->CVS_RSH()]}) :
   sets the CVS_RSH environment variable

 -update : populate the UPDATE hash in the configuration
           file or like this:
           -update /home/tzz/           see below for explanation

 -commit (-c) : don't just update, also do a commit of
                anything changed

 -dryrun (-n) : don't run anything, just test directories
                and levels

Configuration file:

Very simple AppConfig format; everything in the switches can be
specified in the configuration file as well, e.g.

COMMIT = 1
UPDATE /home/tzz/config = 0

The example above says that /home/tzz/config will be updated at level
0 or higher, and that you always want to commit when you run this
program.

EOHIPPUS

 exit 0;
}

$ENV{CVS_RSH} = $config->CVS_RSH();

If you are unfamiliar with the AppConfig module, you should check out the Resources section for useful info on managing configurations.

I do a glob() call to determine the default CONFFILE, because the user’s home directory could be anywhere. If the CONFFILE contains invalid data, AppConfig automatically kills the whole program (this can be changed to be just a warning). The script can even run without a configuration file.

After printing out the help text, I set the CVS_RSH environment variable to the appropriate value (defaults to ssh). This is so that the user does not have to set that environment variable in some other way, which is especially convenient for users who put maintain.pl in their crontab.

After all these preliminaries, let’s look at the heart of the script:

Listing 8. Main loop of maintain.pl

foreach my $spot (keys %{$config->UPDATE()})
{
 # skip spots whose level is above the requested LEVEL
 my $level = 0 + $config->UPDATE()->{$spot};
 next if $level > $config->LEVEL();
 print "Spot $spot, Level $level\n";
 chdir $spot;
 if ($config->DRYRUN())
 {
  print "Not updating due to DRYRUN\n";
 }
 else
 {
  system($config->CVS() . " -q update");
 }

 if ($config->COMMIT())
 {
  if ($config->DRYRUN())
  {
   print "Not committing due to DRYRUN\n";
  }
  else
  {
   # commit everything with an empty log message
   system($config->CVS() . " commit -m ''");
  }
 }
}

This is a simple loop. I run through every spot, which is really a directory, and do a cvs update if the spot’s level is less than or equal to the LEVEL configuration variable, defaulting to 5. In addition, if the COMMIT flag is set, I do a cvs commit -m '', which commits all changes with an empty log message. In fact, if it weren’t for the DRYRUN flag, this loop would be just a few lines long.

I use system() with the string form instead of the multiple argument form. You could do it the second way — see perldoc -f system for details on the usage of this function call.

Also, I don’t check the result of the system() call, because it’s unnecessary. There’s nothing maintain.pl can (or should) do in the case of a CVS update or commit problem, since these are crucial configuration files we don’t want to update blindly.

The configuration file is simplicity itself:

Listing 9. maintain.conf

# the number is the update level
UPDATE /home/tzz/emacs = 0
UPDATE /home/tzz/config = 0
UPDATE /home/tzz/articles = 1
UPDATE /home/tzz/gnus/gnus = 1

Remember you can set any AppConfig variable here, so you can override the default LEVEL or CVS_RSH, for instance. I update my Emacs, config, articles, and gnus directories through maintain.pl, but their update levels are different to reflect the frequency with which I update (I do level 0 twice every day and level 1 once daily).
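
As noted earlier, a shell script could do the same job. The level check at the heart of the loop can be sketched in Bourne shell as follows. This is an illustration, not the author's code: the function name select_spots is hypothetical, it assumes each entry is written exactly as in Listing 9 (with spaces around the equals sign and no spaces in the directory name), and it only selects directories rather than running cvs.

```shell
# select_spots MAXLEVEL: read maintain.conf-style text on stdin and print
# each directory whose update level is less than or equal to MAXLEVEL.
# Only lines of the form "UPDATE /some/dir = N" are considered.
select_spots() {
    max_level=$1
    while read -r keyword dir equals level; do
        [ "$keyword" = "UPDATE" ] || continue   # skip comments and other settings
        [ "$equals" = "=" ] || continue         # skip malformed entries
        if [ "$level" -le "$max_level" ]; then
            echo "$dir"
        fi
    done
}

# Which directories from Listing 9 would a level-0 run touch?
select_spots 0 <<'EOF'
# the number is the update level
UPDATE /home/tzz/emacs = 0
UPDATE /home/tzz/config = 0
UPDATE /home/tzz/articles = 1
UPDATE /home/tzz/gnus/gnus = 1
EOF
```

With a maximum level of 0, only the emacs and config directories are printed; raising the level to 1 would add articles and gnus.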

Organizing your new configuration

This section will cover my personal experiences with using the configuration system you’ve set up so far. Take ideas freely, but remember that my personal setup is not right for everyone.

I keep directories based on machines and operating systems, as specific as they need to be. For instance, I keep my Linux-specific configurations under “linux” but, because my home machine “heechee” has a specific keyboard, I have a heechee directory as well for the heechee-specific configurations.

The overriding rule, though, should be that if you can express a configuration in one file instead of multiple versions for multiple platforms, do it. Otherwise you’ll spend most of your time maintaining two or more versions of the same file, and that’s not fun.

Let’s start with an example from my cshrc file, which has one version for all machines. I take advantage of the C shell language’s built-in decision logic to pick the right command for each platform:

Listing 10. Define the precmd for various platforms

switch ($OSTYPE)
 case "solaris":
 case "SunOS":
  alias precmd '/bin/echo "\033]0;${HOST}:$cwd\007\c"'
 breaksw
 case "linux":
  alias precmd 'echo -n "\033]0;${HOST}:$cwd\007"'
 breaksw
endsw

The commands above specify different versions of the same thing. The Linux echo needs an -n switch to avoid printing a new line, while the Solaris version needs a \c at the end of the string. The effect of this is to set the title of an xterm window to HOST:/DIRECTORY whenever a prompt is printed.
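
In a Bourne-style shell, one way to sidestep the echo difference altogether is printf, which never appends a newline unless asked to. The sketch below is an alternative to the author's aliases rather than a translation of them: the function name xterm_title is hypothetical, and the PROMPT_COMMAND line assumes bash, where that variable plays roughly the role of the C shell's precmd.

```shell
# xterm_title HOST DIR: emit the escape sequence that sets an xterm title.
# printf interprets \033 (ESC) and \007 (BEL) in its format string and
# adds no trailing newline, so neither "-n" nor "\c" is needed.
xterm_title() {
    printf '\033]0;%s:%s\007' "$1" "$2"
}

# bash runs PROMPT_COMMAND before printing each prompt.
# HOSTNAME is set by bash; other shells may need $(hostname) instead.
PROMPT_COMMAND='xterm_title "$HOSTNAME" "$PWD"'
```

Because the same line works on both platforms, no switch on the operating system is needed for this particular setting.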

Clearly, whenever you can make decisions in the configuration file itself, you usually don’t need to make multiple versions of the same file in distinct directories. My Emacs configuration, for instance, has just one version for all six or so varieties of machines I use regularly — and some of them are running Emacs 20, which is many years old!

Sometimes you do have to do some splitting. The xmodmaprc file, for instance, sets up mapping between keycodes and key names (among many other things it can do). I keep a version for my home machine in ~/config/heechee/xmodmaprc and another version in ~/config/sun/xmodmaprc for all the Sun machines I use. There is no logic in the xmodmaprc format, so splitting it is the only recourse. I did, however, create just one xmodmaprc file for all the Sun machines, because all of them have the same keyboard model.
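
If you would rather not remember which version applies to which machine, the lookup can be scripted. The following Bourne-shell sketch is purely illustrative: the function find_config is hypothetical, and the search order (machine directory, then platform directory, then the shared top level) is an assumption about how such a tree might be prioritized, not something the layout itself enforces.

```shell
# find_config ROOT HOST PLATFORM NAME: print the most specific version of
# a configuration file, checking the machine directory first, then the
# platform directory, then the shared top level. Returns 1 if none exists.
find_config() {
    root=$1
    host=$2
    platform=$3
    name=$4
    for candidate in "$root/$host/$name" "$root/$platform/$name" "$root/$name"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```

For example, find_config ~/config heechee linux xmodmaprc would print ~/config/heechee/xmodmaprc on the home machine described above, and the result could then be passed to ln -s by hand.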

The crontab file (which I keep in ~/.crontab and periodically reload into crontab) is an extreme example of a configuration file that needs to be specific to each machine. The crontab from my home machine would be inappropriate for any other machine, and there is no logic in the standard crontab format to choose between cron jobs based on anything other than time.

The bottom line is that you should figure out if multiple versions of a configuration file are needed, and then decide the best way to organize those multiple versions. Your goal should be to have a consistent environment, not to spend hours upon hours writing and maintaining configuration files. I hope the techniques explained in this article prove useful in your search for configuration Nirvana.

Conclusion

I hope you found this article interesting and useful. Take what you can from it — I’ve spent years perfecting my setup, and it should serve you in good stead.

Convert to this scheme a little at a time, and don’t get overwhelmed. You can easily spend days rewriting your configurations, so do it gradually and you’ll enjoy the process.

The greatest benefit you’ll see is the automatic update function. On any of your machines, you can commit a file and it will show up everywhere else the next time maintain.pl is run! Even if you disagree with the directory structure, think about the power of the automatic updates and how they can be useful to you.

The second benefit you get is configuration archiving. Every version of your configurations will be in the revision control system! If you make a mistake, you can go back to an earlier version. If you lose a whole machine to, say, disk failure, you can recover all the time-consuming configuration files you wrote for it in minutes.

Don’t be tempted to convert everything to this scheme. Convert just the things you want to keep or reuse. Binary files don’t work well with CVS — at the very least, you won’t have the diff capability that CVS provides for text files. Also, CVS has trouble with renaming directories, although it’s certainly possible if you also rename the directory in the repository.

Finally, keep good backups of your CVSROOT repository, wherever it is. I hope you never need them.

Resources

Download the maintain.pl script and the maintain.conf configuration file used in this article.

Read all of Ted’s Perl articles in the Cultured Perl columns on developerWorks.

CVS home contains many CVS-related links. Free software versioning systems include Subversion and GNU arch (also known as GNU tla). Commercial offerings include Rational ClearCase.

Essential CVS (O’Reilly & Associates, 2003) by Jennifer Vesperman is a good CVS overview, and CVS Pocket Reference, 2nd edition (O’Reilly & Associates, 2003) by Gregor Purdy is an excellent quick reference to CVS — I highly recommend it.

Open Source Development with CVS, 3rd Edition (Paraglyph Press, 2003) by Karl Fogel and Moshe Bar is a freely available online book; you can also purchase a copy at the bookstore.

Version Control with Subversion (O’Reilly & Associates, 2004) is an interesting read.

dotfiles.com is an excellent resource for learning about configuring the C shell, bash, Emacs, and many, many other Linux and UNIX programs. It’s highly recommended; just don’t blame us when you spend your whole weekend browsing the site.

OpenSSH is a standard, free, and very good implementation of the SSH protocol. CVS Pserver is good for allowing anonymous CVS access, but it is insecure.

OpenSSH non-interactive logins with the help of an ssh-agent are explained in OpenSSH key management (developerWorks, July 2001), a three-part series by Daniel Robbins.

AppConfig is a CPAN module for parsing command-line options and configuration files. In Cultured Perl: Application configuration with Perl (developerWorks, October 2000), Ted demonstrates how the AppConfig module can handle local configuration storage for Perl programs, and how such configurations can be stored in a database that can then be accessed from any machine on the network.

You may also want to read Understanding Linux configuration files (developerWorks, December 2001), which explains those configuration files on a Linux system that control user permissions, system applications, daemons, services, and other administrative tasks.

Meanwhile, Debugging configure (developerWorks, December 2003) discusses what to do when good config files go bad, and an automatic configuration script doesn’t work. Tips for users as well as for developers help you to keep failures to a minimum.

Find more resources for Linux developers in the developerWorks Linux zone.

Purchase Linux books at discounted prices in the Linux section of the Developer Bookstore.

Develop and test your Linux applications using the latest IBM tools and middleware with a developerWorks Subscription: you get IBM software from WebSphere®, DB2®, Lotus®, Rational®, and Tivoli®, and a license to use the software for 12 months, all for less money than you might think.

Download no-charge trial versions of selected developerWorks Subscription products that run on Linux, including WebSphere Studio Site Developer, WebSphere SDK for Web services, WebSphere Application Server, DB2 Universal Database Personal Developers Edition, Tivoli Access Manager, and Lotus Domino Server, from the Speed-start your Linux app section of developerWorks. For an even speedier start, help yourself to a product-by-product collection of how-to articles and tech support.
