Scripts for Linux and HP-UX

by
David W. Eaton
602-953-0336
dwe@arde.com
Artronic Development
4848 E. Cactus Rd. - Suite 505-224
Scottsdale, Arizona 85254
http://www.arde.com

Chip Richards
(email now inactive)
NiEstu
P.O. Box 39489
Phoenix, Arizona 85069

Original: March 1999
Updated: March 2003

Introduction

This presentation assumes you already have a working knowledge of Unix. We will discuss techniques and scripts used to help administer several networks of HP-UX and Linux machines. This includes scripts for:

administrators
developers
users

Some of the scripts were created to help former Apollo Domain/OS users "bridge" to a UNIX environment.

The scripts discussed here were made available to conference attendees. Most code is provided under the provisions of the Perl "Artistic license" and should be considered "examples", not "product". It is likely you will need or want to alter it at your site to suit your needs. The individual scripts are contained in one of several different freely available packages.

Scripts for administrators

Replicate links (repln)

This script replicates symbolic links. It is intended to be used to propagate an /opt structure, mostly net-supported packages, across an internal network. The system we use to manage the contents of our /opt directory tries to address three different problem areas:

Package directory naming
Where things live
Search paths

Package directory naming

To take the three areas in order, let's talk about package naming first. What we do is keep the actual package directories (containing the bin, lib, or whatever other directories the package presents to the world) somewhere else besides "/opt". They can actually be just about anywhere, but we have chosen as a convention the directory "/opt/pkg". In that directory, the application package directories all have version numbers, like "/opt/pkg/sced-0.94", "/opt/pkg/xv-3.10a", and so on.

In the "/opt" directory itself, we have only softlinks to those directories. The softlink names do not have version numbers, so we have, for example, "/opt/sced -> pkg/sced-0.94", "/opt/xv -> pkg/xv-3.10a", and so on. When we roll out a new version of something, we create a new app directory in pkg, for example "/opt/pkg/sced-1.0", and swap the link to the new directory, so we now have "/opt/sced -> pkg/sced-1.0".
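The link swap described above can be sketched in a few lines of Python. This is only an illustration of the idea, not part of any of the packages discussed here; the function name swap_link and the temporary ".new" suffix are assumptions made for the example.

```python
import os

def swap_link(opt_dir, app, version):
    """Point opt_dir/app at pkg/app-version, replacing any existing link."""
    # Relative target, matching the "/opt/sced -> pkg/sced-1.0" form in the text.
    target = os.path.join("pkg", f"{app}-{version}")
    if not os.path.isdir(os.path.join(opt_dir, target)):
        raise FileNotFoundError(f"no such package directory: {target}")
    link = os.path.join(opt_dir, app)
    tmp = link + ".new"              # hypothetical temporary name
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(target, tmp)
    os.replace(tmp, link)            # rename the new link over the old one
    return link
```

Calling swap_link("/opt", "sced", "1.0") would leave "/opt/sced -> pkg/sced-1.0". Building the new link under a temporary name and renaming it means there is no moment at which "/opt/sced" is missing.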

So the world (user shells, other apps, compiles, links, etc.) sees simple names without version numbers, e.g. "/opt/sced/bin", "/opt/libjpeg/lib", etc. The administrator can see the versioned names, and can keep as many parallel versions on hand as needed. That's all there is to it.

All the rest of our scheme is independent of the above.

Where things live

Secondly, our scheme tries to make it easy to manage things across a network. We hate distributing all /opt files across every machine in our network, and not just because it eats disk space. It's also a nightmare when updating. So what we have in fact is a directory on one single machine which is our "master opt node". It is really the only node that has a /opt structured as described above. All the other nodes have a link in their /opt directories called "mstr", which points to the master opt node. Since we have NFS automounting, our links look like "/opt/mstr -> /net/srvr01/opt". And all the app links just reference the master node: "/opt/sced -> mstr/sced", "/opt/xv -> mstr/xv", etc.

On another network, we have divorced it even further -- the master opt node has all the opt stuff living in a separate directory on another partition, called "/_c/opt.10", so that we can also have "opt.9", "opt.11", and even "opt.apollo" or "opt.linux" if we want. On that node, its master link is "/opt/mstr -> /_c/opt.10", and its opt is all full of "mstr/appname" links just like a regular network node. A further refinement uses the "mstr" links to point back to "/opt" itself, for a more self-contained structure. So the /opt data really can live anywhere.

A sub-part of the "where things live" question is where to keep the sources. First, we keep the original distribution tarfiles in "/opt/pkg/archives". Second, we keep the unfolded source directories under the application directory in a directory named "src", so it would be "/opt/pkg/sced-1.0/src/sced-1.0". These choices are totally arbitrary -- you could keep things in a completely different directory not related to /opt if you wanted, and it wouldn't matter a bit to the process. Everything would proceed exactly as described above. It only makes a difference when you are creating tarballs for distribution.

Note the seemingly redundant "sced-1.0/src/sced-1.0" in the pathname above. The first sced-1.0 was created by hand, and follows a strict pattern: appname-version. The second one happens to be the name used by the author of sced when packaging his source. Many open-source packages follow the same convention we do, but some don't. One example is the lynx web browser, which untars into "lynx2-8" or some other name depending upon the version number; in that case the sources would be in "/opt/pkg/lynx-2.8/src/lynx2-8". The src directory provides a convenient way of keeping the source files in their own directory even if the package author tars up the files directly, not placing them into a particular directory of their own.

Search paths

Finally, there's the whole notion of search paths--the PATH, MANPATH, XUSERFILESEARCHPATH, and similar environment variables, the -I and -L options on a C compile, and so on. The "classical" technique, used by HP and Interex's 1988 FAST project, is to add "/opt/appname/bin" to PATH for every application you install, and likewise for MANPATH, SHLIB_PATH, etc. If that works well enough for you, fine, you can stop reading now. If, however, you don't like waiting half a minute for apps near the end of that list, or having just one environment variable take up half your screen if you happen to list them, here's what we do.

We have a set of "convenience" directories, which hark back to the old /usr/local days. They contain only softlinks. In "/opt/bin", there are links to all the /opt binaries: "/opt/bin/sced -> /opt/sced/bin/sced", "/opt/bin/xv -> /opt/xv/bin/xv", etc. So all we add to PATH is "/opt/bin". Same for /opt/lib and SHLIB_PATH, /opt/man and MANPATH, /opt/info and INFOPATH, etc. We only use these for convenience -- when we make mods to another package's makefile, for example, we always use "-L/opt/libjpeg/lib" instead of "-L/opt/lib". But the convenience links can be awfully, um, convenient for day-to-day work, quickie testing, etc.

The repln script

Which all finally brings us to the "repln" script. It tries to automate all three parts. It's broken into two logical parts itself -- mechanism and policy. The "mechanism" part is a simple script that replicates symbolic links from one directory to another. It can also create links to a set of actual files. Using that mechanism alone, you can do all of the above. However, two "policy" options exist, --app and --opt. The --app option helps in creating the convenience links in /opt/bin, /opt/lib, and so on, for an individual application. And we use the --opt option to propagate our entire /opt structure to a new node, and to propagate updates from the master node around the network. The --app and --opt code knows about the /opt structure described above, whereas the rest of repln doesn't -- it's just a link munger and can be used for other link replication purposes.
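To make the "mechanism" half concrete, here is a minimal Python sketch of replicating the symbolic links of one directory into another. It is not the repln code itself (which is not reproduced in this paper); the function name and the choice to leave non-link entries untouched are assumptions made for the example.

```python
import os

def replicate_links(src_dir, dst_dir):
    """Recreate each symbolic link found in src_dir inside dst_dir.

    The link text is copied verbatim, so a relative link such as
    "mstr/sced" stays relative in the destination.
    """
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.islink(src):
            continue                      # only links are replicated
        dst = os.path.join(dst_dir, name)
        if os.path.islink(dst):
            os.remove(dst)                # refresh an existing link
        elif os.path.exists(dst):
            continue                      # never clobber a real file or directory
        os.symlink(os.readlink(src), dst)
        copied.append(name)
    return copied
```

The policy options described above would sit on top of a routine like this, deciding which directories to pass in.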

Of course this is but one method for dealing with /opt. We have used it on about a half a dozen independent sites for a couple of years now with good success. Yes, there are several drawbacks to the scheme; some parts work very well, some are a little painful, some a lot. Overall the system has served our needs and eliminated just about every problem we had with /usr/local, or our early attempts at using /opt.

There are web pages which document these things in a bit more detail. Currently, they may be found at "http://www.arde.com/Papers/optdir/". We will update the online copy of this presentation if this address changes.

Start and stop virtual domains (ipsetup)

This script has been used on both Caldera and Red Hat Linux systems, though it should work on other systems as well. Its purpose is to provide a convenient way to start and stop virtual IP addresses on a single machine. It is written to be executed from the standard rcX.d directories at startup and shutdown, but as is typical with such scripts, it may be executed by root from the command line any time it is needed.

The "start" operation issues a set of "ifconfig eth0:X machinename" commands, while the "stop" operation issues a set of "ifconfig eth0:X down" commands. As long as the sequence number in the rcX.d directories ensures that the IP addresses are started before the Web server is started and stopped after the Web server is stopped, the virtual domains should work correctly.
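The start and stop command sets can be pictured with a short Python sketch that only builds the command lines and does not run them. The function name, the numbering of aliases from 0, and the host names in the usage note are assumptions for illustration; the real ipsetup script may number and order things differently.

```python
def ifconfig_commands(action, hosts, device="eth0"):
    """Return the ifconfig command lines for a 'start' or 'stop' operation."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    commands = []
    for number, host in enumerate(hosts):
        alias = f"{device}:{number}"          # eth0:0, eth0:1, ...
        if action == "start":
            commands.append(["ifconfig", alias, host])
        else:
            commands.append(["ifconfig", alias, "down"])
    return commands
```

For example, ifconfig_commands("start", ["www.example.com", "shop.example.com"]) yields the argument lists for "ifconfig eth0:0 www.example.com" and "ifconfig eth0:1 shop.example.com". Each list could then be handed to subprocess.run by a root-owned startup script.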

One advantage to starting and stopping them in this way is that a virtual domain may be moved easily from one physical host to another either to perform maintenance or to upgrade hardware or the OS.

What server is that Web site using? (whoru)

Whether you administer multiple Web servers or are just curious to know what server software is being used by some particular site, whoru can provide the answer you need. It opens an http dialog with the specified Web server, then returns to you the results of that dialog which define what type of server is running.

You may find this script useful as a template for performing other HTTP operations. For more complex HTTP operations, however, you should consider using libwww-perl. Whoru uses only bare socket calls, which is both its strength and its weakness.
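A bare-socket dialog of this kind might look like the following Python sketch. It is an illustration only: the use of a HEAD request and the function names are assumptions, and whoru itself is a Perl script whose exact request is not shown in this paper.

```python
import socket

def server_header(response_text):
    """Return the value of the Server header in a raw HTTP response, or None."""
    head = response_text.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:          # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "server":
            return value.strip()
    return None

def ask_server(host, port=80, timeout=10):
    """Open a plain socket, send a minimal HEAD request, report the server type."""
    request = f"HEAD / HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            data = sock.recv(4096)
            if not data:                         # server closed the connection
                break
            chunks.append(data)
    return server_header(b"".join(chunks).decode("iso-8859-1"))
```

Keeping the header parsing separate from the network call makes the parsing easy to check against a canned response, and not every server chooses to send a Server header, so a result of None is possible.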

Scripts for developers

Patch Generation Assistant

This package of three scripts provides assistance in the creation of source patches for collections of text files. Developed for the InterWorks FAST project, the scripts in this package are general enough to be used for various purposes besides just computer source code.

The general idea is to start with a collection of text files (such as the source code of an open-source software package), modify them by various means, then generate a single patch file which incorporates the changes made to all the files.

To implement the mechanism, you should use the ednew script to create an editable copy of a file before you alter it. This will create a copy of the file for editing, and stash the original text in a file with a ".original" suffix. The patchgen script then traverses the directory tree, performing a diff operation on all ".original" files, and concatenating the patches into a single file.

The checkeds script helps you check for edited files which have no corresponding ".original" file (in the event that you forgot to run ednew before performing your edit).

Preserve file for local edit (ednew)

Run the ednew script with a file name, or list of file names, to be edited. It will guard against multiple invocations by checking for "file.original" for each file it is asked to copy. It also sets the "write" permission for the copy. By renaming the original and creating a copy, ednew preserves the original timestamp of the file. An example:

ednew Makefile config.h

This will move Makefile to Makefile.original, and create a writeable copy of it called Makefile. It will also move config.h to config.h.original and create an editable config.h. The filenames can actually be pathnames, e.g.:

ednew src/fumble.c

In this case, the copies are placed in the same directories as the originals, so the above would move src/fumble.c to src/fumble.c.original and create an editable copy src/fumble.c.
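The rename-then-copy behavior can be sketched in Python as follows. This is a stand-in for illustration, not the ednew script itself; the return value convention is an assumption.

```python
import os
import shutil
import stat

def ednew(path):
    """Preserve path as path.original and leave a writable copy at path.

    Returns False (and does nothing) if path.original already exists,
    which guards against a second invocation on the same file.
    """
    original = path + ".original"
    if os.path.exists(original):
        return False
    os.rename(path, original)            # a rename keeps the original timestamp
    shutil.copyfile(original, path)      # the copy gets a fresh timestamp
    mode = stat.S_IMODE(os.stat(original).st_mode) | stat.S_IWUSR
    os.chmod(path, mode)                 # same permissions, plus owner write
    return True
```

Because the original is renamed and not copied, its modification time is untouched, which is the property the text above relies on.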

Generate patch for locally edited files (patchgen)

Run the patchgen script by changing your current working directory to the "top" of the source tree for which you wish to generate a patch file. Then, run it by giving it the name of the patch file to be generated. For example, it could be called in this way:

patchgen FAST.patch

This will traverse the src directory tree, which includes the modified sources that you have created during the porting of the package, looking for files with the ".original" suffix. For each such file it finds, it will append a unified context diff patch to the FAST.patch file. Note that this requires the GNU diff program--the standard HP-UX diff program does not accept the appropriate -u option. This can actually be worked around--the patch program accepts numerous input formats--but the unified context diff is usually best. The GNU diff program is provided as a FAST package, so install it first!

The resulting patch file can be used by unpacking the original source into a directory, then running patch <patchfile. This will apply all the differences given in the patch file. Presuming that you have been religious in your use of ednew, this should duplicate the modified source tree that you produced through editing. Note that patch uses ".orig" as its "original file" suffix, so the resulting tree won't have the same filenames, but should have the same contents. Any build operations should then work as they did with your modified sources.
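The traversal and concatenation can be illustrated with a Python sketch. Note the substitution: where patchgen calls GNU diff -u, this sketch uses Python's standard difflib module to produce unified diffs, so its output is similar in form but is not guaranteed to be identical. It also assumes text files that end with a newline.

```python
import difflib
import os

SUFFIX = ".original"

def patchgen(top="."):
    """Return one unified-diff patch covering every *.original pair under top."""
    patch = []
    for dirpath, dirnames, filenames in os.walk(top):
        dirnames.sort()                              # deterministic traversal order
        for name in sorted(filenames):
            if not name.endswith(SUFFIX):
                continue
            old_path = os.path.join(dirpath, name)
            new_path = old_path[:-len(SUFFIX)]
            if not os.path.exists(new_path):
                continue                             # edited copy is missing
            with open(old_path) as f:
                old_lines = f.readlines()
            with open(new_path) as f:
                new_lines = f.readlines()
            rel = os.path.relpath(new_path, top)
            patch.extend(difflib.unified_diff(
                old_lines, new_lines, fromfile=rel + SUFFIX, tofile=rel))
    return "".join(patch)
```

Writing the returned string to a file such as FAST.patch gives a single patch for the whole tree; unchanged files contribute nothing because difflib emits no output for identical inputs.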

Warn of edited files without .original extension (checkeds)

In a section above, we said "Presuming that you have been religious in your use of ednew ...". This isn't a totally realistic expectation all the time, so the checkeds script has been provided to give a bit of help in this area. If you use an editor which creates recognizable backup files by appending a suffix to them, such as ".bak" or "~" (tilde), the checkeds script will look to see if any files have been edited (that is, if any files have backups) which do not have a corresponding "file.original". This process is not perfect, and won't work at all if your editor is like pico and doesn't create backups, but it can help in some cases.

The intent is that you run checkeds immediately prior to running patchgen, and from the same working directory. It traverses the directory tree looking for files with ".bak" or "~" suffixes, and checks to see if there is a properly named ".original" for that file also. It takes an optional -v argument, which means run in "verbose" mode. Without the -v, checkeds only prints a message for files without a corresponding ".original". Ideally, then, it would produce no output. With -v, it also prints a message for all backup files, even those which pass muster. So, call it like this:

checkeds

Or else like this:

checkeds -v

Note that checkeds does not attempt to create a ".original" file from the backup file. It has no way of knowing whether you have done multiple edits to the file, thus rendering the backup different from the original. If you detect any rogue edits using checkeds, you should unpack the original file(s) from their source distribution package, and manually rename them with the ".original" suffix.

If your editor uses a backup suffix other than ".bak" or "~" (tilde), you can add your editor's backup-file suffix to the list given in the "backups" variable at the top of the checkeds script.
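The check itself is simple enough to sketch in Python. This is an illustration, not the checkeds script: it assumes the editor forms a backup name by appending the suffix to the full file name (fumble.c becomes fumble.c.bak or fumble.c~), and the names used here are made up for the example.

```python
import os

BACKUP_SUFFIXES = (".bak", "~")      # extend this tuple for other editors

def checkeds(top=".", backups=BACKUP_SUFFIXES):
    """Return edited files (files having a backup) that lack a .original."""
    rogue = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            for suffix in backups:
                if name.endswith(suffix) and len(name) > len(suffix):
                    edited = os.path.join(dirpath, name[:-len(suffix)])
                    if not os.path.exists(edited + ".original"):
                        rogue.append(edited)
                    break
    return sorted(rogue)
```

An empty result corresponds to the ideal "no output" case described above.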

Template for perl scripts (perltemplate.pl)

This script provides the architecture for what is usually needed in a perl script. The template includes processing of options on the command line and a display of help information for the user. Being a template, it does not perform any particularly useful work. Instead, it writes the content of an input file to the output file. However, it may be modified easily to perform transformations on the input file or other actions as needed. Copy and change this script as desired.

Scripts for users

Display ASCII Character Set (ascii)

The ASCII character set is displayed in a table, with the number of columns determined by a user option. By default the Hex value of each character is shown, but an option permits the Octal value to be included. Another option includes the HTML entity code for certain characters.

Rename (ren)

This is a script for people who miss an actual "rename files" command under Unix. The "mv" command is nice, but you usually can't do "mv *.c *.h" and get what you want. If you've ever wanted to do it that way, then you might like the ren script. It is also useful for Apollo folks missing the "chn" command -- like the "fpblock" command described elsewhere in this document, ren isn't a complete replacement for its Aegis counterpart, but does the important stuff. It recognizes MS-DOS wildcards, Aegis patterns, and full Perl regexes. The following are equivalent:

   ren  --dos    --from=*.c    --to=*.x
   ren  --aegis  --from=?*.c   --to==.x
   ren  --perl   --from=.*\.c  --to=$&.x

More complex expressions are possible, of course, and there are shortcut ways of calling ren, without all the "--" options.

Note that wildcards in the pattern arguments must be protected from the shell, usually by enclosing the argument(s) in quotes, preceding them with a backslash, or using the '-f' and '-t' forms of the pattern arguments.

In addition, ren will permit you to easily rename files to all upper or all lower case letters without the need to enter both forms of the file name. Additional options allow you to view the rename commands before they are issued or to view the translated patterns without actually having performed the rename operation.

Collisions (new filenames which already exist) will be marked with a "+". View-only output (-v) is marked by a "|" in place of a dash in the arrow marker.

Backup your Pilot (pilotbu)

This is a wrapper to provide some enhanced features when using the pilot-link package to back up the data and applications from your Palm/USRobotics/3Com Palm device or IBM WorkPad to your Linux machine. Before issuing the appropriate pilot-xfer command to perform the backup, it creates a new directory containing the current date. When the backup is performed, the files will reside in this new directory.

This enables you to keep several backups of your Pilot at once, while still recognizing easily which is the latest backup.

Since this was written, we have discovered incback, by Hakan Ardo. It is far better than this script and should serve you well.

Locate and test softlinks (findlink)

This script locates all soft links in a directory and displays them to the user, showing the target path as well as the link name. It also checks to see if the target exists and displays a question mark at the end of the line if the target does not exist. It is most useful when run with its recursive option so that it descends an entire directory structure, reporting on all the links encountered.
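A rough Python equivalent of this behavior is sketched below. The output layout ("link -> target ?") and the function signature are assumptions for illustration; the actual findlink display may differ.

```python
import os

def findlink(top, recursive=False):
    """List the symbolic links under top; broken links get a trailing ' ?'."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(top):
        dirnames.sort()
        # Links to directories show up in dirnames, all other links in filenames.
        for name in sorted(dirnames + filenames):
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                mark = "" if os.path.exists(path) else " ?"
                lines.append(f"{path} -> {os.readlink(path)}{mark}")
        if not recursive:
            break                      # only look at the top directory
    return lines
```

os.path.exists follows the link, so it reports False exactly when the target is missing, while os.walk does not descend through links to directories, which avoids loops in the recursive case.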

Hunt down path name of a named executable file (hunt)

This script interrogates the user's $PATH variable, then searches each directory found there to determine where the specified command is located and which one (if there are several) will be executed. It also scans some "traditional" places which may not be on your PATH (such as /opt/someappname/bin) to try to help locate the command (or alternate versions of it). If an entry is found which is actually a soft link pointing to another file, the link is followed to determine if the resulting target file is actually there. This script was created because the conventional which and whereis commands did not do quite what was wanted, particularly for new UNIX users.

There are a variety of arguments allowed for the command. Running hunt -h will provide a list and brief description of each.

  EXAMPLES
  --------

  Find which 'tar' will be executed:

   $ hunt  tar
     /opt/bin/tar --> /opt/tar/bin/tar ***
     /usr/bin/tar
     /bin/tar
   Found, but not on your PATH:
     /sbin/tar

  Try to find commands containing 'Mosaic' in either upper or
  lower case:

   $ hunt  -i -n Mosaic
   Did not find an executable instance of 'mosaic' on your PATH
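The core PATH search can be sketched in Python as follows. This covers only the walk along PATH and the link check; the extra "traditional" directories and the options of the real hunt script are left out, and the function name and return shape are assumptions.

```python
import os

def hunt(command, path=None):
    """Return (candidate, link_target) pairs for command, in PATH order.

    The first pair is the one a shell would execute. link_target is the
    resolved file when the candidate is a soft link, otherwise None.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    found = []
    for directory in path.split(os.pathsep):
        directory = directory or "."          # an empty entry means the current directory
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            target = os.path.realpath(candidate) if os.path.islink(candidate) else None
            found.append((candidate, target))
    return found
```

Since os.path.isfile follows links, a soft link whose target is missing is skipped, which mirrors the "is the target actually there" check described above.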

Convert text file from DOS to Unix (cvtdos2ux)

As anyone who has dealt with multiple platforms knows, different operating environments use different character combinations to mark the end of a line in a text file. Programs such as ftp will adjust appropriately when the proper modes are set before initiating a transfer. However, all too often a file with "DOS" line endings (carriage return/line feed) makes its way to a Unix machine where the standard is a single "newline" to terminate a line.

Another situation which can cause problems for some applications is that many PC tools tend to make one long line for an entire paragraph of information.

This script locates the line terminating characters and swaps them for Unix "newlines". Also, as an option, it will search for the end of sentences (defined as a period followed by one or more spaces) and insert a "newline" following each one. (This may cause odd lines when abbreviations are contained in the file, but at least excessively long lines are reduced to something more manageable and more likely to be usable by applications sensitive to line length.) A more comprehensive conversion script (dmsify) is described in Web Tips and Examples.
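Both transformations fit in a few lines of Python. This is a sketch of the idea, not the cvtdos2ux script; in particular, it replaces the spaces after a period with the newline, which is an assumption about how the sentence option treats those spaces.

```python
import re

def dos2unix(text, split_sentences=False):
    """Convert CR/LF line endings to LF; optionally break after sentences."""
    text = text.replace("\r\n", "\n")
    if split_sentences:
        # A sentence end is taken to be a period followed by one or more spaces.
        text = re.sub(r"\. +", ".\n", text)
    return text
```

As the paragraph above warns, an abbreviation such as "Dr. Smith" is split too, since the rule looks only at the period and the spaces that follow it.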

Compare (Netscape) bookmark files (diffbkm)

Have you ever created a new bookmark entry while using Netscape at home, then gotten to work and wished you had that same bookmark there (or vice versa)? If only one of these two bookmark files was changed, it is a simple matter to transfer the changed one and cover up the unchanged one, thus bringing both in synch. However, if both instances had been changed, but changed differently, locating the changes can be quite cumbersome since Netscape keeps "last visited" information for each bookmark.

This script makes copies of a specified bookmark file and the current system's main bookmark file, extracts the "last visited" data from each, then compares the two, showing only the "significant" differences.

(Though written in perl, this script uses the standard Unix directory /tmp to contain the working copies of the two files and uses the Unix system command diff to actually do the compare. Thus, it may not be applicable to non-Unix use.)
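The strip-then-compare idea can be sketched in Python. The sketch assumes that the "last visited" data appears as a LAST_VISIT="..." attribute on each bookmark line, and it uses the standard difflib module where diffbkm uses the system diff command; both points are assumptions for illustration.

```python
import difflib
import re

# Assumed form of the volatile data: LAST_VISIT="912345678" inside the <A ...> tag.
LAST_VISIT = re.compile(r'\s+LAST_VISIT="[^"]*"', re.IGNORECASE)

def strip_volatile(text):
    """Remove the 'last visited' attribute from every bookmark entry."""
    return LAST_VISIT.sub("", text)

def diff_bookmarks(theirs, ours):
    """Return unified-diff lines for two bookmark files, ignoring visit times."""
    a = strip_volatile(theirs).splitlines()
    b = strip_volatile(ours).splitlines()
    return list(difflib.unified_diff(a, b, "theirs", "ours", lineterm=""))
```

Two files that differ only in visit times produce an empty list, so anything that is printed reflects a bookmark that was really added, removed, or edited.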

Find blocks containing a pattern (fpblock)

Those of you who are former (or current) Apollo Domain/OS users may have become accustomed to the Aegis command fpatb. Its purpose was to locate patterns contained within a block of lines which could be identified by a particular start and end pattern. The command would then display all lines of each block which contained the desired pattern. Alternatively, only those blocks which did not contain the pattern could be shown.

For example, if the input file test contains:

   Line x1
   Line y1
   Line z1
   Line x2
   Line y2
   Line z2
   Line x3
   Line w3
   Line z3

then running the command "fpblock -b=x -e=z -p=y test" will yield:

fpblock - 0.5 - 20 JAN 99
Block 1 lines 1 - 3:
   Line x1
   Line y1
   Line z1

Block 2 lines 4 - 6:
   Line x2
   Line y2
   Line z2

2 matching blocks found of 3 blocks.

If run with the "quiet" option, -q, only the appropriate lines of the matching blocks of the input file are shown, not the additional information.

While other options exist, fpblock does not implement all aspects of fpatb. However, since it is a perl script, it may be used on any platform where perl is installed.
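The block matching shown in the example can be expressed as a short Python sketch. It is not the fpblock code; the function name and return shape are made up for illustration, and a block that starts but never reaches an end pattern is simply dropped, which may differ from the real script.

```python
import re

def find_blocks(lines, begin, end, pattern):
    """Return (matching_blocks, total_blocks).

    Each block is (first_line_number, last_line_number, lines), delimited by
    a line matching begin and the next line matching end.
    """
    blocks, start, body = [], None, []
    for number, line in enumerate(lines, 1):
        if start is None:
            if not re.search(begin, line):
                continue                  # outside any block
            start, body = number, []
        body.append(line)
        if re.search(end, line):
            blocks.append((start, number, body))
            start = None
    matching = [b for b in blocks if any(re.search(pattern, l) for l in b[2])]
    return matching, len(blocks)
```

Run on the nine-line test file above with begin "x", end "z", and pattern "y", this finds three blocks, of which the first two (lines 1 - 3 and 4 - 6) match, in agreement with the fpblock output shown.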

Compare Tree (cmtree)

Another Aegis command former Apollo Domain/OS users may be familiar with was cmt. Its purpose was to compare the contents of two directories and report the results. The command diff -r may provide what you need, but if not, you might want to take a look at cmtree. This script also offers some particular extensions to users of ClearCase (the SCM system from Rational Software) which allow comparison of multiple versions of an element in a VOB with another or with a flattened directory of version files. (This script was added in version 1.1 of the bundle uxutils and may be fetched via the link at the end of this document.)

Date It (dateit)

Rename the files specified on the command line so that each one's name is prepended with the file's date and time modified (and optionally a fixed prefix). (This script was added in version 1.2 of the bundle uxutils and may be fetched via the link at the end of this document.)
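A minimal Python sketch of the renaming follows. The timestamp layout (YYYYMMDD-HHMMSS), the placement of the optional prefix before the date, and the function names are all assumptions; the paper does not say what format dateit uses.

```python
import os
import time

STAMP = "%Y%m%d-%H%M%S"     # assumed layout, e.g. 20030301-142500

def dated_name(path, prefix=""):
    """Return path with the file's modification date and time prepended to its name."""
    directory, name = os.path.split(path)
    stamp = time.strftime(STAMP, time.localtime(os.path.getmtime(path)))
    return os.path.join(directory, f"{prefix}{stamp}-{name}")

def dateit(paths, prefix=""):
    """Rename each file so that its name starts with its modification time."""
    for path in paths:
        os.rename(path, dated_name(path, prefix))
```

A stamp that starts with the year has the side effect that a plain alphabetical listing of the renamed files is also in chronological order.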

Convert from Apollo fmt to HTML (fmt2html)

OK, so while we are talking about Apollos, did you move some old document files written in fmt to your HP-UX system, but now have no way to view them other than in their fmt markup state? This script makes a stab at converting some of the most commonly used fmt constructs to HTML in preparation for using the file on a Web site.

While the conversion is not complete (and constructs such as macros are not addressed at all), it should provide a good start at reusing Apollo-written documentation files in a newer Web context.

Directory lister (q)

The directory lister has a long and illustrious history, beginning with frustrations with the shortcomings of the DOS "dir" command. Its primary purpose is to list a directory's contents sorted by last-modified date and placing the most recently modified entries at the bottom of the list. However, it can do a few other useful things that ls won't. For example, it will give accurate file sizes and totals, and show full timestamps (including year and time) regardless of the age of the file. By default, it shows the number of items and number of bytes found in each directory listed. Optionally, it can accumulate these values to provide a grand total for all directories listed.
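The default behavior described above (oldest first, newest last, full timestamps, item and byte counts) can be sketched in Python. The column layout and the function name are assumptions for illustration, and none of the options of q are covered.

```python
import os
import time

def q(directory="."):
    """Return listing lines sorted by last-modified time, newest entry last."""
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            info = entry.stat(follow_symlinks=False)
            entries.append((info.st_mtime, info.st_size, entry.name))
    entries.sort()                                   # oldest first, newest at the bottom
    lines = []
    for mtime, size, name in entries:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(mtime))
        lines.append(f"{stamp} {size:>12} {name}")   # full date and time, whatever the age
    total = sum(size for _, size, _ in entries)
    lines.append(f"{len(entries)} items, {total} bytes")
    return lines
```

Putting the newest entries at the bottom keeps them visible on a terminal after a long listing has scrolled by.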

Wrap up

We hope you have found some of the scripts and techniques discussed here useful. If while trying any of the scripts we have discussed you happen to make improvements to them, please shuttle those changes back to us so we can try to make them available to others.

Where is this paper?

This presentation was prepared by Artronic Development and NiEstu for the 1999 InterWorks (HP User Group) Conference and has been updated on occasion since that time. An outline and an abstract are available for your convenience. This is an HTML document located at URL "http://www.arde.com/Papers/UnixScript/paper.html".

Where are these scripts?

At this writing, the scripts (and others contained in the respective packages) may be obtained via http from the Artronic Development and NiEstu sites via these links:

Biographical Sketch:

(Dave Eaton and Chip Richards operate their own businesses (Artronic Development and NiEstu respectively). Primarily software developers, both also administer HP-UX and Linux in their mixed networks.)