
Archiving Tools Continued

Putting some ZIP in your system

From "Migration",  Access to Wang, July 1994

Last month I compared a few archive tools and showed some of the ways they can be used. This time we can review these tools as alternatives to traditional Unix commands in production use. I will use the ZIP/UNZIP tools in the examples below. Most of the methods described here will work with most of the archive tools described previously - ZIP/UNZIP, LHarc, PKZIP - and with others that were not mentioned, including GNU ZIP (gzip), ZOO, Arc, and Sit. I prefer ZIP to other archive tools because it has the greatest amount of support on all platforms (particularly MS-DOS) and some unique features all its own.

Copies of these and other tools are available from PC Bulletin Boards, on-line services (CompuServe, America Online, etc.), or the Internet.

Unix archiving tools

Archive tools can help you organize files, retain prior versions for future reference, and reduce the amount of disk space used by these files. The traditional Unix method for creating such archives is by using a combination of the tar and compress commands. tar ("tape archiver") combines one or more files into a single output, while compress reduces the size of a file. Since files created with tar are not compressed, the archive file is the same size as all of the files within it combined, so compress is needed to reduce the file size. In comparison, ZIP and similar tools provide archiving and, optionally, file compression within a single utility.

Archive tools like ZIP are also much easier to use. Suppose you wish to create a compressed archive file and remove the original copies of the files afterwards. Using traditional Unix methods, you might enter the following commands:


tar cvf ../myarch.tar *     # create myarch.tar in the parent directory
compress ../myarch.tar      # compress myarch.tar to myarch.tar.Z
rm *                        # remove the original files (the archive is safe in ..)

To create the same archive file using ZIP, you would merely enter zip -m myarch * (translated, "move all files into myarch.zip"). The resulting ZIP file would be smaller by about 20% and it would take less time to create the archive.

The file naming requirements of the tar and compress tools are inflexible: file names must end with .Z (upper-case Z) to be recognized by compress, and the .tar extension should be added for tar. The result is often a file name that does not fit into the file name requirements of other systems and may even exceed the 14-character limit of some Unix file systems. For example, the archive file long.program.tar.Z would be truncated on our Unix system to long.program.ta - indistinguishable from its uncompressed form long.program.tar and resulting in a duplicate file name error whenever the file is uncompressed. In comparison, ZIP prefers to have a .zip extension, but any file name is acceptable; long.program.zi would work fine.

Finally, it is difficult to assess the contents of files archived using traditional Unix tools: a compressed tar file must be uncompressed before a listing of the archive's contents can be viewed, which temporarily consumes file space unless the uncompressed output is piped directly into tar. ZIP files can be interrogated directly using the -l (listing) or -v (verbose listing) options of UNZIP, and these file names can in turn be passed to other programs or placed in files for other purposes. The syntax for this file listing is the same for any platform: enter unzip -l myarch.zip to view the contents of archives created in the Unix, MS-DOS, or any other environment.
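As a rough illustration of passing archive member names to other programs, the sketch below wraps both styles of listing in one shell function. The function name list_names is hypothetical, and the ZIP branch assumes an UNZIP build that includes the zipinfo mode (-Z); if yours does not, unzip -l still produces a readable listing.

```shell
#!/bin/sh
# Sketch only: list_names is a hypothetical helper, not part of ZIP/UNZIP.

# Print one member name per line for a ZIP file or a tar file.
list_names() {
    case "$1" in
        *.zip)   unzip -Z1 "$1" ;;          # zipinfo mode: names only
        *.tar.Z) zcat "$1" | tar tf - ;;    # uncompress to a pipe, not to disk
        *.tar)   tar tf "$1" ;;
        *)       echo "list_names: unknown archive type: $1" >&2
                 return 1 ;;
    esac
}
```

A call such as list_names myarch.zip > contents.lst would place the names in a file, and list_names myarch.zip | wc -l would count the members.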

Naturally, you will lose some portability to other versions of Unix if you use ZIP instead of tar and compress; that portability is the reason nearly all Unix files in public archives are maintained as compressed tar files. ZIP files are mostly of interest if your destination is internal or one of several desktop systems, such as MS-DOS, Apple, or others.

Production uses for archive files

It is often useful to incorporate the creation of an archive file into a production process. In some cases, the archive file itself is the product; for example, the resulting files from a large extract may be bound for an MS-DOS system. Shell scripts provide a convenient way of creating such archive files.

In the script example shown in Figure 1, the first line moves the current working directory location to a common area. The next three lines are typical job-stream commands, each producing a single output file in the working directory. The ZIP program is invoked using the -m (move) option to create the archive file.

Figure 2 shows a revision of this script that reduces the amount of file space required to run the job by incrementally storing each output file as it is created. The last step also introduces another refinement: automatic notification of the job run and archive contents. In this example, the UNZIP command is used to interrogate myarch.zip and the resulting file list is mailed to user dsb. (The Unix mail system - though crude in its presentation - provides excellent opportunities for automated notification such as this. More on this topic next month.)

Special techniques

The following is a potpourri of small program solutions that archive tools can help provide:

Finding and backing up files across the system: If you have a number of small files scattered across directories on the system, it is often difficult to locate and back them up. For example, .ini files on PCs contain critical information that changes daily, but most are buried in program directories that may not be backed up frequently. I use a file name search tool in combination with ZIP's ability to accept file names through Standard Input to meet this need, like this:


whereis *.ini | zip iniback -@

This statement uses the common DOS system utility whereis.com to locate files with an extension of .ini, then passes the file names to ZIP through a pipe (Standard Input). The Unix find utility could be used to locate files instead, allowing powerful searches by date, time, user ID, or other factors. The result: a cross-system backup of these small but important files.
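On a Unix system, the same idea might be sketched with find in place of whereis. The helper names find_ini and backup_ini and the paths in the example call are assumptions made for illustration; only find and the -@ option of ZIP come from the discussion above.

```shell
#!/bin/sh
# Sketch only: find_ini and backup_ini are hypothetical helper names.

# Print the path of every .ini file below the directory given as $1.
find_ini() {
    find "$1" -type f -name '*.ini' -print
}

# Add those files to the ZIP archive named by $2, reading the
# file names from Standard Input (-@).
backup_ini() {
    find_ini "$1" | zip "$2" -@
}

# Example call (assumed paths): backup_ini /usr/apps /backup/iniback
```

Adding a test such as -mtime -1 to the find command would restrict the backup to files changed within the last day.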

Another way to search for file names is through the ZIP "include" specification, -i. The same search could be expanded to include files with the extension of .grp with the following statement:


zip -r iniback c:\*.* -i *.ini *.grp

Translated, this means "zip recursively all files on the c: drive, but include only those with extensions of .ini or .grp."

Using dates as names: Production archive files need to be identified uniquely, and it is sometimes necessary to retain multiple copies of file sets. Using portions of the date or time provides a convenient means of providing this visibility. Many versions of Unix allow use of the date command to provide a formatted output that can meet this need. If supported by your system, the following syntax should provide the current date in year/month/day (YYMMDD) format in the shell variable FILENAME:


FILENAME=`date +%y%m%d`
   . . .
zip -m $FILENAME *

In this example, the variable FILENAME is assigned the value of the date and the value of that name ($FILENAME) is later used as a ZIP file name. The file name assignment uses the powerful Unix trick of command substitution to place the result of a process (the date command) into a variable. date can also supply other information, including the time, Julian date, time zone, and formatting characters such as new line, tab, etc.
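A few other name formats can be sketched the same way. The variable names below are illustrative only, and the format codes assume a date command that supports %j (day of the year) and %H%M (hour and minute); check your system's documentation before relying on them. The names avoid embedded periods because ZIP adds the .zip extension only when the archive name has no extension of its own.

```shell
#!/bin/sh
# Sketch only: the variable names are illustrative.
DAILY=`date +%y%m%d`        # year, month, day, e.g. 940715
JULIAN=`date +%y%j`         # year and day of the year, e.g. 94196
TIMED=`date +%y%m%d%H%M`    # date plus hour and minute

# Show the archive names that zip would create from each value.
echo "daily archive:  $DAILY.zip"
echo "julian archive: $JULIAN.zip"
echo "timed archive:  $TIMED.zip"
```

The timed form is useful when a job may run more than once a day and each run's archive must be kept.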

Automating comments: Comments can be entered into the headers of zip files by using the -z option (e.g. zip -z myarch) and typing the information to be included, ending with a line that contains only a period. Comments may also be included by combining information from a number of sources into a single text file and using that file as input to the comment process. Figure 3 shows this, using the same archive process as an example. A temporary file is created with the text to be included in the header. The contents of this file are added to the archive file as its comment, and the temporary file is removed. The resulting comment will be displayed every time an archive listing is produced.

As always, I hope you find some application of this information in your own environment. If you have any comments or specific archiving problems to solve, please send them and I will present them in this space.


Figure 1: Example of Script Using ZIP Archive


#!/bin/sh
# prodzip - example of production ZIP usage
cd /work/mydir           # change to the working directory
wrun process1 > file1    # run program process1 to produce file1
process2 file1 > file2   # run script file process2 to produce file2

# extract entries with "phy" from the passwd file
grep -i phy /etc/passwd > file3
zip -m myarch *          # move all files into myarch.zip

Figure 2: Revised Script Using ZIP Archive


#!/bin/sh
# prodzip2 - example of production ZIP usage
cd /work/mydir           # change to the working directory
wrun process1 > file1    # run program process1 to produce file1
process2 file1 > file2   # run script file process2 to produce file2
zip -m myarch file1      # file1 is no longer needed; move it to archive
zip -m myarch file2      # move file2 to archive

# extract entries with "phy" from the passwd file
grep -i phy /etc/passwd > file3

zip -m myarch file3      # move file3 to archive
unzip -v myarch | mail dsb    # create archive listing; mail to dsb

Figure 3: ZIP Header Comments


#!/bin/sh
# prodzip3 - example of production ZIP usage (comment headers)
cd /work/mydir           # change to the working directory
wrun process1 > file1    # run program process1 to produce file1
process2 file1 > file2   # run script file process2 to produce file2
zip -m myarch file1      # file1 is no longer needed; move it to archive
zip -m myarch file2      # move file2 to archive

# extract entries with "phy" from the passwd file
grep -i phy /etc/passwd > file3
zip -m myarch file3      # move file3 to archive

# Create header comment in temporary file
RUNDATE=`date +%y%m%d:%H%M`   # formatted date and time

# create header.txt file
echo "Production file run: $RUNDATE" > header.txt
echo "" >> header.txt
echo "Please distribute to Rachel Smith in Accounting" >> header.txt
echo "" >> header.txt

# Add comment to archive; clean up temporary file
zip -z myarch < header.txt
rm header.txt

# Create archive listing; mail to dsb
unzip -v myarch | mail dsb



Copyright © 1994 Dennis S. Barnes
Reprints of this article are permitted without notification if the source of the information is clearly identified