
Better Housekeeping

A two-part series on system management

From "VS Workshop",  Access to Wang, June 1988

As this is written, I am making final preparations for an appearance at the TechConnect West show in Los Angeles, where I will be speaking on system management process and tools, sharing some of the tools and techniques I use to manage the three VS systems in our Seattle office. I attend events such as TechConnect as much to learn from others as to impart any of my experiences, and I look forward to returning home enlightened (and hoarse) from conversations with members of the Wang community.

Since most readers of this column will not have the opportunity to attend TechConnect personally, I will cover most of the points from my presentation over the next two issues. This month I will cover the management of disk and other I/O resources, including routine error log management, backup, disk space reporting and control, and file integrity. Next month I will extend into application control, including performance measurement, security and user billing, configuration management, communications billing, and problem resolution methods. As in the L.A. seminar, I will concentrate on USERAIDS that are helpful for these purposes and describe when commercial products should be considered.

My shop

Before beginning, I should give you some idea of the systems and resources I monitor. Perkins Coie is a moderately large law firm (approximately 260 lawyers in six offices). Like other law firms, our primary application is Word Processing - VS/IIS, to be specific. We also have some legal accounting systems, a mainframe-like data base product, and a few small applications written under PACE.

This load is currently shouldered by two VS 100s and a VS 300, supporting a combined user base of over 500. At this writing, the VS 300 is our primary WP machine (165 workstations in use), with one VS 100 used for litigation support (the BASIS data base product) and the other for accounting. VS/IIS is supported on all three machines, and WP+ is under investigation for possible production use. We hope to select and purchase an office automation product some time this year, and it will probably be used at all Perkins Coie offices to enhance communication within the firm and to the outside as well.

I view these three systems and their applications as representative of the three major categories of all systems: the traditional data processing model, where a few large indexed files are under heavy shared use among many users; the text processing model, with many users, many open files, more CPU usage and less disk activity; and the scientific model, where disk and CPU requirements fluctuate wildly according to the demands of a small user base. Each application model requires a different approach to control and maintenance.

Monitoring log files

All of us have more interesting and rewarding work than monitoring system logs. Unfortunately, the information they contain is important enough to demand regular attention. In the interest of fulfilling this important role (and remaining awake), I have attempted to reduce this review process to the practical minimum and intercede on an exception basis. The philosophy here is to establish guidelines for most parameters, examine data against these standards on a consistent basis, and attempt to retain only data that is relevant.

For example, one of the drudgeries awaiting the systems administrator is that of keeping on top of the error logs. Applying my philosophy of relevancy, I purge the I/O error log once every week, using the date as the file name and placing the files in a common library. After copying the file, I set the retention date two months ahead and attempt to scratch the library; naturally, only files past their retention date will actually be removed. (Those less foolhardy might opt instead to copy the files to a removable volume before scratching.) In this way, files are retained for a finite period (approximately two months) and it is easy to research past error entries.
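To make the rotation scheme concrete, here is a sketch of the same logic in a modern scripting language (Python); the paths, the YYYYMMDD naming, and the 62-day window are illustrative assumptions, not the VS procedure itself.

    from datetime import date, timedelta
    from pathlib import Path
    import shutil

    LOG = Path("/system/errlog")         # hypothetical live I/O error log
    ARCHIVE = Path("/archive/errlogs")   # the "common library" of dated copies
    RETENTION = timedelta(days=62)       # approximately two months

    def rotate_error_log(today=None):
        today = today or date.today()
        ARCHIVE.mkdir(parents=True, exist_ok=True)
        # Copy the live log under the date as its file name, e.g. 19880617.
        shutil.copy2(LOG, ARCHIVE / today.strftime("%Y%m%d"))
        # "Scratch the library": only copies past their retention date go.
        for old in ARCHIVE.iterdir():
            made = date(int(old.name[:4]), int(old.name[4:6]), int(old.name[6:8]))
            if today - made > RETENTION:
                old.unlink()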

I use a similar approach to Wang Systems Networking logs. WSN log files are created every day, and again the date is used as the file name. Since they are so easy to identify, I merely scratch the files more than a month old. For convenience in this task, it's hard to beat the SCRATCH Useraid, which allows you to check off the files to be removed and watch them disappear without a murmur.
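The equivalent cleanup, again as an illustrative Python sketch rather than VS code - with date-named files, age can be read straight from the name:

    from datetime import datetime, timedelta
    from pathlib import Path

    WSN_LOGS = Path("/archive/wsnlogs")            # hypothetical log library
    CUTOFF = datetime.now() - timedelta(days=31)   # "more than a month old"

    for log in sorted(WSN_LOGS.glob("????????")):  # date-named, e.g. 19880518
        if datetime.strptime(log.name, "%Y%m%d") < CUTOFF:
            print(f"scratching {log.name}")
            log.unlink()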

Another review headache is the Transfer log. Again, I have set up a process to identify the contents of the file, protect it from removal for one month, and scratch unneeded files. In this case, I experimented to find the program that produces the log file (FTPRTLOG) and wrote a procedure to create a log file, copy it to a standard library, and protect it. I run this procedure weekly, but I am considering a daily run via our automatic background scheduling system.

Backup

Backup approaches must vary with the applications, so there is little reason to go into great detail here. We create a full system backup every week and send it off-site, then use incremental backups for the remainder of the week. One exception: our accounting files are both large and in frequent use, so we always perform a full backup on that system.

By the way, if you are using an incremental backup approach be aware that many system utilities (such as COPY) will change the modification date and thus trigger a copy of the file on the next backup. Releasing excess space in a library or reorganizing an entire WP library, for example, will update the modification dates. For these reasons, I use a commercial software product that restores all file ownership and date information after performing a copy or release.
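The idea behind such a product can be shown in a few lines. This sketch assumes a POSIX-style system (the VS original worked against the VTOC) and simply saves and restores the file's dates around the copy:

    import os, shutil

    def copy_preserving_dates(src, dst):
        before = os.stat(src)                # remember dates (and ownership)
        shutil.copyfile(src, dst)
        # Restore access and modification times so the next incremental
        # backup does not see the file as changed.
        os.utime(dst, (before.st_atime, before.st_mtime))
        try:
            os.chown(dst, before.st_uid, before.st_gid)
        except (PermissionError, AttributeError):
            pass  # ownership restore needs privileges (or a POSIX system)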

At Perkins, backups are performed at night (the accounting system) and early in the day (the other two systems). This reflects typical usage and minimizes impact on the users. Batch accounting jobs are started after the nightly backup and run more quickly with no interference from the users.

Borrowing an idea from other shops in the area, we no longer print any backup listings; instead, they are copied onto a removable disk and retained. With backup listings that typically run over 600 pages, this approach saves paper and allows on-line searches for files. Naturally, there is some risk that the listing may be lost or rendered unusable, so this is not the best method for all needs.

Disk management

Next to Word Processing aids, the bulk of the USERAIDS collection consists of disk management utilities. Likewise, much of the commercial utility software available for the VS is related to disk and file management. For all of this effort, however, there are few products that materially aid the disk management process.

Let's first go over disk space measurement - that is, the number of disk blocks available for new files. One exceptional USERAID for disk space is DISKUSE. This simple program performs only one function: it shows the amount of disk space used for all volumes, expressed as a percentage. It's remarkable how much this says about the state of your disk drives.
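A rough modern analogue of DISKUSE takes only a few lines; the volume list here is a placeholder:

    import shutil

    VOLUMES = ["/", "/home", "/archive"]     # placeholder volume list

    for vol in VOLUMES:
        usage = shutil.disk_usage(vol)
        print(f"{vol:<12} {100 * usage.used / usage.total:5.1f}% used")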

Other disk management USERAIDS: CHKBLKS, which produces a list of files using more than a stated number of blocks; FILEINFO, a utility that builds a file of disk statistics that can be summarized using REPORT; DISPMANY, a multi-file display utility that could be used to review and scratch old print files; DISPRINT, an excellent utility that allows each user to review and manage their own print files; RELFILE, which releases extra space in files (be careful!); LIBSIZE, which sums the block usage of a library; and WPSREORG, a Word Processing reorganization utility that also shows the number of blocks saved. We also use EVTOC, Richard Evans's summary report of block usage by library. Those interested in disk management by guilt may wish to look into SORTFILE, which can produce file usage reports based on file ownership. (Alas, my copy of SORTFILE (version 2.01) does not work under release 7.13 of the operating system!)
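Here is a rough imitation of two of these, CHKBLKS and LIBSIZE, with directories standing in for libraries and an assumed 2K block size:

    from pathlib import Path

    BLOCK = 2048    # assumed block size in bytes

    def chkblks(root, min_blocks):
        """List files using more than a stated number of blocks."""
        for f in Path(root).rglob("*"):
            if f.is_file() and f.stat().st_size // BLOCK >= min_blocks:
                print(f, f.stat().st_size // BLOCK, "blocks")

    def libsize(library):
        """Sum the block usage of one library (directory)."""
        return sum(f.stat().st_size // BLOCK
                   for f in Path(library).rglob("*") if f.is_file())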

Until recently, I used FILEINFO to collect data on file usage, then used the REPORT utility to produce an exception listing of grossly over-allocated files. The names on this report were then processed by CREATE into a table file and passed individually to a reorganization procedure using a local program called RUNMULTI. Complicated? You bet! That's why I broke down and bought a commercial disk space tool for this purpose.
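For the curious, the whole chain collapses into a single sketch; every name here is a hypothetical stand-in for the FILEINFO/REPORT/CREATE/RUNMULTI steps:

    from pathlib import Path

    def reorganize(f):
        print("would reorganize", f)    # stand-in for the reorg procedure

    def exception_pass(root, waste_ratio=2.0):
        # Gather statistics and keep only grossly over-allocated files.
        for f in Path(root).rglob("*"):
            if not f.is_file():
                continue
            st = f.stat()
            allocated = st.st_blocks * 512   # POSIX-only allocation figure
            if st.st_size and allocated / st.st_size >= waste_ratio:
                reorganize(f)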

Disk fragmentation

Disk space is not the only disk management issue; the degree of disk fragmentation is of equal importance for effective system usage. To understand disk fragmentation, consider the worst case: large Word Processing systems. Over 500 files are edited in a typical day at our shop, and any number of them will require either more or less disk space than before modification. When excess space is released from a file, it often constitutes a small island in the midst of a sea of large files. Since this space is usually too small for regular use, it sits; as time passes, the number of small, unused disk areas grows. Before very long, a new file of any appreciable size cannot find a single contiguous area large enough - even though DISKUSE may show only 65% disk usage!
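A toy example shows why the usage percentage alone is misleading - the free space is real, but no single extent is big enough:

    free_extents = [12, 3, 40, 7, 5, 2, 30, 9]   # scattered free areas, in blocks

    total_free = sum(free_extents)   # 108 blocks - what a usage report sees
    largest = max(free_extents)      # 40 blocks - what allocation actually gets

    print(f"free: {total_free} blocks, largest contiguous extent: {largest}")
    # A 50-block file cannot be allocated even though 108 blocks are free.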

The single best solution to this problem is to remove all files, initialize the disk, and restore the files. This process will bunch all files together for maximum efficiency. Unfortunately, a full restore from backup is both dangerous and time-consuming - dangerous because it requires perfection in the backup copy, and time-consuming because every file on the volume must be rewritten.

An acceptable compromise is the Compress-In-Place (CIP) utility distributed by Wang. This works by copying files together and recreating the disk catalog (the Volume Table of Contents, generally known as the VTOC). Again, this is a dangerous process, so a backup must be made before beginning. Some of the commercial disk management products also have a CIP option that works even while the disk is in use; these may be run frequently to lessen the fragmentation caused by heavy file activity.
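The principle of a CIP pass reduces to a few lines. This toy version only repacks an extent list and ignores the VTOC rebuild that makes the real utility dangerous:

    # Extents as (start_block, length, file_id) on a fictitious volume.
    extents = [(0, 10, "A"), (25, 5, "B"), (60, 40, "C")]

    def compress_in_place(extents):
        next_free = 0
        packed = []
        for start, length, fid in sorted(extents):
            packed.append((next_free, length, fid))   # slide extent downward
            next_free += length
        return packed, next_free    # one contiguous free area from here on

    packed, free_start = compress_in_place(extents)
    print(packed, "- free space begins at block", free_start)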

A final note on fragmentation: utilities that release excess disk allocations or scratch or archive files will all contribute to fragmentation problems. For this reason, I usually back up the disk, release the file space (through scratching, archiving, or releasing excess space), then run a CIP utility to regain some of the newly-released space in contiguous form.

File integrity

Beyond the physical location of the file and its disk space efficiency, there is also a group of issues I call file integrity. This term encompasses the internal order of the file, the state of its elements (particularly compression characters), and the physical order of its index and data blocks. A related issue is whether the VTOC accurately reflects the record count, location, and other parameters of the file.

There are a number of approaches to monitoring file integrity. First, frequently-used indexed and alternate-indexed files require regular reorganization for best performance and minimum space usage. Reorganization is usually accomplished by using Wang's COPY utility to create a new file, then using this new file in place of the previous copy. Naturally, care must be taken so that the original file is not damaged or scratched before the new file is ready.
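The copy-then-swap pattern, sketched here with hypothetical file names - the original is never touched until the new copy checks out:

    import os, shutil

    def reorganize_safely(path):
        tmp = path + ".reorg"
        shutil.copyfile(path, tmp)    # stand-in for COPY writing the new file
        if os.path.getsize(tmp) != os.path.getsize(path):
            os.remove(tmp)            # new copy is suspect; keep the original
            raise RuntimeError("reorganized copy incomplete")
        os.replace(tmp, path)         # swap only once the new file is ready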

Best performance is achieved when frequently-used files are placed near the VTOC to minimize the movement of the disk heads. Regular file reorganizations can result in poor placement of the file on the disk by allowing it to creep further from the disk catalog. Look to the DISKMAP, VOLFRAG, or FILELABL utilities to show placement of important files. To correct poor file placement, perform a full disk backup and restore important files first, overlaying the remainder of the disk's contents afterwards.

Wang's VERIFY utility (not to be confused with the VERIFY option of DISKINIT) should be run regularly to check the record count and other parameters against the corresponding values in the VTOC. Since this utility must have complete access to the file, it should be run after hours. We use a background procedure to run VERIFY nearly every night after midnight.
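In miniature, a VERIFY-style check recounts the records and compares the result with the cataloged value; the line-per-record assumption and the catalog format here are invented for illustration:

    import json

    def verify(data_file, catalog_file):
        with open(data_file) as f:
            actual = sum(1 for _ in f)     # one record per line, by assumption
        with open(catalog_file) as f:
            expected = json.load(f)["record_count"]   # stand-in for the VTOC
        if actual != expected:
            print(f"{data_file}: catalog says {expected}, found {actual}")
        return actual == expected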

Whew - that's enough for one month. Next time I'll continue into access and application control.




Copyright © 1988 Dennis S. Barnes
Reprints of this article are permitted without notification if the source of the information is clearly identified.