Access To Wang - January 1991 - Copyright © 1991 Dennis S. Barnes

File Experiments

Existing disk management software doesn't fulfill simple needs
*From "VS Workshop", Access to Wang, January 1991*

Sometimes it's the simple questions that really make you work. Suppose your manager just walked in and announced plans to replace the General Ledger system. The new G/L software will be installed next week, and users will begin entering transactions in parallel starting the following week. She wants to know what the present disk storage requirements are to assess the impact of the additional data and program files, and expects an immediate response from you. After an embarrassed silence, you mutter that you will have an answer later in the day and scurry off to build some reports.

The tools of the systems trade have improved, but we still have so far to go. Questions that seem obvious to humans of average intelligence defy definition as an acceptable query to a sophisticated data base. Sure, modern disk management software allows you to select and report on usage from several perspectives, but the likely result is too much detail and too little clarity.

The following discussion is intended to serve two purposes: first, to reveal some of my experiments in applying the concept of sets to file identification; second, to challenge vendors of disk management software to improve on the current situation.

I got interested in file selection by attribute while reviewing the SELCOPY utility (a USERAID or VSAID, distributed by the United States Society of Wang Users). SELCOPY copies files matching entered criteria to another disk or to tape. Possible selection criteria include the owner ID (a single user or blank for all users), file protection class (A through Z, #, @, $, or blank), the file type (Consecutive, Indexed, Print, Program, Alternate-Indexed, Log, or Word Processing), and date ranges. What if the same selection approach could be applied to disk usage reporting - perhaps including ambiguous (wild-card) file, library, or volume references?
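As a rough illustration of that idea, the sketch below applies SELCOPY-style criteria to usage reporting instead of copying. It is written in Python purely for illustration; the `FileRecord` layout, the `matches` and `blocks_used` helpers, and the sample libraries are all hypothetical and do not reflect the format of SELCOPY or any other utility.

```python
from dataclasses import dataclass
from datetime import date
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class FileRecord:
    # Hypothetical record layout, invented for this sketch.
    volume: str
    library: str
    name: str
    owner: str
    protection: str
    file_type: str
    blocks: int
    modified: date


def matches(rec, *, volume="*", library="*", name="*", owner=None,
            protection=None, file_type=None, since=None, until=None):
    """Return True when rec meets every criterion; None or "*" means 'any'."""
    # Wild-card (ambiguous) references for volume, library, and file name.
    if not (fnmatchcase(rec.volume, volume)
            and fnmatchcase(rec.library, library)
            and fnmatchcase(rec.name, name)):
        return False
    # Exact-match attributes.
    if owner is not None and rec.owner != owner:
        return False
    if protection is not None and rec.protection != protection:
        return False
    if file_type is not None and rec.file_type != file_type:
        return False
    # Inclusive date range.
    if since is not None and rec.modified < since:
        return False
    if until is not None and rec.modified > until:
        return False
    return True


def blocks_used(records, **criteria):
    """Total blocks for all records that satisfy the criteria."""
    return sum(r.blocks for r in records if matches(r, **criteria))


# Made-up sample data.
SAMPLE = [
    FileRecord("DISK01", "GLDATA", "LEDGER", "DSB", "A", "Indexed", 1200, date(1990, 12, 3)),
    FileRecord("DISK01", "GLDATA", "JOURNAL", "DSB", "A", "Consecutive", 300, date(1990, 11, 15)),
    FileRecord("DISK02", "APDATA", "VENDORS", "ACT", "B", "Indexed", 800, date(1990, 10, 1)),
    FileRecord("DISK02", "GLPRINT", "TRIALBAL", "DSB", " ", "Print", 150, date(1990, 12, 20)),
]

print(blocks_used(SAMPLE, library="GL*"))                       # 1650
print(blocks_used(SAMPLE, library="GL*", file_type="Indexed"))  # 1200
```

Each criterion narrows the set independently, so a query such as "all Indexed files in libraries starting with GL" is simply the intersection of two simple tests.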

Disk Management paradise

The ideal disk management software would review status and activity over time, isolating trends and highlighting abnormalities. It would allow queries in natural language (e.g. "What is the space used by REPORT files throughout the system?") and return the answers in reasonable time. It would format information graphically for better understanding but still retain the detail for further interrogations. In short, this software doesn't exist yet.

Consider the following questions about disk usage:

How much of our system is taken up by accounting data files for the current companies? How much by the object files for the system?

How much disk space is consumed by WP documents?

At what period of the month do our disks fill up most? Which libraries are growing fastest? When can we expect to reach the maximum storage point for the year?

In all of these examples there are elements that elude the machine but appear obvious to us.

How would you define the limits of "accounting data files" or "current companies"?

Document usage might be determined by summing the space used by files of the Word Processing file type. Only documents have a file type of Word Processing, right? Nope; so do WPS object files, work files for WP, and possibly other files as well. And what about those documents on volumes not officially recognized by WP?

Reviews of usage over time imply comparisons with earlier usage records, which raises further questions. When and how will these records be created? How often should the sample be taken? What will perform the actual comparison?
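One possible answer to the comparison question, sketched in Python under the assumption that each sample is saved as a simple mapping of library name to blocks used. The function names, library names, and figures are invented for illustration only.

```python
def growth_by_library(earlier, later):
    """Change in blocks for every library seen in either snapshot."""
    libraries = set(earlier) | set(later)
    return {lib: later.get(lib, 0) - earlier.get(lib, 0) for lib in libraries}


def fastest_growing(earlier, later, top=3):
    """The 'top' libraries ranked by block growth, largest first."""
    deltas = growth_by_library(earlier, later)
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)[:top]


# Hypothetical month-end snapshots (library -> blocks used).
NOVEMBER = {"GLDATA": 1500, "APDATA": 800, "WPDOCS": 4000}
DECEMBER = {"GLDATA": 1900, "APDATA": 780, "WPDOCS": 4100, "GLPRINT": 150}

print(fastest_growing(NOVEMBER, DECEMBER, top=2))
# [('GLDATA', 400), ('GLPRINT', 150)]
```

A library missing from one snapshot is treated as zero blocks, so newly created and deleted libraries show up as growth or shrinkage rather than being dropped from the comparison.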

Only a few disk management tools summarize the information in an understandable way, such as by library. Fewer yet produce charts to explain their results. Ever tried to make a point with your one-minute manager by waving around a 75-page report?

Experiments with FILEINFO

The FILEINFO utility extracts file information by volume or library and creates a file that can be interrogated. It provides a means of testing some of the file identification concepts brought up above. I have covered FILEINFO in several prior columns (see "The FILEDATA procedure" in the July 1987 issue of Access 87 for a full description), so I will not dwell on it here. For this experiment I needed to review information on all files on the system. Since FILEINFO can work with only one volume at a time, I wrote a procedure that extracts each volume's data and creates a work file for each. After all volumes had been processed, the individual volume data files were merged into a single data file using the CREATE utility.

For the first trials, I concentrated on the file type, using the REPORT utility to sum block usage by file type by volume. The results were then keyed into a spreadsheet for further reporting and graphing (see Figure 1, below).

The results were surprising. Previously, I had no clear idea of the space usage of various file types and was confused by the number of files in some libraries. This report gets to the heart of the matter by showing disk commitment in blocks, the only comparable measure. Further refinements to this spreadsheet added percentages by disk, a file count by type by volume, and graphs of the results. Finally, our storage requirements make some sense.
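The arithmetic behind such a report is simple enough to sketch. The Python below is an illustration only, not the REPORT or spreadsheet logic actually used: `summarize` and `percent_used` are hypothetical helpers, and the three sample records are invented. The totals passed to `percent_used` are the "Total" column of Figure 1.

```python
from collections import defaultdict


def summarize(records):
    """Sum blocks and count files for each (file type, volume) pair.

    records: iterable of (volume, file_type, blocks) tuples.
    Returns {file_type: {volume: [blocks, file_count]}}.
    """
    summary = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for volume, file_type, blocks in records:
        cell = summary[file_type][volume]
        cell[0] += blocks   # disk commitment in blocks
        cell[1] += 1        # number of files
    return summary


def percent_used(type_totals):
    """Each file type's share of all allocated blocks, to one decimal."""
    grand_total = sum(type_totals.values())
    return {t: round(100 * b / grand_total, 1) for t, b in type_totals.items()}


# Invented records, only to exercise summarize().
SAMPLE = [("DISK01", "Indexed", 1200), ("DISK01", "Indexed", 300),
          ("DISK02", "Print", 150)]
print(summarize(SAMPLE)["Indexed"]["DISK01"])   # [1500, 2]

# "Total" column from Figure 1.
FIGURE_1_TOTALS = {
    "Alt-Indexed": 165135, "Consecutive": 64719, "Indexed": 212931,
    "Log": 24051, "Object": 76055, "Print": 25306, "WP": 85322,
}
print(percent_used(FIGURE_1_TOTALS)["Indexed"])  # 32.6
```

Working in blocks rather than file counts keeps the types comparable: two files of very different sizes count the same in a file tally but not in a block total.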

Defining file type sets

While file types are easy to extract and interpret, they do not provide a fine enough breakdown of volume usage. What about the other factors that identify a file, such as the library or file name, record size, or internal contents? Some of these subsets can be defined externally (see Figure 2, below); others would require an understanding of the internal contents of the file. A good example is Procedure files: other than naming conventions, how would you separate Procedure files from COBOL source, data files, or any other 80-character consecutive file?
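The externally definable subsets could be expressed as an ordered list of rules, as in the Python sketch below. Everything here is an assumption made for illustration: the `Rule` structure, the `classify` function, and above all the naming conventions (libraries containing PROC or SRC, document libraries starting with DOC). The sketch looks only at external characteristics, so it cannot separate files that differ solely in their internal contents.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    subset: str                        # label reported for matching files
    file_type: str                     # file type the rule applies to
    library: str = "*"                 # wild-card library pattern
    name: str = "*"                    # wild-card file name pattern
    record_size: Optional[int] = None  # None means any record size


# Hypothetical site conventions; the first matching rule wins,
# so specific rules must come before general ones.
RULES = [
    Rule("Procedures", "Consecutive", library="*PROC*", record_size=80),
    Rule("Program source", "Consecutive", library="*SRC*", record_size=80),
    Rule("WP documents", "Word Processing", library="DOC*"),
    Rule("Consecutive data", "Consecutive"),
]


def classify(file_type, library, name, record_size, rules=RULES):
    """Return the subset label of the first rule the file satisfies."""
    for rule in rules:
        if (rule.file_type == file_type
                and fnmatchcase(library, rule.library)
                and fnmatchcase(name, rule.name)
                and rule.record_size in (None, record_size)):
            return rule.subset
    return "Unclassified"


print(classify("Consecutive", "GLPROC", "CLOSE", 80))     # Procedures
print(classify("Consecutive", "GLDATA", "JOURNAL", 120))  # Consecutive data
```

An 80-character consecutive file stored outside the expected libraries falls through to the general rule, which is exactly the kind of exception a rule list like this has to be extended to handle.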

It should be obvious that neither FILEINFO nor any other disk management tool currently available offers what is needed here. The logical understanding of file usage to this degree must include external characteristics, file naming conventions, and exceptions in its definitions. Refinement of queries to this degree is not practical with crude tools like FILEINFO.

Software developers: are you listening?

Figure 1: Sample Report - Disk Usage by File Type

Numeric values other than percentages indicate disk blocks allocated.

File Type      DISK01   DISK02   DISK03   DISK04    Total  % used
Alt-Indexed    25,094   24,192   84,596   31,253  165,135    25.3
Consecutive    21,689   12,316    6,920   23,794   64,719     9.9
Indexed        87,702   17,660   47,431   60,138  212,931    32.6
Log            22,838      465      258      490   24,051     3.7
Object         23,191    2,252    1,600   49,012   76,055    11.6
Print              41   16,217      104    8,944   25,306     3.9
WP                  0   85,192        0      130   85,322    13.1
Totals        180,555  158,294  140,909  173,761  653,519   100.0

Figure 2: Sample File Type Subsets (Partial List)

File Type           Usage                     Identifiers
Alternate-Indexed   Data files                Library
Consecutive         Data files                Library; other
                    Procedures                Contents
                    Program source files      Library
                    Wang INFO files           Library
Indexed             CONTROL files             Library; size
                    Data files                Library
                    REPORT files              Library; size
Log                 Data files
Object              Program files
Print               General reports           Name (?)
                    Screen dumps              Size
Word Processing     Documents                 Library; name
                    Font files                Library
                    Data exchange files       Library; name
                    WP work and queue files   Library
                    WPS object files          Library; name
