Transient and Persistent Data Access 89

window lengths much longer than the time needed to complete a typical burst, one might usually expect to see an entire burst, isolated somewhere in the window. But for window lengths shorter than this characteristic time, and assuming that such a characteristic time actually exists, one might usually expect to see some portion of a burst, spread throughout the window. The characteristic burst time, and the corresponding number of requests, could then be identified by the transition between these two patterns of behavior.

The result (7.2) is in sharp contrast, however, with the outcome of the thought experiment just presented. Patterns of reference that conform to the hierarchical reuse model, although bursty, do not preferentially exhibit bursts of any specific, quantifiable length. Instead, they are bursty at all time scales.

It is reasonable to hope, therefore, that individual transient or persistent data items can be distinguished from each other by applying the metric (7.1). Provided that the time window is long enough for the persistence of a given item of data to become apparent, this persistence should be reflected in a high outcome for P. On the other hand, regardless of the time scale, we may hope to recognize transient data from a small outcome for P.

Two I/O traces, each covering 24 hours at a moderate-to-large OS/390 installation, were used to investigate this idea. The two traced installations were:

A. A large data base installation running a mix of CICS, IMS, DB2, and batch.

B. A moderate-sized DB2 installation running primarily on-line and batch DB2 work, with on-line access occurring from a number of time zones in different parts of the world.

Figure 7.1. Distribution of probability density for the metric P: track image granularity.

90 THE FRACTAL STRUCTURE OF DATA REFERENCE

Figure 7.2. Distribution of probability density for the metric P: cylinder image granularity.

Figure 7.3. Distribution of probability density for the metric P: file granularity, weighted by storage.

Figures 7.1 through 7.3 present the observed distribution of the metric P for each installation. The three figures present distributions measured at three levels of granularity: track image, cylinder image, and the total storage containing a given file. Here files should be taken to represent the highest (coarsest) level of granularity, since the average file size tends to correspond to many cylinder images (9 in one recent survey).

The three figures show a pronounced bimodal behavior in the metric P. Individual data items tend strongly toward the two extremes of either P ≈ 0 or P ≈ 1. This appears to confirm the existence of the two contrasting modes of behavior, persistent and transient, as proposed in the previous paragraphs.

Based upon the region of P where the persistent mode of behavior becomes clearly apparent in the figures, we shall define the observations for a given data item as reflecting persistent behavior if

(7.3)

Otherwise, we shall take the observations to reflect transient behavior.

The figures also show that the role of persistent data is increasingly important at higher levels of granularity. Only relatively few observed track images behave in a persistent manner; however, a substantial percentage of observed file storage was persistent, when measured at the file level of granularity.

2. PERIODS UP TO 24 HOURS

The analysis just presented was limited to studying the behavior of the metric P relative to a specific, selected time window of 24 hours. If we now use (7.3) to focus our attention specifically on the issue of whether observed behavior appears to be persistent or transient, however, it becomes possible to investigate a broad range of time periods. Such an investigation is important, since clearly any metric purporting to distinguish persistent from transient behavior should tend to show results that are robust with respect to the exact choice of time interval.

Figures 7.4 through 7.6 present the average percentage of storage capacity associated with track images, cylinder images, or files seen to be active at the two study installations, during windows of various durations, ranging from 15 minutes up to 24 hours. As we should expect, this percentage depends strongly on the granularity of the object being examined. Considered at a track level of granularity, it appears that only 10 to 20 percent of storage capacity tends to be active over a 24-hour period (based upon the two study installations); at a cylinder level of granularity, roughly 20 to 40 percent of storage is active; and at a file level, 25 to 50 percent of the capacity is active.

Figures 7.7 through 7.9 explore persistent data at the same two installations. The three figures present the percentage of active track images, cylinder images,

Figure 7.4. Active track images as a function of window size.

Figure 7.5. Active cylinder images as a function of window size.

Figure 7.6. Active file storage as a function of window size.

Figure 7.7. Persistent track images as a function of window size.
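The windowed measurement summarized in Figures 7.4 through 7.6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually applied to the two traces: the record layout (time in seconds, item identifier), the use of non-overlapping windows, and the names `active_fraction` and `to_cylinders` are all hypothetical, and a geometry of 15 tracks per cylinder is assumed for the roll-up from tracks to cylinders.

```python
from collections import defaultdict


def active_fraction(trace, capacity, window, span):
    """Average fraction of total capacity seen active per window.

    trace    -- iterable of (time_seconds, item_id) reference records
    capacity -- dict mapping every item_id in storage to its size
    window   -- window length in seconds
    span     -- total trace duration in seconds
    """
    n_windows = int(span // window)
    if n_windows == 0:
        raise ValueError("window is longer than the trace")
    active = defaultdict(set)            # window index -> items referenced in it
    for t, item in trace:
        idx = int(t // window)
        if 0 <= idx < n_windows:         # ignore any trailing partial window
            active[idx].add(item)
    total = sum(capacity.values())
    touched = sum(capacity[i] for items in active.values() for i in items)
    return touched / (n_windows * total)


def to_cylinders(trace, tracks_per_cylinder=15):
    """Re-express a track-level trace at cylinder granularity.

    tracks_per_cylinder=15 is an assumed device geometry.
    """
    return [(t, track // tracks_per_cylinder) for t, track in trace]


# Toy data (hypothetical): 60 tracks, i.e. 4 cylinders, traced for one hour.
trace = [(10, 0), (20, 1), (1000, 0), (1900, 31), (3500, 59)]
tracks = {i: 1 for i in range(60)}
cylinders = {c: 15 for c in range(4)}

for w in (900, 3600):                    # 15-minute and 1-hour windows
    print(w,
          active_fraction(trace, tracks, w, 3600),
          active_fraction(to_cylinders(trace), cylinders, w, 3600))
```

In this toy example the active fraction grows both with the window length and with the coarseness of the granularity, which is the qualitative pattern described in the text. For files, `capacity` would hold each file's storage, so that the result is weighted by storage rather than by file count.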