Hierarchical Reuse Model

Figure 1.3. Distribution of page frame interarrival times. Each curve shows a user or system storage pool at one of 11 surveyed VM installations.

Figure 1.2 assumes that we are interested in a cache management scheme based upon track images. While a scheme of this type usually applies to storage control cache, the caching performed in host processor memory is normally based on smaller units of granularity. For example, the minidisk cache facility mentioned earlier manages data in units of one page frame (one block of 4096 bytes). The unit of granularity used in managing the cache has an important effect upon its interarrival times and miss ratio. To illustrate the impact of cache granularity, Figure 1.3 presents the interarrival times observed at the page level of granularity, based upon the same traces as those presented in Figure 1.2. Note that at the page level of granularity, a reasonable guess for the value of θ would be roughly half of that obtained assuming cache management based upon tracks (thus, θ_page ≈ 0.125).

4. VISITS TO MEMORY

So far, we have observed that data item interarrivals tend to exhibit a fractal structure. The present section shows that this fact, beyond being interesting in itself, is extremely helpful to the performance analyst. The power of (1.3) to solve practical problems comes from the simple and mathematically tractable statements that it yields about the time spent in memory during a cache visit. We shall examine closely the structure of such visits, which the hierarchical reuse model predicts to be predominantly transient. Based upon our analysis of cache visits, we then calculate the resulting memory requirements. Finally, we illustrate these results by assessing the effectiveness of typical cache memories.

4.1 VISIT DURATION

In the previous section, we examined the visit to cache of a track referenced exactly once.
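Before extending the discussion to arbitrary visits, the θ estimates quoted above can be made concrete. The sketch below (function names and parameters are illustrative assumptions, not from the text) draws synthetic interarrival times with the heavy-tailed survival function P[U > x] = (x/x_min)^(−θ), then recovers θ as the slope of the survival curve on log-log axes, mirroring the way curves such as those in Figures 1.2 and 1.3 are read:

```python
import math
import random

def sample_interarrivals(theta, x_min=1.0, n=200_000, seed=42):
    """Draw synthetic interarrival times with the heavy-tailed survival
    function P[U > x] = (x / x_min) ** -theta, via inverse-CDF sampling."""
    rng = random.Random(seed)
    return [x_min * rng.random() ** (-1.0 / theta) for _ in range(n)]

def estimate_theta(samples, x_lo, x_hi):
    """Estimate theta as minus the slope of log P[U > x] versus log x,
    probed at two points x_lo < x_hi in the tail of the distribution."""
    n = len(samples)
    surv_lo = sum(1 for u in samples if u > x_lo) / n
    surv_hi = sum(1 for u in samples if u > x_hi) / n
    return -(math.log(surv_hi) - math.log(surv_lo)) / (math.log(x_hi) - math.log(x_lo))

# Track-level and page-level guesses from the text: theta near 0.25 and 0.125.
track_theta = estimate_theta(sample_interarrivals(0.25), x_lo=10.0, x_hi=1000.0)
page_theta = estimate_theta(sample_interarrivals(0.125), x_lo=10.0, x_hi=1000.0)
```

Applied to measured rather than synthetic interarrival times, the same slope estimate yields a workload's θ at either granularity.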
We now extend the discussion to an arbitrary visit to cache, comprising one or more references to some identified track.

THE FRACTAL STRUCTURE OF DATA REFERENCE

Let us first subdivide the entire visit into two parts:

1. The back end, starting with the last reference during the visit, and ending when the track is removed from the cache. By our earlier assumptions, the back end always has the same duration, namely τ.

2. The front end. This interval starts with the first reference during the visit, and goes up to (but does not include) the last reference. (For single-reference visits, the front end is null and has zero duration.)

Let the average duration of the front end be called ∆τ. Thus, the total average duration T of the visit as a whole is given by

T = ∆τ + τ.     (1.7)

Suppose, now, that the cumulative distribution function of the time U between references to a given track is F(.). We do not necessarily assume that F is given by (1.3); any distribution will do at this stage. Let

g(τ) = [1/F(τ)] ∫₀^τ x dF(x)     (1.8)

be the average length of time prior to a hit since the previous reference to the same track. By definition, ∆τ specifies the amount of front end time that passes per cache miss. Similarly, g(τ) specifies the amount of front end time that passes per cache hit. Letting r be the total rate of I/O requests and m the miss ratio, this gives two ways of calculating the total rate at which front end time occurs:

r m ∆τ = r (1 − m) g(τ),

or,

∆τ = [(1 − m)/m] g(τ).     (1.9)

This is a general result; it does not depend upon the assumption of hierarchical reuse probability.

If, however, we apply the hierarchical reuse model, we can calculate g(τ) by plugging (1.10) into (1.8). Due to the factor of x that appears in the integral, we choose a strategy of formal evaluation throughout its entire range, including values of x approaching zero (which, although problematic from the standpoint of the model, are insignificant).
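The formal evaluation just described can be written out explicitly. As a reconstruction consistent with the surrounding results (the symbol τ₀ for the model's minimum time scale is assumed here), suppose (1.10) gives the interarrival tail P[U > x] = (x/τ₀)^(−θ), so that the miss ratio is m = (τ/τ₀)^(−θ) and the density is f(x) = θ τ₀^θ x^(−(1+θ)). Then:

```latex
\int_0^{\tau} x\,f(x)\,dx
  = \theta\,\tau_0^{\theta} \int_0^{\tau} x^{-\theta}\,dx
  = \frac{\theta}{1-\theta}\,\tau_0^{\theta}\,\tau^{1-\theta}
  = \frac{\theta}{1-\theta}\,\tau\,\Bigl(\frac{\tau}{\tau_0}\Bigr)^{-\theta}
  = \frac{\theta}{1-\theta}\,\tau m .
```

Dividing by F(τ) = 1 − m, as (1.8) requires, gives g(τ) = [θ/(1 − θ)] τ m/(1 − m).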
This evaluation yields

g(τ) = [θ/(1 − θ)] τ m/(1 − m).

Substituting into (1.9) now gives

∆τ = [θ/(1 − θ)] τ.     (1.11)

Combining this result with (1.7), we therefore obtain

T = τ/(1 − θ).     (1.12)

The average residency time is directly proportional to the single-reference residency time. It is important to note that, for typical values of θ, the average length of the front end is only a fraction of the entire cache visit. For example, the guesstimate (1.6) suggested in the previous subsection yields a front end which averages one quarter of the entire cache visit. This means that a typical visit to cache consists of a rapid succession of requests, followed by a relatively much longer period in which the track ages out of the cache. The vast majority of tracks, visiting a real cache, tend to exhibit exactly the pattern of behavior just described. Occasional tracks can be found whose use is so persistent that they stay in cache for extended periods, but such tracks tend to make a relatively small contribution to the total time spent in cache by all tracks.

4.2 IMPLICATIONS FOR MEASUREMENT

A simple, easily applied technique exists that makes it possible to estimate the average residency time T, in a running environment, based upon live measurements. To accomplish this, we may proceed by applying Little's law. This fundamental result, widely applied in many areas of performance analysis, was first proved by J. D. C. Little [14]. It states that for any system (where system is very broadly defined),

population = arrival rate × time in system,     (1.13)

for the averages of these three quantities. Suppose that measurements are available (as they normally are in a VM or OS/390 environment) for the rate of requests r and the miss ratio m. Let z represent the amount of cache memory occupied by a track, and s represent the total cache size.
Then we may conclude, by (1.13), that the population of tracks currently visiting the cache is given by

s/z = r m T.

Therefore,

T = s/(z r m).

By contrast, measurements of the single-reference residency time τ require specific instrumentation that is not available as a part of standard storage subsystem reporting. The comparative ease of measuring the average residency time, and the comparative difficulty of measuring the single-reference residency time, tend to make the former very attractive as a basis for day-to-day capacity planning. Planning based upon the average residency time is investigated further in Chapter 3. Nevertheless, Computer Associates' CA-ASTEX software package for storage monitoring does include the capability to report the single-reference residency time. The single-reference residency time is also reported, based upon analysis of trace data, by the cache simulation tool called Cache Analysis Aid (CAA). This IBM field diagnostic tool is not offered directly as a product, but IBM storage customers can usually arrange for its use.

By taking advantage of (1.12), it is also possible to estimate the value of θ for a running workload, given measurements of T and τ:

θ = 1 − τ/T.     (1.16)

Another interesting implication of the related equations (1.11) and (1.12) involves determining, from trace data, whether a given track is in cache at the time it is requested. Simulation tools, such as CAA, do this by applying the rules for LRU management to the sequence of requests occurring in the trace. Let us now consider how to simplify this method of analysis. One approach is to apply the criterion of time-in-cache. This greatly reduces the amount of information needed about events prior to the current request. Rather than reproducing the entire sequence of cache management actions leading up to the current request, we need only find the previous request made to the same track.
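The measurement relationships above reduce to two one-line computations, sketched below. The numbers used are illustrative assumptions only (a 256 MB cache, track images of 57,344 bytes, 500 requests per second, miss ratio 0.2), and the function names are hypothetical:

```python
def average_residency_time(cache_bytes, track_bytes, request_rate, miss_ratio):
    """Little's law estimate of T: the in-cache track population (s / z)
    equals the visit arrival rate (r * m) times the average visit time T."""
    population = cache_bytes / track_bytes          # s / z
    visit_rate = request_rate * miss_ratio          # r * m
    return population / visit_rate                  # T = s / (z * r * m)

def theta_from_residency(single_ref_residency, avg_residency):
    """Invert (1.12), T = tau / (1 - theta), to get theta = 1 - tau / T."""
    return 1.0 - single_ref_residency / avg_residency

T = average_residency_time(256 * 2**20, 57_344, 500.0, 0.2)
theta = theta_from_residency(0.75 * T, T)   # tau assumed to be 0.75 * T here
```

Note that the second computation requires τ as well as T, so it is only available when one of the specialized tools just mentioned supplies the single-reference residency time.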
The time since this request can then be compared with the single-reference residency time to assess whether the request is a hit or a miss.

Nevertheless, both the approach based upon the criterion of time-in-cache and that based upon LRU list simulation share a common drawback: their scope of application excludes a substantial period of time at the beginning of the trace. For the former, the time excluded is equal to τ; for the latter, a "warm up" period is needed long enough to fill the simulated cache memory (ordinarily, the length of the "warm up" period is somewhere between τ and T).

The result (1.11) makes possible an entirely different method of analysis. Suppose that the time line is divided into equal intervals (tᵢ, tᵢ₊₁], where tᵢ₊₁ − tᵢ = τ. If a track is requested at least once during an interval, consider (for purposes of having a convenient term) that the track has been touched or, equivalently, that a touch to the track has occurred. Then the "burstiness" of references, together with the fact that no two references during an interval can be separated in time by long enough for a track to age out of cache, implies that:

1. Most touches to a track entail a miss.

2. If a touch does entail a miss, then it must be the first I/O to the track during the interval; any subsequent I/O's must be hits.

Moreover, it is possible to apply (1.11) to quantify the term "most" just used in observation (1) above. By taking advantage of the fact that most touches entail a miss, it is possible to estimate the number of misses during an interval merely by counting the number of touches (the number of distinct tracks referenced). Since this method uses no information about events in previous intervals, none of the available data must be excluded.

To calculate the probability that a given touch to a track entails a miss, observe that for every miss, there is a corresponding visit to the cache.
For each visit, in turn, there is a corresponding back end; and for each back end, there is exactly one back end I/O demarking the point where it starts. In addition, an interval cannot contain references from more than one distinct visit; so no more than one miss I/O, and no more than one back end I/O, can occur in a given interval. Since our objective is to count touches that entail a miss, we may therefore proceed by counting instead touches that entail a back end I/O. Now, for each interval containing a back end I/O to a given track, there is a corresponding (possibly empty) set of intervals where the track is touched but there is no back end I/O. The number of such intervals is given by the number of interval boundaries crossed by the front end (here again we make use of the fact that an interval cannot contain references from more than one distinct visit). Also, every back end lies in exactly two adjacent intervals, thus crossing one interval boundary.
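Both methods of trace analysis discussed above lend themselves to a compact sketch. The Python fragment below (all names and workload parameters are illustrative assumptions, not from the text) builds a synthetic heavy-tailed reference trace, counts misses exactly via the time-in-cache criterion, and then estimates the miss count by counting touches, i.e. distinct (interval, track) pairs with the time line cut into intervals of length τ:

```python
import random

def make_trace(theta=0.25, n_tracks=1500, horizon=1.0e6, tau0=1.0, seed=11):
    """Synthetic trace: per-track reference times whose gaps follow the
    heavy-tailed survival function P[U > x] = (x / tau0) ** -theta."""
    rng = random.Random(seed)
    events = []
    for track in range(n_tracks):
        t = rng.random() * horizon              # random first-reference time
        while t < horizon:
            events.append((t, track))
            t += tau0 * rng.random() ** (-1.0 / theta)   # Pareto-tailed gap
    events.sort()
    return events

def count_misses_time_in_cache(events, tau):
    """Exact miss count under the time-in-cache criterion: a request is a
    miss iff the same track was last referenced more than tau ago (or never)."""
    last_ref = {}
    misses = 0
    for t, track in events:
        if track not in last_ref or t - last_ref[track] > tau:
            misses += 1
        last_ref[track] = t
    return misses

def count_touches(events, tau):
    """Estimated miss count: distinct (interval, track) pairs, with the time
    line divided into intervals of length tau; needs no per-request history."""
    return len({(int(t // tau), track) for t, track in events})

trace = make_trace()
tau = 1000.0
misses = count_misses_time_in_cache(trace, tau)
touches = count_touches(trace, tau)
```

Consistent with the counting argument above, the touch count slightly exceeds the exact miss count, the excess coming from the interval boundaries crossed by front ends (roughly θ/(1 − θ) extra touches per visit when θ = 0.25).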