There is an implicit assumption underlying this strategy, however, that should be acknowledged when adopting it. This assumption is, simply, that the density of bugs is highest in the condensed space, owing to the complexity associated with the logical decisions and computations made at these multi-dimensional vertices. Focusing simulation resources on the boundary values concentrates test generator power on the condensed space to maximize the likelihood of exposing bugs. Of course, some useful fraction of verification resources should be allocated to exploring the full, uncondensed space as well. Experience and the accumulation of empirical data will guide this allocation of simulation bandwidth.

If certain functionality defined in the specifications seems to be buggier than the rest of the target, it may be advisable to focus test generation such that the VTGs derived from these specifications are more thoroughly (or perhaps even exhaustively) simulated. High coverage on known buggy areas can greatly reduce the risk of unexposed bugs in the related RTL.

Sometimes it is necessary to modify tests to avoid generating faulty behavior due to known bugs. This can usually be accomplished by adjusting weights (perhaps to zero) for sub-ranges of variables on which the behavior is dependent, thereby causing tests to be generated that exercise functionality unaffected by the known bugs. If this has been done, then before "closing out" the bug report, it is essential that any and all of these modifications be removed so that the functionality that previously exhibited faulty behavior is once again exercised.

4.13 Architecture for Verification Software (§ 5)

The foundation of sturdy software architecture consists of well-defined data structures, and that is where design of the verification software begins.

Fig. 4.8. Architecting the test environment

The architecture for the test environment (i.e., everything other than the instance or instances to be verified) will depend on a number of factors, the discussion of which is outside the scope of this book. These include:

• the programming language or languages used to implement the software,
• the inclusion of leveraged code or commercially available verification IP, and
• the computing environment (usually a multitude of networked platforms) in which the software executes.

Nevertheless, there are a number of ingredients that are common to all architectures, and these should be taken into full consideration during the planning stage. These ingredients include:

• Sockets for the instances of the verification target: Usually, the instance is provided by a separate engineering team. In the case of commercial IP, instances are generated based upon values of the variables of internal connectivity, in which case the generated instance or instances must fit neatly into the architecture of the verification software (think "plug-and-play"). The term "sockets" in this context is not to be confused with the software sockets used to establish connections within networked systems.
• Context models: The verification software must present to the instance a complete set of bus functional models, memory models, and models for other devices with which the hard prototype must eventually cooperate.
• Activation models: The provision of clocking and reset signals to the instance is nearly always dependent upon the frequency of operation of the device as well as upon what is present in the context for the instance.
• Test generator: The many, many sequences of stimuli are commonly referred to as tests for an instance. These stimuli are nearly always produced by some discrete test generator that provides not only the stimuli but also the expected results (responses to the stimuli) for comparison against actual results from the instance, so that faulty behavior can be detected. Such tests must, of course, be able to distinguish clearly between success and failure (obvious, yet many tests fail to make such clear distinctions).
• Deterministic tests (also known as directed tests): It is very convenient to have provisions in the verification software for the inclusion of deterministic tests or deterministic sequences of stimuli. Conditions, for example, are often established by executing some fixed preamble that has been modified for the specific values of condition variables. Within random sequences of instructions or commands there will often appear deterministic sequences such as procedure calls or returns or other language idioms. A deterministic sequence of bus transactions might be used to drive bus activity to saturation. Deterministic tests are also often used to check state immediately following activation, de-activation of some clock domain, or re-activation, whether full (via power-up) or partial (via soft reset). The ability to incorporate such deterministic sequences readily into the random tests is enormously valuable.
• Transactors: Stimuli are applied to the target by verification software components commonly called transactors. These are protocol-knowledgeable generators of signal values to be applied to the individual input ports (or bidirectional ports) of the target. Verification IP provides transactors for a wide variety of industry-standard protocols, alleviating the need to develop them from scratch.
• Monitors: All rules and guidelines relevant to the morph of the instance must be checked for violation. Monitors that observe external and internal responses must be included.
• Protocol checkers: These are a special type of monitor in that they check that a defined protocol for exchanging information, usually an industry-standard protocol such as PCI-X or USB, has been followed in every respect. Here also, verification IP provides monitors for a wide variety of such protocols.
• Expected results checkers: These are also a special type of monitor, one that checks that data resulting from communication via some protocol or transformation by some engine (virtual-to-real address translation, for example) are correct.
• Coverage models: The scenarios defined in association with the many rules and guidelines are implemented as coverage models as defined by Lachish. These coverage models check coverage of specific functional trajectories through subgraphs of the functional space corresponding to the scenarios of interest.

Collectively, the foregoing elements constitute a testbench (as defined in Bailey).

The architecture for the software for verification of the soft prototype should anticipate the following functionality:

• Generate all variability to be handled by the target.
• Check compliance with all rules and guidelines.
• Reproduce an exact sequence of stimuli (for a particular instance, context, activation, and operation) for the purposes of failure analysis and subsequent verification of changes made to eliminate a defect (one common approach is sketched after this list).
• Be able to execute on multiple networked computing platforms and aggregate all of the results.
• Collect coverage data (values of variables and selected execution histories) on a per-revision basis sufficient to produce the standard results.
• Transform a sequence of stimuli recorded on a hard prototype into an identical (or equivalent) sequence for the soft prototype to facilitate failure analysis.
• Derive manufacturing tests, typically from the morph for testability.
• Incorporate changes as the various documents (specifications) that define the target are revised.
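On the reproducibility requirement in particular, one common approach is to derive every random decision in a test from a single recorded seed, so that a failing run can be replayed exactly once a fix is in place. The following Python fragment is a minimal sketch of that idea only; the names TestConfig and run_test are invented for the illustration and do not refer to any particular verification environment.

import random
from dataclasses import dataclass

@dataclass
class TestConfig:
    # Hypothetical record of what is needed to reproduce a run exactly:
    # the RTL revision, the instance/context pair, and the master seed.
    rtl_revision: str
    instance_id: str
    context_id: str
    seed: int

def run_test(cfg: TestConfig) -> None:
    # Every pseudo-random decision draws from one generator created from the
    # recorded seed, so re-running with the same TestConfig replays the
    # identical sequence of stimuli.
    rng = random.Random(cfg.seed)
    stimuli = [rng.choice(["LOAD", "STORE", "BRANCH", "ALU", "SYSTEM"])
               for _ in range(10)]
    print(cfg.rtl_revision, cfg.seed, stimuli)

# First run: record the seed alongside the result. A later run with the same
# configuration reproduces the same stimuli for failure analysis.
cfg = TestConfig("rev_1.3", "inst_0", "ctx_0", seed=20240117)
run_test(cfg)
run_test(cfg)   # identical output: the run is reproducible

Whatever form the real bookkeeping takes, the essential point is that the seed (together with the instance, context, activation, and conditions) is stored with every result so that any test can be regenerated on demand.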
4.13.1 Flow for Soft Prototype (§ 5)

Before considering a typical flow that extends from the standardized approach to verification, it is necessary to consider the data structures needed to retain records of simulations. The following requirements on data collection should be taken into account so that standard measures and standard views can be generated from them:

1. Values used for (visited by) all standard variables, including all internal and external responses, must be recorded.
2. Suitable time-stamping of data is implied in the requirements for the standard measures and views described in the next chapter.
3. Aggregate coverage data on a per-revision basis across all tests in the regression suite and across all simulation platforms contributing regression results.

A typical verification process will follow these steps:

1. Generate or sense the instance and context.
2. Activate the system (the instance in its context).
3. Establish the conditions of operation (initialize). Deterministic sequences or templates are often useful here.
4. Excite using stimuli subject to error imposition. Apply all deterministic tests first so that subsequent convergence (see chapter 5) of pseudo-randomly generated tests can be evaluated. Then apply pseudo-random stimuli and all other excitations.

Fig. 4.9 illustrates a typical CRV process. In the following paragraphs we will elaborate on these several steps. Experienced verification engineers will find these steps familiar and may also recognize how past verification projects have employed alternative steps.

Fig. 4.9. Generalized CRV process

4.13.2 Random Value Assignment (§ 5)

A fundamental operation within any verification process using CRV is the random selection of a value for a variable from its defined range, subject to any constraints defined for that selection. Consider the following example.

<'
type major_opcode : [LOAD, STORE, BRANCH, ALU, SYSTEM] (bits: 3);  // Major opcode field in instructions.

struct instruction {
    major_op : major_opcode;        // One of several fields.
    keep soft major_op == select {
        40 : LOAD;
        30 : STORE;
        20 : BRANCH;
        20 : ALU;
        10 : SYSTEM;                // Total of weights is 120.
    };
    // Remaining fields of instruction not shown.
};
'>

In this brief segment of e code the major opcode field (a standard variable of stimulus composition) for instructions has been defined. Three bits are used to encode five possible values (enumerated as LOAD, STORE, BRANCH, ALU, and SYSTEM) for this variable. Built into the e language is the notion of generating values for these fields on a pseudo-random basis, subject to the constraints provided.

In this example, a relative weighting among the five possible values is provided. The most likely value to be assigned to this field is LOAD (40 out of 120) and the least likely value to be assigned is SYSTEM (10 out of 120). Apparently the test containing these constraints intends to exercise the instruction pipeline with a prevalence of memory accesses via LOAD and STORE instructions, with some BRANCH instructions mixed in to make things even more interesting.
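The same weighted selection can also be sketched in a general-purpose language. The following Python fragment is illustrative only; it is not part of the e testbench above, and the function pick_major_opcode is invented for this sketch. It simply makes the 40:30:20:20:10 weighting explicit.

import random

# Relative weights for the major opcode field (total of weights is 120),
# mirroring the soft constraint in the e example above.
OPCODE_WEIGHTS = {"LOAD": 40, "STORE": 30, "BRANCH": 20, "ALU": 20, "SYSTEM": 10}

def pick_major_opcode(rng: random.Random) -> str:
    # random.choices performs weighted selection; each call yields one opcode.
    opcodes = list(OPCODE_WEIGHTS)
    weights = list(OPCODE_WEIGHTS.values())
    return rng.choices(opcodes, weights=weights, k=1)[0]

rng = random.Random(1)
sample = [pick_major_opcode(rng) for _ in range(1000)]
print({op: sample.count(op) for op in OPCODE_WEIGHTS})

A long run of such selections visits LOAD roughly one-third of the time (40 out of 120) and SYSTEM only rarely, which is exactly the bias the constraint is intended to produce.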
In the figures that follow, the boxes labeled "assign random values" represent this fundamental CRV operation at work.

4.13.3 General CRV Process (§ 5)

Fig. 4.9 illustrates a generalized CRV process and shows how the standard variables are consulted at various steps in the process. Consider each step in some detail.

The process begins when a new version or release of the RTL is made available for regression testing. A given body of results relates only to a specific version of the RTL. Any change in the RTL, no matter how small or "minor," requires that a new set of results be obtained. It is the nature of bugs that little changes in RTL can produce faulty or undesirable behavior. So, whenever a new version is produced, it starts a new run of CRV.

The step labeled "generate instance/context pair" represents a number of different ways in which a system (the pair of an instance and a context) can be set up for simulation.

If the project is producing a single IC, then the instance is often provided by a separate engineering team responsible for the design of the RTL for the IC. In such a case, a copy of the RTL is integrated into the testbench for simulation with a suitable context.

If the IC is being designed for one and only one purpose, then the context is fixed (except when simulating the testability morph, which would most likely be simulated with a context representing the test fixture for the IC). If the IC is intended for use in a variety of systems, then the context must be generated (or chosen) randomly, using values assigned to variables of external connectivity. If the RTL is being designed as multi-instance IP, then both the instance and the context would be generated randomly by the verification team, again using values assigned to the variables of connectivity during the "assign random values" process.

Simulation begins with the "activate system" step. Power is applied, clocks begin to toggle, and reset signals are deasserted according to values of activation variables.

Following activation, the "initialize system" step causes the various registers in the target and in the context to receive the values produced by the "assign random values" process based on the internal and external variables of condition.

After the system is initialized, the action continues with the "run test" step. This step may be one of the more complex steps for the verification engineer to implement, particularly if tests are generated dynamically based upon results obtained so far. There are numerous advantages of dynamically generated tests over statically generated tests, and these are discussed later in this chapter.

Keep in mind that the flow just discussed is a highly simplified generalization of CRV; one way the basic loop might be organized is sketched below. Any given project is likely to have differences in how the CRV process is managed.
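The following Python sketch is purely schematic and is not drawn from any particular environment: the placeholder choices for connectivity, activation, and condition variables, and the random pass/fail outcome, merely stand in for the real work of generation and simulation.

import random

def crv_regression(rtl_revision: str, num_runs: int, master_seed: int) -> list:
    # One CRV campaign against a single RTL revision (hypothetical sketch).
    results = []
    rng = random.Random(master_seed)
    for run in range(num_runs):
        seed = rng.randrange(2**32)        # per-run seed, recorded for reproducibility
        run_rng = random.Random(seed)

        # "generate instance/context pair": choose values for the variables of
        # connectivity (reduced here to trivial placeholder choices).
        system = {"instance": run_rng.choice(["with_fpu", "without_fpu"]),
                  "context": run_rng.choice(["ctx_a", "ctx_b"])}

        # "assign random values", "activate system", "initialize system":
        # placeholders for activation and condition variables.
        activation = {"reset_cycles": run_rng.randint(2, 10)}
        conditions = {"cache_enabled": run_rng.choice([True, False])}

        # "run test": generate stimuli and compare actual against expected
        # responses; here the outcome is a placeholder.
        passed = run_rng.random() > 0.05

        results.append({"revision": rtl_revision, "seed": seed, "system": system,
                        "activation": activation, "conditions": conditions,
                        "passed": passed})
    return results

for record in crv_regression("rev_1.3", num_runs=3, master_seed=7):
    print(record)

Real projects deviate from this skeleton in any number of ways.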
For example, some systems may undergo numerous initializations, such as a multi-processor system waking one or more idle serf processors. As another example, consider a system incorporating USB. Cable connections and disconnections will result in multiple activations and initializations of sets of logic within the target.

One important detail not depicted in Fig. 4.9 requires consideration. Generating tests depends on values of variables of stimulus, of course, but often also on values previously assigned to variables of connectivity or condition. That is, the ranges (or even the existence) of some variables of stimulus are dependent on values of variables of connectivity or condition, and in certain cases on activation as well. Such dependencies are not shown in the figure but are vital in generating meaningful tests for the system.

For example, consider a processor instance (see Fig. 2.4) that lacks the optional FPU. Generating a test with lots of floating-point instructions for such an instance might not be worthwhile. On the other hand, some tests containing floating-point instructions might be needed if an instance lacking the FPU must properly handle such instructions, perhaps by a trap whose handler emulates floating-point instructions.

During the last three steps (activate, initialize, and run test) the responses of the target are observed and checked against the defined rules and guidelines. If a rule is violated, this constitutes a failure, and failures nearly always invalidate subsequent results, so there is little value in allowing the simulation to continue for much longer. On the other hand, it is often useful to have the results from the immediately ensuing clock cycles (a few tens or hundreds of cycles) to facilitate failure analysis and to determine the nature and extent of the faulty behavior, after which point the simulation would be halted.

The results of a test that passes should be added to the database of standard results for subsequent analysis. The results of a test that fails do not constitute coverage of the function points visited during the test and must not be added to the standard results. Instead, the entire execution history from activation up to the point of failure (and perhaps a bit beyond) is dumped to disk for analysis by the verification team. Failure analysis is discussed further later in this chapter.

The CRV process repeats, often on dozens or perhaps hundreds of simulation engines, until no more testing is needed. Deciding when no more CRV is needed is based on a variety of factors. If a majority of tests are failing, then it is likely that one or more major bugs are causing these failures and additional results may be of little value, in which case the simulation engines might be better used for other purposes (or projects). If, on the other hand, all tests are passing, then completion criteria will determine when no more CRV is needed. These criteria will be based on measures of coverage (discussed in chapter 6) and on risk analysis (discussed in chapter 7). Other factors that determine when to stop the CRV process include such things as: no more disk space available, a new RTL version becoming available, other demands on the simulation engines, and so forth. Often, during working hours, engineers use the simulation engines for interactive work, with evenings and weekends devoted to continuous CRV.
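The bookkeeping implied above can be seen in miniature in the following Python sketch, which uses an invented record format rather than any particular tool's database. Per-test coverage is merged into per-revision standard results, coverage from failing tests is discarded, and the failing runs are set aside (with their seeds) for failure analysis.

from collections import defaultdict

def aggregate_coverage(test_records: list) -> dict:
    # Each record is assumed to look like:
    #   {"revision": str, "passed": bool, "seed": int,
    #    "function_points": set of visited function points}
    standard_results = defaultdict(set)   # revision -> union of visited function points
    failures = []                         # failing runs kept only for failure analysis
    for rec in test_records:
        if rec["passed"]:
            standard_results[rec["revision"]] |= rec["function_points"]
        else:
            failures.append({"revision": rec["revision"], "seed": rec["seed"]})
    return {"coverage": dict(standard_results), "failures": failures}

runs = [
    {"revision": "rev_1.3", "passed": True,  "seed": 11, "function_points": {"fp1", "fp2"}},
    {"revision": "rev_1.3", "passed": True,  "seed": 12, "function_points": {"fp2", "fp3"}},
    {"revision": "rev_1.3", "passed": False, "seed": 13, "function_points": {"fp4"}},
]
print(aggregate_coverage(runs))
# Coverage for rev_1.3 is {fp1, fp2, fp3}; the failing run contributes none.

Completion criteria of the kind mentioned above would then be evaluated against this per-revision coverage.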
4.13.4 Activation and Initialization (§ 5)

There may be many paths to just a few post-activation states. Arrival times of deassertions of multiple reset signals with respect to each other and with respect to other excitation might all result in the same post-activation state. Hard resets, warm resets, and soft resets (provoked by software) may each have a single post-reset state. By saving such states so that subsequent testing can begin with initialization, some simulation cycles can be saved.

Establishing conditions in the target and in its context might be accomplished by a deterministic preamble common to all or most tests. After conditions have been established but before applying subsequent stimuli, save the state of simulation. Then other sequences of stimuli can be applied without having to repeat the preamble.

Fig. 4.10. De-activation and re-activation testing

Now consider Fig. 4.10. This figure depicts an automated process for testing all (or some fraction of) the possible ways a given instance/context pair can be activated. If a given instance/context pair is intended to be subjected to further CRV testing, the simulator state can be saved for later reloading in a CRV process.
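To make the save-and-restore idea concrete, here is a small illustrative Python sketch. The SimulationState class and its fields are invented for the example; a real flow would use the simulator's own save and restore facilities. The deterministic preamble is paid for once, a checkpoint is taken, and each random sequence then starts from a copy of that checkpoint.

import copy
import random

class SimulationState:
    # Hypothetical stand-in for a saved simulator state.
    def __init__(self):
        self.cycle = 0
        self.registers = {}

def run_preamble(state: SimulationState) -> None:
    # Deterministic preamble: activate and establish conditions once.
    state.registers["mode"] = "normal"
    state.cycle += 1000              # cost of the preamble in simulation cycles

def run_random_sequence(state: SimulationState, seed: int) -> SimulationState:
    rng = random.Random(seed)
    for _ in range(10):
        state.registers["last_op"] = rng.choice(["LOAD", "STORE", "BRANCH"])
        state.cycle += 1
    return state

base = SimulationState()
run_preamble(base)                   # pay the preamble cost exactly once
checkpoint = copy.deepcopy(base)     # "save the state of simulation"

# Each random sequence restarts from the checkpoint instead of repeating the
# preamble, saving those simulation cycles on every run.
for seed in (1, 2, 3):
    final = run_random_sequence(copy.deepcopy(checkpoint), seed)
    print(seed, final.cycle, final.registers["last_op"])

The same pattern applies to the saved post-activation states discussed at the start of this subsection: one saved state can seed many differently initialized runs.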