Adaptive Techniques for Dynamic Processor Optimization: Theory and Practice, by Alice Wang and Samuel Naffziger


Chapter 12 The Challenges of Testing Adaptive Designs
Eric Fetzer, Jason Stinson, Brian Cherkauer, Steve Poehlman

The Itanium 2 has a thermal management system very similar to its power measurement system. Using the same VCO (Figure 12.17) as the power measurement system, the thermal solution has the resolution to measure temperature with a precision << 1ºC.

Figure 12.17 Block diagram of thermal measurement. (© IEEE 2006)

However, calibrating the system requires the test environment to supply a known temperature with << 1ºC of error. The test environment has to test parts with varying power draw, in a short amount of time, and with limited thermal probes. To achieve the desired thermal control in a test environment, the part would need to be submerged in an oil bath, which is not possible while achieving the required test throughput. As a result, the accuracy of the thermal monitoring system is limited not by the capabilities of the processor but by the capabilities of the test environment.

As more and more adaptive techniques are used to stretch the capabilities of silicon, investments will need to be made in validation and test systems to fully utilize the new capabilities. Adaptive circuit techniques have the ability to reduce processor guard-bands, provided the test infrastructure can emulate the use conditions adequately.

12.4 Guard-Band Concerns of Adaptive Power Management

After one considers the correctness of adaptable systems, one must deliver the value that they offer in the product environment. One of the primary manufacturing considerations in designing an adaptive frequency/power control system is performance variability tolerance. A system based on any type of analog measurement will inherently be susceptible to part-to-part variation as well as environmental variation.
For example, the Montecito system, which makes an on-die analog measurement of the power being consumed, will be subject to part-to-part variation: no two parts will have exactly the same mix of leakage and dynamic power. This means that as voltage is raised or lowered, the power consumed will vary from part to part. The same is true of temperature variation, which affects the leakage power but not the dynamic power. Also, the ideal voltage versus frequency curve is subject to part-to-part variation, and attempting to optimize this on a per-part basis will introduce additional variability.

This variability can also be a function of more subtle effects such as the aging of components. Voltage regulator outputs may drift as they age, cooling systems may provide less airflow, and even the leakage of the processor itself changes with aging. Thus, it is exceedingly difficult to make a processor that behaves identically from run to run and part to part throughout its lifetime if it depends on an analog power measurement as the basis of its performance adaptability. Systems that depend on a temperature measurement to adapt performance are subject to variability similar to that of systems that measure power directly.

Reducing the number of possible operating conditions from a continuous curve to a series of a few discrete conditions greatly reduces the exposure to variability, as most variation will not be enough to move from one operating condition to the next.

However, if absolutely deterministic behavior is required of a design, another approach is to replace analog sensing with architectural event counters. Using architectural counters [19], specific architectural events can serve as a proxy for power dissipation by weighting each one according to its expected contribution to the power. Assuming the weighting is not done on a part-by-part basis, all processors will behave identically on identical code streams.
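The two ideas above, a weighted event-counter proxy for power and a small set of discrete operating conditions, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event names, weights, baseline power, power limit, and operating points are all hypothetical and are not taken from Montecito or from reference [19]. Scaling the whole estimate by f·V² is also a deliberate simplification that ignores how leakage actually scales.

```python
from typing import Dict, Tuple

# Hypothetical weights: estimated watts contributed per counted event in one
# sampling interval. Names and values are illustrative assumptions only.
EVENT_WEIGHTS_W: Dict[str, float] = {
    "fp_ops": 4.0e-6,
    "int_ops": 2.0e-6,
    "cache_misses": 6.0e-6,
}
BASELINE_POWER_W = 40.0   # assumed activity-independent power
POWER_LIMIT_W = 100.0     # assumed power envelope

# Assumed discrete operating points (frequency in GHz, voltage in V),
# fastest first. The first entry is the reference point for the estimate.
OPERATING_POINTS = [(1.8, 1.20), (1.6, 1.10), (1.4, 1.00)]


def estimate_power(event_counts: Dict[str, int]) -> float:
    """Proxy power at the reference point: baseline plus weighted event counts."""
    weighted = sum(EVENT_WEIGHTS_W[name] * count
                   for name, count in event_counts.items())
    return BASELINE_POWER_W + weighted


def select_operating_point(event_counts: Dict[str, int]) -> Tuple[float, float]:
    """Return the fastest discrete point whose scaled estimate fits the limit."""
    reference_power = estimate_power(event_counts)
    ref_freq, ref_volt = OPERATING_POINTS[0]
    for freq, volt in OPERATING_POINTS:
        # Crude scaling assumption: power proportional to f * V^2.
        scaled = reference_power * (freq / ref_freq) * (volt / ref_volt) ** 2
        if scaled <= POWER_LIMIT_W:
            return (freq, volt)
    return OPERATING_POINTS[-1]  # nothing fits: fall back to the slowest point


light = {"fp_ops": 10_000_000, "int_ops": 5_000_000, "cache_misses": 1_000_000}
heavy = {"fp_ops": 20_000_000, "int_ops": 5_000_000, "cache_misses": 1_000_000}
print(select_operating_point(light))
print(select_operating_point(heavy))
```

Because the weights are constants shared by every part, the same counter values always select the same operating point, which is the repeatability property described above. A per-part analog measurement would replace `estimate_power` with a sensor reading and lose that guarantee.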
The event-counter approach potentially gives up some benefits of the analog schemes, which squeeze more out of the design by using actual power or temperature measurements instead of a proxy. However, the event-based approach guarantees part-to-part and workload-to-workload repeatability, which also makes benchmarking and design debug much more straightforward.

From a manufacturability standpoint, both analog and architectural designs require similarly sized guard-bands (Adaptive Op. Point, Figure 12.18) to guarantee power stays within limits. Because of issues in testing and operation, this guard-band is larger than the guard-band required at a non-adaptive operating point. From an analog perspective, the design is dependent on the ability to make an accurate current measurement, often in the noisy environment of a running system.

[Figure 12.18: plot of Frequency (GHz) versus Voltage (V); not measured data, for illustrative purposes only. Labels: No Adapt Op. Point; Adaptive Op. Point; Worst Case Activity Code @ Pmax; Real App Activity Code @ Pmax; Large Guardband for Power measurement variability; Small Guardband for Test environment issues.]

Figure 12.18 Comparison of operating point with and without adaptation.

Architectural counters are not subject to analog noise or accuracy limits, but they must be placed and weighted carefully in order to provide the best mapping to power. One drawback of the architectural approach is that the worst-case power event needs to be well understood in order to be detected, and the system needs tuning based on silicon-collected data to be accurate. Another drawback is that it is very difficult to cover data-dependent power.
That is to say, you can map a certain architectural operation to a given power level, but you cannot easily modify that power level based on the operands or the specific data being manipulated, as this requires too deep a penetration of the architectural monitors.

Determinism and repeatability give architectural power estimates a significant advantage over the analog measurements. Unlike the situation where the analog measurement-based power management must be disabled for almost all production testing, an architectural power-based system will determine steps to maintain a constant power level. While voltage and frequency responses may not be properly emulated on the tester, the measurement system itself will behave in a predictable and testable manner.

12.5 Conclusion

From wafer test to final testing of parts in systems, determinism and repeatability are the cornerstones of bringing a processor design to market. Adaptive techniques used in modern processors, like those demonstrated in this chapter, make determinism and repeatability difficult to achieve. In some cases, the test infrastructure is not able to keep up with the processor’s ability to adapt, and as a result the guard-bands that adaptation is trying to eliminate will remain. Careful planning, along with novel test techniques like the ones described in this chapter, needs to be employed to realize the full potential of adaptive techniques. Additional significant breakthroughs will be required for higher levels of adaptation involving applications, OS, firmware, system components, and the processor to be fully production testable.

References

[1] Naffziger, S., et al., “The Implementation of a 2-core Multi-Threaded Itanium-Family Processor,” IEEE Journal of Solid-State Circuits, Vol. 41, No. 1, pp. 197–209, Jan. 2006.
[2] Thompson, S., et al., “A 90 nm logic technology featuring 50 nm strained silicon channel transistor, 7 layers of Cu interconnects, low k ILD, and 1 μm² SRAM cell,” Electron Devices Meeting, 2002. IEDM '02. Digest. International, pp. 61–64, Dec. 2002.
[3] Mahoney, P., Fetzer, E., et al., “Clock distribution on a dual-core, multi-threaded Itanium®-family processor,” Solid-State Circuits Conference, 2005. Digest of Technical Papers. ISSCC. 2005 IEEE International, Vol. 1, pp. 292–599, 6–10 Feb. 2005.
[4] Anderson, F.E., Wells, J.S., Berta, E.Z., “The core clock system on the next generation Itanium microprocessor,” Solid-State Circuits Conference, 2002. Digest of Technical Papers. ISSCC. 2002 IEEE International, Vol. 1, pp. 146–453, 3–7 Feb. 2002.
[5] Geannopoulos, G., Dai, X., “An adaptive digital deskewing circuit for clock distribution networks,” Solid-State Circuits Conference, 1998. Digest of Technical Papers. 45th ISSCC 1998 IEEE International, pp. 400–401, 5–7 Feb. 1998.
[6] Peterson, W.W., Weldon, E.J., Jr., Error-Correcting Codes, 2nd edition, MIT Press: Cambridge, Mass., 1972.
[7] Ziegler, J. F., Srinivasan, G. R., et al., “Terrestrial cosmic rays and soft errors,” IBM Journal of Research and Development, Vol. 40, No. 1, 1996.
[8] Ershov, M., Saxena, S., et al., “Dynamic recovery of negative bias temperature instability in p-type metal-oxide-semiconductor field-effect transistors,” Applied Physics Letters, Vol. 83, No. 8, pp. 1647–1649, August 25, 2003.
[9] Agostinelli, M., et al., “Erratic fluctuations of SRAM cache Vmin at the 90nm process technology node,” Electron Devices Meeting, 2005. IEDM Technical Digest. IEEE International, pp. 655–658, Dec. 5, 2005.
[10] McGowen, R., Poirier, C., et al., “Power and Temperature Control on a 90-nm Itanium Microprocessor,” IEEE Journal of Solid-State Circuits, Vol. 41, No. 1, pp. 229–237, Jan. 2006.
[11] Wayne Needham, Cheryl Prunty, Eng Hong Yeoh, “High Volume Microprocessor Test Escapes, An Analysis Of Defects Our Test Are Missing,” IEEE International Test Conference, pp. 25–34, 1998.
[12] Mike Mayberry, John Johnson, Navid Shahriari, Mike Trip, “Realizing the Benefits of Structural Test For Intel Microprocessors,” IEEE International Test Conference, pp. 456–463, 2002.
[13] Ismet Bayraktaroglu, Jim Hunt, Daniel Watkins, “Cache Resident Functional Microprocessor Testing: Avoiding High Speed IO Issues,” IEEE International Test Conference, 2006.
[14] Huston, R., “Microprocessor Functional Test Generation on the Sentry 600,” IEEE International Test Conference, 1974.
[15] Praveen Parvathala, Kailas Maneparambil, William Lindsay, “FRITS – A Microprocessor Functional BIST Method,” IEEE International Test Conference, pp. 590–598, 2002.
[16] Krantis, N., Xenoulis, G., Paschalis, A., Gizopoulos, D., Zorian, Y., “Application and Analysis of RT-Level Software-Based Self-testing for Embedded Processor Cores,” IEEE International Test C440.
[17] Wei-Cheng Lai, Kwang-Ting Cheng, “Instruction-Level DFT for Testing Processor and IP Cores in System-on-a-Chip,” Design Automation Conference, pp. 59–64, 2001.
[18] Tsang, J., et al., “Picosecond imaging circuit analysis,” IBM Journal of Research and Development, Vol. 44, No. 4, pp. 583–603, 2000.
[19] Leon, A. S., et al., “A Power-Efficient High-Throughput 32-Thread SPARC Processor,” IEEE J. Solid-State Circuits, Vol. 42, No. 1, pp. 7–16, Jan. 2007.
[20] Harry Hsiung, “Manufacturing and test Solutions with EFI,” Intel Developers Forum, 2003.
[21] Peter Maxwell, Ismed Hartanto, Lee Bentz, “Comparing Functional and Structural Tests,” IEEE International Test Conference, pp. 400–407, 2000.
[22] Satish M. Thatte, Jacob A. Abraham, “Test Generation For Microprocessors,” IEEE Transactions On Computers, Vol. 29, No. 6, pp. 429–441.
[23] Advanced Configuration and Power Interface Specification, rev. 3.0b, http://www.acpi.info/spec.htm, October 2006.

Posted: 21/06/2014, 22:20
