APPLICATION-SPECIFIC THERMAL MANAGEMENT OF COMPUTER SYSTEMS

RAMKUMAR JAYASEELAN
(B.E., Computer Science Engineering, College of Engineering Guindy, Anna University)

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF COMPUTER SCIENCE
NATIONAL UNIVERSITY OF SINGAPORE
2009

Contents

Abstract
Acknowledgements
List of Publications
List of Figures
List of Tables

1 Introduction
  1.1 Overview of the Thesis
  1.2 Thesis Contributions
  1.3 Thesis Outline

2 Related Work
    2.0.1 Heat Production & Removal in a Computing System
    2.0.2 Techniques to Reduce On-Chip Temperature
  2.1 Micro-architectural and System Level Techniques
    2.1.1 Comparison with Power Reduction Techniques
    2.1.2 Taxonomy of Micro-Architectural and System Level Thermal Management
    2.1.3 Static Techniques
    2.1.4 Runtime Techniques

3 Workload Characterization
  3.1 Overview
    3.1.1 Tool Chain for Workload Characterization
  3.2 Application Thermal Behavior
    3.2.1 Thermal Behavior of Individual Applications
    3.2.2 Impact of Processor Configuration on Thermal Profile
  3.3 Summary

4 Dynamic Thermal Management via Architecture Adaptation
  4.1 Related Work
    4.1.1 Architecture Level Thermal Management
    4.1.2 Software Based Thermal Management
    4.1.3 Architecture Adaptivity
  4.2 Overview of Thermal Management Framework
  4.3 Neural Network Classifier
    4.3.1 Classifier Architecture
    4.3.2 Training the Classifier
    4.3.3 Accuracy of the Classifier
  4.4 Performance Prediction Model
  4.5 Configuration Search Strategy
  4.6 Experimental Methodology and Results
    4.6.1 Processor Model and Workloads
    4.6.2 Dynamic Thermal Management Schemes
    4.6.3 Performance Comparison
    4.6.4 Temperature Profiles and Throughput
    4.6.5 Configuration Points for Adaptive DTM
    4.6.6 Impact of Inaccuracy in Classifier
    4.6.7 Impact of Individual Configuration Parameters
  4.7 Summary

5 Adaptive Thermal Management of Multi-Core Systems
  5.1 Related Work
    5.1.1 Multi-core Thermal Management
    5.1.2 Power Management in Multi-Core Systems
  5.2 Hybrid Thermal Management for Multi-Cores
    5.2.1 Hybrid Thermal Management Architecture
  5.3 Problem Formulation and Overview
    5.3.1 Problem Formulation
    5.3.2 Thermal Management Framework
  5.4 Local Configuration Search
    5.4.1 Overview
    5.4.2 Neural Network Classifier
    5.4.3 Configuration Search Algorithm
    5.4.4 Overhead of the Algorithm
  5.5 Global Configuration Routine
    5.5.1 Inputs
    5.5.2 Operating Frequency
    5.5.3 Core Coupling Factor
    5.5.4 Final Configurations
    5.5.5 Overheads and Scalability
  5.6 Experimental Settings and Results
    5.6.1 Simulation Flow
    5.6.2 Benchmarks
    5.6.3 DTM Techniques
    5.6.4 Throughput of Different DTM schemes
    5.6.5 Weighted Performance
    5.6.6 Configurations Selected
    5.6.7 Impact of Backup Technique
  5.7 Summary

6 Task Sequencing for Thermal Management
  6.1 Related Work
  6.2 Background
  6.3 Task Sequencing
    6.3.1 Thermal Profile of a Task Sequence
    6.3.2 Problem Formulation
    6.3.3 Task Sequencing Algorithm
  6.4 Sequencing & Voltage Scaling
    6.4.1 Problem Definition
    6.4.2 Algorithm
  6.5 Optimal Voltage Scaling
  6.6 Experimental Evaluation
    6.6.1 Task Sequencing Algorithm
    6.6.2 Voltage Scaling
    6.6.3 Sensitivity to Thermal Resistance
    6.6.4 Sensitivity to Slack Amount
  6.7 Summary

7 Temperature Aware Dynamic Scheduling
  7.1 Related Work
    7.1.1 General Purpose Scheduler Driven Thermal Management
    7.1.2 Thermal Management Approaches for Hard Real Time Systems
    7.1.3 Thermal Management for Media Applications
  7.2 Temperature Aware Scheduling Framework and Thermal Model
    7.2.1 Thermal Model
  7.3 Temperature Aware Scheduling
    7.3.1 Thermal Adjustment Phase
    7.3.2 Best Effort Scheduler
    7.3.3 CPU Share between a Hot and Cold Task
  7.4 Experimental Evaluation
  7.5 Summary

8 Conclusion
  8.1 Summary of the Thesis
  8.2 Future Work
Abstract

Rising power density and on-chip temperature are seen as major hurdles in sustaining processor performance improvement trends. Managing on-chip temperature has become an important aspect at all levels of computer system design. In this thesis, we focus on micro-architecture and system level techniques to manage temperature. Previously proposed approaches for thermal management have revolved around developing efficient heuristics and control policies which attempt to maximize the performance of the system while maintaining temperature constraints.

In contrast, we take a workload- and processor-configuration-centric approach to temperature management. We first characterize the thermal behavior of a processor under variations in workload as well as variations in the hardware configuration. Our characterization shows that the thermal behavior of the processor is highly sensitive to workload properties and hardware configuration. Armed with this characterization, we propose thermal management approaches that (i) alter the workload or (ii) alter the processor configuration to manage temperature.

In the first part of the thesis we present techniques that manage temperature by adapting the configuration of the processor at runtime. We model the thermal management problem as a hardware configuration search problem. Our framework samples the performance counters to determine the characteristics of the workload executing on the system and uses an online search algorithm to determine the most appropriate thermally safe configuration for that workload. This framework [...]

[...] control algorithms or heuristics to employ the mechanism. The second aspect of the design is especially challenging because any mechanism to reduce the temperature of the chip entails a performance loss. Thus, the main objective is to choose control algorithms or heuristics that adjust the severity of the response to the severity of thermal stress.
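
The idea of matching the severity of the response to the severity of thermal stress can be illustrated with a minimal sketch. It is not the control policy used in this thesis or in any cited work; the function name `throttle_level`, the trigger and emergency temperatures, and the minimum frequency scale are all hypothetical values chosen for illustration.

```python
def throttle_level(temp_c, trigger_c=80.0, emergency_c=85.0, min_scale=0.5):
    """Return a clock-frequency scale factor in [min_scale, 1.0].

    All thresholds are illustrative assumptions, not values from the thesis.
    Below the trigger temperature no performance is sacrificed; above it the
    slowdown grows linearly with the thermal stress until the emergency
    temperature, where the strongest response (min_scale) is applied.
    """
    if temp_c <= trigger_c:
        return 1.0
    if temp_c >= emergency_c:
        return min_scale
    # Fraction of the way from the trigger to the emergency temperature.
    stress = (temp_c - trigger_c) / (emergency_c - trigger_c)
    return 1.0 - stress * (1.0 - min_scale)


if __name__ == "__main__":
    for t in (70.0, 81.0, 82.5, 84.0, 90.0):
        print(f"{t:5.1f} C -> frequency scale {throttle_level(t):.2f}")
```

A mild overshoot costs only a small slowdown, while a reading at or beyond the emergency temperature triggers the full response. This is the trade-off between performance loss and thermal safety described above.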
In this thesis we take a complementary approach to thermal management. We observe that the thermal behavior of a processor system is highly dependent on the properties of the workload and the hardware configuration. Thus temperature can be controlled either by (i) altering the workload or (ii) altering the hardware configuration. Using this observation we design two classes of thermal management techniques.

The first class of techniques is entirely software driven and controls temperature by altering the workload executing on the processor in a multitasking system. In a multitasking system, the workload executing on the processor is controlled by the scheduler, which decides on the allocation of the CPU to different processes in the system. Our thermal management strategies operate in conjunction with the scheduler and construct a thermally efficient schedule.

The second class of techniques uses a combination of hardware and software (hybrid) to dynamically determine the best possible processor configuration for a given workload. Hardware based schemes such as dynamic voltage and frequency scaling (DVFS), fetch throttling, clock gating and others have previously been proposed for thermal management. However, these approaches have been viewed as competing alternatives, and thermal management solutions have focused on comparing these techniques and determining the most appropriate mechanism. We observe that when a combination of such techniques is employed together, they work synergistically and provide a highly efficient thermal management solution. The key challenge in such a scenario is to manage the explosion in the configuration space. Our techniques sift dynamically through a large configuration space and determine the best thermally efficient setting.

Our key results can be summarized as follows:

• The temperature of a set of tasks executing on a processor is highly dependent on the order of execution of the tasks. We design a thermal management strategy that determines the thermally optimal ordering for a set of tasks. On average, our technique can lower the temperature of the processor by 4.09°C without any impact on performance.

• We also observe that the temperature of a multitasking system is sensitive to the shares of execution time between hot and cold tasks in the system. Based on this observation, we design a thermal management strategy that manages temperature by controlling the relative shares of execution time between hot and cold tasks. Our thermal management strategy provides better performance than more complicated schemes while maintaining a host of scheduling requirements such as fairness and real time constraints.

• We observe that configuring multiple hardware parameters simultaneously has a large impact on performance and temperature. We design a software based thermal management solution that employs multiple thermal control mechanisms simultaneously. Our framework results in a 39% reduction in overhead in comparison to the best known existing technique.

• We extend our thermal management strategy to multi-core systems and design a software based thermal management scheme for multi-core systems. Our strategy is simpler to implement and results in significantly better throughput than the best performing thermal management scheme for multi-core systems.

8.2 Future Work

In this thesis we have established the need for a workload driven approach for thermal management of high performance computer systems. Exploiting flexibility either in terms of altering the workload (scheduler driven approaches) or hardware can help manage temperature effectively without compromising on performance.

One avenue for future work is a detailed exploration of the boundaries between scheduling driven and hardware thermal management solutions.
Hardware DTM mechanisms can provide an immediate response to a thermal emergency, while scheduling driven mechanisms kick in more slowly but help shape the long term thermal profile. All previously explored solutions, including ours, have been either entirely hardware based or entirely scheduling based. Examining the boundary and synergy between hardware and scheduling based approaches would help us design better thermal management strategies.

A natural extension of our solutions would be thermal management of heterogeneous multi-core systems where not all cores are equal. A typical example of such a system is the Intel Core i7 [5] processor, which is an SMT processor with four cores and two SMT threads per core. In such a system, the software sees eight logical cores, but they are not all identical, because two SMT contexts share the same physical core. From the perspective of thermal management there is asymmetry, since two logical cores which share the same physical core would show more temperature dependence than two distinct physical cores. Extension of our framework to such a setting would be an interesting avenue for future work.

In this thesis we have focused on techniques that optimize the performance of the computer system under thermal constraints. Of late, however, power dissipation and energy consumption have become important design parameters. The techniques proposed in this thesis can be extended to handle multi-objective optimization, such as optimizing for a combination of energy consumption, performance and temperature.

Bibliography

[1] AMD Phenom II X4 and AMD Phenom II X3 Processors. http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_15331_15917,00.html.
[2] ARM Cortex A8 Processor. http://www.arm.com/products/CPUs/ARM_Cortex-A8.html.
[3] BSIM Device Models. http://www-device.eecs.berkeley.edu/~bsim3/.
[4] IBM cools 3-D chips with H2O. http://www.zurich.ibm.com/news/08/3D_cooling.html.
[5] Intel Core i7 Processor.
http://www.intel.com/products/processor/corei7/index.htm.
[6] Intel Core2 Duo Processor. http://www.intel.com/products/processor/core2duo/index.htm.
[7] Intel Pentium Processor. http://www.intel.com/products/processor/pentium4/specs.htm.
[8] Matlab Neural Network Toolbox. www.mathworks.com/access/helpdesk/help/pdf_doc/nnet/nnet.pdf.
[9] Semiconductor research corporation packing thrust strategic needs. http://www.src.org/fr/S200504packaging_needs.pdf, 2005.
[10] SPEC CPU2000 Benchmark Suite. http://www.spec.org/cpu2000/.
[11] TSMC Device Scaling Trends. http://www.tsmc.com/english/b_technology/b01_platform/b0101_advanced.htm.
[12] David H. Albonesi, Rajeev Balasubramonian, Steven G. Dropsho, Sandhya Dwarkadas, Eby G. Friedman, Michael C. Huang, Volkan Kursun, Grigorios Magklis, Michael L. Scott, Greg Semeraro, Pradip Bose, Alper Buyuktosunoglu, Peter W. Cook, and Stanley E. Schuster. Dynamically tuning processor resources with adaptive processing. Computer, 36(12), 2003.
[13] R. Iris Bahar and Srilatha Manne. Power and energy reduction via pipeline balancing. In ISCA ’01: Proceedings of the 28th annual international symposium on Computer architecture, 2001.
[14] Rajeev Balasubramonian, David Albonesi, Alper Buyuktosunoglu, and Sandhya Dwarkadas. Memory hierarchy reconfiguration for energy and performance in general-purpose processor architectures. In MICRO 33: Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture, 2000.
[15] Anirban Basu, Sheng-Chih Lin, Vineet Wason, Amit Mehrotra, and Kaustav Banerjee. Simultaneous optimization of supply and threshold voltages for low-power and high-performance circuits in the leakage dominant era. In DAC ’04: Proceedings of the 41st annual conference on Design automation, 2004.
[16] Reinaldo Bergamaschi, Indira Nair, Gero Dittmann, Hiren Patel, Geert Janssen, Nagu Dhanwada, Alper Buyuktosunoglu, Emrah Acar, Gi-Joon Nam, Dorothy Kucar, Pradip Bose, John Darringer, and Guoling Han.
Performance modeling for early analysis of multi-core systems. In CODES+ISSS ’07: Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis, pages 209–214, New York, NY, USA, 2007. ACM.
[17] Ramazan Bitirgen, Engin Ipek, and Jose F. Martinez. Coordinated management of multiple interacting resources in chip multiprocessors: A machine learning approach. In MICRO 41: Proceedings of the 41st annual ACM/IEEE international symposium on Microarchitecture, 2008.
[18] Manjit Borah, Robert Michael Owens, and Mary Jane Irwin. Transistor sizing for minimizing power consumption of cmos circuits under delay constraint. In ISLPED ’95: Proceedings of the 1995 international symposium on Low power design, 1995.
[19] Shekhar Borkar. Design challenges of technology scaling. IEEE Micro, 19(4), 1999.
[20] Shekhar Borkar. Designing reliable systems from unreliable components: The challenges of transistor variability and degradation. IEEE Micro, 25(6), 2005.
[21] David Brooks and Margaret Martonosi. Dynamic thermal management for high-performance microprocessors. In HPCA ’01: Proceedings of the 7th International Symposium on High-Performance Computer Architecture, 2001.
[22] David Brooks, Vivek Tiwari, and Margaret Martonosi. Wattch: a framework for architectural-level power analysis and optimizations. SIGARCH Comput. Archit. News, 28(2), 2000.
[23] Doug Burger and Todd M. Austin. The simplescalar tool set, version 2.0. SIGARCH Comput. Archit. News, 25(3), 1997.
[24] Alper Buyuktosunoglu, David Albonesi, Stanley Schuster, David Brooks, Pradip Bose, and Peter Cook. A circuit level implementation of an adaptive issue queue for power-aware microprocessors. In GLSVLSI ’01: Proceedings of the 11th Great Lakes symposium on VLSI, 2001.
[25] Thidapat Chantem, Robert P. Dick, and X. Sharon Hu. Temperature-aware scheduling and assignment for hard real-time applications on mpsocs.
In DATE ’08: Proceedings of the conference on Design, automation and test in Europe, 2008.
[26] Pedro Chaparro, Jose Gonzalez, and Antonio Gonzalez. Thermal-aware clustered microarchitectures. In ICCD ’04: Proceedings of the IEEE International Conference on Computer Design, 2004.
[27] Yen-Kuang Chen and S. Y. Kung. Trend and challenge on system-on-a-chip designs. J. Signal Process. Syst., 53(1-2), 2008.
[28] Aviad Cohen, Lev Finkelstein, Avi Mendelson, Ronny Ronen, and Dmitry Rudoy. On estimating optimal performance of cpu dynamic thermal management. IEEE Comput. Archit. Lett., 2(1), 2003.
[29] Ayse Kivilcim Coskun, Tajana Simunic Rosing, and Kenny C. Gross. Proactive temperature management in mpsocs. In ISLPED ’08: Proceeding of the thirteenth international symposium on Low power electronics and design, 2008.
[30] Ayse Kivilcim Coskun, Tajana Simunic Rosing, and Kenny C. Gross. Temperature management in multiprocessor socs using online learning. In DAC ’08: Proceedings of the 45th annual conference on Design automation, 2008.
[31] Ayse Kivilcim Coskun, Tajana Simunic Rosing, and Keith Whisnant. Temperature aware task scheduling in mpsocs. In DATE ’07: Proceedings of the conference on Design, automation and test in Europe, 2007.
[32] Erik P. DeBenedictis. Will moore’s law be sufficient? In SC ’04: Proceedings of the 2004 ACM/IEEE conference on Supercomputing, 2004.
[33] Ashutosh S. Dhodapkar and James E. Smith. Managing multi-configuration hardware via dynamic working set analysis. SIGARCH Comput. Archit. News, 30(2), 2002.
[34] Ashutosh S. Dhodapkar and James E. Smith. Managing multi-configuration hardware via dynamic working set analysis. SIGARCH Comput. Archit. News, 30(2), 2002.
[35] James Donald and Margaret Martonosi. Techniques for multicore thermal management: Classification and new exploration. In ISCA ’06: Proceedings of the 33rd annual international symposium on Computer Architecture, 2006.
[36] Mohamed Gomaa, Michael D. Powell, and T. N.
Vijaykumar. Heat-and-run: leveraging smt and cmp to manage power density through the operating system. In ASPLOS-XI: Proceedings of the 11th international conference on Architectural support for programming languages and operating systems, 2004.
[37] Pawan Goyal, Xingang Guo, and Harrick M. Vin. A hierarchical cpu scheduler for multimedia operating systems. 2001.
[38] S. Gunther, F. Binns, D. M. Carmean, and J. C. Hall. Managing the impact of increasing microprocessor power consumption. In Intel Technology Journal, 2001.
[39] M. R. Guthaus, J. S. Ringenberg, D. Ernst, T. M. Austin, T. Mudge, and R. B. Brown. Mibench: A free, commercially representative embedded benchmark suite. In WWC ’01: Proceedings of the Workload Characterization, 2001. WWC-4. 2001 IEEE International Workshop, 2001.
[40] Heather Hanson, Stephen W. Keckler, Soraya Ghiasi, Karthick Rajamani, Freeman Rawson, and Juan Rubio. Thermal response to dvfs: analysis with an intel pentium m. In ISLPED ’07: Proceedings of the 2007 international symposium on Low power electronics and design, 2007.
[41] Jahangir Hasan, Ankit Jalote, T. N. Vijaykumar, and Carla E. Brodley. Heat stroke: Power-density-based denial of service in smt. In HPCA ’05: Proceedings of the 11th International Symposium on High-Performance Computer Architecture, 2005.
[42] John L. Hennessy and David A. Patterson. Computer Architecture; A Quantitative Approach. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1992.
[43] Seongmoo Heo, Kenneth Barr, and Krste Asanović. Reducing power density through activity migration. In ISLPED ’03: Proceedings of the 2003 international symposium on Low power electronics and design, 2003.
[44] Michael Huang, Jose Renau, Seung-Moon Yoo, and Josep Torrellas. A framework for dynamic energy efficiency and temperature management. In MICRO 33: Proceedings of the 33rd annual ACM/IEEE international symposium on Microarchitecture, 2000.
[45] Michael C. Huang, Jose Renau, and Josep Torrellas.
Positional adaptation of processors: application to energy reduction. SIGARCH Comput. Archit. News, 31(2), 2003.
[46] Wei Huang, Mircea R. Stan, Karthik Sankaranarayanan, Robert J. Ribando, and Kevin Skadron. Many-core design from a thermal perspective. In DAC ’08: Proceedings of the 45th annual conference on Design automation, 2008.
[47] W-L. Hung, Y. Xie, N. Vijaykrishnan, M. Kandemir, and M. J. Irwin. Thermal-aware task allocation and scheduling for embedded systems. In DATE ’05: Proceedings of the conference on Design, Automation and Test in Europe, 2005.
[48] Canturk Isci, Alper Buyuktosunoglu, Chen-Yong Cher, Pradip Bose, and Margaret Martonosi. An analysis of efficient multi-core global power management policies: Maximizing performance for a given power budget. In MICRO 39: Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture, 2006.
[49] Tejas S. Karkhanis and James E. Smith. A first-order superscalar processor model. SIGARCH Comput. Archit. News, 32(2).
[50] Tejas S. Karkhanis and James E. Smith. Automated design of application specific superscalar processors: an analytical approach. In ISCA ’07: Proceedings of the 34th annual international symposium on Computer architecture, 2007.
[51] Karthik Sankaranarayanan, Sivakumar Velusamy, Mircea Stan, and Kevin Skadron. A case for thermal-aware floorplanning at the microarchitectural level. In Journal of Instruction-Level Parallelism, 2005.
[52] Amit Kumar, Li Shang, Li-Shiuan Peh, and Niraj K. Jha. Hybdtm: a coordinated hardware-software approach for dynamic thermal management. In DAC ’06: Proceedings of the 43rd annual conference on Design automation, 2006.
[53] Chunho Lee, Miodrag Potkonjak, and William H. Mangione-Smith. Mediabench: a tool for evaluating and synthesizing multimedia and communications systems. In MICRO 30: Proceedings of the 30th annual ACM/IEEE international symposium on Microarchitecture, 1997.
[54] Wonbok Lee, Kimish Patel, and Massoud Pedram.
Dynamic thermal management for mpeg-2 decoding. In ISLPED ’06: Proceedings of the 2006 international symposium on Low power electronics and design, 2006.
[55] M. Levy. Keynote talk #1- eembc and the purposes of embedded processor benchmarking. In ISPASS ’05: Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005.
[56] Jian Li and Jose F. Martinez. Dynamic power-performance adaptation of parallel computation on chip multiprocessors. In HPCA ’06: Proceedings of the 12th International Symposium on High-Performance Computer Architecture, 2006.
[57] Yingmin Li, Dharmesh Parikh, Yan Zhang, Karthik Sankaranarayanan, Mircea Stan, and Kevin Skadron. State-preserving vs. non-state-preserving leakage control in caches. In DATE ’04: Proceedings of the conference on Design, automation and test in Europe, 2004.
[58] Yongpan Liu, Robert P. Dick, Li Shang, and Huazhong Yang. Accurate temperature-dependent integrated circuit leakage power estimation is easy. In DATE ’07: Proceedings of the conference on Design, automation and test in Europe, 2007.
[59] Yongpan Liu, Huazhong Yang, Robert P. Dick, Hui Wang, and Li Shang. Thermal vs energy optimization for dvfs-enabled processors in embedded systems. In ISQED ’07: Proceedings of the 8th International Symposium on Quality Electronic Design, 2007.
[60] Zhijian Lu, John Lach, Mircea R. Stan, and Kevin Skadron. Improved thermal management with reliability banking. IEEE Micro, 25(6), 2005.
[61] Ke Meng, Russ Joseph, Robert P. Dick, and Li Shang. Multi-optimization power management for chip multiprocessors. In PACT ’08: Proceedings of the 17th international conference on Parallel architectures and compilation techniques, 2008.
[62] Andreas Merkel and Frank Bellosa. Balancing power consumption in multiprocessor systems. In EuroSys ’06: Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006, 2006.
[63] Pierre Michaud, André Seznec, Damien Fetis, Yiannakis Sazeides, and Theofanis Constantinou. A study of thread migration in temperature-constrained multicores. ACM Trans. Archit. Code Optim., 4(2), 2007.
[64] Kresimir Mihic, Tajana Simunic, and Giovanni De Micheli. Reliability and power management of integrated systems. In DSD ’04: Proceedings of the Digital System Design, EUROMICRO Systems, 2004.
[65] Matteo Monchiero, Ramon Canal, and Antonio González. Design space exploration for multicore architectures: a power/performance/thermal view. In ICS ’06: Proceedings of the 20th annual international conference on Supercomputing, 2006.
[66] Rajarshi Mukherjee and Seda Ogrenci Memik. Physical aware frequency selection for dynamic thermal management in multi-core systems. In ICCAD ’06: Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design, 2006.
[67] Srinivasan Murali, Almir Mutapcic, David Atienza, Rajesh Gupta, Stephen Boyd, Luca Benini, and Giovanni De Micheli. Temperature control of high-performance multi-core platforms using convex optimization. In DATE ’08: Proceedings of the conference on Design, automation and test in Europe, 2008.
[68] Srinivasan Murali, Almir Mutapcic, David Atienza, Rajesh Gupta, Stephen Boyd, Luca Benini, and Giovanni De Micheli. Temperature control of high-performance multi-core platforms using convex optimization. In DATE ’08: Proceedings of the conference on Design, automation and test in Europe, 2008.
[69] Srinivasan Murali, Almir Mutapcic, David Atienza, Rajesh Gupta, Stephen Boyd, and Giovanni De Micheli. Temperature-aware processor frequency assignment for mpsocs using convex optimization. In CODES+ISSS ’07: Proceedings of the 5th IEEE/ACM international conference on Hardware/software codesign and system synthesis, 2007.
[70] Madhu Mutyam, Feihui Li, Vijaykrishnan Narayanan, Mahmut Kandemir, and Mary Jane Irwin. Compiler-directed thermal management for vliw functional units.
SIGPLAN Not., 41(7), 2006.
[71] Sri Hari Krishna Narayanan, Guilin Chen, Mahmut Kandemir, and Yuan Xie. Temperature-sensitive loop parallelization for chip multiprocessors. In ICCD ’05: Proceedings of the 2005 International Conference on Computer Design, 2005.
[72] Sri Hari Krishna Narayanan, Mahmut Kandemir, and Ozcan Ozturk. Compiler-directed power density reduction in noc-based multi-core designs. In ISQED ’06: Proceedings of the 7th International Symposium on Quality Electronic Design, 2006.
[73] Jason Nieh and Monica S. Lam. A smart scheduler for multimedia applications. ACM Trans. Comput. Syst., 21(2), 2003.
[74] Vidyasagar Nookala, David J. Lilja, and Sachin S. Sapatnekar. Temperature-aware floorplanning of microarchitecture blocks with ipc-power dependence modeling and transient analysis. In ISLPED ’06: Proceedings of the 2006 international symposium on Low power electronics and design, 2006.
[75] David A. Patterson and John L. Hennessy. Computer organization & design: the hardware/software interface. 1993.
[76] Erez Perelman, Greg Hamerly, and Brad Calder. Picking statistically valid and early simulation points. In PACT ’03: Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques, 2003.
[77] Fred J. Pollack. New microarchitecture challenges in the coming generations of cmos process technologies (keynote address)(abstract only). In MICRO 32: Proceedings of the 32nd annual ACM/IEEE international symposium on Microarchitecture, 1999.
[78] Dmitry Ponomarev, Gurhan Kucuk, and Kanad Ghose. Reducing power requirements of instruction scheduling through dynamic allocation of multiple datapath resources. In MICRO 34: Proceedings of the 34th annual ACM/IEEE international symposium on Microarchitecture, 2001.
[79] Michael D. Powell, Ethan Schuchman, and T. N. Vijaykumar. Balancing resource utilization to mitigate power density in processor pipelines.
In MICRO 38: Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture, 2005.
[80] R. Viswanath, V. Wakharkar, A. Watwe, and V. Lebonheur. Thermal performance challenges from silicon to systems. In Intel Technology Journal, Q3 2000.
[81] Ravishankar Rao and Sarma Vrudhula. Performance optimal processor throttling under thermal constraints. In CASES ’07: Proceedings of the 2007 international conference on Compilers, architecture, and synthesis for embedded systems, 2007.
[82] John Regehr and John A. Stankovic. Hls: A framework for composing soft real-time schedulers. In RTSS ’01: Proceedings of the 22nd IEEE Real-Time Systems Symposium (RTSS’01), 2001.
[83] H. Rosten and R. Viswanath. Thermal modeling of the pentium processor package. Proc. 44th Electron. Comp. Technol. Conf, 1994.
[84] Hector Sanchez, Belli Kuttanna, Tim Olson, Mike Alexander, Gian Gerosa, Ross Philip, and Jose Alvarez. Thermal management system for high performance PowerPC(TM) microprocessors. In COMPCON ’97: Proceedings of the 42nd IEEE International Computer Conference, 1997.
[85] Ruchira Sasanka, Christopher J. Hughes, and Sarita V. Adve. Joint local and global hardware adaptations for energy. In ASPLOS-X: Proceedings of the 10th international conference on Architectural support for programming languages and operating systems, 2002.
[86] N. N. Schraudolph. A Fast, Compact Approximation of the Exponential Function. Technical Report IDSIA-07-98.
[87] Kevin Skadron. Hybrid architectural dynamic thermal management. In DATE ’04: Proceedings of the conference on Design, automation and test in Europe, 2004.
[88] Kevin Skadron, Tarek Abdelzaher, and Mircea R. Stan. Control-theoretic techniques and thermal-rc modeling for accurate and localized dynamic thermal management. In HPCA ’02: Proceedings of the 8th International Symposium on High-Performance Computer Architecture, 2002.
[89] Kevin Skadron, Mircea R.
Stan, Karthik Sankaranarayanan, Wei Huang, Sivakumar Velusamy, and David Tarjan. Temperature-aware microarchitecture: Modeling and implementation. ACM Trans. Archit. Code Optim., 1(1), 2004. [90] Allan Snavely and Dean M. Tullsen. Symbiotic jobscheduling for a simultaneous multithreaded processor. In ASPLOS-IX: Proceedings of the ninth international conference on Architectural support for programming languages and operating systems, 2000. [91] Jayanth Srinivasan and Sarita V. Adve. Predictive dynamic thermal management for multimedia applications. In ICS ’03: Proceedings of the 17th annual international conference on Supercomputing, 2003. [92] Gilbert Strang. Introduction to Linear Algebra. Wellesley Cambridge Press, 1993. 169 [93] Haihua Su, Frank Liu, Anirudh Devgan, Emrah Acar, and Sani Nassif. Full chip leakage estimation considering power supply and temperature variations. In ISLPED ’03: Proceedings of the 2003 international symposium on Low power electronics and design, 2003. [94] Scott Taylor, Michael Quinn, Darren Brown, Nathan Dohm, Scot Hildebrandt, James Huggins, and Carl Ramey. Functional verification of a multiple-issue, outof-order, superscalar alpha processor—the dec alpha 21264 microprocessor. In DAC ’98: Proceedings of the 35th annual conference on Design automation, 1998. [95] Lothar Thiele and Reinhard Wilhelm. Design for timing predictability. Real-Time Syst., 28(2-3), 2004. [96] Shengquan Wang and Riccardo Bettati. Delay analysis in temperature-constrained hard real-time systems with general task arrivals. In RTSS ’06: Proceedings of the 27th IEEE International Real-Time Systems Symposium, 2006. [97] Shengquan Wang and Riccardo Bettati. Reactive speed control in temperatureconstrained real-time systems. Real-Time Syst., 39(1-3), 2008. [98] Jonathan A. Winter and David H. Albonesi. Addressing thermal nonuniformity in smt workloads. ACM Trans. Archit. Code Optim., 5(1), 2008. [99] Raj Yavatkar and Murli Tirumala. 
Platform wide innovations to overcome thermal challenges. Microelectron. J., 39(7), 2008. [100] Inchoon Yeo, Chih Chun Liu, and Eun Jung Kim. Predictive dynamic thermal management for multicore systems. In DAC ’08: Proceedings of the 45th annual conference on Design automation, 2008. [101] D. Zhigang Hu Skadron K. Yingmin Li Lee, B. Brooks. Cmp design space exploration subject to physical constraints. In High-Performance Computer Architecture, 2006. The Twelfth International Symposium on High-Performance Computer Architecture, 2006. 170 [102] Wanghong Yuan and Klara Nahrstedt. Energy-efficient soft real-time cpu scheduling for mobile multimedia systems. In SOSP ’03: Proceedings of the nineteenth ACM symposium on Operating systems principles, 2003. [103] Inchoon Yeo Heung Ki Lee Eun Jung Kim Ki Hwan Yum. Effective dynamic thermal management for mpeg-4 decoding. In 25th International Conference on Computer Design, ICCD., 2007. [104] Sushu Zhang and Karam S. Chatha. Approximation algorithm for the temperatureaware scheduling problem. In ICCAD ’07: Proceedings of the 2007 IEEE/ACM international conference on Computer-aided design, 2007. [105] Xiangrong Zhou, Chenjie Yu, and Peter Petrov. Compiler-driven register re- assignment for register file power-density and temperature reduction. In DAC ’08: Proceedings of the 45th annual conference on Design automation, 2008. [106] Changyun Zhu, Zhenyu Gu, Li Shang, R.P. Dick, and R. Joseph. Three- dimensional chip-multiprocessor run-time thermal management. Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, 27(8):1479– 1492, Aug. 2008. [...]... 
... this thesis exploits heterogeneity in the thermal characteristics of applications for thermal management in multi-tasking systems. We observe that, given a set of applications that execute concurrently in a multi-tasking system, the resulting thermal profile is highly dependent on the order of execution of the different tasks in the system and the relative share of CPU time provided to the different (hot...

... approaches for thermal management. We observe that the thermal behavior of a micro-processor is highly sensitive to both the application executing on the processor and the processor configuration. We characterize the sensitivity of thermal behavior to application characteristics and hardware configuration, and exploit these characteristics to design new thermal management solutions. Our thermal management...

... employ a combination of hardware and software for thermal management. The hardware provides multiple thermal management knobs that are controlled in software. Unlike previously proposed solutions that employ hardware feedback controllers, we rephrase the thermal management problem as a hardware configuration search problem. We design a highly efficient software-based dynamic thermal management framework...

... system-level thermal management schemes manage to keep the temperature below the threshold while satisfying a set of system-level requirements such as real-time constraints, fairness and performance.

1.2 Thesis Contributions

With modern computer systems being severely constrained by rising on-chip temperature, thermal management solutions have become a central aspect of computer system design. The goal of any thermal...
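The first excerpt above observes that the thermal profile depends on the order in which tasks execute. A minimal sketch of that effect follows, assuming a single lumped thermal RC node. The thermal model used in the thesis is not shown in these excerpts, so the model, the constants (`T_AMBIENT`, `R_TH`, `C_TH`), the task powers and the function names are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical lumped-RC parameters (assumptions, not taken from the thesis).
T_AMBIENT = 45.0  # ambient temperature in deg C
R_TH = 0.8        # thermal resistance in K/W
C_TH = 10.0       # thermal capacitance in J/K


def step(temp, power, dt):
    """Temperature after running at constant `power` (W) for `dt` seconds."""
    steady = T_AMBIENT + R_TH * power  # temperature reached if run forever
    return steady + (temp - steady) * math.exp(-dt / (R_TH * C_TH))


def peak_temperature(schedule, start=T_AMBIENT):
    """Peak temperature over a schedule of (power, duration) time slices.

    Within a constant-power slice the temperature moves monotonically
    towards its steady state, so checking slice boundaries is sufficient.
    """
    temp = peak = start
    for power, duration in schedule:
        temp = step(temp, power, duration)
        peak = max(peak, temp)
    return peak


HOT = (60.0, 4.0)   # hypothetical hot task: 60 W for a 4 s slice
COLD = (5.0, 4.0)   # hypothetical cold task: 5 W for a 4 s slice

grouped = [HOT, HOT, COLD, COLD]      # hot slices back to back
interleaved = [HOT, COLD, HOT, COLD]  # a cold slice between the hot ones

print("grouped peak:    ", round(peak_temperature(grouped), 2))
print("interleaved peak:", round(peak_temperature(interleaved), 2))
```

Both schedules do the same total work, but in this toy model the interleaved order has the lower peak because the cold slice lets the node cool before the next hot slice starts. Changing the slice durations in the same sketch would model the relative share of CPU time that the excerpt also mentions.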
... including thermal safety as one of the key requirements. We discuss each one of these classes of techniques in detail.

Software Based Static Techniques

Static thermal management approaches fit naturally in the embedded space as the workload to be executed on the system is known in advance. Embedded systems are often designed under strict constraints on area, power, performance and cost. Many of these systems...

... hotspots, thermal management solutions have the advantage of being able to monitor and control the temperature of the hot-spots. To summarize, thermal management techniques are essential to (i) ensure that the temperature of the on-chip hot-spots is under control and (ii) boost system performance under a given TDP package by supplementing heat removal techniques. A computer system has a number of layers of...

... based techniques and software based techniques.

[Figure 2.1: Overview of previous approaches for thermal management. Node labels: Thermal Management Techniques; Static/Design Time Techniques; Dynamic/Runtime Techniques; Hardware Based Techniques; Software Based Techniques; Hybrid Techniques.]

Runtime techniques comprise hardware based techniques, software based techniques...

... unexplored aspects of temperature/system performance tradeoffs. We present two software driven approaches and two hybrid approaches for thermal management in this thesis. The software driven approaches exploit the variability among the thermal profiles of different applications in a multi-tasking system. The first approach tries to determine the thermally optimal execution ordering of tasks in a multi-tasking...
... design a set of thermal management techniques that exploit application and hardware heterogeneity for thermal management. We observe that processor thermal behavior is highly sensitive to both the application characteristics and the processor configuration. Using these observations, we design two classes of thermal management techniques. The first class of techniques exploits hardware adaptivity to manage temperature...

... techniques, we present a brief account of how heat is produced and removed in a processor.

2.0.1 Heat Production & Removal in a Computing System

A typical computer system consists of one or more applications executing on a micro-processor. An application consists of a stream of instructions and each instruction encodes a specific sequence of activities on the different units of the processor. For instance, ...