APPLICATION-SPECIFIC WORKLOAD SHAPING IN RESOURCE-CONSTRAINED MEDIA PLAYERS

BALAJI RAMAN
Master of Science, NUS

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF COMPUTER SCIENCE
SCHOOL OF COMPUTING
NATIONAL UNIVERSITY OF SINGAPORE
January 2009

Acknowledgments

Samarjit Chakraborty, my graduate advisor and guru, accepted me as his PhD student, proposed this thesis topic, and was substantially involved in my research, writing, and presentation. Samarjit’s empathy towards students, his tolerance for my annoying demands, and his patience with my tortoise pace deserve a standing ovation from heaven. Samarjit taught me to acquire excellence as a habit, and to reject mediocrity, especially in writing. His countless pieces of advice on both technical and non-technical matters resonate in my everyday academic life.

Wei Tsang Ooi, my co-advisor and mentor, taught me, and trained me in, the fundamental skills that a research student should possess. This thesis benefited from Wei Tsang’s insistence on clarity in writing, correctness in results, and simplicity in style. His emphasis on research ethics was such that those rules are hammered into my head. Wei Tsang spent innumerable hours in meetings and in reviewing my writing. These countably infinite hours do not include the hours he spent devising small courses on writing, reading, and presentation, and pondering my research topics on his own. Untiring in these labors, and an excellent listener, he offered great career advice that suited me.

Tulika Mitra, my master’s thesis advisor, paved the way for my doctoral studies. I enjoyed our weekly meetings, where I learned why and how to make an effort to think and concentrate on a research problem. I practice the discipline and the integrity that Tulika taught through her own actions. Apart from all this advice, I benefited greatly from Tulika’s teaching on diligence in writing, especially when presenting related work.
I had the good fortune that Paolo Ienne gave me an opportunity to intern at EPFL. The intense intellectual discussions on my thesis helped me to a great extent in writing my thesis after my internship. Paolo presented my thesis work in an important international forum, and explained its impact to the relevant audience. His advice on my career had a significant, positive impact on my applications for postdoctoral jobs.

I thank the numerous reviewers of my publications, who pointed out several improvements and gave concrete suggestions. In particular, I thank my thesis committee members, Weng Fai Wong, Wang Ye, and Andy Pimentel.

Many people gave generously of their time, and helped me with the administration. I thank Loo Line Fong for responding to me promptly at critical times. I thank her as well for administrative support during my student years at NUS. I thank Chan Tim Fook, Embedded Systems laboratory in-charge, who provided me with all the computational resources I needed. I thank the following friends, who helped me communicate with staff at NUS after I came to France: Ankit Goel, Ashwin Nanjappa, and Deepak Gangadharan. Chantal Schneeberger, administrative staff at EPFL, went out of her way to help during my internship in Lausanne, Switzerland.

My friends provided the needed rest and relaxation in the form of plays and movies. I thank Chanakya, Subramanian, and Sudharsanan for counseling me at difficult times, for loaning money when needed, and for providing company when deadlines required working past midnight. I cherished the company of Ramkumar, Senthilnathan, Unmesh, Chandra, Vijaykumar, Pan Yu, Linh, Kathy, Yanhong, Satish, Cheng Wei, Ma Lin, and other friends.

I am profoundly grateful to my parents, who were tolerant when I was too busy for trips to India, who stayed with me in Singapore for many months, who responded with useful advice and counseling every week, and who energized me during my vacations in India.
As though that were not enough, my father bore with me when I discussed all the technical details of my research work, and my mother sounded persuaded when I reasoned why I have been a student for so many years.

I am indebted to my sister Sudha Raman, whose confidence and success are infectious, and whose encouragement provided me with the essential moral support needed for my stay in Singapore. She provided me with partial financial support for attending conferences, and when my stipend arrived late. Sudha showed lots of patience whenever I stressed out over studies and vented at home. Sudha, from childhood, led me in my personal and academic life. While I will choose a different venue to completely state her positive influence on me, in brief: I dedicate this thesis to my sister Sudha Raman.

I thank you all and God.

Table of Contents

1 Introduction
  1.1 What is Workload Shaping?
  1.2 Shaping Techniques
  1.3 Thesis
2 Background
  2.1 Analytical Model: A Bird’s Eye Review
  2.2 Tuning Scheduler Parameters
    2.2.1 Methodologies
  2.3 Preliminaries
    2.3.1 Our System Model
3 Buffering for Smoothing
  3.1 Buffering Vs Workload
    3.1.1 Basic Intuition
    3.1.2 Related Work
  3.2 Frequency Estimation
  3.3 Delay Redistribution
    3.3.1 Motivation
    3.3.2 Relation to Previous Work
    3.3.3 Illustrative Example
    3.3.4 Problem statement
    3.3.5 Playout Delay Redistribution
    3.3.6 Buffer Size Estimation
    3.3.7 Results
4 Buffering for Multiple Applications
  4.1 Motivation
    4.1.1 Our Contribution
    4.1.2 Reference works
  4.2 Illustrative Example
    4.2.1 Problem Statement
  4.3 Dynamic Buffering
    4.3.1 Schedulability Analysis
  4.4 Discussion
5 Buffering with Stochastic Guarantees
  5.1 Basic Idea
  5.2 Motivation
  5.3 Illustrative Example
  5.4 Minimizing Buffering
    5.4.1 Buffer Underflow
  5.5 Numerical Evaluation
    5.5.1 Minimum playout delay
    5.5.2 Validation
6 Future Work and Conclusions
  6.1 Modeling Processor Waiting Time
  6.2 General Stochastic Framework
    6.2.1 A motivating example
  6.3 Final Remarks

List of Figures

1.1 Shaping Techniques for Multimedia players.
2.1 Dimensions of SoC Design.
2.2 System Model.
3.1 Our system model and technique. FIFO buffers connect PEs in pipeline. An application is partitioned and mapped onto the different PEs that run tasks concurrently. Buffer size reduces on redistributing playout delay.
3.2 Buffer fill levels with initial playout delay: (a) very small, (b) large, and (c) redistributed.
3.3 System Model.
3.4 Initial playout delay values as minimum required processor frequency drops and stabilizes.
3.5 Change in buffer fill levels with redistributing playout delay.
3.6 Playout delay estimation w.r.t. processing requirement of tasks (VLD and IQ) running in PE 1.
4.1 Setup for dynamic workload shaping.
4.2 Dynamically controlling the playout buffer fill level as two applications are being scheduled.
4.3 Buffering time versus workload for a low bit rate and low resolution video stream.
4.4 A schedulable system.
4.5 Schedulable regions for different flow.
4.6 A non-schedulable system.
4.7 Schedulable regions of a periodic task (p = 600 ms, e = 80 × 10^6 cycles).
4.8 Schedulable region for a setup consisting of a periodic task along with an MPEG-2 decoder decoding a low bit rate and low resolution video stream.
5.1 Processing requirement reduces with large initial delay. The production rate is high when playout starts after small delay.
5.2 Delay value reduces on relaxing buffer constraints. The output stream at times cannot catch up with consumption and playout buffer underflows.
5.3 Correlation among playout delay, buffer size, and buffer underflow. Increase in playout delay (and buffer size) decreases buffer underflow.
5.4 Playout buffer underflow over time. The variability in underflow substantially reduces with large increase in playout delay.
5.5 Meeting desired stochastic constraints. The probability that the playout buffer underflows is no more than the stochastic bounding function.
5.6 The cumulative distribution of processor frequency. Processor cycles/second allocated to the video decoding task and therefore the playout buffer underflow are probabilistic.
5.7 Accuracy of analytical model. Minimum playout delay estimated using mathematical model is close to the delay values obtained from simulation.
6.1 Multimedia SoC model.
6.2 Case a: Buffer underflow due to processor latency. Case b: Play-out constraint met with increase in processor share for decoding.
6.3 Model of communication.
6.4 System architectures and models used for analysis in previous works. Memory latency modeled for architectures with off-chip memory, shared memory, and FIFO (right to left).

Abstract

Much research in system-level design for multimedia devices is based on analysis with system models, but how insightful are they? System simulation is the prime technique used in computer architecture and embedded system design to explore potential design solutions and validate design choices.
Unfortunately, simulation seldom gives real insight or strong guarantees on the dynamic behavior of a system. On the other hand, existing analytical models cannot capture some important attributes of multimedia systems. Consequently, analysis with such mathematical models is not beneficial for efficient system design. A useful analysis, with either simulation or analytical models, should provide resource-saving techniques. These methods can exploit the key characteristic features of multimedia streams. The fluid nature of arrivals and the inconstant processing requirements of data items are inherent characteristics of multimedia. But these characteristics are predictable, so they can be studied to yield techniques that significantly save on-chip resources.

This thesis proposes techniques to shape multimedia workload so as to effectively utilize on-chip resources such as processor and memory. These shaping techniques attempt to solve the problem of providing guarantees for high-quality media output with minimal on-chip resources. The research approach is to use analytical models that accurately capture the variable characteristics in arrival and execution of items in multimedia streams. Such mathematical models, after analysis, yield deep insights for tuning certain application parameters. Using this parameter tuning, it is possible to reshape variable media workloads to reduce processing and storage requirements. The central tenet of this parametric tuning is to adapt the workload such that only the average or minimum processor cycle time required for every multimedia data item is provided, and not the maximum.

Our results show that choosing the appropriate initial playout delay (after which the video starts) can lead to effective processor utilization. This delay parameter is typically arbitrarily chosen.
Instead, we propose to estimate the value of the parameter such that it is sufficient to provide the average cycle time required for every data item. This delay, however, could be large and can lead to huge buffer sizes. Hence we propose two ways to reduce the buffer sizes: (1) in a multi-processor setup, this delay parameter could be redistributed to different processors, i.e., apart from the output device, the processors also start after some delay; and (2) allowing tolerable loss in quality. Both these methods show substantial reduction in buffer size. Our model estimates the delay parameter in all of the above-mentioned techniques.

Our mathematical framework is well suited to media streams in that it can express variability effortlessly and quickly explore cost-quality tradeoffs. These essential attributes of our model brought out the benefits of workload shaping. An important advantage of the workload fitting techniques comes from the stochastic models: relaxing the constraints that guarantee full output quality yielded significant reductions in processing and memory requirements.

[...]

CHAPTER 6. FUTURE WORK AND CONCLUSIONS

[...] the arrivals (and processing requirements), either we are pessimistic about the number of items arriving at the input buffer or we do not capture a large set of streams. The latter point is that our arrival and processor service models will not include even streams that mostly conform to the bounds. Towards this, as a first step, we relaxed in our mathematical model the constraint that the playout buffer should never underflow. In Chapter 5, we numerically evaluated the minimum playout delay required such that the buffer underflow is tolerable. The work highlighted how much savings in terms of resources and processing requirement could be achieved with such relaxations of deterministic buffer underflow constraints.
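The trade-off described above (a larger initial playout delay in exchange for a lower processor frequency, and a smaller delay once some buffer underflow is tolerated) can be illustrated with a small sketch. This is not the real-time calculus model used in the thesis. It is a simplified, trace-based calculation under stated assumptions: the per-frame cycle demands are known in advance, all input is available at time zero, frames are decoded back-to-back at a fixed frequency, and frame i is displayed at time delay + i/fps. The function names and the sample numbers are hypothetical.

```python
import math
from itertools import accumulate


def lateness(cycles, freq_hz, fps):
    """Seconds by which each frame's decode completion exceeds i / fps.

    Assumes decoding starts at t = 0 and frames are processed back-to-back.
    """
    finish_times = (total / freq_hz for total in accumulate(cycles))
    return [finish - i / fps for i, finish in enumerate(finish_times)]


def min_playout_delay(cycles, freq_hz, fps, max_underflow_fraction=0.0):
    """Smallest initial playout delay (seconds) for the given trace.

    With max_underflow_fraction = 0 every frame is decoded before it is
    displayed. A positive fraction lets that share of frames miss their
    display time (a playout buffer underflow in this simplified model).
    """
    if not cycles:
        raise ValueError("cycles must be non-empty")
    if not 0.0 <= max_underflow_fraction < 1.0:
        raise ValueError("max_underflow_fraction must be in [0, 1)")
    late = sorted(max(0.0, x) for x in lateness(cycles, freq_hz, fps))
    allowed = min(math.floor(max_underflow_fraction * len(late)), len(late) - 1)
    # Ignoring the `allowed` largest lateness values leaves the delay that
    # still covers every remaining frame.
    return late[len(late) - 1 - allowed]


if __name__ == "__main__":
    # Hypothetical trace: cycles needed per frame (average 2e6, peak 4e6).
    trace = [4_000_000, 1_000_000, 1_000_000, 4_000_000, 1_000_000, 1_000_000]
    for freq in (50e6, 100e6):
        print(freq, min_playout_delay(trace, freq, fps=25))
    print("relaxed", min_playout_delay(trace, 50e6, fps=25, max_underflow_fraction=0.5))
```

For this sample trace at 25 frames per second, running at the average rate of 50 MHz needs a delay of about 0.08 s, running at the peak rate of 100 MHz needs about 0.04 s, and tolerating up to half of the frames arriving late at 50 MHz lowers the delay to about 0.06 s. The numbers only illustrate the direction of the trade-off, not the magnitudes reported in the thesis.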
These insights were made possible because the system design is mathematically modeled using real-time calculus; the mathematical framework preserves the inherent characteristic feature of a multimedia stream, that is, the variability in the arrival and execution of stream objects. The primary advantage of using such mathematical models is the fast exploration and analysis of several design parameters of the SoC. Currently, our effort is focused on moving from the deterministic setting to the stochastic framework. This research direction is promising, so, in future, we envision developing a system design software tool based on the mathematical framework we developed. The mathematical framework would incorporate all the important extensions pointed out in this chapter. We believe such a tool would be beneficial for the system design community in designing complex systems in a cost-effective manner.

Bibliography

Andreas Gerstlauer, Haobo Yu, and Daniel D. Gajski. RTOS modeling for system level design. In DATE’03: Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, pages 130–135, Munich, Germany, March 2003.
Ronnie T. Apteker, James A. Fisher, Valentin S. Kisimov, and Hanoch Neishlos. Video acceptability and frame rate. IEEE MultiMedia, 2(3):32–40, Spring 1995.
Scott Banachowski, Timothy Bisson, and Scott A. Brandt. Integrating best-effort scheduling into a real-time system. In Proceedings of the Real-Time Systems Symposium (RTSS), pages 139–150, Washington, December 2004. IEEE.
Jean-Yves Le Boudec and Patrick Thiran. Network calculus: a theory of deterministic queuing systems for the internet. Springer-Verlag, New York, 2001.
Scott A. Brandt, Scott Banachowski, Caixue Lin, and Timothy Bisson. Dynamic integrated scheduling of hard real-time, soft real-time and non-real-time processes. In Proceedings of the Real-Time Systems Symposium (RTSS), pages 396–405, Washington, December 2003. IEEE.
Le Cai and Yung-Hsiang Lu. Dynamic power management using data buffers. In Proceedings of the conference on Design, automation and test in Europe (DATE), volume 1, pages 526–531, Washington, DC, February 2004.
Le Cai and Yung-Hsiang Lu. Energy management using buffer memory for streaming data. IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, 24(2):141–152, 2005.
Jerome Chevalier, Olivier Benny, Mathieu Rondonneau, Guy Bois, El Mostapha Aboulhamid, and Francois-Raymond Boyer. Languages for system specification: Selected contributions on UML, systemC, system Verilog, mixed-signal systems, and property specification from FDL’03, chapter Space: a hardware/software systemC modeling platform including an RTOS, pages 91–104. Kluwer Academic Publishers, 2004. ISBN 1-4020-7990-7.
Youngchul Cho, Sungjoo Yoo, Kiyoung Choi, Nacer-Eddine Zergainoh, and Ahmed Amine Jerraya. Scheduler implementation in MpSoC design. In Proceedings of the conference on Asia South Pacific design automation (ASP-DAC 2005), pages 151–156, Shanghai, China, January 2005.
Kihwan Choi, Ramakrishna Soma, and Massoud Pedram. Off-chip latency-driven dynamic voltage and frequency scaling for an MPEG decoding. In Proceedings of the annual conference on design automation (DAC), San Diego, California, June 2004.
Dirk Desmet, D. Verkest, and Hugo De Man. Operating system based software generation for systems-on-chip. In Proceedings of the annual conference on Design automation (DAC), pages 396–401, Los Angeles, California, June 2000.
Kenneth J. Duda and David R. Cheriton. Borrowed-virtual-time (BVT) scheduling: supporting latency-sensitive threads in a general-purpose scheduler. In Proceedings of the Symposium on Operating System Principles (SOSP), pages 261–276, New York, December 1999. ACM.
Ajay Dudani, Frank Mueller, and Yifan Zhu. Energy-conserving feedback EDF scheduling for embedded systems with real-time constraints. In Proceedings of the joint conference on Languages, Compilers and Tools for embedded systems: software and compilers for embedded systems (LCTES-SCOPES), Berlin, Germany, June 2002.
Santanu Dutta, Rune Jensen, and Alf Rieckmann. Viper: A multiprocessor SoC for advanced set-top box and digital TV systems. IEEE Design and Test, 18(5):21–31, September/October 2001.
Anwar Elwalid and Debasis Mitra. Traffic shaping at a network node: Theory, optimum design, admission control. In Proceedings of the Annual Joint Conference of the Computer and Communications Societies (INFOCOM), pages 444–454, Washington, April 1997. IEEE.
Carla Fabiana Chiasserini and Ramesh R. Rao. Improving battery performance by using traffic shaping techniques. IEEE Journal on Selected Areas in Communications, 19(7):1385–1394, July 2001.
Lovic Gauthier, Sungjoo Yoo, and Ahmed Amine Jerraya. Automatic generation and targeting of application specific operating systems and embedded systems software. In DATE’01: Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, pages 679–685, Munich, Germany, March 2001.
Leonidas Georgiadis, Roch Guérin, Vinod Peris, and Kumar N. Sivarajan. Efficient network QoS provisioning based on per node traffic shaping. IEEE/ACM Transactions on Networking, 4(4):482–501, February 1996.
Pawan Goyal, Xingang Guo, and Harrick M. Vin. A hierarchical CPU scheduler for multimedia operating systems. In Proceedings of the Symposium on Operating Systems Design and Implementation (OSDI), pages 107–121, New York, October 1996. ACM.
Matthias Gries. Methods for evaluating and covering the design space during early design development. Integration, The VLSI Journal, 38(2):131–183, 2004.
Sang-Il Han, Xavier Guerin, Soo-Ik Chae, and Ahmed Amine Jerraya. Buffer memory optimization for video codec application modeled in simulink. In Proceedings of the ACM Annual conference on Design automation (DAC), pages 689–694, San Francisco, CA, July 2006.
Zhengting He, Aloysius Mok, and Cheng Peng. Timed RTOS modeling for embedded system design. In Proceedings of the Real Time and Embedded Technology and Applications Symposium (RTAS), pages 448–457, San Francisco, California, March 2005.
Sven Heithecker and Rolf Ernst. Traffic shaping for an FPGA based SDRAM controller with complex QoS requirements. In Proceedings of the annual conference on Design Automation (DAC), pages 575–578, New York, June 2005. ACM.
Tomas Henriksson, Pieter van der Wolf, Axel Jantsch, and Alistair Bruce. Network calculus applied to verification of memory access performance in SoCs. In Proceedings of the Workshop on Embedded Systems for Real-Time Multimedia (ESTIMedia), pages 21–26, Salzburg, Austria, October 2007.
Jianghai Hu and Yung-Hsiang Lu. Buffer management for power reduction using hybrid control. In Proceedings of the Conference on Decision and Control and the European Control Conference, pages 6997–7002, Washington, December 2005. IEEE.
Christopher J. Hughes, Jayanth Srinivasan, and Sarita V. Adve. Saving energy with architectural and frequency adaptations for multimedia applications. In Proceedings of the annual international symposium on Microarchitecture (MICRO), Austin, Texas, December 2001.
Chia-Hui Wang, Jan-Ming Ho, Ray-I Chang, and Shun-Chin Hsu. A feedback-controlled EDF scheduling algorithm for real-time multimedia transmission. Technical Report TR-IIS-01-008, Institute of Information Science, Academia Sinica, Taipei, Taiwan, ROC, 2001.
Chaeseok Im and Soonhoi Ha. An energy optimization technique for latency and quality constrained video applications. IEEE Design and Test, 21(5):358–366, September-October 2003.
Chaeseok Im and Soonhoi Ha. Dynamic voltage scaling for real-time multitask scheduling using buffers. In Proceedings of the 2004 ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES), pages 88–94, Washington, DC, June 2004.
Chaeseok Im, Huiseok Kim, and Soonhoi Ha. Dynamic voltage scheduling technique for low-power multimedia applications using buffers. In Proceedings of the Symposium on Low Power Electronics and Design (ISLPED), pages 34–39, New York, August 2001. ACM.
Ravindra Jejurikar and Rajesh Gupta. Dynamic voltage scaling for system wide energy minimization in real-time embedded systems. In Proceedings of the international symposium on Low power electronics and design (ISLPED), Newport Beach, California, August 2004.
Dong-Ik Ko and Shuvra S. Bhattacharyya. Modeling and optimization of buffering trade-offs for hardware implementation of image processing applications. In IEEE Workshop on Signal Processing Systems Design and Implementation, pages 591–596, Athens, Greece, November 2005.
Kanishka Lahiri, Anand Raghunathan, and Sujit Dey. System level performance analysis for designing on-chip communication architectures. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 20(6):768–783, June 2001a.
Kanishka Lahiri, Anand Raghunathan, and Sujit Dey. Performance and stability of communication networks via robust exponential bounds. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 20(6):952–961, June 2001b.
J.-Y. Le Boudec. Some properties of variable length packet shapers. IEEE/ACM Trans. on Networking, 10(3):329–337, 2002.
libmpeg2. A free MPEG2 video stream decoder. http://libmpeg2.sourceforge.net/, 2006.
Yanhong Liu, Alexander Maxiaguine, Samarjit Chakraborty, and Wei Tsang Ooi. Processor frequency selection for SoC platforms for multimedia applications. In Proceedings of the IEEE Real Time Systems Symposium (RTSS), pages 336–345, December 2004.
Yung-Hsiang Lu, Luca Benini, and Giovanni De Micheli. Dynamic frequency scaling with buffer insertion for mixed workloads. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 21(11):1284–1305, November 2002.
Jan Madsen, Kashif Virk, and Mercury Gonzales. Abstract RTOS modeling for multiprocessor system-on-chip. In Proceedings of the International Symposium on System-on-Chip, pages 147–150, Tampere, Finland, November 2003.
Sorin Manolache, Petru Eles, and Zebo Peng. Buffer space optimisation with communication synthesis and traffic shaping for NoCs. In Design, Automation and Test in Europe (DATE), pages 718–723, Belgium, March 2006. European Design and Automation Association.
Alexander Maxiaguine, Samarjit Chakraborty, Simon Kunzli, and Lothar Thiele. Evaluating schedulers for multimedia processing on buffer-constrained SoC platforms. IEEE Design and Test, 21(5):368–377, 2004.
R. Le Moigne, O. Pasquier, and J-P. Calvez. A generic RTOS model for real-time systems simulation with systemC. In DATE’04: Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, volume 3, pages 82–87, Paris, France, March 2004.
Sue B. Moon, Jim Kurose, and Don Towsley. Packet audio playout delay adjustment: performance bounds and algorithms. Multimedia Systems, (1):17–28, January 1998.
Arno Moonen, Marco Bekooij, René van den Berg, and Jef van Meerbergen. Decoupling of computation and communication with a communication assist. In Proceedings of the Euromicro Conference on Digital System Design Architectures, Methods and Tools (DSD), pages 63–68, Lübeck, Germany, August 2007.
Praveen K. Murthy and Shuvra S. Bhattacharyya. Buffer merging: A powerful technique for reducing memory requirements of synchronous dataflow specifications. ACM Transactions on Design Automation of Electronic Systems (TODAES), 9(2):212–237, April 2004.
Amit Nandi and Radu Marculescu. System-level power/performance analysis for embedded systems design. In Proceedings of the annual conference on Design automation (DAC), pages 599–604, June 2001.
Jason Nieh and Monica S. Lam. A SMART scheduler for multimedia applications. ACM Transactions on Computer Systems, 21(2):117–163, May 2003.
Nytimes. Apple introduces iPod that plays videos, 2005; New mobile phone signals Apple’s ambition, 2007. www.nytimes.com/2005/10/13/technology/13apple.html, www.nytimes.com/2007/01/09/technology/09cnd-iphone.html.
Donghwan Son, Chansu Yu, and Heung-nam Kim. Dynamic voltage scaling on MPEG decoding. In Proceedings of the International Conference on Parallel and Distributed Systems (ICPADS), Kyongju City, Korea, June 2001.
Preeti Ranjan Panda, Nikil D. Dutt, Alexandru Nicolau, Francky Catthoor, Arnout Vandecappelle, Erik Brockmeyer, Chidamber Kulkarni, and Eddy de Greef. Data and memory optimization techniques for embedded systems. ACM Transactions on Design Automation of Electronic Systems, 6(2):149–206, April 2001.
Ameet Patil and Neil Audsley. Implementing application specific RTOS policies using reflection. In RTAS ’05: Proceedings of the 11th IEEE Real Time and Embedded Technology and Applications Symposium, pages 438–447, San Francisco, California, March 2005.
JoAnn M. Paul, Alex Bobrek, Jeffrey E. Nelson, Joshua J. Pieper, and Donald E. Thomas. Schedulers as model-based design elements in programmable heterogeneous multiprocessors. In DAC’03: Proceedings of the 40th conference on Design automation, pages 408–411, Anaheim, CA, June 2003.
Thomas Plagemann, Vera Goebel, and Otto Anshus. Operating system support for multimedia systems. The Computer Communications Journal, 23(3):267–289, February 2000.
Christian Poellabauer and Karsten Schwan. Energy-aware traffic shaping for wireless real-time applications. In Proceedings of the Real-Time and Embedded Technology and Applications Symposium (RTAS), pages 48–55, Washington, May 2004. IEEE.
Balaji Raman and Samarjit Chakraborty. Application-specific workload shaping in multimedia-enabled personal mobile devices. In Proceedings of the international conference on Hardware/software codesign and system synthesis (CODES+ISSS), pages 4–9, New York, October 2006. ACM.
Balaji Raman, Samarjit Chakraborty, and Wei Tsang Ooi. Meeting CPU constraints by delaying playout of multimedia tasks. In Proceedings of the international workshop on Network and operating systems support for digital audio and video (NOSSDAV), pages 165–170, Stevenson, Washington, June 2005.
Balaji Raman, Samarjit Chakraborty, Wei Tsang Ooi, and Santanu Dutta. Reducing data-memory footprint of multimedia applications by delay redistribution. In Proceedings of the ACM/IEEE annual conference on Design automation (DAC), pages 738–743, June 2007.
Ramachandran Ramjee, Jim Kurose, Don Towsley, and Henning Schulzrinne. Adaptive playout mechanism for packetized audio applications in wide area networks. In Proceedings of the Annual Joint Conference of the Computer and Communications Societies (INFOCOM), pages 680–688, Washington, June 1998a. IEEE.
Ramachandran Ramjee, Jim Kurose, Don Towsley, and Henning Schulzrinne. Adaptive playout mechanism for packetized audio applications in wide area networks. In Proceedings of the IEEE Conference on Computer Communications (INFOCOM), pages 680–688, June 1998b.
Kai Richter and Rolf Ernst. Event model interfaces for heterogenous systems analysis. In Proceedings of the IEEE International Conference on Design Automation and Test in Europe (DATE), pages 506–513, March 2002a.
Kai Richter and Rolf Ernst. Event model interfaces for heterogenous systems analysis. In Proceedings of the IEEE International Conference on Design Automation and Test in Europe (DATE), pages 506–513, March 2002b.
Martijn J. Rutten, Jos T. J. van Eijndhoven, Egbert G. T. Jaspers, Pieter van der Wolf, Evert-Jan D. Pol, Om Prakash Gangwal, and Adwin Timmer. A heterogeneous multiprocessor architecture for flexible media processing. IEEE Design and Test, 19(4):39–50, July 2002.
Nima Sarshar and Xiaolin Wu. Buffer size reduction through buffer sharing for streaming applications. In IEEE International conference on Multimedia and Expo (ICME), pages 1635–1638, Taipei, Taiwan, June 2004.
Simon Schliecker, Matthias Ivers, and Rolf Ernst. Integrated analysis of communicating tasks in MPSoCs. In Proceedings of the international conference on Hardware/software codesign and system synthesis (CODES+ISSS), pages 288–293, Seoul, Korea, October 2006.
Sander Stuijk, Marc Geilen, and Twan Basten. Exploring trade-offs in buffer requirements and throughput constraints for synchronous dataflow graphs. In Proceedings of the ACM Annual conference on Design automation (DAC), pages 899–904, San Francisco, CA, April 2006.
Morihiko Tamai, Tao Sun, Keiichi Yasumoto, Naoki Shibata, and Minoru Ito. Energy-aware video streaming with QoS control for portable computing devices. In Proceedings of the international workshop on Network and operating systems support for digital audio and video (NOSSDAV), County Cork, Ireland, June 2004.
Tektronix. MPEG elementary streams. ftp://ftp.tek.com/tv/test/streams/Element/index.html, 1996.
Lothar Thiele, Samarjit Chakraborty, and Martin Naedele. Real time calculus for scheduling hard real time systems. In Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), pages 101–104, May 2000.
Patrick Thiran, Jean-Yves Le Boudec, and Frederic Worm. Network calculus applied to optimal multimedia smoothing. In Proceedings of the Annual Joint Conference of the Computer and Communications Societies (INFOCOM), pages 1474–1483, Washington, April 2001. IEEE.
Girish V. Varatkar and Radu Marculescu. Traffic analysis for on-chip networks design of multimedia applications. In Proceedings of the annual conference on Design Automation (DAC), pages 416–434, New York, June 2002. ACM.
Girish V. Varatkar and Radu Marculescu. On-chip traffic modeling and synthesis for MPEG-2 video applications. IEEE Transactions on Very Large Scale Integration Systems, 12(1):108–119, January 2004.
Ernesto Wandeler, Alexander Maxiaguine, and Lothar Thiele. Quantitative characterization of event streams in analysis of hard real-time applications. volume 29, pages 205–225, Netherlands, 2005. Springer.
Ernesto Wandeler, Alexander Maxiaguine, and Lothar Thiele. Performance analysis of greedy shapers in real-time systems. In Design, Automation and Test in Europe (DATE), pages 444–449, Belgium, March 2006. European Design and Automation Association.
Andreas Wieferink, Tim Kogel, Rainer Leupers, Gerd Ascheid, Heinrich Meyr, Gunnar Braun, and Achim Nohl. System level processor/communication co-exploration methodology for multiprocessor system-on-chip platforms. IEE Proceedings on Computer and Digital Techniques, 152(1):3–11, January 2005.
Duminda Wijesekera and Jaideep Srivastava. Quality of service (QoS) metrics for continuous media. Multimedia Tools and Applications, 3(2):127–166, July 1996.
Duminda Wijesekera, Jaideep Srivastava, Anil Nerode, and Mark Foresti. Experimental evaluation of loss perception in continuous media. Multimedia Systems, 7(6):486–499, November 1999.
Hoesoek Yang, Hyunuk Jung, and Soonhoi Ha. Buffer minimization in RTL synthesis from coarse-grained dataflow specification. In Proceedings of the workshop on Synthesis And System Integration of MIxed Technologies (SASIMI), Nagoya, Japan, April 2006.
Sungjoo Yoo, Gabriela Nicolescu, Lovic Gauthier, and Ahmed Amine Jerraya. Automatic generation of fast timed simulation models for operating systems in SoC design. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition (DATE), pages 620–627, Paris, France, March 2002.
Wanghong Yuan and Klara Nahrstedt. Energy-efficient soft real-time CPU scheduling for mobile multimedia systems. In Proceedings of the symposium on Operating systems principles (SOSP), pages 149–163, New York, October 2003. ACM.
Nicholas H. Zamora, Xiaoping Hu, and Radu Marculescu. System-level performance/power analysis for platform-based design of multimedia applications. ACM Transactions on Design Automation of Electronic Systems, 12(1):2, 2007.
Xingping Zhu and Sharad Malik. A hierarchical modeling framework for on-chip communicating architectures of multiprocessing SoCs. ACM Transactions on Design Automation of Electronic Systems, 12(1):6, January 2007.

[...] input and output multimedia streams is intrinsically good at modeling the inherent variability of multimedia workloads. (The calculus that is used to construct these models is described in detail in the subsequent chapter.) The three shaping techniques are: (1) smoothing, (2) squeezing, and (3) slashing. The first of these techniques, smoothing, shows the importance of tuning a key application parameter, [...] processing a multimedia stream. The advantages of capturing the characteristics of a multimedia stream will become clear [...]

Figure 1.1: Shaping Techniques for Multimedia players. [Figure labels: x(t), B, MPEG decoder, y(t), Datebook, ready queue of tasks, C(t), fill-level, compressed video stream, b, scheduler, shaping, Actual Workload, Smooth, Squeeze, Slash, decoded video.]

CHAPTER 1. INTRODUCTION
1.1 WHAT IS WORKLOAD SHAPING?

[...] time applications, and (2) unpredictability in the requirement of processing resources by the media workload (Patil and Audsley, 2005). So, there is a continuing interest in the OS research community in designing application-specific operating systems (Plagemann et al., 2000) such that in devices with limited processing capacity and buffer space, resources can be allocated with the knowledge of the application [...]
... resources in devices running multimedia applications. Towards achieving this goal, three workload shaping techniques are proposed in this thesis: smoothing, squeezing, and slashing. The insights from each of these techniques, if applied to system design, would potentially save significant resources and provide guarantees on the output quality of the multimedia applications. The mathematical theory behind ...

1.2 Shaping Techniques

Below, we explain the shaping techniques, emphasizing the benefits that shaping provides in terms of resource utilization. It will become clear that the advantages of the proposed techniques rest primarily on the model's accuracy in capturing the sequence of multimedia items in the stream. The mathematical framework used to represent input ...

... these resource savings were primarily due to the modeling of the data sequence in multimedia streams, before and after processing. In addition, this report also proposes a model in which the constraints on quality could be relaxed. This analytical framework enables an informed trade-off between tolerable loss in output and device cost. Together, as explained soon, we term our techniques workload shaping ...

... multimedia task over a time interval are adjusted such that other tasks could fit in; we term this technique squeezing. Consider a situation in which the multimedia task running on the processor consumes most of the processor bandwidth, so that no other task can run. Thus an incoming periodic task has to be shed because meeting the deadline of both the periodic and the multimedia ...
... certain application parameters, which can act as resource managers, so as to effectively utilize the on-chip resources. In this thesis, we propose insights to shape media application workloads using such design parameters (e.g., playout delay) so as to significantly reduce the on-chip resource requirements. The shaping techniques, as the reader will see, exploit the inherent characteristics of the multimedia ...

... define shaping, we must first understand the System-on-Chip (SoC) in a media player. Thus, we begin with an overview of the components of an SoC in portable players and their main functions. An SoC contains one or more processing elements, some buffer memory, and interfaces between memories and processors. Figure 1.1 shows this: the input and playout buffers are memories, and the processing element is linked ...

... chapter introduced the main goal of the thesis: to develop workload shaping techniques so as to effectively utilize on-chip processing and memory resources. In achieving this goal, the preceding chapter also stated the research approach, that is, to predict variability in the multimedia workload. It was argued that the variation in the number of data items arriving, and the variation in the execution requirement ...

... Following this, we briefly discuss these workload shaping techniques.

... The input buffer, tempor...
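
The excerpts above name playout delay as a design parameter that can shape on-chip buffer requirements. The following is only a minimal sketch of that trade-off, not the thesis's analysis (which, per the excerpts, is built on a calculus described in a later chapter). It assumes decoded frames enter the playout buffer at known timestamps and that the display removes one frame every `period` time units once playout starts after `delay`. The function names, the trace values, and the fixed-rate playout model are all hypothetical.

```python
from bisect import bisect_right


def min_playout_delay(arrivals, period):
    """Smallest start delay such that frame k is present at time delay + k*period.

    `arrivals` is a sorted list of times at which decoded frames reach the
    playout buffer (a hypothetical trace, not data from the thesis).
    """
    return max(a - k * period for k, a in enumerate(arrivals))


def peak_backlog(arrivals, period, delay):
    """Peak number of frames held in the playout buffer for a given delay.

    Occupancy can only peak just before a frame is consumed, so it is enough
    to check each playout instant. Raises ValueError on buffer underflow.
    """
    peak = 0
    for k in range(len(arrivals)):
        t = delay + k * period
        arrived = bisect_right(arrivals, t)  # frames with arrival time <= t
        if arrived <= k:
            raise ValueError(f"underflow: frame {k} is late at time {t}")
        peak = max(peak, arrived - k)  # frames 0..k-1 are already consumed
    return peak


# Hypothetical bursty trace (times in ms) and a 40 ms playout period.
trace = [0, 90, 100, 110, 200, 210]
d = min_playout_delay(trace, 40)
print(d, peak_backlog(trace, 40, d), peak_backlog(trace, 40, 100))
```

In this toy trace, starting playout later than necessary only increases the number of frames that must be buffered, while starting earlier than the minimum delay causes an underflow. That is the kind of parameter-versus-resource relationship the excerpts refer to, here reduced to a single fixed trace.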