PERCEPTION-AWARE LOW-POWER AUDIO PROCESSING TECHNIQUES FOR PORTABLE DEVICES

HUANG WENDONG
(B.Eng., Xidian University)
(M.Eng., Tsinghua University)

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF COMPUTER SCIENCE
NATIONAL UNIVERSITY OF SINGAPORE
2008

Acknowledgements

First and foremost, I sincerely thank my advisor, Dr. Wang Ye, for providing immediate help whenever I met difficulties in my studies. I consider myself very fortunate to have studied in his group. I continue to benefit from his guidance, encouragement and support in so many ways. He identified my problems and helped me to correct them, encouraged me to pursue academic goals, and gave me ample opportunities to develop my research ability. Without his solid support, this thesis would not have been possible.

I would like to thank Dr. Samarjit Chakraborty for introducing me to the field of embedded systems. It was during the joint project with him that I learned the SimpleScalar tool set and gained an understanding of network calculus. Both have proved helpful for my thesis work.

I would like to thank Dr. Wei Tsang Ooi and Dr. Weng Fai Wong for their valuable suggestions on my thesis proposal. These suggestions inspired me to consider my thesis work from new perspectives.

I thank everyone in the multimedia lab and DIVA. They are all good lab-mates and always ready to help me. I bothered them again and again to take part in those boring audio subjective tests, and they never hesitated to do so. I will certainly miss Huaxin and Zhaoming for their kindness. I will miss Yicheng, Zhang Sheng, and Zhehui as well, for discussing interesting problems. I especially thank Tran Vu An for helping me to prepare the video experimental data and organize the subjective tests for the thesis work. I thank Xiaopeng for his kindness.
I thank Ye Ning for his professional answers to my various questions about LaTeX. I thank Bingjun for sharing uncertainty modeling materials with me, although I have in fact spent little time reading them.

I am very grateful to my parents. They always encourage me, support me with dedication and ask nothing of me. They are a constant source of spiritual strength.

Last, but not least, I would like to thank my wife, Liu Bo, for all she has done during these four years. She has managed to free me from the cares of the housework. She has missed many opportunities for enjoyment while I have been occupied by work, and she has suffered a lot from my tension and frustration. Yet she has shown understanding and has said nothing about any of this, although she always complains that I spend too much time on the computer outside of work.

Table of Contents

Chapter 1 Introduction
1.1 Background: System Organization and Power Consumption Issues
1.1.1 System Organizations and Sources of Power Consumption
1.1.2 Energy Efficient Approaches for Computation Components
1.2 Characteristics of Audio Decoding Applications
1.3 Related Works
1.3.1 Workload Reduction
1.3.2 DVS Techniques
1.3.3 Main Challenges of the Existing Techniques for Low Power Audio Applications
1.4 Our Methodology of Low Power Audio Techniques for Portable Devices
1.5 Contributions of the Thesis
Chapter 2 A Joint Encoder-Decoder Framework for Supporting Low Power Audio Decoding
2.1 Introduction
2.2 Related Works
2.2.1 Noise Shaping Techniques in AAC
2.2.2 Computation Efficient Techniques for Transforms
2.3 Overview of the Proposed Work
2.4 Joint ASP and Quantization Noise Shaping
2.4.1 Truncation Noise Shaping of SOPOT Coefficients
2.4.2 Noise Allocation over SOPOT Coefficient Blocks
2.4.3 Workload Estimation Module
2.5 Experimental Results
2.5.1 IFFT Workload Reduction
2.5.2 Subjective Evaluation
2.5.3 Increase of File Sizes
Chapter 3 An Optimal DVS Scheme Supported by Media Servers for Low-Power Multimedia Applications
3.1 Introduction
3.2 Problem Formulation
3.3 Energy Optimization Techniques
3.3.1 Bounds on the Processor Speed
3.3.2 Estimation of the Input Buffer and the Playback Buffer
3.3.3 The Optimal Speed Profile Algorithm
3.4 Experimental Results
3.4.1 Experimental Results for Audio
3.4.2 Experimental Results for Video
3.5 Proof of Optimality
Chapter 4 Frequency Band and Stereo Image based Workload Scalable Decoding Scheme
4.1 Introduction
4.1.1 Perception-Awareness in Audio Decoding
4.1.2 Perception aware Workload Scalable Processing
4.2 Frequency Band and Stereo Image Scalable Decoding
4.2.1 Frequency Bandwidth Scalability
4.2.2 Stereo Image Scalability
4.2.3 Multiple Level Decoding
4.3 Efficient Algorithm for Synthesis Filterbank
4.3.1 Asymmetric Partial Spectrum Reconstruction for Stereo Audio
4.3.2 Conceptual Framework
4.3.3 Cosine Re-modulation
4.3.4 Polyphase Subfilters
4.3.5 Up-Sampling by Repetition
4.4 Experimental Evaluation
4.4.1 Subjective Evaluation of BSS Decoding Scheme
4.4.2 Workload Estimation
Chapter 5 Conclusions and Future Works
5.1 Summary
5.2 Future Works
Bibliography

Summary

Energy efficiency is a critical design consideration for portable devices. With the popularity of multimedia applications on such platforms, energy efficiency methods optimized for these applications are becoming increasingly important. In this thesis, we study perception-aware low power audio processing techniques for portable devices. This work is mainly motivated by the fact that audio decoding is a significant source of energy consumption in the context of portable devices, yet it has received much less attention so far. Energy efficient techniques have been widely studied for video decoding applications.
Audio decoding applications, however, have different characteristics and stricter requirements on playback quality. As a result, low power audio decoding applications are not sufficiently supported by the current high-level design methodologies and concrete techniques of low power multimedia processing.

Targeting low power audio decoding, we propose a new conceptual methodology framework based on the usage modes of the portable device. It makes use of two kinds of design strategies. First, for the case in which the application's resource requirements are satisfied, we extend the low power design to the encoder and media server side to optimize the energy efficiency of the decoding process without degrading playback quality. Second, for the case in which its requirements are not satisfied because the available resources are limited, we propose the concept of workload scalable decoding to support low power resource scheduling.

The main contributions of the thesis are as follows.

We present a novel scheme, a joint encoder-decoder framework (JEDF), which allows the decoder to achieve a desirable tradeoff between energy and storage consumption without sacrificing playback quality. JEDF employs the Approximate Signal Processing (ASP) technique at the decoder side to reduce the computational workload. To guarantee the playback quality, JEDF jointly shapes the ASP noise (introduced by the decoder) and the quantization noise (introduced by the encoder) subject to the masking threshold.

We propose a new scheme in which the media server supports DVS for low power multimedia decoding, to overcome the inherent limitations of existing DVS techniques. As a step in this new direction, we have designed an optimal speed control scheme, which achieves the maximal energy savings among all feasible speed profiles for the given multimedia bitstream.

We propose a frequency Band and Stereo-image Scalable (BSS) decoding framework based on an analysis of the perceptual relevance of different audio components.
BSS provides the desired workload scalability for the resource scheduling process. In particular, we have designed a novel algorithm, namely asymmetric partial spectrum reconstruction (APSR), to remove the redundant computations associated with stereo-image scalability.

List of Tables

Table 3.1 Experimental results on energy consumption and buffer requirements for audio bitstream. IB and PB: input buffer size and playback buffer size of the proposed scheme; EnR and PBR: energy consumption ratio and playback buffer requirement of the baseline over the proposed scheme, respectively
Table 3.2 Configurations of decoding the six video clips. FI and FP: feasibility condition for input buffer and playback buffer respectively, both measured in Macro Blocks; the value in bold is used for both input buffer and playback buffer to estimate the other items; IB and PB: input buffer size and playback buffer size, measured in Kbytes, both derived from max(FI,FP); Delay: delay introduced by buffering, in sec.
Table 3.3 Comparisons between our scheme and the baseline 2. NEC: normalized energy consumption of the baseline over our scheme; BUF: maximal buffer occupancy of the baseline in terms of Macro Blocks; RED: reduced buffer size ratio achieved by our scheme (referred to Table 3.2).
Table 3.4 Energy consumption ratio between the proposed scheme and the TMEC
Table 4.1 Four different decoding groups
Table 4.2 Five decoding levels, where workload reduction is measured in terms of subbands, with a standard MP3 decoder (decoding level 5) as the baseline
Table 4.3 Perceptual evaluation results for different APSR profiles

[...] S:32), (M:32, S:16), (M:32, S:8) and (M:32, S:4), respectively. Each program had an additional copy with (M:32, S:32) given as the reference. The result of our subjective evaluation is shown in Table 4.3.
M:32    Sample  Sample  Sample  Sample  Sample
S:32    4.90    4.85    4.93    4.90    4.90
S:16    4.90    4.90    4.90    4.90    4.95
S:8     4.75    4.85    4.70    4.85    4.87
S:4     4.35    4.65    4.80    4.33    4.57

Table 4.3 Perceptual evaluation results for different APSR profiles

We can see that the profiles (M:32, S:4) and (M:32, S:8) incur only slight quality degradation. In particular, the profile (M:32, S:16) is almost indistinguishable from the full decoding profile used in the standard MPEG audio decoder. These observations show that the proposed APSR algorithm can provide satisfactory playback quality.

4.4.2 Workload Estimation

We evaluated our decoder using two classes of audio clips, one with a bitrate of 160 kbits/sec and the other with a bitrate of 128 kbits/sec. In the former class, the average number of bits per frame is higher than in the latter. We measured the workload of these clips with an ARM simulation tool [55]. We implemented our BSS decoder based on the MP3 decoder from Fraunhofer IIS, which is the reference source code of the MPEG-1 audio standard. The main merit of the reference decoder is that its implementation is well documented in the informative part of [57], which facilitates the reader's analysis and comparison. All the audio clips we used had a sampling frequency of 44.1K PCM samples/sec per channel, which corresponds to CD-quality audio, and a duration of 20 sec.

We measured the workload of a task as the number of instructions executed to accomplish it, which is a usual measure of computer performance. The workload W of decoding an MP3 frame is closely related to the processor's speed, as the following example illustrates. Because audio decoding is a real-time application, each audio frame must be decoded within a period T to avoid playback jitter. The processor's speed must therefore be set to at least W/T. Through workload reduction, we can lower the required processor speed, which results in energy savings when DVS is used.
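The relation between per-frame workload and minimum processor speed can be sketched in a few lines. This is only an illustration under stated assumptions: the frame period T is taken to be the playback duration of one MPEG-1 Layer III frame (1152 PCM samples per channel, so T = 1152/44100 sec at 44.1 kHz), processor speed is expressed in instructions per second, and the function name is hypothetical. The example workload of 11.76 million instructions per frame is the full-decoding figure reported in Table 4.4.

```python
# Assumption: one MPEG-1 Layer III frame carries 1152 PCM samples per channel.
SAMPLES_PER_MP3_FRAME = 1152


def min_processor_speed(workload_instr, sampling_rate_hz,
                        samples_per_frame=SAMPLES_PER_MP3_FRAME):
    """Lowest speed (instructions/sec) that decodes one frame within its period T."""
    frame_period_s = samples_per_frame / sampling_rate_hz  # T
    return workload_instr / frame_period_s                 # W / T


if __name__ == "__main__":
    # W = 11.76 million instructions per frame (full decoding, Table 4.4).
    speed = min_processor_speed(11.76e6, 44_100)
    print(f"{speed / 1e6:.1f} million instructions/sec")
```

Under these assumptions the full-decoding workload needs roughly 450 million instructions per second, and any reduction of W lowers the required speed in the same proportion.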
Table 4.4 lists the average workloads, in million instructions per frame (MIPF), for the five decoding levels using the high bitrate class of clips. We obtained almost identical results for the low bitrate (128 kbits/sec) class of clips as well. Based on these MIPF values, we calculated the workload reduction with respect to decoding level 5. For a given MP3 clip, let its workload at decoding level j be $W_j$, $1 \le j \le 5$. For decoding level i, $1 \le i \le 4$, the workload reduction is calculated as:

\[ \frac{W_5 - W_i}{W_5} \]

The significance of the normalized workload reduction is twofold. First, as pointed out in [26], MIPF values of the same application vary across platforms, but the values on one platform can be derived from those on another by a scaling factor. The normalized workload reduction is therefore more meaningful and provides generic information across platforms. Second, the processing modules may be implemented with various techniques. For example, the IMDCT can be performed with a direct implementation or a fast implementation, which makes the workload vary widely between implementations. The normalized workload reduction is insensitive to particular implementations and remains stable when the linear component dominates the overall workload.

Decoding level            5          4          3          2         1
Decoding configuration    M:32 S:32  M:32 S:16  M:24 S:12  M:16 S:8  M:8 S:0
MIPF                      11.76      8.95       6.85       4.77      1.71
Workload reduction        --         23.9%      41.7%      59.4%     85.0%

Table 4.4 The average workload (MIPF) for the five decoding levels, along with the normalized workload reduction with respect to the standard MP3 decoder (decoding level 5)

Table 4.4 verifies that the workload is roughly proportional to the frequency bandwidth to be decoded. Clearly, the decoding configuration (M:32, S:32) requires the highest amount of computation. Compared to this baseline, the other decoding levels achieve significant workload reductions in approximately even steps.
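The normalized workload reduction, and its independence from a platform scaling factor, can be checked with a small sketch. The MIPF values are those listed in Table 4.4. The numbering of levels from 5 (standard decoder) down to 1 and the function name are assumptions made for illustration. Percentages recomputed from the rounded MIPF values need not match the tabulated ones exactly.

```python
# MIPF values from Table 4.4; level numbering 5..1 is an assumption of this sketch.
MIPF_BY_LEVEL = {5: 11.76, 4: 8.95, 3: 6.85, 2: 4.77, 1: 1.71}


def normalized_reduction(w_level, w_baseline):
    """Return (W5 - Wi) / W5 for a level workload Wi and baseline workload W5."""
    return (w_baseline - w_level) / w_baseline


if __name__ == "__main__":
    baseline = MIPF_BY_LEVEL[5]
    for level in (4, 3, 2, 1):
        r = normalized_reduction(MIPF_BY_LEVEL[level], baseline)
        print(f"level {level}: {100 * r:.1f}% workload reduction")

    # A platform scaling factor k (hypothetical value) cancels out of the ratio.
    k = 1.8
    same = normalized_reduction(k * MIPF_BY_LEVEL[3], k * baseline)
    print(f"level 3 on a scaled platform: {100 * same:.1f}%")
```

Because the scaling factor appears in both the numerator and the denominator, the ratio is unchanged when every MIPF value is multiplied by the same constant, which is the first point made in the text above.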
Chapter 5 Conclusions and Future Works

5.1 Summary

In this thesis, we looked into perception-aware low power audio processing techniques for portable devices. We exploit two methods, namely workload reduction and DVS, to achieve energy efficiency. For low power audio decoding applications, the most challenging problem is that their characteristics and design requirements differ from those targeted by current low power multimedia technologies. We approached this problem at two levels, high-level design methodology and concrete techniques, and covered three important aspects: workload reduction without degradation of playback quality, DVS with smoothing of fluctuations, and workload scalability. These three techniques provide a comprehensive solution for the application scenarios of portable devices.

We believe that low power audio processing techniques are still in their infancy, which is reflected at the levels of both methodology and concrete techniques. Unlike in more mature disciplines, there is no suitable established methodology framework to guide the design of low power audio processing. This is the main reason for the ineffectiveness of existing techniques. We therefore began by establishing a methodology framework for designing low power audio processing technology. The key ideas of our work can be summarized as three concepts. First, we have developed a taxonomy of low power audio processing techniques. It is based on the heterogeneous usage modes of portable devices and consists of two kinds of techniques with different objectives and design strategies. Second, we have proposed to extend low power audio decoding design to the encoder side. This is the key to achieving optimized energy efficiency without playback quality degradation, a problem that existing techniques are unable to solve because of their inherent limitations. This idea is also applicable to DVS.
Third, we have proposed the concept of workload scalable decoding to support the energy efficient operation of voltage schedulers and users' choices. The significance of these three concepts lies in the fact that they partially fill the gap between the requirements of audio applications and existing low power techniques, and may lead to a number of innovative low power techniques targeting audio processing.

More specifically, following the methodology proposed above, the main results obtained in this thesis can be summarized as follows.

• We have proposed a novel framework, JEDF, which allows the decoder to achieve a desirable tradeoff between energy and memory consumption without sacrificing playback quality. This is achieved by a joint noise shaping process. With no statistically significant differences from a standard encoder of the same configuration, JEDF can on average save around 40% of the overall computational workload of an AAC low complexity decoder. On the other hand, it incurs only a modest increase in file sizes. For a bit rate of 128 Kb/s, the average file size increase ratio is 9.52%, and the increase ratio of the compressed file sizes decreases as the bit rate increases. More importantly, JEDF represents a new direction for workload reduction in long block transform computations. Although various methods have been proposed to save computations for transforms, it is hard to find an effective way to reduce the workload of a long block transform. In a similar manner to the "lossy compression" used in audio encoding to achieve high compression ratios, JEDF performs a "lossy transform" for the long block transform, in which higher noise levels are allowed in order to achieve significant workload reduction. JEDF then addresses these noises by joint noise shaping.

• We have proposed a new concept of media server supporting DVS, which is superior to existing client-only approaches.
As a first step in this new direction, we have designed an optimal speed control scheme for intra-task voltage scheduling. The significance of the optimal solution is twofold. First, it is of significant theoretical interest, since this is the first work to identify the lower bound of the energy consumption achievable for the given buffers. Second, it requires much less memory than the existing approaches. This largely facilitates its application and provides additional opportunities for energy savings. In terms of video, experimental results show that the proposed techniques: 1) lead to only 2% additional energy consumption compared to the theoretical minimal energy consumption; 2) require buffer sizes of less than frames, and introduce a delay of less than 0.1 sec. Compared to representative buffering based DVS techniques, our work improves energy efficiency by 28.3% with the same buffer sizes, or reduces the buffer requirement by 50% at the same level of energy consumption. These results demonstrate the superiority of the media server supported DVS scheme.

• We have, for the first time, identified the dynamic nature of perception awareness in audio playback applications in the context of portable devices. It provides a new opportunity for workload scalability. In implementing workload scalability, we have solved two key issues. First, how should the decoding levels be designed so that they offer desirable tradeoffs between playback quality and workload? This problem turned out to be closely related to the user's listening experience. Second, how can stereo image scalability be exploited to reduce the workload of the synthesis filterbank? To address this problem, we have developed the APSR technique, which is a useful extension of the well-known PSR technology.

5.2 Future Works

Based on the study presented in this thesis, some possible future works are listed below.
• In our implementation of JEDF, we have concentrated on the truncation noise shaping of the IFFT in the IMDCT to achieve the specified workload reduction. As an immediate next step, we plan to extend the proposed scheme to other parts of the IMDCT. Furthermore, how to represent the new side information, such as the truncation positions of SOPOT coefficients, in an efficient way remains to be addressed. Because we currently concentrate on truncation noise shaping to accomplish the specified workload reduction, the truncation positions are irregular across SOPOT coefficients. This implies that the side information on the truncation positions must be inserted into the coded bitstream. A critical issue is then how to handle this new side information efficiently, as we have 255 coefficient blocks for a 512-point IFFT. We plan to solve this problem by clustering all the frames into a limited number of groups, in which all members of a group share the same side information on truncation positions. This method will efficiently compress the side information.

• In terms of media server supported DVS, we plan to extend the proposed scheme to multiple task scenarios. Moreover, the analysis framework also provides insights into selecting input and playback buffer configurations for individual media bitstreams with more accurate estimations. The new estimation can be used to identify the built-in buffer ranges needed to support a class of multimedia processing applications, which is an important issue in designing an SoC platform.

• BSS has provided the characteristics required to support energy efficient operation of the voltage scheduler. A critical open issue is how to design voltage scheduling algorithms that find the optimized tradeoff between playback quality and the limited available computational resources.
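To make the last open issue concrete, the following sketch shows one naive selection policy: pick the highest decoding level whose required speed W/T fits within the currently available processor speed. It is not a method from the thesis, only an illustration of the kind of decision a voltage scheduler would face. The MIPF values come from Table 4.4; the level numbering, the frame period of 1152 samples at 44.1 kHz, and all names are assumptions.

```python
# Assumed frame period T: 1152 PCM samples per frame at 44.1 kHz.
FRAME_PERIOD_S = 1152 / 44_100

# (decoding level, workload in instructions per frame), values from Table 4.4.
LEVELS = [(5, 11.76e6), (4, 8.95e6), (3, 6.85e6), (2, 4.77e6), (1, 1.71e6)]


def pick_decoding_level(available_speed, levels=LEVELS,
                        frame_period_s=FRAME_PERIOD_S):
    """Return the highest level whose required speed W/T fits, or None if none fits."""
    for level, workload in sorted(levels, reverse=True):
        required_speed = workload / frame_period_s
        if required_speed <= available_speed:
            return level
    return None


if __name__ == "__main__":
    for budget in (500e6, 300e6, 100e6, 50e6):
        level = pick_decoding_level(budget)
        print(f"{budget / 1e6:.0f} M instr/sec available -> level {level}")
```

A real scheduler would also have to weigh perceived quality, level switching artifacts and energy cost, none of which this greedy rule models.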
Bibliography

[1] Acquaviva, A.; Benini, L.; and Ricco, B., “Software-controlled Processor Speed Setting for Low-power Streaming Multimedia,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 20(11), November 2001, pp. 1283-1292
[2] Argenti, F.; Del Taglia, F.; and Del Re, E., “Audio Decoding with Frequency and Complexity Scalability,” IEE Proceedings on Vision, Image and Signal Processing, 149(3), 2002, pp. 152-158
[3] Austin, T.; Larson, E.; and Ernst, D., “SimpleScalar: An Infrastructure for Computer System Modeling,” IEEE Computer, 35(2), 2002, pp. 59-67
[4] Bavier, A.; Montz, A.; and Peterson, L., “Predicting MPEG Execution Times,” SIGMETRICS/PERFORMANCE International Conference on Measurement and Modeling of Computer Systems, 1998, pp. 131-140
[5] Benini, L.; and De Micheli, G., “System-level Power Optimization: Techniques and Tools,” ACM Transactions on Design Automation of Electronic Systems, Vol. 5, No. 2, Apr. 2000, pp. 115-192
[6] Bernhard, G., “A Bit Rate Scalable Perceptual Coder for MPEG-4 Audio,” 103rd Audio Engineering Society Convention, 1997, Preprint 4620
[7] Bosi, M.; and Goldberg, R. E., “Introduction to Digital Audio Coding and Standards,” Kluwer Academic Publishers, 2002
[8] Breuer, M. A., “Determining Error Rate in Error Tolerant VLSI Chips,” IEEE International Workshop on Electronic Design, Test and Applications, Jan. 2004, pp. 321-326
[9] Breuer, M. A., “Multi-media Applications and Imprecise Computation,” Euromicro Conference on Digital System Design, Aug. 2005, pp. 2-7
[10] Britanak, V.; Rao, K. R., “An Efficient Implementation of the Forward and Inverse MDCT in MPEG Audio Coding,” IEEE Signal Processing Letters, Vol. 8(2), 2001, pp. 48-51
[11] Brooks, D.; Tiwari, V.; and Martonosi, M., “Wattch: A Framework for Architecture-level Power Analysis and Optimization,” International Symposium on Computer Architecture, 2000, pp. 83-94
[12] Bruno, D.; Dorgival, G.; Wagner, M.; and Ricardo, B., “Limiting the Power Consumption of Main Memory,” ACM SIGARCH Computer Architecture, San Diego, USA, June 2007, pp. 290-301
[13] Burd, T. D.; Pering, T. A.; Stratakos, A. J.; and Brodersen, R. W., “A Dynamic Voltage Scaled Microprocessor System,” IEEE Journal of Solid-State Circuit, Vol. 35, No. 11, Nov. 2000, pp. 1571-1580
[14] Cai, L.; and Lu, Y. H., “Dynamic Power Management using Data Buffers,” Conference on Design, Automation and Test in Europe, 2004, pp. 526-531
[15] Chakraborty, S.; Wang, Y.; and Huang, W., “A Perception-Aware Low-Power Software Audio Decoder for Portable Devices,” IEEE Workshop on Embedded Systems for Real-Time Multimedia, September 22-23, 2005, New York, pp. 13-18
[16] Chan, S. C.; and Yiu, P. M., “An Efficient Multiplierless Approximation of the Fast Fourier Transform Using Sum-of-Powers-of-Two (SOPOT) Coefficients,” IEEE Signal Processing Letters, Vol. 9, Part 10, 2002, pp. 322-325
[17] Chandrakasan, A.; Gutnik, V.; and Xanthopoulos, T., “Data Driven Signal Processing: An Approach for Energy Efficient Computing,” International Symposium on Low Power Electronics and Design, Monterey, CA, USA, 1994, pp. 347-352
[18] Chang, N.; Kim, K.; and Lee, H. G., “Cycle-accurate Energy Consumption Measurement and Analysis: Case Study of Arm7tdmi,” International Symposium on Low-Power Electronics and Design, 2000, pp. 185-190
[19] Chen, C.; Chang, C.; and Ku, C., “A Low Power-Consuming Embedded System Design by Reducing Memory Access Frequencies,” IEICE Transactions on Information and Systems, Vol. E88-D, 12, Dec. 2005, pp. 2748-2756
[20] Chen, R. Y.; Irwin, M. J.; and Bajwa, R. S., “Architecture-level Power Estimation and Design Experiments,” ACM Transactions on Design Automation of Embedded Systems, 2001, pp. 50-66
[21] Chen, W. H.; Smith, C. H.; and Fralick, S., “A Fast Computational Algorithm for the Discrete Cosine Transform,” IEEE Transactions on Communications, COM-25(9), September 1977, pp. 1004-1009
[22] Cheng, L.; Mohapatra, S.; Zarki, M. E.; Dutt, N.; and Venkatasubramanian, N., “A backlight optimization scheme for video playback on mobile devices,” IEEE Consumer Communications and Networking Conference, Vol. 2, 8-10 Jan. 2006, pp. 883-887
[23] Cho, Y. C.; Choi, S., “Nonnegative features of spectro-temporal sounds for classification,” Pattern Recognition Letters, Vol. 26(9), 2005, pp. 1327-1336
[24] Choi, K.; Dantu, K.; Cheng, W.; and Pedram, M., “Frame-based Dynamic Voltage and Frequency Scaling for a MPEG Decoder,” International Conference on Computer Aided Design, 2002, pp. 732-737
[25] Choi, K.; Soma, R.; and Pedram, M., “Off-chip Latency-driven Dynamic Voltage and Frequency Scaling for an Mpeg Decoding,” Design Automation Conference, 2004, pp. 544-549
[26] Chung, E.; Benini, L.; and Micheli, G., “Contents Provider-Assisted Dynamic Voltage Scaling for Low Energy Multimedia Applications,” International Symposium on Low Power Electronics and Design, 2002, pp. 42-47
[27] Contreras, G.; and Martonosi, M., “Power Prediction for Intel XScale Processors Using Performance Monitoring Unit Events,” International Symposium on Low Power Electronics and Design, Aug. 2005, pp. 221-226
[28] Crochiere, R. E.; and Rabiner, L. R., “Multirate Digital Signal Processing,” Prentice-Hall, 1983
[29] Daudet, L.; Torrésani, B., “Hybrid Representations for Audiophonic Signal Encoding,” Signal Processing, 82(11), 2002, pp. 1595-1617
[30] De Smet, P.; Bruyland, I., “Optimized Recursive Subband Synthesis Windowing for Implementing Efficient MPEG Audio Decoders,” IEEE Signal Processing Letters, Vol. 10(10), 2003, pp. 303-306
[31] De Smet, P.; Rooms, F.; Luong, H.; and Philips, W., “Do Not Zero-Pute: An Efficient Homespun MPEG-Audio Layer II Decoding and Optimization Strategy,” ACM Multimedia Conference, Oct. 2004, pp. 376-379
[32] Diniz, P. S.; Silva, E. A.; and Netto, S. L., “Digital Signal Processing: System Analysis and Design,” New York: Cambridge University Press, 2001
[33] Duarte, D. E.; Vijaykrishnan, N.; and Irwin, M., “A Clock Power Model to Evaluate Impact of Architectural and Technology Optimizations,” IEEE Transactions on VLSI, Vol. 10, No. 6, Dec. 2002, pp. 844-855
[34] Fan, X.; Ellis, C.; and Lebeck, A., “Memory Controller Policies for DRAM Power Management,” International Symposium on Low Power Electronics and Design, Aug. 2001, pp. 129-134
[35] Feig, E.; and Winograd, S., “Fast Algorithms for the Discrete Cosine Transform,” IEEE Transactions on Signal Processing, 40(9), September 1992, pp. 2174-2193
[36] Figueras, J., “Modeling Power at Different Levels” in “Low Power Design in Deep Submicron Electronics,” edited by Nebel, W.; and Mermet, J., Kluwer Academic Publishers, 1996
[37] Flautner, K.; and Mudge, T., “Vertigo: Automatic Performance-setting for Linux,” 5th Symposium on Operating Systems Design and Implementation, Boston, MA, Dec. 2002, USENIX, pp. 105-116
[38] Fogel, D. B., “What Is Evolutionary Computation?” IEEE Spectrum, Vol. 37, No. 2, February 2000, pp. 26-32
[39] Forsyth, N. T.; Chambers, J. A.; Naylor, P. A., “Alternating fixed-point algorithm for stereophonic acoustic echo cancellation,” IEE Proceedings on Vision, Image and Signal Processing, Vol. 149(1), pp. 1-9
[40] Friedman, E. G., “Clock Distribution Networks in Synchronous Digital Integrated Circuits,” Proceedings of the IEEE, Vol. 89(5), May 2001, pp. 665-692
[41] Gazor, S.; Liu, T., “Adaptive Filtering with Decorrelation for Coloured AR Environments,” IEE Proceedings on Vision, Image and Signal Processing, Vol. 152(6), 2005, pp. 806-818
[42] Ghurumuruhan, G.; and Prabhu, K. M. M., “Fixed-point Fast Hartley Transform Error Analysis,” Signal Processing, Vol. 84, 2004, pp. 1307-1321
[43] Gluth, R., “Regular FFT-Related Transform Kernels for DCT/DST-based Polyphase Filter Banks,” International Conference on Acoustics, Speech, and Signal Processing, Vol. 3, 1991, pp. 2205-2208
[44] Gronowski, P. E.; Bowhill, W. J.; Preston, R. P.; Gowan, M. K.; and Allmon, R. L., “High-Performance Microprocessor Design,” IEEE Journal of Solid-State Circuits, Vol. 33(5), May 1998, pp. 676-686
[45] Grunwald, D.; Levis, P.; Farkas, K.; Morrey, C.; and Neufeld, M., “Policies for Dynamic Clock Scheduling,” Symposium on Operating Systems Design and Implementation, Oct. 2000, pp. 6-6
[46] Gutnik, V.; and Chandrakasan, A. P., “Embedded Power Supply for Low Power DSP,” IEEE Transactions on VLSI Systems, Vol. 5, No. 4, Dec. 1997, pp. 425-435
[47] Haid, J.; Schoegler, W.; and Manninger, M., “Design of an Energy-Aware system-in-package for playing MP3 in Wearable Computing devices,” IEEE International Systems-on-Chip (SOC) Conference, Austria, 2003, pp. 35-38
[48] Hicks, P.; Walnock, M.; and Owens, R. M., “Analysis of Power Consumption in Memory Hierarchies,” International Symposium on Low Power Electronics and Design, 1997, pp. 239-242
[49] Hu, Z.; and Wan, H., “A Novel Generic Fast Fourier Transform Pruning Technique and Complexity Analysis,” IEEE Transactions on Signal Processing, Vol. 53, No. 1, Jan. 2005, pp. 274-282
[50] Huang, W.; Wang, Y.; and Chakraborty, S., “Power-Aware Bandwidth and Stereo-Image Scalable Audio Decoding,” ACM Multimedia Conference, November 06-11, 2005, Hilton, Singapore, pp. 291-294
[51] Huang, W.; and Wang, Y., “Efficient Partial Spectrum Reconstruction using an Asymmetric PQMF Algorithm for MPEG-Coded Stereo Audio,” IEEE International Conference on Multimedia & Expo, July 9-12, 2006, Toronto, Canada, pp. 901-904
[52] Huang, Y.; Chakraborty, S.; and Wang, Y., “Using Offline Bitstream Analysis for Power-Aware Video Decoding in Portable Devices,” ACM Multimedia Conference, November 06-11, 2005, Hilton, Singapore, pp. 299-302
[53] Hughes, C. J.; Srinivasan, J.; and Adve, S. V., “Saving Energy with Architectural and Frequency Adaptations for Multimedia Applications,” 34th Annual International Symposium on Microarchitecture (MICRO), 2001, pp. 250-261
[54] Im, C.; Ha, S.; and Kim, H., “Dynamic Voltage Scheduling with Buffers for Low-power Multimedia Applications,” ACM Transactions on Embedded Computing Systems, 3(4), 2004, pp. 686-705
[55] Irani, S.; Shukla, S. K.; and Gupta, R., “Algorithms for Power Savings,” ACM-SIAM Symposium on Discrete Algorithms, 2003, pp. 37-46
[56] Ishihara, T.; and Yasuura, H., “Voltage Scheduling Problem for Dynamically Variable Voltage Processors,” International Symposium on Low Power Electronics and Design, 1999, pp. 197-202
[57] ISO/IEC, “MPEG1 11172-3: Audio Coding,” 2000
[58] ISO/IEC, “MPEG2 13818-7: Advanced Audio Coding,” 2006
[59] James, D. V., “Quantization Errors in the Fast Fourier Transform,” IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-23, No. 3, June 1975, pp. 277-283
[60] Jochens, G.; Kruse, L.; Schmidt, E.; and Nebel, W., “A New Parameterizable Power Macro-model for Datapath Components,” Design, Automation and Test in Europe, 1999, pp. 29-36
[61] Kadayif, I.; Kandemir, M.; Chen, G.; Vijaykrishnan, N.; Irwin, M. J.; and Sivasubramaniam, M. J., “Compiler-Directed High-Level Energy Estimation and Optimization,” ACM Transactions on Embedded Computing Systems, Vol. 4, Nov. 2005, pp. 819-850
[62] Keltcher, P.; Richardson, S.; and Siu, S., “An Equal Area Comparison of Embedded DRAM and SRAM Memory Architectures for a Chip Multiprocessor,” Technical Report HPL-2000-53, Computer Systems Technology, HP Laboratories Palo Alto, Apr. 2000
[63] Keutzer, K.; Malik, S.; Newton, A. R.; Rabaey, J. M.; and Sangiovanni, V., “System Level Design: Orthogonalization of Concerns and Platform-based Design,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 19(12), December 2000, pp. 1523-1543
[64] Lee, H. G.; and Chang, N., “Energy-aware Memory Allocation in Heterogeneous Non-volatile Memory Systems,” International Symposium on Low Power Electronics and Design, Aug. 2003, pp. 420-423
[65] Lee, S.; Ermedahl, A.; and Min, S. L., “An Accurate Instruction-level Energy Consumption Model for Embedded RISC Processors,” ACM SIGPLAN Workshop on Languages, Compilers and Tools for Embedded Systems, 2001, pp. 1-10
[66] Liang, J.; and Tran, T. D., “Fast Multiplierless Approximations of the DCT with the lifting scheme,” IEEE Transactions on Signal Processing, Vol. 49, No. 12, Dec. 2001, pp. 3032-3044
[67] Lim, J.; Kim, M.; Kim, J.; and Kim, K., “Semantic Transcoding of Video based on Regions of Interest,” Visual Communications and Image Processing, 2003, pp. 1232-1243
[68] Liu, C. M.; Lee, W. C.; and Chien, C. T., “Bit Allocation for Advanced Audio Coding using Bandwidth-Proportional Noise-Shaping Criterion,” Proc. of the 6th International Conference on Digital Audio Effects (DAFX-03), London, UK, September 8-11, 2003
[69] Liu, J.; Chou, P. H.; Bagherzadeh, N.; and Kurdahi, F., “Power-Aware Scheduling under Timing Constraints for Mission-Critical Embedded Systems,” Design Automation Conference, 2001, pp. 840-845
[70] Liu, J.; Shih, W.; Lin, K.; Bettati, R.; and Chung, J., “Imprecise Computations,” Proceedings of the IEEE, Vol. 82(1), Jan. 1994, pp. 83-94
[71] Liu, Y.; Maxiaguine, A.; Chakraborty, S.; and Ooi, W. T., “Processor Frequency Selection for SoC Platforms for Multimedia Applications,” IEEE Real-Time Systems Symposium, Lisbon, December 2004, pp. 336-345
[72] Lorch, J. R.; and Smith, A. J., “PACE: A New Approach to Dynamic Voltage Scaling,” IEEE Transactions on Computers, 53(7), July 2004, pp. 856-869
[73] Lu, Y. H.; Benini, L.; and De Micheli, G., “Dynamic Frequency Scaling with Buffer Insertion for Mixed Workloads,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 21(11), Nov. 2002, pp. 1284-1305
[74] Lu, Z.; Lach, J.; Stan, M.; and Skadron, K., “Reducing Multimedia Decode Power using Feedback Control,” International Conference on Computer Design, San Jose, CA, Oct. 2003, pp. 489-496
[75] Lu, Z.; Lach, J.; Stan, M.; and Skadron, K., “Design and Implementation of an Energy Efficient Multimedia Playback System,” Asilomar Conference on Signals, Systems and Computers, 2006, pp. 1491-1497
[76] Luo, J.; and Jha, N. K., “Power-profile Driven Variable Voltage Scaling for Heterogeneous Distributed Real-time Embedded Systems,” International Conference on VLSI Design, 2003, pp. 369-375
[77] Markel, J. D., “FFT Pruning,” IEEE Transactions on Audio and Electroacoustics, Vol. AU-19, Dec. 1971, pp. 305-311
[78] Maxiaguine, A.; Zhu, Y.; Chakraborty, S.; and Wong, W.-F., “Tuning SoC Platforms for Multimedia Processing: Identifying Limits and Tradeoffs,” International Conference on Hardware/Software Codesign and System Synthesis, 2004, pp. 128-133
[79] McMillan, L.; and Westover, L., “A Forward-Mapping Realization of the Inverse Discrete Cosine Transform,” IEEE Data Compression Conference, March 24-27, 1992, pp.
219-228 Mesarina, M.; and Turner, Y., “Reduced Energy Decoding of Mpeg Streams,” Multimedia Systems, 9(2),2003, pp.202–213 Mock, T., “Music Everywhere,” IEEE Spectrum, Sep 2004, pp.42-47 Mohapatra, S.; Cornea, R.; Dutt, N.; Nicolau, A.; and Venkatasubramanian, N., “Integrated Power Management for Video Streaming to Mobile Handheld Devices,” ACM Multimedia Conference, Nov, 2003, pp.582-591 Montanaro, J.; etc, “A 160-MHz, 32-b, 0.5-W CMOS RISC microprocessor,” IEEE Journal of Solid-State Circuits, Vol.31 (11), Nov., 1996, pp.1703-1714 Mudge, T., “Power: A First-Class Architectural Design Constraint,” IEEE Computer, April 2001, pp.52-58 Nawab, S. H.; Oppenheim, A. V.; Chandrakasan, A. P.; Winograd, J. M.; Ludwig, J. T., “Approximate Signal Processing,” Journal of VLSI Signal Processing Systems, Vol. 15 (1-2), Jan 1997, pp.177-200 Nguyen, T. Q., “Partial Spectrum Reconstruction using Digital Filter Banks,” IEEE Transactions on Signal Processing, 41(9), 1993, pp.2778-2795 Nielsen, L. S.; Niessen, C.; Sparso, J.; and Van Berkel, K., “Low-power Operation using Self Timed Circuits and Adaptive Scaling of the Supply Voltage,” IEEE Transactions on VLSI Systems, 2, Dec.,1994, pp.391-397 131 Oppenheim, A.; and Weinstein, C. J., “Effects of Finite Register Length in Digital Filtering and the Fast Fourier Transform,” Proceedings of the IEEE, Vol. 60, No. 8, Aug. 
1972, pp.957-976 [89] Oppenheim, A.; Nawab, H.; Verghese, G.; and Womell, G., “Algorithms for Signal Processing,” 1st Rapid Prototyping of Application Specific Signal Processors (RASSP) Conference, 1994 pp.146-153 [90] Pedram, M., “Design Technologies for Low Power VLSI,” Encyclopedia of Computer Science and Technology, 1995, pp.73-96 [91] Pering, T.; Burd, T.; and Brodersen, R., “The Simulation and Evaluation of Dynamic Voltage Scaling Algorithms,” International Symposium on Low Power Electronics and Design, 1998, pp.76-81 [92] Ponomarev, D.; Kucuk, G.; and Ghose, K., “Dynamic Allocation Of Datapath Resources For Low Power,” Workshop on Complexity-Effective Design, Jul, 2001, pp.90-102 [93] Pouwelse, J.; Langendoen, K.; Lagendijk, I.; and Sips, H., “Power Aware Video Decoding,” the 22nd Picture Coding Symposium, Seoul, Korea, 2001, pp.303-306 [94] Qu, G.; and Potkonjak, M., “Energy Minimization with Guaranteed Quality of Service,” International Symposium on Low Power Electronics and Design, 2000, pp.43-48 [95] Raghunathan, V.; Pering, T.; Want, R.; Nguyen, A.; and Jensen, P., “Experience with A Low Power Wireless Mobile Computing Platform,” The International Symposium on Low Power Electronics and Design, Aug, 2004, pp.363-368 [96] Rao, R.; and Vrudhula, S., “Energy Optimal Speed Control of Devices with Discrete Speed Sets,” Design Automation Conference, 2005, pp.901-904 [97] Roberts, A. W.; and Varberg, D. 
E., “Convex functions”, Academic Press, 1973 [98] Roy, K., “Software Design for Low Power,” in “Low Power CMOS VLSI Circuit Design,” John Wiley & Sons, Inc, 2000 [99] Roy, K.; and Prasad, S., “Low-power Static Ram Architectures,” in “Low Power CMOS VLSI Circuit Design,” John Wiley & Sons, Inc, 2000 [100] Simunic, T.; Benini, L.; De Micheli, G., and Hans, M., “Source Code Optimization and Profiling of Energy Consumption in Embedded Systems,” International Symposium on System Synthesis, 2000, pp.193-199 [101] Sinevriotis, G.; Leventis, A.; Anastasiadou, D.; Stavroulopoulos, C.; Papadopoulos, T.; Antonakopoulos, T.; and Stouraitis, T., “SOFLOPO: Towards systematic software exploitation for low-power designs,” International Symposium on Low-Power Electronics and Design, 2000. [102] Shin, D.; Kim, J.; and Lee, S., “Intra-Task Voltage Scheduling for Low-Energy Hard Real-Time Applications,” IEEE Design & Test of Computers, Vol. 18, No.2, 2001 pp.20-30 [103] Steinke, S.; Knauer, M.; Wehmeyer, L.; and Marwedel, P., “An Accurate and Fine Grain Instruction-level Energy Model supporting Software Optimizations,” [88] 132 International Workshop on Power And Timing Modeling, Optimization and Simulation, 2001. 
[104] Vaidyanathan, P.P., “Multirate Systems and Filter Banks”, Prentice-Hall, 1992 [105] Wang, Y.; Huang, W.; and Korhonen, J., “A Framework for Robust and Scalable Audio Streaming,” ACM Multimedia Conference, 2004, pp.144-151 [106] Wang , Z., “Pruning the Fast Discrete Cosine Transform,” IEEE Transactions on Communications, Vol.39, No.5, May 1991, pp.640-643 [107] Weiser, M.; Welch, B.; Demers, A.; and Shenker, S., “Scheduling for Reduced CPU Energy,” Operating Systems Design and Implementation, 1994, pp.13-23 [108] Yao, F.; Demers, A.; and Shenker, S., “A Scheduling Model for Reduced CPU Energy,” IEEE Annual Foundations of Computer Science, 1995, pp.374-382 [109] Yuan, W.; and Nahrstedt, K., “Energy-efficient Soft Real-time CPU Scheduling for Mobile Multimedia Systems,” ACM Symposium on Operating Systems Principles, 2003, pp.149-163 [110] Yuan, W.; and Nahrstedt, K., “Practical Voltage Scaling for Mobile Multimedia Devices,” ACM Multimedia Conference, Oct. 2004, pp. 924 - 931 [111] Zheng, F.; Garg, N.; Sobti, S.; Zhang, C.; Joseph R.; Krishnamurty A.; and Wang, R., “Considering the Energy Consumption of Mobile Storage Alternatives,” IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, Oct 2003, pp.36-45 [112] http://www.audiocoding.com [113] http://tcpmp.corecodec.org/ [114] SimpleScalar/ARM: http://www.simplescalar.com/v4test.html [115] MAD (libmad): http://sourceforge.net/projects/mad/ [116] Libavcodec: http://www.afterdawn.com/glossary/terms/libavcodec.cfm 133 [...]... requirement for audio, and it becomes the most challenging issue in low power audio processing As a summary, audio decoding applications are an important source of energy consumption and require special techniques, especially the workload reduction 18 techniques, to meet the critical requirements of the users Based on these observations, we will study perception- aware low power audio processing techniques for. .. 
they play back a video clip only once in most cases. This leads to different usage modes for video and audio: users prefer the download-and-playback mode for audio and the streaming mode for video, which gives low power audio techniques a different focus from video techniques. Third, there are far fewer limiting factors on listening to audio on portable devices. Portable devices are typically used when accompanying...

discussion on non-volatile storage of portable devices. In Section 1.2, we will see that an audio player running on a portable device is usually fed data from local storage, so the performance of non-volatile storage should be taken into consideration in low power design. In the context of portable devices, flash ROM is employed as the non-volatile storage due to its lower power consumption, faster read access...

total power consumption in different architectures. These weights may vary widely, as demonstrated by the power breakdowns for two representative processors in Figure 1.2 [83][44]. Figure 1.2 reveals that, because of this diversity, a low power design optimized for one architecture may be much less efficient on others. Low power design of multimedia applications for portable devices...

Characteristics of Audio Decoding Applications

So far, research on energy efficient multimedia applications has concentrated on video decoding applications. In comparison, low power audio decoding techniques have received much less attention. However, we believe that audio decoding applications are a significant source of energy consumption and require different techniques for low power design. Both...
techniques for portable devices

1.3 Related Works

In this thesis, we pursue low power audio processing through two fundamental approaches: workload reduction and DVS. In this section, we briefly review the related work on both to highlight the challenges of current techniques.

1.3.1 Workload Reduction

Modern audio codecs, including MPEG-1 audio layer III (MP3), MPEG-2 and -4 Advanced Audio Coding...

widely employ transform-based methods. These transform modules are responsible for a dominant fraction of the overall workload of the decoding process. Taking the MPEG-2 AAC low complexity profile as an example, the Inverse Modified Discrete Cosine Transform module is responsible for 86% of the overall workload. Therefore, in this thesis, we will concentrate on workload reduction for transform algorithms. The...

reduced workload need further optimization by code transformation techniques for the target platform. Moreover, workload is an intrinsic metric for such techniques, and it is closely related to power consumption. Compared with energy consumption, the main merits of the workload measurement are twofold. First, workload is very consistent across different platforms for a given algorithm [26]. Second, we can obtain...

stand for middle channel, side channel, left channel and right channel data, respectively
Figure 4.4 Structure of synthesis filter bank in MPEG-1 audio
Figure 4.5 Evaluation results for different BSS configurations

Chapter 1 Introduction

Power consumption has become a critical design consideration for battery-powered portable devices, such as mobile phones, PDAs and audio/video...
platforms. On the other hand, battery technology has been progressing at a much slower pace. These facts suggest that battery life has become the major bottleneck of multimedia applications on portable devices. Energy efficient techniques are becoming increasingly important for these applications.

1.1 Background: System Organization and Power Consumption Issues

In this thesis, we concentrate on audio processing...

1.3.2 DVS Techniques
1.3.3 Main Challenges of the Existing Techniques for Low Power Audio Applications
1.4 Our Methodology of Low Power Audio Techniques for Portable Devices

... important. In this thesis, we study perception-aware low power audio processing techniques for portable devices. This work is mainly motivated by the fact that the audio decoding application is