Master's thesis in telecommunications engineering: Detection of interesting events in movies using only the audio signal
DUBLIN CITY UNIVERSITY
SCHOOL OF ELECTRONIC ENGINEERING

Detection of Interesting Events in Movies using only the Audio signal

PHAM MINH LUAN NGUYEN
August 2009
MASTER OF ENGINEERING IN TELECOMMUNICATIONS
Supervised by Dr. Sean Marlow

Acknowledgements
I would like to thank my supervisor, Dr. Sean Marlow, for his extensive guidance, enthusiasm and commitment to this project. Thanks are also due to Dr. David Sadlier for providing movies and code, and to all other friends and colleagues for their contributions to this work.

Declaration
I hereby declare that, except where otherwise indicated, this document is entirely my own work and has not been submitted in whole or in part to any other university.
Signed: Date:

Abstract
The imminent rapid expansion in the movie industry is driving the need for efficient digital video indexing, browsing and playback systems. This report develops an automatic detector that finds exciting events directly in the original movie using only the audio signal. Interesting events in movies are typically flagged by high audio amplitude, so detecting them from the audio amplitude is an efficient method. It is fast because audio features are computationally cheaper than visual features. The detected highlight events are then classified in order to evaluate the automatic system.
Contents
Acknowledgements
Declaration
Abstract
Contents
List of Figures
List of Graphs
List of Tables
Chapter 1 - Introduction
1.1 Related work
1.1.1 Automatically Selecting Shots for Action Movie Trailers
1.1.2 Voice Processing for Automatic TV Sports Program Highlights Detection
1.1.3 Audio/visual analysis for high-speed TV advertisement detection from MPEG bitstream
1.2 Exciting event detection in movie using audio signal
Chapter 2 - MPEG-1 Audio/Video Standard
2.1 Overview
2.2 MPEG-1 layer 2 Audio
Chapter 3 - Movie Highlight Detection
3.1 Getting Ground Truth
3.2 Automatic Detection
3.2.1 Getting Scale Factor
3.2.2 Audio amplitude threshold
Chapter 4 - Results and Analysis
4.1 Results
4.1.1 The average audio amplitude
4.1.2 The audio amplitude threshold time
4.1.3 Results and result tables
4.2 Precision and Recall
Chapter 5 - Conclusions and Further Work
5.1 System Evaluation
5.2 Further work
References

List of Figures
Figure 2-1: ISO/MPEG-1 Layer I/II encoder
Figure 2-2: Structure of Layer-II subband samples
Figure 2-3: The data bitstream structure of Layer-II
Figure 3-1: MPEG-1 Layer-II frequency subbands
Figure 3-2: Video frame audio levels generated from scalefactors corresponding to temporally associated audio

List of Graphs
Graph 3-1: Per-frame audio amplitude level for example movie
Graph 3-2: Per-second audio amplitude level for example movie
Graph 3-3: Audio amplitude profile of the Night at the Museum 2
Graph 3-4: Audio amplitude detection of the Night at the Museum 2
Graph 3-5: Audio amplitude detection of the Night at the Museum 2 and Ground Truth (Blue is automatic detection. Red is the Ground Truth)
Graph 3-6: Audio amplitude profile of The KingDom
Graph 3-7: Audio amplitude detection of The KingDom
Graph 3-8: Audio amplitude detection of The KingDom and Ground Truth
Graph 3-9: Audio amplitude profile of The Legend of Butch and Sundance
Graph 3-10: Audio amplitude detection of The Legend of Butch and Sundance
Graph 3-11: Comparison of automatic detection results and Ground Truth
Graph 3-12: Audio amplitude profile (Night at the Museum 2 - one frame)
Graph 3-13: Automatic detection and Ground Truth (Night at the Museum 2 - one frame)
Graph 3-14: Audio amplitude profile (Night at the Museum 2 - two frames)
Graph 3-15: Automatic detection and Ground Truth (Night at the Museum 2 - two frames)
Graph 3-16: Audio amplitude profile (Night at the Museum 2 - two seconds)
Graph 3-17: Automatic detection and Ground Truth (Night at the Museum 2 - two seconds)
Graph 3-18: Audio amplitude profile (Night at the Museum 2 - four seconds)
Graph 3-19: Automatic detection and Ground Truth (Night at the Museum 2 - four seconds)
Graph 3-20: Audio amplitude profile (The KingDom - one frame)
Graph 3-21: Automatic detection and Ground Truth (The KingDom - one frame)
Graph 3-22: Audio amplitude profile (The KingDom - two frames)
Graph 3-23: Automatic detection and Ground Truth (The KingDom - two frames)
Graph 3-24: Audio amplitude profile (The KingDom - two seconds)
Graph 3-25: Automatic detection and Ground Truth (The KingDom - two seconds)
Graph 3-26: Audio amplitude profile (The KingDom - four seconds)
Graph 3-27: Automatic detection and Ground Truth (The KingDom - four seconds)
Graph 3-28: Audio amplitude profile (The Legend of Butch and Sundance - one frame)
Graph 3-29: Automatic detection and Ground Truth (The Legend of Butch and Sundance - one frame)
Graph 3-30: Audio amplitude profile (The Legend of Butch and Sundance - two frames)
Graph 3-31: Automatic detection and Ground Truth (The Legend of Butch and Sundance - two frames)
Graph 3-32: Audio amplitude profile (The Legend of Butch and Sundance - two seconds)
Graph 3-33: Automatic detection and Ground Truth (The Legend of Butch and Sundance - two seconds)
Graph 3-34: Audio amplitude profile (The Legend of Butch and Sundance - four seconds)
Graph 3-35: Automatic detection and Ground Truth (The Legend of Butch and Sundance - four seconds)

List of Tables
Table 3-1: Ground Truth of Night at the Museum 2
Table 3-2: Ground Truth of The KingDom
Table 3-3: Ground Truth of The KingDom (continued)
Table 3-4: Ground Truth of The Legend of Butch and Sundance
Table 3-5: Ground Truth of The Legend of Butch and Sundance (continued)
Table 4-1: Comparison of results between the automatic system and the Ground Truth
Table 4-2: Possible exciting events detected by the automatic system
Table 4-3: Ground Truth events missed by the automatic system
Table 4-4: Comparison of results between the automatic system and the Ground Truth
Table 4-5: Possible exciting events detected by the automatic system
Table 4-6: Comparison of results between the automatic system and the Ground Truth
Table 4-7: Possible exciting events detected by the automatic system
Table 4-8: Ground Truth events missed by the automatic system
Table 4-9: Precision and Recall values for three movies

Chapter 1 - Introduction
The growing availability of video content creates a strong requirement for efficient tools to manage or access multimedia data [3]. Considerable progress has been made in audio analysis for movie content, with automatic highlight detection being one of the targets of recent research. Highlight detection is important because highlights give the user a short version of the movie that ideally contains all the information needed to understand the content. Hence, the user may quickly judge whether the movie is interesting or not.

Audio, which includes voice, music, and various kinds of environmental sounds, is an important type of media and a significant part of audiovisual data. As more and more digital audio databases come into use, people are realizing the importance of managing them effectively through audio content analysis. Audio segmentation and classification have applications in professional media production, audio archive management, commercial music usage, surveillance, and so on. Furthermore, audio content analysis may play a primary role in video annotation. Current approaches to video segmentation and indexing are mostly focused on the visual information. However, visual-based processing often leads to a far too fine segmentation of the audiovisual sequence. Integrating the diverse multimedia components (audio, visual, and textual information) will be essential in achieving a fully functional system for video parsing.

Existing research on content-based audio data management is very limited. There are in general four directions [6]. The first direction is audio segmentation and classification, where one basic problem is speech/music discrimination. The second direction is audio retrieval. One specific technique in content-based audio retrieval is query-by-humming.
The third direction is audio analysis for video indexing. The fourth direction is the integration of audio and visual information for video segmentation and indexing.

[...]

... is the Ground Truth)

Graph 3-6: Audio amplitude profile of The KingDom
Graph 3-7: Audio amplitude detection of The KingDom
Graph 3-8: Audio amplitude detection of The KingDom and Ground Truth (Blue is automatic detection. Red is the Ground Truth)

... This is the audio amplitude. The audio amplitude in a movie is one indicator of exciting events, which usually occur with high audio amplitude. High-amplitude events may be gunshots, fights, crashes, or explosions, so the audio amplitude may help to highlight these events. ...

1.2 Exciting event detection in movie using audio signal
There are several earlier case studies on event detection in video. In the first, events in movies were detected using audiovisual data [3]. In the second, the audio signal was used to highlight events in sports programs [4]. The ...

Graph 3-16: Audio amplitude profile (Night at the Museum 2 - two seconds)
Graph 3-17: Automatic detection and Ground Truth (Night at the Museum 2 - two seconds)
Graph 3-18: Audio amplitude profile (Night at the ...
3.2 Automatic Detection
The audio amplitude is a useful feature for detecting exciting events. In a movie, some exciting events last only milliseconds, while others last seconds or minutes. On the other hand, some events have high audio amplitude but are not exciting, and some events are exciting but audio amplitude ...

... Sundance. Detection result graphs for the three movies follow.

Graph 3-3: Audio amplitude profile of the Night at the Museum 2
Graph 3-4: Audio amplitude detection of the Night at the Museum 2
Graph 3-5: Audio amplitude detection of the Night at the Museum 2 and Ground Truth (Blue is automatic detection ...

... the Ground Truth, another problem is the length of the events. For example, when an event combines a gunshot with fighting and beating, we must either choose the main event or merge all of these events into one big event. In some cases the big event lasts a long time, so the automatic detection can return as many results as we want. ...

... than audio amplitude threshold. In the four-seconds case, exciting events are picked out when the audio amplitude stays larger than the audio amplitude threshold for a series of at least four seconds. The same three movies are used: Night at the Museum 2, The KingDom, and The Legend of Butch and Sundance.

3.2.2.3 Night at the Museum 2
...
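The duration rule quoted above (an event is picked out only when the audio amplitude stays above the threshold for a series of at least four seconds) can be illustrated with a minimal sketch. This is not the thesis code: the function name `detect_events`, the sample levels, and the threshold value are hypothetical, and the input is assumed to be one amplitude value per second.

```python
from typing import List, Tuple


def detect_events(levels: List[float], threshold: float, min_run: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every run in which
    the level stays strictly above `threshold` for at least `min_run` samples."""
    events: List[Tuple[int, int]] = []
    run_start = None  # index where the current above-threshold run began
    for i, level in enumerate(levels):
        if level > threshold:
            if run_start is None:
                run_start = i
        else:
            # The run (if any) ended just before index i; keep it if long enough.
            if run_start is not None and i - run_start >= min_run:
                events.append((run_start, i))
            run_start = None
    # Handle a run that continues to the end of the sequence.
    if run_start is not None and len(levels) - run_start >= min_run:
        events.append((run_start, len(levels)))
    return events


# Hypothetical per-second amplitude levels (arbitrary units).
per_second = [0.2, 0.9, 0.8, 0.9, 0.95, 0.3, 0.9, 0.9, 0.1]
print(detect_events(per_second, threshold=0.7, min_run=4))  # [(1, 5)]
```

With `min_run=4` only the four-second burst is reported, and the later two-second burst is ignored. Lowering `min_run` to 2 reports both, which mirrors the trade-off between the shorter and longer threshold times compared in the text.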
... audio amplitude, so we focus only on the audio amplitude threshold and the audio amplitude threshold time. In each case, we changed the values and compared the results to find the better way to detect exciting events in a movie.

3.2.1 Getting Scale Factor
3.2.1.1 Reduction of cut-off frequency [4], [5]
One scale factor ...

... case, they use the audiovisual data to detect the ad-break in a television program [5]. However, none of them detects events in movies using only the audio signal. Using the audio signal to highlight events in a movie is the cheaper approach, because it needs less computation time than the audiovisual method. In this document, we choose one feature of the audio signal to highlight events in movies. ...

... advertisement breaks.

... values of the 12 samples. Thus it provides an indication of the maximum power exhibited by any one of the 12 samples within the group. ...

... implementation of the first three parts of the MPEG-1 standard.

2.2 MPEG-1 layer 2 Audio
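As a rough illustration of the scale-factor fragment above, the sketch below computes one peak value per group of 12 samples. It rests on an assumption: the source sentence is truncated, so the sketch supposes that the scale factor tracks the largest absolute sample value in each group. A real MPEG-1 Layer II encoder selects a quantized scale factor from a fixed table, which is not modelled here. The name `group_peaks` and the sample data are hypothetical.

```python
from typing import List, Sequence

GROUP_SIZE = 12  # samples per group, as stated in the text


def group_peaks(samples: Sequence[float], group_size: int = GROUP_SIZE) -> List[float]:
    """Peak absolute value of each complete group of `group_size` samples.

    Simplified stand-in for a scale factor (assumption): it indicates the
    largest magnitude reached by any one sample within the group.
    Trailing samples that do not fill a whole group are ignored.
    """
    usable = len(samples) - len(samples) % group_size
    return [
        max(abs(s) for s in samples[i:i + group_size])
        for i in range(0, usable, group_size)
    ]


# Two hypothetical groups of 12 subband samples.
samples = [0.1, -0.4, 0.2] * 4 + [0.05] * 11 + [-0.9]
print(group_peaks(samples))  # [0.4, 0.9]
```

The second group is quiet except for a single large sample, yet its peak is 0.9. This shows why such a value reflects the maximum power of any one sample in the group rather than the group's average level.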