Multimedia Data Mining and Analytics: Disruptive Innovation

Editors
Aaron K. Baughman, IBM Corp., Durham, NC, USA
Jiang Gao, Nokia Inc., Sunnyvale, CA, USA
Jia-Yu Pan, Google Inc., Mountain View, CA, USA
Valery A. Petrushin, 4i, Inc., Carlsbad, CA, USA

ISBN 978-3-319-14997-4
ISBN 978-3-319-14998-1 (eBook)
DOI 10.1007/978-3-319-14998-1
Library of Congress Control Number: 2014959196

Springer Cham Heidelberg New York Dordrecht London
© Springer International Publishing Switzerland 2015

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper. Springer International Publishing AG Switzerland is part of Springer Science+Business Media (www.springer.com).

Preface

In recent years, disruptive developments in computing technology, such as large-scale and mobile computing, have accelerated the growth in the volume, velocity, and variety of multimedia data while enabling tantalizing analytical processing potential. During the last decade, multimedia data mining research extended its scope to cover more data modalities and shifted its focus from the analysis of single-modality data to multi-modal data, from content-based search to concept-based search, and from corporate data to data from socially networked communities. The ubiquity of advanced computing devices such as smartphones, tablets, e-book readers, and networked gaming platforms, which serve both as data producers and as ideal personalized delivery tools, has brought a wealth of new data types, including geographically aware data as well as personal behavioral, preference, and sentiment data. Developments in networked sensor technology enrich behavioral personal data with physiological and environmental measurements that can be used to build deep, intrinsic, and robust models.

This book reflects the major shifts in multimedia data mining research and applications toward networked social communities, mobile devices, and sensors. Vast amounts of multimedia are produced, shared, and accessed every day on various social platforms. These multimedia objects (images, videos, texts, tags, sensor readings, etc.)
represent rich, multifaceted recordings of human behavior in the networked society, which lead to a range of important social applications, such as consumer behavior forecasting for businesses to optimize advertising and product recommendations, local knowledge discovery to enrich customer experience (e.g., for tourism or shopping), and detection of emergent news events and trends. In addition to techniques for mining single media items, all these applications require new methods for discovering robust features and stable relationships among the content of different media modalities and users in a dynamic, context-rich, and likely noisy social environment.

Mobile devices with multimedia sensors, such as cameras and geographic location sensors (GPS), have further integrated multimedia into people's daily lives. New features, algorithms, and applications for mining multimedia data collected with mobile devices increase the accessibility and usefulness of multimodal data in people's daily lives. Examples of such applications include personal assistants, augmented reality systems, social recommendations, and entertainment.

In addition to the research topics mentioned above, this book also includes chapters devoted to privacy issues in multimedia social environments, large-scale biometric data processing, content- and concept-based multimedia search, and advanced algorithms for multimedia data representation, processing, and visualization.

This book is mostly based on extended and updated papers presented at the Multimedia Data Mining Workshops held in conjunction with the Association for Computing Machinery (ACM) Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD) conferences in 2010–2013. The book also includes several invited chapters. The editors recognize that this book cannot cover the entire spectrum of research and applications in multimedia data mining, but it provides several snapshots of interesting and evolving trends in this field.

The editors are grateful to the chapter authors, whose efforts made this book possible, and to the organizers of the ACM SIGKDD conferences for their support. We also thank Dr. Farhan Baluch for sharing his LaTeX expertise, which helped to unify the chapters. We thank the Springer-Verlag employees Wayne Wheeler, who supported the book project, and Simon Rees, who helped with coordinating the publication and editorial assistance.

Durham, NC, September 2014    Aaron K. Baughman
Sunnyvale, CA                 Jiang Gao
Mountain View, CA             Jia-Yu Pan
Carlsbad, CA                  Valery A. Petrushin

Contents

Part I  Introduction
1. Disruptive Innovation: Large Scale Multimedia Data Mining
   Aaron K. Baughman, Jia-Yu Pan, Jiang Gao and Valery A. Petrushin    3

Part II  Mobile and Social Multimedia Data Exploration
2. Sentiment Analysis Using Social Multimedia
   Jianbo Yuan, Quanzeng You and Jiebo Luo    31
3. Twitter as a Personalizable Information Service
   Mario Cataldi, Luigi Di Caro and Claudio Schifanella    61
4. Mining Popular Routes from Social Media
   Ling-Yin Wei, Yu Zheng and Wen-Chih Peng    93
5. Social Interactions over Location-Aware Multimedia Systems
   Yi Yu, Roger Zimmermann and Suhua Tang    117
6. In-house Multimedia Data Mining
   Christel Amato, Marc Yvon and Wilfredo Ferré    147
7. Content-Based Privacy for Consumer-Produced Multimedia
   Gerald Friedland, Adam Janin, Howard Lei, Jaeyoung Choi and Robin Sommer    157

Part III  Biometric Multimedia Data Processing
8. Large-Scale Biometric Multimedia Processing
   Stefan van der Stockt, Aaron K. Baughman and Michael Perlitz    177
9. Detection of Demographics and Identity in Spontaneous Speech and Writing
   Aaron Lawson, Luciana Ferrer, Wen Wang and John Murray    205

Part IV  Multimedia Data Modeling, Search and Evaluation
10. Evaluating Web Image Context Extraction
    Sadet Alcic and Stefan Conrad    229
11. Content Based Image Search for Clothing Recommendations in E-Commerce
    Haoran Wang, Zhengzhong Zhou, Changcheng Xiao and Liqing Zhang    253
12. Video Retrieval Based on Uncertain Concept Detection Using Dempster–Shafer Theory
    Kimiaki Shirahama, Kenji Kumabuchi, Marcin Grzegorzek and Kuniaki Uehara    269
13. Multimodal Fusion: Combining Visual and Textual Cues for Concept Detection in Video
    Damianos Galanopoulos, Milan Dojchinovski, Krishna Chandramouli, Tomáš Kliegr and Vasileios Mezaris    295
14. Mining Videos for Features that Drive Attention
    Farhan Baluch and Laurent Itti    311
15. Exposing Image Tampering with the Same Quantization Matrix
    Qingzhong Liu, Andrew H. Sung, Zhongxue Chen and Lei Chen    327

Part V  Algorithms for Multimedia Data Presentation, Processing and Visualization
16. Fast Binary Embedding for High-Dimensional Data
    Felix X. Yu, Yunchao Gong and Sanjiv Kumar    347
17. Fast Approximate K-Means via Cluster Closures
    Jingdong Wang, Jing Wang, Qifa Ke, Gang Zeng and Shipeng Li    373
18. Fast Neighborhood Graph Search Using Cartesian Concatenation
    Jingdong Wang, Jing Wang, Gang Zeng, Rui Gan, Shipeng Li and Baining Guo    397
19. Listen to the Sound of Data
    Mark Last and Anna Usyskin (Gorelik)    419

Author Index    447
Subject Index    449

Contributors

Sadet Alcic  Department of Databases and Information Systems, Institute for Computer Science, Heinrich-Heine-University of Duesseldorf, Duesseldorf, Germany
Christel Amato  IBM France Laboratory, Bois Colombes Cedex, France
Farhan Baluch  Research and Development Group, Opera Solutions, San Diego, CA, USA
Aaron K. Baughman  IBM Corporation, Research Triangle Park, NC, USA
Mario Cataldi  LIASD, Department of Computer Science, Université Paris 8, Paris, France
Krishna Chandramouli  Division of Enterprise and Cloud Computing, VIT University, Vellore, India
Lei Chen  Department of Computer Science, Sam Houston State University, Huntsville, TX, USA
Zhongxue Chen  Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, IN, USA
Jaeyoung Choi  International Computer Science Institute, Berkeley, CA, USA
Stefan Conrad  Department of Databases and Information Systems, Institute for Computer Science, Heinrich-Heine-University of Duesseldorf, Duesseldorf, Germany
Luigi Di Caro  Department of Computer Science, University of Turin, Turin, Italy
Milan Dojchinovski  Web Engineering Group, Faculty of Information Technology, Czech Technical University in Prague, Prague, Czech Republic; Department of Information and Knowledge Engineering, Faculty of Informatics and Statistics, University of Economics, Prague, Czech Republic

Fig. 19.9 Summer 03–07 dataset
Fig. 19.10 WeatherJuly1978-1 dataset

In this experiment, like in the first one, the dependent variable was the subject's total score S, whereas the independent variables represented the demographic and musical characteristics of the subjects.

19.5.2 Hypotheses

Similar to the first experiment (see Sect. 19.4.2), we have tested the null hypothesis that the subjects are unable to provide correct answers or to make the right decisions by listening to the musical sequences generated by our sonification methodology. Thus, we have compared the distribution of the subjects' answers/decisions to the uniform distribution. For tasks where the null hypothesis was rejected, we have explored the effect of the following subject characteristics on his/her performance in the test: age, gender, occupation, musical experience, and musical hearing ability. The test performance measures were identical to the ones defined in Sect. 19.4.2.
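The per-question answer counts are not reproduced in this excerpt, and the exact statistic used for the comparison with the uniform distribution is not named here. Purely as an illustration of that kind of check, hypothetical counts for a single multiple-choice question could be tested as follows (the counts, the number of answer options, and the use of a chi-square goodness-of-fit test are all assumptions, not details from the study):

```python
# Illustration only: hypothetical counts of how 37 subjects answered one
# 4-option question; the real counts and the exact test used in the study
# are not given in this excerpt.
from scipy.stats import chisquare

observed = [28, 4, 3, 2]        # hypothetical answer counts (sum = 37 subjects)
result = chisquare(observed)    # H0: answers are uniformly distributed (chance)
print(f"chi2 = {result.statistic:.2f}, p = {result.pvalue:.4g}")
# A small p-value rejects the hypothesis that the subjects answer at random.
```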
19.5.3 Subjects

A total of 37 subjects took part in the second experiment (only those who answered all the questions were counted). All data about the subjects was taken from their answers to the first part of the questionnaire (user information) only. According to this information, there were 21 male and 16 female subjects between the ages of 20 and 70. The age distribution of the subjects is shown in Fig. 19.11. No one had any familiarity with sonification techniques (excluding possible participation in the first experiment, half a year earlier), and very few had experience with time series databases or data mining techniques. The distribution of the subjects' musical experience (if any) is shown in Fig. 19.12. Each subject could participate in the test at his/her convenience, without any timing constraints at any stage of the experiment.

Fig. 19.11 Experiment 2: Age distribution
Fig. 19.12 Experiment 2: Musical experience distribution

Table 19.4 Number of correct answers per task

Task    Number of answers    Number of correct answers    Percentage of correct answers (%)
1       148                  137                          92.57
2       148                  125                          84.46
3       148                  106                          71.62
4       111                  82                           73.87
5       185                  148                          80.00

19.5.4 Experimental Results

The experiment was designed in increasing order of difficulty: we started with relatively simple sonification techniques and tasks (the first and the second task) and finished with the last task (No. 5), which was relatively complex. For all questions in every task, the percentage of correct answers was significantly higher than the percentage that would have been obtained by chance, at the 99.9 % significance level. As shown in Table 19.4, the best result was obtained for the first task (92.57 %), whereas the worst one was for the third task (71.62 %). In the first four tasks, the subjects were required to distinguish between two sequences and to analyze each one of them separately. Only in the fifth task did they have to listen to the two sequences simultaneously. It is noteworthy that for this more complex task, the percentage of correct answers was still relatively high (80 %).

Having verified that our sonification algorithm is a useful method for decision making and for time series database examination, with the ability to support some popular data mining tasks, we proceeded with the second part of our experiment, aimed at finding the personal characteristics/abilities that affect the subjects' performance. We evaluated the same attributes as in the first experiment, namely Age, Gender, Occupation, Musical Experience, and Musical Pitch Ability. The attributes were partitioned into the same categories except for Musical Pitch Ability, which had the following five categories: mus_1—None, mus_2—Low musical pitch ability, mus_3—Average musical pitch ability, mus_4—Good musical pitch ability, mus_5—Excellent musical pitch ability.

Fig. 19.13 Experiment 2: The effect of occupation groups
Fig. 19.14 Experiment 2: The effect of musical experience
Fig. 19.15 Experiment 2: The effect of pitch ability

According to the results of the ANOVA test, there is no statistically significant difference in total scores across different age, gender, or occupation groups in general. However, we have found a statistically significant difference in total scores between the following two specific occupation groups: {natural sciences, engineering} versus {humanities, music} (see Fig. 19.13). Apparently, people with a more technical background (engineers, physicists, etc.) find it more difficult to understand musified information. We have also found a statistically significant difference in total scores across different musical experience groups (contrary to the results of the first experiment) and different pitch ability categories. These results are demonstrated in Figs. 19.14 and 19.15, respectively. As expected, the subjects' performance improves with the amount of their musical experience as well as with their pitch ability. On the other hand, even subjects without any musical experience or pitch ability can still perform much better than chance (58–63 % of correct answers versus 29 % with a random guess).
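The procedure behind the 99.9 % significance figure is not spelled out in this excerpt. As a rough illustration only, each task's totals from Table 19.4 can be checked against a guessing baseline with a one-sided binomial test; treating the 29 % random-guess rate quoted above as the chance probability for every task is a simplifying assumption made here:

```python
# Sketch only: one-sided binomial test of each task's correct-answer count from
# Table 19.4 against a guessing baseline. The 0.29 chance level is the
# random-guess rate quoted in the text, assumed here to hold for all tasks.
from scipy.stats import binomtest  # SciPy >= 1.7; older releases provide binom_test

per_task = {1: (137, 148), 2: (125, 148), 3: (106, 148), 4: (82, 111), 5: (148, 185)}
for task, (correct, total) in per_task.items():
    p = binomtest(correct, total, p=0.29, alternative="greater").pvalue
    print(f"Task {task}: {correct}/{total} correct, one-sided p = {p:.2e}")
```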
19.6 Conclusions

In this chapter, we have presented and evaluated a novel sonification methodology for representing information in time series databases so that humans can perform interactive data mining tasks without the need to view the actual data. Our algorithm can be used for sonification of univariate and multivariate time series. The algorithm can use three different sonification techniques: two for sonification of real-valued attributes and one for sonification of nominal attributes. The user can choose a specific sonification technique for each time series, according to the task performed. Our generic sonification methodology can be used for representing time series from various domains, like weather forecasting, the stock market, health care, process manufacturing, etc. It can assist people who are unable to utilize visual representations (physically or due to other simultaneously performed tasks) by providing them with the necessary tools to acquire, understand, and analyze time series data. The innovative features of our methodology include using a segmentation algorithm as a pre-sonification step and representing time series data on the Western musical scale.

The empirical evaluation of our technique included two online user studies. The first user study was designed to evaluate the basic ability of the subjects to use our sonification technique for performing some basic data mining tasks on a univariate time series. The second user study evaluated some more complex tasks on bivariate time series. There were 44 subjects in the first experiment and 37 subjects in the second experiment. The questionnaire of the second experiment was prepared in view of the lessons learned from the first experiment. Both experiments have shown that using our sonification algorithm for exploring univariate and bivariate time series provides very promising results in terms of some important data mining tasks, like classification, clustering, and change detection. In the second experiment, we have also demonstrated the use of our algorithm for effective decision-making. Furthermore, we have discovered three factors that can determine the user's ability to successfully mine sonified data. These are (in decreasing order of importance): musical pitch ability, musical experience, and occupation. The average number of correct answers for both experiments was about 80 %. Interested readers can listen to examples of sonified time series (including those used in our experiments) at [25].
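The examples in [25] are pre-rendered MIDI files. As a rough sketch of the general idea only, and not of the authors' published algorithm, segment means produced by a pre-sonification segmentation step could be quantized onto a Western (here C-major) scale so that higher values sound as higher pitches; the function name, the scale choice, and the input values below are illustrative assumptions:

```python
# Rough illustration of value-to-pitch mapping on a Western (C-major) scale.
# This sketches the general idea, not the chapter's actual algorithm; the segment
# means would normally come from the pre-sonification segmentation step.
C_MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]   # semitone offsets within one octave

def to_midi_notes(segment_means, low=60, octaves=2):
    """Map segment means to MIDI note numbers on a C-major scale (60 = middle C)."""
    lo, hi = min(segment_means), max(segment_means)
    scale = [low + 12 * octave + step
             for octave in range(octaves) for step in C_MAJOR_STEPS]
    notes = []
    for value in segment_means:
        pos = 0.0 if hi == lo else (value - lo) / (hi - lo)   # normalize to [0, 1]
        notes.append(scale[round(pos * (len(scale) - 1))])
    return notes

# Hypothetical segment means from a segmented daily-temperature series:
print(to_midi_notes([12.0, 14.5, 19.0, 23.5, 21.0, 15.5]))
```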
In future experiments, one can study the effect of training on the users' performance, as well as evaluate the users' ability to perform various data exploration tasks on more than two simultaneously played time series. Future research may also include developing an "online" version of the proposed sonification algorithm for interactive mining of continuous, nonstationary data streams.

References

1. Barras S, Kramer G (1999) Using sonification. Multimed Syst 7(1):23–31
2. Brockwell PJ, Davis RA (2002) Time series: theory and methods, 2nd edn. Springer, New York
3. El-Azm F (2005) Sonification and augmented data sets in binary classification. PhD Dissertation, Institute of Informatics and Mathematical Modeling at the Technical University of Denmark
4. Gaver WW (1994) Using and creating auditory icons. In: Kramer G (ed) Auditory display: sonification, audification, and auditory interfaces. Addison-Wesley, Reading, pp 417–446
5. Hermann T (2008) Taxonomy and definitions for sonification and auditory display. In: Susini P, Warusfel O (eds) Proceedings of the 14th international conference on auditory display (ICAD 2008). IRCAM, Paris
6. Hermann T, Hunt A (2005) An introduction to interactive sonification. IEEE Multimed 12(2):20–24
7. Hermann T, Ritter H (1999) Listen to your data: model-based sonification for data analysis. In: Proceedings of the international symposium on intelligent multimedia and distance education (ISIMADE'99), Baden-Baden
8. Keogh E, Kasetty S (2003) On the need for time series data mining benchmarks: a survey and empirical demonstration. Data Min Knowl Discov 7(4):349–371
9. Keogh E, Chu S, Hart D, Pazzani M (2004) Segmenting time series: a survey and novel approach. In: Last M, Kandel A, Bunke H (eds) Data mining in time series databases. World Scientific Publishing Company, Singapore, pp 1–21
10. Kramer G (1994) An introduction to auditory display. In: Kramer G (ed) Auditory display: sonification, audification and auditory interfaces, vol XVIII. Addison-Wesley, pp 1–78
11. Kramer G, Walker B, Bonebright T, Cook P, Flowers J, Miner N, Neuhoff J (1999) Sonification report: status of the field and research agenda. Technical report, ICAD
12. Last M, Gorelik A (2008) Using sonification for mining time series data. In: Proceedings of the 9th international workshop on multimedia data mining (MDM/KDD 2008), Las Vegas, 24 August 2008, pp 63–72
13. Last M, Kandel A, Bunke H (2004) Data mining in time series databases. In: Machine perception and artificial intelligence, vol 57. World Scientific, Singapore
14. Last M, Klein Y, Kandel A (2001) Knowledge discovery in time series databases. IEEE Trans Syst Man Cybern, Part B—Cybern 31(1):160–169
15. Leumi Group. http://www.bankleumi.co.il
16. Liu L-M, Bhattacharyya S, Sclove SL, Chena R, Lattyak WJ (2001) Data mining on time series: an illustration using fast-food restaurant franchise data. Comput Stat Data Anal 37(4):455–476
17. MSQ Project. http://www.aconnect.de/friends/editions/computer/msq2/msq.html
18. Muller W, Schumann H (2003) Visualization methods for time-dependent data—an overview. In: Proceedings of the 2003 winter simulation conference, vol 1, pp 737–745
19. Nesbitt KV, Barrass S (2004) Finding trading patterns in stock market data. IEEE Comput Graph Appl 24(5):45–55
20. Noirhomme-Fraiture M, Schöller O, Demoulin C, Simoff S (2002) Sonification of time dependent data. In: Proceedings of international workshop on visual data mining, Helsinki, pp 113–125
21. Patterson R (1982) Guidelines for auditory warning systems on civil aircraft. Civil Aviation Authority
22. Pauletto S, Hunt A (2009) Interactive sonification of complex data. Int J Hum-Comput Stud 67(11):923–933
23. Pauletto S, Hunt A (2004) A toolkit for interactive sonification. In: Proceedings of the international conference on auditory display (ICAD), Sydney
24. Peretz I, Zatorre RJ (2005) Brain organization for music processing. Annu Rev Psychol 56:89–114
25. Sonification examples. http://www.ise.bgu.ac.il/faculty/mlast/data/MidiandPics.zip
26. Tel-Aviv Stock Exchange. http://www.tase.co.il
27. UCR time series classification/clustering page. http://www.cs.ucr.edu/~eamonn/time_series_data/
28. Walker BN, Godfrey MT, Orlosky JE, Bruce C, Sanford J (2006) Aquarium sonification: soundscapes for accessible dynamic informal learning environments. In: Proceedings of the international conference on auditory display (ICAD2006), London, pp 238–241
29. Walker BN, Kim J, Pendse A (2007) Musical soundscapes for an accessible aquarium: bringing dynamic exhibits to the visually impaired. In: Proceedings of the international computer music conference (ICMC 2007), Denmark
30. Walker BN, Lindsay J, Nance A, Nakano Y, Palladino DK, Dingler T, Jeon M (2013) Spearcons (speech-based earcons) improve navigation performance in advanced auditory menus. Hum Factors: J Hum Factors Ergon Soc 55(1):157–182
31. Williamson J, Murray-Smith R (2002) Audio feedback with gesture recognition. Technical report TR-2002-127, Department of Computer Science, University of Glasgow

Author Index

A
Alcic, Sadet, 229
Amato, Christel, 147

B
Baluch, Farhan, 311
Baughman, Aaron K., 3, 177

C
Cataldi, Mario, 61
Chandramouli, Krishna, 295
Chen, Lei, 327
Chen, Zhongxue, 327
Choi, Jaeyoung, 157
Conrad, Stefan, 229

D
Di Caro, Luigi, 61
Dojchinovski, Milan, 295

F
Ferrer, Luciana, 206
Ferré, Wilfredo, 147
Friedland, Gerald, 157

G
Galanopoulos, Damianos, 295
Gan, Rui, 397
Gao, Jiang, 3
Gong, Yunchao, 347
Grzegorzek, Marcin, 269
Guo, Baining, 397

I
Itti, Laurent, 311

J
Janin, Adam, 157

K
Ke, Qifa, 373
Kliegr, Tomáš, 295
Kumabuchi, Kenji, 269
Kumar, Sanjiv, 347

L
Last, Mark, 419
Lawson, Aaron, 205
Lei, Howard, 157
Li, Shipeng, 373, 397
Liu, Qingzhong, 327
Luo, Jiebo, 31

M
Mezaris, Vasileios, 295
Murray, John, 206

P
Pan, Jia-Yu, 3
Peng, Wen-Chih, 93
Perlitz, Michael, 177
Petrushin, Valery A., 3

S
Schifanella, Claudio, 61
Shirahama, Kimiaki, 269
Sommer, Robin, 157
Sung, Andrew H., 327

T
Tang, Suhua, 117

U
Uehara, Kuniaki, 269
Usyskin (Gorelik), Anna, 419

V
van der Stockt, Stefan, 177

W
Wang, Haoran, 253
Wang, Jing, 373, 397
Wang, Jingdong, 373, 397
Wang, Wen, 206
Wei, Ling-Yin, 93

X
Xiao, Changcheng, 253

Y
You, Quanzeng, 31
Yu, Felix X., 347
Yu, Yi, 117
Yuan, Jianbo, 31
Yvon, Marc, 147

Z
Zeng, Gang, 373, 397
Zhang, Liqing, 253
Zheng, Yu, 93
Zhou, Zhengzhong, 253
Zimmermann, Roger, 117

Subject Index

0–9 Symbols
3D recordings, 161

A
Account linking, 157
Active authentication, 206, 218
Aging population, 148, 149
Alternating optimization, 352, 358, 362
Analytics, 148, 152
Anonymity, 161, 164
Approximate nearest neighbor search, 398
AQBC, 363
Architectures, 202
Attack vectors, 157
Audification, 421
Auditory display, 420
Auditory interface, 420
Auto Color Correlogram, 166
Automatic speech recognition, 297

B
Big data
  DeepQA, 17
  Map Reduce, 17
  Meets fast analytics, 16
  Stream computing, 17
Bilinear Binary Embedding (BBE), 347, 349–351, 354, 364, 365, 367, 370
Bilinear projection, 347, 349, 363, 367
Binary embedding, 347, 351, 354, 356, 358
Biometric identification, 179, 202
Biometrics, 219
Bottom up attention, 320
Burstiness
  concept of, 65
  definition of, 71
  mathematical definition of, 73

C
Calibration, 206, 219
Cartesian concatenation, 404, 415
Circulant Binary Embedding (CBE), 347, 349, 354–358, 363, 364, 370
Circulant matrix, 347, 354, 355, 357
Circulant projection, 349, 357
Circular convolution, 355
Clothing recommendations, 253, 266
Cognitive computing, 18, 19
Collaborative learning, 93, 98
Color and Edge Directivity Descriptor, 166
Computer vision, 313
  limitations of, 32
Concept, 271, 272, 277, 281, 282
Concept-based video retrieval, 269, 271, 272, 291
Concept detection, 273, 274, 276, 277, 280, 282, 287, 289, 290, 291
Concept learning
  based on visual features, 32
Connected objects, 147, 148
Content-based video retrieval, 270, 271
Consumer-produced videos, 164
Content management system, 238
Content personalization
  Online Topic Model, 66
Copy-paste, 329, 338, 339
Correlation, 153
Creative Commons, 164
Creativity
  yes and
Cropping, 328, 332, 341
Cybercasing, 158

D
Daily routine, 153
Data acquisition, 148, 149
Data Law, 10, 11, 13
  definition of, 10
  multimedia, 10
  nanotechnology, 10
  National Institutes of Standards and Technology, 12
  principles of, 11
  relation to Moore's Law, 12–13
Demographics, 206, 221
Dempster–Shafer theory (DST), 278
Density ratio, 279, 284, 286, 290
Density ratio estimation, 279, 280, 284, 286
Digitized world
  3D models
  Google Maps
Dimensionality reduction, 354
Discrete Fourier Transformation (DFT), 355, 359
Disney
  imagine, innovate, impact
  imagineers
Disruptive innovation, 3–5, 7, 18
  definition of S-curves
  relation to multimedia data media
Document Object Model (DOM), 233

E
Educational techniques (privacy), 157
Eigenface
  accuracy, 35
  eigenfaces, 39
  facial expressions, 32
  facial sentiment recognition, 35
Emerging terms
  automatic keyword selection, 74
  mathematical definition of, 74
  user driven, 73
  to emerging topics, 77
Emerging topics
  correlation vector, 78
  from emerging terms, 77
  mathematical definition of, 78
  relationship to tweets, 78
  users' interests, 66
Environmental acoustic noise, 161
Evaluation
  automatic ranking, 85
  history, 84
  user selected threshold, 85
  worthiness, 83
Evaluation framework, 229, 231, 234–236, 247
Evaluation metrics, 230, 233, 247
Event identification
  social networks, 65
EXIF, 171
Exotic sensors, 161
Eye movements, 312, 313, 315, 317, 324

F
Facebook, 31, 62
Facial detection
  fdlibmex, 41
Factor analysis, 167
Fast Fourier Transformation (FFT), 349, 355, 363, 368
Fisher Vector, 348, 350
Flickr, 31, 64, 160, 161, 164, 167
Forgery detection, 328
Fuzzy Color and Texture Histogram, 166

G
Gabor, 166
Gaussian Mixture Model, 166, 167
Gaze, 312
Genetic algorithm, 322
Geographic popularity, 138
Geo-location, 159
Geo-social behaviors, 118, 144
Geo-tagged multimedia, 118, 119
GPU, 363
Ground truth, 229, 233, 235, 237, 247

H
Hamming distance, 348, 356–358
Hashing, 348, 357
Health care, 149
Hierarchical orientation features, 255, 257
Hit list, 158
Human computer interfaces
  relation to multimedia
Human emotions
  anger, 37
  disgust, 37
  fear, 37
  happiness, 37
  sadness, 37
  surprise, 37
Human mood, 33
Human visual system, 311

I
Identity, 158, 165
Image filter, 235
Image forensics, 327
Image search, 256, 262, 266
Image sentiment classification
  asymmetric bagging, 39
  bag of words, 43
  decision fusion, 43
  eigenvectors, 41
  logistic regression, 42
  performance of, 42
  support vector machine (SVM), 42
  textual content, 42
Imbalanced problem, 284, 288, 290
Immersive environments
  Kinect
Inference, 157, 159, 170
Inference chain, 163
Information leakage, 159
Innovation
  Moore's Law, 10
  S-curves
Interactive data exploration, 420
Internet
  Advanced Research Projects Agency Network (ARPANET)
  Defense Advanced Research Projects Agency (DARPA)
  Hypertext Transfer Protocol (HTTP)
  World Wide Web (WWW)
Inverse Discrete Fourier Transformation (IDFT), 355, 356, 359
ITQ, 363, 366

J
JPEG, 327–329, 332

K
Karalinska Directed Emotional Faces (KFEF), 37, 40
K-Nearest Neighbor-based Undersampling (kNNU), 284
Kronecker product, 351

L
Labeling, 79
Language, 206, 210, 212, 213, 219, 221
Large scale computing
  cloud computing, 15
  High Performance Computing (HPC), 15
Large-Scale Concept Ontology for Multimedia (LSCOM), 273
Large-scale facial identification, 188, 201, 202
Large-scale fingerprint identification, 180
LibSVM, 334, 335, 339
Locality sensitive Hashing (LSH), 357, 364–366
Location-aware preference mining, 134, 144
Location estimation, 157, 160, 171
Location recommendations, 118
Log-likelihood ratio, 166
Logistic regression, 336, 337, 340
Longest common subsequence, 246

M
Machine learning, 207, 210
Mass function, 278, 279, 282
Massive Open Online Course
  multimedia in society
MFCC features, 166, 167
Microblogging, 62
Misaligned recompression, 329, 332, 341
Mitigation (privacy attack), 157, 158, 161, 169, 171
Mobile computing
  devices, 10
  social media, 10
Modeling attention, 313
Monash extractor, 232, 233, 242, 250
Moore's Law, 9, 10, 13
  definition, 9–10
  mechanical computing
  relation to data law, 12–13
  semiconductors, 10
MQTT, 150
Multimedia, 147, 148, 150, 153, 154
  conferences, 19
  dataset examples, 12
Multimedia analytics, 157, 161
Multimedia content diffusion, 142, 144
Multimedia Data Mining, 3, 4, 5, 7, 13, 21
Multimodal fusion, 300
Musical sonification, 420

N
Natural language processing, 205
Neighborhood graph search, 400, 402, 409
Neuromorphic, 312
N-Terms Window, 240

O
Object detection, 161
Online Topic Model (OLDA)
  content personalization, 66
  definition of, 67
Organic computing
  artificial life, 14
  DNA computing, 15
  evolutionary computing, 14
  principles of, 13
  von Neumann architectures, 14
Overview of the book chapters, 4–26

P
Paragraph extractor, 241, 242
Parseval's theorem, 359
Pattern, 153
Personalization
  mathematical definition of, 76
  relationship to burstiness, 77
  users' context, 74
Person detection, 161
Plausibility, 279, 280, 284, 286, 288
Plausibility function, 279, 280, 283, 285–288, 290, 291
Predicting gaze, 317
Principal Component Analysis, 166
Privacy, 157–161, 164, 169–171
Probabilistic Linear Discriminant Analysis, 166
Product quantization, 363
Prosody, 214, 216, 219, 223

Q
Quality, 149, 152
Quantization, 327, 328, 331, 334, 341

R
Ranking
  discussion, 82
  mathematical definition of, 80
Real-time, 147, 148, 152
Real-time Vectorization
  tweet vector, 67
  tweets, 67
Recommendation
  aggregation, 63
  social networks, 63
Region descriptor, 275
Region detector, 275
Reputation
  definition of, 72
Risks, 157, 158, 170
Route Inference, 93

S
S-curves, 5, 7, 12
  definition of
Salience, 313, 317
Saliency map, 314
Security, 157, 164, 170
Segmentation, 424–426, 428, 444
Semantic gap, 230, 270, 271
Semantic learning
  based on visual features, 32
Semi-supervised learning
  example of, 33
Sensor detection, 161
Sensors, 147, 148, 150–152
Sentiment analysis
  applications of, 32
  correlation between images and text, 51
  description of, 31
  multimedia, 44, 46, 54, 56
  number of images, 47
  overview of, 35
  sentiment prediction, 35
  short tweets, 44, 47, 52
  tweets, 46, 48
Sentiment analysis multimedia
  correlation between images and text, 54
  low level features, 34
  short tweets, 52
  Twitter API, 49
  visual sentiment classification, 47
Sentiment ontology
  Sentribute, 35
Sentribute
  definition of, 36
  eigenfaces, 39
  facial sentiment recognition, 39
  feature selection, 36
  framework, 35
  image sentiment classification, 33
  liblinear, 35, 37
  low-level features, 35
  mid-level attribute, 36
  sentiment analysis, 36
  sentiment detection results, 45
  sentiment ontology, 36
  SUN Database, 36
SH, 363
Shape Topic Model, 256
Shift-recompression, 327, 328, 331, 332
SKLSH, 363
Smart home, 148
Social interactions, 134
Social media, 114
  mobile devices
Social networking, 157, 162, 171
Social networks
  applications, 32
  description of, 19
  event identification, 65
  semi-supervised learning, 33
Social platform
  Facebook, 31
  Flickr, 31
  Twitter, 32
Social relations graphs, 33
Sociolinguistics, 207, 223
Spatial Bag of Features, 256, 261
Speaker recognition, 161, 164, 212, 213, 215, 217
Speech, 206, 212–214, 218
Statistics, 148, 153
Surveillance, 154

T
Tags, 164
Tamura, 166
Temporal Stream Processing
  burstiness, 68
  follower, 69
  graph model, 70
  nutrition, 69
  quality, 69
  reputation of users, 70
  vitality, 68
Text analysis, 296
Text similarity
  TF-IDF, 65
Textual description, 164
Textual sentiment analysis, 47
Thinking machines
  Antikythera, 18
  Deep Blue, 18
  ENIAC, 18
  evolution of, 18–19
  Napier's rods, 18
  System 360, 18
  Watson, 18
Time estimation, 160
Time series, 419, 420, 423, 427, 433
Topic detection
  discussion, 82
  user study, 82
Topic graph, 79
Trajectory data mining, 93
TRECVID, 276, 281, 285
TRECVID Search task, 285, 288
TRECVID Semantic Indexing task (SIN), 277
Trend analysis
  Trendistic, 64
  Twopular, 64
Trend discovery, 420
Tumblr, 61
Twitter
  data sets, 33
  definition of, 69
  microblogging, 62

U
Uncertain concept detection, 280, 287, 289
Uncertain Data, 290
Uncertainty, 278, 282, 284, 290
Unconstrained Least-Squares Importance Fitting, 284
Undersampling, 284, 286, 291
User matching, 168
User study
  evaluating personalization, 85
  topic detection, 85

V
Vector of Locally Aggregated Descriptors (VLAD), 348, 350, 353
Video analysis, 296
VIPS, 232, 233, 244, 245, 249, 250
Virtual worlds, 207, 208
Visual attention, 313
Visual concept detection, 296, 305
Visual content
  overview of, 35
  topics of sentiment analysis, 31

W
Web document, 229, 230, 232–236, 238, 245
Web image, 230, 232, 233
Web image context, 229, 230, 233
Weibo
  microblogging, 32
Wellness, 148, 154
Western tonal music, 420
Within-Class Covariance Normalization, 166
Writing, 206, 221

Z
Zigbee, 151