
Human visual perception, study and applications to understanding images and videos


DOCUMENT INFORMATION

Pages: 192
Size: 3.99 MB

Contents

Human Visual Perception, study and applications to understanding Images and Videos

HARISH KATTI

National University of Singapore

2012

For my parents.

Acknowledgements

I want to thank my supervisor Prof. Mohan Kankanhalli and co-supervisor Prof. Chua Tat-Seng for their patience and support while I made this eventful journey. I was lucky to have learnt not only the basics of research, but also some valuable life skills from them. My interest in research was nurtured further by my interactions with Profs. Why Yong-Peng, K. R. Ramakrishnan, Nicu Sebe, Zhao Shengdong, Yan Shuicheng and Congyan Lang through collaborative research. Prof. Low Kok Lim’s support for our eye-tracking studies was both liberal and unconditional. I want to thank Dr. Ramanathan for the close interaction and fruitful work that became an important part of my thesis.

The administrative staff at the School of Computing have been supportive throughout my time as a PhD student and then as a Research Assistant. I take this opportunity to thank Ms Loo Line Fong, Irene, Emily and Agnes in particular for their commitment and responsiveness time and again.

A PhD has been a long, sometimes solitary and largely introspective journey. My labmates and friends played a variety of roles at different times, as mentors, buddies and critics. I want to thank my friends Vivek, Shweta, Sanjay, Ankit, Reetesh, Anoop, Avinash, Chiang, Dr. Ravindra, Shanmuga, Karthik and Daljit for the interesting discussions we had. I also crossed paths with some wonderful people like Chandra, Wu Dan and Nivethida and grew as a person because of them.

An overseas PhD comes at the cost of being away from loved ones. I thank my parents Dr. Gururaj and Smt. Jayalaxmi, and my sister Dr. Spandan, for being understanding, tolerant and supportive through my long post-graduate stint through a Masters and now a PhD degree.
To my dear wife Yamuna, I am more complete and happy for having found you and am looking forward to seeing more of life and growing older by your side.

On research.

I almost wish I hadn’t gone down that rabbit-hole, and yet, and yet, it’s rather curious, you know, this sort of life!
- Alice, “Alice in the Wonderland”

The sole cause of man’s unhappiness is that he does not know how to stay quietly in his room.
- Blaise Pascal, “Pensées”, 1670

Two kinds of people are never satisfied, ones who love life, and ones who love knowledge.
- Maulana Jalaluddin Rumi

On exploring life and making choices, right and wrong.

Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim
Because it was grassy and wanted wear,
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I marked the first for another day!
Yet knowing how way leads on to way
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I,
I took the one less traveled by,
And that has made all the difference.
- Robert Frost

Abstract

Assessing whether a photograph is interesting, or spotting people in conversation or important objects in images and videos, are visual tasks that we humans perform effortlessly and robustly. In this thesis I first explore and quantify how humans distinguish interesting photos from Flickr in a rapid time span ([...]...
Typical tasks accomplished in Automated Image understanding and relevant references.

Details of Flickr images collected for 5 of the 14 image themes chosen.

Using eye-gaze information to classify for Action and No Action social scenes.

Combining concept detectors and fixations to classify face and person images.

... for different images measured using equation 14. Distinct images have been grouped under each of the 4 themes; the plot represents values over more than 100 images. The method separates out images with strong visual elements and interactions (affective: red, aesthetic: green, action: blue) from those which have low interaction or weak visual elements (magenta). Action and affect images are grouped together by ...

... annotators assign white to foreground regions and black to the background.

Visualization of manually annotated ground truth for randomly chosen images from the NUSEF dataset [71]. The images can have one or more ROIs.

Performance of the binning method for 50 randomly chosen images from the NUSEF dataset. The binning method employs a conservative strategy to select ...

... information and meta-data.

The panel illustrates how the arrangement of different visual elements in images can give rise to rich and abstract semantics. Beginning from simple texture in (a), the meaning of an image can be dominated by low-level cues like color and depth in (b), and shape and symmetry in (c) and (d). The unusual interaction of cat and book gives rise to an element ...

Exemplar images from various semantic categories (top) and corresponding gaze patterns (bottom) from NUSEF. Categories include Indoor (a) and Outdoor (b) scenes, faces: mammal (c) and human (d), affect-variant group (e, f), action: look (g) and read (h), portrait: human (i, j) and mammal (k), nude (l), world (m, n), reptile (o) and injury (p). Darker circles denote earlier fixations while whiter ...
... moderately abstract reasoning task, to gauge the economic situation of the family; (3) to find the ages of family members; (4) another abstract task, to find the activity that the family was involved in prior to the arrival of the visitor; (5) to remember the clothes worn by people; (6) to remember positions taken by people in the room; (7) a more abstract task, to infer how long the visitor had been away from the family ...

... binning method correspond to visual elements that might be at the level of objects, gestalt elements or abstract concepts. (a) ROIs correspond to the faces involved in the conversation and the Apple logo on the laptop. (b) Key elements in the image: the solitary mountain and the two vanishing points, one on the left where the road curves around and another where the river vanishes into the valley. Vanishing points ...

Different factors that can affect human visual attention and hence, subsequent understanding of visual content.

Some results from Yarbus’s seminal work [53]. Subject gaze patterns from 3-minute recordings, under different tasks posed prior to viewing the painting “An unexpected visitor” by I. E. Repin. The original painting is shown in the top left panel. The different tasks posed are as follows, ...

... subject looks at visual input. (c) The on-screen location being attended to. (d) An off-the-shelf camera is used to establish a mapping between images of the subject’s eye while viewing the video.

The schema visualizes the overall organization of the thesis and highlights the components described in this chapter. The current chapter deals with analysis and modeling of visual content, ...
... Naive content creator can get lost or altered either during encoding into visual content, or in conversion between media types during the (encode, store, consume) cycle. Effects of the Semantic gap are more pronounced in situations where Naive users generate and consume visual media.

The schema represents information flow hierarchy and chapter organization in the thesis. The top layer lists different ...

2.1 Human Visual Perception and Visual Attention
2.2 Eye-gaze as an artifact of Human Visual Attention
2.3 Image Understanding
2.4 Understanding ...

Uploaded: 09/09/2015, 18:49
