Preface
Organization
Contents – Part I
Contents – Part II
Best Paper Candidate
Deep Graph Laplacian Hashing for Image Retrieval
1 Introduction
2 The Proposed Approach
2.1 Deep Hashing Model
2.2 Objective Function
2.3 Learning
3 Experiments
4 Conclusion
References
Deep Video Dehazing
1 Introduction
2 Related Work
3 Our Method
4 Training Dataset
5 Experimental Results
6 Conclusions
References
Image Tagging by Joint Deep Visual-Semantic Propagation
1 Introduction
2 Proposed Method
3 Experiments
3.1 Implementation Details
3.2 Performance on NUS-WIDE
3.3 Performance on MS-COCO
3.4 Performance on ESP Game
4 Conclusion
References
Exploiting Time and Frequency Diversities for High-Quality Linear Video Transmission: A MCast Framework
Light Field Image Compression with Sub-apertures Reordering and Adaptive Reconstruction
Video Coding
Fast QTBT Partition Algorithm for JVET Intra Coding Based on CNN
1 Introduction
2 Related Works on Fast CU Decision Methods
3 QTBT Structure and Category Analysis
4 CNN Architecture and Training
5 Fast QTBT Partition Method Based on CNN Classifier
6 Experimental Results
7 Conclusion
References
A Novel Saliency Based Bit Allocation and RDO for HEVC
1 Introduction
2 Context-Aware Saliency Map
3 Saliency Map Based Bits Allocation Scheme
4 Proposed Perceptual RDO
5 Experiment
6 Conclusion
References
Light Field Image Compression Scheme Based on MVD Coding Standard
A Real-Time Multi-view AVS2 Decoder on Mobile Phone
Abstract
1 Introduction
2 Decoder Implementation
2.1 Framework Level Optimization
2.2 Frame Level Threading
2.3 Data Level Paralleling
3 Performance Analysis
4 Multi-view System
5 Conclusion
Acknowledgement
References
Compressive Sensing Depth Video Coding via Gaussian Mixture Models and Object Edges
Image Super-Resolution, Deblurring, and Dehazing
AWCR: Adaptive and Weighted Collaborative Representations for Face Super-Resolution with Context Residual-Learning
1 Introduction
2 Preliminaries
3 Proposed Method
3.1 Dictionary Learning from Context-Patch
3.2 Adaptive and Weighted Collaborative Representations (AWCR)
3.3 Context Residual-Learning
4 Experimental Results
5 Conclusion
References
Single Image Haze Removal Based on Global-Local Optimization for Depth Map
Single Image Dehazing Using Deep Convolution Neural Networks
1 Introduction
2 Haze Removal
3 Experiment Results
4 Conclusions
References
SPOS: Deblur Image by Using Sparsity Prior and Outlier Suppression
1 Introduction
2 SPOS Overview
3 The Proposed Method
4 Image Restoration
5 Experiment
6 Conclusion
References
Single Image Super-Resolution Using Multi-scale Convolutional Neural Network
Person Identity and Emotion
A Novel Image Preprocessing Strategy for Foreground Extraction in Person Re-identification
Age Estimation via Pose-Invariant 3D Face Alignment Feature in 3 Streams of CNN
1 Introduction
2 Proposed Algorithm
3 Experiments
4 Conclusions
References
Face Alignment Using Local Probabilistic Features
Multi-modal Emotion Recognition with Temporal-Band Attention Based on LSTM-RNN
1 Introduction
2 Related Work
3 Method
4 Experiment
5 Conclusion
References
Multimodal Fusion of Spatial-Temporal Features for Emotion Recognition in the Wild
A Fast and General Method for Partial Face Recognition
Abstract
1 Introduction
2 Proposed Method
3 Experiment
3.1 Data Set
3.2 Partial Face Recognition on PubFig
3.3 Open Set Partial Face Verification
3.4 Occluded Face Recognition on AR
4 Conclusion
Acknowledgements
References
Tracking and Action Recognition
Adaptive Correlation Filter Tracking with Weighted Foreground Representation
1 Introduction
2 The Proposed Algorithm
2.1 Review of the Staple Algorithm
2.2 The Weighted Foreground Appearance Feature Descriptor
2.3 Staple Framework with Adaptive Learning Rate
3 Experiments
4 Conclusion
References
A Novel Method for Camera Pose Tracking Using Visual Complementary Filtering
Trajectory-Pooled 3D Convolutional Descriptors for Action Recognition
Temporal Interval Regression Network for Video Action Detection
Motion State Detection Based Prediction Model for Body Parts Tracking of Volleyball Players
Abstract
1 Introduction
2 Proposal
2.1 Motion State Detection Based Prediction Model
2.2 Band-Width Sobel Likelihood Model
2.3 Cluster Scoring Based Estimation
3 Experiment
3.1 Evaluation Method
3.2 Result and Analysis
4 Conclusion
Acknowledgments
References
Detection and Classification
Adapting Generic Detector for Semi-Supervised Pedestrian Detection
StairsNet: Mixed Multi-scale Network for Object Detection
1 Introduction
2 Related Work
3 Model
4 Experiments
5 Conclusions
References
A Dual-CNN Model for Multi-label Classification by Leveraging Co-occurrence Dependencies Between Labels
1 Introduction
2 Proposed Model
3 Experiments
4 Conclusion
References
Multi-level Semantic Representation for Flower Classification
Abstract
1 Introduction
2 Related Work
3 Methods
4 Experiments
5 Conclusions
Acknowledgements
References
Multi-view Multi-label Learning via Optimal Classifier Chain
Tire X-ray Image Impurity Detection Based on Multiple Kernel Learning
1 Introduction
2 The idMKL Method
2.1 Candidate Impurity Extraction
2.2 Candidate Impurity Representation
2.3 Multiple Kernel Learning Module
3 Experiments
4 Conclusion
References
Multimedia Signal Reconstruction and Recovery
CRF-Based Reconstruction from Narrow-Baseline Image Sequences
Abstract
1 Introduction
2 Raw Depth Generation
3 Depth Refinement
4 Experiment
5 Conclusion
Acknowledgements
References
Better and Faster, when ADMM Meets CNN: Compressive-Sensed Image Reconstruction
1 Introduction
2 Background
3 Method
3.1 Problem Dissection Based on ADMM
3.2 The x Subproblem: CNN for Image Denoising Problem
3.3 The z Subproblem: Quadratic Programming
4 Experiments
5 Conclusions
References
Sparsity-Promoting Adaptive Coding with Robust Empirical Mode Decomposition for Image Restoration
A Splicing Interpolation Method for Head-Related Transfer Function
1 Introduction
2 Proposed Method
2.1 Post-processing of HRIRs
2.2 Forecast with RBF Neural Network
2.3 Calculate with Tetrahedron Interpolation
2.4 Splicing the HRIR
3 Evaluation
4 Conclusion
References
Structured Convolutional Compressed Sensing Based on Deterministic Subsamplers
Abstract
1 Introduction
2 Backgrounds of Compressed Sensing
2.1 Classic Compressed Sensing Working Procedure
2.2 Mathematical Tools, Notations, and Preliminaries
3 Theoretical Feasibility Analysis of Proposed Deterministic Subsamplers
4 Experiments and Results
4.1 Compressed Sensing with Proposed Hadamard Matrix Based Subsamplers
4.2 Convolutional Compressed Sensing with Golay Sequence Based Phase Modulation
4.3 Reconstruction from Partial Fourier Data (RecPF) Modified Based on FZC Sequence and OSTM
5 Conclusion and Future Works
References
Blind Speech Deconvolution via Pretrained Polynomial Dictionary and Sparse Representation
1 Introduction
2 Preliminaries
3 Blind Speech Deconvolution with Polynomial Sparse Representation
4 Simulations and Results
4.1 Experimental Setup
4.2 Results and Analysis
5 Conclusions and Future Work
References
Text and Line Detection/Recognition
Multi-lingual Scene Text Detection Based on Fully Convolutional Networks
1 Introduction
2 The Proposed Method
2.1 Feature Extractor of Characters with VGG-16
2.2 Transfer VGG-16 Classifier to FCN
2.3 Bounding Box Selection
2.4 Transfer to Multi-lingual Detection Task
3 Experimental Results
4 Conclusion
References
Cloud of Line Distribution for Arbitrary Text Detection in Scene/Video/License Plate Images
1 Introduction
2 The Proposed Method
2.1 Text Candidate Detection
2.2 Polygonal Approximation for Contour Points Detection
2.3 Cloud of Line Distribution for Character Component Detection
2.4 Text Line Formation
3 Experiments
4 Conclusion
References
Affine Collaborative Representation Based Classification for In-Air Handwritten Chinese Character Recognition
Overlaid Chinese Character Recognition via a Compact CNN
1 Introduction
2 Synthetic Dataset
3 Related Concepts
3.1 Feature Map Size
3.2 Receptive Field
4 Architectures
5 Implementation Details
6 Experiments
6.1 Compactness of Models
6.2 How Many Fully Connected Layers
6.3 Width of Fully Connected Layer
6.4 Filter Sizes and Network Depth
6.5 Comparisons of All the Models
7 Conclusion
References
Efficient and Robust Lane Detection Using Three-Stage Feature Extraction with Line Fitting
Social Media
Saliency-GD: A TF-IDF Analogy for Landmark Image Mining
An Improved Clothing Parsing Method Emphasizing the Clothing with Complex Texture
Detection of Similar Geo-Regions Based on Visual Concepts in Social Photos
Unsupervised Concept Learning in Text Subspace for Cross-Media Retrieval
1 Introduction
2 Our Approach
2.1 Concept Terms Generating
2.2 Concept Terms Filtering
2.3 Concept Clustering
2.4 Text Subspace Mapping
3 Experiments
4 Conclusion
References
Image Stylization for Thread Art via Color Quantization and Sparse Modeling
Least-Squares Regulation Based Graph Embedding
1 Introduction
2 Proposed Method
3 Experiments
3.1 Experimental Setup
3.2 Experimental Results
4 Conclusions
References
SSGAN: Secure Steganography Based on Generative Adversarial Networks
Generating Chinese Poems from Images Based on Neural Network
Abstract
1 Introduction
2 Our Model
3 Experiments
3.1 Data and Training
3.2 Evaluation
3.2.1 Evaluation Metrics
3.2.2 Evaluation Results
4 Conclusions
Acknowledgments
References
Detail-Enhancement for Dehazing Method Using Guided Image Filter and Laplacian Pyramid
Personalized Micro-Video Recommendation via Hierarchical User Interest Modeling
1 Introduction
2 Related Work
3 Our Method
4 Experiments
5 Conclusion
References
3D and Panoramic Vision
MCTD: Motion-Coordinate-Time Descriptor for 3D Skeleton-Based Action Recognition
Dense Frame-to-Model SLAM with an RGB-D Camera
1 Introduction
2 Proposed Method
3 Experiment
4 Conclusion
References
Parallax-Robust Hexahedral Panoramic Video Stitching
1 Introduction
2 Proposed Approach
2.1 Layered Feature Points Matching
2.2 Global Projective Warping
2.3 Layered Content-Preserving Warping
2.4 Postprocessing
3 Experiments
4 Conclusion
References
Image Formation Analysis and Light Field Information Reconstruction for Plenoptic Camera 2.0
Part Detection for 3D Shapes via Multi-view Rendering
1 Introduction
2 Part Detection
3 Experiments
3.1 Dataset and Evaluation
3.2 Performance of Our Proposed Method
3.3 Global Threshold vs. Category-Specific Threshold
4 Conclusions
References
Benchmarking Screen Content Image Quality Evaluation in Spatial Psychovisual Modulation Display System
A Fast Sample Adaptive Offset Algorithm for H.265/HEVC
Blind Quality Assessment for Screen Content Images by Texture Information
1 Introduction
2 Proposed Method
2.1 Orientation Feature Extraction
2.2 Structure Feature Extraction
2.3 Regression Model for Quality Prediction
3 Experimental Results
4 Conclusion
References
Assessment of Visually Induced Motion Sickness in Immersive Videos
Hybrid Kernel-Based Template Prediction and Intra Block Copy for Light Field Image Coding
1 Introduction
2 Proposed Coding Scheme
3 Experiment Results
4 Conclusion
References
Asymmetric Representation for 3D Panoramic Video
Deep Learning for Signal Processing and Understanding
Shallow and Deep Model Investigation for Distinguishing Corn and Weeds
1 Introduction
2 Creation of the Weed Detection Dataset
3 Detection Based on Hand-Crafted Feature
4 Detection Based on Improved Faster R-CNN Model
5 Experimental Results and Analyses
6 Conclusion
References
Representing Discrimination of Video by a Motion Map
1 Introduction
2 Method
2.1 Motion Map
2.2 Motion Map Network
2.3 Training
3 Experiment
3.1 Datasets
3.2 Implementation Details
3.3 Results
4 Conclusion
References
Multi-scale Discriminative Patches for Fine-Grained Visual Categorization
Chinese Characters Recognition from Screen-Rendered Images Using Inception Deep Learning Architecture
Abstract
1 Introduction
2 The Proposed Method
3 Experimental Results
4 Conclusions
Acknowledgments
References
Visual Tracking by Deep Discriminative Map
Hand Gesture Recognition by Using 3DCNN and LSTM with Adam Optimizer
1 Introduction
2 Method
3 Dataset
3.1 Preprocessing
3.2 Data Augmentation
3.3 Reorganize Dataset
4 Training and Results
5 Conclusions
References
Learning Temporal Context for Correlation Tracking with Scale Estimation
1 Introduction
2 Related Work
3 Proposed Method
4 Experimental Results
5 Conclusions
References
Deep Combined Image Denoising with Cloud Images
Vehicle Verification Based on Deep Siamese Network with Similarity Metric
1 Introduction
2 Related Work
3 Proposed Method
4 Experiments
4.1 Experimental Settings
4.2 Comparison with Baseline Methods
4.3 Comparison with State-of-the-Art Methods
5 Conclusion
References
Style Transfer with Content Preservation from Multiple Images
Task-Specific Neural Networks for Pose Estimation in Person Re-identification Task
Mini Neural Networks for Effective and Efficient Mobile Album Organization
Sweeper: Design of the Augmented Path in Residual Networks
1 Introduction
2 Related Works
3 Sweeper Module
4 Experiments
5 Conclusions
References
Large-Scale Multimedia Affective Computing
Sketch Based Model-Like Standing Style Recommendation
Joint L1-L2 Regularisation for Blind Speech Deconvolution
1 Introduction
2 Background and Previous Work
3 Joint L1-L2 Norm Based Blind Speech Deconvolution
4 Simulations and Results
4.1 Experimental Setup
4.2 Performance Indices
4.3 Results and Analysis
5 Conclusions and Future Work
References
Multi-modal Emotion Recognition Based on Speech and Image
Analysis of Psychological Behavior of Undergraduates
Sensor-Enhanced Multimedia Systems
Compression Artifacts Reduction for Depth Map by Deep Intensity Guidance
LiPS: Learning Social Relationships in Probe Space
1 Introduction
2 Technical Background
2.1 WiFi Probe Sniffing
2.2 Skipgram
3 LiPS Design
4 Evaluation
5 Conclusion
References
The Intelligent Monitoring for the Elderly Based on WiFi Signals
Abstract
1 Introduction
2 Method
3 Results and Evaluation
4 Conclusion
Acknowledgments
References
Sentiment Analysis for Social Sensor
Recovering Overlapping Partials for Monaural Perfect Harmonic Musical Sound Separation Using Modified Common Amplitude Modulation