The OpenCV Reference Manual
Release 2.4.6.0
July 01, 2013

CONTENTS

1 Introduction
    1.1 API Concepts
2 core. The Core Functionality
    2.1 Basic Structures
    2.2 Basic C Structures and Operations
    2.3 Dynamic Structures
    2.4 Operations on Arrays
    2.5 Drawing Functions
    2.6 XML/YAML Persistence
    2.7 XML/YAML Persistence (C API)
    2.8 Clustering
    2.9 Utility and System Functions and Macros
3 imgproc. Image Processing
    3.1 Image Filtering
    3.2 Geometric Image Transformations
    3.3 Miscellaneous Image Transformations
    3.4 Histograms
    3.5 Structural Analysis and Shape Descriptors
    3.6 Motion Analysis and Object Tracking
    3.7 Feature Detection
    3.8 Object Detection
4 highgui. High-level GUI and Media I/O
    4.1 User Interface
    4.2 Reading and Writing Images and Video
    4.3 Qt New Functions
5 video. Video Analysis
    5.1 Motion Analysis and Object Tracking
6 calib3d. Camera Calibration and 3D Reconstruction
    6.1 Camera Calibration and 3D Reconstruction
7 features2d. 2D Features Framework
    7.1 Feature Detection and Description
    7.2 Common Interfaces of Feature Detectors
    7.3 Common Interfaces of Descriptor Extractors
    7.4 Common Interfaces of Descriptor Matchers
    7.5 Common Interfaces of Generic Descriptor Matchers
    7.6 Drawing Function of Keypoints and Matches
    7.7 Object Categorization
8 objdetect. Object Detection
    8.1 Cascade Classification
    8.2 Latent SVM
9 ml. Machine Learning
    9.1 Statistical Models
    9.2 Normal Bayes Classifier
    9.3 K-Nearest Neighbors
    9.4 Support Vector Machines
    9.5 Decision Trees
    9.6 Boosting
    9.7 Gradient Boosted Trees
    9.8 Random Trees
    9.9 Extremely randomized trees
    9.10 Expectation Maximization
    9.11 Neural Networks
    9.12 MLData
10 flann. Clustering and Search in Multi-Dimensional Spaces
    10.1 Fast Approximate Nearest Neighbor Search
    10.2 Clustering
11 gpu. GPU-accelerated Computer Vision
    11.1 GPU Module Introduction
    11.2 Initialization and Information
    11.3 Data Structures
    11.4 Operations on Matrices
    11.5 Per-element Operations
    11.6 Image Processing
    11.7 Matrix Reductions
    11.8 Object Detection
    11.9 Feature Detection and Description
    11.10 Image Filtering
    11.11 Camera Calibration and 3D Reconstruction
    11.12 Video Analysis
12 photo. Computational Photography
    12.1 Inpainting
    12.2 Denoising
13 stitching. Images stitching
    13.1 Stitching Pipeline
    13.2 References
    13.3 High Level Functionality
    13.4 Camera
    13.5 Features Finding and Images Matching
    13.6 Rotation Estimation
    13.7 Autocalibration
    13.8 Images Warping
    13.9 Seam Estimation
    13.10 Exposure Compensation
    13.11 Image Blenders
14 nonfree. Non-free functionality
    14.1 Feature Detection and Description
15 contrib. Contributed/Experimental Stuff
    15.1 Stereo Correspondence
    15.2 FaceRecognizer - Face Recognition with OpenCV
    15.3 Retina : a Bio mimetic human retina model
    15.4 OpenFABMAP
16 legacy. Deprecated stuff
    16.1 Motion Analysis
    16.2 Expectation Maximization
    16.3 Histograms
    16.4 Planar Subdivisions (C API)
    16.5 Feature Detection and Description
    16.6 Common Interfaces of Descriptor Extractors
    16.7 Common Interfaces of Generic Descriptor Matchers
17 ocl. OpenCL-accelerated Computer Vision
    17.1 OpenCL Module Introduction
    17.2 Data Structures and Utility Functions
    17.3 Data Structures
    17.4 Operations on Matrices
    17.5 Matrix Reductions
    17.6 Image Filtering
    17.7 Image Processing
    17.8 Object Detection
    17.9 Feature Detection And Description
18 superres. Super Resolution
    18.1 Super Resolution
Bibliography

CHAPTER ONE
INTRODUCTION

OpenCV (Open Source Computer Vision Library: http://opencv.org) is an open-source BSD-licensed library that includes several hundred computer vision algorithms. The document
describes the so-called OpenCV 2.x API, which is essentially a C++ API, as opposed to the C-based OpenCV 1.x API. The latter is described in opencv1x.pdf.

OpenCV has a modular structure, which means that the package includes several shared or static libraries. The following modules are available:

• core - a compact module defining basic data structures, including the dense multi-dimensional array Mat and basic functions used by all other modules.
• imgproc - an image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on.
• video - a video analysis module that includes motion estimation, background subtraction, and object tracking algorithms.
• calib3d - basic multiple-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.
• features2d - salient feature detectors, descriptors, and descriptor matchers.
• objdetect - detection of objects and instances of the predefined classes (for example, faces, eyes, mugs, people, cars, and so on).
• highgui - an easy-to-use interface to video capturing, image and video codecs, as well as simple UI capabilities.
• gpu - GPU-accelerated algorithms from different OpenCV modules.
• some other helper modules, such as FLANN and Google test wrappers, Python bindings, and others.

The further chapters of the document describe the functionality of each module. But first, make sure to get familiar with the common API concepts used throughout the library.

1.1 API Concepts

cv Namespace

All the OpenCV classes and functions are placed into the cv namespace. Therefore, to access this functionality from your code, use the cv:: specifier or the using namespace cv; directive:

    #include "opencv2/core/core.hpp"
    cv::Mat H = cv::findHomography(points1, points2,
        CV_RANSAC, 5);

or

    #include "opencv2/core/core.hpp"
    using namespace cv;
    Mat H = findHomography(points1, points2, CV_RANSAC, 5);

Some of the current or future OpenCV external names may conflict with STL or other libraries. In this case, use explicit namespace specifiers to resolve the name conflicts:

    Mat a(100, 100, CV_32F);
    randu(a, Scalar::all(1), Scalar::all(std::rand()));
    cv::log(a, a);
    a /= std::log(2.);

Automatic Memory Management

OpenCV handles all the memory automatically.

First of all, std::vector, Mat, and other data structures used by the functions and methods have destructors that deallocate the underlying memory buffers when needed. This means that the destructors do not always deallocate the buffers, as in the case of Mat. They take into account possible data sharing. A destructor decrements the reference counter associated with the matrix data buffer. The buffer is deallocated if and only if the reference counter reaches zero, that is, when no other structures refer to the same buffer. Similarly, when a Mat instance is copied, no actual data is copied. Instead, the reference counter is incremented to record that there is another owner of the same data. There is also the Mat::clone method that creates a full copy of the matrix data. See the example below:

    // create a big 8 MB matrix
    Mat A(1000, 1000, CV_64F);
    // create another header for the same matrix;
    // this is an instant operation, regardless of the matrix size
    Mat B = A;
    // create another header for the 3-rd row of A; no data is copied either
    Mat C = B.row(3);
    // now create a separate copy of the matrix
    Mat D = B.clone();
    // copy the 5-th row of B to C, that is, copy the 5-th row of A
    // to the 3-rd row of A
    B.row(5).copyTo(C);
    // now let A and D share the data; after that the modified version
    // of A is still referenced by B and C
    A = D;
    // now make B an empty matrix (which references no memory buffers),
    // but the modified version of A will still be referenced by C,
    // despite that C is just a single row
    // of the original A
    B.release();
    // finally, make a full copy of C. As a result, the big modified
    // matrix will be deallocated, since it is not referenced by anyone
    C = C.clone();

You see that the use of Mat and other basic structures is simple. But what about high-level classes or even user data types created without taking automatic memory management into account? For them, OpenCV offers the Ptr<> template class that is similar to std::shared_ptr from C++ TR1. So, instead of using plain pointers:

    T* ptr = new T(...);

you can use:

    Ptr<T> ptr = new T(...);

That is, Ptr<T> ptr encapsulates a pointer to a T instance and a reference counter associated with the pointer. See the Ptr description for details.

Automatic Allocation of the Output Data

OpenCV deallocates the memory automatically, and most of the time it also automatically allocates the memory for output function parameters. So, if a function has one or more input arrays (cv::Mat instances) and some output arrays, the output arrays are automatically allocated or reallocated. The size and type of the output arrays are determined from the size and type of the input arrays. If needed, the functions take extra parameters that help to figure out the output array properties.

Example:

    #include "cv.h"
    #include "highgui.h"

    using namespace cv;

    int main(int, char**)
    {
        VideoCapture cap(0);
        if(!cap.isOpened()) return -1;

        Mat frame, edges;
        namedWindow("edges",1);
        for(;;)
        {
            cap >> frame;
            cvtColor(frame, edges, CV_BGR2GRAY);
            GaussianBlur(edges, edges, Size(7,7), 1.5, 1.5);
            Canny(edges, edges, 0, 30, 3);
            imshow("edges", edges);
            if(waitKey(30) >= 0) break;
        }
        return 0;
    }

The array frame is automatically allocated by the >> operator since the video frame resolution and the bit-depth are known to the video capturing module. The array edges is automatically allocated by the cvtColor function. It has the same size and the bit-depth as the input array. The number of channels is 1 because the color
conversion code CV_BGR2GRAY is passed, which means a color to grayscale conversion. Note that frame and edges are allocated only once during the first execution of the loop body since all the next video frames have the same resolution. If you somehow change the video resolution, the arrays are automatically reallocated.

The key component of this technology is the Mat::create method. It takes the desired array size and type. If the array already has the specified size and type, the method does nothing. Otherwise, it releases the previously allocated data, if any (this part involves decrementing the reference counter and comparing it with zero), and then allocates a new buffer of the required size. Most functions call the Mat::create method for each output array, and this is how the automatic output data allocation is implemented.

Some notable exceptions from this scheme are cv::mixChannels, cv::RNG::fill, and a few other functions and methods. They are not able to allocate the output array, so you have to do this in advance.

Saturation Arithmetics

As a computer vision library, OpenCV deals a lot with image pixels that are often encoded in a compact, 8- or 16-bit per channel, form and thus have a limited value range. Furthermore, certain operations on images, like color space conversions, brightness/contrast adjustments, sharpening, complex interpolation (bi-cubic, Lanczos) can produce values out of the available range. If you just store the lowest 8 (16) bits of the result, this results in visual artifacts and may affect further image analysis. To solve this problem, so-called saturation arithmetic is used. For example, to store r, the result of an operation, to an 8-bit image, you find the nearest value within the 0..255 range:

    I(x, y) = min(max(round(r), 0), 255)

Similar rules are applied to 8-bit signed, 16-bit signed and unsigned types. These semantics are used everywhere in the library. In C++ code, it is done using the
saturate_cast<> functions that resemble standard C++ cast operations. See below the implementation of the formula provided above:

    I.at<uchar>(y, x) = saturate_cast<uchar>(r);

where cv::uchar is an OpenCV 8-bit unsigned integer type. In the optimized SIMD code, such SSE2 instructions as paddusb, packuswb, and so on are used. They help achieve exactly the same behavior as in C++ code.

Note: Saturation is not applied when the result is a 32-bit integer.

Fixed Pixel Types. Limited Use of Templates

Templates are a great feature of C++ that enables implementation of very powerful, efficient and yet safe data structures and algorithms. However, the extensive use of templates may dramatically increase compilation time and code size. Besides, it is difficult to separate an interface and implementation when templates are used exclusively. This could be fine for basic algorithms but not good for computer vision libraries where a single algorithm may span thousands of lines of code. Because of this, and also to simplify development of bindings for other languages, like Python, Java, Matlab, that do not have templates at all or have limited template capabilities, the current OpenCV implementation is based on polymorphism and runtime dispatching rather than on templates. In those places where runtime dispatching would be too slow (like pixel access operators), impossible (generic Ptr<> implementation), or just very inconvenient (saturate_cast<>()), the current implementation introduces small template classes, methods, and functions. Anywhere else in the current OpenCV version the use of templates is limited.

Consequently, there is a limited fixed set of primitive data types the library can operate on. That is, array elements should have one of the following types:

• 8-bit unsigned integer (uchar)
• 8-bit signed integer (schar)
• 16-bit unsigned integer (ushort)
• 16-bit signed integer (short)
• 32-bit signed integer (int)
• 32-bit floating-point number (float)
• 64-bit floating-point number (double)
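The saturation rule given earlier in this section, I(x, y) = min(max(round(r), 0), 255), can be sketched in plain C++ without OpenCV. This is only an illustration under stated assumptions: the helper names saturate_to_u8 and wrap_to_u8 are hypothetical and are not part of the library, and rounding uses std::lround (halves away from zero), which may differ from the library's rounding for values exactly halfway between two integers.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper (not an OpenCV function): maps a real-valued result
// to the nearest value representable by an 8-bit unsigned pixel.
inline unsigned char saturate_to_u8(double r)
{
    // Clamp to 0..255 first, so rounding never sees an out-of-range value.
    const double clamped = std::min(std::max(r, 0.0), 255.0);
    // Round to the nearest integer (std::lround rounds halves away from zero).
    return static_cast<unsigned char>(std::lround(clamped));
}

// For contrast: keeping only the lowest 8 bits wraps around instead of
// saturating, which is the kind of artifact saturation is meant to avoid.
inline unsigned char wrap_to_u8(int r)
{
    return static_cast<unsigned char>(r & 0xFF);
}
```

With these helpers, an overexposed value of 300 saturates to 255, whereas keeping only the low 8 bits turns it into 44, a dark pixel where a bright one was expected.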
C++: void ocl::BruteForceMatcher_OCL_base::matchConvert(const Mat& trainIdx, const Mat& imgIdx, const Mat& distance, std::vector<DMatch>& matches)

ocl::BruteForceMatcher_OCL_base::knnMatch

Finds the k best matches for each descriptor from a query set with train descriptors.

C++: void ocl::BruteForceMatcher_OCL_base::knnMatch(const oclMat& query, const oclMat& train, std::vector< std::vector<DMatch> >& matches, int k, const oclMat& mask=oclMat(), bool compactResult=false)

C++: void ocl::BruteForceMatcher_OCL_base::knnMatchSingle(const oclMat& query, const oclMat& train, oclMat& trainIdx, oclMat& distance, oclMat& allDist, int k, const oclMat& mask=oclMat())

C++: void ocl::BruteForceMatcher_OCL_base::knnMatch(const oclMat& query, std::vector< std::vector<DMatch> >& matches, int k, const std::vector<oclMat>& masks=std::vector<oclMat>(), bool compactResult=false)

C++: void ocl::BruteForceMatcher_OCL_base::knnMatch2Collection(const oclMat& query, const oclMat& trainCollection, oclMat& trainIdx, oclMat& imgIdx, oclMat& distance, const oclMat& maskCollection=oclMat())

Parameters
    query – Query set of descriptors.
    train – Training set of descriptors. It is not added to the train descriptors collection stored in the class object.
    k – Number of the best matches per each query descriptor (or less if it is not possible).
    mask – Mask specifying permissible matches between the input query and train matrices of descriptors.
    compactResult – If compactResult is true, the matches vector does not contain matches for fully masked-out query descriptors.

The function returns the detected k (or less if not possible) matches in increasing order of distance.

The third variant of the method stores the results in GPU memory.

See Also: DescriptorMatcher::knnMatch()

ocl::BruteForceMatcher_OCL_base::knnMatchDownload

Downloads matrices obtained via ocl::BruteForceMatcher_OCL_base::knnMatchSingle() or ocl::BruteForceMatcher_OCL_base::knnMatch2Collection() to vector with DMatch.
C++: void ocl::BruteForceMatcher_OCL_base::knnMatchDownload(const oclMat& trainIdx, const oclMat& distance, std::vector< std::vector<DMatch> >& matches, bool compactResult=false)

C++: void ocl::BruteForceMatcher_OCL_base::knnMatch2Download(const oclMat& trainIdx, const oclMat& imgIdx, const oclMat& distance, std::vector< std::vector<DMatch> >& matches, bool compactResult=false)

If compactResult is true, the matches vector does not contain matches for fully masked-out query descriptors.

ocl::BruteForceMatcher_OCL_base::knnMatchConvert

Converts matrices obtained via ocl::BruteForceMatcher_OCL_base::knnMatchSingle() or ocl::BruteForceMatcher_OCL_base::knnMatch2Collection() to CPU vector with DMatch.

C++: void ocl::BruteForceMatcher_OCL_base::knnMatchConvert(const Mat& trainIdx, const Mat& distance, std::vector< std::vector<DMatch> >& matches, bool compactResult=false)

C++: void ocl::BruteForceMatcher_OCL_base::knnMatch2Convert(const Mat& trainIdx, const Mat& imgIdx, const Mat& distance, std::vector< std::vector<DMatch> >& matches, bool compactResult=false)

If compactResult is true, the matches vector does not contain matches for fully masked-out query descriptors.

ocl::BruteForceMatcher_OCL_base::radiusMatch

For each query descriptor, finds the best matches with a distance less than a given threshold.

C++: void ocl::BruteForceMatcher_OCL_base::radiusMatch(const oclMat& query, const oclMat& train, std::vector< std::vector<DMatch> >& matches, float maxDistance, const oclMat& mask=oclMat(), bool compactResult=false)

C++: void ocl::BruteForceMatcher_OCL_base::radiusMatchSingle(const oclMat& query, const oclMat& train, oclMat& trainIdx, oclMat& distance, oclMat& nMatches, float maxDistance, const oclMat& mask=oclMat())

C++: void ocl::BruteForceMatcher_OCL_base::radiusMatch(const oclMat& query, std::vector< std::vector<DMatch> >& matches, float maxDistance, const std::vector<oclMat>& masks=std::vector<oclMat>(), bool compactResult=false)

C++: void
ocl::BruteForceMatcher_OCL_base::radiusMatchCollection(const oclMat& query, oclMat& trainIdx, oclMat& imgIdx, oclMat& distance, oclMat& nMatches, float maxDistance, const std::vector<oclMat>& masks=std::vector<oclMat>())

Parameters
    query – Query set of descriptors.
    train – Training set of descriptors. It is not added to the train descriptors collection stored in the class object.
    maxDistance – Distance threshold.
    mask – Mask specifying permissible matches between the input query and train matrices of descriptors.
    compactResult – If compactResult is true, the matches vector does not contain matches for fully masked-out query descriptors.

The function returns the detected matches in increasing order of distance.

The methods work only on devices with the compute capability >= 1.1.

The third variant of the method stores the results in GPU memory and does not store the points by the distance.

See Also: DescriptorMatcher::radiusMatch()

ocl::BruteForceMatcher_OCL_base::radiusMatchDownload

Downloads matrices obtained via ocl::BruteForceMatcher_OCL_base::radiusMatchSingle() or ocl::BruteForceMatcher_OCL_base::radiusMatchCollection() to vector with DMatch.

C++: void ocl::BruteForceMatcher_OCL_base::radiusMatchDownload(const oclMat& trainIdx, const oclMat& distance, const oclMat& nMatches, std::vector< std::vector<DMatch> >& matches, bool compactResult=false)

C++: void ocl::BruteForceMatcher_OCL_base::radiusMatchDownload(const oclMat& trainIdx, const oclMat& imgIdx, const oclMat& distance, const oclMat& nMatches, std::vector< std::vector<DMatch> >& matches, bool compactResult=false)

If compactResult is true, the matches vector does not contain matches for fully masked-out query descriptors.

ocl::BruteForceMatcher_OCL_base::radiusMatchConvert

Converts matrices obtained via ocl::BruteForceMatcher_OCL_base::radiusMatchSingle() or ocl::BruteForceMatcher_OCL_base::radiusMatchCollection() to vector with DMatch.

C++: void
ocl::BruteForceMatcher_OCL_base::radiusMatchConvert(const Mat& trainIdx, const Mat& distance, const Mat& nMatches, std::vector< std::vector<DMatch> >& matches, bool compactResult=false)

C++: void ocl::BruteForceMatcher_OCL_base::radiusMatchConvert(const Mat& trainIdx, const Mat& imgIdx, const Mat& distance, const Mat& nMatches, std::vector< std::vector<DMatch> >& matches, bool compactResult=false)

If compactResult is true, the matches vector does not contain matches for fully masked-out query descriptors.

ocl::HOGDescriptor

struct ocl::HOGDescriptor

The class implements the Histogram of Oriented Gradients ([Dalal2005]) object detector.

    struct CV_EXPORTS HOGDescriptor
    {
        enum { DEFAULT_WIN_SIGMA = -1 };
        enum { DEFAULT_NLEVELS = 64 };
        enum { DESCR_FORMAT_ROW_BY_ROW, DESCR_FORMAT_COL_BY_COL };

        HOGDescriptor(Size win_size=Size(64, 128), Size block_size=Size(16, 16),
                      Size block_stride=Size(8, 8), Size cell_size=Size(8, 8),
                      int nbins=9, double win_sigma=DEFAULT_WIN_SIGMA,
                      double threshold_L2hys=0.2, bool gamma_correction=true,
                      int nlevels=DEFAULT_NLEVELS);

        size_t getDescriptorSize() const;
        size_t getBlockHistogramSize() const;

        void setSVMDetector(const vector<float>& detector);

        static vector<float> getDefaultPeopleDetector();
        static vector<float> getPeopleDetector48x96();
        static vector<float> getPeopleDetector64x128();

        void detect(const oclMat& img, vector<Point>& found_locations,
                    double hit_threshold=0, Size win_stride=Size(),
                    Size padding=Size());

        void detectMultiScale(const oclMat& img, vector<Rect>& found_locations,
                              double hit_threshold=0, Size win_stride=Size(),
                              Size padding=Size(), double scale0=1.05,
                              int group_threshold=2);

        void getDescriptors(const oclMat& img, Size win_stride,
                            oclMat& descriptors,
                            int descr_format=DESCR_FORMAT_COL_BY_COL);

        Size win_size;
        Size block_size;
        Size block_stride;
        Size cell_size;
        int nbins;
        double win_sigma;
        double threshold_L2hys;
        bool gamma_correction;
        int nlevels;

    private:
        // Hidden
    };

Interfaces of all methods are kept similar
to the CPU HOG descriptor and detector analogues as much as possible.

ocl::HOGDescriptor::HOGDescriptor

Creates the HOG descriptor and detector.

C++: ocl::HOGDescriptor::HOGDescriptor(Size win_size=Size(64, 128), Size block_size=Size(16, 16), Size block_stride=Size(8, 8), Size cell_size=Size(8, 8), int nbins=9, double win_sigma=DEFAULT_WIN_SIGMA, double threshold_L2hys=0.2, bool gamma_correction=true, int nlevels=DEFAULT_NLEVELS)

Parameters
    win_size – Detection window size. Align to block size and block stride.
    block_size – Block size in pixels. Align to cell size. Only (16,16) is supported for now.
    block_stride – Block stride. It must be a multiple of cell size.
    cell_size – Cell size. Only (8, 8) is supported for now.
    nbins – Number of bins. Only 9 bins per cell are supported for now.
    win_sigma – Gaussian smoothing window parameter.
    threshold_L2hys – L2-Hys normalization method shrinkage.
    gamma_correction – Flag to specify whether the gamma correction preprocessing is required or not.
    nlevels – Maximum number of detection window increases.

ocl::HOGDescriptor::getDescriptorSize

Returns the number of coefficients required for the classification.

C++: size_t ocl::HOGDescriptor::getDescriptorSize() const

ocl::HOGDescriptor::getBlockHistogramSize

Returns the block histogram size.

C++: size_t ocl::HOGDescriptor::getBlockHistogramSize() const

ocl::HOGDescriptor::setSVMDetector

Sets coefficients for the linear SVM classifier.

C++: void ocl::HOGDescriptor::setSVMDetector(const vector<float>& detector)

ocl::HOGDescriptor::getDefaultPeopleDetector

Returns coefficients of the classifier trained for people detection (for default window size).

C++: static vector<float> ocl::HOGDescriptor::getDefaultPeopleDetector()

ocl::HOGDescriptor::getPeopleDetector48x96

Returns coefficients of the classifier trained for people detection (for 48x96 windows).

C++: static vector<float> ocl::HOGDescriptor::getPeopleDetector48x96()
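As a worked example of what getDescriptorSize and getBlockHistogramSize report, the sketch below recomputes both quantities in plain C++ for the default constructor arguments. It assumes the usual HOG layout from [Dalal2005], where each block stores one nbins-bin histogram per cell and blocks tile the detection window at block_stride steps. The struct HogLayout and both function names are hypothetical and are not part of the ocl module.

```cpp
#include <cstddef>

// Hypothetical stand-in for the geometric parameters of HOGDescriptor;
// the defaults mirror the constructor defaults listed above.
struct HogLayout {
    int win_w = 64,   win_h = 128;   // win_size
    int block_w = 16, block_h = 16;  // block_size
    int stride_w = 8, stride_h = 8;  // block_stride
    int cell_w = 8,   cell_h = 8;    // cell_size
    int nbins = 9;                   // histogram bins per cell
};

// Values in one block: nbins for every cell inside the block.
inline std::size_t block_histogram_size(const HogLayout& p)
{
    const std::size_t cells_x = p.block_w / p.cell_w;
    const std::size_t cells_y = p.block_h / p.cell_h;
    return static_cast<std::size_t>(p.nbins) * cells_x * cells_y;
}

// Values in one detection window: one block histogram per block position.
inline std::size_t descriptor_size(const HogLayout& p)
{
    const std::size_t blocks_x = (p.win_w - p.block_w) / p.stride_w + 1;
    const std::size_t blocks_y = (p.win_h - p.block_h) / p.stride_h + 1;
    return blocks_x * blocks_y * block_histogram_size(p);
}
```

Under these assumptions the default 64x128 window holds 7 x 15 = 105 block positions of 36 values each, that is, 3780 coefficients, and a 48x96 window holds 5 x 11 = 55 blocks, that is, 1980 coefficients.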
ocl::HOGDescriptor::getPeopleDetector64x128

Returns coefficients of the classifier trained for people detection (for 64x128 windows).

C++: static vector<float> ocl::HOGDescriptor::getPeopleDetector64x128()

ocl::HOGDescriptor::detect

Performs object detection without a multi-scale window.

C++: void ocl::HOGDescriptor::detect(const oclMat& img, vector<Point>& found_locations, double hit_threshold=0, Size win_stride=Size(), Size padding=Size())

Parameters
    img – Source image. CV_8UC1 and CV_8UC4 types are supported for now.
    found_locations – Top-left corner points of the detected object boundaries.
    hit_threshold – Threshold for the distance between features and the SVM classifying plane. Usually it is 0 and should be specified in the detector coefficients (as the last free coefficient). But if the free coefficient is omitted (which is allowed), you can specify it manually here.
    win_stride – Window stride. It must be a multiple of block stride.
    padding – Mock parameter to keep the CPU interface compatibility. It must be (0,0).

ocl::HOGDescriptor::detectMultiScale

Performs object detection with a multi-scale window.

C++: void ocl::HOGDescriptor::detectMultiScale(const oclMat& img, vector<Rect>& found_locations, double hit_threshold=0, Size win_stride=Size(), Size padding=Size(), double scale0=1.05, int group_threshold=2)

Parameters
    img – Source image. See ocl::HOGDescriptor::detect() for type limitations.
    found_locations – Detected object boundaries.
    hit_threshold – Threshold for the distance between features and the SVM classifying plane. See ocl::HOGDescriptor::detect() for details.
    win_stride – Window stride. It must be a multiple of block stride.
    padding – Mock parameter to keep the CPU interface compatibility. It must be (0,0).
    scale0 – Coefficient of the detection window increase.
    group_threshold – Coefficient to regulate the similarity threshold. When detected, some objects can be covered by many rectangles. 0 means not to
perform grouping. See groupRectangles().

ocl::HOGDescriptor::getDescriptors

Returns block descriptors computed for the whole image.

C++: void ocl::HOGDescriptor::getDescriptors(const oclMat& img, Size win_stride, oclMat& descriptors, int descr_format=DESCR_FORMAT_COL_BY_COL)

Parameters
    img – Source image. See ocl::HOGDescriptor::detect() for type limitations.
    win_stride – Window stride. It must be a multiple of block stride.
    descriptors – 2D array of descriptors.
    descr_format – Descriptor storage format:
        – DESCR_FORMAT_ROW_BY_ROW - Row-major order.
        – DESCR_FORMAT_COL_BY_COL - Column-major order.

The function is mainly used to learn the classifier.

CHAPTER EIGHTEEN
SUPERRES. SUPER RESOLUTION

18.1 Super Resolution

The Super Resolution module contains a set of functions and classes that can be used to solve the problem of resolution enhancement. A few methods are implemented; most of them are described in the papers [Farsiu03] and [Mitzel09].

superres::SuperResolution

Base class for Super Resolution algorithms.

class superres::SuperResolution : public Algorithm, public superres::FrameSource

The class is only used to define the common interface for the whole family of Super Resolution algorithms.

superres::SuperResolution::setInput

Sets the input frame source for the Super Resolution algorithm.

C++: void superres::SuperResolution::setInput(const Ptr<FrameSource>& frameSource)

Parameters
    frameSource – Input frame source.

superres::SuperResolution::nextFrame

Processes the next frame from the input and returns the output result.

C++: void superres::SuperResolution::nextFrame(OutputArray frame)

Parameters
    frame – Output result.

superres::SuperResolution::collectGarbage

Clears all inner buffers.

C++: void superres::SuperResolution::collectGarbage()

superres::createSuperResolution_BTVL1

Creates the Bilateral TV-L1 Super Resolution.

C++: Ptr<SuperResolution>
superres::createSuperResolution_BTVL1()

C++: Ptr<SuperResolution> superres::createSuperResolution_BTVL1_GPU()

This class implements the Super Resolution algorithm described in the papers [Farsiu03] and [Mitzel09].

Here are important members of the class that control the algorithm, which you can set after constructing the class instance:

• int scale - Scale factor.
• int iterations - Iteration count.
• double tau - Asymptotic value of the steepest descent method.
• double lambda - Weight parameter to balance the data term and the smoothness term.
• double alpha - Parameter of spatial distribution in Bilateral-TV.
• int btvKernelSize - Kernel size of the Bilateral-TV filter.
• int blurKernelSize - Gaussian blur kernel size.
• double blurSigma - Gaussian blur sigma.
• int temporalAreaRadius - Radius of the temporal search area.
• Ptr<DenseOpticalFlowExt> opticalFlow - Dense optical flow algorithm.

BIBLIOGRAPHY

[Arthur2007] D. Arthur and S. Vassilvitskii. k-means++: the advantages of careful seeding, Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, 2007.
[Borgefors86] Borgefors, Gunilla, Distance transformations in digital images. Comput. Vision Graph. Image Process. 34 3, pp 344–371 (1986).
[Felzenszwalb04] Felzenszwalb, Pedro F. and Huttenlocher, Daniel P. Distance Transforms of Sampled Functions, TR2004-1963 (2004).
[Meyer92] Meyer, F. Color Image Segmentation, ICIP92, 1992.
[Telea04] Alexandru Telea, An Image Inpainting Technique Based on the Fast Marching Method. Journal of Graphics, GPU, and Game Tools 1, pp 23-34 (2004).
[RubnerSept98] Y. Rubner, C. Tomasi, L.J. Guibas. The Earth Mover’s Distance as a Metric for Image Retrieval. Technical Report STAN-CS-TN-98-86, Department of Computer Science, Stanford University, September 1998.
[Fitzgibbon95] Andrew W. Fitzgibbon, R.B. Fisher. A Buyer’s Guide to Conic Fitting. Proc. 5th British Machine Vision Conference, Birmingham, pp 513-522, 1995.
[Hu62] M. Hu. Visual Pattern Recognition by Moment Invariants, IRE Transactions on Information Theory, 8:2, pp
179-187, 1962.
[Sklansky82] Sklansky, J., Finding the Convex Hull of a Simple Polygon. PRL $number, pp 79-83 (1982).
[Suzuki85] Suzuki, S. and Abe, K., Topological Structural Analysis of Digitized Binary Images by Border Following. CVGIP 30 1, pp 32-46 (1985).
[TehChin89] Teh, C.H. and Chin, R.T., On the Detection of Dominant Points on Digital Curve. PAMI 11 8, pp 859-872 (1989).
[Canny86] J. Canny. A Computational Approach to Edge Detection, IEEE Trans. on Pattern Analysis and Machine Intelligence, 8(6), pp 679-698 (1986).
[Matas00] Matas, J. and Galambos, C. and Kittler, J.V., Robust Detection of Lines Using the Progressive Probabilistic Hough Transform. CVIU 78 1, pp 119-137 (2000).
[Shi94] J. Shi and C. Tomasi. Good Features to Track. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 593-600, June 1994.
[Yuen90] Yuen, H. K. and Princen, J. and Illingworth, J. and Kittler, J., Comparative study of Hough transform methods for circle finding. Image Vision Comput. 1, pp 71–77 (1990).
[Bouguet00] Jean-Yves Bouguet. Pyramidal Implementation of the Lucas Kanade Feature Tracker.
[Bradski98] Bradski, G.R. “Computer Vision Face Tracking for Use in a Perceptual User Interface”, Intel, 1998.
[Bradski00] Davis, J.W. and Bradski, G.R. “Motion Segmentation and Pose Recognition with Motion History Gradients”, WACV00, 2000.
[Davis97] Davis, J.W. and Bobick, A.F. “The Representation and Recognition of Action Using Temporal Templates”, CVPR97, 1997.
[Farneback2003] Gunnar Farneback, Two-frame motion estimation based on polynomial expansion, Lecture Notes in Computer Science, 2003, (2749), 363-370.
[Horn81] Berthold K.P. Horn and Brian G. Schunck. Determining Optical Flow. Artificial Intelligence, 17, pp 185-203, 1981.
[Lucas81] Lucas, B., and Kanade, T. An Iterative Image Registration Technique with an Application to Stereo Vision, Proc. of 7th International Joint Conference on Artificial Intelligence (IJCAI), pp 674-679.
[Welch95] Greg
Welch and Gary Bishop “An Introduction to the Kalman Filter”, 1995 [Tao2012] Michael Tao, Jiamin Bai, Pushmeet Kohli and Sylvain Paris SimpleFlow: A Non-iterative, Sublinear Optical Flow Algorithm Computer Graphics Forum (Eurographics 2012) [Zach2007] Zach, T Pock and H Bischof “A Duality Based Approach for Realtime TV-L1 Optical Flow”, In Proceedings of Pattern Recognition (DAGM), Heidelberg, Germany, pp 214-223, 2007 [Javier2012] Javier Sanchez, Enric Meinhardt-Llopis and Gabriele Facciolo “TV-L1 Optical Flow Estimation” [BT98] Birchfield, S and Tomasi, C A pixel dissimilarity measure that is insensitive to image sampling IEEE Transactions on Pattern Analysis and Machine Intelligence 1998 [BouguetMCT] J.Y.Bouguet MATLAB calibration tool http://www.vision.caltech.edu/bouguetj/calib_doc/ [Hartley99] Hartley, R.I., Theory and Practice of Projective Rectification IJCV 35 2, pp 115-127 (1999) [HH08] Hirschmuller, H Stereo Processing by Semiglobal Matching and Mutual Information, PAMI(30), No 2, February 2008, pp 328-341 [Slabaugh] Slabaugh, G.G Computing Euler angles from http://www.soi.city.ac.uk/~sbbh653/publications/euler.pdf (verified: 2013-04-15) a rotation matrix [Zhang2000] 26 Zhang A Flexible New Technique for Camera Calibration IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11):1330-1334, 2000 [Rosten06] Rosten Machine Learning for High-speed Corner Detection, 2006 [RRKB11] Ethan Rublee, Vincent Rabaud, Kurt Konolige, Gary R Bradski: ORB: An efficient alternative to SIFT or SURF ICCV 2011: 2564-2571 [LCS11] Stefan Leutenegger, Margarita Chli and Roland Siegwart: BRISK: Binary Robust Invariant Scalable Keypoints ICCV 2011: 2548-2555 [AOV12] Alahi, R Ortiz, and P Vandergheynst FREAK: Fast Retina Keypoint In IEEE Conference on Computer Vision and Pattern Recognition, 2012 CVPR 2012 Open Source Award Winner [Viola01] Paul Viola and Michael J Jones Rapid Object Detection using a Boosted Cascade of Simple Features IEEE CVPR, 2001 The paper is 
available online at http://research.microsoft.com/enus/um/people/viola/Pubs/Detect/violaJones_CVPR2001.pdf [Lienhart02] Rainer Lienhart and Jochen Maydt An Extended Set of Haar-like Features for Rapid Object Detection IEEE ICIP 2002, Vol 1, pp 900-903, Sep 2002 This paper, as well as the extended technical report, can be retrieved at http://www.multimedia-computing.de/mediawiki//images/5/52/MRL-TR-May02-revised-Dec02.pdf [Felzenszwalb2010] Felzenszwalb, P F and Girshick, R B and McAllester, D and Ramanan, D Object Detection with Discriminatively Trained Part Based Models PAMI, vol 32, no 9, pp 1627-1645, September 2010 [Fukunaga90] 11 Fukunaga Introduction to Statistical Pattern Recognition second ed., New York: Academic Press, 1990 810 Bibliography The OpenCV Reference Manual, Release 2.4.6.0 [Burges98] Burges A tutorial on support vector machines for pattern recognition, Knowledge Discovery and Data Mining 2(2), 1998 (available online at http://citeseer.ist.psu.edu/burges98tutorial.html) [LibSVM] C.-C Chang and C.-J Lin LIBSVM: a library for support vector machines, ACM Transactions on Intelligent Systems and Technology, 2:27:1–27:27, 2011 (http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf) [Breiman84] Breiman, L., Friedman, J Olshen, R and Stone, C (1984), Classification and Regression Trees, Wadsworth [HTF01] Hastie, T., Tibshirani, R., Friedman, J H The Elements of Statistical Learning: Data Mining, Inference, and Prediction Springer Series in Statistics 2001 [FHT98] Friedman, J H., Hastie, T and Tibshirani, R Additive Logistic Regression: a Statistical View of Boosting Technical Report, Dept of Statistics*, Stanford University, 1998 [BackPropWikipedia] http://en.wikipedia.org/wiki/Backpropagation Wikipedia article about the back-propagation algorithm [LeCun98] 25 LeCun, L Bottou, G.B Orr and K.-R Muller, Efficient backprop, in Neural Networks—Tricks of the Trade, Springer Lecture Notes in Computer Sciences 1524, pp.5-50, 1998 [RPROP93] 13 Riedmiller and H 
Braun, A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm, Proc ICNN, San Francisco (1993) [Muja2009] Marius Muja, David G Lowe Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration, 2009 [Dalal2005] Navneet Dalal and Bill Triggs Histogram of oriented gradients for human detection 2005 [Felzenszwalb2006] Pedro F Felzenszwalb algorithm [Pedro F Felzenszwalb and Daniel P Huttenlocher Efficient belief propagation for early vision International Journal of Computer Vision, 70(1), October 2006 [Yang2010] 17 Yang, L Wang, and N Ahuja A constant-space belief propagation algorithm for stereo matching In CVPR, 2010 [Brox2004] 20 Brox, A Bruhn, N Papenberg, J Weickert High accuracy optical flow estimation based on a theory for warping ECCV 2004 [FGD2003] Liyuan Li, Weimin Huang, Irene Y.H Gu, and Qi Tian Foreground Object Detection from Videos Containing Complex Background ACM MM2003 9p, 2003 [MOG2001] 16 KadewTraKuPong and R Bowden An improved adaptive background mixture model for real-time tracking with shadow detection Proc 2nd European Workshop on Advanced Video-Based Surveillance Systems, 2001 [MOG2004] 26 Zivkovic Improved adaptive Gausian mixture model for background subtraction International Conference Pattern Recognition, UK, August, 2004 [ShadowDetect2003] Prati, Mikic, Trivedi and Cucchiarra Detecting Moving Shadows IEEE PAMI, 2003 [GMG2012] Godbehere, A Matsukawa and K Goldberg Visual Tracking of Human Visitors under VariableLighting Conditions for a Responsive Audio Art Installation American Control Conference, Montreal, June 2012 [BL07] 13 Brown and D Lowe Automatic Panoramic Image Stitching using Invariant Features International Journal of Computer Vision, 74(1), pages 59-73, 2007 [RS10] Richard Szeliski Computer Vision: Algorithms and Applications Springer, New York, 2010 [RS04] Richard Szeliski Image alignment and stitching: A tutorial Technical Report MSR-TR-2004-92, Microsoft Research, December 2004 
[SS00] Heung-Yeung Shum and Richard Szeliski. Construction of panoramic mosaics with global and local alignment. International Journal of Computer Vision, 36(2):101-130, February 2000. Erratum published July 2002, 48(2):151-152.
[V03] Vivek Kwatra, Arno Schödl, Irfan Essa, Greg Turk and Aaron Bobick. Graphcut Textures: Image and Video Synthesis Using Graph Cuts. To appear in Proc. ACM Transactions on Graphics, SIGGRAPH 2003.
[UES01] M. Uyttendaele, A. Eden, and R. Szeliski. Eliminating ghosting and exposure artifacts in image mosaics. In Proc. CVPR’01, volume 2, pages 509–516, 2001.
[WJ10] Wei Xu and Jane Mulligan. Performance evaluation of color correction approaches for automatic multiview image and video stitching. In Intl. Conf. on Computer Vision and Pattern Recognition (CVPR10), San Francisco, CA, 2010.
[BA83] Burt, P., and Adelson, E. H., A Multiresolution Spline with Application to Image Mosaics. ACM Transactions on Graphics, 2(4):217-236, 1983.
[Lowe04] Lowe, D. G., “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, 60, 2, pp 91-110, 2004.
[Bay06] Bay, H. and Tuytelaars, T. and Van Gool, L. “SURF: Speeded Up Robust Features”, 9th European Conference on Computer Vision, 2006.
[AHP04] Ahonen, T., Hadid, A., and Pietikainen, M. Face Recognition with Local Binary Patterns. Computer Vision - ECCV 2004 (2004), 469–481.
[BHK97] Belhumeur, P. N., Hespanha, J., and Kriegman, D. Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. IEEE Transactions on Pattern Analysis and Machine Intelligence 19, (1997), 711–720.
[Bru92] Brunelli, R., Poggio, T. Face Recognition through Geometrical Features. European Conference on Computer Vision (ECCV) 1992, pp. 792–800.
[Duda01] Duda, Richard O. and Hart, Peter E. and Stork, David G., Pattern Classification (2nd Edition) 2001.
[Fisher36] Fisher, R. A. The use of multiple measurements in taxonomic problems. Annals Eugen. (1936), 179–188.
[GBK01] Georghiades, A.S. and Belhumeur, P.N. and Kriegman, D.J., From Few to Many: Illumination Cone Models for Face Recognition under Variable Lighting and Pose. IEEE Transactions on Pattern Analysis and Machine Intelligence 23, (2001), 643-660.
[Kanade73] Kanade, T. Picture processing system by computer complex and recognition of human faces. PhD thesis, Kyoto University, November 1973.
[KM01] Martinez, A. and Kak, A. PCA versus LDA. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol 23, No 2, pp 228-233, 2001.
[Lee05] Lee, K., Ho, J., Kriegman, D. Acquiring Linear Subspaces for Face Recognition under Variable Lighting. In: IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 27 (2005), Nr
[Messer06] Messer, K. et al. Performance Characterisation of Face Recognition Algorithms and Their Sensitivity to Severe Illumination Changes. In: ICB, 2006, pp. 1–11.
[RJ91] S. Raudys and A.K. Jain. Small sample size effects in statistical pattern recognition: Recommendations for practitioners. IEEE Transactions on Pattern Analysis and Machine Intelligence 13, (1991), 252-264.
[Tan10] Tan, X., and Triggs, B. Enhanced local texture feature sets for face recognition under difficult lighting conditions. IEEE Transactions on Image Processing 19 (2010), 1635–1650.
[TP91] Turk, M., and Pentland, A. Eigenfaces for recognition. Journal of Cognitive Neuroscience (1991), 71–86.
[Tu06] Chiara Turati, Viola Macchi Cassia, F. S., and Leo, I. Newborns face recognition: Role of inner and outer facial features. Child Development 77, (2006), 297–311.
[Wiskott97] Wiskott, L., Fellous, J., Krüger, N., Malsburg, C. Face Recognition By Elastic Bunch Graph Matching. IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (1997), pp. 775–779.
[Zhao03] Zhao, W., Chellappa, R., Phillips, P., and Rosenfeld, A. Face recognition: A literature survey. ACM Computing Surveys (CSUR) 35, (2003), 399–458.
[IJRR2008] M. Cummins and P. Newman, “FAB-MAP: Probabilistic Localization and Mapping in the Space of Appearance,” The International Journal of Robotics Research, vol 27(6), pp 647-665, 2008.
[TRO2010] M. Cummins and P. Newman, “Accelerating FAB-MAP with concentration inequalities,” IEEE Transactions on Robotics, vol 26(6), pp 1042-1050, 2010.
[IJRR2010] M. Cummins and P. Newman, “Appearance-only SLAM at large scale with FAB-MAP 2.0,” The International Journal of Robotics Research, vol 30(9), pp 1100-1123, 2010.
[ICRA2011] A. Glover, et al., “OpenFABMAP: An Open Source Toolbox for Appearance-based Loop Closure Detection,” in IEEE International Conference on Robotics and Automation, St Paul, Minnesota, 2011.
[AVC2007] Alexandra Teynor and Hans Burkhardt, “Fast Codebook Generation by Sequential Data Analysis for Object Classification”, in Advances in Visual Computing, pp 610-620, 2007.
[Iivarinen97] Jukka Iivarinen, Markus Peura, Jaakko Särelä, and Ari Visa. Comparison of Combined Shape Descriptors for Irregular Objects, 8th British Machine Vision Conference, BMVC'97. http://www.cis.hut.fi/research/IA/paper/publications/bmvc97/bmvc97.html
[Farsiu03] S. Farsiu, D. Robinson, M. Elad, P. Milanfar. Fast and robust Super-Resolution. Proc. 2003 IEEE Int. Conf. on Image Process, pp 291–294, 2003.
[Mitzel09] D. Mitzel, T. Pock, T. Schoenemann, D. Cremers. Video super resolution using duality based TV-L1 optical flow. DAGM, 2009.