An Introduction to Pattern Recognition
Statistical, Neural Net and Syntactic methods of getting robots to see and hear

Michael D Alder
September 19, 1997

HeavenForBooks.com
This Edition ©Mike Alder, 2001

Warning: This edition is not to be copied, transmitted, excerpted or printed except on terms authorised by the publisher HeavenForBooks.com

Preface

Automation, the use of robots in industry, has not progressed with the speed that many had hoped it would. The forecasts of twenty years ago are looking fairly silly today: the fact that they were produced largely by journalists for the benefit of boardrooms of accountants and MBAs may have something to do with this, but the question of why so little has been accomplished remains.

The problems were, of course, harder than they looked to naive optimists. Robots have been built that can move around on wheels or legs; robots of a sort are used on production lines for routine tasks such as welding. But a robot that can clear the table, throw the eggshells in with the garbage and wash up the dishes, instead of washing up the eggshells and throwing the dishes in the garbage, is still some distance off.

Pattern Classification, more often called Pattern Recognition, is the primary bottleneck in the task of automation. Robots without sensors have their uses, but they are limited and dangerous. In fact one might plausibly argue that a robot without sensors isn't a real robot at all, whatever the hardware manufacturers may say. But equipping a robot with vision is easy only at the hardware level. It is neither expensive nor technically difficult to connect a camera and frame grabber board to a computer, the robot's `brain'. The problem is with
the software, or more exactly with the algorithms which have to decide what the robot is looking at; the input is an array of pixels, coloured dots, and the software has to decide whether this is an image of an eggshell or a teacup. This is a task which human beings can master by age eight, when they decode the firing of the different light receptors in the retina of the eye, but it is computationally very difficult, and we have only the crudest ideas of how it is done. At the hardware level there are marked similarities between the eye and a camera (although there are differences too). At the algorithmic level, we have only a shallow understanding of the issues.

http://ciips.ee.uwa.edu.au/~mike/PatRec/

Human beings are very good at learning a large amount of information about the universe and how it can be treated; transferring this information to a program tends to be slow if not impossible.

This has been apparent for some time, and a great deal of effort has been put into research into practical methods of getting robots to recognise things in images and sounds. The Centre for Intelligent Information Processing Systems (CIIPS), of the University of Western Australia, has been working in the area for some years now. We have been particularly concerned with neural nets and applications to pattern recognition in speech and vision, because adaptive or learning methods are clearly of great potential value. The present book has been used as a postgraduate textbook at CIIPS for a Master's level course in Pattern Recognition. The contents of the book are therefore oriented largely to image and to some extent speech pattern recognition, with some concentration on neural net methods.

Students who did the course for which this book was originally written also completed units in Automatic Speech Recognition Algorithms, Engineering Mathematics
(covering elements of Information Theory, Coding Theory and Linear and Multilinear algebra), Artificial Neural Nets, Image Processing, Sensors and Instrumentation and Adaptive Filtering. There is some overlap in the material of this book and several of the other courses, but it has been kept to a minimum. Examination for the Pattern Recognition course consisted of a sequence of four micro-projects which together made up one mini-project.

Since the students for whom this book was written had a variety of backgrounds, it is intended to be accessible. Since the major obstructions to further progress seem to be fundamental, it seems pointless to try to produce a handbook of methods without analysis. Engineering works well when it is founded on some well understood scientific basis, and it turns into alchemy and witchcraft when this is not the case.

The situation at present in respect of our scientific basis is that it is, like the curate's egg, good in parts. We are solidly grounded at the hardware level. On the other hand, the software tools for encoding algorithms (C, C++, MatLab) are fairly primitive, and our grasp of what algorithms to use is negligible. I have tried therefore to focus on the ideas and the (limited) extent to which they work, since progress is likely to require new ideas, which in turn requires us to have a fair grasp of what the old ideas are. The belief that engineers as a class are not intelligent enough to grasp any ideas at all, and must be trained to jump through hoops, although common among mathematicians, is not one which attracts my sympathy. Instead of exposing the fundamental ideas in algebra (which in these degenerate days is less intelligible than Latin) I therefore try to make them plain in English.

There is a risk in this; the ideas of science or engineering are quite different from those of philosophy (as practised in these degenerate days) or literary criticism (ditto). I don't mean they are about different things; they are different in kind.
Newton wrote `Hypotheses non fingo', which literally translates as `I do not make hypotheses', which is of course quite untrue; he made up some spectacularly successful hypotheses, such as universal gravitation. The difference between the two statements is partly in the hypotheses and partly in the fingo. Newton's `hypotheses' could be tested by observation or calculation, whereas the explanations of, say, optics, given in Lucretius' De Rerum Natura were recognisably `philosophical' in the sense that they resembled the writings of many contemporary philosophers and literary critics. They may persuade, they may give the sensation of profound insight, but they do not reduce to some essentially prosaic routine for determining if they are actually true, or at least useful. Newton's did. This was one of the great philosophical advances made by Newton, and it has been underestimated by philosophers since.

The reader should therefore approach the discussion about the underlying ideas with the attitude of irreverence and disrespect that most engineers, quite properly, bring to non-technical prose. He should ask: what procedures does this lead to, and how may they be tested?
We deal with high level abstractions, but they are aimed always at reducing our understanding of something prodigiously complicated to something simple.

It is necessary to make some assumptions about the reader and only fair to say what they are. I assume, first, that the reader has a tolerably good grasp of Linear Algebra concepts. The concepts are more important than the techniques of matrix manipulation, because there are excellent packages which can do the calculations if you know what to compute. There is a splendid book on Linear Algebra available from the publisher HeavenForBooks.com.

I assume, second, a moderate familiarity with elementary ideas of Statistics, and also of contemporary Mathematical notation such as any Engineer or Scientist will have encountered in a modern undergraduate course. I found it necessary in this book to deal with underlying ideas of Statistics which are seldom mentioned in undergraduate courses.

I assume, finally, the kind of general exposure to computing terminology familiar to anyone who can read, say, Byte magazine, and also that the reader can program in C or some similar language.

I do not assume the reader is of the male sex. I usually use the pronoun `he' in referring to the reader because it saves a letter and is the convention for the generic case. The proposition that this will depress some women readers to the point where they will give up reading and go off and become subservient housewives does not strike me as sufficiently plausible to be worth considering further.

This is intended to be a happy, friendly book. It is written in an informal, one might almost say breezy, manner, which might irritate the humourless and those possessed of a conviction that intellectual respectability entails stuffiness. I used to believe that all academic books on difficult subjects were obliged for some mysterious reason to be oppressive, but a survey of the better writers of the past has shown me that this is in fact a contemporary habit and in my view
a bad one. I have therefore chosen to abandon a convention which must drive intelligent people away from Science and Engineering in large numbers.

The book has jokes, opinionated remarks and pungent value judgments in it, which might serve to entertain readers and keep them on their toes, so to speak. They may also irritate a few who believe that the pretence that the writer has no opinions should be maintained even at the cost of making the book boring. What this convention usually accomplishes is a sort of bland porridge which discourages critical thought about fundamental assumptions, and thought about fundamental assumptions is precisely what this area badly needs.

So I make no apology for the occasional provocative judgement; argue with me if you disagree. It is quite easy to do that via the net, and since I enjoy arguing (it is a pleasant game), most of my provocations are deliberate. Disagreeing with people in an amiable, friendly way, and learning something about why people feel the way they do, is an important part of an education; merely learning the correct things to say doesn't get you very far in Mathematics, Science or Engineering. Cultured men or women should be able to dissent with poise, to refute the argument without losing the friend.

The judgements are, of course, my own; CIIPS and the Mathematics Department and I are not responsible for each other. Nor is it to be expected that the University of Western Australia should ensure that my views are politically correct. If it did that, it wouldn't be a university. In a good university, it is a case of Tot homines, quot sententiae: there are as many opinions as people. Sometimes more!
I am most grateful to my colleagues and students at the Centre for assistance in many forms; I have shamelessly borrowed their work as examples of the principles discussed herein. I must mention Dr Chris deSilva with whom I have worked over many years, Dr Gek Lim whose energy and enthusiasm for Quadratic Neural Nets has enabled them to become demonstrably useful, and Professor Yianni Attikiouzel, director of CIIPS, without whom neither this book nor the course would have come into existence.

Contents

Basic Concepts
    Measurement and Representation
    Telling the guys from the gals
    From objects to points in space
    Paradigms
    Decisions, decisions
    Metric Methods
    Neural Net Methods (Old Style)
    Statistical Methods
    Parametric
    Non-parametric
    CART et al
    Clustering: supervised v unsupervised learning
    Dynamic Patterns
    Structured Patterns
    Alternative Representations
    Strings, propositions, predicates and logic
    Fuzzy Thinking
    Robots
    Exercises
    Summary of this chapter
    Bibliography
Image Measurements
    Preliminaries
    Image File Formats
    Generalities
    Image segmentation: finding the objects
    Little Boxes
    Border Tracing
    Mathematical Morphology
    Conclusions on Segmentation
    Measurement Principles
    Issues and methods
    Invariance in practice
    Measurement practice
    Quick and Dumb
    Scanline intersections and weights
    Moments
    Zernike moments and the FFT
    Historical Note
    Masks and templates
    Invariants
    Simplifications and Complications
    Syntactic Methods
    Summary of OCR Measurement Methods
    Other Kinds of Binary Image
    Greyscale images of characters
    Segmentation: Edge Detection
    Greyscale Images in general
    Measuring Greyscale Images
    Quantisation
    Segmentation
    Textures
    Colour Images
    Generalities
    Quantisation
    Edge detection
    Markov Random Fields
    Measurements
    IR and acoustic Images
    Quasi-Images
    Dynamic Images
    Summary of Chapter Two
    Exercises
    Spot counting
    Bibliography
Statistical Ideas
    History, and Deep Philosophical Stuff
    Histograms and Probability Density Functions
    The Origins of Probability: random variables
    Models and Probabilistic Models
    Probabilistic Models as Data Compression Schemes
    Maximum Likelihood Models
    Models and Data: Some models are better than others
    Where do Models come from?
    Bayesian Methods
    Bayesian Statistics
    Bayes' Theorem
    Subjective Bayesians
    Minimum Description Length Models
    Codes: Information theoretic preliminaries
    Compression for coin models
    Compression for pdfs
    Summary of Rissanen Complexity
    Exercises
    Summary of the chapter
    Bibliography
Decisions: Statistical methods
    The view into
    Computing PDFs: Gaussians
    One Gaussian per cluster
    Lots of Gaussians: The EM algorithm
    Dimension
    The EM algorithm for Gaussian Mixture Modelling
    Other Possibilities
    Bayesian Decision
    Non-parametric Bayes Decisions
    Cost Functions
    Other Metrics
    How many things in the mix?
    Overhead
    Example
    The Akaike Information Criterion
    Problems with EM
    Exercises
    Summary of Chapter
    Bibliography
Decisions: Neural Nets (Old Style)
    History: the good old days
    The death of Neural Nets
    The Rebirth of Neural Nets
    The Dawn of Neural Nets
    The End of History
    Training the Perceptron
    The Perceptron Training Rule

Footnotes

http://ciips.ee.uwa.edu.au/~mike/PatRec/footnode.html

...streets
This is important. The damage those guys could do if let loose in business or politics doesn't bear thinking about.

...bibliography
It may be remarked that many books on languages and automata are written by mathematical illiterates for mathematical illiterates, which makes them hard to read, and Eilenberg is a notable exception. His book is hard to read for quite different reasons: he assumes you are intelligent.

...on
After doing it my way, I discovered that Eilenberg had got there before me, and he explains how somebody else had got there before him. Such is Life.

...them
Alternatively, you can go into seedy bars with a clear conscience, knowing that you no longer have to worry about meeting people who will expect payment for slipping you a suitable initialisation.

...indeed
If you are used to reading Chinese you may feel differently about this, but then I expect you are having a lot of trouble reading the present work.

...long
The reason for allowing infinitely long sequences is not that anybody expects to meet one and be able to say with satisfaction, `wow, that was infinitely long'; it is more of a reluctance to specify the longest case one expects to get. In
other words, it is done to simplify things; the infinitude of the natural numbers is a case in point. There may be some grounds for reasonable doubt as to whether this in fact works: sometimes one merely defers the difficulties to a later stage.

...end
Grammarians used to believe that high levels of inflection showed a highly evolved language, probably because Latin was regarded as morally superior to English, German or French. Then it was discovered that Chinese used to be inflected a few thousand years ago but the Chinese very sensibly gave it up.

...level
Past a certain point it is called `plagiarism'.

...is
As Herod said to the three wise men.

...methods
Or whatever is the pet method favoured by the audience.

...world
To a geometer, it looks as though the algebraists have gotten into the cookie-jar when everyone knows they are there to tidy things up and nit-pick about details, not go around being creative. Or, even worse than algebraists, gasp, logicians. When logicians go around being creative, universes totter and crumble.

...calculation
Mere singularity is unimportant; to have one eigenvalue collapsing to zero may be regarded as a misfortune, to have both collapse seems like carelessness.

...world
As when one's progeny want one to switch off the Bach so they can concentrate on the Heavy Metal, or vice versa.

...group
A Topological Group is a collection of invertible transformations such that the composite
(do one then another) of any two is a third, and also having the property that the multiplication operation and the inversion operation are continuous. A Lie Group (pronounced `Lee') is a topological group where the operations are also differentiable and the set forms a manifold, which is a higher dimensional generalisation of a curve or surface. This definition is sloppy and intended to convey a vague and intuitive idea which is adequate for present purposes.

...group
The term is from differential topology, and may be found in any of the standard texts. It is not necessary to expand upon its precise meaning here, and an intuitive sense may be extracted from the context.

...book
But watch out for the sequel.

...cognition
Blakemore conjectured it in the paper cited, but it probably struck a lot of people that this was a variant of a Hebbian learning rule.

...preference
There is a belief in some quarters that God is a mathematician, or why did He make Physics so mathematical?
And in particular, He must be a geometer, because most of Physics uses geometry. This is like wanting the Sun to shine at night when we need it and not in the daytime when it's light anyway. The fact is that the most powerful languages allow you to say more interesting and important things. That is why I use geometry.

Mike Alder 9/19/1997

About this document

An Introduction to Pattern Recognition: Statistical, Neural Net and Syntactic methods of getting robots to see and hear

This document was generated using the LaTeX2HTML translator Version 97.1 (release) (July 13th, 1997).

Copyright © 1993, 1994, 1995, 1996, 1997, Nikos Drakos, Computer Based Learning Unit, University of Leeds.

The command line arguments were:
latex2html PatRec.tex

The translation was initiated by Mike Alder on 9/19/1997.

http://ciips.ee.uwa.edu.au/~mike/PatRec/node221.html