THE INTELLIGENT WEB
Search, Smart Algorithms, and Big Data

GAUTAM SHROFF

Great Clarendon Street, Oxford, OX2 6DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries.

© Gautam Shroff 2013
The moral rights of the author have been asserted
First Edition published in 2013
Impression:

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above.

You must not circulate this work in any other form and you must impose this same condition on any acquirer.

Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America

British Library Cataloguing in Publication Data
Data available

Library of Congress Control Number: 2013938816
ISBN 978–0–19–964671–5

Printed in Italy by L.E.G.O S.p.A.-Lavis TN

Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.

To my late father, who I suspect would have enjoyed this book the most

ACKNOWLEDGEMENTS

Many people have contributed to my thinking and encouraged me while writing this book. But there are a few to whom I owe special thanks. First, to V S Subrahamanian, for reviewing the chapters as they came along and
supporting my endeavour with encouraging words. I am also especially grateful to Patrick Winston and Pentti Kanerva for sparing the time to speak with me and share their thoughts on the evolution and future of AI.

Equally important has been the support of my family. My wife Brinda, daughter Selena, and son Ahan: many thanks for tolerating my preoccupation on numerous weekends and evenings that kept me away from you. I must also thank my mother for enthusiastically reading many of the chapters, which gave me some confidence that they were accessible to someone not at all familiar with computing.

Last but not least I would like to thank my editor Latha Menon, for her careful and exhaustive reviews, and for shepherding this book through the publication process.

CONTENTS

List of Figures
Prologue: Potential

Look
    The MEMEX Reloaded
    Inside a Search Engine
    Google and the Mind
    Deeper and Darker

Listen
    Shannon and Advertising
    The Penny Clicks
    Statistics of Text
    Turing in Reverse
    Language and Statistics
    Language and Meaning
    Sentiment and Intent

Learn
    Learning to Label
    Limits of Labelling
    Rules and Facts
    Collaborative Filtering
    Random Hashing
    Latent Features
    Learning Facts from Text
    Learning vs ‘Knowing’

Connect
    Mechanical Logic
    The Semantic Web
    Limits of Logic
    Description and Resolution
    Belief albeit Uncertain
    Collective Reasoning

Predict
    Statistical Forecasting
    Neural Networks
    Predictive Analytics
    Sparse Memories
    Sequence Memory
    Deep Beliefs
    Network Science

Correct
    Running on Autopilot
    Feedback Control
    Making Plans
    Flocks and Swarms
    Problem Solving
    Ants at Work
    Darwin’s Ghost
    Intelligent Systems

Epilogue: Purpose
References
Index

LIST OF FIGURES

Turing’s proof
Pong games with eye-gaze tracking
Neuron: dendrites, axon, and synapses
Minutiae (fingerprint)
Face painting
Navigating a car park
Eight queens puzzle

... world wide web. In other words, rather than ‘traditional’ artificial intelligence, the successes we are witnessing are better described as those of ‘web intelligence’ arising... scale.

***

The web is believed to have well over a trillion web pages, of which at least 50 billion have been catalogued and indexed by search engines such as Google, making them searchable by