www.it-ebooks.info

Sau Sheong Chang
Exploring Everyday Things with R and Ruby

ISBN: 978-1-449-31515-3
[LSI]

Exploring Everyday Things with R and Ruby
by Sau Sheong Chang

Copyright © 2012 Sau Sheong Chang. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://my.safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Editors: Andy Oram and Mike Hendrickson
Production Editor: Kristen Borg
Copyeditor: Rachel Monaghan
Proofreader: Kiel Van Horn
Indexer: Angela Howard
Cover Designer: Karen Montgomery
Interior Designer: David Futato
Illustrator: Robert Romano

July 2012: First Edition

Revision History for the First Edition:
2012-06-26 First release

See http://oreilly.com/catalog/errata.csp?isbn=9781449315153 for release details.

Nutshell Handbook, the Nutshell Handbook logo, and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. Exploring Everyday Things with R and Ruby, the image of a hooded seal, and related trade dress are trademarks of O’Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc., was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

Table of Contents

Preface

1. The Hat and the Whip
  Ruby
  Why Ruby
  Installing Ruby
  Running Ruby
  Requiring External Libraries
  Basic Ruby
  Everything Is an Object
  Shoes
  What Is Shoes?
  A Rainbow of Shoes
  Installing Shoes
  Programming Shoes
  Wrap-up

2. Into the Matrix
  Introducing R
  Using R
  The R Console
  Sourcing Files and the Command Line
  Packages
  Programming R
  Variables and Functions
  Conditionals and Loops
  Data Structures
  Importing Data
  Charting
  Basic Graphs
  Introducing ggplot2
  Wrap-up

3. Offices and Restrooms
  The Simple Scenario
  Representing Restrooms and Such
  The First Simulation
  Interpreting the Data
  The Second Simulation
  The Third Simulation
  The Final Simulation
  Wrap-up

4. How to Be an Armchair Economist
  The Invisible Hand
  A Simple Market Economy
  The Producer
  The Consumer
  Some Convenience Methods
  The Simulation
  Analyzing the Simulation
  Resource Allocation by Price
  The Producer
  The Consumer
  Market
  The Simulation
  Analyzing the Second Simulation
  Price Controls
  Wrap-up

5. Discover Yourself Through Email
  The Idea
  Grab and Parse
  The Emailing Habits of Enron Executives
  Discover Yourself
  Number of Messages by Day of the Month
  MailMiner
  Number of Messages by Day of Week
  Number of Messages by Month
  Number of Messages by Hour of the Day
  Interactions
  Comparative Interactions
  Text Mining
  Wrap-up

6. In a Heartbeat
  My Beating Heart
  Auscultation
  Homemade Digital Stethoscope
  Extracting Data from Sound
  Generating the Heart Sounds Waveform
  Finding the Heart Rate
  Oximetry
  Homemade Pulse Oximeter
  Extracting Data from Video
  Generating the Heartbeat Waveform and Calculating the Heart Rate
  Wrap-up

7. Schooling Fish and Flocking Birds
  The Origin of Boids
  Simulation
  Roids
  The Boid Flocking Rules
  Supporting Rules
  A Variation on the Rules
  Going Round and Round
  Putting in Obstacles
  Wrap-up

8. Money, Sex, and Evolution
  It’s a Good Life
  Money
  Sex
  Birth and Death
  The Changes
  Evolution
  What We Will Be Changing
  Implementation
  Wrap-up

Index

Preface

Explorers Ahoy!

It’s hard to compare intrepid explorers like Ferdinand Magellan, James Cook, and Roald Amundsen with someone, well, like me.
While these adventurers braved the elements, wild nature, and unknown dangers to discover new worlds (at least for their civilization), my biggest physical achievement to date would probably be completing a 10-kilometer charity quarter-marathon, walking.

The explorers of old had it good, of course, when it came to choices of unexplored places to stake their claim on. Christopher Columbus only had to sail due west from Europe, and he discovered two entire continents. For us, there are far fewer choices. There isn’t much landmass on Earth that is yet unexplored; even the Mariana Trench, the deepest part of the world’s oceans, has been conquered.

But explorer I am, and explorer you will be in this book. While much of the known physical world has been conquered (see Figure P-1), the unknown still looms over most of us.

We are all born with a sense of wonder and amazement at the world around us. Many of us just learn to turn it off as we grow older and jaded. I believe this is partly because we don’t understand what goes on in the world around us well enough, and thus we don’t care either. Click the remote and the TV turns on: why and how does that work? The first time we tried to ask, we were probably given a blank stare or waved away: who cares as long as you can watch the next season of American Idol? That soon grows to be our reaction as well.

Figure P-1. The Scott expedition to the South Pole (photo from the Public Domain Review; http://publicdomainreview.org/2012/03/29/remembering-scott)

Well, in this book, I’ll take you along winding paths to bring back the original, wide-eyed person you were. We’ll find the magic again, and hopefully at the end of the book, you’ll continue where we leave off and make your own way in that journey of exploration and discovery.

Data, Data, Everywhere

We are swamped with data every minute and second of our lives. I don’t mean this metaphorically, and I am not simply waxing lyrical about big data either.
In fact, we’re so swamped that our eyes have evolved and adapted to this fact by shutting off our environment for a very short while every millisecond. In a phenomenon called saccadic masking, the brain shuts down during a fast eye movement (a saccade) to remove blurred images that come to our retina. Blurred images are not very useful, so the brain discards them, rendering us effectively blind (without us realizing it) during a saccade.

[...]

... Open URI, and the Net packages (HTTP, IMAP, SMTP, and so on). To use the standard libraries and any other libraries other than the Ruby core, you will need to require them in your program:

    require 'base64'

In addition to the standard libraries, you will often need to use external libraries developed by the Ruby community or yourself. The most common way to distribute Ruby libraries is through RubyGems,...

... external libraries to make life easier. Two sets of Ruby libraries come preinstalled with Ruby:

Core
  This is the default set of classes and modules that comes with Ruby, including String, Array, and so on.

Standard
  These libraries, found in the /lib folder of the Ruby source code, are distributed with Ruby but are not included by default when you run it. These include libraries...

... will use are Ruby and R. I’ve chosen them for specific purposes. Ruby is easy to learn and to read, perfectly suited to explain concepts in human-readable code. I will be using Ruby to write simulations and to do preprocessing to get data. R, on the other hand, is great for analyzing data and for generating charts for visualization. Although you don’t need to be a Ruby or R programmer to be able to appreciate...

... through the Ruby interpreter. For example, you could save puts "hello world!" to a file named hello_world.rb. After that, you can try this command at the console:

    $ ruby hello_world.rb
    hello world!
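The same save-and-run workflow also applies to a script that requires a standard library, as the excerpt on `require` describes. The following is only a minimal sketch: the file name `encode_demo.rb` and the helper `base64_round_trip` are hypothetical names chosen for illustration and do not come from the book. Depending on the Ruby version installed, `base64` may ship as a bundled gem rather than as part of the default standard library.

```ruby
# encode_demo.rb (hypothetical file name, not from the book)
# Standard libraries are not loaded by default, so require them explicitly.
require 'base64'

# Hypothetical helper: encodes text as Base64, decodes it again,
# and returns both the encoded and the decoded strings.
def base64_round_trip(text)
  encoded = Base64.strict_encode64(text)    # no line breaks in the output
  decoded = Base64.strict_decode64(encoded) # raises ArgumentError on bad input
  [encoded, decoded]
end

encoded, decoded = base64_round_trip('hello world!')
puts encoded
puts decoded
```

Saved under that name, the script could be run with `ruby encode_demo.rb`, which should print the encoded string followed by the original text.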
Most of the examples in this book will be run this way.

Requiring External Libraries

While you can probably get away with writing simpler Ruby programs without any other libraries than the ones built into Ruby itself,...

... use Safari Books Online as their primary resource for research, problem solving, learning, and certification training. Safari Books Online offers a range of product mixes and pricing programs for organizations, government agencies, and individuals. Subscribers have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly...

... book. First, Ruby is a programming language for human beings. Yukihiro “Matz” Matsumoto, the creator of Ruby, often said that he tried to make Ruby natural, not simple, in a way that mirrors life. Ruby programming is a lot like talking to your good friend, the computer. Ruby was designed to make programming fun and to put the human back into the equation for programming. For example, to print “I love Ruby ...

... their work much more efficiently. If you’re someone from these two camps, you are likely already taking full advantage of the power of computers. However, for programmers and many other people, writing computer programs started with providing tools for businesses and for improving business processes. It’s all about using computers to reduce cost, increase revenue, and improve efficiency. For many professional...

... serious later on) are FXRuby, WxRuby, qtRuby, and Tk. If you’re looking for something totally cross-platform, JRuby with Swing is a good option, although there are other alternatives to Swing, like SWT and Limelight. For Macs, a good alternative is MacRuby. However, in this book, we’ll be using Shoes.

What Is Shoes?

Shoes is a cross-platform toolkit for writing graphical applications with Ruby. It’s entirely...
... data structures. The two most important data structures, which you’ll meet very often in this book (and also in Ruby programming), are arrays and hashes. Arrays are indexed containers that hold a sequence of objects. You can create arrays using square brackets ([]) or using the Array class. Arrays are indexed through a running integer starting with 0, using the [] operator:

    a = [1, 2, 'this', 'is', 3.45]...

... neither are Ruby and R conventional tools for exploring the world around us. They just make things a whole lot more fun.

Ruby

Each of these tools will need its own chapter. We’ll start off first with Ruby and then discuss R in the next chapter. Obviously, there is no way I can explain the entire Ruby programming language in a single chapter of a book, so I will give enough information to whet your appetite and ...
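The array notation in the data structures excerpt above can be tried directly. This is a small sketch rather than the book's own listing: the array literal is the one quoted in the excerpt, while the method names `sample_array` and `sample_hash` and the hash contents are assumptions added for illustration.

```ruby
# Returns the array literal quoted in the excerpt.
def sample_array
  [1, 2, 'this', 'is', 3.45]
end

# Returns a small hash; keys and values are illustrative assumptions.
def sample_hash
  { 'ruby' => 'simulations', 'r' => 'charts' }
end

a = sample_array
puts a[0]      # index 0 is the first element
puts a[2]      # index 2 is the third element
puts a[-1]     # negative indexes count back from the end
puts a.length  # number of elements in the array

# Arrays can also be built through the Array class.
zeros = Array.new(3, 0) # three elements, each set to 0
puts zeros.inspect

# Hashes are indexed by key rather than by position.
h = sample_hash
puts h['ruby']
```

An array can hold objects of different classes side by side, which is why integers, strings, and a float can share one container here.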