Part 2 of the book Programming Interactivity covers bitmaps and pixels, physical feedback, protocols and communication, graphics and OpenGL, detection and gestures, movement and location, spaces and environments, further resources, and other topics.
CHAPTER 10 Bitmaps and Pixels

In this chapter, you’ll learn about video and images and how your computer processes them, and you’ll learn how to display them, manipulate them, and save them to files. Why are we talking about video and images together? Well, both video and photos are bitmaps composed of pixels. A pixel is the color data that will be displayed at one physical pixel in your computer monitor. A bitmap is an array of pixel data.

Video is several different things with quite distinct meanings: it is light from a projector or screen, it is a series of pixels, it is a representation of what was happening somewhere, or it is a constructed image. Another way to phrase this is that you can think of video as being both file format and medium. A video can be something on a computer screen that someone is looking at, it can be data, it can be documentation, it can be a surveillance view onto a real place, it can be an abstraction, or it can be something fictional. It is always two or more of these at once because when you’re dealing with video on a computer, and especially when you’re dealing with that video in code, the video is always a piece of data. It is always a stream of color information that is reassembled into frames by a video player application and then displayed on the screen. Video is also something else as well, because it is a screen, a display, or perhaps an image. That screen need not be a standard white projection area; it can be a building, a pool of water, smoke, or something that conceals its nature as video and makes use of it only as light.

A picture has a lot of the same characteristics. A photograph is, as soon as you digitize it, a chunk of data on a disk or in the memory of your computer that, when turned into pixel data to be drawn to the screen, becomes something else. What that something else is determines how your users will use the images and how they will understand them. A picture in a viewer is something to be looked at. A picture
in a graphics program is something to be manipulated. A picture on a map is a sign that gives some information.

Using Pixels As Data

Any visual information on a computer is composed of pixel information. This means graphics, pictures, and videos. A video is composed of frames, which are roughly the same as a bitmapped file like a JPEG or PNG file. I say roughly because the difference between a video frame and a PNG is rather substantial if you’re examining the actual data contained in the file, which may be compressed. Once the file or frame has been loaded into Processing or openFrameworks, though, it consists of the same data: pixels. The graphics that you draw to the screen can be accessed in oF or Processing by grabbing the screen data. We’ll look at creating screen data later in this chapter, but the real point to note is that any visual information can be accessed via pixels.

Any pixel is composed of three or four pieces of information stored in a numerical value, which in decimal format would look something like this:

    255 000 000 255

which is full red with 100 percent alpha, or this:

    255 255 255 255

which is white with 100 percent alpha. In the case of most video data, you’ll find that the alpha value is not included, supplying only three pieces of information, as in the value for red:

    255 000 000

Notice in Figure 10-1 that although the hexadecimal representation of a pixel has the order alpha, red, green, blue (often referred to as ARGB), when you read the data for a pixel back as three or four different values, the order will usually be red, green, blue, alpha (RGBA).

Figure 10-1. Numerical representations of pixel data

The two characters 0x in front of the number tell the compiler that you’re referring to a hexadecimal number. Without them, in both Processing and oF, you’ll see errors when you compile.

In oF, when you get the pixels of the frame of a video or picture, you’ll get three or four unsigned char values per pixel, in RGB or RGBA order. To get the pixels of an ofImage
object, use the getPixels() method, and store the result in a pointer to an unsigned char. Remember from an earlier chapter that C++ uses unsigned char where Arduino and Processing use the byte variable type:

    unsigned char * pixels = somePicture.getPixels();

So, now you have an array of the pixels from the image. The value for pixels[0] will be the red value of the first pixel, pixels[1] will be the green value, pixels[2] will be the blue, and pixels[3] will be the alpha value (if the image is using an alpha value). Remember that more often than not, images won’t have an alpha value, so pixels[3] will be the red value of the second pixel.

While this may not be the most glamorous section in this book, it is helpful when dealing with video and photos, which, as we all know, can be quite glamorous. A bitmap is a contiguous section of memory, which means that one number sits next to the next number in the memory that your program has allocated to store the bitmap. The first pixel in the array will be the upper-left pixel of your bitmap, and the last pixel will be the lower-right corner, as shown in Figure 10-2.

Figure 10-2. The pixels of a 1280 × 853 bitmap

The last pixel sits at an index one less than the width of the image multiplied by its height (indices start at 0), so the width times the height gives you the number of pixels. This should give you an idea of how to inspect every pixel in an image. Here’s how to do it in Processing:

    int imgSize = myImage.height * myImage.width;
    for (int i = 0; i < imgSize; i++) {
        // do something with myImage.pixels[i]
    }

And here’s how to do it in oF:

    unsigned char * pixels = somePicture.getPixels();
    // one value for each color component of the image
    int length = somePicture.height * somePicture.width * 3;
    for (int i = 0; i < length; i++) {
        // do something with the color component pixels[i]
    }

Notice the difference?
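One way to see the difference is to work out where a given pixel lives in the flat array. What follows is a minimal sketch in plain C++ (no openFrameworks required) rather than a definitive implementation. The helper names componentIndex and brightnessAt are hypothetical, invented for this illustration, and the sketch assumes the buffer is packed row by row with no padding between rows, which not every image library guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of openFrameworks): returns the index of the
// first component (red) of the pixel at column x, row y in a packed,
// row-major buffer. channels is 3 for RGB data and 4 for RGBA data.
std::size_t componentIndex(int x, int y, int width, int channels) {
    assert(x >= 0 && x < width && y >= 0 && channels > 0);
    return (static_cast<std::size_t>(y) * width + x) * channels;
}

// Hypothetical helper: averages the red, green, and blue components of the
// pixel at (x, y). Any alpha component is ignored.
int brightnessAt(const std::vector<unsigned char>& pixels,
                 int x, int y, int width, int channels) {
    assert(channels >= 3);
    std::size_t i = componentIndex(x, y, width, channels);
    assert(i + 2 < pixels.size());
    return (pixels[i] + pixels[i + 1] + pixels[i + 2]) / 3;
}
```

For a three-component image that is 4 pixels wide, componentIndex(1, 0, 4, 3) is 3, the red value of the second pixel, which matches the pixels[3] case described earlier. Passing 4 as the channel count covers images that carry an alpha value.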
The Processing code has one value for each pixel, while the oF code has three because each pixel is split into three parts (red, green, and blue), or four values if the image has an alpha channel (red, green, blue, alpha).

Using Pixels and Bitmaps As Input

What does it mean to use bitmaps as input? It means that each pixel is being analyzed as a piece of data or that each pixel is being analyzed to find patterns, colors, faces, contours, and shapes, which will then be analyzed. Object detection is a very complex topic that attracts many different types of researchers, from artists to robotics engineers to researchers working with machine learning. In Chapter 14, computer vision will be discussed in much greater detail. For this chapter, the input possibilities of the bitmap will be explored a little more simply. That said, there are a great number of areas that can be explored.

You can perform simple presence detection by taking an initial frame of an image of a room and comparing it with subsequent frames. A substantial difference in the two frames would imply that someone or something is present in the room or space. There are far more sophisticated ways to do motion detection, but at its simplest, motion detection is really just looking for a group of pixels near one another that have changed substantially in color from one frame to the next.

The tone of the light in a room can tell you what time it is, whether the light in a room is artificial, and where it is in relation to the camera. Analyzing the brightest pixels in a bitmap is another way of using pixel data for creating interactions. If your application runs in a controlled environment, you can predict what the brightest object in your bitmap will be: a window, the sun, a flashlight, a laser. A flashlight or laser can be used like a pointer or a mouse and can become a quite sophisticated user interface.

Analyzing color works much the same as analyzing brightness and can be used in interaction in the same way. A block, a
paddle, or any object can be tracked throughout the camera frame through color detection. Interfaces using objects that are held by the user are often called tangible user interfaces because the user is holding the object that the computer recognizes. Those are both extremely sophisticated projects, but at a simpler level you can do plenty of things with color or brightness data: create a cursor on the screen, use the position of the object as a dial or potentiometer, create buttons, navigate over lists. As long as the user understands how the data is being gathered and analyzed, you’re good to go. In addition to analyzing bitmaps for data, you can simply use a bitmap as part of a conversion process where the bitmap is the input data that will be converted into a novel data form. Some examples of this are given in the next section of this chapter.

Another interesting issue to consider is that for an application that does not know where it is, bitmap data is an important way of determining where it is, of establishing context. While GPS can provide important information about the geographic location of a device, it doesn’t describe the actual context in which the user is using the application. Many mobile phones and laptops now have different affordances that are contextual, such as reading the light in the room to set the brightness of the backlighting on the keyboard, lighting up when they detect sudden movement that indicates that they are about to be used, autoadjusting the camera, and so on. Thinking about bitmap data as more than a picture can help you create more conversational and rich interactions. Once you move beyond looking at individual bitmaps and begin using arrays of bitmaps, you can begin to determine the amount of change in light or the amount of motion without the great deal of complex math that is required for more advanced kinds of analysis.

Providing Feedback with Bitmaps

If you’re looking to make a purely
abstract image, it’s often much more efficient to create a vector-based graphic using drawing tools. One notable exception to this is the “physical pixel,” that is, some mechanical object that moves or changes based on the pixel value. This can be done using servo motors, solenoid motors, LED matrices, or nearly anything that you can imagine. Chapter 11 contains information about how to design and build such physical systems; however, this chapter focuses more on processing and displaying bitmaps.

Sometimes the need for a video, bitmap, or a photo image in an application is obvious. A mapping application begs for a photo view. Many times, though, the need for a photograph is a little subtler or the nature of the photograph is subtler. Danny Rozin’s Wooden Mirror is one of the best examples of a photograph that changes our conception of the bitmap, the pixel, and the mirror. It uses a series of mechanical motors that flip small wooden tiles (analogous to pixels in a bitmap) to match an incoming video stream so that the image of the viewer is created in subtle wooden pixels. He has also developed The Mirrors Mirror, which has a similar mechanism turning small mirrors. These mirrors act as the pixels of the piece, both reflecting and representing the image data.

Another interesting use of the pixel is Benjamin Gaulon’s PrintBall, a sort of inkjet printer that uses a paintball gun as the printhead and paintballs as ink. The gun uses a mounted servo motor that is controlled by a microcontroller that reads the pixels of an image and fires a paintball onto a wall in the location of the pixel, making a bitmap of brightly colored splashes from the paintballs. Though the application simply prints a bitmap, it prints in a way that is interesting and physically engaging.

These works raise some of the core questions in working with video and images: Who are you showing? What are you showing? Are you showing viewers videos of themselves?
Who then is watching the video of the viewers? Are you showing them how they are seen by the computer? How does their movement translate into data? How is that data translated into a movement or an image? Does the user have control over the image? If so, how? What sorts of actions are they going to be able to take, and how will these actions be organized? Once they are finished editing the image, how will they be able to save it?

So, what is a bitmapped image to you as the designer of an interactive application? It depends on how you approach your interaction and how you conceive the communication between the user and your system. Imagery is a way to convey information, juxtaposing different information sources through layering and transparency. Any weather or mapping application will demonstrate this with data overlaying other data or images highlighting important aspects, as will almost any photo-editing application. With the widespread availability of image-editing tools like Photoshop, the language of editing and the act of modifying images are becoming commonplace enough that the play, the creation of layers, and the tools to manipulate and juxtapose are almost instantly familiar. As with many aspects of interactive applications, the language of the created product and the language of creating that product are blending together. This means that the creation of your imagery, the layering and the transparency, the framing, and even the modular nature of your graphics can be a collaborative process between your users and your system. After all, this is the goal of a truly interactive application.

The data of a bitmap is not all that dissimilar from the data you work with when analyzing sound. In fact, many sound analysis techniques are used in image analysis as well; the fast Fourier transform, discussed in an earlier chapter, is among the more prominent. This chapter will show you some methods for processing and manipulating the pixels that make
up the bitmap data of an image or of a frame of a video.

Looping Through Pixels

In both Processing and oF, you can easily parse through the pixels of an image: in Processing through the pixels array of the image, and in oF through the getPixels() method of the image. We’ll look at Processing first and then oF. The following code loads an image, displays it, and then processes the image, drawing a 20 × 20 pixel rectangle as it loops, using the color of each pixel for the fill color of the rectangle:

    PImage pic;
    int location = 0;
    int fullSize;

    void setup() {
        pic = loadImage("test.jpg");
        fullSize = pic.height * pic.width;
        size(pic.width, pic.height);
    }

    void draw() {
        background(255, 255, 255);
        image(pic, 0, 0);
        // advance to the next pixel, wrapping around after the last one
        if (location >= fullSize - 1) {
            location = 0;
        } else {
            location++;
        }
        fill(pic.pixels[location]);
        int row = location / width;
        int pos = location - (row * width);
        rect(pos, row, 20, 20);
    }

This code will work with a single picture only. To work with multiple pictures, you’ll want to read the pixels of your application, rather than the pixels of the picture. Before you read the pixels of your application, you’ll need to call the loadPixels() method. This method loads the pixel data for the display window into the pixels array. The pixels array is empty before the pixels are loaded, so you’ll need to call the loadPixels() method before trying to access the pixels array. Add the call to the loadPixels() method, and change the fill() method to read the pixels of the PApplet instead of the pixels of the PImage:

    loadPixels();
    fill(pixels[location]);

Looping through the pixels in oF is done a little differently. In your application, add an ofImage and a pointer to an unsigned char:

    #ifndef _PIXEL_READ
    #define _PIXEL_READ

    #include "ofMain.h"

    #define bytesPerPixel 3

    class testApp : public ofBaseApp {
        public:
            void setup();
            void update();
            void draw();

            ofImage pic;
            int location;
            int fullSize;
            unsigned char * pixels;
    };

    #endif

In the setup() method of your application, get access to the pixels of the image using the
getPixels() method. The rest of the code is about the same as the Processing version with one exception, as mentioned earlier: for each pixel, there are three unsigned char values in the pixels array:

    #include "PixelRead.h"

    void testApp::setup() {
        location = 0;
        pic.loadImage("image_test.jpg");
        fullSize = pic.width * pic.height;
        ofSetWindowShape(pic.width, pic.height);
        pixels = pic.getPixels();
        ofSetVerticalSync(true);
        ofEnableAlphaBlending();
    }

    void testApp::update(){}

    void testApp::draw() {
        ofSetupScreen();
        pic.draw(0,0);
        // the interactive version
        //location = (mouseY * pic.width) + mouseX;
        // the noninteractive version: advance, wrapping after the last pixel
        if (location >= fullSize - 1) {
            location = 0;
        } else {
            location++;
        }
        int r = pixels[3 * location];
        int g = pixels[3 * location+1];
        int b = pixels[3 * location+2];
        ofSetColor(r, g, b);
        int col = location % pic.width;
        int row = location / pic.width;
        ofCircle(col, row, 20);
        ofSetColor(0xffffff);
    }

To grab the pixels of your entire application, create an ofImage instance, and then call the grabScreen() method to load the pixels from the screen into the image object:

    void grabScreen(int x, int y, int w, int h);

An example call might look like this:

    // these should be in setup()
    int screenWidth = ofGetScreenWidth();
    int screenHeight = ofGetScreenHeight();

    // this would go in draw()
    screenImg.grabScreen(0, 0, screenWidth, screenHeight);

The ofGetScreenWidth() and ofGetScreenHeight() methods aren’t necessary if you already know the size of the screen, but if you’re in full-screen mode and you don’t know the size of the screen that your application is being shown on, then they can be helpful.

Manipulating Bitmaps

A common way to change a bitmap is to examine each pixel and modify it according to the value of the pixels around it. You’ve probably seen a blurring filter or a sharpen filter that brought out the edges of an image. You can create these kinds of effects by examining each pixel and then
performing a calculation on the pixels around it according to a convolution kernel. A convolution kernel is essentially a fancy name for a matrix. A sample kernel might look like this:

    0.11   0.11   0.11
    0.11   8      0.11
    0.11   0.11   0.11

This indicates that each pixel in the image will have this kernel applied to it; all the pixels around the current pixel will be multiplied by 0.11, the current pixel will be multiplied by 8, and the results will be summed. Take a look at Figure 10-3.

Figure 10-3. Performing an image convolution

On the left is a pixel to which the convolution kernel will be applied. Since determining the final value of a pixel is done by examining all the pixels surrounding it, the second image shows what the surrounding pixels might look like. The third image shows the grayscale value of each of the nine pixels. Just below that is the convolution kernel that will be applied to each pixel. After multiplying each pixel by the corresponding value in the kernel, the pixels will look like the fourth image. Note that this doesn’t actually change the surrounding pixels. This is simply to determine what value will be assigned to the center pixel, the pixel to which the kernel is currently being applied. Each of those values is added together, and the sum is set as the grayscale value of the pixel in the center of the kernel. Since that value is greater than 255, it’s clamped to 255. This has the net result, when applied to an entire image, of leaving only dark pixels that are surrounded by other dark pixels with any color. All the rest of the pixels are changed to white.

Applying the sample convolution kernel to an image produces the effects shown in Figure 10-4.

Figure 10-4. Effect of a convolution filter

Now take a look at the code for applying the convolution kernel to a grayscale image:

    PImage img;
    float[][] kernel = { { 0.111, 0.111, 0.111 },
                         { 0.111, 8, 0.111 },
                         { 0.111, 0.111, 0.111 } };

    void setup() {
        img = loadImage("street.jpg"); // Load the original image
        size(img.width, img.height); // size our Processing app to the image
    }

    void draw() {
        img.loadPixels(); // make sure the pixels of the image are available
        // create a new empty image that we'll draw into
        PImage kerneledImg = createImage(width, height, RGB);
        // loop through each pixel
        for (int y = 1; y < height-1; y++) { // Skip top and bottom edges
            for (int x = 1; x < width-1; x++) { // Skip left and right edges
                float sum = 0; // Kernel sum for this pixel
                // now loop through each value in the kernel
                for (int kernely = -1; kernely