[...] able to use the iconoscope to imitate the ways that human eyes view images for television broadcast (Inventors Online Museum, 2002). This technology was a key component in the advancement of electronic television and serves as the foundation for the design of the modern electronic televisions in use today (Fortner, 2002).

Although television may have provided the foundation for the technology of streaming video, the Internet has provided the means that has made it available to consumers in their homes and to businesses. The Internet has revolutionized the computer and communications world as never before. It has become a worldwide medium for broadcasting, information dissemination, collaboration, and interaction between individuals without regard to location (Leiner et al., 2000).

NETWORKING CONCEPTS

Because streaming video is delivered to the user over a network, it is important to understand the basics of how information is handled and transmitted through a network. In essence, networking involves one computer exchanging information with another computer. Most Internet addresses begin with http://. HTTP stands for hypertext transfer protocol, and it is a standard, or protocol (RealNetworks, 2000). It tells a browser and computer that HTML has been sent so that the incoming information can be read. In the case of some streaming video locations on the Internet, the addresses start with pnm://, rtp://, or rtsp://. PNM stands for progressive networks media; it is an older protocol, but a number of video clips still use it (RealNetworks, 2000). RTP stands for real-time transport protocol, and it is one of the most commonly used protocols for streaming media on the Internet (Compaq Computer Corporation [Compaq], 1998). RTSP stands for real-time streaming protocol, the newest of the three (RealNetworks, 2000). In all three cases, these addresses tell a browser and computer that streaming video has been sent. It should be noted that any computer receiving streaming video must have a special application installed that can read and play the video. This topic is discussed in more detail in the next section.

HOW STREAMING WORKS

Streaming involves taking video or audio files, breaking them down into packets of information, and sending them to their destination. At the receiving end, the viewer can play the video as it is being downloaded. Because of the way information flows on a network, there would otherwise be a number of interruptions and delays in playing the video. To address this issue, a technique called buffering was developed to ensure that playback on the receiving end is smooth. Buffering is the process by which a large number of information packets are collected before playback begins. Once enough packets have been collected, the video starts to play. As the video plays, buffering continues until all of the information has been received. It is important to note that the video is not stored on the user's computer; it is received, buffered, and played. The process described here is referred to as true streaming.
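The buffering behavior just described can be illustrated with a short simulation. This is a minimal sketch, not any vendor's player logic; the packet source, prebuffer threshold, and timing values are all invented for illustration.

```python
import collections
import random
import time

PREBUFFER_PACKETS = 50  # packets to collect before playback starts (illustrative)

def receive_packet():
    """Stand-in for the network: returns the next packet, with jittery timing."""
    time.sleep(random.uniform(0.001, 0.01))  # simulated network delay
    return b"\x00" * 1400                    # dummy payload

def play(packet):
    """Stand-in for the decoder/renderer."""
    pass

buffer = collections.deque()

# Phase 1: prebuffer -- collect packets before any playback begins.
while len(buffer) < PREBUFFER_PACKETS:
    buffer.append(receive_packet())

# Phase 2: play while buffering continues; packets are consumed, not stored
# on disk, which is the defining property of true streaming.
for _ in range(1000):                # a 1,000-packet clip, for the sketch
    buffer.append(receive_packet())  # buffering continues during playback
    play(buffer.popleft())           # played packets are discarded
```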
True streaming should not be confused with a method called pseudo-streaming, or progressive download. Pseudo-streaming users wait until a significant portion of the video file has been downloaded to their computer before viewing the video. This method allows users to save files to their hard drives for later viewing. Progressive download works best with very short media clips and a small number of simultaneous users (DoIt & WISC, 2002).

Streaming video may involve a video with or without sound. In the case of a video with sound, the visual portion is delivered on one stream while the audio is delivered on another. Technology has been developed to synchronize these streams at the destination to ensure that the sound matches the action being viewed. Streaming files that include more than one medium are known as rich media. It should be noted that streaming can include slide presentations, text, video, audio, or any combination of these.

A number of components are required to make streaming video work on the Internet. First, the user must have a computer connected to the Internet via a local area network or modem. The user must also have a Web browser with the appropriate video player or plug-in installed. Many plug-ins can be downloaded from the Web for free. A plug-in works in conjunction with a browser to play streaming video files. A Web server stores Web pages, or HTML files; streaming video files are usually kept on a separate, dedicated streaming server. When a streaming video link is clicked on a Web page, the browser reads the HTML code and then lets the player/plug-in take over (DoIt & WISC, 2002). The player accesses the selected video on the streaming server using the video protocols (RTP and RTSP) discussed in the previous section. After a few seconds of buffering, the video starts to play.

STREAMING TECHNOLOGIES AND SYSTEMS

A number of technologies are available for streaming video. The three major technologies are RealOne, QuickTime, and Windows Media (DoIt & WISC, 2002). Each streaming technology has three common hardware and/or software components: (1) servers and video files; (2) video players and plug-ins; and (3) compression, encoding, and creation tools (DoIt & WISC, 2002). The specifics of each technology are discussed in more depth later in this chapter.

Each streaming technology mentioned above may have its own proprietary server and media file types. RealOne, QuickTime, and Windows Media each have their own servers that stream files in their own formats. Therefore, it is important to create video files in a format that is compatible with the technology and server that will be used to stream the files. However, a relatively newer product called Helix offers open, comprehensive digital media communication for all players.

In order to play the video file, the user must have the second component, the player, installed on the computer. Users can download the player from the Web for free, or, sometimes, it is included with the browser. As with the video files and servers, each technology may have its own proprietary player. In some cases, one technology's media files cannot be played by another technology's player.
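This coupling between file formats and players can be captured in a simple dispatch table like the one below. The extension list is illustrative, not exhaustive, and the associations shown are the common ones for the technologies discussed in this chapter.

```python
# Illustrative mapping of media file extensions to the streaming
# technology whose player is normally required to play them.
PLAYER_FOR_FORMAT = {
    ".rm":  "RealOne",                  # RealMedia streaming files
    ".mov": "QuickTime",                # Apple QuickTime movies
    ".wmv": "Windows Media",            # Windows Media video
    ".asf": "Windows Media",            # Windows Media stream container
    ".mpg": "any MPEG-capable player",  # open MPEG standard
}

def required_player(filename: str) -> str:
    """Return the player a client would need for this file, or a warning."""
    for ext, player in PLAYER_FOR_FORMAT.items():
        if filename.lower().endswith(ext):
            return player
    return "unknown format -- the client may not be able to play it"

print(required_player("lecture.rm"))   # -> RealOne
print(required_player("trailer.mov"))  # -> QuickTime
```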
As indicated above, the third common streaming technology component is file creation, compression, and encoding: the process of creating video files for streaming. Again, each technology may have its own proprietary way of creating, compressing, and encoding streaming video files. Therefore, special software may be needed to create streaming video files that are compatible with the video player on the receiving end.

The above discussion has focused on the system requirements for streaming video. At this point, it is worth noting that the typical streaming video system has five basic functions. First, the video must be captured, digitized, and then stored on a disk. Second, after the video is stored on a disk, it can be edited to improve its quality and content. Third, the video file must be compressed and encoded into the appropriate streaming format. Fourth, the video is delivered to the user via the video server. And, fifth, the user receives, decodes, buffers, and plays the video on the computer.

CAPTURING AND DIGITIZING VIDEO

In working with streaming video, the first step is to record the video or obtain a recorded video. There are two types of video that can be recorded. The first is analog video, which is produced in a VHS, Hi-8, or Betacam format. The second is digital video, which is produced with a digital recorder or camera (DoIt & WISC, 2002).

Analog video contains video information in frames consisting of varying analog voltage values. It tends to degrade over time, and it can contain imperfections such as snow in the picture. Digital video contains video information in a series of digital numbers that can be stored and transmitted without imperfections. Digital video does not degrade over time, and recent advances in digital technology make it easier to store, retrieve, and edit digital video (Compaq, 1998).

If the video is from an analog source, it will have to be converted and compressed into a digital format. To do this conversion, an analog video capture card and the appropriate software must be installed on the computer. The video capture card is an expansion card that works in conjunction with, or replaces, the graphics adapter inside the computer. If the video is digital, a FireWire capture card can be used and the analog-to-digital step is not needed (Videomaker Magazine, 2001).

A side note on the digital video format that is worthwhile to review is that digital video often uses a different color format than the one used for computer monitors. Computer monitors display the color information for each pixel on the screen using the RGB (red, green, blue) format. (Pixels can be defined as the small elements, or points, that make up the frame.) Digital video frequently uses a format known as YCrCb, where Y represents the brightness (or luma) of a pixel, and Cr and Cb represent the pure color. In the different color schemes used in digital video, each pixel has its own brightness component, but groups of pixels may share the Cr/Cb color data. Hence, the terms 24-bit, 16-bit, and 12-bit color schemes refer to the number of color bits required per pixel (Compaq, 1998).
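The arithmetic behind those per-pixel figures can be made concrete. The sketch below assumes the common case of 8 bits per sample; a 2 × 2 block of pixels sharing one Cr/Cb pair (the layout usually called 4:2:0) yields the 12-bit figure, while other sharing patterns yield the 16- and 24-bit figures.

```python
BITS_PER_SAMPLE = 8  # 8 bits for each Y, Cr, or Cb sample (assumed)

def bits_per_pixel(pixels_sharing_color: int) -> float:
    """Average bits per pixel when a group of pixels shares one Cr/Cb pair.

    Every pixel carries its own luma (Y) sample; the two color
    samples (Cr and Cb) are divided across the sharing group.
    """
    luma = BITS_PER_SAMPLE
    shared_color = 2 * BITS_PER_SAMPLE / pixels_sharing_color
    return luma + shared_color

print(bits_per_pixel(1))  # 24.0 -- no sharing: Y, Cr, Cb for every pixel
print(bits_per_pixel(2))  # 16.0 -- two pixels share the color data
print(bits_per_pixel(4))  # 12.0 -- a 2x2 block shares the color data
```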
With the capture and conversion of the video, the video is transferred into a format that can be edited and then encoded for streaming. A number of formats are available. One of the most common is the AVI format. AVI stands for Audio Video Interleaved and was created by Microsoft. It is one of the oldest formats in use and is included with Microsoft's Windows applications (Fischer & Schroeder, 1996). This format was used in many of the early video editing systems and software. However, there are restrictions in using this format; the most notable is compatibility issues with some of the more advanced editing systems. Even with these issues, many editing systems and software packages can still use this format.

Another format is the MOV format, which was originally developed for the Macintosh computer by Apple. It then became the proprietary standard of Apple's QuickTime streaming technology (Fischer & Schroeder, 1996).

One of the most recent formats is the MPEG format, which is becoming very popular with streaming video users. MPEG stands for Moving Picture Experts Group, an international organization that developed standards for the encoding of moving images (Fischer & Schroeder, 1996). There are a number of MPEG standards available, primarily for the encoding and compression of streaming video; these are discussed in more detail later in this chapter. However, one of the initial standards that was developed, MPEG-1, is used for the storage of video.

In capturing and converting video for streaming, it is recommended to maintain the highest quality video possible. The result will be very large video files that will have to be edited and streamed. However, it is better to start with the highest quality that can be maintained and then scale down to the quality that can be streamed. Starting with a lower quality leaves fewer options for editing, compression, and encoding.

EDITING THE VIDEO

Once the video has been captured and converted to a digital format, it can be edited with a variety of editing tools. As mentioned above, each of the three main streaming technologies—RealOne, QuickTime, and Windows Media—has editing tools. Editing is critical, as it affects how the video is ultimately received by the user, and the end user's needs are paramount (see Producing Streaming Video for more).

In editing a video, one of the first things that may have to be done is cropping the video. Cropping involves removing the edges, where electronic errors, glitches, and black bars may be seen. These usually appear during the process of recording and converting the video. In most cases, removing about 5% of the edges will eliminate the glitches. In cropping a video, it is important to remember that the final dimensions of the video must be compatible with the encoding technology (Kennedy, 2001).

Television systems use a technique called interlacing to display a picture on the screen. This process involves displaying the picture on every other line of the television screen; lines are then inserted between the first set. This alternating of the picture lines eliminates flicker on the screen. Videos also have this feature. However, for streaming video that will be displayed on a computer screen, interlacing is not needed. Some capture cards have a deinterlacing feature, and some camcorders will record video without interlacing. If the video is still interlaced at the editing step, and the file is very large, it is advisable to deinterlace the video during editing (Kennedy, 2001).

Also, when film is converted to video, additional frames are added, because film is shot at 24 frames per second while, depending on the video standard, television may run at 25 to 30 frames per second (Kennedy, 2001). The process of converting film to video, where the additional frames are put in, is called telecine; the frame arithmetic involved is sketched below.
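A minimal sketch of that frame arithmetic for the NTSC case (24-frame film to 30-frame video, the cadence commonly known as 3:2 pulldown) follows. The function names are illustrative only, and real telecine works on interlaced fields rather than whole frames, but the frame counts work out the same.

```python
FILM_FPS = 24   # frames per second as shot on film
NTSC_FPS = 30   # frames per second for NTSC television

def frames_added_per_second() -> int:
    """Telecine must create this many extra frames per second of film."""
    return NTSC_FPS - FILM_FPS  # 6 extra frames per second

def telecine(film_frames: list) -> list:
    """Toy 3:2 pulldown: every group of 4 film frames becomes 5 video
    frames by repeating one frame per group."""
    video = []
    for i, frame in enumerate(film_frames):
        video.append(frame)
        if i % 4 == 1:          # repeat one frame out of every four
            video.append(frame)
    return video

second_of_film = list(range(24))
print(len(telecine(second_of_film)))  # 30 video frames
print(frames_added_per_second())      # 6
```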
It is best to avoid adding frames that are not needed. Therefore, if it is available, an inverse telecine conversion should be used to reduce the video back to 24 frames per second (Kennedy, 2001).

If a video has been shot with a lot of motion, the video could appear shaky or fuzzy, and not ideal for streaming. If this is the case, the best option may be to use a still frame or slow motion. A still frame or slow motion may not look very natural, but it is better than streamed video that is not viewable. Although special effects are great when viewed in a movie, they do not work well in streaming video, because they use a lot of memory and degrade the quality of the video; it is generally recommended that special effects be removed from the video. Streaming video is limited in its ability to deliver smooth video for any motion, such as dance, that relies on fluid movements. Also, if text is used in the video, it should be concise, legible, and easy to read.

Audio is a very important part of streaming video. If the video has an audio portion, the quality of the audio needs to be reviewed. For example, it is advisable to avoid background music or other noise in order to ensure that speakers can be heard clearly. It is also good practice to prepare the audio to work on the worst speaker system that any potential user may have. If the audio is not clear, the usefulness of the video is greatly diminished.

BANDWIDTH

Before covering the topic of compressing and encoding, it is essential to understand the concept of bandwidth, because bandwidth is a critical factor in the transmission and reception of streaming video. Bandwidth is, simply put, the amount of information that can pass through a particular point of the wire in a specific amount of time (RealNetworks, 2000). Network bandwidth can be compared to a water pipe, and a file to a tank of water. If the pipe is very narrow, it will take a long time for the water in the tank to flow through the pipe; if the pipe is larger, the water will flow through in less time (Microsoft.com, 2000). Therefore, the higher the bandwidth, the greater the amount of information that can flow through the network to the destination. At the destination, the speed of the modem or other device used to connect to the Internet determines the bandwidth of the stream that is received.

Because video files are large and many networks have limited bandwidth, there are many issues involved in transmitting these files over networks. Although many computer networks have installed new devices and technology to improve their bandwidth, this remains one of the biggest challenges to streaming a video over a network. The Internet was not designed to handle streaming video.

File sizes are measured in kilobytes (abbreviated as K or KB); a kilobyte contains 1,024 bytes. When this conversion is applied to large video files and the math is done to determine transmission rates, it is apparent that these files contain a huge amount of information to be transmitted. For example, a full-screen, full-motion video can require a data transmission rate of up to 216 megabits per second (Mbps) (Compaq, 1998). This exceeds the highest available data rates in most networks. Table 1 shows the available bandwidth for several methods of data delivery, according to Compaq (1998).

Table 1 Available Bandwidths

  Technology             Throughput
  --------------------   ----------
  Fast Ethernet          100 Mbps
  Ethernet               10 Mbps
  Cable modem            8 Mbps
  ADSL                   6 Mbps
  1x CD-ROM              1.2 Mbps
  Single-channel ISDN    64 Kbps
  High-speed modem       56 Kbps
  Standard modem         28 Kbps
In reviewing Table 1, it should be noted that the throughput listed for each technology represents an upper limit. In most cases, the actual throughput will be below this limit because of the amount of traffic on the network. Depending on the conditions of their connections, many users will see their data rates fluctuate up and down: one minute they may have a 10-Kbps rate; the next minute it may jump to 24 Kbps (Kennedy, 2000). Therefore, it is important for the provider of streaming video to match the data rate to the conditions and limitations of the potential users.

Also, the Fast Ethernet and Ethernet technologies listed in Table 1 are used primarily in businesses and organizations. Single-channel ISDN (integrated services digital network) is also used by businesses for video phones and video conferencing. Cable modems and ADSL (asymmetric digital subscriber line) are available to individual Internet users, but they are newer, more expensive technologies and are not as widely available as modems. Thus, it is safe to say that most Internet users have either a 56-Kbps high-speed modem or a 28-Kbps standard modem.

Two options are available for successfully delivering streaming video over networks. The first option involves scaling the video to smaller window sizes. This is important for low-bandwidth networks where many clients have modem access. The second option involves compressing the video using compression algorithms designed for this purpose. This is needed for most networks because of the high bandwidth requirements of videos that have not been compressed.

Scaling and compressing video do affect the quality of the video. The quality of the video is determined by frame rate, color depth, and resolution. Frame rate is the number of still images that make up one second of a moving video image. Images move fluidly and naturally at 30 frames per second, which is the National Television System Committee (NTSC) standard for full-motion video; film is usually 24 frames per second (Compaq, 1998). Videos with a frame rate of less than 15 frames per second become noticeably jumpy. It should be noted that most phone and modem technology limits the frame rate to 10 frames per second (Videomaker Magazine, 2001).

The second quality variable, color depth, is the number of bits of data the computer assigns to each pixel of the frame. The more bits of color data assigned to each pixel, the more colors can be displayed on the screen. Most videos use either 8-bit (256-color), 16-bit (64,000-color), or 24-bit (16.8-million-color) depth. The 8-bit color is very grainy and not suitable for video. The 24-bit color is the best, but it greatly increases the size of the streaming file, so 16-bit color is normally used (Videomaker Magazine, 2001).

The third quality variable, resolution, is measured by the number of pixels contained in the frame. Each pixel displays the brightness and color information that it receives from the video signal. The more pixels in the frame, the higher the resolution. For example, if the video is 640 × 480, there are 640 pixels across each of the 480 vertical lines of pixels.
Streamed video ranges from postage-stamp size, which is 49 × 49 pixels, to full PC monitor screen size, which is 640 × 480 pixels, and beyond (Videomaker Magazine, 2001).

SCALING

As mentioned previously, scaling involves reducing video to smaller windows. For example, this can be accomplished by reducing the frame resolution from a full screen (640 × 480) to a quarter screen (320 × 240). In addition, frame rate and color depth can also be scaled: the frame rate can be reduced from 30 to 15 frames per second, and the color depth can be scaled from 24-bit to 16-bit. According to Compaq (1998), the process in this example would reduce the video data rate from 216 Mbps to 18 Mbps, although the quality of the video would be reduced as well. However, as can be seen from the available bandwidths shown in Table 1, many delivery methods would not support even a data rate of 18 Mbps. Therefore, to further reduce the data rate, video compression is necessary.
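The factor-of-12 reduction in that example is just the product of the three scaling ratios, as a quick check confirms. The sketch treats the raw data rate as resolution × color depth × frame rate, a simplification that ignores compression; the arithmetic gives roughly 221 Mbps for the full-screen case, close to Compaq's rounded 216-Mbps figure.

```python
def raw_rate_mbps(width: int, height: int, bits_per_pixel: int, fps: int) -> float:
    """Uncompressed data rate in megabits per second (decimal megabits)."""
    return width * height * bits_per_pixel * fps / 1_000_000

full   = raw_rate_mbps(640, 480, 24, 30)  # ~221 Mbps (Compaq rounds to 216)
scaled = raw_rate_mbps(320, 240, 16, 15)  # ~18.4 Mbps, matching the 18-Mbps figure

print(f"full screen: {full:.0f} Mbps, scaled: {scaled:.1f} Mbps")
print(f"reduction factor: {full / scaled:.0f}x")  # 12x: 4 (area) x 1.5 (color) x 2 (rate)
```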
COMPRESSING AND ENCODING

The goal of compression is to represent video with as few bits as possible. Compression of video and audio involves the use of compression algorithms known as codecs. The term codec comes from the combination of the terms encoder and decoder—cod from encoder and dec from decoder (RealNetworks, 2000). An encoder converts a file into a format that can be streamed; this includes breaking the file down into data packets that can be sent and read as they are transmitted through the network. A decoder sorts, decodes, and reads the data packets as they are received at the destination. Files are compressed by encoder/decoder pairs for streaming over a network.

Encoders generally accept the specific input file formats used in the capture and digitizing process. The encoders then convert the input formats into proprietary streaming formats for storage or transmission to the decoder. Some codecs are process-intensive on the encode side, in order to create programs once that will be played many times by users. Other codecs divide the work more equally between encoding and decoding; these are typically used for live broadcasts (Compaq, 1998).

As mentioned above, each of the three major streaming technologies has its preferred encoding and compression formats. Many users opt to work with one of these three technologies because they are relatively easy to use and technical support is provided. These technologies give users options for selecting video quality and data transmission rates during the compression and encoding process. Depending on the application and technology used, multiple streaming files may have to be produced to match the different bandwidths of the networks over which the video is streamed. Two of the three major technologies have advanced options whereby a single streaming file can be produced whose data transmission rate adapts to the varying bandwidths of the networks. The specifics of these technologies are discussed in a later section.

Even with the dominance of the three major technologies, there are some open standards for compression algorithms. It is important to be aware of these standards and to understand how the compression algorithms work; with this knowledge, the user can make better decisions when creating, delivering, and viewing streaming video. The compression algorithms are discussed in more detail later. However, they all use the same basic compression techniques to one degree or another, so it is essential to review those techniques before discussing the algorithms.

First, compression techniques are either lossless or lossy. Lossless compression is a process where data are compressed without any alteration of the data. There are situations where messages must be transmitted without any changes; in these cases, lossless compression can be used. For example, lossless compression is typically used on computers to compress large files before emailing them (Vantum Corporation, 2001). A number of lossless techniques are available. However, video files in particular need more compression than lossless techniques can provide.

Lossy techniques involve altering or removing data for efficient transmission. With these techniques, the original video can only be approximately reconstructed from its compressed representation. This is acceptable for video and audio applications as long as the data alteration or removal is not too great; the amount that is acceptable depends on the application (Vantum Corporation, 2001).

A number of video compression techniques take advantage of the fact that the information from frame to frame is essentially the same. For example, a video that shows a person's head while that person is talking will have the same background throughout the video; the only changes will be in the person's facial expressions and other gestures. In this situation, the video information can be represented by a key frame along with delta frames containing the changes between frames. This is known as interframe compression. In addition, individual frames may be compressed using lossy techniques. An example is a technique where the number of bits representing color information is reduced and some color information is lost. This is known as intraframe compression. Combining the interframe and intraframe compression techniques can result in up to 200:1 compression (Compaq, 1998).

Another compression technique, called quantizing, is the basis for most lossy compression algorithms. Essentially, it is a process where data are rounded to reduce the display precision; for the most part, the eye cannot detect these changes to the fine details (Fischer & Schroeder, 1996). An example of this type of compression is the intraframe compression described above. Another example is the conversion from the RGB color format used in computer monitors to the YCrCb format used in digital video, discussed in the capturing and digitizing section of this chapter.

Filtering is a very common technique that involves the removal of unnecessary data. Transforming is another technique, in which a mathematical function converts the data into a code used for transmission; the transform can then be inverted to recover the data (Vantum Corporation, 2001).
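A toy illustration of the key-frame/delta-frame idea follows, with frames reduced to short lists of pixel values. Real codecs work on pixel blocks and motion vectors rather than individual values, but the principle is the same.

```python
def delta_encode(frames):
    """Interframe compression sketch: store the first frame in full (the key
    frame), then for each later frame store only the pixels that changed."""
    key = frames[0]
    deltas = []
    for prev, cur in zip(frames, frames[1:]):
        changed = {i: v for i, (p, v) in enumerate(zip(prev, cur)) if p != v}
        deltas.append(changed)
    return key, deltas

def delta_decode(key, deltas):
    """Rebuild every frame from the key frame plus the stored changes."""
    frames = [list(key)]
    for changed in deltas:
        frame = list(frames[-1])
        for i, v in changed.items():
            frame[i] = v
        frames.append(frame)
    return frames

# A "talking head": a static background where only two pixels ever change.
video = [[7, 7, 7, 7, 1, 2],
         [7, 7, 7, 7, 1, 3],
         [7, 7, 7, 7, 2, 3]]
key, deltas = delta_encode(video)
print(deltas)                               # [{5: 3}, {4: 2}] -- tiny deltas
assert delta_decode(key, deltas) == video   # lossless in this toy example
```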
For videos that have audio, the actual process used to compress audio is very different from that used to compress video, even though the techniques are very similar to those described above. This is because the eye and the ear work very differently. The ear has a much higher dynamic range and resolution: it can pick out more detail, but it is slower than the eye (Filippini, 1997). Sound is recorded as voltage levels, and it is sampled by the computer a number of times per second. The higher the sampling rate, the higher the quality and, hence, the greater the need for compression. Compressing audio data involves removing the unneeded and redundant parts of the signal; in addition, the portions of the signal that cannot be heard are removed.

VIDEO COMPRESSION ALGORITHMS

Some compression algorithms were designed for wide bandwidths and some for narrow bandwidths; some were developed specifically for CD-ROMs and others for streaming video. A number of compression algorithms are available for streaming video; this chapter discusses the major ones in use today: MPEG-1, MPEG-2, MPEG-4, H.261, H.263, and MJPEG. Video compression algorithms can be separated into two groups: those that make use of frame-to-frame redundancy and those that do not. The algorithms that use this redundancy can achieve significantly greater compression, but more computational power is required to encode the video.

As mentioned earlier in this chapter, MPEG stands for Moving Picture Experts Group, a working group of the International Standards Organization (ISO) (Compaq, 1998). This group has defined several levels of standards for video and audio compression. The MPEG standard specifies only a data model for compression and is thus an open, independent standard. MPEG is becoming very popular with streaming video creators and users.

The first of these standards, MPEG-1, became available in 1993 and was aimed primarily at video conferencing, videophones, computer games, and first-generation CD-ROMs. It was designed for consumer video and CD-ROM audio applications that operate at a data rate of approximately 1.5 Mbps and a frame rate of 30 frames per second. It has a resolution of 360 × 242 and supports playback functions such as fast forward, reverse, and random access into the bitstream (Compaq, 1998). It is currently used for video CDs, and it is a common format for video on the Internet when good quality is desired and its bandwidth requirements can be supported (Vantum Corporation, 2001).

MPEG-1 uses interframe compression to remove redundant data between frames, as discussed in the previous section, as well as intraframe compression within individual frames. This compression algorithm generates three types of frames: I-frames, P-frames, and B-frames. I-frames do not reference previous or future frames. They are stand-alone, or independent, frames, and they are larger than the other frame types, because they are compressed only with intraframe compression. They are the entry points for indexing or rewinding the video, because they represent complete pictures (Compaq, 1998).

P-frames, on the other hand, contain predictive information with respect to the previous I- or P-frames. They contain only the pixels that have changed since the last frame, and they account for motion. They are smaller than I-frames, because they are more compressed. I-frames are sent at regular intervals during the transmission process; P-frames are sent at some time interval after the I-frames (this interval varies based on the transmission of the streaming video). If the video has a lot of motion, the P-frames may not come fast enough to give the perception of smooth motion. Therefore, B-frames are inserted between the I- and P-frames.
B-frames use data in the previous I- or P-frames as well as the future I- or P-frames; thus, they are considered bidirectional. The data they contain are an interpolation of the data in the previous and future frames, with the assumption that the pixels will not change drastically between the two frames. As a result, B-frames have the most compression and are the smallest of the three frame types. In order for a decoder to decode a B-frame, it must already have the I- and P-frames on which the B-frame is based; thus, the frames may be transmitted out of order to reduce decoding delays (Compaq, 1998).

A frame sequence consisting of an I-frame and its following B- and P-frames, before the next I-frame, is called a group of pictures (GOP) (Compaq, 1998). There are usually around 15 frames in a GOP. An example of the MPEG encoding process can be seen in Figure 1.

[Figure 1: MPEG-1 encoding process. A group of pictures begins at an entry point (I-frame) and is followed by the repeating pattern B B P B B P B B P B B P B B across frames 1–15, with the next entry point starting the next group. The letters I, P, and B represent the frame types that could be included in a group of pictures and are sized to indicate the relative sizes of the frames.]

One disadvantage of the MPEG format is that it cannot easily be edited, because the video cannot be entered at just any point. In addition, the quality of the resulting video is affected by the amount of motion in the video: the more motion, the greater the probability that the quality will be reduced. The MPEG encoding and decoding process can also require a large amount of computational resources, which calls for specialized computer hardware or a computer with a powerful processor.

MPEG-2 was released in 1994 and was designed to be compatible with MPEG-1. It is used primarily for delivering digital cable and satellite video to homes, and it is the basis of DVD and HDTV. MPEG-2 uses the same compression techniques as MPEG-1 but has been enhanced for better compression efficiency. MPEG-2 supports two encoding schemes, depending on the application: the first uses a variable bit rate, which keeps the quality constant; the second varies the quality to keep the bit rate constant. MPEG-2 is not considered an ideal format for streaming over the Internet, because it works best at transmission rates higher than most networks can handle (Cunningham & Francis, 2001).

MPEG-4 is one of the most recent video formats and is geared toward Internet and mobile applications, including video conferencing, video terminals, Internet video phones, wireless mobile video, and interactive home shopping. It was originally designed to support data rates of less than 64 Kbps but has been enhanced to handle data rates ranging from 8 Kbps to 35 Mbps. MPEG-4 differs from MPEG-1 and MPEG-2 in that it has been enhanced to handle the transmission of objects described by shape, texture, and motion, rather than just rectangular frames of pixels. In fact, it is very similar to H.263, the video conferencing standard (Compaq, 1998). This feature makes MPEG-4 well suited to multimedia objects, which are used in interactive DVDs, interactive Web pages, and animations.
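The out-of-order transmission of B-frames mentioned above can be sketched as a reordering of the GOP: every B-frame is held back until the I- or P-frame it depends on has been sent. This is a simplified model for illustration, not an actual MPEG multiplexer.

```python
# Display order of a 15-frame GOP (see Figure 1): I at the entry point,
# then the repeating pattern B B P ...
display_order = ["I", "B", "B", "P", "B", "B", "P", "B", "B", "P",
                 "B", "B", "P", "B", "B"]

def transmission_order(frames):
    """Send each I- or P-frame before the B-frames that precede it in
    display order, so the decoder always has both references in hand."""
    sent, pending_b = [], []
    for frame in frames:
        if frame == "B":
            pending_b.append(frame)   # hold B-frames until their forward
        else:                         # reference (the next I or P) is sent
            sent.append(frame)
            sent.extend(pending_b)
            pending_b = []
    return sent + pending_b           # trailing B-frames close out the GOP

print(transmission_order(display_order))
# ['I', 'P', 'B', 'B', 'P', 'B', 'B', 'P', 'B', 'B', 'P', 'B', 'B', 'B', 'B']
```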
MPEG-7 is the newest standard. It is designed for multimedia data and can be used independently of the other MPEG standards. Work is also being done on an extension of the MPEG-7 standard, called MPEG-21.

The H.261 and H.263 standards are designed for video conference and video phone applications transmitted over an ISDN network. H.261 has the ability to adapt the image quality to the bandwidth of the transmission line; its transmission rate is usually around 64 Kbps (Fischer & Schroeder, 1996). H.263 was developed as an enhancement to H.261 and was designed to support lower bit rates. It has higher-precision motion compensation than H.261. H.263 is very similar to the MPEG standards, particularly MPEG-4, and uses the same compression techniques (Vantum Corporation, 2001).

MJPEG stands for Motion JPEG, and JPEG stands for Joint Photographic Experts Group. JPEG is an international standard for compressing still frames, and MJPEG is a sequence of JPEG-compressed still images that represent a moving picture. Thus, MJPEG is a compression method applied to each frame without respect to the preceding or following image (Vantum Corporation, 2001). MJPEG can be edited easily, but it cannot handle audio.

AUDIO COMPRESSION ALGORITHMS

Each of the three major streaming technologies has its preferred algorithms for compressing audio. In addition, the MPEG group has defined an audio standard, MPEG-1 audio. As discussed previously, audio compression is different from video compression, although it uses similar techniques. MPEG audio compression uses psychoacoustic principles, which deal with the way the human brain perceives sound (Filippini, 1997).

The first principle used in MPEG audio compression is the masking effect: weak sounds are not heard—they are masked—when they occur near a strong sound. For example, when audio is digitized, some compression occurs because data are removed and noise is added to the audio. This noise can be heard during silent moments, between words or sentences, but it is not heard while someone is talking or music is playing, because the noise is a weaker sound and is masked by the louder talking or music. MPEG uses this masking effect to raise the noise floor around a strong sound, because the noise will be masked anyway; by raising the noise floor, fewer data bits are used, and the signal (or file) is compressed. MPEG uses an algorithm to divide the sound spectrum into subbands and then calculates the optimum masking threshold for each band (Filippini, 1997).

The second psychoacoustic principle is that the human ear is less sensitive to high and low frequencies than to middle frequencies. In essence, MPEG employs a filtering technique, along with the masking effect, to remove data from the high and low frequencies, where the changes will not be noticed, while maintaining the data in the middle frequencies to keep the audio quality as high as possible.
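A highly simplified sketch of the subband/masking idea follows. Real MPEG audio uses a 32-band filterbank and a full psychoacoustic model; the single global threshold rule here is invented purely for illustration.

```python
def compress_subbands(subband_levels, mask_ratio=0.05):
    """Keep a subband only if it rises above the masking threshold set by
    the strongest band; masked bands are dropped (coded with zero bits)."""
    strongest = max(subband_levels)
    threshold = strongest * mask_ratio  # illustrative threshold rule
    return [level if level >= threshold else 0 for level in subband_levels]

# Energy levels in a few subbands: one strong tone and some quiet detail.
levels = [0.8, 0.001, 0.002, 0.30, 0.0005]
print(compress_subbands(levels))
# [0.8, 0, 0, 0.3, 0] -- the weak bands near the strong tone are masked
```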
DELIVERING THE VIDEO

Once the video has been compressed and encoded for streaming, the next step is to serve the video to users on the Internet. As discussed earlier in this chapter, delivering video over the Internet is usually accomplished with a streaming server instead of a Web server. A streaming server has specialized software that allows it to manage a data stream as it is being transmitted through the network, using the streaming protocols (RTP and RTSP) to transmit the video file. A Web server can be used to stream video, but it was designed to transfer text and images over the Internet, and it has no means of controlling a stream (Strom, 2001). When a Web server is used, a user selects a video file, and it starts to be copied down to the PC using HTTP, like any other data source on the Internet. The player takes control, and the video is buffered and played. But because Web servers cannot control the stream, the delivery of the video can be erratic, and the user may experience rebuffering interruptions. Thus, it is best to use a video server to ensure that the user will have smooth playback without interruptions; a sketch of the difference appears at the end of this section.

Video servers have capacity limitations: they can only deliver a certain number of streams at any one time. The capacity of a server is measured in the number of simultaneous streams it can put out at any given point in time, which can range from 20 to 5,000 or more, depending on the type of server (DoIt & WISC, 2002). If a user tries to access a video file after the server has reached its maximum capacity, the user will get a message stating that the server is busy and that the video should be tried again after a minute or two.

It is essential to note that streaming servers require the appropriate hardware, network connections, and technical expertise to set up and administer. This can consume time and resources, so many people and businesses choose to outsource the task to a host. A host is an agent or department that has the facilities and technical expertise to serve other people's streaming videos and other media content (DoIt & WISC, 2002). Hosts usually charge a fee for their services, and numerous hosts advertise on the Internet. When selecting a host, it is important to ensure that it can support the streaming technology being used by the client.

When using a host, the client transfers media files from his or her local computer to a streaming server, usually with special software called an FTP client (DoIt & WISC, 2002). The host sets the person up with a password-protected account and a designated amount of server space. In this situation, the person may have text and graphics for a Web site residing on a Web server while the streaming files reside on a streaming server. This is managed by using certain HTML tags on the Web page that trigger and control the playback of the media files from the streaming server, which involves specifying the path of the particular video file on the streaming server. Each of the three major streaming technologies has its own unique embedded HTML tags for controlling the video files on servers, and many of the encoding applications can generate these HTML tags (DoIt & WISC, 2002).
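As referenced above, the essential difference between a Web server and a streaming server can be sketched as pacing: a Web server pushes the file as fast as the connection allows, while a streaming server meters packets out at the encoded bit rate. This is a conceptual sketch only; the function names and parameters are invented, not any vendor's server API.

```python
import time

def web_server_send(packets, send):
    """HTTP-style delivery: push everything as fast as the link allows."""
    for packet in packets:
        send(packet)

def streaming_server_send(packets, send, video_kbps, packet_bytes=1400):
    """Streaming-style delivery: meter packets out at the encoded bit rate,
    so the client's buffer neither starves nor overflows."""
    seconds_per_packet = (packet_bytes * 8) / (video_kbps * 1000)
    for packet in packets:
        send(packet)
        time.sleep(seconds_per_packet)  # pace to the video's data rate

# For example, a 56-kbps stream with 1,400-byte packets is paced at
# 1400 * 8 / 56000 = 0.2 seconds per packet.
```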
As covered earlier in the discussion of bandwidth, not all networks are suited to streaming video. Video works best when the bandwidth of the network is continuously high. When the bandwidth required by the video exceeds that of the network, delays in the transmission of the data packets can occur. These delays will cause the picture to flicker and the audio (if present) to start and stop. To deal with these issues, a new measure of network capability has been developed, called quality of service (QoS) (Compaq, 1998). Networks with a good QoS measure provide guaranteed bandwidth with few delays; the networks with the best QoS are those with dedicated connections for streaming.

Another network characteristic that needs to be considered for streaming video is the network's ability to support video-on-demand delivery and webcasting delivery. With video on demand (also known as unicasting), a stream is delivered one-to-one to each client, and the user can request the video at any time. This type of delivery can consume a lot of network bandwidth, depending on the number of users requesting a video. Figure 2 shows a simple diagram of how video on demand works, according to Compaq (1998).

[Figure 2: Video on demand. A server sends a separate stream through routers across the network to each client PC; each line in the diagram represents a separate stream.]

Webcasting is used for live events where there can potentially be many viewers. Webcasting delivers one stream to many clients simultaneously, so it does not consume as much bandwidth as video on demand. But, as noted previously, video on demand is much more common because of the convenience it offers to users; webcasting is scheduled for specific times and requires considerable effort and resources to coordinate. Networks that support webcasting must have routers that are multicast capable. Figure 3 shows how a webcast works, according to Compaq (1998).

[Figure 3: Webcasting. A server sends a single stream across the network to a multicast router, which fans it out to the client PCs; note the single line crossing the network.]

RECEIVING, DECODING, AND PLAYING THE VIDEO

Finally, at the client desktop, the user accesses the video file. As discussed above, the user clicks on the video file that he or she wants to view, the request is routed to the appropriate file on the video server, and the player technology on the user's PC takes control of the data transmission. The player buffers the stream(s), decodes the data packets, and converts the information back to analog so the video can be viewed. The player usually has functionality that allows the user to play, pause, rewind, and fast forward. The player technology must be compatible with the streaming technology in order for the user to view the video.

With video on demand, the user controls access to the video: he or she can start, stop, rewind, and so forth at will. Although this freedom is desirable to the user, it does consume bandwidth on the network (as noted above). With webcasting, the user can only watch the video stream as it is being transmitted; he or she has no control over the stream. Webcasting does not use as much bandwidth as video on demand; the bandwidth arithmetic is sketched below.
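A quick comparison of the two delivery modes' demands on the network, using an illustrative stream rate and audience size:

```python
def unicast_load_mbps(viewers: int, stream_kbps: int) -> float:
    """Video on demand: one full stream per viewer crosses the network."""
    return viewers * stream_kbps / 1000

def multicast_load_mbps(viewers: int, stream_kbps: int) -> float:
    """Webcasting: one stream crosses the network regardless of audience;
    multicast routers replicate it near the clients."""
    return stream_kbps / 1000

viewers, rate = 1000, 56  # 1,000 viewers of a 56-kbps stream (illustrative)
print(unicast_load_mbps(viewers, rate))    # 56.0 Mbps
print(multicast_load_mbps(viewers, rate))  # 0.056 Mbps
```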
PRODUCING STREAMING VIDEO

Up to this point, the discussion has covered the aspects of streaming video after it has been created, or produced. However, there are some techniques that should be used when producing streaming video that will make the capturing, editing, compressing, and encoding processes go much more smoothly. Because streaming video has to be compressed before delivery over the network, one of the most important things to remember when producing the video is to minimize motion and changes in the objects or people in the video. The more motion and change there is in the video, the more the video will have to be compressed, and thus the more data (such as fine details and color) will be altered or removed.

Therefore, it is best to use a tripod or other method to anchor the camera whenever possible. If the camera is held in the video producer's hands, all of the hand movements will be incorporated into the video. The video producer should also avoid panning the camera as much as possible and avoid zooming in and out on a scene. Eliminating camera movement and keeping zooming to a minimum will prevent changes from being introduced into the video.

The video producer should also keep the background as simple and consistent as possible, avoiding trees, buildings, and other elements that add complexity to the video—and thus more data to compress. In addition, the producer should stay as close to the subject as possible when shooting the video. There may be some temptation to choose a wide shot of the scene; however, when viewed online, such video will seem fuzzy, because the compression will remove much of the fine detail of the wide shot.

Last, the video producer should use an external microphone whenever possible. With an external microphone, the producer can keep the microphone as close to the subject as possible to get good-quality audio, and with good-quality audio the audio compression will work much better. Audio is just as important as the images being displayed in the video.

VIDEO STREAMING USES

The previous sections have focused on the technical aspects of creating, delivering, and playing streaming video. This section focuses on the many uses of streaming video and the preferences of users. First, streaming media (video along with audio) have grown rapidly over the last few years. The number of Internet sites transmitting streaming video grew from 30,000 in mid-1998 to 400,000 by late 1999. The NetAid concert in October 1999 set a world record for the largest Internet broadcast event in a single day—2.5 million streams. The BBC Online's European solar eclipse site served a million streams in a day in August 1999, and the BBC estimated that its streaming audience was growing by 100% every 4 months (Tanaka, 2000).

As can be seen from these statistics, streaming video continues to gain in popularity, even with the technical challenges involved in streaming video over the Internet. A Web survey conducted by Tanaka (2000) indicates that streaming video appeals to users because they can select what they want to view and when they want to view it. Users also like the fact that streaming technology has made specialized or unique videos and other media available to them.

Streaming video uses fall into the primary categories of entertainment, news/information, education, training, and business. Entertainment was one of the earliest uses of streaming video and still remains the primary use today. Entertainment covers a wide range of media, including movies, music, and TV shows. There are numerous Web sites promoting free and pay-per-view movies; many sites feature independent filmmakers, foreign films, and pornography (Bennett, 2002).
Pornography video sites are some of the oldest entertainment sites on the Internet; at this time, pornography may be the largest online movie market of all on the Web (Bennett, 2002).

Recently, some Web sites have been established that show hit movies on the Internet. For example, viewers can watch blockbuster movies on a Web site for $3.95 (Graham, 2002). The Hollywood studios have been slow to use the Internet as a medium for showing their movies, because they want to be sure it is a safe way to deliver their films. It is interesting to note that many movies are available on the Internet in unauthorized versions; many of these were copied from DVDs or shot with a camcorder in a theater and then traded on file-sharing sites such as Morpheus and Kazaa (Graham, 2002).

According to Graham (2002), users need high-speed Internet access, such as cable or ADSL, to watch movies over the Internet. Even with high-speed access, users may experience stutter—stopping and starting—if there is a lot of traffic on the network. Users can view the movie only in a partial- or full-PC-screen-size window.

Another development in the streaming video world has been the integration of over-the-air and online entertainment programs. For example, in November 1999, ABC.com and Warner Brothers Online hosted a simulcast of an episode of the Drew Carey Show. The television audience watched Drew's daily activities, while the Internet audience saw footage of what was happening in his home while he was out at work. ABC indicated that approximately 650,000 streams were served (Tanaka, 2000).

In the news/information category, many users like to use the Internet to view video clips of domestic and international news items. Other users tend to gravitate toward sites that provide clips of sporting events. For example, the on-air rating of the JFK Jr. tragedy was only 1.4, while over 2.3 million streams were delivered from the CNN.com Web site (Tanaka, 2000).

The use of streaming video in education and training has been growing rapidly for the last few years. Universities and colleges, in particular, have been exploring the use of streaming video for their distance learning programs. Distance learning has become popular because many people who have been in the workforce for a few years are returning to school to obtain an advanced degree, pursue a career change, or upgrade their skills. Many opt for distance learning programs because of their work or travel schedules, or because the academic programs they desire are not available locally. Because of the growth of these programs, many colleges and universities have started to use streaming video as an alternative to mailing out VCR tapes, which can be cumbersome. With streaming video, these institutions can expand their distance learning programs to meet the needs of their students. In addition, many Web sites offer training and tutorial programs on a variety of subjects.

A common presentation method used by educators is a lecture that includes static slides. These are much easier to create and provide good-quality sound and images for students who have modem connections. A number of software tools can be used to combine PowerPoint slides with narration to create streaming presentations.
In view of the above discussion, it should also be pointed out that streaming video is used for teaching material that involves motion or dynamic interaction. Some examples include medical or laboratory procedures, processes in the physical sciences, interpersonal skills, and illustrations of real-world events or activities (DoIt & WISC, 2002). In addition, live training or teaching webcasts are produced using audio, slides, or video. The participants access the Web site from their computers, and interaction between the instructor and participants occurs in real time; participants can use a chat window to type questions to the presenter during the session (DoIt & WISC, 2002). These events are very challenging to coordinate and deliver and are not as common as illustrated audio presentations.

Businesses and companies are starting to use streaming video for advertising and communications. Some businesses have started to webcast their products in order to improve sales. One of the most talked-about events was the Victoria's Secret fashion show that was webcast in February 1999 (Tanaka, 2000).

Another form of advertising that has become increasingly popular is the video banner ad (Tanaka, 2000). This technology uses a program that detects whether the client PC has a streaming media player and, if so, determines its type. This is done before the user clicks on the Web page. Once the user clicks on the Web page, video is played using the media player on the PC; if there is no media player on the PC, a regular GIF banner is displayed instead.

Businesses are also using streaming media to broadcast presentations, corporate meetings, and in-house seminars to their employees. Many companies are finding that this is less expensive than live meetings and seminars, where travel expenses are incurred, and it offers opportunities for communication that would not otherwise be available. For example, a company that uses streaming technology may choose to broadcast an industry analysts' meeting or a public relations event that, without this technology, would not be feasible.

THE BIG THREE STREAMING TECHNOLOGIES

As mentioned previously in this chapter, there are three major technologies for streaming video: RealOne, QuickTime, and Windows Media. These three provide all of the tools needed for streaming video, including applications for creating, editing, compressing, encoding, serving, and playing. Of the three, RealOne is the oldest and still the most widely used (Sauer, 2001). RealNetworks claims over 70% of the Internet streaming market, with its player installed on over 90% of home PCs (Cunningham & Francis, 2001). The RealOne technology supports over 40 media formats and employs the latest generation of encoding and compression techniques. RealNetworks has also developed a technology called SureStream that uses automatic bit-rate adjustment to match the data stream rate to the bandwidth characteristics of the user (Cunningham & Francis, 2001); the idea is sketched below.

RealOne has developed some strategic partnerships that may give it a competitive advantage for the near future. First, RealOne now supports Apple's QuickTime technology. It is also working with the National Basketball Association and Major League Baseball on a pay-per-view model (Cunningham & Francis, 2001).
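A minimal sketch of the SureStream-style idea referenced above: one logical file holds several encodings, and the server picks the best one the client's measured bandwidth can sustain. The encoding ladder and selection rule here are invented for illustration, not RealNetworks' actual algorithm.

```python
# One logical "file" encoded at several bit rates (kbps) -- an invented ladder.
ENCODINGS_KBPS = [20, 34, 80, 150, 300]

def pick_stream(measured_kbps: float, headroom: float = 0.8) -> int:
    """Choose the highest encoding the client's bandwidth can sustain,
    leaving some headroom so fluctuation does not stall playback."""
    usable = measured_kbps * headroom
    candidates = [rate for rate in ENCODINGS_KBPS if rate <= usable]
    return max(candidates) if candidates else min(ENCODINGS_KBPS)

print(pick_stream(56))   # 34  -- a 56-kbps modem user gets the 34-kbps stream
print(pick_stream(400))  # 300 -- a fast connection gets the top encoding
print(pick_stream(15))   # 20  -- below the ladder: fall back to the lowest
```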
However, only the basic player and server versions are free; the more advanced server and production tools available from RealNetworks can cost up to several thousand dollars. Streaming is RealNetworks' core business, and the company must charge fees for the use of its applications, whereas its competitors can incorporate their streaming technology into other products they sell, such as operating systems (Cunningham & Francis, 2001).

Also, RealOne is SMIL compliant. SMIL stands for synchronized multimedia integration language, and it provides a time-based, synchronized environment for streaming audio, video, text, images, and animation (Strom, 2001). SMIL is a relatively new language available to streaming users and is the officially recognized standard of the World Wide Web Consortium (Strom, 2001). SMIL has attracted a lot of attention because of the features and flexibility it offers.

QuickTime was developed by Apple in 1991, and it is one of the oldest formats for downloaded video, though one of the most recent entrants into the streaming video market (Sauer, 2001). One of the advantages QuickTime offers is that it can support different compression techniques, including those used by RealOne, as noted above. QuickTime also features an open plug-in function that allows the use of outside compression techniques (Cunningham & Francis, 2001). It is also SMIL compliant (as noted above for RealOne). QuickTime is available in the Apple Mac operating system. But it [...]

[...] there is the raw video that must be captured and digitized into the appropriate input file format. Then the video must be encoded and compressed into the proper streaming format. Next, the video is delivered over the Internet from a special server, called a video server. The user then receives and plays the video. The process sounds simple, but the actual functions are very complex, as can be seen from the [...]
Streaming media themselves would not have come into being if it were not for the development of the Internet. Interestingly enough, the data transmission, or bandwidth, limitations of the Internet remain one of the biggest challenges to streaming video over a network. The Internet simply was not designed for streaming media. With the bandwidth limitations of the Internet in mind, a lot of the streaming media [...]

REFERENCES

[...]
Vantum Corporation (2001). [...] Retrieved from Vantum Corporation Web site: http://www.vantum.com/pdf/codecs.pdf
Videomaker Magazine (2001). Streaming video primer. Retrieved December 31, 2001, from Chaminade College Preparatory Web site: http://www.chaminade.org/mis/Articles/StreamingVideo.htm
Williams, M. (2001). New chip could bring video to mobile phones. Retrieved January 17, 2001, from CNN.com Web site: http://www.cnn.com/2001/TECH/computing/01/17/mpeg4.chip.idg/index.html