Chapter 23 Conferencing on the Internet

Conferencing involves communication among several users. Multimedia conferencing, including audio, video, instant messaging, whiteboard sharing, and file transfer, is a popular service on the Internet and in enterprises. Chat rooms where users exchange instant messages are an example of a conference service on the Internet. The collaboration tools used in most enterprises are also examples of conferences. Thus, conferences are not limited to traditional unmoderated audio or video conferences. They can include all types of media and can be moderated by using floor control mechanisms.

Conferencing is an important area for enterprises with employees working in different countries. A conference system that includes collaboration tools can save considerable money and time by reducing the need for face-to-face meetings that require attendees to travel great distances. However, we are still far from having conference systems that can replace face-to-face meetings completely. That is why there is much ongoing research in areas such as telepresence and virtual reality. The goal is to make virtual interactions as close to real ones as possible.

23.1 Conferencing Standardization at the IETF

In the past, working groups such as MMUSIC did some work on conferencing (e.g., SDP was designed with multiparty sessions in mind). Lately, the working groups that have been active in this area have been SIPPING and XCON. In fact, implementers sometimes find it confusing to have similar specifications in the same area coming from two different working groups. Knowing the history behind conferencing standardization at the IETF will help readers understand how the specifications coming from both working groups relate to each other.

Initially, the SIPPING working group developed a set of specifications that described how to provide conferencing services using SIP.
Coming from the SIPPING working group, these specifications were, unsurprisingly, very much focused on SIP. Pieces needed to build a complete conference service, such as floor control and conference management mechanisms (beyond the simple ones SIP provides), were out of the scope of this work.

The XCON working group was chartered to work on generalizing the work done in SIPPING so that different signaling protocols (not only SIP) could be used and to specify those missing pieces needed to build a complete conference system. The charter was limited to centralized conferences where clients connect to a central server following a star topology. Conferences using different topologies such as full-meshed and cascaded conferences were left out of scope.

The 3G IP Multimedia Subsystem (IMS): Merging the Internet and the Cellular Worlds, Third Edition. Gonzalo Camarillo and Miguel A. García-Martín. © 2008 John Wiley & Sons, Ltd. ISBN: 978-0-470-51662-1

The results of the work of these two working groups include two conferencing frameworks: the SIPPING conferencing framework and the XCON conferencing framework. We discuss both of them, their differences, and how they relate to each other.

23.2 The SIPPING Conferencing Framework

The SIPPING conferencing framework (specified in RFC 4353 [272]) describes three conferencing models: loosely coupled, fully distributed, and tightly coupled.

In the loosely-coupled conferencing model, shown in Figure 23.1, media streams are multicast. Conference participants join the multicast group of the conference using, for example, IGMP (Internet Group Management Protocol, specified in RFC 3376 [95]) in order to receive media. Conference participants do not typically have any signaling relationship between them. Still, they can use SIP to invite new participants into the conference.
A SIP INVITE request sent to a new participant would contain (in its body) all information needed to join the multicast group.

Figure 23.1: The loosely-coupled conference model

In the fully-distributed conferencing model, shown in Figure 23.2, each participant has a signaling relationship with all of the other participants in the conference. Each participant sends media to all of the other participants.

In the tightly-coupled conferencing model, shown in Figure 23.3, each participant has a signaling relationship with a central conference server. The central conference server mixes the media received from different participants and distributes it to all of them.

Of course, the three conferencing models just described are not the only models that can be implemented with SIP. Many other variants are possible. For example, when the central conference server in a tightly-coupled conference is distributed among several SIP nodes, the resulting model is typically referred to as the cascaded conferencing model. In any case, the SIPPING conferencing framework focuses on the tightly-coupled conferencing model; the rest of the models are considered to be out of its scope.

Figure 23.2: The fully-distributed conference model

Figure 23.3: The tightly-coupled conference model

23.2.1 Signaling Architecture

Figure 23.4 shows the signaling architecture proposed by the SIPPING conferencing framework. The conference server consists of several logical functions: the conference policy, the conference policy server, and the focus, which includes the conference notification service.

The conference policy is the set of rules that define a conference. The conference policy includes information about the participants of the conference, the time and date when the conference will take place, the media streams the conference has, etc. Participants manipulate the conference policy (e.g., to add a video stream to an audio-only conference) through the conference policy server. The protocol between participants and the conference policy server is left unspecified.

Figure 23.4: Signaling architecture in the SIPPING framework

The focus interacts with the conference participants using SIP. It acts as a user agent towards all of the participants. The focus includes the conference notification service, which provides participants with information about the conference using the SIP event package for the conference state (specified in RFC 4575 [289]). This event package defines an XML-based format to convey conference-related information.

Figure 23.5 shows an example of a document that uses this format. This document, which is mostly self-explanatory, describes a conference and provides information about two of its participants: Bob and Alice. Bob was kicked out of the conference because he experienced bad voice quality, and Alice was brought into the conference by Mike. Note that even though the number of participants in the conference is 33 (see the <user-count> element), the document only provides detailed information about two of them (Bob and Alice). Conferencing servers can omit information about certain users for policy reasons.

The XML document in Figure 23.5 is already fairly long, even though it only carries information about two users. A document describing a large conference with many users would be much longer. In principle, every time a small change occurs in the conference (e.g., one user leaves the conference), the conference notification service would need to send a new large XML document that would be very similar to the last one it sent (e.g., the only difference would be in the elements related to the user that left). This would result in an inefficient use of bandwidth.
In order to avoid this situation, the SIP event package for conference state implements a mechanism for partial notifications. The “state” attribute indicates whether an element carries full or partial information. In addition, the “state” attribute can also indicate that an element has been deleted. Accordingly, the “state” attribute can take on the following values: full, partial, or deleted. An element with a “state” attribute with a value of partial carries only the information that has changed since the previous document was sent to the participant. If a parent element has a “state” of full, all of its child elements should also have a “state” of full. On the other hand, if a parent element has a “state” of partial, its child elements can have any “state”. The default value for the “state” attribute is full. The only elements that can carry a “state” attribute are <conference-info>, <users>, <user>, <endpoint>, <sidebars-by-val>, and <sidebars-by-ref>. In Figure 23.5, all of the “state” attributes have a value of full.

    <?xml version="1.0" encoding="UTF-8"?>
    <conference-info
        xmlns="urn:ietf:params:xml:ns:conference-info"
        entity="sips:conf233@example.com"
        state="full" version="1">
      <!-- CONFERENCE INFO -->
      <conference-description>
        <subject>Agenda: This month’s goals</subject>
        <service-uris>
          <entry>
            <uri>http://sharepoint/salesgroup/</uri>
            <purpose>web-page</purpose>
          </entry>
        </service-uris>
      </conference-description>
      <!-- CONFERENCE STATE -->
      <conference-state>
        <user-count>33</user-count>
      </conference-state>
      <!-- USERS -->
      <users>
        <user entity="sip:bob@example.com" state="full">
          <display-text>Bob Hoskins</display-text>
          <!-- ENDPOINTS -->
          <endpoint entity="sip:bob@pc33.example.com">
            <display-text>Bob’s Laptop</display-text>
            <status>disconnected</status>
            <disconnection-method>departed</disconnection-method>
            <disconnection-info>
              <when>2005-03-04T20:00:00Z</when>
              <reason>bad voice quality</reason>
              <by>sip:mike@example.com</by>
            </disconnection-info>
            <!-- MEDIA -->
            <media id="1">
              <display-text>main audio</display-text>
              <type>audio</type>
              <label>34567</label>
              <src-id>432424</src-id>
              <status>sendrecv</status>
            </media>
          </endpoint>
        </user>

Figure 23.5: Example of an XML-based conference description (part 1)

        <!-- USER -->
        <user entity="sip:alice@example.com" state="full">
          <display-text>Alice</display-text>
          <!-- ENDPOINTS -->
          <endpoint entity="sip:4kfk4j392jsu@example.com;grid=433kj4j3u">
            <status>connected</status>
            <joining-method>dialed-out</joining-method>
            <joining-info>
              <when>2005-03-04T20:00:00Z</when>
              <by>sip:mike@example.com</by>
            </joining-info>
            <!-- MEDIA -->
            <media id="1">
              <display-text>main audio</display-text>
              <type>audio</type>
              <label>34567</label>
              <src-id>534232</src-id>
              <status>sendrecv</status>
            </media>
          </endpoint>
        </user>
      </users>
    </conference-info>

Figure 23.6: Example of an XML-based conference description (part 2)

23.2.2 Media Architecture

The SIPPING conferencing framework describes the following media plane realizations: centralized server, endpoint server, media server component, distributed mixing, and cascaded mixers.

In the centralized-server model, a central server handles both signaling and media, as shown in Figure 23.3.

In the endpoint-server model, one of the endpoints behaves as the central server in the centralized-server model, as shown in Figure 23.7. The endpoint-server model is typically the result of a two-party call between two endpoints that transitions into an ad-hoc conference. This is the case when the users involved in the original two-party call decide to bring one or more additional users into the call at some point.

Figure 23.7: The endpoint-server model

The endpoint-server model works well when the endpoint performing the mixing does not have processing, bandwidth, or battery constraints. Conferences between endpoints with those constraints are better handled by a central server.

In the media-server-component model, the central server of the centralized-server model is divided into two servers: an application server and a mixing server. The application server interacts with the conference participants but does not have mixing capabilities. The mixing server performs the actual media mixing. The interface between the application server and the mixing server is based on SIP. The application server can use SIP mechanisms such as third-party call control (specified in RFC 3725 [282]) to instruct the mixing server how to mix the conference’s media streams.

The SIPPING conferencing framework does not talk about distributed conference servers that use a protocol other than SIP (e.g., H.248 [189]) between the server handling SIP signaling (i.e., hosting the focus) and the server performing the mixing. However, this model can be considered a special case of the centralized-server model in which the internal structure of the server is distributed.

In the distributed-mixing model, the central server of the centralized-server model handles signaling but not media. The central server does not have any media mixing capabilities; instead, it instructs users to exchange media among themselves. In this model, the conference server is, effectively, a third-party call controller (as specified in RFC 3725 [282]). Figure 23.8 shows how, in this model, media can be exchanged using unicast or multicast.

In the cascaded-mixers model, the mixing functionality is distributed among several physical mixers. The central server handling the signaling of the conference coordinates all of the mixers so that all users receive the conference’s media correctly.
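The mixing step that the centralized-server, endpoint-server, and media-server-component models all rely on can be illustrated with a small sketch. It is only a sketch under stated assumptions, not something the framework specifies: audio is assumed to arrive as time-aligned frames of 16-bit PCM samples, and each participant is assumed to receive the sum of everybody else's audio (a common choice that avoids echoing a speaker's own voice back). The names `mix_for_each` and `clip` are hypothetical.

```python
# Assumption: 16-bit signed PCM, so mixed samples are clipped to this range.
PCM_MIN, PCM_MAX = -32768, 32767


def clip(sample: int) -> int:
    """Clamp a mixed sample to the 16-bit PCM range."""
    return max(PCM_MIN, min(PCM_MAX, sample))


def mix_for_each(frames: dict[str, list[int]]) -> dict[str, list[int]]:
    """Return, for every participant, the mix of all the OTHER participants.

    `frames` maps a participant name to one frame of samples; all frames
    are assumed to cover the same time interval and have the same length.
    """
    if len({len(samples) for samples in frames.values()}) > 1:
        raise ValueError("all frames must have the same number of samples")
    # Sum every participant's sample at each position once...
    totals = [sum(column) for column in zip(*frames.values())]
    # ...then subtract each participant's own contribution from the total.
    return {
        name: [clip(total - own) for total, own in zip(totals, samples)]
        for name, samples in frames.items()
    }


if __name__ == "__main__":
    mixed = mix_for_each({
        "alice": [1000, -2000],
        "bob": [500, 500],
        "carol": [30000, -30000],
    })
    for name, samples in mixed.items():
        print(name, samples)
```

In the endpoint-server model the same computation would run on one of the participants' devices, which is why processing and battery constraints matter there.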
Figure 23.8: The distributed-mixing model

23.3 The XCON Conferencing Framework

As discussed earlier, the XCON working group was chartered to work on generalizing the work on conferencing performed in SIPPING, which was specific to SIP. The XCON framework (specified in the Internet-Draft “A Framework and Data Model for Centralized Conferencing” [82]) defines the conferencing architecture shown in Figure 23.9. This figure shows a conference system able to host several conferences. That is why the figure shows more than one conference object.

23.3.1 Conference Objects

A conference object contains all of the information related to a given conference. It is the same concept as the conference policy in the SIPPING conferencing framework (see Figure 23.4) with a different name. Figure 23.5 shows an example of the XML-based format to describe conference policies developed by the SIPPING working group (which is specified in RFC 4575 [289]). The XCON working group extended this format so that it can be used to describe more general conferences (i.e., not only SIP-based conferences) and to provide more information about a given conference (e.g., floor-control-related information was missing from the original format and was added by the XCON working group). The resulting format is referred to as the XCON data model (which is specified in the Internet-Draft “Conference Information Data Model for Centralized Conferencing (XCON)” [224]).

The improvements in the XCON data model, with respect to the original format defined by the SIPPING working group, include the ability to carry different types of URIs and the inclusion of information that relates to floor control, conference scheduling, and media controls (e.g., a control to mute a media stream).

In order to create a conference, it is necessary to create its conference object.
The initial values for the variables of a conference object are typically taken from a conference blueprint. A conference blueprint is a template to create conference objects. For example, a conference system may have a conference blueprint with the typical values to create an audio-only conference.

Figure 23.9: XCON architecture

23.3.2 Conference Control Server

Users can manipulate conference objects and, thus, the properties of any conference, using a conference control protocol. Such a protocol runs between the participant’s conference control client and the conference control server. The XCON working group is chartered to develop a conference control protocol. We expect this working group to specify such a protocol in the future.

One of the main decisions concerning this protocol is whether it should follow a semantic approach or a syntactic approach. A semantically-oriented protocol would have primitives to perform conference-related operations such as create a conference, add a user to a conference, and remove a media stream from a conference. Such primitives would have an effect on a conference object (which is described by an XML document). For example, the creation of a conference would create a new conference object. The addition of a user would add a new <user> element to the XML document describing the conference object.

A syntactically-oriented protocol would have primitives to operate directly at the XML level. For example, in order to add a user to a conference, the protocol would directly instruct the conference control server to add a <user> element to the XML document describing the conference object.

Both approaches have advantages and disadvantages. A syntactically-oriented protocol may initially be more complex since it would need to provide general XML manipulation mechanisms.
On the other hand, it would not need to be extended in order to manipulate new data model elements that may be defined in the future. A semantically-oriented protocol may initially be simpler and, in general, more efficient but would need to be extended in order to perform new operations. Specifying policies (e.g., only the moderator can add new users to the conference) seems to be easier if the semantic approach is followed.

The XCON working group started working on an XCAP-based protocol that followed a syntactic approach. However, that protocol was abandoned and, at present, it seems that the conference control protocol to be developed by XCON will follow a semantic approach.

23.3.3 Foci and Notification Service

As in the SIPPING conferencing framework, an XCON focus has a signaling relationship with the user agents in the conference. However, in the XCON framework, a conference can have multiple foci, each one handling a different protocol (e.g., SIP and H.323).

In the SIPPING framework, both the focus and the notification service used SIP and, thus, were part of the same logical entity. The XCON framework separates them into two different logical entities because they can use different protocols.

As discussed earlier, the XCON data model extends the XML-based format used by the SIPPING notification service (which is specified in RFC 4575 [289]). The XCON notification service needs to be able to use this extended format (i.e., the XCON data model) in its notifications. An extension to the SIP event package for conference state (also specified in RFC 4575 [289]) has been defined so that the event package can carry information in the format specified in the XCON data model (this extension is specified in the Internet-Draft “Conference Event Package Data Format Extension for Centralized Conferencing (XCON)” [113]).

23.3.4 Floor Control Server

Floor control is used to manage the access to a shared resource.
Examples of resources in a conferencing environment are a shared whiteboard, a video stream, and a voice stream. The user that has the floor corresponding to a resource at a given moment is allowed to access the resource. For example, the user that has the floor corresponding to a shared whiteboard is allowed to draw on the whiteboard.

It is important to note the difference between not being allowed to do something and actually being kept from doing it. Let us think of a face-to-face conference where all participants have their own microphone. The conference’s chair will indicate which participant can speak (e.g., to ask a question) at a given time. However, the chair does not need to manage access to the microphones. If the participants are polite enough, they will only talk into their microphones when they are told to by the chair. However, if participants start talking when it is not their turn, the chair may have to disable all of the microphones except that of the participant that has the floor at any given time.

Therefore, the fact that a conference uses floor control does not imply that floor-control-related decisions are enforced in any way. They may or may not be enforced, depending on the environment.

[...]

... than one floor in an atomic operation. Such a floor request will only be granted when all of the floors requested can be granted to the participant.

Figure 23.10: BFCP architecture

23.4.1 Contacting the Floor Control Server

In order to contact a conference’s floor control server, the conference participants (which will act as BFCP clients when they contact the ...
entities: a BFCP client, the floor control server, and a floor chair.

Figure 23.12: BFCP message flow

The floor chair in Figure 23.12 is responsible for the floor whose ID is 100. The floor chair joins the ongoing conference and requests information about floor 100 by sending a FloorQuery message to the floor control server. The floor control server informs the floor chair that ...

... session description generated by a floor control server (only “m” and “a” lines are shown for simplicity reasons). The “setup” and “connection” attributes (specified in RFC 4145 [319]) relate to the establishment of the TCP connection. The “fingerprint” attribute (specified in RFC 4572 [206]) relates to the establishment of TLS on top of the TCP connection in order to provide integrity protection and confidentiality ...

... addition, the floor control server’s SDP description (it can be the offer or the answer in the offer/answer exchange) also contains parameters related to BFCP. These parameters are the conference ID and the floor IDs to be used by the client, the role each endpoint will perform (client or floor control server), and the relation between floors and media streams (the SDP format to establish BFCP connections ...

... example, two endpoints establishing a floor-controlled shared whiteboarding session between them need to decide which endpoint acts as the floor control server. The “confid” and the “userid” attributes provide the client with the conference ID and the client’s BFCP user ID, respectively. The client will use these values in the BFCP messages it sends to the floor control server. The “floorid” attributes associate floors ...
the data it needs using the conference event package. The XCON data model describes how to encode all of these data in an XML document. Once the client obtains, via the conference event package, all of the data it needs, it establishes a TCP connection to the floor control server. The client uses the conference ID and the user ID obtained through the conference event package in its BFCP messages.

23.4.2 ...

... establish a connection with a floor control server: inside and outside the context of an offer/answer exchange.

23.4.1.1 Inside an Offer/Answer Exchange

Connections within the context of an offer/answer exchange are established in the same manner as any other media stream. The client and the server perform an offer/answer exchange using SIP where they exchange the parameters needed to establish the connection (e.g., ...

... connection, the client needs to obtain the same data as when an offer/answer exchange is used (i.e., the server’s IP address and port number, the conference and user ID, floor IDs and their relationship with resources such as media streams, etc.). Instead of getting all of these data in a session description from the floor control server, the client typically obtains all of the data it needs using the conference ...

... instruct the conference’s mixer to ignore incoming media from participants that do not hold the floor.

A conference can have multiple floors. Each of them can control the access to a different resource within the conference. A floor control server can have different policies regarding multiple floors. For example, in a video conference, the floor control server can grant the video floor to the participant holding the ...
23.4 THE BINARY FLOOR CONTROL PROTOCOL (BFCP)

In XCON, the enforcement of floor-control-related decisions is outside the scope of floor control. That is, the floor control server uses a floor control protocol to communicate with its clients. However, if the floor control server wants to enforce its decisions, it will use a different protocol. For example, the floor control server could ...
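The floor control behaviour described in this chapter (a request for several floors is granted only when every requested floor can be granted, and enforcement is a separate concern) can be sketched as a toy in-memory server. This is an illustrative sketch only: it does not implement BFCP messages or its wire format, and the class `FloorControlServer` and its methods `request`, `release`, and `may_send` are hypothetical names introduced here. The floor IDs in the usage lines reuse the value 100 from the text; 101 is made up.

```python
from dataclasses import dataclass, field


@dataclass
class FloorControlServer:
    """Toy floor control server that tracks which user holds each floor."""

    floor_ids: set[int]
    holders: dict[int, str] = field(default_factory=dict)

    def request(self, user: str, floors: list[int]) -> bool:
        """Grant ALL requested floors in one atomic step, or none of them."""
        wanted = set(floors)
        if not wanted or not wanted <= self.floor_ids:
            return False  # empty request or unknown floor: deny
        # A floor is available if nobody holds it or the requester already does.
        if any(self.holders.get(floor, user) != user for floor in wanted):
            return False
        for floor in wanted:
            self.holders[floor] = user
        return True

    def release(self, user: str, floors: list[int]) -> None:
        """Release the listed floors, ignoring any the user does not hold."""
        for floor in floors:
            if self.holders.get(floor) == user:
                del self.holders[floor]

    def may_send(self, user: str, floor: int) -> bool:
        """Decision only: whether `user` currently holds `floor`.

        Enforcing it (e.g., telling a mixer to drop media from other
        participants) would happen elsewhere, through a different interface.
        """
        return self.holders.get(floor) == user


if __name__ == "__main__":
    server = FloorControlServer({100, 101})
    print(server.request("alice", [100]))     # alice takes floor 100
    print(server.request("bob", [100, 101]))  # denied: floor 100 is taken
    server.release("alice", [100])
    print(server.request("bob", [100, 101]))  # both floors are now free
```

Keeping `may_send` as a pure query mirrors the separation described above: the floor control server decides who holds a floor, while anything that actually blocks media is a different component.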
