Introduction
In recent years, the exchange of multimedia documents, which combine elements such as audio, images, text, and video, has gained significant popularity, especially with the emergence of languages like SMIL and SVG. Additionally, high-speed internet connectivity is now widely available across the globe.
In our scenario, the user works within specific contexts, so a general multimedia document is often unsuitable. To address this, users seek to customize multimedia content to fit their particular usage needs, which calls for an adaptive multimedia document system. There are two approaches to achieve this: a local software solution and a distributed system. While local software has limitations, such as device capacity and the need for updates, a distributed system can avoid these issues.
Fig 1.1 A multimedia document adaptation context
This internship is conducted within the MM (MultiMedia) team of the Signal and Image Processing Department at TELECOM-ParisTech (formerly ENST Paris). It follows the doctoral research of Z. Kazi-Aoul [KA07] on the adaptation of multimedia documents through the composition of elementary services. In this context, a decision-making engine matches a multimedia document's description with its usage context, which yields a derived adaptation description. This adaptation is represented as a composition tree of elementary services.
[Figure content: a colour image with English text is adapted into a black-and-white image with French text]
During the internship, we will strive to establish an appropriate formalism to automate the use of these descriptions. This will likely involve leveraging research on semantic web services, focusing on the definition of preconditions and postconditions.
Our work has led us to select BPEL as the language for describing the composition of elementary services, with adaptations performed primarily by web services. The focus of this internship is on developing a set of deployable web services within this framework, along with implementations that make effective, practical use of BPEL execution engines for adaptation.
The main contribution of this internship is a demonstrator of the possibilities offered by distributed multimedia web services, in particular for multimedia document adaptation.
This document is structured into three main parts. The first part includes the introduction and the state of the art. It is followed by three technical chapters: Multimedia Document Adaptation via Web Services, Semantic Web Services, and Web Service Composition, which may be challenging to follow. In addition, Chapter 4, on WSDL-S, is still under development and relies solely on a W3C proposal, with no tools available for validating results. The final part comprises two chapters that analyze the results and provide a conclusion.
State of the art
Classification of services
In his thesis, Mr. Kimiaei proposed a classification of services. To adapt multimedia content, we consider three processes: transcoding, transmoding, and transformation of a resource.
Architecture of the adaptation system
The first approach is client-side adaptation, where the user context (such as preferences and device capabilities) is stored on the client. The server's role is limited to sending multimedia documents. In [RAN06], the authors proposed a client-side adaptation system aimed at reducing energy consumption during display.
The second approach is server-side adaptation in a client/server model, which addresses the limitations of terminal devices. This method allows the server to maintain multiple versions of a multimedia document tailored to individual user contexts. However, it requires a high-capacity hard drive. Recent projects, such as [PAN06], have explored web service-based multimedia content adaptation, proposing the RMob (Rich Media on-line broker) architecture, which relies solely on transcoding via web services. While this architecture shares similarities with Z. Kazi-Aoul's thesis, it lacks elements such as transmoding and transformation.
The third approach to multimedia document adaptation is also client/server-based, but adds an intermediary module between the client and the server. In this setup, the number of nodes in each adaptation system is fixed in advance, which limits flexibility. The load at each intermediary node must be calculated to establish an optimal system configuration.
PAAM
In her thesis, Z. Kazi-Aoul proposed a functional architecture for PAAM (Peer-2-Peer Architecture for the provision of Adaptable Multimedia contents), shown in Fig 2.1.
Fig 2.1 Functional architecture of PAAM [KA07]
The use of PAAM can be described as follows: a user submits a request that includes multimedia documents and a usage context to the planner. The planner then obtains a description of the document and, after jointly analyzing this description and the usage context, prepares an adaptation plan that it passes to the adaptation manager. The adaptation manager searches the network for elementary adapters capable of executing the steps of the adaptation plan and invokes them to fulfill the user's request.
The architecture of PAAM is based on a peer-to-peer (P2P) model, meaning the number of nodes in the network is not fixed. Each node can act as either a client or a server, so an adaptation management system is needed to handle the varying roles of these nodes within the network.
In her thesis, Z. Kazi-Aoul showed that BPEL is an effective technology for composition. She also provided a prototype of the PAAM architecture; however, this prototype uses very limited methods, which allowed only a limited number of tests and validations. Furthermore, she did not implement BPEL in her prototype. This is why I am undertaking this internship as a follow-up to her thesis.
Web services for multimedia document processing
Web service
A web service is a technology that enables applications to communicate remotely over the Internet, regardless of programming language or execution platform. This communication relies on the HTTP protocol and follows a request-response model, using XML-formatted messages, specifically SOAP messages. Consequently, these communications typically pass through firewalls without being filtered.
A web service is described by a WSDL file, which specifies its interfaces, including data types and function parameters. A WSDL document consists of seven elements:
- Type: a container for data type definitions built on existing types (string, int, float, date, etc.)
- Message: an abstract definition of the data exchanged between web services
- Operation: an abstract description of an action provided by the web service
- Port Type: an abstract set of operations provided by one or more endpoints
- Binding: a concrete protocol and data format specification for a port type
- Port: a single endpoint defined as the combination of a binding and a network address
- Service: a collection of related endpoints
3.1.1 How to create a web service
Today, both open-source tools like Eclipse and NetBeans and commercial options such as WebSphere, JBuilder, and MyEclipse enable developers to easily create web services in Java through a graphical interface.
There are two primary approaches to building a new web service: the top-down method and the bottom-up method. The top-down approach uses a WSDL file to generate Java classes, after which the web service is implemented. In contrast, the bottom-up method generates a web service directly from an existing Java class. Both methods are practical and well explained in [WS15], which provides further details on creating web services.
Synchronous web services are only suitable for short processing times, as they use a single channel for both receiving requests and sending responses. If processing takes one or two days, the connection between the server and the client may be lost, and the response can no longer be received. To avoid this, asynchronous services are recommended, as they use two separate channels: one for receiving client requests and another for sending responses. The connection is then only active while a request or a response is being transmitted, which makes long processing tasks more reliable.
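The difference between the two invocation styles can be sketched in plain Java. This is a minimal illustration only, not Axis2 code: the class AdaptationClient, its method names, and the simulated transcode task are hypothetical. The asynchronous variant returns immediately and delivers the result later through a callback, which plays the role of the second channel described above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

// Hypothetical sketch: contrasts a blocking call with a callback-based call.
final class AdaptationClient {

    // Stand-in for a long-running adaptation; a real service would call a remote endpoint.
    static String transcode(String document) {
        return document + ".converted";
    }

    // Synchronous style: the caller waits on the same channel until the result is ready.
    static String invokeSync(String document) {
        return transcode(document);
    }

    // Asynchronous style: the request returns at once; the response is
    // delivered later to the callback supplied by the caller.
    static CompletableFuture<Void> invokeAsync(String document, Consumer<String> onResponse) {
        return CompletableFuture.supplyAsync(() -> transcode(document))
                                .thenAccept(onResponse);
    }
}
```

A caller of invokeAsync can continue with other work and is notified through onResponse once the adaptation has finished.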
The Axis2 web service server [AXIS2] supports asynchronous web services. For details on how to create an asynchronous web service and invoke it from a client, refer to [WS-ASYNC]. However, the server configuration that actually works differs slightly from the guidelines in [WS-ASYNC]. Therefore, I will focus solely on this configuration.
To create an asynchronous web service, start by developing a synchronous service using Axis2, as outlined in section 3.1.1. The web service generated by Axis2 includes a transport method definition file, typically located at /code/mobic/WebContent/WEB-INF/services/Video/META-INF/services.xml. To complete the setup, add the line shown in Fig 3.1.
Fig 3.1 Configuration of an asynchronous web service
UDDI (Universal Description, Discovery, and Integration) is a specification that outlines how to publish and discover web services on the network. To publish a web service, an XML message in ebXML (Electronic Business using eXtensible Markup Language) format is used, which carries essential information such as the service's IP address, domain names, and details on how the web service is to be used. Discovery relies on a search engine built into the site of the chosen UDDI operator, which lets us refine the search by several criteria: web service name, company name, etc.
Categories of processing web services
In our multimedia document adaptation system architecture, we group adaptations into three main categories: transcoding, transmoding, and transformation. These categories are illustrated in the diagram shown in Fig 3.2.
The transcoding category covers changing the encoding of a media element, such as text, an image, audio, or video. For images, audio, and video, transcoding means changing the format type. An example is given in Tab 3-1.
Original format    Result format
Tab 3-1 An example of transcoding
Input documents containing the language used, the format, etc.
Text transcoding involves the encoding of characters, specifically the conversion between different character encodings, such as ASCII and UTF-8.
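Since text transcoding needs no external tool, it can be sketched with the standard Java library alone. The class name TextTranscoder below is hypothetical; the sketch decodes the input bytes with the source charset and re-encodes them with the target charset.

```java
import java.nio.charset.Charset;

// Hypothetical helper illustrating character-encoding conversion.
final class TextTranscoder {

    // Decode the bytes using the source charset, then re-encode with the target charset.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        String decoded = new String(input, from);
        return decoded.getBytes(to);
    }
}
```

For example, calling transcode with StandardCharsets.ISO_8859_1 as the source and StandardCharsets.UTF_8 as the target turns the single byte 0xE9 (the letter é) into the two bytes 0xC3 0xA9.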
The transmoding category involves changing the type of media. For instance, if a device lacks the font needed to display Cyrillic text properly, an image of that text can be created, resulting in a conversion from text to image. The following paragraphs describe the transmoding services for multimedia document processing.
This service converts text into images; the number of images depends on the user's needs, such as the desired image size, and on the length of the text.
This service performs OCR extraction of the text contained in an image. The result returned by the web service may therefore contain errors (incorrect characters).
This service converts text into audio, providing automated reading. It is useful in our increasingly busy lives, allowing us to multitask while staying informed about topics such as politics and economics.
Text-to-speech conversion is closely tied to the language used, as each language has its own characteristics, such as intonation and structure. Consequently, a single service covering all languages is not feasible. However, solutions exist for various languages, including English and French.
This service converts an audio file into text, that is, automatic transcription, for purposes such as reducing size.
This service extracts an image from a video. It can, for example, enable automatic advertising based on images in a video.
Taking a series of images can summarize the scenes of a video, giving a clear overview with just a few pictures. This also requires less storage space than the entire video.
The transformation category involves altering a media file without changing its encoding format. For instance, this may include resizing an image through scaling, or cropping its edges. The following sections describe the transformation services.
This service translates text from one language to another, such as from English to French. Existing tools like Google Translate can be used for this purpose.
3.2.3.2 Cropping an image or a video
This service crops an image or a video, meaning that both dimensions (width and height) of the image or video are trimmed.
3.2.3.3 Scaling an image or a video
This service changes the scale of an image or a video. Scaling consists of changing one dimension in proportion to the other.
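The arithmetic behind scaling can be sketched as follows. The class Scaler and its method are hypothetical names, and the sketch assumes that the caller chooses the new width and that the height follows from the original aspect ratio.

```java
// Hypothetical helper: derives the second dimension from the first when scaling.
final class Scaler {

    // Returns {newWidth, newHeight}. The height is computed from the requested
    // width so that the original width/height ratio is preserved
    // (rounded to the nearest pixel, never below 1).
    static int[] scaleToWidth(int width, int height, int newWidth) {
        if (width <= 0 || height <= 0 || newWidth <= 0) {
            throw new IllegalArgumentException("dimensions must be positive");
        }
        long scaled = Math.round((double) height * newWidth / width);
        int newHeight = (int) Math.max(1L, scaled);
        return new int[] { newWidth, newHeight };
    }
}
```

With a 640 x 480 source and a requested width of 320, the sketch yields 320 x 240.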
3.2.3.4 Changing the resolution of an image
This service changes the dimensions of an image: it gets a new width and a new height, but its content is unchanged.
3.2.3.5 Converting an image or a video to black and white
This service converts a colour image or video into a monochrome one, replacing the original colours with shades of gray.
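As an illustration of what such a conversion computes, the sketch below maps one packed ARGB pixel to a shade of gray. It is not the algorithm of any particular tool: the class name Grayscale is hypothetical and the choice of the common luma weights 0.299, 0.587, and 0.114 is an assumption made for the example.

```java
// Hypothetical helper: per-pixel conversion from colour to gray.
final class Grayscale {

    // Converts one packed ARGB pixel to gray using the weights
    // 0.299 R + 0.587 G + 0.114 B; the alpha channel is kept unchanged.
    static int toGray(int argb) {
        int a = (argb >>> 24) & 0xFF;
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;
        int y = (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
        return (a << 24) | (y << 16) | (y << 8) | y;
    }
}
```

Applying toGray to every pixel of an image gives its grayscale version; a video is handled frame by frame in the same way.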
3.2.3.6 Changing the sampling rate of an audio file
This service changes the sampling rate of an audio file.
3.2.3.7 Changing the bitrate of an audio file or a video
This service changes the playback bitrate, in bytes. This matters when the available network bandwidth is low.
3.2.3.8 Reducing the size in bytes
This service reduces the "weight" (size in bytes) of an image. The weight depends on the number of bits used to represent a pixel.
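The link between bits per pixel and weight can be made concrete with a small calculation. The sketch assumes an uncompressed image, and the class name ImageWeight is hypothetical.

```java
// Hypothetical helper: size of an uncompressed image from its pixel depth.
final class ImageWeight {

    // Size in bytes of a width x height image stored at the given bits per pixel.
    static long sizeInBytes(int width, int height, int bitsPerPixel) {
        long bits = (long) width * height * bitsPerPixel;
        return (bits + 7) / 8; // round up to whole bytes
    }
}
```

A 640 x 480 image weighs 921600 bytes at 24 bits per pixel and 307200 bytes at 8 bits per pixel, so lowering the pixel depth divides the weight by three in this case.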
This service reduces noise in images, audio, and video.
Implementation
As mentioned in section 3.1.1, creating a web service from a Java class is straightforward. Therefore, this section focuses exclusively on the implementation in Java.
In general, it is not feasible to carry out all the tasks outlined in section 3.2 in Java alone. However, existing programs, such as those written in C, can serve as a foundation. This idea is central to the adaptation of multimedia documents.
3.3.1.1 Software used for the web services
Tab 3-2 lists the software used for adaptation (all of it open source).
FFmpeg is used for audio and video processing and serves the three categories. ImageMagick handles image-related tasks and is used in two categories: transcoding and transformation. MPlayer can convert videos to black and white, while Tesseract extracts text from images.
Tab 3-2 Description of the software used for the web services
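Calling one of these external programs from Java can be sketched with ProcessBuilder. The example below changes the bitrate of an audio file with ffmpeg; the class FfmpegBitrate and its method names are hypothetical, the file names are placeholders, and the sketch assumes that an ffmpeg executable is available on the PATH. It is an illustration of the approach, not the code used in the PAAM services.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical wrapper around the ffmpeg command line.
final class FfmpegBitrate {

    // Builds the command line: -y overwrites an existing output file,
    // -i names the input file, -b:a sets the audio bitrate (in kbit/s here).
    static List<String> command(String input, String output, int kbps) {
        return List.of("ffmpeg", "-y", "-i", input, "-b:a", kbps + "k", output);
    }

    // Starts ffmpeg, waits for it to finish, and returns its exit code
    // (0 on success). Requires ffmpeg to be installed.
    static int run(String input, String output, int kbps)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(input, output, kbps))
                .inheritIO()
                .start();
        return process.waitFor();
    }
}
```

Keeping the command construction separate from its execution makes the command easy to inspect or log before the external program is started.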
When it comes to data management, there are two main approaches. The first involves storing data on a local drive, which is straightforward and easy to implement. However, this method has drawbacks: it does not allow remote data storage, and it requires a large-capacity hard drive because multimedia files are large. In some cases, users may be reluctant to provide a large hard drive for the machine, or there may be associated risks.
The second approach uses common protocols such as HTTP and FTP to send processed documents to a remote server. While this method may take slightly longer than the first, it avoids the drawbacks described above. In this internship, I opted for the second approach, using FTP to send processed documents, as it simplifies the transfer of large files, the configuration of the PAAM system, and the deployment of PAAM compared to using the HTTP protocol.
3.3.2 Design of the PAAM web services
Fig 3.3 How transcoding is carried out
Figure 3.3 illustrates transcoding: FFmpeg is used for video and audio transcoding, and ImageMagick for image transcoding. Text transcoding does not require external libraries. The multimedia document is sent through the data transport block; this detail will not be repeated in the following sections.
Fig 3.4 illustrates transmoding. To transmode video into a slideshow and video into images, ffmpeg is used. For converting text to audio, eSpeak is used. For converting images into text, tesseract is used. Converting text into images requires no external libraries.
Fig 3.4 How transmoding is carried out
3.3.2.3 Transformation
a) Text transformation
Fig 3.5 How text transformation is carried out
Figure 3.5 illustrates text transformation, which involves a single service dedicated to text translation. For this purpose, we use the Google Translate tool. Image transformation is considered next.
Fig 3.6 How image transformation is carried out
Figure 3.6 illustrates image transformation, which involves six services. These services are carried out with the ImageMagick tool. Video transformation is covered next.
Figure 3.7 illustrates video transformation, which comprises five services. Just as image transformation relies on ImageMagick, these five services rely on ffmpeg.
[Diagram labels: changing the resolution of an image, converting an image to black and white, reducing the size in bytes, noise reduction, data transport]
Fig 3.7 How video transformation is carried out
d) Audio transformation
Fig 3.8 How audio transformation is carried out
Figure 3.8 illustrates audio transformation, which involves three distinct services. As with video transformation, audio transformation is carried out with ffmpeg.
In sections 3.3.2.1, 3.3.2.2, and 3.3.2.3, I outlined the implementation schemes for the web services. In this section, I present two sequence diagrams that illustrate the step-by-step execution of a web service using existing tools. The example is changing the bitrate of an audio file (see Fig 3.8 for the implementation diagram).
These diagrams are shown in Fig 3.9 and Fig 3.10.
[Diagram labels: changing the sampling rate of an audio file, changing the bitrate of an audio file, converting a video to black and white, changing the bitrate of a video, noise reduction, data transport]
[Sequence diagram: participants User, AudioService, Audio, FFMPEG, FTP Server; messages 1 to 4: change the bitrate of an audio file]
Fig 3.9 Sequence diagram of the audio bitrate change, synchronous case
[Sequence diagram: participants User, AudioService, Audio, FFMPEG, FTP Server; messages 1 to 4: change the bitrate of an audio file]
Fig 3.10 Sequence diagram of the audio bitrate change, asynchronous case
Semantic web service: WSDL-S
PAAM ontology
To construct the PAAM ontology, we rely on G. Hagos's thesis [HA06], as his proposed model fits the PAAM project well. His thesis offers a straightforward approach to developing an ontology for WSDL-S, where each service operation is defined by a corresponding concept in the ontology. He also provides a method for defining input and output conditions based on their respective types. Several concepts related to operations in the WSDL document are shown in Fig 4.1.
Figure 4.1 illustrates the MultimediaAdaptationServices concept, which comprises four sub-concepts: ImageAdaptationServices, AudioAdaptationServices, TextAdaptationServices, and VideoAdaptationServices. Sub-concepts are indicated by the subClassOf relation; for instance, AudioTranscodingAdaptationServices is a sub-concept of AudioAdaptationServices. As discussed in Chapter 3, the audio transcoding service consists of two operations, which are represented by two sub-concepts of AudioTranscodingAdaptationServices. The diagram does not show all the concepts of our PAAM ontology; the complete list of PAAM concepts can be found in Appendix 3.
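The subClassOf hierarchy can be pictured as a simple parent map. The sketch below is not an OWL implementation: the class ConceptHierarchy is hypothetical, it assumes each concept has a single parent and that the hierarchy contains no cycles, and only the concept names come from the PAAM ontology described above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of a single-parent concept hierarchy.
final class ConceptHierarchy {
    private final Map<String, String> parentOf = new HashMap<>();

    // Records the statement "child subClassOf parent".
    void addSubClass(String child, String parent) {
        parentOf.put(child, parent);
    }

    // True if 'concept' equals 'ancestor' or reaches it by following
    // subClassOf links upwards (assumes the hierarchy is acyclic).
    boolean isSubClassOf(String concept, String ancestor) {
        for (String c = concept; c != null; c = parentOf.get(c)) {
            if (c.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }
}
```

After registering AudioAdaptationServices under MultimediaAdaptationServices and AudioTranscodingAdaptationServices under AudioAdaptationServices, a query for AudioTranscodingAdaptationServices against MultimediaAdaptationServices succeeds, while the reverse query fails.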
1 OWL-S is an ontology within the OWL-based framework of the Semantic Web.
2 The Member Submission process allows technologies or other ideas to be proposed to the W3C team.
Fig 4.1 Concept hierarchy in the PAAM ontology
Annotation of the WSDL document
Before adding annotations to the WSDL document, the namespace must be defined, as shown in Tab 4-1. Note also that annotation requires WSDL version 2.0, whereas the WSDL document generated in section 3.1.1 is in version 1.1. The WSDL document must therefore be converted to version 2.0 before it is annotated, which can be done with the tool mentioned in [WCON].
Prefix    Namespace name
wssem     http://lsdis.cs.uga.edu/projects/meteor-s/wsdl-s/examples/WSSemantics.xsd
Tab 4-1 Namespace of the WSDL document
In general, each operation element is annotated with a concept from the ontology. This annotation is defined using the modelReference attribute, as illustrated in Fig 4.2.
An XML namespace qualifies the element and attribute names used in an XML instance. An XML instance can incorporate names from multiple XML vocabularies, and by assigning a namespace to each vocabulary, ambiguities between identical element or attribute names can be resolved. All element names within a namespace must be unique.
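Attaching a modelReference attribute in the wssem namespace can be sketched with the standard Java DOM API. The fragment is deliberately simplified: the class OperationAnnotator is hypothetical, the operation element is created on its own rather than inside a complete WSDL document, and the concept URI passed by the caller is assumed to identify a concept of the ontology.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: adds a wssem:modelReference attribute to an operation.
final class OperationAnnotator {
    // Namespace name listed for the wssem prefix.
    static final String WSSEM_NS =
        "http://lsdis.cs.uga.edu/projects/meteor-s/wsdl-s/examples/WSSemantics.xsd";

    // Creates a bare <operation> element and annotates it with the given concept URI.
    static Element annotate(String operationName, String conceptUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();
            Element operation = doc.createElement("operation");
            operation.setAttribute("name", operationName);
            // The qualified name combines the wssem prefix with the attribute name.
            operation.setAttributeNS(WSSEM_NS, "wssem:modelReference", conceptUri);
            doc.appendChild(operation);
            return operation;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no XML parser available", e);
        }
    }
}
```

Reading the annotation back uses getAttributeNS with the same namespace name and the local name modelReference.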
[Fig 4.1 diagram labels: VideoAdaptationServices and AudioSpeedAdaptationService, among others, linked by subClassOf; other concepts also exist]
4.2.2 Input and output elements
Every operation has two elements: one for input and one for output. Each element is of either a simple or a complex type. To annotate the input and output of an operation, we annotate the definitions of their respective types.
To annotate a simple type, we follow the method outlined in section 4.2.1, which uses the wssem namespace and the modelReference attribute. An example is shown in Fig 4.3.
Fig 4.3 Annotation of a simple type
Before discussing the annotation of complex types, an example of one is given in Fig 4.4.
For complex types, there are two annotation methods: the first annotates all the leaves, while the second annotates the complex type itself. The second method requires defining a schema mapping between the content and the ontology. In contrast, the first method adds information to every sub-element, which is clearer for newcomers to the field. For this project, we use the first method.
Fig 4.5 gives a few lines that handle the example from Fig 4.4.
Fig 4.5 Annotation of a complex type
1 A simple type consists of one of the fundamental types of the WSDL document, as described in section 3.1.
2 A complex type consists of more than one fundamental type of the WSDL document, as described in section 3.1.
3 A leaf is a sub-element of a complex type. In Fig 4.4, there are two leaves, named "stUrl" and "to_format".
4 In the XML domain, a schema defines an XML structure. In other words, the content of the XML is defined by the schema.
A precondition defines a set of statements that must hold for a web service to be invoked. It can specify requirements that need to be fulfilled, such as "must have an existing account with this company," or restrictions like "only Vietnamese customers can be served." Preconditions are specified as child elements of the operation, as illustrated in the schema shown in Figure 4.6.
The elements of this schema are explained below:
/precondition: This element specifies the semantic annotation for the operation.
/precondition/@name: The name attribute specifies an identifier that is unique among the preconditions in the WSDL document.
/precondition/@modelReference: The modelReference attribute specifies the URI of the part of a semantic model that describes the precondition. The modelReference and expression attributes are mutually exclusive.
/precondition/@expression: The expression attribute (see Figure 4.6) contains an expression that defines the precondition. The format of this expression is determined by the semantic representation language used to express the semantic model. The modelReference and expression attributes are mutually exclusive.
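The rule that modelReference and expression are mutually exclusive can be checked when a precondition is built. The sketch below is a plain Java model and not part of any WSDL-S toolkit: the class Precondition is hypothetical, and it only rejects the case where both attributes are present.

```java
// Hypothetical in-memory model of a WSDL-S precondition.
final class Precondition {
    final String name;           // unique identifier among the preconditions
    final String modelReference; // URI of a part of the semantic model, or null
    final String expression;     // expression in the semantic language, or null

    Precondition(String name, String modelReference, String expression) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("name is required");
        }
        // The two attributes are mutually exclusive: reject the case where both are set.
        if (modelReference != null && expression != null) {
            throw new IllegalArgumentException(
                "modelReference and expression are mutually exclusive");
        }
        this.name = name;
        this.modelReference = modelReference;
        this.expression = expression;
    }
}
```

The same check would apply to effects, whose modelReference and expression attributes follow the same rule.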
An example illustrating the use of a precondition and an effect is given in Fig 4.7. The effect schema is explained in section 4.2.4.
Figure 4.7 shows an annotation with a precondition and an effect. In this example, the operation "convert2Base64" is annotated with the concept "AudioBase64ConversionAdaptationService" from the PAAM ontology. The precondition "Base64ConversionInputSubject" specifies that the input of this operation must be an audio resource.
An effect defines the outcome of an invoked operation, indicating either the returned output or the changes in state resulting from the service invocation. For instance, it may state that "the new account balance will be available" or "the credit card account will be debited." The schema of an effect is shown in Fig 4.8.
Fig 4.8 Schema of an effect
As with the precondition, the elements of the effect schema are described below:
/effect: This element specifies the semantic annotation for the operation.
/effect/@name: The name attribute specifies an identifier that is unique among the effects in the WSDL document.
/effect/@modelReference: The modelReference attribute specifies the URI of the part of a semantic model that describes the effect. The modelReference and expression attributes are mutually exclusive.