Complex data type elements include the array, struct, recordset, and binary elements.
These elements are used for more complex data structures, such as PHP arrays and classes.
Only two of these elements, array and struct, have direct mappings to native PHP types, but the remaining two can be converted into data usable by an application.
array
The array element holds data for an integer-indexed array. In PHP, arrays can have numeric or string indexes. Only numerically indexed arrays map to the array element. String-indexed arrays are handled with the struct element.
■ Note Numerically indexed arrays in PHP are normally zero-based. An array that is numerically indexed but not zero-based, or that has gaps in its indexes, may not be serialized using the array element. For instance, the arrays array(2=>'a', 4=>'b', 6=>'c') and array(0=>'a', 2=>'b') serialized using the wddx extension would result in a struct with named variables rather than array elements.
The children of an array element are the values held at each index. These values can be simple or complex data types, so an array element can have any of the data type child elements that are valid within the data element. The array element also carries a length attribute, whose value is the number of values held within the array. In PHP terms, the value of the length attribute is the result of calling the count() function on the array being serialized.
For instance, the following PHP arrays, which are both numerically indexed, are serialized into the same WDDX array structure:
array('a', 1, false);
array(0=>'a', 1=>1, 2=>false);
<array length='3'>
  <string>a</string>
  <number>1</number>
  <boolean value='false'/>
</array>
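The rule for choosing between array and struct can be sketched in a few lines of PHP. This is only an illustration of the key check described above, not the wddx extension's own code; the function name wddxContainerFor() is hypothetical, and treating an empty array as a list is an assumption.

```php
<?php
// Hypothetical helper (not part of any extension): picks the WDDX container
// element name for a PHP array based purely on its keys.
function wddxContainerFor(array $arr): string
{
    // Assumption: an empty array is treated as a list.
    if ($arr === array()) {
        return 'array';
    }
    // A list has exactly the keys 0, 1, ..., count - 1, in that order.
    $isList = array_keys($arr) === range(0, count($arr) - 1);
    return $isList ? 'array' : 'struct';
}

$implicitKeys = wddxContainerFor(array('a', 1, false));               // 'array'
$explicitKeys = wddxContainerFor(array(0=>'a', 1=>1, 2=>false));      // 'array'
$notZeroBased = wddxContainerFor(array(2=>'a', 4=>'b', 6=>'c'));      // 'struct'
$hasGap       = wddxContainerFor(array(0=>'a', 2=>'b'));              // 'struct'
```

The first two calls use the arrays from the example above; the last two use the arrays from the note.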
struct
Structures are string-indexed collections of data. The struct element identifies its contents as being such a structure. In PHP, structures correspond to string-indexed arrays and objects. Note also that any numerically indexed array that is not zero-based, or that is zero-based but whose items are not sequentially indexed, can result in a struct element rather than an array element.
The struct element is a container for zero or more var elements. These elements represent variables or class properties identified by the required name attribute. Each var element contains a single child data type element, which can be any element that is valid as a child of the data or array element. Thus, if you took a few variables from PHP:
$myint = 12345;
$mystring = "This is a string";
$mykeys = array('key1'=>1, 'key2'=>2);
their serialized representations as var elements, which would live within a struct element, would be as follows:
<var name='myint'>
  <number>12345</number>
</var>
<var name='mystring'>
  <string>This is a string</string>
</var>
<var name='mykeys'>
  <struct>
    <var name='key1'>
      <number>1</number>
    </var>
    <var name='key2'>
      <number>2</number>
    </var>
  </struct>
</var>
As you can see from the serialization of the $mykeys variable, it is a complex data structure. The variable contains an associative array; thus, the var element itself not only is a child of a struct element but also contains a struct element. This inner struct element then contains additional var elements that identify each item in the array. The definition of a WDDX structure is not complicated, but the resulting serialized document can become quite complex, and this example should give you an idea of what that means. The complexity of the resulting WDDX packet is directly related to the complexity of the structure being serialized. This will become even clearer in the “Using WDDX” section, where you will see an object being serialized into a WDDX structure.
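The nesting just described falls out naturally from a recursive serializer. The following is a minimal sketch, assuming PHP 7 or later; toWddx() is a hypothetical function written for illustration (it is not the wddx extension, which is no longer bundled as of PHP 7.4), and it handles only booleans, numbers, strings, and arrays.

```php
<?php
// Sketch only: recursively turns a PHP value into a WDDX fragment.
function toWddx($value): string
{
    if (is_bool($value)) {
        return "<boolean value='" . ($value ? 'true' : 'false') . "'/>";
    }
    if (is_int($value) || is_float($value)) {
        return '<number>' . $value . '</number>';
    }
    if (is_string($value)) {
        return '<string>' . htmlspecialchars($value, ENT_QUOTES | ENT_XML1) . '</string>';
    }
    if (is_array($value)) {
        // Zero-based, gap-free keys map to <array>; anything else maps to <struct>.
        $isList = $value === array()
            || array_keys($value) === range(0, count($value) - 1);
        if ($isList) {
            $xml = "<array length='" . count($value) . "'>";
            foreach ($value as $item) {
                $xml .= toWddx($item);
            }
            return $xml . '</array>';
        }
        $xml = '<struct>';
        foreach ($value as $name => $item) {
            // Each entry becomes a <var> whose single child is the serialized value.
            $xml .= "<var name='" . htmlspecialchars((string) $name, ENT_QUOTES | ENT_XML1) . "'>"
                  . toWddx($item) . '</var>';
        }
        return $xml . '</struct>';
    }
    throw new InvalidArgumentException('Type not handled by this sketch');
}

// The $mykeys array from the text: a struct holding two var elements.
$mykeysXml = toWddx(array('key1' => 1, 'key2' => 2));
```

Because toWddx() calls itself for every array item, the depth of the output mirrors the depth of the input, which is why complex PHP structures produce complex packets.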
PHP is a case-sensitive language, so the statement $myVar = array('key'=>1, 'KEY'=>2); results in an associative array with two distinct keys: key and KEY. WDDX, being used by many languages (some not case sensitive), does not differentiate between variable names or key names that differ only in case. A WDDX structure containing two variables with the same name, even if they differ in case, will use the value of the last variable when the structure is deserialized. If you serialized $myVar, your resulting structure might look like this:
<struct>
  <var name='myVar'>
    <struct>
      <var name='key'><number>1</number></var>
      <var name='KEY'><number>2</number></var>
    </struct>
  </var>
</struct>
With PHP, unserializing the packet might give you back a $myVar identical to the original, but if this structure were passed to some other system using a language that is not case sensitive, the resulting data would be an associative array, or language-equivalent structure, containing only a single index, the string key, with the corresponding numeric value 2. The second value overwrites the first because the names, even though they differ in case, are not unique. For interoperability, it is important that names be unique regardless of case.
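One way to guard against this is to check for case-insensitive collisions before serializing. The sketch below is illustrative only; findCaseCollisions() is a hypothetical helper, and the call to PHP's array_change_key_case() is used simply to mimic what a case-insensitive consumer might end up with.

```php
<?php
// Hypothetical helper: returns groups of keys that differ only in case.
function findCaseCollisions(array $data): array
{
    $groups = array();
    foreach (array_keys($data) as $key) {
        // Bucket every key under its lowercase form.
        $groups[strtolower((string) $key)][] = $key;
    }
    // Keep only the buckets holding more than one original key.
    return array_values(array_filter($groups, function ($group) {
        return count($group) > 1;
    }));
}

$myVar = array('key' => 1, 'KEY' => 2);

$collisions = findCaseCollisions($myVar);
// $collisions is array(array('key', 'KEY'))

// Folding the keys to one case shows the "last value wins" effect.
$folded = array_change_key_case($myVar, CASE_LOWER);
// $folded is array('key' => 2)
```

An empty result from findCaseCollisions() means the keys at that level are safe to hand to a case-insensitive system; nested arrays would need the same check applied recursively.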
recordset
The recordset element is used for tabular data, which is two-dimensional data such as data in comma-separated value (CSV) format or records from a database; that is, data that can be represented in rows and columns. Data serialized into this format can be composed only of simple data types. You are not required to use the recordset element for two-dimensional data, and in many cases, developers use a struct instead. This tends to be the case when the data contains complex types, which cannot be used with a recordset element, and because many languages have no direct mapping to a recordset type. In addition, as you will see from its composition, some developers just do not like its structure.
A recordset element contains any number of field elements as its children. It requires two attributes: rowCount defines the number of rows, and fieldNames defines the names of the fields being used within its contents. The value of the rowCount attribute is simply the number of rows of data encapsulated by the recordset element. The value of the fieldNames attribute is a comma-separated list of the names of the fields used for the data. For example:
<recordset rowCount="2" fieldNames="ID,FIRST_NAME,LAST_NAME">
<!-- field elements -->
</recordset>
Based on this structure, you know that the recordset contains two records, each having three fields identified by the names listed in the fieldNames attribute. This means the recordset element will contain three field elements.
A field element contains the data for every row in the recordset for a specific field. It has a name attribute that identifies the field, and the name must be one of those listed in the fieldNames attribute on the parent recordset element. Its child elements are any number of simple data type elements, which means null, boolean, dateTime, number, string, or binary. Each one represents the data from a specific row for that field within the tabular data. Because a field usually holds data of a single type, every child element within a given field element will normally be of the same data type.
The structure of the recordset element, because of the layout of the field elements, often looks odd to developers, and this is why they often use a struct instead. Rather than the XML being broken down by rows of data, it is broken down by fields, which are then broken down by rows. Consider the data from a database shown in Table 15-2, where each row holds a value for every field.
Table 15-2. Database Data
ID   FIRST_NAME   LAST_NAME
1    John         Smith
2    Jane         Doe
When using XML for this data, it is common to use a structure similar to the following:
<row>
  <ID>1</ID>
  <FIRST_NAME>John</FIRST_NAME>
  <LAST_NAME>Smith</LAST_NAME>
</row>
<!-- Additional row elements -->
You can also use a general field element name with a name attribute set to the name of the field. In any case, the data per row is usually grouped together. In a WDDX packet, however, the data is serialized into the following format when using a recordset element:
<recordset rowCount="2" fieldNames="ID,FIRST_NAME,LAST_NAME">
  <field name="ID">
    <number>1</number>
    <number>2</number>
  </field>
  <field name="FIRST_NAME">
    <string>John</string>
    <string>Jane</string>
  </field>
  <field name="LAST_NAME">
    <string>Smith</string>
    <string>Doe</string>
  </field>
</recordset>
As you can see, the data is grouped by field, so when reading this document in order, you process the data for every row of a specific field rather than the data for every field of a specific row. For this reason alone, you may prefer using a struct element, where the data can be serialized by row, rather than a recordset element. Unfortunately, when you are receiving data, this might be out of your control.
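When a recordset does arrive, it can be pivoted back into row-oriented data with little effort. The following sketch uses the SimpleXML extension rather than the wddx extension; recordsetToRows() is a hypothetical function, and it handles only the number and string elements that appear in this example.

```php
<?php
$recordsetXml = <<<XML
<recordset rowCount="2" fieldNames="ID,FIRST_NAME,LAST_NAME">
  <field name="ID"><number>1</number><number>2</number></field>
  <field name="FIRST_NAME"><string>John</string><string>Jane</string></field>
  <field name="LAST_NAME"><string>Smith</string><string>Doe</string></field>
</recordset>
XML;

// Sketch only: turns field-grouped recordset data into an array of rows.
function recordsetToRows(string $xml): array
{
    $recordset = simplexml_load_string($xml);
    $rows = array();
    foreach ($recordset->field as $field) {
        $name = (string) $field['name'];
        $rowIndex = 0;
        // The n-th child of each field belongs to the n-th row.
        foreach ($field->children() as $cell) {
            $text = (string) $cell;
            // Adding 0 converts a numeric string to an int or a float.
            $rows[$rowIndex][$name] = $cell->getName() === 'number' ? $text + 0 : $text;
            $rowIndex++;
        }
    }
    return $rows;
}

$rows = recordsetToRows($recordsetXml);
// $rows[0] is array('ID' => 1, 'FIRST_NAME' => 'John', 'LAST_NAME' => 'Smith')
// $rows[1] is array('ID' => 2, 'FIRST_NAME' => 'Jane', 'LAST_NAME' => 'Doe')
```

A more defensive version would also compare count($rows) with the rowCount attribute and check each field name against the fieldNames list.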
binary
The binary element represents binary large objects (BLOBs), which are strings of binary data.
You may recall from previous chapters that passing binary data in its native form within XML is not safe, because not all characters produce well-formed XML. WDDX 1.0 mandates that the binary element contain Base64-encoded data, although previous versions may have allowed other encodings. In any event, the encoding used is given by the encoding attribute, which under WDDX 1.0 is a fixed attribute containing the value base64. This element also allows the length and MIME type of the binary data to be included using the length and type attributes. There is no native binary type in PHP, so you will typically handle this data using PHP strings. For example:
<binary encoding="base64" length="9312164" type="video/mpeg">
<!-- Base64-encoded data here -->
</binary>
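Handling such an element in PHP comes down to Base64 encoding and decoding an ordinary string. The sketch below builds a small binary element and reads it back with SimpleXML; the payload and variable names are made up for illustration, and it assumes that the length attribute counts the bytes of the decoded data.

```php
<?php
// Illustrative payload; real data would typically come from a file or database.
$payload = 'Hello, WDDX';

// Build a binary element carrying the Base64-encoded payload.
$binaryXml = '<binary encoding="base64" length="' . strlen($payload) . '">'
           . base64_encode($payload)
           . '</binary>';

// Read it back: the element text is Base64, so decode it into a PHP string.
$binary  = simplexml_load_string($binaryXml);
$decoded = base64_decode(trim((string) $binary), true); // strict mode
if ($decoded === false) {
    throw new UnexpectedValueException('binary element does not hold valid Base64');
}

// Assumption: length refers to the decoded byte count, so it can be verified.
$lengthMatches = strlen($decoded) === (int) $binary['length'];
```

Passing true as the second argument makes base64_decode() return false on characters outside the Base64 alphabet instead of silently skipping them.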