4.7 Type Factory Functions
Since the unification of types and classes in Python 2.2, all of the built-in types are classes, and all of the “conversion” built-in functions such as int(), type(), and list() are now factory functions. Although they look and act somewhat like functions, they are actually class names; when you call one, you are instantiating an object of that type, like a factory producing a good.
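A quick check makes the point concrete. The sketch below uses only the built-in names int, list, and type; the variable names are illustrative, and it assumes Python 2.3 or later (it also runs unchanged on Python 3).

```python
# Factory functions are classes: calling one creates an instance of that type.
n = int('42')          # instantiate an int from a string
chars = list('abc')    # instantiate a list from an iterable

print(isinstance(int, type))   # True: int is itself a type (class) object
print(isinstance(n, int))      # True: n is an instance of int
print(type(chars) is list)     # True: type() reports the class that built it
```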
The following familiar factory functions were formerly built-in functions:
• int(), long(), float(), complex()
• str(), unicode(), basestring()
• list(), tuple()
• type()
Other types that did not have factory functions now do. In addition, factory functions have been added for completely new types that support the new-style classes. The following is a list of both types of factory functions:
• dict()
• bool()
• set(), frozenset()
• object()
• classmethod()
• staticmethod()
• super()
• property()
• file()
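The same check can be applied to the newer factory functions. This is a minimal sketch, assuming Python 2.4 or later so that set() and frozenset() are built in; file() is left out only so the example also runs on Python 3, where that name no longer exists. The variable names are illustrative.

```python
# Each name below is a class; calling it produces an instance of that class.
factories = [dict, bool, set, frozenset, object, classmethod,
             staticmethod, super, property]

for factory in factories:
    # isinstance(..., type) holds for every new-style built-in class
    print('%-12s is a class: %s' % (factory.__name__,
                                    isinstance(factory, type)))

d = dict(a=1, b=2)     # keyword arguments become key-value pairs
s = set('hello')       # duplicates collapse, leaving 4 distinct characters
flag = bool([])        # an empty container converts to False
```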
4.8 Categorizing the Standard Types
If we were to be maximally verbose in describing the standard types, we would probably call them something like Python’s “basic built-in data object primitive types.”
• “Basic,” indicating that these are the standard or core types that Python provides
• “Built-in,” due to the fact that these types come by default in Python
• “Data,” because they are used for general data storage
• “Object,” because objects are the default abstraction for data and functionality
ptg 112 Chapter 4 Python Objects
• “Primitive,” because these types provide the lowest-level granularity of data storage
• “Types,” because that’s what they are: data types!
However, this description does not really give you an idea of how each type works or what functionality applies to them. Indeed, some of them share certain characteristics, such as how they function, and others share commonality with regard to how their data values are accessed. We should also be interested in whether the data that some of these types hold can be updated and what kind of storage they provide.
There are three different models we have come up with to help categorize the standard types, with each model showing us the interrelationships between the types. These models help us obtain a better understanding of how the types are related, as well as how they work.
4.8.1 Storage Model
The first way we can categorize the types is by how many objects can be stored in an object of this type. Python’s types, like those of most other languages, can hold either single or multiple values. A type that holds a single literal object we will call atomic or scalar storage, and those that can hold multiple objects we will refer to as container storage. (Container objects are also referred to as composite or compound objects in the documentation, but some of these terms refer to objects other than types, such as class instances.) Container types bring up the additional issue of whether different types of objects can be stored. All of Python’s container types can hold objects of different types. Table 4.6 categorizes Python’s types by storage model.
Although strings may seem like a container type since they “contain” characters (and usually more than one character), they are not considered as such because Python does not have a character type (see Section 4.9). Thus strings are self-contained literals.
Table 4.6 Types Categorized by the Storage Model
Storage Model Category    Python Types That Fit Category
Scalar/atom               Numbers (all numeric types), strings (all are literals)
Container                 Lists, tuples, dictionaries
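Both points about the storage model can be checked interactively. The following is a small sketch with illustrative variable names: indexing a string yields another string rather than a distinct character object, and a single container can hold objects of several different types.

```python
word = 'Python'
first = word[0]             # indexing a string yields another string

print(type(first) is str)   # True: there is no separate character type
print(len(first))           # 1: a "character" is just a length-one string

# A container can hold objects of different types side by side.
mixed = [42, 3.14, 'text', (1, 2), {'key': 'value'}]
kinds = [type(item).__name__ for item in mixed]
print(kinds)                # ['int', 'float', 'str', 'tuple', 'dict']
```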
4.8.2 Update Model
Another way of categorizing the standard types is by asking the question,
“Once created, can objects be changed, or can their values be updated?”
When we introduced Python types early on, we indicated that certain types allow their values to be updated and others do not. Mutable objects are those whose values can be changed, and immutable objects are those whose values cannot be changed. Table 4.7 illustrates which types support updates and which do not.
Now after looking at the table, a thought that must immediately come to mind is, “Wait a minute! What do you mean that numbers and strings are immutable? I’ve done things like the following”:
x = 'Python numbers and strings'
x = 'are immutable?!? What gives?'
i = 0
i = i + 1
“They sure as heck don’t look immutable to me!” That is true to some degree, but looks can be deceiving. What is really happening behind the scenes is that the original objects are actually being replaced in the above examples. Yes, that is right. Read that again.
Rather than modifying the original objects, Python bound new objects holding the new values to the original variable names, and the old objects became candidates for garbage collection once nothing else referred to them. One can confirm this by using the id() BIF to compare object identities before and after such assignments.
Table 4.7 Types Categorized by the Update Model
Update Model Category     Python Types That Fit Category
Mutable                   Lists, dictionaries
Immutable                 Numbers, strings, tuples
If we add calls to id() to our example above, we can see that each assignment binds the name to a different object, as below:
>>> x = 'Python numbers and strings'
>>> print id(x)
16191392
>>> x = 'are immutable?!? What gives?'
>>> print id(x)
16191232
>>> i = 0
>>> print id(i)
7749552
>>> i = i + 1
>>> print id(i)
7749600
Your mileage will vary with regard to the object IDs as they will differ between executions. On the flip side, lists can be modified without replacing the original object, as illustrated in the code below:
>>> aList = ['ammonia', 83, 85, 'lady']
>>> aList
['ammonia', 83, 85, 'lady']
>>>
>>> aList[2]
85
>>>
>>> id(aList)
135443480
>>>
>>> aList[2] = aList[2] + 1
>>> aList[3] = 'stereo'
>>> aList
['ammonia', 83, 86, 'stereo']
>>>
>>> id(aList)
135443480
>>>
>>> aList.append('gaudy')
>>> aList.append(aList[2] + 1)
>>> aList
['ammonia', 83, 86, 'stereo', 'gaudy', 87]
>>>
>>> id(aList)
135443480
Notice how for each change, the ID for the list remained the same.
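The contrast between the two update categories can also be probed programmatically. The sketch below is illustrative rather than definitive: try_item_assignment() is a hypothetical helper introduced here, not a built-in, and it simply reports whether an object accepts in-place item assignment. The second half repeats the id() comparison for a tuple and a list.

```python
def try_item_assignment(obj):
    """Return True if obj[0] can be reassigned in place, False otherwise."""
    try:
        obj[0] = obj[0]
    except TypeError:          # immutable sequences reject item assignment
        return False
    return True

print(try_item_assignment(['ammonia', 83]))   # True: lists are mutable
print(try_item_assignment(('ammonia', 83)))   # False: tuples are immutable
print(try_item_assignment('ammonia'))         # False: strings are immutable

# "Changing" a tuple really builds a new object and rebinds the name.
aTuple = ('ammonia', 83)
before = id(aTuple)
aTuple = aTuple + ('gaudy',)
rebound = id(aTuple) != before    # a different object is now bound

# A list is modified in place, so its identity is unchanged.
aList = ['ammonia', 83]
list_id = id(aList)
aList.append('gaudy')
same = id(aList) == list_id
```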
4.8.3 Access Model
Although the previous two models of categorizing the types are useful when being introduced to Python, they are not the primary models for differentiating the types. For that purpose, we use the access model. By this, we mean: how do we access the values of our stored data? There are three categories under the access model: direct, sequence, and mapping. The different access models and which types fall into each respective category are given in Table 4.8.
Direct types indicate single-element, non-container types. All numeric types fit into this category.
Sequence types are those whose elements are sequentially accessible via index values starting at 0. Accessed items can be either single elements or in groups, better known as slices. Types that fall into this category include strings, lists, and tuples. As we mentioned before, Python does not support a character type, so, although strings are literals, they are a sequence type because of the ability to access substrings sequentially.
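As a brief illustration of sequence access, the sketch below applies the same index and slice expressions to a string, a list, and a tuple. The sample values and variable names are illustrative only.

```python
aString = 'Python'
aList = ['ammonia', 83, 85, 'lady']
aTuple = ('direct', 'sequence', 'mapping')

# The same syntax works for every sequence type: seq[0] is the first
# element, and seq[1:3] is the slice holding the elements at offsets 1 and 2.
results = [(seq[0], seq[1:3]) for seq in (aString, aList, aTuple)]

for first, pair in results:
    print('%r %r' % (first, pair))
```

Note that a slice has the same type as the sequence it came from: slicing the string gives a string, slicing the list gives a list, and slicing the tuple gives a tuple.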
Mapping types resemble sequences in that elements are retrieved by indexing, but instead of a sequential numeric offset, each element (value) is accessed with a key. The elements are unordered, which makes a mapping type a collection of hashed key-value pairs.
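A short sketch of mapping access follows. The dictionary contents are hypothetical sample data; the example shows lookup by key, the fact that an integer is treated as a key rather than an offset, and the requirement that keys be hashable.

```python
# Hypothetical sample data: service names mapped to port numbers.
ports = {'http': 80, 'ssh': 22}

ssh_port = ports['ssh']        # access by key, not by position
ports['smtp'] = 25             # adding a pair needs no index bookkeeping

# An integer is just another key here, not an offset into the mapping.
try:
    ports[0]
    has_zero = True
except KeyError:
    has_zero = False

# Keys must be hashable; a mutable list is rejected.
try:
    ports[['http', 'https']] = 80
    list_key_ok = True
except TypeError:
    list_key_ok = False

print(ssh_port)                # 22
print(sorted(ports))           # ['http', 'smtp', 'ssh']
print(has_zero)                # False
print(list_key_ok)             # False
```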
We will use this primary model in the next chapter by presenting each access model type and what all types in that category have in common (such as operators and BIFs), then discussing each Python standard type that fits into those categories. Any operators, BIFs, and methods unique to a specific type will be highlighted in their respective sections.
So why this side trip to view the same data types from differing perspectives? Well, first of all, why categorize at all? Because of the high-level data structures that Python provides, we need to differentiate the “primitive” types from those that provide more functionality. Another reason is to be clear on what the expected behavior of a type should be. For example, if we minimize the number of times we ask ourselves, “What are the differences between lists and tuples again?” or “What types are immutable and which are not?” then we have done our job. And finally, certain categories have general characteristics that apply to all types in a certain category. A good craftsman (and craftswoman) should know what is available in his or her toolboxes.
Table 4.8 Types Categorized by the Access Model
Access Model Category     Types That Fit Category
Direct                    Numbers
Sequence                  Strings, lists, tuples
Mapping                   Dictionaries
The second part of our inquiry asks, “Why all these different models or perspectives?” It seems that there is no one way of classifying all of the data types. They all have crossed relationships with each other, and we feel it best to expose the different sets of relationships shared by all the types. We also want to show how each type is unique in its own right. No two types map the same across all categories. (Of course, all numeric subtypes do, so we are categorizing them together.) Finally, we believe that understanding all these relationships will ultimately play an important implicit role during development. The more you know about each type, the more you are apt to use the correct ones in the parts of your application where they are the most appropriate, and where you can maximize performance.
We summarize by presenting a cross-reference chart (see Table 4.9) that shows all the standard types, the three different models we use for categorization, and where each type fits into these models.
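The claim that no two types map the same across all categories can be checked with a small data structure. The sketch below is assembled only from Tables 4.6 through 4.8 and is not a reproduction of Table 4.9; the dictionary name and the lowercase labels are illustrative choices.

```python
# (storage, update, access) for each standard type, per Tables 4.6-4.8.
models = {
    'numbers':      ('scalar',    'immutable', 'direct'),
    'strings':      ('scalar',    'immutable', 'sequence'),
    'lists':        ('container', 'mutable',   'sequence'),
    'tuples':       ('container', 'immutable', 'sequence'),
    'dictionaries': ('container', 'mutable',   'mapping'),
}

# If every combination is distinct, the set of combinations is as large
# as the number of types.
unique = len(set(models.values())) == len(models)
print(unique)    # True

for name in sorted(models):
    print('%-12s %-9s %-9s %s' % ((name,) + models[name]))
```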