First, let us look at some sample code, from creating a database to creating a table and using it. We present examples using MySQL, PostgreSQL, and SQLite.
MySQL
We will use MySQL as the example here, along with the only MySQL Python adapter: MySQLdb, aka MySQL-python. In the various bits of code, we will also deliberately show you examples of error situations so that you have an idea of what to expect and what you may wish to create handlers for.
We first log in as an administrator to create a database and grant permissions, then log back in as a normal client.
>>> import MySQLdb
>>> cxn = MySQLdb.connect(user='root')
>>> cxn.query('DROP DATABASE test')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
_mysql_exceptions.OperationalError: (1008, "Can't drop database 'test'; database doesn't exist")
>>> cxn.query('CREATE DATABASE test')
>>> cxn.query("GRANT ALL ON test.* to ''@'localhost'")
>>> cxn.commit()
>>> cxn.close()
In the code above, we did not use a cursor. Some adapters have Connection objects that can execute SQL queries with the query() method, but not all do. We recommend you either not use it or check your adapter to make sure it is available.
The commit() was optional for us as auto-commit is turned on by default in MySQL. We then connect back to the new database as a regular user, create a table, and perform the usual queries and commands using SQL to get our job done via Python. This time we use cursors and their execute() method.
ptg 21.2 Python Database Application Programmer’s Interface (DB-API) 935
The next set of interactions shows us creating a table. An attempt to create it again (without first dropping it) results in an error.
>>> cxn = MySQLdb.connect(db='test')
>>> cur = cxn.cursor()
>>> cur.execute('CREATE TABLE users(login VARCHAR(8), uid INT)')
0L
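The failing second CREATE TABLE can be sketched as follows. This is a minimal illustration, not the MySQL session itself: it assumes the standard library's sqlite3 adapter as a stand-in so that it runs without a server, and it uses Python 3 syntax. With MySQLdb the exception class would come from _mysql_exceptions rather than from sqlite3.

```python
import sqlite3

# Stand-in for the MySQL session: an in-memory SQLite database (assumption).
cxn = sqlite3.connect(':memory:')
cur = cxn.cursor()
cur.execute('CREATE TABLE users(login VARCHAR(8), uid INT)')

try:
    # Second CREATE without a DROP in between: the adapter raises.
    cur.execute('CREATE TABLE users(login VARCHAR(8), uid INT)')
    created_twice = True
except sqlite3.OperationalError as e:
    created_twice = False
    print('create failed:', e)

cur.close()
cxn.close()
```

This is the kind of error you may wish to write a handler for, for example by dropping the table and trying again.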
Now we will insert a few rows into the database and query them out.
>>> cur.execute("INSERT INTO users VALUES('john', 7000)")
1L
>>> cur.execute("INSERT INTO users VALUES('jane', 7001)")
1L
>>> cur.execute("INSERT INTO users VALUES('bob', 7200)")
1L
>>> cur.execute("SELECT * FROM users WHERE login LIKE 'j%'")
2L
>>> for data in cur.fetchall():
...     print '%s\t%s' % data
...
john    7000
jane    7001
The last bit features modifying the table's contents, either updating or deleting rows.
>>> cur.execute("UPDATE users SET uid=7100 WHERE uid=7001")
1L
>>> cur.execute("SELECT * FROM users")
3L
>>> for data in cur.fetchall():
...     print '%s\t%s' % data
...
john    7000
jane    7100
bob     7200
>>> cur.execute('DELETE FROM users WHERE login="bob"')
1L
>>> cur.execute('DROP TABLE users')
0L
>>> cur.close()
>>> cxn.commit()
>>> cxn.close()
MySQL is one of the most popular open source databases in the world, and it is no surprise that a Python adapter is available for it.
ptg 936 Chapter 21 Database Programming
PostgreSQL
Another popular open source database is PostgreSQL. Unlike MySQL, there are no fewer than three current Python adapters available for Postgres: psycopg, PyPgSQL, and PyGreSQL. A fourth, PoPy, is now defunct, having contributed its project to combine with that of PyGreSQL back in 2003. Each of the three remaining adapters has its own characteristics, strengths, and weaknesses, so it would be a good idea to practice due diligence to determine which is right for you.
The good news is that the interfaces are similar enough that you can create an application that, say, compares the performance of all three (if that is a metric that is important to you). Here we show you the setup code to get a Connection object for each:
psycopg
>>> import psycopg
>>> cxn = psycopg.connect(user='pgsql')
PyPgSQL
>>> from pyPgSQL import PgSQL
>>> cxn = PgSQL.connect(user='pgsql')
PyGreSQL
>>> import pgdb
>>> cxn = pgdb.connect(user='pgsql')
Now comes some generic code that will work for all three adapters.
>>> cur = cxn.cursor()
>>> cur.execute('SELECT * FROM pg_database')
>>> rows = cur.fetchall()
>>> for i in rows:
...     print i
...
>>> cur.close()
>>> cxn.commit()
>>> cxn.close()
Finally, you can see how their outputs are slightly different from one another.
PyPgSQL
sales
template1
template0
psycopg
('sales', 1, 0, 0, 1, 17140, '140626', '3221366099', '', None, None)
('template1', 1, 0, 1, 1, 17140, '462', '462', '', None, '{pgsql=C*T*/pgsql}')
('template0', 1, 0, 1, 0, 17140, '462', '462', '', None, '{pgsql=C*T*/pgsql}')
PyGreSQL
['sales', 1, 0, False, True, 17140L, '140626', '3221366099', '', None, None]
['template1', 1, 0, True, True, 17140L, '462', '462', '', None, '{pgsql=C*T*/pgsql}']
['template0', 1, 0, True, False, 17140L, '462', '462', '', None, '{pgsql=C*T*/pgsql}']
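Because one adapter hands back tuples and another hands back lists, code that compares or stores rows from different adapters may want to normalize them first. A minimal sketch, assuming a hypothetical helper named normalize() and shortened sample rows shaped like the outputs above (Python 3 syntax):

```python
def normalize(rows):
    """Return rows as a list of tuples, whatever sequence type the adapter used."""
    return [tuple(row) for row in rows]

# Hypothetical sample rows modeled on the outputs above (columns shortened).
psycopg_style = [('sales', 1, 0), ('template1', 1, 0)]
pygresql_style = [['sales', 1, 0], ['template1', 1, 0]]

same = normalize(psycopg_style) == normalize(pygresql_style)
print(same)
```

This only evens out the container type; differences in how individual column values are represented would still need their own handling.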
SQLite
For extremely simple applications, using files for persistent storage usually suffices, but the most complex and data-driven applications demand a full relational database. SQLite targets the intermediate systems and indeed is a hybrid of the two. It is extremely lightweight and fast, plus it is serverless and requires little or no administration.
SQLite has seen rapid growth in popularity, and it is available on many platforms. The pysqlite database adapter was introduced in Python 2.5 as the sqlite3 module, marking the first time that the Python standard library has featured a database adapter in any release.
It was bundled with Python not because it was favored over other databases and adapters, but because it is simple, uses files (or memory) as its backend store like the DBM modules do, does not require a server, and does not have licensing issues. It is simply an alternative to other similar persistent storage solutions included with Python but which happens to have a SQL interface.
Having a module like this in the standard library allows users to develop rapidly in Python using SQLite, then migrate to a more powerful RDBMS such as MySQL, PostgreSQL, Oracle, or SQL Server for production purposes if this is their intention. Otherwise, it makes a great solution to stay with for those who do not need all that horsepower.
Although the database adapter is now provided in the standard library, you still have to download the actual database software yourself. However, once you have installed it, all you need to do is start up Python (and import the adapter) to gain immediate access:
>>> import sqlite3
>>> cxn = sqlite3.connect('sqlite_test/test')
>>> cur = cxn.cursor()
>>> cur.execute('CREATE TABLE users(login VARCHAR(8), uid INTEGER)')
>>> cur.execute('INSERT INTO users VALUES("john", 100)')
>>> cur.execute('INSERT INTO users VALUES("jane", 110)')
>>> cur.execute('SELECT * FROM users')
>>> for eachUser in cur.fetchall():
...     print eachUser
...
(u'john', 100)
(u'jane', 110)
>>> cur.execute('DROP TABLE users')
<sqlite3.Cursor object at 0x3d4320>
>>> cur.close()
>>> cxn.commit()
>>> cxn.close()
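Since SQLite can also use memory as its backing store, the same steps can be tried without touching the file system. A small sketch in Python 3 syntax, assuming the special database name ':memory:' and qmark-style parameters instead of values written into the SQL string:

```python
import sqlite3

cxn = sqlite3.connect(':memory:')   # nothing is written to disk
cur = cxn.cursor()
cur.execute('CREATE TABLE users(login VARCHAR(8), uid INTEGER)')
# Values are passed separately from the SQL text and bound to the ? marks.
cur.executemany('INSERT INTO users VALUES(?, ?)',
                [('john', 100), ('jane', 110)])
cxn.commit()

cur.execute('SELECT login, uid FROM users ORDER BY uid')
rows = cur.fetchall()
print(rows)

cur.close()
cxn.close()
```

The database disappears when the connection is closed, which makes this form convenient for experiments and tests.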
Okay, enough of the small examples. Next, we look at an application similar to our earlier example with MySQL, but which does a few more things:
• Creates a database (if necessary)
• Creates a table
• Inserts rows into the table
• Updates rows in the table
• Deletes rows from the table
• Drops the table
For this example, we will use two other open source databases. SQLite has become quite popular of late. It is very small, lightweight, and extremely fast for all the most common database functions. Another database involved in this example is Gadfly, a mostly SQL-compliant RDBMS written entirely in Python. (Some of the key data structures have a C module available, but Gadfly can run without it [slower, of course].)
Some notes before we get to the code. Both SQLite and Gadfly require the user to give the location to store database files (while MySQL has a default area and does not require this information from the user). The most current incarnation of Gadfly is not yet fully DB-API 2.0 compliant and, as a result, is missing some functionality, most notably the cursor attribute rowcount in our example.
Database Adapter Example Application
In the example below, we want to demonstrate how to use Python to access a database. In fact, for variety, we added support for three different database systems: Gadfly, SQLite, and MySQL. We are going to create a database (if one does not already exist), then run through various database operations such as creating and dropping tables, and inserting, updating, and deleting rows.
Example 21.1 will be duplicated for the upcoming section on ORMs as well.
Line-by-Line Explanation
Lines 1–18
The first part of this script imports the necessary modules, creates some global “constants” (the column size for display and the set of databases we are supporting), and features the setup() function, which prompts the user to select the RDBMS to use for any particular execution of this script.
The most notable constant here is DB_EXC, which stands for DataBase EXCeption. This variable will eventually be assigned the database exception module for the specific database system that the user chooses to run this application with. In other words, if users choose MySQL, DB_EXC will be _mysql_exceptions, etc. If we developed this application in more of an object-oriented fashion, this would simply be an instance attribute, i.e., self.db_exc_module or something like that.
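To make the object-oriented alternative concrete, here is a rough sketch in Python 3 syntax. The class name UserStore and its methods are hypothetical, invented for illustration; only the attribute name db_exc_module comes from the text, and sqlite3 is assumed as the adapter so that the example runs on its own.

```python
import sqlite3

class UserStore(object):
    """Hypothetical wrapper that remembers which adapter module it uses."""

    def __init__(self, db_module, *args, **kwargs):
        self.db_exc_module = db_module      # plays the role of DB_EXC
        self.cxn = db_module.connect(*args, **kwargs)

    def create(self):
        """Try to create the table; report whether it succeeded."""
        cur = self.cxn.cursor()
        try:
            cur.execute('CREATE TABLE users(login VARCHAR(8), uid INTEGER)')
            return True
        except self.db_exc_module.OperationalError:
            return False                    # e.g. the table already exists
        finally:
            cur.close()

store = UserStore(sqlite3, ':memory:')
first = store.create()
second = store.create()
print(first, second)
```

Keeping the exception module on the instance avoids the global statement and lets two stores backed by different adapters coexist in one program.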
Lines 20–75
The guts of consistent database access happens here in the connect() function. At the beginning of each section, we attempt to load the requested database modules. If a suitable one is not found, None is returned to indicate that the database system is not supported.
Once a connection is made, then all other code is database and adapter independent and should work across all connections. (The only exception in our script is insert().) In all three subsections of this set of code, you will notice that a valid connection should be passed back as cxn.
If SQLite is chosen (lines 24–36), we attempt to load a database adapter.
We first try to load the standard library’s sqlite3 module (Python 2.5+). If that fails, we look for the third-party pysqlite2 package. This is to support 2.4.x and older systems with the pysqlite adapter installed. If a suitable adapter
Example 21.1 Database Adapter Example (ushuffle_db.py)
This script performs some basic operations using a variety of databases (MySQL, SQLite, Gadfly) and a corresponding Python database adapter.
1   #!/usr/bin/env python
2
3   import os
4   from random import randrange as rrange
5
6   COLSIZ = 10
7   RDBMSs = {'s': 'sqlite', 'm': 'mysql', 'g': 'gadfly'}
8   DB_EXC = None
9
10  def setup():
11      return RDBMSs[raw_input('''
12  Choose a database system:
13
14  (M)ySQL
15  (G)adfly
16  (S)QLite
17
18  Enter choice: ''').strip().lower()[0]]
19
20  def connect(db, dbName):
21      global DB_EXC
22      dbDir = '%s_%s' % (db, dbName)
23
24      if db == 'sqlite':
25          try:
26              import sqlite3
27          except ImportError, e:
28              try:
29                  from pysqlite2 import dbapi2 as sqlite3
30              except ImportError, e:
31                  return None
32
33          DB_EXC = sqlite3
34          if not os.path.isdir(dbDir):
35              os.mkdir(dbDir)
36          cxn = sqlite3.connect(os.path.join(dbDir, dbName))
37
38      elif db == 'mysql':
39          try:
40              import MySQLdb
41              import _mysql_exceptions as DB_EXC
42          except ImportError, e:
43              return None
44
45          try:
46              cxn = MySQLdb.connect(db=dbName)
47          except DB_EXC.OperationalError, e:
48              cxn = MySQLdb.connect(user='root')
49              try:
50                  cxn.query('DROP DATABASE %s' % dbName)
51              except DB_EXC.OperationalError, e:
52                  pass
53              cxn.query('CREATE DATABASE %s' % dbName)
54              cxn.query("GRANT ALL ON %s.* to ''@'localhost'" % dbName)
55              cxn.commit()
56              cxn.close()
57              cxn = MySQLdb.connect(db=dbName)
58
59      elif db == 'gadfly':
60          try:
61              from gadfly import gadfly
62              DB_EXC = gadfly
63          except ImportError, e:
64              return None
65
66          try:
67              cxn = gadfly(dbName, dbDir)
68          except IOError, e:
69              cxn = gadfly()
70              if not os.path.isdir(dbDir):
71                  os.mkdir(dbDir)
72              cxn.startup(dbName, dbDir)
73      else:
74          return None
75      return cxn
76
77  def create(cur):
78      try:
79          cur.execute('''
80              CREATE TABLE users (
81                  login VARCHAR(8),
82                  uid INTEGER,
83                  prid INTEGER)
84          ''')
85      except DB_EXC.OperationalError, e:
86          drop(cur)
87          create(cur)
88
89  drop = lambda cur: cur.execute('DROP TABLE users')
90
91  NAMES = (
92      ('aaron', 8312), ('angela', 7603), ('dave', 7306),
93      ('davina',7902), ('elliot', 7911), ('ernie', 7410),
94      ('jess', 7912), ('jim', 7512), ('larry', 7311),
95      ('leslie', 7808), ('melissa', 8602), ('pat', 7711),
96      ('serena', 7003), ('stan', 7607), ('faye', 6812),
97      ('amy', 7209),
98  )
99
100 def randName():
101     pick = list(NAMES)
102     while len(pick) > 0:
103         yield pick.pop(rrange(len(pick)))
104
105 def insert(cur, db):
106     if db == 'sqlite':
107         cur.executemany("INSERT INTO users VALUES(?, ?, ?)",
108             [(who, uid, rrange(1,5)) for who, uid in randName()])
109     elif db == 'gadfly':
110         for who, uid in randName():
111             cur.execute("INSERT INTO users VALUES(?, ?, ?)",
112                 (who, uid, rrange(1,5)))
113     elif db == 'mysql':
114         cur.executemany("INSERT INTO users VALUES(%s, %s, %s)",
115             [(who, uid, rrange(1,5)) for who, uid in randName()])
116
117 getRC = lambda cur: cur.rowcount if hasattr(cur, 'rowcount') else -1
118
119 def update(cur):
120     fr = rrange(1,5)
121     to = rrange(1,5)
122     cur.execute(
123         "UPDATE users SET prid=%d WHERE prid=%d" % (to, fr))
124     return fr, to, getRC(cur)
125
126 def delete(cur):
127     rm = rrange(1,5)
128     cur.execute('DELETE FROM users WHERE prid=%d' % rm)
129     return rm, getRC(cur)
130
131 def dbDump(cur):
132     cur.execute('SELECT * FROM users')
133     print '\n%s%s%s' % ('LOGIN'.ljust(COLSIZ),
134         'USERID'.ljust(COLSIZ), 'PROJ#'.ljust(COLSIZ))
135     for data in cur.fetchall():
136         print '%s%s%s' % tuple([str(s).title().ljust(COLSIZ) \
137             for s in data])
138
139 def main():
140     db = setup()
141     print '*** Connecting to %r database' % db
142     cxn = connect(db, 'test')
143     if not cxn:
144         print 'ERROR: %r not supported, exiting' % db
145         return
146     cur = cxn.cursor()
147
148     print '\n*** Creating users table'
is found, we then check to ensure that the directory exists because the database is file based. (You may also choose to create an in-memory database.) When the connect() call is made to SQLite, it will either use one that already exists or make a new one using that path if it does not.
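The SQLite branch can be tried in isolation. The sketch below restates it in Python 3 syntax under a few assumptions: the helper name load_sqlite_adapter() is hypothetical, the directory is placed under a temporary folder rather than the current directory, and a throwaway table is created so that the database file is certain to have been written.

```python
import os
import tempfile

def load_sqlite_adapter():
    """Return an SQLite DB-API module, or None if no adapter is installed."""
    try:
        import sqlite3                      # standard library, Python 2.5+
        return sqlite3
    except ImportError:
        try:
            from pysqlite2 import dbapi2    # older third-party package
            return dbapi2
        except ImportError:
            return None

adapter = load_sqlite_adapter()

# Hypothetical location; the script itself uses '<db>_<dbName>' in the cwd.
db_dir = os.path.join(tempfile.mkdtemp(), 'sqlite_test')
if not os.path.isdir(db_dir):
    os.mkdir(db_dir)                        # the directory must exist first
db_path = os.path.join(db_dir, 'test')

cxn = adapter.connect(db_path)              # opens or creates the database
cur = cxn.cursor()
cur.execute('CREATE TABLE t(x INTEGER)')    # throwaway table for the demo
cxn.commit()
cur.close()
cxn.close()
print(os.path.isfile(db_path))
```

Returning None when no adapter can be imported mirrors how connect() signals an unsupported database system to its caller.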
MySQL (lines 38–57) uses a default area for its database files and does not require this to come from the user. Our code attempts to connect to the specified database. If an error occurs, it could mean either that the database does not exist or that it does exist but we do not have permission to see it. Since this is just a test application, we elect to drop the database altogether (ignoring any error if the database does not exist), and re-create it, granting all permissions after that.
The last database supported by our application is Gadfly (lines 59–75). (At the time of writing, this database is mostly but not fully DB-API–compliant, and you will see this in this application.) It uses a startup mechanism similar to that of SQLite: it starts up with the directory where the database files
Example 21.1 Database Adapter Example (ushuffle_db.py) (continued)
149     create(cur)
150
151     print '\n*** Inserting names into table'
152     insert(cur, db)
153     dbDump(cur)
154
155     print '\n*** Randomly moving folks',
156     fr, to, num = update(cur)
157     print 'from one group (%d) to another (%d)' % (fr, to)
158     print '\t(%d users moved)' % num
159     dbDump(cur)
160
161     print '\n*** Randomly choosing group',
162     rm, num = delete(cur)
163     print '(%d) to delete' % rm
164     print '\t(%d users removed)' % num
165     dbDump(cur)
166
167     print '\n*** Dropping users table'
168     drop(cur)
169     cur.close()
170     cxn.commit()
171     cxn.close()
172
173 if __name__ == '__main__':
174     main()
should be. If it is there, fine, but if not, you have to take a roundabout way to start up a new database. (Why this is, we are not sure. We believe that the startup() functionality should be merged into that of the constructor gadfly.gadfly().)
Lines 77–89
The create() function creates a new users table in our database. If there is an error, that is almost always because the table already exists. If this is the case, we drop the table and re-create it by recursively calling this function again.
This code is dangerous in that if the recreation of the table still fails, you will have infinite recursion until your application runs out of memory. You will fix this problem in one of the exercises at the end of the chapter.
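One possible way to bound the recursion, offered only as a sketch and not as the intended exercise solution, is to pass along a retry counter. The extra parameters exc_module and retries are assumptions introduced here, sqlite3 stands in for whichever adapter is in use, and the syntax is Python 3.

```python
import sqlite3

def create(cur, exc_module, retries=1):
    """Create the users table, dropping a stale copy at most `retries` times."""
    try:
        cur.execute('''
            CREATE TABLE users (
                login VARCHAR(8),
                uid INTEGER,
                prid INTEGER)
        ''')
    except exc_module.OperationalError:
        if retries <= 0:
            raise                       # give up instead of recursing forever
        cur.execute('DROP TABLE users')
        create(cur, exc_module, retries - 1)

cxn = sqlite3.connect(':memory:')
cur = cxn.cursor()
create(cur, sqlite3)                    # fresh table
create(cur, sqlite3)                    # table exists: dropped once, re-created
cur.execute('SELECT COUNT(*) FROM users')
count = cur.fetchone()[0]
print(count)
```

If the second attempt also fails, the exception propagates to the caller, so the depth of recursion is limited by the counter rather than by available memory.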
The table is dropped from the database with the one-liner drop().
Lines 91–103
This is probably the most interesting part of the code outside of database activity. It consists of a constant set of names and user IDs followed by the generator randName(), whose code can be found in Chapter 11 (Functions) in Section 11.10. The NAMES constant is a tuple that must be converted to a list for use with randName() because we alter it in the generator, randomly removing one name at a time until the list is exhausted. If NAMES were a list, we could only use it once. Instead, we make it a tuple and copy it to a list to be destroyed each time the generator is used.
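The copy-then-destroy idea is easy to check on its own. The sketch below uses Python 3 syntax, a shortened NAMES tuple, and the name rand_name() for the generator; all three are choices made for this illustration.

```python
from random import randrange

NAMES = (('aaron', 8312), ('angela', 7603), ('dave', 7306))  # shortened sample

def rand_name():
    pick = list(NAMES)              # fresh, mutable copy on every call
    while pick:
        # Remove and yield one randomly chosen (name, uid) pair.
        yield pick.pop(randrange(len(pick)))

first_pass = list(rand_name())
second_pass = list(rand_name())

# Each pass yields every pair exactly once, in random order,
# and the original tuple is left untouched.
print(sorted(first_pass) == sorted(NAMES), len(NAMES))
```

Because only the copy is consumed, the generator can be restarted as many times as needed.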
Lines 105–115
The insert() function is the only other place where database-dependent code lives, and the reason is that each database is slightly different in one way or another. For example, the adapters for both SQLite and MySQL are DB-API–compliant, so both of their cursor objects have an executemany() function, whereas Gadfly's does not, so rows have to be inserted one at a time.
Another quirk is that both SQLite and Gadfly use the qmark parameter style while MySQL uses format. Because of this, the format strings are different. If you look carefully, however, you will see that the arguments themselves are created in a very similar fashion.
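DB-API 2.0 modules declare their parameter style in a module-level paramstyle attribute, so the placeholder can be chosen at run time instead of being hard-coded per database. A minimal sketch in Python 3 syntax, assuming a hypothetical helper insert_sql() and handling only the two styles discussed here; an adapter that is not fully compliant, such as Gadfly, may not expose the attribute at all.

```python
import sqlite3

# Only the two styles mentioned in the text are covered (assumption).
PLACEHOLDERS = {'qmark': '?', 'format': '%s'}

def insert_sql(db_module, ncols):
    """Build an INSERT statement using the adapter's declared paramstyle."""
    mark = PLACEHOLDERS[db_module.paramstyle]
    return 'INSERT INTO users VALUES(%s)' % ', '.join([mark] * ncols)

sql = insert_sql(sqlite3, 3)
print(sqlite3.paramstyle, sql)
```

With a format-style adapter, the same call would produce the %s version of the statement used in the MySQL branch.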
What the code does is this: for each name-userID pair, it assigns that individual to a project group (given by its project ID or prid). The project ID is chosen randomly out of four different groups (randrange(1,5)).
Line 117
This single line represents a conditional expression (read as: Python ternary operator) that returns the rowcount of the last operation (in terms of rows altered) or, if the cursor object does not support this attribute (meaning it is not DB-API–compliant), returns -1.
Conditional expressions were added in Python 2.5, so if you are using 2.4.x or older, you will need to convert it back to the “old-style” way of doing it:
getRC = lambda cur: (hasattr(cur, 'rowcount') \
    and [cur.rowcount] or [-1])[0]
If you are confused by this line of code, don’t worry about it. Check the FAQ to see why this is, and get a taste of why conditional expressions were finally added to Python in 2.5. If you are able to figure it out, then you have developed a solid understanding of Python objects and their Boolean values.
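For readers who want to check their understanding, the following sketch compares the three spellings when rowcount is 0, a perfectly valid result that happens to be false in a Boolean context. FakeCursor is a hypothetical stand-in for a real cursor, and the syntax is Python 3.

```python
class FakeCursor(object):
    rowcount = 0        # a legitimate result: no rows were altered

cur = FakeCursor()

# Without the list wrappers, a rowcount of 0 falls through to the "or" branch.
naive = hasattr(cur, 'rowcount') and cur.rowcount or -1
# A one-element list is always true, so the "and" result survives the "or".
wrapped = (hasattr(cur, 'rowcount') and [cur.rowcount] or [-1])[0]
# The conditional expression says the same thing directly.
modern = cur.rowcount if hasattr(cur, 'rowcount') else -1

print(naive, wrapped, modern)
```

The first form silently reports -1 for a cursor that does support rowcount, which is exactly the trap the list wrappers avoid.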
Lines 119–129
The update() and delete() functions randomly choose folks from one group. If the operation is update, move them from their current group to another (also randomly chosen); if it is delete, remove them altogether.
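The effect of these two operations on rowcount can be seen with a small table. This sketch is a variation rather than the script's own code: it assumes sqlite3 with an in-memory database, fixes the group numbers instead of choosing them randomly so that the counts are predictable, binds values as parameters rather than formatting them into the SQL string, and uses Python 3 syntax.

```python
import sqlite3

cxn = sqlite3.connect(':memory:')
cur = cxn.cursor()
cur.execute('CREATE TABLE users(login VARCHAR(8), uid INTEGER, prid INTEGER)')
cur.executemany('INSERT INTO users VALUES(?, ?, ?)',
                [('aaron', 8312, 1), ('angela', 7603, 1), ('dave', 7306, 2)])

# Move everyone in group 1 to group 3.
cur.execute('UPDATE users SET prid=? WHERE prid=?', (3, 1))
moved = cur.rowcount

# Remove everyone in group 2.
cur.execute('DELETE FROM users WHERE prid=?', (2,))
removed = cur.rowcount

print(moved, removed)
cur.close()
cxn.close()
```

When the randomly chosen source group happens to be empty, both counts would simply be 0.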
Lines 131–137
The dbDump() function pulls all rows from the database, formats them for printing, and displays them to the user. The print statement to display each user is the most obfuscated, so let us take it apart.
First, you should see that the data were extracted after the SELECT by the fetchall() method. So as we iterate over each user, we take the three columns (login, uid, prid), convert them to strings (if they are not already), titlecase them, and format each string to be COLSIZ columns wide, left-justified (right-hand space padding). Since the code to generate these three strings is a list (via the list comprehension), we need to convert it to a tuple for the format operator (%).
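Applied to a single row, the expression works as sketched below. The sample row is hypothetical and the print call uses Python 3 syntax; the formatting expression itself is the one from the listing.

```python
COLSIZ = 10
data = ('john', 7000, 3)    # hypothetical row: login, uid, prid

cells = [str(s).title().ljust(COLSIZ) for s in data]
line = '%s%s%s' % tuple(cells)
print(repr(line))

# Passing the list itself counts as one argument, not three.
try:
    '%s%s%s' % cells
    list_worked = True
except TypeError:
    list_worked = False
print(list_worked)
```

Each cell is padded to COLSIZ characters, so the columns line up under the header that dbDump() prints first.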
Lines 139–174
The director of this movie is main(). It makes the individual calls to each function described above that define how this script works (assuming that it does not exit due to either not finding a database adapter or not being able to obtain a connection [lines 143–145]). The bulk of it should be fairly self-explanatory given the proximity of the print statements. The last bits of main() close the cursor, and commit and close the connection. The final lines of the script are the usual ones to start the script.