Test your database schema

◆ Problem

You want to test your database schema, verifying such things as nullable columns, indices, foreign key constraints, and triggers.

◆ Background

We Agilists love to live in a dream world where the team [9] collectively owns the database. This is a world in which programmers and database administrators work together in harmony to build the perfect database for our collective customer (or boss). The database schema is flexible and everyone is responsive to change.

8 This is one kind of test that does not require any assertions. Just invoke the method, and if it does not throw any exceptions, then the test passes. Not all tests require explicit assertions to be useful.

9 http://groups.yahoo.com/group/extremeprogramming/files. Look for the document called “One Team.”

328 CHAPTER 10 Testing and JDBC

When change happens, everyone knows about it immediately. When this happens, it is a beautiful thing.

The reality is that in most organizations there are large walls between “the database group” and “the developers,” making such harmonious collaboration nearly impossible. Even if not the result of some nefarious management control strategy, if your application has grown around the database, then the chances are good that a separate team maintains the database and acts as gatekeeper: all changes go through them. Whenever you submit a request to change the database schema, you need to be sure that your change was received correctly and processed correctly, and what better way to do that than with tests?

NOTE It actually happened... Before you believe that nothing could ever go wrong submitting database schema changes to a separate team, consider this story. A programmer—we’ll name him Joe—builds his component against an in-memory data model before translating that model into a relational database schema. He creates the schema, right down to the necessary DDL, and then submits a schema change request to the database team.

One week later, after his schema changes are integrated into the weekly test driver, he runs his tests against the new database schema. Lo and behold, one of them fails! Surprised, Joe examines the test driver’s database creation script, only to find that the database team has misplaced a unique index on one of the tables. Even more surprised, Joe asks the database team lead what happened. “We maintain the schema using ERwin,” [10] the team lead says, “so we imported your DDL into the tool, and then exported the entire database schema into the test driver. ERwin must have messed up somewhere.” Even though Joe thought he was being precise by submitting DDL for the database tables, something was lost in the translation. Joe learned a valuable lesson: the database schema could break!

If Joe’s experience resonates with you, then you need to add some tests to protect yourself against these kinds of surprises.

◆ Recipe

Perhaps the best solution to this problem has nothing to do with testing: create a single, unambiguous description of your application’s data model from which the DDL scripts are generated. Martin Fowler describes using XML documents, which are easy to parse and therefore easy to verify using XPath (see chapter 9, “Testing and XML”), as the single description of the application’s data layout from which a database schema may be generated [PEAA, 49]. You can find tools for parsing XML today, but we could not find tools for parsing DDL for Java.

10 A relational database modeling tool, part of the AllFusion Modeling Suite, from Computer Associates (www3.ca.com).

Let us now assume that you need to test the database schema without being able (or allowed) to use XML documents to represent it. In this case the general strategy is to write a test for each of the following aspects of your database schema.

Verify that:

■ Tables and columns exist.

■ Primary key columns are correct.

■ Foreign key constraints are correct, including cascade properties.

■ Triggers are correct.

■ Default values and check constraints are correct.

■ Stored procedures are correct.

■ Database object privileges are correct.

For any of these kinds of tests, there are two general strategies to consider: either make assertions on database meta data, or test against the database itself, judging the correctness of the schema by performing queries and checking the results. We prefer the meta data strategy, because it does not depend on any data in the database, but meta data support varies from one database vendor to another.
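As a rough illustration of the meta data strategy, the sketch below reads column nullability for one table through the standard JDBC call DatabaseMetaData.getColumns(). The class ColumnMetaData and its method are hypothetical names of our own, not part of the Coffee Shop code, and how faithfully a given JDBC provider fills in the NULLABLE column is exactly the kind of thing you would confirm with a Learning Test first.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for schema tests; not part of the book's code. */
final class ColumnMetaData {
    private ColumnMetaData() {}

    /**
     * Returns a map of column name to "is nullable" for one table, as
     * reported by DatabaseMetaData.getColumns(). Column names are
     * upper-cased so the caller need not know how the driver reports
     * identifier case. The schema and table arguments are passed to the
     * driver unchanged.
     */
    static Map<String, Boolean> nullabilityByColumn(
            DatabaseMetaData metaData, String schema, String table)
            throws SQLException {
        Map<String, Boolean> result = new LinkedHashMap<>();
        // null catalog = do not filter by catalog; "%" = all columns
        try (ResultSet columns = metaData.getColumns(null, schema, table, "%")) {
            while (columns.next()) {
                String name = columns.getString("COLUMN_NAME");
                int nullable = columns.getInt("NULLABLE");
                result.put(name.toUpperCase(Locale.ROOT),
                        nullable == DatabaseMetaData.columnNullable);
            }
        }
        return result;
    }
}
```

A test could then obtain the meta data from connection.getMetaData(), call the helper for schema CATALOG and table BEANS, and assert on individual entries, for example that the entry for COFFEENAME is false if you expect that column to be declared NOT NULL.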

Our Coffee Shop application uses a Mimer (www.mimer.se) database to store business data, and we were not sure how well its JDBC provider supports database meta data, so we tried the Learning Test in listing 10.3.

Listing 10.3 A Learning Test for the database

    public void testTablesAndColumnsExist() throws Exception {
        MimerDataSource coffeeShopDataSource = new MimerDataSource();
        coffeeShopDataSource.setDatabaseName("coffeeShopData");
        coffeeShopDataSource.setUser("admin");
        coffeeShopDataSource.setPassword("adm1n");

        Connection connection = coffeeShopDataSource.getConnection();
        DatabaseMetaData databaseMetaData = connection.getMetaData();
        ResultSet schemasResultSet = databaseMetaData.getSchemas();

        Map databaseSchemaDescriptors = new HashMap();
        while (schemasResultSet.next()) {
            databaseSchemaDescriptors.put(
                schemasResultSet.getString("TABLE_SCHEM"),
                schemasResultSet.getString("TABLE_CATALOG"));
        }

        schemasResultSet.close();
        connection.close();

        fail(databaseSchemaDescriptors.toString());
    }

This is essentially a “printf,” but it has the side effect of being easily transformed into a regression test that we can use to uncover incompatibilities or other changes in future versions of the Mimer JDBC provider. After looking at the output from the fail() statement, we can decide what to assert. The first thing we learned was that the column TABLE_CATALOG is not in the result set, even though the Javadoc for DatabaseMetaData.getSchemas() says it ought to be there. We need a closer look at the schema meta data. Fortunately, we can use Diasparsoft Toolkit’s JdbcUtil to get a human-readable representation of a JDBC result set.

We placed this line of code before trying to process the result set:

fail(JdbcUtil.resultSetAsTable(schemasResultSet).toString());

The result set has only one column, TABLE_SCHEM, and sure enough, the CATALOG schema we expect to be there is there. We change the test to reflect this knowledge and remove this bit of trace code. Listing 10.4 shows the new test.

Listing 10.4 Our Learning Test after having learned something

    public void testTablesAndColumnsExist() throws Exception {
        MimerDataSource coffeeShopDataSource = new MimerDataSource();
        coffeeShopDataSource.setDatabaseName("coffeeShopData");
        coffeeShopDataSource.setUser("admin");
        coffeeShopDataSource.setPassword("adm1n");

        Connection connection = coffeeShopDataSource.getConnection();
        DatabaseMetaData databaseMetaData = connection.getMetaData();

        // What do we get?
        ResultSet schemasResultSet = databaseMetaData.getSchemas();

        List schemaNames = new LinkedList();
        while (schemasResultSet.next()) {
            schemaNames.add(schemasResultSet.getString("TABLE_SCHEM"));
        }

        schemasResultSet.close();
        connection.close();

        // A more direct assertion
        assertTrue(schemaNames.contains("CATALOG"));
    }


We changed the “schema descriptors”—which we thought would have more than one property—to “schema names,” which are just Strings. We no longer need a Map, because each item we want to store in the collection is now a single value—a List will do. Our assertion is more direct and easier to understand: we expect there to be a schema called CATALOG. Now this test is slightly brittle, because it assumes that the schema meta data will come back in uppercase. If you are concerned about this, then use Diasparsoft Toolkit’s CollectionUtil, which provides a case-insensitive search capability for collections of strings. Replace the assertion above with the following code:

    assertTrue(
        CollectionUtil.stringCollectionContainsIgnoreCase(
            schemaNames, "catalog"));
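If you would rather not depend on the toolkit for this one check, the same idea can be sketched with the JDK alone. The class and method names below are our own illustration and are not part of Diasparsoft Toolkit; we assume only that the schema names have been collected as strings.

```java
import java.util.Collection;

/** Hypothetical stand-in for a case-insensitive "contains" on strings. */
final class SchemaNames {
    private SchemaNames() {}

    /** Returns true if any element equals expected, ignoring case. */
    static boolean containsIgnoreCase(Collection<String> names, String expected) {
        for (String name : names) {
            // Drivers may report null for some entries; skip those.
            if (name != null && name.equalsIgnoreCase(expected)) {
                return true;
            }
        }
        return false;
    }
}
```

The assertion then becomes assertTrue(SchemaNames.containsIgnoreCase(schemaNames, "catalog")), which passes whether the provider reports the schema as CATALOG, catalog, or Catalog.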

Both of the preceding tests now pass, and we have successfully verified the existence of the schema CATALOG in our database. You can use the remaining parts of the DatabaseMetaData API to verify the existence of tables, columns, and constraints, depending as always on the degree to which your database vendor supports these features. Not all do, including at least one of the big players in the industry.

What to do when meta data lets you down? Return to the basics: describe the expected behavior for a database with the desired characteristic and write the corresponding test. The one in listing 10.5 verifies that coffeeName is unique within the table CATALOG.BEANS, even though coffeeName is not a primary key.

Listing 10.5 Verifying a unique index

    public void testCoffeeNameUniquenessConstraint() throws Exception {
        MimerDataSource coffeeShopDataSource = new MimerDataSource();
        coffeeShopDataSource.setDatabaseName("coffeeShopData");
        coffeeShopDataSource.setUser("admin");
        coffeeShopDataSource.setPassword("adm1n");

        Connection connection = coffeeShopDataSource.getConnection();

        PreparedStatement createBeanProductStatement =
            connection.prepareStatement(
                "insert into catalog.beans "
                    + "(productId, coffeeName, unitPrice) "
                    + "values (?, ?, ?)");

        createBeanProductStatement.clearParameters();
        createBeanProductStatement.setString(1, "000");
        createBeanProductStatement.setString(2, "Sumatra");
        createBeanProductStatement.setInt(3, 725);

        assertEquals(1, createBeanProductStatement.executeUpdate());

        // How will Mimer react to the duplicate entry?
    }

Because we are new to Mimer, we are unsure how it will react to the duplicate entry, so we do not know which exception to expect—or whether to expect one at all. (You never know.) This means that we start again with a Learning Test. Let us execute the same update a second time and see what happens. We replace the comment with this line of code:

createBeanProductStatement.executeUpdate();

When we execute the test, Mimer tells us java.sql.SQLException: UNIQUE constraint violation, so we know to expect an SQLException, but we don’t know which SQLState corresponds, so we refine the Learning Test and replace the preceding line with this block:

    try {
        createBeanProductStatement.executeUpdate();
    }
    catch (SQLException expected) {
        fail(expected.getSQLState());
    }

When we execute the test, we get another UNIQUE constraint violation message. What?!

Oh yes, the data is in the database from the previous test run. This is why we recommend writing as many tests as possible without involving an actual database: even in a simple case such as this we have the complication of setting up and tearing down the data. See recipe 10.6, “Manage external data in your test fixture,” for some strategies for managing a test fixture that includes a database. Now back to our test. We add code at the start of the test to delete all data from table CATALOG.BEANS, then we execute the test. The SQLState is 23000. We consult the Mimer documentation quickly and determine that this SQLState code represents an “integrity constraint violation.” Bingo. See listing 10.6 for the final version of this test.

Listing 10.6 CoffeeShopDatabaseSchemaTest, the final version

    package junit.cookbook.coffee.jdbc.test;

    import java.sql.*;
    import java.util.LinkedList;
    import java.util.List;

    import junit.framework.TestCase;

    import com.diasparsoftware.java.util.CollectionUtil;
    import com.mimer.jdbc.MimerDataSource;

    public class CoffeeShopDatabaseSchemaTest extends TestCase {
        public void testCoffeeNameUniquenessConstraint() throws Exception {
            MimerDataSource coffeeShopDataSource = new MimerDataSource();
            coffeeShopDataSource.setDatabaseName("coffeeShopData");
            coffeeShopDataSource.setUser("admin");
            coffeeShopDataSource.setPassword("adm1n");

            Connection connection = coffeeShopDataSource.getConnection();

            connection.createStatement().executeUpdate(
                "delete from catalog.beans");

            PreparedStatement createBeanProductStatement =
                connection.prepareStatement(
                    "insert into catalog.beans "
                        + "(productId, coffeeName, unitPrice) "
                        + "values (?, ?, ?)");

            createBeanProductStatement.clearParameters();
            createBeanProductStatement.setString(1, "000");
            createBeanProductStatement.setString(2, "Sumatra");
            createBeanProductStatement.setInt(3, 725);

            // Check only one row inserted
            assertEquals(1, createBeanProductStatement.executeUpdate());

            try {
                createBeanProductStatement.executeUpdate();
                fail("Added two coffee products with the same name?!");
            }
            catch (SQLException expected) {
                assertEquals(
                    String.valueOf(23000),
                    expected.getSQLState());
            }

            // JDBC resource cleanup code omitted for brevity
        }
    }

One thing to bear in mind is that this test assumes some information about the database schema. In particular, the failure message assumes that the duplicate key field is the coffee name, as opposed to, perhaps, the product ID. This is the kind of subtle dependency that tends to creep into tests for JDBC code, especially when testing against a live database. Suppose that three months from now the uniqueness constraints on the table change, and this test fails. The failure message, while trying to be helpful by being precise, is now possibly misleading. For a small table of three columns, that may not be a great problem; however, for a table with dozens of columns, this could waste considerable debugging time by throwing the programmer off track. We tend to err on the side of putting more information in failure messages, but like any habit, there are times when it becomes a bad habit. If you are aware of the potential problem, then you are in a better position to handle it should it arise.
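The final test in listing 10.6 also compares against the exact SQLState string 23000. The SQL standard groups SQLState codes into two-character classes, and class 23 is the one reserved for integrity constraint violations, so a slightly looser check is possible. The helper below is a sketch of our own (the name SqlStates is hypothetical); it tells you that some integrity constraint was violated, not which one, so it does nothing to remove the schema assumption described above.

```java
import java.sql.SQLException;

/** Hypothetical helper; not part of the book's code or of JDBC itself. */
final class SqlStates {
    private SqlStates() {}

    /**
     * Returns true when the exception's SQLState belongs to class 23,
     * the SQL standard's class for integrity constraint violations.
     * A driver that reports no SQLState at all yields false.
     */
    static boolean isIntegrityConstraintViolation(SQLException e) {
        String state = e.getSQLState();
        return state != null && state.startsWith("23");
    }
}
```

Inside the catch block you could then write assertTrue(SqlStates.isIntegrityConstraintViolation(expected)). Whether to prefer this over the exact code is a judgment call: the exact code documents what this provider does today, while the class check is less likely to break if a future provider version reports a more specific code within the same class.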

◆ Discussion

We have encountered a few issues to consider when testing against a live database, even when you have the database all to yourself. To achieve test isolation you need to clean the database before each test. There are two key consequences to this practice:

■ Table dependencies grow quickly—You must add logic to clean all the tables, and the more complex your foreign key constraints, the more complex this logic becomes. It is not uncommon in a medium-sized application to have upwards of 40 database tables, all but a few of which have foreign key constraints that determine the order in which the tables must be deleted.

■ The database is an expensive external resource—The more tests you write that exercise the database, the more slowly your tests will execute. Remember that one of the goals of Programmer Testing is that tests be fast so that you can and will execute them frequently while programming. This is what provides the refactoring safety net you need to keep your design flexible and reduce the cost of adding features.
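To make the first point concrete, here is a sketch of how the deletion order follows from the foreign key constraints: a table must be emptied before any table it references. The class DeletionOrder is our own illustration, and the caller is assumed to supply the map of references by hand or from meta data. A table that references itself, or any other cycle, is reported as an error rather than resolved.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: orders tables so that cleanup never violates a foreign key. */
final class DeletionOrder {
    private DeletionOrder() {}

    /**
     * @param references maps each table to the tables it references
     *                   through foreign keys
     * @return all tables, each referencing table before the tables it references
     */
    static List<String> forTables(Map<String, Set<String>> references) {
        List<String> parentsFirst = new ArrayList<>();
        Set<String> finished = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        // Sorted iteration keeps the result stable from run to run.
        for (String table : new TreeSet<>(references.keySet())) {
            visit(table, references, finished, inProgress, parentsFirst);
        }
        // Depth-first post-order lists referenced tables first; deletion needs the reverse.
        Collections.reverse(parentsFirst);
        return parentsFirst;
    }

    private static void visit(String table, Map<String, Set<String>> references,
            Set<String> finished, Set<String> inProgress, List<String> parentsFirst) {
        if (finished.contains(table)) {
            return;
        }
        if (!inProgress.add(table)) {
            throw new IllegalStateException("Cyclic foreign keys involving " + table);
        }
        Set<String> parents = references.getOrDefault(table, Collections.<String>emptySet());
        for (String parent : new TreeSet<>(parents)) {
            visit(parent, references, finished, inProgress, parentsFirst);
        }
        inProgress.remove(table);
        finished.add(table);
        parentsFirst.add(table);
    }
}
```

A fixture could iterate over the returned list and issue one "delete from" statement per table. With 40 tables the map itself becomes something you have to maintain, which is the growth in complexity this bullet warns about.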

NOTE Crunch the numbers. It is more straightforward to write tests against a live database, especially for the programmer not accustomed to decoupling JDBC client code from a physical database. It is important to crunch the numbers and realize the benefit of refactoring away from the database. We wrote two tests to verify which columns in a table were nullable. The first approach was to take Martin Fowler’s advice and move the database definition to XML; the second approach was to use database meta data as we have described here. On a Pentium-4 1.7 GHz computer with 512 MB of RAM, the former test took an average of 0.05 seconds to execute, while the latter test took 0.5 seconds. The difference appears minuscule, but this is to check seven database columns in the same table. Assume there are 1000 such columns to check, spread over about 50 to 200 tables, depending on the database designer’s philosophy and design sense. Multiply by 143 (1000/7, rounded up) and the difference is 143 * 0.45 = 64.35 seconds, or more than one minute! Now as your test suite grows, startup costs such as establishing database connections cease to dominate the suite’s execution time as much as they do for a smaller suite. Even if we are overstating the difference, and it is closer to only 30 seconds, that is 30 seconds per test execution per member of the team per day for, on average, half the lifetime of your project. Pull out your calculator and see how that adds up.
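In place of a calculator, the arithmetic in the note can be sketched as two small functions. The measured figures (0.5 and 0.05 seconds, seven columns per test, 1000 columns) come from the note; the class name SuiteCost and every figure passed to extraHours, such as team size, runs per day, and number of working days, are assumptions you would replace with your own.

```java
/** Hypothetical calculator for the cost of database-bound tests. */
final class SuiteCost {
    private SuiteCost() {}

    /**
     * Extra seconds per suite run when every test takes slowSeconds
     * instead of fastSeconds, for enough tests to cover columnsTotal
     * columns at columnsPerTest columns each.
     */
    static double extraSecondsPerRun(double slowSeconds, double fastSeconds,
            int columnsTotal, int columnsPerTest) {
        long tests = Math.round((double) columnsTotal / columnsPerTest);
        return tests * (slowSeconds - fastSeconds);
    }

    /** Total extra hours across the team over the given number of working days. */
    static double extraHours(double secondsPerRun, int runsPerDay,
            int teamSize, int workingDays) {
        return secondsPerRun * runsPerDay * teamSize * workingDays / 3600.0;
    }
}
```

Calling SuiteCost.extraSecondsPerRun(0.5, 0.05, 1000, 7) reproduces the note's figure of roughly 64.35 seconds. Taking the more conservative 30 seconds and assuming, purely for illustration, five programmers who each run the suite ten times a day for 120 working days, SuiteCost.extraHours(30, 10, 5, 120) comes to 50 hours of waiting.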

So while we wanted to provide you with examples of verifying the database schema against a live database, it is generally worth the effort to exclude the database from the equation. If you are truly concerned that the database does not work, write a few Learning Tests against the database, and then run them as part of your background build, say using Anthill or Cruise Control. At a minimum, you will have End-to-End Tests that verify that your application talks to the database correctly by testing through your application’s user interface [11]. If the tests for your data access layer also verify the way you integrate with a live database, then you are duplicating efforts between the two kinds of tests. This is a waste. Focus your Object Tests on individual objects instead.

◆ Related

■ 10.6—Manage external data in your test fixture

Test a method that returns nothing

Test throwing the right exception