[ Team LiB ]
Recipe 3.3 Determining the Differences in Data Between Two DataSet Objects
Problem
You have two DataSet objects with the same schema but containing different data and
need to determine the differences between the data in the two.
Solution
Compare the two DataSet objects with the GetDataSetDifference( ) method shown in this
solution and return the differences between the data as a DiffGram.
The sample code contains two event handlers and a single method:
Form.Load
Sets up the sample by creating two DataSet objects, each containing a different
subset of records from the Categories table from the Northwind sample database.
The default view for each table is bound to a data grid on the form.
Get Difference Button.Click
Simply calls GetDataSetDifference( ) when the user clicks the button.
GetDataSetDifference( )
This method takes two DataSet objects with identical schemas as arguments and
returns a DiffGram of the differences between the data in the two.
The C# code is shown in Example 3-3.
Example 3-3. File: DataSetDifferenceForm.cs
// Namespaces, variables, and constants
using System;
using System.Configuration;
using System.IO;
using System.Data;
using System.Data.SqlClient;
// Field name constants
private const String CATEGORYID_FIELD = "CategoryID";
DataSet dsA, dsB;
// . . .
private void DataSetDifferenceForm_Load(object sender, System.EventArgs e)
{
    SqlDataAdapter da;
    String sqlText;

    // Fill table A with Category schema and subset of data.
    sqlText = "SELECT CategoryID, CategoryName, Description " +
        "FROM Categories WHERE CategoryID BETWEEN 1 AND 5";
    DataTable dtA = new DataTable("TableA");
    da = new SqlDataAdapter(sqlText,
        ConfigurationSettings.AppSettings["Sql_ConnectString"]);
    da.Fill(dtA);
    da.FillSchema(dtA, SchemaType.Source);
    // Set up the identity column CategoryID.
    dtA.Columns[0].AutoIncrement = true;
    dtA.Columns[0].AutoIncrementSeed = -1;
    dtA.Columns[0].AutoIncrementStep = -1;
    // Create DataSet A and add table A.
    dsA = new DataSet( );
    dsA.Tables.Add(dtA);

    // Fill table B with Category schema and subset of data.
    sqlText = "SELECT CategoryID, CategoryName, Description " +
        "FROM Categories WHERE CategoryID BETWEEN 4 AND 8";
    DataTable dtB = new DataTable("TableB");
    da = new SqlDataAdapter(sqlText,
        ConfigurationSettings.AppSettings["Sql_ConnectString"]);
    da.Fill(dtB);
    da.FillSchema(dtB, SchemaType.Source);
    // Set up the identity column CategoryID.
    dtB.Columns[0].AutoIncrement = true;
    dtB.Columns[0].AutoIncrementSeed = -1;
    dtB.Columns[0].AutoIncrementStep = -1;
    // Create DataSet B and add table B.
    dsB = new DataSet( );
    dsB.Tables.Add(dtB);

    // Bind the default views for table A and table B to DataGrids
    // on the form.
    aDataGrid.DataSource = dtA.DefaultView;
    bDataGrid.DataSource = dtB.DefaultView;
}
private void getDifferenceButton_Click(object sender, System.EventArgs e)
{
    resultTextBox.Text = GetDataSetDifference(dsA, dsB);
}
private String GetDataSetDifference(DataSet ds1, DataSet ds2)
{
    // Accept any edits within the DataSet objects.
    ds1.AcceptChanges( );
    ds2.AcceptChanges( );

    // Create a DataSet to store the differences.
    DataSet ds = new DataSet( );
    DataTable dt1Copy = null;

    // Iterate over the collection of tables in the first DataSet.
    for (int i = 0; i < ds1.Tables.Count; i++)
    {
        DataTable dt1 = ds1.Tables[i];
        DataTable dt2 = ds2.Tables[i];

        // Create a copy of the table in the first DataSet.
        dt1Copy = dt1.Copy( );

        // Iterate over the collection of rows in the
        // copy of the table from the first DataSet.
        foreach(DataRow row1 in dt1Copy.Rows)
        {
            DataRow row2 = dt2.Rows.Find(row1[CATEGORYID_FIELD]);
            if(row2 == null)
            {
                // Delete rows not in table 2 from table 1.
                row1.Delete( );
            }
            else
            {
                // Modify table 1 rows that are different from
                // table 2 rows.
                for(int j = 0; j < dt1Copy.Columns.Count; j++)
                {
                    if(row2[j] == DBNull.Value)
                    {
                        // Column in table 2 is null,
                        // but not null in table 1
                        if(row1[j] != DBNull.Value)
                            row1[j] = DBNull.Value;
                    }
                    else if (row1[j] == DBNull.Value)
                    {
                        // Column in table 1 is null,
                        // but not null in table 2
                        row1[j] = row2[j];
                    }
                    else if(row1[j].ToString( ) !=
                        row2[j].ToString( ))
                    {
                        // Neither column in table 1 nor
                        // table 2 is null, and the
                        // values in the columns are
                        // different.
                        row1[j] = row2[j];
                    }
                }
            }
        }

        foreach(DataRow row2 in dt2.Rows)
        {
            DataRow row1 =
                dt1Copy.Rows.Find(row2[CATEGORYID_FIELD]);
            if(row1 == null)
            {
                // Insert rows into table 1 that are in table 2
                // but not in table 1.
                dt1Copy.LoadDataRow(row2.ItemArray, false);
            }
        }

        // Add the table to the difference DataSet.
        ds.Tables.Add(dt1Copy);
    }

    // Write an XML DiffGram containing the differences between tables.
    StringWriter sw = new StringWriter( );
    ds.WriteXml(sw, XmlWriteMode.DiffGram);
    return sw.ToString( );
}
Discussion
A DiffGram is an XML format used to specify original and current values for the data
elements in a DataSet. It does not include any schema information. The DiffGram is used
by .NET Framework applications as the serialization format for the contents of a DataSet,
including changes made to the DataSet.
A DiffGram is XML-based, which makes it platform and application independent. It is
not, however, widely used or understood outside of Microsoft .NET applications.
The DiffGram format is divided into three sections: current, original, and errors. The
original and current data in the DiffGram can also be used to report the differences
between data in two DataSet objects. For more information about the DiffGram XML
format, see Recipe 8.8.
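To make the current and original sections concrete, the following is a minimal sketch rather than part of the recipe's sample. It builds a small in-memory table (the class name DiffGramSketch, the method BuildSampleDiffGram( ), and the three-row data are assumptions chosen for illustration), makes one change of each kind, and writes the result as a DiffGram using the same WriteXml( ) call as Example 3-3.

```csharp
using System;
using System.Data;
using System.IO;

public static class DiffGramSketch
{
    // Builds a small in-memory DataSet (illustrative data, not Northwind),
    // makes one change of each kind, and returns the DiffGram as a string.
    public static string BuildSampleDiffGram()
    {
        DataTable dt = new DataTable("Categories");
        dt.Columns.Add("CategoryID", typeof(int));
        dt.Columns.Add("CategoryName", typeof(string));
        dt.PrimaryKey = new DataColumn[] { dt.Columns["CategoryID"] };

        dt.Rows.Add(new object[] { 1, "Beverages" });
        dt.Rows.Add(new object[] { 2, "Condiments" });
        dt.Rows.Add(new object[] { 3, "Confections" });

        DataSet ds = new DataSet("Sample");
        ds.Tables.Add(dt);

        // After this call all three rows have the Unchanged row state.
        ds.AcceptChanges();

        // One modified row, one deleted row, and one added row.
        dt.Rows.Find(2)["CategoryName"] = "Sauces";
        dt.Rows.Find(3).Delete();
        dt.Rows.Add(new object[] { 4, "Dairy Products" });

        // Serialize current and original values as a DiffGram.
        StringWriter sw = new StringWriter();
        ds.WriteXml(sw, XmlWriteMode.DiffGram);
        return sw.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(BuildSampleDiffGram());
    }
}
```

In the output, the modified and added rows should carry a diffgr:hasChanges attribute in the current data section, and the original values of the modified and deleted rows should appear inside the <diffgr:before> block.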
The sample code contains a method GetDataSetDifference( ) that takes two DataSet
objects with the same schema as arguments and returns a DiffGram containing the
differences in data when the second DataSet is compared to the first. Table 3-1
describes how the differences between the DataSet objects appear in the DiffGram.
Table 3-1. DiffGram representation of DataSet differences

Condition                      DiffGram representation
-----------------------------  ------------------------------------------------
Row is the same in both        Row data appears only in the current data
DataSet 1 and DataSet 2        section of the DiffGram.

Row is in both DataSet 1 and   Row data appears in the current data section of
DataSet 2, but the rows do     the DiffGram. The row element contains the
not contain the same data      attribute diffgr:hasChanges with a value of
                               "modified". The data in the current section is
                               the updated data. The original data appears in
                               the original <diffgr:before> block of the
                               DiffGram.

Row is in DataSet 2 but not    Row data appears in the current data section of
in DataSet 1                   the DiffGram. The row element contains the
                               attribute diffgr:hasChanges with a value of
                               "inserted".

Row is in DataSet 1 but not    Row data appears only in the original
in DataSet 2                   <diffgr:before> block of the DiffGram.

The sample begins by loading different subsets of data from the Categories table and
displaying them in two grids on the form. This data is editable within the grids to allow
DataSet differences as reported in the DiffGram to be investigated. In this example, the
DataSet objects both contain just a single table. To determine the differences between the
DataSet objects, the tables within the DataSet objects are compared as described next, and
changes are applied to the data in a copy of the first DataSet until it matches the second
DataSet. Once all differences in all tables are processed, the DiffGram of the copy of the
first DataSet contains the differences in the second DataSet when compared to the first
DataSet.
More specifically, as each table is processed, a copy is made of it. The data in the copy of
the first table is modified to make it consistent with the data in the second table. The
modified copy of the first table is then added to the DataSet containing the differences
between the two DataSet objects.
The process of modifying the data in the copy of the first table to match the data in
the second table involves several steps:
• Rows that are in the copy of the first table but not in the second table (based on the
primary key value) are deleted from the copy of the first table.
• If a row in the copy of the first table is also found in the second table, the columns
are compared, and any column value that differs is overwritten in the copy with
the value from the second table.
• Rows that are in the second table but not in the copy of the first table are inserted
into the copy of the first table without accepting changes.
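The three steps above can be sketched against in-memory tables as follows. This is an illustrative variation under stated assumptions, not the recipe's own code: the class TableDiffSketch and the helpers BuildTable( ), KeyOf( ), and MakeCopyMatch( ) are hypothetical names, rows are matched on whatever primary key the tables define instead of the CATEGORYID_FIELD constant, and values are compared with object.Equals( ) rather than through ToString( ). Both tables are assumed to have the same columns in the same order and a primary key.

```csharp
using System;
using System.Data;

public static class TableDiffSketch
{
    // Hypothetical helper: builds a two-column table keyed on CategoryID.
    public static DataTable BuildTable(string name, object[][] rows)
    {
        DataTable dt = new DataTable(name);
        dt.Columns.Add("CategoryID", typeof(int));
        dt.Columns.Add("CategoryName", typeof(string));
        dt.PrimaryKey = new DataColumn[] { dt.Columns["CategoryID"] };
        foreach (object[] values in rows)
            dt.Rows.Add(values);
        dt.AcceptChanges();
        return dt;
    }

    // Hypothetical helper: reads the primary key values of a row by column name.
    private static object[] KeyOf(DataRow row, DataColumn[] keyColumns)
    {
        object[] key = new object[keyColumns.Length];
        for (int i = 0; i < keyColumns.Length; i++)
            key[i] = row[keyColumns[i].ColumnName];
        return key;
    }

    // Returns a copy of source whose pending changes turn it into target.
    public static DataTable MakeCopyMatch(DataTable source, DataTable target)
    {
        DataTable copy = source.Copy();
        copy.AcceptChanges();
        DataColumn[] pk = copy.PrimaryKey;

        foreach (DataRow row in copy.Rows)
        {
            DataRow match = target.Rows.Find(KeyOf(row, pk));
            if (match == null)
            {
                // Step 1: the row is not in the second table, so delete it.
                row.Delete();
                continue;
            }
            // Step 2: overwrite only the column values that differ.
            foreach (DataColumn col in copy.Columns)
            {
                object newValue = match[col.ColumnName];
                if (!object.Equals(row[col], newValue))
                    row[col] = newValue;
            }
        }

        // Step 3: insert rows found only in the second table,
        // without accepting changes so that they stay in the Added state.
        foreach (DataRow row in target.Rows)
        {
            if (copy.Rows.Find(KeyOf(row, pk)) == null)
                copy.LoadDataRow(row.ItemArray, false);
        }
        return copy;
    }

    public static void Main()
    {
        DataTable a = BuildTable("TableA", new object[][] {
            new object[] { 1, "Beverages" },
            new object[] { 2, "Condiments" },
            new object[] { 3, "Confections" } });
        DataTable b = BuildTable("TableB", new object[][] {
            new object[] { 2, "Sauces" },
            new object[] { 3, "Confections" },
            new object[] { 4, "Dairy Products" } });

        // Print the row state of each row in the reconciled copy.
        foreach (DataRow row in MakeCopyMatch(a, b).Rows)
            Console.WriteLine(row.RowState);
    }
}
```

Because values are written only when they differ, rows that are identical in both tables keep the Unchanged row state and so appear in the DiffGram without a diffgr:hasChanges attribute. Writing every column unconditionally would instead mark every matched row as modified.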