CHAPTER 46 SQLCLR: Developing SQL Server Objects in .NET

When MatchAll() is invoked, it returns an instance of the RegexReader class. In its constructor, RegexReader stores the passed-in regular expression, input string, and options in its data members. Then, at initialization time, SQL Server invokes RegexReader's GetEnumerator() instance method, which returns an instance of RegexEnumerator. RegexEnumerator does all the real work, using the members of the RegexReader object that is passed into its constructor and stored in its private _reader field.

Reset() is called in RegexEnumerator's constructor so that it can initialize its members in the following way:

- RegexEnumerator uses a private Regex object (_rex) to perform the match and stores the resulting array of Match objects (Match[]) in a private field (_match).
- The ordinal number of the match is kept in _matchIndex and initialized to 0 (in case there are no matches).

When Reset() is complete, it is up to SQL Server to iterate through the matches by calling MoveNext(). MoveNext() does the work of re-creating the row (represented as a private array of object called _current) for every successful match stored in _match:

- _current[0] is set to the value of _matchIndex (incremented on a per-match basis) and corresponds to the output table column MatchIndex (defined in the TableDefinition named parameter).
- _current[1] is set to an XML document that is built for every match and contains subnodes for each group and group capture. This value corresponds to the output table column GroupList.

When SQL Server uses the RegexEnumerator, it first calls MoveNext() and then reads the Current property. Next, execution passes to the method specified in FillRowMethodName (FillMatchAll()). Finally, the CLR passes the latest value of _current to FillMatchAll() as the row parameter. Each out parameter of FillMatchAll() is set to the value for the corresponding column in the output row.
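The Reset()/MoveNext()/Current cycle described above can be tried outside SQL Server. The following is a minimal sketch, not the chapter's listing: the class names MatchRowReader and MatchRowEnumerator are hypothetical, the options argument is omitted, and the second column holds the matched text rather than an XML document, purely to keep the example short.

```csharp
using System;
using System.Collections;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the reader: it only holds the inputs.
public sealed class MatchRowReader : IEnumerable
{
    public readonly string Pattern;
    public readonly string Input;

    public MatchRowReader(string pattern, string input)
    {
        Pattern = pattern;
        Input = input;
    }

    public IEnumerator GetEnumerator()
    {
        return new MatchRowEnumerator(this);
    }
}

// Hypothetical stand-in for the enumerator: it builds one row per match.
public sealed class MatchRowEnumerator : IEnumerator
{
    private readonly MatchRowReader _reader;
    private MatchCollection _matches;
    private int _matchIndex;
    private object[] _current;

    public MatchRowEnumerator(MatchRowReader reader)
    {
        _reader = reader;
        Reset(); // initialize members, as the text describes
    }

    public object Current
    {
        get { return _current; }
    }

    public void Reset()
    {
        _matches = Regex.Matches(_reader.Input, _reader.Pattern);
        _matchIndex = 0;
        _current = null;
    }

    public bool MoveNext()
    {
        if (_matchIndex >= _matches.Count)
            return false;

        Match m = _matches[_matchIndex];
        _matchIndex++;
        // Column 0: ordinal of the match; column 1: the matched text
        // (the chapter's version stores an XML document here instead).
        _current = new object[] { _matchIndex, m.Value };
        return true;
    }
}

public static class EnumeratorDemo
{
    public static void Main()
    {
        // foreach plays the role SQL Server plays: MoveNext(), then Current.
        foreach (object[] row in new MatchRowReader(@"\d+", "a1 b22 c333"))
            Console.WriteLine("{0}: {1}", row[0], row[1]);
        // prints:
        // 1: 1
        // 2: 22
        // 3: 333
    }
}
```

In the SQLCLR version, the body of the foreach loop corresponds to the fill-row method, which receives each row object and copies its elements into out parameters.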
NOTE
If this implementation seems daunting, the best way to overcome that is to walk through the function line by line in debug mode, using Visual Studio.

Developing Managed User-Defined Types (UDTs)

In the preceding section, you used a managed user-defined type (UDT) called RegexPattern to store the regular expression pattern. In this section, you explore how custom UDTs are built and used in SQL Server.

The first thing to note is that although the name UDT is the same as that of the extended data types built using SQL Server 2000, the two are by no means the same in SQL Server 2008. SQL Server 2000's UDTs were retroactively renamed "alias data types" in SQL Server 2005. SQL Server 2008 UDTs are structs (value types) built using the .NET Framework.

To create a UDT of your own, you right-click your Visual Studio project and then select Add, User-Defined Type. Next, you should name both the class and its autogenerated method RegexPattern.

Notice the attribute used to decorate the RegexPattern struct: SqlUserDefinedType. Its constructor has the following parameters:

- Format: Tells SQL Server how serialization (and its complement, deserialization) of the struct should be done. You specify Format.Native to let SQL Server handle serialization for you. You specify Format.UserDefined to do your own serialization. When Format.UserDefined is specified, the struct must implement the IBinarySerialize interface to explicitly convert the values from string (or int, or whatever the type of the value passed into the constructor is) to binary and back.
- A named parameter list: This list contains the following:
    - IsFixedLength: Tells SQL Server that the byte count of the struct is the same for all its instances.
    - IsByteOrdered: Tells SQL Server that the bytes of the struct are ordered so that it may be used in binary comparisons, as with ORDER BY, GROUP BY, or PARTITION BY clauses, in indexing, and when the UDT is a primary or foreign key.
    - MaxByteSize: Tells SQL Server not to allow more than the specified number of bytes to be held in an instance of the UDT. An explicit value can be at most 8,000 bytes; SQL Server 2008 also accepts -1, which allows an instance to grow up to 2GB. You must specify this when using Format.UserDefined.
    - Name: Tells the deployment routine what to call the UDT when it is created in the database.
    - ValidationMethodName: Tells SQL Server which method of the struct to use to validate it when it has been deserialized (in certain cases).

The implementation contract for any UDT is as follows:

- It must provide a static method called Parse(), used by SQL Server for conversion to the struct from a string.
- It must provide an instance method that overrides the default ToString() method for converting from the struct to a string.
- It must implement the INullable interface, providing a Boolean instance property called IsNull, used by SQL Server to determine whether an instance is null.
- It must have a static property called Null of the type of the struct. This property returns an instance of the struct whose value is null (that is, where IsNull is true for that instance). (This concept seems to be derived from the "null object" design pattern.)

Also, you need to be aware that UDTs can have only read-only static fields, they cannot use inheritance, and they cannot have overloaded methods (except the constructor, whose overloads are mainly used when ADO.NET is the calling context).

Given these fairly stringent requirements, Listing 46.6 provides an implementation of a UDT representing a regular expression pattern.
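Before reading the full listing, it can help to see the Format.UserDefined serialization requirement in isolation. The following is a minimal sketch that runs outside SQL Server: the class SerializationSketch and its Save() and Load() helpers are hypothetical names that mimic what a Write()/Read() pair does with a BinaryWriter and BinaryReader, assuming that only the pattern text needs to be persisted.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class SerializationSketch
{
    // Mimics the Write() side: persist only the pattern text.
    public static byte[] Save(Regex rx)
    {
        using (MemoryStream ms = new MemoryStream())
        using (BinaryWriter w = new BinaryWriter(ms))
        {
            w.Write(rx.ToString()); // length-prefixed string
            w.Flush();
            return ms.ToArray();
        }
    }

    // Mimics the Read() side: rebuild the Regex from the stored text.
    public static Regex Load(byte[] bytes)
    {
        using (MemoryStream ms = new MemoryStream(bytes))
        using (BinaryReader r = new BinaryReader(ms))
        {
            return new Regex(r.ReadString());
        }
    }

    public static void Main()
    {
        Regex original = new Regex(@"(\w+)\s+?(?!bar)");
        byte[] bytes = Save(original);
        Regex restored = Load(bytes);

        // The pattern text survives the round trip.
        Console.WriteLine(restored.ToString() == original.ToString());
        // The serialized form must fit within the declared MaxByteSize.
        Console.WriteLine(bytes.Length <= 250);
    }
}
```

Because the serialized size depends on the length of the pattern, a UDT like this one is not fixed length, and long patterns can exceed a small MaxByteSize.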
LISTING 46.6 A UDT Representing a Regular Expression Pattern

    using System;
    using System.Data;
    using System.Data.Sql;
    using System.Data.SqlTypes;
    using Microsoft.SqlServer.Server;
    //added
    using System.Text.RegularExpressions;

    [Serializable]
    [Microsoft.SqlServer.Server.SqlUserDefinedType(
        Format.UserDefined, // requires IBinarySerialize
        IsFixedLength = false,
        IsByteOrdered = true,
        MaxByteSize = 250,
        ValidationMethodName = "RegexPatternValidator"
    )]
    public struct RegexPattern : INullable, IBinarySerialize
    {
        //instance data fields
        private Regex _reg;
        private bool _null;

        //constructor
        public RegexPattern(String Pattern)
        {
            _reg = new Regex(Pattern); // throws ArgumentException if invalid
            _null = (Pattern == String.Empty);
        }

        //instance method
        public override string ToString()
        {
            // a null instance has no Regex to convert
            return _reg == null ? string.Empty : _reg.ToString();
        }

        //instance property
        public bool IsNull
        {
            get
            {
                return _null || _reg == null || _reg.ToString() == string.Empty;
            }
        }

        //static property
        public static RegexPattern Null
        {
            get
            {
                RegexPattern NullInstance = new RegexPattern();
                NullInstance._null = true;
                return NullInstance;
            }
        }

        //static method
        public static RegexPattern Parse(SqlString Pattern)
        {
            if (Pattern.IsNull)
                return Null;
            return new RegexPattern(Pattern.Value);
        }

        //private instance method
        private bool RegexPatternValidator()
        {
            return _reg != null && _reg.ToString() != string.Empty;
        }

        //instance method
        public Int32 Match(String Input)
        {
            // match the raw input text against the stored pattern
            Match m = _reg.Match(Input);
            return Convert.ToInt32(m.Success);
        }

        //instance property
        public bool IsFullStringMatch
        {
            get
            {
                // true when the pattern text itself is anchored with ^ and $
                Match m = Regex.Match(_reg.ToString(), @"\^.+\$");
                return m.Success;
            }
        }

        //instance method
        [SqlMethod(
            DataAccess = DataAccessKind.None,
            IsMutator = false,
            IsPrecise = true,
            OnNullCall = false,
            SystemDataAccess = SystemDataAccessKind.None
        )]
        public Int32 MatchingGroupCount(SqlString Input)
        {
            Match m = _reg.Match(Input.Value);
            return m.Groups.Count;
        }

        //static method
        [SqlMethod(
            DataAccess = DataAccessKind.None,
            IsMutator = false,
            IsPrecise = true,
            OnNullCall = false,
            SystemDataAccess = SystemDataAccessKind.None
        )]
        public static bool UsesLookaheads(RegexPattern p)
        // must be static to be called with :: syntax
        {
            Match m = Regex.Match(p.ToString(), @
            if (m != null)
                return m.Success;
            else
                return false;
        }

        #region IBinarySerialize Members
        public void Read(System.IO.BinaryReader r)
        {
            _reg = new Regex(r.ReadString());
        }

        public void Write(System.IO.BinaryWriter w)
        {
            w.Write(_reg.ToString());
        }
        #endregion
    }

As you can see by scanning this code, it meets the required implementation contract. In addition, it declares static and instance methods, as well as instance properties. Both static and instance methods can optionally be decorated with the SqlMethod attribute. By default, methods of UDTs are declared to be nondeterministic and nonmutator, meaning that they do not change the value of the instance.

You use the named parameters of the constructor for SqlMethod to override this and other behaviors. These are its named parameters:

- DataAccess: Tells SQL Server whether the method will access user table data on the server in its body. If you provide the enum value DataAccessKind.None, some optimizations may be made.
- SystemDataAccess: Tells SQL Server whether the method will access system table data on the server in its body. Again, if you provide the enum value SystemDataAccessKind.None, some optimizations may be made.
- IsDeterministic: Tells SQL Server whether the method always returns the same values, given the same input parameters.
- IsMutator: Must be set to true if the method changes the state of the instance.
- Name: Tells the deployment routine what to call the method when it is registered in the database.
- OnNullCall: When set to false, tells SQL Server to return null without invoking the method if any argument to the method is null.
- InvokeIfReceiverIsNull: Indicates whether to invoke the method if the instance of the struct itself is null.

To create this type in SQL Server without using Visual Studio, you use the CREATE TYPE DDL syntax, as follows:

    CREATE TYPE RegexPattern EXTERNAL NAME SQLCLR.RegexPattern

Note that DROP TYPE TypeName is also available, but there is no ALTER TYPE statement.

Let us add a few words on the code in Listing 46.6. The constructor of RegexPattern validates the expression passed to it by handing it to the constructor of System.Text.RegularExpressions.Regex. If you assign an invalid regex in a T-SQL SET statement (to a variable of type RegexPattern), or if the UDT is used as a table column data type and a value is modified, the Regex class does its usual pattern validation, as it does in the .NET world.

Let's look at some of the ways you can use your UDT. The following example shows how to call all the public members (both static and instance) of RegexPattern:

    DECLARE @rp RegexPattern
    SET @rp = '(\w+)\s+?(?!bar)'
    SELECT
        @rp.ToString() AS ToString,
        @rp.IsFullStringMatch AS FullStringMatch,
        @rp.Match('uncle freddie') AS Match,
        @rp.MatchingGroupCount('loves elken') AS GroupCount,
        RegexPattern::UsesLookaheads(@rp) AS UsesLH
    go

    ToString            FullStringMatch  Match  GroupCount  UsesLH
    ------------------  ---------------  -----  ----------  ------
    (\w+)\s+?(?!bar)    0                1      2           1

    (1 row(s) affected)

Note that static members can be called (without an instance, that is) by using the following new syntax:

    TypeName::MemberName(OptionalParameters)

To try this, you can create a table and populate it as shown here:

    CREATE TABLE dbo.RegexTest
    (
        PatternId int IDENTITY(1,1),
        Pattern RegexPattern
    )
    GO
    INSERT RegexTest SELECT '\d+'
    INSERT RegexTest SELECT 'foo (?:bar)'
    INSERT RegexTest SELECT '(\s+()'

    Msg 6522, Level 16, State 2, Line 215
    A .NET Framework error occurred during execution of user defined routine or
    aggregate 'RegexPattern':
    System.ArgumentException: parsing "(\s+()" - Not enough )'s.
    System.ArgumentException:
       at System.Text.RegularExpressions.RegexParser.ScanRegex()
       at System.Text.RegularExpressions.RegexParser.Parse(String re, RegexOptions op)
       at System.Text.RegularExpressions.Regex..ctor(String pattern,
          RegexOptions options, Boolean useCache)
       at System.Text.RegularExpressions.Regex..ctor(String pattern)
       at RegexPattern..ctor(String Pattern)
       at RegexPattern.Parse(SqlString Pattern)

Do you see what happens when you try to insert an invalid regex pattern into the Pattern column (the third INSERT statement)? The parenthesis count is off, and the CLR tells you so in the query window's output.

Because the UDT has the IsByteOrdered named parameter set to true, you can index this column (based on the struct's serialized value) and use it in ORDER BY clauses. Here's an example:

    CREATE NONCLUSTERED INDEX PatternIndex ON dbo.RegexTest(Pattern)
    GO
    SELECT
        Pattern.ToString() AS PatString,
        RegexPattern::UsesLookaheads(Pattern) AS UsesLookaheads
    FROM RegexTest
    ORDER BY Pattern
    go

    PatString     UsesLookaheads
    ------------  --------------
    \d+           0
    foo (?:bar)   1

    (2 row(s) affected)

Back in ADO.NET, you can access the UDT by using the new SqlDbType.Udt enum value. To try this, you can add a new C# Windows application to your sample solution. You can add a project reference to your sample project ("SQLCLR") and then add a using statement for System.Data.SqlClient. Then you should add a list box called lbRegexes to the form. Finally, you should add a button called btnCallUDT to the form, double-click it, and add the code in Listing 46.7 to the body of its OnClick event handler.
LISTING 46.7 Using a UDT from ADO.NET in a Client Application

    private void btnCallUDT_Click(object sender, EventArgs e)
    {
        // ConfigurationManager requires a reference to System.Configuration
        using (SqlConnection c =
            new SqlConnection(ConfigurationManager.AppSettings["connstring"]))
        using (SqlCommand s =
            new SqlCommand("SELECT Pattern FROM dbo.RegexTest", c))
        {
            c.Open();
            using (SqlDataReader r =
                s.ExecuteReader(CommandBehavior.CloseConnection))
            {
                while (r.Read())
                {
                    // each column value arrives as an instance of the UDT
                    RegexPattern p = (RegexPattern)r.GetValue(0);
                    lbRegexes.Items.Add(p.ToString());
                }
            }
        }
    }

In this example, you selected all the rows from the sample table dbo.RegexTest and then cast the Pattern column values into RegexPattern structs. Finally, you called the ToString() method of each struct, adding the text of the regex as a new item in the list box.

You can also create SqlParameter objects to be mapped to UDT columns by using code such as the following:

    SqlParameter p = new SqlParameter("@Pattern", SqlDbType.Udt);
    p.UdtTypeName = "RegexPattern";
    p.Value = new RegexPattern(@"\d+\s+\d+");
    command.Parameters.Add(p);

Finally, keep in mind that FOR XML does not implicitly serialize UDTs. You have to do that yourself, as in the following example:

    SELECT Pattern.ToString() AS '@Regex'
    FROM dbo.RegexTest
    FOR XML PATH('Pattern'), ROOT('Patterns'), TYPE
    go

    <Patterns>
      <Pattern Regex="\d+" />
      <Pattern Regex="foo (?:bar)" />
    </Patterns>

Developing Managed User-Defined Aggregates (UDAs)

A highly specialized feature of SQL Server 2008, managed user-defined aggregates (UDAs) provide the capability to aggregate column data based on user-defined criteria built in to .NET code. You can now extend the (somewhat small) list of aggregate functions usable inside SQL Server to include those you custom-define.
NOTE
If you've been following the examples in this chapter sequentially, at this point, you need to drop the sample table dbo.RegexTest to redeploy the assembly after creating the UDA example.

The implementation contract for a UDA requires the following:

- An instance method called Init(), used to initialize any data fields in the struct, particularly the field that contains the aggregated value.
- An instance method called Terminate(), used to return the aggregated value to the UDA's caller.
- An instance method called Accumulate(), used to add the value in the current row to the growing value.
- An instance method called Merge(), used when SQL Server breaks an aggregation task into multiple threads of execution (SQL Server actually uses a thread abstraction called a task), each of which needs to merge the value stored in its instance of the UDA with the growing value.

UDAs cannot do any data access, nor can they have any side effects, meaning they cannot change the state of the database. They take only a single input parameter, of any type. You can also add public methods or properties other than those required by the contract (such as the IsPrime() method used in the following example).

Like UDTs, UDAs are structs. They are decorated with the SqlUserDefinedAggregate attribute, which has the following parameters for its constructor:

- Format: Tells SQL Server how serialization (and its complement, deserialization) of the struct should be done. This has the same possible values and meaning as described earlier for SqlUserDefinedType.
- A named parameter list: This list contains the following:
    - IsInvariantToDuplicates: Tells SQL Server whether the UDA behaves differently with respect to duplicate values passed in from multiple rows.
    - IsInvariantToNulls: Tells SQL Server whether the UDA behaves differently when null values are passed to it.
    - IsInvariantToOrder: Tells SQL Server whether the UDA cares about the order in which column values are fed to it.