Introducing SQL Server 2005's CLR Integration
Posted: Thu Feb 02, 2006 12:01 am
This is one of Shawn Wildermuth's best articles about SQL Server. I hope it helps anyone interested in the topic.
It's fairly long, so I'll split it into two parts to keep it from getting boring to read.
Introducing SQL Server 2005's CLR Integration
T-SQL is great for database code, but writing procedural code in T-SQL has always been difficult. Invariably, your project includes a stored procedure or two that requires some text parsing or complex math operations. Until now you had two options: either you wrote the logic in T-SQL, or you wrote an extended stored procedure or function and used COM to interoperate with it. Neither was a good solution, but that is all we had.
In comes SQL Server 2005 with its CLR integration to alleviate these problems. By integrating the CLR, SQL Server 2005 allows you to deploy C# or VB.NET code that runs within the SQL Server process. This means that if you need complex procedural code, you can write it as managed code.
Integrating the CLR into SQL Server is not a step toward eliminating T-SQL. As .NET developers, it may seem like a good idea to do all your database code in managed code, but this is not the case. Think of the CLR integration as just another tool in your toolbox. This is the hammer that, I suspect, will be used to hammer in nails, screws, and 2x4s in projects over the next year. It will be overused. Don't let your project be one of those caught guilty of this.
Details of the CLR Integration
Integrating the CLR into SQL Server involves a number of different features. Many of those features allow developers to do things they have never had the opportunity to do before in SQL Server. But, before I discuss the features and how they work, it's important to consider some details on how the CLR is hosted in SQL Server.
Integrating the CLR into the SQL Server 2005 engine was not done in a trivial way. At the end of the day, SQL Server 2005 has to be a rock-solid implementation of the database. Any new feature has had to endure intense scrutiny for how it will impact the stability of SQL Server.
In the 1.x version of the CLR, the chief client for hosting the CLR was Internet Information Services (IIS). IIS is a peculiar beast. If it finds badly behaving code, it is happy to kill threads and processes and just restart the code. Any code living within IIS was free to allocate memory, threads or even new processes, as it saw fit. Unfortunately, this is the opposite of what SQL Server needs. If some piece of CLR code starts to misbehave, destroying the SQL Server process is completely the wrong thing to do. The health of the SQL Server process is critical to the stability of the platform.
The 2.0 version of the hosting environment has many more ways of communicating with the host. These new communication mechanisms allow SQL Server to be in control of key operations of the CLR. SQL Server may refuse to allow the allocation of new memory or the creation of new threads, and it can disallow the destruction of the host process. In addition, the CLR integration puts CLR code into a secure sandbox to improve stability and security inside SQL Server.
Using Managed Code
Now that you have an assembly or two loaded, you want to know how to actually have code run within SQL Server. Within SQL Server, most types of code blocks that you are familiar with in T-SQL are supported in managed code:
Stored Procedures
Functions
Triggers
A new type of code is supported in SQL Server 2005 called a custom aggregation. This allows you to write code that supports aggregating data. You can do things like create a custom SUM or COUNT aggregation. You might create useful extensions to SQL Server, such as a standard deviation aggregate. I'll cover custom aggregations in more detail below.
Using managed code within SQL Server 2005 requires three steps:
1. You must write the managed code and compile it into an assembly.
2. You must install the managed code's assemblies into SQL Server 2005.
3. You must use DDL statements to tie the managed code to named objects (stored procedures, functions, etc.).
I'll explain each of these steps.
Writing Managed Code
The easiest way to write managed code for SQL Server 2005 is to use Visual Studio 2005. As of the writing of this article, the full Beta 2 of Visual Studio is available; it works well with the SQL Server 2005 April 2005 CTP. To write managed code for SQL Server, you must have at least the Team Server edition of Visual Studio.
Using Visual Studio to write the managed code takes many of the details of deployment out of your hands, as it supports automatic deployment. This article ignores that fact; I'll explain how to write managed code and install it in SQL Server manually. Since your projects will likely need install scripts for managed code, this skill will soon be required for most projects.
For each type of managed code that is supported, there are related attributes that decorate the code to tell SQL Server about specific behaviors of the managed code. These attributes include SqlProcedure, SqlFunction, SqlUserDefinedAggregate, SqlUserDefinedType and SqlMethod. Each of these attributes is explained below.
Stored Procedures
The most common and useful managed code in your own projects is probably the stored procedure. Creating a managed stored procedure has the following three requirements:
The containing class must be public.
The exposed method must be public.
The exposed method must be static.
That is all that is required. For example, to expose a simple stored procedure:
using Microsoft.SqlServer.Server;

public class SqlClr {
    [SqlProcedure]
    public static void MyProc() {
        // Put your code here
    }
}
Notice there is nothing especially different about this method from any other .NET code. The SqlProcedure attribute marks this code as a stored procedure. The attribute is not required, but it is good form as documentation of what code is used where. It is also used by Visual Studio to allow for automatic deployment. The only parameter that it accepts is the Name parameter, which renames the automatically deployed stored procedure:
public class SqlClr {
    [SqlProcedure(Name="spMyProc")]
    public static void MyProc() {
        // Put your code here
    }
}
You can specify in, out, inout and return parameters as simple .NET parameters and return types:
// Input Parameter
[SqlProcedure]
public static void InputProcedure(int number) {
}
// Output Parameter
[SqlProcedure]
public static void OutputProcedure(out int number) {
    number = 5;
}
// In/Out Parameter
[SqlProcedure]
public static void InOutProcedure(ref int number) {
    number = 4;
}
// Return Parameter
[SqlProcedure]
public static int ReturnProcedure() {
    return 3;
}
Functions
Creating managed functions in SQL Server is just as simple as creating managed stored procedures, except that they must return a value:
[SqlFunction]
public static int Subtraction(int x, int y) {
    return x - y;
}
Some useful parameters of the SqlFunctionAttribute are as follows:
DataAccess: Declares whether the function needs to read data in the database. If this is not specified, it is assumed that no data access is required.
IsDeterministic: Used to declare that a function will always return the same result for the same inputs, regardless of any other state in SQL Server.
IsPrecise: Used to declare whether the result is precise, that is, whether the function avoids imprecise computations such as floating-point operations.
Name: Used to specify a function name to use, other than the managed method name.
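As a rough illustration of how these parameters fit together, here is a minimal sketch that decorates the Subtraction function with all four. It assumes the release version of the Microsoft.SqlServer.Server namespace (the attribute surface in the Beta 2 bits may differ), and the name fnSubtract is a made-up placeholder rather than anything from the article.

```csharp
using Microsoft.SqlServer.Server;

public class SqlClrFunctions {
    // Hypothetical example: "fnSubtract" is a placeholder name.
    [SqlFunction(
        Name = "fnSubtract",               // name to deploy under, instead of "Subtraction"
        DataAccess = DataAccessKind.None,  // the function reads no data from the database
        IsDeterministic = true,            // same inputs always produce the same result
        IsPrecise = true)]                 // integer math only, no floating-point operations
    public static int Subtraction(int x, int y) {
        return x - y;
    }
}
```

Declaring a function deterministic or precise when it is not can lead to incorrect results, so these flags should describe what the code actually does rather than what you would like it to do.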
Managed Triggers
Just as you can in T-SQL, SQL Server 2005 allows you to create triggers in managed code. You must take special care to understand the performance implications of using managed triggers, as their performance is likely to be lower than that of similarly written T-SQL triggers.
Assuming you have taken the performance considerations into account, you write a trigger by simply annotating it with the SqlTrigger attribute. The SqlTrigger attribute requires two parameters:
Event: The event to fire the trigger for. This syntax is identical to the T-SQL syntax for the event name (e.g. FOR INSERT, INSTEAD OF DELETE, etc.). In addition, you can specify event names for DDL triggers (a new SQL Server 2005 feature) by specifying a DDL event name (e.g. FOR CREATE TABLE, FOR DROP USER, etc.).
Target: The source of the event. Usually, this is a table or view name for DML triggers or a database name for DDL triggers.
NOTE: Using the SqlTrigger attribute is broken in the Beta 2 version of Visual Studio. You can write the triggers, but you must register them manually, and debugging is impossible at the moment.
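For reference, a managed DML trigger using the Event and Target parameters might look roughly like the sketch below. It assumes the release version of the Microsoft.SqlServer.Server namespace (SqlContext.TriggerContext and SqlContext.Pipe), which may not match the Beta 2 API. The trigger name trgOrdersInsert and the table dbo.Orders are hypothetical placeholders.

```csharp
using Microsoft.SqlServer.Server;

public class SqlClrTriggers {
    // Hypothetical names: "trgOrdersInsert" and "dbo.Orders" are placeholders.
    [SqlTrigger(Name = "trgOrdersInsert", Target = "dbo.Orders", Event = "FOR INSERT")]
    public static void OrdersInsert() {
        // The trigger context describes which action fired the trigger.
        SqlTriggerContext context = SqlContext.TriggerContext;

        if (context.TriggerAction == TriggerAction.Insert) {
            // Send an informational message back to the caller.
            SqlContext.Pipe.Send("A row was inserted into dbo.Orders.");
        }
    }
}
```

Like managed stored procedures, the trigger method is public and static on a public class. This code only does anything useful when it runs inside SQL Server, since the trigger context is supplied by the host.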