How to use C# code within SQL
Posted on: 2015-09-01
This is the first time I had to have something that require a special business logic inside the database that I could not do directly in TSQL. Usually, I get around or I simply do something outside SQL like a small C# console that will handle everything. I decided to take a look a CLR SQL Server User-Defined Function
. This article describes how to create your own C# method and use it within your TSQL. Albeit the feasibility and the success of the method, I removed the code at the end to make a console application and I will explain to you why.
First thing to do is to create a SQL Server Project.
Finally, you write your own method. Mine was returning a single string so I had to return a SqlString
.
public partial class UserDefinedFunctions { [Microsoft.SqlServer.Server.SqlFunction] public static SqlString HashPassword(string plainPassword) { byte[] salt; byte[] buffer2; using (var bytes = new Rfc2898DeriveBytes(plainPassword, 0x10, 0x3e8)) { salt = bytes.Salt; buffer2 = bytes.GetBytes(0x20); } byte[] dst = new byte[0x31]; Buffer.BlockCopy(salt, 0, dst, 1, 0x10); Buffer.BlockCopy(buffer2, 0, dst, 0x11, 0x20); return new SqlString (Convert.ToBase64String(dst)); } }
This is the code I used. It does the hash of a plain password and return the hashed password. This algorithm is compatible with Entity Framework so it was a way to convert legacy password into Entity Framework format. You can see that not only that class is partial to a specific class, but it also has an attribute. The Microsoft.SqlServer.Server.SqlFunction
is required to specify the type of the method for SQL. It also allows to add additional information like if the function is deterministic.
The last step is to publish
. You need to right click the project, select Publish. This will raise a dialog where you need to select the database server, the database name, and the script to generate. This will build (if not already done) the project. The bin folder will contains the code (dll) but also a dacpac and a SQL file.
The dacpac is what is installed by Visual Studio into Sql Server. This is a file that contains SQL objects. It is used not only by C# code but could contains any SQL object like Sql Procedure for example. The format cannot be read by NotePad because it is a compressed file. You can unzip it if you are curious. And we are curious! So here is the code above dacpac unzipped file structures.
The interesting file is the model.xml. It contains a reference to the DLL file but also the function metadata we created and the source of the code inside a CDATA.
From here, you can just use a simple select, pass your parameter and you are all set.
If you cannot have you database working with CLR, it maybe because it is disabled. To enable, you need to do the following command once.
sp_configure 'show advanced options', 1; GO RECONFIGURE; GO sp_configure 'clr enabled', 1; GO RECONFIGURE; GO
The problem I ran into is the performance. This is very slow compared to any native SQL code. It is also slower than execute the same code directly inside a simple Console application that loop through all the data. For example, removing that call take 7 seconds. Even if it wasn't converting the password and keeping this one plain, the time was reasonable. Using the CRL SQL Server User-Defined Function the time raise to 11 minutes. Doing the loop in a console by reading the data and updating the value with ADO.Net took half of the time. The time is getting high because of the code itself which is not a simple task. Even if it takes few milliseconds to execute in a console, the hashing process is still little expensive to do. Also, CRL in SQL add a layer of slowness because the SQL engine need to stub inside your SQL query Microsoft intermediate language (MSIL) a bridge to the function. This generated stub is compiled into optimized code for the SQL Server you are executing this one to.
That said, the performance was quite slow from few seconds to several minutes. I had a huge script that was converting a lot of data, gigs and gigs of data across a lot of tables from one database to another. The overhead was not something I wanted to paid during the conversion and could be executed afterward by a small console job later. I could also use the CLR function in a second script later because even if the performance was several minutes, creating the console application wasn't saving much more. That said, during the creation of the CLR function for SQL I got curious about performance and I learned that this is far from being a subject that is black or white. Most of the time, it depends of the operation executed to know if it worth it or not to use a C# CLR function. I personally believe it should be limited. Even if it is possible to debug with Visual Studio those function remotely, I believe that business logic should remain in the application tier --t he database should be used only for persisting the information. Nevertheless, special case like using Regex on string, doing special data manipulations can be not possible with SQL only. This is a great tool to have on your belt and should be used wisely.