C# Transaction for SQL query

A transaction lets you roll back if a problem occurs. That means that if you insert 100 entries into the database and 1 of them is wrong, none of the entries will be saved. This is very useful because most of the time you want your action to be atomic.

Here is an example of a good practice method that should live in a data access layer. To keep it simple, everything is written in the same file and the database used is the Northwind database.

public class Customer
{
	public string Id { get; set; }
	public string Name { get; set; }
	public bool IsNew { get; set; }
}

public partial class _Default : System.Web.UI.Page
{
	protected void Page_Load(object sender, EventArgs e)
	{
		List<Customer> customers = new List<Customer>(){
			   new Customer(){Id="ID001", Name="Name001", IsNew=true}
		   ,   new Customer(){Id="ID002", Name="Name002", IsNew=true}
		   ,   new Customer(){Id="ID003", Name="Name003", IsNew=true}
		};
		SaveCustomers(customers);
	}

	public int SaveCustomers(IEnumerable<Customer> customers)
	{
		ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];

		int rowsAffected=0;
		using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
		{
			connection.Open();
			var transaction = connection.BeginTransaction();
			using (var command = new SqlCommand())
			{
				command.Transaction = transaction;
				command.Connection = connection;
				try
				{
					foreach (var customer in customers)
					{
						if (customer.IsNew)
						{
							command.CommandText = "INSERT INTO customers (CustomerID, CompanyName) VALUES (@id, @name)";
						}
						else
						{
							command.CommandText = "UPDATE customers SET CompanyName = @name WHERE CustomerID = @id";
						}
						command.CommandType = System.Data.CommandType.Text;
						command.Parameters.Clear(); //Remove the parameters added in the previous iteration
						command.Parameters.Add(new SqlParameter("id", customer.Id));
						command.Parameters.Add(new SqlParameter("name", customer.Name));
						rowsAffected += command.ExecuteNonQuery();
					}
		  
					transaction.Commit();
				}
				catch
				{
					transaction.Rollback();
				}
				
			}
			connection.Close();

		}

		return rowsAffected;
	}
}

This code contains some interesting things. First, to create a transaction you must have a connection that is already open. This is required; otherwise you will get an exception telling you that the connection is closed.

Second, you must know that if you forget to commit once you are done querying, the database won't have your data even if all the queries were legitimate. That's right: even if rowsAffected += command.ExecuteNonQuery(); increments rowsAffected, nothing is real until you commit. The code above works because all customers have IsNew set to true. If you run this code and change one to false, you tell the Save method to use the Update statement. The command will run the Update, but the database won't find the customer and an exception will be thrown. That means the code will go into the catch block and roll back the transaction. Even if rowsAffected is at 1, you will have 0 rows added or updated in the database.

Types of isolation

To create a transaction, you need to get it from the DbConnection. The reason is simple: a transaction is different between database providers, so when using the SqlConnection and SqlCommand you get transaction options specific to the Microsoft Sql Server database. Transaction code also applies isolation on the database, and Sql Server lets you choose between different types of isolation; a short sketch after the list shows how to request one in code.

Sql Server isolation levels

  • ReadUncommitted
  • This is a very weak isolation level and should not be used. In fact, it doesn't lock anything in the database. It's possible that the data changes or gets deleted by the time the commit is done, which can lead to unpredictable behavior. For a Select statement, this isolation level produces this query:

    SELECT * FROM CUSTOMERS WITH (NOLOCK)
    
  • ReadCommitted
  • This one tries to take a shared lock on the data it reads. If it succeeds, it protects those values from being changed from the outside while they are read.

  • RepeatableRead
  • All the data involved in the transaction is locked.

  • Serializable
  • The whole table is locked. This prevents other rows from being added to the table. This is the strongest isolation, but it comes at the cost of performance: locking more values than needed slows down every other connection that needs to read or edit these rows.

  • Snapshot
  • If your database is Sql Server 2005 or later, you can use this isolation level, which works on a copy of the affected rows. To be more accurate, rows are versioned: when other connections want to access these rows, a copy is read instead of taking a lock. This keeps the system in good shape without creating wait time for anyone.
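
As mentioned above, here is a minimal sketch of how to request one of these isolation levels when beginning a SqlTransaction. It reuses the "ApplicationServices" connection string from the earlier examples; the query is only an illustration.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
	connection.Open();
	// Ask Sql Server for a specific isolation level when starting the transaction.
	// System.Data.IsolationLevel also offers ReadUncommitted, RepeatableRead,
	// Serializable and Snapshot (Snapshot must first be enabled on the database).
	using (var transaction = connection.BeginTransaction(System.Data.IsolationLevel.ReadCommitted))
	using (var command = new SqlCommand("SELECT COUNT(*) FROM Customers", connection, transaction))
	{
		int count = (int)command.ExecuteScalar();
		transaction.Commit();
	}
	connection.Close();
}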

Transaction Type

This type of transaction is known as a Lightweight Transaction Manager (LTM) transaction because only one connection is involved. Sometimes, you may use one connection to read and, while looping over the results, open another connection to update. Any time two connections are open, the transaction becomes a Distributed Transaction (DT). The Microsoft Windows operating system has a Distributed Transaction Coordinator (DTC), a service that runs in the background of your computer. Using the DTC requires either System.Transactions, which will be discussed in a later article, or explicit code to execute under the DTC. To use the DTC directly, developers have to create classes that inherit from the ServicedComponent class in the System.EnterpriseServices namespace. This won't be discussed because it's easier now to simply use System.Transactions, which implicitly decides whether DT or LTM is really required.

Distributed Transaction Coordinator (DTC)
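
System.Transactions will be covered in that later article, but as a small preview, here is a minimal sketch. It needs a reference to the System.Transactions assembly, and the framework decides by itself whether the lightweight manager is enough or whether the transaction must be promoted to the DTC.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
using (var scope = new System.Transactions.TransactionScope())
{
	using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
	{
		connection.Open(); // the connection enlists itself in the ambient transaction
		using (var command = new SqlCommand("UPDATE Region SET RegionDescription = 'East' WHERE RegionID = 1", connection))
		{
			command.ExecuteNonQuery();
		}
	}
	scope.Complete(); // without this call, everything is rolled back when the scope is disposed
}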

DataReader and DataAdapters to read from the database

To read data from the database, you need something other than the DbCommand calls shown so far, which return no data. Two options are available: the first one is the DataReader, the second is the DataAdapter.

DataReader

This is the fastest way to read data from the database. To create a DataReader object, you call ExecuteReader on the DbCommand class; the command is still required because you need to give it the type of query you want and the SQL statement (or the stored procedure name). The ExecuteReader() method then returns a DataReader. In the example below, the reader is a SqlDataReader, which derives from DbDataReader and is specialized for Microsoft Sql Server.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
	connection.Open();
	using (var command = new SqlCommand())
	{
		command.Connection = connection;
		command.CommandText = "SELECT CustomerID, CompanyName FROM Customers WHERE country LIKE 'canada' ";
		command.CommandType = System.Data.CommandType.Text;
		using (SqlDataReader reader = command.ExecuteReader())
		{
			while (reader.Read())
			{
				System.Diagnostics.Debug.WriteLine(string.Format("{0}:{1}",reader["CustomerID"], reader["CompanyName"]));
			}
		}
	}
	connection.Close();
}

The DataReader acts like a cursor: it can only go forward. You can access the value of each row by using the square brackets []. You can pass the name of the field or an integer that represents the index of the field's position. For example, we could have replaced

System.Diagnostics.Debug.WriteLine(string.Format("{0}:{1}",reader["CustomerID"], reader["CompanyName"]));

by

System.Diagnostics.Debug.WriteLine(string.Format("{0}:{1}",reader[0], reader[1]));

The problem with the latter version is maintainability. It's harder to know what will be displayed, and it can be problematic if the order of the returned data changes. Let's say you get data from a stored procedure and, for an unknown reason, the database administrator changes the order of the returned fields. This would create a problem, but not if the names were used.
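
If you still want the speed of access by index without hard-coding positions, a middle ground is to resolve the ordinal once from the column name. This is a small sketch, assuming the same reader and query as in the example above.

// Resolve each ordinal once from the column name, then read by index inside the loop.
int idOrdinal = reader.GetOrdinal("CustomerID");
int nameOrdinal = reader.GetOrdinal("CompanyName");
while (reader.Read())
{
	System.Diagnostics.Debug.WriteLine(string.Format("{0}:{1}", reader.GetString(idOrdinal), reader.GetString(nameOrdinal)));
}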

You can also load the whole result set in one shot using the Load method. The data is loaded into a DataTable.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
DataTable datas = new DataTable();
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
    connection.Open();
    using (var command = new SqlCommand())
    {
        command.Connection = connection;
        command.CommandText = "SELECT CustomerID, CompanyName FROM Customers WHERE country LIKE 'canada' ";
        command.CommandType = System.Data.CommandType.Text;
        using (SqlDataReader reader = command.ExecuteReader())
        {
            datas.Load(reader, LoadOption.Upsert);
        }
    }
    connection.Close();
    System.Diagnostics.Debug.WriteLine("Number of rows loaded : " + datas.Rows.Count);
}

One problem with the DataReader is that it acts as a cursor on the server and keeps the connection open while looping. The second problem is that, while looping, you won't be able to execute another query within the loop. To solve that, you can load all the data first and loop afterwards, or use the Load method. Another option is to use a DbConnection with a special connection string that contains the attribute “MultipleActiveResultSets=True”. This allows multiple active commands, but at the cost of some performance, so it has to be used with caution.
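
Here is a minimal sketch of building such a connection string, reusing the "ApplicationServices" setting from the earlier examples; SqlConnectionStringBuilder simply appends the MultipleActiveResultSets keyword.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
var builder = new SqlConnectionStringBuilder(connectionStringSettings.ConnectionString)
{
	MultipleActiveResultSets = true // equivalent to appending "MultipleActiveResultSets=True" to the connection string
};
using (var connection = new SqlConnection(builder.ConnectionString))
{
	connection.Open();
	// A DataReader can loop here while a second command executes on this same connection.
	connection.Close();
}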

Scalar

Sometimes, we want just a single value. This can be an ID from a username, but more often it's a value from an aggregate function, like a count. We could use the DataReader, but that adds overhead for a single value.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
	connection.Open();
	using (var command = new SqlCommand())
	{
		command.Connection = connection;
		command.CommandText = "SELECT count(CustomerID) FROM Customers WHERE country LIKE 'canada' ";
		command.CommandType = System.Data.CommandType.Text;
		int countValue = (int)command.ExecuteScalar();
		System.Diagnostics.Debug.WriteLine("Count value:" + countValue);
	}
	connection.Close();
}

As you can see, the call to ExecuteScalar on the DbCommand returns an Object. In our case, we cast it to an integer.

DbDataAdapter

The DbDataAdapter is the reader that gives you the most leverage in data functionality. Not only does it let you get the data, it also lets you update it.

The DbDataAdapter class derives from DataAdapter. As with the other classes, a Microsoft Sql Server version exists and it's called SqlDataAdapter.

The DbDataAdapter contains a SelectCommand that uses ExecuteReader internally, and the data is loaded into a DataTable. This lets the developer change the values inside the DataTable and, if desired, update them to the database, insert the new rows that were added, and delete those that are no longer in the DataTable. The DbDataAdapter is wise enough to open and close the connection. On the other hand, it lets the user open the connection if desired; in that case, the user will also need to close it. This can be useful when multiple calls are required, because it won't open and close repetitively, which is costly in performance. Here is an example of how to load data from the database with a DbDataAdapter.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
DataSet returnedValuesSet = new DataSet();
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
	connection.Open();
	using (var command = new SqlCommand())
	{
		command.Connection = connection;
		command.CommandText = "SELECT CustomerID, CompanyName FROM Customers WHERE country LIKE 'canada' ";
		command.CommandType = System.Data.CommandType.Text;
		using (var dataAdapter = new SqlDataAdapter(command))
		{
			dataAdapter.Fill(returnedValuesSet);
		}
	}
	connection.Close();
	System.Diagnostics.Debug.WriteLine("Number of rows loaded : " + returnedValuesSet.Tables[0].Rows.Count);
}

So now the DataSet is filled with the query data. It's a disconnected object but will still be able to do updates later on. From this example, we can see that the SqlDataAdapter fills it with the Fill method. The Fill method can also take a second argument, which is the name of the table. This means we can modify the previous example so that, instead of referring to the data in the DataSet with an index, we use the string as a key.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
DataSet returnedValuesSet = new DataSet();
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
	connection.Open();
	using (var command = new SqlCommand())
	{
		command.Connection = connection;
		command.CommandText = "SELECT CustomerID, CompanyName FROM Customers WHERE country LIKE 'canada' ";
		command.CommandType = System.Data.CommandType.Text;
		using (var dataAdapter = new SqlDataAdapter(command))
		{
			dataAdapter.Fill(returnedValuesSet, "Customers");
		}
	}
	connection.Close();
	System.Diagnostics.Debug.WriteLine("Number of rows loaded : " + returnedValuesSet.Tables["Customers"].Rows.Count);
}

The table name is important if you want to use other features like insert, update and delete, because the SqlDataAdapter will use this table to do the desired task. If you use a wrong table name, you will get an exception: “Update unable to find TableMapping['Table'] or DataTable 'Table'.”.

Invalid Operation with DataSet Update

The correct code provides the InsertCommand property of the SqlDataAdapter with an Insert SQL statement or a stored procedure. If you do not want to write the Insert, Update, Delete and Select statements every time, it's possible to use the SqlCommandBuilder.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
DataSet returnedValuesSet = new DataSet();
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
	connection.Open();
	using (var command = new SqlCommand())
	{
		command.Connection = connection;
		command.CommandText = "SELECT CustomerID, CompanyName FROM Customers WHERE country LIKE 'canada' ";
		command.CommandType = System.Data.CommandType.Text;
		using (var dataAdapter = new SqlDataAdapter(command))
		{
			dataAdapter.Fill(returnedValuesSet, "Customers");
			var builder = new SqlCommandBuilder(dataAdapter);
			dataAdapter.InsertCommand = builder.GetInsertCommand();
			returnedValuesSet.Tables["Customers"].Rows.Add("-1", "My Company Name");

			dataAdapter.Update(returnedValuesSet,"Customers");
		}
	}
	connection.Close();
}

This will generate the InsertCommand as you can see in this screenshot:

Automatically create SQL statement with DbCommandBuilder

The generated SQL statement uses parameters and is protected against SQL injection.

So far, the DbDataAdapter commits one change at a time. This is because the DbDataAdapter has its UpdateBatchSize set to 1 by default. You can choose the size you want, or set it to 0 to let the DbDataAdapter use the largest batch size it can. This will increase the performance of the application.

In practice

Whether you use SQL statements or stored procedures, you should always use parameters. In practice, when a method of your data access layer returns data from the database, it should query with a DataReader and return a DataSet. The DataSet lets you close the connection, and the DataReader is justified by its speed.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
DataSet returnedValuesSet = new DataSet();
DataTable returnedTable = new DataTable();
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
	connection.Open();
	using (var command = new SqlCommand())
	{
		command.Connection = connection;
		command.CommandText = "SELECT CustomerID, CompanyName FROM Customers WHERE country LIKE @country";
		command.CommandType = System.Data.CommandType.Text;
		SqlParameter paramCountry = new SqlParameter("country","canada");
		command.Parameters.Add(paramCountry);
		using (var reader = command.ExecuteReader())
		{
			returnedTable.Load(reader,LoadOption.Upsert);
			returnedValuesSet.Tables.Add(returnedTable);
		}
	}
	connection.Close();
	
}

For inserting, updating and deleting, parameters must be used too. Here is an example of code that could be used in a data access layer method. Your method should return the number of rows affected and accept your Customer model object as a parameter.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
int rowsAffected;
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
	connection.Open();
	using (var command = new SqlCommand())
	{
		command.Connection = connection;
		command.CommandText = "INSERT INTO customers (CustomerID, CompanyName) VALUES (@id, @name)";
		command.CommandType = System.Data.CommandType.Text;
		command.Parameters.Add(new SqlParameter("id","P1"));
		command.Parameters.Add(new SqlParameter("name", "MyName"));
		rowsAffected = command.ExecuteNonQuery();
	}
	connection.Close();
}

Do not forget to handle the DbDataAdapter's UpdateBatchSize when you have a collection to update. Let's say you pass a collection (IList) to your data access layer. Your method should check the Count of items and adjust the UpdateBatchSize to that Count, as shown in the sketch below. This way, your performance will increase by a lot.
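
Here is a small sketch of that adjustment. It assumes the dataAdapter and returnedValuesSet from the SqlCommandBuilder example above and a customers IList passed to the method.

// Send the pending changes in batches instead of one statement per round trip.
// For batching, the adapter's commands must have UpdatedRowSource set to None
// (or OutputParameters); a size of 0 would let the adapter pick the largest batch.
dataAdapter.InsertCommand.UpdatedRowSource = System.Data.UpdateRowSource.None;
dataAdapter.UpdateBatchSize = customers.Count;
dataAdapter.Update(returnedValuesSet, "Customers");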

The last important thing is the use of transactions. Usually, when something goes wrong, you want to know about it and cancel the current batch so you can restart it once the problem is fixed. Sql transactions will be discussed in a future article, but they should be there for insert, update and delete queries.

Lambda Expression

Lambda Expressions can be of two types:

  • Code Delegate (compiled)
  • Expression Tree Object (runtime)

The Code Delegate is used a lot with Linq. Almost every extension method like Where takes a delegate that is compiled into IL code. It's very fast because it's compiled. On the other hand, some situations, where the data is not evaluated directly in memory or where some reflection over what is passed is required, need an Expression Tree Object lambda.

In the code, you can tell whether it's a Code Delegate or an Expression Tree by the declaration of the parameter. If the method you call uses

Func<T,bool> predicate //bool can be something else

it's a code delegate. In fact, Func is a predefined delegate that takes one or more parameters and has a return value. This can be compiled. Conversely, an Expression Tree Object uses

Expression<Func<T,bool>> predicate //bool can be something else

Instead of compiling into IL, this generates a structure that lets the framework analyze the expression. It lets the developer know the types, the names and the values inside the lambda. This is why Linq to Sql requires an expression instead of a compiled delegate: it needs to read the delegate information and translate it into a SQL statement.
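
Here is a minimal sketch of the difference; it needs using System; and using System.Linq.Expressions;.

// Code Delegate: compiled straight into IL, nothing left to inspect at runtime.
Func<int, bool> compiled = x => x > 5;
bool result = compiled(10); // true

// Expression Tree Object: a data structure describing the lambda, inspectable at runtime.
Expression<Func<int, bool>> tree = x => x > 5;
System.Diagnostics.Debug.WriteLine(tree.Body);               // (x > 5)
System.Diagnostics.Debug.WriteLine(tree.Parameters[0].Name); // x
System.Diagnostics.Debug.WriteLine(tree.NodeType);           // Lambda

// The tree can still be compiled into a delegate when needed.
Func<int, bool> compiledFromTree = tree.Compile();
bool sameResult = compiledFromTree(10); // true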

Real Life Example

I think a simple example is the one of INotifyPropertyChanged. Implementations of this interface typically expose a method whose parameter is a string that must be the name of the property that changed. When using that implementation directly, problems may arise, like not typing the name of the property correctly, or a refactoring tool not updating the string value. To solve this issue, you can use an Expression Tree Object that points at the property of the current object instead of naming it with a string.

protected virtual void OnPropertyChanged(string propertyName)
{
	PropertyChangedEventHandler handler = this.PropertyChanged;
	if (handler != null)
	{
		var e = new PropertyChangedEventArgs(propertyName);
		handler(this, e);
	}
}

protected void OnPropertyChanged<T>(Expression<Func<T>> propertyChanged)
{
	if (propertyChanged != null)
	{
		var expression = (MemberExpression)propertyChanged.Body;
		this.OnPropertyChanged(expression.Member.Name);
	}
}

The second overload takes an Expression Tree Object that wraps the delegate Func<T>. This means the delegate doesn't take any parameter and returns a value of type T.

private MyObject myObject;
public MyObject MyObject
{
	get { return myObject; }
	set {
		myObject = value;
		base.OnPropertyChanged(() => MyObject); // This was before used like this : base.OnPropertyChanged("MyObject"); 
	}
}

The setter contains the lambda call and, on the same line, you can see the original string-based call in the comment.

DbCommand under the microscope

DbCommand uses the DbConnection to send SQL queries or call stored procedures. It can also execute DDL (Data Definition Language) queries to create tables or modify table structures.

Here is an example of updating a row with SqlCommand, which inherits from DbCommand.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
	connection.Open();
	using (var command = new SqlCommand())
	{
		command.Connection = connection;
		command.CommandType = System.Data.CommandType.Text;
		command.CommandText = "UPDATE Region SET RegionDescription = 'Estern Yes!' WHERE RegionID=1";
		command.ExecuteNonQuery();
		connection.Close();
	}
}

From this example, you may have guessed that SqlCommand, which inherits from DbCommand, also implements IDisposable. This is why the using statement is still used in this example. The command gets the connection that was opened previously, and then the CommandType is set to Text to execute a SQL query directly.

From there, the ExecuteNonQuery method is executed. This method is perfect for calling stored procedures or SQL statements that do not return data. It also returns the number of rows affected.

int amountOfRowAffected = command.ExecuteNonQuery();

To execute a stored procedure, the command type needs to be changed and the command text must contain the stored procedure name. To make a test, let's create this stored procedure in the Northwind database.


CREATE PROCEDURE [dbo].[UpdateRegion]
(	
    @id INT
    ,@txt VARCHAR(50)
)
AS
BEGIN
  UPDATE region
  SET RegionDescription = @txt
  WHERE RegionID = @id
END

To execute it, let's use this code:

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
    connection.Open();
    using (var command = new SqlCommand())
    {
        command.Connection = connection;
        command.CommandType = System.Data.CommandType.StoredProcedure;
        command.CommandText = "UpdateRegion";
        command.Parameters.Add(new SqlParameter("id","1"));
        command.Parameters.Add(new SqlParameter("txt", "Est"));
        command.ExecuteNonQuery();
        connection.Close();
    }
}

Not only have the CommandType and CommandText changed, but a new addition appears: the parameters. Since the stored procedure takes the id of the region and the text to use as the new description, the command must know what to pass to the stored procedure.

The parameters can also be created from the command itself. That's right, you can create a parameter with the CreateParameter method of the DbCommand class.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
    connection.Open();
    using (var command = new SqlCommand())
    {
        command.Connection = connection;
        command.CommandType = System.Data.CommandType.StoredProcedure;
        command.CommandText = "UpdateRegion";
        SqlParameter param1 = command.CreateParameter();
        param1.ParameterName = "id";
        param1.Value = "1";
        command.Parameters.Add(param1);
        command.Parameters.Add(new SqlParameter("txt", "Estern"));
        command.ExecuteNonQuery();
        connection.Close();
    }
}

The DbParameter in the few examples above was of type SqlParameter. The SqlParameter class derives from DbParameter and contains additional specifics for Sql Server. For example, the database types are more specific to Sql Server, and you can find additional properties to specify the direction of the parameter, its type, whether it is nullable, etc.

SqlParameter param1 = command.CreateParameter();
param1.ParameterName = "id";
param1.Value = "1";
param1.Direction = System.Data.ParameterDirection.Input;
param1.SqlDbType = System.Data.SqlDbType.Int;
param1.IsNullable = false;

To conclude, the DbCommand is the class to use when you need to execute a query that does not return any data.

Mastering SqlConnection

The SqlConnection object derives from DbConnection. It opens the connection between the application and the database. It also implements IDisposable, through DbConnection.

var connection = new SqlConnection();
connection.ConnectionString = @"Data Source=PATRICK-PC\SQLEXPRESS;Initial Catalog=Northwind;Integrated Security=True";
connection.Open();
connection.Close();

or

using (var connection = new SqlConnection())
{
	connection.ConnectionString = @"Data Source=PATRICK-PC\SQLEXPRESS;Initial Catalog=Northwind;Integrated Security=True";
	connection.Open();
	connection.Close();
}

These two snippets of code illustrate the creation of a connection. Both of them use a connection string from a plain string, but the value can also be loaded directly from the app.config or web.config.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
using (var connection = new SqlConnection(connectionStringSettings.ConnectionString))
{
    connection.Open();
    connection.Close();
}
<configuration>
  <connectionStrings>
    <add name="ApplicationServices"
         connectionString="Data Source=PATRICK-PC\SQLEXPRESS;Initial Catalog=Northwind;Integrated Security=True"
         providerName="System.Data.SqlClient" />
  </connectionStrings>
  ...
  ...

Linq to Sql and Entity Framework cache

Linq to Sql has a caching system that relies on the primary key of each object. Inside the data context, an identity map keeps the values of the retrieved data. Everything is handled by the data context and you have nothing to worry about. Keep in mind, however, that if someone else changes the values, the cache won't be refreshed.

If you change the data on your side, the cache won't be updated either. This can be problematic in some situations. If this behavior is not what you want, you can disable the caching feature of Linq to Sql with a property of the data context.

myDataContext.ObjectTrackingEnabled = false;

Or, you can refresh specific object.

myDataContext.Refresh(RefreshMode.OverwriteCurrentValues, myObjectToRefresh);

The reason behind this technical choice is that it was initially developed for performance on consecutive calls for the same object. The goal was never to provide a real caching system.

To conclude, the title mentions Entity Framework and I haven't said anything about it yet. The reason is that Entity Framework behaves the same way as Linq to Sql for the cache.
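
For Entity Framework 4, the equivalent knobs live on the ObjectContext. This is a small sketch; myObjectContext, myEntityToRefresh and the Customers set are placeholder names.

// Refresh one cached entity with the values currently stored in the database.
myObjectContext.Refresh(System.Data.Objects.RefreshMode.StoreWins, myEntityToRefresh);

// Or ask a query to overwrite the cached values instead of reusing them.
myObjectContext.Customers.MergeOption = System.Data.Objects.MergeOption.OverwriteChanges;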

How to install Entity Framework 4.1 RC?

You need to open the solution where you want the latest version of EF. If you are running under .Net 3.5, chances are you are on version 1. If you are running on .Net 4.0, you should be on EF version 4. Version 4.1 requires some manual changes.

First, be sure you have the NuGet extension.

NuGet package installed

Second, open the NuGet console and type: install-package EntityFramework. To open the console, go to View > Other Windows > Package Manager Console.

Installation of Entity Framework 4.1 RC

EntityFramework 4.1 will be installed on the solution that is active.

You can also use the NuGet Package Manager, which gives you a visual interface to download Entity Framework. It is located under Project > Manage NuGet Packages.

Once done, your project will now have a reference to the DLL of Entity Framework.

Entity Framework Object Context Life Cycle compared to Linq to Sql Data Context Life Cycle

Entity Framework Object Context

The object created from the Entity Model, which holds all the entities, has a life cycle like any other object. But this one should catch your attention because it can grow very big with all the object state tracking it has to remember. Moreover, it caches values. So I do not have to tell you that, with a medium or big application, the memory usage can get very high.

The ObjectContext class implements the IDisposable interface, so you can, and should, use a using statement when working with Entity Framework. The catch with this approach is that the object tracking is lost when the context is disposed, so it only solves the memory concern. If you are loading information and know that you are going to change it later, it's better not to use using, but still call .Dispose at the end. But if you only read information, or update it right away, it's better to use using.
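
Here is a small sketch of the two patterns, using the NorthwindEntities context that appears in the examples below.

// Read-only work, or an update done right away: let using dispose the context immediately.
using (var context = new NorthwindEntities())
{
	var customer = context.Customers.First();
	customer.CompanyName = "New name";
	context.SaveChanges();
}

// Work that spans a longer task: keep the context (and its change tracking) alive
// until the task is done, then dispose it explicitly.
var longLivedContext = new NorthwindEntities();
try
{
	// ... load entities, let the user edit them, then call SaveChanges() ...
}
finally
{
	longLivedContext.Dispose();
}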

States

Entity Framework has 5 possible states. The first one is for a new object being added: the state is then Added. Once saved, the state of this object becomes Unchanged. When an object is deleted, its state changes to Deleted until the object context saves the change to the database; it then becomes Detached. The Modified state is a little trickier. With Entity Objects (not POCO), when a scalar property is changed, the state automatically changes to Modified. With POCO, a call to DetectChanges is required to mark the object as Modified. When an object is loaded from the object context, its default state is Unchanged.

Here is a complete example that shows, with the Northwind database and Entity Framework, the state changes from the creation to the removal phase.

Let's take this code that creates a new Order for a Customer and then removes it.

var db = new NorthwindEntities();
//Add a new Order
var firstCustomer = db.Customers.First();
Debug.WriteLine("---Customer loaded---");
Debug.WriteLine("FirstCustomer:" + db.ObjectStateManager.GetObjectStateEntry(firstCustomer).State);
var newOrder = new Orders(); // db.CreateObject<Orders>(); would do the same thing
Debug.WriteLine("---Orders Created---");
Debug.WriteLine("FirstCustomer:" + db.ObjectStateManager.GetObjectStateEntry(firstCustomer).State);
Debug.WriteLine("newOrder: N/A because not yet inside the ObjectContext so not state yet. Could attach before adding to get state but not applicable now because the key is generated by the db and can't attack without key.");
Debug.WriteLine("newOrder:" + db.ObjectStateManager.GetObjectStateEntry(newOrder).State);
firstCustomer.Orders.Add(newOrder);
Debug.WriteLine("---Orders added---");
Debug.WriteLine("FirstCustomer:" + db.ObjectStateManager.GetObjectStateEntry(firstCustomer).State);
Debug.WriteLine("newOrder:" + db.ObjectStateManager.GetObjectStateEntry(newOrder).State);
db.SaveChanges();
Debug.WriteLine("---Object Context save---");
Debug.WriteLine("FirstCustomer:" + db.ObjectStateManager.GetObjectStateEntry(firstCustomer).State);
Debug.WriteLine("newOrder:" + db.ObjectStateManager.GetObjectStateEntry(newOrder).State);
Debug.WriteLine("---Remove the created object---");
firstCustomer.Orders.Remove(newOrder);
Debug.WriteLine("FirstCustomer:" + db.ObjectStateManager.GetObjectStateEntry(firstCustomer).State);
Debug.WriteLine("newOrder:" + db.ObjectStateManager.GetObjectStateEntry(newOrder).State);
db.SaveChanges();
Debug.WriteLine("---Object Context save---");
Debug.WriteLine("FirstCustomer:" + db.ObjectStateManager.GetObjectStateEntry(firstCustomer).State);
Debug.WriteLine("newOrder:" + db.ObjectStateManager.GetObjectStateEntry(newOrder).State);   

The output of this code is:

---Customer loaded---
FirstCustomer:Unchanged
---Orders Created---
FirstCustomer:Unchanged
newOrder: N/A because not yet inside the ObjectContext, so no state yet. Could attach before adding to get a state
---Orders added---
FirstCustomer:Unchanged
newOrder:Added
---Object Context save---
FirstCustomer:Unchanged
newOrder:Unchanged
---Remove the created object---
FirstCustomer:Unchanged
newOrder:Modified
---Object Context save---
FirstCustomer:Unchanged
newOrder:Unchanged

We could also observe the states with an event.

var db = new NorthwindEntities();
db.ObjectStateManager.ObjectStateManagerChanged += new System.ComponentModel.CollectionChangeEventHandler(ObjectStateManager_ObjectStateManagerChanged);
//Get existing Customer that has Order
var customersWithOrder = db.Customers.Where(customer => customer.Orders.Count>15);
//Add a new Order
var firstCustomer = db.Customers.First();
var newOrder = new Orders(); // db.CreateObject<Orders>(); would do the same thing
firstCustomer.Orders.Add(newOrder);
db.SaveChanges();
firstCustomer.Orders.Remove(newOrder);
db.DeleteObject(newOrder);
db.SaveChanges();
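
The handler itself is not shown in the snippet; here is a minimal sketch that would produce the kind of trace shown below, assuming the generated entities live in a LifeCycle namespace.

void ObjectStateManager_ObjectStateManagerChanged(object sender, System.ComponentModel.CollectionChangeEventArgs e)
{
	// e.Element is the tracked entity, e.Action is the Add or Remove that just happened.
	Debug.WriteLine("-->" + e.Element.GetType().FullName + "-->" + e.Action);
}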

This produces:

-->LifeCycle.Customers-->Add
-->LifeCycle.Orders-->Add
-->LifeCycle.Orders-->Add
-->LifeCycle.Orders-->Add
-->LifeCycle.Orders-->Add
-->LifeCycle.Orders-->Add
-->LifeCycle.Orders-->Add
-->LifeCycle.Orders-->Add
-->LifeCycle.Orders-->Add
-->LifeCycle.Orders-->Remove
-->LifeCycle.Orders-->Remove

I prefer the first output because it's clearer.

Thread

The ObjectContext class is not thread safe. You must have one object context per thread, or create your own thread synchronization mechanism.

Linq to Sql Data Context

Like the ObjectContext, the DataContext class keeps track of each object's state. It knows what has been modified, deleted or added. Linq to Sql also assigns a state even when nothing has changed on the object. When loaded, the object will be Unchanged; if you create a new object or deserialize one, it will be Untracked.

When an object is attached, its state is PossiblyModified by default, and it stays that way until SubmitChanges. The next three states are very common: ToBeInserted, ToBeModified and ToBeDeleted. The last possible state occurs when SubmitChanges is called on an object that was ToBeDeleted: the object is then still in the DataContext, but with the Deleted state.
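
Here is a small sketch of how to look at those pending states just before SubmitChanges, using the myDataContext instance from the earlier snippet; GetChangeSet groups the tracked objects by pending operation.

// GetChangeSet exposes the objects that are waiting to be inserted, updated or deleted.
System.Data.Linq.ChangeSet pending = myDataContext.GetChangeSet();
System.Diagnostics.Debug.WriteLine("To insert: " + pending.Inserts.Count);
System.Diagnostics.Debug.WriteLine("To update: " + pending.Updates.Count);
System.Diagnostics.Debug.WriteLine("To delete: " + pending.Deletes.Count);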

Thread

The Linq to Sql DataContext class is not thread safe. It should not be static, nor shared between threads.

Where and when should EF or Linq2Sql live?

Matt Warren, software architect at Microsoft on the C# programming language product team and member of the team that created LINQ to SQL, answers this question with a design pattern: these ORM contexts should be treated as the Unit of Work pattern.

A Unit of Work keeps track of everything you do during a business transaction that can affect the database. When you’re done, it figures out everything that needs to be done to alter the database as a result of your work.

That's it: the ORM context should live until the business task is done. For example, when you need to edit an employee, you create the context when loading the employee, and it stays alive until the change is saved or the task is cancelled. It does not remain in memory until the application closes.
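
Here is a minimal sketch of that scope, reusing the NorthwindEntities context from the earlier examples; RenameEmployee is only a hypothetical business task.

// The context lives exactly as long as the business task, no longer.
public void RenameEmployee(int employeeId, string newFirstName)
{
	using (var context = new NorthwindEntities())
	{
		var employee = context.Employees.First(e => e.EmployeeID == employeeId);
		employee.FirstName = newFirstName;
		context.SaveChanges();
	} // the unit of work ends here; the context does not outlive the task
}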

Connection pooling means reusing database connections

Connection pooling is handled by the connection manager to reuse an available connection instead of creating a new one. This is a performance win because the system won't create and destroy a connection every time the database is required. This is especially true for a web application, where the same user may call the database multiple times per page, across multiple pages, and this is multiplied by many users. To activate pooling, add “Pooling=True;” to the connection string; that's it.

Using connection pooling only requires changing the connection string. From there you can choose the size of the pool and many other options, as in the sketch below.
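
Here is a small sketch that sets a few pooling options on top of the "ApplicationServices" connection string; the sizes are only examples, and the properties map to the standard keywords Pooling, Min Pool Size and Max Pool Size.

ConnectionStringSettings connectionStringSettings = ConfigurationManager.ConnectionStrings["ApplicationServices"];
var builder = new SqlConnectionStringBuilder(connectionStringSettings.ConnectionString)
{
	Pooling = true,
	MinPoolSize = 5,
	MaxPoolSize = 100
};
using (var connection = new SqlConnection(builder.ConnectionString))
{
	connection.Open();
	// Work with the connection; Close/Dispose returns it to the pool instead of destroying it.
	connection.Close();
}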

Default value for the pool size is 100

The way pooling works does not involve any modification or special behavior on the database side; it's all handled in code. Every ADO.NET provider can use this mechanism.

Pooling does not have any significant disadvantage and should therefore always be activated. The only time it can be wise to turn it off is when debugging connection problems.

Visual Studio Express Database Explorer difference

When you download Visual Studio 2010 Express Edition, you have to choose which version you want. Among those choices you have the C# version and the web version. Both of them seem similar but offer different templates. For example, you cannot create an ASP.NET website with the C# one, and you cannot create a Console Application with the web one.

There is also another big difference: the Database Explorer. The C# version won't let you connect to SQL Server, only to SQL Compact or a SQL database file.

Here are two screenshots of the two Database Explorers:

Visual Studio 2010 C# Edition

Visual Studio 2010 Web Edition

This is interesting because if you want to use SQL Profiler, you need to use SQL Server and not a file or the Compact edition. The reason behind this is probably that Microsoft thought desktop applications usually do not require direct access to a SQL server, and since they try to cut functionality in each Express edition, the C# edition got this cut. However, it's still possible to create a website against SQL Server, because web servers usually use SQL Server rather than a SQL database file.