Entity Framework Uniqueness of Complex Object reference

Working with Entity Framework complex types can be challenging. For example, they must be defined even if your code leaves them null. Another limitation is that two properties of the same entity cannot reference the same complex object instance. If they do, saving fails with the following error:

The entity of type ‘xxx’ references the same complex object of type ‘zzz’ more than once. Complex objects cannot be referenced multiple times by the same entity.

This can happen really easily. Imagine a scenario where an entity has two properties of the complex type “Money”.

public class House
{
    public Money CurrentPrice { get; set; }
    public Money BoughtPrice { get; set; }
}

If you just bought the house, the current price and the bought price are the same, so you may be tempted to assign the same Money instance to both properties. However, saving will fail. It fails not because the two properties carry the same inner value (say, the same decimal) but because both properties point to the same complex object instance.

The quick fix is to clone the price before saving the entity. After cloning, both properties are still of the same complex type, but each one holds its own object. Since complex types are simple by nature and cannot reference other entities, the cloning is usually trivial, often nothing more than a MemberwiseClone.
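A minimal sketch of the fix, assuming a Money complex type with a Value and a CurrencyTypeId (the exact shape of your class may differ):

public class Money
{
    public decimal Value { get; set; }
    public int CurrencyTypeId { get; set; }

    // Complex types cannot reference entities, so a shallow copy is a full copy here.
    public Money Clone()
    {
        return (Money)this.MemberwiseClone();
    }
}

// Give each property its own instance before saving.
var bought = new Money { Value = 250000m, CurrencyTypeId = 1 };
var house = new House
{
    BoughtPrice = bought,
    CurrentPrice = bought.Clone() // a distinct object, so EF no longer complains
};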

This is one of many traps that you may only discover once you hit the database with two properties sharing the same reference. Do not overlook that scenario: always make sure every complex object has its own instance before saving your changes with Entity Framework.

How to boost caching performance when caching Entity Framework objects

Entity Framework objects are dangerous to cache because they keep references to other objects. If an object contains a list of objects that reference the initial object back, you end up with a reference graph of infinite depth. In memory this is not a problem, since these are just pointers, but it becomes one as soon as you serialize. Json.Net can serialize references: each object is serialized once and then referred to by a $ref id. This can still be expensive, because the framework has to walk the object tree to decide whether more serialization is needed. Another way to optimize serialization with Json.Net is a custom ContractResolver that evaluates how deep you are and stops serializing past a threshold. The reference handling plus the custom ContractResolver looks like this:

using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

public static class Serialization
{
    public static string Serialize<T>(T objectToSerialize, int maxDepth = 5)
    {
        using (var performanceLog = new GlimpseCodeSection("Serialize"))
        {
            using (var strWriter = new StringWriter())
            {
                using (var jsonWriter = new CustomJsonTextWriter(strWriter))
                {
                    Func<bool> include = () => jsonWriter != null && jsonWriter.CurrentDepth <= maxDepth;
                    var resolver = new DepthContractResolver(include);
                    var serializer = new JsonSerializer();
                    serializer.Formatting = Formatting.Indented;
                    serializer.ContractResolver = resolver;
                    serializer.ReferenceLoopHandling = ReferenceLoopHandling.Serialize;
                    serializer.PreserveReferencesHandling = PreserveReferencesHandling.Objects;
                    serializer.TypeNameHandling = TypeNameHandling.All;
                    serializer.ConstructorHandling = ConstructorHandling.AllowNonPublicDefaultConstructor;
                    serializer.NullValueHandling = NullValueHandling.Include;
                    serializer.Serialize(jsonWriter, objectToSerialize);
                }
                return strWriter.ToString();
            }

        }
    }

    public static T Deserialize<T>(string objectSerialized)
    {
        using (var performanceLog = new GlimpseCodeSection("Deserialize"))
        {
            var contractResolver = new PrivateResolver();
            var obj = JsonConvert.DeserializeObject<T>(objectSerialized
                , new JsonSerializerSettings
                {
                    ReferenceLoopHandling = ReferenceLoopHandling.Serialize,
                    PreserveReferencesHandling = PreserveReferencesHandling.Objects,
                    TypeNameHandling = TypeNameHandling.All,
                    ConstructorHandling = ConstructorHandling.AllowNonPublicDefaultConstructor,
                    ContractResolver = contractResolver,
                    NullValueHandling = NullValueHandling.Include
                });
            return obj;
        }
    }

    /// <summary>
    /// Allows properties with a private setter to be populated during deserialization
    /// </summary>
    public class PrivateResolver : DefaultContractResolver
    {
        protected override JsonProperty CreateProperty(MemberInfo member, MemberSerialization memberSerialization)
        {
            var prop = base.CreateProperty(member, memberSerialization);

            if (!prop.Writable)
            {
                var property = member as PropertyInfo;
                if (property != null)
                {
                    var hasPrivateSetter = property.GetSetMethod(true) != null;
                    prop.Writable = hasPrivateSetter;
                }
            }

            return prop;
        }
    }

    public class DepthContractResolver : DefaultContractResolver
    {
        private readonly Func<bool> includeProperty;

        public DepthContractResolver(Func<bool> includeProperty)
        {
            this.includeProperty = includeProperty;
        }

        protected override JsonProperty CreateProperty(MemberInfo member, MemberSerialization memberSerialization)
        {
            var property = base.CreateProperty(member, memberSerialization);
            //See if we should serialize with the depth
            var shouldSerialize = property.ShouldSerialize;
            property.ShouldSerialize = obj => this.includeProperty() 
                                                && (shouldSerialize == null || shouldSerialize(obj));

            //Setter if private is okay to serialize
            if (!property.Writable)
            {
                var propertyInfo = member as PropertyInfo;
                if (propertyInfo != null)
                {
                    var hasPrivateSetter = propertyInfo.GetSetMethod(true) != null;
                    property.Writable = hasPrivateSetter;
                }
            }


            return property;
        }

        protected override IList<JsonProperty> CreateProperties(Type type, MemberSerialization memberSerialization)
        {
            IList<JsonProperty> props = base.CreateProperties(type, memberSerialization);
            var propertyToSerialize = new List<JsonProperty>();
            foreach (var property in props)
            {
                if (property.Writable)
                {
                    propertyToSerialize.Add(property);
                }
                else
                {
                    var propertyInfo = type.GetProperty(property.PropertyName);
                    if (propertyInfo != null)
                    {
                        var hasPrivateSetter = propertyInfo.GetSetMethod(true) != null;
                        if (hasPrivateSetter)
                        {
                            propertyToSerialize.Add(property);
                        }
                    }
                }
            }
            return propertyToSerialize;
        }

    }

    

    public class CustomJsonTextWriter : JsonTextWriter
    {
        public int CurrentDepth { get; private set; } = 0;
        public CustomJsonTextWriter(TextWriter textWriter) : base(textWriter)
        {
        }

        public override void WriteStartObject()
        {
            this.CurrentDepth++;
            base.WriteStartObject();
        }

        public override void WriteEndObject()
        {
            this.CurrentDepth--;
            base.WriteEndObject();
        }
    }
}
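Usage is then a single call on each side; for example, assuming a Contest entity graph to cache:

// Serialize only the first three levels of the graph before pushing the string to the cache
string json = Serialization.Serialize(contest, maxDepth: 3);

// Rebuild a detached clone of the graph from the cached string
Contest clone = Serialization.Deserialize<Contest>(json);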

The problem is that even with those optimizations, serialization can take a long time. A common pattern is having a big Entity Framework object that you want to serialize, and before sending it to the serializer you want to cut some branches by setting properties to null. For example, if the main entity has many collections, you may want to null those collections and store a lighter object with fewer sub-objects in Redis. The catch is that if you null a property on the main object, the main object is missing data and is now in a bad state. So the pattern is to serialize the object once and deserialize it right away, which gives you a complete clone; you null some properties on that clone, and any change does not affect the real object. You then serialize the clone and store it in Redis. The cost is two serializations and one deserialization, while the best case would be a single serialization.

The pattern remains good, but this way of achieving it is wrong. A better approach is to clone the object in C#. The benefit is speed; the disadvantage is that you need a cloning method on all your classes, which is time consuming to write. It is also not always obvious how each object should be cloned: often you need both a shallow clone and a deep clone, and depending on the situation and the class you must call the right one. The speed gain varies a lot, but on a big object graph I saw the first clone operation go from 500 ms down to 4 ms. After cutting some properties, serializing the reduced clone takes about 20 ms.

Here is an example:

public Contest ShallowCloneManual()
{
	var contest = (Contest)this.MemberwiseClone();
	contest.RegistrationRules = this.registrationRules.DeepCloneManual();
	contest.AllowedMarkets = this.AllowedMarkets?.ShallowCloneManual();
	contest.ContestOrderType = this.contestOrderType?.DeepCloneManual();
	contest.Creator = this.Creator?.ShallowCloneManual();
	contest.DailyStatistics = this.DailyStatistics?.ShallowCloneManual();
	contest.InitialCapital = this.InitialCapital.DeepCloneManual();
	contest.Moderators = this.Moderators?.ShallowCloneManual();
	contest.Name = this.Name.DeepCloneManual();
	contest.TransactionRules = this.TransactionRules.DeepCloneManual();
	contest.StockRules = this.StockRules?.DeepCloneManual();
	contest.ShortRules = this.ShortRules?.DeepCloneManual();
	contest.OptionRules = this.OptionRules?.DeepCloneManual();
	contest.Portefolios = this.Portefolios?.ShallowCloneManual();
	return contest;
}

public Contest DeepCloneManual()
{
	var contest = (Contest)this.MemberwiseClone();
	contest.RegistrationRules = this.registrationRules.DeepCloneManual();
	contest.AllowedMarkets = this.AllowedMarkets?.ShallowCloneManual();
	contest.ContestOrderType = this.contestOrderType?.DeepCloneManual();
	contest.Creator = this.Creator?.ShallowCloneManual();
	contest.DailyStatistics = this.DailyStatistics?.ShallowCloneManual();
	contest.InitialCapital = this.InitialCapital.DeepCloneManual();
	contest.Moderators = this.Moderators?.DeepCloneManual();
	contest.Name = this.Name.DeepCloneManual();
	contest.TransactionRules = this.TransactionRules.DeepCloneManual();
	contest.StockRules = this.StockRules?.DeepCloneManual();
	contest.ShortRules = this.ShortRules?.DeepCloneManual();
	contest.OptionRules = this.OptionRules?.DeepCloneManual();
	contest.Portefolios = this.Portefolios?.DeepCloneManual();
	return contest;
}

Some improvements could make this more generic. For example, DeepCloneManual could take an option object that tracks the current depth and stops cloning past a threshold. The impact of doing the cloning in C# was significant on an Azure WebJob where thousands of objects needed to be reduced and sent to Azure Redis. You can see the drop in the following graph, where the 75th percentile went from 16 minutes down to less than 4 minutes and the 95th percentile from more than 20 minutes to 4 minutes.
[Figure: CustomCSharpClone, WebJob execution time before and after switching to C# cloning]
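Coming back to the generic-clone idea, a sketch of it could look like the following (CloneOptions is hypothetical and the related classes are assumed to expose the same overload):

public class CloneOptions
{
    public int MaxDepth { get; set; } = 3;
    public int CurrentDepth { get; set; } = 0;
}

public Contest DeepCloneManual(CloneOptions options)
{
    var contest = (Contest)this.MemberwiseClone();
    if (options.CurrentDepth >= options.MaxDepth)
    {
        return contest; // deep enough: keep the scalar values and stop cloning the graph below
    }
    options.CurrentDepth++;
    contest.RegistrationRules = this.RegistrationRules.DeepCloneManual(options);
    contest.Portefolios = this.Portefolios?.DeepCloneManual(options);
    // ...same pattern for the other properties
    options.CurrentDepth--;
    return contest;
}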

To conclude, cloning an Entity Framework object by serializing and deserializing it is fast to write but expensive in processing time. It should be used sparingly.

How to diagnose slow code with Visual Studio

During the development of one feature, I noticed that performance was very slow in some scenarios. It was not obvious at first, because the task was simply to update a user profile. The user profile in question is stored in a single table, so it is a pretty straightforward task. Some validations are done before persisting the data, but that is it.

This is where Visual Studio can be very useful with its integrated Diagnostic Tools. The Diagnostic Tools provide information about events, and on any of them you can go back in time and replay the call stack, which is pretty useful. They also give timing information, CPU usage and memory usage. To start diagnosing, simply attach Visual Studio to the process you want to diagnose. Then open the diagnostic tools, located in the top menu under Debug > Profiler > Performance Explorer > Show Performance Explorer.

Here is an example of the output that I got from my performance problem.

[Figure: DiagnosticTool, Visual Studio Diagnostic Tools output]

The Visual Studio Diagnostic Tools events include Entity Framework SQL statements. This is where I realized that the user table was being updated, but so were hundreds of rows in what looked like a table linked to it. Here was the performance bottleneck, the culprit! I never expected to update anything related to that table, just the main user table.

The Entity Framework code looked like this:

public void Update(ApplicationUser applicationModel)
{
	//Update the password IF necessary
	var local = UnitOfWork.Set<ApplicationUser>().Local.FirstOrDefault(f => f.Id == applicationModel.Id);
	if (local != null)
	{
		UnitOfWork.Entry(local).State = EntityState.Detached;
	}
	UnitOfWork.Entry(applicationModel).State = EntityState.Modified;
	if (string.IsNullOrEmpty(applicationModel.PasswordHash))
	{
		UnitOfWork.Entry(applicationModel).Property(f => f.PasswordHash).IsModified = false;
	}
	UnitOfWork.Entry(applicationModel).Property(f => f.UserName).IsModified = false;
	UnitOfWork.Entry(applicationModel).Property(f => f.CreationDateTime).IsModified = false;
	UnitOfWork.Entry(applicationModel).Property(f => f.ValidationDateTime).IsModified = false;
	UnitOfWork.Entry(applicationModel).Property(f => f.LastLogin).IsModified = false;
	UnitOfWork.Entry(applicationModel).Property(f => f.SecurityStamp).IsModified = false;
	UnitOfWork.Entry(applicationModel).Property(f => f.Language).IsModified = false;
}

As you can notice, nothing is done directly on the property that holds the collection of “reputation”. The problem is that if the user has 250 objects in that collection, then for some reason Entity Framework issues 250 updates. Since we only want to update the first name, last name and a few other basic properties, we need to get rid of those unwanted updates. After some modifications to the Entity Framework code, such as nulling every collection before updating, the SQL produced was a single statement, hence the performance back at full speed.
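A minimal sketch of that change, assuming the offending collection is a Reputations navigation property (the name is hypothetical):

public void Update(ApplicationUser applicationModel)
{
    // Cut the graph before attaching: EF will only consider the user's scalar properties
    applicationModel.Reputations = null;

    var local = UnitOfWork.Set<ApplicationUser>().Local.FirstOrDefault(f => f.Id == applicationModel.Id);
    if (local != null)
    {
        UnitOfWork.Entry(local).State = EntityState.Detached;
    }
    UnitOfWork.Entry(applicationModel).State = EntityState.Modified;
    // ...same IsModified = false exclusions as in the original method
}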

Dapper.Net coexistence with Entity Framework and caveats

Dapper.Net is a micro ORM (object relational mapping) used by StackOverflow. It is open source and still maintained by the team, mostly by Marc Gravell, who is also one of the top five StackOverflow users. The project I am working on has been slowed down by Entity Framework (EF) for a few months, and introducing an alternative became necessary, which is why I brought Dapper.Net into the solution. The goal is to slowly move toward something more under my control, and Dapper.Net offers this by letting you write queries directly in SQL. In theory it looks good: it is used by a top website, still maintained, and less intrusive because there is less magic behind the scenes. That said, most of the problems I was hitting with Entity Framework came from the DbContext, and Dapper.Net simply does not have a central object that tries to be intelligent. In this process of introducing Dapper.Net, I will keep Entity Framework to generate the database and to read data; I will just stop using it to save the problematic entities.

Using Dapper.Net is simple. You add a NuGet package, the DLL is downloaded and you can use it. Not a lot of knowledge is required for the basics, because you work with a plain DbConnection that is enhanced with static extension methods to let you query and execute SQL. However, this assumption that little knowledge is needed ends soon when you try to do anything slightly out of the scope of a quick get and set. The first caveat is that the documentation is very slim. One could say it does not need more, but that is only true for basic scenarios. To explain some of the limitations, this article uses an example where Dapper.Net saves an entity called Contest. It is a class whose properties are themselves objects. Some of them are saved directly in the “Contest” table; these are what Entity Framework calls “complex objects”. Others are saved in other tables and linked with a foreign key. Contest has three optional properties in a 0-to-1 relationship with other tables, two 0-to-many (0-*) relationships with other classes, and one required 1-to-1 relationship. The “Contest” table has about 20 columns. Here is a high-level class diagram of the Contest class. I colored in blue the aggregations that are complex objects, and left in white the classes that live in tables other than the contest one.

[Figure: Contest class diagram]

The second Dapper.Net limitation you will hit is the concept of complex objects. Dapper.Net does not know about them, which is fair since it is an EF concept, but the problem is how to tell this ORM to map a specific naming convention into an object instead of mapping a field directly onto the main object (Contest). This is nowhere in the documentation. Here are two examples that you can see in almost every system: a name that is localized, and money with a currency.

[Figure: ComplexObjectDapper, mapping complex objects with Dapper]

Dapper.Net expects properties named Name_French, Name_English, InitialCapital_Value and InitialCapital_CurrencyTypeId. The problem is that Contest has one property called Name, of type LocalizedString, which has two properties called French and English, and one property called InitialCapital, of type Money, which has two properties, Value and CurrencyTypeId. This does not work at all. The way to handle complex types (and also relationship tables) with Dapper is the multi-mapping feature, sketched below. Multi mapping lets you split a single database row into different objects by defining a key pattern, which is “Id” by default. In our example, we could specify that Name_French and InitialCapital_Value are the cues to start a new object. The real problem, not obvious at first, is a limitation the Dapper.Net ORM was built with: you cannot have more than 7 mappings. The problem gets even bigger when you learn that multi mapping is also the way to handle relationships (joins). In our case, the one 1-1, the three 0-1 and the two 0-* relationships already take 6 of the mappings. This constraint is even worse when some of the objects related to the main one (Contest) need to be loaded themselves (multiple inner joins). For example, you load a Contest, which has a relationship to a list of participating Users, which has a relationship to a list of reputations, which has a complex object for the reputation type. Right there you are using 3 mappings (contest->user, user->reputation, reputation->type). Very quickly you hit Dapper’s 7 multi-mapping limitation. That limit of 7 is arbitrary and could, and should, be unlimited: in SQL you can create as many joins as you want, and the ORM should follow that principle.
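To make this concrete, here is a sketch of a multi-mapping call for the two complex types above (the SQL, the column aliases and the connectionString/contestId variables are illustrative; splitOn tells Dapper at which columns the next object starts):

using (var connection = new SqlConnection(connectionString))
{
    var sql = @"SELECT c.Id,
                       c.Name_French AS French, c.Name_English AS English,
                       c.InitialCapital_Value AS Value, c.InitialCapital_CurrencyTypeId AS CurrencyTypeId
                FROM Contest c
                WHERE c.Id = @Id";

    var contest = connection.Query<Contest, LocalizedString, Money, Contest>(
        sql,
        (c, name, capital) =>
        {
            // Dapper materializes the three objects from one row; we stitch them back together
            c.Name = name;
            c.InitialCapital = capital;
            return c;
        },
        new { Id = contestId },
        splitOn: "French,Value").SingleOrDefault();
}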

One way to get around the limitation is to split the work into multiple select queries, at a performance price. What is surprising is that this limitation was raised a few years ago and a pull request was even sent to the team to fix the 7-mapping limit. It got rejected. Stackoverflow.com has a limited set of tables and not the rich business domain of classes that many enterprise applications have, which may justify their opinion that 7 multi-mappings is enough. That said, working with Dapper alongside Entity Framework can become more challenging than expected.

A second approach to handle a lot of complex types and relationships is to use a different class that flattens every table field and then use a conventional class mapper like AutoMapper to map the flattened table class into your rich domain class. However, this comes at the cost of more classes, more mappings, and more unit tests to write to ensure the data goes from one place to the other correctly.

A third approach is to use the basic dynamic query, which returns rows of objects. You need to cast each field, and you also need to handle the multiple rows coming from your joins. At that point you are almost back to doing plain ADO.Net.

A third Dapper.Net limitation is that it cannot be configured to be any wiser. Without being as bloated as Entity Framework and without requiring a DbContext, Dapper.Net could have been a little smarter about mapping; after all, ORM ends with an “M” for “Mapping”.

Overall, Dapper.Net can work alongside Entity Framework as long as you adapt some of your SQL habits: reducing the number of joins, running multiple queries, and so on. I found it easier to start introducing Dapper.Net for write scenarios than for read scenarios. This is because most of my insert and update scenarios touch one entity at a time, which only requires writing a basic SQL insert or update query, pasting it into a string variable, binding the object to it and executing (see the sketch below). Still, my experience with Dapper.Net so far is mixed, and I feel the .Net ecosystem still has room for improvement in the ORM area.
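For example, a write on the Contest table can be as small as this sketch (the table and column names, the connectionString and the contest variable are illustrative):

using (var connection = new SqlConnection(connectionString))
{
    const string sql = @"INSERT INTO Contest (Name_French, Name_English, InitialCapital_Value, InitialCapital_CurrencyTypeId)
                         VALUES (@French, @English, @Value, @CurrencyTypeId)";

    // Dapper binds the anonymous object's properties to the @parameters of the query
    connection.Execute(sql, new
    {
        French = contest.Name.French,
        English = contest.Name.English,
        Value = contest.InitialCapital.Value,
        CurrencyTypeId = contest.InitialCapital.CurrencyTypeId
    });
}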

Entity Framework Many to Many Relationship

The theory for a many-to-many relationship is simple: you map two entities with the keys you define on each side, and Entity Framework generates the join table with the name you assigned. Something like the code below works perfectly. The problem occurs when you want to save the entities later.

this.HasMany(d => d.Moderators).WithMany(d => d.ContestsUserModerate).Map(ul =>
{
    ul.MapRightKey("ApplicationUserModeratorId");
    ul.MapLeftKey("ContestId");
    ul.ToTable("ContestModerators", Constants.SchemaNames.User);
});

The problem is that if you save the entity on which you configured the many-to-many relationship, Entity Framework may try to insert the two entities you are mapping. If entity A refers to B, then A.B holds the full entity, not just a foreign key attribute. The same is true for B.A, which holds a collection of A. In both cases it is a collection of entities. The problem is bigger than just those unwanted inserts, because you could simply mark those collections’ entities as Unchanged in your save method. The real problem is that rich entities contain other entities, so you would have to walk through all of them and handle the state of every property, or nullify every navigation property and rely on the foreign key properties. This is simply not viable for big classes.

The solution is to forget about the Map method of Entity Framework and to create your own association entity.

For example, in the previous code we had an entity called Contest with a many-to-many relationship to ApplicationUser. So you create a ContestApplicationUser class. Then, in both entities (Contest and ApplicationUser), you add a collection of the new class you just created: ContestApplicationUser. You need a configuration for ContestApplicationUser that defines its primary key, which should be the combination of the primary keys of both entities. This can be done by specifying both keys in an anonymous class. Finally, you specify that both properties are required and that each one has a relationship to the entity it represents.

public class ContestApplicationUserConfiguration : AssociationConfiguration<ContestApplicationUser>
{
    public ContestApplicationUserConfiguration() : base(Constants.SchemaNames.Contest)
    {
        this.HasKey(c => new { UserId = c.UserId, ContestId = c.ContestId });
        this.HasRequired(d => d.Contest).WithMany(d => d.Users).HasForeignKey(d => d.ContestId);
        this.HasRequired(d => d.User).WithMany(d => d.Contests).HasForeignKey(d => d.UserId);
    }
}
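For reference, the association class itself stays very small; a sketch could look like this (the key property types depend on your own entities):

public class ContestApplicationUser
{
    public int ContestId { get; set; }
    public Contest Contest { get; set; }

    public int UserId { get; set; }
    public ApplicationUser User { get; set; }
}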

The AssociationConfiguration class is a small helper I created that builds the table name from the entity type.

public abstract class AssociationConfiguration<T> : EntityTypeConfiguration<T> where T : class
{
    protected AssociationConfiguration(string schemaName)
    {
        ToTable(typeof(T).Name, schemaName);
    }
}

From here, it is very simple to save a new association: you instantiate ContestApplicationUser and fill in the two foreign key properties. Going the other way, when you load the entities you include the association collection and, through it, the other side:

    .Include(d => d.Users)
    .Include(d => d.Users.Select(f=>f.User))
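Saving a new association is then just adding a row to the association set; a minimal sketch (context, contest and user are assumed variables; context stands for your DbContext or unit of work):

var association = new ContestApplicationUser
{
    ContestId = contest.Id,
    UserId = user.Id
};
context.Set<ContestApplicationUser>().Add(association);
context.SaveChanges();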

No more hassle when saving, and you still get the full-fledged entity when loading. One caveat of this solution is that you cannot insert a new association before both entities have been inserted.

Entity Framework Complex Type and its Tracking

Entity Framework has something called complex types. They are classes without a unique identifier, used as a subset of an entity. For example, you can have an Address class that you want inside another class without wanting a dedicated table for Address. This is useful for concepts that can be shared across entities without being entities themselves. Another example is auditing. An Audit table is one solution, but you may simply want a ModifiedDate and ModifiedUser on some of your entities. You could copy and paste those two properties on every entity, or create a class holding them and use that class inside every entity that needs auditing.

Let’s see a coding example.

public class House
{
	public int Id { get; set; }
	public double Price { get; set; }
	public Address Address { get; set; }
}

public class Address
{
	public string Street { get; set; }
	public int Number { get; set; }
	public string City { get; set; }
}

A house has an Address. Later, if we add a Business entity, it can also use the Address class without copying all of its properties.
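For Entity Framework to treat Address as a complex type rather than an entity, it can be registered explicitly; in EF6 either a data annotation or the fluent model builder does it (a minimal sketch):

// Option 1: data annotation on the class declared above
[ComplexType]
public class Address { /* ...same properties as above... */ }

// Option 2: fluent configuration in your DbContext
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    modelBuilder.ComplexType<Address>();
}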

The main point of this article is that every property inside a complex type is treated as one unit by Entity Framework’s tracking system. If you mark any of its properties with IsModified set to true, the whole complex type is tracked as changed. Likewise, if you mark the complex type itself as IsModified, every property, even the unchanged ones, is marked as changed. The principle behind this is that a complex type is treated as an immutable value: the complex type is a whole, and changing one value should mean creating a new instance of the object. I am not sure this is convenient in many real-life scenarios, but in theory it makes sense.

To illustrate the point further, here is a second example with Entity Framework calls.

public class MyEntity
{
    public int Id { get; set; }
    public MyComplexClass Property1 { get; set; }
}

var entityLocal = this.Set<MyEntity>().Local.SingleOrDefault(d => d.Id == entity.Id);
if (entityLocal == null)
{
    entityLocal = this.Set<MyEntity>().Attach(entity);
}
this.ChangeTracker.Entries<MyEntity>().Single(d => d.Entity == entityLocal).State = EntityState.Modified;
this.Entry(entity).Property(s => s.Property1.SubProperty1).IsModified = true;
this.Entry(entity).Property(s => s.Property1.SubProperty2).IsModified = false; //This removes every modification of the complex object!
this.SaveChanges();

As you can see, the entity is marked as modified and I set one of the complex type’s properties to IsModified = false. The result is that none of the complex type’s properties are saved. Here is the SQL generated:

    update [dbo].[MyEntity]
    set @p = 0
    where (([Id] = @0))

The SQL touches only the entity’s own row, none of the complex type’s columns. However, if we do not mark SubProperty2 with IsModified = false, we get SQL that saves the whole entity, including the complex type.

    update [dbo].[MyEntity]
    set [Property1_SubProperty1] = @0
    , [Property1_SubProperty2] = null
    where (([Id] = @1))

I took the shortcut of profiling the SQL database to see the output, but you could also inspect the changed properties from the database context and observe the same behavior.

var entry = DatabaseContext.Entry(entity);
var namesOfChangedProperties = entry.CurrentValues.PropertyNames
        .Where(p => entry.Property(p).IsModified)
        .ToArray();

In the end, you have to think of the complex type as a whole: everything in it is related as far as Entity Framework is concerned. It also means that if you care about performance, you may not want huge complex types, because on load and save all the information travels as a single package. You may end up with a result you did not want, loading too much information, or saving half a complex type (in the case where you only filled half of the object). Take this Entity Framework limitation/feature into account when you design your classes.

Why I no longer recommend Entity Framework as your ORM

If you follow this blog or know me, the title may shock you a little. Even if you do not know me, you might think I am crazy. Entity Framework (EF) is THE framework used in enterprise and it is THE Microsoft ORM. It is not that I have not tried to like it: I have posted dozens of articles about it, I was a user before it was even really usable (before version 4.3), and I am still a user today. For those who do not know me, I am the one who fixes EF everywhere it does not work; it seems to be a trend wherever I go. At some point people cannot figure out what to do with EF, and I can make it work, so in the end I guess I am not too bad with it. However, for a few months now I have been using EF on a larger application, over a long period, and I cannot handle it anymore. To give you some context, this is not a simple CRUD application. It has more than 100 entities with a lot of relationships between them, sometimes even circular dependencies. In that project I use the Unit of Work pattern as shown by Julie Lerman; it is wonderful in theory and I still recommend the approach, but not with EF. I also use, in this particular project, an ID field in addition to the object property for foreign keys, so that I can null the navigation property and only use the ID as a key, at least where possible. This is considered the “easy” approach to foreign keys with Entity Framework.

Before starting to enumerate the Entity Framework nightmares, I want to say that I am not allergic to EF; it is just that if I were starting a new project, I would never suggest this ORM. I would use something much lighter. I like the migration tool, well, until you reach a database that you cannot drop. I like not having to maintain SQL scripts to generate or seed my database. Refactoring is also pleasant because it does not require playing with text files. I like code-first configuration, until it becomes a huge mess because people can configure entities from either side. In the end, EF is nice at first glance, but the deeper you go, the more the pain shows.

Let’s start with the migration tool. It is great: with a few keystrokes I can spin up my development database and seed some values. Same thing for my integration environment, where transactions roll back the changes between tests. This is great. The problem is that the script generated by the migration tool sometimes just does not make any sense. The quickest way out is to delete all the migration scripts (the generated C# files) and start again. This is really fast, but it causes problems if your database already has data. You also lose the Up and Down methods, so it becomes impossible to go back to a previous version if you need to roll back to fix a bug. Of course, you can work around it by recovering the files from source control, but still. Also, the seed mechanism is awesome on a blank database; otherwise, welcome trouble, and the maintenance cost goes very high. Again, I am not talking about a simple CRUD application with 20 entities; I am talking about enterprise-grade software with at least 100 entities of 10 to 20 properties each. That said, the migration tool is not what makes me stop recommending Entity Framework.

The killer argument against Entity Framework is that the DbContext is a mess to work with. If you use a unit of work, that unit collects entity state for the lifetime of the DbContext, usually the whole HTTP request. So you have a Save method, you test it and it works. Then you call that method from a different part of the system and suddenly it breaks: errors about concurrency, about the entity having changed, about entities being added unexpectedly, and so on. You debug and figure out that some entity has both the navigation property and the ID set, so you want to null the property, but code after the commit may still need that information, so you cannot just null it. You then have several choices. You can separate your model from your entity, which comes with a huge mapping cost. You can reload the entity from the database, which costs additional database calls. Or you can clone the entity, null the properties on the clone and save the clone. This last approach is the one taken in my last project. The disadvantage is that after saving a new entity you do not get the generated ID back on your model class, only on the clone, and you also end up with code like this:

public void Save(Portefolio portefolio)
{
    // Clone first, then cut the branches on the clone so the original object stays intact
    var portefolioToSave = portefolio.DeepClone();
    portefolioToSave.Owners = null;
    portefolioToSave.Contest = null;
    portefolioToSave.PortefolioStatistics = null;
    foreach (var order in portefolioToSave.Orders)
    {
         order.Portefolio = null;
         order.UserWhoPlacedTheOrder = null;
    }
//...

This is really just a small snippet, and it is not pretty. The problem gets even uglier if you just want to change one value, for example an entity’s status. For something like that I would expect to only have to say which property I want to save and commit. That is not what works. After nulling all the properties of the main entity that we do not care to save, we also need to specify which scalar property we want. Here is another piece of code that illustrates the problem.

if (portefolio.IsNew())
{
	base.UnitOfWork.Entry(portefolio).State = EntityState.Added;
}
else
{
	var portefolioWaitingBeingSaved = base.UnitOfWork.Set<Portefolio>().Local.SingleOrDefault(d => d.Id == portefolio.Id);
	if (portefolioWaitingBeingSaved == null || base.UnitOfWork.Entry(portefolioWaitingBeingSaved).State == EntityState.Detached) //Only do it once
	{
		base.UnitOfWork.Entry(portefolio).State = EntityState.Added;         // Add to generate new IDs for the entities underneath
		base.UnitOfWork.Entry(portefolio).State = EntityState.Unchanged;     // We change none of the scalar property, except the one that we specify after this call
		base.UnitOfWork.Entry(portefolio).Property(g => g.Capital).IsModified = true;
	}
}

If the “Portefolio” entity is new, we just add it; this is fine. However, if it is not new, I only want to save the “Capital” property. To do that I have to do a dance with Entity Framework: set the state to Added, set it back to Unchanged, and then mark the one property I want to save as modified. Do not forget that I also had to null all the other properties. I should only have to pass the entity, make sure its primary key is set, and tell EF which property to update. That’s it… but no.

It does not end there. In a disconnected setting like web development, the unit of work sometimes becomes more of a pain than a help. Its main goal is to run several operations under the same transaction, and it also helps a lot in tests to mock the unit of work, but sometimes I could not get rid of the DbContext holding references to entities I simply did not care about anymore. I created a Reset method on the UnitOfWork class that detaches all entities, so I can call it before some operations. Even that does not always work, so I also have a hack in the unit of work that simply kills the DbContext and creates a new one. Sometimes, when more than one commit is required in a single HTTP request, this method is needed. However few times I need it, I feel bad about having to hack around a version 6 product.
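A minimal sketch of what such a Reset can look like, assuming the unit of work wraps or derives from the DbContext:

public void Reset()
{
    // Detach everything the change tracker knows about so the next save starts from a clean state
    foreach (var entry in this.ChangeTracker.Entries().ToList())
    {
        entry.State = EntityState.Detached;
    }
}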

The bad experience is more than just hacking around: an ORM is supposed to make developers’ lives easier, and with EF that does not seem to be the case, neither in the companies I visit nor in my own projects. I organize my current project with Visual Studio Online and all my tasks and bugs are well tracked. A quick glance at my impediments shows that more than 70% of them are about Entity Framework, and 85% of the time I went over my estimates it was because of Entity Framework issues. Ouch.

The last thing I want to share is that Entity Framework lacks basic features, like doing an update with a where clause. The EF team expects you to fetch the entity from the database first during the HTTP request, verify that the entity really belongs to the user, and then update it, which costs two database calls. A better approach would be to take the data from the HTTP request, build the entity from the bound fields of the HTML form, and save it with a where clause on the authenticated user. If the entity does not belong to the user, no update happens and EF could report zero entities saved: one database call. This is one scenario among many where EF does not make sense if every database call matters to you, which makes me think that most people using Entity Framework are working on intranet applications. That is fine. And there is still no out-of-the-box way to set an index; how come, after six versions? Oh yes, EF 7 will work on NoSQL databases… that is not what we need; we need a solid ORM.

To conclude this rant: I keep working with Entity Framework every day on my side project (not at Microsoft), and I am sure I will keep working with it in the many companies that use it on their websites. However, I will not start a new project of my own with this ORM. If after four years of EF under my belt I still struggle to update a single field of one of my classes, something is wrong.

Entity Framework (EF) modifying an instance that is already in the context

If you set an entity’s state to Modified without having loaded it, and you save, everything works. However, if that entity was loaded earlier, it is already in the local context, which can raise the following exception.

Attaching an entity of type ‘Model.Entities.Users.ApplicationUser’ failed because another entity of the same type already has the same primary key value. This can happen when using the ‘Attach’ method or setting the state of an entity to ‘Unchanged’ or ‘Modified’ if any entities in the graph have conflicting key values. This may be because some entities are new and have not yet received database-generated key values. In this case use the ‘Add’ method or the ‘Added’ entity state to track the graph and then set the state of non-new entities to ‘Unchanged’ or ‘Modified’ as appropriate.

[Figure: EventViewerEntityFrameworkModifyError]

This is very annoying, and it is unfortunate that Entity Framework (EF) does not handle this case more intelligently. In short, you have to detach the local version and mark the entity you are modifying as Modified. The scenario is pretty common when you receive the object from a web request and want to save it: between the request that carries the object and the moment you change the state to Modified (the saving code), you may have loaded a list that contains the entity, or loaded part of it for validation. Either way, you have to handle this case if you want to save the entity.

One way to do it is to look inside the Local collection of the DbSet to see if the entity is already there. If it is, you detach it.

var local = yourDbContext.Set<YourModel>()
                         .Local
                         .FirstOrDefault(f => f.Id == yourModel.Id);
if (local != null)
{
    yourDbContext.Entry(local).State = EntityState.Detached;
}
yourDbContext.Entry(yourModel).State = EntityState.Modified;

As you can see, we first check whether the entity is present in the DbSet’s Local collection. If it is not null, we detach the local copy. In all cases, we set the incoming entity (yourModel) to Modified so that Entity Framework updates it.

Create an Index for your Entity Framework Column

Whether you are working with a field that requires fast access or a field used as a reference in some queries, you will need to create an index. Unfortunately, Entity Framework does not offer a quick way to do it. To tell Entity Framework to generate the SQL index on the column, you must use a column annotation. Entity Framework already provides a class for this, named IndexAnnotation, that you can attach to your column.

this.Property(d => d.Date)
    .HasColumnAnnotation("Index"
                        , new IndexAnnotation(
                                              new IndexAttribute("IX_Date") {IsUnique = true}
                                             )
                        );

The code above sets an index on the Date column. It creates an IndexAnnotation that wraps a unique IndexAttribute.

This results in an index inside the migration class with the unique flag set to true.

CreateTable(
	"YourEntityName",
	c => new
		{
			Id = c.Int(nullable: false, identity: true),
			Date = c.DateTime(nullable: false),
		})
	.PrimaryKey(t => t.Id)
	.Index(t => t.Date, unique: true);

Never use Primitive Types for your Entities

I see an abuse of primitive types in a lot of systems: instead of using a class for a concept, primitives are used directly. Often, when additional needs appear, more primitives are added, which duplicates information. Here is a basic example:

public class Item
{
    public double Price { get; set; }
    public string Name { get; set; }
}

The problem is obvious: the Item class should not use a primitive type for the Price. The first reason is that if we later need additional information, for example the currency, we are stuck adding a second property. The problem becomes even more obvious when the same class has several money values: the class becomes cluttered with properties. With a dedicated class it is also much easier to add a property or a method later without having to change many places in your software; you change one place, the class.

public class Item
{
    public double Price { get; set; }
    public int CurrencyTypeForPrice { get; set; }
    public double SuggestedPrice { get; set; }
    public int CurrencyTypeForSuggestedPrice { get; set; }
    public string Name { get; set; }
}

The second problem shows up when your application grows and you realize that you should have used decimal instead of double: you have to change it in several places instead of a single class. The third problem is about operations. How do you compare two prices? You have to compare the double property (the price) together with the int property (the currency type) every time. With a class that overrides the equality operator, the comparison lives in one place instead of being repeated everywhere.

The fourth problem is about passing information by parameter. When you have a single class, for example a Money class instead of a decimal plus a currency type, it is much cleaner to use. The fifth reason applies when you use Asp.Net MVC and templates: you can create editor and display templates for your type. For example, you could create a Money.cshtml that shows the right control and lets the user pick the currency from a drop-down. Without a dedicated class you would have to create an Html helper that takes two parameters, the amount and the currency.

Finally, you can add validation. If your type represents money, you can enforce that the amount is always positive. This also means you can unit test that logic in a single place instead of everywhere your money logic is used.
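Putting these points together, a small Money value class could look like this sketch (the names and the currency representation are illustrative):

public class Money
{
    public decimal Amount { get; private set; }
    public int CurrencyTypeId { get; private set; }

    public Money(decimal amount, int currencyTypeId)
    {
        // Validation lives in one place: money must always be positive
        if (amount < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(amount), "Money must always be positive.");
        }
        this.Amount = amount;
        this.CurrencyTypeId = currencyTypeId;
    }

    // Comparison also lives in one place instead of being repeated across the code base
    public override bool Equals(object obj)
    {
        var other = obj as Money;
        return other != null
            && this.Amount == other.Amount
            && this.CurrencyTypeId == other.CurrencyTypeId;
    }

    public override int GetHashCode()
    {
        return this.Amount.GetHashCode() ^ this.CurrencyTypeId.GetHashCode();
    }
}

The Item class then exposes two Money properties (Price and SuggestedPrice) instead of four loose primitives.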

To conclude, there are several advantages: easier refactoring, a single home for operations, and better integration with frameworks like Asp.Net MVC. Cleaner code with less repetition is also very valuable. Last but not least, having a class instead of a primitive type lets you unit test any logic about your concept in one place.