Why I no longer recommend Entity Framework as your ORM
Posted on: 2015-06-09
If you follow this blog or know me, then the title may shock you a little. Even if you do not know me, you might think I am crazy. Entity Framework (EF) is THE framework used in the enterprise and it is THE Microsoft ORM. It is not that I haven't tried to like it. I have posted dozens and dozens of articles about it. I was a user before it had a usable version (before version 4.3) and I am still a user today. For those who do not know me, I am the one who fixes EF in every place where it does not work. This seems to be a trend wherever I go: at some point people cannot figure out what to do with EF, and I can make it work. So, in the end, I guess I am not too bad at it. However, for the past few months I have been using EF in a larger-scale application, over a long period, and I cannot handle it anymore.
To give you some context, this is not a simple CRUD application. It has more than 100 entities with a lot of relationships between them, sometimes even circular dependencies. In that project I am using the Unit of Work pattern as shown by Julie Lerman. It is wonderful in theory and I still recommend that approach, but not with EF. In this particular project I am also using an ID field in addition to the object property for each foreign key. This way I can null the object property and use only the ID as the key, when possible. This is considered the "easy" approach to foreign keys with Entity Framework.
Before I start enumerating the Entity Framework nightmares, I want to say that I am not allergic to EF. It is just that if I need to start a new project, I will never suggest that ORM; I will use something much lighter. I like the Migration tool, at least until you reach a database that you cannot drop. I like not having to maintain SQL scripts to generate or seed my database. Refactoring is also pleasant because it does not require editing text files. I like Code First configuration until it becomes a huge mess because people can configure entities in more than one way. In the end, EF looks nice at first glance, but the deeper you go, the more pain you find.
Let's start with the migration tool. It is great. With a few keystrokes I can spin up my development database and seed some values. The same goes for my integration environment, where transactions roll back the changes between tests. The problem is that the script generated by the migration tool sometimes just does not make any sense. The quickest fix is to delete all the migration scripts (the generated C# files) and start again. This is really fast, but it causes problems if your database already contains data. You also lose the earlier Up and Down methods, so it becomes impossible to go back to a previous version when you need to roll back to fix a bug. Of course, there is the workaround of reusing the files kept in source control, but well... Also, the seed is awesome if you have a blank database; otherwise, welcome to trouble, and the maintenance cost gets very high. Again, I am not talking about a simple CRUD application with 20 entities. I am talking about enterprise-grade software with a minimum of 100 entities with 10 to 20 properties each. That said, the Migration tool is not what makes me stop recommending Entity Framework.
The killer argument against Entity Framework is that the DbContext is a mess to work with. If you are using a unit of work, that unit collects entity state during the DbContext's lifetime, which is mostly the whole HTTP request. So you have a Save method, you test it, and it works. You call that method from a different part of your system and suddenly it breaks. You get errors about concurrency, about the entity having changed, about an entity that EF tries to add, etc. You debug and figure out that some entity has both the object property and the ID set, so you want to null the property, but code that runs after the commit may need this information, so you cannot just null properties. You then have several choices. You can separate your model from your entity, which comes with a huge mapping cost. You can fetch the entity again from the database, which costs additional database calls. You can also clone the entity, null the properties on the clone, and save the clone. This is the approach I took in my last project. The disadvantage is that, after saving a new entity, the generated ID is not automatically set on your model class, only on the clone, and you also end up with code like this:
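    public void Save(Portefolio portefolio)
    {
        // Work on a deep clone so that code running after the commit still sees the original object graph
        var portefolioToSave = portefolio.DeepClone();

        // Null every object property that must not be saved along with the portefolio
        portefolioToSave.Owners = null;
        portefolioToSave.Contest = null;
        portefolioToSave.PortefolioStatistics = null;
        foreach (var order in portefolioToSave.Orders)
        {
            order.Portefolio = null;
            order.UserWhoPlacedTheOrder = null;
        }
        //...
    }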
Well, this is just a small snippet, and it is not pretty. The problem becomes even uglier if you just want to change one value, for example an entity's state. For something like that, I would expect to only have to say which property I want to save and then commit. Unfortunately, this does not always work. So after nulling all the properties of the main entity that we do not want to save, we need to specify which scalar property we want. Here is another piece of code that illustrates the problem.
    if (portefolio.IsNew())
    {
        base.UnitOfWork.Entry(portefolio).State = EntityState.Added;
    }
    else
    {
        var portefolioWaitingBeingSaved = base.UnitOfWork.Set<Portefolio>().Local.SingleOrDefault(d => d.Id == portefolio.Id);
        if (portefolioWaitingBeingSaved == null || base.UnitOfWork.Entry(portefolioWaitingBeingSaved).State == EntityState.Detached) // Only do it once
        {
            base.UnitOfWork.Entry(portefolio).State = EntityState.Added;     // Add to generate new IDs for the entities underneath
            base.UnitOfWork.Entry(portefolio).State = EntityState.Unchanged; // We change none of the scalar properties, except the one that we specify after this call
            base.UnitOfWork.Entry(portefolio).Property(g => g.Capital).IsModified = true;
        }
    }
If the entity "Portefolio" is new, then we just add it. This is fine. However, if it is not new, then we are updating, which here means that I just want to save the "Capital". Well, I have to do a dance with Entity Framework: set the state to Added, set it back to Unchanged, and then mark the property that I want to save as modified. Do not forget that I also had to set all the other properties to null. I should only have to specify the entity, ensure that its primary key is set, and tell EF which property to update. That's it... but no.
It does not end there. Sometimes, in disconnected software such as web applications, the unit of work becomes more of a pain than anything else. Its main goal is to run several operations under the same transaction. It also helps a lot during testing, because the unit of work can be mocked, but sometimes I could not get rid of problems with the DbContext holding references to entities that I no longer care about. I created a Reset method on the UnitOfWork class that detaches all entities, so I can call it before some operations. However, it does not always work. So I have a hack in the unit of work that simply kills the DbContext and creates a new one. Sometimes, when more than one commit is required in an HTTP request, this method is needed. The few times I need to use it, I feel bad, bad that I have to hack around a version 6...
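To make the idea of detaching everything concrete, here is a minimal sketch of what such a reset can look like with the EF 6 change tracker. It is an illustration under assumptions, not the actual code of the project: the `DbContextExtensions` class and the `DetachAll` name are hypothetical, and as said above, this kind of reset does not solve every case.

```csharp
using System.Data.Entity;
using System.Linq;

// Hypothetical helper; the name and placement are assumptions for illustration.
public static class DbContextExtensions
{
    // Detaches every entity currently tracked by the given context.
    public static void DetachAll(this DbContext context)
    {
        // ToList() takes a snapshot first, because detaching an entry
        // changes the set of tracked entries while we iterate.
        foreach (var entry in context.ChangeTracker.Entries().ToList())
        {
            entry.State = EntityState.Detached;
        }
    }
}
```

A unit of work could call `context.DetachAll()` before an operation that must not see entities left over from earlier work in the same request.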
The bad experience is about more than just hacking around. An ORM is supposed to make developers' lives easier, and with EF that does not seem to be the case. It is not what I see when I go into enterprises, nor in my own work. I am organizing my latest project with Visual Studio Online and I have all my tasks and bugs well planned. A quick glance at my impediments shows that more than 70% of them are about Entity Framework. 85% of the time that I went over my estimates, it was because of Entity Framework bugs. Ouch.
The last thing I want to share is that Entity Framework lacks basic features such as doing an update with a where clause. The EF team expects you to first get the entity from the database during the HTTP request, so you can ensure that the retrieved entity really belongs to the user, and then update it. This causes 2 database calls. A better approach would be to take the data from the HTTP request, build your entity from the bound fields of the HTML form, and save it with a where clause on the authenticated user. If the entity does not belong to the user, then no update is done, and EF could return 0 entities saved. 1 database call. This is one scenario among many where EF does not make sense if every database call matters to you. It makes me think that most of the time people using Entity Framework are working on intranet applications. This is fine. And well, there is no out-of-the-box way to set an index... how come, after 6 versions? Oh yeah, EF 7 will work with NoSQL databases... this is not what we need, we need a solid ORM.
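For comparison, the single-call update described above can be sketched by stepping outside the change tracker and sending raw SQL through `Database.ExecuteSqlCommand`, which returns the number of affected rows. This is only a sketch under assumptions: the `PortefolioRepository` class, the `Portefolios` table, and the `OwnerId` column are hypothetical names, and it is an escape hatch rather than an ORM feature.

```csharp
using System.Data.Entity;

// Hypothetical repository; table and column names are assumptions for illustration.
public class PortefolioRepository
{
    private readonly DbContext context;

    public PortefolioRepository(DbContext context)
    {
        this.context = context;
    }

    // Returns true when exactly one row was updated, meaning the row exists
    // and belongs to the authenticated user. One database call.
    public bool UpdateCapital(int portefolioId, decimal capital, int authenticatedUserId)
    {
        // The values are sent as parameters (@p0, @p1, @p2), not concatenated into the SQL.
        int rows = context.Database.ExecuteSqlCommand(
            "UPDATE Portefolios SET Capital = @p0 WHERE Id = @p1 AND OwnerId = @p2",
            capital, portefolioId, authenticatedUserId);

        return rows == 1;
    }
}
```

The trade-off is that entities already tracked by the DbContext are not refreshed by this call, so it fits best when nothing else in the request depends on the old value.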
To conclude this rant on Entity Framework: I continue to work with Entity Framework every day on my side project (not at Microsoft), and I am sure I will continue to work with it in the future at multiple companies that use it on their websites. However, I will not start a new project of my own with this ORM. If after 4 years with EF under my belt I still struggle to update a single field in one of my classes, then it means that something is wrong.