LINQ-to-Entities: Follow-Up

There’s been a bit of discussion about my last article, “LINQ-to-Entities: The Blackberry Storm of ORMs?”. I thought I’d try to clear up what I was trying to say, especially with regard to my statement about LINQ-to-Entities returning differing values depending on code order.

#1: It’s Not Functional

Imagine the following SQL statements:

SELECT * FROM [Order] WHERE [CustomerId] = 1
SELECT * FROM [Order]
SELECT * FROM [Order] WHERE [CustomerId] = 1

In the absence of any other concurrent users, if I suggested that the first and third identical queries should return different values, you’d think I was nuts. But that’s exactly what LINQ-to-Entities does.
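
To make the complaint concrete, here is a toy model of the behaviour, written as a sketch rather than as real Entity Framework code. ToyContext, GetCustomer and GetAllOrders are invented for illustration, and the assumption is that a navigation collection only ever holds rows the context has already materialised.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-ins for a table and an object context. These types are
// hypothetical and only imitate the behaviour described above; they are
// not the Entity Framework API.
public class Order
{
    public int Id;
    public int CustomerId;
}

public class Customer
{
    public int Id;
    // Navigation collection: holds only orders the context has already loaded.
    public List<Order> Orders = new List<Order>();
}

public class ToyContext
{
    // The "database": three orders as { Id, CustomerId }, two for customer 1.
    private static readonly int[][] OrderRows =
    {
        new[] { 1, 1 }, new[] { 2, 1 }, new[] { 3, 2 }
    };

    private readonly Dictionary<int, Customer> customers = new Dictionary<int, Customer>();

    public Customer GetCustomer(int id)
    {
        Customer c;
        if (!customers.TryGetValue(id, out c))
        {
            c = new Customer { Id = id };
            customers[id] = c;
        }
        return c;
    }

    // Querying the top-level table has a side effect: each order it
    // materialises is attached to its customer, if that customer is tracked.
    public List<Order> GetAllOrders()
    {
        var result = new List<Order>();
        foreach (var row in OrderRows)
        {
            var order = new Order { Id = row[0], CustomerId = row[1] };
            result.Add(order);

            Customer owner;
            if (customers.TryGetValue(order.CustomerId, out owner)
                && !owner.Orders.Any(o => o.Id == order.Id))
            {
                owner.Orders.Add(order);
            }
        }
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        var data = new ToyContext();
        var customer = data.GetCustomer(1);

        int first = customer.Orders.Count(o => o.CustomerId == 1); // first query
        data.GetAllOrders();                                        // second, unrelated query
        int third = customer.Orders.Count(o => o.CustomerId == 1); // same as the first

        Console.WriteLine("first = {0}, third = {1}", first, third); // first = 0, third = 2
    }
}
```

The first and third lines are textually identical, yet the answer changes because of a query that ran in between.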

The world is moving towards functional programming. Functions named “Get” (or “SELECT”) really shouldn’t have side-effects. LINQ-to-Entities violates the principle of least surprise – most coders would expect a SELECT to have no (functional) side-effects.

#2: Performance Over Correctness

I understand why LINQ-to-Entities doesn’t load the values in the first instance. There is a danger when using ORMs that queries inside loops can lead to N+1 queries, killing performance. (In my last job I was involved in writing a custom ORM system, so I know a little about the problem).

However, in my opinion there really are two options to solve this:

  1. Performance indicators are applied as ‘hints’ that don’t affect functionality – if you find a block of code is producing bad or multiple queries, you add pre-loading hints.
  2. Performance is determined to be sufficiently important that idea #1 is no good. In that case, if the appropriate pre-loads haven’t been issued, an exception is thrown.

The LINQ-to-Entities team has decided to go for ‘option’ #3: silently return the wrong answer. I always thought the guiding principle of .NET was managed coding – sacrifice a little performance for a great deal of safety: bounds checking, type checking, and so forth. Perhaps the ADO.NET team still think they’re writing C code?
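
The three choices can be sketched side by side. This is a minimal illustration, not how any real ORM is implemented: RelatedCollection, UnloadedPolicy and the fetch delegate are hypothetical names, and the delegate simply stands in for a database query.

```csharp
using System;
using System.Collections.Generic;

// What should happen when code reads a related collection that has not been
// loaded yet? The three policies mirror the three options discussed above.
public enum UnloadedPolicy
{
    LoadOnDemand, // option 1: always correct; pre-loading is only a speed hint
    Throw,        // option 2: a missing pre-load is reported as an error
    ReturnEmpty   // 'option' 3: a missing pre-load silently changes the answer
}

// Hypothetical wrapper around a related collection; not an Entity Framework type.
public class RelatedCollection<T>
{
    private readonly Func<List<T>> fetch;   // stands in for a database query
    private readonly UnloadedPolicy policy;
    private List<T> items;                  // null until loaded

    public RelatedCollection(Func<List<T>> fetch, UnloadedPolicy policy)
    {
        this.fetch = fetch;
        this.policy = policy;
    }

    // The explicit pre-load: runs the query at most once.
    public void Load()
    {
        if (items == null)
            items = fetch();
    }

    public List<T> Items
    {
        get
        {
            if (items != null)
                return items;

            switch (policy)
            {
                case UnloadedPolicy.LoadOnDemand:
                    Load();
                    return items;
                case UnloadedPolicy.Throw:
                    throw new InvalidOperationException("Collection read before it was loaded.");
                default:
                    return new List<T>(); // looks like "no rows", which is wrong
            }
        }
    }
}

public static class PolicyDemo
{
    public static void Main()
    {
        Func<List<string>> query = () => new List<string> { "order-1", "order-2" };

        var lazy = new RelatedCollection<string>(query, UnloadedPolicy.LoadOnDemand);
        Console.WriteLine(lazy.Items.Count);   // 2

        var strict = new RelatedCollection<string>(query, UnloadedPolicy.Throw);
        try { Console.WriteLine(strict.Items.Count); }
        catch (InvalidOperationException e) { Console.WriteLine(e.Message); }

        var silent = new RelatedCollection<string>(query, UnloadedPolicy.ReturnEmpty);
        Console.WriteLine(silent.Items.Count); // 0
        silent.Load();
        Console.WriteLine(silent.Items.Count); // 2
    }
}
```

Only the third policy lets the same property read return two different answers without any error, which is the behaviour being objected to here.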

#3: It’s Not A Replacement For LINQ-to-SQL

There’s no one correct way to write an ORM. Different applications have different requirements. A general purpose ORM will never satisfy 100% of developers. Fine. I’m happy with that; there’s a nice market for specialist providers.

What I’m not happy with is that while LINQ-to-SQL seemed to make 90% of developers happy, it’s being replaced with LINQ-to-Entities, which (judging by the feedback I’ve seen) makes far fewer developers happy.

I’m fine with the ADO.NET team writing a solution that fills that 10% gap or otherwise augments LINQ-to-SQL. I’m not happy with them replacing a 90% solution with a specialist 10% solution.

#4: It Hinders Common Scenarios

I work largely in the web app development area, which I understand is just one of many dev scenarios that Microsoft must support. Having said that, it is an increasingly large area (perhaps the largest commercial area, considering that Windows desktop applications are essentially dead apart from internal corporate dev).

Let’s take a typical data-driven web-page:

http://stackoverflow.com/users/6604/stusmith

(Now you can put a face to my writing).

I can imagine the pseudo-LINQ being something like this:

var user = (from u in data.Users
            where u.UserName == "stusmith"
            select u).Single();

// Display the header details.

var questions = from q in user.Questions
                orderby q.Votes descending
                select q;

// Display each question.

var answers = from a in user.Answers
              orderby a.Votes descending
              select a;

// Display each answer.

// Etc for tags, badges, and so forth.

This pattern is very common and doesn’t involve any performance issues, yet LINQ-to-Entities demands that I explicitly load each related collection from the database.

#5: Is It Nothing More Than LINQ-to-Objects – or – Where’s The Magic Gone?

The response to my article from the ADO.NET guys is:

“The case here is actually a misunderstanding on the part of the author. The second query that they [sic] author runs, var order, is actually a LINQ to Objects query, not a LINQ to Entities (or LINQ to SQL) query.”

So… LINQ-to-Entities only kicks in when I either (a) query top-level tables, or (b) call Load() methods?

In which case, LINQ-to-Entities actually does a lot less than LINQ-to-SQL. Another quote:

“…the explicit loading in the Entity Framework means that it will not make extra trips to the database “magically” for you.”

My point exactly.

LINQ-to-Entities is like LINQ-to-SQL but with the magic removed.

(Maybe that should be their marketing slogan? “LINQ-to-Entities – now with 100% less magic!”).

I always thought the point of an ORM was that it was a transparent mapping from database to objects. I can write explicit loads myself; they’re called SELECT statements. The wonder of LINQ-to-SQL was that I didn’t have to. Now I do. Is that really progress?

7 Responses to “LINQ-to-Entities: Follow-Up”

  1. “LINQ-to-Entities – now with 100% less magic!”

    lol…

  2. “LINQ-to-Entities is like LINQ-to-SQL but with the magic removed.”

    Very Good !
    Thank you for a lovely article

  3. Excellent summary of the EF shortcomings on the ‘coding side’ of things.

    Add to that what comes out on the ‘other side’, e.g. the poor SQL generated by Linq-to-Entities vs the highly optimized and efficient SQL generated by Linq-to-SQL.

    The MSDN forum for EF has a lot of horror stories and examples where simple 5-table joins come out as 1500+ table SQL queries.

  4. I like my LINQ-to-SQL magic thank you very much. You can’t have it!

  5. Perfect article,
    I couldn’t agree with you more!
    But, then, what is the solution? Use Linq-to-SQL or write plain SQL?
    Can Linq-to-SQL replace the SQL queries and actually help you?
    Or … say … maybe … NHibernate?

    Thanks!

  6. In your example, if I needed the header and the answers and so forth, I would use the “Include” method on an L2E query for each piece of related data you need.

    See this article for an example:
    http://blogs.msdn.com/bethmassi/archive/2008/12/10/master-details-with-entity-framework-explicit-load.aspx

    EF has many ways to access data. You can use L2E, eSQL or even EntityClient directly. Being able to choose the best option for the context you are working in is, in my opinion, something good, not bad. Being able to load immediately, or only when I really need the related data, is also important to me, and to anyone who cares about scalability and performance.

    I’m not saying that EF is perfect. Far from it. It has many shortcomings that will probably be addressed in v2. And that’s the point. It’s v1 and already has huge potential. How many products as complex as this can say the same at v1?

    BTW… I’m using v1 in production today and I’m loving it. Moving from another similar product to EF will raise issues for sure. People are used to working in a certain way. That’s a good thing because, as I see it, Microsoft will be including a lot of feedback in v2.

    As for me, I’m amazed by the complexity and reach of v1 as it is. I’m expecting a v2 full of new stuff. Who knows, you will probably get a property that allows you to lazy-load stuff. Not that I care about it. After all… you can actually change the partial class of the model to do what you need without having to wait for MS to do it. T4 template is the magic word.

    regards

  7. Demented Devil on March 31st, 2009 at 12:43 pm

    As someone who was forced to write their own ORM using VS2005, I can say that I love BOTH L2S and EF. Which one do I use? Well, that depends. However, I can tell you that my ORM stole the EntitySet/EntityRef/lazy loading and object caching concepts from EF. It worked fantastically well, although it was far from perfect and far from painless…

    If it’s magic you want then look no further than LINQ – that is the true piece of wonder code.

    These days I’m having plenty of fun with RIA Services which reuses EF and Silverlight!

    Besides which – nobody says you HAVE to use it – you don’t like it – fine – stop bitching!
