hackification

...rediscover the joy of coding

LINQ-to-Entities: Follow-Up

There's been a bit of discussion about my last article, "LINQ-to-Entities: The Blackberry Storm of ORMs?". I'd like to clarify what I was trying to say, especially with regard to my statement about LINQ-to-Entities returning differing values depending on code order.

Firstly, the discussions for reference:

#1: It's Not Functional

Imagine the following SQL statements:
SELECT * FROM [Order] WHERE [CustomerId] = 1
SELECT * FROM [Order]
SELECT * FROM [Order] WHERE [CustomerId] = 1
In the absence of any other concurrent users, if I suggested that the first and third identical queries should return different values, you'd think I was nuts. But that's exactly what LINQ-to-Entities does.
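To make the order dependence concrete, here is a toy model in plain C#. It is only a sketch of the behaviour I'm describing, not Entity Framework code: `ToyContext`, `GetCustomer`, `GetAllOrders` and `OrderDependenceDemo` are all names I've made up for illustration, and the "database" is just an in-memory list.

```csharp
using System.Collections.Generic;
using System.Linq;

// Toy model only: none of these types are Entity Framework APIs.
public class Order
{
    public int Id;
    public int CustomerId;
}

public class Customer
{
    public int Id;
    // Starts empty; only filled when orders are materialised into the same context.
    public List<Order> Orders = new List<Order>();
}

public class ToyContext
{
    // Stands in for the rows of the [Order] table.
    private readonly List<Order> orderRows = new List<Order>
    {
        new Order { Id = 1, CustomerId = 1 },
        new Order { Id = 2, CustomerId = 1 },
        new Order { Id = 3, CustomerId = 2 },
    };

    private readonly Dictionary<int, Customer> trackedCustomers =
        new Dictionary<int, Customer>();

    // Materialises the customer but does NOT load its orders.
    public Customer GetCustomer(int id)
    {
        Customer customer;
        if (!trackedCustomers.TryGetValue(id, out customer))
        {
            customer = new Customer { Id = id };
            trackedCustomers[id] = customer;
        }
        return customer;
    }

    // Materialises every order and, as a side effect, attaches each one
    // to its customer if that customer is already tracked.
    public List<Order> GetAllOrders()
    {
        foreach (var order in orderRows)
        {
            Customer owner;
            if (trackedCustomers.TryGetValue(order.CustomerId, out owner)
                && !owner.Orders.Contains(order))
            {
                owner.Orders.Add(order);
            }
        }
        return orderRows.ToList();
    }
}

public static class OrderDependenceDemo
{
    // Returns the order count seen before and after the unrelated query.
    public static int[] Run()
    {
        var context = new ToyContext();
        var customer = context.GetCustomer(1);

        int before = customer.Orders.Count;   // nothing loaded yet
        context.GetAllOrders();               // the unrelated "SELECT * FROM [Order]"
        int after = customer.Orders.Count;    // same expression, different answer

        return new[] { before, after };
    }
}
```

Calling `OrderDependenceDemo.Run()` gives 0 for the first count and 2 for the second, even though nothing in the underlying data changed in between.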

The world is moving towards functional programming. Functions named "Get" (or "SELECT") really shouldn't have side-effects. LINQ-to-Entities violates the principle of least surprise - most coders would expect a SELECT to have no (functional) side-effects.

#2: Performance Over Correctness

I understand why LINQ-to-Entities doesn't load the values in the first instance. There is a danger when using ORMs that queries inside loops can lead to N+1 queries, killing performance. (In my last job I was involved in writing a custom ORM system, so I know a little about the problem).

However, in my opinion there really are two options to solve this:
  1. Performance indicators are applied as 'hints' that don't affect functionality - if you find a block of code is producing bad or multiple queries, you add pre-loading hints.
  2. Performance is determined to be sufficiently important that idea #1 is no good. In that case, if the appropriate pre-loads haven't been issued, an exception is thrown.
LINQ-to-Entities has decided to go for 'option' #3: silently return the wrong answer. I always thought the guiding principle of .NET was managed coding - sacrifice a little performance for a great deal of safety: bounds checking, type checking, and so forth. Perhaps the ADO.NET team still think they're writing C code?
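For what it's worth, option 2 needn't be complicated. The following is a minimal sketch of a collection that throws when read before it has been loaded, rather than pretending to be empty. `GuardedCollection<T>` and its loader delegate are hypothetical names of my own, not anything shipped in the Entity Framework.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical wrapper, not an Entity Framework type: it refuses to be read
// until it has been explicitly loaded.
public class GuardedCollection<T> : IEnumerable<T>
{
    private readonly Func<IEnumerable<T>> loader;
    private List<T> items;   // null until Load() is called

    public GuardedCollection(Func<IEnumerable<T>> loader)
    {
        this.loader = loader;
    }

    public bool IsLoaded { get { return items != null; } }

    // The explicit pre-load; repeated calls do not reload.
    public void Load()
    {
        if (items == null) items = new List<T>(loader());
    }

    public IEnumerator<T> GetEnumerator()
    {
        if (items == null)
            throw new InvalidOperationException(
                "Collection read before Load() was called.");
        return items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class GuardedCollectionDemo
{
    // True if enumerating an unloaded collection throws.
    public static bool ThrowsWhenUnloaded()
    {
        var orders = new GuardedCollection<int>(() => new[] { 10, 20 });
        try
        {
            foreach (var o in orders) { }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Number of items seen once Load() has been called.
    public static int CountAfterLoad()
    {
        var orders = new GuardedCollection<int>(() => new[] { 10, 20 });
        orders.Load();
        int count = 0;
        foreach (var o in orders) count++;
        return count;
    }
}
```

With this shape a forgotten pre-load fails loudly at the point of use, which is the trade-off I'd expect from a managed platform.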

#3: It's Not A Replacement For LINQ-to-SQL

There's no one correct way to write an ORM. Different applications have different requirements. A general purpose ORM will never satisfy 100% of developers. Fine. I'm happy with that; there's a nice market for specialist providers.

What I'm not happy with is that while LINQ-to-SQL seemed to make 90% of developers happy, it's being replaced with LINQ-to-Entities, which (judging by the feedback I've seen) makes far fewer developers happy.

I'm fine with the ADO.NET team writing a solution that fills that 10% gap or otherwise augments LINQ-to-SQL. I'm not happy with them replacing a 90% solution with a specialist 10% solution.

#4: It Hinders Common Scenarios

I work largely in the web app development area, which I understand is just one of many dev scenarios that Microsoft must support. Having said that, it is an increasingly large area (perhaps the largest commercial area, considering that Windows desktop applications are essentially dead apart from internal corporate dev).

Let's take a typical data-driven web-page:

http://stackoverflow.com/users/6604/stusmith

(Now you can put a face to my writing).

I can imagine the pseudo-LINQ being something like this:
var user = (from u in data.Users
            where u.UserName == "stusmith"
            select u).Single();

// Display the header details.

var questions = from q in user.Questions orderby q.Votes descending select q;

// Display each question.

var answers = from a in user.Answers orderby a.Votes descending select a;

// Display each answer.

// Etc for tags, badges, and so forth.
This pattern is very common and doesn't involve any performance issues, yet LINQ-to-Entities demands that I explicitly load each collection from the database.

#5: Is It Nothing More Than LINQ-to-Objects - or - Where's The Magic Gone?

The response to my article from the ADO.NET guys is:

"The case here is actually a misunderstanding on the part of the author. The second query that they [sic] author runs, var order, is actually a LINQ to Objects query, not a LINQ to Entities (or LINQ to SQL) query."

So... LINQ-to-Entities only kicks in when I either (a) query top-level tables, or (b) call Load() methods?

In which case, LINQ-to-Entities actually does a lot less than LINQ-to-SQL. Another quote:

"...the explicit loading in the Entity Framework means that it will not make extra trips to the database "magically" for you."

My point exactly.
LINQ-to-Entities is like LINQ-to-SQL but with the magic removed.
(Maybe that should be their marketing slogan? "LINQ-to-Entities - now with 100% less magic!").

I always thought the point of an ORM was that it was a transparent mapping from database to objects. I can write explicit loads myself; they're called SELECT statements. The wonder of LINQ-to-SQL was that I didn't have to. Now I do. Is that really progress?
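The "magic" I'm talking about is really just load-on-first-access. Here is a rough sketch of the idea using `Lazy<T>` from .NET 4. `LazyCollection<T>` is a hypothetical type for illustration (it is not the LINQ-to-SQL implementation), and the loader delegate stands in for the database query.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration; not a LINQ-to-SQL or Entity Framework class.
public class LazyCollection<T>
{
    private readonly Lazy<List<T>> items;

    // Counts how many times the loader (the stand-in for the database) was hit.
    public int LoadCount { get; private set; }

    public LazyCollection(Func<IEnumerable<T>> loader)
    {
        items = new Lazy<List<T>>(() =>
        {
            LoadCount++;
            return new List<T>(loader());
        });
    }

    // First access triggers the load; later accesses reuse the result.
    public IList<T> Items { get { return items.Value; } }
}

public static class LazyCollectionDemo
{
    // Returns: loads before access, count on first access,
    // count on second access, total loads.
    public static int[] Run()
    {
        var questions = new LazyCollection<string>(
            () => new[] { "q1", "q2", "q3" });

        int loadsBefore = questions.LoadCount;   // nothing fetched yet
        int count = questions.Items.Count;       // triggers the single load
        int countAgain = questions.Items.Count;  // served from memory

        return new[] { loadsBefore, count, countAgain, questions.LoadCount };
    }
}
```

The caller never asks for a load, the data is fetched once when first needed, and the same expression gives the same answer every time.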