Lingering reservations about ORM

I have always had reservations about locking myself into a strict Object Relational Model scheme for the same reason I have been allergic to Windows. I was never able to articulate exactly why but my friend Juozas Kaziukėnas has brought it into sharp focus in his latest blog post on Doctrine.

Here’s the deal: I played with and studied Doctrine and I found it to be the most flexible and powerful of all the data mapping systems I have seen. Plus, Zend Framework is moving to it, so eventually I think it will be completely integrated with Zend Framework whether I like it or not. I had some concern due to alarming memory usage statistics reported when using Doctrine. But it seems that (as is often the case) the bad performance is the result of programmer ignorance, and not inefficiency of Doctrine itself.

Juozas addresses memory usage in his latest post. Of course you can alleviate some problems with smart cacheing and you can alleviate others (probably even more) by doing the appropriate queries in the first place. Here is his simple example that totally focuses the problem for me:

Ignorant Code:

$user = Doctrine_Core::getTable('User')->find(1); 
foreach ($user->Comments as $comment)
{
   print $comment->NewsItem->title . '
'; }

The code above is totally inappropriate use of lazy loading. It is getting the “main” record and then lazy loading each hasMany in a loop with a separate query. If you already know you’re going to be iterating over the comments, go to the bother of doing an explicit join and include them in the initial query so you get all the data you want in one go like this:

Good Code:

$query = Doctrine_Query::create(); 
$query->from('Comments C')
        ->innerJoin('C.NewsItem N')
        ->where('C.user_id = ?'); 
foreach ($query->execute(array(1)) as $comment)
{
   print $comment->NewsItem->title . '
'; }

When I saw this I thought, “Of course, that’s exactly it!” Juozas knows what he’s doing — he is a Doctrine expert. Most coders aren’t. What ORM’s do is analogous to what code generation systems do: They generate queries (code) that gets the job done but unless you are aware of the underlying reality, it will do it in a way suboptimal way. I have already seen bad hand coded SQL springing from bad schema design. Now you can have a perfect schema and still have bad SQL that you never really even see. It’s just another way to divorce coders from the underlying reality of SQL, the same way DreamWeaver can strip the C out of CSS. Sure, it allows a lot more ignorant yobbos to do things without pissing all over their hands, but what is going to be the performance and Total Cost of Ownership (TCO) of that system once you get some load on it!!

Back in the pre-ORM days I hand generated each query and tried to make SQL do as much work as possible. I would envision the result set I wanted, then come up with a query (usually with multiple joins) to deliver it. The only time I ever ran into performance issues with queries was just once with a very complex query with subqueries where I did the joins in a suboptimal order. I was able to get a hundredfold increase in performance by a simple EXPLAIN that told me exactly how to fix it.

Ah well, I look at it as an opportunity. Don’t get me wrong, I think Doctrine is a great tool that will speed initial development. I’m much happier to use it after seeing this blog post. I’d be willing to hazard a guess that most of the reported performance issues are due to stupid queries such as this example, and Doctrine is powerful enough that they can be fixed. So after the kiddies have tromped through and completely pegged your application’s memory with first attempt queries, talk to ME (an old school DBA who actually knows how queries work) and I can probably squeeze an order or two of magnitude of performance out of it.

7 comments to Lingering reservations about ORM

  • Hi, I am one of the core developers of Doctrine.

    First of all, the “good code” misses a “C.” before “NewsItem N”. This is a requirement because it must know which relation it’s going to be considered. It’s not a top-level component.

    I think everybody agrees that any ORM can bloat an application if used incorrectly. By taking advantage of an OQL (DQL is a proprietary OQL), auto load of relations are possible.
    Of course we may suggest to not load the entire relations tree also, you must know what you’re doing.

    Doctrine 1.X series is a bit bloated since it grouped LOTS of functionality. But I’d suggest you to take a look at Doctrine 2, which is a very well engineered solution based in JPAv2 (JSR-317) spec.

    We internally apply lots of optimizations, but nothing better than a DBA allied in the project to point to exact places. That’s why we also provide a DBAL layer, allowing DBAs to play with the schema, including add indexes, constraints, checks, etc.
    But you have to be aware that we need to have some guidance, and we always focus on most generic way. So an specific query may not be the optimal one for a given platform, but it works in all of them.

    Keep compatibility with Oracle, MySQL, PgSQL, SQLite, SQL Server… arf… it’s not an easy task! =)

    Anyway, good to see that we got some attention from DBAs also!

    Cheers,

    Guilherme Blanco

  • Thispost is really interesting, I didn’t know all of this.

  • admin

    Sorry about the typos, I just copied the example verbatim late at night without setting it up first because it brought the principal issue into such sharp focus. Will fix shortly. And I totally understand about making it database agnostic having to make a few sacrifices. Now you are making me itch to try V 2.0. My results were all based on 1.2

    Truly Doctrine is the first one I’ve seen that exposes enough granularity without losing the X-plat abstraction so that I believe I could work with it. 🙂 (and in a really rare critical situation I could write a non portable optimal query and just document it really really hard!) Keep up the good work!

  • The problem with your good code is that you are using your ORM in more of a relational manner than Object Orientated. It looks like you just wrote a glorified SQL query.

    I think ORMs still have not broken away from SQL enough. You should be telling the ORM what you want and it should decide to do the join.

    I think a ORM call should be structure like this:
    $user->comments()->NewsItems()->Rows

    $user->comments() should not return a comment, but a ORM object. Internally the $user->comment() object should simple record “join user and comment tables where x”.

    $user->comments()->NewsItems() should not return a NewsItem, but a ORM object. Internally the $user->comments()->NewsItems() object should simple record “join user, comment and newsitem tables where x and x”.

    When $user->comments()->NewsItems()->Rows property is called it should then be able to join all required tables and only require 1 SQL query.

    In your example it would do this:
    $user = Core::getTable(‘User’)->find(1);
    foreach ($user->Comments()->Rows as $comment)
    {
    print $comment->NewsItems()->Row->title . ”;
    }

    Only when Rows/Row is called would the ORM run sql Query. Until then you whould only be dealing wirh ORM objects.

  • admin

    What you’re saying is more typified by the “bad code.” I get this user object and I also want some related stuff. I told it that and it gave me what I wanted but it decided to do it a really dumb way.

    To be sure, the “good” code does look more like a SWL query.

    I don’t think that ORMs should totally break away from SQL, at least not until we can have both clarity and decent performance from “telling it what you want.” and that presents another layer of communication. I can tell it what I want very eloquently with a high performing SQL query. Now I have to learn a whole bunch of new methods just to do what I can already do.

    The ORM would have to be doing some kind of analysis of the code that is callig it to figure out that it need to join a bunch of stuff and then iterate through the results!

  • thanks for the information on ORM… keep posting.

  • Your information about ORM.. is nice. I think that’s a great collection and defined in well manner. Thanks for posting.

A sample text widget

Etiam pulvinar consectetur dolor sed malesuada. Ut convallis euismod dolor nec pretium. Nunc ut tristique massa.

Nam sodales mi vitae dolor ullamcorper et vulputate enim accumsan. Morbi orci magna, tincidunt vitae molestie nec, molestie at mi. Nulla nulla lorem, suscipit in posuere in, interdum non magna.