App Engine & LINQ: Trend toward SQL SELECT?
While trying out Google App Engine, I noticed an interesting duality in the use of the Data Store API in the guestbook sample application. To create a record, they create an object and save it, just like one normally would with an ORM library like Active Record. However, when they retrieve the record, instead of saying something like:
greetings = Greeting.find(:all, :order => "date DESC", :limit => 10)
…they use a GQL query:
greetings = db.GqlQuery("SELECT * FROM Greeting " +
"ORDER BY date DESC LIMIT 10")
Maybe I’m reading too much into it, but since this is in the first tutorial, it seems the person who wrote it might think that using a SELECT statement is a straightforward way of doing getting the information. Of course, a beginning programmer might have an easier time shooting herself in the foot with this method, because she could easily add a string from user input into the sql string, opening the door for SQL injection attacks, but for now, I’ll ignore that. I think that, in many cases, crafting a well-performing WHERE statement is easier in SQL than it is using ORM find methods. I think for joins it’s probably more so.
In typical web apps, however, inserts and updates are almost always simple, and those SQL statements aren’t needed. Perhaps by default, the query language could just allow for select statements, and have the user rely on ORM for the insert statements. That way, the beginning developer might be less likely to screw up the whole database by leaving out a search term in the WHERE clause of an UPDATE statement.
Microsoft’s LINQ, or Language Integrated Query, seems to be geared toward searching. In the examples, I almost always see the FROM statement. LINQ also works for XML, and to me it looks like more convenient syntax than using an XML library. LINQ also doesn’t rely on strings for constructing the queries, but instead integrates the query language into the general-purpose language, so it isn’t susceptible to SQL injection attacks.
One issue I haven’t touched on yet is how the results of a SELECT query are returned. If it’s just a row of dictionaries corresponding to fields specified between SELECT and FROM, some benefits of using ORM are lost. However, with Google’s DataStore API, this appears not to be the case. It looks like they’re getting back a Greeting object. I wonder what they get in the case of a join. Something to research later. The most useful thing I can think of would be an array of dictionaries containing each of the joined objects. For example, if Greeting and User where joined, the first result, results[1], would be a dictionary containing a Greeting and User object (results[1][’Greeting’] and results[1][’User’]). Then, one or both of these could easily be read and/or updated, and all of these operations would go through the code in the objects’ class definitions.
In Google’s DataStore API, we still have the security issue, as SQL strings are being created. Perhaps the server could be set to run with warnings for possible SQL injection attacks turned on by default. It might not be bulletproof, but along with only allowing SELECT statements by default, it could keep the data pretty safe.
Perhaps these new development tools are part of a trend toward using SQL for selects where convenient, but almost never for inserts/updates. If so, I think it might make web programming simpler. They could also be used when ORM methods would be equally convenient, and make it easy for the user to convert from a single-table query to a join.
I would like to see someone who knows more about using databases in web applications than I do write about this topic. In the meantime, I’m going to poke around Google App Engine and learn more about how the DataStore API works.