Posted by: camz | April 6, 2007

Objective Relations


I have often heard people say that you can’t mix object models and relational models. I have to admit that it took a while for me to fully understand, and now that I do, I’m certain that most of the people who said it to me didn’t. That sounds a little harsh and it probably is, but the people most likely to impart this morsel of wisdom probably have comp-sci degrees and learned how to program in an academic setting. They probably believe in “pure” programming methodologies as well, and talk about patterns, and anti-patterns, and extreme programming. Unfortunately, most are what I have previously described as fragile developers.

Anyhow, back to the topic at hand… mixing object and relational models.

There really isn’t that much difference between object and relational models. In fact, if you were to look at an entity-relationship diagram for a relational database and an object model diagram from a distance, you probably wouldn’t be able to tell them apart. The truth is that I have yet to come across a data model that can’t be represented by either model.

Object models represent hierarchies, basically nested data structures that contain other data structures or collections of data structures, which in turn can be made up of similar hierarchies. You can represent the exact same hierarchies in a relational database. The main difference is that you need to add meta-data to represent the hierarchical structure, and this meta-data is visible. It usually takes the form of id columns and foreign key references. Technically speaking, the object model is no different when you look under the covers. There, those relationships appear as pointers, arrays of pointers, and linked lists. Of course no OO-language worth using would ever let you see this, but it’s there, and it’s the meta-data required to make the objects work. The object purists call this abstraction, and it’s a “Good Thing”, and I agree.
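To make that concrete, here is a minimal Python sketch of the same small hierarchy held two ways: as nested objects, and as flat rows where the structure is carried by visible id and foreign key values. The `Order` and `Item` classes and the `to_rows`/`from_rows` helpers are hypothetical names made up for illustration, and the "tables" are just lists of tuples rather than a real database.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str


@dataclass
class Order:
    customer: str
    # The nesting itself is the "hidden" structure: a list of references.
    items: list[Item] = field(default_factory=list)


def to_rows(orders):
    """Flatten nested Order objects into two 'tables' of plain tuples."""
    order_rows, item_rows = [], []
    for order_id, order in enumerate(orders, start=1):
        order_rows.append((order_id, order.customer))
        for item in order.items:
            item_id = len(item_rows) + 1
            # order_id plays the role of a foreign key column here.
            item_rows.append((item_id, order_id, item.name))
    return order_rows, item_rows


def from_rows(order_rows, item_rows):
    """Rebuild the nested objects using only the id / foreign key values."""
    by_id = {order_id: Order(customer) for order_id, customer in order_rows}
    for _item_id, order_id, name in item_rows:
        by_id[order_id].items.append(Item(name))
    return [by_id[order_id] for order_id, _customer in order_rows]


orders = [
    Order("alice", [Item("pen"), Item("ink")]),
    Order("bob", [Item("paper")]),
]
order_rows, item_rows = to_rows(orders)
```

The ids only exist in the row form. In the object form the same information is there, but it lives in the list references that the language manages for you.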

So where is the disconnect between object models and relational models? There really isn’t one. They can represent the same data. There are differences though. The object model is based on usage: the only reason to create an object is to use it, which is why we define objects with classes, and those classes include methods and properties for manipulating and accessing the data in the objects. A relational model doesn’t try to accomplish the same thing; its only goal is to store the data in an object hierarchy and provide a mechanism for retrieving that data to reconstruct the original object. Unless the object has no hierarchy, the mechanism of storing or retrieving the data will involve a sequence of steps to accomplish this task. This doesn’t make the relational model inferior to the object model, it just makes it different. Remember, the two models are not trying to accomplish the same goal.

Unless you are writing an application that never saves or loads any data, you’ve had to deal with object persistence (OO-speak for writing to disk). Chances are that for any application of any realistic size, that persistence involved a relational database (and a damn good chance that it’s a SQL database). This is where the first disconnect occurs. There is a mapping process involved, which will produce multiple queries to the database. The object hierarchy will be traversed and each level will produce at least one database query. It is easy to confuse this mapping between models with “mixing” the models.
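As a rough sketch of that mapping step, the Python below loads a two-level hierarchy from an in-memory SQLite database and counts the queries it issues. The table layout, the `run` helper, and `load_order` are all assumptions made up for this example, not any particular ORM's behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE items (id INTEGER PRIMARY KEY,
                        order_id INTEGER REFERENCES orders(id),
                        name TEXT);
    INSERT INTO orders VALUES (1, 'alice');
    INSERT INTO items VALUES (1, 1, 'pen'), (2, 1, 'ink');
""")

query_count = 0


def run(sql, params=()):
    """Run one query and keep a tally of how many have been issued."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def load_order(order_id):
    """Rebuild one order by walking the hierarchy level by level."""
    # Level 1: the order itself.
    (customer,) = run(
        "SELECT customer FROM orders WHERE id = ?", (order_id,)
    )[0]
    # Level 2: the items that belong to it.
    item_rows = run(
        "SELECT name FROM items WHERE order_id = ? ORDER BY id", (order_id,)
    )
    return {"customer": customer, "items": [name for (name,) in item_rows]}


order = load_order(1)
```

Two levels in the hierarchy means two queries here. A join could fold these into one round trip, but the mapping work of turning rows back into a nested object is still there.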

I get it now: you can’t mix them. Unfortunately, most people DO mix them without realizing that they have done so, which is where the trouble starts. That trouble usually takes the form of an id column in a database. You see, there is a problem with the object model, which is that when you follow the “pure” rules, you always deal with the FULL hierarchy of objects, and those can get pretty huge. There is a tendency for object models to allow for much deeper hierarchies than the designer/developer ever intended to use. A common solution to this is to introduce the concept of a “lightly loaded” object, which in other words is a partial object hierarchy. We don’t want to load all the data for all the levels, but we don’t want to take away the ability to go from a lightly loaded object to a fully loaded one (or at least one loaded to a slightly deeper level in the hierarchy). This means that we need to include some piece of data that can represent the deeper level and provide us with enough information to retrieve that data if and when we decide we need it. You want something you can use to reference that level, which is something that the relational model in our database is already doing for us, typically in the form of an id column.
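A lightly loaded object might look something like the Python sketch below. Everything in it is hypothetical: `LightOrder`, the `fetch_items` callback, and the dictionary standing in for a database table are illustrative names only. The point to notice is the `order_id` field, which is the piece of data that references the deeper level.

```python
class LightOrder:
    """A partial order: top-level fields plus a reference to the deeper level."""

    def __init__(self, order_id, customer, fetch_items):
        self.order_id = order_id  # the id, borrowed from the storage layer
        self.customer = customer
        self._fetch_items = fetch_items
        self._items = None  # deeper level not loaded yet

    @property
    def items(self):
        # Load the deeper level only on first access, then remember it.
        if self._items is None:
            self._items = self._fetch_items(self.order_id)
        return self._items


# Stand-in for an items table, keyed by order id (an assumption for the demo).
FAKE_ITEMS_TABLE = {1: ["pen", "ink"]}
calls = []


def fetch_items(order_id):
    """Pretend database lookup that records each time it is called."""
    calls.append(order_id)
    return list(FAKE_ITEMS_TABLE.get(order_id, []))


order = LightOrder(1, "alice", fetch_items)
```

Nothing is fetched when the object is built. The first read of `order.items` triggers the lookup using the stored id, and later reads reuse the cached result.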

It is far too easy to borrow the id from the database and use it in our lightly loaded object. This is where the trouble begins, since this IS mixing the two models. We’ve now taken meta-data from the relational model used to store/persist the data and we put it in our object. Our object now contains meta-data which the OO-language normally abstracts and keeps safely out of our way. Too bad we just messed that up, eh? There is a good reason that OO-languages abstract that meta-data away from your average developer: if they were allowed to see the meta-data they’d expect to be able to manipulate it, and then most of them would screw it up.

I’m not saying that you shouldn’t use that id from the database. It is really convenient, isn’t it? Very tempting. Beware though: when you do, it must be done with complete knowledge that you just mixed the two models. You don’t have a pure object anymore, and all that abstraction that you created to prevent your object from having to know anything about persistence, well… you just screwed that up. It’s too late: your object not only has to have knowledge of the persistence, it’s exposing meta-data from your persistence layer. Your object might even have to be “clever” to make sure that it manages these ids properly. It’s more likely though that you will create other objects that contain these ids without being part of your object hierarchy; you build peer objects that contain this data. Whoops… now we are starting to represent the relational data in our object model, and we’ve mixed them up even more. The trouble cascades from this point onwards.

So, you can mix the models; careful examination of most object models will reveal a mix. This becomes a worst-practice when you let developers work with this model without the knowledge/realization that they are using a mixed model. They’ll run into issues (this is a classic example of what Joel Spolsky refers to as the Law of Leaky Abstractions), and they’ll probably blame the relational database for some of their issues. They’ll be wrong. The only way to avoid these issues is to have full knowledge of the object model AND the relational model AND to be able to identify and understand where the two are mixed. Unfortunately it’s pretty rare to find a developer with this ability.

In the end it turns out that what you really can’t mix are developers with only partial knowledge of the two models, and that just isn’t the same as not being able to mix the models.
