Object-Relational Mapping(Part-II)
In this part we will try to identify the differences these two models present that brings into fore the impedance-mismatch. I have created a data model and an object model and to keep the things simple for now, all of the objects map one to one with the data entities (tables). In addition, I will be using terms Table, Row, Column etc. rather than relation, tuple or attribute as these are easy to understand. Looking carefully at the two diagrams I am sure you can identify some of the most basic differences in the approaches the two methodologies take while defining entities.

Fig-1:Data Model

Fig-2:Object Model
Shadow Information - Identity Problem
The first difference you might have noticed is that while there is a identity column for each entity in the data model, we don't have an identity property in any of the objects. This is not considered to be a good practice to have key value in an object since each object can be distinctly identified by its location in memory. But in case of data stored in a table it becomes imperative to identify each row uniquely by using a primary key. This extra piece of information which is required to persist an object to a database is called shadow information.
Association
Another major difference is how the associative relationship is modeled in object and relational world. In the above diagrams, you can see that there is a many-to-many relationship between Users and Roles. Object model supports this kind of a relationship natively whereas there is no direct support for it in Relational model. That is why we have to introduce an associative table UserRole (Figure 1) to implement many-to-many relationship. Another subtle difference is in the way you access the child object. In the relational model, each row in child table holds information about its parent which means that parent does not have any idea about its children. For example, in Figure 1, LineItem table contains a reference to Order in the form of OrderID. Contrary to it, object model models it differently in the form of aggregation and composition. In Figure 2, you can see that Order object conatins a collection of LineItem objects.
Partial Data Load
While retrieving data from a database, it is a common practice to retrieve only those column values that are necessary for a business process in order to optimize network bandwidth. No matter how fast the network is, it is still costly enough to bring down the application performance if not used Judiciously. That means we need to optimize our SQL query and retrieve only that amount of data which is relevant to the current process. That is where problem arises. Once we retrieve data and populate our objects, most of the attributes of that object will remain empty as data corresponding to those attributes has not been retrieved. That is OK until we try to persist that object back to the database since there is a great probability that some of the empty attributes correspond to the columns that are declared NOT NULL. Secondly even if the columns are nullable, existing values will be overwritten by the empty values.
Apart from these major problems there are other issues as well like mapping inheritance, computed fields, foreign keys referencing only a single table etc. We will take a look into how to solve some of these problems in the next article of the series which hopefully I will try to publish soon. So keep watching this space for more..

Comments
Thanks Farrukh for explaning this with the figure. It is very helpful. One question though. When updating from the object back to the table, you have mentioned about overwritten values to empty value. I could always do a selective update, right? I can only update the specific columns whose values i retrieved!
Posted by: Atul Gupta | April 17, 2008 01:07 PM