Here is my first post, and I have been spending the last week trying to come up
with something sufficiently profound. Since I have not been able to do that, I
have decided to post something simple instead. Future weeks will expand on this.
The first newsgroups I read on a regular basis were comp.databases.oracle.server
and comp.databases.theory. In the latter, I noticed some comments regarding
nulls seemed to occur on a regular basis, and I wondered what was behind them. A
little over a year ago, I decided to research this and find out what the nature
of this controversy was. I now think that I understand where the "No Nulls"
group is coming from.
So, I am going to start with a (very) brief summary of the opposing arguments.
Subsequent posts will elaborate on them (if some other Oracle related notion
doesn't distract me).
To begin with, I need some exposition of SQL nulls. Rather than attempt to
improve upon what many have done before, I will just cite what I regard as the
best article on the subject:
http://www.oracle.com/technology/oramag/oracle/05-jul/o45sql.html
In the late 1960's, Codd and others developed their relational algebra, which
became the basis for the theory behind modern relational data modeling. Whenever
you see someone speak of tuples rather than tables, relations rather than
statements, or types rather than domains, it is in the context of this algebra
that they are speaking. At the time, there was some hope that should relational
databases become practical, this algebra would be their language. Instead, SQL
became the standard interface. And SQL, while close to relational algebra,
differs in a number of essential aspects.
One of these aspects is the existence of nulls, more precisely, how they are
implemented. Nulls turn the logic of SQL into a three valued logic. Relational
algebra is based on the traditional two valued logic. And so nulls become one of
the things wrong with SQL.
But there is a bigger problem. Nulls, as defined by SQL, result in some
unexpected and surprising traps. For example, a column being nullable (whether or
not it actually contains any nulls) will prevent the optimizer from using an
Anti-Join to optimize a query. And the existence of nulls can even result in a
logically wrong result set.
But on the other hand, it was none other than Codd himself who introduced nulls
into the relational database realm. His 3rd feature of relational databases was
a "systematic way of handling missing information." And while he proposed was
not exactly what SQL developed, he did remain a proponent of nulls all his life.
Well, this has gotten me started. Tomorrow, I should have more focus.
Sunday, January 14, 2007
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment