<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>WoogaLog &#187; SQL</title>
	<atom:link href="http://camz.wordpress.com/category/software-development/sql/feed/" rel="self" type="application/rss+xml" />
	<link>http://camz.wordpress.com</link>
	<description>Various Ramblings &#38; Commentary</description>
	<lastBuildDate>Wed, 18 Jun 2008 18:10:51 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<cloud domain='camz.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/88347e88432a6a88c9ff93ef9b462f6e?s=96&#038;d=http://s.wordpress.com/i/buttonw-com.png</url>
		<title>WoogaLog &#187; SQL</title>
		<link>http://camz.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://camz.wordpress.com/osd.xml" title="WoogaLog" />
		<item>
		<title>Telling Good from Bad Design</title>
		<link>http://camz.wordpress.com/2008/06/15/telling-good-from-bad-design/</link>
		<comments>http://camz.wordpress.com/2008/06/15/telling-good-from-bad-design/#comments</comments>
		<pubDate>Mon, 16 Jun 2008 02:36:11 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[SQL]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[qnx]]></category>
		<category><![CDATA[software design]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/?p=24</guid>
		<description><![CDATA[As the common wisdom goes, the best way to fix a badly designed system is to design it right in the first place.
I know all too well just how much truth there is in that.  The challenge is in telling when a design is bad in the first place, which is more difficult than [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>As the common wisdom goes, the best way to fix a badly designed system is to design it right in the first place.</p>
<p>I know all too well just how much truth there is in that.  The challenge is in telling when a design is bad in the first place, which is more difficult than it sounds.  There are all sorts of metrics that can be captured during the development process, but none of them will yield any true indication of how good (or bad) the design actually is.  Experience tends to be the only thing that is reliable, and even then it can often just be a &#8220;gut feeling&#8221; that something is wrong, while identifying exactly what is wrong remains incredibly difficult.  When a design is mediocre, it&#8217;s even harder, since there is often an equal mix of good and bad, making it all the more difficult to tell one way or another.</p>
<p>I&#8217;ve seen a lot of interesting things, a small handful of amazing things, quite a few questionable things, and even some amazingly horrifying things done in the name of design. I&#8217;ve had to dig deep into systems and their design to either work with them or fix them (usually fix them).  Through that experience I have discovered that the design of a project can be summarized with one of two simple characteristics: one for good, the other for bad&#8230;<br />
<span id="more-24"></span></p>
<p><strong>Good Design</strong><br />
<em>Good designs produce additional benefits, features, and capabilities that the original designers did not conceive or plan, but that exist as a byproduct of the design.  These capabilities can be leveraged with little or no change to the original design, or extend the original design without requiring extensive refactoring or redesign.  New capabilities can be found in exceptional designs with nothing more than a different perspective rather than any change to the code or design.</em></p>
<p><strong>Bad Design</strong><br />
<em>Bad designs produce workarounds, special cases, constant (and often extensive) refactoring of both the code and the design, and are typically accompanied by long, involved explanations [from the designers] on why something can&#8217;t be done.</em></p>
<p>It really is as simple as that.  The hard part is that it takes experience to recognize both of these and there is no magical way to get that experience other than to work on a lot of projects.  That includes working on bad projects too.  They might suck, but if you pay attention to what went wrong, and take the time to figure out why, even a bad project can become a gold mine of knowledge.</p>
<p><strong>A Good Example &#8211; The QNX® RTOS</strong><br />
One system that I have encountered that would fall into the category of good design would be the message-passing inter-process communications (IPC) in the <a href="http://www.qnx.com/">QNX® Realtime Operating System</a>.  I have worked with 3 generations of QNX, and although each generation enhanced the IPC mechanism, the &#8220;guts&#8221; of the QNX IPC remained the same because the fundamental design was good.  They were able to add capabilities to new versions without changing the fundamentals of the original design.</p>
<p>Each and every version of QNX has the IPC mechanism at the core of the OS; for those interested in learning more, check out the System Architecture Guide online.  I will focus on a few examples of how this single design feature of the OS demonstrates the characteristic I have described in this article.</p>
<p>One of the first features that QNX gets &#8220;for free&#8221; as a direct result of building the IPC into the core (yes, it is in the kernel) of the OS is that of modularity.  The actual OS is made up of a group of processes that work together to provide the functionality of a traditional monolithic kernel.  They also extend the IPC mechanism across the network, which has the side-effect of making the network completely transparent.  This is an exceedingly difficult thing to accomplish in a monolithic kernel, and thanks to the excellent design of QNX&#8217;s IPC, it comes &#8220;for free&#8221;.  A network of QNX machines is a loosely-coupled multi-processing computer, something that requires special software and application awareness on other platforms to accomplish.</p>
<p>The IPC mechanism allows the OS itself to be built of modules, and once again we see byproducts of this design choice: multiple filesystem modules, multiple network links, device drivers, etc.  Fault protection becomes a byproduct as well.  Since the OS is built from multiple cooperating processes using IPC, each process is protected from the others, and the failure of any one process does not have a catastrophic effect on the whole OS.</p>
<p>The simplicity of the IPC implementation also provided synchronization of cooperating processes, something that again, is often difficult to implement with other designs.  Asynchronous communication was also possible without changing the IPC mechanism, and only changing how an application made use of the IPC features.</p>
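<p>To make the synchronization point a little more concrete, here is a toy sketch in Python of blocking send/receive/reply message passing between two threads.  It is only an illustration of the idea, not the QNX API: the <code>Channel</code> class, its method names, and the <code>None</code> shutdown message are all made up for this example.</p>
<pre>
```python
import queue
import threading


class Channel:
    """Toy synchronous channel: send() blocks until the receiver replies."""

    def __init__(self):
        self._inbox = queue.Queue()

    def send(self, message):
        # Each message carries its own one-slot reply box.
        reply_box = queue.Queue(maxsize=1)
        self._inbox.put((message, reply_box))
        return reply_box.get()      # the sender waits here for the reply

    def receive(self):
        return self._inbox.get()    # the receiver waits here for a message

    @staticmethod
    def reply(reply_box, value):
        reply_box.put(value)        # unblocks the waiting sender


def server(channel, log):
    """Handle messages one at a time until the shutdown message arrives."""
    while True:
        message, reply_box = channel.receive()
        if message is None:         # hypothetical shutdown message
            Channel.reply(reply_box, None)
            return
        log.append(message)
        Channel.reply(reply_box, message.upper())


channel = Channel()
log = []
worker = threading.Thread(target=server, args=(channel, log))
worker.start()

# Each send() returns only after the server has processed that message,
# so the two threads stay in step without any explicit locks.
replies = [channel.send(word) for word in ("open", "read", "close")]
channel.send(None)
worker.join()
print(replies, log)
```
</pre>
<p>Because the sender cannot continue until the receiver replies, the ordering of the two threads falls out of the messaging itself, which is the kind of byproduct described above.</p>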
<p>In later versions of QNX, a graphics system was added which QNX called Photon.  Photon also leveraged the IPC mechanism to simplify the implementation of graphics drivers, input devices, and &#8220;clipping&#8221; of windows as they overlap.  Photon itself enjoyed its own &#8220;freebies&#8221; as a direct result of layering on top of the IPC mechanism.  Remote consoles, mirrored consoles, multi-monitor support, foreign platform support (Phindows &#8211; Photon in windows &amp; PhinX &#8211; Photon in X) all became free or trivial with the design.</p>
<p>These represent just a few of the additional features that just &#8220;fall out&#8221; of a single good (in this case excellent) design.</p>
<p><strong>A Bad Example &#8211; Generating SQL in an OO-System</strong><br />
This is an all too common scenario in object-oriented development.  You start out with an hierarchical object design,  and then create a corresponding relational database design in which to persist the object data.  The typical approach is to create a data access layer (DAL) which consists of objects that essentially serialize object data into SQL, and deserialize from SQL back into objects.  To keep things simple, especially when the OO-developer is designing the database, it is common to see a &#8220;one table per object&#8221; schema design.</p>
<p>The problems come when you want to retrieve a &#8220;complete&#8221; object which includes all its hierarchical child objects.  The most intuitive solution is to have the DAL object call other DAL objects for the child objects.  This is easy to do, and easy to test, and looks fine on the surface.  Of course, it isn&#8217;t fine, and the first issue to crop up is one of performance.  The performance issue is caused by iterating through an object, generating additional DAL / SQL calls, which in turn do the same for each level until there are no child objects remaining. For an object with lots of child objects, or lots of levels in the object hierarchy, this quickly results in a large number of SQL queries, all for a single object.  The issue is compounded when retrieving data for an array of objects.</p>
<p>Let&#8217;s take a step back to look at this.  The design isn&#8217;t very good, as it fails to resolve the issue of making an object hierarchy work well with a relational database.  The design is simple, which would normally be good, but the simplicity results in iteration and an increased number of round-trips to the database server.  Neither of which could be considered a beneficial byproduct of the design.</p>
<p>Typically this is regarded as a performance issue rather than a design issue, and a workaround is often introduced to improve the performance.  Typically the DAL will get a &#8220;depth&#8221; parameter added to limit the number of levels we retrieve in the hierarchy, since we also discovered that we don&#8217;t always want the entire hierarchy. Every single DAL object must now be modified to use and update our depth-tracker.  When we are all done, it works, but performance is still poor, because we have only reduced the iteration rather than eliminating it.</p>
<p>This process can continue indefinitely, because workarounds usually address the symptoms of the real problem instead of the problem itself.  They also tend to be very specific and targeted, increasing the likelihood that any given patch may introduce new issues and create new symptoms since the underlying issue has not been addressed.  After a while, the number of patches can make identifying the root cause more difficult.</p>
<p>One of the better solutions to the SQL issue outlined here is quite simple: stop thinking in OO, and start thinking in relations, which is what the database uses.  Once you do this, it becomes obvious that you can take the most frequent requests, identify how many levels of objects are involved, and write a single SQL query that uses joins to get all the levels of data in one fell swoop.  This is one example of the disconnect between OO and relational methodologies.  What is obvious in one isn&#8217;t necessarily efficient, optimal, or even a good idea in the other, and may even be contrary to the other methodology.</p>
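<p>Here is a minimal sketch of the difference, using Python and an in-memory SQLite database so that it is self-contained.  The <code>orders</code> / <code>order_lines</code> schema, the sample rows, and both loader functions are hypothetical stand-ins for a real DAL; the only point is to count round-trips for the iterative approach versus a single join.</p>
<pre>
```python
import sqlite3

# Hypothetical two-level hierarchy: each order owns some order lines.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_lines (
        id INTEGER PRIMARY KEY,
        order_id INTEGER REFERENCES orders(id),
        item TEXT
    );
    INSERT INTO orders VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO order_lines VALUES
        (1, 1, 'bolt'), (2, 1, 'nut'), (3, 2, 'washer'), (4, 3, 'gear');
""")


def load_iteratively(conn):
    """DAL-style load: one query for the parents, then one per parent."""
    queries = 1
    result = {}
    parents = conn.execute(
        "SELECT id, customer FROM orders ORDER BY id").fetchall()
    for order_id, customer in parents:
        lines = conn.execute(
            "SELECT item FROM order_lines WHERE order_id = ? ORDER BY id",
            (order_id,)).fetchall()
        queries += 1
        result[customer] = [item for (item,) in lines]
    return result, queries


def load_with_join(conn):
    """Relational load: a single join returns both levels at once."""
    rows = conn.execute("""
        SELECT o.customer, l.item
        FROM orders o
        LEFT JOIN order_lines l ON l.order_id = o.id
        ORDER BY o.id, l.id
    """).fetchall()
    result = {}
    for customer, item in rows:
        items = result.setdefault(customer, [])
        if item is not None:        # parents without children still appear
            items.append(item)
    return result, 1


iterative, iterative_queries = load_iteratively(conn)
joined, joined_queries = load_with_join(conn)
print(iterative == joined, iterative_queries, joined_queries)
```
</pre>
<p>With three parent rows the iterative version issues four queries where the join issues one, and the gap widens with every additional parent row or level in the hierarchy.</p>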
<p>Each step taken in this bad design is relatively simple, and appears to be the most intuitive at the time, making it extremely easy to wind up with a design that has issues. The symptoms often masquerade as a different problem, making it difficult to realize that the flaw is in the design rather than the implementation. This is quite typical in the absence of requirements, which forces an iterative cycle of partial design intermixed with coding in place of upfront design and an understanding of how each element of the solution will be implemented <em>and</em> used.</p>
<p>This is perhaps why this particular example (mapping object to relational) comes up so often, and why it is so easy to take such a course without knowing it.</p>
<p>As you can see, in the bad design example, we got no beneficial byproducts from the design, and instead got a series of issues and problems requiring more and more workarounds to address.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2008/06/15/telling-good-from-bad-design/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
		<item>
		<title>SQL Server Suckage</title>
		<link>http://camz.wordpress.com/2007/04/14/sql-server-suckage/</link>
		<comments>http://camz.wordpress.com/2007/04/14/sql-server-suckage/#comments</comments>
		<pubDate>Sun, 15 Apr 2007 05:54:18 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[C#]]></category>
		<category><![CDATA[Rant]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/2007/04/14/sql-server-suckage/</guid>
		<description><![CDATA[I&#8217;ve been working on a system written in .Net/C# that uses SQL Server; my database experience prior to SQL Server included Oracle, Ingres, Sybase, PostgreSQL, and some MySQL.  Each had their quirks, but I&#8217;m learning more and more of the SQL Server quirks, and at times seriously having to wonder how Microsoft gets off [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I&#8217;ve been working on a system written in .Net/C# that uses SQL Server; my database experience prior to SQL Server included Oracle, Ingres, Sybase, PostgreSQL, and some MySQL.  Each had their quirks, but I&#8217;m learning more and more of the SQL Server quirks, and at times seriously having to wonder how Microsoft gets off calling this an enterprise class product.  I will fully admit that some of my complaints are with how .Net communicates with SQL Server, and these probably are not SQL Server&#8217;s fault.</p>
<p>Here is my short list of gripes (in no particular order):<br />
<span id="more-21"></span></p>
<ol>
<li><b>No Sequences</b><br />
Yes, I know, you can use IDENTITY columns, but they are not the same, and SQL Server places some foolish restrictions on their use.  Instead of an IDENTITY column being a &#8220;normal&#8221; scalar column it is special and you can never update the column, although you can insert, but only after additional SQL directives.  The IDENTITY implementation is close to a sequence, but Microsoft just didn&#8217;t implement it quite right.  I do wonder why they didn&#8217;t introduce proper sequences with SQL Server 2005, they could have done this quite easily, a proper sequence could peacefully co-exist with the existing IDENTITY nonsense.
</li>
<li><b>Inconsistent Parsing</b><br />
I was creating a query that needed to run properly against two versions of a table; one version had an additional column to simplify the query.  Using the IF EXISTS statement, I wrote a query similar to the following:<br />
<code>IF EXISTS (<br />
  SELECT NULL<br />
  FROM sysobjects so, syscolumns sc<br />
  WHERE so.id = sc.id<br />
  AND so.name = 'table_name'<br />
  AND so.xtype = 'U'<br />
  AND sc.name = 'column_name'<br />
) SELECT column_name FROM table_name</code><br />
When I ran it against a database that had the column it worked; when I ran it against one that didn&#8217;t have the column, I got an error complaining that column_name didn&#8217;t exist.  It shouldn&#8217;t have even tried; that is the whole idea behind using the IF EXISTS&#8230;  The following query:<br />
<code>IF EXISTS (<br />
  SELECT NULL<br />
  FROM sysobjects<br />
  WHERE name = 'table_name'<br />
  AND xtype = 'U'<br />
) SELECT * FROM table_name</code><br />
works as expected when run against a database that does not have the table.</p>
<p>So, in one case it parses the statement inside the IF EXISTS and does some checking, in another case it does not.
</p>
</li>
<li><b>NOCHECK Constraints</b><br />
The ability to temporarily disable constraint checking can be quite convenient, particularly while bulk loading data.  What I don&#8217;t like about this is that this &#8220;WITH NOCHECK&#8221; state can actually stick around indefinitely, and is included in backups of the database.  The constraint appears to be present, but is not actually being enforced.
</li>
<li><b>Escaping Reserved Words &amp; Special Characters</b><br />
This is one of those things that looks reasonably useful until you actually take a couple moments to think about it.  SQL Server will let you use special characters like spaces, periods, and even reserved words when defining your schema; all you need to do is enclose them in square brackets.  The following is valid:<br />
<code>CREATE TABLE [Table] (<br />
[hard.to.use.column] varchar(20),<br />
[also hard to use] varchar(10),<br />
[null] varchar(10)<br />
)</code><br />
There are good reasons why we have reserved words, and why spaces and such are not allowed in table or column names.  While it is admirable that Microsoft provided such a convenient way to differentiate the real reserved words from the ones we want to use, it&#8217;s horribly misguided.  No one should do this, <i>ever</i>, it&#8217;s just plain confusing, not to mention non-portable, foolish, and, oh yeah&#8230; stupid.
</li>
<li><b>Inconsistent context from .Net/C#</b><br />
I will admit that I do not know if this is a SQL Server thing, or a .Net/C# thing.  This is related to the debate about using parameters or not (which I won&#8217;t get into), the issue is that .Net/C# does not execute these in the same way.  Here is a code sample to help demonstrate this:<br />
<code>public void Query1( SqlConnection cnx, int id )<br />
{<br />
   // SQL text built by string formatting<br />
   SqlCommand cmd = cnx.CreateCommand();<br />
   cmd.CommandText = String.Format( "SELECT * FROM myTable WHERE id = {0}", id );<br />
   SqlDataReader r = cmd.ExecuteReader();<br />
   if( r.Read() ) { /* read result */ }<br />
   r.Close();<br />
}</p>
<p>public void Query2( SqlConnection cnx, int id )<br />
{<br />
   // same query, with the value passed as a parameter<br />
   SqlCommand cmd = cnx.CreateCommand();<br />
   cmd.CommandText = "SELECT * FROM myTable WHERE id = @id";<br />
   cmd.Parameters.AddWithValue( "@id", id );<br />
   SqlDataReader r = cmd.ExecuteReader();<br />
   if( r.Read() ) { /* read result */ }<br />
   r.Close();<br />
}</p>
<p></code><br />
Query1 will be sent to SQL Server as a normal query.  If you were to build another query in the same program it would execute in the same context.  In other words, you can do a transaction by issuing the raw SQL as multiple statements using a combination of cmd.ExecuteReader() and cmd.ExecuteNonQuery().  You could also issue the sql to allow IDENTITY INSERTS, and then loop through some in-memory data and create multiple INSERT statements from that data.</p>
<p>Query2 should be identical, but it is not.  It executes in its own context as a side-effect of the fact that a stored procedure (sp_executesql) is actually used to execute the query.  The parameters are defined and their initial values set as arguments to the stored procedure, with the actual parameterized query being passed un-altered from .Net to the stored procedure as one of the arguments.</p>
<p>While it seems a bit excessive to invoke a stored procedure to deal with the parameters, this is not what bothers me.  What bothers me is that a parameter-less query and a parameterized query are not handled in an identical manner by .Net/C#, especially with the nice side effect of the difference in execution context of the two.  Code that produces valid SQL without parameters can break without explanation simply by implementing parameters.</p>
<p>Another issue related to this is that .Net/C# appears to always parse the sql before deciding how to execute it.  In an attempt to reduce potential errors where you might build a query with undefined parameters, it actually makes it impossible to create a stored procedure or user defined function from C#.  Since these things will use parameters in T-SQL, they conflict with how C# handles parameters, the following code snippet will produce a .Net exception:<br />
<code>cmd.CommandText = "CREATE PROCEDURE foo @id int AS SELECT * FROM myTable WHERE id = @id";<br />
cmd.ExecuteNonQuery();</code><br />
The exception will be that you have not defined @id with a cmd.Parameters.AddWithValue() statement.  Of course, you didn&#8217;t do that because you are not trying to use a parameter, you are issuing DDL SQL to create a stored procedure that will use one.  I have yet to find out how to tell .Net/C# to stop doing this so that this SQL can be executed.  It means you can&#8217;t programmatically create a database schema from within a C# application.
</p>
</li>
<li><b>Column Aliases <i>not</i> Available to ORDER BY</b><br />
This one should be a bug, the fact that if you define an alias for a column you should be able to use that alias in the ORDER BY clause.  You can&#8217;t.  This is particularly annoying for a calculated column or one that transforms the data using CASE or by concatenating text results.  You wind up having to duplicate the calculation in the ORDER BY clause.  Silly.
</li>
<li><b>Default Collation Sequence is Case-Insensitive</b><br />
When you install SQL Server (2000, MSDE, 2005, 2005 EE) the default collation sequence is case-insensitive.  I&#8217;d never seen this before when working with other databases.  I only discovered it by accident when looking at a client&#8217;s data and discovered that data being inserted into a column with an FK constraint was allowing lower-case when I knew the referenced table was all upper case.  I was shocked.  This might make some sense for some other alphabets, but it certainly produced some unexpected results for me.</p>
<p>You can actually create each table and column with a different collation sequence.  This might be flexible, but I suspect it breaks several rules of design and should be avoided.
</p>
</li>
</ol>
<p>So there you go.  Some of the reasons why I dislike SQL Server.  I&#8217;m sure I will think of some more&#8230; I&#8217;ll add some of them to the list as I think of them.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2007/04/14/sql-server-suckage/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
		<item>
		<title>The Art of SQL</title>
		<link>http://camz.wordpress.com/2007/04/06/the-art-of-sql/</link>
		<comments>http://camz.wordpress.com/2007/04/06/the-art-of-sql/#comments</comments>
		<pubDate>Sat, 07 Apr 2007 05:13:27 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[Books]]></category>
		<category><![CDATA[Reviews]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[book]]></category>
		<category><![CDATA[review]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/2007/04/06/the-art-of-sql/</guid>
		<description><![CDATA[ About a month ago, Mark Zaugg handed me a copy of this book, he&#8217;d bought it but had not yet had the time to give it a read.  It was a very interesting read, quite unlike any other book I have ever read on a technical subject.  The author (Stéphane Faroult) has [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><a href="http://www.chapters.indigo.ca/books/item/books-978059600894/0596008945/The-Art-of-SQL?ref=Search+Books%3a+'art+of+sql'"><img class="alignleft" style="float:left;border:0;" src="http://www.oreilly.com/catalog/covers/0596008945_thumb.gif" alt="The Art of SQL" align="left" /></a> About a month ago, Mark Zaugg handed me a copy of this book, he&#8217;d bought it but had not yet had the time to give it a read.  It was a very interesting read, quite unlike any other book I have ever read on a technical subject.  The author (Stéphane Faroult) has themed the book on the classic <a href="http://www.chapters.indigo.ca/books/item/books-978048642557/0486425576/The-Art-of-War?ref=Books%3a+Search+Top+Sellers">The Art of War</a> by Sun-Tzu, and included many quotes and examples of a historical military nature.  That might seem pretty odd for a technical book, so you&#8217;ll have to trust me when I say that it works.<br />
<span id="more-19"></span></p>
<p>The book does provide some tips and techniques that can be used to tune SQL queries and to solve some common problems in some pretty non-obvious ways.  After you read this tome you will be better at analysing, understanding, and tuning queries, and you will have some new and arcane skills with SQL, but that is not the primary benefit this book delivers.  Mark had told me when he loaned me the book that he had heard it would make you &#8220;turn your head sideways to look at SQL&#8221;.  He won&#8217;t understand just how accurate that statement was until he&#8217;s read the book.</p>
<p>The author shares his knowledge of SQL and relational theory in a way that is incredibly compelling.   You finish each chapter with new insight into how and (more importantly) <em>why</em> different queries perform differently.  The examples that are provided are realistic enough to make sense and it is very easy to recognize how the information, analysis, and techniques would apply to your own projects.  I now understand significantly more about relational theory, SQL, and I have indeed learned how to look at SQL with my head sideways.  The knowledge the book imparts goes beyond this as well, it discusses good design of the tables in a database, and it makes you think about the design of the databases you might already be using.</p>
<p>I will probably pick up a copy of this book for myself, even though I have now read it, it was impressive enough that I think it should have a home on my bookshelf.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2007/04/06/the-art-of-sql/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>

		<media:content url="http://www.oreilly.com/catalog/covers/0596008945_thumb.gif" medium="image">
			<media:title type="html">The Art of SQL</media:title>
		</media:content>
	</item>
		<item>
		<title>NULL-Thing to See Here</title>
		<link>http://camz.wordpress.com/2007/03/31/null-thing-to-see-here/</link>
		<comments>http://camz.wordpress.com/2007/03/31/null-thing-to-see-here/#comments</comments>
		<pubDate>Sun, 01 Apr 2007 01:32:20 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[SQL]]></category>
		<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/2007/03/31/null-thing-to-see-here/</guid>
		<description><![CDATA[I&#8217;ve been writing quite a bit of SQL lately, and although I&#8217;ve been messing around with SQL for many, many years, there are still things to learn.  The most recent lesson involved about 3 hours of staring at a query trying to figure out why it wasn&#8217;t working the way it was supposed to. [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I&#8217;ve been writing quite a bit of SQL lately, and although I&#8217;ve been messing around with SQL for many, many years, there are still things to learn.  The most recent lesson involved about 3 hours of staring at a query trying to figure out why it wasn&#8217;t working the way it was supposed to.  You quickly learn that in SQL you can&#8217;t test for NULL using = (or !=) and instead you must use <b>IS NULL</b> and <b>IS NOT NULL</b>.  Initially this just seems to be a quirk of SQL, but the true relevance of this hit home in my latest battle with a query.<br />
<span id="more-13"></span><br />
I&#8217;ve been writing code in C for over 20 years, and I&#8217;ve gotten quite familiar and comfortable with pointers, and with null pointers in particular.  In the world of pointers, we assign null for a lot of reasons.  When we need to indicate that a pointer has no value, we use null; in linked lists we use it to indicate that there is no next link (or no previous link, for doubly linked lists).  Just about all of these uses get interpreted as meaning &#8220;no value&#8221; or &#8220;nothing&#8221;.  A null string is not the same as an empty string; it is one that you have not created yet.  It&#8217;s nothing: it indicates something that does not exist.</p>
<p>So, it&#8217;s easy to think of NULL in SQL the same way, as nothing.  The problem is that in SQL, NULL is NOT nothing, even though the most common use of NULL in database columns is to indicate the absence of a value.  When SQL was first created, its creators based many of its features on the relational model, which is rooted in mathematical set theory and logic.  In that model a NULL is undefined, or unknown: it is used when you don&#8217;t know the value of something, or can&#8217;t (yet) know the value of something.  Set theory does not allow you to not have a value; the closest it comes to this would be an empty set.  As in all mathematics, though, it <i>is</i> possible to not know the value of something while still knowing that the something does indeed exist.  SQL inherits this meaning of NULL.</p>
<p>This is why we have to write <b>IS NULL</b> and <b>IS NOT NULL</b> instead of <b>= NULL</b> and <b>!= NULL</b>.  If NULL were nothing, then it would make sense to be able to compare nothing to nothing, and it would make sense that nothing compared to nothing would be a match.  Undefined is quite different.  Two unknowns are not equal, or at least we can&#8217;t know whether they are equal, since we don&#8217;t know their values.  In fact, it is more likely that two unknowns will turn out to be different (once we finally discover their values) than that they will be the same.  SQL reflects this: a comparison involving NULL evaluates to unknown rather than true or false, and a WHERE or ON clause only keeps rows for which the condition is true.</p>
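<p>Here is a minimal sketch of that difference.  The table <code>null_demo</code> and its data are made up for illustration, and the comments assume standard ANSI NULL behaviour (on SQL Server that means <b>SET ANSI_NULLS ON</b>, the usual setting for client connections):</p>
<pre><code>-- Hypothetical table, used only for this illustration.
CREATE TABLE null_demo (id INT NOT NULL, code INT NULL);
INSERT INTO null_demo (id, code) VALUES (1, 10);
INSERT INTO null_demo (id, code) VALUES (2, NULL);

-- Returns no rows: "code = NULL" is unknown for every row, never true.
SELECT id FROM null_demo WHERE code = NULL;

-- Returns the row with id 2.
SELECT id FROM null_demo WHERE code IS NULL;

-- Returns only the row with id 1, even though the condition looks like
-- it should cover every possible value: for the NULL row both halves
-- of the OR are unknown, so the row is filtered out.
SELECT id FROM null_demo WHERE code = 10 OR NOT (code = 10);
</code></pre>
<p>The last query is the one that tends to surprise people coming from C: a test and its negation together still don&#8217;t catch the NULL row.</p>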
<p>Okay, so I&#8217;ve explained why we can&#8217;t compare two NULL values.  This isn&#8217;t new; anyone working in SQL learns this quickly enough, but they rarely understand why it can&#8217;t be done, just what the &#8220;workaround&#8221; is.  Understanding this becomes crucial when you JOIN tables in SQL and the columns that you join on can contain NULLs.  This is where I spent more than 3 hours scratching my head in wonder.  I joined two tables, or more accurately, I joined a table with a previous version of itself in order to identify rows that existed in one table and not the other, based on their data rather than their numeric id columns.  I knew the id columns had changed, but that the data had not.  The query didn&#8217;t work; in fact, in some cases it almost looked like the query was producing the opposite of the result it should have.  I was convinced that SQL Server 2000 was buggy and that I&#8217;d somehow managed to confuse the SQL parser.</p>
<p>I was wrong.  The tables had to be joined on 12 columns (yeah, I know &#8230; it was ugly), and all but one of those could contain NULLs; in reality, most of the columns did contain NULLs.  The problem is in how a join is specified in SQL.  Here is an example:</p>
<p><code> SELECT foo, bar FROM quux INNER JOIN baz ON quux.code = baz.code</code></p>
<p>The join is specified by identifying columns that reference each other, and the join then pairs up rows from each table whose values in those columns are equal.  Ah!  There is the problem: when NULLs are involved, nothing is equal to them, not even another NULL.  So joins on columns that contain NULLs silently drop those rows and produce results that just plain don&#8217;t seem to work.  It all makes sense when we remember that NULL does not mean nothing, but means unknown/undefined.  The solution is quite simple:</p>
<p><code> SELECT foo, bar FROM quux INNER JOIN baz ON ISNULL(quux.code,0) = ISNULL(baz.code,0)</code></p>
<p>Unfortunately, this can impact performance quite significantly for large tables.  Assuming there was an index on the code columns of both tables, the index would generally not be usable for the join, since we are now applying the ISNULL() function, and the index was generated on the column itself, not on the result of that column being fed to a function.  The result is typically a full scan of both tables involved.</p>
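<p>There is one more catch with the <code>ISNULL(code,0)</code> form: it treats a NULL and a genuine 0 as the same value, so it is only safe when 0 can never appear as a real code.  A sketch of an alternative that spells out the intended rule instead of relying on a stand-in value is shown here.  The first query reuses the tables from the example above; <code>quux_old</code> and <code>quux_new</code> in the second query are hypothetical names for the two versions of a table:</p>
<pre><code>-- Match rows when the codes are equal, or when both are NULL.
SELECT foo, bar
FROM quux
INNER JOIN baz
    ON quux.code = baz.code
    OR (quux.code IS NULL AND baz.code IS NULL);

-- Rows in the new version with no matching row in the old version,
-- using the same NULL-aware comparison (hypothetical table names).
SELECT n.*
FROM quux_new n
WHERE NOT EXISTS (
    SELECT 1
    FROM quux_old o
    WHERE n.code = o.code
       OR (n.code IS NULL AND o.code IS NULL)
);
</code></pre>
<p>This fixes the correctness problem, not necessarily the performance one.  Whether the optimizer can still use an index with the OR condition depends on the database engine and version, so check the query plan before relying on it.</p>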
<p>The lesson here is that as you design a database (or refactor one) you should put some serious thought into any column that allows the use of NULL.  I&#8217;d even go so far as to recommend that you try to avoid them and think of other ways to represent the same data.  Often the same result can be achieved with a second table that links to a &#8220;base&#8221; table, and only contains data when a matching row in the base table requires it.  When implemented in this manner, the absence of a row is a much better and clearer indication of nothing than the presence of a NULL.  If you do it this way, you won&#8217;t run into weird behavior from JOINs when you least expect it.</p>
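<p>As a rough sketch of that design, assume a hypothetical <code>person</code> table where only some people have a fax number.  Instead of a nullable <code>fax_number</code> column on <code>person</code>, the optional value gets its own table (all names here are invented for illustration):</p>
<pre><code>-- Base table: every column is NOT NULL.
CREATE TABLE person (
    person_id INT NOT NULL PRIMARY KEY,
    name      VARCHAR(100) NOT NULL
);

-- Optional data: a row exists only when the person has a fax number.
CREATE TABLE person_fax (
    person_id  INT NOT NULL PRIMARY KEY REFERENCES person (person_id),
    fax_number VARCHAR(20) NOT NULL
);

-- People who have a fax number: a plain equality join, no NULLs involved.
SELECT p.name, f.fax_number
FROM person p
INNER JOIN person_fax f ON f.person_id = p.person_id;

-- People who have none: the absence of a row says "nothing".
SELECT p.name
FROM person p
WHERE NOT EXISTS (
    SELECT 1 FROM person_fax f WHERE f.person_id = p.person_id
);
</code></pre>
<p>The trade-off is an extra table and an extra join whenever you want the optional value, in exchange for join columns that can never be NULL.</p>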
<p>&#8230;and that&#8217;s it&#8230; NULL-Thing to see here, carry on&#8230;</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2007/03/31/null-thing-to-see-here/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
	</channel>
</rss>