<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>WoogaLog &#187; Software Development</title>
	<atom:link href="http://camz.wordpress.com/category/software-development/feed/" rel="self" type="application/rss+xml" />
	<link>http://camz.wordpress.com</link>
	<description>Various Ramblings &#38; Commentary</description>
	<lastBuildDate>Wed, 18 Jun 2008 18:10:51 +0000</lastBuildDate>
	<generator>http://wordpress.com/</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<cloud domain='camz.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://www.gravatar.com/blavatar/88347e88432a6a88c9ff93ef9b462f6e?s=96&#038;d=http://s.wordpress.com/i/buttonw-com.png</url>
		<title>WoogaLog &#187; Software Development</title>
		<link>http://camz.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://camz.wordpress.com/osd.xml" title="WoogaLog" />
		<item>
		<title>Telling Good from Bad Design</title>
		<link>http://camz.wordpress.com/2008/06/15/telling-good-from-bad-design/</link>
		<comments>http://camz.wordpress.com/2008/06/15/telling-good-from-bad-design/#comments</comments>
		<pubDate>Mon, 16 Jun 2008 02:36:11 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[SQL]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[qnx]]></category>
		<category><![CDATA[software design]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/?p=24</guid>
		<description><![CDATA[As the common wisdom goes, the best way to fix a badly designed system is to design it right in the first place.
I know all too well just how much truth there is in that.  The challenge is in telling when a design is bad in the first place, which is more difficult than [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>As the common wisdom goes, the best way to fix a badly designed system is to design it right in the first place.</p>
<p>I know all too well just how much truth there is in that.  The challenge is in telling when a design is bad in the first place, which is more difficult than it sounds.  There are all sorts of metrics that can be captured during the development process, but none of them will yield any true indication of how good (or bad) the design actually is.  Experience tends to be the only thing that is reliable, and even then it can often be just a &#8220;gut feeling&#8221; that something is wrong, while identifying exactly what that something is remains incredibly difficult.  When a design is mediocre, it&#8217;s even harder, since there is often an equal mix of good and bad, making it all the more difficult to tell one way or another.</p>
<p>I&#8217;ve seen a lot of interesting things, a small handful of amazing things, quite a few questionable things, and even some amazingly horrifying things done in the name of design. I&#8217;ve had to dig deep into systems and their design to either work with them or fix them (usually fix them).  Through that experience I have discovered that the design of a project can be summarized with one of two simple characteristics: one for good, the other for bad&#8230;<br />
<span id="more-24"></span></p>
<p><strong>Good Design</strong><br />
<em>Good designs produce additional benefits, features, and capabilities that the original designers did not conceive or plan, but that exist as a byproduct of the design.  These capabilities can be leveraged with little or no change to the original design, or extend the original design without requiring extensive refactoring or redesign.  New capabilities can be found in exceptional designs with nothing more than a different perspective rather than any change to the code or design.</em></p>
<p><strong>Bad Design</strong><br />
<em>Bad designs produce workarounds, special cases, constant (and often extensive) refactoring of both the code and the design, and are typically accompanied by long, involved explanations [from the designers] on why something can&#8217;t be done.</em></p>
<p>It really is as simple as that.  The hard part is that it takes experience to recognize both of these and there is no magical way to get that experience other than to work on a lot of projects.  That includes working on bad projects too.  They might suck, but if you pay attention to what went wrong, and take the time to figure out why, even a bad project can become a gold mine of knowledge.</p>
<p><strong>A Good Example &#8211; The QNX® RTOS</strong><br />
One system that I have encountered that would fall into the category of good design would be the message-passing inter-process communications (IPC) in the <a href="http://www.qnx.com/">QNX® Realtime Operating System</a>.  I have worked with 3 generations of QNX, and although each generation enhanced the IPC mechanism, the &#8220;guts&#8221; of the QNX IPC remained the same because the fundamental design was good.  They were able to add capabilities to new versions without changing the fundamentals of the original design.</p>
<p>Each and every version of QNX has the IPC mechanism at the core of the OS; for those interested in learning more, check out the System Architecture Guide online.  I will focus on a few examples of how this single design feature of the OS demonstrates the characteristic I have described in this article.</p>
<p>One of the first features that QNX gets &#8220;for free&#8221; as a direct result of building the IPC into the core (yes, it is in the kernel) of the OS is that of modularity.  The actual OS is made up of a group of processes that work together to provide the functionality of a traditional monolithic kernel.  They also extend the IPC mechanism across the network, which has the side-effect of making the network completely transparent.  This is an exceedingly difficult thing to accomplish in a monolithic kernel, and thanks to the excellent design of QNX&#8217;s IPC, it comes &#8220;for free&#8221;.  A network of QNX machines is a loosely-coupled multi-processing computer, something that requires special software and application awareness on other platforms to accomplish.</p>
<p>The IPC mechanism allows the OS itself to be built of modules, and once again we see byproducts of this design choice: multiple filesystem modules, multiple network links, device drivers, etc.  Fault protection becomes a byproduct as well.  Since the IPC mechanism allows the OS to be built from multiple cooperating processes, each process is protected from the others, and the failure of any one process does not have a catastrophic effect on the whole OS.</p>
<p>The simplicity of the IPC implementation also provided synchronization of cooperating processes, something that, again, is often difficult to implement with other designs.  Asynchronous communication was also possible without changing the IPC mechanism, only by changing how an application made use of the IPC features.</p>
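<p>To illustrate why a blocking message exchange gives synchronization as a byproduct, here is a toy sketch in Python.  It is not the QNX API; the <code>Channel</code> class and its <code>send</code> and <code>receive</code> methods are hypothetical names that only mimic the idea of a sender that stays blocked until the receiver replies.</p>

```python
import queue
import threading


class Channel:
    """Toy stand-in for a synchronous message channel (not the QNX API)."""

    def __init__(self):
        self._inbox = queue.Queue()

    def send(self, message):
        """Block the caller until the receiver replies, then return the reply."""
        reply_box = queue.Queue(maxsize=1)
        self._inbox.put((message, reply_box))
        return reply_box.get()

    def receive(self):
        """Block until a message arrives; return it with its reply box."""
        return self._inbox.get()


def server(channel):
    """Reply to each message in upper case; a None message shuts the server down."""
    while True:
        message, reply_box = channel.receive()
        if message is None:
            reply_box.put(None)
            break
        reply_box.put(message.upper())


channel = Channel()
worker = threading.Thread(target=server, args=(channel,))
worker.start()

# The sender cannot continue until the server has handled the message,
# so the two threads are synchronized without any explicit lock.
reply = channel.send("hello")
print(reply)

channel.send(None)
worker.join()
```

<p>In this sketch no mutex or semaphore appears in the calling code; the ordering falls out of the send and reply pairing itself.</p>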
<p>In later versions of QNX, a graphics system was added which QNX called Photon.  Photon also leveraged the IPC mechanism to simplify the implementation of graphics drivers, input devices, and &#8220;clipping&#8221; of windows as they overlap.  Photon itself enjoyed its own &#8220;freebies&#8221; as a direct result of layering on top of the IPC mechanism.  Remote consoles, mirrored consoles, multi-monitor support, foreign platform support (Phindows &#8211; Photon in windows &amp; PhinX &#8211; Photon in X) all became free or trivial with the design.</p>
<p>These represent just a few of the additional features that just &#8220;fall out&#8221; of a single good (in this case excellent) design.</p>
<p><strong>A Bad Example &#8211; Generating SQL in an OO-System</strong><br />
This is an all too common scenario in object-oriented development.  You start out with a hierarchical object design, and then create a corresponding relational database design in which to persist the object data.  The typical approach is to create a data access layer (DAL) which consists of objects that essentially serialize object data into SQL, and deserialize from SQL back into objects.  To keep things simple, especially when the OO-developer is designing the database, it is common to see a &#8220;one table per object&#8221; schema design.</p>
<p>The problems come when you want to retrieve a &#8220;complete&#8221; object which includes all its hierarchical child objects.  The most intuitive solution is to have the DAL object call other DAL objects for the child objects.  This is easy to do, and easy to test, and looks fine on the surface.  Of course, it isn&#8217;t fine, and the first issue to crop up is one of performance.  The performance issue is caused by iterating through an object, generating additional DAL / SQL calls, which in turn do the same for each level until there are no child objects remaining. For an object with lots of child objects, or lots of levels in the object hierarchy, this quickly results in a large number of SQL queries, all for a single object.  The issue is compounded when retrieving data for an array of objects.</p>
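<p>A minimal sketch of how the queries multiply, using Python and an in-memory SQLite database.  The <code>customer</code>, <code>orders</code>, and <code>item</code> tables and the loader functions are hypothetical, chosen only to give a three-level &#8220;one table per object&#8221; hierarchy.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders   (id INTEGER PRIMARY KEY, customer_id INTEGER, placed TEXT);
    CREATE TABLE item     (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
""")
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
for order_id in range(1, 4):          # three orders for the customer
    conn.execute("INSERT INTO orders VALUES (?, 1, '2008-06-15')", (order_id,))
    for n in range(2):                # two items per order
        conn.execute("INSERT INTO item (order_id, sku) VALUES (?, ?)",
                     (order_id, f"SKU-{order_id}-{n}"))

query_count = 0


def run(sql, params=()):
    """Execute one query and count it as one round-trip to the database."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def load_items(order_id):
    rows = run("SELECT id, sku FROM item WHERE order_id = ?", (order_id,))
    return [{"id": item_id, "sku": sku} for item_id, sku in rows]


def load_orders(customer_id):
    rows = run("SELECT id, placed FROM orders WHERE customer_id = ?", (customer_id,))
    # One extra query per order: this is where the iteration creeps in.
    return [{"id": oid, "placed": placed, "items": load_items(oid)}
            for oid, placed in rows]


def load_customer(customer_id):
    (cid, name), = run("SELECT id, name FROM customer WHERE id = ?", (customer_id,))
    return {"id": cid, "name": name, "orders": load_orders(cid)}


customer = load_customer(1)
print(query_count)  # 1 customer query + 1 orders query + 3 item queries = 5
```

<p>Five queries for one small customer does not look alarming, but the count grows with every order and every extra level, and it is multiplied again when loading a list of customers.</p>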
<p>Let&#8217;s take a step back to look at this.  The design isn&#8217;t very good, as it fails to resolve the issue of making an object hierarchy work well with a relational database.  The design is simple, which would normally be good, but the simplicity results in iteration and an increased number of round-trips to the database server.  Neither of which could be considered a beneficial byproduct of the design.</p>
<p>Typically this is regarded as a performance issue rather than a design issue, and a workaround is often introduced to improve the performance.  Typically the DAL will get a &#8220;depth&#8221; parameter added to limit the number of levels we retrieve in the hierarchy, since we also discovered that we don&#8217;t always want the entire hierarchy. Every single DAL object must now be modified to use and update our depth-tracker.  When we are all done, it works, but performance is still poor, because we have only reduced the iteration rather than eliminating it.</p>
<p>This process can continue indefinitely, because workarounds usually address the symptoms of the real problem instead of the problem itself.  They also tend to be very specific and targeted, increasing the likelihood that any given patch may introduce new issues and create new symptoms since the underlying issue has not been addressed.  After a while, the number of patches can make identifying the root-cause more difficult.</p>
<p>One of the better solutions to the SQL issue outlined here is quite simple: stop thinking in OO, and start thinking in relations, which is what the database uses.  Once you do this, it becomes obvious that you can take the most frequent requests, identify how many levels of objects are involved, and write a single SQL query that uses joins to get all the levels of data in one fell swoop.  This is one example of the disconnect between OO and relational methodologies.  What is obvious in one isn&#8217;t necessarily efficient, optimal, or even a good idea in the other, and may even be contrary to the other methodology.</p>
<p>Each step taken in this bad design is relatively simple, and appears to be the most intuitive at the time, making it extremely easy to wind up with a design that has issues. The symptoms often masquerade as a different problem, making it difficult to realize that the flaw is in the design rather than the implementation. This is quite typical in the absence of requirements, which forces an iterative cycle of partial design intermixed with coding in place of upfront design and understanding of how each element of the solution will be implemented <em>and</em> used.</p>
<p>This is perhaps why this particular example (mapping object to relational) comes up so often, and why it is so easy to take such a course without knowing it.</p>
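<p>The effect of the &#8220;depth&#8221; workaround can be sketched as follows.  This is an assumed, simplified model: a single hypothetical <code>node</code> table with a <code>parent_id</code> column stands in for the object hierarchy, and <code>load</code> plays the role of a DAL object.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)")

# Build a small tree: the root has 3 children, each child has 3 children of its own.
conn.execute("INSERT INTO node VALUES (1, NULL, 'root')")
next_id = 2
for c in range(3):
    child_id = next_id
    conn.execute("INSERT INTO node VALUES (?, 1, ?)", (child_id, f"child-{c}"))
    next_id += 1
    for g in range(3):
        conn.execute("INSERT INTO node VALUES (?, ?, ?)",
                     (next_id, child_id, f"grandchild-{c}-{g}"))
        next_id += 1

queries = 0


def load(node_id, depth):
    """Load a node plus up to `depth` levels of children, counting every query."""
    global queries
    queries += 1
    (name,) = conn.execute("SELECT name FROM node WHERE id = ?", (node_id,)).fetchone()
    node = {"id": node_id, "name": name, "children": []}
    if depth == 0:
        return node  # the depth-tracker stops the recursion here
    queries += 1
    rows = conn.execute("SELECT id FROM node WHERE parent_id = ?", (node_id,)).fetchall()
    node["children"] = [load(child_id, depth - 1) for (child_id,) in rows]
    return node


def count_queries(depth):
    """Return how many queries it takes to load the root to the given depth."""
    global queries
    queries = 0
    load(1, depth)
    return queries


print(count_queries(2))  # full hierarchy: 17 queries
print(count_queries(1))  # limited to one level: 5 queries
```

<p>Limiting the depth cuts the number of queries, but the count still grows with the number of children at every level that is loaded, which is why the workaround only reduces the iteration.</p>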
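<p>As a rough sketch of the join-based approach, assuming a hypothetical three-level schema of <code>customer</code>, <code>orders</code>, and <code>item</code> tables in an in-memory SQLite database, one query can return every level and the nesting can be rebuilt in application code.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders   (id INTEGER PRIMARY KEY, customer_id INTEGER, placed TEXT);
    CREATE TABLE item     (id INTEGER PRIMARY KEY, order_id INTEGER, sku TEXT);
""")
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
for order_id in range(1, 4):          # three orders for the customer
    conn.execute("INSERT INTO orders VALUES (?, 1, '2008-06-15')", (order_id,))
    for n in range(2):                # two items per order
        conn.execute("INSERT INTO item (order_id, sku) VALUES (?, ?)",
                     (order_id, f"SKU-{order_id}-{n}"))

# A single round-trip fetches all three levels of the hierarchy.
rows = conn.execute("""
    SELECT c.id, c.name, o.id, o.placed, i.id, i.sku
    FROM customer c
    LEFT JOIN orders o ON o.customer_id = c.id
    LEFT JOIN item   i ON i.order_id = o.id
    WHERE c.id = ?
    ORDER BY o.id, i.id
""", (1,)).fetchall()

# Rebuild the object hierarchy from the flat result set.
customer = None
orders_by_id = {}
for cid, name, oid, placed, iid, sku in rows:
    if customer is None:
        customer = {"id": cid, "name": name, "orders": []}
    if oid is not None and oid not in orders_by_id:
        orders_by_id[oid] = {"id": oid, "placed": placed, "items": []}
        customer["orders"].append(orders_by_id[oid])
    if iid is not None:
        orders_by_id[oid]["items"].append({"id": iid, "sku": sku})

print(len(rows), len(customer["orders"]))
```

<p>The trade-off in this sketch is that parent columns are repeated on every row, but the number of round-trips stays at one no matter how many orders or items exist.</p>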
<p>As you can see, in the bad design example, we got no beneficial byproducts from the design, and instead got a series of issues and problems requiring more and more workarounds to address.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2008/06/15/telling-good-from-bad-design/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
		<item>
		<title>Anything but Cheap</title>
		<link>http://camz.wordpress.com/2008/06/14/when-cheap-isnt/</link>
		<comments>http://camz.wordpress.com/2008/06/14/when-cheap-isnt/#comments</comments>
		<pubDate>Sat, 14 Jun 2008 15:04:36 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Rant]]></category>
		<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/?p=18</guid>
		<description><![CDATA[I often hear statements in the form of &#8220;X is cheap, don&#8217;t worry about it&#8221;.  This usually comes up in design sessions, software or infrastructure, it doesn&#8217;t matter, you can usually find or hear a statement in this form in either type of design session.
The most common versions would be in reference to disk [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I often hear statements in the form of <em>&#8220;X is cheap, don&#8217;t worry about it&#8221;</em>.  This usually comes up in design sessions, software or infrastructure, it doesn&#8217;t matter, you can usually find or hear a statement in this form in either type of design session.</p>
<p>The most common versions would be in reference to disk space, bandwidth, and CPU.</p>
<p>I suppose this type of thinking is a result of the effects of <a href="http://www.intel.com/technology/mooreslaw/">Moore&#8217;s Law</a>.  One of the troubling things about this sort of thinking is that it is frequently wrong and is indicative of someone with a closed mind, an inability to see the big picture, a lack of insight, short-sightedness, or any similar failing.  In all cases, when someone makes an &#8220;X is cheap&#8221; statement it&#8217;s a safe bet that they haven&#8217;t really thought things through.  These statements are very dangerous in a design session, and are of particular concern if the person making them is a team leader or manager.</p>
<p>The most dangerous aspect of these statements is that on the surface they may appear to be accurate and correct.  Unfortunately, unless we are designing a system that people will only look at (i.e. the surface), rather than actually <em>use</em>, they are rarely correct.</p>
<p><span id="more-18"></span>I&#8217;ll make an attempt to examine the most common of these statements to show how and why they are incorrect assumptions.  The analysis will be from a business/enterprise perspective rather than a personal/consumer one.  You might be surprised to see how inter-related these can be!  Let&#8217;s start with:</p>
<p><strong>Disk is cheap, don&#8217;t worry about size</strong></p>
<p>I will have to start by conceding the fact that disk storage has gotten very cheap indeed.  The average consumer can put a <a href="http://en.wikipedia.org/wiki/Terabyte">terabyte</a> of disk into their PC for less than $200.  So how can that NOT be cheap?</p>
<p>We first need to understand what is <em>really</em> being said here.  This usually isn&#8217;t a reference to disks and storage in general, it will be in reference to data files or databases.</p>
<p>So let&#8217;s talk about files&#8230; In the past files weren&#8217;t that big; they were text files or Word documents, and although the Word documents were bigger, the size was still reasonable.  This is no longer true.  Text files are rarely used anymore; the same information is now in a PDF file, a Word document, an image, or a multimedia file.  Those PDF and Word files usually have graphics in them too, and to ensure that they look good when printed, the graphics are stored at full resolution, even if they are only displayed thumbnail-sized in the document itself.  The average document size is now measured in MB instead of KB, and for media files, GB is becoming more and more common.</p>
<p>So, what am I trying to say?  The amount of storage that is available at a reasonable cost has increased by orders of magnitude, but so has the average size of the files that we store.  The net result is that we can store about  the same number of files as we used to before.</p>
<p>Even if we can store more files there is another cost involved. The more files and directories we have on our disks, the more difficult it becomes to find a file when we need it.  If a user can&#8217;t remember where they put a file on a file server and they have to search for it, that potentially ties up a lot of I/O and CPU resources as the server scans all the files.  That results in slower performance for all the other users relying on that server.  That same server could be hosting a database, or a VM, or even just the virtual disk of a VM, and each of those would be affected by this search.   Even if you don&#8217;t search for files very often because you are so well organized&#8230; those files are getting scanned anyway a couple times a day by your anti-virus and anti-spyware software.</p>
<p>Now let&#8217;s look at databases&#8230;  Modern database servers make them easy to use, quick to access, all those good things.  So where is the problem?  Let&#8217;s look at how a real production database gets used.  To start with, you don&#8217;t just put a production database on disk; that&#8217;s too risky, since disks can fail.  You put it on a RAID array instead, using mirroring, or striping, or whatever.  The end result is that we now have two or possibly three (depending on the level of RAID) copies of that database.  We need to back up the database, that&#8217;s another copy&#8230; but in today&#8217;s web-enabled global economy, we can&#8217;t take the database server offline to back it up.  So we have to take a snapshot to another disk so that we can make a backup of it without it changing while we write to tape; that&#8217;s another copy.  Add some developers into the mix: any system that has a database probably has some developers working on new versions of the software, and they need working copies of those databases to use for testing and development.  That&#8217;s another copy (per developer) for each database.  What are we up to now? 5 or 6 copies?  Once again, the way we use that so-called cheap disk space quickly changes the cost per MB when each MB that we use is actually stored multiple times.</p>
<p>Backups become important too.  Bigger databases take longer to back up, which also means they take longer to restore.  That can be crucial when the database for a web-commerce site goes down and we need to restore it.   The companies that chose to believe the &#8220;disk is cheap&#8221; myth will be offline waiting for the restore a lot longer than the company that chose to use their disk space more wisely.</p>
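<p>The arithmetic can be sketched in a few lines of Python.  The function names are made up for illustration, and the raw price is an assumption taken from the consumer figure of roughly $200 per terabyte, treating 1 TB as 1000 GB; enterprise storage will be priced differently.</p>

```python
def stored_copies(raid_copies, backups, snapshots, developer_copies):
    """Total number of times each byte of the database ends up being stored."""
    return raid_copies + backups + snapshots + developer_copies


def effective_cost_per_gb(raw_cost_per_gb, copies):
    """Cost of one GB of database once every stored copy is paid for."""
    return raw_cost_per_gb * copies


# Assumed scenario: mirrored RAID (2 copies), one backup, one snapshot,
# and a single developer working copy.
copies = stored_copies(raid_copies=2, backups=1, snapshots=1, developer_copies=1)

raw = 200 / 1000  # assumed raw price in dollars per GB
print(copies, round(effective_cost_per_gb(raw, copies), 2))
```

<p>With these assumed numbers each GB of database costs about five times the raw disk price, and every additional developer copy pushes the multiplier higher.</p>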
<p>Storage in general is no longer local in a business environment.  Files get shared on file servers, as well as through the use of SAN and NAS devices.  This means that accessing the data on those &#8220;cheap&#8221; disks is now over a network and now bandwidth enters into the equation.  A perfect segue into&#8230;</p>
<p><strong>Bandwidth is cheap, don&#8217;t worry about the size of your data</strong></p>
<p>Once again, we start with trying to understand what is really being said here.  This is typically the speed of the available network rather than the cost of the network, although it can easily refer to both.</p>
<p>The speed and cost of networks has improved dramatically over the years.  They have gone from 300 baud modems to 1200, and on and on up to 14400 and 56Kbps only to be replaced by broadband connections anywhere from 128Kbps to 10Mbps.  Local networks went from proprietary systems to 10Mbps Ethernet, then 100Mbps, and now 1Gbps or wireless networks that started at 11Mbps and are now 54Mbps or 108Mbps.</p>
<p>What was once inconceivable to transmit over a network is now commonplace.  So where is the problem here?</p>
<p>The first misconception comes from a single user perspective of the network.  This is easiest to explain with an example.  A typical office will have a 100Mbps network, providing 100Mbps of network bandwidth to each user.  That sounds pretty good, and since we don&#8217;t use hubs anymore and everyone uses switches, user A doesn&#8217;t get impacted by what user B is doing on his segment of the network.  Or does he?  That depends, although in most cases the answer is yes, they are affected.  Those network switches only isolate traffic between two end points.  So, I might have 100Mbps between me and the switch, but ultimately I don&#8217;t need anything from the switch, I need it from some resource on the network.  That resource might be a file server, or a database, or a website.  There is a significant chance that I&#8217;ll be competing with the other user to connect to the same end point.  So while we both have 100Mbps to the switch, we are sharing the 100Mbps from the switch to the common end point, so we&#8217;ve effectively been reduced to 50% of the &#8220;perceived&#8221; bandwidth.  Chances are that it won&#8217;t be just two users connecting to that common end point, but a lot more.  In a small company, you are probably competing with ALL the other users, so divide that 100Mbps by 10 or 20, bringing us down to 10Mbps or 5Mbps, certainly not the amount of bandwidth that you expected to get.</p>
<p>I will admit that I&#8217;m not being completely fair in my calculation; it&#8217;s far too primitive to be accurate, since everyone would have to be accessing the same server at the same time for those numbers to be correct.  Statistically we will get much better performance; exactly how much better is harder to predict since it depends on what is being accessed.  Of course, the bigger the files we are after, the longer we spend transferring their data, and the more likely we are to be affected.  This is where we can start to see the impact of the &#8220;disk is cheap&#8221; mindset on other (seemingly unrelated) areas of computing.</p>
<p>So, I haven&#8217;t proven anything other than that you won&#8217;t see 100% of your bandwidth.  We&#8217;ve only been talking about a typical office LAN; let&#8217;s examine networked applications.</p>
<p>There are two typical varieties of networked applications.  Those that explicitly use the network by implementing some form of protocol, and those that indirectly use a network through file sharing, web-services, SOA, or database connections.  If a protocol is designed with the &#8220;bandwidth is cheap&#8221; thinking, you&#8217;ll probably find that it sends and receives lots of data.  The networks are indeed fast enough to make this appear to be a non-issue, unless you happen to be the sysadmin or the hosting provider.  More efficient use of bandwidth translates directly into more users, which translates directly into revenue.  Once you max out your bandwidth with users, you might also need to buy/build new servers to handle more customers.</p>
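<p>The division above can be written out as a small Python sketch.  The function name is hypothetical, and the model is deliberately the crude worst case: every user is pulling from the same end point at the same moment and the link is split evenly.</p>

```python
def per_user_mbps(shared_link_mbps, active_users):
    """Worst-case even split of one shared link among simultaneously active users."""
    return shared_link_mbps / active_users


# The 100Mbps link from the switch to the common end point.
for users in (1, 2, 10, 20):
    print(users, per_user_mbps(100, users))
```

<p>Two users get 50Mbps each, ten users get 10Mbps, and twenty users get 5Mbps, which are the figures quoted in the example.</p>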
<p>The more customers you can fit into the same bandwidth, the fewer servers and infrastructure you need, which directly affects the capital investment required (which again, affects Time-To-Revenue, since you have to recoup the infrastructure cost to become profitable).</p>
<p>More and more often you will find servers running in VMs, which wind up aggregating network traffic from all VMs to a single physical network.  Applications running in VM servers that don&#8217;t make efficient use of the network could find the VMs running out of available network bandwidth before they run out of available CPU.  If we have network traffic between two VMs, it never hits the physical network, but it does wind up consuming additional CPU possibly impacting performance of the other VMs.</p>
<p>When we introduce wireless and cellular networks to the mix the impact is more easily observed.  Sure that new 802.11n wireless is fast, but when you have an office full of users streaming video over the same shared wireless bandwidth&#8230;</p>
<p>So if you encounter someone that suggests<em> &#8220;X is cheap, don&#8217;t worry about it&#8221;</em> in your next design session or troubleshooting discussion, you will know that what they are really saying is that they probably don&#8217;t understand the issue at all.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2008/06/14/when-cheap-isnt/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
		<item>
		<title>Objective Relations II: Seductive Identity</title>
		<link>http://camz.wordpress.com/2007/04/15/objective-relations-ii-seductive-identity/</link>
		<comments>http://camz.wordpress.com/2007/04/15/objective-relations-ii-seductive-identity/#comments</comments>
		<pubDate>Mon, 16 Apr 2007 06:45:46 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[software design]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/2007/04/15/objective-relations-ii-seductive-identity/</guid>
		<description><![CDATA[I&#8217;ve been doing a lot of thinking on the topic of whether you can or can&#8217;t mix object and relational models, and what I have discovered has been quite enlightening.  In my previous posting, I determined that the same data hierarchies can be represented in either an object or relational model.  I also determined that you [...]]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I&#8217;ve been doing a lot of thinking on the topic of whether you can or can&#8217;t mix object and relational models, and what I have discovered has been quite enlightening.  In my previous posting, I determined that the same data hierarchies can be represented in either an object <em>or</em> relational model.  I also determined that you can mix the models (why not?), but that the trick is not in the mixing but in the realization that the model you have is a mixed one.  The sad realization was that most people <em>do</em> mix the models and are completely unaware of having done so, and I needed to figure out why (and how) this was happening.</p>
<p>The use of IDs in objects is where the models typically mix, and also where most people fail to realize that a mix has taken place.  It seems that a simple integer id incorporated into an object is a seductively powerful concept.  I won&#8217;t claim to completely understand why this is, but I will share with you my thoughts on why the id has such seductive power that even the die-hard OO zealots fall victim to its allure.<br />
<span id="more-22"></span></p>
<p>For the older generations of developers (to which I belong) that learned to program with the original high-level languages and who have experience with relational databases, the use of an id is quite natural.  We did not typically design our data models using hierarchical data structures; we thought in terms of records, and if we designed any complex data model, we designed it in a relational database.  You quickly learn to use ids to link tables together, and since we started out with more resource-constrained systems, the appeal of using 3NF relational models complements our awareness of the amount of resources our applications consume.  When your code is written to deal with data as arrays of structures, it is much, much easier to write functions (we didn&#8217;t call them methods) to manipulate the data if you can just pass an id for the record around, or perhaps a pointer, rather than moving the entire structure.</p>
<p>The basics of writing programs typically include the use of arrays, and eventually arrays of data structures, which are usually our first forays into more advanced programming tasks involving large amounts of data.  When you use an array, the index of the array becomes a &#8220;natural&#8221; id for a data record.  The same is true of anyone learning an OO language; the basics still involve learning about arrays.  It&#8217;s quite easy to adapt from an array index to an id column in a relational database.  The use of those ids inside a data structure that references another data structure works very well.  The code itself might use a pointer, but you can&#8217;t write a pointer out to disk and then read the same data back again and still have the pointer be valid.  If it&#8217;s an id, you can&#8230; they fit perfectly into the relational model when using a relational database.</p>
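<p>A small Python sketch of the point about array indexes as ids.  The record layout and the <code>manager_of</code> helper are invented for illustration, and JSON text stands in for writing the data out to disk.</p>

```python
import json

# An array of records; the array index serves as the "natural" id of each record.
employees = [
    {"name": "Alice", "manager_id": None},
    {"name": "Bob", "manager_id": 0},
    {"name": "Carol", "manager_id": 0},
]


def manager_of(records, record_id):
    """Follow the id stored in a record to the record it refers to."""
    manager_id = records[record_id]["manager_id"]
    return None if manager_id is None else records[manager_id]


# Persist and reload: the serialized text stands in for a file on disk.
saved = json.dumps(employees)
restored = json.loads(saved)

# The id survives the round-trip and still leads to the right record.
print(manager_of(restored, 1)["name"])

# An in-memory reference does not: the restored record is an equal copy,
# not the same object that the original reference pointed at.
print(manager_of(restored, 1) is manager_of(employees, 1))
```

<p>The id keeps its meaning after the data has been written out and read back, while an in-memory reference only makes sense inside the process that created it.</p>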
<p>Of course even a C programmer can create hierarchical data structures using pointers, linked lists, and such.  In the end though, when we need to persist that data, the ids are convenient.</p>
<p>So what about the object world?</p>
<p>I believe that in a &#8220;pure&#8221; OO model, the ids do NOT belong.  They represent a direct coupling of the persistence layer and the object layer, which is considered a violation of most OO methodologies.  There is a problem with object models though, which is that a complete object hierarchy graph can actually represent a lot of duplicated data.  The exact same reasons why we use 3NF in relational models apply here too: it takes a LOT of effort to ensure that all the copies of a child object get updated in all the in-memory object graphs.  In C you&#8217;d just use pointers to &#8220;solve&#8221; this issue, but a pure object model does not expose such a low-level capability, which is where the use of an id comes into play.  The id effectively becomes an abstract pointer to another object, eliminating the need to include the entire child object, while still allowing an easy way to access the data in the child object.  Only populating the id of a child object becomes a convenient solution for &#8220;lazy loading&#8221; our object graph.  The other issue is that there are very few object databases, so the vast majority of the time you&#8217;ll be using a relational database for the persistence layer.</p>
<p>So the id winds up being used as a compact, convenient <em>alias</em> for a child object, providing us with a ready-made solution to &#8220;lazy loading&#8221; object graphs, and allowing us to pass an id as an argument to a method in place of passing the object itself.  Convenience aside, this violates the pure OO methodology and introduces coupling not only between the object and the persistence layer, but depending on usage may also create tighter coupling between objects and the external methods that manipulate them.</p>
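<p>Here is a rough sketch of the id acting as an alias for a child object.  It is only an illustration under made-up names: Order, Customer, and the load_customer hook are hypothetical, and a plain dictionary stands in for the persistence layer.</p>

```python
class Customer:
    def __init__(self, customer_id, name):
        self.customer_id = customer_id
        self.name = name


class Order:
    """Holds only the id of its customer until the customer is needed."""

    def __init__(self, order_id, customer_id, load_customer):
        self.order_id = order_id
        self.customer_id = customer_id       # relational meta-data leaking in
        self._load_customer = load_customer  # hypothetical persistence hook
        self._customer = None

    @property
    def customer(self):
        # Lazy load: fetch the child object on first access, then reuse it.
        if self._customer is None:
            self._customer = self._load_customer(self.customer_id)
        return self._customer


# Stand-in for the persistence layer: a dict keyed by id, plus a call log.
CUSTOMERS = {7: Customer(7, "Acme Ltd")}
loads = []


def load_customer(customer_id):
    loads.append(customer_id)
    return CUSTOMERS[customer_id]


order = Order(1001, 7, load_customer)
print(len(loads))           # prints 0, nothing loaded yet
print(order.customer.name)  # prints Acme Ltd
print(order.customer.name)  # prints Acme Ltd again, from the cached object
print(len(loads))           # prints 1
```

<p>Notice that the Order class now knows a customer_id exists and knows how to go and fetch by it.  That is exactly the coupling to the persistence layer described above, bought for the price of convenience.</p>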
<p>The conclusion is that the seductive quality of the id is <strong>convenience</strong>, and for most developers convenience trumps following the purer object models.  In fact, this convenience is so seductive that most OO developers don&#8217;t even realize that using an id is mixing object and relational models.</p>
<p>Truth be told, creating a &#8220;pure&#8221; object-oriented system is hard work, really hard.  The relational model fits very closely with how a compiler has to actually implement the OO concepts in our HLLs, which means if you design and use your objects in a more relational way, you&#8217;ll wind up with a program that is likely to perform much better than a pure OO system.  The CPUs that eventually execute our code (which still has to occur even when we use languages that &#8220;execute&#8221; in virtual environments like JVMs and the .Net CLR) perform better with relational models than object ones.</p>
<p>I was once told by another developer that the design they always used and found to work very well was a one-to-one mapping of object to database table.  Truth be told, this is NOT an object model, it&#8217;s a relational one&#8230; if you can perform a 1:1 mapping from your objects to relational database tables, then you have a relational model; if you had an actual object model, it would NOT map 1:1 to a database.  The reason it worked was not that they had implemented an OO system properly, and not simply that they managed to avoid mixing the two models.  They did manage to avoid mixing the models, but they did it by using a pure relational model, and <em>not</em> an object model.  As I mentioned before, the pure OO model is very difficult to do properly, especially when you must provide a persistence layer that has to deal with a relational data model in the form of a database.</p>
<p>This performance aspect is another influence in the seductive power of ids in our objects; their use can help improve the performance of a slow OO system by a significant factor.</p>
<p>So how would you implement something like lazy loading in a purely object model?  I believe the answer is inheritance, one of the core concepts/capabilities of an OO system, and probably the most abused, misused, or neglected part of an OO language.  People either go crazy with so much inheritance that you wind up with unusable code and insanely deep / complex class hierarchies, or they all but ignore it and use inheritance as little more than an overly complex type enum.</p>
<p>I think a proper object data model would have a family of objects, where the &#8220;top&#8221; levels always represent the lazy load state.  This level would <em>never</em> contain complex objects, no arrays of child objects, and only a single level of child object (i.e. the only child objects included should not contain other objects, only simple types).  The various levels of &#8220;heavy&#8221; loading would be accommodated by additional objects that inherit from the top level and then add their own members for any additional levels of data.</p>
<p>Not all OO languages make this easy to do.  To make it easy, you&#8217;d have to have methods that allow either the base (lazy) object or the inherited object (full object) as an argument.  You can do this in C#, but you have to use casts (which is a PITA, and definitely not convenient).  You don&#8217;t get this functionality unless you use an interface and inheritance, which requires a lot more up-front design and planning to get into a design early enough to produce benefits.</p>
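<p>A small sketch of what such a family might look like.  The class names (OrderSummary, FullOrder, OrderLine) and the describe function are made up for illustration, and Python is dynamically typed, so the argument-passing pain described above does not show up in the same form here; that is a property of the language used for the sketch, not a claim about C#.</p>

```python
class OrderSummary:
    """Top level of the family: the lazy-load state, simple types only."""

    def __init__(self, number, customer_name, total):
        self.number = number
        self.customer_name = customer_name
        self.total = total


class OrderLine:
    def __init__(self, product, quantity, price):
        self.product = product
        self.quantity = quantity
        self.price = price


class FullOrder(OrderSummary):
    """Heavier level: inherits the summary and adds the child collection."""

    def __init__(self, number, customer_name, lines):
        total = sum(line.quantity * line.price for line in lines)
        super().__init__(number, customer_name, total)
        self.lines = list(lines)


def describe(order):
    # Accepts any member of the family; uses only the top-level members.
    return "{} for {}: {}".format(order.number, order.customer_name, order.total)


light = OrderSummary("A-1", "Acme Ltd", 25)
heavy = FullOrder("A-2", "Acme Ltd", [OrderLine("bolt", 10, 2), OrderLine("nut", 5, 1)])

print(describe(light))  # prints A-1 for Acme Ltd: 25
print(describe(heavy))  # prints A-2 for Acme Ltd: 25
```

<p>There is no id anywhere in this family.  The lightly loaded object and the fully loaded one are related by inheritance rather than by a borrowed database key.</p>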
<p>Welcome to the dark side of Objective Relations&#8230; the id is your mistress from the relational side.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2007/04/15/objective-relations-ii-seductive-identity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
		<item>
		<title>SQL Server Suckage</title>
		<link>http://camz.wordpress.com/2007/04/14/sql-server-suckage/</link>
		<comments>http://camz.wordpress.com/2007/04/14/sql-server-suckage/#comments</comments>
		<pubDate>Sun, 15 Apr 2007 05:54:18 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[C#]]></category>
		<category><![CDATA[Rant]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/2007/04/14/sql-server-suckage/</guid>
		<description><![CDATA[I&#8217;ve been working on a system written in .Net/C# that uses SQL Server, my database experience prior to SQL Server included Oracle, Ingres, Sybase, PostgreSQL, and some MySQL.  Each had their quirks, but I&#8217;m learning more and more of the SQL Server quirks, and at times seriously having to wonder how Microsoft gets off [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=camz.wordpress.com&blog=431845&post=21&subd=camz&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I&#8217;ve been working on a system written in .Net/C# that uses SQL Server; my database experience prior to SQL Server included Oracle, Ingres, Sybase, PostgreSQL, and some MySQL.  Each had its quirks, but I&#8217;m learning more and more of the SQL Server quirks, and at times seriously having to wonder how Microsoft gets off calling this an enterprise-class product.  I will fully admit that some of my complaints are with how .Net communicates with SQL Server, and these probably are not SQL Server&#8217;s fault.</p>
<p>Here is my short list of gripes (in no particular order):<br />
<span id="more-21"></span></p>
<ol>
<li><b>No Sequences</b><br />
Yes, I know, you can use IDENTITY columns, but they are not the same, and SQL Server places some foolish restrictions on their use.  Instead of being a &#8220;normal&#8221; scalar column, an IDENTITY column is special: you can never update it, and you can insert explicit values into it only after issuing an additional directive (SET IDENTITY_INSERT) for the table.  The IDENTITY implementation is close to a sequence, but Microsoft just didn&#8217;t implement it quite right.  I do wonder why they didn&#8217;t introduce proper sequences with SQL Server 2005; they could have done this quite easily, and a proper sequence could peacefully co-exist with the existing IDENTITY nonsense.
</li>
<li><b>Inconsistent Parsing</b><br />
I was creating a query that needed to run properly against two versions of a table; one version had an additional column to simplify the query.  Using the IF EXISTS statement, I wrote a query similar to the following:<br />
<code>IF EXISTS (<br />
  SELECT NULL<br />
  FROM sysobjects so, syscolumns sc<br />
  WHERE so.id = sc.id<br />
  AND so.name = 'table_name'<br />
  AND so.xtype = 'U'<br />
  AND sc.name = 'column_name'<br />
) SELECT column_name FROM table_name</code><br />
When I ran it against a database that had the column, it worked; when I ran it against one that didn&#8217;t have the column, I got an error complaining that column_name didn&#8217;t exist.  It shouldn&#8217;t have even tried; that is the whole idea behind using the IF EXISTS&#8230;  The following query:<br />
<code>IF EXISTS (<br />
  SELECT NULL<br />
  FROM sysobjects so<br />
  WHERE so.name = 'table_name'<br />
  AND so.xtype = 'U'<br />
) SELECT * FROM table_name</code><br />
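works as expected when run against a database that does not have the table.</p>
<p>So, in one case it parses the statement inside the IF EXISTS and does some checking, in another case it does not.  SQL Server documents this as deferred name resolution: a statement that references a missing table is not resolved until it actually runs, but a missing column on a table that does exist is reported as soon as the batch is compiled.
</p>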
</li>
<li><b>NOCHECK Constraints</b><br />
The ability to temporarily disable constraint checking can be quite convenient, particularly while bulk loading data.  What I don&#8217;t like about this is that this &#8220;WITH NOCHECK&#8221; state can actually stick around indefinitely, and is included in backups of the database.  The constraint appears to be present, but is not actually being enforced.
</li>
<li><b>Escaping Reserved Words &amp; Special Characters</b><br />
This is one of those things that looks reasonably useful until you actually take a couple of moments to think about it.  SQL Server will let you use special characters like spaces and periods, and even reserved words, when defining your schema; all you need to do is enclose them in square brackets.  The following is valid:<br />
<code>CREATE TABLE [Table] (<br />
[hard.to.use.column] varchar(20),<br />
[also hard to use] varchar(10),<br />
[null] varchar(10)<br />
)</code><br />
There are good reasons why we have reserved words, and why spaces and such are not allowed in table or column names.  While it is admirable that Microsoft provided such a convenient way to differentiate the real reserved words from the ones we want to use, it&#8217;s horribly misguided.  No one should do this, <i>ever</i>, it&#8217;s just plain confusing, not to mention non-portable, foolish, and, oh yeah&#8230; stupid.
</li>
<li><b>Inconsistent context from .Net/C#</b><br />
I will admit that I do not know if this is a SQL Server thing, or a .Net/C# thing.  This is related to the debate about using parameters or not (which I won&#8217;t get into), the issue is that .Net/C# does not execute these in the same way.  Here is a code sample to help demonstrate this:<br />
<code>// Assumes: using System; using System.Data.SqlClient;<br />
public void Query1( SqlConnection cnx, int id )<br />
{<br />
   SqlCommand cmd = cnx.CreateCommand();<br />
   cmd.CommandText = String.Format( "SELECT * FROM myTable WHERE id = {0}", id );<br />
   SqlDataReader r = cmd.ExecuteReader();<br />
   if( r.Read() ) { /* read result */ }<br />
   r.Close();<br />
}</p>
<p>public void Query2( SqlConnection cnx, int id )<br />
{<br />
  SqlCommand cmd = cnx.CreateCommand();<br />
  cmd.CommandText = "SELECT * FROM myTable WHERE id = @id";<br />
  cmd.Parameters.AddWithValue( "@id", id );<br />
  SqlDataReader r = cmd.ExecuteReader();<br />
   if( r.Read() ) { /* read result */ }<br />
   r.Close();<br />
}</p>
<p></code><br />
Query1 will be sent to SQL Server as a normal query.  If you were to build another query in the same program it would execute in the same context.  In other words, you can do a transaction by issuing the raw SQL as multiple statements using a combination of cmd.ExecuteReader() and cmd.ExecuteNonQuery().  You could also issue the sql to allow IDENTITY INSERTS, and then loop through some in-memory data and create multiple INSERT statements from that data.</p>
<p>Query2 should be identical, but it is not.  It executes in its own context as a side effect of the fact that a stored procedure (sp_executesql) is used to execute the actual query.  The parameters are defined and their initial values set as arguments to the stored procedure, with the actual parameterized query being passed unaltered from .Net to the stored procedure as one of the arguments.</p>
<p>While it seems a bit excessive to invoke a stored procedure to deal with the parameters, this is not what bothers me.  What bothers me is that a parameter-less query and a parameterized query are not handled in an identical manner by .Net/C#, especially with the nice side effect of the difference in execution context of the two.  Code that produces valid SQL without parameters can break without explanation simply by implementing parameters.</p>
<p>Another issue related to this is that .Net/C# appears to always parse the SQL before deciding how to execute it.  In an attempt to reduce potential errors where you might build a query with undefined parameters, it actually makes it impossible to create a stored procedure or user-defined function from C#.  Since these things use parameters in T-SQL, they conflict with how C# handles parameters; the following code snippet will produce a .Net exception:<br />
<code>cmd.CommandText = "CREATE PROCEDURE foo @id int AS SELECT * FROM myTable WHERE id = @id";<br />
cmd.ExecuteNonQuery();</code><br />
The exception will be that you have not defined @id with a cmd.Parameters.AddWithValue() statement.  Of course, you didn&#8217;t do that because you are not trying to use a parameter; you are issuing DDL to create a stored procedure that will use one.  I have yet to find out how to tell .Net/C# to stop doing this so that this SQL can be executed.  It means you can&#8217;t programmatically create a database schema from within a C# application.
</p>
</li>
<li><b>Column Aliases <i>not</i> Available to ORDER BY</b><br />
This one should be considered a bug: if you define an alias for a column, you should be able to use that alias in the ORDER BY clause.  You can&#8217;t.  This is particularly annoying for a calculated column or one that transforms the data using CASE or by concatenating text results.  You wind up having to duplicate the calculation in the ORDER BY clause.  Silly.
</li>
<li><b>Default Collation Sequence is Case-Insensitive</b><br />
When you install SQL Server (2000, MSDE, 2005, 2005 EE) the default collation sequence is case-insensitive.  I&#8217;d never seen this before when working with other databases.  I only discovered it by accident when looking at a client&#8217;s data: a column with an FK constraint was accepting lower-case values when I knew the referenced table was all upper case.  I was shocked.  This might make some sense for some other alphabets, but it certainly produced some unexpected results for me.</p>
<p>You can actually create each table and column with a different collation sequence.  This might be flexible, but I suspect it breaks several rules of design and should be avoided.
</p>
</li>
</ol>
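<p>For the parsing gripe, the workaround I would reach for is to do the schema check in the client and only then decide which statement to send, so that a statement naming a missing column never reaches the database at all.  The sketch below shows the shape of it in Python.  SQLite stands in for SQL Server purely so the sketch can run anywhere; PRAGMA table_info is SQLite-specific (on SQL Server the check would be a catalog query like the one inside the IF EXISTS above), and the items table and its columns are made up.</p>

```python
import sqlite3


def column_exists(cnx, table, column):
    # PRAGMA table_info is SQLite-specific; each row it returns describes one
    # column, with the column name in position 1. The table name is a fixed
    # literal in this sketch, never user input.
    rows = cnx.execute("PRAGMA table_info({})".format(table)).fetchall()
    return any(row[1] == column for row in rows)


def fetch_labels(cnx):
    # Decide which statement to send before the database ever sees it, so a
    # statement naming a missing column is never compiled.
    if column_exists(cnx, "items", "label"):
        sql = "SELECT label FROM items ORDER BY id"
    else:
        sql = "SELECT name FROM items ORDER BY id"
    return [row[0] for row in cnx.execute(sql)]


# Two versions of the same table: one without and one with the extra column.
old = sqlite3.connect(":memory:")
old.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
old.execute("INSERT INTO items (name) VALUES ('widget')")

new = sqlite3.connect(":memory:")
new.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, label TEXT)")
new.execute("INSERT INTO items (name, label) VALUES ('widget', 'Widget (blue)')")

print(fetch_labels(old))  # prints ['widget']
print(fetch_labels(new))  # prints ['Widget (blue)']
```

<p>It costs an extra round trip, but each statement is only ever compiled against a schema it is valid for.</p>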
<p>So there you go.  Some of the reasons why I dislike SQL Server.  I&#8217;m sure I will think of some more&#8230; I&#8217;ll add some of them to the list as I think of them.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2007/04/14/sql-server-suckage/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
		<item>
		<title>Objective Relations</title>
		<link>http://camz.wordpress.com/2007/04/06/objective-relations/</link>
		<comments>http://camz.wordpress.com/2007/04/06/objective-relations/#comments</comments>
		<pubDate>Sat, 07 Apr 2007 06:43:12 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[software design]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/2007/04/06/objective-relations/</guid>
		<description><![CDATA[I have often heard people say that you can&#8217;t mix object models and relational models.  I have to admit that it took a while for me to fully understand, and now that I do, I&#8217;m certain that most people that said it to me, didn&#8217;t.  That sounds a little harsh and it probably [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=camz.wordpress.com&blog=431845&post=20&subd=camz&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I have often heard people say that you can&#8217;t mix object models and relational models.  I have to admit that it took a while for me to fully understand, and now that I do, I&#8217;m certain that most people that said it to me, didn&#8217;t.  That sounds a little harsh and it probably is, but the people most likely to impart this morsel of wisdom probably have comp-sci degrees and learned how to program in an academic setting.  They probably believe in &#8220;pure&#8221; programming methodologies as well, and talk about patterns, and anti-patterns, and extreme programming.   Unfortunately, most are what I have previously described as <a href="http://camz.wordpress.com/2004/09/03/fragile-developers/">fragile developers</a>.</p>
<p>Anyhow, back to the topic at hand&#8230; mixing object and relational models.</p>
<p>There really isn&#8217;t that much difference between object and relational models.  In fact, if you were to look at an entity-relationship diagram for a relational database and an object model diagram from a distance you probably wouldn&#8217;t be able to tell them apart.  The truth is that I have yet to come across a data model that can&#8217;t be represented by either model.<br />
<span id="more-20"></span></p>
<p>Object models represent hierarchies, basically nested data structures that contain other data structures or collections of data structures, which in turn can be made up of similar hierarchies.  You can represent the exact same hierarchies in a relational database; the main difference is that you need to add meta-data to represent the hierarchical structure, and this meta-data is visible.  It usually takes the form of id columns and foreign key references.  Technically speaking, the object model is no different when you look under the covers.  Those relationships appear as pointers, arrays of pointers, and linked lists.  Of course no OO-language worth using would ever let you see this, but it&#8217;s there and it&#8217;s the meta-data required to make the objects work.  The object purists call this abstraction, and it&#8217;s a &#8220;Good Thing&#8221;, and I agree.</p>
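<p>To make the visible meta-data concrete, here is a small sketch that flattens a nested structure into rows and rebuilds it again.  The order/lines shape and the id and order_id columns are invented for illustration; the only point is that the nesting on one side and the id columns on the other carry the same information.</p>

```python
# A small object hierarchy: nesting carries the structure implicitly.
order = {
    "number": "A-1",
    "lines": [
        {"product": "bolt", "quantity": 10},
        {"product": "nut", "quantity": 5},
    ],
}


def to_rows(obj):
    """Flatten the hierarchy; the structure becomes visible id columns."""
    order_row = {"id": 1, "number": obj["number"]}
    line_rows = [
        {"id": n, "order_id": order_row["id"], "product": line["product"],
         "quantity": line["quantity"]}
        for n, line in enumerate(obj["lines"], start=1)
    ]
    return order_row, line_rows


def from_rows(order_row, line_rows):
    """Rebuild the hierarchy by following the foreign key references."""
    lines = [
        {"product": row["product"], "quantity": row["quantity"]}
        for row in line_rows
        if row["order_id"] == order_row["id"]
    ]
    return {"number": order_row["number"], "lines": lines}


order_row, line_rows = to_rows(order)
print(line_rows[1])
print(from_rows(order_row, line_rows) == order)  # prints True
```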
<p>So where is the disconnect between object models and relational models?  There really isn&#8217;t one.  They can represent the same data.  There are differences though.  The object model is based on usage; the only reason to create an object is to use it, which is why we define objects with classes and those classes include <em>methods</em> and <em>properties</em> for manipulating and accessing the data in the objects.  A relational model doesn&#8217;t try to accomplish the same thing; its only goal is to store the data in an object hierarchy and provide a mechanism for retrieving that data to reconstruct the original object.  Unless the object has no hierarchy, the mechanism of storing or retrieving the data will involve a sequence of steps to accomplish this task.  This doesn&#8217;t make the relational model inferior to the object model, it just makes it different; remember, the two models are not trying to accomplish the same goal.</p>
<p>Unless you are writing an application that never saves or loads any data, you&#8217;ve had to deal with <em>object persistence</em> (OO-speak for writing to disk).  Chances are that for any application of any realistic size, that persistence involved a relational database (and a damn good chance that it&#8217;s a SQL database).  This is where the first disconnect occurs.   There is a mapping process involved, which will produce multiple queries to the database.  The object hierarchy will be traversed and each level will produce at least one database query.  It is easy to confuse this mapping between models with &#8220;mixing&#8221; the models.</p>
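<p>The mapping step can be sketched in a few lines.  This is only an illustration: SQLite is used as a convenient stand-in for whatever SQL database you actually have, and the orders and order_lines tables are made up.  What matters is that loading a two-level hierarchy issues two queries, one per level.</p>

```python
import sqlite3

cnx = sqlite3.connect(":memory:")
cnx.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, number TEXT);
    CREATE TABLE order_lines (
        id INTEGER PRIMARY KEY,
        order_id INTEGER REFERENCES orders(id),
        product TEXT
    );
    INSERT INTO orders (id, number) VALUES (1, 'A-1');
    INSERT INTO order_lines (order_id, product) VALUES (1, 'bolt'), (1, 'nut');
""")

queries = []  # record every statement the mapping issues


def run(sql, params):
    queries.append(sql)
    return cnx.execute(sql, params).fetchall()


def load_order(order_id):
    # Level 1 of the hierarchy: the order itself.
    (number,) = run("SELECT number FROM orders WHERE id = ?", (order_id,))[0]
    # Level 2: the child collection needs its own query.
    lines = run(
        "SELECT product FROM order_lines WHERE order_id = ? ORDER BY id",
        (order_id,),
    )
    return {"number": number, "lines": [product for (product,) in lines]}


print(load_order(1))  # prints {'number': 'A-1', 'lines': ['bolt', 'nut']}
print(len(queries))   # prints 2
```

<p>The object that comes back contains no ids at all.  The ids were used during the mapping and then left behind in the persistence layer, which is mapping between the models and not mixing them.</p>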
<p>I get it now, you can&#8217;t mix them; unfortunately most people <em>DO</em> mix them without realizing that they have done so, which is where the trouble starts.  That trouble usually takes the form of an id column in a database.  You see, there is a problem with the object model, which is that when you follow the &#8220;pure&#8221; rules, you always deal with the FULL hierarchy of objects, and those can get pretty huge.  There is a tendency for object models to allow for much deeper hierarchies than the designer/developer ever intended to use.  A common solution to this is to introduce the concept of a &#8220;lightly loaded&#8221; object, which in other words is a partial object hierarchy.  We don&#8217;t want to load all the data for all the levels, but we don&#8217;t want to take away the ability to go from a lightly loaded object to a fully loaded one (or at least one loaded to a slightly deeper level in the hierarchy).  This means that we need to include some piece of data that can represent the deeper level and provide us with enough information to retrieve that data if and when we decide we need it.  You want something you can use to <em>reference</em> that level, which is something that the relational model in our database is already doing for us, typically in the form of an id column.</p>
<p>It is far too easy to borrow the id from the database and use it in our lightly loaded object.  This is where the trouble begins, since this <em>IS</em> mixing the two models.  We&#8217;ve now taken meta-data from the relational model used to store/persist the data and we put it in our object.  Our object now contains meta-data which the OO-language normally abstracts and keeps safely out of our way.   Too bad we just messed that up, eh?  There is a good reason that OO-languages abstract that meta-data away from your average developer, if they were allowed to see the meta-data they&#8217;d expect to be able to manipulate it, and then most of them would screw it up.</p>
<p>I&#8217;m not saying that you shouldn&#8217;t use that id from the database.  It is <em>really</em> convenient, isn&#8217;t it?  Very tempting.  Beware though: when you do, it must be done with complete knowledge that you just mixed the two models.  You don&#8217;t have a pure object anymore, and all that abstraction that you created to prevent your object from having to know anything about persistence, well&#8230; you just screwed that up.  It&#8217;s too late: your object not only has to have knowledge of the persistence, it&#8217;s exposing meta-data from your persistence layer.  Your object might even have to be &#8220;clever&#8221; to make sure that it manages these ids properly.  It&#8217;s more likely though that you will create other objects that contain these ids without being part of your object hierarchy; you build peer objects that contain this data.  Whups&#8230; now we are starting to represent the relational data in our object model, we&#8217;ve mixed them up even more.  The trouble cascades from this point onwards.</p>
<p>So, you <em>can</em> mix the models; careful examination of most object models will reveal a mix.  This becomes a worst-practice when you let developers work with this model without the knowledge/realization that they are using a mixed model.  They&#8217;ll run into issues (this is a classic example of what <a href="http://www.joelonsoftware.com/">Joel Spolsky</a> refers to as the <a href="http://www.joelonsoftware.com/articles/LeakyAbstractions.html">Law of Leaky Abstractions</a>), and they&#8217;ll probably blame the relational database for some of their issues.  They&#8217;ll be wrong.  The only way to avoid these issues is to have full knowledge of the object model AND the relational model AND to be able to identify and understand where the two are mixed.  Unfortunately it&#8217;s pretty rare to find a developer with this ability.</p>
<p>In the end it turns out that what you really can&#8217;t mix are developers with only partial knowledge of the two models, and that just isn&#8217;t the same as not being able to mix the models.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2007/04/06/objective-relations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
		<item>
		<title>The Art of SQL</title>
		<link>http://camz.wordpress.com/2007/04/06/the-art-of-sql/</link>
		<comments>http://camz.wordpress.com/2007/04/06/the-art-of-sql/#comments</comments>
		<pubDate>Sat, 07 Apr 2007 05:13:27 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[Books]]></category>
		<category><![CDATA[Reviews]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[book]]></category>
		<category><![CDATA[review]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/2007/04/06/the-art-of-sql/</guid>
		<description><![CDATA[ About a month ago, Mark Zaugg handed me a copy of this book, he&#8217;d bought it but had not yet had the time to give it a read.  It was a very interesting read, quite unlike any other book I have ever read on a technical subject.  The author (Stéphane Faroult) has [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=camz.wordpress.com&blog=431845&post=19&subd=camz&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><a href="http://www.chapters.indigo.ca/books/item/books-978059600894/0596008945/The-Art-of-SQL?ref=Search+Books%3a+'art+of+sql'"><img class="alignleft" style="float:left;border:0;" src="http://www.oreilly.com/catalog/covers/0596008945_thumb.gif" alt="The Art of SQL" align="left" /></a> About a month ago, Mark Zaugg handed me a copy of this book, he&#8217;d bought it but had not yet had the time to give it a read.  It was a very interesting read, quite unlike any other book I have ever read on a technical subject.  The author (Stéphane Faroult) has themed the book on the classic <a href="http://www.chapters.indigo.ca/books/item/books-978048642557/0486425576/The-Art-of-War?ref=Books%3a+Search+Top+Sellers">The Art of War</a> by Sun-Tzu, and included many quotes and examples of a historical military nature.  That might seem pretty odd for a technical book, so you&#8217;ll have to trust me when I say that it works.<br />
<span id="more-19"></span></p>
<p>The book does provide some tips and techniques that can be used to tune SQL queries and to solve some common problems in some pretty non-obvious ways.  After you read this tome you will be better at analysing, understanding, and tuning queries, and you will have some new and arcane skills with SQL, but that is not the primary benefit this book delivers.  Mark had told me when he loaned me the book that he had heard it would make you &#8220;turn your head sideways to look at SQL&#8221;.  He won&#8217;t understand just how accurate that statement was until he&#8217;s read the book.</p>
<p>The author shares his knowledge of SQL and relational theory in a way that is incredibly compelling.   You finish each chapter with new insight into how and (more importantly) <em>why</em> different queries perform differently.  The examples that are provided are realistic enough to make sense and it is very easy to recognize how the information, analysis, and techniques would apply to your own projects.  I now understand significantly more about relational theory, SQL, and I have indeed learned how to look at SQL with my head sideways.  The knowledge the book imparts goes beyond this as well, it discusses good design of the tables in a database, and it makes you think about the design of the databases you might already be using.</p>
<p>I will probably pick up a copy of this book for myself, even though I have now read it, it was impressive enough that I think it should have a home on my bookshelf.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2007/04/06/the-art-of-sql/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>

		<media:content url="http://www.oreilly.com/catalog/covers/0596008945_thumb.gif" medium="image">
			<media:title type="html">The Art of SQL</media:title>
		</media:content>
	</item>
		<item>
		<title>NULL-Thing to See Here</title>
		<link>http://camz.wordpress.com/2007/03/31/null-thing-to-see-here/</link>
		<comments>http://camz.wordpress.com/2007/03/31/null-thing-to-see-here/#comments</comments>
		<pubDate>Sun, 01 Apr 2007 01:32:20 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[SQL]]></category>
		<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/2007/03/31/null-thing-to-see-here/</guid>
		<description><![CDATA[I&#8217;ve been writing quite a bit of SQL lately, and although I&#8217;ve been messing around with SQL for many, many years, there are still things to learn.  The most recent lesson involved about 3 hours of staring at a query trying to figure out why it wasn&#8217;t working the way it was supposed to. [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=camz.wordpress.com&blog=431845&post=13&subd=camz&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I&#8217;ve been writing quite a bit of SQL lately, and although I&#8217;ve been messing around with SQL for many, many years, there are still things to learn.  The most recent lesson involved about 3 hours of staring at a query trying to figure out why it wasn&#8217;t working the way it was supposed to.  You quickly learn that in SQL you can&#8217;t test for NULL using = (or !=) and instead you must use <b>IS NULL</b> and <b>IS NOT NULL</b>.  Initially this just seems to be a quirk of SQL, but its true relevance hit home in my latest battle with a query.<br />
<span id="more-13"></span><br />
I&#8217;ve been writing code in C for over 20 years, and I&#8217;ve gotten quite comfortable with pointers, and in particular with null pointers.  In the world of pointers, we assign null for a lot of reasons.  When we need to indicate that the pointer has no value, we use null; we use it in linked lists to indicate that there is no next link (or no previous link for doubly linked lists).  Just about all of these uses get interpreted as meaning &#8220;no value&#8221; or &#8220;nothing&#8221;.  A null string is not the same as an empty string; it is one that you have not created yet.  It&#8217;s nothing; it indicates something that does not exist.</p>
<p>So, it&#8217;s easy to think of NULL in SQL the same way, as nothing.  The problem is that in SQL, NULL is NOT nothing, even though the most common use of NULL in database columns is to indicate the absence of a value.  When SQL was first created, its creators based many of its features on the relational model, which in turn rests on mathematical set theory and logic.  In that model a NULL marks a value that is undefined or unknown; it is used when you don&#8217;t know the value of something, or can&#8217;t (yet) know the value of something.  Set theory does not allow you to not have a value; the closest it comes to this would be an empty set.  As in all mathematics, though, it <i>is</i> possible to not know the value of something while still knowing that that something does indeed exist.  SQL inherits this meaning of NULL.</p>
<p>This is why we have to write <b>IS NULL</b> and <b>IS NOT NULL</b> instead of <b>= NULL</b> and <b>!= NULL</b>.  If it were nothing, then it would make sense to be able to compare nothing to nothing, and it would make sense that nothing compared to nothing would be a match.  Undefined is quite different.  Two unknowns are not equal, or at least we can&#8217;t know if they are equal or not since we don&#8217;t know their value.  SQL reflects this by making the comparison itself evaluate to unknown rather than true or false, and a WHERE or ON clause only keeps rows for which the condition is true.  In fact it is more likely that two unknowns would be different (once we finally discover their values) than it would be for them to be the same.</p>
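<p>A quick way to see this for yourself is the minimal sketch below. It uses Python with the bundled SQLite engine purely as a convenient test bench, which is an assumption on my part; it is not SQL Server, but the comparison rules shown here are the standard ones.</p>

```python
import sqlite3

# In-memory SQLite database, used only as a handy place to evaluate expressions.
db = sqlite3.connect(":memory:")

# Compare NULL with itself three ways; Python sees SQL NULL as None.
row = db.execute("SELECT NULL = NULL, NULL != NULL, NULL IS NULL").fetchone()
equals_result, not_equals_result, is_null_result = row
db.close()

print(equals_result, not_equals_result, is_null_result)
```

<p>Both comparisons come back as NULL (unknown), and only the IS NULL test gives a definite answer.</p>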
<p>Okay, so I&#8217;ve explained why we can&#8217;t compare two NULL values.  This isn&#8217;t new; anyone working in SQL learns this quickly enough, but they rarely understand why it can&#8217;t be done, just what the &#8220;workaround&#8221; is.  Understanding this becomes crucial when you JOIN tables in SQL and the columns that you join on can contain NULLs.  This is where I spent more than 3 hours scratching my head.  I joined two tables, or more accurately, I joined a table with a previous version of itself in order to identify rows that existed in one table and not the other, based on their data rather than their numeric id columns.  I knew the id columns had changed, but that the data had not.  The query didn&#8217;t work; in fact, in some cases it almost looked like the query was producing the opposite of the result it should have.  I was convinced that SQL Server 2000 was buggy and I&#8217;d somehow managed to confuse the SQL parser.</p>
<p>I was wrong.  The tables had to be joined on 12 columns (yeah, I know &#8230; it was ugly), and all but one of those could contain NULLs; in reality, most of the columns did contain NULLs.  The problem is in how a join is specified in SQL.  Here is an example:</p>
<p><code> SELECT foo, bar FROM quux INNER JOIN baz ON quux.code = baz.code</code></p>
<p>The join is specified by identifying columns that reference each other, and the join is then performed by pairing rows from each table that contain equal values in those columns.  Ah! There is the problem: when NULLs are involved, nothing is equal to them, not even another NULL.  So joins on columns that contain NULLs produce results that just plain don&#8217;t seem to work.  It all makes sense when we remember that NULL does not mean nothing, but means unknown/undefined.  The solution is quite simple:</p>
<p><code> SELECT foo, bar FROM quux INNER JOIN baz ON ISNULL(quux.code,0) = ISNULL(baz.code,0)</code></p>
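<p>Unfortunately, this will impact performance quite significantly for large tables.  Assuming there was an index on the code columns of both tables, the index would no longer be usable for the join, since we are now applying the ISNULL() function, and the index was generated on the column itself, not the result of that column being fed to a function.  The result is typically a full table scan of both tables involved.  Note also that the substitute value (0 here) must be one that never occurs as real data in the column, otherwise rows containing NULL will match rows containing a genuine 0.</p>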
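<p>The join behaviour described above can be reproduced with a small sketch. The table and column names are the ones from the queries in this post, but the rows are made up, the engine is SQLite (via Python) rather than SQL Server, and I use the standard COALESCE() with a sentinel of -1 where the post uses ISNULL(); all of that is assumed for illustration.</p>

```python
import sqlite3

def count_matches(on_clause):
    """Join two tiny tables with the given ON clause and return the match count."""
    db = sqlite3.connect(":memory:")
    # Hypothetical sample data: one ordinary row and one NULL row per table.
    db.executescript("""
        CREATE TABLE quux (foo TEXT, code INTEGER);
        CREATE TABLE baz  (bar TEXT, code INTEGER);
        INSERT INTO quux VALUES ('a', 1), ('b', NULL);
        INSERT INTO baz  VALUES ('x', 1), ('y', NULL);
    """)
    sql = "SELECT COUNT(*) FROM quux INNER JOIN baz ON " + on_clause
    (count,) = db.execute(sql).fetchone()
    db.close()
    return count

# Plain equality: the NULL rows never pair up.
plain = count_matches("quux.code = baz.code")
# Sentinel substitution: NULL is mapped to -1 on both sides, so those rows match.
wrapped = count_matches("COALESCE(quux.code, -1) = COALESCE(baz.code, -1)")
print(plain, wrapped)
```

<p>The plain join finds one match and the wrapped join finds two, which is exactly the difference that cost me those three hours.</p>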
<p>The lesson here is that as you design a database (or refactor one) you should put some serious thought into any column that allows the use of NULL.  I&#8217;d even go so far as to recommend that you try to avoid them and think of other ways to represent the same data.  Often the same result can be achieved with a second table that links to a &#8220;base&#8221; table, and only contains data when a matching row in the base table requires it.  When implemented in this manner, the absence of a row is a much better and clearer indication of nothing than the presence of a NULL.  If you do it this way, you won&#8217;t run into weird behavior from JOINs when you least expect it.</p>
<p>&#8230;and that&#8217;s it&#8230; NULL-Thing to see here, carry on&#8230;</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2007/03/31/null-thing-to-see-here/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
		<item>
		<title>Dynamic Loading in PHP</title>
		<link>http://camz.wordpress.com/2005/08/15/dynamic-loading-in-php/</link>
		<comments>http://camz.wordpress.com/2005/08/15/dynamic-loading-in-php/#comments</comments>
		<pubDate>Mon, 15 Aug 2005 21:56:30 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[LAMP]]></category>
		<category><![CDATA[PHP]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/2005/08/15/dynamic-loading-in-php/</guid>
		<description><![CDATA[I recently started on a small project in PHP, and in working on the project I began to explore how to leverage the OO capabilities of PHP. Much to my delight, I discovered that PHP is a far more capable language than it’s normally given credit for. I’m sure that there are just as many [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=camz.wordpress.com&blog=431845&post=14&subd=camz&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>I recently started on a small project in PHP, and in working on the project I began to explore how to leverage the OO capabilities of PHP. Much to my delight, I discovered that PHP is a far more capable language than it’s normally given credit for. I’m sure that there are just as many reasons for that as there are opinions, so I won’t share my opinions (this time). Instead, I would like to share a technique that I have recently developed since I feel that it will not only be interesting to the PHP community, but useful as well.</p>
<p>The technique is a form of dynamic loading, the PHP equivalent of a DLL (more or less). At first, this seems like a rather strange concept for an interpreted language, but it has uses, the most important being that it can reduce code complexity without sacrificing functionality. It is based on the OO capabilities of PHP, and as such is an abstraction layer, which I have blogged about in the past. As with all abstractions, it is always important to understand the internals of the abstraction instead of just blindly using it. With that warning issued, let’s get into the actual technique.</p>
<p>In general terms, I needed a “driver” for my project, so I will use the driver model as the basis for this article. Many years ago I stumbled across a driver technique that was used in gnuplot. It was a clever abstraction of the driver layer, even though it was written in C (few of those fancy OO languages existed, and those that did were not well known and had not gained any real acceptance). This was (and still is) a fine example of how OO-design is a <em>methodology</em> and does not require an OO language to implement.<br />
<span id="more-14"></span></p>
<p><strong>A little history from C</strong><br />
The trick that gnuplot used was to define a structure that contained some data elements and some pointers to functions. If this sounds an awful lot like a modern class definition, you’re right, it’s virtually the same thing. The functions implemented the lowest level of functionality in gnuplot, and there were high-level functions that called these “driver” functions to implement more complex functionality. Here is what some of the code looked like:</p>
<p><code><br />
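typedef struct Driver {<br />
char name[31];            /* driver name, matched against the user's selection */<br />
void (*plot)( int, int ); /* each member below is a pointer to a driver function */<br />
void (*pen)( int );<br />
void (*move)( int, int );<br />
void (*line)( int, int );<br />
} DEVICE_DRIVER;<br />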
</code></p>
<p>Each driver in gnuplot would provide the functions for implementing these basic features, and a global module provided an array of them along with the appropriate initialization. This could also have been implemented using a DLL-type technique, but again, those techniques, let alone DLLs, were not in common use at the time (there I go dating myself). Using this model was quite simple: a global variable existed which was a pointer to the current driver within the array. Here is what a call to one of the functions would look like:</p>
<p><code><br />
#define PEN_DOWN 1<br />
#define PEN_UP   0<br />
<br />
(*driver-&gt;pen)( PEN_UP );<br />
(*driver-&gt;move)( 0, 0 );<br />
(*driver-&gt;pen)( PEN_DOWN );<br />
(*driver-&gt;line)( 0, 100 );<br />
(*driver-&gt;line)( 100, 100 );<br />
(*driver-&gt;line)( 100, 0 );<br />
(*driver-&gt;line)( 0, 0 );<br />
</code></p>
<p>When the user selected the output driver, it was a simple matter to loop through the array looking for a name match and then set the global pointer appropriately.</p>
<p>It was simple and elegant, and most importantly, it provided a layer of abstraction for the higher-level functions so that the device-specific code stayed in one place and there were no #ifdef/#endif conditional sections in the main code.</p>
<p><strong>Time for PHP</strong><br />
I wanted to do the same thing with PHP. The first challenge was that PHP doesn’t have structures; if you want to have a structure you have to use an object, which in turn means defining a class. This was actually a good thing, since classes also support methods, which work perfectly as a replacement for the pointers to functions that were used in C. I started off by defining a base class. For our example, I’ll use a subset of database functions.</p>
<p><code><br />
class DB_DLL {<br />
var $name;<br />
var $db_link;<br />
<br />
// constructor, plus stub methods that each driver overrides<br />
function DB_DLL() { $this-&gt;name = null; }<br />
function init( $db_host, $db_user, $db_pass ) { return null; }<br />
function connect( $db_name ) { return null; }<br />
function query( $db_query ) { return null; }<br />
}<br />
</code></p>
<p>Those of you that didn’t skip over the C part on gnuplot will notice that this looks remarkably similar, almost identical, to the C technique. It should; we are doing the same thing. The only difference lies in how we initialize things.</p>
<p>In C we had to have a global array of structures, and each entry was hard-coded and initialized in order for it to be usable to the application. The other big difference is that in C each driver actually did populate the same data type with its information, so we had an array of the same data structures. Things don’t quite work that way in PHP. In order to provide our own functionality for each “driver”, we must extend the base class, which creates a new data type, a new object. This also creates a new problem, but we will get to that later. Here is one example of the code to extend the base class for MySQL.</p>
<p><code><br />
class DB_MYSQL extends DB_DLL {<br />
<br />
function DB_MYSQL()<br />
{<br />
$this-&gt;name = "MySQL";<br />
}<br />
<br />
function init( $db_host, $db_user, $db_pass )<br />
{<br />
// keep the connection handle so the other methods can use it<br />
$this-&gt;db_link = @mysql_connect( $db_host, $db_user, $db_pass );<br />
}<br />
<br />
function connect( $db_name )<br />
{<br />
return @mysql_select_db( $db_name, $this-&gt;db_link );<br />
}<br />
<br />
function query( $db_query )<br />
{<br />
return @mysql_query( $db_query, $this-&gt;db_link );<br />
}<br />
}<br />
</code></p>
<p><strong>Clever tricks</strong><br />
Okay, so now we have a base class and some extended classes that actually implement our “drivers” for (in this case) various database types. We still have a problem, which is how to instantiate an object of an extended class. They all have new names, so how do we manage this?</p>
<p>First we need to organize things a bit, so I’ll have to backtrack a little to do this. The organization is in terms of how we name files and where we put them. So the first thing we will do is place the definition of our base class into its own file so that we can include it in any PHP module that needs it. For now, let’s call it class.inc, and of course it will need the standard <code>&lt;?php ?&gt;</code> wrapper tags. Next we will put the definition of the extended classes into their own files as well. Although I have only shown a sample of the code for MySQL, let’s say that we’ve created modules for PostgreSQL and Oracle as well. To keep things even tidier, let’s put them all in a subdirectory called drivers. We should now have the following files:</p>
<p><code><br />
class.inc<br />
drivers/mysql.inc<br />
drivers/postgresql.inc<br />
drivers/oracle.inc<br />
</code></p>
<p>Now, back to solving our little problem of instantiation. Turns out this is an easier problem than you might think. PHP has the ability to call a named function, simply by putting the name of the function as a string into a variable and then calling it by adding () after it. So we now go back to each driver file that we created and we add a function outside of the class definition. The one for MySQL looks like this:</p>
<p><code><br />
function new_db_dll_mysql() { return new DB_MYSQL(); }<br />
</code></p>
<p>Okay, looks pretty simple. It is, that’s the beauty of it. We have one more trick though, and this is where things get more interesting. We still need to make this all useful, and to do that we will instantiate each driver and build an array. Since these are all extended from our base class, the array is just an array of the base class, with each element instantiated to the appropriate extended class. We take advantage of calling a named function to do this. Here is what our main code looks like:</p>
<p><code><br />
global $DLL;<br />
require_once( "class.inc" );<br />
<br />
$dll_list = array( "mysql", "postgresql", "oracle" );<br />
foreach( $dll_list as $driver ) {<br />
require_once( "drivers/$driver.inc" );<br />
// build the factory function name, e.g. new_db_dll_mysql, and call it<br />
$init_func = "new_db_dll_$driver";<br />
$DLL[$driver] = $init_func();<br />
}<br />
</code></p>
<p>We created an array to hold the list of drivers we wanted to initialize / instantiate, and then used a simple loop to do all the work. Using any of the functions is pretty simple too. Here is an example:</p>
<p><code><br />
$driver = "mysql";<br />
<br />
$DLL[$driver]-&gt;init( "localhost", "userid", "passwd" );<br />
$DLL[$driver]-&gt;connect( "my_db" );<br />
$result = $DLL[$driver]-&gt;query( "SELECT * FROM blog_entry WHERE id = 5" );<br />
</code></p>
<p>One very important thing to notice about the code that initialized the drivers is that it does not get longer or more complex as we add more drivers. The only thing that changes is the array. This may seem quite trivial, but it is actually where some of the more powerful functionality is hiding. I will also admit that my choice of database functions as the example probably makes the power of this technique less obvious.</p>
<p>To make this more apparent, think about using this technique for something other than a database. Let’s say we have written drivers that create different pages on a web site. We might have one for forums, one for downloads, and yet another for administration functions. Another might be a set of functions for connecting to different instant messenger web-interfaces.</p>
<p>Instead of putting the list of these driver modules into an array, lets put them into a database. We can now take a userid for the logged in user and build a query that will provide us with a list of modules that this particular user has access to. That list can then be used to dynamically initialize/instantiate only those modules that the user has access to.</p>
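<p>To make the idea concrete, here is a rough sketch of the selection step, written in Python rather than PHP to keep it short. Everything in it is hypothetical: the module classes, the FACTORIES registry (standing in for the driver files and their factory functions), and the USER_MODULES mapping (standing in for the result of the database query). It models only the "instantiate what this user may use" step, not the file inclusion.</p>

```python
# Hypothetical stand-ins for driver classes that would live in separate files.
class ForumModule:
    name = "forums"

class DownloadModule:
    name = "downloads"

class AdminModule:
    name = "admin"

# Registry of factory callables keyed by module name (the role of $dll_list
# plus the per-driver factory functions in the PHP version).
FACTORIES = {
    "forums": ForumModule,
    "downloads": DownloadModule,
    "admin": AdminModule,
}

# Stand-in for the result of a "which modules may this user use?" query.
USER_MODULES = {
    "alice": ["forums", "downloads", "admin"],
    "bob": ["forums"],
}

def load_modules_for(user_id):
    """Instantiate only the modules the given user has access to."""
    allowed = USER_MODULES.get(user_id, [])
    return {name: FACTORIES[name]() for name in allowed}

modules = load_modules_for("bob")
print(sorted(modules))
```

<p>The loop never grows as modules are added; only the data that drives it changes, which is the whole point of the technique.</p>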
<p>We’ve just transformed a simple technique for creating drivers into a mechanism for dynamically including selected features into a web-application!</p>
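<p>I should probably make one comment about the technique before anyone else points it out. There is a slight simplification that can be made if you use the return value of include_once() (require_once() behaves the same way here). The trick is that an included file can return a value. One caveat: the _once variants return true rather than the file’s own value if that file has already been included, so this only works the first time each driver file is loaded. So instead of having an extra function at the end of the extended class definition you could just have:</p>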
<p><code><br />
return new DB_MYSQL();<br />
</code></p>
<p>With the initialization loop now looking like this:</p>
<p><code><br />
foreach( $dll_list as $driver ) {<br />
$DLL[$driver] = include_once( "drivers/$driver.inc" );<br />
}<br />
</code></p>
<p>Which reduces the initialization code by a couple more lines.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2005/08/15/dynamic-loading-in-php/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
		<item>
		<title>The Knuth of the Matter</title>
		<link>http://camz.wordpress.com/2005/04/12/the-knuth-of-the-matter/</link>
		<comments>http://camz.wordpress.com/2005/04/12/the-knuth-of-the-matter/#comments</comments>
		<pubDate>Tue, 12 Apr 2005 21:05:06 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[Software Development]]></category>
		<category><![CDATA[abstractions]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/2005/04/12/the-knuth-of-the-matter/</guid>
		<description><![CDATA[“People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise, the programs they write will be pretty weird.”
– Donald Knuth, 2004
Donald Knuth is very wise, a god of computing, so I don’t doubt for a second that he understands that this quote [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=camz.wordpress.com&blog=431845&post=15&subd=camz&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p><strong>“People who are more than casually interested in computers should have at least some idea of what the underlying hardware is like. Otherwise, the programs they write will be pretty weird.”</strong><br />
<em>– Donald Knuth, 2004</em></p>
<p>Donald Knuth is very wise, a god of computing, so I don’t doubt for a second that he understands that this quote applies equally to layers of software abstraction. I love this quote, though; it has some wonderful subtleties that I am sure are completely lost on <a>fragile programmers</a>.</p>
<p>So let’s talk about abstraction. It’s a wonderful thing, but only if there is some comprehension of what is being abstracted. Sure, you can treat the underlying abstraction layer as a black box, but when we are talking about software, that can, more often than not, be a fatal thing to do. Of course, with software abstractions the fatality is rarely instant; instead, an obscure set of circumstances that is difficult or impossible to document must occur. As so many of the current and next generation of high-level languages gain acceptance, there is more and more abstraction and less and less understanding of the underlying layers.<br />
<span id="more-15"></span></p>
<p>One example is in the use of XML for everything. Don’t get me wrong, I like XML; what I am opposed to and appalled by is that so many developers use it for everything. I’ve seen it used for configuration files, for scripts, for calling functions and passing arguments, and for data payloads and IPC. It can do all of those; in fact it can do some very well. So, what’s the problem? Well, there are several. The first is that XML isn’t always the best solution; it might not even be a suitable solution. The second is that all too often a developer chooses XML without even questioning or wondering if it’s the appropriate solution. The availability of XML parsers makes it convenient, and it’s also pretty popular right now.</p>
<p>This is where we come back to some of the subtleties of Knuth’s statement. I know, you are thinking <em>“but what does XML have to do with hardware?”.</em></p>
<p>Ah, see… that’s the subtlety. At some point you have to <span style="text-decoration:underline;">do</span> something with that XML. You might need to parse it, or store it, or transmit it.</p>
<p>To parse it you need CPU power. XML isn’t exactly the easiest thing to parse; it requires a decent amount of processor and memory to get the job done. That places a constraint on just how fast you can do that job, or possibly on how many concurrent jobs you can do without it taking so long as to become unacceptable. The more places you use XML, the more of a concern this becomes.</p>
<p>What about storing it? XML is very verbose; I don’t think anyone will disagree that one thing XML <em>isn’t</em> is small. That means it uses more disk space than other solutions. It also means it takes longer to read it in (not counting the time it takes to parse). You might need a bigger disk, or a faster disk, or both. I suspect that some of you are saying that yes, XML is large, but it usually compresses quite well, so you can store it compressed. A nice easy solution. Well, not exactly. It’s a trade-off: compression isn’t free either, it takes CPU and possibly memory too. It’s convenient though; you probably have an API or method that does it for you. Another example where the convenience of a solution takes the place of actually thinking about its suitability.</p>
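<p>If you want to see the size difference rather than take my word for it, here is a small sketch in Python. The record layout, the 200 made-up records, and the comma-separated format used as the terse comparison are all assumptions chosen for illustration; the exact byte counts will differ for real data.</p>

```python
import xml.etree.ElementTree as ET
import zlib

# Hypothetical sample data: 200 small (id, name, score) records.
records = [(i, "user%d" % i, i * 3) for i in range(200)]

# Serialize the records as XML, one element per field.
root = ET.Element("records")
for rec_id, name, score in records:
    rec = ET.SubElement(root, "record")
    ET.SubElement(rec, "id").text = str(rec_id)
    ET.SubElement(rec, "name").text = name
    ET.SubElement(rec, "score").text = str(score)
xml_bytes = ET.tostring(root, encoding="utf-8")

# The same data as plain comma-separated lines, for comparison.
csv_bytes = "\n".join("%d,%s,%d" % rec for rec in records).encode("utf-8")

# Compressing the XML shrinks it, but costs CPU time on every read and write.
xml_packed = zlib.compress(xml_bytes)

print("xml:", len(xml_bytes), "csv:", len(csv_bytes), "xml compressed:", len(xml_packed))
```

<p>The repeated tags make the XML several times larger than the terse form, and they are also why it compresses so well; you pay for that in CPU instead of disk.</p>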
<p>Transmitting XML has the same issues as storing it, and a couple more too. We enjoy nice, fast, low-latency, inexpensive broadband access to the internet. It’s been years since you even thought of something as <em>primitive</em> as a 56K dial-up connection, let alone a 14.4K one. You have lots of bandwidth to spare, sure that XML payload is big, but it still transmits in a couple seconds, so no problem, right? Wrong. Most of the time when you are transmitting XML it’s a commercial/business application. Bigger messages mean that we max out our bandwidth sooner, which means fewer transactions per second. Want to handle more? Now you need a bigger connection, and those are expensive. Want to support wireless users? They don’t have as much bandwidth, suddenly your solution isn’t working so well.</p>
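<p>The bandwidth argument is just arithmetic, sketched below in Python. Every number in it is hypothetical (the link speed and both message sizes), picked only to show how message size caps the number of transactions per second on a fixed link.</p>

```python
# All figures are hypothetical, chosen only to make the arithmetic concrete.
LINK_BITS_PER_SECOND = 1_500_000   # assumed capacity of the connection
XML_MESSAGE_BYTES = 4096           # assumed size of one XML message
COMPACT_MESSAGE_BYTES = 1024       # assumed size of the same data in a terse format

def max_messages_per_second(message_bytes, link_bits_per_second=LINK_BITS_PER_SECOND):
    """Upper bound on whole messages per second, ignoring protocol overhead."""
    return link_bits_per_second // (message_bytes * 8)

print(max_messages_per_second(XML_MESSAGE_BYTES))      # 45
print(max_messages_per_second(COMPACT_MESSAGE_BYTES))  # 183
```

<p>With these assumed sizes the same link carries roughly four times as many of the smaller messages, before you have bought any extra bandwidth.</p>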
<p>I was going to say something about threads in relation to this topic, but I see that XML has proven to be more than enough.</p>
<p>Knuth has got it right, but I think maybe he left something unsaid. Sometimes it isn’t just the programs they write that can be pretty weird, but also the way those programs behave.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2005/04/12/the-knuth-of-the-matter/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
		<item>
		<title>Fragile Developers</title>
		<link>http://camz.wordpress.com/2004/09/03/fragile-developers/</link>
		<comments>http://camz.wordpress.com/2004/09/03/fragile-developers/#comments</comments>
		<pubDate>Fri, 03 Sep 2004 22:44:54 +0000</pubDate>
		<dc:creator>camz</dc:creator>
				<category><![CDATA[Software Development]]></category>

		<guid isPermaLink="false">http://camz.wordpress.com/2004/09/03/fragile-developers/</guid>
		<description><![CDATA[Most applications are far from simple, those days are gone. Most applications are actually distributed systems with complex interactions between modules and architecture layers. The first hint of this was in client/server applications. These were some of the first distributed applications created and to build them properly and efficiently required different design methodologies. Suddenly the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=camz.wordpress.com&blog=431845&post=16&subd=camz&ref=&feed=1" />]]></description>
			<content:encoded><![CDATA[<div class='snap_preview'><br /><p>Most applications are far from simple; those days are gone. Most applications are actually distributed systems with complex interactions between modules and architecture layers. The first hint of this was in client/server applications. These were some of the first distributed applications created, and building them properly and efficiently required different design methodologies. Suddenly the performance of IPC or the network was a factor in the performance of the overall system.</p>
<p>Some developers thought that client/server was &#8220;hard&#8221; or complicated, or offered any number of other excuses. The reality was that these applications required knowledge of other architecture layers that had been abstracted away. If you understood the layers below your application, it was easy, and you could develop applications that performed well. If you didn&#8217;t understand the layers below, or blindly used the abstractions of them, then you ran into problems.<br />
<span id="more-16"></span></p>
<p>We see the same thing with some of these modern languages and frameworks, J2EE in particular. The problem has compounded itself with far too many layers of abstraction and architecture. Not only is it more difficult to understand all the layers, but there are conceivably so many of them that it may even be impossible to properly understand them all. One thing is certain, though: you are still fundamentally building a distributed system, and that is where the problem lies. Language frameworks like J2EE trick too many developers into thinking they are building one big application, and they fail to think about the distributed nature of the whole thing.</p>
<p>When you are working with client/server applications or developing a system in QNX, it&#8217;s pretty obvious that the system is distributed. The solution consists of discrete modules that are started and stopped from a command line independently of each other. It&#8217;s not so obvious in J2EE, where the framework does it all, including startup and shutdown, and hides it away from the fragile developers.</p>
<p>That is what it comes down to in the end: these modern languages and frameworks produce fragile developers. You&#8217;ve heard the cliche that if you program in C it is easy to &#8220;shoot yourself in the foot&#8221;, and that it&#8217;s darn near impossible to do that in something like Java. It&#8217;s those darn pointers and memory leaks, they all say.</p>
<p>The thing that we forget is that shooting yourself in the foot is rarely fatal. Sure, it hurts like hell, but we learn from it. The things we learn are the details of the underlying architecture layers. If you are in Java, you can&#8217;t shoot yourself in the foot; at the very worst, you can drop something moderately heavy on your foot. The problem is that it&#8217;s not painful enough, so you tend to do it over and over again because you forget the pain all too quickly.</p>
<p>We don&#8217;t need more fragile developers; we need ones that know the value of getting bumped and bruised along the way&#8230; those bumps and bruises ultimately make us stronger. As a result, we make stronger applications.</p>
<p>So go ahead, be a fragile developer, but don&#8217;t expect to be able to build anything other than a fragile application.</p>
</div>]]></content:encoded>
			<wfw:commentRss>http://camz.wordpress.com/2004/09/03/fragile-developers/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c0f9ae6afbe7d95ff1107cdd09781c9c?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">camz</media:title>
		</media:content>
	</item>
	</channel>
</rss>