<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:clearspace="http://www.jivesoftware.com/xmlns/clearspace/rss" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Gather 'round the Grill</title>
    <link>http://www.enzeecommunity.com/blogs/grill</link>
    <description />
    <pubDate>Tue, 13 Oct 2009 14:49:59 GMT</pubDate>
    <generator>Clearspace 2.5.3 (http://jivesoftware.com/products/clearspace/)</generator>
    <dc:date>2009-10-13T14:49:59Z</dc:date>
    <item>
      <title>We Interrupt this Broadcast...</title>
      <link>http://www.enzeecommunity.com/blogs/grill/2009/10/13/we-interrupt-this-broadcast</link>
      <description>&lt;!-- [DocumentBodyStart:0be00052-130d-4ed6-90f1-b8d1f75f572b] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;Famous words, or some such like, uttered by Orson Welles as he launched into a scary parody of alien terror on national radio. Really scary for some. And proffered on the eve of Halloween in 1938, so dare I say, 'tis the season (almost).&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Ahh, not to fear, this purports to be a painless foray. But I do have a story to tell.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Several projects ago (I always start this way, so you won't think I'm talking about you!) - I worked with some really sharp data engineers on boiling out a solution for retail operational reporting. The data arrived every five minutes or more, or less, and sometimes in parallel loads, with 24x7 regularity. More and more Netezza implementations are going this way, and you, too, should look into processing data at the speed of thought. In any case, the reporting users wanted to plumb the depths of this data store, to the tune of eighty billion records and growing. 
(Okay, small I know (for some of you) but humor me).&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Well and good, except rather late in the game, the reporting users spontaneously expressed a desire to review the detail through a metadata-based "lens", that is, set up some drilling levels and other metadata-based entry points, such that the entire operational model would be seen through this reporting "lens" and it would provide all the context for the consumers.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Now, such a model as described, would require such enormous power from a standard SMP/RDBMS-styled system, that we might well cause structural damage on the raised floor for sheer physical weight of said system. That is, if we really expected a report to return within a day or two of the request. Ahem! as I facetiously clear my literary throat.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;But the worst-case for any given query for the above was around 8 minutes, and over 99 percent of the thousands of queries submitted returned in less than 30 seconds. Oh, yeah, it was smokin' hot. In most queries using zone maps and the like, we saw returns in a mere handful of seconds. Pshaw! Says the tick-tock-man, chocolate and vanilla, don't waste my time.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;However (and there's always a catch) many of the larger reports were actually conglomerations of these smaller queries, and their aggregate time would occasionally exceed ten minutes. And even though this was a far cry from the "days away" we would expect from an SMP/RDBMS system, it was still 'too slow' for the users. 
Now, this is true adrenalin-junkie stuff, sort of like the old Far Side cartoon of a young man standing with a fork in front of a waffle iron, captioned "Wendell Zurkowitz, slave to the waffle light". I recall how one man noted that many years ago we would wait hour(s) for a traditional oven to finish cooking, and now get impatient when the microwave instructions are greater than five minutes.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Perspective.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;And rather than punt to the users and say, "Hey guys, this is just unrealistic" and degenerate into "expectation management" - the challenge was to actually achieve faster turnaround times on the reports. And here, I'm talking about getting these ten-minute reports into the 30-second zone. Would we have to embrace some extreme engineering for this feat? Methinks not - but the form of the process to get there was quite instructive.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Now recall I noted that the above model had operational tables, which were to be the detailed source, and a retail reporting hierarchy that was largely metadata-based. This reporting hierarchy had some significant size as well, perhaps a fourth the size of the eighty-billion-record fact table it had to link into. Yet the two were distributed on different keys. Querying one meant broadcasting the other.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;And now, for broadcasting.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Whenever two tables are distributed on different keys, a join between them cannot be initially co-located. To support the co-location, Netezza will broadcast the salient information from one table's context to the other. 
This means the physical data has to move from its home SPU, out onto the inter-SPU network fabric, and find its way to the target SPU where it will be further examined. Broadcasting for small tables is inconsequential and barely a blink on the radar. For larger tables it can have strange effects. For example, we saw one query return consistently in ten seconds. Yet when running side-by-side with itself (multiple users) it could take several times longer.&lt;/p&gt;&lt;p&gt;&lt;br/&gt;The reason is that both queries were competing for bandwidth on the inter-SPU fabric, among other things. The simplest solution, of course, is to get our metadata table distributed on the same key as the operational tables. The problem was simply in the complexity of this metadata table and how it mapped to the core information. "Blowing it out" into a materialized form of information would require significant planning and design, because a misstep could easily make the reports turn out wrong, and this was unthinkable. In all this, the maintainability had to be considered, because if our initial complexity is too high, the maintainability is in jeopardy - &lt;em&gt;by design&lt;/em&gt;.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Of course, we would spend most of our time in testing this scenario. Coding and implementation in most BI shops is a nit compared to the testing we have to execute to validate the outcome. Netezza is no different, except we can close the testing loop sooner if we have more power. And of course, for something of this magnitude, to test the change from minutes to seconds, we would need a powerful machine to measure the difference. Whenever we ran the new solution on a smaller machine, the difference couldn't even be measured. 
No, the power of the machine makes the testable difference visible and measurable.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;As I noted, the form of this exercise was the most instructive part. Rather than form a means to align these two tables for co-located joins, the first effort was in attempting to tune the queries. You know, "query engineering", which is the mainstay of performance engineering on an SMP/RDBMS platform, and old habits are hard to break. The data engineers were somehow in denial that they would receive extraordinary power from configuring &lt;strong&gt;&lt;em&gt;the data&lt;/em&gt;&lt;/strong&gt;. Rather they trusted their instincts and chose to attack &lt;strong&gt;&lt;em&gt;the queries&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Now, in any platform, regardless of shape, size or vendor, power is always and forever the domain of hardware. Software cannot manufacture more CPUs or network speed. If the physical plant is not ready, the software can only use what it has at its disposal. The software itself is largely a cost center, because it can only drain the machine's energy through inefficiency. In an SMP/RDBMS machine, the only option we have is to engineer the queries, because the physical plant is configured to be general purpose.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;In a purpose-built machine, however, the query is simply a controlling mechanism to Netezza's resources. The host will chop it apart into snippets and dispatch these to the component that they will serve. Extreme query engineering on the other hand, assumes that jockeying around with the query can actually affect our fate. (contrast; a poorly written query is different from directly engineering a well-written query). 
And besides, do we really want to spend our time carefully engineering the query to the point of functional brittleness? In an SMP/RDBMS machine we will see queries that extend for tens of pages in a very daunting complexity. Maintaining these is a full-time job for our consultants. They swarm on the machine, and carefully tune their handiwork to avoid breakage.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Yet, we purchased a Netezza machine to get away from this complexity. To reduce, clarify and simplify our administration and consumption of the data. So as I watched these engineers bat themselves against the problem, no differently than a fly batting against a window, I watched them pull out their hair in generous tufts when little they did offered the significant gains they expected. This outcome was entirely counter-intuitive to their training. They were accustomed to using and tuning software to make things work faster.&lt;/p&gt;&lt;p&gt;&lt;br/&gt;Sweeping the hair from the floor one evening, I mentioned (for the x-teenth time) that the broadcast effect was killing them. Once our engineers grasped the broadcasting problem, I thought we would make headway, but things actually got worse. They started trying to control the broadcast &lt;strong&gt;&lt;em&gt;as&lt;/em&gt;&lt;/strong&gt; the root cause rather than the symptom. In one test, I saw one of the largest tables leap into a broadcast and we just killed the query outright (it would probably still be running, even today). The engineers lamented: How do we make sure the larger table doesn't broadcast? How do we control the broadcasting to our benefit? 
Answers exist to all of these, but it's like talking to a drug addict, one who is addicted to the drug of SMP/RDBMS and claims he can 'quit anytime'.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;And then the truth came out, "David, if we can make this 10100 machine process data like a 10400 machine, we'll look like heroes!" To which I ask "How?" to which the response is: "We can save them all that money they would have spent on the hardware..." Well, not really. You've just chosen something else to spend the money on, namely performance engineering, the cost of time-to-market, the cost of a marginal implementation and the cost of human labor (the most expensive asset you have, by the way). But since the only way to get a 10100 to perform like a 10400 is to &lt;em&gt;actually be&lt;/em&gt; a 10400, well, you see the futility. 432 SPUs versus 108 SPUs? And they really, truly thought they could - I mean - &lt;em&gt;seriously&lt;/em&gt;. Let's keep in mind that the opposite is true. If we &lt;em&gt;can't&lt;/em&gt; make the 10100 process data like a 10400, perhaps our approach is flawed? Heroes or goats. Take your pick. In my estimation, there's only one hero in the room. The big black box.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;So the broadcast is the &lt;em&gt;symptom&lt;/em&gt;, not the root cause. How about, we quit broadcasting, cold turkey? Take the data model through a detox program and the engineers through a series of deprogramming seminars to - well - it's not that bad. Typically the average engineer only has to see it operate in an adverse manner to become a believer. 
But a believer they must be, or they will not take action to correct the problem, &lt;em&gt;correctly&lt;/em&gt;.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;So one of them finally decided to produce a map table, one that would map the metadata into the operational tables such that all core joins would become co-located, with a common distribution. And lo, the first test of this blew their minds. Even the complex reports were now coming back in single-digit times, and the reports that had been running ten minutes or longer were now under a minute, even with multiple users. In fact, they saw the performance and scalability practically handed to them - simply because they &lt;strong&gt;&lt;em&gt;configured the data correctly&lt;/em&gt;&lt;/strong&gt;. It had little to do with query engineering.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Now one may ask the obvious question, and please do so now: &lt;strong&gt;Why don't you just build out some user-facing tables and forget leveraging the operational tables?&lt;/strong&gt; After all, we don't build our non-Netezza reporting systems on top of operational data, do we? We build out dimensional models and other handy structures to positively affect the user experience and simplify the flow (and the maintenance). This functional decoupling is a mainstay of reporting environments. (Okay, the next entry will focus on this). But in this case, suffice it to say that the owner of the machine had laid down a hard mandate on disk utilization. At no time could we foray into replicated detail, or even summary of detail without a plan to access the operational detail on a drill-down and the like. Interestingly, the required reporting tables would have only cost mere fractions of the cost (on disk) of the time/labor and effort put into making the operational tables viable. 
This is why it deserves its own treatment in a separate rant - er - essay. Stay tuned, and don't touch that radio dial.&lt;/p&gt;&lt;p&gt;&lt;br/&gt;Back to the drama - A telltale symptom that we're doing something wrong, is when we start down the engineering path. It's an &lt;strong&gt;&lt;em&gt;appliance&lt;/em&gt;&lt;/strong&gt;. We don't engineer toasters, blenders or laundry machines. But the difference here seems to be subtle. It's not. In this case, the culprit was the broadcast, something to be eliminated rather than managed. And no amount of creative query hoop-jumping would overcome this. Get the joins onto the SPUs. It seems obvious to those who have been around the machine for a bit. But for those who have not, the learning curve is upon them. Be patient with them for as long as it takes to get it right. Once we have a believer, we'll never have the conversation again. As long as we stay in a theoretical zone, however, expect them to stay in the spin cycle. This is like many things scientific. Seeing is believing.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Whenever I (and others like me) observe a ritual of performance engineering, each participant holding out the hope that "just one thing" will offer a stratospheric boost so they can all wipe their foreheads and go home - this is the surest sign of one of two things: Either the data is poorly configured and is causing the queries to be inefficient, or the data is properly configured and the machine does not have enough physics to achieve the goal. If the focus is on query engineering, they are wasting time. If the focus is on data engineering, at some point it will reach a "diminishing return". Either the machine has the power or it doesn't. 
Time to switch to Netezza, or if using Netezza, time to add some physics (a frame or two) to make it happen.&lt;/p&gt;&lt;p&gt;&lt;br/&gt;Moral of the story: Performance is found in the physics, not the carefully engineered queries. If we find ourselves "engineering" our queries for performance reasons - we should take a step back, take a deep breath - click our heels together and say softly: "There's no power like SPU power. There's no power like SPU power." Repeat as necessary.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;And pay no attention to the man behind the curtain. I'll bet he and Orson Welles never even met.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:0be00052-130d-4ed6-90f1-b8d1f75f572b] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">engineering</category>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">performance</category>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">broadcast</category>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">table</category>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">fact</category>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">reporting</category>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">physics</category>
      <pubDate>Tue, 13 Oct 2009 15:00:39 GMT</pubDate>
      <author>dbirmingham</author>
      <guid>http://www.enzeecommunity.com/blogs/grill/2009/10/13/we-interrupt-this-broadcast</guid>
      <dc:date>2009-10-13T15:00:39Z</dc:date>
      <clearspace:dateToText>10 months, 3 weeks ago</clearspace:dateToText>
      <wfw:comment>http://www.enzeecommunity.com/blogs/grill/comment/we-interrupt-this-broadcast</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/grill/feeds/comments?blogPost=1113</wfw:commentRss>
    </item>
    <item>
      <title>Honor the Host</title>
      <link>http://www.enzeecommunity.com/blogs/grill/2009/05/26/honor-the-host</link>
      <description>&lt;!-- [DocumentBodyStart:2c2bf909-f562-4355-8a2f-f9128a9716be] --&gt;&lt;div class='jive-rendered-content'&gt;&lt;p&gt;Some enterprises will stand up a Netezza machine and point all their data processing towards it. They wouldn't think of actually installing anything on the Netezza machine (such as database clients or other client software) and of course, are strongly advised against it by the vendor. Why is this? The Netezza host has a lot of work to do in keeping those spinning SPUs happy and busy. Adding other duties can detract from this critical mission, and we don't want that.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;But we can also abuse the host in subtle ways. A case in point follows - you may have other tales to tell.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;We always have a need to pull in a wide variety of files. In this particular case, dozens of intake tables in their various staging locations. In many installations, the intake table definitions are few, discrete and stable. But in just as many, the staging tables will mirror the upstream sources, with one table for each upstream interface. In our case, we were handling source-to-target with no ETL in between. We extract directly from the source into an intake table definition that mimics source column names, but the data types are all varchar to facilitate "dirty" intake. The objective is to get the data into the machine.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Then we convert this intake table to its final form, the internal Netezza table that is identical to the source table in column name and type. This conversion is a simple table copy, mechanically speaking, but we have to do some light ELT to make it happen. For example, we need to guard against nulls, empty strings, bogus numeric values and the like. 
In our case, numerics could be dozens of characters in width because the upstream definition happened to be a view with no defined precision. A typical intake SQL could look like:&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;select&lt;/p&gt;&lt;p&gt;case when column is null then value else column end,&lt;br/&gt;case when translate(column,'-+.0123456789','') = '' then column else null end,&lt;/p&gt;&lt;p&gt;etc&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Such that each column is wrapped with this kind of logic (call it "Intake ELT"). Now, we don't &lt;em&gt;manually&lt;/em&gt; wrap these column defs, we do it &lt;em&gt;dynamically&lt;/em&gt; from the Netezza catalog definition. (And for efficiency, we cache it for later reuse, but that's another story).&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Now we have an intake-ELT that looks thus:&lt;/p&gt;&lt;p&gt;&lt;br/&gt;External Database Table -&amp;gt; network -&amp;gt; intake table -&amp;gt;  Intake ELT -&amp;gt; Staging Table&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Note for clarity - the External Database Table and Staging Table are "book ends" to this operation, and have the same column names, data types and column order. We don't absolutely require common column ordering, but it's handy for troubleshooting.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Note also that this works just as well for flat file intake as database intake. 
Better, in fact, because we can more easily load multiple files at once than multiple tables at once (the database might not like multiple extracts).&lt;/p&gt;&lt;p&gt;&lt;br/&gt;All of this worked swimmingly until we encountered a slightly different kind of data feed, one that had to be extracted from an archival source into flat files. Rather than present the flat file as normal (on the network) the admins decided to use the available on-board Netezza storage pad (5 TB of space). Keep in mind that we were not allowed to execute anything directly on the machine, so we had to set up External Tables on top of these files to load them, rather than using NZLOAD. This, too, worked transparently and all was well. Then a "bright idea" occurred, that in the above equation the Intake ELT faced a table (our intake table) and couldn't we just use the intake ELT right on top of the External Table, eliminating the additional middle-man?&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Like so:&lt;/p&gt;&lt;p&gt;&lt;br/&gt;flat File -&amp;gt; External Table -&amp;gt; Intake ELT -&amp;gt; Staging Table&lt;/p&gt;&lt;p&gt;&lt;br/&gt;The above configuration only appears more efficient by eliminating the Intake table. Looks are quite deceiving, considering how much "per-column work" the Intake ELT had to perform to get data into the Staging Table. What is not obvious, is that the Intake ELT is now sitting on top of the External Table, which is a &lt;em&gt;Host&lt;/em&gt;-managed table, not a &lt;em&gt;SPU&lt;/em&gt;-managed table. In this configuration, we have reduced our power from a 108-SPU problem to a 4-(Host) CPU problem. 
The immediate loss of power was measurable in orders of magnitude.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;So under the covers, here's the power-plant difference in the two models:&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;External Database Table -&amp;gt; network -&amp;gt; intake table -&amp;gt;  Intake ELT -&amp;gt; Staging Table&lt;br/&gt;                          |----HOST -------------|---SPUs--------------------------------|&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;flat File -&amp;gt; External Table -&amp;gt; Intake ELT -&amp;gt; Staging Table&lt;br/&gt;             |------HOST ------------------------------|&lt;/p&gt;&lt;p&gt;&lt;br/&gt;So we can see that the second model is abusing the host with the Intake ELT, and if we go with the original model, the ELT will be handled by the SPUs, offering the necessary scalability and power. In a continuum, we can see where we might initially install nzload or external tables and perhaps "tweak" them along the way. Then a maintenance developer comes along and sees that the "easiest" place to add a fix is in the external table or the nzload rather than pushing it to SPUs. The external table and nzload can (and should) do light-intake formatting per their interface specifications, but no further.&lt;/p&gt;&lt;p style="min-height: 8pt; height: 8pt; padding: 0px;"&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;The over-arching directive remains the same - get the data into the SPU-based tables as rapidly as possible and then do the "dirty-work" with massively parallel power.&lt;/p&gt;&lt;/div&gt;&lt;!-- [DocumentBodyEnd:2c2bf909-f562-4355-8a2f-f9128a9716be] --&gt;</description>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">external</category>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">table</category>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">intake</category>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">nzload</category>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">host</category>
      <category domain="http://www.enzeecommunity.com/blogs/grill/tags">staging</category>
      <pubDate>Tue, 26 May 2009 14:47:06 GMT</pubDate>
      <author>dbirmingham</author>
      <guid>http://www.enzeecommunity.com/blogs/grill/2009/05/26/honor-the-host</guid>
      <dc:date>2009-05-26T14:47:06Z</dc:date>
      <clearspace:dateToText>1 year, 3 months ago</clearspace:dateToText>
      <wfw:comment>http://www.enzeecommunity.com/blogs/grill/comment/honor-the-host</wfw:comment>
      <wfw:commentRss>http://www.enzeecommunity.com/blogs/grill/feeds/comments?blogPost=1087</wfw:commentRss>
    </item>
  </channel>
</rss>

