[ www.netezza.com ]

Gather 'round the Grill

3 Posts tagged with the reporting tag
0

In the past several projects, the issue of using views has consistently arisen, as to when to use, not to use and what to expect. Views are one of those mainstay workhorses that we love to hate and sometimes hate to love, but used correctly, can save a world of hurt and lost development time.

 

So we would ask the question - why use a view at all? Isn't the table definition good enough? And what of a synonym? Isn't this just as good?

 

Well, synonyms are handy for configuration management and invaluable for testing. For BI, however, they don't pass-thru the metadata of their underpinning relation's metadata for consumption by the BI tool, so this can be problematic. We also cannot refresh a synonym easily. It has to be dropped and then created in two operations, where view gives us the concurrency-protected operation of "create or replace view" and is muey bien. More on synonyms in another blog entry.

 

Views can easily reach across databases, giving us the ability to stand-up a consumption-point that contains one-part tables and one-part views without having to push data around (very handy for say, a reference database that we want as an on-demand resource of fresh information). I'm a big fan of setting up consumption-point databases so that a user comes to a pre-designated place, not the master repository, to fulfill their information needs. This decouples the user from the master repository and gives us enormous freedom in the ongoing enhancement of their user experience.Views are the vehicle towards this goal.

 

Views also let us do on-demand case/when conversions and typecasting that can be completely encapsulated from the consuming process.


And of course a really cool part about Netezza views - is that we can include as many columns as we want in its "select" clause, the view will not fetch them all, only the one's mentioned in the select that consumes the view - this is a win-win because otherwise it would fetch all of the columns and then drop the majority on the floor to deliver a few.

 

Views have the lightweight nature of a single SQL statement that can be easily installed, where a stored proc often contains multiple SQL statements. Both of these mechanisms serve to hide the logic from the BI tool. But think about this - would we use the stored proc as part of another join? Or would we expect to just select from the stored proc and consume an answer? The more complex the operation, the more we need to just select-and-consume, and take the burden off the BI tool to know more than it has to.

 

A pernicious part of integrating BI tools is just that - expecting that it will know all it needs to know to interact with the Netezza MPP. This is - as you may painfully discover - a false expectation. Case in point, we might have a very creative intersection table between two large fact tables, and we can formulate a query that will browse the information-we-want in mere seconds. Then we plug in our BI tool and ask it to manufacture a query to do the same thing, but it struggles. Now we have to make a call - do we deploy the BI tool in the hopes that later releases will resolve this, or do we install a view or stored procedure that adapt the BI tool to our data model, and then wait for the BI tool to get better in a later release? You see, we can always toss the adaptation when our BI tool gets better. But we cannot allow our user-experience to languish on the same terms. More on this in another essay.


So before I jump into a lot of other things we like about views, I'll address some of the above in their more malignant form.

 

I'll loosely divide views into two buckets - simple and complex. The simple view consumes a single table and may have columnar transforms on it. A complex view, simply put, has more than one table in the join logic.

 

A simple view cannot be easily misused, but a complex view can be misused so easily it will make the head spin on your best troubleshooter. For example, I cannot count the times I've seen a case where a master query joining on a view, which in turn joined on a view, which in turn joined on a view etc. How deep can you go? This is not the issue at all. The issue is in treating the view as though it is a reusable, inheritable object rather than a standalone select-and-consume capability. So where do we draw the line?

 

Transactional thinking - that is - the notion that we can install nested (inherited) views because they handle transaction-at-a-time anyhow and any given instance of them will have a negligible performance problem - is completely washed away when dealing with multi-billion-row scales on a Netezza platform. It's not a transactional platform, so each view potentially initiates a full table scan. Multiply these nested upon nested views and we have nested tables scans - sometimes several separate scans on the same table. Which is more efficient, to look at a multi-billion row table one, or multiple times?

 

One customer had a query that started running very slow one day. We went through a process of discovery to find out what had changed. Seems that a new version of an existing view had
been installed, and the bad query was consuming this view deep under the covers. The bad query and view  were both accessing one of the largest tables in the database, the bad query was now scanning the big table twice, taking a double-hit on the master query itself. Even worse, the changed view did not leverage the big table's zone maps or its distribution key. So a change in one place dramatically affected unchanged functionality of a master query.

 

Because we are embracing economies of extraordinary scale, dynamic objects have a propensity to lose performance integrity over time. What worked yesterday may not work today, so we have to tune it. Netezza is so efficient that this tuning necessity may not arise for years after the implementation. (In one case, four years afterward). By that time, the knowledge of the system's dependencies are not fresh on everyone's mind, so it is easy to make a spot-fix on the view and deploy it. In so doing, we may create a cascading effect for all the other places that consume the view and do so with the expectation of original behavior. In short, the latent nested view architecture is a minefield. We should not implement it because it creates trouble from day one, even though nobody has stepped on mine just yet.

 

At one customer site we had to sift through six levels of view logic to find the performance problem. The customer wanted to know what they should do to fix the problem, but "the problem" was in the overall inplementation and the nested views, not the one bad view, or for that matter, the recent performance symptoms of a minefield implementation.

 

Views can behave as traditional objects if they are single-table views or they leverage additional tables that are small and inconsequential to performance. Don't ever include a big-fat table in a view as part of a performance boosting strategy unless you can designate that the view is in fact a standalone entry point and not something that can arbitrarily participate in the JOIN clause of another master query. Why is this? Invariably we will forget the complexity of the view and then attempt to join it in another operation. For a BI tool, this could be highly problematic as well, because a view that was once simple could spontaneously go complex, and if it affects performance, we'll be pulling our hair out to find the problem through what reduces to a scavenger hunt, or worse, a submarine hunt.

 

Many BI tools simply choke on automatically forming a "complex" Netezza query because there is an implicit assumption of indexes via primary keys, and if these don't exist, the BI tool does the best it can, which in many cases is the least-common-denominator of a query structure. This this doesn't play well on the SPUs for large-scale queries. I cannot count how many times I've seen a convoluted query that we just de-engineered and simplified, and ran an order-of-magnitude faster than the one conjured by the BI tool, yet nothing the tool folks could do seemed to make the BI tool form it the same way. To the rescue: a view that did the right thing - and that was that.

 

What's that? Putting together a view diminishes the flexibilty of the query? Only marginally, and since we're dealing with billions of rows, we don't have much runway for "ultimate" flexibility anyhow. The larger the datasets, the more we need to make sure the queries are as efficiently formed as possible. And since this means as simply formed as possible, we're not talking about BI Tool query engineering, but query de-engineering.


To avoid pain and injury, don't treat all views the same. If we have a complex view, we should tune and designate it as standalone. No matter how much we like its results, it is better not to just arbitrarily include it in another join. the primary reason being - most views are not set up to regard a distribution. So when we include it with our other join, the resolution of distribution might take the form of of least-performing, lower-denominator. We don't want that.

 

One alternative, oddly, is to CTAS - execute such a view in context and insert its data into a temporary table, then use the temporary table in the master join. This affords us the option to (a) leverage the view's normally small output (b) preserve the distribution or (c) align distribution to the next operation (d) simplify the implementation. Of course, your BI tool may not support this, or may support it in an inefficient fashion. Most of the major BI tools will accommodate advanced scenarios, so get your product support rep on the wire and have a heart-to-heart.

 

Yet another alternative is to use the view like an in-line view, except in the where-clause in correlated sub-query. This can often take the form of a where-not-exists clause or the like and can also be very efficient.

 

Another alternative is to break apart the view's logic and assimilate it into the larger view so that all logic is preserved. But you'll be maintaining that logic in two places, right? Not necessarily. We have a lot of view DDL executables that do not directly spawn from a modeling tool. Several of those being in BASH script, which provides for parameterization of logic. If we put the logic into a parameter, then produce the views by including the parameterized logic, we will maintain the core logic in one place (script) but actually deploy two views that leverage it. This is essentially what happens under the covers with many object-oriented environments anyhow. Multiple objects will consume another class and deploy an instance that includes that class, so this approach embraces that inheritance pattern. Not in the dynamic run-time of the view, but in the view's initial DLL-level deployment.

 

MYLOGIC=$( cat <<!
a.limit1 between 50 and 60 and
a.limit2 between 1000 and 50000 and
a.tran_amt < 10000 and
b.employee_id <> 9999
!
)

 

view1="create view view1 as select col1, col2, col3 from mytable a where $MYLOGIC ;"


view2="create view view2 as select col1, col2, col4 from mytable a join yourtable b on a.id = b.id where a.col1 = b.col1 and $MYLOGIC ;"

 

If our modeling tool supports this capability as part of its functionality, and we should leverage it before simply bolting a view into a join. If our modeling tool does not support it, this scripted DDL scenario is easy enough to formulate and leverage without a lot of overhead. The objective: two views that both behave as optimized joins, rather than one view that behaves as a join-with-a-view.

 

Either way, there is a theme here, that simply including the complex view as part of another join's  logic - as though it was a table - is risky and can, even at the outset, offer up such bad performance as to be a non-starter. So a plain-vanilla practice should be to make the complex view behave in a standalone query-and-consume fashion by default. Make no assumptions that it is okay to arbitrarily include it in a larger query's join clause.

 

The further downside is that a de-facto join-with-a-view can work really well at the outset, but the scale of the data can catch up to the even the most robust of implementations, and wiring up the complex view dependencies creates a problem that will not scale, but will only become obvious over time (a minefield)

 

One group invoked a standard for view naming conventions. The simple views would have no prefix at all, so they would look like tables to the casual user. Fair game and all that. The complex views were labeled as v_<viewname> as a cue to a user or report builder: don't use it in the join of a larger query. You'd think that if there was an implicit rule to avoid using anything "v_" prefix that people would play nicely. But not so, since your reporting users may have come from a RDBMS background where it's perfectly okay to mix views into the master query. Awareness of the standard is one thing, but actually embracing it is another. We cannot protect our systems from people who either don't know the rules, don't understand them, or cannot map their experiences from an RDBMS to an MPP.

 

So a suggestion here would be to name the view in a manner that is a departure from common view nomenclature. Calling it an sp_(NAME) might draw the ire of your admins who want stored procs named for what they are, and not obfuscate their names. But if our views are not really common views, and have caveats on their usage, we need a safer naming convention, one that aligns with the goal we are trying to achieve - that of adapting the BI tool to the MPP. One group used a naming convention of "bi_", while another used "rpt_", and still another used the common acronym for their given BI tool. The point is to adopt a convention that is somewhat unconventional, so that those with conventional thinking are able to transform their thinking without finding themselves in a minefield.

 

Nothing is worse than overlooking a minefield - it's a scary view - a view to a kill.

0 Comments Permalink
0

Manhattan Skylines

Posted by David Birmingham Mar 4, 2010

Marcus Gray watched in consternation as the viral program cranked up. He knew that in moments the band of hackers would once again take over the Manhattan power grid. For now, they were doing it as a prank. But he also realized it could be a test run for something even bigger. Like a grid-by-grid shutdown of the entire system, opening the door for untold mayhem on the darkened streets.

 

Moments later, messages from the hacker gang started appearing on all their terminals. Taunting barbs letting everyone know that they were in complete control and nobody could stop them. Gray shook his head and closed his eyes, hoping that this would pass quickly. Losing power even in one part of the grid could spell pandemonium and place lives and fortunes at risk. The weight on his shoulders was crushing.

 

"I think I can help," said a voice from behind. Lane McBride from the Federal Counter-Terrorism Unit based in Manhattan, leaned over to regard Gray's terminal.

 

Gray turned to the voice, recognizing it with hope in his eyes, and said, "They're at it again."

 

"I saw the precursors," McBride noted, "That they were entering the system."

 

"Yeah, but it doesn't matter if we can't find exactly where they are," Gray sighed, shaking his head, "They're in a hundred different buildings, including the Empire State. You guys have agents standing by at all of them, but they have to search the buildings floor-by-floor to find them. The problem is, we have to shut down communications for the building so that they can't warn each other. So even if we could catch a few, do you have any idea how long a floor-to-floor search takes in the Empire State? We can't keep that building offline from communication for that long."

 

"Not to worry," McBride grinned, "I have an algorithm that will directly pinpoint their floors. All we have to do is send our officers up to the floor, and I bet we can round them up in minutes."

 

"Wow," Gray whistled, "I'd like to see that."

 

McBride whipped out a flash stick, plugged it in and let the program do its work. Within seconds, it had pinpointed each hacker, the building their signal was coming from and the floor of the building. "Here we go."

 

"I like it," Gray grinned.

 

McBride touched several buttons on his phone and dispatched the information, and monitored as each of the officers acknowledged the information and the plan. "We'll know soon enough."

 

Gray noted, "The problem has always been that they could hear us coming and could shift floors anytime they wanted."

 

"Not this time," McBride smirked, "At least, not if we do it right."

 

The first officer to report back was from the Empire State. Two of the hackers had been stationed there on separate floors. Both were now in custody and unable to warn their cohorts in the other buildings. Gray listened in awe as one by one, the officers reported in, having captured their respective quarries with minimal effort.

 

"That was brilliant," Gray stared at the screen as the weight seemed to lift from his shoulders, "How did you come up with the algorithm?"

 

"Simple process of elimination. I just looked at the problem from a very-large-scale search. The most important information is where the perps aren't - not where they are. The algorithm zones in on the candidate floor by understanding which floors are not candidates. Process of elimination leads the way. So we can search the Empire State and Chrysler buildings just as quickly as a single-story, capture the floor number and we're done."


---------------------

Some of you already see the parallels. It's how a zone map works. But how does it apply?

 

When we take a look at the Record Distribution option in the Netezza Admin GUI, we're often happy with a "ragged edge" for all the SPUs. And a "flat top" is the ticket. But what about the case of a "Manhattan Skyline", where we have high peaks and low valleys? This is higher than normal skew (something we're supposed to avoid, right?) People see those and shun them. However, these are often the natural result of an intermediate table produced by an ELT operation, and often a result of multi-pass queries in a BI tool. These usually leverage the mainstay workhorse CTAS (Create-Table-As-Select), so in many cases, people are tempted to turn on "random" for all CTAS operations. Or just maybe - one of our regular static supporting tables is deliberately distributed as a Manhattan Skyline just because we want to regularly perform co-located joins with it on larger master table using the same distribution key.

 

In any case, a primary reason we would get this kind of Manhattan Skyline distribution is if we are trying to preserve an existing distribution in order to perform a follow-on operation with tables on the same distribution. Whew! And why would we allow this to continue? Isn't a random distribution better than a Manhattan Skyline? Our problem remains: if the table has such a Manhattan Skyline distribution, we have higher than normal skew. Any full-scan on the table will cause the query to perform as slow as the "tallest bar"  (the SPU with too much of the table's data). As the table grows in size, the problem worsens. It is not a scalable distribution in its latent form, so don't embrace one without a plan.

 

Well, random distribution has a risk too, especially at the BI level, of negatively affecting concurrency performance. Even if our individual queries are not hindered by the data-broadcast incurred by the random distribution, they could just be a one-hit-wonder, because running many of these operations side-by-side can sometimes saturate the inter-SPU fabric, affecting concurrency. If we can keep the processing on the SPUs, we can avoid this problem entirely. So the issue is one of user scalability, something that all of us care about and that the other vendors (sometimes) turn a blind eye to. Netezza has it covered, and as usual, it's so simple a cave man could do it (now I'll get mail!)

 

So now we have two options, neither of which seem good - (a) keep the Manhattan Skyline distribution or (b) use a random one. Let me say that random is not always bad, but it poses a potential danger for concurrency. Likewise the Manhattan Skyline can often be a latent result of an intermediate CTAS so is unavoidable anyhow. And why would we want to preserve an existing distribution on a CTAS? The answer - because it will be a co-located write (blazingly fast). But wait! Don't we get a co-located write by default?

 

Maybe.

 

I have noted in prior posts how the default distribution for a CTAS might not be what we want or expected, so here's a quick recap:

 

(a) For simple single-table CTAS, it will preserve the source distribution key - (co-located write)

(b) For simple multi-table-join CTAS, it will leverage the first column result in the "select" clause (maybe a co-located write)

(c) For CTAS using summaries/group functions in the select, it will leverage the columns in the "group-by" clause (rarely a co-located write)

 

If any of the above are not the original distribution of the source(s), we could inadvertently sacrfice our co-located write. But we can preserve it if we specifically use "distribute on" with the CTAS execution. With co-located writes, this means the data never leaves the SPUs. If we distribute the CTAS on anything else, the data must leave its current SPU and find its way to another one. This initiates a data broadcast (and can negatively affect concurrency). Preserving the distribution, we get the benefit of a co-located write (avoiding broadcast to make the table) and set up the next operation for a co-located read (also avoid the broadcast to leverage the table). Short answer: preserving the distribution preserves concurrency performance. Now the SPUs are working for us at physics-speed.

 

Rather than just live with the latent effects, lets embrace and harness them for the good of all mankind. Well - er -  at least for our user base.

 

What we really want is threefold -

 

(1) preserve the distribution with a co-located write (preserve concurrency, potential Manhattan Skyline as latent artifact)
(2) leverage the result with a co-located read (preserve concurrency, potential penalty from Manhattan Skyline)
(3) mitigate the Manhattan Skyline with a zone map (ahh, best of all worlds)

 

So to get the first two, we can simply preserve the distribution with a "distribute on (key)" clause and make sure the distribution key is part of the "where/join" operations.. This is the simple part.

 

To get the third, we need to either (a) sort the data as it is created, or (b) make a materialized view after-the-fact to get the zone map effect for selected columns. The first one (sorting) is often easier than it sounds, and with strongly filtered intermediate tables is also very scalable. The second one (materialized view) has some caveats but is very fast to create. What does the zone map actually do? It effectively stripes each SPUs portion of the table so that only the section in the zone is actually addressed. Like McBride's algorithm, it's as though the rest of the data isn't even there, because the zone map has guided the optimizer to completely ignore it. So whether the SPU's data has a tall bar or a short bar, the performance is the same. We need all three of the above and the zone map mitigates the potential problem of unexpectedly high skew from an intermediate distribution - or an outlier table that we need to distribute on a common key. Even if (1) and (2) above give us a good distribution today, it could always "go Manhattan" in the future.

 

Another obvious question is "If this is an intermediate result, why bother? Just filter out the stuff I don't want and then there's no issue, right?" Well, technically yes, for a single operation, but I know of at least a dozen cases where the intermediate table is used for a lot of downstream activity, not just a one-off throwaway. So our stewardship rule is: make the data better. For the next downstream process or the ultimate data consumer, the data should get better every time we touch it.

 

Rather than rewrite or re-design a carefully tested and detailed process, adding a simple "order by" or MV is easy and preserves the existing logic, and data model, with little impact and high return. This is especially true of a static supporting table, because we can install what we need on the table's creation. The consuming processes all benefit from it with no more than regular query execution (materialized views are transparent).

 

In the end, we can still leverage the plain-vanilla parts of the Netezza performance model (zone maps, co-location) without having to over-engineer the data using indexes, intersection tables or summaries. This preserves something more  - the ongoing resilience and adaptability of the model itself.

 

Recap:

 

  • Apply the "distribute on" clause of the CTAS to avoid the latent effect of default distribution.
  • Preserve co-location for reads and writes in intermediate tables.
  • If a potential Manhattan Skyline distribution is the CTAS result, rather than go random, sort the CTAS result by a selected column or use a materialized view.
  • As always, apply strong filters to the CTAS creation so that it's not simply copying one table's contents to another (carve the data out).
  • Experiment for the best fit, but remember that Netezza is an appliance.
  • We don't need to engineer the queries, only apply simple performance model alignments in the data itself, to leverage the machine's physics
0 Comments Permalink
0

Famous words, or some such like, uttered by Orson Welles as he launched into a scary parody of alien terror on national radio. Really scary for some. And proferred on Halloween night in 1938, so dare I say, 'tis the season (almost).

 

Ahh, not to fear, this purports to be a painless foray. But I do have a story to tell.

 

Several projects ago (I always start this way, so you won't think I'm talking about you!) - I worked with some really sharp data engineers on boiling out a solution for retail operational reporting. The data arrived every five minutes or more, or less, and sometimes in parallel loads, with 24x7 regularity. More and more Netezza implementations are going this way, and you too, should look into processing data at the speed of thought. In any case, the reporting users wanted to plumb the depths of this data store, to the tune of eighty billion records and growing. (Okay, small I know (for some of you) but humor me).

 

Well and good, except rather late in the game, the reporting users spontaneously expressed a desire to review the detail through metadata-based "lens", that is, set up some drilling levels and other metadata-based entry points, such that the entire operational model would be seen through this reporting "lens" and it would provide all the context for the consumers.

 

Now, such a model as described, would require such enormous power from a standard SMP/RDBMS-styled system, that we might well cause structural damage on the raised floor for sheer physical weight of said system. That is, if we really expected a report to return within a day or two of the request. Ahem! as I facetiously clear my literary throat.

 

But the worst-case for any given query for the above was around 8 minutes, and over 99 percent of the thousands of queries submitted, returned in less than 30 seconds. Oh, yeah, it was smokin' hot. In most queries using zone maps and the like, we saw returns in mere multiple seconds. Pshaw! Says the tick-tock-man, chocolate and vanilla, don't waste my time.

 

However (and there's always a catch) many of the larger reports were actually conglomerations of these smaller queries, and their aggregate time would occasionally exceed ten minutes or more. And even though this was a far cry from the "days away" we would expect from an SMP/RDBMS system, it was still 'too slow' for the users. Now, this is true adrenalin-junkie stuff, sort of like the old Far-Side cartoon of a young man standing with a fork in front of a waffle iron, captioned "Wendell Zurkowitz, slave to the waffle light". I recall how one man noted that many years ago we would wait hour(s) for a traditional oven to finish cooking, and now get impatient when the microwave instructions are greater than five minutes.

 

Perspective.

 

And rather than punt to the users and say, "Hey guys, this is just unrealistic" and degenerate into "expectation management" - the challenge was to actually achieve faster turnaround times on the reports. And here, I'm talking about getting these ten-minute reports into the 30-second zone. Would we have to embrace some extreme engineering for this feat? Methinks not - but the form of the process to get there was quite instructive.

 

Now recall I noted that the above model had operational tables, which were to be the detailed source, and a retail reporting hierarchy that was largely metadata-based. This reporting hierarchy had some significant size as well, perhaps a fourth the size of the eighty-billion-record fact table it had to link into. Yet both of these were on separate distribution keys. Queryng one meant broadcasting another.

 

And now, for broadcasting.

 

Whenever two tables are distributed on different keys, a join between them cannot be initially co-located. To support the co-location, Netezza will broadcast the salient information from one table's context to the other. This means the physical data has to move from its home SPU, out onto the inter-SPU network fabric, and find its way to the target SPU where it will be further examined. Broadcasting for small tables is inconsequential and barely a blink on the radar. For larger tables it can have strange effects. For example, we saw one query return consistently in ten seconds. Yet when running side-by-side with itself (multiple users) it could take several times longer.


The reason is that both queries were competing for bandwidth on the inter-SPU fabric, among other things. The simplest solution, of course, is to get our metadata table distributed on the same key as the operational tables. The problem was simply in the complexity of this metadata table and how it mapped to the core information. "Blowing it out" into a materialized form of information would require significant planning and design, because a misstep could easily make the reports turn out wrong, and this was unthinkable. In all this, the maintainability had to be considered, because if our initial complexity is too high, the maintainability is in jeopardy - by design.

 

Of course, we would spend most of our time in testing this scenario. Coding and implementation in most BI shops is a nit compared to the testing we have to execute to validate the outcome. Netezza is no different, except we can close the testing loop sooner if we have more power. And of course, for something of this magnitude, to test the change from minutes to seconds, we would need a powerful machine to measure the difference. Whenever we ran the new solution on a smaller machine, the difference couldn't even be measured. No, the power of the machine makes the testable difference visible and measurable.

 

As I noted, the form of this exercise was the most instructive part. Rather than form a means to align these two tables for co-located joins, the first effort was in attempting to tune the queries. You know, "query engineering", which is the mainstay of performance engineering on an SMP/RDBMS platform, and old habits are hard to break. The data engineers were somehow in denial that they would receive extraordinary power from configuring the data. Rather they trusted their instincts and chose to attack the queries.

 

Now, in any platform, regardless of shape, size or vendor, power is always and forever the domain of hardware. Software cannot manufacture more CPUs or network speed. If the physical plant is not ready, the software can only use what it has at its disposal. The software itself is largely a cost center, because it can only drain the machine's energy through inefficiency. In an SMP/RDBMS machine, the only option we have is to engineer the queries, because the physical plant is configured to be general purpose.

 

In a purpose-built machine, however, the query is simply a controlling mechanism to Netezza's resources. The host will chop it apart into snippets and dispatch these to the component that they will serve. Extreme query engineering on the other hand, assumes that jockeying around with the query can actually affect our fate. (contrast; a poorly written query is different from directly engineering a well-written query). And besides, do we really want to spend our time carefully engineering the query to the point of functional brittleness? In an SMP/RDBMS machine we will see queries that extend for tens of pages in a very daunting complexity. Maintaining these is a full-time job for our consultants. They swarm on the machine, and carefully tune their handiwork to avoid breakage.

 

Yet, we purchased a Netezza machine to get away from this complexity. To reduce, clarify and simplify our administration and consumption of the data. So as I watched these engineers bat themselves against the problem, no differently than a fly batting against a window, I watched them pull out their hair in generous tufts when little they did offered the significant gains they expected. This outcome was entirely counter-intuitive to their training. They were acccustomed to using and tuning software to make things work faster.


Sweeping the hair from the floor one evening, I mentioned (for the x-teenth time) that the broadcast effect was killing them. Once our engineers grasped the broadcasting problem, I thought we would make headway, but things actually got worse. They started trying try to control the broadcast as the root cause rather than the symptom. In one test, I saw one of the largest tables leap into a broadcast and we just killed the query outright (it would probably still be running, even today). The engineers lamented: How do we make sure the larger table doesn't broadcast? How do we control the broadcasting to our benefit? Answers exist to all of these, but it's like talking to a drug addict, one who is addicted to the drug of SMP/RDBMS and claims he can 'quit anytime'.

 

And then the truth came out, "David, if we can make this 10100 machine process data like a 10400 machine, we'll look like heroes!" To which I ask "How?" to which the response is: "We can save them all that money they would have spent on the hardware..." Well, not really. You've just chosen something else to spend the money on, namely performance engineering, the cost of time-to-market, the cost of a marginal implementation and the cost of human labor (the most expensive asset you have, by the way). But since the only way to get a 10100 to perform like a 10400 is to actually be a 10400, well, you see the futility. 432 SPUs versus 108 SPUs? And they really, truly thought they could - I mean - seriously. Let's keep in mind that the opposite is true. If we can't make the 10100 process data like a 10400, perhaps our approach is flawed? Heroes or goats. Take your pick. In my estimation, there's only one hero in the room. The big black box.

 

So the broadcast is the symptom, not the root cause. How about, we quit broadcasting, cold turkey? Take the data model through a detox program and the engineers through a series of deprogramming seminars to - well - it's not that bad. Typically the average engineer only has to see it operate in an adverse manner to become a believer. But a believer they must be, or they will not take action to correct the problem, correctly.

 

So one of them finally decided to produce a map table, one that would map the metadata into the operational tables such that all core joins would become co-located, with a common distribution. And lo, the first test of this blew their minds. Even the complex reports were now coming back in single-digit times, and the reports that had been running ten minutes or longer were now under a minute, even with multiple users. In fact, they saw the performance and scalability practically handed to them - simply because they configured the data correctly. It had little to do with query engineering.

 

Now one may ask the obvious question, and please do so now: Why don't you just build out some user-facing tables and forget leveraging the operational tables? After all, we don't build our non-Netezza reporting systems on top of operational data, do we? We build-out dimensional models and other handy structures to postively affect the user experience and simplify the flow (and the maintenance). This functional decoupling is a mainstay of reporting environments. (Okay, the next entry will focus on this). But in this case, suffice to say that the owner of the machine had placed down a hard-mandate on disk utilization. At no time could we foray into replicated detail, or even summary of detail without a plan to access the operational detail on a drill-down and the like. Interestingly, the required reporting tables would have only cost mere fractions of the cost (on disk) of the time/labor and effort put into making the operational tables viable. This is why it deserves its own treatment in a separate rant - er - essay. Stay tuned, and don't touch that radio dial.


Back to the drama - A telltale symptom that we're doing something wrong, is when we start down the engineering path. It's an appliance. We don't engineer toasters, blenders or laundry machines. But the difference here seems to be subtle. It's not. In this case, the culprit was the broadcast, something to be eliminated rather than managed. And no amount of creative query hoop-jumping would overcome this. Get the joins onto the SPUs. It seems obvious to those who have been around the machine for bit. But for those who have not, the learning curve is upon them. Be patient with them for as long as it takes to get it right. Once we have a believer, we'll never have the conversation again. As long as we stay in a theoretical zone, however expect them to stay in the spin cycle. This is like many things scientific. Seeing is believing.

 

Whenever I (and others like me) observe a ritual of performance engineering, each participant holding out the hope that "just one thing" will offer stratospheric boost so they can all wipe their foreheads and go home - this is the surest sign of one of two things: Either the data is poorly configured and is causing the queries to be ineffcient, or the data is properly configured and the machine does not have enough physics to achieve the goal. If the focus is on query engineering, they are wasting time. If the focus is on data engineering, at some point it will reach a "diminishing return". Either the machine has the power or it doesn't. Time to switch to Netezza, or if using Netezza, time to add some physics (a frame or two) to make it happen.


Moral of the story: Performance is found in the physics, not the carefully engineered queries. If we find ourselves "engineering" our queries for performance reasons - we should take a step back, take a deep breath - click our heels together and say softly: "There's no power like SPU power. There's no power like SPU power." Repeat as necessary.

 

And pay no attention to the man behind the curtain. I'll bet he and Orson Welles never even met.

0 Comments Permalink