"What Netezza is doing is... going a step further: score the data as it is streamed into the appliance and before it even hits the database... However, it is not just the performance gain that is significant. This initiative means that developers are embedding analytic software into the Netezza Data Warehouse Appliance so that it becomes, in effect, an application appliance."
Philip Howard, Director of Research, Technology, Bloor Research - from his 5th October posting, "The Netezza Developer Network"
Pardon the title's riff on the late-1970s Elvis Costello hit song What's So Funny 'Bout Peace, Love and Understanding, but a recent mini-dustup got me to thinking about providing a bit of insight into why Netezza's approach to "Streaming Analytic™ Appliances" is different from others' entries in the market. It seems the recasting of Netezza's mission in terms of streaming analytics rather than the more-limiting data warehouse appliances, along with the launch of the Netezza Developer Network (NDN), has caused something of a hullabaloo among some of our competitors (refer to recent stories from Teradata/SAS, Greenplum, IBM/SPSS and industry analyst, Curt Monash).
And well it should. While some seem to dismiss Netezza's positioning on the topic as 'nothing more than UDFs', and argue that what matters is supporting them effectively, we must beg to differ (and to differentiate). In short, we feel the Netezza approach to Streaming Analytics opens the door to dramatically changing the way data warehouse systems are viewed, used and even deployed.
The positioning of some of our larger, (recently) publicly traded competitors may suggest that they see themselves not just as experts in the domain of data warehouse systems, but also as experts in the ways of CRM, advanced scoring and analytics, etc. They seem to have bolted on homegrown software packages as extensions of their data warehouse offerings in the market. That may well be the case, but we don't really see how it's possible for one vendor to "corner the market" on innovation - a view that we think is borne out in recent announcements of closer UDF-based partnerships. Still others, more from the new-entrant category, claim that the only thing required is simply to support basic UDF functionality as an extension to the database. We think both ends of that argument are incorrect.
Instead, we at Netezza think it best to "stick to our knitting". Our aim is to provide the high-performance infrastructure, along with a technical and community foundation, so that others much more expert than we are can drive algorithmic and application-level innovation by exploiting the performance of our streaming analytic appliance. To riff once more, this time on the BASF ads of the late '70s and early '80s (a campaign that has recently been rekindled in that company's marketing), our vision could be summarized as, "We don't make your advanced applications; we make your advanced applications 'run like rockets'**."
** "Raw SPU functions are called like any other SQL function... ...and they run like rockets by exploiting the Netezza architecture."
— Justin Lindsey, Chief Technology Officer, Netezza, speaking at the 2007 International Netezza User Conference, 26th September, 2007
<font color="#008000">What Netezza Provides</font>
What Netezza provides in this mix is an extremely high-performance system, particularly well-suited to storage-intensive operations (like data warehousing) and, in particular, to operations that (like data warehousing & BI) can benefit from a data streaming architecture in which unnecessary data is eliminated as rapidly as it is read from the storage elements, allowing for greater processing efficiencies. We've written extensively about this before (see Spotlighting FPGAs, parts one, two and three) and won't repeat the arguments here.
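To make the idea of reducing data as it streams off storage a little more concrete, here is a minimal Python sketch. It is not Netezza code and does not use any NPS interface; the function names `read_rows` and `restrict_and_project`, and the sample records, are hypothetical and exist only for illustration. The sketch filters and trims each record at the moment it is read, so later stages never see the discarded rows or columns.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Sequence, Tuple

Row = Dict[str, object]


def read_rows(storage: List[Row]) -> Iterator[Row]:
    """Hypothetical stand-in for a storage scan: yields one record at a time."""
    for row in storage:
        yield row


def restrict_and_project(
    rows: Iterable[Row],
    predicate: Callable[[Row], bool],
    columns: Sequence[str],
) -> Iterator[Tuple[object, ...]]:
    """Drop unwanted rows and columns while the data is still streaming."""
    for row in rows:
        if predicate(row):  # restriction: skip rows the query does not need
            yield tuple(row[c] for c in columns)  # projection: keep only needed columns


# Hypothetical sample data standing in for records on disk.
SAMPLE: List[Row] = [
    {"id": 1, "region": "EU", "amount": 120.0, "note": "x"},
    {"id": 2, "region": "US", "amount": 75.5, "note": "y"},
    {"id": 3, "region": "EU", "amount": 10.0, "note": "z"},
]

reduced = list(
    restrict_and_project(
        read_rows(SAMPLE),
        predicate=lambda r: r["region"] == "EU",
        columns=("id", "amount"),
    )
)
```

Because both functions are generators, no intermediate copy of the full data set is built; only the reduced records travel downstream.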
Another key lever that Netezza provides by way of the NPS® appliance is the fact that our intelligent storage elements, known as Snippet Processing Units (SPUs), are each really compute nodes. They are capable of running compiled C or Java code, with the added task-by-task "customizability" of an FPGA that can further accelerate performance, operating in an MPP compute grid but with the simplicity of our appliance approach.
Consider this: if those hundreds of SPUs in an NPS appliance can be used to run C code to execute SQL query processing tasks, why couldn't they equally be put to work on tasks that go well above and beyond those enabled (encumbered?) by the set-based, structured-data logic of SQL? Where others may use UDF or even UDA functionality in their data warehouse systems to collect up and standardize the use of SQL functionality across users, the streaming analytics enabled by Netezza allows users to "draw outside the lines" of SQL.
Another thing Netezza provides for NDN members is a set of basic building blocks - functions and an algorithmic work area that form the foundation on which more advanced work can be built. Some of these have much in common with the standard fare of "traditional" SQL functional extensions: record-level functions or User-Defined Functions (UDFs) and aggregate-level functions or User-Defined Aggregates (UDAs) are part of the foundation. But some of the other parts go far beyond those definitions, allowing developers to implement functions that retain a sense of state or to cascade multiple complex algorithmic processes to build even more powerful solutions, all making use of the streaming nature of the NPS analytic appliance to push performance even further.
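The differences among these building blocks can be sketched in a few lines of generic Python. This is an illustration only, not the NDN or NPS programming interface: every name here (`to_cents`, `SumAggregate`, `running_delta`, `cascade`) is hypothetical, and the accumulate/merge/result shape is simply one common way to write an aggregate that can run on separate partitions and be combined afterwards.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def to_cents(amount: float) -> int:
    """Record-level function (UDF-style): one value in, one value out."""
    return round(amount * 100)


class SumAggregate:
    """Aggregate-level function (UDA-style): accumulate per partition, then merge."""

    def __init__(self) -> None:
        self.total = 0

    def accumulate(self, value: int) -> None:
        self.total += value

    def merge(self, other: "SumAggregate") -> None:
        # Combine a partial result computed on another partition.
        self.total += other.total

    def result(self) -> int:
        return self.total


def running_delta(values: Iterable[int]) -> Iterator[Tuple[int, int]]:
    """Stateful function: remembers the previous record as the stream flows past."""
    previous: Optional[int] = None
    for value in values:
        delta = 0 if previous is None else value - previous
        previous = value
        yield value, delta


def cascade(amounts: Iterable[float]) -> List[Tuple[int, int]]:
    """Cascade: feed the record-level function straight into the stateful one."""
    cents = (to_cents(a) for a in amounts)
    return list(running_delta(cents))


# Hypothetical data split across two partitions, as it might be in an MPP system.
partitions = [[1.50, 2.25], [0.75]]
partials = []
for part in partitions:
    agg = SumAggregate()
    for amount in part:
        agg.accumulate(to_cents(amount))
    partials.append(agg)

combined = SumAggregate()
for partial in partials:
    combined.merge(partial)
```

The record-level function and the aggregate are what ordinary SQL extensions already offer; the stateful function and the cascade are the kind of thing that goes beyond them, since each depends on the order and flow of the stream rather than on a single row or a single group.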
And finally, what Netezza provides is a simple development appliance platform on which NDN members can develop and verify their algorithms, including the performance impacts of operating in parallel. Affectionately known by the decidedly non-marketing name, SPUBox, the platform is a fully-functional version of the NPS appliance, with four Snippet Processing Units, a host processor and network connectivity. Weighing in at a little over 40 pounds (18 kilos), one might call it a "0th generation luggable analytic appliance" but one that only consumes about as much energy as two 75W light bulbs. We granted more than ten of them to new NDN members at our September global user conference in Boston, some with special decal "wraps" to stand out above the ordinary compute platforms you may be used to.
<font color="#008000">It Takes a Village...</font>
I think the word potential is important when applied to streaming analytics, because what we're doing is opening the door to the potential of the data warehouse to be used in a very different way. Those extended uses are being made possible by Netezza and by our community of users, developers and partners, a community that we are fostering and that is growing virtually with each passing day.
What we are seeking to unleash is a new level of performance and innovation in the use of storage-intensive analytical computing, but the important bit is that Netezza is not looking to do this alone. In fact, we do not picture ourselves as having cornered the market on analytical algorithm writers. Instead what we launched with the NDN is intended to evolve as a cooperative and competitive global web of experts who will build on their own and one another's innovations. Here, the term, "coopetition" seems trite; I'd prefer to think of the NDN as an opportunity for innovative "mashups" at the building-block, advanced algorithmic and applications levels.
Foundational elements are used together to enable basic value-add functions to be built. Those are mixed and matched, typically but not always with standard SQL fare, to enable more complex algorithms to be realized. And, in turn, the algorithms enable very high-performance applications and new uses of the NPS appliance to be realized. In some cases, one entity may do most or all of the above work. In many others, we are already seeing cooperation among members to use and reuse modules developed elsewhere to extend the capabilities.
Much as artists have done with a humble plastic child's toy such as Lego, these capabilities can be mixed and built upon to create innovations we may not even be able to imagine today.
The opportunity is helped along by the network effects of an open community and members (by last count, in excess of 50 spread around the world) spanning entities from university professors and graduate students to BI applications providers to end-customers of the Netezza Performance Server® appliance - and everything in between. This is where the true industry expertise lies. This is also the source of innovation for what can be possible with the opening up of Netezza's architecture to more than "just" data warehouse and BI.
Where will all of this lead? To advanced text, image, bioinformatics or video processing? Perhaps. Into the domain of the 'what if' Monte Carlo or Genetic algorithm simulations for risk analysis and predictive resource optimization? That's another possibility. But we're confident that people are going to use the NPS appliance in new and innovative ways as a result of Streaming Analytics and the NDN - and in ways which may well help shape the features and functionality of the appliance in releases to come.
<font color="#008000">What's So Special?</font>
What's so special about all that? Well, with these foundational building blocks, imagine being able to develop customer- or threat-scoring algorithms that could be accomplished in as little as one pass through record data in a data warehouse instead of the multiple passes required to denormalize or pivot data, or worse still, large extracts of the data from the warehouse to an off-board computing complex in order to perform the denormalization and scoring tasks. What if this single-pass technique yielded a 10X speedup in processing? What if it allowed a task that formerly took over 10 hours to be done in less than 20 minutes (better than 30X)? What if it could be more than 100X? Might that change the way that particular analytical task was used? Might that change someone's business? We think it could. More importantly, so do many of our customers, partners and prospects.
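As a rough illustration of what "one pass instead of a pivot" can mean, consider the Python sketch below. It is a simplified stand-in, not a Netezza scoring implementation: the function `score_in_one_pass`, the attribute names, the weights and the sample rows are all hypothetical, and it assumes a purely additive (weighted-sum) score, which is the case where this trick applies most directly.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical model: each attribute contributes weight * value to the score.
WEIGHTS: Dict[str, float] = {"recency": 0.5, "frequency": 0.25, "spend": 0.25}

# One normalized record: (customer_id, attribute, value).
Record = Tuple[str, str, float]


def score_in_one_pass(rows: Iterable[Record], weights: Dict[str, float]) -> Dict[str, float]:
    """Accumulate each customer's score while streaming over normalized rows.

    No pivot to one-row-per-customer is needed: every record adds its own
    contribution to the running score as it goes by.
    """
    scores: Dict[str, float] = defaultdict(float)
    for customer_id, attribute, value in rows:
        # Attributes the model does not use contribute nothing.
        scores[customer_id] += weights.get(attribute, 0.0) * value
    return dict(scores)


# Hypothetical sample data, deliberately interleaved across customers.
ROWS = [
    ("c1", "recency", 4.0),
    ("c2", "spend", 8.0),
    ("c1", "spend", 2.0),
    ("c2", "frequency", 4.0),
    ("c1", "unknown", 100.0),
]

scores = score_in_one_pass(iter(ROWS), WEIGHTS)
```

The input is consumed as an iterator and read exactly once, in whatever order the records arrive. Models that are not additive would need more state per customer, but the single-pass shape of the loop stays the same.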
To date, the Netezza Developer Network has dozens of active partners participating in the program globally, with more than 100 further applications to join the program pending [note: if you're thinking of your own really exciting "on stream" application ideas, you can apply online at http://www.netezza.com/ndn]. We think that, from the combined innovation and expertise of this group, the NDN has the potential to take the NPS analytic appliance to new levels of performance and new application domains that will continue to include, but may go far beyond, the standard Data Warehouse Appliance of our roots.

