[ www.netezza.com ]

Change, but no Change

Posted by Phil Francisco Jul 31, 2009

Just trying to clarify. Curt Monash's informative blog on the coming Netezza system and family of products includes the following:

 

<snip>

 

Beyond the switcheroo in components, Netezza is making substantial changes to its hardware architecture. In current Netezza products, the FPGA plays the role of a disk controller on steroids — it receives data, does some SQL or other analytic operations on it, and then throws it over the wall to the CPU for the rest of the processing. The new Netezza product family, however, adds an actual disk controller. More important, it adds fast interconnects between the FPGAs, the disk controller, and RAM — specifically, as Phil Francisco put it in an email,

using multiple parallel channels of PCIe with much faster interconnection rates and lower contention between the blade server and the “DB accelerator card” with the FPGAs.

DMA (Direct Memory Access) technology also fits into the picture somehow.

 

<snip>

 

...which seems to beg further clarification.

 

While Curt suggests big changes are afoot in Netezza's “architecture”, I think a more appropriate view is that it's “the same architecture with a new physical implementation”. That is, the concept of data streaming from disk through the system is just as important now as it ever was.

 

S-Blade Diagram.jpg

 

True, we did move the "disk controller" function to a pair of HBA (Host Bus Adapter) cards that interface with the disk enclosures over multiple, redundant SAS (Serial-Attached SCSI) links, providing more than ample bandwidth to stream all the drives in a rack continuously to the blades. For those who click through to Curt's blog, this function is embedded in the device labeled “SAS Expander Module” (one on both the blade server and the "DB accelerator") in the third chart of the PDF file (also shown above). It allows data to stream from disk through to memory and then on to the FPGA without delay.

 

SP Data Flow.jpg

 

To move data between the blade server and the DB accelerator cards, we use IBM's expansion card (formerly known as "sidecar") technology to provide multiple parallel high-speed PCIe (Peripheral Component Interconnect Express) channels. These channels deliver the data streams from the disk drives to the memory on each blade server and provide a very high-speed interconnect between the FPGA devices and that same memory. DMA (direct memory access) lets the FPGAs reach that memory at high speed without burdening the CPU.

 

FPGA Engines.jpg

 

With all this high-speed interconnectivity, Netezza has been able to alter the data flow so that data streams to the memory first and then to the various FAST engines (see above diagram and/or refer to Issue 16: The Latest Addition to Netezza's FAST Engines Framework) in the FPGA. Those engines act as a "turbocharger" for query processing: they decompress the data, restrict it, project it and apply the appropriate visibility rules in a pipelined process, typically filtering out well over 95% of the data scanned. From the FPGA, the resulting reduced data set is passed on to the CPU memory for the additional processing that completes the query.
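To make the pipelined flow a little more concrete, here is a minimal software sketch in Python. It is only an illustration: the real engines run in FPGA hardware, and the row layout, the function names, the use of zlib and the order of the stages are all assumptions made for this example, not Netezza's implementation.

```python
import zlib
from typing import Iterable, Iterator, Sequence, Tuple

# Hypothetical row layout for the example: (row_id, region, amount, deleted)
Row = Tuple[int, str, float, bool]


def decompress(blocks: Iterable[bytes]) -> Iterator[Row]:
    """Expand each compressed block into rows, one block at a time."""
    for block in blocks:
        text = zlib.decompress(block).decode("utf-8")
        for line in text.splitlines():
            row_id, region, amount, deleted = line.split(",")
            yield (int(row_id), region, float(amount), deleted == "1")


def restrict(rows: Iterable[Row], region: str) -> Iterator[Row]:
    """Keep only rows matching the predicate (a WHERE clause, in effect)."""
    return (row for row in rows if row[1] == region)


def apply_visibility(rows: Iterable[Row]) -> Iterator[Row]:
    """Drop rows the query should not see (here: rows flagged as deleted)."""
    return (row for row in rows if not row[3])


def project(rows: Iterable[Row], columns: Sequence[int]) -> Iterator[tuple]:
    """Keep only the requested columns (the SELECT list, in effect)."""
    return (tuple(row[i] for i in columns) for row in rows)


def scan(blocks: Iterable[bytes], region: str, columns: Sequence[int]) -> Iterator[tuple]:
    """Chain the stages so each row flows through all of them in turn."""
    return project(apply_visibility(restrict(decompress(blocks), region)), columns)


# Two small compressed blocks standing in for data streamed from disk.
blocks = [
    zlib.compress(b"1,east,10.0,0\n2,west,5.0,0\n3,east,7.5,1\n"),
    zlib.compress(b"4,east,2.5,0\n"),
]
result = list(scan(blocks, "east", (0, 2)))
```

Because every stage is a generator, no stage waits for the previous one to finish the whole scan; rows that fail a filter never reach the later stages, which is the software analogue of passing only the reduced data set on to the CPU.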

 

So, the logical streaming model of data from disk to FPGA to CPU is retained, with significantly higher throughput as a result. But there's an added benefit: the originally scanned data can remain in memory, still in compressed and unfiltered form, to be used as a cache that avoids disk scan activity where possible and helps boost system performance even more. In short, "Change, but no Change."
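The caching benefit can be pictured with an equally small Python sketch. Again, this is an assumption-laden illustration rather than a description of the product: the class name, the block identifiers and the unbounded dictionary are all hypothetical, and a real system would need an eviction policy.

```python
from typing import Callable, Dict


class CompressedBlockCache:
    """Keeps scanned blocks in memory, still compressed, keyed by block id."""

    def __init__(self, read_from_disk: Callable[[int], bytes]) -> None:
        self._read_from_disk = read_from_disk  # hypothetical disk reader
        self._blocks: Dict[int, bytes] = {}
        self.disk_reads = 0

    def get(self, block_id: int) -> bytes:
        """Return the block from memory if present, else read it from disk once."""
        if block_id not in self._blocks:
            self._blocks[block_id] = self._read_from_disk(block_id)
            self.disk_reads += 1
        return self._blocks[block_id]


# A dictionary stands in for the disk in this example.
fake_disk = {0: b"compressed-block-0", 1: b"compressed-block-1"}
cache = CompressedBlockCache(fake_disk.__getitem__)

cache.get(0)   # first access goes to "disk"
cache.get(0)   # repeat access is served from memory
cache.get(1)   # a different block goes to "disk" again
```

Keeping the blocks compressed and unfiltered means the same cached data can serve later queries with different predicates, since each query applies its own filtering after the block leaves the cache.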

 

I hope that helps, both with Curt's architecture viewpoint and with questions about our use of PCIe interconnects to raise performance.


 

"Don't be afraid to try the greatest sport around

(catch a wave, catch a wave)
Everybody tries it once
Those who don't just have to put it down
You paddle out turn around and raise
And baby that's all there is to the coastline craze
You gotta catch a wave and you're sittin' on top of the world"
– from "Catch a Wave" by The Beach Boys (1963)

Surf's up! Summer seems to finally have arrived in the Boston area and a number of vendors in the data warehousing and analytics space are hoping to catch a wave riding on a flurry of industry announcements. A few trends continue to build in the news:

 

  1. Data sizes continue to grow alongside the pressure to increase performance & shrink data latencies;
  2. Workload complexity and user counts continue to grow;
  3. More and more, customers are seeing the value of running advanced analytical processing directly in their primary data repository (see item #1 for reasons why); and
  4. Industry prices for data warehousing and analytics have begun another shift downward.


Today I'd like to address this last point. According to more than one industry analyst, over the last several years, Netezza has served as "the benchmark" for DWA pricing in the industry. Several of our competitors have sought to match and/or undercut Netezza pricing in the market. Some of the incumbent players have tried, with very limited success, to hinge their pricing off Netezza prices, match the performance of the Netezza Performance Server® system, or inoculate their pricey "flagship" products by adding less-expensive, feature-deficient products to their portfolio. But Netezza has continued to succeed in the marketplace, becoming a profitable, publicly-traded company with nearly 300 customers and 400 employees worldwide and one that is listed among the "Leaders" in the Gartner Magic Quadrant.

 

When we disrupted the data warehousing market with our first generation product in 2003 and 2004, Netezza was one of very few startups in an otherwise moribund industry. Now, with established "street cred" and hundreds of loyal customers, we intend to once again upset our competitors and lead the market in pivoting to a new competitive price-performance level. We're about to launch the fourth generation platform of our data warehouse and analytic appliances, which will advance Netezza's performance leadership and once again establish a new price-performance benchmark.

 

Admittedly, we won't be the first vendor offering high-performance data warehouse systems to move to a lower pricing plateau. That task is usually done by early-stage start-ups looking to find a way to differentiate themselves. True to form, Dataupia can probably claim to have established a lower price point first, and recently another multiyear "start-up" has also started lower. But those are offerings from very modestly-sized start-ups with no established market "track record". Netezza will be the first company with proven product maturity, customer base and financial viability to do so.

 

Just how and what are we doing to cause this disruption? Well, let's just say things around the "briefing table" have been quite hectic, and that I and others will have more news about that to follow shortly.

 

[As you might imagine, it's been getting more and more difficult to keep things under wraps – in recent weeks we've even had to fight people off from getting early "sneak peeks". ]

 

Until then, hey, it's summertime! So here's what I'd recommend –

 

"So take a lesson from a top-notch surfer boy

(catch a wave, catch a wave)
Get yourself a big board
But don't you treat it like a toy
Just get away from the shady turf
And baby go catch some rays on the sunny surf
And when you catch a wave you'll be sittin' on top of the world


Catch a wave and you'll be sittin' on top of the world"

 

 

Twin Fin: A short board (usually 5'8" - 6'8") with a wide tail for maneuverability and a fin near each rail for stability in radical turns.

 

Purpose: A wider tail area provides more planing area and lift, which creates more speed by efficiently utilizing wave energy. Milking speed and energy from smart surf with extremely sensitive and responsive turning ability are this design's strong points.
