[ www.netezza.com ]

Thinking Inside the Box

4 Posts tagged with the enzee_universe tag

It may have been the result of a misunderstanding or a comment heard out of context. But whatever the background for the commentary, let me simply state that Netezza is completely committed to the success of the Netezza Migrator and all the other Netezza products and functionality launched at Enzee Universe 2010 this past week. Migrator eliminates a potential barrier to TwinFin™ adoption (i.e., migration costs) and logically should lead to easier acceptance and broader system sales for Netezza. Furthermore, our partnership with EnterpriseDB at both the corporate and technical levels has been and remains extremely solid and strong.

As I stated in the announcement of the product, “The Netezza Migrator product allows organizations to make data warehouse migration decisions independent of proprietary software lock-in. Organizations using data integration and BI applications with embedded Oracle-proprietary database constructs, interfaces and utilities can now more easily manage their migration from Oracle to a TwinFin appliance. The Netezza Migrator will allow our customers to achieve the performance, scale and cost advantages of their TwinFin systems while maintaining their prior investment in proprietary software.” The Netezza Migrator is specifically designed to reduce the time, complexity and costs required of our customers to move their IT applications to the Netezza TwinFin platform.

With Migrator, Netezza’s customers will be able to extract themselves from the dreaded “Oracle lock-in” of functions and procedures written using Oracle-proprietary techniques, and they can decide which of their applications to migrate directly to Netezza, and when, at their own pace. Its capabilities go well beyond the extremely limited ones provided by Oracle’s own ‘Database Gateway for ODBC’. Migrator provides an Oracle-compatible wrapper around Netezza that is optimized in ways that Oracle could never hope, nor deign, to provide with its "Heterogeneous Services" functionality, including support for Netezza syntax pushdown, a high-speed API, and Netezza user-defined functions.

Migrator makes it even easier for Netezza’s customers to move all of their data warehouse from Oracle to Netezza. In short, this is something we feel is extremely valuable for Netezza and particularly “liberating” for our customers.

"I cannot imagine life without Netezza." from a tweet by "noogle" (Twitter, 11 May 2010)

 


[Photo credit: 2006 photo of “the scramble” intersection in the Shibuya district of Tokyo, courtesy of "Bantosh" and Wikipedia]

 

Late in May, members of my team and I were in Tokyo's ultra-bustling Shibuya district for a few days on our "worldwide whirlwind training tour" with the global field sales teams, covering the details of the TwinFin i-Class product offering. The late-night scene, hairstyles and outfits there border on the outrageously hip. There are high-def billboards, electronic gadgets, and of course the bright lights of retailers, bars, clubs and restaurants all through Shibuya. Advances in high technology are virtually second nature to the people there. So with that as the backdrop, imagine the surprise of hearing over a beer or two (see earlier reference to bars & clubs in Shibuya) that a customer "could not live without Netezza".

 

We're proud of our highly referenceable customer base at Netezza and our "easy to do business with" relationships with our customers and partners. In my six years with the company I've met a pretty fair number of really enthusiastic customers, including people who held "welcome" parties for their Netezza systems, or who use "Netezza" as a verb ("Did you Netezza that data?") or even an adverb ("It's Netezza easy."). But I can't recall any customer who said that they "could not imagine living without Netezza".

 

Simple self-promotion is not the real point of this post, though. The real point is the dilemma that noogle presents us with in his 40+ word tweet. It's something that business managers and analysts face on a daily basis: what is more important –

  • being able to look for strategic and/or tactical competitive nuggets by performing SQL OLAP analytics on their full, atomic-level dataset; or
  • looking for that guidance by using advanced analytical toolsets on subsets or aggregations of their data that are extracted from the data warehouse?

 

Here's the whole tweet by noogle in its original form:

データが莫大になると分析が不可能になる。少ないデータを複雑なアルゴリズムで分析するよりも、莫大なデータを単純なアルゴリズムで分析する方が有益。統計学とは逆。アホかという量のデータ分析の手助けするのがNetezza。もうあたしはNetezzaの無い世界では生きていけない。

 

And here's a translation of it into English [parenthetical comments and emphasis are mine]:

When data is huge, complex analytics are impossible. It’s far more beneficial to analyze massive data with simple [SQL] logic than to analyze small data with complicated algorithms. This is the opposite of statistics [based on sampling techniques]. Helping to analyze data which is “crazy massive” is what Netezza does. I cannot imagine life without Netezza.

 

It turns out that noogle is a long-time user of advanced analytics and predictive techniques. He knows their value, but his tweet exposes a weakness of today's typical analytical environment. Because advanced analysis cannot be performed inside the database, most of that work (if performed at all) is done on external servers, using data sets that are extracted (filtered, sampled and/or aggregated) from the data warehouse.

 

That adds latency to do the extraction and limits the "currency" of the data. Depending on whom you ask, it also limits the accuracy of the results. For instance, looking at aggregations or samples may give you a sense of the "big picture" but not necessarily uncover the needle in the haystack (e.g., fraud detection) or the impact of a long tail that can be exploited in a particular business.
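To put a rough number on the "needle in the haystack" point, here is a minimal sketch in Python. It assumes a simple row-level random sample in which each row is kept independently with the same probability; the function names and the 1% sampling rate are illustrative choices of mine, not anything taken from a Netezza product or from noogle's workload.

```python
def prob_sample_contains_any(rare_rows: int, sample_fraction: float) -> float:
    """Chance that a row-level random sample keeps at least one rare row.

    Assumes each row is kept independently with probability
    `sample_fraction` (a simplifying assumption for illustration).
    """
    if rare_rows < 0:
        raise ValueError("rare_rows must be non-negative")
    if not 0.0 <= sample_fraction <= 1.0:
        raise ValueError("sample_fraction must be between 0 and 1")
    # P(at least one kept) = 1 - P(every rare row is dropped)
    return 1.0 - (1.0 - sample_fraction) ** rare_rows


def expected_rare_rows_in_sample(rare_rows: int, sample_fraction: float) -> float:
    """Expected number of rare rows that survive the sampling step."""
    return rare_rows * sample_fraction


if __name__ == "__main__":
    # Hypothetical scenario: a handful of fraudulent rows, a 1% sample.
    for rare in (1, 10, 100):
        p = prob_sample_contains_any(rare, 0.01)
        print(f"{rare:>3} rare rows, 1% sample: "
              f"P(sample sees at least one) = {p:.3f}")
    # A full scan corresponds to sample_fraction = 1.0.
    print("full scan:", prob_sample_contains_any(10, 1.0))
```

Under these assumptions, a small sample will usually miss a rare event altogether, while a scan of the full data set cannot miss it. Real sampling schemes (stratified, weighted, and so on) can do better than this simple model, so treat the figures as an illustration of the trade-off rather than a measurement.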

 

So noogle's choice is to use the analytic horsepower of TwinFin over the sampling techniques. But if you are limited to the set-based logic of SQL, perhaps aided by user-defined functions, you are again limited in the predictive visibility that those tools can provide. Faced with the dilemma, this customer chose being able to analyze all the data over statistically sampling and performing advanced analytics on the sample set. Having an answer to that dilemma is precisely what has driven the advent of the i-Class functionality for TwinFin.

 

We're excited about TwinFin i-Class, but I'm interested in what others may have to say about this. Does your company employ advanced analytical techniques and how have you reconciled the "sampling" versus "full data set" questions in your business? And what are the prospects and pitfalls of doing "crazy complicated analytics on crazy massive data" all in one simple, high-performance data warehouse appliance from your perspective?


A loyal customer alerted us to an Oracle blog by Jean-Pierre Dijcks earlier today that showed the Oracle FUD machine is fully revved-up and ready to go. I'd like to offer a rebuttal. However, in the interest of not intruding on Jean-Pierre's entry with an overly long comment, I've just put a short response on his blog post with a pointer to this one.


Misconceptions and Misunderstandings, or Errors and Plain-old FUD?

I’m writing to correct *just a few* of the misconceptions, errors, or just plain-old “competitive FUD” points in Jean-Pierre's posting earlier today about what is really important in high-performance, scalable data warehouse systems. We certainly have posted some information recently about the TwinFin product, and Curt Monash’s postings late Thursday provided more info. If his readers are interested in learning more, or even signing up for a “Test Drive”, they should visit www.netezza.com.

First off, I think this is a “banner day” for Netezza. We believe that TwinFin (and the other products in the new product family) extend both our performance and price-performance advantages over our competitors. We stand by our marketing statements that we regularly demonstrate 10-100X performance advantages over our competitors, particularly competitive offerings of the major incumbent DW system vendors (“Just who are those incumbents?” Jean-Pierre's readers may ask. Well, let’s just say that we see Oracle as the incumbent system and/or a challenger system in over 50% of our deal flow.).

Regarding his claims about DBM being “faster than Netezza” (and I can only assume he meant at “real” data warehouse tasks) - we’re ready whenever Oracle feels up to actually taking one of their Database Machines onsite to a customer for a fair, open customer benchmark. So far, Oracle have been, shall we say, “a little reticent” to do on-site benchmark testing against Netezza.

Next, given the large number of incorrect points in the original posting, I think perhaps that just a few of them will be useful enough for readers to get the gist of just how far afield some of the ‘facts’ are:

  • “It all comes down to data scan rates per rack”: If all of data warehousing really boiled down to full-stream data scans (as if the entire world of analytics relied on “select count(*) from lineitem” types of queries), then we could all measure “goodness” by how many GB/sec of data could be burst-scanned in our systems. But that’s not the case. So we build Netezza’s data and analytic appliances to deliver the best possible overall performance at the best price and power requirements. As a consequence, and following from those same numbers as posted, a single rack of TwinFin can process (not just scan) about 400 million rows of data per second. That’s process, as in: “scan, decompress, project, restrict, AND join, etc.”. Need more processing firepower? Netezza’s system performance scales linearly with the addition of more S-Blades: at the low end, the TwinFin 3 can deliver as much as 100M rows/second of processing horsepower, while the TwinFin 120 can provide you with 4 billion rows/second. Does a system that still relies on using SMP-based servers running “plain old” Oracle 11g RAC scale similarly for data warehousing?


  • “Non-open Linux running on FPGAs”: I’m really not sure what (if anything) was meant by this, but saying that Netezza’s FPGAs “are apparently running non-open Linux” is oxymoronic on at least two different levels (FPGAs don’t typically “run” an OS and, “non-open Linux” - really?).


  • “User data & compression”: I also enjoyed the accounting of all that “user data” available to DBM users in the Oracle table and the various comments about compression. When Netezza quotes user data capacities in our systems, the numbers reflect real raw user data space, not space that will be further reduced because of required indexes in an attempt to boost performance. Furthermore, Netezza’s compression & decompression techniques allow us to extract “pure performance” from their use. Rather than relying on CPU cycles to decompress the data before it can be processed any further, the FPGA engines decompress the data on the fly, as fast as it streams off the disk drives. Can Oracle make either of those claims?


  • “Tolerating node failures without downtime”: In perhaps the most bald-faced inaccuracy, the Oracle blog claimed that Netezza “continues to lack the ability to tolerate node failures without downtime”. This I can only chalk up to pure competitive “FUD-ism”, as our capabilities in this area have been quite strong throughout the four generations of Netezza appliances and are further strengthened in TwinFin. Netezza is a fully redundant system with no single point of failure, even in our smallest systems. Failover in the presence of failures of the disk drives, S-Blades, internal networking or host processors (in short, everything) is automatic and done in-service, with hot-swappable replacement throughout.


  • “Appliance simplicity”: One thing Jean-Pierre didn’t address, though his take on it might have been humorous to see, is the notion of “appliance simplicity” - basically the ability to build, support and maintain large to very large-sized data warehouses, with heavy workloads, with no or minimal tuning, partitioning, indexing or other “performance duct tape” required. Routinely, this capability in the Netezza systems is what delights our customers most, and we have customers managing systems with several hundreds of terabytes of user data (not indexes + data, mind you - real data) with fractions of an FTE (full-time employee) devoted to them.
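
As a back-of-the-envelope check on the linear-scaling figures quoted in the scan-rates item above, here is a minimal sketch in Python. It assumes that a TwinFin model number equals its S-Blade count and that a single rack holds 12 S-Blades; both are my assumptions for illustration, and the per-blade rate is derived only from the numbers in this post, not from a published specification.

```python
# Figure quoted in the post: one rack processes about 400 million rows/second.
ROWS_PER_SEC_SINGLE_RACK = 400_000_000

# Assumption for illustration: a single rack holds 12 S-Blades, and the
# TwinFin model number (3, 12, 120, ...) equals the number of S-Blades.
ASSUMED_BLADES_PER_RACK = 12


def estimated_rows_per_sec(s_blades: int) -> float:
    """Linear-scaling estimate of rows processed per second.

    This is simple proportional arithmetic under the assumptions above,
    not a benchmark result.
    """
    if s_blades < 0:
        raise ValueError("s_blades must be non-negative")
    return s_blades * ROWS_PER_SEC_SINGLE_RACK / ASSUMED_BLADES_PER_RACK


if __name__ == "__main__":
    for model in (3, 12, 120):
        rate = estimated_rows_per_sec(model)
        print(f"TwinFin {model:>3}: about {rate / 1e6:,.0f} million rows/second")
```

Under those assumptions, the proportional estimate comes out to the 100M rows/second quoted for the TwinFin 3 and the 4 billion rows/second quoted for the TwinFin 120, which is what "scales linearly" means in practice.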


I hope that clears up some of the misconceptions. If any of Jean-Pierre's readers or Oracle customers would like to see or hear more about TwinFin for themselves, we definitely would invite them to come stop by our booth (#207) at TDWI or come to one of our regional Enzee Universe events coming to a location near you.


 

"Don't be afraid to try the greatest sport around

(catch a wave, catch a wave)
Everybody tries it once
Those who don't just have to put it down
You paddle out turn around and raise
And baby that's all there is to the coastline craze
You gotta catch a wave and you're sittin' on top of the world"
– from "Catch a Wave" by The Beach Boys (1963)

Surf's up! Summer seems to finally have arrived in the Boston area and a number of vendors in the data warehousing and analytics space are hoping to catch a wave riding on a flurry of industry announcements. A few trends continue to build in the news:

 

  1. Data sizes continue to grow alongside the pressure to increase performance & shrink data latencies;
  2. Workload complexity and user counts continue to grow;
  3. More and more, customers are seeing the value of running advanced analytical processing directly in their primary data repository (see item #1 for reasons why); and
  4. Industry prices for data warehousing and analytics have begun another shift downward.


Today I'd like to address this last point. According to more than one industry analyst, over the last several years, Netezza has served as "the benchmark" for DWA pricing in the industry. Several of our competitors have sought to match and/or undercut Netezza pricing in the market. Some of the incumbent players have tried, with very limited success, to hinge their pricing off Netezza prices, match the performance of the Netezza Performance Server® system, or inoculate their pricey "flagship" products by adding less-expensive, feature-deficient products to their portfolio. But Netezza has continued to succeed in the marketplace, becoming a profitable, publicly traded company with nearly 300 customers and 400 employees worldwide, and one that is listed among the "Leaders" in the Gartner Magic Quadrant.

 

When we disrupted the data warehousing market with our first generation product in 2003 and 2004, Netezza was one of very few startups in an otherwise moribund industry. Now, with established "street cred" and hundreds of loyal customers, we intend to once again upset our competitors and lead the market in pivoting to a new competitive price-performance level. We're about to launch the fourth generation platform of our data warehouse and analytic appliances, which will advance Netezza's performance leadership and once again establish a new price-performance benchmark.

 

Admittedly, we won't be the first vendor offering high-performance data warehouse systems to move to a lower pricing plateau. That task is usually done by early-stage start-ups looking to find a way to differentiate themselves. True to form, Dataupia can probably claim to have established a lower price point first, and recently another multiyear "start-up" has also started lower. But those are offerings from very modestly sized startups with no established market "track record". Netezza will be the first company with proven product maturity, customer base and financial viability to do so.

 

Just how and what are we doing to cause this disruption? Well, let's just say things around the "briefing table" have been quite hectic, and that I and others will have more news about that to follow shortly.

 

[As you might imagine, it's been getting more and more difficult to keep things under wraps – in recent weeks we've even had to fight people off from getting early "sneak peeks". ]

 

Until then, hey, it's summertime! So here's what I'd recommend –

 

"So take a lesson from a top-notch surfer boy

(catch a wave, catch a wave)
Get yourself a big board
But don't you treat it like a toy
Just get away from the shady turf
And baby go catch some rays on the sunny surf
And when you catch a wave you'll be sittin' on top of the world


Catch a wave and you'll be sittin' on top of the world"

 

 

Twin Fin: A short board (usually 5'8" - 6'8") with a wide tail for maneuverability and a fin near each rail for stability in radical turns.

 

Purpose: A wider tail area provides more planing area and lift, which creates more speed by efficiently utilizing wave energy. Milking speed and energy from smart surf with extremely sensitive and responsive turning ability are this design's strong points.
